Science.gov

Sample records for activities logistic regression

  1. Logistic Regression

    NASA Astrophysics Data System (ADS)

    Grégoire, G.

    2014-12-01

    Logistic regression was originally intended to explain the relationship between the probability of an event and a set of covariables. The model's coefficients can be interpreted via the odds and the odds ratio, which are presented in the introduction of the chapter. When the observations are obtained individually, we speak of binary logistic regression; when they are grouped, the logistic regression is said to be binomial. In our presentation we mainly focus on the binary case. For statistical inference the main tool is the maximum likelihood methodology: we present the Wald, Rao and likelihood ratio results and their use to compare nested models. The problems we intend to deal with are essentially the same as in multiple linear regression: testing global effects, individual effects, selection of variables to build a model, measuring the fit of the model, prediction of new values, and so on. The methods are demonstrated on data sets using R. Finally we briefly consider the binomial case and the situation where we are interested in several events, that is, polytomous (multinomial) logistic regression and the particular case of ordinal logistic regression.
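
    As an illustration of the odds-ratio interpretation mentioned in this abstract, the following is a minimal Python sketch (the chapter itself uses R). The coefficient values and function names are hypothetical and chosen only to show that the odds ratio for a one-unit increase in a covariate equals the exponential of its coefficient.

```python
import math

def logistic(z):
    """Map a linear predictor z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Odds of an event that has probability p."""
    return p / (1.0 - p)

# Hypothetical coefficients, chosen only for illustration.
beta0, beta1 = -1.0, 0.7

p_at_0 = logistic(beta0 + beta1 * 0.0)
p_at_1 = logistic(beta0 + beta1 * 1.0)

# The odds ratio for a one-unit increase in x equals exp(beta1).
odds_ratio = odds(p_at_1) / odds(p_at_0)
```

    Because the odds under the model are exp(beta0 + beta1 * x), the ratio does not depend on the starting value of x, which is what makes the coefficient directly interpretable.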

  2. Trashball: A Logistic Regression Classroom Activity

    ERIC Educational Resources Information Center

    Morrell, Christopher H.; Auer, Richard E.

    2007-01-01

    In the early 1990s, the National Science Foundation funded many research projects for improving statistical education. Many of these stressed the need for classroom activities that illustrate important issues of designing experiments, generating quality data, fitting models, and performing statistical tests. Our paper describes such an activity on…

  3. Practical Session: Logistic Regression

    NASA Astrophysics Data System (ADS)

    Clausel, M.; Grégoire, G.

    2014-12-01

    An exercise is proposed to illustrate logistic regression. It investigates the different risk factors in the occurrence of coronary heart disease. The exercise was proposed in Chapter 5 of the book by D.G. Kleinbaum and M. Klein, "Logistic Regression", Statistics for Biology and Health, Springer Science Business Media, LLC (2010), and also by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr341.pdf). This example is based on data given in the file evans.txt, available from http://www.sph.emory.edu/dkleinb/logreg3.htm#data.

  4. Steganalysis using logistic regression

    NASA Astrophysics Data System (ADS)

    Lubenko, Ivans; Ker, Andrew D.

    2011-02-01

    We advocate Logistic Regression (LR) as an alternative to the Support Vector Machine (SVM) classifiers commonly used in steganalysis. LR offers more information than traditional SVM methods (it estimates class probabilities as well as providing a simple classification) and can be adapted more easily and efficiently for multiclass problems. Like SVM, LR can be kernelised for nonlinear classification, and it shows comparable classification accuracy to SVM methods. This work is a case study comparing the accuracy and speed of SVM and LR classifiers in the detection of LSB Matching and other related spatial-domain image steganography, using the state-of-the-art 686-dimensional SPAM feature set, on three image sets.

  5. Multinomial logistic regression ensembles.

    PubMed

    Lee, Kyewon; Ahn, Hongshik; Moon, Hojin; Kodell, Ralph L; Chen, James J

    2013-05-01

    This article proposes a method for multiclass classification problems using ensembles of multinomial logistic regression models. A multinomial logit model is used as a base classifier in ensembles from random partitions of predictors. The multinomial logit model can be applied to each mutually exclusive subset of the feature space without variable selection. By combining multiple models the proposed method can handle a huge database without a constraint needed for analyzing high-dimensional data, and the random partition can improve the prediction accuracy by reducing the correlation among base classifiers. The proposed method is implemented using R, and the performance including overall prediction accuracy, sensitivity, and specificity for each category is evaluated on two real data sets and simulation data sets. To investigate the quality of prediction in terms of sensitivity and specificity, the area under the receiver operating characteristic (ROC) curve (AUC) is also examined. The performance of the proposed model is compared to a single multinomial logit model and it shows a substantial improvement in overall prediction accuracy. The proposed method is also compared with other classification methods such as the random forest, support vector machines, and random multinomial logit model. PMID:23611203
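
    The random-partition step described in this abstract can be sketched as follows. This is a minimal Python illustration, not the authors' R implementation: the function name, the round-robin split and the seed are assumptions, and fitting a multinomial logit model on each subset and combining the predictions are omitted.

```python
import random

def random_partition(n_features, n_subsets, seed=0):
    """Split feature indices 0..n_features-1 into mutually exclusive subsets."""
    rng = random.Random(seed)
    indices = list(range(n_features))
    rng.shuffle(indices)
    # Deal the shuffled indices round-robin so subset sizes differ by at most one.
    return [sorted(indices[i::n_subsets]) for i in range(n_subsets)]

# Hypothetical sizes: 10 predictors split across 3 base classifiers.
subsets = random_partition(n_features=10, n_subsets=3)
```

    Each subset would then feed one base classifier; repeating the partition with different seeds gives the ensemble its diversity.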

  6. Logistic Regression: Concept and Application

    ERIC Educational Resources Information Center

    Cokluk, Omay

    2010-01-01

    The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…

  7. Fungible weights in logistic regression.

    PubMed

    Jones, Jeff A; Waller, Niels G

    2016-06-01

    In this article we develop methods for assessing parameter sensitivity in logistic regression models. To set the stage for this work, we first review Waller's (2008) equations for computing fungible weights in linear regression. Next, we describe 2 methods for computing fungible weights in logistic regression. To demonstrate the utility of these methods, we compute fungible logistic regression weights using data from the Centers for Disease Control and Prevention's (2010) Youth Risk Behavior Surveillance Survey, and we illustrate how these alternate weights can be used to evaluate parameter sensitivity. To make our work accessible to the research community, we provide R code (R Core Team, 2015) that will generate both kinds of fungible logistic regression weights. (PsycINFO Database Record) PMID:26651981

  8. Standards for Standardized Logistic Regression Coefficients

    ERIC Educational Resources Information Center

    Menard, Scott

    2011-01-01

    Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…

  9. Application of frequency ratio and logistic regression to active rock glacier occurrence in the Andes of San Juan, Argentina

    NASA Astrophysics Data System (ADS)

    Angillieri, María Yanina Esper

    2010-01-01

    This study employs statistical modeling techniques and geomorphological mapping to analyze the distribution of active rock glaciers in relation to altitude, aspect, slope, lithology and solar radiation using optical remote sensing techniques with GIS. The study area includes a portion of the Dry Andes of the Cordillera Frontal of San Juan around 30°S latitude, where few geomorphological studies have been conducted. Over 155 rock glaciers have been identified, and 85 are considered active. The relationship between the variables and the rock glaciers distribution was analyzed using the frequency ratio method and logistic regression models. The analytical results show that elevations > 3824 m a.s.l., a south-facing or east-facing aspect, areas with relatively low solar radiation, and slope between 2° and 20° favor the existence of the rock glaciers, and demonstrate that lithology and slope exert major influences.

  10. Satellite rainfall retrieval by logistic regression

    NASA Technical Reports Server (NTRS)

    Chiu, Long S.

    1986-01-01

    The potential use of logistic regression in rainfall estimation from satellite measurements is investigated. Satellite measurements provide covariate information in terms of radiances from different remote sensors. The logistic regression technique can effectively accommodate many covariates and test their significance in the estimation. The outcome from the logistical model is the probability that the rainrate of a satellite pixel is above a certain threshold. By varying the thresholds, a rainrate histogram can be obtained, from which the mean and the variance can be estimated. A logistical model is developed and applied to rainfall data collected during GATE, using as covariates the fractional rain area and a radiance measurement which is deduced from a microwave temperature-rainrate relation. It is demonstrated that the fractional rain area is an important covariate in the model, consistent with the use of the so-called Area Time Integral in estimating total rain volume in other studies. To calibrate the logistical model, simulated rain fields generated by rainfield models with prescribed parameters are needed. A stringent test of the logistical model is its ability to recover the prescribed parameters of simulated rain fields. A rain field simulation model which preserves the fractional rain area and lognormality of rainrates as found in GATE is developed. A stochastic regression model of branching and immigration whose solutions are lognormally distributed in some asymptotic limits has also been developed.
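
    The step from threshold exceedance probabilities to a rainrate histogram can be sketched as follows. This is a minimal Python illustration under stated assumptions: the thresholds and probabilities are made-up values, and the function name is hypothetical. Estimating the mean and variance from the resulting bins (for example via bin midpoints) is left out because it requires a choice for the open-ended top bin.

```python
def histogram_from_exceedance(thresholds, exceed_probs):
    """Convert P(R > t_k) at increasing thresholds into bin probabilities.

    Returns P(t_k < R <= t_{k+1}) for each pair of consecutive thresholds.
    """
    return [exceed_probs[k] - exceed_probs[k + 1]
            for k in range(len(thresholds) - 1)]

# Hypothetical thresholds (mm/h) and fitted exceedance probabilities.
thresholds = [0.0, 1.0, 5.0, 10.0]
exceed_probs = [0.40, 0.25, 0.10, 0.02]

bins = histogram_from_exceedance(thresholds, exceed_probs)
```

    The remaining mass, here the final exceedance probability, belongs to the open interval above the largest threshold.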

  11. A Logistic Regression Model for Personnel Selection.

    ERIC Educational Resources Information Center

    Raju, Nambury S.; And Others

    1991-01-01

    A two-parameter logistic regression model for personnel selection is proposed. The model was tested with a database of 84,808 military enlistees. The probability of job success was related directly to trait levels, addressing such topics as selection, validity generalization, employee classification, selection bias, and utility-based fair…

  12. Predicting Social Trust with Binary Logistic Regression

    ERIC Educational Resources Information Center

    Adwere-Boamah, Joseph; Hufstedler, Shirley

    2015-01-01

    This study used binary logistic regression to predict social trust with five demographic variables from a national sample of adult individuals who participated in The General Social Survey (GSS) in 2012. The five predictor variables were respondents' highest degree earned, race, sex, general happiness and the importance of personally assisting…

  13. Comparison of multinomial logistic regression and logistic regression: which is more efficient in allocating land use?

    NASA Astrophysics Data System (ADS)

    Lin, Yingzhi; Deng, Xiangzheng; Li, Xing; Ma, Enjun

    2014-12-01

    Spatially explicit simulation of land use change is the basis for estimating the effects of land use and cover change on energy fluxes, ecology and the environment. At the pixel level, logistic regression is one of the most common approaches used in spatially explicit land use allocation models to determine the relationship between land use and its causal factors in driving land use change, and thereby to evaluate land use suitability. However, these models have a drawback in that they do not determine/allocate land use based on the direct relationship between land use change and its driving factors. Consequently, a multinomial logistic regression method was introduced to address this flaw, and thereby, judge the suitability of a type of land use in any given pixel in a case study area of the Jiangxi Province, China. A comparison of the two regression methods indicated that the proportion of correctly allocated pixels using multinomial logistic regression was 92.98%, which was 8.47% higher than that obtained using logistic regression. Paired t-test results also showed that pixels were more clearly distinguished by multinomial logistic regression than by logistic regression. In conclusion, multinomial logistic regression is a more efficient and accurate method for the spatial allocation of land use changes. The application of this method in future land use change studies may improve the accuracy of predicting the effects of land use and cover change on energy fluxes, ecology, and environment.

  14. Comparative study of two sparse multinomial logistic regression models in decoding visual stimuli from brain activity of fMRI

    NASA Astrophysics Data System (ADS)

    Song, Sutao; Chen, Gongxiang; Zhan, Yu; Zhang, Jiacai; Yao, Li

    2014-03-01

    Recently, sparse algorithms such as Sparse Multinomial Logistic Regression (SMLR) have been successfully applied to decoding visual information from functional magnetic resonance imaging (fMRI) data, where the contrast of visual stimuli was predicted by a classifier. The contrast classifier combined brain activities of voxels with sparse weights. For sparse algorithms, the goal is to learn a classifier whose weights are as sparse as possible by introducing some prior belief about the weights. There are two ways to introduce a sparse prior constraint on the weights: Automatic Relevance Determination (ARD-SMLR) and the Laplace prior (LAP-SMLR). In this paper, we present comparison results between the ARD-SMLR and LAP-SMLR models in computational time, classification accuracy and voxel selection. Results showed that, for fMRI data, no significant difference was found in classification accuracy between the two methods when voxels in V1 were chosen as input features (1017 voxels in total). As for computation time, LAP-SMLR was superior to ARD-SMLR; fewer voxels survived for ARD-SMLR than for LAP-SMLR. Using simulation data, we confirmed that the classification performance of the two SMLR models was sensitive to the sparsity of the initial features: when the ratio of relevant features to initial features was larger than 0.01, ARD-SMLR outperformed LAP-SMLR; otherwise, LAP-SMLR outperformed ARD-SMLR. Simulation data showed ARD-SMLR was more efficient in selecting relevant features.

  15. Model selection for logistic regression models

    NASA Astrophysics Data System (ADS)

    Duller, Christine

    2012-09-01

    Model selection for logistic regression models decides which of some given potential regressors have an effect and hence should be included in the final model. The second interesting question is whether a certain factor is heterogeneous among some subsets, i.e. whether the model should include a random intercept or not. In this paper these questions will be answered with classical as well as with Bayesian methods. The applications show some results of recent research projects in medicine and business administration.

  16. Transfer Learning Based on Logistic Regression

    NASA Astrophysics Data System (ADS)

    Paul, A.; Rottensteiner, F.; Heipke, C.

    2015-08-01

    In this paper we address the problem of classification of remote sensing images in the framework of transfer learning with a focus on domain adaptation. The main novel contribution is a method for transductive transfer learning in remote sensing on the basis of logistic regression. Logistic regression is a discriminative probabilistic classifier of low computational complexity, which can deal with multiclass problems. This research area deals with methods that solve problems in which labelled training data sets are assumed to be available only for a source domain, while classification is needed in the target domain with different, yet related characteristics. Classification takes place with a model of weight coefficients for hyperplanes which separate features in the transformed feature space. In terms of logistic regression, our domain adaptation method adjusts the model parameters by iterative labelling of the target test data set. These labelled data features are iteratively added to the current training set which, at the beginning, only contains source features and, simultaneously, a number of source features are deleted from the current training set. Experimental results based on a test series with synthetic and real data constitute a first proof-of-concept of the proposed method.

  17. Supporting Regularized Logistic Regression Privately and Efficiently.

    PubMed

    Li, Wenfa; Liu, Hongzhe; Yang, Peng; Xie, Wei

    2016-01-01

    As one of the most popular statistical and machine learning models, logistic regression with regularization has found wide adoption in biomedicine, social sciences, information technology, and so on. These domains often involve data of human subjects that are subject to strict privacy regulations. Concerns over data privacy make it increasingly difficult to coordinate and conduct large-scale collaborative studies, which typically rely on cross-institution data sharing and joint analysis. Our work here focuses on safeguarding regularized logistic regression, a widely used statistical model that has nonetheless not been investigated from a data security and privacy perspective. We consider a common use scenario of multi-institution collaborative studies, such as in the form of research consortia or networks as widely seen in genetics, epidemiology, social sciences, etc. To make our privacy-enhancing solution practical, we demonstrate a non-conventional and computationally efficient method leveraging distributed computing and strong cryptography to provide comprehensive protection over individual-level and summary data. Extensive empirical evaluations on several studies validate the privacy guarantee, efficiency and scalability of our proposal. We also discuss the practical implications of our solution for large-scale studies and applications from various disciplines, including genetic and biomedical studies, smart grid, network analysis, etc. PMID:27271738

  18. Supporting Regularized Logistic Regression Privately and Efficiently

    PubMed Central

    Li, Wenfa; Liu, Hongzhe; Yang, Peng; Xie, Wei

    2016-01-01

    As one of the most popular statistical and machine learning models, logistic regression with regularization has found wide adoption in biomedicine, social sciences, information technology, and so on. These domains often involve data of human subjects that are subject to strict privacy regulations. Concerns over data privacy make it increasingly difficult to coordinate and conduct large-scale collaborative studies, which typically rely on cross-institution data sharing and joint analysis. Our work here focuses on safeguarding regularized logistic regression, a widely used statistical model that has nonetheless not been investigated from a data security and privacy perspective. We consider a common use scenario of multi-institution collaborative studies, such as in the form of research consortia or networks as widely seen in genetics, epidemiology, social sciences, etc. To make our privacy-enhancing solution practical, we demonstrate a non-conventional and computationally efficient method leveraging distributed computing and strong cryptography to provide comprehensive protection over individual-level and summary data. Extensive empirical evaluations on several studies validate the privacy guarantee, efficiency and scalability of our proposal. We also discuss the practical implications of our solution for large-scale studies and applications from various disciplines, including genetic and biomedical studies, smart grid, network analysis, etc. PMID:27271738

  19. Using Dominance Analysis to Determine Predictor Importance in Logistic Regression

    ERIC Educational Resources Information Center

    Azen, Razia; Traxel, Nicole

    2009-01-01

    This article proposes an extension of dominance analysis that allows researchers to determine the relative importance of predictors in logistic regression models. Criteria for choosing logistic regression R² analogues were determined and measures were selected that can be used to perform dominance analysis in logistic regression. A…

  20. Logistic Regression: Going beyond Point-and-Click.

    ERIC Educational Resources Information Center

    King, Jason E.

    A review of the literature reveals that important statistical algorithms and indices pertaining to logistic regression are being underused. This paper describes logistic regression in comparison with discriminant analysis and linear regression, and suggests that some techniques only accessible through computer syntax should be consulted in…

  1. Personal, Social, and Game-Related Correlates of Active and Non-Active Gaming Among Dutch Gaming Adolescents: Survey-Based Multivariable, Multilevel Logistic Regression Analyses

    PubMed Central

    de Vet, Emely; Chinapaw, Mai JM; de Boer, Michiel; Seidell, Jacob C; Brug, Johannes

    2014-01-01

    Background: Playing video games contributes substantially to sedentary behavior in youth. A new generation of video games, active games, seems to be a promising alternative to sedentary games to promote physical activity and reduce sedentary behavior. At this time, little is known about correlates of active and non-active gaming among adolescents. Objective: The objective of this study was to examine potential personal, social, and game-related correlates of both active and non-active gaming in adolescents. Methods: A survey assessing game behavior and potential personal, social, and game-related correlates was conducted among adolescents (12-16 years, N=353) recruited via schools. Multivariable, multilevel logistic regression analyses, adjusted for demographics (age, sex and educational level of adolescents), were conducted to examine personal, social, and game-related correlates of active gaming ≥1 hour per week (h/wk) and non-active gaming >7 h/wk. Results: Active gaming ≥1 h/wk was significantly associated with a more positive attitude toward active gaming (OR 5.3, CI 2.4-11.8; P<.001), a less positive attitude toward non-active games (OR 0.30, CI 0.1-0.6; P=.002), a higher score on habit strength regarding gaming (OR 1.9, CI 1.2-3.2; P=.008), having brothers/sisters (OR 6.7, CI 2.6-17.1; P<.001) and friends (OR 3.4, CI 1.4-8.4; P=.009) who spend more time on active gaming, and a slightly lower score on game engagement (OR 0.95, CI 0.91-0.997; P=.04). Non-active gaming >7 h/wk was significantly associated with a more positive attitude toward non-active gaming (OR 2.6, CI 1.1-6.3; P=.035), a stronger habit regarding gaming (OR 3.0, CI 1.7-5.3; P<.001), having friends who spend more time on non-active gaming (OR 3.3, CI 1.46-7.53; P=.004), and a more positive image of a non-active gamer (OR 2, CI 1.07-3.75; P=.03). Conclusions: Various factors were significantly associated with active gaming ≥1 h/wk and non-active gaming >7 h/wk. Active gaming is most

  2. Sparse logistic regression with Lp penalty for biomarker identification.

    PubMed

    Liu, Zhenqiu; Jiang, Feng; Tian, Guoliang; Wang, Suna; Sato, Fumiaki; Meltzer, Stephen J; Tan, Ming

    2007-01-01

    In this paper, we propose a novel method for sparse logistic regression with non-convex Lp regularization (p < 1). Based on smooth approximation, we develop several fast algorithms for learning a classifier that is applicable to high-dimensional datasets such as gene expression data. To the best of our knowledge, these are the first algorithms to perform sparse logistic regression with an Lp and elastic net (Le) penalty. The regularization parameters are chosen by maximizing the area under the ROC curve (AUC) on the test data. Experimental results on methylation and microarray data attest to the accuracy, sparsity, and efficiency of the proposed algorithms. Biomarkers identified with our methods are compared with those in the literature. Our computational results show that Lp logistic regression (p < 1) outperforms L1 logistic regression and SCAD SVM. Software is available upon request from the first author. PMID:17402921
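
    To make the idea of a smooth approximation to a non-convex Lp penalty concrete, here is a minimal Python sketch. The surrogate (w² + eps)^(p/2) is one common choice and is an assumption here; the abstract does not state which approximation the authors use, and the function names and values are hypothetical.

```python
def lp_penalty(weights, p):
    """Non-convex Lp penalty: sum of |w|^p with 0 < p < 1."""
    return sum(abs(w) ** p for w in weights)

def smooth_lp_penalty(weights, p, eps=1e-8):
    """Assumed smooth surrogate: (w^2 + eps)^(p/2), differentiable at w = 0."""
    return sum((w * w + eps) ** (p / 2.0) for w in weights)

# Hypothetical weight vector with one exactly-zero coefficient.
weights = [0.0, 0.5, -2.0]
exact = lp_penalty(weights, p=0.5)
approx = smooth_lp_penalty(weights, p=0.5)
```

    The surrogate slightly overestimates the exact penalty, mostly at the zero weight, and the gap shrinks as eps decreases; in exchange, gradient-based optimizers can be applied.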

  3. MODELING SNAKE MICROHABITAT FROM RADIOTELEMETRY STUDIES USING POLYTOMOUS LOGISTIC REGRESSION

    EPA Science Inventory

    Multivariate analysis of snake microhabitat has historically used techniques that were derived under assumptions of normality and common covariance structure (e.g., discriminant function analysis, MANOVA). In this study, polytomous logistic regression (PLR which does not require ...

  4. Large unbalanced credit scoring using Lasso-logistic regression ensemble.

    PubMed

    Wang, Hong; Xu, Qingsong; Zhou, Lifeng

    2015-01-01

    Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data. PMID:25706988

  5. Large Unbalanced Credit Scoring Using Lasso-Logistic Regression Ensemble

    PubMed Central

    Wang, Hong; Xu, Qingsong; Zhou, Lifeng

    2015-01-01

    Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data. PMID:25706988

  6. Computing measures of explained variation for logistic regression models.

    PubMed

    Mittlböck, M; Schemper, M

    1999-01-01

    The proportion of explained variation (R²) is frequently used in the general linear model but in logistic regression no standard definition of R² exists. We present a SAS macro which calculates two R²-measures based on Pearson and on deviance residuals for logistic regression. Also, adjusted versions for both measures are given, which should prevent the inflation of R² in small samples. PMID:10195643
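
    As a rough illustration of what explained-variation measures for logistic regression can look like, the Python sketch below computes a sum-of-squares version and a deviance version for binary outcomes. These are two commonly used definitions and are assumptions here; they may differ in detail from the measures and adjustments implemented in the SAS macro. The data are hypothetical.

```python
import math

def deviance(y, p):
    """Binomial deviance: -2 times the log-likelihood for binary outcomes."""
    return -2.0 * sum(
        yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
        for yi, pi in zip(y, p)
    )

def r2_deviance(y, p):
    """1 - D(model) / D(null), where the null model predicts mean(y)."""
    y_bar = sum(y) / len(y)
    return 1.0 - deviance(y, p) / deviance(y, [y_bar] * len(y))

def r2_sum_of_squares(y, p):
    """1 - sum (y - p)^2 / sum (y - mean(y))^2."""
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, p))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Hypothetical outcomes and fitted probabilities.
y = [0, 0, 1, 1]
p = [0.2, 0.4, 0.6, 0.8]
r2_ss = r2_sum_of_squares(y, p)
r2_dev = r2_deviance(y, p)
```

    The two values generally differ for the same fit, which illustrates why the abstract notes that no standard definition exists.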

  7. Estimating the exceedance probability of rain rate by logistic regression

    NASA Technical Reports Server (NTRS)

    Chiu, Long S.; Kedem, Benjamin

    1990-01-01

    Recent studies have shown that the fraction of an area with rain intensity above a fixed threshold is highly correlated with the area-averaged rain rate. To estimate the fractional rainy area, a logistic regression model, which estimates the conditional probability that rain rate over an area exceeds a fixed threshold given the values of related covariates, is developed. The problem of dependency in the data in the estimation procedure is bypassed by the method of partial likelihood. Analyses of simulated scanning multichannel microwave radiometer and observed electrically scanning microwave radiometer data during the Global Atlantic Tropical Experiment period show that the use of logistic regression in pixel classification is superior to multiple regression in predicting whether rain rate at each pixel exceeds a given threshold, even in the presence of noisy data. The potential of the logistic regression technique in satellite rain rate estimation is discussed.

  8. Residuals and regression diagnostics: focusing on logistic regression.

    PubMed

    Zhang, Zhongheng

    2016-05-01

    Up to now I have introduced most steps in regression model building and validation. The last step is to check whether there are observations that have a significant impact on the model coefficients and specification. The article first describes plotting Pearson residuals against predictors. Such plots are helpful in identifying non-linearity and provide hints on how to transform predictors. Next, I focus on outlying, high-leverage and influential observations that may have a significant impact on model building. An outlier is an observation whose response value is unusual conditional on its covariate pattern. A leverage point is an observation whose covariate pattern is far from the center of the regressor space. Influence is the product of outlyingness and leverage: when an influential observation is dropped from the model, there is a significant shift in the coefficients. Summary statistics for outliers, leverage and influence are studentized residuals, hat values and Cook's distance, respectively. They can be easily visualized with graphs and formally tested using the car package. PMID:27294091
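
    The Pearson residual mentioned in this abstract can be computed directly from the observed outcomes and fitted probabilities. The following is a minimal Python sketch with hypothetical data (the article itself works in R with the car package); hat values and Cook's distance need the design matrix and are not shown.

```python
import math

def pearson_residuals(y, p):
    """Pearson residual for binary data: (y - p) / sqrt(p * (1 - p))."""
    return [(yi - pi) / math.sqrt(pi * (1.0 - pi)) for yi, pi in zip(y, p)]

# Hypothetical outcomes and fitted probabilities.
y = [1, 0, 1, 0]
p = [0.9, 0.2, 0.5, 0.95]

residuals = pearson_residuals(y, p)
# Index of the observation with the largest absolute residual,
# i.e. the strongest outlier candidate under this measure.
worst = max(range(len(y)), key=lambda i: abs(residuals[i]))
```

    In this toy data the last observation stands out because the outcome is 0 while the fitted probability is 0.95.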

  9. Multinomial logistic regression-based feature selection for hyperspectral data

    NASA Astrophysics Data System (ADS)

    Pal, Mahesh

    2012-02-01

    This paper evaluates the performance of three feature selection methods based on multinomial logistic regression, and compares the performance of the best multinomial logistic regression-based feature selection approach with the support vector machine based recursive feature elimination (SVM-RFE) approach. Two hyperspectral datasets, one consisting of 65 features (DAIS data) and the other with 185 features (AVIRIS data), were used. Results suggest that a total of between 15 and 10 features selected by using the multinomial logistic regression-based feature selection approach as proposed by Cawley and Talbot achieve a significant improvement in classification accuracy in comparison to the use of all the features of the DAIS and AVIRIS datasets. In addition to the improved performance, the Cawley and Talbot approach does not require any user-defined parameter, thus avoiding the requirement of a model selection stage. In comparison, the other two multinomial logistic regression-based feature selection approaches require one user-defined parameter and do not perform as well as the Cawley and Talbot approach in terms of (i) the number of features required to achieve classification accuracy comparable to that achieved using the full dataset, and (ii) the classification accuracy achieved by the selected features. The Cawley and Talbot approach was also found to be computationally more efficient than the SVM-RFE technique, though both use the same number of selected features to achieve an equal or even higher level of accuracy than that achieved with full hyperspectral datasets.

  10. Detecting Differential Item Functioning Using Logistic Regression Procedures.

    ERIC Educational Resources Information Center

    Swaminathan, Hariharan; Rogers, H. Jane

    1990-01-01

    A logistic regression model for characterizing differential item functioning (DIF) between two groups is presented. A distinction is drawn between uniform and nonuniform DIF in terms of model parameters. A statistic for testing the hypotheses of no DIF is developed, and simulation studies compare it with the Mantel-Haenszel procedure. (Author/TJH)
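
    The distinction between uniform and nonuniform DIF in terms of model parameters can be illustrated with a short Python sketch. The parameterization below (an ability term, a group term and their interaction) is the form commonly associated with this procedure, but the symbol names and numeric values are assumptions for illustration only.

```python
import math

def p_correct(theta, group, b0, b1, b2, b3):
    """P(correct) under logit = b0 + b1*theta + b2*group + b3*theta*group.

    theta: ability proxy (e.g. total test score); group: 0 or 1.
    b2 != 0 with b3 == 0 gives uniform DIF; b3 != 0 gives nonuniform DIF.
    """
    z = b0 + b1 * theta + b2 * group + b3 * theta * group
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Log-odds of a probability p."""
    return math.log(p / (1.0 - p))

def group_gap(theta, b2, b3):
    """Log-odds gap between the two groups at a given ability level."""
    return (logit(p_correct(theta, 1, 0.0, 1.0, b2, b3))
            - logit(p_correct(theta, 0, 0.0, 1.0, b2, b3)))

abilities = (-1.0, 0.0, 1.0)
uniform_gaps = [group_gap(t, b2=0.5, b3=0.0) for t in abilities]
nonuniform_gaps = [group_gap(t, b2=0.0, b3=0.5) for t in abilities]
```

    With only the group term nonzero, the gap is the same at every ability level; with the interaction term nonzero, the gap changes with ability and here even changes sign.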

  11. Hierarchical Logistic Regression: Accounting for Multilevel Data in DIF Detection

    ERIC Educational Resources Information Center

    French, Brian F.; Finch, W. Holmes

    2010-01-01

    The purpose of this study was to examine the performance of differential item functioning (DIF) assessment in the presence of a multilevel structure that often underlies data from large-scale testing programs. Analyses were conducted using logistic regression (LR), a popular, flexible, and effective tool for DIF detection. Data were simulated…

  12. Determination of riverbank erosion probability using Locally Weighted Logistic Regression

    NASA Astrophysics Data System (ADS)

    Ioannidou, Elena; Flori, Aikaterini; Varouchakis, Emmanouil A.; Giannakis, Georgios; Vozinaki, Anthi Eirini K.; Karatzas, George P.; Nikolaidis, Nikolaos

    2015-04-01

    Riverbank erosion is a natural geomorphologic process that affects the fluvial environment. The most important issue concerning riverbank erosion is the identification of the vulnerable locations. An alternative to the usual hydrodynamic models to predict vulnerable locations is to quantify the probability of erosion occurrence. This can be achieved by identifying the underlying relations between riverbank erosion and the geomorphological or hydrological variables that prevent or stimulate erosion. Thus, riverbank erosion can be determined by a regression model using independent variables that are considered to affect the erosion process. The impact of such variables may vary spatially; therefore, a non-stationary regression model is preferred over a stationary equivalent. Locally Weighted Regression (LWR) is proposed as a suitable choice. This method can be extended to predict the binary presence or absence of erosion based on a series of independent local variables by using the logistic regression model. It is referred to as Locally Weighted Logistic Regression (LWLR). Logistic regression is a type of regression analysis used for predicting the outcome of a categorical dependent variable (e.g. binary response) based on one or more predictor variables. The method can be combined with LWR to assign weights to local independent variables of the dependent one. LWR allows model parameters to vary over space in order to reflect spatial heterogeneity. The probabilities of the possible outcomes are modelled as a function of the independent variables using a logistic function. Logistic regression measures the relationship between a categorical dependent variable and, usually, one or several continuous independent variables by converting the dependent variable to probability scores. Then, a logistic regression is formed, which predicts success or failure of a given binary variable (e.g. erosion presence or absence) for any value of the independent variables. The

  13. Modeling urban growth with geographically weighted multinomial logistic regression

    NASA Astrophysics Data System (ADS)

    Luo, Jun; Kanala, Nagaraj Kapi

    2008-10-01

    Spatial heterogeneity has usually been ignored in previous land use change studies. This paper presents a geographically weighted multinomial logistic regression model for investigating multiple land use conversion in the urban growth process. The proposed model makes an estimation at each sample location and generates local coefficients of driving factors for land use conversion. A Gaussian function is used to determine the geographic weights, guaranteeing that all other samples are involved in the calibration of the model for one location. A case study on the Springfield metropolitan area is conducted. A set of independent variables is selected as driving factors. A traditional multinomial logistic regression model is set up and compared with the proposed model. Spatial variations of coefficients of independent variables are revealed by investigating the estimates at sample locations.
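
    The idea of calibrating one model per location with Gaussian distance weights can be sketched as follows. This is a simplified illustration, not the authors' implementation: it uses a binary outcome instead of a multinomial one, a single covariate, plain gradient ascent, and made-up coordinates and data. The function names and the bandwidth value are assumptions.

```python
import math

def gaussian_weight(distance, bandwidth):
    """Gaussian kernel: the weight decays with distance but never reaches zero,
    so every sample contributes to every local fit."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_local_logistic(target, coords, x, y, bandwidth, steps=2000, lr=0.5):
    """Fit an intercept and a slope at `target` by weighted gradient ascent
    on the log-likelihood (a simple stand-in for a full solver)."""
    w = [gaussian_weight(math.dist(target, c), bandwidth) for c in coords]
    total = sum(w)
    b0, b1 = 0.0, 0.0
    for _ in range(steps):
        g0 = g1 = 0.0
        for wi, xi, yi in zip(w, x, y):
            resid = yi - sigmoid(b0 + b1 * xi)
            g0 += wi * resid
            g1 += wi * resid * xi
        b0 += lr * g0 / total
        b1 += lr * g1 / total
    return b0, b1

# Hypothetical samples: a western cluster where the covariate raises the
# probability of conversion and an eastern cluster where it lowers it.
coords = [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1),
          (10, 0), (11, 0), (12, 0), (10, 1), (11, 1), (12, 1)]
x = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0] * 2
y = [0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0]

west = fit_local_logistic((1.0, 0.5), coords, x, y, bandwidth=2.0)
east = fit_local_logistic((11.0, 0.5), coords, x, y, bandwidth=2.0)
print("local slope in the west:", round(west[1], 2))
print("local slope in the east:", round(east[1], 2))
```

    A single global fit on these data would average the two opposing effects away, which is the kind of spatial variation the local coefficients are meant to expose.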

  14. Classifying machinery condition using oil samples and binary logistic regression

    NASA Astrophysics Data System (ADS)

    Phillips, J.; Cripps, E.; Lau, John W.; Hodkiewicz, M. R.

    2015-08-01

    The era of big data has resulted in an explosion of condition monitoring information. The result is an increasing motivation to automate the costly and time-consuming human elements involved in the classification of machine health. When working with industry it is important to build an understanding and hence some trust in the classification scheme for those who use the analysis to initiate maintenance tasks. Typically, "black box" approaches such as artificial neural networks (ANN) and support vector machines (SVM) are difficult to interpret. In contrast, this paper argues that logistic regression offers easy interpretability to industry experts, providing insight to the drivers of the human classification process and to the ramifications of potential misclassification. Of course, accuracy is of foremost importance in any automated classification scheme, so we also provide a comparative study based on predictive performance of logistic regression, ANN and SVM. A real world oil analysis data set from engines on mining trucks is presented and, using cross-validation, we demonstrate that logistic regression out-performs the ANN and SVM approaches in terms of prediction for healthy/not healthy engines.

  15. Comparison of Logistic Regression and Linear Regression in Modeling Percentage Data

    PubMed Central

    Zhao, Lihui; Chen, Yuhuan; Schaffner, Donald W.

    2001-01-01

    Percentage is widely used to describe different results in food microbiology, e.g., probability of microbial growth, percent inactivated, and percent of positive samples. Four sets of percentage data, percent-growth-positive, germination extent, probability for one cell to grow, and maximum fraction of positive tubes, were obtained from our own experiments and the literature. These data were modeled using linear and logistic regression. Five methods were used to compare the goodness of fit of the two models: percentage of predictions closer to observations, range of the differences (predicted value minus observed value), deviation of the model, linear regression between the observed and predicted values, and bias and accuracy factors. Logistic regression was a better predictor of at least 78% of the observations in all four data sets. In all cases, the deviation of logistic models was much smaller. The linear correlation between observations and logistic predictions was always stronger. Validation (accomplished using part of one data set) also demonstrated that the logistic model was more accurate in predicting new data points. Bias and accuracy factors were found to be less informative when evaluating models developed for percentage data, since neither of these indices can compare predictions at zero. Model simplification for the logistic model was demonstrated with one data set. The simplified model was as powerful in making predictions as the full linear model, and it also gave clearer insight in determining the key experimental factors. PMID:11319091
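
    One reason a logistic form suits percentage data can be seen in a small sketch. The data below are invented for illustration, and the logistic curve is fitted by least squares on the logit scale, which is an assumption of this sketch and not necessarily the fitting method used in the paper. Note that the logit is undefined at exactly 0% or 100%, so such values would need separate handling.

```python
import math

def least_squares(xs, ys):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope

def logit(p):
    return math.log(p / (1.0 - p))

def inv_logit(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical fractions of growth-positive samples at six temperatures (deg C).
temps = [5, 10, 15, 20, 25, 30]
frac = [0.02, 0.08, 0.30, 0.70, 0.92, 0.98]

a_lin, b_lin = least_squares(temps, frac)                      # straight line
a_log, b_log = least_squares(temps, [logit(p) for p in frac])  # line on logit scale

for t in (0, 35):  # predictions just outside the observed range
    linear_pred = a_lin + b_lin * t
    logistic_pred = inv_logit(a_log + b_log * t)
    print(t, round(linear_pred, 3), round(logistic_pred, 3))
```

    The straight line predicts fractions below 0 and above 1 outside the observed range, while the logistic predictions stay inside the unit interval by construction.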

  16. Prediction of siRNA potency using sparse logistic regression.

    PubMed

    Hu, Wei; Hu, John

    2014-06-01

    RNA interference (RNAi) can modulate gene expression at post-transcriptional as well as transcriptional levels. Short interfering RNA (siRNA) serves as a trigger for the RNAi gene inhibition mechanism, and therefore is a crucial intermediate step in RNAi. There have been extensive studies to identify the sequence characteristics of potent siRNAs. One such study built a linear model using LASSO (Least Absolute Shrinkage and Selection Operator) to measure the contribution of each siRNA sequence feature. This model is simple and interpretable, but it requires a large number of nonzero weights. We have introduced a novel technique, sparse logistic regression, to build a linear model using single-position specific nucleotide compositions which has the same prediction accuracy as the linear model based on LASSO. The weights in our new model share the same general trend as those in the previous model, but have only 25 nonzero weights out of a total of 84 weights, a 54% reduction compared to the previous model. In contrast to the linear model based on LASSO, our model suggests that only a few positions are influential on the efficacy of the siRNA, which are the 5' and 3' ends and the seed region of siRNA sequences. We also employed sparse logistic regression to build a linear model using dual-position specific nucleotide compositions, a task LASSO is not able to accomplish well due to its high dimensional nature. Our results demonstrate the superiority of sparse logistic regression as a technique for both feature selection and regression over LASSO in the context of siRNA design. PMID:21091052
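
    How a sparsity-inducing penalty drives weights to exactly zero can be illustrated with a generic sketch. The abstract does not specify the algorithm behind its sparse logistic regression, so the version below, an L1-penalized logistic loss minimized by proximal gradient descent, is an assumption chosen for simplicity; the toy features are hypothetical stand-ins for sequence features.

```python
import itertools
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def soft_threshold(value, amount):
    """Shrink toward zero; values within `amount` of zero become exactly zero."""
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0

def fit_l1_logistic(X, y, penalty, step=0.5, iterations=500):
    """Proximal gradient descent on mean log-loss + penalty * sum(|w_j|)."""
    n, p = len(X), len(X[0])
    w, bias = [0.0] * p, 0.0
    for _ in range(iterations):
        grad_w, grad_b = [0.0] * p, 0.0
        for row, label in zip(X, y):
            err = sigmoid(bias + sum(wj * xj for wj, xj in zip(w, row))) - label
            grad_b += err / n
            for j, xj in enumerate(row):
                grad_w[j] += err * xj / n
        bias -= step * grad_b  # the intercept is not penalized
        w = [soft_threshold(wj - step * gj, step * penalty)
             for wj, gj in zip(w, grad_w)]
    return w, bias

# Toy design: every +/-1 combination of five hypothetical features.
X = [list(row) for row in itertools.product([-1.0, 1.0], repeat=5)]
# Only the first two features drive the label; the other three are irrelevant.
y = [1.0 if row[0] > 0 or row[1] > 0 else 0.0 for row in X]

weights, bias = fit_l1_logistic(X, y, penalty=0.05)
nonzero = [j for j, wj in enumerate(weights) if wj != 0.0]
print("nonzero weights at positions:", nonzero)
```

    The soft-thresholding step is what produces exact zeros, so the fitted model doubles as a feature selector: irrelevant features are dropped rather than merely given small weights.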

  17. Using and Interpreting Logistic Regression: A Guide for Teachers and Students.

    ERIC Educational Resources Information Center

    Lottes, Ilsa L.; And Others

    1996-01-01

    Defines and illustrates basic concepts of dichotomous logistic regression (DLR) and presents strategies for teaching these concepts. Strategies include using analogies between ordinary least squares regression and logistic regression; illustrating concepts with contingency tables; and linking logistic regression concepts to interpretation of…

  18. Robust Logistic and Probit Methods for Binary and Multinomial Regression

    PubMed Central

    Tabatabai, MA; Li, H; Eby, WM; Kengwoung-Keumo, JJ; Manne, U; Bae, S; Fouad, M; Singh, KP

    2015-01-01

    In this paper we introduce new robust estimators for the logistic and probit regressions for binary, multinomial, nominal and ordinal data and apply these models to estimate the parameters when outliers or influential observations are present. Maximum likelihood estimates don't behave well when outliers or influential observations are present. One remedy is to remove influential observations from the data and then apply the maximum likelihood technique on the deleted data. Another approach is to employ a robust technique that can handle outliers and influential observations without removing any observations from the data sets. The robustness of the method is tested using real and simulated data sets. PMID:26078914

  19. Ranked Tag Recommendation Systems Based on Logistic Regression

    NASA Astrophysics Data System (ADS)

    Quevedo, J. R.; Montañés, E.; Ranilla, J.; Díaz, I.

    This work proposes an approach to tag recommendation based on a logistic regression based system. The goal of the method is to support users of current social network systems by providing a rank of new meaningful tags for a resource. This system provides a ranked tag set and it feeds on different posts depending on the resource for which the user requests the recommendation. The performance of this approach is tested according to several evaluation measures, one of them proposed in this paper (F_1^+). The experiments show that this learning system outperforms certain benchmark recommenders.

  20. [Outliers and robust logistic regression in Health Sciences].

    PubMed

    Cutanda Henríquez, Francisco

    2008-01-01

    Logistic regression methods have many applications in Health Sciences. There is a vast literature about procedures to be followed and the way to find the estimators for the parameters from the observed values, and these methods are implemented in all the usual statistical packages. These estimators are of the 'maximum likelihood' kind, i.e., they are the ones that make the observed values the most probable among all the models that could have been used. The good properties of the maximum likelihood estimators are widely demonstrated. However, there are some practical circumstances that may cause the presence of 'outliers', i.e., observed values not corresponding to the logistic model we are assuming as a hypothesis. Occasionally, these anomalous observations can have a strong effect on the fit, and lead the study to the wrong conclusion. The causes of these outliers depend on the particular study, but it is possible to point out classification errors, observations (subjects) with special features which have not been taken into account, uncertainty in the measurement of some parameters, etc. The problem with maximum likelihood estimators is that they are not 'robust', i.e., their sensitivity to outliers could be arbitrarily large, and a minority of outliers could lead to a wrong logistic model. In this work, we will show two cases illustrating possible consequences, and we will discuss the application of robust methods. PMID:19180273

  1. Evaluating sediment chemistry and toxicity data using logistic regression modeling

    USGS Publications Warehouse

    Field, L.J.; MacDonald, D.D.; Norton, S.B.; Severn, C.G.; Ingersoll, C.G.

    1999-01-01

    This paper describes the use of logistic-regression modeling for evaluating matching sediment chemistry and toxicity data. Contaminant- specific logistic models were used to estimate the percentage of samples expected to be toxic at a given concentration. These models enable users to select the probability of effects of concern corresponding to their specific assessment or management objective or to estimate the probability of observing specific biological effects at any contaminant concentration. The models were developed using a large database (n = 2,524) of matching saltwater sediment chemistry and toxicity data for field-collected samples compiled from a number of different sources and geographic areas. The models for seven chemicals selected as examples showed a wide range in goodness of fit, reflecting high variability in toxicity at low concentrations and limited data on toxicity at higher concentrations for some chemicals. The models for individual test endpoints (e.g., amphipod mortality) provided a better fit to the data than the models based on all endpoints combined. A comparison of the relative sensitivity of two amphipod species to specific contaminants illustrated an important application of the logistic model approach.
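
    Selecting a concentration that corresponds to a chosen probability of effects amounts to inverting the fitted logistic curve. The sketch below assumes a model on log10 concentration and uses invented coefficients; neither the functional form nor the numbers are taken from the paper.

```python
import math

def probability_of_toxicity(concentration, intercept, slope):
    """Logistic model on log10 concentration (an assumed functional form)."""
    z = intercept + slope * math.log10(concentration)
    return 1.0 / (1.0 + math.exp(-z))

def concentration_at_probability(p, intercept, slope):
    """Invert the model: the concentration at which the predicted probability is p."""
    log_conc = (math.log(p / (1.0 - p)) - intercept) / slope
    return 10.0 ** log_conc

# Hypothetical coefficients for illustration only; not values from the paper.
B0, B1 = -3.0, 2.0

for target in (0.2, 0.5, 0.8):
    conc = concentration_at_probability(target, B0, B1)
    check = probability_of_toxicity(conc, B0, B1)
    print(target, round(conc, 2), round(check, 3))
```

    A user with a more protective management objective would pick a lower target probability and read off the correspondingly lower concentration.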

  2. Landslide Hazard Mapping in Rwanda Using Logistic Regression

    NASA Astrophysics Data System (ADS)

    Piller, A.; Anderson, E.; Ballard, H.

    2015-12-01

    Landslides in the United States cause more than $1 billion in damages and 50 deaths per year (USGS 2014). Globally, figures are much more grave, yet monitoring, mapping and forecasting of these hazards are less than adequate. Seventy-five percent of the population of Rwanda earns a living from farming, mostly subsistence. Loss of farmland, housing, or life, to landslides is a very real hazard. Landslides in Rwanda have an impact at the economic, social, and environmental level. In a developing nation that faces challenges in tracking, cataloging, and predicting the numerous landslides that occur each year, satellite imagery and spatial analysis allow for remote study. We have focused on the development of a landslide inventory and a statistical methodology for assessing landslide hazards. Using logistic regression on approximately 30 test variables (e.g., slope, soil type, and land cover) and a sample of over 200 landslides, we determine which variables are statistically most relevant to landslide occurrence in Rwanda. A preliminary predictive hazard map for Rwanda has been produced, using the variables selected from the logistic regression analysis.

  3. Change point testing in logistic regression models with interaction term.

    PubMed

    Fong, Youyi; Di, Chongzhi; Permar, Sallie

    2015-04-30

    A threshold effect takes place in situations where the relationship between an outcome variable and a predictor variable changes as the predictor value crosses a certain threshold/change point. Threshold effects are often plausible in a complex biological system, especially in defining immune responses that are protective against infections such as HIV-1, which motivates the current work. We study two hypothesis testing problems in change point models. We first compare three different approaches to obtaining a p-value for the maximum of scores test in a logistic regression model with change point variable as a main effect. Next, we study the testing problem in a logistic regression model with the change point variable both as a main effect and as part of an interaction term. We propose a test based on the maximum of likelihood ratios test statistic and obtain its reference distribution through a Monte Carlo method. We also propose a maximum of weighted scores test that can be more powerful than the maximum of likelihood ratios test when we know the direction of the interaction effect. In simulation studies, we show that the proposed tests have a correct type I error and higher power than several existing methods. We illustrate the application of change point model-based testing methods in a recent study of immune responses that are associated with the risk of mother to child transmission of HIV-1. PMID:25612253

  4. Metadata for ReVA logistic regression dataset

    USGS Publications Warehouse

    LaMotte, Andrew E.

    2004-01-01

    The U.S. Geological Survey, in cooperation with the U.S. Environmental Protection Agency's Regional Vulnerability Assessment Program, has developed a set of statistical tools to support regional-scale, ground-water quality and vulnerability assessments. The Regional Vulnerability Assessment Program goals are to develop and demonstrate approaches to comprehensive, regional-scale assessments that effectively inform water-resources managers and decision-makers as to the magnitude, extent, distribution, and uncertainty of current and anticipated environmental risks. The U.S. Geological Survey is developing and exploring the use of statistical probability models to characterize the relation between ground-water quality and geographic factors in the Mid-Atlantic Region. Available water-quality data obtained from U.S. Geological Survey National Water-Quality Assessment Program studies conducted in the Mid-Atlantic Region were used in association with geographic data (land cover, geology, soils, and others) to develop logistic-regression equations that use explanatory variables to predict the presence of a selected water-quality parameter exceeding specified management concentration thresholds. The resulting logistic-regression equations were transformed to determine the probability, P(X), of a water-quality parameter exceeding a specified management threshold. Additional statistical procedures modified by the U.S. Geological Survey were used to compare the observed values to model-predicted values at each sample point. In addition, procedures to evaluate the confidence of the model predictions and estimate the uncertainty of the probability value were developed and applied. The resulting logistic-regression models were applied to the Mid-Atlantic Region to predict the spatial probability of nitrate concentrations exceeding specified management thresholds. These thresholds are usually set or established by regulators or managers at national or local levels. At management

  5. Drought Patterns Forecasting using an Auto-Regressive Logistic Model

    NASA Astrophysics Data System (ADS)

    del Jesus, M.; Sheffield, J.; Méndez Incera, F. J.; Losada, I. J.; Espejo, A.

    2014-12-01

    Drought is characterized by a water deficit that may manifest across a large range of spatial and temporal scales. Drought may create important socio-economic consequences, many times of catastrophic dimensions. A quantifiable definition of drought is elusive because, depending on its impacts, consequences and generation mechanism, different water deficit periods may be identified as a drought by virtue of some definitions but not by others. Droughts are linked to the water cycle and, although a climate change signal may not have emerged yet, they are also intimately linked to climate. In this work we develop an auto-regressive logistic model for drought prediction at different temporal scales that makes use of a spatially explicit framework. Our model allows covariates, continuous or categorical, to be included to improve the performance of the auto-regressive component. Our approach makes use of dimensionality reduction (principal component analysis) and classification techniques (K-Means and maximum dissimilarity) to simplify the representation of complex climatic patterns, such as sea surface temperature (SST) and sea level pressure (SLP), while including information on their spatial structure, i.e. considering their spatial patterns. This procedure allows us to include in the analysis multivariate representations of complex climatic phenomena, such as the El Niño-Southern Oscillation. We also explore the impact of other climate-related variables such as sun spots. The model allows the uncertainty of the forecasts to be quantified and can be easily adapted to make predictions under future climatic scenarios. The framework presented here may be extended to other applications such as flash flood analysis, or risk assessment of natural hazards.

  6. The effect of high leverage points on the logistic ridge regression estimator having multicollinearity

    NASA Astrophysics Data System (ADS)

    Ariffin, Syaiba Balqish; Midi, Habshah

    2014-06-01

    This article is concerned with the performance of the logistic ridge regression estimation technique in the presence of multicollinearity and high leverage points. In logistic regression, multicollinearity exists among predictors and in the information matrix. The maximum likelihood estimator suffers a huge setback in the presence of multicollinearity, which causes regression estimates to have unduly large standard errors. To remedy this problem, a logistic ridge regression estimator is put forward. It is evident that the logistic ridge regression estimator outperforms the maximum likelihood approach for handling multicollinearity. The effect of high leverage points on the performance of the logistic ridge regression estimator is then investigated through a real data set and a simulation study. The findings signify that the logistic ridge regression estimator fails to provide better parameter estimates in the presence of both high leverage points and multicollinearity.
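
    One common way to write a logistic ridge estimator is as a penalized maximum likelihood problem, shown below. The notation (\lambda for the ridge parameter, W for the diagonal matrix of fitted variances) is assumed here, and the article may use a different but related closed-form variant.

```latex
\[
\hat{\beta}_{\text{ridge}}
  = \arg\max_{\beta}\;
    \sum_{i=1}^{n} \Bigl[ y_i \, x_i^{\top}\beta - \log\bigl(1 + e^{x_i^{\top}\beta}\bigr) \Bigr]
    \;-\; \lambda \sum_{j=1}^{p} \beta_j^{2},
  \qquad \lambda \ge 0 .
\]
% \lambda = 0 recovers the maximum likelihood estimator.
% Each Newton (IRLS) step solves a system in X^{\top} W X + 2\lambda I instead of X^{\top} W X,
% which stays well conditioned even when the columns of X are nearly collinear.
```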

  7. On the Usefulness of a Multilevel Logistic Regression Approach to Person-Fit Analysis

    ERIC Educational Resources Information Center

    Conijn, Judith M.; Emons, Wilco H. M.; van Assen, Marcel A. L. M.; Sijtsma, Klaas

    2011-01-01

    The logistic person response function (PRF) models the probability of a correct response as a function of the item locations. Reise (2000) proposed to use the slope parameter of the logistic PRF as a person-fit measure. He reformulated the logistic PRF model as a multilevel logistic regression model and estimated the PRF parameters from this…

  8. Concept formation vs. logistic regression: predicting death in trauma patients.

    PubMed

    Hadzikadic, M; Hakenewerth, A; Bohren, B; Norton, J; Mehta, B; Andrews, C

    1996-10-01

    This study compares two classification models used to predict survival of injured patients entering the emergency department. Concept formation is a machine learning technique that summarizes known example cases in the form of a tree. After the tree is constructed, it can then be used to predict the classification of new cases. Logistic regression, on the other hand, is a statistical model that allows for a quantitative relationship for a dichotomous event with several independent variables. The outcome (dependent) variable must have only two choices, e.g. does or does not occur, alive or dead, etc. The result of this model is an equation which is then used to predict the probability of class membership of a new case. The two models were evaluated on a trauma registry database composed of information on all trauma patients admitted in 1992 to a Level I trauma center. A total of 2155 records, representing all trauma patients admitted for more than 24 h or who died in the Emergency Department, were grouped into two databases as follows: (1) discharge status of 'died' (containing 151 records), and (2) any discharge status other than 'died' (containing 2004 records). Both databases contained the same variables. PMID:8955858

  9. Application of Bayesian Logistic Regression to Mining Biomedical Data

    PubMed Central

    Avali, Viji R.; Cooper, Gregory F.; Gopalakrishnan, Vanathi

    2014-01-01

    Mining high dimensional biomedical data with existing classifiers is challenging and the predictions are often inaccurate. We investigated the use of Bayesian Logistic Regression (B-LR) for mining such data to predict and classify various disease conditions. The analysis was done on twelve biomedical datasets with binary class variables and the performance of B-LR was compared to those from other popular classifiers on these datasets with 10-fold cross validation using the WEKA data mining toolkit. The statistical significance of the results was analyzed by paired two tailed t-tests and non-parametric Wilcoxon signed-rank tests. We observed overall that B-LR with non-informative Gaussian priors performed on par with other classifiers in terms of accuracy, balanced accuracy and AUC. These results suggest that it is worthwhile to explore the application of B-LR to predictive modeling tasks in bioinformatics using informative biological prior probabilities. With informative prior probabilities, we conjecture that the performance of B-LR will improve. PMID:25954328
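
    The role of the Gaussian prior can be made explicit by writing out the log-posterior. The zero-mean prior with a common variance \sigma^2 is an assumption made here for illustration; the abstract does not state the exact prior parameterization.

```latex
\[
\beta_j \sim \mathcal{N}(0, \sigma^2), \qquad
\log p(\beta \mid y, X)
  = \sum_{i=1}^{n} \Bigl[ y_i \, x_i^{\top}\beta - \log\bigl(1 + e^{x_i^{\top}\beta}\bigr) \Bigr]
    - \frac{1}{2\sigma^2} \sum_{j=1}^{p} \beta_j^{2} + \text{const}.
\]
% A large \sigma^2 (a non-informative prior) makes the last term negligible, so the posterior
% mode approaches the maximum likelihood estimate. An informative prior would replace the
% zero mean and common variance with feature-specific values drawn from biological knowledge.
```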

  10. Efficient Logistic Regression Designs Under an Imperfect Population Identifier

    PubMed Central

    Albert, Paul S.; Liu, Aiyi; Nansel, Tonja

    2013-01-01

    Summary: Motivated by actual study designs, this article considers efficient logistic regression designs where the population is identified with a binary test that is subject to diagnostic error. We consider the case where the imperfect test is obtained on all participants, while the gold standard test is measured on a small chosen subsample. Under maximum-likelihood estimation, we evaluate the optimal design in terms of sample selection as well as verification. We show that there may be substantial efficiency gains by choosing a small percentage of individuals who test negative on the imperfect test for inclusion in the sample (e.g., verifying 90% test-positive cases). We also show that a two-stage design may be a good practical alternative to a fixed design in some situations. Under optimal and nearly optimal designs, we compare maximum-likelihood and semi-parametric efficient estimators under correct and misspecified models with simulations. The methodology is illustrated with an analysis from a diabetes behavioral intervention trial. PMID:24261471

  11. Risk factors for temporomandibular disorder: Binary logistic regression analysis

    PubMed Central

    Magalhães, Bruno G.; de-Sousa, Stéphanie T.; de Mello, Victor V C.; da-Silva-Barbosa, André C.; de-Assis-Morais, Mariana P L.; Barbosa-Vasconcelos, Márcia M V.

    2014-01-01

    Objectives: To analyze the influence of socioeconomic and demographic factors (gender, economic class, age and marital status) on the occurrence of temporomandibular disorder. Study Design: One hundred individuals from urban areas in the city of Recife (Brazil) registered at Family Health Units were examined using Axis I of the Research Diagnostic Criteria for Temporomandibular Disorders (RDC/TMD), which addresses myofascial pain and joint problems (disc displacement, arthralgia, osteoarthritis and osteoarthrosis). The Brazilian Economic Classification Criteria (CCEB) was used for the collection of socioeconomic and demographic data. Participants were then categorized as Class A (high social class), Classes B/C (middle class) and Classes D/E (very poor social class). The results were analyzed using Pearson’s chi-square test for proportions, Fisher’s exact test, nonparametric Mann-Whitney test and binary logistic regression analysis. Results: None of the participants belonged to Class A, 72% belonged to Classes B/C and 28% belonged to Classes D/E. The multivariate analysis revealed that participants from Classes D/E had a 4.35-fold greater chance of exhibiting myofascial pain and 11.3-fold greater chance of exhibiting joint problems. Conclusions: Poverty is an important condition for exhibiting myofascial pain and joint problems. Key words: Temporomandibular joint disorders, risk factors, prevalence. PMID:24316706

  12. Autoregressive logistic regression applied to atmospheric circulation patterns

    NASA Astrophysics Data System (ADS)

    Guanche, Y.; Mínguez, R.; Méndez, F. J.

    2014-01-01

    Autoregressive logistic regression models have been successfully applied in medical and pharmacology research fields, and in simple models to analyze weather types. The main purpose of this paper is to introduce a general framework to study atmospheric circulation patterns capable of dealing simultaneously with seasonality, interannual variability, long-term trends, and autocorrelation of different orders. To show its effectiveness on modeling performance, daily atmospheric circulation patterns identified from observed sea level pressure fields over the Northeastern Atlantic have been analyzed using this framework. Model predictions are compared with probabilities from the historical database, showing very good fitting diagnostics. In addition, the fitted model is used to simulate the evolution over time of atmospheric circulation patterns using the Monte Carlo method. Simulation results are statistically consistent with respect to the historical sequence in terms of (1) probability of occurrence of the different weather types, (2) transition probabilities and (3) persistence. The proposed model constitutes an easy-to-use and powerful tool for a better understanding of the climate system.
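
    A generic autoregressive multinomial logistic model with K weather types and lag order d can be written as below. This is a sketch in assumed notation, not the paper's exact specification: Y_t is the weather type on day t and x_t collects the covariates.

```latex
\[
P(Y_t = k \mid Y_{t-1}, \dots, Y_{t-d}, x_t)
  = \frac{\exp(\eta_{k,t})}{\sum_{m=1}^{K} \exp(\eta_{m,t})},
\qquad
\eta_{k,t} = \alpha_k + x_t^{\top}\beta_k
  + \sum_{j=1}^{d} \sum_{m=1}^{K-1} \gamma_{k,j,m}\, \mathbb{1}\{Y_{t-j} = m\},
\]
% with \eta_{K,t} = 0 for the reference type K.
% x_t can hold harmonic terms for seasonality, climate indices for interannual variability,
% and a linear term in t for long-term trends; the lagged indicators carry the
% autocorrelation of order d.
```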

  13. Analyzing Student Learning Outcomes: Usefulness of Logistic and Cox Regression Models. IR Applications, Volume 5

    ERIC Educational Resources Information Center

    Chen, Chau-Kuang

    2005-01-01

    Logistic and Cox regression methods are practical tools used to model the relationships between certain student learning outcomes and their relevant explanatory variables. The logistic regression model fits an S-shaped curve into a binary outcome with data points of zero and one. The Cox regression model allows investigators to study the duration…

  14. Mixed-Effects Logistic Regression Models for Indirectly Observed Discrete Outcome Variables

    ERIC Educational Resources Information Center

    Vermunt, Jeroen K.

    2005-01-01

    A well-established approach to modeling clustered data introduces random effects in the model of interest. Mixed-effects logistic regression models can be used to predict discrete outcome variables when observations are correlated. An extension of the mixed-effects logistic regression model is presented in which the dependent variable is a latent…

  15. What Are the Odds of that? A Primer on Understanding Logistic Regression

    ERIC Educational Resources Information Center

    Huang, Francis L.; Moon, Tonya R.

    2013-01-01

    The purpose of this Methodological Brief is to present a brief primer on logistic regression, a commonly used technique when modeling dichotomous outcomes. Using data from the National Education Longitudinal Study of 1988 (NELS:88), logistic regression techniques were used to investigate student-level variables in eighth grade (i.e., enrolled in a…

  16. Power and Sample Size Calculations for Logistic Regression Tests for Differential Item Functioning

    ERIC Educational Resources Information Center

    Li, Zhushan

    2014-01-01

    Logistic regression is a popular method for detecting uniform and nonuniform differential item functioning (DIF) effects. Theoretical formulas for the power and sample size calculations are derived for likelihood ratio tests and Wald tests based on the asymptotic distribution of the maximum likelihood estimators for the logistic regression model.…

  17. Comparing Methodologies for Developing an Early Warning System: Classification and Regression Tree Model versus Logistic Regression. REL 2015-077

    ERIC Educational Resources Information Center

    Koon, Sharon; Petscher, Yaacov

    2015-01-01

    The purpose of this report was to explicate the use of logistic regression and classification and regression tree (CART) analysis in the development of early warning systems. It was motivated by state education leaders' interest in maintaining high classification accuracy while simultaneously improving practitioner understanding of the rules…

  18. Regularized logistic regression and multiobjective variable selection for classifying MEG data.

    PubMed

    Santana, Roberto; Bielza, Concha; Larrañaga, Pedro

    2012-09-01

    This paper addresses the question of maximizing classifier accuracy for classifying task-related mental activity from magnetoencephalography (MEG) data. We propose the use of different sources of information and introduce an automatic channel selection procedure. To determine an informative set of channels, our approach combines a variety of machine learning algorithms: feature subset selection methods, classifiers based on regularized logistic regression, information fusion, and multiobjective optimization based on probabilistic modeling of the search space. The experimental results show that our proposal is able to improve classification accuracy compared to approaches whose classifiers use only one type of MEG information or for which the set of channels is fixed a priori. PMID:22854976

  19. Nomographic representation of logistic regression models: a case study using patient self-assessment data.

    PubMed

    Dreiseitl, Stephan; Harbauer, Alexandra; Binder, Michael; Kittler, Harald

    2005-10-01

    Logistic regression models are widely used in medicine, but difficult to apply without the aid of electronic devices. In this paper, we present a novel approach to represent logistic regression models as nomograms that can be evaluated by simple line drawings. As a case study, we show how data obtained from a questionnaire-based patient self-assessment study on the risks of developing melanoma can be used to first identify a subset of significant covariates, build a logistic regression model, and finally transform the model to a graphical format. The advantage of the nomogram is that it can easily be mass-produced, distributed and evaluated, while providing the same information as the logistic regression model it represents. PMID:16198997

  20. A solution to the problem of separation in logistic regression.

    PubMed

    Heinze, Georg; Schemper, Michael

    2002-08-30

    The phenomenon of separation or monotone likelihood is observed in the fitting process of a logistic model if the likelihood converges while at least one parameter estimate diverges to +/- infinity. Separation primarily occurs in small samples with several unbalanced and highly predictive risk factors. A procedure by Firth originally developed to reduce the bias of maximum likelihood estimates is shown to provide an ideal solution to separation. It produces finite parameter estimates by means of penalized maximum likelihood estimation. Corresponding Wald tests and confidence intervals are available but it is shown that penalized likelihood ratio tests and profile penalized likelihood confidence intervals are often preferable. The clear advantage of the procedure over previous options of analysis is impressively demonstrated by the statistical analysis of two cancer studies. PMID:12210625
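
    The penalized-likelihood idea in the abstract above can be illustrated with a minimal sketch. It assumes the usual Firth-type modified score for logistic regression, X'(y - p + h(0.5 - p)), where h holds the hat-matrix leverages; it is not the authors' implementation, and the function name `firth_logistic` and the toy data are hypothetical.

```python
import numpy as np

def firth_logistic(X, y, n_iter=100, tol=1e-8):
    """Firth-type penalized logistic regression (illustrative sketch).

    Iterates beta <- beta + (X'WX)^{-1} U*(beta), where U* is the
    modified score X'(y - p + h * (0.5 - p)).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        info = X.T @ (w[:, None] * X)            # Fisher information X'WX
        info_inv = np.linalg.inv(info)
        # Leverages h_i = w_i * x_i' (X'WX)^{-1} x_i
        h = w * np.einsum("ij,jk,ik->i", X, info_inv, X)
        score = X.T @ (y - p + h * (0.5 - p))    # modified score
        step = info_inv @ score
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Hypothetical toy data with complete separation: y = 1 exactly when x > 0,
# so the ordinary maximum likelihood slope would diverge.
x = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
y = np.array([0, 0, 0, 1, 1, 1])
X = np.column_stack([np.ones_like(x), x])
beta_hat = firth_logistic(X, y)
print(beta_hat)  # finite intercept and slope despite separation
```

    The sketch uses plain Newton-type steps without step-halving; a production implementation would likely add safeguards against overly large steps.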

  1. Weighted pseudometric discriminatory power improvement using a Bayesian logistic regression model based on a variational method.

    PubMed

    Ksantini, Riadh; Ziou, Djemel; Colin, Bernard; Dubeau, François

    2008-02-01

    In this paper, we investigate the effectiveness of a Bayesian logistic regression model to compute the weights of a pseudo-metric, in order to improve its discriminatory capacity and thereby increase image retrieval accuracy. In the proposed Bayesian model, the prior knowledge of the observations is incorporated and the posterior distribution is approximated by a tractable Gaussian form using variational transformation and Jensen's inequality, which allow a fast and straightforward computation of the weights. The pseudo-metric makes use of the compressed and quantized versions of wavelet decomposed feature vectors, and in our previous work, the weights were adjusted by a classical logistic regression model. A comparative evaluation of the Bayesian and classical logistic regression models is performed for content-based image retrieval as well as for other classification tasks, in a decontextualized evaluation framework. In this same framework, we compare the Bayesian logistic regression model to some relevant state-of-the-art classification algorithms. Experimental results show that the Bayesian logistic regression model outperforms these linear classification algorithms, and is a significantly better tool than the classical logistic regression model to compute the pseudo-metric weights and improve retrieval and classification performance. Finally, we perform a comparison with results obtained by other retrieval methods. PMID:18084057

  2. Modelling of binary logistic regression for obesity among secondary students in a rural area of Kedah

    NASA Astrophysics Data System (ADS)

    Kamaruddin, Ainur Amira; Ali, Zalila; Noor, Norlida Mohd.; Baharum, Adam; Ahmad, Wan Muhamad Amir W.

    2014-07-01

    Logistic regression analysis examines the influence of various factors on a dichotomous outcome by estimating the probability of the event's occurrence. Logistic regression, also called a logit model, is a statistical procedure used to model dichotomous outcomes. In the logit model the log odds of the dichotomous outcome is modeled as a linear combination of the predictor variables. The log odds ratio in logistic regression provides a description of the probabilistic relationship of the variables and the outcome. In conducting logistic regression, selection procedures are used to select important predictor variables; diagnostics are used to check that assumptions are valid, including independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers; and a test statistic is calculated to determine the aptness of the model. This study used the binary logistic regression model to investigate overweight and obesity among rural secondary school students on the basis of their demographic profile, medical history, diet and lifestyle. The results indicate that overweight and obesity of students are influenced by obesity in the family and by the interaction between a student's ethnicity and routine meal intake. The odds of a student being overweight and obese are higher for a student having a family history of obesity and for a non-Malay student who frequently takes routine meals as compared to a Malay student.
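
    For reference, the logit model described in this abstract can be written out as follows. The symbols (p for the event probability, x_1, ..., x_k for the predictors, and beta_0, ..., beta_k for the coefficients) are notation chosen here for illustration and do not come from the record itself.

```latex
\[
  \log\frac{p}{1-p} \;=\; \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k ,
  \qquad
  p \;=\; \frac{1}{1 + \exp\!\bigl(-(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k)\bigr)} .
\]
% Holding the other predictors fixed, a one-unit increase in x_j
% multiplies the odds p/(1-p) by the odds ratio
\[
  \mathrm{OR}_j \;=\; \exp(\beta_j) .
\]
```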

  3. Performance and strategy comparisons of human listeners and logistic regression in discriminating underwater targets.

    PubMed

    Yang, Lixue; Chen, Kean

    2015-11-01

    To improve the design of underwater target recognition systems based on auditory perception, this study compared human listeners with automatic classifiers. Performance measures and strategies in three discrimination experiments, including discriminations between man-made and natural targets, between ships and submarines, and among three types of ships, were used. In the experiments, the subjects were asked to assign a score to each sound based on how confident they were about the category to which it belonged, and logistic regression, which represents linear discriminative models, also completed three similar tasks by utilizing many auditory features. The results indicated that the performance of logistic regression improved as the ratio between inter- and intra-class differences became larger, whereas the performance of the human subjects was limited by their unfamiliarity with the targets. Logistic regression performed better than the human subjects in all tasks but the discrimination between man-made and natural targets, and the strategies employed by excellent human subjects were similar to those of logistic regression. Logistic regression and several human subjects demonstrated similar performance when discriminating man-made and natural targets, but in this case, their strategies were not similar. An appropriate fusion of their strategies led to further improvement in recognition accuracy. PMID:26627787

  4. Solving Logistic Regression with Group Cardinality Constraints for Time Series Analysis

    PubMed Central

    Zhang, Yong; Pohl, Kilian M.

    2016-01-01

    We propose an algorithm to distinguish 3D+t images of healthy from diseased subjects by solving logistic regression based on cardinality-constrained group sparsity. This method reduces the risk of overfitting by providing an elegant solution to identifying anatomical regions most impacted by disease. It also ensures consistent identification across the time series by grouping each image feature across time and counting the number of non-zero groupings. While popular in medical imaging, group cardinality constrained problems are generally solved by relaxing counting with summing over the groupings. We instead solve the original problem by generalizing a penalty decomposition algorithm, which alternates between minimizing a logistic regression function with a regularizer based on the Frobenius norm and enforcing sparsity. Applied to 86 cine MRIs of healthy cases and subjects with Tetralogy of Fallot (TOF), our method correctly identifies regions impacted by TOF and obtains a statistically significantly higher classification accuracy than logistic regression without a group sparsity constraint or with a relaxed one.

  5. Use and interpretation of logistic regression in habitat-selection studies

    USGS Publications Warehouse

    Keating, K.A.; Cherry, S.

    2004-01-01

    Logistic regression is an important tool for wildlife habitat-selection studies, but the method frequently has been misapplied due to an inadequate understanding of the logistic model, its interpretation, and the influence of sampling design. To promote better use of this method, we review its application and interpretation under 3 sampling designs: random, case-control, and use-availability. Logistic regression is appropriate for habitat use-nonuse studies employing random sampling and can be used to directly model the conditional probability of use in such cases. Logistic regression also is appropriate for studies employing case-control sampling designs, but careful attention is required to interpret results correctly. Unless bias can be estimated or probability of use is small for all habitats, results of case-control studies should be interpreted as odds ratios, rather than probability of use or relative probability of use. When data are gathered under a use-availability design, logistic regression can be used to estimate approximate odds ratios if probability of use is small, at least on average. More generally, however, logistic regression is inappropriate for modeling habitat selection in use-availability studies. In particular, using logistic regression to fit the exponential model of Manly et al. (2002:100) does not guarantee maximum-likelihood estimates, valid probabilities, or valid likelihoods. We show that the resource selection function (RSF) commonly used for the exponential model is proportional to a logistic discriminant function. Thus, it may be used to rank habitats with respect to probability of use and to identify important habitat characteristics or their surrogates, but it is not guaranteed to be proportional to probability of use. Other problems associated with the exponential model also are discussed. We describe an alternative model based on Lancaster and Imbens (1996) that offers a method for estimating conditional probability of use in…

  6. Comparison of a Bayesian Network with a Logistic Regression Model to Forecast IgA Nephropathy

    PubMed Central

    Ducher, Michel; Kalbacher, Emilie; Combarnous, François; Finaz de Vilaine, Jérome; McGregor, Brigitte; Fouque, Denis; Fauvel, Jean Pierre

    2013-01-01

    Models are increasingly used in clinical practice to improve the accuracy of diagnosis. The aim of our work was to compare a Bayesian network to logistic regression to forecast IgA nephropathy (IgAN) from simple clinical and biological criteria. Retrospectively, we pooled the results of all biopsies (n = 155) performed by nephrologists in a specialist clinical facility between 2002 and 2009. Two groups were constituted at random. The first subgroup was used to determine the parameters of the models adjusted to data by logistic regression or Bayesian network, and the second was used to compare the performances of the models using receiver operating characteristic (ROC) curves. IgAN was found (on pathology) in 44 patients. Areas under the ROC curves provided by both methods were highly significant but not different from each other. Based on the highest Youden indices, sensitivity reached 100% versus 67% and specificity 73% versus 95% using the Bayesian network and logistic regression, respectively. A Bayesian network is at least as efficient as logistic regression to estimate the probability of a patient suffering IgAN, using simple clinical and biological data obtained during consultation. PMID:24328031
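
    As a small worked example of the Youden index mentioned in this abstract, the sketch below applies the standard definition J = sensitivity + specificity - 1 to the reported operating points. The function and variable names are hypothetical, and the resulting J values are computed here rather than quoted from the paper.

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0

# Operating points reported in the abstract (as proportions).
j_bayes_net = youden_index(1.00, 0.73)   # Bayesian network
j_logistic = youden_index(0.67, 0.95)    # logistic regression

print(round(j_bayes_net, 2), round(j_logistic, 2))
```

    J ranges from 0 (no better than chance) to 1 (perfect separation), so the cut-off with the highest J balances sensitivity and specificity with equal weight.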

  7. Comparison of Two Approaches for Handling Missing Covariates in Logistic Regression

    ERIC Educational Resources Information Center

    Peng, Chao-Ying Joanne; Zhu, Jin

    2008-01-01

    For the past 25 years, methodological advances have been made in missing data treatment. Most published work has focused on missing data in dependent variables under various conditions. The present study seeks to fill the void by comparing two approaches for handling missing data in categorical covariates in logistic regression: the…

  8. Multiple Logistic Regression Analysis of Cigarette Use among High School Students

    ERIC Educational Resources Information Center

    Adwere-Boamah, Joseph

    2011-01-01

    A binary logistic regression analysis was performed to predict high school students' cigarette smoking behavior from selected predictors from 2009 CDC Youth Risk Behavior Surveillance Survey. The specific target student behavior of interest was frequent cigarette use. Five predictor variables included in the model were: a) race, b) frequency of…

  9. Optimizing the Classification Performance of Logistic Regression and Fisher's Discriminant Analyses.

    ERIC Educational Resources Information Center

    Yarnold, Paul R.; And Others

    1994-01-01

    A methodology is proposed to optimize the training classification performance of any suboptimal model. The method, referred to as univariate optimal discriminant analysis (UniODA), is illustrated through application to a two-group logistic regression analysis with 12 empirical examples. Maximizing percentage accuracy in classification is…

  10. Heteroscedastic Extended Logistic Regression for Post-Processing of Ensemble Guidance

    NASA Astrophysics Data System (ADS)

    Messner, Jakob W.; Mayr, Georg J.; Wilks, Daniel S.; Zeileis, Achim

    2014-05-01

    To achieve well-calibrated probabilistic weather forecasts, numerical ensemble forecasts are often statistically post-processed. One recent ensemble-calibration method is extended logistic regression, which extends the popular logistic regression to yield full probability distribution forecasts. Although the purpose of this method is to post-process ensemble forecasts, usually only the ensemble mean is used as a predictor variable, whereas the ensemble spread is neglected because it does not improve the forecasts. In this study we show that when simply used as an ordinary predictor variable in extended logistic regression, the ensemble spread only affects the location but not the variance of the predictive distribution. Uncertainty information contained in the ensemble spread is therefore not utilized appropriately. To address this drawback we propose a new approach where the ensemble spread is directly used to predict the dispersion of the predictive distribution. With wind speed data and ensemble forecasts from the European Centre for Medium-Range Weather Forecasts (ECMWF) we show that using this approach, the ensemble spread can be used effectively to improve forecasts from extended logistic regression.
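
    The distinction between location and dispersion drawn in this abstract can be sketched in formulas. The notation below is an assumption made for illustration (Lambda for the logistic distribution function, q for a threshold, omega(q) for an increasing threshold transform, delta(x) for the location term, eta(x) for the log-scale term), and the exact parameterization used by the authors may differ.

```latex
% Extended logistic regression: the predictors x shift the location only.
\[
  P(y \le q \mid \mathbf{x}) \;=\; \Lambda\bigl(\omega(q) - \delta(\mathbf{x})\bigr),
  \qquad \Lambda(z) = \frac{1}{1 + e^{-z}} .
\]
% Heteroscedastic variant: a second linear predictor, typically driven by
% the ensemble spread, controls the dispersion of the predictive distribution.
\[
  P(y \le q \mid \mathbf{x}) \;=\;
  \Lambda\!\left(\frac{\omega(q) - \delta(\mathbf{x})}{\exp\bigl(\eta(\mathbf{x})\bigr)}\right) .
\]
```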

  11. A Comparison of Logistic Regression, Neural Networks, and Classification Trees Predicting Success of Actuarial Students

    ERIC Educational Resources Information Center

    Schumacher, Phyllis; Olinsky, Alan; Quinn, John; Smith, Richard

    2010-01-01

    The authors extended previous research by 2 of the authors who conducted a study designed to predict the successful completion of students enrolled in an actuarial program. They used logistic regression to determine the probability of an actuarial student graduating in the major or dropping out. They compared the results of this study with those…

  12. The Use and Interpretation of Logistic Regression in Higher Education Journals: 1988-1999.

    ERIC Educational Resources Information Center

    Peng, Chao-Ying Joanne; So, Tak-Shing Harry; Stage, Frances K.; St. John, Edward P.

    2002-01-01

    Examined use and interpretation of logistic regression in leading higher education journals. Found increasingly sophisticated use for a wide range of topics; however, there continued to be confusion over terminology. Sample sizes did not always achieve a desired level of stability in parameters estimated, and discussion of results in terms of…

  13. Strategies for Testing Statistical and Practical Significance in Detecting DIF with Logistic Regression Models

    ERIC Educational Resources Information Center

    Fidalgo, Angel M.; Alavi, Seyed Mohammad; Amirian, Seyed Mohammad Reza

    2014-01-01

    This study examines three controversial aspects in differential item functioning (DIF) detection by logistic regression (LR) models: first, the relative effectiveness of different analytical strategies for detecting DIF; second, the suitability of the Wald statistic for determining the statistical significance of the parameters of interest; and…

  14. Predictive Discriminant Analysis Versus Logistic Regression in Two-Group Classification Problems.

    ERIC Educational Resources Information Center

    Meshbane, Alice; Morris, John D.

    A method for comparing the cross-validated classification accuracies of predictive discriminant analysis and logistic regression classification models is presented under varying data conditions for the two-group classification problem. With this method, separate-group, as well as total-sample proportions of the correct classifications, can be…

  15. Iterative Purification and Effect Size Use with Logistic Regression for Differential Item Functioning Detection

    ERIC Educational Resources Information Center

    French, Brian F.; Maller, Susan J.

    2007-01-01

    Two unresolved implementation issues with logistic regression (LR) for differential item functioning (DIF) detection include ability purification and effect size use. Purification is suggested to control inaccuracies in DIF detection as a result of DIF items in the ability estimate. Additionally, effect size use may be beneficial in controlling…

  16. Risk Factors of Falls in Community-Dwelling Older Adults: Logistic Regression Tree Analysis

    ERIC Educational Resources Information Center

    Yamashita, Takashi; Noe, Douglas A.; Bailer, A. John

    2012-01-01

    Purpose of the Study: A novel logistic regression tree-based method was applied to identify fall risk factors and possible interaction effects of those risk factors. Design and Methods: A nationally representative sample of American older adults aged 65 years and older (N = 9,592) in the Health and Retirement Study 2004 and 2006 modules was used.…

  17. Propensity Score Estimation with Data Mining Techniques: Alternatives to Logistic Regression

    ERIC Educational Resources Information Center

    Keller, Bryan S. B.; Kim, Jee-Seon; Steiner, Peter M.

    2013-01-01

    Propensity score analysis (PSA) is a methodological technique which may correct for selection bias in a quasi-experiment by modeling the selection process using observed covariates. Because logistic regression is well understood by researchers in a variety of fields and easy to implement in a number of popular software packages, it has…

  18. Comparison of IRT Likelihood Ratio Test and Logistic Regression DIF Detection Procedures

    ERIC Educational Resources Information Center

    Atar, Burcu; Kamata, Akihito

    2011-01-01

    The Type I error rates and the power of IRT likelihood ratio test and cumulative logit ordinal logistic regression procedures in detecting differential item functioning (DIF) for polytomously scored items were investigated in this Monte Carlo simulation study. For this purpose, 54 simulation conditions (combinations of 3 sample sizes, 2 sample…

  19. A Note on Three Statistical Tests in the Logistic Regression DIF Procedure

    ERIC Educational Resources Information Center

    Paek, Insu

    2012-01-01

    Although logistic regression became one of the well-known methods in detecting differential item functioning (DIF), its three statistical tests, the Wald, likelihood ratio (LR), and score tests, which are readily available under the maximum likelihood, do not seem to be consistently distinguished in DIF literature. This paper provides a clarifying…

  20. Detecting DIF in Polytomous Items Using MACS, IRT and Ordinal Logistic Regression

    ERIC Educational Resources Information Center

    Elosua, Paula; Wells, Craig

    2013-01-01

    The purpose of the present study was to compare the Type I error rate and power of two model-based procedures, the mean and covariance structure model (MACS) and the item response theory (IRT), and an observed-score based procedure, ordinal logistic regression, for detecting differential item functioning (DIF) in polytomous items. A simulation…

  1. Landslide susceptibility mapping along road corridors in the Indian Himalayas using Bayesian logistic regression models

    NASA Astrophysics Data System (ADS)

    Das, Iswar; Stein, Alfred; Kerle, Norman; Dadhwal, Vinay K.

    2012-12-01

    Landslide susceptibility mapping (LSM) along road corridors in the Indian Himalayas is an essential exercise that helps planners and decision makers in determining the severity of probable slope failure areas. Logistic regression is commonly applied for this purpose, as it is a robust and straightforward technique that is relatively easy to handle. Ordinary logistic regression as a data-driven technique, however, does not allow inclusion of prior information. This study presents Bayesian logistic regression (BLR) for landslide susceptibility assessment along road corridors. The methodology is tested in a landslide-prone area in the Bhagirathi river valley in the Indian Himalayas. Parameter estimates from BLR are compared with those obtained from ordinary logistic regression. By means of iterative Markov Chain Monte Carlo simulation, BLR provides a rich set of results on parameter estimation. We assessed model performance by the receiver operator characteristics curve analysis, and validated the model using 50% of the landslide cells kept apart for testing and validation. The study concludes that BLR performs better in posterior parameter estimation in general and the uncertainty estimation in particular.

  2. Modeling Polytomous Item Responses Using Simultaneously Estimated Multinomial Logistic Regression Models

    ERIC Educational Resources Information Center

    Anderson, Carolyn J.; Verkuilen, Jay; Peyton, Buddy L.

    2010-01-01

    Survey items with multiple response categories and multiple-choice test questions are ubiquitous in psychological and educational research. We illustrate the use of log-multiplicative association (LMA) models that are extensions of the well-known multinomial logistic regression model for multiple dependent outcome variables to reanalyze a set of…

  3. Odds Ratio, Delta, ETS Classification, and Standardization Measures of DIF Magnitude for Binary Logistic Regression

    ERIC Educational Resources Information Center

    Monahan, Patrick O.; McHorney, Colleen A.; Stump, Timothy E.; Perkins, Anthony J.

    2007-01-01

    Previous methodological and applied studies that used binary logistic regression (LR) for detection of differential item functioning (DIF) in dichotomously scored items either did not report an effect size or did not employ several useful measures of DIF magnitude derived from the LR model. Equations are provided for these effect size indices.…

  4. A Generalized Logistic Regression Procedure to Detect Differential Item Functioning among Multiple Groups

    ERIC Educational Resources Information Center

    Magis, David; Raiche, Gilles; Beland, Sebastien; Gerard, Paul

    2011-01-01

    We present an extension of the logistic regression procedure to identify dichotomous differential item functioning (DIF) in the presence of more than two groups of respondents. Starting from the usual framework of a single focal group, we propose a general approach to estimate the item response functions in each group and to test for the presence…

  5. Mixed-Effects Logistic Regression for Estimating Transitional Probabilities in Sequentially Coded Observational Data

    ERIC Educational Resources Information Center

    Ozechowski, Timothy J.; Turner, Charles W.; Hops, Hyman

    2007-01-01

    This article demonstrates the use of mixed-effects logistic regression (MLR) for conducting sequential analyses of binary observational data. MLR is a special case of the mixed-effects logit modeling framework, which may be applied to multicategorical observational data. The MLR approach is motivated in part by G. A. Dagne, G. W. Howe, C. H.…

  6. Predictors of Placement Stability at the State Level: The Use of Logistic Regression to Inform Practice

    ERIC Educational Resources Information Center

    Courtney, Jon R.; Prophet, Retta

    2011-01-01

    Placement instability is often associated with a number of negative outcomes for children. To gain state level contextual knowledge of factors associated with placement stability/instability, logistic regression was applied to selected variables from the New Mexico Adoption and Foster Care Administrative Reporting System dataset. Predictors…

  7. PARAMETRIC AND NON PARAMETRIC (MARS: MULTIVARIATE ADDITIVE REGRESSION SPLINES) LOGISTIC REGRESSIONS FOR PREDICTION OF A DICHOTOMOUS RESPONSE VARIABLE WITH AN EXAMPLE FOR PRESENCE/ABSENCE OF AMPHIBIANS

    EPA Science Inventory

    The purpose of this report is to provide a reference manual that could be used by investigators for making informed use of logistic regression using two methods (standard logistic regression and MARS). The details for analyses of relationships between a dependent binary response ...

  8. Using logistic regression to describe the length of breastfeeding: a study in Guadalajara, Mexico.

    PubMed

    Gonzalez-Perez, G J; Vega-Lopez, M G; Cabrera-Pivaral, C

    1998-12-01

    This study seeks, through a logistic regression model, to describe the pattern of breastfeeding duration in Guadalajara, Mexico, during 1993. A multistage random sample of children under 1 year of age (n = 1036) was studied; observational data regarding breastfeeding duration, obtained through a "status quo" procedure, were compared with prevalence rates obtained from the logistic regression model. Modeling the duration of breastfeeding during the first year of life rather than only analyzing observational data helps researchers to understand this process in a dynamic and quantitative way. For example, uncommon indicators of breastfeeding were derived from the model. These indicators are impossible to obtain from observational data. The prevalence curve estimated through the logistic model was adequately fitted to observed data: there were no significant differences between the number or distribution of breastfed infants observed and those predicted by the model. Moreover, the model revealed that less than 40% of the children were breastfed in the fourth month of life; the median age for weaning was 39.3 days; 55% of the potential breastfeeding in the first 4 months did not occur; and the greatest abandonment of breastfeeding in the first 4 months was observed in the first 60 days. Thus, logistic regression seems a suitable option to construct a population-based model that describes breastfeeding duration during the first year of life. The indicators derived from the model offer health care providers valuable information for developing programs that promote breastfeeding. PMID:10205448

  9. A hybrid approach of stepwise regression, logistic regression, support vector machine, and decision tree for forecasting fraudulent financial statements.

    PubMed

    Chen, Suduan; Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As fraudulent financial statements by enterprises become an increasingly serious problem, establishing a valid model for forecasting them has become an important question for academic research and financial practice. After screening the important variables using stepwise regression, the study applies logistic regression, support vector machine, and decision tree methods to construct classification models and compares them. The study adopts financial and nonfinancial variables to assist in establishing the forecasting model. The research objects are companies with fraudulent and nonfraudulent financial statements between 1998 and 2012. The findings are that financial and nonfinancial information can be used effectively to distinguish fraudulent financial statements, and that the C5.0 decision tree has the best classification performance, at 85.71%. PMID:25302338

  10. A Hybrid Approach of Stepwise Regression, Logistic Regression, Support Vector Machine, and Decision Tree for Forecasting Fraudulent Financial Statements

    PubMed Central

    Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As fraudulent financial statements by enterprises become an increasingly serious problem, establishing a valid model for forecasting them has become an important question for academic research and financial practice. After screening the important variables using stepwise regression, the study applies logistic regression, support vector machine, and decision tree methods to construct classification models and compares them. The study adopts financial and nonfinancial variables to assist in establishing the forecasting model. The research objects are companies with fraudulent and nonfraudulent financial statements between 1998 and 2012. The findings are that financial and nonfinancial information can be used effectively to distinguish fraudulent financial statements, and that the C5.0 decision tree has the best classification performance, at 85.71%. PMID:25302338

  11. Using probabilities of enterococci exceedance and logistic regression to evaluate long term weekly beach monitoring data.

    PubMed

    Aranda, Diana; Lopez, Jose V; Solo-Gabriele, Helena M; Fleisher, Jay M

    2016-02-01

    Recreational water quality surveillance involves comparing bacterial levels to set threshold values to determine beach closure. Bacterial levels can be predicted through models which are traditionally based upon multiple linear regression. The objective of this study was to evaluate exceedance probabilities, as opposed to bacterial levels, as an alternate method to express beach risk. Data were incorporated into a logistic regression for the purpose of identifying environmental parameters most closely correlated with exceedance probabilities. The analysis was based on 7,422 historical sample data points from the years 2000-2010 for 15 South Florida beach sample sites. Probability analyses showed which beaches in the dataset were most susceptible to exceedances. No yearly trends were observed nor were any relationships apparent with monthly rainfall or hurricanes. Results from logistic regression analyses found that among the environmental parameters evaluated, tide was most closely associated with exceedances, with exceedances 2.475 times more likely to occur at high tide compared to low tide. The logistic regression methodology proved useful for predicting future exceedances at a beach location in terms of probability and modeling water quality environmental parameters with dependence on a binary response. This methodology can be used by beach managers for allocating resources when sampling more than one beach. PMID:26837832
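
    To make the reported tide effect concrete, the sketch below converts the figure of 2.475 into a logistic regression coefficient and applies it to a baseline probability. It assumes that 2.475 is an odds ratio (high tide versus low tide); the baseline exceedance probability of 0.05 is purely hypothetical and is not taken from the study.

```python
import math

# Reported in the abstract; interpreted here as an odds ratio (assumption).
odds_ratio_high_vs_low_tide = 2.475

# In a logistic model the coefficient is the log of the odds ratio.
beta_tide = math.log(odds_ratio_high_vs_low_tide)

def apply_odds_ratio(p_baseline, odds_ratio):
    """Scale the odds of a baseline probability and convert back to a probability."""
    odds = p_baseline / (1.0 - p_baseline)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)

p_low_tide = 0.05  # hypothetical baseline exceedance probability at low tide
p_high_tide = apply_odds_ratio(p_low_tide, odds_ratio_high_vs_low_tide)

print(round(beta_tide, 3), round(p_high_tide, 3))
```

    Because the odds ratio multiplies odds rather than probabilities, the implied probability at high tide is less than 2.475 times the baseline unless the baseline is very small.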

  12. Susceptibility mapping of shallow landslides using kernel-based Gaussian process, support vector machines and logistic regression

    NASA Astrophysics Data System (ADS)

    Colkesen, Ismail; Sahin, Emrehan Kutlug; Kavzoglu, Taskin

    2016-06-01

    Identification of landslide prone areas and production of accurate landslide susceptibility zonation maps have been crucial topics for hazard management studies. Since the prediction of susceptibility is one of the main processing steps in landslide susceptibility analysis, selection of a suitable prediction method plays an important role in the success of the susceptibility zonation process. Although simple statistical algorithms (e.g. logistic regression) have been widely used in the literature, the use of advanced non-parametric algorithms in landslide susceptibility zonation has recently become an active research topic. The main purpose of this study is to investigate the possible application of kernel-based Gaussian process regression (GPR) and support vector regression (SVR) for producing a landslide susceptibility map of the Tonya district of Trabzon, Turkey. Results of these two regression methods were compared with the logistic regression (LR) method, which is regarded as a benchmark method. Results showed that while the kernel-based GPR and SVR methods generally produced similar results (90.46% and 90.37%, respectively), they outperformed the conventional LR method by about 18%. While confirming the superiority of the GPR method, statistical tests based on ROC statistics, success rate and prediction rate curves revealed the significant improvement in susceptibility map accuracy by applying kernel-based GPR and SVR methods.

  13. EXpectation Propagation LOgistic REgRession (EXPLORER): Distributed Privacy-Preserving Online Model Learning

    PubMed Central

    Wang, Shuang; Jiang, Xiaoqian; Wu, Yuan; Cui, Lijuan; Cheng, Samuel; Ohno-Machado, Lucila

    2013-01-01

    We developed an EXpectation Propagation LOgistic REgRession (EXPLORER) model for distributed privacy-preserving online learning. The proposed framework provides a high level guarantee for protecting sensitive information, since the information exchanged between the server and the client is the encrypted posterior distribution of coefficients. Through experimental results, EXPLORER shows the same performance (e.g., discrimination, calibration, feature selection etc.) as the traditional frequentist Logistic Regression model, but provides more flexibility in model updating. That is, EXPLORER can be updated one point at a time rather than having to retrain the entire data set when new observations are recorded. The proposed EXPLORER supports asynchronized communication, which relieves the participants from coordinating with one another, and prevents service breakdown from the absence of participants or interrupted communications. PMID:23562651

  14. The use of logistic regression to enhance risk assessment and decision making by mental health administrators.

    PubMed

    Menditto, Anthony A; Linhorst, Donald M; Coleman, James C; Beck, Niels C

    2006-04-01

    Developing policies and procedures to contend with the risks presented by elopement, aggression, and suicidal behaviors is a long-standing challenge for mental health administrators. Guidance in making such judgments can be obtained through the use of a multivariate statistical technique known as logistic regression. This procedure can be used to develop a predictive equation that is mathematically formulated to use the best combination of predictors, rather than considering just one factor at a time. This paper presents an overview of logistic regression and its utility in mental health administrative decision making. A case example of its application is presented using data on elopements from Missouri's long-term state psychiatric hospitals. Ultimately, the use of statistical prediction analyses tempered with differential qualitative weighting of classification errors can augment decision-making processes in a manner that provides guidance and flexibility while wrestling with the complex problem of risk assessment and decision making. PMID:16645908

  15. Predicting Outcomes of Hospitalization for Heart Failure Using Logistic Regression and Knowledge Discovery Methods

    PubMed Central

    Phillips, Kirk T.; Street, W. Nick

    2005-01-01

    The purpose of this study is to determine the best prediction of heart failure outcomes, resulting from two methods -- standard epidemiologic analysis with logistic regression and knowledge discovery with supervised learning/data mining. Heart failure was chosen for this study as it exhibits higher prevalence and cost of treatment than most other hospitalized diseases. The prevalence of heart failure has exceeded 4 million cases in the U.S. Findings of this study should be useful for the design of quality improvement initiatives, as particular aspects of patient comorbidity and treatment are found to be associated with mortality. This is also a proof of concept study, considering the feasibility of emerging health informatics methods of data mining in conjunction with or in lieu of traditional logistic regression methods of prediction. Findings may also support the design of decision support systems and quality improvement programming for other diseases. PMID:16779367

  16. Predicting Student Success on the Texas Chemistry STAAR Test: A Logistic Regression Analysis

    ERIC Educational Resources Information Center

    Johnson, William L.; Johnson, Annabel M.; Johnson, Jared

    2012-01-01

    Background: The context is the new Texas STAAR end-of-course testing program. Purpose: The authors developed a logistic regression model to predict who would pass-or-fail the new Texas chemistry STAAR end-of-course exam. Setting: Robert E. Lee High School (5A) with an enrollment of 2700 students, Tyler, Texas. Date of the study was the 2011-2012…

  17. Using multiple logistic regression and GIS technology to predict landslide hazard in northeast Kansas, USA

    USGS Publications Warehouse

    Ohlmacher, G.C.; Davis, J.C.

    2003-01-01

    Landslides in the hilly terrain along the Kansas and Missouri rivers in northeastern Kansas have caused millions of dollars in property damage during the last decade. To address this problem, a statistical method called multiple logistic regression has been used to create a landslide-hazard map for Atchison, Kansas, and surrounding areas. Data included digitized geology, slopes, and landslides, manipulated using ArcView GIS. Logistic regression relates predictor variables to the occurrence or nonoccurrence of landslides within geographic cells and uses the relationship to produce a map showing the probability of future landslides, given local slopes and geologic units. Results indicated that slope is the most important variable for estimating landslide hazard in the study area. Geologic units consisting mostly of shale, siltstone, and sandstone were most susceptible to landslides. Soil type and aspect ratio were considered but excluded from the final analysis because these variables did not significantly add to the predictive power of the logistic regression. Soil types were highly correlated with the geologic units, and no significant relationships existed between landslides and slope aspect. © 2003 Elsevier Science B.V. All rights reserved.

  18. A comparative study on entrepreneurial attitudes modeled with logistic regression and Bayes nets.

    PubMed

    López Puga, Jorge; García García, Juan

    2012-11-01

    Entrepreneurship research is receiving increasing attention in our context, as entrepreneurs are key social agents involved in economic development. We compare the success of the dichotomic logistic regression model and the Bayes simple classifier to predict entrepreneurship, after manipulating the percentage of missing data and the level of categorization in predictors. A sample of undergraduate university students (N = 1230) completed five scales (motivation, attitude towards business creation, obstacles, deficiencies, and training needs) and we found that each of them predicted different aspects of the tendency to business creation. Additionally, our results show that the receiver operating characteristic (ROC) curve is affected by the rate of missing data in both techniques, but logistic regression seems to be more vulnerable when faced with missing data, whereas Bayes nets underperform slightly when categorization has been manipulated. Our study sheds light on the potential entrepreneur profile and we propose to use Bayesian networks as an additional alternative to overcome the weaknesses of logistic regression when missing data are present in applied research. PMID:23156922

  19. Regularized logistic regression with adjusted adaptive elastic net for gene selection in high dimensional cancer classification.

    PubMed

    Algamal, Zakariya Yahya; Lee, Muhammad Hisyam

    2015-12-01

    Cancer classification and gene selection in high-dimensional data have been popular research topics in genetics and molecular biology. Recently, adaptive regularized logistic regression using the elastic net regularization, which is called the adaptive elastic net, has been successfully applied in high-dimensional cancer classification to tackle both estimating the gene coefficients and performing gene selection simultaneously. The adaptive elastic net originally used elastic net estimates as the initial weight; however, using this weight may not be preferable for two reasons. First, the elastic net estimator is biased in selecting genes. Second, it does not perform well when the pairwise correlations between variables are not high. Adjusted adaptive regularized logistic regression (AAElastic) is proposed to address these issues and encourage grouping effects simultaneously. The real data results indicate that AAElastic is significantly consistent in selecting genes compared to the other three competitor regularization methods. Additionally, the classification performance of AAElastic is comparable to the adaptive elastic net and better than other regularization methods. Thus, we can conclude that AAElastic is a reliable adaptive regularized logistic regression method in the field of high-dimensional cancer classification. PMID:26520484

  20. Statistical Comparison of Drastic and Ordinal Logistic Regression for Shallow Aquifers Over Conterminous USA

    NASA Astrophysics Data System (ADS)

    Kumarasamy, K.; Kaluarachchi, J.

    2006-12-01

    Groundwater is an important natural resource for numerous human activities, accounting for more than 50% of the total water used in the United States. It is vulnerable to contamination by several organic and inorganic pollutants such as nitrates, heavy metals, and pesticides. Assessment of groundwater vulnerability aids in the management of limited groundwater resources. The focus of this thesis is to (1) statistically compare two groundwater vulnerability assessment models, DRASTIC (an acronym for Depth to water, Net recharge, Aquifer media, Soil media, Topography, Impact of vadose zone, and Hydraulic conductivity of aquifer) and ordinal logistic regression, on a national scale using nitrate concentration as the performance indicator; (2) analyze any discrepancies in the predictability of each of these models; and (3) assess the advantages of each model with respect to the knowledge, expertise, time, and data required and its ability to predict vulnerability. The breadth of nitrate concentration data in groundwater allows for a reliable comparison of the two models. This work will contribute towards the ultimate motive of developing a universal method to predict the vulnerability of groundwater.

  1. Analysis of sparse data in logistic regression in medical research: A newer approach

    PubMed Central

    Devika, S; Jeyaseelan, L; Sebastian, G

    2016-01-01

    Background and Objective: In the analysis of dichotomous type response variable, logistic regression is usually used. However, the performance of logistic regression in the presence of sparse data is questionable. In such a situation, a common problem is the presence of high odds ratios (ORs) with very wide 95% confidence interval (CI) (OR: >999.999, 95% CI: <0.001, >999.999). In this paper, we addressed this issue by using penalized logistic regression (PLR) method. Materials and Methods: Data from case-control study on hyponatremia and hiccups conducted in Christian Medical College, Vellore, Tamil Nadu, India was used. The outcome variable was the presence/absence of hiccups and the main exposure variable was the status of hyponatremia. Simulation dataset was created with different sample sizes and with a different number of covariates. Results: A total of 23 cases and 50 controls were used for the analysis of ordinary and PLR methods. The main exposure variable hyponatremia was present in nine (39.13%) of the cases and in four (8.0%) of the controls. Of the 23 hiccup cases, all were males and among the controls, 46 (92.0%) were males. Thus, the complete separation between gender and the disease group led into an infinite OR with 95% CI (OR: >999.999, 95% CI: <0.001, >999.999) whereas there was a finite and consistent regression coefficient for gender (OR: 5.35; 95% CI: 0.42, 816.48) using PLR. After adjusting for all the confounding variables, hyponatremia entailed 7.9 (95% CI: 2.06, 38.86) times higher risk for the development of hiccups as was found using PLR whereas there was an overestimation of risk OR: 10.76 (95% CI: 2.17, 53.41) using the conventional method. Simulation experiment shows that the estimated coverage probability of this method is near the nominal level of 95% even for small sample sizes and for a large number of covariates. 
    Conclusions: PLR is almost equal to the ordinary logistic regression when the sample size is large and is superior in
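
    The separation problem described in this abstract can be reproduced from the reported counts alone. The sketch below computes crude (unadjusted) odds ratios from 2x2 tables, so its numbers are not expected to match the adjusted estimates in the abstract. The 0.5 continuity correction (Haldane-Anscombe) is swapped in here as a simple stand-in that keeps the estimate finite; it is not the penalized logistic regression the authors used.

```python
import math

def odds_ratio(exposed_cases, unexposed_cases,
               exposed_controls, unexposed_controls, correction=0.0):
    """Crude odds ratio from a 2x2 table; `correction` is added to every cell."""
    a = exposed_cases + correction
    b = unexposed_cases + correction
    c = exposed_controls + correction
    d = unexposed_controls + correction
    if b * c == 0:
        # A zero cell in the denominator: the estimate is infinite (separation).
        return math.inf
    return (a * d) / (b * c)

# Counts reported in the abstract: 23 cases, 50 controls.
# Hyponatremia: present in 9 of 23 cases and 4 of 50 controls.
crude_hyponatremia_or = odds_ratio(9, 14, 4, 46)

# Gender: all 23 cases were male; 46 of 50 controls were male.
gender_or = odds_ratio(23, 0, 46, 4)
gender_or_corrected = odds_ratio(23, 0, 46, 4, correction=0.5)

print(f"crude OR, hyponatremia: {crude_hyponatremia_or:.2f}")
print(f"OR, male gender: {gender_or}")
print(f"OR, male gender with 0.5 correction: {gender_or_corrected:.2f}")
```

    The empty cell (no female cases) is what drives the ordinary estimate to infinity; any method that shrinks the estimate, whether this simple correction or a penalized likelihood, returns a finite value instead.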

  2. Comprehensible Predictive Modeling Using Regularized Logistic Regression and Comorbidity Based Features

    PubMed Central

    Stiglic, Gregor; Povalej Brzan, Petra; Fijacko, Nino; Wang, Fei; Delibasic, Boris; Kalousis, Alexandros; Obradovic, Zoran

    2015-01-01

    Different studies have demonstrated the importance of comorbidities to better understand the origin and evolution of medical complications. This study focuses on improvement of the predictive model interpretability based on simple logical features representing comorbidities. We use group lasso based feature interaction discovery followed by a post-processing step, where simple logic terms are added. In the final step, we reduce the feature set by applying lasso logistic regression to obtain a compact set of non-zero coefficients that represent a more comprehensible predictive model. The effectiveness of the proposed approach was demonstrated on a pediatric hospital discharge dataset that was used to build a readmission risk estimation model. The evaluation of the proposed method demonstrates a reduction of the initial set of features in a regression model by 72%, with a slight improvement in the Area Under the ROC Curve metric from 0.763 (95% CI: 0.755–0.771) to 0.769 (95% CI: 0.761–0.777). Additionally, our results show improvement in comprehensibility of the final predictive model using simple comorbidity based terms for logistic regression. PMID:26645087

  3. Comparing performances of heuristic and logistic regression models for a spatial landslide susceptibility assessment in Maramureş County, Northwestern Romania

    NASA Astrophysics Data System (ADS)

    Mǎgut, F. L.; Zaharia, S.; Glade, T.; Irimuş, I. A.

    2012-04-01

    Various methods exist for analyzing spatial landslide susceptibility and classing the results in susceptibility classes. The prediction of spatial landslide distribution can be performed by using a variety of methods based on GIS techniques. The two very common methods of a heuristic assessment and a logistic regression model are employed in this study in order to compare their performance in predicting the spatial distribution of previously mapped landslides for a study area located in Maramureş County, in Northwestern Romania. The first model determines a susceptibility index by combining the heuristic approach with GIS techniques of spatial data analysis. The criteria used for quantifying each susceptibility factor and the expression used to determine the susceptibility index are taken from the Romanian legislation (Governmental Decision 447/2003). This procedure is followed in any Romanian state-ordered study which relies on financial support. The logistic regression model predicts the spatial distribution of landslides by statistically calculating regression coefficients which describe the dependency of previously mapped landslides on different factors. The identified shallow landslides correspond generally to Pannonian marl and Quaternary contractile clay deposits. The study region is located in the Northwestern part of Romania, including the Baia Mare municipality, the capital of Maramureş County. The study focuses on the former piedmontal region situated to the south of the volcanic mountains Gutâi, in the Baia Mare Depression, where most of the landslide activity has been recorded. In addition, a narrow sector of the volcanic mountains which borders the city of Baia Mare to the north has also been included to test the accuracy of the models in different lithologic units. The results of both models indicate a general medium landslide susceptibility of the study area. The more detailed differences will be discussed with respect to the advantages and

  4. A logistic regression equation for estimating the probability of a stream in Vermont having intermittent flow

    USGS Publications Warehouse

    Olson, Scott A.; Brouillette, Michael C.

    2006-01-01

    A logistic regression equation was developed for estimating the probability of a stream flowing intermittently at unregulated, rural stream sites in Vermont. These determinations can be used for a wide variety of regulatory and planning efforts at the Federal, State, regional, county and town levels, including such applications as assessing fish and wildlife habitats, wetlands classifications, recreational opportunities, water-supply potential, waste-assimilation capacities, and sediment transport. The equation will be used to create a derived product for the Vermont Hydrography Dataset having the streamflow characteristic of 'intermittent' or 'perennial.' The Vermont Hydrography Dataset is Vermont's implementation of the National Hydrography Dataset and was created at a scale of 1:5,000 based on statewide digital orthophotos. The equation was developed by relating field-verified perennial or intermittent status of a stream site during normal summer low-streamflow conditions in the summer of 2005 to selected basin characteristics of naturally flowing streams in Vermont. The database used to develop the equation included 682 stream sites with drainage areas ranging from 0.05 to 5.0 square miles. When the 682 sites were observed, 126 were intermittent (had no flow at the time of the observation) and 556 were perennial (had flowing water at the time of the observation). The results of the logistic regression analysis indicate that the probability of a stream having intermittent flow in Vermont is a function of drainage area, elevation of the site, the ratio of basin relief to basin perimeter, and the areal percentage of well- and moderately well-drained soils in the basin. Using a probability cutpoint (a lower probability indicates the site has perennial flow and a higher probability indicates the site has intermittent flow) of 0.5, the logistic regression equation correctly predicted the perennial or intermittent status of 116 test sites 85 percent of the time.
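
    The cutpoint rule in this abstract can be sketched as follows. The abstract does not give the fitted coefficients, so the linear-predictor values below are hypothetical placeholders, and treating a probability exactly equal to the cutpoint as intermittent is an assumption made only for this illustration.

```python
import math

def probability_intermittent(linear_predictor):
    """Logistic function: maps a linear predictor to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-linear_predictor))

def classify_stream(probability, cutpoint=0.5):
    """Below the cutpoint -> perennial; at or above it -> intermittent (assumed tie rule)."""
    return "intermittent" if probability >= cutpoint else "perennial"

# HYPOTHETICAL linear-predictor values; in the study these would come from drainage
# area, site elevation, relief-to-perimeter ratio, and percent well-drained soils.
hypothetical_sites = {"site_a": -1.8, "site_b": 0.4}

for name, linear_predictor in hypothetical_sites.items():
    p = probability_intermittent(linear_predictor)
    print(name, round(p, 3), classify_stream(p))
```

    With a cutpoint of 0.5, the classification depends only on the sign of the linear predictor; a different cutpoint would trade one kind of misclassification for the other.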

  5. SETUP OF RESOLUTIVE CRITERION FOR SEDIMENT-RELATED DISASTER WARNING INFORMATION USING LOGISTIC REGRESSION ANALYSIS

    NASA Astrophysics Data System (ADS)

    Sugihara, Shigemitsu; Shinozaki, Tsuguhiro; Ohishi, Hiroyuki; Araki, Yoshinori; Furukawa, Kohei

    It is difficult to deregulate sediment-related disaster warning information because it is difficult to quantify the risk of disaster after heavy rain. If the risk can be quantified according to the rain situation, it can serve as an indicator for deregulation. In this study, using logistic regression analysis, we quantified the risk according to the rain situation as the probability of disaster occurrence, and we analyzed the setup of a resolutive criterion for sediment-related disaster warning information. As a result, we can improve the convenience of the method for evaluating the probability of disaster occurrence, which is useful for providing information in imminent situations.

  6. Predicting pathologic complete response to neoadjuvant chemotherapy in breast cancer using sparse logistic regression.

    PubMed

    Hu, Wei

    2013-01-01

    We utilised Sparse Logistic Regression (SLR) to build two sparse and interpretable predictors. The first one (SLR-65) was based on a signature consisting of the top 65 probe sets (59 genes) differentially expressed between Pathologic Complete Response (PCR) and Residual Disease (RD) cases, and the second one (SLR-Notch) was based on the genes involved in the Notch signalling-related pathways (113 genes). The two predictors produced better predictions than the predictor in a previous study. The SLR-65 selected 16 informative genes and the SLR-Notch selected 12 informative genes. PMID:23649738

  7. [Abnormally broad confidence intervals in logistic regression: interpretation of results of statistical programs].

    PubMed

    de Irala, J; Fernandez-Crehuet Navajas, R; Serrano del Castillo, A

    1997-03-01

    This study describes the behavior of eight statistical programs (BMDP, EGRET, JMP, SAS, SPSS, STATA, STATISTIX, and SYSTAT) when performing a logistic regression with a simulated data set that contains a numerical problem created by the presence of a cell value equal to zero. The programs respond in different ways to this problem. Most of them give a warning, although many simultaneously present incorrect results, among which are confidence intervals that tend toward infinity. Such results can mislead the user. Various guidelines are offered for detecting these problems in actual analyses, and users are reminded of the importance of critical interpretation of the results of statistical programs. PMID:9162592

  8. Penalised logistic regression and dynamic prediction for discrete-time recurrent event data.

    PubMed

    Elgmati, Entisar; Fiaccone, Rosemeire L; Henderson, R; Matthews, John N S

    2015-10-01

    We consider methods for the analysis of discrete-time recurrent event data, when interest is mainly in prediction. The Aalen additive model provides an extremely simple and effective method for the determination of covariate effects for this type of data, especially in the presence of time-varying effects and time-varying covariates, including dynamic summaries of prior event history. The method is weakened for predictive purposes by the presence of negative estimates. The obvious alternative of a standard logistic regression analysis at each time point can have problems of stability when event frequency is low and maximum likelihood estimation is used. The Firth penalised likelihood approach is stable but in removing bias in regression coefficients it introduces bias into predicted event probabilities. We propose an alternative modified penalised likelihood, intermediate between Firth and no penalty, as a pragmatic compromise between stability and bias. Illustration on two data sets is provided. PMID:25626559

  9. Predicting pilot-error incidents of US airline pilots using logistic regression.

    PubMed

    McFadden, K L

    1997-06-01

    In a population of 70,164 airline pilots obtained from the Federal Aviation Administration, 475 males and 22 females had pilot-error incidents in the years 1986-1992. A simple chi-squared test revealed that female pilots employed by major airlines had a significantly greater likelihood of pilot-error incidents than their male colleagues. In order to control for age, experience (total flying hours), risk exposure (recent flying hours) and employer (major/non-major airline) simultaneously, the author built a model of male pilot-error incidents using logistic regression. The regression analysis indicated that youth, inexperience and non-major airline employer were independent contributors to the increased risk of pilot-error incidents. The results also provide further support to the literature that pilot performance does not differ significantly between male and female airline pilots. PMID:9414359

  10. Controlling Type I Error Rates in Assessing DIF for Logistic Regression Method Combined with SIBTEST Regression Correction Procedure and DIF-Free-Then-DIF Strategy

    ERIC Educational Resources Information Center

    Shih, Ching-Lin; Liu, Tien-Hsiang; Wang, Wen-Chung

    2014-01-01

    The simultaneous item bias test (SIBTEST) method regression procedure and the differential item functioning (DIF)-free-then-DIF strategy are applied to the logistic regression (LR) method simultaneously in this study. These procedures are used to adjust the effects of matching true score on observed score and to better control the Type I error…

  11. Ordinal logistic regression analysis on the nutritional status of children in KarangKitri village

    NASA Astrophysics Data System (ADS)

    Ohyver, Margaretha; Yongharto, Kimmy Octavian

    2015-09-01

    Ordinal logistic regression is a statistical technique that can be used to describe the relationship between an ordinal response variable and one or more independent variables. This method has been used in various fields, including the health field. In this research, ordinal logistic regression is used to describe the relationship between the nutritional status of children and age, gender, height, and family status. Nutritional status of children in this research is divided into over nutrition, well nutrition, less nutrition, and malnutrition. The purpose of this research is to describe the characteristics of children in KarangKitri village and to determine the factors that influence their nutritional status. Three results were obtained from this research. First, there are still children who are not categorized as having well nutritional status. Second, there are children from families at a sufficient economic level who are not in the normal status. Third, the factors that affect the nutritional level of children are age, family status, and height.

  12. Asymptotically Unbiased Estimation of Exposure Odds Ratios in Complete Records Logistic Regression

    PubMed Central

    Bartlett, Jonathan W.; Harel, Ofer; Carpenter, James R.

    2015-01-01

    Missing data are a commonly occurring threat to the validity and efficiency of epidemiologic studies. Perhaps the most common approach to handling missing data is to simply drop those records with 1 or more missing values, in so-called “complete records” or “complete case” analysis. In this paper, we bring together earlier-derived yet perhaps now somewhat neglected results which show that a logistic regression complete records analysis can provide asymptotically unbiased estimates of the association of an exposure of interest with an outcome, adjusted for a number of confounders, under a surprisingly wide range of missing-data assumptions. We give detailed guidance describing how the observed data can be used to judge the plausibility of these assumptions. The results mean that in large epidemiologic studies which are affected by missing data and analyzed by logistic regression, exposure associations may be estimated without bias in a number of settings where researchers might otherwise assume that bias would occur. PMID:26429998

  13. Reliability estimation for cutting tools based on logistic regression model using vibration signals

    NASA Astrophysics Data System (ADS)

    Chen, Baojia; Chen, Xuefeng; Li, Bing; He, Zhengjia; Cao, Hongrui; Cai, Gaigai

    2011-10-01

    As an important part of CNC machine, the reliability of cutting tools influences the whole manufacturing effectiveness and stability of equipment. The present study proposes a novel reliability estimation approach to the cutting tools based on logistic regression model by using vibration signals. The operation condition information of the CNC machine is incorporated into reliability analysis to reflect the product time-varying characteristics. The proposed approach is superior to other degradation estimation methods in that it does not necessitate any assumption about degradation paths and probability density functions of condition parameters. The three steps of new reliability estimation approach for cutting tools are as follows. First, on-line vibration signals of cutting tools are measured during the manufacturing process. Second, wavelet packet (WP) transform is employed to decompose the original signals and correlation analysis is employed to find out the feature frequency bands which indicate tool wear. Third, correlation analysis is also used to select the salient feature parameters which are composed of feature band energy, energy entropy and time-domain features. Finally, reliability estimation is carried out based on logistic regression model. The approach has been validated on a NC lathe. Under different failure threshold, the reliability and failure time of the cutting tools are all estimated accurately. The positive results show the plausibility and effectiveness of the proposed approach, which can facilitate machine performance and reliability estimation.

  14. Grid Binary LOgistic REgression (GLORE): building shared models without sharing data

    PubMed Central

    Jiang, Xiaoqian; Kim, Jihoon; Ohno-Machado, Lucila

    2012-01-01

    Objective: The classification of complex or rare patterns in clinical and genomic data requires the availability of a large, labeled patient set. While methods that operate on large, centralized data sources have been extensively used, little attention has been paid to understanding whether models such as binary logistic regression (LR) can be developed in a distributed manner, allowing researchers to share models without necessarily sharing patient data. Material and methods: Instead of bringing data to a central repository for computation, we bring computation to the data. The Grid Binary LOgistic REgression (GLORE) model integrates decomposable partial elements or non-privacy sensitive prediction values to obtain model coefficients, the variance-covariance matrix, the goodness-of-fit test statistic, and the area under the receiver operating characteristic (ROC) curve. Results: We conducted experiments on both simulated and clinically relevant data, and compared the computational costs of GLORE with those of a traditional LR model estimated using the combined data. We showed that our results are the same as those of LR to a precision of 10^-15. In addition, GLORE is computationally efficient. Limitation: In GLORE, the calculation of coefficient gradients must be synchronized at different sites, which involves some effort to ensure the integrity of communication. Ensuring that the predictors have the same format and meaning across the data sets is necessary. Conclusion: The results suggest that GLORE performs as well as LR and allows data to remain protected at their original sites. PMID:22511014
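
    The idea of "decomposable partial elements" can be illustrated with a minimal sketch, which is not the GLORE implementation. It assumes a model with an intercept and a single predictor, fitted by Newton-Raphson; each site returns only its summed gradient and information-matrix terms, and a coordinator adds them up. The two small data sets and all function names are made up for this illustration.

```python
import math

def local_summaries(rows, beta):
    """One site's contribution: gradient and information-matrix sums for (intercept, slope)."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for x, y in rows:
        p = 1.0 / (1.0 + math.exp(-(beta[0] + beta[1] * x)))
        w = p * (1.0 - p)
        g0 += y - p
        g1 += (y - p) * x
        h00 += w
        h01 += w * x
        h11 += w * x * x
    return (g0, g1), (h00, h01, h11)

def newton_fit(sites, iterations=25):
    """Coordinator: sums the per-site summaries and takes Newton-Raphson steps."""
    beta = [0.0, 0.0]
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for rows in sites:  # only aggregates leave each site, never the rows
            (a, b), (c, d, e) = local_summaries(rows, beta)
            g0, g1 = g0 + a, g1 + b
            h00, h01, h11 = h00 + c, h01 + d, h11 + e
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 system H * delta = g with the closed-form inverse.
        beta[0] += (h11 * g0 - h01 * g1) / det
        beta[1] += (h00 * g1 - h01 * g0) / det
    return beta

# HYPOTHETICAL (x, y) records held at two sites.
site_a = [(0.5, 0), (1.0, 0), (1.5, 1), (2.0, 0), (2.5, 1)]
site_b = [(1.0, 0), (2.0, 1), (3.0, 1), (3.5, 0), (4.0, 1)]

distributed_beta = newton_fit([site_a, site_b])
pooled_beta = newton_fit([site_a + site_b])  # same data in one central table
print(distributed_beta, pooled_beta)
```

    Because the gradient and information matrix are sums over records, adding per-site sums gives the same Newton step as pooling the data, up to floating-point rounding. The need to evaluate every site at the same current coefficients in each iteration is the synchronization requirement mentioned under Limitation.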

  15. Regularization Paths for Conditional Logistic Regression: The clogitL1 Package

    PubMed Central

    Reid, Stephen; Tibshirani, Rob

    2014-01-01

    We apply the cyclic coordinate descent algorithm of Friedman, Hastie, and Tibshirani (2010) to the fitting of a conditional logistic regression model with lasso (ℓ1) and elastic net penalties. The sequential strong rules of Tibshirani, Bien, Hastie, Friedman, Taylor, Simon, and Tibshirani (2012) are also used in the algorithm and it is shown that these offer a considerable speed up over the standard coordinate descent algorithm with warm starts. Once implemented, the algorithm is used in simulation studies to compare the variable selection and prediction performance of the conditional logistic regression model against that of its unconditional (standard) counterpart. We find that the conditional model performs admirably on datasets drawn from a suitable conditional distribution, outperforming its unconditional counterpart at variable selection. The conditional model is also fit to a small real world dataset, demonstrating how we obtain regularization paths for the parameters of the model and how we apply cross validation for this method where natural unconditional prediction rules are hard to come by. PMID:26257587

  16. A Logistic Regression Mixture Model for Interval Mapping of Genetic Trait Loci Affecting Binary Phenotypes

    PubMed Central

    Deng, Weiping; Chen, Hanfeng; Li, Zhaohai

    2006-01-01

    Often in genetic research, presence or absence of a disease is affected by not only the trait locus genotypes but also some covariates. The finite logistic regression mixture models and the methods under the models are developed for detection of a binary trait locus (BTL) through an interval-mapping procedure. The maximum-likelihood estimates (MLEs) of the logistic regression parameters are asymptotically unbiased. The null asymptotic distributions of the likelihood-ratio test (LRT) statistics for detection of a BTL are found to be given by the supremum of a χ²-process. The limiting null distributions are free of the null model parameters and are determined explicitly through only four (backcross case) or nine (intercross case) independent standard normal random variables. Therefore a threshold for detecting a BTL in a flanking marker interval can be approximated easily by using a Monte Carlo method. It is pointed out that use of a threshold incorrectly determined by reading off a χ²-probability table can result in an excessive false BTL detection rate much more severely than many researchers might anticipate. Simulation results show that the BTL detection procedures based on the thresholds determined by the limiting distributions perform quite well when the sample sizes are moderately large. PMID:16272416

  17. Sub-resolution assist feature (SRAF) printing prediction using logistic regression

    NASA Astrophysics Data System (ADS)

    Tan, Chin Boon; Koh, Kar Kit; Zhang, Dongqing; Foong, Yee Mei

    2015-03-01

    In optical proximity correction (OPC), the sub-resolution assist feature (SRAF) has been used to enhance the process window of main structures. However, the printing of SRAF on wafer is undesirable, as it may degrade the overall process yield if it is transferred into the final pattern. A reasonably accurate prediction model is needed during OPC to ensure that the SRAF placement and size carry no risk of SRAF printing. Current common practice in OPC is to use either the main OPC model or a model threshold adjustment (MTA) solution to predict SRAF printing. This paper studies the feasibility of SRAF printing prediction using logistic regression (LR). Logistic regression is a probabilistic classification model that gives discrete binary outputs after receiving sufficient input variables from SRAF printing conditions. In the application of SRAF printing prediction, the binary outputs can be treated as 1 for SRAF-Printing and 0 for No-SRAF-Printing. The experimental work was performed using a 20 nm line/space process layer. The results demonstrate that the accuracy of SRAF printing prediction using the LR approach exceeds that of the MTA solution. Overall error rates as low as 2% (calibration) and 5% (verification) were achieved by the LR approach, compared with 6% (calibration) and 15% (verification) for the MTA solution. In addition, the performance of the LR approach was found to be relatively independent and consistent across different resist image planes compared with the MTA solution.

  18. Predicting students' success at pre-university studies using linear and logistic regressions

    NASA Astrophysics Data System (ADS)

    Suliman, Noor Azizah; Abidin, Basir; Manan, Norhafizah Abdul; Razali, Ahmad Mahir

    2014-09-01

    The study aims to find the most suitable model for predicting students' success at the medical pre-university studies, Centre for Foundation in Science, Languages and General Studies of Cyberjaya University College of Medical Sciences (CUCMS). The predictors under investigation were achievements in the national high school exit examination, Sijil Pelajaran Malaysia (SPM), such as Biology, Chemistry, Physics, Additional Mathematics, Mathematics, English and Bahasa Malaysia results, as well as gender and high school background. The results showed a significant difference by gender in the final CGPA and in the Biology and Mathematics subjects at pre-university, and by high school background in the Mathematics subject. In general, the correlation between academic achievement at high school and at medical pre-university is moderately significant at an α-level of 0.05, except for language subjects. It was also found that logistic regression techniques gave better prediction models than the multiple linear regression technique for this data set. The developed logistic models gave probabilities that were close to the actual outcomes. Hence, they could be used to identify successful students who are qualified to enter the CUCMS medical faculty before accepting any students to its foundation program.

  19. Integration of logistic regression, Markov chain and cellular automata models to simulate urban expansion

    NASA Astrophysics Data System (ADS)

    Jokar Arsanjani, Jamal; Helbich, Marco; Kainz, Wolfgang; Darvishi Boloorani, Ali

    2013-04-01

    This research analyses the suburban expansion in the metropolitan area of Tehran, Iran. A hybrid model consisting of logistic regression model, Markov chain (MC), and cellular automata (CA) was designed to improve the performance of the standard logistic regression model. Environmental and socio-economic variables dealing with urban sprawl were operationalised to create a probability surface of spatiotemporal states of built-up land use for the years 2006, 2016, and 2026. For validation, the model was evaluated by means of relative operating characteristic values for different sets of variables. The approach was calibrated for 2006 by cross comparing of actual and simulated land use maps. The achieved outcomes represent a match of 89% between simulated and actual maps of 2006, which was satisfactory to approve the calibration process. Thereafter, the calibrated hybrid approach was implemented for forthcoming years. Finally, future land use maps for 2016 and 2026 were predicted by means of this hybrid approach. The simulated maps illustrate a new wave of suburban development in the vicinity of Tehran at the western border of the metropolis during the next decades.

  20. Logistic regression function for detection of suspicious performance during baseline evaluations using concussion vital signs.

    PubMed

    Hill, Benjamin David; Womble, Melissa N; Rohling, Martin L

    2015-01-01

    This study utilized logistic regression to determine whether performance patterns on Concussion Vital Signs (CVS) could differentiate known groups with either genuine or feigned performance. For the embedded measure development group (n = 174), clinical patients and undergraduate students categorized as feigning obtained significantly lower scores on the overall test battery mean for the CVS, Shipley-2 composite score, and California Verbal Learning Test-Second Edition subtests than did genuinely performing individuals. The final full model of 3 predictor variables (Verbal Memory immediate hits, Verbal Memory immediate correct passes, and Stroop Test complex reaction time correct) was significant and correctly classified individuals in their known group 83% of the time (sensitivity = .65; specificity = .97) in a mixed sample of young-adult clinical cases and simulators. The CVS logistic regression function was applied to a separate undergraduate college group (n = 378) that was asked to perform genuinely and identified 5% as having possibly feigned performance indicating a low false-positive rate. The failure rate was 11% and 16% at baseline cognitive testing in samples of high school and college athletes, respectively. These findings have particular relevance given the increasing use of computerized test batteries for baseline cognitive testing and return-to-play decisions after concussion. PMID:25371976

  1. Variable selection for logistic regression using a prediction-focused information criterion.

    PubMed

    Claeskens, Gerda; Croux, Christophe; Van Kerckhoven, Johan

    2006-12-01

    In biostatistical practice, it is common to use information criteria as a guide for model selection. We propose new versions of the focused information criterion (FIC) for variable selection in logistic regression. The FIC gives, depending on the quantity to be estimated, possibly different sets of selected variables. The standard version of the FIC measures the mean squared error of the estimator of the quantity of interest in the selected model. In this article, we propose more general versions of the FIC, allowing other risk measures such as the one based on L(p) error. When prediction of an event is important, as is often the case in medical applications, we construct an FIC using the error rate as a natural risk measure. The advantages of using an information criterion which depends on both the quantity of interest and the selected risk measure are illustrated by means of a simulation study and application to a study on diabetic retinopathy. PMID:17156270

  2. Accounting for informatively missing data in logistic regression by means of reassessment sampling.

    PubMed

    Lin, Ji; Lyles, Robert H

    2015-05-20

    We explore the 'reassessment' design in a logistic regression setting, where a second wave of sampling is applied to recover a portion of the missing data on a binary exposure and/or outcome variable. We construct a joint likelihood function based on the original model of interest and a model for the missing data mechanism, with emphasis on non-ignorable missingness. The estimation is carried out by numerical maximization of the joint likelihood function with close approximation of the accompanying Hessian matrix, using sharable programs that take advantage of general optimization routines in standard software. We show how likelihood ratio tests can be used for model selection and how they facilitate direct hypothesis testing for whether missingness is at random. Examples and simulations are presented to demonstrate the performance of the proposed method. PMID:25707010

  3. Using GIS and logistic regression to estimate agricultural chemical concentrations in rivers of the midwestern USA

    USGS Publications Warehouse

    Battaglin, W.A.

    1996-01-01

    Agricultural chemicals (herbicides, insecticides, other pesticides and fertilizers) in surface water may constitute a human health risk. Recent research on unregulated rivers in the midwestern USA documents that elevated concentrations of herbicides occur for 1-4 months following application in spring and early summer. In contrast, nitrate concentrations in unregulated rivers are elevated during the fall, winter and spring. Natural and anthropogenic variables of river drainage basins, such as soil permeability, the amount of agricultural chemicals applied or percentage of land planted in corn, affect agricultural chemical concentrations in rivers. Logistic regression (LGR) models are used to investigate relations between various drainage basin variables and the concentration of selected agricultural chemicals in rivers. The method is successful in contributing to the understanding of agricultural chemical concentration in rivers. Overall accuracies of the best LGR models, defined as the number of correct classifications divided by the number of attempted classifications, averaged about 66%.

  4. Statistical modelling for thoracic surgery using a nomogram based on logistic regression.

    PubMed

    Liu, Run-Zhong; Zhao, Ze-Rui; Ng, Calvin S H

    2016-08-01

    A well-developed clinical nomogram is a popular decision-tool, which can be used to predict the outcome of an individual, bringing benefits to both clinicians and patients. With just a few steps on a user-friendly interface, the approximate clinical outcome of patients can easily be estimated based on their clinical and laboratory characteristics. Therefore, nomograms have recently been developed to predict the different outcomes or even the survival rate at a specific time point for patients with different diseases. However, on the establishment and application of nomograms, there is still a lot of confusion that may mislead researchers. The objective of this paper is to provide a brief introduction on the history, definition, and application of nomograms and then to illustrate simple procedures to develop a nomogram with an example based on a multivariate logistic regression model in thoracic surgery. In addition, validation strategies and common pitfalls have been highlighted. PMID:27621910

  5. Binary logistic regression modelling: Measuring the probability of relapse cases among drug addict

    NASA Astrophysics Data System (ADS)

    Ismail, Mohd Tahir; Alias, Siti Nor Shadila

    2014-07-01

    For many years, Malaysia has faced drug addiction issues. The most serious is the relapse phenomenon among treated drug addicts (those who have undergone the rehabilitation programme at the Narcotic Addiction Rehabilitation Centre, PUSPEN). Thus, the main objective of this study is to find the most significant factor that contributes to relapse. Binary logistic regression analysis was employed to model the relationship between the independent variables (predictors) and the dependent variable. The dependent variable is the status of the drug addict: relapse (Yes, coded as 1) or not (No, coded as 0). The predictors are age, age at first drug use, family history, education level, family crisis, community support and self-motivation. The sample comprises 200 cases, with data provided by AADK (National Antidrug Agency). The study found that age and self-motivation are statistically significant predictors of relapse.

  6. Inconsistent treatment estimates from mis-specified logistic regression analyses of randomized trials.

    PubMed

    Matthews, J N S; Badi, N H

    2015-08-30

    When the difference between treatments in a clinical trial is estimated by a difference in means, then it is well known that randomization ensures unbiassed estimation, even if no account is taken of important baseline covariates. However, when the treatment effect is assessed by other summaries, for example by an odds ratio if the outcome is binary, then bias can arise if some covariates are omitted, regardless of the use of randomization for treatment allocation or the size of the trial. We present accurate closed-form approximations for this asymptotic bias when important normally distributed covariates are omitted from a logistic regression. We compare this approximation with ones in the literature and derive more convenient forms for some of these existing results. The expressions give insight into the form of the bias, which simulations show is usable for distributions other than the normal. The key result applies even when there are additional binary covariates in the model. PMID:25869059
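
    The bias described above can be checked numerically. The sketch below is not the authors' closed-form approximation: it assumes a true model logit P(Y=1 | T, X) = a + b*T + g*X with X ~ N(0, 1) independent of the randomized treatment T, and integrates X out with the trapezoidal rule to obtain the log odds ratio that a model omitting X would target. The function names and parameter values are hypothetical.

```python
import math

def expit(z):
    return 1.0 / (1.0 + math.exp(-z))

def marginal_prob(a, b, g, t, n=4001, lim=8.0):
    """E_X[expit(a + b*t + g*X)] for X ~ N(0, 1), by the trapezoidal rule."""
    h = 2 * lim / (n - 1)
    total = 0.0
    for i in range(n):
        x = -lim + i * h
        w = 0.5 if i in (0, n - 1) else 1.0          # trapezoid end weights
        dens = math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)
        total += w * expit(a + b * t + g * x) * dens
    return total * h

def marginal_log_or(a, b, g):
    """Treatment log odds ratio after averaging over the omitted covariate."""
    p1 = marginal_prob(a, b, g, 1)
    p0 = marginal_prob(a, b, g, 0)
    return math.log(p1 / (1 - p1)) - math.log(p0 / (1 - p0))

# Conditional log odds ratio is b = 1 in both cases.
print(marginal_log_or(0.0, 1.0, 0.0))  # covariate has no effect: no bias
print(marginal_log_or(0.0, 1.0, 2.0))  # strong covariate: attenuated towards 0
```

    With g = 0 the marginal and conditional log odds ratios coincide; with g different from 0 the marginal value is pulled towards zero even though T and X are independent, and it does not depend on the trial size.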

  7. Statistical modelling for thoracic surgery using a nomogram based on logistic regression

    PubMed Central

    Liu, Run-Zhong; Zhao, Ze-Rui

    2016-01-01

    A well-developed clinical nomogram is a popular decision-tool, which can be used to predict the outcome of an individual, bringing benefits to both clinicians and patients. With just a few steps on a user-friendly interface, the approximate clinical outcome of patients can easily be estimated based on their clinical and laboratory characteristics. Therefore, nomograms have recently been developed to predict the different outcomes or even the survival rate at a specific time point for patients with different diseases. However, on the establishment and application of nomograms, there is still a lot of confusion that may mislead researchers. The objective of this paper is to provide a brief introduction on the history, definition, and application of nomograms and then to illustrate simple procedures to develop a nomogram with an example based on a multivariate logistic regression model in thoracic surgery. In addition, validation strategies and common pitfalls have been highlighted. PMID:27621910

  8. Modeling data for pancreatitis in presence of a duodenal diverticula using logistic regression

    NASA Astrophysics Data System (ADS)

    Dineva, S.; Prodanova, K.; Mlachkova, D.

    2013-12-01

    The presence of a periampullary duodenal diverticulum (PDD) is often observed during upper digestive tract barium meal studies and endoscopic retrograde cholangiopancreatography (ERCP). A few papers have reported that the diverticulum is related to the incidence of pancreatitis. The aim of this study is to investigate whether the presence of duodenal diverticula predisposes to the development of pancreatic disease. A total of 3966 patients who had undergone ERCP were studied retrospectively. They were divided into two groups: with and without PDD. Patients with a duodenal diverticulum had a higher rate of acute pancreatitis. The duodenal diverticulum is a risk factor for acute idiopathic pancreatitis. A multiple logistic regression was performed to obtain adjusted estimates of the odds and to identify whether a PDD is a predictor of acute or chronic pancreatitis. The software package STATISTICA 10.0 was used to analyze the real data.

  9. Prediction of occult neck disease in laryngeal cancer by means of a logistic regression statistical model.

    PubMed

    Ghouri, A F; Zamora, R L; Sessions, D G; Spitznagel, E L; Harvey, J E

    1994-10-01

    The ability to accurately predict the presence of subclinical metastatic neck disease in clinically N0 patients with primary epidermoid cancer of the larynx would be of great value in determining whether to perform an elective neck dissection. We describe a statistical approach to estimating the probability of occult neck disease given pretreatment clinical parameters. A retrospective study was performed involving 736 clinically N0 patients with primary laryngeal cancer who were treated surgically with primary resection and ipsilateral neck dissection. Nodal involvement was determined histologically after surgical lymphadenectomy. A logistic regression model was used to derive an equation that calculated the probability of occult neck metastasis based on pretreatment T stage, tumor location, and histologic grade. The model has a sensitivity of 74%, a specificity of 87%, and can be entered into a programmable calculator. PMID:7934602

  10. Modeling Group Size and Scalar Stress by Logistic Regression from an Archaeological Perspective

    PubMed Central

    Alberti, Gianmarco

    2014-01-01

    Johnson’s scalar stress theory, describing the mechanics of (and the remedies to) the increase in in-group conflictuality that parallels the increase in groups’ size, provides scholars with a useful theoretical framework for the understanding of different aspects of the material culture of past communities (i.e., social organization, communal food consumption, ceramic style, architecture and settlement layout). Due to its relevance in archaeology and anthropology, the article aims at proposing a predictive model of critical level of scalar stress on the basis of community size. Drawing upon Johnson’s theory and on Dunbar’s findings on the cognitive constraints to human group size, a model is built by means of Logistic Regression on the basis of the data on colony fissioning among the Hutterites of North America. On the grounds of the theoretical framework sketched in the first part of the article, the absence or presence of colony fissioning is considered expression of not critical vs. critical level of scalar stress for the sake of the model building. The model, which is also tested against a sample of archaeological and ethnographic cases: a) confirms the existence of a significant relationship between critical scalar stress and group size, setting the issue on firmer statistical grounds; b) allows calculating the intercept and slope of the logistic regression model, which can be used at any time to estimate the probability that a community experienced a critical level of scalar stress; c) allows locating a critical scalar stress threshold at community size 127 (95% CI: 122–132), while the maximum probability of critical scalar stress is predicted at size 158 (95% CI: 147–170). The model ultimately provides grounds to assess, for the sake of any further archaeological/anthropological interpretation, the probability that a group reached a hot spot of size development critical for its internal cohesion. PMID:24626241

  11. Screening for ketosis using multiple logistic regression based on milk yield and composition.

    PubMed

    Kayano, Mitsunori; Kataoka, Tomoko

    2015-11-01

    Multiple logistic regression was applied to milk yield and composition data for 632 records of healthy cows and 61 records of ketotic cows in Hokkaido, Japan. The purpose was to diagnose ketosis based on milk yield and composition, simultaneously. The cows were divided into two groups: (1) multiparous, including 314 healthy cows and 45 ketotic cows and (2) primiparous, including 318 healthy cows and 16 ketotic cows, since nutritional status, milk yield and composition are affected by parity. Multiple logistic regression was applied to these groups separately. For multiparous cows, milk yield (kg/day/cow) and protein-to-fat (P/F) ratio in milk were significant factors (P<0.05) for the diagnosis of ketosis. For primiparous cows, lactose content (%), solid not fat (SNF) content (%) and milk urea nitrogen (MUN) content (mg/dl) were significantly associated with ketosis (P<0.01). A diagnostic rule was constructed for each group of cows: (1) 9.978 × P/F ratio + 0.085 × milk yield <10 and (2) 2.327 × SNF - 2.703 × lactose + 0.225 × MUN <10. The sensitivity, specificity and the area under the curve (AUC) of the diagnostic rules were (1) 0.800, 0.729 and 0.811; (2) 0.813, 0.730 and 0.787, respectively. The P/F ratio, which is a widely used measure of ketosis, provided the sensitivity, specificity and AUC values of (1) 0.711, 0.726 and 0.781; and (2) 0.678, 0.767 and 0.738, respectively. PMID:26074408
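
    The two diagnostic rules quoted above are simple linear scores and can be written out directly. The sketch below uses the published coefficients and the cutoff of 10; it assumes that a record satisfying the inequality is the one flagged as suspected ketosis, which the abstract implies but does not state. The function names and the example measurements are hypothetical.

```python
def multiparous_score(pf_ratio, milk_yield):
    """Rule (1): 9.978 x P/F ratio + 0.085 x milk yield (kg/day/cow)."""
    return 9.978 * pf_ratio + 0.085 * milk_yield

def primiparous_score(snf, lactose, mun):
    """Rule (2): 2.327 x SNF (%) - 2.703 x lactose (%) + 0.225 x MUN (mg/dl)."""
    return 2.327 * snf - 2.703 * lactose + 0.225 * mun

def satisfies_rule(score, cutoff=10.0):
    """True when the score is below the cutoff (assumed to flag ketosis)."""
    return score < cutoff

# Hypothetical records, for illustration only.
s1 = multiparous_score(pf_ratio=0.7, milk_yield=30.0)
s2 = multiparous_score(pf_ratio=0.9, milk_yield=30.0)
s3 = primiparous_score(snf=8.7, lactose=4.6, mun=12.0)
print(s1, satisfies_rule(s1))
print(s2, satisfies_rule(s2))
print(s3, satisfies_rule(s3))
```

    Keeping the score and the cutoff separate makes it straightforward to trade sensitivity against specificity by moving the cutoff.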

  12. Screening for ketosis using multiple logistic regression based on milk yield and composition

    PubMed Central

    KAYANO, Mitsunori; KATAOKA, Tomoko

    2015-01-01

    Multiple logistic regression was applied to milk yield and composition data for 632 records of healthy cows and 61 records of ketotic cows in Hokkaido, Japan. The purpose was to diagnose ketosis based on milk yield and composition, simultaneously. The cows were divided into two groups: (1) multiparous, including 314 healthy cows and 45 ketotic cows and (2) primiparous, including 318 healthy cows and 16 ketotic cows, since nutritional status, milk yield and composition are affected by parity. Multiple logistic regression was applied to these groups separately. For multiparous cows, milk yield (kg/day/cow) and protein-to-fat (P/F) ratio in milk were significant factors (P<0.05) for the diagnosis of ketosis. For primiparous cows, lactose content (%), solid not fat (SNF) content (%) and milk urea nitrogen (MUN) content (mg/dl) were significantly associated with ketosis (P<0.01). A diagnostic rule was constructed for each group of cows: (1) 9.978 × P/F ratio + 0.085 × milk yield <10 and (2) 2.327 × SNF − 2.703 × lactose + 0.225 × MUN <10. The sensitivity, specificity and the area under the curve (AUC) of the diagnostic rules were (1) 0.800, 0.729 and 0.811; (2) 0.813, 0.730 and 0.787, respectively. The P/F ratio, which is a widely used measure of ketosis, provided the sensitivity, specificity and AUC values of (1) 0.711, 0.726 and 0.781; and (2) 0.678, 0.767 and 0.738, respectively. PMID:26074408

  13. Genomic-Enabled Prediction of Ordinal Data with Bayesian Logistic Ordinal Regression

    PubMed Central

    Montesinos-López, Osval A.; Montesinos-López, Abelardo; Crossa, José; Burgueño, Juan; Eskridge, Kent

    2015-01-01

    Most genomic-enabled prediction models developed so far assume that the response variable is continuous and normally distributed. The exception is the probit model, developed for ordered categorical phenotypes. In statistical applications, because of the easy implementation of the Bayesian probit ordinal regression (BPOR) model, Bayesian logistic ordinal regression (BLOR) is implemented rarely in the context of genomic-enabled prediction [sample size (n) is much smaller than the number of parameters (p)]. For this reason, in this paper we propose a BLOR model using the Pólya-Gamma data augmentation approach that produces a Gibbs sampler with similar full conditional distributions of the BPOR model and with the advantage that the BPOR model is a particular case of the BLOR model. We evaluated the proposed model by using simulation and two real data sets. Results indicate that our BLOR model is a good alternative for analyzing ordinal data in the context of genomic-enabled prediction with the probit or logit link. PMID:26290569

  14. Genomic-Enabled Prediction of Ordinal Data with Bayesian Logistic Ordinal Regression.

    PubMed

    Montesinos-López, Osval A; Montesinos-López, Abelardo; Crossa, José; Burgueño, Juan; Eskridge, Kent

    2015-10-01

    Most genomic-enabled prediction models developed so far assume that the response variable is continuous and normally distributed. The exception is the probit model, developed for ordered categorical phenotypes. In statistical applications, because of the easy implementation of the Bayesian probit ordinal regression (BPOR) model, Bayesian logistic ordinal regression (BLOR) is implemented rarely in the context of genomic-enabled prediction [sample size (n) is much smaller than the number of parameters (p)]. For this reason, in this paper we propose a BLOR model using the Pólya-Gamma data augmentation approach that produces a Gibbs sampler with similar full conditional distributions of the BPOR model and with the advantage that the BPOR model is a particular case of the BLOR model. We evaluated the proposed model by using simulation and two real data sets. Results indicate that our BLOR model is a good alternative for analyzing ordinal data in the context of genomic-enabled prediction with the probit or logit link. PMID:26290569

  15. Feature Selection and Cancer Classification via Sparse Logistic Regression with the Hybrid L1/2 +2 Regularization

    PubMed Central

    Huang, Hai-Hui; Liu, Xiao-Ying; Liang, Yong

    2016-01-01

    Cancer classification and feature (gene) selection plays an important role in knowledge discovery in genomic data. Although logistic regression is one of the most popular classification methods, it does not induce feature selection. In this paper, we presented a new hybrid L1/2 +2 regularization (HLR) function, a linear combination of L1/2 and L2 penalties, to select the relevant gene in the logistic regression. The HLR approach inherits some fascinating characteristics from L1/2 (sparsity) and L2 (grouping effect where highly correlated variables are in or out a model together) penalties. We also proposed a novel univariate HLR thresholding approach to update the estimated coefficients and developed the coordinate descent algorithm for the HLR penalized logistic regression model. The empirical results and simulations indicate that the proposed method is highly competitive amongst several state-of-the-art methods. PMID:27136190

  16. [Application of SAS macro to evaluated multiplicative and additive interaction in logistic and Cox regression in clinical practices].

    PubMed

    Nie, Z Q; Ou, Y Q; Zhuang, J; Qu, Y J; Mai, J Z; Chen, J M; Liu, X Q

    2016-05-10

    Conditional and unconditional logistic regression analyses are commonly used in case-control studies, whereas the Cox proportional hazards model is often used in survival data analysis. Most of the literature refers only to main-effect models; however, the generalized linear model differs from the general linear model, and interaction comprises multiplicative interaction and additive interaction. The former has only statistical significance, whereas the latter has biological significance. In this paper, macros were written in SAS 9.4 to calculate the contrast ratio, the attributable proportion due to interaction and the synergy index alongside the interaction terms of logistic and Cox regression, and Wald, delta and profile likelihood confidence intervals were used to evaluate additive interaction, as a reference for big data analysis in clinical epidemiology and for the analysis of genetic multiplicative and additive interactions. PMID:27188374
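
    The point estimates of the three additive-interaction measures follow from the fitted coefficients of a model with two binary exposures and their product term. The sketch below is in Python rather than SAS and is not the authors' macro. It assumes that the "contrast ratio" is the quantity usually called the relative excess risk due to interaction (RERI), and it omits the confidence intervals. The function name and coefficient values are hypothetical.

```python
import math

def additive_interaction(b1, b2, b3):
    """RERI, AP and S from coefficients of exposure A (b1), exposure B (b2)
    and the product term A*B (b3) of a logistic (or Cox) model."""
    or10 = math.exp(b1)             # A only
    or01 = math.exp(b2)             # B only
    or11 = math.exp(b1 + b2 + b3)   # both exposures
    reri = or11 - or10 - or01 + 1.0
    ap = reri / or11
    s = (or11 - 1.0) / ((or10 - 1.0) + (or01 - 1.0))
    return reri, ap, s

# No multiplicative interaction (b3 = 0), yet the additive measures are non-null.
reri, ap, s = additive_interaction(math.log(2.0), math.log(3.0), 0.0)
print(reri, ap, s)
```

    In this example the odds ratios are 2, 3 and 6, so the product term is zero while RERI = 2, AP = 1/3 and S = 5/3, which shows that the two scales of interaction need not agree.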

  17. Multiple logistic regression model of signalling practices of drivers on urban highways

    NASA Astrophysics Data System (ADS)

    Puan, Othman Che; Ibrahim, Muttaka Na'iya; Zakaria, Rozana

    2015-05-01

    Signalling is a way of informing other road users, especially conflicting drivers, of a driver's intention to change his/her course of movement. Other users are exposed to hazardous situations and the risk of accidents if a driver who changes course fails to signal as required. This paper describes the application of a logistic regression model to the analysis of drivers' signalling practices on multilane highways, based on factors that may affect a driver's decision, such as the driver's gender, vehicle type, vehicle speed and traffic flow intensity. Data for the analysis of these factors were collected manually. More than 2000 drivers who performed a lane-changing manoeuvre while driving on two sections of multilane highway were observed. The study finds that a relatively large proportion of drivers failed to give any signal when changing lanes. The analysis indicates that, although the proportion of drivers who failed to signal before a lane-changing manoeuvre is high, compliance among female drivers is better than among male drivers. A binary logistic model was developed to represent the probability that a driver signals before a lane-changing manoeuvre. The model indicates that the driver's gender, the type of vehicle driven, vehicle speed and traffic volume influence the driver's decision to signal before a lane-changing manoeuvre on a multilane urban highway. In terms of the type of vehicle driven, about 97% of motorcyclists failed to comply with the signalling requirement. The proportion of non-compliant drivers under stable traffic flow conditions is much higher than when the flow is relatively heavy. This is consistent with the data, which indicate a high degree of non-compliance when the average speed of the traffic stream is relatively high.

  18. Logistic regression and artificial neural network models for mapping of regional-scale landslide susceptibility in volcanic mountains of West Java (Indonesia)

    NASA Astrophysics Data System (ADS)

    Ngadisih, Bhandary, Netra P.; Yatabe, Ryuichi; Dahal, Ranjan K.

    2016-05-01

    West Java Province is the most landslide-prone area in Indonesia owing to extreme geo-morphological conditions, climatic conditions and densely populated settlements with immense completed and ongoing development activities. A landslide susceptibility map at regional scale in this province is therefore a fundamental tool for risk management and land-use planning. Logistic regression and Artificial Neural Network (ANN) models are the most frequently used tools for landslide susceptibility assessment, mainly because they are capable of handling the nature of landslide data. The main objective of this study is to apply logistic regression and ANN models and compare their performance for landslide susceptibility mapping in volcanic mountains of West Java Province. In addition, the model application is proposed to identify the factors contributing most to landslide events in the study area. The spatial database built in a GIS platform consists of a landslide inventory, four topographical parameters (slope, aspect, relief, distance to river), three geological parameters (distance to volcano crater, distance to thrust and fault, geological formation), and two anthropogenic parameters (distance to road, land use). The logistic regression model in this study revealed that slope, geological formation, distance to road and distance to volcano are the most influential factors of landslide events, while the ANN model revealed that distance to volcano crater, geological formation, distance to road, and land use are the most important causal factors of landslides in the study area. Moreover, an evaluation of the models showed that the ANN model has a higher accuracy than the logistic regression model.

  19. A logistic regression based approach for the prediction of flood warning threshold exceedance

    NASA Astrophysics Data System (ADS)

    Diomede, Tommaso; Trotter, Luca; Stefania Tesini, Maria; Marsigli, Chiara

    2016-04-01

    A method based on logistic regression is proposed for the prediction of river level threshold exceedance at short (+0-18h) and medium (+18-42h) lead times. The aim of the study is to provide a valuable tool for the issuing of warnings by the authority responsible for public safety in case of flood. The role of different precipitation periods as predictors for the exceedance of a fixed river level has been investigated, in order to derive significant information for flood forecasting. Based on catchment-averaged values, a separation of "antecedent" and "peak-triggering" rainfall amounts as independent variables is attempted. In particular, the following flood-related precipitation periods have been considered: (i) the period from 1 to n days before the forecast issue time, which may be relevant for the soil saturation, (ii) the last 24 hours, which may be relevant for the current water level in the river, and (iii) the period from 0 to x hours in advance with respect to the forecast issue time, when the flood-triggering precipitation generally occurs. Several combinations and values of these predictors have been tested to optimise the method implementation. In particular, the period for the precursor antecedent precipitation ranges between 5 and 45 days; the state of the river can be represented by the last 24-h precipitation or, as an alternative, by the current river level. The flood-triggering precipitation has been cumulated over the next 18 hours (for the short lead time) and 36-42 hours (for the medium lead time). The proposed approach requires a specific implementation of logistic regression for each river section and warning threshold. The method performance has been evaluated over the Santerno river catchment (about 450 km²) in the Emilia-Romagna Region, northern Italy. A statistical analysis in terms of false alarms, misses and related scores was carried out by using an 8-year-long database. The results are quite satisfactory, with slightly better performances

  20. Flood susceptible analysis at Kelantan river basin using remote sensing and logistic regression model

    NASA Astrophysics Data System (ADS)

    Pradhan, Biswajeet

    In 2006 and 2007, heavy monsoon rainfall triggered floods along Malaysia's east coast as well as in the southern state of Johor. The hardest hit areas are along the east coast of peninsular Malaysia in the states of Kelantan, Terengganu and Pahang. The city of Johor was particularly hard hit in the south. The flood cost nearly billion ringgit of property and many lives. The extent of damage could have been reduced or minimized if an early warning system had been in place. This paper deals with flood susceptibility analysis using a logistic regression model. We have evaluated the flood susceptibility and the effect of flood-related factors along the Kelantan river basin using the Geographic Information System (GIS) and remote sensing data. Previously flooded areas were extracted from archived RADARSAT images using image processing tools. Flood susceptibility mapping was conducted in the study area along the Kelantan River using RADARSAT imagery and then enlarged to 1:25,000 scale. Topographical, hydrological, geological data and satellite images were collected, processed, and constructed into a spatial database using GIS and image processing. The factors chosen that influence flood occurrence were: topographic slope, topographic aspect, topographic curvature, DEM and distance from river drainage, all from the topographic database; flow direction and flow accumulation, extracted from the hydrological database; geology and distance from lineament, taken from the geologic database; land use from SPOT satellite images; soil texture from the soil database; and the vegetation index value from SPOT satellite images. Flood susceptible areas were analyzed and mapped using the probability-logistic regression model. Results indicate that flood prone areas can be mapped at 1:25,000, which is comparable to some conventional flood hazard map scales. The flood prone areas delineated on these maps correspond to areas that would be inundated by significant flooding

  1. Confidence intervals after multiple imputation: combining profile likelihood information from logistic regressions.

    PubMed

    Heinze, Georg; Ploner, Meinhard; Beyea, Jan

    2013-12-20

    In the logistic regression analysis of a small-sized, case-control study on Alzheimer's disease, some of the risk factors exhibited missing values, motivating the use of multiple imputation. Usually, Rubin's rules (RR) for combining point estimates and variances would then be used to estimate (symmetric) confidence intervals (CIs), on the assumption that the regression coefficients were distributed normally. Yet, rarely is this assumption tested, with or without transformation. In analyses of small, sparse, or nearly separated data sets, such symmetric CI may not be reliable. Thus, RR alternatives have been considered, for example, Bayesian sampling methods, but not yet those that combine profile likelihoods, particularly penalized profile likelihoods, which can remove first order biases and guarantee convergence of parameter estimation. To fill the gap, we consider the combination of penalized likelihood profiles (CLIP) by expressing them as posterior cumulative distribution functions (CDFs) obtained via a chi-squared approximation to the penalized likelihood ratio statistic. CDFs from multiple imputations can then easily be averaged into a combined CDF_c, allowing confidence limits for a parameter β at level 1 - α to be identified as those β* and β** that satisfy CDF_c(β*) = α/2 and CDF_c(β**) = 1 - α/2. We demonstrate that the CLIP method outperforms RR in analyzing both simulated data and data from our motivating example. CLIP can also be useful as a confirmatory tool, should it show that the simpler RR are adequate for extended analysis. We also compare the performance of CLIP to Bayesian sampling methods using Markov chain Monte Carlo. CLIP is available in the R package logistf. PMID:23873477
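
    The averaging step described in this abstract can be sketched as follows. This is a minimal illustration, not the logistf implementation: the per-imputation CDFs here are hypothetical normal CDFs standing in for the profile-likelihood-based CDFs of the paper, and the function names, search bounds, and example values are assumptions made for the sketch.

```python
from statistics import NormalDist


def combined_cdf(cdfs, beta):
    """Average the per-imputation CDFs at one parameter value."""
    return sum(cdf(beta) for cdf in cdfs) / len(cdfs)


def solve_quantile(cdfs, target, lo, hi, tol=1e-10):
    """Bisection for the beta at which the combined CDF equals `target`.

    Assumes the combined CDF is nondecreasing and [lo, hi] brackets the root.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if combined_cdf(cdfs, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clip_style_interval(cdfs, alpha=0.05, lo=-50.0, hi=50.0):
    """Limits where the combined CDF equals alpha/2 and 1 - alpha/2."""
    lower = solve_quantile(cdfs, alpha / 2, lo, hi)
    upper = solve_quantile(cdfs, 1 - alpha / 2, lo, hi)
    return lower, upper


# Hypothetical stand-ins for three imputed data sets (not from the paper).
imputation_cdfs = [
    NormalDist(mu, sigma).cdf
    for mu, sigma in [(0.8, 0.40), (1.0, 0.50), (1.1, 0.45)]
]
lower, upper = clip_style_interval(imputation_cdfs)
print(f"95% interval: ({lower:.3f}, {upper:.3f})")
```

    Because the limits are read off the averaged CDF rather than built from a pooled variance, the resulting interval need not be symmetric around a point estimate.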

  2. Prediction of Depression in Cancer Patients With Different Classification Criteria, Linear Discriminant Analysis versus Logistic Regression

    PubMed Central

    Shayan, Zahra; Mezerji, Naser Mohammad Gholi; Shayan, Leila; Naseri, Parisa

    2016-01-01

    Background: Logistic regression (LR) and linear discriminant analysis (LDA) are two popular statistical models for prediction of group membership. Although they are very similar, the LDA makes more assumptions about the data. When categorical and continuous variables are used simultaneously, the optimal choice between the two models is questionable. In most studies, classification error (CE) is used to discriminate between subjects in several groups, but this index is not suitable to predict the accuracy of the outcome. The present study compared LR and LDA models using classification indices. Methods: This cross-sectional study selected 243 cancer patients. Sample sets of different sizes (n = 50, 100, 150, 200, 220) were randomly selected and the CE, B, and Q classification indices were calculated by the LR and LDA models. Results: CE revealed a lack of superiority for one model over the other, but the results showed that LR performed better than LDA for the B and Q indices in all situations. No significant effect for sample size on CE was noted for selection of an optimal model. Assessment of the accuracy of prediction of real data indicated that the B and Q indices are appropriate for selection of an optimal model. Conclusion: The results of this study showed that LR performs better in some cases and LDA in others when based on CE. The CE index is not appropriate for classification, although the B and Q indices performed better and offered more efficient criteria for comparison and discrimination between groups.

  3. Modified Logistic Regression Models Using Gene Coexpression and Clinical Features to Predict Prostate Cancer Progression

    PubMed Central

    Zhao, Hongya; Logothetis, Christopher J.; Gorlov, Ivan P.; Zeng, Jia; Dai, Jianguo

    2013-01-01

    Predicting disease progression is one of the most challenging problems in prostate cancer research. Adding gene expression data to prediction models that are based on clinical features has been proposed to improve accuracy. In the current study, we applied a logistic regression (LR) model combining clinical features and gene co-expression data to improve the accuracy of the prediction of prostate cancer progression. The top-scoring pair (TSP) method was used to select genes for the model. The proposed models not only preserved the basic properties of the TSP algorithm but also incorporated the clinical features into the prognostic models. Based on the statistical inference with the iterative cross validation, we demonstrated that prediction LR models that included genes selected by the TSP method provided better predictions of prostate cancer progression than those using clinical variables only and/or those that included genes selected by the one-gene-at-a-time approach. Thus, we conclude that TSP selection is a useful tool for feature (and/or gene) selection to use in prognostic models and our model also provides an alternative for predicting prostate cancer progression. PMID:24367394

  4. Effect of driver, roadway, collision, and vehicle characteristics on crash severity: a conditional logistic regression approach.

    PubMed

    Kadilar, Gamze Ozel

    2016-06-01

    The aim of the study is to examine the factors that appear to have a higher potential for serious injury or death of drivers in traffic accidents in Turkey, such as collision type, roadway surface, vehicle speed, alcohol/drug use, and restraint use. Driver crash severity is the dependent variable of this study with two categories, fatal and non-fatal. Due to the binary nature of the dependent variable, a conditional logistic regression analysis was found suitable. Of the 16 independent variables obtained from Turkish police accident reports, 11 variables were found most significantly associated with driver crash severity. They are age, education level, restraint use, roadway condition, roadway type, time of day, collision location, collision type, number and direction of vehicles, vehicle speed, and alcohol/drug use. This study found that belted drivers aged 18-25 years involving two vehicles travelling in the same direction, in an urban area, during the daytime, and on an avenue or a street have better chances of survival in traffic accidents. PMID:25087577

  5. Predictive occurrence models for coastal wetland plant communities: Delineating hydrologic response surfaces with multinomial logistic regression

    NASA Astrophysics Data System (ADS)

    Snedden, Gregg A.; Steyer, Gregory D.

    2013-02-01

    Understanding plant community zonation along estuarine stress gradients is critical for effective conservation and restoration of coastal wetland ecosystems. We related the presence of plant community types to estuarine hydrology at 173 sites across coastal Louisiana. Percent relative cover by species was assessed at each site near the end of the growing season in 2008, and hourly water level and salinity were recorded at each site Oct 2007-Sep 2008. Nine plant community types were delineated with k-means clustering, and indicator species were identified for each of the community types with indicator species analysis. An inverse relation between salinity and species diversity was observed. Canonical correspondence analysis (CCA) effectively segregated the sites across ordination space by community type, and indicated that salinity and tidal amplitude were both important drivers of vegetation composition. Multinomial logistic regression (MLR) and Akaike's Information Criterion (AIC) were used to predict the probability of occurrence of the nine vegetation communities as a function of salinity and tidal amplitude, and probability surfaces obtained from the MLR model corroborated the CCA results. The weighted kappa statistic, calculated from the confusion matrix of predicted versus actual community types, was 0.7 and indicated good agreement between observed community types and model predictions. Our results suggest that models based on a few key hydrologic variables can be valuable tools for predicting vegetation community development when restoring and managing coastal wetlands.
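
    The weighted kappa mentioned above is computed from a confusion matrix of predicted versus observed classes. The sketch below shows one common form of the calculation. The abstract does not state which weighting scheme or category ordering was used, so the linear and quadratic disagreement weights, the function name, and the example matrix are all assumptions for illustration.

```python
def weighted_kappa(confusion, weight="linear"):
    """Weighted kappa for a square confusion matrix (list of lists of counts).

    Disagreement weights grow with the distance |i - j| between categories,
    which presumes the categories are listed in a meaningful order.
    Assumes at least two categories and counts in more than one category.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        d = abs(i - j) / (k - 1)
        return d if weight == "linear" else d * d

    # Observed weighted disagreement, as a proportion of all cases.
    observed = sum(
        disagreement(i, j) * confusion[i][j]
        for i in range(k) for j in range(k)
    ) / n
    # Weighted disagreement expected by chance from the marginal totals.
    expected = sum(
        disagreement(i, j) * row_totals[i] * col_totals[j]
        for i in range(k) for j in range(k)
    ) / (n * n)
    return 1.0 - observed / expected


# Hypothetical 3-class confusion matrix (rows: observed, columns: predicted).
example = [
    [20, 5, 0],
    [3, 15, 2],
    [0, 4, 11],
]
print(round(weighted_kappa(example), 3))
```

    A value of 1 means perfect agreement and 0 means agreement no better than chance, which is the scale on which the reported 0.7 is read as good agreement.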

  6. Knowledge and perception on tuberculosis transmission in Tanzania: Multinomial logistic regression analysis of secondary data.

    PubMed

    Ismail, Abbas; Josephat, Peter

    2014-01-01

    Tuberculosis (TB) is one of the most important public health problems in Tanzania and was declared a national public health emergency in 2006. Community and individual knowledge and perceptions are critical factors in the control of the disease. The objective of this study was to analyze knowledge and perception of the transmission of TB in Tanzania. Multinomial logistic regression analysis was used to quantify the impact of knowledge and perception on TB. The data were secondary data taken from a larger national survey, the 2007-08 Tanzania HIV/AIDS and Malaria Indicator Survey. The findings across groups revealed that knowledge of TB transmission increased with age and level of education. People in rural areas had less knowledge regarding tuberculosis transmission compared to those in urban areas [OR = 0.7]. People with access to radio [OR = 1.7] were more knowledgeable on tuberculosis transmission compared to those who did not have access to radio. People who did not have a telephone [OR = 0.6] were less knowledgeable on the route of tuberculosis transmission compared to those who had a telephone. The findings showed that socio-demographic factors such as age, education, place of residence and owning a telephone or radio varied systematically with knowledge on tuberculosis transmission. PMID:26867270
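
    To make the reported odds ratios concrete, the sketch below shows how an odds ratio rescales the odds of an outcome and what that implies for a probability. It treats each odds ratio as a simple comparison against a reference group. The baseline probability of 0.5 and the function names are hypothetical and are not taken from the study.

```python
import math


def apply_odds_ratio(baseline_prob, odds_ratio):
    """Probability obtained by multiplying the baseline odds by an odds ratio."""
    odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)


def coefficient_from_odds_ratio(odds_ratio):
    """Logistic regression coefficient (log-odds scale) for an odds ratio."""
    return math.log(odds_ratio)


# Hypothetical baseline: half of the reference group gives the outcome.
baseline = 0.5
for label, odds_ratio in [("rural vs urban", 0.7),
                          ("radio vs no radio", 1.7),
                          ("no telephone vs telephone", 0.6)]:
    prob = apply_odds_ratio(baseline, odds_ratio)
    coef = coefficient_from_odds_ratio(odds_ratio)
    print(f"{label}: OR={odds_ratio}, coefficient={coef:.3f}, "
          f"probability={prob:.3f}")
```

    An odds ratio below 1 corresponds to a negative coefficient and a lower probability than the reference group, and an odds ratio above 1 to a positive coefficient and a higher probability.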

  7. Kernel-based logistic regression model for protein sequence without vectorialization.

    PubMed

    Fong, Youyi; Datta, Saheli; Georgiev, Ivelin S; Kwong, Peter D; Tomaras, Georgia D

    2015-07-01

    Protein sequence data arise more and more often in vaccine and infectious disease research. These types of data are discrete, high-dimensional, and complex. We propose to study the impact of protein sequences on binary outcomes using a kernel-based logistic regression model, which models the effect of protein through a random effect whose variance-covariance matrix is mostly determined by a kernel function. We propose a novel, biologically motivated, profile hidden Markov model (HMM)-based mutual information (MI) kernel. Hypothesis testing can be carried out using the maximum of the score statistics and a parametric bootstrap procedure. To improve the power of testing, we propose intuitive modifications to the test statistic. We show through simulation studies that the profile HMM-based MI kernel can be substantially more powerful than competing kernels, and that the modified test statistics bring incremental gains in power. We use these proposed methods to investigate two problems from HIV-1 vaccine research: (1) identifying segments of HIV-1 envelope (Env) protein that confer resistance to neutralizing antibody and (2) identifying segments of Env that are associated with attenuation of protective vaccine effect by antibodies of isotype A in the RV144 vaccine trial. PMID:25532524

  8. Binary Logistic Regression Analysis of Foramen Magnum Dimensions for Sex Determination

    PubMed Central

    Kamath, Venkatesh Gokuldas; Asif, Muhammed; Shetty, Radhakrishna; Avadhani, Ramakrishna

    2015-01-01

    Purpose. The structural integrity of foramen magnum is usually preserved in fire accidents and explosions due to its resistant nature and secluded anatomical position and this study attempts to determine its sexing potential. Methods. The sagittal and transverse diameters and area of foramen magnum of seventy-two skulls (41 male and 31 female) from south Indian population were measured. The analysis was done using Student's t-test, linear correlation, histogram, Q-Q plot, and Binary Logistic Regression (BLR) to obtain a model for sex determination. The predicted probabilities of BLR were analysed using Receiver Operating Characteristic (ROC) curve. Result. BLR analysis and ROC curve revealed that the predictability of the dimensions in sexing the crania was 69.6% for sagittal diameter, 66.4% for transverse diameter, and 70.3% for area of foramen. Conclusion. The sexual dimorphism of foramen magnum dimensions is established. However, due to considerable overlapping of male and female values, it is unwise to singularly rely on the foramen measurements. However, considering the high sex predictability percentage of its dimensions in the present study and the studies preceding it, the foramen measurements can be used to supplement other sexing evidence available so as to precisely ascertain the sex of the skeleton. PMID:26346917

  9. Multinomial logistic regression analysis for differentiating 3 treatment outcome trajectory groups for headache-associated disability.

    PubMed

    Lewis, Kristin Nicole; Heckman, Bernadette Davantes; Himawan, Lina

    2011-08-01

    Growth mixture modeling (GMM) identified latent groups based on treatment outcome trajectories of headache disability measures in patients in headache subspecialty treatment clinics. Using a longitudinal design, 219 patients in headache subspecialty clinics in 4 large cities throughout Ohio provided data on their headache disability at pretreatment and 3 follow-up assessments. GMM identified 3 treatment outcome trajectory groups: (1) patients who initiated treatment with elevated disability levels and who reported statistically significant reductions in headache disability (high-disability improvers; 11%); (2) patients who initiated treatment with elevated disability but who reported no reductions in disability (high-disability nonimprovers; 34%); and (3) patients who initiated treatment with moderate disability and who reported statistically significant reductions in headache disability (moderate-disability improvers; 55%). Based on the final multinomial logistic regression model, a dichotomized treatment appointment attendance variable was a statistically significant predictor for differentiating high-disability improvers from high-disability nonimprovers. Three-fourths of patients who initiated treatment with elevated disability levels did not report reductions in disability after 5 months of treatment with new preventive pharmacotherapies. Preventive headache agents may be most efficacious for patients with moderate levels of disability and for patients with high disability levels who attend all treatment appointments. PMID:21420240

  10. Use of binary logistic regression technique with MODIS data to estimate wild fire risk

    NASA Astrophysics Data System (ADS)

    Fan, Hong; Di, Liping; Yang, Wenli; Bonnlander, Brian; Li, Xiaoyan

    2007-11-01

    Many forest fires occur across the globe each year, destroying life and property and strongly impacting ecosystems. In recent years, wildland fires and altered fire disturbance regimes have become a significant management and science problem affecting ecosystems and the wildland/urban interface across the United States and globally. In this paper, we discuss the estimation of 504 probability models for forecasting fire risk for 14 fuel types, 12 months, and one day/week/month in advance, which use 19 years of historical fire data in addition to meteorological and vegetation variables. MODIS land products are utilized as a major data source, and binary logistic regression was adopted to estimate the fire forecast probability. To better model the change of fire risk with the transition of seasons, some spatial and temporal stratification strategies were applied. To explore the possibilities of real-time prediction, the MATLAB Distributed Computing Toolbox was used to accelerate the prediction. Finally, this study gives an evaluation and validation of the predictions based on the ground truth collected. Validation results indicate that these fire risk models achieved nearly 70% prediction accuracy and that MODIS data are a potential data source for implementing near real-time fire risk prediction.
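
    The count of 504 models is consistent with one model per combination of fuel type, month, and forecast lead time (14 × 12 × 3). The abstract does not spell out this breakdown, so the enumeration below is an assumed reading, and the labels are placeholders.

```python
from itertools import product

FUEL_TYPES = range(1, 15)               # 14 fuel types (placeholder labels)
MONTHS = range(1, 13)                   # 12 calendar months
LEAD_TIMES = ("day", "week", "month")   # forecast horizons

# One (fuel type, month, lead time) key per stratified model.
model_keys = list(product(FUEL_TYPES, MONTHS, LEAD_TIMES))
print(len(model_keys))  # 504
```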

  11. Predictive occurrence models for coastal wetland plant communities: delineating hydrologic response surfaces with multinomial logistic regression

    USGS Publications Warehouse

    Snedden, Gregg A.; Steyer, Gregory D.

    2013-01-01

    Understanding plant community zonation along estuarine stress gradients is critical for effective conservation and restoration of coastal wetland ecosystems. We related the presence of plant community types to estuarine hydrology at 173 sites across coastal Louisiana. Percent relative cover by species was assessed at each site near the end of the growing season in 2008, and hourly water level and salinity were recorded at each site Oct 2007–Sep 2008. Nine plant community types were delineated with k-means clustering, and indicator species were identified for each of the community types with indicator species analysis. An inverse relation between salinity and species diversity was observed. Canonical correspondence analysis (CCA) effectively segregated the sites across ordination space by community type, and indicated that salinity and tidal amplitude were both important drivers of vegetation composition. Multinomial logistic regression (MLR) and Akaike's Information Criterion (AIC) were used to predict the probability of occurrence of the nine vegetation communities as a function of salinity and tidal amplitude, and probability surfaces obtained from the MLR model corroborated the CCA results. The weighted kappa statistic, calculated from the confusion matrix of predicted versus actual community types, was 0.7 and indicated good agreement between observed community types and model predictions. Our results suggest that models based on a few key hydrologic variables can be valuable tools for predicting vegetation community development when restoring and managing coastal wetlands.

  12. The likelihood of achieving quantified road safety targets: a binary logistic regression model for possible factors.

    PubMed

    Sze, N N; Wong, S C; Lee, C Y

    2014-12-01

    In the past several decades, many countries have set quantified road safety targets to motivate transport authorities to develop systematic road safety strategies and measures and facilitate the achievement of continuous road safety improvement. Studies have been conducted to evaluate the association between the setting of quantified road safety targets and road fatality reduction, in both the short and long run, by comparing road fatalities before and after the implementation of a quantified road safety target. However, not much work has been done to evaluate whether the quantified road safety targets are actually achieved. In this study, we used a binary logistic regression model to examine the factors - including vehicle ownership, fatality rate, and national income, in addition to level of ambition and duration of target - that contribute to a target's success. We analyzed 55 quantified road safety targets set by 29 countries from 1981 to 2009, and the results indicate that targets that are in progress and have lower levels of ambition had a higher likelihood of eventually being achieved. Moreover, possible interaction effects on the association between level of ambition and the likelihood of success are also revealed. PMID:25255417

  13. Comparison of linear discriminant analysis and logistic regression for data classification

    NASA Astrophysics Data System (ADS)

    Liong, Choong-Yeun; Foo, Sin-Fan

    2013-04-01

    Linear discriminant analysis (LDA) and logistic regression (LR) are often used for the purpose of classifying populations or groups using a set of predictor variables. Assumptions of multivariate normality and equal variance-covariance matrices across groups are required before proceeding with LDA, but such assumptions are not required for LR and hence LR is considered to be much more robust than LDA. In this paper, several real datasets which differ in terms of normality, number of independent variables and sample size are used to study the performance of both methods. The methods are compared based on the percentage of correct classification and the B index. The results show that overall, LR performs better regardless of whether the distribution of the data is normal or nonnormal. However, LR needs longer computing time than LDA as the sample size increases. The performance of LDA was also tested by using various prior probabilities. The results show that the average percentage of correct classification and the B index are higher when the prior probability is set based on the group size rather than using equal probabilities for all groups.

  14. Exploring the performance of logistic regression model types on growth/no growth data of Listeria monocytogenes.

    PubMed

    Gysemans, K P M; Bernaerts, K; Vermeulen, A; Geeraerd, A H; Debevere, J; Devlieghere, F; Van Impe, J F

    2007-03-20

    Several model types have already been developed to describe the boundary between growth and no growth conditions. In this article two types were thoroughly studied and compared, namely (i) the ordinary (linear) logistic regression model, i.e., with a polynomial on the right-hand side of the model equation (type I) and (ii) the (nonlinear) logistic regression model derived from a square root-type kinetic model (type II). The examination was carried out on the basis of the data described in Vermeulen et al. [Vermeulen, A., Gysemans, K.P.M., Bernaerts, K., Geeraerd, A.H., Van Impe, J.F., Debevere, J., Devlieghere, F., 2006-this issue. Influence of pH, water activity and acetic acid concentration on Listeria monocytogenes at 7 °C: data collection for the development of a growth/no growth model. International Journal of Food Microbiology.]. These data sets consist of growth/no growth data for Listeria monocytogenes as a function of water activity (0.960-0.990), pH (5.0-6.0) and acetic acid percentage (0-0.8% (w/w)), both for a monoculture and a mixed strain culture. Numerous replicates, namely twenty, were performed at closely spaced conditions. In this way detailed information was obtained about the position of the interface and the transition zone between growth and no growth. The main questions investigated were (i) which model type performs best on the monoculture and the mixed strain data, (ii) are there differences between the growth/no growth interfaces of monocultures and mixed strain cultures, (iii) which parameter estimation approach works best for the type II models, and (iv) how sensitive is the performance of these models to the values of their nonlinear-appearing parameters. The results showed that both type I and II models performed well on the monoculture data with respect to goodness-of-fit and predictive power. The type I models were, however, more sensitive to anomalous data points. The situation was different for the mixed strain culture. In

  15. Loss of Power in Logistic, Ordinal Logistic, and Probit Regression When an Outcome Variable Is Coarsely Categorized

    ERIC Educational Resources Information Center

    Taylor, Aaron B.; West, Stephen G.; Aiken, Leona S.

    2006-01-01

    Variables that have been coarsely categorized into a small number of ordered categories are often modeled as outcome variables in psychological research. The authors employ a Monte Carlo study to investigate the effects of this coarse categorization of dependent variables on power to detect true effects using three classes of regression models:…

  16. Using Logistic Regression To Predict the Probability of Debris Flows Occurring in Areas Recently Burned By Wildland Fires

    USGS Publications Warehouse

    Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.

    2003-01-01

    Logistic regression was used to predict the probability of debris flows occurring in areas recently burned by wildland fires. Multiple logistic regression is conceptually similar to multiple linear regression because statistical relations between one dependent variable and several independent variables are evaluated. In logistic regression, however, the dependent variable is transformed to a binary variable (debris flow did or did not occur), and the actual probability of the debris flow occurring is statistically modeled. Data from 399 basins located within 15 wildland fires that burned during 2000-2002 in Colorado, Idaho, Montana, and New Mexico were evaluated. More than 35 independent variables describing the burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows were delineated from National Elevation Data using a Geographic Information System (GIS). (2) Data describing the burn severity, geology, land surface gradient, rainfall, and soil properties were determined for each basin. These data were then downloaded to a statistics software package for analysis using logistic regression. (3) Relations between the occurrence/non-occurrence of debris flows and burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated and several preliminary multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combination produced the most effective model. The multivariate model that best predicted the occurrence of debris flows was selected. (4) The multivariate logistic regression model was entered into a GIS, and a map showing the probability of debris flows was constructed. The most effective model incorporates the percentage of each basin with slope greater than 30 percent, percentage of land burned at medium and high burn severity

  17. Multinomial logistic regression modelling of obesity and overweight among primary school students in a rural area of Negeri Sembilan

    SciTech Connect

    Ghazali, Amirul Syafiq Mohd; Ali, Zalila; Noor, Norlida Mohd; Baharum, Adam

    2015-10-22

    Multinomial logistic regression is widely used to model the outcomes of a polytomous response variable, a categorical dependent variable with more than two categories. The model assumes that the conditional mean of the dependent categorical variables is the logistic function of an affine combination of predictor variables. Its procedure gives a number of logistic regression models that make specific comparisons of the response categories. When there are q categories of the response variable, the model consists of q-1 logit equations which are fitted simultaneously. The model is validated by variable selection procedures, tests of regression coefficients, a significance test of the overall model, goodness-of-fit measures, and validation of predicted probabilities using odds ratios. This study used the multinomial logistic regression model to investigate obesity and overweight among primary school students in a rural area on the basis of their demographic profiles, lifestyles, and diet and food intake. The results indicated that obesity and overweight of students are related to gender, religion, sleep duration, time spent on electronic games, breakfast intake in a week, with whom meals are taken, protein intake, and also the interaction between breakfast intake in a week and sleep duration, and the interaction between gender and protein intake.
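
    The q-1 logit equations described above each compare one response category with a reference category, and together they determine all q category probabilities. The sketch below shows that mapping under the usual baseline-category formulation. The coefficient values, predictor values, and category labels are hypothetical and are not estimates from the study.

```python
import math


def multinomial_probabilities(x, coefficients):
    """Category probabilities from q-1 baseline-category logit equations.

    `coefficients` holds one (intercept, weights) pair per non-reference
    category; the reference category has its logit fixed at 0.
    Returns [P(reference), P(category 1), ..., P(category q-1)].
    """
    logits = [
        intercept + sum(w * xi for w, xi in zip(weights, x))
        for intercept, weights in coefficients
    ]
    # Subtract the largest logit before exponentiating for numerical stability.
    shift = max([0.0] + logits)
    ref_term = math.exp(-shift)
    terms = [math.exp(logit - shift) for logit in logits]
    denom = ref_term + sum(terms)
    return [ref_term / denom] + [t / denom for t in terms]


# Hypothetical example with q = 3 categories: normal weight (reference),
# overweight, obese. Predictors: gender indicator, sleep duration in hours.
coefs = [
    (0.4, [0.3, -0.15]),   # overweight vs normal weight
    (0.9, [0.5, -0.30]),   # obese vs normal weight
]
probs = multinomial_probabilities([1.0, 7.5], coefs)
print([round(p, 3) for p in probs])
```

    The log of the ratio between any non-reference probability and the reference probability recovers the corresponding logit, which is why each fitted equation can be read as a specific comparison of two response categories.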

  18. A novel hybrid method of beta-turn identification in protein using binary logistic regression and neural network

    PubMed Central

    Asghari, Mehdi Poursheikhali; Hayatshahi, Sayyed Hamed Sadat; Abdolmaleki, Parviz

    2012-01-01

    From both the structural and functional points of view, β-turns play important biological roles in proteins. In the present study, a novel two-stage hybrid procedure has been developed to identify β-turns in proteins. In the first stage, binary logistic regression was used, for the first time, to select significant sequence parameters for the identification of β-turns through a re-substitution test procedure. Sequence parameters consisted of 80 amino acid positional occurrences and 20 amino acid percentages in sequence. Among these parameters, the most significant ones selected by the binary logistic regression model were the percentages of Gly and Ser and the occurrence of Asn in position i+2, respectively, in sequence. These significant parameters have the highest effect on the constitution of a β-turn sequence. A neural network model was then constructed and fed by the parameters selected by binary logistic regression to build a hybrid predictor. The networks have been trained and tested on a non-homologous dataset of 565 protein chains. Applying a nine-fold cross-validation test to the dataset, the network reached an overall accuracy (Qtotal) of 74, which is comparable with the results of other β-turn prediction methods. In conclusion, this study proves that the parameter selection ability of binary logistic regression together with the prediction capability of neural networks leads to the development of more precise models for identifying β-turns in proteins.

  19. Multinomial logistic regression modelling of obesity and overweight among primary school students in a rural area of Negeri Sembilan

    NASA Astrophysics Data System (ADS)

    Ghazali, Amirul Syafiq Mohd; Ali, Zalila; Noor, Norlida Mohd; Baharum, Adam

    2015-10-01

    Multinomial logistic regression is widely used to model the outcomes of a polytomous response variable, a categorical dependent variable with more than two categories. The model assumes that the conditional mean of the dependent categorical variables is the logistic function of an affine combination of predictor variables. Its procedure gives a number of logistic regression models that make specific comparisons of the response categories. When there are q categories of the response variable, the model consists of q-1 logit equations which are fitted simultaneously. The model is validated by variable selection procedures, tests of regression coefficients, a significance test of the overall model, goodness-of-fit measures, and validation of predicted probabilities using odds ratios. This study used the multinomial logistic regression model to investigate obesity and overweight among primary school students in a rural area on the basis of their demographic profiles, lifestyles, and diet and food intake. The results indicated that obesity and overweight of students are related to gender, religion, sleep duration, time spent on electronic games, breakfast intake in a week, with whom meals are taken, protein intake, and also the interaction between breakfast intake in a week and sleep duration, and the interaction between gender and protein intake.

  20. The Overall Odds Ratio as an Intuitive Effect Size Index for Multiple Logistic Regression: Examination of Further Refinements

    ERIC Educational Resources Information Center

    Le, Huy; Marcus, Justin

    2012-01-01

    This study used Monte Carlo simulation to examine the properties of the overall odds ratio (OOR), which was recently introduced as an index for overall effect size in multiple logistic regression. It was found that the OOR was relatively independent of study base rate and performed better than most commonly used R-square analogs in indexing model…

  1. A Comparison of the Logistic Regression and Contingency Table Methods for Simultaneous Detection of Uniform and Nonuniform DIF

    ERIC Educational Resources Information Center

    Guler, Nese; Penfield, Randall D.

    2009-01-01

    In this study, we investigate the logistic regression (LR), Mantel-Haenszel (MH), and Breslow-Day (BD) procedures for the simultaneous detection of both uniform and nonuniform differential item functioning (DIF). A simulation study was used to assess and compare the Type I error rate and power of a combined decision rule (CDR), which assesses DIF…

  2. Estimation of Logistic Regression Models in Small Samples. A Simulation Study Using a Weakly Informative Default Prior Distribution

    ERIC Educational Resources Information Center

    Gordovil-Merino, Amalia; Guardia-Olmos, Joan; Pero-Cebollero, Maribel

    2012-01-01

    In this paper, we used simulations to compare the performance of classical and Bayesian estimations in logistic regression models using small samples. In the performed simulations, conditions were varied, including the type of relationship between independent and dependent variable values (i.e., unrelated and related values), the type of variable…

  3. Using Logistic Regression to Predict the Probability of Debris Flows in Areas Burned by Wildfires, Southern California, 2003-2006

    USGS Publications Warehouse

    Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.; Michael, John A.; Helsel, Dennis R.

    2008-01-01

    Logistic regression was used to develop statistical models that can be used to predict the probability of debris flows in areas recently burned by wildfires by using data from 14 wildfires that burned in southern California during 2003-2006. Twenty-eight independent variables describing the basin morphology, burn severity, rainfall, and soil properties of 306 drainage basins located within those burned areas were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows soon after the 2003 to 2006 fires were delineated from data in the National Elevation Dataset using a geographic information system; (2) Data describing the basin morphology, burn severity, rainfall, and soil properties were compiled for each basin. These data were then input to a statistics software package for analysis using logistic regression; and (3) Relations between the occurrence or absence of debris flows and the basin morphology, burn severity, rainfall, and soil properties were evaluated, and five multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combinations produced the most effective models, and the multivariate models that best predicted the occurrence of debris flows were identified. Percentage of high burn severity and 3-hour peak rainfall intensity were significant variables in all models. Soil organic matter content and soil clay content were significant variables in all models except Model 5. Soil slope was a significant variable in all models except Model 4. The most suitable model can be selected from these five models on the basis of the availability of independent variables in the particular area of interest and field checking of probability maps. The multivariate logistic regression models can be entered into a geographic information system, and maps showing the probability of debris flows can be constructed in recently burned areas of

  4. A logistic regression equation for estimating the probability of a stream flowing perennially in Massachusetts

    USGS Publications Warehouse

    Bent, Gardner C.; Archfield, Stacey A.

    2002-01-01

    A logistic regression equation was developed for estimating the probability of a stream flowing perennially at a specific site in Massachusetts. The equation provides city and town conservation commissions and the Massachusetts Department of Environmental Protection with an additional method for assessing whether streams are perennial or intermittent at a specific site in Massachusetts. This information is needed to assist these environmental agencies, who administer the Commonwealth of Massachusetts Rivers Protection Act of 1996, which establishes a 200-foot-wide protected riverfront area extending along the length of each side of the stream from the mean annual high-water line along each side of perennial streams, with exceptions in some urban areas. The equation was developed by relating the verified perennial or intermittent status of a stream site to selected basin characteristics of naturally flowing streams (no regulation by dams, surface-water withdrawals, ground-water withdrawals, diversion, waste-water discharge, and so forth) in Massachusetts. Stream sites used in the analysis were identified as perennial or intermittent on the basis of review of measured streamflow at sites throughout Massachusetts and on visual observation at sites in the South Coastal Basin, southeastern Massachusetts. Measured or observed zero flow(s) during months of extended drought as defined by the 310 Code of Massachusetts Regulations (CMR) 10.58(2)(a) were not considered when designating the perennial or intermittent status of a stream site. The database used to develop the equation included a total of 305 stream sites (84 intermittent- and 89 perennial-stream sites in the State, and 50 intermittent- and 82 perennial-stream sites in the South Coastal Basin). Stream sites included in the database had drainage areas that ranged from 0.14 to 8.94 square miles in the State and from 0.02 to 7.00 square miles in the South Coastal Basin. Results of the logistic regression analysis

  5. Logistic Regression for Seismically Induced Landslide Predictions: Using Uniform Hazard and Geophysical Layers as Predictor Variables

    NASA Astrophysics Data System (ADS)

    Nowicki, M. A.; Hearne, M.; Thompson, E.; Wald, D. J.

    2012-12-01

    Seismically induced landslides present costly and often fatal threats in many mountainous regions. Substantial effort has been invested to understand where seismically induced landslides may occur in the future. Both slope-stability methods and, more recently, statistical approaches to the problem are described throughout the literature. Though some regional efforts have succeeded, no uniformly agreed-upon method is available for predicting the likelihood and spatial extent of seismically induced landslides. For use in the U.S. Geological Survey (USGS) Prompt Assessment of Global Earthquakes for Response (PAGER) system, we would like to routinely make such estimates, in near-real time, around the globe. Here we use the recently produced USGS ShakeMap Atlas of historic earthquakes to develop an empirical landslide probability model. We focus on recent events, yet include any digitally-mapped landslide inventories for which well-constrained ShakeMaps are also available. We combine these uniform estimates of the input shaking (e.g., peak acceleration and velocity) with broadly available susceptibility proxies, such as topographic slope and surface geology. The resulting database is used to build a predictive model of the probability of landslide occurrence with logistic regression. The landslide database includes observations from the Northridge, California (1994); Wenchuan, China (2008); ChiChi, Taiwan (1999); and Chuetsu, Japan (2004) earthquakes; we also provide ShakeMaps for moderate-sized events without landslides for proper model testing and training. The performance of the regression model is assessed with both statistical goodness-of-fit metrics and a qualitative review of whether or not the model is able to capture the spatial extent of landslides for each event. Part of our goal is to determine which variables can be employed based on globally-available data or proxies, and whether or not modeling results from one region are transferrable to

  6. Logistic regression modeling to assess groundwater vulnerability to contamination in Hawaii, USA

    NASA Astrophysics Data System (ADS)

    Mair, Alan; El-Kadi, Aly I.

    2013-10-01

    Capture zone analysis combined with a subjective susceptibility index is currently used in Hawaii to assess vulnerability to contamination of drinking water sources derived from groundwater. In this study, we developed an alternative objective approach that combines well capture zones with multiple-variable logistic regression (LR) modeling and applied it to the highly-utilized Pearl Harbor and Honolulu aquifers on the island of Oahu, Hawaii. Input for the LR models utilized explanatory variables based on hydrogeology, land use, and well geometry/location. A suite of 11 target contaminants detected in the region, including elevated nitrate (> 1 mg/L), four chlorinated solvents, four agricultural fumigants, and two pesticides, was used to develop the models. We then tested the ability of the new approach to accurately separate groups of wells with low and high vulnerability, and the suitability of nitrate as an indicator of other types of contamination. Our results produced contaminant-specific LR models that accurately identified groups of wells with the lowest/highest reported detections and the lowest/highest nitrate concentrations. Current and former agricultural land uses were identified as significant explanatory variables for eight of the 11 target contaminants, while elevated nitrate was a significant variable for five contaminants. The utility of the combined approach is contingent on the availability of hydrologic and chemical monitoring data for calibrating groundwater and LR models. Application of the approach using a reference site with sufficient data could help identify key variables in areas with similar hydrogeology and land use but limited data. In addition, elevated nitrate may also be a suitable indicator of groundwater contamination in areas with limited data. The objective LR modeling approach developed in this study is flexible enough to address a wide range of contaminants and represents a suitable addition to the current subjective approach.

  7. Landslide susceptibility analysis with logistic regression model based on FCM sampling strategy

    NASA Astrophysics Data System (ADS)

    Wang, Liang-Jie; Sawada, Kazuhide; Moriguchi, Shuji

    2013-08-01

    Several mathematical models are used to predict the spatial distribution characteristics of landslides to mitigate damage caused by landslide disasters. Although some studies have achieved excellent results around the world, few studies take the inter-relationship of the selected points (training points) into account. In this paper, we present the Fuzzy c-means (FCM) algorithm as an optimal method for choosing the appropriate input landslide points as training data. Based on different combinations of the Fuzzy exponent (m) and the number of clusters (c), five groups of sampling points were derived from formal seed cells points and applied to analyze the landslide susceptibility in Mizunami City, Gifu Prefecture, Japan. A logistic regression model is applied to create the models of the relationships between landslide-conditioning factors and landslide occurrence. The pre-existing landslide bodies and the area under the relative operating characteristic (ROC) curve were used to evaluate the performance of all the models with different m and c. The results revealed that Model no. 4 (m=1.9, c=4) and Model no. 5 (m=1.9, c=5) have significantly high classification accuracies, i.e., 90.0%. Moreover, over 30% of the landslide bodies were grouped under the very high susceptibility zone. In addition, Model no. 4 and Model no. 5 had higher area under the ROC curve (AUC) values, which were 0.78 and 0.79, respectively. Therefore, Model no. 4 and Model no. 5 offer better model results for landslide susceptibility mapping. Maps derived from Model no. 4 and Model no. 5 would offer the local authorities crucial information for city planning and development.
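
    The AUC values quoted in this abstract summarize how well predicted probabilities rank landslide cells above non-landslide cells. A minimal sketch of that statistic follows, using the pairwise (Mann-Whitney) form; the scores and labels are made up for illustration, and the abstract does not say how the authors computed their AUC.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    scores : predicted probabilities (or any ranking score).
    labels : 1 for an event (e.g. landslide), 0 for a non-event.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Made-up example: three event cells and three non-event cells.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
result = auc(scores, labels)
print(result)
```

    An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. The double loop is quadratic in the number of cases, so a rank-based formula would be preferable for large rasters.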

  8. Shock Index Correlates with Extravasation on Angiographs of Gastrointestinal Hemorrhage: A Logistics Regression Analysis

    SciTech Connect

    Nakasone, Yutaka; Ikeda, Osamu; Yamashita, Yasuyuki; Kudoh, Kouichi; Shigematsu, Yoshinori; Harada, Kazunori

    2007-09-15

    We applied multivariate analysis to the clinical findings in patients with acute gastrointestinal (GI) hemorrhage and compared the relationship between these findings and angiographic evidence of extravasation. Our study population consisted of 46 patients with acute GI bleeding. They were divided into two groups. In group 1 we retrospectively analyzed 41 angiograms obtained in 29 patients (age range, 25-91 years; average, 71 years). Their clinical findings, including the shock index (SI), diastolic blood pressure, hemoglobin, platelet counts, and age, were quantitatively analyzed. In group 2, consisting of 17 patients (age range, 21-78 years; average, 60 years), we prospectively applied statistical analysis by a logistic regression model to their clinical findings and then assessed 21 angiograms obtained in these patients to determine whether our model was useful for predicting the presence of angiographic evidence of extravasation. On 18 of 41 (43.9%) angiograms in group 1 there was evidence of extravasation; in 3 patients it was demonstrated only by selective angiography. Factors significantly associated with angiographic visualization of extravasation were the SI and patient age. For differentiation between cases with and cases without angiographic evidence of extravasation, the maximum cutoff point was between 0.51 and 0.53. Of the 21 angiograms obtained in group 2, 13 (61.9%) showed evidence of extravasation; in 1 patient it was demonstrated only on selective angiograms. We found that in 90% of the cases, the prospective application of our model correctly predicted the angiographically confirmed presence or absence of extravasation. We conclude that in patients with GI hemorrhage, angiographic visualization of extravasation is associated with the pre-embolization SI. Patients with a high SI value should undergo study to facilitate optimal treatment planning.

  9. LOGISTIC NETWORK REGRESSION FOR SCALABLE ANALYSIS OF NETWORKS WITH JOINT EDGE/VERTEX DYNAMICS

    PubMed Central

    Almquist, Zack W.; Butts, Carter T.

    2015-01-01

    Change in group size and composition has long been an important area of research in the social sciences. Similarly, interest in interaction dynamics has a long history in sociology and social psychology. However, the effects of endogenous group change on interaction dynamics are a surprisingly understudied area. One way to explore these relationships is through social network models. Network dynamics may be viewed as a process of change in the edge structure of a network, in the vertex set on which edges are defined, or in both simultaneously. Although early studies of such processes were primarily descriptive, recent work on this topic has increasingly turned to formal statistical models. Although showing great promise, many of these modern dynamic models are computationally intensive and scale very poorly in the size of the network under study and/or the number of time points considered. Likewise, currently used models focus on edge dynamics, with little support for endogenously changing vertex sets. Here, the authors show how an existing approach based on logistic network regression can be extended to serve as a highly scalable framework for modeling large networks with dynamic vertex sets. The authors place this approach within a general dynamic exponential family (exponential-family random graph modeling) context, clarifying the assumptions underlying the framework (and providing a clear path for extensions), and they show how model assessment methods for cross-sectional networks can be extended to the dynamic case. Finally, the authors illustrate this approach on a classic data set involving interactions among windsurfers on a California beach. PMID:26120218

  10. Binary logistic regression analysis of hard palate dimensions for sexing human crania

    PubMed Central

    Asif, Muhammed; Shetty, Radhakrishna; Avadhani, Ramakrishna

    2016-01-01

    Sex determination is the preliminary step in every forensic investigation and the hard palate assumes significance in cranial sexing in cases involving burns and explosions due to its resistant nature and secluded location. This study analyzes the sexing potential of incisive foramen to posterior nasal spine length, palatine process of maxilla length, horizontal plate of palatine bone length and transverse length between the greater palatine foramina. The study deviates from the conventional method of measuring the maxillo-alveolar length and breadth as the dimensions considered in this study are more heat resistant and useful in situations with damaged alveolar margins. The study involves 50 male and 50 female adult dry skulls of Indian ethnic group. The dimensions measured were statistically analyzed using Student's t test, binary logistic regression and receiver operating characteristic curve. It was observed that the incisive foramen to posterior nasal spine length is a definite sex marker with sex predictability of 87.2%. The palatine process of maxilla length with 66.8% sex predictability and the horizontal plate of palatine bone length with 71.9% sex predictability cannot be relied upon as definite sex markers. The transverse length between the greater palatine foramina is statistically insignificant in sexing crania (P=0.318). Considering a significant overlap of values in both the sexes the palatal dimensions singularly cannot be relied upon for sexing. Nevertheless, considering the high sex predictability of incisive foramen to posterior nasal spine length this dimension can definitely be used to supplement other sexing evidence available to precisely conclude the cranial sex. PMID:27382518

  11. Wisconsin Card Sorting Test scores and clinical and sociodemographic correlates in Schizophrenia: multiple logistic regression analysis

    PubMed Central

    Banno, Masahiro; Koide, Takayoshi; Aleksic, Branko; Okada, Takashi; Kikuchi, Tsutomu; Kohmura, Kunihiro; Adachi, Yasunori; Kawano, Naoko; Iidaka, Tetsuya; Ozaki, Norio

    2012-01-01

    Objectives This study investigated what clinical and sociodemographic factors affected Wisconsin Card Sorting Test (WCST) factor scores of patients with schizophrenia to evaluate parameters or items of the WCST. Design Cross-sectional study. Setting Patients with schizophrenia from three hospitals participated. Participants Participants were recruited from July 2009 to August 2011. 131 Japanese patients with schizophrenia (84 men and 47 women, 43.5±13.8 years (mean±SD)) entered and completed the study. Participants were recruited in the study if they (1) met DSM-IV criteria for schizophrenia; (2) were physically healthy and (3) had no mood disorders, substance abuse, neurodevelopmental disorders, epilepsy or mental retardation. We examined their basic clinical and sociodemographic factors (sex, age, education years, age of onset, duration of illness, chlorpromazine equivalent doses and the positive and negative syndrome scale (PANSS) scores). Primary and secondary outcome measures All patients carried out the WCST Keio version. Five indicators were calculated, including categories achieved (CA), perseverative errors in Milner (PEM) and Nelson (PEN), total errors (TE) and difficulties of maintaining set (DMS). From the principal component analysis, we identified two factors (1 and 2). We assessed the relationship between these factor scores and clinical and sociodemographic factors, using multiple logistic regression analysis. Results Factor 1 was mainly composed of CA, PEM, PEN and TE. Factor 2 was mainly composed of DMS. The factor 1 score was affected by age, education years and the PANSS negative scale score. The factor 2 score was affected by duration of illness. Conclusions Age, education years, PANSS negative scale score and duration of illness affected WCST factor scores in patients with schizophrenia. Using WCST factor scores may reduce the possibility of type I errors due to multiple comparisons. PMID:23135537

  12. Bias-reduced and separation-proof conditional logistic regression with small or sparse data sets.

    PubMed

    Heinze, Georg; Puhr, Rainer

    2010-03-30

    Conditional logistic regression is used for the analysis of binary outcomes when subjects are stratified into several subsets, e.g. matched pairs or blocks. Log odds ratio estimates are usually found by maximizing the conditional likelihood. This approach eliminates all strata-specific parameters by conditioning on the number of events within each stratum. However, in the analyses of both an animal experiment and a lung cancer case-control study, conditional maximum likelihood (CML) resulted in infinite odds ratio estimates and monotone likelihood. Estimation can be improved by using Cytel Inc.'s well-known LogXact software, which provides a median unbiased estimate and exact or mid-p confidence intervals. Here, we suggest and outline point and interval estimation based on maximization of a penalized conditional likelihood in the spirit of Firth's (Biometrika 1993; 80:27-38) bias correction method (CFL). We present comparative analyses of both studies, demonstrating some advantages of CFL over competitors. We report on a small-sample simulation study where CFL log odds ratio estimates were almost unbiased, whereas LogXact estimates showed some bias and CML estimates exhibited serious bias. Confidence intervals and tests based on the penalized conditional likelihood had close-to-nominal coverage rates and yielded highest power among all methods compared, respectively. Therefore, we propose CFL as an attractive solution to the stratified analysis of binary data, irrespective of the occurrence of monotone likelihood. A SAS program implementing CFL is available at: http://www.muw.ac.at/msi/biometrie/programs. PMID:20213709
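
    For orientation, Firth's bias correction in its standard (unconditional) form adds half the log-determinant of the Fisher information to the log-likelihood. The second line below is only an assumed analogue for the conditional likelihood, written by analogy; the abstract does not give the exact CFL formulation, and the symbols are notation chosen here.

```latex
% Firth-type penalized log-likelihood (standard unconditional form):
\ell^{*}(\beta) = \ell(\beta) + \tfrac{1}{2}\,\log\bigl|I(\beta)\bigr|

% Assumed conditional analogue, with \ell_c the conditional log-likelihood
% and I_c(\beta) the information matrix derived from it:
\ell_c^{*}(\beta) = \ell_c(\beta) + \tfrac{1}{2}\,\log\bigl|I_c(\beta)\bigr|
```

    Because the penalty term tends to minus infinity as the information matrix becomes singular, the penalized maximum stays finite even when the unpenalized likelihood is monotone, which is the situation the abstract calls separation.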

  13. Spectral and Spatial-Based Classification for Broad-Scale Land Cover Mapping Based on Logistic Regression

    PubMed Central

    Mallinis, Georgios; Koutsias, Nikos

    2008-01-01

    Improvement of satellite sensor characteristics motivates the development of new techniques for satellite image classification. Spatial information seems to be critical in classification processes, especially for heterogeneous and complex landscapes such as those observed in the Mediterranean basin. In our study, a spectral classification method for LANDSAT-5 TM imagery that uses several binomial logistic regression models was developed, evaluated and compared to the familiar parametric maximum likelihood algorithm. The classification approach based on logistic regression modelling was extended to a contextual one by using autocovariates to consider spatial dependencies of every pixel with its neighbours. Finally, the maximum likelihood algorithm was upgraded to contextual by considering typicality, a measure which indicates the strength of class membership. The use of logistic regression for broad-scale land cover classification presented higher overall accuracy (75.61%), although not statistically significant, than the maximum likelihood algorithm (64.23%), even when the latter was refined following a spatial approach based on Mahalanobis distance (66.67%). However, the consideration of the spatial autocovariate in the logistic models significantly improved the fit of the models and increased the overall accuracy from 75.61% to 80.49%.
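
    An autocovariate of the kind described in this abstract can be sketched as a neighbourhood summary that is then added as an extra predictor in the logistic model. The version below assumes an unweighted mean over the 3x3 neighbourhood of a 0/1 class-membership grid; the neighbourhood size, the weighting, and the example grid are assumptions, not details from the paper.

```python
def autocovariate(grid, row, col):
    """Mean class membership (0/1) of the up-to-8 neighbours of one pixel.

    grid : list of equal-length rows holding 0/1 class indicators.
    Pixels on the image border simply have fewer neighbours.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    values = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the pixel itself
            r, c = row + dr, col + dc
            if 0 <= r < n_rows and 0 <= c < n_cols:
                values.append(grid[r][c])
    # A 1x1 grid has no neighbours; return 0.0 rather than dividing by zero.
    return sum(values) / len(values) if values else 0.0

# Made-up 3x3 indicator grid for one land cover class.
grid = [[1, 1, 0],
        [1, 0, 0],
        [0, 0, 0]]
center = autocovariate(grid, 1, 1)
corner = autocovariate(grid, 0, 0)
print(center, corner)
```

    In a contextual classifier the indicator grid would typically come from a first, purely spectral classification pass, and the model would then be refitted with the autocovariate included.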

  14. Multinomial Logistic Regression & Bootstrapping for Bayesian Estimation of Vertical Facies Prediction in Heterogeneous Sandstone Reservoirs

    NASA Astrophysics Data System (ADS)

    Al-Mudhafar, W. J.

    2013-12-01

    Precise prediction of rock facies leads to adequate reservoir characterization by improving the porosity-permeability relationships to estimate the properties in non-cored intervals. It also helps to accurately identify the spatial facies distribution to perform an accurate reservoir model for optimal future reservoir performance. In this paper, the facies estimation has been done through Multinomial logistic regression (MLR) with respect to the well logs and core data in a well in upper sandstone formation of South Rumaila oil field. The independent variables are gamma rays, formation density, water saturation, shale volume, log porosity, core porosity, and core permeability. First, the Robust Sequential Imputation Algorithm has been used to impute the missing data. This algorithm starts from a complete subset of the dataset and estimates sequentially the missing values in an incomplete observation by minimizing the determinant of the covariance of the augmented data matrix. Then, the observation is added to the complete data matrix and the algorithm continues with the next observation with missing values. The MLR has been chosen to estimate the maximum likelihood and minimize the standard error for the nonlinear relationships between facies & core and log data. The MLR is used to predict the probabilities of the different possible facies given each independent variable by constructing a linear predictor function having a set of weights that are linearly combined with the independent variables by using a dot product. Beta distribution of facies has been considered as prior knowledge and the resulting predicted probability (posterior) has been estimated from MLR based on Bayes' theorem that represents the relationship between predicted probability (posterior) with the conditional probability and the prior knowledge. To assess the statistical accuracy of the model, the bootstrap should be carried out to estimate extra-sample prediction error by randomly

  15. An Original Stepwise Multilevel Logistic Regression Analysis of Discriminatory Accuracy: The Case of Neighbourhoods and Health

    PubMed Central

    Wagner, Philippe; Ghith, Nermin; Leckie, George

    2016-01-01

    Background and Aim Many multilevel logistic regression analyses of “neighbourhood and health” focus on interpreting measures of associations (e.g., odds ratio, OR). In contrast, multilevel analysis of variance is rarely considered. We propose an original stepwise analytical approach that distinguishes between “specific” (measures of association) and “general” (measures of variance) contextual effects. Performing two empirical examples we illustrate the methodology, interpret the results and discuss the implications of this kind of analysis in public health. Methods We analyse 43,291 individuals residing in 218 neighbourhoods in the city of Malmö, Sweden in 2006. We study two individual outcomes (psychotropic drug use and choice of private vs. public general practitioner, GP) for which the relative importance of neighbourhood as a source of individual variation differs substantially. In Step 1 of the analysis, we evaluate the OR and the area under the receiver operating characteristic (AUC) curve for individual-level covariates (i.e., age, sex and individual low income). In Step 2, we assess general contextual effects using the AUC. Finally, in Step 3 the OR for a specific neighbourhood characteristic (i.e., neighbourhood income) is interpreted jointly with the proportional change in variance (i.e., PCV) and the proportion of ORs in the opposite direction (POOR) statistics. Results For both outcomes, information on individual characteristics (Step 1) provides low discriminatory accuracy (AUC = 0.616 for psychotropic drugs; = 0.600 for choosing a private GP). Accounting for neighbourhood of residence (Step 2) only improved the AUC for choosing a private GP (+0.295 units). High neighbourhood income (Step 3) was strongly associated with choosing a private GP (OR = 3.50) but the PCV was only 11% and the POOR 33%. Conclusion Applying an innovative stepwise multilevel analysis, we observed that, in Malmö, the neighbourhood context per se had a negligible

  16. The Application of the Cumulative Logistic Regression Model to Automated Essay Scoring

    ERIC Educational Resources Information Center

    Haberman, Shelby J.; Sinharay, Sandip

    2010-01-01

    Most automated essay scoring programs use a linear regression model to predict an essay score from several essay features. This article applied a cumulative logit model instead of the linear regression model to automated essay scoring. Comparison of the performances of the linear regression model and the cumulative logit model was performed on a…

  17. What's the Risk? A Simple Approach for Estimating Adjusted Risk Measures from Nonlinear Models Including Logistic Regression

    PubMed Central

    Kleinman, Lawrence C; Norton, Edward C

    2009-01-01

    Objective To develop and validate a general method (called regression risk analysis) to estimate adjusted risk measures from logistic and other nonlinear multiple regression models. We show how to estimate standard errors for these estimates. These measures could supplant various approximations (e.g., adjusted odds ratio [AOR]) that may diverge, especially when outcomes are common. Study Design Regression risk analysis estimates were compared with internal standards as well as with Mantel–Haenszel estimates, Poisson and log-binomial regressions, and a widely used (but flawed) equation to calculate adjusted risk ratios (ARR) from AOR. Data Collection Data sets produced using Monte Carlo simulations. Principal Findings Regression risk analysis accurately estimates ARR and differences directly from multiple regression models, even when confounders are continuous, distributions are skewed, outcomes are common, and effect size is large. It is statistically sound and intuitive, and has properties favoring it over other methods in many cases. Conclusions Regression risk analysis should be the new standard for presenting findings from multiple regression analysis of dichotomous outcomes for cross-sectional, cohort, and population-based case–control studies, particularly when outcomes are common or effect size is large. PMID:18793213
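
    As a rough illustration of how adjusted risks can be read off a logistic model, the sketch below averages model-predicted risks with the exposure switched on and then off for every record (often called marginal standardization). The coefficients, data rows, and function names are hypothetical, and whether this matches the authors' exact procedure is an assumption.

```python
import math

def predicted_risk(intercept, coefs, covariates):
    """Logistic model prediction: inverse logit of the linear predictor."""
    z = intercept + sum(b * x for b, x in zip(coefs, covariates))
    return 1.0 / (1.0 + math.exp(-z))

def adjusted_risks(intercept, coefs, rows, exposure_index):
    """Mean predicted risk when every record is set to exposed, then unexposed."""
    risk_exposed = 0.0
    risk_unexposed = 0.0
    for row in rows:
        exposed = list(row)
        exposed[exposure_index] = 1
        unexposed = list(row)
        unexposed[exposure_index] = 0
        risk_exposed += predicted_risk(intercept, coefs, exposed)
        risk_unexposed += predicted_risk(intercept, coefs, unexposed)
    n = len(rows)
    return risk_exposed / n, risk_unexposed / n

# Hypothetical fitted model; columns are (exposure, age in decades).
INTERCEPT = -2.0
COEFS = [0.8, 0.3]
ROWS = [(1, 3.0), (0, 4.5), (1, 6.0), (0, 2.5), (0, 5.0)]

r1, r0 = adjusted_risks(INTERCEPT, COEFS, ROWS, exposure_index=0)
adjusted_risk_ratio = r1 / r0
adjusted_risk_difference = r1 - r0
adjusted_odds_ratio = math.exp(COEFS[0])
```

    With a harmful exposure, the adjusted risk ratio computed this way lies between 1 and the adjusted odds ratio, which is why the two measures diverge when outcomes are common.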

  18. An empirical study of statistical properties of variance partition coefficients for multi-level logistic regression models

    USGS Publications Warehouse

    Li, J.; Gray, B.R.; Bates, D.M.

    2008-01-01

    Partitioning the variance of a response by design levels is challenging for binomial and other discrete outcomes. Goldstein (2003) proposed four definitions for variance partitioning coefficients (VPC) under a two-level logistic regression model. In this study, we explicitly derived formulae for the multi-level logistic regression model and subsequently studied the distributional properties of the calculated VPCs. Using simulations and a vegetation dataset, we demonstrated associations between different VPC definitions, the importance of methods for estimating VPCs (by comparing VPCs obtained using Laplace and penalized quasi-likelihood methods), and bivariate dependence between VPCs calculated at different levels. Such an empirical study lends immediate support to wider applications of VPC in scientific data analysis.
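
    For orientation, one widely used VPC definition for a two-level random-intercept logistic model is the latent-variable form, in which the level-1 residual variance is fixed at π²/3 (the variance of the standard logistic distribution). The sketch below assumes that this form is among the definitions the abstract refers to; the function name is hypothetical.

```python
import math

# Variance of the standard logistic distribution, used as the level-1 variance.
LOGISTIC_RESIDUAL_VARIANCE = math.pi ** 2 / 3

def vpc_latent(sigma2_u):
    """Share of latent-scale variance attributable to the level-2 random intercept."""
    if sigma2_u < 0:
        raise ValueError("variance must be non-negative")
    return sigma2_u / (sigma2_u + LOGISTIC_RESIDUAL_VARIANCE)
```

    For example, a level-2 variance equal to π²/3 gives a VPC of 0.5 under this definition.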

  19. Updated logistic regression equations for the calculation of post-fire debris-flow likelihood in the western United States

    USGS Publications Warehouse

    Staley, Dennis M.; Negri, Jacquelyn A.; Kean, Jason W.; Laber, Jayme L.; Tillery, Anne C.; Youberg, Ann M.

    2016-01-01

    Wildfire can significantly alter the hydrologic response of a watershed to the extent that even modest rainstorms can generate dangerous flash floods and debris flows. To reduce public exposure to hazard, the U.S. Geological Survey produces post-fire debris-flow hazard assessments for select fires in the western United States. We use publicly available geospatial data describing basin morphology, burn severity, soil properties, and rainfall characteristics to estimate the statistical likelihood that debris flows will occur in response to a storm of a given rainfall intensity. Using an empirical database and refined geospatial analysis methods, we defined new equations for the prediction of debris-flow likelihood using logistic regression methods. We showed that the new logistic regression model outperformed previous models used to predict debris-flow likelihood.

  20. Analysis of an Environmental Exposure Health Questionnaire in a Metropolitan Minority Population Utilizing Logistic Regression and Support Vector Machines

    PubMed Central

    Chen, Chau-Kuang; Bruce, Michelle; Tyler, Lauren; Brown, Claudine; Garrett, Angelica; Goggins, Susan; Lewis-Polite, Brandy; Weriwoh, Mirabel L; Juarez, Paul D.; Hood, Darryl B.; Skelton, Tyler

    2014-01-01

    The goal of this study was to analyze a 54-item instrument for assessment of perception of exposure to environmental contaminants within the context of the built environment, or exposome. This exposome was defined in five domains to include 1) home and hobby, 2) school, 3) community, 4) occupation, and 5) exposure history. Interviews were conducted with child-bearing-age minority women at Metro Nashville General Hospital at Meharry Medical College. Data were analyzed utilizing DTReg software for Support Vector Machine (SVM) modeling followed by an SPSS package for a logistic regression model. The target (outcome) variable of interest was respondent's residence by ZIP code. The results demonstrate that the rank order of important variables with respect to SVM modeling versus traditional logistic regression models is almost identical. This is the first study documenting that SVM analysis has discriminatory power for determination of higher-ordered spatial relationships on an environmental exposure history questionnaire. PMID:23395953

  1. Classical and Bayesian estimation in the logistic regression model applied to diagnosis of child attention deficit hyperactivity disorder.

    PubMed

    Gordóvil-Merino, Amalia; Guàrdia-Olmos, Joan; Peró-Cebollero, Maribel; de la Fuente-Solanas, Emilia I

    2010-04-01

    The limitations inherent in classical estimation of logistic regression models are well known. The Bayesian approach in statistical analysis is an alternative to be considered, given that it makes it possible to introduce prior information about the phenomenon under study. The aim of the present work is to analyze simple binary and multinomial logistic regression models estimated by means of a Bayesian approach in comparison to classical estimation. To that end, Child Attention Deficit Hyperactivity Disorder (ADHD) clinical data were analyzed. The sample included 286 participants aged 6-12 years (78% boys, 22% girls), with a positive ADHD diagnosis in 86.7% of the cases. The results show a reduction in the standard errors associated with the coefficients obtained from the Bayesian analysis, thus bringing greater stability to the coefficients. Complex models where parameter estimation may be easily compromised could benefit from this advantage. PMID:20524554

  2. The spatial prediction of landslide susceptibility applying artificial neural network and logistic regression models: A case study of Inje, Korea

    NASA Astrophysics Data System (ADS)

    Saro, Lee; Woo, Jeon Seong; Kwan-Young, Oh; Moung-Jin, Lee

    2016-02-01

    The aim of this study is to predict landslide susceptibility through spatial analysis, applying a statistical methodology based on a GIS. Logistic regression models and an artificial neural network were applied and validated to analyze landslide susceptibility in Inje, Korea. Landslide occurrence areas in the study area were identified from interpretations of optical remote sensing data (aerial photographs) followed by field surveys. A spatial database of forest, geophysical, soil and topographic data was built for the study area using the Geographical Information System (GIS). These factors were analysed using artificial neural network (ANN) and logistic regression models to generate a landslide susceptibility map. The study validates the landslide susceptibility maps by comparing them with landslide occurrence areas. The locations of landslide occurrence were divided randomly into a training set (50%) and a test set (50%). The training set was used to produce the landslide susceptibility map with the artificial neural network and logistic regression models, and the test set was retained to validate the prediction map. The validation results revealed that the artificial neural network model (with an accuracy of 80.10%) was better at predicting landslides than the logistic regression model (with an accuracy of 77.05%). Of the weights used in the artificial neural network model, 'slope' yielded the highest weight value (1.330), and 'aspect' yielded the lowest value (1.000). This research applied two statistical analysis methods in a GIS and compared their results. Based on the findings, we were able to derive a more effective method for analyzing landslide susceptibility.

  3. Influential factors of red-light running at signalized intersection and prediction using a rare events logistic regression model.

    PubMed

    Ren, Yilong; Wang, Yunpeng; Wu, Xinkai; Yu, Guizhen; Ding, Chuan

    2016-10-01

    Red light running (RLR) has become a major safety concern at signalized intersections. To prevent RLR-related crashes, it is critical to identify the factors that significantly affect drivers' RLR behavior and to predict potential RLR in real time. In this research, nine months of RLR events extracted from high-resolution traffic data collected by loop detectors at three signalized intersections were used to identify the factors that significantly affect RLR behavior. The data analysis indicated that occupancy time, time gap, used yellow time, time left to yellow start, whether the preceding vehicle runs through the intersection during yellow, and whether there is a vehicle passing through the intersection on the adjacent lane were significant factors for RLR behavior. Furthermore, due to the rare-events nature of RLR, a modified rare events logistic regression model was developed for RLR prediction. The rare events logistic regression method has been applied in many fields for rare events studies and shows impressive performance, but so far no previous research has applied this method to study RLR. The results showed that the rare events logistic regression model performed significantly better than the standard logistic regression model. More importantly, the proposed RLR prediction method is based purely on data collected from a single advance loop detector located 400 feet from the stop bar. This brings great potential for future field applications of the proposed method, since loops have been widely implemented at many intersections and can collect data in real time. This research is expected to contribute significantly to the improvement of intersection safety. PMID:27472815
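
    The abstract does not say how the model was modified. As general background only, one standard ingredient of rare events logistic regression is the prior correction of the intercept described by King and Zeng, which adjusts for a sample in which events are over-represented relative to the population. The sketch below shows that correction; it is an assumption that anything similar was used in this study, and the function name is hypothetical.

```python
import math

def prior_corrected_intercept(intercept, sample_rate, population_rate):
    """Shift a fitted intercept from the sample event rate to the population event rate.

    sample_rate:     fraction of events in the data used for fitting
    population_rate: fraction of events in the target population
    """
    for rate in (sample_rate, population_rate):
        if not 0.0 < rate < 1.0:
            raise ValueError("rates must lie strictly between 0 and 1")
    correction = math.log(
        ((1.0 - population_rate) / population_rate)
        * (sample_rate / (1.0 - sample_rate))
    )
    return intercept - correction
```

    When the sample and population rates coincide, the correction is zero and the intercept is unchanged; slope coefficients are not altered by this step.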

  4. A Comparison of Logistic Regression Model and Artificial Neural Networks in Predicting of Student’s Academic Failure

    PubMed Central

    Teshnizi, Saeed Hosseini; Ayatollahi, Sayyed Mohhamad Taghi

    2015-01-01

    Background and objective: Artificial Neural Networks (ANNs) have recently been applied in situations where an analysis based on the logistic regression (LR) is a standard statistical approach; direct comparisons of the results, however, are seldom attempted. In this study, we compared both logistic regression models and feed-forward neural networks on the academic failure data set. Methods: The data for this study included 18 questions about the study situation of 275 undergraduate students selected randomly from among the nursing and midwifery and paramedic schools of Hormozgan University of Medical Sciences in 2013. Logistic regression with the forward method and a feed-forward Artificial Neural Network with 15 neurons in the hidden layer were fitted to the dataset. The accuracy of the models in predicting academic failure was compared by using ROC (Receiver Operating Characteristic) and classification accuracy. Results: Among nine ANNs, the ANN with 15 neurons in the hidden layer performed better than LR. The Area Under the Receiver Operating Characteristic curve (AUROC) of the LR model and of the ANN with 15 neurons in the hidden layer was estimated as 0.55 and 0.89, respectively, and the AUROC of the ANN was significantly greater than that of the LR model. The LR and ANN models classified 77.5% and 84.3% of the students correctly, respectively. Conclusion: Based on this dataset, classification of the students into two groups, with and without academic failure, appears to be better with the ANN with 15 neurons in the hidden layer than with the LR model. PMID:26635438
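
    The two comparison metrics named above can be sketched in a few lines. The AUROC here is computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case (ties count one half), and accuracy uses an assumed 0.5 cut-off. The toy labels and scores are made up for illustration and are not the study's data.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases classified correctly at the given probability cut-off."""
    correct = sum(1 for y, s in zip(labels, scores) if (s >= threshold) == (y == 1))
    return correct / len(labels)

# Toy example (hypothetical values).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auroc = auroc(toy_labels, toy_scores)
toy_accuracy = accuracy(toy_labels, toy_scores)
```

    The two metrics answer different questions: accuracy depends on the chosen cut-off, while AUROC summarizes ranking quality across all cut-offs, which is how a model can have moderate accuracy and a low AUROC at the same time.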

  5. The quest for conditional independence in prospectivity modeling: weights-of-evidence, boost weights-of-evidence, and logistic regression

    NASA Astrophysics Data System (ADS)

    Schaeben, Helmut; Semmler, Georg

    2016-09-01

    The objective of prospectivity modeling is prediction of the conditional probability of the presence T = 1 or absence T = 0 of a target T given favorable or prohibitive predictors B, or construction of a two classes 0,1 classification of T. A special case of logistic regression called weights-of-evidence (WofE) is geologists' favorite method of prospectivity modeling due to its apparent simplicity. However, the numerical simplicity is deceiving as it is implied by the severe mathematical modeling assumption of joint conditional independence of all predictors given the target. General weights of evidence are explicitly introduced which are as simple to estimate as conventional weights, i.e., by counting, but do not require conditional independence. Complementary to the regression view is the classification view on prospectivity modeling. Boosting is the construction of a strong classifier from a set of weak classifiers. From the regression point of view it is closely related to logistic regression. Boost weights-of-evidence (BoostWofE) was introduced into prospectivity modeling to counterbalance violations of the assumption of conditional independence even though relaxation of modeling assumptions with respect to weak classifiers was not the (initial) purpose of boosting. In the original publication of BoostWofE a fabricated dataset was used to "validate" this approach. Using the same fabricated dataset it is shown that BoostWofE cannot generally compensate lacking conditional independence whatever the consecutively processing order of predictors. Thus the alleged features of BoostWofE are disproved by way of counterexamples, while theoretical findings are confirmed that logistic regression including interaction terms can exactly compensate violations of joint conditional independence if the predictors are indicators.
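
    To make "estimate by counting" concrete, the sketch below computes the conventional weights of evidence for a single binary predictor B from a 2×2 table of counts and then updates the prior log-odds of the target T. The counts and function names are hypothetical. With several predictors, summing their weights is valid only under the joint conditional independence assumption discussed above.

```python
import math

def weights_of_evidence(n_b_t, n_b_not_t, n_not_b_t, n_not_b_not_t):
    """Return (W+, W-) for one binary predictor from cell counts.

    n_b_t:         cells with predictor present and target present
    n_b_not_t:     predictor present, target absent
    n_not_b_t:     predictor absent, target present
    n_not_b_not_t: predictor absent, target absent
    """
    n_t = n_b_t + n_not_b_t
    n_not_t = n_b_not_t + n_not_b_not_t
    w_plus = math.log((n_b_t / n_t) / (n_b_not_t / n_not_t))
    w_minus = math.log((n_not_b_t / n_t) / (n_not_b_not_t / n_not_t))
    return w_plus, w_minus

def posterior_probability(prior, weights):
    """Add weights to the prior log-odds and convert back to a probability."""
    logit = math.log(prior / (1.0 - prior)) + sum(weights)
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical counts: 1000 cells, 40 of which contain the target.
w_plus, w_minus = weights_of_evidence(30, 70, 10, 890)
prior = 40 / 1000
p_given_b = posterior_probability(prior, [w_plus])
p_given_not_b = posterior_probability(prior, [w_minus])
```

    For a single predictor the update is exact: the posterior given B equals the observed fraction 30/100, and the posterior given the absence of B equals 10/900.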

  6. Numerical comparisons of two formulations of the logistic regressive models with the mixed model in segregation analysis of discrete traits.

    PubMed

    Demenais, F M; Laing, A E; Bonney, G E

    1992-01-01

    Segregation analysis of discrete traits can be conducted by the classical mixed model and the recently introduced regressive models. The mixed model assumes an underlying liability to the disease, to which a major gene, a multifactorial component, and random environment contribute independently. Affected persons have a liability exceeding a threshold. The regressive logistic models assume that the logarithm of the odds of being affected is a linear function of major genotype effects, the phenotypes of older relatives, and other covariates. A formulation of the regressive models, based on an underlying liability model, has been recently proposed. The regression coefficients on antecedents are expressed in terms of the relevant familial correlations and a one-to-one correspondence with the parameters of the mixed model can thus be established. Computer simulations are conducted to evaluate the fit of the two formulations of the regressive models to the mixed model on nuclear families. The two forms of the class D regressive model provide a good fit to a generated mixed model, in terms of both hypothesis testing and parameter estimation. The simpler class A regressive model, which assumes that the outcomes of children depend solely on the outcomes of parents, is not robust against a sib-sib correlation exceeding that specified by the model, emphasizing testing class A against class D. The studies reported here show that if the true state of nature is that described by the mixed model, then a regressive model will do just as well. Moreover, the regressive models, allowing for more patterns of family dependence, provide a flexible framework to understand gene-environment interactions in complex diseases. PMID:1487139

  7. Modeling the dynamics of urban growth using multinomial logistic regression: a case study of Jiayu County, Hubei Province, China

    NASA Astrophysics Data System (ADS)

    Nong, Yu; Du, Qingyun; Wang, Kun; Miao, Lei; Zhang, Weiwei

    2008-10-01

    Urban growth modeling, one of the most important aspects of land use and land cover change study, has attracted substantial attention because it helps to explain the mechanisms of land use change and thus supports the making of relevant policies. This study applied multinomial logistic regression to model urban growth in Jiayu County, Hubei Province, China, to discover the relationship between urban growth and its driving forces, with biophysical and socio-economic factors selected as independent variables. This type of regression is similar to binary logistic regression, but it is more general because the dependent variable is not restricted to two categories, as it was in previous studies. The multinomial model can simulate the competition among multiple land uses: urban land, bare land, cultivated land and orchard land. Taking the urban land use type as the reference category, parameters can be estimated and interpreted as odds ratios. A probability map is generated from the model to predict where urban growth will occur.
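
    A minimal sketch of how a multinomial logistic model with a reference category turns covariates into class probabilities is given below. The linear predictor of the reference class (urban land, as in the abstract) is fixed at zero, and each other class has its own intercept and coefficients. The coefficient values and the two covariates are hypothetical.

```python
import math

def multinomial_probabilities(x, coefs_by_class, reference):
    """Class probabilities for one observation under a baseline-category logit model.

    coefs_by_class maps each non-reference class to (intercept, [betas]).
    """
    scores = {reference: 0.0}
    for name, (intercept, betas) in coefs_by_class.items():
        scores[name] = intercept + sum(b * v for b, v in zip(betas, x))
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {name: math.exp(score - top) for name, score in scores.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}

# Hypothetical coefficients for covariates (distance to road in km, slope in degrees).
COEFS = {
    "bare": (-1.0, [0.4, 0.05]),
    "cultivated": (0.5, [0.6, -0.02]),
    "orchard": (-0.5, [0.3, 0.08]),
}

cell_probs = multinomial_probabilities([2.0, 10.0], COEFS, reference="urban")
# exp(beta) is the odds ratio of a class versus the reference per unit of a covariate.
odds_ratio_cultivated_distance = math.exp(COEFS["cultivated"][1][0])
```

    Applying the function to every cell of a raster and keeping the probability of the urban class would give a probability map of the kind the abstract describes.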

  8. Comparison of Logistic Regression and Random Forests techniques for shallow landslide susceptibility assessment in Giampilieri (NE Sicily, Italy)

    NASA Astrophysics Data System (ADS)

    Trigila, Alessandro; Iadanza, Carla; Esposito, Carlo; Scarascia-Mugnozza, Gabriele

    2015-11-01

    The aim of this work is to define reliable susceptibility models for shallow landslides using Logistic Regression and Random Forests multivariate statistical techniques. The study area, located in North-East Sicily, was hit on October 1st 2009 by a severe rainstorm (225 mm of cumulative rainfall in 7 h) which caused flash floods and more than 1000 landslides. Several small villages, such as Giampilieri, were hit with 31 fatalities, 6 missing persons and damage to buildings and transportation infrastructures. Landslides, mainly types such as earth and debris translational slides evolving into debris flows, were triggered on steep slopes and involved colluvium and regolith materials which cover the underlying metamorphic bedrock. The work has been carried out with the following steps: i) realization of a detailed event landslide inventory map through field surveys coupled with observation of high resolution aerial colour orthophoto; ii) identification of landslide source areas; iii) data preparation of landslide controlling factors and descriptive statistics based on a bivariate method (Frequency Ratio) to get an initial overview on existing relationships between causative factors and shallow landslide source areas; iv) choice of criteria for the selection and sizing of the mapping unit; v) implementation of 5 multivariate statistical susceptibility models based on Logistic Regression and Random Forests techniques and focused on landslide source areas; vi) evaluation of the influence of sample size and type of sampling on results and performance of the models; vii) evaluation of the predictive capabilities of the models using ROC curve, AUC and contingency tables; viii) comparison of model results and obtained susceptibility maps; and ix) analysis of temporal variation of landslide susceptibility related to input parameter changes. Models based on Logistic Regression and Random Forests have demonstrated excellent predictive capabilities. Land use and wildfire

  9. An Alternative Flight Software Trigger Paradigm: Applying Multivariate Logistic Regression to Sense Trigger Conditions Using Inaccurate or Scarce Information

    NASA Technical Reports Server (NTRS)

    Smith, Kelly M.; Gay, Robert S.; Stachowiak, Susan J.

    2013-01-01

    In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter to improve altitude knowledge. In order to increase overall robustness, the vehicle also has an alternate method of triggering the parachute deployment sequence based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this backup trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to semi-automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a statistical classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers improved performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.
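
    The split between ground processing and onboard execution can be pictured as follows: coefficients are fitted offline, and the flight-side code only evaluates a logistic function and compares it with a threshold. The features, coefficient values, threshold, and function names below are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical coefficients that a ground tool might fit before flight.
WEIGHTS = (-0.02, -0.001)  # per m/s of velocity, per m of sensed altitude
BIAS = 10.0
THRESHOLD = 0.5

def trigger_probability(velocity_mps, altitude_m):
    """Estimated probability that deployment conditions are met."""
    z = BIAS + WEIGHTS[0] * velocity_mps + WEIGHTS[1] * altitude_m
    return 1.0 / (1.0 + math.exp(-z))

def should_deploy(velocity_mps, altitude_m):
    """Onboard decision: fire the trigger once the probability reaches the threshold."""
    return trigger_probability(velocity_mps, altitude_m) >= THRESHOLD
```

    The onboard part costs a handful of multiplications and one exponential per cycle, which is one reason a classifier of this form is attractive for real-time execution.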

  10. An Alternative Flight Software Trigger Paradigm: Applying Multivariate Logistic Regression to Sense Trigger Conditions using Inaccurate or Scarce Information

    NASA Technical Reports Server (NTRS)

    Smith, Kelly M.; Gay, Robert S.; Stachowiak, Susan J.

    2013-01-01

    In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter. In order to increase overall robustness, the vehicle also has an alternate method of triggering the drogue parachute deployment based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this velocity-based trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers excellent performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.

  11. An Alternative Flight Software Paradigm: Applying Multivariate Logistic Regression to Sense Trigger Conditions using Inaccurate or Scarce Information

    NASA Technical Reports Server (NTRS)

    Smith, Kelly; Gay, Robert; Stachowiak, Susan

    2013-01-01

    In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter to improve altitude knowledge. In order to increase overall robustness, the vehicle also has an alternate method of triggering the parachute deployment sequence based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this backup trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to semi-automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a statistical classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers improved performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.

  12. Transmission Risks of Schistosomiasis Japonica: Extraction from Back-propagation Artificial Neural Network and Logistic Regression Model

    PubMed Central

    Xu, Jun-Fang; Xu, Jing; Li, Shi-Zhu; Jia, Tia-Wu; Huang, Xi-Bao; Zhang, Hua-Ming; Chen, Mei; Yang, Guo-Jing; Gao, Shu-Jing; Wang, Qing-Yun; Zhou, Xiao-Nong

    2013-01-01

    Background The transmission of schistosomiasis japonica in a local setting is still poorly understood in the lake regions of the People's Republic of China (P. R. China), and its transmission patterns are closely related to human, social and economic factors. Methodology/Principal Findings We aimed to apply an integrated approach combining an artificial neural network (ANN) and a logistic regression model to assess the transmission risks of Schistosoma japonicum, using epidemiological data collected from 2339 villagers from 1247 households in six villages of Jiangling County, P.R. China. By using the back-propagation (BP) ANN model, 16 out of 27 factors were screened, and the top five factors ranked by the absolute value of mean impact value (MIV) were mainly related to human behavior, i.e. integration of water contact history and infection history, family with past infection, history of water contact, infection history, and infection times. The top five factors screened by the logistic regression model were mainly socioeconomic, i.e. village level, economic conditions of family, age group, education level, and infection times. The risk of human infection with S. japonicum is higher among people aged 15 or younger, those with lower education, those living in villages with a higher infection rate, those from poor families, and those who have been infected more than once. Conclusion/Significance Both the BP artificial neural network and the logistic regression model, established at a small scale, suggested that individual behavior and socioeconomic status are the most important risk factors in the transmission of schistosomiasis japonica. The results indicated that the young population (≤15) in higher-risk areas should be the main target of interventions for disease transmission control. PMID:23556015

  13. Developing a Referral Protocol for Community-Based Occupational Therapy Services in Taiwan: A Logistic Regression Analysis

    PubMed Central

    Chang, Ling-Hui; Tsai, Athena Yi-Jung; Huang, Wen-Ni

    2016-01-01

    Because resources for long-term care services are limited, timely and appropriate referral for rehabilitation services is critical for optimizing clients’ functions and successfully integrating them into the community. We investigated which client characteristics are most relevant in predicting Taiwan’s community-based occupational therapy (OT) service referral based on experts’ beliefs. Data were collected in face-to-face interviews using the Multidimensional Assessment Instrument (MDAI). Community-dwelling participants (n = 221) ≥ 18 years old who reported disabilities in the previous National Survey of Long-term Care Needs in Taiwan were enrolled. The standard for referral was the judgment and agreement of two experienced occupational therapists who reviewed the results of the MDAI. Logistic regressions and Generalized Additive Models were used for analysis. Two predictive models were proposed, one using basic activities of daily living (BADLs) and one using instrumental ADLs (IADLs). Dementia, psychiatric disorders, cognitive impairment, joint range-of-motion limitations, fear of falling, behavioral or emotional problems, expressive deficits (in the BADL-based model), and limitations in IADLs or BADLs were significantly correlated with the need for referral. Both models showed high area under the curve (AUC) values on receiver operating curve testing (AUC = 0.977 and 0.972, respectively). The probability of being referred for community OT services was calculated using the referral algorithm. The referral protocol facilitated communication between healthcare professionals to make appropriate decisions for OT referrals. The methods and findings should be useful for developing referral protocols for other long-term care services. PMID:26863544

  14. Assessing the spatial variability of coefficients of landslide predictors in different regions of Romania using logistic regression

    NASA Astrophysics Data System (ADS)

    Mărgărint, M. C.; Grozavu, A.; Patriche, C. V.

    2013-12-01

    In landslide susceptibility assessment, an important issue is the correct identification of significant contributing factors, which leads to the improvement of predictions regarding this type of geomorphologic process. In the scientific literature, different weightings are assigned to these factors, but they show large variations. This study aims to identify the spatial variability and range of variation for the coefficients of landslide predictors in different geographical conditions. Four sectors of 15 km × 15 km (225 km²) were selected for analysis from regions in Romania that are representative in terms of the spatial extent of landslides, situated both in hilly areas (the Transylvanian Plateau and Moldavian Plateau) and in a lower mountain region (Subcarpathians). The following factors were taken into consideration: elevation, slope angle, slope height, terrain curvature (mean, plan and profile), distance from drainage network, slope aspect, land use, and lithology. For each sector, a landslide inventory, a digital elevation model and thematic layers of the mentioned predictors were produced and integrated in a georeferenced environment. Logistic regression was applied separately for the four study sectors as the statistical method for assessing terrain landsliding susceptibility. Maps of landslide susceptibility were produced, the values of which were classified by using the natural breaks method (Jenks). The accuracy of the logistic regression outcomes was evaluated using the ROC (receiver operating characteristic) curve and AUC (area under the curve) parameter, which show values between 0.852 and 0.922 for training samples, and between 0.851 and 0.940 for validation samples. The values of coefficients are generally confined within the limits specified by the scientific literature. In each sector, landslide susceptibility is essentially related to some specific predictors, such as the slope angle, land use, slope height, and lithology. The study points out that the

  15. Assessing the spatial variability of weights of landslide causal factors in different regions from Romania, using logistic regression

    NASA Astrophysics Data System (ADS)

    Margarint, M. C.; Grozavu, A.; Patriche, C. V.

    2012-04-01

    Landslides represent a significant natural hazard in hilly areas of Romania and cause significant damage. The scientific interest in landslide susceptibility mapping is quite recent and standardized through legislation. However, there is a need to improve the methodology, in order for the susceptibility maps to constitute a sound basis for territorial planning. Logistic regression is one of the main statistical methods used for assessing terrain susceptibility to landsliding. The scientific literature mentions different degrees of weighting for the landslide causal factors, but with large variations. This study aims to identify the range of variation of landslide causal factors for different regions in Romania. The following factors were taken into consideration: slope angle, terrain altitude, terrain curvature (mean, plan and profile), soil type, lithologic class, land use, distance from drainage network and roads, and mean annual precipitation. Four square perimeters of 15 × 15 km were chosen from representative regions in terms of spatial extent of landslides: two situated in the central-northern part of the Moldavian Plateau, one in the Transylvania Depression and one in the Moldavian Subcarpathians. Logistic regression was applied separately for the four sectors. In order to monitor the differences in the final results, numerous attempts have been made, starting from landslide polygons acquired from both the topographic maps at scale 1:25,000 (1984-1985 edition) and the orthophotoimages (2005-2006). The other elements were acquired from cartographic materials at appropriate scales, according to the international methodology. The data integration was accomplished in the georeferenced environment provided by the TNTMips 6.9, ArcGIS 9.3 and SAGA 2.0.8 software packages, while the statistical analysis was performed using Excel 2003 and the XLSTAT 2010 trial version. Maps for all landslide causal factors were produced for each perimeter. The logistic

  16. Remote sensing and GIS-based landslide hazard analysis and cross-validation using multivariate logistic regression model on three test areas in Malaysia

    NASA Astrophysics Data System (ADS)

    Pradhan, Biswajeet

    2010-05-01

    This paper presents the results of the cross-validation of a multivariate logistic regression model using remote sensing data and GIS for landslide hazard analysis on the Penang, Cameron, and Selangor areas in Malaysia. Landslide locations in the study areas were identified by interpreting aerial photographs and satellite images, supported by field surveys. SPOT 5 and Landsat TM satellite imagery were used to map landcover and vegetation index, respectively. Maps of topography, soil type, lineaments and land cover were constructed from the spatial datasets. Ten factors which influence landslide occurrence, i.e., slope, aspect, curvature, distance from drainage, lithology, distance from lineaments, soil type, landcover, rainfall precipitation, and normalized difference vegetation index (NDVI), were extracted from the spatial database and the logistic regression coefficient of each factor was computed. Then the landslide hazard was analysed using the multivariate logistic regression coefficients derived not only from the data for the respective area but also using the logistic regression coefficients calculated from each of the other two areas (nine hazard maps in all) as a cross-validation of the model. For verification of the model, the results of the analyses were then compared with the field-verified landslide locations. Among the three cases of the application of logistic regression coefficients in the same study area, the case of Selangor based on the Selangor logistic regression coefficients showed the highest accuracy (94%), whereas Penang based on the Penang coefficients showed the lowest accuracy (86%). Similarly, among the six cases from the cross application of logistic regression coefficients in the other two areas, the case of Selangor based on the logistic coefficients of Cameron showed the highest (90%) prediction accuracy, whereas the case of Penang based on the Selangor logistic regression coefficients showed the lowest accuracy (79%). Qualitatively, the cross

  17. Applicability of the Ricketts’ posteroanterior cephalometry for sex determination using logistic regression analysis in Hispano American Peruvians

    PubMed Central

    Perez, Ivan; Chavez, Allison K.; Ponce, Dario

    2016-01-01

    Background: The Ricketts' posteroanterior (PA) cephalometry seems to be the most widely used and it has not been tested by multivariate statistics for sex determination. Objective: The objective was to determine the applicability of Ricketts' PA cephalometry for sex determination using the logistic regression analysis. Materials and Methods: The logistic models were estimated at distinct age cutoffs (all ages, 11 years, 13 years, and 15 years) in a database from 1,296 Hispano American Peruvians between 5 years and 44 years of age. Results: The logistic models were composed of six cephalometric measurements; the accuracy achieved by resubstitution varied between 60% and 70% and all the variables, with one exception, exhibited a direct relationship with the probability of being classified as male; the nasal width exhibited an indirect relationship. Conclusion: The maxillary and facial widths were present in all models and may represent a sexual dimorphism indicator. The accuracy found was lower than that reported in the literature, and the Ricketts' PA cephalometry may not be adequate for sex determination. The indirect relationship of the nasal width in models with data from patients of 12 years of age or less may be a trait related to age or a characteristic in the studied population, which could be better studied and confirmed. PMID:27555732

  18. GWAS on your notebook: fast semi-parallel linear and logistic regression for genome-wide association studies

    PubMed Central

    2013-01-01

    Background: Genome-wide association studies have become very popular in identifying genetic contributions to phenotypes. Millions of SNPs are being tested for their association with diseases and traits using linear or logistic regression models. This conceptually simple strategy encounters the following computational issues: a large number of tests and very large genotype files (many gigabytes) which cannot be directly loaded into the software memory. One of the solutions applied on a grand scale is cluster computing involving large-scale resources. We show how to speed up the computations using matrix operations in pure R code. Results: We improve speed: computation time from 6 hours is reduced to 10-15 minutes. Our approach can handle essentially an unlimited number of covariates efficiently, using projections. Data files in GWAS are vast and reading them into computer memory becomes an important issue. However, much improvement can be made if the data is structured beforehand in a way allowing for easy access to blocks of SNPs. We propose several solutions based on the R packages ff and ncdf. We adapted the semi-parallel computations for logistic regression. We show that in a typical GWAS setting, where SNP effects are very small, we do not lose any precision and our computations are a few hundred times faster than standard procedures. Conclusions: We provide very fast algorithms for GWAS written in pure R code. We also show how to rearrange SNP data for fast access. PMID:23711206
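
    The projection idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes the simplest case, in which the only covariate is an intercept, so that projecting it out reduces to centering; plain Python lists stand in for the matrix operations the authors implement in R. The function names and the toy data are hypothetical and are not taken from the paper.

```python
# Sketch only: per-SNP least-squares slopes after projecting out an intercept.
# With more covariates, centering would be replaced by residualizing on all of them.

def center(values):
    """Remove the mean, i.e. project out an intercept-only covariate."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def snp_slopes(phenotype, genotypes):
    """Return the slope of the phenotype on each SNP column.

    genotypes is a list of SNP columns, each holding one allele count per sample.
    """
    y = center(phenotype)
    slopes = []
    for column in genotypes:
        g = center(column)
        cross = sum(gi * yi for gi, yi in zip(g, y))
        spread = sum(gi * gi for gi in g)
        slopes.append(cross / spread)
    return slopes


# Hypothetical toy data: 4 samples, 2 SNPs coded as 0/1/2 allele counts.
phenotype = [1.0, 2.0, 3.0, 4.0]
genotypes = [[0, 1, 1, 2], [2, 1, 1, 0]]
slopes = snp_slopes(phenotype, genotypes)
```

    The phenotype is centered once and reused for every SNP, which is the part of the computation that a matrix formulation shares across all tests.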

  19. Estimating the susceptibility of surface water in Texas to nonpoint-source contamination by use of logistic regression modeling

    USGS Publications Warehouse

    Battaglin, William A.; Ulery, Randy L.; Winterstein, Thomas; Welborn, Toby

    2003-01-01

    In the State of Texas, surface water (streams, canals, and reservoirs) and ground water are used as sources of public water supply. Surface-water sources of public water supply are susceptible to contamination from point and nonpoint sources. To help protect sources of drinking water and to aid water managers in designing protective yet cost-effective and risk-mitigated monitoring strategies, the Texas Commission on Environmental Quality and the U.S. Geological Survey developed procedures to assess the susceptibility of public water-supply source waters in Texas to the occurrence of 227 contaminants. One component of the assessments is the determination of susceptibility of surface-water sources to nonpoint-source contamination. To accomplish this, water-quality data at 323 monitoring sites were matched with geographic information system-derived watershed-characteristic data for the watersheds upstream from the sites. Logistic regression models then were developed to estimate the probability that a particular contaminant will exceed a threshold concentration specified by the Texas Commission on Environmental Quality. Logistic regression models were developed for 63 of the 227 contaminants. Of the remaining contaminants, 106 were not modeled because monitoring data were available at less than 10 percent of the monitoring sites; 29 were not modeled because there were less than 15 percent detections of the contaminant in the monitoring data; 27 were not modeled because of the lack of any monitoring data; and 2 were not modeled because threshold values were not specified.
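
    As a rough illustration of the two quantitative points in this abstract, the sketch below first checks that the reported counts add up to the 227 contaminants, and then shows the general form of a logistic model that turns watershed characteristics into an exceedance probability. The coefficient values and the single predictor are hypothetical; the report's actual models and variables are not reproduced here.

```python
import math

# Counts taken from the abstract: how the 227 contaminants were handled.
contaminant_tally = {
    "modeled": 63,
    "data at less than 10 percent of sites": 106,
    "less than 15 percent detections": 29,
    "no monitoring data": 27,
    "no threshold specified": 2,
}
total_contaminants = sum(contaminant_tally.values())


def exceedance_probability(intercept, coefficients, characteristics):
    """Logistic model: probability that a contaminant exceeds its threshold."""
    linear_predictor = intercept + sum(
        b * x for b, x in zip(coefficients, characteristics)
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical example: one watershed characteristic (say, percent cropland).
example_probability = exceedance_probability(-2.0, [0.03], [50.0])
```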

  20. Sample size estimation for alternating logistic regressions analysis of multilevel randomized community trials of under-age drinking.

    PubMed

    Reboussin, Beth A; Preisser, John S; Song, Eun-Young; Wolfson, Mark

    2012-07-01

    Under-age drinking is an enormous public health issue in the USA. Evidence that community level structures may impact on under-age drinking has led to a proliferation of efforts to change the environment surrounding the use of alcohol. Although the focus of these efforts is to reduce drinking by individual youths, environmental interventions are typically implemented at the community level with entire communities randomized to the same intervention condition. A distinct feature of these trials is the tendency of the behaviours of individuals residing in the same community to be more alike than that of others residing in different communities, which is herein called 'clustering'. Statistical analyses and sample size calculations must account for this clustering to avoid type I errors and to ensure an appropriately powered trial. Clustering itself may also be of scientific interest. We consider the alternating logistic regressions procedure within the population-averaged modelling framework to estimate the effect of a law enforcement intervention on the prevalence of under-age drinking behaviours while modelling the clustering at multiple levels, e.g. within communities and within neighbourhoods nested within communities, by using pairwise odds ratios. We then derive sample size formulae for estimating intervention effects when planning a post-test-only or repeated cross-sectional community-randomized trial using the alternating logistic regressions procedure. PMID:24347839
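
    The pairwise odds ratio used to describe clustering can be illustrated with a small sketch. It computes only the empirical cross-product ratio for pairs of binary responses from the same cluster; it is not the alternating logistic regressions estimating procedure itself, and the data are hypothetical.

```python
def pairwise_odds_ratio(pairs):
    """Empirical odds ratio for pairs of binary outcomes from the same cluster.

    Each pair is (outcome of person A, outcome of person B), coded 0 or 1.
    """
    n11 = n10 = n01 = n00 = 0
    for a, b in pairs:
        if a == 1 and b == 1:
            n11 += 1
        elif a == 1 and b == 0:
            n10 += 1
        elif a == 0 and b == 1:
            n01 += 1
        else:
            n00 += 1
    if n10 == 0 or n01 == 0:
        raise ValueError("odds ratio undefined: no discordant pairs of one kind")
    return (n11 * n00) / (n10 * n01)


# Hypothetical within-community pairs of drinking indicators.
pairs = [(1, 1)] * 6 + [(0, 0)] * 6 + [(1, 0)] * 2 + [(0, 1)] * 3
odds_ratio = pairwise_odds_ratio(pairs)
```

    A value above 1 indicates that two individuals from the same cluster tend to share the same outcome more often than independence would suggest.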

  1. Adjusting for unmeasured confounding due to either of two crossed factors with a logistic regression model.

    PubMed

    Li, Li; Brumback, Babette A; Weppelmann, Thomas A; Morris, J Glenn; Ali, Afsar

    2016-08-15

    Motivated by an investigation of the effect of surface water temperature on the presence of Vibrio cholerae in water samples collected from different fixed surface water monitoring sites in Haiti in different months, we investigated methods to adjust for unmeasured confounding due to either of the two crossed factors site and month. In the process, we extended previous methods that adjust for unmeasured confounding due to one nesting factor (such as site, which nests the water samples from different months) to the case of two crossed factors. First, we developed a conditional pseudolikelihood estimator that eliminates fixed effects for the levels of each of the crossed factors from the estimating equation. Using the theory of U-Statistics for independent but non-identically distributed vectors, we show that our estimator is consistent and asymptotically normal, but that its variance depends on the nuisance parameters and thus cannot be easily estimated. Consequently, we apply our estimator in conjunction with a permutation test, and we investigate use of the pigeonhole bootstrap and the jackknife for constructing confidence intervals. We also incorporate our estimator into a diagnostic test for a logistic mixed model with crossed random effects and no unmeasured confounding. For comparison, we investigate between-within models extended to two crossed factors. These generalized linear mixed models include covariate means for each level of each factor in order to adjust for the unmeasured confounding. We conduct simulation studies, and we apply the methods to the Haitian data. Copyright © 2016 John Wiley & Sons, Ltd. PMID:26892025

  2. The effect of high leverage points on the maximum estimated likelihood for separation in logistic regression

    NASA Astrophysics Data System (ADS)

    Ariffin, Syaiba Balqish; Midi, Habshah; Arasan, Jayanthi; Rana, Md Sohel

    2015-02-01

    This article is concerned with the performance of the maximum estimated likelihood estimator in the presence of separation in the space of the independent variables and high leverage points. The maximum likelihood estimator suffers from the problem of non-overlapping cases in the covariates, where the regression coefficients are not identifiable and the maximum likelihood estimator does not exist. Consequently, the iteration scheme fails to converge and gives faulty results. To remedy this problem, the maximum estimated likelihood estimator is put forward. It is evident that the maximum estimated likelihood estimator is resistant against separation and the estimates always exist. The effect of high leverage points on the performance of the maximum estimated likelihood estimator is then investigated through real data sets and a Monte Carlo simulation study. The findings signify that the maximum estimated likelihood estimator fails to provide better parameter estimates in the presence of both separation and high leverage points.
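
    Why the maximum likelihood estimator fails to exist under separation can be seen in a minimal sketch. It assumes a slope-only logistic model and a hypothetical, completely separated data set; it illustrates the problem only and does not implement the maximum estimated likelihood estimator studied in the article.

```python
import math


def log_likelihood(beta, xs, ys):
    """Log-likelihood of a slope-only logistic model, P(y = 1) = 1 / (1 + exp(-beta * x))."""
    total = 0.0
    for x, y in zip(xs, ys):
        sign = 1.0 if y == 1 else -1.0
        # log P(observed y) = -log(1 + exp(-sign * beta * x))
        total -= math.log1p(math.exp(-sign * beta * x))
    return total


# Hypothetical completely separated data: every x < 0 has y = 0, every x > 0 has y = 1.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

# The log-likelihood keeps rising as beta grows, so no finite maximizer exists.
profile = [log_likelihood(beta, xs, ys) for beta in (1.0, 5.0, 10.0, 20.0)]
```

    Because the log-likelihood approaches zero from below without ever reaching it, an iterative fitting routine keeps increasing the coefficient instead of converging.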

  3. Dynamic Network Logistic Regression: A Logistic Choice Analysis of Inter- and Intra-Group Blog Citation Dynamics in the 2004 US Presidential Election

    PubMed Central

    2013-01-01

    Methods for analysis of network dynamics have seen great progress in the past decade. This article shows how Dynamic Network Logistic Regression techniques (a special case of the Temporal Exponential Random Graph Models) can be used to implement decision theoretic models for network dynamics in a panel data context. We also provide practical heuristics for model building and assessment. We illustrate the power of these techniques by applying them to a dynamic blog network sampled during the 2004 US presidential election cycle. This is a particularly interesting case because it marks the debut of Internet-based media such as blogs and social networking web sites as institutionally recognized features of the American political landscape. Using a longitudinal sample of all Democratic National Convention/Republican National Convention–designated blog citation networks, we are able to test the influence of various strategic, institutional, and balance-theoretic mechanisms as well as exogenous factors such as seasonality and political events on the propensity of blogs to cite one another over time. Using a combination of deviance-based model selection criteria and simulation-based model adequacy tests, we identify the combination of processes that best characterizes the choice behavior of the contending blogs. PMID:24143060

  4. Logits and Tigers and Bears, Oh My! A Brief Look at the Simple Math of Logistic Regression and How It Can Improve Dissemination of Results

    ERIC Educational Resources Information Center

    Osborne, Jason W.

    2012-01-01

    Logistic regression is slowly gaining acceptance in the social sciences, and fills an important niche in the researcher's toolkit: being able to predict important outcomes that are not continuous in nature. While OLS regression is a valuable tool, it cannot routinely be used to predict outcomes that are binary or categorical in nature. These…

  5. Sparse conditional logistic regression for analyzing large-scale matched data from epidemiological studies: a simple algorithm

    PubMed Central

    2015-01-01

    This paper considers the problem of estimation and variable selection for large high-dimensional data (high number of predictors p and large sample size N, without excluding the possibility that N < p) resulting from an individually matched case-control study. We develop a simple algorithm for the adaptation of the Lasso and related methods to the conditional logistic regression model. Our proposal relies on the simplification of the calculations involved in the likelihood function. Then, the proposed algorithm iteratively solves reweighted Lasso problems using cyclical coordinate descent, computed along a regularization path. This method can handle large problems and deal with sparse features efficiently. We discuss benefits and drawbacks with respect to the existing available implementations. We also illustrate the interest and use of these techniques on a pharmacoepidemiological study of medication use and traffic safety. PMID:25916593
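
    Cyclical coordinate descent for a Lasso problem can be sketched as follows. To keep the example short, the sketch uses an ordinary squared-error Lasso, (1/(2n))·Σ(y − xβ)² + λ·Σ|β|, rather than the conditional logistic likelihood treated in the paper, and it assumes no predictor column is identically zero. The function names and data are hypothetical.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; return exactly 0.0 inside the band."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(rows, response, penalty, sweeps=100):
    """Cycle through the coefficients, updating one at a time with the others fixed."""
    n = len(rows)
    p = len(rows[0])
    beta = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            rho = 0.0   # correlation of predictor j with the partial residual
            norm = 0.0  # squared length of predictor j
            for row, y in zip(rows, response):
                partial = y - sum(row[k] * beta[k] for k in range(p) if k != j)
                rho += row[j] * partial
                norm += row[j] * row[j]
            beta[j] = soft_threshold(rho / n, penalty) / (norm / n)
    return beta


# Hypothetical data with orthogonal predictors: the second effect is too weak to survive.
rows = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
response = [2.0, 0.1, -2.0, -0.1]
beta = lasso_coordinate_descent(rows, response, penalty=0.25)
```

    The soft-thresholding step is what sets weak coefficients exactly to zero, which is how the Lasso performs variable selection.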

  6. Trinculo: Bayesian and frequentist multinomial logistic regression for genome-wide association studies of multi-category phenotypes

    PubMed Central

    Jostins, Luke; McVean, Gilean

    2016-01-01

    Motivation: For many classes of disease the same genetic risk variants underlie many related phenotypes or disease subtypes. Multinomial logistic regression provides an attractive framework to analyze multi-category phenotypes, and explore the genetic relationships between these phenotype categories. We introduce Trinculo, a program that implements a wide range of multinomial analyses in a single fast package that is designed to be easy to use by users of standard genome-wide association study software. Availability and implementation: An open source C implementation, with code and binaries for Linux and Mac OSX, is available for download at http://sourceforge.net/projects/trinculo Supplementary information: Supplementary data are available at Bioinformatics online. Contact: lj4@well.ox.ac.uk PMID:26873930
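
    For readers unfamiliar with the multinomial model, the sketch below shows how category probabilities follow from linear predictors when one category serves as the reference. It is a generic illustration of the baseline-category form and makes no claim about how Trinculo is implemented; the function name and inputs are hypothetical.

```python
import math


def multinomial_probabilities(linear_predictors):
    """Category probabilities for a baseline-category multinomial logistic model.

    linear_predictors holds one value per non-reference category;
    the reference category has its linear predictor fixed at 0.
    """
    etas = [0.0] + list(linear_predictors)
    largest = max(etas)  # subtract the maximum for numerical stability
    weights = [math.exp(eta - largest) for eta in etas]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical example: controls as reference, two disease subtypes.
probabilities = multinomial_probabilities([0.4, -0.2])
```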

  7. An application of principal component analysis and logistic regression to facilitate production scheduling decision support system: an automotive industry case

    NASA Astrophysics Data System (ADS)

    Mehrjoo, Saeed; Bashiri, Mahdi

    2013-05-01

    Production planning and control (PPC) systems have to deal with rising complexity and dynamics. The complexity of planning tasks is due to multiple variables and dynamic factors derived from uncertainties surrounding the PPC. Although the literature on exact scheduling algorithms, simulation approaches, and heuristic methods in production planning is extensive, these methods seem to be inefficient because of daily fluctuations in real factories. Decision support systems can provide productive tools for production planners to offer a feasible and prompt decision in effective and robust production planning. In this paper, we propose a robust decision support tool for detailed production planning based on multivariate statistical methods including principal component analysis and logistic regression. The proposed approach has been used in a real case in the Iranian automotive industry. In the presence of existing multisource uncertainties, the results of applying the proposed method in the selected case show that the accuracy of daily production planning increases in comparison with the existing method.

  8. Modelling post-fire soil erosion hazard using ordinal logistic regression: A case study in South-eastern Spain

    NASA Astrophysics Data System (ADS)

    Notario del Pino, Jesús S.; Ruiz-Gallardo, José-Reyes

    2015-03-01

    Treatments that minimize soil erosion after large wildfires depend, among other factors, on fire severity and landscape configuration so that, in practice, most of them are applied according to emergency criteria. Therefore, simple tools to predict soil erosion risk help to decide where the available resources should be used first. In this study, a predictive model for soil erosion degree, based on ordinal logistic regression, has been developed and evaluated using data from three large forest fires in South-eastern Spain. The field data were successfully fit to the model in 60% of cases after 50 runs (i.e., agreement between observed and predicted soil erosion degrees), using slope steepness, slope aspect, and fire severity as predictors. North-facing slopes were shown to be less prone to soil erosion than the rest.
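
    How an ordinal logistic model assigns a probability to each erosion degree can be sketched as below. The sketch assumes the common proportional-odds (cumulative logit) form, P(Y ≤ j) = logistic(θ_j − η); the abstract does not state which parameterization the authors used, and the cutpoints and linear predictor here are hypothetical.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def ordinal_probabilities(cutpoints, linear_predictor):
    """Category probabilities under a proportional-odds model.

    cutpoints must be increasing; K cutpoints define K + 1 ordered categories.
    """
    cumulative = [logistic(c - linear_predictor) for c in cutpoints] + [1.0]
    probabilities = []
    previous = 0.0
    for value in cumulative:
        probabilities.append(value - previous)
        previous = value
    return probabilities


# Hypothetical example: three erosion degrees (low, moderate, high).
probabilities = ordinal_probabilities([-1.0, 1.0], 0.0)
predicted_degree = probabilities.index(max(probabilities))
```

    Comparing the predicted degree with the observed one, site by site, gives the kind of agreement rate that the abstract reports.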

  9. Deep Human Parsing with Active Template Regression.

    PubMed

    Liang, Xiaodan; Liu, Si; Shen, Xiaohui; Yang, Jianchao; Liu, Luoqi; Dong, Jian; Lin, Liang; Yan, Shuicheng

    2015-12-01

    In this work, the human parsing task, namely decomposing a human image into semantic fashion/body regions, is formulated as an active template regression (ATR) problem, where the normalized mask of each fashion/body item is expressed as the linear combination of the learned mask templates, and then morphed to a more precise mask with the active shape parameters, including position, scale and visibility of each semantic region. The mask template coefficients and the active shape parameters together can generate the human parsing results, and are thus called the structure outputs for human parsing. The deep Convolutional Neural Network (CNN) is utilized to build the end-to-end relation between the input human image and the structure outputs for human parsing. More specifically, the structure outputs are predicted by two separate networks. The first CNN network is with max-pooling, and designed to predict the template coefficients for each label mask, while the second CNN network is without max-pooling to preserve sensitivity to label mask position and accurately predict the active shape parameters. For a new image, the structure outputs of the two networks are fused to generate the probability of each label for each pixel, and super-pixel smoothing is finally used to refine the human parsing result. Comprehensive evaluations on a large dataset demonstrate the significant superiority of the ATR framework over other state-of-the-art methods for human parsing. In particular, the F1-score reaches 64.38 percent by our ATR framework, significantly higher than 44.76 percent based on the state-of-the-art algorithm [28]. PMID:26539846

  10. Prediction of Foreign Object Debris/Damage (FOD) type for elimination in the aeronautics manufacturing environment through logistic regression model

    NASA Astrophysics Data System (ADS)

    Espino, Natalia V.

    Foreign Object Debris/Damage (FOD) is a costly and high-risk problem that aeronautics industries such as Boeing and Lockheed Martin, among others, face on their production lines every day. They spend an average of $350 thousand per year fixing FOD problems. FOD can put the lives of pilots, passengers and other crew at high risk. FOD refers to any type of foreign object, particle, debris or agent in the manufacturing environment which could contaminate/damage the product or otherwise undermine quality control standards. FOD can fall into any of the following categories: panstock, manufacturing debris, tools/shop aids, consumables and trash. Although aeronautics industries have put many prevention plans in place, such as housekeeping and "clean as you go" philosophies, trainings, use of RFID for tooling control, etc., none of them has been able to completely eradicate the problem. This research presents a logistic regression statistical model approach to predict the probability of FOD type under given specific circumstances such as workstation, month and aircraft/jet being built. FOD Quality Assurance Reports of the last three years were provided by an aeronautical industry for this study. By predicting the type of FOD, custom reduction/elimination plans can be put in place, thereby diminishing the problem. Different aircraft were analyzed, and so different models were developed through the same methodology. The results of the study are predictions of FOD type for each aircraft and workstation throughout the year, which were obtained by applying the proposed logistic regression models. This research would help aeronautic industries to address the FOD problem correctly, to identify root causes and to establish actual reduction/elimination plans.

  11. Developing logistic regression models using purchase attributes and demographics to predict the probability of purchases of regular and specialty eggs.

    PubMed

    Bejaei, M; Wiseman, K; Cheng, K M

    2015-01-01

    Consumers' interest in specialty eggs appears to be growing in Europe and North America. The objective of this research was to develop logistic regression models that utilise purchaser attributes and demographics to predict the probability of a consumer purchasing a specific type of table egg including regular (white and brown), non-caged (free-run, free-range and organic) or nutrient-enhanced eggs. These purchase prediction models, together with the purchasers' attributes, can be used to assess market opportunities of different egg types specifically in British Columbia (BC). An online survey was used to gather data for the models. A total of 702 completed questionnaires were submitted by BC residents. Selected independent variables were included in the logistic regression to develop models for different egg types to predict the probability of a consumer purchasing a specific type of table egg. The variables used in the model accounted for 54% and 49% of variances in the purchase of regular and non-caged eggs, respectively. Research results indicate that consumers of different egg types exhibit a set of unique and statistically significant characteristics and/or demographics. For example, consumers of regular eggs were less educated, older, price sensitive, major chain store buyers, and store flyer users, and had lower awareness about different types of eggs and less concern regarding animal welfare issues. However, most of the non-caged egg consumers were less concerned about price, had higher awareness about different types of table eggs, purchased their eggs from local/organic grocery stores, farm gates or farmers markets, and they were more concerned about care and feeding of hens compared to consumers of other egg types. PMID:26103791

  12. Analysis of vegetation distribution in Interior Alaska and sensitivity to climate change using a logistic regression approach

    USGS Publications Warehouse

    Calef, M.P.; McGuire, A.D.; Epstein, H.E.; Rupp, T.S.; Shugart, H.H.

    2005-01-01

    Aim: To understand drivers of vegetation type distribution and sensitivity to climate change. Location: Interior Alaska. Methods: A logistic regression model was developed that predicts the potential equilibrium distribution of four major vegetation types: tundra, deciduous forest, black spruce forest and white spruce forest based on elevation, aspect, slope, drainage type, fire interval, average growing season temperature and total growing season precipitation. The model was run in three consecutive steps. The hierarchical logistic regression model was used to evaluate how scenarios of changes in temperature, precipitation and fire interval may influence the distribution of the four major vegetation types found in this region. Results: At the first step, tundra was distinguished from forest, which was mostly driven by elevation, precipitation and south to north aspect. At the second step, forest was separated into deciduous and spruce forest, a distinction that was primarily driven by fire interval and elevation. At the third step, the identification of black vs. white spruce was driven mainly by fire interval and elevation. The model was verified for Interior Alaska, the region used to develop the model, where it predicted vegetation distribution among the steps with an accuracy of 60-83%. When the model was independently validated for north-west Canada, it predicted vegetation distribution among the steps with an accuracy of 53-85%. Black spruce remains the dominant vegetation type under all scenarios, potentially expanding most under warming coupled with increasing fire interval. White spruce is clearly limited by moisture once average growing season temperatures exceeded a critical limit (+2 °C). Deciduous forests expand their range the most when any two of the following scenarios are combined: decreasing fire interval, warming and increasing precipitation. Tundra can be replaced by forest under warming but expands under precipitation increase. Main

  13. The implementation of rare events logistic regression to predict the distribution of mesophotic hard corals across the main Hawaiian Islands.

    PubMed

    Veazey, Lindsay M; Franklin, Erik C; Kelley, Christopher; Rooney, John; Frazer, L Neil; Toonen, Robert J

    2016-01-01

    Predictive habitat suitability models are powerful tools for cost-effective, statistically robust assessment of the environmental drivers of species distributions. The aim of this study was to develop predictive habitat suitability models for two genera of scleractinian corals (Leptoseris and Montipora) found within the mesophotic zone across the main Hawaiian Islands. The mesophotic zone (30-180 m) is challenging to reach, and therefore historically understudied, because it falls between the maximum limit of SCUBA divers and the minimum typical working depth of submersible vehicles. Here, we implement a logistic regression with rare events corrections to account for the scarcity of presence observations within the dataset. These corrections reduced the coefficient error and improved overall prediction success (73.6% and 74.3%) for both original regression models. The final models included depth, rugosity, slope, mean current velocity, and wave height as the best environmental covariates for predicting the occurrence of the two genera in the mesophotic zone. Using an objectively selected theta ("presence") threshold, the predicted presence probability values (average of 0.051 for Leptoseris and 0.040 for Montipora) were translated to spatially-explicit habitat suitability maps of the main Hawaiian Islands at 25 m grid cell resolution. Our maps are the first of their kind to use extant presence and absence data to examine the habitat preferences of these two dominant mesophotic coral genera across Hawai'i. PMID:27441122

  14. Approximate and Pseudo-Likelihood Analysis for Logistic Regression Using External Validation Data to Model Log Exposure

    PubMed Central

    Kupper, Lawrence L.

    2012-01-01

    A common goal in environmental epidemiologic studies is to undertake logistic regression modeling to associate a continuous measure of exposure with binary disease status, adjusting for covariates. A frequent complication is that exposure may only be measurable indirectly, through a collection of subject-specific variables assumed associated with it. Motivated by a specific study to investigate the association between lung function and exposure to metal working fluids, we focus on a multiplicative-lognormal structural measurement error scenario and approaches to address it when external validation data are available. Conceptually, we emphasize the case in which true untransformed exposure is of interest in modeling disease status, but measurement error is additive on the log scale and thus multiplicative on the raw scale. Methodologically, we favor a pseudo-likelihood (PL) approach that exhibits fewer computational problems than direct full maximum likelihood (ML) yet maintains consistency under the assumed models without necessitating small exposure effects and/or small measurement error assumptions. Such assumptions are required by computationally convenient alternative methods like regression calibration (RC) and ML based on probit approximations. We summarize simulations demonstrating considerable potential for bias in the latter two approaches, while supporting the use of PL across a variety of scenarios. We also provide accessible strategies for obtaining adjusted standard errors to accompany RC and PL estimates. PMID:24027381

  15. The implementation of rare events logistic regression to predict the distribution of mesophotic hard corals across the main Hawaiian Islands

    PubMed Central

    Franklin, Erik C.; Kelley, Christopher; Frazer, L. Neil; Toonen, Robert J.

    2016-01-01

    Predictive habitat suitability models are powerful tools for cost-effective, statistically robust assessment of the environmental drivers of species distributions. The aim of this study was to develop predictive habitat suitability models for two genera of scleractinian corals (Leptoseris and Montipora) found within the mesophotic zone across the main Hawaiian Islands. The mesophotic zone (30–180 m) is challenging to reach, and therefore historically understudied, because it falls between the maximum limit of SCUBA divers and the minimum typical working depth of submersible vehicles. Here, we implement a logistic regression with rare events corrections to account for the scarcity of presence observations within the dataset. These corrections reduced the coefficient error and improved overall prediction success (73.6% and 74.3%) for both original regression models. The final models included depth, rugosity, slope, mean current velocity, and wave height as the best environmental covariates for predicting the occurrence of the two genera in the mesophotic zone. Using an objectively selected theta (“presence”) threshold, the predicted presence probability values (average of 0.051 for Leptoseris and 0.040 for Montipora) were translated to spatially-explicit habitat suitability maps of the main Hawaiian Islands at 25 m grid cell resolution. Our maps are the first of their kind to use extant presence and absence data to examine the habitat preferences of these two dominant mesophotic coral genera across Hawai‘i. PMID:27441122
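
    The abstract does not say which rare-events correction was applied. As an illustration only, the sketch below shows one widely cited option, the prior correction of King and Zeng, which shifts the fitted intercept when the share of events in the sample differs from the share in the population. The function names and the numbers in the usage note are hypothetical and are not taken from the study.

```python
import math


def prior_corrected_intercept(b0, sample_rate, population_rate):
    """Shift a fitted logistic intercept b0 using the prior correction.

    sample_rate: fraction of events (presences) in the fitted sample.
    population_rate: assumed fraction of events in the population.
    """
    offset = math.log(
        ((1.0 - population_rate) / population_rate)
        * (sample_rate / (1.0 - sample_rate))
    )
    return b0 - offset


def predicted_probability(b0, coefs, x):
    """Logistic prediction from an intercept, coefficients and covariates."""
    z = b0 + sum(c * v for c, v in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-z))
```

    For example, an intercept-only model fitted on a balanced sample (sample_rate = 0.5) and corrected for an assumed population rate of 0.05 predicts a probability of 0.05, as one would expect.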

  16. Multinomial Logistic Regression Predicted Probability Map To Visualize The Influence Of Socio-Economic Factors On Breast Cancer Occurrence in Southern Karnataka

    NASA Astrophysics Data System (ADS)

    Madhu, B.; Ashok, N. C.; Balasubramanian, S.

    2014-11-01

    Multinomial logistic regression analysis was used to develop a statistical model that can predict the probability of breast cancer in Southern Karnataka using breast cancer occurrence data from 2007-2011. Independent socio-economic variables describing breast cancer occurrence, such as age, education, occupation, parity, type of family, health insurance coverage, residential locality and socioeconomic status, were obtained for each case. The models were developed as follows: i) The urban-rural distribution of breast cancer cases obtained from the Bharat Hospital and Institute of Oncology was visualized spatially. ii) Socio-economic risk factors describing the breast cancer occurrences were compiled for each case. These data were then analysed using multinomial logistic regression in SPSS statistical software; relations between the occurrence of breast cancer across socio-economic status and the influence of other socio-economic variables were evaluated, and multinomial logistic regression models were constructed. iii) The model that best predicted the occurrence of breast cancer was identified. This multivariate logistic regression model was entered into a geographic information system, and maps showing the predicted probability of breast cancer occurrence in Southern Karnataka were created. This study demonstrates that multinomial logistic regression is a valuable tool for developing models that predict the probability of breast cancer occurrence in Southern Karnataka.
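
    To make the idea of a predicted probability from a multinomial logistic model concrete, here is a minimal sketch. It assumes the usual baseline-category parameterization, in which one outcome category is the reference with linear predictor zero. The function name and coefficient layout are illustrative and do not come from the study or from SPSS.

```python
import math


def multinomial_probabilities(x, coefs):
    """Category probabilities for covariates x.

    coefs holds one (intercept, weights) pair per non-reference category.
    The reference category comes first in the result.
    """
    etas = [0.0] + [b0 + sum(w * v for w, v in zip(ws, x)) for b0, ws in coefs]
    top = max(etas)  # subtract the maximum for numerical stability
    exps = [math.exp(e - top) for e in etas]
    total = sum(exps)
    return [e / total for e in exps]
```

    The returned probabilities always sum to one; evaluating the function at every map cell is one way a predicted probability surface could be produced.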

  17. Integration of geographic information systems and logistic multiple regression for aquatic macrophyte modeling

    SciTech Connect

    Narumalani, S.; Jensen, J.R.; Althausen, J.D.; Burkhalter, S.; Mackey, H.E. Jr.

    1994-06-01

    Since aquatic macrophytes have an important influence on the physical and chemical processes of an ecosystem while simultaneously affecting human activity, it is imperative that they be inventoried and managed wisely. However, mapping wetlands can be a major challenge because they are found in diverse geographic areas ranging from small tributary streams, to shrub or scrub and marsh communities, to open water lacustrine environments. In addition, the type and spatial distribution of wetlands can change dramatically from season to season, especially when nonpersistent species are present. This research focuses on developing a model for predicting the future growth and distribution of aquatic macrophytes. This model will use a geographic information system (GIS) to analyze some of the biophysical variables that affect aquatic macrophyte growth and distribution. The data will provide scientists information on the future spatial growth and distribution of aquatic macrophytes. This study focuses on the Savannah River Site Par Pond (1,000 ha) and L Lake (400 ha), two cooling ponds that have received thermal effluent from nuclear reactor operations. Par Pond was constructed in 1958, and natural invasion of wetland has occurred over its 35-year history, with much of the shoreline having developed extensive beds of persistent and non-persistent aquatic macrophytes.

  18. Adverse events associated with incretin-based drugs in Japanese spontaneous reports: a mixed effects logistic regression model

    PubMed Central

    Narushima, Daichi; Kawasaki, Yohei; Takamatsu, Shoji

    2016-01-01

    Background: Spontaneous Reporting Systems (SRSs) are passive systems composed of reports of suspected Adverse Drug Events (ADEs), and are used for Pharmacovigilance (PhV), namely, drug safety surveillance. Exploration of analytical methodologies to enhance SRS-based discovery will contribute to more effective PhV. In this study, we proposed a statistical modeling approach for SRS data to address heterogeneity by a reporting time point. Furthermore, we applied this approach to analyze ADEs of incretin-based drugs such as DPP-4 inhibitors and GLP-1 receptor agonists, which are widely used to treat type 2 diabetes. Methods: SRS data were obtained from the Japanese Adverse Drug Event Report (JADER) database. Reported adverse events were classified according to the MedDRA High Level Terms (HLTs). A mixed effects logistic regression model was used to analyze the occurrence of each HLT. The model treated DPP-4 inhibitors, GLP-1 receptor agonists, hypoglycemic drugs, concomitant suspected drugs, age, and sex as fixed effects, while the quarterly period of reporting was treated as a random effect. Before application of the model, Fisher’s exact tests were performed for all drug-HLT combinations. Mixed effects logistic regressions were performed for the HLTs that were found to be associated with incretin-based drugs. Statistical significance was determined by a two-sided p-value <0.01 or a 99% two-sided confidence interval. Finally, the models with and without the random effect were compared based on Akaike’s Information Criteria (AIC), in which a model with a smaller AIC was considered satisfactory. Results: The analysis included 187,181 cases reported from January 2010 to March 2015. It showed that 33 HLTs, including pancreatic, gastrointestinal, and cholecystic events, were significantly associated with DPP-4 inhibitors or GLP-1 receptor agonists. In the AIC comparison, half of the HLTs reported with incretin-based drugs favored the random effect, whereas HLTs
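
    The AIC comparison described in this record can be sketched in a few lines. The example assumes the standard definition AIC = 2k - 2 ln L, where k is the number of estimated parameters and L the maximized likelihood. The model names and log-likelihood values in the usage note are hypothetical.

```python
def aic(log_likelihood, n_params):
    """Akaike's Information Criterion; smaller values are preferred."""
    return 2 * n_params - 2 * log_likelihood


def preferred_model(candidates):
    """Return the name of the candidate with the smallest AIC.

    candidates maps a model name to (log_likelihood, n_params).
    """
    return min(candidates, key=lambda name: aic(*candidates[name]))
```

    With hypothetical fits {"fixed only": (-520.0, 7), "with random effect": (-510.0, 8)}, the AIC values are 1054 and 1036, so the model with the random effect would be preferred.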

  19. Finding Factors Influencing Risk: Comparing Variable Selection Methods Applied to Logistic Regression Models of Cases and Controls

    PubMed Central

    Swartz, Michael D.; Yu, Robert K.; Shete, Sanjay

    2011-01-01

    When modeling the risk of a disease, the very act of selecting the factors to include can heavily impact the results. This study compares the performance of several variable selection techniques applied to logistic regression. We performed realistic simulation studies to compare five methods of variable selection: (1) a confidence interval approach for significant coefficients (CI), (2) backward selection, (3) forward selection, (4) stepwise selection, and (5) Bayesian stochastic search variable selection (SSVS) using both informed and uninformed priors. We defined our simulated diseases mimicking odds ratios for cancer risk found in the literature for environmental factors, such as smoking; dietary risk factors, such as fiber; genetic risk factors such as XPD; and interactions. We modeled the distribution of our covariates, including correlation, after the reported empirical distributions of these risk factors. We also used a null data set to calibrate the priors of the Bayesian method and evaluate its sensitivity. Of the standard methods (95% CI, backward, forward and stepwise selection) the CI approach resulted in the highest average percent of correct associations and lowest average percent of incorrect associations. SSVS with an informed prior had higher average percent of correct associations and lower average percent of incorrect associations than did the CI approach. This study shows that Bayesian methods offer a way to use prior information to both increase power and decrease false-positive results when selecting factors to model complex disease risk. PMID:18937224
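
    The confidence interval approach mentioned here can be illustrated with a short sketch. It assumes Wald-type 95% intervals (estimate plus or minus 1.96 standard errors) and keeps a factor when its interval excludes zero; the abstract does not state how the intervals were computed, and the factor names and numbers in the usage note are hypothetical.

```python
def ci_select(estimates, z=1.96):
    """Keep coefficients whose Wald interval excludes zero.

    estimates maps a factor name to (coefficient, standard_error).
    Returns a dict of the selected names and their (lower, upper) bounds.
    """
    selected = {}
    for name, (beta, se) in estimates.items():
        lower, upper = beta - z * se, beta + z * se
        if lower > 0 or upper < 0:
            selected[name] = (lower, upper)
    return selected
```

    For instance, with {"smoking": (0.9, 0.2), "fiber": (-0.1, 0.15)} only "smoking" is retained, because the interval for "fiber" spans zero.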

  20. Occurrence probability assessment of earthquake-triggered landslides with Newmark displacement values and logistic regression: The Wenchuan earthquake, China

    NASA Astrophysics Data System (ADS)

    Wang, Ying; Song, Chongzhen; Lin, Qigen; Li, Juan

    2016-04-01

    The Newmark displacement model has been used to predict earthquake-triggered landslides. Logistic regression (LR) is also a common landslide hazard assessment method. We combined the Newmark displacement model and LR and applied them to Wenchuan County and Beichuan County in China, which were affected by the Ms 8.0 Wenchuan earthquake on May 12th, 2008, to develop a mechanism-based landslide occurrence probability model and improve the predictive accuracy. A total of 1904 landslide sites in Wenchuan County and 3800 random non-landslide sites were selected as the training dataset. We applied the Newmark model and obtained the distribution of permanent displacement (Dn) for a 30 × 30 m grid. Four factors (Dn, topographic relief, and distances to drainages and roads) were used as independent variables for LR. Then, a combined model was obtained, with an AUC (area under the curve) value of 0.797 for Wenchuan County. A total of 617 landslide sites and non-landslide sites in Beichuan County were used as a validation dataset with AUC = 0.753. The proposed method may also be applied to earthquake-induced landslides in other regions.
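
    The AUC values quoted in this record can be read as the probability that a randomly chosen landslide site receives a higher predicted probability than a randomly chosen non-landslide site. A minimal sketch of that calculation follows; the scores in the usage note are made up and the function name is illustrative.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve from pairwise comparisons.

    scores_pos: model scores for event sites (e.g. landslides).
    scores_neg: model scores for non-event sites.
    Ties count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))
```

    With scores [0.9, 0.8, 0.4] for event sites and [0.7, 0.3, 0.2] for non-event sites, eight of the nine pairs are ordered correctly, giving an AUC of 8/9.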

  1. Quantifying determinants of cash crop expansion and their relative effects using logistic regression modeling and variance partitioning

    NASA Astrophysics Data System (ADS)

    Xiao, Rui; Su, Shiliang; Mai, Gengchen; Zhang, Zhonghao; Yang, Chenxue

    2015-02-01

    Cash crop expansion has been a major land use change in tropical and subtropical regions worldwide. Quantifying the determinants of cash crop expansion should provide deeper spatial insights into the dynamics and ecological consequences of cash crop expansion. This paper investigated the process of cash crop expansion in Hangzhou region (China) from 1985 to 2009 using remotely sensed data. The corresponding determinants (neighborhood, physical, and proximity) and their relative effects during three periods (1985-1994, 1994-2003, and 2003-2009) were quantified by logistic regression modeling and variance partitioning. Results showed that the total area of cash crops increased from 58,874.1 ha in 1985 to 90,375.1 ha in 2009, with a net growth of 53.5%. Cash crops were more likely to grow in loam soils. Steep areas with higher elevation would experience less likelihood of cash crop expansion. A consistently higher probability of cash crop expansion was found on places with abundant farmland and forest cover in the three periods. Besides, distance to river and lake, distance to county center, and distance to provincial road were decisive determinants for farmers' choice of cash crop plantation. Different categories of determinants and their combinations exerted different influences on cash crop expansion. The joint effects of neighborhood and proximity determinants were the strongest, and the unique effect of physical determinants decreased with time. Our study contributed to understanding of the proximate drivers of cash crop expansion in subtropical regions.

  2. A revised logistic regression equation and an automated procedure for mapping the probability of a stream flowing perennially in Massachusetts

    USGS Publications Warehouse

    Bent, Gardner C.; Steeves, Peter A.

    2006-01-01

    A revised logistic regression equation and an automated procedure were developed for mapping the probability of a stream flowing perennially in Massachusetts. The equation provides city and town conservation commissions and the Massachusetts Department of Environmental Protection a method for assessing whether streams are intermittent or perennial at a specific site in Massachusetts by estimating the probability of a stream flowing perennially at that site. This information could assist the environmental agencies who administer the Commonwealth of Massachusetts Rivers Protection Act of 1996, which establishes a 200-foot-wide protected riverfront area extending from the mean annual high-water line along each side of a perennial stream, with exceptions for some urban areas. The equation was developed by relating the observed intermittent or perennial status of a stream site to selected basin characteristics of naturally flowing streams (defined as having no regulation by dams, surface-water withdrawals, ground-water withdrawals, diversion, wastewater discharge, and so forth) in Massachusetts. This revised equation differs from the equation developed in a previous U.S. Geological Survey study in that it is solely based on visual observations of the intermittent or perennial status of stream sites across Massachusetts and on the evaluation of several additional basin and land-use characteristics as potential explanatory variables in the logistic regression analysis. The revised equation estimated more accurately the intermittent or perennial status of the observed stream sites than the equation from the previous study. Stream sites used in the analysis were identified as intermittent or perennial based on visual observation during low-flow periods from late July through early September 2001. The database of intermittent and perennial streams included a total of 351 naturally flowing (no regulation) sites, of which 85 were observed to be intermittent and 266 perennial

  3. Discrimination of Mine Seismic Events and Blasts Using the Fisher Classifier, Naive Bayesian Classifier and Logistic Regression

    NASA Astrophysics Data System (ADS)

    Dong, Longjun; Wesseloo, Johan; Potvin, Yves; Li, Xibing

    2016-01-01

    Seismic events and blasts generate seismic waveforms that have different characteristics. The challenge to confidently differentiate these two signatures is complex and requires the integration of physical and statistical techniques. In this paper, the different characteristics of blasts and seismic events were investigated by comparing probability density distributions of different parameters. Five typical parameters of blasts and events and the probability density functions of blast time, as well as probability density functions of origin time difference for neighbouring blasts were extracted as discriminant indicators. The Fisher classifier, naive Bayesian classifier and logistic regression were used to establish discriminators. Databases from three Australian and Canadian mines were established for training, calibrating and testing the discriminant models. The classification performances and discriminant precision of the three statistical techniques were discussed and compared. The proposed discriminators have explicit and simple functions which can be easily used by workers in mines or researchers. Back-test, applied results, cross-validated results and analysis of receiver operating characteristic curves in different mines have shown that the discriminator for one of the mines has a reasonably good discriminating performance.

  4. Seismic features and automatic discrimination of deep and shallow induced-microearthquakes using neural network and logistic regression

    NASA Astrophysics Data System (ADS)

    Mousavi, S. Mostafa; Horton, Stephen P.; Langston, Charles A.; Samei, Borhan

    2016-07-01

    We develop an automated strategy for discriminating deep microseismic events from shallow ones on the basis of the waveforms recorded on a limited number of surface receivers. Machine-learning techniques are employed to explore the relationship between event hypocenters and seismic features of the recorded signals in time, frequency, and time-frequency domains. We applied the technique to 440 microearthquakes -1.7 [...] logistic regression (LR) and artificial neural network (ANN) models, respectively. Similar results were obtained using single station seismograms. The results show that the spectral features have the highest correlation to source depth. Spectral centroids and 2D cross-correlations in the time-frequency domain are two new seismic features used in this study that proved to be promising measures for seismic event classification. The machine-learning techniques used here are applicable to efficient automatic classification of low energy signals recorded at one or more seismic stations.

  5. Using logistic regression modeling to predict sexual recidivism: the Minnesota Sex Offender Screening Tool-3 (MnSOST-3).

    PubMed

    Duwe, Grant; Freske, Pamela J

    2012-08-01

    This study presents the results from efforts to revise the Minnesota Sex Offender Screening Tool-Revised (MnSOST-R), one of the most widely used sex offender risk-assessment tools. The updated instrument, the MnSOST-3, contains nine individual items, six of which are new. The population for this study consisted of the cross-validation sample for the MnSOST-R (N = 220) and a contemporary sample of 2,315 sex offenders released from Minnesota prisons between 2003 and 2006. To score and select items for the MnSOST-3, we used predicted probabilities generated from a multiple logistic regression model. We used bootstrap resampling to not only refine our selection of predictors but also internally validate the model. The results indicate the MnSOST-3 has a relatively high level of predictive discrimination, as evidenced by an apparent AUC of .821 and an optimism-corrected AUC of .796. The findings show the MnSOST-3 is well calibrated with actual recidivism rates for all but the highest risk offenders. Although estimating a penalized maximum likelihood model did not improve the overall calibration, the results suggest the MnSOST-3 may still be useful in helping identify high-risk offenders whose sexual recidivism risk exceeds 50%. Results from an interrater reliability assessment indicate the instrument, which is scored in a Microsoft Excel application, has an adequate degree of consistency across raters (ICC = .83 for both consistency and absolute agreement). PMID:22291047
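
    The difference between an apparent AUC and an optimism-corrected AUC can be illustrated with a bootstrap sketch. It assumes the common procedure in which the model is refitted on each bootstrap resample and the optimism is the resample AUC minus the AUC of that same refitted model on the original data. The abstract does not spell out the exact procedure used for the MnSOST-3, and the simple gradient-ascent fitter and all names below are illustrative stand-ins.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def linear_predictor(w, x):
    return w[0] + sum(wj * v for wj, v in zip(w[1:], x))


def fit_logistic(X, y, steps=200, lr=0.5):
    """Approximate maximum-likelihood fit by batch gradient ascent.

    Returns the weights with the intercept first.
    """
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = yi - sigmoid(linear_predictor(w, xi))
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w


def auc(w, X, y):
    """Probability that a random positive outranks a random negative."""
    pos = [linear_predictor(w, x) for x, t in zip(X, y) if t == 1]
    neg = [linear_predictor(w, x) for x, t in zip(X, y) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def optimism_corrected_auc(X, y, n_boot=20, seed=0):
    """Return (apparent AUC, optimism-corrected AUC)."""
    rng = random.Random(seed)
    n = len(X)
    apparent = auc(fit_logistic(X, y), X, y)
    optimism = []
    while len(optimism) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        if len(set(yb)) < 2:
            continue  # AUC is undefined without both classes
        wb = fit_logistic(Xb, yb)
        optimism.append(auc(wb, Xb, yb) - auc(wb, X, y))
    return apparent, apparent - sum(optimism) / n_boot
```

    In practice far more resamples than the default of 20 would be drawn; the small default only keeps the sketch quick to run.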

  6. Construction the model on the breast cancer survival analysis use support vector machine, logistic regression and decision tree.

    PubMed

    Chao, Cheng-Min; Yu, Ya-Wen; Cheng, Bor-Wen; Kuo, Yao-Lung

    2014-10-01

    The aim of the paper is to use data mining technology to establish a classification of breast cancer survival patterns, and offers a treatment decision-making reference for the survival ability of women diagnosed with breast cancer in Taiwan. We studied patients with breast cancer in a specific hospital in Central Taiwan to obtain 1,340 data sets. We employed a support vector machine, logistic regression, and a C5.0 decision tree to construct a classification model of breast cancer patients' survival rates, and used a 10-fold cross-validation approach to identify the model. The results show that the establishment of classification tools for the classification of the models yielded an average accuracy rate of more than 90% for both; the SVM provided the best method for constructing the three categories of the classification system for the survival mode. The results of the experiment show that the three methods used to create the classification system, established a high accuracy rate, predicted a more accurate survival ability of women diagnosed with breast cancer, and could be used as a reference when creating a medical decision-making frame. PMID:25119239
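
    The 10-fold cross-validation mentioned in this record partitions the data into ten folds, each used once for testing while the other nine are used for training. A minimal sketch of the index bookkeeping follows; the function name, shuffling scheme and seed are illustrative choices, not details from the paper.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # near-equal fold sizes
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test
```

    With the 1,340 records reported in the abstract, each test fold would hold 134 records and each training set 1,206.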

  7. Development of synthetic velocity - depth damage curves using a Weighted Monte Carlo method and Logistic Regression analysis

    NASA Astrophysics Data System (ADS)

    Vozinaki, Anthi Eirini K.; Karatzas, George P.; Sibetheros, Ioannis A.; Varouchakis, Emmanouil A.

    2014-05-01

    Damage curves are the most significant component of the flood loss estimation models. Their development is quite complex. Two types of damage curves exist, historical and synthetic curves. Historical curves are developed from historical loss data from actual flood events. However, due to the scarcity of historical data, synthetic damage curves can be alternatively developed. Synthetic curves rely on the analysis of expected damage under certain hypothetical flooding conditions. A synthetic approach was developed and presented in this work for the development of damage curves, which are subsequently used as the basic input to a flood loss estimation model. A questionnaire-based survey took place among practicing and research agronomists, in order to generate rural loss data based on the responders' loss estimates, for several flood condition scenarios. In addition, a similar questionnaire-based survey took place among building experts, i.e. civil engineers and architects, in order to generate loss data for the urban sector. By answering the questionnaire, the experts were in essence expressing their opinion on how damage to various crop types or building types is related to a range of values of flood inundation parameters, such as floodwater depth and velocity. However, the loss data compiled from the completed questionnaires were not sufficient for the construction of workable damage curves; to overcome this problem, a Weighted Monte Carlo method was implemented, in order to generate extra synthetic datasets with statistical properties identical to those of the questionnaire-based data. The data generated by the Weighted Monte Carlo method were processed via Logistic Regression techniques in order to develop accurate logistic damage curves for the rural and the urban sectors. A Python-based code was developed, which combines the Weighted Monte Carlo method and the Logistic Regression analysis into a single code (WMCLR Python code). Each WMCLR code execution

  8. Comparison of K-Means Clustering with Linear Probability Model, Linear Discriminant Function, and Logistic Regression for Predicting Two-Group Membership.

    ERIC Educational Resources Information Center

    So, Tak-Shing Harry; Peng, Chao-Ying Joanne

    This study compared the accuracy of predicting two-group membership obtained from K-means clustering with those derived from linear probability modeling, linear discriminant function, and logistic regression under various data properties. Multivariate normally distributed populations were simulated based on combinations of population proportions,…

  9. Detection of Aberrant Responding on a Personality Scale in a Military Sample: An Application of Evaluating Person Fit with Two-Level Logistic Regression

    ERIC Educational Resources Information Center

    Woods, Carol M.; Oltmanns, Thomas F.; Turkheimer, Eric

    2008-01-01

    Person-fit assessment is used to identify persons who respond aberrantly to a test or questionnaire. In this study, S. P. Reise's (2000) method for evaluating person fit using 2-level logistic regression was applied to 13 personality scales of the Schedule for Nonadaptive and Adaptive Personality (SNAP; L. Clark, 1996) that had been administered…

  10. Multinomial logistic regression model to assess the levels in trans, trans-muconic acid and inferential-risk age group among benzene-exposed group.

    PubMed

    Mala, A; Ravichandran, B; Raghavan, S; Rajmohan, H R

    2010-08-01

    Few studies have applied multinomial logistic regression to benzene-exposed occupational groups. This study assessed the relationship between benzene concentration and trans,trans-muconic acid (t,t-MA), a biomarker in urine samples from petrol filling workers. A total of 117 workers in this occupation were selected for the study. Logistic regression analysis (LR) is a common statistical technique for predicting the likelihood of categorical, binary or dichotomous outcome variables. Multinomial logistic regression equations were used to predict the relationship between benzene concentration and t,t-MA. The results showed a significant correlation between benzene and t,t-MA among the petrol fillers. Prediction equations were estimated using the physical characteristics of petrol filling station workers, viz., age, experience in years and job category. Interestingly, no significant difference was observed across years of experience. Petrol fillers and cashiers with a higher occupational risk were in the age groups of ≤24 and 25 to 34 years. Among the petrol fillers, t,t-MA levels exceeding the ACGIH TWA-TLV were found to be more significant. This study demonstrated that multinomial logistic regression is an effective model for profiling the greatest risk of the benzene-exposed group caused by different explanatory variables. PMID:21120078

  11. Multinomial logistic regression model to assess the levels in trans, trans-muconic acid and inferential-risk age group among benzene-exposed group

    PubMed Central

    Mala, A.; Ravichandran, B.; Raghavan, S.; Rajmohan, H. R.

    2010-01-01

    Few studies have applied multinomial logistic regression to benzene-exposed occupational groups. This study assessed the relationship between benzene concentration and trans,trans-muconic acid (t,t-MA), a biomarker in urine samples from petrol filling workers. A total of 117 workers in this occupation were selected for the study. Logistic regression analysis (LR) is a common statistical technique for predicting the likelihood of categorical, binary or dichotomous outcome variables. Multinomial logistic regression equations were used to predict the relationship between benzene concentration and t,t-MA. The results showed a significant correlation between benzene and t,t-MA among the petrol fillers. Prediction equations were estimated using the physical characteristics of petrol filling station workers, viz., age, experience in years and job category. Interestingly, no significant difference was observed across years of experience. Petrol fillers and cashiers with a higher occupational risk were in the age groups of ≤24 and 25 to 34 years. Among the petrol fillers, t,t-MA levels exceeding the ACGIH TWA-TLV were found to be more significant. This study demonstrated that multinomial logistic regression is an effective model for profiling the greatest risk of the benzene-exposed group caused by different explanatory variables. PMID:21120078

  12. Binary Logistic Regression Analysis for Detecting Differential Item Functioning: Effectiveness of R[superscript 2] and Delta Log Odds Ratio Effect Size Measures

    ERIC Educational Resources Information Center

    Hidalgo, Mª Dolores; Gómez-Benito, Juana; Zumbo, Bruno D.

    2014-01-01

    The authors analyze the effectiveness of the R[superscript 2] and delta log odds ratio effect size measures when using logistic regression analysis to detect differential item functioning (DIF) in dichotomous items. A simulation study was carried out, and the Type I error rate and power estimates under conditions in which only statistical testing…

  13. Landslide susceptibility assessment using logistic regression and its comparison with a rock mass classification system, along a road section in the northern Himalayas (India)

    NASA Astrophysics Data System (ADS)

    Das, Iswar; Sahoo, Sashikant; van Westen, Cees; Stein, Alfred; Hack, Robert

    2010-02-01

    Landslide studies are commonly guided by ground knowledge and field measurements of rock strength and slope failure criteria. With increasing sophistication of GIS-based statistical methods, however, landslide susceptibility studies benefit from the integration of data collected from various sources and methods at different scales. This study presents a logistic regression method for landslide susceptibility mapping and verifies the result by comparing it with the geotechnical-based slope stability probability classification (SSPC) methodology. The study was carried out in a landslide-prone national highway road section in the northern Himalayas, India. Logistic regression model performance was assessed by the receiver operator characteristics (ROC) curve, showing an area under the curve equal to 0.83. Field validation of the SSPC results showed a correspondence of 72% between the high and very high susceptibility classes with present landslide occurrences. A spatial comparison of the two susceptibility maps revealed the significance of the geotechnical-based SSPC method as 90% of the area classified as high and very high susceptible zones by the logistic regression method corresponds to the high and very high class in the SSPC method. On the other hand, only 34% of the area classified as high and very high by the SSPC method falls in the high and very high classes of the logistic regression method. The underestimation by the logistic regression method can be attributed to the generalisation made by the statistical methods, so that a number of slopes existing in critical equilibrium condition might not be classified as high or very high susceptible zones.
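
    The asymmetric map comparison reported here (90% in one direction, 34% in the other) is a conditional overlap: of the cells one method rates high or very high, what share does the other method also rate high or very high? A small sketch of that calculation follows, assuming both maps are given as equal-length lists of class labels per cell. The labels and the toy maps in the usage note are hypothetical.

```python
def share_also_high(map_a, map_b, high=("high", "very high")):
    """Share of cells rated high in map_a that are also high in map_b."""
    cells = [i for i, label in enumerate(map_a) if label in high]
    if not cells:
        raise ValueError("map_a has no cells in the high classes")
    return sum(map_b[i] in high for i in cells) / len(cells)
```

    For toy maps lr = ["high", "low", "low", "very high"] and sspc = ["high", "high", "very high", "very high"], share_also_high(lr, sspc) is 1.0 while share_also_high(sspc, lr) is 0.5, which mirrors the kind of asymmetry described in the abstract.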

  14. SU-F-BRD-01: A Logistic Regression Model to Predict Objective Function Weights in Prostate Cancer IMRT

    SciTech Connect

    Boutilier, J; Chan, T; Lee, T; Craig, T; Sharpe, M

    2014-06-15

    Purpose: To develop a statistical model that predicts optimization objective function weights from patient geometry for intensity-modulated radiotherapy (IMRT) of prostate cancer. Methods: A previously developed inverse optimization method (IOM) is applied retrospectively to determine optimal weights for 51 treated patients. We use an overlap volume ratio (OVR) of bladder and rectum for different PTV expansions in order to quantify patient geometry in explanatory variables. Using the optimal weights as ground truth, we develop and train a logistic regression (LR) model to predict the rectum weight and thus the bladder weight. Post hoc, we fix the weights of the left femoral head, right femoral head, and an artificial structure that encourages conformity to the population average while normalizing the bladder and rectum weights accordingly. The population average of objective function weights is used for comparison. Results: The OVR at 0.7 cm was found to be the most predictive of the rectum weights. The LR model performance is statistically significant when compared to the population average over a range of clinical metrics including bladder/rectum V53Gy, bladder/rectum V70Gy, and mean voxel dose to the bladder, rectum, CTV, and PTV. On average, the LR model predicted bladder and rectum weights that are both 63% closer to the optimal weights compared to the population average. The treatment plans resulting from the LR weights have, on average, a rectum V70Gy that is 35% closer to the clinical plan and a bladder V70Gy that is 43% closer. Similar results are seen for bladder V54Gy and rectum V54Gy. Conclusion: Statistical modelling from patient anatomy can be used to determine objective function weights in IMRT for prostate cancer. Our method allows the treatment planners to begin the personalization process from an informed starting point, which may lead to more consistent clinical plans and reduce overall planning time.

  15. Optimizing Helicoverpa zea (Lepidoptera: Noctuidae) insecticidal efficacy in Minnesota sweet corn: a logistic regression to assess timing parameters.

    PubMed

    Burkness, Eric C; Galvan, Tederson L; Hutchison, W D

    2009-04-01

    Late-season plantings of sweet corn in Minnesota result in an abundant supply of silking corn, Zea mays L., throughout August to early September that is highly attractive to the corn earworm, Helicoverpa zea (Boddie) (Lepidoptera: Noctuidae). During a 10-yr period, 1997-2006, insecticide efficacy trials were conducted in late-planted sweet corn in Minnesota for management of H. zea. These data were used to develop a logistic regression model to identify the variables and interactions that most influenced efficacy (proportion control) of late-instar H. zea. The pyrethroid lambda-cyhalothrin (0.028 kg [AI]/ha) is a commonly used insecticide in sweet corn and was therefore chosen for use in parameter evaluation. Three variables were found to be significant (alpha = 0.05): the percentage of plants silking at the time of the first insecticide application, the interval between the first and second insecticide applications, and the interval between the last insecticide application and harvest. Odds ratio estimates indicated that as the percentage of plants silking at the time of first application increased, control of H. zea increased. As the interval between the first and second insecticide application increased, control of H. zea decreased. Finally, as the interval between the last insecticide application and harvest increased, control of H. zea increased. An additional timing trial was conducted in 2007 by using lambda-cyhalothrin, to evaluate the impact of the percentage of plants silking at the first application. The results indicated no significant differences in efficacy against late-instar H. zea at 0, 50, 90, and 100% of plants silking at the first application (regimes of five or more sprays). The implications of these effects are discussed within the context of current integrated pest management programs for late-planted sweet corn in the upper midwestern United States. PMID:19449649

  16. Identification and validation of a logistic regression model for predicting serious injuries associated with motor vehicle crashes.

    PubMed

    Kononen, Douglas W; Flannagan, Carol A C; Wang, Stewart C

    2011-01-01

    A multivariate logistic regression model, based upon National Automotive Sampling System Crashworthiness Data System (NASS-CDS) data for calendar years 1999-2008, was developed to predict the probability that a crash-involved vehicle will contain one or more occupants with serious or incapacitating injuries. These vehicles were defined as containing at least one occupant coded with an Injury Severity Score (ISS) of greater than or equal to 15, in planar, non-rollover crash events involving Model Year 2000 and newer cars, light trucks, and vans. The target injury outcome measure was developed by the Centers for Disease Control and Prevention (CDC)-led National Expert Panel on Field Triage in their recent revision of the Field Triage Decision Scheme (American College of Surgeons, 2006). The parameters to be used for crash injury prediction were subsequently specified by the National Expert Panel. Model input parameters included: crash direction (front, left, right, and rear), change in velocity (delta-V), multiple vs. single impacts, belt use, presence of at least one older occupant (≥ 55 years old), presence of at least one female in the vehicle, and vehicle type (car, pickup truck, van, and sport utility). The model was developed using predictor variables that may be readily available, post-crash, from OnStar-like telematics systems. Model sensitivity and specificity were 40% and 98%, respectively, using a probability cutpoint of 0.20. The area under the receiver operator characteristic (ROC) curve for the final model was 0.84. Delta-V (mph), seat belt use and crash direction were the most important predictors of serious injury. Due to the complexity of factors associated with rollover-related injuries, a separate screening algorithm is needed to model injuries associated with this crash mode. PMID:21094304
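
    The reported sensitivity and specificity depend on the probability cutpoint applied to the model output. The following is a minimal sketch of that calculation in Python; the function name and the example probabilities and outcomes are hypothetical and are not taken from the NASS-CDS data.

```python
def sensitivity_specificity(probs, labels, cutpoint=0.20):
    """Return (sensitivity, specificity) when cases with a predicted
    probability >= cutpoint are classified as positive."""
    tp = fn = tn = fp = 0
    for p, y in zip(probs, labels):
        predicted_positive = p >= cutpoint
        if y == 1:
            if predicted_positive:
                tp += 1  # serious injury correctly flagged
            else:
                fn += 1  # serious injury missed
        else:
            if predicted_positive:
                fp += 1  # false alarm
            else:
                tn += 1  # correctly not flagged
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical predicted probabilities and observed outcomes (1 = serious injury).
example_probs = [0.05, 0.10, 0.25, 0.60, 0.15, 0.02, 0.30, 0.08]
example_labels = [0, 0, 1, 1, 1, 0, 0, 0]
sens, spec = sensitivity_specificity(example_probs, example_labels, cutpoint=0.20)
```

    Raising the cutpoint generally trades sensitivity for specificity, which is consistent with the low sensitivity and high specificity reported at 0.20 for a rare outcome.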

  17. Classification of Urban Aerial Data Based on Pixel Labelling with Deep Convolutional Neural Networks and Logistic Regression

    NASA Astrophysics Data System (ADS)

    Yao, W.; Poleswki, P.; Krzystek, P.

    2016-06-01

    The recent success of deep convolutional neural networks (CNN) on a large number of applications can be attributed to large amounts of available training data and increasing computing power. In this paper, a semantic pixel labelling scheme for urban areas using multi-resolution CNN and hand-crafted spatial-spectral features of airborne remotely sensed data is presented. Both CNN and hand-crafted features are applied to image/DSM patches to produce per-pixel class probabilities with an L1-norm regularized logistic regression classifier. The evidence theory infers a degree of belief for pixel labelling from different sources to smooth regions by handling the conflicts present in both classifiers while reducing the uncertainty. The aerial data used in this study were provided by ISPRS as benchmark datasets for 2D semantic labelling tasks in urban areas, which consist of two data sources from LiDAR and color infrared camera. The test sites are parts of a city in Germany which is assumed to consist of typical object classes including impervious surfaces, trees, buildings, low vegetation, vehicles and clutter. The evaluation is based on the computation of pixel-based confusion matrices by random sampling. The performance of the strategy with respect to scene characteristics and method combination strategies is analyzed and discussed. The competitive classification accuracy can be explained not only by the nature of the input data sources (e.g., the above-ground height from the nDSM highlights the vertical dimension of houses, trees, and even cars, and the near-infrared spectrum indicates vegetation) but also by the decision-level fusion of the CNN's texture-based approach with multichannel spatial-spectral hand-crafted features based on the evidence combination theory.

  18. Functional logistic regression approach to detecting gene by longitudinal environmental exposure interaction in a case-control study.

    PubMed

    Wei, Peng; Tang, Hongwei; Li, Donghui

    2014-11-01

    Most complex human diseases are likely the consequence of the joint actions of genetic and environmental factors. Identification of gene-environment (G × E) interactions not only contributes to a better understanding of the disease mechanisms, but also improves disease risk prediction and targeted intervention. In contrast to the large number of genetic susceptibility loci discovered by genome-wide association studies, there have been very few successes in identifying G × E interactions, which may be partly due to limited statistical power and inaccurately measured exposures. Although existing statistical methods only consider interactions between genes and static environmental exposures, many environmental/lifestyle factors, such as air pollution and diet, change over time, and cannot be accurately captured at one measurement time point or by simply categorizing into static exposure categories. There is a dearth of statistical methods for detecting gene by time-varying environmental exposure interactions. Here, we propose a powerful functional logistic regression (FLR) approach to model the time-varying effect of longitudinal environmental exposure and its interaction with genetic factors on disease risk. Capitalizing on the powerful functional data analysis framework, our proposed FLR model is capable of accommodating longitudinal exposures measured at irregular time points and contaminated by measurement errors, commonly encountered in observational studies. We use extensive simulations to show that the proposed method can control the Type I error and is more powerful than alternative ad hoc methods. We demonstrate the utility of this new method using data from a case-control study of pancreatic cancer to identify the windows of vulnerability of lifetime body mass index on the risk of pancreatic cancer as well as genes that may modify this association. PMID:25219575

  19. The impact of smoking and polymorphic enzymes of xenobiotic metabolism on the stage of bladder tumors: a generalized ordered logistic regression analysis.

    PubMed

    Khedhiri, Sami; Stambouli, Nejla; Ouerhani, Slah; Rouissi, Kamel; Marrakchi, Raja; Gaaied, Amel B; Slama, M B

    2010-07-01

    Cigarette smoking is the predominant risk factor for bladder cancer in males and females. The tobacco carcinogens are metabolized by various xenobiotic metabolizing enzymes such as N-acetyltransferases (NAT) and glutathione S-transferases (GST). Polymorphisms in NAT and GST genes alter the ability of these enzymes to metabolize carcinogens. In this paper, we conduct a statistical analysis based on logistic regressions to assess the impact of smoking and metabolizing enzyme genotypes on the risk of developing bladder cancer using a case-control study from Tunisia. We also use the generalized ordered logistic model to investigate whether these factors have an impact on the progression of bladder tumors. PMID:20063011

  20. Predicting Falls and When to Intervene in Older People: A Multilevel Logistical Regression Model and Cost Analysis

    PubMed Central

    Smith, Matthew I.; de Lusignan, Simon; Mullett, David; Correa, Ana; Tickner, Jermaine; Jones, Simon

    2016-01-01

    Introduction Falls are the leading cause of injury in older people. Reducing falls could reduce financial pressures on health services. We carried out this research to develop a falls risk model, using routine primary care and hospital data to identify those at risk of falls, and apply a cost analysis to enable commissioners of health services to identify those in whom savings can be made through referral to a falls prevention service. Methods Multilevel logistical regression was performed on routinely collected general practice and hospital data from 74751 over 65’s, to produce a risk model for falls. Validation measures were carried out. A cost-analysis was performed to identify at which level of risk it would be cost-effective to refer patients to a falls prevention service. 95% confidence intervals were calculated using a Monte Carlo Model (MCM), allowing us to adjust for uncertainty in the estimates of these variables. Results A risk model for falls was produced with an area under the curve of the receiver operating characteristics curve of 0.87. The risk cut-off with the highest combination of sensitivity and specificity was at p = 0.07 (sensitivity of 81% and specificity of 78%). The risk cut-off at which savings outweigh costs was p = 0.27 and the risk cut-off with the maximum savings was p = 0.53, which would result in referral of 1.8% and 0.45% of the over 65’s population respectively. Above a risk cut-off of p = 0.27, costs do not exceed savings. Conclusions This model is the best performing falls predictive tool developed to date; it has been developed on a large UK city population; can be readily run from routine data; and can be implemented in a way that optimises the use of health service resources. Commissioners of health services should use this model to flag and refer patients at risk to their falls service and save resources. PMID:27448280
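
    The "risk cut-off with the highest combination of sensitivity and specificity" can be illustrated by scanning candidate cut-offs and keeping the one that maximises their sum. This is a sketch only: treating "combination" as the simple sum (Youden's index plus one) is an assumption, and the function name and data below are hypothetical.

```python
def best_cutoff(probs, labels, candidates):
    """Return (cutoff, sensitivity + specificity) for the candidate cut-off
    with the largest sum; ties keep the first candidate encountered."""
    best = None
    for c in candidates:
        tp = sum(1 for p, y in zip(probs, labels) if y == 1 and p >= c)
        fn = sum(1 for p, y in zip(probs, labels) if y == 1 and p < c)
        tn = sum(1 for p, y in zip(probs, labels) if y == 0 and p < c)
        fp = sum(1 for p, y in zip(probs, labels) if y == 0 and p >= c)
        score = tp / (tp + fn) + tn / (tn + fp)
        if best is None or score > best[1]:
            best = (c, score)
    return best


# Hypothetical predicted fall risks and observed falls (1 = fell).
risk = [0.02, 0.04, 0.06, 0.09, 0.12, 0.30, 0.03, 0.50]
fell = [0, 0, 0, 1, 1, 1, 0, 1]
cutoff, score = best_cutoff(risk, fell, candidates=[0.05, 0.07, 0.10, 0.27])
```

    A cost-based cut-off, such as the p = 0.27 threshold in the abstract, would replace the score with net savings per referral, which requires cost inputs not given here.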

  1. Using Logistic Regression and Random Forests multivariate statistical methods for landslide spatial probability assessment in North-Est Sicily, Italy

    NASA Astrophysics Data System (ADS)

    Trigila, Alessandro; Iadanza, Carla; Esposito, Carlo; Scarascia-Mugnozza, Gabriele

    2015-04-01

    The first phase of the work aimed to identify the spatial relationships between the landslide locations and the 13 related factors by using the Frequency Ratio bivariate statistical method. The analysis was then carried out by adopting a multivariate statistical approach, using the Logistic Regression technique and the Random Forests technique that gave the best results in terms of AUC. The models were run and evaluated with different sample sizes, also taking into account the temporal variation of input variables such as areas burned by wildfire. The most significant outcomes of this work are the relevant influence of the sample size on the model results and the strong importance of some environmental factors (e.g. land use and wildfires) for the identification of the depletion zones of extremely rapid shallow landslides.

  2. Modeling simultaneous exceedance of drinking-water standards of arsenic and nitrate in the Southern Ogallala aquifer using multinomial logistic regression

    NASA Astrophysics Data System (ADS)

    Venkataraman, Kartik; Uddameri, Venkatesh

    2012-08-01

    Summary: The occurrence of elevated levels of arsenic and nitrate in aquifers impacted by agricultural activities is common and can result in adverse health effects in rural areas. Numerous wells located in the Ogallala aquifer in the Southern High Plains of Texas have tested positive for both arsenic and nitrate MCL exceedance. To model the simultaneous exceedance of both chemicals, two types of Logistic Regression (LR) models were developed by (a) treating arsenic and nitrate independently and combining the marginal probabilities of their exceedance, and (b) treating the two exceedances together by using a multinomial model. Influencing variables representative of both soil and aquifer properties, for which data were readily available, were identified. The predictive capacities of the two models were evaluated using Receiver Operating Characteristics (ROCs) and spatial trends in predictions were studied. The LR model constructed from the marginal probabilities had lower overall accuracy (59% correct classifications) and was extremely conservative by over-predicting outcomes. In contrast, the multinomial model showed good overall accuracy (79% correct classifications), made the correct predictions 90% of the time when both arsenic and nitrate MCL exceedances were observed, and was a good fit for wells located in agricultural areas. The results of the multinomial model also confirm previous studies that attributed shallow subsurface arsenic to anthropogenic activities. Based on the insights provided by the model it is recommended that where agricultural areas are concerned, the occurrence of arsenic and nitrate are better evaluated together.
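
    The two modelling routes described above can be contrasted in a few lines. The sketch below assumes that approach (a) multiplies the two marginal exceedance probabilities (i.e., treats them as independent) and that approach (b) uses a standard softmax over four outcome categories; the function names and all numbers are hypothetical.

```python
import math


def joint_from_marginals(p_arsenic, p_nitrate):
    """Approach (a): probability that both standards are exceeded,
    assuming the two exceedances are independent."""
    return p_arsenic * p_nitrate


def multinomial_probabilities(linear_scores):
    """Approach (b): softmax over the linear predictors of a multinomial
    logistic model (the reference category has a score of 0)."""
    m = max(linear_scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in linear_scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical well: marginal exceedance probabilities from two binary models.
p_both_independent = joint_from_marginals(0.5, 0.4)

# Hypothetical multinomial scores for: neither, arsenic only, nitrate only, both.
categories = ["neither", "arsenic only", "nitrate only", "both"]
probs = multinomial_probabilities([0.0, -0.3, -0.8, 0.4])
p_both_multinomial = probs[categories.index("both")]
```

    When arsenic and nitrate exceedances are correlated, the product in approach (a) misstates the joint probability, which is one plausible reading of why the multinomial model performed better.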

  3. A Logistic Regression Analysis of Turkey's 15-Year-Olds' Scoring above the OECD Average on the PISA'09 Reading Assessment

    ERIC Educational Resources Information Center

    Kasapoglu, Koray

    2014-01-01

    This study aims to investigate which factors are associated with Turkey's 15-year-olds' scoring above the OECD average (493) on the PISA'09 reading assessment. Collected from a total of 4,996 15-year-old students from Turkey, data were analyzed by logistic regression analysis in order to model the data of students who were split…

  4. Association of Perceived Stress with Stressful Life Events, Lifestyle and Sociodemographic Factors: A Large-Scale Community-Based Study Using Logistic Quantile Regression

    PubMed Central

    Feizi, Awat; Aliyari, Roqayeh; Roohafza, Hamidreza

    2012-01-01

    Objective. The present paper aimed at investigating the association between perceived stress and major life events stressors in the Iranian general population. Methods. In a cross-sectional large-scale community-based study, 4583 people aged 19 and older, living in Isfahan, Iran, were investigated. Logistic quantile regression was used for modeling perceived stress, measured by GHQ questionnaire, as the bounded outcome (dependent) variable, as a function of the most important stressful life events as the predictor variables, controlling for major lifestyle and sociodemographic factors. This model provides empirical evidence of the heterogeneity of the predictors' effects depending on individual location on the distribution of perceived stress. Results. The results showed that among four stressful life events, family conflicts and social problems were more correlated with level of perceived stress. Higher levels of education were negatively associated with perceived stress and its coefficients monotonically decrease beyond the 30th percentile. Also, higher levels of physical activity were associated with perception of low levels of stress. The pattern of gender's coefficient over the majority of quantiles implied that females are more affected by stressors. Also high perceived stress was associated with low or middle levels of income. Conclusions. The results of current research suggested that in a developing society with high prevalence of stress, interventions targeted toward promoting financial and social equalities, social skills training, and healthy lifestyle may have the potential benefits for large parts of the population, most notably female and lower educated people. PMID:23091560

  5. Logistic-AFT location-scale mixture regression models with nonsusceptibility for left-truncated and general interval-censored data.

    PubMed

    Chen, Chen-Hsin; Tsay, Yuh-Chyuan; Wu, Ya-Chi; Horng, Cheng-Fang

    2013-10-30

    In conventional survival analysis there is an underlying assumption that all study subjects are susceptible to the event. In general, this assumption does not adequately hold when investigating the time to an event other than death. Owing to genetic and/or environmental etiology, study subjects may not be susceptible to the disease. Analyzing nonsusceptibility has become an important topic in biomedical, epidemiological, and sociological research, with recent statistical studies proposing several mixture models for right-censored data in regression analysis. In longitudinal studies, we often encounter left, interval, and right-censored data because of incomplete observations of the time endpoint, as well as possibly left-truncated data arising from the dissimilar entry ages of recruited healthy subjects. To analyze these kinds of incomplete data while accounting for nonsusceptibility and possible crossing hazards in the framework of mixture regression models, we utilize a logistic regression model to specify the probability of susceptibility, and a generalized gamma distribution, or a log-logistic distribution, in the accelerated failure time location-scale regression model to formulate the time to the event. Relative times of the conditional event time distribution for susceptible subjects are extended in the accelerated failure time location-scale submodel. We also construct graphical goodness-of-fit procedures on the basis of the Turnbull-Frydman estimator and newly proposed residuals. Simulation studies were conducted to demonstrate the validity of the proposed estimation procedure. The mixture regression models are illustrated with alcohol abuse data from the Taiwan Aboriginal Study Project and hypertriglyceridemia data from the Cardiovascular Disease Risk Factor Two-township Study in Taiwan. PMID:23661280

  6. Extraction of potential debris source areas by logistic regression technique: a case study from Barla, Besparmak and Kapi mountains (NW Taurids, Turkey)

    NASA Astrophysics Data System (ADS)

    Tunusluoglu, M. C.; Gokceoglu, C.; Nefeslioglu, H. A.; Sonmez, H.

    2008-03-01

    Debris flow is one of the most destructive mass movements. Sometimes regional debris flow susceptibility or hazard assessments can be more difficult than for other mass movements. Determination of debris accumulation zones and debris source areas, which is one of the most crucial stages in debris flow investigations, can be too difficult because of morphological restrictions. The main goal of the present study is to extract debris source areas by logistic regression analyses based on the data from the slopes of the Barla, Besparmak and Kapi Mountains in the SW part of the Taurids Mountain belt of Turkey, where formation of debris material is clearly evident and common. In this study, in order to achieve this goal, extensive field observations to identify the areal extent of debris source areas and debris material, air-photo studies to determine the debris source areas and also desk studies including Geographical Information System (GIS) applications and statistical assessments were performed. To justify the training data used in logistic regression analyses as representative, a random sampling procedure was applied. By using the results of the logistic regression analysis, the debris source area probability map of the region is produced. However, according to the field experiences of the authors, the produced map yielded over-predicted results. The main source of the over-prediction is the structural relation between the bedding planes and slope aspects: on the basis of the field observations, for the generation of debris, the dip of the bedding planes must be taken into consideration regarding the slope face. In order to eliminate this problem, in this study, an approach has been developed using the probability distribution of the aspect values. With the application of structural adjustment, the final adjusted debris source area probability map is obtained for the study area. The field observations revealed that the actual debris source areas in the field coincide with

  7. GIS-based groundwater spring potential mapping in the Sultan Mountains (Konya, Turkey) using frequency ratio, weights of evidence and logistic regression methods and their comparison

    NASA Astrophysics Data System (ADS)

    Ozdemir, Adnan

    2011-12-01

    Summary: In this study, groundwater spring potential maps produced by three different methods, frequency ratio, weights of evidence, and logistic regression, were evaluated using validation data sets and compared to each other. Groundwater spring occurrence potential maps in the Sultan Mountains (Konya, Turkey) were constructed using the relationship between groundwater spring locations and their causative factors. Groundwater spring locations were identified in the study area from a topographic map. Different thematic maps of the study area, such as geology, topography, geomorphology, hydrology, and land use/cover, have been used to identify groundwater potential zones. Seventeen spring-related parameter layers of the entire study area were used to generate groundwater spring potential maps. These are geology (lithology), fault density, distance to fault, relative permeability of lithologies, elevation, slope aspect, slope steepness, curvature, plan curvature, profile curvature, topographic wetness index, stream power index, sediment transport capacity index, drainage density, distance to drainage, land use/cover, and precipitation. The predictive capability of each model was determined by the area under the relative operating characteristic curve. The areas under the curve for frequency ratio, weights of evidence and logistic regression methods were calculated as 0.903, 0.880, and 0.840, respectively. These results indicate that frequency ratio and weights of evidence models are relatively good estimators, whereas logistic regression is a relatively poor estimator of groundwater spring potential mapping in the study area. The frequency ratio model is simple; the process of input, calculation and output can be readily understood. The produced groundwater spring potential maps can serve planners and engineers in groundwater development plans and land-use planning.
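
    The simplicity of the frequency ratio model can be seen in a short sketch. It assumes the commonly used definition, in which the ratio for a factor class is the share of springs falling in that class divided by the share of the study area in that class; the class names and counts are hypothetical.

```python
def frequency_ratio(spring_counts, cell_counts):
    """Frequency ratio per class: (springs in class / all springs) divided by
    (cells in class / all cells). Values above 1 suggest a positive association."""
    total_springs = sum(spring_counts.values())
    total_cells = sum(cell_counts.values())
    return {
        cls: (spring_counts[cls] / total_springs) / (cell_counts[cls] / total_cells)
        for cls in cell_counts
    }


# Hypothetical drainage-density classes: spring counts and raster cell counts.
springs = {"low": 10, "moderate": 30, "high": 60}
cells = {"low": 500, "moderate": 300, "high": 200}
fr = frequency_ratio(springs, cells)
```

    Summing the ratios of the classes a cell belongs to across all factor layers gives a potential index, which can then be ranked and evaluated with the area under the curve.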

  8. Evaluating risk factors for endemic human Salmonella Enteritidis infections with different phage types in Ontario, Canada using multinomial logistic regression and a case-case study approach

    PubMed Central

    2012-01-01

    Background Identifying risk factors for Salmonella Enteritidis (SE) infections in Ontario will assist public health authorities to design effective control and prevention programs to reduce the burden of SE infections. Our research objective was to identify risk factors for acquiring SE infections with various phage types (PT) in Ontario, Canada. We hypothesized that certain PTs (e.g., PT8 and PT13a) have specific risk factors for infection. Methods Our study included endemic SE cases with various PTs whose isolates were submitted to the Public Health Laboratory-Toronto from January 20th to August 12th, 2011. Cases were interviewed using a standardized questionnaire that included questions pertaining to demographics, travel history, clinical symptoms, contact with animals, and food exposures. A multinomial logistic regression method using the Generalized Linear Latent and Mixed Model procedure and a case-case study design were used to identify risk factors for acquiring SE infections with various PTs in Ontario, Canada. In the multinomial logistic regression model, the outcome variable had three categories representing human infections caused by SE PT8, PT13a, and all other SE PTs (i.e., non-PT8/non-PT13a) as a referent category to which the other two categories were compared. Results In the multivariable model, SE PT8 was positively associated with contact with dogs (OR=2.17, 95% CI 1.01-4.68) and negatively associated with pepper consumption (OR=0.35, 95% CI 0.13-0.94), after adjusting for age categories and gender, and using exposure periods and health regions as random effects to account for clustering. Conclusions Our study findings offer interesting hypotheses about the role of phage type-specific risk factors. Multinomial logistic regression analysis and the case-case study approach are novel methodologies to evaluate associations among SE infections with different PTs and various risk factors. PMID:23057531
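
    Odds ratios and their confidence intervals such as those above are obtained by exponentiating a regression coefficient and its Wald limits. The sketch below shows the relationship and, as a consistency check, recovers the implied coefficient and standard error from the reported interval for contact with dogs (1.01 to 4.68). The function names are hypothetical, and a Wald-type interval is assumed; the abstract does not state how the interval was computed.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval


def odds_ratio_ci(beta, se, z=Z_95):
    """Return (OR, lower, upper) from a log-odds coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def beta_se_from_ci(lower, upper, z=Z_95):
    """Invert a Wald interval on the odds-ratio scale back to (beta, se)."""
    beta = (math.log(lower) + math.log(upper)) / 2.0
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return beta, se


# Interval reported in the abstract for contact with dogs (SE PT8).
beta, se = beta_se_from_ci(1.01, 4.68)
odds_ratio, ci_low, ci_high = odds_ratio_ci(beta, se)
```

    The midpoint of the interval on the log scale reproduces the reported odds ratio of 2.17 after rounding, which is what a Wald interval would imply.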

  9. A local equation for differential diagnosis of β-thalassemia trait and iron deficiency anemia by logistic regression analysis in Southeast Iran.

    PubMed

    Sargolzaie, Narjes; Miri-Moghaddam, Ebrahim

    2014-01-01

    The most common differential diagnosis of β-thalassemia (β-thal) trait is iron deficiency anemia. Several red blood cell equations were introduced during different studies for differential diagnosis between β-thal trait and iron deficiency anemia. Due to genetic variations in different regions, these equations cannot be useful in all populations. The aim of this study was to determine a native equation with high accuracy for differential diagnosis of β-thal trait and iron deficiency anemia for the Sistan and Baluchestan population by logistic regression analysis. We selected 77 iron deficiency anemia and 100 β-thal trait cases. We used binary logistic regression analysis and determined the best equations for probability prediction of β-thal trait against iron deficiency anemia in our population. We compared diagnostic values and receiver operating characteristic (ROC) curves related to this equation and another 10 published equations in discriminating β-thal trait and iron deficiency anemia. The binary logistic regression analysis determined the best equation for probability prediction of β-thal trait against iron deficiency anemia with area under curve (AUC) 0.998. Based on ROC curves and AUC, the Green & King, England & Frazer, and then Sirdah indices, respectively, were the most accurate after our equation. We suggest that to get the best equation and cut-off in each region, one needs to evaluate specific information of each region, specifically in areas where populations are homogeneous, to provide a specific formula for differentiating between β-thal trait and iron deficiency anemia. PMID:25155260

  10. Using a binary logistic regression method and GIS for evaluating and mapping the groundwater spring potential in the Sultan Mountains (Aksehir, Turkey)

    NASA Astrophysics Data System (ADS)

    Ozdemir, Adnan

    2011-07-01

    Summary: The purpose of this study is to produce a groundwater spring potential map of the Sultan Mountains in central Turkey, based on a logistic regression method within a Geographic Information System (GIS) environment. Using field surveys, the locations of the springs (440 springs) were determined in the study area. In this study, 17 spring-related factors were used in the analysis: geology, relative permeability, land use/land cover, precipitation, elevation, slope, aspect, total curvature, plan curvature, profile curvature, wetness index, stream power index, sediment transport capacity index, distance to drainage, distance to fault, drainage density, and fault density map. The coefficients of the predictor variables were estimated using binary logistic regression analysis and were used to calculate the groundwater spring potential for the entire study area. The accuracy of the final spring potential map was evaluated based on the observed springs. The accuracy of the model was evaluated by calculating the relative operating characteristics. The area value of the relative operating characteristic curve model was found to be 0.82. These results indicate that the model is a good estimator of the spring potential in the study area. The spring potential map shows that the areas of very low, low, moderate and high groundwater spring potential classes are 105.586 km² (28.99%), 74.271 km² (19.906%), 101.203 km² (27.14%), and 90.05 km² (24.671%), respectively. The interpretations of the potential map showed that stream power index, relative permeability of lithologies, geology, elevation, aspect, wetness index, plan curvature, and drainage density play major roles in spring occurrence and distribution in the Sultan Mountains. The logistic regression approach has not yet been used to delineate groundwater potential zones. In this study, the logistic regression method was used to locate potential zones for groundwater springs in the Sultan Mountains. The evolved model

  11. 44 CFR 208.38 - Reimbursement for re-supply and logistics costs incurred during Activation.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 44 Emergency Management and Assistance 1 2010-10-01 2010-10-01 false Reimbursement for re-supply and logistics costs incurred during Activation. 208.38 Section 208.38 Emergency Management and...-supply and logistics costs incurred during Activation. With the exception of emergency...

  12. Understanding data in clinical research: a simple graphical display for plotting data (up to four independent variables) after binary logistic regression analysis.

    PubMed

    Mesa, José Luis

    2004-01-01

    In clinical research, suitable visualization techniques of data after statistical analysis are crucial for the researchers' and physicians' understanding. Common statistical techniques to analyze data in clinical research are logistic regression models. Among these, the application of binary logistic regression analysis (LRA) has greatly increased during past years, due to its diagnostic accuracy and because scientists often want to analyze in a dichotomous way whether some event will occur or not. Such an analysis lacks a suitable, understandable, and widely used graphical display, instead providing an understandable logit function based on a linear model for the natural logarithm of the odds in favor of the occurrence of the dependent variable, Y. By simple exponential transformation, such a logit equation can be transformed into a logistic function, resulting in predicted probabilities for the presence of the dependent variable, P(Y=1|X). This model can be used to generate a simple graphical display for binary LRA. For the case of a single predictor or explanatory (independent) variable, X, a plot can be generated with X represented by the abscissa (i.e., horizontal axis) and P(Y=1|X) represented by the ordinate (i.e., vertical axis). For the case of multiple predictor models, I propose here a relief 3D surface graphic in order to plot up to four independent variables (two continuous and two discrete). By using this technique, any researcher or physician would be able to transform a less understandable logit function into a figure easier to grasp, thus leading to a better knowledge and interpretation of data in clinical research. For this, a sophisticated statistical package is not necessary, because the graphical display may be generated by using any 2D or 3D surface plotter. PMID:14962632
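
    The transformation from a logit equation to predicted probabilities can be sketched as follows. The code computes P(Y=1|X) = 1 / (1 + exp(-(b0 + b1*X))) over a grid of X values, giving the points that would be plotted with X on the abscissa. The intercept and slope are hypothetical values chosen for illustration, not coefficients from the paper.

```python
import math


def logistic(z):
    """Map a logit (log-odds) value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predicted_probability(intercept, coefficients, values):
    """P(Y=1|X) for a fitted binary logistic model with the given coefficients."""
    logit = intercept + sum(b * x for b, x in zip(coefficients, values))
    return logistic(logit)


# Hypothetical single-predictor model: logit = -4.0 + 0.08 * X.
intercept, slope = -4.0, 0.08
curve = [(x, predicted_probability(intercept, [slope], [x])) for x in range(0, 101, 10)]
```

    The (X, probability) pairs in `curve` can be passed to any 2D plotter; for two continuous predictors, the same function evaluated over a grid yields the surface for a 3D display.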

  13. Alternative Approach To Modeling Bacterial Lag Time, Using Logistic Regression as a Function of Time, Temperature, pH, and Sodium Chloride Concentration

    PubMed Central

    Nonaka, Junko

    2012-01-01

    The objective of this study was to develop a probabilistic model to predict the end of lag time (λ) during the growth of Bacillus cereus vegetative cells as a function of temperature, pH, and salt concentration using logistic regression. The developed λ model was subsequently combined with a logistic differential equation to simulate bacterial numbers over time. To develop a novel model for λ, we determined whether bacterial growth had begun, i.e., whether λ had ended, at each time point during the growth kinetics. The growth of B. cereus was evaluated by optical density (OD) measurements in culture media for various pHs (5.5 ∼ 7.0) and salt concentrations (0.5 ∼ 2.0%) at static temperatures (10 ∼ 20°C). The probability of the end of λ was modeled using dichotomous judgments obtained at each OD measurement point concerning whether a significant increase had been observed. The probability of the end of λ was described as a function of time, temperature, pH, and salt concentration and showed a high goodness of fit. The λ model was validated with independent data sets of B. cereus growth in culture media and foods, indicating acceptable performance. Furthermore, the λ model, in combination with a logistic differential equation, enabled a simulation of the population of B. cereus in various foods over time at static and/or fluctuating temperatures with high accuracy. Thus, this newly developed modeling procedure enables the description of λ using observable environmental parameters without any conceptual assumptions and the simulation of bacterial numbers over time with the use of a logistic differential equation. PMID:22729541

  14. Alternative approach to modeling bacterial lag time, using logistic regression as a function of time, temperature, pH, and sodium chloride concentration.

    PubMed

    Koseki, Shige; Nonaka, Junko

    2012-09-01

    The objective of this study was to develop a probabilistic model to predict the end of lag time (λ) during the growth of Bacillus cereus vegetative cells as a function of temperature, pH, and salt concentration using logistic regression. The developed λ model was subsequently combined with a logistic differential equation to simulate bacterial numbers over time. To develop a novel model for λ, we determined whether bacterial growth had begun, i.e., whether λ had ended, at each time point during the growth kinetics. The growth of B. cereus was evaluated by optical density (OD) measurements in culture media for various pHs (5.5 ∼ 7.0) and salt concentrations (0.5 ∼ 2.0%) at static temperatures (10 ∼ 20°C). The probability of the end of λ was modeled using dichotomous judgments obtained at each OD measurement point concerning whether a significant increase had been observed. The probability of the end of λ was described as a function of time, temperature, pH, and salt concentration and showed a high goodness of fit. The λ model was validated with independent data sets of B. cereus growth in culture media and foods, indicating acceptable performance. Furthermore, the λ model, in combination with a logistic differential equation, enabled a simulation of the population of B. cereus in various foods over time at static and/or fluctuating temperatures with high accuracy. Thus, this newly developed modeling procedure enables the description of λ using observable environmental parameters without any conceptual assumptions and the simulation of bacterial numbers over time with the use of a logistic differential equation. PMID:22729541

  15. The comparison of landslide ratio-based and general logistic regression landslide susceptibility models in the Chishan watershed after 2009 Typhoon Morakot

    NASA Astrophysics Data System (ADS)

    WU, Chunhung

    2015-04-01

    The research built the original logistic regression landslide susceptibility model (abbreviated as or-LRLSM) and the landslide ratio-based logistic regression landslide susceptibility model (abbreviated as lr-LRLSM), compared their performance, and explained the error sources of the two models. The research assumes that the performance of the logistic regression model can be better if the distribution of landslide ratio and weighted value of each variable is similar. Landslide ratio is the ratio of landslide area to total area in the specific area and a useful index to evaluate the seriousness of landslide disaster in Taiwan. The research adopted the landslide inventory induced by 2009 Typhoon Morakot in the Chishan watershed, which was the most serious disaster event in the last decade in Taiwan. The research adopted the 20 m grid as the basic unit in building the LRLSM, and six variables, including elevation, slope, aspect, geological formation, accumulated rainfall, and bank erosion, were included in the two models. In building the or-LRLSM, the six variables were divided into continuous variables (elevation, slope, and accumulated rainfall) and categorical variables (aspect, geological formation, and bank erosion), while in building the lr-LRLSM all variables were classified based on landslide ratio and treated as categorical. Because the number of basic units in the Chishan watershed was too large to process with commercial software, the research used random sampling instead of the whole set of basic units. The research adopted equal proportions of landslide units and non-landslide units in the logistic regression analysis. The research performed random sampling 10 times and selected the group with the best Cox & Snell R2 value and Nagelkerke R2 value as the database for the following analysis.
Based on the best result from 10 random sampling groups, the or-LRLSM (lr-LRLSM) is significant at the 1% level with Cox & Snell R2 = 0.190 (0.196) and Nagelkerke R2

  16. Geographic information systems and logistic regression for high-resolution malaria risk mapping in a rural settlement of the southern Brazilian Amazon

    PubMed Central

    2013-01-01

    Background In Brazil, 99% of the cases of malaria are concentrated in the Amazon region, with high level of transmission. The objectives of the study were to use geographic information systems (GIS) analysis and logistic regression as a tool to identify and analyse the relative likelihood and its socio-environmental determinants of malaria infection in the Vale do Amanhecer rural settlement, Brazil. Methods A GIS database of georeferenced malaria cases, recorded in 2005, and multiple explanatory data layers was built, based on a multispectral Landsat 5 TM image, digital map of the settlement blocks and a SRTM digital elevation model. Satellite imagery was used to map the spatial patterns of land use and cover (LUC) and to derive spectral indices of vegetation density (NDVI) and soil/vegetation humidity (VSHI). A Euclidean distance operator was applied to measure proximity of domiciles to potential mosquito breeding habitats and gold mining areas. The malaria risk model was generated by multiple logistic regression, in which environmental factors were considered as independent variables and the number of cases, binarized by a threshold value, was the dependent variable. Results Out of a total of 336 cases of malaria, 133 positive slides were from inhabitants at Road 08, which corresponds to 37.60% of the notifications. The southern region of the settlement presented 276 cases and a greater number of domiciles in which more than ten cases/home were notified. From these, 102 (30.36%) cases were caused by Plasmodium falciparum and 174 (51.79%) cases by Plasmodium vivax. Malaria risk is the highest in the south of the settlement, associated with proximity to gold mining sites, intense land use, high levels of soil/vegetation humidity and low vegetation density. Conclusions Mid-resolution, remote sensing data and GIS-derived distance measures can be successfully combined with digital maps of the housing location of (non-) infected inhabitants to predict relative

  17. A comparative study of frequency ratio, weights of evidence and logistic regression methods for landslide susceptibility mapping: Sultan Mountains, SW Turkey

    NASA Astrophysics Data System (ADS)

    Ozdemir, Adnan; Altural, Tolga

    2013-03-01

    This study evaluated and compared landslide susceptibility maps produced with three different methods, frequency ratio, weights of evidence, and logistic regression, by using validation datasets. The field surveys performed as part of this investigation mapped the locations of 90 landslides that had been identified in the Sultan Mountains of south-western Turkey. The landslide influence parameters used for this study are geology, relative permeability, land use/land cover, precipitation, elevation, slope, aspect, total curvature, plan curvature, profile curvature, wetness index, stream power index, sediment transportation capacity index, distance to drainage, distance to fault, drainage density, fault density, and spring density maps. The relationships between landslide distributions and these parameters were analysed using the three methods, and the results of these methods were then used to calculate the landslide susceptibility of the entire study area. The accuracy of the final landslide susceptibility maps was evaluated based on the landslides observed during the fieldwork, and the accuracy of the models was evaluated by calculating each model's relative operating characteristic curve. The predictive capability of each model was determined from the area under the relative operating characteristic curve and the areas under the curves obtained using the frequency ratio, logistic regression, and weights of evidence methods are 0.976, 0.952, and 0.937, respectively. These results indicate that the frequency ratio and weights of evidence models are relatively good estimators of landslide susceptibility in the study area. Specifically, the results of the correlation analysis show a high correlation between the frequency ratio and weights of evidence results, and the frequency ratio and logistic regression methods exhibit correlation coefficients of 0.771 and 0.727, respectively. 
The frequency ratio model is simple, and its input, calculation and output processes are

  18. The role of multicollinearity in landslide susceptibility assessment by means of Binary Logistic Regression: comparison between VIF and AIC stepwise selection

    NASA Astrophysics Data System (ADS)

    Cama, Mariaelena; Cristi Nicu, Ionut; Conoscenti, Christian; Quénéhervé, Geraldine; Maerker, Michael

    2016-04-01

    Landslide susceptibility can be defined as the likelihood of a landslide occurring in a given area on the basis of local terrain conditions. In the last decades much research has focused on its evaluation by means of stochastic approaches under the assumption that 'the past is the key to the future', which means that if a model is able to reproduce a known landslide spatial distribution, it will be able to predict the future locations of new (i.e. unknown) slope failures. Among the various stochastic approaches, Binary Logistic Regression (BLR) is one of the most used because it calculates the susceptibility in probabilistic terms and its results are easily interpretable from a geomorphological point of view. However, very often not much importance is given to multicollinearity assessment, whose effect is that the coefficient estimates are unstable, with opposite sign and therefore difficult to interpret. Therefore, it should be evaluated every time in order to make a model whose results are geomorphologically correct. In this study the effects of multicollinearity on the predictive performance and robustness of landslide susceptibility models are analyzed. In particular, the multicollinearity is estimated by means of the Variance Inflation Factor (VIF), which is also used as selection criterion for the independent variables (VIF Stepwise Selection) and compared to the more commonly used AIC Stepwise Selection. The robustness of the results is evaluated through 100 replicates of the dataset. The study area selected to perform this analysis is the Moldavian Plateau where landslides are among the most frequent geomorphological processes. This area has an increasing trend of urbanization and a very high potential regarding the cultural heritage, being the place of discovery of the largest settlement belonging to the Cucuteni Culture from Eastern Europe (that led to the development of the great complex Cucuteni-Trypillia). Therefore, identifying the areas susceptible to
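
    For readers unfamiliar with the VIF named in this record: for predictor j it is commonly defined as VIF_j = 1 / (1 - R_j^2), where R_j^2 is obtained by regressing predictor j on the remaining predictors. The Python sketch below covers only the two-predictor case, in which R_j^2 equals the squared Pearson correlation between the two predictors. The function names and sample values are hypothetical, and the code is not the stepwise selection procedure used in the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF shared by both predictors when the model has exactly two.

    With only two predictors, R_j^2 is the squared correlation between them.
    """
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)

# Made-up values for two terrain attributes (illustrative only).
attribute_a = [1.0, 2.0, 3.0, 4.0]
attribute_b = [1.0, 2.0, 3.0, 5.0]
vif = vif_two_predictors(attribute_a, attribute_b)
```

    A VIF of 1 means the predictor is uncorrelated with the other one; larger values indicate stronger collinearity and hence less stable coefficient estimates.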

  19. Detection of aberrant responding on a personality scale in a military sample: an application of evaluating person fit with two-level logistic regression.

    PubMed

    Woods, Carol M; Oltmanns, Thomas F; Turkheimer, Eric

    2008-06-01

    Person-fit assessment is used to identify persons who respond aberrantly to a test or questionnaire. In this study, S. P. Reise's (2000) method for evaluating person fit using 2-level logistic regression was applied to 13 personality scales of the Schedule for Nonadaptive and Adaptive Personality (SNAP; L. Clark, 1996) that had been administered to military recruits (N = 2,026). Results revealed significant person-fit heterogeneity and indicated that for 5 SNAP scales (Disinhibition, Entitlement, Exhibitionism, Negative Temperament, and Workaholism), the scale was more discriminating for some people than for others. Possible causes of aberrant responding were explored with several covariates. On all 5 scales, severe pathology emerged as a key influence on responses, and there was evidence of differential test functioning with respect to gender, ethnicity, or both. Other potential sources of aberrancy were carelessness, haphazard responding, or uncooperativeness. Social desirability was not as influential as expected. PMID:18557693

  20. lordif: An R Package for Detecting Differential Item Functioning Using Iterative Hybrid Ordinal Logistic Regression/Item Response Theory and Monte Carlo Simulations.

    PubMed

    Choi, Seung W; Gibbons, Laura E; Crane, Paul K

    2011-03-01

    Logistic regression provides a flexible framework for detecting various types of differential item functioning (DIF). Previous efforts extended the framework by using item response theory (IRT) based trait scores, and by employing an iterative process using group-specific item parameters to account for DIF in the trait scores, analogous to purification approaches used in other DIF detection frameworks. The current investigation advances the technique by developing a computational platform integrating both statistical and IRT procedures into a single program. Furthermore, a Monte Carlo simulation approach was incorporated to derive empirical criteria for various DIF statistics and effect size measures. For purposes of illustration, the procedure was applied to data from a questionnaire of anxiety symptoms for detecting DIF associated with age from the Patient-Reported Outcomes Measurement Information System. PMID:21572908

  1. Novel Logistic Regression Model of Chest CT Attenuation Coefficient Distributions for the Automated Detection of Abnormal (Emphysema or ILD) versus Normal Lung

    PubMed Central

    Chan, Kung-Sik; Jiao, Feiran; Mikulski, Marek A.; Gerke, Alicia; Guo, Junfeng; Newell, John D; Hoffman, Eric A.; Thompson, Brad; Lee, Chang Hyun; Fuortes, Laurence J.

    2015-01-01

    Rationale and Objectives We evaluated the role of automated quantitative computed tomography (CT) scan interpretation algorithm in detecting Interstitial Lung Disease (ILD) and/or emphysema in a sample of elderly subjects with mild lung disease. We hypothesized that the quantification and distributions of CT attenuation values on lung CT, over a subset of Hounsfield Units (HU) range [−1000 HU, 0 HU], can differentiate early or mild disease from normal lung. Materials and Methods We compared results of quantitative spiral rapid end-exhalation (functional residual capacity; FRC) and end-inhalation (total lung capacity; TLC) CT scan analyses in 52 subjects with radiographic evidence of mild fibrotic lung disease to 17 normal subjects. Several CT value distributions were explored, including (i) that from the peripheral lung taken at TLC (with peels at 15 or 65 mm), (ii) the ratio of (i) to that from the core of lung, and (iii) the ratio of (ii) to its FRC counterpart. We developed a fused-lasso logistic regression model that can automatically identify sub-intervals of [−1000 HU, 0 HU] over which a CT value distribution provides optimal discrimination between abnormal and normal scans. Results The fused-lasso logistic regression model based on (ii) with 15 mm peel identified the relative frequency of CT values over [−1000, −900] and that over [−450, −200] HU as a means of discriminating abnormal versus normal, resulting in a zero out-sample false positive rate and a 15% false negative rate that was lowered to 12% by pooling information. Conclusions We demonstrated the potential usefulness of this novel quantitative imaging analysis method in discriminating ILD and/or emphysema from normal lungs. PMID:26776294

  2. Predicting Grade 3 Acute Diarrhea During Radiation Therapy for Rectal Cancer Using a Cutoff-Dose Logistic Regression Normal Tissue Complication Probability Model

    SciTech Connect

    Robertson, John M.; Soehn, Matthias; Yan Di

    2010-05-01

    Purpose: Understanding the dose-volume relationship of small bowel irradiation and severe acute diarrhea may help reduce the incidence of this side effect during adjuvant treatment for rectal cancer. Methods and Materials: Consecutive patients treated curatively for rectal cancer were reviewed, and the maximum grade of acute diarrhea was determined. The small bowel was outlined on the treatment planning CT scan, and a dose-volume histogram was calculated for the initial pelvic treatment (45 Gy). Logistic regression models were fitted for varying cutoff-dose levels from 5 to 45 Gy in 5-Gy increments. The model with the highest LogLikelihood was used to develop a cutoff-dose normal tissue complication probability (NTCP) model. Results: There were a total of 152 patients (48% preoperative, 47% postoperative, 5% other), predominantly treated prone (95%) with a three-field technique (94%) and a protracted venous infusion of 5-fluorouracil (78%). Acute Grade 3 diarrhea occurred in 21%. The largest LogLikelihood was found for the cutoff-dose logistic regression model with 15 Gy as the cutoff-dose, although the models for 20 Gy and 25 Gy had similar significance. According to this model, highly significant correlations (p <0.001) between small bowel volumes receiving at least 15 Gy and toxicity exist in the considered patient population. Similar findings applied to both the preoperatively (p = 0.001) and postoperatively irradiated groups (p = 0.001). Conclusion: The incidence of Grade 3 diarrhea was significantly correlated with the volume of small bowel receiving at least 15 Gy using a cutoff-dose NTCP model.
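
    The model-selection step described in this abstract (fit one logistic regression per cutoff dose, keep the one with the largest log-likelihood) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the Newton-Raphson fitter, the function names, and the tiny data set are all hypothetical, and the data are synthetic values made up so the example runs.

```python
import math

def sigmoid(z):
    """Logistic function: map a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(b0, b1, xs, ys):
    """Bernoulli log-likelihood of a one-predictor logistic model."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total

def fit_logistic_1d(xs, ys, iterations=25):
    """Fit intercept and slope by Newton-Raphson, starting from zero."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        ps = [sigmoid(b0 + b1 * x) for x in xs]
        # Gradient of the log-likelihood.
        g0 = sum(y - p for y, p in zip(ys, ps))
        g1 = sum((y - p) * x for x, y, p in zip(xs, ys, ps))
        # Entries of the (negated) Hessian, a 2x2 symmetric matrix.
        ws = [p * (1.0 - p) for p in ps]
        h00 = sum(ws)
        h01 = sum(w * x for w, x in zip(ws, xs))
        h11 = sum(w * x * x for w, x in zip(ws, xs))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def best_cutoff(volumes_by_cutoff, outcomes):
    """Return (cutoff, log-likelihood) of the best-fitting cutoff model."""
    results = {}
    for cutoff, volumes in volumes_by_cutoff.items():
        b0, b1 = fit_logistic_1d(volumes, outcomes)
        results[cutoff] = log_likelihood(b0, b1, volumes, outcomes)
    best = max(results, key=results.get)
    return best, results[best]

# Synthetic example: 1 = Grade 3 diarrhea, 0 = none; volumes are the
# (made-up) organ volumes receiving at least the cutoff dose, per patient.
outcomes = [0, 0, 1, 0, 1, 1, 0, 1]
volumes_by_cutoff = {
    15: [10, 20, 30, 40, 50, 60, 70, 80],
    30: [10, 20, 10, 30, 20, 30, 40, 40],
}
cutoff, ll = best_cutoff(volumes_by_cutoff, outcomes)
```

    The fitter assumes the outcomes are not perfectly separated by the predictor and that the predictor is not constant; in either case the 2x2 system becomes ill-conditioned and a production analysis would use a statistics package instead.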

  3. The impact of meteorology on the occurrence of waterborne outbreaks of vero cytotoxin-producing Escherichia coli (VTEC): a logistic regression approach.

    PubMed

    O'Dwyer, Jean; Morris Downes, Margaret; Adley, Catherine C

    2016-02-01

    This study analyses the relationship between meteorological phenomena and outbreaks of waterborne-transmitted vero cytotoxin-producing Escherichia coli (VTEC) in the Republic of Ireland over an 8-year period (2005-2012). Data pertaining to the notification of waterborne VTEC outbreaks were extracted from the Computerised Infectious Disease Reporting system, which is administered through the national Health Protection Surveillance Centre as part of the Health Service Executive. Rainfall and temperature data were obtained from the national meteorological office and categorised as cumulative rainfall, heavy rainfall events in the previous 7 days, and mean temperature. Regression analysis was performed using logistic regression (LR) analysis. The LR model was significant (p < 0.001), with all independent variables: cumulative rainfall, heavy rainfall and mean temperature making a statistically significant contribution to the model. The study has found that rainfall, particularly heavy rainfall in the preceding 7 days of an outbreak, is a strong statistical indicator of a waterborne outbreak and that temperature also impacts waterborne VTEC outbreak occurrence. PMID:26837828

  4. Gene expression profiling of breast cancer survivability by pooled cDNA microarray analysis using logistic regression, artificial neural networks and decision trees

    PubMed Central

    2013-01-01

    Background Microarray technology can acquire information about thousands of genes simultaneously. We analyzed published breast cancer microarray databases to predict five-year recurrence and compared the performance of three data mining algorithms of artificial neural networks (ANN), decision trees (DT) and logistic regression (LR) and two composite models of DT-ANN and DT-LR. From the collection of microarray datasets in the Gene Expression Omnibus, four breast cancer datasets were pooled for predicting five-year breast cancer relapse. After data compilation, 757 subjects, 5 clinical variables and 13,452 genetic variables were aggregated. The bootstrap method, Mann–Whitney U test and 20-fold cross-validation were performed to investigate candidate genes with 100 most-significant p-values. The predictive powers of DT, LR and ANN models were assessed using accuracy and the area under ROC curve. The associated genes were evaluated using Cox regression. Results The DT models exhibited the lowest predictive power and the poorest extrapolation when applied to the test samples. The ANN models displayed the best predictive power and showed the best extrapolation. The 21 most-associated genes, as determined by integration of each model, were analyzed using Cox regression with a 3.53-fold (95% CI: 2.24-5.58) increased risk of breast cancer five-year recurrence… Conclusions The 21 selected genes can predict breast cancer recurrence. Among these genes, CCNB1, PLK1 and TOP2A are in the cell cycle G2/M DNA damage checkpoint pathway. Oncologists can offer the genetic information for patients when understanding the gene expression profiles on breast cancer recurrence. PMID:23506640

  5. Large scale landslide susceptibility assessment using the statistical methods of logistic regression and BSA - study case: the sub-basin of the small Niraj (Transylvania Depression, Romania)

    NASA Astrophysics Data System (ADS)

    Roşca, S.; Bilaşco, Ş.; Petrea, D.; Fodorean, I.; Vescan, I.; Filip, S.; Măguţ, F.-L.

    2015-11-01

    The existence of a large number of GIS models for the identification of landslide occurrence probability makes difficult the selection of a specific one. The present study focuses on the application of two quantitative models: the logistic and the BSA models. The comparative analysis of the results aims at identifying the most suitable model. The territory corresponding to the Niraj Mic Basin (87 km2) is an area characterised by a wide variety of the landforms with their morphometric, morphographical and geological characteristics as well as by a high complexity of the land use types where active landslides exist. This is the reason why it represents the test area for applying the two models and for the comparison of the results. The large complexity of input variables is illustrated by 16 factors which were represented as 72 dummy variables, analysed on the basis of their importance within the model structures. The testing of the statistical significance corresponding to each variable reduced the number of dummy variables to 12 which were considered significant for the test area within the logistic model, whereas for the BSA model all the variables were employed. The predictability degree of the models was tested through the identification of the area under the ROC curve which indicated a good accuracy (AUROC = 0.86 for the testing area) and predictability of the logistic model (AUROC = 0.63 for the validation area).
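
    The area under the ROC curve used above to judge the models can be read as the probability that a randomly chosen positive case (here, a landslide cell) receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The Python sketch below computes it directly from that definition; the function name and the toy scores are hypothetical and unrelated to the study's data.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted probabilities or any ranking score.
    labels: 1 for positive cases, 0 for negative cases.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(positives) * len(negatives))

# Toy example (made-up scores): two negatives followed by two positives.
example_auc = roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

    The pairwise loop is quadratic in the number of cases, which is fine for illustration; for raster-sized data sets a rank-based computation would be the usual choice.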

  6. A Simulation Study to Assess the Effect of the Number of Response Categories on the Power of Ordinal Logistic Regression for Differential Item Functioning Analysis in Rating Scales

    PubMed Central

    Allahyari, Elahe; Jafari, Peyman; Bagheri, Zahra

    2016-01-01

    Objective. The present study uses simulated data to find what the optimal number of response categories is to achieve adequate power in ordinal logistic regression (OLR) model for differential item functioning (DIF) analysis in psychometric research. Methods. A hypothetical ten-item quality of life scale with three, four, and five response categories was simulated. The power and type I error rates of OLR model for detecting uniform DIF were investigated under different combinations of ability distribution (θ), sample size, sample size ratio, and the magnitude of uniform DIF across reference and focal groups. Results. When θ was distributed identically in the reference and focal groups, increasing the number of response categories from 3 to 5 resulted in an increase of approximately 8% in power of OLR model for detecting uniform DIF. The power of OLR was less than 0.36 when ability distribution in the reference and focal groups was highly skewed to the left and right, respectively. Conclusions. The clearest conclusion from this research is that the minimum number of response categories for DIF analysis using OLR is five. However, the impact of the number of response categories in detecting DIF was lower than might be expected. PMID:27403207

  7. Landslide-susceptibility analysis using light detection and ranging-derived digital elevation models and logistic regression models: a case study in Mizunami City, Japan

    NASA Astrophysics Data System (ADS)

    Wang, Liang-Jie; Sawada, Kazuhide; Moriguchi, Shuji

    2013-01-01

    To mitigate the damage caused by landslide disasters, different mathematical models have been applied to predict landslide spatial distribution characteristics. Although some researchers have achieved excellent results around the world, few studies take the spatial resolution of the database into account. Four types of digital elevation model (DEM) ranging from 2 to 20 m derived from light detection and ranging technology to analyze landslide susceptibility in Mizunami City, Gifu Prefecture, Japan, are presented. Fifteen landslide-causative factors are considered using a logistic-regression approach to create models for landslide potential analysis. Pre-existing landslide bodies are used to evaluate the performance of the four models. The results revealed that the 20-m model had the highest classification accuracy (71.9%), whereas the 2-m model had the lowest value (68.7%). In the 2-m model, 89.4% of the landslide bodies fit in the medium to very high categories. For the 20-m model, only 83.3% of the landslide bodies were concentrated in the medium to very high classes. When the cell size decreases from 20 to 2 m, the area under the relative operative characteristic increases from 0.68 to 0.77. Therefore, higher-resolution DEMs would provide better results for landslide-susceptibility mapping.

  8. Structural vascular disease in Africans: Performance of ethnic-specific waist circumference cut points using logistic regression and neural network analyses: The SABPA study.

    PubMed

    Botha, J; de Ridder, J H; Potgieter, J C; Steyn, H S; Malan, L

    2013-10-01

    A recently proposed model for waist circumference cut points (RPWC), driven by increased blood pressure, was demonstrated in an African population. We therefore aimed to validate the RPWC by comparing the RPWC and the Joint Statement Consensus (JSC) models via Logistic Regression (LR) and Neural Networks (NN) analyses. Urban African gender groups (N=171) were stratified according to the JSC and RPWC cut point models. Ultrasound carotid intima media thickness (CIMT), blood pressure (BP) and fasting bloods (glucose, high density lipoprotein (HDL) and triglycerides) were obtained in a well-controlled setting. The RPWC male model (LR ROC AUC: 0.71, NN ROC AUC: 0.71) was practically equal to the JSC model (LR ROC AUC: 0.71, NN ROC AUC: 0.69) to predict structural vascular disease. Similarly, the female RPWC model (LR ROC AUC: 0.84, NN ROC AUC: 0.82) and JSC model (LR ROC AUC: 0.82, NN ROC AUC: 0.81) equally predicted CIMT as surrogate marker for structural vascular disease. Odds ratios supported validity where prediction of CIMT revealed clinical significance, well over 1, for both the JSC and RPWC models in African males and females (OR 3.75-13.98). In conclusion, the proposed RPWC model was substantially validated utilizing linear and non-linear analyses. We therefore propose ethnic-specific WC cut points (African males, ≥90 cm; females, ≥98 cm) to predict a surrogate marker for structural vascular disease. PMID:23934678

  9. Logistic regression analysis of multiple noninvasive tests for the prediction of the presence and extent of coronary artery disease in men

    SciTech Connect

    Hung, J.; Chaitman, B.R.; Lam, J.; Lesperance, J.; Dupras, G.; Fines, P.; Cherkaoui, O.; Robert, P.; Bourassa, M.G.

    1985-08-01

    The incremental diagnostic yield of clinical data, exercise ECG, stress thallium scintigraphy, and cardiac fluoroscopy to predict coronary and multivessel disease was assessed in 171 symptomatic men by means of multiple logistic regression analyses. When clinical variables alone were analyzed, chest pain type and age were predictive of coronary disease, whereas chest pain type, age, a family history of premature coronary disease before age 55 years, and abnormal ST-T wave changes on the rest ECG were predictive of multivessel disease. The percentage of patients correctly classified by cardiac fluoroscopy (presence or absence of coronary artery calcification), exercise ECG, and thallium scintigraphy was 9%, 25%, and 50%, respectively, greater than for clinical variables, when the presence or absence of coronary disease was the outcome, and 13%, 25%, and 29%, respectively, when multivessel disease was studied; 5% of patients were misclassified. When the 37 clinical and noninvasive test variables were analyzed jointly, the most significant variable predictive of coronary disease was an abnormal thallium scan and for multivessel disease, the amount of exercise performed. The data from this study provide a quantitative model and confirm previous reports that optimal diagnostic efficacy is obtained when noninvasive tests are ordered sequentially. In symptomatic men, cardiac fluoroscopy is a relatively ineffective test when compared to exercise ECG and thallium scintigraphy.

  10. Spatial prediction of Lactarius deliciosus and Lactarius salmonicolor mushroom distribution with logistic regression models in the Kızılcasu Planning Unit, Turkey.

    PubMed

    Mumcu Kucuker, Derya; Baskent, Emin Zeki

    2015-01-01

    Integration of non-wood forest products (NWFPs) into forest management planning has become an increasingly important issue in forestry over the last decade. Among NWFPs, mushrooms are valued due to their medicinal, commercial, high nutritional and recreational importance. Commercial mushroom harvesting also provides important income to local dwellers and contributes to the economic value of regional forests. Sustainable management of these products at the regional scale requires information on their locations in diverse forest settings and the ability to predict and map their spatial distributions over the landscape. This study focuses on modeling the spatial distribution of commercially harvested Lactarius deliciosus and L. salmonicolor mushrooms in the Kızılcasu Forest Planning Unit, Turkey. The best models were developed based on topographic, climatic and stand characteristics, separately through logistic regression analysis using SPSS™. The best topographic model provided better classification success (69.3 %) than the best climatic (65.4 %) and stand (65 %) models. However, the overall best model, with 73 % overall classification success, used a mix of several variables. The best models were integrated into an Arc/Info GIS program to create spatial distribution maps of L. deliciosus and L. salmonicolor in the planning area. Our approach may be useful to predict the occurrence and distribution of other NWFPs and provide a valuable tool for designing silvicultural prescriptions and preparing multiple-use forest management plans. PMID:24821473

  11. Shallow landslide susceptibility model for the Oria river basin, Gipuzkoa province (North of Spain). Application of the logistic regression and comparison with previous studies.

    NASA Astrophysics Data System (ADS)

    Bornaetxea, Txomin; Antigüedad, Iñaki; Ormaetxea, Orbange

    2016-04-01

    In the Oria river basin (885 km²), shallow landslides are very frequent; they cause roadblocks and damage to infrastructure and property, with large economic losses every year. Because zoning the territory into landslide susceptibility levels provides a useful tool for territorial planning and natural risk management, this study aims to identify the most landslide-prone places by applying an objective and reproducible methodology. To do so, a quantitative multivariate method, logistic regression, has been used. Landslide points mapped in the field and randomly selected stable points have been used along with the independent variables Lithology, Land Use, Distance to the transport infrastructure, Altitude, Senoidal Slope and Normalized Difference Vegetation Index (NDVI) to produce a landslide susceptibility map. The model has been validated by the prediction and success rate curves and their corresponding area under the curve (AUC). In addition, the result has been compared with two landslide susceptibility models that cover the study area and were previously applied at different scales: ELSUS1000 version 1 (2013) and the Landslide Susceptibility Map of Gipuzkoa (2007). Validation results show an excellent prediction capacity of the proposed model (AUC 0.962), and comparisons highlight big differences with previous studies.

  12. Quasi-Likelihood Techniques in a Logistic Regression Equation for Identifying Simulium damnosum s.l. Larval Habitats Intra-cluster Covariates in Togo

    PubMed Central

    JACOB, BENJAMIN G.; NOVAK, ROBERT J.; TOE, LAURENT; SANFO, MOUSSA S.; AFRIYIE, ABENA N.; IBRAHIM, MOHAMMED A.; GRIFFITH, DANIEL A.; UNNASCH, THOMAS R.

    2013-01-01

    The standard methods for regression analyses of clustered riverine larval habitat data of Simulium damnosum s.l., a major black-fly vector of onchocerciasis, postulate models relating observational ecological-sampled parameter estimators to prolific habitats without accounting for residual intra-cluster error correlation effects. Generally, this correlation comes from two sources: (1) the design of the random effects and their assumed covariance from the multiple levels within the regression model; and (2) the correlation structure of the residuals. Unfortunately, inconspicuous errors in residual intra-cluster correlation estimates can overstate precision in forecasted S. damnosum s.l. riverine larval habitat explanatory attributes regardless of how they are treated (e.g., independent, autoregressive, Toeplitz, etc.). In this research, the geographical locations for multiple riverine-based S. damnosum s.l. larval ecosystem habitats sampled from 2 pre-established epidemiological sites in Togo were identified and recorded from July 2009 to June 2010. Initially the data was aggregated into proc genmod. An agglomerative hierarchical residual cluster-based analysis was then performed. The sampled clustered study site data was then analyzed for statistical correlations using Monthly Biting Rates (MBR). Euclidean distance measurements and terrain-related geomorphological statistics were then generated in ArcGIS. A digital overlay was then performed also in ArcGIS using the georeferenced ground coordinates of high and low density clusters stratified by Annual Biting Rates (ABR). This data was overlain onto multitemporal sub-meter pixel resolution satellite data (i.e., QuickBird 0.61 m wavebands). Orthogonal spatial filter eigenvectors were then generated in SAS/GIS. Univariate and non-linear regression-based models (i.e., Logistic, Poisson and Negative Binomial) were also employed to determine probability distributions and to identify statistically significant parameter

  13. 78 FR 73872 - Agency Information Collection Activities: Proposed Collection; Comment Request; Logistics...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-12-09

    ...; Comment Request; Logistics Capability Assistance Tool (LCAT) AGENCY: Federal Emergency Management Agency... accordance with the Paperwork Reduction Act of 1995, this notice seeks comments concerning the Logistics... Repass, Program Analyst, Logistics Management Directorate, Logistics Plans & Exercises Division,...

  14. Landslide Susceptibility Analysis by the comparison and integration of Random Forest and Logistic Regression methods; application to the disaster of Nova Friburgo - Rio de Janeiro, Brasil (January 2011)

    NASA Astrophysics Data System (ADS)

    Esposito, Carlo; Barra, Anna; Evans, Stephen G.; Scarascia Mugnozza, Gabriele; Delaney, Keith

    2014-05-01

    The study of landslide susceptibility by multivariate statistical methods is based on finding a quantitative relationship between controlling factors and landslide occurrence. Such studies have become popular in the last few decades thanks to the development of geographic information systems (GIS) software and the related improved data management. In this work we applied a statistical approach to an area of high landslide susceptibility mainly due to its tropical climate and geological-geomorphological setting. The study area is located in the south-east region of Brazil that has frequently been affected by flood and landslide hazard, especially because of heavy rainfall events during the summer season. In this work we studied a disastrous event that occurred on January 11th and 12th of 2011, which involved Região Serrana (the mountainous region of Rio de Janeiro State) and caused more than 5000 landslides and at least 904 deaths. In order to produce susceptibility maps, we focused our attention on an area of 93.6 km² that includes Nova Friburgo city. We utilized two different multivariate statistical methods: Logistic Regression (LR), already widely used in applied geosciences, and Random Forest (RF), which has only recently been applied to landslide susceptibility analysis. With reference to each mapping unit, the first method (LR) results in a probability of landslide occurrence, while the second one (RF) gives a prediction in terms of % of area susceptible to slope failure. With this aim in mind, a landslide inventory map (related to the studied event) has been drawn up through analyses of high-resolution GeoEye satellite images, in a GIS environment. Data layers of 11 causative factors have been created and processed in order to be used as continuous numerical or discrete categorical variables in statistical analysis. In particular, the logistic regression method has frequent difficulties in managing numerical continuous and discrete categorical variables

  15. Predicting reintubation, prolonged mechanical ventilation and death in post-coronary artery bypass graft surgery: a comparison between artificial neural networks and logistic regression models

    PubMed Central

    Mendes, Renata G.; de Souza, César R.; Machado, Maurício N.; Correa, Paulo R.; Di Thommazo-Luporini, Luciana; Arena, Ross; Myers, Jonathan; Pizzolato, Ednaldo B.

    2015-01-01

    Introduction In coronary artery bypass (CABG) surgery, the common complications are the need for reintubation, prolonged mechanical ventilation (PMV) and death. Thus, a reliable model for the prognostic evaluation of those particular outcomes is a worthwhile pursuit. The existence of such a system would lead to better resource planning, cost reductions and an increased ability to guide preventive strategies. The aim of this study was to compare different methods – logistic regression (LR) and artificial neural networks (ANNs) – in accomplishing this goal. Material and methods Subjects undergoing CABG (n = 1315) were divided into training (n = 1053) and validation (n = 262) groups. The set of independent variables consisted of age, gender, weight, height, body mass index, diabetes, creatinine level, cardiopulmonary bypass, presence of preserved ventricular function, moderate and severe ventricular dysfunction and total number of grafts. The PMV was also an input for the prediction of death. The ability of ANN to discriminate outcomes was assessed using receiver-operating characteristic (ROC) analysis and the results were compared using a multivariate LR. Results The ROC curve areas for LR and ANN models, respectively, were: for reintubation 0.62 (CI: 0.50–0.75) and 0.65 (CI: 0.53–0.77); for PMV 0.67 (CI: 0.57–0.78) and 0.72 (CI: 0.64–0.81); and for death 0.86 (CI: 0.79–0.93) and 0.85 (CI: 0.80–0.91). No differences were observed between models. Conclusions The ANN has similar discriminating power in predicting reintubation, PMV and death outcomes. Thus, both models may be applicable as a predictor for these outcomes in subjects undergoing CABG. PMID:26322087

  16. Comparing performances of logistic regression and neural networks for predicting melatonin excretion patterns in the rat exposed to ELF magnetic fields.

    PubMed

    Jahandideh, Samad; Abdolmaleki, Parviz; Movahedi, Mohammad Mehdi

    2010-02-01

    Various studies have been reported on the bioeffects of magnetic field exposure; however, no consensus or guideline is available for experimental designs relating to exposure conditions as yet. In this study, logistic regression (LR) and artificial neural networks (ANNs) were used in order to analyze and predict the melatonin excretion patterns in the rat exposed to extremely low frequency magnetic fields (ELF-MF). Subsequently, on a database containing 33 experiments, performances of LR and ANNs were compared through resubstitution and jackknife tests. Predictor variables were more effective parameters and included frequency, polarization, exposure duration, and strength of magnetic fields. Also, five performance measures including accuracy, sensitivity, specificity, Matthews correlation coefficient (MCC) and normalized percentage better than random (S) were used to evaluate the performance of models. The LR as a conventional model obtained poor prediction performance. Nonetheless, LR distinguished the duration of magnetic fields as a statistically significant parameter. Also, horizontal polarization of magnetic fields, with the highest logit coefficient (or parameter estimate) with negative sign, was found to be the strongest indicator for experimental designs relating to exposure conditions. This means that each experiment with horizontal polarization of magnetic fields has a higher probability to result in a "not changed melatonin level" pattern. On the other hand, ANNs, a more powerful model which has not been introduced in predicting melatonin excretion patterns in the rat exposed to ELF-MF, showed high performance measure values and higher reliability, especially obtaining 0.55 value of MCC through jackknife tests. Obtained results showed that such predictor models are promising and may play a useful role in defining guidelines for experimental designs relating to exposure conditions. In conclusion, analysis of the bioelectromagnetic data could result in
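
    Four of the performance measures named in this abstract (accuracy, sensitivity, specificity and MCC) can be computed from a 2x2 confusion matrix. The following is a minimal Python sketch, not the authors' code; the function names and the example counts (chosen only so that they total 33, the number of experiments in the database) are hypothetical.

```python
import math


def basic_measures(tp, tn, fp, fn):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # fraction of true positives recovered
    specificity = tn / (tn + fp)  # fraction of true negatives recovered
    return accuracy, sensitivity, specificity


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary classifier."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column total is zero
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for illustration only (12 + 14 + 3 + 4 = 33).
tp, tn, fp, fn = 12, 14, 3, 4
acc, sens, spec = basic_measures(tp, tn, fp, fn)
mcc = matthews_corrcoef(tp, tn, fp, fn)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} "
      f"specificity={spec:.3f} MCC={mcc:.3f}")
```

    Unlike accuracy, MCC uses all four cells of the matrix, so it stays informative when the two outcome classes are unbalanced.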

  17. A Remote Sensing Based Approach for the Assessment of Debris Flow Hazards Using Artificial Neural Network and Binary Logistic Regression Modeling

    NASA Astrophysics Data System (ADS)

    El Kadiri, R.; Sultan, M.; Elbayoumi, T.; Sefry, S.

    2013-12-01

    Efforts to map the distribution of debris flows, to assess the factors controlling their development, and to identify the areas prone to their development are often hampered by the absence or paucity of appropriate monitoring systems and historical databases and the inaccessibility of these areas in many parts of the world. We developed methodologies that heavily rely on readily available observations extracted from remote sensing datasets and successfully applied these techniques over the Jazan province, in the Red Sea hills of Saudi Arabia. We first identified debris flows (10,334 locations) from high spatial resolution satellite datasets (e.g., GeoEye, Orbview), and verified a subset of these occurrences in the field. We then constructed a GIS to host the identified debris flow locations together with co-registered relevant data (e.g., lithology, elevation) and derived products (e.g., slope, normalized difference vegetation index, etc.). Spatial analysis of the data sets in the GIS indicated various degrees of correspondence between the distribution of debris flows and various variables (e.g., stream power index, topographic position index, normalized difference vegetation index, distance to stream, flow accumulation, slope and soil weathering index, aspect, elevation) suggesting a causal effect. For example, debris flows were found in areas of high slope, low distance to low stream orders and low vegetation index. To evaluate the extent to which these factors control landslide distribution, we constructed and applied: (1) a stepwise input selection by testing all input combinations to make the final model more compact and effective, (2) a statistic-based binary logistic regression (BLR) model, and (3) a mathematical-based artificial neural network (ANN) model. Only 80% (8267 locations) of the data was used for the construction of each of the models and the remaining samples (2067 locations) were used for accuracy assessment. Results

  18. Three-Level Mixed-Effects Logistic Regression Analysis Reveals Complex Epidemiology of Swine Rotaviruses in Diagnostic Samples from North America

    PubMed Central

    Diaz, Andres; Rossow, Stephanie; Ciarlet, Max; Marthaler, Douglas

    2016-01-01

    Rotaviruses (RV) are important causes of diarrhea in animals, especially in domestic animals. Of the 9 RV species, rotavirus A, B, and C (RVA, RVB, and RVC, respectively) had been established as important causes of diarrhea in pigs. The Minnesota Veterinary Diagnostic Laboratory receives swine stool samples from North America to determine the etiologic agents of disease. Between November 2009 and October 2011, 7,508 samples from pigs with diarrhea were submitted to determine if enteric pathogens, including RV, were present in the samples. All samples were tested for RVA, RVB, and RVC by real time RT-PCR. The majority of the samples (82%) were positive for RVA, RVB, and/or RVC. To better understand the risk factors associated with RV infections in swine diagnostic samples, three-level mixed-effects logistic regression models (3L-MLMs) were used to estimate associations among RV species, age, and geographical variability within the major swine production regions in North America. The conditional odds ratios (cORs) for RVA and RVB detection were lower for 1–3 day old pigs when compared to any other age group. However, the cOR of RVC detection in 1–3 day old pigs was significantly higher (p < 0.001) than pigs in the 4–20 days old and >55 day old age groups. Furthermore, pigs in the 21–55 day old age group had statistically higher cORs of RV co-detection compared to 1–3 day old pigs (p < 0.001). The 3L-MLMs indicated that RV status was more similar within states than among states or within each region. Our results indicated that 3L-MLMs are a powerful and adaptable tool to handle and analyze large-hierarchical datasets. In addition, our results indicated that, overall, swine RV epidemiology is complex, and RV species are associated with different age groups and vary by regions in North America. PMID:27145176

  19. Evaluation of the diagnostic value of alpha-l-fucosidase, alpha-fetoprotein and thymidine kinase 1 with ROC and logistic regression for hepatocellular carcinoma

    PubMed Central

    Zhang, Shi-Yan; Lin, Bi-Ding; Li, Bu-Ren

    2015-01-01

    The purpose of this study was to evaluate the diagnostic efficiency for hepatocellular carcinoma (HCC) with the combined analysis of alpha-l-fucosidase (AFU), alpha-fetoprotein (AFP) and thymidine kinase 1 (TK1). Serum levels of AFU, AFP and TK1 were measured in 116 patients with HCC, 109 patients with benign hepatic diseases, and 104 normal subjects. The diagnostic value was analyzed using the logistic regression equation and receiver operating characteristic (ROC) curves. The three tested tumor markers were non-normally distributed in every group (Kolmogorov–Smirnov test, Z = 0.156–0.517, P < 0.001). The serum levels of AFP and TK1 in patients with HCC were significantly higher than those in patients with benign hepatic diseases (Mann–Whitney U test, Z = −8.570 to −5.943, all P < 0.001). However, there was no statistically significant difference in AFU between these two groups (Mann–Whitney U test, Z = −1.820, P = 0.069). The levels of AFU were significantly higher in patients with benign hepatic diseases than in normal subjects (Mann–Whitney U test, Z = −7.984, P < 0.001). ROC curves for patients with HCC versus those without HCC indicated that the optimal cut-off value was 40.80 U/L for AFU, 10.86 μg/L for AFP and 1.92 pmol/L for TK1. The area under the ROC curve (AUC) was 0.718 for AFU, 0.832 for AFP, 0.773 for TK1 and 0.900 for the combination of the three tumor markers. The combination resulted in a higher Youden index and a sensitivity of 85.3%. The combined detection of serum AFU, AFP and TK1 could play a complementary role in the diagnosis of HCC, and could significantly improve the sensitivity for the diagnosis of HCC. PMID:25870783
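
    The Youden index mentioned in this abstract is J = sensitivity + specificity − 1, and one common way to choose an "optimal" cut-off on a ROC curve is to take the cut-off that maximizes J. The abstract does not state how its cut-offs were selected, so the Python sketch below only illustrates that general technique; the function name and the marker values are hypothetical and are not data from the study.

```python
def youden_cutoff(scores, labels):
    """Scan candidate cut-offs and return (cutoff, J, sensitivity, specificity).

    A sample is called positive when score >= cutoff. Labels are 1 for
    disease and 0 for no disease. Ties in J keep the lowest cut-off.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        sensitivity = tp / positives
        specificity = tn / negatives
        j = sensitivity + specificity - 1
        if best is None or j > best[1]:
            best = (cutoff, j, sensitivity, specificity)
    return best


# Hypothetical serum marker values and disease labels, for illustration only.
marker = [5, 8, 9, 12, 15, 20, 30, 45]
disease = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, j, sens, spec = youden_cutoff(marker, disease)
print(f"cut-off={cutoff} J={j:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

    For a combination of markers, the same scan can be run on the predicted probability from a fitted logistic regression instead of on a single marker value.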

  20. Unitary Response Regression Models

    ERIC Educational Resources Information Center

    Lipovetsky, S.

    2007-01-01

    The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…

  1. RESEARCH PAPER: A logistic model for magnetic energy storage in solar active regions

    NASA Astrophysics Data System (ADS)

    Wang, Hua-Ning; Cui, Yan-Mei; He, Han

    2009-06-01

    Previous statistical analyses of a large number of SOHO/MDI full disk longitudinal magnetograms provided a result that demonstrated how responses of solar flares to photospheric magnetic properties can be fitted with sigmoid functions. A logistic model reveals that these fitted sigmoid functions might be related to the free energy storage process in solar active regions. Although this suggested model is rather simple, the free energy level of active regions can be estimated and the probability of a solar flare with importance over a threshold can be forecast within a given time window.

  2. Predicting Antitumor Activity of Peptides by Consensus of Regression Models Trained on a Small Data Sample

    PubMed Central

    Radman, Andreja; Gredičak, Matija; Kopriva, Ivica; Jerić, Ivanka

    2011-01-01

    Predicting antitumor activity of compounds using regression models trained on a small number of compounds with measured biological activity is an ill-posed inverse problem. Yet, it occurs very often within the academic community. To counteract, to some extent, overfitting problems caused by a small training set, we propose to use a consensus of six regression models for prediction of the biological activity of a virtual library of compounds. The QSAR descriptors of 22 compounds related to the opioid growth factor (OGF, Tyr-Gly-Gly-Phe-Met) with known antitumor activity were used to train regression models: the feed-forward artificial neural network, the k-nearest neighbor, sparseness constrained linear regression, the linear and nonlinear (with polynomial and Gaussian kernel) support vector machine. Regression models were applied on a virtual library of 429 compounds that resulted in six lists with candidate compounds ranked by predicted antitumor activity. The highly ranked candidate compounds were synthesized, characterized and tested for antiproliferative activity. Some of the prepared peptides showed more pronounced activity compared with the native OGF; however, they were less active than highly ranked compounds selected previously by the radial basis function support vector machine (RBF SVM) regression model. The ill-posedness of the related inverse problem causes unstable behavior of trained regression models on test data. These results point to the high complexity of prediction based on regression models trained on a small data sample. PMID:22272081

  3. Predicting antitumor activity of peptides by consensus of regression models trained on a small data sample.

    PubMed

    Radman, Andreja; Gredičak, Matija; Kopriva, Ivica; Jerić, Ivanka

    2011-01-01

    Predicting antitumor activity of compounds using regression models trained on a small number of compounds with measured biological activity is an ill-posed inverse problem. Yet, it occurs very often within the academic community. To counteract, to some extent, overfitting problems caused by a small training set, we propose to use a consensus of six regression models for prediction of the biological activity of a virtual library of compounds. The QSAR descriptors of 22 compounds related to the opioid growth factor (OGF, Tyr-Gly-Gly-Phe-Met) with known antitumor activity were used to train regression models: the feed-forward artificial neural network, the k-nearest neighbor, sparseness constrained linear regression, the linear and nonlinear (with polynomial and Gaussian kernel) support vector machine. Regression models were applied on a virtual library of 429 compounds that resulted in six lists with candidate compounds ranked by predicted antitumor activity. The highly ranked candidate compounds were synthesized, characterized and tested for antiproliferative activity. Some of the prepared peptides showed more pronounced activity compared with the native OGF; however, they were less active than highly ranked compounds selected previously by the radial basis function support vector machine (RBF SVM) regression model. The ill-posedness of the related inverse problem causes unstable behavior of trained regression models on test data. These results point to the high complexity of prediction based on regression models trained on a small data sample. PMID:22272081

  4. Basic Diagnosis and Prediction of Persistent Contrail Occurrence using High-resolution Numerical Weather Analyses/Forecasts and Logistic Regression. Part I: Effects of Random Error

    NASA Technical Reports Server (NTRS)

    Duda, David P.; Minnis, Patrick

    2009-01-01

    Straightforward application of the Schmidt-Appleman contrail formation criteria to diagnose persistent contrail occurrence from numerical weather prediction data is hindered by significant bias errors in the upper tropospheric humidity. Logistic models of contrail occurrence have been proposed to overcome this problem, but basic questions remain about how random measurement error may affect their accuracy. A set of 5000 synthetic contrail observations is created to study the effects of random error in these probabilistic models. The simulated observations are based on distributions of temperature, humidity, and vertical velocity derived from Advanced Regional Prediction System (ARPS) weather analyses. The logistic models created from the simulated observations were evaluated using two common statistical measures of model accuracy, the percent correct (PC) and the Hanssen-Kuipers discriminant (HKD). To convert the probabilistic results of the logistic models into a dichotomous yes/no choice suitable for the statistical measures, two critical probability thresholds are considered. The HKD scores are higher when the climatological frequency of contrail occurrence is used as the critical threshold, while the PC scores are higher when the critical probability threshold is 0.5. For both thresholds, typical random errors in temperature, relative humidity, and vertical velocity are found to be small enough to allow for accurate logistic models of contrail occurrence. The accuracy of the models developed from synthetic data is over 85 percent for both the prediction of contrail occurrence and non-occurrence, although in practice, larger errors would be anticipated.
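
    The two accuracy measures and the two critical thresholds described in this abstract can be illustrated with a short Python sketch. It is not the authors' code: the function names, the forecast probabilities and the observations are all hypothetical. The sketch assumes both outcomes occur at least once in the observations; otherwise the HKD is undefined.

```python
def contingency(probs, observed, threshold):
    """Turn probabilistic forecasts into yes/no forecasts and count outcomes.

    Returns (hits, misses, false_alarms, correct_negatives).
    """
    hits = misses = false_alarms = correct_negatives = 0
    for p, o in zip(probs, observed):
        forecast_yes = p >= threshold
        if forecast_yes and o == 1:
            hits += 1
        elif forecast_yes and o == 0:
            false_alarms += 1
        elif o == 1:
            misses += 1
        else:
            correct_negatives += 1
    return hits, misses, false_alarms, correct_negatives


def percent_correct(hits, misses, false_alarms, correct_negatives):
    """Percent correct (PC): share of all forecasts that were right."""
    total = hits + misses + false_alarms + correct_negatives
    return 100.0 * (hits + correct_negatives) / total


def hanssen_kuipers(hits, misses, false_alarms, correct_negatives):
    """Hanssen-Kuipers discriminant (HKD): hit rate minus false alarm rate."""
    hit_rate = hits / (hits + misses)
    false_alarm_rate = false_alarms / (false_alarms + correct_negatives)
    return hit_rate - false_alarm_rate


# Hypothetical forecast probabilities and contrail observations (1 = contrail).
probs = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.60, 0.70]
observed = [0, 0, 0, 0, 1, 0, 0, 1, 1, 0]
climatology = sum(observed) / len(observed)  # observed frequency of occurrence

for name, threshold in [("0.5", 0.5), ("climatological", climatology)]:
    table = contingency(probs, observed, threshold)
    print(f"threshold {name}: PC={percent_correct(*table):.1f}% "
          f"HKD={hanssen_kuipers(*table):.3f}")
```

    PC rewards the majority outcome when events are rare, while the HKD weights occurrence and non-occurrence equally, which is why the preferred threshold can differ between the two scores.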

  5. Effectiveness of Combining Statistical Tests and Effect Sizes When Using Logistic Discriminant Function Regression to Detect Differential Item Functioning for Polytomous Items

    ERIC Educational Resources Information Center

    Gómez-Benito, Juana; Hidalgo, Maria Dolores; Zumbo, Bruno D.

    2013-01-01

    The objective of this article was to find an optimal decision rule for identifying polytomous items with large or moderate amounts of differential functioning. The effectiveness of combining statistical tests with effect size measures was assessed using logistic discriminant function analysis and two effect size measures: R[superscript 2] and…

  6. Basic Diagnosis and Prediction of Persistent Contrail Occurrence using High-resolution Numerical Weather Analyses/Forecasts and Logistic Regression. Part II: Evaluation of Sample Models

    NASA Technical Reports Server (NTRS)

    Duda, David P.; Minnis, Patrick

    2009-01-01

    Previous studies have shown that probabilistic forecasting may be a useful method for predicting persistent contrail formation. A probabilistic forecast to accurately predict contrail formation over the contiguous United States (CONUS) is created by using meteorological data based on hourly meteorological analyses from the Advanced Regional Prediction System (ARPS) and from the Rapid Update Cycle (RUC) as well as GOES water vapor channel measurements, combined with surface and satellite observations of contrails. Two groups of logistic models were created. The first group of models (SURFACE models) is based on surface-based contrail observations supplemented with satellite observations of contrail occurrence. The second group of models (OUTBREAK models) is derived from a selected subgroup of satellite-based observations of widespread persistent contrails. The mean accuracies for both the SURFACE and OUTBREAK models typically exceeded 75 percent when based on the RUC or ARPS analysis data, but decreased when the logistic models were derived from ARPS forecast data.

  7. Multivariate Logistic Regression for Predicting Total Culturable Virus Presence at the Intake of a Potable-Water Treatment Plant: Novel Application of the Atypical Coliform/Total Coliform Ratio

    PubMed Central

    Black, L. E.; Brion, G. M.; Freitas, S. J.

    2007-01-01

    Predicting the presence of enteric viruses in surface waters is a complex modeling problem. Multiple water quality parameters that indicate the presence of human fecal material, the load of fecal material, and the amount of time fecal material has been in the environment are needed. This paper presents the results of a multiyear study of raw-water quality at the inlet of a potable-water plant that related 17 physical, chemical, and biological indices to the presence of enteric viruses as indicated by cytopathic changes in cell cultures. It was found that several simple, multivariate logistic regression models that could reliably identify observations of the presence or absence of total culturable virus could be fitted. The best models developed combined a fecal age indicator (the atypical coliform [AC]/total coliform [TC] ratio), the detectable presence of a human-associated sterol (epicoprostanol) to indicate the fecal source, and one of several fecal load indicators (the levels of Giardia species cysts, coliform bacteria, and coprostanol). The best fit to the data was found when the AC/TC ratio, the presence of epicoprostanol, and the density of fecal coliform bacteria were input into a simple, multivariate logistic regression equation, resulting in 84.5% and 78.6% accuracies for the identification of the presence and absence of total culturable virus, respectively. The AC/TC ratio was the most influential input variable in all of the models generated, but producing the best prediction required additional input related to the fecal source and the fecal load. The potential for replacing microbial indicators of fecal load with levels of coprostanol was proposed and evaluated by multivariate logistic regression modeling for the presence and absence of virus. PMID:17468270

  8. Impact of Colic Pain as a Significant Factor for Predicting the Stone Free Rate of One-Session Shock Wave Lithotripsy for Treating Ureter Stones: A Bayesian Logistic Regression Model Analysis

    PubMed Central

    Chung, Doo Yong; Cho, Kang Su; Lee, Dae Hun; Han, Jang Hee; Kang, Dong Hyuk; Jung, Hae Do; Kown, Jong Kyou; Ham, Won Sik; Choi, Young Deuk; Lee, Joo Yong

    2015-01-01

    Purpose This study was conducted to evaluate colic pain as a prognostic pretreatment factor that can influence ureter stone clearance and to estimate the probability of stone-free status in shock wave lithotripsy (SWL) patients with a ureter stone. Materials and Methods We retrospectively reviewed the medical records of 1,418 patients who underwent their first SWL between 2005 and 2013. Among these patients, 551 had a ureter stone measuring 4–20 mm and were thus eligible for our analyses. Colic pain as the chief complaint was defined as subjective flank pain during history taking and physical examination. Propensity scores for colic pain were calculated for each patient using multivariate logistic regression based upon the following covariates: age, maximal stone length (MSL), and mean stone density (MSD). Each factor was evaluated as a predictor of stone-free status by Bayesian and non-Bayesian logistic regression models. Results After propensity-score matching, 217 patients were extracted in each group from the total patient cohort. There were no statistical differences in the variables used in propensity-score matching. One-session success and stone-free rates were also higher in the painful group (73.7% and 71.0%, respectively) than in the painless group (63.6% and 60.4%, respectively). In multivariate non-Bayesian and Bayesian logistic regression models, a painful stone, shorter MSL, and lower MSD were significant factors for one-session stone-free status in patients who underwent SWL. Conclusions Colic pain in patients with ureter calculi was one of the significant predicting factors, along with MSL and MSD, for one-session stone-free status of SWL. PMID:25902059

  9. Completing the Remedial Sequence and College-Level Credit-Bearing Math: Comparing Binary, Cumulative, and Continuation Ratio Logistic Regression Models

    ERIC Educational Resources Information Center

    Davidson, J. Cody

    2016-01-01

    Mathematics is the most common subject area of remedial need and the majority of remedial math students never pass a college-level credit-bearing math class. The majority of studies that investigate this phenomenon are conducted at community colleges and use some type of regression model; however, none have used a continuation ratio model. The…

  10. Logistics planning for phased programs.

    NASA Technical Reports Server (NTRS)

    Cook, W. H.

    1973-01-01

    It is pointed out that the proper and early integration of logistics planning into the phased program planning process will drastically reduce these logistics costs. Phased project planning is a phased approach to the planning, approval, and conduct of major research and development activity. A progressive build-up of knowledge of all aspects of the program is provided. Elements of logistics are discussed together with aspects of integrated logistics support, logistics program planning, and logistics activities for phased programs. Continuing logistics support can only be assured if there is a comprehensive sequential listing of all logistics activities tied to the program schedule and a real-time inventory of assets.

  11. Logistic regression models to predict solvent accessible residues using sequence- and homology-based qualitative and quantitative descriptors applied to a domain-complete X-ray structure learning set

    PubMed Central

    Nepal, Reecha; Spencer, Joanna; Bhogal, Guneet; Nedunuri, Amulya; Poelman, Thomas; Kamath, Thejas; Chung, Edwin; Kantardjieff, Katherine; Gottlieb, Andrea; Lustig, Brooke

    2015-01-01

    A working example of relative solvent accessibility (RSA) prediction for proteins is presented. Novel logistic regression models with various qualitative descriptors that include amino acid type and quantitative descriptors that include 20- and six-term sequence entropy have been built and validated. A domain-complete learning set of over 1300 proteins is used to fit initial models with various sequence homology descriptors as well as query residue qualitative descriptors. Homology descriptors are derived from BLASTp sequence alignments, whereas the RSA values are determined directly from the crystal structure. The logistic regression models are fitted using dichotomous responses indicating buried or accessible solvent, with binary classifications obtained from the RSA values. The fitted models determine binary predictions of residue solvent accessibility with accuracies comparable to other less computationally intensive methods using the standard RSA threshold criteria 20 and 25% as solvent accessible. When an additional non-homology descriptor describing Lobanov–Galzitskaya residue disorder propensity is included, incremental improvements in accuracy are achieved with 25% threshold accuracies of 76.12 and 74.79% for the Manesh-215 and CASP(8+9) test sets, respectively. Moreover, the described software and the accompanying learning and validation sets allow students and researchers to explore the utility of RSA prediction with simple, physically intuitive models in any number of related applications. PMID:26664348

  12. KSC ISS Logistics Support

    NASA Technical Reports Server (NTRS)

    Tellado, Joseph

    2014-01-01

    The presentation contains a status of KSC ISS Logistics Operations. It presents current top-level ISS Logistics tasks being conducted at KSC, current International Partner activities, hardware processing flow focusing on late Stow operations, a list of KSC Logistics POCs, and a backup list of Logistics launch site services. This presentation is being given at the annual International Space Station (ISS) Multi-lateral Logistics Maintenance Control Panel meeting to be held in Turin, Italy during the week of May 13-16. The presentation content doesn't contain any potential lessons learned.

  13. The management challenge for household waste in emerging economies like Brazil: realistic source separation and activation of reverse logistics.

    PubMed

    Fehr, M

    2014-09-01

    Business opportunities in the household waste sector in emerging economies still revolve around the activities of bulk collection and tipping with an open material balance. This research, conducted in Brazil, pursued the objective of shifting opportunities from tipping to reverse logistics in order to close the balance. To do this, it illustrated how specific knowledge of sorted waste composition and reverse logistics operations can be used to determine realistic temporal and quantitative landfill diversion targets in an emerging economy context. Experimentation constructed and confirmed the recycling trilogy that consists of source separation, collection infrastructure and reverse logistics. The study on source separation demonstrated the vital difference between raw and sorted waste compositions. Raw waste contained 70% biodegradable and 30% inert matter. Source separation produced 47% biodegradable, 20% inert and 33% mixed material. The study on collection infrastructure developed the necessary receiving facilities. The study on reverse logistics identified private operators capable of collecting and processing all separated inert items. Recycling activities for biodegradable material were scarce and erratic. Only farmers would take the material as animal feed. No composting initiatives existed. The management challenge was identified as stimulating these activities in order to complete the trilogy and divert the 47% source-separated biodegradable discards from the landfills. PMID:24990590

  14. Continuous Melatonin Attenuates the Regressing Activities of Short Photoperiod in Male Golden Hamsters

    PubMed Central

    Choi, Donchan

    2013-01-01

    Golden hamsters reproduce during a limited period of the year. They are sexually active in summer but inactive in winter, when day length does not exceed night length and environmental conditions are harsh. Reproductive activity is determined by the length of light in a day (photoperiod). Melatonin is synthesized and secreted from the pineal gland only at night. The duration of elevated melatonin is longer in winter than in summer, resulting in gonadal regression. The present study examined the influence of continuous melatonin treatment on gonadal function in male golden hamsters. Animals received empty or melatonin-filled capsules for 10 weeks. They were divided into long photoperiod (LP) and short photoperiod (SP) groups. All the animals maintained in LP (with either empty or melatonin-filled capsules) showed large testes, implying that melatonin had no effect on testicular function. Animals housed in SP displayed completely regressed testes, but animals kept in SP and implanted with melatonin capsules did not undergo the full regression induced by SP. These results suggest that constant release of melatonin blocks the regressing influence of SP. PMID:25949127

  15. 44 CFR 208.38 - Reimbursement for re-supply and logistics costs incurred during Activation.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 44 Emergency Management and Assistance 1 2011-10-01 2011-10-01 false Reimbursement for re-supply... URBAN SEARCH AND RESCUE RESPONSE SYSTEM Response Cooperative Agreements § 208.38 Reimbursement for re... this subpart, DHS will not reimburse costs incurred for re-supply and logistical support...

  16. A Latent Transition Model with Logistic Regression

    ERIC Educational Resources Information Center

    Chung, Hwan; Walls, Theodore A.; Park, Yousung

    2007-01-01

    Latent transition models increasingly include covariates that predict prevalence of latent classes at a given time or transition rates among classes over time. In many situations, the covariate of interest may be latent. This paper describes an approach for handling both manifest and latent covariates in a latent transition model. A Bayesian…

  17. Restriction of Replication Fork Regression Activities by a Conserved SMC Complex

    PubMed Central

    Xue, Xiaoyu; Choi, Koyi; Bonner, Jaclyn; Chiba, Tamara; Kwon, Youngho; Xu, Yuanyuan; Sanchez, Humberto; Wyman, Claire; Niu, Hengyao; Zhao, Xiaolan; Sung, Patrick

    2014-01-01

    SUMMARY Conserved, multi-tasking DNA helicases mediate diverse DNA transactions and are relevant for human disease pathogenesis. These helicases and their regulation help maintain genome stability during DNA replication and repair. We show that the structural maintenance of chromosome complex Smc5-Smc6 restrains the replication fork regression activity of Mph1 helicase, but not its D-loop disruptive activity. This regulatory mechanism enables flexibility in replication fork repair without interfering with DNA break repair. In vitro studies find that Smc5-Smc6 binds to a Mph1 region required for efficient fork regression, preventing assembly of Mph1 oligomers at the junction of DNA forks. In vivo impairment of this regulatory mechanism compensates for the inactivation of another fork regression helicase and increases reliance on joint DNA structure removal or avoidance. Our findings provide molecular insights into replication fork repair regulation and uncover a role of Smc5-Smc6 in directing Mph1 activity towards a specific biochemical outcome. PMID:25439736

  18. Intermittent Metronomic Drug Schedule Is Essential for Activating Antitumor Innate Immunity and Tumor Xenograft Regression

    PubMed Central

    Chen, Chong-Sheng; Doloff, Joshua C; Waxman, David J

    2014-01-01

    Metronomic chemotherapy using cyclophosphamide (CPA) is widely associated with antiangiogenesis; however, recent studies implicate other immune-based mechanisms, including antitumor innate immunity, which can induce major tumor regression in implanted brain tumor models. This study demonstrates the critical importance of drug schedule: CPA induced a potent antitumor innate immune response and tumor regression when administered intermittently on a 6-day repeating metronomic schedule but not with the same total exposure to activated CPA administered on an every 3-day schedule or using a daily oral regimen that serves as the basis for many clinical trials of metronomic chemotherapy. Notably, the more frequent metronomic CPA schedules abrogated the antitumor innate immune and therapeutic responses. Further, the innate immune response and antitumor activity both displayed an unusually steep dose-response curve and were not accompanied by antiangiogenesis. The strong recruitment of innate immune cells by the 6-day repeating CPA schedule was not sustained, and tumor regression was abolished, by a moderate (25%) reduction in CPA dose. Moreover, an ∼20% increase in CPA dose eliminated the partial tumor regression and weak innate immune cell recruitment seen in a subset of the every 6-day treated tumors. Thus, metronomic drug treatment must be at a sufficiently high dose but also sufficiently well spaced in time to induce strong sustained antitumor immune cell recruitment. Many current clinical metronomic chemotherapeutic protocols employ oral daily low-dose schedules that do not meet these requirements, suggesting that they may benefit from optimization designed to maximize antitumor immune responses. PMID:24563621

  19. Gender differences in French GPs' activity: the contribution of quantile regressions.

    PubMed

    Dumontet, Magali; Franc, Carine

    2015-05-01

    In any fee-for-service system, doctors may be encouraged to increase the number of services (private activity) they provide to receive a higher income. Studying private activity determinants helps to predict doctors' provision of care. In the context of strong feminization and heterogeneity in general practitioners' (GP) behavior, we first aim to measure the effects of the determinants of private activity. Second, we study the evolution of these effects along the private activity distribution. Third, we examine the differences between male and female GPs. From an exhaustive database of French GPs working in private practice in 2008, we performed an ordinary least squares (OLS) regression and quantile regressions (QR) on the GPs' private activity. Among other determinants, we examined the trade-offs within the GPs' household considering his/her marital status, spousal income, and children. While the OLS results showed that female GPs had less private activity than male GPs (-13%), the QR results emphasized a private activity gender gap that increased significantly in the upper tail of the distribution. We also find gender differences in the private activity determinants, including family structure, practice characteristics, and case-mix variables. For instance, having a youngest child under 12 years old had a positive effect on the level of private activity for male GPs and a negative effect for female GPs. The results allow us to understand to what extent the supply of care differs between male and female GPs. In the context of strong feminization, this is essential to consider for organizing and forecasting the GPs' supply of care. PMID:24700186
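
    The contrast between OLS and quantile regression drawn in this abstract rests on the loss function: OLS targets the conditional mean, while quantile regression at level tau minimizes the check (pinball) loss. The sketch below shows only the intercept-only case, where the minimizer is a sample quantile; the activity values are hypothetical and the function names are illustrative.

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss minimized by quantile regression at level tau."""
    return tau * residual if residual >= 0 else (tau - 1) * residual


def best_constant(values, tau):
    """Return the observed value that minimizes total pinball loss."""
    return min(values, key=lambda c: sum(pinball_loss(v - c, tau) for v in values))


# Hypothetical activity levels (not data from the study), right-skewed on purpose.
activity = [10, 12, 13, 15, 18, 22, 40, 90, 100]
mean = sum(activity) / len(activity)       # what an intercept-only OLS fit returns
median_fit = best_constant(activity, 0.5)  # tau = 0.5 gives the median
upper_fit = best_constant(activity, 0.75)  # tau = 0.75 looks at the upper tail
```

    With skewed data the mean sits well above the median, and moving tau upward shifts the fit toward the upper tail, which is the part of the distribution where the abstract reports the widest gender gap.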

  20. Artificial neural network and multiple regression model for nickel(II) adsorption on powdered activated carbons.

    PubMed

    Hema, M; Srinivasan, K

    2011-07-01

    Nickel removal efficiency of powdered activated carbons of coconut oilcake, neem oilcake and commercial carbon was investigated by using an artificial neural network (ANN). The effective parameters for the removal of nickel (%R) by the adsorption process, which included the pH, contact time (T), distinctiveness of activated carbon (Cn), amount of activated carbon (Cw) and initial concentration of nickel (Co), were investigated. The Levenberg-Marquardt (LM) back-propagation algorithm was used to train the network. The network topology was optimized by varying the number of hidden layers and the number of neurons in the hidden layer. The model was developed using training, validation and testing subsets containing 60%, 20% and 20% of the total experimental data, respectively. A multiple regression equation was developed for the nickel adsorption system and its output was compared with both simulated and experimental outputs. The standard deviation (SD) with respect to the experimental output was considerably higher for the regression model than for the ANN model. The experimental data were best fitted by the artificial neural network. PMID:23029923

  1. Quantitative structure-activity relationship of the curcumin-related compounds using various regression methods

    NASA Astrophysics Data System (ADS)

    Khazaei, Ardeshir; Sarmasti, Negin; Seyf, Jaber Yousefi

    2016-03-01

    Quantitative structure-activity relationship (QSAR) analysis was used to study a series of curcumin-related compounds with inhibitory effect on prostate cancer PC-3 cells, pancreas cancer Panc-1 cells, and colon cancer HT-29 cells. The sphere exclusion method was used to split the data set into training and test sets. Multiple linear regression, principal component regression and partial least squares were used as the regression methods. In addition, to investigate the effect of feature selection methods, stepwise selection, a genetic algorithm, and simulated annealing were used. In two cases (PC-3 cells and Panc-1 cells), the best models were generated by a combination of multiple linear regression and stepwise selection (PC-3 cells: r2 = 0.86, q2 = 0.82, pred_r2 = 0.93, and r2m (test) = 0.43; Panc-1 cells: r2 = 0.85, q2 = 0.80, pred_r2 = 0.71, and r2m (test) = 0.68). For the HT-29 cells, principal component regression with stepwise selection (r2 = 0.69, q2 = 0.62, pred_r2 = 0.54, and r2m (test) = 0.41) is the best method. The QSAR study reveals descriptors that have a crucial role in the inhibitory property of curcumin-like compounds. 6ChainCount, T_C_C_1, and T_O_O_7 are the most important descriptors and have the greatest effect. To design and optimize novel, efficient curcumin-related compounds, it is useful to introduce heteroatoms such as nitrogen, oxygen, and sulfur in the chemical structure (reducing the contribution of the T_C_C_1 descriptor) and to increase the contribution of the 6ChainCount and T_O_O_7 descriptors. The models can be useful in the better design of novel curcumin-related compounds that can be used in the treatment of prostate, pancreas, and colon cancers.
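
    The r2 and q2 statistics quoted in this abstract differ in how predictions are made: r2 uses a model fitted to all training compounds, whereas q2 is usually a leave-one-out cross-validated analogue. The sketch below assumes a single descriptor and the common definition q2 = 1 - PRESS/SST; the paper's exact definitions are not given in the abstract, and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return my - b * mx, b


def r_squared(xs, ys):
    """Fitted r^2: 1 - SSE/SST using a model fitted to all points."""
    a, b = fit_line(xs, ys)
    my = sum(ys) / len(ys)
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - my) ** 2 for y in ys)
    return 1 - sse / sst


def q_squared_loo(xs, ys):
    """Leave-one-out q^2: each point is predicted by a model fitted without it."""
    my = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    sst = sum((y - my) ** 2 for y in ys)
    return 1 - press / sst


# Hypothetical descriptor values and activities (not data from the study).
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
activity = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2]
r2 = r_squared(descriptor, activity)
q2 = q_squared_loo(descriptor, activity)
```

    For least-squares fits q2 cannot exceed r2, so a large gap between the two is a common warning sign of overfitting.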

  2. Logistic and linear regression model documentation for statistical relations between continuous real-time and discrete water-quality constituents in the Kansas River, Kansas, July 2012 through June 2015

    USGS Publications Warehouse

    Foster, Guy M.; Graham, Jennifer L.

    2016-01-01

    The Kansas River is a primary source of drinking water for about 800,000 people in northeastern Kansas. Source-water supplies are treated by a combination of chemical and physical processes to remove contaminants before distribution. Advanced notification of changing water-quality conditions and cyanobacteria and associated toxin and taste-and-odor compounds provides drinking-water treatment facilities time to develop and implement adequate treatment strategies. The U.S. Geological Survey (USGS), in cooperation with the Kansas Water Office (funded in part through the Kansas State Water Plan Fund), and the City of Lawrence, the City of Topeka, the City of Olathe, and Johnson County Water One, began a study in July 2012 to develop statistical models at two Kansas River sites located upstream from drinking-water intakes. Continuous water-quality monitors have been operated and discrete-water quality samples have been collected on the Kansas River at Wamego (USGS site number 06887500) and De Soto (USGS site number 06892350) since July 2012. Continuous and discrete water-quality data collected during July 2012 through June 2015 were used to develop statistical models for constituents of interest at the Wamego and De Soto sites. Logistic models to continuously estimate the probability of occurrence above selected thresholds were developed for cyanobacteria, microcystin, and geosmin. Linear regression models to continuously estimate constituent concentrations were developed for major ions, dissolved solids, alkalinity, nutrients (nitrogen and phosphorus species), suspended sediment, indicator bacteria (Escherichia coli, fecal coliform, and enterococci), and actinomycetes bacteria. These models will be used to provide real-time estimates of the probability that cyanobacteria and associated compounds exceed thresholds and of the concentrations of other water-quality constituents in the Kansas River. 
The models documented in this report are useful for characterizing changes
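
    A logistic model of the kind described here turns a continuously measured surrogate into a probability that a constituent exceeds its threshold. The sketch below is a generic illustration only: the single explanatory variable, the coefficient values, and the readings are hypothetical and are not taken from the report.

```python
import math


def exceedance_probability(sensor_value, intercept, slope):
    """Probability that a constituent exceeds its threshold, from a logistic model."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * sensor_value)))


# Hypothetical coefficients and surrogate readings, for illustration only.
INTERCEPT, SLOPE = -4.0, 0.08
readings = [10.0, 50.0, 90.0]
probabilities = [exceedance_probability(r, INTERCEPT, SLOPE) for r in readings]
```

    With these assumed coefficients the probability crosses 0.5 at a reading of 50, the point where the linear predictor equals zero.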

  3. Inferring Active and Prognostic Ligand-Receptor Pairs with Interactions in Survival Regression Models

    PubMed Central

    Ruggeri, Christina; Eng, Kevin H

    2014-01-01

    Modeling signal transduction in cancer cells has implications for targeting new therapies and inferring the mechanisms that improve or threaten a patient’s treatment response. For transcriptome-wide studies, it has been proposed that simple correlation between a ligand and receptor pair implies a relationship to the disease process. Statistically, a differential correlation (DC) analysis across groups stratified by prognosis can link the pair to clinical outcomes. While the prognostic effect and the apparent change in correlation are both biological consequences of activation of the signaling mechanism, a correlation-driven analysis does not clearly capture this assumption and makes inefficient use of continuous survival phenotypes. To augment the correlation hypothesis, we propose that a regression framework assuming a patient-specific, latent level of signaling activation exists and generates both prognosis and correlation. Data from these systems can be inferred via interaction terms in survival regression models allowing signal transduction models beyond one pair at a time and adjusting for other factors. We illustrate the use of this model on ovarian cancer data from the Cancer Genome Atlas (TCGA) and discuss how the finding may be used to develop markers to guide targeted molecular therapies. PMID:25657571
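
    One way to write the interaction idea is a proportional hazards model with a ligand-by-receptor product term. The abstract only says "survival regression models", so the Cox form and all symbols below (L and R for ligand and receptor expression, z for other adjustment factors) are assumptions made for illustration.

```latex
h(t \mid L, R, z) = h_0(t)\,
  \exp\!\bigl(\beta_L L + \beta_R R + \beta_{LR}\, L R + \gamma^{\top} z\bigr)
```

    Under this form, a nonzero interaction coefficient means the prognostic effect of the ligand depends on the receptor level, which is the kind of joint signal that a pairwise correlation analysis does not test directly.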

  4. Correlation and simple linear regression.

    PubMed

    Eberly, Lynn E

    2007-01-01

    This chapter highlights important steps in using correlation and simple linear regression to address scientific questions about the association of two continuous variables with each other. These steps include estimation and inference, assessing model fit, the connection between regression and ANOVA, and study design. Examples in microbiology are used throughout. This chapter provides a framework that is helpful in understanding more complex statistical techniques, such as multiple linear regression, linear mixed effects models, logistic regression, and proportional hazards regression. PMID:18450049
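
    The link between correlation and simple linear regression mentioned in this abstract can be checked numerically: the least-squares slope equals Pearson's r multiplied by the ratio of the standard deviations of y and x. The paired measurements below are hypothetical, and the helper name is illustrative.

```python
import math


def summarize(xs, ys):
    """Return Pearson r, the least-squares intercept and slope, and sd(y)/sd(x)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = my - slope * mx
    return r, intercept, slope, math.sqrt(syy / sxx)


# Hypothetical paired measurements (not taken from the chapter).
hours = [0.0, 1.0, 2.0, 3.0, 4.0]
log_count = [2.1, 2.9, 4.2, 4.8, 6.0]
r, intercept, slope, sd_ratio = summarize(hours, log_count)
```

    Because the slope is just r rescaled, testing whether the slope is zero and testing whether the correlation is zero address the same question in the simple two-variable case.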

  5. Insights into antioxidant activity of 1-adamantylthiopyridine analogs using multiple linear regression.

    PubMed

    Worachartcheewan, Apilak; Nantasenamat, Chanin; Owasirikul, Wiwat; Monnor, Teerawat; Naruepantawart, Orapan; Janyapaisarn, Sayamon; Prachayasittikul, Supaluk; Prachayasittikul, Virapong

    2014-02-12

    A data set of 1-adamantylthiopyridine analogs (1-19) with antioxidant activity, comprising 2,2-diphenyl-1-picrylhydrazyl (DPPH) and superoxide dismutase (SOD) activities, was used for constructing quantitative structure-activity relationship (QSAR) models. Molecular structures were geometrically optimized at the B3LYP/6-31g(d) level and subjected to further molecular descriptor calculation using Dragon software. Multiple linear regression (MLR) was employed for the development of QSAR models using 3 significant descriptors (i.e. Mor29e, F04[N-N] and GATS5v) for predicting the DPPH activity and 2 essential descriptors (i.e. EEig06r and Mor06v) for predicting the SOD activity. Such molecular descriptors accounted for the effects and positions of substituent groups (R) on the 1-adamantylthiopyridine ring. The results showed that high atomic electronegativity of a polar substituent group (R = CO2H) afforded high DPPH activity, while substituents with high atomic van der Waals volumes such as R = Br gave high SOD activity. Leave-one-out cross-validation (LOO-CV) and an external test set were used for model validation. The correlation coefficient (QCV) and root mean squared error (RMSECV) of the LOO-CV set for predicting DPPH activity were 0.5784 and 8.3440, respectively, while QExt and RMSEExt of the external test set corresponded to 0.7353 and 4.2721, respectively. Furthermore, the QCV and RMSECV values of the LOO-CV set for predicting SOD activity were 0.7549 and 5.6380, respectively. The QSAR model's equation was then used in predicting the SOD activity of tested compounds, and these were subsequently verified experimentally. It was observed that the experimental activity was more potent than the predicted activity. Structure-activity relationships of significant descriptors governing antioxidant activity are also discussed. The QSAR models investigated herein are anticipated to be useful in the rational design and development of novel compounds with antioxidant activity. PMID

  6. Estrogen receptor α inhibitor activates the unfolded protein response, blocks protein synthesis, and induces tumor regression.

    PubMed

    Andruska, Neal D; Zheng, Xiaobin; Yang, Xujuan; Mao, Chengjian; Cherian, Mathew M; Mahapatra, Lily; Helferich, William G; Shapiro, David J

    2015-04-14

    Recurrent estrogen receptor α (ERα)-positive breast and ovarian cancers are often therapy resistant. Using screening and functional validation, we identified BHPI, a potent noncompetitive small molecule ERα biomodulator that selectively blocks proliferation of drug-resistant ERα-positive breast and ovarian cancer cells. In a mouse xenograft model of breast cancer, BHPI induced rapid and substantial tumor regression. Whereas BHPI potently inhibits nuclear estrogen-ERα-regulated gene expression, BHPI is effective because it elicits sustained ERα-dependent activation of the endoplasmic reticulum (EnR) stress sensor, the unfolded protein response (UPR), and persistent inhibition of protein synthesis. BHPI distorts a newly described action of estrogen-ERα: mild and transient UPR activation. In contrast, BHPI elicits massive and sustained UPR activation, converting the UPR from protective to toxic. In ERα(+) cancer cells, BHPI rapidly hyperactivates plasma membrane PLCγ, generating inositol 1,4,5-triphosphate (IP3), which opens EnR IP3R calcium channels, rapidly depleting EnR Ca(2+) stores. This leads to activation of all three arms of the UPR. Activation of the PERK arm stimulates phosphorylation of eukaryotic initiation factor 2α (eIF2α), resulting in rapid inhibition of protein synthesis. The cell attempts to restore EnR Ca(2+) levels, but the open EnR IP3R calcium channel leads to an ATP-depleting futile cycle, resulting in activation of the energy sensor AMP-activated protein kinase and phosphorylation of eukaryotic elongation factor 2 (eEF2). eEF2 phosphorylation inhibits protein synthesis at a second site. BHPI's novel mode of action, high potency, and effectiveness in therapy-resistant tumor cells make it an exceptional candidate for further mechanistic and therapeutic exploration. PMID:25825714

  7. Data mining methods in the prediction of Dementia: A real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests

    PubMed Central

    2011-01-01

    Background Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI), but has presently a limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning methods like Neural Networks, Support Vector Machines and Random Forests can improve accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven nonparametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, area under the ROC curve and Press' Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using Friedman's nonparametric test. Results Press' Q test showed that all classifiers performed better than chance alone (p < 0.05). Support Vector Machines showed the largest overall classification accuracy (Median (Me) = 0.76) and area under the ROC (Me = 0.90). However, this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forest ranked second in overall accuracy (Me = 0.73) with high area under the ROC (Me = 0.73), specificity (Me = 0.73) and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with acceptable area under the ROC (Me = 0.72), specificity (Me = 0.66) and sensitivity (Me = 0.64). 
The remaining classifiers showed
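
    The three headline metrics compared in this abstract come straight from the confusion matrix. The sketch below uses hypothetical labels (1 = progressed to dementia, 0 = did not) chosen to reproduce the pattern reported for Support Vector Machines: perfect specificity alongside low sensitivity.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # share of true positives that are detected
        "specificity": tn / (tn + fp),  # share of true negatives correctly cleared
    }


# Hypothetical labels: 10 positives (3 detected) and 30 negatives (all correct).
y_true = [1] * 10 + [0] * 30
y_pred = [1] * 3 + [0] * 7 + [0] * 30
metrics = classification_metrics(y_true, y_pred)
```

    Here accuracy is 0.825 even though 7 of the 10 positive cases are missed, which is why accuracy alone can flatter a classifier when the classes are imbalanced.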

  8. Transcriptome Analysis of Spermatogenically Regressed, Recrudescent and Active Phase Testis of Seasonally Breeding Wall Lizards Hemidactylus flaviviridis

    PubMed Central

    Gautam, Mukesh; Mathur, Amitabh; Khan, Meraj Alam; Majumdar, Subeer S.; Rai, Umesh

    2013-01-01

    Background Reptiles are a phylogenetically important group of organisms, as mammals have evolved from them. The wall lizard testis exhibits clearly distinct morphology during the various phases of the reproductive cycle, making it an interesting model to study the regulation of spermatogenesis. Studies on reptile spermatogenesis are scarce; hence this study will prove to be an important resource. Methodology/Principal Findings Histological analyses show complete regression of seminiferous tubules during the regressed phase, with retracted Sertoli cells and spermatogonia. In the recrudescent phase, the regressed testis regains cellular activity, showing the presence of normal Sertoli cells and developing germ cells. In the active phase, the testis reaches its maximum size, with enlarged seminiferous tubules and the presence of sperm in the seminiferous lumen. Total RNA extracted from whole testis of regressed, recrudescent and active phase wall lizards was hybridized on a Mouse Whole Genome 8×60 K format gene chip. Microarray data from the regressed phase were deemed the control group. Microarray data were validated by assessing the expression of some selected genes using Quantitative Real-Time PCR. The genes prominently expressed in recrudescent and active phase testis are associated with cytoskeleton organization (GO:0005856), cell growth (GO:0045927), GTPase regulator activity (GO:0030695), transcription (GO:0006352), apoptosis (GO:0006915) and many other biological processes. The genes showing higher expression in the regressed phase belonged to functional categories such as negative regulation of macromolecule metabolic process (GO:0010605), negative regulation of gene expression (GO:0010629) and maintenance of stem cell niche (GO:0045165). Conclusion/Significance This is the first exploratory study profiling the transcriptome of three drastically different conditions of any reptilian testis. The genes expressed in the testis during regressed, recrudescent and active phase of reproductive cycle are in concordance with the testis

  9. The logistics of choice.

    PubMed

    Killeen, Peter R

    2015-07-01

    The generalized matching law (GML) is reconstructed as a logistic regression equation that privileges no particular value of the sensitivity parameter, a. That value will often approach 1 due to the feedback that drives switching that is intrinsic to most concurrent schedules. A model of that feedback reproduced some features of concurrent data. The GML is a law only in the strained sense that any equation that maps data is a law. The machine under the hood of matching is in all likelihood the very law that was displaced by the Matching Law. It is now time to return the Law of Effect to centrality in our science. PMID:25988932
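
    The algebra behind reading the generalized matching law as a logistic equation can be sketched as follows. The power-ratio form B1/B2 = b(R1/R2)^a is the usual statement of the GML; writing the choice proportion as a logistic function of the log reinforcement ratio is an identity that follows from it. The parameter names follow the abstract's sensitivity a, while the bias symbol b and the function names are assumptions.

```python
import math


def gml_ratio(r1, r2, a, b):
    """Generalized matching law: response ratio B1/B2 = b * (R1/R2)**a."""
    return b * (r1 / r2) ** a


def gml_choice_proportion(r1, r2, a, b):
    """Same law as a logistic function of the log reinforcement ratio."""
    logit = math.log(b) + a * math.log(r1 / r2)
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical reinforcement rates; a = 1 and b = 1 give strict matching.
ratio = gml_ratio(3.0, 1.0, a=1.0, b=1.0)
proportion = gml_choice_proportion(3.0, 1.0, a=1.0, b=1.0)
```

    The two forms agree because proportion = ratio / (1 + ratio), so the sensitivity a plays the role of a logistic regression slope and log b the role of the intercept.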

  10. A Bayesian Nonlinear Mixed-Effects Regression Model for the Characterization of Early Bactericidal Activity of Tuberculosis Drugs

    PubMed Central

    Burger, Divan Aristo; Schall, Robert

    2015-01-01

    Trials of the early bactericidal activity (EBA) of tuberculosis (TB) treatments assess the decline, during the first few days to weeks of treatment, in colony forming unit (CFU) count of Mycobacterium tuberculosis in the sputum of patients with smear-microscopy-positive pulmonary TB. Profiles over time of CFU data have conventionally been modeled using linear, bilinear, or bi-exponential regression. We propose a new biphasic nonlinear regression model for CFU data that comprises linear and bilinear regression models as special cases and is more flexible than bi-exponential regression models. A Bayesian nonlinear mixed-effects (NLME) regression model is fitted jointly to the data of all patients from a trial, and statistical inference about the mean EBA of TB treatments is based on the Bayesian NLME regression model. The posterior predictive distribution of relevant slope parameters of the Bayesian NLME regression model provides insight into the nature of the EBA of TB treatments; specifically, the posterior predictive distribution allows one to judge whether treatments are associated with monolinear or bilinear decline of log(CFU) count, and whether CFU count initially decreases fast, followed by a slower rate of decrease, or vice versa. PMID:25322214

  11. Comparison and applicability of landslide susceptibility models based on landslide ratio-based logistic regression, frequency ratio, weight of evidence, and instability index methods in an extreme rainfall event

    NASA Astrophysics Data System (ADS)

    Wu, Chunhung

    2016-04-01

    Few studies have discussed the applicability of statistical landslide susceptibility (LS) models to extreme rainfall-induced landslide events. This research focuses on the comparison and applicability of LS models based on four methods, namely the landslide ratio-based logistic regression (LRBLR), frequency ratio (FR), weight of evidence (WOE), and instability index (II) methods, in an extreme rainfall-induced landslide case. The landslide inventory of the Chishan river watershed, Southwestern Taiwan, after 2009 Typhoon Morakot is the main material in this research. The Chishan river watershed is a tributary watershed of the Kaoping river watershed, which is a landslide- and erosion-prone watershed with an annual average suspended load of 3.6×10^7 MT/yr (ranking 11th in the world). Typhoon Morakot struck Southern Taiwan from Aug. 6-10 in 2009 and dumped nearly 2,000 mm of rainfall in the Chishan river watershed. The 24-hour, 48-hour, and 72-hour accumulated rainfall in the Chishan river watershed exceeded the 200-year return period accumulated rainfall. A total of 2,389 landslide polygons in the Chishan river watershed were extracted from SPOT 5 images after 2009 Typhoon Morakot. The total landslide area is around 33.5 km², equal to a landslide ratio of 4.1%. The main landslide types based on Varnes' (1978) classification are rotational and translational slides. The two characteristics of an extreme rainfall-induced landslide event are dense landslide distribution and large occupation of downslope landslide areas owing to headward erosion and bank erosion in the flooding processes. The area of downslope landslides in the Chishan river watershed after 2009 Typhoon Morakot is 3.2 times higher than that of upslope landslide areas. The prediction accuracy of LS models based on the LRBLR, FR, WOE, and II methods has been shown to exceed 70%. The model performance and applicability of four models in a landslide-prone watershed with dense distribution of rainfall
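
    Of the four methods compared, the frequency ratio is the simplest to state: for each class of a causative factor, it divides the class's share of landslide cells by its share of all cells. The sketch below uses this common definition with hypothetical slope classes and cell counts; the abstract does not give the factor classes or the exact formula the author applied.

```python
def frequency_ratio(landslide_cells, total_cells):
    """FR per class = (share of landslide cells in class) / (share of all cells in class)."""
    all_landslide = sum(landslide_cells.values())
    all_cells = sum(total_cells.values())
    return {
        cls: (landslide_cells[cls] / all_landslide) / (total_cells[cls] / all_cells)
        for cls in total_cells
    }


# Hypothetical slope classes and raster cell counts (not data from the study).
total = {"0-15 deg": 500, "15-30 deg": 300, "30+ deg": 200}
slides = {"0-15 deg": 5, "15-30 deg": 15, "30+ deg": 30}
fr = frequency_ratio(slides, total)

# Overall landslide ratio, analogous to the 4.1% reported for the watershed.
landslide_ratio = sum(slides.values()) / sum(total.values())
```

    An FR above 1 marks a class where landslides are over-represented relative to its area, and an FR below 1 marks the opposite; summing the FR values of a cell's classes across factors is one common way to build a susceptibility index.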

  12. Prediction of kinase inhibitor response using activity profiling, in vitro screening, and elastic net regression

    PubMed Central

    2014-01-01

    Background: Many kinase inhibitors have been approved as cancer therapies. Recently, libraries of kinase inhibitors have been extensively profiled, thus providing a map of the strength of action of each compound on a large number of its targets. These profiled libraries define drug-kinase networks that can predict the effectiveness of untested drugs and elucidate the roles of specific kinases in different cellular systems. Predictions of drug effectiveness based on a comprehensive network model of cellular signalling are difficult, due to our partial knowledge of the complex biological processes downstream of the targeted kinases. Results: We have developed the Kinase Inhibitors Elastic Net (KIEN) method, which integrates information contained in drug-kinase networks with in vitro screening. The method uses the in vitro cell response of single drugs and drug pair combinations as a training set to build linear and nonlinear regression models. Besides predicting the effectiveness of untested drugs, the KIEN method identifies sets of kinases that are statistically associated with drug sensitivity in a given cell line. We compared different versions of the method, which is based on a regression technique known as elastic net. Data from two-drug combinations led to predictive models, and we found that predictivity can be improved by applying a logarithmic transformation to the data. The method was applied to the A549 lung cancer cell line, and we identified specific kinases known to have an important role in this type of cancer (TGFBR2, EGFR, PHKG1 and CDK4). A pathway enrichment analysis of the set of kinases identified by the method showed that the axon guidance, activation of Rac, and semaphorin interactions pathways are associated with a selective response to therapeutic intervention in this cell line. Conclusions: We have proposed an integrated experimental and computational methodology, called KIEN, that identifies the role of specific kinases in the drug response of a given

  13. Biomass Logistics

    SciTech Connect

    J. Richard Hess; Kevin L. Kenney; William A. Smith; Ian Bonner; David J. Muth

    2015-04-01

    Equipment manufacturers have made rapid improvements in biomass harvesting and handling equipment. These improvements have increased transportation and handling efficiencies due to higher biomass densities and reduced losses. Improvements in grinder efficiencies and capacity have reduced biomass grinding costs. Biomass collection efficiencies (the ratio of biomass collected to the amount available in the field) as high as 75% for crop residues and greater than 90% for perennial energy crops have also been demonstrated. However, as collection rates increase, the fraction of entrained soil in the biomass increases, and high biomass residue removal rates can violate agronomic sustainability limits. Advancements in quantifying multi-factor sustainability limits to increase removal rates as guided by sustainable residue removal plans, and in mitigating soil contamination through targeted removal rates based on soil type and residue type/fraction, are allowing the use of new high-efficiency harvesting equipment and methods. As another consideration, single-pass harvesting and other technologies that improve harvesting costs cause biomass storage moisture management challenges, which are further compounded by annual variability in biomass moisture content. Monitoring, sampling, simulation, and analysis provide a basis for moisture, time, and quality relationships in storage, which has allowed the development of moisture-tolerant storage systems and best management processes that combine moisture content and time to accommodate baled storage of wet material based upon “shelf-life.” The key to improving biomass supply logistics costs has been developing the associated agronomic sustainability and biomass quality technologies and processes that allow the implementation of equipment engineering solutions.

  14. Prediction of Inhibitory Activity of Epidermal Growth Factor Receptor Inhibitors Using Grid Search-Projection Pursuit Regression Method

    PubMed Central

    Du, Hongying; Hu, Zhide; Bazzoli, Andrea; Zhang, Yang

    2011-01-01

    The epidermal growth factor receptor (EGFR) protein tyrosine kinase (PTK) is an important protein target for anti-tumor drug discovery. To identify potential EGFR inhibitors, we conducted a quantitative structure–activity relationship (QSAR) study on the inhibitory activity of a series of quinazoline derivatives against EGFR tyrosine kinase. Two 2D-QSAR models were developed based on the best multi-linear regression (BMLR) and grid-search assisted projection pursuit regression (GS-PPR) methods. The results demonstrate that the inhibitory activity of quinazoline derivatives is strongly correlated with their polarizability, activation energy, mass distribution, connectivity, and branching information. Although the present investigation focused on EGFR, the approach provides a general avenue in the structure-based drug development of different protein receptor inhibitors. PMID:21811593

  15. Country logistics performance and disaster impact.

    PubMed

    Vaillancourt, Alain; Haavisto, Ira

    2016-04-01

    The aim of this paper is to deepen the understanding of the relationship between country logistics performance and disaster impact. The relationship is analysed through correlation analysis and regression models for 117 countries for the years 2007 to 2012 with disaster impact variables from the International Disaster Database (EM-DAT) and logistics performance indicators from the World Bank. The results show a significant relationship between country logistics performance and disaster impact overall and for five out of six specific logistic performance indicators. These specific indicators were further used to explore the relationship between country logistic performance and disaster impact for three specific disaster types (epidemic, flood and storm). The findings enhance the understanding of the role of logistics in a humanitarian context with empirical evidence of the importance of country logistics performance in disaster response operations. PMID:26282578

  16. The Effects of Various Configurations of Likert, Ordered Categorical, or Rating Scale Data on the Ordinal Logistic Regression Pseudo R-Squared Measure of Fit: The Case of the Cumulative Logit Model.

    ERIC Educational Resources Information Center

    Zumbo, Bruno D.; Ochieng, Charles O.

    Many measures found in educational research are ordered categorical response variables that are empirical realizations of an underlying normally distributed variate. These ordered categorical variables are commonly referred to as Likert or rating scale data. Regression models are commonly fit using these ordered categorical variables as the…

  17. Logistic Stick-Breaking Process

    PubMed Central

    Ren, Lu; Du, Lan; Carin, Lawrence; Dunson, David B.

    2013-01-01

    A logistic stick-breaking process (LSBP) is proposed for non-parametric clustering of general spatially- or temporally-dependent data, imposing the belief that proximate data are more likely to be clustered together. The sticks in the LSBP are realized via multiple logistic regression functions, with shrinkage priors employed to favor contiguous and spatially localized segments. The LSBP is also extended for the simultaneous processing of multiple data sets, yielding a hierarchical logistic stick-breaking process (H-LSBP). The model parameters (atoms) within the H-LSBP are shared across the multiple learning tasks. Efficient variational Bayesian inference is derived, and comparisons are made to related techniques in the literature. Experimental analysis is performed for audio waveforms and images, and it is demonstrated that for segmentation applications the LSBP yields generally homogeneous segments with sharp boundaries. PMID:25258593
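    The stick-breaking idea in the abstract above can be illustrated with a minimal sketch. It assumes the standard stick-breaking construction, in which each logistic function decides what fraction of the remaining "stick" goes to the next cluster. The function names, the one-dimensional location, and the slope/offset values are hypothetical and are not taken from the paper, which additionally uses shrinkage priors and variational Bayesian inference that are not shown here.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def stick_breaking_weights(scores):
    """Map K-1 real-valued scores to K non-negative weights that sum to 1."""
    weights = []
    remaining = 1.0  # length of the stick that has not been assigned yet
    for g in scores:
        v = sigmoid(g)                 # fraction of the remaining stick to break off
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)          # the last component takes what is left
    return weights


def weights_at(position, params):
    """Cluster weights at a 1-D location.

    params is a list of hypothetical (slope, offset) pairs, one linear
    score per stick, standing in for the logistic regression functions.
    """
    return stick_breaking_weights([a * position + b for a, b in params])


if __name__ == "__main__":
    params = [(4.0, 2.0), (4.0, -2.0)]  # illustrative values only
    for x in (-1.0, 0.0, 1.0):
        print(x, [round(w, 3) for w in weights_at(x, params)])
```

    Because the scores depend on location, nearby points receive similar weights, which is one way to read the abstract's statement that proximate data are more likely to be clustered together.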

  18. Linking mutagenic activity to micropollutant concentrations in wastewater samples by partial least square regression and subsequent identification of variables.

    PubMed

    Hug, Christine; Sievers, Moritz; Ottermanns, Richard; Hollert, Henner; Brack, Werner; Krauss, Martin

    2015-11-01

    We deployed multivariate regression to identify compounds co-varying with the mutagenic activity of complex environmental samples. Wastewater treatment plant (WWTP) effluents with a large share of industrial input of different sampling dates were evaluated for mutagenic activity by the Ames Fluctuation Test and chemically characterized by a screening for suspected pro-mutagens and non-targeted software-based peak detection in full scan data. Areas of automatically detected peaks were used as predictor matrix for partial least squares projections to latent structures (PLS) in combination with measured mutagenic activity. Detected peaks were successively reduced by the exclusion of all peaks with lowest variable importance until the best model (high R² and Q²) was reached. Peaks in the best model co-varying with the observed mutagenicity showed increased chlorine, bromine, sulfur, and nitrogen abundance compared to the original peak set, indicating a preferential selection of anthropogenic compounds. The PLS regression revealed four tentatively identified compounds, newly identified 4-(dimethylamino)-pyridine, and three known micropollutants present in domestic wastewater as co-varying with the mutagenic activity. Co-variance between compounds stemming from industrial wastewater and mutagenic activity supported the application of "virtual" EDA as a statistical tool to separate toxicologically relevant from less relevant compounds. PMID:26070082

  19. Copula regression analysis of simultaneously recorded frontal eye field and inferotemporal spiking activity during object-based working memory.

    PubMed

    Hu, Meng; Clark, Kelsey L; Gong, Xiajing; Noudoost, Behrad; Li, Mingyao; Moore, Tirin; Liang, Hualou

    2015-06-10

    Inferotemporal (IT) neurons are known to exhibit persistent, stimulus-selective activity during the delay period of object-based working memory tasks. Frontal eye field (FEF) neurons show robust, spatially selective delay period activity during memory-guided saccade tasks. We present a copula regression paradigm to examine the neural interaction of these two types of signals between areas IT and FEF of the monkey during a working memory task. This paradigm is based on copula models that can account for both the marginal distribution over spiking activity of individual neurons within each area and the joint distribution over ensemble activity of neurons between areas. Considering the popular GLMs as marginal models, we developed a general and flexible likelihood framework that uses the copula to integrate separate GLMs into a joint regression analysis. Such joint analysis essentially leads to a multivariate analog of the marginal GLM theory and hence efficient model estimation. In addition, we show that Granger causality between spike trains can be readily assessed via the likelihood ratio statistic. The performance of this method is validated by extensive simulations, and it compared favorably to the widely used GLMs. When the method was applied to spiking activity of simultaneously recorded FEF and IT neurons during a working memory task, we observed significant Granger causality influence from FEF to IT, but not in the opposite direction, suggesting a role of the FEF in the selection and retention of visual information during working memory. The copula model has the potential to provide unique neurophysiological insights about network properties of the brain. PMID:26063909

  20. Antimicrobial Activity of Aroma Compounds against Saccharomyces cerevisiae and Improvement of Microbiological Stability of Soft Drinks as Assessed by Logistic Regression

    PubMed Central

    Belletti, Nicoletta; Kamdem, Sylvain Sado; Patrignani, Francesca; Lanciotti, Rosalba; Covelli, Alessandro; Gardini, Fausto

    2007-01-01

    The combined effects of a mild heat treatment (55°C) and the presence of three aroma compounds [citron essential oil, citral, and (E)-2-hexenal] on the spoilage of noncarbonated beverages inoculated with different amounts of a Saccharomyces cerevisiae strain were evaluated. The results, expressed as growth/no growth, were analyzed using logistic regression in order to assess the probability of beverage spoilage as a function of thermal treatment length, concentration of flavoring agents, and yeast inoculum. The logit models obtained for the three substances were extremely precise. The thermal treatment alone, even if prolonged for 20 min, was not able to prevent yeast growth. However, the presence of increasing concentrations of aroma compounds improved the stability of the products. The inhibiting effect of the compounds was enhanced by a prolonged thermal treatment. In fact, it influenced the vapor pressure of the molecules, which can easily interact within microbial membranes when they are in gaseous form. (E)-2-Hexenal showed a threshold level, related to initial inoculum and thermal treatment length, over which yeast growth was rapidly inhibited. Concentrations over 100 ppm of citral and thermal treatment longer than 16 min allowed a 90% probability of stability for bottles inoculated with 10⁵ CFU/bottle. Citron gave the most interesting responses: beverages with 500 ppm of essential oil needed only 3 min of treatment to prevent yeast growth. In this framework, the logistic regression proved to be an important tool to study alternative hurdle strategies for the stabilization of noncarbonated beverages. PMID:17616627
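    A growth/no-growth logit model of the kind described above maps treatment length, aroma concentration, and inoculum size to a spoilage probability. The sketch below shows only the general form of such a model. The coefficient values and the names COEF, growth_probability, and stability_probability are hypothetical, chosen so that the signs match the qualitative findings (longer treatment and more aroma compound lower the growth probability, a larger inoculum raises it). They are not the fitted values from the study.

```python
import math

# Hypothetical coefficients for illustration only; not taken from the study.
COEF = {
    "intercept": 6.0,
    "minutes": -0.15,         # thermal treatment length at 55 °C
    "ppm": -0.03,             # aroma compound concentration
    "log10_inoculum": 0.8,    # log10 of CFU per bottle
}


def growth_probability(minutes, ppm, log10_inoculum, coef=COEF):
    """Probability of yeast growth (spoilage) from a logit model."""
    z = (coef["intercept"]
         + coef["minutes"] * minutes
         + coef["ppm"] * ppm
         + coef["log10_inoculum"] * log10_inoculum)
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def stability_probability(minutes, ppm, log10_inoculum, coef=COEF):
    """Probability that the beverage stays stable (no growth)."""
    return 1.0 - growth_probability(minutes, ppm, log10_inoculum, coef)


if __name__ == "__main__":
    for ppm in (0, 100, 300, 500):
        p = stability_probability(minutes=16, ppm=ppm, log10_inoculum=5)
        print(f"{ppm:>4} ppm -> P(stable) = {p:.3f}")
```

    In practice the coefficients would be estimated by maximum likelihood from the observed growth/no-growth outcomes, and a separate model would be fitted for each aroma compound, as the abstract indicates.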

  1. Gene-Activation Mechanisms in the Regression of Atherosclerosis, Elimination of Diabetes Type 2, and Prevention of Dementia

    PubMed Central

    Luoma, P.V

    2011-01-01

    Atherosclerotic vascular disease, diabetes mellitus (DM) and dementia are major global health problems. Both endogenous and exogenous factors activate genes functioning in biological processes. This review article focuses on gene-activation mechanisms that regress atherosclerosis, eliminate DM type 2 (DM2), and prevent cognitive decline and dementia. Gene-activating compounds that upregulate functions of the liver endoplasmic reticulum (ER) and affect lipid and protein metabolism increase ER size through membrane synthesis and produce an antiatherogenic plasma lipoprotein profile. Numerous gene-activators regress atherosclerosis and reduce the occurrence of atherosclerotic disease. The gene-activators increase glucose disposal rate and insulin sensitivity and, by restoring normal glucose and insulin levels, remove metabolic syndrome and DM2. Patients with DM2 show an improvement of plasma lipoprotein profile and glucose tolerance together with an increase in liver phospholipid (PL) and cytochrome (CYP) P450. The gene-activating compounds induce hepatic protein and PL synthesis, and upregulate enzymes including CYPs and glucokinase, nuclear receptors, apolipoproteins and ABC (ATP-binding cassette) transporters. They induce reparation of ER structures and eliminate consequences of ER stress. Healthy living habits activate mechanisms that maintain high levels of HDL and apolipoprotein AI, promote health, and prevent cognitive decline and dementia. Agonists of liver X receptor (LXR) reduce amyloid in brain plaques and improve cognitive performance in mouse models of Alzheimer's disease. Gene activation increases the capacity to withstand cellular stress and to repair cellular damage, and increases life span. Life free of major health problems and in good cognitive health promotes well-being and living a long and active life. PMID:21568932

  2. Determination of boiling point of petrochemicals by gas chromatography-mass spectrometry and multivariate regression analysis of structural activity relationship.

    PubMed

    Fakayode, Sayo O; Mitchell, Breanna S; Pollard, David A

    2014-08-01

    Accurate understanding of analyte boiling points (BP) is of critical importance in gas chromatographic (GC) separation and crude oil refinery operation in petrochemical industries. This study reports the first combined use of GC separation and partial-least-square (PLS1) multivariate regression analysis (MRA) of petrochemical structural activity relationship (SAR) for accurate BP determination of two commercially available (D3710 and MA VHP) calibration gas mix samples. The results of the BP determination using PLS1 multivariate regression were further compared with the results of the traditional simulated distillation method of BP determination. The developed PLS1 regression was able to correctly predict analyte BPs in the D3710 and MA VHP calibration gas mix samples, with a root-mean-square-%-relative-error (RMS%RE) of 6.4% and 10.8%, respectively. In contrast, the overall RMS%RE values of 32.9% and 40.4% obtained for BP determination in D3710 and MA VHP, respectively, using the traditional simulated distillation method were approximately four times larger than the corresponding RMS%RE of BP prediction using MRA, demonstrating the better predictive ability of MRA. The reported method is rapid, robust, and promising, and can potentially be used routinely for fast analysis, pattern recognition, and analyte BP determination in petrochemical industries. PMID:24881546
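    The abstract reports its accuracy as a root-mean-square percent relative error (RMS%RE) but does not spell out the formula. The sketch below assumes the common definition, the square root of the mean squared percent relative error between predicted and reference values. The function name and the boiling-point numbers are made up for illustration and are not data from the study.

```python
import math


def rms_percent_relative_error(reference, predicted):
    """Root-mean-square of the percent relative errors (assumed definition)."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [
        (100.0 * (p - r) / r) ** 2   # percent relative error, squared
        for r, p in zip(reference, predicted)
    ]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Illustrative boiling points in °C; not measurements from the paper.
    reference_bp = [36.1, 68.7, 98.4, 125.6]
    predicted_bp = [38.0, 66.0, 101.0, 120.0]
    print(round(rms_percent_relative_error(reference_bp, predicted_bp), 2))
```

    Because each error is divided by its reference value, the result depends on the temperature scale used (°C versus K), so comparisons between methods only make sense on the same scale.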

  3. Overview of the SAE G-11 RMSL (Reliability, Maintainability, Supportability, and Logistics) Division Activities and Technical Projects

    NASA Technical Reports Server (NTRS)

    Singhal, Surendra N.

    2003-01-01

    The SAE G-11 RMSL (Reliability, Maintainability, Supportability, and Logistics) Division activities include identification and fulfillment of joint industry, government, and academia needs for development and implementation of RMSL technologies. Four Projects in the Probabilistic Methods area and two in the area of RMSL have been identified. These are: (1) Evaluation of Probabilistic Technology - progress has been made toward the selection of probabilistic application cases. Future effort will focus on assessment of multiple probabilistic software packages in solving selected engineering problems using probabilistic methods. Relevance to Industry & Government - Case studies of typical problems encountering uncertainties, results of solutions to these problems run by different codes, and recommendations on which code is applicable for what problems; (2) Probabilistic Input Preparation - progress has been made in identifying problem cases such as those with no data, little data and sufficient data. Future effort will focus on developing guidelines for preparing input for probabilistic analysis, especially with no or little data. Relevance to Industry & Government - Too often, we get bogged down thinking we need a lot of data before we can quantify uncertainties. Not true. There are ways to do credible probabilistic analysis with little data; (3) Probabilistic Reliability - a probabilistic reliability literature search has been completed, along with what differentiates it from statistical reliability. Work on computation of reliability based on quantification of uncertainties in primitive variables is in progress. Relevance to Industry & Government - Correct reliability computations both at the component and system level are needed so one can design an item based on its expected usage and life span; (4) Real World Applications of Probabilistic Methods (PM) - A draft of volume 1 comprising aerospace applications has been released. Volume 2, a compilation of real world

  4. Application of least squares support vector regression and linear multiple regression for modeling removal of methyl orange onto tin oxide nanoparticles loaded on activated carbon and activated carbon prepared from Pistacia atlantica wood.

    PubMed

    Ghaedi, M; Rahimi, Mahmoud Reza; Ghaedi, A M; Tyagi, Inderjeet; Agarwal, Shilpi; Gupta, Vinod Kumar

    2016-01-01

    Two novel and eco-friendly adsorbents, namely tin oxide nanoparticles loaded on activated carbon (SnO2-NP-AC) and activated carbon prepared from the wood of the tree Pistacia atlantica (AC-PAW), were used for the rapid removal and fast adsorption of methyl orange (MO) from the aqueous phase. The dependency of MO removal on various influential adsorption parameters was well modeled and optimized using multiple linear regression (MLR) and least squares support vector regression (LSSVR). The optimal parameters for the LSSVR model were found to be a γ value of 0.76 and a σ² of 0.15. For the testing data set, a mean square error (MSE) value of 0.0010 and a coefficient of determination (R²) value of 0.976 were obtained for the LSSVR model, and an MSE value of 0.0037 and an R² value of 0.897 were obtained for the MLR model. The adsorption equilibrium and kinetic data were found to be well fitted by, and in good agreement with, the Langmuir isotherm model and the second-order equation and intra-particle diffusion models, respectively. A small amount of the proposed SnO2-NP-AC and AC-PAW (0.015 g and 0.08 g) is sufficient for successful rapid removal of methyl orange (>95%). The maximum adsorption capacity for SnO2-NP-AC and AC-PAW was 250 mg g⁻¹ and 125 mg g⁻¹, respectively. PMID:26414425
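    The two fit statistics used above to compare the LSSVR and MLR models can be computed as in the following sketch. It uses the usual definitions: the mean square error is the average squared residual, and the coefficient of determination is one minus the ratio of the residual sum of squares to the total sum of squares. The sample values are invented for illustration and are not data from the study.

```python
def mean_square_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    n = len(observed)
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n


def coefficient_of_determination(observed, predicted):
    """R^2 = 1 - SS_res / SS_tot, computed against the mean of the observations."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Illustrative removal fractions; not measurements from the paper.
    observed = [0.62, 0.75, 0.81, 0.90, 0.97]
    predicted = [0.60, 0.77, 0.80, 0.92, 0.95]
    print("MSE:", round(mean_square_error(observed, predicted), 5))
    print("R^2:", round(coefficient_of_determination(observed, predicted), 3))
```

    Both statistics should be computed on held-out test data, as the abstract does, since values computed on the training set tend to flatter flexible models such as LSSVR.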

  5. Dihydromyricetin promotes hepatocellular carcinoma regression via a p53 activation-dependent mechanism

    NASA Astrophysics Data System (ADS)

    Zhang, Qingyu; Liu, Jie; Liu, Bin; Xia, Juan; Chen, Nianping; Chen, Xiaofeng; Cao, Yi; Zhang, Chen; Lu, Caijie; Li, Mingyi; Zhu, Runzhi

    2014-04-01

    The development of antitumor chemotherapy drugs remains a key goal for oncologists, and natural products provide a vast resource for anti-cancer drug discovery. In the current study, we found that the flavonoid dihydromyricetin (DHM) exhibited antitumor activity against liver cancer cells, including primary cells obtained from hepatocellular carcinoma (HCC) patients. In contrast, DHM was not cytotoxic to immortalized normal liver cells. Furthermore, DHM treatment resulted in the growth inhibition and remission of xenotransplanted tumors in nude mice. Our results further demonstrated that this antitumor activity was caused by the activation of the p53-dependent apoptosis pathway via p53 phosphorylation at serine 15 (Ser15). Moreover, our results showed that DHM plays a dual role in the induction of cell death when administered in combination with cisplatin, a common clinical drug that kills primary hepatoma cells but not normal liver cells.

  6. Association between Physical Activity and Teacher-Reported Academic Performance among Fifth-Graders in Shanghai: A Quantile Regression

    PubMed Central

    Zhang, Yunting; Zhang, Donglan; Jiang, Yanrui; Sun, Wanqi; Wang, Yan; Chen, Wenjuan; Li, Shenghui; Shi, Lu; Shen, Xiaoming; Zhang, Jun; Jiang, Fan

    2015-01-01

    Introduction: A growing body of literature reveals the causal pathways between physical activity and brain function, indicating that increasing physical activity among children could improve rather than undermine their scholastic performance. However, past studies of physical activity and scholastic performance among students often relied on parent-reported grade information, and did not explore whether the association varied among different levels of scholastic performance. Our study among fifth-grade students in Shanghai sought to determine the association between regular physical activity and teacher-reported academic performance scores (APS), with special attention to the differential associational patterns across different strata of scholastic performance. Method: A total of 2,225 students were chosen through stratified random sampling, and a complete sample of 1,470 observations was used for analysis. We used a quantile regression analysis to explore whether the association between physical activity and teacher-reported APS differs across the distribution of APS. Results: Minimal-intensity physical activity such as walking was positively associated with academic performance scores (β = 0.13, SE = 0.04). The magnitude of the association tends to be larger at the lower end of the APS distribution (β = 0.24, SE = 0.08) than at the higher end of the distribution (β = 0.00, SE = 0.07). Conclusion: Based upon teacher-reported student academic performance, there is no evidence that spending time on frequent physical activity would undermine students' APS. Those students who are below average in their academic performance could be worse off academically if they give up minimal-intensity physical activity. Therefore, cutting physical activity time in schools could hurt the scholastic performance of those students who were already at higher risk of dropping out due to inadequate APS. PMID:25774525
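    Quantile regression, the method used in this study, fits a chosen quantile of the outcome by minimizing an asymmetric "pinball" (check) loss instead of the squared error. The minimal sketch below shows that loss and confirms, on a toy sample, that the constant minimizing it is the corresponding sample quantile. The function names and the scores are hypothetical, and no covariates are included, so this illustrates the loss only and not the study's model.

```python
def pinball_loss(observed, prediction, tau):
    """Total check loss of a single predicted value at quantile level tau."""
    total = 0.0
    for y in observed:
        residual = y - prediction
        # Under-predictions are weighted by tau, over-predictions by (1 - tau).
        total += tau * residual if residual >= 0 else (tau - 1.0) * residual
    return total


def best_constant(observed, tau):
    """Observed value that minimizes the pinball loss (a sample tau-quantile)."""
    return min(observed, key=lambda c: pinball_loss(observed, c, tau))


if __name__ == "__main__":
    scores = [1, 2, 3, 4, 100]  # illustrative values with one outlier
    print("tau = 0.5 ->", best_constant(scores, 0.5))
    print("tau = 0.9 ->", best_constant(scores, 0.9))
```

    Fitting the same loss with a linear predictor at several values of tau is what allows the association with physical activity to differ between the lower and upper ends of the score distribution.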

  7. Logistics background study: underground mining

    SciTech Connect

    Hanslovan, J. J.; Visovsky, R. G.

    1982-02-01

    Logistical functions that are normally associated with US underground coal mining are investigated and analyzed. These functions comprise all activities and services that support the producing sections of the mine. The report provides a better understanding of how these functions impact coal production in terms of time, cost, and safety. Major underground logistics activities are analyzed and include: transportation of personnel, supplies, and equipment; transportation of coal and rock; electrical distribution and communications systems; water handling; hydraulics; and ventilation systems. Recommended areas for future research are identified and prioritized.

  8. CYP2U1 mutations in two Iranian patients with activity induced dystonia, motor regression and spastic paraplegia

    PubMed Central

    Kariminejad, A.; Schöls, L.; Schüle, R.; Tonekaboni, S.H.; Abolhassani, A.; Fadaee, M.; Rosti, R.O.; Gleeson, J.G.

    2016-01-01

    Hereditary spastic paraplegia (HSP) is a heterogeneous condition characterized by progressive spasticity and weakness in the lower limbs. It is divided into two major groups, complicated and uncomplicated, based on the presence of additional features such as intellectual disability, ataxia, seizures, peripheral neuropathy and visual problems. SPG56 is an autosomal recessive form of HSP with complicated and uncomplicated manifestations, complicated being more common. CYP2U1 gene mutations have been identified as responsible for SPG56. Intellectual disability, dystonia, subclinical sensory motor neuropathy, pigmentary degenerative maculopathy, thin corpus callosum and periventricular white-matter hyperintensities were additional features noted in previous cases of SPG56. Here we identified two novel mutations in CYP2U1 in two unrelated patients by whole exome sequencing. Both patients had complicated HSP with activity-induced dystonia, suggesting dystonia as an additional finding in SPG56. Two out of 14 previously reported patients had dystonia, and the addition of our patients suggests dystonia in a quarter of SPG56 patients. Developmental regression has not been reported in SPG56 patients so far but both of our patients developed motor regression in infancy. PMID:27292318

  9. CYP2U1 mutations in two Iranian patients with activity induced dystonia, motor regression and spastic paraplegia.

    PubMed

    Kariminejad, A; Schöls, L; Schüle, R; Tonekaboni, S H; Abolhassani, A; Fadaee, M; Rosti, R O; Gleeson, J G

    2016-09-01

    Hereditary spastic paraplegia (HSP) is a heterogeneous condition characterized by progressive spasticity and weakness in the lower limbs. It is divided into two major groups, complicated and uncomplicated, based on the presence of additional features such as intellectual disability, ataxia, seizures, peripheral neuropathy and visual problems. SPG56 is an autosomal recessive form of HSP with complicated and uncomplicated manifestations, complicated being more common. CYP2U1 gene mutations have been identified as responsible for SPG56. Intellectual disability, dystonia, subclinical sensory motor neuropathy, pigmentary degenerative maculopathy, thin corpus callosum and periventricular white-matter hyperintensities were additional features noted in previous cases of SPG56. Here we identified two novel mutations in CYP2U1 in two unrelated patients by whole exome sequencing. Both patients had complicated HSP with activity-induced dystonia, suggesting dystonia as an additional finding in SPG56. Two out of 14 previously reported patients had dystonia, and the addition of our patients suggests dystonia in a quarter of SPG56 patients. Developmental regression has not been reported in SPG56 patients so far but both of our patients developed motor regression in infancy. PMID:27292318

  10. Hourly photosynthetically active radiation estimation in Midwestern United States from artificial neural networks and conventional regressions models.

    PubMed

    Yu, Xiaolei; Guo, Xulin

    2016-08-01

    The relationship between hourly photosynthetically active radiation (PAR) and the global solar radiation (R_s) was analyzed from data gathered over 3 years at Bondville, IL, and Sioux Falls, SD, Midwestern USA. These data were used to determine temporal variability of the PAR fraction and its dependence on different sky conditions, which were defined by the clearness index. Meanwhile, models based on artificial neural networks (ANNs) were established for predicting hourly PAR. The performance of the proposed models was compared with four existing conventional regression models in terms of the normalized root mean square error (NRMSE), the coefficient of determination (r²), the mean percentage error (MPE), and the relative standard error (RSE). The overall analysis shows that the ANN model can predict PAR accurately, especially for overcast sky and clear sky conditions. Meanwhile, the parameters related to water vapor do not improve the prediction result significantly. PMID:26715137
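    Two of the comparison metrics named above, NRMSE and MPE, can be sketched as follows. The abstract does not state the exact definitions, so this sketch assumes that the RMSE is normalized by the mean of the observations and that the percentage error is taken as predicted minus observed. Other conventions exist (normalizing by the range, or the opposite sign), and the sample values are invented for illustration.

```python
import math


def nrmse(observed, predicted):
    """RMSE divided by the mean of the observations (assumed normalization)."""
    n = len(observed)
    rmse = math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)
    return rmse / (sum(observed) / n)


def mean_percentage_error(observed, predicted):
    """Average signed percent error; positive values mean over-prediction."""
    n = len(observed)
    return 100.0 * sum((p - o) / o for o, p in zip(observed, predicted)) / n


if __name__ == "__main__":
    # Illustrative hourly PAR values; not measurements from the study.
    observed_par = [210.0, 480.0, 950.0, 1320.0]
    predicted_par = [225.0, 470.0, 930.0, 1350.0]
    print("NRMSE:", round(nrmse(observed_par, predicted_par), 4))
    print("MPE  :", round(mean_percentage_error(observed_par, predicted_par), 2), "%")
```

    NRMSE measures the overall size of the errors, while MPE keeps their sign, so a model can have a small MPE and still a large NRMSE when over- and under-predictions cancel out.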

  11. Hourly photosynthetically active radiation estimation in Midwestern United States from artificial neural networks and conventional regressions models

    NASA Astrophysics Data System (ADS)

    Yu, Xiaolei; Guo, Xulin

    2016-08-01

    The relationship between hourly photosynthetically active radiation (PAR) and the global solar radiation (R_s) was analyzed from data gathered over 3 years at Bondville, IL, and Sioux Falls, SD, Midwestern USA. These data were used to determine temporal variability of the PAR fraction and its dependence on different sky conditions, which were defined by the clearness index. Meanwhile, models based on artificial neural networks (ANNs) were established for predicting hourly PAR. The performance of the proposed models was compared with four existing conventional regression models in terms of the normalized root mean square error (NRMSE), the coefficient of determination (r²), the mean percentage error (MPE), and the relative standard error (RSE). The overall analysis shows that the ANN model can predict PAR accurately, especially for overcast sky and clear sky conditions. Meanwhile, the parameters related to water vapor do not improve the prediction result significantly.

  12. Hourly photosynthetically active radiation estimation in Midwestern United States from artificial neural networks and conventional regressions models

    NASA Astrophysics Data System (ADS)

    Yu, Xiaolei; Guo, Xulin

    2015-12-01

    The relationship between hourly photosynthetically active radiation (PAR) and the global solar radiation (R_s) was analyzed from data gathered over 3 years at Bondville, IL, and Sioux Falls, SD, Midwestern USA. These data were used to determine temporal variability of the PAR fraction and its dependence on different sky conditions, which were defined by the clearness index. Meanwhile, models based on artificial neural networks (ANNs) were established for predicting hourly PAR. The performance of the proposed models was compared with four existing conventional regression models in terms of the normalized root mean square error (NRMSE), the coefficient of determination (r²), the mean percentage error (MPE), and the relative standard error (RSE). The overall analysis shows that the ANN model can predict PAR accurately, especially for overcast sky and clear sky conditions. Meanwhile, the parameters related to water vapor do not improve the prediction result significantly.

  13. Statins Promote the Regression of Atherosclerosis via Activation of the CCR7-Dependent Emigration Pathway in Macrophages

    PubMed Central

    Feig, Jonathan E.; Shang, Yueting; Rotllan, Noemi; Vengrenyuk, Yuliya; Wu, Chaowei; Shamir, Raanan; Torra, Ines Pineda; Fernandez-Hernando, Carlos; Fisher, Edward A.; Garabedian, Michael J.

    2011-01-01

    HMG-CoA reductase inhibitors (statins) decrease atherosclerosis by lowering low-density-lipoprotein cholesterol. Statins are also thought to have additional anti-atherogenic properties, yet defining these non-conventional modes of statin action remains incomplete. We have previously developed a novel mouse transplant model of atherosclerosis regression in which aortic segments from diseased donors are placed into normolipidemic recipients. With this model, we demonstrated the rapid loss of CD68+ cells (mainly macrophages) in plaques through the induction of a chemokine receptor CCR7-dependent emigration process. Because the human and mouse CCR7 promoter contain Sterol Response Elements (SREs), we hypothesized that Sterol Regulatory Element Binding Proteins (SREBPs) are involved in increasing CCR7 expression and through this mechanism, statins would promote CD68+ cell emigration from plaques. We examined whether statin activation of the SREBP pathway in vivo would induce CCR7 expression and promote macrophage emigration from plaques. We found that western diet-fed apoE-/- mice treated with either atorvastatin or rosuvastatin led to a substantial reduction in the CD68+ cell content in the plaques despite continued hyperlipidemia. We also observed a significant increase in CCR7 mRNA in CD68+ cells from both the atorvastatin and rosuvastatin treated mice associated with emigration of CD68+ cells from plaques. Importantly, CCR7-/-/apoE-/- double knockout mice failed to display a reduction in CD68+ cell content upon statin treatment. Statins also affected the recruitment of transcriptional regulatory proteins and the organization of the chromatin at the CCR7 promoter to increase the transcriptional activity. Statins promote the beneficial remodeling of plaques in diseased mouse arteries through the stimulation of the CCR7 emigration pathway in macrophages. Therefore, statins may exhibit some of their clinical benefits by not only retarding the progression of

  14. Logistics Management: New trends in the Reverse Logistics

    NASA Astrophysics Data System (ADS)

    Antonyová, A.; Antony, P.; Soewito, B.

    2016-04-01

    The present level and quality of the environment depend directly on our access to natural resources, as well as on their sustainability. Production activities, in particular, and the phenomena associated with them have a direct impact on the future of our planet. The recycling process, which in large enterprises often becomes an important and integral part of the production program, is usually problematic in small and medium-sized enterprises. We can specify a few factors that have a direct impact on the development and successful application of an effective reverse logistics system. Finding an economically acceptable model of reverse logistics, focused on converting waste materials into renewable energy, is a task in progress.

  15. Robust Regression.

    PubMed

    Huang, Dong; Cabral, Ricardo; De la Torre, Fernando

    2016-02-01

    Discriminative methods (e.g., kernel regression, SVM) have been extensively used to solve problems such as object recognition, image alignment and pose estimation from images. These methods typically map image features (X) to continuous (e.g., pose) or discrete (e.g., object category) values. A major drawback of existing discriminative methods is that samples are directly projected onto a subspace and hence fail to account for outliers common in realistic training sets due to occlusion, specular reflections or noise. It is important to notice that existing discriminative approaches assume the input variables X to be noise free. Thus, discriminative methods experience significant performance degradation when gross outliers are present. Despite its obvious importance, the problem of robust discriminative learning has been relatively unexplored in computer vision. This paper develops the theory of robust regression (RR) and presents an effective convex approach that uses recent advances on rank minimization. The framework applies to a variety of problems in computer vision including robust linear discriminant analysis, regression with missing data, and multi-label classification. Several synthetic and real examples with applications to head pose estimation from images, image and video classification and facial attribute classification with missing data are used to illustrate the benefits of RR. PMID:26761740

  16. Systematic Artifacts in Support Vector Regression-Based Compound Potency Prediction Revealed by Statistical and Activity Landscape Analysis

    PubMed Central

    Balfer, Jenny; Bajorath, Jürgen

    2015-01-01

    Support vector machines are a popular machine learning method for many classification tasks in biology and chemistry. In addition, the support vector regression (SVR) variant is widely used for numerical property predictions. In chemoinformatics and pharmaceutical research, SVR has become probably the most popular approach for modeling of non-linear structure-activity relationships (SARs) and predicting compound potency values. Herein, we have systematically generated and analyzed SVR prediction models for a variety of compound data sets with different SAR characteristics. Although these SVR models were accurate on the basis of global prediction statistics and not prone to overfitting, they were found to consistently mispredict highly potent compounds. Hence, in regions of local SAR discontinuity, SVR prediction models displayed clear limitations. Compared to observed activity landscapes of compound data sets, landscapes generated on the basis of SVR potency predictions were partly flattened and activity cliff information was lost. Taken together, these findings have implications for practical SVR applications. In particular, prospective SVR-based potency predictions should be considered with caution because artificially low predictions are very likely for highly potent candidate compounds, the most important prediction targets. PMID:25742011

  17. Regression analysis for solving diagnosis problem of children's health

    NASA Astrophysics Data System (ADS)

    Cherkashina, Yu A.; Gerget, O. M.

    2016-04-01

    The paper presents the results of research on applying statistical techniques, namely regression analysis, to assess the health status of children in the neonatal period based on medical data (hemostatic parameters, blood test parameters, gestational age, vascular-endothelial growth factor) measured at 3-5 days of life. A detailed description of the studied medical data is given, and a binary logistic regression procedure is discussed. Basic results of the research are presented. A classification table of predicted and observed values is shown, and the overall percentage of correct recognition is determined. Regression equation coefficients are calculated, and the general regression equation is written from them. Based on the results of logistic regression, ROC analysis was performed: the sensitivity and specificity of the model were calculated and ROC curves were constructed. These mathematical techniques allow the health of children to be diagnosed with a high quality of recognition. The results make a significant contribution to the development of evidence-based medicine and have high practical importance in the professional activity of the author.
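
    As an illustration of the quantities named in this abstract, the sketch below derives sensitivity, specificity, and the overall percentage of correct recognition from a 2x2 classification table. The function name and the counts are hypothetical and are not taken from the paper.

```python
def classification_summary(tp, fn, tn, fp):
    """Summarize a 2x2 classification table.

    tp, fn: diseased cases predicted as diseased / as healthy
    tn, fp: healthy cases predicted as healthy / as diseased
    (hypothetical argument names, chosen for this sketch)
    """
    sensitivity = tp / (tp + fn)   # share of true positives detected
    specificity = tn / (tn + fp)   # share of true negatives detected
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # overall correct recognition
    return sensitivity, specificity, accuracy


# Made-up counts, for illustration only.
sens, spec, acc = classification_summary(tp=40, fn=10, tn=45, fp=5)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

    Sweeping the probability cut-off of the logistic model and recomputing these two rates at each cut-off yields the points of an ROC curve.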

  18. Packaging for logistical support

    NASA Astrophysics Data System (ADS)

    Twede, Diana; Hughes, Harold

    Logistical packaging is conducted to furnish protection, utility, and communication for elements of a logistical system. Once the functional requirements of space logistical support packaging have been identified, decision-makers have a reasonable basis on which to compare package alternatives. Flexible packages may be found, for example, to provide adequate protection and superior utility to that of rigid packages requiring greater storage and postuse waste volumes.

  19. A Skew-t space-varying regression model for the spectral analysis of resting state brain activity.

    PubMed

    Ismail, Salimah; Sun, Wenqi; Nathoo, Farouk S; Babul, Arif; Moiseev, Alexader; Beg, Mirza Faisal; Virji-Babul, Naznin

    2013-08-01

    It is known that in many neurological disorders such as Down syndrome, main brain rhythms shift their frequencies slightly, and characterizing the spatial distribution of these shifts is of interest. This article reports on the development of a Skew-t mixed model for the spatial analysis of resting state brain activity in healthy controls and individuals with Down syndrome. Time series of oscillatory brain activity are recorded using magnetoencephalography, and spectral summaries are examined at multiple sensor locations across the scalp. We focus on the mean frequency of the power spectral density, and use space-varying regression to examine associations with age, gender and Down syndrome across several scalp regions. Spatial smoothing priors are incorporated based on a multivariate Markov random field, and the markedly non-Gaussian nature of the spectral response variable is accommodated by the use of a Skew-t distribution. A range of models representing different assumptions on the association structure and response distribution are examined, and we conduct model selection using the deviance information criterion. Our analysis suggests region-specific differences between healthy controls and individuals with Down syndrome, particularly in the left and right temporal regions, and produces smoothed maps indicating the scalp topography of the estimated differences. PMID:22614763

  20. In vivo sensitized and in vitro activated B cells mediate tumor regression in cancer adoptive immunotherapy

    PubMed Central

    Li, Qiao; Song, Hongbin; Teitz-Tennenbaum, Seagal; Donald, Elizabeth J.; Li, Mu; Chang, Alfred E.

    2011-01-01

    Adoptive cellular immunotherapy utilizing tumor-reactive T cells has proven to be a promising strategy for cancer treatment. However, we hypothesize that successful treatment strategies will have to appropriately stimulate not only cellular immunity, but also humoral immunity. We previously reported that B cells in tumor-draining lymph nodes (TDLN) may function as antigen-presenting cells. In this study, we identified TDLN B cells as effector cells in an adoptive immunotherapy model. In vivo primed and in vitro activated TDLN B cells alone mediated effective (p<0.05) tumor regression after adoptive transfer into two histologically distinct murine pulmonary metastatic tumor models. Prior lymphodepletion of the host with either chemotherapy or whole-body irradiation augmented the therapeutic efficacy of the adoptively transferred TDLN B cells in the treatment of subcutaneous tumors as well as metastatic pulmonary tumors. Furthermore, B cell plus T cell transfers resulted in substantially more efficient antitumor responses than B cells or T cells alone (p<0.05). Activated TDLN B cells conferred strong humoral responses to tumor. This was evident by the production of IgM, IgG and IgG2b, which bound specifically to tumor cells and led to specific tumor cell lysis in the presence of complement. Collectively, these data indicate that in vivo primed and in vitro activated B cells can be employed as effector cells for cancer therapy. The synergistic antitumor efficacy of co-transferred activated B effector cells and T effector cells represents a novel approach for cancer adoptive immunotherapy. PMID:19667089

  1. MDM2 small-molecule antagonist RG7112 activates p53 signaling and regresses human tumors in preclinical cancer models.

    PubMed

    Tovar, Christian; Graves, Bradford; Packman, Kathryn; Filipovic, Zoran; Higgins, Brian; Xia, Mingxuan; Tardell, Christine; Garrido, Rosario; Lee, Edmund; Kolinsky, Kenneth; To, Kwong-Him; Linn, Michael; Podlaski, Frank; Wovkulich, Peter; Vu, Binh; Vassilev, Lyubomir T

    2013-04-15

    MDM2 negatively regulates p53 stability and many human tumors overproduce MDM2 as a mechanism to restrict p53 function. Thus, inhibitors of p53-MDM2 binding that can reactivate p53 in cancer cells may offer an effective approach for cancer therapy. RG7112 is a potent and selective member of the nutlin family of MDM2 antagonists currently in phase I clinical studies. RG7112 binds MDM2 with high affinity (K(D) ~ 11 nmol/L), blocking its interactions with p53 in vitro. A crystal structure of the RG7112-MDM2 complex revealed that the small molecule binds in the p53 pocket of MDM2, mimicking the interactions of critical p53 amino acid residues. Treatment of cancer cells expressing wild-type p53 with RG7112 activated the p53 pathway, leading to cell-cycle arrest and apoptosis. RG7112 showed potent antitumor activity against a panel of solid tumor cell lines. However, its apoptotic activity varied widely with the best response observed in osteosarcoma cells with MDM2 gene amplification. Interestingly, inhibition of caspase activity did not change the kinetics of p53-induced cell death. Oral administration of RG7112 to human xenograft-bearing mice at nontoxic concentrations caused dose-dependent changes in proliferation/apoptosis biomarkers as well as tumor inhibition and regression. Notably, RG7112 was highly synergistic with androgen deprivation in LNCaP xenograft tumors. Our findings offer a preclinical proof-of-concept that RG7112 is effective in treatment of solid tumors expressing wild-type p53. PMID:23400593

  2. Computational Techniques for Spatial Logistic Regression with Large Datasets

    PubMed Central

    Paciorek, Christopher J.

    2007-01-01

    In epidemiological research, outcomes are frequently non-normal, sample sizes may be large, and effect sizes are often small. To relate health outcomes to geographic risk factors, fast and powerful methods for fitting spatial models, particularly for non-normal data, are required. I focus on binary outcomes, with the risk surface a smooth function of space, but the development herein is relevant for non-normal data in general. I compare penalized likelihood models, including the penalized quasi-likelihood (PQL) approach, and Bayesian models based on fit, speed, and ease of implementation. A Bayesian model using a spectral basis representation of the spatial surface via the Fourier basis provides the best tradeoff of sensitivity and specificity in simulations, detecting real spatial features while limiting overfitting and being reasonably computationally efficient. One of the contributions of this work is further development of this underused representation. The spectral basis model outperforms the penalized likelihood methods, which are prone to overfitting, but is slower to fit and not as easily implemented. A Bayesian Markov random field model performs less well statistically than the spectral basis model, but is very computationally efficient. We illustrate the methods on a real dataset of cancer cases in Taiwan. The success of the spectral basis with binary data and similar results with count data suggest that it may be generally useful in spatial models and more complicated hierarchical models. PMID:18443645

  3. Incremental logistic regression for customizing automatic diagnostic models.

    PubMed

    Tortajada, Salvador; Robles, Montserrat; García-Gómez, Juan Miguel

    2015-01-01

    In the last decades, and following the new trends in medicine, statistical learning techniques have been used for developing automatic diagnostic models for aiding the clinical experts through the use of Clinical Decision Support Systems. The development of these models requires a large, representative amount of data, which is commonly obtained from one hospital or a group of hospitals after an expensive and time-consuming gathering, preprocessing, and validation of cases. After the model development, it has to pass an external validation that is often carried out in a different hospital or health center. Experience shows that the models often underperform expectations. Furthermore, patient data needs ethical approval and patient consent to send and store data. For these reasons, we introduce an incremental learning algorithm based on the Bayesian inference approach that may allow us to build an initial model with a smaller number of cases and update it incrementally when new data are collected or even perform a new calibration of a model from a different center by using a reduced number of cases. The performance of our algorithm is demonstrated by employing different benchmark datasets and a real brain tumor dataset; and we compare its performance to a previous incremental algorithm and a non-incremental Bayesian model, showing that the algorithm is independent of the data model, iterative, and has a good convergence. PMID:25417079

  4. Background or Experience? Using Logistic Regression to Predict College Retention

    ERIC Educational Resources Information Center

    Synco, Tracee M.

    2012-01-01

    Tinto, Astin and countless others have researched the retention and attrition of students from college for more than thirty years. However, the six-year graduation rate for all first-time, full-time freshmen for the 2002 cohort was 57%. This study sought to determine the retention variables that predicted continued enrollment of entering freshmen…

  5. Assessing Lake Trophic Status: A Proportional Odds Logistic Regression Model

    EPA Science Inventory

    Lake trophic state classifications are good predictors of ecosystem condition and are indicative of both ecosystem services (e.g., recreation and aesthetics), and disservices (e.g., harmful algal blooms). Methods for classifying trophic state are based off the foundational work o...

  6. Significant Improvements to LOGIST.

    ERIC Educational Resources Information Center

    Wingersky, Marilyn S.

    The computer program LOGIST (Wingersky, Patrick, and Lord, 1988) estimates the item parameters and the examinee's abilities for Birnbaum's three-parameter logistic item response theory model using Newton's method for solving the joint maximum likelihood equations. In 1989, Martha Stocking discovered a problem with this procedure in that when the…
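
    For readers unfamiliar with the model named in this abstract, the sketch below evaluates the item response function of the three-parameter logistic model. It does not reproduce LOGIST or its estimation procedure; the function name and the `scale` argument are assumptions made for this example.

```python
import math


def prob_correct(theta, a, b, c, scale=1.0):
    """Three-parameter logistic item response function.

    theta: examinee ability
    a: item discrimination, b: item difficulty, c: lower asymptote (guessing)
    scale: optional scaling constant (an assumption of this sketch;
           some presentations use 1.7 to approximate the normal ogive)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-scale * a * (theta - b)))


# At theta == b the probability lies halfway between c and 1.
print(prob_correct(theta=0.5, a=1.2, b=0.5, c=0.2))
```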

  7. Quantitative structure-antibacterial activity relationship modeling using a combination of piecewise linear regression-discriminant analysis (I): Quantum chemical, topographic, and topological descriptors

    NASA Astrophysics Data System (ADS)

    Molina, Enrique; Estrada, Ernesto; Nodarse, Delvin; Torres, Luis A.; González, Humberto; Uriarte, Eugenio

    Time-dependent antibacterial activity of 2-furylethylenes using quantum chemical, topographic, and topological indices is described as inhibition of respiration in E. coli. A QSAR strategy based on the combination of piecewise linear regression and discriminant analysis is used to predict the biological activity values of strong and moderate antibacterial furylethylenes. The breakpoint in the values of the biological activity was detected. The biological activities of the compounds are described by two linear regression equations. A discriminant analysis is carried out to classify the compounds into one of the two biological activity groups. The results obtained using different kinds of descriptors were compared. In all cases the piecewise linear regression - discriminant analysis (PLR-DA) method produced significantly better QSAR models than the linear regression analysis. The QSAR models were validated using an external validation set previously extracted from the original data. A prediction analysis of reported antibacterial activity was carried out, showing a dependence between the probability of a good classification and the experimental antibacterial activity. Statistical parameters showed the quality of the predictions of the quantum-chemical-descriptor-based models in LDA, with an accuracy of 0.9 and a C of 0.9. The best PLR-DA model explains more than 92% of the variance of experimental activity. Models with the best prediction results were those based on quantum-chemical descriptors. An interpretation of the quantum-chemical descriptors entered in the models was carried out.

  8. RESEARCH PAPER: Forecast daily indices of solar activity, F10.7, using support vector regression method

    NASA Astrophysics Data System (ADS)

    Huang, Cong; Liu, Dan-Dan; Wang, Jing-Song

    2009-06-01

    The 10.7 cm solar radio flux (F10.7), the value of the solar radio emission flux density at a wavelength of 10.7 cm, is a useful index of solar activity as a proxy for solar extreme ultraviolet radiation. It is meaningful and important to predict F10.7 values accurately for both long-term (months-years) and short-term (days) forecasting, which are often used as inputs in space weather models. This study applies a novel neural network technique, support vector regression (SVR), to forecasting daily values of F10.7. The aim of this study is to examine the feasibility of SVR in short-term F10.7 forecasting. The approach, based on SVR, reduces the dimension of feature space in the training process by using a kernel-based learning algorithm. Thus, the complexity of the calculation becomes lower and a small amount of training data will be sufficient. The time series of F10.7 from 2002 to 2006 are employed as the data sets. The performance of the approach is estimated by calculating the norm mean square error and mean absolute percentage error. It is shown that our approach can perform well by using fewer training data points than the traditional neural network.
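
    The two error measures named in this abstract can be sketched as follows. The abstract does not give their formulas, so the definitions here are assumptions: the mean square error is normalized by the variance of the observed series, which is one common convention, and the sample values are made up.

```python
def mean_absolute_percentage_error(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def normalized_mean_square_error(actual, predicted):
    """Squared error divided by the spread of the observations.

    Assumed convention: sum of squared errors over the sum of squared
    deviations of the observations from their mean.
    """
    mean_actual = sum(actual) / len(actual)
    squared_error = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    spread = sum((a - mean_actual) ** 2 for a in actual)
    return squared_error / spread


# Made-up flux values, for illustration only.
observed = [100.0, 110.0, 120.0]
forecast = [98.0, 112.0, 117.0]
print(mean_absolute_percentage_error(observed, forecast))
print(normalized_mean_square_error(observed, forecast))
```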

  9. Quercetin, a Natural Flavonoid Interacts with DNA, Arrests Cell Cycle and Causes Tumor Regression by Activating Mitochondrial Pathway of Apoptosis

    PubMed Central

    Srivastava, Shikha; Somasagara, Ranganatha R.; Hegde, Mahesh; Nishana, Mayilaadumveettil; Tadi, Satish Kumar; Srivastava, Mrinal; Choudhary, Bibha; Raghavan, Sathees C.

    2016-01-01

    Naturally occurring compounds are considered attractive candidates for cancer treatment and prevention. Quercetin and ellagic acid are naturally occurring flavonoids abundantly seen in several fruits and vegetables. In the present study, we evaluate and compare antitumor efficacies of quercetin and ellagic acid in animal models and cancer cell lines in a comprehensive manner. We found that quercetin induced cytotoxicity in leukemic cells in a dose-dependent manner, while ellagic acid showed only limited toxicity. Besides leukemic cells, quercetin also induced cytotoxicity in breast cancer cells; however, its effect on normal cells was limited or absent. Further, quercetin caused S phase arrest during cell cycle progression in tested cancer cells. Quercetin induced tumor regression in mice at a concentration 3-fold lower than ellagic acid. Importantly, administration of quercetin led to a ~5-fold increase in the life span of tumor-bearing mice compared to that of untreated controls. Further, we found that quercetin interacts with DNA directly, which could be one of the mechanisms for inducing apoptosis in both cancer cell lines and tumor tissues by activating the intrinsic pathway. Thus, our data suggest that quercetin can be further explored for its potential to be used in cancer therapeutics and combination therapy. PMID:27068577

  10. Multivariate Regression with Calibration

    PubMed Central

    Liu, Han; Wang, Lie; Zhao, Tuo

    2014-01-01

    We propose a new method named calibrated multivariate regression (CMR) for fitting high dimensional multivariate regression models. Compared to existing methods, CMR calibrates the regularization for each regression task with respect to its noise level so that it is simultaneously tuning insensitive and achieves an improved finite-sample performance. Computationally, we develop an efficient smoothed proximal gradient algorithm which has a worst-case iteration complexity O(1/ε), where ε is a pre-specified numerical accuracy. Theoretically, we prove that CMR achieves the optimal rate of convergence in parameter estimation. We illustrate the usefulness of CMR by thorough numerical simulations and show that CMR consistently outperforms other high dimensional multivariate regression methods. We also apply CMR on a brain activity prediction problem and find that CMR is as competitive as the handcrafted model created by human experts. PMID:25620861

  11. Green Logistics Management

    NASA Astrophysics Data System (ADS)

    Chang, Yoon S.; Oh, Chang H.

    Nowadays, environmental management has become a critical business consideration for companies seeking to survive numerous regulations and tough business requirements. Most world-leading companies are now aware that environment-friendly technology and management are critical to the sustainable growth of the company. The environment market has seen continuous growth, reaching 532B in 2000 and 590B in 2004, and is expected to grow to 700B in 2010. Environment-friendly efforts are easy to see in almost all aspects of business operations, and such trends are readily found in the logistics area. Green logistics aims to make environmentally friendly decisions throughout a product lifecycle. Therefore, for the success of green logistics, it is critical to have real-time tracking capability on the product throughout its lifecycle, together with a smart solution service architecture. In this chapter, we introduce an RFID-based green logistics solution and service.

  12. Predictive Ability of Pender's Health Promotion Model for Physical Activity and Exercise in People with Spinal Cord Injuries: A Hierarchical Regression Analysis

    ERIC Educational Resources Information Center

    Keegan, John P.; Chan, Fong; Ditchman, Nicole; Chiu, Chung-Yi

    2012-01-01

    The main objective of this study was to validate Pender's Health Promotion Model (HPM) as a motivational model for exercise/physical activity self-management for people with spinal cord injuries (SCIs). Quantitative descriptive research design using hierarchical regression analysis (HRA) was used. A total of 126 individuals with SCI were recruited…

  13. Explorations in Statistics: Regression

    ERIC Educational Resources Information Center

    Curran-Everett, Douglas

    2011-01-01

    Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This seventh installment of "Explorations in Statistics" explores regression, a technique that estimates the nature of the relationship between two things for which we may only surmise a mechanistic or predictive connection.…

  14. Survival Data and Regression Models

    NASA Astrophysics Data System (ADS)

    Grégoire, G.

    2014-12-01

    We start this chapter by introducing some basic elements for the analysis of censored survival data. Then we focus on right-censored data and develop two types of regression models. The first one concerns the so-called accelerated failure time (AFT) models, which are parametric models where a function of a parameter depends linearly on the covariables. The second one is a semiparametric model, where the covariables enter in a multiplicative form in the expression of the hazard rate function. The main statistical tool for analysing these regression models is the maximum likelihood methodology and, although we recall some essential results about the ML theory, we refer to the chapter "Logistic Regression" for a more detailed presentation.
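
    The two model families described in this abstract are commonly written as below. These are the standard textbook forms, given here as an assumption; the chapter's own notation may differ. Here T is the survival time, x the covariable vector, β the regression coefficients, σ a scale parameter, ε an error term with a specified distribution, and λ_0 an unspecified baseline hazard.

```latex
% Accelerated failure time (AFT) model: the log survival time is linear in x
\log T = x^{\top}\beta + \sigma\,\varepsilon

% Semiparametric proportional hazards model: the covariables act
% multiplicatively on the hazard rate function
\lambda(t \mid x) = \lambda_0(t)\,\exp\!\left(x^{\top}\beta\right)
```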

  15. A novel active learning method for support vector regression to estimate biophysical parameters from remotely sensed images

    NASA Astrophysics Data System (ADS)

    Demir, Begüm; Bruzzone, Lorenzo

    2012-11-01

    This paper presents a novel active learning (AL) technique in the context of ɛ-insensitive support vector regression (SVR) to estimate biophysical parameters from remotely sensed images. The proposed AL method aims at selecting the most informative and representative unlabeled samples which have maximum uncertainty, diversity and density assessed according to the SVR estimation rule. This is achieved on the basis of two consecutive steps that rely on kernel k-means clustering. In the first step the most uncertain unlabeled samples are selected by removing the most certain ones from a pool of unlabeled samples. In SVR problems, the most uncertain samples are located outside or on the boundary of the ɛ-tube of SVR, as their target values have the lowest confidence to be correctly estimated. In order to select these samples, kernel k-means clustering is applied to all unlabeled samples together with the training samples that are not SVs (non-SVs), i.e., those that are inside the ɛ-tube. Then, clusters with non-SVs inside are rejected, whereas the unlabeled samples contained in the remaining clusters are selected as the most uncertain samples. In the second step the samples located in the high density regions in the kernel space and as diverse as possible from each other are chosen among the uncertain samples. The density and diversity of the unlabeled samples are evaluated on the basis of their clusters' information. To this end, initially the density of each cluster is measured by the ratio of the number of samples in the cluster to the distance between its two furthest samples. Then, the highest density clusters are chosen and the medoid samples closest to the centers of the selected clusters are chosen as the most informative ones. The diversity of samples is ensured by selecting only one sample from each selected cluster. Experiments applied to the estimation of single-tree parameters, i.e., tree stem volume and tree stem diameter, show the

  16. A regularization corrected score method for nonlinear regression models with covariate error.

    PubMed

    Zucker, David M; Gorfine, Malka; Li, Yi; Tadesse, Mahlet G; Spiegelman, Donna

    2013-03-01

    Many regression analyses involve explanatory variables that are measured with error, and failing to account for this error is well known to lead to biased point and interval estimates of the regression coefficients. We present here a new general method for adjusting for covariate error. Our method consists of an approximate version of the Stefanski-Nakamura corrected score approach, using the method of regularization to obtain an approximate solution of the relevant integral equation. We develop the theory in the setting of classical likelihood models; this setting covers, for example, linear regression, nonlinear regression, logistic regression, and Poisson regression. The method is extremely general in terms of the types of measurement error models covered, and is a functional method in the sense of not involving assumptions on the distribution of the true covariate. We discuss the theoretical properties of the method and present simulation results in the logistic regression setting (univariate and multivariate). For illustration, we apply the method to data from the Harvard Nurses' Health Study concerning the relationship between physical activity and breast cancer mortality in the period following a diagnosis of breast cancer. PMID:23379851

  17. Mechanisms of neuroblastoma regression

    PubMed Central

    Brodeur, Garrett M.; Bagatell, Rochelle

    2014-01-01

    Recent genomic and biological studies of neuroblastoma have shed light on the dramatic heterogeneity in the clinical behaviour of this disease, which spans from spontaneous regression or differentiation in some patients, to relentless disease progression in others, despite intensive multimodality therapy. This evidence also suggests several possible mechanisms to explain the phenomena of spontaneous regression in neuroblastomas, including neurotrophin deprivation, humoral or cellular immunity, loss of telomerase activity and alterations in epigenetic regulation. A better understanding of the mechanisms of spontaneous regression might help to identify optimal therapeutic approaches for patients with these tumours. Currently, the most druggable mechanism is the delayed activation of developmentally programmed cell death regulated by the tropomyosin receptor kinase A pathway. Indeed, targeted therapy aimed at inhibiting neurotrophin receptors might be used in lieu of conventional chemotherapy or radiation in infants with biologically favourable tumours that require treatment. Alternative approaches consist of breaking immune tolerance to tumour antigens or activating neurotrophin receptor pathways to induce neuronal differentiation. These approaches are likely to be most effective against biologically favourable tumours, but they might also provide insights into treatment of biologically unfavourable tumours. We describe the different mechanisms of spontaneous neuroblastoma regression and the consequent therapeutic approaches. PMID:25331179

  18. Regression: A Bibliography.

    ERIC Educational Resources Information Center

    Pedrini, D. T.; Pedrini, Bonnie C.

    Regression, another mechanism studied by Sigmund Freud, has had much research, e.g., hypnotic regression, frustration regression, schizophrenic regression, and infra-human-animal regression (often directly related to fixation). Many investigators worked with hypnotic age regression, which has a long history, going back to Russian reflexologists.…

  19. Logistics Modeling for Lunar Exploration Systems

    NASA Technical Reports Server (NTRS)

    Andraschko, Mark R.; Merrill, R. Gabe; Earle, Kevin D.

    2008-01-01

    The extensive logistics required to support extended crewed operations in space make effective modeling of logistics requirements and deployment critical to predicting the behavior of human lunar exploration systems. This paper discusses the software that has been developed as part of the Campaign Manifest Analysis Tool in support of strategic analysis activities under the Constellation Architecture Team - Lunar. The described logistics module enables definition of logistics requirements across multiple surface locations and allows for the transfer of logistics between those locations. A key feature of the module is the loading algorithm that is used to efficiently load logistics by type into carriers and then onto landers. Attention is given to the capabilities and limitations of this loading algorithm, particularly with regard to surface transfers. These capabilities are described within the context of the object-oriented software implementation, with details provided on the applicability of using this approach to model other human exploration scenarios. Some challenges of incorporating probabilistics into this type of logistics analysis model are discussed at a high level.

  20. ISS Logistics Hardware Disposition and Metrics Validation

    NASA Technical Reports Server (NTRS)

    Rogers, Toneka R.

    2010-01-01

    I was assigned to the Logistics Division of the International Space Station (ISS)/Spacecraft Processing Directorate. The Division consists of eight NASA engineers and specialists who oversee the logistics portion of the Checkout, Assembly, and Payload Processing Services (CAPPS) contract. Boeing, their sub-contractors and the Boeing Prime contract out of Johnson Space Center provide the Integrated Logistics Support for the ISS activities at Kennedy Space Center. Essentially they ensure that spares are available to support flight hardware processing and the associated ground support equipment (GSE). Boeing maintains a Depot for electrical, mechanical and structural modifications and/or repair capability as required. My assigned task was to learn project management techniques utilized by NASA and its contractors to provide an efficient and effective logistics support infrastructure to the ISS program. Within the Space Station Processing Facility (SSPF) I was exposed to Logistics support components, such as the NASA Spacecraft Services Depot (NSSD) capabilities, Mission Processing tools, techniques and Warehouse support issues, required for integrating Space Station elements at the Kennedy Space Center. I also supported the identification of near-term ISS Hardware and Ground Support Equipment (GSE) candidates for excessing/disposition prior to October 2010; and the validation of several Logistics Metrics used by the contractor to measure logistics support effectiveness.

  1. Processes and procedures for a worldwide biological samples distribution; product assurance and logistic activities to support the mice drawer system tissue sharing event

    NASA Astrophysics Data System (ADS)

    Benassai, Mario; Cotronei, Vittorio

    The Mice Drawer System (MDS) is a scientific payload developed by the Italian Space Agency (ASI); it hosted 6 mice on the International Space Station (ISS) and returned to the ground on November 28, 2009 with STS-129 at KSC. Linked to the MDS experiment, a Tissue Sharing Program (TSP) was developed in order to make the biological samples coming from the mice available to 16 Payload Investigators (PI) (located in the USA, Canada, the EU (Italy, Belgium and Germany) and Japan). ALTEC SpA (a PPP owned by ASI, TAS-I and local institutions) was responsible for supporting the logistics aspects of the MDS samples for the first MDS mission, in the frame of the Italian Space Agency (ASI) OSMA program (OSteoporosis and Muscle Atrophy). The TSP resulted in a complex scenario, as ASI progressively extended the original OSMA Team to researchers from other ASI programs and from other Agencies (ESA, NASA, JAXA). The science coordination was performed by the University of Genova (UNIGE). ALTEC managed the entire logistics process with the support of a specialized freight forwarder agent throughout the shipping operation phases. ALTEC formalized all the steps, from the handover of samples by the dissection Team to the packaging and shipping process, in a dedicated procedure. ALTEC approached all the work in a structured way, performing: A study of the aspects connected to international shipments of biological samples. A cooperative work with UNIGE/ASI/PIs to identify all the needs of the various researchers and their compatibility. A complete revision and integration of shipment requirements (addresses, temperatures, samples, materials and so on). A complete definition of the final shipment scenario in terms of boxes, content, refrigerant and requirements. A formal approach to identification and selection of the most suited and specialized Freight Forwarder. A clear identification of all the processes from sample dissection by PI Team, sample processing, freezing, tube preparation

  2. Mini pressurized logistics module (MPLM)

    NASA Astrophysics Data System (ADS)

    Vallerani, E.; Brondolo, D.; Basile, L.

    1996-06-01

    The MPLM Program was initiated through a Memorandum of Understanding (MOU) between the United States' National Aeronautics and Space Administration (NASA) and Italy's ASI, the Italian Space Agency, that was signed on 6 December 1991. The MPLM is a pressurized logistics module that will be used to transport supplies and materials (up to 20,000 lb), including user experiments, between Earth and International Space Station Alpha (ISSA) using the Shuttle, to support active and passive storage, and to provide a habitable environment for two people when docked to the Station. The Italian Space Agency has selected Alenia Spazio to develop MPLM modules that have always been considered a key element for the new International Space Station taking benefit from its design flexibility and consequent possible cost saving based on the maximum utilization of the Shuttle launch capability for any mission. In the frame of the very recent agreement between the U.S. and Russia for cooperation in space, that foresees the utilization of MIR 1 hardware, the Italian MPLM will remain an important element of the logistics system, being the only pressurized module designed for re-entry. Within the new scenario of anticipated Shuttle flights to MIR 1 during Space Station phase 1, MPLM remains a candidate for one or more missions to provide MIR 1 resupply capabilities and advanced ISSA hardware/procedures verification. Based on the concept of Flexible Carriers, Alenia Spazio is providing NASA with three MPLM flight units that can be configured according to the requirements of the Human-Tended Capability (HTC) and Permanent Human Capability (PHC) of the Space Station. Configurability will allow transportation of passive cargo only, or a combination of passive and cold cargo accommodated in R/F racks. Having developed and qualified the baseline configuration with respect to the worst enveloping condition, each unit could be easily configured to the passive or active version depending upon the

  3. Space Shuttle Orbiter logistics - Managing in a dynamic environment

    NASA Technical Reports Server (NTRS)

    Renfroe, Michael B.; Bradshaw, Kimberly

    1990-01-01

    The importance and methods of monitoring logistics vital signs, logistics data sources and acquisition, and converting data into useful management information are presented. With the launch and landing site for the Shuttle Orbiter project at the Kennedy Space Center now totally responsible for its own supportability posture, it is imperative that logistics resource requirements and management be continually monitored and reassessed. Detailed graphs and data concerning various aspects of logistics activities including objectives, inventory operating levels, customer environment, and data sources are provided. Finally, some lessons learned from the Shuttle Orbiter project and logistics options which should be considered by other space programs are discussed.

  4. Quantitative structure-property relationship (QSPR) for the adsorption of organic compounds onto activated carbon cloth: Comparison between multiple linear regression and neural network

    SciTech Connect

    Brasquet, C.; Bourges, B.; Le Cloirec, P.

    1999-12-01

    The adsorption of 55 organic compounds is carried out onto a recently discovered adsorbent, activated carbon cloth. Isotherms are modeled using the classical Freundlich model, and the large database generated allows qualitative assumptions about the adsorption mechanism. However, to confirm these assumptions, a quantitative structure-property relationship methodology is used to assess the correlations between an adsorbability parameter (expressed using the Freundlich parameter K) and topological indices related to the compounds' molecular structure (molecular connectivity indices, MCI). This correlation is set up by means of two different statistical tools, multiple linear regression (MLR) and neural network (NN). A principal component analysis is carried out to generate new and uncorrelated variables. It enables the relations between the MCI to be analyzed, but the multiple linear regression assessed using the principal components (PCs) has a poor statistical quality and introduces high-order PCs, too inaccurate for an explanation of the adsorption mechanism. The correlations are thus set up using the original variables (MCI), and both statistical tools, multiple linear regression and neural network, are compared from a descriptive and predictive point of view. To compare the predictive ability of both methods, a test database of 10 organic compounds is used.
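
    The Freundlich fit mentioned in this record can be illustrated with a minimal sketch. It assumes the standard form q = K * C^(1/n), which becomes a straight line after taking logarithms: log q = log K + (1/n) log C. The function name, the variable names and the synthetic data points are hypothetical and are not taken from the paper.

```python
import math

def fit_freundlich(concentrations, loadings):
    """Least-squares fit of log q = log K + (1/n) log C; returns (K, n)."""
    xs = [math.log(c) for c in concentrations]
    ys = [math.log(q) for q in loadings]
    m = len(xs)
    x_bar = sum(xs) / m
    y_bar = sum(ys) / m
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = s_xy / s_xx                # estimate of 1/n
    intercept = y_bar - slope * x_bar  # estimate of log K
    return math.exp(intercept), 1.0 / slope

# Synthetic, noise-free data generated with K = 2 and n = 2 (illustrative only).
conc = [0.5, 1.0, 2.0, 4.0, 8.0]
load = [2.0 * c ** 0.5 for c in conc]
K_hat, n_hat = fit_freundlich(conc, load)
```

    With noise-free input the fit recovers the generating parameters; with measured isotherms the residuals of the log-log line give a first check on whether the Freundlich form is adequate.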

  5. Kernel Continuum Regression.

    PubMed

    Lee, Myung Hee; Liu, Yufeng

    2013-12-01

    The continuum regression technique provides an appealing regression framework connecting ordinary least squares, partial least squares and principal component regression in one family. It offers some insight on the underlying regression model for a given application. Moreover, it helps to provide deep understanding of various regression techniques. Despite the useful framework, however, the current development on continuum regression is only for linear regression. In many applications, nonlinear regression is necessary. The extension of continuum regression from linear models to nonlinear models using kernel learning is considered. The proposed kernel continuum regression technique is quite general and can handle very flexible regression model estimation. An efficient algorithm is developed for fast implementation. Numerical examples have demonstrated the usefulness of the proposed technique. PMID:24058224

  6. Logistics of Guinea Worm Disease Eradication in South Sudan

    PubMed Central

    Jones, Alexander H.; Becknell, Steven; Withers, P. Craig; Ruiz-Tiben, Ernesto; Hopkins, Donald R.; Stobbelaar, David; Makoy, Samuel Yibi

    2014-01-01

    From 2006 to 2012, the South Sudan Guinea Worm Eradication Program reduced new Guinea worm disease (dracunculiasis) cases by over 90%, despite substantial programmatic challenges. Program logistics have played a key role in program achievements to date. The program uses disease surveillance and program performance data and integrated technical–logistical staffing to maintain flexible and effective logistical support for active community-based surveillance and intervention delivery in thousands of remote communities. Lessons learned from logistical design and management can resonate across similar complex surveillance and public health intervention delivery programs, such as mass drug administration for the control of neglected tropical diseases and other disease eradication programs. Logistical challenges in various public health scenarios and the pivotal contribution of logistics to Guinea worm case reductions in South Sudan underscore the need for additional inquiry into the role of logistics in public health programming in low-income countries. PMID:24445199

  7. NASA Space Exploration Logistics Workshop Proceedings

    NASA Technical Reports Server (NTRS)

    deWeek, Oliver; Evans, William A.; Parrish, Joe; James, Sarah

    2006-01-01

    As NASA has embarked on a new Vision for Space Exploration, there is new energy and focus around the area of manned space exploration. These activities encompass the design of new vehicles such as the Crew Exploration Vehicle (CEV) and Crew Launch Vehicle (CLV) and the identification of commercial opportunities for space transportation services, as well as continued operations of the Space Shuttle and the International Space Station. Reaching the Moon and eventually Mars with a mix of both robotic and human explorers for short term missions is a formidable challenge in itself. How to achieve this in a safe, efficient and long-term sustainable way is yet another question. The challenge is not only one of vehicle design, launch, and operations but also one of space logistics. Oftentimes, logistical issues are not given enough consideration upfront, in relation to the large share of operating budgets they consume. In this context, a group of 54 experts in space logistics met for a two-day workshop to discuss the following key questions: 1. What is the current state of the art in space logistics, in terms of architectures, concepts, technologies as well as enabling processes? 2. What are the main challenges for space logistics for future human exploration of the Moon and Mars, at the intersection of engineering and space operations? 3. What lessons can be drawn from past successes and failures in human space flight logistics? 4. What lessons and connections do we see from terrestrial analogies as well as activities in other areas, such as U.S. military logistics? 5. What key advances are required to enable long-term success in the context of a future interplanetary supply chain? These proceedings summarize the outcomes of the workshop, reference particular presentations, panels and breakout sessions, and record specific observations that should help guide future efforts.

  8. Examining Non-Linear Associations between Accelerometer-Measured Physical Activity, Sedentary Behavior, and All-Cause Mortality Using Segmented Cox Regression

    PubMed Central

    Lee, Paul H.

    2016-01-01

    Healthy adults are advised to perform at least 150 min of moderate-intensity physical activity weekly, but this advice is based on studies using self-reports of questionable validity. This study examined the dose-response relationship of accelerometer-measured physical activity and sedentary behaviors on all-cause mortality using segmented Cox regression to empirically determine the break-points of the dose-response relationship. Data from 7006 adult participants aged 18 or above in the National Health and Nutrition Examination Survey waves 2003–2004 and 2005–2006 were included in the analysis and linked with death certificate data using a probabilistic matching approach in the National Death Index through December 31, 2011. Physical activity and sedentary behavior were measured using an ActiGraph model 7164 accelerometer worn over the right hip for 7 consecutive days. Minutes with accelerometer counts <100, 1952–5724, and ≥5725 were classified as sedentary, moderate-intensity physical activity, and vigorous-intensity physical activity, respectively. Segmented Cox regression was used to estimate the hazard ratio (HR) of time spent in sedentary behaviors, moderate-intensity physical activity, and vigorous-intensity physical activity and all-cause mortality, adjusted for demographic characteristics, health behaviors, and health conditions. Data were analyzed in 2016. During 47,119 person-years of follow-up, 608 deaths occurred. Each additional hour per day of sedentary behaviors was associated with a HR of 1.15 (95% CI 1.01, 1.31) among participants who spend at least 10.9 h per day on sedentary behaviors, and each additional minute per day spent on moderate-intensity physical activity was associated with a HR of 0.94 (95% CI 0.91, 0.96) among participants with daily moderate-intensity physical activity ≤14.1 min. Associations of moderate physical activity and sedentary behaviors on all-cause mortality were independent of each other. To conclude, evidence from
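
    The per-minute classification quoted in this record can be sketched as follows. The cut-points (<100, 1952–5724, ≥5725) come from the abstract; the function names, the "unclassified" label for counts of 100–1951 (which the abstract does not assign to a category) and the sample counts are assumptions made for illustration.

```python
from collections import Counter

def classify_minute(count):
    """Label one accelerometer minute using the cut-points quoted in the abstract."""
    if count < 100:
        return "sedentary"
    if 1952 <= count <= 5724:
        return "moderate"
    if count >= 5725:
        return "vigorous"
    return "unclassified"  # counts of 100-1951: no category given in the abstract

def summarize_minutes(counts):
    """Tally how many minutes fall into each category."""
    return Counter(classify_minute(c) for c in counts)

# Hypothetical per-minute counts, chosen to hit every boundary.
sample_counts = [0, 50, 99, 100, 1500, 1952, 3000, 5724, 5725, 9000]
summary = summarize_minutes(sample_counts)
```

    Per-day totals of these categories would be the exposure variables entering a survival model; the segmented Cox regression itself is not reproduced here.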

  9. Space Shuttle operational logistics plan

    NASA Technical Reports Server (NTRS)

    Botts, J. W.

    1983-01-01

    The Kennedy Space Center plan for logistics to support Space Shuttle Operations and to establish the related policies, requirements, and responsibilities is described. The Directorate of Shuttle Management and Operations logistics responsibilities required by the Kennedy Organizational Manual, and the self-sufficiency contracting concept are implemented. The Space Shuttle Program Level 1 and Level 2 logistics policies and requirements applicable to KSC that are presented in HQ NASA and Johnson Space Center directives are also implemented.

  10. Jieke theory and logistic model

    SciTech Connect

    Cao, H.; Feng, G.

    1996-06-01

    The concept of a shell, or JIEKE (in Chinese), is first introduced; a jieke is a kind of system boundary. Based on jieke theory, a new logistic model that takes account of the switch effect of the jieke is suggested. The model is analyzed and its nonlinear mapping is constructed. The results show that the features of the switch logistic model differ greatly from those of the original logistic model. © 1996 American Institute of Physics.

  11. Technical issues: logistics. AAMC.

    PubMed

    Stillman, P L

    1993-06-01

    The author states that she became interested in standardized patients (SPs) around 20 years ago as a means of developing a more uniform and effective way to provide instruction and evaluation of basic clinical skills. She reflects upon in detail: (1) the logistics of using SPs in teaching; (2) how SPs are used in assessment; (3) what aspects of performance SPs can be trained to record and evaluate; (4) issues concerning checklists; (5) evaluation of interviewing skills; (6) evaluation of written communication skills; (7) importance of defining what is being tested; (8) various kinds and uses of inter-station exercises and problems of scoring them; (9) case development and the various sources for case material; (10) ways to generate scores; (11) selecting and training SPs; (12) role of the faculty and primary importance of bedside training with real patients; and (13) pros and cons of national versus single-school efforts to use SPs. She concludes by cautioning that further research must be done before SPs can be used for high-stakes certifying and licensing examinations. PMID:8507311

  12. A note on Verhulst's logistic equation and related logistic maps

    NASA Astrophysics Data System (ADS)

    Ranferi Gutiérrez, M.; Reyes, M. A.; Rosu, H. C.

    2010-05-01

    We consider the Verhulst logistic equation and a couple of forms of the corresponding logistic maps. For the case of the logistic equation we show that using the general Riccati solution only changes the initial conditions of the equation. Next, we consider two forms of corresponding logistic maps reporting the following results. For the map $x_{n+1} = r x_n (1 - x_n)$ we propose a new way to write the solution for $r = -2$ which allows better precision of the iterative terms, while for the map $x_{n+1} - x_n = r x_n (1 - x_{n+1})$ we show that it behaves identically to the logistic equation from the standpoint of the general Riccati solution, which is also provided herein for any value of the parameter $r$.
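
    The two maps in this record can be iterated with a short sketch. The second map is implicit in x_{n+1}; solving it gives x_{n+1} = (1 + r) x_n / (1 + r x_n). The closed form used for comparison, x_n = 1 / (1 + (1/x_0 - 1)(1 + r)^(-n)), is derived here by substituting u_n = 1/x_n and is not quoted from the paper; the function names and parameter values are likewise illustrative assumptions.

```python
def logistic_map(r, x0, steps):
    """Iterate x_{n+1} = r * x_n * (1 - x_n)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(r * x * (1.0 - x))
    return xs

def implicit_map(r, x0, steps):
    """Iterate x_{n+1} - x_n = r * x_n * (1 - x_{n+1}), solved for x_{n+1}."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append((1.0 + r) * x / (1.0 + r * x))
    return xs

def implicit_closed_form(r, x0, n):
    """Logistic-curve solution of the implicit map (derived via u_n = 1/x_n)."""
    return 1.0 / (1.0 + (1.0 / x0 - 1.0) * (1.0 + r) ** (-n))

orbit = implicit_map(0.5, 0.1, 20)
closed = [implicit_closed_form(0.5, 0.1, n) for n in range(21)]
```

    The iterates of the implicit map follow a sampled logistic curve and approach 1 monotonically for r > 0, whereas the explicit map can oscillate or become chaotic depending on r.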

  13. Logistics Lessons Learned in NASA Space Flight

    NASA Technical Reports Server (NTRS)

    Evans, William A.; DeWeck, Olivier; Laufer, Deanna; Shull, Sarah

    2006-01-01

    Information System (LLIS) and verified that we received the same result using the internal version of LLIS for our logistics lesson searches. In conducting the research, information from multiple databases was consolidated into a single spreadsheet of 300 lessons learned. Keywords were applied for the purpose of sorting and evaluation. Once the lessons had been compiled, an analysis of the resulting data was performed, first sorting it by keyword, then finding duplication and root cause, and finally sorting by root cause. The data was then distilled into the top 7 lessons learned across programs, centers, and activities.

  14. Multiple linear regression.

    PubMed

    Eberly, Lynn E

    2007-01-01

    This chapter describes multiple linear regression, a statistical approach used to describe the simultaneous associations of several variables with one continuous outcome. Important steps in using this approach include estimation and inference, variable selection in model building, and assessing model fit. The special cases of regression with interactions among the variables, polynomial regression, regressions with categorical (grouping) variables, and separate slopes models are also covered. Examples in microbiology are used throughout. PMID:18450050

  15. NCCS Regression Test Harness

    2015-09-09

    The NCCS Regression Test Harness is a software package that provides a framework to perform regression and acceptance testing on NCCS High Performance Computers. The package is written in Python and has only the dependency of a Subversion repository to store the regression tests.

  16. Orthogonal Regression and Equivariance.

    ERIC Educational Resources Information Center

    Blankmeyer, Eric

    Ordinary least-squares regression treats the variables asymmetrically, designating a dependent variable and one or more independent variables. When it is not obvious how to make this distinction, a researcher may prefer to use orthogonal regression, which treats the variables symmetrically. However, the usual procedure for orthogonal regression is…
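
    The asymmetry described in this record can be seen in a small two-variable sketch. It uses the textbook closed-form slope for orthogonal (total least squares) regression in two dimensions; the function names and the data are hypothetical, and this is not necessarily the procedure the author goes on to discuss.

```python
import math

def _moments(x, y):
    """Centered sums of squares and cross-products."""
    m = len(x)
    x_bar = sum(x) / m
    y_bar = sum(y) / m
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_yy = sum((b - y_bar) ** 2 for b in y)
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    return s_xx, s_yy, s_xy

def ols_slope(x, y):
    """Slope from regressing y on x (minimizes vertical distances)."""
    s_xx, _, s_xy = _moments(x, y)
    return s_xy / s_xx

def orthogonal_slope(x, y):
    """Slope minimizing perpendicular distances; requires s_xy != 0."""
    s_xx, s_yy, s_xy = _moments(x, y)
    diff = s_yy - s_xx
    return (diff + math.sqrt(diff ** 2 + 4.0 * s_xy ** 2)) / (2.0 * s_xy)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1]
ols_product = ols_slope(x, y) * ols_slope(y, x)                  # squared correlation
orth_product = orthogonal_slope(x, y) * orthogonal_slope(y, x)   # exactly reciprocal
```

    Swapping the roles of the variables gives reciprocal slopes under orthogonal regression, so both fits describe the same line, while the two OLS slopes multiply to the squared correlation and therefore describe different lines unless the fit is perfect.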

  17. Semiparametric regression during 2003–2007*

    PubMed Central

    Ruppert, David; Wand, M.P.; Carroll, Raymond J.

    2010-01-01

    Semiparametric regression is a fusion between parametric regression and nonparametric regression that integrates low-rank penalized splines, mixed model and hierarchical Bayesian methodology – thus allowing more streamlined handling of longitudinal and spatial correlation. We review progress in the field over the five-year period between 2003 and 2007. We find semiparametric regression to be a vibrant field with substantial involvement and activity, continual enhancement and widespread application. PMID:20305800

  18. Applying waste logistics modeling to regional planning

    SciTech Connect

    Holter, G.M.; Khawaja, A.; Shaver, S.R.; Peterson, K.L.

    1995-05-01

    Waste logistics modeling is a powerful analytical technique that can be used for effective planning of future solid waste storage, treatment, and disposal activities. Proper waste management is essential for preventing unacceptable environmental degradation from ongoing operations, and is also a critical part of any environmental remediation activity. Logistics modeling allows for analysis of alternate scenarios for future waste flowrates and routings, facility schedules, and processing or handling capacities. Such analyses provide an increased understanding of the critical needs for waste storage, treatment, transport, and disposal while there is still adequate lead time to plan accordingly. They also provide a basis for determining the sensitivity of these critical needs to the various system parameters. This paper discusses the application of waste logistics modeling concepts to regional planning. In addition to ongoing efforts to aid in planning for a large industrial complex, the Pacific Northwest Laboratory (PNL) is currently involved in implementing waste logistics modeling as part of the planning process for material recovery and recycling within a multi-city region in the western US.

  19. Logistic Discriminant Function Analysis for DIF Identification of Polytomously Scored Items.

    ERIC Educational Resources Information Center

    Miller, Timothy R.; Spray, Judith A.

    1993-01-01

    Presents logistic discriminant analysis as a means of detecting differential item functioning (DIF) in items that are polytomously scored. Provides examples of DIF detection using a 27-item mathematics test with 1,977 examinees. The proposed method is simpler and more practical than polytomous extensions of the logistic regression DIF procedure.…

  20. Support vector regression-guided unravelling: antioxidant capacity and quantitative structure-activity relationship predict reduction and promotion effects of flavonoids on acrylamide formation.

    PubMed

    Huang, Mengmeng; Wei, Yan; Wang, Jun; Zhang, Yu

    2016-01-01

    We used the support vector regression (SVR) approach to predict and unravel reduction/promotion effect of characteristic flavonoids on the acrylamide formation under a low-moisture Maillard reaction system. Results demonstrated the reduction/promotion effects by flavonoids at addition levels of 1-10000 μmol/L. The maximal inhibition rates (51.7%, 68.8% and 26.1%) and promotion rates (57.7%, 178.8% and 27.5%) caused by flavones, flavonols and isoflavones were observed at addition levels of 100 μmol/L and 10000 μmol/L, respectively. The reduction/promotion effects were closely related to the change of trolox equivalent antioxidant capacity (ΔTEAC) and well predicted by triple ΔTEAC measurements via SVR models (R: 0.633-0.900). Flavonols exhibit stronger effects on the acrylamide formation than flavones and isoflavones as well as their O-glycoside derivatives, which may be attributed to the number and position of phenolic and 3-enolic hydroxyls. The reduction/promotion effects were well predicted by using optimized quantitative structure-activity relationship (QSAR) descriptors and SVR models (R: 0.926-0.994). Compared to artificial neural network and multi-linear regression models, SVR models exhibited better fitting performance for both TEAC-dependent and QSAR descriptor-dependent predicting work. These observations demonstrated that the SVR models are competent for predicting our understanding on the future use of natural antioxidants for decreasing the acrylamide formation. PMID:27586851

  1. Poxvirus-Based Active Immunotherapy with PD-1 and LAG-3 Dual Immune Checkpoint Inhibition Overcomes Compensatory Immune Regulation, Yielding Complete Tumor Regression in Mice

    PubMed Central

    dela Cruz, Tracy; Cote, Joseph J.; Gordon, Evan J.; Kemp, Felicia; Xavier, Veronica; Franzusoff, Alex; Rountree, Ryan B.; Mandl, Stefanie J.

    2016-01-01

    Poxvirus-based active immunotherapies mediate anti-tumor efficacy by triggering broad and durable Th1 dominated T cell responses against the tumor. While monotherapy significantly delays tumor growth, it often does not lead to complete tumor regression. It was hypothesized that the induced robust infiltration of IFNγ-producing T cells into the tumor could provoke an adaptive immune evasive response by the tumor through the upregulation of PD-L1 expression. In therapeutic CT26-HER-2 tumor models, MVA-BN-HER2 poxvirus immunotherapy resulted in significant tumor growth delay accompanied by a robust, tumor-infiltrating T cell response that was characterized by low to mid-levels of PD-1 expression on T cells. As hypothesized, this response was countered by significantly increased PD-L1 expression on the tumor and, unexpectedly, also on infiltrating T cells. Synergistic benefit of anti-tumor therapy was observed when MVA-BN-HER2 immunotherapy was combined with PD-1 immune checkpoint blockade. Interestingly, PD-1 blockade stimulated a second immune checkpoint molecule, LAG-3, to be expressed on T cells. Combining MVA-BN-HER2 immunotherapy with dual PD-1 plus LAG-3 blockade resulted in comprehensive tumor regression in all mice treated with the triple combination therapy. Subsequent rejection of tumors lacking the HER-2 antigen by treatment-responsive mice without further therapy six months after the original challenge demonstrated long lasting memory and suggested that effective T cell immunity to novel, non-targeted tumor antigens (antigen spread) had occurred. These data support the clinical investigation of this triple therapy regimen, especially in cancer patients harboring PD-L1neg/low tumors unlikely to benefit from immune checkpoint blockade alone. PMID:26910562

  2. Support vector regression-guided unravelling: antioxidant capacity and quantitative structure-activity relationship predict reduction and promotion effects of flavonoids on acrylamide formation

    PubMed Central

    Huang, Mengmeng; Wei, Yan; Wang, Jun; Zhang, Yu

    2016-01-01

    We used the support vector regression (SVR) approach to predict and unravel reduction/promotion effect of characteristic flavonoids on the acrylamide formation under a low-moisture Maillard reaction system. Results demonstrated the reduction/promotion effects by flavonoids at addition levels of 1–10000 μmol/L. The maximal inhibition rates (51.7%, 68.8% and 26.1%) and promotion rates (57.7%, 178.8% and 27.5%) caused by flavones, flavonols and isoflavones were observed at addition levels of 100 μmol/L and 10000 μmol/L, respectively. The reduction/promotion effects were closely related to the change of trolox equivalent antioxidant capacity (ΔTEAC) and well predicted by triple ΔTEAC measurements via SVR models (R: 0.633–0.900). Flavonols exhibit stronger effects on the acrylamide formation than flavones and isoflavones as well as their O-glycoside derivatives, which may be attributed to the number and position of phenolic and 3-enolic hydroxyls. The reduction/promotion effects were well predicted by using optimized quantitative structure-activity relationship (QSAR) descriptors and SVR models (R: 0.926–0.994). Compared to artificial neural network and multi-linear regression models, SVR models exhibited better fitting performance for both TEAC-dependent and QSAR descriptor-dependent predicting work. These observations demonstrated that the SVR models are competent for predicting our understanding on the future use of natural antioxidants for decreasing the acrylamide formation. PMID:27586851

  3. Assessing risk factors for periodontitis using regression

    NASA Astrophysics Data System (ADS)

    Lobo Pereira, J. A.; Ferreira, Maria Cristina; Oliveira, Teresa

    2013-10-01

    Multivariate statistical analysis is indispensable to assess the associations and interactions between different factors and the risk of periodontitis. Among others, regression analysis is a statistical technique widely used in healthcare to investigate and model the relationship between variables. In our work we study the impact of socio-demographic, medical and behavioral factors on periodontal health. Using linear and logistic regression models, we can assess the relevance, as risk factors for periodontitis, of the following independent variables (IVs): Age, Gender, Diabetic Status, Education, Smoking status and Plaque Index. The multiple linear regression model was built to evaluate the influence of the IVs on mean Attachment Loss (AL). Thus, the regression coefficients will be obtained along with the respective p-values from the significance tests. In the logistic model, an individual was classified as a case according to the extent of the destruction of periodontal tissues, defined by an Attachment Loss greater than or equal to 4 mm in at least 25% of the sites surveyed (AL≥4mm/≥25%). The association measures include the Odds Ratios together with the corresponding 95% confidence intervals.
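
    The case definition and the odds-ratio summary described in this record can be sketched as below. The thresholds (4 mm, 25% of sites) come from the abstract; the function names, the unadjusted 2x2 table with a Wald interval, and all counts are hypothetical. The study itself reports odds ratios from a logistic model, which this sketch does not reproduce.

```python
import math

def is_case(site_al_mm, threshold_mm=4.0, min_fraction=0.25):
    """True if at least min_fraction of sites have AL >= threshold_mm."""
    affected = sum(1 for al in site_al_mm if al >= threshold_mm)
    return affected / len(site_al_mm) >= min_fraction

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald CI for a 2x2 table.

    a: exposed cases, b: exposed non-cases,
    c: unexposed cases, d: unexposed non-cases.
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Hypothetical counts, e.g. smokers vs non-smokers (not data from the study).
or_hat, ci_low, ci_high = odds_ratio_ci(30, 70, 15, 85)
```

    An interval that excludes 1 would suggest an association at the 5% level in this unadjusted setting; a fitted logistic model would additionally adjust for the other IVs.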

  4. Heteroscedastic transformation cure regression models.

    PubMed

    Chen, Chyong-Mei; Chen, Chen-Hsin

    2016-06-30

    Cure models have been applied to analyze clinical trials with cures and age-at-onset studies with nonsusceptibility. Lu and Ying (On semiparametric transformation cure model. Biometrika 2004; 91:331-343. DOI: 10.1093/biomet/91.2.331) developed a general class of semiparametric transformation cure models, which assumes that the failure times of uncured subjects, after an unknown monotone transformation, follow a regression model with homoscedastic residuals. However, it cannot deal with frequently encountered heteroscedasticity, which may result from dispersed ranges of failure time span among uncured subjects' strata. To tackle the phenomenon, this article presents semiparametric heteroscedastic transformation cure models. The cure status and the failure time of an uncured subject are fitted by a logistic regression model and a heteroscedastic transformation model, respectively. Unlike the approach of Lu and Ying, we derive score equations from the full likelihood for estimating the regression parameters in the proposed model. A martingale difference function similar to their proposal is used to estimate the infinite-dimensional transformation function. Our proposed estimating approach is intuitively applicable and can be conveniently extended to other complicated models when the maximization of the likelihood may be too tedious to be implemented. We conduct simulation studies to validate large-sample properties of the proposed estimators and to compare with the approach of Lu and Ying via the relative efficiency. The estimating method and the two relevant goodness-of-fit graphical procedures are illustrated by using breast cancer data and melanoma data. Copyright © 2016 John Wiley & Sons, Ltd. PMID:26887342

  5. Targeting Attenuated Interferon-α to Myeloma Cells with a CD38 Antibody Induces Potent Tumor Regression with Reduced Off-Target Activity.

    PubMed

    Pogue, Sarah L; Taura, Tetsuya; Bi, Mingying; Yun, Yong; Sho, Angela; Mikesell, Glen; Behrens, Collette; Sokolovsky, Maya; Hallak, Hussein; Rosenstock, Moti; Sanchez, Eric; Chen, Haiming; Berenson, James; Doyle, Anthony; Nock, Steffen; Wilson, David S

    2016-01-01

    Interferon-α (IFNα) has been prescribed to effectively treat multiple myeloma (MM) and other malignancies for decades. Its use has waned in recent years, however, due to significant toxicity and a narrow therapeutic index (TI). We sought to improve IFNα's TI by, first, attaching it to an anti-CD38 antibody, thereby directly targeting it to MM cells, and, second, by introducing an attenuating mutation into the IFNα portion of the fusion protein rendering it relatively inactive on normal, CD38 negative cells. This anti-CD38-IFNα(attenuated) immunocytokine, or CD38-Attenukine™, exhibits 10,000-fold increased specificity for CD38 positive cells in vitro compared to native IFNα and, significantly, is ~6,000-fold less toxic to normal bone marrow cells in vitro than native IFNα. Moreover, the attenuating mutation significantly decreases IFNα biomarker activity in cynomolgus macaques indicating that this approach may yield a better safety profile in humans than native IFNα or a non-attenuated IFNα immunocytokine. In human xenograft MM tumor models, anti-CD38-IFNα(attenuated) exerts potent anti-tumor activity in mice, inducing complete tumor regression in most cases. Furthermore, anti-CD38-IFNα(attenuated) is more efficacious than standard MM treatments (lenalidomide, bortezomib, dexamethasone) and exhibits strong synergy with lenalidomide and with bortezomib in xenograft models. Our findings suggest that tumor-targeted attenuated cytokines such as IFNα can promote robust tumor killing while minimizing systemic toxicity. PMID:27611189

  6. Tailored logistics: the next advantage.

    PubMed

    Fuller, J B; O'Conor, J; Rawlinson, R

    1993-01-01

    How many top executives have ever visited with managers who move materials from the factory to the store? How many still reduce the costs of logistics to the rent of warehouses and the fees charged by common carriers? To judge by hours of senior management attention, logistics problems do not rank high. But logistics have the potential to become the next governing element of strategy. Whether they know it or not, senior managers of every retail store and diversified manufacturing company compete in logistically distinct businesses. Customer needs vary, and companies can tailor their logistics systems to serve their customers better and more profitably. Companies do not create value for customers and sustainable advantage for themselves merely by offering varieties of goods. Rather, they offer goods in distinct ways. A particular can of Coca-Cola, for example, might be a can of Coca-Cola going to a vending machine, or a can of Coca-Cola that comes with billing services. There is a fortune buried in this distinction. The goal of logistics strategy is building distinct approaches to distinct groups of customers. The first step is organizing a cross-functional team to proceed through the following steps: segmenting customers according to purchase criteria, establishing different standards of service for different customer segments, tailoring logistics pipelines to support each segment, and creating economies of scale to determine which assets can be shared among various pipelines. The goal of establishing logistically distinct businesses is familiar: improved knowledge of customers and improved means of satisfying them. PMID:10126157

  7. NASA Space Rocket Logistics Challenges

    NASA Technical Reports Server (NTRS)

    Neeley, James R.; Jones, James V.; Watson, Michael D.; Bramon, Christopher J.; Inman, Sharon K.; Tuttle, Loraine

    2014-01-01

    The Space Launch System (SLS) is the new NASA heavy lift launch vehicle and is scheduled for its first mission in 2017. The goal of the first mission, which will be uncrewed, is to demonstrate the integrated system performance of the SLS rocket and spacecraft before a crewed flight in 2021. SLS has many of the same logistics challenges as any other large scale program. Common logistics concerns for SLS include integration of discrete, geographically separated programs, multiple prime contractors with distinct and different goals, schedule pressures and funding constraints. However, SLS also faces unique challenges. The new program is a confluence of new hardware and heritage, with heritage hardware constituting seventy-five percent of the program. This unique approach to design makes logistics concerns such as commonality especially problematic. Additionally, a very low manifest rate of one flight every four years makes logistics comparatively expensive. That, along with the SLS architecture being developed using a block upgrade evolutionary approach, exacerbates long-range planning for supportability considerations. These common and unique logistics challenges must be clearly identified and tackled to allow SLS to have a successful program. This paper will address the common and unique challenges facing the SLS programs, along with the analysis and decisions the NASA Logistics engineers are making to mitigate the threats posed by each.

  8. Prediction in Multiple Regression.

    ERIC Educational Resources Information Center

    Osborne, Jason W.

    2000-01-01

    Presents the concept of prediction via multiple regression (MR) and discusses the assumptions underlying multiple regression analyses. Also discusses shrinkage, cross-validation, and double cross-validation of prediction equations and describes how to calculate confidence intervals around individual predictions. (SLD)
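
    Shrinkage, as discussed above, can be illustrated with the widely used adjusted R-squared correction, which estimates how much an in-sample R-squared is expected to shrink given the sample size and the number of predictors. This is a minimal sketch; whether the article relies on this particular estimator is an assumption, and the function name `adjusted_r2` and the example values are hypothetical.

```python
def adjusted_r2(r2, n, p):
    """Adjusted (shrunken) R-squared for n observations and p predictors."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Hypothetical in-sample R-squared of 0.50 with 5 predictors:
# the estimated shrinkage is larger when the sample is small.
for n in (30, 100, 1000):
    print(n, round(adjusted_r2(0.50, n, 5), 3))
```

    Cross-validation addresses the same question empirically, by applying the prediction equation to data that were not used to estimate it.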

  9. Improved Regression Calibration

    ERIC Educational Resources Information Center

    Skrondal, Anders; Kuha, Jouni

    2012-01-01

    The likelihood for generalized linear models with covariate measurement error cannot in general be expressed in closed form, which makes maximum likelihood estimation taxing. A popular alternative is regression calibration which is computationally efficient at the cost of inconsistent estimation. We propose an improved regression calibration…

  10. Morse-Smale Regression

    PubMed Central

    Gerber, Samuel; Rübel, Oliver; Bremer, Peer-Timo; Pascucci, Valerio; Whitaker, Ross T.

    2012-01-01

    This paper introduces a novel partition-based regression approach that incorporates topological information. Partition-based regression typically introduces a quality-of-fit-driven decomposition of the domain. The emphasis in this work is on a topologically meaningful segmentation. Thus, the proposed regression approach is based on a segmentation induced by a discrete approximation of the Morse-Smale complex. This yields a segmentation with partitions corresponding to regions of the function with a single minimum and maximum that are often well approximated by a linear model. This approach yields regression models that are amenable to interpretation and have good predictive capacity. Typically, regression estimates are quantified by their geometrical accuracy. For the proposed regression, an important aspect is the quality of the segmentation itself. Thus, this paper introduces a new criterion that measures the topological accuracy of the estimate. The topological accuracy provides a complementary measure to the classical geometrical error measures and is very sensitive to over-fitting. The Morse-Smale regression is compared to state-of-the-art approaches in terms of geometry and topology and yields comparable or improved fits in many cases. Finally, a detailed study on climate-simulation data demonstrates the application of the Morse-Smale regression. Supplementary materials are available online and contain an implementation of the proposed approach in the R package msr, an analysis and simulations on the stability of the Morse-Smale complex approximation and additional tables for the climate-simulation study. PMID:23687424

  11. Morse–Smale Regression

    SciTech Connect

    Gerber, Samuel; Rubel, Oliver; Bremer, Peer-Timo; Pascucci, Valerio; Whitaker, Ross T.

    2012-01-19

    This paper introduces a novel partition-based regression approach that incorporates topological information. Partition-based regression typically introduces a quality-of-fit-driven decomposition of the domain. The emphasis in this work is on a topologically meaningful segmentation. Thus, the proposed regression approach is based on a segmentation induced by a discrete approximation of the Morse–Smale complex. This yields a segmentation with partitions corresponding to regions of the function with a single minimum and maximum that are often well approximated by a linear model. This approach yields regression models that are amenable to interpretation and have good predictive capacity. Typically, regression estimates are quantified by their geometrical accuracy. For the proposed regression, an important aspect is the quality of the segmentation itself. Thus, this article introduces a new criterion that measures the topological accuracy of the estimate. The topological accuracy provides a complementary measure to the classical geometrical error measures and is very sensitive to overfitting. The Morse–Smale regression is compared to state-of-the-art approaches in terms of geometry and topology and yields comparable or improved fits in many cases. Finally, a detailed study on climate-simulation data demonstrates the application of the Morse–Smale regression. Supplementary Materials are available online and contain an implementation of the proposed approach in the R package msr, an analysis and simulations on the stability of the Morse–Smale complex approximation, and additional tables for the climate-simulation study.

  12. Linear and nonlinear modeling of antifungal activity of some heterocyclic ring derivatives using multiple linear regression and Bayesian-regularized neural networks.

    PubMed

    Caballero, Julio; Fernández, Michael

    2006-01-01

    Antifungal activity was modeled for a set of 96 heterocyclic ring derivatives (2,5,6-trisubstituted benzoxazoles, 2,5-disubstituted benzimidazoles, 2-substituted benzothiazoles and 2-substituted oxazolo(4,5-b)pyridines) using multiple linear regression (MLR) and Bayesian-regularized artificial neural network (BRANN) techniques. Inhibitory activity against Candida albicans (log(1/C)) was correlated with 3D descriptors encoding the chemical structures of the heterocyclic compounds. Training and test sets were chosen by means of k-Means Clustering. The most appropriate variables for linear and nonlinear modeling were selected using a genetic algorithm (GA) approach. In addition to the MLR equation (MLR-GA), two nonlinear models were built: model BRANN, employing the linear variable subset, and an optimum model BRANN-GA, obtained by a hybrid method that combined the BRANN and GA approaches. The linear model fit the training set (n = 80) with r2 = 0.746, while BRANN and BRANN-GA gave higher values of r2 = 0.889 and r2 = 0.937, respectively. Beyond the improvement of training set fitting, the BRANN-GA model was superior to the others by being able to describe 87% of test set (n = 16) variance, in comparison with 78% and 81% for the MLR-GA and BRANN models, respectively. Our quantitative structure-activity relationship study suggests that the distributions of atomic mass, volume and polarizability have relevant relationships with the antifungal potency of the compounds studied. Furthermore, the ability of the six variables selected nonlinearly to differentiate the data was demonstrated when the total data set was well distributed in a Kohonen self-organizing neural network (KNN). PMID:16205958

  13. Boosted Beta Regression

    PubMed Central

    Schmid, Matthias; Wickler, Florian; Maloney, Kelly O.; Mitchell, Richard; Fenske, Nora; Mayr, Andreas

    2013-01-01

    Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1). Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fit a beta regression model is to use maximum likelihood estimation with subsequent AIC-based variable selection. As an alternative to this established - yet unstable - approach, we propose a new estimation technique called boosted beta regression. With boosted beta regression, estimation and variable selection can be carried out simultaneously in a highly efficient way. Additionally, both the mean and the variance of a percentage response can be modeled using flexible nonlinear covariate effects. As a consequence, the new method accounts for common problems such as overdispersion and non-binomial variance structures. PMID:23626706
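
    To make the link between the mean and the variance of a (0,1) response concrete, the sketch below uses the common mean-precision parameterization of the beta distribution with a logit link for the mean. That parameterization is an assumption made here for illustration; the function names and input values are hypothetical, and the sketch does not implement boosting.

```python
import math


def inv_logit(eta):
    """Map a linear predictor on the real line to a mean in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


def beta_from_mean_precision(mu, phi):
    """Return shape parameters (a, b) and the variance of a Beta response.

    mu:  mean of the response, 0 < mu < 1
    phi: precision parameter, phi > 0 (larger phi means smaller variance)
    """
    a = mu * phi
    b = (1.0 - mu) * phi
    variance = mu * (1.0 - mu) / (1.0 + phi)
    return a, b, variance


# Hypothetical linear predictor value and precision.
mu = inv_logit(0.4)
a, b, var = beta_from_mean_precision(mu, phi=10.0)
print(f"mean = {mu:.3f}, shapes = ({a:.3f}, {b:.3f}), variance = {var:.4f}")
```

    Because the variance depends on both the mean and the precision, letting the precision vary with covariates is one way a model can accommodate overdispersion.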

  14. Lessons in logistics from Somalia.

    PubMed

    Kemball-Cook, D; Stephenson, R

    1984-03-01

    By February 1981 the refugee relief operation in Somalia was close to breakdown. The Government of Somalia and the United Nations High Commission for Refugees (UNHCR) contracted the agency CARE to manage the logistics of the operation. By August 1981 over 99% of food received at Mogadishu was reaching the camps. Here we describe this apparent success, and attempt to diagnose the contributing factors. Chief among these are dynamic leadership, 'systems' management, adaptability of personnel, the use of professional Indian food monitors in the camps, and the support given by the Government. The chief qualification on the success of the operation has been the continued dependency on expatriate expertise. General conclusions are offered relating to the management of logistics in relief operations. The most important conclusion is that there is a prime need for logistics to be centralized in a single organization at the start of major emergencies. We point to the current inadequacy in an international relief system which fails to ensure this, and suggest that a new or existing part of the United Nations family be given a 'brief for in-country logistics' to become a UN Emergency Logistics Office. PMID:20958559

  15. George: Gaussian Process regression

    NASA Astrophysics Data System (ADS)

    Foreman-Mackey, Daniel

    2015-11-01

    George is a fast and flexible library, implemented in C++ with Python bindings, for Gaussian Process regression useful for accounting for correlated noise in astronomical datasets, including those for transiting exoplanet discovery and characterization and stellar population modeling.

  16. Continual Improvement in Shuttle Logistics

    NASA Technical Reports Server (NTRS)

    Flowers, Jean; Schafer, Loraine

    1995-01-01

    It has been said that Continual Improvement (CI) is difficult to apply to service oriented functions, especially in a government agency such as NASA. However, a constrained budget and increasing requirements are a way of life at NASA Kennedy Space Center (KSC), making it a natural environment for the application of CI tools and techniques. This paper describes how KSC, and specifically the Space Shuttle Logistics Project, a key contributor to KSC's mission, has embraced the CI management approach as a means of achieving its strategic goals and objectives. An overview of how the KSC Space Shuttle Logistics Project has structured its CI effort and examples of some of the initiatives are provided.

  17. Active Travel to School: Findings from the Survey of US Health Behavior in School-Aged Children, 2009-2010

    ERIC Educational Resources Information Center

    Yang, Yong; Ivey, Stephanie S.; Levy, Marian C.; Royne, Marla B.; Klesges, Lisa M.

    2016-01-01

    Background: Whereas children's active travel to school (ATS) has confirmed benefits, only a few large national surveys of ATS exist. Methods: Using data from the Health Behavior in School-aged Children (HBSC) 2009-2010 US survey, we conducted a logistic regression model to estimate the odds ratios of ATS and a linear regression model to estimate…

  18. Development of immune memory to glial brain tumors after tumor regression induced by immunotherapeutic Toll-like receptor 7/8 activation.

    PubMed

    Stathopoulos, Apostolis; Pretto, Chrystel; Devillers, Laurent; Pierre, Denis; Hofman, Florence M; Kruse, Carol; Jadus, Martin; Chen, Thomas C; Schijns, Virgil E J C

    2012-05-01

    The efficacy of immunotherapeutic TLR7/8 activation by resiquimod (R848) was evaluated in vivo, in the CNS-1 rat glioma model syngeneic to Lewis rats. The immune treatment was compared with cytotoxic cyclophosphamide chemotherapy, and also with the combination of cytotoxic and immunotherapeutic treatments. We found that parenteral treatment with the TLR7/8 agonist, resiquimod, eventually induced complete tumor regression of CNS-1 glioblastoma tumors in Lewis rats. Cyclophosphamide (CY) treatment also resulted in dramatic CNS-1 remission, while the combined treatment showed similar antitumor effects. The resiquimod efficacy appeared not to be associated with direct injury to CNS-1 growth, while CY proved to exert tumoricidal cytotoxicity to the tumor cells. Rats that were cured by treatment with the innate immune response modifier resiquimod proved to be fully immune to secondary CNS-1 tumor rechallenge. They all remained tumor-free and survived. In contrast, rats that controlled CNS-1 tumor growth as a result of CY treatment did not develop immune memory, as demonstrated by their failure to reject a secondary CNS-1 tumor challenge; they showed a concomitant outgrowth of the primary tumor upon secondary tumor exposure. Rats that initially contained tumor growth by combination chemo-immunotherapy also failed to reject secondary tumor challenge upon rechallenge, indicating that the cytotoxic effect of the CY likely extended to the endogenous memory immune cells as well as to the tumor. These data demonstrate strong therapeutic antitumor efficacy for the immune response modifier resiquimod leading to immunological memory, and suggest that CY treatment, although effective as a chemotherapeutic agent, may be deleterious to maintenance of long-term antitumor immune memory. These data also highlight the importance of the sequence in which a multi-modal therapy is administered. PMID:22737605

  19. Moderated Multiple Regression, Spurious Interaction Effects, and IRT

    ERIC Educational Resources Information Center

    Kang, Sun-Mee; Waller, Niels G.

    2005-01-01

    Two Monte Carlo studies were conducted to explore the Type I error rates in moderated multiple regression (MMR) of observed scores and estimated latent trait scores from a two-parameter logistic item response theory (IRT) model. The results of both studies showed that MMR Type I error rates were substantially higher than the nominal alpha levels…

  20. Logistics support of space facilities

    NASA Technical Reports Server (NTRS)

    Lewis, William C.

    1988-01-01

    The logistic support of space facilities is described, with special attention given to the problem of sizing the inventory of ready spares kept at the space facility. Where possible, data from the Space Shuttle Orbiter is extrapolated to provide numerical estimates for space facilities. Attention is also given to repair effort estimation and long duration missions.

  1. NASA Space Rocket Logistics Challenges

    NASA Technical Reports Server (NTRS)

    Bramon, Chris; Neeley, James R.; Jones, James V.; Watson, Michael D.; Inman, Sharon K.; Tuttle, Loraine

    2014-01-01

    The Space Launch System (SLS) is the new NASA heavy lift launch vehicle in development and is scheduled for its first mission in 2017. SLS has many of the same logistics challenges as any other large scale program. However, SLS also faces unique challenges. This presentation will address the SLS challenges, along with the analysis and decisions to mitigate the threats posed by each.

  2. Regression versus No Regression in the Autistic Disorder: Developmental Trajectories

    ERIC Educational Resources Information Center

    Bernabei, P.; Cerquiglini, A.; Cortesi, F.; D'Ardia, C.

    2007-01-01

    Developmental regression is a complex phenomenon which occurs in 20-49% of the autistic population. The aim of the study was to assess possible differences in the development of regressed and non-regressed autistic preschoolers. We longitudinally studied 40 autistic children (18 regressed, 22 non-regressed) aged 2-6 years. The following developmental…

  3. Regression to the Mean and Predictors of MRI Disease Activity in RRMS Placebo Cohorts - Is There a Place for Baseline-to-Treatment Studies in MS?

    PubMed Central

    Stellmann, Jan-Patrick; Stürner, Klarissa Hanja; Young, Kim Lea; Siemonsen, Susanne; Friede, Tim; Heesen, Christoph

    2015-01-01

    Background: Gadolinium-enhancing (GD+) lesions and T2 lesions are MRI outcomes for phase-2 treatment trials in relapsing-remitting Multiple Sclerosis (RRMS). Little is known about predictors of lesion development and regression-to-the-mean, which is an important aspect in early baseline-to-treatment trials. Objectives: To quantify regression-to-the-mean and identify predictors of MRI lesion development in placebo cohorts. Methods: 21 Phase-2 and Phase-3 trials were identified by a systematic literature search. Random-effects meta-analyses were performed to estimate development of T2 and GD+ lesions after 6 months (phase-2) or 2 years (phase-3). Predictors of lesion development were evaluated with mixed-effect meta-regression. Results: The mean number of GD+ lesions per scan was similar after 6 months (1.19, 95%CI: 0.87-1.51) and 2 years (1.19, 95%CI: 1.00-1.39). 39% of the patients were without new T2 lesions after 6 months and 19% after 2 years (95%CI: 12-25%). Mean number of baseline GD+ lesions was the best predictor for new lesions after 6 months. Conclusion: Baseline GD+ lesions predict evolution of GD+ and T2 lesions after 6 months and might be used to control for regression-to-the-mean effects. Overall, proof-of-concept studies with a baseline-to-treatment design have to face a regression to 1.2 GD+ lesions per scan within 6 months. PMID:25659100
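
    Regression to the mean in a baseline-to-treatment design can be illustrated with a small simulation: patients enrolled because of a high baseline measurement tend to show lower values at follow-up even without treatment. This is a simplified sketch, not the meta-analytic method of the study. It assumes normally distributed activity levels and measurement noise (real lesion counts are non-negative integers), and the function name and all parameter values are hypothetical.

```python
import random
from statistics import mean


def simulate_rtm(n=20000, threshold=2.0, seed=0):
    """Return (mean baseline, mean follow-up) among patients selected on baseline."""
    rng = random.Random(seed)
    baseline_sel, followup_sel = [], []
    for _ in range(n):
        true_level = rng.gauss(1.2, 0.5)         # stable patient-level activity
        baseline = true_level + rng.gauss(0, 1)  # noisy baseline measurement
        followup = true_level + rng.gauss(0, 1)  # noisy follow-up, no treatment
        if baseline > threshold:                 # enrol only patients active at baseline
            baseline_sel.append(baseline)
            followup_sel.append(followup)
    return mean(baseline_sel), mean(followup_sel)


base, follow = simulate_rtm()
print(f"selected patients: baseline mean = {base:.2f}, follow-up mean = {follow:.2f}")
```

    The drop between the two means arises purely from selection on a noisy measurement, which is why a comparison against baseline alone can overstate a treatment effect.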

  4. Investigating the Quantitative Structure-Activity Relationships for Antibody Recognition of Two Immunoassays for Polycyclic Aromatic Hydrocarbons by Multiple Regression Methods

    PubMed Central

    Zhang, Yan-Feng; Zhang, Li; Gao, Zhi-Xian; Dai, Shu-Gui

    2012-01-01

    Polycyclic aromatic hydrocarbons (PAHs) are ubiquitous contaminants found in the environment. Immunoassays represent useful analytical methods to complement traditional analytical procedures for PAHs. Cross-reactivity (CR) is a useful characteristic for evaluating the extent of cross-reaction of a cross-reactant in immunoreactions and immunoassays. The quantitative relationships between the molecular properties and the CR of PAHs were established by stepwise multiple linear regression, principal component regression and partial least square regression, using the data of two commercial enzyme-linked immunosorbent assay (ELISA) kits. The objective is to find the most important molecular properties that affect the CR, and to predict the CR by multiple regression methods. The results show that the physicochemical, electronic and topological properties of the PAH molecules have an integrated effect on the CR properties for the two ELISAs, among which molar solubility (Sm) and valence molecular connectivity index (3χv) are the most important factors. The obtained regression equations for the RisC kit are all statistically significant (p < 0.005) and show satisfactory ability for predicting CR values, while the equations for the RaPID kit are not significant (p > 0.05) and are not suitable for prediction. This is probably because the RisC immunoassay employs a monoclonal antibody, while the RaPID kit is based on a polyclonal antibody. Considering the important effect of solubility on the CR values, cross-reaction potential (CRP) is calculated and used as a complement to CR for the evaluation of cross-reactions in immunoassays. Only the compounds with both high CR and high CRP can cause intense cross-reactions in immunoassays. PMID:23012547

  5. Modern Regression Discontinuity Analysis

    ERIC Educational Resources Information Center

    Bloom, Howard S.

    2012-01-01

    This article provides a detailed discussion of the theory and practice of modern regression discontinuity (RD) analysis for estimating the effects of interventions or treatments. Part 1 briefly chronicles the history of RD analysis and summarizes its past applications. Part 2 explains how in theory an RD analysis can identify an average effect of…

  6. CORRELATION AND REGRESSION

    EPA Science Inventory

    Webcast entitled Statistical Tools for Making Sense of Data, by the National Nutrient Criteria Support Center, N-STEPS (Nutrients-Scientific Technical Exchange Partnership). The section "Correlation and Regression" provides an overview of these two techniques in the context of nut...

  7. Multiple linear regression analysis

    NASA Technical Reports Server (NTRS)

    Edwards, T. R.

    1980-01-01

    Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.

  8. Partial covariate adjusted regression

    PubMed Central

    Şentürk, Damla; Nguyen, Danh V.

    2008-01-01

    Covariate adjusted regression (CAR) is a recently proposed adjustment method for regression analysis where both the response and predictors are not directly observed (Şentürk and Müller, 2005). The available data has been distorted by unknown functions of an observable confounding covariate. CAR provides consistent estimators for the coefficients of the regression between the variables of interest, adjusted for the confounder. We develop a broader class of partial covariate adjusted regression (PCAR) models to accommodate both distorted and undistorted (adjusted/unadjusted) predictors. The PCAR model allows for unadjusted predictors, such as age, gender and demographic variables, which are common in the analysis of biomedical and epidemiological data. The available estimation and inference procedures for CAR are shown to be invalid for the proposed PCAR model. We propose new estimators and develop new inference tools for the more general PCAR setting. In particular, we establish the asymptotic normality of the proposed estimators and propose consistent estimators of their asymptotic variances. Finite sample properties of the proposed estimators are investigated using simulation studies and the method is also illustrated with a Pima Indians diabetes data set. PMID:20126296

  9. Bayesian ARTMAP for regression.

    PubMed

    Sasu, L M; Andonie, R

    2013-10-01

    Bayesian ARTMAP (BA) is a recently introduced neural architecture which uses a combination of Fuzzy ARTMAP competitive learning and Bayesian learning. Training is generally performed online, in a single epoch. During training, BA creates input data clusters as Gaussian categories, and also infers the conditional probabilities between input patterns and categories, and between categories and classes. During prediction, BA uses Bayesian posterior probability estimation. So far, BA has been used only for classification. The goal of this paper is to analyze the efficiency of BA for regression problems. Our contributions are: (i) we generalize the BA algorithm using the clustering functionality of both ART modules, and name it BA for Regression (BAR); (ii) we prove that BAR is a universal approximator with the best approximation property. In other words, BAR approximates arbitrarily well any continuous function (universal approximation) and, for every given continuous function, there is one in the set of BAR approximators situated at minimum distance (best approximation); (iii) we experimentally compare the online trained BAR with several neural models, on the following standard regression benchmarks: CPU Computer Hardware, Boston Housing, Wisconsin Breast Cancer, and Communities and Crime. Our results show that BAR is an appropriate tool for regression tasks, both for theoretical and practical reasons. PMID:23665468

  10. Multisource information fusion for logistics

    NASA Astrophysics Data System (ADS)

    Woodley, Robert; Petrov, Plamen; Noll, Warren

    2011-05-01

    Current Army logistical systems and databases contain massive amounts of data that need an effective method to extract actionable information. The databases do not contain the root cause and case-based analysis needed to diagnose or predict breakdowns. A system is needed to find data from as many sources as possible, process it in an integrated fashion, and disseminate information products on the readiness of the fleet vehicles. 21st Century Systems, Inc. introduces the Agent-Enabled Logistics Enterprise Intelligence System (AELEIS) tool, designed to assist logistics analysts with assessing the availability and prognostics of assets in the logistics pipeline. AELEIS extracts data from multiple, heterogeneous data sets. This data is then aggregated and mined for data trends. Finally, data reasoning tools and prognostics tools evaluate the data for relevance and potential issues. Multiple types of data mining tools may be employed to extract the data, and an information reasoning capability determines which tools are needed and applies them to extract information. This can be visualized as a push-pull system where data trends fire a reasoning engine to search for corroborating evidence and then integrate the data into actionable information. The architecture decides on what reasoning engine to use (i.e., it may start with a rule-based method, but, if needed, go to condition-based reasoning, and even a model-based reasoning engine for certain types of equipment). Initial results show that AELEIS is able to alert the user to potential fault conditions and root-cause information mined from a database.

  11. Logistics, electronic commerce, and the environment

    NASA Astrophysics Data System (ADS)

    Sarkis, Joseph; Meade, Laura; Talluri, Srinivas

    2002-02-01

    Organizations realize that a strong supporting logistics or electronic logistics (e-logistics) function is important from both commercial and consumer perspectives. The implications of e-logistics models and practices cover the forward and reverse logistics functions of organizations. They also have direct and profound impact on the natural environment. This paper will focus on a discussion of forward and reverse e-logistics and their relationship to the natural environment. After discussion of the many pertinent issues in these areas, directions of practice and implications for study and research are then described.

  12. A new method for dealing with measurement error in explanatory variables of regression models.

    PubMed

    Freedman, Laurence S; Fainberg, Vitaly; Kipnis, Victor; Midthune, Douglas; Carroll, Raymond J

    2004-03-01

    We introduce a new method, moment reconstruction, of correcting for measurement error in covariates in regression models. The central idea is similar to regression calibration in that the values of the covariates that are measured with error are replaced by "adjusted" values. In regression calibration the adjusted value is the expectation of the true value conditional on the measured value. In moment reconstruction the adjusted value is the variance-preserving empirical Bayes estimate of the true value conditional on the outcome variable. The adjusted values thereby have the same first two moments and the same covariance with the outcome variable as the unobserved "true" covariate values. We show that moment reconstruction is equivalent to regression calibration in the case of linear regression, but leads to different results for logistic regression. For case-control studies with logistic regression and covariates that are normally distributed within cases and controls, we show that the resulting estimates of the regression coefficients are consistent. In simulations we demonstrate that for logistic regression, moment reconstruction carries less bias than regression calibration, and for case-control studies is superior in mean-square error to the standard regression calibration approach. Finally, we give an example of the use of moment reconstruction in linear discriminant analysis and a nonstandard problem where we wish to adjust a classification tree for measurement error in the explanatory variables. PMID:15032787

  13. Ridge Regression: A Regression Procedure for Analyzing Correlated Independent Variables.

    ERIC Educational Resources Information Center

    Rakow, Ernest A.

    Ridge regression is presented as an analytic technique to be used when predictor variables in a multiple linear regression situation are highly correlated, a situation which may result in unstable regression coefficients and difficulties in interpretation. Ridge regression avoids the problem of selection of variables that may occur in stepwise…

  14. Ridge Regression Signal Processing

    NASA Technical Reports Server (NTRS)

    Kuhl, Mark R.

    1990-01-01

    The introduction of the Global Positioning System (GPS) into the National Airspace System (NAS) necessitates the development of Receiver Autonomous Integrity Monitoring (RAIM) techniques. In order to guarantee a certain level of integrity, a thorough understanding of modern estimation techniques applied to navigational problems is required. The extended Kalman filter (EKF) is derived and analyzed under poor geometry conditions. It was found that the performance of the EKF is difficult to predict, since the EKF is designed for a Gaussian environment. A novel approach is implemented which incorporates ridge regression to explain the behavior of an EKF in the presence of dynamics under poor geometry conditions. The basic principles of ridge regression theory are presented, followed by the derivation of a linearized recursive ridge estimator. Computer simulations are performed to confirm the underlying theory and to provide a comparative analysis of the EKF and the recursive ridge estimator.

  15. Fast Censored Linear Regression

    PubMed Central

    HUANG, YIJIAN

    2013-01-01

    Weighted log-rank estimating function has become a standard estimation method for the censored linear regression model, or the accelerated failure time model. Well established statistically, the estimator defined as a consistent root has, however, rather poor computational properties because the estimating function is neither continuous nor, in general, monotone. We propose a computationally efficient estimator through an asymptotics-guided Newton algorithm, in which censored quantile regression methods are tailored to yield an initial consistent estimate and a consistent derivative estimate of the limiting estimating function. We also develop fast interval estimation with a new proposal for sandwich variance estimation. The proposed estimator is asymptotically equivalent to the consistent root estimator and barely distinguishable in samples of practical size. However, computation time is typically reduced by two to three orders of magnitude for point estimation alone. Illustrations with clinical applications are provided. PMID:24347802

  16. Comparing the Discrete and Continuous Logistic Models

    ERIC Educational Resources Information Center

    Gordon, Sheldon P.

    2008-01-01

    The solutions of the discrete logistic growth model based on a difference equation and the continuous logistic growth model based on a differential equation are compared and contrasted. The investigation is conducted using a dynamic interactive spreadsheet. (Contains 5 figures.)
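
    As a rough illustration of the comparison this abstract describes (it is not the spreadsheet from the article), the Python sketch below iterates a discrete logistic difference equation and evaluates the closed-form solution of the continuous logistic differential equation at the same time points. The growth rate r, carrying capacity K, and initial value p0 are hypothetical values chosen only for illustration.

```python
import math

def discrete_logistic(p0, r, K, steps):
    """Iterate the difference equation P[n+1] = P[n] + r*P[n]*(1 - P[n]/K)."""
    values = [p0]
    for _ in range(steps):
        p = values[-1]
        values.append(p + r * p * (1 - p / K))
    return values

def continuous_logistic(p0, r, K, t):
    """Closed-form solution of the differential equation dP/dt = r*P*(1 - P/K)."""
    return K / (1 + (K / p0 - 1) * math.exp(-r * t))

# Hypothetical parameters, not taken from the article.
r, K, p0 = 0.5, 100.0, 10.0
discrete = discrete_logistic(p0, r, K, 30)
continuous = [continuous_logistic(p0, r, K, t) for t in range(31)]

# Print both trajectories side by side at a few time points.
for n in (0, 5, 10, 30):
    print(n, round(discrete[n], 2), round(continuous[n], 2))
```

    With a small growth rate such as this one, both trajectories rise monotonically toward K but differ at intermediate times; larger values of r make the discrete model oscillate, which the continuous model cannot do.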

  17. Research on 6R Military Logistics Network

    NASA Astrophysics Data System (ADS)

    Jie, Wan; Wen, Wang

    Building a military logistics network is an important issue in the construction of new forces. This paper proposes a conceptual model of a 6R military logistics network based on JIT. We then outline an axis-spoke logistics centers network, a flexible 6R organizational network, and a lean, grid-based 6R military information network. Finally, strategies and proposals for constructing the three sub-networks of the 6R military logistics network are given.

  18. Orthogonal Regression: A Teaching Perspective

    ERIC Educational Resources Information Center

    Carr, James R.

    2012-01-01

    A well-known approach to linear least squares regression is that which involves minimizing the sum of squared orthogonal projections of data points onto the best fit line. This form of regression is known as orthogonal regression, and the linear model that it yields is known as the major axis. A similar method, reduced major axis regression, is…
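
    A minimal sketch of the two fits named in this abstract, assuming the standard textbook formulas for the major axis (orthogonal regression) slope and the reduced major axis slope; it is not code from the article. The function names and the sample data are hypothetical.

```python
import math

def _centered_sums(xs, ys):
    """Return the means and the centered sums of squares and cross-products."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return mx, my, sxx, syy, sxy

def major_axis(xs, ys):
    """Orthogonal regression: minimizes squared perpendicular distances.

    Assumes the cross-product sum sxy is nonzero.
    """
    mx, my, sxx, syy, sxy = _centered_sums(xs, ys)
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return slope, my - slope * mx

def reduced_major_axis(xs, ys):
    """Reduced major axis: slope is the ratio of standard deviations, signed by sxy."""
    mx, my, sxx, syy, sxy = _centered_sums(xs, ys)
    slope = math.copysign(math.sqrt(syy / sxx), sxy)
    return slope, my - slope * mx

# Hypothetical noisy data scattered around y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]
print(major_axis(xs, ys))
print(reduced_major_axis(xs, ys))
```

    Both lines pass through the centroid of the data; they differ from ordinary least squares in treating x and y symmetrically rather than attributing all error to y.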

  19. Liver Fibrosis Regression Measured by Transient Elastography in Human Immunodeficiency Virus (HIV)-Hepatitis B Virus (HBV)-Coinfected Individuals on Long-Term HBV-Active Combination Antiretroviral Therapy

    PubMed Central

    Audsley, Jennifer; Robson, Christopher; Aitchison, Stacey; Matthews, Gail V.; Iser, David; Sasadeusz, Joe; Lewin, Sharon R.

    2016-01-01

    Background. Advanced fibrosis occurs more commonly in human immunodeficiency virus (HIV)-hepatitis B virus (HBV) coinfected individuals; therefore, fibrosis monitoring is important in this population. However, transient elastography (TE) data in HIV-HBV coinfection are lacking. We aimed to assess liver fibrosis using TE in a cross-sectional study of HIV-HBV coinfected individuals receiving combination HBV-active (lamivudine and/or tenofovir/tenofovir-emtricitabine) antiretroviral therapy, identify factors associated with advanced fibrosis, and examine change in fibrosis in those with >1 TE assessment. Methods. We assessed liver fibrosis in 70 HIV-HBV coinfected individuals on HBV-active combination antiretroviral therapy (cART). Change in fibrosis over time was examined in a subset with more than 1 TE result (n = 49). Clinical and laboratory variables at the time of the first TE were collected, and associations with advanced fibrosis (≥F3, Metavir scoring system) and fibrosis regression (of at least 1 stage) were examined. Results. The majority of the cohort (64%) had mild to moderate fibrosis at the time of the first TE, and we identified alanine transaminase, platelets, and detectable HIV ribonucleic acid as associated with advanced liver fibrosis. Alanine transaminase and platelets remained independently associated with advanced fibrosis in multivariate modeling. More than 28% of those with >1 TE subsequently showed liver fibrosis regression, and higher baseline HBV deoxyribonucleic acid was associated with regression. Prevalence of advanced fibrosis (≥F3) decreased 12.3% (32.7%–20.4%) over a median of 31 months. Conclusions. The observed fibrosis regression in this group supports the beneficial effects of cART on liver stiffness. It would be important to study a larger group of individuals with more advanced fibrosis to more definitively assess factors associated with liver fibrosis regression. PMID:27006960

  20. Nowcasting sunshine number using logistic modeling

    NASA Astrophysics Data System (ADS)

    Brabec, Marek; Badescu, Viorel; Paulescu, Marius

    2013-04-01

    In this paper, we present a formalized approach to statistical modeling of the sunshine number, binary indicator of whether the Sun is covered by clouds introduced previously by Badescu (Theor Appl Climatol 72:127-136, 2002). Our statistical approach is based on Markov chain and logistic regression and yields fully specified probability models that are relatively easily identified (and their unknown parameters estimated) from a set of empirical data (observed sunshine number and sunshine stability number series). We discuss general structure of the model and its advantages, demonstrate its performance on real data and compare its results to classical ARIMA approach as to a competitor. Since the model parameters have clear interpretation, we also illustrate how, e.g., their inter-seasonal stability can be tested. We conclude with an outlook to future developments oriented to construction of models allowing for practically desirable smooth transition between data observed with different frequencies and with a short discussion of technical problems that such a goal brings.

  1. Accounting for the correlation between fellow eyes in regression analysis.

    PubMed

    Glynn, R J; Rosner, B

    1992-03-01

    Regression techniques that appropriately use all available eyes have infrequently been applied in the ophthalmologic literature, despite advances both in the development of statistical models and in the availability of computer software to fit these models. We considered the general linear model and polychotomous logistic regression approaches of Rosner and the estimating equation approach of Liang and Zeger, applied to both linear and logistic regression. Methods were illustrated with the use of two real data sets: (1) impairment of visual acuity in patients with retinitis pigmentosa and (2) overall visual field impairment in elderly patients evaluated for glaucoma. We discuss the interpretation of coefficients from these models and the advantages of these approaches compared with alternative approaches, such as treating individuals rather than eyes as the unit of analysis, separate regression analyses of right and left eyes, or utilization of ordinary regression techniques without accounting for the correlation between fellow eyes. Specific advantages include enhanced statistical power, more interpretable regression coefficients, greater precision of estimation, and less sensitivity to missing data for some eyes. We concluded that these models should be used more frequently in ophthalmologic research, and we provide guidelines for choosing between alternative models. PMID:1543458

  2. Logistics Handbook, 1976. Colorado Outward Bound School.

    ERIC Educational Resources Information Center

    Colorado Outward Bound School, Denver.

    Logistics, a support mission, is vital to the successful operation of the Colorado Outward Bound School (COBS) courses. Logistics is responsible for purchasing, maintaining, transporting, and replenishing a wide variety of items, i.e., food, mountaineering and camping equipment, medical and other supplies, and vehicles. The Logistics coordinator…

  3. Time series modeling by a regression approach based on a latent process.

    PubMed

    Chamroukhi, Faicel; Samé, Allou; Govaert, Gérard; Aknin, Patrice

    2009-01-01

    Time series are used in many domains including finance, engineering, economics and bioinformatics generally to represent the change of a measurement over time. Modeling techniques may then be used to give a synthetic representation of such data. A new approach for time series modeling is proposed in this paper. It consists of a regression model incorporating a discrete hidden logistic process allowing for activating smoothly or abruptly different polynomial regression models. The model parameters are estimated by the maximum likelihood method performed by a dedicated Expectation Maximization (EM) algorithm. The M step of the EM algorithm uses a multi-class Iterative Reweighted Least-Squares (IRLS) algorithm to estimate the hidden process parameters. To evaluate the proposed approach, an experimental study on simulated data and real world data was performed using two alternative approaches: a heteroskedastic piecewise regression model using a global optimization algorithm based on dynamic programming, and a Hidden Markov Regression Model whose parameters are estimated by the Baum-Welch algorithm. Finally, in the context of the remote monitoring of components of the French railway infrastructure, and more particularly the switch mechanism, the proposed approach has been applied to modeling and classifying time series representing the condition measurements acquired during switch operations. PMID:19616918

  4. Dysglycaemia and Other Predictors for Progression or Regression from Impaired Fasting Glucose to Diabetes or Normoglycaemia

    PubMed Central

    de Abreu, L.; Holloway, Kara L.; Kotowicz, Mark A.; Pasco, Julie A.

    2015-01-01

    Aims. Diabetes mellitus is a growing health problem worldwide. This study aimed to describe dysglycaemia and determine the impact of body composition and clinical and lifestyle factors on the risk of progression or regression from impaired fasting glucose (IFG) to diabetes or normoglycaemia in Australian women. Methods. This study included 1167 women, aged 20–94 years, enrolled in the Geelong Osteoporosis Study. Multivariable logistic regression was used to identify predictors for progression to diabetes or regression to normoglycaemia (from IFG), over 10 years of follow-up. Results. At baseline the proportion of women with IFG was 33.8% and 6.5% had diabetes. Those with fasting dysglycaemia had higher obesity-related factors, lower serum HDL cholesterol, and lower physical activity. Over a decade, the incidence of progression from IFG to diabetes was 18.1 per 1,000 person-years (95% CI, 10.7–28.2). Fasting plasma glucose and serum triglycerides were important factors in both progression to diabetes and regression to normoglycaemia. Conclusions. Our results show a transitional process; those with IFG had risk factors intermediate to normoglycaemics and those with diabetes. This investigation may help target interventions to those with IFG at high risk of progression to diabetes and thereby prevent cases of diabetes. PMID:26273669

  5. Information logistics: A production-line approach to information services

    NASA Technical Reports Server (NTRS)

    Adams, Dennis; Lee, Chee-Seng

    1991-01-01

    Logistics can be defined as the process of strategically managing the acquisition, movement, and storage of materials, parts, and finished inventory (and the related information flow) through the organization and its marketing channels in a cost effective manner. It is concerned with delivering the right product to the right customer in the right place at the right time. The logistics function is composed of inventory management, facilities management, communications, unitization, transportation, materials management, and production scheduling. The relationship between logistics and information systems is clear. Systems such as Electronic Data Interchange (EDI), Point of Sale (POS) systems, and Just in Time (JIT) inventory management systems are important elements in the management of product development and delivery. With improved access to market demand figures, logisticians can decrease inventory sizes and better service customer demand. However, without accurate, timely information, little, if any, of this would be feasible in today's global markets. Information systems specialists can learn from logisticians. In a manner similar to logistics management, information logistics is concerned with the delivery of the right data, to the right customer, at the right time. As such, information systems are integral components of the information logistics system charged with providing customers with accurate, timely, cost-effective, and useful information. Information logistics is a management style and is composed of elements similar to those associated with the traditional logistics activity: inventory management (data resource management), facilities management (distributed, centralized and decentralized information systems), communications (participative design and joint application development methodologies), unitization (input/output system design, i.e., packaging or formatting of the information), transportations (voice, data, image, and video communication systems

  6. Incremental hierarchical discriminant regression.

    PubMed

    Weng, Juyang; Hwang, Wey-Shiuan

    2007-03-01

    This paper presents incremental hierarchical discriminant regression (IHDR) which incrementally builds a decision tree or regression tree for very high-dimensional regression or decision spaces by an online, real-time learning system. Biologically motivated, it is an approximate computational model for automatic development of associative cortex, with both bottom-up sensory inputs and top-down motor projections. At each internal node of the IHDR tree, information in the output space is used to automatically derive the local subspace spanned by the most discriminating features. Embedded in the tree is a hierarchical probability distribution model used to prune very unlikely cases during the search. The number of parameters in the coarse-to-fine approximation is dynamic and data-driven, enabling the IHDR tree to automatically fit data with unknown distribution shapes (thus, it is difficult to select the number of parameters up front). The IHDR tree dynamically assigns long-term memory to avoid the loss-of-memory problem typical with a global-fitting learning algorithm for neural networks. A major challenge for an incrementally built tree is that the number of samples varies arbitrarily during the construction process. An incrementally updated probability model, called sample-size-dependent negative-log-likelihood (SDNLL) metric is used to deal with large sample-size cases, small sample-size cases, and unbalanced sample-size cases, measured among different internal nodes of the IHDR tree. We report experimental results for four types of data: synthetic data to visualize the behavior of the algorithms, large face image data, continuous video stream from robot navigation, and publicly available data sets that use human defined features. PMID:17385628

  7. The Impact of Additional Weekdays of Active Commuting to School on Children Achieving a Criterion of 300+ Minutes of Moderate-to-Vigorous Physical Activity

    ERIC Educational Resources Information Center

    Daly-Smith, Andy J. W.; McKenna, Jim; Radley, Duncan; Long, Jonathan

    2011-01-01

    Objective: To investigate the value of additional days of active commuting for meeting a criterion of 300+ minutes of moderate-to-vigorous physical activity (MVPA; 60+ mins/day x 5) during the school week. Methods: Based on seven-day diaries supported by teachers, binary logistic regression analyses were used to predict achievement of MVPA…

  8. Ridge regression processing

    NASA Technical Reports Server (NTRS)

    Kuhl, Mark R.

    1990-01-01

    Current navigation requirements depend on a geometric dilution of precision (GDOP) criterion. As long as the GDOP stays below a specific value, navigation requirements are met. The GDOP will exceed the specified value when the measurement geometry becomes too collinear. A new signal processing technique, called Ridge Regression Processing, can reduce the effects of nearly collinear measurement geometry; thereby reducing the inflation of the measurement errors. It is shown that the Ridge signal processor gives a consistently better mean squared error (MSE) in position than the Ordinary Least Mean Squares (OLS) estimator. The applicability of this technique is currently being investigated to improve the following areas: receiver autonomous integrity monitoring (RAIM), coverage requirements, availability requirements, and precision approaches.
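
    The effect described in this abstract can be illustrated with a generic two-predictor ridge estimator. The sketch below is not the signal processor from the report; it only assumes the standard ridge formula b = (X'X + kI)^(-1) X'y, and the nearly collinear data and the ridge parameter k are hypothetical.

```python
import math

def ridge_2d(X, y, k):
    """Solve (X'X + k*I) b = X'y for two predictors; k = 0 gives ordinary least squares."""
    a11 = sum(x1 * x1 for x1, _ in X) + k
    a12 = sum(x1 * x2 for x1, x2 in X)
    a22 = sum(x2 * x2 for _, x2 in X) + k
    g1 = sum(x1 * yi for (x1, _), yi in zip(X, y))
    g2 = sum(x2 * yi for (_, x2), yi in zip(X, y))
    det = a11 * a22 - a12 * a12  # close to zero when the columns are nearly collinear
    return ((a22 * g1 - a12 * g2) / det, (a11 * g2 - a12 * g1) / det)

# Hypothetical data: the two columns are almost identical, and y is roughly x1 + x2.
X = [(1.0, 1.01), (2.0, 1.98), (3.0, 3.02), (4.0, 3.99)]
y = [2.1, 3.9, 6.2, 7.8]

ols = ridge_2d(X, y, 0.0)    # unstable: large coefficients of opposite sign
ridge = ridge_2d(X, y, 0.1)  # small penalty shrinks the estimate toward stability

print("OLS coefficients:  ", ols)
print("Ridge coefficients:", ridge)
print("Norms:", math.hypot(*ols), math.hypot(*ridge))
```

    The ridge term k inflates the diagonal of X'X, which keeps the determinant away from zero and limits how much small measurement errors are amplified; the price is a small bias in the estimate.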

  9. Logistic equation of arbitrary order

    NASA Astrophysics Data System (ADS)

    Grabowski, Franciszek

    2010-08-01

    The paper is concerned with the new logistic equation of arbitrary order which describes the performance of complex executive systems X vs. number of tasks N, operating at limited resources K, at non-extensive, heterogeneous self-organization processes characterized by parameter f. In contrast to the classical logistic equation which exclusively relates to the special case of sub-extensive homogeneous self-organization processes at f=1, the proposed model concerns both homogeneous and heterogeneous processes in sub-extensive and super-extensive areas. The parameter of arbitrary order f, where -∞

  10. Comparison of aroma-active compounds and sensory characteristics of durian (Durio zibethinus L.) wines using strains of Saccharomyces cerevisiae with odor activity values and partial least-squares regression.

    PubMed

    Zhu, JianCai; Chen, Feng; Wang, LingYing; Niu, YunWei; Shu, Chang; Chen, HeXing; Xiao, ZuoBing

    2015-02-25

    The study evaluated the effects of five different strains (GRE, RC212, Lalvin D254, CGMCC2.4, and CGMCC2.23) of the yeast Saccharomyces cerevisiae on the aromatic characteristics of fermented durian musts. In this work, 38 and 43 compounds in durian juices and wines were analyzed by gas chromatography-mass spectrometry (GC-MS) and GC-pulsed flame photometric detection (GC-PFPD) with the aid of stir bar sorptive extraction (SBSE), respectively. According to the measured odor activity values (OAV), only 11 and 15 aroma compounds had OAVs >1 in durian juices or wines, among which 2,3-butanedione, 3-methylbutanol, dimethyl sulfide, dimethyl disulfide, methyl ethyl disulfide, ethyl 2-methylbutanoate, ethyl butanoate, and ethyl octanoate were major contributors to the aroma of juices and wines. Partial least-squares regression (PLSR) was used to detect positive correlations between sensory analysis and aroma compounds. The results showed that the attributes were closely related to aroma compounds. PMID:25620380

  11. Leisure Activity Patterns and Their Associations with Overweight: A Prospective Study among Adolescents

    ERIC Educational Resources Information Center

    Lajunen, Hanna-Reetta; Keski-Rahkonen, Anna; Pulkkinen, Lea; Rose, Richard J.; Rissanen, Aila; Kaprio, Jaakko

    2009-01-01

    We examined longitudinal associations between individual leisure activities (television viewing, video viewing, computer games, listening to music, board games, musical instrument playing, reading, arts, crafts, socializing, clubs or scouts, sports, outdoor activities) and being overweight using logistic regression and latent class analysis in a…

  12. Recursive Algorithm For Linear Regression

    NASA Technical Reports Server (NTRS)

    Varanasi, S. V.

    1988-01-01

    Order of model determined easily. Linear-regression algorithm includes recursive equations for coefficients of model of increased order. Algorithm eliminates duplicative calculations, facilitates search for minimum order of linear-regression model fitting set of data satisfactorily.

  13. Space station synergetic RAM-logistics analysis

    NASA Technical Reports Server (NTRS)

    Dejulio, Edmund T.; Leet, Joel H.

    1988-01-01

    NASA's Space Station Maintenance Planning and Analysis (MP&A) Study is a step in the overall Space Station Program to define optimum approaches for on-orbit maintenance planning and logistics support. The approach used in the MP&A study and the analysis process used are presented. Emphasis is on maintenance activities and processes that can be accomplished on orbit within the known design and support constraints of the Space Station. From these analyses, recommendations for maintainability/maintenance requirements are established. The ultimate goal of the study is to reduce on-orbit maintenance requirements to a practical and safe minimum, thereby conserving crew time for productive endeavors. The reliability, availability, and maintainability (RAM) and operations performance evaluation models used were assembled and developed as part of the MP&A study and are described. A representative space station system design is presented to illustrate the analysis process.

  14. Support vector machines classifiers of physical activities in preschoolers

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The goal of this study is to develop, test, and compare multinomial logistic regression (MLR) and support vector machines (SVM) in classifying preschool-aged children physical activity data acquired from an accelerometer. In this study, 69 children aged 3-5 years old were asked to participate in a s...

  15. Bullied Status and Physical Activity in Texas Adolescents

    ERIC Educational Resources Information Center

    Case, Kathleen R.; Pérez, Adriana; Saxton, Debra L.; Hoelscher, Deanna M.; Springer, Andrew E.

    2016-01-01

    This study examined the association between having been bullied at school during the past 6 months ("bullied status") and not meeting physical activity (PA) recommendations of 60 minutes of daily PA during the past week among 8th- and 11th-grade Texas adolescents. Multiple logistic regression analysis was conducted to examine this…

  16. Logistic Regression Analysis of the Response of Winter Wheat to Components of Artificial Freezing Episodes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Improvement of cold tolerance of winter wheat (Triticum aestivum L.) through breeding methods has been problematic. A better understanding of how individual wheat cultivars respond to components of the freezing process may provide new information that can be used to develop more cold tolerance culti...

  17. Analysis of Variables: Predicting Sophomore Persistence Using Logistic Regression Analysis at the University of South Florida

    ERIC Educational Resources Information Center

    Miller, Thomas E.; Herreid, Charlene H.

    2009-01-01

    This is the fifth in a series of articles describing an attrition prediction and intervention project at the University of South Florida (USF) in Tampa. The project was originally presented in the 83(2) issue (Miller 2007). The statistical model for predicting attrition was described in the 83(3) issue (Miller and Herreid 2008). The methods and…

  18. Variation in torus mandibularis prevalence in Norway. A statistical analysis using logistic regression.

    PubMed

    Eggen, S; Natvig, B

    1991-02-01

    The prevalence of torus mandibularis was assessed in two groups of dental patients, altogether 2010 individuals over 10 yr of age: 1181 individuals native to the Lofoten Islands in North Norway, situated at 68 degrees latitude; and 829 patients indigenous to Gudbrandsdal, an inland district in the southeastern part of the country at 61 degrees latitude. Both groups were supposed to be of the same Caucasian stock and, therefore, to have similar genetic predisposition to torus on the average. The following observations were found: 1) the prevalence of torus mandibularis was much greater in Gudbrandsdal than in Lofoten (P much less than 0.001); 2) the prevalence decreased among persons above 50 yr of age as compared with those of the age classes 10-49 yr (P less than 0.01); and 3) it was smaller among women than men (P less than 0.05), mainly due to such a decrease in Lofoten. In a recent investigation of people living in Gudbrandsdal the fraction of the variation of torus that was attributable to genetic differences was estimated as about 30%, whereas approximately 70% of the causes seemed to be ascribable to environmental influences in terms of occlusal stress. It is suggested that dietary habits and number of existing teeth seemed to be environmental variables with an influence on the observed variation of torus prevalence between geographical regions, age classes, and sexes. The question about a possible sexual difference as to the genetic component of liability to torus mandibularis was outside the scope of the present study. PMID:2019087

  19. Investigating the Effect of Complexity Factors in Stoichiometry Problems Using Logistic Regression and Eye Tracking

    ERIC Educational Resources Information Center

    Tang, Hui; Kirk, John; Pienta, Norbert J.

    2014-01-01

    This paper includes two experiments, one investigating complexity factors in stoichiometry word problems, and the other identifying students' problem-solving protocols by using eye-tracking technology. The word problems used in this study had five different complexity factors, which were randomly assigned by a Web-based tool that we…

  20. Changepoint Detection in Multinomial Logistic Regression with Application to Sky-Cloudiness Conditions in Canada

    NASA Astrophysics Data System (ADS)

    Lu, Q.; Wang, X. L.

    2009-04-01

    Detecting changepoints in a sequence of continuous random variables has been extensively explored in both statistics and climatology literature. There is little, however, for studying the case with multicategory random variables. For instance, the sky-cloudiness condition in Canada is reported in tenths of the sky dome and thus has 11 categories (from 0 for clear sky, to 10 tenths for overcast). This study develops an overall likelihood-ratio test statistic for detecting a sudden change in the parameters of the continuation-ratio logit random intercept model for a sequence of multinomial variables. A method of partitioning the overall test statistic is also proposed, which allows one to assess the significance of the effect of the detected change on individual categories. An application of this new technique to real sky cloudiness data is also presented.

  1. Fitting Proportional Odds Models to Educational Data in Ordinal Logistic Regression Using Stata, SAS and SPSS

    ERIC Educational Resources Information Center

    Liu, Xing

    2008-01-01

    The proportional odds (PO) model, which is also called cumulative odds model (Agresti, 1996, 2002; Armstrong & Sloan, 1989; Long, 1997; Long & Freese, 2006; McCullagh, 1980; McCullagh & Nelder, 1989; Powers & Xie, 2000; O'Connell, 2006), is one of the most commonly used models for the analysis of ordinal categorical data and comes from the class…

  2. AP® Potential Predicted by PSAT/NMSQT® Scores Using Logistic Regression. Statistical Report 2014-1

    ERIC Educational Resources Information Center

    Zhang, Xiuyuan; Patel, Priyank; Ewing, Maureen

    2014-01-01

    AP Potential™ is an educational guidance tool that uses PSAT/NMSQT® scores to identify students who have the potential to do well on one or more Advanced Placement® (AP®) Exams. Students identified as having AP potential, perhaps students who would not have been otherwise identified, should consider enrolling in the corresponding AP course if they…

  3. Using Label Noise Robust Logistic Regression for Automated Updating of Topographic Geospatial Databases

    NASA Astrophysics Data System (ADS)

    Maas, A.; Rottensteiner, F.; Heipke, C.

    2016-06-01

    Supervised classification of remotely sensed images is a classical method to update topographic geospatial databases. The task requires training data in the form of image data with known class labels, whose generation is time-consuming. To avoid this problem one can use the labels from the outdated database for training. As some of these labels may be wrong due to changes in land cover, one has to use training techniques that can cope with wrong class labels in the training data. In this paper we adapt a label noise tolerant training technique to the problem of database updating. No labelled data other than the existing database are necessary. The resulting label image and transition matrix between the labels can help to update the database and to detect changes between the two time epochs. Our experiments are based on different test areas, using real images with simulated existing databases. Our results show that this method can indeed detect changes that would remain undetected if label noise were not considered in training.

  4. Role of Social Performance in Predicting Learning Problems: Prediction of Risk Using Logistic Regression Analysis

    ERIC Educational Resources Information Center

    Del Prette, Zilda Aparecida Pereira; Prette, Almir Del; De Oliveira, Lael Almeida; Gresham, Frank M.; Vance, Michael J.

    2012-01-01

    Social skills are specific behaviors that individuals exhibit in order to successfully complete social tasks whereas social competence represents judgments by significant others that these social tasks have been successfully accomplished. The present investigation identified the best sociobehavioral predictors obtained from different raters…

  5. Logistic regression analysis to predict weaning-to-estrous interval in first-litter gilts

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Delayed return to estrus after weaning is a significant problem for swine producers. In this study, we investigated the relationships between weaning-to-estrous interval (WEI) and body weight (BW), back fat (BF), plasma leptin (L), glucose (G), albumin (A), urea nitrogen (PUN) concentrations and lit...

  6. Sourcing for Parameter Estimation and Study of Logistic Differential Equation

    ERIC Educational Resources Information Center

    Winkel, Brian J.

    2012-01-01

    This article offers modelling opportunities in which the phenomena of the spread of disease, perception of changing mass, growth of technology, and dissemination of information can be described by one differential equation--the logistic differential equation. It presents two simulation activities for students to generate real data, as well as…

  7. Accounting for Slipping and Other False Negatives in Logistic Models of Student Learning

    ERIC Educational Resources Information Center

    MacLellan, Christopher J.; Liu, Ran; Koedinger, Kenneth R.

    2015-01-01

    Additive Factors Model (AFM) and Performance Factors Analysis (PFA) are two popular models of student learning that employ logistic regression to estimate parameters and predict performance. This is in contrast to Bayesian Knowledge Tracing (BKT) which uses a Hidden Markov Model formalism. While all three models tend to make similar predictions,…

  8. Optimal distributions for multiplex logistic networks.

    PubMed

    Solá Conde, Luis E; Used, Javier; Romance, Miguel

    2016-06-01

    This paper presents some mathematical models for distribution of goods in logistic networks based on spectral analysis of complex networks. Given a steady distribution of a finished product, some numerical algorithms are presented for computing the weights in a multiplex logistic network that reach the equilibrium dynamics with high convergence rate. As an application, the logistic networks of Germany and Spain are analyzed in terms of their convergence rates. PMID:27368801

  9. Integrated Computer System of Management in Logistics

    NASA Astrophysics Data System (ADS)

    Chwesiuk, Krzysztof

    2011-06-01

    This paper aims at presenting a concept of an integrated computer system of management in logistics, particularly in supply and distribution chains. Consequently, the paper includes the basic idea of the concept of computer-based management in logistics and components of the system, such as CAM and CIM systems in production processes, and management systems for storage, materials flow, and for managing transport, forwarding and logistics companies. The platform which integrates computer-aided management systems is that of electronic data interchange.

  10. Optimal distributions for multiplex logistic networks

    NASA Astrophysics Data System (ADS)

    Solá Conde, Luis E.; Used, Javier; Romance, Miguel

    2016-06-01

    This paper presents some mathematical models for distribution of goods in logistic networks based on spectral analysis of complex networks. Given a steady distribution of a finished product, some numerical algorithms are presented for computing the weights in a multiplex logistic network that reach the equilibrium dynamics with high convergence rate. As an application, the logistic networks of Germany and Spain are analyzed in terms of their convergence rates.

  11. Bayesian Spatial Quantile Regression

    PubMed Central

    Reich, Brian J.; Fuentes, Montserrat; Dunson, David B.

    2013-01-01

    Tropospheric ozone is one of the six criteria pollutants regulated by the United States Environmental Protection Agency under the Clean Air Act and has been linked with several adverse health effects, including mortality. Due to the strong dependence on weather conditions, ozone may be sensitive to climate change and there is great interest in studying the potential effect of climate change on ozone, and how this change may affect public health. In this paper we develop a Bayesian spatial model to predict ozone under different meteorological conditions, and use this model to study spatial and temporal trends and to forecast ozone concentrations under different climate scenarios. We develop a spatial quantile regression model that does not assume normality and allows the covariates to affect the entire conditional distribution, rather than just the mean. The conditional distribution is allowed to vary from site-to-site and is smoothed with a spatial prior. For extremely large datasets our model is computationally infeasible, and we develop an approximate method. We apply the approximate version of our model to summer ozone from 1997–2005 in the Eastern U.S., and use deterministic climate models to project ozone under future climate conditions. Our analysis suggests that holding all other factors fixed, an increase in daily average temperature will lead to the largest increase in ozone in the Industrial Midwest and Northeast. PMID:23459794

  12. Bayesian Spatial Quantile Regression.

    PubMed

    Reich, Brian J; Fuentes, Montserrat; Dunson, David B

    2011-03-01

    Tropospheric ozone is one of the six criteria pollutants regulated by the United States Environmental Protection Agency under the Clean Air Act and has been linked with several adverse health effects, including mortality. Due to the strong dependence on weather conditions, ozone may be sensitive to climate change and there is great interest in studying the potential effect of climate change on ozone, and how this change may affect public health. In this paper we develop a Bayesian spatial model to predict ozone under different meteorological conditions, and use this model to study spatial and temporal trends and to forecast ozone concentrations under different climate scenarios. We develop a spatial quantile regression model that does not assume normality and allows the covariates to affect the entire conditional distribution, rather than just the mean. The conditional distribution is allowed to vary from site-to-site and is smoothed with a spatial prior. For extremely large datasets our model is computationally infeasible, and we develop an approximate method. We apply the approximate version of our model to summer ozone from 1997-2005 in the Eastern U.S., and use deterministic climate models to project ozone under future climate conditions. Our analysis suggests that holding all other factors fixed, an increase in daily average temperature will lead to the largest increase in ozone in the Industrial Midwest and Northeast. PMID:23459794

  13. Canonical variate regression.

    PubMed

    Luo, Chongliang; Liu, Jin; Dey, Dipak K; Chen, Kun

    2016-07-01

    In many fields, multi-view datasets, measuring multiple distinct but interrelated sets of characteristics on the same set of subjects, together with data on certain outcomes or phenotypes, are routinely collected. The objective in such a problem is often two-fold: both to explore the association structures of multiple sets of measurements and to develop a parsimonious model for predicting the future outcomes. We study a unified canonical variate regression framework to tackle the two problems simultaneously. The proposed criterion integrates multiple canonical correlation analysis with predictive modeling, balancing between the association strength of the canonical variates and their joint predictive power on the outcomes. Moreover, the proposed criterion seeks multiple sets of canonical variates simultaneously to enable the examination of their joint effects on the outcomes, and is able to handle multivariate and non-Gaussian outcomes. An efficient algorithm based on variable splitting and Lagrangian multipliers is proposed. Simulation studies show the superior performance of the proposed approach. We demonstrate the effectiveness of the proposed approach in an [Formula: see text] intercross mice study and an alcohol dependence study. PMID:26861909

  14. Logistics Reduction Technologies for Exploration Missions

    NASA Technical Reports Server (NTRS)

    Broyan, James L., Jr.; Ewert, Michael K.; Fink, Patrick W.

    2014-01-01

    Human exploration missions under study are limited by the launch mass capacity of existing and planned launch vehicles. The logistical mass of crew items is typically considered separate from the vehicle structure, habitat outfitting, and life support systems. Although mass is typically the focus of exploration missions, due to its strong impact on launch vehicle and habitable volume for the crew, logistics volume also needs to be considered. NASA's Advanced Exploration Systems (AES) Logistics Reduction and Repurposing (LRR) Project is developing six logistics technologies guided by a systems engineering cradle-to-grave approach to enable after-use crew items to augment vehicle systems. Specifically, AES LRR is investigating the direct reduction of clothing mass, the repurposing of logistical packaging, the use of autonomous logistics management technologies, the processing of spent crew items to benefit radiation shielding and water recovery, and the conversion of trash to propulsion gases. Reduction of mass has a corresponding and significant impact to logistical volume. The reduction of logistical volume can reduce the overall pressurized vehicle mass directly, or indirectly benefit the mission by allowing for an increase in habitable volume during the mission. The systematic implementation of these types of technologies will increase launch mass efficiency by enabling items to be used for secondary purposes and improve the habitability of the vehicle as mission durations increase. Early studies have shown that the use of advanced logistics technologies can save approximately 20 m(sup 3) of volume during transit alone for a six-person Mars conjunction class mission.

  15. Analysis of Jingdong Mall Logistics Distribution Model

    NASA Astrophysics Data System (ADS)

    Shao, Kang; Cheng, Feng

    In recent years, the development of electronic commerce in our country has accelerated. The role of logistics has been highlighted, and more and more electronic commerce enterprises are beginning to realize the importance of logistics to the success or failure of the enterprise. In this paper, the authors take Jingdong Mall as an example, perform a SWOT analysis of the current state of its self-built logistics system, identify the problems in Jingdong Mall's current logistics distribution, and give appropriate recommendations.

  16. Linear regression in astronomy. I

    NASA Technical Reports Server (NTRS)

    Isobe, Takashi; Feigelson, Eric D.; Akritas, Michael G.; Babu, Gutti Jogesh

    1990-01-01

    Five methods for obtaining linear regression fits to bivariate data with unknown or insignificant measurement errors are discussed: ordinary least-squares (OLS) regression of Y on X, OLS regression of X on Y, the bisector of the two OLS lines, orthogonal regression, and 'reduced major-axis' regression. These methods have been used by various researchers in observational astronomy, most importantly in cosmic distance scale applications. Formulas for calculating the slope and intercept coefficients and their uncertainties are given for all the methods, including a new general form of the OLS variance estimates. The accuracy of the formulas was confirmed using numerical simulations. The applicability of the procedures is discussed with respect to their mathematical properties, the nature of the astronomical data under consideration, and the scientific purpose of the regression. It is found that, for problems needing symmetrical treatment of the variables, the OLS bisector performs significantly better than orthogonal or reduced major-axis regression.
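
    As an illustration of three of the five fits named above, the following minimal sketch computes the slopes of OLS(Y|X), OLS(X|Y), and their bisector. The function name `ols_slopes` and the sample data are hypothetical. The bisector slope is taken to be the tangent of the mean of the two lines' inclination angles, written in closed form; uncertainties, orthogonal regression, and reduced major-axis regression are not covered here.

```python
import math

def ols_slopes(x, y):
    """Return (b1, b2, b3): slopes of OLS(Y|X), OLS(X|Y), and their bisector.

    All three slopes are expressed as dy/dx so they can be compared directly.
    """
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))

    b1 = sxy / sxx  # regression of Y on X
    b2 = syy / sxy  # regression of X on Y, inverted to a dy/dx slope
    # Line bisecting the angle between the two OLS lines.
    b3 = (b1 * b2 - 1.0 + math.sqrt((1.0 + b1 ** 2) * (1.0 + b2 ** 2))) / (b1 + b2)
    return b1, b2, b3

if __name__ == "__main__":
    # Hypothetical scattered data with positive correlation.
    print(ols_slopes([0, 1, 2, 3], [0, 1, 1, 3]))
```

    Each line passes through the centroid of the data, so the matching intercept is the mean of y minus the slope times the mean of x.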

  17. FUTURE LOGISTICS AND OPERATIONAL ADAPTABILITY

    SciTech Connect

    Houck, Roger P.

    2009-10-01

    While we cannot predict the future, we can ascertain trends and examine them through the use of alternative futures methodologies and tools. From a logistics perspective, we know that many different futures are possible, all of which are obviously dependent on decisions we make in the present. As professional logisticians we are obligated to provide the field - our Soldiers - with our best professional opinion of what will result in success on the battlefield. Our view of the future should take history and contemporary conflict into account, but it must also consider that continuity with the past cannot be taken for granted. If we are too focused on past and current experience, then our vision of the future will be limited indeed. On the one hand, the future must be explained in language that does not defy common sense. On the other hand, the pace of change is such that we must conduct qualitative and quantitative trend analyses, forecasting, and explorative scenario development in ways that allow for significant breaks - or "shocks" - that may "change the game". We will need capabilities and solutions that are constantly evolving - and improving - to match the operational tempo of a radically changing threat environment. For those who provide quartermaster services, this article will briefly examine what this means from the perspective of creating what might be termed a preferred future.

  18. Biomass supply logistics and infrastructure.

    PubMed

    Sokhansanj, Shahabaddine; Hess, J Richard

    2009-01-01

    Feedstock supply system encompasses numerous unit operations necessary to move lignocellulosic feedstock from the place where it is produced (in the field or on the stump) to the start of the conversion process (reactor throat) of the biorefinery. These unit operations, which include collection, storage, preprocessing, handling, and transportation, represent one of the largest technical and logistics challenges to the emerging lignocellulosic biorefining industry. This chapter briefly reviews the methods of estimating the quantities of biomass, followed by harvesting and collection processes based on current practices on handling wet and dry forage materials. Storage and queuing are used to deal with seasonal harvest times, variable yields, and delivery schedules. Preprocessing can be as simple as grinding and formatting the biomass for increased bulk density or improved conversion efficiency, or it can be as complex as improving feedstock quality through fractionation, tissue separation, drying, blending, and densification. Handling and transportation consists of using a variety of transport equipment (truck, train, ship) for moving the biomass from one point to another. The chapter also provides typical cost figures for harvest and processing of biomass. PMID:19768612

  19. Biomass Supply Logistics and Infrastructure

    SciTech Connect

    Sokhansanj, Shahabaddine

    2009-04-01

    Feedstock supply system encompasses numerous unit operations necessary to move lignocellulosic feedstock from the place where it is produced (in the field or on the stump) to the start of the conversion process (reactor throat) of the Biorefinery. These unit operations, which include collection, storage, preprocessing, handling, and transportation, represent one of the largest technical and logistics challenges to the emerging lignocellulosic biorefining industry. This chapter briefly reviews methods of estimating the quantities of biomass followed by harvesting and collection processes based on current practices on handling wet and dry forage materials. Storage and queuing are used to deal with seasonal harvest times, variable yields, and delivery schedules. Preprocessing can be as simple as grinding and formatting the biomass for increased bulk density or improved conversion efficiency, or it can be as complex as improving feedstock quality through fractionation, tissue separation, drying, blending, and densification. Handling and Transportation consists of using a variety of transport equipment (truck, train, ship) for moving the biomass from one point to another. The chapter also provides typical cost figures for harvest and processing of biomass.

  20. Biomass Supply Logistics and Infrastructure

    NASA Astrophysics Data System (ADS)

    Sokhansanj, Shahabaddine; Hess, J. Richard

    Feedstock supply system encompasses numerous unit operations necessary to move lignocellulosic feedstock from the place where it is produced (in the field or on the stump) to the start of the conversion process (reactor throat) of the biorefinery. These unit operations, which include collection, storage, preprocessing, handling, and transportation, represent one of the largest technical and logistics challenges to the emerging lignocellulosic biorefining industry. This chapter briefly reviews the methods of estimating the quantities of biomass, followed by harvesting and collection processes based on current practices on handling wet and dry forage materials. Storage and queuing are used to deal with seasonal harvest times, variable yields, and delivery schedules. Preprocessing can be as simple as grinding and formatting the biomass for increased bulk density or improved conversion efficiency, or it can be as complex as improving feedstock quality through fractionation, tissue separation, drying, blending, and densification. Handling and transportation consists of using a variety of transport equipment (truck, train, ship) for moving the biomass from one point to another. The chapter also provides typical cost figures for harvest and processing of biomass.

  1. Evaluating differential effects using regression interactions and regression mixture models

    PubMed Central

    Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung

    2015-01-01

    Research increasingly emphasizes understanding differential effects. This paper focuses on understanding regression mixture models, a relatively new statistical method for assessing differential effects, by comparing results to those obtained using an interactive term in linear regression. The research questions which each model answers, their formulation, and their assumptions are compared using Monte Carlo simulations and real data analysis. The capabilities of regression mixture models are described and specific issues to be addressed when conducting regression mixtures are proposed. The paper aims to clarify the role that regression mixtures can take in the estimation of differential effects and increase awareness of the benefits and potential pitfalls of this approach. Regression mixture models are shown to be a potentially effective exploratory method for finding differential effects when these effects can be defined by a small number of classes of respondents who share a typical relationship between a predictor and an outcome. It is also shown that the comparison between regression mixture models and interactions becomes substantially more complex as the number of classes increases. It is argued that regression interactions are well suited for direct tests of specific hypotheses about differential effects and regression mixtures provide a useful approach for exploring effect heterogeneity given adequate samples and study design. PMID:26556903

  2. Annual report procurement and logistics management center Sandia National Laboratories fiscal year 2002.

    SciTech Connect

    Palmer, David L.

    2003-05-01

    This report summarizes the purchasing and transportation activities of the Procurement and Logistics Management Center for Fiscal Year 2002. Activities for both the New Mexico and California locations are included.

  3. Visualizing the Logistic Map with a Microcontroller

    ERIC Educational Resources Information Center

    Serna, Juan D.; Joshi, Amitabh

    2012-01-01

    The logistic map is one of the simplest nonlinear dynamical systems that clearly exhibits the route to chaos. In this paper, we explore the evolution of the logistic map using an open-source microcontroller connected to an array of light-emitting diodes (LEDs). We divide the one-dimensional domain interval [0,1] into ten equal parts, an associate…
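
    A minimal sketch of the iteration described above, assuming the standard form x_{n+1} = r x_n (1 - x_n). The function names, the parameter values, and the starting point are hypothetical. How each sub-interval is tied to hardware is not stated in the excerpt, so only the binning of [0,1] into ten equal parts is modeled.

```python
def logistic_map_orbit(r, x0, n_steps):
    """Iterate x_{n+1} = r * x_n * (1 - x_n) and return the orbit as a list."""
    orbit = [x0]
    for _ in range(n_steps):
        orbit.append(r * orbit[-1] * (1.0 - orbit[-1]))
    return orbit

def bin_index(x, n_bins=10):
    """Map x in [0, 1] to one of n_bins equal sub-intervals (0 .. n_bins - 1)."""
    return min(int(x * n_bins), n_bins - 1)

if __name__ == "__main__":
    # Three illustrative values of r (chosen here, not taken from the paper).
    for r in (2.8, 3.2, 3.9):
        tail = logistic_map_orbit(r, 0.2, 200)[-8:]
        print(r, [bin_index(x) for x in tail])
```

    Printing the bin indices of the last few iterates gives a coarse picture of whether the orbit has settled on one value, alternates between a few, or wanders.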

  4. Exploration Mission Benefits From Logistics Reduction Technologies

    NASA Technical Reports Server (NTRS)

    Broyan, James Lee, Jr.; Ewert, Michael K.; Schlesinger, Thilini

    2016-01-01

    Technologies that reduce logistical mass, volume, and the crew time dedicated to logistics management become more important as exploration missions extend further from the Earth. Even modest reductions in logistical mass can have a significant impact because it also reduces the packaging burden. NASA's Advanced Exploration Systems' Logistics Reduction Project is developing technologies that can directly reduce the mass and volume of crew clothing and metabolic waste collection. Also, cargo bags have been developed that can be reconfigured for crew outfitting, and trash processing technologies are under development to increase habitable volume and improve protection against solar storm events. Additionally, Mars class missions are sufficiently distant that even logistics management without resupply can be problematic due to the communication time delay with Earth. Although exploration vehicles are launched with all consumables and logistics in a defined configuration, the configuration continually changes as the mission progresses. Traditionally significant ground and crew time has been required to understand the evolving configuration and to help locate misplaced items. For key mission events and unplanned contingencies, the crew will not be able to rely on the ground for logistics localization assistance. NASA has been developing a radio-frequency-identification autonomous logistics management system to reduce crew time for general inventory and enable greater crew self-response to unplanned events when a wide range of items may need to be located in a very short time period. This paper provides a status of the technologies being developed and their mission benefits for exploration missions.

  5. On some generalized discrete logistic maps

    PubMed Central

    Radwan, Ahmed G.

    2012-01-01

    Recently, conventional logistic maps have been used in different vital applications like modeling and security. Unfortunately, the conventional logistic maps can tolerate only one changeable parameter. In this paper, three different generalized logistic maps are introduced with arbitrary powers which can be reduced to the conventional logistic map. The added parameter (arbitrary power) increases the degree of freedom of each map and gives us a versatile response that can fit many applications. Therefore, the conventional logistic map is considered only a special case of each proposed map. This new parameter increases the flexibility of the system, and illustrates the performance of the conventional system within any required neighborhood. Many cases will be illustrated showing the effect of the arbitrary power and the equation parameter on the number of equilibrium points, their locations, stability conditions, and bifurcation diagrams up to the chaotic behavior. PMID:25685414

  6. Comparison of logistic equations for population growth.

    PubMed

    Jensen, A L

    1975-12-01

    Two different forms of the logistic equation for population growth appear in the ecological literature. In the form of the logistic equation that appears in recent ecology textbooks the parameters are the instantaneous rate of natural increase per individual and the carrying capacity of the environment. In the form of the logistic equation that appears in some older literature the parameters are the instantaneous birth rate per individual and the carrying capacity. The decision whether to use one form or the other depends on which form of the equation is biologically more realistic. In this study the form of the logistic equation in which the instantaneous birth rate per individual is a parameter is shown to be more realistic in terms of the birth and death processes of population growth. Application of the logistic equation to calculate yield from an exploited fish population also shows that the parameters must be the instantaneous birth rate per individual and the carrying capacity. PMID:1203427
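
    For reference, the textbook form mentioned above, with the instantaneous rate of natural increase per individual r and the carrying capacity K as parameters, can be written as follows. The symbols N, N_0, and t are notation assumed here. The birth-rate form that the study favors is not spelled out in the abstract and is not reproduced.

```latex
% Logistic growth with intrinsic rate r and carrying capacity K
% (assumed notation: N(t) population size, N_0 = N(0)).
\[
  \frac{dN}{dt} = r N \left( 1 - \frac{N}{K} \right),
  \qquad
  N(t) = \frac{K}{1 + \dfrac{K - N_0}{N_0}\, e^{-rt}} .
\]
```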

  7. Linear regression in astronomy. II

    NASA Technical Reports Server (NTRS)

    Feigelson, Eric D.; Babu, Gutti J.

    1992-01-01

    A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.

  8. Quantile regression for climate data

    NASA Astrophysics Data System (ADS)

    Marasinghe, Dilhani Shalika

    Quantile regression is a developing statistical tool used to explain the relationship between response and predictor variables. This thesis describes two examples from climatology using quantile regression. Our main goal is to estimate derivatives of a conditional mean and/or conditional quantile function. We introduce a method to handle autocorrelation in the framework of quantile regression and use it with the temperature data. We also explain some properties of the tornado data, which are non-normally distributed. Even though quantile regression provides a more comprehensive view, when the residuals satisfy the normality and constant-variance assumptions, we would prefer least-squares regression for our temperature analysis. When the normality and constant-variance assumptions do not hold, quantile regression is a better candidate for the estimation of the derivative.
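
    As background on what quantile regression optimizes, the sketch below shows the usual check (pinball) loss and uses it to recover a sample quantile by brute force. The function names and the data are hypothetical, and this is not the thesis's method: it has no covariates and does not handle autocorrelation.

```python
def check_loss(u, tau):
    """Check (pinball) loss: tau * u for u >= 0, (tau - 1) * u otherwise."""
    return u * tau if u >= 0 else u * (tau - 1.0)

def total_loss(data, q, tau):
    """Sum of check losses of the residuals y - q over the sample."""
    return sum(check_loss(y - q, tau) for y in data)

def sample_quantile(data, tau):
    """Return the data point that minimizes the total check loss for level tau."""
    return min(sorted(data), key=lambda q: total_loss(data, q, tau))

if __name__ == "__main__":
    # Hypothetical sample with one large outlier.
    data = [1, 2, 3, 4, 100]
    print(sample_quantile(data, 0.5), sample_quantile(data, 0.9))
```

    Replacing the constant q with a linear function of the predictors and minimizing the same loss gives a linear quantile regression fit. Unlike the least-squares mean, the median here is unaffected by the size of the outlier.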

  9. Evaluating Differential Effects Using Regression Interactions and Regression Mixture Models

    ERIC Educational Resources Information Center

    Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung

    2015-01-01

    Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their…

  10. Retro-regression--another important multivariate regression improvement.

    PubMed

    Randić, M

    2001-01-01

    We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when in a stepwise regression a descriptor is included or excluded from a regression. The consequence is an unpredictable change of the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. This process typically causes at different steps of the stepwise regression a replacement of several previously used descriptors by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined showing how it resolves the ambiguities associated with both "nightmares" of the first and the second kind of MRA. PMID:11410035

  11. A SAS macro for residual deviance of ordinal regression analysis.

    PubMed

    Wan, J Y; Wang, W; Bromberg, J

    1994-12-01

    In this paper, a SAS macro is described for calculating the likelihood of the 'saturated' model in the analysis of ordinal regression. The outcome variable is multinomial on an ordinal scale, while the explanatory variables can be nominal or ordinal. Several ordinal regression models may be fitted to the data. One method of testing for the goodness of fit of these regression models is by comparing the residual deviance with the chi-square (χ²) distribution. In SAS, PROC LOGISTIC may be used to fit this type of data with the proportional odds model. Unfortunately, the residual deviance is not available from the output. Our SAS macro will supplement the SAS output so that the residual deviance test may be carried out. The data from an ongoing HIV study are used as an illustration. PMID:7736732
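
    The residual deviance referred to above is usually defined from the two maximized log-likelihoods as sketched below. The symbols are notation assumed here, and the exact degrees of freedom used by the macro are not stated in the abstract.

```latex
% Residual deviance of a fitted model relative to the saturated model
% (assumed notation: \ell denotes a maximized log-likelihood).
\[
  D = 2 \left( \ell_{\text{saturated}} - \ell_{\text{fitted}} \right),
\]
% D is compared with a chi-square distribution whose degrees of freedom
% equal the difference in the number of free parameters of the two models.
```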

  12. Exploration Mission Benefits From Logistics Reduction Technologies

    NASA Technical Reports Server (NTRS)

    Broyan, James Lee, Jr.; Schlesinger, Thilini; Ewert, Michael K.

    2016-01-01

    Technologies that reduce logistical mass, volume, and the crew time dedicated to logistics management become more important as exploration missions extend further from the Earth. Even modest reductions in logistical mass can have a significant impact because it also reduces the packing burden. NASA's Advanced Exploration Systems' Logistics Reduction Project is developing technologies that can directly reduce the mass and volume of crew clothing and metabolic waste collection. Also, cargo bags have been developed that can be reconfigured for crew outfitting, and trash processing technologies to increase habitable volume and improve protection against solar storm events are under development. Additionally, Mars class missions are sufficiently distant that even logistics management without resupply can be problematic due to the communication time delay with Earth. Although exploration vehicles are launched with all consumables and logistics in a defined configuration, the configuration continually changes as the mission progresses. Traditionally, significant ground and crew time has been required to understand the evolving configuration and locate misplaced items. For key mission events and unplanned contingencies, the crew will not be able to rely on the ground for logistics localization assistance. NASA has been developing a radio frequency identification autonomous logistics management system to reduce crew time for general inventory and enable greater crew self-response to unplanned events when a wide range of items may need to be located in a very short time period. This paper provides a status of the technologies being developed and their mission benefits for exploration missions.

  13. Lunar Surface Architecture Utilization and Logistics Support Assessment

    NASA Astrophysics Data System (ADS)

    Bienhoff, Dallas; Findiesen, William; Bayer, Martin; Born, Andrew; McCormick, David

    2008-01-01

    Crew and equipment utilization and logistics support needs for the point of departure lunar outpost as presented by the NASA Lunar Architecture Team (LAT) and alternative surface architectures were assessed for the first ten years of operation. The lunar surface architectures were evaluated and manifests created for each mission. Distances between Lunar Surface Access Module (LSAM) landing sites and emplacement locations were estimated. Physical characteristics were assigned to each surface element and operational characteristics were assigned to each surface mobility element. Stochastic analysis was conducted to assess probable times to deploy surface elements, conduct exploration excursions, and perform defined crew activities. Crew time is divided into Outpost-related, exploration and science, overhead, and personal activities. Outpost-related time includes element deployment, EVA maintenance, IVA maintenance, and logistics resupply. Exploration and science activities include mapping, geological surveys, science experiment deployment, sample analysis and categorizing, and physiological and biological tests in the lunar environment. Personal activities include sleeping, eating, hygiene, exercising, and time off. Overhead activities include precursor or close-out tasks that must be accomplished but don't fit into the other three categories such as: suit donning and doffing, airlock cycle time, suit cleaning, suit maintenance, post-landing safing actions, and pre-departure preparations. Equipment usage time, spares, maintenance actions, and Outpost consumables are also estimated to provide input into logistics support planning. Results are normalized relative to the NASA LAT point of departure lunar surface architecture.

  14. Precision Efficacy Analysis for Regression.

    ERIC Educational Resources Information Center

    Brooks, Gordon P.

    When multiple linear regression is used to develop a prediction model, sample size must be large enough to ensure stable coefficients. If the derivation sample size is inadequate, the model may not predict well for future subjects. The precision efficacy analysis for regression (PEAR) method uses a cross- validity approach to select sample sizes…

  15. Ecological Regression and Voting Rights.

    ERIC Educational Resources Information Center

    Freedman, David A.; And Others

    1991-01-01

    The use of ecological regression in voting rights cases is discussed in the context of a lawsuit against Los Angeles County (California) in 1990. Ecological regression assumes that systematic voting differences between precincts are explained by ethnic differences. An alternative neighborhood model is shown to lead to different conclusions. (SLD)

  16. [Regression grading in gastrointestinal tumors].

    PubMed

    Tischoff, I; Tannapfel, A

    2012-02-01

    Preoperative neoadjuvant chemoradiation therapy is a well-established and essential part of the interdisciplinary treatment of gastrointestinal tumors. Neoadjuvant treatment leads to regressive changes in tumors. To evaluate the histological tumor response, different scoring systems describing regressive changes are used; these are known as tumor regression grading. Tumor regression grading is usually based on the presence of residual vital tumor cells in proportion to the total tumor size. Currently, no nationally or internationally accepted grading systems exist. In general, common guidelines should be used in the pathohistological diagnostics of tumors after neoadjuvant therapy. In particular, the standard tumor grading will be replaced by tumor regression grading. Furthermore, tumors after neoadjuvant treatment are marked with the prefix "y" in the TNM classification. PMID:22293790

  17. A novel HSD17B10 mutation impairing the activities of the mitochondrial RNase P complex causes X-linked intractable epilepsy and neurodevelopmental regression.

    PubMed

    Falk, Marni J; Gai, Xiaowu; Shigematsu, Megumi; Vilardo, Elisa; Takase, Ryuichi; McCormick, Elizabeth; Christian, Thomas; Place, Emily; Pierce, Eric A; Consugar, Mark; Gamper, Howard B; Rossmanith, Walter; Hou, Ya-Ming

    2016-05-01

    We report a Caucasian boy with intractable epilepsy and global developmental delay. Whole-exome sequencing identified the likely genetic etiology as a novel p.K212E mutation in the X-linked gene HSD17B10 for mitochondrial short-chain dehydrogenase/reductase SDR5C1. Mutations in HSD17B10 cause the HSD10 disease, traditionally classified as a metabolic disorder due to the role of SDR5C1 in fatty and amino acid metabolism. However, SDR5C1 is also an essential subunit of human mitochondrial RNase P, the enzyme responsible for 5'-processing and methylation of purine-9 of mitochondrial tRNAs. Here we show that the p.K212E mutation impairs the SDR5C1-dependent mitochondrial RNase P activities, and suggest that the pathogenicity of p.K212E is due to a general mitochondrial dysfunction caused by reduction in SDR5C1-dependent maturation of mitochondrial tRNAs. PMID:26950678

  18. Regression approaches in the test-negative study design for assessment of influenza vaccine effectiveness.

    PubMed

    Bond, H S; Sullivan, S G; Cowling, B J

    2016-06-01

    Influenza vaccination is the most practical means available for preventing influenza virus infection and is widely used in many countries. Because vaccine components and circulating strains frequently change, it is important to continually monitor vaccine effectiveness (VE). The test-negative design is frequently used to estimate VE. In this design, patients meeting the same clinical case definition are recruited and tested for influenza; those who test positive are the cases and those who test negative form the comparison group. When determining VE in these studies, the typical approach has been to use logistic regression, adjusting for potential confounders. Because vaccine coverage and influenza incidence change throughout the season, time is included among these confounders. While most studies use unconditional logistic regression, adjusting for time, an alternative approach is to use conditional logistic regression, matching on time. Here, we used simulation data to examine the potential for both regression approaches to permit accurate and robust estimates of VE. In situations where vaccine coverage changed during the influenza season, the conditional model and unconditional models adjusting for categorical week and using a spline function for week provided more accurate estimates. We illustrated the two approaches on data from a test-negative study of influenza VE against hospitalization in children in Hong Kong which resulted in the conditional logistic regression model providing the best fit to the data. PMID:26732691
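
    The effect of adjusting for time in a test-negative study can be illustrated with a small numerical sketch. The Python below is not the authors' simulation and does not fit a logistic regression. As a simpler stand-in, it compares a crude odds ratio with a Mantel-Haenszel odds ratio stratified by week, and it assumes the usual convention that VE is one minus the odds ratio of vaccination among test-positive versus test-negative patients. All counts and function names are hypothetical.

```python
def odds_ratio(vacc_cases, unvacc_cases, vacc_controls, unvacc_controls):
    """Odds ratio of vaccination, test-positive cases vs test-negative controls."""
    return (vacc_cases * unvacc_controls) / (unvacc_cases * vacc_controls)

def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio over strata (here, calendar weeks).

    Each stratum is (vacc_cases, unvacc_cases, vacc_controls, unvacc_controls).
    """
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

def vaccine_effectiveness(or_value):
    """VE as a proportion, assuming VE = 1 - OR."""
    return 1.0 - or_value

# Hypothetical data: coverage and influenza incidence both rise from week 1 to week 2.
weeks = [
    (5, 20, 40, 80),   # week 1: low coverage, few cases
    (40, 40, 40, 20),  # week 2: high coverage, many cases
]

# Crude analysis: collapse the weeks into one 2x2 table.
pooled = [sum(col) for col in zip(*weeks)]
crude_ve = vaccine_effectiveness(odds_ratio(*pooled))

# Time-stratified analysis.
stratified_ve = vaccine_effectiveness(mantel_haenszel_or(weeks))

print(f"crude VE = {crude_ve:.4f}, week-stratified VE = {stratified_ve:.4f}")
```

    In this constructed example the within-week odds ratio is 0.5 in both weeks, so the stratified estimate recovers a VE of 50%, while the crude estimate is pulled toward zero because vaccinated patients are concentrated in the high-incidence week.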

  19. Logistics modeling of future solid waste storage, treatment, and disposal

    SciTech Connect

    Holter, G.M.; Stiles, D.L.; Shaver, S.R.; Armacost, L.L.

    1993-11-01

    Logistics modeling is a powerful analytical technique for effective planning of waste storage, treatment, and disposal activities. Logistics modeling facilitates analyses of alternate scenarios for future waste flows, facility schedules, and processing or handling capacities. These analyses provide an increased understanding of the specific needs for waste storage, treatment, and disposal while adequate time remains to plan accordingly. They also help to determine the sensitivity of these needs to various system parameters. This paper discusses a logistics modeling system developed by the Pacific Northwest Laboratory (PNL) to aid in solid waste planning for a large industrial complex managing many different types and classifications of waste. The basic needs for such a system are outlined, and the approach adopted in developing the system is described. A key component of this approach is the development of a conceptual model that provides a flexible framework for modeling the waste management system and addressing the range of logistics and economic issues involved. Developing an adequate description of the waste management system being analyzed is discussed. Examples are then provided of the types of analyses that have been conducted. The potential application of this modeling system to different settings is also examined.

  20. Vehicle scheduling schemes for commercial and emergency logistics integration.

    PubMed

    Li, Xiaohui; Tan, Qingmei

    2013-01-01

    In modern logistics operations, large-scale logistics companies, besides actively participating in profit-seeking commercial business, also play an essential role during an emergency relief process by dispatching urgently required materials to disaster-affected areas. Therefore, the question of how logistics companies can achieve maximum commercial profit on condition that emergency tasks are performed effectively and satisfactorily has been widely addressed by logistics practitioners and has attracted growing attention from researchers. In this paper, two vehicle scheduling models are proposed to solve the problem. One is a prediction-related scheme, which predicts the amounts of disaster-relief materials and commercial business and then accepts the business that will generate maximum profits; the other is a priority-directed scheme, which first groups commercial and emergency business according to priority grades and then schedules both types of business jointly and simultaneously by arriving at the maximum priority in total. Moreover, computer-based simulations are carried out to evaluate the performance of these two models by comparing them with two traditional disaster-relief tactics in China. The results testify to the feasibility and effectiveness of the proposed models. PMID:24391724

  1. Vehicle Scheduling Schemes for Commercial and Emergency Logistics Integration

    PubMed Central

    Li, Xiaohui; Tan, Qingmei

    2013-01-01

    In modern logistics operations, large-scale logistics companies, besides actively participating in profit-seeking commercial business, also play an essential role during an emergency relief process by dispatching urgently required materials to disaster-affected areas. Therefore, the question of how logistics companies can achieve maximum commercial profit on condition that emergency tasks are performed effectively and satisfactorily has been widely addressed by logistics practitioners and has attracted growing attention from researchers. In this paper, two vehicle scheduling models are proposed to solve the problem. One is a prediction-related scheme, which predicts the amounts of disaster-relief materials and commercial business and then accepts the business that will generate maximum profits; the other is a priority-directed scheme, which first groups commercial and emergency business according to priority grades and then schedules both types of business jointly and simultaneously by arriving at the maximum priority in total. Moreover, computer-based simulations are carried out to evaluate the performance of these two models by comparing them with two traditional disaster-relief tactics in China. The results testify to the feasibility and effectiveness of the proposed models. PMID:24391724

  2. Parental Youth Assets and Sexual Activity: Differences by Race/Ethnicity

    ERIC Educational Resources Information Center

    Tolma, Eleni L.; Oman, Roy F.; Vesely, Sara K.; Aspy, Cheryl B.; Beebe, Laura; Fluhr, Janene

    2011-01-01

    Objectives: To examine how the relationship between parental-related youth assets and youth sexual activity differed by race/ethnicity. Methods: A random sample of 976 youth and their parents living in a Midwestern city participated in the study. Multivariate logistic regression analyses were conducted for 3 major ethnic groups controlling for the…

  3. Obesity, Physical Activity, and Sedentary Behavior of Youth with Learning Disabilities and ADHD

    ERIC Educational Resources Information Center

    Cook, Bryan G.; Li, Dongmei; Heinrich, Katie M.

    2015-01-01

    Obesity, physical activity, and sedentary behavior in childhood are important indicators of present and future health and are associated with school-related outcomes such as academic achievement, behavior, peer relationships, and self-esteem. Using logistic regression models that controlled for gender, age, ethnicity/race, and socioeconomic…

  4. Out-of-School Time Activity Participation among US-Immigrant Youth

    ERIC Educational Resources Information Center

    Yu, Stella M.; Newport-Berra, McHale; Liu, Jihong

    2015-01-01

    Background: Structured out-of-school time (OST) activities are associated with positive academic and psychosocial outcomes. Methods: Data came from the 2007 National Survey of Children's Health, restricted to 36,132 youth aged 12-17 years. Logistic regression models were used to examine the joint effects of race/ethnicity and immigrant family type…

  5. Front-End Analysis Cornerstone of Logistics

    NASA Technical Reports Server (NTRS)

    Nager, Paul J.

    2000-01-01

    The presentation provides an overview of Front-End Logistics Support Analysis (FELSA), when it should be performed, benefits of performing FELSA and why it should be performed, how it is conducted, and examples.

  6. In-space propellant logistics. Volume 4: Project planning data

    NASA Technical Reports Server (NTRS)

    1972-01-01

    The prephase A conceptual project planning data are presented as they pertain to the development of the selected logistics module configuration transported into earth orbit by the space shuttle orbiter. The data represent the test, implementation, and supporting research and technology requirements for attaining the propellant transfer operational capability for early 1985. The plan is based on a propellant module designed to support the space-based tug with cryogenic oxygen-hydrogen propellants. A logical sequence of activities that is required to define, design, develop, fabricate, test, launch, and flight test the propellant logistics module is described. Included are the facility and ground support equipment requirements. The schedule of activities is based on the evolution and relationship between the R and T, the development issues, and the resultant test program.

  7. Practical Session: Simple Linear Regression

    NASA Astrophysics Data System (ADS)

    Clausel, M.; Grégoire, G.

    2014-12-01

    Two exercises are proposed to illustrate simple linear regression. The first one is based on Galton's famous data set on heredity. We use the lm R command and get coefficient estimates, standard error of the error, R2, residuals … In the second example, devoted to data related to the vapor tension of mercury, we fit a simple linear regression, predict values, and look ahead to multiple linear regression. This practical session is an excerpt from practical exercises proposed by A. Dalalyan at EPNC (see Exercises 1 and 2 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_4.pdf).

  8. Splines for Diffeomorphic Image Regression

    PubMed Central

    Singh, Nikhil; Niethammer, Marc

    2016-01-01

    This paper develops a method for splines on diffeomorphisms for image regression. In contrast to previously proposed methods to capture image changes over time, such as geodesic regression, the method can capture more complex spatio-temporal deformations. In particular, it is a first step towards capturing periodic motions, for example of the heart or the lung. Starting from a variational formulation of splines, the proposed approach allows for the use of temporal control points to control spline behavior. This necessitates the development of a shooting formulation for splines. Experimental results are shown for synthetic and real data. The performance of the method is compared to geodesic regression. PMID:25485370

  9. Logistics Reduction Technologies for Exploration Missions

    NASA Technical Reports Server (NTRS)

    Broyan, James L., Jr.; Ewert, Michael K.; Fink, Patrick W.

    2014-01-01

    Human exploration missions under study are very limited by the launch mass capacity of existing and planned vehicles. The logistical mass of crew items is typically considered separate from the vehicle structure, habitat outfitting, and life support systems. Consequently, crew item logistical mass is typically competing with vehicle systems for mass allocation. NASA's Advanced Exploration Systems (AES) Logistics Reduction and Repurposing (LRR) Project is developing five logistics technologies guided by a systems engineering cradle-to-grave approach to enable used crew items to augment vehicle systems. Specifically, AES LRR is investigating the direct reduction of clothing mass, the repurposing of logistical packaging, the use of autonomous logistics management technologies, the processing of spent crew items to benefit radiation shielding and water recovery, and the conversion of trash to propulsion gases. The systematic implementation of these types of technologies will increase launch mass efficiency by enabling items to be used for secondary purposes and improve the habitability of the vehicle as the mission duration increases. This paper provides a description and the challenges of the five technologies under development and the estimated overall mission benefits of each technology.

  10. Abstract Expression Grammar Symbolic Regression

    NASA Astrophysics Data System (ADS)

    Korns, Michael F.

    This chapter examines the use of Abstract Expression Grammars to perform the entire Symbolic Regression process without the use of Genetic Programming per se. The techniques explored produce a symbolic regression engine which has absolutely no bloat, which allows total user control of the search space and output formulas, and which is faster and more accurate than the engines produced in our previous papers using Genetic Programming. The genome is an all-vector structure with four chromosomes plus additional epigenetic and constraint vectors, allowing total user control of the search space and the final output formulas. A combination of specialized compiler techniques, genetic algorithms, particle swarm, aged layered populations, plus discrete and continuous differential evolution is used to produce an improved symbolic regression system. Nine base test cases from the literature are used to test the improvement in speed and accuracy. The improved results indicate that these techniques move us a big step closer toward future industrial-strength symbolic regression systems.

  11. Multiple Regression and Its Discontents

    ERIC Educational Resources Information Center

    Snell, Joel C.; Marsh, Mitchell

    2012-01-01

    Multiple regression is part of a larger statistical strategy originated by Gauss. The authors raise questions about the theory and suggest some changes that would make room for Mandelbrot and Serendipity.

  12. Time-Warped Geodesic Regression

    PubMed Central

    Hong, Yi; Singh, Nikhil; Kwitt, Roland; Niethammer, Marc

    2016-01-01

    We consider geodesic regression with parametric time-warps. This allows, for example, the capture of saturation effects as typically observed during brain development or degeneration. While highly flexible models to analyze time-varying image and shape data based on generalizations of splines and polynomials have been proposed recently, they come at the cost of substantially more complex inference. Our focus in this paper is therefore to keep the model and its inference as simple as possible while still capturing expected biological variation. We demonstrate that by augmenting geodesic regression with parametric time-warp functions, we can achieve comparable flexibility to more complex models while retaining model simplicity. In addition, the time-warp parameters provide useful information about underlying anatomical changes, as demonstrated for the analysis of corpora callosa and rat calvariae. We exemplify our strategy for shape regression on the Grassmann manifold, but note that the method is generally applicable for time-warped geodesic regression. PMID:25485368

  13. Basis Selection for Wavelet Regression

    NASA Technical Reports Server (NTRS)

    Wheeler, Kevin R.; Lau, Sonie (Technical Monitor)

    1998-01-01

    A wavelet basis selection procedure is presented for wavelet regression. Both the basis and the threshold are selected using cross-validation. The method includes the capability of incorporating prior knowledge on the smoothness (or shape of the basis functions) into the basis selection procedure. The results of the method are demonstrated on sampled functions widely used in the wavelet regression literature. The results of the method are contrasted with other published methods.

  14. Regression methods for spatial data

    NASA Technical Reports Server (NTRS)

    Yakowitz, S. J.; Szidarovszky, F.

    1982-01-01

    The kriging approach, a parametric regression method used by hydrologists and mining engineers, among others, also provides an error estimate for the integral of the regression function. The kriging method is explored and some of its statistical characteristics are described. The Watson method and theory are extended so that the kriging features are displayed. Theoretical and computational comparisons of the kriging and Watson approaches are offered.

  15. Wrong Signs in Regression Coefficients

    NASA Technical Reports Server (NTRS)

    McGee, Holly

    1999-01-01

    When using parametric cost estimation, it is important to note the possibility of the regression coefficients having the wrong sign. A wrong sign is defined as a sign on the regression coefficient opposite to the researcher's intuition and experience. Some possible causes for the wrong sign discussed in this paper are a small range of x's, leverage points, missing variables, multicollinearity, and computational error. Additionally, techniques for determining the cause of the wrong sign are given.

  16. Definition of main pollen season using a logistic model.

    PubMed

    Ribeiro, Helena; Cunha, Mário; Abreu, Ilda

    2007-01-01

    This paper proposes a method to unify the definition of the main pollen season based on statistical analysis. For this, an aerobiological study was carried out in the Porto region (Portugal) from 2003 to 2005 using a 7-day Hirst-type volumetric spore trap. To define the main pollen season, a non-linear logistic regression model was fitted to the values of the accumulated sum of the daily airborne pollen concentration from several allergological species. An important feature of this method is that the main pollen season is characterized by the calculated model parameters. These parameters are identifiable aspects of the flowering phenology; they determine the beginning and end of the main pollen season and are also influenced by the meteorological conditions. The results obtained with the proposed methodology were also compared with two of the most widely used percentage methods. The logistic model fitted the sum of accumulated pollen well. The explained variance was always higher than 97%, and the exponential part of the predicted curve was well adjusted to the time when the highest atmospheric pollen concentrations were sampled. The comparison between the different methods tested showed large divergence in the duration and end dates of the main pollen season of the studied species. PMID:18247462
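
    The idea of describing accumulated pollen with a logistic curve can be sketched in a few lines of Python. The abstract does not give the authors' parameterization or their rule for deriving season dates from the parameters, so the three-parameter form below (asymptote K, rate r, midpoint day t0), the function names, and the use of fractions of the asymptote to mark the start and end of the season are all assumptions made for illustration.

```python
import math

def logistic_cumulative(t, K, r, t0):
    """Accumulated pollen sum on day t under an assumed logistic growth curve.

    K: asymptote (seasonal pollen total), r: growth rate, t0: midpoint day.
    """
    return K / (1.0 + math.exp(-r * (t - t0)))

def day_at_fraction(p, r, t0):
    """Day on which the curve reaches fraction p (0 < p < 1) of the asymptote."""
    return t0 + math.log(p / (1.0 - p)) / r

# Hypothetical parameter values, not taken from the paper.
K, r, t0 = 5000.0, 0.25, 120.0

# Season limits taken, as an assumption, at 5% and 95% of the asymptote.
start = day_at_fraction(0.05, r, t0)
end = day_at_fraction(0.95, r, t0)
print(f"season from day {start:.1f} to day {end:.1f}")
```

    Because the assumed curve is symmetric about t0, the start and end days in this sketch lie at equal distances from the midpoint; percentage methods applied to raw counts need not share that symmetry, which is one way the two kinds of definition can diverge.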

  17. ISS Update: Logistics Reduction and Repurposing (Part 2)

    NASA Video Gallery

    Public Affairs Officer Brandi Dean interviews Sarah Shull, Deputy Project Manager Logistics Reduction and Repurposing. Shull, who is with the Advanced Exploration Systems, discusses the Logistics t...

  18. ISS Update: Logistics Reduction and Repurposing (Part 1)

    NASA Video Gallery

    Public Affairs Officer Brandi Dean interviews Sarah Shull, Deputy Project Manager Logistics Reduction and Repurposing. Shull, who is with the Advanced Exploration Systems, discusses the Logistics t...

  19. Estimating Contraceptive Prevalence Using Logistics Data for Short-Acting Methods: Analysis Across 30 Countries

    PubMed Central

    Cunningham, Marc; Brown, Niquelle; Sacher, Suzy; Hatch, Benjamin; Inglis, Andrew; Aronovich, Dana

    2015-01-01

    Background: Contraceptive prevalence rate (CPR) is a vital indicator used by country governments, international donors, and other stakeholders for measuring progress in family planning programs against country targets and global initiatives as well as for estimating health outcomes. Because of the need for more frequent CPR estimates than population-based surveys currently provide, alternative approaches for estimating CPRs are being explored, including using contraceptive logistics data. Methods: Using data from the Demographic and Health Surveys (DHS) in 30 countries, population data from the United States Census Bureau International Database, and logistics data from the Procurement Planning and Monitoring Report (PPMR) and the Pipeline Monitoring and Procurement Planning System (PipeLine), we developed and evaluated 3 models to generate country-level, public-sector contraceptive prevalence estimates for injectable contraceptives, oral contraceptives, and male condoms. Models included: direct estimation through existing couple-years of protection (CYP) conversion factors, bivariate linear regression, and multivariate linear regression. Model evaluation consisted of comparing the referent DHS prevalence rates for each short-acting method with the model-generated prevalence rate using multiple metrics, including mean absolute error and proportion of countries where the modeled prevalence rate for each method was within 1, 2, or 5 percentage points of the DHS referent value. Results: For the methods studied, family planning use estimates from public-sector logistics data were correlated with those from the DHS, validating the quality and accuracy of current public-sector logistics data. Logistics data for oral and injectable contraceptives were significantly associated (P<.05) with the referent DHS values for both bivariate and multivariate models. For condoms, however, that association was only significant for the bivariate model. With the exception of the CYP

  20. Reverse logistics in the construction industry.

    PubMed

    Hosseini, M Reza; Rameezdeen, Raufdeen; Chileshe, Nicholas; Lehmann, Steffen

    2015-06-01

    Reverse logistics in construction refers to the movement of products and materials from salvaged buildings to a new construction site. While there is a plethora of studies looking at various aspects of the reverse logistics chain, there is no systematic review of literature on this important subject as applied to the construction industry. Therefore, the objective of this study is to integrate the fragmented body of knowledge on reverse logistics in construction, with the aim of promoting the concept among industry stakeholders and the wider construction community. Through a qualitative meta-analysis, the study synthesises the findings of previous studies and presents some actions needed by industry stakeholders to promote this concept within the real-life context. First, the trend of research and terminology related to reverse logistics is introduced. Second, the study unearths the main advantages and barriers of reverse logistics in construction while providing some suggestions to harness the advantages and mitigate these barriers. Finally, it provides a future research direction based on the review. PMID:26018543