Science.gov

Sample records for activities logistic regression

  1. Logistic Regression

    NASA Astrophysics Data System (ADS)

    Grégoire, G.

    2014-12-01

    Logistic regression was originally intended to explain the relationship between the probability of an event and a set of covariates. The model's coefficients can be interpreted via the odds and the odds ratio, which are presented in the introduction of the chapter. When the observations are obtained individually, we speak of binary logistic regression; when they are grouped, the logistic regression is said to be binomial. In our presentation we mainly focus on the binary case. For statistical inference the main tool is the maximum likelihood methodology: we present the Wald, Rao and likelihood ratio results and their use to compare nested models. The problems we intend to deal with are essentially the same as in multiple linear regression: testing a global effect, testing an individual effect, selecting variables to build a model, measuring the fit of the model, predicting new values, and so on. The methods are demonstrated on data sets using R. Finally, we briefly consider the binomial case and the situation where we are interested in several events, that is, polytomous (multinomial) logistic regression and the particular case of ordinal logistic regression.
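
    The odds-ratio interpretation mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the chapter (which uses R), and the counts below are hypothetical. It relies on the standard fact that, for a model with a single binary covariate, the maximum likelihood intercept is the log odds of the reference group and the slope is the log of the sample odds ratio.

```python
import math

# Hypothetical 2x2 table; the counts are made up for illustration only.
#               (event, no event)
exposed = (30, 70)
not_exposed = (10, 90)

def odds(events, non_events):
    """Odds of the event, p / (1 - p), computed from counts."""
    return events / non_events

odds_ratio = odds(*exposed) / odds(*not_exposed)

# With one binary covariate x (1 = exposed), the fitted logistic model is
# logit(p) = beta0 + beta1 * x with the closed-form estimates below.
beta0 = math.log(odds(*not_exposed))   # log odds in the reference group
beta1 = math.log(odds_ratio)           # log odds ratio

# Wald standard error of the log odds ratio and a 95% confidence interval.
se = math.sqrt(sum(1.0 / n for n in exposed + not_exposed))
ci = (math.exp(beta1 - 1.96 * se), math.exp(beta1 + 1.96 * se))

def predict(x):
    """Predicted event probability for covariate value x (0 or 1)."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))

print(f"odds ratio = {odds_ratio:.3f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

    Here predict(1) and predict(0) reproduce the observed event proportions in the two groups, which is a quick sanity check on the coefficient interpretation.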

  2. Practical Session: Logistic Regression

    NASA Astrophysics Data System (ADS)

    Clausel, M.; Grégoire, G.

    2014-12-01

    An exercise is proposed to illustrate logistic regression. It investigates the different risk factors in the occurrence of coronary heart disease. The exercise was proposed in Chapter 5 of the book by D.G. Kleinbaum and M. Klein, "Logistic Regression", Statistics for Biology and Health, Springer Science+Business Media, LLC (2010), and also by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr341.pdf). The example is based on data given in the file evans.txt, available from http://www.sph.emory.edu/dkleinb/logreg3.htm#data.

  3. [Understanding logistic regression].

    PubMed

    El Sanharawi, M; Naudet, F

    2013-10-01

    Logistic regression is one of the most common multivariate analysis models used in epidemiology. It allows the measurement of the association between the occurrence of an event (qualitative dependent variable) and factors likely to influence it (explanatory variables). The choice of explanatory variables that should be included in the logistic regression model is based on prior knowledge of the disease pathophysiology and the statistical association between the variable and the event, as measured by the odds ratio. The main steps of the procedure, the conditions of application, and the essential tools for its interpretation are discussed concisely. We also discuss the importance of the choice of variables that must be included and retained in the regression model in order to avoid the omission of important confounding factors. Finally, by way of illustration, we provide an example from the literature, which should help the reader test his or her knowledge.

  4. Steganalysis using logistic regression

    NASA Astrophysics Data System (ADS)

    Lubenko, Ivans; Ker, Andrew D.

    2011-02-01

    We advocate Logistic Regression (LR) as an alternative to the Support Vector Machine (SVM) classifiers commonly used in steganalysis. LR offers more information than traditional SVM methods (it estimates class probabilities as well as providing a simple classification) and can be adapted more easily and efficiently for multiclass problems. Like SVM, LR can be kernelised for nonlinear classification, and it shows comparable classification accuracy to SVM methods. This work is a case study, comparing accuracy and speed of SVM and LR classifiers in detection of LSB Matching and other related spatial-domain image steganography, through the state-of-the-art 686-dimensional SPAM feature set, in three image sets.

  5. Logistic Regression: Concept and Application

    ERIC Educational Resources Information Center

    Cokluk, Omay

    2010-01-01

    The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…

  6. Short Term Forecasts of Volcanic Activity Using An Event Tree Analysis System and Logistic Regression

    NASA Astrophysics Data System (ADS)

    Junek, W. N.; Jones, W. L.; Woods, M. T.

    2011-12-01

    An automated event tree analysis system for estimating the probability of short term volcanic activity is presented. The algorithm is driven by a suite of empirical statistical models that are derived through logistic regression. Each model is constructed from a multidisciplinary dataset that was assembled from a collection of historic volcanic unrest episodes. The dataset consists of monitoring measurements (e.g. InSAR, seismic), source modeling results, and historic eruption activity. This provides a simple mechanism for simultaneously accounting for the geophysical changes occurring within the volcano and the historic behavior of analog volcanoes. The algorithm is extensible and can be easily recalibrated to include new or additional monitoring, modeling, or historic information. Standard cross validation techniques are employed to optimize its forecasting capabilities. Analysis results from several recent volcanic unrest episodes are presented.

  7. Standards for Standardized Logistic Regression Coefficients

    ERIC Educational Resources Information Center

    Menard, Scott

    2011-01-01

    Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…

  8. Should metacognition be measured by logistic regression?

    PubMed

    Rausch, Manuel; Zehetleitner, Michael

    2017-03-01

    Are logistic regression slopes suitable to quantify metacognitive sensitivity, i.e. the efficiency with which subjective reports differentiate between correct and incorrect task responses? We analytically show that logistic regression slopes are independent from rating criteria in one specific model of metacognition, which assumes (i) that rating decisions are based on sensory evidence generated independently of the sensory evidence used for primary task responses and (ii) that the distributions of evidence are logistic. Given a hierarchical model of metacognition, logistic regression slopes depend on rating criteria. According to all considered models, regression slopes depend on the primary task criterion. A reanalysis of previous data revealed that massive numbers of trials are required to distinguish between hierarchical and independent models with tolerable accuracy. It is argued that researchers who wish to use logistic regression as measure of metacognitive sensitivity need to control the primary task criterion and rating criteria.

  9. Satellite rainfall retrieval by logistic regression

    NASA Technical Reports Server (NTRS)

    Chiu, Long S.

    1986-01-01

    The potential use of logistic regression in rainfall estimation from satellite measurements is investigated. Satellite measurements provide covariate information in terms of radiances from different remote sensors. The logistic regression technique can effectively accommodate many covariates and test their significance in the estimation. The outcome from the logistic model is the probability that the rainrate of a satellite pixel is above a certain threshold. By varying the thresholds, a rainrate histogram can be obtained, from which the mean and the variance can be estimated. A logistic model is developed and applied to rainfall data collected during GATE, using as covariates the fractional rain area and a radiance measurement which is deduced from a microwave temperature-rainrate relation. It is demonstrated that the fractional rain area is an important covariate in the model, consistent with the use of the so-called Area Time Integral in estimating total rain volume in other studies. To calibrate the logistic model, simulated rain fields generated by rainfield models with prescribed parameters are needed. A stringent test of the logistic model is its ability to recover the prescribed parameters of simulated rain fields. A rain field simulation model which preserves the fractional rain area and lognormality of rainrates as found in GATE is developed. A stochastic regression model of branching and immigration whose solutions are lognormally distributed in some asymptotic limits has also been developed.
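
    The step from threshold exceedance probabilities to a rainrate histogram, mean and variance can be sketched as follows. This is only an illustration under stated assumptions: the thresholds and probabilities are hypothetical (in the abstract's setting they would come from fitted logistic models), the top threshold is assumed never to be exceeded, and each bin is represented by its midpoint, which is a simplification chosen here rather than the report's procedure.

```python
# Hypothetical thresholds (mm/h) and exceedance probabilities P(R > t).
thresholds = [0.0, 1.0, 5.0, 10.0, 20.0]
exceedance = [0.40, 0.25, 0.10, 0.04, 0.0]

def histogram_from_exceedance(thresholds, exceedance):
    """Return (representative_values, probabilities) of the rainrate bins.

    P(t_k < R <= t_{k+1}) = P(R > t_k) - P(R > t_{k+1}).
    The mass P(R <= t_0) = 1 - exceedance[0] is placed at t_0 (no rain).
    """
    values = [thresholds[0]]
    probs = [1.0 - exceedance[0]]
    for k in range(len(thresholds) - 1):
        # Assumption: a bin is represented by its midpoint.
        values.append(0.5 * (thresholds[k] + thresholds[k + 1]))
        probs.append(exceedance[k] - exceedance[k + 1])
    return values, probs

values, probs = histogram_from_exceedance(thresholds, exceedance)
mean = sum(v * p for v, p in zip(values, probs))
variance = sum(p * (v - mean) ** 2 for v, p in zip(values, probs))
print(f"mean = {mean:.3f} mm/h, variance = {variance:.3f}")
```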

  10. Predicting Social Trust with Binary Logistic Regression

    ERIC Educational Resources Information Center

    Adwere-Boamah, Joseph; Hufstedler, Shirley

    2015-01-01

    This study used binary logistic regression to predict social trust with five demographic variables from a national sample of adult individuals who participated in The General Social Survey (GSS) in 2012. The five predictor variables were respondents' highest degree earned, race, sex, general happiness and the importance of personally assisting…

  11. Comparative study of two sparse multinomial logistic regression models in decoding visual stimuli from brain activity of fMRI

    NASA Astrophysics Data System (ADS)

    Song, Sutao; Chen, Gongxiang; Zhan, Yu; Zhang, Jiacai; Yao, Li

    2014-03-01

    Recently, sparse algorithms, such as Sparse Multinomial Logistic Regression (SMLR), have been successfully applied in decoding visual information from functional magnetic resonance imaging (fMRI) data, where the contrast of visual stimuli was predicted by a classifier. The contrast classifier combined brain activities of voxels with sparse weights. For sparse algorithms, the goal is to learn a classifier whose weights are distributed as sparsely as possible by introducing some prior belief about the weights. There are two ways to introduce sparse prior constraints on the weights: Automatic Relevance Determination (ARD-SMLR) and the Laplace prior (LAP-SMLR). In this paper, we present comparison results between the ARD-SMLR and LAP-SMLR models in computational time, classification accuracy and voxel selection. Results showed that, for fMRI data, no significant difference was found in classification accuracy between these two methods when voxels in V1 were chosen as input features (1017 voxels in total). As for computation time, LAP-SMLR was superior to ARD-SMLR; fewer voxels survived for ARD-SMLR than for LAP-SMLR. Using simulation data, we confirmed that the classification performance of the two SMLR models was sensitive to the sparsity of the initial features: when the ratio of relevant features to the initial features was larger than 0.01, ARD-SMLR outperformed LAP-SMLR; otherwise, LAP-SMLR outperformed ARD-SMLR. Simulation data showed that ARD-SMLR was more efficient in selecting relevant features.

  12. Logistic models--an odd(s) kind of regression.

    PubMed

    Jupiter, Daniel C

    2013-01-01

    The logistic regression model bears some similarity to the multivariable linear regression with which we are familiar. However, the differences are great enough to warrant a discussion of the need for and interpretation of logistic regression.

  13. Model selection for logistic regression models

    NASA Astrophysics Data System (ADS)

    Duller, Christine

    2012-09-01

    Model selection for logistic regression models decides which of some given potential regressors have an effect and hence should be included in the final model. The second interesting question is whether a certain factor is heterogeneous among some subsets, i.e. whether the model should include a random intercept or not. In this paper these questions will be answered with classical as well as with Bayesian methods. The applications show some results of recent research projects in medicine and business administration.

  14. Transfer Learning Based on Logistic Regression

    NASA Astrophysics Data System (ADS)

    Paul, A.; Rottensteiner, F.; Heipke, C.

    2015-08-01

    In this paper we address the problem of classification of remote sensing images in the framework of transfer learning with a focus on domain adaptation. The main novel contribution is a method for transductive transfer learning in remote sensing on the basis of logistic regression. Logistic regression is a discriminative probabilistic classifier of low computational complexity, which can deal with multiclass problems. This research area deals with methods that solve problems in which labelled training data sets are assumed to be available only for a source domain, while classification is needed in the target domain with different, yet related characteristics. Classification takes place with a model of weight coefficients for hyperplanes which separate features in the transformed feature space. In terms of logistic regression, our domain adaptation method adjusts the model parameters by iterative labelling of the target test data set. These labelled data features are iteratively added to the current training set which, at the beginning, only contains source features and, simultaneously, a number of source features are deleted from the current training set. Experimental results based on a test series with synthetic and real data constitute a first proof-of-concept of the proposed method.

  15. Logistic Regression Applied to Seismic Discrimination

    SciTech Connect

    BG Amindan; DN Hagedorn

    1998-10-08

    The usefulness of logistic discrimination was examined in an effort to learn how it performs in a regional seismic setting. Logistic discrimination provides an easily understood method, works with user-defined models and few assumptions about the population distributions, and handles both continuous and discrete data. Seismic event measurements from a data set compiled by Los Alamos National Laboratory (LANL) of Chinese events recorded at station WMQ were used in this demonstration study. PNNL applied logistic regression techniques to the data. All possible combinations of the Lg and Pg measurements were tried, and a best-fit logistic model was created. The best combination of Lg and Pg frequencies for predicting the source of a seismic event (earthquake or explosion) used Lg_{3.0-6.0} and Pg_{3.0-6.0} as the predictor variables. A cross-validation test was run, which showed that this model was able to correctly predict 99.7% of the earthquakes and 98.0% of the explosions for this given data set. Two other models were identified that used Pg and Lg measurements from the 1.5 to 3.0 Hz frequency range. Although these other models did a good job of correctly predicting the earthquakes, they were not as effective at predicting the explosions. Two possible biases were discovered which affect the predicted probabilities for each outcome. The first bias was due to this being a case-control study: the sampling fractions caused a bias in the probabilities that were calculated using the models. The second bias is caused by a change in the proportions for each event. If at a later date the proportions (a priori probabilities) of explosions versus earthquakes change, this would cause a bias in the predicted probability for an event. When using logistic regression, the user needs to be aware of the possible biases and what effect they will have on the predicted probabilities.
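
    The effect of changed class proportions on predicted probabilities can be made concrete with the standard prior-shift adjustment for logistic models, sketched below. This is not the report's own procedure; the function name and the numbers are hypothetical. The adjustment assumes that only the class proportions differ between the fitted sample and the target setting, so that only the intercept (the log prior odds) needs to be shifted.

```python
import math

def logit(p):
    """Log odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def adjust_probability(p_model, sample_prior, target_prior):
    """Re-express a predicted probability under a different class prior.

    p_model      : probability from a model fitted on the study sample
    sample_prior : proportion of the class of interest in that sample
    target_prior : assumed proportion of that class in the new setting
    """
    z = logit(p_model) - logit(sample_prior) + logit(target_prior)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical numbers: a balanced study sample, but the class of
# interest is assumed to make up only 1% of events in the new setting.
p_adjusted = adjust_probability(0.9, sample_prior=0.5, target_prior=0.01)
print(f"adjusted probability = {p_adjusted:.4f}")
```

    A prediction of 0.9 from a balanced sample drops to roughly 0.08 under a 1% prior, which shows how strongly the assumed proportions drive the reported probability.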

  16. Supporting Regularized Logistic Regression Privately and Efficiently.

    PubMed

    Li, Wenfa; Liu, Hongzhe; Yang, Peng; Xie, Wei

    2016-01-01

    As one of the most popular statistical and machine learning models, logistic regression with regularization has found wide adoption in biomedicine, social sciences, information technology, and so on. These domains often involve data of human subjects that are subject to strict privacy regulations. Concerns over data privacy make it increasingly difficult to coordinate and conduct large-scale collaborative studies, which typically rely on cross-institution data sharing and joint analysis. Our work here focuses on safeguarding regularized logistic regression, a widely used statistical model that has nonetheless not been investigated from a data security and privacy perspective. We consider a common use scenario of multi-institution collaborative studies, such as in the form of research consortia or networks as widely seen in genetics, epidemiology, social sciences, etc. To make our privacy-enhancing solution practical, we demonstrate a non-conventional and computationally efficient method leveraging distributed computing and strong cryptography to provide comprehensive protection over individual-level and summary data. Extensive empirical evaluations on several studies validate the privacy guarantee, efficiency and scalability of our proposal. We also discuss the practical implications of our solution for large-scale studies and applications from various disciplines, including genetic and biomedical studies, smart grid, network analysis, etc.

  17. Supporting Regularized Logistic Regression Privately and Efficiently

    PubMed Central

    Li, Wenfa; Liu, Hongzhe; Yang, Peng; Xie, Wei

    2016-01-01

    As one of the most popular statistical and machine learning models, logistic regression with regularization has found wide adoption in biomedicine, social sciences, information technology, and so on. These domains often involve data of human subjects that are subject to strict privacy regulations. Concerns over data privacy make it increasingly difficult to coordinate and conduct large-scale collaborative studies, which typically rely on cross-institution data sharing and joint analysis. Our work here focuses on safeguarding regularized logistic regression, a widely used statistical model that has nonetheless not been investigated from a data security and privacy perspective. We consider a common use scenario of multi-institution collaborative studies, such as in the form of research consortia or networks as widely seen in genetics, epidemiology, social sciences, etc. To make our privacy-enhancing solution practical, we demonstrate a non-conventional and computationally efficient method leveraging distributed computing and strong cryptography to provide comprehensive protection over individual-level and summary data. Extensive empirical evaluations on several studies validate the privacy guarantee, efficiency and scalability of our proposal. We also discuss the practical implications of our solution for large-scale studies and applications from various disciplines, including genetic and biomedical studies, smart grid, network analysis, etc. PMID:27271738

  18. Using Dominance Analysis to Determine Predictor Importance in Logistic Regression

    ERIC Educational Resources Information Center

    Azen, Razia; Traxel, Nicole

    2009-01-01

    This article proposes an extension of dominance analysis that allows researchers to determine the relative importance of predictors in logistic regression models. Criteria for choosing logistic regression R² analogues were determined and measures were selected that can be used to perform dominance analysis in logistic regression. A…

  19. [Logistic regression against a divergent Bayesian network].

    PubMed

    Sánchez Trujillo, Noel Antonio

    2015-02-03

    This article is a discussion about two statistical tools used for prediction and causality assessment: logistic regression and Bayesian networks. Using data from a simulated example of a study assessing factors that might predict pulmonary emphysema (where fingertip pigmentation and smoking are considered), we posed the following questions. Is pigmentation a confounding, causal or predictive factor? Is there perhaps another factor, like smoking, that confounds? Is there a synergy between pigmentation and smoking? The results, in terms of prediction, are similar with the two techniques; regarding causation, differences arise. We conclude that, in decision-making, the combination of a statistical tool used with common sense and previous evidence, which takes years or even centuries to develop, is better than the automatic and exclusive use of statistical resources.

  20. Multi-label ℓ2-regularized logistic regression for predicting activation/inhibition relationships in human protein-protein interaction networks

    PubMed Central

    Mei, Suyu; Zhang, Kun

    2016-01-01

    Protein-protein interaction (PPI) networks are naturally viewed as infrastructure to infer signalling pathways. The descriptors of signal events between two interacting proteins such as upstream/downstream signal flow, activation/inhibition relationship and protein modification are indispensable for inferring signalling pathways from PPI networks. However, such descriptors are not available in most cases, as PPI networks are seldom semantically annotated. In this work, we extend ℓ2-regularized logistic regression to the scenario of multi-label learning for predicting the activation/inhibition relationships in human PPI networks. The phenomenon that both activation and inhibition relationships exist between two interacting proteins is computationally modelled by a multi-label learning framework. The problem of GO (gene ontology) sparsity is tackled by introducing the homolog knowledge as independent homolog instances. ℓ2-regularized logistic regression is accordingly adopted here to penalize the homolog noise and to reduce the computational complexity of the double-sized training data. Computational results show that the proposed method achieves satisfactory multi-label learning performance and outperforms the existing phenotype correlation method on the experimental data of Drosophila melanogaster. Several predictions have been validated against recent literature. The predicted activation/inhibition relationships in human PPI networks are provided in the supplementary file for further biomedical research. PMID:27819359
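
    For readers unfamiliar with the ℓ2 penalty, the following minimal sketch fits a single-label, binary ℓ2-regularized logistic regression by plain gradient descent. It does not reproduce the authors' multi-label method; the data, learning rate and penalty strengths are hypothetical. The objective assumed here is the mean log-loss plus (lam / 2) * ||w||^2, without an intercept.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_l2_logistic(X, y, lam, lr=0.1, iterations=2000):
    """Minimise mean log-loss + (lam / 2) * ||w||^2 by gradient descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(iterations):
        grad = [lam * wj for wj in w]          # gradient of the penalty
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)))
            for j in range(d):
                grad[j] += (p - yi) * xi[j] / n  # gradient of the log-loss
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

def norm(w):
    return math.sqrt(sum(wj * wj for wj in w))

# Hypothetical, linearly separable toy data: without a penalty the
# weights would grow without bound; the penalty keeps them finite.
X = [(1.0, 2.0), (2.0, 1.5), (-1.0, -1.5), (-2.0, -0.5)]
y = [1, 1, 0, 0]

w_weak = fit_l2_logistic(X, y, lam=0.1)
w_strong = fit_l2_logistic(X, y, lam=1.0)
print(f"||w|| with lam=0.1: {norm(w_weak):.3f}; with lam=1.0: {norm(w_strong):.3f}")
```

    A larger penalty gives a smaller weight norm, which is the shrinkage behaviour that makes the penalty useful against noisy features.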

  1. Personal, Social, and Game-Related Correlates of Active and Non-Active Gaming Among Dutch Gaming Adolescents: Survey-Based Multivariable, Multilevel Logistic Regression Analyses

    PubMed Central

    de Vet, Emely; Chinapaw, Mai JM; de Boer, Michiel; Seidell, Jacob C; Brug, Johannes

    2014-01-01

    Background: Playing video games contributes substantially to sedentary behavior in youth. A new generation of video games, active games, seems to be a promising alternative to sedentary games to promote physical activity and reduce sedentary behavior. At this time, little is known about correlates of active and non-active gaming among adolescents. Objective: The objective of this study was to examine potential personal, social, and game-related correlates of both active and non-active gaming in adolescents. Methods: A survey assessing game behavior and potential personal, social, and game-related correlates was conducted among adolescents (12-16 years, N=353) recruited via schools. Multivariable, multilevel logistic regression analyses, adjusted for demographics (age, sex and educational level of adolescents), were conducted to examine personal, social, and game-related correlates of active gaming ≥1 hour per week (h/wk) and non-active gaming >7 h/wk. Results: Active gaming ≥1 h/wk was significantly associated with a more positive attitude toward active gaming (OR 5.3, CI 2.4-11.8; P<.001), a less positive attitude toward non-active games (OR 0.30, CI 0.1-0.6; P=.002), a higher score on habit strength regarding gaming (OR 1.9, CI 1.2-3.2; P=.008), having brothers/sisters (OR 6.7, CI 2.6-17.1; P<.001) and friends (OR 3.4, CI 1.4-8.4; P=.009) who spend more time on active gaming, and a slightly lower score on game engagement (OR 0.95, CI 0.91-0.997; P=.04). Non-active gaming >7 h/wk was significantly associated with a more positive attitude toward non-active gaming (OR 2.6, CI 1.1-6.3; P=.035), a stronger habit regarding gaming (OR 3.0, CI 1.7-5.3; P<.001), having friends who spend more time on non-active gaming (OR 3.3, CI 1.46-7.53; P=.004), and a more positive image of a non-active gamer (OR 2, CI 1.07–3.75; P=.03). Conclusions: Various factors were significantly associated with active gaming ≥1 h/wk and non-active gaming >7 h/wk. Active gaming is most
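
    The relation between a reported odds ratio, its confidence interval and its P value can be checked with a short sketch. It assumes, which the abstract does not state, that the intervals are 95% Wald intervals computed on the log-odds scale; the function name is hypothetical. The first reported result (OR 5.3, CI 2.4-11.8) is used as input.

```python
import math

def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover the log-scale standard error, Wald z and two-sided P value.

    Assumes a Wald interval of the form exp(log(OR) +/- z_crit * SE).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return se, z, p_value

se, z, p_value = wald_from_ci(5.3, 2.4, 11.8)
print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, P = {p_value:.1e}")

# On the log scale the interval is symmetric, so the geometric mean of
# the limits should be close to the reported odds ratio.
geometric_mean = math.sqrt(2.4 * 11.8)
```

    Under this assumption the recovered P value is below .001, in line with the value reported for that association; rounding of the published limits means such checks are only approximate.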

  2. A Methodology for Generating Placement Rules that Utilizes Logistic Regression

    ERIC Educational Resources Information Center

    Wurtz, Keith

    2008-01-01

    The purpose of this article is to provide the necessary tools for institutional researchers to conduct a logistic regression analysis and interpret the results. Aspects of the logistic regression procedure that are necessary to evaluate models are presented and discussed with an emphasis on cutoff values and choosing the appropriate number of…

  3. Preserving Institutional Privacy in Distributed binary Logistic Regression.

    PubMed

    Wu, Yuan; Jiang, Xiaoqian; Ohno-Machado, Lucila

    2012-01-01

    Privacy is becoming a major concern when sharing biomedical data across institutions. Although methods for protecting privacy of individual patients have been proposed, it is not clear how to protect the institutional privacy, which is often a critical concern of data custodians. Building upon our previous work, Grid Binary LOgistic REgression (GLORE), we developed an Institutional Privacy-preserving Distributed binary Logistic Regression model (IPDLR) that considers both individual and institutional privacy for building a logistic regression model in a distributed manner. We tested our method using both simulated and clinical data, showing how it is possible to protect the privacy of individuals and of institutions using a distributed strategy.

  4. Large unbalanced credit scoring using Lasso-logistic regression ensemble.

    PubMed

    Wang, Hong; Xu, Qingsong; Zhou, Lifeng

    2015-01-01

    Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data.
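
    The two evaluation measures named in this abstract, AUC and F-measure, can be computed as in the sketch below. The scores and labels are hypothetical and unrelated to the paper's data; the AUC is computed by direct pairwise comparison (ties counted as one half), and the F-measure shown is the balanced F1 at an assumed threshold of 0.5.

```python
def auc(scores, labels):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def f1_score(scores, labels, threshold=0.5):
    """Harmonic mean of precision and recall at the given threshold."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, l in zip(predicted, labels) if p == 1 and l == 1)
    fp = sum(1 for p, l in zip(predicted, labels) if p == 1 and l == 0)
    fn = sum(1 for p, l in zip(predicted, labels) if p == 0 and l == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)

# Hypothetical predicted default probabilities and true labels.
scores = [0.9, 0.8, 0.35, 0.6, 0.2]
labels = [1, 1, 1, 0, 0]
print(f"AUC = {auc(scores, labels):.3f}, F1 = {f1_score(scores, labels):.3f}")
```

    AUC is threshold-free while F1 depends on the chosen cutoff, which is one reason both are commonly reported for unbalanced data.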

  5. MODELING SNAKE MICROHABITAT FROM RADIOTELEMETRY STUDIES USING POLYTOMOUS LOGISTIC REGRESSION

    EPA Science Inventory

    Multivariate analysis of snake microhabitat has historically used techniques that were derived under assumptions of normality and common covariance structure (e.g., discriminant function analysis, MANOVA). In this study, polytomous logistic regression (PLR which does not require ...

  6. Large Unbalanced Credit Scoring Using Lasso-Logistic Regression Ensemble

    PubMed Central

    Wang, Hong; Xu, Qingsong; Zhou, Lifeng

    2015-01-01

    Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data. PMID:25706988

  7. [Unconditioned logistic regression and sample size: a bibliographic review].

    PubMed

    Ortega Calvo, Manuel; Cayuela Domínguez, Aurelio

    2002-01-01

    Unconditioned logistic regression is a highly useful risk prediction method in epidemiology. This article reviews the different solutions provided by different authors concerning the interface between the calculation of the sample size and the use of logistic regression. Based on the knowledge of the information initially provided, a review is made of the customized regression and predictive constriction phenomenon, the design of an ordinal exposure with a binary outcome, the events-per-variable concept, the indicator variables, the classic Freeman equation, etc. Some skeptical ideas regarding this subject are also included.

  8. Estimating the exceedance probability of rain rate by logistic regression

    NASA Technical Reports Server (NTRS)

    Chiu, Long S.; Kedem, Benjamin

    1990-01-01

    Recent studies have shown that the fraction of an area with rain intensity above a fixed threshold is highly correlated with the area-averaged rain rate. To estimate the fractional rainy area, a logistic regression model, which estimates the conditional probability that rain rate over an area exceeds a fixed threshold given the values of related covariates, is developed. The problem of dependency in the data in the estimation procedure is bypassed by the method of partial likelihood. Analyses of simulated scanning multichannel microwave radiometer and observed electrically scanning microwave radiometer data during the Global Atlantic Tropical Experiment period show that the use of logistic regression in pixel classification is superior to multiple regression in predicting whether rain rate at each pixel exceeds a given threshold, even in the presence of noisy data. The potential of the logistic regression technique in satellite rain rate estimation is discussed.

  9. Differentially private distributed logistic regression using private and public data

    PubMed Central

    2014-01-01

    Background: Privacy protection is an important issue in medical informatics and differential privacy is a state-of-the-art framework for data privacy research. Differential privacy offers provable privacy against attackers who have auxiliary information, and can be applied to data mining models (for example, logistic regression). However, differentially private methods sometimes introduce too much noise and make outputs less useful. Given available public data in medical research (e.g. from patients who sign open-consent agreements), we can design algorithms that use both public and private data sets to decrease the amount of noise that is introduced. Methodology: In this paper, we modify the update step in the Newton-Raphson method to propose a differentially private distributed logistic regression model based on both public and private data. Experiments and results: We try our algorithm on three different data sets, and show its advantage over: (1) a logistic regression model based solely on public data, and (2) a differentially private distributed logistic regression model based on private data under various scenarios. Conclusion: Logistic regression models built with our new algorithm based on both private and public datasets demonstrate better utility than models trained on private or public datasets alone, without sacrificing the rigorous privacy guarantee. PMID:25079786
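
    For orientation, the ordinary Newton-Raphson update that this paper modifies can be sketched as follows. The code below is the plain, non-private, single-site version for a model with an intercept and one covariate; it contains none of the paper's privacy or distribution mechanisms, and the toy data are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def newton_step(beta, xs, ys):
    """One Newton-Raphson update for the model logit(p) = b0 + b1 * x."""
    b0, b1 = beta
    g0 = g1 = 0.0           # gradient of the log-likelihood
    h00 = h01 = h11 = 0.0   # entries of X^T W X (the negative Hessian)
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        w = p * (1.0 - p)
        g0 += y - p
        g1 += (y - p) * x
        h00 += w
        h01 += w * x
        h11 += w * x * x
    det = h00 * h11 - h01 * h01
    # beta_new = beta + (X^T W X)^{-1} * gradient, using the 2x2 inverse.
    d0 = (h11 * g0 - h01 * g1) / det
    d1 = (-h01 * g0 + h00 * g1) / det
    return (b0 + d0, b1 + d1)

def fit(xs, ys, iterations=25):
    beta = (0.0, 0.0)
    for _ in range(iterations):
        beta = newton_step(beta, xs, ys)
    return beta

# Hypothetical, non-separable toy data with a binary covariate.
xs = [0, 0, 0, 0, 1, 1, 1, 1]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
b0, b1 = fit(xs, ys)
print(f"intercept = {b0:.4f}, slope = {b1:.4f}")
```

    In a distributed setting the sums g0, g1, h00, h01 and h11 are the quantities that sites would aggregate, which is why the update step is the natural place to intervene.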

  10. Identification of high leverage points in binary logistic regression

    NASA Astrophysics Data System (ADS)

    Fitrianto, Anwar; Wendy, Tham

    2016-10-01

    Leverage points are observations that are unusual in the x space, that is, the space of the explanatory variables. Detecting high leverage points plays a vital role in regression diagnostics because such points can mask outliers. In regression, observations that lie at the extremes of the space of explanatory variables, far from the average of the data, are called leverage points. In this project, a method for identifying high leverage points (HLPs) in logistic regression is demonstrated using a numerical example. We investigate the effects of high leverage points on the logistic regression model and compare the results of the model with and without the leverage points. Some graphical analyses based on the results are presented. We found that the presence of HLPs affects the hii values, estimated probabilities, estimated coefficients, p-values of the variables, odds ratios and the regression equation.
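
    As a rough illustration of the hii values mentioned above, the sketch below computes leverages for a logistic model with an intercept and one predictor, using the standard hat-matrix diagonal h_ii = w_i x_i'(X'WX)^(-1) x_i with w_i = p_i(1 - p_i). The data, the coefficients beta0 and beta1, and the 2p/n cutoff are illustrative assumptions, not values or rules taken from the paper.

```python
import math

def logistic_leverages(x, beta0, beta1):
    """Return h_ii for a logistic model with an intercept and one predictor.

    beta0 and beta1 are assumed to be already-fitted coefficients.
    """
    # Fitted probabilities and the binomial weights w_i = p_i * (1 - p_i)
    p = [1.0 / (1.0 + math.exp(-(beta0 + beta1 * xi))) for xi in x]
    w = [pi * (1.0 - pi) for pi in p]
    # Entries of the 2x2 matrix X'WX = [[s0, s1], [s1, s2]]
    s0 = sum(w)
    s1 = sum(wi * xi for wi, xi in zip(w, x))
    s2 = sum(wi * xi * xi for wi, xi in zip(w, x))
    det = s0 * s2 - s1 * s1
    # h_ii = w_i * (1, x_i) (X'WX)^(-1) (1, x_i)', expanded by hand
    return [wi * (s2 - 2.0 * xi * s1 + xi * xi * s0) / det for wi, xi in zip(w, x)]

# Hypothetical predictor values; the last one lies far from the rest
x = [0.5, 1.0, 1.5, 2.0, 2.5, 8.0]
beta0, beta1 = -1.0, 0.4          # hypothetical fitted coefficients
h = logistic_leverages(x, beta0, beta1)

n_params = 2
cutoff = 2.0 * n_params / len(x)  # one common rule of thumb (an assumption here)
flagged = [i for i, hi in enumerate(h) if hi > cutoff]
```

    With these made-up numbers the isolated point at x = 8 is the only one flagged, and the leverages sum to 2, the number of fitted parameters.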

  11. Improving Predictions in Imbalanced Data Using Pairwise Expanded Logistic Regression

    PubMed Central

    Jiang, Xiaoqian; El-Kareh, Robert; Ohno-Machado, Lucila

    2011-01-01

    Building classifiers for medical problems often involves dealing with rare, but important events. Imbalanced datasets pose challenges to ordinary classification algorithms such as Logistic Regression (LR) and Support Vector Machines (SVM). The lack of effective strategies for dealing with imbalanced training data often results in models that exhibit poor discrimination. We propose a novel approach to estimate class memberships based on the evaluation of pairwise relationships in the training data. The method we propose, Pairwise Expanded Logistic Regression, improved discrimination and had higher accuracy when compared to existing methods in two imbalanced datasets, thus showing promise as a potential remedy for this problem. PMID:22195118

  12. Antimicrobial activity of aroma compounds against Saccharomyces cerevisiae and improvement of microbiological stability of soft drinks as assessed by logistic regression.

    PubMed

    Belletti, Nicoletta; Kamdem, Sylvain Sado; Patrignani, Francesca; Lanciotti, Rosalba; Covelli, Alessandro; Gardini, Fausto

    2007-09-01

    The combined effects of a mild heat treatment (55 °C) and the presence of three aroma compounds [citron essential oil, citral, and (E)-2-hexenal] on the spoilage of noncarbonated beverages inoculated with different amounts of a Saccharomyces cerevisiae strain were evaluated. The results, expressed as growth/no growth, were elaborated using a logistic regression in order to assess the probability of beverage spoilage as a function of thermal treatment length, concentration of flavoring agents, and yeast inoculum. The logit models obtained for the three substances were extremely precise. The thermal treatment alone, even if prolonged for 20 min, was not able to prevent yeast growth. However, the presence of increasing concentrations of aroma compounds improved the stability of the products. The inhibiting effect of the compounds was enhanced by a prolonged thermal treatment. In fact, it influenced the vapor pressure of the molecules, which can easily interact within microbial membranes when they are in gaseous form. (E)-2-Hexenal showed a threshold level, related to initial inoculum and thermal treatment length, over which yeast growth was rapidly inhibited. Concentrations over 100 ppm of citral and thermal treatment longer than 16 min allowed a 90% probability of stability for bottles inoculated with 10^5 CFU/bottle. Citron gave the most interesting responses: beverages with 500 ppm of essential oil needed only 3 min of treatment to prevent yeast growth. In this framework, the logistic regression proved to be an important tool to study alternative hurdle strategies for the stabilization of noncarbonated beverages.

  13. Model building strategy for logistic regression: purposeful selection.

    PubMed

    Zhang, Zhongheng

    2016-03-01

    Logistic regression is one of the most commonly used models to account for confounders in the medical literature. The article introduces how to perform the purposeful selection model building strategy with R. I stress the use of the likelihood ratio test to see whether deleting a variable will have a significant impact on model fit. A deleted variable should also be checked for whether it is an important adjustment of the remaining covariates. Interactions should be checked to disentangle complex relationships between covariates and their synergistic effect on the response variable. The model should be checked for goodness-of-fit (GOF), in other words, how well the fitted model reflects the real data. The Hosmer-Lemeshow GOF test is the most widely used for logistic regression models.
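
    The likelihood ratio test described above can be sketched as follows for the simplest case of deleting a single variable (one degree of freedom). The article itself works in R; this Python version, the function name and the log-likelihood values are illustrative assumptions only.

```python
import math

def likelihood_ratio_test_1df(loglik_full, loglik_reduced):
    """Likelihood ratio test for dropping ONE parameter from a nested model.

    Returns the test statistic and its chi-square (df = 1) p-value.
    """
    stat = 2.0 * (loglik_full - loglik_reduced)
    if stat < 0:
        raise ValueError("the full model cannot fit worse than the reduced model")
    # For df = 1, P(chi2 > s) = P(|Z| > sqrt(s)) = erfc(sqrt(s / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical maximized log-likelihoods with and without the candidate variable
stat, p_value = likelihood_ratio_test_1df(loglik_full=-120.0, loglik_reduced=-123.0)
```

    A small p-value suggests that deleting the variable noticeably worsens the fit, so the variable would be kept under this strategy.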

  14. Sugarcane Land Classification with Satellite Imagery using Logistic Regression Model

    NASA Astrophysics Data System (ADS)

    Henry, F.; Herwindiati, D. E.; Mulyono, S.; Hendryli, J.

    2017-03-01

    This paper discusses the classification of sugarcane plantation area from Landsat-8 satellite imagery. The classification process uses the binary logistic regression method with time series data of the normalized difference vegetation index as input. The process is divided into two steps: training and classification. The purpose of the training step is to identify the best parameters of the regression model using the gradient descent algorithm. The best-fitting model can then be used to classify sugarcane and non-sugarcane areas. The experiment shows high accuracy and successfully maps the sugarcane plantation area, with a best result of Cohen’s Kappa 0.7833 (strong) and 89.167% accuracy.
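
    The two steps named above, training by gradient descent and then classification, can be sketched in a few lines. This is a minimal illustration under assumed settings: the toy vegetation-index values, the learning rate and the number of epochs are hypothetical and are not taken from the paper.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic_gd(X, y, lr=0.5, epochs=2000):
    """Training step: batch gradient descent on the mean log-loss.

    Returns the weights with the bias term first.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi                  # derivative of the log-loss w.r.t. the logit
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def classify(w, x):
    """Classification step: label 1 when the predicted probability is at least 0.5."""
    return 1 if sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))) >= 0.5 else 0

# Hypothetical pixels, each with two vegetation-index readings over time
X = [[0.2, 0.3], [0.3, 0.25], [0.25, 0.35], [0.7, 0.8], [0.75, 0.7], [0.8, 0.85]]
y = [0, 0, 0, 1, 1, 1]                    # 1 = sugarcane, 0 = non-sugarcane
w = fit_logistic_gd(X, y)
labels = [classify(w, x) for x in X]
```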

  15. Saddlepoint approximations for small sample logistic regression problems.

    PubMed

    Platt, R W

    2000-02-15

    Double saddlepoint approximations provide quick and accurate approximations to exact conditional tail probabilities in a variety of situations. This paper describes the use of these approximations in two logistic regression problems. An investigation of regression analysis of the log-odds ratio in a sequence or set of 2x2 tables via simulation studies shows that in practical settings the saddlepoint methods closely approximate exact conditional inference. The double saddlepoint approximation in the test for trend in a sequence of binomial random variates is also shown, via simulation studies, to be an effective approximation to exact conditional inference.

  16. Computing group cardinality constraint solutions for logistic regression problems.

    PubMed

    Zhang, Yong; Kwon, Dongjin; Pohl, Kilian M

    2017-01-01

    We derive an algorithm to directly solve logistic regression based on cardinality constraint, group sparsity and use it to classify intra-subject MRI sequences (e.g. cine MRIs) of healthy from diseased subjects. Group cardinality constraint models are often applied to medical images in order to avoid overfitting of the classifier to the training data. Solutions within these models are generally determined by relaxing the cardinality constraint to a weighted feature selection scheme. However, these solutions relate to the original sparse problem only under specific assumptions, which generally do not hold for medical image applications. In addition, inferring clinical meaning from features weighted by a classifier is an ongoing topic of discussion. To avoid weighting features, we propose to directly solve the group cardinality constraint logistic regression problem by generalizing the Penalty Decomposition method. To do so, we assume that an intra-subject series of images represents repeated samples of the same disease patterns. We model this assumption by combining series of measurements created by a feature across time into a single group. Our algorithm then derives a solution within that model by decoupling the minimization of the logistic regression function from enforcing the group sparsity constraint. The minimum to the smooth and convex logistic regression problem is determined via gradient descent while we derive a closed form solution for finding a sparse approximation of that minimum. We apply our method to cine MRI of 38 healthy controls and 44 adult patients who received reconstructive surgery of Tetralogy of Fallot (TOF) during infancy. Our method correctly identifies regions impacted by TOF and generally obtains statistically significantly higher classification accuracy than alternative solutions to this model, i.e., ones relaxing group cardinality constraints.

  17. Logistic Regression Model on Antenna Control Unit Autotracking Mode

    DTIC Science & Technology

    2015-10-20

    412TW-PA-15240. Logistic Regression Model on Antenna Control Unit Autotracking Mode. Daniel T. Laird, Air Force Test Center, Edwards AFB, CA... October 20, 2015. Approved for public release; distribution is unlimited. Abstract: Over the past several years

  18. Classifying machinery condition using oil samples and binary logistic regression

    NASA Astrophysics Data System (ADS)

    Phillips, J.; Cripps, E.; Lau, John W.; Hodkiewicz, M. R.

    2015-08-01

    The era of big data has resulted in an explosion of condition monitoring information. The result is an increasing motivation to automate the costly and time-consuming human elements involved in the classification of machine health. When working with industry it is important to build an understanding and hence some trust in the classification scheme for those who use the analysis to initiate maintenance tasks. Typically, "black box" approaches such as artificial neural networks (ANN) and support vector machines (SVM) can be difficult to interpret. In contrast, this paper argues that logistic regression offers easy interpretability to industry experts, providing insight to the drivers of the human classification process and to the ramifications of potential misclassification. Of course, accuracy is of foremost importance in any automated classification scheme, so we also provide a comparative study based on predictive performance of logistic regression, ANN and SVM. A real-world oil analysis data set from engines on mining trucks is presented and using cross-validation we demonstrate that logistic regression out-performs the ANN and SVM approaches in terms of prediction for healthy/not healthy engines.

  19. Using logistic regression to estimate delay-discounting functions.

    PubMed

    Wileyto, E Paul; Audrain-McGovern, Janet; Epstein, Leonard H; Lerman, Caryn

    2004-02-01

    The monetary choice questionnaire (MCQ) and similar computer tasks ask preference questions in order to ascertain indifference, the perceived equivalence of immediate versus larger delayed rewards. Indifference data are then fitted with a hyperbolic function, summarizing the decline in perceived value with delay time. We present a fitting method that estimates the hyperbolic parameter k directly from survey responses. Binary preferences are modeled as a function of time (X2) and a transformed reward ratio (X1), yielding logistic regression coefficients β2 and β1. The hyperbolic parameter emerges as k = β2/β1, where the logistic predicted p = .5 (the definition of indifference). The MCQ was administered to 1,073 adolescents and was scored using both standard and logistic methods. The means for ln(k) were similar (standard, -4.53; logistic, -4.51), and the results were highly correlated (rho = .973). Simulated MCQ data showed that k was unbiased, except where β1 ≥ -1, indicating a vague survey response. Jackknife standard errors provided excellent coverage.
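
    The relation between the two coefficients and k can be checked numerically. The sketch below assumes, since the abstract does not spell it out, that the transformed reward ratio is X1 = 1 - delayed/immediate, that X2 is the delay, that the model has no intercept, and that p is the probability of choosing the delayed reward; the coefficient values and reward amounts are hypothetical.

```python
import math

def discount_rate(beta1, beta2):
    """Hyperbolic discount parameter, k = beta2 / beta1."""
    if beta1 == 0:
        raise ValueError("beta1 must be non-zero")
    return beta2 / beta1

def choice_probability(beta1, beta2, immediate, delayed, delay):
    """Logistic probability for one survey item under the assumed coding.

    X1 = 1 - delayed / immediate (assumed transform), X2 = delay, no intercept.
    """
    x1 = 1.0 - delayed / immediate
    logit = beta1 * x1 + beta2 * delay
    return 1.0 / (1.0 + math.exp(-logit))

beta1, beta2 = -4.0, -0.04               # hypothetical fitted coefficients
k = discount_rate(beta1, beta2)

# Hyperbolic value of 100 units received after 30 time units: V = A / (1 + k * D)
delayed, delay = 100.0, 30.0
immediate = delayed / (1.0 + k * delay)

# At the hyperbolic indifference point the logit is zero, so p is one half
p = choice_probability(beta1, beta2, immediate, delayed, delay)
```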

  20. Detecting nonsense for Chinese comments based on logistic regression

    NASA Astrophysics Data System (ADS)

    Zhuolin, Ren; Guang, Chen; Shu, Chen

    2016-07-01

    To understand cyber citizens' opinions accurately from Chinese news comments, a clear definition of nonsense is presented and a detection model based on logistic regression (LR) is proposed. The detection of nonsense can be treated as a binary classification problem. Besides traditional lexical features, we propose three kinds of features in terms of emotion, structure and relevance. Using these features, we train an LR model and demonstrate its effect in understanding Chinese news comments. We find that each of the proposed features significantly improves the result. In our experiments, we achieve a prediction accuracy of 84.3%, which improves on the baseline of 77.3% by 7 percentage points.

  1. Landslide Hazard Mapping in Rwanda Using Logistic Regression

    NASA Astrophysics Data System (ADS)

    Piller, A.; Anderson, E.; Ballard, H.

    2015-12-01

    Landslides in the United States cause more than $1 billion in damages and 50 deaths per year (USGS 2014). Globally, figures are much more grave, yet monitoring, mapping and forecasting of these hazards are less than adequate. Seventy-five percent of the population of Rwanda earns a living from farming, mostly subsistence. Loss of farmland, housing, or life, to landslides is a very real hazard. Landslides in Rwanda have an impact at the economic, social, and environmental level. In a developing nation that faces challenges in tracking, cataloging, and predicting the numerous landslides that occur each year, satellite imagery and spatial analysis allow for remote study. We have focused on the development of a landslide inventory and a statistical methodology for assessing landslide hazards. Using logistic regression on approximately 30 test variables (e.g. slope, soil type, land cover) and a sample of over 200 landslides, we determine which variables are statistically most relevant to landslide occurrence in Rwanda. A preliminary predictive hazard map for Rwanda has been produced, using the variables selected from the logistic regression analysis.

  2. Parental Vaccine Acceptance: A Logistic Regression Model Using Previsit Decisions.

    PubMed

    Lee, Sara; Riley-Behringer, Maureen; Rose, Jeanmarie C; Meropol, Sharon B; Lazebnik, Rina

    2016-10-26

    This study explores how parents' intentions regarding vaccination prior to their children's visit were associated with actual vaccine acceptance. A convenience sample of parents accompanying 6-week-old to 17-year-old children completed a written survey at 2 pediatric practices. Using hierarchical logistic regression, for hospital-based participants (n = 216), vaccine refusal history (P < .01) and vaccine decision made before the visit (P < .05) explained 87% of vaccine refusals. In community-based participants (n = 100), vaccine refusal history (P < .01) explained 81% of refusals. Over 1 in 5 parents changed their minds about vaccination during the visit. Thirty parents who were previous vaccine refusers accepted current vaccines, and 37 who had intended not to vaccinate chose vaccination. Twenty-nine parents without a refusal history declined vaccines, and 32 who did not intend to refuse before the visit declined vaccination. Future research should identify key factors to nudge parent decision making in favor of vaccination.

  3. Metadata for ReVA logistic regression dataset

    USGS Publications Warehouse

    LaMotte, Andrew E.

    2004-01-01

    The U.S. Geological Survey in cooperation with the U.S. Environmental Protection Agency's Regional Vulnerability Assessment Program, has developed a set of statistical tools to support regional-scale, ground-water quality and vulnerability assessments. The Regional Vulnerability Assessment Program goals are to develop and demonstrate approaches to comprehensive, regional-scale assessments that effectively inform water-resources managers and decision-makers as to the magnitude, extent, distribution, and uncertainty of current and anticipated environmental risks. The U.S. Geological Survey is developing and exploring the use of statistical probability models to characterize the relation between ground-water quality and geographic factors in the Mid-Atlantic Region. Available water-quality data obtained from U.S. Geological Survey National Water-Quality Assessment Program studies conducted in the Mid-Atlantic Region were used in association with geographic data (land cover, geology, soils, and others) to develop logistic-regression equations that use explanatory variables to predict the presence of a selected water-quality parameter exceeding specified management concentration thresholds. The resulting logistic-regression equations were transformed to determine the probability, P(X), of a water-quality parameter exceeding a specified management threshold. Additional statistical procedures modified by the U.S. Geological Survey were used to compare the observed values to model-predicted values at each sample point. In addition, procedures to evaluate the confidence of the model predictions and estimate the uncertainty of the probability value were developed and applied. The resulting logistic-regression models were applied to the Mid-Atlantic Region to predict the spatial probability of nitrate concentrations exceeding specified management thresholds. These thresholds are usually set or established by regulators or managers at national or local levels. At management

  4. On the Usefulness of a Multilevel Logistic Regression Approach to Person-Fit Analysis

    ERIC Educational Resources Information Center

    Conijn, Judith M.; Emons, Wilco H. M.; van Assen, Marcel A. L. M.; Sijtsma, Klaas

    2011-01-01

    The logistic person response function (PRF) models the probability of a correct response as a function of the item locations. Reise (2000) proposed to use the slope parameter of the logistic PRF as a person-fit measure. He reformulated the logistic PRF model as a multilevel logistic regression model and estimated the PRF parameters from this…

  5. The crux of the method: assumptions in ordinary least squares and logistic regression.

    PubMed

    Long, Rebecca G

    2008-10-01

    Logistic regression has increasingly become the tool of choice when analyzing data with a binary dependent variable. While resources relating to the technique are widely available, clear discussions of why logistic regression should be used in place of ordinary least squares regression are difficult to find. The current paper compares and contrasts the assumptions of ordinary least squares with those of logistic regression and explains why logistic regression's looser assumptions make it adept at handling violations of the more important assumptions in ordinary least squares.

  6. Risk factors for temporomandibular disorder: Binary logistic regression analysis

    PubMed Central

    Magalhães, Bruno G.; de-Sousa, Stéphanie T.; de Mello, Victor V C.; da-Silva-Barbosa, André C.; de-Assis-Morais, Mariana P L.; Barbosa-Vasconcelos, Márcia M V.

    2014-01-01

    Objectives: To analyze the influence of socioeconomic and demographic factors (gender, economic class, age and marital status) on the occurrence of temporomandibular disorder. Study Design: One hundred individuals from urban areas in the city of Recife (Brazil) registered at Family Health Units were examined using Axis I of the Research Diagnostic Criteria for Temporomandibular Disorders (RDC/TMD), which addresses myofascial pain and joint problems (disc displacement, arthralgia, osteoarthritis and osteoarthrosis). The Brazilian Economic Classification Criteria (CCEB) were used for the collection of socioeconomic and demographic data. The data were then categorized as Class A (high social class), Classes B/C (middle class) and Classes D/E (very poor social class). The results were analyzed using Pearson’s chi-square test for proportions, Fisher’s exact test, the nonparametric Mann-Whitney test and binary logistic regression analysis. Results: None of the participants belonged to Class A, 72% belonged to Classes B/C and 28% belonged to Classes D/E. The multivariate analysis revealed that participants from Classes D/E had a 4.35-fold greater chance of exhibiting myofascial pain and an 11.3-fold greater chance of exhibiting joint problems. Conclusions: Poverty is an important condition for exhibiting myofascial pain and joint problems. Key words: Temporomandibular joint disorders, risk factors, prevalence. PMID:24316706

  7. Application of Bayesian logistic regression to mining biomedical data.

    PubMed

    Avali, Viji R; Cooper, Gregory F; Gopalakrishnan, Vanathi

    2014-01-01

    Mining high dimensional biomedical data with existing classifiers is challenging and the predictions are often inaccurate. We investigated the use of Bayesian Logistic Regression (B-LR) for mining such data to predict and classify various disease conditions. The analysis was done on twelve biomedical datasets with binary class variables and the performance of B-LR was compared to those from other popular classifiers on these datasets with 10-fold cross validation using the WEKA data mining toolkit. The statistical significance of the results was analyzed by paired two tailed t-tests and non-parametric Wilcoxon signed-rank tests. We observed overall that B-LR with non-informative Gaussian priors performed on par with other classifiers in terms of accuracy, balanced accuracy and AUC. These results suggest that it is worthwhile to explore the application of B-LR to predictive modeling tasks in bioinformatics using informative biological prior probabilities. With informative prior probabilities, we conjecture that the performance of B-LR will improve.

  8. Autoregressive logistic regression applied to atmospheric circulation patterns

    NASA Astrophysics Data System (ADS)

    Guanche, Y.; Mínguez, R.; Méndez, F. J.

    2014-01-01

    Autoregressive logistic regression models have been successfully applied in medical and pharmacology research fields, and in simple models to analyze weather types. The main purpose of this paper is to introduce a general framework to study atmospheric circulation patterns capable of dealing simultaneously with: seasonality, interannual variability, long-term trends, and autocorrelation of different orders. To show its effectiveness on modeling performance, daily atmospheric circulation patterns identified from observed sea level pressure fields over the Northeastern Atlantic have been analyzed using this framework. Model predictions are compared with probabilities from the historical database, showing very good fitting diagnostics. In addition, the fitted model is used to simulate the evolution over time of atmospheric circulation patterns using the Monte Carlo method. Simulation results are statistically consistent with respect to the historical sequence in terms of (1) probability of occurrence of the different weather types, (2) transition probabilities and (3) persistence. The proposed model constitutes an easy-to-use and powerful tool for a better understanding of the climate system.

  9. Predicting β-Turns in Protein Using Kernel Logistic Regression

    PubMed Central

    Elbashir, Murtada Khalafallah; Sheng, Yu; Wang, Jianxin; Wu, FangXiang; Li, Min

    2013-01-01

    A β-turn is a secondary protein structure type that plays a significant role in protein configuration and function. On average 25% of amino acids in protein structures are located in β-turns. It is very important to develop an accurate and efficient method for β-turns prediction. Most of the current successful β-turns prediction methods use support vector machines (SVMs) or neural networks (NNs). The kernel logistic regression (KLR) is a powerful classification technique that has been applied successfully in many classification problems. However, it is often not found in β-turns classification, mainly because it is computationally expensive. In this paper, we used KLR to obtain sparse β-turns prediction in short evolution time. Secondary structure information and position-specific scoring matrices (PSSMs) are utilized as input features. We achieved Qtotal of 80.7% and MCC of 50% on the BT426 dataset. These results show that the KLR method with the right algorithm can yield performance equivalent to or even better than NNs and SVMs in β-turns prediction. In addition, KLR yields a probabilistic outcome and has a well-defined extension to the multiclass case. PMID:23509793

  10. Analyzing Student Learning Outcomes: Usefulness of Logistic and Cox Regression Models. IR Applications, Volume 5

    ERIC Educational Resources Information Center

    Chen, Chau-Kuang

    2005-01-01

    Logistic and Cox regression methods are practical tools used to model the relationships between certain student learning outcomes and their relevant explanatory variables. The logistic regression model fits an S-shaped curve into a binary outcome with data points of zero and one. The Cox regression model allows investigators to study the duration…

  11. What Are the Odds of that? A Primer on Understanding Logistic Regression

    ERIC Educational Resources Information Center

    Huang, Francis L.; Moon, Tonya R.

    2013-01-01

    The purpose of this Methodological Brief is to present a brief primer on logistic regression, a commonly used technique when modeling dichotomous outcomes. Using data from the National Education Longitudinal Study of 1988 (NELS:88), logistic regression techniques were used to investigate student-level variables in eighth grade (i.e., enrolled in a…

  12. Power and Sample Size Calculations for Logistic Regression Tests for Differential Item Functioning

    ERIC Educational Resources Information Center

    Li, Zhushan

    2014-01-01

    Logistic regression is a popular method for detecting uniform and nonuniform differential item functioning (DIF) effects. Theoretical formulas for the power and sample size calculations are derived for likelihood ratio tests and Wald tests based on the asymptotic distribution of the maximum likelihood estimators for the logistic regression model.…

  13. Prevalence and Predictive Factors of Sexual Dysfunction in Iranian Women: Univariate and Multivariate Logistic Regression Analyses

    PubMed Central

    Direkvand-Moghadam, Ashraf; Suhrabi, Zainab; Akbari, Malihe

    2016-01-01

    Background: Female sexual dysfunction, which can occur during any stage of a normal sexual activity, is a serious condition for individuals and couples. The present study aimed to determine the prevalence and predictive factors of female sexual dysfunction in women referred to health centers in Ilam, western Iran, in 2014. Methods: In the present cross-sectional study, 444 women who attended health centers in Ilam were enrolled from May to September 2014. Participants were selected according to the simple random sampling method. Univariate and multivariate logistic regression analyses were used to predict the risk factors of female sexual dysfunction. Differences with an alpha error of 0.05 were regarded as statistically significant. Results: Overall, 75.9% of the study population exhibited sexual dysfunction. Univariate logistic regression analysis demonstrated that there was a significant association between female sexual dysfunction and age, menarche age, gravidity, parity, and education (P<0.05). Multivariate logistic regression analysis indicated that menarche age (odds ratio, 1.26), education level (odds ratio, 1.71), and gravida (odds ratio, 1.59) were independent predictive variables for female sexual dysfunction. Conclusion: The majority of Iranian women suffer from sexual dysfunction. A lack of awareness of Iranian women's sexual pleasure and formal training on sexual function and its influencing factors, such as menarche age, gravida, and level of education, may lead to a high prevalence of female sexual dysfunction. PMID:27688863

  14. Comparing Methodologies for Developing an Early Warning System: Classification and Regression Tree Model versus Logistic Regression. REL 2015-077

    ERIC Educational Resources Information Center

    Koon, Sharon; Petscher, Yaacov

    2015-01-01

    The purpose of this report was to explicate the use of logistic regression and classification and regression tree (CART) analysis in the development of early warning systems. It was motivated by state education leaders' interest in maintaining high classification accuracy while simultaneously improving practitioner understanding of the rules by…

  15. A Bayesian goodness of fit test and semiparametric generalization of logistic regression with measurement data.

    PubMed

    Schörgendorfer, Angela; Branscum, Adam J; Hanson, Timothy E

    2013-06-01

    Logistic regression is a popular tool for risk analysis in medical and population health science. With continuous response data, it is common to create a dichotomous outcome for logistic regression analysis by specifying a threshold for positivity. Fitting a linear regression to the nondichotomized response variable assuming a logistic sampling model for the data has been empirically shown to yield more efficient estimates of odds ratios than ordinary logistic regression of the dichotomized endpoint. We illustrate that risk inference is not robust to departures from the parametric logistic distribution. Moreover, the model assumption of proportional odds is generally not satisfied when the condition of a logistic distribution for the data is violated, leading to biased inference from a parametric logistic analysis. We develop novel Bayesian semiparametric methodology for testing goodness of fit of parametric logistic regression with continuous measurement data. The testing procedures hold for any cutoff threshold and our approach simultaneously provides the ability to perform semiparametric risk estimation. Bayes factors are calculated using the Savage-Dickey ratio for testing the null hypothesis of logistic regression versus a semiparametric generalization. We propose a fully Bayesian and a computationally efficient empirical Bayesian approach to testing, and we present methods for semiparametric estimation of risks, relative risks, and odds ratios when parametric logistic regression fails. Theoretical results establish the consistency of the empirical Bayes test. Results from simulated data show that the proposed approach provides accurate inference irrespective of whether parametric assumptions hold or not. Evaluation of risk factors for obesity shows that different inferences are derived from an analysis of a real data set when deviations from a logistic distribution are permissible in a flexible semiparametric framework.

  16. Factors Influencing Drug Injection History among Prisoners: A Comparison between Classification and Regression Trees and Logistic Regression Analysis

    PubMed Central

    Rastegari, Azam; Haghdoost, Ali Akbar; Baneshi, Mohammad Reza

    2013-01-01

    Background: Due to the importance of medical studies, researchers in this field should be familiar with various types of statistical analyses in order to select the most appropriate method based on the characteristics of their data sets. Classification and regression trees (CARTs) can be complementary to regression models. We compared the performance of a logistic regression model and a CART in predicting drug injection among prisoners. Methods: Data from 2720 Iranian prisoners were studied to determine the factors influencing drug injection. The collected data were divided into training and testing groups. A logistic regression model and a CART were applied to the training data. The performance of the two models was then evaluated on the testing data. Findings: The regression model and the CART had 8 and 4 significant variables, respectively. Overall, heroin use, history of imprisonment, age at first drug use, and marital status were important factors in determining the history of drug injection. Subjects without a history of heroin use or heroin users with short-term imprisonment were at lower risk of drug injection. Among heroin addicts with long-term imprisonment, individuals with higher age at first drug use and married subjects were at lower risk of drug injection. Although the logistic regression model was more sensitive than the CART, the two models had the same levels of specificity and classification accuracy. Conclusion: In this study, both sensitivity and specificity were important. While the logistic regression model had better performance, the graphical presentation of the CART simplifies the interpretation of the results. In general, a combination of different analytical methods is recommended to explore the effects of variables. PMID:24494152

  17. Comparison of artificial neural networks with logistic regression for detection of obesity.

    PubMed

    Heydari, Seyed Taghi; Ayatollahi, Seyed Mohammad Taghi; Zare, Najaf

    2012-08-01

    Obesity is a common nutritional problem in both developed and developing countries. The aim of this study was to classify obesity by artificial neural networks and logistic regression. This cross-sectional study comprised 414 healthy military personnel in southern Iran. All subjects completed questionnaires on their socio-economic status, and their anthropometric measures were taken by a trained nurse. Classification of obesity was done by artificial neural networks and logistic regression. The mean age ± SD of participants was 34.4 ± 7.5 years. A total of 187 (45.2%) were obese. For logistic regression and neural networks, respectively, the proportion correctly classified was 80.2% and 81.2%, sensitivity was 80.2% and 79.7%, specificity was 81.9% and 83.7%, the area under the receiver operating characteristic (ROC) curve was 0.888 and 0.884, and the Kappa statistic was 0.600 and 0.629. We conclude that neural networks and logistic regression were both good classifiers for obesity detection and did not differ significantly in classification performance.
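
    The measures reported in this abstract (proportion correctly classified, sensitivity, specificity, Kappa) can all be derived from a 2×2 confusion matrix. The following is a minimal Python sketch under stated assumptions, not the authors' analysis: the function name `classification_metrics` and the example counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fn, fp, tn):
    """Return accuracy, sensitivity, specificity and Cohen's kappa
    for a binary classifier, given the four confusion-matrix counts."""
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate

    # Agreement expected by chance, from the row and column margins.
    p_both_positive = ((tp + fn) / n) * ((tp + fp) / n)
    p_both_negative = ((fp + tn) / n) * ((fn + tn) / n)
    p_expected = p_both_positive + p_both_negative
    kappa = (accuracy - p_expected) / (1 - p_expected)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=40, fn=10, fp=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    Kappa corrects the raw accuracy for the agreement expected by chance, which is why it is lower than the accuracy in the example.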

  18. Binary logistic regression-Instrument for assessing museum indoor air impact on exhibits.

    PubMed

    Bucur, Elena; Danet, Andrei Florin; Lehr, Carol Blaziu; Lehr, Elena; Nita-Lazar, Mihai

    2017-04-01

    This paper presents a new way to assess the environmental impact on historical artifacts using binary logistic regression. The impact on the exhibits during certain pollution scenarios (environmental impact) was predicted by a mathematical model based on binary logistic regression; the model identifies, from a multitude of possible environmental parameters, those with a significant impact on exhibits and ranks them according to the severity of their effect. Air quality (NO2, SO2, O3 and PM2.5) and microclimate (temperature, humidity) monitoring data from a case study conducted within exhibition and storage spaces of the Romanian National Aviation Museum Bucharest were used for developing and validating the binary logistic regression method and the mathematical model. The logistic regression analysis was run on 794 data combinations (715 to develop the model and 79 to validate it) using the Statistical Package for Social Sciences (SPSS 20.0). The results demonstrated that, of the six parameters taken into consideration, four have a significant effect upon exhibits, in the following order: O3 > PM2.5 > NO2 > humidity, followed at a significant distance by the effects of SO2 and temperature. The mathematical model developed in this study correctly predicted 95.1% of the cumulated effect of the environmental parameters upon the exhibits. Moreover, this model could also be used in the decision-making process regarding the preventive preservation measures that should be implemented within the exhibition space.

  19. Modelling of binary logistic regression for obesity among secondary students in a rural area of Kedah

    NASA Astrophysics Data System (ADS)

    Kamaruddin, Ainur Amira; Ali, Zalila; Noor, Norlida Mohd.; Baharum, Adam; Ahmad, Wan Muhamad Amir W.

    2014-07-01

    Logistic regression analysis examines the influence of various factors on a dichotomous outcome by estimating the probability of the event's occurrence. Logistic regression, also called a logit model, is a statistical procedure used to model dichotomous outcomes. In the logit model, the log odds of the dichotomous outcome is modeled as a linear combination of the predictor variables. The log odds ratio in logistic regression describes the probabilistic relationship between the variables and the outcome. In conducting logistic regression, selection procedures are used to select important predictor variables; diagnostics are used to check that the assumptions are valid, including independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers; and a test statistic is calculated to determine the aptness of the model. This study used the binary logistic regression model to investigate overweight and obesity among rural secondary school students on the basis of their demographic profile, medical history, diet and lifestyle. The results indicate that overweight and obesity of students are influenced by obesity in the family and by the interaction between a student's ethnicity and routine meal intake. The odds of a student being overweight or obese are higher for a student with a family history of obesity and for a non-Malay student who frequently takes routine meals as compared to a Malay student.
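
    The logit relationship described in this abstract, with the log odds as a linear combination of predictors, can be illustrated in a few lines of Python. This is a sketch only: the intercept, the single coefficient, and the function names are hypothetical and are not estimates from the study.

```python
import math


def predicted_probability(intercept, coefs, x):
    """Probability of the event under a logit model:
    log(p / (1 - p)) = intercept + sum(coef_i * x_i)."""
    log_odds = intercept + sum(b * xi for b, xi in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-log_odds))


def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


# Hypothetical model with one binary predictor
# (for example, family history: 0 = no, 1 = yes).
intercept = -1.0
coefs = [0.7]

p_without = predicted_probability(intercept, coefs, [0])
p_with = predicted_probability(intercept, coefs, [1])

# The odds ratio for a one-unit change in the predictor equals exp(coefficient).
odds_ratio = odds(p_with) / odds(p_without)
print(f"p(x=0) = {p_without:.3f}, p(x=1) = {p_with:.3f}")
print(f"odds ratio = {odds_ratio:.3f}, exp(coef) = {math.exp(coefs[0]):.3f}")
```

    The example shows why coefficients are usually reported as odds ratios: exponentiating a coefficient gives the multiplicative change in the odds, whatever the values of the other predictors.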

  20. Empirical Bayesian LASSO-logistic regression for multiple binary trait locus mapping

    PubMed Central

    2013-01-01

    Background: Complex binary traits are influenced by many factors, including the main effects of many quantitative trait loci (QTLs), epistatic effects involving more than one QTL, environmental effects, and the effects of gene-environment interactions. Although a number of QTL mapping methods for binary traits have been developed, there is still no efficient and powerful method that can handle both main and epistatic effects of a relatively large number of possible QTLs. Results: In this paper, we use a Bayesian logistic regression model as the QTL model for binary traits that includes both main and epistatic effects. Our logistic regression model employs hierarchical priors for regression coefficients similar to the ones used in the Bayesian LASSO linear model for multiple QTL mapping for continuous traits. We develop efficient empirical Bayesian algorithms to infer the logistic regression model. Our simulation study shows that our algorithms can easily handle a QTL model with a large number of main and epistatic effects on a personal computer, and outperform five other methods examined, including the LASSO, HyperLasso, BhGLM, RVM and the single-QTL mapping method based on logistic regression, in terms of power of detection and false positive rate. The utility of our algorithms is also demonstrated through analysis of a real data set. A software package implementing the empirical Bayesian algorithms in this paper is freely available upon request. Conclusions: The EBLASSO logistic regression method can handle a large number of effects, possibly including the main and epistatic QTL effects, environmental effects and the effects of gene-environment interactions. It will be a very useful tool for multiple-QTL mapping of complex binary traits. PMID:23410082

  1. PARAMETRIC AND NON PARAMETRIC (MARS: MULTIVARIATE ADDITIVE REGRESSION SPLINES) LOGISTIC REGRESSIONS FOR PREDICTION OF A DICHOTOMOUS RESPONSE VARIABLE WITH AN EXAMPLE FOR PRESENCE/ABSENCE OF AMPHIBIANS

    EPA Science Inventory

    The purpose of this report is to provide a reference manual that could be used by investigators for making informed use of logistic regression using two methods (standard logistic regression and MARS). The details for analyses of relationships between a dependent binary response ...

  2. Use and interpretation of logistic regression in habitat-selection studies

    USGS Publications Warehouse

    Keating, Kim A.; Cherry, Steve

    2004-01-01

    Logistic regression is an important tool for wildlife habitat-selection studies, but the method frequently has been misapplied due to an inadequate understanding of the logistic model, its interpretation, and the influence of sampling design. To promote better use of this method, we review its application and interpretation under 3 sampling designs: random, case-control, and use-availability. Logistic regression is appropriate for habitat use-nonuse studies employing random sampling and can be used to directly model the conditional probability of use in such cases. Logistic regression also is appropriate for studies employing case-control sampling designs, but careful attention is required to interpret results correctly. Unless bias can be estimated or probability of use is small for all habitats, results of case-control studies should be interpreted as odds ratios, rather than probability of use or relative probability of use. When data are gathered under a use-availability design, logistic regression can be used to estimate approximate odds ratios if probability of use is small, at least on average. More generally, however, logistic regression is inappropriate for modeling habitat selection in use-availability studies. In particular, using logistic regression to fit the exponential model of Manly et al. (2002:100) does not guarantee maximum-likelihood estimates, valid probabilities, or valid likelihoods. We show that the resource selection function (RSF) commonly used for the exponential model is proportional to a logistic discriminant function. Thus, it may be used to rank habitats with respect to probability of use and to identify important habitat characteristics or their surrogates, but it is not guaranteed to be proportional to probability of use. Other problems associated with the exponential model also are discussed. We describe an alternative model based on Lancaster and Imbens (1996) that offers a method for estimating conditional probability of use in…
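
    The abstract's caution that odds ratios stand in for relative probability of use only when use is rare can be checked numerically. The Python sketch below is illustrative and is not taken from the paper: the function names and the probabilities of use are hypothetical.

```python
def odds_ratio(p1, p0):
    """Odds ratio comparing probability p1 with probability p0."""
    return (p1 / (1.0 - p1)) / (p0 / (1.0 - p0))


def relative_probability(p1, p0):
    """Ratio of the two probabilities (relative probability of use)."""
    return p1 / p0


# Hypothetical probabilities of use for two habitat types.
rare_use = (0.02, 0.01)     # use is rare in both habitats
common_use = (0.60, 0.30)   # use is common in both habitats

for label, (p1, p0) in (("rare", rare_use), ("common", common_use)):
    print(
        f"{label}: relative probability = {relative_probability(p1, p0):.2f}, "
        f"odds ratio = {odds_ratio(p1, p0):.2f}"
    )
```

    In both cases the probability of use doubles, but the odds ratio stays close to 2 only in the rare-use case; when use is common it overstates the relative probability considerably.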

  3. Mixed-Effects Logistic Regression for Estimating Transitional Probabilities in Sequentially Coded Observational Data

    ERIC Educational Resources Information Center

    Ozechowski, Timothy J.; Turner, Charles W.; Hops, Hyman

    2007-01-01

    This article demonstrates the use of mixed-effects logistic regression (MLR) for conducting sequential analyses of binary observational data. MLR is a special case of the mixed-effects logit modeling framework, which may be applied to multicategorical observational data. The MLR approach is motivated in part by G. A. Dagne, G. W. Howe, C. H.…

  4. Detecting DIF in Polytomous Items Using MACS, IRT and Ordinal Logistic Regression

    ERIC Educational Resources Information Center

    Elosua, Paula; Wells, Craig

    2013-01-01

    The purpose of the present study was to compare the Type I error rate and power of two model-based procedures, the mean and covariance structure model (MACS) and the item response theory (IRT), and an observed-score based procedure, ordinal logistic regression, for detecting differential item functioning (DIF) in polytomous items. A simulation…

  5. Accuracy of Bayes and Logistic Regression Subscale Probabilities for Educational and Certification Tests

    ERIC Educational Resources Information Center

    Rudner, Lawrence

    2016-01-01

    In the machine learning literature, it is commonly accepted as fact that as calibration sample sizes increase, Naïve Bayes classifiers initially outperform Logistic Regression classifiers in terms of classification accuracy. Applied to subtests from an on-line final examination and from a highly regarded certification examination, this study shows…

  6. A Note on Three Statistical Tests in the Logistic Regression DIF Procedure

    ERIC Educational Resources Information Center

    Paek, Insu

    2012-01-01

    Although logistic regression became one of the well-known methods in detecting differential item functioning (DIF), its three statistical tests, the Wald, likelihood ratio (LR), and score tests, which are readily available under the maximum likelihood, do not seem to be consistently distinguished in DIF literature. This paper provides a clarifying…

  7. Comparison of IRT Likelihood Ratio Test and Logistic Regression DIF Detection Procedures

    ERIC Educational Resources Information Center

    Atar, Burcu; Kamata, Akihito

    2011-01-01

    The Type I error rates and the power of IRT likelihood ratio test and cumulative logit ordinal logistic regression procedures in detecting differential item functioning (DIF) for polytomously scored items were investigated in this Monte Carlo simulation study. For this purpose, 54 simulation conditions (combinations of 3 sample sizes, 2 sample…

  8. Modeling Polytomous Item Responses Using Simultaneously Estimated Multinomial Logistic Regression Models

    ERIC Educational Resources Information Center

    Anderson, Carolyn J.; Verkuilen, Jay; Peyton, Buddy L.

    2010-01-01

    Survey items with multiple response categories and multiple-choice test questions are ubiquitous in psychological and educational research. We illustrate the use of log-multiplicative association (LMA) models that are extensions of the well-known multinomial logistic regression model for multiple dependent outcome variables to reanalyze a set of…

  9. Risk Factors of Falls in Community-Dwelling Older Adults: Logistic Regression Tree Analysis

    ERIC Educational Resources Information Center

    Yamashita, Takashi; Noe, Douglas A.; Bailer, A. John

    2012-01-01

    Purpose of the Study: A novel logistic regression tree-based method was applied to identify fall risk factors and possible interaction effects of those risk factors. Design and Methods: A nationally representative sample of American older adults aged 65 years and older (N = 9,592) in the Health and Retirement Study 2004 and 2006 modules was used.…

  10. Comparison of a Bayesian Network with a Logistic Regression Model to Forecast IgA Nephropathy

    PubMed Central

    Ducher, Michel; Kalbacher, Emilie; Combarnous, François; Finaz de Vilaine, Jérome; McGregor, Brigitte; Fouque, Denis; Fauvel, Jean Pierre

    2013-01-01

    Models are increasingly used in clinical practice to improve the accuracy of diagnosis. The aim of our work was to compare a Bayesian network to logistic regression to forecast IgA nephropathy (IgAN) from simple clinical and biological criteria. Retrospectively, we pooled the results of all biopsies (n = 155) performed by nephrologists in a specialist clinical facility between 2002 and 2009. Two groups were constituted at random. The first subgroup was used to determine the parameters of the models adjusted to data by logistic regression or Bayesian network, and the second was used to compare the performances of the models using receiver operating characteristics (ROC) curves. IgAN was found (on pathology) in 44 patients. Areas under the ROC curves provided by both methods were highly significant but not different from each other. Based on the highest Youden indices, sensitivity reached (100% versus 67%) and specificity (73% versus 95%) using the Bayesian network and logistic regression, respectively. A Bayesian network is at least as efficient as logistic regression to estimate the probability of a patient suffering IgAN, using simple clinical and biological data obtained during consultation. PMID:24328031
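
    The Youden index mentioned in this abstract is sensitivity plus specificity minus one. As a small worked illustration in Python, using the sensitivity and specificity pairs quoted above (the function and variable names are hypothetical):

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


# Sensitivity and specificity as quoted in the abstract.
j_bayes_net = youden_index(sensitivity=1.00, specificity=0.73)
j_logistic = youden_index(sensitivity=0.67, specificity=0.95)

print(f"Bayesian network:    J = {j_bayes_net:.2f}")
print(f"Logistic regression: J = {j_logistic:.2f}")
```

    A single index value hides the trade-off between the two models: the Bayesian network operating point favors sensitivity, whereas the logistic regression operating point favors specificity.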

  11. A Comparison of Logistic Regression, Neural Networks, and Classification Trees Predicting Success of Actuarial Students

    ERIC Educational Resources Information Center

    Schumacher, Phyllis; Olinsky, Alan; Quinn, John; Smith, Richard

    2010-01-01

    The authors extended previous research by 2 of the authors who conducted a study designed to predict the successful completion of students enrolled in an actuarial program. They used logistic regression to determine the probability of an actuarial student graduating in the major or dropping out. They compared the results of this study with those…

  12. Comparison of a Bayesian network with a logistic regression model to forecast IgA nephropathy.

    PubMed

    Ducher, Michel; Kalbacher, Emilie; Combarnous, François; Finaz de Vilaine, Jérome; McGregor, Brigitte; Fouque, Denis; Fauvel, Jean Pierre

    2013-01-01

    Models are increasingly used in clinical practice to improve the accuracy of diagnosis. The aim of our work was to compare a Bayesian network to logistic regression to forecast IgA nephropathy (IgAN) from simple clinical and biological criteria. Retrospectively, we pooled the results of all biopsies (n = 155) performed by nephrologists in a specialist clinical facility between 2002 and 2009. Two groups were constituted at random. The first subgroup was used to determine the parameters of the models adjusted to data by logistic regression or Bayesian network, and the second was used to compare the performances of the models using receiver operating characteristics (ROC) curves. IgAN was found (on pathology) in 44 patients. Areas under the ROC curves provided by both methods were highly significant but not different from each other. Based on the highest Youden indices, sensitivity reached (100% versus 67%) and specificity (73% versus 95%) using the Bayesian network and logistic regression, respectively. A Bayesian network is at least as efficient as logistic regression to estimate the probability of a patient suffering IgAN, using simple clinical and biological data obtained during consultation.

  13. Propensity Score Estimation with Data Mining Techniques: Alternatives to Logistic Regression

    ERIC Educational Resources Information Center

    Keller, Bryan S. B.; Kim, Jee-Seon; Steiner, Peter M.

    2013-01-01

    Propensity score analysis (PSA) is a methodological technique which may correct for selection bias in a quasi-experiment by modeling the selection process using observed covariates. Because logistic regression is well understood by researchers in a variety of fields and easy to implement in a number of popular software packages, it has…

  14. Multiple Logistic Regression Analysis of Cigarette Use among High School Students

    ERIC Educational Resources Information Center

    Adwere-Boamah, Joseph

    2011-01-01

    A binary logistic regression analysis was performed to predict high school students' cigarette smoking behavior from selected predictors from 2009 CDC Youth Risk Behavior Surveillance Survey. The specific target student behavior of interest was frequent cigarette use. Five predictor variables included in the model were: a) race, b) frequency of…

  15. Monte Carlo Evaluation of Two-Level Logistic Regression for Assessing Person Fit

    ERIC Educational Resources Information Center

    Woods, Carol M.

    2008-01-01

    Person fit is the degree to which an item response model fits for individual examinees. Reise (2000) described how two-level logistic regression can be used to detect heterogeneity in person fit, evaluate potential predictors of person fit heterogeneity, and identify potentially aberrant individuals. The method has apparently never been applied to…

  16. Strategies for Testing Statistical and Practical Significance in Detecting DIF with Logistic Regression Models

    ERIC Educational Resources Information Center

    Fidalgo, Angel M.; Alavi, Seyed Mohammad; Amirian, Seyed Mohammad Reza

    2014-01-01

    This study examines three controversial aspects in differential item functioning (DIF) detection by logistic regression (LR) models: first, the relative effectiveness of different analytical strategies for detecting DIF; second, the suitability of the Wald statistic for determining the statistical significance of the parameters of interest; and…

  17. Comparing the Classification Accuracy among Nonparametric, Parametric Discriminant Analysis and Logistic Regression Methods.

    ERIC Educational Resources Information Center

    Ferrer, Alvaro J. Arce; Wang, Lin

    This study compared the classification performance among parametric discriminant analysis, nonparametric discriminant analysis, and logistic regression in a two-group classification application. Field data from an organizational survey were analyzed and bootstrapped for additional exploration. The data were observed to depart from multivariate…

  18. An Additional Measure of Overall Effect Size for Logistic Regression Models

    ERIC Educational Resources Information Center

    Allen, Jeff; Le, Huy

    2008-01-01

    Users of logistic regression models often need to describe the overall predictive strength, or effect size, of the model's predictors. Analogs of R[superscript 2] have been developed, but none of these measures are interpretable on the same scale as effects of individual predictors. Furthermore, R[superscript 2] analogs are not invariant to the…

  19. Comparison of Two Approaches for Handling Missing Covariates in Logistic Regression

    ERIC Educational Resources Information Center

    Peng, Chao-Ying Joanne; Zhu, Jin

    2008-01-01

    For the past 25 years, methodological advances have been made in missing data treatment. Most published work has focused on missing data in dependent variables under various conditions. The present study seeks to fill the void by comparing two approaches for handling missing data in categorical covariates in logistic regression: the…

  20. Iterative Purification and Effect Size Use with Logistic Regression for Differential Item Functioning Detection

    ERIC Educational Resources Information Center

    French, Brian F.; Maller, Susan J.

    2007-01-01

    Two unresolved implementation issues with logistic regression (LR) for differential item functioning (DIF) detection include ability purification and effect size use. Purification is suggested to control inaccuracies in DIF detection as a result of DIF items in the ability estimate. Additionally, effect size use may be beneficial in controlling…

  1. Predictors of Placement Stability at the State Level: The Use of Logistic Regression to Inform Practice

    ERIC Educational Resources Information Center

    Courtney, Jon R.; Prophet, Retta

    2011-01-01

    Placement instability is often associated with a number of negative outcomes for children. To gain state level contextual knowledge of factors associated with placement stability/instability, logistic regression was applied to selected variables from the New Mexico Adoption and Foster Care Administrative Reporting System dataset. Predictors…

  2. Predictive Discriminant Analysis Versus Logistic Regression in Two-Group Classification Problems.

    ERIC Educational Resources Information Center

    Meshbane, Alice; Morris, John D.

    A method for comparing the cross-validated classification accuracies of predictive discriminant analysis and logistic regression classification models is presented under varying data conditions for the two-group classification problem. With this method, separate-group, as well as total-sample proportions of the correct classifications, can be…

  3. A hybrid approach of stepwise regression, logistic regression, support vector machine, and decision tree for forecasting fraudulent financial statements.

    PubMed

    Chen, Suduan; Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As the problem of fraudulent financial statements grows increasingly serious, establishing a valid model for forecasting fraudulent financial statements has become an important question for academic research and financial practice. After screening the important variables using stepwise regression, the study applies logistic regression, support vector machine, and decision tree methods to construct classification models for comparison. The study adopts financial and nonfinancial variables to assist in establishing the forecasting model. The research objects are companies with fraudulent and nonfraudulent financial statements between 1998 and 2012. The findings are that financial and nonfinancial information can be used effectively to distinguish fraudulent financial statements, and that the C5.0 decision tree has the best classification performance (85.71%).

  4. A Hybrid Approach of Stepwise Regression, Logistic Regression, Support Vector Machine, and Decision Tree for Forecasting Fraudulent Financial Statements

    PubMed Central

    Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As the problem of fraudulent financial statements grows increasingly serious, establishing a valid model for forecasting fraudulent financial statements has become an important question for academic research and financial practice. After screening the important variables using stepwise regression, the study applies logistic regression, support vector machine, and decision tree methods to construct classification models for comparison. The study adopts financial and nonfinancial variables to assist in establishing the forecasting model. The research objects are companies with fraudulent and nonfraudulent financial statements between 1998 and 2012. The findings are that financial and nonfinancial information can be used effectively to distinguish fraudulent financial statements, and that the C5.0 decision tree has the best classification performance (85.71%). PMID:25302338

  5. Regressive logistic models for familial diseases: a formulation assuming an underlying liability model.

    PubMed Central

    Demenais, F M

    1991-01-01

    Statistical models have been developed to delineate the major-gene and non-major-gene factors accounting for the familial aggregation of complex diseases. The mixed model assumes an underlying liability to the disease, to which a major gene, a multifactorial component, and random environment contribute independently. Affection is defined by a threshold on the liability scale. The regressive logistic models assume that the logarithm of the odds of being affected is a linear function of major genotype, phenotypes of antecedents and other covariates. An equivalence between these two approaches cannot be derived analytically. I propose a formulation of the regressive logistic models on the supposition of an underlying liability model of disease. Relatives are assumed to have correlated liabilities to the disease; affected persons have liabilities exceeding an estimable threshold. Under the assumption that the correlation structure of the relatives' liabilities follows a regressive model, the regression coefficients on antecedents are expressed in terms of the relevant familial correlations. A parsimonious parameterization is a consequence of the assumed liability model, and a one-to-one correspondence with the parameters of the mixed model can be established. The logits, derived under the class A regressive model and under the class D regressive model, can be extended to include a large variety of patterns of family dependence, as well as gene-environment interactions. PMID:1897524

  6. Susceptibility mapping of shallow landslides using kernel-based Gaussian process, support vector machines and logistic regression

    NASA Astrophysics Data System (ADS)

    Colkesen, Ismail; Sahin, Emrehan Kutlug; Kavzoglu, Taskin

    2016-06-01

    Identification of landslide-prone areas and production of accurate landslide susceptibility zonation maps have been crucial topics for hazard management studies. Since the prediction of susceptibility is one of the main processing steps in landslide susceptibility analysis, selection of a suitable prediction method plays an important role in the success of the susceptibility zonation process. Although simple statistical algorithms (e.g. logistic regression) have been widely used in the literature, the use of advanced non-parametric algorithms in landslide susceptibility zonation has recently become an active research topic. The main purpose of this study is to investigate the possible application of kernel-based Gaussian process regression (GPR) and support vector regression (SVR) for producing a landslide susceptibility map of the Tonya district of Trabzon, Turkey. Results of these two regression methods were compared with the logistic regression (LR) method, which is regarded as a benchmark method. Results showed that while the kernel-based GPR and SVR methods generally produced similar results (90.46% and 90.37%, respectively), they outperformed the conventional LR method by about 18%. While confirming the superiority of the GPR method, statistical tests based on ROC statistics, success rate and prediction rate curves revealed a significant improvement in susceptibility map accuracy from applying the kernel-based GPR and SVR methods.

  7. Investigating flight response of Pacific brant to helicopters at Izembek Lagoon, Alaska by using logistic regression

    USGS Publications Warehouse

    Erickson, Wallace P.; Nick, Todd G.; Ward, David H.; Peck, Roxy; Haugh, Larry D.; Goodman, Arnold

    1998-01-01

    Izembek Lagoon, an estuary in Alaska, is a very important staging area for Pacific brant, a small migratory goose. Each fall, nearly the entire Pacific Flyway population of 130,000 brant flies to Izembek Lagoon and feeds on eelgrass to accumulate fat reserves for nonstop transoceanic migration to wintering areas as distant as Mexico. In the past 10 years, offshore drilling activities in this area have increased, and, as a result, the air traffic in and out of the nearby Cold Bay airport has also increased. There has been a concern that this increased air traffic could affect the brant by disturbing them from their feeding and resting activities, which in turn could result in reduced energy intake and buildup. This may increase the mortality rates during their migratory journey. Because of these concerns, a study was conducted to investigate the flight response of brant to overflights of large helicopters. Response was measured on flocks during experimental overflights of large helicopters flown at varying altitudes and lateral (perpendicular) distances from the flocks. Logistic regression models were developed for predicting probability of flight response as a function of these distance variables. Results of this study may be used in the development of new FAA guidelines for aircraft near Izembek Lagoon.

  8. Predicting Outcomes of Hospitalization for Heart Failure Using Logistic Regression and Knowledge Discovery Methods

    PubMed Central

    Phillips, Kirk T.; Street, W. Nick

    2005-01-01

    The purpose of this study is to determine the best prediction of heart failure outcomes resulting from two methods: standard epidemiologic analysis with logistic regression and knowledge discovery with supervised learning/data mining. Heart failure was chosen for this study as it exhibits higher prevalence and cost of treatment than most other hospitalized diseases. The prevalence of heart failure has exceeded 4 million cases in the U.S. Findings of this study should be useful for the design of quality improvement initiatives, as particular aspects of patient comorbidity and treatment are found to be associated with mortality. This is also a proof-of-concept study, considering the feasibility of emerging health informatics methods of data mining in conjunction with or in lieu of traditional logistic regression methods of prediction. Findings may also support the design of decision support systems and quality improvement programming for other diseases. PMID:16779367

  9. Hierarchical Bayesian Logistic Regression to forecast metabolic control in type 2 DM patients.

    PubMed

    Dagliati, Arianna; Malovini, Alberto; Decata, Pasquale; Cogni, Giulia; Teliti, Marsida; Sacchi, Lucia; Cerra, Carlo; Chiovato, Luca; Bellazzi, Riccardo

    2016-01-01

    In this work we present our efforts in building a model able to forecast patients' changes in clinical conditions when repeated measurements are available. In this case the available risk calculators are typically not applicable. We propose a Hierarchical Bayesian Logistic Regression model, which allows taking into account individual and population variability in model parameters estimate. The model is used to predict metabolic control and its variation in type 2 diabetes mellitus. In particular we have analyzed a population of more than 1000 Italian type 2 diabetic patients, collected within the European project Mosaic. The results obtained in terms of Matthews Correlation Coefficient are significantly better than the ones gathered with standard logistic regression model, based on data pooling.

  10. Hierarchical Bayesian Logistic Regression to forecast metabolic control in type 2 DM patients

    PubMed Central

    Dagliati, Arianna; Malovini, Alberto; Decata, Pasquale; Cogni, Giulia; Teliti, Marsida; Sacchi, Lucia; Cerra, Carlo; Chiovato, Luca; Bellazzi, Riccardo

    2016-01-01

    In this work we present our efforts in building a model able to forecast patients’ changes in clinical conditions when repeated measurements are available. In this case the available risk calculators are typically not applicable. We propose a Hierarchical Bayesian Logistic Regression model, which takes into account individual and population variability in the estimation of model parameters. The model is used to predict metabolic control and its variation in type 2 diabetes mellitus. In particular we have analyzed a population of more than 1000 Italian type 2 diabetic patients, collected within the European project Mosaic. The results obtained in terms of Matthews Correlation Coefficient are significantly better than those obtained with a standard logistic regression model based on data pooling. PMID:28269842
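    The Matthews Correlation Coefficient named in this abstract can be computed directly from the four cells of a binary confusion matrix. The following is a minimal sketch only; the function name and the zero-denominator convention are illustrative assumptions, not taken from the paper.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts.

    tp, tn, fp, fn: true positives, true negatives, false positives,
    false negatives. Returns a value in [-1, 1].
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: report 0 when a whole row or column is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts, for illustration only.
print(matthews_corrcoef(tp=40, tn=45, fp=5, fn=10))
```

    Unlike plain accuracy, the coefficient uses all four cells, which is one reason it is often preferred when the two outcome classes are unbalanced.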

  11. EXpectation Propagation LOgistic REgRession (EXPLORER): Distributed Privacy-Preserving Online Model Learning

    PubMed Central

    Wang, Shuang; Jiang, Xiaoqian; Wu, Yuan; Cui, Lijuan; Cheng, Samuel; Ohno-Machado, Lucila

    2013-01-01

    We developed an EXpectation Propagation LOgistic REgRession (EXPLORER) model for distributed privacy-preserving online learning. The proposed framework provides a high level guarantee for protecting sensitive information, since the information exchanged between the server and the client is the encrypted posterior distribution of coefficients. Through experimental results, EXPLORER shows the same performance (e.g., discrimination, calibration, feature selection etc.) as the traditional frequentist Logistic Regression model, but provides more flexibility in model updating. That is, EXPLORER can be updated one point at a time rather than having to retrain the entire data set when new observations are recorded. The proposed EXPLORER supports asynchronized communication, which relieves the participants from coordinating with one another, and prevents service breakdown from the absence of participants or interrupted communications. PMID:23562651

  12. Assessing the effects of different types of covariates for binary logistic regression

    NASA Astrophysics Data System (ADS)

    Hamid, Hamzah Abdul; Wah, Yap Bee; Xie, Xian-Jin; Rahman, Hezlin Aryani Abd

    2015-02-01

    It is well known that the type of data distribution in the independent variable(s) may affect many statistical procedures. This paper investigates and illustrates the effect of different types of covariates on the parameter estimation of a binary logistic regression model. A simulation study with different sample sizes and different types of covariates (uniform, normal, skewed) was carried out. Results showed that the parameters of the binary logistic regression model are severely overestimated when the sample size is less than 150 for covariates with normal and uniform distributions, while the parameters are underestimated when the covariate distribution is skewed. Parameter estimation improves for all types of covariates when the sample size is large, that is, at least 500.

  13. Predicting Student Success on the Texas Chemistry STAAR Test: A Logistic Regression Analysis

    ERIC Educational Resources Information Center

    Johnson, William L.; Johnson, Annabel M.; Johnson, Jared

    2012-01-01

    Background: The context is the new Texas STAAR end-of-course testing program. Purpose: The authors developed a logistic regression model to predict who would pass-or-fail the new Texas chemistry STAAR end-of-course exam. Setting: Robert E. Lee High School (5A) with an enrollment of 2700 students, Tyler, Texas. Date of the study was the 2011-2012…

  14. A logistic regression model for predicting axillary lymph node metastases in early breast carcinoma patients.

    PubMed

    Xie, Fei; Yang, Houpu; Wang, Shu; Zhou, Bo; Tong, Fuzhong; Yang, Deqi; Zhang, Jiaqing

    2012-01-01

    Nodal staging in breast cancer is a key predictor of prognosis. This paper presents the results of potential clinicopathological predictors of axillary lymph node involvement and develops an efficient prediction model to assist in predicting axillary lymph node metastases. Seventy patients with primary early breast cancer who underwent axillary dissection were evaluated. Univariate and multivariate logistic regression were performed to evaluate the association between clinicopathological factors and lymph node metastatic status. A logistic regression predictive model was built from 50 randomly selected patients; the model was also applied to the remaining 20 patients to assess its validity. Univariate analysis showed a significant relationship between lymph node involvement and absence of nm-23 (p = 0.010) and Kiss-1 (p = 0.001) expression. Absence of Kiss-1 remained significantly associated with positive axillary node status in the multivariate analysis (p = 0.018). Seven clinicopathological factors were involved in the multivariate logistic regression model: menopausal status, tumor size, ER, PR, HER2, nm-23 and Kiss-1. The model was accurate and discriminating, with an area under the receiver operating characteristic curve of 0.702 when applied to the validation group. Moreover, there is a need to discover more specific candidate proteins and molecular biology tools to select more variables which should improve predictive accuracy.

  15. A Logistic Regression Model for Predicting Axillary Lymph Node Metastases in Early Breast Carcinoma Patients

    PubMed Central

    Xie, Fei; Yang, Houpu; Wang, Shu; Zhou, Bo; Tong, Fuzhong; Yang, Deqi; Zhang, Jiaqing

    2012-01-01

    Nodal staging in breast cancer is a key predictor of prognosis. This paper presents the results of potential clinicopathological predictors of axillary lymph node involvement and develops an efficient prediction model to assist in predicting axillary lymph node metastases. Seventy patients with primary early breast cancer who underwent axillary dissection were evaluated. Univariate and multivariate logistic regression were performed to evaluate the association between clinicopathological factors and lymph node metastatic status. A logistic regression predictive model was built from 50 randomly selected patients; the model was also applied to the remaining 20 patients to assess its validity. Univariate analysis showed a significant relationship between lymph node involvement and absence of nm-23 (p = 0.010) and Kiss-1 (p = 0.001) expression. Absence of Kiss-1 remained significantly associated with positive axillary node status in the multivariate analysis (p = 0.018). Seven clinicopathological factors were involved in the multivariate logistic regression model: menopausal status, tumor size, ER, PR, HER2, nm-23 and Kiss-1. The model was accurate and discriminating, with an area under the receiver operating characteristic curve of 0.702 when applied to the validation group. Moreover, there is a need to discover more specific candidate proteins and molecular biology tools to select more variables which should improve predictive accuracy. PMID:23012578
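    The area under the ROC curve reported for the validation group can be read as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The sketch below computes it that way, by pairwise comparison; the function name and the toy data are hypothetical and are not the study's data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 outcomes; scores: predicted probabilities.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy validation set (made-up values): three of the four
# positive/negative pairs are ordered correctly.
print(roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1]))  # 0.75
```

    The quadratic loop is fine for a validation group of 20 patients; a rank-based formula would be the usual choice for large samples.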

  16. Variable Selection in Logistic Regression for Detecting SNP-SNP Interactions: the Rheumatoid Arthritis Example

    PubMed Central

    Lin, H. Y.; Desmond, R.; Liu, Y. H.; Bridges, S. L.; Soong, S. J.

    2013-01-01

    Summary: Many complex disease traits are observed to be associated with single nucleotide polymorphism (SNP) interactions. In testing small-scale SNP-SNP interactions, variable selection procedures in logistic regressions are commonly used. The empirical evidence of variable selection for testing interactions in logistic regressions is limited. This simulation study was designed to compare nine variable selection procedures in logistic regressions for testing SNP-SNP interactions. Data on 10 SNPs were simulated for 400 and 1000 subjects (case/control ratio=1). The simulated model included one main effect and two 2-way interactions. The variable selection procedures included automatic selection (stepwise, forward and backward), common 2-step selection, AIC- and BIC-based selection. The hierarchical rule effect, in which all main effects and lower order terms of the highest-order interaction term are included in the model regardless of their statistical significance, was also examined. We found that stepwise variable selection without the hierarchical rule, which had a reasonably high authentic (true positive) proportion and a low noise (false positive) proportion, is a better method than the other variable selection procedures. The procedure without the hierarchical rule requires fewer terms in testing interactions, so it can accommodate more SNPs than the procedure with the hierarchical rule. For testing interactions, the procedures without the hierarchical rule had higher authentic proportion and lower noise proportion compared with ones with the hierarchical rule. These variable selection procedures were also applied and compared in a rheumatoid arthritis study. PMID:18231122
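    AIC- and BIC-based selection, two of the procedures compared here, score each candidate model from its maximized log-likelihood and its number of parameters, and keep the model with the lowest score. A minimal sketch of the two standard criteria follows; the log-likelihood values and parameter counts in the usage lines are invented for illustration and do not come from the study.

```python
import math


def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """Bayesian information criterion: k ln(n) - 2 ln L (lower is better)."""
    return k * math.log(n) - 2 * log_likelihood


# Hypothetical comparison: a main-effects model against one that adds
# two interaction terms, both fitted to n = 400 subjects.
n = 400
small = {"loglik": -250.0, "k": 3}
large = {"loglik": -247.5, "k": 5}
for name, m in (("small", small), ("large", large)):
    print(name, aic(m["loglik"], m["k"]), round(bic(m["loglik"], m["k"], n), 2))
```

    Because ln(n) exceeds 2 once n is 8 or more, BIC penalizes each extra term more heavily than AIC at the sample sizes used in the simulation (400 and 1000), so it tends to select smaller models.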

  17. A comparative study on entrepreneurial attitudes modeled with logistic regression and Bayes nets.

    PubMed

    López Puga, Jorge; García García, Juan

    2012-11-01

    Entrepreneurship research is receiving increasing attention in our context, as entrepreneurs are key social agents involved in economic development. We compare the success of the dichotomic logistic regression model and the Bayes simple classifier to predict entrepreneurship, after manipulating the percentage of missing data and the level of categorization in predictors. A sample of undergraduate university students (N = 1230) completed five scales (motivation, attitude towards business creation, obstacles, deficiencies, and training needs) and we found that each of them predicted different aspects of the tendency to business creation. Additionally, our results show that the receiver operating characteristic (ROC) curve is affected by the rate of missing data in both techniques, but logistic regression seems to be more vulnerable when faced with missing data, whereas Bayes nets underperform slightly when categorization has been manipulated. Our study sheds light on the potential entrepreneur profile and we propose to use Bayesian networks as an additional alternative to overcome the weaknesses of logistic regression when missing data are present in applied research.

  18. Regularized logistic regression with adjusted adaptive elastic net for gene selection in high dimensional cancer classification.

    PubMed

    Algamal, Zakariya Yahya; Lee, Muhammad Hisyam

    2015-12-01

    Cancer classification and gene selection in high-dimensional data have been popular research topics in genetics and molecular biology. Recently, adaptive regularized logistic regression using the elastic net regularization, which is called the adaptive elastic net, has been successfully applied in high-dimensional cancer classification to tackle both estimating the gene coefficients and performing gene selection simultaneously. The adaptive elastic net originally used elastic net estimates as the initial weight, however, using this weight may not be preferable for certain reasons: First, the elastic net estimator is biased in selecting genes. Second, it does not perform well when the pairwise correlations between variables are not high. Adjusted adaptive regularized logistic regression (AAElastic) is proposed to address these issues and encourage grouping effects simultaneously. The real data results indicate that AAElastic is significantly consistent in selecting genes compared to the other three competitor regularization methods. Additionally, the classification performance of AAElastic is comparable to the adaptive elastic net and better than other regularization methods. Thus, we can conclude that AAElastic is a reliable adaptive regularized logistic regression method in the field of high-dimensional cancer classification.
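    For orientation, a generic adaptive elastic net estimator for logistic regression can be written as a penalized negative log-likelihood. This is a sketch of the general form only: the symbols below are assumptions introduced here, scaling conventions differ between papers, and the specific weights that define AAElastic are not given in the abstract.

```latex
\hat{\beta} \;=\; \arg\min_{\beta}\;
  \Bigl\{ -\ell(\beta)
  \;+\; \lambda_1 \sum_{j=1}^{p} w_j \,\lvert \beta_j \rvert
  \;+\; \lambda_2 \sum_{j=1}^{p} \beta_j^{2} \Bigr\},
\qquad
\ell(\beta) \;=\; \sum_{i=1}^{n} \Bigl[ y_i\, x_i^{\top}\beta
  \;-\; \log\bigl(1 + e^{x_i^{\top}\beta}\bigr) \Bigr]
```

    Here y_i is the 0/1 class label, x_i the gene expression vector, and w_j a data-driven weight for gene j. Setting every w_j = 1 recovers the ordinary elastic net; the absolute-value term drives some coefficients exactly to zero (gene selection), while the squared term encourages correlated genes to enter or leave together (the grouping effect).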

  19. Application of likelihood ratio and logistic regression models to landslide susceptibility mapping using GIS.

    PubMed

    Lee, Saro

    2004-08-01

    For landslide susceptibility mapping, this study applied and verified a Bayesian probability model, a likelihood ratio and statistical model, and logistic regression to Janghung, Korea, using a Geographic Information System (GIS). Landslide locations were identified in the study area from interpretation of IRS satellite imagery and field surveys; and a spatial database was constructed from topographic maps, soil type, forest cover, geology and land cover. The factors that influence landslide occurrence, such as slope gradient, slope aspect, and curvature of topography, were calculated from the topographic database. Soil texture, material, drainage, and effective depth were extracted from the soil database, while forest type, diameter, and density were extracted from the forest database. Land cover was classified from Landsat TM satellite imagery using unsupervised classification. The likelihood ratio and logistic regression coefficient were overlaid to determine each factor's rating for landslide susceptibility mapping. Then the landslide susceptibility map was verified and compared with known landslide locations. The logistic regression model had higher prediction accuracy than the likelihood ratio model. The method can be used to reduce hazards associated with landslides and to support land cover planning.

  20. Graphical assessment of internal and external calibration of logistic regression models by using loess smoothers.

    PubMed

    Austin, Peter C; Steyerberg, Ewout W

    2014-02-10

    Predicting the probability of the occurrence of a binary outcome or condition is important in biomedical research. While assessing discrimination is an essential issue in developing and validating binary prediction models, less attention has been paid to methods for assessing model calibration. Calibration refers to the degree of agreement between observed and predicted probabilities and is often assessed by testing for lack-of-fit. The objective of our study was to examine the ability of graphical methods to assess the calibration of logistic regression models. We examined lack of internal calibration, which was related to misspecification of the logistic regression model, and external calibration, which was related to an overfit model or to shrinkage of the linear predictor. We conducted an extensive set of Monte Carlo simulations with a locally weighted least squares regression smoother (i.e., the loess algorithm) to examine the ability of graphical methods to assess model calibration. We found that loess-based methods were able to provide evidence of moderate departures from linearity and indicate omission of a moderately strong interaction. Misspecification of the link function was harder to detect. Visual patterns were clearer with higher sample sizes, higher incidence of the outcome, or higher discrimination. Loess-based methods were also able to identify the lack of calibration in external validation samples when an overfit regression model had been used. In conclusion, loess-based smoothing methods are adequate tools to graphically assess calibration and merit wider application.
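    Calibration, as defined in this abstract, compares predicted probabilities with observed outcome frequencies. The sketch below uses a deliberately simpler stand-in for the loess smoother studied in the paper: it sorts cases by predicted probability, splits them into equal-size bins, and reports the mean prediction and observed event rate per bin. The function name, the binning scheme, and the toy data are assumptions made for illustration.

```python
def binned_calibration(probs, outcomes, n_bins=4):
    """Return a list of (mean predicted probability, observed event rate),
    one pair per bin, with bins formed from cases sorted by prediction.

    Note: this is a binned approximation, not a loess-smoothed curve.
    """
    pairs = sorted(zip(probs, outcomes))
    n = len(pairs)
    result = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue  # more bins than cases
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        obs_rate = sum(y for _, y in chunk) / len(chunk)
        result.append((mean_pred, obs_rate))
    return result


# Made-up predictions and 0/1 outcomes for eight cases.
probs = [0.05, 0.10, 0.20, 0.30, 0.60, 0.70, 0.80, 0.95]
outcomes = [0, 0, 0, 1, 0, 1, 1, 1]
for mean_pred, obs_rate in binned_calibration(probs, outcomes):
    print(f"predicted {mean_pred:.3f}  observed {obs_rate:.3f}")
```

    For a well-calibrated model the two columns track each other. A smoother such as loess replaces the arbitrary bin edges with a continuous curve, which is the approach the authors evaluate.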

  1. Multiple Imputation of a Randomly Censored Covariate Improves Logistic Regression Analysis.

    PubMed

    Atem, Folefac D; Qian, Jing; Maye, Jacqueline E; Johnson, Keith A; Betensky, Rebecca A

    2016-01-01

    Randomly censored covariates arise frequently in epidemiologic studies. The most commonly used methods, including complete case and single imputation or substitution, suffer from inefficiency and bias. They make strong parametric assumptions or they consider limit of detection censoring only. We employ multiple imputation, in conjunction with semi-parametric modeling of the censored covariate, to overcome these shortcomings and to facilitate robust estimation. We develop a multiple imputation approach for randomly censored covariates within the framework of a logistic regression model. We use the non-parametric estimate of the covariate distribution or the semiparametric Cox model estimate in the presence of additional covariates in the model. We evaluate this procedure in simulations, and compare its operating characteristics to those from the complete case analysis and a survival regression approach. We apply the procedures to an Alzheimer's study of the association between amyloid positivity and maternal age of onset of dementia. Multiple imputation achieves lower standard errors and higher power than the complete case approach under heavy and moderate censoring and is comparable under light censoring. The survival regression approach achieves the highest power among all procedures, but does not produce interpretable estimates of association. Multiple imputation offers a favorable alternative to complete case analysis and ad hoc substitution methods in the presence of randomly censored covariates within the framework of logistic regression.

  2. A general framework for the use of logistic regression models in meta-analysis.

    PubMed

    Simmonds, Mark C; Higgins, Julian Pt

    2016-12-01

    Where individual participant data are available for every randomised trial in a meta-analysis of dichotomous event outcomes, "one-stage" random-effects logistic regression models have been proposed as a way to analyse these data. Such models can also be used even when individual participant data are not available and we have only summary contingency table data. One benefit of this one-stage regression model over conventional meta-analysis methods is that it maximises the correct binomial likelihood for the data and so does not require the common assumption that effect estimates are normally distributed. A second benefit of using this model is that it may be applied, with only minor modification, in a range of meta-analytic scenarios, including meta-regression, network meta-analyses and meta-analyses of diagnostic test accuracy. This single model can potentially replace the variety of often complex methods used in these areas. This paper considers, with a range of meta-analysis examples, how random-effects logistic regression models may be used in a number of different types of meta-analyses. This one-stage approach is compared with widely used meta-analysis methods including Bayesian network meta-analysis and the bivariate and hierarchical summary receiver operating characteristic (ROC) models for meta-analyses of diagnostic test accuracy.

  3. The assessment of groundwater nitrate contamination by using logistic regression model in a representative rural area

    NASA Astrophysics Data System (ADS)

    Ko, K.; Cheong, B.; Koh, D.

    2010-12-01

    Groundwater has been used as a main source of drinking water in rural areas of Korea that have no regional potable water supply system. More than 50 percent of rural area residents depend on groundwater as drinking water. Thus, research on predicting groundwater pollution was needed for sustainable groundwater usage and protection from potential pollutants. This study was carried out to assess the vulnerability of groundwater to nitrate contamination, reflecting the effect of land use, in Nonsan city, a representative rural area of South Korea. About 47% of the study area is occupied by cultivated land that is highly vulnerable to groundwater nitrate contamination because its nitrogen fertilizer input of 62.3 tons/km2 is higher than the country’s average of 44.0 tons/km2. Two vulnerability assessment methods, logistic regression and the DRASTIC model, were tested and compared to identify the more suitable technique for assessing groundwater nitrate contamination in the Nonsan area. The groundwater quality data were compiled from analyses of 111 samples from small potable supply systems in the study area. The analyzed nitrate values were classified by land use: resident, upland, paddy, and field area. One dependent and two independent variables were used in the logistic regression analysis. The dependent variable was binary (0 or 1), indicating whether nitrate exceeded thresholds of 1 through 10 mg/L. The independent variables were slope, a continuous variable indicating topography, and land use, a categorical variable with the classes resident, upland, paddy, and field area. Levene’s test and the t-test for slope and land use showed significant differences in mean values among groups at the 95% confidence level. The logistic regression indicated a negative correlation between slope and nitrate, which was caused by the decrease of contaminant inputs into groundwater with

  4. Comparing performances of heuristic and logistic regression models for a spatial landslide susceptibility assessment in Maramures County, Northwestern Romania

    NASA Astrophysics Data System (ADS)

    Mǎgut, F. L.; Zaharia, S.; Glade, T.; Irimuş, I. A.

    2012-04-01

    Various methods exist in analyzing spatial landslide susceptibility and classing the results in susceptibility classes. The prediction of spatial landslide distribution can be performed by using a variety of methods based on GIS techniques. The two very common methods of a heuristic assessment and a logistic regression model are employed in this study in order to compare their performance in predicting the spatial distribution of previously mapped landslides for a study area located in Maramureš County, in Northwestern Romania. The first model determines a susceptibility index by combining the heuristic approach with GIS techniques of spatial data analysis. The criteria used for quantifying each susceptibility factor and the expression used to determine the susceptibility index are taken from the Romanian legislation (Governmental Decision 447/2003). This procedure is followed in any Romanian state-ordered study which relies on financial support. The logistic regression model predicts the spatial distribution of landslides by statistically calculating regressive coefficients which describe the dependency of previously mapped landslides on different factors. The identified shallow landslides correspond generally to Pannonian marl and Quaternary contractile clay deposits. The study region is located in the Northwestern part of Romania, including the Baia Mare municipality, the capital of Maramureš County. The study focuses on the former piedmontal region situated to the south of the volcanic mountains Gutâi, in the Baia Mare Depression, where most of the landslide activity has been recorded. In addition, a narrow sector of the volcanic mountains which borders the city of Baia Mare to the north has also been included to test the accuracy of the models in different lithologic units. The results of both models indicate a general medium landslide susceptibility of the study area. The more detailed differences will be discussed with respect to the advantages and

  5. Comprehensible Predictive Modeling Using Regularized Logistic Regression and Comorbidity Based Features

    PubMed Central

    Stiglic, Gregor; Povalej Brzan, Petra; Fijacko, Nino; Wang, Fei; Delibasic, Boris; Kalousis, Alexandros; Obradovic, Zoran

    2015-01-01

    Different studies have demonstrated the importance of comorbidities to better understand the origin and evolution of medical complications. This study focuses on improvement of the predictive model interpretability based on simple logical features representing comorbidities. We use group lasso based feature interaction discovery followed by a post-processing step, where simple logic terms are added. In the final step, we reduce the feature set by applying lasso logistic regression to obtain a compact set of non-zero coefficients that represent a more comprehensible predictive model. The effectiveness of the proposed approach was demonstrated on a pediatric hospital discharge dataset that was used to build a readmission risk estimation model. The evaluation of the proposed method demonstrates a reduction of the initial set of features in a regression model by 72%, with a slight improvement in the Area Under the ROC Curve metric from 0.763 (95% CI: 0.755–0.771) to 0.769 (95% CI: 0.761–0.777). Additionally, our results show improvement in comprehensibility of the final predictive model using simple comorbidity based terms for logistic regression. PMID:26645087

  6. A comparison of discriminant analysis and logistic regression for the prediction of coliform mastitis in dairy cows.

    PubMed Central

    Montgomery, M E; White, M E; Martin, S W

    1987-01-01

    Results from discriminant analysis and logistic regression were compared using two data sets from a study on predictors of coliform mastitis in dairy cows. Both techniques selected the same set of variables as important predictors and were of nearly equal value in classifying cows as having, or not having mastitis. The logistic regression model made fewer classification errors. The magnitudes of the effects were considerably different for some variables. Given the failure to meet the underlying assumptions of discriminant analysis, the coefficients from logistic regression are preferable. PMID:3453271

  7. A logistic regression equation for estimating the probability of a stream in Vermont having intermittent flow

    USGS Publications Warehouse

    Olson, Scott A.; Brouillette, Michael C.

    2006-01-01

    A logistic regression equation was developed for estimating the probability of a stream flowing intermittently at unregulated, rural stream sites in Vermont. These determinations can be used for a wide variety of regulatory and planning efforts at the Federal, State, regional, county and town levels, including such applications as assessing fish and wildlife habitats, wetlands classifications, recreational opportunities, water-supply potential, waste-assimilation capacities, and sediment transport. The equation will be used to create a derived product for the Vermont Hydrography Dataset having the streamflow characteristic of 'intermittent' or 'perennial.' The Vermont Hydrography Dataset is Vermont's implementation of the National Hydrography Dataset and was created at a scale of 1:5,000 based on statewide digital orthophotos. The equation was developed by relating field-verified perennial or intermittent status of a stream site during normal summer low-streamflow conditions in the summer of 2005 to selected basin characteristics of naturally flowing streams in Vermont. The database used to develop the equation included 682 stream sites with drainage areas ranging from 0.05 to 5.0 square miles. When the 682 sites were observed, 126 were intermittent (had no flow at the time of the observation) and 556 were perennial (had flowing water at the time of the observation). The results of the logistic regression analysis indicate that the probability of a stream having intermittent flow in Vermont is a function of drainage area, elevation of the site, the ratio of basin relief to basin perimeter, and the areal percentage of well- and moderately well-drained soils in the basin. Using a probability cutpoint (a lower probability indicates the site has perennial flow and a higher probability indicates the site has intermittent flow) of 0.5, the logistic regression equation correctly predicted the perennial or intermittent status of 116 test sites 85 percent of the time.
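    The cutpoint rule described in this abstract turns the fitted probability into a perennial/intermittent label. The sketch below illustrates only that step. The linear predictor is passed in as a single number because the fitted coefficients are not given in the abstract, and treating a probability of exactly 0.5 as perennial is an assumption made here.

```python
import math


def logistic(z):
    """Numerically stable inverse logit: 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def classify_stream(linear_predictor, cutpoint=0.5):
    """Label a site from its linear predictor (intercept plus weighted
    basin characteristics). Probabilities above the cutpoint are labeled
    intermittent, all others perennial."""
    p = logistic(linear_predictor)
    label = "intermittent" if p > cutpoint else "perennial"
    return label, p


# Hypothetical linear-predictor values, not derived from the report.
for z in (-2.0, 0.3, 1.5):
    label, p = classify_stream(z)
    print(f"z = {z:+.1f}  P(intermittent) = {p:.3f}  ->  {label}")
```

    With a cutpoint of 0.5 the rule reduces to checking the sign of the linear predictor. A different cutpoint would shift the balance between sites misclassified as intermittent and sites misclassified as perennial.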

  8. A secure distributed logistic regression protocol for the detection of rare adverse drug events

    PubMed Central

    El Emam, Khaled; Samet, Saeed; Arbuckle, Luk; Tamblyn, Robyn; Earle, Craig; Kantarcioglu, Murat

    2013-01-01

    Background: There is limited capacity to assess the comparative risks of medications after they enter the market. For rare adverse events, the pooling of data from multiple sources is necessary to have the power and sufficient population heterogeneity to detect differences in safety and effectiveness in genetic, ethnic and clinically defined subpopulations. However, combining datasets from different data custodians or jurisdictions to perform an analysis on the pooled data creates significant privacy concerns that would need to be addressed. Existing protocols for addressing these concerns can result in reduced analysis accuracy and can allow sensitive information to leak. Objective: To develop a secure distributed multi-party computation protocol for logistic regression that provides strong privacy guarantees. Methods: We developed a secure distributed logistic regression protocol using a single analysis center with multiple sites providing data. A theoretical security analysis demonstrates that the protocol is robust to plausible collusion attacks and does not allow the parties to gain new information from the data that are exchanged among them. The computational performance and accuracy of the protocol were evaluated on simulated datasets. Results: The computational performance scales linearly as the dataset sizes increase. The addition of sites results in an exponential growth in computation time. However, for up to five sites, the time is still short and would not affect practical applications. The model parameters are the same as the results on pooled raw data analyzed in SAS, demonstrating high model accuracy. Conclusion: The proposed protocol and prototype system would allow the development of logistic regression models in a secure manner without requiring the sharing of personal health information. This can alleviate one of the key barriers to the establishment of large-scale post-marketing surveillance programs. We extended the secure protocol to account for

  9. Mining pharmacovigilance data using Bayesian logistic regression with James-Stein type shrinkage estimation.

    PubMed

    An, Lihua; Fung, Karen Y; Krewski, Daniel

    2010-09-01

    Spontaneous adverse event reporting systems are widely used to identify adverse reactions to drugs following their introduction into the marketplace. In this article, a James-Stein type shrinkage estimation strategy was developed in a Bayesian logistic regression model to analyze pharmacovigilance data. This method is effective in detecting signals as it combines information and borrows strength across medically related adverse events. Computer simulation demonstrated that the shrinkage estimator is uniformly better than the maximum likelihood estimator in terms of mean squared error. This method was used to investigate the possible association of a series of diabetic drugs and the risk of cardiovascular events using data from the Canada Vigilance Online Database.

  10. Semi-parametric estimation of random effects in a logistic regression model using conditional inference.

    PubMed

    Petersen, Jørgen Holm

    2016-01-15

    This paper describes a new approach to the estimation in a logistic regression model with two crossed random effects where special interest is in estimating the variance of one of the effects while not making distributional assumptions about the other effect. A composite likelihood is studied. For each term in the composite likelihood, a conditional likelihood is used that eliminates the influence of the random effects, which results in a composite conditional likelihood consisting of only one-dimensional integrals that may be solved numerically. Good properties of the resulting estimator are described in a small simulation study.

  11. Analyzing Beijing's in-use vehicle emissions test results using logistic regression.

    PubMed

    Chang, Cheng; Ortolano, Leonard

    2008-10-01

    A logistic regression model was built using vehicle emissions test data collected in 2003 for 129 604 motor vehicles in Beijing. The regression model uses vehicle model, model year, inspection station, ownership, and vehicle registration area as covariates to predict the probability that a vehicle fails an annual emissions test on the first try. Vehicle model is the most influential predictor variable: some vehicle models are much more likely to fail in emissions tests than an "average" vehicle. Five out of 14 vehicle models that performed the worst (out of a total of 52 models) were manufactured by foreign companies or by their joint ventures with Chinese enterprises. These 14 vehicle model types may have failed at relatively high rates because of design and manufacturing deficiencies, and such deficiencies cannot be easily detected and corrected without further efforts, such as programs for in-use surveillance and vehicle recall.

  12. Comparative study of biodegradability prediction of chemicals using decision trees, functional trees, and logistic regression.

    PubMed

    Chen, Guangchao; Li, Xuehua; Chen, Jingwen; Zhang, Ya-Nan; Peijnenburg, Willie J G M

    2014-12-01

    Biodegradation is the principal environmental dissipation process of chemicals. As such, it is a dominant factor determining the persistence and fate of organic chemicals in the environment, and is therefore of critical importance to chemical management and regulation. In the present study, the authors developed in silico methods assessing biodegradability based on a large heterogeneous set of 825 organic compounds, using the techniques of the C4.5 decision tree, the functional inner regression tree, and logistic regression. External validation was subsequently carried out by 2 independent test sets of 777 and 27 chemicals. As a result, the functional inner regression tree exhibited the best predictability with predictive accuracies of 81.5% and 81.0%, respectively, on the training set (825 chemicals) and test set I (777 chemicals). Performance of the developed models on the 2 test sets was subsequently compared with that of the Estimation Program Interface (EPI) Suite Biowin 5 and Biowin 6 models, which also showed a better predictability of the functional inner regression tree model. The model built in the present study exhibits a reasonable predictability compared with existing models while possessing a transparent algorithm. Interpretation of the mechanisms of biodegradation was also carried out based on the models developed.

  13. Simultaneous confidence bands for log-logistic regression with applications in risk assessment.

    PubMed

    Kerns, Lucy X

    2017-01-27

    In risk assessment, it is often desired to make inferences on the low dose levels at which a specific benchmark risk is attained. Applications of simultaneous hyperbolic confidence bands for low-dose risk estimation with quantal data under different dose-response models (multistage, Abbott-adjusted Weibull, and Abbott-adjusted log-logistic models) have appeared in the literature. The use of simultaneous three-segment bands under the multistage model has also been proposed recently. In this article, we present explicit formulas for constructing asymptotic one-sided simultaneous hyperbolic and three-segment bands for the simple log-logistic regression model. We use the simultaneous construction to estimate upper hyperbolic and three-segment confidence bands on extra risk and to obtain lower limits on the benchmark dose by inverting the upper bands on risk under the Abbott-adjusted log-logistic model. Monte Carlo simulations evaluate the characteristics of the simultaneous limits. An example is given to illustrate the use of the proposed methods and to compare the two types of simultaneous limits at very low dose levels.

  14. Sub-resolution assist feature (SRAF) printing prediction using logistic regression

    NASA Astrophysics Data System (ADS)

    Tan, Chin Boon; Koh, Kar Kit; Zhang, Dongqing; Foong, Yee Mei

    2015-03-01

    In optical proximity correction (OPC), the sub-resolution assist feature (SRAF) has been used to enhance the process window of main structures. However, the printing of SRAF on wafer is undesirable, as it may degrade the overall process yield if it is transferred into the final pattern. A reasonably accurate prediction model is needed during OPC to ensure that the SRAF placement and size carry no risk of SRAF printing. Current common practice in OPC is to use either the main OPC model or a model threshold adjustment (MTA) solution to predict SRAF printing. This paper studies the feasibility of SRAF printing prediction using logistic regression (LR). Logistic regression is a probabilistic classification model that gives discrete binary outputs after receiving sufficient input variables from SRAF printing conditions. In the application of SRAF printing prediction, the binary outputs can be treated as 1 for SRAF-Printing and 0 for No-SRAF-Printing. The experimental work was performed using a 20 nm line/space process layer. The results demonstrate that the LR approach predicts SRAF printing more accurately than the MTA solution. The LR approach achieved overall error rates as low as 2% in calibration and 5% in verification, compared to 6% in calibration and 15% in verification for the MTA solution. In addition, the performance of the LR approach was found to be relatively independent and consistent across different resist image planes compared to the MTA solution.
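
    The binary coding described in this abstract (1 for printing, 0 for no printing) can be illustrated with a minimal sketch. The abstract does not report the input variables or fitted coefficients, so the weights, intercept, feature values and the 0.5 decision threshold below are all hypothetical and serve only to show how a logistic model turns a linear score into a probability and then into a label.

```python
import math


def sigmoid(z):
    """Logistic function: maps a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_sraf_printing(features, weights, intercept, threshold=0.5):
    """Return (probability, label); label 1 = SRAF printing, 0 = no SRAF printing."""
    score = intercept + sum(w * x for w, x in zip(weights, features))
    probability = sigmoid(score)
    return probability, int(probability >= threshold)


# Hypothetical coefficients and inputs, not taken from the paper.
weights = [2.0, -1.5]
intercept = -0.5
p, label = predict_sraf_printing([1.0, 0.2], weights, intercept)
print(f"P(printing) = {p:.3f}, label = {label}")
```

    In practice the threshold would be tuned on calibration data rather than fixed at 0.5.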

  15. Predicting students' success at pre-university studies using linear and logistic regressions

    NASA Astrophysics Data System (ADS)

    Suliman, Noor Azizah; Abidin, Basir; Manan, Norhafizah Abdul; Razali, Ahmad Mahir

    2014-09-01

    The study aims to find the most suitable model for predicting students' success at the medical pre-university studies, Centre for Foundation in Science, Languages and General Studies of Cyberjaya University College of Medical Sciences (CUCMS). The predictors under investigation were the national high school exit examination (Sijil Pelajaran Malaysia, SPM) achievements such as Biology, Chemistry, Physics, Additional Mathematics, Mathematics, English and Bahasa Malaysia results, as well as gender and high school background factors. The outcomes showed that there is a significant difference in the final CGPA, Biology and Mathematics subjects at pre-university by gender, and in the Mathematics subject by high school background. In general, the correlation between the academic achievements at the high school and medical pre-university is moderately significant at α-level of 0.05, except for language subjects. It was also found that logistic regression techniques gave better prediction models than the multiple linear regression technique for this data set. The developed logistic models gave probabilities that closely matched the observed outcomes. Hence, they could be used to identify successful students who are qualified to enter the CUCMS medical faculty before accepting any students to its foundation program.

  16. Application of ordinal logistic regression analysis in determining risk factors of child malnutrition in Bangladesh

    PubMed Central

    2011-01-01

    Background: The study attempts to develop an ordinal logistic regression (OLR) model to identify the determinants of child malnutrition, instead of developing a traditional binary logistic regression (BLR) model, using the data of the Bangladesh Demographic and Health Survey 2004. Methods: Based on the weight-for-age anthropometric index (Z-score), child nutrition status is categorized into three groups: severely undernourished (< -3.0), moderately undernourished (-3.0 to -2.01) and nourished (≥-2.0). Since nutrition status is ordinal, an OLR model, the proportional odds model (POM), can be developed instead of two separate BLR models to find predictors of both malnutrition and severe malnutrition if the proportional odds assumption is satisfied. The assumption is satisfied with low p-value (0.144) due to violation of the assumption for one co-variate. So partial proportional odds model (PPOM) and two BLR models have also been developed to check the applicability of the OLR model. Graphical test has also been adopted for checking the proportional odds assumption. Results: All the models determine that age of child, birth interval, mothers' education, maternal nutrition, household wealth status, child feeding index, and incidence of fever, ARI & diarrhoea were the significant predictors of child malnutrition; however, results of PPOM were more precise than those of other models. Conclusion: These findings clearly justify that OLR models (POM and PPOM) are appropriate to find predictors of malnutrition instead of BLR models. PMID:22082256
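
    The three-level coding of the weight-for-age Z-score given in this abstract can be sketched as a small function. The cut-offs (-3.0 and -2.0) come from the abstract; the function name, the integer codes 0/1/2 and the treatment of every value in [-3.0, -2.0) as "moderately undernourished" (the abstract writes the upper bound as -2.01, presumably for two-decimal scores) are assumptions made for illustration.

```python
def nutrition_category(z_score):
    """Map a weight-for-age Z-score to an ordinal category code.

    0 = severely undernourished (< -3.0)
    1 = moderately undernourished (-3.0 up to, but not including, -2.0)
    2 = nourished (>= -2.0)
    """
    if z_score < -3.0:
        return 0
    if z_score < -2.0:
        return 1
    return 2


# Hypothetical Z-scores, for illustration only.
sample_scores = [-3.5, -3.0, -2.01, -2.0, 0.4]
codes = [nutrition_category(z) for z in sample_scores]
print(codes)
```

    An ordered response coded this way is what a proportional odds model would take as its dependent variable.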

  17. An application of a microcomputer compiler program to multiple logistic regression analysis.

    PubMed

    Sakai, R

    1988-01-01

    Microcomputer programs for multiple logistic regression analysis were written in BASIC language to determine the usefulness of microcomputers for multivariate analysis, which is an important method in epidemiological studies. The program, carried out by an interpreter system, required a comparatively long computing time for a small amount of data. For example, it took approximately thirty minutes to compute the data of 6 independent variables and 63 matched sets of case and controls (1:4). The majority of the calculation time was spent computing a matrix. The matrix computation time increased cumulatively in proportion to additions in the number of subjects, and increased exponentially with the number of variables. A BASIC compiler was utilized for the program of multiple logistic regression analysis. The compiled program carried out the same computations as above, but within 4 minutes. Therefore, it is evident that a compiler can be an extremely convenient tool for computing multivariate analysis. The two programs produced here were also easily linked with spreadsheet packages to enter data.

  18. Ordinal logistic regression analysis on the nutritional status of children in KarangKitri village

    NASA Astrophysics Data System (ADS)

    Ohyver, Margaretha; Yongharto, Kimmy Octavian

    2015-09-01

    Ordinal logistic regression is a statistical technique that can be used to describe the relationship between an ordinal response variable and one or more independent variables. This method has been used in various fields, including health. In this research, ordinal logistic regression is used to describe the relationship between the nutritional status of children and age, gender, height, and family status. Nutritional status of children in this research is divided into over nutrition, well nutrition, less nutrition, and malnutrition. The purpose of this research is to describe the characteristics of children in KarangKitri village and to determine the factors that influence their nutritional status. Three findings were obtained from this research. First, there are still children who are not categorized as having good nutritional status. Second, some children from families with a sufficient economic level fall outside the normal status. Third, the factors that affect the nutritional level of children are age, family status, and height.

  19. Modeling data for pancreatitis in presence of a duodenal diverticula using logistic regression

    NASA Astrophysics Data System (ADS)

    Dineva, S.; Prodanova, K.; Mlachkova, D.

    2013-12-01

    The presence of a periampullary duodenal diverticulum (PDD) is often observed during upper digestive tract barium meal studies and endoscopic retrograde cholangiopancreatography (ERCP). A few papers have reported that the diverticulum is related to the incidence of pancreatitis. The aim of this study is to investigate whether the presence of duodenal diverticula predisposes to the development of pancreatic disease. A total of 3966 patients who had undergone ERCP were studied retrospectively. They were divided into 2 groups: with and without PDD. Patients with a duodenal diverticulum had a higher rate of acute pancreatitis. The duodenal diverticulum is a risk factor for acute idiopathic pancreatitis. A multiple logistic regression was performed to obtain adjusted estimates of odds and to identify whether a PDD is a predictor of acute or chronic pancreatitis. The software package STATISTICA 10.0 was used for analyzing the real data.

  20. Binary logistic regression modelling: Measuring the probability of relapse cases among drug addict

    NASA Astrophysics Data System (ADS)

    Ismail, Mohd Tahir; Alias, Siti Nor Shadila

    2014-07-01

    For many years Malaysia has faced drug addiction issues. The most serious is the relapse phenomenon among treated drug addicts (drug addicts who have undergone the rehabilitation programme at the Narcotic Addiction Rehabilitation Centre, PUSPEN). Thus, the main objective of this study is to find the most significant factors that contribute to relapse. Binary logistic regression analysis was employed to model the relationship between the independent variables (predictors) and the dependent variable. The dependent variable is the status of the drug addict: relapse (Yes, coded as 1) or not (No, coded as 0). The predictors involved are age, age at first taking drugs, family history, education level, family crisis, community support and self motivation. The sample comprises 200 cases, with data provided by AADK (National Antidrug Agency). The study found that age and self motivation are statistically significant for relapse.

  1. Using GIS and logistic regression to estimate agricultural chemical concentrations in rivers of the midwestern USA

    USGS Publications Warehouse

    Battaglin, W.A.

    1996-01-01

    Agricultural chemicals (herbicides, insecticides, other pesticides and fertilizers) in surface water may constitute a human health risk. Recent research on unregulated rivers in the midwestern USA documents that elevated concentrations of herbicides occur for 1-4 months following application in spring and early summer. In contrast, nitrate concentrations in unregulated rivers are elevated during the fall, winter and spring. Natural and anthropogenic variables of river drainage basins, such as soil permeability, the amount of agricultural chemicals applied or percentage of land planted in corn, affect agricultural chemical concentrations in rivers. Logistic regression (LGR) models are used to investigate relations between various drainage basin variables and the concentration of selected agricultural chemicals in rivers. The method is successful in contributing to the understanding of agricultural chemical concentration in rivers. Overall accuracies of the best LGR models, defined as the number of correct classifications divided by the number of attempted classifications, averaged about 66%.
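
    The accuracy measure defined in this abstract (correct classifications divided by attempted classifications) can be written out directly. The label lists below are hypothetical and are not data from the study; they only show the arithmetic.

```python
def overall_accuracy(actual, predicted):
    """Number of correct classifications divided by number of attempted classifications."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one classification is required")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


# Hypothetical labels: 1 = concentration above a chosen level, 0 = below.
actual = [1, 0, 1, 1, 0, 0]
predicted = [1, 0, 0, 1, 0, 1]
acc = overall_accuracy(actual, predicted)
print(f"overall accuracy = {acc:.2f}")
```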

  2. Confidence interval of difference of proportions in logistic regression in presence of covariates.

    PubMed

    Reeve, Russell

    2016-03-16

    Comparison of treatment differences in incidence rates is an important objective of many clinical trials. However, often the proportion is affected by covariates, and the adjustment of the predicted proportion is made using logistic regression. It is desirable to estimate the treatment differences in proportions adjusting for the covariates, similarly to the comparison of adjusted means in analysis of variance. Because of the correlation between the point estimates in the different treatment groups, the standard methods for constructing confidence intervals are inadequate. The problem is more difficult in the binary case, as the comparison is not uniquely defined, and the sampling distribution more difficult to analyze. Four procedures for analyzing the data are presented, which expand upon existing methods and generalize the link function. It is shown that, among the four methods studied, the resampling method based on the exact distribution function yields a coverage rate closest to the nominal.

  3. A modified approach to estimating sample size for simple logistic regression with one continuous covariate.

    PubMed

    Novikov, I; Fund, N; Freedman, L S

    2010-01-15

    Different methods for the calculation of sample size for simple logistic regression (LR) with one normally distributed continuous covariate give different results. Sometimes the difference can be large. Furthermore, some methods require the user to specify the prevalence of cases when the covariate equals its population mean, rather than the more natural population prevalence. We focus on two commonly used methods and show through simulations that the power for a given sample size may differ substantially from the nominal value for one method, especially when the covariate effect is large, while the other method performs poorly if the user provides the population prevalence instead of the required parameter. We propose a modification of the method of Hsieh et al. that requires specification of the population prevalence and that employs Schouten's sample size formula for a t-test with unequal variances and group sizes. This approach appears to increase the accuracy of the sample size estimates for LR with one continuous covariate.

  4. Application of Logistic Regression for Landslide Susceptibility Mapping in the Alishan Area, Southern Taiwan

    NASA Astrophysics Data System (ADS)

    Chan, H. C.; Chang, C. C.; Laio, P. Y.

    2012-04-01

    Landslide susceptibility analysis usually combines several factors, including the terrain, geology, and hydrology. The analysis tries to find a suitable combination of these factors in order to establish a landslide susceptibility model and calculate the susceptibility value. A potential landslide map can be established by using the calculated landslide susceptibility value. This study took the Alishan area as an example and aimed to assess landslide susceptibility by logistic regression, a multivariate analysis method. In order to select the factors efficiently, a calibration and selection procedure was performed. The results were verified against a previous typhoon event. The classification error matrix was used to evaluate the accuracy of the landslides predicted by the present model. Finally, this study applied precipitation with 10-, 25-, 50-, and 100-year return periods to estimate the susceptibility values for the study area. The landslide susceptibilities were separated into four levels (high, medium-high, medium, and low) to delineate the map of potential landslides.

  5. Maximum likelihood training of connectionist models: comparison with least squares back-propagation and logistic regression.

    PubMed Central

    Spackman, K. A.

    1991-01-01

    This paper presents maximum likelihood back-propagation (ML-BP), an approach to training neural networks. The widely reported original approach uses least squares back-propagation (LS-BP), minimizing the sum of squared errors (SSE). Unfortunately, least squares estimation does not give a maximum likelihood (ML) estimate of the weights in the network. Logistic regression, on the other hand, gives ML estimates for single layer linear models only. This report describes how to obtain ML estimates of the weights in a multi-layer model, and compares LS-BP to ML-BP using several examples. It shows that in many neural networks, least squares estimation gives inferior results and should be abandoned in favor of maximum likelihood estimation. Questions remain about the potential uses of multi-level connectionist models in such areas as diagnostic systems and risk-stratification in outcomes research. PMID:1807606

  6. Logistic regression analysis of pedestrian casualty risk in passenger vehicle collisions in China.

    PubMed

    Kong, Chunyu; Yang, Jikuang

    2010-07-01

    A large number of pedestrian fatalities have been reported in China since the 1990s; however, the exposure of pedestrians in public traffic has never been measured quantitatively using in-depth accident data. This study aimed to investigate the association between the impact speed and risk of pedestrian casualties in passenger vehicle collisions based on real-world accident cases in China. The cases were selected from a database of in-depth investigation of vehicle accidents in Changsha-IVAC. The sampling criteria were defined as (1) the accident was a frontal impact that occurred between 2003 and 2009; (2) the pedestrian age was above 14; (3) the injury according to the Abbreviated Injury Scale (AIS) was 1+; (4) the accident involved passenger cars, SUVs, or MPVs; and (5) the vehicle impact speed can be determined. The selected IVAC data set, which included 104 pedestrian accident cases, was weighted based on the national traffic accident data. The logistic regression models of the risks for pedestrian fatalities and AIS 3+ injuries were developed in terms of vehicle impact speed using the unweighted and weighted data sets. A multiple logistic regression model on the risk of pedestrian AIS 3+ injury was developed considering the age and impact speed as two variables. It was found that the risk of pedestrian fatality is 26% at 50 km/h, 50% at 58 km/h, and 82% at 70 km/h. At an impact speed of 80 km/h, the pedestrian rarely survives. The weighted risk curves indicated that the risks of pedestrian fatality and injury in China were higher than those in other high-income countries, whereas the risks of pedestrian casualty were lower than in these countries 30 years ago. The findings could contribute to a better understanding of the exposures of pedestrians in urban traffic in China, and provide background knowledge for the development of strategies for pedestrian protection.
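
    The three fatality risks quoted in this abstract can be checked for mutual consistency with a single-predictor logistic curve. The sketch below does not use the authors' fitted coefficients, which the abstract does not give; it back-calculates an intercept and slope from two of the reported points (26% at 50 km/h and 50% at 58 km/h) and then evaluates the curve at 70 km/h. Function names are hypothetical, and whether the quoted figures come from the weighted or unweighted model is not stated.

```python
import math


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def fit_two_points(v1, p1, v2, p2):
    """Intercept and slope of logit(p) = intercept + slope * v through two points."""
    slope = (logit(p2) - logit(p1)) / (v2 - v1)
    intercept = logit(p1) - slope * v1
    return intercept, slope


def fatality_risk(speed_kmh, intercept, slope):
    """Probability of fatality at a given impact speed under the logistic curve."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * speed_kmh)))


# Back-calculated from two points reported in the abstract (not the authors' estimates).
intercept, slope = fit_two_points(50.0, 0.26, 58.0, 0.50)
risk_70 = fatality_risk(70.0, intercept, slope)
print(f"slope = {slope:.4f} per km/h, risk at 70 km/h = {risk_70:.3f}")
```

    Under these assumptions the curve gives a risk of roughly 0.83 at 70 km/h, close to the 82% reported, so the three figures are consistent with a logistic model in impact speed alone.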

  7. Modeling Group Size and Scalar Stress by Logistic Regression from an Archaeological Perspective

    PubMed Central

    Alberti, Gianmarco

    2014-01-01

    Johnson’s scalar stress theory, describing the mechanics of (and the remedies to) the increase in in-group conflictuality that parallels the increase in groups’ size, provides scholars with a useful theoretical framework for the understanding of different aspects of the material culture of past communities (i.e., social organization, communal food consumption, ceramic style, architecture and settlement layout). Due to its relevance in archaeology and anthropology, the article aims at proposing a predictive model of critical level of scalar stress on the basis of community size. Drawing upon Johnson’s theory and on Dunbar’s findings on the cognitive constraints to human group size, a model is built by means of Logistic Regression on the basis of the data on colony fissioning among the Hutterites of North America. On the grounds of the theoretical framework sketched in the first part of the article, the absence or presence of colony fissioning is considered an expression of not critical vs. critical level of scalar stress for the sake of the model building. The model, which is also tested against a sample of archaeological and ethnographic cases: a) confirms the existence of a significant relationship between critical scalar stress and group size, setting the issue on firmer statistical grounds; b) allows calculating the intercept and slope of the logistic regression model, which can be used at any time to estimate the probability that a community experienced a critical level of scalar stress; c) allows locating a critical scalar stress threshold at community size 127 (95% CI: 122–132), while the maximum probability of critical scalar stress is predicted at size 158 (95% CI: 147–170). The model ultimately provides grounds to assess, for the sake of any further archaeological/anthropological interpretation, the probability that a group reached a hot spot of size development critical for its internal cohesion. PMID:24626241

  8. Optimization of Game Formats in U-10 Soccer Using Logistic Regression Analysis

    PubMed Central

    Amatria, Mario; Arana, Javier; Anguera, M. Teresa; Garzón, Belén

    2016-01-01

    Small-sided games provide young soccer players with better opportunities to develop their skills and progress as individual and team players. There is, however, little evidence on the effectiveness of different game formats in different age groups, and furthermore, these formats can vary between and even within countries. The Royal Spanish Soccer Association replaced the traditional grassroots 7-a-side format (F-7) with the 8-a-side format (F-8) in the 2011-12 season and the country’s regional federations gradually followed suit. The aim of this observational methodology study was to investigate which of these formats best suited the learning needs of U-10 players transitioning from 5-a-side futsal. We built a multiple logistic regression model to predict the success of offensive moves depending on the game format and the area of the pitch in which the move was initiated. Success was defined as a shot at the goal. We also built two simple logistic regression models to evaluate how the game format influenced the acquisition of technical-tactical skills. It was found that the probability of a shot at the goal was higher in F-7 than in F-8 for moves initiated in the Creation Sector-Own Half (0.08 vs 0.07) and the Creation Sector-Opponent's Half (0.18 vs 0.16). The probability was the same (0.04) in the Safety Sector. Children also had more opportunities to control the ball and pass or take a shot in the F-7 format (0.24 vs 0.20), and these were also more likely to be successful in this format (0.28 vs 0.19). PMID:28031768

  9. Modeling group size and scalar stress by logistic regression from an archaeological perspective.

    PubMed

    Alberti, Gianmarco

    2014-01-01

    Johnson's scalar stress theory, describing the mechanics of (and the remedies to) the increase in in-group conflictuality that parallels the increase in groups' size, provides scholars with a useful theoretical framework for the understanding of different aspects of the material culture of past communities (i.e., social organization, communal food consumption, ceramic style, architecture and settlement layout). Due to its relevance in archaeology and anthropology, the article aims at proposing a predictive model of critical level of scalar stress on the basis of community size. Drawing upon Johnson's theory and on Dunbar's findings on the cognitive constraints to human group size, a model is built by means of Logistic Regression on the basis of the data on colony fissioning among the Hutterites of North America. On the grounds of the theoretical framework sketched in the first part of the article, the absence or presence of colony fissioning is considered an expression of not critical vs. critical level of scalar stress for the sake of the model building. The model, which is also tested against a sample of archaeological and ethnographic cases: a) confirms the existence of a significant relationship between critical scalar stress and group size, setting the issue on firmer statistical grounds; b) allows calculating the intercept and slope of the logistic regression model, which can be used at any time to estimate the probability that a community experienced a critical level of scalar stress; c) allows locating a critical scalar stress threshold at community size 127 (95% CI: 122-132), while the maximum probability of critical scalar stress is predicted at size 158 (95% CI: 147-170). The model ultimately provides grounds to assess, for the sake of any further archaeological/anthropological interpretation, the probability that a group reached a hot spot of size development critical for its internal cohesion.

  10. Genomic-Enabled Prediction of Ordinal Data with Bayesian Logistic Ordinal Regression

    PubMed Central

    Montesinos-López, Osval A.; Montesinos-López, Abelardo; Crossa, José; Burgueño, Juan; Eskridge, Kent

    2015-01-01

    Most genomic-enabled prediction models developed so far assume that the response variable is continuous and normally distributed. The exception is the probit model, developed for ordered categorical phenotypes. In statistical applications, because of the easy implementation of the Bayesian probit ordinal regression (BPOR) model, Bayesian logistic ordinal regression (BLOR) is implemented rarely in the context of genomic-enabled prediction [sample size (n) is much smaller than the number of parameters (p)]. For this reason, in this paper we propose a BLOR model using the Pólya-Gamma data augmentation approach that produces a Gibbs sampler with similar full conditional distributions of the BPOR model and with the advantage that the BPOR model is a particular case of the BLOR model. We evaluated the proposed model by using simulation and two real data sets. Results indicate that our BLOR model is a good alternative for analyzing ordinal data in the context of genomic-enabled prediction with the probit or logit link. PMID:26290569

  11. Power comparisons between similarity-based multilocus association methods, logistic regression, and score tests for haplotypes.

    PubMed

    Lin, Wan-Yu; Schaid, Daniel J

    2009-04-01

    Recently, a genomic distance-based regression for multilocus associations was proposed (Wessel and Schork [2006] Am. J. Hum. Genet. 79:792-806) in which either locus or haplotype scoring can be used to measure genetic distance. Although it allows various measures of genomic similarity and simultaneous analyses of multiple phenotypes, its power relative to other methods for case-control analyses is not well known. We compare the power of traditional methods with this new distance-based approach, for both locus-scoring and haplotype-scoring strategies. We discuss the relative power of these association methods with respect to five properties: (1) the marker informativity; (2) the number of markers; (3) the causal allele frequency; (4) the preponderance of the most common high-risk haplotype; (5) the correlation between the causal single-nucleotide polymorphism (SNP) and its flanking markers. We found that locus-based logistic regression and the global score test for haplotypes suffered from power loss when many markers were included in the analyses, due to many degrees of freedom. In contrast, the distance-based approach was not as vulnerable to more markers or more haplotypes. A genotype counting measure was more sensitive to the marker informativity and the correlation between the causal SNP and its flanking markers. After examining the impact of the five properties on power, we found that on average, the genomic distance-based regression that uses a matching measure for diplotypes was the most powerful and robust method among the seven methods we compared.

  12. Logistic regression and artificial neural network models for mapping of regional-scale landslide susceptibility in volcanic mountains of West Java (Indonesia)

    NASA Astrophysics Data System (ADS)

    Ngadisih, Bhandary, Netra P.; Yatabe, Ryuichi; Dahal, Ranjan K.

    2016-05-01

    West Java Province is the most landslide-prone area in Indonesia owing to extreme geo-morphological conditions, climatic conditions and densely populated settlements with immense completed and ongoing development activities. A landslide susceptibility map at regional scale in this province is therefore a fundamental tool for risk management and land-use planning. Logistic regression and Artificial Neural Network (ANN) models are the most frequently used tools for landslide susceptibility assessment, mainly because they are capable of handling the nature of landslide data. The main objective of this study is to apply logistic regression and ANN models and compare their performance for landslide susceptibility mapping in volcanic mountains of West Java Province. In addition, the model application is proposed to identify the factors contributing most to landslide events in the study area. The spatial database built in a GIS platform consists of a landslide inventory, four topographical parameters (slope, aspect, relief, distance to river), three geological parameters (distance to volcano crater, distance to thrust and fault, geological formation), and two anthropogenic parameters (distance to road, land use). The logistic regression model in this study revealed that slope, geological formation, distance to road and distance to volcano are the most influential factors of landslide events, while the ANN model revealed that distance to volcano crater, geological formation, distance to road, and land use are the most important causal factors of landslides in the study area. Moreover, an evaluation of the models showed that the ANN model has a higher accuracy than the logistic regression model.

  13. Feature Selection and Cancer Classification via Sparse Logistic Regression with the Hybrid L1/2 +2 Regularization

    PubMed Central

    Huang, Hai-Hui; Liu, Xiao-Ying; Liang, Yong

    2016-01-01

    Cancer classification and feature (gene) selection play an important role in knowledge discovery in genomic data. Although logistic regression is one of the most popular classification methods, it does not induce feature selection. In this paper, we present a new hybrid L1/2 +2 regularization (HLR) function, a linear combination of L1/2 and L2 penalties, to select the relevant genes in logistic regression. The HLR approach inherits some attractive characteristics from the L1/2 penalty (sparsity) and the L2 penalty (a grouping effect, where highly correlated variables enter or leave a model together). We also propose a novel univariate HLR thresholding approach to update the estimated coefficients and develop a coordinate descent algorithm for the HLR penalized logistic regression model. The empirical results and simulations indicate that the proposed method is highly competitive amongst several state-of-the-art methods. PMID:27136190

  14. Multiple logistic regression model of signalling practices of drivers on urban highways

    NASA Astrophysics Data System (ADS)

    Puan, Othman Che; Ibrahim, Muttaka Na'iya; Zakaria, Rozana

    2015-05-01

    Signalling is a way of informing other road users, especially conflicting drivers, of a driver's intention to change his/her course of movement. Other users are exposed to hazardous situations and the risk of accident if a driver who changes course fails to signal as required. This paper describes the application of a logistic regression model to the analysis of drivers' signalling practices on multilane highways, based on factors that may affect a driver's decision, such as the driver's gender, vehicle type, vehicle speed and traffic flow intensity. Data on these factors were collected manually. More than 2000 drivers who performed a lane-changing manoeuvre while driving on two sections of multilane highway were observed. The findings show that a relatively large proportion of drivers failed to give any signal when changing lanes. The analysis indicates that, although the proportion of drivers who failed to signal prior to a lane-changing manoeuvre is high, compliance among female drivers is better than among male drivers. A binary logistic model was developed to represent the probability that a driver signals prior to a lane-changing manoeuvre. The model indicates that the driver's gender, the type of vehicle driven, vehicle speed and traffic volume influence the driver's decision to signal prior to a lane-changing manoeuvre on a multilane urban highway. In terms of the type of vehicle driven, about 97% of motorcyclists failed to comply with the signalling requirement. The proportion of non-compliant drivers under stable traffic flow conditions is much higher than when the flow is relatively heavy. This is consistent with the data, which indicate a high degree of non-compliance when the average speed of the traffic stream is relatively high.

  15. Sample size matters: Investigating the optimal sample size for a logistic regression debris flow susceptibility model

    NASA Astrophysics Data System (ADS)

    Heckmann, Tobias; Gegg, Katharina; Becht, Michael

    2013-04-01

    Statistical approaches to landslide susceptibility modelling on the catchment and regional scale are used very frequently compared to heuristic and physically based approaches. In the present study, we deal with the problem of the optimal sample size for a logistic regression model. More specifically, a stepwise approach has been chosen in order to select those independent variables (from a number of derivatives of a digital elevation model and landcover data) that explain best the spatial distribution of debris flow initiation zones in two neighbouring central alpine catchments in Austria (used mutually for model calculation and validation). In order to minimise problems arising from spatial autocorrelation, we sample a single raster cell from each debris flow initiation zone within an inventory. In addition, as suggested by previous work using the "rare events logistic regression" approach, we take a sample of the remaining "non-event" raster cells. The recommendations given in the literature on the size of this sample appear to be motivated by practical considerations, e.g. the time and cost of acquiring data for non-event cases, which do not apply to the case of spatial data. In our study, we aim at finding empirically an "optimal" sample size in order to avoid two problems: First, a sample too large will violate the independent sample assumption as the independent variables are spatially autocorrelated; hence, a variogram analysis leads to a sample size threshold above which the average distance between sampled cells falls below the autocorrelation range of the independent variables. Second, if the sample is too small, repeated sampling will lead to very different results, i.e. the independent variables and hence the result of a single model calculation will be extremely dependent on the choice of non-event cells. Using a Monte-Carlo analysis with stepwise logistic regression, 1000 models are calculated for a wide range of sample sizes. For each sample size

  16. A logistic regression based approach for the prediction of flood warning threshold exceedance

    NASA Astrophysics Data System (ADS)

    Diomede, Tommaso; Trotter, Luca; Stefania Tesini, Maria; Marsigli, Chiara

    2016-04-01

    A method based on logistic regression is proposed for the prediction of river level threshold exceedance at short (+0-18h) and medium (+18-42h) lead times. The aim of the study is to provide a valuable tool for the issuing of warnings by the authority responsible for public safety in case of flood. The role of different precipitation periods as predictors for the exceedance of a fixed river level has been investigated, in order to derive significant information for flood forecasting. Based on catchment-averaged values, a separation of "antecedent" and "peak-triggering" rainfall amounts as independent variables is attempted. In particular, the following flood-related precipitation periods have been considered: (i) the period from 1 to n days before the forecast issue time, which may be relevant for the soil saturation, (ii) the last 24 hours, which may be relevant for the current water level in the river, and (iii) the period from 0 to x hours in advance with respect to the forecast issue time, when the flood-triggering precipitation generally occurs. Several combinations and values of these predictors have been tested to optimise the method implementation. In particular, the period for the precursor antecedent precipitation ranges between 5 and 45 days; the state of the river can be represented by the last 24-h precipitation or, as an alternative, by the current river level. The flood-triggering precipitation has been cumulated over the next 18 hours (for the short lead time) and 36-42 hours (for the medium lead time). The proposed approach requires a specific implementation of logistic regression for each river section and warning threshold. The method performance has been evaluated over the Santerno river catchment (about 450 km²) in the Emilia-Romagna Region, northern Italy. A statistical analysis in terms of false alarms, misses and related scores was carried out using an 8-year-long database. The results are quite satisfactory, with slightly better performances

  17. Geographical variation of unmet medical needs in Italy: a multivariate logistic regression analysis

    PubMed Central

    2013-01-01

    Background: Unmet health needs should be, in theory, a minor issue in Italy where a publicly funded and universally accessible health system exists. This, however, does not seem to be the case. Moreover, in the last two decades responsibilities for health care have been progressively decentralized to regional governments, which have differently organized health service delivery within their territories. Regional decision-making has affected the use of health care services, further increasing the existing geographical disparities in the access to care across the country. This study aims at comparing self-perceived unmet needs across Italian regions and assessing how the reported reasons, grouped into the categories of availability, accessibility and acceptability, vary geographically. Methods: Data from the 2006 Italian component of the European Union Statistics on Income and Living Conditions are employed to explore reasons and predictors of self-reported unmet medical needs among 45,175 Italian respondents aged 18 and over. Multivariate logistic regression models are used to determine adjusted rates for overall unmet medical needs and for each of the three categories of reasons. Results: Results show that, overall, 6.9% of the Italian population stated having experienced at least one unmet medical need during the last 12 months. The unadjusted rates vary markedly across regions, thus resulting in a clear-cut north–south divide (4.6% in the North-East vs. 10.6% in the South). Among those reporting unmet medical needs, the leading reason was problems of accessibility related to cost or transportation (45.5%), followed by acceptability (26.4%) and availability due to excessively long waiting lists (21.4%). In the South, more than one out of two individuals with an unmet need refrained from seeing a physician due to economic reasons. In the northern regions, working and family responsibilities contribute relatively more to the underutilization of medical

  18. Alternatives for logistic regression in cross-sectional studies: an empirical comparison of models that directly estimate the prevalence ratio

    PubMed Central

    Barros, Aluísio JD; Hirakata, Vânia N

    2003-01-01

    Background: Cross-sectional studies with binary outcomes analyzed by logistic regression are frequent in the epidemiological literature. However, the odds ratio can importantly overestimate the prevalence ratio, the measure of choice in these studies. Also, controlling for confounding is not equivalent for the two measures. In this paper we explore alternatives for modeling data of such studies with techniques that directly estimate the prevalence ratio. Methods: We compared Cox regression with constant time at risk, Poisson regression and log-binomial regression against the standard Mantel-Haenszel estimators. Models with robust variance estimators in Cox and Poisson regressions and variance corrected by the scale parameter in Poisson regression were also evaluated. Results: Three outcomes, from a cross-sectional study carried out in Pelotas, Brazil, with different levels of prevalence were explored: weight-for-age deficit (4%), asthma (31%) and mother in a paid job (52%). Unadjusted Cox/Poisson regression and Poisson regression with scale parameter adjusted by deviance performed worst in terms of interval estimates. Poisson regression with scale parameter adjusted by χ² showed variable performance depending on the outcome prevalence. Cox/Poisson regression with robust variance, and log-binomial regression performed equally well when the model was correctly specified. Conclusions: Cox or Poisson regression with robust variance and log-binomial regression provide correct estimates and are a better alternative for the analysis of cross-sectional studies with binary outcomes than logistic regression, since the prevalence ratio is more interpretable and easier to communicate to non-specialists than the odds ratio. However, precautions are needed to avoid estimation problems in specific situations. PMID:14567763
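
    The gap between the odds ratio and the prevalence ratio described above can be checked with simple arithmetic. The sketch below is illustrative only: the unexposed-group prevalences are hypothetical values chosen to echo the 4%, 31% and 52% outcome levels in the abstract, and the prevalence ratio of 1.5 is an assumption, not a result from the study.

```python
def odds(p):
    """Convert a prevalence (probability) into odds."""
    return p / (1.0 - p)


def prevalence_ratio(p_exposed, p_unexposed):
    """Ratio of prevalences between exposed and unexposed groups."""
    return p_exposed / p_unexposed


def odds_ratio(p_exposed, p_unexposed):
    """Ratio of odds between exposed and unexposed groups."""
    return odds(p_exposed) / odds(p_unexposed)


if __name__ == "__main__":
    assumed_pr = 1.5  # hypothetical prevalence ratio, held fixed
    # Hypothetical prevalences in the unexposed group (rare, moderate, common).
    for p0 in (0.04, 0.31, 0.52):
        p1 = assumed_pr * p0
        print(f"unexposed={p0:.2f}  PR={prevalence_ratio(p1, p0):.2f}  "
              f"OR={odds_ratio(p1, p0):.2f}")
```

    With a rare outcome the two measures nearly coincide; as the outcome becomes common, the odds ratio moves further from the prevalence ratio even though the prevalence ratio is held fixed.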

  19. An epidemiological survey on road traffic crashes in Iran: application of the two logistic regression models.

    PubMed

    Bakhtiyari, Mahmood; Mehmandar, Mohammad Reza; Mirbagheri, Babak; Hariri, Gholam Reza; Delpisheh, Ali; Soori, Hamid

    2014-01-01

    Risk factors of human-related traffic crashes are the most important and preventable challenges for community health due to their noteworthy burden, in developing countries in particular. The present study aims to investigate the role of human risk factors of road traffic crashes in Iran. Through a cross-sectional study using the COM 114 data collection forms, the police records of almost 600,000 crashes that occurred in 2010 are investigated. The binary logistic regression and proportional odds regression models are used. The odds ratio for each risk factor is calculated. These models are adjusted for known confounding factors including age, sex and driving time. The traffic crash reports of 537,688 men (90.8%) and 54,480 women (9.2%) are analysed. The mean age is 34.1 ± 14 years. Not maintaining eyes on the road (53.7%) and losing control of the vehicle (21.4%) are the main causes of drivers' deaths in traffic crashes within cities. Not maintaining eyes on the road is also the most frequent human risk factor for road traffic crashes outside cities. Sudden lane excursion (OR = 9.9, 95% CI: 8.2-11.9) and seat belt non-compliance (OR = 8.7, CI: 6.7-10.1), exceeding authorised speed (OR = 17.9, CI: 12.7-25.1) and exceeding safe speed (OR = 9.7, CI: 7.2-13.2) are the most significant human risk factors for traffic crashes in Iran. The high mortality rate of 39 people for every 100,000 population emphasises the importance of traffic crashes in Iran. Considering the important role of human risk factors in traffic crashes, sustained efforts are required to control dangerous driving behaviours such as exceeding speed, illegal overtaking and not maintaining eyes on the road.

  20. Exploring the performance of logistic regression model types on growth/no growth data of Listeria monocytogenes.

    PubMed

    Gysemans, K P M; Bernaerts, K; Vermeulen, A; Geeraerd, A H; Debevere, J; Devlieghere, F; Van Impe, J F

    2007-03-20

    Several model types have already been developed to describe the boundary between growth and no growth conditions. In this article two types were thoroughly studied and compared, namely (i) the ordinary (linear) logistic regression model, i.e., with a polynomial on the right-hand side of the model equation (type I) and (ii) the (nonlinear) logistic regression model derived from a square root-type kinetic model (type II). The examination was carried out on the basis of the data described in Vermeulen et al. [Vermeulen, A., Gysemans, K.P.M., Bernaerts, K., Geeraerd, A.H., Van Impe, J.F., Debevere, J., Devlieghere, F., 2006-this issue. Influence of pH, water activity and acetic acid concentration on Listeria monocytogenes at 7 degrees C: data collection for the development of a growth/no growth model. International Journal of Food Microbiology.]. These data sets consist of growth/no growth data for Listeria monocytogenes as a function of water activity (0.960-0.990), pH (5.0-6.0) and acetic acid percentage (0-0.8% (w/w)), both for a monoculture and a mixed strain culture. Numerous replicates, namely twenty, were performed at closely spaced conditions. In this way detailed information was obtained about the position of the interface and the transition zone between growth and no growth. The main questions investigated were (i) which model type performs best on the monoculture and the mixed strain data, (ii) are there differences between the growth/no growth interfaces of monocultures and mixed strain cultures, (iii) which parameter estimation approach works best for the type II models, and (iv) how sensitive is the performance of these models to the values of their nonlinear-appearing parameters. The results showed that both type I and II models performed well on the monoculture data with respect to goodness-of-fit and predictive power. The type I models were, however, more sensitive to anomalous data points. The situation was different for the mixed strain culture. In

  1. The likelihood of achieving quantified road safety targets: a binary logistic regression model for possible factors.

    PubMed

    Sze, N N; Wong, S C; Lee, C Y

    2014-12-01

    In the past several decades, many countries have set quantified road safety targets to motivate transport authorities to develop systematic road safety strategies and measures and facilitate the achievement of continuous road safety improvement. Studies have been conducted to evaluate the association between the setting of quantified road safety targets and road fatality reduction, in both the short and long run, by comparing road fatalities before and after the implementation of a quantified road safety target. However, not much work has been done to evaluate whether the quantified road safety targets are actually achieved. In this study, we used a binary logistic regression model to examine the factors (including vehicle ownership, fatality rate, and national income, in addition to level of ambition and duration of target) that contribute to a target's success. We analyzed 55 quantified road safety targets set by 29 countries from 1981 to 2009, and the results indicate that targets that are in progress and with lower levels of ambition had a higher likelihood of eventually being achieved. Moreover, possible interaction effects on the association between level of ambition and the likelihood of success are also revealed.

  2. Penalization, bias reduction, and default priors in logistic and related categorical and survival regressions.

    PubMed

    Greenland, Sander; Mansournia, Mohammad Ali

    2015-10-15

    Penalization is a very general method of stabilizing or regularizing estimates, which has both frequentist and Bayesian rationales. We consider some questions that arise when considering alternative penalties for logistic regression and related models. The most widely programmed penalty appears to be the Firth small-sample bias-reduction method (albeit with small differences among implementations and the results they provide), which corresponds to using the log density of the Jeffreys invariant prior distribution as a penalty function. The latter representation raises some serious contextual objections to the Firth reduction, which also apply to alternative penalties based on t-distributions (including Cauchy priors). Taking simplicity of implementation and interpretation as our chief criteria, we propose that the log-F(1,1) prior provides a better default penalty than other proposals. Penalization based on more general log-F priors is trivial to implement and facilitates mean-squared error reduction and sensitivity analyses of penalty strength by varying the number of prior degrees of freedom. We caution however against penalization of intercepts, which are unduly sensitive to covariate coding and design idiosyncrasies.
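
    As a rough illustration of the kind of penalty discussed above, the sketch below adds a log-F(m, m) log-density term to a Bernoulli log-likelihood for a single-covariate logistic model. It assumes the standard form of the log-F(m, m) density for a coefficient β, proportional to exp(mβ/2) / (1 + exp(β))^m; the function names and the one-covariate setup are hypothetical, and the intercept is left unpenalized in line with the caution in the abstract. It is not the authors' code.

```python
import math


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def log_f_penalty(beta, m=1.0):
    """Log-density, up to an additive constant, of a log-F(m, m) prior on beta."""
    return 0.5 * m * beta - m * softplus(beta)


def penalized_loglik(y, x, intercept, slope, m=1.0):
    """Bernoulli log-likelihood plus a log-F(m, m) penalty on the slope only."""
    loglik = 0.0
    for yi, xi in zip(y, x):
        eta = intercept + slope * xi
        loglik += yi * eta - softplus(eta)
    # The intercept is deliberately not penalized.
    return loglik + log_f_penalty(slope, m)


if __name__ == "__main__":
    # Tiny made-up data set, for illustration only.
    y = [0, 0, 1, 1]
    x = [0.0, 1.0, 1.0, 2.0]
    for slope in (0.0, 1.0, 5.0):
        print(slope, round(penalized_loglik(y, x, -1.0, slope), 4))
```

    The penalty is symmetric about zero and largest at β = 0, so it pulls the slope estimate toward the null; varying `m` changes how strongly it does so.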

  3. Predictive occurrence models for coastal wetland plant communities: delineating hydrologic response surfaces with multinomial logistic regression

    USGS Publications Warehouse

    Snedden, Gregg A.; Steyer, Gregory D.

    2013-01-01

    Understanding plant community zonation along estuarine stress gradients is critical for effective conservation and restoration of coastal wetland ecosystems. We related the presence of plant community types to estuarine hydrology at 173 sites across coastal Louisiana. Percent relative cover by species was assessed at each site near the end of the growing season in 2008, and hourly water level and salinity were recorded at each site Oct 2007–Sep 2008. Nine plant community types were delineated with k-means clustering, and indicator species were identified for each of the community types with indicator species analysis. An inverse relation between salinity and species diversity was observed. Canonical correspondence analysis (CCA) effectively segregated the sites across ordination space by community type, and indicated that salinity and tidal amplitude were both important drivers of vegetation composition. Multinomial logistic regression (MLR) and Akaike's Information Criterion (AIC) were used to predict the probability of occurrence of the nine vegetation communities as a function of salinity and tidal amplitude, and probability surfaces obtained from the MLR model corroborated the CCA results. The weighted kappa statistic, calculated from the confusion matrix of predicted versus actual community types, was 0.7 and indicated good agreement between observed community types and model predictions. Our results suggest that models based on a few key hydrologic variables can be valuable tools for predicting vegetation community development when restoring and managing coastal wetlands.
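
    For readers unfamiliar with the weighted kappa statistic mentioned above, the following sketch computes it from a confusion matrix. The abstract does not state which weighting scheme was used, so linear disagreement weights are an assumption here, and the 3×3 matrix is made-up data rather than the study's nine-community table. Weighted kappa also presumes the categories have a meaningful order.

```python
def weighted_kappa(confusion, weights="linear"):
    """Weighted kappa from a square confusion matrix (rows: actual, cols: predicted).

    Uses disagreement weights: |i - j| / (k - 1) for "linear",
    or that value squared for "quadratic".
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    row_tot = [sum(row) for row in confusion]
    col_tot = [sum(confusion[i][j] for i in range(k)) for j in range(k)]

    def w(i, j):
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    # Weighted observed disagreement and disagreement expected by chance.
    observed = sum(w(i, j) * confusion[i][j]
                   for i in range(k) for j in range(k)) / n
    expected = sum(w(i, j) * row_tot[i] * col_tot[j]
                   for i in range(k) for j in range(k)) / (n * n)
    return 1.0 - observed / expected


if __name__ == "__main__":
    # Hypothetical confusion matrix for three ordered community types.
    example = [[20, 5, 0],
               [3, 15, 2],
               [0, 4, 11]]
    print(round(weighted_kappa(example), 3))
```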

  4. Predicting the "graduate on time (GOT)" of PhD students using binary logistics regression model

    NASA Astrophysics Data System (ADS)

    Shariff, S. Sarifah Radiah; Rodzi, Nur Atiqah Mohd; Rahman, Kahartini Abdul; Zahari, Siti Meriam; Deni, Sayang Mohd

    2016-10-01

    The Malaysian government has recently set a new goal of producing 60,000 Malaysian PhD holders by the year 2023. As Malaysia's largest institution of higher learning in terms of size and population, offering more than 500 academic programmes in a conducive and vibrant environment, UiTM has taken several initiatives to help close the gap. Increasing the number of PhD graduates is a challenging process. Many have already found that reaching the target is even more daunting than expected, and that implementation falls short of the ideal. Progress has slowed further as the attrition rate increases. This study applies a model that incorporates several factors to predict the number of PhD students who will complete their studies on time. A binary logistic regression model is proposed and fitted to the data to determine this number. The results show that only 6.8% of the 2014 PhD students are predicted to graduate on time, and the results are compared with the actual number for validation purposes.

  5. [Clinical research XX. From clinical judgment to multiple logistic regression model].

    PubMed

    Berea-Baltierra, Ricardo; Rivas-Ruiz, Rodolfo; Pérez-Rodríguez, Marcela; Palacios-Cruz, Lino; Moreno, Jorge; Talavera, Juan O

    2014-01-01

    The complexity of the causality phenomenon in clinical practice implies that the result of a maneuver is not solely caused by the maneuver, but by the interaction among the maneuver and other baseline factors or variables occurring during the maneuver. This requires methodological designs that allow the evaluation of these variables. When the outcome is a binary variable, we use the multiple logistic regression model (MLRM). This multivariate model is useful when we want to predict or explain the effect of a maneuver or exposure on the outcome while adjusting for the effect of several risk factors. In order to perform an MLRM, the outcome or dependent variable must be a binary variable and both categories must be mutually exclusive (e.g. alive/dead, healthy/ill); on the other hand, independent variables or risk factors may be either qualitative or quantitative. The effect measure obtained from this model is the odds ratio (OR) with 95 % confidence intervals (CI), from which we can estimate the proportion of the outcome's variability explained through the risk factors. For these reasons, the MLRM is used in clinical research, since one of the main objectives in clinical practice comprises the ability to predict or explain an event where different risk or prognostic factors are taken into account.
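
    The odds ratio and its 95 % confidence interval follow directly from a fitted logistic regression coefficient and its standard error. The sketch below uses the usual Wald interval, exp(β ± 1.96·SE); the coefficient and standard error shown are hypothetical values, not taken from any study in this listing.

```python
import math


def odds_ratio_ci(beta, se, z=1.959964):
    """Return (OR, lower, upper) for a logistic regression coefficient.

    beta: estimated log-odds coefficient
    se:   its standard error
    z:    normal quantile (default gives a 95 % Wald interval)
    """
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))


if __name__ == "__main__":
    # Hypothetical coefficient and standard error for one risk factor.
    or_, lower, upper = odds_ratio_ci(beta=0.85, se=0.30)
    print(f"OR = {or_:.2f}, 95 % CI: {lower:.2f}-{upper:.2f}")
```

    An interval that excludes 1 corresponds to a coefficient whose Wald test is significant at the matching level.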

  6. Validation data-based adjustments for outcome misclassification in logistic regression: an illustration.

    PubMed

    Lyles, Robert H; Tang, Li; Superak, Hillary M; King, Caroline C; Celentano, David D; Lo, Yungtai; Sobel, Jack D

    2011-07-01

    Misclassification of binary outcome variables is a known source of potentially serious bias when estimating adjusted odds ratios. Although researchers have described frequentist and Bayesian methods for dealing with the problem, these methods have seldom fully bridged the gap between statistical research and epidemiologic practice. In particular, there have been few real-world applications of readily grasped and computationally accessible methods that make direct use of internal validation data to adjust for differential outcome misclassification in logistic regression. In this paper, we illustrate likelihood-based methods for this purpose that can be implemented using standard statistical software. Using main study and internal validation data from the HIV Epidemiology Research Study, we demonstrate how misclassification rates can depend on the values of subject-specific covariates, and we illustrate the importance of accounting for this dependence. Simulation studies confirm the effectiveness of the maximum likelihood approach. We emphasize clear exposition of the likelihood function itself, to permit the reader to easily assimilate appended computer code that facilitates sensitivity analyses as well as the efficient handling of main/external and main/internal validation-study data. These methods are readily applicable under random cross-sectional sampling, and we discuss the extent to which the main/internal analysis remains appropriate under outcome-dependent (case-control) sampling.

  7. CPAT: Coding-Potential Assessment Tool using an alignment-free logistic regression model

    PubMed Central

    Wang, Liguo; Park, Hyun Jung; Dasari, Surendra; Wang, Shengqin; Kocher, Jean-Pierre; Li, Wei

    2013-01-01

    Thousands of novel transcripts have been identified using deep transcriptome sequencing. The discovery of this large, ‘hidden’ transcriptome renews the demand for methods that can rapidly distinguish between coding and noncoding RNA. Here, we present a novel alignment-free method, Coding Potential Assessment Tool (CPAT), which rapidly recognizes coding and noncoding transcripts from a large pool of candidates. To this end, CPAT uses a logistic regression model built with four sequence features: open reading frame size, open reading frame coverage, Fickett TESTCODE statistic and hexamer usage bias. CPAT software outperformed (sensitivity: 0.96, specificity: 0.97) other state-of-the-art alignment-based software such as Coding-Potential Calculator (sensitivity: 0.99, specificity: 0.74) and Phylo Codon Substitution Frequencies (sensitivity: 0.90, specificity: 0.63). In addition to high accuracy, CPAT is approximately four orders of magnitude faster than Coding-Potential Calculator and Phylo Codon Substitution Frequencies, enabling its users to process thousands of transcripts within seconds. The software accepts input sequences in either FASTA- or BED-formatted data files. We also developed a web interface for CPAT that allows users to submit sequences and receive the prediction results almost instantly. PMID:23335781
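
    Sensitivity and specificity figures like those quoted above come from thresholding a classifier's predicted probabilities and counting the outcomes. The sketch below shows that calculation on made-up probabilities and labels; the 0.5 cutoff and the function names are assumptions for illustration and are not CPAT's actual interface or cutoff.

```python
def confusion_counts(probs, labels, threshold=0.5):
    """Count (tp, fn, tn, fp) after thresholding predicted probabilities."""
    tp = fn = tn = fp = 0
    for p, y in zip(probs, labels):
        predicted = 1 if p >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    return tp, fn, tn, fp


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical coding probabilities and true labels (1 = coding).
    probs = [0.9, 0.8, 0.3, 0.2, 0.6]
    labels = [1, 1, 1, 0, 0]
    counts = confusion_counts(probs, labels)
    print(counts, sensitivity_specificity(*counts))
```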

  8. Extreme value distribution based gene selection criteria for discriminant microarray data analysis using logistic regression.

    PubMed

    Li, Wentian; Sun, Fengzhu; Grosse, Ivo

    2004-01-01

    One important issue commonly encountered in the analysis of microarray data is to decide which and how many genes should be selected for further studies. For discriminant microarray data analyses based on statistical models, such as the logistic regression models, gene selection can be accomplished by a comparison of the maximum likelihood of the model given the real data, L(D|M), and the expected maximum likelihood of the model given an ensemble of surrogate data with randomly permuted label, L(D(0)|M). Typically, the computational burden for obtaining L(D(0)|M) is immense, often exceeding the limits of available computing resources by orders of magnitude. Here, we propose an approach that circumvents such heavy computations by mapping the simulation problem to an extreme-value problem. We present the derivation of an asymptotic distribution of the extreme-value as well as its mean, median, and variance. Using this distribution, we propose two gene selection criteria, and we apply them to two microarray datasets and three classification tasks for illustration.

  9. Using Logistic Regression To Predict the Probability of Debris Flows Occurring in Areas Recently Burned By Wildland Fires

    USGS Publications Warehouse

    Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.

    2003-01-01

    Logistic regression was used to predict the probability of debris flows occurring in areas recently burned by wildland fires. Multiple logistic regression is conceptually similar to multiple linear regression because statistical relations between one dependent variable and several independent variables are evaluated. In logistic regression, however, the dependent variable is transformed to a binary variable (debris flow did or did not occur), and the actual probability of the debris flow occurring is statistically modeled. Data from 399 basins located within 15 wildland fires that burned during 2000-2002 in Colorado, Idaho, Montana, and New Mexico were evaluated. More than 35 independent variables describing the burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows were delineated from National Elevation Data using a Geographic Information System (GIS). (2) Data describing the burn severity, geology, land surface gradient, rainfall, and soil properties were determined for each basin. These data were then downloaded to a statistics software package for analysis using logistic regression. (3) Relations between the occurrence/non-occurrence of debris flows and burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated and several preliminary multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combination produced the most effective model. The multivariate model that best predicted the occurrence of debris flows was selected. (4) The multivariate logistic regression model was entered into a GIS, and a map showing the probability of debris flows was constructed. The most effective model incorporates the percentage of each basin with slope greater than 30 percent, percentage of land burned at medium and high burn severity

  10. Estimation of Logistic Regression Models in Small Samples. A Simulation Study Using a Weakly Informative Default Prior Distribution

    ERIC Educational Resources Information Center

    Gordovil-Merino, Amalia; Guardia-Olmos, Joan; Pero-Cebollero, Maribel

    2012-01-01

    In this paper, we used simulations to compare the performance of classical and Bayesian estimations in logistic regression models using small samples. In the performed simulations, conditions were varied, including the type of relationship between independent and dependent variable values (i.e., unrelated and related values), the type of variable…

  11. The Overall Odds Ratio as an Intuitive Effect Size Index for Multiple Logistic Regression: Examination of Further Refinements

    ERIC Educational Resources Information Center

    Le, Huy; Marcus, Justin

    2012-01-01

    This study used Monte Carlo simulation to examine the properties of the overall odds ratio (OOR), which was recently introduced as an index for overall effect size in multiple logistic regression. It was found that the OOR was relatively independent of study base rate and performed better than most commonly used R-square analogs in indexing model…

  12. A Comparison of the Logistic Regression and Contingency Table Methods for Simultaneous Detection of Uniform and Nonuniform DIF

    ERIC Educational Resources Information Center

    Guler, Nese; Penfield, Randall D.

    2009-01-01

    In this study, we investigate the logistic regression (LR), Mantel-Haenszel (MH), and Breslow-Day (BD) procedures for the simultaneous detection of both uniform and nonuniform differential item functioning (DIF). A simulation study was used to assess and compare the Type I error rate and power of a combined decision rule (CDR), which assesses DIF…

  13. Multinomial logistic regression modelling of obesity and overweight among primary school students in a rural area of Negeri Sembilan

    NASA Astrophysics Data System (ADS)

    Ghazali, Amirul Syafiq Mohd; Ali, Zalila; Noor, Norlida Mohd; Baharum, Adam

    2015-10-01

    Multinomial logistic regression is widely used to model the outcomes of a polytomous response variable, a categorical dependent variable with more than two categories. The model assumes that the conditional mean of the dependent categorical variable is the logistic function of an affine combination of predictor variables. Its procedure gives a number of logistic regression models that make specific comparisons of the response categories. When there are q categories of the response variable, the model consists of q-1 logit equations which are fitted simultaneously. The model is validated by variable selection procedures, tests of regression coefficients, a significance test of the overall model, goodness-of-fit measures, and validation of predicted probabilities using odds ratios. This study used the multinomial logistic regression model to investigate obesity and overweight among primary school students in a rural area on the basis of their demographic profiles, lifestyles, and diet and food intake. The results indicated that obesity and overweight of students are related to gender, religion, sleep duration, time spent on electronic games, breakfast intake in a week, with whom meals are taken, protein intake, and also the interaction between breakfast intake in a week and sleep duration, and the interaction between gender and protein intake.
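
    For reference, the q-1 logit equations mentioned in this abstract can be written in the usual baseline-category form. This is a generic textbook sketch, not the authors' fitted model; taking category q as the reference and the symbols x, \alpha_j and \beta_j are notational assumptions.

```latex
\log\frac{P(Y=j\mid x)}{P(Y=q\mid x)} = \alpha_j + \beta_j^{\top}x,
\qquad j = 1,\dots,q-1,
\\[4pt]
P(Y=j\mid x) = \frac{\exp\!\left(\alpha_j + \beta_j^{\top}x\right)}
                    {1+\sum_{k=1}^{q-1}\exp\!\left(\alpha_k + \beta_k^{\top}x\right)},
\qquad
P(Y=q\mid x) = \frac{1}{1+\sum_{k=1}^{q-1}\exp\!\left(\alpha_k + \beta_k^{\top}x\right)}.
```

    Under this form, \exp(\beta_{jm}) is the odds ratio for category j versus the reference category per unit increase in the m-th predictor.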

  14. Differential Item Functioning Detection and Effect Size: A Comparison between Logistic Regression and Mantel-Haenszel Procedures

    ERIC Educational Resources Information Center

    Hidalgo, M. Dolores; Lopez-Pina, Jose Antonio

    2004-01-01

    This article compares several procedures in their efficacy for detecting differential item functioning (DIF): logistic regression analysis, the Mantel-Haenszel (MH) procedure, and the modified Mantel-Haenszel procedure by Mazor, Clauser, and Hambleton. It also compares the effect size measures that these procedures provide. In this study,…

  15. Multinomial logistic regression modelling of obesity and overweight among primary school students in a rural area of Negeri Sembilan

    SciTech Connect

    Ghazali, Amirul Syafiq Mohd; Ali, Zalila; Noor, Norlida Mohd; Baharum, Adam

    2015-10-22

    Multinomial logistic regression is widely used to model the outcomes of a polytomous response variable, a categorical dependent variable with more than two categories. The model assumes that the conditional mean of the dependent categorical variable is the logistic function of an affine combination of predictor variables. Its procedure gives a number of logistic regression models that make specific comparisons of the response categories. When there are q categories of the response variable, the model consists of q-1 logit equations which are fitted simultaneously. The model is validated by variable selection procedures, tests of regression coefficients, a significance test of the overall model, goodness-of-fit measures, and validation of predicted probabilities using odds ratios. This study used the multinomial logistic regression model to investigate obesity and overweight among primary school students in a rural area on the basis of their demographic profiles, lifestyles, and diet and food intake. The results indicated that obesity and overweight of students are related to gender, religion, sleep duration, time spent on electronic games, breakfast intake in a week, with whom meals are taken, protein intake, and also the interaction between breakfast intake in a week and sleep duration, and the interaction between gender and protein intake.

  16. Loss of Power in Logistic, Ordinal Logistic, and Probit Regression When an Outcome Variable Is Coarsely Categorized

    ERIC Educational Resources Information Center

    Taylor, Aaron B.; West, Stephen G.; Aiken, Leona S.

    2006-01-01

    Variables that have been coarsely categorized into a small number of ordered categories are often modeled as outcome variables in psychological research. The authors employ a Monte Carlo study to investigate the effects of this coarse categorization of dependent variables on power to detect true effects using three classes of regression models:…

  17. Using Logistic Regression to Predict the Probability of Debris Flows in Areas Burned by Wildfires, Southern California, 2003-2006

    USGS Publications Warehouse

    Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.; Michael, John A.; Helsel, Dennis R.

    2008-01-01

    Logistic regression was used to develop statistical models that can be used to predict the probability of debris flows in areas recently burned by wildfires by using data from 14 wildfires that burned in southern California during 2003-2006. Twenty-eight independent variables describing the basin morphology, burn severity, rainfall, and soil properties of 306 drainage basins located within those burned areas were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows soon after the 2003 to 2006 fires were delineated from data in the National Elevation Dataset using a geographic information system; (2) Data describing the basin morphology, burn severity, rainfall, and soil properties were compiled for each basin. These data were then input to a statistics software package for analysis using logistic regression; and (3) Relations between the occurrence or absence of debris flows and the basin morphology, burn severity, rainfall, and soil properties were evaluated, and five multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combinations produced the most effective models, and the multivariate models that best predicted the occurrence of debris flows were identified. Percentage of high burn severity and 3-hour peak rainfall intensity were significant variables in all models. Soil organic matter content and soil clay content were significant variables in all models except Model 5. Soil slope was a significant variable in all models except Model 4. The most suitable model can be selected from these five models on the basis of the availability of independent variables in the particular area of interest and field checking of probability maps. The multivariate logistic regression models can be entered into a geographic information system, and maps showing the probability of debris flows can be constructed in recently burned areas of
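
    The "all possible combinations of independent variables" step described in this abstract can be sketched as follows. This is a minimal illustration on synthetic data, not the USGS procedure: the variable names, the simulated coefficients, the plain gradient-ascent fit, and the use of AIC as the ranking criterion are all assumptions (the abstract does not say how "most effective" was measured).

```python
import itertools
import math
import random


def fit_logistic(X, y, steps=300, lr=1.0):
    """Fit a logistic model by gradient ascent on the mean log-likelihood.

    X is a list of rows (first entry 1.0 for the intercept), y a list of 0/1.
    Returns (weights, log-likelihood at the final weights).
    """
    n, k = len(X), len(X[0])
    w = [0.0] * k
    for _ in range(steps):
        grad = [0.0] * k
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))
            for j in range(k):
                grad[j] += (yi - p) * xi[j]
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    loglik = 0.0
    for xi, yi in zip(X, y):
        z = sum(wj * xj for wj, xj in zip(w, xi))
        # log p = -log(1 + e^-z) for y = 1; log(1 - p) = -log(1 + e^z) for y = 0
        loglik += -math.log1p(math.exp(-z)) if yi else -math.log1p(math.exp(z))
    return w, loglik


def all_subsets(data, y, names):
    """Fit every non-empty subset of predictors and rank the fits by AIC."""
    results = []
    for size in range(1, len(names) + 1):
        for subset in itertools.combinations(range(len(names)), size):
            X = [[1.0] + [row[j] for j in subset] for row in data]
            _, loglik = fit_logistic(X, y)
            aic = 2 * (len(subset) + 1) - 2 * loglik  # +1 for the intercept
            results.append((aic, tuple(names[j] for j in subset)))
    return sorted(results)


# Synthetic, standardized "basins": two informative predictors and one noise
# column. All names and coefficients here are hypothetical.
random.seed(0)
names = ["burn_severity", "rain_intensity", "noise"]
data = [[random.gauss(0, 1) for _ in names] for _ in range(200)]
y = [
    1 if random.random() < 1.0 / (1.0 + math.exp(-(2.0 * r[0] + 1.0 * r[1]))) else 0
    for r in data
]

ranked = all_subsets(data, y, names)
for aic, subset in ranked:
    print(f"AIC={aic:8.2f}  predictors={subset}")
```

    With p candidate predictors there are 2^p - 1 non-empty subsets, so an exhaustive search of this kind is only practical for a modest number of variables.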

  18. A logistic regression equation for estimating the probability of a stream flowing perennially in Massachusetts

    USGS Publications Warehouse

    Bent, Gardner C.; Archfield, Stacey A.

    2002-01-01

    A logistic regression equation was developed for estimating the probability of a stream flowing perennially at a specific site in Massachusetts. The equation provides city and town conservation commissions and the Massachusetts Department of Environmental Protection with an additional method for assessing whether streams are perennial or intermittent at a specific site in Massachusetts. This information is needed to assist these environmental agencies, who administer the Commonwealth of Massachusetts Rivers Protection Act of 1996, which establishes a 200-foot-wide protected riverfront area extending along the length of each side of the stream from the mean annual high-water line along each side of perennial streams, with exceptions in some urban areas. The equation was developed by relating the verified perennial or intermittent status of a stream site to selected basin characteristics of naturally flowing streams (no regulation by dams, surface-water withdrawals, ground-water withdrawals, diversion, waste-water discharge, and so forth) in Massachusetts. Stream sites used in the analysis were identified as perennial or intermittent on the basis of review of measured streamflow at sites throughout Massachusetts and on visual observation at sites in the South Coastal Basin, southeastern Massachusetts. Measured or observed zero flow(s) during months of extended drought as defined by the 310 Code of Massachusetts Regulations (CMR) 10.58(2)(a) were not considered when designating the perennial or intermittent status of a stream site. The database used to develop the equation included a total of 305 stream sites (84 intermittent- and 89 perennial-stream sites in the State, and 50 intermittent- and 82 perennial-stream sites in the South Coastal Basin). Stream sites included in the database had drainage areas that ranged from 0.14 to 8.94 square miles in the State and from 0.02 to 7.00 square miles in the South Coastal Basin. Results of the logistic regression analysis

  19. LOGISTIC NETWORK REGRESSION FOR SCALABLE ANALYSIS OF NETWORKS WITH JOINT EDGE/VERTEX DYNAMICS.

    PubMed

    Almquist, Zack W; Butts, Carter T

    2014-08-01

    Change in group size and composition has long been an important area of research in the social sciences. Similarly, interest in interaction dynamics has a long history in sociology and social psychology. However, the effects of endogenous group change on interaction dynamics are a surprisingly understudied area. One way to explore these relationships is through social network models. Network dynamics may be viewed as a process of change in the edge structure of a network, in the vertex set on which edges are defined, or in both simultaneously. Although early studies of such processes were primarily descriptive, recent work on this topic has increasingly turned to formal statistical models. Although showing great promise, many of these modern dynamic models are computationally intensive and scale very poorly in the size of the network under study and/or the number of time points considered. Likewise, currently used models focus on edge dynamics, with little support for endogenously changing vertex sets. Here, the authors show how an existing approach based on logistic network regression can be extended to serve as a highly scalable framework for modeling large networks with dynamic vertex sets. The authors place this approach within a general dynamic exponential family (exponential-family random graph modeling) context, clarifying the assumptions underlying the framework (and providing a clear path for extensions), and they show how model assessment methods for cross-sectional networks can be extended to the dynamic case. Finally, the authors illustrate this approach on a classic data set involving interactions among windsurfers on a California beach.

  20. Logistic regression modeling to assess groundwater vulnerability to contamination in Hawaii, USA.

    PubMed

    Mair, Alan; El-Kadi, Aly I

    2013-10-01

    Capture zone analysis combined with a subjective susceptibility index is currently used in Hawaii to assess vulnerability to contamination of drinking water sources derived from groundwater. In this study, we developed an alternative objective approach that combines well capture zones with multiple-variable logistic regression (LR) modeling and applied it to the highly-utilized Pearl Harbor and Honolulu aquifers on the island of Oahu, Hawaii. Input for the LR models utilized explanatory variables based on hydrogeology, land use, and well geometry/location. A suite of 11 target contaminants detected in the region, including elevated nitrate (>1 mg/L), four chlorinated solvents, four agricultural fumigants, and two pesticides, was used to develop the models. We then tested the ability of the new approach to accurately separate groups of wells with low and high vulnerability, and the suitability of nitrate as an indicator of other types of contamination. Our results produced contaminant-specific LR models that accurately identified groups of wells with the lowest/highest reported detections and the lowest/highest nitrate concentrations. Current and former agricultural land uses were identified as significant explanatory variables for eight of the 11 target contaminants, while elevated nitrate was a significant variable for five contaminants. The utility of the combined approach is contingent on the availability of hydrologic and chemical monitoring data for calibrating groundwater and LR models. Application of the approach using a reference site with sufficient data could help identify key variables in areas with similar hydrogeology and land use but limited data. In addition, elevated nitrate may also be a suitable indicator of groundwater contamination in areas with limited data. 
    The objective LR modeling approach developed in this study is flexible enough to address a wide range of contaminants and represents a suitable addition to the current subjective approach.

  1. Logistic regression modeling to assess groundwater vulnerability to contamination in Hawaii, USA

    NASA Astrophysics Data System (ADS)

    Mair, Alan; El-Kadi, Aly I.

    2013-10-01

    Capture zone analysis combined with a subjective susceptibility index is currently used in Hawaii to assess vulnerability to contamination of drinking water sources derived from groundwater. In this study, we developed an alternative objective approach that combines well capture zones with multiple-variable logistic regression (LR) modeling and applied it to the highly-utilized Pearl Harbor and Honolulu aquifers on the island of Oahu, Hawaii. Input for the LR models utilized explanatory variables based on hydrogeology, land use, and well geometry/location. A suite of 11 target contaminants detected in the region, including elevated nitrate (> 1 mg/L), four chlorinated solvents, four agricultural fumigants, and two pesticides, was used to develop the models. We then tested the ability of the new approach to accurately separate groups of wells with low and high vulnerability, and the suitability of nitrate as an indicator of other types of contamination. Our results produced contaminant-specific LR models that accurately identified groups of wells with the lowest/highest reported detections and the lowest/highest nitrate concentrations. Current and former agricultural land uses were identified as significant explanatory variables for eight of the 11 target contaminants, while elevated nitrate was a significant variable for five contaminants. The utility of the combined approach is contingent on the availability of hydrologic and chemical monitoring data for calibrating groundwater and LR models. Application of the approach using a reference site with sufficient data could help identify key variables in areas with similar hydrogeology and land use but limited data. In addition, elevated nitrate may also be a suitable indicator of groundwater contamination in areas with limited data. 
    The objective LR modeling approach developed in this study is flexible enough to address a wide range of contaminants and represents a suitable addition to the current subjective approach.

  2. Bias-reduced and separation-proof conditional logistic regression with small or sparse data sets.

    PubMed

    Heinze, Georg; Puhr, Rainer

    2010-03-30

    Conditional logistic regression is used for the analysis of binary outcomes when subjects are stratified into several subsets, e.g. matched pairs or blocks. Log odds ratio estimates are usually found by maximizing the conditional likelihood. This approach eliminates all strata-specific parameters by conditioning on the number of events within each stratum. However, in the analyses of both an animal experiment and a lung cancer case-control study, conditional maximum likelihood (CML) resulted in infinite odds ratio estimates and monotone likelihood. Estimation can be improved by using Cytel Inc.'s well-known LogXact software, which provides a median unbiased estimate and exact or mid-p confidence intervals. Here, we suggest and outline point and interval estimation based on maximization of a penalized conditional likelihood in the spirit of Firth's (Biometrika 1993; 80:27-38) bias correction method (CFL). We present comparative analyses of both studies, demonstrating some advantages of CFL over competitors. We report on a small-sample simulation study where CFL log odds ratio estimates were almost unbiased, whereas LogXact estimates showed some bias and CML estimates exhibited serious bias. Confidence intervals and tests based on the penalized conditional likelihood had close-to-nominal coverage rates and yielded highest power among all methods compared, respectively. Therefore, we propose CFL as an attractive solution to the stratified analysis of binary data, irrespective of the occurrence of monotone likelihood. A SAS program implementing CFL is available at: http://www.muw.ac.at/msi/biometrie/programs.
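
    The two likelihoods contrasted in this abstract can be written out as follows. This is a sketch in generic notation, assuming the standard conditional likelihood and a Firth-type (Jeffreys-prior) penalty; the symbols D_k (observed events in stratum k), m_k = |D_k|, \mathcal{S}_k (all subsets of size m_k of stratum k) and I_c are notational assumptions, and the abstract itself does not give the formulas.

```latex
L_c(\beta) = \prod_{k}
  \frac{\exp\!\left(\sum_{i \in D_k} x_{ki}^{\top}\beta\right)}
       {\sum_{S \in \mathcal{S}_k} \exp\!\left(\sum_{i \in S} x_{ki}^{\top}\beta\right)},
\\[6pt]
\log L_c^{*}(\beta) = \log L_c(\beta) + \tfrac{1}{2}\log\det I_c(\beta),
\qquad
I_c(\beta) = -\frac{\partial^{2}\log L_c(\beta)}{\partial\beta\,\partial\beta^{\top}}.
```

    Conditioning on m_k removes the stratum-specific intercepts from L_c. Because \det I_c(\beta) tends to zero as a coefficient diverges, the penalty term keeps the maximizer of L_c^{*} finite even when L_c itself is monotone.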

  3. LOGISTIC NETWORK REGRESSION FOR SCALABLE ANALYSIS OF NETWORKS WITH JOINT EDGE/VERTEX DYNAMICS

    PubMed Central

    Almquist, Zack W.; Butts, Carter T.

    2015-01-01

    Change in group size and composition has long been an important area of research in the social sciences. Similarly, interest in interaction dynamics has a long history in sociology and social psychology. However, the effects of endogenous group change on interaction dynamics are a surprisingly understudied area. One way to explore these relationships is through social network models. Network dynamics may be viewed as a process of change in the edge structure of a network, in the vertex set on which edges are defined, or in both simultaneously. Although early studies of such processes were primarily descriptive, recent work on this topic has increasingly turned to formal statistical models. Although showing great promise, many of these modern dynamic models are computationally intensive and scale very poorly in the size of the network under study and/or the number of time points considered. Likewise, currently used models focus on edge dynamics, with little support for endogenously changing vertex sets. Here, the authors show how an existing approach based on logistic network regression can be extended to serve as a highly scalable framework for modeling large networks with dynamic vertex sets. The authors place this approach within a general dynamic exponential family (exponential-family random graph modeling) context, clarifying the assumptions underlying the framework (and providing a clear path for extensions), and they show how model assessment methods for cross-sectional networks can be extended to the dynamic case. Finally, the authors illustrate this approach on a classic data set involving interactions among windsurfers on a California beach. PMID:26120218

  4. Shock Index Correlates with Extravasation on Angiographs of Gastrointestinal Hemorrhage: A Logistics Regression Analysis

    SciTech Connect

    Nakasone, Yutaka; Ikeda, Osamu; Yamashita, Yasuyuki; Kudoh, Kouichi; Shigematsu, Yoshinori; Harada, Kazunori

    2007-09-15

    We applied multivariate analysis to the clinical findings in patients with acute gastrointestinal (GI) hemorrhage and compared the relationship between these findings and angiographic evidence of extravasation. Our study population consisted of 46 patients with acute GI bleeding. They were divided into two groups. In group 1 we retrospectively analyzed 41 angiograms obtained in 29 patients (age range, 25-91 years; average, 71 years). Their clinical findings, including the shock index (SI), diastolic blood pressure, hemoglobin, platelet counts, and age, were quantitatively analyzed. In group 2, consisting of 17 patients (age range, 21-78 years; average, 60 years), we prospectively applied statistical analysis by a logistics regression model to their clinical findings and then assessed 21 angiograms obtained in these patients to determine whether our model was useful for predicting the presence of angiographic evidence of extravasation. On 18 of 41 (43.9%) angiograms in group 1 there was evidence of extravasation; in 3 patients it was demonstrated only by selective angiography. Factors significantly associated with angiographic visualization of extravasation were the SI and patient age. For differentiation between cases with and cases without angiographic evidence of extravasation, the maximum cutoff point was between 0.51 and 0.53. Of the 21 angiograms obtained in group 2, 13 (61.9%) showed evidence of extravasation; in 1 patient it was demonstrated only on selective angiograms. We found that in 90% of the cases, the prospective application of our model correctly predicted the angiographically confirmed presence or absence of extravasation. We conclude that in patients with GI hemorrhage, angiographic visualization of extravasation is associated with the pre-embolization SI. Patients with a high SI value should undergo study to facilitate optimal treatment planning.

  5. Robust Inference from Conditional Logistic Regression Applied to Movement and Habitat Selection Analysis

    PubMed Central

    Duchesne, Thierry; Fortin, Daniel

    2017-01-01

    Conditional logistic regression (CLR) is widely used to analyze habitat selection and movement of animals when resource availability changes over space and time. Observations used for these analyses are typically autocorrelated, which biases model-based variance estimation of CLR parameters. This bias can be corrected using generalized estimating equations (GEE), an approach that requires partitioning the data into independent clusters. Here we establish the link between clustering rules in GEE and their effectiveness to remove statistical biases in variance estimation of CLR parameters. The current lack of guidelines is such that broad variation in clustering rules can be found among studies (e.g., 14–450 clusters) with unknown consequences on the robustness of statistical inference. We simulated datasets reflecting conditions typical of field studies. Longitudinal data were generated based on several parameters of habitat selection with varying strength of autocorrelation and some individuals having more observations than others. We then evaluated how changing the number of clusters impacted the effectiveness of variance estimators. Simulations revealed that 30 clusters were sufficient to get unbiased and relatively precise estimates of variance of parameter estimates. The use of destructive sampling to increase the number of independent clusters was successful at removing statistical bias, but only when observations were temporally autocorrelated and the strength of inter-individual heterogeneity was weak. GEE also provided robust estimates of variance for different magnitudes of unbalanced datasets. Our simulations demonstrate that GEE should be estimated by assigning each individual to a cluster when at least 30 animals are followed, or by using destructive sampling for studies with fewer individuals having intermediate level of behavioural plasticity in selection and temporally autocorrelated observations. The simulations provide valuable information to

  6. Binary logistic regression analysis of hard palate dimensions for sexing human crania

    PubMed Central

    Asif, Muhammed; Shetty, Radhakrishna; Avadhani, Ramakrishna

    2016-01-01

    Sex determination is the preliminary step in every forensic investigation and the hard palate assumes significance in cranial sexing in cases involving burns and explosions due to its resistant nature and secluded location. This study analyzes the sexing potential of incisive foramen to posterior nasal spine length, palatine process of maxilla length, horizontal plate of palatine bone length and transverse length between the greater palatine foramina. The study deviates from the conventional method of measuring the maxillo-alveolar length and breadth as the dimensions considered in this study are more heat resistant and useful in situations with damaged alveolar margins. The study involves 50 male and 50 female adult dry skulls of Indian ethnic group. The dimensions measured were statistically analyzed using Student's t test, binary logistic regression and receiver operating characteristic curve. It was observed that the incisive foramen to posterior nasal spine length is a definite sex marker with sex predictability of 87.2%. The palatine process of maxilla length with 66.8% sex predictability and the horizontal plate of palatine bone length with 71.9% sex predictability cannot be relied upon as definite sex markers. The transverse length between the greater palatine foramina is statistically insignificant in sexing crania (P=0.318). Considering a significant overlap of values in both the sexes the palatal dimensions singularly cannot be relied upon for sexing. Nevertheless, considering the high sex predictability of incisive foramen to posterior nasal spine length this dimension can definitely be used to supplement other sexing evidence available to precisely conclude the cranial sex. PMID:27382518

  7. Education-Based Gaps in eHealth: A Weighted Logistic Regression Approach

    PubMed Central

    2016-01-01

    Background: Persons with a college degree are more likely to engage in eHealth behaviors than persons without a college degree, compounding the health disadvantages of undereducated groups in the United States. However, the extent to which quality of recent eHealth experience reduces the education-based eHealth gap is unexplored. Objective: The goal of this study was to examine how eHealth information search experience moderates the relationship between college education and eHealth behaviors. Methods: Based on a nationally representative sample of adults who reported using the Internet to conduct the most recent health information search (n=1458), I evaluated eHealth search experience in relation to the likelihood of engaging in different eHealth behaviors. I examined whether Internet health information search experience reduces the eHealth behavior gaps among college-educated and noncollege-educated adults. Weighted logistic regression models were used to estimate the probability of different eHealth behaviors. Results: College education was significantly positively related to the likelihood of 4 eHealth behaviors. In general, eHealth search experience was negatively associated with health care behaviors, health information-seeking behaviors, and user-generated or content sharing behaviors after accounting for other covariates. Whereas Internet health information search experience has narrowed the education gap in terms of likelihood of using email or Internet to communicate with a doctor or health care provider and likelihood of using a website to manage diet, weight, or health, it has widened the education gap in the instances of searching for health information for oneself, searching for health information for someone else, and downloading health information on a mobile device. Conclusion: The relationship between college education and eHealth behaviors is moderated by Internet health information search experience in different ways depending on the type of e

  8. An Original Stepwise Multilevel Logistic Regression Analysis of Discriminatory Accuracy: The Case of Neighbourhoods and Health

    PubMed Central

    Wagner, Philippe; Ghith, Nermin; Leckie, George

    2016-01-01

    Background and Aim: Many multilevel logistic regression analyses of “neighbourhood and health” focus on interpreting measures of associations (e.g., odds ratio, OR). In contrast, multilevel analysis of variance is rarely considered. We propose an original stepwise analytical approach that distinguishes between “specific” (measures of association) and “general” (measures of variance) contextual effects. Performing two empirical examples we illustrate the methodology, interpret the results and discuss the implications of this kind of analysis in public health. Methods: We analyse 43,291 individuals residing in 218 neighbourhoods in the city of Malmö, Sweden in 2006. We study two individual outcomes (psychotropic drug use and choice of private vs. public general practitioner, GP) for which the relative importance of neighbourhood as a source of individual variation differs substantially. In Step 1 of the analysis, we evaluate the OR and the area under the receiver operating characteristic (AUC) curve for individual-level covariates (i.e., age, sex and individual low income). In Step 2, we assess general contextual effects using the AUC. Finally, in Step 3 the OR for a specific neighbourhood characteristic (i.e., neighbourhood income) is interpreted jointly with the proportional change in variance (i.e., PCV) and the proportion of ORs in the opposite direction (POOR) statistics. Results: For both outcomes, information on individual characteristics (Step 1) provides a low discriminatory accuracy (AUC = 0.616 for psychotropic drugs; AUC = 0.600 for choosing a private GP). Accounting for neighbourhood of residence (Step 2) only improved the AUC for choosing a private GP (+0.295 units). High neighbourhood income (Step 3) was strongly associated with choosing a private GP (OR = 3.50) but the PCV was only 11% and the POOR 33%. Conclusion: Applying an innovative stepwise multilevel analysis, we observed that, in Malmö, the neighbourhood context per se had a negligible
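
    Two of the summary statistics named in this abstract, the AUC and the PCV, are simple to compute once predicted probabilities and variance estimates are available. The sketch below uses the standard pairwise (Mann-Whitney) form of the AUC and the usual definition of the PCV as the relative drop in neighbourhood-level variance; the function names and the example numbers are hypothetical and are not taken from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve as the share of (positive, negative) pairs
    in which the positive case has the higher score; ties count one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def proportional_change_in_variance(var_before, var_after):
    """PCV: fraction of the neighbourhood variance explained after adding
    a neighbourhood-level covariate to the model."""
    return (var_before - var_after) / var_before


# Hypothetical predicted probabilities and observed 0/1 outcomes.
print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
# Hypothetical variance estimates from two nested multilevel models.
print(proportional_change_in_variance(2.0, 1.5))
```

    An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination, which is why a change in AUC between steps is read as a change in discriminatory accuracy.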

  9. The Application of the Cumulative Logistic Regression Model to Automated Essay Scoring

    ERIC Educational Resources Information Center

    Haberman, Shelby J.; Sinharay, Sandip

    2010-01-01

    Most automated essay scoring programs use a linear regression model to predict an essay score from several essay features. This article applied a cumulative logit model instead of the linear regression model to automated essay scoring. Comparison of the performances of the linear regression model and the cumulative logit model was performed on a…

  10. What's the Risk? A Simple Approach for Estimating Adjusted Risk Measures from Nonlinear Models Including Logistic Regression

    PubMed Central

    Kleinman, Lawrence C; Norton, Edward C

    2009-01-01

    Objective: To develop and validate a general method (called regression risk analysis) to estimate adjusted risk measures from logistic and other nonlinear multiple regression models. We show how to estimate standard errors for these estimates. These measures could supplant various approximations (e.g., adjusted odds ratio [AOR]) that may diverge, especially when outcomes are common. Study Design: Regression risk analysis estimates were compared with internal standards as well as with Mantel–Haenszel estimates, Poisson and log-binomial regressions, and a widely used (but flawed) equation to calculate adjusted risk ratios (ARR) from AOR. Data Collection: Data sets produced using Monte Carlo simulations. Principal Findings: Regression risk analysis accurately estimates ARR and differences directly from multiple regression models, even when confounders are continuous, distributions are skewed, outcomes are common, and effect size is large. It is statistically sound and intuitive, and has properties favoring it over other methods in many cases. Conclusions: Regression risk analysis should be the new standard for presenting findings from multiple regression analysis of dichotomous outcomes for cross-sectional, cohort, and population-based case–control studies, particularly when outcomes are common or effect size is large. PMID:18793213
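
    One way to obtain adjusted risks from a fitted logistic model is to predict each subject's risk twice, once as exposed and once as unexposed, and average the predictions. The sketch below illustrates that idea on the assumption that this is the kind of calculation the abstract refers to; it is not the authors' code. The second function is one well-known odds-ratio-to-risk-ratio conversion, OR / (1 - P0 + P0 * OR), included only for comparison; whether it is the equation the abstract calls flawed is also an assumption. All coefficients and covariate values are hypothetical.

```python
import math


def expit(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def standardized_risks(intercept, beta_exposure, beta_covariates, covariate_rows):
    """Average predicted risk with every subject set to exposed and then to
    unexposed, holding each subject's covariates at their observed values."""
    risk_exposed = 0.0
    risk_unexposed = 0.0
    for row in covariate_rows:
        base = intercept + sum(b * x for b, x in zip(beta_covariates, row))
        risk_exposed += expit(base + beta_exposure)
        risk_unexposed += expit(base)
    n = len(covariate_rows)
    return risk_exposed / n, risk_unexposed / n


def odds_ratio_to_risk_ratio(odds_ratio, baseline_risk):
    """Simple conversion OR / (1 - P0 + P0 * OR); exact only without covariates."""
    return odds_ratio / (1.0 - baseline_risk + baseline_risk * odds_ratio)


# Hypothetical fitted coefficients and a tiny hypothetical covariate sample.
rows = [[0.0], [1.0], [2.0]]
r1, r0 = standardized_risks(-1.0, math.log(3.0), [0.5], rows)
print("adjusted risk ratio:", r1 / r0)
print("adjusted risk difference:", r1 - r0)
print("converted from the odds ratio:", odds_ratio_to_risk_ratio(3.0, r0))
```

    The averaged-prediction ratio and the converted value generally differ once covariates enter the model, which is the kind of divergence the abstract notes when outcomes are common.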

  11. Updated logistic regression equations for the calculation of post-fire debris-flow likelihood in the western United States

    USGS Publications Warehouse

    Staley, Dennis M.; Negri, Jacquelyn A.; Kean, Jason W.; Laber, Jayme L.; Tillery, Anne C.; Youberg, Ann M.

    2016-06-30

    Wildfire can significantly alter the hydrologic response of a watershed to the extent that even modest rainstorms can generate dangerous flash floods and debris flows. To reduce public exposure to hazard, the U.S. Geological Survey produces post-fire debris-flow hazard assessments for select fires in the western United States. We use publicly available geospatial data describing basin morphology, burn severity, soil properties, and rainfall characteristics to estimate the statistical likelihood that debris flows will occur in response to a storm of a given rainfall intensity. Using an empirical database and refined geospatial analysis methods, we defined new equations for the prediction of debris-flow likelihood using logistic regression methods. We showed that the new logistic regression model outperformed previous models used to predict debris-flow likelihood.

  12. Analysis of an environmental exposure health questionnaire in a metropolitan minority population utilizing logistic regression and Support Vector Machines.

    PubMed

    Chen, Chau-Kuang; Bruce, Michelle; Tyler, Lauren; Brown, Claudine; Garrett, Angelica; Goggins, Susan; Lewis-Polite, Brandy; Weriwoh, Mirabel L; Juarez, Paul D; Hood, Darryl B; Skelton, Tyler

    2013-02-01

    The goal of this study was to analyze a 54-item instrument for assessment of perception of exposure to environmental contaminants within the context of the built environment, or exposome. This exposome was defined in five domains to include 1) home and hobby, 2) school, 3) community, 4) occupation, and 5) exposure history. Interviews were conducted with child-bearing-age minority women at Metro Nashville General Hospital at Meharry Medical College. Data were analyzed utilizing DTReg software for Support Vector Machine (SVM) modeling followed by an SPSS package for a logistic regression model. The target (outcome) variable of interest was respondent's residence by ZIP code. The results demonstrate that the rank order of important variables with respect to SVM modeling versus traditional logistic regression models is almost identical. This is the first study documenting that SVM analysis has discriminative power for determination of higher-ordered spatial relationships on an environmental exposure history questionnaire.

  13. An empirical study of statistical properties of variance partition coefficients for multi-level logistic regression models

    USGS Publications Warehouse

    Li, J.; Gray, B.R.; Bates, D.M.

    2008-01-01

    Partitioning the variance of a response by design levels is challenging for binomial and other discrete outcomes. Goldstein (2003) proposed four definitions for variance partitioning coefficients (VPC) under a two-level logistic regression model. In this study, we explicitly derived formulae for the multi-level logistic regression model and subsequently studied the distributional properties of the calculated VPCs. Using simulations and a vegetation dataset, we demonstrated associations between different VPC definitions, the importance of methods for estimating VPCs (by comparing VPC obtained using Laplace and penalized quasilikelihood methods), and bivariate dependence between VPCs calculated at different levels. Such an empirical study lends immediate support to wider applications of VPC in scientific data analysis.
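
    The abstract does not reproduce the VPC definitions it compares. As an illustration only, one widely used definition for a two-level random-intercept logistic model is the latent-variable form below. The symbols (p_ij for the response probability, u_j for the level-2 random intercept, sigma_u^2 for its variance) are notation assumed here, not taken from the record.

```latex
\operatorname{logit}(p_{ij}) = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_j,
\qquad u_j \sim N(0,\ \sigma_u^2),
\qquad
\mathrm{VPC}_{\text{latent}} = \frac{\sigma_u^2}{\sigma_u^2 + \pi^2/3}
```

    Here pi^2/3 (about 3.29) is the variance of the standard logistic distribution, which plays the role of the level-1 residual variance on the latent scale.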

  14. Use of genetic programming, logistic regression, and artificial neural nets to predict readmission after coronary artery bypass surgery.

    PubMed

    Engoren, Milo; Habib, Robert H; Dooner, John J; Schwann, Thomas A

    2013-08-01

    As many as 14 % of patients undergoing coronary artery bypass surgery are readmitted within 30 days. Readmission is usually the result of morbidity and may lead to death. The purpose of this study is to develop and compare statistical and genetic programming models to predict readmission. Patients were divided into separate Construction and Validation populations. Using 88 variables, logistic regression, genetic programs, and artificial neural nets were used to develop predictive models. Models were first constructed and tested on the Construction populations, then validated on the Validation population. Areas under the receiver operator characteristic curves (AU ROC) were used to compare the models. Two hundred and two patients (7.6 %) in the 2,644 patient Construction group and 216 (8.0 %) of the 2,711 patient Validation group were re-admitted within 30 days of CABG surgery. Logistic regression predicted readmission with AU ROC = .675 ± .021 in the Construction group. Genetic programs significantly improved the accuracy (AU ROC = .767 ± .001, p < .001). Artificial neural nets were less accurate with AU ROC = 0.597 ± .001 in the Construction group. Predictive accuracy of all three techniques fell in the Validation group. However, the accuracy of genetic programming (AU ROC = .654 ± .001) was still slightly, but not statistically significantly, better than that of the logistic regression (AU ROC = .644 ± .020, p = .61). Genetic programming and logistic regression provide alternative methods to predict readmission that are similarly accurate.
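
    Several records in this listing compare models by the area under the ROC curve. As a minimal sketch of what that number measures, the function below computes it as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The scores are invented for illustration and are unrelated to the study's data.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts the fraction of (positive, negative) pairs in which the
    positive case is scored higher; ties contribute one half.
    """
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical predicted probabilities for readmitted (positive)
# and non-readmitted (negative) patients.
readmitted = [0.9, 0.8, 0.4]
not_readmitted = [0.7, 0.3, 0.2]

example_auc = auc(readmitted, not_readmitted)
print(f"AUC = {example_auc:.3f}")
```

    A value of 0.5 corresponds to scores that do not separate the two groups at all, and 1.0 to perfect separation. The quadratic loop is chosen for readability; rank-based formulas give the same result more efficiently on large samples.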

  15. The identification of menstrual blood in forensic samples by logistic regression modeling of miRNA expression.

    PubMed

    Hanson, Erin K; Mirza, Mohid; Rekab, Kamel; Ballantyne, Jack

    2014-11-01

    We report the identification of sensitive and specific miRNA biomarkers for menstrual blood, a tissue that might provide probative information in certain specialized instances. We incorporated these biomarkers into qPCR assays and developed a quantitative statistical model using logistic regression that permits the prediction of menstrual blood in a forensic sample with a high, and measurable, degree of accuracy. Using the developed model, we achieved 100% accuracy in determining the body fluid of interest for a set of test samples (i.e. samples not used in model development). The development, and details, of the logistic regression model are described. Testing and evaluation of the finalized logistic regression modeled assay using a small number of samples was carried out to preliminarily estimate the limit of detection (LOD), specificity in admixed samples and expression of the menstrual blood miRNA biomarkers throughout the menstrual cycle (25-28 days). The LOD was <1 ng of total RNA, the assay performed as expected with admixed samples and menstrual blood was identified only during the menses phase of the female reproductive cycle in two donors.

  16. Logistic Regression Analysis of Contrast-Enhanced Ultrasound and Conventional Ultrasound Characteristics of Sub-centimeter Thyroid Nodules.

    PubMed

    Zhao, Rui-Na; Zhang, Bo; Yang, Xiao; Jiang, Yu-Xin; Lai, Xing-Jian; Zhang, Xiao-Yan

    2015-12-01

    The purpose of the study described here was to determine specific characteristics of thyroid microcarcinoma (TMC) and explore the value of contrast-enhanced ultrasound (CEUS) combined with conventional ultrasound (US) in the diagnosis of TMC. Characteristics of 63 patients with TMC and 39 with benign sub-centimeter thyroid nodules were retrospectively analyzed. Multivariate logistic regression analysis was performed to determine independent risk factors. Four variables were included in the logistic regression models: age, shape, blood flow distribution and enhancement pattern. The area under the receiver operating characteristic curve was 0.919. With 0.113 selected as the cutoff value, sensitivity, specificity, positive predictive value, negative predictive value and accuracy were 90.5%, 82.1%, 89.1%, 84.2% and 87.3%, respectively. Independent risk factors for TMC determined with the combination of CEUS and conventional US were age, shape, blood flow distribution and enhancement pattern. Age was negatively correlated with malignancy, whereas shape, blood flow distribution and enhancement pattern were positively correlated. The logistic regression model involving CEUS and conventional US was found to be effective in the diagnosis of sub-centimeter thyroid nodules.
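
    The five performance figures in this abstract all follow from a single 2x2 confusion table. The abstract does not give that table, so the counts below are an assumption: they were reconstructed to be consistent with the reported group sizes (63 TMC, 39 benign) and percentages, and should not be read as the authors' published data.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard diagnostic accuracy measures from confusion-table counts."""
    return {
        "sensitivity": tp / (tp + fn),          # true positives among diseased
        "specificity": tn / (tn + fp),          # true negatives among non-diseased
        "ppv": tp / (tp + fp),                  # positive predictive value
        "npv": tn / (tn + fn),                  # negative predictive value
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Assumed counts: 63 malignant nodules (57 detected, 6 missed) and
# 39 benign nodules (32 correctly ruled out, 7 false alarms).
metrics = diagnostic_metrics(tp=57, fn=6, tn=32, fp=7)

for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```

    With these counts the printed values round to the same percentages quoted in the abstract, which shows how the reported figures relate to one another.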

  17. Gully erosion susceptibility assessment by means of GIS-based logistic regression: A case of Sicily (Italy)

    NASA Astrophysics Data System (ADS)

    Conoscenti, Christian; Angileri, Silvia; Cappadonia, Chiara; Rotigliano, Edoardo; Agnesi, Valerio; Märker, Michael

    2014-01-01

    This research aims at characterizing susceptibility conditions to gully erosion by means of GIS and multivariate statistical analysis. The study area is a 9.5 km² river catchment in central-northern Sicily, where agriculture activities are limited by intense erosion. By means of field surveys and interpretation of aerial images, we prepared a digital map of the spatial distribution of 260 gullies in the study area. In addition, from available thematic maps, a 5 m cell size digital elevation model and field checks, we derived 27 environmental attributes that describe the variability of lithology, land use, topography and road position. These attributes were selected for their potential influence on erosion processes, while the dependent variable was given by presence or absence of gullies within two different types of mapping units: 5 m grid cells and slope units (average size = 2.66 ha). The functional relationships between gully occurrence and the controlling factors were obtained from forward stepwise logistic regression to calculate the probability to host a gully for each mapping unit. In order to train and test the predictive models, three calibration and three validation subsets, of both grid cells and slope units, were randomly selected. Results of validation, based on ROC (receiver operating characteristic) curves, attest to acceptable to excellent accuracies of the models, showing better predictive skill and more stable performance of the susceptibility model based on grid cells.

  18. Deep Human Parsing with Active Template Regression.

    PubMed

    Liang, Xiaodan; Liu, Si; Shen, Xiaohui; Yang, Jianchao; Liu, Luoqi; Dong, Jian; Lin, Liang; Yan, Shuicheng

    2015-12-01

    In this work, the human parsing task, namely decomposing a human image into semantic fashion/body regions, is formulated as an active template regression (ATR) problem, where the normalized mask of each fashion/body item is expressed as the linear combination of the learned mask templates, and then morphed to a more precise mask with the active shape parameters, including position, scale and visibility of each semantic region. The mask template coefficients and the active shape parameters together can generate the human parsing results, and are thus called the structure outputs for human parsing. The deep Convolutional Neural Network (CNN) is utilized to build the end-to-end relation between the input human image and the structure outputs for human parsing. More specifically, the structure outputs are predicted by two separate networks. The first CNN network is with max-pooling, and designed to predict the template coefficients for each label mask, while the second CNN network is without max-pooling to preserve sensitivity to label mask position and accurately predict the active shape parameters. For a new image, the structure outputs of the two networks are fused to generate the probability of each label for each pixel, and super-pixel smoothing is finally used to refine the human parsing result. Comprehensive evaluations on a large dataset well demonstrate the significant superiority of the ATR framework over other state-of-the-arts for human parsing. In particular, the F1-score reaches 64.38 percent by our ATR framework, significantly higher than 44.76 percent based on the state-of-the-art algorithm [28].

  19. Comparison of Logistic Regression and Random Forests techniques for shallow landslide susceptibility assessment in Giampilieri (NE Sicily, Italy)

    NASA Astrophysics Data System (ADS)

    Trigila, Alessandro; Iadanza, Carla; Esposito, Carlo; Scarascia-Mugnozza, Gabriele

    2015-11-01

    The aim of this work is to define reliable susceptibility models for shallow landslides using Logistic Regression and Random Forests multivariate statistical techniques. The study area, located in North-East Sicily, was hit on October 1st 2009 by a severe rainstorm (225 mm of cumulative rainfall in 7 h) which caused flash floods and more than 1000 landslides. Several small villages, such as Giampilieri, were hit with 31 fatalities, 6 missing persons and damage to buildings and transportation infrastructures. Landslides, mainly types such as earth and debris translational slides evolving into debris flows, were triggered on steep slopes and involved colluvium and regolith materials which cover the underlying metamorphic bedrock. The work has been carried out with the following steps: i) realization of a detailed event landslide inventory map through field surveys coupled with observation of high resolution aerial colour orthophoto; ii) identification of landslide source areas; iii) data preparation of landslide controlling factors and descriptive statistics based on a bivariate method (Frequency Ratio) to get an initial overview on existing relationships between causative factors and shallow landslide source areas; iv) choice of criteria for the selection and sizing of the mapping unit; v) implementation of 5 multivariate statistical susceptibility models based on Logistic Regression and Random Forests techniques and focused on landslide source areas; vi) evaluation of the influence of sample size and type of sampling on results and performance of the models; vii) evaluation of the predictive capabilities of the models using ROC curve, AUC and contingency tables; viii) comparison of model results and obtained susceptibility maps; and ix) analysis of temporal variation of landslide susceptibility related to input parameter changes. Models based on Logistic Regression and Random Forests have demonstrated excellent predictive capabilities. Land use and wildfire

  20. An investigation of the speeding-related crash designation through crash narrative reviews sampled via logistic regression.

    PubMed

    Fitzpatrick, Cole D; Rakasi, Saritha; Knodler, Michael A

    2017-01-01

    Speed is one of the most important factors in traffic safety as higher speeds are linked to increased crash risk and higher injury severities. Nearly a third of fatal crashes in the United States are designated as "speeding-related", which is defined as either "the driver behavior of exceeding the posted speed limit or driving too fast for conditions." While many studies have utilized the speeding-related designation in safety analyses, no studies have examined the underlying accuracy of this designation. Herein, we investigate the speeding-related crash designation through the development of a series of logistic regression models that were derived from the established speeding-related crash typologies and validated using a blind review, by multiple researchers, of 604 crash narratives. The developed logistic regression model accurately identified crashes which were not originally designated as speeding-related but had crash narratives that suggested speeding as a causative factor. Only 53.4% of crashes designated as speeding-related contained narratives which described speeding as a causative factor. Further investigation of these crashes revealed that the driver contributing code (DCC) of "driving too fast for conditions" was being used in three separate situations. Additionally, this DCC was also incorrectly used when "exceeding the posted speed limit" would likely have been a more appropriate designation. Finally, it was determined that the responding officer only utilized one DCC in 82% of crashes not designated as speeding-related but contained a narrative indicating speed as a contributing causal factor. The use of logistic regression models based upon speeding-related crash typologies offers a promising method by which all possible speeding-related crashes could be identified.

  1. An Alternative Flight Software Trigger Paradigm: Applying Multivariate Logistic Regression to Sense Trigger Conditions Using Inaccurate or Scarce Information

    NASA Technical Reports Server (NTRS)

    Smith, Kelly M.; Gay, Robert S.; Stachowiak, Susan J.

    2013-01-01

    In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter to improve altitude knowledge. In order to increase overall robustness, the vehicle also has an alternate method of triggering the parachute deployment sequence based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this backup trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to semi-automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a statistical classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers improved performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.
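
    The two-stage workflow described in this abstract (fit a classifier on the ground, then evaluate it in flight) can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's implementation: the two input features, the synthetic training data, the learning rate, and the 0.5 threshold are all hypothetical, and the fit uses plain batch gradient ascent in place of whatever estimation procedure the authors used.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Ground-side step: fit weights by batch gradient ascent on the log-likelihood."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = yi - sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj + lr * g / n for wj, g in zip(w, grad_w)]
        b += lr * grad_b / n
    return w, b


def should_trigger(x, w, b, threshold=0.5):
    """Flight-side step: evaluate the fixed classifier on one measurement vector."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= threshold


# Synthetic stand-in for pre-flight simulation runs: two normalized,
# noisy sensor features (hypothetical) and a 0/1 "deploy now" label.
rng = random.Random(0)
X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(200)]
y = [1 if x1 + x2 + rng.gauss(0, 0.1) > 0 else 0 for x1, x2 in X]

w, b = fit_logistic(X, y)
accuracy = sum(should_trigger(x, w, b) == bool(t) for x, t in zip(X, y)) / len(X)
print(f"training accuracy: {accuracy:.2f}")
```

    The point of the split is that only `should_trigger`, a dot product and one comparison, would need to run in real time; the iterative fitting stays on the ground.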

  2. An Alternative Flight Software Paradigm: Applying Multivariate Logistic Regression to Sense Trigger Conditions using Inaccurate or Scarce Information

    NASA Technical Reports Server (NTRS)

    Smith, Kelly; Gay, Robert; Stachowiak, Susan

    2013-01-01

    In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter to improve altitude knowledge. In order to increase overall robustness, the vehicle also has an alternate method of triggering the parachute deployment sequence based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this backup trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to semi-automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a statistical classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers improved performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.

  3. An Alternative Flight Software Trigger Paradigm: Applying Multivariate Logistic Regression to Sense Trigger Conditions using Inaccurate or Scarce Information

    NASA Technical Reports Server (NTRS)

    Smith, Kelly M.; Gay, Robert S.; Stachowiak, Susan J.

    2013-01-01

    In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter. In order to increase overall robustness, the vehicle also has an alternate method of triggering the drogue parachute deployment based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this velocity-based trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers excellent performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.

  4. Remote sensing and GIS-based landslide hazard analysis and cross-validation using multivariate logistic regression model on three test areas in Malaysia

    NASA Astrophysics Data System (ADS)

    Pradhan, Biswajeet

    2010-05-01

    This paper presents the results of the cross-validation of a multivariate logistic regression model using remote sensing data and GIS for landslide hazard analysis on the Penang, Cameron, and Selangor areas in Malaysia. Landslide locations in the study areas were identified by interpreting aerial photographs and satellite images, supported by field surveys. SPOT 5 and Landsat TM satellite imagery were used to map landcover and vegetation index, respectively. Maps of topography, soil type, lineaments and land cover were constructed from the spatial datasets. Ten factors which influence landslide occurrence, i.e., slope, aspect, curvature, distance from drainage, lithology, distance from lineaments, soil type, landcover, rainfall precipitation, and normalized difference vegetation index (NDVI), were extracted from the spatial database and the logistic regression coefficient of each factor was computed. Then the landslide hazard was analysed using the multivariate logistic regression coefficients derived not only from the data for the respective area but also using the logistic regression coefficients calculated from each of the other two areas (nine hazard maps in all) as a cross-validation of the model. For verification of the model, the results of the analyses were then compared with the field-verified landslide locations. Among the three cases of the application of logistic regression coefficient in the same study area, the case of Selangor based on the Selangor logistic regression coefficients showed the highest accuracy (94%), whereas Penang based on the Penang coefficients showed the lowest accuracy (86%). Similarly, among the six cases from the cross application of logistic regression coefficient in other two areas, the case of Selangor based on logistic coefficient of Cameron showed the highest (90%) prediction accuracy, whereas the case of Penang based on the Selangor logistic regression coefficients showed the lowest accuracy (79%). Qualitatively, the cross

  5. Developing a Referral Protocol for Community-Based Occupational Therapy Services in Taiwan: A Logistic Regression Analysis

    PubMed Central

    Chang, Ling-Hui; Tsai, Athena Yi-Jung; Huang, Wen-Ni

    2016-01-01

    Because resources for long-term care services are limited, timely and appropriate referral for rehabilitation services is critical for optimizing clients’ functions and successfully integrating them into the community. We investigated which client characteristics are most relevant in predicting Taiwan’s community-based occupational therapy (OT) service referral based on experts’ beliefs. Data were collected in face-to-face interviews using the Multidimensional Assessment Instrument (MDAI). Community-dwelling participants (n = 221) ≥ 18 years old who reported disabilities in the previous National Survey of Long-term Care Needs in Taiwan were enrolled. The standard for referral was the judgment and agreement of two experienced occupational therapists who reviewed the results of the MDAI. Logistic regressions and Generalized Additive Models were used for analysis. Two predictive models were proposed, one using basic activities of daily living (BADLs) and one using instrumental ADLs (IADLs). Dementia, psychiatric disorders, cognitive impairment, joint range-of-motion limitations, fear of falling, behavioral or emotional problems, expressive deficits (in the BADL-based model), and limitations in IADLs or BADLs were significantly correlated with the need for referral. Both models showed high area under the curve (AUC) values on receiver operating curve testing (AUC = 0.977 and 0.972, respectively). The probability of being referred for community OT services was calculated using the referral algorithm. The referral protocol facilitated communication between healthcare professionals to make appropriate decisions for OT referrals. The methods and findings should be useful for developing referral protocols for other long-term care services. PMID:26863544

  6. Developing a Referral Protocol for Community-Based Occupational Therapy Services in Taiwan: A Logistic Regression Analysis.

    PubMed

    Mao, Hui-Fen; Chang, Ling-Hui; Tsai, Athena Yi-Jung; Huang, Wen-Ni; Wang, Jye

    2016-01-01

    Because resources for long-term care services are limited, timely and appropriate referral for rehabilitation services is critical for optimizing clients' functions and successfully integrating them into the community. We investigated which client characteristics are most relevant in predicting Taiwan's community-based occupational therapy (OT) service referral based on experts' beliefs. Data were collected in face-to-face interviews using the Multidimensional Assessment Instrument (MDAI). Community-dwelling participants (n = 221) ≥ 18 years old who reported disabilities in the previous National Survey of Long-term Care Needs in Taiwan were enrolled. The standard for referral was the judgment and agreement of two experienced occupational therapists who reviewed the results of the MDAI. Logistic regressions and Generalized Additive Models were used for analysis. Two predictive models were proposed, one using basic activities of daily living (BADLs) and one using instrumental ADLs (IADLs). Dementia, psychiatric disorders, cognitive impairment, joint range-of-motion limitations, fear of falling, behavioral or emotional problems, expressive deficits (in the BADL-based model), and limitations in IADLs or BADLs were significantly correlated with the need for referral. Both models showed high area under the curve (AUC) values on receiver operating curve testing (AUC = 0.977 and 0.972, respectively). The probability of being referred for community OT services was calculated using the referral algorithm. The referral protocol facilitated communication between healthcare professionals to make appropriate decisions for OT referrals. The methods and findings should be useful for developing referral protocols for other long-term care services.

  7. Applicability of the Ricketts’ posteroanterior cephalometry for sex determination using logistic regression analysis in Hispano American Peruvians

    PubMed Central

    Perez, Ivan; Chavez, Allison K.; Ponce, Dario

    2016-01-01

    Background: The Ricketts' posteroanterior (PA) cephalometry seems to be the most widely used and it has not been tested by multivariate statistics for sex determination. Objective: The objective was to determine the applicability of Ricketts' PA cephalometry for sex determination using the logistic regression analysis. Materials and Methods: The logistic models were estimated at distinct age cutoffs (all ages, 11 years, 13 years, and 15 years) in a database from 1,296 Hispano American Peruvians between 5 years and 44 years of age. Results: The logistic models were composed of six cephalometric measurements; the accuracy achieved by resubstitution varied between 60% and 70% and all the variables, with one exception, exhibited a direct relationship with the probability of being classified as male; the nasal width exhibited an indirect relationship. Conclusion: The maxillary and facial widths were present in all models and may represent a sexual dimorphism indicator. The accuracy found was lower than that reported in the literature and the Ricketts' PA cephalometry may not be adequate for sex determination. The indirect relationship of the nasal width in models with data from patients of 12 years of age or less may be a trait related to age or a characteristic in the studied population, which could be better studied and confirmed. PMID:27555732

  8. Estimating the susceptibility of surface water in Texas to nonpoint-source contamination by use of logistic regression modeling

    USGS Publications Warehouse

    Battaglin, William A.; Ulery, Randy L.; Winterstein, Thomas; Welborn, Toby

    2003-01-01

    In the State of Texas, surface water (streams, canals, and reservoirs) and ground water are used as sources of public water supply. Surface-water sources of public water supply are susceptible to contamination from point and nonpoint sources. To help protect sources of drinking water and to aid water managers in designing protective yet cost-effective and risk-mitigated monitoring strategies, the Texas Commission on Environmental Quality and the U.S. Geological Survey developed procedures to assess the susceptibility of public water-supply source waters in Texas to the occurrence of 227 contaminants. One component of the assessments is the determination of susceptibility of surface-water sources to nonpoint-source contamination. To accomplish this, water-quality data at 323 monitoring sites were matched with geographic information system-derived watershed-characteristic data for the watersheds upstream from the sites. Logistic regression models then were developed to estimate the probability that a particular contaminant will exceed a threshold concentration specified by the Texas Commission on Environmental Quality. Logistic regression models were developed for 63 of the 227 contaminants. Of the remaining contaminants, 106 were not modeled because monitoring data were available at less than 10 percent of the monitoring sites; 29 were not modeled because there were less than 15 percent detections of the contaminant in the monitoring data; 27 were not modeled because of the lack of any monitoring data; and 2 were not modeled because threshold values were not specified.

  9. HEALER: homomorphic computation of ExAct Logistic rEgRession for secure rare disease variants analysis in GWAS

    PubMed Central

    Wang, Shuang; Zhang, Yuchen; Dai, Wenrui; Lauter, Kristin; Kim, Miran; Tang, Yuzhe; Xiong, Hongkai; Jiang, Xiaoqian

    2016-01-01

    Motivation: Genome-wide association studies (GWAS) have been widely used in discovering the association between genotypes and phenotypes. Human genome data contain valuable but highly sensitive information. Unprotected disclosure of such information might put individual’s privacy at risk. It is important to protect human genome data. Exact logistic regression is a bias-reduction method based on a penalized likelihood to discover rare variants that are associated with disease susceptibility. We propose the HEALER framework to facilitate secure rare variants analysis with a small sample size. Results: We target at the algorithm design aiming at reducing the computational and storage costs to learn a homomorphic exact logistic regression model (i.e. evaluate P-values of coefficients), where the circuit depth is proportional to the logarithmic scale of data size. We evaluate the algorithm performance using rare Kawasaki Disease datasets. Availability and implementation: Download HEALER at http://research.ucsd-dbmi.org/HEALER/ Contact: shw070@ucsd.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26446135

  10. Adjusting for unmeasured confounding due to either of two crossed factors with a logistic regression model.

    PubMed

    Li, Li; Brumback, Babette A; Weppelmann, Thomas A; Morris, J Glenn; Ali, Afsar

    2016-08-15

    Motivated by an investigation of the effect of surface water temperature on the presence of Vibrio cholerae in water samples collected from different fixed surface water monitoring sites in Haiti in different months, we investigated methods to adjust for unmeasured confounding due to either of the two crossed factors site and month. In the process, we extended previous methods that adjust for unmeasured confounding due to one nesting factor (such as site, which nests the water samples from different months) to the case of two crossed factors. First, we developed a conditional pseudolikelihood estimator that eliminates fixed effects for the levels of each of the crossed factors from the estimating equation. Using the theory of U-Statistics for independent but non-identically distributed vectors, we show that our estimator is consistent and asymptotically normal, but that its variance depends on the nuisance parameters and thus cannot be easily estimated. Consequently, we apply our estimator in conjunction with a permutation test, and we investigate use of the pigeonhole bootstrap and the jackknife for constructing confidence intervals. We also incorporate our estimator into a diagnostic test for a logistic mixed model with crossed random effects and no unmeasured confounding. For comparison, we investigate between-within models extended to two crossed factors. These generalized linear mixed models include covariate means for each level of each factor in order to adjust for the unmeasured confounding. We conduct simulation studies, and we apply the methods to the Haitian data. Copyright © 2016 John Wiley & Sons, Ltd.

  11. Moment Adjusted Imputation for Multivariate Measurement Error Data with Applications to Logistic Regression

    PubMed Central

    Thomas, Laine; Stefanski, Leonard A.; Davidian, Marie

    2013-01-01

    In clinical studies, covariates are often measured with error due to biological fluctuations, device error and other sources. Summary statistics and regression models that are based on mismeasured data will differ from the corresponding analysis based on the “true” covariate. Statistical analysis can be adjusted for measurement error; however, various methods exhibit a tradeoff between convenience and performance. Moment Adjusted Imputation (MAI) is a method for measurement error in a scalar latent variable that is easy to implement and performs well in a variety of settings. In practice, multiple covariates may be similarly influenced by biological fluctuations, inducing correlated multivariate measurement error. The extension of MAI to the setting of multivariate latent variables involves unique challenges. Alternative strategies are described, including a computationally feasible option that is shown to perform well. PMID:24072947

  12. Predictors of psychiatric disorders in liver transplantation candidates: logistic regression models.

    PubMed

    Rocca, Paola; Cocuzza, Elena; Rasetti, Roberta; Rocca, Giuseppe; Zanalda, Enrico; Bogetto, Filippo

    2003-07-01

    This study has two goals. The first goal is to assess the prevalence of psychiatric disorders in orthotopic liver transplantation (OLT) candidates by means of standardized procedures because there has been little research concerning psychiatric problems of potential OLT candidates using standardized instruments. The second goal focuses on identifying predictors of these psychiatric disorders. One hundred sixty-five elective OLT candidates were assessed by our unit. Psychiatric diagnoses were based on the Structured Clinical Interview for the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition. Patients also were assessed using the Hamilton Depression Rating Scale (HDRS) and the Spielberger Anxiety Index, State and Trait forms (STAI-X1 and STAI-X2). Severity of cirrhosis was assessed by applying Child-Pugh score criteria. Chi-squared and general linear model analysis of variance were used to test the univariate association between patient characteristics and both clinical psychiatric diagnoses and severity of psychiatric diseases. Variables with P less than .10 in univariate analyses were included in multiple regression models. Forty-three percent of patients presented at least one psychiatric diagnosis. Child-Pugh score and previous psychiatric diagnoses were independent significant predictors of depressive disorders. Severity of psychiatric symptoms measured by psychometric scales (HDRS, STAI-X1, and STAI-X2) was associated with Child-Pugh score in the multiple regression model. Our data suggest a high rate of psychiatric disorders, particularly adjustment disorders, in our sample of OLT candidates. Severity of liver disease emerges as the most important variable in predicting severity of psychiatric disorders in these patients.

  13. Dynamic Network Logistic Regression: A Logistic Choice Analysis of Inter- and Intra-Group Blog Citation Dynamics in the 2004 US Presidential Election.

    PubMed

    Almquist, Zack W; Butts, Carter T

    2013-10-01

    Methods for analysis of network dynamics have seen great progress in the past decade. This article shows how Dynamic Network Logistic Regression techniques (a special case of the Temporal Exponential Random Graph Models) can be used to implement decision theoretic models for network dynamics in a panel data context. We also provide practical heuristics for model building and assessment. We illustrate the power of these techniques by applying them to a dynamic blog network sampled during the 2004 US presidential election cycle. This is a particularly interesting case because it marks the debut of Internet-based media such as blogs and social networking web sites as institutionally recognized features of the American political landscape. Using a longitudinal sample of all Democratic National Convention/Republican National Convention-designated blog citation networks, we are able to test the influence of various strategic, institutional, and balance-theoretic mechanisms as well as exogenous factors such as seasonality and political events on the propensity of blogs to cite one another over time. Using a combination of deviance-based model selection criteria and simulation-based model adequacy tests, we identify the combination of processes that best characterizes the choice behavior of the contending blogs.

  14. Dynamic Network Logistic Regression: A Logistic Choice Analysis of Inter- and Intra-Group Blog Citation Dynamics in the 2004 US Presidential Election

    PubMed Central

    2013-01-01

    Methods for analysis of network dynamics have seen great progress in the past decade. This article shows how Dynamic Network Logistic Regression techniques (a special case of the Temporal Exponential Random Graph Models) can be used to implement decision theoretic models for network dynamics in a panel data context. We also provide practical heuristics for model building and assessment. We illustrate the power of these techniques by applying them to a dynamic blog network sampled during the 2004 US presidential election cycle. This is a particularly interesting case because it marks the debut of Internet-based media such as blogs and social networking web sites as institutionally recognized features of the American political landscape. Using a longitudinal sample of all Democratic National Convention/Republican National Convention–designated blog citation networks, we are able to test the influence of various strategic, institutional, and balance-theoretic mechanisms as well as exogenous factors such as seasonality and political events on the propensity of blogs to cite one another over time. Using a combination of deviance-based model selection criteria and simulation-based model adequacy tests, we identify the combination of processes that best characterizes the choice behavior of the contending blogs. PMID:24143060

  15. Trinculo: Bayesian and frequentist multinomial logistic regression for genome-wide association studies of multi-category phenotypes

    PubMed Central

    Jostins, Luke; McVean, Gilean

    2016-01-01

    Motivation: For many classes of disease the same genetic risk variants underlie many related phenotypes or disease subtypes. Multinomial logistic regression provides an attractive framework to analyze multi-category phenotypes, and explore the genetic relationships between these phenotype categories. We introduce Trinculo, a program that implements a wide range of multinomial analyses in a single fast package that is designed to be easy to use by users of standard genome-wide association study software. Availability and implementation: An open source C implementation, with code and binaries for Linux and Mac OSX, is available for download at http://sourceforge.net/projects/trinculo Supplementary information: Supplementary data are available at Bioinformatics online. Contact: lj4@well.ox.ac.uk PMID:26873930

  16. Assessment of the classification abilities of the CNS multi-parametric optimization approach by the method of logistic regression.

    PubMed

    Raevsky, O A; Polianczyk, D E; Mukhametov, A; Grigorev, V Y

    2016-08-01

    Assessment of "CNS drugs/CNS candidates" classification abilities of the multi-parametric optimization (CNS MPO) approach was performed by logistic regression. It was found that five of the six separately used physical-chemical properties (topological polar surface area, number of hydrogen-bonded donor atoms, basicity, lipophilicity of compound in neutral form and at pH = 7.4) provided accuracy of recognition below 60%. Only the descriptor of molecular weight (MW) could correctly classify two-thirds of the studied compounds. Aggregation of all six properties in the MPOscore did not improve the classification, which was worse than the classification using only MW. The results of our study demonstrate the imperfection of the CNS MPO approach; in its current form it is not very useful for computer design of new, effective CNS drugs.

  17. Analysis of vegetation distribution in Interior Alaska and sensitivity to climate change using a logistic regression approach

    USGS Publications Warehouse

    Calef, M.P.; McGuire, A.D.; Epstein, H.E.; Rupp, T.S.; Shugart, H.H.

    2005-01-01

    Aim: To understand drivers of vegetation type distribution and sensitivity to climate change. Location: Interior Alaska. Methods: A logistic regression model was developed that predicts the potential equilibrium distribution of four major vegetation types: tundra, deciduous forest, black spruce forest and white spruce forest based on elevation, aspect, slope, drainage type, fire interval, average growing season temperature and total growing season precipitation. The model was run in three consecutive steps. The hierarchical logistic regression model was used to evaluate how scenarios of changes in temperature, precipitation and fire interval may influence the distribution of the four major vegetation types found in this region. Results: At the first step, tundra was distinguished from forest, which was mostly driven by elevation, precipitation and south to north aspect. At the second step, forest was separated into deciduous and spruce forest, a distinction that was primarily driven by fire interval and elevation. At the third step, the identification of black vs. white spruce was driven mainly by fire interval and elevation. The model was verified for Interior Alaska, the region used to develop the model, where it predicted vegetation distribution among the steps with an accuracy of 60-83%. When the model was independently validated for north-west Canada, it predicted vegetation distribution among the steps with an accuracy of 53-85%. Black spruce remains the dominant vegetation type under all scenarios, potentially expanding most under warming coupled with increasing fire interval. White spruce is clearly limited by moisture once average growing season temperatures exceeded a critical limit (+2 °C). Deciduous forests expand their range the most when any two of the following scenarios are combined: decreasing fire interval, warming and increasing precipitation. Tundra can be replaced by forest under warming but expands under precipitation increase. Main

  18. Prediction of Foreign Object Debris/Damage (FOD) type for elimination in the aeronautics manufacturing environment through logistic regression model

    NASA Astrophysics Data System (ADS)

    Espino, Natalia V.

    Foreign Object Debris/Damage (FOD) is a costly and high-risk problem that aeronautics companies such as Boeing and Lockheed Martin, among others, face on their production lines every day. They spend an average of $350 thousand per year fixing FOD problems. FOD can put the lives of pilots, passengers and other crew at high risk. FOD refers to any type of foreign object, particle, debris or agent in the manufacturing environment, which could contaminate/damage the product or otherwise undermine quality control standards. FOD can be in the form of any of the following categories: panstock, manufacturing debris, tools/shop aids, consumables and trash. Although aeronautics industries have put many prevention plans in place, such as housekeeping and "clean as you go" philosophies, trainings, use of RFID for tooling control, etc., none of them has been able to completely eradicate the problem. This research presents a logistic regression statistical model approach to predict the probability of FOD type under given specific circumstances such as workstation, month and aircraft/jet being built. FOD Quality Assurance Reports of the last three years were provided by an aeronautics company for this study. By predicting the type of FOD, custom reduction/elimination plans can be put in place and thereby diminish the problem. Different aircraft were analyzed, and so different models were developed through the same methodology. The results of the study are predictions of FOD type for each aircraft and workstation throughout the year, which were obtained by applying the proposed logistic regression models. This research would help aeronautic industries to address the FOD problem correctly, to identify root causes and to establish actual reduction/elimination plans.

  19. Multinomial Logistic Regression Predicted Probability Map To Visualize The Influence Of Socio-Economic Factors On Breast Cancer Occurrence in Southern Karnataka

    NASA Astrophysics Data System (ADS)

    Madhu, B.; Ashok, N. C.; Balasubramanian, S.

    2014-11-01

    Multinomial logistic regression analysis was used to develop a statistical model that can predict the probability of breast cancer in Southern Karnataka using the breast cancer occurrence data during 2007-2011. Independent socio-economic variables describing the breast cancer occurrence, like age, education, occupation, parity, type of family, health insurance coverage, residential locality and socioeconomic status of each case, were obtained. The models were developed as follows: i) Spatial visualization of the urban-rural distribution of breast cancer cases that were obtained from the Bharat Hospital and Institute of Oncology. ii) Socio-economic risk factors describing the breast cancer occurrences were compiled for each case. These data were then analysed using multinomial logistic regression analysis in SPSS statistical software, relations between the occurrence of breast cancer across the socio-economic status and the influence of other socio-economic variables were evaluated, and multinomial logistic regression models were constructed. iii) The model that best predicted the occurrence of breast cancer was identified. This multivariate logistic regression model has been entered into a geographic information system and maps showing the predicted probability of breast cancer occurrence in Southern Karnataka were created. This study demonstrates that multinomial logistic regression is a valuable tool for developing models that predict the probability of breast cancer occurrence in Southern Karnataka.

  20. Modeling Small Unmanned Aerial System Mishaps Using Logistic Regression and Artificial Neural Networks

    DTIC Science & Technology

    2012-03-22

    preconditions for unsafe acts which ultimately result in active failures (mishaps). The DoD HFACS classification system has been used to categorize the...reoccur as future individuals repeat the same locally rational acts, while a classification scheme on these errors merely provides a label to what in...boundaries, or public endangerment”. The second document, AFRLMAN 99-103, defines Class A through C mishaps much like the DoD classification

  1. The implementation of rare events logistic regression to predict the distribution of mesophotic hard corals across the main Hawaiian Islands

    PubMed Central

    Franklin, Erik C.; Kelley, Christopher; Frazer, L. Neil; Toonen, Robert J.

    2016-01-01

    Predictive habitat suitability models are powerful tools for cost-effective, statistically robust assessment of the environmental drivers of species distributions. The aim of this study was to develop predictive habitat suitability models for two genera of scleractinian corals (Leptoseris and Montipora) found within the mesophotic zone across the main Hawaiian Islands. The mesophotic zone (30–180 m) is challenging to reach, and therefore historically understudied, because it falls between the maximum limit of SCUBA divers and the minimum typical working depth of submersible vehicles. Here, we implement a logistic regression with rare events corrections to account for the scarcity of presence observations within the dataset. These corrections reduced the coefficient error and improved overall prediction success (73.6% and 74.3%) for both original regression models. The final models included depth, rugosity, slope, mean current velocity, and wave height as the best environmental covariates for predicting the occurrence of the two genera in the mesophotic zone. Using an objectively selected theta (“presence”) threshold, the predicted presence probability values (average of 0.051 for Leptoseris and 0.040 for Montipora) were translated to spatially-explicit habitat suitability maps of the main Hawaiian Islands at 25 m grid cell resolution. Our maps are the first of their kind to use extant presence and absence data to examine the habitat preferences of these two dominant mesophotic coral genera across Hawai‘i. PMID:27441122
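
    The abstract does not spell out which rare events corrections were applied. One commonly described adjustment, assumed here purely for illustration, is the prior correction of the intercept attributed to King and Zeng, which shifts the estimated intercept by the log of the ratio between sample and population odds of presence. The function and argument names below are hypothetical.

```python
import math

def prior_corrected_intercept(intercept, sample_fraction, population_fraction):
    """Prior correction of a logistic intercept (illustrative assumption).

    intercept: intercept estimated from the sample
    sample_fraction: share of presences in the sample, strictly between 0 and 1
    population_fraction: assumed share of presences in the population
    """
    odds_ratio = ((1.0 - population_fraction) / population_fraction) * (
        sample_fraction / (1.0 - sample_fraction)
    )
    return intercept - math.log(odds_ratio)
```

    When the sample share of presences equals the assumed population share, the correction term is zero and the intercept is unchanged; the slope coefficients are not touched by this particular adjustment.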

  2. The implementation of rare events logistic regression to predict the distribution of mesophotic hard corals across the main Hawaiian Islands.

    PubMed

    Veazey, Lindsay M; Franklin, Erik C; Kelley, Christopher; Rooney, John; Frazer, L Neil; Toonen, Robert J

    2016-01-01

    Predictive habitat suitability models are powerful tools for cost-effective, statistically robust assessment of the environmental drivers of species distributions. The aim of this study was to develop predictive habitat suitability models for two genera of scleractinian corals (Leptoseris and Montipora) found within the mesophotic zone across the main Hawaiian Islands. The mesophotic zone (30-180 m) is challenging to reach, and therefore historically understudied, because it falls between the maximum limit of SCUBA divers and the minimum typical working depth of submersible vehicles. Here, we implement a logistic regression with rare events corrections to account for the scarcity of presence observations within the dataset. These corrections reduced the coefficient error and improved overall prediction success (73.6% and 74.3%) for both original regression models. The final models included depth, rugosity, slope, mean current velocity, and wave height as the best environmental covariates for predicting the occurrence of the two genera in the mesophotic zone. Using an objectively selected theta ("presence") threshold, the predicted presence probability values (average of 0.051 for Leptoseris and 0.040 for Montipora) were translated to spatially-explicit habitat suitability maps of the main Hawaiian Islands at 25 m grid cell resolution. Our maps are the first of their kind to use extant presence and absence data to examine the habitat preferences of these two dominant mesophotic coral genera across Hawai'i.

  3. Comparative Analysis of Decision Trees with Logistic Regression in Predicting Fault-Prone Classes

    NASA Astrophysics Data System (ADS)

    Singh, Yogesh; Takkar, Arvinder Kaur; Malhotra, Ruchika

    Metrics are available for predicting fault-prone classes, which may help software organizations in planning and performing testing activities. This may be possible due to proper allocation of resources to fault-prone parts of the design and code of the software. Hence, the importance and usefulness of such metrics is understandable, but empirical validation of these metrics is always a great challenge. Decision Tree (DT) methods have been successfully applied for solving classification problems in many applications. This paper evaluates the capability of three DT methods and compares their performance with that of a statistical method in predicting fault-prone software classes using a publicly available NASA data set. The results indicate that the prediction performance of DT is generally better than that of the statistical model. However, similar types of studies are required to be carried out in order to establish the acceptability of the DT models.

  4. Integration of geographic information systems and logistic multiple regression for aquatic macrophyte modeling

    SciTech Connect

    Narumalani, S.; Jensen, J.R.; Althausen, J.D.; Burkhalter, S.; Mackey, H.E. Jr.

    1994-06-01

    Since aquatic macrophytes have an important influence on the physical and chemical processes of an ecosystem while simultaneously affecting human activity, it is imperative that they be inventoried and managed wisely. However, mapping wetlands can be a major challenge because they are found in diverse geographic areas ranging from small tributary streams, to shrub or scrub and marsh communities, to open water lacustrine environments. In addition, the type and spatial distribution of wetlands can change dramatically from season to season, especially when nonpersistent species are present. This research focuses on developing a model for predicting the future growth and distribution of aquatic macrophytes. This model will use a geographic information system (GIS) to analyze some of the biophysical variables that affect aquatic macrophyte growth and distribution. The data will provide scientists information on the future spatial growth and distribution of aquatic macrophytes. This study focuses on the Savannah River Site Par Pond (1,000 ha) and L Lake (400 ha), two cooling ponds that have received thermal effluent from nuclear reactor operations. Par Pond was constructed in 1958, and natural invasion of wetland has occurred over its 35-year history, with much of the shoreline having developed extensive beds of persistent and non-persistent aquatic macrophytes.

  5. Assessment of susceptibility to earth-flow landslide using logistic regression and multivariate adaptive regression splines: A case of the Belice River basin (western Sicily, Italy)

    NASA Astrophysics Data System (ADS)

    Conoscenti, Christian; Ciaccio, Marilena; Caraballo-Arias, Nathalie Almaru; Gómez-Gutiérrez, Álvaro; Rotigliano, Edoardo; Agnesi, Valerio

    2015-08-01

    In this paper, terrain susceptibility to earth-flow occurrence was evaluated by using geographic information systems (GIS) and two statistical methods: Logistic regression (LR) and multivariate adaptive regression splines (MARS). LR has been already demonstrated to provide reliable predictions of earth-flow occurrence, whereas MARS, as far as we know, has never been used to generate earth-flow susceptibility models. The experiment was carried out in a basin of western Sicily (Italy), which extends for 51 km² and is severely affected by earth-flows. In total, we mapped 1376 earth-flows, covering an area of 4.59 km². To explore the effect of pre-failure topography on earth-flow spatial distribution, we performed a reconstruction of topography before the landslide occurrence. This was achieved by preparing a digital terrain model (DTM) where altitude of areas hosting landslides was interpolated from the adjacent undisturbed land surface by using the algorithm topo-to-raster. This DTM was exploited to extract 15 morphological and hydrological variables that, in addition to outcropping lithology, were employed as explanatory variables of earth-flow spatial distribution. The predictive skill of the earth-flow susceptibility models and the robustness of the procedure were tested by preparing five datasets, each including a different subset of landslides and stable areas. The accuracy of the predictive models was evaluated by drawing receiver operating characteristic (ROC) curves and by calculating the area under the ROC curve (AUC). The results demonstrate that the overall accuracy of LR and MARS earth-flow susceptibility models is from excellent to outstanding. However, AUC values of the validation datasets attest to a higher predictive power of MARS-models (AUC between 0.881 and 0.912) with respect to LR-models (AUC between 0.823 and 0.870). The adopted procedure proved to be resistant to overfitting and stable when changes of the learning and validation samples are
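
    For readers unfamiliar with the AUC figures quoted above, the area under the ROC curve equals the probability that a randomly chosen positive case (here, a landslide cell) receives a higher model score than a randomly chosen negative case (a stable cell), with ties counted as one half. A minimal sketch of that pairwise definition follows; the function name and the example scores are illustrative and do not come from the study.

```python
def auc(scores_positive, scores_negative):
    """Area under the ROC curve by direct pairwise comparison.

    Counts the share of (positive, negative) pairs in which the positive
    case scores higher, with ties contributing one half.
    """
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))

# Illustrative scores only: three of the four pairs are ranked correctly.
example_auc = auc([0.9, 0.3], [0.5, 0.1])
```

    The pairwise loop is quadratic in the number of cases, which is fine for a sketch; rank-based formulas give the same value more efficiently on large datasets.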

  6. The role of prediction modeling in propensity score estimation: an evaluation of logistic regression, bCART, and the covariate-balancing propensity score.

    PubMed

    Wyss, Richard; Ellis, Alan R; Brookhart, M Alan; Girman, Cynthia J; Jonsson Funk, Michele; LoCasale, Robert; Stürmer, Til

    2014-09-15

    The covariate-balancing propensity score (CBPS) extends logistic regression to simultaneously optimize covariate balance and treatment prediction. Although the CBPS has been shown to perform well in certain settings, its performance has not been evaluated in settings specific to pharmacoepidemiology and large database research. In this study, we use both simulations and empirical data to compare the performance of the CBPS with logistic regression and boosted classification and regression trees. We simulated various degrees of model misspecification to evaluate the robustness of each propensity score (PS) estimation method. We then applied these methods to compare the effect of initiating glucagonlike peptide-1 agonists versus sulfonylureas on cardiovascular events and all-cause mortality in the US Medicare population in 2007-2009. In simulations, the CBPS was generally more robust in terms of balancing covariates and reducing bias compared with misspecified logistic PS models and boosted classification and regression trees. All PS estimation methods performed similarly in the empirical example. For settings common to pharmacoepidemiology, logistic regression with balance checks to assess model specification is a valid method for PS estimation, but it can require refitting multiple models until covariate balance is achieved. The CBPS is a promising method to improve the robustness of PS models.
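
    The abstract does not say which balance diagnostic was used in the "balance checks". One widely used choice, assumed here for illustration, is the standardized mean difference of a covariate between treated and comparison groups; the function name and the sample values are hypothetical.

```python
import math
import statistics

def standardized_mean_difference(treated_values, control_values):
    """Difference in group means divided by the pooled standard deviation.

    Uses sample variances; each group needs at least two observations and
    the pooled variance must be positive.
    """
    mean_diff = statistics.mean(treated_values) - statistics.mean(control_values)
    pooled_variance = (
        statistics.variance(treated_values) + statistics.variance(control_values)
    ) / 2.0
    return mean_diff / math.sqrt(pooled_variance)

# Illustrative covariate values only.
smd = standardized_mean_difference([2.0, 3.0, 4.0], [1.0, 2.0, 3.0])
```

    In practice this would be computed for each covariate after weighting or matching on the estimated propensity score, and the model refit if the differences remain large.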

  7. Extension of the Peters–Belson method to estimate health disparities among multiple groups using logistic regression with survey data

    PubMed Central

    Li, Y.; Graubard, B. I.; Huang, P.; Gastwirth, J. L.

    2015-01-01

    Determining the extent of a disparity, if any, between groups of people, for example, race or gender, is of interest in many fields, including public health for medical treatment and prevention of disease. An observed difference in the mean outcome between an advantaged group (AG) and disadvantaged group (DG) can be due to differences in the distribution of relevant covariates. The Peters–Belson (PB) method fits a regression model with covariates to the AG to predict, for each DG member, their outcome measure as if they had been from the AG. The difference between the mean predicted and the mean observed outcomes of DG members is the (unexplained) disparity of interest. We focus on applying the PB method to estimate the disparity based on binary/multinomial/proportional odds logistic regression models using data collected from complex surveys with more than one DG. Estimators of the unexplained disparity, an analytic variance–covariance estimator that is based on the Taylor linearization variance–covariance estimation method, as well as a Wald test for testing a joint null hypothesis of zero for unexplained disparities between two or more minority groups and a majority group, are provided. Simulation studies with data selected from simple random sampling and cluster sampling, as well as the analyses of disparity in body mass index in the National Health and Nutrition Examination Survey 1999–2004, are conducted. Empirical results indicate that the Taylor linearization variance–covariance estimation is accurate and that the proposed Wald test maintains the nominal level. PMID:25382235
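
    The core Peters–Belson step described above can be sketched for a binary outcome and a single covariate. This is a simplified illustration under stated assumptions: it ignores survey weights and the variance estimation that the paper addresses, fits the logistic model by plain gradient ascent, and uses hypothetical function names and made-up data.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, steps=2000, rate=0.5):
    """Fit P(y=1|x) = sigmoid(b0 + b1*x) by gradient ascent on the log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            residual = y - sigmoid(b0 + b1 * x)
            g0 += residual
            g1 += residual * x
        b0 += rate * g0 / n
        b1 += rate * g1 / n
    return b0, b1

def peters_belson_disparity(ag_x, ag_y, dg_x, dg_y):
    """Mean predicted outcome for DG members under the AG model,
    minus their mean observed outcome."""
    b0, b1 = fit_logistic(ag_x, ag_y)
    predicted = [sigmoid(b0 + b1 * x) for x in dg_x]
    mean_predicted = sum(predicted) / len(predicted)
    mean_observed = sum(dg_y) / len(dg_y)
    return mean_predicted - mean_observed

# Made-up data: in the AG, the outcome rate is 0.25 at x=0 and 0.75 at x=1.
ag_x = [0, 0, 0, 0, 1, 1, 1, 1]
ag_y = [0, 0, 0, 1, 0, 1, 1, 1]
# DG members all have x=1 but an observed outcome rate of only 0.5.
disparity = peters_belson_disparity(ag_x, ag_y, [1, 1], [0, 1])
```

    Here the AG model predicts a rate of about 0.75 for the DG members, whose observed rate is 0.5, so the unexplained disparity is about 0.25.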

  8. Quantifying determinants of cash crop expansion and their relative effects using logistic regression modeling and variance partitioning

    NASA Astrophysics Data System (ADS)

    Xiao, Rui; Su, Shiliang; Mai, Gengchen; Zhang, Zhonghao; Yang, Chenxue

    2015-02-01

    Cash crop expansion has been a major land use change in tropical and subtropical regions worldwide. Quantifying the determinants of cash crop expansion should provide deeper spatial insights into the dynamics and ecological consequences of cash crop expansion. This paper investigated the process of cash crop expansion in Hangzhou region (China) from 1985 to 2009 using remotely sensed data. The corresponding determinants (neighborhood, physical, and proximity) and their relative effects during three periods (1985-1994, 1994-2003, and 2003-2009) were quantified by logistic regression modeling and variance partitioning. Results showed that the total area of cash crops increased from 58,874.1 ha in 1985 to 90,375.1 ha in 2009, with a net growth of 53.5%. Cash crops were more likely to grow in loam soils. Steep areas with higher elevation would experience less likelihood of cash crop expansion. A consistently higher probability of cash crop expansion was found on places with abundant farmland and forest cover in the three periods. Besides, distance to river and lake, distance to county center, and distance to provincial road were decisive determinants for farmers' choice of cash crop plantation. Different categories of determinants and their combinations exerted different influences on cash crop expansion. The joint effects of neighborhood and proximity determinants were the strongest, and the unique effect of physical determinants decreased with time. Our study contributed to understanding of the proximate drivers of cash crop expansion in subtropical regions.
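
    The reported net growth can be checked directly from the two area figures given in the abstract:

```latex
\[
\text{net growth}
  = \frac{90{,}375.1 - 58{,}874.1}{58{,}874.1}
  = \frac{31{,}501.0}{58{,}874.1}
  \approx 0.535 = 53.5\%
\]
```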

  9. A revised logistic regression equation and an automated procedure for mapping the probability of a stream flowing perennially in Massachusetts

    USGS Publications Warehouse

    Bent, Gardner C.; Steeves, Peter A.

    2006-01-01

    A revised logistic regression equation and an automated procedure were developed for mapping the probability of a stream flowing perennially in Massachusetts. The equation provides city and town conservation commissions and the Massachusetts Department of Environmental Protection a method for assessing whether streams are intermittent or perennial at a specific site in Massachusetts by estimating the probability of a stream flowing perennially at that site. This information could assist the environmental agencies that administer the Commonwealth of Massachusetts Rivers Protection Act of 1996, which establishes a 200-foot-wide protected riverfront area extending from the mean annual high-water line along each side of a perennial stream, with exceptions for some urban areas. The equation was developed by relating the observed intermittent or perennial status of a stream site to selected basin characteristics of naturally flowing streams (defined as having no regulation by dams, surface-water withdrawals, ground-water withdrawals, diversion, wastewater discharge, and so forth) in Massachusetts. This revised equation differs from the equation developed in a previous U.S. Geological Survey study in that it is solely based on visual observations of the intermittent or perennial status of stream sites across Massachusetts and on the evaluation of several additional basin and land-use characteristics as potential explanatory variables in the logistic regression analysis. The revised equation estimated more accurately the intermittent or perennial status of the observed stream sites than the equation from the previous study. Stream sites used in the analysis were identified as intermittent or perennial based on visual observation during low-flow periods from late July through early September 2001. The database of intermittent and perennial streams included a total of 351 naturally flowing (no regulation) sites, of which 85 were observed to be intermittent and 266 perennial

  10. Seismic features and automatic discrimination of deep and shallow induced-microearthquakes using neural network and logistic regression

    NASA Astrophysics Data System (ADS)

    Mousavi, S. Mostafa; Horton, Stephen P.; Langston, Charles A.; Samei, Borhan

    2016-10-01

    We develop an automated strategy for discriminating deep microseismic events from shallow ones on the basis of the waveforms recorded on a limited number of surface receivers. Machine-learning techniques are employed to explore the relationship between event hypocentres and seismic features of the recorded signals in time, frequency and time-frequency domains. We applied the technique to 440 microearthquakes (-1.7 < Mw < 1.29) induced by an underground cavern collapse in the Napoleonville Salt Dome in Bayou Corne, Louisiana. Forty different seismic attributes of whole seismograms, including degree of polarization and spectral attributes, were measured. A selected set of features was then used to train the system to discriminate between deep and shallow events based on the knowledge gained from existing patterns. The cross-validation test showed that events with depth shallower than 250 m can be discriminated from events with hypocentral depth between 1000 and 2000 m with 88 per cent and 90.7 per cent accuracy using logistic regression and artificial neural network models, respectively. Similar results were obtained using single station seismograms. The results show that the spectral features have the highest correlation to source depth. Spectral centroids and 2-D cross-correlations in the time-frequency domain are two new seismic features used in this study that proved to be promising measures for seismic event classification. The machine-learning techniques used are applicable to efficient automatic classification of low-energy signals recorded at one or more seismic stations.

  11. Financial performance monitoring of the technical efficiency of critical access hospitals: a data envelopment analysis and logistic regression modeling approach.

    PubMed

    Wilson, Asa B; Kerr, Bernard J; Bastian, Nathaniel D; Fulton, Lawrence V

    2012-01-01

    From 1980 to 1999, rural designated hospitals closed at a disproportionately high rate. In response to this emergent threat to healthcare access in rural settings, the Balanced Budget Act of 1997 made provisions for the creation of a new rural hospital--the critical access hospital (CAH). The conversion to CAH and the associated cost-based reimbursement scheme significantly slowed the closure rate of rural hospitals. This work investigates which methods can ensure the long-term viability of small hospitals. This article uses a two-step design to focus on a hypothesized relationship between technical efficiency of CAHs and a recently developed set of financial monitors for these entities. The goal is to identify the financial performance measures associated with efficiency. The first step uses data envelopment analysis (DEA) to differentiate efficient from inefficient facilities within a data set of 183 CAHs. Determining DEA efficiency is an a priori categorization of hospitals in the data set as efficient or inefficient. In the second step, DEA efficiency is the categorical dependent variable (efficient = 0, inefficient = 1) in the subsequent binary logistic regression (LR) model. A set of six financial monitors selected from the array of 20 measures served as the LR independent variables. We use a binary LR to test the null hypothesis that recently developed CAH financial indicators had no predictive value for categorizing a CAH as efficient or inefficient (i.e., there is no relationship between DEA efficiency and fiscal performance).

  12. Spatial Analysis of Severe Fever with Thrombocytopenia Syndrome Virus in China Using a Geographically Weighted Logistic Regression Model

    PubMed Central

    Wu, Liang; Deng, Fei; Xie, Zhong; Hu, Sheng; Shen, Shu; Shi, Junming; Liu, Dan

    2016-01-01

    Severe fever with thrombocytopenia syndrome (SFTS) is caused by severe fever with thrombocytopenia syndrome virus (SFTSV), which has had a serious impact on public health in parts of Asia. There is no specific antiviral drug or vaccine for SFTSV and, therefore, it is important to determine the factors that influence the occurrence of SFTSV infections. This study aimed to explore the spatial associations between SFTSV infections and several potential determinants, and to predict the high-risk areas in mainland China. The analysis was carried out at the level of provinces in mainland China. The potential explanatory variables that were investigated consisted of meteorological factors (average temperature, average monthly precipitation and average relative humidity), the average proportion of rural population and the average proportion of primary industries over three years (2010–2012). We constructed a geographically weighted logistic regression (GWLR) model in order to explore the associations between the selected variables and confirmed cases of SFTSV. The study showed that: (1) meteorological factors have a strong influence on the SFTSV cover; (2) a GWLR model is suitable for exploring SFTSV cover in mainland China; (3) our findings can be used for predicting high-risk areas and highlighting when meteorological factors pose a risk in order to aid in the implementation of public health strategies. PMID:27845737

  13. A Study of Domain Adaptation Classifiers Derived from Logistic Regression for the Task of Splice Site Prediction

    PubMed Central

    Herndon, Nic; Caragea, Doina

    2016-01-01

    Supervised classifiers are highly dependent on abundant labeled training data. Alternatives for addressing the lack of labeled data include: labeling data (but this is costly and time-consuming); training classifiers with abundant data from another domain (however, the classification accuracy usually decreases as the distance between domains increases); or complementing the limited labeled data with abundant unlabeled data from the same domain and learning semi-supervised classifiers (but the unlabeled data can mislead the classifier). A better alternative is to use the abundant labeled data from a source domain together with the limited labeled data and, optionally, the unlabeled data from the target domain to train classifiers in a domain adaptation setting. We propose two such classifiers, based on logistic regression, and evaluate them for the task of splice site prediction – a difficult and essential step in gene prediction. Our classifiers achieved high accuracy, with the highest areas under the precision-recall curve between 50.83% and 82.61%. PMID:26849871

  14. Measuring decision weights in recognition experiments with multiple response alternatives: comparing the correlation and multinomial-logistic-regression methods.

    PubMed

    Dai, Huanping; Micheyl, Christophe

    2012-11-01

    Psychophysical "reverse-correlation" methods allow researchers to gain insight into the perceptual representations and decision weighting strategies of individual subjects in perceptual tasks. Although these methods have gained momentum, until recently their development was limited to experiments involving only two response categories. Recently, two approaches for estimating decision weights in m-alternative experiments have been put forward. One approach extends the two-category correlation method to m > 2 alternatives; the second uses multinomial logistic regression (MLR). In this article, the relative merits of the two methods are discussed, and the issues of convergence and statistical efficiency of the methods are evaluated quantitatively using Monte Carlo simulations. The results indicate that, for a range of values of the number of trials, the estimated weighting patterns are closer to their asymptotic values for the correlation method than for the MLR method. Moreover, for the MLR method, weight estimates for different stimulus components can exhibit strong correlations, making the analysis and interpretation of measured weighting patterns less straightforward than for the correlation method. These and other advantages of the correlation method, which include computational simplicity and a close relationship to other well-established psychophysical reverse-correlation methods, make it an attractive tool to uncover decision strategies in m-alternative experiments.

  15. Predicting Student Success in a Major's Introductory Biology Course via Logistic Regression Analysis of Scientific Reasoning Ability and Mathematics Scores

    NASA Astrophysics Data System (ADS)

    Thompson, E. David; Bowling, Bethany V.; Markle, Ross E.

    2017-02-01

    Studies over the last 30 years have considered various factors related to student success in introductory biology courses. While much of the available literature suggests that the best predictors of success in a college course are prior college grade point average (GPA) and class attendance, faculty often require a valuable predictor of success in those courses wherein the majority of students are in the first semester and have no previous record of college GPA or attendance. In this study, we evaluated the efficacy of the ACT Mathematics subject exam and Lawson's Classroom Test of Scientific Reasoning in predicting success in a major's introductory biology course. A logistic regression was utilized to determine the effectiveness of a combination of scientific reasoning (SR) scores and ACT math (ACT-M) scores to predict student success. In summary, we found that the model—with both SR and ACT-M as significant predictors—could be an effective predictor of student success and thus could potentially be useful in practical decision making for the course, such as directing students to support services at an early point in the semester.

  16. Identifying Environmental and Social Factors Predisposing to Pathological Gambling Combining Standard Logistic Regression and Logic Learning Machine.

    PubMed

    Parodi, Stefano; Dosi, Corrado; Zambon, Antonella; Ferrari, Enrico; Muselli, Marco

    2017-03-02

    Identifying potential risk factors for problem gambling (PG) is of primary importance for planning preventive and therapeutic interventions. We illustrate a new approach based on the combination of standard logistic regression and an innovative method of supervised data mining (Logic Learning Machine or LLM). Data were taken from a pilot cross-sectional study to identify subjects with PG behaviour, assessed by two internationally validated scales (SOGS and Lie/Bet). Information was obtained from 251 gamblers recruited in six betting establishments. Data on socio-demographic characteristics, lifestyle and cognitive-related factors, and type, place and frequency of preferred gambling were obtained by a self-administered questionnaire. The following variables associated with PG were identified: instant gratification games, alcohol abuse, cognitive distortion, illegal behaviours and having started gambling with a relative or a friend. Furthermore, the combination of LLM and LR indicated the presence of two different types of PG, namely: (a) daily gamblers, more prone to illegal behaviour, with poor money management skills and who started gambling at an early age, and (b) non-daily gamblers, characterised by superstitious beliefs and a higher preference for immediate reward games. Finally, instant gratification games were strongly associated with the number of games usually played. Studies on gamblers habitually frequenting betting shops are rare. The finding of different types of PG among habitual gamblers deserves further analysis in larger studies. Advanced data mining algorithms, like LLM, are powerful tools and potentially useful in identifying risk factors for PG.

  17. To Set Up a Logistic Regression Prediction Model for Hepatotoxicity of Chinese Herbal Medicines Based on Traditional Chinese Medicine Theory

    PubMed Central

    Liu, Hongjie; Li, Tianhao; Zhan, Sha; Pan, Meilan; Ma, Zhiguo; Li, Chenghua

    2016-01-01

    Aims. To establish a logistic regression (LR) prediction model for hepatotoxicity of Chinese herbal medicines (HMs) based on traditional Chinese medicine (TCM) theory and to provide a statistical basis for predicting hepatotoxicity of HMs. Methods. The correlations of hepatotoxic and nonhepatotoxic Chinese HMs with four properties, five flavors, and channel tropism were analyzed with the chi-square test for two-way unordered categorical data. An LR prediction model was established and the accuracy of its predictions was evaluated. Results. Hepatotoxicity of Chinese HMs was related to the four properties (p < 0.05), with a coefficient of 0.178 (p < 0.05), and to the five flavors (p < 0.05), with a coefficient of 0.145 (p < 0.05); it was not related to channel tropism (p > 0.05). There were 12 variables in total from the four properties and five flavors for the LR. Four variables, warm and neutral of the four properties and pungent and salty of the five flavors, were selected to establish the LR prediction model, with the cutoff value being 0.204. Conclusions. Warm and neutral of the four properties and pungent and salty of the five flavors were the variables affecting hepatotoxicity. Based on these results, the established LR prediction model had some predictive power for hepatotoxicity of Chinese HMs. PMID:27656240
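
    To illustrate how a cutoff such as the 0.204 reported in this record turns a predicted probability into a classification, here is a minimal sketch. The coefficient values in `COEF` are hypothetical placeholders chosen only for illustration (the abstract does not report the fitted coefficients), and treating a probability equal to the cutoff as positive is likewise an assumption.

```python
import math

CUTOFF = 0.204  # cutoff value reported in the abstract

# HYPOTHETICAL coefficients for illustration only; not taken from the study.
COEF = {"intercept": -2.0, "warm": 0.8, "neutral": -0.5, "pungent": 0.6, "salty": 0.9}
PREDICTORS = ("warm", "neutral", "pungent", "salty")


def predicted_probability(herb):
    """Logistic model probability from 0/1 indicators of the four selected variables."""
    z = COEF["intercept"] + sum(COEF[name] * herb.get(name, 0) for name in PREDICTORS)
    return 1.0 / (1.0 + math.exp(-z))


def flag_hepatotoxic(herb):
    """Classify as hepatotoxic when the predicted probability reaches the cutoff."""
    return predicted_probability(herb) >= CUTOFF


example = {"warm": 1, "salty": 1}  # hypothetical herb profile
print(round(predicted_probability(example), 3), flag_hepatotoxic(example))
```

    A cutoff well below 0.5 is common when the positive class is rare, since it trades some specificity for sensitivity.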

  18. Development of synthetic velocity - depth damage curves using a Weighted Monte Carlo method and Logistic Regression analysis

    NASA Astrophysics Data System (ADS)

    Vozinaki, Anthi Eirini K.; Karatzas, George P.; Sibetheros, Ioannis A.; Varouchakis, Emmanouil A.

    2014-05-01

    Damage curves are the most significant component of the flood loss estimation models. Their development is quite complex. Two types of damage curves exist, historical and synthetic curves. Historical curves are developed from historical loss data from actual flood events. However, due to the scarcity of historical data, synthetic damage curves can be alternatively developed. Synthetic curves rely on the analysis of expected damage under certain hypothetical flooding conditions. A synthetic approach was developed and presented in this work for the development of damage curves, which are subsequently used as the basic input to a flood loss estimation model. A questionnaire-based survey took place among practicing and research agronomists, in order to generate rural loss data based on the responders' loss estimates, for several flood condition scenarios. In addition, a similar questionnaire-based survey took place among building experts, i.e. civil engineers and architects, in order to generate loss data for the urban sector. By answering the questionnaire, the experts were in essence expressing their opinion on how damage to various crop types or building types is related to a range of values of flood inundation parameters, such as floodwater depth and velocity. However, the loss data compiled from the completed questionnaires were not sufficient for the construction of workable damage curves; to overcome this problem, a Weighted Monte Carlo method was implemented, in order to generate extra synthetic datasets with statistical properties identical to those of the questionnaire-based data. The data generated by the Weighted Monte Carlo method were processed via Logistic Regression techniques in order to develop accurate logistic damage curves for the rural and the urban sectors. A Python-based code was developed, which combines the Weighted Monte Carlo method and the Logistic Regression analysis into a single code (WMCLR Python code). Each WMCLR code execution

  19. Binary Logistic Regression Analysis for Detecting Differential Item Functioning: Effectiveness of R[superscript 2] and Delta Log Odds Ratio Effect Size Measures

    ERIC Educational Resources Information Center

    Hidalgo, Mª Dolores; Gómez-Benito, Juana; Zumbo, Bruno D.

    2014-01-01

    The authors analyze the effectiveness of the R[superscript 2] and delta log odds ratio effect size measures when using logistic regression analysis to detect differential item functioning (DIF) in dichotomous items. A simulation study was carried out, and the Type I error rate and power estimates under conditions in which only statistical testing…

  20. The Association Between Changes in Insulin Sensitivity and Consumption of Tobacco and Alcohol in Young Adults: Ordinal Logistic Regression Approach

    PubMed Central

    Fufaa, Gudeta; Cai, Bin

    2016-01-01

    Context: Reduced insulin sensitivity is one of the traditional risk factors for chronic diseases such as type 2 diabetes. Reduced insulin sensitivity leads to insulin resistance, which in turn can lead to the development of type 2 diabetes. Few studies have examined factors such as blood pressure, tobacco and alcohol consumption that influence changes in insulin sensitivity over time, especially among young adults. Purpose: To examine temporal changes in insulin sensitivity in young adults (18-30 years of age at baseline) over a period of 20 years by taking into account the effects of tobacco and alcohol consumption at baseline. In other words, the purpose of the present study is to examine whether baseline tobacco and alcohol consumption can be used in predicting lowered insulin sensitivity. Method: This is a retrospective study using data collected by the Coronary Artery Risk Development in Young Adults (CARDIA) study from the National Heart, Lung, and Blood Institute. Participants were enrolled into the study in 1985 (baseline) and followed up to 2005. Insulin sensitivity, measured by the quantitative insulin sensitivity check index (QUICKI), was recorded at baseline and 20 years later, in 2005. The number of participants included in the study was 3,547. The original study included a total of 5,112 participants at baseline. Of these, 54.48% were female, and 45.52% were male; 45.31% were 18 to 24 years of age, and 54.69% were 25 to 30 years of age. Ordinal logistic regression was used to assess changes in insulin sensitivity. Changes in insulin sensitivity from baseline were calculated and grouped into three categories (more than 15%, more than 8.5% to at most 15%, and at most 8.5%), which provided the basis for employing ordinal logistic regression to assess changes in insulin sensitivity. The effects of alcohol consumption and smoking at baseline on the change in insulin sensitivity were accounted for by including these variables in the model. Results: Daily alcohol
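
    The three-level outcome used for the ordinal model in this record can be coded directly from the thresholds given in the abstract. The sketch below is illustrative; the function name, the integer codes 0 to 2, and the assumption that the input is the percentage change already computed by the analyst are not from the source.

```python
def change_category(pct_change):
    """Map a percentage change in insulin sensitivity to an ordered category.

    0: at most 8.5%
    1: more than 8.5% and at most 15%
    2: more than 15%
    """
    if pct_change <= 8.5:
        return 0
    if pct_change <= 15.0:
        return 1
    return 2


# Hypothetical percentage changes for five participants.
changes = [3.2, 8.5, 9.0, 15.0, 22.4]
print([change_category(c) for c in changes])
```

    An ordinal (proportional odds) model then treats these codes as ordered levels rather than as unrelated classes.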

  1. Factors associated with trait anger level of juvenile offenders in Hubei province: A binary logistic regression analysis.

    PubMed

    Tang, Li-Na; Ye, Xiao-Zhou; Yan, Qiu-Ge; Chang, Hong-Juan; Ma, Yu-Qiao; Liu, De-Bin; Li, Zhi-Gen; Yu, Yi-Zhen

    2017-02-01

    The risk factors of high trait anger of juvenile offenders were explored through a questionnaire study in a youth correctional facility of Hubei province, China. A total of 1090 juvenile offenders in Hubei province were investigated by a self-compiled social-demographic questionnaire, the Childhood Trauma Questionnaire (CTQ), and the State-Trait Anger Expression Inventory-II (STAXI-II). The risk factors were analyzed by chi-square tests, correlation analysis, and binary logistic regression analysis with SPSS 19.0. A total of 1082 valid questionnaires were collected. The high trait anger group (n=316) was defined as those who scored in the upper 27th percentile of the STAXI-II trait anger scale (TAS), and the rest were defined as the low trait anger group (n=766). The risk factors associated with a high level of trait anger included: childhood emotional abuse, childhood sexual abuse, step family, frequent drug abuse, and frequent internet use (P<0.05 or P<0.01). Birth sequence, number of siblings, ranking in the family, identity of the main care-taker, the education level of the care-taker, educational style of the care-taker, family income, relationship between parents, social atmosphere of the local area, frequent drinking, and frequent smoking did not predict a high level of trait anger (P>0.05). It was suggested that traumatic experience in childhood and unhealthy life style may significantly increase the level of trait anger in adulthood. The risk factors of high trait anger and their effects should be taken seriously into consideration.

  2. Classification of Urban Aerial Data Based on Pixel Labelling with Deep Convolutional Neural Networks and Logistic Regression

    NASA Astrophysics Data System (ADS)

    Yao, W.; Polewski, P.; Krzystek, P.

    2016-06-01

    The recent success of deep convolutional neural networks (CNN) on a large number of applications can be attributed to large amounts of available training data and increasing computing power. In this paper, a semantic pixel labelling scheme for urban areas using multi-resolution CNN and hand-crafted spatial-spectral features of airborne remotely sensed data is presented. Both CNN and hand-crafted features are applied to image/DSM patches to produce per-pixel class probabilities with an L1-norm regularized logistic regression classifier. The evidence theory infers a degree of belief for pixel labelling from different sources to smooth regions by handling the conflicts present in both classifiers while reducing the uncertainty. The aerial data used in this study were provided by ISPRS as benchmark datasets for 2D semantic labelling tasks in urban areas, which consist of two data sources from LiDAR and color infrared camera. The test sites are parts of a city in Germany which is assumed to consist of typical object classes including impervious surfaces, trees, buildings, low vegetation, vehicles and clutter. The evaluation is based on the computation of pixel-based confusion matrices by random sampling. The performance of the strategy with respect to scene characteristics and method combination strategies is analyzed and discussed. The competitive classification accuracy can be explained not only by the nature of the input data sources (e.g. the above-ground height of the nDSM highlights the vertical dimension of houses, trees and even cars, and the near-infrared spectrum indicates vegetation), but also by the decision-level fusion of the CNN's texture-based approach with multichannel spatial-spectral hand-crafted features based on the evidence combination theory.

  3. Personality, Driving Behavior and Mental Disorders Factors as Predictors of Road Traffic Accidents Based on Logistic Regression

    PubMed Central

    Alavi, Seyyed Salman; Mohammadi, Mohammad Reza; Souri, Hamid; Mohammadi Kalhori, Soroush; Jannatifard, Fereshteh; Sepahbodi, Ghazal

    2017-01-01

    Background: The aim of this study was to evaluate the effect of variables such as personality traits, driving behavior and mental illness on road traffic accidents among drivers with and without road crashes. Methods: In this cohort study, 800 bus and truck drivers were recruited. Participants were selected among drivers who were referred to Imam Sajjad Hospital (Tehran, Iran) during 2013-2015. The Manchester driving behavior questionnaire (MDBQ), big five personality test (NEO personality inventory) and semi-structured interview (schizophrenia and affective disorders scale) were used. After two years, we surveyed all accidents due to human factors that involved the recruited drivers. The data were analyzed with the SPSS software using descriptive statistics, t-tests, and multiple logistic regression analysis. P values less than 0.05 were considered statistically significant. Results: After controlling for the effective and demographic variables, the findings revealed significant differences between the two groups of drivers that were and were not involved in road accidents. In addition, it was found that depression and anxiety could increase the odds of road accidents 2.4- and 2.7-fold, respectively (P=0.04, P=0.004). It is noteworthy that neuroticism alone can increase the odds of road accidents 1.1-fold (P=0.009), but other personality factors did not have a significant effect on the equation. Conclusion: The results revealed that some mental disorders affect the incidence of road collisions. Considering the importance and sensitivity of driving behavior, it is necessary to evaluate multiple psychological factors influencing drivers before and after receiving or renewing their driver’s license. PMID:28293047
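
    Odds ratios such as the 2.4 and 2.7 reported in this record multiply the odds, not the probability. A short worked sketch shows how to translate an odds ratio into a change in probability; the 10% baseline risk used here is a hypothetical value chosen for illustration and does not come from the study.

```python
import math


def apply_odds_ratio(baseline_prob, odds_ratio):
    """Probability obtained after multiplying the baseline odds by an odds ratio."""
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


baseline = 0.10  # hypothetical baseline accident risk
for label, odds_ratio in (("depression", 2.4), ("anxiety", 2.7), ("neuroticism", 1.1)):
    coefficient = math.log(odds_ratio)  # the logistic regression coefficient is the log odds ratio
    risk = apply_odds_ratio(baseline, odds_ratio)
    print(f"{label}: coefficient={coefficient:.3f}, risk={risk:.3f}")
```

    With this hypothetical baseline, an odds ratio of 2.4 raises the risk from 10% to roughly 21%, not to 24%; the gap between odds ratio and risk ratio widens as the baseline risk grows.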

  4. SU-F-BRD-01: A Logistic Regression Model to Predict Objective Function Weights in Prostate Cancer IMRT

    SciTech Connect

    Boutilier, J; Chan, T; Lee, T; Craig, T; Sharpe, M

    2014-06-15

    Purpose: To develop a statistical model that predicts optimization objective function weights from patient geometry for intensity-modulated radiotherapy (IMRT) of prostate cancer. Methods: A previously developed inverse optimization method (IOM) is applied retrospectively to determine optimal weights for 51 treated patients. We use an overlap volume ratio (OVR) of bladder and rectum for different PTV expansions in order to quantify patient geometry in explanatory variables. Using the optimal weights as ground truth, we develop and train a logistic regression (LR) model to predict the rectum weight and thus the bladder weight. Post hoc, we fix the weights of the left femoral head, right femoral head, and an artificial structure that encourages conformity to the population average while normalizing the bladder and rectum weights accordingly. The population average of objective function weights is used for comparison. Results: The OVR at 0.7cm was found to be the most predictive of the rectum weights. The LR model performance is statistically significant when compared to the population average over a range of clinical metrics including bladder/rectum V53Gy, bladder/rectum V70Gy, and mean voxel dose to the bladder, rectum, CTV, and PTV. On average, the LR model predicted bladder and rectum weights that are both 63% closer to the optimal weights compared to the population average. The treatment plans resulting from the LR weights have, on average, a rectum V70Gy that is 35% closer to the clinical plan and a bladder V70Gy that is 43% closer. Similar results are seen for bladder V54Gy and rectum V54Gy. Conclusion: Statistical modelling from patient anatomy can be used to determine objective function weights in IMRT for prostate cancer. Our method allows the treatment planners to begin the personalization process from an informed starting point, which may lead to more consistent clinical plans and reduce overall planning time.

  5. Functional Logistic Regression Approach to Detecting Gene by Longitudinal Environmental Exposure Interaction in a Case-Control Study

    PubMed Central

    Wei, Peng; Tang, Hongwei; Li, Donghui

    2014-01-01

    Most complex human diseases are likely the consequence of the joint actions of genetic and environmental factors. Identification of gene-environment (GxE) interactions not only contributes to a better understanding of the disease mechanisms, but also improves disease risk prediction and targeted intervention. In contrast to the large number of genetic susceptibility loci discovered by genome-wide association studies, there have been very few successes in identifying GxE interactions which may be partly due to limited statistical power and inaccurately measured exposures. While existing statistical methods only consider interactions between genes and static environmental exposures, many environmental/lifestyle factors, such as air pollution and diet, change over time, and cannot be accurately captured at one measurement time point or by simply categorizing into static exposure categories. There is a dearth of statistical methods for detecting gene by time-varying environmental exposure interactions. Here we propose a powerful functional logistic regression (FLR) approach to model the time-varying effect of longitudinal environmental exposure and its interaction with genetic factors on disease risk. Capitalizing on the powerful functional data analysis framework, our proposed FLR model is capable of accommodating longitudinal exposures measured at irregular time points and contaminated by measurement errors, commonly encountered in observational studies. We use extensive simulations to show that the proposed method can control the Type I error and is more powerful than alternative ad hoc methods. We demonstrate the utility of this new method using data from a case-control study of pancreatic cancer to identify the windows of vulnerability of lifetime body mass index on the risk of pancreatic cancer as well as genes which may modify this association. PMID:25219575

  6. Building vulnerability to hydro-geomorphic hazards: Estimating damage probability from qualitative vulnerability assessment using logistic regression

    NASA Astrophysics Data System (ADS)

    Ettinger, Susanne; Mounaud, Loïc; Magill, Christina; Yao-Lafourcade, Anne-Françoise; Thouret, Jean-Claude; Manville, Vern; Negulescu, Caterina; Zuccaro, Giulio; De Gregorio, Daniela; Nardone, Stefano; Uchuchoque, Juan Alexis Luque; Arguedas, Anita; Macedo, Luisa; Manrique Llerena, Nélida

    2016-10-01

    bivariate analyses were applied to better characterize each vulnerability parameter. Multiple correspondence analyses revealed strong relationships between the "Distance to channel or bridges", "Structural building type", "Building footprint" and the observed damage. Logistic regression enabled quantification of the contribution of each explanatory parameter to potential damage, and determination of the significant parameters that express the damage susceptibility of a building. The model was applied 200 times on different calibration and validation data sets in order to examine performance. Results show that 90% of these tests have a success rate of more than 67%. Probabilities (at building scale) of experiencing different damage levels during a future event similar to the 8 February 2013 flash flood are the major outcomes of this study.

  7. Sample size matters: investigating the effect of sample size on a logistic regression susceptibility model for debris flows

    NASA Astrophysics Data System (ADS)

    Heckmann, T.; Gegg, K.; Gegg, A.; Becht, M.

    2014-02-01

    Predictive spatial modelling is an important task in natural hazard assessment and regionalisation of geomorphic processes or landforms. Logistic regression is a multivariate statistical approach frequently used in predictive modelling; it can be conducted stepwise in order to select from a number of candidate independent variables those that lead to the best model. In our case study on a debris flow susceptibility model, we investigate the sensitivity of model selection and quality to different sample sizes in light of the following problem: on the one hand, a sample has to be large enough to cover the variability of geofactors within the study area, and to yield stable and reproducible results; on the other hand, the sample must not be too large, because a large sample is likely to violate the assumption of independent observations due to spatial autocorrelation. Using stepwise model selection with 1000 random samples for a number of sample sizes between n = 50 and n = 5000, we investigate the inclusion and exclusion of geofactors and the diversity of the resulting models as a function of sample size; the multiplicity of different models is assessed using numerical indices borrowed from information theory and biodiversity research. Model diversity decreases with increasing sample size and reaches either a local minimum or a plateau; even larger sample sizes do not further reduce it, and they approach the upper limit of sample size given, in this study, by the autocorrelation range of the spatial data sets. In this way, an optimised sample size can be derived from an exploratory analysis. Model uncertainty due to sampling and model selection, and its predictive ability, are explored statistically and spatially through the example of 100 models estimated in one study area and validated in a neighbouring area: depending on the study area and on sample size, the predicted probabilities for debris flow release differed, on average, by 7 to 23 percentage points. In

  8. Sample size matters: investigating the effect of sample size on a logistic regression debris flow susceptibility model

    NASA Astrophysics Data System (ADS)

    Heckmann, T.; Gegg, K.; Gegg, A.; Becht, M.

    2013-06-01

    Predictive spatial modelling is an important task in natural hazard assessment and regionalisation of geomorphic processes or landforms. Logistic regression is a multivariate statistical approach frequently used in predictive modelling; it can be conducted stepwise in order to select from a number of candidate independent variables those that lead to the best model. In our case study on a debris flow susceptibility model, we investigate the sensitivity of model selection and quality to different sample sizes in light of the following problem: on the one hand, a sample has to be large enough to cover the variability of geofactors within the study area, and to yield stable results; on the other hand, the sample must not be too large, because a large sample is likely to violate the assumption of independent observations due to spatial autocorrelation. Using stepwise model selection with 1000 random samples for a number of sample sizes between n = 50 and n = 5000, we investigate the inclusion and exclusion of geofactors and the diversity of the resulting models as a function of sample size; the multiplicity of different models is assessed using numerical indices borrowed from information theory and biodiversity research. Model diversity decreases with increasing sample size and reaches either a local minimum or a plateau; even larger sample sizes do not further reduce it, and approach the upper limit of sample size given, in this study, by the autocorrelation range of the spatial datasets. In this way, an optimised sample size can be derived from an exploratory analysis. Model uncertainty due to sampling and model selection, and its predictive ability, are explored statistically and spatially through the example of 100 models estimated in one study area and validated in a neighbouring area: depending on the study area and on sample size, the predicted probabilities for debris flow release differed, on average, by 7 to 23 percentage points. In view of these results, we
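
    The abstract says model multiplicity is measured with indices borrowed from information theory and biodiversity research, but does not name them. As a minimal sketch, assuming the Shannon index is one such measure, the diversity of the models chosen by repeated stepwise selection could be computed as below. The geofactor names and the two lists of selected models are hypothetical illustrations, not data from the study.

```python
import math
from collections import Counter

def shannon_diversity(selected_models):
    """Shannon index H = -sum(p_i * ln p_i) over the distinct models.

    Each model is identified by the set of predictors it retained.
    """
    counts = Counter(frozenset(m) for m in selected_models)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())

# Hypothetical outcomes of repeated stepwise selection at two sample sizes.
small_n = [{"slope"}, {"slope", "aspect"}, {"curvature"}, {"slope", "geology"}]
large_n = [{"slope", "aspect"}, {"slope", "aspect"}, {"slope", "aspect"}, {"slope"}]

h_small = shannon_diversity(small_n)  # four different models: high diversity
h_large = shannon_diversity(large_n)  # one dominant model: lower diversity
print(h_small, h_large)
```

    In this toy input the larger sample size yields a lower index, the pattern the abstract describes for increasing sample size.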

  9. Using Logistic Regression and Random Forests multivariate statistical methods for landslide spatial probability assessment in North-Est Sicily, Italy

    NASA Astrophysics Data System (ADS)

    Trigila, Alessandro; Iadanza, Carla; Esposito, Carlo; Scarascia-Mugnozza, Gabriele

    2015-04-01

    first phase of the work aimed to identify the spatial relationships between the landslide locations and the 13 related factors by using the Frequency Ratio bivariate statistical method. The analysis was then carried out by adopting a multivariate statistical approach, according to the Logistic Regression technique and the Random Forests technique, which gave the best results in terms of AUC. The models were run and evaluated with different sample sizes, also taking into account the temporal variation of input variables such as areas burned by wildfire. The most significant outcomes of this work are the relevant influence of the sample size on the model results and the strong importance of some environmental factors (e.g. land use and wildfires) for the identification of the depletion zones of extremely rapid shallow landslides.

  10. Using ordinal logistic regression to evaluate the performance of laser-Doppler predictions of burn-healing time

    PubMed Central

    2009-01-01

    Background: Laser-Doppler imaging (LDI) of cutaneous blood flow is beginning to be used by burn surgeons to predict the healing time of burn wounds; predicted healing time is used to determine wound treatment as either dressings or surgery. In this paper, we do a statistical analysis of the performance of the technique. Methods: We used data from a study carried out by five burn centers: LDI was done once between days 2 and 5 post burn, and healing was assessed at both 14 days and 21 days post burn. Random-effects ordinal logistic regression and other models such as the continuation ratio model were used to model healing-time as a function of the LDI data, and of demographic and wound history variables. Statistical methods were also used to study the false-color palette, which enables the laser-Doppler imager to be used by clinicians as a decision-support tool. Results: Overall performance is that diagnoses are over 90% correct. Related questions addressed were what was the best blood flow summary statistic and whether, given the blood flow measurements, demographic and observational variables had any additional predictive power (age, sex, race, % total body surface area burned (%TBSA), site and cause of burn, day of LDI scan, burn center). It was found that mean laser-Doppler flux over a wound area was the best statistic, and that, given the same mean flux, women recover slightly more slowly than men. Further, the likely degradation in predictive performance on moving to a patient group with larger %TBSA than those in the data sample was studied, and shown to be small. Conclusion: Modeling healing time is a complex statistical problem, with random effects due to multiple burn areas per individual, and censoring caused by patients missing hospital visits and undergoing surgery. This analysis applies state-of-the-art statistical methods such as the bootstrap and permutation tests to a medical problem of topical interest. New medical findings are that age and %TBSA are

  11. Estimating allele dropout probabilities by logistic regression: Assessments using Applied Biosystems 3500xL and 3130xl Genetic Analyzers with various commercially available human identification kits.

    PubMed

    Inokuchi, Shota; Kitayama, Tetsushi; Fujii, Koji; Nakahara, Hiroaki; Nakanishi, Hiroaki; Saito, Kazuyuki; Mizuno, Natsuko; Sekiguchi, Kazumasa

    2016-03-01

    Phenomena called allele dropouts are often observed in crime stain profiles. Allele dropouts are generated because one of a pair of heterozygous alleles is underrepresented by stochastic influences and is indicated by a low peak detection threshold. Therefore, it is important that such risks are statistically evaluated. In recent years, attempts to interpret allele dropout probabilities by logistic regression using the information on peak heights have been reported. However, these previous studies are limited to the use of a human identification kit and fragment analyzer. In the present study, we calculated allele dropout probabilities by logistic regression using contemporary capillary electrophoresis instruments, 3500xL Genetic Analyzer and 3130xl Genetic Analyzer with various commercially available human identification kits such as AmpFℓSTR® Identifiler® Plus PCR Amplification Kit. Furthermore, the differences in logistic curves between peak detection thresholds using analytical threshold (AT) and values recommended by the manufacturer were compared. The standard logistic curves for calculating allele dropout probabilities from the peak height of sister alleles were characterized. The present study confirmed that ATs were lower than the values recommended by the manufacturer in human identification kits; therefore, it is possible to reduce allele dropout probabilities and obtain more information using AT as the peak detection threshold.

  12. Comparing machine learning and logistic regression methods for predicting hypertension using a combination of gene expression and next-generation sequencing data.

    PubMed

    Held, Elizabeth; Cape, Joshua; Tintle, Nathan

    2016-01-01

    Machine learning methods continue to show promise in the analysis of data from genetic association studies because of the high number of variables relative to the number of observations. However, few best practices exist for the application of these methods. We extend a recently proposed supervised machine learning approach for predicting disease risk by genotypes to be able to incorporate gene expression data and rare variants. We then apply 2 different versions of the approach (radial and linear support vector machines) to simulated data from Genetic Analysis Workshop 19 and compare performance to logistic regression. Method performance was not radically different across the 3 methods, although the linear support vector machine tended to show small gains in predictive ability relative to a radial support vector machine and logistic regression. Importantly, as the number of genes in the models was increased, even when those genes contained causal rare variants, model predictive ability showed a statistically significant decrease in performance for both the radial support vector machine and logistic regression. The linear support vector machine showed more robust performance to the inclusion of additional genes. Further work is needed to evaluate machine learning approaches on larger samples and to evaluate the relative improvement in model prediction from the incorporation of gene expression data.

  13. A local equation for differential diagnosis of β-thalassemia trait and iron deficiency anemia by logistic regression analysis in Southeast Iran.

    PubMed

    Sargolzaie, Narjes; Miri-Moghaddam, Ebrahim

    2014-01-01

    The most common differential diagnosis of β-thalassemia (β-thal) trait is iron deficiency anemia. Several red blood cell equations have been introduced in different studies for differential diagnosis between β-thal trait and iron deficiency anemia. Due to genetic variations in different regions, these equations cannot be useful in all populations. The aim of this study was to determine a native equation with high accuracy for differential diagnosis of β-thal trait and iron deficiency anemia for the Sistan and Baluchestan population by logistic regression analysis. We selected 77 iron deficiency anemia and 100 β-thal trait cases. We used binary logistic regression analysis and determined the best equation for probability prediction of β-thal trait against iron deficiency anemia in our population. We compared the diagnostic values and receiver operating characteristic (ROC) curve of this equation with those of another 10 published equations in discriminating β-thal trait and iron deficiency anemia. The binary logistic regression analysis determined the best equation for probability prediction of β-thal trait against iron deficiency anemia, with an area under the curve (AUC) of 0.998. Based on ROC curves and AUC, the Green & King, England & Frazer, and then Sirdah indices, respectively, had the highest accuracy after our equation. We suggest that to get the best equation and cut-off in each region, one needs to evaluate specific information of each region, specifically in areas where populations are homogeneous, to provide a specific formula for differentiating between β-thal trait and iron deficiency anemia.
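
    The comparison above rests on the area under the ROC curve. A minimal sketch of how an AUC can be computed from predicted probabilities follows, using the rank-based (Mann-Whitney) formulation; the scores and labels are hypothetical values chosen for illustration and are not taken from the study.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical predicted probabilities of beta-thal trait (label 1)
# versus iron deficiency anemia (label 0).
scores = [0.95, 0.80, 0.70, 0.40, 0.30, 0.10]
labels = [1, 1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)
print(auc)
```

    An AUC of 1.0 means every positive case outranks every negative case, so a reported value of 0.998 indicates near-perfect separation in the study sample.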

  14. Comparison of dimension reduction-based logistic regression models for case-control genome-wide association study: principal components analysis vs. partial least squares

    PubMed Central

    Yi, Honggang; Wo, Hongmei; Zhao, Yang; Zhang, Ruyang; Dai, Junchen; Jin, Guangfu; Ma, Hongxia; Wu, Tangchun; Hu, Zhibin; Lin, Dongxin; Shen, Hongbing; Chen, Feng

    2015-01-01

    Abstract With recent advances in biotechnology, genome-wide association study (GWAS) has been widely used to identify genetic variants that underlie human complex diseases and traits. In case-control GWAS, the typical statistical strategy is traditional logistic regression (LR) based on single-locus analysis. However, such a single-locus analysis leads to the well-known multiplicity problem, with a risk of inflating type I error and reducing power. Dimension reduction-based techniques, such as principal component-based logistic regression (PC-LR) and partial least squares-based logistic regression (PLS-LR), have recently gained much attention in the analysis of high dimensional genomic data. However, the performance of these methods is still not clear, especially in GWAS. We conducted simulations and a real data application to compare the type I error and power of PC-LR, PLS-LR and LR applicable to GWAS within a defined single nucleotide polymorphism (SNP) set region. We found that PC-LR and PLS-LR can reasonably control type I error under the null hypothesis. In contrast, LR, which is corrected by the Bonferroni method, was more conservative in all simulation settings. In particular, we found that PC-LR and PLS-LR had comparable power and they both outperformed LR, especially when the causal SNP was in high linkage disequilibrium with genotyped ones and had a small effect size in simulation. Based on SNP set analysis, we applied all three methods to analyze non-small cell lung cancer GWAS data. PMID:26243516

  15. Analysis of regression methods for solar activity forecasting

    NASA Technical Reports Server (NTRS)

    Lundquist, C. A.; Vaughan, W. W.

    1979-01-01

    The paper deals with the potential use of the most recent solar data to project trends in the next few years. Assuming that a mode of solar influence on weather can be identified, advantageous use of that knowledge presumably depends on estimating future solar activity. A frequently used technique for solar cycle predictions is a linear regression procedure along the lines formulated by McNish and Lincoln (1949). The paper presents a sensitivity analysis of the behavior of such regression methods relative to the following aspects: cycle minimum, time into cycle, composition of historical data base, and unnormalized vs. normalized solar cycle data. Comparative solar cycle forecasts for several past cycles are presented as to these aspects of the input data. Implications for the current cycle, No. 21, are also given.

  16. Batch Mode Active Learning for Regression With Expected Model Change.

    PubMed

    Cai, Wenbin; Zhang, Muhan; Zhang, Ya

    2016-04-20

    While active learning (AL) has been widely studied for classification problems, limited effort has been devoted to AL for regression. In this paper, we introduce a new AL framework for regression, expected model change maximization (EMCM), which aims at choosing the unlabeled data instances that result in the maximum change of the current model once labeled. The model change is quantified as the difference between the current model parameters and the updated parameters after the inclusion of the newly selected examples. In light of the stochastic gradient descent learning rule, we approximate the change as the gradient of the loss function with respect to each single candidate instance. Under the EMCM framework, we propose novel AL algorithms for the linear and nonlinear regression models. In addition, by simulating the behavior of the sequential AL policy when applied for k iterations, we further extend the algorithms to batch mode AL to simultaneously choose a set of k most informative instances at each query time. Extensive experimental results on both UCI and StatLib benchmark data sets have demonstrated that the proposed algorithms are highly effective and efficient.
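
    The selection rule described above can be sketched for a linear model with squared loss, where the gradient for a candidate x with label y is (f(x) - y) x. Because y is unknown before querying, this sketch substitutes the predictions of a small ensemble of alternative weight vectors for the label; that substitution, and all function names and numbers, are assumptions of the illustration and are not stated in the abstract. Only single-instance queries are shown, not the batch-mode extension.

```python
import math

def predict(w, x):
    """Linear model prediction f(x) = w . x."""
    return sum(wi * xi for wi, xi in zip(w, x))

def expected_model_change(w, ensemble, x):
    """Mean norm of the squared-loss gradient (f(x) - y) * x, with the unknown
    label y replaced by each ensemble member's prediction (an assumption)."""
    x_norm = math.sqrt(sum(xi * xi for xi in x))
    fx = predict(w, x)
    return sum(abs(fx - predict(wk, x)) for wk in ensemble) * x_norm / len(ensemble)

def select_query(w, ensemble, pool):
    """Index of the unlabeled instance with the largest expected change."""
    return max(range(len(pool)),
               key=lambda i: expected_model_change(w, ensemble, pool[i]))

# Hypothetical current model, ensemble members and unlabeled pool.
w = [1.0, 0.5]
ensemble = [[1.2, 0.5], [0.8, 0.5], [1.0, 0.6]]
pool = [[0.1, 0.1], [2.0, 0.0], [0.0, 1.0]]
chosen = select_query(w, ensemble, pool)
print(chosen)
```

    The score grows with both the disagreement about the label and the size of the input vector, so instances that are large in the directions where the ensemble disagrees are queried first.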

  17. Assessing the probability of acquisition of meticillin-resistant Staphylococcus aureus (MRSA) in a dog using a nested stochastic simulation model and logistic regression sensitivity analysis.

    PubMed

    Heller, J; Innocent, G T; Denwood, M; Reid, S W J; Kelly, L; Mellor, D J

    2011-05-01

    Meticillin-resistant Staphylococcus aureus (MRSA) is an important nosocomial and community-acquired pathogen with zoonotic potential. The relationship between MRSA in humans and companion animals is poorly understood. This study presents a quantitative exposure assessment, based on expert opinion and published data, in the form of a second order stochastic simulation model with accompanying logistic regression sensitivity analysis that aims to define the most important factors for MRSA acquisition in dogs. The simulation model was parameterised using expert opinion estimates, along with published and unpublished data. The outcome of the model was biologically plausible and found to be dominated by uncertainty over variability. The sensitivity analysis, in the form of four separate logistic regression models, found that both veterinary and non-veterinary routes of acquisition of MRSA are likely to be relevant for dogs. The effects of exposure to, and probability of, transmission of MRSA from the home environment were ranked as the most influential predictors in all sensitivity analyses, although it is unlikely that this environmental source of MRSA is independent of alternative sources of MRSA (human and/or animal). Exposure to and transmission from MRSA positive family members were also found to be influential for acquisition of MRSA in pet dogs, along with veterinary clinic attendance and, while exposure to and transmission from the veterinary clinic environment was also found to be influential, it was difficult to differentiate between the importance of independent sources of MRSA within the veterinary clinic. The implementation of logistic regression analyses directly to the input/output relationship within the simulation model presented in this paper represents the application of a variance based sensitivity analysis technique in the area of veterinary medicine and is a useful means of ranking the relative importance of input variables.

  18. Discriminating between adaptive and carcinogenic liver hypertrophy in rat studies using logistic ridge regression analysis of toxicogenomic data: The mode of action and predictive models.

    PubMed

    Liu, Shujie; Kawamoto, Taisuke; Morita, Osamu; Yoshinari, Kouichi; Honda, Hiroshi

    2017-03-01

    Chemical exposure often results in liver hypertrophy in animal tests, characterized by increased liver weight, hepatocellular hypertrophy, and/or cell proliferation. While most of these changes are considered adaptive responses, there is concern that they may be associated with carcinogenesis. In this study, we have employed a toxicogenomic approach using a logistic ridge regression model to identify genes responsible for liver hypertrophy and hypertrophic hepatocarcinogenesis and to develop a predictive model for assessing hypertrophy-inducing compounds. Logistic regression models have previously been used in the quantification of epidemiological risk factors. DNA microarray data from the Toxicogenomics Project-Genomics Assisted Toxicity Evaluation System were used to identify hypertrophy-related genes that are expressed differently in hypertrophy induced by carcinogens and non-carcinogens. Data were collected for 134 chemicals (72 non-hypertrophy-inducing chemicals, 27 hypertrophy-inducing non-carcinogenic chemicals, and 15 hypertrophy-inducing carcinogenic compounds). After applying logistic ridge regression analysis, 35 genes for liver hypertrophy (e.g., Acot1 and Abcc3) and 13 genes for hypertrophic hepatocarcinogenesis (e.g., Asns and Gpx2) were selected. The predictive models built using these genes were 94.8% and 82.7% accurate, respectively. Pathway analysis of the genes indicates that, aside from a xenobiotic metabolism-related pathway as an adaptive response for liver hypertrophy, amino acid biosynthesis and oxidative responses appear to be involved in hypertrophic hepatocarcinogenesis. Early detection and toxicogenomic characterization of liver hypertrophy using our models may be useful for predicting carcinogenesis. In addition, the identified genes provide novel insight into discrimination between adverse hypertrophy associated with carcinogenesis and adaptive hypertrophy in risk assessment.
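
    Logistic ridge regression adds an L2 penalty on the coefficients to the logistic log-loss, which keeps estimates stable when predictors are numerous relative to samples, as with microarray data. The following is a minimal sketch under stated assumptions: plain gradient descent, an unpenalised intercept, and a tiny hypothetical data set. It is not the fitting procedure or the data used in the study.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic_ridge(X, y, lam, lr=0.1, n_iter=2000):
    """Minimise mean log-loss + (lam / 2) * ||w||^2 by gradient descent.

    The intercept b is left unpenalised.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * xi[j] / n
        # The ridge term contributes lam * w to the gradient.
        w = [wj - lr * (gj + lam * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

# Hypothetical expression levels of two genes; label 1 = hypertrophy.
X = [[0.2, 1.0], [0.4, 0.8], [0.6, 0.9], [1.4, 1.1], [1.6, 0.9], [1.8, 1.0]]
y = [0, 0, 0, 1, 1, 1]

w_weak, _ = fit_logistic_ridge(X, y, lam=0.01)
w_strong, _ = fit_logistic_ridge(X, y, lam=1.0)
norm = lambda w: math.sqrt(sum(wj * wj for wj in w))
print(norm(w_weak), norm(w_strong))
```

    A larger penalty shrinks the coefficient vector toward zero, trading some fit on the training data for stability.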

  19. Logistic regression analyses for predicting clinically important differences in motor capacity, motor performance, and functional independence after constraint-induced therapy in children with cerebral palsy.

    PubMed

    Wang, Tien-ni; Wu, Ching-yi; Chen, Chia-ling; Shieh, Jeng-yi; Lu, Lu; Lin, Keh-chung

    2013-03-01

    Given the growing evidence for the effects of constraint-induced therapy (CIT) in children with cerebral palsy (CP), there is a need for investigating the characteristics of potential participants who may benefit most from this intervention. This study aimed to establish predictive models for the effects of pediatric CIT on motor and functional outcomes. Therapists administered CIT to 49 children (aged 3-11 years) with CP. Sessions were 1-3.5h a day, twice a week, for 3-4 weeks. Parents were asked to document the number of restraint hours outside of the therapy sessions. Domains of treatment outcomes included motor capacity (measured by the Peabody Developmental Motor Scales II), motor performance (measured by the Pediatric Motor Activity Log), and functional independence (measured by the Pediatric Functional Independence Measure). Potential predictors included age, affected side, compliance (measured by time of restraint), and the initial level of motor impairment severity. Tests were administered before, immediately after, and 3 months after the intervention. Logistic regression analyses showed that total amount of restraint time was the only significant predictor for improved motor capacity immediately after CIT. Younger children who restrained the less affected arm for a longer time had a greater chance to achieve clinically significant improvements in motor performance. For outcomes of functional independence in daily life, younger age was associated with clinically meaningful improvement in the self-care domain. Baseline motor abilities were significantly predictive of better improvement in mobility and cognition. Significant predictors varied according to the aspects of motor outcomes after 3 months of follow-up. The potential predictors identified in this study allow clinicians to target those children who may benefit most from CIT.

  20. Comparing Machine Learning Classifiers and Linear/Logistic Regression to Explore the Relationship between Hand Dimensions and Demographic Characteristics

    PubMed Central

    2016-01-01

    Understanding the relationship between physiological measurements from human subjects and their demographic data is important within both the biometric and forensic domains. In this paper we explore the relationship between measurements of the human hand and a range of demographic features. We assess the ability of linear regression and machine learning classifiers to predict demographics from hand features, thereby providing evidence on both the strength of relationship and the key features underpinning this relationship. Our results show that we are able to predict sex, height, weight and foot size accurately within various data-range bin sizes, with machine learning classification algorithms out-performing linear regression in most situations. In addition, we identify the features used to provide these relationships applicable across multiple applications. PMID:27806075

  1. Comparing Machine Learning Classifiers and Linear/Logistic Regression to Explore the Relationship between Hand Dimensions and Demographic Characteristics.

    PubMed

    Miguel-Hurtado, Oscar; Guest, Richard; Stevenage, Sarah V; Neil, Greg J; Black, Sue

    2016-01-01

    Understanding the relationship between physiological measurements from human subjects and their demographic data is important within both the biometric and forensic domains. In this paper we explore the relationship between measurements of the human hand and a range of demographic features. We assess the ability of linear regression and machine learning classifiers to predict demographics from hand features, thereby providing evidence on both the strength of relationship and the key features underpinning this relationship. Our results show that we are able to predict sex, height, weight and foot size accurately within various data-range bin sizes, with machine learning classification algorithms out-performing linear regression in most situations. In addition, we identify the features used to provide these relationships applicable across multiple applications.

  2. Models of logistic regression analysis, support vector machine, and back-propagation neural network based on serum tumor markers in colorectal cancer diagnosis.

    PubMed

    Zhang, B; Liang, X L; Gao, H Y; Ye, L S; Wang, Y G

    2016-05-13

    We evaluated the application of three machine learning algorithms, including logistic regression, support vector machine and back-propagation neural network, for diagnosing congenital heart disease and colorectal cancer. By inspecting related serum tumor marker levels in colorectal cancer patients and healthy subjects, early diagnosis models for colorectal cancer were built using three machine learning algorithms to assess their corresponding diagnostic values. Except for serum alpha-fetoprotein, the levels of 11 other serum markers of patients in the colorectal cancer group were higher than those in the benign colorectal cancer group (P < 0.05). The results of logistic regression analysis indicated that individual detection of serum carcinoembryonic antigens, CA199, CA242, CA125, and CA153 and their combined detection was effective for diagnosing colorectal cancer. Combined detection had a better diagnostic effect, with a sensitivity of 94.2% and specificity of 97.7%. Combining serum carcinoembryonic antigens, CA199, CA242, CA125, and CA153, the support vector machine and back-propagation neural network diagnosis models were built with diagnostic accuracies of 82 and 75%, sensitivities of 85 and 80%, and specificities of 80 and 70%, respectively. Colorectal cancer diagnosis models based on the three machine learning algorithms showed high diagnostic value and can help obtain evidence for the early diagnosis of colorectal cancer.

  3. A case study using support vector machines, neural networks and logistic regression in a GIS to identify wells contaminated with nitrate-N

    NASA Astrophysics Data System (ADS)

    Dixon, Barnali

    2009-09-01

    Accurate and inexpensive identification of potentially contaminated wells is critical for water resources protection and management. The objectives of this study are to 1) assess the suitability of approximation tools such as neural networks (NN) and support vector machines (SVM) integrated in a geographic information system (GIS) for identifying contaminated wells and 2) use logistic regression and feature selection methods to identify significant variables for transporting contaminants in and through the soil profile to the groundwater. Fourteen GIS derived soil hydrogeologic and landuse parameters were used as initial inputs in this study. Well water quality data (nitrate-N) from 6,917 wells provided by Florida Department of Environmental Protection (USA) were used as an output target class. The use of the logistic regression and feature selection methods reduced the number of input variables to nine. Receiver operating characteristics (ROC) curves were used for evaluation of these approximation tools. Results showed superior performance with the NN as compared to SVM especially on training data while testing results were comparable. Feature selection did not improve accuracy; however, it helped increase the sensitivity or true positive rate (TPR). Thus, a higher TPR was obtainable with fewer variables.

  4. 44 CFR 208.38 - Reimbursement for re-supply and logistics costs incurred during Activation.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 44 Emergency Management and Assistance 1 2010-10-01 2010-10-01 false Reimbursement for re-supply and logistics costs incurred during Activation. 208.38 Section 208.38 Emergency Management and...-supply and logistics costs incurred during Activation. With the exception of emergency...

  5. Logistic distributed activation energy model--Part 1: Derivation and numerical parametric study.

    PubMed

    Cai, Junmeng; Jin, Chuan; Yang, Songyuan; Chen, Yong

    2011-01-01

    A new distributed activation energy model is presented using the logistic distribution to mathematically represent the pyrolysis kinetics of complex solid fuels. A numerical parametric study of the logistic distributed activation energy model is conducted to evaluate the influences of the model parameters on the numerical results of the model. The parameters studied include the heating rate, reaction order, frequency factor, and the mean and standard deviation of the logistic activation energy distribution. The parametric study addresses the dependence of the forms of the calculated α-T and dα/dT-T curves on these parameters (α: reaction conversion, T: temperature). The study results would be very helpful to the application of the logistic distributed activation energy model, which is the main subject of the next part of this series.
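
    For reference, the logistic probability density parameterised by its mean and standard deviation, the two distribution parameters named in the abstract, can be written as follows. The symbols E_0 (mean activation energy) and σ (standard deviation) are notation chosen here and may differ from the paper's.

```latex
f(E) = \frac{\pi}{\sqrt{3}\,\sigma}\,
       \frac{\exp\!\left[-\dfrac{\pi\,(E - E_0)}{\sqrt{3}\,\sigma}\right]}
            {\left\{1 + \exp\!\left[-\dfrac{\pi\,(E - E_0)}{\sqrt{3}\,\sigma}\right]\right\}^{2}},
\qquad \int_{-\infty}^{\infty} f(E)\,\mathrm{d}E = 1 .
```

    The factor π/(√3 σ) arises because a logistic distribution with scale s has variance s²π²/3, so s = √3 σ/π.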

  6. [Multiple imputation and complete case analysis in logistic regression models: a practical assessment of the impact of incomplete covariate data].

    PubMed

    Camargos, Vitor Passos; César, Cibele Comini; Caiaffa, Waleska Teixeira; Xavier, Cesar Coelho; Proietti, Fernando Augusto

    2011-12-01

    Researchers in the health field often deal with the problem of incomplete databases. Complete Case Analysis (CCA), which restricts the analysis to subjects with complete data, reduces the sample size and may result in biased estimates. Based on statistical grounds, Multiple Imputation (MI) uses all collected data and is recommended as an alternative to CCA. Data from the study Saúde em Beagá, attended by 4,048 adults from two of nine health districts in the city of Belo Horizonte, Minas Gerais State, Brazil, in 2008-2009, were used to evaluate CCA and different MI approaches in the context of logistic models with incomplete covariate data. Peculiarities in some variables in this study allowed analyzing a situation in which the missing covariate data are recovered and thus the results before and after recovery are compared. Based on the analysis, even the more simplistic MI approach performed better than CCA, since it was closer to the post-recovery results.

  7. The Effect of Activity-Based Costing on Logistics Management

    DTIC Science & Technology

    1993-01-01

    361 46. Current Cost Accounting System Can Trace Costs to Specific Vendors ..................... 362 47. Will Need to Accurately...elsewhere in the firm? a. What problems have the logistics functions encountered with their current cost accounting systems? 13 b. What benefits the...have already recognized problems within their current cost accounting systems. He claims the systems ". . .give managers incorrect product costing

  8. The comparison of landslide ratio-based and general logistic regression landslide susceptibility models in the Chishan watershed after 2009 Typhoon Morakot

    NASA Astrophysics Data System (ADS)

    WU, Chunhung

    2015-04-01

    The research built the original logistic regression landslide susceptibility model (abbreviated as or-LRLSM) and the landslide ratio-based logistic regression landslide susceptibility model (abbreviated as lr-LRLSM), compared their performance, and explained the error sources of the two models. The research assumes that the performance of the logistic regression model can be better if the distributions of the landslide ratio and the weighted value of each variable are similar. The landslide ratio is the ratio of landslide area to total area in a specific area and a useful index for evaluating the seriousness of landslide disasters in Taiwan. The research adopted the landslide inventory induced by 2009 Typhoon Morakot in the Chishan watershed, which was the most serious disaster event of the last decade in Taiwan. The research adopted the 20 m grid as the basic unit in building the LRLSM, and six variables, including elevation, slope, aspect, geological formation, accumulated rainfall, and bank erosion, were included in the two models. In building the or-LRLSM, the six variables were divided into continuous variables (elevation, slope, and accumulated rainfall) and categorical variables (aspect, geological formation, and bank erosion), while in building the lr-LRLSM all variables were classified based on landslide ratio and treated as categorical. Because the number of basic units in the Chishan watershed was too large to process with commercial software, the research used random sampling instead of all basic units. The research adopted equal proportions of landslide and non-landslide units in the logistic regression analysis. The research drew 10 random samples and selected the group with the best Cox & Snell R2 and Nagelkerke R2 values as the database for the following analysis. 
Based on the best result from 10 random sampling groups, the or-LRLSM (lr-LRLSM) is significant at the 1% level with Cox & Snell R2 = 0.190 (0.196) and Nagelkerke R2

  9. Driver injury severity outcome analysis in rural interstate highway crashes: a two-level Bayesian logistic regression interpretation.

    PubMed

    Chen, Cong; Zhang, Guohui; Liu, Xiaoyue Cathy; Ci, Yusheng; Huang, Helai; Ma, Jianming; Chen, Yanyan; Guan, Hongzhi

    2016-12-01

    There is a high potential of severe injury outcomes in traffic crashes on rural interstate highways due to the significant amount of high speed traffic on these corridors. Hierarchical Bayesian models are capable of incorporating between-crash variance and within-crash correlations into traffic crash data analysis and are increasingly utilized in traffic crash severity analysis. This paper applies a hierarchical Bayesian logistic model to examine the significant factors at crash and vehicle/driver levels and their heterogeneous impacts on driver injury severity in rural interstate highway crashes. Analysis results indicate that the majority of the total variance is induced by the between-crash variance, showing the appropriateness of the utilized hierarchical modeling approach. Three crash-level variables and six vehicle/driver-level variables are found significant in predicting driver injury severities: road curve, maximum vehicle damage in a crash, number of vehicles in a crash, wet road surface, vehicle type, driver age, driver gender, driver seatbelt use and driver alcohol or drug involvement. Among these variables, road curve, functional and disabled vehicle damage in crash, single-vehicle crashes, female drivers, senior drivers, motorcycles and driver alcohol or drug involvement tend to increase the odds of drivers being incapably injured or killed in rural interstate crashes, while wet road surface, male drivers and driver seatbelt use are more likely to decrease the probability of severe driver injuries. The developed methodology and estimation results provide insightful understanding of the internal mechanism of rural interstate crashes and beneficial references for developing effective countermeasures for rural interstate crash prevention.

  10. The role of multicollinearity in landslide susceptibility assessment by means of Binary Logistic Regression: comparison between VIF and AIC stepwise selection

    NASA Astrophysics Data System (ADS)

    Cama, Mariaelena; Cristi Nicu, Ionut; Conoscenti, Christian; Quénéhervé, Geraldine; Maerker, Michael

    2016-04-01

    Landslide susceptibility can be defined as the likelihood of a landslide occurring in a given area on the basis of local terrain conditions. In the last decades much research has focused on its evaluation by means of stochastic approaches under the assumption that 'the past is the key to the future', which means that if a model is able to reproduce a known landslide spatial distribution, it will be able to predict the future locations of new (i.e. unknown) slope failures. Among the various stochastic approaches, Binary Logistic Regression (BLR) is one of the most used because it calculates the susceptibility in probabilistic terms and its results are easily interpretable from a geomorphological point of view. However, very often not much importance is given to multicollinearity assessment, whose effect is that the coefficient estimates are unstable, with opposite sign, and therefore difficult to interpret. Therefore, it should be evaluated every time in order to make a model whose results are geomorphologically correct. In this study the effects of multicollinearity on the predictive performance and robustness of landslide susceptibility models are analyzed. In particular, the multicollinearity is estimated by means of the Variance Inflation Factor (VIF), which is also used as a selection criterion for the independent variables (VIF Stepwise Selection) and compared to the more commonly used AIC Stepwise Selection. The robustness of the results is evaluated through 100 replicates of the dataset. The study area selected to perform this analysis is the Moldavian Plateau, where landslides are among the most frequent geomorphological processes. This area has an increasing trend of urbanization and a very high potential regarding the cultural heritage, being the place of discovery of the largest settlement belonging to the Cucuteni Culture from Eastern Europe (that led to the development of the great complex Cucuteni-Trypillia). Therefore, identifying the areas susceptible to

  11. lordif: An R Package for Detecting Differential Item Functioning Using Iterative Hybrid Ordinal Logistic Regression/Item Response Theory and Monte Carlo Simulations

    PubMed Central

    Choi, Seung W.; Gibbons, Laura E.; Crane, Paul K.

    2011-01-01

    Logistic regression provides a flexible framework for detecting various types of differential item functioning (DIF). Previous efforts extended the framework by using item response theory (IRT) based trait scores, and by employing an iterative process using group-specific item parameters to account for DIF in the trait scores, analogous to purification approaches used in other DIF detection frameworks. The current investigation advances the technique by developing a computational platform integrating both statistical and IRT procedures into a single program. Furthermore, a Monte Carlo simulation approach was incorporated to derive empirical criteria for various DIF statistics and effect size measures. For purposes of illustration, the procedure was applied to data from a questionnaire of anxiety symptoms for detecting DIF associated with age from the Patient-Reported Outcomes Measurement Information System. PMID:21572908

  12. The severity of Minamata disease declined in 25 years: temporal profile of the neurological findings analyzed by multiple logistic regression model.

    PubMed

    Uchino, Makoto; Hirano, Teruyuki; Satoh, Hiroshi; Arimura, Kimiyoshi; Nakagawa, Masanori; Wakamiya, Jyunji

    2005-01-01

    Minamata disease (MD) was caused by ingestion of seafood from the methylmercury-contaminated areas. Although 50 years have passed since the discovery of MD, there have been only a few studies on the temporal profile of neurological findings in certified MD patients. Thus, we evaluated changes in neurological symptoms and signs of MD using discriminants by multiple logistic regression analysis. The severity of predictive index declined in 25 years in most of the patients. Only a few patients showed aggravation of neurological findings, which was due to complications such as spino-cerebellar degeneration. Patients with chronic MD aged over 45 years had several concomitant diseases so that their clinical pictures were complicated. It was difficult to differentiate chronic MD using statistically established discriminants based on sensory disturbance alone. In conclusion, the severity of MD declined in 25 years along with the modification by age-related concomitant disorders.

  13. Spatio-temporal analyses of cropland degradation in the irrigated lowlands of Uzbekistan using remote-sensing and logistic regression modeling.

    PubMed

    Dubovyk, Olena; Menz, Gunter; Conrad, Christopher; Kan, Elena; Machwitz, Miriam; Khamzina, Asia

    2013-06-01

    Advancing land degradation in the irrigated areas of Central Asia hinders sustainable development of this predominantly agricultural region. To support decisions on mitigating cropland degradation, this study combines linear trend analysis and spatial logistic regression modeling to expose a land degradation trend in the Khorezm region, Uzbekistan, and to analyze the causes. Time series of the 250-m MODIS NDVI, summed over the growing seasons of 2000-2010, were used to derive areas with an apparent negative vegetation trend; this was interpreted as an indicator of land degradation. About one third (161,000 ha) of the region's area experienced negative trends of different magnitude. The vegetation decline was particularly evident on the low-fertility lands bordering on the natural sandy desert, suggesting that these areas should be prioritized in mitigation planning. The results of logistic modeling indicate that the spatial pattern of the observed trend is mainly associated with the level of the groundwater table (odds = 330 %), land-use intensity (odds = 103 %), low soil quality (odds = 49 %), slope (odds = 29 %), and salinity of the groundwater (odds = 26 %). Areas, threatened by land degradation, were mapped by fitting the estimated model parameters to available data. The elaborated approach, combining remote-sensing and GIS, can form the basis for developing a common tool for monitoring land degradation trends in irrigated croplands of Central Asia.
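
    As a hedged illustration of how odds figures like those quoted above relate to fitted coefficients: in a logistic model, a one-unit increase in a predictor with coefficient beta multiplies the odds by exp(beta). The sketch below shows only this general relation. The function names are hypothetical, and because the abstract does not state whether its percentages are odds ratios or percent changes in odds, no value from the study is reproduced.

```python
import math


def odds_ratio(beta: float) -> float:
    """Multiplicative change in the odds for a one-unit increase in a predictor."""
    return math.exp(beta)


def percent_change_in_odds(beta: float) -> float:
    """Same effect expressed as a percent change in the odds."""
    return (math.exp(beta) - 1.0) * 100.0


# Illustrative coefficient only (not taken from the study):
# beta = ln(2) doubles the odds, i.e. a 100 % increase.
beta = math.log(2.0)
print(odds_ratio(beta), percent_change_in_odds(beta))
```

    Reporting both forms side by side avoids the ambiguity between "odds ratio of 2" and "odds increased by 100 %".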

  14. Novel Logistic Regression Model of Chest CT Attenuation Coefficient Distributions for the Automated Detection of Abnormal (Emphysema or ILD) versus Normal Lung

    PubMed Central

    Chan, Kung-Sik; Jiao, Feiran; Mikulski, Marek A.; Gerke, Alicia; Guo, Junfeng; Newell, John D; Hoffman, Eric A.; Thompson, Brad; Lee, Chang Hyun; Fuortes, Laurence J.

    2015-01-01

    Rationale and Objectives We evaluated the role of an automated quantitative computed tomography (CT) scan interpretation algorithm in detecting Interstitial Lung Disease (ILD) and/or emphysema in a sample of elderly subjects with mild lung disease. We hypothesized that the quantification and distributions of CT attenuation values on lung CT, over a subset of Hounsfield Units (HU) range [−1000 HU, 0 HU], can differentiate early or mild disease from normal lung. Materials and Methods We compared results of quantitative spiral rapid end-exhalation (functional residual capacity; FRC) and end-inhalation (total lung capacity; TLC) CT scan analyses in 52 subjects with radiographic evidence of mild fibrotic lung disease to 17 normal subjects. Several CT value distributions were explored, including (i) that from the peripheral lung taken at TLC (with peels at 15 or 65 mm), (ii) the ratio of (i) to that from the core of lung, and (iii) the ratio of (ii) to its FRC counterpart. We developed a fused-lasso logistic regression model that can automatically identify sub-intervals of [−1000 HU, 0 HU] over which a CT value distribution provides optimal discrimination between abnormal and normal scans. Results The fused-lasso logistic regression model based on (ii) with 15 mm peel identified the relative frequency of CT values over [−1000, −900] and that over [−450, −200] HU as a means of discriminating abnormal versus normal, resulting in a zero out-of-sample false positive rate and a 15% false negative rate that was lowered to 12% by pooling information. Conclusions We demonstrated the potential usefulness of this novel quantitative imaging analysis method in discriminating ILD and/or emphysema from normal lungs. PMID:26776294

  15. Predicting Grade 3 Acute Diarrhea During Radiation Therapy for Rectal Cancer Using a Cutoff-Dose Logistic Regression Normal Tissue Complication Probability Model

    SciTech Connect

    Robertson, John M.; Soehn, Matthias; Yan Di

    2010-05-01

    Purpose: Understanding the dose-volume relationship of small bowel irradiation and severe acute diarrhea may help reduce the incidence of this side effect during adjuvant treatment for rectal cancer. Methods and Materials: Consecutive patients treated curatively for rectal cancer were reviewed, and the maximum grade of acute diarrhea was determined. The small bowel was outlined on the treatment planning CT scan, and a dose-volume histogram was calculated for the initial pelvic treatment (45 Gy). Logistic regression models were fitted for varying cutoff-dose levels from 5 to 45 Gy in 5-Gy increments. The model with the highest LogLikelihood was used to develop a cutoff-dose normal tissue complication probability (NTCP) model. Results: There were a total of 152 patients (48% preoperative, 47% postoperative, 5% other), predominantly treated prone (95%) with a three-field technique (94%) and a protracted venous infusion of 5-fluorouracil (78%). Acute Grade 3 diarrhea occurred in 21%. The largest LogLikelihood was found for the cutoff-dose logistic regression model with 15 Gy as the cutoff-dose, although the models for 20 Gy and 25 Gy had similar significance. According to this model, highly significant correlations (p <0.001) between small bowel volumes receiving at least 15 Gy and toxicity exist in the considered patient population. Similar findings applied to both the preoperatively (p = 0.001) and postoperatively irradiated groups (p = 0.001). Conclusion: The incidence of Grade 3 diarrhea was significantly correlated with the volume of small bowel receiving at least 15 Gy using a cutoff-dose NTCP model.
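
    The cutoff-dose selection described above (fit one logistic model per cutoff dose, keep the one with the largest log-likelihood) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the patient data are simulated, the volume measure is a hypothetical fraction of voxels at or above the cutoff, and all function names are invented for the example.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_likelihood(b0, b1, x, y):
    eps = 1e-12  # keep log() away from 0
    total = 0.0
    for xi, yi in zip(x, y):
        p = min(max(sigmoid(b0 + b1 * xi), eps), 1.0 - eps)
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total


def fit_logistic(x, y, iters=50):
    """One-predictor logistic fit by Newton-Raphson with step halving.

    Returns (intercept, slope, log-likelihood)."""
    b0 = b1 = 0.0
    ll = log_likelihood(b0, b1, x, y)
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det < 1e-12:  # predictor is (nearly) constant; stop here
            break
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        step = 1.0
        new_ll = log_likelihood(b0 + d0, b1 + d1, x, y)
        while new_ll < ll and step > 1e-6:  # halve the step until no worse
            step /= 2.0
            new_ll = log_likelihood(b0 + step * d0, b1 + step * d1, x, y)
        if new_ll < ll:
            break
        b0, b1 = b0 + step * d0, b1 + step * d1
        converged = new_ll - ll < 1e-10
        ll = new_ll
        if converged:
            break
    return b0, b1, ll


# Simulated cohort (assumption): 40 voxel doses per patient, toxicity
# generated from the fraction of voxels receiving at least 15 Gy.
random.seed(0)
patients = []
for _ in range(200):
    scale = random.uniform(0.3, 1.0)
    doses = [random.uniform(0.0, 50.0) * scale for _ in range(40)]
    v15 = sum(d >= 15 for d in doses) / len(doses)
    toxic = 1 if random.random() < sigmoid(-3.0 + 4.0 * v15) else 0
    patients.append((doses, toxic))

y = [toxic for _, toxic in patients]
cutoffs = list(range(5, 50, 5))  # 5 to 45 Gy in 5-Gy increments
logliks = {}
for c in cutoffs:
    x = [sum(d >= c for d in doses) / len(doses) for doses, _ in patients]
    logliks[c] = fit_logistic(x, y)[2]

best_cutoff = max(logliks, key=logliks.get)
print(best_cutoff, logliks[best_cutoff])
```

    Because every candidate model has the same number of parameters, comparing raw log-likelihoods is equivalent here to comparing AIC values.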

  16. Using occupancy modeling and logistic regression to assess the distribution of shrimp species in lowland streams, Costa Rica: Does regional groundwater create favorable habitat?

    USGS Publications Warehouse

    Snyder, Marcia; Freeman, Mary C.; Purucker, S. Thomas; Pringle, Catherine M.

    2016-01-01

    Freshwater shrimps are an important biotic component of tropical ecosystems. However, they can have a low probability of detection when abundances are low. We sampled 3 of the most common freshwater shrimp species, Macrobrachium olfersii, Macrobrachium carcinus, and Macrobrachium heterochirus, and used occupancy modeling and logistic regression models to improve our limited knowledge of distribution of these cryptic species by investigating both local- and landscape-scale effects at La Selva Biological Station in Costa Rica. Local-scale factors included substrate type and stream size, and landscape-scale factors included presence or absence of regional groundwater inputs. Capture rates for 2 of the sampled species (M. olfersii and M. carcinus) were sufficient to compare the fit of occupancy models. Occupancy models did not converge for M. heterochirus, but M. heterochirus had high enough occupancy rates that logistic regression could be used to model the relationship between occupancy rates and predictors. The best-supported models for M. olfersii and M. carcinus included conductivity, discharge, and substrate parameters. Stream size was positively correlated with occupancy rates of all 3 species. High stream conductivity, which reflects the quantity of regional groundwater input into the stream, was positively correlated with M. olfersii occupancy rates. Boulder substrates increased occupancy rate of M. carcinus and decreased the detection probability of M. olfersii. Our models suggest that shrimp distribution is driven by factors that function at local (substrate and discharge) and landscape (conductivity) scales.

  17. Binary Logistic Regression Versus Boosted Regression Trees in Assessing Landslide Susceptibility for Multiple-Occurring Regional Landslide Events: Application to the 2009 Storm Event in Messina (Sicily, southern Italy).

    NASA Astrophysics Data System (ADS)

    Lombardo, L.; Cama, M.; Maerker, M.; Parisi, L.; Rotigliano, E.

    2014-12-01

    This study aims at comparing the performances of Binary Logistic Regression (BLR) and Boosted Regression Trees (BRT) methods in assessing landslide susceptibility for multiple-occurrence regional landslide events within the Mediterranean region. A test area was selected in the north-eastern sector of Sicily (southern Italy), corresponding to the catchments of the Briga and the Giampilieri streams, both stretching for a few kilometres from the Peloritan ridge (eastern Sicily, Italy) to the Ionian sea. This area was struck on 1 October 2009 by an extreme climatic event resulting in thousands of rapid shallow landslides, mainly of the debris flow and debris avalanche types, involving the weathered layer of a low to high grade metamorphic bedrock. Exploiting the same set of predictors and the 2009 landslide archive, BLR- and BRT-based susceptibility models were obtained for the two catchments separately, adopting a random partition (RP) technique for validation; in addition, the models trained in one of the two catchments (Briga) were tested in predicting the landslide distribution in the other (Giampilieri), adopting a spatial partition (SP) based validation procedure. All the validation procedures were based on multi-fold tests so as to evaluate and compare the reliability of the fitting, the prediction skill, the coherence in the predictor selection and the precision of the susceptibility estimates. All the obtained models for the two methods produced very high predictive performances, with a general congruence between BLR and BRT in the predictor importance. In particular, the research highlighted that BRT models reached a higher prediction performance than BLR models for RP-based modelling, whilst for the SP-based models the difference in predictive skill between the two methods dropped drastically, converging to an analogous excellent performance. 
However, when looking at the precision of the probability estimates, BLR demonstrated to produce more robust

  18. The impact of meteorology on the occurrence of waterborne outbreaks of vero cytotoxin-producing Escherichia coli (VTEC): a logistic regression approach.

    PubMed

    O'Dwyer, Jean; Morris Downes, Margaret; Adley, Catherine C

    2016-02-01

    This study analyses the relationship between meteorological phenomena and outbreaks of waterborne-transmitted vero cytotoxin-producing Escherichia coli (VTEC) in the Republic of Ireland over an 8-year period (2005-2012). Data pertaining to the notification of waterborne VTEC outbreaks were extracted from the Computerised Infectious Disease Reporting system, which is administered through the national Health Protection Surveillance Centre as part of the Health Service Executive. Rainfall and temperature data were obtained from the national meteorological office and categorised as cumulative rainfall, heavy rainfall events in the previous 7 days, and mean temperature. Regression analysis was performed using logistic regression (LR) analysis. The LR model was significant (p < 0.001), with all independent variables: cumulative rainfall, heavy rainfall and mean temperature making a statistically significant contribution to the model. The study has found that rainfall, particularly heavy rainfall in the preceding 7 days of an outbreak, is a strong statistical indicator of a waterborne outbreak and that temperature also impacts waterborne VTEC outbreak occurrence.

  19. Large scale landslide susceptibility assessment using the statistical methods of logistic regression and BSA - study case: the sub-basin of the small Niraj (Transylvania Depression, Romania)

    NASA Astrophysics Data System (ADS)

    Roşca, S.; Bilaşco, Ş.; Petrea, D.; Fodorean, I.; Vescan, I.; Filip, S.; Măguţ, F.-L.

    2015-11-01

    The existence of a large number of GIS models for the identification of landslide occurrence probability makes it difficult to select a specific one. The present study focuses on the application of two quantitative models: the logistic and the BSA models. The comparative analysis of the results aims at identifying the most suitable model. The territory corresponding to the Niraj Mic Basin (87 km2) is an area characterised by a wide variety of landforms with their morphometric, morphographical and geological characteristics, as well as by a high complexity of land use types, where active landslides exist. This is the reason why it represents the test area for applying the two models and for the comparison of the results. The large complexity of input variables is illustrated by 16 factors which were represented as 72 dummy variables, analysed on the basis of their importance within the model structures. The testing of the statistical significance corresponding to each variable reduced the number of dummy variables to 12 which were considered significant for the test area within the logistic model, whereas for the BSA model all the variables were employed. The predictability degree of the models was tested through the identification of the area under the ROC curve which indicated a good accuracy (AUROC = 0.86 for the testing area) and predictability of the logistic model (AUROC = 0.63 for the validation area).
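
    For readers unfamiliar with the area under the ROC curve used above, the following sketch computes it from its rank-based definition: the probability that a randomly chosen positive unit (landslide) receives a higher score than a randomly chosen negative unit, counting ties as one half. The function name and the toy scores are assumptions for illustration and are unrelated to the study's data.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ordered correctly.
print(auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
```

    The pairwise loop is quadratic in the number of units; for large grids a sort-based rank formula gives the same value more efficiently.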

  20. Comparison and validation of Logistic Regression and Analytic Hierarchy Process models of landslide susceptibility in monoclinic regions. A case study in Moldavian Plateau, N-E Romania

    NASA Astrophysics Data System (ADS)

    Ciprian Margarint, Mihai; Niculita, Mihai

    2014-05-01

    The regions with monoclinic geological structure are large portions of earth surface where the repetition of similar landform patterns is very distinguished, the scarps of cuestas being characterized by similar values of morphometrical variables. Landslides are associated with these scarps of cuestas and consequently, a very high value of landslide susceptibility can be reported on their surface. In these regions, landslide susceptibility mapping can be realized for the entire region, or for test areas, with accurate, reliable, and available datasets, concerning multi-temporal inventories and landslide predictors. Because of the similar geomorphologic and landslide distribution, we think that if any relevance of using test areas for extrapolating susceptibility models is present, these areas should be targeted first. This case study tries to establish the level of usability of landslide predictor influence, obtained for a 90 km2 sample located in the northern part of the Moldavian Plateau (N-E Romania), in other areas of the same physio-geographic region. In a first phase, landslide susceptibility assessment was carried out and validated using a logistic regression (LR) approach and a multiple landslide inventory. This inventory was created using orthorectified aerial images from 1978 and 2005, with both old and active landslides being considered for each period. The modeling strategy was based on a distinct inventory of the depletion areas of all landslides for the 1978 phase, and on 30 covariates extracted from topographical and aerial images (both from the 1978 and 2005 periods). The geomorphometric variables were computed from a Digital Elevation Model (DEM) obtained by interpolation from 1:5000 contour data (2.5 m equidistance), at 10x10 m resolution. Distance from river network, distance from roads and land use were extracted from topographic maps and aerial images. By applying Akaike Information Criterion (AIC) the covariates with significance under 0.001 level

  1. Power of multifactor dimensionality reduction and penalized logistic regression for detecting gene-gene Interaction in a case-control study

    PubMed Central

    2009-01-01

    Background There is a growing awareness that interactions between multiple genes play an important role in the risk of common, complex multi-factorial diseases. Many common diseases are affected by certain genotype combinations (associated with some genes and their interactions). The identification and characterization of these susceptibility genes and gene-gene interactions have been limited by small sample sizes and the large number of potential interactions between genes. Several methods have been proposed to detect gene-gene interaction in a case-control study. The penalized logistic regression (PLR), a variant of logistic regression with L2 regularization, is a parametric approach to detect gene-gene interaction. On the other hand, the Multifactor Dimensionality Reduction (MDR) is a nonparametric and genetic model-free approach to detect genotype combinations associated with disease risk. Methods We compared the power of MDR and PLR for detecting two-way and three-way interactions in a case-control study through extensive simulations. We generated several interaction models with different magnitudes of interaction effect. For each model, we simulated 100 datasets, each with 200 cases and 200 controls and 20 SNPs. We considered a wide variety of models such as models with just main effects, models with only interaction effects or models with both main and interaction effects. We also compared the performance of MDR and PLR to detect gene-gene interaction associated with acute rejection (AR) in kidney transplant patients. Results In this paper, we have studied the power of MDR and PLR for detecting gene-gene interaction in a case-control study through extensive simulation. We have compared their performances for different two-way and three-way interaction models. We have studied the effect of different allele frequencies on these methods. We have also implemented their performance on a real dataset. As expected, none of these methods were consistently better for all

  2. Forest cover dynamics analysis and prediction modelling using logistic regression model (case study: forest cover at Indragiri Hulu Regency, Riau Province)

    NASA Astrophysics Data System (ADS)

    Nahib, Irmadi; Suryanta, Jaka

    2017-01-01

    Forest destruction, climate change and global warming could reduce an indirect forest benefit because forest is the largest carbon sink and plays a very important role in the global carbon cycle. To support the Reducing Emissions from Deforestation and Forest Degradation (REDD+) program, attention is paid to forest cover changes as the basis for calculating carbon stock changes. This study explores the forest cover dynamics as well as the prediction model of forest cover in Indragiri Hulu Regency, Riau Province, Indonesia. The study aims to analyse various explanatory variables associated with forest conversion processes and to predict forest cover change using a logistic regression model (LRM). The main data used in this study are land use/cover maps (1990 – 2011). Performance of the developed model was assessed through a comparison of the predicted forest cover change and the actual forest cover in 2011. The analysis showed that forest cover decreased continuously between 1990 and 2011, with a loss of 165,284.82 ha (35.19 %) of forest area. The LRM successfully predicted the forest cover for the period 2010 with reasonably high accuracy (ROC = 92.97 % and 70.26 %).

  3. Supply and demand analysis for flood insurance by using logistic regression model: case study at Citarum watershed in South Bandung, West Java, Indonesia

    NASA Astrophysics Data System (ADS)

    Sidi, P.; Mamat, M.; Sukono; Supian, S.

    2017-01-01

    Floods have always occurred in the Citarum river basin. The adverse effects of floods can extend to all of the residents' property, including the destruction of houses. The impact of damage to residential buildings is usually not small. After each flood, the government and several social organizations provide funds to repair buildings, but these donations are very limited and cannot cover the entire cost of the necessary repairs. The presence of insurance products for property damage caused by floods is therefore considered very important. However, is their presence also considered necessary by the public? In this paper, the factors that affect the supply and demand of insurance products for buildings damaged by floods are analyzed. The method used in this analysis is ordinal logistic regression. The analysis indicates that the factors affecting the supply and demand of insurance products for buildings damaged by floods include age, economic circumstances, family situations, insurance motivations, and lifestyle. Simultaneously, the factors affecting supply and demand of insurance products for buildings damaged by floods amounted to 65.7%.

  4. A Simulation Study to Assess the Effect of the Number of Response Categories on the Power of Ordinal Logistic Regression for Differential Item Functioning Analysis in Rating Scales

    PubMed Central

    Allahyari, Elahe; Jafari, Peyman; Bagheri, Zahra

    2016-01-01

    Objective. The present study uses simulated data to find what the optimal number of response categories is to achieve adequate power in ordinal logistic regression (OLR) model for differential item functioning (DIF) analysis in psychometric research. Methods. A hypothetical ten-item quality of life scale with three, four, and five response categories was simulated. The power and type I error rates of OLR model for detecting uniform DIF were investigated under different combinations of ability distribution (θ), sample size, sample size ratio, and the magnitude of uniform DIF across reference and focal groups. Results. When θ was distributed identically in the reference and focal groups, increasing the number of response categories from 3 to 5 resulted in an increase of approximately 8% in power of OLR model for detecting uniform DIF. The power of OLR was less than 0.36 when ability distribution in the reference and focal groups was highly skewed to the left and right, respectively. Conclusions. The clearest conclusion from this research is that the minimum number of response categories for DIF analysis using OLR is five. However, the impact of the number of response categories in detecting DIF was lower than might be expected. PMID:27403207

  5. Shallow landslide susceptibility model for the Oria river basin, Gipuzkoa province (North of Spain). Application of the logistic regression and comparison with previous studies.

    NASA Astrophysics Data System (ADS)

    Bornaetxea, Txomin; Antigüedad, Iñaki; Ormaetxea, Orbange

    2016-04-01

    In the Oria river basin (885 km²) shallow landslides are very frequent and they produce several roadblocks and damage to infrastructure and property, causing large economic losses every year. Considering that zoning the territory into different landslide susceptibility levels provides a useful tool for territorial planning and natural risk management, this study aims to identify the most landslide-prone places by applying an objective and reproducible methodology. To do so, a quantitative multivariate methodology, logistic regression, has been used. Fieldwork landslide points and randomly selected stable points have been used along with the independent variables Lithology, Land Use, Distance to the transport infrastructure, Altitude, Senoidal Slope and Normalized Difference Vegetation Index (NDVI) to produce a landslide susceptibility map. The model has been validated by the prediction and success rate curves and their corresponding area under the curve (AUC). In addition, the result has been compared to those from two landslide susceptibility models previously applied to the study area at different scales, namely ELSUS1000 version 1 (2013) and the Landslide Susceptibility Map of Gipuzkoa (2007). Validation results show an excellent prediction capacity of the proposed model (AUC 0.962), and comparisons highlight big differences with previous studies.
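
    As a rough illustration of the AUC validation step mentioned in this record, the sketch below computes the area under the ROC curve as the probability that a randomly chosen landslide point receives a higher susceptibility score than a randomly chosen stable point (ties count one half). The function name and the scores are hypothetical and are not taken from the study.

```python
def auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical model outputs (assumed for illustration only).
landslide_scores = [0.9, 0.8, 0.7, 0.4]   # points where a landslide was mapped
stable_scores = [0.6, 0.3, 0.2, 0.1]      # randomly selected stable points

print(auc(landslide_scores, stable_scores))
```

    The pairwise form is quadratic in the number of points, so a rank-based implementation would be preferable for a full susceptibility map.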

  6. Emergency department mental health presentations by people born in refugee source countries: an epidemiological logistic regression study in a Medicare Local region in Australia.

    PubMed

    Enticott, Joanne C; Cheng, I-Hao; Russell, Grant; Szwarc, Josef; Braitberg, George; Peek, Anne; Meadows, Graham

    2015-01-01

    This study investigated if people born in refugee source countries are disproportionately represented among those receiving a diagnosis of mental illness within emergency departments (EDs). The setting was the Cities of Greater Dandenong and Casey, the resettlement region for one-twelfth of Australia's refugees. An epidemiological, secondary data analysis compared mental illness diagnoses received in EDs by refugee and non-refugee populations. Data was the Victorian Emergency Minimum Dataset in the 2008-09 financial year. Univariate and multivariate logistic regression created predictive models for mental illness using five variables: age, sex, refugee background, interpreter use and preferred language. Collinearity, model fit and model stability were examined. Multivariate analysis showed age and sex to be the only significant risk factors for mental illness diagnosis in EDs. 'Refugee status', 'interpreter use' and 'preferred language' were not associated with a mental health diagnosis following risk adjustment for the effects of age and sex. The disappearance of the univariate association after adjustment for age and sex is a salutary lesson for Medicare Locals and other health planners regarding the importance of adjusting analyses of health service data for demographic characteristics.

  7. Logistic regression analysis of multiple noninvasive tests for the prediction of the presence and extent of coronary artery disease in men

    SciTech Connect

    Hung, J.; Chaitman, B.R.; Lam, J.; Lesperance, J.; Dupras, G.; Fines, P.; Cherkaoui, O.; Robert, P.; Bourassa, M.G.

    1985-08-01

    The incremental diagnostic yield of clinical data, exercise ECG, stress thallium scintigraphy, and cardiac fluoroscopy to predict coronary and multivessel disease was assessed in 171 symptomatic men by means of multiple logistic regression analyses. When clinical variables alone were analyzed, chest pain type and age were predictive of coronary disease, whereas chest pain type, age, a family history of premature coronary disease before age 55 years, and abnormal ST-T wave changes on the rest ECG were predictive of multivessel disease. The percentage of patients correctly classified by cardiac fluoroscopy (presence or absence of coronary artery calcification), exercise ECG, and thallium scintigraphy was 9%, 25%, and 50%, respectively, greater than for clinical variables, when the presence or absence of coronary disease was the outcome, and 13%, 25%, and 29%, respectively, when multivessel disease was studied; 5% of patients were misclassified. When the 37 clinical and noninvasive test variables were analyzed jointly, the most significant variable predictive of coronary disease was an abnormal thallium scan and for multivessel disease, the amount of exercise performed. The data from this study provide a quantitative model and confirm previous reports that optimal diagnostic efficacy is obtained when noninvasive tests are ordered sequentially. In symptomatic men, cardiac fluoroscopy is a relatively ineffective test when compared to exercise ECG and thallium scintigraphy.

  8. Spatial prediction of Lactarius deliciosus and Lactarius salmonicolor mushroom distribution with logistic regression models in the Kızılcasu Planning Unit, Turkey.

    PubMed

    Mumcu Kucuker, Derya; Baskent, Emin Zeki

    2015-01-01

    Integration of non-wood forest products (NWFPs) into forest management planning has become an increasingly important issue in forestry over the last decade. Among NWFPs, mushrooms are valued due to their medicinal, commercial, high nutritional and recreational importance. Commercial mushroom harvesting also provides important income to local dwellers and contributes to the economic value of regional forests. Sustainable management of these products at the regional scale requires information on their locations in diverse forest settings and the ability to predict and map their spatial distributions over the landscape. This study focuses on modeling the spatial distribution of commercially harvested Lactarius deliciosus and L. salmonicolor mushrooms in the Kızılcasu Forest Planning Unit, Turkey. The best models were developed based on topographic, climatic and stand characteristics, separately through logistic regression analysis using SPSS™. The best topographic model provided better classification success (69.3 %) than the best climatic (65.4 %) and stand (65 %) models. However, the overall best model, with 73 % overall classification success, used a mix of several variables. The best models were integrated into an Arc/Info GIS program to create spatial distribution maps of L. deliciosus and L. salmonicolor in the planning area. Our approach may be useful to predict the occurrence and distribution of other NWFPs and provide a valuable tool for designing silvicultural prescriptions and preparing multiple-use forest management plans.

  9. Quasi-Likelihood Techniques in a Logistic Regression Equation for Identifying Simulium damnosum s.l. Larval Habitats Intra-cluster Covariates in Togo

    PubMed Central

    JACOB, BENJAMIN G.; NOVAK, ROBERT J.; TOE, LAURENT; SANFO, MOUSSA S.; AFRIYIE, ABENA N.; IBRAHIM, MOHAMMED A.; GRIFFITH, DANIEL A.; UNNASCH, THOMAS R.

    2013-01-01

    The standard methods for regression analyses of clustered riverine larval habitat data of Simulium damnosum s.l., a major black-fly vector of onchocerciasis, postulate models relating observational ecological-sampled parameter estimators to prolific habitats without accounting for residual intra-cluster error correlation effects. Generally, this correlation comes from two sources: (1) the design of the random effects and their assumed covariance from the multiple levels within the regression model; and, (2) the correlation structure of the residuals. Unfortunately, inconspicuous errors in residual intra-cluster correlation estimates can overstate precision in forecasted S. damnosum s.l. riverine larval habitat explanatory attributes regardless of how they are treated (e.g., independent, autoregressive, Toeplitz, etc.). In this research, the geographical locations for multiple riverine-based S. damnosum s.l. larval ecosystem habitats sampled from 2 pre-established epidemiological sites in Togo were identified and recorded from July 2009 to June 2010. Initially the data was aggregated into proc genmod. An agglomerative hierarchical residual cluster-based analysis was then performed. The sampled clustered study site data was then analyzed for statistical correlations using Monthly Biting Rates (MBR). Euclidean distance measurements and terrain-related geomorphological statistics were then generated in ArcGIS. A digital overlay was then performed also in ArcGIS using the georeferenced ground coordinates of high and low density clusters stratified by Annual Biting Rates (ABR). This data was overlain onto multitemporal sub-meter pixel resolution satellite data (i.e., QuickBird 0.61m wavebands). Orthogonal spatial filter eigenvectors were then generated in SAS/GIS. Univariate and non-linear regression-based models (i.e., Logistic, Poisson and Negative Binomial) were also employed to determine probability distributions and to identify statistically significant parameter

  10. Quasi-Likelihood Techniques in a Logistic Regression Equation for Identifying Simulium damnosum s.l. Larval Habitats Intra-cluster Covariates in Togo.

    PubMed

    Jacob, Benjamin G; Novak, Robert J; Toe, Laurent; Sanfo, Moussa S; Afriyie, Abena N; Ibrahim, Mohammed A; Griffith, Daniel A; Unnasch, Thomas R

    2012-01-01

    The standard methods for regression analyses of clustered riverine larval habitat data of Simulium damnosum s.l., a major black-fly vector of onchocerciasis, postulate models relating observational ecological-sampled parameter estimators to prolific habitats without accounting for residual intra-cluster error correlation effects. Generally, this correlation comes from two sources: (1) the design of the random effects and their assumed covariance from the multiple levels within the regression model; and, (2) the correlation structure of the residuals. Unfortunately, inconspicuous errors in residual intra-cluster correlation estimates can overstate precision in forecasted S. damnosum s.l. riverine larval habitat explanatory attributes regardless of how they are treated (e.g., independent, autoregressive, Toeplitz, etc.). In this research, the geographical locations for multiple riverine-based S. damnosum s.l. larval ecosystem habitats sampled from 2 pre-established epidemiological sites in Togo were identified and recorded from July 2009 to June 2010. Initially the data was aggregated into proc genmod. An agglomerative hierarchical residual cluster-based analysis was then performed. The sampled clustered study site data was then analyzed for statistical correlations using Monthly Biting Rates (MBR). Euclidean distance measurements and terrain-related geomorphological statistics were then generated in ArcGIS. A digital overlay was then performed also in ArcGIS using the georeferenced ground coordinates of high and low density clusters stratified by Annual Biting Rates (ABR). This data was overlain onto multitemporal sub-meter pixel resolution satellite data (i.e., QuickBird 0.61m wavebands). Orthogonal spatial filter eigenvectors were then generated in SAS/GIS. Univariate and non-linear regression-based models (i.e., Logistic, Poisson and Negative Binomial) were also employed to determine probability distributions and to identify statistically significant parameter

  11. Unitary Response Regression Models

    ERIC Educational Resources Information Center

    Lipovetsky, S.

    2007-01-01

    The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…

  12. Yap/Taz transcriptional activity is essential for vascular regression via Ctgf expression and actin polymerization

    PubMed Central

    Nagasawa-Masuda, Ayumi; Terai, Kenta

    2017-01-01

    Vascular regression is essential to remove redundant vessels during the formation of an efficient vascular network that can transport oxygen and nutrient to every corner of the body. However, no mechanism is known to explain how major blood vessels regress during development. Here we use the dorsal part of the caudal vein plexus (dCVP) in Zebrafish to investigate the mechanism of regression and discover a new role of Yap/Taz in vascular regression. During regression, Yap/Taz is activated by blood circulation in the endothelial cells. This leads to induction of Ctgf and actin polymerization. Interference with Yap/Taz activation decreased Ctgf production, which decreased actin polymerization and vascular regression. These results implicate a novel role of Yap/Taz in vascular regression. PMID:28369143

  13. Predicting reintubation, prolonged mechanical ventilation and death in post-coronary artery bypass graft surgery: a comparison between artificial neural networks and logistic regression models

    PubMed Central

    Mendes, Renata G.; de Souza, César R.; Machado, Maurício N.; Correa, Paulo R.; Di Thommazo-Luporini, Luciana; Arena, Ross; Myers, Jonathan; Pizzolato, Ednaldo B.

    2015-01-01

    Introduction In coronary artery bypass (CABG) surgery, the common complications are the need for reintubation, prolonged mechanical ventilation (PMV) and death. Thus, a reliable model for the prognostic evaluation of those particular outcomes is a worthwhile pursuit. The existence of such a system would lead to better resource planning, cost reductions and an increased ability to guide preventive strategies. The aim of this study was to compare different methods – logistic regression (LR) and artificial neural networks (ANNs) – in accomplishing this goal. Material and methods Subjects undergoing CABG (n = 1315) were divided into training (n = 1053) and validation (n = 262) groups. The set of independent variables consisted of age, gender, weight, height, body mass index, diabetes, creatinine level, cardiopulmonary bypass, presence of preserved ventricular function, moderate and severe ventricular dysfunction and total number of grafts. The PMV was also an input for the prediction of death. The ability of ANN to discriminate outcomes was assessed using receiver-operating characteristic (ROC) analysis and the results were compared using a multivariate LR. Results The ROC curve areas for LR and ANN models, respectively, were: for reintubation 0.62 (CI: 0.50–0.75) and 0.65 (CI: 0.53–0.77); for PMV 0.67 (CI: 0.57–0.78) and 0.72 (CI: 0.64–0.81); and for death 0.86 (CI: 0.79–0.93) and 0.85 (CI: 0.80–0.91). No differences were observed between models. Conclusions The ANN has similar discriminating power in predicting reintubation, PMV and death outcomes. Thus, both models may be applicable as a predictor for these outcomes in subjects undergoing CABG. PMID:26322087

  14. Use of Logistic Regression for Prediction of the Fate of Staphylococcus aureus in Pasteurized Milk in the Presence of Two Lytic Phages

    PubMed Central

    Obeso, José M.; García, Pilar; Martínez, Beatriz; Arroyo-López, Francisco Noé; Garrido-Fernández, Antonio; Rodriguez, Ana

    2010-01-01

    The use of bacteriophages provides an attractive approach to the fight against food-borne pathogenic bacteria, since they can be found in different environments and are unable to infect humans, both characteristics of which support their use as biocontrol agents. Two lytic bacteriophages, vB_SauS-phiIPLA35 (phiIPLA35) and vB_SauS-phiIPLA88 (phiIPLA88), previously isolated from the dairy environment inhibited the growth of Staphylococcus aureus. To facilitate the successful application of both bacteriophages as biocontrol agents, probabilistic models for predicting S. aureus inactivation by the phages in pasteurized milk were developed. A linear logistic regression procedure was used to describe the survival/death interface of S. aureus after 8 h of storage as a function of the initial phage titer (2 to 8 log10 PFU/ml), initial bacterial contamination (2 to 6 log10 CFU/ml), and temperature (15 to 37°C). Two successive models were built, with the first including only data from the experimental design and a global one in which results derived from the validation experiments were also included. The temperature, interaction temperature-initial level of bacterial contamination, and initial level of bacterial contamination-phage titer contributed significantly to the first model prediction. However, only the phage titer and temperature were significantly involved in the global model prediction. The predictions of both models were fail-safe and highly consistent with the observed S. aureus responses. Nevertheless, the global model, deduced from a higher number of experiments (with a higher degree of freedom), was dependent on a lower number of variables and had an apparent better fit. Therefore, it can be considered a convenient evolution of the first model. Besides, the global model provides the minimum phage concentration (about 2 × 10⁸ PFU/ml) required to inactivate S. aureus in milk at different temperatures, irrespective of the bacterial contamination level. PMID

  15. Hip fracture risk assessment: artificial neural network outperforms conditional logistic regression in an age- and sex-matched case control study

    PubMed Central

    2013-01-01

    Background Osteoporotic hip fractures with a significant morbidity and excess mortality among the elderly have imposed huge health and economic burdens on societies worldwide. In this age- and sex-matched case control study, we examined the risk factors of hip fractures and assessed the fracture risk by conditional logistic regression (CLR) and ensemble artificial neural network (ANN). The performances of these two classifiers were compared. Methods The study population consisted of 217 pairs (149 women and 68 men) of fractures and controls with an age older than 60 years. All the participants were interviewed with the same standardized questionnaire including questions on 66 risk factors in 12 categories. Univariate CLR analysis was initially conducted to examine the unadjusted odds ratio of all potential risk factors. The significant risk factors were then tested by multivariate analyses. For fracture risk assessment, the participants were randomly divided into modeling and testing datasets for 10-fold cross validation analyses. The predicting models built by CLR and ANN in modeling datasets were applied to testing datasets for generalization study. The performances, including discrimination and calibration, were compared with non-parametric Wilcoxon tests. Results In univariate CLR analyses, 16 variables achieved significant level, and six of them remained significant in multivariate analyses, including low T score, low BMI, low MMSE score, milk intake, walking difficulty, and significant fall at home. For discrimination, ANN outperformed CLR in both 16- and 6-variable analyses in modeling and testing datasets (p?

  16. Improving Case-Based Reasoning Systems by Combining K-Nearest Neighbour Algorithm with Logistic Regression in the Prediction of Patients’ Registration on the Renal Transplant Waiting List

    PubMed Central

    Campillo-Gimenez, Boris; Jouini, Wassim; Bayat, Sahar; Cuggia, Marc

    2013-01-01

    Introduction Case-based reasoning (CBR) is an emerging decision making paradigm in medical research where new cases are solved relying on previously solved similar cases. Usually, a database of solved cases is provided, and every case is described through a set of attributes (inputs) and a label (output). Extracting useful information from this database can help the CBR system provide more reliable results on the yet to be solved cases. Objective We suggest a general framework where a CBR system, viz. K-Nearest Neighbour (K-NN) algorithm, is combined with various information obtained from a Logistic Regression (LR) model, in order to improve prediction of access to the transplant waiting list. Methods LR is applied, on the case database, to assign weights to the attributes as well as the solved cases. Thus, five possible decision making systems based on K-NN and/or LR were identified: a standalone K-NN, a standalone LR and three soft K-NN algorithms that rely on the weights based on the results of the LR. The evaluation was performed under two conditions, either using predictive factors known to be related to registration, or using a combination of factors related and not related to registration. Results and Conclusion The results show that our suggested approach, where the K-NN algorithm relies on both weighted attributes and cases, can efficiently deal with non-relevant attributes, whereas the four other approaches suffer from this kind of noisy setup. The robustness of this approach suggests interesting perspectives for medical problem solving tools using CBR methodology. PMID:24039730

  17. Comparing performances of logistic regression and neural networks for predicting melatonin excretion patterns in the rat exposed to ELF magnetic fields.

    PubMed

    Jahandideh, Samad; Abdolmaleki, Parviz; Movahedi, Mohammad Mehdi

    2010-02-01

    Various studies have been reported on the bioeffects of magnetic field exposure; however, no consensus or guideline is available for experimental designs relating to exposure conditions as yet. In this study, logistic regression (LR) and artificial neural networks (ANNs) were used in order to analyze and predict the melatonin excretion patterns in the rat exposed to extremely low frequency magnetic fields (ELF-MF). Subsequently, on a database containing 33 experiments, performances of LR and ANNs were compared through resubstitution and jackknife tests. Predictor variables were more effective parameters and included frequency, polarization, exposure duration, and strength of magnetic fields. Also, five performance measures including accuracy, sensitivity, specificity, Matthew's Correlation Coefficient (MCC) and normalized percentage, better than random (S) were used to evaluate the performance of models. The LR as a conventional model obtained poor prediction performance. Nonetheless, LR distinguished the duration of magnetic fields as a statistically significant parameter. Also, horizontal polarization of magnetic fields with the highest logit coefficient (or parameter estimate) with negative sign was found to be the strongest indicator for experimental designs relating to exposure conditions. This means that each experiment with horizontal polarization of magnetic fields has a higher probability to result in "not changed melatonin level" pattern. On the other hand, ANNs, a more powerful model which has not been introduced in predicting melatonin excretion patterns in the rat exposed to ELF-MF, showed high performance measure values and higher reliability, especially obtaining 0.55 value of MCC through jackknife tests. Obtained results showed that such predictor models are promising and may play a useful role in defining guidelines for experimental designs relating to exposure conditions. In conclusion, analysis of the bioelectromagnetic data could result in
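
    Four of the five performance measures named in this record (accuracy, sensitivity, specificity and Matthews correlation coefficient) can be computed from a 2×2 confusion matrix, as in the minimal sketch below. The counts are invented for illustration (they merely sum to 33, the number of experiments mentioned) and are not results from the study; the "normalized percentage better than random (S)" measure is not defined in the abstract and is left out.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity and MCC from confusion-matrix counts.

    Assumes at least one actual positive and one actual negative.
    """
    total = tp + fp + tn + fn
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        # By convention MCC is set to 0 when any marginal total is zero.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Hypothetical counts (assumed, not from the paper).
metrics = binary_metrics(tp=12, fp=4, tn=13, fn=4)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```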

  18. Predicting Antitumor Activity of Peptides by Consensus of Regression Models Trained on a Small Data Sample

    PubMed Central

    Radman, Andreja; Gredičak, Matija; Kopriva, Ivica; Jerić, Ivanka

    2011-01-01

    Predicting antitumor activity of compounds using regression models trained on a small number of compounds with measured biological activity is an ill-posed inverse problem. Yet, it occurs very often within the academic community. To counteract, up to some extent, overfitting problems caused by a small training data, we propose to use consensus of six regression models for prediction of biological activity of virtual library of compounds. The QSAR descriptors of 22 compounds related to the opioid growth factor (OGF, Tyr-Gly-Gly-Phe-Met) with known antitumor activity were used to train regression models: the feed-forward artificial neural network, the k-nearest neighbor, sparseness constrained linear regression, the linear and nonlinear (with polynomial and Gaussian kernel) support vector machine. Regression models were applied on a virtual library of 429 compounds that resulted in six lists with candidate compounds ranked by predicted antitumor activity. The highly ranked candidate compounds were synthesized, characterized and tested for an antiproliferative activity. Some of prepared peptides showed more pronounced activity compared with the native OGF; however, they were less active than highly ranked compounds selected previously by the radial basis function support vector machine (RBF SVM) regression model. The ill-posedness of the related inverse problem causes unstable behavior of trained regression models on test data. These results point to high complexity of prediction based on the regression models trained on a small data sample. PMID:22272081

  19. Using multiple linear regression model to estimate thunderstorm activity

    NASA Astrophysics Data System (ADS)

    Suparta, W.; Putro, W. S.

    2017-03-01

    This paper is aimed to develop a numerical model with the use of a nonlinear model to estimate the thunderstorm activity. Meteorological data such as Pressure (P), Temperature (T), Relative Humidity (H), cloud (C), Precipitable Water Vapor (PWV), and precipitation on a daily basis were used in the proposed method. The model was constructed with six configurations of input and one target output. The output tested in this work is the thunderstorm event when one-year data is used. Results showed that the model works well in estimating thunderstorm activities with the maximum epoch reaching 1000 iterations and the percent error was found below 50%. The model also found that the thunderstorm activities in May and October are detected higher than the other months due to the inter-monsoon season.

  20. Three-Level Mixed-Effects Logistic Regression Analysis Reveals Complex Epidemiology of Swine Rotaviruses in Diagnostic Samples from North America

    PubMed Central

    Diaz, Andres; Rossow, Stephanie; Ciarlet, Max; Marthaler, Douglas

    2016-01-01

    Rotaviruses (RV) are important causes of diarrhea in animals, especially in domestic animals. Of the 9 RV species, rotavirus A, B, and C (RVA, RVB, and RVC, respectively) had been established as important causes of diarrhea in pigs. The Minnesota Veterinary Diagnostic Laboratory receives swine stool samples from North America to determine the etiologic agents of disease. Between November 2009 and October 2011, 7,508 samples from pigs with diarrhea were submitted to determine if enteric pathogens, including RV, were present in the samples. All samples were tested for RVA, RVB, and RVC by real time RT-PCR. The majority of the samples (82%) were positive for RVA, RVB, and/or RVC. To better understand the risk factors associated with RV infections in swine diagnostic samples, three-level mixed-effects logistic regression models (3L-MLMs) were used to estimate associations among RV species, age, and geographical variability within the major swine production regions in North America. The conditional odds ratios (cORs) for RVA and RVB detection were lower for 1–3 day old pigs when compared to any other age group. However, the cOR of RVC detection in 1–3 day old pigs was significantly higher (p < 0.001) than pigs in the 4–20 days old and >55 day old age groups. Furthermore, pigs in the 21–55 day old age group had statistically higher cORs of RV co-detection compared to 1–3 day old pigs (p < 0.001). The 3L-MLMs indicated that RV status was more similar within states than among states or within each region. Our results indicated that 3L-MLMs are a powerful and adaptable tool to handle and analyze large-hierarchical datasets. In addition, our results indicated that, overall, swine RV epidemiology is complex, and RV species are associated with different age groups and vary by regions in North America. PMID:27145176

  1. Plasma Cholesterol–Induced Lesion Networks Activated before Regression of Early, Mature, and Advanced Atherosclerosis

    PubMed Central

    Björkegren, Johan L. M.; Hägg, Sara; Jain, Rajeev K.; Cedergren, Cecilia; Shang, Ming-Mei; Rossignoli, Aránzazu; Takolander, Rabbe; Melander, Olle; Hamsten, Anders; Michoel, Tom; Skogsberg, Josefin

    2014-01-01

    Plasma cholesterol lowering (PCL) slows and sometimes prevents progression of atherosclerosis and may even lead to regression. Little is known about how molecular processes in the atherosclerotic arterial wall respond to PCL and modify responses to atherosclerosis regression. We studied atherosclerosis regression and global gene expression responses to PCL (≥80%) and to atherosclerosis regression itself in early, mature, and advanced lesions. In atherosclerotic aortic wall from Ldlr−/−Apob 100/100 Mttp flox/floxMx1-Cre mice, atherosclerosis regressed after PCL regardless of lesion stage. However, near-complete regression was observed only in mice with early lesions; mice with mature and advanced lesions were left with regression-resistant, relatively unstable plaque remnants. Atherosclerosis genes responding to PCL before regression, unlike those responding to the regression itself, were enriched in inherited risk for coronary artery disease and myocardial infarction, indicating causality. Inference of transcription factor (TF) regulatory networks of these PCL-responsive gene sets revealed largely different networks in early, mature, and advanced lesions. In early lesions, PPARG was identified as a specific master regulator of the PCL-responsive atherosclerosis TF-regulatory network, whereas in mature and advanced lesions, the specific master regulators were MLL5 and SRSF10/XRN2, respectively. In a THP-1 foam cell model of atherosclerosis regression, siRNA targeting of these master regulators activated the time-point-specific TF-regulatory networks and altered the accumulation of cholesterol esters. We conclude that PCL leads to complete atherosclerosis regression only in mice with early lesions. Identified master regulators and related PCL-responsive TF-regulatory networks will be interesting targets to enhance PCL-mediated regression of mature and advanced atherosclerotic lesions. PMID:24586211

  2. Plasma cholesterol-induced lesion networks activated before regression of early, mature, and advanced atherosclerosis.

    PubMed

    Björkegren, Johan L M; Hägg, Sara; Talukdar, Husain A; Foroughi Asl, Hassan; Jain, Rajeev K; Cedergren, Cecilia; Shang, Ming-Mei; Rossignoli, Aránzazu; Takolander, Rabbe; Melander, Olle; Hamsten, Anders; Michoel, Tom; Skogsberg, Josefin

    2014-02-01

    Plasma cholesterol lowering (PCL) slows and sometimes prevents progression of atherosclerosis and may even lead to regression. Little is known about how molecular processes in the atherosclerotic arterial wall respond to PCL and modify responses to atherosclerosis regression. We studied atherosclerosis regression and global gene expression responses to PCL (≥80%) and to atherosclerosis regression itself in early, mature, and advanced lesions. In atherosclerotic aortic wall from Ldlr(-/-)Apob (100/100) Mttp (flox/flox)Mx1-Cre mice, atherosclerosis regressed after PCL regardless of lesion stage. However, near-complete regression was observed only in mice with early lesions; mice with mature and advanced lesions were left with regression-resistant, relatively unstable plaque remnants. Atherosclerosis genes responding to PCL before regression, unlike those responding to the regression itself, were enriched in inherited risk for coronary artery disease and myocardial infarction, indicating causality. Inference of transcription factor (TF) regulatory networks of these PCL-responsive gene sets revealed largely different networks in early, mature, and advanced lesions. In early lesions, PPARG was identified as a specific master regulator of the PCL-responsive atherosclerosis TF-regulatory network, whereas in mature and advanced lesions, the specific master regulators were MLL5 and SRSF10/XRN2, respectively. In a THP-1 foam cell model of atherosclerosis regression, siRNA targeting of these master regulators activated the time-point-specific TF-regulatory networks and altered the accumulation of cholesterol esters. We conclude that PCL leads to complete atherosclerosis regression only in mice with early lesions. Identified master regulators and related PCL-responsive TF-regulatory networks will be interesting targets to enhance PCL-mediated regression of mature and advanced atherosclerotic lesions.

  3. Basic Diagnosis and Prediction of Persistent Contrail Occurrence using High-resolution Numerical Weather Analyses/Forecasts and Logistic Regression. Part I: Effects of Random Error

    NASA Technical Reports Server (NTRS)

    Duda, David P.; Minnis, Patrick

    2009-01-01

    Straightforward application of the Schmidt-Appleman contrail formation criteria to diagnose persistent contrail occurrence from numerical weather prediction data is hindered by significant bias errors in the upper tropospheric humidity. Logistic models of contrail occurrence have been proposed to overcome this problem, but basic questions remain about how random measurement error may affect their accuracy. A set of 5000 synthetic contrail observations is created to study the effects of random error in these probabilistic models. The simulated observations are based on distributions of temperature, humidity, and vertical velocity derived from Advanced Regional Prediction System (ARPS) weather analyses. The logistic models created from the simulated observations were evaluated using two common statistical measures of model accuracy, the percent correct (PC) and the Hanssen-Kuipers discriminant (HKD). To convert the probabilistic results of the logistic models into a dichotomous yes/no choice suitable for the statistical measures, two critical probability thresholds are considered. The HKD scores are higher when the climatological frequency of contrail occurrence is used as the critical threshold, while the PC scores are higher when the critical probability threshold is 0.5. For both thresholds, typical random errors in temperature, relative humidity, and vertical velocity are found to be small enough to allow for accurate logistic models of contrail occurrence. The accuracy of the models developed from synthetic data is over 85 percent for both the prediction of contrail occurrence and non-occurrence, although in practice, larger errors would be anticipated.
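
    The two accuracy measures named in this abstract can be sketched in a few lines. The example below is only an illustration: the probabilities and observations are made-up numbers, not data from the report, and the function names are hypothetical. It converts probabilistic forecasts into yes/no forecasts at a chosen threshold, then computes the percent correct (PC) and the Hanssen-Kuipers discriminant (HKD, the hit rate minus the false alarm rate).

```python
def contingency_counts(probs, observed, threshold):
    """Count hits, misses, false alarms and correct negatives."""
    hits = misses = false_alarms = correct_negatives = 0
    for p, o in zip(probs, observed):
        forecast_yes = p >= threshold
        if forecast_yes and o:
            hits += 1
        elif forecast_yes and not o:
            false_alarms += 1
        elif o:
            misses += 1
        else:
            correct_negatives += 1
    return hits, misses, false_alarms, correct_negatives


def percent_correct(hits, misses, false_alarms, correct_negatives):
    """Fraction of all forecasts (yes and no) that were correct."""
    total = hits + misses + false_alarms + correct_negatives
    return (hits + correct_negatives) / total


def hanssen_kuipers(hits, misses, false_alarms, correct_negatives):
    """Hit rate minus false alarm rate."""
    hit_rate = hits / (hits + misses)
    false_alarm_rate = false_alarms / (false_alarms + correct_negatives)
    return hit_rate - false_alarm_rate


# Made-up forecast probabilities and observations (1 = event observed).
probs = [0.6, 0.3, 0.45, 0.35, 0.25, 0.15, 0.1, 0.1, 0.05, 0.05]
observed = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# Two candidate thresholds: 0.5 and the climatological event frequency.
climatology = sum(observed) / len(observed)
counts_half = contingency_counts(probs, observed, 0.5)
counts_clim = contingency_counts(probs, observed, climatology)

pc_half, hkd_half = percent_correct(*counts_half), hanssen_kuipers(*counts_half)
pc_clim, hkd_clim = percent_correct(*counts_clim), hanssen_kuipers(*counts_clim)
```

    With these invented numbers, the 0.5 threshold gives the higher PC and the climatological threshold gives the higher HKD, the same qualitative pattern the abstract reports. Other data could behave differently.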

  4. Effectiveness of Combining Statistical Tests and Effect Sizes When Using Logistic Discriminant Function Regression to Detect Differential Item Functioning for Polytomous Items

    ERIC Educational Resources Information Center

    Gómez-Benito, Juana; Hidalgo, Maria Dolores; Zumbo, Bruno D.

    2013-01-01

    The objective of this article was to find an optimal decision rule for identifying polytomous items with large or moderate amounts of differential functioning. The effectiveness of combining statistical tests with effect size measures was assessed using logistic discriminant function analysis and two effect size measures: R[superscript 2] and…

  5. RESEARCH PAPER: A logistic model for magnetic energy storage in solar active regions

    NASA Astrophysics Data System (ADS)

    Wang, Hua-Ning; Cui, Yan-Mei; He, Han

    2009-06-01

    Previous statistical analyses of a large number of SOHO/MDI full disk longitudinal magnetograms provided a result that demonstrated how responses of solar flares to photospheric magnetic properties can be fitted with sigmoid functions. A logistic model reveals that these fitted sigmoid functions might be related to the free energy storage process in solar active regions. Although this suggested model is rather simple, the free energy level of active regions can be estimated and the probability of a solar flare with importance over a threshold can be forecast within a given time window.

  6. Basic Diagnosis and Prediction of Persistent Contrail Occurrence using High-resolution Numerical Weather Analyses/Forecasts and Logistic Regression. Part II: Evaluation of Sample Models

    NASA Technical Reports Server (NTRS)

    Duda, David P.; Minnis, Patrick

    2009-01-01

    Previous studies have shown that probabilistic forecasting may be a useful method for predicting persistent contrail formation. A probabilistic forecast to accurately predict contrail formation over the contiguous United States (CONUS) is created by using meteorological data based on hourly meteorological analyses from the Advanced Regional Prediction System (ARPS) and from the Rapid Update Cycle (RUC) as well as GOES water vapor channel measurements, combined with surface and satellite observations of contrails. Two groups of logistic models were created. The first group of models (SURFACE models) is based on surface-based contrail observations supplemented with satellite observations of contrail occurrence. The second group of models (OUTBREAK models) is derived from a selected subgroup of satellite-based observations of widespread persistent contrails. The mean accuracies for both the SURFACE and OUTBREAK models typically exceeded 75 percent when based on the RUC or ARPS analysis data, but decreased when the logistic models were derived from ARPS forecast data.

  7. Impact of Colic Pain as a Significant Factor for Predicting the Stone Free Rate of One-Session Shock Wave Lithotripsy for Treating Ureter Stones: A Bayesian Logistic Regression Model Analysis

    PubMed Central

    Chung, Doo Yong; Cho, Kang Su; Lee, Dae Hun; Han, Jang Hee; Kang, Dong Hyuk; Jung, Hae Do; Kown, Jong Kyou; Ham, Won Sik; Choi, Young Deuk; Lee, Joo Yong

    2015-01-01

    Purpose: This study was conducted to evaluate colic pain as a prognostic pretreatment factor that can influence ureter stone clearance and to estimate the probability of stone-free status in shock wave lithotripsy (SWL) patients with a ureter stone. Materials and Methods: We retrospectively reviewed the medical records of 1,418 patients who underwent their first SWL between 2005 and 2013. Among these patients, 551 had a ureter stone measuring 4–20 mm and were thus eligible for our analyses. Colic pain as the chief complaint was defined as subjective flank pain during history taking and physical examination. Propensity scores for colic pain were calculated for each patient using multivariate logistic regression based upon the following covariates: age, maximal stone length (MSL), and mean stone density (MSD). Each factor was evaluated as a predictor of stone-free status by Bayesian and non-Bayesian logistic regression models. Results: After propensity-score matching, 217 patients were extracted in each group from the total patient cohort. There were no statistical differences in the variables used in propensity-score matching. One-session success and stone-free rates were higher in the painful group (73.7% and 71.0%, respectively) than in the painless group (63.6% and 60.4%, respectively). In multivariate non-Bayesian and Bayesian logistic regression models, a painful stone, shorter MSL, and lower MSD were significant factors for one-session stone-free status in patients who underwent SWL. Conclusions: Colic pain in patients with ureter calculi was, along with MSL and MSD, one of the significant predictors of one-session stone-free status after SWL. PMID:25902059

  8. Completing the Remedial Sequence and College-Level Credit-Bearing Math: Comparing Binary, Cumulative, and Continuation Ratio Logistic Regression Models

    ERIC Educational Resources Information Center

    Davidson, J. Cody

    2016-01-01

    Mathematics is the most common subject area of remedial need and the majority of remedial math students never pass a college-level credit-bearing math class. The majority of studies that investigate this phenomenon are conducted at community colleges and use some type of regression model; however, none have used a continuation ratio model. The…

  9. DNA Alkylating Therapy Induces Tumor Regression through an HMGB1-Mediated Activation of Innate Immunity

    PubMed Central

    Guerriero, Jennifer L.; Ditsworth, Dara; Catanzaro, Joseph M.; Sabino, Gregory; Furie, Martha B.; Kew, Richard R.; Crawford, Howard C.; Zong, Wei-Xing

    2011-01-01

    Dysregulation of apoptosis is associated with the development of human cancer and resistance to anticancer therapy. We have previously shown in tumor xenografts that DNA alkylating agents induce sporadic cell necrosis and regression of apoptosis-deficient tumors. Sporadic tumor cell necrosis is associated with extracellular release of cellular content such as the high mobility group box 1 (HMGB1) protein and subsequent recruitment of innate immune cells into the tumor tissue. It remained unclear whether HMGB1 and the activation of innate immunity played a role in tumor response to chemotherapy. In this study, we show that whereas DNA alkylating therapy leads to a complete tumor regression in an athymic mouse tumor xenograft model, it fails to do so in tumors deficient in HMGB1. The HMGB1-deficient tumors have an impaired ability to recruit innate immune cells including macrophages, neutrophils, and NK cells into the treated tumor tissue. Cytokine array analysis reveals that whereas DNA alkylating treatment leads to suppression of protumor cytokines such as IL-4, IL-10, and IL-13, loss of HMGB1 leads to elevated levels of these cytokines upon treatment. Suppression of innate immunity and HMGB1 using depleting Abs leads to a failure in tumor regression. Taken together, these results indicate that HMGB1 plays an essential role in activation of innate immunity and tumor clearance in response to DNA alkylating agents. PMID:21300822

  10. Evaluating the High Risk Groups for Suicide: A Comparison of Logistic Regression, Support Vector Machine, Decision Tree and Artificial Neural Network

    PubMed Central

    AMINI, Payam; AHMADINIA, Hasan; POOROLAJAL, Jalal; MOQADDASI AMIRI, Mohammad

    2016-01-01

    Background: We aimed to assess the high-risk group for suicide using different classification methods including logistic regression (LR), decision tree (DT), artificial neural network (ANN), and support vector machine (SVM). Methods: We used the dataset of a study conducted to predict risk factors of completed suicide in Hamadan Province, in the west of Iran, in 2010. To evaluate the high-risk groups for suicide, LR, SVM, DT and ANN were performed. The applied methods were compared using sensitivity, specificity, positive predictive value, negative predictive value, accuracy and the area under the curve. The Cochran Q test was used to check differences in proportions among methods. To assess the association between the observed and predicted values, the Ø coefficient, contingency coefficient, and Kendall tau-b were calculated. Results: Gender, age, and job were the most important risk factors for fatal suicide attempts common to all four methods. The SVM method showed the highest accuracy (0.68 and 0.67 for the training and testing samples, respectively). This method also yielded the highest specificity (0.67 for the training and 0.68 for the testing sample) and the highest sensitivity for the training sample (0.85), but the lowest sensitivity for the testing sample (0.53). The Cochran Q test indicated differences in proportions among the methods (P<0.001). For the association between SVM predictions and observed values, the Ø coefficient, contingency coefficient, and Kendall tau-b were 0.239, 0.232 and 0.239, respectively. Conclusion: SVM had the best performance in classifying fatal suicide attempts compared with DT, LR and ANN. PMID:27957463
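
    Several of the comparison measures listed in this abstract follow directly from a 2×2 confusion matrix. The sketch below is illustrative only: the counts are invented, not taken from the study, and `binary_metrics` is a hypothetical helper name. It computes sensitivity, specificity, positive and negative predictive values, accuracy, and the Ø (phi) coefficient.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Summary measures for a binary classifier from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives. All marginal totals are assumed to be non-zero.
    """
    total = tp + fp + fn + tn
    # Phi coefficient: association between predicted and observed labels.
    phi = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / total,
        "phi": phi,
    }


# Invented counts, for illustration only.
metrics = binary_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    The area under the curve, the Cochran Q test and Kendall tau-b need the individual predictions rather than the four counts, so they are left out of this sketch.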

  11. Intermittent Metronomic Drug Schedule Is Essential for Activating Antitumor Innate Immunity and Tumor Xenograft Regression

    PubMed Central

    Chen, Chong-Sheng; Doloff, Joshua C; Waxman, David J

    2014-01-01

    Metronomic chemotherapy using cyclophosphamide (CPA) is widely associated with antiangiogenesis; however, recent studies implicate other immune-based mechanisms, including antitumor innate immunity, which can induce major tumor regression in implanted brain tumor models. This study demonstrates the critical importance of drug schedule: CPA induced a potent antitumor innate immune response and tumor regression when administered intermittently on a 6-day repeating metronomic schedule but not with the same total exposure to activated CPA administered on an every 3-day schedule or using a daily oral regimen that serves as the basis for many clinical trials of metronomic chemotherapy. Notably, the more frequent metronomic CPA schedules abrogated the antitumor innate immune and therapeutic responses. Further, the innate immune response and antitumor activity both displayed an unusually steep dose-response curve and were not accompanied by antiangiogenesis. The strong recruitment of innate immune cells by the 6-day repeating CPA schedule was not sustained, and tumor regression was abolished, by a moderate (25%) reduction in CPA dose. Moreover, an ∼20% increase in CPA dose eliminated the partial tumor regression and weak innate immune cell recruitment seen in a subset of the every 6-day treated tumors. Thus, metronomic drug treatment must be at a sufficiently high dose but also sufficiently well spaced in time to induce strong sustained antitumor immune cell recruitment. Many current clinical metronomic chemotherapeutic protocols employ oral daily low-dose schedules that do not meet these requirements, suggesting that they may benefit from optimization designed to maximize antitumor immune responses. PMID:24563621

  12. Gender differences in French GPs' activity: the contribution of quantile regressions.

    PubMed

    Dumontet, Magali; Franc, Carine

    2015-05-01

    In any fee-for-service system, doctors may be encouraged to increase the number of services (private activity) they provide to receive a higher income. Studying private activity determinants helps to predict doctors' provision of care. In the context of strong feminization and heterogeneity in general practitioners' (GP) behavior, we first aim to measure the effects of the determinants of private activity. Second, we study the evolution of these effects along the private activity distribution. Third, we examine the differences between male and female GPs. From an exhaustive database of French GPs working in private practice in 2008, we performed an ordinary least squares (OLS) regression and quantile regressions (QR) on the GPs' private activity. Among other determinants, we examined the trade-offs within the GPs' household considering his/her marital status, spousal income, and children. While the OLS results showed that female GPs had less private activity than male GPs (-13%), the QR results emphasized a private activity gender gap that increased significantly in the upper tail of the distribution. We also find gender differences in the private activity determinants, including family structure, practice characteristics, and case-mix variables. For instance, having a youngest child under 12 years old had a positive effect on the level of private activity for male GPs and a negative effect for female GPs. The results allow us to understand to what extent the supply of care differs between male and female GPs. In the context of strong feminization, this is essential to consider for organizing and forecasting the GPs' supply of care.
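
    The difference between OLS and quantile regression mentioned in this abstract comes down to the loss each one minimises. As a minimal sketch with made-up numbers (not the French GP data) and hypothetical function names, the code below fits only a constant: squared error leads to the mean, while the "pinball" loss for quantile level tau leads to the tau-th sample quantile. That is why quantile regression can describe the upper tail of a distribution separately from its centre.

```python
def pinball_loss(values, candidate, tau):
    """Total quantile (pinball) loss of a constant prediction."""
    loss = 0.0
    for v in values:
        residual = v - candidate
        # Under-predictions are weighted by tau, over-predictions by 1 - tau.
        loss += tau * residual if residual >= 0 else (tau - 1) * residual
    return loss


def best_constant(values, tau):
    """Observed value that minimises the pinball loss.

    Searching only the observed values is enough, because the loss is
    piecewise linear with kinks at the data points.
    """
    return min(values, key=lambda c: pinball_loss(values, c, tau))


# Made-up, right-skewed activity levels.
activity = [1, 2, 3, 4, 100]

mean_fit = sum(activity) / len(activity)   # what squared error would pick
median_fit = best_constant(activity, 0.5)  # centre of the distribution
upper_fit = best_constant(activity, 0.9)   # upper tail of the distribution
```

    A full quantile regression replaces the constant with a linear function of the covariates and minimises the same loss, usually by linear programming.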

  13. KSC ISS Logistics Support

    NASA Technical Reports Server (NTRS)

    Tellado, Joseph

    2014-01-01

    The presentation contains a status of KSC ISS Logistics Operations. It presents current top-level ISS Logistics tasks being conducted at KSC, current International Partner activities, hardware processing flow focusing on late Stow operations, a list of KSC Logistics POCs, and a backup list of Logistics launch site services. This presentation is being given at the annual International Space Station (ISS) Multi-lateral Logistics Maintenance Control Panel meeting to be held in Turin, Italy during the week of May 13-16. The presentation content doesn't contain any potential lessons learned.

  14. Quantitative structure-activity relationship of the curcumin-related compounds using various regression methods

    NASA Astrophysics Data System (ADS)

    Khazaei, Ardeshir; Sarmasti, Negin; Seyf, Jaber Yousefi

    2016-03-01

    Quantitative structure-activity relationship (QSAR) modeling was used to study a series of curcumin-related compounds with inhibitory effects on prostate cancer PC-3 cells, pancreas cancer Panc-1 cells, and colon cancer HT-29 cells. The sphere exclusion method was used to split the data set into training and test sets. Multiple linear regression, principal component regression and partial least squares were used as the regression methods. In addition, to investigate the effect of feature selection methods, stepwise selection, a genetic algorithm, and simulated annealing were used. In two cases (PC-3 cells and Panc-1 cells), the best models were generated by a combination of multiple linear regression and stepwise selection (PC-3 cells: r2 = 0.86, q2 = 0.82, pred_r2 = 0.93, and r2m (test) = 0.43; Panc-1 cells: r2 = 0.85, q2 = 0.80, pred_r2 = 0.71, and r2m (test) = 0.68). For the HT-29 cells, principal component regression with stepwise selection (r2 = 0.69, q2 = 0.62, pred_r2 = 0.54, and r2m (test) = 0.41) was the best method. The QSAR study reveals descriptors that play a crucial role in the inhibitory property of curcumin-like compounds. 6ChainCount, T_C_C_1, and T_O_O_7 are the most important descriptors. To design and optimize novel, efficient curcumin-related compounds, it is useful to introduce heteroatoms such as nitrogen, oxygen, and sulfur atoms in the chemical structure (reducing the contribution of the T_C_C_1 descriptor) and to increase the contribution of the 6ChainCount and T_O_O_7 descriptors. The models can be useful in the better design of novel curcumin-related compounds that can be used in the treatment of prostate, pancreas, and colon cancers.
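
    For readers unfamiliar with the r2 and q2 statistics quoted in this abstract, the sketch below shows one common way to compute them. It is an illustration under stated assumptions, not the authors' procedure: it uses a single made-up descriptor, a simple least-squares line, leave-one-out cross-validation for q2, and the full-sample mean in the total sum of squares. All function names are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for one descriptor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def total_sum_of_squares(ys):
    mean_y = sum(ys) / len(ys)
    return sum((y - mean_y) ** 2 for y in ys)


def r_squared(xs, ys):
    """Fitted r2: 1 - (residual sum of squares / total sum of squares)."""
    slope, intercept = fit_line(xs, ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return 1 - ss_res / total_sum_of_squares(ys)


def q_squared_loo(xs, ys):
    """Leave-one-out q2: 1 - (PRESS / total sum of squares)."""
    press = 0.0
    for i in range(len(xs)):
        # Refit without compound i, then predict compound i.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    return 1 - press / total_sum_of_squares(ys)


# Made-up descriptor values and activities, for illustration only.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [2.1, 3.9, 6.2, 7.8, 10.1]

r2 = r_squared(descriptor, activity)
q2 = q_squared_loo(descriptor, activity)
```

    Because each left-out compound is predicted by a model that never saw it, q2 is never larger than r2 for a least-squares fit; a large gap between the two suggests overfitting.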

  15. The management challenge for household waste in emerging economies like Brazil: realistic source separation and activation of reverse logistics.

    PubMed

    Fehr, M

    2014-09-01

    Business opportunities in the household waste sector in emerging economies still revolve around the activities of bulk collection and tipping with an open material balance. This research, conducted in Brazil, pursued the objective of shifting opportunities from tipping to reverse logistics in order to close the balance. To do this, it illustrated how specific knowledge of sorted waste composition and reverse logistics operations can be used to determine realistic temporal and quantitative landfill diversion targets in an emerging economy context. Experimentation constructed and confirmed the recycling trilogy that consists of source separation, collection infrastructure and reverse logistics. The study on source separation demonstrated the vital difference between raw and sorted waste compositions. Raw waste contained 70% biodegradable and 30% inert matter. Source separation produced 47% biodegradable, 20% inert and 33% mixed material. The study on collection infrastructure developed the necessary receiving facilities. The study on reverse logistics identified private operators capable of collecting and processing all separated inert items. Recycling activities for biodegradable material were scarce and erratic. Only farmers would take the material as animal feed. No composting initiatives existed. The management challenge was identified as stimulating these activities in order to complete the trilogy and divert the 47% source-separated biodegradable discards from the landfills.

  16. Clinical regressions and broad immune activation following combination therapy targeting human NKT cells in myeloma

    PubMed Central

    Richter, Joshua; Neparidze, Natalia; Zhang, Lin; Nair, Shiny; Monesmith, Tamara; Sundaram, Ranjini; Miesowicz, Fred; Dhodapkar, Kavita M.

    2013-01-01

    Natural killer T (iNKT) cells can help mediate immune surveillance against tumors in mice. Prior studies targeting human iNKT cells were limited to therapy of advanced cancer and led to only modest activation of innate immunity. Clinical myeloma is preceded by an asymptomatic precursor phase. Lenalidomide was shown to mediate antigen-specific costimulation of human iNKT cells. We treated 6 patients with asymptomatic myeloma with 3 cycles of combination of α-galactosylceramide–loaded monocyte-derived dendritic cells and low-dose lenalidomide. Therapy was well tolerated and led to reduction in tumor-associated monoclonal immunoglobulin in 3 of 4 patients with measurable disease. Combination therapy led to activation-induced decline in measurable iNKT cells and activation of NK cells with an increase in NKG2D and CD56 expression. Treatment also led to activation of monocytes with an increase in CD16 expression. Each cycle of therapy was associated with induction of eosinophilia as well as an increase in serum soluble IL2 receptor. Clinical responses correlated with pre-existing or treatment-induced antitumor T-cell immunity. These data demonstrate synergistic activation of several innate immune cells by this combination and the capacity to mediate tumor regression. Combination therapies targeting iNKT cells may be of benefit toward prevention of cancer in humans (trial registered at clinicaltrials.gov: NCT00698776). PMID:23100308

  17. Clinical regressions and broad immune activation following combination therapy targeting human NKT cells in myeloma.

    PubMed

    Richter, Joshua; Neparidze, Natalia; Zhang, Lin; Nair, Shiny; Monesmith, Tamara; Sundaram, Ranjini; Miesowicz, Fred; Dhodapkar, Kavita M; Dhodapkar, Madhav V

    2013-01-17

    Natural killer T (iNKT) cells can help mediate immune surveillance against tumors in mice. Prior studies targeting human iNKT cells were limited to therapy of advanced cancer and led to only modest activation of innate immunity. Clinical myeloma is preceded by an asymptomatic precursor phase. Lenalidomide was shown to mediate antigen-specific costimulation of human iNKT cells. We treated 6 patients with asymptomatic myeloma with 3 cycles of combination of α-galactosylceramide-loaded monocyte-derived dendritic cells and low-dose lenalidomide. Therapy was well tolerated and led to reduction in tumor-associated monoclonal immunoglobulin in 3 of 4 patients with measurable disease. Combination therapy led to activation-induced decline in measurable iNKT cells and activation of NK cells with an increase in NKG2D and CD56 expression. Treatment also led to activation of monocytes with an increase in CD16 expression. Each cycle of therapy was associated with induction of eosinophilia as well as an increase in serum soluble IL2 receptor. Clinical responses correlated with pre-existing or treatment-induced antitumor T-cell immunity. These data demonstrate synergistic activation of several innate immune cells by this combination and the capacity to mediate tumor regression. Combination therapies targeting iNKT cells may be of benefit toward prevention of cancer in humans.

  18. A Latent Transition Model with Logistic Regression

    ERIC Educational Resources Information Center

    Chung, Hwan; Walls, Theodore A.; Park, Yousung

    2007-01-01

    Latent transition models increasingly include covariates that predict prevalence of latent classes at a given time or transition rates among classes over time. In many situations, the covariate of interest may be latent. This paper describes an approach for handling both manifest and latent covariates in a latent transition model. A Bayesian…

  19. 78 FR 73872 - Agency Information Collection Activities: Proposed Collection; Comment Request; Logistics...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-12-09

    ... or material, all submissions will be posted, without change, to the Federal eRulemaking Portal at... further enhance strengths. The LCAT is authorized by Public Law 109-295, Department of Homeland Security... their disaster logistics planning and response capabilities and identify areas of relative strength...

  20. Logistics of using the Actiheart physical activity monitors in urban Mexico among 7- to 9-year-old children.

    PubMed

    Wilson, Hannah; Dickinson, Federico; Griffiths, Paula; Bogin, Barry; Varela-Silva, Maria Inês

    2011-01-01

    Logistics of using new measurement devices are important to understand when developing protocols. This paper discusses the logistics of using Actiheart physical activity monitors on children in an urban, tropical environment in a developing country. Actiheart monitoring of 36 children aged 7-9 years was undertaken for 7 days in the city of Mérida, Yucatán, Mexico. The Actiheart proved fragile for children and difficult to mend in the field. Excessive sweating due to the tropical climate caused poor adherence of the electrode pads, requiring a pad change midway through and extra pads to be provided. Extra time also had to be allotted for additional instructions to participants and their mothers and for individual calibration. When collecting objectively measured physical activity data under harsh conditions, the protocol must accommodate local conditions and device limitations and allow increased time with participants to obtain good quality data.

  1. Epigenome-Guided Analysis of the Transcriptome of Plaque Macrophages during Atherosclerosis Regression Reveals Activation of the Wnt Signaling Pathway

    PubMed Central

    Menon, Prashanthi; Podolsky, Irina; Feig, Jonathan E.; Aderem, Alan; Fisher, Edward A.; Gold, Elizabeth S.

    2014-01-01

    We report the first systems biology investigation of regulators controlling arterial plaque macrophage transcriptional changes in response to lipid lowering in vivo in two distinct mouse models of atherosclerosis regression. Transcriptome measurements from plaque macrophages from the Reversa mouse were integrated with measurements from an aortic transplant-based mouse model of plaque regression. Functional relevance of the genes detected as differentially expressed in plaque macrophages in response to lipid lowering in vivo was assessed through analysis of gene functional annotations, overlap with in vitro foam cell studies, and overlap of associated eQTLs with human atherosclerosis/CAD risk SNPs. To identify transcription factors that control plaque macrophage responses to lipid lowering in vivo, we used an integrative strategy – leveraging macrophage epigenomic measurements – to detect enrichment of transcription factor binding sites upstream of genes that are differentially expressed in plaque macrophages during regression. The integrated analysis uncovered eight transcription factor binding site elements that were statistically overrepresented within the 5′ regulatory regions of genes that were upregulated in plaque macrophages in the Reversa model under maximal regression conditions and within the 5′ regulatory regions of genes that were upregulated in the aortic transplant model during regression. Of these, the TCF/LEF binding site was present in promoters of upregulated genes related to cell motility, suggesting that the canonical Wnt signaling pathway may be activated in plaque macrophages during regression. We validated this network-based prediction by demonstrating that β-catenin expression is higher in regressing (vs. control group) plaques in both regression models, and we further demonstrated that stimulation of canonical Wnt signaling increases macrophage migration in vitro. These results suggest involvement of canonical Wnt signaling in

  2. Transcriptome Analysis of Spermatogenically Regressed, Recrudescent and Active Phase Testis of Seasonally Breeding Wall Lizards Hemidactylus flaviviridis

    PubMed Central

    Gautam, Mukesh; Mathur, Amitabh; Khan, Meraj Alam; Majumdar, Subeer S.; Rai, Umesh

    2013-01-01

    Background: Reptiles are a phylogenetically important group of organisms, as mammals have evolved from them. The wall lizard testis exhibits clearly distinct morphology during the various phases of the reproductive cycle, making it an interesting model for studying the regulation of spermatogenesis. Studies on reptile spermatogenesis are scarce; hence this study will prove to be an important resource. Methodology/Principal Findings: Histological analyses show complete regression of seminiferous tubules during the regressed phase, with retracted Sertoli cells and spermatogonia. In the recrudescent phase, the regressed testis regains cellular activity, showing the presence of normal Sertoli cells and developing germ cells. In the active phase, the testis reaches its maximum size, with enlarged seminiferous tubules and sperm present in the seminiferous lumen. Total RNA extracted from whole testis of regressed, recrudescent and active phase wall lizards was hybridized on a Mouse Whole Genome 8×60 K format gene chip. Microarray data from the regressed phase were treated as the control group. Microarray data were validated by assessing the expression of selected genes using quantitative real-time PCR. The genes prominently expressed in recrudescent and active phase testis belong to cytoskeleton organization (GO:0005856), cell growth (GO:0045927), GTPase regulator activity (GO:0030695), transcription (GO:0006352), apoptosis (GO:0006915) and many other biological processes. The genes showing higher expression in the regressed phase belonged to functional categories such as negative regulation of macromolecule metabolic process (GO:0010605), negative regulation of gene expression (GO:0010629) and maintenance of stem cell niche (GO:0045165). Conclusion/Significance: This is the first exploratory study profiling the transcriptome of three drastically different conditions of any reptilian testis. The genes expressed in the testis during regressed, recrudescent and active phase of reproductive cycle are in concordance with the testis

  3. Logistic and linear regression model documentation for statistical relations between continuous real-time and discrete water-quality constituents in the Kansas River, Kansas, July 2012 through June 2015

    USGS Publications Warehouse

    Foster, Guy M.; Graham, Jennifer L.

    2016-04-06

    The Kansas River is a primary source of drinking water for about 800,000 people in northeastern Kansas. Source-water supplies are treated by a combination of chemical and physical processes to remove contaminants before distribution. Advanced notification of changing water-quality conditions and cyanobacteria and associated toxin and taste-and-odor compounds provides drinking-water treatment facilities time to develop and implement adequate treatment strategies. The U.S. Geological Survey (USGS), in cooperation with the Kansas Water Office (funded in part through the Kansas State Water Plan Fund), and the City of Lawrence, the City of Topeka, the City of Olathe, and Johnson County Water One, began a study in July 2012 to develop statistical models at two Kansas River sites located upstream from drinking-water intakes. Continuous water-quality monitors have been operated and discrete-water quality samples have been collected on the Kansas River at Wamego (USGS site number 06887500) and De Soto (USGS site number 06892350) since July 2012. Continuous and discrete water-quality data collected during July 2012 through June 2015 were used to develop statistical models for constituents of interest at the Wamego and De Soto sites. Logistic models to continuously estimate the probability of occurrence above selected thresholds were developed for cyanobacteria, microcystin, and geosmin. Linear regression models to continuously estimate constituent concentrations were developed for major ions, dissolved solids, alkalinity, nutrients (nitrogen and phosphorus species), suspended sediment, indicator bacteria (Escherichia coli, fecal coliform, and enterococci), and actinomycetes bacteria. These models will be used to provide real-time estimates of the probability that cyanobacteria and associated compounds exceed thresholds and of the concentrations of other water-quality constituents in the Kansas River. 
    The models documented in this report are useful for characterizing changes

  4. Sequential PET/CT with [18F]-FDG Predicts Pathological Tumor Response to Preoperative Short Course Radiotherapy with Delayed Surgery in Patients with Locally Advanced Rectal Cancer Using Logistic Regression Analysis

    PubMed Central

    Pecori, Biagio; Lastoria, Secondo; Caracò, Corradina; Celentani, Marco; Tatangelo, Fabiana; Avallone, Antonio; Rega, Daniela; De Palma, Giampaolo; Mormile, Maria; Budillon, Alfredo; Muto, Paolo; Bianco, Francesco; Aloj, Luigi; Petrillo, Antonella; Delrio, Paolo

    2017-01-01

    Previous studies indicate that FDG PET/CT may predict pathological response in patients undergoing neoadjuvant chemo-radiotherapy for locally advanced rectal cancer (LARC). The aim of the current study is to evaluate whether pathological response can be similarly predicted in LARC patients after short course radiation therapy alone. Methods: Thirty-three patients with cT2-3, N0-2, M0 rectal adenocarcinoma treated with hypofractionated short course neoadjuvant RT (5x5 Gy) with delayed surgery (SCRTDS) were prospectively studied. All patients underwent 3 PET/CT studies at baseline, 10 days from RT end (early), and 53 days from RT end (delayed). Maximal standardized uptake value (SUVmax), mean standardized uptake value (SUVmean) and total lesion glycolysis (TLG) of the primary tumor were measured and recorded at each PET/CT study. We use logistic regression analysis to aggregate different measures of metabolic response to predict the pathological response in the course of SCRTDS. Results: We provide straightforward formulas to classify response and estimate the probability of being a major responder (TRG1-2) or a complete responder (TRG1) for each individual. The formulas are based on the level of TLG at the early PET and on the overall proportional reduction of TLG between baseline and delayed PET studies. Conclusions: This study demonstrates that in the course of SCRTDS it is possible to estimate the probabilities of pathological tumor responses on the basis of PET/CT with FDG. Our formulas make it possible to assess the risks associated with LARC borne by a patient in the course of SCRTDS. These risk assessments can be balanced against other health risks associated with further treatments and can therefore be used to make informed therapy adjustments during SCRTDS. PMID:28060889

  5. A Bayesian Nonlinear Mixed-Effects Regression Model for the Characterization of Early Bactericidal Activity of Tuberculosis Drugs

    PubMed Central

    Burger, Divan Aristo; Schall, Robert

    2015-01-01

    Trials of the early bactericidal activity (EBA) of tuberculosis (TB) treatments assess the decline, during the first few days to weeks of treatment, in colony forming unit (CFU) count of Mycobacterium tuberculosis in the sputum of patients with smear-microscopy-positive pulmonary TB. Profiles over time of CFU data have conventionally been modeled using linear, bilinear, or bi-exponential regression. We propose a new biphasic nonlinear regression model for CFU data that comprises linear and bilinear regression models as special cases and is more flexible than bi-exponential regression models. A Bayesian nonlinear mixed-effects (NLME) regression model is fitted jointly to the data of all patients from a trial, and statistical inference about the mean EBA of TB treatments is based on the Bayesian NLME regression model. The posterior predictive distribution of relevant slope parameters of the Bayesian NLME regression model provides insight into the nature of the EBA of TB treatments; specifically, the posterior predictive distribution allows one to judge whether treatments are associated with monolinear or bilinear decline of log(CFU) count, and whether CFU count initially decreases fast, followed by a slower rate of decrease, or vice versa. PMID:25322214

  6. Regression-based prediction of net energy expenditure in children performing activities at high altitude.

    PubMed

    Sarton-Miller, Isabelle; Holman, Darryl J; Spielvogel, Hilde

    2003-01-01

    We developed a simple, non-invasive, and affordable method for estimating net energy expenditure (EE) in children performing activities at high altitude. A regression-based method predicts net oxygen consumption (VO₂) from net heart rate (HR) along with several covariates. The method is atypical in that the "net" measures are taken as the difference between exercise and resting VO₂ (ΔVO₂) and the difference between exercise and resting HR (ΔHR); ΔVO₂ partially corrects for resting metabolic rate and for posture, and ΔHR controls for inter-individual variation in physiology and for posture. Twenty children between 8 and 13 years of age, born and raised in La Paz, Bolivia (altitude 3,600 m), made up the reference sample. Anthropometric measures were taken, and VO₂ was assessed while the children performed graded exercise tests on a cycle ergometer. A repeated-measures prediction equation was developed, and maximum likelihood estimates of parameters were found from 75 observations on 20 children. The final model included the variables ΔHR, ΔHR², weight, and sex. The effectiveness of the method was established using leave-one-out cross-validation, yielding a prediction error rate of 0.126 for a mean ΔVO₂ of 0.693 (SD 0.315). The correlation between the predicted and measured ΔVO₂ was r = 0.917, suggesting that a useful prediction equation can be produced using paired VO₂ and HR measurements on a relatively small reference sample. The resulting prediction equation can be used for estimating EE from HR in free-living children performing habitual activities in the Bolivian Andes.
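
    The leave-one-out cross-validation step mentioned in this record can be sketched as below. This is a simplified stand-in, not the authors' repeated-measures model: it fits an ordinary least-squares line with a single predictor (net heart rate), uses root-mean-square error as the error measure (an assumption; the abstract does not define its "prediction error rate"), and runs on made-up numbers.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def leave_one_out_rmse(xs, ys):
    """Refit with each observation held out and score it on that observation."""
    squared_errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        squared_errors.append((ys[i] - (intercept + slope * xs[i])) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Made-up net heart rate (beats/min) and net oxygen consumption (L/min) pairs.
net_hr = [20.0, 35.0, 50.0, 65.0, 80.0, 95.0]
net_vo2 = [0.21, 0.38, 0.55, 0.70, 0.92, 1.05]
print(f"leave-one-out RMSE: {leave_one_out_rmse(net_hr, net_vo2):.3f}")
```

    Because every prediction is made for an observation the fit never saw, the resulting error is a less optimistic estimate than the in-sample residual error, which matters for a reference sample as small as the one described.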

  7. Immunological responsiveness against tumors induced by avian sarcoma virus: reduced expression of pp60src kinase activity in regressing tumors.

    PubMed Central

    Poulin, L; Grisé-Miron, L; Wainberg, M A

    1985-01-01

    Tumors which are induced in chickens by avian sarcoma virus frequently grow progressively for several weeks and then regress. We showed that tumor cells which are derived from the progressively growing phase of tumor growth produce large quantities of progeny-transforming virus, are reactive with antiviral antibody, and are susceptible to lysis in cell-mediated cytotoxicity assays by splenic lymphocytes of sensitized hosts. In contrast, tumor cells derived from regressing sarcomas are poor producers of progeny virus and are relatively unreactive with both antiviral antibody and sensitized lymphocytes. We further found that pp60src kinase activity was reduced by about 75% in regressing compared with progressively growing tumor cells. The half-lives of directly precipitable pp60src in tumor cells derived from progressively growing and regressing neoplasms were 6 and 1.5 h, respectively. Studies on each of three other cellular enzymes did not reveal any regression-associated decreases in enzyme activity. These data support the notion that expression of adequate levels of long-lived pp60src kinase activity is essential to progressive tumor growth. Images PMID:2579245

  8. Data mining methods in the prediction of Dementia: A real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests

    PubMed Central

    2011-01-01

    Background Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI), but has presently a limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning methods like Neural Networks, Support Vector Machines and Random Forests can improve accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven nonparametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, area under the ROC curve and Press' Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using Friedman's nonparametric test. Results Press' Q test showed that all classifiers performed better than chance alone (p < 0.05). Support Vector Machines showed the largest overall classification accuracy (Median (Me) = 0.76) and area under the ROC (Me = 0.90). However, this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forest ranked second in overall accuracy (Me = 0.73), with high area under the ROC (Me = 0.73), specificity (Me = 0.73) and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with acceptable area under the ROC (Me = 0.72), specificity (Me = 0.66) and sensitivity (Me = 0.64).
The remaining classifiers showed
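
    The three headline metrics compared in this record follow directly from confusion-matrix counts. The sketch below uses hypothetical counts (not the study's data) chosen to show how a classifier can combine high accuracy and perfect specificity with low sensitivity; the function name is an assumption for the example.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn: positive cases classified correctly/incorrectly.
    tn/fp: negative cases classified correctly/incorrectly.
    """
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),   # true-positive rate
        "specificity": tn / (tn + fp),   # true-negative rate
    }

# Hypothetical test fold: 10 positive cases and 20 negative cases.
metrics = classification_metrics(tp=3, fn=7, tn=20, fp=0)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

    When negative cases outnumber positive ones, labeling almost everything negative still yields a respectable accuracy, which is why sensitivity and specificity need to be read alongside it.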

  9. Multiple Logistic Regression Analysis of Risk Factors Associated with Denture Plaque and Staining in Chinese Removable Denture Wearers over 40 Years Old in Xi’an – a Cross-Sectional Study

    PubMed Central

    Chai, Zhiguo; Chen, Jihua; Zhang, Shaofeng

    2014-01-01

    Background Removable dentures are subject to plaque and/or staining problems. Denture hygiene habits and risk factors differ among countries and regions. The aims of this study were to assess hygiene habits and denture plaque and staining risk factors in Chinese removable denture wearers aged >40 years in Xi’an through multiple logistic regression analysis (MLRA). Methods Questionnaires were administered to 222 patients whose removable dentures were examined clinically to assess wear status and levels of plaque and staining. Univariate analyses were performed to identify potential risk factors for denture plaque/staining. MLRA was performed to identify significant risk factors. Results Brushing (77.93%) was the most prevalent cleaning method in the present study. Only 16.4% of patients regularly used commercial cleansers. Most (81.08%) patients removed their dentures overnight. MLRA indicated that potential risk factors for denture plaque were the duration of denture use (reference, ≤0.5 years; 2.1–5 years: OR = 4.155, P = 0.001; >5 years: OR = 7.238, P<0.001) and cleaning method (reference, chemical cleanser; running water: OR = 7.081, P = 0.010; brushing: OR = 3.567, P = 0.005). Potential risk factors for denture staining were female gender (OR = 0.377, P = 0.013), smoking (OR = 5.471, P = 0.031), tea consumption (OR = 3.957, P = 0.002), denture scratching (OR = 4.557, P = 0.036), duration of denture use (reference, ≤0.5 years; 2.1–5 years: OR = 7.899, P = 0.001; >5 years: OR = 27.226, P<0.001), and cleaning method (reference, chemical cleanser; running water: OR = 29.184, P<0.001; brushing: OR = 4.236, P = 0.007). Conclusion Denture hygiene habits need further improvement. An understanding of the risk factors for denture plaque and staining may provide the basis for preventive efforts. PMID:24498369
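
    The odds ratios reported in this record map onto logistic regression coefficients through OR = exp(beta). The sketch below converts three of the quoted staining odds ratios back to coefficients; the function names are assumptions for the example, and no model is refitted.

```python
import math

def coefficient_from_odds_ratio(odds_ratio):
    """Logistic regression coefficient (log-odds change) for a given odds ratio."""
    return math.log(odds_ratio)

def odds_ratio_from_coefficient(beta):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(beta)

# Odds ratios quoted in the abstract for denture staining.
reported = {"female gender": 0.377, "smoking": 5.471, "tea consumption": 3.957}
for factor, odds_ratio in reported.items():
    beta = coefficient_from_odds_ratio(odds_ratio)
    direction = "higher" if beta > 0 else "lower"
    print(f"{factor}: OR={odds_ratio}, beta={beta:+.3f} ({direction} odds than reference)")
```

    An odds ratio below 1 corresponds to a negative coefficient, that is, lower odds of the outcome than in the reference category.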

  10. Prediction of Inhibitory Activity of Epidermal Growth Factor Receptor Inhibitors Using Grid Search-Projection Pursuit Regression Method

    PubMed Central

    Du, Hongying; Hu, Zhide; Bazzoli, Andrea; Zhang, Yang

    2011-01-01

    The epidermal growth factor receptor (EGFR) protein tyrosine kinase (PTK) is an important protein target for anti-tumor drug discovery. To identify potential EGFR inhibitors, we conducted a quantitative structure–activity relationship (QSAR) study on the inhibitory activity of a series of quinazoline derivatives against EGFR tyrosine kinase. Two 2D-QSAR models were developed based on the best multi-linear regression (BMLR) and grid-search assisted projection pursuit regression (GS-PPR) methods. The results demonstrate that the inhibitory activity of quinazoline derivatives is strongly correlated with their polarizability, activation energy, mass distribution, connectivity, and branching information. Although the present investigation focused on EGFR, the approach provides a general avenue in the structure-based drug development of different protein receptor inhibitors. PMID:21811593

  11. Comparison and applicability of landslide susceptibility models based on landslide ratio-based logistic regression, frequency ratio, weight of evidence, and instability index methods in an extreme rainfall event

    NASA Astrophysics Data System (ADS)

    Wu, Chunhung

    2016-04-01

    Few studies have discussed the applicability of statistical landslide susceptibility (LS) models to extreme rainfall-induced landslide events. This research focuses on the comparison and applicability of LS models based on four methods, including landslide ratio-based logistic regression (LRBLR), frequency ratio (FR), weight of evidence (WOE), and instability index (II) methods, in an extreme rainfall-induced landslide case. The landslide inventory in the Chishan river watershed, Southwestern Taiwan, after 2009 Typhoon Morakot is the main material in this research. The Chishan river watershed is a tributary watershed of Kaoping river watershed, which is a landslide- and erosion-prone watershed with an annual average suspended load of 3.6×10^7 MT/yr (ranks 11th in the world). Typhoon Morakot struck Southern Taiwan from Aug. 6-10 in 2009 and dumped nearly 2,000 mm of rainfall in the Chishan river watershed. The 24-hour, 48-hour, and 72-hour accumulated rainfall in the Chishan river watershed exceeded the 200-year return period accumulated rainfall. 2,389 landslide polygons in the Chishan river watershed were extracted from SPOT 5 images after 2009 Typhoon Morakot. The total landslide area is around 33.5 km², equal to a landslide ratio of 4.1%. The main landslide types based on Varnes' (1978) classification are rotational and translational slides. The two characteristics of an extreme rainfall-induced landslide event are dense landslide distribution and large occupation of downslope landslide areas owing to headward erosion and bank erosion in the flooding processes. The area of downslope landslide in the Chishan river watershed after 2009 Typhoon Morakot is 3.2 times higher than that of upslope landslide areas. The prediction accuracies of the LS models based on the LRBLR, FR, WOE, and II methods have all been shown to exceed 70%. The model performance and applicability of four models in a landslide-prone watershed with dense distribution of rainfall

  12. Comparisons of different regressions tools in measurement of antioxidant activity in green tea using near infrared spectroscopy.

    PubMed

    Chen, Quansheng; Guo, Zhiming; Zhao, Jiewen; Ouyang, Qin

    2012-02-23

    To rapidly and efficiently measure antioxidant activity (AA) in green tea, near infrared (NIR) spectroscopy was employed with the help of a regression tool in this work. Three different linear and nonlinear regression tools (i.e., partial least squares (PLS), back propagation artificial neural network (BP-ANN), and support vector machine regression (SVMR)) were systematically studied and compared in developing the model. The model was optimized by leave-one-out cross-validation, and its performance was tested according to root mean square error of prediction (RMSEP) and correlation coefficient (R(p)) in the prediction set. Experimental results showed that the performance of the SVMR model was superior to the others, and the optimum results of the SVMR model were achieved as follows: RMSEP=0.02161 and R(p)=0.9691 in the prediction set. The overall results sufficiently demonstrate that NIR spectroscopy coupled with the SVMR regression tool has the potential to measure AA in green tea.
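
    The two figures of merit quoted in this record, RMSEP and the prediction-set correlation coefficient, can be computed as sketched below. The reference and predicted values are made up for illustration and are not the study's measurements; the function names are assumptions.

```python
import math

def rmsep(observed, predicted):
    """Root mean square error of prediction over a held-out prediction set."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)

# Made-up reference values and model predictions for a small prediction set.
reference = [0.52, 0.61, 0.70, 0.78, 0.85]
predicted = [0.50, 0.63, 0.69, 0.80, 0.84]
print(f"RMSEP = {rmsep(reference, predicted):.4f}")
print(f"R(p)  = {pearson_r(reference, predicted):.4f}")
```

    RMSEP is expressed in the units of the measured quantity, while the correlation coefficient is unitless, so the two are usually reported together.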

  13. Improving model predictions for RNA interference activities that use support vector machine regression by combining and filtering features

    PubMed Central

    Peek, Andrew S

    2007-01-01

    Background RNA interference (RNAi) is a naturally occurring phenomenon that results in the suppression of a target RNA sequence utilizing a variety of possible methods and pathways. To dissect the factors that result in effective siRNA sequences a regression kernel Support Vector Machine (SVM) approach was used to quantitatively model RNA interference activities. Results Eight overall feature mapping methods were compared in their abilities to build SVM regression models that predict published siRNA activities. The primary factors in predictive SVM models are position specific nucleotide compositions. The secondary factors are position independent sequence motifs (N-grams) and guide strand to passenger strand sequence thermodynamics. Finally, the factors that are least contributory but are still predictive of efficacy are measures of intramolecular guide strand secondary structure and target strand secondary structure. Of these, the site of the 5' most base of the guide strand is the most informative. Conclusion The capacity of specific feature mapping methods and their ability to build predictive models of RNAi activity suggests a relative biological importance of these features. Some feature mapping methods are more informative in building predictive models and overall t-test filtering provides a method to remove some noisy features or make comparisons among datasets. Together, these features can yield predictive SVM regression models with increased predictive accuracy between predicted and observed activities both within datasets by cross validation, and between independently collected RNAi activity datasets. Feature filtering to remove features should be approached carefully in that it is possible to reduce feature set size without substantially reducing predictive models, but the features retained in the candidate models become increasingly distinct. 
Software to perform feature prediction and SVM training and testing on nucleic acid sequences can be found at

  14. Medical Logistics Functional Integration Management Activity Based Costing and Data Modeling Workshop. Volume 1

    DTIC Science & Technology

    1993-04-02

    Appendices were included to give the reader a cursory look at the IDEF data modeling techniques (IDEF1X). Definition documentation of data elements...and their relationships in the enterprise. The IDEF1X technique is the key to achieving integration. As a result of applying the technique, unique data...definitions emerge, which describe the stable [illegible] of an organization's reusable data. Components Activities The basic components of an IDEF1X

  15. ADAMTS metalloproteases generate active versican fragments that regulate interdigital web regression

    PubMed Central

    McCulloch, Daniel R.; Nelson, Courtney M.; Dixon, Laura J.; Silver, Debra L.; Wylie, James D.; Lindner, Volkhard; Sasaki, Takako; Cooley, Marion A.; Argraves, W. Scott; Apte, Suneel S.

    2009-01-01

    We show that combinatorial mouse alleles for the secreted metalloproteases Adamts5, Adamts20 (bt), and Adamts9 result in fully penetrant soft-tissue syndactyly. Interdigital webs in Adamts5−/−; bt/bt mice had reduced apoptosis and decreased cleavage of the proteoglycan versican; however, the BMP-FGF axis, which regulates interdigital apoptosis was unaffected. BMP4 induced apoptosis, but without concomitant versican proteolysis. Haploinsufficiency of either Vcan or Fbln1, a co-factor for versican processing by ADAMTS5, led to highly penetrant syndactyly in bt mice, suggesting that cleaved versican was essential for web regression. The local application of an amino-terminal versican fragment corresponding to ADAMTS-processed versican, induced cell death in Adamts5−/−; bt/bt webs. Thus, ADAMTS proteases cooperatively maintain versican proteolysis above a required threshold to create a permissive environment for apoptosis. The data highlight the developmental significance of proteolytic action on the ECM, not only as a clearance mechanism, but also as a means to generate bioactive versican fragments. PMID:19922873

  16. Early Parallel Activation of Semantics and Phonology in Picture Naming: Evidence from a Multiple Linear Regression MEG Study

    PubMed Central

    Miozzo, Michele; Pulvermüller, Friedemann; Hauk, Olaf

    2015-01-01

    The time course of brain activation during word production has become an area of increasingly intense investigation in cognitive neuroscience. The predominant view has been that semantic and phonological processes are activated sequentially, at about 150 and 200–400 ms after picture onset. Although evidence from prior studies has been interpreted as supporting this view, these studies were arguably not ideally suited to detect early brain activation of semantic and phonological processes. We here used a multiple linear regression approach to magnetoencephalography (MEG) analysis of picture naming in order to investigate early effects of variables specifically related to visual, semantic, and phonological processing. This was combined with distributed minimum-norm source estimation and region-of-interest analysis. Brain activation associated with visual image complexity appeared in occipital cortex at about 100 ms after picture presentation onset. At about 150 ms, semantic variables became physiologically manifest in left frontotemporal regions. In the same latency range, we found an effect of phonological variables in the left middle temporal gyrus. Our results demonstrate that multiple linear regression analysis is sensitive to early effects of multiple psycholinguistic variables in picture naming. Crucially, our results suggest that access to phonological information might begin in parallel with semantic processing around 150 ms after picture onset. PMID:25005037

  17. The logistics of choice.

    PubMed

    Killeen, Peter R

    2015-07-01

    The generalized matching law (GML) is reconstructed as a logistic regression equation that privileges no particular value of the sensitivity parameter, a. That value will often approach 1 due to the feedback that drives switching that is intrinsic to most concurrent schedules. A model of that feedback reproduced some features of concurrent data. The GML is a law only in the strained sense that any equation that maps data is a law. The machine under the hood of matching is in all likelihood the very law that was displaced by the Matching Law. It is now time to return the Law of Effect to centrality in our science.
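
    The reconstruction described in this record can be written out in a few lines. The sketch below uses conventional notation that the abstract itself does not give: B_1 and B_2 for responses on the two alternatives, R_1 and R_2 for reinforcers obtained from them, and b for a bias parameter; only the sensitivity parameter a is named in the abstract.

```latex
% Generalized matching law in ratio form (b is the conventional bias term).
\frac{B_1}{B_2} = b \left( \frac{R_1}{R_2} \right)^{a}

% Taking logarithms gives a linear relation in log-ratio coordinates.
\ln\frac{B_1}{B_2} = \ln b + a \,\ln\frac{R_1}{R_2}

% With p = B_1 / (B_1 + B_2), the odds are p/(1-p) = B_1/B_2, so the left-hand
% side is logit(p), and solving for p yields a logistic regression equation.
p = \frac{1}{1 + \exp\!\left[ -\left( \ln b + a \,\ln\dfrac{R_1}{R_2} \right) \right]}
```

    In this form a is the slope of a logistic regression of choice proportion on the log reinforcer ratio, and the case a = 1, b = 1 reduces to p = R_1 / (R_1 + R_2).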

  18. Determination of boiling point of petrochemicals by gas chromatography-mass spectrometry and multivariate regression analysis of structural activity relationship.

    PubMed

    Fakayode, Sayo O; Mitchell, Breanna S; Pollard, David A

    2014-08-01

    Accurate understanding of analyte boiling points (BP) is of critical importance in gas chromatographic (GC) separation and crude oil refinery operation in petrochemical industries. This study reported the first combined use of GC separation and partial-least-square (PLS1) multivariate regression analysis of petrochemical structural activity relationship (SAR) for accurate BP determination of two commercially available (D3710 and MA VHP) calibration gas mix samples. The results of the BP determination using PLS1 multivariate regression were further compared with the results of the traditional simulated distillation method of BP determination. The developed PLS1 regression was able to correctly predict analyte BP in D3710 and MA VHP calibration gas mix samples, with a root-mean-square-%-relative-error (RMS%RE) of 6.4% and 10.8%, respectively. In contrast, the overall RMS%RE values of 32.9% and 40.4% obtained for BP determination in D3710 and MA VHP, respectively, using a traditional simulated distillation method were approximately four times larger than the corresponding RMS%RE of BP prediction using MRA, demonstrating the better predictive ability of MRA. The reported method is rapid, robust, and promising, and can be potentially used routinely for fast analysis, pattern recognition, and analyte BP determination in petrochemical industries.

  19. Application of least squares support vector regression and linear multiple regression for modeling removal of methyl orange onto tin oxide nanoparticles loaded on activated carbon and activated carbon prepared from Pistacia atlantica wood.

    PubMed

    Ghaedi, M; Rahimi, Mahmoud Reza; Ghaedi, A M; Tyagi, Inderjeet; Agarwal, Shilpi; Gupta, Vinod Kumar

    2016-01-01

    Two novel and eco-friendly adsorbents, namely tin oxide nanoparticles loaded on activated carbon (SnO2-NP-AC) and activated carbon prepared from Pistacia atlantica wood (AC-PAW), were used for the rapid removal and fast adsorption of methyl orange (MO) from the aqueous phase. The dependency of MO removal on various influential adsorption parameters was modeled and optimized using multiple linear regression (MLR) and least squares support vector regression (LSSVR). The optimal parameters for the LSSVR model were a γ value of 0.76 and a σ² of 0.15. For the testing data set, a mean square error (MSE) of 0.0010 and a coefficient of determination (R²) of 0.976 were obtained for the LSSVR model, and an MSE of 0.0037 and an R² of 0.897 were obtained for the MLR model. The adsorption equilibrium and kinetic data were well fitted by the Langmuir isotherm model and by the second-order and intra-particle diffusion models, respectively. Small amounts of the proposed SnO2-NP-AC and AC-PAW (0.015 g and 0.08 g) are sufficient for rapid removal of methyl orange (>95%). The maximum adsorption capacities of SnO2-NP-AC and AC-PAW were 250 mg g⁻¹ and 125 mg g⁻¹, respectively.
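
    The Langmuir isotherm named in this record relates equilibrium uptake to equilibrium concentration through a maximum capacity and an affinity constant. The sketch below uses the 250 mg/g maximum capacity reported for SnO2-NP-AC; the affinity constant K_L and the concentrations are hypothetical values chosen for illustration.

```python
def langmuir_uptake(c_eq, q_max, k_l):
    """Langmuir isotherm: equilibrium uptake q_e (mg/g) at concentration c_eq (mg/L)."""
    return q_max * k_l * c_eq / (1.0 + k_l * c_eq)

Q_MAX = 250.0  # mg/g, maximum adsorption capacity reported for SnO2-NP-AC
K_L = 0.2      # L/mg, hypothetical affinity constant (not given in the abstract)

for c_eq in (1.0, 5.0, 25.0, 100.0):
    q_e = langmuir_uptake(c_eq, Q_MAX, K_L)
    print(f"C_e = {c_eq:6.1f} mg/L  ->  q_e = {q_e:6.1f} mg/g")
```

    Uptake reaches half of the maximum capacity at C_e = 1/K_L and approaches the maximum asymptotically as the concentration grows, which is the saturation behavior a monolayer adsorption model implies.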

  20. Dihydromyricetin promotes hepatocellular carcinoma regression via a p53 activation-dependent mechanism

    PubMed Central

    Zhang, Qingyu; Liu, Jie; Liu, Bin; Xia, Juan; Chen, Nianping; Chen, Xiaofeng; Cao, Yi; Zhang, Chen; Lu, CaiJie; Li, Mingyi; Zhu, Runzhi

    2014-01-01

    The development of antitumor chemotherapy drugs remains a key goal for oncologists, and natural products provide a vast resource for anti-cancer drug discovery. In the current study, we found that the flavonoid dihydromyricetin (DHM) exhibited antitumor activity against liver cancer cells, including primary cells obtained from hepatocellular carcinoma (HCC) patients. In contrast, DHM was not cytotoxic to immortalized normal liver cells. Furthermore, DHM treatment resulted in the growth inhibition and remission of xenotransplanted tumors in nude mice. Our results further demonstrated that this antitumor activity was caused by the activation of the p53-dependent apoptosis pathway via p53 phosphorylation at serine (15Ser). Moreover, our results showed that DHM plays a dual role in the induction of cell death when administered in combination with cisplatin, a common clinical drug that kills primary hepatoma cells but not normal liver cells. PMID:24717393

  1. Dihydromyricetin promotes hepatocellular carcinoma regression via a p53 activation-dependent mechanism

    NASA Astrophysics Data System (ADS)

    Zhang, Qingyu; Liu, Jie; Liu, Bin; Xia, Juan; Chen, Nianping; Chen, Xiaofeng; Cao, Yi; Zhang, Chen; Lu, Caijie; Li, Mingyi; Zhu, Runzhi

    2014-04-01

    The development of antitumor chemotherapy drugs remains a key goal for oncologists, and natural products provide a vast resource for anti-cancer drug discovery. In the current study, we found that the flavonoid dihydromyricetin (DHM) exhibited antitumor activity against liver cancer cells, including primary cells obtained from hepatocellular carcinoma (HCC) patients. In contrast, DHM was not cytotoxic to immortalized normal liver cells. Furthermore, DHM treatment resulted in the growth inhibition and remission of xenotransplanted tumors in nude mice. Our results further demonstrated that this antitumor activity was caused by the activation of the p53-dependent apoptosis pathway via p53 phosphorylation at serine (15Ser). Moreover, our results showed that DHM plays a dual role in the induction of cell death when administered in combination with cisplatin, a common clinical drug that kills primary hepatoma cells but not normal liver cells.

  2. Prediction of fungicidal activities of rice blast disease based on least-squares support vector machines and project pursuit regression.

    PubMed

    Du, Hongying; Wang, Jie; Hu, Zhide; Yao, Xiaojun; Zhang, Xiaoyun

    2008-11-26

    Three machine learning methods, genetic algorithm-multilinear regression (GA-MLR), least-squares support vector machine (LS-SVM), and project pursuit regression (PPR), were used to investigate the relationship between thiazoline derivatives and their fungicidal activities against the rice blast disease. The GA-MLR method was used to select the most appropriate molecular descriptors from a large set of descriptors, which were only calculated from molecular structures, and develop a linear quantitative structure-activity relationship (QSAR) model at the same time. On the basis of the selected descriptors, the other two more accurate models (LS-SVM and PPR) were built. Both the linear and nonlinear models gave good prediction results, but the nonlinear models afforded better prediction ability, which meant that the LS-SVM and PPR methods could simulate the relationship between the structural descriptors and fungicidal activities more accurately. The results show that the nonlinear methods (LS-SVM and PPR) could be used as good modeling tools for the study of rice blast. Moreover, this study provides a new and simple but efficient approach, which should facilitate the design and development of new compounds to resist the rice blast disease.

  3. Valproic acid induces differentiation and transient tumor regression, but spares leukemia-initiating activity in mouse models of APL.

    PubMed

    Leiva, M; Moretti, S; Soilihi, H; Pallavicini, I; Peres, L; Mercurio, C; Dal Zuffo, R; Minucci, S; de Thé, H

    2012-07-01

    Aberrant histone acetylation was physiopathologically associated with the development of acute myeloid leukemias (AMLs). Reversal of histone deacetylation by histone deacetylase inhibitors (HDACis) activates a cell death program that allows tumor regression in mouse models of AMLs. We have used several models of PML-RARA-driven acute promyelocytic leukemias (APLs) to analyze the in vivo effects of valproic acid, a well-characterized HDACi. Valproic acid (VPA) induced rapid tumor regression and sharply prolonged survival. However, discontinuation of treatment was associated with an immediate relapse. In vivo, as well as ex vivo, VPA induced terminal granulocytic differentiation. Yet, despite full differentiation, leukemia-initiating cell (LIC) activity was actually enhanced by VPA treatment. In contrast to all-trans retinoic acid (ATRA) or arsenic, VPA did not degrade PML-RARA. However, in combination with ATRA, VPA synergized for PML-RARA degradation and LIC eradication in vivo. Our studies indicate that VPA triggers differentiation, but spares LIC activity, further uncouple differentiation from APL clearance and stress the importance of PML-RARA degradation in APL cure.

  4. Testicular histomorphometry and the proliferative and apoptotic activities of the seminiferous epithelium in Syrian hamster (Mesocricetus auratus) during regression owing to short photoperiod.

    PubMed

    Seco-Rovira, V; Beltrán-Frutos, E; Ferrer, C; Saez, F J; Madrid, J F; Canteras, M; Pastor, L M

    2015-05-01

    During the non-breeding season some animals exhibit testicular atrophy, decreased testicular weight and reduced seminiferous tubule diameter accompanied by depletion of the seminiferous epithelium. Some cellular factors involved in this depletion are changes in germ cell proliferation and apoptosis. In the Syrian hamster this depletion has been studied histologically and in terms of the involvement of proliferation and apoptosis in the seminiferous epithelium of fully regressed testes. The objectives of this study included the histomorphometrical characterization of the testis and the determination of the proliferative and apoptotic activity of germ cells in the seminiferous epithelium during testicular regression owing to short photoperiod. The study was performed using conventional light microscopy (hematoxylin and eosin), proliferating cell nuclear antigen and terminal deoxynucleotidyl transferase (TdT)-mediated dUTP in situ nick end labelling staining, image analysis software, and transmission electron microscopy in three established regression groups: mild regression (MR), strong regression (SR), and total regression (TR). Morphometrically a gradual decrease in total tubular area and in the testicular, tubular, and epithelial volumes was observed during testicular regression. Interstitial and luminal volumes decreased from the MR group onwards. The tubular length decreased from MR to SR. As regards spermatogonial proliferation, only an initial decrease in proliferative activity was observed, whereas apoptotic germ cell activity increased throughout regression. The number of germ cells studied decreased throughout the process of testicular regression. In conclusion, testicular regression in Syrian hamster comprises two histomorphometrical phases, the first involving a decrease in seminiferous tubular diameter and volume and the second involving shortening of the seminiferous tubule and a decrease in interstitial volume. At the cellular level, there is an

  5. Improving importance estimation in pool-based batch active learning for approximate linear regression.

    PubMed

    Kurihara, Nozomi; Sugiyama, Masashi

    2012-12-01

    Pool-based batch active learning is aimed at choosing training inputs from a 'pool' of test inputs so that the generalization error is minimized. P-ALICE (Pool-based Active Learning using Importance-weighted least-squares learning based on Conditional Expectation of the generalization error) is a state-of-the-art method that can cope with model misspecification by weighting training samples according to the importance (i.e., the ratio of test and training input densities). However, importance estimation in the original P-ALICE is based on the assumption that the number of training samples to gather is small, which is not always true in practice. In this paper, we propose an alternative scheme for importance estimation based on the inclusion probability, and show its validity through numerical experiments.
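    As a minimal sketch of the importance-weighting idea described above (not the P-ALICE algorithm itself), the following fits a straight line by weighted least squares, with each training sample weighted by a supplied importance value. The function name and the toy data are hypothetical; estimating the importance values, which is the subject of the paper, is not attempted here.

```python
def importance_weighted_line_fit(xs, ys, importance):
    """Fit y = intercept + slope * x by importance-weighted least squares."""
    total = sum(importance)
    mean_x = sum(w * x for w, x in zip(importance, xs)) / total
    mean_y = sum(w * y for w, y in zip(importance, ys)) / total
    # Weighted sums of squares and cross-products around the weighted means.
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(importance, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y)
              for w, x, y in zip(importance, xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy data (hypothetical): targets follow x**2, so a line is a misspecified model.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 1.0, 4.0, 9.0]
uniform = importance_weighted_line_fit(xs, ys, [1.0, 1.0, 1.0, 1.0])
# Larger importance on the right-hand samples pulls the fit toward that region.
right_heavy = importance_weighted_line_fit(xs, ys, [0.1, 0.1, 1.0, 1.0])
```

    Because the linear model is misspecified for these targets, the two weightings give different lines, which is the situation in which importance weighting matters.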

  6. Brain Region–Specific Decrease in the Activity and Expression of Protein Kinase A in the Frontal Cortex of Regressive Autism

    PubMed Central

    Ji, Lina; Chauhan, Ved; Flory, Michael J.; Chauhan, Abha

    2011-01-01

    Autism is a severe neurodevelopmental disorder that is characterized by impaired language, communication, and social skills. In regressive autism, affected children first show signs of normal social and language development but eventually lose these skills and develop autistic behavior. Protein kinases are essential in G-protein-coupled, receptor-mediated signal transduction and are involved in neuronal functions, gene expression, memory, and cell differentiation. We studied the activity and expression of protein kinase A (PKA), a cyclic AMP–dependent protein kinase, in postmortem brain tissue samples from the frontal, temporal, parietal, and occipital cortices, and the cerebellum of individuals with regressive autism; autistic subjects without a clinical history of regression; and age-matched developmentally normal control subjects. The activity of PKA and the expression of PKA (C-α), a catalytic subunit of PKA, were significantly decreased in the frontal cortex of individuals with regressive autism compared to control subjects and individuals with non-regressive autism. Such changes were not observed in the cerebellum, or the cortices from the temporal, parietal, and occipital regions of the brain in subjects with regressive autism. In addition, there was no significant difference in PKA activity or expression of PKA (C-α) between non-regressive autism and control groups. These results suggest that regression in autism may be associated, in part, with decreased PKA-mediated phosphorylation of proteins and abnormalities in cellular signaling. PMID:21909354

  7. Association between Physical Activity and Teacher-Reported Academic Performance among Fifth-Graders in Shanghai: A Quantile Regression

    PubMed Central

    Zhang, Yunting; Zhang, Donglan; Jiang, Yanrui; Sun, Wanqi; Wang, Yan; Chen, Wenjuan; Li, Shenghui; Shi, Lu; Shen, Xiaoming; Zhang, Jun; Jiang, Fan

    2015-01-01

    Introduction: A growing body of literature reveals the causal pathways between physical activity and brain function, indicating that increasing physical activity among children could improve rather than undermine their scholastic performance. However, past studies of physical activity and scholastic performance among students often relied on parent-reported grade information, and did not explore whether the association varied among different levels of scholastic performance. Our study among fifth-grade students in Shanghai sought to determine the association between regular physical activity and teacher-reported academic performance scores (APS), with special attention to the differential associational patterns across different strata of scholastic performance. Method: A total of 2,225 students were chosen through stratified random sampling, and a complete sample of 1,470 observations was used for analysis. We used a quantile regression analysis to explore whether the association between physical activity and teacher-reported APS differs across the distribution of APS. Results: Minimal-intensity physical activity such as walking was positively associated with academic performance scores (β = 0.13, SE = 0.04). The magnitude of the association tends to be larger at the lower end of the APS distribution (β = 0.24, SE = 0.08) than at the higher end of the distribution (β = 0.00, SE = 0.07). Conclusion: Based upon teacher-reported student academic performance, there is no evidence that spending time on frequent physical activity would undermine students' APS. Those students who are below the average in their academic performance could be worse off in academic performance if they give up minimal-intensity physical activity. Therefore, cutting physical activity time in schools could hurt the scholastic performance among those students who were already at higher risk for dropping out due to inadequate APS. PMID:25774525
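    To illustrate what quantile regression optimizes, here is a minimal sketch of the check (pinball) loss, assuming the simplest case of a model with no covariates, where the minimizer is a sample quantile. The function names and the numbers are hypothetical and are not study data.

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss for one residual y - prediction at quantile level tau."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def best_constant(ys, tau):
    """Return the observed value that minimizes total pinball loss (a sample quantile)."""
    return min(ys, key=lambda c: sum(pinball_loss(y - c, tau) for y in ys))


scores = [1.0, 2.0, 3.0, 4.0, 100.0]  # hypothetical values, not study data
median_fit = best_constant(scores, 0.5)   # level 0.5 targets the median
lower_fit = best_constant(scores, 0.25)   # level 0.25 targets the lower quartile
```

    Changing the quantile level changes which part of the outcome distribution the fit describes, which is how a quantile regression can report different coefficients at the lower and higher ends of a score distribution.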

  8. Country logistics performance and disaster impact.

    PubMed

    Vaillancourt, Alain; Haavisto, Ira

    2016-04-01

    The aim of this paper is to deepen the understanding of the relationship between country logistics performance and disaster impact. The relationship is analysed through correlation analysis and regression models for 117 countries for the years 2007 to 2012 with disaster impact variables from the International Disaster Database (EM-DAT) and logistics performance indicators from the World Bank. The results show a significant relationship between country logistics performance and disaster impact overall and for five out of six specific logistics performance indicators. These specific indicators were further used to explore the relationship between country logistics performance and disaster impact for three specific disaster types (epidemic, flood and storm). The findings enhance the understanding of the role of logistics in a humanitarian context with empirical evidence of the importance of country logistics performance in disaster response operations.

  9. Biomass Logistics

    SciTech Connect

    J. Richard Hess; Kevin L. Kenney; William A. Smith; Ian Bonner; David J. Muth

    2015-04-01

    Equipment manufacturers have made rapid improvements in biomass harvesting and handling equipment. These improvements have increased transportation and handling efficiencies due to higher biomass densities and reduced losses. Improvements in grinder efficiencies and capacity have reduced biomass grinding costs. Biomass collection efficiencies (the ratio of biomass collected to the amount available in the field) as high as 75% for crop residues and greater than 90% for perennial energy crops have also been demonstrated. However, as collection rates increase, the fraction of entrained soil in the biomass increases, and high biomass residue removal rates can violate agronomic sustainability limits. Advancements in quantifying multi-factor sustainability limits to increase removal rate as guided by sustainable residue removal plans, and in mitigating soil contamination through targeted removal rates based on soil type and residue type/fraction, are allowing the use of new high-efficiency harvesting equipment and methods. As another consideration, single-pass harvesting and other technologies that improve harvesting costs cause biomass storage moisture management challenges, which are further compounded by annual variability in biomass moisture content. Monitoring, sampling, simulation, and analysis provide a basis for moisture, time, and quality relationships in storage, which has allowed the development of moisture-tolerant storage systems and best management processes that combine moisture content and time to accommodate baled storage of wet material based upon “shelf-life.” The key to improving biomass supply logistics costs has been developing the associated agronomic sustainability and biomass quality technologies and processes that allow the implementation of equipment engineering solutions.

  10. Differential regulation of the anti-crossover and replication fork regression activities of Mph1 by Mte1

    PubMed Central

    Xue, Xiaoyu; Papusha, Alma; Choi, Koyi; Bonner, Jaclyn N.; Kumar, Sandeep; Niu, Hengyao; Kaur, Hardeep; Zheng, Xiao-Feng; Donnianni, Roberto A.; Lu, Lucy; Lichten, Michael; Zhao, Xiaolan; Ira, Grzegorz; Sung, Patrick

    2016-01-01

    We identified Mte1 (Mph1-associated telomere maintenance protein 1) as a multifunctional regulator of Saccharomyces cerevisiae Mph1, a member of the FANCM family of DNA motor proteins important for DNA replication fork repair and crossover suppression during homologous recombination. We show that Mte1 interacts with Mph1 and DNA species that resemble a DNA replication fork and the D loop formed during recombination. Biochemically, Mte1 stimulates Mph1-mediated DNA replication fork regression and branch migration in a model substrate. Consistent with this activity, genetic analysis reveals that Mte1 functions with Mph1 and the associated MHF complex in replication fork repair. Surprisingly, Mte1 antagonizes the D-loop-dissociative activity of Mph1–MHF and exerts a procrossover role in mitotic recombination. We further show that the influence of Mte1 on Mph1 activities requires its binding to Mph1 and DNA. Thus, Mte1 differentially regulates Mph1 activities to achieve distinct outcomes in recombination and replication fork repair. PMID:26966246

  11. Hourly photosynthetically active radiation estimation in Midwestern United States from artificial neural networks and conventional regressions models.

    PubMed

    Yu, Xiaolei; Guo, Xulin

    2016-08-01

    The relationship between hourly photosynthetically active radiation (PAR) and the global solar radiation (R_s) was analyzed from data gathered over 3 years at Bondville, IL, and Sioux Falls, SD, Midwestern USA. These data were used to determine temporal variability of the PAR fraction and its dependence on different sky conditions, which were defined by the clearness index. Meanwhile, models based on artificial neural networks (ANNs) were established for predicting hourly PAR. The performance of the proposed models was compared with four existing conventional regression models in terms of the normalized root mean square error (NRMSE), the coefficient of determination (r²), the mean percentage error (MPE), and the relative standard error (RSE). Overall, the analysis shows that the ANN model can predict PAR accurately, especially for overcast sky and clear sky conditions. Meanwhile, the parameters related to water vapor do not improve the prediction result significantly.
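    The comparison statistics named above can be sketched as follows. The exact definitions used in the paper are not given in the abstract, so the choices here are assumptions: NRMSE is normalized by the mean observation, r² is the squared Pearson correlation, and MPE is signed as predicted minus observed. RSE is omitted, and the sample values are hypothetical.

```python
import math


def nrmse(observed, predicted):
    """RMSE divided by the mean of the observations (one common normalization)."""
    n = len(observed)
    mse = sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n
    return math.sqrt(mse) / (sum(observed) / n)


def mean_percentage_error(observed, predicted):
    """Average signed error, in percent of each observation."""
    n = len(observed)
    return 100.0 * sum((p - o) / o for o, p in zip(observed, predicted)) / n


def r_squared(observed, predicted):
    """Squared Pearson correlation between observations and predictions."""
    n = len(observed)
    mo, mp = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


observed = [100.0, 200.0, 300.0, 400.0]   # hypothetical hourly values
predicted = [110.0, 190.0, 310.0, 390.0]  # hypothetical model output
summary = (nrmse(observed, predicted),
           r_squared(observed, predicted),
           mean_percentage_error(observed, predicted))
```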

  12. Hourly photosynthetically active radiation estimation in Midwestern United States from artificial neural networks and conventional regressions models

    NASA Astrophysics Data System (ADS)

    Yu, Xiaolei; Guo, Xulin

    2016-08-01

    The relationship between hourly photosynthetically active radiation (PAR) and the global solar radiation (R_s) was analyzed from data gathered over 3 years at Bondville, IL, and Sioux Falls, SD, Midwestern USA. These data were used to determine temporal variability of the PAR fraction and its dependence on different sky conditions, which were defined by the clearness index. Meanwhile, models based on artificial neural networks (ANNs) were established for predicting hourly PAR. The performance of the proposed models was compared with four existing conventional regression models in terms of the normalized root mean square error (NRMSE), the coefficient of determination (r²), the mean percentage error (MPE), and the relative standard error (RSE). Overall, the analysis shows that the ANN model can predict PAR accurately, especially for overcast sky and clear sky conditions. Meanwhile, the parameters related to water vapor do not improve the prediction result significantly.

  13. Systematic Artifacts in Support Vector Regression-Based Compound Potency Prediction Revealed by Statistical and Activity Landscape Analysis

    PubMed Central

    Balfer, Jenny; Bajorath, Jürgen

    2015-01-01

    Support vector machines are a popular machine learning method for many classification tasks in biology and chemistry. In addition, the support vector regression (SVR) variant is widely used for numerical property predictions. In chemoinformatics and pharmaceutical research, SVR has become probably the most popular approach for modeling of non-linear structure-activity relationships (SARs) and predicting compound potency values. Herein, we have systematically generated and analyzed SVR prediction models for a variety of compound data sets with different SAR characteristics. Although these SVR models were accurate on the basis of global prediction statistics and not prone to overfitting, they were found to consistently mispredict highly potent compounds. Hence, in regions of local SAR discontinuity, SVR prediction models displayed clear limitations. Compared to observed activity landscapes of compound data sets, landscapes generated on the basis of SVR potency predictions were partly flattened and activity cliff information was lost. Taken together, these findings have implications for practical SVR applications. In particular, prospective SVR-based potency predictions should be considered with caution because artificially low predictions are very likely for highly potent candidate compounds, the most important prediction targets. PMID:25742011

  14. Quantitative structure-activity relationship study on BTK inhibitors by modified multivariate adaptive regression spline and CoMSIA methods.

    PubMed

    Xu, A; Zhang, Y; Ran, T; Liu, H; Lu, S; Xu, J; Xiong, X; Jiang, Y; Lu, T; Chen, Y

    2015-01-01

    Bruton's tyrosine kinase (BTK) plays a crucial role in B-cell activation and development, and has emerged as a new molecular target for the treatment of autoimmune diseases and B-cell malignancies. In this study, two- and three-dimensional quantitative structure-activity relationship (2D and 3D-QSAR) analyses were performed on a series of pyridine and pyrimidine-based BTK inhibitors by means of genetic algorithm optimized multivariate adaptive regression spline (GA-MARS) and comparative molecular similarity index analysis (CoMSIA) methods. Here, we propose a modified MARS algorithm to develop 2D-QSAR models. The top ranked models showed satisfactory statistical results (2D-QSAR: Q² = 0.884, r² = 0.929, r²pred = 0.878; 3D-QSAR: q² = 0.616, r² = 0.987, r²pred = 0.905). Key descriptors selected by 2D-QSAR were in good agreement with the conclusions of 3D-QSAR, and the 3D-CoMSIA contour maps facilitated interpretation of the structure-activity relationship. A new molecular database was generated by molecular fragment replacement (MFR) and further evaluated with GA-MARS and CoMSIA prediction. Twenty-five pyridine and pyrimidine derivatives as novel potential BTK inhibitors were finally selected for further study. These results also demonstrated that our method can be a very efficient tool for the discovery of novel potent BTK inhibitors.

  15. Antimicrobial Activity of Aroma Compounds against Saccharomyces cerevisiae and Improvement of Microbiological Stability of Soft Drinks as Assessed by Logistic Regression

    PubMed Central

    Belletti, Nicoletta; Kamdem, Sylvain Sado; Patrignani, Francesca; Lanciotti, Rosalba; Covelli, Alessandro; Gardini, Fausto

    2007-01-01

    The combined effects of a mild heat treatment (55°C) and the presence of three aroma compounds [citron essential oil, citral, and (E)-2-hexenal] on the spoilage of noncarbonated beverages inoculated with different amounts of a Saccharomyces cerevisiae strain were evaluated. The results, expressed as growth/no growth, were analyzed using a logistic regression in order to assess the probability of beverage spoilage as a function of thermal treatment length, concentration of flavoring agents, and yeast inoculum. The logit models obtained for the three substances were extremely precise. The thermal treatment alone, even if prolonged for 20 min, was not able to prevent yeast growth. However, the presence of increasing concentrations of aroma compounds improved the stability of the products. The inhibiting effect of the compounds was enhanced by a prolonged thermal treatment. In fact, it influenced the vapor pressure of the molecules, which can easily interact within microbial membranes when they are in gaseous form. (E)-2-Hexenal showed a threshold level, related to initial inoculum and thermal treatment length, over which yeast growth was rapidly inhibited. Concentrations over 100 ppm of citral and thermal treatment longer than 16 min allowed a 90% probability of stability for bottles inoculated with 10⁵ CFU/bottle. Citron gave the most interesting responses: beverages with 500 ppm of essential oil needed only 3 min of treatment to prevent yeast growth. In this framework, the logistic regression proved to be an important tool to study alternative hurdle strategies for the stabilization of noncarbonated beverages. PMID:17616627
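    A growth/no-growth logit model of this kind can be turned into a probability and inverted to find the dose needed for a target stability. The sketch below assumes a single covariate (concentration) for brevity, whereas the study also used treatment length and inoculum; the coefficients and function names are hypothetical and are not taken from the paper.

```python
import math


def spoilage_probability(intercept, slope, concentration_ppm):
    """Logistic model: P(growth) = 1 / (1 + exp(-(intercept + slope * x)))."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * concentration_ppm)))


def concentration_for_stability(intercept, slope, target_stability):
    """Concentration at which P(no growth) equals target_stability."""
    target_growth = 1.0 - target_stability
    logit = math.log(target_growth / (1.0 - target_growth))
    return (logit - intercept) / slope


# Hypothetical coefficients chosen only for illustration; they are NOT from the paper.
B0, B1 = 4.0, -0.05
x90 = concentration_for_stability(B0, B1, 0.90)  # dose giving 90% stability
```

    Evaluating spoilage_probability(B0, B1, x90) returns approximately 0.10, confirming that the inversion and the forward model agree.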

  16. Regression analysis for solving diagnosis problem of children's health

    NASA Astrophysics Data System (ADS)

    Cherkashina, Yu A.; Gerget, O. M.

    2016-04-01

    The paper reports research on the application of statistical techniques, namely regression analysis, to assess the health status of children in the neonatal period based on medical data (hemostatic parameters, parameters of blood tests, the gestational age, vascular-endothelial growth factor) measured at 3-5 days of life. A detailed description of the studied medical data is given, and a binary logistic regression procedure is discussed. Basic results of the research are presented. A classification table of predicted and observed values is shown, and the overall percentage of correct classification is determined. Regression coefficients are calculated, and the general regression equation is written from them. Based on the results of logistic regression, ROC analysis was performed: sensitivity and specificity of the model are calculated and ROC curves are constructed. These mathematical techniques allow the health of children to be diagnosed with a high quality of recognition. The results make a significant contribution to the development of evidence-based medicine and are of high practical importance in the professional activity of the author.
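    The classification-table and ROC quantities mentioned above can be computed as in this minimal sketch. It assumes binary labels (1 = condition present) and model scores such as predicted probabilities; the function names and the four toy cases are hypothetical and are not data from the paper.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Classify score >= threshold as positive and summarize the 2x2 table."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """Area under the ROC curve as the share of positive/negative pairs
    in which the positive case has the higher score (ties count one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


labels = [0, 0, 1, 1]            # hypothetical outcomes
scores = [0.1, 0.4, 0.35, 0.8]   # hypothetical predicted probabilities
sens, spec = sensitivity_specificity(labels, scores, 0.5)
area = auc(labels, scores)
```

    Sweeping the threshold over all observed scores and plotting sensitivity against 1 - specificity traces the ROC curve whose area the auc function summarizes.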

  17. Guide to Preparing Work Descriptions for Performance Work Statements for Contracted Maintenance Activities for Army Installation Directorates of Logistics

    DTIC Science & Technology

    1991-05-01

    400-2, The Modern Army Recordkeeping System (MARKS) ()A, 15 October 1986). AR 50-5, Nuclear and Chemical Weapons and Material - Nuclear Surety (DA... Material Handling Equipment MIG Magnesium Inert Gas MILVAN Military Van MIMS Maintenance Information Management Systems MLSP Microfix Logistic Support...Facility Systems (FS) Division of the U.S. Army Construction Engineering Research Laboratory (USACERL). The principal investigator was Mr. Donald K

  18. Quercetin, a Natural Flavonoid Interacts with DNA, Arrests Cell Cycle and Causes Tumor Regression by Activating Mitochondrial Pathway of Apoptosis

    PubMed Central

    Srivastava, Shikha; Somasagara, Ranganatha R.; Hegde, Mahesh; Nishana, Mayilaadumveettil; Tadi, Satish Kumar; Srivastava, Mrinal; Choudhary, Bibha; Raghavan, Sathees C.

    2016-01-01

    Naturally occurring compounds are considered attractive candidates for cancer treatment and prevention. Quercetin and ellagic acid are naturally occurring flavonoids abundantly seen in several fruits and vegetables. In the present study, we evaluate and compare antitumor efficacies of quercetin and ellagic acid in animal models and cancer cell lines in a comprehensive manner. We found that quercetin induced cytotoxicity in leukemic cells in a dose-dependent manner, while ellagic acid showed only limited toxicity. Besides leukemic cells, quercetin also induced cytotoxicity in breast cancer cells; however, its effect on normal cells was limited or absent. Further, quercetin caused S phase arrest during cell cycle progression in tested cancer cells. Quercetin induced tumor regression in mice at a concentration 3-fold lower than ellagic acid. Importantly, administration of quercetin led to a ~5-fold increase in the life span of tumor-bearing mice compared to that of untreated controls. Further, we found that quercetin interacts with DNA directly, and this could be one of the mechanisms for inducing apoptosis in both cancer cell lines and tumor tissues by activating the intrinsic pathway. Thus, our data suggest that quercetin can be further explored for its potential to be used in cancer therapeutics and combination therapy. PMID:27068577

  19. RESEARCH PAPER: Forecast daily indices of solar activity, F10.7, using support vector regression method

    NASA Astrophysics Data System (ADS)

    Huang, Cong; Liu, Dan-Dan; Wang, Jing-Song

    2009-06-01

    The 10.7 cm solar radio flux (F10.7), the value of the solar radio emission flux density at a wavelength of 10.7 cm, is a useful index of solar activity as a proxy for solar extreme ultraviolet radiation. It is meaningful and important to predict F10.7 values accurately for both long-term (months-years) and short-term (days) forecasting, which are often used as inputs in space weather models. This study applies a novel neural network technique, support vector regression (SVR), to forecasting daily values of F10.7. The aim of this study is to examine the feasibility of SVR in short-term F10.7 forecasting. The approach, based on SVR, reduces the dimension of feature space in the training process by using a kernel-based learning algorithm. Thus, the complexity of the calculation becomes lower and a small amount of training data will be sufficient. The time series of F10.7 from 2002 to 2006 are employed as the data sets. The performance of the approach is estimated by calculating the norm mean square error and mean absolute percentage error. It is shown that our approach can perform well by using fewer training data points than the traditional neural network.
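    Two ingredients of an SVR forecasting setup can be sketched without any library: the epsilon-insensitive loss that SVR minimizes and the mean absolute percentage error used for evaluation. The lagged-window construction of training samples is an assumption made for illustration, since the abstract does not say how the inputs were built; all names and numbers are hypothetical.

```python
def epsilon_insensitive_loss(observed, predicted, epsilon):
    """SVR loss: errors smaller than epsilon cost nothing, larger ones grow linearly."""
    return max(0.0, abs(observed - predicted) - epsilon)


def mape(observed, predicted):
    """Mean absolute percentage error, in percent."""
    n = len(observed)
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / n


def lagged_samples(series, n_lags):
    """Turn a series into (previous n_lags values, next value) training pairs."""
    return [(series[i - n_lags:i], series[i])
            for i in range(n_lags, len(series))]


flux = [1, 2, 3, 4]                       # hypothetical daily index values
pairs = lagged_samples(flux, 2)           # inputs of 2 past days, target is the next day
inside_tube = epsilon_insensitive_loss(100.0, 103.0, 5.0)
outside_tube = epsilon_insensitive_loss(100.0, 110.0, 5.0)
error_pct = mape([100.0, 200.0], [110.0, 180.0])
```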

  20. Overview of the SAE G-11 RMSL (Reliability, Maintainability, Supportability, and Logistics) Division Activities and Technical Projects

    NASA Technical Reports Server (NTRS)

    Singhal, Surendra N.

    2003-01-01

    The SAE G-11 RMSL (Reliability, Maintainability, Supportability, and Logistics) Division activities include identification and fulfillment of joint industry, government, and academia needs for development and implementation of RMSL technologies. Four Projects in the Probabilistic Methods area and two in the area of RMSL have been identified. These are: (1) Evaluation of Probabilistic Technology - progress has been made toward the selection of probabilistic application cases. Future effort will focus on assessment of multiple probabilistic software packages in solving selected engineering problems using probabilistic methods. Relevance to Industry & Government - Case studies of typical problems encountering uncertainties, results of solutions to these problems run by different codes, and recommendations on which code is applicable for what problems; (2) Probabilistic Input Preparation - progress has been made in identifying problem cases such as those with no data, little data and sufficient data. Future effort will focus on developing guidelines for preparing input for probabilistic analysis, especially with no or little data. Relevance to Industry & Government - Too often, we get bogged down thinking we need a lot of data before we can quantify uncertainties. Not true. There are ways to do credible probabilistic analysis with little data; (3) Probabilistic Reliability - probabilistic reliability literature search has been completed along with what differentiates it from statistical reliability. Work on computation of reliability based on quantification of uncertainties in primitive variables is in progress. Relevance to Industry & Government - Correct reliability computations both at the component and system level are needed so one can design an item based on its expected usage and life span; (4) Real World Applications of Probabilistic Methods (PM) - A draft of volume 1 comprising aerospace applications has been released. Volume 2, a compilation of real world

  1. Regression of melanoma metastases after immunotherapy is associated with activation of antigen presentation and interferon-mediated rejection genes

    PubMed Central

    Carretero, Rafael; Wang, Ena; Rodriguez, Ana I.; Reinboth, Jennifer; Ascierto, Maria L.; Engle, Alyson M.; Liu, Hui; Camacho, Francisco M.; Marincola, Francesco M.; Garrido, Federico; Cabrera, Teresa

    2012-01-01

    We present the results of a comparative gene expression analysis of 15 metastases (10 regressing and 5 progressing) obtained from 2 melanoma patients with mixed response following different forms of immunotherapy. Whole genome transcriptional analysis clearly indicates that regression of melanoma metastases is due to an acute immune rejection mediated by the upregulation of genes involved in antigen presentation and interferon-mediated response (STAT-1/IRF-1) in all the regressing metastases from both patients. In contrast, progressing metastases showed low transcription levels of genes involved in these pathways. Histological analysis showed T cells and HLA-DR positive infiltrating cells in the regressing but not in the progressing metastases. Quantitative expression analysis of HLA-A, B and C genes on microdissected tumoral regions indicates higher HLA expression in regressing than in progressing metastases. The molecular signature obtained in melanoma rejection appeared to be similar to that observed in other forms of immune-mediated tissue-specific rejection such as allograft, pathogen clearance, graft versus host or autoimmune disease, supporting the immunological constant of rejection. We favor the idea that the major factor determining the success or failure of immunotherapy is the nature of HLA Class I alterations in tumor cells and not the type of immunotherapy used. If the molecular alteration is reversible by the immunotherapy, the HLA expression will be upregulated and the lesion will be recognized and rejected. In contrast, if the defect is structural the MHC Class I expression will remain unchanged and the lesion will progress. PMID:21964766

  2. Predictive Ability of Pender's Health Promotion Model for Physical Activity and Exercise in People with Spinal Cord Injuries: A Hierarchical Regression Analysis

    ERIC Educational Resources Information Center

    Keegan, John P.; Chan, Fong; Ditchman, Nicole; Chiu, Chung-Yi

    2012-01-01

    The main objective of this study was to validate Pender's Health Promotion Model (HPM) as a motivational model for exercise/physical activity self-management for people with spinal cord injuries (SCIs). Quantitative descriptive research design using hierarchical regression analysis (HRA) was used. A total of 126 individuals with SCI were recruited…

  3. Explorations in Statistics: Regression

    ERIC Educational Resources Information Center

    Curran-Everett, Douglas

    2011-01-01

    Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This seventh installment of "Explorations in Statistics" explores regression, a technique that estimates the nature of the relationship between two things for which we may only surmise a mechanistic or predictive…

  4. Survival Data and Regression Models

    NASA Astrophysics Data System (ADS)

    Grégoire, G.

    2014-12-01

    We start this chapter by introducing some basic elements for the analysis of censored survival data. Then we focus on right-censored data and develop two types of regression models. The first one concerns the so-called accelerated failure time (AFT) models, which are parametric models where a function of a parameter depends linearly on the covariables. The second one is a semiparametric model, where the covariables enter in a multiplicative form in the expression of the hazard rate function. The main statistical tool for analysing these regression models is the maximum likelihood methodology and, although we recall some essential results about the ML theory, we refer to the chapter "Logistic Regression" for a more detailed presentation.
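    For reference, the two model families mentioned above are commonly written as follows. The notation (survival time T, covariate vector x, coefficient vector \beta, scale \sigma, error term \varepsilon, baseline hazard \lambda_0) is the usual textbook one and is assumed here; it is not taken from the chapter.

```latex
% Accelerated failure time (AFT) model: the log survival time is linear in the covariates
\log T = x^{\top}\beta + \sigma\,\varepsilon

% Proportional hazards (semiparametric) model: covariates act multiplicatively on the hazard
\lambda(t \mid x) = \lambda_0(t)\,\exp\!\left(x^{\top}\beta\right)
```

    In the first form a distribution is specified for \varepsilon, which makes the model parametric; in the second the baseline hazard \lambda_0 is left unspecified, which is what makes the model semiparametric.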

  5. Incremental logistic regression for customizing automatic diagnostic models.

    PubMed

    Tortajada, Salvador; Robles, Montserrat; García-Gómez, Juan Miguel

    2015-01-01

    In recent decades, following new trends in medicine, statistical learning techniques have been used to develop automatic diagnostic models that aid clinical experts through the use of Clinical Decision Support Systems. The development of these models requires a large, representative amount of data, which is commonly obtained from one hospital or a group of hospitals after expensive and time-consuming gathering, preprocessing, and validation of cases. After development, the model has to pass an external validation that is often carried out in a different hospital or health center. Experience shows that the models often perform below expectations. Furthermore, sending and storing patient data requires ethical approval and patient consent. For these reasons, we introduce an incremental learning algorithm based on the Bayesian inference approach that may allow us to build an initial model with a smaller number of cases and update it incrementally when new data are collected, or even recalibrate a model from a different center using a reduced number of cases. The performance of our algorithm is demonstrated on different benchmark datasets and a real brain tumor dataset; we compare its performance to a previous incremental algorithm and a non-incremental Bayesian model, showing that the algorithm is independent of the data model, iterative, and has good convergence.

  6. Assessing Lake Trophic Status: A Proportional Odds Logistic Regression Model

    EPA Science Inventory

    Lake trophic state classifications are good predictors of ecosystem condition and are indicative of both ecosystem services (e.g., recreation and aesthetics), and disservices (e.g., harmful algal blooms). Methods for classifying trophic state are based off the foundational work o...

  7. Analyzing Army Reserve Unsatisfactory Participants through Logistic Regression

    DTIC Science & Technology

    2012-06-08

    interactions, and referenced Dr. Fredric Jablin’s four stages of socialization (anticipatory socialization, organizational encounter, metamorphosis ...Soldier’s encounter is positive he begins the metamorphosis stage, which “marks the newcomer’s alignment of expectations to those of the organization...anticipatory socialization, encounter, and metamorphosis stages do not have enough time in the unit to make educated decisions. The metamorphosis stage should

  8. Measuring NAVSPASUR Sensor Performance using Logistic Regression Models

    DTIC Science & Technology

    1992-09-01

    consists of three transmitting and six receiving stations or sites. The three transmitting sites, located in Lake Kickapoo TX, Gila River AZ and Jordan...frequency of 216.980 MHz. The largest of the transmitters, located in Lake Kickapoo TX, operates at 810 kW of output power and consists of eighteen...parameters for the Lake Kickapoo transmitter and the San Diego receiver’ will be used. The following parameters are given [Ref. 1: p. 6-7]: P, = 6.76 X 10

  9. Interpreting Multiple Logistic Regression Coefficients in Prospective Observational Studies

    DTIC Science & Technology

    1982-11-01

    prompted close examination of the issue at a workshop on hypertriglyceridemia where some of the cautions and perspectives given in this paper were...longevity," Circulation, 34, 679-697, (1966). 19. Lippel, K., Tyroler, H., Eder, H., Gotto, A., and Vahouny, G. "Relationship of hypertriglyceridemia

  10. Background or Experience? Using Logistic Regression to Predict College Retention

    ERIC Educational Resources Information Center

    Synco, Tracee M.

    2012-01-01

    Tinto, Astin and countless others have researched the retention and attrition of students from college for more than thirty years. However, the six year graduation rate for all first-time full-time freshmen for the 2002 cohort was 57%. This study sought to determine the retention variables that predicted continued enrollment of entering freshmen…

  11. Logistics background study: underground mining

    SciTech Connect

    Hanslovan, J. J.; Visovsky, R. G.

    1982-02-01

    Logistical functions that are normally associated with US underground coal mining are investigated and analyzed. These functions comprise all activities and services that support the producing sections of the mine. The report provides a better understanding of how these functions impact coal production in terms of time, cost, and safety. Major underground logistics activities are analyzed and include: transportation of personnel, supplies and equipment; transportation of coal and rock; electrical distribution and communications systems; water handling; hydraulics; and ventilation systems. Recommended areas for future research are identified and prioritized.

  12. A regularization corrected score method for nonlinear regression models with covariate error.

    PubMed

    Zucker, David M; Gorfine, Malka; Li, Yi; Tadesse, Mahlet G; Spiegelman, Donna

    2013-03-01

    Many regression analyses involve explanatory variables that are measured with error, and failing to account for this error is well known to lead to biased point and interval estimates of the regression coefficients. We present here a new general method for adjusting for covariate error. Our method consists of an approximate version of the Stefanski-Nakamura corrected score approach, using the method of regularization to obtain an approximate solution of the relevant integral equation. We develop the theory in the setting of classical likelihood models; this setting covers, for example, linear regression, nonlinear regression, logistic regression, and Poisson regression. The method is extremely general in terms of the types of measurement error models covered, and is a functional method in the sense of not involving assumptions on the distribution of the true covariate. We discuss the theoretical properties of the method and present simulation results in the logistic regression setting (univariate and multivariate). For illustration, we apply the method to data from the Harvard Nurses' Health Study concerning the relationship between physical activity and breast cancer mortality in the period following a diagnosis of breast cancer.

  13. Regression: A Bibliography.

    ERIC Educational Resources Information Center

    Pedrini, D. T.; Pedrini, Bonnie C.

    Regression, another mechanism studied by Sigmund Freud, has had much research, e.g., hypnotic regression, frustration regression, schizophrenic regression, and infra-human-animal regression (often directly related to fixation). Many investigators worked with hypnotic age regression, which has a long history, going back to Russian reflexologists.…

  14. Logistics Management: New trends in the Reverse Logistics

    NASA Astrophysics Data System (ADS)

    Antonyová, A.; Antony, P.; Soewito, B.

    2016-04-01

    The present level and quality of the environment are directly dependent on our access to natural resources, as well as on their sustainability. In particular, production activities and the phenomena associated with them have a direct impact on the future of our planet. The recycling process, which in large enterprises often becomes an important and integral part of the production program, is usually problematic in small and medium-sized enterprises. We can specify a few factors that have a direct impact on the development and successful application of an effective reverse logistics system. Finding ways to an economically acceptable model of reverse logistics, focusing on converting waste materials into renewable energy, is the task in progress.

  15. Just-In-Time Logistics

    DTIC Science & Technology

    2006-02-07

    Logistics modernization, end-to-end logistics, logistics transformation, and just-in-time logistics -- whichever name it is being called today, it is...demonstrations as to what these systems will do for the end user, consumer confidence will increase and "just-in-time" logistics will lead to a lighter and leaner combat logistics support.

  16. The Automated Logistics Element Planning System (ALEPS)

    NASA Technical Reports Server (NTRS)

    Schwaab, Douglas G.

    1991-01-01

    The design and functions of ALEPS (Automated Logistics Element Planning System) are described. ALEPS is a computer system that will automate planning and decision support for Space Station Freedom Logistical Elements (LEs) resupply and return operations. ALEPS provides data management, planning, analysis, monitoring, interfacing, and flight certification for support of LE flight load planning activities. The prototype ALEPS algorithm development is described.

  17. Stepwise drying of Lake Turkana at the end of the African Humid Period: a forced regression modulated by solar activity variations?

    NASA Astrophysics Data System (ADS)

    Nutz, Alexis; Schuster, Mathieu

    2016-12-01

    Although the timing of the termination of the African Humid Period (AHP) is now relatively well established, the modes and controlling factors of this drying are still debated. Here, through a geomorphological approach, we characterize the regression of Lake Turkana at the end of the AHP. We show that lake level fall during this period was not continuous but rather stepwise and consisted of five episodes of rapid lake level fall separated by episodes marked by slower rates of lake level fall. Whereas the overall regressive trend reflects a decrease in regional precipitations linked to the gradual reduction in Northern Hemisphere summer insolation, itself controlled by orbital precession, we focus discussion on the origin of the five periods of accelerated lake level fall. We propose that these periods are due to temporary reductions in rainfall across the Lake Turkana area associated with repeated westward displacement of the Congo Air Boundary (CAB) during solar activity minima.

  18. Focused Logistics: Putting Agility in Agile Logistics

    DTIC Science & Technology

    2011-05-19

    envisioned an agile and adaptable logistics system built around common situational understanding.5 The Focused Logistics concept specified the requirement to...the tracking of resources moving through the TD network. As a result, 112 Ibid, 20. 113 Ibid, 18-20. 39 distribution centers tagged inbound

  19. Ridge Regression: A Panacea?

    ERIC Educational Resources Information Center

    Walton, Joseph M.; And Others

    1978-01-01

    Ridge regression is an approach to the problem of large standard errors of regression estimates of intercorrelated regressors. The effect of ridge regression on the estimated squared multiple correlation coefficient is discussed and illustrated. (JKS)
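
    The shrinkage behind ridge regression can be sketched for the smallest interesting case, two intercorrelated regressors and no intercept. The function name ridge_two_regressors, the penalty value and the toy data below are assumptions made for illustration and are not taken from the record above. The sketch solves the penalized normal equations (X'X + lam*I) b = X'y directly.

```python
import math


def ridge_two_regressors(x1, x2, y, lam):
    """Solve (X'X + lam*I) b = X'y for two regressors, no intercept."""
    s11 = sum(a * a for a in x1) + lam           # penalized diagonal entry
    s22 = sum(b * b for b in x2) + lam           # penalized diagonal entry
    s12 = sum(a * b for a, b in zip(x1, x2))     # cross-product term
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12                  # 2x2 determinant
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)


# Hypothetical, nearly collinear regressors and a noise-free response.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1]
y = [a + 2.0 * b for a, b in zip(x1, x2)]

ols = ridge_two_regressors(x1, x2, y, lam=0.0)    # ordinary least squares
ridge = ridge_two_regressors(x1, x2, y, lam=1.0)  # penalized estimate

print("OLS:", ols, "norm", math.hypot(*ols))
print("ridge:", ridge, "norm", math.hypot(*ridge))
```

    With lam = 0 the function reduces to ordinary least squares. A positive lam enlarges the determinant of the nearly singular cross-product matrix, which is what stabilizes the estimates, at the price of shrinking the coefficient vector toward zero.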

  20. The "Smarter Regression" Add-In for Linear and Logistic Regression in Excel

    DTIC Science & Technology

    2007-07-01

    only Visual Basic, the built-in programming language of Excel (Walkenbach, 1999). We wanted to avoid the use of external Dynamic Linked Libraries...least two different ways of entering results into workbook cells in Visual Basic. One is to establish an array in Visual Basic, fill up the elements of

  1. Quantitative structure-property relationship (QSPR) for the adsorption of organic compounds onto activated carbon cloth: Comparison between multiple linear regression and neural network

    SciTech Connect

    Brasquet, C.; Bourges, B.; Le Cloirec, P.

    1999-12-01

    The adsorption of 55 organic compounds is carried out onto a recently discovered adsorbent, activated carbon cloth. Isotherms are modeled using the Freundlich classical model, and the large database generated allows qualitative assumptions about the adsorption mechanism. However, to confirm these assumptions, a quantitative structure-property relationship methodology is used to assess the correlations between an adsorbability parameter (expressed using the Freundlich parameter K) and topological indices related to the compounds' molecular structure (molecular connectivity indices, MCI). This correlation is set up by means of two different statistical tools, multiple linear regression (MLR) and neural network (NN). A principal component analysis is carried out to generate new and uncorrelated variables. It enables the relations between the MCI to be analyzed, but the multiple linear regression assessed using the principal components (PCs) has a poor statistical quality and introduces high order PCs, too inaccurate for an explanation of the adsorption mechanism. The correlations are thus set up using the original variables (MCI), and both statistical tools, multiple linear regression and neural network, are compared from a descriptive and predictive point of view. To compare the predictive ability of both methods, a test database of 10 organic compounds is used.

  2. Introduction to the use of regression models in epidemiology.

    PubMed

    Bender, Ralf

    2009-01-01

    Regression modeling is one of the most important statistical techniques used in analytical epidemiology. By means of regression models the effect of one or several explanatory variables (e.g., exposures, subject characteristics, risk factors) on a response variable such as mortality or cancer can be investigated. From multiple regression models, adjusted effect estimates can be obtained that take the effect of potential confounders into account. Regression methods can be applied in all epidemiologic study designs so that they represent a universal tool for data analysis in epidemiology. Different kinds of regression models have been developed in dependence on the measurement scale of the response variable and the study design. The most important methods are linear regression for continuous outcomes, logistic regression for binary outcomes, Cox regression for time-to-event data, and Poisson regression for frequencies and rates. This chapter provides a nontechnical introduction to these regression models with illustrating examples from cancer research.

  3. Security in Logistics

    NASA Astrophysics Data System (ADS)

    Cempírek, Václav; Nachtigall, Petr; Široký, Jaromír

    2016-12-01

    This paper deals with the security of logistic chains with respect to incorrect declaration of transported goods, fraudulent transport and forwarding companies, and possible threats caused by political influences. The main goal of this paper is to highlight the possible increase in logistic costs due to these fraudulent threats. An analysis of technological processes is provided, and the increase in transport times caused by the possible threats is evaluated in terms of economic costs. In the conclusion, possible threats to companies' efficiency in logistics due to the increase in costs, means of transport and human resources are pointed out.

  4. Examining Non-Linear Associations between Accelerometer-Measured Physical Activity, Sedentary Behavior, and All-Cause Mortality Using Segmented Cox Regression

    PubMed Central

    Lee, Paul H.

    2016-01-01

    Healthy adults are advised to perform at least 150 min of moderate-intensity physical activity weekly, but this advice is based on studies using self-reports of questionable validity. This study examined the dose-response relationship of accelerometer-measured physical activity and sedentary behaviors on all-cause mortality using segmented Cox regression to empirically determine the break-points of the dose-response relationship. Data from 7006 adult participants aged 18 or above in the National Health and Nutrition Examination Survey waves 2003–2004 and 2005–2006 were included in the analysis and linked with death certificate data using a probabilistic matching approach in the National Death Index through December 31, 2011. Physical activity and sedentary behavior were measured using ActiGraph model 7164 accelerometer over the right hip for 7 consecutive days. Each minute with accelerometer count <100; 1952–5724; and ≥5725 were classified as sedentary, moderate-intensity physical activity, and vigorous-intensity physical activity, respectively. Segmented Cox regression was used to estimate the hazard ratio (HR) of time spent in sedentary behaviors, moderate-intensity physical activity, and vigorous-intensity physical activity and all-cause mortality, adjusted for demographic characteristics, health behaviors, and health conditions. Data were analyzed in 2016. During 47,119 person-years of follow-up, 608 deaths occurred. Each additional hour per day of sedentary behaviors was associated with an HR of 1.15 (95% CI 1.01, 1.31) among participants who spent at least 10.9 h per day on sedentary behaviors, and each additional minute per day spent on moderate-intensity physical activity was associated with an HR of 0.94 (95% CI 0.91, 0.96) among participants with daily moderate-intensity physical activity ≤14.1 min. Associations of moderate physical activity and sedentary behaviors on all-cause mortality were independent of each other. To conclude, evidence from

  5. A change in the pattern of activity affects the developmental regression of the Purkinje cell polyinnervation by climbing fibers in the rat cerebellum.

    PubMed

    Andjus, P R; Zhu, L; Cesa, R; Carulli, D; Strata, P

    2003-01-01

    Pattern of activity during development is important for the refinement of the final architecture of the brain. In the cerebellar cortex, the regression from multiple to single climbing fiber innervation of the Purkinje cell occurs during development between postnatal days (P) 5 and 15. However, the regression is hampered by altering in various ways the morpho-functional integrity of the parallel fiber input. In rats we disrupted the normal activity pattern of the climbing fiber, the terminal arbor of the inferior olive neurons, by administering harmaline for 4 days from P9 to P12. At all studied ages (P15-87) after harmaline treatment multiple (double only) climbing fiber EPSC-steps persist in 28% of cells as compared with none in the control. The ratio between the amplitudes of the larger and the smaller climbing fiber-evoked EPSC increases in parallel with the decline of the polyinnervation factor, indicating a gradual enlargement of the synaptic contribution of the winning climbing fiber synapse at the expense of the losing one. Harmaline treatment had no later effects on the climbing fiber EPSC kinetics and I/V relation in Purkinje cells (P15-36). However, there was a rise in the paired-pulse depression indicating a potentiation of the presynaptic mechanisms. In the same period, after harmaline treatment, parallel fiber-Purkinje cell electrophysiology was unaffected. The distribution of parallel fiber synaptic boutons was also not changed. Thus, a change in the pattern of activity during a narrow developmental period may affect climbing fiber-Purkinje cell synapse competition resulting in occurrence of multiple innervation at least up to 3 months of age. Our results extend the current view on the role of the pattern of activity in the refinement of neuronal connections during development. They suggest that many similar results obtained by different gene or receptor manipulations might be simply the consequence of disrupting the pattern of activity.

  6. Green Logistics Management

    NASA Astrophysics Data System (ADS)

    Chang, Yoon S.; Oh, Chang H.

    Nowadays, environmental management has become a critical business consideration for companies that must survive many regulations and tough business requirements. Most world-leading companies are now aware that environment-friendly technology and management are critical to the sustainable growth of the company. The environment market has seen continuous growth, reaching 532B in 2000 and 590B in 2004, and is expected to grow to 700B in 2010. It is not hard to see environment-friendly efforts in almost all aspects of business operations. Such trends can easily be found in the logistics area. Green logistics aims to make environmentally friendly decisions throughout a product lifecycle. Therefore, for the success of green logistics, it is critical to have real-time tracking capability on the product throughout the product lifecycle and a smart solution service architecture. In this chapter, we introduce an RFID-based green logistics solution and service.

  7. Lunar Commercial Mining Logistics

    NASA Astrophysics Data System (ADS)

    Kistler, Walter P.; Citron, Bob; Taylor, Thomas C.

    2008-01-01

    Innovative commercial logistics is required for supporting lunar resource recovery operations and assisting larger consortiums in lunar mining, base operations, camp consumables and the future commercial sales of propellant over the next 50 years. To assist in lowering overall development costs, "reuse" innovation is suggested in reusing modified LTS in-space hardware for use on the moon's surface, developing product lines for recovered gases, regolith construction materials, surface logistics services, and other services as they evolve (Kistler, Citron and Taylor, 2005). Surface logistics architecture is designed to have sustainable growth over 50 years, financed by private sector partners and capable of cargo transportation in both directions in support of lunar development and resource recovery development. The author's perspective on the importance of logistics is based on five years' experience at remote sites on Earth, where remote base supply chain logistics didn't always work (Taylor, 1975a). The planning and control of the flow of goods and materials to and from the moon's surface may be the most complicated logistics challenges yet to be attempted. Affordability is tied to the innovation and ingenuity used to keep the transportation and surface operations costs as low as practical. Eleven innovations are proposed and discussed by an entrepreneurial commercial space startup team that has had success in introducing commercial space innovation and reducing the cost of space operations in the past. This logistics architecture offers NASA and other exploring nations a commercial alternative for non-essential cargo. Five transportation technologies and eleven surface innovations create the logistics transportation system discussed.

  8. Space Station fluid management logistics

    NASA Technical Reports Server (NTRS)

    Dominick, Sam M.

    1990-01-01

    Viewgraphs and discussion on space station fluid management logistics are presented. Topics covered include: fluid management logistics - issues for Space Station Freedom evolution; current fluid logistics approach; evolution of Space Station Freedom fluid resupply; launch vehicle evolution; ELV logistics system approach; logistics carrier configuration; expendable fluid/propellant carrier description; fluid carrier design concept; logistics carrier orbital operations; carrier operations at space station; summary/status of orbital fluid transfer techniques; Soviet progress tanker system; and Soviet propellant resupply system observations.

  9. [Application of logistics in resources management of medical asset].

    PubMed

    Miroshnichenko, Iu V; Goriachev, A B; Bunin, S A

    2011-06-01

    The usage of basic regulations of logistics in practical activity for providing joints and military units with medical asset is theoretically justified. The role of logistics in organizing, building and functioning of military (armed forces) medical supply system is found out. The methods of solving urgent problems of improvement the resources management of medical asset on the basis of logistics are presented.

  10. Comparison of various error functions in predicting the optimum isotherm by linear and non-linear regression analysis for the sorption of basic red 9 by activated carbon.

    PubMed

    Kumar, K Vasanth; Porkodi, K; Rocha, F

    2008-01-15

    A comparison of linear and non-linear regression method in selecting the optimum isotherm was made to the experimental equilibrium data of basic red 9 sorption by activated carbon. The r(2) was used to select the best fit linear theoretical isotherm. In the case of non-linear regression method, six error functions namely coefficient of determination (r(2)), hybrid fractional error function (HYBRID), Marquardt's percent standard deviation (MPSD), the average relative error (ARE), sum of the errors squared (ERRSQ) and sum of the absolute errors (EABS) were used to predict the parameters involved in the two and three parameter isotherms and also to predict the optimum isotherm. Non-linear regression was found to be a better way to obtain the parameters involved in the isotherms and also the optimum isotherm. For two parameter isotherm, MPSD was found to be the best error function in minimizing the error distribution between the experimental equilibrium data and predicted isotherms. In the case of three parameter isotherm, r(2) was found to be the best error function to minimize the error distribution structure between experimental equilibrium data and theoretical isotherms. The present study showed that the size of the error function alone is not a deciding factor to choose the optimum isotherm. In addition to the size of error function, the theory behind the predicted isotherm should be verified with the help of experimental data while selecting the optimum isotherm. A coefficient of non-determination, K(2) was explained and was found to be very useful in identifying the best error function while selecting the optimum isotherm.
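
    The error functions named in this record can be written out in a few lines. The sketch below uses the forms in which these functions are commonly cited in the sorption literature. The record itself does not state its formulas, so the exact definitions are an assumption, as are the helper names freundlich and error_functions and the toy numbers. Here n is the number of data points and p the number of isotherm parameters.

```python
import math


def freundlich(c_eq, k_f, n_f):
    """Freundlich isotherm: uptake q = K_F * C_e**(1/n)."""
    return k_f * c_eq ** (1.0 / n_f)


def error_functions(q_exp, q_calc, n_params):
    """Error functions in their commonly cited forms (assumed definitions)."""
    n = len(q_exp)
    dof = n - n_params  # degrees of freedom used by HYBRID and MPSD
    resid = [e - c for e, c in zip(q_exp, q_calc)]
    rel = [r / e for r, e in zip(resid, q_exp)]  # relative residuals
    return {
        "ERRSQ": sum(r * r for r in resid),
        "EABS": sum(abs(r) for r in resid),
        "ARE": 100.0 / n * sum(abs(v) for v in rel),
        "HYBRID": 100.0 / dof * sum(r * r / e for r, e in zip(resid, q_exp)),
        "MPSD": 100.0 * math.sqrt(sum(v * v for v in rel) / dof),
    }


# Hypothetical measured uptakes and a two-parameter model prediction.
q_measured = [10.0, 20.0, 40.0]
q_predicted = [11.0, 18.0, 40.0]
scores = error_functions(q_measured, q_predicted, n_params=2)

# A Freundlich curve with assumed parameters K_F = 2 and n = 2.
curve = [freundlich(c, k_f=2.0, n_f=2.0) for c in (1.0, 4.0, 9.0)]

print(scores)
print(curve)
```

    In a non-linear fit, one of these quantities would be minimized over the isotherm parameters. ERRSQ and EABS weight absolute deviations and so favor the high-concentration points, whereas ARE, HYBRID and MPSD divide by the measured uptake and give low-concentration points more influence.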

  11. Removal of toxic zinc from water/wastewater using eucalyptus seeds activated carbon: non-linear regression analysis.

    PubMed

    Senthil Kumar, Ponnusamy; Saravanan, Anbalagan; Anish Kumar, Kodyingil; Yashwanth, Ramesh; Visvesh, Sridharan

    2016-08-01

    In the present study, a novel activated carbon was prepared from low-cost eucalyptus seeds and utilised for the effective removal of toxic zinc from water/wastewater. The prepared adsorbent was characterised by Fourier transform infrared spectroscopy and scanning electron microscopy. The adsorption process was studied experimentally to optimise the influencing factors, such as adsorbent dosage, solution pH, contact time, initial zinc concentration, and temperature, for the maximum removal of zinc from aqueous solution. The adsorption isotherm of zinc removal followed the Freundlich model, and the kinetics followed the pseudo-second order model. The Langmuir monolayer adsorption capacity of the adsorbent for zinc removal was evaluated as 80.37 mg/g. The results of the thermodynamic studies suggested that the adsorption process was exothermic, thermodynamically feasible and spontaneous. Finally, a batch adsorber was designed to remove zinc from a known volume and known concentration of wastewater using the best obeyed model, the Freundlich model. The experimental results showed that the newly prepared material can be effectively utilised as a cheap material for the adsorption of toxic metal ions from contaminated water.

  12. Isotherms and thermodynamics by linear and non-linear regression analysis for the sorption of methylene blue onto activated carbon: comparison of various error functions.

    PubMed

    Kumar, K Vasanth; Porkodi, K; Rocha, F

    2008-03-01

    A comparison of linear and non-linear regression method in selecting the optimum isotherm was made to the experimental equilibrium data of methylene blue sorption by activated carbon. The r2 was used to select the best fit linear theoretical isotherm. In the case of non-linear regression method, six error functions, namely coefficient of determination (r2), hybrid fractional error function (HYBRID), Marquardt's percent standard deviation (MPSD), average relative error (ARE), sum of the errors squared (ERRSQ) and sum of the absolute errors (EABS) were used to predict the parameters involved in the two and three parameter isotherms and also to predict the optimum isotherm. For two parameter isotherm, MPSD was found to be the best error function in minimizing the error distribution between the experimental equilibrium data and predicted isotherms. In the case of three parameter isotherm, r2 was found to be the best error function to minimize the error distribution structure between experimental equilibrium data and theoretical isotherms. The present study showed that the size of the error function alone is not a deciding factor to choose the optimum isotherm. In addition to the size of error function, the theory behind the predicted isotherm should be verified with the help of experimental data while selecting the optimum isotherm. A coefficient of non-determination, K2 was explained and was found to be very useful in identifying the best error function while selecting the optimum isotherm.

  13. Regressive systemic sclerosis.

    PubMed Central

    Black, C; Dieppe, P; Huskisson, T; Hart, F D

    1986-01-01

    Systemic sclerosis is a disease which usually progresses or reaches a plateau with persistence of symptoms and signs. Regression is extremely unusual. Four cases of established scleroderma are described in which regression is well documented. The significance of this observation and possible mechanisms of disease regression are discussed. PMID:3718012

  14. NCCS Regression Test Harness

    SciTech Connect

    Tharrington, Arnold N.

    2015-09-09

    The NCCS Regression Test Harness is a software package that provides a framework to perform regression and acceptance testing on NCCS High Performance Computers. The package is written in Python and has only the dependency of a Subversion repository to store the regression tests.

  15. Fully Regressive Melanoma

    PubMed Central

    Ehrsam, Eric; Kallini, Joseph R.; Lebas, Damien; Modiano, Philippe; Cotten, Hervé

    2016-01-01

    Fully regressive melanoma is a phenomenon in which the primary cutaneous melanoma becomes completely replaced by fibrotic components as a result of host immune response. Although 10 to 35 percent of cases of cutaneous melanomas may partially regress, fully regressive melanoma is very rare; only 47 cases have been reported in the literature to date. All of the cases of fully regressive melanoma reported in the literature were diagnosed in conjunction with metastasis on a patient. The authors describe a case of fully regressive melanoma without any metastases at the time of its diagnosis. Characteristic findings on dermoscopy, as well as the absence of melanoma on final biopsy, confirmed the diagnosis. PMID:27672418

  16. Processes and procedures for a worldwide biological samples distribution; product assurance and logistic activities to support the mice drawer system tissue sharing event

    NASA Astrophysics Data System (ADS)

    Benassai, Mario; Cotronei, Vittorio

    The Mice Drawer System (MDS) is a scientific payload developed by the Italian Space Agency (ASI); it hosted 6 mice on the International Space Station (ISS) and re-entered on ground on November 28, 2009 with the STS 129 at KSC. Linked to the MDS experiment, a Tissue Sharing Program (TSP) was developed in order to make available to 16 Payload Investigators (PI) (located in USA, Canada, EU - Italy, Belgium and Germany - and Japan) the biological samples coming from the mice. ALTEC SpA (a PPP owned by ASI, TAS-I and local institutions) was responsible to support the logistics aspects of the MDS samples for the first MDS mission, in the frame of Italian Space Agency (ASI) OSMA program (OSteoporosis and Muscle Atrophy). The TSP resulted in a complex scenario, as ASI progressively extended the original OSMA Team also to researchers from other ASI programs and from other Agencies (ESA, NASA, JAXA). The science coordination was performed by the University of Genova (UNIGE). ALTEC has managed all the logistic process with the support of a specialized freight forwarder agent during the whole shipping operation phases. ALTEC formalized all the steps from the handover of samples by the dissection Team to the packaging and shipping process in a dedicated procedure. ALTEC approached all the work in a structured way, performing: A study of the aspects connected to international shipments of biological samples. A cooperative work with UNIGE/ASI/PIs to identify all the needs of the various researchers and their compatibility. A complete revision and integration of shipment requirements (addresses, temperatures, samples, materials and so on). A complete definition of the final shipment scenario in terms of boxes, content, refrigerant and requirements. A formal approach to identification and selection of the most suited and specialized Freight Forwarder. A clear identification of all the processes from sample dissection by PI Team, sample processing, freezing, tube preparation

  17. Semiparametric regression during 2003–2007*

    PubMed Central

    Ruppert, David; Wand, M.P.; Carroll, Raymond J.

    2010-01-01

    Semiparametric regression is a fusion between parametric regression and nonparametric regression that integrates low-rank penalized splines, mixed model and hierarchical Bayesian methodology – thus allowing more streamlined handling of longitudinal and spatial correlation. We review progress in the field over the five-year period between 2003 and 2007. We find semiparametric regression to be a vibrant field with substantial involvement and activity, continual enhancement and widespread application. PMID:20305800

  18. Logistics Modeling for Lunar Exploration Systems

    NASA Technical Reports Server (NTRS)

    Andraschko, Mark R.; Merrill, R. Gabe; Earle, Kevin D.

    2008-01-01

    The extensive logistics required to support extended crewed operations in space make effective modeling of logistics requirements and deployment critical to predicting the behavior of human lunar exploration systems. This paper discusses the software that has been developed as part of the Campaign Manifest Analysis Tool in support of strategic analysis activities under the Constellation Architecture Team - Lunar. The described logistics module enables definition of logistics requirements across multiple surface locations and allows for the transfer of logistics between those locations. A key feature of the module is the loading algorithm that is used to efficiently load logistics by type into carriers and then onto landers. Attention is given to the capabilities and limitations of this loading algorithm, particularly with regard to surface transfers. These capabilities are described within the context of the object-oriented software implementation, with details provided on the applicability of using this approach to model other human exploration scenarios. Some challenges of incorporating probabilistics into this type of logistics analysis model are discussed at a high level.

  19. ISS Logistics Hardware Disposition and Metrics Validation

    NASA Technical Reports Server (NTRS)

    Rogers, Toneka R.

    2010-01-01

    I was assigned to the Logistics Division of the International Space Station (ISS)/Spacecraft Processing Directorate. The Division consists of eight NASA engineers and specialists who oversee the logistics portion of the Checkout, Assembly, and Payload Processing Services (CAPPS) contract. Boeing, their sub-contractors and the Boeing Prime contract out of Johnson Space Center, provide the Integrated Logistics Support for the ISS activities at Kennedy Space Center. Essentially they ensure that spares are available to support flight hardware processing and the associated ground support equipment (GSE). Boeing maintains a Depot for electrical, mechanical and structural modifications and/or repair capability as required. My assigned task was to learn project management techniques utilized by NASA and its contractors to provide an efficient and effective logistics support infrastructure to the ISS program. Within the Space Station Processing Facility (SSPF) I was exposed to Logistics support components, such as the NASA Spacecraft Services Depot (NSSD) capabilities, Mission Processing tools and techniques, and Warehouse support issues, required for integrating Space Station elements at the Kennedy Space Center. I also supported the identification of near-term ISS Hardware and Ground Support Equipment (GSE) candidates for excessing/disposition prior to October 2010, and the validation of several Logistics Metrics used by the contractor to measure logistics support effectiveness.

  20. Nuclear Shuttle Logistics Configuration

    NASA Technical Reports Server (NTRS)

    1971-01-01

    This 1971 artist's concept shows the Nuclear Shuttle in both its lunar logistics configuration and geosynchronous station configuration. As envisioned by Marshall Space Flight Center Program Development personnel, the Nuclear Shuttle would deliver payloads to lunar orbits or other destinations, then return to Earth orbit for refueling and additional missions.

  1. Poxvirus-Based Active Immunotherapy with PD-1 and LAG-3 Dual Immune Checkpoint Inhibition Overcomes Compensatory Immune Regulation, Yielding Complete Tumor Regression in Mice

    PubMed Central

    dela Cruz, Tracy; Cote, Joseph J.; Gordon, Evan J.; Kemp, Felicia; Xavier, Veronica; Franzusoff, Alex; Rountree, Ryan B.; Mandl, Stefanie J.

    2016-01-01

    Poxvirus-based active immunotherapies mediate anti-tumor efficacy by triggering broad and durable Th1 dominated T cell responses against the tumor. While monotherapy significantly delays tumor growth, it often does not lead to complete tumor regression. It was hypothesized that the induced robust infiltration of IFNγ-producing T cells into the tumor could provoke an adaptive immune evasive response by the tumor through the upregulation of PD-L1 expression. In therapeutic CT26-HER-2 tumor models, MVA-BN-HER2 poxvirus immunotherapy resulted in significant tumor growth delay accompanied by a robust, tumor-infiltrating T cell response that was characterized by low to mid-levels of PD-1 expression on T cells. As hypothesized, this response was countered by significantly increased PD-L1 expression on the tumor and, unexpectedly, also on infiltrating T cells. Synergistic benefit of anti-tumor therapy was observed when MVA-BN-HER2 immunotherapy was combined with PD-1 immune checkpoint blockade. Interestingly, PD-1 blockade stimulated a second immune checkpoint molecule, LAG-3, to be expressed on T cells. Combining MVA-BN-HER2 immunotherapy with dual PD-1 plus LAG-3 blockade resulted in comprehensive tumor regression in all mice treated with the triple combination therapy. Subsequent rejection of tumors lacking the HER-2 antigen by treatment-responsive mice without further therapy six months after the original challenge demonstrated long lasting memory and suggested that effective T cell immunity to novel, non-targeted tumor antigens (antigen spread) had occurred. These data support the clinical investigation of this triple therapy regimen, especially in cancer patients harboring PD-L1neg/low tumors unlikely to benefit from immune checkpoint blockade alone. PMID:26910562

  2. Support vector regression-guided unravelling: antioxidant capacity and quantitative structure-activity relationship predict reduction and promotion effects of flavonoids on acrylamide formation

    PubMed Central

    Huang, Mengmeng; Wei, Yan; Wang, Jun; Zhang, Yu

    2016-01-01

    We used the support vector regression (SVR) approach to predict and unravel the reduction/promotion effects of characteristic flavonoids on acrylamide formation under a low-moisture Maillard reaction system. Results demonstrated the reduction/promotion effects by flavonoids at addition levels of 1–10000 μmol/L. The maximal inhibition rates (51.7%, 68.8% and 26.1%) and promotion rates (57.7%, 178.8% and 27.5%) caused by flavones, flavonols and isoflavones were observed at addition levels of 100 μmol/L and 10000 μmol/L, respectively. The reduction/promotion effects were closely related to the change of trolox equivalent antioxidant capacity (ΔTEAC) and well predicted by triple ΔTEAC measurements via SVR models (R: 0.633–0.900). Flavonols exhibit stronger effects on acrylamide formation than flavones and isoflavones as well as their O-glycoside derivatives, which may be attributed to the number and position of phenolic and 3-enolic hydroxyls. The reduction/promotion effects were well predicted by using optimized quantitative structure-activity relationship (QSAR) descriptors and SVR models (R: 0.926–0.994). Compared to artificial neural network and multi-linear regression models, SVR models exhibited better fitting performance for both TEAC-dependent and QSAR descriptor-dependent predicting work. These observations demonstrate that SVR models are competent predictors and advance our understanding of the future use of natural antioxidants for decreasing acrylamide formation. PMID:27586851

  3. Support vector regression-guided unravelling: antioxidant capacity and quantitative structure-activity relationship predict reduction and promotion effects of flavonoids on acrylamide formation

    NASA Astrophysics Data System (ADS)

    Huang, Mengmeng; Wei, Yan; Wang, Jun; Zhang, Yu

    2016-09-01

    We used the support vector regression (SVR) approach to predict and unravel the reduction/promotion effects of characteristic flavonoids on acrylamide formation under a low-moisture Maillard reaction system. Results demonstrated the reduction/promotion effects by flavonoids at addition levels of 1–10000 μmol/L. The maximal inhibition rates (51.7%, 68.8% and 26.1%) and promotion rates (57.7%, 178.8% and 27.5%) caused by flavones, flavonols and isoflavones were observed at addition levels of 100 μmol/L and 10000 μmol/L, respectively. The reduction/promotion effects were closely related to the change of trolox equivalent antioxidant capacity (ΔTEAC) and well predicted by triple ΔTEAC measurements via SVR models (R: 0.633–0.900). Flavonols exhibit stronger effects on acrylamide formation than flavones and isoflavones as well as their O-glycoside derivatives, which may be attributed to the number and position of phenolic and 3-enolic hydroxyls. The reduction/promotion effects were well predicted by using optimized quantitative structure-activity relationship (QSAR) descriptors and SVR models (R: 0.926–0.994). Compared to artificial neural network and multi-linear regression models, SVR models exhibited better fitting performance for both TEAC-dependent and QSAR descriptor-dependent predicting work. These observations demonstrate that SVR models are competent predictors and advance our understanding of the future use of natural antioxidants for decreasing acrylamide formation.

  4. Optimal estimation for regression models on τ-year survival probability.

    PubMed

    Kwak, Minjung; Kim, Jinseog; Jung, Sin-Ho

    2015-01-01

    A logistic regression method can be applied to regressing the τ-year survival probability on covariates, if there are no censored observations before time τ. But if some observations are incomplete due to censoring before time τ, then the logistic regression cannot be applied. Jung (1996) proposed to modify the score function for logistic regression to accommodate the right-censored observations. His modified score function, motivated for a consistent estimation of regression parameters, becomes a regular logistic score function if no observations are censored before time τ. In this article, we propose a modification of Jung's estimating function for an optimal estimation for the regression parameters in addition to consistency. We prove that the optimal estimator is more efficient than Jung's estimator. This theoretical comparison is illustrated with a real example data analysis and simulations.

  5. Space Shuttle Orbiter logistics - Managing in a dynamic environment

    NASA Technical Reports Server (NTRS)

    Renfroe, Michael B.; Bradshaw, Kimberly

    1990-01-01

    The importance and methods of monitoring logistics vital signs, logistics data sources and acquisition, and converting data into useful management information are presented. With the launch and landing site for the Shuttle Orbiter project at the Kennedy Space Center now totally responsible for its own supportability posture, it is imperative that logistics resource requirements and management be continually monitored and reassessed. Detailed graphs and data concerning various aspects of logistics activities including objectives, inventory operating levels, customer environment, and data sources are provided. Finally, some lessons learned from the Shuttle Orbiter project and logistics options which should be considered by other space programs are discussed.

  6. Targeting Attenuated Interferon-α to Myeloma Cells with a CD38 Antibody Induces Potent Tumor Regression with Reduced Off-Target Activity

    PubMed Central

    Pogue, Sarah L.; Taura, Tetsuya; Bi, Mingying; Yun, Yong; Sho, Angela; Mikesell, Glen; Behrens, Collette; Sokolovsky, Maya; Hallak, Hussein; Rosenstock, Moti; Sanchez, Eric; Chen, Haiming; Berenson, James; Doyle, Anthony; Nock, Steffen; Wilson, David S.

    2016-01-01

    Interferon-α (IFNα) has been prescribed to effectively treat multiple myeloma (MM) and other malignancies for decades. Its use has waned in recent years, however, due to significant toxicity and a narrow therapeutic index (TI). We sought to improve IFNα’s TI by, first, attaching it to an anti-CD38 antibody, thereby directly targeting it to MM cells, and, second, by introducing an attenuating mutation into the IFNα portion of the fusion protein rendering it relatively inactive on normal, CD38 negative cells. This anti-CD38-IFNα(attenuated) immunocytokine, or CD38-Attenukine™, exhibits 10,000-fold increased specificity for CD38 positive cells in vitro compared to native IFNα and, significantly, is ~6,000-fold less toxic to normal bone marrow cells in vitro than native IFNα. Moreover, the attenuating mutation significantly decreases IFNα biomarker activity in cynomolgus macaques indicating that this approach may yield a better safety profile in humans than native IFNα or a non-attenuated IFNα immunocytokine. In human xenograft MM tumor models, anti-CD38-IFNα(attenuated) exerts potent anti-tumor activity in mice, inducing complete tumor regression in most cases. Furthermore, anti-CD38-IFNα(attenuated) is more efficacious than standard MM treatments (lenalidomide, bortezomib, dexamethasone) and exhibits strong synergy with lenalidomide and with bortezomib in xenograft models. Our findings suggest that tumor-targeted attenuated cytokines such as IFNα can promote robust tumor killing while minimizing systemic toxicity. PMID:27611189

  7. Activation of PI3K/Akt/mTOR signaling in the tumor stroma drives endocrine therapy-dependent breast tumor regression

    PubMed Central

    Polo, María Laura; Riggio, Marina; May, María; Rodríguez, María Jimena; Perrone, María Cecilia; Stallings-Mann, Melody; Kaen, Diego; Frost, Marlene; Goetz, Matthew; Boughey, Judy; Lanari, Claudia; Radisky, Derek; Novaro, Virginia

    2015-01-01

    Improved efficacy of neoadjuvant endocrine-targeting therapies in luminal breast carcinomas could be achieved with optimal use of pathway targeting agents. In a mouse model of ductal breast carcinoma we identify a tumor regressive stromal reaction that is induced by neoadjuvant endocrine therapy. This reparative reaction is characterized by tumor neovascularization accompanied by infiltration of immune cells and carcinoma-associated fibroblasts that stain for phosphorylated ribosomal protein S6 (pS6), downstream the PI3K/Akt/mTOR pathway. While tumor variants with higher PI3K/Akt/mTOR activity respond well to a combination of endocrine and PI3K/Akt/mTOR inhibitors, tumor variants with lower PI3K/Akt/mTOR activity respond more poorly to the combination therapy than to the endocrine therapy alone, associated with inhibition of stromal pS6 and the reparative reaction. In human breast cancer xenografts we confirm that such differential sensitivity to therapy is primarily determined by the level of PI3K/Akt/mTOR in tumor cells. We further show that the clinical response of breast cancer patients undergoing neoadjuvant endocrine therapy is associated with the reparative stromal reaction. We conclude that tumor level and localization of pS6 are associated with therapeutic response in breast cancer and represent biomarkers to distinguish which tumors will benefit from the incorporation of PI3K/Akt/mTOR inhibitors with neoadjuvant endocrine therapy. PMID:26098779

  8. Assessing risk factors for periodontitis using regression

    NASA Astrophysics Data System (ADS)

    Lobo Pereira, J. A.; Ferreira, Maria Cristina; Oliveira, Teresa

    2013-10-01

    Multivariate statistical analysis is indispensable to assess the associations and interactions between different factors and the risk of periodontitis. Among others, regression analysis is a statistical technique widely used in healthcare to investigate and model the relationship between variables. In our work we study the impact of socio-demographic, medical and behavioral factors on periodontal health. Using linear and logistic regression models, we can assess the relevance, as risk factors for periodontitis, of the following independent variables (IVs): Age, Gender, Diabetic Status, Education, Smoking status and Plaque Index. The multiple linear regression analysis model was built to evaluate the influence of IVs on mean Attachment Loss (AL). Thus, the regression coefficients are obtained along with the respective p-values from the significance tests. The classification of a case (individual) adopted in the logistic model was the extent of the destruction of periodontal tissues defined by an Attachment Loss greater than or equal to 4 mm in 25% (AL≥4mm/≥25%) of sites surveyed. The association measures include the Odds Ratios together with the corresponding 95% confidence intervals.
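
    As an illustration of the association measures described in this record, the following minimal Python sketch converts a logistic regression coefficient and its standard error into an odds ratio with a Wald-type 95% confidence interval. The function name `odds_ratio_ci` and the numeric inputs are hypothetical and are not taken from the study.

```python
import math


def odds_ratio_ci(beta, se, z=1.959964):
    """Return (odds ratio, lower bound, upper bound) for one coefficient.

    beta: estimated log-odds coefficient from a logistic model.
    se:   its standard error.
    z:    normal quantile; 1.959964 gives a two-sided 95% Wald interval.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical inputs for illustration only (not results from the abstract).
or_, lo, hi = odds_ratio_ci(beta=0.69, se=0.25)
print(f"OR = {or_:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

    An interval that excludes 1 corresponds to a coefficient that differs from zero at the 5% level under the Wald approximation.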

  9. Mini pressurized logistics module (MPLM)

    NASA Astrophysics Data System (ADS)

    Vallerani, E.; Brondolo, D.; Basile, L.

    1996-06-01

    The MPLM Program was initiated through a Memorandum of Understanding (MOU) between the United States' National Aeronautics and Space Administration (NASA) and Italy's ASI, the Italian Space Agency, that was signed on 6 December 1991. The MPLM is a pressurized logistics module that will be used to transport supplies and materials (up to 20,000 lb), including user experiments, between Earth and International Space Station Alpha (ISSA) using the Shuttle, to support active and passive storage, and to provide a habitable environment for two people when docked to the Station. The Italian Space Agency has selected Alenia Spazio to develop MPLM modules that have always been considered a key element for the new International Space Station taking benefit from its design flexibility and consequent possible cost saving based on the maximum utilization of the Shuttle launch capability for any mission. In the frame of the very recent agreement between the U.S. and Russia for cooperation in space, that foresees the utilization of MIR 1 hardware, the Italian MPLM will remain an important element of the logistics system, being the only pressurized module designed for re-entry. Within the new scenario of anticipated Shuttle flights to MIR 1 during Space Station phase 1, MPLM remains a candidate for one or more missions to provide MIR 1 resupply capabilities and advanced ISSA hardware/procedures verification. Based on the concept of Flexible Carriers, Alenia Spazio is providing NASA with three MPLM flight units that can be configured according to the requirements of the Human-Tended Capability (HTC) and Permanent Human Capability (PHC) of the Space Station. Configurability will allow transportation of passive cargo only, or a combination of passive and cold cargo accommodated in R/F racks. Having developed and qualified the baseline configuration with respect to the worst enveloping condition, each unit could be easily configured to the passive or active version depending upon the

  10. Classification and regression tree analysis of acute-on-chronic hepatitis B liver failure: Seeing the forest for the trees.

    PubMed

    Shi, K-Q; Zhou, Y-Y; Yan, H-D; Li, H; Wu, F-L; Xie, Y-Y; Braddock, M; Lin, X-Y; Zheng, M-H

    2017-02-01

    At present, there is no ideal model for predicting the short-term outcome of patients with acute-on-chronic hepatitis B liver failure (ACHBLF). This study aimed to establish and validate a prognostic model by using the classification and regression tree (CART) analysis. A total of 1047 patients from two separate medical centres with suspected ACHBLF were screened in the study, which were recognized as derivation cohort and validation cohort, respectively. CART analysis was applied to predict the 3-month mortality of patients with ACHBLF. The accuracy of the CART model was tested using the area under the receiver operating characteristic curve, which was compared with the model for end-stage liver disease (MELD) score and a new logistic regression model. CART analysis identified four variables as prognostic factors of ACHBLF: total bilirubin, age, serum sodium and INR, and three distinct risk groups: low risk (4.2%), intermediate risk (30.2%-53.2%) and high risk (81.4%-96.9%). The new logistic regression model was constructed with four independent factors, including age, total bilirubin, serum sodium and prothrombin activity by multivariate logistic regression analysis. The performances of the CART model (0.896), similar to the logistic regression model (0.914, P=.382), exceeded that of MELD score (0.667, P<.001). The results were confirmed in the validation cohort. We have developed and validated a novel CART model superior to MELD for predicting three-month mortality of patients with ACHBLF. Thus, the CART model could facilitate medical decision-making and provide clinicians with a validated practical bedside tool for ACHBLF risk stratification.
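
    The model comparison in this record rests on the area under the receiver operating characteristic curve. A minimal sketch of that quantity, computed as the fraction of (death, survivor) pairs in which the death received the higher predicted risk, is given below; the function name `auc` and the sample inputs are hypothetical, and ties are counted as one half by assumption.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted risks (higher means more likely to be a case).
    labels: 1 for cases (e.g. death within 3 months), 0 for non-cases.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # case ranked above non-case
            elif p == n:
                wins += 0.5      # tie counts as half (assumption)
    return wins / (len(pos) * len(neg))


# Hypothetical predicted risks and outcomes, for illustration only.
print(auc([0.9, 0.7, 0.4, 0.2], [1, 0, 1, 0]))
```

    The pairwise form is quadratic in the sample size, which is acceptable for a sketch; a rank-based computation would be preferable for large cohorts.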

  11. Morse–Smale Regression

    SciTech Connect

    Gerber, Samuel; Rubel, Oliver; Bremer, Peer -Timo; Pascucci, Valerio; Whitaker, Ross T.

    2012-01-19

    This paper introduces a novel partition-based regression approach that incorporates topological information. Partition-based regression typically introduces a quality-of-fit-driven decomposition of the domain. The emphasis in this work is on a topologically meaningful segmentation. Thus, the proposed regression approach is based on a segmentation induced by a discrete approximation of the Morse–Smale complex. This yields a segmentation with partitions corresponding to regions of the function with a single minimum and maximum that are often well approximated by a linear model. This approach yields regression models that are amenable to interpretation and have good predictive capacity. Typically, regression estimates are quantified by their geometrical accuracy. For the proposed regression, an important aspect is the quality of the segmentation itself. Thus, this article introduces a new criterion that measures the topological accuracy of the estimate. The topological accuracy provides a complementary measure to the classical geometrical error measures and is very sensitive to overfitting. The Morse–Smale regression is compared to state-of-the-art approaches in terms of geometry and topology and yields comparable or improved fits in many cases. Finally, a detailed study on climate-simulation data demonstrates the application of the Morse–Smale regression. Supplementary Materials are available online and contain an implementation of the proposed approach in the R package msr, an analysis and simulations on the stability of the Morse–Smale complex approximation, and additional tables for the climate-simulation study.

  12. Improved Regression Calibration

    ERIC Educational Resources Information Center

    Skrondal, Anders; Kuha, Jouni

    2012-01-01

    The likelihood for generalized linear models with covariate measurement error cannot in general be expressed in closed form, which makes maximum likelihood estimation taxing. A popular alternative is regression calibration which is computationally efficient at the cost of inconsistent estimation. We propose an improved regression calibration…

  13. Morse-Smale Regression

    PubMed Central

    Gerber, Samuel; Rübel, Oliver; Bremer, Peer-Timo; Pascucci, Valerio; Whitaker, Ross T.

    2012-01-01

    This paper introduces a novel partition-based regression approach that incorporates topological information. Partition-based regression typically introduces a quality-of-fit-driven decomposition of the domain. The emphasis in this work is on a topologically meaningful segmentation. Thus, the proposed regression approach is based on a segmentation induced by a discrete approximation of the Morse-Smale complex. This yields a segmentation with partitions corresponding to regions of the function with a single minimum and maximum that are often well approximated by a linear model. This approach yields regression models that are amenable to interpretation and have good predictive capacity. Typically, regression estimates are quantified by their geometrical accuracy. For the proposed regression, an important aspect is the quality of the segmentation itself. Thus, this paper introduces a new criterion that measures the topological accuracy of the estimate. The topological accuracy provides a complementary measure to the classical geometrical error measures and is very sensitive to over-fitting. The Morse-Smale regression is compared to state-of-the-art approaches in terms of geometry and topology and yields comparable or improved fits in many cases. Finally, a detailed study on climate-simulation data demonstrates the application of the Morse-Smale regression. Supplementary materials are available online and contain an implementation of the proposed approach in the R package msr, an analysis and simulations on the stability of the Morse-Smale complex approximation and additional tables for the climate-simulation study. PMID:23687424

  14. Logistics Assessment Guidebook

    DTIC Science & Technology

    2011-07-01

    significant. Another case involved the integration of an aircraft and ground vehicle system. Integration issues were identified early in the design...for height and width; insufficient power requirements to support maintenance actions; and insufficient design of the autonomic logistics system...evaluation at IOT&E and thus on track for Initial Operational Capability (IOC). If not, a plan is in place to ensure they are met (ref DoDI 5000; CJCSM

  15. Logistics Software Implementation.

    DTIC Science & Technology

    1986-02-28

    to obtain benchmark performance results. The list presented in Appendix A is not final and should be considered only as the initial worksheet. d...has been used in more applications and its implementation is better understood. All of the preliminary tests have indicated excellent ...environment. The fifth-generation languages, such as PROLOG, also provide excellent dynamic data base support. The proposed approach for the Logistics

  16. Logistics and Strategy

    DTIC Science & Technology

    2014-12-04

    2014 by: , Director, Graduate Degree Programs Robert F. Baumann, PhD The opinions and conclusions expressed herein are those of the...Washington, DC: Center for Military History, 1993), 244. 24 Spencer C. Tucker and Priscilla Mary Roberts, Encyclopedia of World War I (Santa Barbara...Theater of Operations. See Richard M. Leighton and Robert W. Coakley, Global Logistics and Strategy 1940-1943 (Washington, DC: Center for Military

  17. Logistics of Guinea worm disease eradication in South Sudan.

    PubMed

    Jones, Alexander H; Becknell, Steven; Withers, P Craig; Ruiz-Tiben, Ernesto; Hopkins, Donald R; Stobbelaar, David; Makoy, Samuel Yibi

    2014-03-01

    From 2006 to 2012, the South Sudan Guinea Worm Eradication Program reduced new Guinea worm disease (dracunculiasis) cases by over 90%, despite substantial programmatic challenges. Program logistics have played a key role in program achievements to date. The program uses disease surveillance and program performance data and integrated technical-logistical staffing to maintain flexible and effective logistical support for active community-based surveillance and intervention delivery in thousands of remote communities. Lessons learned from logistical design and management can resonate across similar complex surveillance and public health intervention delivery programs, such as mass drug administration for the control of neglected tropical diseases and other disease eradication programs. Logistical challenges in various public health scenarios and the pivotal contribution of logistics to Guinea worm case reductions in South Sudan underscore the need for additional inquiry into the role of logistics in public health programming in low-income countries.

  18. Logistics of Guinea Worm Disease Eradication in South Sudan

    PubMed Central

    Jones, Alexander H.; Becknell, Steven; Withers, P. Craig; Ruiz-Tiben, Ernesto; Hopkins, Donald R.; Stobbelaar, David; Makoy, Samuel Yibi

    2014-01-01

    From 2006 to 2012, the South Sudan Guinea Worm Eradication Program reduced new Guinea worm disease (dracunculiasis) cases by over 90%, despite substantial programmatic challenges. Program logistics have played a key role in program achievements to date. The program uses disease surveillance and program performance data and integrated technical–logistical staffing to maintain flexible and effective logistical support for active community-based surveillance and intervention delivery in thousands of remote communities. Lessons learned from logistical design and management can resonate across similar complex surveillance and public health intervention delivery programs, such as mass drug administration for the control of neglected tropical diseases and other disease eradication programs. Logistical challenges in various public health scenarios and the pivotal contribution of logistics to Guinea worm case reductions in South Sudan underscore the need for additional inquiry into the role of logistics in public health programming in low-income countries. PMID:24445199

  19. Relative risk regression analysis of epidemiologic data.

    PubMed

    Prentice, R L

    1985-11-01

    Relative risk regression methods are described. These methods provide a unified approach to a range of data analysis problems in environmental risk assessment and in the study of disease risk factors more generally. Relative risk regression methods are most readily viewed as an outgrowth of Cox's regression and life model. They can also be viewed as a regression generalization of more classical epidemiologic procedures, such as that due to Mantel and Haenszel. In the context of an epidemiologic cohort study, relative risk regression methods extend conventional survival data methods and binary response (e.g., logistic) regression models by taking explicit account of the time to disease occurrence while allowing arbitrary baseline disease rates, general censorship, and time-varying risk factors. This latter feature is particularly relevant to many environmental risk assessment problems wherein one wishes to relate disease rates at a particular point in time to aspects of a preceding risk factor history. Relative risk regression methods also adapt readily to time-matched case-control studies and to certain less standard designs. The uses of relative risk regression methods are illustrated and the state of development of these procedures is discussed. It is argued that asymptotic partial likelihood estimation techniques are now well developed in the important special case in which the disease rates of interest have interpretations as counting process intensity functions. Estimation of relative risks processes corresponding to disease rates falling outside this class has, however, received limited attention. The general area of relative risk regression model criticism has, as yet, not been thoroughly studied, though a number of statistical groups are studying such features as tests of fit, residuals, diagnostics and graphical procedures. Most such studies have been restricted to exponential form relative risks as have simulation studies of relative risk estimation

  20. Boosted Beta Regression

    PubMed Central

    Schmid, Matthias; Wickler, Florian; Maloney, Kelly O.; Mitchell, Richard; Fenske, Nora; Mayr, Andreas

    2013-01-01

    Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1). Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fit a beta regression model is to use maximum likelihood estimation with subsequent AIC-based variable selection. As an alternative to this established, yet unstable, approach, we propose a new estimation technique called boosted beta regression. With boosted beta regression, estimation and variable selection can be carried out simultaneously in a highly efficient way. Additionally, both the mean and the variance of a percentage response can be modeled using flexible nonlinear covariate effects. As a consequence, the new method accounts for common problems such as overdispersion and non-binomial variance structures. PMID:23626706
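
    To make the link between logit models and beta regression concrete, the sketch below maps a linear predictor to the two shape parameters of a beta distribution. It assumes the commonly used mean-precision parameterization (mean mu from a logit link, precision phi); the abstract does not state which parameterization the authors use, and the function names are hypothetical.

```python
import math


def beta_shape_params(eta, phi):
    """Map a linear predictor to beta shape parameters (a, b).

    eta: linear predictor on the logit scale.
    phi: precision parameter (> 0); larger phi means smaller variance.
    Assumes the mean-precision form a = mu * phi, b = (1 - mu) * phi.
    """
    if phi <= 0:
        raise ValueError("phi must be positive")
    mu = 1.0 / (1.0 + math.exp(-eta))   # inverse logit keeps mu in (0, 1)
    return mu * phi, (1.0 - mu) * phi


def beta_mean_var(a, b):
    """Mean and variance of a Beta(a, b) distribution."""
    total = a + b
    return a / total, a * b / (total ** 2 * (total + 1.0))


# Hypothetical values for illustration only.
a, b = beta_shape_params(eta=0.0, phi=10.0)
print(a, b, beta_mean_var(a, b))
```

    Under this parameterization the variance equals mu(1 - mu)/(1 + phi), so letting phi depend on covariates is one way a model can capture variance structures that a binomial model cannot.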

  1. NASA Space Exploration Logistics Workshop Proceedings

    NASA Technical Reports Server (NTRS)

    deWeek, Oliver; Evans, William A.; Parrish, Joe; James, Sarah

    2006-01-01

    As NASA has embarked on a new Vision for Space Exploration, there is new energy and focus around the area of manned space exploration. These activities encompass the design of new vehicles such as the Crew Exploration Vehicle (CEV) and Crew Launch Vehicle (CLV) and the identification of commercial opportunities for space transportation services, as well as continued operations of the Space Shuttle and the International Space Station. Reaching the Moon and eventually Mars with a mix of both robotic and human explorers for short term missions is a formidable challenge in itself. How to achieve this in a safe, efficient and long-term sustainable way is yet another question. The challenge is not only one of vehicle design, launch, and operations but also one of space logistics. Oftentimes, logistical issues are not given enough consideration upfront, in relation to the large share of operating budgets they consume. In this context, a group of 54 experts in space logistics met for a two-day workshop to discuss the following key questions: 1. What is the current state of the art in space logistics, in terms of architectures, concepts, technologies as well as enabling processes? 2. What are the main challenges for space logistics for future human exploration of the Moon and Mars, at the intersection of engineering and space operations? 3. What lessons can be drawn from past successes and failures in human space flight logistics? 4. What lessons and connections do we see from terrestrial analogies as well as activities in other areas, such as U.S. military logistics? 5. What key advances are required to enable long-term success in the context of a future interplanetary supply chain? These proceedings summarize the outcomes of the workshop, reference particular presentations, panels and breakout sessions, and record specific observations that should help guide future efforts.

  2. George: Gaussian Process regression

    NASA Astrophysics Data System (ADS)

    Foreman-Mackey, Daniel

    2015-11-01

    George is a fast and flexible library, implemented in C++ with Python bindings, for Gaussian Process regression useful for accounting for correlated noise in astronomical datasets, including those for transiting exoplanet discovery and characterization and stellar population modeling.

  3. Understanding Poisson regression.

    PubMed

    Hayat, Matthew J; Higgins, Melinda

    2014-04-01

    Nurse investigators often collect study data in the form of counts. Traditional methods of data analysis have historically approached analysis of count data either as if the count data were continuous and normally distributed or with dichotomization of the counts into the categories of occurred or did not occur. These outdated methods for analyzing count data have been replaced with more appropriate statistical methods that make use of the Poisson probability distribution, which is useful for analyzing count data. The purpose of this article is to provide an overview of the Poisson distribution and its use in Poisson regression. Assumption violations for the standard Poisson regression model are addressed with alternative approaches, including addition of an overdispersion parameter or negative binomial regression. An illustrative example is presented with an application from the ENSPIRE study, and regression modeling of comorbidity data is included for illustrative purposes.
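
    As a minimal sketch of the modeling idea summarized in the abstract above (not the authors' analysis), the Python code below fits a log-link Poisson regression by Newton-Raphson and computes a Pearson dispersion statistic, one common way to check for overdispersion. The function names `fit_poisson` and `pearson_dispersion` and the toy counts are assumptions made for illustration only.

```python
import numpy as np

def fit_poisson(X, y, n_iter=50, tol=1e-10):
    """Fit a log-link Poisson regression by Newton-Raphson.

    X is an (n, p) design matrix whose first column is assumed to be
    the intercept (all ones); y holds the non-negative responses.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())           # start at the intercept-only fit
    for _ in range(n_iter):
        mu = np.exp(X @ beta)            # fitted means under the log link
        grad = X.T @ (y - mu)            # score vector
        hess = X.T @ (X * mu[:, None])   # Fisher information
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def pearson_dispersion(X, y, beta):
    """Pearson chi-square divided by residual degrees of freedom."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    mu = np.exp(X @ beta)
    return np.sum((y - mu) ** 2 / mu) / (len(y) - X.shape[1])

# Hypothetical toy data: counts observed at eight predictor values.
x = np.arange(8.0)
counts = np.array([1, 2, 2, 4, 3, 6, 8, 11], dtype=float)
design = np.column_stack([np.ones_like(x), x])
beta_hat = fit_poisson(design, counts)
dispersion = pearson_dispersion(design, counts, beta_hat)
```

    A dispersion value well above 1 would suggest overdispersion, the situation in which the abstract points to an overdispersion parameter or negative binomial regression as alternatives.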

  4. Squared sine logistic map

    NASA Astrophysics Data System (ADS)

    de Carvalho, R. Egydio; Leonel, Edson D.

    2016-12-01

    A periodic time perturbation is introduced in the logistic map as an attempt to investigate new scenarios of bifurcations and new mechanisms toward chaos. With a squared sine perturbation we observe that a point attractor reaches the chaotic attractor without following a cascade of bifurcations. One fixed point of the system presents a new scenario of bifurcations through an infinite sequence of alternating changes of stability. At the bifurcations, the perturbation does not modify the scaling features observed in the convergence toward the stationary state.
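
    The abstract does not state the perturbed map explicitly, so the sketch below is only an illustration under a stated assumption: the control parameter of the standard logistic map x_{n+1} = r x_n (1 - x_n) is modulated as r_n = r (1 + eps sin^2(n)). This particular form, the function name `perturbed_logistic_orbit`, and the parameter `eps` are hypothetical and may differ from the authors' definition.

```python
import math

def perturbed_logistic_orbit(r, eps, x0=0.2, n_steps=200):
    """Iterate x_{n+1} = r_n * x_n * (1 - x_n).

    r_n = r * (1 + eps * sin(n)**2) is an ASSUMED squared-sine
    modulation chosen for illustration; eps = 0 recovers the
    ordinary logistic map.
    """
    # Keep every r_n inside [0, 4] so the orbit stays in [0, 1].
    if not (0.0 <= r <= 4.0 and 0.0 <= r * (1.0 + eps) <= 4.0):
        raise ValueError("r and r * (1 + eps) must both lie in [0, 4]")
    x = x0
    orbit = [x]
    for n in range(n_steps):
        r_n = r * (1.0 + eps * math.sin(n) ** 2)
        x = r_n * x * (1.0 - x)
        orbit.append(x)
    return orbit

# Unperturbed reference case: for r = 2.5 the orbit settles on the
# fixed point 1 - 1/r = 0.6.
reference = perturbed_logistic_orbit(2.5, 0.0)
# Perturbed case with a hypothetical amplitude eps = 0.3.
perturbed = perturbed_logistic_orbit(3.0, 0.3)
```

    Comparing the late-time values of `reference` and `perturbed` for a range of `r` is one simple way to start exploring how a periodic modulation changes the attractor.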

  5. Targeting of the Plzf Gene in the Rat by Transcription Activator-Like Effector Nuclease Results in Caudal Regression Syndrome in Spontaneously Hypertensive Rats.

    PubMed

    Liška, František; Peterková, Renata; Peterka, Miroslav; Landa, Vladimír; Zídek, Václav; Mlejnek, Petr; Šilhavý, Jan; Šimáková, Miroslava; Křen, Vladimír; Starker, Colby G; Voytas, Daniel F; Izsvák, Zsuzsanna; Pravenec, Michal

    2016-01-01

    Recently, it has been found that spontaneous mutation Lx (polydactyly-luxate syndrome) in the rat is determined by deletion of a conserved intronic sequence of the Plzf (Promyelocytic leukemia zinc finger protein) gene. In addition, Plzf is a prominent candidate gene for quantitative trait loci (QTLs) associated with cardiac hypertrophy and fibrosis in the spontaneously hypertensive rat (SHR). In the current study, we tested the effects of Plzf gene targeting in the SHR using TALENs (transcription activator-like effector nucleases). SHR ova were microinjected with constructs pTAL438/439 coding for a sequence-specific endonuclease that binds to target sequence in the first coding exon of the Plzf gene. Out of 43 animals born after microinjection, we detected a single male founder. Sequence analysis revealed a deletion of G that resulted in frame shift mutation starting in codon 31 and causing a premature stop codon at position of amino acid 58. The Plzftm1Ipcv allele is semi-lethal since approximately 95% of newborn homozygous animals died perinatally. All homozygous animals exhibited manifestations of a caudal regression syndrome including tail anomalies and serious size reduction and deformities of long bones, and oligo- or polydactyly on the hindlimbs. The heterozygous animals only exhibited the tail anomalies. Impaired development of the urinary tract was also revealed: one homozygous and one heterozygous rat exhibited a vesico-ureteric reflux with enormous dilatation of ureters and renal pelvis. In the homozygote, this was combined with a hypoplastic kidney. These results provide evidence for the important role of Plzf gene during development of the caudal part of a body-column vertebrae, hindlimbs and urinary system in the rat.

  6. Targeting of the Plzf Gene in the Rat by Transcription Activator-Like Effector Nuclease Results in Caudal Regression Syndrome in Spontaneously Hypertensive Rats

    PubMed Central

    Liška, František; Peterková, Renata; Peterka, Miroslav; Landa, Vladimír; Zídek, Václav; Mlejnek, Petr; Šilhavý, Jan; Šimáková, Miroslava; Křen, Vladimír; Starker, Colby G.; Voytas, Daniel F.; Izsvák, Zsuzsanna; Pravenec, Michal

    2016-01-01

    Recently, it has been found that spontaneous mutation Lx (polydactyly-luxate syndrome) in the rat is determined by deletion of a conserved intronic sequence of the Plzf (Promyelocytic leukemia zinc finger protein) gene. In addition, Plzf is a prominent candidate gene for quantitative trait loci (QTLs) associated with cardiac hypertrophy and fibrosis in the spontaneously hypertensive rat (SHR). In the current study, we tested the effects of Plzf gene targeting in the SHR using TALENs (transcription activator-like effector nucleases). SHR ova were microinjected with constructs pTAL438/439 coding for a sequence-specific endonuclease that binds to target sequence in the first coding exon of the Plzf gene. Out of 43 animals born after microinjection, we detected a single male founder. Sequence analysis revealed a deletion of G that resulted in frame shift mutation starting in codon 31 and causing a premature stop codon at position of amino acid 58. The Plzftm1Ipcv allele is semi-lethal since approximately 95% of newborn homozygous animals died perinatally. All homozygous animals exhibited manifestations of a caudal regression syndrome including tail anomalies and serious size reduction and deformities of long bones, and oligo- or polydactyly on the hindlimbs. The heterozygous animals only exhibited the tail anomalies. Impaired development of the urinary tract was also revealed: one homozygous and one heterozygous rat exhibited a vesico-ureteric reflux with enormous dilatation of ureters and renal pelvis. In the homozygote, this was combined with a hypoplastic kidney. These results provide evidence for the important role of Plzf gene during development of the caudal part of a body—column vertebrae, hindlimbs and urinary system in the rat. PMID:27727328

  7. Reverse logistics in the Brazilian construction industry.

    PubMed

    Nunes, K R A; Mahler, C F; Valle, R A

    2009-09-01

    In Brazil most Construction and Demolition Waste (C&D waste) is not recycled. This situation is expected to change significantly, since new federal regulations oblige municipalities to create and implement sustainable C&D waste management plans which assign an important role to recycling activities. The recycling organizational network and its flows and components are fundamental to C&D waste recycling feasibility. Organizational networks, flows and components involve reverse logistics. The aim of this work is to introduce the concepts of reverse logistics and reverse distribution channel networks and to study the Brazilian C&D waste case.

  8. Estimation of the Regression Effect Using a Latent Trait Model.

    ERIC Educational Resources Information Center

    Quinn, Jimmy L.

    A logistic model was used to generate data to serve as a proxy for an immediate retest from item responses to a fourth grade standardized reading comprehension test of 45 items. Assuming that the actual test may be considered a pretest and the proxy data may be considered a retest, the effect of regression was investigated using a percentage of…

  9. Ridge Regression: A Regression Procedure for Analyzing Correlated Independent Variables

    ERIC Educational Resources Information Center

    Rakow, Ernest A.

    1978-01-01

    Ridge regression is a technique used to ameliorate the problem of highly correlated independent variables in multiple regression analysis. This paper explains the fundamentals of ridge regression and illustrates its use. (JKS)
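
    As a brief illustration of the technique named in the abstract above (a sketch, not the paper's own procedure), the ridge estimator can be written in closed form as beta = (X'X + lam I)^(-1) X'y. The function name `ridge_coefficients`, the penalty symbol `lam`, and the toy data are assumptions; the predictors are assumed to be already centered or scaled, and no intercept is fitted.

```python
import numpy as np

def ridge_coefficients(X, y, lam):
    """Closed-form ridge estimate (X'X + lam * I)^(-1) X'y.

    lam = 0 gives ordinary least squares when X has full column rank;
    larger lam shrinks the coefficient vector toward zero.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Hypothetical data with two highly correlated predictors.
X_demo = np.array([[1.0, 1.0],
                   [2.0, 2.1],
                   [3.0, 2.9],
                   [4.0, 4.2],
                   [5.0, 4.8]])
y_demo = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

beta_ols = ridge_coefficients(X_demo, y_demo, 0.0)
beta_ridge = ridge_coefficients(X_demo, y_demo, 1.0)
```

    With correlated columns such as these, the ridge coefficients have a smaller norm than the least-squares ones, which is the stabilizing effect the abstract refers to.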

  10. Logistics Lessons Learned in NASA Space Flight

    NASA Technical Reports Server (NTRS)

    Evans, William A.; DeWeck, Olivier; Laufer, Deanna; Shull, Sarah

    2006-01-01

    Information System (LLIS) and verified that we received the same result using the internal version of LLIS for our logistics lesson searches. In conducting the research, information from multiple databases was consolidated into a single spreadsheet of 300 lessons learned. Keywords were applied for the purpose of sorting and evaluation. Once the lessons had been compiled, an analysis of the resulting data was performed, first sorting it by keyword, then finding duplication and root cause, and finally sorting by root cause. The data was then distilled into the top 7 lessons learned across programs, centers, and activities.

  11. Developing Future Strategic Logistics Leaders

    DTIC Science & Technology

    2013-03-01

    and our focus on tactical level lessons learned has skewed our emphasis in PME to a very narrow band of excellence. As such our current...A partnership between senior logistics leaders, PME developers, and personnel managers is essential to constructing and maintaining strategic...short term and inconsistent PME system that does not produce strategic logistic leaders. The Logistics Corps, sustainment branches, LOO lead for leader

  12. Space Shuttle operational logistics plan

    NASA Technical Reports Server (NTRS)

    Botts, J. W.

    1983-01-01

    The Kennedy Space Center plan for logistics to support Space Shuttle operations and to establish the related policies, requirements, and responsibilities is described. The Directorate of Shuttle Management and Operations logistics responsibilities required by the Kennedy Organizational Manual, and the self-sufficiency contracting concept, are implemented. The Space Shuttle Program Level 1 and Level 2 logistics policies and requirements applicable to KSC that are presented in HQ NASA and Johnson Space Center directives are also implemented.

  13. Modern Regression Discontinuity Analysis

    ERIC Educational Resources Information Center

    Bloom, Howard S.

    2012-01-01

    This article provides a detailed discussion of the theory and practice of modern regression discontinuity (RD) analysis for estimating the effects of interventions or treatments. Part 1 briefly chronicles the history of RD analysis and summarizes its past applications. Part 2 explains how in theory an RD analysis can identify an average effect of…

  14. Multiple linear regression analysis

    NASA Technical Reports Server (NTRS)

    Edwards, T. R.

    1980-01-01

    Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.

  15. Investigating the quantitative structure-activity relationships for antibody recognition of two immunoassays for polycyclic aromatic hydrocarbons by multiple regression methods.

    PubMed

    Zhang, Yan-Feng; Zhang, Li; Gao, Zhi-Xian; Dai, Shu-Gui

    2012-01-01

    Polycyclic aromatic hydrocarbons (PAHs) are ubiquitous contaminants found in the environment. Immunoassays represent useful analytical methods to complement traditional analytical procedures for PAHs. Cross-reactivity (CR) is a very useful characteristic for evaluating the extent of cross-reaction of a cross-reactant in immunoreactions and immunoassays. The quantitative relationships between the molecular properties and the CR of PAHs were established by stepwise multiple linear regression, principal component regression and partial least square regression, using the data of two commercial enzyme-linked immunosorbent assay (ELISA) kits. The objective is to find the most important molecular properties that affect the CR, and to predict the CR by multiple regression methods. The results show that the physicochemical, electronic and topological properties of the PAH molecules have an integrated effect on the CR properties for the two ELISAs, among which molar solubility (S(m)) and valence molecular connectivity index ((3)χ(v)) are the most important factors. The obtained regression equations for the Ris(C) kit are all statistically significant (p < 0.005) and show satisfactory ability for predicting CR values, while none of the equations for the RaPID kit is significant (p > 0.05) or suitable for prediction. This is probably because the Ris(C) immunoassay employs a monoclonal antibody, while the RaPID kit is based on a polyclonal antibody. Considering the important effect of solubility on the CR values, cross-reaction potential (CRP) is calculated and used as a complement of CR for evaluation of cross-reactions in immunoassays. Only the compounds with both high CR and high CRP can cause intense cross-reactions in immunoassays.

  16. Active Travel to School: Findings from the Survey of US Health Behavior in School-Aged Children, 2009-2010

    ERIC Educational Resources Information Center

    Yang, Yong; Ivey, Stephanie S.; Levy, Marian C.; Royne, Marla B.; Klesges, Lisa M.

    2016-01-01

    Background: Whereas children's active travel to school (ATS) has confirmed benefits, only a few large national surveys of ATS exist. Methods: Using data from the Health Behavior in School-aged Children (HBSC) 2009-2010 US survey, we conducted a logistic regression model to estimate the odds ratios of ATS and a linear regression model to estimate…

  17. Modelling long-term fire occurrence factors in Spain by accounting for local variations with geographically weighted regression

    NASA Astrophysics Data System (ADS)

    Martínez-Fernández, J.; Chuvieco, E.; Koutsias, N.

    2013-02-01

    Humans are responsible for most forest fires in Europe, but anthropogenic factors behind these events are still poorly understood. We tried to identify the driving factors of human-caused fire occurrence in Spain by applying two different statistical approaches. Firstly, assuming stationary processes for the whole country, we created models based on multiple linear regression and binary logistic regression to find factors associated with fire density and fire presence, respectively. Secondly, we used geographically weighted regression (GWR) to better understand and explore the local and regional variations of those factors behind human-caused fire occurrence. The number of human-caused fires occurring within a 25-yr period (1983-2007) was computed for each of the 7638 Spanish mainland municipalities, creating a binary variable (fire/no fire) to develop logistic models, and a continuous variable (fire density) to build standard linear regression models. A total of 383 657 fires were registered in the study dataset. The binary logistic model, which estimates the probability of having/not having a fire, successfully classified 76.4% of the total observations, while the ordinary least squares (OLS) regression model explained 53% of the variation of the fire density patterns (adjusted R² = 0.53). Both approaches confirmed, in addition to forest and climatic variables, the importance of variables related to agrarian activities, land abandonment, rural population exodus and developmental processes as underlying factors of fire occurrence. For the GWR approach, the explanatory power of the GW linear model for fire density using an adaptive bandwidth increased from 53% to 67%, while for the GW logistic model the correctly classified observations improved only slightly, from 76.4% to 78.4%, but significantly according to the corrected Akaike Information Criterion (AICc), from 3451.19 to 3321.19. The results from GWR indicated a significant spatial variation in the local

  18. Nuclear Lunar Logistics Study

    NASA Technical Reports Server (NTRS)

    1963-01-01

    This document has been prepared to incorporate all presentation aid material, together with some explanatory text, used during an oral briefing on the Nuclear Lunar Logistics System given at the George C. Marshall Space Flight Center, National Aeronautics and Space Administration, on 18 July 1963. The briefing and this document are intended to present the general status of the NERVA (Nuclear Engine for Rocket Vehicle Application) nuclear rocket development, the characteristics of certain operational NERVA-class engines, and appropriate technical and schedule information. Some of the information presented herein is preliminary in nature and will be subject to further verification, checking and analysis during the remainder of the study program. In addition, more detailed information will be prepared in many areas for inclusion in a final summary report. This work has been performed by REON, a division of Aerojet-General Corporation under Subcontract 74-10039 from the Lockheed Missiles and Space Company. The presentation and this document have been prepared in partial fulfillment of the provisions of the subcontract. From the inception of the NERVA program in July 1961, the stated emphasis has centered around the demonstration of the ability of a nuclear rocket to perform safely and reliably in the space environment, with the understanding that the assignment of a mission (or missions) would place undue emphasis on performance and operational flexibility. However, all were aware that the ultimate justification for the development program must lie in the application of the nuclear propulsion system to the national space objectives.

  19. Selected Logistics Models and Techniques.

    DTIC Science & Technology

    1984-09-01

    Programmable Calculator LCC...Program 27 TI-59 Programmable Calculator LCC Model 30 Unmanned Spacecraft Cost Model 31 iv I: TABLE OF CONTENTS (CONT’D) (Subject Index) LOGISTICS...LOGISTICS ANALYSIS MODEL/TECHNIQUE DATA MODEL/TECHNIQUE NAME: TI-59 Programmable Calculator LCC Model TYPE MODEL: Cost Estimating DEVELOPED BY:

  20. Associations between School Recreational Environments and Physical Activity

    ERIC Educational Resources Information Center

    Nichol, Marianne E.; Pickett, William; Janssen, Ian

    2009-01-01

    Background: School environments may promote or hinder physical activity in young people. The purpose of this research was to examine relationships between school recreational environments and adolescent physical activity. Methods: Using multilevel logistic regression, data from 7638 grade 6 to 10 students from 154 schools who participated in the…

  1. Calculating a Stepwise Ridge Regression.

    ERIC Educational Resources Information Center

    Morris, John D.

    1986-01-01

    Although methods for using ordinary least squares regression computer programs to calculate a ridge regression are available, the calculation of a stepwise ridge regression requires a special purpose algorithm and computer program. The correct stepwise ridge regression procedure is given, and a parallel FORTRAN computer program is described.…

  2. Orthogonal Regression: A Teaching Perspective

    ERIC Educational Resources Information Center

    Carr, James R.

    2012-01-01

    A well-known approach to linear least squares regression is that which involves minimizing the sum of squared orthogonal projections of data points onto the best fit line. This form of regression is known as orthogonal regression, and the linear model that it yields is known as the major axis. A similar method, reduced major axis regression, is…
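
    To make the idea in the abstract above concrete, here is a small sketch (an illustration under stated assumptions, not the article's own code) that computes the major axis line as the first principal direction of the centered data, which is the direction minimizing the sum of squared orthogonal distances. The function name `major_axis` and the sample points are hypothetical; the sketch assumes the fitted line is not vertical.

```python
import numpy as np

def major_axis(x, y):
    """Return (slope, intercept) of the orthogonal-regression line.

    The line passes through the centroid and follows the leading right
    singular vector of the centered data matrix. A vertical line
    (zero x-component) is not handled in this sketch.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    centered = np.column_stack([x - x_mean, y - y_mean])
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    dx, dy = vt[0]                 # direction of largest variance
    slope = dy / dx                # sign ambiguity of vt[0] cancels here
    intercept = y_mean - slope * x_mean
    return slope, intercept

# Hypothetical noisy points scattered around y = 2x + 1.
x_pts = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_pts = np.array([1.1, 2.9, 5.2, 6.8, 9.1])
slope_ma, intercept_ma = major_axis(x_pts, y_pts)
```

    Unlike ordinary least squares, this fit treats x and y symmetrically, so swapping the two variables yields the same geometric line.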

  3. Liver Fibrosis Regression Measured by Transient Elastography in Human Immunodeficiency Virus (HIV)-Hepatitis B Virus (HBV)-Coinfected Individuals on Long-Term HBV-Active Combination Antiretroviral Therapy

    PubMed Central

    Audsley, Jennifer; Robson, Christopher; Aitchison, Stacey; Matthews, Gail V.; Iser, David; Sasadeusz, Joe; Lewin, Sharon R.

    2016-01-01

    Background. Advanced fibrosis occurs more commonly in human immunodeficiency virus (HIV)-hepatitis B virus (HBV) coinfected individuals; therefore, fibrosis monitoring is important in this population. However, transient elastography (TE) data in HIV-HBV coinfection are lacking. We aimed to assess liver fibrosis using TE in a cross-sectional study of HIV-HBV coinfected individuals receiving combination HBV-active (lamivudine and/or tenofovir/tenofovir-emtricitabine) antiretroviral therapy, identify factors associated with advanced fibrosis, and examine change in fibrosis in those with >1 TE assessment. Methods. We assessed liver fibrosis in 70 HIV-HBV coinfected individuals on HBV-active combination antiretroviral therapy (cART). Change in fibrosis over time was examined in a subset with more than 1 TE result (n = 49). Clinical and laboratory variables at the time of the first TE were collected, and associations with advanced fibrosis (≥F3, Metavir scoring system) and fibrosis regression (of at least 1 stage) were examined. Results. The majority of the cohort (64%) had mild to moderate fibrosis at the time of the first TE, and we identified alanine transaminase, platelets, and detectable HIV ribonucleic acid as associated with advanced liver fibrosis. Alanine transaminase and platelets remained independently associated with advanced fibrosis in multivariate modeling. More than 28% of those with >1 TE subsequently showed liver fibrosis regression, and higher baseline HBV deoxyribonucleic acid was associated with regression. Prevalence of advanced fibrosis (≥F3) decreased 12.3% (32.7%–20.4%) over a median of 31 months. Conclusions. The observed fibrosis regression in this group supports the beneficial effects of cART on liver stiffness. It would be important to study a larger group of individuals with more advanced fibrosis to more definitively assess factors associated with liver fibrosis regression. PMID:27006960

  4. Liver Fibrosis Regression Measured by Transient Elastography in Human Immunodeficiency Virus (HIV)-Hepatitis B Virus (HBV)-Coinfected Individuals on Long-Term HBV-Active Combination Antiretroviral Therapy.

    PubMed

    Audsley, Jennifer; Robson, Christopher; Aitchison, Stacey; Matthews, Gail V; Iser, David; Sasadeusz, Joe; Lewin, Sharon R

    2016-01-01

    Background. Advanced fibrosis occurs more commonly in human immunodeficiency virus (HIV)-hepatitis B virus (HBV) coinfected individuals; therefore, fibrosis monitoring is important in this population. However, transient elastography (TE) data in HIV-HBV coinfection are lacking. We aimed to assess liver fibrosis using TE in a cross-sectional study of HIV-HBV coinfected individuals receiving combination HBV-active (lamivudine and/or tenofovir/tenofovir-emtricitabine) antiretroviral therapy, identify factors associated with advanced fibrosis, and examine change in fibrosis in those with >1 TE assessment. Methods. We assessed liver fibrosis in 70 HIV-HBV coinfected individuals on HBV-active combination antiretroviral therapy (cART). Change in fibrosis over time was examined in a subset with more than 1 TE result (n = 49). Clinical and laboratory variables at the time of the first TE were collected, and associations with advanced fibrosis (≥F3, Metavir scoring system) and fibrosis regression (of at least 1 stage) were examined. Results. The majority of the cohort (64%) had mild to moderate fibrosis at the time of the first TE, and we identified alanine transaminase, platelets, and detectable HIV ribonucleic acid as associated with advanced liver fibrosis. Alanine transaminase and platelets remained independently associated with advanced fibrosis in multivariate modeling. More than 28% of those with >1 TE subsequently showed liver fibrosis regression, and higher baseline HBV deoxyribonucleic acid was associated with regression. Prevalence of advanced fibrosis (≥F3) decreased 12.3% (32.7%-20.4%) over a median of 31 months. Conclusions. The observed fibrosis regression in this group supports the beneficial effects of cART on liver stiffness. It would be important to study a larger group of individuals with more advanced fibrosis to more definitively assess factors associated with liver fibrosis regression.

  5. Structural regression trees

    SciTech Connect

    Kramer, S.

    1996-12-31

    In many real-world domains the task of machine learning algorithms is to learn a theory for predicting numerical values. In particular several standard test domains used in Inductive Logic Programming (ILP) are concerned with predicting numerical values from examples and relational and mostly non-determinate background knowledge. However, so far no ILP algorithm except one can predict numbers and cope with nondeterminate background knowledge. (The only exception is a covering algorithm called FORS.) In this paper we present Structural Regression Trees (SRT), a new algorithm which can be applied to the above class of problems. SRT integrates the statistical method of regression trees into ILP. It constructs a tree containing a literal (an atomic formula or its negation) or a conjunction of literals in each node, and assigns a numerical value to each leaf. SRT provides more comprehensible results than purely statistical methods, and can be applied to a class of problems most other ILP systems cannot handle. Experiments in several real-world domains demonstrate that the approach is competitive with existing methods, indicating that the advantages are not at the expense of predictive accuracy.

  6. Seasonal Variation in Physical Activity among Preschool Children in a Northern Canadian City

    ERIC Educational Resources Information Center

    Carson, Valerie; Spence, John C.; Cutumisu, Nicoleta; Boule, Normand; Edwards, Joy

    2010-01-01

    Little research has examined seasonal differences in physical activity (PA) levels among children. Proxy reports of PA were completed by 1,715 parents on their children in Edmonton, Alberta, Canada. Total PA (TPA) minutes were calculated, and each participant was classified as active, somewhat active, or inactive. Logistic regression models were…

  7. The challenge of logistics facilities development

    NASA Technical Reports Server (NTRS)

    Davis, James R.

    1987-01-01

    The paper discusses the experiences of a group of engineers and logisticians at John F. Kennedy Space Center in the design, construction and activation of a consolidated logistics facility for support of Space Transportation System ground operations and maintenance. The planning, methodology and processes are covered, with emphasis placed on unique aspects and lessons learned. The project utilized a progressive design, baseline and build concept for each phase of construction, with the Government exercising funding and configuration oversight.

  8. Time series modeling by a regression approach based on a latent process.

    PubMed

    Chamroukhi, Faicel; Samé, Allou; Govaert, Gérard; Aknin, Patrice

    2009-01-01

    Time series are used in many domains including finance, engineering, economics and bioinformatics generally to represent the change of a measurement over time. Modeling techniques may then be used to give a synthetic representation of such data. A new approach for time series modeling is proposed in this paper. It consists of a regression model incorporating a discrete hidden logistic process allowing for activating smoothly or abruptly different polynomial regression models. The model parameters are estimated by the maximum likelihood method performed by a dedicated Expectation Maximization (EM) algorithm. The M step of the EM algorithm uses a multi-class Iterative Reweighted Least-Squares (IRLS) algorithm to estimate the hidden process parameters. To evaluate the proposed approach, an experimental study on simulated data and real world data was performed using two alternative approaches: a heteroskedastic piecewise regression model using a global optimization algorithm based on dynamic programming, and a Hidden Markov Regression Model whose parameters are estimated by the Baum-Welch algorithm. Finally, in the context of the remote monitoring of components of the French railway infrastructure, and more particularly the switch mechanism, the proposed approach has been applied to modeling and classifying time series representing the condition measurements acquired during switch operations.

  9. Logistics in smallpox: the legacy.

    PubMed

    Wickett, John; Carrasco, Peter

    2011-12-30

    Logistics, defined as "the time-related positioning of resources", was critical to the implementation of the smallpox eradication strategy of surveillance and containment. Logistical challenges in the smallpox programme included vaccine delivery, supplies, staffing, vehicle maintenance, and financing. Ensuring mobility was essential as health workers had to travel to outbreaks to contain them. Three examples illustrate a range of logistic challenges which required imagination and innovation. Standard price lists were developed to expedite vehicle maintenance and repair in Bihar, India. Innovative staffing ensured an adequate infrastructure for vehicle maintenance in Bangladesh. The use of disaster relief mechanisms in Somalia provided airlifts, vehicles and funding within 27 days of their initiation. In contrast, the Expanded Programme on Immunization (EPI) faces more complex logistical challenges.

  10. What Exactly is Space Logistics?

    DTIC Science & Technology

    2011-01-01

    information on the space shuttle in the news and have inferred by now that resupply missions to the International Space Station, Hubble Space ... Telescope repair missions, and satellite deployment missions are space logistics—and you would be technically correct. The science of logistics as applied...What Exactly is Space Logistics? James C. Breidenbach

  11. Logistics Transformation: The Paradigm Shift

    DTIC Science & Technology

    2007-05-14

    AUTHOR(S) MAJ Derrick A. Corbett...PERFORMING ORGANIZATION U.S. Army Command and General Staff College School of Advanced Military Studies 250 Gibbon...logistics transformation be defined. In the 18th Century, the French invented “a third military science which they called Logistique, or Logistics

  12. Ridge regression processing

    NASA Technical Reports Server (NTRS)

    Kuhl, Mark R.

    1990-01-01

    Current navigation requirements depend on a geometric dilution of precision (GDOP) criterion. As long as the GDOP stays below a specific value, navigation requirements are met. The GDOP will exceed the specified value when the measurement geometry becomes too collinear. A new signal processing technique, called Ridge Regression Processing, can reduce the effects of nearly collinear measurement geometry, thereby reducing the inflation of the measurement errors. It is shown that the Ridge signal processor gives a consistently better mean squared error (MSE) in position than the Ordinary Least Mean Squares (OLS) estimator. The applicability of this technique is currently being investigated to improve the following areas: receiver autonomous integrity monitoring (RAIM), coverage requirements, availability requirements, and precision approaches.
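
    The GDOP criterion mentioned in the abstract above is conventionally computed as the square root of the trace of (H'H)^(-1), where H is the measurement geometry matrix. The sketch below shows only that calculation and how near-collinear rows inflate it; it is not the report's Ridge signal processor, and the function name `gdop` and the example matrices are assumptions chosen for illustration.

```python
import numpy as np

def gdop(H):
    """Geometric dilution of precision: sqrt(trace((H'H)^(-1))).

    H is an (m, 4) geometry matrix; in a satellite-navigation setting
    each row would hold a unit line-of-sight vector and a clock term.
    """
    H = np.asarray(H, dtype=float)
    return float(np.sqrt(np.trace(np.linalg.inv(H.T @ H))))

# Idealized, mutually orthogonal geometry (purely illustrative).
H_good = np.eye(4)

# Same geometry, except the last measurement is almost collinear
# with the third one.
H_poor = np.array([[1.0, 0.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0, 0.01]])

gdop_good = gdop(H_good)
gdop_poor = gdop(H_poor)
```

    Because H'H approaches singularity as rows become collinear, its inverse and therefore the GDOP grow rapidly; adding a small ridge term to H'H before inversion is the kind of stabilization the abstract describes.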

  13. NASA Space Rocket Logistics Challenges

    NASA Technical Reports Server (NTRS)

    Neeley, James R.; Jones, James V.; Watson, Michael D.; Bramon, Christopher J.; Inman, Sharon K.; Tuttle, Loraine

    2014-01-01

    The Space Launch System (SLS) is the new NASA heavy lift launch vehicle and is scheduled for its first mission in 2017. The goal of the first mission, which will be uncrewed, is to demonstrate the integrated system performance of the SLS rocket and spacecraft before a crewed flight in 2021. SLS has many of the same logistics challenges as any other large scale program. Common logistics concerns for SLS include integration of discrete programs geographically separated, multiple prime contractors with distinct and different goals, schedule pressures and funding constraints. However, SLS also faces unique challenges. The new program is a confluence of new hardware and heritage, with heritage hardware constituting seventy-five percent of the program. This unique approach to design makes logistics concerns such as commonality especially problematic. Additionally, a very low manifest rate of one flight every four years makes logistics comparatively expensive. That, along with the SLS architecture being developed using a block upgrade evolutionary approach, exacerbates long-range planning for supportability considerations. These common and unique logistics challenges must be clearly identified and tackled to allow SLS to have a successful program. This paper will address the common and unique challenges facing the SLS programs, along with the analysis and decisions the NASA Logistics engineers are making to mitigate the threats posed by each.

  14. Logistic Mixed Models to Investigate Implicit and Explicit Belief Tracking

    PubMed Central

    Lages, Martin; Scheel, Anne

    2016-01-01

    We investigated the proposition of a two-systems Theory of Mind in adults’ belief tracking. A sample of N = 45 participants predicted the choice of one of two opponent players after observing several rounds in an animated card game. Three matches of this card game were played and initial gaze direction on target and subsequent choice predictions were recorded for each belief task and participant. We conducted logistic regressions with mixed effects on the binary data and developed Bayesian logistic mixed models to infer implicit and explicit mentalizing in true belief and false belief tasks. Although logistic regressions with mixed effects predicted the data well, a Bayesian logistic mixed model with latent task- and subject-specific parameters gave a better account of the data. As expected, explicit choice predictions suggested a clear understanding of true and false beliefs (TB/FB). Surprisingly, however, model parameters for initial gaze direction also indicated belief tracking. We discuss why task-specific parameters for initial gaze directions are different from choice predictions yet reflect second-order perspective taking. PMID:27853440

  15. Joint regression analysis of correlated data using Gaussian copulas.

    PubMed

    Song, Peter X-K; Li, Mingyao; Yuan, Ying

    2009-03-01

    This article concerns a new joint modeling approach for correlated data analysis. Utilizing Gaussian copulas, we present a unified and flexible machinery to integrate separate one-dimensional generalized linear models (GLMs) into a joint regression analysis of continuous, discrete, and mixed correlated outcomes. This essentially leads to a multivariate analogue of the univariate GLM theory and hence an efficiency gain in the estimation of regression coefficients. The availability of joint probability models enables us to develop a full maximum likelihood inference. Numerical illustrations are focused on regression models for discrete correlated data, including multidimensional logistic regression models and a joint model for mixed normal and binary outcomes. In the simulation studies, the proposed copula-based joint model is compared to the popular generalized estimating equations, which is a moment-based estimating equation method to join univariate GLMs. Two real-world data examples are used in the illustration.

  16. Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses.

    PubMed

    Faul, Franz; Erdfelder, Edgar; Buchner, Axel; Lang, Albert-Georg

    2009-11-01

    G*Power is a free power analysis program for a variety of statistical tests. We present extensions and improvements of the version introduced by Faul, Erdfelder, Lang, and Buchner (2007) in the domain of correlation and regression analyses. In the new version, we have added procedures to analyze the power of tests based on (1) single-sample tetrachoric correlations, (2) comparisons of dependent correlations, (3) bivariate linear regression, (4) multiple linear regression based on the random predictor model, (5) logistic regression, and (6) Poisson regression. We describe these new features and provide a brief introduction to their scope and handling.

  17. Spontaneous skin regression and predictors of skin regression in Thai scleroderma patients.

    PubMed

    Foocharoen, Chingching; Mahakkanukrauh, Ajanee; Suwannaroj, Siraphop; Nanagara, Ratanavadee

    2011-09-01

    Skin tightness is a major clinical manifestation of systemic sclerosis (SSc). Importantly for both clinicians and patients, spontaneous regression of the fibrosis process has been documented. The purpose of this study is to identify the incidence and related clinical characteristics of spontaneous regression among Thai SSc patients. A historical cohort study with 4 years of follow-up was performed among SSc patients over 15 years of age diagnosed with SSc between January 1, 2005 and December 31, 2006 in Khon Kaen, Thailand. The start date was the date of the first symptom and the end date was the date of the skin score ≤2. To estimate the respective probability of regression and to assess the associated factors, the Kaplan-Meier method and Cox regression analysis were used. One hundred seventeen cases of SSc were included with a female to male ratio of 1.5:1. Thirteen patients (11.1%) experienced regression. The incidence rate of spontaneous skin regression was 0.31 per 100 person-months and the average duration of SSc at the time of regression was 35.9±15.6 months (range, 15.7-60 months). The factors that negatively correlated with regression were (a) diffuse cutaneous type, (b) Raynaud's phenomenon, (c) esophageal dysmotility, and (d) colchicine treatment at onset with a respective hazard ratio (HR) of 0.19, 0.19, 0.26, and 0.20. By contrast, the factor that positively correlated with regression was active alveolitis with cyclophosphamide therapy at onset with an HR of 4.23 (95% CI, 1.23-14.10). After regression analysis, only Raynaud's phenomenon at onset and diffuse cutaneous type had a significantly negative correlation to regression. A spontaneous regression of the skin fibrosis process was not uncommon among Thai SSc patients. The factors predicting a poor cutaneous outcome were Raynaud's phenomenon and diffuse cutaneous type, while early cyclophosphamide therapy might be related to a better skin outcome.
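The Kaplan-Meier method named in this abstract can be sketched in a few lines. The function below is a minimal product-limit estimator, not the authors' analysis code, and the follow-up times and event flags in the usage lines are made-up illustrative values. In a study like this one the "event" would be skin regression, so 1 − S(t) would be read as the cumulative probability of regression by time t.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of S(t) at each distinct event time.

    times:  follow-up durations (e.g. months)
    events: 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival_probability) pairs.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        at_t = [e for tt, e in data if tt == t]  # everyone leaving the risk set at t
        d = sum(at_t)                            # events observed at t
        if d > 0:
            surv *= 1.0 - d / n_at_risk
            curve.append((t, surv))
        n_at_risk -= len(at_t)
        i += len(at_t)
    return curve

# Hypothetical follow-up data: five subjects, two of them censored.
curve = kaplan_meier(times=[2, 3, 3, 5, 8], events=[1, 1, 0, 1, 0])
for t, s in curve:
    print(t, round(s, 3))
```

Censored subjects contribute to the risk set up to their last follow-up time but never trigger a drop in the curve, which is the property that makes the estimator suitable for cohorts with incomplete follow-up.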

  18. Operational Logistics: Lessons from the Inchon Landing.

    DTIC Science & Technology

    2007-11-02

    logistics at the strategic, operational, or tactical levels. An analysis of Operation Chromite, the amphibious landing at Inchon, reveals that logistical...commander’s intent, and define the logistics focus of effort demonstrate that operational logistics was a key enabler in Operation Chromite. This analysis

  19. Marine guanidine alkaloids crambescidins inhibit tumor growth and activate intrinsic apoptotic signaling inducing tumor regression in a colorectal carcinoma zebrafish xenograft model

    PubMed Central

    Roel, María; Rubiolo, Juan A.; Guerra-Varela, Jorge; Silva, Siguara B. L.; Thomas, Olivier P.; Cabezas-Sainz, Pablo; Sánchez, Laura; López, Rafael; Botana, Luis M.

    2016-01-01

    The marine environment constitutes an extraordinary resource for the discovery of new therapeutic agents. In the present manuscript we studied the effect of 3 different sponge derived guanidine alkaloids, crambescidine-816, -830, and -800. We show that these compounds strongly inhibit tumor cell proliferation by down-regulating cyclin-dependent kinases 2/6 and cyclins D/A expression while up-regulating the cell cyclin-dependent kinase inhibitors -2A, -2D and -1A. We also show that these guanidine compounds disrupt tumor cell adhesion and cytoskeletal integrity promoting the activation of the intrinsic apoptotic signaling, resulting in loss of mitochondrial membrane potential and concomitant caspase-3 cleavage and activation. The crambescidin 816 anti-tumor effect was finally assayed in a zebrafish xenotransplantation model confirming its potent antitumor activity against colorectal carcinoma in vivo. Considering these results crambescidins could represent promising natural anticancer agents and therapeutic tools. PMID:27825113

  20. Continual Improvement in Shuttle Logistics

    NASA Technical Reports Server (NTRS)

    Flowers, Jean; Schafer, Loraine

    1995-01-01

    It has been said that Continual Improvement (CI) is difficult to apply to service oriented functions, especially in a government agency such as NASA. However, a constrained budget and increasing requirements are a way of life at NASA Kennedy Space Center (KSC), making it a natural environment for the application of CI tools and techniques. This paper describes how KSC, and specifically the Space Shuttle Logistics Project, a key contributor to KSC's mission, has embraced the CI management approach as a means of achieving its strategic goals and objectives. An overview of how the KSC Space Shuttle Logistics Project has structured its CI effort and examples of some of the initiatives are provided.

  1. Leisure Activity Patterns and Their Associations with Overweight: A Prospective Study among Adolescents

    ERIC Educational Resources Information Center

    Lajunen, Hanna-Reetta; Keski-Rahkonen, Anna; Pulkkinen, Lea; Rose, Richard J.; Rissanen, Aila; Kaprio, Jaakko

    2009-01-01

    We examined longitudinal associations between individual leisure activities (television viewing, video viewing, computer games, listening to music, board games, musical instrument playing, reading, arts, crafts, socializing, clubs or scouts, sports, outdoor activities) and being overweight using logistic regression and latent class analysis in a…

  2. Evaluating differential effects using regression interactions and regression mixture models

    PubMed Central

    Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung

    2015-01-01

    Research increasingly emphasizes understanding differential effects. This paper focuses on understanding regression mixture models, a relatively new statistical method for assessing differential effects, by comparing results to using an interactive term in linear regression. The research questions which each model answers, their formulation, and their assumptions are compared using Monte Carlo simulations and real data analysis. The capabilities of regression mixture models are described and specific issues to be addressed when conducting regression mixtures are proposed. The paper aims to clarify the role that regression mixtures can take in the estimation of differential effects and increase awareness of the benefits and potential pitfalls of this approach. Regression mixture models are shown to be a potentially effective exploratory method for finding differential effects when these effects can be defined by a small number of classes of respondents who share a typical relationship between a predictor and an outcome. It is also shown that the comparison between regression mixture models and interactions becomes substantially more complex as the number of classes increases. It is argued that regression interactions are well suited for direct tests of specific hypotheses about differential effects and regression mixtures provide a useful approach for exploring effect heterogeneity given adequate samples and study design. PMID:26556903
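The "interactive term in linear regression" that this abstract uses as its point of comparison can be illustrated with a short sketch. The simulated data, the coefficient values, and the binary moderator `g` below are assumptions made for illustration only; they are not the paper's data or its simulation design.

```python
import numpy as np

def fit_interaction(x, g, y):
    """Least squares fit of y = b0 + b1*x + b2*g + b3*(x*g).

    b3 is the interaction coefficient: the difference in the slope of x
    between the g = 1 and g = 0 groups (the differential effect).
    """
    X = np.column_stack([np.ones_like(x), x, g, x * g])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Simulated example with a known differential effect of 0.8 (hypothetical).
rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)
g = rng.integers(0, 2, size=n).astype(float)
y = 1.0 + 0.5 * x + 0.2 * g + 0.8 * x * g + rng.normal(scale=0.3, size=n)

beta = fit_interaction(x, g, y)
slope_g0 = beta[1]            # effect of x when g = 0
slope_g1 = beta[1] + beta[3]  # effect of x when g = 1
print(f"slope (g=0): {slope_g0:.2f}  slope (g=1): {slope_g1:.2f}")
```

An interaction of this kind tests a differential effect across groups that are observed in advance, whereas a regression mixture would treat group membership as a latent class to be estimated.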

  3. Support vector machines classifiers of physical activities in preschoolers

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The goal of this study is to develop, test, and compare multinomial logistic regression (MLR) and support vector machines (SVM) in classifying preschool-aged children physical activity data acquired from an accelerometer. In this study, 69 children aged 3-5 years old were asked to participate in a s...

  4. Bullied Status and Physical Activity in Texas Adolescents

    ERIC Educational Resources Information Center

    Case, Kathleen R.; Pérez, Adriana; Saxton, Debra L.; Hoelscher, Deanna M.; Springer, Andrew E.

    2016-01-01

    This study examined the association between having been bullied at school during the past 6 months ("bullied status") and not meeting physical activity (PA) recommendations of 60 minutes of daily PA during the past week among 8th- and 11th-grade Texas adolescents. Multiple logistic regression analysis was conducted to examine this…

  5. NASA Space Rocket Logistics Challenges

    NASA Technical Reports Server (NTRS)

    Bramon, Chris; Neeley, James R.; Jones, James V.; Watson, Michael D.; Inman, Sharon K.; Tuttle, Loraine

    2014-01-01

    The Space Launch System (SLS) is the new NASA heavy lift launch vehicle in development and is scheduled for its first mission in 2017. SLS has many of the same logistics challenges as any other large scale program. However, SLS also faces unique challenges. This presentation will address the SLS challenges, along with the analysis and decisions to mitigate the threats posed by each.

  6. Equivalent Linear Logistic Test Models.

    ERIC Educational Resources Information Center

    Bechger, Timo M.; Verstralen, Huub H. F. M.; Verhelst, Norma D.

    2002-01-01

    Discusses the Linear Logistic Test Model (LLTM) and demonstrates that there are many equivalent ways to specify a model. Analyzed a real data set (300 responses to 5 analogies) using a Lagrange multiplier test for the specification of the model, and demonstrated that there may be many ways to change the specification of an LLTM and achieve the…

  7. Evaluating Differential Effects Using Regression Interactions and Regression Mixture Models

    ERIC Educational Resources Information Center

    Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung

    2015-01-01

    Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their…

  8. Error bounds in cascading regressions

    USGS Publications Warehouse

    Karlinger, M.R.; Troutman, B.M.

    1985-01-01

    Cascading regressions is a technique for predicting a value of a dependent variable when no paired measurements exist to perform a standard regression analysis. Biases in coefficients of a cascaded-regression line as well as error variance of points about the line are functions of the correlation coefficient between dependent and independent variables. Although this correlation cannot be computed because of the lack of paired data, bounds can be placed on errors through the required properties of the correlation coefficient. The potential mean-squared error of a cascaded-regression prediction can be large, as illustrated through an example using geomorphologic data. © 1985 Plenum Publishing Corporation.

  9. Research on the influencing factors of reverse logistics carbon footprint under sustainable development.

    PubMed

    Sun, Qiang

    2016-12-09

    With growing concern for the ecological and circular economy and for sustainable development, reverse logistics has attracted the attention of enterprises. Achieving sustainable development of reverse logistics has important practical significance for enhancing low carbon competitiveness. In this paper, the system boundary of reverse logistics carbon footprint is presented. Following this, the measurement of reverse logistics carbon footprint and reverse logistics carbon capacity is provided. The influencing factors of reverse logistics carbon footprint are classified into five parts: intensity of reverse logistics, energy structure, energy efficiency, reverse logistics output, and product remanufacturing rate. A quantitative research methodology using the ADF test, the Johansen co-integration test, and impulse response is utilized to interpret the relationship between reverse logistics carbon footprint and the influencing factors more accurately. This research finds that energy efficiency, energy structure, and product remanufacturing rate are more capable of inhibiting reverse logistics carbon footprint. The statistical approaches will help practitioners in this field to structure their reverse logistics activities and also help academics in developing better decision models to reduce reverse logistics carbon footprint.

  10. Logistics, electronic commerce, and the environment

    NASA Astrophysics Data System (ADS)

    Sarkis, Joseph; Meade, Laura; Talluri, Srinivas

    2002-02-01

    Organizations realize that a strong supporting logistics or electronic logistics (e-logistics) function is important from both commercial and consumer perspectives. The implications of e-logistics models and practices cover the forward and reverse logistics functions of organizations. They also have direct and profound impact on the natural environment. This paper will focus on a discussion of forward and reverse e-logistics and their relationship to the natural environment. After discussion of the many pertinent issues in these areas, directions of practice and implications for study and research are then described.

  11. Precision Efficacy Analysis for Regression.

    ERIC Educational Resources Information Center

    Brooks, Gordon P.

    When multiple linear regression is used to develop a prediction model, sample size must be large enough to ensure stable coefficients. If the derivation sample size is inadequate, the model may not predict well for future subjects. The precision efficacy analysis for regression (PEAR) method uses a cross-validity approach to select sample sizes…

  12. Logistics support of lunar bases

    NASA Astrophysics Data System (ADS)

    Woodcock, Gordon R.

    Concepts for lunar base design and operations have been studied for over twenty years. A brief summary of past concepts is presented, with reference to new ideas on design, uses and logistics schemes. Transportation systems and mission modes are reviewed, showing the significant cost benefits of reusable transportation at the higher traffic rates representative of lunar base buildup and logistics operations. Parametric studies are presented over the range of base size from 6 crew to 1000 crew, and the "choke points", or barriers to growth are identified and means for their resolution presented. The leverages of food growth on the Moon, crew stay time, use of lunar oxygen, indigenous resources for construction of facilities, and transportation systems size and operations modes are presented. It is concluded that bases as large as 1000 people are affordable at less than twice the cost of an initial base of six people if these leverages of advanced basing technologies are exploited.

  13. Logistic analysis of algae cultivation.

    PubMed

    Slegers, P M; Leduc, S; Wijffels, R H; van Straten, G; van Boxtel, A J B

    2015-03-01

    Energy requirements for resource transport of algae cultivation are unknown. This work describes the quantitative analysis of energy requirements for water and CO2 transport. Algae cultivation models were combined with the quantitative logistic decision model 'BeWhere' for the regions Benelux (Northwest Europe), southern France and Sahara. For photobioreactors, the energy consumed for transport of water and CO2 turns out to be a small percentage of the energy contained in the algae biomass (0.1-3.6%). For raceway ponds the share for transport is higher (0.7-38.5%). The energy consumption for transport is the lowest in the Benelux due to good availability of both water and CO2. Analysing transport logistics is still important, despite the low energy consumption for transport. The results demonstrate that resource requirements, resource distribution and availability and transport networks have a profound effect on the location choices for algae cultivation.

  14. Logistical teamwork tames Madagascar wildcats

    SciTech Connect

    Twa, W.

    1986-03-01

    Amoco Production Company's exploration program in western Madagascar's Sakalava coastal plain exemplifies the unique logistical challenges both operator and drilling contractor must undergo to reach the few remaining onshore frontier areas. Sakalava is characterized by deep rivers, flood-prone tributaries, and a lone 40 km hard-surface road. Problems caused by a lack of port facilities and oil field services are complicated by thousands of square miles of unimproved wooded plains. Rainwater from the nearby mountains of central Madagascar frequently floods rivers in the Sakalava coastal plain, leaving impassable marshes in their wake. Prior to this project, about 45 wells had been drilled in Madagascar. Most recently, state oil company Omnis contracted Bawden Drilling International Inc. to drill nine wells for its heavy oil project on the Tsimioro oil prospect. Bawden provided both logistical and drilling services for that program.

  15. Reuse and recycling - reverse logistics opportunities

    SciTech Connect

    Kopicki, R.; Berg, M.J.; Legg, L.

    1993-12-31

    This book is intended to serve as a managerial guide for planning and implementing waste reduction programs. It is based on the premise that proactive management of environmental issues is becoming vital to corporate success, and that these issues are creating new roles and opportunities for logistic professionals. Examined in detail are nonhazardous waste reduction activities; reuse and recycling activities; and source reduction. The book is based on in-depth interviews with seventeen firms and several trade associations acknowledged to be leaders in waste reduction efforts. Topics discussed include adapting inbound supply chains to use more recycled goods; minimizing packaging waste; reverse distribution capabilities for taking back products and packaging; and the use of third party services for recycling, reuse, and source reduction activities. Included are two case analyses of progressive firms like E.I. Dupont Nemours and Home Depot and their waste reduction efforts.

  16. Fleet Logistics Center, Puget Sound

    DTIC Science & Technology

    2012-08-01

    ORGANIZATION NAME(S) AND ADDRESS(ES) Naval Supply Systems Command, Fleet Logistics Center, Puget Sound, 467 W Street, Bremerton, WA, 98314-5100 8... Bremerton, WA Established: October 1967 Name Changes: Naval Supply Center Puget Sound, Fleet and Industrial Supply Center Puget...or Sasebo) deployed Ships in the Western Pacific (WestPac) Naval Base Kitsap at Bremerton and Bangor (NBK at Bremerton or Bangor) Navy Region

  17. Which polyunsaturated fatty acids are active in children with attention-deficit hyperactivity disorder receiving PUFA supplementation? A fatty acid validated meta-regression analysis of randomized controlled trials.

    PubMed

    Puri, Basant K; Martins, Julian G

    2014-05-01

    Concerns about growth retardation and unknown effects on long-term brain development with stimulants have prompted interest in polyunsaturated fatty acid supplementation (PUFA) as an alternative treatment. However, randomized controlled trials (RCTs) and meta-analyses of PUFA supplementation in ADHD have shown marginal benefit, and uncertainty exists as to which, if any, PUFA might be effective in alleviating symptoms of ADHD. We conducted an updated meta-analysis of RCTs in ADHD together with multivariable meta-regression analyses using data on PUFA content obtained from independent fatty acid methyl ester analyses of each study PUFA regimen. The PubMed, Embase and PsycINFO databases were searched with no start date and up to 28th July 2013. Study inclusion criteria were: randomized design, placebo controlled, PUFA preparation as active intervention, reporting change scores on ADHD rating-scale measures. Rating-scale measures of inattention and hyperactive-impulsive symptoms were extracted, study authors were contacted to obtain missing data, studies not reporting negative findings had these data imputed, and study quality was assessed using the Jadad system plus other indicators. Random-effects models were used for pooled effects and for meta-regression analyses. Standardized mean differences (SMD) in inattention, hyperactive-impulsive and combined symptoms were assessed as rated by parents, teachers or all raters. The influence of study characteristics and PUFA regimen content was explored in multivariable meta-regression analyses. The overall pooled estimate from 18 studies showed that combined ADHD symptoms rated by all raters decreased with PUFA supplementation; SMD -0.192 (95% CI: -0.297, -0.086; P<0.001). However, when analyzed by rater, only parent-rated symptoms decreased significantly. Multivariable meta-regression showed that longer study duration, γ-linolenic acid (GLA), and the interaction between GLA and eicosapentaenoic acid (EPA) were associated

  18. Information logistics: A production-line approach to information services

    NASA Technical Reports Server (NTRS)

    Adams, Dennis; Lee, Chee-Seng

    1991-01-01

    Logistics can be defined as the process of strategically managing the acquisition, movement, and storage of materials, parts, and finished inventory (and the related information flow) through the organization and its marketing channels in a cost effective manner. It is concerned with delivering the right product to the right customer in the right place at the right time. The logistics function is composed of inventory management, facilities management, communications, unitization, transportation, materials management, and production scheduling. The relationship between logistics and information systems is clear. Systems such as Electronic Data Interchange (EDI), Point of Sale (POS) systems, and Just in Time (JIT) inventory management systems are important elements in the management of product development and delivery. With improved access to market demand figures, logisticians can decrease inventory sizes and better service customer demand. However, without accurate, timely information, little, if any, of this would be feasible in today's global markets. Information systems specialists can learn from logisticians. In a manner similar to logistics management, information logistics is concerned with the delivery of the right data, to the right customer, at the right time. As such, information systems are integral components of the information logistics system charged with providing customers with accurate, timely, cost-effective, and useful information. Information logistics is a management style and is composed of elements similar to those associated with the traditional logistics activity: inventory management (data resource management), facilities management (distributed, centralized and decentralized information systems), communications (participative design and joint application development methodologies), unitization (input/output system design, i.e., packaging or formatting of the information), transportation (voice, data, image, and video communication systems

  19. A Predictive Logistic Regression Model of World Conflict Using Open Source Data

    DTIC Science & Technology

    2015-03-26

    model that predicts the future state of the world where nations will either be in a state of “violent conflict” or “not in violent conflict” based on...state of violent conflict in 2015, seventeen of them are new to conflict since the last published list in 2013. A prediction tool is created to allow...war-game subject matter experts and students to identify future predicted violent conflict and the responsible variables.

  20. Ordinal Logistic Regression to Detect Differential Item Functioning for Gender in the Institutional Integration Scale

    ERIC Educational Resources Information Center

    Breidenbach, Daniel H.; French, Brian F.

    2011-01-01

    Many factors can influence a student's decision to withdraw from college. Intervention programs aimed at retention can benefit from understanding the factors related to such decisions, especially in underrepresented groups. The Institutional Integration Scale (IIS) has been suggested as a predictor of student persistence. Accurate prediction of…

  1. Fitting Proportional Odds Models to Educational Data in Ordinal Logistic Regression Using Stata, SAS and SPSS

    ERIC Educational Resources Information Center

    Liu, Xing

    2008-01-01

    The proportional odds (PO) model, which is also called cumulative odds model (Agresti, 1996, 2002; Armstrong & Sloan, 1989; Long, 1997; Long & Freese, 2006; McCullagh, 1980; McCullagh & Nelder, 1989; Powers & Xie, 2000; O'Connell, 2006), is one of the most commonly used models for the analysis of ordinal categorical data and comes from the class…
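To make the proportional odds model concrete, the sketch below turns a set of cutpoints and a coefficient vector into category probabilities. It assumes the parameterization logit P(Y ≤ j | x) = α_j − xβ, which is one common convention; statistical packages differ in the sign they attach to xβ, so this should not be read as the form used by any particular program. The numeric values of `beta` and `cutpoints` are hypothetical.

```python
import numpy as np

def po_cumulative_probs(x, beta, cutpoints):
    """P(Y <= j | x) for j = 1..J-1 under logit P(Y <= j | x) = cutpoints[j] - x.beta."""
    cutpoints = np.asarray(cutpoints, dtype=float)
    eta = float(np.dot(x, beta))
    return 1.0 / (1.0 + np.exp(-(cutpoints - eta)))

def po_category_probs(x, beta, cutpoints):
    """P(Y = j | x) for j = 1..J, obtained by differencing the cumulative probabilities."""
    cum = np.concatenate([po_cumulative_probs(x, beta, cutpoints), [1.0]])
    return np.diff(cum, prepend=0.0)

# Hypothetical model: one predictor, four ordered categories (three cutpoints).
beta = [0.7]
cutpoints = [-1.0, 0.5, 2.0]
probs = po_category_probs([1.0], beta, cutpoints)
print(np.round(probs, 3))

# The "proportional odds" property: moving x from 0 to 1 shifts the
# cumulative log-odds by the same amount (-beta) at every cutpoint.
logit = lambda p: np.log(p / (1.0 - p))
shift = logit(po_cumulative_probs([1.0], beta, cutpoints)) - logit(
    po_cumulative_probs([0.0], beta, cutpoints))
print(np.round(shift, 3))
```

Because a single `beta` is shared across all cutpoints, the model estimates one odds ratio per predictor regardless of where the ordinal scale is dichotomized.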

  2. Logistic Regression Analysis of the Response of Winter Wheat to Components of Artificial Freezing Episodes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Improvement of cold tolerance of winter wheat (Triticum aestivum L.) through breeding methods has been problematic. A better understanding of how individual wheat cultivars respond to components of the freezing process may provide new information that can be used to develop more cold tolerance culti...

  3. Analysis of Variables: Predicting Sophomore Persistence Using Logistic Regression Analysis at the University of South Florida

    ERIC Educational Resources Information Center

    Miller, Thomas E.; Herreid, Charlene H.

    2009-01-01

    This is the fifth in a series of articles describing an attrition prediction and intervention project at the University of South Florida (USF) in Tampa. The project was originally presented in the 83(2) issue (Miller 2007). The statistical model for predicting attrition was described in the 83(3) issue (Miller and Herreid 2008). The methods and…

  4. AP® Potential Predicted by PSAT/NMSQT® Scores Using Logistic Regression. Statistical Report 2014-1

    ERIC Educational Resources Information Center

    Zhang, Xiuyuan; Patel, Priyank; Ewing, Maureen

    2014-01-01

    AP Potential™ is an educational guidance tool that uses PSAT/NMSQT® scores to identify students who have the potential to do well on one or more Advanced Placement® (AP®) Exams. Students identified as having AP potential, perhaps students who would not have been otherwise identified, should consider enrolling in the corresponding AP course if they…

  5. Achievement Gap Projection for Standardized Testing through Logistic Regression within a Large Arizona School District

    ERIC Educational Resources Information Center

    Kellermeyer, Steven Bruce

    2011-01-01

    In the last few decades high-stakes testing has become more political than educational. The Districts within Arizona are bound by the mandates of both AZ LEARNS and the No Child Left Behind Act of 2001. At the time of this writing, both legislative mandates relied on the Arizona Instrument for Measuring Standards (AIMS) as State Tests for gauging…

  6. Role of Social Performance in Predicting Learning Problems: Prediction of Risk Using Logistic Regression Analysis

    ERIC Educational Resources Information Center

    Del Prette, Zilda Aparecida Pereira; Prette, Almir Del; De Oliveira, Lael Almeida; Gresham, Frank M.; Vance, Michael J.

    2012-01-01

    Social skills are specific behaviors that individuals exhibit in order to successfully complete social tasks whereas social competence represents judgments by significant others that these social tasks have been successfully accomplished. The present investigation identified the best sociobehavioral predictors obtained from different raters…

  7. Exploring Person Fit with an Approach Based on Multilevel Logistic Regression

    ERIC Educational Resources Information Center

    Walker, A. Adrienne; Engelhard, George, Jr.

    2015-01-01

    The idea that test scores may not be valid representations of what students know, can do, and should learn next is well known. Person fit provides an important aspect of validity evidence. Person fit analyses at the individual student level are not typically conducted and person fit information is not communicated to educational stakeholders. In…

  8. Prediction of early postoperative infections in pediatric liver transplantation by logistic regression

    NASA Astrophysics Data System (ADS)

    Uzunova, Yordanka; Prodanova, Krasimira; Spassov, Lubomir

    2016-12-01

    Orthotopic liver transplantation (OLT) is the only curative treatment for end-stage liver disease. Early diagnosis and treatment of infections after OLT are usually associated with improved outcomes. This study's objective is to identify reliable factors that can predict postoperative infectious morbidity. 27 children were included in the analysis. They underwent liver transplantation in our department. The correlation between two parameters (the level of blood glucose on the 5th postoperative day and the duration of the anhepatic phase) and postoperative infections was analyzed, using univariate analysis. In this analysis, an independent predictive factor was derived which adequately identifies patients at risk of infectious complications after a liver transplantation.

  9. Investigating the Effect of Complexity Factors in Stoichiometry Problems Using Logistic Regression and Eye Tracking

    ERIC Educational Resources Information Center

    Tang, Hui; Kirk, John; Pienta, Norbert J.

    2014-01-01

    This paper includes two experiments, one investigating complexity factors in stoichiometry word problems, and the other identifying students' problem-solving protocols by using eye-tracking technology. The word problems used in this study had five different complexity factors, which were randomly assigned by a Web-based tool that we developed. The…

  10. Comparing the Discrete and Continuous Logistic Models

    ERIC Educational Resources Information Center

    Gordon, Sheldon P.

    2008-01-01

    The solutions of the discrete logistic growth model based on a difference equation and the continuous logistic growth model based on a differential equation are compared and contrasted. The investigation is conducted using a dynamic interactive spreadsheet. (Contains 5 figures.)
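
    The comparison described in this record can be sketched in a few lines of code instead of a spreadsheet. The following is a minimal illustration, not the author's spreadsheet: the function names and the parameter values (initial size 10, growth rate 0.5, carrying capacity 1000) are assumptions chosen only for demonstration. It iterates the difference equation P[n+1] = P[n] + r*P[n]*(1 - P[n]/K) and evaluates the closed-form solution of dP/dt = r*P*(1 - P/K) at the same time points.

```python
import math


def continuous_logistic(t, p0, r, k):
    """Closed-form solution of dP/dt = r*P*(1 - P/K) with P(0) = p0."""
    return k / (1.0 + ((k - p0) / p0) * math.exp(-r * t))


def discrete_logistic(n_steps, p0, r, k):
    """Iterate P[n+1] = P[n] + r*P[n]*(1 - P[n]/K); return the full trajectory."""
    values = [p0]
    for _ in range(n_steps):
        p = values[-1]
        values.append(p + r * p * (1.0 - p / k))
    return values


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    p0, r, k = 10.0, 0.5, 1000.0
    discrete = discrete_logistic(20, p0, r, k)
    for n in range(0, 21, 5):
        # Columns: step, discrete model, continuous model at the same time.
        print(n, round(discrete[n], 1), round(continuous_logistic(n, p0, r, k), 1))
```

    With these values both models start at the same point and approach the same carrying capacity, but they follow different paths in between, because the step-by-step growth of the difference equation is not the same as continuous growth at rate r.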

  11. EMS incident management: emergency medical logistics.

    PubMed

    Maniscalco, P M; Christen, H T

    1999-01-01

    If you had to get x amount of supplies to point A or point B, or both, in 10 minutes, how would you do it? The answer lies in the following steps: 1. Develop a logistics plan. 2. Use emergency management as a partner agency for developing your logistics plan. 3. Implement a push logistics system by determining what supplies/medications and equipment are important. 4. Place mass casualty/disaster caches at key locations for rapid deployment. Have medication/fluid caches available at local hospitals. 5. Develop and implement command caches for key supervisors and managers. 6. Anticipate the logistics requirements of a terrorism/tactical violence event based on a community threat assessment. 7. Educate the public about preparing a BLS family disaster kit. 8. Test logistics capabilities at disaster exercises. 9. Budget for logistics needs. 10. Never underestimate the importance of logistics. When logistics support fails, the EMS system fails.

  12. Rank regression: an alternative regression approach for data with outliers.

    PubMed

    Chen, Tian; Tang, Wan; Lu, Ying; Tu, Xin

    2014-10-01

    Linear regression models are widely used in mental health and related health services research. However, the classic linear regression analysis assumes that the data are normally distributed, an assumption that is not met by the data obtained in many studies. One method of dealing with this problem is to use semi-parametric models, which do not require that the data be normally distributed. But semi-parametric models are quite sensitive to outlying observations, so the generated estimates are unreliable when study data includes outliers. In this situation, some researchers trim the extreme values prior to conducting the analysis, but the ad-hoc rules used for data trimming are based on subjective criteria so different methods of adjustment can yield different results. Rank regression provides a more objective approach to dealing with non-normal data that includes outliers. This paper uses simulated and real data to illustrate this useful regression approach for dealing with outliers and compares it to the results generated using classical regression models and semi-parametric regression models.

  13. Practical Session: Simple Linear Regression

    NASA Astrophysics Data System (ADS)

    Clausel, M.; Grégoire, G.

    2014-12-01

    Two exercises are proposed to illustrate the simple linear regression. The first one is based on the famous Galton's data set on heredity. We use the lm R command and get coefficients estimates, standard error of the error, R2, residuals, … In the second example, devoted to data related to the vapor tension of mercury, we fit a simple linear regression, predict values, and look ahead to multiple linear regression. This practical session is an excerpt from practical exercises proposed by A. Dalalyan at EPNC (see Exercises 1 and 2 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_4.pdf).

  14. The Evolution of Centralized Operational Logistics

    DTIC Science & Technology

    2012-05-17

    Command Reports 1-10; Lieutenant General William G. Pagonis and Jeffery L. Cruikshank. Moving Mountains: Lessons in Leadership and Logistics from the Gulf...1943-1945, (Washington, DC: Government Printing Office, 1955), 320-321. 22 Alan Gropman, The Big "L"- American Logistics in World War II...29 Carter B. Magruder, Recurring Logistic Problems As I Have Observed Them, 29. 30 Ibid., 245; Lieutenant Colonel Robert L. Burke, “Corps Logistic

  15. Research on 6R Military Logistics Network

    NASA Astrophysics Data System (ADS)

    Jie, Wan; Wen, Wang

    Building a military logistics network is an important issue in the construction of new forces. This paper puts forward a conceptual model of a 6R military logistics network based on JIT. We then conceive of an axis-spoke logistics centers network, a flexible 6R organizational network, and a lean, grid-based 6R military information network. Finally, strategies and proposals for constructing the three subnetworks of the 6R military logistics network are given.

  16. Nowcasting sunshine number using logistic modeling

    NASA Astrophysics Data System (ADS)

    Brabec, Marek; Badescu, Viorel; Paulescu, Marius

    2013-04-01

    In this paper, we present a formalized approach to statistical modeling of the sunshine number, a binary indicator of whether the Sun is covered by clouds, introduced previously by Badescu (Theor Appl Climatol 72:127-136, 2002). Our statistical approach is based on Markov chains and logistic regression and yields fully specified probability models that are relatively easily identified (and their unknown parameters estimated) from a set of empirical data (observed sunshine number and sunshine stability number series). We discuss the general structure of the model and its advantages, demonstrate its performance on real data, and compare its results to the classical ARIMA approach as a competitor. Since the model parameters have a clear interpretation, we also illustrate how, e.g., their inter-seasonal stability can be tested. We conclude with an outlook on future developments oriented toward the construction of models allowing for a practically desirable smooth transition between data observed with different frequencies, and with a short discussion of the technical problems that such a goal brings.
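
    The general idea of combining a Markov chain with logistic regression for a binary series can be sketched as follows. This is not the authors' model, which also uses the sunshine stability number and is fitted to real observations; it is a minimal first-order illustration in which the probability of a 1 depends only on the previous value, logit P(y_t = 1 | y_{t-1}) = b0 + b1 * y_{t-1}. The function names and the short 0/1 series are made up for demonstration. Because the only predictor is itself binary, the model is saturated and the maximum likelihood estimates follow directly from the transition counts, so no iterative fitting is needed here.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def fit_markov_logistic(series):
    """Fit logit P(y_t = 1 | y_{t-1}) = b0 + b1 * y_{t-1} to a 0/1 series.

    Assumes every transition type out of 0 and out of 1 occurs at least once,
    so that both conditional probabilities lie strictly between 0 and 1.
    """
    counts = {0: [0, 0], 1: [0, 0]}  # counts[previous][current]
    for prev, cur in zip(series, series[1:]):
        counts[prev][cur] += 1
    p_after_0 = counts[0][1] / sum(counts[0])  # P(1 | previous value 0)
    p_after_1 = counts[1][1] / sum(counts[1])  # P(1 | previous value 1)
    b0 = logit(p_after_0)
    b1 = logit(p_after_1) - b0
    return b0, b1


def predict_next(b0, b1, last_value):
    """Probability that the next value is 1, given the last observed value."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * last_value)))


# Made-up binary series, used only for illustration.
observed = [1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1]
b0, b1 = fit_markov_logistic(observed)
print(b0, b1, predict_next(b0, b1, observed[-1]))
```

    A positive b1 means that a 1 is more likely to follow a 1 than to follow a 0, which is the kind of directly interpretable parameter the abstract refers to.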

  17. Crisis Management- Operational Logistics & Asset Visibility Technologies

    DTIC Science & Technology

    2006-06-01

    greatly increase the effectiveness of future crisis response operations. The proposed logistics framework serves as a viable solution for common...increase the effectiveness of future crisis response operations. The proposed logistics framework serves as a viable solution for common logistical...FRAMEWORK............................................................29 A. INTEGRATED COMMUNICATIONS NETWORK MODEL ................29 B. INTEGRATED

  18. Forecasting Workload for Defense Logistics Agency Distribution

    DTIC Science & Technology

    2014-12-01

    NAVAL POSTGRADUATE SCHOOL MONTEREY, CALIFORNIA MBA PROFESSIONAL REPORT FORECASTING WORKLOAD FOR DEFENSE LOGISTICS AGENCY...DATE December 2014 3. REPORT TYPE AND DATES COVERED MBA Professional Report 4. TITLE AND SUBTITLE FORECASTING WORKLOAD FOR DEFENSE LOGISTICS ...maximum 200 words) The Defense Logistics Agency (DLA) predicts issue and receipt workload for its distribution agency in order to maintain

  19. Naval Logistics Integration Through Interoperable Supply Systems

    DTIC Science & Technology

    2014-06-13

    NAVAL LOGISTICS INTEGRATION THROUGH INTEROPERABLE SUPPLY SYSTEMS A thesis presented to the Faculty of the U.S. Army Command...Thesis 3. DATES COVERED (From - To) AUG 2013 – JUNE 2014 4. TITLE AND SUBTITLE Naval Logistics Integration through Interoperable Supply Systems...Unlimited 13. SUPPLEMENTARY NOTES 14. ABSTRACT This research investigates how the Navy and the Marine Corps could increase Naval Logistics

  20. Logistics hardware and services control system

    NASA Technical Reports Server (NTRS)

    Koromilas, A.; Miller, K.; Lamb, T.

    1973-01-01

    Software system permits onsite direct control of logistics operations, which include spare parts, initial installation, tool control, and repairable parts status and control, through all facets of operations. System integrates logistics actions and controls receipts, issues, loans, repairs, fabrications, and modifications and assets in predicting and allocating logistics parts and services effectively.

  1. Multiple Regression and Its Discontents

    ERIC Educational Resources Information Center

    Snell, Joel C.; Marsh, Mitchell

    2012-01-01

    Multiple regression is part of a larger statistical strategy originated by Gauss. The authors raise questions about the theory and suggest some changes that would make room for Mandelbrot and Serendipity.

  2. Wrong Signs in Regression Coefficients

    NASA Technical Reports Server (NTRS)

    McGee, Holly

    1999-01-01

    When using parametric cost estimation, it is important to note the possibility of the regression coefficients having the wrong sign. A wrong sign is defined as a sign on the regression coefficient opposite to the researcher's intuition and experience. Some possible causes for the wrong sign discussed in this paper are a small range of x's, leverage points, missing variables, multicollinearity, and computational error. Additionally, techniques for determining the cause of the wrong sign are given.
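
    One of the causes listed above, a missing variable that is correlated with an included one, can be illustrated with a small calculation. This is a sketch on made-up, noise-free data and not an example from the paper: the response is constructed as y = 1*x1 + 3*x2, and x2 tends to fall as x1 rises. All names and numbers are assumptions for illustration. Regressing y on x1 alone gives a negative slope, while including x2 recovers the positive coefficient.

```python
def centered_cross(u, v):
    """Sum of products of deviations from the means of u and v."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


def simple_slope(x, y):
    """Least-squares slope of y on a single predictor x (with intercept)."""
    return centered_cross(x, y) / centered_cross(x, x)


def two_predictor_slopes(x1, x2, y):
    """Least-squares slopes of y on x1 and x2 (with intercept), via the
    2x2 normal equations on centered data."""
    s11 = centered_cross(x1, x1)
    s22 = centered_cross(x2, x2)
    s12 = centered_cross(x1, x2)
    s1y = centered_cross(x1, y)
    s2y = centered_cross(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Made-up data: x2 decreases as x1 increases, and y = 1*x1 + 3*x2 exactly.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [6.0, 5.0, 5.0, 3.0, 2.0, 2.0]
y = [a + 3.0 * b for a, b in zip(x1, x2)]

slope_x1_alone = simple_slope(x1, y)        # negative: the "wrong" sign
coef_x1, coef_x2 = two_predictor_slopes(x1, x2, y)  # close to 1 and 3
print(slope_x1_alone, coef_x1, coef_x2)
```

    The negative slope in the one-predictor fit is not a computational error; x1 is simply standing in for the omitted x2, whose larger effect runs in the opposite direction across this sample.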

  3. Simultaneous Inhibition of PI3Kδ and PI3Kα Induces ABC-DLBCL Regression by Blocking BCR-Dependent and -Independent Activation of NF-κB and AKT.

    PubMed

    Paul, Juliane; Soujon, Maurice; Wengner, Antje M; Zitzmann-Kolbe, Sabine; Sturz, Andrea; Haike, Katja; Keng Magdalene, Koh Hui; Tan, Sze Huey; Lange, Martin; Tan, Soo Yong; Mumberg, Dominik; Lim, Soon Thye; Ziegelbauer, Karl; Liu, Ningshu

    2017-01-09

    Compared with follicular lymphoma, high PI3Kα expression was more prevalent in diffuse large B cell lymphoma (DLBCL), although both tumor types expressed substantial PI3Kδ. Simultaneous inhibition of PI3Kα and PI3Kδ dramatically enhanced the anti-tumor profile in ABC-DLBCL models compared with selective inhibition of PI3Kδ, PI3Kα, or BTK. The anti-tumor activity was associated with suppression of p-AKT and a mechanism of blocking nuclear factor-κB activation driven by CD79(mut), CARD11(mut), TNFAIP3(mut), or MYD88(mut). Inhibition of PI3Kα/δ resulted in tumor regression in an ibrutinib-resistant CD79B(WT)/MYD88(mut) patient-derived ABC-DLBCL model. Furthermore, rebound activation of BTK and AKT was identified as a mechanism limiting CD79B(mut)-ABC-DLBCL to show a robust response to PI3K and BTK inhibitor monotherapies. A combination of ibrutinib with the PI3Kα/δ inhibitor copanlisib produced a sustained complete response in vivo in CD79B(mut)/MYD88(mut) ABC-DLBCL models.

  4. Physical Activity Behaviors and Perceived Life Satisfaction among Public High School Adolescents

    ERIC Educational Resources Information Center

    Valois, Robert F.; Zullig, Keith J.; Huebner, E. Scott; Drane, J. Wanzer

    2004-01-01

    This study explored relationships between perceived life satisfaction and physical activity behaviors in a statewide sample of adolescents in South Carolina (n = 4,758) using the CDC Youth Risk Behavior Survey (YRBS) and the Brief Multidimensional Student Life Satisfaction Scale (BMSLSS). Adjusted logistic regression analyses and multivariate…

  5. Parental Youth Assets and Sexual Activity: Differences by Race/Ethnicity

    ERIC Educational Resources Information Center

    Tolma, Eleni L.; Oman, Roy F.; Vesely, Sara K.; Aspy, Cheryl B.; Beebe, Laura; Fluhr, Janene

    2011-01-01

    Objectives: To examine how the relationship between parental-related youth assets and youth sexual activity differed by race/ethnicity. Methods: A random sample of 976 youth and their parents living in a Midwestern city participated in the study. Multivariate logistic regression analyses were conducted for 3 major ethnic groups controlling for the…

  6. Out-of-School Time Activity Participation among US-Immigrant Youth

    ERIC Educational Resources Information Center

    Yu, Stella M.; Newport-Berra, McHale; Liu, Jihong

    2015-01-01

    Background: Structured out-of-school time (OST) activities are associated with positive academic and psychosocial outcomes. Methods: Data came from the 2007 National Survey of Children's Health, restricted to 36,132 youth aged 12-17 years. Logistic regression models were used to examine the joint effects of race/ethnicity and immigrant family type…

  7. Obesity, Physical Activity, and Sedentary Behavior of Youth with Learning Disabilities and ADHD

    ERIC Educational Resources Information Center

    Cook, Bryan G.; Li, Dongmei; Heinrich, Katie M.

    2015-01-01

    Obesity, physical activity, and sedentary behavior in childhood are important indicators of present and future health and are associated with school-related outcomes such as academic achievement, behavior, peer relationships, and self-esteem. Using logistic regression models that controlled for gender, age, ethnicity/race, and socioeconomic…

  8. Accounting for Slipping and Other False Negatives in Logistic Models of Student Learning

    ERIC Educational Resources Information Center

    MacLellan, Christopher J.; Liu, Ran; Koedinger, Kenneth R.

    2015-01-01

    Additive Factors Model (AFM) and Performance Factors Analysis (PFA) are two popular models of student learning that employ logistic regression to estimate parameters and predict performance. This is in contrast to Bayesian Knowledge Tracing (BKT) which uses a Hidden Markov Model formalism. While all three models tend to make similar predictions,…

  9. Complementary Log Regression for Sufficient-Cause Modeling of Epidemiologic Data.

    PubMed

    Lin, Jui-Hsiang; Lee, Wen-Chung

    2016-12-13

    The logistic regression model is the workhorse of epidemiological data analysis. The model helps to clarify the relationship between multiple exposures and a binary outcome. Logistic regression analysis is readily implemented using existing statistical software, and this has contributed to it becoming a routine procedure for epidemiologists. In this paper, the authors focus on a causal model which has recently received much attention from the epidemiologic community, namely, the sufficient-component cause model (causal-pie model). The authors show that the sufficient-component cause model is associated with a particular 'link' function: the complementary log link. In a complementary log regression, the exponentiated coefficient of a main-effect term corresponds to an adjusted 'peril ratio', and the coefficient of a cross-product term can be used directly to test for causal mechanistic interaction (sufficient-cause interaction). The authors provide detailed instructions on how to perform a complementary log regression using existing statistical software and use three datasets to illustrate the methodology. Complementary log regression is the model of choice for sufficient-cause analysis of binary outcomes. Its implementation is as easy as conventional logistic regression.

  10. XRA image segmentation using regression

    NASA Astrophysics Data System (ADS)

    Jin, Jesse S.

    1996-04-01

    Segmentation is an important step in image analysis. Thresholding is one of the most important approaches. There are several difficulties in segmentation, such as automatic threshold selection, dealing with intensity distortion, and noise removal. We have developed an adaptive segmentation scheme by applying the Central Limit Theorem in regression. A Gaussian regression is used to separate the distribution of background from foreground in a single-peak histogram. The separation helps to determine the threshold automatically. A small 3 by 3 window is applied and the mode of the local histogram is used to overcome noise. Thresholding is based on local weighting, where regression is used again for parameter estimation. A connectivity test is applied to the final results to remove impulse noise. We have applied the algorithm to x-ray angiogram images to extract brain arteries. The algorithm works well for single-peak distributions where there is no valley in the histogram. The regression provides a method to apply knowledge in clustering. Extending regression for multiple-level segmentation needs further investigation.

  11. Regressive evolution in Astyanax cavefish.

    PubMed

    Jeffery, William R

    2009-01-01

    A diverse group of animals, including members of most major phyla, have adapted to life in the perpetual darkness of caves. These animals are united by the convergence of two regressive phenotypes, loss of eyes and pigmentation. The mechanisms of regressive evolution are poorly understood. The teleost Astyanax mexicanus is of special significance in studies of regressive evolution in cave animals. This species includes an ancestral surface-dwelling form and many conspecific cave-dwelling forms, some of which have evolved their regressive phenotypes independently. Recent advances in Astyanax development and genetics have provided new information about how eyes and pigment are lost during cavefish evolution; namely, they have revealed some of the molecular and cellular mechanisms involved in trait modification, the number and identity of the underlying genes and mutations, the molecular basis of parallel evolution, and the evolutionary forces driving adaptation to the cave environment.

  12. Changes in plasma gonadotropins, inhibin and testosterone concentrations and testicular gonadotropin receptor mRNA expression during testicular active, regressive and recrudescent phase in the captive Japanese black bear (Ursus thibetanus japonicus).

    PubMed

    Iibuchi, Ruriko; Kamine, Akari; Shimozuru, Michito; Nio-Kobayashi, Junko; Watanabe, Gen; Taya, Kazuyoshi; Tsubota, Toshio

    2010-02-01

    Male Japanese black bears (Ursus thibetanus japonicus) have an explicit reproductive cycle. The objective of this study was to clarify the variation of plasma testosterone, FSH, inhibin, LH levels and testicular gonadotropin receptor mRNA expression of male bears associated with their testicular activity. Notably, this study investigated peripheral FSH concentration and localization of gonadotropin receptor mRNAs for the first time in male bears. Blood and testicular tissue samples were taken from captive, mature, male Japanese black bears during testicular active, regressive and recrudescent phases. Plasma hormone concentrations were measured by immunoassays, and gonadotropin receptor mRNA expression in the testis was investigated by in situ hybridization technique and also by real-time PCR. There were significant variations in plasma testosterone and inhibin concentrations. Changes in FSH concentration preceded these hormones with a similar tendency. Hormones started to increase during denning, and achieved the highest values at the end of the recrudescent phase for FSH and in the active phase for testosterone and inhibin. These changes in hormone concentrations were accompanied by testicular growth. In situ hybridization analysis revealed that FSH and LH receptor mRNA was possibly expressed in Sertoli cells and Leydig cells, respectively, as they are in other mammals. However, neither plasma LH concentration nor testicular gonadotropin receptor mRNA expression level varied significantly among the sampling months. These results suggest that FSH, inhibin and testosterone have roles in testicular activity in male bears. This study provides important endocrine information for comprehending seasonal reproductivity in male Japanese black bears.

  13. Sourcing for Parameter Estimation and Study of Logistic Differential Equation

    ERIC Educational Resources Information Center

    Winkel, Brian J.

    2012-01-01

    This article offers modelling opportunities in which the phenomena of the spread of disease, perception of changing mass, growth of technology, and dissemination of information can be described by one differential equation--the logistic differential equation. It presents two simulation activities for students to generate real data, as well as…

  14. Achieving Data Quality within the Logistics Modernization Program

    DTIC Science & Technology

    2012-09-01

    Modernization Program LOGSA Logistics Support Activity BOM Bills of Material AIS Automated Information Systems GAO Government...the scope of this study to audits of bills of material (BOM) data. A. PURPOSE OF RESEARCH The author’s purpose in this study was to determine the...13 G. BILLS OF MATERIAL DATA QUALITY AUDIT PROCESS DESCRIPTION

  15. A conditional likelihood approach for regression analysis using biomarkers measured with batch-specific error.

    PubMed

    Wang, Ming; Flanders, W Dana; Bostick, Roberd M; Long, Qi

    2012-12-20

    Measurement error is common in epidemiological and biomedical studies. When biomarkers are measured in batches or groups, measurement error is potentially correlated within each batch or group. In regression analysis, most existing methods are not applicable in the presence of batch-specific measurement error in predictors. We propose a robust conditional likelihood approach to account for batch-specific error in predictors when batch effect is additive and the predominant source of error, which requires no assumptions on the distribution of measurement error. Although a regression model with batch as a categorical covariable yields the same parameter estimates as the proposed conditional likelihood approach for linear regression, this result does not hold in general for all generalized linear models, in particular, logistic regression. Our simulation studies show that the conditional likelihood approach achieves better finite sample performance than the regression calibration approach or a naive approach without adjustment for measurement error. In the case of logistic regression, our proposed approach is shown to also outperform the regression approach with batch as a categorical covariate. In addition, we also examine a 'hybrid' approach combining the conditional likelihood method and the regression calibration method, which is shown in simulations to achieve good performance in the presence of both batch-specific and measurement-specific errors. We illustrate our method by using data from a colorectal adenoma study.

  16. Space station synergetic RAM-logistics analysis

    NASA Technical Reports Server (NTRS)

    Dejulio, Edmund T.; Leet, Joel H.

    1988-01-01

    NASA's Space Station Maintenance Planning and Analysis (MP&A) Study is a step in the overall Space Station Program to define optimum approaches for on-orbit maintenance planning and logistics support. The approach used in the MP&A study and the analysis process used are presented. Emphasis is on maintenance activities and processes that can be accomplished on orbit within the known design and support constraints of the Space Station. From these analyses, recommendations for maintainability/maintenance requirements are established. The ultimate goal of the study is to reduce on-orbit maintenance requirements to a practical and safe minimum, thereby conserving crew time for productive endeavors. The reliability, availability, and maintainability (RAM) and operations performance evaluation models used were assembled and developed as part of the MP&A study and are described. A representative space station system design is presented to illustrate the analysis process.

  17. Annual report procurement and logistics management center Sandia National Laboratories fiscal year 2002.

    SciTech Connect

    Palmer, David L.

    2003-05-01

    This report summarizes the purchasing and transportation activities of the Procurement and Logistics Management Center for Fiscal Year 2002. Activities for both the New Mexico and California locations are included.

  18. A linear regression solution to the spatial autocorrelation problem

    NASA Astrophysics Data System (ADS)

    Griffith, Daniel A.

    The Moran Coefficient spatial autocorrelation index can be decomposed into orthogonal map pattern components. This decomposition relates it directly to standard linear regression, in which corresponding eigenvectors can be used as predictors. This paper reports comparative results between these linear regressions and their auto-Gaussian counterparts for the following georeferenced data sets: Columbus (Ohio) crime, Ottawa-Hull median family income, Toronto population density, southwest Ohio unemployment, Syracuse pediatric lead poisoning, Glasgow standard mortality rates, and a small remotely sensed image of the High Peak district. This methodology is extended to auto-logistic and auto-Poisson situations, with selected data analyses including percentage of urban population across Puerto Rico and the frequency of SIDS cases across North Carolina. These data analytic results suggest that this approach to georeferenced data analysis offers considerable promise.

  19. Cactus: An Introduction to Regression

    ERIC Educational Resources Information Center

    Hyde, Hartley

    2008-01-01

    When the author first used "VisiCalc," the author thought it a very useful tool when he had the formulas. But how could he design a spreadsheet if there was no known formula for the quantities he was trying to predict? A few months later, the author relates he learned to use multiple linear regression software and suddenly it all clicked into…

  20. Multiple Regression: A Leisurely Primer.

    ERIC Educational Resources Information Center

    Daniel, Larry G.; Onwuegbuzie, Anthony J.

    Multiple regression is a useful statistical technique when the researcher is considering situations in which variables of interest are theorized to be multiply caused. It may also be useful in those situations in which the researcher is interested in studies of predictability of phenomena of interest. This paper provides an introduction to…