NASA Astrophysics Data System (ADS)
Grégoire, G.
2014-12-01
Logistic regression was originally intended to explain the relationship between the probability of an event and a set of covariates. The model's coefficients can be interpreted via the odds and the odds ratio, which are presented in the introduction of the chapter. When the observations are obtained individually, we speak of binary logistic regression; when they are grouped, the logistic regression is said to be binomial. In our presentation we focus mainly on the binary case. For statistical inference the main tool is maximum likelihood methodology: we present the Wald, Rao and likelihood ratio results and their use in comparing nested models. The problems we address are essentially the same as in multiple linear regression: testing the global effect, testing individual effects, selecting variables to build a model, measuring the fit of the model, predicting new values, etc. The methods are demonstrated on data sets using R. Finally we briefly consider the binomial case and the situation where several events are of interest, that is, polytomous (multinomial) logistic regression and the particular case of ordinal logistic regression.
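As a concrete companion to the maximum likelihood machinery this chapter describes (the chapter itself uses R; the following is an illustrative numpy sketch, not the chapter's code), here is a binary logistic fit by Newton-Raphson, with the odds ratio and Wald statistic for one covariate. The simulated data and coefficient values are assumptions for illustration:

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Binary logistic regression fitted by Newton-Raphson (IRLS).
    X must include a leading column of ones for the intercept."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        W = p * (1.0 - p)                    # diagonal of the IRLS weight matrix
        H = X.T @ (X * W[:, None])           # Fisher information X'WX
        beta = beta + np.linalg.solve(H, X.T @ (y - p))
    cov = np.linalg.inv(H)                   # asymptotic covariance of beta-hat
    return beta, np.sqrt(np.diag(cov))

# Simulated data with an assumed true slope of 1.2.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
p_true = 1.0 / (1.0 + np.exp(-(-0.5 + 1.2 * x)))
y = (rng.uniform(size=500) < p_true).astype(float)
X = np.column_stack([np.ones(500), x])

beta, se = fit_logistic(X, y)
odds_ratio = np.exp(beta[1])   # multiplicative change in the odds per unit of x
wald_z = beta[1] / se[1]       # Wald statistic for the individual effect
```

For this same hypothesis the Wald, Rao (score) and likelihood ratio statistics mentioned in the abstract are asymptotically equivalent.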
An Additional Measure of Overall Effect Size for Logistic Regression Models
ERIC Educational Resources Information Center
Allen, Jeff; Le, Huy
2008-01-01
Users of logistic regression models often need to describe the overall predictive strength, or effect size, of the model's predictors. Analogs of R² have been developed, but none of these measures are interpretable on the same scale as effects of individual predictors. Furthermore, R² analogs are not invariant to the…
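One widely used R² analog (not necessarily the measure this article proposes) is McFadden's pseudo-R², computable directly from the model and null log-likelihoods. The outcomes and fitted probabilities below are hypothetical:

```python
import numpy as np

def mcfadden_r2(y, p_hat):
    """McFadden's pseudo-R^2: 1 - ll(model) / ll(null), where the null
    model predicts the overall event rate for every observation."""
    ll_model = np.sum(y * np.log(p_hat) + (1 - y) * np.log(1 - p_hat))
    p0 = y.mean()
    ll_null = np.sum(y * np.log(p0) + (1 - y) * np.log(1 - p0))
    return 1.0 - ll_model / ll_null

# Hypothetical fitted probabilities for ten observations.
y = np.array([1, 1, 1, 0, 0, 1, 0, 0, 1, 0], dtype=float)
p_hat = np.array([0.9, 0.8, 0.7, 0.2, 0.1, 0.6, 0.3, 0.2, 0.8, 0.4])
r2 = mcfadden_r2(y, p_hat)
```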
Practical Session: Logistic Regression
NASA Astrophysics Data System (ADS)
Clausel, M.; Grégoire, G.
2014-12-01
An exercise is proposed to illustrate logistic regression, investigating the different risk factors in the development of coronary heart disease. It has been proposed in Chapter 5 of the book by D.G. Kleinbaum and M. Klein, "Logistic Regression", Statistics for Biology and Health, Springer Science+Business Media, LLC (2010), and also by D. Chessel and A.B. Dufour at Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr341.pdf). This example is based on data given in the file evans.txt, from http://www.sph.emory.edu/dkleinb/logreg3.htm#data.
[Understanding logistic regression].
El Sanharawi, M; Naudet, F
2013-10-01
Logistic regression is one of the most common multivariate analysis models utilized in epidemiology. It allows the measurement of the association between the occurrence of an event (qualitative dependent variable) and factors susceptible to influence it (explicative variables). The choice of explicative variables that should be included in the logistic regression model is based on prior knowledge of the disease physiopathology and the statistical association between the variable and the event, as measured by the odds ratio. The main steps for the procedure, the conditions of application, and the essential tools for its interpretation are discussed concisely. We also discuss the importance of the choice of variables that must be included and retained in the regression model in order to avoid the omission of important confounding factors. Finally, by way of illustration, we provide an example from the literature, which should help the reader test his or her knowledge.
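The odds ratio that this overview uses to measure the association reduces to simple arithmetic on a 2x2 table; the counts below are hypothetical:

```python
# 2x2 table: rows = exposed / unexposed, columns = event / no event.
a, b = 30, 70   # exposed: events, non-events (hypothetical counts)
c, d = 10, 90   # unexposed: events, non-events

odds_exposed = a / b                         # 30/70, about 0.43
odds_unexposed = c / d                       # 10/90, about 0.11
odds_ratio = odds_exposed / odds_unexposed   # about 3.86
```

A univariable logistic regression of the event on the exposure indicator returns exp(slope) equal to this same odds ratio, which is why the coefficients are interpreted on the odds scale.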
Steganalysis using logistic regression
NASA Astrophysics Data System (ADS)
Lubenko, Ivans; Ker, Andrew D.
2011-02-01
We advocate Logistic Regression (LR) as an alternative to the Support Vector Machine (SVM) classifiers commonly used in steganalysis. LR offers more information than traditional SVM methods - it estimates class probabilities as well as providing a simple classification - and can be adapted more easily and efficiently for multiclass problems. Like SVM, LR can be kernelised for nonlinear classification, and it shows comparable classification accuracy to SVM methods. This work is a case study comparing the accuracy and speed of SVM and LR classifiers in the detection of LSB Matching and other related spatial-domain image steganography, using the state-of-the-art 686-dimensional SPAM feature set, on three image sets.
The purpose of this report is to provide a reference manual that could be used by investigators for making informed use of logistic regression using two methods (standard logistic regression and MARS). The details for analyses of relationships between a dependent binary response ...
Logistic Regression: Concept and Application
ERIC Educational Resources Information Center
Cokluk, Omay
2010-01-01
The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…
Standards for Standardized Logistic Regression Coefficients
ERIC Educational Resources Information Center
Menard, Scott
2011-01-01
Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…
Should metacognition be measured by logistic regression?
Rausch, Manuel; Zehetleitner, Michael
2017-03-01
Are logistic regression slopes suitable to quantify metacognitive sensitivity, i.e. the efficiency with which subjective reports differentiate between correct and incorrect task responses? We analytically show that logistic regression slopes are independent of rating criteria in one specific model of metacognition, which assumes (i) that rating decisions are based on sensory evidence generated independently of the sensory evidence used for primary task responses and (ii) that the distributions of evidence are logistic. Given a hierarchical model of metacognition, logistic regression slopes depend on rating criteria. According to all considered models, regression slopes depend on the primary task criterion. A reanalysis of previous data revealed that massive numbers of trials are required to distinguish between hierarchical and independent models with tolerable accuracy. It is argued that researchers who wish to use logistic regression as a measure of metacognitive sensitivity need to control the primary task criterion and the rating criteria.
Satellite rainfall retrieval by logistic regression
NASA Technical Reports Server (NTRS)
Chiu, Long S.
1986-01-01
The potential use of logistic regression in rainfall estimation from satellite measurements is investigated. Satellite measurements provide covariate information in the form of radiances from different remote sensors. The logistic regression technique can effectively accommodate many covariates and test their significance in the estimation. The outcome of the logistic model is the probability that the rain rate of a satellite pixel is above a certain threshold. By varying the thresholds, a rain-rate histogram can be obtained, from which the mean and the variance can be estimated. A logistic model is developed and applied to rainfall data collected during GATE, using as covariates the fractional rain area and a radiance measurement deduced from a microwave temperature-rain rate relation. It is demonstrated that the fractional rain area is an important covariate in the model, consistent with the use of the so-called Area Time Integral in estimating total rain volume in other studies. To calibrate the logistic model, simulated rain fields generated by rain-field models with prescribed parameters are needed. A stringent test of the logistic model is its ability to recover the prescribed parameters of simulated rain fields. A rain-field simulation model that preserves the fractional rain area and the lognormality of rain rates as found in GATE is developed. A stochastic regression model of branching and immigration, whose solutions are lognormally distributed in some asymptotic limits, has also been developed.
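The threshold-varying idea can be sketched as follows: given fitted exceedance probabilities at several thresholds, bin masses follow by differencing, and the mean follows by integrating the survival curve. All numbers below are hypothetical, not taken from the paper:

```python
import numpy as np

# Hypothetical fitted exceedance probabilities P(R > t_k) at increasing
# thresholds t_k (assumed units of mm/h).
thresholds = np.array([0.0, 1.0, 2.0, 5.0, 10.0, 20.0])
p_exceed = np.array([0.30, 0.20, 0.14, 0.06, 0.02, 0.00])

# Histogram: probability mass falling in each bin [t_k, t_{k+1}).
bin_mass = p_exceed[:-1] - p_exceed[1:]

# Mean rate: for a nonnegative variable, E[R] equals the area under the
# survival curve; approximate it with the trapezoidal rule.
mean_rate = float(np.sum(0.5 * (p_exceed[:-1] + p_exceed[1:]) * np.diff(thresholds)))
```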
Predicting Social Trust with Binary Logistic Regression
ERIC Educational Resources Information Center
Adwere-Boamah, Joseph; Hufstedler, Shirley
2015-01-01
This study used binary logistic regression to predict social trust with five demographic variables from a national sample of adult individuals who participated in The General Social Survey (GSS) in 2012. The five predictor variables were respondents' highest degree earned, race, sex, general happiness and the importance of personally assisting…
Logistic models--an odd(s) kind of regression.
Jupiter, Daniel C
2013-01-01
The logistic regression model bears some similarity to the multivariable linear regression with which we are familiar. However, the differences are great enough to warrant a discussion of the need for and interpretation of logistic regression.
Model selection for logistic regression models
NASA Astrophysics Data System (ADS)
Duller, Christine
2012-09-01
Model selection for logistic regression models decides which of some given potential regressors have an effect and hence should be included in the final model. A second interesting question is whether a certain factor is heterogeneous among some subsets, i.e. whether the model should include a random intercept or not. In this paper these questions are answered with classical as well as with Bayesian methods. The applications show some results of recent research projects in medicine and business administration.
Transfer Learning Based on Logistic Regression
NASA Astrophysics Data System (ADS)
Paul, A.; Rottensteiner, F.; Heipke, C.
2015-08-01
In this paper we address the problem of classifying remote sensing images in the framework of transfer learning, with a focus on domain adaptation. This research area deals with methods for problems in which labelled training data sets are assumed to be available only for a source domain, while classification is needed in a target domain with different, yet related, characteristics. The main novel contribution is a method for transductive transfer learning in remote sensing based on logistic regression, a discriminative probabilistic classifier of low computational complexity that can deal with multiclass problems. Classification takes place with a model of weight coefficients for hyperplanes which separate features in the transformed feature space. In terms of logistic regression, our domain adaptation method adjusts the model parameters by iteratively labelling the target test data set. These labelled features are iteratively added to the current training set, which at the beginning contains only source features, while a number of source features are simultaneously deleted from it. Experimental results based on a test series with synthetic and real data constitute a first proof of concept of the proposed method.
Logistic Regression Applied to Seismic Discrimination
BG Amindan; DN Hagedorn
1998-10-08
The usefulness of logistic discrimination was examined in an effort to learn how it performs in a regional seismic setting. Logistic discrimination provides an easily understood method, works with user-defined models and few assumptions about the population distributions, and handles both continuous and discrete data. Seismic event measurements from a data set compiled by Los Alamos National Laboratory (LANL) of Chinese events recorded at station WMQ were used in this demonstration study. PNNL applied logistic regression techniques to the data. All possible combinations of the Lg and Pg measurements were tried, and a best-fit logistic model was created. The best combination of Lg and Pg frequencies for predicting the source of a seismic event (earthquake or explosion) used Lg{sub 3.0-6.0} and Pg{sub 3.0-6.0} as the predictor variables. A cross-validation test showed that this model was able to correctly predict 99.7% of earthquakes and 98.0% of explosions for the given data set. Two other models were identified that used Pg and Lg measurements from the 1.5 to 3.0 Hz frequency range. Although these other models did a good job of correctly predicting the earthquakes, they were not as effective at predicting the explosions. Two possible biases were discovered that affect the predicted probabilities for each outcome. The first bias is due to this being a case-controlled study: the sampling fractions bias the probabilities calculated using the models. The second bias is caused by a change in the proportions for each event: if at a later date the proportions (a priori probabilities) of explosions versus earthquakes change, the predicted probability for an event will be biased. When using logistic regression, the user needs to be aware of these possible biases and of the effect they will have on the predicted probabilities.
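The first bias noted above, arising from case-control sampling fractions, is commonly handled by prior correction of the fitted intercept. The sketch below follows the standard King-and-Zeng-style correction rather than anything specific to this report, and the proportions are assumed for illustration:

```python
import math

def corrected_intercept(b0_hat, tau, ybar):
    """Prior correction of a logistic intercept fitted on a case-controlled
    sample: tau is the population event proportion, ybar is the event
    proportion in the (over-sampled) study data."""
    return b0_hat - math.log(((1 - tau) / tau) * (ybar / (1 - ybar)))

# Hypothetical: explosions make up 50% of the sample but 5% of real events.
b0 = -0.2
b0_adj = corrected_intercept(b0, tau=0.05, ybar=0.5)
```

Slopes are unaffected by this correction; only the intercept, and hence the predicted probabilities, shift to match the true prior proportions.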
Supporting Regularized Logistic Regression Privately and Efficiently.
Li, Wenfa; Liu, Hongzhe; Yang, Peng; Xie, Wei
2016-01-01
As one of the most popular statistical and machine learning models, logistic regression with regularization has found wide adoption in biomedicine, social sciences, information technology, and so on. These domains often involve data on human subjects that are contingent upon strict privacy regulations. Concerns over data privacy make it increasingly difficult to coordinate and conduct large-scale collaborative studies, which typically rely on cross-institution data sharing and joint analysis. Our work here focuses on safeguarding regularized logistic regression, a widely used statistical model that has not yet been investigated from a data security and privacy perspective. We consider a common use scenario of multi-institution collaborative studies, such as research consortia or networks as widely seen in genetics, epidemiology, social sciences, etc. To make our privacy-enhancing solution practical, we demonstrate a non-conventional and computationally efficient method leveraging distributed computing and strong cryptography to provide comprehensive protection over individual-level and summary data. Extensive empirical evaluations on several studies validate the privacy guarantee, efficiency and scalability of our proposal. We also discuss the practical implications of our solution for large-scale studies and applications in various disciplines, including genetic and biomedical studies, smart grid, network analysis, etc.
Using Dominance Analysis to Determine Predictor Importance in Logistic Regression
ERIC Educational Resources Information Center
Azen, Razia; Traxel, Nicole
2009-01-01
This article proposes an extension of dominance analysis that allows researchers to determine the relative importance of predictors in logistic regression models. Criteria for choosing logistic regression R² analogues were determined and measures were selected that can be used to perform dominance analysis in logistic regression. A…
[Logistic regression against a divergent Bayesian network].
Sánchez Trujillo, Noel Antonio
2015-02-03
This article is a discussion of two statistical tools used for prediction and causality assessment: logistic regression and Bayesian networks. Using data from a simulated example from a study assessing factors that might predict pulmonary emphysema (where fingertip pigmentation and smoking are considered), we posed the following questions. Is pigmentation a confounding, causal or predictive factor? Is there perhaps another factor, like smoking, that confounds? Is there a synergy between pigmentation and smoking? The results, in terms of prediction, are similar with the two techniques; regarding causation, differences arise. We conclude that, in decision-making, the combination of a statistical tool used with common sense and prior evidence, which may take years or even centuries to develop, is better than the automatic and exclusive use of statistical resources.
Computing group cardinality constraint solutions for logistic regression problems.
Zhang, Yong; Kwon, Dongjin; Pohl, Kilian M
2017-01-01
We derive an algorithm to directly solve logistic regression subject to a group cardinality constraint (group sparsity) and use it to classify intra-subject MRI sequences (e.g. cine MRIs) of healthy versus diseased subjects. Group cardinality constraint models are often applied to medical images in order to avoid overfitting the classifier to the training data. Solutions within these models are generally determined by relaxing the cardinality constraint to a weighted feature selection scheme. However, these solutions relate to the original sparse problem only under specific assumptions, which generally do not hold for medical image applications. In addition, inferring clinical meaning from features weighted by a classifier is an ongoing topic of discussion. To avoid weighting features, we propose to directly solve the group cardinality constrained logistic regression problem by generalizing the Penalty Decomposition method. To do so, we assume that an intra-subject series of images represents repeated samples of the same disease patterns. We model this assumption by combining the series of measurements created by a feature across time into a single group. Our algorithm then derives a solution within that model by decoupling the minimization of the logistic regression function from enforcement of the group sparsity constraint. The minimum of the smooth and convex logistic regression problem is determined via gradient descent, while we derive a closed-form solution for finding a sparse approximation of that minimum. We apply our method to cine MRI of 38 healthy controls and 44 adult patients who received reconstructive surgery for Tetralogy of Fallot (TOF) during infancy. Our method correctly identifies regions impacted by TOF and generally obtains statistically significantly higher classification accuracy than alternative solutions within this model, i.e., ones relaxing group cardinality constraints.
A Methodology for Generating Placement Rules that Utilizes Logistic Regression
ERIC Educational Resources Information Center
Wurtz, Keith
2008-01-01
The purpose of this article is to provide the necessary tools for institutional researchers to conduct a logistic regression analysis and interpret the results. Aspects of the logistic regression procedure that are necessary to evaluate models are presented and discussed with an emphasis on cutoff values and choosing the appropriate number of…
Preserving Institutional Privacy in Distributed binary Logistic Regression.
Wu, Yuan; Jiang, Xiaoqian; Ohno-Machado, Lucila
2012-01-01
Privacy is becoming a major concern when sharing biomedical data across institutions. Although methods for protecting the privacy of individual patients have been proposed, it is not clear how to protect institutional privacy, which is often a critical concern of data custodians. Built upon our previous work, Grid Binary LOgistic REgression (GLORE), we developed an Institutional Privacy-preserving Distributed binary Logistic Regression model (IPDLR) that considers both individual and institutional privacy for building a logistic regression model in a distributed manner. We tested our method using both simulated and clinical data, showing how it is possible to protect the privacy of individuals and of institutions using a distributed strategy.
Large unbalanced credit scoring using Lasso-logistic regression ensemble.
Wang, Hong; Xu, Qingsong; Zhou, Lifeng
2015-01-01
Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data.
MODELING SNAKE MICROHABITAT FROM RADIOTELEMETRY STUDIES USING POLYTOMOUS LOGISTIC REGRESSION
Multivariate analysis of snake microhabitat has historically used techniques that were derived under assumptions of normality and common covariance structure (e.g., discriminant function analysis, MANOVA). In this study, polytomous logistic regression (PLR which does not require ...
[Unconditioned logistic regression and sample size: a bibliographic review].
Ortega Calvo, Manuel; Cayuela Domínguez, Aurelio
2002-01-01
Unconditioned logistic regression is a highly useful risk prediction method in epidemiology. This article reviews the different solutions provided by different authors concerning the interface between the calculation of the sample size and the use of logistic regression. Based on the information initially provided, a review is made of the customized regression and predictive constriction phenomenon, the design of an ordinal exposure with a binary outcome, the events-per-variable concept, indicator variables, the classic Freeman equation, etc. Some skeptical ideas regarding this subject are also included.
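The events-per-variable concept mentioned above leads to a common rule-of-thumb sample-size calculation; the EPV = 10 default below is the usual heuristic, not a recommendation taken from this review:

```python
import math

def min_sample_size(n_predictors, event_rate, epv=10):
    """Minimum n from the events-per-variable rule of thumb: require at
    least `epv` events of the rarer outcome per candidate predictor."""
    rarer = min(event_rate, 1 - event_rate)
    return math.ceil(epv * n_predictors / rarer)

# Five candidate predictors, 20% event rate: need 50 events, hence n = 250.
n_needed = min_sample_size(n_predictors=5, event_rate=0.2)
```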
Estimating the exceedance probability of rain rate by logistic regression
NASA Technical Reports Server (NTRS)
Chiu, Long S.; Kedem, Benjamin
1990-01-01
Recent studies have shown that the fraction of an area with rain intensity above a fixed threshold is highly correlated with the area-averaged rain rate. To estimate the fractional rainy area, a logistic regression model, which estimates the conditional probability that rain rate over an area exceeds a fixed threshold given the values of related covariates, is developed. The problem of dependency in the data in the estimation procedure is bypassed by the method of partial likelihood. Analyses of simulated scanning multichannel microwave radiometer and observed electrically scanning microwave radiometer data during the Global Atlantic Tropical Experiment period show that the use of logistic regression in pixel classification is superior to multiple regression in predicting whether rain rate at each pixel exceeds a given threshold, even in the presence of noisy data. The potential of the logistic regression technique in satellite rain rate estimation is discussed.
Differentially private distributed logistic regression using private and public data
2014-01-01
Background Privacy protection is an important issue in medical informatics, and differential privacy is a state-of-the-art framework for data privacy research. Differential privacy offers provable privacy against attackers who have auxiliary information, and can be applied to data mining models (for example, logistic regression). However, differentially private methods sometimes introduce too much noise and make outputs less useful. Given available public data in medical research (e.g. from patients who sign open-consent agreements), we can design algorithms that use both public and private data sets to decrease the amount of noise that is introduced. Methodology In this paper, we modify the update step in the Newton-Raphson method to propose a differentially private distributed logistic regression model based on both public and private data. Experiments and results We try our algorithm on three different data sets, and show its advantage over: (1) a logistic regression model based solely on public data, and (2) a differentially private distributed logistic regression model based on private data under various scenarios. Conclusion Logistic regression models built with our new algorithm based on both private and public datasets demonstrate better utility than models trained on private or public datasets alone, without sacrificing the rigorous privacy guarantee. PMID:25079786
Identification of high leverage points in binary logistic regression
NASA Astrophysics Data System (ADS)
Fitrianto, Anwar; Wendy, Tham
2016-10-01
Leverage points are observations that are outlying in the x-space of a regression. Detection of high leverage points plays a vital role because such points can mask outliers. In regression, observations made at extremes of the space of explanatory variables, far from the bulk of the data, are called leverage points. In this project, a method for identifying high leverage points in logistic regression is illustrated with a numerical example. We investigate the effects of a high leverage point on the logistic regression model, comparing the results of the model with and without the leverage point, and present some graphical analyses based on the results. We found that the presence of high leverage points (HLPs) affects the hii values, estimated probabilities, estimated coefficients, p-values of variables, odds ratios and the regression equation.
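The hii values referred to above are the diagonal of the logistic hat matrix. A numpy sketch with assumed coefficients (not the authors' example data):

```python
import numpy as np

def logistic_leverage(X, beta):
    """Diagonal h_ii of the logistic hat matrix
    H = W^(1/2) X (X'WX)^(-1) X' W^(1/2), with W = diag(p(1-p))."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    Xw = X * np.sqrt(p * (1.0 - p))[:, None]
    M = np.linalg.inv(Xw.T @ Xw)
    # h_ii = row_i(Xw) @ M @ row_i(Xw)'
    return np.einsum('ij,jk,ik->i', Xw, M, Xw)

rng = np.random.default_rng(2)
X = np.column_stack([np.ones(50), rng.normal(size=50)])
X[0, 1] = 8.0                 # one point placed far out in x-space
beta = np.array([0.0, 0.5])   # assumed fitted coefficients
h = logistic_leverage(X, beta)
```

The h values sum to the number of parameters, as in linear regression; but because the weights p(1-p) shrink toward zero when the fitted probability nears 0 or 1, an extreme x value does not automatically receive high leverage in the logistic case.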
Improving Predictions in Imbalanced Data Using Pairwise Expanded Logistic Regression
Jiang, Xiaoqian; El-Kareh, Robert; Ohno-Machado, Lucila
2011-01-01
Building classifiers for medical problems often involves dealing with rare, but important events. Imbalanced datasets pose challenges to ordinary classification algorithms such as Logistic Regression (LR) and Support Vector Machines (SVM). The lack of effective strategies for dealing with imbalanced training data often results in models that exhibit poor discrimination. We propose a novel approach to estimate class memberships based on the evaluation of pairwise relationships in the training data. The method we propose, Pairwise Expanded Logistic Regression, improved discrimination and had higher accuracy when compared to existing methods in two imbalanced datasets, thus showing promise as a potential remedy for this problem. PMID:22195118
Metadata for ReVA logistic regression dataset
LaMotte, Andrew E.
2004-01-01
The U.S. Geological Survey in cooperation with the U.S. Environmental Protection Agency's Regional Vulnerability Assessment Program, has developed a set of statistical tools to support regional-scale, ground-water quality and vulnerability assessments. The Regional Vulnerability Assessment Program goals are to develop and demonstrate approaches to comprehensive, regional-scale assessments that effectively inform water-resources managers and decision-makers as to the magnitude, extent, distribution, and uncertainty of current and anticipated environmental risks. The U.S. Geological Survey is developing and exploring the use of statistical probability models to characterize the relation between ground-water quality and geographic factors in the Mid-Atlantic Region. Available water-quality data obtained from U.S. Geological Survey National Water-Quality Assessment Program studies conducted in the Mid-Atlantic Region were used in association with geographic data (land cover, geology, soils, and others) to develop logistic-regression equations that use explanatory variables to predict the presence of a selected water-quality parameter exceeding specified management concentration thresholds. The resulting logistic-regression equations were transformed to determine the probability, P(X), of a water-quality parameter exceeding a specified management threshold. Additional statistical procedures modified by the U.S. Geological Survey were used to compare the observed values to model-predicted values at each sample point. In addition, procedures to evaluate the confidence of the model predictions and estimate the uncertainty of the probability value were developed and applied. The resulting logistic-regression models were applied to the Mid-Atlantic Region to predict the spatial probability of nitrate concentrations exceeding specified management thresholds. These thresholds are usually set or established by regulators or managers at national or local levels. At management…
Autoregressive logistic regression applied to atmospheric circulation patterns
NASA Astrophysics Data System (ADS)
Guanche, Y.; Mínguez, R.; Méndez, F. J.
2014-01-01
Autoregressive logistic regression models have been successfully applied in medical and pharmacological research, and in simple models to analyze weather types. The main purpose of this paper is to introduce a general framework to study atmospheric circulation patterns capable of dealing simultaneously with seasonality, interannual variability, long-term trends, and autocorrelation of different orders. To show its effectiveness in modeling performance, daily atmospheric circulation patterns identified from observed sea level pressure fields over the Northeastern Atlantic have been analyzed using this framework. Model predictions are compared with probabilities from the historical database, showing very good fitting diagnostics. In addition, the fitted model is used to simulate the evolution over time of atmospheric circulation patterns using the Monte Carlo method. Simulation results are statistically consistent with the historical sequence in terms of (1) probability of occurrence of the different weather types, (2) transition probabilities and (3) persistence. The proposed model constitutes an easy-to-use and powerful tool for a better understanding of the climate system.
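A minimal sketch (not the authors' model) of how such a framework can combine seasonality and first-order autocorrelation in a single logistic design matrix, fitted here on simulated data with assumed parameters:

```python
import numpy as np

def ar_logistic_design(day_of_year, prev_state):
    """Covariates for a first-order autoregressive logistic model of a
    binary weather state: annual harmonics for seasonality, plus
    yesterday's state for autocorrelation."""
    t = 2 * np.pi * day_of_year / 365.25
    return np.column_stack([np.ones_like(t), np.cos(t), np.sin(t), prev_state])

# Simulate a persistent, seasonal binary state sequence (assumed parameters).
rng = np.random.default_rng(4)
n_days = 3000
state = np.zeros(n_days)
for i in range(1, n_days):
    t = 2 * np.pi * i / 365.25
    p = 1.0 / (1.0 + np.exp(-(-0.5 + 0.8 * np.cos(t) + 1.5 * state[i - 1])))
    state[i] = float(rng.uniform() < p)

X = ar_logistic_design(np.arange(1, n_days), state[:-1])
y = state[1:]

# Fit by plain gradient descent on the average negative log-likelihood.
beta = np.zeros(4)
for _ in range(5000):
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    beta -= 0.5 * X.T @ (p - y) / len(y)
```

Once fitted, the same design can drive Monte Carlo simulation of the state sequence, as in the paper: draw tomorrow's state from the fitted probability given today's state and the day of year.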
Predicting β-Turns in Protein Using Kernel Logistic Regression
Elbashir, Murtada Khalafallah; Sheng, Yu; Wang, Jianxin; Wu, FangXiang; Li, Min
2013-01-01
A β-turn is a secondary protein structure type that plays a significant role in protein configuration and function. On average, 25% of amino acids in protein structures are located in β-turns, so it is very important to develop an accurate and efficient method for β-turn prediction. Most of the current successful β-turn prediction methods use support vector machines (SVMs) or neural networks (NNs). Kernel logistic regression (KLR) is a powerful classification technique that has been applied successfully to many classification problems, but it is rarely used for β-turn classification, mainly because it is computationally expensive. In this paper, we used KLR to obtain sparse β-turn predictions in a short training time. Secondary structure information and position-specific scoring matrices (PSSMs) are utilized as input features. We achieved a Qtotal of 80.7% and an MCC of 50% on the BT426 dataset. These results show that the KLR method, with the right algorithm, can yield performance equivalent to or even better than NNs and SVMs in β-turn prediction. In addition, KLR yields a probabilistic outcome and has a well-defined extension to the multiclass case. PMID:23509793
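As a rough illustration of the technique (not the authors' implementation), kernel logistic regression fits a logistic model in a kernel-induced feature space. A minimal gradient-descent sketch on a toy XOR problem, with an assumed RBF kernel and no regularization:

```python
import math

def rbf(a, b, gamma=1.0):
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def fit_klr(X, y, gamma=1.0, lr=0.5, iters=5000):
    """Kernel logistic regression via gradient descent on the dual
    coefficients (simplified, unregularized sketch)."""
    n = len(X)
    K = [[rbf(X[i], X[j], gamma) for j in range(n)] for i in range(n)]
    alpha, b = [0.0] * n, 0.0
    for _ in range(iters):
        f = [sum(alpha[j] * K[i][j] for j in range(n)) + b for i in range(n)]
        p = [1.0 / (1.0 + math.exp(-fi)) for fi in f]
        g = [p[i] - y[i] for i in range(n)]          # d(loss)/d(f_i)
        for j in range(n):
            alpha[j] -= lr * sum(g[i] * K[i][j] for i in range(n)) / n
        b -= lr * sum(g) / n
    return alpha, b

def predict(x, X, alpha, b, gamma=1.0):
    return 1 if sum(a * rbf(x, xi, gamma) for a, xi in zip(alpha, X)) + b > 0 else 0

# XOR: not linearly separable, but separable with an RBF kernel
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 1, 1, 0]
alpha, b = fit_klr(X, y)
preds = [predict(x, X, alpha, b) for x in X]  # [0, 1, 1, 0]
```

The computational cost the abstract mentions is visible here: the Gram matrix and per-iteration updates scale with the square of the number of training points, which is why sparse approximations matter for protein-scale data.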
Model building strategy for logistic regression: purposeful selection.
Zhang, Zhongheng
2016-03-01
Logistic regression is one of the most commonly used models to account for confounders in the medical literature. This article shows how to perform the purposeful-selection model-building strategy in R. I stress the use of the likelihood ratio test to assess whether deleting a variable has a significant impact on model fit. A deleted variable should also be checked to see whether it is an important adjustment of the remaining covariates. Interactions should be examined to disentangle complex relationships between covariates and their synergistic effect on the response variable. The model should be checked for goodness of fit (GOF), in other words, how well the fitted model reflects the real data. The Hosmer-Lemeshow test is the most widely used GOF test for logistic regression models.
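The likelihood ratio test at the core of purposeful selection compares the log-likelihoods of two nested models. A minimal sketch for the single-dropped-variable case (df = 1), using the identity P(chi-square_1 > G) = erfc(sqrt(G/2)); the log-likelihood values below are hypothetical:

```python
import math

def likelihood_ratio_test(loglik_full, loglik_reduced, df=1):
    """G = 2*(ll_full - ll_reduced) is chi-square(df) under H0 that the
    dropped variable contributes nothing to model fit."""
    G = 2.0 * (loglik_full - loglik_reduced)
    if df != 1:
        raise NotImplementedError("sketch covers the single-variable case")
    p_value = math.erfc(math.sqrt(G / 2.0))  # chi-square(1) tail probability
    return G, p_value

# hypothetical fitted log-likelihoods; G lands at the 5% critical value 3.84
G, p = likelihood_ratio_test(-100.0, -101.92)  # G = 3.84, p ~ 0.05
```

In R the same comparison is what `anova(reduced, full, test = "LRT")` reports for two nested `glm` fits.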
Sugarcane Land Classification with Satellite Imagery using Logistic Regression Model
NASA Astrophysics Data System (ADS)
Henry, F.; Herwindiati, D. E.; Mulyono, S.; Hendryli, J.
2017-03-01
This paper discusses the classification of sugarcane plantation areas from Landsat-8 satellite imagery. The classification process uses the binary logistic regression method with time series of normalized difference vegetation index values as input. The process is divided into two steps: training and classification. The purpose of the training step is to estimate the best parameters of the regression model using the gradient descent algorithm. The fitted model can then be used to classify sugarcane and non-sugarcane areas. The experiment shows high accuracy and successfully maps the sugarcane plantation area, with a best Cohen's kappa of 0.7833 (strong agreement) and 89.167% accuracy.
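The two-step procedure (gradient-descent training, then thresholding the fitted probabilities) can be sketched on toy data. The feature values below are hypothetical stand-ins for NDVI time-series summaries, not Landsat-8 data:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.1, iters=3000):
    """Fit weights and intercept by (stochastic) gradient descent on the
    log-loss: the 'training' step that identifies the model parameters."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(iters):
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

# toy stand-ins: higher NDVI features labeled sugarcane (1), lower non-sugarcane (0)
X = [[0.2, 0.1], [0.3, 0.2], [0.7, 0.6], [0.8, 0.7]]
y = [0, 0, 1, 1]
w, b = train_logistic(X, y)
# classification step: threshold the fitted probability at 0.5
labels = [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) > 0.5 else 0
          for xi in X]  # [0, 0, 1, 1]
```

In the real workflow each pixel's NDVI time series would supply the feature vector, and the thresholded output would form the classified map.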
Saddlepoint approximations for small sample logistic regression problems.
Platt, R W
2000-02-15
Double saddlepoint approximations provide quick and accurate approximations to exact conditional tail probabilities in a variety of situations. This paper describes the use of these approximations in two logistic regression problems. An investigation, via simulation studies, of regression analysis of the log-odds ratio in a sequence or set of 2×2 tables shows that in practical settings the saddlepoint methods closely approximate exact conditional inference. The double saddlepoint approximation for the test for trend in a sequence of binomial random variates is also shown, via simulation studies, to be an effective approximation to exact conditional inference.
Logistic Regression Model on Antenna Control Unit Autotracking Mode
2015-10-20
412TW-PA-15240. Daniel T. Laird, Air Force Test Center, Edwards AFB, CA; October 20, 2015. Approved for public release; distribution is unlimited. ABSTRACT: Over the past several years
Classifying machinery condition using oil samples and binary logistic regression
NASA Astrophysics Data System (ADS)
Phillips, J.; Cripps, E.; Lau, John W.; Hodkiewicz, M. R.
2015-08-01
The era of big data has resulted in an explosion of condition monitoring information, and hence an increasing motivation to automate the costly and time-consuming human elements involved in the classification of machine health. When working with industry, it is important to build understanding of, and hence some trust in, the classification scheme among those who use the analysis to initiate maintenance tasks. "Black box" approaches such as artificial neural networks (ANNs) and support vector machines (SVMs) are typically difficult to interpret. In contrast, this paper argues that logistic regression offers easy interpretability to industry experts, providing insight into the drivers of the human classification process and into the ramifications of potential misclassification. Of course, accuracy is of foremost importance in any automated classification scheme, so we also provide a comparative study of the predictive performance of logistic regression, ANNs and SVMs. A real-world oil analysis data set from engines on mining trucks is presented, and using cross-validation we demonstrate that logistic regression out-performs the ANN and SVM approaches in terms of prediction for healthy/not-healthy engines.
Using logistic regression to estimate delay-discounting functions.
Wileyto, E Paul; Audrain-McGovern, Janet; Epstein, Leonard H; Lerman, Caryn
2004-02-01
The monetary choice questionnaire (MCQ) and similar computer tasks ask preference questions in order to ascertain indifference, the perceived equivalence of immediate versus larger delayed rewards. Indifference data are then fitted with a hyperbolic function, summarizing the decline in perceived value with delay time. We present a fitting method that estimates the hyperbolic parameter k directly from survey responses. Binary preferences are modeled as a function of time (X2) and a transformed reward ratio (X1), yielding logistic regression coefficients beta2 and beta1. The hyperbolic parameter emerges as k = beta2/beta1, where the logistic predicted p = .5 (the definition of indifference). The MCQ was administered to 1,073 adolescents and was scored using both standard and logistic methods. The means for ln(k) were similar (standard, -4.53; logistic, -4.51), and the results were highly correlated (rho = .973). Simulated MCQ data showed that k was unbiased, except where beta1 ≥ -1, indicating a vague survey response. Jackknife standard errors provided excellent coverage.
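The central identity, k = beta2/beta1 at the p = .5 indifference point (where the logistic predictor equals zero), is simple arithmetic once the coefficients are fitted. A sketch with hypothetical coefficient values:

```python
import math

def hyperbolic_k(beta1, beta2):
    """At indifference the logistic predictor is 0, giving k = beta2/beta1
    (per the regression parameterization described in the abstract)."""
    return beta2 / beta1

# hypothetical fitted coefficients (illustrative only)
beta1, beta2 = -3.0, -0.03
k = hyperbolic_k(beta1, beta2)   # 0.01
log_k = math.log(k)              # ~ -4.61, on the ln(k) scale reported above
# hyperbolic discounting of a delayed reward: V = A / (1 + k * delay)
V = 100.0 / (1 + k * 30)         # a 100-unit reward at delay 30 is worth ~76.9
```

Reporting ln(k) rather than k, as the abstract does, keeps the heavily right-skewed k estimates on an approximately symmetric scale.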
Detecting nonsense for Chinese comments based on logistic regression
NASA Astrophysics Data System (ADS)
Zhuolin, Ren; Guang, Chen; Shu, Chen
2016-07-01
To understand cyber citizens' opinions accurately from Chinese news comments, a clear definition of nonsense is presented, and a detection model based on logistic regression (LR) is proposed. The detection of nonsense can be treated as a binary-classification problem. In addition to traditional lexical features, we propose three kinds of features in terms of emotion, structure and relevance. With these features, we train an LR model and demonstrate its effectiveness in understanding Chinese news comments. We find that each of the proposed features significantly improves the result. In our experiments, we achieve a prediction accuracy of 84.3%, which improves on the 77.3% baseline by 7 percentage points.
ERIC Educational Resources Information Center
Ferrer, Alvaro J. Arce; Wang, Lin
This study compared the classification performance among parametric discriminant analysis, nonparametric discriminant analysis, and logistic regression in a two-group classification application. Field data from an organizational survey were analyzed and bootstrapped for additional exploration. The data were observed to depart from multivariate…
ERIC Educational Resources Information Center
French, Brian F.; Maller, Susan J.
2007-01-01
Two unresolved implementation issues with logistic regression (LR) for differential item functioning (DIF) detection include ability purification and effect size use. Purification is suggested to control inaccuracies in DIF detection as a result of DIF items in the ability estimate. Additionally, effect size use may be beneficial in controlling…
Landslide Hazard Mapping in Rwanda Using Logistic Regression
NASA Astrophysics Data System (ADS)
Piller, A.; Anderson, E.; Ballard, H.
2015-12-01
Landslides in the United States cause more than $1 billion in damages and 50 deaths per year (USGS 2014). Globally, the figures are much graver, yet monitoring, mapping and forecasting of these hazards are less than adequate. Seventy-five percent of the population of Rwanda earns a living from farming, mostly subsistence. Loss of farmland, housing, or life to landslides is a very real hazard, with impacts at the economic, social, and environmental levels. In a developing nation that faces challenges in tracking, cataloging, and predicting the numerous landslides that occur each year, satellite imagery and spatial analysis allow for remote study. We have focused on the development of a landslide inventory and a statistical methodology for assessing landslide hazards. Using logistic regression on approximately 30 candidate variables (e.g., slope, soil type, land cover) and a sample of over 200 landslides, we determine which variables are statistically most relevant to landslide occurrence in Rwanda. A preliminary predictive hazard map for Rwanda has been produced, using the variables selected by the logistic regression analysis.
Parental Vaccine Acceptance: A Logistic Regression Model Using Previsit Decisions.
Lee, Sara; Riley-Behringer, Maureen; Rose, Jeanmarie C; Meropol, Sharon B; Lazebnik, Rina
2016-10-26
This study explores how parents' intentions regarding vaccination prior to their children's visit were associated with actual vaccine acceptance. A convenience sample of parents accompanying 6-week-old to 17-year-old children completed a written survey at 2 pediatric practices. Using hierarchical logistic regression, for hospital-based participants (n = 216), vaccine refusal history (P < .01) and a vaccine decision made before the visit (P < .05) explained 87% of vaccine refusals. In community-based participants (n = 100), vaccine refusal history (P < .01) explained 81% of refusals. Over 1 in 5 parents changed their minds about vaccination during the visit. Thirty parents who were previous vaccine refusers accepted current vaccines, and 37 who had intended not to vaccinate chose vaccination. Twenty-nine parents without a refusal history declined vaccines, and 32 who did not intend to refuse before the visit declined vaccination. Future research should identify key factors to nudge parental decision making in favor of vaccination.
Subgroup finding via Bayesian additive regression trees.
Sivaganesan, Siva; Müller, Peter; Huang, Bin
2017-03-09
We provide a Bayesian decision theoretic approach to finding subgroups that have elevated treatment effects. Our approach separates the modeling of the response variable from the task of subgroup finding and allows a flexible modeling of the response variable irrespective of potential subgroups of interest. We use Bayesian additive regression trees to model the response variable and use a utility function defined in terms of a candidate subgroup and the predicted response for that subgroup. Subgroups are identified by maximizing the expected utility, where the expectation is taken with respect to the posterior predictive distribution of the response, and the maximization is carried out over an a priori specified set of candidate subgroups. Our approach allows subgroups based on both quantitative and categorical covariates. We illustrate the approach using a simulated data set and a real data set. Copyright © 2017 John Wiley & Sons, Ltd.
On the Usefulness of a Multilevel Logistic Regression Approach to Person-Fit Analysis
ERIC Educational Resources Information Center
Conijn, Judith M.; Emons, Wilco H. M.; van Assen, Marcel A. L. M.; Sijtsma, Klaas
2011-01-01
The logistic person response function (PRF) models the probability of a correct response as a function of the item locations. Reise (2000) proposed to use the slope parameter of the logistic PRF as a person-fit measure. He reformulated the logistic PRF model as a multilevel logistic regression model and estimated the PRF parameters from this…
The crux of the method: assumptions in ordinary least squares and logistic regression.
Long, Rebecca G
2008-10-01
Logistic regression has increasingly become the tool of choice when analyzing data with a binary dependent variable. While resources relating to the technique are widely available, clear discussions of why logistic regression should be used in place of ordinary least squares regression are difficult to find. The current paper compares and contrasts the assumptions of ordinary least squares with those of logistic regression and explains why logistic regression's looser assumptions make it adept at handling violations of the more important assumptions in ordinary least squares.
Risk factors for temporomandibular disorder: Binary logistic regression analysis
Magalhães, Bruno G.; de-Sousa, Stéphanie T.; de Mello, Victor V C.; da-Silva-Barbosa, André C.; de-Assis-Morais, Mariana P L.; Barbosa-Vasconcelos, Márcia M V.
2014-01-01
Objectives: To analyze the influence of socioeconomic and demographic factors (gender, economic class, age and marital status) on the occurrence of temporomandibular disorder. Study Design: One hundred individuals from urban areas in the city of Recife (Brazil) registered at Family Health Units were examined using Axis I of the Research Diagnostic Criteria for Temporomandibular Disorders (RDC/TMD), which addresses myofascial pain and joint problems (disc displacement, arthralgia, osteoarthritis and osteoarthrosis). The Brazilian Economic Classification Criteria (CCEB) was used for the collection of socioeconomic and demographic data, categorized as Class A (high social class), Classes B/C (middle class) and Classes D/E (very poor social class). The results were analyzed using Pearson's chi-square test for proportions, Fisher's exact test, the nonparametric Mann-Whitney test and binary logistic regression analysis. Results: None of the participants belonged to Class A, 72% belonged to Classes B/C and 28% belonged to Classes D/E. The multivariate analysis revealed that participants from Classes D/E had a 4.35-fold greater chance of exhibiting myofascial pain and an 11.3-fold greater chance of exhibiting joint problems. Conclusions: Poverty is an important condition associated with myofascial pain and joint problems. Key words: Temporomandibular joint disorders, risk factors, prevalence. PMID:24316706
Application of Bayesian logistic regression to mining biomedical data.
Avali, Viji R; Cooper, Gregory F; Gopalakrishnan, Vanathi
2014-01-01
Mining high dimensional biomedical data with existing classifiers is challenging and the predictions are often inaccurate. We investigated the use of Bayesian Logistic Regression (B-LR) for mining such data to predict and classify various disease conditions. The analysis was done on twelve biomedical datasets with binary class variables and the performance of B-LR was compared to those from other popular classifiers on these datasets with 10-fold cross validation using the WEKA data mining toolkit. The statistical significance of the results was analyzed by paired two tailed t-tests and non-parametric Wilcoxon signed-rank tests. We observed overall that B-LR with non-informative Gaussian priors performed on par with other classifiers in terms of accuracy, balanced accuracy and AUC. These results suggest that it is worthwhile to explore the application of B-LR to predictive modeling tasks in bioinformatics using informative biological prior probabilities. With informative prior probabilities, we conjecture that the performance of B-LR will improve.
ERIC Educational Resources Information Center
Chen, Chau-Kuang
2005-01-01
Logistic and Cox regression methods are practical tools used to model the relationships between certain student learning outcomes and their relevant explanatory variables. The logistic regression model fits an S-shaped curve into a binary outcome with data points of zero and one. The Cox regression model allows investigators to study the duration…
A comparative study on entrepreneurial attitudes modeled with logistic regression and Bayes nets.
López Puga, Jorge; García García, Juan
2012-11-01
Entrepreneurship research is receiving increasing attention in our context, as entrepreneurs are key social agents involved in economic development. We compare the success of the dichotomous logistic regression model and the simple Bayes classifier in predicting entrepreneurship, after manipulating the percentage of missing data and the level of categorization in predictors. A sample of undergraduate university students (N = 1230) completed five scales (motivation, attitude towards business creation, obstacles, deficiencies, and training needs), and we found that each of them predicted different aspects of the tendency to business creation. Additionally, our results show that the receiver operating characteristic (ROC) curve is affected by the rate of missing data in both techniques, but logistic regression seems to be more vulnerable to missing data, whereas Bayes nets underperform slightly when categorization has been manipulated. Our study sheds light on the potential entrepreneur profile, and we propose Bayesian networks as an additional alternative to overcome the weaknesses of logistic regression when missing data are present in applied research.
Algamal, Zakariya Yahya; Lee, Muhammad Hisyam
2015-12-01
Cancer classification and gene selection in high-dimensional data have been popular research topics in genetics and molecular biology. Recently, adaptive regularized logistic regression using the elastic net regularization, called the adaptive elastic net, has been successfully applied in high-dimensional cancer classification to tackle estimating the gene coefficients and performing gene selection simultaneously. The adaptive elastic net originally used elastic net estimates as the initial weights; however, using these weights may not be preferable for two reasons: first, the elastic net estimator is biased in selecting genes; second, it does not perform well when the pairwise correlations between variables are not high. Adjusted adaptive regularized logistic regression (AAElastic) is proposed to address these issues while encouraging a grouping effect. The real-data results indicate that AAElastic is significantly more consistent in selecting genes than the three competitor regularization methods. Additionally, the classification performance of AAElastic is comparable to the adaptive elastic net and better than the other regularization methods. Thus, we can conclude that AAElastic is a reliable adaptive regularized logistic regression method for high-dimensional cancer classification.
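A generic elastic-net-penalized logistic regression (the base technique the paper builds on, not AAElastic itself) can be sketched with proximal gradient descent, where the soft-thresholding step produces the sparsity used for gene selection. The data and penalty weights below are hypothetical:

```python
import math

def soft_threshold(v, t):
    return math.copysign(max(abs(v) - t, 0.0), v)

def elastic_net_logistic(X, y, lam1=0.05, lam2=0.05, lr=0.2, iters=4000):
    """Proximal gradient (ISTA) for logistic loss + lam1*||w||_1 + lam2*||w||_2^2.
    The L1 term zeroes out uninformative features (gene selection); the L2
    term stabilizes groups of correlated features."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(iters):
        p = [1.0 / (1.0 + math.exp(-(sum(wj * xj for wj, xj in zip(w, xi)) + b)))
             for xi in X]
        g = [p[i] - y[i] for i in range(n)]
        for j in range(d):
            grad = sum(g[i] * X[i][j] for i in range(n)) / n + 2 * lam2 * w[j]
            w[j] = soft_threshold(w[j] - lr * grad, lr * lam1)  # L1 proximal step
        b -= lr * sum(g) / n
    return w, b

# toy "expression" data: feature 0 is informative, feature 1 is noise
X = [[1.0, 0.3], [0.9, -0.2], [-1.0, 0.1], [-0.8, -0.4]]
y = [1, 1, 0, 0]
w, b = elastic_net_logistic(X, y)
# expect |w[0]| clearly larger than |w[1]|; the noise weight is typically
# shrunk toward (often exactly to) zero by the L1 proximal step
```

The adaptive variants discussed in the abstract replace the uniform lam1 with per-coefficient weights derived from an initial estimate, which is exactly where the choice of initial weights matters.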
What Are the Odds of that? A Primer on Understanding Logistic Regression
ERIC Educational Resources Information Center
Huang, Francis L.; Moon, Tonya R.
2013-01-01
The purpose of this Methodological Brief is to present a brief primer on logistic regression, a commonly used technique when modeling dichotomous outcomes. Using data from the National Education Longitudinal Study of 1988 (NELS:88), logistic regression techniques were used to investigate student-level variables in eighth grade (i.e., enrolled in a…
Power and Sample Size Calculations for Logistic Regression Tests for Differential Item Functioning
ERIC Educational Resources Information Center
Li, Zhushan
2014-01-01
Logistic regression is a popular method for detecting uniform and nonuniform differential item functioning (DIF) effects. Theoretical formulas for the power and sample size calculations are derived for likelihood ratio tests and Wald tests based on the asymptotic distribution of the maximum likelihood estimators for the logistic regression model.…
ERIC Educational Resources Information Center
Koon, Sharon; Petscher, Yaacov
2015-01-01
The purpose of this report was to explicate the use of logistic regression and classification and regression tree (CART) analysis in the development of early warning systems. It was motivated by state education leaders' interest in maintaining high classification accuracy while simultaneously improving practitioner understanding of the rules by…
Multiple Imputation of a Randomly Censored Covariate Improves Logistic Regression Analysis.
Atem, Folefac D; Qian, Jing; Maye, Jacqueline E; Johnson, Keith A; Betensky, Rebecca A
2016-01-01
Randomly censored covariates arise frequently in epidemiologic studies. The most commonly used methods, including complete case and single imputation or substitution, suffer from inefficiency and bias. They make strong parametric assumptions or they consider limit of detection censoring only. We employ multiple imputation, in conjunction with semi-parametric modeling of the censored covariate, to overcome these shortcomings and to facilitate robust estimation. We develop a multiple imputation approach for randomly censored covariates within the framework of a logistic regression model. We use the non-parametric estimate of the covariate distribution or the semiparametric Cox model estimate in the presence of additional covariates in the model. We evaluate this procedure in simulations, and compare its operating characteristics to those from the complete case analysis and a survival regression approach. We apply the procedures to an Alzheimer's study of the association between amyloid positivity and maternal age of onset of dementia. Multiple imputation achieves lower standard errors and higher power than the complete case approach under heavy and moderate censoring and is comparable under light censoring. The survival regression approach achieves the highest power among all procedures, but does not produce interpretable estimates of association. Multiple imputation offers a favorable alternative to complete case analysis and ad hoc substitution methods in the presence of randomly censored covariates within the framework of logistic regression.
Schörgendorfer, Angela; Branscum, Adam J; Hanson, Timothy E
2013-06-01
Logistic regression is a popular tool for risk analysis in medical and population health science. With continuous response data, it is common to create a dichotomous outcome for logistic regression analysis by specifying a threshold for positivity. Fitting a linear regression to the nondichotomized response variable assuming a logistic sampling model for the data has been empirically shown to yield more efficient estimates of odds ratios than ordinary logistic regression of the dichotomized endpoint. We illustrate that risk inference is not robust to departures from the parametric logistic distribution. Moreover, the model assumption of proportional odds is generally not satisfied when the condition of a logistic distribution for the data is violated, leading to biased inference from a parametric logistic analysis. We develop novel Bayesian semiparametric methodology for testing goodness of fit of parametric logistic regression with continuous measurement data. The testing procedures hold for any cutoff threshold and our approach simultaneously provides the ability to perform semiparametric risk estimation. Bayes factors are calculated using the Savage-Dickey ratio for testing the null hypothesis of logistic regression versus a semiparametric generalization. We propose a fully Bayesian and a computationally efficient empirical Bayesian approach to testing, and we present methods for semiparametric estimation of risks, relative risks, and odds ratios when parametric logistic regression fails. Theoretical results establish the consistency of the empirical Bayes test. Results from simulated data show that the proposed approach provides accurate inference irrespective of whether parametric assumptions hold or not. Evaluation of risk factors for obesity shows that different inferences are derived from an analysis of a real data set when deviations from a logistic distribution are permissible in a flexible semiparametric framework.
Stiglic, Gregor; Povalej Brzan, Petra; Fijacko, Nino; Wang, Fei; Delibasic, Boris; Kalousis, Alexandros; Obradovic, Zoran
2015-01-01
Different studies have demonstrated the importance of comorbidities to better understand the origin and evolution of medical complications. This study focuses on improvement of the predictive model interpretability based on simple logical features representing comorbidities. We use group lasso based feature interaction discovery followed by a post-processing step, where simple logic terms are added. In the final step, we reduce the feature set by applying lasso logistic regression to obtain a compact set of non-zero coefficients that represent a more comprehensible predictive model. The effectiveness of the proposed approach was demonstrated on a pediatric hospital discharge dataset that was used to build a readmission risk estimation model. The evaluation of the proposed method demonstrates a reduction of the initial set of features in a regression model by 72%, with a slight improvement in the Area Under the ROC Curve metric from 0.763 (95% CI: 0.755–0.771) to 0.769 (95% CI: 0.761–0.777). Additionally, our results show improvement in comprehensibility of the final predictive model using simple comorbidity based terms for logistic regression. PMID:26645087
A secure distributed logistic regression protocol for the detection of rare adverse drug events
El Emam, Khaled; Samet, Saeed; Arbuckle, Luk; Tamblyn, Robyn; Earle, Craig; Kantarcioglu, Murat
2013-01-01
Background There is limited capacity to assess the comparative risks of medications after they enter the market. For rare adverse events, the pooling of data from multiple sources is necessary to have the power and sufficient population heterogeneity to detect differences in safety and effectiveness in genetic, ethnic and clinically defined subpopulations. However, combining datasets from different data custodians or jurisdictions to perform an analysis on the pooled data creates significant privacy concerns that would need to be addressed. Existing protocols for addressing these concerns can result in reduced analysis accuracy and can allow sensitive information to leak. Objective To develop a secure distributed multi-party computation protocol for logistic regression that provides strong privacy guarantees. Methods We developed a secure distributed logistic regression protocol using a single analysis center with multiple sites providing data. A theoretical security analysis demonstrates that the protocol is robust to plausible collusion attacks and does not allow the parties to gain new information from the data that are exchanged among them. The computational performance and accuracy of the protocol were evaluated on simulated datasets. Results The computational performance scales linearly as the dataset sizes increase. The addition of sites results in an exponential growth in computation time. However, for up to five sites, the time is still short and would not affect practical applications. The model parameters are the same as the results on pooled raw data analyzed in SAS, demonstrating high model accuracy. Conclusion The proposed protocol and prototype system would allow the development of logistic regression models in a secure manner without requiring the sharing of personal health information. This can alleviate one of the key barriers to the establishment of large-scale post-marketing surveillance programs. We extended the secure protocol to account for
Rastegari, Azam; Haghdoost, Ali Akbar; Baneshi, Mohammad Reza
2013-01-01
Background Due to the importance of medical studies, researchers in this field should be familiar with various types of statistical analyses in order to select the most appropriate method for the characteristics of their data sets. Classification and regression trees (CARTs) can serve as a complement to regression models. We compared the performance of a logistic regression model and a CART in predicting drug injection among prisoners. Methods Data from 2720 Iranian prisoners were studied to determine the factors influencing drug injection. The collected data were divided into two groups, training and testing. A logistic regression model and a CART were applied to the training data, and the performance of the two models was then evaluated on the testing data. Findings The regression model and the CART had 8 and 4 significant variables, respectively. Overall, heroin use, history of imprisonment, age at first drug use, and marital status were important factors in determining the history of drug injection. Subjects without a history of heroin use, and heroin users with short-term imprisonment, were at lower risk of drug injection. Among heroin addicts with long-term imprisonment, individuals with a higher age at first drug use and married subjects were at lower risk of drug injection. Although the logistic regression model was more sensitive than the CART, the two models had the same levels of specificity and classification accuracy. Conclusion In this study, both sensitivity and specificity were important. While the logistic regression model had better performance, the graphical presentation of the CART simplifies the interpretation of the results. In general, a combination of different analytical methods is recommended to explore the effects of variables. PMID:24494152
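The three comparison criteria used above (sensitivity, specificity, and classification accuracy) follow directly from a confusion matrix. A sketch with hypothetical counts, not the study's data:

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # true positives correctly flagged
    specificity = tn / (tn + fp)            # true negatives correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# hypothetical counts (illustrative only)
sens, spec, acc = classification_metrics(tp=80, fn=20, tn=90, fp=10)
# sens = 0.80, spec = 0.90, acc = 0.85
```

Computing these on held-out test data, as the study does, guards against the optimistic bias of evaluating on the training set.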
Sub-resolution assist feature (SRAF) printing prediction using logistic regression
NASA Astrophysics Data System (ADS)
Tan, Chin Boon; Koh, Kar Kit; Zhang, Dongqing; Foong, Yee Mei
2015-03-01
In optical proximity correction (OPC), sub-resolution assist features (SRAFs) have been used to enhance the process window of main structures. However, the printing of SRAFs on the wafer is undesirable, as it may degrade the overall process yield if transferred into the final pattern. A reasonably accurate prediction model is needed during OPC to ensure that the SRAF placement and size carry no risk of SRAF printing. Current common practice in OPC is to use either the main OPC model or a model threshold adjustment (MTA) solution to predict SRAF printing. This paper studies the feasibility of SRAF printing prediction using logistic regression (LR). Logistic regression is a probabilistic classification model that gives discrete binary outputs after receiving sufficient input variables from SRAF printing conditions. In the application of SRAF printing prediction, the binary outputs can be treated as 1 for SRAF-printing and 0 for no-SRAF-printing. The experimental work was performed on a 20nm line/space process layer. The results demonstrate that the accuracy of SRAF printing prediction using the LR approach outperforms the MTA solution: overall error rates as low as 2% in calibration and 5% in verification were achieved by the LR approach, compared with 6% in calibration and 15% in verification for the MTA solution. In addition, the performance of the LR approach was found to be relatively independent of, and consistent across, different resist image planes compared to the MTA solution.
Predicting students' success at pre-university studies using linear and logistic regressions
NASA Astrophysics Data System (ADS)
Suliman, Noor Azizah; Abidin, Basir; Manan, Norhafizah Abdul; Razali, Ahmad Mahir
2014-09-01
The study aimed to find the most suitable model for predicting students' success at the medical pre-university studies of the Centre for Foundation in Science, Languages and General Studies of Cyberjaya University College of Medical Sciences (CUCMS). The predictors under investigation were achievements in the national high school exit examination, Sijil Pelajaran Malaysia (SPM), namely the Biology, Chemistry, Physics, Additional Mathematics, Mathematics, English and Bahasa Malaysia results, as well as gender and high school background factors. The outcomes showed a significant difference by gender in the final CGPA and in the Biology and Mathematics subjects at pre-university, and a significant difference by high school background for the Mathematics subject. In general, the correlation between academic achievement at high school and at medical pre-university is moderately significant at an α-level of 0.05, except for the language subjects. It was also found that logistic regression techniques gave better prediction models than the multiple linear regression technique for this data set. The developed logistic models were able to give probabilities that closely matched the actual outcomes. Hence, they could be used to identify successful students who are qualified to enter the CUCMS medical faculty before accepting any students to its foundation program.
An application of a microcomputer compiler program to multiple logistic regression analysis.
Sakai, R
1988-01-01
Microcomputer programs for multiple logistic regression analysis were written in the BASIC language to determine the usefulness of microcomputers for multivariate analysis, which is an important method in epidemiological studies. The program, run under an interpreter system, required a comparatively long computing time for a small amount of data. For example, it took approximately thirty minutes to compute the data of 6 independent variables and 63 matched sets of cases and controls (1:4). The majority of the calculation time was spent computing a matrix. The matrix computation time increased in proportion to the number of subjects, and increased exponentially with the number of variables. A BASIC compiler was then utilized for the program of multiple logistic regression analysis. The compiled program carried out the same computations as above, but within 4 minutes. Therefore, it is evident that a compiler can be an extremely convenient tool for multivariate analysis. The two programs produced here were also easily linked with spreadsheet packages to enter data.
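To illustrate the kind of computation such a program performs, here is a minimal Newton-Raphson fit of a one-predictor logistic model in pure Python; the data and function names are illustrative, not taken from the original BASIC code. The closed-form 2x2 information-matrix solve mirrors the matrix arithmetic that dominated the run time reported above.

```python
import math

def fit_logistic(xs, ys, iters=25):
    """Fit p(y=1|x) = 1/(1+exp(-(b0 + b1*x))) by Newton-Raphson.

    Illustrative sketch: intercept b0 and slope b1 only, so the
    Hessian is 2x2 and can be inverted in closed form.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # observed information (negative Hessian)
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)    # IRLS weight
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if abs(det) < 1e-12:
            break
        # Newton step: beta += H^{-1} g
        b0 += ( h11 * g0 - h01 * g1) / det
        b1 += (-h01 * g0 + h00 * g1) / det
    return b0, b1
```

The cost per iteration is dominated by accumulating and solving the information matrix, which is why the matrix step scales with both subjects and variables in the abstract's timing observations.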
Comparison of artificial neural networks with logistic regression for detection of obesity.
Heydari, Seyed Taghi; Ayatollahi, Seyed Mohammad Taghi; Zare, Najaf
2012-08-01
Obesity is a common problem in nutrition, both in developed and developing countries. The aim of this study was to classify obesity by artificial neural networks and logistic regression. This cross-sectional study comprised 414 healthy military personnel in southern Iran. All subjects completed questionnaires on their socio-economic status, and their anthropometric measures were taken by a trained nurse. Classification of obesity was done by artificial neural networks and logistic regression. The mean age ± SD of participants was 34.4 ± 7.5 years. A total of 187 (45.2%) were obese. For logistic regression and the neural networks, respectively, 80.2% and 81.2% of cases were correctly classified, sensitivity was 80.2 and 79.7, and specificity was 81.9 and 83.7, while the areas under the Receiver-Operating Characteristic (ROC) curve were 0.888 and 0.884 and the Kappa statistics were 0.600 and 0.629. We conclude that the neural networks and logistic regression were both good classifiers for obesity detection, but they were not significantly different in classification performance.
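The accuracy, sensitivity, specificity and Kappa figures reported in studies like this one are all derived from a 2x2 confusion matrix. A minimal sketch, using illustrative counts rather than the study's data:

```python
def classification_summary(tp, fp, tn, fn):
    """Sensitivity, specificity, accuracy and Cohen's kappa
    from the four cells of a 2x2 confusion matrix."""
    n = tp + fp + tn + fn
    sens = tp / (tp + fn)          # true-positive rate
    spec = tn / (tn + fp)          # true-negative rate
    acc = (tp + tn) / n
    # chance agreement expected from the marginal totals
    p_yes = ((tp + fp) / n) * ((tp + fn) / n)
    p_no = ((tn + fn) / n) * ((tn + fp) / n)
    pe = p_yes + p_no
    kappa = (acc - pe) / (1 - pe)  # agreement beyond chance
    return sens, spec, acc, kappa
```

Kappa corrects raw accuracy for the agreement expected by chance, which is why two classifiers with near-identical accuracy (as above) can still differ slightly in Kappa.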
Binary logistic regression-Instrument for assessing museum indoor air impact on exhibits.
Bucur, Elena; Danet, Andrei Florin; Lehr, Carol Blaziu; Lehr, Elena; Nita-Lazar, Mihai
2017-04-01
This paper presents a new way to assess the environmental impact on historical artifacts using binary logistic regression. The prediction of the impact on the exhibits under certain pollution scenarios (environmental impact) was calculated by a mathematical model based on binary logistic regression; it allows the identification of those environmental parameters, from a multitude of possible parameters, that have a significant impact on exhibits, and ranks them according to the severity of their effect. Air quality (NO2, SO2, O3 and PM2.5) and microclimate parameters (temperature, humidity) monitoring data from a case study conducted within exhibition and storage spaces of the Romanian National Aviation Museum Bucharest were used for developing and validating the binary logistic regression method and the mathematical model. The logistic regression analysis was applied to 794 data combinations (715 to develop the model and 79 to validate it) using the Statistical Package for the Social Sciences (SPSS 20.0). The results of the binary logistic regression analysis demonstrated that, of the six parameters taken into consideration, four have a significant effect upon exhibits, in the following order: O3 > PM2.5 > NO2 > humidity, followed at a significant distance by the effects of SO2 and temperature. The mathematical model developed in this study correctly predicted 95.1% of the cumulated effect of the environmental parameters upon the exhibits. Moreover, this model could also be used in the decisional process regarding the preventive preservation measures that should be implemented within the exhibition space.
NASA Astrophysics Data System (ADS)
Kamaruddin, Ainur Amira; Ali, Zalila; Noor, Norlida Mohd.; Baharum, Adam; Ahmad, Wan Muhamad Amir W.
2014-07-01
Logistic regression analysis examines the influence of various factors on a dichotomous outcome by estimating the probability of the event's occurrence. Logistic regression, also called a logit model, is a statistical procedure used to model dichotomous outcomes. In the logit model, the log odds of the dichotomous outcome is modeled as a linear combination of the predictor variables. The log odds ratio in logistic regression provides a description of the probabilistic relationship between the variables and the outcome. In conducting logistic regression, selection procedures are used to select important predictor variables; diagnostics are used to check that assumptions are valid, including independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers; and a test statistic is calculated to determine the aptness of the model. This study used the binary logistic regression model to investigate overweight and obesity among rural secondary school students on the basis of their demographic profile, medical history, diet and lifestyle. The results indicate that overweight and obesity of students are influenced by obesity in the family and by the interaction between a student's ethnicity and routine meal intake. The odds of a student being overweight or obese are higher for a student with a family history of obesity and for a non-Malay student who frequently takes routine meals, as compared to a Malay student.
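The log-odds and odds-ratio interpretation described above can be sketched in a few lines; the coefficients below are hypothetical, not estimates from the study:

```python
import math

def log_odds(coefs, x):
    """Linear predictor: intercept plus the weighted covariates,
    i.e. the log odds of the outcome given covariate vector x."""
    return coefs[0] + sum(b * xi for b, xi in zip(coefs[1:], x))

def probability(coefs, x):
    """Invert the logit to recover the event probability."""
    return 1.0 / (1.0 + math.exp(-log_odds(coefs, x)))

def odds_ratio(b):
    """exp(b): multiplicative change in the odds for a one-unit
    increase in a predictor (or presence vs absence of a factor)."""
    return math.exp(b)
```

A coefficient of 0 gives an odds ratio of 1 (no effect); a positive coefficient on, say, a family-history indicator gives an odds ratio above 1, matching the direction of the findings reported above.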
NASA Astrophysics Data System (ADS)
Junek, W. N.; Jones, W. L.; Woods, M. T.
2011-12-01
An automated event tree analysis system for estimating the probability of short term volcanic activity is presented. The algorithm is driven by a suite of empirical statistical models that are derived through logistic regression. Each model is constructed from a multidisciplinary dataset that was assembled from a collection of historic volcanic unrest episodes. The dataset consists of monitoring measurements (e.g. InSAR, seismic), source modeling results, and historic eruption activity. This provides a simple mechanism for simultaneously accounting for the geophysical changes occurring within the volcano and the historic behavior of analog volcanoes. The algorithm is extensible and can be easily recalibrated to include new or additional monitoring, modeling, or historic information. Standard cross validation techniques are employed to optimize its forecasting capabilities. Analysis results from several recent volcanic unrest episodes are presented.
Empirical Bayesian LASSO-logistic regression for multiple binary trait locus mapping
2013-01-01
Background Complex binary traits are influenced by many factors, including the main effects of many quantitative trait loci (QTLs), epistatic effects involving more than one QTL, environmental effects and the effects of gene-environment interactions. Although a number of QTL mapping methods for binary traits have been developed, an efficient and powerful method that can handle both main and epistatic effects of a relatively large number of possible QTLs is still lacking. Results In this paper, we use a Bayesian logistic regression model as the QTL model for binary traits that includes both main and epistatic effects. Our logistic regression model employs hierarchical priors for regression coefficients similar to the ones used in the Bayesian LASSO linear model for multiple QTL mapping for continuous traits. We develop efficient empirical Bayesian algorithms to infer the logistic regression model. Our simulation study shows that our algorithms can easily handle a QTL model with a large number of main and epistatic effects on a personal computer, and outperform five other methods examined, including the LASSO, HyperLasso, BhGLM, RVM and the single-QTL mapping method based on logistic regression, in terms of power of detection and false positive rate. The utility of our algorithms is also demonstrated through analysis of a real data set. A software package implementing the empirical Bayesian algorithms in this paper is freely available upon request. Conclusions The EBLASSO logistic regression method can handle a large number of effects, possibly including main and epistatic QTL effects, environmental effects and the effects of gene-environment interactions. It will be a very useful tool for multiple QTL mapping for complex binary traits. PMID:23410082
Use and interpretation of logistic regression in habitat-selection studies
Keating, Kim A.; Cherry, Steve
2004-01-01
Logistic regression is an important tool for wildlife habitat-selection studies, but the method frequently has been misapplied due to an inadequate understanding of the logistic model, its interpretation, and the influence of sampling design. To promote better use of this method, we review its application and interpretation under 3 sampling designs: random, case-control, and use-availability. Logistic regression is appropriate for habitat use-nonuse studies employing random sampling and can be used to directly model the conditional probability of use in such cases. Logistic regression also is appropriate for studies employing case-control sampling designs, but careful attention is required to interpret results correctly. Unless bias can be estimated or probability of use is small for all habitats, results of case-control studies should be interpreted as odds ratios, rather than probability of use or relative probability of use. When data are gathered under a use-availability design, logistic regression can be used to estimate approximate odds ratios if probability of use is small, at least on average. More generally, however, logistic regression is inappropriate for modeling habitat selection in use-availability studies. In particular, using logistic regression to fit the exponential model of Manly et al. (2002:100) does not guarantee maximum-likelihood estimates, valid probabilities, or valid likelihoods. We show that the resource selection function (RSF) commonly used for the exponential model is proportional to a logistic discriminant function. Thus, it may be used to rank habitats with respect to probability of use and to identify important habitat characteristics or their surrogates, but it is not guaranteed to be proportional to probability of use. Other problems associated with the exponential model also are discussed. We describe an alternative model based on Lancaster and Imbens (1996) that offers a method for estimating conditional probability of use in…
ERIC Educational Resources Information Center
Ozechowski, Timothy J.; Turner, Charles W.; Hops, Hyman
2007-01-01
This article demonstrates the use of mixed-effects logistic regression (MLR) for conducting sequential analyses of binary observational data. MLR is a special case of the mixed-effects logit modeling framework, which may be applied to multicategorical observational data. The MLR approach is motivated in part by G. A. Dagne, G. W. Howe, C. H.…
Detecting DIF in Polytomous Items Using MACS, IRT and Ordinal Logistic Regression
ERIC Educational Resources Information Center
Elosua, Paula; Wells, Craig
2013-01-01
The purpose of the present study was to compare the Type I error rate and power of two model-based procedures, the mean and covariance structure model (MACS) and the item response theory (IRT), and an observed-score based procedure, ordinal logistic regression, for detecting differential item functioning (DIF) in polytomous items. A simulation…
ERIC Educational Resources Information Center
Rudner, Lawrence
2016-01-01
In the machine learning literature, it is commonly accepted as fact that as calibration sample sizes increase, Naïve Bayes classifiers initially outperform Logistic Regression classifiers in terms of classification accuracy. Applied to subtests from an on-line final examination and from a highly regarded certification examination, this study shows…
A Note on Three Statistical Tests in the Logistic Regression DIF Procedure
ERIC Educational Resources Information Center
Paek, Insu
2012-01-01
Although logistic regression became one of the well-known methods in detecting differential item functioning (DIF), its three statistical tests, the Wald, likelihood ratio (LR), and score tests, which are readily available under the maximum likelihood, do not seem to be consistently distinguished in DIF literature. This paper provides a clarifying…
Comparison of IRT Likelihood Ratio Test and Logistic Regression DIF Detection Procedures
ERIC Educational Resources Information Center
Atar, Burcu; Kamata, Akihito
2011-01-01
The Type I error rates and the power of IRT likelihood ratio test and cumulative logit ordinal logistic regression procedures in detecting differential item functioning (DIF) for polytomously scored items were investigated in this Monte Carlo simulation study. For this purpose, 54 simulation conditions (combinations of 3 sample sizes, 2 sample…
ERIC Educational Resources Information Center
Anderson, Carolyn J.; Verkuilen, Jay; Peyton, Buddy L.
2010-01-01
Survey items with multiple response categories and multiple-choice test questions are ubiquitous in psychological and educational research. We illustrate the use of log-multiplicative association (LMA) models that are extensions of the well-known multinomial logistic regression model for multiple dependent outcome variables to reanalyze a set of…
Risk Factors of Falls in Community-Dwelling Older Adults: Logistic Regression Tree Analysis
ERIC Educational Resources Information Center
Yamashita, Takashi; Noe, Douglas A.; Bailer, A. John
2012-01-01
Purpose of the Study: A novel logistic regression tree-based method was applied to identify fall risk factors and possible interaction effects of those risk factors. Design and Methods: A nationally representative sample of American older adults aged 65 years and older (N = 9,592) in the Health and Retirement Study 2004 and 2006 modules was used.…
Comparison of a Bayesian Network with a Logistic Regression Model to Forecast IgA Nephropathy
Ducher, Michel; Kalbacher, Emilie; Combarnous, François; Finaz de Vilaine, Jérome; McGregor, Brigitte; Fouque, Denis; Fauvel, Jean Pierre
2013-01-01
Models are increasingly used in clinical practice to improve the accuracy of diagnosis. The aim of our work was to compare a Bayesian network to logistic regression for forecasting IgA nephropathy (IgAN) from simple clinical and biological criteria. Retrospectively, we pooled the results of all biopsies (n = 155) performed by nephrologists in a specialist clinical facility between 2002 and 2009. Two groups were constituted at random. The first subgroup was used to determine the parameters of the models adjusted to the data by logistic regression or Bayesian network, and the second was used to compare the performances of the models using receiver operating characteristic (ROC) curves. IgAN was found (on pathology) in 44 patients. Areas under the ROC curves provided by both methods were highly significant but not different from each other. Based on the highest Youden indices, sensitivity reached 100% versus 67%, and specificity 73% versus 95%, using the Bayesian network and logistic regression, respectively. A Bayesian network is at least as efficient as logistic regression for estimating the probability of a patient suffering from IgAN, using simple clinical and biological data obtained during consultation. PMID:24328031
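The Youden index used above selects the ROC operating point that maximises sensitivity + specificity - 1. A minimal sketch with made-up scores and labels, not the study's data:

```python
def best_youden(scores, labels):
    """Pick the score threshold maximising Youden's J = sens + spec - 1.

    scores: predicted probabilities from a model;
    labels: 1 = disease present, 0 = absent (illustrative encoding).
    """
    pos = sum(labels)
    neg = len(labels) - pos
    best_j, best_t = -1.0, None
    for t in sorted(set(scores)):          # each candidate cut-off
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        sens = tp / pos
        spec = 1 - fp / neg
        j = sens + spec - 1
        if j > best_j:
            best_j, best_t = j, t
    return best_t, best_j
```

Because the index weighs sensitivity and specificity equally, two models with similar ROC areas can still land at very different sensitivity/specificity trade-offs at their respective optimal thresholds, exactly as in the comparison above.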
ERIC Educational Resources Information Center
Schumacher, Phyllis; Olinsky, Alan; Quinn, John; Smith, Richard
2010-01-01
The authors extended previous research by 2 of the authors who conducted a study designed to predict the successful completion of students enrolled in an actuarial program. They used logistic regression to determine the probability of an actuarial student graduating in the major or dropping out. They compared the results of this study with those…
Propensity Score Estimation with Data Mining Techniques: Alternatives to Logistic Regression
ERIC Educational Resources Information Center
Keller, Bryan S. B.; Kim, Jee-Seon; Steiner, Peter M.
2013-01-01
Propensity score analysis (PSA) is a methodological technique which may correct for selection bias in a quasi-experiment by modeling the selection process using observed covariates. Because logistic regression is well understood by researchers in a variety of fields and easy to implement in a number of popular software packages, it has…
Multiple Logistic Regression Analysis of Cigarette Use among High School Students
ERIC Educational Resources Information Center
Adwere-Boamah, Joseph
2011-01-01
A binary logistic regression analysis was performed to predict high school students' cigarette smoking behavior from selected predictors in the 2009 CDC Youth Risk Behavior Surveillance Survey. The specific target student behavior of interest was frequent cigarette use. Five predictor variables included in the model were: a) race, b) frequency of…
Monte Carlo Evaluation of Two-Level Logistic Regression for Assessing Person Fit
ERIC Educational Resources Information Center
Woods, Carol M.
2008-01-01
Person fit is the degree to which an item response model fits for individual examinees. Reise (2000) described how two-level logistic regression can be used to detect heterogeneity in person fit, evaluate potential predictors of person fit heterogeneity, and identify potentially aberrant individuals. The method has apparently never been applied to…
ERIC Educational Resources Information Center
Fidalgo, Angel M.; Alavi, Seyed Mohammad; Amirian, Seyed Mohammad Reza
2014-01-01
This study examines three controversial aspects in differential item functioning (DIF) detection by logistic regression (LR) models: first, the relative effectiveness of different analytical strategies for detecting DIF; second, the suitability of the Wald statistic for determining the statistical significance of the parameters of interest; and…
Comparison of Two Approaches for Handling Missing Covariates in Logistic Regression
ERIC Educational Resources Information Center
Peng, Chao-Ying Joanne; Zhu, Jin
2008-01-01
For the past 25 years, methodological advances have been made in missing data treatment. Most published work has focused on missing data in dependent variables under various conditions. The present study seeks to fill the void by comparing two approaches for handling missing data in categorical covariates in logistic regression: the…
ERIC Educational Resources Information Center
Courtney, Jon R.; Prophet, Retta
2011-01-01
Placement instability is often associated with a number of negative outcomes for children. To gain state level contextual knowledge of factors associated with placement stability/instability, logistic regression was applied to selected variables from the New Mexico Adoption and Foster Care Administrative Reporting System dataset. Predictors…
Predictive Discriminant Analysis Versus Logistic Regression in Two-Group Classification Problems.
ERIC Educational Resources Information Center
Meshbane, Alice; Morris, John D.
A method for comparing the cross-validated classification accuracies of predictive discriminant analysis and logistic regression classification models is presented under varying data conditions for the two-group classification problem. With this method, separate-group, as well as total-sample proportions of the correct classifications, can be…
Chen, Suduan; Goo, Yeong-Jia James; Shen, Zone-De
2014-01-01
As fraudulent financial statements of enterprises have become an increasingly serious problem, establishing a valid model for forecasting fraudulent financial statements has become an important question for academic research and financial practice. After screening the important variables using stepwise regression, the study fits logistic regression, support vector machine, and decision tree models and compares their classification performance. The study adopts financial and nonfinancial variables to assist in establishing the forecasting model. The research objects are companies in which fraudulent or nonfraudulent financial statements occurred between 1998 and 2012. The findings are that financial and nonfinancial information can be used effectively to distinguish fraudulent financial statements, and that the decision tree C5.0 achieves the best classification accuracy, at 85.71%. PMID:25302338
Demenais, F M
1991-01-01
Statistical models have been developed to delineate the major-gene and non-major-gene factors accounting for the familial aggregation of complex diseases. The mixed model assumes an underlying liability to the disease, to which a major gene, a multifactorial component, and random environment contribute independently. Affection is defined by a threshold on the liability scale. The regressive logistic models assume that the logarithm of the odds of being affected is a linear function of major genotype, phenotypes of antecedents and other covariates. An equivalence between these two approaches cannot be derived analytically. I propose a formulation of the regressive logistic models on the supposition of an underlying liability model of disease. Relatives are assumed to have correlated liabilities to the disease; affected persons have liabilities exceeding an estimable threshold. Under the assumption that the correlation structure of the relatives' liabilities follows a regressive model, the regression coefficients on antecedents are expressed in terms of the relevant familial correlations. A parsimonious parameterization is a consequence of the assumed liability model, and a one-to-one correspondence with the parameters of the mixed model can be established. The logits, derived under the class A regressive model and under the class D regressive model, can be extended to include a large variety of patterns of family dependence, as well as gene-environment interactions. PMID:1897524
NASA Astrophysics Data System (ADS)
Heckmann, Tobias; Gegg, Katharina; Becht, Michael
2013-04-01
Statistical approaches to landslide susceptibility modelling on the catchment and regional scale are used very frequently compared to heuristic and physically based approaches. In the present study, we deal with the problem of the optimal sample size for a logistic regression model. More specifically, a stepwise approach has been chosen in order to select those independent variables (from a number of derivatives of a digital elevation model and landcover data) that best explain the spatial distribution of debris flow initiation zones in two neighbouring central alpine catchments in Austria (used mutually for model calculation and validation). In order to minimise problems arising from spatial autocorrelation, we sample a single raster cell from each debris flow initiation zone within an inventory. In addition, as suggested by previous work using the "rare events logistic regression" approach, we take a sample of the remaining "non-event" raster cells. The recommendations given in the literature on the size of this sample appear to be motivated by practical considerations, e.g. the time and cost of acquiring data for non-event cases, which do not apply to the case of spatial data. In our study, we aim at finding empirically an "optimal" sample size in order to avoid two problems: First, a sample that is too large will violate the independent-sample assumption, as the independent variables are spatially autocorrelated; hence, a variogram analysis leads to a sample size threshold above which the average distance between sampled cells falls below the autocorrelation range of the independent variables. Second, if the sample is too small, repeated sampling will lead to very different results, i.e. the independent variables and hence the result of a single model calculation will be extremely dependent on the choice of non-event cells. Using a Monte Carlo analysis with stepwise logistic regression, 1000 models are calculated for a wide range of sample sizes. For each sample size…
Sze, N N; Wong, S C; Lee, C Y
2014-12-01
In the past several decades, many countries have set quantified road safety targets to motivate transport authorities to develop systematic road safety strategies and measures and to facilitate the achievement of continuous road safety improvement. Studies have been conducted to evaluate the association between the setting of quantified road safety targets and road fatality reduction, in both the short and long run, by comparing road fatalities before and after the implementation of a quantified road safety target. However, not much work has been done to evaluate whether the quantified road safety targets are actually achieved. In this study, we used a binary logistic regression model to examine the factors that contribute to a target's success, including vehicle ownership, fatality rate, and national income, in addition to the level of ambition and duration of the target. We analyzed 55 quantified road safety targets set by 29 countries from 1981 to 2009, and the results indicate that targets that were still in progress and had lower levels of ambition had a higher likelihood of eventually being achieved. Moreover, possible interaction effects on the association between the level of ambition and the likelihood of success are also revealed.
CPAT: Coding-Potential Assessment Tool using an alignment-free logistic regression model
Wang, Liguo; Park, Hyun Jung; Dasari, Surendra; Wang, Shengqin; Kocher, Jean-Pierre; Li, Wei
2013-01-01
Thousands of novel transcripts have been identified using deep transcriptome sequencing. This discovery of a large, ‘hidden’ transcriptome renews the demand for methods that can rapidly distinguish between coding and noncoding RNA. Here, we present a novel alignment-free method, the Coding Potential Assessment Tool (CPAT), which rapidly recognizes coding and noncoding transcripts from a large pool of candidates. To this end, CPAT uses a logistic regression model built with four sequence features: open reading frame size, open reading frame coverage, the Fickett TESTCODE statistic and hexamer usage bias. The CPAT software outperformed (sensitivity: 0.96, specificity: 0.97) other state-of-the-art alignment-based software such as the Coding-Potential Calculator (sensitivity: 0.99, specificity: 0.74) and Phylo Codon Substitution Frequencies (sensitivity: 0.90, specificity: 0.63). In addition to high accuracy, CPAT is approximately four orders of magnitude faster than the Coding-Potential Calculator and Phylo Codon Substitution Frequencies, enabling its users to process thousands of transcripts within seconds. The software accepts input sequences in either FASTA- or BED-formatted data files. We also developed a web interface for CPAT that allows users to submit sequences and receive the prediction results almost instantly. PMID:23335781
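Of the four sequence features named above, open reading frame size is the easiest to sketch. The following is a simplified illustration (forward strand only, first ATG per frame), not CPAT's actual implementation:

```python
def longest_orf_length(seq):
    """Length in nucleotides of the longest ATG..stop open reading
    frame on the forward strand -- a simplified stand-in for the
    ORF-size feature used by coding-potential classifiers."""
    stops = {"TAA", "TAG", "TGA"}
    seq = seq.upper()
    best = 0
    for frame in range(3):           # scan all three reading frames
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon == "ATG" and start is None:
                start = i            # first start codon in this frame
            elif codon in stops and start is not None:
                best = max(best, i + 3 - start)  # include stop codon
                start = None
    return best
```

A feature like this is then fed, alongside the other three, into the fitted logistic regression to produce a coding probability for each transcript.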
Phillips, Kirk T.; Street, W. Nick
2005-01-01
The purpose of this study is to determine the best prediction of heart failure outcomes resulting from two methods -- standard epidemiologic analysis with logistic regression and knowledge discovery with supervised learning/data mining. Heart failure was chosen for this study as it exhibits higher prevalence and cost of treatment than most other hospitalized diseases. The prevalence of heart failure has exceeded 4 million cases in the U.S. Findings of this study should be useful for the design of quality improvement initiatives, as particular aspects of patient comorbidity and treatment are found to be associated with mortality. This is also a proof-of-concept study, considering the feasibility of emerging health informatics methods of data mining in conjunction with or in lieu of traditional logistic regression methods of prediction. Findings may also support the design of decision support systems and quality improvement programming for other diseases. PMID:16779367
Hierarchical Bayesian Logistic Regression to forecast metabolic control in type 2 DM patients.
Dagliati, Arianna; Malovini, Alberto; Decata, Pasquale; Cogni, Giulia; Teliti, Marsida; Sacchi, Lucia; Cerra, Carlo; Chiovato, Luca; Bellazzi, Riccardo
2016-01-01
In this work we present our efforts in building a model able to forecast patients' changes in clinical conditions when repeated measurements are available, a setting in which the available risk calculators are typically not applicable. We propose a Hierarchical Bayesian Logistic Regression model, which accounts for both individual and population variability in the model parameter estimates. The model is used to predict metabolic control and its variation in type 2 diabetes mellitus. In particular, we analyzed a population of more than 1000 Italian type 2 diabetic patients, collected within the European project Mosaic. The results obtained in terms of Matthews Correlation Coefficient are significantly better than those obtained with a standard logistic regression model based on data pooling. PMID:28269842
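The full hierarchical Bayesian machinery is beyond a short snippet, but its core idea, borrowing strength across patients by shrinking individual estimates toward a population mean rather than pooling all data, can be sketched with a crude empirical-Bayes stand-in. All quantities below (number of patients, visits, population parameters) are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy version of the hierarchy: each patient's latent log-odds of good
# metabolic control varies around a population mean; we observe repeated
# binary outcomes per patient.
n_patients, n_visits = 50, 8
mu, tau = 0.5, 1.0
theta = rng.normal(mu, tau, n_patients)           # patient-level log-odds
y = (rng.random((n_patients, n_visits))
     < 1/(1 + np.exp(-theta[:, None]))).astype(float)

# Raw per-patient log-odds (0.5 continuity correction) and their
# approximate sampling variances.
s = y.sum(axis=1)
raw = np.log((s + 0.5)/(n_visits - s + 0.5))
var = 1/(s + 0.5) + 1/(n_visits - s + 0.5)

# Empirical-Bayes shrinkage toward the population mean -- a crude
# stand-in for the full hierarchical Bayesian posterior.
pop_mean = raw.mean()
tau2_hat = max(raw.var() - var.mean(), 0.0)       # method-of-moments
w = tau2_hat/(tau2_hat + var)                     # shrinkage weights in [0, 1]
shrunk = pop_mean + w*(raw - pop_mean)

print(round(float(pop_mean), 2), round(float(tau2_hat), 2))
```

By construction each shrunk estimate lies between the patient's raw log-odds and the population mean: noisy individual estimates are pulled in hardest, which is exactly the behavior that makes the hierarchical model outperform data pooling.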
Wang, Shuang; Jiang, Xiaoqian; Wu, Yuan; Cui, Lijuan; Cheng, Samuel; Ohno-Machado, Lucila
2013-01-01
We developed an EXpectation Propagation LOgistic REgRession (EXPLORER) model for distributed privacy-preserving online learning. The proposed framework provides a high-level guarantee for protecting sensitive information, since the information exchanged between the server and the client is the encrypted posterior distribution of coefficients. Through experimental results, EXPLORER shows the same performance (e.g., discrimination, calibration, feature selection) as the traditional frequentist logistic regression model, but provides more flexibility in model updating. That is, EXPLORER can be updated one observation at a time rather than being retrained on the entire data set when new observations are recorded. The proposed EXPLORER supports asynchronous communication, which relieves the participants from coordinating with one another and prevents service breakdown when participants are absent or communications are interrupted. PMID:23562651
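EXPLORER's expectation-propagation and encryption layers are not reproduced here, but its key practical property, updating the model one observation at a time with no retraining over past data, can be sketched with plain stochastic gradient ascent on the logistic log-likelihood (the data-generating coefficients below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)

d = 3
w_true = np.array([1.5, -2.0, 0.5])              # coefficients to recover
w = np.zeros(d + 1)                              # intercept + coefficients

def sgd_update(w, x, y, lr=0.05):
    """One-sample gradient step; no access to past observations needed."""
    xb = np.r_[1.0, x]
    p = 1/(1 + np.exp(-xb @ w))
    return w + lr*(y - p)*xb

# Simulated observation stream: each point is seen once and discarded.
for _ in range(20000):
    x = rng.normal(size=d)
    yi = float(rng.random() < 1/(1 + np.exp(-(x @ w_true))))
    w = sgd_update(w, x, yi)

print(np.round(w[1:], 2))                        # drifts toward w_true
```

Each update costs O(d) and touches only the newest point, which is what makes one-point-at-a-time learning attractive when observations arrive continuously from distributed sites.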
Assessing the effects of different types of covariates for binary logistic regression
NASA Astrophysics Data System (ADS)
Hamid, Hamzah Abdul; Wah, Yap Bee; Xie, Xian-Jin; Rahman, Hezlin Aryani Abd
2015-02-01
It is well known that the distribution of the independent variable(s) may affect many statistical procedures. This paper investigates and illustrates the effect of different types of covariates on the parameter estimation of a binary logistic regression model. A simulation study with different sample sizes and different types of covariates (uniform, normal, skewed) was carried out. Results showed that the parameters of the binary logistic regression model are severely overestimated when the sample size is less than 150 for covariates with normal or uniform distributions, while the parameters are underestimated when the covariate distribution is skewed. Parameter estimation improves for all types of covariates when the sample size is large, that is, at least 500.
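A scaled-down version of such a simulation is easy to run: repeatedly draw a right-skewed (exponential) covariate, generate binary outcomes from a known logistic model, refit by maximum likelihood, and compare the average slope estimate at a small and a large sample size. The specific sample sizes, replication count, and true coefficients here are my own choices, not the paper's design:

```python
import numpy as np

rng = np.random.default_rng(3)

def fit_logit(X, y, iters=25):
    """Newton-Raphson MLE for logistic regression (tiny ridge for stability)."""
    Xb = np.c_[np.ones(len(X)), X]
    w = np.zeros(Xb.shape[1])
    for _ in range(iters):
        p = 1/(1 + np.exp(-np.clip(Xb @ w, -30, 30)))
        H = (Xb * (p*(1 - p))[:, None]).T @ Xb + 1e-6*np.eye(Xb.shape[1])
        w += np.linalg.solve(H, Xb.T @ (y - p))
    return w

beta = 1.0                                       # true slope
def mean_slope_estimate(n, reps=200):
    est = []
    for _ in range(reps):
        x = rng.exponential(1.0, n)              # right-skewed covariate
        y = (rng.random(n) < 1/(1 + np.exp(-(-0.5 + beta*x)))).astype(float)
        est.append(fit_logit(x, y)[1])
    return float(np.mean(est))

m_small, m_large = mean_slope_estimate(100), mean_slope_estimate(1000)
print(round(m_small, 2), round(m_large, 2))      # compare against beta = 1.0
```

At the large sample size the average estimate sits close to the true slope; any systematic gap at the small sample size is the finite-sample bias the paper quantifies across covariate types.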
Predicting Student Success on the Texas Chemistry STAAR Test: A Logistic Regression Analysis
ERIC Educational Resources Information Center
Johnson, William L.; Johnson, Annabel M.; Johnson, Jared
2012-01-01
Background: The context is the new Texas STAAR end-of-course testing program. Purpose: The authors developed a logistic regression model to predict which students would pass or fail the new Texas chemistry STAAR end-of-course exam. Setting: Robert E. Lee High School (5A) with an enrollment of 2700 students, Tyler, Texas. Date of the study was the 2011-2012…
Xie, Fei; Yang, Houpu; Wang, Shu; Zhou, Bo; Tong, Fuzhong; Yang, Deqi; Zhang, Jiaqing
2012-01-01
Nodal staging in breast cancer is a key predictor of prognosis. This paper presents the results of potential clinicopathological predictors of axillary lymph node involvement and develops an efficient prediction model to assist in predicting axillary lymph node metastases. Seventy patients with primary early breast cancer who underwent axillary dissection were evaluated. Univariate and multivariate logistic regression were performed to evaluate the association between clinicopathological factors and lymph node metastatic status. A logistic regression predictive model was built from 50 randomly selected patients; the model was also applied to the remaining 20 patients to assess its validity. Univariate analysis showed a significant relationship between lymph node involvement and absence of nm-23 (p = 0.010) and Kiss-1 (p = 0.001) expression. Absence of Kiss-1 remained significantly associated with positive axillary node status in the multivariate analysis (p = 0.018). Seven clinicopathological factors were involved in the multivariate logistic regression model: menopausal status, tumor size, ER, PR, HER2, nm-23 and Kiss-1. The model was accurate and discriminating, with an area under the receiver operating characteristic curve of 0.702 when applied to the validation group. Moreover, there is a need to discover more specific candidate proteins and molecular biology tools to select additional variables, which should improve predictive accuracy. PMID:23012578
Lin, H. Y.; Desmond, R.; Liu, Y. H.; Bridges, S. L.; Soong, S. J.
2013-01-01
Many complex disease traits are observed to be associated with single nucleotide polymorphism (SNP) interactions. In testing small-scale SNP-SNP interactions, variable selection procedures in logistic regression are commonly used, but empirical evidence on variable selection for testing interactions in logistic regression is limited. This simulation study was designed to compare nine variable selection procedures in logistic regression for testing SNP-SNP interactions. Data on 10 SNPs were simulated for 400 and 1000 subjects (case/control ratio = 1). The simulated model included one main effect and two 2-way interactions. The variable selection procedures included automatic selection (stepwise, forward and backward), common 2-step selection, and AIC- and BIC-based selection. The effect of the hierarchical rule, in which all main effects and lower-order terms of the highest-order interaction term are included in the model regardless of their statistical significance, was also examined. We found that stepwise variable selection without the hierarchical rule, which had a reasonably high authentic (true-positive) proportion and a low noise (false-positive) proportion, is a better method than the other variable selection procedures. The procedure without the hierarchical rule requires fewer terms in testing interactions, so it can accommodate more SNPs than the procedure with the hierarchical rule. For testing interactions, the procedures without the hierarchical rule had a higher authentic proportion and a lower noise proportion than those with the hierarchical rule. These variable selection procedures were also applied and compared in a rheumatoid arthritis study. PMID:18231122
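A miniature version of one such procedure, greedy forward selection by AIC over main effects and 2-way interactions, without the hierarchical rule, can be sketched as follows. This uses 6 SNPs and one interaction rather than the paper's 10-SNP design, and all effect sizes are invented:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(4)

# Toy SNP data: outcome driven by one main effect (SNP 0) and one
# 2-way interaction (SNPs 1 and 2).
n = 800
S = (rng.random((n, 6)) < 0.4).astype(float)
z = -1.0 + 1.2*S[:, 0] + 1.5*S[:, 1]*S[:, 2]
y = (rng.random(n) < 1/(1 + np.exp(-z))).astype(float)

terms = [(i,) for i in range(6)] + list(combinations(range(6), 2))

def design(cols):
    X = np.ones((n, 1))
    for t in cols:                               # product of SNP columns
        X = np.c_[X, S[:, list(t)].prod(axis=1)]
    return X

def fit_ll(X):
    """Newton-Raphson fit; returns maximized log-likelihood."""
    w = np.zeros(X.shape[1])
    for _ in range(50):
        p = 1/(1 + np.exp(-np.clip(X @ w, -30, 30)))
        H = (X * (p*(1 - p))[:, None]).T @ X + 1e-6*np.eye(X.shape[1])
        w += np.linalg.solve(H, X.T @ (y - p))
    p = np.clip(1/(1 + np.exp(-np.clip(X @ w, -30, 30))), 1e-12, 1 - 1e-12)
    return np.sum(y*np.log(p) + (1 - y)*np.log(1 - p))

def aic(cols):
    return 2*(len(cols) + 1) - 2*fit_ll(design(cols))

# Forward selection without the hierarchical rule: an interaction term
# may enter without its constituent main effects.
selected, best = [], aic([])
while True:
    cand = [(aic(selected + [t]), t) for t in terms if t not in selected]
    a, t = min(cand)
    if a >= best:
        break
    selected, best = selected + [t], a
print(selected)
```

With effects this strong, the true main effect `(0,)` and interaction `(1, 2)` enter the model; AIC's mild penalty means a few noise terms can slip in too, which is the false-positive proportion the study measures.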
Direkvand-Moghadam, Ashraf; Suhrabi, Zainab; Akbari, Malihe
2016-01-01
Background Female sexual dysfunction, which can occur during any stage of normal sexual activity, is a serious condition for individuals and couples. The present study aimed to determine the prevalence and predictive factors of female sexual dysfunction in women referred to health centers in Ilam, western Iran, in 2014. Methods In the present cross-sectional study, 444 women who attended health centers in Ilam were enrolled from May to September 2014. Participants were selected according to the simple random sampling method. Univariate and multivariate logistic regression analyses were used to predict the risk factors of female sexual dysfunction. Differences with an alpha error of 0.05 were regarded as statistically significant. Results Overall, 75.9% of the study population exhibited sexual dysfunction. Univariate logistic regression analysis demonstrated a significant association between female sexual dysfunction and age, menarche age, gravidity, parity, and education (P<0.05). Multivariate logistic regression analysis indicated that menarche age (odds ratio, 1.26), education level (odds ratio, 1.71), and gravidity (odds ratio, 1.59) were independent predictive variables for female sexual dysfunction. Conclusion The majority of Iranian women suffer from sexual dysfunction. A lack of awareness of women's sexual pleasure and a lack of formal training on sexual function and its influencing factors, such as menarche age, gravidity, and level of education, may contribute to the high prevalence of female sexual dysfunction. PMID:27688863
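The reported odds ratios are simply exponentiated logistic regression coefficients: exp(b) is the multiplicative change in the odds of the outcome per unit increase in the covariate. The sketch below illustrates the mechanics on synthetic data; the two true coefficients are chosen so their exponentials land near the abstract's reported values, but the data and model are otherwise invented:

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic covariates: menarche age (years) and education level (1-4).
n = 2000
menarche_age = rng.normal(13, 1.5, n)
education = rng.integers(1, 5, n).astype(float)
z = -3.0 + 0.23*menarche_age + 0.54*education    # true log-odds-ratios
y = (rng.random(n) < 1/(1 + np.exp(-z))).astype(float)

X = np.c_[np.ones(n), menarche_age, education]
w = np.zeros(3)
for _ in range(30):                              # Newton-Raphson MLE
    p = 1/(1 + np.exp(-np.clip(X @ w, -30, 30)))
    H = (X * (p*(1 - p))[:, None]).T @ X
    w += np.linalg.solve(H, X.T @ (y - p))

odds_ratios = np.exp(w[1:])                      # exp(coefficient)
print(np.round(odds_ratios, 2))
```

Since exp(0.23) ≈ 1.26 and exp(0.54) ≈ 1.72, the recovered odds ratios come out near those targets; interpreting them requires the usual caveat that each is adjusted for the other covariates in the model.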
Lee, Saro
2004-08-01
For landslide susceptibility mapping, this study applied and verified a Bayesian probability model (a likelihood ratio statistical model) and logistic regression in Janghung, Korea, using a Geographic Information System (GIS). Landslide locations were identified in the study area from interpretation of IRS satellite imagery and field surveys, and a spatial database was constructed from topographic maps, soil type, forest cover, geology and land cover. The factors that influence landslide occurrence, such as slope gradient, slope aspect, and curvature of topography, were calculated from the topographic database. Soil texture, material, drainage, and effective depth were extracted from the soil database, while forest type, diameter, and density were extracted from the forest database. Land cover was classified from Landsat TM satellite imagery using unsupervised classification. The likelihood ratio and logistic regression coefficient were overlaid to determine each factor's rating for landslide susceptibility mapping. Then the landslide susceptibility map was verified and compared with known landslide locations. The logistic regression model had higher prediction accuracy than the likelihood ratio model. The method can be used to reduce hazards associated with landslides and to inform land-cover planning.
Austin, Peter C; Steyerberg, Ewout W
2014-02-10
Predicting the probability of the occurrence of a binary outcome or condition is important in biomedical research. While assessing discrimination is an essential issue in developing and validating binary prediction models, less attention has been paid to methods for assessing model calibration. Calibration refers to the degree of agreement between observed and predicted probabilities and is often assessed by testing for lack-of-fit. The objective of our study was to examine the ability of graphical methods to assess the calibration of logistic regression models. We examined lack of internal calibration, which was related to misspecification of the logistic regression model, and external calibration, which was related to an overfit model or to shrinkage of the linear predictor. We conducted an extensive set of Monte Carlo simulations with a locally weighted least squares regression smoother (i.e., the loess algorithm) to examine the ability of graphical methods to assess model calibration. We found that loess-based methods were able to provide evidence of moderate departures from linearity and indicate omission of a moderately strong interaction. Misspecification of the link function was harder to detect. Visual patterns were clearer with higher sample sizes, higher incidence of the outcome, or higher discrimination. Loess-based methods were also able to identify the lack of calibration in external validation samples when an overfit regression model had been used. In conclusion, loess-based smoothing methods are adequate tools to graphically assess calibration and merit wider application.
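The core of such a graphical calibration check is to smooth the observed 0/1 outcomes against the predicted probabilities and compare the curve with the diagonal. As a self-contained sketch, the example below substitutes a simple sliding-window mean for the loess smoother the paper uses, and builds a deliberately miscalibrated model (a too-steep linear predictor, mimicking overfitting) so the departure is visible:

```python
import numpy as np

rng = np.random.default_rng(6)

# True event probabilities follow a logistic model in x; the "model"
# below is overconfident: its slope is twice the true one.
n = 5000
x = rng.normal(size=n)
y = (rng.random(n) < 1/(1 + np.exp(-x))).astype(float)
p_model = 1/(1 + np.exp(-2.0*x))                 # miscalibrated predictions

# Smooth observed outcomes against sorted predictions; a sliding-window
# mean stands in here for the loess smoother.
order = np.argsort(p_model)
pm, ym = p_model[order], y[order]
win = 500
smooth_obs = np.convolve(ym, np.ones(win)/win, mode="valid")
smooth_pred = np.convolve(pm, np.ones(win)/win, mode="valid")

# Departure of the smoothed curve from the 45-degree line flags
# miscalibration; a well-calibrated model would track the diagonal.
max_departure = float(np.max(np.abs(smooth_obs - smooth_pred)))
print(round(max_departure, 2))
```

In practice one would plot `smooth_pred` against `smooth_obs`; the paper's point is that with adequate sample size and outcome incidence, this kind of curve makes overfitting and omitted nonlinearity visible where a global lack-of-fit test may not.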
A general framework for the use of logistic regression models in meta-analysis.
Simmonds, Mark C; Higgins, Julian Pt
2016-12-01
Where individual participant data are available for every randomised trial in a meta-analysis of dichotomous event outcomes, "one-stage" random-effects logistic regression models have been proposed as a way to analyse these data. Such models can also be used even when individual participant data are not available and we have only summary contingency table data. One benefit of this one-stage regression model over conventional meta-analysis methods is that it maximises the correct binomial likelihood for the data and so does not require the common assumption that effect estimates are normally distributed. A second benefit of using this model is that it may be applied, with only minor modification, in a range of meta-analytic scenarios, including meta-regression, network meta-analyses and meta-analyses of diagnostic test accuracy. This single model can potentially replace the variety of often complex methods used in these areas. This paper considers, with a range of meta-analysis examples, how random-effects logistic regression models may be used in a number of different types of meta-analyses. This one-stage approach is compared with widely used meta-analysis methods including Bayesian network meta-analysis and the bivariate and hierarchical summary receiver operating characteristic (ROC) models for meta-analyses of diagnostic test accuracy.
NASA Astrophysics Data System (ADS)
Ko, K.; Cheong, B.; Koh, D.
2010-12-01
Groundwater is a main source of drinking water in rural areas of Korea that lack a regional potable water supply system; more than 50 percent of rural residents depend on groundwater for drinking water. Research on predicting groundwater pollution is therefore needed for sustainable groundwater use and protection from potential pollutants. This study was carried out to assess the vulnerability of groundwater to nitrate contamination, reflecting the effect of land use, in Nonsan City, a representative rural area of South Korea. About 47% of the study area is cultivated land that is highly vulnerable to groundwater nitrate contamination, because its nitrogen fertilizer input of 62.3 tons/km2 exceeds the country's average of 44.0 tons/km2. Two vulnerability assessment methods, logistic regression and the DRASTIC model, were tested and compared to determine the more suitable technique for assessing groundwater nitrate contamination in the Nonsan area. The groundwater quality data were acquired from analyses of 111 samples from small potable supply systems in the study area. The analyzed nitrate values were classified by land use: residential, upland, paddy, and field areas. One dependent and two independent variables were used for the logistic regression analysis. The dependent variable was binary (0 or 1), indicating whether or not nitrate exceeded thresholds of 1 through 10 mg/L. The independent variables were one continuous variable, slope, representing topography, and one categorical variable, land use, classified as residential, upland, paddy, or field area. Levene's test and t-tests for slope and land use showed significant differences in mean values among groups at the 95% confidence level. From the logistic regression, we found a negative correlation between slope and nitrate, caused by the decrease of contaminant inputs into groundwater with
Logistic regression modeling to assess groundwater vulnerability to contamination in Hawaii, USA.
Mair, Alan; El-Kadi, Aly I
2013-10-01
Capture zone analysis combined with a subjective susceptibility index is currently used in Hawaii to assess vulnerability to contamination of drinking water sources derived from groundwater. In this study, we developed an alternative objective approach that combines well capture zones with multiple-variable logistic regression (LR) modeling and applied it to the highly-utilized Pearl Harbor and Honolulu aquifers on the island of Oahu, Hawaii. Input for the LR models utilized explanatory variables based on hydrogeology, land use, and well geometry/location. A suite of 11 target contaminants detected in the region, including elevated nitrate (>1 mg/L), four chlorinated solvents, four agricultural fumigants, and two pesticides, was used to develop the models. We then tested the ability of the new approach to accurately separate groups of wells with low and high vulnerability, and the suitability of nitrate as an indicator of other types of contamination. Our results produced contaminant-specific LR models that accurately identified groups of wells with the lowest/highest reported detections and the lowest/highest nitrate concentrations. Current and former agricultural land uses were identified as significant explanatory variables for eight of the 11 target contaminants, while elevated nitrate was a significant variable for five contaminants. The utility of the combined approach is contingent on the availability of hydrologic and chemical monitoring data for calibrating groundwater and LR models. Application of the approach using a reference site with sufficient data could help identify key variables in areas with similar hydrogeology and land use but limited data. In addition, elevated nitrate may also be a suitable indicator of groundwater contamination in areas with limited data. The objective LR modeling approach developed in this study is flexible enough to address a wide range of contaminants and represents a suitable addition to the current subjective approach.
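The regression setup common to these groundwater studies, a binary dependent variable for whether nitrate exceeds a threshold, a continuous covariate such as slope, and dummy-coded land-use categories, can be sketched as follows. All effect sizes and category probabilities are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic wells: land use category plus percent slope, with a binary
# outcome for nitrate exceeding a threshold.
n = 600
land_use = rng.choice(["resident", "upland", "paddy", "field"], n)
slope = rng.uniform(0, 30, n)                    # percent slope
effect = {"resident": 0.5, "upland": 1.2, "paddy": 0.2, "field": 0.8}
z = -0.5 + np.array([effect[l] for l in land_use]) - 0.06*slope
exceeds = (rng.random(n) < 1/(1 + np.exp(-z))).astype(float)

cats = ["upland", "paddy", "field"]              # "resident" is baseline
D = np.array([[1.0*(l == c) for c in cats] for l in land_use])
X = np.c_[np.ones(n), slope, D]

w = np.zeros(X.shape[1])                         # Newton-Raphson MLE
for _ in range(30):
    p = 1/(1 + np.exp(-np.clip(X @ w, -30, 30)))
    H = (X * (p*(1 - p))[:, None]).T @ X + 1e-8*np.eye(X.shape[1])
    w += np.linalg.solve(H, X.T @ (exceeds - p))

print(np.round(w, 2))   # intercept, slope effect, land-use contrasts
```

The fitted slope coefficient recovers the simulated negative relationship between slope and exceedance, and the land-use dummies estimate contrasts against the residential baseline, the same structure both abstracts exploit to identify significant explanatory variables.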
NASA Astrophysics Data System (ADS)
Măgut, F. L.; Zaharia, S.; Glade, T.; Irimuş, I. A.
2012-04-01
Various methods exist for analyzing spatial landslide susceptibility and classifying the results into susceptibility classes. The prediction of spatial landslide distribution can be performed using a variety of GIS-based techniques. Two very common methods, a heuristic assessment and a logistic regression model, are employed in this study in order to compare their performance in predicting the spatial distribution of previously mapped landslides for a study area located in Maramureş County, in northwestern Romania. The first model determines a susceptibility index by combining the heuristic approach with GIS techniques of spatial data analysis. The criteria used for quantifying each susceptibility factor and the expression used to determine the susceptibility index are taken from the Romanian legislation (Governmental Decision 447/2003); this procedure is followed in any Romanian state-ordered study that relies on financial support. The logistic regression model predicts the spatial distribution of landslides by statistically estimating regression coefficients which describe the dependency of previously mapped landslides on different factors. The identified shallow landslides correspond generally to Pannonian marl and Quaternary contractile clay deposits. The study region is located in the northwestern part of Romania and includes the Baia Mare municipality, the capital of Maramureş County. The study focuses on the former piedmont region situated to the south of the volcanic Gutâi Mountains, in the Baia Mare Depression, where most of the landslide activity has been recorded. In addition, a narrow sector of the volcanic mountains which borders the city of Baia Mare to the north has also been included to test the accuracy of the models in different lithologic units. The results of both models indicate a general medium landslide susceptibility of the study area. The more detailed differences will be discussed with respect to the advantages and
Montgomery, M E; White, M E; Martin, S W
1987-01-01
Results from discriminant analysis and logistic regression were compared using two data sets from a study on predictors of coliform mastitis in dairy cows. Both techniques selected the same set of variables as important predictors and were of nearly equal value in classifying cows as having, or not having, mastitis. The logistic regression model made fewer classification errors. The magnitudes of the effects were considerably different for some variables. Given the failure to meet the underlying assumptions of discriminant analysis, the coefficients from logistic regression are preferable. PMID:3453271
Olson, Scott A.; Brouillette, Michael C.
2006-01-01
A logistic regression equation was developed for estimating the probability of a stream flowing intermittently at unregulated, rural stream sites in Vermont. These determinations can be used for a wide variety of regulatory and planning efforts at the Federal, State, regional, county and town levels, including such applications as assessing fish and wildlife habitats, wetlands classifications, recreational opportunities, water-supply potential, waste-assimilation capacities, and sediment transport. The equation will be used to create a derived product for the Vermont Hydrography Dataset having the streamflow characteristic of 'intermittent' or 'perennial.' The Vermont Hydrography Dataset is Vermont's implementation of the National Hydrography Dataset and was created at a scale of 1:5,000 based on statewide digital orthophotos. The equation was developed by relating field-verified perennial or intermittent status of a stream site during normal summer low-streamflow conditions in the summer of 2005 to selected basin characteristics of naturally flowing streams in Vermont. The database used to develop the equation included 682 stream sites with drainage areas ranging from 0.05 to 5.0 square miles. When the 682 sites were observed, 126 were intermittent (had no flow at the time of the observation) and 556 were perennial (had flowing water at the time of the observation). The results of the logistic regression analysis indicate that the probability of a stream having intermittent flow in Vermont is a function of drainage area, elevation of the site, the ratio of basin relief to basin perimeter, and the areal percentage of well- and moderately well-drained soils in the basin. Using a probability cutpoint (a lower probability indicates the site has perennial flow and a higher probability indicates the site has intermittent flow) of 0.5, the logistic regression equation correctly predicted the perennial or intermittent status of 116 test sites 85 percent of the time.
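The abstract names the equation's predictors but not its coefficients, so the sketch below uses hypothetical placeholder values (the coefficient vector `b` and the log transform of drainage area are my assumptions) purely to show the mechanics: basin characteristics map through the logistic function to a probability, and the 0.5 cutpoint classifies the site:

```python
import numpy as np

# Hypothetical coefficients for illustration only -- NOT the report's
# fitted equation: intercept, log drainage area (mi^2), elevation (ft),
# basin relief / basin perimeter, % well- and moderately well-drained soils.
def prob_intermittent(drainage_mi2, elev_ft, relief_perim, pct_drained,
                      b=(-1.0, -0.8, 0.001, -2.0, 0.03)):
    z = (b[0] + b[1]*np.log(drainage_mi2) + b[2]*elev_ft
         + b[3]*relief_perim + b[4]*pct_drained)
    return 1/(1 + np.exp(-z))

# Example site: small, high basin with well-drained soils.
p = prob_intermittent(0.2, 1200, 0.05, 40)
status = "intermittent" if p >= 0.5 else "perennial"
print(round(float(p), 2), status)
```

The cutpoint is a policy choice: lowering it below 0.5 trades intermittent-stream misses for more false intermittent calls, which matters when the derived 'intermittent'/'perennial' attribute feeds regulatory decisions.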
An, Lihua; Fung, Karen Y; Krewski, Daniel
2010-09-01
Spontaneous adverse event reporting systems are widely used to identify adverse reactions to drugs following their introduction into the marketplace. In this article, a James-Stein type shrinkage estimation strategy was developed in a Bayesian logistic regression model to analyze pharmacovigilance data. This method is effective in detecting signals as it combines information and borrows strength across medically related adverse events. Computer simulation demonstrated that the shrinkage estimator is uniformly better than the maximum likelihood estimator in terms of mean squared error. This method was used to investigate the possible association of a series of diabetic drugs and the risk of cardiovascular events using data from the Canada Vigilance Online Database.
Petersen, Jørgen Holm
2016-01-15
This paper describes a new approach to the estimation in a logistic regression model with two crossed random effects where special interest is in estimating the variance of one of the effects while not making distributional assumptions about the other effect. A composite likelihood is studied. For each term in the composite likelihood, a conditional likelihood is used that eliminates the influence of the random effects, which results in a composite conditional likelihood consisting of only one-dimensional integrals that may be solved numerically. Good properties of the resulting estimator are described in a small simulation study.
Bent, Gardner C.; Archfield, Stacey A.
2002-01-01
A logistic regression equation was developed for estimating the probability of a stream flowing perennially at a specific site in Massachusetts. The equation provides city and town conservation commissions and the Massachusetts Department of Environmental Protection with an additional method for assessing whether streams are perennial or intermittent at a specific site in Massachusetts. This information is needed to assist these environmental agencies, who administer the Commonwealth of Massachusetts Rivers Protection Act of 1996, which establishes a 200-foot-wide protected riverfront area extending along each side of perennial streams from the mean annual high-water line, with exceptions in some urban areas. The equation was developed by relating the verified perennial or intermittent status of a stream site to selected basin characteristics of naturally flowing streams (no regulation by dams, surface-water withdrawals, ground-water withdrawals, diversion, waste-water discharge, and so forth) in Massachusetts. Stream sites used in the analysis were identified as perennial or intermittent on the basis of review of measured streamflow at sites throughout Massachusetts and on visual observation at sites in the South Coastal Basin, southeastern Massachusetts. Measured or observed zero flow(s) during months of extended drought as defined by the 310 Code of Massachusetts Regulations (CMR) 10.58(2)(a) were not considered when designating the perennial or intermittent status of a stream site. The database used to develop the equation included a total of 305 stream sites (84 intermittent- and 89 perennial-stream sites in the State, and 50 intermittent- and 82 perennial-stream sites in the South Coastal Basin). Stream sites included in the database had drainage areas that ranged from 0.14 to 8.94 square miles in the State and from 0.02 to 7.00 square miles in the South Coastal Basin. Results of the logistic regression analysis
Analyzing Beijing's in-use vehicle emissions test results using logistic regression.
Chang, Cheng; Ortolano, Leonard
2008-10-01
A logistic regression model was built using vehicle emissions test data collected in 2003 for 129 604 motor vehicles in Beijing. The regression model uses vehicle model, model year, inspection station, ownership, and vehicle registration area as covariates to predict the probability that a vehicle fails an annual emissions test on the first try. Vehicle model is the most influential predictor variable: some vehicle models are much more likely to fail in emissions tests than an "average" vehicle. Five out of 14 vehicle models that performed the worst (out of a total of 52 models) were manufactured by foreign companies or by their joint ventures with Chinese enterprises. These 14 vehicle model types may have failed at relatively high rates because of design and manufacturing deficiencies, and such deficiencies cannot be easily detected and corrected without further efforts, such as programs for in-use surveillance and vehicle recall.
Fitzpatrick, Cole D; Rakasi, Saritha; Knodler, Michael A
2017-01-01
Speed is one of the most important factors in traffic safety as higher speeds are linked to increased crash risk and higher injury severities. Nearly a third of fatal crashes in the United States are designated as "speeding-related", which is defined as either "the driver behavior of exceeding the posted speed limit or driving too fast for conditions." While many studies have utilized the speeding-related designation in safety analyses, no studies have examined the underlying accuracy of this designation. Herein, we investigate the speeding-related crash designation through the development of a series of logistic regression models that were derived from the established speeding-related crash typologies and validated using a blind review, by multiple researchers, of 604 crash narratives. The developed logistic regression model accurately identified crashes which were not originally designated as speeding-related but had crash narratives that suggested speeding as a causative factor. Only 53.4% of crashes designated as speeding-related contained narratives which described speeding as a causative factor. Further investigation of these crashes revealed that the driver contributing code (DCC) of "driving too fast for conditions" was being used in three separate situations. Additionally, this DCC was also incorrectly used when "exceeding the posted speed limit" would likely have been a more appropriate designation. Finally, it was determined that the responding officer only utilized one DCC in 82% of crashes not designated as speeding-related but contained a narrative indicating speed as a contributing causal factor. The use of logistic regression models based upon speeding-related crash typologies offers a promising method by which all possible speeding-related crashes could be identified.
Chen, Guangchao; Li, Xuehua; Chen, Jingwen; Zhang, Ya-Nan; Peijnenburg, Willie J G M
2014-12-01
Biodegradation is the principal environmental dissipation process of chemicals. As such, it is a dominant factor determining the persistence and fate of organic chemicals in the environment, and is therefore of critical importance to chemical management and regulation. In the present study, the authors developed in silico methods for assessing biodegradability based on a large heterogeneous set of 825 organic compounds, using the techniques of the C4.5 decision tree, the functional inner regression tree, and logistic regression. External validation was subsequently carried out with 2 independent test sets of 777 and 27 chemicals. As a result, the functional inner regression tree exhibited the best predictability, with predictive accuracies of 81.5% and 81.0%, respectively, on the training set (825 chemicals) and test set I (777 chemicals). Performance of the developed models on the 2 test sets was subsequently compared with that of the Estimation Program Interface (EPI) Suite Biowin 5 and Biowin 6 models, which again showed the better predictability of the functional inner regression tree model. The model built in the present study exhibits a reasonable predictability compared with existing models while possessing a transparent algorithm. Interpretation of the mechanisms of biodegradation was also carried out based on the models developed.
Simultaneous confidence bands for log-logistic regression with applications in risk assessment.
Kerns, Lucy X
2017-01-27
In risk assessment, it is often desired to make inferences on the low dose levels at which a specific benchmark risk is attained. Applications of simultaneous hyperbolic confidence bands for low-dose risk estimation with quantal data under different dose-response models (multistage, Abbott-adjusted Weibull, and Abbott-adjusted log-logistic models) have appeared in the literature. The use of simultaneous three-segment bands under the multistage model has also been proposed recently. In this article, we present explicit formulas for constructing asymptotic one-sided simultaneous hyperbolic and three-segment bands for the simple log-logistic regression model. We use the simultaneous construction to estimate upper hyperbolic and three-segment confidence bands on extra risk and to obtain lower limits on the benchmark dose by inverting the upper bands on risk under the Abbott-adjusted log-logistic model. Monte Carlo simulations evaluate the characteristics of the simultaneous limits. An example is given to illustrate the use of the proposed methods and to compare the two types of simultaneous limits at very low dose levels.
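The hyperbolic construction described above can be sketched for a simple logistic dose-response model: a pointwise one-sided Wald limit on the linear predictor uses a normal critical value, while a conservative Scheffé-type simultaneous band inflates it to the square root of a chi-square quantile. The coefficients and covariance matrix below are hypothetical, purely for illustration; they are not from the paper.

```python
import math

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical fitted simple log-logistic model: logit R(d) = b0 + b1*log(d)
b0, b1 = -4.2, 1.3                        # illustrative MLEs
cov = [[0.040, -0.011], [-0.011, 0.006]]  # illustrative covariance of (b0, b1)

def upper_band(dose, crit):
    """Upper confidence limit on the risk at `dose` with critical value `crit`."""
    x = math.log(dose)
    eta = b0 + b1 * x
    se = math.sqrt(cov[0][0] + 2 * x * cov[0][1] + x * x * cov[1][1])
    return expit(eta + crit * se)

z = 1.645                    # pointwise one-sided 95% critical value
scheffe = math.sqrt(5.991)   # sqrt of the chi-square(2) 95% quantile: simultaneous, conservative

for d in (1.0, 5.0, 25.0):
    # The simultaneous band is uniformly wider than the pointwise limit
    assert upper_band(d, scheffe) > upper_band(d, z)
```

Inverting such an upper band on extra risk over dose gives a lower limit on the benchmark dose, as in the abstract.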
2011-01-01
Background The study attempts to develop an ordinal logistic regression (OLR) model to identify the determinants of child malnutrition instead of developing a traditional binary logistic regression (BLR) model, using the data of the Bangladesh Demographic and Health Survey 2004. Methods Based on the weight-for-age anthropometric index (Z-score), child nutrition status is categorized into three groups: severely undernourished (< -3.0), moderately undernourished (-3.0 to -2.01) and nourished (≥ -2.0). Since nutrition status is ordinal, an OLR model, the proportional odds model (POM), can be developed instead of two separate BLR models to find predictors of both malnutrition and severe malnutrition, provided the proportional odds assumption is satisfied. The assumption is satisfied, but only marginally (p = 0.144), owing to a violation of the assumption for one covariate. Therefore a partial proportional odds model (PPOM) and two BLR models have also been developed to check the applicability of the OLR model. A graphical test has also been adopted for checking the proportional odds assumption. Results All the models determine that age of child, birth interval, mothers' education, maternal nutrition, household wealth status, child feeding index, and incidence of fever, ARI & diarrhoea were the significant predictors of child malnutrition; however, results of the PPOM were more precise than those of the other models. Conclusion These findings clearly justify that OLR models (POM and PPOM) are appropriate to find predictors of malnutrition instead of BLR models. PMID:22082256
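The proportional odds model described above stacks cumulative logits with a common slope. A minimal sketch with three ordered categories (severely undernourished < moderately undernourished < nourished); the cutpoints and slope below are hypothetical, not the survey estimates:

```python
import math

def expit(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical proportional-odds fit: P(Y <= j | x) = expit(alpha_j - beta*x)
# for ordered categories severe (j=0) < moderate (j=1) < nourished (j=2).
alpha = [-1.8, -0.4]   # illustrative cutpoints, alpha_0 < alpha_1
beta = 0.9             # illustrative common effect of one covariate

def category_probs(x):
    """Per-category probabilities obtained by differencing the cumulative logits."""
    cum = [expit(a - beta * x) for a in alpha] + [1.0]
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, 3)]

p = category_probs(1.0)
assert abs(sum(p) - 1.0) < 1e-9   # probabilities sum to one
assert all(v > 0 for v in p)
```

The "proportional odds" assumption is precisely that the same `beta` appears in every cumulative logit; the PPOM in the abstract relaxes this for the offending covariate.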
Ordinal logistic regression analysis on the nutritional status of children in KarangKitri village
NASA Astrophysics Data System (ADS)
Ohyver, Margaretha; Yongharto, Kimmy Octavian
2015-09-01
Ordinal logistic regression is a statistical technique that can be used to describe the relationship between an ordinal response variable and one or more independent variables. This method has been used in various fields, including the health field. In this research, ordinal logistic regression is used to describe the relationship between the nutritional status of children and age, gender, height, and family status. Nutritional status of children in this research is divided into over nutrition, well nutrition, less nutrition, and malnutrition. The purpose of this research is to describe the characteristics of children in KarangKitri Village and to determine the factors that influence their nutritional status. Three findings were obtained from this research. First, there are still children who are not categorized as having well-nourished status. Second, there are children from families of sufficient economic level whose nutritional status is nevertheless not normal. Third, the factors that affect the nutritional level of children are age, family status, and height.
Modeling data for pancreatitis in presence of a duodenal diverticula using logistic regression
NASA Astrophysics Data System (ADS)
Dineva, S.; Prodanova, K.; Mlachkova, D.
2013-12-01
The presence of a periampullary duodenal diverticulum (PDD) is often observed during upper digestive tract barium meal studies and endoscopic retrograde cholangiopancreatography (ERCP). A few papers have suggested that the diverticulum is associated with the incidence of pancreatitis. The aim of this study is to investigate whether the presence of duodenal diverticula predisposes to the development of pancreatic disease. A total of 3966 patients who had undergone ERCP were studied retrospectively. They were divided into 2 groups: with and without PDD. Patients with a duodenal diverticulum had a higher rate of acute pancreatitis. A duodenal diverticulum is a risk factor for acute idiopathic pancreatitis. A multiple logistic regression was performed to obtain adjusted estimates of odds and to identify whether a PDD is a predictor of acute or chronic pancreatitis. The software package STATISTICA 10.0 was used for analyzing the real data.
Binary logistic regression modelling: Measuring the probability of relapse cases among drug addict
NASA Astrophysics Data System (ADS)
Ismail, Mohd Tahir; Alias, Siti Nor Shadila
2014-07-01
For many years Malaysia has faced drug addiction issues. The most serious case is the relapse phenomenon among treated drug addicts (addicts who have undergone the rehabilitation programme at the Narcotic Addiction Rehabilitation Centre, PUSPEN). Thus, the main objective of this study is to find the most significant factors that contribute to relapse. Binary logistic regression analysis was employed to model the relationship between the independent variables (predictors) and the dependent variable. The dependent variable is the status of the drug addict: relapse (Yes, coded as 1) or not (No, coded as 0). The predictors involved are age, age at first taking drugs, family history, education level, family crisis, community support and self-motivation. The sample comprises 200 cases, with data provided by AADK (National Anti-Drugs Agency). The findings of the study revealed that age and self-motivation are statistically significant predictors of relapse.
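A binary logistic model of the kind used above can be fitted by Newton-Raphson (equivalently, iteratively reweighted least squares). The toy data below are invented for illustration and are not the AADK data; `x` stands in for a single predictor such as a self-motivation score.

```python
import math

# Toy data: x = self-motivation score, y = relapse (1) or not (0)
xs = [1, 2, 2, 3, 3, 4, 4, 5, 5, 6]
ys = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]

def fit_logistic(xs, ys, iters=25):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p          # score (gradient of the log-likelihood)
            g1 += (y - p) * x
            h00 += w             # observed information matrix entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det   # Newton step: b += H^{-1} g
        b1 += (-h01 * g0 + h00 * g1) / det
    return b0, b1

b0, b1 = fit_logistic(xs, ys)
assert b1 < 0   # higher self-motivation -> lower relapse odds in this toy data
```

Here `exp(b1)` is the odds ratio for relapse per unit increase in the predictor, which is how coefficients of such models are usually reported.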
Battaglin, W.A.
1996-01-01
Agricultural chemicals (herbicides, insecticides, other pesticides and fertilizers) in surface water may constitute a human health risk. Recent research on unregulated rivers in the midwestern USA documents that elevated concentrations of herbicides occur for 1-4 months following application in spring and early summer. In contrast, nitrate concentrations in unregulated rivers are elevated during the fall, winter and spring. Natural and anthropogenic variables of river drainage basins, such as soil permeability, the amount of agricultural chemicals applied or percentage of land planted in corn, affect agricultural chemical concentrations in rivers. Logistic regression (LGR) models are used to investigate relations between various drainage basin variables and the concentration of selected agricultural chemicals in rivers. The method is successful in contributing to the understanding of agricultural chemical concentration in rivers. Overall accuracies of the best LGR models, defined as the number of correct classifications divided by the number of attempted classifications, averaged about 66%.
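The overall-accuracy measure defined above is simply the proportion of correct classifications. A one-line sketch with hypothetical confusion-matrix counts (not the study's data):

```python
# Overall accuracy = correct classifications / attempted classifications,
# from a hypothetical 2x2 confusion matrix of threshold exceedance predictions.
tp, fp, fn, tn = 41, 12, 15, 32   # illustrative counts, total 100

accuracy = (tp + tn) / (tp + fp + fn + tn)
assert 0 <= accuracy <= 1
print(round(accuracy, 3))  # 0.73
```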
Confidence interval of difference of proportions in logistic regression in presence of covariates.
Reeve, Russell
2016-03-16
Comparison of treatment differences in incidence rates is an important objective of many clinical trials. However, often the proportion is affected by covariates, and the adjustment of the predicted proportion is made using logistic regression. It is desirable to estimate the treatment differences in proportions adjusting for the covariates, similarly to the comparison of adjusted means in analysis of variance. Because of the correlation between the point estimates in the different treatment groups, the standard methods for constructing confidence intervals are inadequate. The problem is more difficult in the binary case, as the comparison is not uniquely defined, and the sampling distribution more difficult to analyze. Four procedures for analyzing the data are presented, which expand upon existing methods and generalize the link function. It is shown that, among the four methods studied, the resampling method based on the exact distribution function yields a coverage rate closest to the nominal.
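A resampling interval for a difference of proportions, of the general kind the study compares, can be sketched with a percentile bootstrap. This is a simplified stand-in: the paper's procedures additionally adjust the proportions for covariates through the logistic model, and its preferred method resamples from the exact distribution; the arm data below are hypothetical.

```python
import random

random.seed(1)

# Hypothetical binary outcomes for two treatment arms (1 = event).
arm_a = [1] * 30 + [0] * 70
arm_b = [1] * 45 + [0] * 55

def boot_ci(a, b, reps=2000, alpha=0.05):
    """Percentile bootstrap CI for the difference in proportions p_b - p_a."""
    diffs = []
    for _ in range(reps):
        pa = sum(random.choices(a, k=len(a))) / len(a)
        pb = sum(random.choices(b, k=len(b))) / len(b)
        diffs.append(pb - pa)
    diffs.sort()
    return diffs[int(reps * alpha / 2)], diffs[int(reps * (1 - alpha / 2))]

lo, hi = boot_ci(arm_a, arm_b)
assert lo < 0.15 < hi   # the observed difference lies inside the interval
```

The point the abstract makes is that naive Wald intervals ignore the correlation between the covariate-adjusted point estimates, which resampling handles automatically.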
Novikov, I; Fund, N; Freedman, L S
2010-01-15
Different methods for the calculation of sample size for simple logistic regression (LR) with one normally distributed continuous covariate give different results. Sometimes the difference can be large. Furthermore, some methods require the user to specify the prevalence of cases when the covariate equals its population mean, rather than the more natural population prevalence. We focus on two commonly used methods and show through simulations that the power for a given sample size may differ substantially from the nominal value for one method, especially when the covariate effect is large, while the other method performs poorly if the user provides the population prevalence instead of the required parameter. We propose a modification of the method of Hsieh et al. that requires specification of the population prevalence and that employs Schouten's sample size formula for a t-test with unequal variances and group sizes. This approach appears to increase the accuracy of the sample size estimates for LR with one continuous covariate.
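One widely cited approximation in this family (a Hsieh-style formula for a single standard-normal covariate) makes the abstract's parameter distinction concrete: it requires the event rate at the covariate *mean*, not the overall population prevalence. The sketch below is a textbook approximation, not the modified method the authors propose.

```python
import math

def sample_size(p_at_mean, log_or_per_sd, alpha=0.05, power=0.80):
    """Approximate n for simple logistic regression with one N(0,1) covariate:
    n = (z_{1-a/2} + z_{1-b})^2 / (p(1-p) * beta^2), where p is the event rate
    at the covariate mean and beta is the log odds ratio per SD of the covariate."""
    z_a = 1.959964   # z_{0.975}
    z_b = 0.841621   # z_{0.80}
    n = (z_a + z_b) ** 2 / (p_at_mean * (1 - p_at_mean) * log_or_per_sd ** 2)
    return math.ceil(n)

# Event rate 0.5 at the covariate mean, odds ratio 1.5 per SD:
n = sample_size(0.5, math.log(1.5))
assert n == 191
```

Supplying the overall prevalence where `p_at_mean` is expected is exactly the misuse the simulations in the abstract show to degrade the nominal power.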
NASA Astrophysics Data System (ADS)
Chan, H. C.; Chang, C. C.; Laio, P. Y.
2012-04-01
Landslide susceptibility analysis usually combines several factors, including the terrain, geology, and hydrology. The analysis tries to find a suitable combination of these factors in order to establish a landslide susceptibility model and calculate the susceptibility value. A potential landslide map can be established by using the calculated the susceptibility value of landslide. This study took Alishan area as an example and aimed to assess landslide susceptibility analysis by Logistic regression, a multivariate analysis method. In order to select the factors efficiently, the calibration and selection procedure were performed. The results were verified by a previous typhoon event. The classification error matrix was used to evaluate the accuracy of landslide predicted by the present model. Finally, this study applied 10-, 25-, 50-, and 100-year return periods precipitation to estimate the susceptibility values for the study area. The landslide susceptibilities were separated into four levels, including high, medium-high, medium, and low, to delineate the map of potential landslide.
Spackman, K. A.
1991-01-01
This paper presents maximum likelihood back-propagation (ML-BP), an approach to training neural networks. The widely reported original approach uses least squares back-propagation (LS-BP), minimizing the sum of squared errors (SSE). Unfortunately, least squares estimation does not give a maximum likelihood (ML) estimate of the weights in the network. Logistic regression, on the other hand, gives ML estimates for single layer linear models only. This report describes how to obtain ML estimates of the weights in a multi-layer model, and compares LS-BP to ML-BP using several examples. It shows that in many neural networks, least squares estimation gives inferior results and should be abandoned in favor of maximum likelihood estimation. Questions remain about the potential uses of multi-level connectionist models in such areas as diagnostic systems and risk-stratification in outcomes research. PMID:1807606
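The contrast between ML-BP and LS-BP is visible already in the gradient of a single logistic output unit: the maximum-likelihood (cross-entropy) loss yields the simple residual rule, while least squares picks up an extra p(1-p) factor that vanishes as the unit saturates, so a confidently wrong unit barely learns. A minimal sketch:

```python
import math

def expit(z):
    return 1.0 / (1.0 + math.exp(-z))

def grad_ml(w, x, y):
    """d/dw of cross-entropy loss for a logistic unit: (p - y) * x."""
    p = expit(w * x)
    return (p - y) * x

def grad_ls(w, x, y):
    """d/dw of squared-error loss: the same residual damped by p*(1-p)."""
    p = expit(w * x)
    return (p - y) * p * (1 - p) * x

# A confidently wrong unit: target y = 1 but the pre-activation is very negative.
w, x, y = -6.0, 1.0, 1.0
assert abs(grad_ls(w, x, y)) < abs(grad_ml(w, x, y)) / 100
```

The damping factor is one concrete reason least-squares estimation can give the inferior results the paper reports.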
Logistic regression analysis of pedestrian casualty risk in passenger vehicle collisions in China.
Kong, Chunyu; Yang, Jikuang
2010-07-01
A large number of pedestrian fatalities have been reported in China since the 1990s; however, the exposure of pedestrians in public traffic has never been measured quantitatively using in-depth accident data. This study aimed to investigate the association between impact speed and the risk of pedestrian casualties in passenger vehicle collisions based on real-world accident cases in China. The cases were selected from a database of in-depth investigation of vehicle accidents in Changsha (IVAC). The sampling criteria were defined as (1) the accident was a frontal impact that occurred between 2003 and 2009; (2) the pedestrian age was above 14; (3) the injury according to the Abbreviated Injury Scale (AIS) was 1+; (4) the accident involved passenger cars, SUVs, or MPVs; and (5) the vehicle impact speed could be determined. The selected IVAC data set, which included 104 pedestrian accident cases, was weighted based on the national traffic accident data. Logistic regression models of the risks of pedestrian fatalities and AIS 3+ injuries were developed in terms of vehicle impact speed using the unweighted and weighted data sets. A multiple logistic regression model of the risk of pedestrian AIS 3+ injury was developed considering age and impact speed as two variables. It was found that the risk of pedestrian fatality is 26% at 50 km/h, 50% at 58 km/h, and 82% at 70 km/h. At an impact speed of 80 km/h, the pedestrian rarely survives. The weighted risk curves indicated that the risks of pedestrian fatality and injury in China were higher than those in other high-income countries, whereas they were lower than the risks in those countries 30 years ago. The findings could contribute to a better understanding of the exposure of pedestrians in urban traffic in China, and provide background knowledge for the development of strategies for pedestrian protection.
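Purely as an arithmetic consistency check on the reported figures (not the authors' actual weighted fit), two of the quoted risk points pin down an illustrative simple logistic curve, which then reproduces the third:

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def expit(z):
    return 1.0 / (1.0 + math.exp(-z))

# Back out an illustrative logistic fatality-risk curve from two reported
# points: risk 26% at 50 km/h and 50% at 58 km/h.
b1 = (logit(0.50) - logit(0.26)) / (58 - 50)   # slope per km/h
b0 = -b1 * 58                                  # the logit is 0 at 58 km/h

def risk(speed):
    return expit(b0 + b1 * speed)

assert abs(risk(70) - 0.82) < 0.01   # consistent with the reported 82% at 70 km/h
```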
Modeling Group Size and Scalar Stress by Logistic Regression from an Archaeological Perspective
Alberti, Gianmarco
2014-01-01
Johnson’s scalar stress theory, describing the mechanics of (and the remedies to) the increase in in-group conflictuality that parallels the increase in groups’ size, provides scholars with a useful theoretical framework for the understanding of different aspects of the material culture of past communities (i.e., social organization, communal food consumption, ceramic style, architecture and settlement layout). Due to its relevance in archaeology and anthropology, the article aims at proposing a predictive model of critical level of scalar stress on the basis of community size. Drawing upon Johnson’s theory and on Dunbar’s findings on the cognitive constraints to human group size, a model is built by means of Logistic Regression on the basis of the data on colony fissioning among the Hutterites of North America. On the grounds of the theoretical framework sketched in the first part of the article, the absence or presence of colony fissioning is considered expression of not critical vs. critical level of scalar stress for the sake of the model building. The model, which is also tested against a sample of archaeological and ethnographic cases: a) confirms the existence of a significant relationship between critical scalar stress and group size, setting the issue on firmer statistical grounds; b) allows calculating the intercept and slope of the logistic regression model, which can be used in any time to estimate the probability that a community experienced a critical level of scalar stress; c) allows locating a critical scalar stress threshold at community size 127 (95% CI: 122–132), while the maximum probability of critical scalar stress is predicted at size 158 (95% CI: 147–170). The model ultimately provides grounds to assess, for the sake of any further archaeological/anthropological interpretation, the probability that a group reached a hot spot of size development critical for its internal cohesion. PMID:24626241
Optimization of Game Formats in U-10 Soccer Using Logistic Regression Analysis
Amatria, Mario; Arana, Javier; Anguera, M. Teresa; Garzón, Belén
2016-01-01
Abstract Small-sided games provide young soccer players with better opportunities to develop their skills and progress as individual and team players. There is, however, little evidence on the effectiveness of different game formats in different age groups, and furthermore, these formats can vary between and even within countries. The Royal Spanish Soccer Association replaced the traditional grassroots 7-a-side format (F-7) with the 8-a-side format (F-8) in the 2011-12 season and the country’s regional federations gradually followed suit. The aim of this observational methodology study was to investigate which of these formats best suited the learning needs of U-10 players transitioning from 5-a-side futsal. We built a multiple logistic regression model to predict the success of offensive moves depending on the game format and the area of the pitch in which the move was initiated. Success was defined as a shot at the goal. We also built two simple logistic regression models to evaluate how the game format influenced the acquisition of technical-tactical skills. It was found that the probability of a shot at the goal was higher in F-7 than in F-8 for moves initiated in the Creation Sector-Own Half (0.08 vs 0.07) and the Creation Sector-Opponent's Half (0.18 vs 0.16). The probability was the same (0.04) in the Safety Sector. Children also had more opportunities to control the ball and pass or take a shot in the F-7 format (0.24 vs 0.20), and these were also more likely to be successful in this format (0.28 vs 0.19). PMID:28031768
Genomic-Enabled Prediction of Ordinal Data with Bayesian Logistic Ordinal Regression
Montesinos-López, Osval A.; Montesinos-López, Abelardo; Crossa, José; Burgueño, Juan; Eskridge, Kent
2015-01-01
Most genomic-enabled prediction models developed so far assume that the response variable is continuous and normally distributed. The exception is the probit model, developed for ordered categorical phenotypes. In statistical applications, because of the easy implementation of the Bayesian probit ordinal regression (BPOR) model, Bayesian logistic ordinal regression (BLOR) is rarely implemented in the context of genomic-enabled prediction [sample size (n) is much smaller than the number of parameters (p)]. For this reason, in this paper we propose a BLOR model using the Pólya-Gamma data augmentation approach that produces a Gibbs sampler with full conditional distributions similar to those of the BPOR model, and with the advantage that the BPOR model is a particular case of the BLOR model. We evaluated the proposed model by using simulation and two real data sets. Results indicate that our BLOR model is a good alternative for analyzing ordinal data in the context of genomic-enabled prediction with the probit or logit link. PMID:26290569
NASA Astrophysics Data System (ADS)
Ngadisih; Bhandary, Netra P.; Yatabe, Ryuichi; Dahal, Ranjan K.
2016-05-01
West Java Province is the most landslide-prone area in Indonesia owing to extreme geomorphological conditions, climatic conditions, and densely populated settlements with immense completed and ongoing development activities. A regional-scale landslide susceptibility map of this province is therefore a fundamental tool for risk management and land-use planning. Logistic regression and Artificial Neural Network (ANN) models are the most frequently used tools for landslide susceptibility assessment, mainly because they are capable of handling the nature of landslide data. The main objective of this study is to apply logistic regression and ANN models and compare their performance for landslide susceptibility mapping in the volcanic mountains of West Java Province. In addition, the models are applied to identify the factors contributing most to landslide events in the study area. The spatial database, built on a GIS platform, consists of a landslide inventory, four topographical parameters (slope, aspect, relief, distance to river), three geological parameters (distance to volcano crater, distance to thrust and fault, geological formation), and two anthropogenic parameters (distance to road, land use). The logistic regression model in this study revealed that slope, geological formation, distance to road and distance to volcano are the most influential factors for landslide events, while the ANN model revealed that distance to volcano crater, geological formation, distance to road, and land use are the most important causal factors of landslides in the study area. Moreover, an evaluation of the models showed that the ANN model has a higher accuracy than the logistic regression model.
Lin, Wan-Yu; Schaid, Daniel J
2009-04-01
Recently, a genomic distance-based regression for multilocus associations was proposed (Wessel and Schork [2006] Am. J. Hum. Genet. 79:792-806) in which either locus or haplotype scoring can be used to measure genetic distance. Although it allows various measures of genomic similarity and simultaneous analyses of multiple phenotypes, its power relative to other methods for case-control analyses is not well known. We compare the power of traditional methods with this new distance-based approach, for both locus-scoring and haplotype-scoring strategies. We discuss the relative power of these association methods with respect to five properties: (1) the marker informativity; (2) the number of markers; (3) the causal allele frequency; (4) the preponderance of the most common high-risk haplotype; (5) the correlation between the causal single-nucleotide polymorphism (SNP) and its flanking markers. We found that locus-based logistic regression and the global score test for haplotypes suffered from power loss when many markers were included in the analyses, due to many degrees of freedom. In contrast, the distance-based approach was not as vulnerable to more markers or more haplotypes. A genotype counting measure was more sensitive to the marker informativity and the correlation between the causal SNP and its flanking markers. After examining the impact of the five properties on power, we found that on average, the genomic distance-based regression that uses a matching measure for diplotypes was the most powerful and robust method among the seven methods we compared.
Huang, Hai-Hui; Liu, Xiao-Ying; Liang, Yong
2016-01-01
Cancer classification and feature (gene) selection play an important role in knowledge discovery in genomic data. Although logistic regression is one of the most popular classification methods, it does not induce feature selection. In this paper, we present a new hybrid L1/2+L2 regularization (HLR) function, a linear combination of L1/2 and L2 penalties, to select the relevant genes in logistic regression. The HLR approach inherits some fascinating characteristics from the L1/2 (sparsity) and L2 (grouping effect, where highly correlated variables are in or out of a model together) penalties. We also propose a novel univariate HLR thresholding approach to update the estimated coefficients and develop a coordinate descent algorithm for the HLR-penalized logistic regression model. The empirical results and simulations indicate that the proposed method is highly competitive amongst several state-of-the-art methods. PMID:27136190
Multiple logistic regression model of signalling practices of drivers on urban highways
NASA Astrophysics Data System (ADS)
Puan, Othman Che; Ibrahim, Muttaka Na'iya; Zakaria, Rozana
2015-05-01
Giving a signal is a way of informing other road users, especially conflicting drivers, of a driver's intention to change his or her movement course. Other users are exposed to hazardous situations and risks of accident if a driver who changes course fails to give a signal as required. This paper describes the application of a logistic regression model to the analysis of drivers' signalling practices on multilane highways based on possible factors affecting a driver's decision, such as the driver's gender, vehicle type, vehicle speed and traffic flow intensity. Data pertaining to the analysis of such factors were collected manually. More than 2000 drivers who performed a lane-changing manoeuvre while driving on two sections of multilane highways were observed. Findings from the study show that a relatively large proportion of drivers failed to give any signal when changing lanes. The results of the analysis indicate that although the proportion of drivers who failed to signal prior to a lane-changing manoeuvre is high, the degree of compliance of female drivers is better than that of male drivers. A binary logistic model was developed to represent the probability that a driver provides a signal indication prior to a lane-changing manoeuvre. The model indicates that the driver's gender, type of vehicle driven, vehicle speed and traffic volume influence the driver's decision to provide a signal indication prior to a lane-changing manoeuvre on a multilane urban highway. In terms of vehicle types, about 97% of motorcyclists failed to comply with the signal indication requirement. The proportion of non-compliant drivers under stable traffic flow conditions is much higher than when the flow is relatively heavy. This is consistent with the data, which indicate a high degree of non-compliance when the average speed of the traffic stream is relatively high.
A logistic regression based approach for the prediction of flood warning threshold exceedance
NASA Astrophysics Data System (ADS)
Diomede, Tommaso; Trotter, Luca; Stefania Tesini, Maria; Marsigli, Chiara
2016-04-01
A method based on logistic regression is proposed for the prediction of river level threshold exceedance at short (+0-18h) and medium (+18-42h) lead times. The aim of the study is to provide a valuable tool for the issuing of warnings by the authority responsible for public safety in case of flood. The role of different precipitation periods as predictors for the exceedance of a fixed river level has been investigated, in order to derive significant information for flood forecasting. Based on catchment-averaged values, a separation of "antecedent" and "peak-triggering" rainfall amounts as independent variables is attempted. In particular, the following flood-related precipitation periods have been considered: (i) the period from 1 to n days before the forecast issue time, which may be relevant for the soil saturation, (ii) the last 24 hours, which may be relevant for the current water level in the river, and (iii) the period from 0 to x hours ahead of the forecast issue time, when the flood-triggering precipitation generally occurs. Several combinations and values of these predictors have been tested to optimise the method's implementation. In particular, the period for the antecedent precipitation ranges between 5 and 45 days; the state of the river can be represented by the last 24-h precipitation or, alternatively, by the current river level. The flood-triggering precipitation has been cumulated over the next 18 hours (for the short lead time) and 36-42 hours (for the medium lead time). The proposed approach requires a specific implementation of logistic regression for each river section and warning threshold. The method's performance has been evaluated over the Santerno river catchment (about 450 km2) in the Emilia-Romagna Region, northern Italy. A statistical analysis in terms of false alarms, misses and related scores was carried out by using an 8-year long database. The results are quite satisfactory, with slightly better performances
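The false-alarm/miss analysis described above can be summarized with standard contingency-table verification scores. The counts below are illustrative, not the paper's:

```python
# Contingency-table verification of threshold-exceedance forecasts.
hits, false_alarms, misses, correct_negatives = 18, 6, 4, 172  # illustrative

pod = hits / (hits + misses)                  # probability of detection
far = false_alarms / (hits + false_alarms)    # false-alarm ratio
csi = hits / (hits + misses + false_alarms)   # critical success index

assert 0 <= far < pod <= 1
print(round(pod, 2), round(far, 2), round(csi, 2))  # 0.82 0.25 0.64
```

For a probabilistic logistic forecast, such a table is obtained by thresholding the predicted exceedance probability, so each probability threshold yields its own set of scores.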
Geographical variation of unmet medical needs in Italy: a multivariate logistic regression analysis
2013-01-01
Background Unmet health needs should be, in theory, a minor issue in Italy where a publicly funded and universally accessible health system exists. This, however, does not seem to be the case. Moreover, in the last two decades responsibilities for health care have been progressively decentralized to regional governments, which have differently organized health service delivery within their territories. Regional decision-making has affected the use of health care services, further increasing the existing geographical disparities in the access to care across the country. This study aims at comparing self-perceived unmet needs across Italian regions and assessing how the reported reasons, grouped into the categories of availability, accessibility and acceptability, vary geographically. Methods Data from the 2006 Italian component of the European Union Statistics on Income and Living Conditions are employed to explore reasons and predictors of self-reported unmet medical needs among 45,175 Italian respondents aged 18 and over. Multivariate logistic regression models are used to determine adjusted rates for overall unmet medical needs and for each of the three categories of reasons. Results Results show that, overall, 6.9% of the Italian population stated having experienced at least one unmet medical need during the last 12 months. The unadjusted rates vary markedly across regions, thus resulting in a clear-cut north–south divide (4.6% in the North-East vs. 10.6% in the South). Among those reporting unmet medical needs, the leading reason was problems of accessibility related to cost or transportation (45.5%), followed by acceptability (26.4%) and availability due to the presence of too long waiting lists (21.4%). In the South, more than one out of two individuals with an unmet need refrained from seeing a physician due to economic reasons. In the northern regions, working and family responsibilities contribute relatively more to the underutilization of medical
Barros, Aluísio JD; Hirakata, Vânia N
2003-01-01
Background Cross-sectional studies with binary outcomes analyzed by logistic regression are frequent in the epidemiological literature. However, the odds ratio can importantly overestimate the prevalence ratio, the measure of choice in these studies. Also, controlling for confounding is not equivalent for the two measures. In this paper we explore alternatives for modeling data of such studies with techniques that directly estimate the prevalence ratio. Methods We compared Cox regression with constant time at risk, Poisson regression and log-binomial regression against the standard Mantel-Haenszel estimators. Models with robust variance estimators in Cox and Poisson regressions and variance corrected by the scale parameter in Poisson regression were also evaluated. Results Three outcomes, from a cross-sectional study carried out in Pelotas, Brazil, with different levels of prevalence were explored: weight-for-age deficit (4%), asthma (31%) and mother in a paid job (52%). Unadjusted Cox/Poisson regression and Poisson regression with scale parameter adjusted by deviance performed worst in terms of interval estimates. Poisson regression with scale parameter adjusted by χ2 showed variable performance depending on the outcome prevalence. Cox/Poisson regression with robust variance, and log-binomial regression performed equally well when the model was correctly specified. Conclusions Cox or Poisson regression with robust variance and log-binomial regression provide correct estimates and are a better alternative for the analysis of cross-sectional studies with binary outcomes than logistic regression, since the prevalence ratio is more interpretable and easier to communicate to non-specialists than the odds ratio. However, precautions are needed to avoid estimation problems in specific situations. PMID:14567763
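The overestimation the authors describe is easy to reproduce by hand: for a common outcome, the odds ratio drifts well above the prevalence ratio. A minimal sketch with hypothetical counts (not the Pelotas data):

```python
# Hypothetical 2x2 table (illustrative counts only):
a, b = 60, 40   # exposed group:   outcome present / absent
c, d = 30, 70   # unexposed group: outcome present / absent

prevalence_ratio = (a / (a + b)) / (c / (c + d))  # 0.60 / 0.30
odds_ratio = (a / b) / (c / d)                    # 1.50 / 0.43

print(round(prevalence_ratio, 2))  # 2.0
print(round(odds_ratio, 2))        # 3.5
```

With a 60% outcome prevalence in the exposed group, the odds ratio (3.5) is nearly double the prevalence ratio (2.0), which is why the paper recommends models that estimate the prevalence ratio directly.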
Bakhtiyari, Mahmood; Mehmandar, Mohammad Reza; Mirbagheri, Babak; Hariri, Gholam Reza; Delpisheh, Ali; Soori, Hamid
2014-01-01
Risk factors of human-related traffic crashes are among the most important and preventable challenges for community health, owing to their noteworthy burden in developing countries in particular. The present study aims to investigate the role of human risk factors in road traffic crashes in Iran. Through a cross-sectional study using the COM 114 data collection forms, the police records of almost 600,000 crashes that occurred in 2010 are investigated. Binary logistic regression and proportional odds regression models are used. The odds ratio for each risk factor is calculated. These models are adjusted for known confounding factors including age, sex and driving time. The traffic crash reports of 537,688 men (90.8%) and 54,480 women (9.2%) are analysed. The mean age is 34.1 ± 14 years. Not maintaining eyes on the road (53.7%) and losing control of the vehicle (21.4%) are the main causes of drivers' deaths in traffic crashes within cities. Not maintaining eyes on the road is also the most frequent human risk factor for road traffic crashes outside cities. Sudden lane excursion (OR = 9.9, 95% CI: 8.2-11.9), seat belt non-compliance (OR = 8.7, CI: 6.7-10.1), exceeding authorised speed (OR = 17.9, CI: 12.7-25.1) and exceeding safe speed (OR = 9.7, CI: 7.2-13.2) are the most significant human risk factors for traffic crashes in Iran. The high mortality rate of 39 people per 100,000 population underscores the importance of traffic crashes in Iran. Considering the important role of human risk factors in traffic crashes, sustained efforts are required to control dangerous driving behaviours such as speeding, illegal overtaking and not maintaining eyes on the road.
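Odds ratios and Wald confidence intervals like those reported above can be computed directly from a 2×2 table. A sketch with invented counts (not the COM 114 data):

```python
import math

# Hypothetical counts: rows = risk factor present/absent,
# columns = crash / no crash (illustrative only)
a, b = 120, 30    # factor present: crash / no crash
c, d = 200, 650   # factor absent:  crash / no crash

or_hat = (a * d) / (b * c)                 # cross-product odds ratio
se = math.sqrt(1/a + 1/b + 1/c + 1/d)      # Woolf SE of log(OR)
ci_low = math.exp(math.log(or_hat) - 1.96 * se)
ci_high = math.exp(math.log(or_hat) + 1.96 * se)

print(or_hat)  # 13.0
```

The interval is symmetric on the log-odds scale, which is why the reported CIs (e.g. 12.7-25.1 around 17.9) are asymmetric around the point estimate.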
Erickson, Wallace P.; Nick, Todd G.; Ward, David H.; Peck, Roxy; Haugh, Larry D.; Goodman, Arnold
1998-01-01
Izembek Lagoon, an estuary in Alaska, is a very important staging area for Pacific brant, a small migratory goose. Each fall, nearly the entire Pacific Flyway population of 130,000 brant flies to Izembek Lagoon and feeds on eelgrass to accumulate fat reserves for nonstop transoceanic migration to wintering areas as distant as Mexico. In the past 10 years, offshore drilling activities in this area have increased, and, as a result, the air traffic in and out of the nearby Cold Bay airport has also increased. There has been a concern that this increased air traffic could affect the brant by disturbing them from their feeding and resting activities, which in turn could result in reduced energy intake and buildup. This may increase the mortality rates during their migratory journey. Because of these concerns, a study was conducted to investigate the flight response of brant to overflights of large helicopters. Response was measured on flocks during experimental overflights of large helicopters flown at varying altitudes and lateral (perpendicular) distances from the flocks. Logistic regression models were developed for predicting probability of flight response as a function of these distance variables. Results of this study may be used in the development of new FAA guidelines for aircraft near Izembek Lagoon.
Greenland, Sander; Mansournia, Mohammad Ali
2015-10-15
Penalization is a very general method of stabilizing or regularizing estimates, which has both frequentist and Bayesian rationales. We consider some questions that arise when considering alternative penalties for logistic regression and related models. The most widely programmed penalty appears to be the Firth small-sample bias-reduction method (albeit with small differences among implementations and the results they provide), which corresponds to using the log density of the Jeffreys invariant prior distribution as a penalty function. The latter representation raises some serious contextual objections to the Firth reduction, which also apply to alternative penalties based on t-distributions (including Cauchy priors). Taking simplicity of implementation and interpretation as our chief criteria, we propose that the log-F(1,1) prior provides a better default penalty than other proposals. Penalization based on more general log-F priors is trivial to implement and facilitates mean-squared error reduction and sensitivity analyses of penalty strength by varying the number of prior degrees of freedom. We caution however against penalization of intercepts, which are unduly sensitive to covariate coding and design idiosyncrasies.
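For intuition, the log-F(m,m) penalty the authors recommend can be written down in a few lines. The toy data below are a hypothetical example of complete separation, where the unpenalized estimate diverges but the penalized one stays finite; all values are illustrative, not from the paper.

```python
import math

def log_f_penalty(beta, m):
    """Log-density (up to a constant) of a log-F(m, m) prior on a coefficient:
    symmetric about 0, with larger m shrinking the estimate harder toward 0."""
    return (m / 2.0) * beta - m * math.log(1.0 + math.exp(beta))

def loglik(beta, data):
    """Toy one-parameter logistic log-likelihood (intercept omitted,
    consistent with the authors' caution against penalizing intercepts)."""
    ll = 0.0
    for x, y in data:
        p = 1.0 / (1.0 + math.exp(-beta * x))
        ll += math.log(p) if y == 1 else math.log(1.0 - p)
    return ll

# Completely separated toy data: the unpenalized MLE of beta is +infinity.
data = [(1, 1), (1, 1), (-1, 0), (-1, 0)]

grid = [i / 100.0 for i in range(0, 1001)]  # beta in [0, 10]
beta_mle = max(grid, key=lambda b: loglik(b, data))
beta_pen = max(grid, key=lambda b: loglik(b, data) + log_f_penalty(b, 1))

print(beta_mle)   # 10.0 -- runs off the grid edge (divergence)
print(beta_pen)   # finite, near log 9 (about 2.2)
```

Varying `m` in `log_f_penalty` is the sensitivity analysis of penalty strength the abstract describes: larger prior degrees of freedom pull `beta_pen` closer to zero.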
Snedden, Gregg A.; Steyer, Gregory D.
2013-01-01
Understanding plant community zonation along estuarine stress gradients is critical for effective conservation and restoration of coastal wetland ecosystems. We related the presence of plant community types to estuarine hydrology at 173 sites across coastal Louisiana. Percent relative cover by species was assessed at each site near the end of the growing season in 2008, and hourly water level and salinity were recorded at each site Oct 2007–Sep 2008. Nine plant community types were delineated with k-means clustering, and indicator species were identified for each of the community types with indicator species analysis. An inverse relation between salinity and species diversity was observed. Canonical correspondence analysis (CCA) effectively segregated the sites across ordination space by community type, and indicated that salinity and tidal amplitude were both important drivers of vegetation composition. Multinomial logistic regression (MLR) and Akaike's Information Criterion (AIC) were used to predict the probability of occurrence of the nine vegetation communities as a function of salinity and tidal amplitude, and probability surfaces obtained from the MLR model corroborated the CCA results. The weighted kappa statistic, calculated from the confusion matrix of predicted versus actual community types, was 0.7 and indicated good agreement between observed community types and model predictions. Our results suggest that models based on a few key hydrologic variables can be valuable tools for predicting vegetation community development when restoring and managing coastal wetlands.
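The agreement statistic reported above is computed from the confusion matrix of predicted versus observed community types. The sketch below implements the simpler unweighted Cohen's kappa (the study uses a weighted variant) on an invented two-class matrix:

```python
def cohens_kappa(matrix):
    """Unweighted Cohen's kappa from a square confusion matrix
    (rows = observed class, columns = predicted class)."""
    n = sum(sum(row) for row in matrix)
    p_obs = sum(matrix[i][i] for i in range(len(matrix))) / n
    p_exp = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix)  # row total * column total
        for i in range(len(matrix))
    ) / n ** 2
    return (p_obs - p_exp) / (1 - p_exp)

# Hypothetical confusion matrix for two community types:
kappa = cohens_kappa([[20, 5], [10, 65]])
print(round(kappa, 3))  # 0.625
```

Kappa discounts the agreement expected by chance (`p_exp`), so a value of 0.7, as in the study, indicates agreement well beyond what random assignment would produce.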
Predicting the "graduate on time (GOT)" of PhD students using binary logistic regression model
NASA Astrophysics Data System (ADS)
Shariff, S. Sarifah Radiah; Rodzi, Nur Atiqah Mohd; Rahman, Kahartini Abdul; Zahari, Siti Meriam; Deni, Sayang Mohd
2016-10-01
The Malaysian government has recently set a goal of producing 60,000 Malaysian PhD holders by the year 2023. As Malaysia's largest institution of higher learning in terms of size and population, offering more than 500 academic programmes in a conducive and vibrant environment, UiTM has taken several initiatives to help fill the gap. Increasing the number of PhD graduates is a challenging process: reaching the target has proven daunting, and progress is slowed further as the attrition rate increases. This study applies a model that incorporates several factors to predict the number of PhD students who will complete their studies on time. A binary logistic regression model is proposed and fitted to the data. The results show that only 6.8% of the 2014 PhD students are predicted to graduate on time, and the predictions are compared with the actual numbers for validation purposes.
[Clinical research XX. From clinical judgment to multiple logistic regression model].
Berea-Baltierra, Ricardo; Rivas-Ruiz, Rodolfo; Pérez-Rodríguez, Marcela; Palacios-Cruz, Lino; Moreno, Jorge; Talavera, Juan O
2014-01-01
The complexity of the causality phenomenon in clinical practice implies that the result of a maneuver is not solely caused by the maneuver, but by the interaction among the maneuver and other baseline factors or variables occurring during the maneuver. This requires methodological designs that allow the evaluation of these variables. When the outcome is a binary variable, we use the multiple logistic regression model (MLRM). This multivariate model is useful when we want to predict or explain, adjusting for the effect of several risk factors, the effect of a maneuver or exposure on the outcome. In order to perform an MLRM, the outcome or dependent variable must be a binary variable and the two categories must be mutually exclusive (e.g., alive/dead, healthy/ill); on the other hand, independent variables or risk factors may be either qualitative or quantitative. The effect measure obtained from this model is the odds ratio (OR) with 95 % confidence intervals (CI), from which we can estimate the proportion of the outcome's variability explained by the risk factors. For these reasons, the MLRM is used in clinical research, since one of the main objectives in clinical practice comprises the ability to predict or explain an event where different risk or prognostic factors are taken into account.
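The OR and its 95% CI described here follow directly from a fitted coefficient and its standard error; the values below are hypothetical:

```python
import math

b, se = 0.8, 0.25  # hypothetical fitted log-odds coefficient and its SE

odds_ratio = math.exp(b)
ci_low = math.exp(b - 1.96 * se)
ci_high = math.exp(b + 1.96 * se)

print(round(odds_ratio, 2))  # 2.23
```

An OR above 1 with a CI excluding 1 (here roughly 1.36 to 3.63) is what identifies the factor as a statistically significant predictor of the outcome.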
Lyles, Robert H; Tang, Li; Superak, Hillary M; King, Caroline C; Celentano, David D; Lo, Yungtai; Sobel, Jack D
2011-07-01
Misclassification of binary outcome variables is a known source of potentially serious bias when estimating adjusted odds ratios. Although researchers have described frequentist and Bayesian methods for dealing with the problem, these methods have seldom fully bridged the gap between statistical research and epidemiologic practice. In particular, there have been few real-world applications of readily grasped and computationally accessible methods that make direct use of internal validation data to adjust for differential outcome misclassification in logistic regression. In this paper, we illustrate likelihood-based methods for this purpose that can be implemented using standard statistical software. Using main study and internal validation data from the HIV Epidemiology Research Study, we demonstrate how misclassification rates can depend on the values of subject-specific covariates, and we illustrate the importance of accounting for this dependence. Simulation studies confirm the effectiveness of the maximum likelihood approach. We emphasize clear exposition of the likelihood function itself, to permit the reader to easily assimilate appended computer code that facilitates sensitivity analyses as well as the efficient handling of main/external and main/internal validation-study data. These methods are readily applicable under random cross-sectional sampling, and we discuss the extent to which the main/internal analysis remains appropriate under outcome-dependent (case-control) sampling.
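The core identity behind such likelihood corrections links the observed-outcome probability to the true-outcome model through sensitivity and specificity. A minimal sketch (all parameter values hypothetical, not from the HIV Epidemiology Research Study):

```python
import math

def p_true(x, b0=-1.0, b1=0.8):
    """Logistic model for the *true* outcome, P(Y = 1 | x)."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def p_observed(x, sens=0.9, spec=0.95, b0=-1.0, b1=0.8):
    """Probability the *recorded* (possibly misclassified) outcome is 1:
    P(Y* = 1 | x) = Se * P(Y = 1 | x) + (1 - Sp) * P(Y = 0 | x).
    The likelihood for the observed data is built from this quantity;
    the paper further lets Se and Sp depend on covariates."""
    p = p_true(x, b0, b1)
    return sens * p + (1.0 - spec) * (1.0 - p)
```

With perfect classification (`sens = spec = 1`) the two probabilities coincide; otherwise `p_observed` is compressed into the interval (1 − Sp, Se), which is the source of the bias the authors correct.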
Li, Wentian; Sun, Fengzhu; Grosse, Ivo
2004-01-01
One important issue commonly encountered in the analysis of microarray data is to decide which and how many genes should be selected for further studies. For discriminant microarray data analyses based on statistical models, such as the logistic regression models, gene selection can be accomplished by a comparison of the maximum likelihood of the model given the real data, L(D|M), and the expected maximum likelihood of the model given an ensemble of surrogate data with randomly permuted label, L(D(0)|M). Typically, the computational burden for obtaining L(D(0)|M) is immense, often exceeding the limits of available computing resources by orders of magnitude. Here, we propose an approach that circumvents such heavy computations by mapping the simulation problem to an extreme-value problem. We present the derivation of an asymptotic distribution of the extreme-value as well as its mean, median, and variance. Using this distribution, we propose two gene selection criteria, and we apply them to two microarray datasets and three classification tasks for illustration.
Predicting the occurrence of wildfires with binary structured additive regression models.
Ríos-Pena, Laura; Kneib, Thomas; Cadarso-Suárez, Carmen; Marey-Pérez, Manuel
2017-02-01
Wildfires are one of the main environmental problems facing societies today, and in the case of Galicia (north-west Spain), they are the main cause of forest destruction. This paper used binary structured additive regression (STAR) for modelling the occurrence of wildfires in Galicia. Binary STAR models are a recent contribution to the classical logistic regression and binary generalized additive models. Their main advantage lies in their flexibility for modelling non-linear effects, while simultaneously incorporating spatial and temporal variables directly, thereby making it possible to reveal possible relationships among the variables considered. The results showed that the occurrence of wildfires depends on many covariates which display variable behaviour across space and time, and which largely determine the likelihood of ignition of a fire. The joint possibility of working on spatial scales with a resolution of 1 × 1 km cells and mapping predictions in a colour range makes STAR models a useful tool for plotting and predicting wildfire occurrence. Lastly, it will facilitate the development of fire behaviour models, which can be invaluable when it comes to drawing up fire-prevention and firefighting plans.
Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.
2003-01-01
Logistic regression was used to predict the probability of debris flows occurring in areas recently burned by wildland fires. Multiple logistic regression is conceptually similar to multiple linear regression because statistical relations between one dependent variable and several independent variables are evaluated. In logistic regression, however, the dependent variable is transformed to a binary variable (debris flow did or did not occur), and the actual probability of the debris flow occurring is statistically modeled. Data from 399 basins located within 15 wildland fires that burned during 2000-2002 in Colorado, Idaho, Montana, and New Mexico were evaluated. More than 35 independent variables describing the burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows were delineated from National Elevation Data using a Geographic Information System (GIS). (2) Data describing the burn severity, geology, land surface gradient, rainfall, and soil properties were determined for each basin. These data were then downloaded to a statistics software package for analysis using logistic regression. (3) Relations between the occurrence/non-occurrence of debris flows and burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated and several preliminary multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combination produced the most effective model. The multivariate model that best predicted the occurrence of debris flows was selected. (4) The multivariate logistic regression model was entered into a GIS, and a map showing the probability of debris flows was constructed. The most effective model incorporates the percentage of each basin with slope greater than 30 percent, percentage of land burned at medium and high burn severity
ERIC Educational Resources Information Center
Gordovil-Merino, Amalia; Guardia-Olmos, Joan; Pero-Cebollero, Maribel
2012-01-01
In this paper, we used simulations to compare the performance of classical and Bayesian estimations in logistic regression models using small samples. In the performed simulations, conditions were varied, including the type of relationship between independent and dependent variable values (i.e., unrelated and related values), the type of variable…
ERIC Educational Resources Information Center
Le, Huy; Marcus, Justin
2012-01-01
This study used Monte Carlo simulation to examine the properties of the overall odds ratio (OOR), which was recently introduced as an index for overall effect size in multiple logistic regression. It was found that the OOR was relatively independent of study base rate and performed better than most commonly used R-square analogs in indexing model…
ERIC Educational Resources Information Center
Guler, Nese; Penfield, Randall D.
2009-01-01
In this study, we investigate the logistic regression (LR), Mantel-Haenszel (MH), and Breslow-Day (BD) procedures for the simultaneous detection of both uniform and nonuniform differential item functioning (DIF). A simulation study was used to assess and compare the Type I error rate and power of a combined decision rule (CDR), which assesses DIF…
NASA Astrophysics Data System (ADS)
Ghazali, Amirul Syafiq Mohd; Ali, Zalila; Noor, Norlida Mohd; Baharum, Adam
2015-10-01
Multinomial logistic regression is widely used to model the outcomes of a polytomous response variable, a categorical dependent variable with more than two categories. The model assumes that the conditional mean of the dependent categorical variables is the logistic function of an affine combination of predictor variables. Its procedure gives a number of logistic regression models that make specific comparisons of the response categories. When there are q categories of the response variable, the model consists of q-1 logit equations which are fitted simultaneously. The model is validated by variable selection procedures, tests of regression coefficients, a significance test of the overall model, goodness-of-fit measures, and validation of predicted probabilities using odds ratios. This study used the multinomial logistic regression model to investigate obesity and overweight among primary school students in a rural area on the basis of their demographic profiles, lifestyles, and diet and food intake. The results indicated that obesity and overweight of students are related to gender, religion, sleep duration, time spent on electronic games, breakfast intake in a week, with whom meals are taken, protein intake, and also the interaction between breakfast intake in a week and sleep duration, and the interaction between gender and protein intake.
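The q-1 logit construction described above converts to category probabilities by fixing the reference category's logit at zero. A sketch with invented coefficients for q = 3:

```python
import math

def multinomial_probs(x, coefs):
    """Category probabilities from q-1 logit equations; the last category is
    the reference, whose logit is fixed at 0. `coefs` holds one
    (intercept, slope) pair per non-reference category."""
    logits = [b0 + b1 * x for b0, b1 in coefs] + [0.0]  # reference logit = 0
    denom = sum(math.exp(z) for z in logits)
    return [math.exp(z) / denom for z in logits]

# q = 3 categories -> 2 logit equations, hypothetical coefficients:
probs = multinomial_probs(1.0, [(0.2, 0.5), (-0.4, 1.1)])
print(round(sum(probs), 6))  # 1.0
```

Each fitted coefficient exponentiates to an odds ratio of its category against the reference, which is how the comparisons among response categories mentioned in the abstract are read off.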
ERIC Educational Resources Information Center
Hidalgo, M. Dolores; Lopez-Pina, Jose Antonio
2004-01-01
This article compares several procedures in their efficacy for detecting differential item functioning (DIF): logistic regression analysis, the Mantel-Haenszel (MH) procedure, and the modified Mantel-Haenszel procedure by Mazor, Clauser, and Hambleton. It also compares the effect size measures that these procedures provide. In this study,…
ERIC Educational Resources Information Center
Taylor, Aaron B.; West, Stephen G.; Aiken, Leona S.
2006-01-01
Variables that have been coarsely categorized into a small number of ordered categories are often modeled as outcome variables in psychological research. The authors employ a Monte Carlo study to investigate the effects of this coarse categorization of dependent variables on power to detect true effects using three classes of regression models:…
Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.; Michael, John A.; Helsel, Dennis R.
2008-01-01
Logistic regression was used to develop statistical models that can be used to predict the probability of debris flows in areas recently burned by wildfires by using data from 14 wildfires that burned in southern California during 2003-2006. Twenty-eight independent variables describing the basin morphology, burn severity, rainfall, and soil properties of 306 drainage basins located within those burned areas were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows soon after the 2003 to 2006 fires were delineated from data in the National Elevation Dataset using a geographic information system; (2) Data describing the basin morphology, burn severity, rainfall, and soil properties were compiled for each basin. These data were then input to a statistics software package for analysis using logistic regression; and (3) Relations between the occurrence or absence of debris flows and the basin morphology, burn severity, rainfall, and soil properties were evaluated, and five multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combinations produced the most effective models, and the multivariate models that best predicted the occurrence of debris flows were identified. Percentage of high burn severity and 3-hour peak rainfall intensity were significant variables in all models. Soil organic matter content and soil clay content were significant variables in all models except Model 5. Soil slope was a significant variable in all models except Model 4. The most suitable model can be selected from these five models on the basis of the availability of independent variables in the particular area of interest and field checking of probability maps. The multivariate logistic regression models can be entered into a geographic information system, and maps showing the probability of debris flows can be constructed in recently burned areas of
LOGISTIC NETWORK REGRESSION FOR SCALABLE ANALYSIS OF NETWORKS WITH JOINT EDGE/VERTEX DYNAMICS.
Almquist, Zack W; Butts, Carter T
2014-08-01
Change in group size and composition has long been an important area of research in the social sciences. Similarly, interest in interaction dynamics has a long history in sociology and social psychology. However, the effects of endogenous group change on interaction dynamics are a surprisingly understudied area. One way to explore these relationships is through social network models. Network dynamics may be viewed as a process of change in the edge structure of a network, in the vertex set on which edges are defined, or in both simultaneously. Although early studies of such processes were primarily descriptive, recent work on this topic has increasingly turned to formal statistical models. Although showing great promise, many of these modern dynamic models are computationally intensive and scale very poorly in the size of the network under study and/or the number of time points considered. Likewise, currently used models focus on edge dynamics, with little support for endogenously changing vertex sets. Here, the authors show how an existing approach based on logistic network regression can be extended to serve as a highly scalable framework for modeling large networks with dynamic vertex sets. The authors place this approach within a general dynamic exponential family (exponential-family random graph modeling) context, clarifying the assumptions underlying the framework (and providing a clear path for extensions), and they show how model assessment methods for cross-sectional networks can be extended to the dynamic case. Finally, the authors illustrate this approach on a classic data set involving interactions among windsurfers on a California beach.
Bias-reduced and separation-proof conditional logistic regression with small or sparse data sets.
Heinze, Georg; Puhr, Rainer
2010-03-30
Conditional logistic regression is used for the analysis of binary outcomes when subjects are stratified into several subsets, e.g. matched pairs or blocks. Log odds ratio estimates are usually found by maximizing the conditional likelihood. This approach eliminates all strata-specific parameters by conditioning on the number of events within each stratum. However, in the analyses of both an animal experiment and a lung cancer case-control study, conditional maximum likelihood (CML) resulted in infinite odds ratio estimates and monotone likelihood. Estimation can be improved by using Cytel Inc.'s well-known LogXact software, which provides a median unbiased estimate and exact or mid-p confidence intervals. Here, we suggest and outline point and interval estimation based on maximization of a penalized conditional likelihood in the spirit of Firth's (Biometrika 1993; 80:27-38) bias correction method (CFL). We present comparative analyses of both studies, demonstrating some advantages of CFL over competitors. We report on a small-sample simulation study where CFL log odds ratio estimates were almost unbiased, whereas LogXact estimates showed some bias and CML estimates exhibited serious bias. Confidence intervals and tests based on the penalized conditional likelihood had close-to-nominal coverage rates and yielded highest power among all methods compared, respectively. Therefore, we propose CFL as an attractive solution to the stratified analysis of binary data, irrespective of the occurrence of monotone likelihood. A SAS program implementing CFL is available at: http://www.muw.ac.at/msi/biometrie/programs.
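For a 1:1 matched design, the conditional likelihood the authors penalize takes an especially simple form: each pair contributes the probability that the actual case, rather than its matched control, is the case, and the stratum-specific intercepts cancel. A minimal sketch (illustrative, without the Firth-type penalty):

```python
import math

def conditional_loglik(beta, pairs):
    """Conditional log-likelihood for 1:1 matched pairs. Each pair
    (x_case, x_control) contributes
    log[ exp(beta*x_case) / (exp(beta*x_case) + exp(beta*x_control)) ],
    i.e. the log-sigmoid of beta * (x_case - x_control); the
    stratum-specific intercepts cancel out by conditioning."""
    ll = 0.0
    for x_case, x_control in pairs:
        diff = x_case - x_control
        ll += -math.log(1.0 + math.exp(-beta * diff))
    return ll

# A pair with identical exposures is uninformative: it contributes log(1/2)
# for every beta, so it cannot move the estimate.
print(conditional_loglik(2.0, [(1.0, 1.0)]))  # == -log(2)
```

When every informative pair has the case more exposed than the control, this log-likelihood increases monotonically in beta, which is exactly the monotone-likelihood problem the penalized (CFL) approach resolves.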
Nakasone, Yutaka; Ikeda, Osamu; Yamashita, Yasuyuki; Kudoh, Kouichi; Shigematsu, Yoshinori; Harada, Kazunori
2007-09-15
We applied multivariate analysis to the clinical findings in patients with acute gastrointestinal (GI) hemorrhage and compared the relationship between these findings and angiographic evidence of extravasation. Our study population consisted of 46 patients with acute GI bleeding. They were divided into two groups. In group 1 we retrospectively analyzed 41 angiograms obtained in 29 patients (age range, 25-91 years; average, 71 years). Their clinical findings, including the shock index (SI), diastolic blood pressure, hemoglobin, platelet count, and age, were quantitatively analyzed. In group 2, consisting of 17 patients (age range, 21-78 years; average, 60 years), we prospectively applied statistical analysis by a logistic regression model to their clinical findings and then assessed 21 angiograms obtained in these patients to determine whether our model was useful for predicting the presence of angiographic evidence of extravasation. On 18 of 41 (43.9%) angiograms in group 1 there was evidence of extravasation; in 3 patients it was demonstrated only by selective angiography. Factors significantly associated with angiographic visualization of extravasation were the SI and patient age. For differentiation between cases with and cases without angiographic evidence of extravasation, the maximum cutoff point was between 0.51 and 0.53. Of the 21 angiograms obtained in group 2, 13 (61.9%) showed evidence of extravasation; in 1 patient it was demonstrated only on selective angiograms. We found that in 90% of the cases, the prospective application of our model correctly predicted the angiographically confirmed presence or absence of extravasation. We conclude that in patients with GI hemorrhage, angiographic visualization of extravasation is associated with the pre-embolization SI. Patients with a high SI value should undergo study to facilitate optimal treatment planning.
Duchesne, Thierry; Fortin, Daniel
2017-01-01
Conditional logistic regression (CLR) is widely used to analyze habitat selection and movement of animals when resource availability changes over space and time. Observations used for these analyses are typically autocorrelated, which biases model-based variance estimation of CLR parameters. This bias can be corrected using generalized estimating equations (GEE), an approach that requires partitioning the data into independent clusters. Here we establish the link between clustering rules in GEE and their effectiveness in removing statistical bias from variance estimation of CLR parameters. The current lack of guidelines is such that broad variation in clustering rules can be found among studies (e.g., 14–450 clusters), with unknown consequences for the robustness of statistical inference. We simulated datasets reflecting conditions typical of field studies. Longitudinal data were generated based on several parameters of habitat selection, with varying strength of autocorrelation and some individuals having more observations than others. We then evaluated how changing the number of clusters impacted the effectiveness of variance estimators. Simulations revealed that 30 clusters were sufficient to get unbiased and relatively precise estimates of the variance of parameter estimates. The use of destructive sampling to increase the number of independent clusters was successful at removing statistical bias, but only when observations were temporally autocorrelated and the strength of inter-individual heterogeneity was weak. GEE also provided robust estimates of variance for different magnitudes of unbalanced datasets. Our simulations demonstrate that GEE should be applied by assigning each individual to a cluster when at least 30 animals are followed, or by using destructive sampling for studies with fewer individuals having an intermediate level of behavioural plasticity in selection and temporally autocorrelated observations. The simulations provide valuable information to…
Binary logistic regression analysis of hard palate dimensions for sexing human crania
Asif, Muhammed; Shetty, Radhakrishna; Avadhani, Ramakrishna
2016-01-01
Sex determination is the preliminary step in every forensic investigation, and the hard palate assumes significance in cranial sexing in cases involving burns and explosions due to its resistant nature and secluded location. This study analyzes the sexing potential of the incisive foramen to posterior nasal spine length, palatine process of maxilla length, horizontal plate of palatine bone length and transverse length between the greater palatine foramina. The study deviates from the conventional method of measuring the maxillo-alveolar length and breadth, as the dimensions considered in this study are more heat resistant and useful in situations with damaged alveolar margins. The study involves 50 male and 50 female adult dry skulls of an Indian ethnic group. The dimensions measured were statistically analyzed using Student's t test, binary logistic regression and receiver operating characteristic curves. It was observed that the incisive foramen to posterior nasal spine length is a definite sex marker, with sex predictability of 87.2%. The palatine process of maxilla length, with 66.8% sex predictability, and the horizontal plate of palatine bone length, with 71.9% sex predictability, cannot be relied upon as definite sex markers. The transverse length between the greater palatine foramina is statistically insignificant in sexing crania (P=0.318). Considering the significant overlap of values between the sexes, the palatal dimensions singularly cannot be relied upon for sexing. Nevertheless, considering the high sex predictability of the incisive foramen to posterior nasal spine length, this dimension can definitely be used to supplement other sexing evidence available to precisely conclude the cranial sex. PMID:27382518
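The basic workflow in studies like this one, fitting a binary logistic model to a craniometric dimension and reporting in-sample sex predictability, can be sketched in plain Python. The lengths below are invented for illustration (they are not the study's measurements), and the fit uses simple gradient ascent rather than the iteratively reweighted least squares used by standard packages:

```python
import math
import random

def fit_logistic(xs, ys, lr=0.5, epochs=3000):
    """Fit P(y=1|x) = 1/(1+exp(-(b0 + b1*x))) by batch gradient ascent on the log-likelihood."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p          # gradient wrt intercept
            g1 += (y - p) * x    # gradient wrt slope
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

random.seed(0)
# Hypothetical incisive foramen-to-posterior nasal spine lengths in mm
# (males centred near 45, females near 41; invented values, not the study's data).
males = [random.gauss(45, 2) for _ in range(50)]
females = [random.gauss(41, 2) for _ in range(50)]
xs = males + females
ys = [1] * 50 + [0] * 50                      # 1 = male, 0 = female
mean_x = sum(xs) / len(xs)
xc = [x - mean_x for x in xs]                 # centre the predictor for stable fitting

b0, b1 = fit_logistic(xc, ys)
predicted = [1 if 1.0 / (1.0 + math.exp(-(b0 + b1 * x))) >= 0.5 else 0 for x in xc]
accuracy = sum(p == y for p, y in zip(predicted, ys)) / len(ys)
print(round(accuracy, 3))                     # in-sample "sex predictability"
```

In practice the fit would be done with a dedicated routine (e.g. R's `glm` or statsmodels), but the gradient form makes the estimation target explicit.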
Held, Elizabeth; Cape, Joshua; Tintle, Nathan
2016-01-01
Machine learning methods continue to show promise in the analysis of data from genetic association studies because of the high number of variables relative to the number of observations. However, few best practices exist for the application of these methods. We extend a recently proposed supervised machine learning approach for predicting disease risk by genotypes to be able to incorporate gene expression data and rare variants. We then apply 2 different versions of the approach (radial and linear support vector machines) to simulated data from Genetic Analysis Workshop 19 and compare performance to logistic regression. Method performance was not radically different across the 3 methods, although the linear support vector machine tended to show small gains in predictive ability relative to a radial support vector machine and logistic regression. Importantly, as the number of genes in the models was increased, even when those genes contained causal rare variants, model predictive ability showed a statistically significant decrease in performance for both the radial support vector machine and logistic regression. The linear support vector machine showed more robust performance to the inclusion of additional genes. Further work is needed to evaluate machine learning approaches on larger samples and to evaluate the relative improvement in model prediction from the incorporation of gene expression data.
Education-Based Gaps in eHealth: A Weighted Logistic Regression Approach
2016-01-01
Background Persons with a college degree are more likely to engage in eHealth behaviors than persons without a college degree, compounding the health disadvantages of undereducated groups in the United States. However, the extent to which quality of recent eHealth experience reduces the education-based eHealth gap is unexplored. Objective The goal of this study was to examine how eHealth information search experience moderates the relationship between college education and eHealth behaviors. Methods Based on a nationally representative sample of adults who reported using the Internet to conduct the most recent health information search (n=1458), I evaluated eHealth search experience in relation to the likelihood of engaging in different eHealth behaviors. I examined whether Internet health information search experience reduces the eHealth behavior gaps between college-educated and non-college-educated adults. Weighted logistic regression models were used to estimate the probability of different eHealth behaviors. Results College education was significantly positively related to the likelihood of 4 eHealth behaviors. In general, eHealth search experience was negatively associated with health care behaviors, health information-seeking behaviors, and user-generated or content sharing behaviors after accounting for other covariates. Whereas Internet health information search experience has narrowed the education gap in terms of likelihood of using email or the Internet to communicate with a doctor or health care provider and likelihood of using a website to manage diet, weight, or health, it has widened the education gap in the instances of searching for health information for oneself, searching for health information for someone else, and downloading health information on a mobile device. Conclusion The relationship between college education and eHealth behaviors is moderated by Internet health information search experience in different ways depending on the type of eHealth behavior.
Wagner, Philippe; Ghith, Nermin; Leckie, George
2016-01-01
Background and Aim Many multilevel logistic regression analyses of “neighbourhood and health” focus on interpreting measures of association (e.g., odds ratio, OR). In contrast, multilevel analysis of variance is rarely considered. We propose an original stepwise analytical approach that distinguishes between “specific” (measures of association) and “general” (measures of variance) contextual effects. Using two empirical examples, we illustrate the methodology, interpret the results and discuss the implications of this kind of analysis in public health. Methods We analyse 43,291 individuals residing in 218 neighbourhoods in the city of Malmö, Sweden in 2006. We study two individual outcomes (psychotropic drug use and choice of private vs. public general practitioner, GP) for which the relative importance of neighbourhood as a source of individual variation differs substantially. In Step 1 of the analysis, we evaluate the OR and the area under the receiver operating characteristic (AUC) curve for individual-level covariates (i.e., age, sex and individual low income). In Step 2, we assess general contextual effects using the AUC. Finally, in Step 3 the OR for a specific neighbourhood characteristic (i.e., neighbourhood income) is interpreted jointly with the proportional change in variance (PCV) and the proportion of ORs in the opposite direction (POOR) statistics. Results For both outcomes, information on individual characteristics (Step 1) provided low discriminatory accuracy (AUC = 0.616 for psychotropic drug use; 0.600 for choosing a private GP). Accounting for neighbourhood of residence (Step 2) only improved the AUC for choosing a private GP (+0.295 units). High neighbourhood income (Step 3) was strongly associated with choosing a private GP (OR = 3.50), but the PCV was only 11% and the POOR 33%. Conclusion Applying an innovative stepwise multilevel analysis, we observed that, in Malmö, the neighbourhood context per se had a negligible…
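The discriminatory accuracy (AUC) used in Steps 1 and 2 has a simple probabilistic reading: it is the probability that a randomly chosen case receives a higher predicted probability than a randomly chosen non-case. A minimal sketch of that Mann-Whitney form, with made-up predicted probabilities:

```python
def auc(scores_pos, scores_neg):
    """AUC = P(score_pos > score_neg) + 0.5 * P(tie), over all case/non-case pairs."""
    wins = ties = 0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1
            elif sp == sn:
                ties += 1
    return (wins + 0.5 * ties) / (len(scores_pos) * len(scores_neg))

# Hypothetical model probabilities for users (cases) and non-users of a private GP
cases = [0.9, 0.8, 0.7, 0.6]
non_cases = [0.5, 0.4, 0.7, 0.2]
print(auc(cases, non_cases))  # 1.0 would mean perfect separation; 0.5 means chance
```

An AUC of 0.6, as in Step 1 of the study, means the model's covariates barely improve on a coin flip for ranking cases above non-cases.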
The Application of the Cumulative Logistic Regression Model to Automated Essay Scoring
ERIC Educational Resources Information Center
Haberman, Shelby J.; Sinharay, Sandip
2010-01-01
Most automated essay scoring programs use a linear regression model to predict an essay score from several essay features. This article applied a cumulative logit model instead of the linear regression model to automated essay scoring. Comparison of the performances of the linear regression model and the cumulative logit model was performed on a…
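A cumulative logit (proportional-odds) model replaces the single linear prediction with one slope and K−1 ordered intercepts; category probabilities come from differencing the cumulative probabilities. A sketch with invented coefficients for a 4-point essay score:

```python
import math

def cumulative_logit_probs(x, cutpoints, beta):
    """P(Y <= k | x) = logistic(c_k - beta*x); successive differences give
    the probability of each ordered category."""
    logistic = lambda z: 1.0 / (1.0 + math.exp(-z))
    cum = [logistic(c - beta * x) for c in cutpoints] + [1.0]
    return [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, len(cum))]

# Invented parameters: 3 increasing cutpoints for scores 1..4, one essay feature x
probs = cumulative_logit_probs(x=1.2, cutpoints=[-1.0, 0.5, 2.0], beta=1.5)
print([round(p, 3) for p in probs])  # probabilities of scores 1, 2, 3, 4
```

Unlike a linear regression score predictor, the output is a full probability distribution over the ordered score categories, which is what makes the cumulative logit attractive for essay scoring.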
Kleinman, Lawrence C; Norton, Edward C
2009-01-01
Objective To develop and validate a general method (called regression risk analysis) to estimate adjusted risk measures from logistic and other nonlinear multiple regression models. We show how to estimate standard errors for these estimates. These measures could supplant various approximations (e.g., adjusted odds ratio [AOR]) that may diverge, especially when outcomes are common. Study Design Regression risk analysis estimates were compared with internal standards as well as with Mantel–Haenszel estimates, Poisson and log-binomial regressions, and a widely used (but flawed) equation to calculate adjusted risk ratios (ARR) from AOR. Data Collection Data sets produced using Monte Carlo simulations. Principal Findings Regression risk analysis accurately estimates ARR and differences directly from multiple regression models, even when confounders are continuous, distributions are skewed, outcomes are common, and effect size is large. It is statistically sound and intuitive, and has properties favoring it over other methods in many cases. Conclusions Regression risk analysis should be the new standard for presenting findings from multiple regression analysis of dichotomous outcomes for cross-sectional, cohort, and population-based case–control studies, particularly when outcomes are common or effect size is large. PMID:18793213
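The divergence the authors target is easy to reproduce numerically: when the outcome is common, the adjusted odds ratio overstates the risk ratio, and the familiar rare-disease shortcut fails. A toy calculation with invented risks:

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)

# Hypothetical adjusted risks in exposed vs. unexposed groups, common outcome
p_exposed, p_unexposed = 0.60, 0.40
rr = p_exposed / p_unexposed                 # risk ratio: 1.5
or_ = odds(p_exposed) / odds(p_unexposed)    # odds ratio: 2.25
print(rr, or_)  # the OR (2.25) clearly overstates the RR (1.5)

# With a rare outcome the two nearly coincide:
p_rare_exp, p_rare_unexp = 0.006, 0.004
print(p_rare_exp / p_rare_unexp, odds(p_rare_exp) / odds(p_rare_unexp))
```

Regression risk analysis, as described in the abstract, estimates the risk measures directly from the fitted model rather than converting the AOR after the fact.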
NASA Astrophysics Data System (ADS)
Conoscenti, Christian; Angileri, Silvia; Cappadonia, Chiara; Rotigliano, Edoardo; Agnesi, Valerio; Märker, Michael
2014-01-01
This research aims to characterize susceptibility conditions to gully erosion by means of GIS and multivariate statistical analysis. The study area is a 9.5 km2 river catchment in central-northern Sicily, where agricultural activities are limited by intense erosion. By means of field surveys and interpretation of aerial images, we prepared a digital map of the spatial distribution of 260 gullies in the study area. In addition, from available thematic maps, a 5 m cell size digital elevation model and field checks, we derived 27 environmental attributes that describe the variability of lithology, land use, topography and road position. These attributes were selected for their potential influence on erosion processes, while the dependent variable was given by the presence or absence of gullies within two different types of mapping units: 5 m grid cells and slope units (average size = 2.66 ha). The functional relationships between gully occurrence and the controlling factors were obtained from forward stepwise logistic regression to calculate the probability of hosting a gully for each mapping unit. In order to train and test the predictive models, three calibration and three validation subsets, of both grid cells and slope units, were randomly selected. Results of validation, based on ROC (receiver operating characteristic) curves, attest to acceptable to excellent accuracy of the models, showing better predictive skill and more stable performance of the susceptibility model based on grid cells.
NASA Astrophysics Data System (ADS)
Conoscenti, Christian; Ciaccio, Marilena; Caraballo-Arias, Nathalie Almaru; Gómez-Gutiérrez, Álvaro; Rotigliano, Edoardo; Agnesi, Valerio
2015-08-01
In this paper, terrain susceptibility to earth-flow occurrence was evaluated by using geographic information systems (GIS) and two statistical methods: logistic regression (LR) and multivariate adaptive regression splines (MARS). LR has already been demonstrated to provide reliable predictions of earth-flow occurrence, whereas MARS, as far as we know, has never been used to generate earth-flow susceptibility models. The experiment was carried out in a basin of western Sicily (Italy), which extends for 51 km2 and is severely affected by earth-flows. In total, we mapped 1376 earth-flows, covering an area of 4.59 km2. To explore the effect of pre-failure topography on earth-flow spatial distribution, we performed a reconstruction of the topography before landslide occurrence. This was achieved by preparing a digital terrain model (DTM) in which the altitude of areas hosting landslides was interpolated from the adjacent undisturbed land surface by using the topo-to-raster algorithm. This DTM was exploited to extract 15 morphological and hydrological variables that, in addition to outcropping lithology, were employed as explanatory variables of earth-flow spatial distribution. The predictive skill of the earth-flow susceptibility models and the robustness of the procedure were tested by preparing five datasets, each including a different subset of landslides and stable areas. The accuracy of the predictive models was evaluated by drawing receiver operating characteristic (ROC) curves and by calculating the area under the ROC curve (AUC). The results demonstrate that the overall accuracy of the LR and MARS earth-flow susceptibility models is from excellent to outstanding. However, AUC values of the validation datasets attest to a higher predictive power of MARS models (AUC between 0.881 and 0.912) with respect to LR models (AUC between 0.823 and 0.870). The adopted procedure proved to be resistant to overfitting and stable when the learning and validation samples were changed.
Staley, Dennis M.; Negri, Jacquelyn A.; Kean, Jason W.; Laber, Jayme L.; Tillery, Anne C.; Youberg, Ann M.
2016-06-30
Wildfire can significantly alter the hydrologic response of a watershed to the extent that even modest rainstorms can generate dangerous flash floods and debris flows. To reduce public exposure to hazard, the U.S. Geological Survey produces post-fire debris-flow hazard assessments for select fires in the western United States. We use publicly available geospatial data describing basin morphology, burn severity, soil properties, and rainfall characteristics to estimate the statistical likelihood that debris flows will occur in response to a storm of a given rainfall intensity. Using an empirical database and refined geospatial analysis methods, we defined new equations for the prediction of debris-flow likelihood using logistic regression methods. We showed that the new logistic regression model outperformed previous models used to predict debris-flow likelihood.
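The "statistical likelihood" in such assessments comes directly from the logistic link: the fitted linear predictor in the basin's attributes is inverted to a probability. A schematic sketch with invented predictors and coefficients (not the published USGS model):

```python
import math

def debris_flow_likelihood(intensity, burn_frac, gradient, coef):
    """P(debris flow) = logistic(b0 + b1*I + b2*B + b3*G); illustrative predictors only:
    rainfall intensity (mm/h), burned-area fraction, and basin gradient."""
    b0, b1, b2, b3 = coef
    z = b0 + b1 * intensity + b2 * burn_frac + b3 * gradient
    return 1.0 / (1.0 + math.exp(-z))

coef = (-3.0, 0.06, 2.0, 1.5)  # made-up coefficients for illustration
low = debris_flow_likelihood(intensity=10, burn_frac=0.2, gradient=0.3, coef=coef)
high = debris_flow_likelihood(intensity=40, burn_frac=0.8, gradient=0.6, coef=coef)
print(round(low, 3), round(high, 3))  # likelihood rises with rainfall and burn severity
```

The operational appeal is that each input is available from geospatial data shortly after a fire, so likelihoods can be mapped basin by basin before the first storm.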
Chen, Chau-Kuang; Bruce, Michelle; Tyler, Lauren; Brown, Claudine; Garrett, Angelica; Goggins, Susan; Lewis-Polite, Brandy; Weriwoh, Mirabel L; Juarez, Paul D; Hood, Darryl B; Skelton, Tyler
2013-02-01
The goal of this study was to analyze a 54-item instrument for assessment of perception of exposure to environmental contaminants within the context of the built environment, or exposome. This exposome was defined in five domains to include 1) home and hobby, 2) school, 3) community, 4) occupation, and 5) exposure history. Interviews were conducted with child-bearing-age minority women at Metro Nashville General Hospital at Meharry Medical College. Data were analyzed using DTReg software for Support Vector Machine (SVM) modeling, followed by SPSS for logistic regression modeling. The target (outcome) variable of interest was the respondent's residence by ZIP code. The results demonstrate that the rank order of important variables with respect to SVM modeling versus traditional logistic regression models is almost identical. This is the first study documenting that SVM analysis has discriminant power for determination of higher-ordered spatial relationships on an environmental exposure history questionnaire.
Li, J.; Gray, B.R.; Bates, D.M.
2008-01-01
Partitioning the variance of a response by design levels is challenging for binomial and other discrete outcomes. Goldstein (2003) proposed four definitions for variance partitioning coefficients (VPC) under a two-level logistic regression model. In this study, we explicitly derived formulae for the multi-level logistic regression model and subsequently studied the distributional properties of the calculated VPCs. Using simulations and a vegetation dataset, we demonstrated associations between different VPC definitions, the importance of methods for estimating VPCs (by comparing VPCs obtained using Laplace and penalized quasi-likelihood methods), and bivariate dependence between VPCs calculated at different levels. Such an empirical study lends immediate support to wider applications of VPCs in scientific data analysis.
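The most widely used of Goldstein's definitions, the latent-variable VPC, has a closed form because a logistic model's level-1 variance on the latent scale is fixed at π²/3 (the variance of the standard logistic distribution). A minimal sketch of that definition:

```python
import math

def vpc_latent(sigma2_u):
    """Latent-variable VPC for a two-level logistic model: the share of latent-scale
    variance attributable to the cluster (level-2) random intercept."""
    level1 = math.pi ** 2 / 3.0  # fixed level-1 variance on the latent logistic scale
    return sigma2_u / (sigma2_u + level1)

# Example: a cluster-level variance of 1.0 on the log-odds scale
print(round(vpc_latent(1.0), 3))  # about 0.233
```

The other definitions (simulation-based, linearized, and binary-scale) generally give different numbers for the same fitted model, which is one reason the abstract's comparison across definitions matters.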
NASA Astrophysics Data System (ADS)
Colkesen, Ismail; Sahin, Emrehan Kutlug; Kavzoglu, Taskin
2016-06-01
Identification of landslide-prone areas and production of accurate landslide susceptibility zonation maps have been crucial topics for hazard management studies. Since the prediction of susceptibility is one of the main processing steps in landslide susceptibility analysis, selection of a suitable prediction method plays an important role in the success of the susceptibility zonation process. Although simple statistical algorithms (e.g. logistic regression) have been widely used in the literature, the use of advanced non-parametric algorithms in landslide susceptibility zonation has recently become an active research topic. The main purpose of this study is to investigate the possible application of kernel-based Gaussian process regression (GPR) and support vector regression (SVR) for producing a landslide susceptibility map of the Tonya district of Trabzon, Turkey. Results of these two regression methods were compared with the logistic regression (LR) method, regarded as a benchmark. Results showed that while the kernel-based GPR and SVR methods generally produced similar results (90.46% and 90.37%, respectively), they outperformed the conventional LR method by about 18%. Statistical tests based on ROC statistics, success rate and prediction rate curves confirmed the superiority of the kernel-based methods, revealing a significant improvement in susceptibility map accuracy from applying GPR and SVR.
Engoren, Milo; Habib, Robert H; Dooner, John J; Schwann, Thomas A
2013-08-01
As many as 14 % of patients undergoing coronary artery bypass surgery are readmitted within 30 days. Readmission is usually the result of morbidity and may lead to death. The purpose of this study is to develop and compare statistical and genetic programming models to predict readmission. Patients were divided into separate Construction and Validation populations. Using 88 variables, logistic regression, genetic programs, and artificial neural nets were used to develop predictive models. Models were first constructed and tested on the Construction population, then validated on the Validation population. Areas under the receiver operating characteristic curve (AU ROC) were used to compare the models. Two hundred and two patients (7.6 %) in the 2,644-patient Construction group and 216 (8.0 %) of the 2,711-patient Validation group were re-admitted within 30 days of CABG surgery. Logistic regression predicted readmission with AU ROC = .675 ± .021 in the Construction group. Genetic programs significantly improved the accuracy (AU ROC = .767 ± .001, p < .001). Artificial neural nets were less accurate, with AU ROC = 0.597 ± .001 in the Construction group. The predictive accuracy of all three techniques fell in the Validation group. However, the accuracy of genetic programming (AU ROC = .654 ± .001) was still marginally, though not statistically significantly, better than that of logistic regression (AU ROC = .644 ± .020, p = .61). Genetic programming and logistic regression provide alternative methods to predict readmission that are similarly accurate.
Hanson, Erin K; Mirza, Mohid; Rekab, Kamel; Ballantyne, Jack
2014-11-01
We report the identification of sensitive and specific miRNA biomarkers for menstrual blood, a tissue that might provide probative information in certain specialized instances. We incorporated these biomarkers into qPCR assays and developed a quantitative statistical model using logistic regression that permits the prediction of menstrual blood in a forensic sample with a high, and measurable, degree of accuracy. Using the developed model, we achieved 100% accuracy in determining the body fluid of interest for a set of test samples (i.e. samples not used in model development). The development, and details, of the logistic regression model are described. Testing and evaluation of the finalized logistic regression modeled assay using a small number of samples was carried out to preliminarily estimate the limit of detection (LOD), specificity in admixed samples and expression of the menstrual blood miRNA biomarkers throughout the menstrual cycle (25-28 days). The LOD was <1 ng of total RNA, the assay performed as expected with admixed samples and menstrual blood was identified only during the menses phase of the female reproductive cycle in two donors.
Zhao, Rui-Na; Zhang, Bo; Yang, Xiao; Jiang, Yu-Xin; Lai, Xing-Jian; Zhang, Xiao-Yan
2015-12-01
The purpose of the study described here was to determine specific characteristics of thyroid microcarcinoma (TMC) and explore the value of contrast-enhanced ultrasound (CEUS) combined with conventional ultrasound (US) in the diagnosis of TMC. Characteristics of 63 patients with TMC and 39 with benign sub-centimeter thyroid nodules were retrospectively analyzed. Multivariate logistic regression analysis was performed to determine independent risk factors. Four variables were included in the logistic regression models: age, shape, blood flow distribution and enhancement pattern. The area under the receiver operating characteristic curve was 0.919. With 0.113 selected as the cutoff value, sensitivity, specificity, positive predictive value, negative predictive value and accuracy were 90.5%, 82.1%, 89.1%, 84.2% and 87.3%, respectively. Independent risk factors for TMC determined with the combination of CEUS and conventional US were age, shape, blood flow distribution and enhancement pattern. Age was negatively correlated with malignancy, whereas shape, blood flow distribution and enhancement pattern were positively correlated. The logistic regression model involving CEUS and conventional US was found to be effective in the diagnosis of sub-centimeter thyroid nodules.
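All five reported metrics fall out of a single 2×2 table obtained by thresholding the model's predicted probabilities at the chosen cutoff (0.113 here). The counts below are not taken from the paper; they were chosen so that the resulting table reproduces the reported 90.5%/82.1%/89.1%/84.2%/87.3% values for the 63 malignant and 39 benign nodules:

```python
def diagnostics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, NPV and accuracy from a 2x2 confusion table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv":         tp / (tp + fp),
        "npv":         tn / (tn + fn),
        "accuracy":    (tp + tn) / (tp + fp + fn + tn),
    }

# Hypothetical counts consistent with 63 TMC and 39 benign nodules
d = diagnostics(tp=57, fp=7, fn=6, tn=32)
print({k: round(v, 3) for k, v in d.items()})
```

Moving the cutoff away from 0.113 would trade sensitivity against specificity along the same ROC curve whose AUC the abstract reports as 0.919.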
Bent, Gardner C.; Steeves, Peter A.
2006-01-01
A revised logistic regression equation and an automated procedure were developed for mapping the probability of a stream flowing perennially in Massachusetts. The equation provides city and town conservation commissions and the Massachusetts Department of Environmental Protection a method for assessing whether streams are intermittent or perennial at a specific site in Massachusetts by estimating the probability of a stream flowing perennially at that site. This information could assist the environmental agencies that administer the Commonwealth of Massachusetts Rivers Protection Act of 1996, which establishes a 200-foot-wide protected riverfront area extending from the mean annual high-water line along each side of a perennial stream, with exceptions for some urban areas. The equation was developed by relating the observed intermittent or perennial status of a stream site to selected basin characteristics of naturally flowing streams (defined as having no regulation by dams, surface-water withdrawals, ground-water withdrawals, diversion, wastewater discharge, and so forth) in Massachusetts. This revised equation differs from the equation developed in a previous U.S. Geological Survey study in that it is solely based on visual observations of the intermittent or perennial status of stream sites across Massachusetts and on the evaluation of several additional basin and land-use characteristics as potential explanatory variables in the logistic regression analysis. The revised equation estimated the intermittent or perennial status of the observed stream sites more accurately than the equation from the previous study. Stream sites used in the analysis were identified as intermittent or perennial based on visual observation during low-flow periods from late July through early September 2001. The database of intermittent and perennial streams included a total of 351 naturally flowing (no regulation) sites, of which 85 were observed to be intermittent and 266 perennial.
NASA Astrophysics Data System (ADS)
Vozinaki, Anthi Eirini K.; Karatzas, George P.; Sibetheros, Ioannis A.; Varouchakis, Emmanouil A.
2014-05-01
Damage curves are the most significant component of flood loss estimation models, and their development is quite complex. Two types of damage curves exist: historical and synthetic. Historical curves are developed from loss data from actual flood events. However, due to the scarcity of historical data, synthetic damage curves can be developed instead; these rely on the analysis of expected damage under certain hypothetical flooding conditions. A synthetic approach is presented in this work for the development of damage curves, which are subsequently used as the basic input to a flood loss estimation model. A questionnaire-based survey took place among practicing and research agronomists, in order to generate rural loss data based on the responders' loss estimates for several flood condition scenarios. In addition, a similar questionnaire-based survey took place among building experts, i.e. civil engineers and architects, in order to generate loss data for the urban sector. By answering the questionnaire, the experts were in essence expressing their opinion on how damage to various crop types or building types is related to a range of values of flood inundation parameters, such as floodwater depth and velocity. However, the loss data compiled from the completed questionnaires were not sufficient for the construction of workable damage curves; to overcome this problem, a Weighted Monte Carlo method was implemented in order to generate extra synthetic datasets with statistical properties identical to those of the questionnaire-based data. The data generated by the Weighted Monte Carlo method were processed via logistic regression techniques in order to develop accurate logistic damage curves for the rural and urban sectors. A Python-based code was developed, which combines the Weighted Monte Carlo method and the logistic regression analysis into a single code (WMCLR Python code). Each WMCLR code execution…
Mao, Hui-Fen; Chang, Ling-Hui; Tsai, Athena Yi-Jung; Huang, Wen-Ni; Wang, Jye
2016-01-01
Because resources for long-term care services are limited, timely and appropriate referral for rehabilitation services is critical for optimizing clients’ functions and successfully integrating them into the community. We investigated which client characteristics are most relevant in predicting Taiwan’s community-based occupational therapy (OT) service referral based on experts’ beliefs. Data were collected in face-to-face interviews using the Multidimensional Assessment Instrument (MDAI). Community-dwelling participants (n = 221) ≥ 18 years old who reported disabilities in the previous National Survey of Long-term Care Needs in Taiwan were enrolled. The standard for referral was the judgment and agreement of two experienced occupational therapists who reviewed the results of the MDAI. Logistic regressions and Generalized Additive Models were used for analysis. Two predictive models were proposed, one using basic activities of daily living (BADLs) and one using instrumental ADLs (IADLs). Dementia, psychiatric disorders, cognitive impairment, joint range-of-motion limitations, fear of falling, behavioral or emotional problems, expressive deficits (in the BADL-based model), and limitations in IADLs or BADLs were significantly correlated with the need for referral. Both models showed high area under the curve (AUC) values on receiver operating curve testing (AUC = 0.977 and 0.972, respectively). The probability of being referred for community OT services was calculated using the referral algorithm. The referral protocol facilitated communication between healthcare professionals to make appropriate decisions for OT referrals. The methods and findings should be useful for developing referral protocols for other long-term care services. PMID:26863544
Alavi, Seyyed Salman; Mohammadi, Mohammad Reza; Souri, Hamid; Mohammadi Kalhori, Soroush; Jannatifard, Fereshteh; Sepahbodi, Ghazal
2017-01-01
Background: The aim of this study was to evaluate the effect of variables such as personality traits, driving behavior and mental illness on road traffic accidents among drivers with accidents and those without road crashes. Methods: In this cohort study, 800 bus and truck drivers were recruited. Participants were selected among drivers who were referred to Imam Sajjad Hospital (Tehran, Iran) during 2013-2015. The Manchester driving behavior questionnaire (MDBQ), big five personality test (NEO personality inventory) and a semi-structured interview (schizophrenia and affective disorders scale) were used. After two years, we surveyed all accidents due to human factors that involved the recruited drivers. The data were analyzed using the SPSS software by performing descriptive statistics, t-tests, and multiple logistic regression analysis. P values less than 0.05 were considered statistically significant. Results: After controlling for demographic and other relevant variables, the findings revealed significant differences between the two groups of drivers that were and were not involved in road accidents. In addition, it was found that depression and anxiety could increase the odds ratio (OR) of road accidents by 2.4- and 2.7-fold, respectively (P=0.04, P=0.004). Notably, neuroticism alone can increase the odds of road accidents by 1.1-fold (P=0.009), but other personality factors did not have a significant effect in the equation. Conclusion: The results revealed that some mental disorders affect the incidence of road collisions. Considering the importance and sensitivity of driving behavior, it is necessary to evaluate multiple psychological factors influencing drivers before and after they receive or renew their driver's license. PMID:28293047
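The reported odds ratios are exponentiated logistic regression coefficients, so an OR of 2.4 for depression corresponds to a coefficient of ln(2.4) on the log-odds scale; the intercept used below to recover a probability is hypothetical, not from the study:

```python
import math

def odds_ratio(beta):
    """A one-unit increase in the predictor multiplies the odds by exp(beta)."""
    return math.exp(beta)

beta_depression = math.log(2.4)   # coefficient implied by the reported OR of 2.4
print(round(odds_ratio(beta_depression), 1))  # 2.4

# Converting log-odds back to a probability at a given linear predictor:
z = -2.0 + beta_depression        # hypothetical intercept of -2.0 plus depression effect
p = 1.0 / (1.0 + math.exp(-z))
print(round(p, 3))                # predicted accident probability for a depressed driver
```

The multiplicative reading is what makes the abstract's "2.4-fold" phrasing precise: the odds, not the probability, of an accident are multiplied by 2.4.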
Mao, Hui-Fen; Chang, Ling-Hui; Tsai, Athena Yi-Jung; Huang, Wen-Ni; Wang, Jye
2016-01-01
Because resources for long-term care services are limited, timely and appropriate referral for rehabilitation services is critical for optimizing clients' functions and successfully integrating them into the community. We investigated which client characteristics are most relevant in predicting Taiwan's community-based occupational therapy (OT) service referral based on experts' beliefs. Data were collected in face-to-face interviews using the Multidimensional Assessment Instrument (MDAI). Community-dwelling participants (n = 221) ≥ 18 years old who reported disabilities in the previous National Survey of Long-term Care Needs in Taiwan were enrolled. The standard for referral was the judgment and agreement of two experienced occupational therapists who reviewed the results of the MDAI. Logistic regressions and Generalized Additive Models were used for analysis. Two predictive models were proposed, one using basic activities of daily living (BADLs) and one using instrumental ADLs (IADLs). Dementia, psychiatric disorders, cognitive impairment, joint range-of-motion limitations, fear of falling, behavioral or emotional problems, expressive deficits (in the BADL-based model), and limitations in IADLs or BADLs were significantly correlated with the need for referral. Both models showed high area under the curve (AUC) values on receiver operating characteristic (ROC) curve testing (AUC = 0.977 and 0.972, respectively). The probability of being referred for community OT services was calculated using the referral algorithm. The referral protocol facilitated communication between healthcare professionals to make appropriate decisions for OT referrals. The methods and findings should be useful for developing referral protocols for other long-term care services.
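AUC values like those reported above can be computed directly from predicted referral probabilities. The sketch below uses the rank-sum (Mann-Whitney) formulation of the AUC; the labels and scores are invented for illustration, not the MDAI data.

```python
import numpy as np

def auc_score(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as half-correct (the Mann-Whitney formulation)."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    wins = (pos[:, None] > neg[None, :]).sum() \
        + 0.5 * (pos[:, None] == neg[None, :]).sum()
    return wins / (len(pos) * len(neg))

# Invented example: referral decisions and model-predicted probabilities.
y = np.array([0, 0, 1, 1, 1, 0])
p_referral = np.array([0.1, 0.4, 0.35, 0.8, 0.9, 0.2])
auc = auc_score(y, p_referral)   # 8 of 9 pairs ranked correctly
```

An AUC near 1 means the model assigns nearly every referred client a higher probability than every non-referred client, which is what values of 0.97+ indicate.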
NASA Astrophysics Data System (ADS)
Trigila, Alessandro; Iadanza, Carla; Esposito, Carlo; Scarascia-Mugnozza, Gabriele
2015-11-01
The aim of this work is to define reliable susceptibility models for shallow landslides using Logistic Regression and Random Forests multivariate statistical techniques. The study area, located in North-East Sicily, was hit on October 1st 2009 by a severe rainstorm (225 mm of cumulative rainfall in 7 h) which caused flash floods and more than 1000 landslides. Several small villages, such as Giampilieri, were hit, with 31 fatalities, 6 missing persons and damage to buildings and transportation infrastructures. Landslides, mainly earth and debris translational slides evolving into debris flows, were triggered on steep slopes and involved colluvium and regolith materials which cover the underlying metamorphic bedrock. The work has been carried out with the following steps: i) realization of a detailed event landslide inventory map through field surveys coupled with observation of high-resolution aerial colour orthophotos; ii) identification of landslide source areas; iii) data preparation of landslide controlling factors and descriptive statistics based on a bivariate method (Frequency Ratio) to get an initial overview of existing relationships between causative factors and shallow landslide source areas; iv) choice of criteria for the selection and sizing of the mapping unit; v) implementation of 5 multivariate statistical susceptibility models based on Logistic Regression and Random Forests techniques and focused on landslide source areas; vi) evaluation of the influence of sample size and type of sampling on results and performance of the models; vii) evaluation of the predictive capabilities of the models using ROC curve, AUC and contingency tables; viii) comparison of model results and obtained susceptibility maps; and ix) analysis of temporal variation of landslide susceptibility related to input parameter changes. Models based on Logistic Regression and Random Forests have demonstrated excellent predictive capabilities. Land use and wildfire
NASA Technical Reports Server (NTRS)
Smith, Kelly M.; Gay, Robert S.; Stachowiak, Susan J.
2013-01-01
In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter to improve altitude knowledge. In order to increase overall robustness, the vehicle also has an alternate method of triggering the parachute deployment sequence based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this backup trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to semi-automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a statistical classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers improved performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.
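The two-stage idea described above, fit a classifier on the ground pre-flight, then run it as a fixed, cheap rule in real time, can be sketched as follows. All dynamics, thresholds, and noise levels below are invented placeholders, not EFT-1 data or flight software.

```python
import numpy as np

# Ground processing (pre-flight): fit a logistic classifier from simulated
# noisy velocity measurements against an assumed "correct deploy" truth label.
rng = np.random.default_rng(1)
n = 1000
velocity = rng.uniform(80.0, 220.0, n)         # true planet-relative velocity, m/s
deploy = (velocity < 150.0).astype(float)      # invented deploy condition
noisy_v = velocity + rng.normal(0.0, 5.0, n)   # sensor noise

mu, sigma = noisy_v.mean(), noisy_v.std()      # standardize for stable fitting
X = np.column_stack([np.ones(n), (noisy_v - mu) / sigma])
w = np.zeros(2)
for _ in range(3000):                          # gradient ascent on log-likelihood
    p = 1.0 / (1.0 + np.exp(-X @ w))
    w += 0.5 * X.T @ (deploy - p) / n

# Flight-software side (real time): evaluate the frozen classifier.
def trigger(v_measured):
    z = w[0] + w[1] * (v_measured - mu) / sigma
    return 1.0 / (1.0 + np.exp(-z)) > 0.5
```

Because the classifier is trained on the noisy measurements themselves, its decision boundary absorbs the measurement error statistics, which is the sense in which such a trigger stays robust to inaccurate inputs.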
NASA Astrophysics Data System (ADS)
Pradhan, Biswajeet
2010-05-01
This paper presents the results of the cross-validation of a multivariate logistic regression model using remote sensing data and GIS for landslide hazard analysis on the Penang, Cameron, and Selangor areas in Malaysia. Landslide locations in the study areas were identified by interpreting aerial photographs and satellite images, supported by field surveys. SPOT 5 and Landsat TM satellite imagery were used to map landcover and vegetation index, respectively. Maps of topography, soil type, lineaments and land cover were constructed from the spatial datasets. Ten factors which influence landslide occurrence, i.e., slope, aspect, curvature, distance from drainage, lithology, distance from lineaments, soil type, landcover, precipitation, and normalized difference vegetation index (NDVI), were extracted from the spatial database and the logistic regression coefficient of each factor was computed. Then the landslide hazard was analysed using the multivariate logistic regression coefficients derived not only from the data for the respective area but also using the logistic regression coefficients calculated from each of the other two areas (nine hazard maps in all) as a cross-validation of the model. For verification of the model, the results of the analyses were then compared with the field-verified landslide locations. Among the three cases of the application of logistic regression coefficients in the same study area, the case of Selangor based on the Selangor logistic regression coefficients showed the highest accuracy (94%), whereas Penang based on the Penang coefficients showed the lowest accuracy (86%). Similarly, among the six cases from the cross application of logistic regression coefficients in the other two areas, the case of Selangor based on the logistic coefficients of Cameron showed the highest (90%) prediction accuracy, whereas the case of Penang based on the Selangor logistic regression coefficients showed the lowest accuracy (79%). Qualitatively, the cross
Liu, Shujie; Kawamoto, Taisuke; Morita, Osamu; Yoshinari, Kouichi; Honda, Hiroshi
2017-03-01
Chemical exposure often results in liver hypertrophy in animal tests, characterized by increased liver weight, hepatocellular hypertrophy, and/or cell proliferation. While most of these changes are considered adaptive responses, there is concern that they may be associated with carcinogenesis. In this study, we have employed a toxicogenomic approach using a logistic ridge regression model to identify genes responsible for liver hypertrophy and hypertrophic hepatocarcinogenesis and to develop a predictive model for assessing hypertrophy-inducing compounds. Logistic regression models have previously been used in the quantification of epidemiological risk factors. DNA microarray data from the Toxicogenomics Project-Genomics Assisted Toxicity Evaluation System were used to identify hypertrophy-related genes that are expressed differently in hypertrophy induced by carcinogens and non-carcinogens. Data were collected for 134 chemicals (72 non-hypertrophy-inducing chemicals, 27 hypertrophy-inducing non-carcinogenic chemicals, and 15 hypertrophy-inducing carcinogenic compounds). After applying logistic ridge regression analysis, 35 genes for liver hypertrophy (e.g., Acot1 and Abcc3) and 13 genes for hypertrophic hepatocarcinogenesis (e.g., Asns and Gpx2) were selected. The predictive models built using these genes were 94.8% and 82.7% accurate, respectively. Pathway analysis of the genes indicates that, aside from a xenobiotic metabolism-related pathway as an adaptive response for liver hypertrophy, amino acid biosynthesis and oxidative responses appear to be involved in hypertrophic hepatocarcinogenesis. Early detection and toxicogenomic characterization of liver hypertrophy using our models may be useful for predicting carcinogenesis. In addition, the identified genes provide novel insight into discrimination between adverse hypertrophy associated with carcinogenesis and adaptive hypertrophy in risk assessment.
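The ridge (L2-penalized) logistic model at the core of this study can be sketched in a few lines. The chemical/gene counts, expression values, penalty strength, and fitting loop below are invented for illustration; the actual work used DNA microarray data and a dedicated selection pipeline.

```python
import numpy as np

rng = np.random.default_rng(2)
n, g = 400, 20                         # 400 compounds, 20 candidate genes (invented)
X = rng.normal(size=(n, g))            # stand-in for normalized expression values
beta_true = np.zeros(g)
beta_true[:2] = [1.5, -1.0]            # only two genes truly informative
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ beta_true)))

lam = 20.0                             # ridge (L2) penalty strength
w = np.zeros(g)
for _ in range(3000):
    p = 1.0 / (1.0 + np.exp(-X @ w))
    # Gradient ascent on the penalized log-likelihood: the -lam*w/n term
    # shrinks all coefficients, pulling uninformative ones toward zero.
    w += 0.5 * (X.T @ (y - p) / n - lam * w / n)
```

The shrinkage is what makes the approach usable for gene selection: after fitting, the informative genes retain clearly non-zero coefficients while the rest cluster near zero.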
Perez, Ivan; Chavez, Allison K.; Ponce, Dario
2016-01-01
Background: Ricketts' posteroanterior (PA) cephalometry appears to be the most widely used, yet it has not been tested by multivariate statistics for sex determination. Objective: The objective was to determine the applicability of Ricketts' PA cephalometry for sex determination using logistic regression analysis. Materials and Methods: The logistic models were estimated at distinct age cutoffs (all ages, 11 years, 13 years, and 15 years) in a database of 1,296 Hispano American Peruvians between 5 years and 44 years of age. Results: The logistic models were composed of six cephalometric measurements; the accuracy achieved by resubstitution varied between 60% and 70%, and all the variables, with one exception, exhibited a direct relationship with the probability of being classified as male; the nasal width exhibited an indirect relationship. Conclusion: The maxillary and facial widths were present in all models and may represent a sexual dimorphism indicator. The accuracy found was lower than that reported in the literature, and Ricketts' PA cephalometry may not be adequate for sex determination. The indirect relationship of the nasal width in models with data from patients 12 years of age or less may be a trait related to age or a characteristic of the studied population, which should be further studied and confirmed. PMID:27555732
Battaglin, William A.; Ulery, Randy L.; Winterstein, Thomas; Welborn, Toby
2003-01-01
In the State of Texas, surface water (streams, canals, and reservoirs) and ground water are used as sources of public water supply. Surface-water sources of public water supply are susceptible to contamination from point and nonpoint sources. To help protect sources of drinking water and to aid water managers in designing protective yet cost-effective and risk-mitigated monitoring strategies, the Texas Commission on Environmental Quality and the U.S. Geological Survey developed procedures to assess the susceptibility of public water-supply source waters in Texas to the occurrence of 227 contaminants. One component of the assessments is the determination of susceptibility of surface-water sources to nonpoint-source contamination. To accomplish this, water-quality data at 323 monitoring sites were matched with geographic information system-derived watershed- characteristic data for the watersheds upstream from the sites. Logistic regression models then were developed to estimate the probability that a particular contaminant will exceed a threshold concentration specified by the Texas Commission on Environmental Quality. Logistic regression models were developed for 63 of the 227 contaminants. Of the remaining contaminants, 106 were not modeled because monitoring data were available at less than 10 percent of the monitoring sites; 29 were not modeled because there were less than 15 percent detections of the contaminant in the monitoring data; 27 were not modeled because of the lack of any monitoring data; and 2 were not modeled because threshold values were not specified.
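Each fitted model of the kind described above yields, for a given watershed, the probability that a contaminant exceeds its threshold concentration. A minimal sketch of that prediction step follows; the coefficients and watershed characteristics are invented placeholders, not values from the Texas assessment.

```python
import math

# Invented coefficients for one contaminant's logistic model:
# intercept, effect of drainage area (per km^2), effect of urban land fraction.
b0, b_area, b_urban = -3.0, 0.002, 4.0

def p_exceed(drainage_km2, frac_urban):
    """Probability that the contaminant exceeds its threshold concentration."""
    z = b0 + b_area * drainage_km2 + b_urban * frac_urban
    return 1.0 / (1.0 + math.exp(-z))

# A hypothetical 500 km^2 watershed that is 30% urbanized.
p = p_exceed(500.0, 0.30)
```

Water managers can rank monitoring sites by such probabilities, concentrating sampling effort where the modeled exceedance risk is highest.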
Wang, Shuang; Zhang, Yuchen; Dai, Wenrui; Lauter, Kristin; Kim, Miran; Tang, Yuzhe; Xiong, Hongkai; Jiang, Xiaoqian
2016-01-01
Motivation: Genome-wide association studies (GWAS) have been widely used in discovering the association between genotypes and phenotypes. Human genome data contain valuable but highly sensitive information. Unprotected disclosure of such information might put an individual's privacy at risk. It is important to protect human genome data. Exact logistic regression is a bias-reduction method based on a penalized likelihood to discover rare variants that are associated with disease susceptibility. We propose the HEALER framework to facilitate secure rare variant analysis with a small sample size. Results: We focus on algorithm design, aiming to reduce the computational and storage costs of learning a homomorphic exact logistic regression model (i.e. evaluating P-values of coefficients), where the circuit depth is proportional to the logarithmic scale of the data size. We evaluate the algorithm performance using rare Kawasaki Disease datasets. Availability and implementation: Download HEALER at http://research.ucsd-dbmi.org/HEALER/ Contact: shw070@ucsd.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26446135
Fenske, Nora; Burns, Jacob; Hothorn, Torsten; Rehfuess, Eva A.
2013-01-01
Background Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. Objective We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. Design Using cross-sectional data for children aged 0–24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. Results At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. Conclusions Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role. PMID:24223839
Li, Li; Brumback, Babette A; Weppelmann, Thomas A; Morris, J Glenn; Ali, Afsar
2016-08-15
Motivated by an investigation of the effect of surface water temperature on the presence of Vibrio cholerae in water samples collected from different fixed surface water monitoring sites in Haiti in different months, we investigated methods to adjust for unmeasured confounding due to either of the two crossed factors site and month. In the process, we extended previous methods that adjust for unmeasured confounding due to one nesting factor (such as site, which nests the water samples from different months) to the case of two crossed factors. First, we developed a conditional pseudolikelihood estimator that eliminates fixed effects for the levels of each of the crossed factors from the estimating equation. Using the theory of U-Statistics for independent but non-identically distributed vectors, we show that our estimator is consistent and asymptotically normal, but that its variance depends on the nuisance parameters and thus cannot be easily estimated. Consequently, we apply our estimator in conjunction with a permutation test, and we investigate use of the pigeonhole bootstrap and the jackknife for constructing confidence intervals. We also incorporate our estimator into a diagnostic test for a logistic mixed model with crossed random effects and no unmeasured confounding. For comparison, we investigate between-within models extended to two crossed factors. These generalized linear mixed models include covariate means for each level of each factor in order to adjust for the unmeasured confounding. We conduct simulation studies, and we apply the methods to the Haitian data. Copyright © 2016 John Wiley & Sons, Ltd.
2009-01-01
Background Laser-Doppler imaging (LDI) of cutaneous blood flow is beginning to be used by burn surgeons to predict the healing time of burn wounds; predicted healing time is used to determine wound treatment as either dressings or surgery. In this paper, we present a statistical analysis of the performance of the technique. Methods We used data from a study carried out by five burn centers: LDI was done once between days 2 to 5 post burn, and healing was assessed at both 14 days and 21 days post burn. Random-effects ordinal logistic regression and other models such as the continuation ratio model were used to model healing time as a function of the LDI data, and of demographic and wound history variables. Statistical methods were also used to study the false-color palette, which enables the laser-Doppler imager to be used by clinicians as a decision-support tool. Results Overall, diagnoses were over 90% correct. Related questions addressed were what was the best blood flow summary statistic and whether, given the blood flow measurements, demographic and observational variables had any additional predictive power (age, sex, race, % total body surface area burned (%TBSA), site and cause of burn, day of LDI scan, burn center). It was found that mean laser-Doppler flux over a wound area was the best statistic, and that, given the same mean flux, women recover slightly more slowly than men. Further, the likely degradation in predictive performance on moving to a patient group with larger %TBSA than those in the data sample was studied, and shown to be small. Conclusion Modeling healing time is a complex statistical problem, with random effects due to multiple burn areas per individual, and censoring caused by patients missing hospital visits and undergoing surgery. This analysis applies state-of-the-art statistical methods such as the bootstrap and permutation tests to a medical problem of topical interest. New medical findings are that age and %TBSA are
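Ordinal logistic models of this kind are commonly built on the cumulative-logit (proportional-odds) form, P(Y ≤ k) = logistic(θ_k − β·x), where the θ_k are ordered cut-points between healing-time categories. The sketch below shows how cut-points and a single flux coefficient yield category probabilities; all numbers are invented and the random effects of the actual study are omitted.

```python
import math

thetas = [-1.0, 0.5, 2.0]   # invented cut-points for 4 ordered healing categories
beta = 0.8                   # invented effect of mean Doppler flux

def category_probs(flux):
    """Probabilities of each ordered category under a proportional-odds model."""
    # Cumulative probabilities P(Y <= k); the last category closes at 1.
    cum = [1.0 / (1.0 + math.exp(-(t - beta * flux))) for t in thetas] + [1.0]
    # Differencing adjacent cumulative probabilities gives per-category mass.
    return [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, 4)]

probs = category_probs(1.0)
```

Because the cut-points are ordered, the differenced probabilities are guaranteed non-negative and sum to one, which is what makes the cumulative-logit parameterization convenient for ordered outcomes.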
Thomas, Laine; Stefanski, Leonard A.; Davidian, Marie
2013-01-01
In clinical studies, covariates are often measured with error due to biological fluctuations, device error and other sources. Summary statistics and regression models that are based on mismeasured data will differ from the corresponding analysis based on the "true" covariate. Statistical analysis can be adjusted for measurement error; however, the various methods exhibit a tradeoff between convenience and performance. Moment Adjusted Imputation (MAI) is a method for handling measurement error in a scalar latent variable that is easy to implement and performs well in a variety of settings. In practice, multiple covariates may be similarly influenced by biological fluctuations, inducing correlated multivariate measurement error. The extension of MAI to the setting of multivariate latent variables involves unique challenges. Alternative strategies are described, including a computationally feasible option that is shown to perform well. PMID:24072947
Predictors of psychiatric disorders in liver transplantation candidates: logistic regression models.
Rocca, Paola; Cocuzza, Elena; Rasetti, Roberta; Rocca, Giuseppe; Zanalda, Enrico; Bogetto, Filippo
2003-07-01
This study has two goals. The first goal is to assess the prevalence of psychiatric disorders in orthotopic liver transplantation (OLT) candidates by means of standardized procedures because there has been little research concerning psychiatric problems of potential OLT candidates using standardized instruments. The second goal focuses on identifying predictors of these psychiatric disorders. One hundred sixty-five elective OLT candidates were assessed by our unit. Psychiatric diagnoses were based on the Structured Clinical Interview for the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition. Patients also were assessed using the Hamilton Depression Rating Scale (HDRS) and the Spielberger Anxiety Index, State and Trait forms (STAI-X1 and STAI-X2). Severity of cirrhosis was assessed by applying Child-Pugh score criteria. Chi-squared and general linear model analysis of variance were used to test the univariate association between patient characteristics and both clinical psychiatric diagnoses and severity of psychiatric diseases. Variables with P less than.10 in univariate analyses were included in multiple regression models. Forty-three percent of patients presented at least one psychiatric diagnosis. Child-Pugh score and previous psychiatric diagnoses were independent significant predictors of depressive disorders. Severity of psychiatric symptoms measured by psychometric scales (HDRS, STAI-X1, and STAI-X2) was associated with Child-Pugh score in the multiple regression model. Our data suggest a high rate of psychiatric disorders, particularly adjustment disorders, in our sample of OLT candidates. Severity of liver disease emerges as the most important variable in predicting severity of psychiatric disorders in these patients.
Almquist, Zack W; Butts, Carter T
2013-10-01
Methods for analysis of network dynamics have seen great progress in the past decade. This article shows how Dynamic Network Logistic Regression techniques (a special case of the Temporal Exponential Random Graph Models) can be used to implement decision theoretic models for network dynamics in a panel data context. We also provide practical heuristics for model building and assessment. We illustrate the power of these techniques by applying them to a dynamic blog network sampled during the 2004 US presidential election cycle. This is a particularly interesting case because it marks the debut of Internet-based media such as blogs and social networking web sites as institutionally recognized features of the American political landscape. Using a longitudinal sample of all Democratic National Convention/Republican National Convention-designated blog citation networks, we are able to test the influence of various strategic, institutional, and balance-theoretic mechanisms as well as exogenous factors such as seasonality and political events on the propensity of blogs to cite one another over time. Using a combination of deviance-based model selection criteria and simulation-based model adequacy tests, we identify the combination of processes that best characterizes the choice behavior of the contending blogs.
Jostins, Luke; McVean, Gilean
2016-01-01
Motivation: For many classes of disease the same genetic risk variants underlie many related phenotypes or disease subtypes. Multinomial logistic regression provides an attractive framework to analyze multi-category phenotypes, and explore the genetic relationships between these phenotype categories. We introduce Trinculo, a program that implements a wide range of multinomial analyses in a single fast package that is designed to be easy to use by users of standard genome-wide association study software. Availability and implementation: An open source C implementation, with code and binaries for Linux and Mac OSX, is available for download at http://sourceforge.net/projects/trinculo Supplementary information: Supplementary data are available at Bioinformatics online. Contact: lj4@well.ox.ac.uk PMID:26873930
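A multinomial logistic model assigns each non-baseline phenotype category its own coefficient vector and converts the resulting linear scores into category probabilities with a softmax. The sketch below uses invented coefficients and covariates; it illustrates the model form, not Trinculo's implementation.

```python
import numpy as np

def softmax(z):
    """Convert linear scores to probabilities; shifting by max is for stability."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# One coefficient row per phenotype category; the baseline row is fixed at 0,
# so the other rows are log-odds relative to the baseline category.
B = np.array([[0.0, 0.0],
              [0.4, -0.2],
              [-0.1, 0.6]])
x = np.array([1.0, 2.0])   # invented covariate vector (e.g., genotype dosages)
probs = softmax(B @ x)     # probability of each of the 3 categories
```

Fixing the baseline row at zero is the standard identifiability constraint: with K categories only K−1 coefficient vectors are free, and each one is interpretable as log-odds against the baseline.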
Gysemans, K P M; Bernaerts, K; Vermeulen, A; Geeraerd, A H; Debevere, J; Devlieghere, F; Van Impe, J F
2007-03-20
Several model types have already been developed to describe the boundary between growth and no growth conditions. In this article two types were thoroughly studied and compared, namely (i) the ordinary (linear) logistic regression model, i.e., with a polynomial on the right-hand side of the model equation (type I) and (ii) the (nonlinear) logistic regression model derived from a square root-type kinetic model (type II). The examination was carried out on the basis of the data described in Vermeulen et al. [Vermeulen, A., Gysemans, K.P.M., Bernaerts, K., Geeraerd, A.H., Van Impe, J.F., Debevere, J., Devlieghere, F., 2006-this issue. Influence of pH, water activity and acetic acid concentration on Listeria monocytogenes at 7 degrees C: data collection for the development of a growth/no growth model. International Journal of Food Microbiology.] These data sets consist of growth/no growth data for Listeria monocytogenes as a function of water activity (0.960-0.990), pH (5.0-6.0) and acetic acid percentage (0-0.8% (w/w)), both for a monoculture and a mixed strain culture. Numerous replicates, namely twenty, were performed at closely spaced conditions. In this way detailed information was obtained about the position of the interface and the transition zone between growth and no growth. The main questions investigated were (i) which model type performs best on the monoculture and the mixed strain data, (ii) are there differences between the growth/no growth interfaces of monocultures and mixed strain cultures, (iii) which parameter estimation approach works best for the type II models, and (iv) how sensitive is the performance of these models to the values of their nonlinear-appearing parameters. The results showed that both type I and II models performed well on the monoculture data with respect to goodness-of-fit and predictive power. The type I models were, however, more sensitive to anomalous data points. The situation was different for the mixed strain culture. In
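A type I model is an ordinary logistic regression with a polynomial in the environmental factors as the linear predictor. A first-order sketch over two of the paper's factor ranges (pH 5.0-6.0, water activity 0.960-0.990; acetic acid omitted for brevity) follows; the coefficients are invented, not the fitted values.

```python
import math

def p_growth(pH, aw, b=(-620.0, 8.0, 600.0)):
    """Invented first-order type I model: P(growth) = logistic(b0 + b1*pH + b2*aw)."""
    z = b[0] + b[1] * pH + b[2] * aw
    return 1.0 / (1.0 + math.exp(-z))

# Favorable corner of the design space vs. a stressful one.
p_hi = p_growth(6.0, 0.99)   # high pH, high water activity: growth very likely
p_lo = p_growth(5.0, 0.96)   # low pH, low water activity: growth unlikely
```

The growth/no-growth interface is then the contour where the linear predictor is zero (P(growth) = 0.5); higher-order polynomial terms let that contour curve to follow the observed transition zone.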
Raevsky, O A; Polianczyk, D E; Mukhametov, A; Grigorev, V Y
2016-08-01
Assessment of the "CNS drugs/CNS candidates" classification abilities of the multi-parametric optimization (CNS MPO) approach was performed by logistic regression. It was found that five of the six individually used physicochemical properties (topological polar surface area, number of hydrogen-bond donor atoms, basicity, and lipophilicity of the compound in neutral form and at pH = 7.4) provided recognition accuracy below 60%. Only the molecular weight (MW) descriptor could correctly classify two-thirds of the studied compounds. Aggregating all six properties into the MPO score did not improve the classification, which was worse than the classification using MW alone. The results of our study demonstrate the imperfection of the CNS MPO approach; in its current form it is not very useful for the computer design of new, effective CNS drugs.
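A single-descriptor logistic classifier of the kind evaluated above (probability of being a CNS drug from MW alone) can be sketched as follows. The coefficients and the (MW, label) pairs are made-up placeholders for illustration, not values fitted in the study.

```python
import math

def predict_cns(mw, b0=4.0, b1=-0.01):
    """P(CNS drug) from molecular weight alone; b0 and b1 are illustrative
    placeholders, not the coefficients estimated in the study."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * mw)))

def accuracy(data, threshold=0.5):
    """Fraction of compounds whose predicted class matches the label."""
    hits = sum(1 for mw, label in data
               if (predict_cns(mw) >= threshold) == bool(label))
    return hits / len(data)

# (MW, is_CNS_drug) pairs -- synthetic examples for illustration only.
compounds = [(250, 1), (310, 1), (380, 1), (450, 0), (520, 0), (610, 0)]
```

Recognition accuracy as reported in the abstract is exactly this fraction of correctly classified compounds at a fixed probability threshold.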
Calef, M.P.; McGuire, A.D.; Epstein, H.E.; Rupp, T.S.; Shugart, H.H.
2005-01-01
Aim: To understand drivers of vegetation type distribution and sensitivity to climate change. Location: Interior Alaska. Methods: A logistic regression model was developed that predicts the potential equilibrium distribution of four major vegetation types: tundra, deciduous forest, black spruce forest and white spruce forest based on elevation, aspect, slope, drainage type, fire interval, average growing season temperature and total growing season precipitation. The model was run in three consecutive steps. The hierarchical logistic regression model was used to evaluate how scenarios of changes in temperature, precipitation and fire interval may influence the distribution of the four major vegetation types found in this region. Results: At the first step, tundra was distinguished from forest, which was mostly driven by elevation, precipitation and south to north aspect. At the second step, forest was separated into deciduous and spruce forest, a distinction that was primarily driven by fire interval and elevation. At the third step, the identification of black vs. white spruce was driven mainly by fire interval and elevation. The model was verified for Interior Alaska, the region used to develop the model, where it predicted vegetation distribution among the steps with an accuracy of 60-83%. When the model was independently validated for north-west Canada, it predicted vegetation distribution among the steps with an accuracy of 53-85%. Black spruce remains the dominant vegetation type under all scenarios, potentially expanding most under warming coupled with increasing fire interval. White spruce is clearly limited by moisture once average growing season temperatures exceed a critical limit (+2 °C). Deciduous forests expand their range the most when any two of the following scenarios are combined: decreasing fire interval, warming and increasing precipitation. Tundra can be replaced by forest under warming but expands under precipitation increase. Main
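The three-step hierarchy described above can be sketched as a chain of binary logistic models, each deciding one split (tundra vs. forest, deciduous vs. spruce, black vs. white spruce). All coefficients below are hypothetical placeholders, not the fitted values from the study.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients for each binary split (placeholders only).
STEP1 = {"b0": -2.0, "elev": 0.004, "precip": -0.01}   # tundra vs forest
STEP2 = {"b0": 1.0, "fire": -0.005, "elev": -0.001}    # deciduous vs spruce
STEP3 = {"b0": -0.5, "fire": 0.003, "elev": -0.002}    # black vs white spruce

def classify(elev, precip, fire):
    """Walk the hierarchy: each step is an independent binary logistic model."""
    p_tundra = sigmoid(STEP1["b0"] + STEP1["elev"] * elev + STEP1["precip"] * precip)
    if p_tundra > 0.5:
        return "tundra"
    p_decid = sigmoid(STEP2["b0"] + STEP2["fire"] * fire + STEP2["elev"] * elev)
    if p_decid > 0.5:
        return "deciduous forest"
    p_black = sigmoid(STEP3["b0"] + STEP3["fire"] * fire + STEP3["elev"] * elev)
    return "black spruce" if p_black > 0.5 else "white spruce"
```

The hierarchical structure means each pixel is routed through at most three binary decisions rather than one four-way model.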
NASA Astrophysics Data System (ADS)
Espino, Natalia V.
Foreign Object Debris/Damage (FOD) is a costly and high-risk problem that aeronautics companies such as Boeing and Lockheed Martin face on their production lines every day. They spend an average of $350,000 per year fixing FOD problems, and FOD can put the lives of pilots, passengers and crews at high risk. FOD refers to any type of foreign object, particle, debris or agent in the manufacturing environment which could contaminate or damage the product or otherwise undermine quality control standards. FOD can take the form of any of the following categories: panstock, manufacturing debris, tools/shop aids, consumables and trash. Although aeronautics companies have put many prevention plans in place, such as housekeeping and "clean as you go" philosophies, training, and the use of RFID for tooling control, none of them has been able to completely eradicate the problem. This research presents a logistic regression statistical model to predict the probability of each FOD type under specific circumstances such as workstation, month and aircraft/jet being built. FOD Quality Assurance Reports for the last three years were provided by an aeronautics manufacturer for this study. By predicting the type of FOD, custom reduction/elimination plans can be put in place to diminish the problem. Different aircraft were analyzed, and separate models were developed using the same methodology. The results presented are predictions of FOD type for each aircraft and workstation throughout the year, obtained by applying the proposed logistic regression models. This research can help the aeronautics industry address the FOD problem correctly, identify root causes and establish effective reduction/elimination plans.
NASA Astrophysics Data System (ADS)
Madhu, B.; Ashok, N. C.; Balasubramanian, S.
2014-11-01
Multinomial logistic regression analysis was used to develop a statistical model that can predict the probability of breast cancer in Southern Karnataka using breast cancer occurrence data from 2007-2011. Independent socio-economic variables describing breast cancer occurrence, such as age, education, occupation, parity, type of family, health insurance coverage, residential locality and socioeconomic status, were obtained for each case. The models were developed as follows: i) spatial visualization of the urban-rural distribution of breast cancer cases obtained from the Bharat Hospital and Institute of Oncology; ii) socio-economic risk factors describing the breast cancer occurrences were compiled for each case. These data were then analysed using multinomial logistic regression analysis in the SPSS statistical software; relations between the occurrence of breast cancer and socioeconomic status, and the influence of other socio-economic variables, were evaluated, and multinomial logistic regression models were constructed; iii) the model that best predicted the occurrence of breast cancer was identified. This multivariate logistic regression model was entered into a geographic information system, and maps showing the predicted probability of breast cancer occurrence in Southern Karnataka were created. This study demonstrates that multinomial logistic regression is a valuable tool for developing models that predict the probability of breast cancer occurrence in Southern Karnataka.
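A multinomial logistic model of this kind assigns each case a probability for every outcome category via the softmax transform of its linear predictors. A minimal sketch (the scores below are made-up placeholders, not the study's fitted values):

```python
import math

def softmax_probs(scores):
    """Convert multinomial-logit linear scores (one per outcome category)
    into class probabilities that sum to 1."""
    m = max(scores)                          # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical linear predictors for one case across three outcome categories.
case_scores = [2.0, 0.5, -1.0]
case_probs = softmax_probs(case_scores)
```

Mapping these per-location probabilities onto a grid is what produces the predicted-probability surfaces loaded into the GIS.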
Veazey, Lindsay M.; Franklin, Erik C.; Kelley, Christopher; Rooney, John; Frazer, L. Neil; Toonen, Robert J.
2016-01-01
Predictive habitat suitability models are powerful tools for cost-effective, statistically robust assessment of the environmental drivers of species distributions. The aim of this study was to develop predictive habitat suitability models for two genera of scleractinian corals (Leptoseris and Montipora) found within the mesophotic zone across the main Hawaiian Islands. The mesophotic zone (30–180 m) is challenging to reach, and therefore historically understudied, because it falls between the maximum limit of SCUBA divers and the minimum typical working depth of submersible vehicles. Here, we implement a logistic regression with rare events corrections to account for the scarcity of presence observations within the dataset. These corrections reduced the coefficient error and improved overall prediction success (73.6% and 74.3%) for both original regression models. The final models included depth, rugosity, slope, mean current velocity, and wave height as the best environmental covariates for predicting the occurrence of the two genera in the mesophotic zone. Using an objectively selected theta (“presence”) threshold, the predicted presence probability values (average of 0.051 for Leptoseris and 0.040 for Montipora) were translated to spatially-explicit habitat suitability maps of the main Hawaiian Islands at 25 m grid cell resolution. Our maps are the first of their kind to use extant presence and absence data to examine the habitat preferences of these two dominant mesophotic coral genera across Hawai‘i. PMID:27441122
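A widely used rare-events correction for logistic regression is King and Zeng's prior correction of the intercept, which adjusts the fitted intercept when events were sampled at a rate different from their true population rate; whether the study used exactly this variant is an assumption. A minimal sketch:

```python
import math

def prior_corrected_intercept(b0_hat, sample_rate, pop_rate):
    """King & Zeng (2001) prior correction: adjust the fitted intercept
    when events were over- or under-sampled relative to the population
    event rate (pop_rate). sample_rate is the event fraction in the data."""
    return b0_hat - math.log(((1 - pop_rate) / pop_rate) *
                             (sample_rate / (1 - sample_rate)))
```

When the sample rate equals the population rate the correction vanishes; when events were over-sampled the intercept is shifted downward, lowering all predicted presence probabilities.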
Narumalani, S.; Jensen, J.R.; Althausen, J.D.; Burkhalter, S.; Mackey, H.E. Jr.
1994-06-01
Since aquatic macrophytes have an important influence on the physical and chemical processes of an ecosystem while simultaneously affecting human activity, it is imperative that they be inventoried and managed wisely. However, mapping wetlands can be a major challenge because they are found in diverse geographic areas ranging from small tributary streams, to shrub or scrub and marsh communities, to open water lacustrine environments. In addition, the type and spatial distribution of wetlands can change dramatically from season to season, especially when nonpersistent species are present. This research focuses on developing a model for predicting the future growth and distribution of aquatic macrophytes. The model uses a geographic information system (GIS) to analyze some of the biophysical variables that affect aquatic macrophyte growth and distribution, and the resulting data will provide scientists with information on the future spatial growth and distribution of aquatic macrophytes. The study focuses on the Savannah River Site's Par Pond (1,000 ha) and L Lake (400 ha), two cooling ponds that have received thermal effluent from nuclear reactor operations. Par Pond was constructed in 1958, and natural invasion of wetland vegetation has occurred over its 35-year history, with much of the shoreline having developed extensive beds of persistent and non-persistent aquatic macrophytes.
Wyss, Richard; Ellis, Alan R; Brookhart, M Alan; Girman, Cynthia J; Jonsson Funk, Michele; LoCasale, Robert; Stürmer, Til
2014-09-15
The covariate-balancing propensity score (CBPS) extends logistic regression to simultaneously optimize covariate balance and treatment prediction. Although the CBPS has been shown to perform well in certain settings, its performance has not been evaluated in settings specific to pharmacoepidemiology and large database research. In this study, we use both simulations and empirical data to compare the performance of the CBPS with logistic regression and boosted classification and regression trees. We simulated various degrees of model misspecification to evaluate the robustness of each propensity score (PS) estimation method. We then applied these methods to compare the effect of initiating glucagonlike peptide-1 agonists versus sulfonylureas on cardiovascular events and all-cause mortality in the US Medicare population in 2007-2009. In simulations, the CBPS was generally more robust in terms of balancing covariates and reducing bias compared with misspecified logistic PS models and boosted classification and regression trees. All PS estimation methods performed similarly in the empirical example. For settings common to pharmacoepidemiology, logistic regression with balance checks to assess model specification is a valid method for PS estimation, but it can require refitting multiple models until covariate balance is achieved. The CBPS is a promising method to improve the robustness of PS models.
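The "balance checks" referred to above are typically weighted standardized mean differences: after weighting by the estimated propensity scores, each covariate's mean difference between treatment groups should be near zero. A minimal sketch (the data in the test are illustrative, not from the Medicare analysis):

```python
import math

def weighted_mean(xs, ws):
    return sum(x * w for x, w in zip(xs, ws)) / sum(ws)

def weighted_var(xs, ws):
    m = weighted_mean(xs, ws)
    return sum(w * (x - m) ** 2 for x, w in zip(xs, ws)) / sum(ws)

def standardized_mean_difference(x_treated, w_treated, x_control, w_control):
    """Weighted standardized mean difference for one covariate; values near
    zero indicate the propensity-score weights balanced that covariate."""
    m1 = weighted_mean(x_treated, w_treated)
    m0 = weighted_mean(x_control, w_control)
    pooled_sd = math.sqrt((weighted_var(x_treated, w_treated) +
                           weighted_var(x_control, w_control)) / 2.0)
    return (m1 - m0) / pooled_sd
```

Refitting the PS model until all such differences fall below a tolerance (often 0.1) is the iterative step the abstract notes can be required with plain logistic regression.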
A multiple additive regression tree analysis of three exposure measures during Hurricane Katrina.
Curtis, Andrew; Li, Bin; Marx, Brian D; Mills, Jacqueline W; Pine, John
2011-01-01
This paper analyses structural and personal exposure to Hurricane Katrina. Structural exposure is measured by flood height and building damage; personal exposure is measured by the locations of 911 calls made during the response. Using these variables, this paper characterises the geography of exposure and also demonstrates the utility of a robust analytical approach in understanding health-related challenges to disadvantaged populations during recovery. Analysis is conducted using a contemporary statistical approach, a multiple additive regression tree (MART), which displays considerable improvement over traditional regression analysis. By using MART, the percentage of improvement in R-squares over standard multiple linear regression ranges from about 62 to more than 100 per cent. The most revealing finding is the modelled verification that African Americans experienced disproportionate exposure in both structural and personal contexts. Given the impact of exposure to health outcomes, this finding has implications for understanding the long-term health challenges facing this population.
Li, Y.; Graubard, B. I.; Huang, P.; Gastwirth, J. L.
2015-01-01
Determining the extent of a disparity, if any, between groups of people, for example, race or gender, is of interest in many fields, including public health for medical treatment and prevention of disease. An observed difference in the mean outcome between an advantaged group (AG) and disadvantaged group (DG) can be due to differences in the distribution of relevant covariates. The Peters–Belson (PB) method fits a regression model with covariates to the AG to predict, for each DG member, their outcome measure as if they had been from the AG. The difference between the mean predicted and the mean observed outcomes of DG members is the (unexplained) disparity of interest. We focus on applying the PB method to estimate the disparity based on binary/multinomial/proportional odds logistic regression models using data collected from complex surveys with more than one DG. Estimators of the unexplained disparity, an analytic variance–covariance estimator that is based on the Taylor linearization variance–covariance estimation method, as well as a Wald test for testing a joint null hypothesis of zero for unexplained disparities between two or more minority groups and a majority group, are provided. Simulation studies with data selected from simple random sampling and cluster sampling, as well as the analyses of disparity in body mass index in the National Health and Nutrition Examination Survey 1999–2004, are conducted. Empirical results indicate that the Taylor linearization variance–covariance estimation is accurate and that the proposed Wald test maintains the nominal level. PMID:25382235
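The Peters–Belson predict-then-difference step can be sketched with a simple one-covariate linear outcome model; the article itself uses binary/multinomial/proportional-odds logistic models with survey weights, so this OLS version only illustrates the core idea.

```python
def ols_fit(xs, ys):
    """One-covariate OLS fit, returning (slope, intercept) in closed form."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def peters_belson_disparity(ag_x, ag_y, dg_x, dg_y):
    """Fit the outcome model on the advantaged group (AG), predict each
    disadvantaged-group (DG) member's outcome as if they were from the AG,
    and return the unexplained disparity: mean predicted minus mean observed
    for the DG."""
    slope, intercept = ols_fit(ag_x, ag_y)
    preds = [intercept + slope * x for x in dg_x]
    return sum(preds) / len(preds) - sum(dg_y) / len(dg_y)
```

A positive value means DG members' observed outcomes fall short of what the AG model predicts for people with their covariates, i.e. a disparity not explained by the covariates.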
NASA Astrophysics Data System (ADS)
Xiao, Rui; Su, Shiliang; Mai, Gengchen; Zhang, Zhonghao; Yang, Chenxue
2015-02-01
Cash crop expansion has been a major land use change in tropical and subtropical regions worldwide. Quantifying the determinants of cash crop expansion should provide deeper spatial insights into the dynamics and ecological consequences of cash crop expansion. This paper investigated the process of cash crop expansion in Hangzhou region (China) from 1985 to 2009 using remotely sensed data. The corresponding determinants (neighborhood, physical, and proximity) and their relative effects during three periods (1985-1994, 1994-2003, and 2003-2009) were quantified by logistic regression modeling and variance partitioning. Results showed that the total area of cash crops increased from 58,874.1 ha in 1985 to 90,375.1 ha in 2009, a net growth of 53.5%. Cash crops were more likely to grow in loam soils, while steep areas with higher elevation experienced a lower likelihood of cash crop expansion. A consistently higher probability of cash crop expansion was found in places with abundant farmland and forest cover in all three periods. In addition, distance to rivers and lakes, distance to county centers, and distance to provincial roads were decisive determinants of farmers' choices of where to plant cash crops. Different categories of determinants and their combinations exerted different influences on cash crop expansion. The joint effects of neighborhood and proximity determinants were the strongest, and the unique effect of physical determinants decreased with time. Our study contributes to the understanding of the proximate drivers of cash crop expansion in subtropical regions.
NASA Astrophysics Data System (ADS)
Mousavi, S. Mostafa; Horton, Stephen P.; Langston, Charles A.; Samei, Borhan
2016-10-01
We develop an automated strategy for discriminating deep microseismic events from shallow ones on the basis of the waveforms recorded on a limited number of surface receivers. Machine-learning techniques are employed to explore the relationship between event hypocentres and seismic features of the recorded signals in time, frequency and time-frequency domains. We applied the technique to 440 microearthquakes (-1.7 < Mw < 1.29) induced by an underground cavern collapse in the Napoleonville Salt Dome in Bayou Corne, Louisiana. Forty different seismic attributes of whole seismograms, including degree of polarization and spectral attributes, were measured. A selected set of features was then used to train the system to discriminate between deep and shallow events based on the knowledge gained from existing patterns. The cross-validation test showed that events with depths shallower than 250 m can be discriminated from events with hypocentral depths between 1000 and 2000 m with 88 per cent and 90.7 per cent accuracy using logistic regression and artificial neural network models, respectively. Similar results were obtained using single-station seismograms. The results show that the spectral features have the highest correlation with source depth. Spectral centroids and 2-D cross-correlations in the time-frequency domain are two new seismic features used in this study that proved to be promising measures for seismic event classification. These machine-learning techniques can be applied to the efficient automatic classification of low-energy signals recorded at one or more seismic stations.
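The spectral centroid named above is simply the amplitude-weighted mean frequency of a signal's spectrum; a minimal sketch of the computation (the input spectrum in the test is synthetic, not a seismogram from the study):

```python
def spectral_centroid(freqs, magnitudes):
    """Amplitude-weighted mean frequency of a spectrum -- one of the
    depth-sensitive features mentioned in the abstract."""
    total = sum(magnitudes)
    return sum(f * m for f, m in zip(freqs, magnitudes)) / total
```

Intuitively, shallow events attenuate less of their high-frequency energy on the way to the surface, so their centroids sit higher; this per-event scalar then enters the logistic regression as one feature.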
Wilson, Asa B; Kerr, Bernard J; Bastian, Nathaniel D; Fulton, Lawrence V
2012-01-01
From 1980 to 1999, rural designated hospitals closed at a disproportionally high rate. In response to this emergent threat to healthcare access in rural settings, the Balanced Budget Act of 1997 made provisions for the creation of a new rural hospital--the critical access hospital (CAH). The conversion to CAH and the associated cost-based reimbursement scheme significantly slowed the closure rate of rural hospitals. This work investigates which methods can ensure the long-term viability of small hospitals. This article uses a two-step design to focus on a hypothesized relationship between the technical efficiency of CAHs and a recently developed set of financial monitors for these entities. The goal is to identify the financial performance measures associated with efficiency. The first step uses data envelopment analysis (DEA) to differentiate efficient from inefficient facilities within a data set of 183 CAHs. Determining DEA efficiency provides an a priori categorization of hospitals in the data set as efficient or inefficient. In the second step, DEA efficiency is the categorical dependent variable (efficient = 0, inefficient = 1) in the subsequent binary logistic regression (LR) model. A set of six financial monitors selected from the array of 20 measures served as the LR independent variables. We use the binary LR to test the null hypothesis that recently developed CAH financial indicators have no predictive value for categorizing a CAH as efficient or inefficient (i.e., that there is no relationship between DEA efficiency and fiscal performance).
Wu, Liang; Deng, Fei; Xie, Zhong; Hu, Sheng; Shen, Shu; Shi, Junming; Liu, Dan
2016-01-01
Severe fever with thrombocytopenia syndrome (SFTS) is caused by severe fever with thrombocytopenia syndrome virus (SFTSV), which has had a serious impact on public health in parts of Asia. There is no specific antiviral drug or vaccine for SFTSV and, therefore, it is important to determine the factors that influence the occurrence of SFTSV infections. This study aimed to explore the spatial associations between SFTSV infections and several potential determinants, and to predict the high-risk areas in mainland China. The analysis was carried out at the level of provinces in mainland China. The potential explanatory variables that were investigated consisted of meteorological factors (average temperature, average monthly precipitation and average relative humidity), the average proportion of rural population and the average proportion of primary industries over three years (2010–2012). We constructed a geographically weighted logistic regression (GWLR) model in order to explore the associations between the selected variables and confirmed cases of SFTSV. The study showed that: (1) meteorological factors have a strong influence on SFTSV occurrence; (2) a GWLR model is suitable for exploring SFTSV occurrence in mainland China; (3) our findings can be used for predicting high-risk areas and highlighting when meteorological factors pose a risk in order to aid in the implementation of public health strategies. PMID:27845737
Herndon, Nic; Caragea, Doina
2016-01-01
Supervised classifiers are highly dependent on abundant labeled training data. Alternatives for addressing the lack of labeled data include: labeling data (but this is costly and time consuming); training classifiers with abundant data from another domain (however, the classification accuracy usually decreases as the distance between domains increases); or complementing the limited labeled data with abundant unlabeled data from the same domain and learning semi-supervised classifiers (but the unlabeled data can mislead the classifier). A better alternative is to use both the abundant labeled data from a source domain, the limited labeled data and optionally the unlabeled data from the target domain to train classifiers in a domain adaptation setting. We propose two such classifiers, based on logistic regression, and evaluate them for the task of splice site prediction – a difficult and essential step in gene prediction. Our classifiers achieved high accuracy, with highest areas under the precision-recall curve between 50.83% and 82.61%. PMID:26849871
Dai, Huanping; Micheyl, Christophe
2012-11-01
Psychophysical "reverse-correlation" methods allow researchers to gain insight into the perceptual representations and decision weighting strategies of individual subjects in perceptual tasks. Although these methods have gained momentum, until recently their development was limited to experiments involving only two response categories. Recently, two approaches for estimating decision weights in m-alternative experiments have been put forward. One approach extends the two-category correlation method to m > 2 alternatives; the second uses multinomial logistic regression (MLR). In this article, the relative merits of the two methods are discussed, and the issues of convergence and statistical efficiency of the methods are evaluated quantitatively using Monte Carlo simulations. The results indicate that, for a range of values of the number of trials, the estimated weighting patterns are closer to their asymptotic values for the correlation method than for the MLR method. Moreover, for the MLR method, weight estimates for different stimulus components can exhibit strong correlations, making the analysis and interpretation of measured weighting patterns less straightforward than for the correlation method. These and other advantages of the correlation method, which include computational simplicity and a close relationship to other well-established psychophysical reverse-correlation methods, make it an attractive tool to uncover decision strategies in m-alternative experiments.
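The correlation method favored above reduces to one trial-by-trial correlation per stimulus component: the estimated decision weight for a component is its correlation with the observer's responses. A minimal sketch (the trial data in the test are synthetic):

```python
def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def decision_weights(stimulus_matrix, responses):
    """Correlation-method weight estimate: correlate each stimulus
    component (column) with the observer's responses across trials."""
    n_components = len(stimulus_matrix[0])
    return [pearson([trial[j] for trial in stimulus_matrix], responses)
            for j in range(n_components)]
```

This computational simplicity, versus iteratively fitting an MLR model, is one of the advantages the article credits to the correlation method.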
NASA Astrophysics Data System (ADS)
Thompson, E. David; Bowling, Bethany V.; Markle, Ross E.
2017-02-01
Studies over the last 30 years have considered various factors related to student success in introductory biology courses. While much of the available literature suggests that the best predictors of success in a college course are prior college grade point average (GPA) and class attendance, faculty often need a predictor of success for courses in which the majority of students are in their first semester and have no previous record of college GPA or attendance. In this study, we evaluated the efficacy of the ACT Mathematics subject exam and Lawson's Classroom Test of Scientific Reasoning in predicting success in a majors' introductory biology course. A logistic regression was utilized to determine the effectiveness of a combination of scientific reasoning (SR) scores and ACT math (ACT-M) scores in predicting student success. In summary, we found that the model, with both SR and ACT-M as significant predictors, could be an effective predictor of student success and thus could potentially be useful in practical decision making for the course, such as directing students to support services at an early point in the semester.
Parodi, Stefano; Dosi, Corrado; Zambon, Antonella; Ferrari, Enrico; Muselli, Marco
2017-03-02
Identifying potential risk factors for problem gambling (PG) is of primary importance for planning preventive and therapeutic interventions. We illustrate a new approach based on the combination of standard logistic regression (LR) and an innovative method of supervised data mining (Logic Learning Machine, or LLM). Data were taken from a pilot cross-sectional study to identify subjects with PG behaviour, assessed by two internationally validated scales (SOGS and Lie/Bet). Information was obtained from 251 gamblers recruited in six betting establishments. Data on socio-demographic characteristics, lifestyle and cognitive-related factors, and the type, place and frequency of preferred gambling were obtained by a self-administered questionnaire. The following variables associated with PG were identified: instant gratification games, alcohol abuse, cognitive distortion, illegal behaviours and having started gambling with a relative or a friend. Furthermore, the combination of LLM and LR indicated the presence of two different types of PG, namely: (a) daily gamblers, more prone to illegal behaviour, with poor money management skills and who started gambling at an early age, and (b) non-daily gamblers, characterised by superstitious beliefs and a higher preference for immediate-reward games. Finally, instant gratification games were strongly associated with the number of games usually played. Studies on gamblers who habitually frequent betting shops are rare. The finding of different types of PG among habitual gamblers deserves further analysis in larger studies. Advanced data mining algorithms, like LLM, are powerful tools and potentially useful in identifying risk factors for PG.
Liu, Hongjie; Li, Tianhao; Zhan, Sha; Pan, Meilan; Ma, Zhiguo; Li, Chenghua
2016-01-01
Aims. To establish a logistic regression (LR) prediction model for the hepatotoxicity of Chinese herbal medicines (HMs) based on traditional Chinese medicine (TCM) theory and to provide a statistical basis for predicting the hepatotoxicity of HMs. Methods. The correlations of hepatotoxic and nonhepatotoxic Chinese HMs with the four properties, five flavors, and channel tropism were analyzed with the chi-square test for two-way unordered categorical data. An LR prediction model was established and the accuracy of its predictions was evaluated. Results. The hepatotoxic and nonhepatotoxic Chinese HMs were related to the four properties (p < 0.05), with a coefficient of 0.178 (p < 0.05); they were also related to the five flavors (p < 0.05), with a coefficient of 0.145 (p < 0.05); they were not related to channel tropism (p > 0.05). There were in total 12 candidate variables from the four properties and five flavors for the LR. Four variables, warm and neutral of the four properties and pungent and salty of the five flavors, were selected to establish the LR prediction model, with a cutoff value of 0.204. Conclusions. Warm and neutral of the four properties and pungent and salty of the five flavors were the variables that affect hepatotoxicity. Based on these results, the established LR prediction model has some predictive power for the hepatotoxicity of Chinese HMs. PMID:27656240
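Prediction with an LR model and a non-default cutoff like the 0.204 above works as follows: compute the logistic probability from the linear score, then classify as hepatotoxic when the probability exceeds the cutoff. The linear score here is assumed to come from the four selected indicator variables; its value in the test is a placeholder.

```python
import math

def predict_hepatotoxic(score, cutoff=0.204):
    """Return (probability, is_hepatotoxic). `score` is the linear predictor
    (intercept plus weighted indicator variables); the 0.204 cutoff is the
    value reported in the abstract."""
    prob = 1.0 / (1.0 + math.exp(-score))
    return prob, prob > cutoff
```

Lowering the cutoff below 0.5 trades specificity for sensitivity, which is a common choice when missing a hepatotoxic herb is costlier than a false alarm.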
NASA Astrophysics Data System (ADS)
Song, Sutao; Chen, Gongxiang; Zhan, Yu; Zhang, Jiacai; Yao, Li
2014-03-01
Recently, sparse algorithms, such as Sparse Multinomial Logistic Regression (SMLR), have been successfully applied to decoding visual information from functional magnetic resonance imaging (fMRI) data, where the contrast of visual stimuli was predicted by a classifier. The contrast classifier combined the brain activities of voxels with sparse weights. For sparse algorithms, the goal is to learn a classifier whose weights are distributed as sparsely as possible by introducing some prior belief about the weights. There are two ways to introduce sparse prior constraints on the weights: Automatic Relevance Determination (ARD-SMLR) and the Laplace prior (LAP-SMLR). In this paper, we present comparison results between the ARD-SMLR and LAP-SMLR models in computational time, classification accuracy and voxel selection. Results showed that, for fMRI data, no significant difference was found in classification accuracy between these two methods when voxels in V1 were chosen as input features (1,017 voxels in total). As for computation time, LAP-SMLR was superior to ARD-SMLR; fewer voxels survived for ARD-SMLR than for LAP-SMLR. Using simulation data, we confirmed that the classification performance of the two SMLR models was sensitive to the sparsity of the initial features: when the ratio of relevant features to initial features was larger than 0.01, ARD-SMLR outperformed LAP-SMLR; otherwise, LAP-SMLR outperformed ARD-SMLR. Simulation data showed ARD-SMLR was more efficient in selecting relevant features.
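A Laplace prior on the weights is equivalent to L1 penalisation, which drives irrelevant weights to exactly zero. A tiny proximal-gradient (ISTA) sketch of that idea for binary logistic loss follows; it illustrates the Laplace-prior mechanism only, not the SMLR algorithm itself, and the data in the test are synthetic.

```python
import math

def soft_threshold(w, t):
    """Proximal operator of the L1 (Laplace-prior) penalty."""
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0

def l1_logistic_fit(X, y, lam=0.1, lr=0.1, iters=200):
    """Proximal-gradient fit of an L1-penalised binary logistic model."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(iters):
        # gradient of the average logistic loss
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            p = 1.0 / (1.0 + math.exp(-sum(wj * xj for wj, xj in zip(w, xi))))
            for j in range(d):
                grad[j] += (p - yi) * xi[j] / n
        # gradient step followed by soft-thresholding (the sparsity step)
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad)]
    return w
```

A constant (uninformative) feature receives no gradient and is held at exactly zero by the threshold, which is the sparsity behaviour both SMLR priors aim for.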
Miguel-Hurtado, Oscar; Guest, Richard; Stevenage, Sarah V; Neil, Greg J; Black, Sue
2016-01-01
Understanding the relationship between physiological measurements from human subjects and their demographic data is important within both the biometric and forensic domains. In this paper we explore the relationship between measurements of the human hand and a range of demographic features. We assess the ability of linear regression and machine learning classifiers to predict demographics from hand features, thereby providing evidence on both the strength of relationship and the key features underpinning this relationship. Our results show that we are able to predict sex, height, weight and foot size accurately within various data-range bin sizes, with machine learning classification algorithms out-performing linear regression in most situations. In addition, we identify the features used to provide these relationships applicable across multiple applications.
ERIC Educational Resources Information Center
Hidalgo, Mª Dolores; Gómez-Benito, Juana; Zumbo, Bruno D.
2014-01-01
The authors analyze the effectiveness of the R[superscript 2] and delta log odds ratio effect size measures when using logistic regression analysis to detect differential item functioning (DIF) in dichotomous items. A simulation study was carried out, and the Type I error rate and power estimates under conditions in which only statistical testing…
Fufaa, Gudeta; Cai, Bin
2016-01-01
Context Reduced insulin sensitivity is one of the traditional risk factors for chronic diseases such as type 2 diabetes. Reduced insulin sensitivity leads to insulin resistance, which in turn can lead to the development of type 2 diabetes. Few studies have examined factors, such as blood pressure and tobacco and alcohol consumption, that influence changes in insulin sensitivity over time, especially among young adults. Purpose To examine temporal changes in insulin sensitivity in young adults (18-30 years of age at baseline) over a period of 20 years, taking into account the effects of tobacco and alcohol consumption at baseline; in other words, to examine whether baseline tobacco and alcohol consumption can be used to predict lowered insulin sensitivity. Method This is a retrospective study using data collected by the Coronary Artery Risk Development in Young Adults (CARDIA) study from the National Heart, Lung, and Blood Institute. Participants were enrolled in the study in 1985 (baseline) and followed up to 2005. Insulin sensitivity, measured by the quantitative insulin sensitivity check index (QUICKI), was recorded at baseline and 20 years later, in 2005. The number of participants included in the study was 3,547; the original study included a total of 5,112 participants at baseline. Of these, 54.48% were female and 45.52% were male; 45.31% were 18 to 24 years of age, and 54.69% were 25 to 30 years of age. Changes in insulin sensitivity from baseline were calculated and grouped into three categories (more than 15%, more than 8.5% to at most 15%, and at most 8.5%), which provided the basis for employing ordinal logistic regression to assess changes in insulin sensitivity. The effects of alcohol and tobacco consumption at baseline on the change in insulin sensitivity were accounted for by including these variables in the model. Results Daily alcohol
Tang, Li-Na; Ye, Xiao-Zhou; Yan, Qiu-Ge; Chang, Hong-Juan; Ma, Yu-Qiao; Liu, De-Bin; Li, Zhi-Gen; Yu, Yi-Zhen
2017-02-01
The risk factors for high trait anger among juvenile offenders were explored through a questionnaire study in a youth correctional facility of Hubei province, China. A total of 1090 juvenile offenders in Hubei province were investigated with a self-compiled socio-demographic questionnaire, the Childhood Trauma Questionnaire (CTQ), and the State-Trait Anger Expression Inventory-II (STAXI-II). The risk factors were analyzed by chi-square tests, correlation analysis, and binary logistic regression analysis with SPSS 19.0. A total of 1082 valid questionnaires were collected. The high trait anger group (n=316) was defined as those who scored in the upper 27th percentile of the STAXI-II trait anger scale (TAS), and the rest were defined as the low trait anger group (n=766). The risk factors associated with a high level of trait anger included childhood emotional abuse, childhood sexual abuse, step family, frequent drug abuse, and frequent internet use (P<0.05 or P<0.01). Birth sequence, number of siblings, ranking in the family, identity of the main caretaker, education level of the caretaker, educational style of the caretaker, family income, relationship between the parents, social atmosphere of the local area, frequent drinking, and frequent smoking did not predict a high level of trait anger (P>0.05). These results suggest that traumatic experience in childhood and an unhealthy lifestyle may significantly increase the level of trait anger in adulthood. The risk factors for high trait anger and their effects should be taken into consideration seriously.
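Since the abstract relies on binary logistic regression, it may help to recall that for a single binary risk factor the fitted coefficient reproduces the familiar 2x2-table odds ratio via exp(beta). A self-contained sketch with invented counts (not data from the study):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logreg_1d(x, y, lr=0.5, iters=5000):
    """Binary logistic regression with one binary predictor, fitted by
    plain gradient descent on the average log-loss."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(iters):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            err = sigmoid(b0 + b1 * xi) - yi
            g0 += err / n
            g1 += err * xi / n
        b0 -= lr * g0
        b1 -= lr * g1
    return b0, b1

# invented 2x2 table: exposed 30 cases / 20 controls,
# unexposed 10 cases / 40 controls -> odds ratio (30*40)/(20*10) = 6
x = [1] * 50 + [0] * 50
y = [1] * 30 + [0] * 20 + [1] * 10 + [0] * 40
b0, b1 = fit_logreg_1d(x, y)
odds_ratio = math.exp(b1)  # recovers the 2x2 odds ratio for a binary predictor
```

With several predictors, exp(beta) becomes an adjusted odds ratio, which is exactly how the significant factors above are interpreted.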
NASA Astrophysics Data System (ADS)
Yao, W.; Poleswki, P.; Krzystek, P.
2016-06-01
The recent success of deep convolutional neural networks (CNN) in a large number of applications can be attributed to large amounts of available training data and increasing computing power. In this paper, a semantic pixel labelling scheme for urban areas using multi-resolution CNN and hand-crafted spatial-spectral features of airborne remotely sensed data is presented. Both CNN and hand-crafted features are applied to image/DSM patches to produce per-pixel class probabilities with an L1-norm regularized logistic regression classifier. Evidence theory infers a degree of belief for pixel labelling from the different sources to smooth regions by handling the conflicts present in both classifiers while reducing the uncertainty. The aerial data used in this study were provided by ISPRS as benchmark datasets for 2D semantic labelling tasks in urban areas and consist of two data sources, LiDAR and a color infrared camera. The test sites are parts of a city in Germany that is assumed to consist of typical object classes including impervious surfaces, trees, buildings, low vegetation, vehicles and clutter. The evaluation is based on the computation of pixel-based confusion matrices by random sampling. The performance of the strategy with respect to scene characteristics and method combination strategies is analyzed and discussed. The competitive classification accuracy can be explained not only by the nature of the input data sources (e.g. the above-ground height of the nDSM highlights the vertical dimension of houses, trees and even cars, and the near-infrared spectrum indicates vegetation), but also by the decision-level fusion of the CNN's texture-based approach with multichannel spatial-spectral hand-crafted features based on evidence combination theory.
Boutilier, J; Chan, T; Lee, T; Craig, T; Sharpe, M
2014-06-15
Purpose: To develop a statistical model that predicts optimization objective function weights from patient geometry for intensity-modulated radiotherapy (IMRT) of prostate cancer. Methods: A previously developed inverse optimization method (IOM) is applied retrospectively to determine optimal weights for 51 treated patients. We use an overlap volume ratio (OVR) of bladder and rectum for different PTV expansions to quantify patient geometry in explanatory variables. Using the optimal weights as ground truth, we develop and train a logistic regression (LR) model to predict the rectum weight and thus the bladder weight. Post hoc, we fix the weights of the left femoral head, right femoral head, and an artificial structure that encourages conformity to the population average, while normalizing the bladder and rectum weights accordingly. The population average of objective function weights is used for comparison. Results: The OVR at 0.7 cm was found to be the most predictive of the rectum weights. The LR model performance is statistically significant compared to the population average over a range of clinical metrics including bladder/rectum V53Gy, bladder/rectum V70Gy, and mean voxel dose to the bladder, rectum, CTV, and PTV. On average, the LR model predicted bladder and rectum weights that are both 63% closer to the optimal weights than the population average. The treatment plans resulting from the LR weights have, on average, a rectum V70Gy that is 35% closer to the clinical plan and a bladder V70Gy that is 43% closer. Similar results are seen for bladder V54Gy and rectum V54Gy. Conclusion: Statistical modelling from patient anatomy can be used to determine objective function weights in IMRT for prostate cancer. Our method allows treatment planners to begin the personalization process from an informed starting point, which may lead to more consistent clinical plans and reduce overall planning time.
Wei, Peng; Tang, Hongwei; Li, Donghui
2014-01-01
Most complex human diseases are likely the consequence of the joint actions of genetic and environmental factors. Identification of gene-environment (GxE) interactions not only contributes to a better understanding of the disease mechanisms, but also improves disease risk prediction and targeted intervention. In contrast to the large number of genetic susceptibility loci discovered by genome-wide association studies, there have been very few successes in identifying GxE interactions which may be partly due to limited statistical power and inaccurately measured exposures. While existing statistical methods only consider interactions between genes and static environmental exposures, many environmental/lifestyle factors, such as air pollution and diet, change over time, and cannot be accurately captured at one measurement time point or by simply categorizing into static exposure categories. There is a dearth of statistical methods for detecting gene by time-varying environmental exposure interactions. Here we propose a powerful functional logistic regression (FLR) approach to model the time-varying effect of longitudinal environmental exposure and its interaction with genetic factors on disease risk. Capitalizing on the powerful functional data analysis framework, our proposed FLR model is capable of accommodating longitudinal exposures measured at irregular time points and contaminated by measurement errors, commonly encountered in observational studies. We use extensive simulations to show that the proposed method can control the Type I error and is more powerful than alternative ad hoc methods. We demonstrate the utility of this new method using data from a case-control study of pancreatic cancer to identify the windows of vulnerability of lifetime body mass index on the risk of pancreatic cancer as well as genes which may modify this association. PMID:25219575
Mamouridis, Valeria; Klein, Nadja; Kneib, Thomas; Cadarso Suarez, Carmen; Maynou, Francesc
2017-01-01
We analysed the landings per unit effort (LPUE) from the Barcelona trawl fleet targeting the red shrimp (Aristeus antennatus) using novel Bayesian structured additive distributional regression to gain a better understanding of the dynamics and determinants of variation in LPUE. The data set, covering a time span of 17 years, includes fleet-dependent variables (e.g. the number of trips performed by vessels), temporal variables (inter- and intra-annual variability) and environmental variables (the North Atlantic Oscillation index). Based on structured additive distributional regression, we evaluate (i) the gain in replacing purely linear predictors by additive predictors including nonlinear effects of continuous covariates, (ii) the inclusion of vessel-specific effects based on either fixed or random effects, (iii) different types of distributions for the response, and (iv) the potential gain in not only modelling the location but also the scale/shape parameter of these distributions. Our findings support that flexible model variants are indeed able to improve the fit considerably and that additional insights can be gained. Tools to select within several model specifications and assumptions are discussed in detail as well.
Marginal regression approach for additive hazards models with clustered current status data.
Su, Pei-Fang; Chi, Yunchan
2014-01-15
Current status data arise naturally from tumorigenicity experiments, epidemiology studies, biomedicine, econometrics, and demographic and sociological studies. Moreover, clustered current status data may occur with animals from the same litter in tumorigenicity experiments or with subjects from the same family in epidemiology studies. Because the only information extracted from current status data is whether the survival times are before or after the monitoring or censoring times, the nonparametric maximum likelihood estimator of the survival function converges at a rate of n^(1/3) to a complicated limiting distribution. Hence, semiparametric regression models such as the additive hazards model have been extended for independent current status data to derive test statistics, whose distributions converge at a rate of n^(1/2), for testing the regression parameters. However, a straightforward application of these statistical methods to clustered current status data is not appropriate because intracluster correlation needs to be taken into account. Therefore, this paper proposes two estimating functions for estimating the parameters in the additive hazards model for clustered current status data. The comparative results from simulation studies are presented, and the application of the proposed estimating functions to one real data set is illustrated.
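For readers unfamiliar with the model class, the additive hazards model referred to here (in its standard Lin-Ying form) adds covariate effects to a baseline hazard, in contrast to the multiplicative Cox model:

```latex
% additive hazards: covariate effects shift the hazard additively
\lambda(t \mid Z) = \lambda_0(t) + \beta^\top Z(t)
% multiplicative (Cox) alternative, for contrast:
% \lambda(t \mid Z) = \lambda_0(t) \exp\{\beta^\top Z(t)\}
```

The additive form is what makes estimating-equation approaches with closed-form structure attractive for clustered current status data.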
Lin, Feng-Chang; Zhu, Jun
2012-01-01
We develop continuous-time models for the analysis of environmental or ecological monitoring data such that subjects are observed at multiple monitoring time points across space. Of particular interest are additive hazards regression models where the baseline hazard function can take on flexible forms. We consider time-varying covariates and take into account spatial dependence via autoregression in space and time. We develop statistical inference for the regression coefficients via partial likelihood. Asymptotic properties, including consistency and asymptotic normality, are established for parameter estimates under suitable regularity conditions. Feasible algorithms utilizing existing statistical software packages are developed for computation. We also consider a simpler additive hazards model with homogeneous baseline hazard and develop hypothesis testing for homogeneity. A simulation study demonstrates that the statistical inference using partial likelihood has sound finite-sample properties and offers a viable alternative to maximum likelihood estimation. For illustration, we analyze data from an ecological study that monitors bark beetle colonization of red pines in a plantation of Wisconsin.
NASA Astrophysics Data System (ADS)
Ettinger, Susanne; Mounaud, Loïc; Magill, Christina; Yao-Lafourcade, Anne-Françoise; Thouret, Jean-Claude; Manville, Vern; Negulescu, Caterina; Zuccaro, Giulio; De Gregorio, Daniela; Nardone, Stefano; Uchuchoque, Juan Alexis Luque; Arguedas, Anita; Macedo, Luisa; Manrique Llerena, Nélida
2016-10-01
bivariate analyses were applied to better characterize each vulnerability parameter. Multiple correspondence analysis revealed strong relationships between the "Distance to channel or bridges", "Structural building type", "Building footprint" and the observed damage. Logistic regression enabled quantification of the contribution of each explanatory parameter to potential damage, and determination of the significant parameters that express the damage susceptibility of a building. The model was applied 200 times on different calibration and validation data sets in order to examine performance. Results show that 90% of these tests have a success rate of more than 67%. Probabilities (at building scale) of experiencing different damage levels during a future event similar to the 8 February 2013 flash flood are the major outcomes of this study.
NASA Astrophysics Data System (ADS)
Heckmann, T.; Gegg, K.; Gegg, A.; Becht, M.
2014-02-01
Predictive spatial modelling is an important task in natural hazard assessment and regionalisation of geomorphic processes or landforms. Logistic regression is a multivariate statistical approach frequently used in predictive modelling; it can be conducted stepwise in order to select from a number of candidate independent variables those that lead to the best model. In our case study on a debris flow susceptibility model, we investigate the sensitivity of model selection and quality to different sample sizes in light of the following problem: on the one hand, a sample has to be large enough to cover the variability of geofactors within the study area, and to yield stable and reproducible results; on the other hand, the sample must not be too large, because a large sample is likely to violate the assumption of independent observations due to spatial autocorrelation. Using stepwise model selection with 1000 random samples for a number of sample sizes between n = 50 and n = 5000, we investigate the inclusion and exclusion of geofactors and the diversity of the resulting models as a function of sample size; the multiplicity of different models is assessed using numerical indices borrowed from information theory and biodiversity research. Model diversity decreases with increasing sample size and reaches either a local minimum or a plateau; even larger sample sizes do not further reduce it, and they approach the upper limit of sample size given, in this study, by the autocorrelation range of the spatial data sets. In this way, an optimised sample size can be derived from an exploratory analysis. Model uncertainty due to sampling and model selection, and its predictive ability, are explored statistically and spatially through the example of 100 models estimated in one study area and validated in a neighbouring area: depending on the study area and on sample size, the predicted probabilities for debris flow release differed, on average, by 7 to 23 percentage points. In
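The model-diversity indices "borrowed from information theory and biodiversity research" can be illustrated with the Shannon index over the distinct variable sets selected across repeated samples; the selection outcomes below are invented for the sketch, not taken from the study:

```python
import math
from collections import Counter

def shannon_diversity(models):
    """Shannon index H' = -sum p_i ln p_i over the distinct models selected
    across repeated samples; lower H' means more stable model selection."""
    counts = Counter(models)
    n = sum(counts.values())
    return -sum((c / n) * math.log(c / n) for c in counts.values())

# invented selection outcomes: sets of geofactors chosen in repeated
# stepwise runs at a small and at a large sample size
small_n_runs = [frozenset({"slope"}), frozenset({"slope", "curvature"}),
                frozenset({"aspect"}), frozenset({"slope"})]
large_n_runs = [frozenset({"slope", "curvature"})] * 4

h_small = shannon_diversity(small_n_runs)  # several distinct models: H' > 0
h_large = shannon_diversity(large_n_runs)  # identical model each run: H' = 0
```

Plotting such an index against sample size is one way to locate the plateau described in the abstract, where larger samples no longer reduce model diversity.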
NASA Astrophysics Data System (ADS)
Heckmann, T.; Gegg, K.; Gegg, A.; Becht, M.
2013-06-01
Predictive spatial modelling is an important task in natural hazard assessment and regionalisation of geomorphic processes or landforms. Logistic regression is a multivariate statistical approach frequently used in predictive modelling; it can be conducted stepwise in order to select from a number of candidate independent variables those that lead to the best model. In our case study on a debris flow susceptibility model, we investigate the sensitivity of model selection and quality to different sample sizes in light of the following problem: on the one hand, a sample has to be large enough to cover the variability of geofactors within the study area, and to yield stable results; on the other hand, the sample must not be too large, because a large sample is likely to violate the assumption of independent observations due to spatial autocorrelation. Using stepwise model selection with 1000 random samples for a number of sample sizes between n = 50 and n = 5000, we investigate the inclusion and exclusion of geofactors and the diversity of the resulting models as a function of sample size; the multiplicity of different models is assessed using numerical indices borrowed from information theory and biodiversity research. Model diversity decreases with increasing sample size and reaches either a local minimum or a plateau; even larger sample sizes do not further reduce it, and approach the upper limit of sample size given, in this study, by the autocorrelation range of the spatial datasets. In this way, an optimised sample size can be derived from an exploratory analysis. Model uncertainty due to sampling and model selection, and its predictive ability, are explored statistically and spatially through the example of 100 models estimated in one study area and validated in a neighbouring area: depending on the study area and on sample size, the predicted probabilities for debris flow release differed, on average, by 7 to 23 percentage points. In view of these results, we
NASA Astrophysics Data System (ADS)
Trigila, Alessandro; Iadanza, Carla; Esposito, Carlo; Scarascia-Mugnozza, Gabriele
2015-04-01
first phase of the work aimed to identify the spatial relationships between landslide locations and the 13 related factors using the Frequency Ratio bivariate statistical method. The analysis was then carried out with a multivariate statistical approach, adopting the Logistic Regression and Random Forests techniques, which gave the best results in terms of AUC. The models were built and evaluated with different sample sizes, also taking into account the temporal variation of input variables such as areas burned by wildfire. The most significant outcomes of this work are the considerable influence of the sample size on the model results and the strong importance of some environmental factors (e.g. land use and wildfires) for the identification of the depletion zones of extremely rapid shallow landslides.
NASA Astrophysics Data System (ADS)
Bornaetxea, Txomin; Antigüedad, Iñaki; Ormaetxea, Orbange
2016-04-01
In the Oria river basin (885 km2), shallow landslides are very frequent; they cause roadblocks and damage to infrastructure and property, resulting in large economic losses every year. Considering that zoning the territory into different landslide susceptibility levels provides a useful tool for territorial planning and natural risk management, this study aims to identify the most landslide-prone areas by applying an objective and reproducible methodology. To do so, a quantitative multivariate method, logistic regression, has been used. Fieldwork landslide points and randomly selected stable points were used, along with the independent variables Lithology, Land Use, Distance to transport infrastructure, Altitude, Senoidal Slope and Normalized Difference Vegetation Index (NDVI), to produce a landslide susceptibility map. The model has been validated by the prediction and success rate curves and their corresponding area under the curve (AUC). In addition, the result has been compared to those from two landslide susceptibility models covering the study area that were previously applied at different scales, namely ELSUS1000 version 1 (2013) and the Landslide Susceptibility Map of Gipuzkoa (2007). Validation results show an excellent prediction capacity of the proposed model (AUC = 0.962), and the comparisons highlight large differences with respect to the previous studies.
Inokuchi, Shota; Kitayama, Tetsushi; Fujii, Koji; Nakahara, Hiroaki; Nakanishi, Hiroaki; Saito, Kazuyuki; Mizuno, Natsuko; Sekiguchi, Kazumasa
2016-03-01
Phenomena called allele dropout are often observed in crime stain profiles. Allele dropout occurs when one of a pair of heterozygous alleles is underrepresented owing to stochastic influences and falls below the peak detection threshold. Therefore, it is important that such risks be statistically evaluated. In recent years, attempts to estimate allele dropout probabilities by logistic regression using information on peak heights have been reported. However, these previous studies are limited to the use of one human identification kit and fragment analyzer. In the present study, we calculated allele dropout probabilities by logistic regression using contemporary capillary electrophoresis instruments, the 3500xL Genetic Analyzer and the 3130xl Genetic Analyzer, with various commercially available human identification kits such as the AmpFℓSTR® Identifiler® Plus PCR Amplification Kit. Furthermore, the differences in logistic curves between peak detection thresholds using the analytical threshold (AT) and the values recommended by the manufacturer were compared. The standard logistic curves for calculating allele dropout probabilities from the peak height of sister alleles were characterized. The present study confirmed that ATs were lower than the values recommended by the manufacturers of human identification kits; therefore, it is possible to reduce allele dropout probabilities and obtain more information by using the AT as the peak detection threshold.
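A hedged sketch of the kind of logistic dropout curve described here: dropout probability modelled as a logistic function of the (log) peak height of the surviving sister allele. The coefficients are illustrative placeholders, not values fitted in the study:

```python
import math

def dropout_probability(peak_height, b0=6.0, b1=-3.0):
    """Logistic model P(dropout) = sigmoid(b0 + b1 * log10(H)), where H is
    the peak height (RFU) of the surviving sister allele.
    b0 and b1 are illustrative placeholders, not fitted values."""
    z = b0 + b1 * math.log10(peak_height)
    return 1.0 / (1.0 + math.exp(-z))

p_low = dropout_probability(50.0)     # weak sister peak: high dropout risk
p_high = dropout_probability(2000.0)  # strong sister peak: low dropout risk
```

Lowering the peak detection threshold (using the AT) shifts such curves so that a given dropout probability is reached at lower peak heights, which is the practical gain reported above.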
Regression analysis of mixed recurrent-event and panel-count data with additive rate models.
Zhu, Liang; Zhao, Hui; Sun, Jianguo; Leisenring, Wendy; Robison, Leslie L
2015-03-01
Event-history studies of recurrent events are often conducted in fields such as demography, epidemiology, medicine, and social sciences (Cook and Lawless, 2007, The Statistical Analysis of Recurrent Events. New York: Springer-Verlag; Zhao et al., 2011, Test 20, 1-42). For such analysis, two types of data have been extensively investigated: recurrent-event data and panel-count data. However, in practice, one may face a third type of data, mixed recurrent-event and panel-count data or mixed event-history data. Such data occur if some study subjects are monitored or observed continuously and thus provide recurrent-event data, while the others are observed only at discrete times and hence give only panel-count data. A more general situation is that each subject is observed continuously over certain time periods but only at discrete times over other time periods. There exists little literature on the analysis of such mixed data except that published by Zhu et al. (2013, Statistics in Medicine 32, 1954-1963). In this article, we consider the regression analysis of mixed data using the additive rate model and develop some estimating equation-based approaches to estimate the regression parameters of interest. Both finite sample and asymptotic properties of the resulting estimators are established, and the numerical studies suggest that the proposed methodology works well for practical situations. The approach is applied to a Childhood Cancer Survivor Study that motivated this study.
Sargolzaie, Narjes; Miri-Moghaddam, Ebrahim
2014-01-01
The most common differential diagnosis of the β-thalassemia (β-thal) trait is iron deficiency anemia. Several red blood cell equations have been introduced in different studies for the differential diagnosis between β-thal trait and iron deficiency anemia. Due to genetic variation across regions, these equations cannot be useful in all populations. The aim of this study was to determine a native equation with high accuracy for the differential diagnosis of β-thal trait and iron deficiency anemia in the Sistan and Baluchestan population by logistic regression analysis. We selected 77 iron deficiency anemia and 100 β-thal trait cases. We used binary logistic regression analysis to determine the best equation for predicting the probability of β-thal trait versus iron deficiency anemia in our population. We compared the diagnostic values and receiver operating characteristic (ROC) curves of this equation and another 10 published equations in discriminating β-thal trait from iron deficiency anemia. The binary logistic regression analysis determined the best equation for predicting the probability of β-thal trait versus iron deficiency anemia, with an area under the curve (AUC) of 0.998. Based on the ROC curves and AUC, the Green & King, England & Fraser, and Sirdah indices, respectively, had the most accuracy after our equation. We suggest that, to obtain the best equation and cut-off for each region, one needs to evaluate the specific characteristics of each population, particularly in areas where populations are homogeneous, to provide a specific formula for differentiating between β-thal trait and iron deficiency anemia.
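The AUC values quoted above can be computed without any plotting, since the area under the ROC curve equals the Mann-Whitney probability that a randomly chosen β-thal trait case receives a higher predicted probability than a randomly chosen iron deficiency case. A sketch with invented scores:

```python
def roc_auc(scores_pos, scores_neg):
    """AUC via the Mann-Whitney U statistic: the probability that a random
    positive (beta-thal trait) case scores above a random negative
    (iron deficiency) case, counting ties as 1/2."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# invented predicted probabilities from a discriminating equation
p_thal = [0.95, 0.9, 0.8, 0.7]
p_ida = [0.1, 0.2, 0.3, 0.8]
area = roc_auc(p_thal, p_ida)
```

An AUC of 0.998, as reported for the native equation, means almost every β-thal trait case outscores every iron deficiency case under this pairwise comparison.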
Yi, Honggang; Wo, Hongmei; Zhao, Yang; Zhang, Ruyang; Dai, Junchen; Jin, Guangfu; Ma, Hongxia; Wu, Tangchun; Hu, Zhibin; Lin, Dongxin; Shen, Hongbing; Chen, Feng
2015-01-01
With recent advances in biotechnology, genome-wide association studies (GWAS) have been widely used to identify genetic variants that underlie human complex diseases and traits. In case-control GWAS, the typical statistical strategy is traditional logistic regression (LR) based on single-locus analysis. However, such single-locus analysis leads to the well-known multiplicity problem, with a risk of inflating the type I error and reducing power. Dimension reduction-based techniques, such as principal component-based logistic regression (PC-LR) and partial least squares-based logistic regression (PLS-LR), have recently gained much attention in the analysis of high-dimensional genomic data. However, the performance of these methods is still not clear, especially in GWAS. We conducted simulations and a real data application to compare the type I error and power of PC-LR, PLS-LR and LR within a defined single nucleotide polymorphism (SNP) set region in GWAS. We found that PC-LR and PLS-LR can reasonably control the type I error under the null hypothesis. In contrast, LR corrected by the Bonferroni method was more conservative in all simulation settings. In particular, we found that PC-LR and PLS-LR had comparable power and both outperformed LR, especially when the causal SNP was in high linkage disequilibrium with the genotyped ones and had a small effect size in the simulations. Based on SNP set analysis, we applied all three methods to analyze non-small cell lung cancer GWAS data. PMID:26243516
Boosted structured additive regression for Escherichia coli fed-batch fermentation modeling.
Melcher, Michael; Scharl, Theresa; Luchner, Markus; Striedner, Gerald; Leisch, Friedrich
2017-02-01
The quality of biopharmaceuticals and patients' safety are of the highest priority, and there are tremendous efforts to replace empirical production process designs by knowledge-based approaches. The main challenge in this context is that real-time access to process variables related to product quality and quantity is severely limited. To date, comprehensive on- and offline monitoring platforms are used to generate process data sets that allow for the development of mechanistic and/or data-driven models for real-time prediction of these important quantities. The ultimate goal is to implement model-based feedback control loops that facilitate online control of product quality. In this contribution, we explore structured additive regression (STAR) models in combination with boosting as a variable selection tool for modeling the cell dry mass, product concentration, and optical density on the basis of online available process variables and two-dimensional fluorescence spectroscopic data. STAR models are powerful extensions of linear models allowing for the inclusion of smooth effects or interactions between predictors. Boosting constructs the final model in a stepwise manner and provides a variable importance measure via predictor selection frequencies. Our results show that the cell dry mass can be modeled with a relative error of about ±3%, the optical density with ±6%, the soluble protein with ±16%, and the insoluble product with an accuracy of ±12%. Biotechnol. Bioeng. 2017;114: 321-334. © 2016 Wiley Periodicals, Inc.
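A toy version of the boosting-as-variable-selection idea (componentwise L2-boosting on linear base learners, much simpler than the STAR smooth terms used in the paper): each round refits every single predictor on the current residuals, keeps only the best one, and adds a shrunken copy of it; selection frequencies then serve as the variable importance measure. Data are invented:

```python
def boost_componentwise(X, y, nu=0.1, rounds=200):
    """Componentwise L2-boosting: per round, fit a least-squares base
    learner on each single predictor, keep the one with the largest drop
    in squared error, and add a shrunken (nu) copy of it. Selection
    counts act as a variable importance measure."""
    n, d = len(X), len(X[0])
    resid = list(y)
    coef = [0.0] * d
    picks = [0] * d
    for _ in range(rounds):
        best_j, best_b, best_drop = 0, 0.0, -1.0
        for j in range(d):
            sxx = sum(X[i][j] ** 2 for i in range(n))
            sxy = sum(X[i][j] * resid[i] for i in range(n))
            b = sxy / sxx if sxx else 0.0
            drop = b * sxy  # decrease in squared error for this base learner
            if drop > best_drop:
                best_j, best_b, best_drop = j, b, drop
        coef[best_j] += nu * best_b
        picks[best_j] += 1
        for i in range(n):
            resid[i] -= nu * best_b * X[i][best_j]
    return coef, picks

# toy data: y depends on feature 0 only (y = 2*x0); feature 1 is pure noise
X = [[1.0, 0.3], [2.0, -0.2], [3.0, 0.1], [4.0, -0.4], [5.0, 0.2]]
y = [2.0, 4.0, 6.0, 8.0, 10.0]
coef, picks = boost_componentwise(X, y)
```

The informative predictor is selected every round while the noise predictor is never picked, mirroring how boosting's selection frequencies rank process variables in the STAR setting.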
Heller, J; Innocent, G T; Denwood, M; Reid, S W J; Kelly, L; Mellor, D J
2011-05-01
Meticillin-resistant Staphylococcus aureus (MRSA) is an important nosocomial and community-acquired pathogen with zoonotic potential. The relationship between MRSA in humans and companion animals is poorly understood. This study presents a quantitative exposure assessment, based on expert opinion and published data, in the form of a second-order stochastic simulation model with an accompanying logistic regression sensitivity analysis that aims to define the most important factors for MRSA acquisition in dogs. The simulation model was parameterised using expert opinion estimates, along with published and unpublished data. The outcome of the model was biologically plausible and found to be dominated by uncertainty over variability. The sensitivity analysis, in the form of four separate logistic regression models, found that both veterinary and non-veterinary routes of acquisition of MRSA are likely to be relevant for dogs. The effects of exposure to, and probability of transmission of, MRSA from the home environment were ranked as the most influential predictors in all sensitivity analyses, although it is unlikely that this environmental source of MRSA is independent of alternative sources (human and/or animal). Exposure to and transmission from MRSA-positive family members were also found to be influential for acquisition of MRSA in pet dogs, along with veterinary clinic attendance; and while exposure to and transmission from the veterinary clinic environment were also found to be influential, it was difficult to differentiate between the importance of independent sources of MRSA within the veterinary clinic. Applying logistic regression analyses directly to the input/output relationship of the simulation model represents the application of a variance-based sensitivity analysis technique in the area of veterinary medicine and is a useful means of ranking the relative importance of input variables.
Zhang, B; Liang, X L; Gao, H Y; Ye, L S; Wang, Y G
2016-05-13
We evaluated the application of three machine learning algorithms (logistic regression, support vector machine, and back-propagation neural network) for diagnosing congenital heart disease and colorectal cancer. By inspecting related serum tumor marker levels in colorectal cancer patients and healthy subjects, early diagnosis models for colorectal cancer were built using the three machine learning algorithms to assess their corresponding diagnostic values. Except for serum alpha-fetoprotein, the levels of the 11 other serum markers were higher in the colorectal cancer group than in the benign colorectal disease group (P < 0.05). The results of logistic regression analysis indicated that individual detection of serum carcinoembryonic antigen (CEA), CA199, CA242, CA125, and CA153, as well as their combined detection, was effective for diagnosing colorectal cancer. Combined detection had the better diagnostic performance, with a sensitivity of 94.2% and a specificity of 97.7%. Combining serum CEA, CA199, CA242, CA125, and CA153, support vector machine and back-propagation neural network diagnosis models were built with diagnostic accuracies of 82 and 75%, sensitivities of 85 and 80%, and specificities of 80 and 70%, respectively. Colorectal cancer diagnosis models based on the three machine learning algorithms showed high diagnostic value and can help obtain evidence for the early diagnosis of colorectal cancer.
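Sensitivity and specificity, the two figures of merit quoted above, are simple counts over a 2x2 confusion table. A minimal sketch (the toy labels below are invented, not the study's data):

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP/(TP+FN); specificity = TN/(TN+FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Toy example: 10 cases and 10 controls classified by some combined-marker rule.
y_true = [1] * 10 + [0] * 10
y_pred = [1] * 9 + [0] + [0] * 9 + [1]   # one false negative, one false positive
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)  # -> 0.9 0.9
```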
Mei, Suyu; Zhang, Kun
2016-01-01
Protein-protein interaction (PPI) networks are naturally viewed as infrastructure for inferring signalling pathways. The descriptors of signal events between two interacting proteins, such as upstream/downstream signal flow, activation/inhibition relationships and protein modification, are indispensable for inferring signalling pathways from PPI networks. However, such descriptors are not available in most cases, as PPI networks are seldom semantically annotated. In this work, we extend ℓ2-regularized logistic regression to a multi-label learning scenario for predicting the activation/inhibition relationships in human PPI networks. The phenomenon that both activation and inhibition relationships exist between two interacting proteins is computationally modelled by the multi-label learning framework. The problem of GO (gene ontology) sparsity is tackled by introducing homolog knowledge as independent homolog instances. ℓ2-regularized logistic regression is accordingly adopted to penalize the homolog noise and to reduce the computational complexity of the double-sized training data. Computational results show that the proposed method achieves satisfactory multi-label learning performance and outperforms the existing phenotype correlation method on experimental data from Drosophila melanogaster. Several predictions have been validated against recent literature. The predicted activation/inhibition relationships in human PPI networks are provided in the supplementary file for further biomedical research. PMID:27819359
NASA Astrophysics Data System (ADS)
Dixon, Barnali
2009-09-01
Accurate and inexpensive identification of potentially contaminated wells is critical for water resources protection and management. The objectives of this study are to 1) assess the suitability of approximation tools such as neural networks (NN) and support vector machines (SVM) integrated in a geographic information system (GIS) for identifying contaminated wells and 2) use logistic regression and feature selection methods to identify significant variables for transporting contaminants in and through the soil profile to the groundwater. Fourteen GIS-derived soil hydrogeologic and land-use parameters were used as initial inputs in this study. Well water quality data (nitrate-N) from 6,917 wells provided by the Florida Department of Environmental Protection (USA) were used as the output target class. The use of logistic regression and feature selection methods reduced the number of input variables to nine. Receiver operating characteristic (ROC) curves were used to evaluate these approximation tools. Results showed superior performance for the NN compared to the SVM, especially on training data, while testing results were comparable. Feature selection did not improve accuracy; however, it helped increase the sensitivity, or true positive rate (TPR). Thus, a higher TPR was obtainable with fewer variables.
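The ROC evaluation used above reduces, for a single summary number, to the AUC, which equals the probability that a randomly chosen positive receives a higher score than a randomly chosen negative (ties counted as half). A minimal sketch with invented scores:

```python
def roc_auc(y_true, scores):
    """AUC via the rank formulation: P(random positive outscores random negative)."""
    pairs, wins = 0, 0.0
    for ti, si in zip(y_true, scores):
        for tj, sj in zip(y_true, scores):
            if ti == 1 and tj == 0:          # one positive/negative pair
                pairs += 1
                if si > sj:
                    wins += 1.0
                elif si == sj:
                    wins += 0.5              # ties count half
    return wins / pairs

y = [1, 1, 1, 0, 0, 0]
s = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(y, s)
print(auc)  # -> 8/9, about 0.889
```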
Belletti, Nicoletta; Kamdem, Sylvain Sado; Patrignani, Francesca; Lanciotti, Rosalba; Covelli, Alessandro; Gardini, Fausto
2007-09-01
The combined effects of a mild heat treatment (55 degrees C) and the presence of three aroma compounds [citron essential oil, citral, and (E)-2-hexenal] on the spoilage of noncarbonated beverages inoculated with different amounts of a Saccharomyces cerevisiae strain were evaluated. The results, expressed as growth/no growth, were analyzed using logistic regression in order to assess the probability of beverage spoilage as a function of thermal treatment length, concentration of flavoring agents, and yeast inoculum. The logit models obtained for the three substances were extremely precise. The thermal treatment alone, even if prolonged for 20 min, was not able to prevent yeast growth. However, increasing concentrations of the aroma compounds improved the stability of the products. The inhibiting effect of the compounds was enhanced by a prolonged thermal treatment; in fact, heating increased the vapor pressure of the molecules, which can more easily interact with microbial membranes when they are in gaseous form. (E)-2-Hexenal showed a threshold level, related to initial inoculum and thermal treatment length, over which yeast growth was rapidly inhibited. Citral concentrations over 100 ppm combined with thermal treatments longer than 16 min allowed a 90% probability of stability for bottles inoculated with 10^5 CFU/bottle. Citron gave the most interesting response: beverages with 500 ppm of essential oil needed only 3 min of treatment to prevent yeast growth. In this framework, logistic regression proved to be an important tool for studying alternative hurdle strategies for the stabilization of noncarbonated beverages.
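The growth/no-growth data above feed a logit model whose fitted coefficients turn treatment settings into a spoilage probability. The coefficients below are invented placeholders, not the paper's fitted values; only the functional form is the point:

```python
import math

def growth_probability(b0, b_time, b_conc, b_inoc, t_min, conc_ppm, log10_inoc):
    """P(growth) = sigmoid(b0 + b_time*t + b_conc*c + b_inoc*n) for one aroma compound."""
    z = b0 + b_time * t_min + b_conc * conc_ppm + b_inoc * log10_inoc
    return 1.0 / (1.0 + math.exp(-z))

# Longer heat treatment and more aroma compound push the probability down;
# a larger inoculum pushes it up (signs chosen to match the qualitative findings).
p = growth_probability(2.0, -0.15, -0.01, 0.4, t_min=16, conc_ppm=100, log10_inoc=5)
print(round(p, 3))  # -> 0.646
```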
ERIC Educational Resources Information Center
Cohen, Ayala; Nahum-Shani, Inbal; Doveh, Etti
2010-01-01
In their seminal paper, Edwards and Parry (1993) presented polynomial regression as a better alternative to difference scores in the study of congruence. Although this method is increasingly applied in congruence research, its complexity relative to other methods for assessing congruence (e.g., difference score methods) was one of the…
ERIC Educational Resources Information Center
Wing, Coady; Cook, Thomas D.
2013-01-01
The sharp regression discontinuity design (RDD) has three key weaknesses compared to the randomized clinical trial (RCT). It has lower statistical power, it is more dependent on statistical modeling assumptions, and its treatment effect estimates are limited to the narrow subpopulation of cases immediately around the cutoff, which is rarely of…
Camargos, Vitor Passos; César, Cibele Comini; Caiaffa, Waleska Teixeira; Xavier, Cesar Coelho; Proietti, Fernando Augusto
2011-12-01
Researchers in the health field often deal with incomplete databases. Complete Case Analysis (CCA), which restricts the analysis to subjects with complete data, reduces the sample size and may result in biased estimates. On statistical grounds, Multiple Imputation (MI) uses all collected data and is recommended as an alternative to CCA. Data from the Saúde em Beagá study, in which 4,048 adults from two of nine health districts in the city of Belo Horizonte, Minas Gerais State, Brazil, participated in 2008-2009, were used to evaluate CCA and different MI approaches in the context of logistic models with incomplete covariate data. Peculiarities of some variables in this study allowed analyzing a situation in which the missing covariate data were recovered, so that results before and after recovery could be compared. Based on the analysis, even the most simplistic MI approach performed better than CCA, since it was closer to the post-recovery results.
Diaz, Andres; Rossow, Stephanie; Ciarlet, Max; Marthaler, Douglas
2016-01-01
Rotaviruses (RV) are important causes of diarrhea in animals, especially domestic animals. Of the 9 RV species, rotavirus A, B, and C (RVA, RVB, and RVC, respectively) have been established as important causes of diarrhea in pigs. The Minnesota Veterinary Diagnostic Laboratory receives swine stool samples from North America to determine the etiologic agents of disease. Between November 2009 and October 2011, 7,508 samples from pigs with diarrhea were submitted to determine if enteric pathogens, including RV, were present in the samples. All samples were tested for RVA, RVB, and RVC by real time RT-PCR. The majority of the samples (82%) were positive for RVA, RVB, and/or RVC. To better understand the risk factors associated with RV infections in swine diagnostic samples, three-level mixed-effects logistic regression models (3L-MLMs) were used to estimate associations among RV species, age, and geographical variability within the major swine production regions in North America. The conditional odds ratios (cORs) for RVA and RVB detection were lower for 1–3 day old pigs when compared to any other age group. However, the cOR of RVC detection in 1–3 day old pigs was significantly higher (p < 0.001) than in pigs in the 4–20 day old and >55 day old age groups. Furthermore, pigs in the 21–55 day old age group had statistically higher cORs of RV co-detection compared to 1–3 day old pigs (p < 0.001). The 3L-MLMs indicated that RV status was more similar within states than among states or within each region. Our results indicated that 3L-MLMs are a powerful and adaptable tool to handle and analyze large hierarchical datasets. In addition, our results indicated that, overall, swine RV epidemiology is complex, and RV species are associated with different age groups and vary by region in North America. PMID:27145176
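The conditional odds ratios (cORs) reported above come straight from the fitted coefficients: OR = exp(beta), with a Wald 95% CI of exp(beta ± 1.96·SE). A sketch with an invented coefficient and standard error:

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald 95% confidence interval for one logistic coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Hypothetical age-group coefficient of 0.8 with standard error 0.2:
or_, lo, hi = odds_ratio_ci(0.8, 0.2)
print(round(or_, 2), round(lo, 2), round(hi, 2))  # -> 2.23 1.5 3.29
```

Since the CI excludes 1, the corresponding effect would be judged significant at the 5% level.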
NASA Astrophysics Data System (ADS)
WU, Chunhung
2015-04-01
The research built an original logistic regression landslide susceptibility model (abbreviated as or-LRLSM) and a landslide ratio-based logistic regression landslide susceptibility model (abbreviated as lr-LRLSM), compared the performance of the two models, and explained their error sources. The research assumes that the performance of the logistic regression model improves if the distribution of the landslide ratio and the weighted value of each variable are similar. The landslide ratio is the ratio of landslide area to total area in a specific area and a useful index for evaluating the seriousness of landslide disasters in Taiwan. The research adopted the landslide inventory induced by 2009 Typhoon Morakot in the Chishan watershed, Taiwan, the most serious disaster event there in the last decade. A 20 m grid was adopted as the basic unit in building the LRLSM, and six variables (elevation, slope, aspect, geological formation, accumulated rainfall, and bank erosion) were included in the two models. In building the or-LRLSM the variables were divided into continuous variables (elevation, slope, and accumulated rainfall) and categorical variables (aspect, geological formation, and bank erosion), while in building the lr-LRLSM all variables, classified based on landslide ratio, were categorical. Because the number of basic units in the Chishan watershed was too large to process with commercial software, the research used random sampling instead of the whole set of basic units, with equal proportions of landslide and non-landslide units in the logistic regression analysis. Ten random samples were drawn, and the group with the best Cox & Snell R2 and Nagelkerke R2 values was selected as the database for the following analysis. Based on the best result from the 10 random sampling groups, the or-LRLSM (lr-LRLSM) is significant at the 1% level with Cox & Snell R2 = 0.190 (0.196) and Nagelkerke R2
Chen, Cong; Zhang, Guohui; Liu, Xiaoyue Cathy; Ci, Yusheng; Huang, Helai; Ma, Jianming; Chen, Yanyan; Guan, Hongzhi
2016-12-01
There is a high potential for severe injury outcomes in traffic crashes on rural interstate highways due to the significant amount of high speed traffic on these corridors. Hierarchical Bayesian models are capable of incorporating between-crash variance and within-crash correlations into traffic crash data analysis and are increasingly utilized in traffic crash severity analysis. This paper applies a hierarchical Bayesian logistic model to examine the significant factors at crash and vehicle/driver levels and their heterogeneous impacts on driver injury severity in rural interstate highway crashes. Analysis results indicate that the majority of the total variance is induced by the between-crash variance, showing the appropriateness of the utilized hierarchical modeling approach. Three crash-level variables and six vehicle/driver-level variables are found significant in predicting driver injury severity: road curve, maximum vehicle damage in a crash, number of vehicles in a crash, wet road surface, vehicle type, driver age, driver gender, driver seatbelt use, and driver alcohol or drug involvement. Among these variables, road curve, functional and disabling vehicle damage in a crash, single-vehicle crashes, female drivers, senior drivers, motorcycles, and driver alcohol or drug involvement tend to increase the odds of drivers being incapacitated or killed in rural interstate crashes, while wet road surface, male drivers, and driver seatbelt use are more likely to decrease the probability of severe driver injuries. The developed methodology and estimation results provide insightful understanding of the internal mechanism of rural interstate crashes and beneficial references for developing effective countermeasures for rural interstate crash prevention.
Choi, Seung W.; Gibbons, Laura E.; Crane, Paul K.
2011-01-01
Logistic regression provides a flexible framework for detecting various types of differential item functioning (DIF). Previous efforts extended the framework by using item response theory (IRT) based trait scores, and by employing an iterative process using group–specific item parameters to account for DIF in the trait scores, analogous to purification approaches used in other DIF detection frameworks. The current investigation advances the technique by developing a computational platform integrating both statistical and IRT procedures into a single program. Furthermore, a Monte Carlo simulation approach was incorporated to derive empirical criteria for various DIF statistics and effect size measures. For purposes of illustration, the procedure was applied to data from a questionnaire of anxiety symptoms for detecting DIF associated with age from the Patient–Reported Outcomes Measurement Information System. PMID:21572908
Uchino, Makoto; Hirano, Teruyuki; Satoh, Hiroshi; Arimura, Kimiyoshi; Nakagawa, Masanori; Wakamiya, Jyunji
2005-01-01
Minamata disease (MD) was caused by ingestion of seafood from methylmercury-contaminated areas. Although 50 years have passed since the discovery of MD, there have been only a few studies on the temporal profile of neurological findings in certified MD patients. We therefore evaluated changes in the neurological symptoms and signs of MD using discriminants from multiple logistic regression analysis. The predictive index of severity declined over 25 years in most of the patients. Only a few patients showed aggravation of neurological findings, which was due to complications such as spinocerebellar degeneration. Patients with chronic MD aged over 45 years had several concomitant diseases, so their clinical pictures were complicated. It was difficult to differentiate chronic MD using statistically established discriminants based on sensory disturbance alone. In conclusion, the severity of MD declined over 25 years, along with modification by age-related concomitant disorders.
Dubovyk, Olena; Menz, Gunter; Conrad, Christopher; Kan, Elena; Machwitz, Miriam; Khamzina, Asia
2013-06-01
Advancing land degradation in the irrigated areas of Central Asia hinders sustainable development of this predominantly agricultural region. To support decisions on mitigating cropland degradation, this study combines linear trend analysis and spatial logistic regression modeling to expose a land degradation trend in the Khorezm region, Uzbekistan, and to analyze the causes. Time series of the 250-m MODIS NDVI, summed over the growing seasons of 2000-2010, were used to derive areas with an apparent negative vegetation trend; this was interpreted as an indicator of land degradation. About one third (161,000 ha) of the region's area experienced negative trends of different magnitude. The vegetation decline was particularly evident on the low-fertility lands bordering on the natural sandy desert, suggesting that these areas should be prioritized in mitigation planning. The results of logistic modeling indicate that the spatial pattern of the observed trend is mainly associated with the level of the groundwater table (odds = 330%), land-use intensity (odds = 103%), low soil quality (odds = 49%), slope (odds = 29%), and salinity of the groundwater (odds = 26%). Areas threatened by land degradation were mapped by fitting the estimated model parameters to available data. The elaborated approach, combining remote sensing and GIS, can form the basis for developing a common tool for monitoring land degradation trends in irrigated croplands of Central Asia.
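The "odds = 330%" style of reporting above is the percent change in the odds per unit change of a predictor, i.e. (e^beta - 1)*100. A one-line helper (the coefficient below is back-calculated for illustration, not taken from the paper):

```python
import math

def pct_change_in_odds(beta):
    """Percent change in the odds per unit increase of a predictor."""
    return (math.exp(beta) - 1.0) * 100.0

# A coefficient of log(4.3), about 1.459, corresponds to a 330% increase in the odds:
print(round(pct_change_in_odds(math.log(4.3)), 1))  # -> 330.0
```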
Chan, Kung-Sik; Jiao, Feiran; Mikulski, Marek A.; Gerke, Alicia; Guo, Junfeng; Newell, John D; Hoffman, Eric A.; Thompson, Brad; Lee, Chang Hyun; Fuortes, Laurence J.
2015-01-01
Rationale and Objectives: We evaluated the role of an automated quantitative computed tomography (CT) scan interpretation algorithm in detecting interstitial lung disease (ILD) and/or emphysema in a sample of elderly subjects with mild lung disease. We hypothesized that the quantification and distribution of CT attenuation values on lung CT, over a subset of the Hounsfield unit (HU) range [-1000 HU, 0 HU], can differentiate early or mild disease from normal lung. Materials and Methods: We compared results of quantitative spiral rapid end-exhalation (functional residual capacity; FRC) and end-inhalation (total lung capacity; TLC) CT scan analyses in 52 subjects with radiographic evidence of mild fibrotic lung disease to 17 normal subjects. Several CT value distributions were explored, including (i) that from the peripheral lung taken at TLC (with peels at 15 or 65 mm), (ii) the ratio of (i) to that from the core of the lung, and (iii) the ratio of (ii) to its FRC counterpart. We developed a fused-lasso logistic regression model that can automatically identify sub-intervals of [-1000 HU, 0 HU] over which a CT value distribution provides optimal discrimination between abnormal and normal scans. Results: The fused-lasso logistic regression model based on (ii) with a 15 mm peel identified the relative frequency of CT values over [-1000, -900] HU and over [-450, -200] HU as a means of discriminating abnormal versus normal scans, resulting in a zero out-of-sample false positive rate and a 15% false negative rate that was lowered to 12% by pooling information. Conclusions: We demonstrated the potential usefulness of this novel quantitative imaging analysis method in discriminating ILD and/or emphysema from normal lungs. PMID:26776294
Robertson, John M.; Soehn, Matthias; Yan Di
2010-05-01
Purpose: Understanding the dose-volume relationship of small bowel irradiation and severe acute diarrhea may help reduce the incidence of this side effect during adjuvant treatment for rectal cancer. Methods and Materials: Consecutive patients treated curatively for rectal cancer were reviewed, and the maximum grade of acute diarrhea was determined. The small bowel was outlined on the treatment planning CT scan, and a dose-volume histogram was calculated for the initial pelvic treatment (45 Gy). Logistic regression models were fitted for varying cutoff-dose levels from 5 to 45 Gy in 5-Gy increments. The model with the highest LogLikelihood was used to develop a cutoff-dose normal tissue complication probability (NTCP) model. Results: There were a total of 152 patients (48% preoperative, 47% postoperative, 5% other), predominantly treated prone (95%) with a three-field technique (94%) and a protracted venous infusion of 5-fluorouracil (78%). Acute Grade 3 diarrhea occurred in 21%. The largest LogLikelihood was found for the cutoff-dose logistic regression model with 15 Gy as the cutoff-dose, although the models for 20 Gy and 25 Gy had similar significance. According to this model, highly significant correlations (p <0.001) between small bowel volumes receiving at least 15 Gy and toxicity exist in the considered patient population. Similar findings applied to both the preoperatively (p = 0.001) and postoperatively irradiated groups (p = 0.001). Conclusion: The incidence of Grade 3 diarrhea was significantly correlated with the volume of small bowel receiving at least 15 Gy using a cutoff-dose NTCP model.
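The cutoff-dose selection above amounts to fitting one univariate logistic model per candidate cutoff dose and keeping the cutoff whose model attains the highest log-likelihood. A toy sketch (the volumes and toxicity flags are invented; the maximization uses a coarse grid search rather than the study's fitting procedure):

```python
import math

def log_likelihood(x, y, b0, b1):
    """Bernoulli log-likelihood of a univariate logistic model."""
    ll = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)          # guard the logs
        ll += yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return ll

def max_ll(x, y):
    """Crude grid-search approximation to the maximized log-likelihood."""
    return max(log_likelihood(x, y, b0, b1 / 2.0)
               for b0 in range(-40, 41) for b1 in range(-20, 21))

toxicity = [0, 0, 0, 1, 1, 1]
# Volume of small bowel receiving at least the candidate cutoff dose, per patient:
volume_by_cutoff = {15: [1, 2, 3, 4, 5, 6],   # separates toxic from non-toxic cleanly
                    30: [0, 1, 2, 1, 2, 3]}   # overlapping, weaker model
best_cutoff = max(volume_by_cutoff, key=lambda c: max_ll(volume_by_cutoff[c], toxicity))
print(best_cutoff)  # -> 15
```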
Snyder, Marcia; Freeman, Mary C.; Purucker, S. Thomas; Pringle, Catherine M.
2016-01-01
Freshwater shrimps are an important biotic component of tropical ecosystems. However, they can have a low probability of detection when abundances are low. We sampled 3 of the most common freshwater shrimp species, Macrobrachium olfersii, Macrobrachium carcinus, and Macrobrachium heterochirus, and used occupancy modeling and logistic regression models to improve our limited knowledge of distribution of these cryptic species by investigating both local- and landscape-scale effects at La Selva Biological Station in Costa Rica. Local-scale factors included substrate type and stream size, and landscape-scale factors included presence or absence of regional groundwater inputs. Capture rates for 2 of the sampled species (M. olfersii and M. carcinus) were sufficient to compare the fit of occupancy models. Occupancy models did not converge for M. heterochirus, but M. heterochirus had high enough occupancy rates that logistic regression could be used to model the relationship between occupancy rates and predictors. The best-supported models for M. olfersii and M. carcinus included conductivity, discharge, and substrate parameters. Stream size was positively correlated with occupancy rates of all 3 species. High stream conductivity, which reflects the quantity of regional groundwater input into the stream, was positively correlated with M. olfersii occupancy rates. Boulder substrates increased occupancy rate of M. carcinus and decreased the detection probability of M. olfersii. Our models suggest that shrimp distribution is driven by factors that function at local (substrate and discharge) and landscape (conductivity) scales.
Additional results on 'Reducing geometric dilution of precision using ridge regression'
NASA Astrophysics Data System (ADS)
Kelly, Robert J.
1990-07-01
Kelly (1990) presented preliminary results on the feasibility of using ridge regression (RR) to reduce the effects of geometric dilution of precision (GDOP) error inflation in position-fix navigation systems. Recent results indicate that RR will not reduce GDOP bias inflation when biaslike measurement errors last much longer than the aircraft guidance-loop response time. This conclusion precludes the use of RR on navigation systems whose dominant error sources are biaslike; e.g., the GPS selective-availability error source. The simulation results given by Kelly are, however, valid for the conditions defined. Although RR has not yielded a satisfactory solution to the general GDOP problem, it has illuminated the role that multicollinearity plays in navigation signal processors such as the Kalman filter. Bias inflation, initial position guess errors, ridge-parameter selection methodology, and the recursive ridge filter are discussed.
Analysis of the Potential Impact of Additive Manufacturing on Army Logistics
2013-12-01
highlighting the state of AM during the Leading Edge Forum titled "3D Printing and the Future of Additive Manufacturing." In the program, CSC highlighted ... great success is Boeing. CSC (2012) described Boeing's experience using AM as follows: Boeing, a pioneer in 3D printing, has printed 22,000 ... components that are used in a variety of aircraft. For example, Boeing has used 3D printing to produce environmental control ducting (ECD) for its new 787
NASA Astrophysics Data System (ADS)
Lombardo, L.; Cama, M.; Maerker, M.; Parisi, L.; Rotigliano, E.
2014-12-01
This study aims at comparing the performances of Binary Logistic Regression (BLR) and Boosted Regression Trees (BRT) methods in assessing landslide susceptibility for multiple-occurrence regional landslide events within the Mediterranean region. A test area was selected in the north-eastern sector of Sicily (southern Italy), corresponding to the catchments of the Briga and the Giampilieri streams, both stretching for a few kilometres from the Peloritan ridge (eastern Sicily) to the Ionian Sea. This area was struck on the 1st October 2009 by an extreme climatic event resulting in thousands of rapid shallow landslides, mainly debris flows and debris avalanches, involving the weathered layer of a low- to high-grade metamorphic bedrock. Exploiting the same set of predictors and the 2009 landslide archive, BLR- and BRT-based susceptibility models were obtained for the two catchments separately, adopting a random partition (RP) technique for validation; in addition, the models trained in one of the two catchments (Briga) were tested in predicting the landslide distribution in the other (Giampilieri), adopting a spatial partition (SP) based validation procedure. All the validation procedures were based on multi-fold tests so as to evaluate and compare the reliability of the fitting, the prediction skill, the coherence in predictor selection and the precision of the susceptibility estimates. All the obtained models for the two methods produced very high predictive performances, with general congruence between BLR and BRT in predictor importance. In particular, the research highlighted that BRT models reached a higher prediction performance than BLR models for RP-based modelling, whilst for the SP-based models the difference in predictive skill between the two methods dropped drastically, converging to an analogous excellent performance. However, when looking at the precision of the probability estimates, BLR demonstrated to produce more robust
O'Dwyer, Jean; Morris Downes, Margaret; Adley, Catherine C
2016-02-01
This study analyses the relationship between meteorological phenomena and outbreaks of waterborne-transmitted vero cytotoxin-producing Escherichia coli (VTEC) in the Republic of Ireland over an 8-year period (2005-2012). Data pertaining to the notification of waterborne VTEC outbreaks were extracted from the Computerised Infectious Disease Reporting system, which is administered through the national Health Protection Surveillance Centre as part of the Health Service Executive. Rainfall and temperature data were obtained from the national meteorological office and categorised as cumulative rainfall, heavy rainfall events in the previous 7 days, and mean temperature. Analysis was performed using logistic regression (LR). The LR model was significant (p < 0.001), with all independent variables (cumulative rainfall, heavy rainfall and mean temperature) making a statistically significant contribution to the model. The study found that rainfall, particularly heavy rainfall in the 7 days preceding an outbreak, is a strong statistical indicator of a waterborne outbreak and that temperature also impacts waterborne VTEC outbreak occurrence.
2009-01-01
Background: There is a growing awareness that interaction between multiple genes plays an important role in the risk of common, complex multi-factorial diseases. Many common diseases are affected by certain genotype combinations (associated with some genes and their interactions). The identification and characterization of these susceptibility genes and gene-gene interactions have been limited by small sample sizes and the large number of potential interactions between genes. Several methods have been proposed to detect gene-gene interaction in case-control studies. Penalized logistic regression (PLR), a variant of logistic regression with L2 regularization, is a parametric approach to detect gene-gene interaction. On the other hand, Multifactor Dimensionality Reduction (MDR) is a nonparametric, genetic model-free approach to detect genotype combinations associated with disease risk. Methods: We compared the power of MDR and PLR for detecting two-way and three-way interactions in a case-control study through extensive simulations. We generated several interaction models with different magnitudes of interaction effect. For each model, we simulated 100 datasets, each with 200 cases, 200 controls and 20 SNPs. We considered a wide variety of models, such as models with just main effects, models with only interaction effects, or models with both main and interaction effects. We also compared the performance of MDR and PLR in detecting gene-gene interaction associated with acute rejection (AR) in kidney transplant patients. Results: We compared the performance of MDR and PLR for different two-way and three-way interaction models, studied the effect of different allele frequencies on these methods, and evaluated their performance on a real dataset. As expected, none of these methods was consistently better for all
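The L2 penalty that defines PLR enters the fit as a shrinkage term in the gradient: each step maximizes the log-likelihood minus (lambda/2)*||w||^2. A self-contained sketch on synthetic data (effect sizes and lambda are invented; the intercept is left unpenalized, a common convention):

```python
import math
import random

def fit_l2_logistic(X, y, lam=0.0, lr=0.5, epochs=1500):
    """Gradient ascent on average log-likelihood minus (lam/2)*||w||^2 (intercept unpenalized)."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(epochs):
        g = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            z = w[0] + sum(a * b for a, b in zip(w[1:], xi))
            err = yi - 1.0 / (1.0 + math.exp(-z))
            g[0] += err
            for j in range(p):
                g[j + 1] += err * xi[j]
        w[0] += lr * g[0] / n
        for j in range(p):
            w[j + 1] += lr * (g[j + 1] / n - lam * w[j + 1])   # L2 shrinkage term
    return w

random.seed(0)
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(200)]
y = [1 if random.random() < 1 / (1 + math.exp(-(1.5 * a - 1.0 * b))) else 0
     for a, b in X]
norm = lambda w: sum(v * v for v in w[1:]) ** 0.5
w_mle = fit_l2_logistic(X, y)            # unpenalized fit
w_pen = fit_l2_logistic(X, y, lam=1.0)   # penalized fit
print(norm(w_pen) < norm(w_mle))  # the penalty shrinks the coefficient vector
```

The same shrinkage is what lets PLR tolerate many candidate interaction terms without the estimates exploding in small samples.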
NASA Astrophysics Data System (ADS)
Nahib, Irmadi; Suryanta, Jaka
2017-01-01
Forest destruction, climate change and global warming could reduce indirect forest benefits, because forests are the largest carbon sink and play a very important role in the global carbon cycle. To support the Reducing Emissions from Deforestation and Forest Degradation (REDD+) program, attention is paid to forest cover changes as the basis for calculating carbon stock changes. This study explores forest cover dynamics as well as a prediction model of forest cover in Indragiri Hulu Regency, Riau Province, Indonesia. The study aims to analyse various explanatory variables associated with forest conversion processes and to predict forest cover change using a logistic regression model (LRM). The main data used in this study are land use/cover maps (1990-2011). Performance of the developed model was assessed through a comparison of the predicted forest cover and the actual forest cover in 2011. The analysis showed that forest cover decreased continuously between 1990 and 2011, amounting to a loss of 165,284.82 ha (35.19%) of forest area. The LRM successfully predicted the forest cover for 2010 with reasonably high accuracy (ROC = 92.97% and 70.26%).
NASA Astrophysics Data System (ADS)
Sidi, P.; Mamat, M.; Sukono; Supian, S.
2017-01-01
Floods have always occurred in the Citarum river basin. The adverse effects of floods can extend to all of a household's property, including the destruction of houses, and the impact of damage to residential buildings is usually not small. After each flood, the government and several social organizations provide funds to repair buildings, but these donations are very limited and cannot cover the entire cost of the necessary repairs. The presence of insurance products for property damage caused by floods is therefore considered very important; but is their presence also considered necessary by the public? In this paper, the factors that affect the supply and demand of insurance products for buildings damaged by floods are analyzed. The method used in this analysis is ordinal logistic regression. The analysis shows that these factors include age, economic circumstances, family situation, insurance motivation, and lifestyle. Taken simultaneously, the factors affecting supply and demand of insurance products for buildings damaged by floods amount to 65.7%.
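Ordinal logistic regression as used above is typically the proportional-odds model: the cumulative probabilities P(Y <= k) = sigmoid(theta_k - eta) share one linear predictor eta, and category probabilities come from differencing. A sketch with invented thresholds:

```python
import math

def category_probs(thresholds, eta):
    """Proportional-odds model: P(Y<=k) = sigmoid(theta_k - eta); difference to get P(Y=k)."""
    cum = [1.0 / (1.0 + math.exp(-(t - eta))) for t in thresholds] + [1.0]
    return [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, len(cum))]

# Hypothetical thresholds for a 4-level ordinal response
# (e.g. stated willingness to buy flood insurance) and one respondent's eta:
p = category_probs([-1.0, 0.5, 2.0], eta=0.3)
print([round(v, 3) for v in p], round(sum(p), 3))  # probabilities sum to 1.0
```

One set of covariate effects shifts eta, moving probability mass across all ordered categories at once, which is exactly why a single coefficient per factor suffices in the analysis above.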
Allahyari, Elahe; Jafari, Peyman; Bagheri, Zahra
2016-01-01
Objective. The present study uses simulated data to find what the optimal number of response categories is to achieve adequate power in ordinal logistic regression (OLR) model for differential item functioning (DIF) analysis in psychometric research. Methods. A hypothetical ten-item quality of life scale with three, four, and five response categories was simulated. The power and type I error rates of OLR model for detecting uniform DIF were investigated under different combinations of ability distribution (θ), sample size, sample size ratio, and the magnitude of uniform DIF across reference and focal groups. Results. When θ was distributed identically in the reference and focal groups, increasing the number of response categories from 3 to 5 resulted in an increase of approximately 8% in power of OLR model for detecting uniform DIF. The power of OLR was less than 0.36 when ability distribution in the reference and focal groups was highly skewed to the left and right, respectively. Conclusions. The clearest conclusion from this research is that the minimum number of response categories for DIF analysis using OLR is five. However, the impact of the number of response categories in detecting DIF was lower than might be expected. PMID:27403207
Enticott, Joanne C; Cheng, I-Hao; Russell, Grant; Szwarc, Josef; Braitberg, George; Peek, Anne; Meadows, Graham
2015-01-01
This study investigated whether people born in refugee source countries are disproportionately represented among those receiving a diagnosis of mental illness within emergency departments (EDs). The setting was the Cities of Greater Dandenong and Casey, the resettlement region for one-twelfth of Australia's refugees. An epidemiological, secondary data analysis compared mental illness diagnoses received in EDs by refugee and non-refugee populations. The data source was the Victorian Emergency Minimum Dataset for the 2008-09 financial year. Univariate and multivariate logistic regression created predictive models for mental illness using five variables: age, sex, refugee background, interpreter use and preferred language. Collinearity, model fit and model stability were examined. Multivariate analysis showed age and sex to be the only significant risk factors for a mental illness diagnosis in EDs. 'Refugee status', 'interpreter use' and 'preferred language' were not associated with a mental health diagnosis after risk adjustment for the effects of age and sex. The disappearance of the univariate association after adjustment for age and sex is a salutary lesson for Medicare Locals and other health planners regarding the importance of adjusting analyses of health service data for demographic characteristics.
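The vanishing of a univariate association after adjustment for age and sex, as reported above, is classic confounding, and stratified 2x2 tables show the mechanism. A sketch with hypothetical counts in which each age stratum has an odds ratio of exactly 1, yet the pooled (crude) table suggests an association:

```python
# Sketch: a crude (univariate) odds ratio that vanishes after stratifying
# on a confounder such as age. All counts are hypothetical.
# Table order: (exposed cases, exposed controls, unexposed cases, unexposed controls)

def odds_ratio(a, b, c, d):
    """OR for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)

# Young stratum: exposure unrelated to outcome (OR = 1), low baseline risk
young = (10, 90, 20, 180)
# Old stratum: exposure unrelated to outcome (OR = 1), high baseline risk
old = (60, 40, 30, 20)

# Pooling the strata mixes the baseline risks and creates a spurious OR
crude = tuple(x + y for x, y in zip(young, old))
print(round(odds_ratio(*young), 2),   # → 1.0
      round(odds_ratio(*old), 2),     # → 1.0
      round(odds_ratio(*crude), 2))   # → 2.15
```

In a logistic regression, including the confounder as a covariate plays the same role as stratification, which is why the refugee-status effect disappeared once age and sex entered the model.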
Hung, J.; Chaitman, B.R.; Lam, J.; Lesperance, J.; Dupras, G.; Fines, P.; Cherkaoui, O.; Robert, P.; Bourassa, M.G.
1985-08-01
The incremental diagnostic yield of clinical data, exercise ECG, stress thallium scintigraphy, and cardiac fluoroscopy to predict coronary and multivessel disease was assessed in 171 symptomatic men by means of multiple logistic regression analyses. When clinical variables alone were analyzed, chest pain type and age were predictive of coronary disease, whereas chest pain type, age, a family history of premature coronary disease before age 55 years, and abnormal ST-T wave changes on the rest ECG were predictive of multivessel disease. The percentage of patients correctly classified by cardiac fluoroscopy (presence or absence of coronary artery calcification), exercise ECG, and thallium scintigraphy was 9%, 25%, and 50%, respectively, greater than for clinical variables, when the presence or absence of coronary disease was the outcome, and 13%, 25%, and 29%, respectively, when multivessel disease was studied; 5% of patients were misclassified. When the 37 clinical and noninvasive test variables were analyzed jointly, the most significant variable predictive of coronary disease was an abnormal thallium scan and for multivessel disease, the amount of exercise performed. The data from this study provide a quantitative model and confirm previous reports that optimal diagnostic efficacy is obtained when noninvasive tests are ordered sequentially. In symptomatic men, cardiac fluoroscopy is a relatively ineffective test when compared to exercise ECG and thallium scintigraphy.
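Incremental diagnostic yield of the kind assessed above is conventionally tested by comparing nested logistic models with a likelihood-ratio test: twice the gain in log-likelihood from adding a test variable is referred to a chi-square distribution. A minimal sketch on a small hypothetical dataset (Newton-Raphson fit; the p-value uses the closed form of the 1-df chi-square survival function):

```python
import math
import numpy as np

# Sketch: likelihood-ratio test for the incremental yield of one extra
# predictor in a logistic model. The tiny dataset below is hypothetical.

def fit_logistic(X, y, iters=50):
    """Newton-Raphson MLE; returns coefficients and log-likelihood."""
    X = np.column_stack([np.ones(len(y)), X])          # add intercept
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1 / (1 + np.exp(-X @ beta))
        W = p * (1 - p)                                # IRLS weights
        H = X.T * W @ X + 1e-9 * np.eye(X.shape[1])    # ridge for stability
        beta += np.linalg.solve(H, X.T @ (y - p))
    p = 1 / (1 + np.exp(-X @ beta))
    ll = float(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))
    return beta, ll

y  = np.array([0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0])          # disease present?
x1 = np.array([4.5, 5.0, 5.2, 6.0, 4.8, 6.5,
               7.0, 5.8, 4.6, 6.2, 6.8, 5.5])                # age in decades
x2 = np.array([0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1])          # abnormal test?

_, ll_reduced = fit_logistic(x1.reshape(-1, 1), y)           # clinical only
_, ll_full = fit_logistic(np.column_stack([x1, x2]), y)      # + noninvasive test
lr_stat = 2 * (ll_full - ll_reduced)
p_value = math.erfc(math.sqrt(lr_stat / 2))                  # chi-square, 1 df
print(round(lr_stat, 3), round(p_value, 3))
```

Ordering the candidate tests and repeating this comparison at each step is what "sequential" addition of noninvasive tests amounts to in the abstract.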
Mumcu Kucuker, Derya; Baskent, Emin Zeki
2015-01-01
Integration of non-wood forest products (NWFPs) into forest management planning has become an increasingly important issue in forestry over the last decade. Among NWFPs, mushrooms are valued due to their medicinal, commercial, high nutritional and recreational importance. Commercial mushroom harvesting also provides important income to local dwellers and contributes to the economic value of regional forests. Sustainable management of these products at the regional scale requires information on their locations in diverse forest settings and the ability to predict and map their spatial distributions over the landscape. This study focuses on modeling the spatial distribution of commercially harvested Lactarius deliciosus and L. salmonicolor mushrooms in the Kızılcasu Forest Planning Unit, Turkey. The best models were developed based on topographic, climatic and stand characteristics, separately through logistic regression analysis using SPSS™. The best topographic model provided better classification success (69.3 %) than the best climatic (65.4 %) and stand (65 %) models. However, the overall best model, with 73 % overall classification success, used a mix of several variables. The best models were integrated into an Arc/Info GIS program to create spatial distribution maps of L. deliciosus and L. salmonicolor in the planning area. Our approach may be useful to predict the occurrence and distribution of other NWFPs and provide a valuable tool for designing silvicultural prescriptions and preparing multiple-use forest management plans.
JACOB, BENJAMIN G.; NOVAK, ROBERT J.; TOE, LAURENT; SANFO, MOUSSA S.; AFRIYIE, ABENA N.; IBRAHIM, MOHAMMED A.; GRIFFITH, DANIEL A.; UNNASCH, THOMAS R.
2013-01-01
The standard methods for regression analyses of clustered riverine larval habitat data of Simulium damnosum s.l., a major black-fly vector of onchocerciasis, postulate models relating observational ecologically sampled parameter estimators to prolific habitats without accounting for residual intra-cluster error correlation effects. Generally, this correlation comes from two sources: (1) the design of the random effects and their assumed covariance from the multiple levels within the regression model; and (2) the correlation structure of the residuals. Unfortunately, inconspicuous errors in residual intra-cluster correlation estimates can overstate precision in forecasted S. damnosum s.l. riverine larval habitat explanatory attributes regardless of how they are treated (e.g., independent, autoregressive, Toeplitz, etc.). In this research, the geographical locations of multiple riverine-based S. damnosum s.l. larval ecosystem habitats sampled from two pre-established epidemiological sites in Togo were identified and recorded from July 2009 to June 2010. Initially, the data were aggregated in PROC GENMOD. An agglomerative hierarchical residual cluster-based analysis was then performed. The sampled clustered study site data were then analyzed for statistical correlations using Monthly Biting Rates (MBR). Euclidean distance measurements and terrain-related geomorphological statistics were then generated in ArcGIS. A digital overlay was also performed in ArcGIS using the georeferenced ground coordinates of high- and low-density clusters stratified by Annual Biting Rates (ABR). These data were overlain onto multitemporal sub-meter pixel resolution satellite data (i.e., QuickBird 0.61 m wavebands). Orthogonal spatial filter eigenvectors were then generated in SAS/GIS. Univariate and non-linear regression-based models (i.e., logistic, Poisson and negative binomial) were also employed to determine probability distributions and to identify statistically significant parameter
Unitary Response Regression Models
ERIC Educational Resources Information Center
Lipovetsky, S.
2007-01-01
The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…
Mendes, Renata G.; de Souza, César R.; Machado, Maurício N.; Correa, Paulo R.; Di Thommazo-Luporini, Luciana; Arena, Ross; Myers, Jonathan; Pizzolato, Ednaldo B.
2015-01-01
Introduction In coronary artery bypass (CABG) surgery, the common complications are the need for reintubation, prolonged mechanical ventilation (PMV) and death. Thus, a reliable model for the prognostic evaluation of those particular outcomes is a worthwhile pursuit. The existence of such a system would lead to better resource planning, cost reductions and an increased ability to guide preventive strategies. The aim of this study was to compare different methods – logistic regression (LR) and artificial neural networks (ANNs) – in accomplishing this goal. Material and methods Subjects undergoing CABG (n = 1315) were divided into training (n = 1053) and validation (n = 262) groups. The set of independent variables consisted of age, gender, weight, height, body mass index, diabetes, creatinine level, cardiopulmonary bypass, presence of preserved ventricular function, moderate and severe ventricular dysfunction and total number of grafts. The PMV was also an input for the prediction of death. The ability of ANN to discriminate outcomes was assessed using receiver-operating characteristic (ROC) analysis and the results were compared using a multivariate LR. Results The ROC curve areas for LR and ANN models, respectively, were: for reintubation 0.62 (CI: 0.50–0.75) and 0.65 (CI: 0.53–0.77); for PMV 0.67 (CI: 0.57–0.78) and 0.72 (CI: 0.64–0.81); and for death 0.86 (CI: 0.79–0.93) and 0.85 (CI: 0.80–0.91). No differences were observed between models. Conclusions The ANN has similar discriminating power in predicting reintubation, PMV and death outcomes. Thus, both models may be applicable as a predictor for these outcomes in subjects undergoing CABG. PMID:26322087
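Comparing ROC curve areas with confidence intervals, as the study above does for LR versus ANN, can be sketched with a percentile bootstrap of the AUC difference; overlapping intervals (or a difference interval containing zero) support the "no difference" conclusion. All scores below are hypothetical:

```python
import random

# Sketch: percentile-bootstrap CI for the difference between two models'
# ROC areas (here labeled "ANN" and "LR"). Labels and scores are hypothetical.

def auc(y, s):
    """Rank-based AUC: fraction of positive/negative pairs ranked correctly."""
    pos = [si for yi, si in zip(y, s) if yi == 1]
    neg = [si for yi, si in zip(y, s) if yi == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y     = [1, 0, 1, 0, 1, 1, 0, 0, 1, 0]
lr_s  = [0.8, 0.3, 0.6, 0.4, 0.7, 0.25, 0.2, 0.45, 0.9, 0.35]
ann_s = [0.85, 0.2, 0.7, 0.5, 0.6, 0.45, 0.3, 0.4, 0.95, 0.25]

random.seed(0)
diffs, n = [], len(y)
for _ in range(1000):
    idx = [random.randrange(n) for _ in range(n)]
    yb = [y[i] for i in idx]
    if 0 < sum(yb) < n:                      # resample must contain both classes
        diffs.append(auc(yb, [ann_s[i] for i in idx])
                     - auc(yb, [lr_s[i] for i in idx]))
diffs.sort()
lo, hi = diffs[int(0.025 * len(diffs))], diffs[int(0.975 * len(diffs))]
print(round(auc(y, ann_s) - auc(y, lr_s), 3), (round(lo, 3), round(hi, 3)))
```

With samples this small the interval is wide, which is exactly why a nominally higher AUC (0.96 vs. 0.84 here) need not be a statistically meaningful improvement.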
Obeso, José M.; García, Pilar; Martínez, Beatriz; Arroyo-López, Francisco Noé; Garrido-Fernández, Antonio; Rodriguez, Ana
2010-01-01
The use of bacteriophages provides an attractive approach to the fight against food-borne pathogenic bacteria, since they can be found in different environments and are unable to infect humans, characteristics that support their use as biocontrol agents. Two lytic bacteriophages, vB_SauS-phiIPLA35 (phiIPLA35) and vB_SauS-phiIPLA88 (phiIPLA88), previously isolated from the dairy environment, inhibited the growth of Staphylococcus aureus. To facilitate the successful application of both bacteriophages as biocontrol agents, probabilistic models for predicting S. aureus inactivation by the phages in pasteurized milk were developed. A linear logistic regression procedure was used to describe the survival/death interface of S. aureus after 8 h of storage as a function of the initial phage titer (2 to 8 log10 PFU/ml), initial bacterial contamination (2 to 6 log10 CFU/ml), and temperature (15 to 37°C). Two successive models were built: the first included only data from the experimental design, and a global one also included results derived from the validation experiments. The temperature, the temperature-initial bacterial contamination interaction, and the initial bacterial contamination-phage titer interaction contributed significantly to the first model's prediction. However, only the phage titer and temperature were significantly involved in the global model's prediction. The predictions of both models were fail-safe and highly consistent with the observed S. aureus responses. Nevertheless, the global model, deduced from a larger number of experiments (with a higher degree of freedom), depended on fewer variables and had an apparently better fit; it can therefore be considered a convenient evolution of the first model. In addition, the global model provides the minimum phage concentration (about 2 × 108 PFU/ml) required to inactivate S. aureus in milk at different temperatures, irrespective of the bacterial contamination level. PMID
2013-01-01
Background Osteoporotic hip fractures with a significant morbidity and excess mortality among the elderly have imposed huge health and economic burdens on societies worldwide. In this age- and sex-matched case control study, we examined the risk factors of hip fractures and assessed the fracture risk by conditional logistic regression (CLR) and ensemble artificial neural network (ANN). The performances of these two classifiers were compared. Methods The study population consisted of 217 pairs (149 women and 68 men) of fractures and controls with an age older than 60 years. All the participants were interviewed with the same standardized questionnaire including questions on 66 risk factors in 12 categories. Univariate CLR analysis was initially conducted to examine the unadjusted odds ratio of all potential risk factors. The significant risk factors were then tested by multivariate analyses. For fracture risk assessment, the participants were randomly divided into modeling and testing datasets for 10-fold cross validation analyses. The predicting models built by CLR and ANN in modeling datasets were applied to testing datasets for generalization study. The performances, including discrimination and calibration, were compared with non-parametric Wilcoxon tests. Results In univariate CLR analyses, 16 variables achieved significant level, and six of them remained significant in multivariate analyses, including low T score, low BMI, low MMSE score, milk intake, walking difficulty, and significant fall at home. For discrimination, ANN outperformed CLR in both 16- and 6-variable analyses in modeling and testing datasets (p?
Campillo-Gimenez, Boris; Jouini, Wassim; Bayat, Sahar; Cuggia, Marc
2013-01-01
Introduction Case-based reasoning (CBR) is an emerging decision-making paradigm in medical research in which new cases are solved by relying on previously solved similar cases. Usually, a database of solved cases is provided, and every case is described through a set of attributes (inputs) and a label (output). Extracting useful information from this database can help the CBR system provide more reliable results for the cases yet to be solved. Objective We suggest a general framework in which a CBR system, viz. the K-Nearest Neighbour (K-NN) algorithm, is combined with various information obtained from a Logistic Regression (LR) model, in order to improve prediction of access to the transplant waiting list. Methods LR is applied to the case database to assign weights to the attributes as well as to the solved cases. Five possible decision-making systems based on K-NN and/or LR were thus identified: a standalone K-NN, a standalone LR, and three soft K-NN algorithms that rely on weights derived from the results of the LR. The evaluation was performed under two conditions: either using predictive factors known to be related to registration, or using a combination of factors related and not related to registration. Results and Conclusion The results show that our suggested approach, in which the K-NN algorithm relies on both weighted attributes and weighted cases, can efficiently deal with non-relevant attributes, whereas the four other approaches suffer in this kind of noisy setup. The robustness of this approach suggests interesting perspectives for medical problem-solving tools using the CBR methodology. PMID:24039730
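The "soft K-NN" idea above, letting logistic-regression output shape the K-NN distance, can be sketched as follows. The attribute weights stand in for |LR coefficient| magnitudes, and both the weights and the case data are hypothetical:

```python
import math

# Sketch: K-NN whose distance weights each attribute by the magnitude of
# its logistic-regression coefficient, so attributes the LR found
# irrelevant contribute little. Weights and cases are hypothetical.

def weighted_knn_predict(query, cases, labels, weights, k=3):
    """Majority vote among the k cases nearest in |beta|-weighted distance."""
    def dist(a, b):
        return math.sqrt(sum(w * (x - y) ** 2
                             for w, x, y in zip(weights, a, b)))
    ranked = sorted(range(len(cases)), key=lambda i: dist(query, cases[i]))
    votes = [labels[i] for i in ranked[:k]]
    return max(set(votes), key=votes.count)

# Solved cases: (age/10, sex, comorbidity score); third attribute is noise
cases = [(3.0, 0, 5), (3.2, 0, 1), (6.1, 1, 9), (5.9, 1, 2), (6.4, 0, 7)]
labels = [0, 0, 1, 1, 1]                   # 1 = registered on waiting list
weights = [1.8, 0.9, 0.05]                 # |LR coefficients|: last ~irrelevant

print(weighted_knn_predict((6.0, 1, 1), cases, labels, weights))  # → 1
```

With unit weights the noisy third attribute would dominate the distance; down-weighting it by the LR coefficient is what makes the hybrid robust to non-relevant attributes.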
Jahandideh, Samad; Abdolmaleki, Parviz; Movahedi, Mohammad Mehdi
2010-02-01
Various studies have been reported on the bioeffects of magnetic field exposure; however, no consensus or guideline is yet available for experimental designs relating to exposure conditions. In this study, logistic regression (LR) and artificial neural networks (ANNs) were used to analyze and predict melatonin excretion patterns in rats exposed to extremely low frequency magnetic fields (ELF-MF). Subsequently, on a database containing 33 experiments, the performances of LR and ANNs were compared through resubstitution and jackknife tests. Predictor variables were the more influential parameters: frequency, polarization, exposure duration, and strength of the magnetic fields. Five performance measures, including accuracy, sensitivity, specificity, the Matthews Correlation Coefficient (MCC) and the normalized percentage better than random (S), were used to evaluate the performance of the models. The LR, as a conventional model, obtained poor prediction performance. Nonetheless, LR distinguished the duration of magnetic fields as a statistically significant parameter, and horizontal polarization of magnetic fields, with the highest logit coefficient (parameter estimate) with negative sign, was found to be the strongest indicator for experimental designs relating to exposure conditions. This means that an experiment with horizontal polarization of the magnetic field has a higher probability of resulting in a "not changed melatonin level" pattern. On the other hand, ANNs, a more powerful model that had not previously been applied to predicting melatonin excretion patterns in rats exposed to ELF-MF, showed high performance measure values and higher reliability, notably attaining an MCC of 0.55 in jackknife tests. The results show that such predictive models are promising and may play a useful role in defining guidelines for experimental designs relating to exposure conditions. In conclusion, analysis of the bioelectromagnetic data could result in
Wang, Tien-ni; Wu, Ching-yi; Chen, Chia-ling; Shieh, Jeng-yi; Lu, Lu; Lin, Keh-chung
2013-03-01
Given the growing evidence for the effects of constraint-induced therapy (CIT) in children with cerebral palsy (CP), there is a need for investigating the characteristics of potential participants who may benefit most from this intervention. This study aimed to establish predictive models for the effects of pediatric CIT on motor and functional outcomes. Therapists administered CIT to 49 children (aged 3-11 years) with CP. Sessions were 1-3.5h a day, twice a week, for 3-4 weeks. Parents were asked to document the number of restraint hours outside of the therapy sessions. Domains of treatment outcomes included motor capacity (measured by the Peabody Developmental Motor Scales II), motor performance (measured by the Pediatric Motor Activity Log), and functional independence (measured by the Pediatric Functional Independence Measure). Potential predictors included age, affected side, compliance (measured by time of restraint), and the initial level of motor impairment severity. Tests were administered before, immediately after, and 3 months after the intervention. Logistic regression analyses showed that total amount of restraint time was the only significant predictor for improved motor capacity immediately after CIT. Younger children who restrained the less affected arm for a longer time had a greater chance to achieve clinically significant improvements in motor performance. For outcomes of functional independence in daily life, younger age was associated with clinically meaningful improvement in the self-care domain. Baseline motor abilities were significantly predictive of better improvement in mobility and cognition. Significant predictors varied according to the aspects of motor outcomes after 3 months of follow-up. The potential predictors identified in this study allow clinicians to target those children who may benefit most from CIT.
NASA Astrophysics Data System (ADS)
Cama, Mariaelena; Cristi Nicu, Ionut; Conoscenti, Christian; Quénéhervé, Geraldine; Maerker, Michael
2016-04-01
Landslide susceptibility can be defined as the likelihood of a landslide occurring in a given area on the basis of local terrain conditions. In recent decades much research has focused on its evaluation by means of stochastic approaches, under the assumption that 'the past is the key to the future': if a model is able to reproduce a known landslide spatial distribution, it should be able to predict the future locations of new (i.e. unknown) slope failures. Among the various stochastic approaches, Binary Logistic Regression (BLR) is one of the most used because it calculates the susceptibility in probabilistic terms and its results are easily interpretable from a geomorphological point of view. However, little importance is often given to assessing multicollinearity, whose effect is that the coefficient estimates are unstable, with opposite signs, and therefore difficult to interpret. It should thus be evaluated every time in order to build a model whose results are geomorphologically sound. In this study the effects of multicollinearity on the predictive performance and robustness of landslide susceptibility models are analyzed. In particular, multicollinearity is estimated by means of the Variance Inflation Factor (VIF), which is also used as a selection criterion for the independent variables (VIF stepwise selection) and compared to the more commonly used AIC stepwise selection. The robustness of the results is evaluated through 100 replicates of the dataset. The study area selected for this analysis is the Moldavian Plateau, where landslides are among the most frequent geomorphological processes. This area has an increasing trend of urbanization and very high cultural-heritage potential, being the place of discovery of the largest settlement belonging to the Cucuteni Culture in Eastern Europe (which led to the development of the great Cucuteni-Tripyllia complex). Therefore, identifying the areas susceptible to
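The VIF used above is defined as VIF_j = 1/(1 - R²_j), where R²_j comes from regressing predictor j on the remaining predictors; values above roughly 5-10 conventionally flag problematic collinearity. A sketch on synthetic terrain variables (the variable names and the near-collinear construction are illustrative):

```python
import numpy as np

# Sketch: Variance Inflation Factor, the collinearity diagnostic used for
# VIF-stepwise predictor selection. Data below are synthetic.

def vif(X):
    """VIF for each column of the predictor matrix X (n x p)."""
    n, p = X.shape
    out = []
    for j in range(p):
        y = X[:, j]
        Z = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        resid = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
        r2 = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
        out.append(1 / (1 - r2))
    return out

rng = np.random.default_rng(42)
slope = rng.normal(size=100)
aspect = rng.normal(size=100)
elevation = slope + 0.05 * rng.normal(size=100)  # nearly collinear with slope

X = np.column_stack([slope, aspect, elevation])
print([round(v, 1) for v in vif(X)])
# slope and elevation share a VIF far above the usual 5-10 cutoff, while
# aspect stays near 1; VIF-stepwise selection would drop one of the pair.
```

Dropping one member of each high-VIF pair before fitting the BLR is what keeps the coefficient signs stable and geomorphologically interpretable.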
NASA Astrophysics Data System (ADS)
Ciprian Margarint, Mihai; Niculita, Mihai
2014-05-01
Regions with a monoclinal geological structure are large portions of the Earth's surface where similar landform patterns repeat very distinctly, the cuesta scarps being characterized by similar values of morphometric variables. Landslides are associated with these cuesta scarps and, consequently, very high landslide susceptibility can be reported on their surfaces. In such regions, landslide susceptibility mapping can be carried out for the entire region or for test areas with accurate, reliable and available datasets of multi-temporal inventories and landslide predictors. Because of the similar geomorphology and landslide distribution, we think that if test areas have any relevance for extrapolating susceptibility models, these regions should be targeted first. This case study tries to establish how well the landslide predictor influences obtained for a 90 km2 sample located in the northern part of the Moldavian Plateau (NE Romania) can be used in other areas of the same physio-geographic region. In a first phase, landslide susceptibility assessment was carried out and validated using a logistic regression (LR) approach and a multiple landslide inventory. This inventory was created using orthorectified aerial images from 1978 and 2005, both old and active landslides being considered for each period. The modeling strategy was based on a distinct inventory of the depletion areas of all landslides for the 1978 phase, and on 30 covariates extracted from topographic maps and aerial images (from both 1978 and 2005). The geomorphometric variables were computed from a Digital Elevation Model (DEM) obtained by interpolation from 1:5000 contour data (2.5 m equidistance) at 10x10 m resolution. Distance from the river network, distance from roads and land use were extracted from topographic maps and aerial images. By applying the Akaike Information Criterion (AIC), the covariates with significance under the 0.001 level
de Vet, Emely; Chinapaw, Mai JM; de Boer, Michiel; Seidell, Jacob C; Brug, Johannes
2014-01-01
Background Playing video games contributes substantially to sedentary behavior in youth. A new generation of video games—active games—seems to be a promising alternative to sedentary games to promote physical activity and reduce sedentary behavior. At this time, little is known about correlates of active and non-active gaming among adolescents. Objective The objective of this study was to examine potential personal, social, and game-related correlates of both active and non-active gaming in adolescents. Methods A survey assessing game behavior and potential personal, social, and game-related correlates was conducted among adolescents (12-16 years, N=353) recruited via schools. Multivariable, multilevel logistic regression analyses, adjusted for demographics (age, sex and educational level of adolescents), were conducted to examine personal, social, and game-related correlates of active gaming ≥1 hour per week (h/wk) and non-active gaming >7 h/wk. Results Active gaming ≥1 h/wk was significantly associated with a more positive attitude toward active gaming (OR 5.3, CI 2.4-11.8; P<.001), a less positive attitude toward non-active games (OR 0.30, CI 0.1-0.6; P=.002), a higher score on habit strength regarding gaming (OR 1.9, CI 1.2-3.2; P=.008) and having brothers/sisters (OR 6.7, CI 2.6-17.1; P<.001) and friends (OR 3.4, CI 1.4-8.4; P=.009) who spend more time on active gaming and a little bit lower score on game engagement (OR 0.95, CI 0.91-0.997; P=.04). Non-active gaming >7 h/wk was significantly associated with a more positive attitude toward non-active gaming (OR 2.6, CI 1.1-6.3; P=.035), a stronger habit regarding gaming (OR 3.0, CI 1.7-5.3; P<.001), having friends who spend more time on non-active gaming (OR 3.3, CI 1.46-7.53; P=.004), and a more positive image of a non-active gamer (OR 2, CI 1.07–3.75; P=.03). Conclusions Various factors were significantly associated with active gaming ≥1 h/wk and non-active gaming >7 h/wk. Active gaming is most
NASA Technical Reports Server (NTRS)
Duda, David P.; Minnis, Patrick
2009-01-01
Straightforward application of the Schmidt-Appleman contrail formation criteria to diagnose persistent contrail occurrence from numerical weather prediction data is hindered by significant bias errors in the upper tropospheric humidity. Logistic models of contrail occurrence have been proposed to overcome this problem, but basic questions remain about how random measurement error may affect their accuracy. A set of 5000 synthetic contrail observations is created to study the effects of random error in these probabilistic models. The simulated observations are based on distributions of temperature, humidity, and vertical velocity derived from Advanced Regional Prediction System (ARPS) weather analyses. The logistic models created from the simulated observations were evaluated using two common statistical measures of model accuracy, the percent correct (PC) and the Hanssen-Kuipers discriminant (HKD). To convert the probabilistic results of the logistic models into a dichotomous yes/no choice suitable for the statistical measures, two critical probability thresholds are considered. The HKD scores are higher when the climatological frequency of contrail occurrence is used as the critical threshold, while the PC scores are higher when the critical probability threshold is 0.5. For both thresholds, typical random errors in temperature, relative humidity, and vertical velocity are found to be small enough to allow for accurate logistic models of contrail occurrence. The accuracy of the models developed from synthetic data is over 85 percent for both the prediction of contrail occurrence and non-occurrence, although in practice, larger errors would be anticipated.
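The two verification scores used above are straightforward to compute from a 2x2 contingency table of forecast versus observed contrail occurrence; unlike percent correct, the Hanssen-Kuipers discriminant is not inflated by the large number of correct negatives in a rare-event setting, which is why the two scores favor different probability thresholds. A sketch with hypothetical counts:

```python
# Sketch: percent correct (PC) and the Hanssen-Kuipers discriminant (HKD)
# from a 2x2 forecast/observation contingency table. Counts are hypothetical.

def percent_correct(hits, misses, false_alarms, correct_negatives):
    """Fraction of all forecasts (yes or no) that verified."""
    n = hits + misses + false_alarms + correct_negatives
    return (hits + correct_negatives) / n

def hanssen_kuipers(hits, misses, false_alarms, correct_negatives):
    """HKD = hit rate - false alarm rate; 1 is perfect, 0 is no skill."""
    return (hits / (hits + misses)
            - false_alarms / (false_alarms + correct_negatives))

table = dict(hits=420, misses=80, false_alarms=300, correct_negatives=4200)
print(round(percent_correct(**table), 3),   # → 0.924
      round(hanssen_kuipers(**table), 3))   # → 0.773
```

To apply either score to a logistic model, its predicted probability is first dichotomized at a critical threshold (0.5 or the climatological frequency, as in the study), which is what populates the table above.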
ERIC Educational Resources Information Center
Gómez-Benito, Juana; Hidalgo, Maria Dolores; Zumbo, Bruno D.
2013-01-01
The objective of this article was to find an optimal decision rule for identifying polytomous items with large or moderate amounts of differential functioning. The effectiveness of combining statistical tests with effect size measures was assessed using logistic discriminant function analysis and two effect size measures: R[superscript 2] and…
NASA Technical Reports Server (NTRS)
Duda, David P.; Minnis, Patrick
2009-01-01
Previous studies have shown that probabilistic forecasting may be a useful method for predicting persistent contrail formation. A probabilistic forecast to accurately predict contrail formation over the contiguous United States (CONUS) is created by using meteorological data based on hourly meteorological analyses from the Advanced Regional Prediction System (ARPS) and from the Rapid Update Cycle (RUC) as well as GOES water vapor channel measurements, combined with surface and satellite observations of contrails. Two groups of logistic models were created. The first group of models (SURFACE models) is based on surface-based contrail observations supplemented with satellite observations of contrail occurrence. The second group of models (OUTBREAK models) is derived from a selected subgroup of satellite-based observations of widespread persistent contrails. The mean accuracies for both the SURFACE and OUTBREAK models typically exceeded 75 percent when based on the RUC or ARPS analysis data, but decreased when the logistic models were derived from ARPS forecast data.
NASA Astrophysics Data System (ADS)
Roşca, S.; Bilaşco, Ş.; Petrea, D.; Fodorean, I.; Vescan, I.; Filip, S.; Măguţ, F.-L.
2015-11-01
The existence of a large number of GIS models for the identification of landslide occurrence probability makes the selection of a specific one difficult. The present study focuses on the application of two quantitative models, the logistic and the BSA models; comparative analysis of the results aims at identifying the more suitable one. The territory of the Niraj Mic Basin (87 km2) is characterised by a wide variety of landforms, with varied morphometric, morphographic and geological characteristics, as well as by a highly complex mix of land use types in which active landslides exist. This is why it serves as the test area for applying the two models and comparing their results. The large complexity of input variables is illustrated by 16 factors, represented as 72 dummy variables and analysed on the basis of their importance within the model structures. Testing the statistical significance of each variable reduced the number of dummy variables to 12 considered significant for the test area within the logistic model, whereas all the variables were employed for the BSA model. The predictive power of the models was tested by identifying the area under the ROC curve, which indicated good accuracy (AUROC = 0.86 for the testing area) and predictability of the logistic model (AUROC = 0.63 for the validation area).
Chung, Doo Yong; Cho, Kang Su; Lee, Dae Hun; Han, Jang Hee; Kang, Dong Hyuk; Jung, Hae Do; Kown, Jong Kyou; Ham, Won Sik; Choi, Young Deuk; Lee, Joo Yong
2015-01-01
Purpose This study was conducted to evaluate colic pain as a pretreatment prognostic factor that can influence ureter stone clearance and to estimate the probability of stone-free status in shock wave lithotripsy (SWL) patients with a ureter stone. Materials and Methods We retrospectively reviewed the medical records of 1,418 patients who underwent their first SWL between 2005 and 2013. Among these patients, 551 had a ureter stone measuring 4–20 mm and were thus eligible for our analyses. Colic pain as the chief complaint was defined as subjective flank pain reported during history taking or physical examination. A propensity score for colic pain was calculated for each patient using multivariate logistic regression based upon the following covariates: age, maximal stone length (MSL), and mean stone density (MSD). Each factor was evaluated as a predictor of stone-free status using Bayesian and non-Bayesian logistic regression models. Results After propensity-score matching, 217 patients were extracted in each group from the total patient cohort. There were no statistical differences in the variables used for propensity-score matching. One-session success and stone-free rates were higher in the painful group (73.7% and 71.0%, respectively) than in the painless group (63.6% and 60.4%, respectively). In multivariate non-Bayesian and Bayesian logistic regression models, a painful stone, shorter MSL, and lower MSD were significant factors for one-session stone-free status in patients who underwent SWL. Conclusions Colic pain in patients with ureter calculi was a significant predictive factor, along with MSL and MSD, for one-session stone-free status after SWL. PMID:25902059
ERIC Educational Resources Information Center
Davidson, J. Cody
2016-01-01
Mathematics is the most common subject area of remedial need, and the majority of remedial math students never pass a college-level credit-bearing math class. The majority of studies that investigate this phenomenon are conducted at community colleges and use some type of regression model; however, none have used a continuation ratio model. The…
Optimal estimation for regression models on τ-year survival probability.
Kwak, Minjung; Kim, Jinseog; Jung, Sin-Ho
2015-01-01
A logistic regression method can be applied to regress the τ-year survival probability on covariates if there are no observations censored before time τ. But if some observations are incomplete due to censoring before time τ, then logistic regression cannot be applied directly. Jung (1996) proposed modifying the score function of logistic regression to accommodate right-censored observations. His modified score function, motivated by consistent estimation of the regression parameters, reduces to the regular logistic score function if no observations are censored before time τ. In this article, we propose a modification of Jung's estimating function that achieves optimal estimation of the regression parameters in addition to consistency. We prove that the optimal estimator is more efficient than Jung's estimator. This theoretical comparison is illustrated with a real-data example and simulations.
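The "regular logistic score function" that Jung's estimating function reduces to can be sketched concretely. The minimal stdlib example below fits an intercept-only logistic model (no censoring, no covariates) by Newton-Raphson on the score equation; it is an illustration of the uncensored special case only, not the authors' modified estimator, and all names are illustrative.

```python
import math

def logistic_score(beta, y):
    # Score (gradient of the log-likelihood) for an intercept-only
    # logistic model: U(beta) = sum_i (y_i - p), with p = expit(beta).
    p = 1.0 / (1.0 + math.exp(-beta))
    return sum(yi - p for yi in y)

def fit_intercept(y, iters=25):
    # Newton-Raphson on the score equation U(beta) = 0:
    # beta <- beta + U(beta) / I(beta), with information I = n * p * (1 - p).
    beta = 0.0
    n = len(y)
    for _ in range(iters):
        p = 1.0 / (1.0 + math.exp(-beta))
        info = n * p * (1.0 - p)
        beta += logistic_score(beta, y) / info
    return beta

# With 3 events out of 4 subjects, the MLE is logit(0.75) = log(3).
print(fit_intercept([1, 1, 1, 0]))  # ~ 1.0986
```

At the solution the score is zero; Jung's modification replaces each `y_i` with an inverse-probability-of-censoring-weighted analogue so the same estimating-equation machinery applies under censoring.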
NASA Technical Reports Server (NTRS)
Smalheer, C. V.
1973-01-01
The chemistry of lubricant additives is discussed to show what the additives are chemically and what functions they perform in the lubrication of various kinds of equipment. Current theories regarding the mode of action of lubricant additives are presented. The additive groups discussed include the following: (1) detergents and dispersants, (2) corrosion inhibitors, (3) antioxidants, (4) viscosity index improvers, (5) pour point depressants, and (6) antifouling agents.
AMINI, Payam; AHMADINIA, Hasan; POOROLAJAL, Jalal; MOQADDASI AMIRI, Mohammad
2016-01-01
Background: We aimed to assess the high-risk group for suicide using different classification methods including logistic regression (LR), decision tree (DT), artificial neural network (ANN), and support vector machine (SVM). Methods: We used the dataset of a study conducted to predict risk factors of completed suicide in Hamadan Province, in the west of Iran, in 2010. To evaluate the high-risk groups for suicide, LR, SVM, DT and ANN were performed. The methods were compared using sensitivity, specificity, positive predictive value, negative predictive value, accuracy, and the area under the curve. Cochran's Q test was applied to check differences in proportions among the methods. To assess the association between the observed and predicted values, the φ coefficient, the contingency coefficient, and Kendall's tau-b were calculated. Results: Gender, age, and job were the most important risk factors for fatal suicide attempts common to all four methods. The SVM method showed the highest accuracy (0.68 and 0.67 for the training and testing samples, respectively). This method also yielded the highest specificity (0.67 for the training and 0.68 for the testing sample) and the highest sensitivity for the training sample (0.85), but the lowest sensitivity for the testing sample (0.53). Cochran's Q test indicated differences between the proportions of the different methods (P<0.001). For the association between SVM predictions and observed values, the φ coefficient, contingency coefficient, and Kendall's tau-b were 0.239, 0.232 and 0.239, respectively. Conclusion: SVM had the best performance in classifying fatal suicide attempts compared with DT, LR and ANN. PMID:27957463
Frangos, Christos C; Frangos, Constantinos C; Sotiropoulos, Ioannis
2011-01-01
The aim of this paper is to investigate the relationships between Problematic Internet Use (PIU) among university students in Greece and factors such as gender, age, family condition, academic performance in the last semester of their studies, enrollment in unemployment programs, amount of Internet use per week (in general and per application), additional personal habits or dependencies (number of coffees, alcoholic drinks drunk per day, taking substances, cigarettes smoked per day), and negative psychological beliefs. Data were gathered from 2,358 university students from across Greece. The prevalence of PIU was 34.7% in our sample, and PIU was significantly associated with gender, parental family status, grade of studies during the previous semester, staying or not with parents, enrollment of the student in an unemployment program, and whether the student paid a subscription to the Internet (p < 0.0001). On average, problematic Internet users use MSN, forums, YouTube, pornographic sites, chat rooms, advertisement sites, Google, Yahoo!, their e-mail, ftp, games, and blogs more than non-problematic Internet users. PIU was also associated with other potential addictive personal habits of smoking, drinking alcohol or coffee, and taking drugs. Significant risk factors for PIU were being male, enrolment in unemployment programs, presence of negative beliefs, visiting pornographic sites, and playing online games. Thus PIU is prevalent among Greek university students and attention should be given to it by health officials.
A Latent Transition Model with Logistic Regression
ERIC Educational Resources Information Center
Chung, Hwan; Walls, Theodore A.; Park, Yousung
2007-01-01
Latent transition models increasingly include covariates that predict prevalence of latent classes at a given time or transition rates among classes over time. In many situations, the covariate of interest may be latent. This paper describes an approach for handling both manifest and latent covariates in a latent transition model. A Bayesian…
Estimation of propensity scores using generalized additive models.
Woo, Mi-Ja; Reiter, Jerome P; Karr, Alan F
2008-08-30
Propensity score matching is often used in observational studies to create treatment and control groups with similar distributions of observed covariates. Typically, propensity scores are estimated using logistic regressions that assume linearity between the logistic link and the predictors. We evaluate the use of generalized additive models (GAMs) for estimating propensity scores. We compare logistic regressions and GAMs in terms of balancing covariates using simulation studies with artificial and genuine data. We find that, when the distributions of covariates in the treatment and control groups overlap sufficiently, using GAMs can improve overall covariate balance, especially for higher-order moments of distributions. When the distributions in the two groups overlap insufficiently, GAM more clearly reveals this fact than logistic regression does. We also demonstrate via simulation that matching with GAMs can result in larger reductions in bias when estimating treatment effects than matching with logistic regression.
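Once propensity scores have been estimated (whether by logistic regression or a GAM), matching itself is a separate step. The sketch below is a hedged toy illustration of greedy 1:1 nearest-neighbour matching on given scores, using only the standard library; it is not the authors' procedure, and the inputs and function name are illustrative.

```python
def nearest_neighbor_match(treated, control):
    # Greedy 1:1 nearest-neighbour matching on estimated propensity scores,
    # without replacement. Inputs are lists of (unit_id, score) pairs.
    matches = []
    available = dict(control)  # control id -> score, shrinks as units are used
    for tid, tscore in treated:
        if not available:
            break
        # Pick the unused control unit whose score is closest to this
        # treated unit's score.
        cid = min(available, key=lambda c: abs(available[c] - tscore))
        matches.append((tid, cid))
        del available[cid]
    return matches

treated = [(1, 0.80), (2, 0.30)]
control = [(10, 0.75), (11, 0.35), (12, 0.50)]
print(nearest_neighbor_match(treated, control))  # -> [(1, 10), (2, 11)]
```

After matching, covariate balance between the matched groups is checked (e.g. via standardized mean differences), which is the criterion the study uses to compare logistic regression against GAMs.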
Foster, Guy M.; Graham, Jennifer L.
2016-04-06
The Kansas River is a primary source of drinking water for about 800,000 people in northeastern Kansas. Source-water supplies are treated by a combination of chemical and physical processes to remove contaminants before distribution. Advanced notification of changing water-quality conditions and of cyanobacteria and associated toxin and taste-and-odor compounds gives drinking-water treatment facilities time to develop and implement adequate treatment strategies. The U.S. Geological Survey (USGS), in cooperation with the Kansas Water Office (funded in part through the Kansas State Water Plan Fund), and the City of Lawrence, the City of Topeka, the City of Olathe, and Johnson County Water One, began a study in July 2012 to develop statistical models at two Kansas River sites located upstream from drinking-water intakes. Continuous water-quality monitors have been operated, and discrete water-quality samples collected, on the Kansas River at Wamego (USGS site number 06887500) and De Soto (USGS site number 06892350) since July 2012. Continuous and discrete water-quality data collected during July 2012 through June 2015 were used to develop statistical models for constituents of interest at the Wamego and De Soto sites. Logistic models to continuously estimate the probability of occurrence above selected thresholds were developed for cyanobacteria, microcystin, and geosmin. Linear regression models to continuously estimate constituent concentrations were developed for major ions, dissolved solids, alkalinity, nutrients (nitrogen and phosphorus species), suspended sediment, indicator bacteria (Escherichia coli, fecal coliform, and enterococci), and actinomycetes bacteria. These models will be used to provide real-time estimates of the probability that cyanobacteria and associated compounds exceed thresholds and of the concentrations of other water-quality constituents in the Kansas River. The models documented in this report are useful for characterizing changes
Pecori, Biagio; Lastoria, Secondo; Caracò, Corradina; Celentani, Marco; Tatangelo, Fabiana; Avallone, Antonio; Rega, Daniela; De Palma, Giampaolo; Mormile, Maria; Budillon, Alfredo; Muto, Paolo; Bianco, Francesco; Aloj, Luigi; Petrillo, Antonella; Delrio, Paolo
2017-01-01
Previous studies indicate that FDG PET/CT may predict pathological response in patients undergoing neoadjuvant chemo-radiotherapy for locally advanced rectal cancer (LARC). The aim of the current study is to evaluate whether pathological response can be similarly predicted in LARC patients after short course radiation therapy alone. Methods: Thirty-three patients with cT2-3, N0-2, M0 rectal adenocarcinoma treated with hypofractionated short course neoadjuvant RT (5x5 Gy) with delayed surgery (SCRTDS) were prospectively studied. All patients underwent three PET/CT studies: at baseline, 10 days from the end of RT (early), and 53 days from the end of RT (delayed). The maximal standardized uptake value (SUVmax), mean standardized uptake value (SUVmean), and total lesion glycolysis (TLG) of the primary tumor were measured and recorded at each PET/CT study. We used logistic regression analysis to aggregate different measures of metabolic response to predict the pathological response in the course of SCRTDS. Results: We provide straightforward formulas to classify response and estimate the probability of being a major responder (TRG1-2) or a complete responder (TRG1) for each individual. The formulas are based on the level of TLG at the early PET and on the overall proportional reduction of TLG between the baseline and delayed PET studies. Conclusions: This study demonstrates that in the course of SCRTDS it is possible to estimate the probabilities of pathological tumor response on the basis of FDG PET/CT. Our formulas make it possible to assess the risks associated with LARC borne by a patient in the course of SCRTDS. These risk assessments can be balanced against other health risks associated with further treatments and can therefore be used to make informed therapy adjustments during SCRTDS. PMID:28060889
NASA Astrophysics Data System (ADS)
Cotos-Yáñez, Tomas R.; Rodríguez-Rajo, F. J.; Jato, M. V.
Betula pollen is a common cause of pollinosis in localities in NW Spain, and between 13% and 60% of individuals who are immunosensitive to pollen grains respond positively to its allergens. For all such people it is important to be able to predict pollen concentrations in advance. We therefore undertook an aerobiological study in the city of Vigo (Pontevedra, Spain) from 1995 to 2001, using a Hirst active-impact pollen trap (VPPS 2000) situated in the city centre. Vigo presents a temperate maritime climate with a mean annual temperature of 14.9 °C and 1,412 mm annual total precipitation. This paper analyses two ways of quantifying the prediction of pollen concentration: first, by means of a generalized additive regression model aimed at predicting whether the series of interest exceeds a certain threshold; second, using a partially linear model to obtain specific predicted values of pollen grains. Both models use a self-explicative (autoregressive) part and another formed by exogenous meteorological factors. The models were tested with data from 2001 (a year in which the total precipitation registered was almost twice the overall climatological average during the flowering period), which were not used in formulating the models. A highly satisfactory classification and good forecasting results were achieved with the first and second approaches, respectively. The estimated line, taking into account temperature and a calm S-SW wind, corresponds to the real line recorded during 2001, which gives an idea of the proposed model's validity.
2012-04-18
Maritime Prepositioning Force (MPF) Auxiliary Dry Cargo/Ammunition Ship (T-AKE) Concept of Employment. Official Message (CMC WASHINGTON DC PPO POE), DTG: 291447Z Mar 2010. Military Sealift Command, Combat Logistics Force Webpage. Headquarters, U.S. Marine Corps.
2011-01-01
Background Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures for Mild Cognitive Impairment (MCI), but at present has limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning, such as Neural Networks, Support Vector Machines and Random Forests, can improve the accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven nonparametric classifiers derived from data mining methods (Multilayer Perceptron Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees, and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, area under the ROC curve, and Press' Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using Friedman's nonparametric test. Results Press' Q test showed that all classifiers performed better than chance alone (p < 0.05). Support Vector Machines showed the largest overall classification accuracy (Median (Me) = 0.76) and an area under the ROC curve of Me = 0.90. However, this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forests ranked second in overall accuracy (Me = 0.73), with a high area under the ROC curve (Me = 0.73), specificity (Me = 0.73) and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with an acceptable area under the ROC curve (Me = 0.72), specificity (Me = 0.66) and sensitivity (Me = 0.64). The remaining classifiers showed
Chai, Zhiguo; Chen, Jihua; Zhang, Shaofeng
2014-01-01
Background Removable dentures are subject to plaque and/or staining problems. Denture hygiene habits and risk factors differ among countries and regions. The aims of this study were to assess hygiene habits and denture plaque and staining risk factors in Chinese removable denture wearers aged >40 years in Xi’an through multiple logistic regression analysis (MLRA). Methods Questionnaires were administered to 222 patients whose removable dentures were examined clinically to assess wear status and levels of plaque and staining. Univariate analyses were performed to identify potential risk factors for denture plaque/staining. MLRA was performed to identify significant risk factors. Results Brushing (77.93%) was the most prevalent cleaning method in the present study. Only 16.4% of patients regularly used commercial cleansers. Most (81.08%) patients removed their dentures overnight. MLRA indicated that potential risk factors for denture plaque were the duration of denture use (reference, ≤0.5 years; 2.1–5 years: OR = 4.155, P = 0.001; >5 years: OR = 7.238, P<0.001) and cleaning method (reference, chemical cleanser; running water: OR = 7.081, P = 0.010; brushing: OR = 3.567, P = 0.005). Potential risk factors for denture staining were female gender (OR = 0.377, P = 0.013), smoking (OR = 5.471, P = 0.031), tea consumption (OR = 3.957, P = 0.002), denture scratching (OR = 4.557, P = 0.036), duration of denture use (reference, ≤0.5 years; 2.1–5 years: OR = 7.899, P = 0.001; >5 years: OR = 27.226, P<0.001), and cleaning method (reference, chemical cleanser; running water: OR = 29.184, P<0.001; brushing: OR = 4.236, P = 0.007). Conclusion Denture hygiene habits need further improvement. An understanding of the risk factors for denture plaque and staining may provide the basis for preventive efforts. PMID:24498369
NASA Astrophysics Data System (ADS)
Wu, Chunhung
2016-04-01
Few studies have discussed the applicability of statistical landslide susceptibility (LS) models to extreme rainfall-induced landslide events. This research focuses on the comparison and applicability of LS models based on four methods, including the landslide ratio-based logistic regression (LRBLR), frequency ratio (FR), weight of evidence (WOE), and instability index (II) methods, for an extreme rainfall-induced landslide case. The landslide inventory of the Chishan river watershed, Southwestern Taiwan, after 2009 Typhoon Morakot is the main material in this research. The Chishan river watershed is a tributary of the Kaoping river watershed, a landslide- and erosion-prone watershed with an annual average suspended load of 3.6×10⁷ MT/yr (ranking 11th in the world). Typhoon Morakot struck Southern Taiwan from Aug. 6-10, 2009, and dumped nearly 2,000 mm of rainfall on the Chishan river watershed. The 24-hour, 48-hour, and 72-hour accumulated rainfall in the Chishan river watershed exceeded the 200-year return period accumulated rainfall. 2,389 landslide polygons in the Chishan river watershed were extracted from SPOT 5 images after 2009 Typhoon Morakot. The total landslide area is around 33.5 km2, equal to a landslide ratio of 4.1%. The main landslide types based on Varnes' (1978) classification are rotational and translational slides. The two characteristics of an extreme rainfall-induced landslide event are dense landslide distribution and a large share of downslope landslide areas owing to headward erosion and bank erosion in the flooding processes. The area of downslope landslides in the Chishan river watershed after 2009 Typhoon Morakot is 3.2 times that of upslope landslide areas. The prediction accuracy of the LS models based on the LRBLR, FR, WOE, and II methods exceeded 70%. The model performance and applicability of the four models in a landslide-prone watershed with dense distribution of rainfall
Nuclear Shuttle Logistics Configuration
NASA Technical Reports Server (NTRS)
1971-01-01
This 1971 artist's concept shows the Nuclear Shuttle in both its lunar logistics configuration and geosynchronous station configuration. As envisioned by Marshall Space Flight Center Program Development personnel, the Nuclear Shuttle would deliver payloads to lunar orbits or other destinations, then return to Earth orbit for refueling and additional missions.
Killeen, Peter R
2015-07-01
The generalized matching law (GML) is reconstructed as a logistic regression equation that privileges no particular value of the sensitivity parameter, a. That value will often approach 1 due to the feedback that drives switching that is intrinsic to most concurrent schedules. A model of that feedback reproduced some features of concurrent data. The GML is a law only in the strained sense that any equation that maps data is a law. The machine under the hood of matching is in all likelihood the very law that was displaced by the Matching Law. It is now time to return the Law of Effect to centrality in our science.
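The GML's connection to logistic regression comes from its standard log-ratio form, log(B1/B2) = a·log(R1/R2) + log b, which is linear in the log reinforcement ratio just as a logit is linear in its predictors. The sketch below is a hedged stdlib illustration of that form (function name and inputs are illustrative, not from the paper):

```python
import math

def gml_choice_ratio(r1, r2, a=1.0, log_b=0.0):
    # Generalized matching law in its log-ratio form:
    #   log(B1/B2) = a * log(R1/R2) + log(b)
    # Returns the predicted behavior ratio B1/B2 given reinforcement
    # rates r1, r2, sensitivity a, and log bias log_b.
    return math.exp(a * math.log(r1 / r2) + log_b)

# Strict matching (a = 1, b = 1): behavior ratio equals reinforcement ratio.
print(gml_choice_ratio(3.0, 1.0))        # -> 3.0
# Undermatching (a < 1) pulls the predicted ratio toward indifference.
print(gml_choice_ratio(3.0, 1.0, a=0.8))
```

The abstract's point is that on most concurrent schedules, feedback from switching drives the fitted sensitivity a toward 1, making strict matching an emergent outcome rather than a fundamental law.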
Accounting for Slipping and Other False Negatives in Logistic Models of Student Learning
ERIC Educational Resources Information Center
MacLellan, Christopher J.; Liu, Ran; Koedinger, Kenneth R.
2015-01-01
Additive Factors Model (AFM) and Performance Factors Analysis (PFA) are two popular models of student learning that employ logistic regression to estimate parameters and predict performance. This is in contrast to Bayesian Knowledge Tracing (BKT) which uses a Hidden Markov Model formalism. While all three models tend to make similar predictions,…
Country logistics performance and disaster impact.
Vaillancourt, Alain; Haavisto, Ira
2016-04-01
The aim of this paper is to deepen the understanding of the relationship between country logistics performance and disaster impact. The relationship is analysed through correlation analysis and regression models for 117 countries for the years 2007 to 2012, with disaster impact variables from the International Disaster Database (EM-DAT) and logistics performance indicators from the World Bank. The results show a significant relationship between country logistics performance and disaster impact overall and for five out of six specific logistics performance indicators. These specific indicators were further used to explore the relationship between country logistics performance and disaster impact for three specific disaster types (epidemic, flood and storm). The findings enhance the understanding of the role of logistics in a humanitarian context with empirical evidence of the importance of country logistics performance in disaster response operations.
J. Richard Hess; Kevin L. Kenney; William A. Smith; Ian Bonner; David J. Muth
2015-04-01
Equipment manufacturers have made rapid improvements in biomass harvesting and handling equipment. These improvements have increased transportation and handling efficiencies due to higher biomass densities and reduced losses. Improvements in grinder efficiencies and capacity have reduced biomass grinding costs. Biomass collection efficiencies (the ratio of biomass collected to the amount available in the field) as high as 75% for crop residues and greater than 90% for perennial energy crops have also been demonstrated. However, as collection rates increase, the fraction of entrained soil in the biomass increases, and high biomass residue removal rates can violate agronomic sustainability limits. Advancements in quantifying multi-factor sustainability limits to increase removal rates as guided by sustainable residue removal plans, and in mitigating soil contamination through targeted removal rates based on soil type and residue type/fraction, are allowing the use of new high-efficiency harvesting equipment and methods. As another consideration, single-pass harvesting and other technologies that improve harvesting costs create biomass storage moisture management challenges, which are further compounded by annual variability in biomass moisture content. Monitoring, sampling, simulation, and analysis provide a basis for moisture, time, and quality relationships in storage, which has allowed the development of moisture-tolerant storage systems and best management processes that combine moisture content and time to accommodate baled storage of wet material based upon “shelf-life.” The key to improving biomass supply logistics costs has been developing the associated agronomic sustainability and biomass quality technologies and processes that allow the implementation of equipment engineering solutions.
Gerber, Samuel; Rubel, Oliver; Bremer, Peer -Timo; Pascucci, Valerio; Whitaker, Ross T.
2012-01-19
This paper introduces a novel partition-based regression approach that incorporates topological information. Partition-based regression typically introduces a quality-of-fit-driven decomposition of the domain. The emphasis in this work is on a topologically meaningful segmentation. Thus, the proposed regression approach is based on a segmentation induced by a discrete approximation of the Morse–Smale complex. This yields a segmentation with partitions corresponding to regions of the function with a single minimum and maximum that are often well approximated by a linear model. This approach yields regression models that are amenable to interpretation and have good predictive capacity. Typically, regression estimates are quantified by their geometrical accuracy. For the proposed regression, an important aspect is the quality of the segmentation itself. Thus, this article introduces a new criterion that measures the topological accuracy of the estimate. The topological accuracy provides a complementary measure to the classical geometrical error measures and is very sensitive to overfitting. The Morse–Smale regression is compared to state-of-the-art approaches in terms of geometry and topology and yields comparable or improved fits in many cases. Finally, a detailed study on climate-simulation data demonstrates the application of the Morse–Smale regression. Supplementary Materials are available online and contain an implementation of the proposed approach in the R package msr, an analysis and simulations on the stability of the Morse–Smale complex approximation, and additional tables for the climate-simulation study.
Wang, Ming; Flanders, W Dana; Bostick, Roberd M; Long, Qi
2012-12-20
Measurement error is common in epidemiological and biomedical studies. When biomarkers are measured in batches or groups, measurement error is potentially correlated within each batch or group. In regression analysis, most existing methods are not applicable in the presence of batch-specific measurement error in predictors. We propose a robust conditional likelihood approach to account for batch-specific error in predictors when batch effect is additive and the predominant source of error, which requires no assumptions on the distribution of measurement error. Although a regression model with batch as a categorical covariable yields the same parameter estimates as the proposed conditional likelihood approach for linear regression, this result does not hold in general for all generalized linear models, in particular, logistic regression. Our simulation studies show that the conditional likelihood approach achieves better finite sample performance than the regression calibration approach or a naive approach without adjustment for measurement error. In the case of logistic regression, our proposed approach is shown to also outperform the regression approach with batch as a categorical covariate. In addition, we also examine a 'hybrid' approach combining the conditional likelihood method and the regression calibration method, which is shown in simulations to achieve good performance in the presence of both batch-specific and measurement-specific errors. We illustrate our method by using data from a colorectal adenoma study.
Schmid, Matthias; Wickler, Florian; Maloney, Kelly O.; Mitchell, Richard; Fenske, Nora; Mayr, Andreas
2013-01-01
Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1). Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fitting a beta regression model is maximum likelihood estimation with subsequent AIC-based variable selection. As an alternative to this established, yet unstable, approach, we propose a new estimation technique called boosted beta regression. With boosted beta regression, estimation and variable selection can be carried out simultaneously in a highly efficient way. Additionally, both the mean and the variance of a percentage response can be modeled using flexible nonlinear covariate effects. As a consequence, the new method accounts for common problems such as overdispersion and non-binomial variance structures. PMID:23626706
Understanding poisson regression.
Hayat, Matthew J; Higgins, Melinda
2014-04-01
Nurse investigators often collect study data in the form of counts. Traditional methods of data analysis have historically approached analysis of count data either as if the count data were continuous and normally distributed or with dichotomization of the counts into the categories of occurred or did not occur. These outdated methods for analyzing count data have been replaced with more appropriate statistical methods that make use of the Poisson probability distribution, which is useful for analyzing count data. The purpose of this article is to provide an overview of the Poisson distribution and its use in Poisson regression. Assumption violations for the standard Poisson regression model are addressed with alternative approaches, including addition of an overdispersion parameter or negative binomial regression. An illustrative example is presented with an application from the ENSPIRE study, and regression modeling of comorbidity data is included for illustrative purposes.
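A minimal sketch of Poisson regression fit by maximum likelihood on hypothetical count data, with a crude Pearson check for the overdispersion the article discusses (all data and coefficients are illustrative):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

rng = np.random.default_rng(2)
n = 5000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([0.2, 0.7])
y = rng.poisson(np.exp(X @ beta_true))        # log link: E[y] = exp(X @ beta)

def negloglik(beta):
    eta = X @ beta
    # Poisson log-likelihood: sum of y*eta - exp(eta) - log(y!)
    return -np.sum(y * eta - np.exp(eta) - gammaln(y + 1))

beta_hat = minimize(negloglik, np.zeros(2), method="BFGS").x

# Crude overdispersion check: Pearson statistic / df should be near 1
# for a well-specified Poisson model
mu_hat = np.exp(X @ beta_hat)
dispersion = np.sum((y - mu_hat) ** 2 / mu_hat) / (n - 2)
print(beta_hat, dispersion)
```

A dispersion statistic well above 1 would suggest the alternatives the article names, such as adding an overdispersion parameter or switching to negative binomial regression.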
Totton, Sarah C; Farrar, Ashley M; Wilkins, Wendy; Bucher, Oliver; Waddell, Lisa A; Wilhelm, Barbara J; McEwen, Scott A; Rajić, Andrijana
2012-10-01
Eating inappropriately prepared poultry meat is a major cause of foodborne salmonellosis. Our objectives were to determine the efficacy of feed and water additives (other than competitive exclusion and antimicrobials) on reducing Salmonella prevalence or concentration in broiler chickens using systematic review-meta-analysis and to explore sources of heterogeneity found in the meta-analysis through meta-regression. Six electronic databases were searched (Current Contents (1999-2009), Agricola (1924-2009), MEDLINE (1860-2009), Scopus (1960-2009), Centre for Agricultural Bioscience (CAB) (1913-2009), and CAB Global Health (1971-2009)), five topic experts were contacted, and the bibliographies of review articles and a topic-relevant textbook were manually searched to identify all relevant research. Study inclusion criteria comprised: English-language primary research investigating the effects of feed and water additives on the Salmonella prevalence or concentration in broiler chickens. Data extraction and study methodological assessment were conducted by two reviewers independently using pretested forms. Seventy challenge studies (n=910 unique treatment-control comparisons), seven controlled studies (n=154), and one quasi-experiment (n=1) met the inclusion criteria. Compared to an assumed control group prevalence of 44 of 1000 broilers, random-effects meta-analysis indicated that the Salmonella cecal colonization in groups with prebiotics (fructooligosaccharide, lactose, whey, dried milk, lactulose, lactosucrose, sucrose, maltose, mannanoligosaccharide) added to feed or water was 15 out of 1000 broilers; with lactose added to feed or water it was 10 out of 1000 broilers; with experimental chlorate product (ECP) added to feed or water it was 21 out of 1000. For ECP the concentration of Salmonella in the ceca was decreased by 0.61 log(10)cfu/g in the treated group compared to the control group. Significant heterogeneity (Cochran's Q-statistic p≤0.10) was observed
78 FR 25853 - Defense Logistics Agency Privacy Program
Federal Register 2010, 2011, 2012, 2013, 2014
2013-05-03
... of the Secretary 32 CFR Part 323 RIN 0790-AI86 Defense Logistics Agency Privacy Program AGENCY: Defense Logistics Agency, DoD. ACTION: Final rule. SUMMARY: The Defense Logistics Agency (DLA) is revising... Defense Logistics Agency's implementation of the Privacy Act of 1974, as amended. In addition, DLA...
NASA Space Rocket Logistics Challenges
NASA Technical Reports Server (NTRS)
Neeley, James R.; Jones, James V.; Watson, Michael D.; Bramon, Christopher J.; Inman, Sharon K.; Tuttle, Loraine
2014-01-01
The Space Launch System (SLS) is the new NASA heavy-lift launch vehicle and is scheduled for its first mission in 2017. The goal of the first mission, which will be uncrewed, is to demonstrate the integrated system performance of the SLS rocket and spacecraft before a crewed flight in 2021. SLS has many of the same logistics challenges as any other large-scale program. Common logistics concerns for SLS include the integration of geographically separated discrete programs, multiple prime contractors with distinct and different goals, schedule pressures, and funding constraints. However, SLS also faces unique challenges. The new program is a confluence of new and heritage hardware, with heritage hardware constituting seventy-five percent of the program. This unique approach to design makes logistics concerns such as commonality especially problematic. Additionally, a very low manifest rate of one flight every four years makes logistics comparatively expensive. That, along with the SLS architecture being developed using a block-upgrade evolutionary approach, exacerbates long-range planning for supportability considerations. These common and unique logistics challenges must be clearly identified and tackled for SLS to be a successful program. This paper will address the common and unique challenges facing the SLS program, along with the analysis and decisions the NASA logistics engineers are making to mitigate the threats posed by each.
Survival Data and Regression Models
NASA Astrophysics Data System (ADS)
Grégoire, G.
2014-12-01
We start this chapter by introducing some basic elements for the analysis of censored survival data. Then we focus on right-censored data and develop two types of regression models. The first concerns the so-called accelerated failure time (AFT) models, which are parametric models where a function of a parameter depends linearly on the covariables. The second is a semiparametric model, where the covariables enter in a multiplicative form in the expression of the hazard rate function. The main statistical tool for analysing these regression models is the maximum likelihood methodology; although we recall some essential results of ML theory, we refer to the chapter "Logistic Regression" for a more detailed presentation.
Incremental logistic regression for customizing automatic diagnostic models.
Tortajada, Salvador; Robles, Montserrat; García-Gómez, Juan Miguel
2015-01-01
In the last decades, and following the new trends in medicine, statistical learning techniques have been used to develop automatic diagnostic models for aiding clinical experts through the use of Clinical Decision Support Systems. The development of these models requires a large, representative amount of data, which is commonly obtained from one hospital or a group of hospitals after an expensive and time-consuming gathering, preprocessing, and validation of cases. After its development, the model has to overcome an external validation that is often carried out in a different hospital or health center. The experience is that such models often underperform expectations. Furthermore, sending and storing patient data requires ethical approval and patient consent. For these reasons, we introduce an incremental learning algorithm based on the Bayesian inference approach that may allow us to build an initial model with a smaller number of cases and update it incrementally when new data are collected, or even perform a new calibration of a model from a different center by using a reduced number of cases. The performance of our algorithm is demonstrated by employing different benchmark datasets and a real brain tumor dataset; we compare its performance to a previous incremental algorithm and a non-incremental Bayesian model, showing that the algorithm is independent of the data model, iterative, and has good convergence.
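The incremental idea can be sketched with a Laplace-approximation update for Bayesian logistic regression: the Gaussian posterior fitted on one batch of cases becomes the prior for the next batch. This is a generic illustration on simulated data, not the authors' algorithm:

```python
import numpy as np
from scipy.special import expit

def map_update(X, y, prior_mean, prior_prec, n_iter=50):
    """One Newton-method MAP fit of a Gaussian-prior logistic model;
    returns the Laplace-approximate posterior (mean, precision)."""
    beta = prior_mean.copy()
    for _ in range(n_iter):
        p = expit(X @ beta)
        grad = X.T @ (y - p) - prior_prec @ (beta - prior_mean)
        hess = X.T @ (X * (p * (1 - p))[:, None]) + prior_prec
        beta = beta + np.linalg.solve(hess, grad)
    p = expit(X @ beta)
    post_prec = X.T @ (X * (p * (1 - p))[:, None]) + prior_prec
    return beta, post_prec

rng = np.random.default_rng(3)
beta_true = np.array([-0.5, 1.5])
mean, prec = np.zeros(2), np.eye(2) * 0.01   # weak initial prior

# Cases arrive in batches; the posterior after each batch is the next prior
for _ in range(5):
    X = np.column_stack([np.ones(400), rng.normal(size=400)])
    y = rng.binomial(1, expit(X @ beta_true))
    mean, prec = map_update(X, y, mean, prec)
print(mean)  # approaches beta_true as batches accumulate
```

An initial model built from a small first batch can thus be recalibrated with a reduced number of new cases, since the prior carries the information already learned.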
Assessing Lake Trophic Status: A Proportional Odds Logistic Regression Model
Lake trophic state classifications are good predictors of ecosystem condition and are indicative of both ecosystem services (e.g., recreation and aesthetics), and disservices (e.g., harmful algal blooms). Methods for classifying trophic state are based off the foundational work o...
Analyzing Army Reserve Unsatisfactory Participants through Logistic Regression
2012-06-08
interactions, and referenced Dr. Fredric Jablin’s four stages of socialization (anticipatory socialization, organizational encounter, metamorphosis ...Soldier’s encounter is positive he begins the metamorphosis stage, which “marks the newcomer’s alignment of expectations to those of the organization...anticipatory socialization, encounter, and metamorphosis stages do not have enough time in the unit to make educated decisions. The metamorphosis stage should
Measuring NAVSPASUR Sensor Performance using Logistic Regression Models
1992-09-01
consists of three transmitting and six receiving stations or sites. The three transmitting sites, located in Lake Kickapoo TX, Gila River AZ and Jordan...frequency of 216.980 MHz. The largest of the transmitters, located in Lake Kickapoo TX, operates at 810 kW of output power and consists of eighteen...parameters for the Lake Kickapoo transmitter and the San Diego receiver’ will be used. The following parameters are given [Ref. 1: p. 6-7]: P, = 6.76 X 10
Interpreting Multiple Logistic Regression Coefficients in Prospective Observational Studies
1982-11-01
prompted close examination of the issue at a workshop on hypertriglyceridemia where some of the cautions and perspectives given in this paper were...longevity," Circulation, 34, 679-697, (1966). 19. Lippel, K., Tyroler, H., Eder, H., Gotto, A., and Vahouny, G. "Rela- tionship of hypertriglyceridemia
Background or Experience? Using Logistic Regression to Predict College Retention
ERIC Educational Resources Information Center
Synco, Tracee M.
2012-01-01
Tinto, Astin and countless others have researched the retention and attrition of students from college for more than thirty years. However, the six year graduation rate for all first-time full-time freshmen for the 2002 cohort was 57%. This study sought to determine the retention variables that predicted continued enrollment of entering freshmen…
NASA Technical Reports Server (NTRS)
Tellado, Joseph
2014-01-01
The presentation contains a status of KSC ISS Logistics Operations. It presents current top-level ISS logistics tasks being conducted at KSC, current International Partner activities, the hardware processing flow focusing on late Stow operations, a list of KSC logistics POCs, and a backup list of logistics launch site services. This presentation is being given at the annual International Space Station (ISS) Multi-lateral Logistics Maintenance Control Panel meeting to be held in Turin, Italy during the week of May 13-16. The presentation content doesn't contain any potential lessons learned.
ERIC Educational Resources Information Center
Pedrini, D. T.; Pedrini, Bonnie C.
Regression, another mechanism studied by Sigmund Freud, has had much research, e.g., hypnotic regression, frustration regression, schizophrenic regression, and infra-human-animal regression (often directly related to fixation). Many investigators worked with hypnotic age regression, which has a long history, going back to Russian reflexologists.…
Liang, Shuyan; Hu, Jun; Xie, Yuanyuan; Zhou, Qing; Zhu, Yanhong; Yang, Xiangliang
2016-01-01
Cancer immunotherapy based on nanodelivery systems has shown potential for treatment of various malignancies, owing to the benefits of tumor targeting of nanoparticles. However, induction of a potent T-cell immune response against tumors still remains a challenge. In this study, polyethylenimine-modified carboxyl-styrene/acrylamide (PS) copolymer nano-spheres were developed as a delivery system of unmethylated cytosine-phosphate-guanine (CpG) oligodeoxynucleotides and transforming growth factor-beta (TGF-β) receptor I inhibitors for cancer immunotherapy. TGF-β receptor I inhibitors (LY2157299, LY) were encapsulated to the PS via hydrophobic interaction, while CpG oligodeoxynucleotides were loaded onto the PS through electrostatic interaction. Compared to the control group, tumor inhibition in the PS-LY/CpG group was up to 99.7% without noticeable toxicity. The tumor regression may be attributed to T-cell activation and amplification in mouse models. The results highlight the additive effect of CpG and TGF-β receptor I inhibitors co-delivered in cancer immunotherapy. PMID:28008250
2006-02-07
Logistics modernization, end-to-end logistics, logistics transformation, and just-in-time logistics -- whichever name it is being called today, it is...demonstrations as to what these systems will do for the end user, consumer confidence will increase and "just-in-time" logistics will lead to a lighter and leaner combat logistics support.
Focused Logistics: Putting Agility in Agile Logistics
2011-05-19
envisioned an agile and adaptable logistics system built around common situational understanding. The Focused Logistics concept specified the requirement to...the tracking of resources moving through the TD network. As a result, distribution centers tagged inbound
ERIC Educational Resources Information Center
Walton, Joseph M.; And Others
1978-01-01
Ridge regression is an approach to the problem of large standard errors of regression estimates of intercorrelated regressors. The effect of ridge regression on the estimated squared multiple correlation coefficient is discussed and illustrated. (JKS)
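The ridge estimator itself is a one-line modification of ordinary least squares. A minimal sketch on hypothetical intercorrelated regressors (the penalty value is an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(8)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)       # intercorrelated regressors
X = np.column_stack([x1, x2])
y = 2.0 * x1 + 1.0 * x2 + rng.normal(size=n)

def ridge(X, y, k):
    # Ridge estimator: (X'X + kI)^{-1} X'y; k = 0 recovers OLS
    return np.linalg.solve(X.T @ X + k * np.eye(X.shape[1]), X.T @ y)

ols, shrunk = ridge(X, y, 0.0), ridge(X, y, 10.0)
print(ols, shrunk)
```

The ridge solution norm shrinks monotonically as the penalty k grows, trading a small bias for much smaller coefficient variance, which is why it tames the large standard errors the abstract describes.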
The "Smarter Regression" Add-In for Linear and Logistic Regression in Excel
2007-07-01
only Visual Basic, the built-in programming language of Excel (Walkenbach, 1999). We wanted to avoid the use of external Dynamic Linked Libraries...least two different ways of entering results into workbook cells in Visual Basic. One is to establish an array in Visual Basic, fill up the elements of
Wrong Signs in Regression Coefficients
NASA Technical Reports Server (NTRS)
McGee, Holly
1999-01-01
When using parametric cost estimation, it is important to note the possibility of the regression coefficients having the wrong sign. A wrong sign is defined as a sign on the regression coefficient opposite to the researcher's intuition and experience. Some possible causes for the wrong sign discussed in this paper are a small range of x's, leverage points, missing variables, multicollinearity, and computational error. Additionally, techniques for determining the cause of the wrong sign are given.
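The multicollinearity cause can be demonstrated numerically: with nearly collinear predictors, the individual coefficients, and hence their signs, are poorly determined even though a linear combination of them remains well estimated. A sketch on hypothetical data (the variables are illustrative, not a cost-estimation dataset):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)     # nearly collinear with x1
y = 1.0 + 2.0 * x1 + rng.normal(size=n)      # x2 truly has no effect

X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Variance inflation factor for x2: 1 / (1 - R^2 of x2 regressed on x1).
# A huge VIF means individual coefficients (and their signs) are unstable.
r = np.corrcoef(x1, x2)[0, 1]
vif = 1.0 / (1.0 - r ** 2)
print(beta, vif)
```

Note that beta[1] + beta[2] stays close to the true total effect of 2 even when the individual coefficients wander; only the split between the two collinear predictors, and thus either sign, is unreliable.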
NASA Technical Reports Server (NTRS)
1963-01-01
This document has been prepared to incorporate all presentation aid material, together with some explanatory text, used during an oral briefing on the Nuclear Lunar Logistics System given at the George C. Marshall Space Flight Center, National Aeronautics and Space Administration, on 18 July 1963. The briefing and this document are intended to present the general status of the NERVA (Nuclear Engine for Rocket Vehicle Application) nuclear rocket development, the characteristics of certain operational NERVA-class engines, and appropriate technical and schedule information. Some of the information presented herein is preliminary in nature and will be subject to further verification, checking and analysis during the remainder of the study program. In addition, more detailed information will be prepared in many areas for inclusion in a final summary report. This work has been performed by REON, a division of Aerojet-General Corporation under Subcontract 74-10039 from the Lockheed Missiles and Space Company. The presentation and this document have been prepared in partial fulfillment of the provisions of the subcontract. From the inception of the NERVA program in July 1961, the stated emphasis has centered around the demonstration of the ability of a nuclear rocket to perform safely and reliably in the space environment, with the understanding that the assignment of a mission (or missions) would place undue emphasis on performance and operational flexibility. However, all were aware that the ultimate justification for the development program must lie in the application of the nuclear propulsion system to the national space objectives.
Introduction to the use of regression models in epidemiology.
Bender, Ralf
2009-01-01
Regression modeling is one of the most important statistical techniques used in analytical epidemiology. By means of regression models the effect of one or several explanatory variables (e.g., exposures, subject characteristics, risk factors) on a response variable such as mortality or cancer can be investigated. From multiple regression models, adjusted effect estimates can be obtained that take the effect of potential confounders into account. Regression methods can be applied in all epidemiologic study designs so that they represent a universal tool for data analysis in epidemiology. Different kinds of regression models have been developed in dependence on the measurement scale of the response variable and the study design. The most important methods are linear regression for continuous outcomes, logistic regression for binary outcomes, Cox regression for time-to-event data, and Poisson regression for frequencies and rates. This chapter provides a nontechnical introduction to these regression models with illustrating examples from cancer research.
Exploration Mission Benefits From Logistics Reduction Technologies
NASA Technical Reports Server (NTRS)
Broyan, James Lee, Jr.; Ewert, Michael K.; Schlesinger, Thilini
2016-01-01
Technologies that reduce logistical mass, volume, and the crew time dedicated to logistics management become more important as exploration missions extend further from the Earth. Even modest reductions in logistical mass can have a significant impact because it also reduces the packaging burden. NASA's Advanced Exploration Systems' Logistics Reduction Project is developing technologies that can directly reduce the mass and volume of crew clothing and metabolic waste collection. Also, cargo bags have been developed that can be reconfigured for crew outfitting, and trash processing technologies are under development to increase habitable volume and improve protection against solar storm events. Additionally, Mars class missions are sufficiently distant that even logistics management without resupply can be problematic due to the communication time delay with Earth. Although exploration vehicles are launched with all consumables and logistics in a defined configuration, the configuration continually changes as the mission progresses. Traditionally significant ground and crew time has been required to understand the evolving configuration and to help locate misplaced items. For key mission events and unplanned contingencies, the crew will not be able to rely on the ground for logistics localization assistance. NASA has been developing a radio-frequency-identification autonomous logistics management system to reduce crew time for general inventory and enable greater crew self-response to unplanned events when a wide range of items may need to be located in a very short time period. This paper provides a status of the technologies being developed and their mission benefits for exploration missions.
NASA Astrophysics Data System (ADS)
Cempírek, Václav; Nachtigall, Petr; Široký, Jaromír
2016-12-01
This paper deals with the security of logistic chains with respect to incorrect declaration of transported goods, fraudulent transport and forwarding companies, and possible threats caused by political influences. The main goal of this paper is to highlight the possible increase in logistic costs due to these fraudulent threats. An analysis of technological processes is provided, and the increase of transport times caused by the possible threats is evaluated in terms of economic costs. In the conclusion, possible threats to companies' efficiency in logistics due to increased costs, means of transport and human resources are pointed out.
NASA Astrophysics Data System (ADS)
Chang, Yoon S.; Oh, Chang H.
Nowadays, environmental management has become a critical business consideration for companies to survive many regulations and tough business requirements. Most world-leading companies are now aware that environment-friendly technology and management are critical to the sustainable growth of the company. The environment market has seen continuous growth, marking 532B in 2000 and 590B in 2004, and is expected to reach 700B in 2010. Environment-friendly efforts can be seen in almost all aspects of business operations, and such trends are easily found in the logistics area. Green logistics aims to make environmentally friendly decisions throughout a product lifecycle. Therefore, for the success of green logistics, it is critical to have real-time tracking capability on the product throughout the product lifecycle and a smart solution service architecture. In this chapter, we introduce an RFID-based green logistics solution and service.
Lunar Commercial Mining Logistics
NASA Astrophysics Data System (ADS)
Kistler, Walter P.; Citron, Bob; Taylor, Thomas C.
2008-01-01
Innovative commercial logistics is required for supporting lunar resource recovery operations and assisting larger consortiums in lunar mining, base operations, camp consumables and the future commercial sales of propellant over the next 50 years. To assist in lowering overall development costs, ``reuse'' innovation is suggested in reusing modified LTS in-space hardware for use on the moon's surface, developing product lines for recovered gases, regolith construction materials, surface logistics services, and other services as they evolve (Kistler, Citron and Taylor, 2005). Surface logistics architecture is designed to have sustainable growth over 50 years, financed by private sector partners and capable of cargo transportation in both directions in support of lunar development and resource recovery development. The author's perspective on the importance of logistics is based on five years experience at remote sites on Earth, where remote base supply chain logistics didn't always work (Taylor, 1975a). The planning and control of the flow of goods and materials to and from the moon's surface may be the most complicated logistics challenge yet to be attempted. Affordability is tied to the innovation and ingenuity used to keep the transportation and surface operations costs as low as practical. Eleven innovations are proposed and discussed by an entrepreneurial commercial space startup team that has had success in introducing commercial space innovation and reducing the cost of space operations in the past. This logistics architecture offers NASA and other exploring nations a commercial alternative for non-essential cargo. Five transportation technologies and eleven surface innovations create the logistics transportation system discussed.
Space Station fluid management logistics
NASA Technical Reports Server (NTRS)
Dominick, Sam M.
1990-01-01
Viewgraphs and discussion on space station fluid management logistics are presented. Topics covered include: fluid management logistics - issues for Space Station Freedom evolution; current fluid logistics approach; evolution of Space Station Freedom fluid resupply; launch vehicle evolution; ELV logistics system approach; logistics carrier configuration; expendable fluid/propellant carrier description; fluid carrier design concept; logistics carrier orbital operations; carrier operations at space station; summary/status of orbital fluid transfer techniques; Soviet progress tanker system; and Soviet propellant resupply system observations.
Regressive systemic sclerosis.
Black, C; Dieppe, P; Huskisson, T; Hart, F D
1986-01-01
Systemic sclerosis is a disease which usually progresses or reaches a plateau with persistence of symptoms and signs. Regression is extremely unusual. Four cases of established scleroderma are described in which regression is well documented. The significance of this observation and possible mechanisms of disease regression are discussed. PMID:3718012
Tharrington, Arnold N.
2015-09-09
The NCCS Regression Test Harness is a software package that provides a framework to perform regression and acceptance testing on NCCS High Performance Computers. The package is written in Python and has only the dependency of a Subversion repository to store the regression tests.
Exploration Mission Benefits From Logistics Reduction Technologies
NASA Technical Reports Server (NTRS)
Broyan, James Lee, Jr.; Schlesinger, Thilini; Ewert, Michael K.
2016-01-01
Technologies that reduce logistical mass, volume, and the crew time dedicated to logistics management become more important as exploration missions extend further from the Earth. Even modest reductions in logistical mass can have a significant impact because it also reduces the packaging burden. NASA's Advanced Exploration Systems' Logistics Reduction Project is developing technologies that can directly reduce the mass and volume of crew clothing and metabolic waste collection. Also, cargo bags have been developed that can be reconfigured for crew outfitting, and trash processing technologies are under development to increase habitable volume and improve protection against solar storm events. Additionally, Mars class missions are sufficiently distant that even logistics management without resupply can be problematic due to the communication time delay with Earth. Although exploration vehicles are launched with all consumables and logistics in a defined configuration, the configuration continually changes as the mission progresses. Traditionally significant ground and crew time has been required to understand the evolving configuration and locate misplaced items. For key mission events and unplanned contingencies, the crew will not be able to rely on the ground for logistics localization assistance. NASA has been developing a radio-frequency-identification autonomous logistics management system to reduce crew time for general inventory and enable greater crew self-response to unplanned events when a wide range of items may need to be located in a very short time period. This paper provides a status of the technologies being developed and their mission benefits for exploration missions.
Ehrsam, Eric; Kallini, Joseph R.; Lebas, Damien; Modiano, Philippe; Cotten, Hervé
2016-01-01
Fully regressive melanoma is a phenomenon in which the primary cutaneous melanoma becomes completely replaced by fibrotic components as a result of host immune response. Although 10 to 35 percent of cases of cutaneous melanomas may partially regress, fully regressive melanoma is very rare; only 47 cases have been reported in the literature to date. All of the cases of fully regressive melanoma reported in the literature were diagnosed in conjunction with metastasis in the patient. The authors describe a case of fully regressive melanoma without any metastases at the time of its diagnosis. Characteristic findings on dermoscopy, as well as the absence of melanoma on final biopsy, confirmed the diagnosis. PMID:27672418
Logistics of Guinea worm disease eradication in South Sudan.
Jones, Alexander H; Becknell, Steven; Withers, P Craig; Ruiz-Tiben, Ernesto; Hopkins, Donald R; Stobbelaar, David; Makoy, Samuel Yibi
2014-03-01
From 2006 to 2012, the South Sudan Guinea Worm Eradication Program reduced new Guinea worm disease (dracunculiasis) cases by over 90%, despite substantial programmatic challenges. Program logistics have played a key role in program achievements to date. The program uses disease surveillance and program performance data and integrated technical-logistical staffing to maintain flexible and effective logistical support for active community-based surveillance and intervention delivery in thousands of remote communities. Lessons learned from logistical design and management can resonate across similar complex surveillance and public health intervention delivery programs, such as mass drug administration for the control of neglected tropical diseases and other disease eradication programs. Logistical challenges in various public health scenarios and the pivotal contribution of logistics to Guinea worm case reductions in South Sudan underscore the need for additional inquiry into the role of logistics in public health programming in low-income countries.
Assessing risk factors for periodontitis using regression
NASA Astrophysics Data System (ADS)
Lobo Pereira, J. A.; Ferreira, Maria Cristina; Oliveira, Teresa
2013-10-01
Multivariate statistical analysis is indispensable to assess the associations and interactions between different factors and the risk of periodontitis. Among others, regression analysis is a statistical technique widely used in healthcare to investigate and model the relationship between variables. In our work we study the impact of socio-demographic, medical and behavioral factors on periodontal health. Using linear and logistic regression models, we assess the relevance of the following independent variables (IVs) as risk factors for periodontitis: Age, Gender, Diabetic Status, Education, Smoking Status and Plaque Index. The multiple linear regression model was built to evaluate the influence of the IVs on mean Attachment Loss (AL), yielding the regression coefficients along with the p-values from the corresponding significance tests. The classification of a case (individual) adopted in the logistic model was the extent of destruction of periodontal tissues, defined as an Attachment Loss of at least 4 mm in at least 25% (AL≥4mm/≥25%) of the sites surveyed. The association measures include the Odds Ratios together with the corresponding 95% confidence intervals.
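The odds ratios and confidence intervals mentioned above come directly from a fitted logistic model: OR = exp(beta), with 95% CI exp(beta ± 1.96 SE). A minimal Newton-Raphson sketch on hypothetical data (the two predictors merely stand in for covariates such as age or smoking status; they are not the study's variables):

```python
import numpy as np
from scipy.special import expit

rng = np.random.default_rng(4)
n = 3000
# Hypothetical covariates: one continuous, one binary
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.binomial(1, 0.3, n)])
beta_true = np.array([-1.0, 0.4, 0.8])
y = rng.binomial(1, expit(X @ beta_true))

# Newton-Raphson maximum likelihood fit of the logistic model
beta = np.zeros(3)
for _ in range(25):
    p = expit(X @ beta)
    hess = X.T @ (X * (p * (1 - p))[:, None])   # observed information
    beta += np.linalg.solve(hess, X.T @ (y - p))

# Standard errors from the inverse information matrix, then ORs and CIs
se = np.sqrt(np.diag(np.linalg.inv(hess)))
or_hat = np.exp(beta)
ci_low, ci_high = np.exp(beta - 1.96 * se), np.exp(beta + 1.96 * se)
for j in (1, 2):
    print(f"OR={or_hat[j]:.2f}  95% CI ({ci_low[j]:.2f}, {ci_high[j]:.2f})")
```

An OR above 1 with a CI excluding 1 would flag the corresponding IV as a risk factor at the 5% level.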
Improved Regression Calibration
ERIC Educational Resources Information Center
Skrondal, Anders; Kuha, Jouni
2012-01-01
The likelihood for generalized linear models with covariate measurement error cannot in general be expressed in closed form, which makes maximum likelihood estimation taxing. A popular alternative is regression calibration which is computationally efficient at the cost of inconsistent estimation. We propose an improved regression calibration…
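Regression calibration in its simplest form replaces the error-prone measurement W by an estimate of E[X | W] obtained from a validation subsample where the true covariate is observed, then fits the outcome model as usual. A sketch under classical additive measurement error (all values are hypothetical; the linear outcome model keeps the bias easy to see):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5000
x = rng.normal(size=n)                       # true covariate
w = x + rng.normal(scale=0.8, size=n)        # error-prone measurement
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n)

naive = np.polyfit(w, y, 1)[0]               # attenuated slope

# Regression calibration: estimate E[X | W] from a validation subsample
# where the true x is observed, then refit with the calibrated values
val = rng.choice(n, size=500, replace=False)
g1, g0 = np.polyfit(w[val], x[val], 1)
x_cal = g0 + g1 * w
calibrated = np.polyfit(x_cal, y, 1)[0]
print(naive, calibrated)  # naive well below 2; calibrated close to 2
```

The correction is consistent for linear models but only approximate for nonlinear ones, which is the inconsistency the abstract's improved calibration targets.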
Logistics Assessment Guidebook
2011-07-01
significant. Another case involved the integration of an aircraft and ground vehicle system. Integration issues were identified early in the design...for height and width; insufficient power requirements to support maintenance actions; and insufficient design of the autonomic logistics system...evaluation at IOT&E and thus on track for Initial Operational Capability (IOC). If not, a plan is in place to ensure they are met (ref DoDI 5000; CJCSM
Logistics Software Implementation.
1986-02-28
to obtain benchmark performance results. The list presented in Appendix A is not final and should be considered only as the initial worksheet. ... has been used in more applications and its implementation is better understood. All of the preliminary tests have indicated excellent ... environment. The fifth-generation languages, such as PROLOG, also provide excellent dynamic data base support. The proposed approach for the Logistics
2014-12-04
2014 by: , Director, Graduate Degree Programs Robert F. Baumann, PhD. The opinions and conclusions expressed herein are those of the ... Washington, DC: Center for Military History, 1993), 244. 24 Spencer C. Tucker and Priscilla Mary Roberts, Encyclopedia of World War I (Santa Barbara ... Theater of Operations. See Richard M. Leighton and Robert W. Coakley, Global Logistics and Strategy 1940-1943 (Washington, DC: Center for Military
Relative risk regression analysis of epidemiologic data.
Prentice, R L
1985-11-01
Relative risk regression methods are described. These methods provide a unified approach to a range of data analysis problems in environmental risk assessment and in the study of disease risk factors more generally. Relative risk regression methods are most readily viewed as an outgrowth of Cox's regression and life model. They can also be viewed as a regression generalization of more classical epidemiologic procedures, such as that due to Mantel and Haenszel. In the context of an epidemiologic cohort study, relative risk regression methods extend conventional survival data methods and binary response (e.g., logistic) regression models by taking explicit account of the time to disease occurrence while allowing arbitrary baseline disease rates, general censorship, and time-varying risk factors. This latter feature is particularly relevant to many environmental risk assessment problems wherein one wishes to relate disease rates at a particular point in time to aspects of a preceding risk factor history. Relative risk regression methods also adapt readily to time-matched case-control studies and to certain less standard designs. The uses of relative risk regression methods are illustrated and the state of development of these procedures is discussed. It is argued that asymptotic partial likelihood estimation techniques are now well developed in the important special case in which the disease rates of interest have interpretations as counting process intensity functions. Estimation of relative risks processes corresponding to disease rates falling outside this class has, however, received limited attention. The general area of relative risk regression model criticism has, as yet, not been thoroughly studied, though a number of statistical groups are studying such features as tests of fit, residuals, diagnostics and graphical procedures. Most such studies have been restricted to exponential form relative risks as have simulation studies of relative risk estimation
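The classical Mantel-Haenszel procedure that relative risk regression generalizes can be sketched directly: a common odds ratio pooled over strata of 2x2 tables. The strata below are invented for illustration, with counts given as (exposed cases, exposed controls, unexposed cases, unexposed controls).

```python
# Mantel-Haenszel common odds ratio over stratified 2x2 tables:
# OR_MH = sum_k(a_k d_k / n_k) / sum_k(b_k c_k / n_k)
strata = [
    (10, 20, 5, 40),   # stratum 1: within-stratum OR = (10*40)/(20*5) = 4
    (8, 12, 4, 24),    # stratum 2: within-stratum OR = (8*24)/(12*4) = 4
]

num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
or_mh = num / den
```

When every stratum has the same odds ratio, as here, the pooled estimate reproduces it exactly.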
Interaction Models for Functional Regression
USSET, JOSEPH; STAICU, ANA-MARIA; MAITY, ARNAB
2015-01-01
A functional regression model with a scalar response and multiple functional predictors is proposed that accommodates two-way interactions in addition to their main effects. The proposed estimation procedure models the main effects using penalized regression splines, and the interaction effect by a tensor product basis. Extensions to generalized linear models and data observed on sparse grids or with measurement error are presented. A hypothesis testing procedure for the functional interaction effect is described. The proposed method can be easily implemented through existing software. Numerical studies show that fitting an additive model in the presence of interaction leads to both poor estimation performance and lost prediction power, while fitting an interaction model where there is in fact no interaction leads to negligible losses. The methodology is illustrated on the AneuRisk65 study data. PMID:26744549
George: Gaussian Process regression
NASA Astrophysics Data System (ADS)
Foreman-Mackey, Daniel
2015-11-01
George is a fast and flexible library, implemented in C++ with Python bindings, for Gaussian Process regression. It is useful for accounting for correlated noise in astronomical datasets, including those used for transiting exoplanet discovery and characterization and for stellar population modeling.
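The Gaussian Process regression that George implements can be sketched with the textbook posterior-mean formula (this toy version does not use George's actual API; the squared-exponential kernel, length scale, and simulated data are illustrative assumptions).

```python
import numpy as np

# GP regression sketch: posterior mean mu(x*) = K(x*, X) [K(X, X) + s^2 I]^{-1} y
def sq_exp_kernel(a, b, length=1.0):
    """Squared-exponential covariance between 1-D input arrays a and b."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length**2)

rng = np.random.default_rng(1)
x_train = np.linspace(0.0, 5.0, 20)
y_train = np.sin(x_train) + rng.normal(0.0, 0.05, x_train.size)

K = sq_exp_kernel(x_train, x_train) + 0.05**2 * np.eye(x_train.size)
alpha = np.linalg.solve(K, y_train)

x_test = np.array([2.5])
mu = sq_exp_kernel(x_test, x_train) @ alpha   # posterior mean at x_test
```

With a dense training grid and small noise, the posterior mean closely tracks the underlying sine function.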
NASA Astrophysics Data System (ADS)
de Carvalho, R. Egydio; Leonel, Edson D.
2016-12-01
A periodic time perturbation is introduced in the logistic map in an attempt to investigate new bifurcation scenarios and new mechanisms leading to chaos. With a squared-sine perturbation we observe that a point attractor reaches the chaotic attractor without following a cascade of bifurcations. One fixed point of the system presents a new bifurcation scenario through an infinite sequence of alternating changes of stability. At the bifurcations, the perturbation does not modify the scaling features observed in the convergence toward the stationary state.
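A time-perturbed logistic map of the kind described can be iterated in a few lines. The exact functional form the authors use may differ; the squared-sine term here, with amplitude eps and frequency w, is only an illustrative guess at that structure.

```python
import math

# Logistic map with a periodic squared-sine time perturbation (illustrative):
# x_{n+1} = r x_n (1 - x_n) + eps * sin(w n)^2
def perturbed_logistic(x, n, r=3.2, eps=0.01, w=0.5):
    return r * x * (1.0 - x) + eps * math.sin(w * n) ** 2

x = 0.4
orbit = []
for n in range(200):
    x = perturbed_logistic(x, n)
    orbit.append(x)
```

For these parameter values the orbit stays inside the unit interval, so the perturbed dynamics remain well defined.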
Bayesian Isotonic Regression Dose-response (BIRD) Model.
Li, Wen; Fu, Haoda
2016-12-21
Understanding the dose-response relationship is a crucial step in drug development. There are a few parametric methods to estimate dose-response curves, such as the Emax model and the logistic model. These parametric models are easy to interpret and, hence, widely used. However, they often require the inclusion of patients on high-dose levels; otherwise, the model parameters cannot be reliably estimated. To obtain robust estimation, nonparametric models are used. However, these models cannot estimate certain important clinical parameters, such as ED50 and Emax. Furthermore, in many therapeutic areas dose-response curves can be assumed to be non-decreasing functions, which creates an additional challenge for nonparametric methods. In this paper, we propose a new Bayesian isotonic regression dose-response model which combines advantages of both parametric and nonparametric models. The ED50 and Emax can be derived from this model. Simulations are provided to evaluate the Bayesian isotonic regression dose-response model's performance against two parametric models. We apply this model to a data set from a diabetes dose-finding study.
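The non-decreasing constraint at the heart of isotonic regression can be sketched with the classical pool-adjacent-violators algorithm (PAVA). This is the frequentist building block, not the paper's Bayesian model; the response values below are invented.

```python
# Pool-adjacent-violators: least-squares non-decreasing fit (equal weights).
def pava(y):
    blocks = [[v, 1] for v in y]            # [block mean, block size]
    i = 0
    while i < len(blocks) - 1:
        if blocks[i][0] > blocks[i + 1][0]:         # violator: pool blocks
            m0, n0 = blocks[i]
            m1, n1 = blocks[i + 1]
            blocks[i] = [(m0 * n0 + m1 * n1) / (n0 + n1), n0 + n1]
            del blocks[i + 1]
            i = max(i - 1, 0)               # re-check the previous pair
        else:
            i += 1
    fit = []
    for mean, size in blocks:
        fit.extend([mean] * size)
    return fit

fitted = pava([0.1, 0.3, 0.2, 0.6, 0.5, 0.9])
```

Each local decrease is averaged away, so the fitted curve is monotone while staying as close as possible to the data in least squares.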
Logistics Lessons Learned in NASA Space Flight
NASA Technical Reports Server (NTRS)
Evans, William A.; DeWeck, Olivier; Laufer, Deanna; Shull, Sarah
2006-01-01
The Vision for Space Exploration sets out a number of goals, involving both strategic and tactical objectives. These include returning the Space Shuttle to flight, completing the International Space Station, and conducting human expeditions to the Moon by 2020. Each of these goals has profound logistics implications. In considering these objectives, a need for a study of NASA logistics lessons learned was recognized. The study endeavors to identify both the needs of space exploration and the challenges encountered in past logistics architectures and in the design of space systems. This study may also be applied as guidance in the development of an integrated logistics architecture for future human missions to the Moon and Mars. This report first summarizes current logistics practices for the Space Shuttle Program (SSP) and the International Space Station (ISS) and examines the practices of manifesting, stowage, inventory tracking, waste disposal, and return logistics. The key finding of this examination is that while the current practices have many positive aspects, there are also several shortcomings: a high level of excess complexity, redundancy of information and lack of a common database, and a large human-in-the-loop component. Later sections of this report describe the methodology and results of our work to systematically gather logistics lessons learned from past and current human spaceflight programs, and to validate these lessons through a survey of the opinions of current space logisticians. To consider the perspectives on logistics lessons, we searched several sources within NASA, including organizations with direct and indirect connections with the system flow in mission planning. We utilized crew debriefs, the John Commonsense lessons repository of the JSC Mission Operations Directorate, and the Skylab Lessons Learned. Additionally, we searched the public version of the Lessons Learned
Estimation of the Regression Effect Using a Latent Trait Model.
ERIC Educational Resources Information Center
Quinn, Jimmy L.
A logistic model was used to generate data to serve as a proxy for an immediate retest from item responses to a fourth grade standardized reading comprehension test of 45 items. Assuming that the actual test may be considered a pretest and the proxy data may be considered a retest, the effect of regression was investigated using a percentage of…
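The kind of logistic latent trait model that can generate proxy item responses is the Rasch (1PL) model: the probability of a correct response depends on the gap between examinee ability and item difficulty. The ability and difficulty values below are illustrative, not taken from the study.

```python
import math

# Rasch (1PL) item response model: P(correct) = 1 / (1 + exp(-(theta - b)))
def p_correct(theta, difficulty):
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

# Expected score of an examinee with ability theta = 0.5 on a short
# 5-item test with evenly spaced difficulties.
difficulties = [-1.0, -0.5, 0.0, 0.5, 1.0]
expected_score = sum(p_correct(0.5, b) for b in difficulties)
```

Simulating Bernoulli draws from these probabilities yields item responses that can serve as a proxy retest, which is the role the generated data play above.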
Ridge Regression: A Regression Procedure for Analyzing Correlated Independent Variables
ERIC Educational Resources Information Center
Rakow, Ernest A.
1978-01-01
Ridge regression is a technique used to ameliorate the problem of highly correlated independent variables in multiple regression analysis. This paper explains the fundamentals of ridge regression and illustrates its use. (JKS)
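The fundamentals of ridge regression can be sketched with its closed form, beta(k) = (X'X + kI)^{-1} X'y: adding k to the diagonal stabilizes the solve when predictors are nearly collinear. The data below are simulated to make the collinearity extreme.

```python
import numpy as np

# Ridge estimator on two almost perfectly correlated predictors.
rng = np.random.default_rng(2)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.01, size=n)     # nearly collinear with x1
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(scale=0.1, size=n)  # true coefficients (1, 1)

def ridge(X, y, k):
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

beta_ols = ridge(X, y, 0.0)    # k = 0: ordinary least squares, unstable here
beta_ridge = ridge(X, y, 1.0)  # k > 0: shrunk toward a stable solution
```

Ridge always shrinks the coefficient norm relative to OLS, while the stable, well-identified combination of the two coefficients (their sum) is left essentially untouched.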
NASA Astrophysics Data System (ADS)
Darnah
2016-04-01
Poisson regression is used when the response variable is count data following the Poisson distribution, which assumes equal dispersion (mean equal to variance). In practice, count data are often over-dispersed or under-dispersed, making Poisson regression inappropriate: it may underestimate the standard errors and overstate the significance of the regression parameters, and consequently give misleading inference about them. This paper suggests the generalized Poisson regression model to handle over-dispersion and under-dispersion in the Poisson regression model. The Poisson regression model and the generalized Poisson regression model are applied to the number of filariasis cases in East Java. According to the Poisson regression model, the factors influencing filariasis are the percentage of families who do not practice clean and healthy living and the percentage of families who do not have a healthy house. Because the Poisson regression model exhibits over-dispersion, we use generalized Poisson regression. The best generalized Poisson regression model shows that the factor influencing filariasis is the percentage of families who do not have a healthy house: each additional 1 percent of families without a healthy house adds one filariasis patient.
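The over-dispersion diagnosis that motivates moving from Poisson to generalized Poisson regression can be sketched on an intercept-only model, where the Poisson MLE is simply the sample mean. The counts below are illustrative, not the East Java filariasis data.

```python
import numpy as np

# Overdispersion check: a Pearson chi-square far above its degrees of
# freedom signals variance exceeding the mean, violating equidispersion.
counts = np.array([0, 0, 1, 2, 2, 3, 5, 9, 12, 20], dtype=float)

mu = counts.mean()                                   # Poisson MLE of the mean
pearson_x2 = float(np.sum((counts - mu) ** 2 / mu))  # Pearson chi-square
dispersion = pearson_x2 / (counts.size - 1)          # >> 1: overdispersed
```

A dispersion estimate well above 1, as here, is exactly the situation where Poisson standard errors are understated and a generalized Poisson (or quasi-Poisson) model is preferable.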
Developing Future Strategic Logistics Leaders
2013-03-01
and our focus on tactical level lessons learned has skewed our emphasis in PME to a very narrow band of excellence. As such our current ... A partnership between senior logistics leaders, PME developers, and personnel managers is essential to constructing and maintaining strategic ... short term and inconsistent PME system that does not produce strategic logistics leaders. The Logistics Corps, sustainment branches, LOO lead for leader
Space Shuttle operational logistics plan
NASA Technical Reports Server (NTRS)
Botts, J. W.
1983-01-01
The Kennedy Space Center plan for logistics to support Space Shuttle Operations and to establish the related policies, requirements, and responsibilities are described. The Directorate of Shuttle Management and Operations logistics responsibilities required by the Kennedy Organizational Manual, and the self-sufficiency contracting concept are implemented. The Space Shuttle Program Level 1 and Level 2 logistics policies and requirements applicable to KSC that are presented in HQ NASA and Johnson Space Center directives are also implemented.
Modern Regression Discontinuity Analysis
ERIC Educational Resources Information Center
Bloom, Howard S.
2012-01-01
This article provides a detailed discussion of the theory and practice of modern regression discontinuity (RD) analysis for estimating the effects of interventions or treatments. Part 1 briefly chronicles the history of RD analysis and summarizes its past applications. Part 2 explains how in theory an RD analysis can identify an average effect of…
Multiple linear regression analysis
NASA Technical Reports Server (NTRS)
Edwards, T. R.
1980-01-01
Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.
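The stepwise idea behind the program can be sketched as forward selection: greedily add the predictor that most reduces the residual sum of squares, stopping when the improvement is negligible. The FORTRAN original applies formal significance tests at a user-specified confidence level; the 1% RSS improvement rule and the simulated data here are simplified stand-ins.

```python
import numpy as np

# Forward stepwise selection sketch. Only predictors 1 and 3 truly enter.
rng = np.random.default_rng(3)
n = 300
X = rng.normal(size=(n, 5))
y = 2.0 * X[:, 1] - 3.0 * X[:, 3] + rng.normal(scale=0.5, size=n)

def rss_with(candidates):
    """Residual sum of squares for intercept + the given predictors."""
    Xc = np.column_stack([np.ones(n)] + [X[:, k] for k in candidates])
    beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    return float(np.sum((y - Xc @ beta) ** 2))

selected = []
remaining = list(range(5))
rss_current = float(np.sum((y - y.mean()) ** 2))
while remaining:
    best = min(remaining, key=lambda j: rss_with(selected + [j]))
    rss_best = rss_with(selected + [best])
    if rss_best > 0.99 * rss_current:   # negligible improvement: stop
        break
    rss_current = rss_best
    selected.append(best)
    remaining.remove(best)
```

The largest effect (predictor 3) enters first, then predictor 1, mirroring how a stepwise procedure retains only the most statistically significant coefficients.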
Explorations in Statistics: Regression
ERIC Educational Resources Information Center
Curran-Everett, Douglas
2011-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This seventh installment of "Explorations in Statistics" explores regression, a technique that estimates the nature of the relationship between two things for which we may only surmise a mechanistic or predictive…
NASA Astrophysics Data System (ADS)
Martínez-Fernández, J.; Chuvieco, E.; Koutsias, N.
2013-02-01
Humans are responsible for most forest fires in Europe, but anthropogenic factors behind these events are still poorly understood. We tried to identify the driving factors of human-caused fire occurrence in Spain by applying two different statistical approaches. Firstly, assuming stationary processes for the whole country, we created models based on multiple linear regression and binary logistic regression to find factors associated with fire density and fire presence, respectively. Secondly, we used geographically weighted regression (GWR) to better understand and explore the local and regional variations of those factors behind human-caused fire occurrence. The number of human-caused fires occurring within a 25-yr period (1983-2007) was computed for each of the 7638 Spanish mainland municipalities, creating a binary variable (fire/no fire) to develop logistic models, and a continuous variable (fire density) to build standard linear regression models. A total of 383 657 fires were registered in the study dataset. The binary logistic model, which estimates the probability of having/not having a fire, successfully classified 76.4% of the total observations, while the ordinary least squares (OLS) regression model explained 53% of the variation of the fire density patterns (adjusted R2 = 0.53). Both approaches confirmed, in addition to forest and climatic variables, the importance of variables related with agrarian activities, land abandonment, rural population exodus and developmental processes as underlying factors of fire occurrence. For the GWR approach, the explanatory power of the GW linear model for fire density using an adaptive bandwidth increased from 53% to 67%, while for the GW logistic model the correctly classified observations improved only slightly, from 76.4% to 78.4%, but significantly according to the corrected Akaike Information Criterion (AICc), from 3451.19 to 3321.19. The results from GWR indicated a significant spatial variation in the local
Astronomical Methods for Nonparametric Regression
NASA Astrophysics Data System (ADS)
Steinhardt, Charles L.; Jermyn, Adam
2017-01-01
I will discuss commonly used techniques for nonparametric regression in astronomy. We find that several of them, particularly running averages and running medians, are generically biased, asymmetric between dependent and independent variables, and perform poorly in recovering the underlying function, even when errors are present only in one variable. We then examine less-commonly used techniques such as Multivariate Adaptive Regressive Splines and Boosted Trees and find them superior in bias, asymmetry, and variance both theoretically and in practice under a wide range of numerical benchmarks. In this context the chief advantage of the common techniques is runtime, which even for large datasets is now measured in microseconds compared with milliseconds for the more statistically robust techniques. This points to a tradeoff between bias, variance, and computational resources which in recent years has shifted heavily in favor of the more advanced methods, primarily driven by Moore's Law. Along these lines, we also propose a new algorithm which has better overall statistical properties than all techniques examined thus far, at the cost of significantly worse runtime, in addition to providing guidance on choosing the nonparametric regression technique most suitable to any specific problem. We then examine the more general problem of errors in both variables and provide a new algorithm which performs well in most cases and lacks the clear asymmetry of existing non-parametric methods, which fail to account for errors in both variables.
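The running median the abstract critiques is simple to state: each point is replaced by the median of a centered window (here of width 3, with truncated windows at the edges). The data are illustrative.

```python
# Running median smoother; the edge handling (truncated windows) is one
# common convention among several.
def running_median(values, half_width=1):
    out = []
    for i in range(len(values)):
        lo = max(0, i - half_width)
        hi = min(len(values), i + half_width + 1)
        window = sorted(values[lo:hi])
        out.append(window[len(window) // 2])
    return out

smoothed = running_median([1.0, 9.0, 2.0, 3.0, 8.0, 4.0])
```

Interior outliers (the 9.0 and 8.0) are suppressed, but the edge behavior and the bias and asymmetry discussed above are exactly why the abstract favors more robust techniques.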
Selected Logistics Models and Techniques.
1984-09-01
TI-59 Programmable Calculator LCC Model ... Unmanned Spacecraft Cost Model ... TABLE OF CONTENTS (Subject Index): LOGISTICS ... LOGISTICS ANALYSIS MODEL/TECHNIQUE DATA. MODEL/TECHNIQUE NAME: TI-59 Programmable Calculator LCC Model. TYPE MODEL: Cost Estimating. DEVELOPED BY:
Mission Benefits Analysis of Logistics Reduction Technologies
NASA Technical Reports Server (NTRS)
Ewert, Michael K.; Broyan, James L.
2012-01-01
Future space exploration missions will need to use less logistical supplies if humans are to live for longer periods away from our home planet. Anything that can be done to reduce initial mass and volume of supplies or reuse or recycle items that have been launched will be very valuable. Reuse and recycling also reduce the trash burden and associated nuisances, such as smell, but require good systems engineering and operations integration to reap the greatest benefits. A systems analysis was conducted to quantify the mass and volume savings of four different technologies currently under development by NASA's Advanced Exploration Systems (AES) Logistics Reduction and Repurposing project. Advanced clothing systems lead to savings by direct mass reduction and increased wear duration. Reuse of logistical items, such as packaging, for a second purpose allows fewer items to be launched. A device known as a heat melt compactor drastically reduces the volume of trash, recovers water and produces a stable tile that can be used instead of launching additional radiation protection. The fourth technology, called trash-to-supply-gas, can benefit a mission by supplying fuel such as methane to the propulsion system. This systems engineering work will help improve logistics planning and overall mission architectures by determining the most effective use, and reuse, of all resources.
Mission Benefits Analysis of Logistics Reduction Technologies
NASA Technical Reports Server (NTRS)
Ewert, Michael K.; Broyan, James Lee, Jr.
2013-01-01
Future space exploration missions will need to use less logistical supplies if humans are to live for longer periods away from our home planet. Anything that can be done to reduce initial mass and volume of supplies or reuse or recycle items that have been launched will be very valuable. Reuse and recycling also reduce the trash burden and associated nuisances, such as smell, but require good systems engineering and operations integration to reap the greatest benefits. A systems analysis was conducted to quantify the mass and volume savings of four different technologies currently under development by NASA's Advanced Exploration Systems (AES) Logistics Reduction and Repurposing project. Advanced clothing systems lead to savings by direct mass reduction and increased wear duration. Reuse of logistical items, such as packaging, for a second purpose allows fewer items to be launched. A device known as a heat melt compactor drastically reduces the volume of trash, recovers water and produces a stable tile that can be used instead of launching additional radiation protection. The fourth technology, called trash-to-gas, can benefit a mission by supplying fuel such as methane to the propulsion system. This systems engineering work will help improve logistics planning and overall mission architectures by determining the most effective use, and reuse, of all resources.
Calculating a Stepwise Ridge Regression.
ERIC Educational Resources Information Center
Morris, John D.
1986-01-01
Although methods for using ordinary least squares regression computer programs to calculate a ridge regression are available, the calculation of a stepwise ridge regression requires a special purpose algorithm and computer program. The correct stepwise ridge regression procedure is given, and a parallel FORTRAN computer program is described.…
Orthogonal Regression: A Teaching Perspective
ERIC Educational Resources Information Center
Carr, James R.
2012-01-01
A well-known approach to linear least squares regression is that which involves minimizing the sum of squared orthogonal projections of data points onto the best fit line. This form of regression is known as orthogonal regression, and the linear model that it yields is known as the major axis. A similar method, reduced major axis regression, is…
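The major axis described above has a closed-form slope: it is the direction of the first principal axis of the (x, y) scatter, minimizing squared perpendicular distances. The four points below are illustrative, chosen so the fit is exact.

```python
import math

# Orthogonal (major axis) regression slope from the corrected sums of
# squares and cross-products:
# slope = (Syy - Sxx + sqrt((Syy - Sxx)^2 + 4 Sxy^2)) / (2 Sxy)
def major_axis_slope(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy**2)) / (2 * sxy)

# Points lying exactly on y = 2x recover slope 2.
slope = major_axis_slope([0, 1, 2, 3], [0, 2, 4, 6])
```

Unlike ordinary least squares, this formula treats x and y symmetrically, which is the teaching point the abstract emphasizes.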
Marginal longitudinal semiparametric regression via penalized splines
Kadiri, M. Al; Carroll, R.J.; Wand, M.P.
2010-01-01
We study the marginal longitudinal nonparametric regression problem and some of its semiparametric extensions. We point out that, while several elaborate proposals for efficient estimation have been proposed, a relative simple and straightforward one, based on penalized splines, has not. After describing our approach, we then explain how Gibbs sampling and the BUGS software can be used to achieve quick and effective implementation. Illustrations are provided for nonparametric regression and additive models. PMID:21037941
Tosteson, Tor D; Buzas, Jeffrey S; Demidenko, Eugene; Karagas, Margaret
2003-04-15
Covariate measurement error is often a feature of scientific data used for regression modelling. The consequences of such errors include a loss of power of tests of significance for the regression parameters corresponding to the true covariates. Power and sample size calculations that ignore covariate measurement error tend to overestimate power and underestimate the actual sample size required to achieve a desired power. In this paper we derive a novel measurement error corrected power function for generalized linear models using a generalized score test based on quasi-likelihood methods. Our power function is flexible in that it is adaptable to designs with a discrete or continuous scalar covariate (exposure) that can be measured with or without error, allows for additional confounding variables and applies to a broad class of generalized regression and measurement error models. A program is described that provides sample size or power for a continuous exposure with a normal measurement error model and a single normal confounder variable in logistic regression. We demonstrate the improved properties of our power calculations with simulations and numerical studies. An example is given from an ongoing study of cancer and exposure to arsenic as measured by toenail concentrations and tap water samples.
Kramer, S.
1996-12-31
In many real-world domains the task of machine learning algorithms is to learn a theory for predicting numerical values. In particular several standard test domains used in Inductive Logic Programming (ILP) are concerned with predicting numerical values from examples and relational and mostly non-determinate background knowledge. However, so far no ILP algorithm except one can predict numbers and cope with nondeterminate background knowledge. (The only exception is a covering algorithm called FORS.) In this paper we present Structural Regression Trees (SRT), a new algorithm which can be applied to the above class of problems. SRT integrates the statistical method of regression trees into ILP. It constructs a tree containing a literal (an atomic formula or its negation) or a conjunction of literals in each node, and assigns a numerical value to each leaf. SRT provides more comprehensible results than purely statistical methods, and can be applied to a class of problems most other ILP systems cannot handle. Experiments in several real-world domains demonstrate that the approach is competitive with existing methods, indicating that the advantages are not at the expense of predictive accuracy.
NASA Astrophysics Data System (ADS)
Yang, Yu-Xiang; Chen, Fei-Yang; Tong, Tong
Based on the characteristics of e-waste reverse logistics, an environmental performance evaluation system for electronic waste reverse logistics enterprises is proposed. We use the fuzzy analytic hierarchy process method to evaluate the system. In addition, this paper analyzes enterprise X as an example to illustrate the evaluation method. It is important to point out the attributes and indexes that should be strengthened during the process of e-waste reverse logistics and to provide guidance to domestic e-waste reverse logistics enterprises.
Logistics in smallpox: the legacy.
Wickett, John; Carrasco, Peter
2011-12-30
Logistics, defined as "the time-related positioning of resources" was critical to the implementation of the smallpox eradication strategy of surveillance and containment. Logistical challenges in the smallpox programme included vaccine delivery, supplies, staffing, vehicle maintenance, and financing. Ensuring mobility was essential as health workers had to travel to outbreaks to contain them. Three examples illustrate a range of logistic challenges which required imagination and innovation. Standard price lists were developed to expedite vehicle maintenance and repair in Bihar, India. Innovative staffing ensured an adequate infrastructure for vehicle maintenance in Bangladesh. The use of disaster relief mechanisms in Somalia provided airlifts, vehicles and funding within 27 days of their initiation. In contrast the Expanded Programme on Immunization (EPI) faces more complex logistical challenges.
What Exactly is Space Logistics?
2011-01-01
information on the space shuttle in the news and have inferred by now that resupply missions to the International Space Station, Hubble Space Telescope repair missions, and satellite deployment missions are space logistics, and you would be technically correct. The science of logistics as applied ... What Exactly is Space Logistics? James C. Breidenbach
Logistics Transformation: The Paradigm Shift
2007-05-14
AUTHOR(S): MAJ Derrick A. Corbett. PERFORMING ORGANIZATION: U.S. Army Command and General Staff College, School of Advanced Military Studies, 250 Gibbon ... logistics transformation be defined. In the 18th Century, the French invented "a third military science which they called Logistique, or Logistics
NASA Technical Reports Server (NTRS)
Kuhl, Mark R.
1990-01-01
Current navigation requirements depend on a geometric dilution of precision (GDOP) criterion. As long as the GDOP stays below a specific value, navigation requirements are met. The GDOP will exceed the specified value when the measurement geometry becomes too collinear. A new signal processing technique, called Ridge Regression Processing, can reduce the effects of nearly collinear measurement geometry; thereby reducing the inflation of the measurement errors. It is shown that the Ridge signal processor gives a consistently better mean squared error (MSE) in position than the Ordinary Least Mean Squares (OLS) estimator. The applicability of this technique is currently being investigated to improve the following areas: receiver autonomous integrity monitoring (RAIM), coverage requirements, availability requirements, and precision approaches.
Logistic Mixed Models to Investigate Implicit and Explicit Belief Tracking
Lages, Martin; Scheel, Anne
2016-01-01
We investigated the proposition of a two-systems Theory of Mind in adults' belief tracking. A sample of N = 45 participants predicted the choice of one of two opponent players after observing several rounds in an animated card game. Three matches of this card game were played, and initial gaze direction on target and subsequent choice predictions were recorded for each belief task and participant. We conducted logistic regressions with mixed effects on the binary data and developed Bayesian logistic mixed models to infer implicit and explicit mentalizing in true belief and false belief tasks. Although logistic regressions with mixed effects predicted the data well, a Bayesian logistic mixed model with latent task- and subject-specific parameters gave a better account of the data. As expected, explicit choice predictions suggested a clear understanding of true and false beliefs (TB/FB). Surprisingly, however, model parameters for initial gaze direction also indicated belief tracking. We discuss why task-specific parameters for initial gaze directions differ from choice predictions yet reflect second-order perspective taking. PMID:27853440
Quantifying effects of post-processing with visual grading regression
NASA Astrophysics Data System (ADS)
Smedby, Örjan; Fredrikson, Mats; De Geer, Jakob; Sandborg, Michael
2012-02-01
For optimization and evaluation of image quality, one can use visual grading experiments, where observers rate some aspect of image quality on an ordinal scale. To take the ordinal character of the data into account, ordinal logistic regression is used in the statistical analysis, an approach known as visual grading regression (VGR). In the VGR model one may include factors such as imaging parameters and post-processing procedures, in addition to patient and observer identity. In a single-image study, 9 radiologists graded 24 cardiac CTA images acquired with ECG-modulated tube current using standard settings (310 mAs), reduced dose (62 mAs) and reduced dose after post-processing. Image quality was assessed using visual grading with five criteria, each with a five-level ordinal scale from 1 (best) to 5 (worst). The VGR model included one term estimating the dose effect (log of the mAs setting) and one term estimating the effect of post-processing. The model predicted that 115 mAs would be required to reach an 80% probability of a score of 1 or 2 for visually sharp reproduction of the heart without the post-processing filter; with the post-processing filter, the corresponding figure would be 86 mAs. Thus, applying the post-processing corresponded to a dose reduction of 25%. For the other criteria, the dose reduction was estimated at 16-26%. Using VGR, it is thus possible to quantify the dose-reduction potential of post-processing filters.
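The 25% figure quoted above follows directly from the two estimated mAs values, since tube-current-time product is proportional to dose at fixed tube voltage:

```python
# Fractional dose saving implied by equal image quality at 86 mAs vs 115 mAs.
dose_reduction = 1.0 - 86.0 / 115.0   # about 0.25, i.e. a 25% reduction
```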
Joint regression analysis of correlated data using Gaussian copulas.
Song, Peter X-K; Li, Mingyao; Yuan, Ying
2009-03-01
This article concerns a new joint modeling approach for correlated data analysis. Utilizing Gaussian copulas, we present a unified and flexible machinery to integrate separate one-dimensional generalized linear models (GLMs) into a joint regression analysis of continuous, discrete, and mixed correlated outcomes. This essentially leads to a multivariate analogue of the univariate GLM theory and hence an efficiency gain in the estimation of regression coefficients. The availability of joint probability models enables us to develop a full maximum likelihood inference. Numerical illustrations are focused on regression models for discrete correlated data, including multidimensional logistic regression models and a joint model for mixed normal and binary outcomes. In the simulation studies, the proposed copula-based joint model is compared to the popular generalized estimating equations, which is a moment-based estimating equation method to join univariate GLMs. Two real-world data examples are used in the illustration.
Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses.
Faul, Franz; Erdfelder, Edgar; Buchner, Axel; Lang, Albert-Georg
2009-11-01
G*Power is a free power analysis program for a variety of statistical tests. We present extensions and improvements of the version introduced by Faul, Erdfelder, Lang, and Buchner (2007) in the domain of correlation and regression analyses. In the new version, we have added procedures to analyze the power of tests based on (1) single-sample tetrachoric correlations, (2) comparisons of dependent correlations, (3) bivariate linear regression, (4) multiple linear regression based on the random predictor model, (5) logistic regression, and (6) Poisson regression. We describe these new features and provide a brief introduction to their scope and handling.
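Setting G*Power's analytic procedures aside, the power of a logistic-regression Wald test can always be approximated by Monte Carlo simulation. The sketch below is such a simulation, not a reproduction of G*Power's algorithms; sample size, effect size and replication count are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Monte Carlo power for the Wald z-test of a single logistic slope.
def logistic_power(n=150, beta1=0.5, alpha_z=1.96, reps=400, seed=2):
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        x = rng.standard_normal(n)
        p = 1 / (1 + np.exp(-beta1 * x))
        y = rng.random(n) < p
        fit = LogisticRegression(C=1e6).fit(x.reshape(-1, 1), y)
        b = fit.coef_[0, 0]
        # Wald standard error from the observed information matrix
        phat = fit.predict_proba(x.reshape(-1, 1))[:, 1]
        X = np.column_stack([np.ones(n), x])
        info = X.T @ (X * (phat * (1 - phat))[:, None])
        se = np.sqrt(np.linalg.inv(info)[1, 1])
        hits += abs(b / se) > alpha_z        # two-sided test at alpha = 0.05
    return hits / reps

print(logistic_power())
```

Running the same function with beta1=0.0 recovers the type I error rate, a useful sanity check on any power simulation.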
Logistics Management: New trends in the Reverse Logistics
NASA Astrophysics Data System (ADS)
Antonyová, A.; Antony, P.; Soewito, B.
2016-04-01
The present level and quality of the environment depend directly on our use of natural resources and on their sustainability. Production activities in particular, and the phenomena associated with them, have a direct impact on the future of our planet. The recycling process, which in large enterprises often becomes an important and integral part of the production program, is usually problematic in small and medium-sized enterprises. We can identify a few factors that have a direct impact on the development and successful application of an effective reverse logistics system. Finding an economically acceptable model of reverse logistics, focused on converting waste materials into renewable energy, is the task in progress.
Regression analysis for solving diagnosis problem of children's health
NASA Astrophysics Data System (ADS)
Cherkashina, Yu A.; Gerget, O. M.
2016-04-01
The paper presents the results of research devoted to the application of statistical techniques, namely regression analysis, to assess the health status of children in the neonatal period based on medical data (hemostatic parameters, blood test parameters, gestational age, vascular endothelial growth factor) measured at 3-5 days of life. A detailed description of the medical data studied is given, and a binary logistic regression procedure is discussed. The main results are presented: a classification table of predicted versus observed values is shown and the overall percentage of correct classification is determined; the regression coefficients are estimated and the general regression equation is written from them. Based on the logistic regression results, ROC analysis was performed: the sensitivity and specificity of the model were calculated and ROC curves were constructed. These mathematical techniques allow diagnosis of children's health with a high quality of recognition. The results contribute to the development of evidence-based medicine and have high practical importance in the professional activity of the author.
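The reported workflow (fit, classification table, sensitivity/specificity, ROC) can be sketched on synthetic data, since the clinical data are not public; the predictors and coefficients below are invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score

rng = np.random.default_rng(3)

# Synthetic stand-in for the clinical predictors.
n = 400
X = rng.standard_normal((n, 3))
p = 1 / (1 + np.exp(-(-0.5 + X @ np.array([1.2, -0.8, 0.5]))))
y = (rng.random(n) < p).astype(int)

model = LogisticRegression().fit(X, y)
prob = model.predict_proba(X)[:, 1]
pred = (prob >= 0.5).astype(int)          # 0.5 cut-off for the classification table

tn, fp, fn, tp = confusion_matrix(y, pred).ravel()
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
accuracy = (tp + tn) / n
auc = roc_auc_score(y, prob)              # area under the ROC curve
print(accuracy, sensitivity, specificity, auc)
```

In practice the classification table and ROC metrics should be computed on held-out data; the in-sample version here only mirrors the quantities the paper reports.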
Operational Logistics: Lessons from the Inchon Landing.
2007-11-02
logistics at the strategic, operational, or tactical levels. An analysis of Operation Chromite, the amphibious landing at Inchon, reveals that logistical...commander's intent, and define the logistics focus of effort demonstrate that operational logistics was a key enabler in Operation Chromite. This analysis
Logistics background study: underground mining
Hanslovan, J. J.; Visovsky, R. G.
1982-02-01
Logistical functions that are normally associated with US underground coal mining are investigated and analyzed. These functions comprise all activities and services that support the producing sections of the mine. The report provides a better understanding of how these functions impact coal production in terms of time, cost, and safety. Major underground logistics activities are analyzed, including: transportation of personnel, supplies, and equipment; transportation of coal and rock; electrical distribution and communications systems; water handling; hydraulics; and ventilation systems. Recommended areas for future research are identified and prioritized.
Continual Improvement in Shuttle Logistics
NASA Technical Reports Server (NTRS)
Flowers, Jean; Schafer, Loraine
1995-01-01
It has been said that Continual Improvement (CI) is difficult to apply to service oriented functions, especially in a government agency such as NASA. However, a constrained budget and increasing requirements are a way of life at NASA Kennedy Space Center (KSC), making it a natural environment for the application of CI tools and techniques. This paper describes how KSC, and specifically the Space Shuttle Logistics Project, a key contributor to KSC's mission, has embraced the CI management approach as a means of achieving its strategic goals and objectives. An overview of how the KSC Space Shuttle Logistics Project has structured its CI effort and examples of some of the initiatives are provided.
Evaluating differential effects using regression interactions and regression mixture models
Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung
2015-01-01
Research increasingly emphasizes understanding differential effects. This paper focuses on understanding regression mixture models, a relatively new statistical method for assessing differential effects, by comparing results to those obtained using an interaction term in linear regression. The research questions each model answers, their formulation, and their assumptions are compared using Monte Carlo simulations and real data analysis. The capabilities of regression mixture models are described and specific issues to be addressed when conducting regression mixtures are proposed. The paper aims to clarify the role that regression mixtures can take in the estimation of differential effects and to increase awareness of the benefits and potential pitfalls of this approach. Regression mixture models are shown to be a potentially effective exploratory method for finding differential effects when these effects can be defined by a small number of classes of respondents who share a typical relationship between a predictor and an outcome. It is also shown that the comparison between regression mixture models and interactions becomes substantially more complex as the number of classes increases. It is argued that regression interactions are well suited for direct tests of specific hypotheses about differential effects, while regression mixtures provide a useful approach for exploring effect heterogeneity given adequate samples and study design. PMID:26556903
Orbe, Jesus; Ferreira, Eva; Núñez-Antón, Vicente
2003-01-01
In this work we study the effect of several covariates on a censored response variable with unknown probability distribution. A semiparametric model is proposed to consider situations where the functional form of the effect of one or more covariates is unknown, as is the case in the application presented in this work. We provide its estimation procedure and, in addition, a bootstrap technique to make inference on the parameters. A simulation study has been carried out to show the good performance of the proposed estimation process and to analyse the effect of the censorship. Finally, we present the results when the methodology is applied to AIDS diagnosed patients.
NASA Space Rocket Logistics Challenges
NASA Technical Reports Server (NTRS)
Bramon, Chris; Neeley, James R.; Jones, James V.; Watson, Michael D.; Inman, Sharon K.; Tuttle, Loraine
2014-01-01
The Space Launch System (SLS) is the new NASA heavy lift launch vehicle in development and is scheduled for its first mission in 2017. SLS has many of the same logistics challenges as any other large scale program. However, SLS also faces unique challenges. This presentation will address the SLS challenges, along with the analysis and decisions to mitigate the threats posed by each.
Equivalent Linear Logistic Test Models.
ERIC Educational Resources Information Center
Bechger, Timo M.; Verstralen, Huub H. F. M.; Verhelst, Norma D.
2002-01-01
Discusses the Linear Logistic Test Model (LLTM) and demonstrates that there are many equivalent ways to specify a model. Analyzed a real data set (300 responses to 5 analogies) using a Lagrange multiplier test for the specification of the model, and demonstrated that there may be many ways to change the specification of an LLTM and achieve the…
Evaluating Differential Effects Using Regression Interactions and Regression Mixture Models
ERIC Educational Resources Information Center
Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung
2015-01-01
Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects, by comparing results to using an interaction term in linear regression. The research questions which each model answers, their…
Error bounds in cascading regressions
Karlinger, M.R.; Troutman, B.M.
1985-01-01
Cascading regressions is a technique for predicting a value of a dependent variable when no paired measurements exist to perform a standard regression analysis. Biases in coefficients of a cascaded-regression line as well as error variance of points about the line are functions of the correlation coefficient between dependent and independent variables. Although this correlation cannot be computed because of the lack of paired data, bounds can be placed on errors through the required properties of the correlation coefficient. The potential mean-squared error of a cascaded-regression prediction can be large, as illustrated through an example using geomorphologic data. © 1985 Plenum Publishing Corporation.
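A toy numerical example of the cascade: x and y are each regressed on a shared intermediate z in two unpaired samples, and the two fitted lines are composed to predict y from x. All data are simulated, so the composed line can be checked against the truth implied by the generating equations; the error bounds discussed in the article concern how far such a composition can drift when the x-y correlation is unknown.

```python
import numpy as np

rng = np.random.default_rng(4)

# No paired (x, y) data exist, but both variables were measured against a
# common intermediate z in two separate samples.
z1 = rng.uniform(0, 10, 60)
x = 2.0 + 0.8 * z1 + rng.normal(0, 1.0, 60)     # sample 1: (z, x) pairs

z2 = rng.uniform(0, 10, 60)
y = -1.0 + 1.5 * z2 + rng.normal(0, 1.0, 60)    # sample 2: (z, y) pairs

# Regress x on z (to invert) and y on z, then cascade: y(x) = y(z(x)).
bx = np.polyfit(z1, x, 1)                       # x = bx[0]*z + bx[1]
by = np.polyfit(z2, y, 1)                       # y = by[0]*z + by[1]
slope = by[0] / bx[0]                           # dy/dx through z
intercept = by[1] - slope * bx[1]
print(slope, intercept)   # true cascade here: y = 1.875*x - 4.75
```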
Biomarker Detection in Association Studies: Modeling SNPs Simultaneously via Logistic ANOVA
Jung, Yoonsuh; Huang, Jianhua Z.
2014-01-01
In genome-wide association studies, the primary task is to detect biomarkers in the form of Single Nucleotide Polymorphisms (SNPs) that have nontrivial associations with a disease phenotype and other important clinical/environmental factors. However, the extremely large number of SNPs compared to the sample size inhibits application of classical methods such as multiple logistic regression. Currently the most commonly used approach is still to analyze one SNP at a time. In this paper, we propose to consider the genotypes of the SNPs simultaneously via a logistic analysis of variance (ANOVA) model, which expresses the logit-transformed mean of SNP genotypes as the sum of the SNP effects, the effects of the disease phenotype and/or other clinical variables, and the interaction effects. We use a reduced-rank representation of the interaction-effect matrix for dimensionality reduction, and employ the L1-penalty in a penalized likelihood framework to filter out the SNPs that have no associations. We develop a Majorization-Minimization algorithm for computational implementation. In addition, we propose a modified BIC criterion to select the penalty parameters and determine the rank number. The proposed method is applied to a Multiple Sclerosis data set and simulated data sets and shows promise in biomarker detection. PMID:25642005
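The L1-penalty filtering step, though not the authors' full logistic-ANOVA model, can be sketched with an ordinary L1-penalized logistic regression: most irrelevant coefficients are shrunk exactly to zero. Data dimensions, signal strengths and the penalty setting C below are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)

# Many candidate predictors, only a few truly associated with the outcome.
n, p = 300, 50
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [1.5, -1.2, 1.0]                     # three real signals
y = rng.random(n) < 1 / (1 + np.exp(-(X @ beta)))

# L1 penalty (lasso): coefficients of unassociated predictors hit exactly zero.
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.05).fit(X, y)
selected = np.flatnonzero(lasso.coef_[0])
print(selected)
```

The penalty strength plays the role of the tuning parameter the authors choose with their modified BIC; here C is simply fixed by hand.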
On regression adjustment for the propensity score.
Vansteelandt, S; Daniel, R M
2014-10-15
Propensity scores are widely adopted in observational research because they enable adjustment for high-dimensional confounders without requiring models for their association with the outcome of interest. The results of statistical analyses based on stratification, matching or inverse weighting by the propensity score are therefore less susceptible to model extrapolation than those based solely on outcome regression models. This is attractive because extrapolation in outcome regression models may be alarming, yet difficult to diagnose, when the exposed and unexposed individuals have very different covariate distributions. Standard regression adjustment for the propensity score forms an alternative to the aforementioned propensity score methods, but the benefits of this are less clear because it still involves modelling the outcome in addition to the propensity score. In this article, we develop novel insights into the properties of this adjustment method. We demonstrate that standard tests of the null hypothesis of no exposure effect (based on robust variance estimators), as well as particular standardised effects obtained from such adjusted regression models, are robust against misspecification of the outcome model when a propensity score model is correctly specified; they are thus not vulnerable to the aforementioned problem of extrapolation. We moreover propose efficient estimators for these standardised effects, which retain a useful causal interpretation even when the propensity score model is misspecified, provided the outcome regression model is correctly specified.
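Standard regression adjustment for the propensity score can be sketched as follows: fit the score by logistic regression, then enter it as a covariate in the outcome model alongside the exposure. The confounded data are simulated, and entering the score linearly is a simplification of the standardised-effect estimators the article actually develops.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, LinearRegression

rng = np.random.default_rng(6)

# A confounded example: covariates L drive both exposure A and outcome Y.
n = 5000
L = rng.standard_normal((n, 3))
ps_true = 1 / (1 + np.exp(-(L @ np.array([0.8, -0.5, 0.3]))))
A = (rng.random(n) < ps_true).astype(float)
Y = 2.0 * A + L @ np.array([1.5, -1.0, 0.5]) + rng.normal(0, 1, n)  # true effect = 2

naive = Y[A == 1].mean() - Y[A == 0].mean()      # confounded contrast

# Step 1: estimate the propensity score; step 2: adjust the outcome model for it.
ps = LogisticRegression(C=1e6).fit(L, A).predict_proba(L)[:, 1]
adj = LinearRegression().fit(np.column_stack([A, ps]), Y)
effect = adj.coef_[0]
print(naive, effect)
```

Because the propensity score is a balancing score, conditioning on it removes the confounding that distorts the naive contrast, without modelling all of L in the outcome regression.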
Logistics, electronic commerce, and the environment
NASA Astrophysics Data System (ADS)
Sarkis, Joseph; Meade, Laura; Talluri, Srinivas
2002-02-01
Organizations realize that a strong supporting logistics or electronic logistics (e-logistics) function is important from both commercial and consumer perspectives. The implications of e-logistics models and practices cover the forward and reverse logistics functions of organizations. They also have direct and profound impact on the natural environment. This paper will focus on a discussion of forward and reverse e-logistics and their relationship to the natural environment. After discussion of the many pertinent issues in these areas, directions of practice and implications for study and research are then described.
Precision Efficacy Analysis for Regression.
ERIC Educational Resources Information Center
Brooks, Gordon P.
When multiple linear regression is used to develop a prediction model, sample size must be large enough to ensure stable coefficients. If the derivation sample size is inadequate, the model may not predict well for future subjects. The precision efficacy analysis for regression (PEAR) method uses a cross- validity approach to select sample sizes…
Logistics support of lunar bases
NASA Astrophysics Data System (ADS)
Woodcock, Gordon R.
Concepts for lunar base design and operations have been studied for over twenty years. A brief summary of past concepts is presented, with reference to new ideas on design, uses and logistics schemes. Transportation systems and mission modes are reviewed, showing the significant cost benefits of reusable transportation at the higher traffic rates representative of lunar base buildup and logistics operations. Parametric studies are presented over the range of base size from 6 crew to 1000 crew, and the "choke points", or barriers to growth, are identified and means for their resolution presented. The leverages of food growth on the Moon, crew stay time, use of lunar oxygen, indigenous resources for construction of facilities, and transportation systems size and operations modes are presented. It is concluded that bases as large as 1000 people are affordable at less than twice the cost of an initial base of six people if these leverages of advanced basing technologies are exploited.
Logistic analysis of algae cultivation.
Slegers, P M; Leduc, S; Wijffels, R H; van Straten, G; van Boxtel, A J B
2015-03-01
Energy requirements for resource transport of algae cultivation are unknown. This work describes the quantitative analysis of energy requirements for water and CO2 transport. Algae cultivation models were combined with the quantitative logistic decision model 'BeWhere' for the regions Benelux (Northwest Europe), southern France and Sahara. For photobioreactors, the energy consumed for transport of water and CO2 turns out to be a small percentage of the energy contained in the algae biomass (0.1-3.6%). For raceway ponds the share for transport is higher (0.7-38.5%). The energy consumption for transport is the lowest in the Benelux due to good availability of both water and CO2. Analysing transport logistics is still important, despite the low energy consumption for transport. The results demonstrate that resource requirements, resource distribution and availability and transport networks have a profound effect on the location choices for algae cultivation.
Logistical teamwork tames Madagascar wildcats
Twa, W.
1986-03-01
Amoco Production Company's exploration program in western Madagascar's Sakalava coastal plain exemplifies the unique logistical challenges both operator and drilling contractor must undergo to reach the few remaining onshore frontier areas. Sakalava is characterized by deep rivers, flood-prone tributaries, and a lone 40 km hard-surface road. Problems caused by a lack of port facilities and oil field services are complicated by thousands of square miles of unimproved wooded plains. Rainwater from the nearby mountains of central Madagascar frequently floods rivers in the Sakalava coastal plain, leaving impassable marshes in their wake. Prior to this project, about 45 wells had been drilled in Madagascar. Most recently, state oil company Omnis contracted Bawden Drilling International Inc. to drill nine wells for its heavy oil project on the Tsimioro oil prospect. Bawden provided both logistical and drilling services for that program.
Fleet Logistics Center, Puget Sound
2012-08-01
Naval Supply Systems Command, Fleet Logistics Center, Puget Sound, 467 W Street, Bremerton, WA 98314-5100. Established: October 1967. Name changes: Naval Supply Center Puget Sound; Fleet and Industrial Supply Center Puget...or Sasebo) deployed ships in the Western Pacific (WestPac); Naval Base Kitsap at Bremerton and Bangor (NBK at Bremerton or Bangor); Navy Region
A Predictive Logistic Regression Model of World Conflict Using Open Source Data
2015-03-26
model that predicts the future state of the world where nations will either be in a state of "violent conflict" or "not in violent conflict" based on...state of violent conflict in 2015, seventeen of them are new to conflict since the last published list in 2013. A prediction tool is created to allow...war-game subject matter experts and students to identify future predicted violent conflict and the responsible variables.
Comparative Analysis of Decision Trees with Logistic Regression in Predicting Fault-Prone Classes
NASA Astrophysics Data System (ADS)
Singh, Yogesh; Takkar, Arvinder Kaur; Malhotra, Ruchika
Metrics are available for predicting fault-prone classes, which may help software organizations plan and perform testing activities. This is possible through proper allocation of resources to fault-prone parts of the design and code of the software. Hence, the importance and usefulness of such metrics is understandable, but empirical validation of these metrics is always a great challenge. Decision Tree (DT) methods have been successfully applied to classification problems in many applications. This paper evaluates the capability of three DT methods and compares their performance with a statistical method in predicting fault-prone software classes using a publicly available NASA data set. The results indicate that the prediction performance of DT is generally better than that of the statistical model. However, similar studies are required in order to establish the acceptability of the DT models.
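A comparison in this spirit can be sketched with scikit-learn: cross-validated accuracy of a decision tree versus logistic regression on the same classification task. The NASA defect data are not bundled here, so a synthetic surrogate stands in, and the outcome of the comparison on real defect data may of course differ.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression

# Synthetic surrogate for a "fault-prone vs not" data set.
X, y = make_classification(n_samples=600, n_features=10, n_informative=5,
                           random_state=7)
X[:, 0] = X[:, 0] ** 2            # inject a nonlinearity a tree can exploit

# 5-fold cross-validated accuracy for both classifiers.
dt = cross_val_score(DecisionTreeClassifier(max_depth=5, random_state=7),
                     X, y, cv=5).mean()
lr = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5).mean()
print(dt, lr)
```

Cross-validation, rather than a single train/test split, is what makes such accuracy comparisons between model families defensible.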
ERIC Educational Resources Information Center
Breidenbach, Daniel H.; French, Brian F.
2011-01-01
Many factors can influence a student's decision to withdraw from college. Intervention programs aimed at retention can benefit from understanding the factors related to such decisions, especially in underrepresented groups. The Institutional Integration Scale (IIS) has been suggested as a predictor of student persistence. Accurate prediction of…
ERIC Educational Resources Information Center
Liu, Xing
2008-01-01
The proportional odds (PO) model, which is also called the cumulative odds model (Agresti, 1996, 2002; Armstrong & Sloan, 1989; Long, 1997; Long & Freese, 2006; McCullagh, 1980; McCullagh & Nelder, 1989; Powers & Xie, 2000; O'Connell, 2006), is one of the most commonly used models for the analysis of ordinal categorical data and comes from the class…
Technology Transfer Automated Retrieval System (TEKTRAN)
Improvement of cold tolerance of winter wheat (Triticum aestivum L.) through breeding methods has been problematic. A better understanding of how individual wheat cultivars respond to components of the freezing process may provide new information that can be used to develop more cold tolerance culti...
ERIC Educational Resources Information Center
Miller, Thomas E.; Herreid, Charlene H.
2009-01-01
This is the fifth in a series of articles describing an attrition prediction and intervention project at the University of South Florida (USF) in Tampa. The project was originally presented in the 83(2) issue (Miller 2007). The statistical model for predicting attrition was described in the 83(3) issue (Miller and Herreid 2008). The methods and…
AP® Potential Predicted by PSAT/NMSQT® Scores Using Logistic Regression. Statistical Report 2014-1
ERIC Educational Resources Information Center
Zhang, Xiuyuan; Patel, Priyank; Ewing, Maureen
2014-01-01
AP Potential™ is an educational guidance tool that uses PSAT/NMSQT® scores to identify students who have the potential to do well on one or more Advanced Placement® (AP®) Exams. Students identified as having AP potential, perhaps students who would not have been otherwise identified, should consider enrolling in the corresponding AP course if they…
ERIC Educational Resources Information Center
Kellermeyer, Steven Bruce
2011-01-01
In the last few decades high-stakes testing has become more political than educational. The Districts within Arizona are bound by the mandates of both AZ LEARNS and the No Child Left Behind Act of 2001. At the time of this writing, both legislative mandates relied on the Arizona Instrument for Measuring Standards (AIMS) as State Tests for gauging…
ERIC Educational Resources Information Center
Del Prette, Zilda Aparecida Pereira; Prette, Almir Del; De Oliveira, Lael Almeida; Gresham, Frank M.; Vance, Michael J.
2012-01-01
Social skills are specific behaviors that individuals exhibit in order to successfully complete social tasks whereas social competence represents judgments by significant others that these social tasks have been successfully accomplished. The present investigation identified the best sociobehavioral predictors obtained from different raters…
2012-03-22
preconditions for unsafe acts which ultimately result in active failures (mishaps). The DoD HFACS classification system has been used to categorize the...reoccur as future individuals repeat the same locally rational acts, while a classification scheme on these errors merely provides a label to what in...boundaries, or public endangerment". The second document, AFRLMAN 99-103, defines Class A through C mishaps much like the DoD classification
Exploring Person Fit with an Approach Based on Multilevel Logistic Regression
ERIC Educational Resources Information Center
Walker, A. Adrienne; Engelhard, George, Jr.
2015-01-01
The idea that test scores may not be valid representations of what students know, can do, and should learn next is well known. Person fit provides an important aspect of validity evidence. Person fit analyses at the individual student level are not typically conducted and person fit information is not communicated to educational stakeholders. In…
NASA Astrophysics Data System (ADS)
Uzunova, Yordanka; Prodanova, Krasimira; Spassov, Lubomir
2016-12-01
Orthotopic liver transplantation (OLT) is the only curative treatment for end-stage liver disease. Early diagnosis and treatment of infections after OLT are usually associated with improved outcomes. This study's objective is to identify reliable factors that can predict postoperative infectious morbidity. Twenty-seven children who underwent liver transplantation in our department were included in the analysis. The correlation between two parameters (the blood glucose level on the 5th postoperative day and the duration of the anhepatic phase) and postoperative infections was analyzed using univariate analysis. From this analysis, an independent predictive factor was derived that adequately identifies patients at risk of infectious complications after liver transplantation.
ERIC Educational Resources Information Center
Tang, Hui; Kirk, John; Pienta, Norbert J.
2014-01-01
This paper includes two experiments, one investigating complexity factors in stoichiometry word problems, and the other identifying students' problem-solving protocols by using eye-tracking technology. The word problems used in this study had five different complexity factors, which were randomly assigned by a Web-based tool that we developed. The…
Comparing the Discrete and Continuous Logistic Models
ERIC Educational Resources Information Center
Gordon, Sheldon P.
2008-01-01
The solutions of the discrete logistic growth model based on a difference equation and the continuous logistic growth model based on a differential equation are compared and contrasted. The investigation is conducted using a dynamic interactive spreadsheet. (Contains 5 figures.)
EMS incident management: emergency medical logistics.
Maniscalco, P M; Christen, H T
1999-01-01
If you had to get x amount of supplies to point A or point B, or both, in 10 minutes, how would you do it? The answer lies in the following steps: 1. Develop a logistics plan. 2. Use emergency management as a partner agency for developing your logistics plan. 3. Implement a push logistics system by determining what supplies/medications and equipment are important. 4. Place mass casualty/disaster caches at key locations for rapid deployment. Have medication/fluid caches available at local hospitals. 5. Develop and implement command caches for key supervisors and managers. 6. Anticipate the logistics requirements of a terrorism/tactical violence event based on a community threat assessment. 7. Educate the public about preparing a BLS family disaster kit. 8. Test logistics capabilities at disaster exercises. 9. Budget for logistics needs. 10. Never underestimate the importance of logistics. When logistics support fails, the EMS system fails.
Rank regression: an alternative regression approach for data with outliers.
Chen, Tian; Tang, Wan; Lu, Ying; Tu, Xin
2014-10-01
Linear regression models are widely used in mental health and related health services research. However, the classic linear regression analysis assumes that the data are normally distributed, an assumption that is not met by the data obtained in many studies. One method of dealing with this problem is to use semi-parametric models, which do not require that the data be normally distributed. But semi-parametric models are quite sensitive to outlying observations, so the generated estimates are unreliable when study data includes outliers. In this situation, some researchers trim the extreme values prior to conducting the analysis, but the ad-hoc rules used for data trimming are based on subjective criteria so different methods of adjustment can yield different results. Rank regression provides a more objective approach to dealing with non-normal data that includes outliers. This paper uses simulated and real data to illustrate this useful regression approach for dealing with outliers and compares it to the results generated using classical regression models and semi-parametric regression models.
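One concrete form of rank regression is Jaeckel's estimator, which minimizes a dispersion of residuals weighted by Wilcoxon rank scores, so extreme residuals get bounded influence. The sketch below (simulated data with one gross outlier placed at a high-leverage point) compares it with OLS; it estimates the slope only, since the rank dispersion is invariant to the intercept.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import rankdata

rng = np.random.default_rng(8)

n = 50
x = rng.uniform(0, 10, n)
y = 1.0 + 2.0 * x + rng.normal(0, 1, n)
y[np.argmax(x)] += 40.0                      # one gross outlier at large x

# Jaeckel's rank-based dispersion for a candidate slope: residuals are
# weighted by centered Wilcoxon scores of their ranks (scores sum to zero,
# so the intercept drops out).
def dispersion(beta):
    e = y - beta * x
    a = np.sqrt(12) * (rankdata(e) / (n + 1) - 0.5)
    return np.sum(a * e)

slope_rank = minimize_scalar(dispersion, bounds=(-10, 10), method="bounded").x
slope_ols = np.polyfit(x, y, 1)[0]
print(slope_rank, slope_ols)                 # true slope is 2.0
```

The dispersion is convex in the slope, so a bounded scalar minimizer suffices; the intercept can be recovered afterwards, e.g. as the median of the residuals.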
Practical Session: Simple Linear Regression
NASA Astrophysics Data System (ADS)
Clausel, M.; Grégoire, G.
2014-12-01
Two exercises are proposed to illustrate simple linear regression. The first one is based on Galton's famous data set on heredity. We use the lm R command and obtain the coefficient estimates, the residual standard error, R2, the residuals… In the second example, devoted to data on the vapor tension of mercury, we fit a simple linear regression, predict values, and anticipate multiple linear regression. This practical session is an excerpt from practical exercises proposed by A. Dalalyan at ENPC (see Exercises 1 and 2 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_4.pdf).
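The same quantities that R's lm reports (coefficients, residual standard error, R2) can be computed directly from the normal equations. Since Galton's data set is not bundled here, synthetic parent/child heights with regression toward the mean stand in.

```python
import numpy as np

rng = np.random.default_rng(9)

# Synthetic parent/child heights (cm) as a stand-in for Galton's data.
n = 500
parent = rng.normal(170, 6, n)
child = 0.65 * parent + 60 + rng.normal(0, 5, n)   # regression toward the mean

X = np.column_stack([np.ones(n), parent])
beta, *_ = np.linalg.lstsq(X, child, rcond=None)   # lm-style fit: intercept, slope
resid = child - X @ beta
rse = np.sqrt(resid @ resid / (n - 2))             # residual standard error
r2 = 1 - resid @ resid / np.sum((child - child.mean()) ** 2)
print(beta, rse, r2)
```

The slope below 1 is exactly Galton's "regression to mediocrity": exceptional parents tend to have children closer to the average.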
The Evolution of Centralized Operational Logistics
2012-05-17
Command Reports 1-10; Lieutenant General William G. Pagonis and Jeffery L. Cruikshank, Moving Mountains: Lessons in Leadership and Logistics from the Gulf...1943-1945, (Washington, DC: Government Printing Office, 1955), 320-321. 22 Alan Gropman, The Big "L" - American Logistics in World War II...29 Carter B. Magruder, Recurring Logistic Problems As I Have Observed Them, 29. 30 Ibid., 245; Lieutenant Colonel Robert L. Burke, "Corps Logistic
Research on 6R Military Logistics Network
NASA Astrophysics Data System (ADS)
Jie, Wan; Wen, Wang
The building of a military logistics network is an important issue in the construction of new forces. This paper proposes a concept model of a 6R military logistics network based on JIT. We then conceive an axis-spoke network of logistics centers, a flexible 6R organizational network, and a lean, grid-based 6R military information network. Strategies and proposals for the construction of the three sub-networks of the 6R military logistics network are given.
A tutorial on Bayesian Normal linear regression
NASA Astrophysics Data System (ADS)
Klauenberg, Katy; Wübbeler, Gerd; Mickan, Bodo; Harris, Peter; Elster, Clemens
2015-12-01
Regression is a common task in metrology and often applied to calibrate instruments, evaluate inter-laboratory comparisons or determine fundamental constants, for example. Yet, a regression model cannot be uniquely formulated as a measurement function, and consequently the Guide to the Expression of Uncertainty in Measurement (GUM) and its supplements are not applicable directly. Bayesian inference, however, is well suited to regression tasks, and has the advantage of accounting for additional a priori information, which typically robustifies analyses. Furthermore, it is anticipated that future revisions of the GUM shall also embrace the Bayesian view. Guidance on Bayesian inference for regression tasks is largely lacking in metrology. For linear regression models with Gaussian measurement errors this tutorial gives explicit guidance. Divided into three steps, the tutorial first illustrates how a priori knowledge, which is available from previous experiments, can be translated into prior distributions from a specific class. These prior distributions have the advantage of yielding analytical, closed form results, thus avoiding the need to apply numerical methods such as Markov Chain Monte Carlo. Secondly, formulas for the posterior results are given, explained and illustrated, and software implementations are provided. In the third step, Bayesian tools are used to assess the assumptions behind the suggested approach. These three steps (prior elicitation, posterior calculation, and robustness to prior uncertainty and model adequacy) are critical to Bayesian inference. The general guidance given here for Normal linear regression tasks is accompanied by a simple, but real-world, metrological example. The calibration of a flow device serves as a running example and illustrates the three steps. It is shown that prior knowledge from previous calibrations of the same sonic nozzle enables robust predictions even for extrapolations.
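For the simplest setting, known noise variance and a Gaussian prior on the coefficients, the posterior is available in closed form, avoiding MCMC just as the tutorial's conjugate priors do. The sketch below uses illustrative data and prior settings; it is not the tutorial's flow-calibration example.

```python
import numpy as np

rng = np.random.default_rng(10)

# Conjugate Bayesian linear regression with known noise variance sigma^2:
#   prior     beta ~ N(m0, V0)
#   posterior beta | y ~ N(mN, VN), with
#   VN = (V0^{-1} + X'X / sigma^2)^{-1}
#   mN = VN (V0^{-1} m0 + X'y / sigma^2)
n, sigma = 40, 0.5
x = np.linspace(0, 1, n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 3.0 * x + rng.normal(0, sigma, n)

m0 = np.zeros(2)                 # prior mean (e.g. from earlier calibrations)
V0 = np.eye(2) * 10.0            # fairly vague prior covariance

VN = np.linalg.inv(np.linalg.inv(V0) + X.T @ X / sigma**2)
mN = VN @ (np.linalg.inv(V0) @ m0 + X.T @ y / sigma**2)
print(mN, np.sqrt(np.diag(VN)))  # posterior means and standard deviations
```

Tightening V0 around results from previous experiments is exactly the "prior elicitation" step the tutorial describes: informative priors shrink the posterior toward earlier calibrations and robustify extrapolation.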
Nowcasting sunshine number using logistic modeling
NASA Astrophysics Data System (ADS)
Brabec, Marek; Badescu, Viorel; Paulescu, Marius
2013-04-01
In this paper, we present a formalized approach to statistical modeling of the sunshine number, a binary indicator of whether the Sun is covered by clouds, introduced previously by Badescu (Theor Appl Climatol 72:127-136, 2002). Our statistical approach is based on Markov chains and logistic regression and yields fully specified probability models that are relatively easily identified (and their unknown parameters estimated) from a set of empirical data (observed sunshine number and sunshine stability number series). We discuss the general structure of the model and its advantages, demonstrate its performance on real data, and compare its results against a classical ARIMA approach as a competitor. Since the model parameters have a clear interpretation, we also illustrate how, e.g., their inter-seasonal stability can be tested. We conclude with an outlook on future developments oriented toward models that allow a practically desirable smooth transition between data observed at different frequencies, and with a short discussion of the technical problems such a goal brings.
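The core idea, a first-order Markov structure expressed as a logistic regression on the previous state, can be sketched as follows. The simulated two-state series and the plain gradient-ascent fit are illustrative assumptions, not the authors' data or estimation procedure.

```python
import numpy as np

# Simulate a binary "sunshine number" series from a two-state Markov chain.
rng = np.random.default_rng(1)
s = [1]
for _ in range(2000):
    p = 0.9 if s[-1] == 1 else 0.2       # P(sunny next | current state)
    s.append(int(rng.random() < p))
s = np.array(s)

# Logistic regression of s_t on the lagged state s_{t-1}.
X = np.column_stack([np.ones(len(s) - 1), s[:-1]])
y = s[1:]
beta = np.zeros(2)
for _ in range(2000):                     # gradient ascent on the log-likelihood
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    beta += 0.5 * X.T @ (y - p) / len(y)

# Fitted transition probabilities recover the simulation's 0.2 and 0.9.
p0 = 1.0 / (1.0 + np.exp(-beta[0]))              # P(sun | previously cloudy)
p1 = 1.0 / (1.0 + np.exp(-(beta[0] + beta[1])))  # P(sun | previously sunny)
```

Because the lagged state is the only (binary) predictor, the maximum-likelihood fit reduces to the empirical transition proportions; the paper's models add further covariates on the same backbone.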
Crisis Management- Operational Logistics & Asset Visibility Technologies
2006-06-01
The proposed logistics framework serves as a viable solution for common logistical problems and could greatly increase the effectiveness of future crisis response operations.
Forecasting Workload for Defense Logistics Agency Distribution
2014-12-01
MBA professional report, Naval Postgraduate School, Monterey, California. The Defense Logistics Agency (DLA) predicts issue and receipt workload for its distribution agency in order to maintain
Naval Logistics Integration Through Interoperable Supply Systems
2014-06-13
A thesis presented to the Faculty of the U.S. Army Command and General Staff College. This research investigates how the Navy and the Marine Corps could increase Naval Logistics
Logistics hardware and services control system
NASA Technical Reports Server (NTRS)
Koromilas, A.; Miller, K.; Lamb, T.
1973-01-01
The software system permits onsite direct control of logistics operations, which include spare parts, initial installation, tool control, and repairable parts status and control, through all facets of operations. The system integrates logistics actions and controls receipts, issues, loans, repairs, fabrications, and modifications, and uses asset data to predict and allocate logistics parts and services effectively.
Modeling confounding by half-sibling regression
Schölkopf, Bernhard; Hogg, David W.; Wang, Dun; Foreman-Mackey, Daniel; Janzing, Dominik; Simon-Gabriel, Carl-Johann; Peters, Jonas
2016-01-01
We describe a method for removing the effect of confounders to reconstruct a latent quantity of interest. The method, referred to as “half-sibling regression,” is inspired by recent work in causal inference using additive noise models. We provide a theoretical justification, discussing both independent and identically distributed data and time series data, and illustrate the potential of the method in a challenging astronomy application. PMID:27382154
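The mechanics of half-sibling regression can be sketched in a toy setting: many "sibling" signals share a common systematic confounder, and the latent quantity for one target signal is recovered as the residual after regressing it on the others. The signal shapes and noise levels below are constructed for illustration, not taken from the astronomy application.

```python
import numpy as np

# Siblings share a smooth systematic trend plus independent noise.
rng = np.random.default_rng(2)
T, n_sib = 500, 20
t = np.linspace(0.0, 1.0, T)
systematic = 3.0 * np.cos(3.0 * np.pi * t)             # shared confounder
siblings = systematic[:, None] + rng.normal(0.0, 0.1, (T, n_sib))

# The target contains the confounder plus the latent signal of interest.
signal = np.sin(8.0 * np.pi * t)                       # latent quantity
target = signal + systematic + rng.normal(0.0, 0.1, T)

# Regress the target on its half siblings; the residual estimates the signal.
coef, *_ = np.linalg.lstsq(siblings, target, rcond=None)
recovered = target - siblings @ coef
```

The siblings carry the confounder but (by construction) none of the target's own signal, so subtracting the prediction strips the systematics while leaving the latent quantity nearly intact.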
Spencer, Michael
1974-01-01
Food additives are discussed from the food technology point of view. The reasons for their use are summarized: (1) to protect food from chemical and microbiological attack; (2) to even out seasonal supplies; (3) to improve their eating quality; (4) to improve their nutritional value. The various types of food additives are considered, e.g. colours, flavours, emulsifiers, bread and flour additives, preservatives, and nutritional additives. The paper concludes with consideration of those circumstances in which the use of additives is (a) justified and (b) unjustified. PMID:4467857
Multiple Regression and Its Discontents
ERIC Educational Resources Information Center
Snell, Joel C.; Marsh, Mitchell
2012-01-01
Multiple regression is part of a larger statistical strategy originated by Gauss. The authors raise questions about the theory and suggest some changes that would make room for Mandelbrot and Serendipity.
Logistics Junior Officer Development in a Period of Persistent Conflict
2013-05-23
The findings of this study are consistent with a similar study by Colonel Rodney Fogg in 2011; Colonel Fogg's analysis included additional factors, including manning. Fogg, Rodney. "How a Decade of Conflict Affected Junior Logistics Officer Development." Master of Strategic Studies research paper.
Mini pressurized logistics module (MPLM)
NASA Astrophysics Data System (ADS)
Vallerani, E.; Brondolo, D.; Basile, L.
1996-06-01
The MPLM Program was initiated through a Memorandum of Understanding (MOU) between the United States' National Aeronautics and Space Administration (NASA) and Italy's ASI, the Italian Space Agency, that was signed on 6 December 1991. The MPLM is a pressurized logistics module that will be used to transport supplies and materials (up to 20,000 lb), including user experiments, between Earth and International Space Station Alpha (ISSA) using the Shuttle, to support active and passive storage, and to provide a habitable environment for two people when docked to the Station. The Italian Space Agency has selected Alenia Spazio to develop the MPLM modules, which have always been considered a key element for the new International Space Station, benefiting from their design flexibility and the consequent possible cost savings based on maximum utilization of the Shuttle launch capability for any mission. Within the framework of the recent agreement between the U.S. and Russia for cooperation in space, which foresees the utilization of MIR 1 hardware, the Italian MPLM will remain an important element of the logistics system, being the only pressurized module designed for re-entry. Within the new scenario of anticipated Shuttle flights to MIR 1 during Space Station phase 1, MPLM remains a candidate for one or more missions to provide MIR 1 resupply capabilities and advanced ISSA hardware/procedures verification. Based on the concept of Flexible Carriers, Alenia Spazio is providing NASA with three MPLM flight units that can be configured according to the requirements of the Human-Tended Capability (HTC) and Permanent Human Capability (PHC) of the Space Station. Configurability will allow transportation of passive cargo only, or a combination of passive and cold cargo accommodated in R/F racks. Having developed and qualified the baseline configuration with respect to the worst enveloping condition, each unit could be easily configured to the passive or active version depending upon the
Model building in nonproportional hazard regression.
Rodríguez-Girondo, Mar; Kneib, Thomas; Cadarso-Suárez, Carmen; Abu-Assi, Emad
2013-12-30
Recent developments of statistical methods allow for a very flexible modeling of covariates affecting survival times via the hazard rate, including also the inspection of possible time-dependent associations. Despite their immediate appeal in terms of flexibility, these models typically introduce additional difficulties when a subset of covariates and the corresponding modeling alternatives have to be chosen, that is, for building the most suitable model for given data. This is particularly true when potentially time-varying associations are given. We propose to conduct a piecewise exponential representation of the original survival data to link hazard regression with estimation schemes based on the Poisson likelihood, making recent advances for model building in exponential family regression accessible also in the nonproportional hazard regression context. A two-stage stepwise selection approach, an approach based on doubly penalized likelihood, and a componentwise functional gradient descent approach are adapted to the piecewise exponential regression problem. These three techniques were compared via an intensive simulation study. An application to prognosis after discharge for patients who suffered a myocardial infarction supplements the simulation to demonstrate the pros and cons of the approaches in real data analyses.
Functional Generalized Additive Models.
McLean, Mathew W; Hooker, Giles; Staicu, Ana-Maria; Scheipl, Fabian; Ruppert, David
2014-01-01
We introduce the functional generalized additive model (FGAM), a novel regression model for association studies between a scalar response and a functional predictor. We model the link-transformed mean response as the integral with respect to t of F{X(t), t} where F(·,·) is an unknown regression function and X(t) is a functional covariate. Rather than having an additive model in a finite number of principal components as in Müller and Yao (2008), our model incorporates the functional predictor directly and thus our model can be viewed as the natural functional extension of generalized additive models. We estimate F(·,·) using tensor-product B-splines with roughness penalties. A pointwise quantile transformation of the functional predictor is also considered to ensure each tensor-product B-spline has observed data on its support. The methods are evaluated using simulated data and their predictive performance is compared with other competing scalar-on-function regression alternatives. We illustrate the usefulness of our approach through an application to brain tractography, where X(t) is a signal from diffusion tensor imaging at position, t, along a tract in the brain. In one example, the response is disease-status (case or control) and in a second example, it is the score on a cognitive test. R code for performing the simulations and fitting the FGAM can be found in supplemental materials available online.
Stochastic dynamics and logistic population growth
NASA Astrophysics Data System (ADS)
Méndez, Vicenç; Assaf, Michael; Campos, Daniel; Horsthemke, Werner
2015-06-01
The Verhulst model is probably the best known macroscopic rate equation in population ecology. It depends on two parameters, the intrinsic growth rate and the carrying capacity. These parameters can be estimated for different populations and are related to the reproductive fitness and the competition for limited resources, respectively. We investigate analytically and numerically the simplest possible microscopic scenarios that give rise to the logistic equation in the deterministic mean-field limit. We provide a definition of the two parameters of the Verhulst equation in terms of microscopic parameters. In addition, we derive the conditions for extinction or persistence of the population by employing either the momentum-space spectral theory or the real-space Wentzel-Kramers-Brillouin approximation to determine the probability distribution function and the mean time to extinction of the population. Our analytical results agree well with numerical simulations.
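The deterministic mean-field limit referred to above is the Verhulst equation dn/dt = r n (1 - n/K), whose closed-form solution n(t) = K / (1 + (K/n0 - 1) e^{-rt}) a simple Euler integration reproduces. The parameter values below are arbitrary choices for illustration.

```python
import numpy as np

# Verhulst logistic growth: intrinsic rate r, carrying capacity K, start n0.
r, K, n0 = 0.5, 1000.0, 10.0

def exact(t):
    """Closed-form solution of dn/dt = r*n*(1 - n/K)."""
    return K / (1.0 + (K / n0 - 1.0) * np.exp(-r * t))

# Forward-Euler integration of the same rate equation up to t = 30.
dt, steps = 0.01, 3000
n = n0
for _ in range(steps):
    n += dt * r * n * (1.0 - n / K)
```

The population saturates at the carrying capacity K; the stochastic microscopic models discussed in the abstract fluctuate around this deterministic curve and, unlike it, admit extinction.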
Logistics support economy and efficiency through consolidation and automation
NASA Technical Reports Server (NTRS)
Savage, G. R.; Fontana, C. J.; Custer, J. D.
1985-01-01
An integrated logistics support system, which would provide routine access to space and be cost-competitive as an operational space transportation system, was planned and implemented to support the NSTS program launch-on-time goal of 95 percent. A decision was made to centralize the Shuttle logistics functions in a modern facility that would provide office and training space and an efficient warehouse area. In this warehouse, the emphasis is on automation of the storage and retrieval function, while utilizing state-of-the-art warehousing and inventory management technology. This consolidation, together with the automation capabilities being provided, will allow for more effective utilization of personnel and improved responsiveness. In addition, this facility will be the prime support for the fully integrated logistics support of the operations era NSTS and reduce the program's management, procurement, transportation, and supply costs in the operations era.
Estimating peak flow characteristics at ungaged sites by ridge regression
Tasker, Gary D.
1982-01-01
A regression simulation model is combined with a multisite streamflow generator to simulate a regional regression of 50-year peak discharge against a set of basin characteristics. Monte Carlo experiments are used to compare the unbiased ordinary least squares parameter estimator with Hoerl and Kennard's (1970a) ridge estimator, in which the biasing parameter is that proposed by Hoerl, Kennard, and Baldwin (1975). The simulation results indicate a substantial improvement in parameter estimation using ridge regression when the correlation between basin characteristics is more than about 0.90. In addition, results indicate a strong potential for improving the mean square error of prediction of a peak-flow characteristic versus basin characteristics regression model when the basin characteristics are approximately colinear. The simulation covers a range of regression parameters, streamflow statistics, and basin characteristics commonly found in regional regression studies.
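The effect the Monte Carlo experiments measure can be sketched with two nearly collinear "basin characteristics": ridge regression trades a little bias for a large drop in parameter variance relative to ordinary least squares. The simulated data and the fixed biasing parameter below are illustrative assumptions, not the Hoerl-Kennard-Baldwin estimator.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(0.0, 0.05, n)        # correlation well above 0.90
X = np.column_stack([x1, x2])
beta_true = np.array([1.0, 1.0])

def fit(X, y, k):
    """Ridge estimator (X'X + kI)^-1 X'y; k = 0 gives OLS."""
    return np.linalg.solve(X.T @ X + k * np.eye(X.shape[1]), X.T @ y)

# Monte Carlo over noise draws, comparing mean squared estimation error.
ols_est, ridge_est = [], []
for _ in range(200):
    y = X @ beta_true + rng.normal(0.0, 1.0, n)
    ols_est.append(fit(X, y, 0.0))
    ridge_est.append(fit(X, y, 1.0))

mse_ols = np.mean((np.array(ols_est) - beta_true) ** 2)
mse_ridge = np.mean((np.array(ridge_est) - beta_true) ** 2)
```

With near-collinear predictors, the OLS coefficients are wildly unstable along the poorly identified direction, while the ridge penalty stabilizes exactly that direction.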
Complementary Log Regression for Sufficient-Cause Modeling of Epidemiologic Data.
Lin, Jui-Hsiang; Lee, Wen-Chung
2016-12-13
The logistic regression model is the workhorse of epidemiological data analysis. The model helps to clarify the relationship between multiple exposures and a binary outcome. Logistic regression analysis is readily implemented using existing statistical software, and this has contributed to it becoming a routine procedure for epidemiologists. In this paper, the authors focus on a causal model which has recently received much attention from the epidemiologic community, namely, the sufficient-component cause model (causal-pie model). The authors show that the sufficient-component cause model is associated with a particular 'link' function: the complementary log link. In a complementary log regression, the exponentiated coefficient of a main-effect term corresponds to an adjusted 'peril ratio', and the coefficient of a cross-product term can be used directly to test for causal mechanistic interaction (sufficient-cause interaction). The authors provide detailed instructions on how to perform a complementary log regression using existing statistical software and use three datasets to illustrate the methodology. Complementary log regression is the model of choice for sufficient-cause analysis of binary outcomes. Its implementation is as easy as conventional logistic regression.
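A minimal fit of the complementary log link, P(Y=1) = 1 - exp(-exp(b0 + b1 x)), can be sketched with iteratively reweighted least squares; exponentiating the exposure coefficient then gives the "peril ratio" described above. The simulated exposure and the hand-rolled IRLS loop are assumptions of this sketch; in practice one would use standard GLM software with a cloglog link, as the authors recommend.

```python
import numpy as np

# Simulate a binary outcome under the complementary log link.
rng = np.random.default_rng(4)
n = 5000
x = rng.integers(0, 2, n).astype(float)        # binary exposure
p = 1.0 - np.exp(-np.exp(-1.0 + 0.8 * x))      # true coefficients (-1.0, 0.8)
y = (rng.random(n) < p).astype(float)

# IRLS for the cloglog binary GLM.
X = np.column_stack([np.ones(n), x])
beta = np.zeros(2)
for _ in range(50):
    eta = X @ beta
    mu = 1.0 - np.exp(-np.exp(eta))
    dmu = np.exp(eta - np.exp(eta))            # d mu / d eta for cloglog
    var = np.clip(mu * (1.0 - mu), 1e-10, None)
    w = dmu**2 / var                           # working weights
    z = eta + (y - mu) / np.clip(dmu, 1e-10, None)   # working response
    beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * z))

peril_ratio = np.exp(beta[1])                  # exponentiated main effect
```

The fitted `beta[1]` recovers the simulated exposure effect of 0.8, so the estimated peril ratio is close to exp(0.8).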
XRA image segmentation using regression
NASA Astrophysics Data System (ADS)
Jin, Jesse S.
1996-04-01
Segmentation is an important step in image analysis, and thresholding is one of the most important approaches. There are several difficulties in segmentation, such as automatically selecting a threshold, dealing with intensity distortion, and removing noise. We have developed an adaptive segmentation scheme by applying the Central Limit Theorem in regression. A Gaussian regression is used to separate the distribution of background from foreground in a single-peak histogram, and the separation helps to automatically determine the threshold. A small 3 by 3 window is applied, and the mode of the local histogram is used to overcome noise. Thresholding is based on local weighting, where regression is used again for parameter estimation. A connectivity test is applied to the final results to remove impulse noise. We have applied the algorithm to x-ray angiogram images to extract brain arteries. The algorithm works well for single-peak distributions where there is no valley in the histogram. The regression provides a method to apply knowledge in clustering. Extending regression to multiple-level segmentation needs further investigation.
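The central idea, fitting a Gaussian to the dominant background peak of the histogram and deriving a threshold from it, can be sketched as a parabola regressed on log-counts (the log of a Gaussian is quadratic). The simulated intensities, window width, and 3-sigma rule below are illustrative assumptions; the paper's local 3x3 weighting is omitted.

```python
import numpy as np

# Simulated angiogram-like intensities: dominant background peak plus a
# faint bright foreground tail (no valley between them).
rng = np.random.default_rng(5)
background = rng.normal(100.0, 10.0, 90000)
vessels = rng.normal(170.0, 12.0, 10000)
pixels = np.concatenate([background, vessels])

counts, edges = np.histogram(pixels, bins=256, range=(0, 255))
centers = 0.5 * (edges[:-1] + edges[1:])
mode = centers[np.argmax(counts)]

# Gaussian regression near the peak: log N(x) = a*x^2 + b*x + c.
near = (np.abs(centers - mode) < 15) & (counts > 0)
A = np.column_stack([centers[near]**2, centers[near], np.ones(near.sum())])
a, b, c = np.linalg.lstsq(A, np.log(counts[near]), rcond=None)[0]

mu = -b / (2.0 * a)                  # fitted background mean
sigma = np.sqrt(-1.0 / (2.0 * a))    # fitted background spread
threshold = mu + 3.0 * sigma         # automatic foreground threshold
```

Because only the single background peak is modeled, the method still yields a threshold when the histogram has no valley, which is the case the abstract highlights.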
Demonstration of a Fiber Optic Regression Probe
NASA Technical Reports Server (NTRS)
Korman, Valentin; Polzin, Kurt A.
2010-01-01
The capability to provide localized, real-time monitoring of material regression rates in various applications has the potential to provide a new stream of data for development testing of various components and systems, as well as serving as a monitoring tool in flight applications. These applications include, but are not limited to, the regression of a combusting solid fuel surface, the ablation of the throat in a chemical rocket or the heat shield of an aeroshell, and the monitoring of erosion in long-life plasma thrusters. The rate of regression in the first application is very fast, while the second and third are increasingly slower. A recent fundamental sensor development effort has led to a novel regression, erosion, and ablation sensor technology (REAST). The REAST sensor allows for measurement of real-time surface erosion rates at a discrete surface location. The sensor is optical, using two different, co-located fiber-optics to perform the regression measurement. The disparate optical transmission properties of the two fiber-optics make it possible to measure the regression rate by monitoring the relative light attenuation through the fibers. As the fibers regress along with the parent material in which they are embedded, the relative light intensities through the two fibers change, providing a measure of the regression rate. The optical nature of the system makes it relatively easy to use in a variety of harsh, high temperature environments, and it is also unaffected by the presence of electric and magnetic fields. In addition, the sensor could be used to perform optical spectroscopy on the light emitted by a process and collected by fibers, giving localized measurements of various properties. The capability to perform an in-situ measurement of material regression rates is useful in addressing a variety of physical issues in various applications. An in-situ measurement allows for real-time data regarding the erosion rates, providing a quick method for
A development of logistics management models for the Space Transportation System
NASA Technical Reports Server (NTRS)
Carrillo, M. J.; Jacobsen, S. E.; Abell, J. B.; Lippiatt, T. F.
1983-01-01
A new analytic queueing approach was described which relates stockage levels, repair level decisions, and the project network schedule of prelaunch operations directly to the probability distribution of the space transportation system launch delay. Finite source population and limited repair capability were additional factors included in this logistics management model developed specifically for STS maintenance requirements. Data presently available to support logistics decisions were based on a comparability study of heavy aircraft components. A two-phase program is recommended by which NASA would implement an integrated data collection system, assemble logistics data from previous STS flights, revise extant logistics planning and resource requirement parameters using Bayes-Lin techniques, and adjust for uncertainty surrounding logistics systems performance parameters. The implementation of these recommendations can be expected to deliver more cost-effective logistics support.
Regressive evolution in Astyanax cavefish.
Jeffery, William R
2009-01-01
A diverse group of animals, including members of most major phyla, have adapted to life in the perpetual darkness of caves. These animals are united by the convergence of two regressive phenotypes, loss of eyes and pigmentation. The mechanisms of regressive evolution are poorly understood. The teleost Astyanax mexicanus is of special significance in studies of regressive evolution in cave animals. This species includes an ancestral surface-dwelling form and many conspecific cave-dwelling forms, some of which have evolved their regressive phenotypes independently. Recent advances in Astyanax development and genetics have provided new information about how eyes and pigment are lost during cavefish evolution; namely, they have revealed some of the molecular and cellular mechanisms involved in trait modification, the number and identity of the underlying genes and mutations, the molecular basis of parallel evolution, and the evolutionary forces driving adaptation to the cave environment.
A linear regression solution to the spatial autocorrelation problem
NASA Astrophysics Data System (ADS)
Griffith, Daniel A.
The Moran Coefficient spatial autocorrelation index can be decomposed into orthogonal map pattern components. This decomposition relates it directly to standard linear regression, in which the corresponding eigenvectors can be used as predictors. This paper reports comparative results between these linear regressions and their auto-Gaussian counterparts for the following georeferenced data sets: Columbus (Ohio) crime, Ottawa-Hull median family income, Toronto population density, southwest Ohio unemployment, Syracuse pediatric lead poisoning, and Glasgow standard mortality rates, as well as a small remotely sensed image of the High Peak district. The methodology is extended to auto-logistic and auto-Poisson situations, with selected data analyses including the percentage of urban population across Puerto Rico and the frequency of SIDS cases across North Carolina. These data analytic results suggest that this approach to georeferenced data analysis offers considerable promise.
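The decomposition can be sketched on a small lattice: the eigenvectors of the doubly centered connectivity matrix M C M are orthogonal map patterns, and the most positively autocorrelated ones enter an ordinary regression as synthetic predictors. The 10x10 rook-adjacency grid and simulated response below are illustrative stand-ins for the georeferenced data sets analyzed in the paper.

```python
import numpy as np

# Rook contiguity matrix C on a k x k lattice.
k = 10
n = k * k
C = np.zeros((n, n))
for i in range(k):
    for j in range(k):
        for di, dj in ((1, 0), (0, 1)):
            a, b = i + di, j + dj
            if a < k and b < k:
                C[i * k + j, a * k + b] = C[a * k + b, i * k + j] = 1.0

# Eigenvectors of the doubly centered connectivity matrix M C M.
M = np.eye(n) - np.ones((n, n)) / n
vals, vecs = np.linalg.eigh(M @ C @ M)
order = np.argsort(vals)[::-1]
E = vecs[:, order[:5]]        # five most positively autocorrelated patterns

# A spatially patterned response, then an ordinary regression on the patterns.
rng = np.random.default_rng(6)
y = 2.0 + 3.0 * E[:, 0] + rng.normal(0.0, 0.1, n)
X = np.column_stack([np.ones(n), E])
coef = np.linalg.lstsq(X, y, rcond=None)[0]
```

Because the eigenvectors are mutually orthogonal and orthogonal to the constant vector, they behave as well-conditioned regression predictors that absorb the spatial autocorrelation.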
Cactus: An Introduction to Regression
ERIC Educational Resources Information Center
Hyde, Hartley
2008-01-01
When the author first used "VisiCalc," the author thought it a very useful tool when he had the formulas. But how could he design a spreadsheet if there was no known formula for the quantities he was trying to predict? A few months later, the author relates he learned to use multiple linear regression software and suddenly it all clicked into…
Multiple Regression: A Leisurely Primer.
ERIC Educational Resources Information Center
Daniel, Larry G.; Onwuegbuzie, Anthony J.
Multiple regression is a useful statistical technique when the researcher is considering situations in which variables of interest are theorized to be multiply caused. It may also be useful in those situations in which the researchers is interested in studies of predictability of phenomena of interest. This paper provides an introduction to…
Weighting Regressions by Propensity Scores
ERIC Educational Resources Information Center
Freedman, David A.; Berk, Richard A.
2008-01-01
Regressions can be weighted by propensity scores in order to reduce bias. However, weighting is likely to increase random error in the estimates, and to bias the estimated standard errors downward, even when selection mechanisms are well understood. Moreover, in some cases, weighting will increase the bias in estimated causal parameters. If…
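The bias-reduction mechanism the abstract weighs against its costs can be sketched with a toy confounded treatment: weighting each observation by the inverse of its propensity score removes the confounding of a simple treatment-effect comparison. Everything below (the data-generating process, known propensities, and effect size) is a constructed illustration.

```python
import numpy as np

# Treatment assignment depends on covariate x, which also affects the outcome.
rng = np.random.default_rng(8)
n = 20000
x = rng.normal(size=n)
p = 1.0 / (1.0 + np.exp(-1.5 * x))              # propensity score
t = (rng.random(n) < p).astype(float)
y = 2.0 * t + 3.0 * x + rng.normal(0.0, 1.0, n)  # true treatment effect = 2

# Unweighted comparison is confounded by x.
naive = y[t == 1].mean() - y[t == 0].mean()

# Inverse-propensity weighting (normalized/Hajek form).
w = t / p + (1.0 - t) / (1.0 - p)
treated = np.sum(w * t * y) / np.sum(w * t)
control = np.sum(w * (1.0 - t) * y) / np.sum(w * (1.0 - t))
ipw = treated - control
```

The weighted contrast recovers the true effect, while extreme propensities inflate the weights and hence the estimator's variance, which is exactly the bias-variance trade-off the abstract discusses.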
Quantile Regression with Censored Data
ERIC Educational Resources Information Center
Lin, Guixian
2009-01-01
The Cox proportional hazards model and the accelerated failure time model are frequently used in survival data analysis. They are powerful, yet have limitation due to their model assumptions. Quantile regression offers a semiparametric approach to model data with possible heterogeneity. It is particularly powerful for censored responses, where the…
Integrated Computer System of Management in Logistics
NASA Astrophysics Data System (ADS)
Chwesiuk, Krzysztof
2011-06-01
This paper aims at presenting a concept of an integrated computer system of management in logistics, particularly in supply and distribution chains. Consequently, the paper includes the basic idea of the concept of computer-based management in logistics and components of the system, such as CAM and CIM systems in production processes, and management systems for storage, materials flow, and for managing transport, forwarding and logistics companies. The platform which integrates computer-aided management systems is that of electronic data interchange.
Optimal distributions for multiplex logistic networks.
Solá Conde, Luis E; Used, Javier; Romance, Miguel
2016-06-01
This paper presents some mathematical models for the distribution of goods in logistic networks, based on spectral analysis of complex networks. Given a steady distribution of a finished product, some numerical algorithms are presented for computing the weights in a multiplex logistic network that reach the equilibrium dynamics with a high convergence rate. As an application, the logistic networks of Germany and Spain are analyzed in terms of their convergence rates.
Neither fixed nor random: weighted least squares meta-regression.
Stanley, T D; Doucouliagos, Hristos
2017-03-01
Our study revisits and challenges two core conventional meta-regression estimators: the prevalent use of 'mixed-effects' or random-effects meta-regression analysis and the correction of standard errors that defines fixed-effects meta-regression analysis (FE-MRA). We show how and explain why an unrestricted weighted least squares MRA (WLS-MRA) estimator is superior to conventional random-effects (or mixed-effects) meta-regression when there is publication (or small-sample) bias, is as good as FE-MRA in all cases, and is better than fixed effects in most practical applications. Simulations and statistical theory show that WLS-MRA provides satisfactory estimates of meta-regression coefficients that are practically equivalent to mixed effects or random effects when there is no publication bias. When there is publication selection bias, WLS-MRA always has smaller bias than mixed effects or random effects. In practical applications, an unrestricted WLS meta-regression is likely to give practically equivalent or superior estimates to fixed-effects, random-effects, and mixed-effects meta-regression approaches. However, random-effects meta-regression remains viable and perhaps somewhat preferable if selection for statistical significance (publication bias) can be ruled out and when random, additive normal heterogeneity is known to directly affect the 'true' regression coefficient. Copyright © 2016 John Wiley & Sons, Ltd.
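The unrestricted WLS estimator itself is simple: each reported effect is weighted by its inverse squared standard error, with no multiplicative-variance restriction and no additive random-effects term. The simulated effect sizes, standard errors, and moderator below are illustrative assumptions.

```python
import numpy as np

# Simulated meta-analytic data: m studies, each reporting an effect estimate
# with a known standard error, plus one moderator variable.
rng = np.random.default_rng(7)
m = 100
se = rng.uniform(0.05, 0.5, m)                   # reported standard errors
moderator = rng.normal(size=m)
true_effect = 0.3 + 0.2 * moderator
est = true_effect + rng.normal(0.0, se)          # observed effect estimates

# Unrestricted WLS meta-regression with inverse-variance weights 1/se^2.
X = np.column_stack([np.ones(m), moderator])
w = 1.0 / se**2
coef = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * est))
```

With no between-study heterogeneity in this simulation, the WLS coefficients recover the intercept 0.3 and moderator slope 0.2; the paper's point is that the same estimator remains well behaved when publication bias distorts random-effects weighting.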
Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors
Woodard, Dawn B.; Crainiceanu, Ciprian; Ruppert, David
2013-01-01
We propose a new method for regression using a parsimonious and scientifically interpretable representation of functional predictors. Our approach is designed for data that exhibit features such as spikes, dips, and plateaus whose frequency, location, size, and shape varies stochastically across subjects. We propose Bayesian inference of the joint functional and exposure models, and give a method for efficient computation. We contrast our approach with existing state-of-the-art methods for regression with functional predictors, and show that our method is more effective and efficient for data that include features occurring at varying locations. We apply our methodology to a large and complex dataset from the Sleep Heart Health Study, to quantify the association between sleep characteristics and health outcomes. Software and technical appendices are provided in online supplemental materials. PMID:24293988
Logistics Reduction Technologies for Exploration Missions
NASA Technical Reports Server (NTRS)
Broyan, James L., Jr.; Ewert, Michael K.; Fink, Patrick W.
2014-01-01
Human exploration missions under study are limited by the launch mass capacity of existing and planned launch vehicles. The logistical mass of crew items is typically considered separate from the vehicle structure, habitat outfitting, and life support systems. Although mass is typically the focus of exploration missions, due to its strong impact on launch vehicle and habitable volume for the crew, logistics volume also needs to be considered. NASA's Advanced Exploration Systems (AES) Logistics Reduction and Repurposing (LRR) Project is developing six logistics technologies guided by a systems engineering cradle-to-grave approach to enable after-use crew items to augment vehicle systems. Specifically, AES LRR is investigating the direct reduction of clothing mass, the repurposing of logistical packaging, the use of autonomous logistics management technologies, the processing of spent crew items to benefit radiation shielding and water recovery, and the conversion of trash to propulsion gases. Reduction of mass has a corresponding and significant impact to logistical volume. The reduction of logistical volume can reduce the overall pressurized vehicle mass directly, or indirectly benefit the mission by allowing for an increase in habitable volume during the mission. The systematic implementation of these types of technologies will increase launch mass efficiency by enabling items to be used for secondary purposes and improve the habitability of the vehicle as mission durations increase. Early studies have shown that the use of advanced logistics technologies can save approximately 20 m(sup 3) of volume during transit alone for a six-person Mars conjunction class mission.
Analysis of Jingdong Mall Logistics Distribution Model
NASA Astrophysics Data System (ADS)
Shao, Kang; Cheng, Feng
In recent years, the development of electronic commerce in China has accelerated. The role of logistics has been highlighted, and more and more electronic commerce enterprises are beginning to realize the importance of logistics to the success or failure of the enterprise. In this paper, the authors take Jingdong Mall as an example, performing a SWOT analysis of the current state of its self-built logistics system, identifying the problems in Jingdong Mall's current logistics distribution, and giving appropriate recommendations.
Regression Verification Using Impact Summaries
NASA Technical Reports Server (NTRS)
Backes, John; Person, Suzette J.; Rungta, Neha; Thachuk, Oksana
2013-01-01
Regression verification techniques are used to prove equivalence of syntactically similar programs. Checking equivalence of large programs, however, can be computationally expensive. Existing regression verification techniques rely on abstraction and decomposition techniques to reduce the computational effort of checking equivalence of the entire program. These techniques are sound but not complete. In this work, we propose a novel approach to improve scalability of regression verification by classifying the program behaviors generated during symbolic execution as either impacted or unimpacted. Our technique uses a combination of static analysis and symbolic execution to generate summaries of impacted program behaviors. The impact summaries are then checked for equivalence using an off-the-shelf decision procedure. We prove that our approach is both sound and complete for sequential programs, with respect to the depth bound of symbolic execution. Our evaluation on a set of sequential C artifacts shows that reducing the size of the summaries can help reduce the cost of software equivalence checking. Various reduction, abstraction, and compositional techniques have been developed to help scale software verification techniques to industrial-sized systems. Although such techniques have greatly increased the size and complexity of systems that can be checked, analysis of large software systems remains costly. Regression analysis techniques, e.g., regression testing [16], regression model checking [22], and regression verification [19], restrict the scope of the analysis by leveraging the differences between program versions. These techniques are based on the idea that if code is checked early in development, then subsequent versions can be checked against a prior (checked) version, leveraging the results of the previous analysis to reduce analysis cost of the current version. Regression verification addresses the problem of proving equivalence of closely related program versions.
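The core idea, proving two closely related program versions equivalent while focusing effort on impacted behaviors, can be conveyed with a much simpler stand-in: bounded exhaustive checking over a small input domain takes the place of symbolic execution, and a crude "impacted input" filter takes the place of impact summaries. All function names and versions below are illustrative, not the paper's actual technique:

```python
def bounded_equiv(f, g, domain):
    """Check observational equivalence of two program versions over a
    bounded input domain (a stand-in for the symbolic-execution check)."""
    return all(f(x) == g(x) for x in domain)

def impacted_inputs(f, g, domain):
    """Inputs whose behavior differs between versions -- the 'impacted'
    behaviors; everything else is unimpacted and need not be re-checked."""
    return [x for x in domain if f(x) != g(x)]

# Two syntactically different but semantically equivalent versions.
def v1(x):
    return x * 2 + 1

def v2(x):
    return (x << 1) | 1   # equivalent to v1 for non-negative integers

# A third version whose change only affects part of the input space.
def v3(x):
    return x * 2 + 1 if x < 5 else x * 2

equiv = bounded_equiv(v1, v2, range(1000))      # True
impacted = impacted_inputs(v1, v3, range(10))   # [5, 6, 7, 8, 9]
```

Unlike the paper's approach, a bounded check is complete only up to the chosen domain; the point here is just the impacted/unimpacted classification.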
Rudolf Keller
2004-08-10
In this project, a concept to improve the performance of aluminum production cells by introducing potlining additives was examined and tested. Boron oxide was added to cathode blocks, and titanium was dissolved in the metal pool; this resulted in the formation of titanium diboride and caused the molten aluminum to wet the carbonaceous cathode surface. Such wetting reportedly leads to operational improvements and extended cell life. In addition, boron oxide suppresses cyanide formation. This final report presents and discusses the results of this project. Substantial economic benefits for the practical implementation of the technology are projected, especially for modern cells with graphitized blocks. For example, with an energy savings of about 5% and an increase in pot life from 1500 to 2500 days, a cost savings of $0.023 per pound of aluminum produced is projected for a 200 kA pot.
Harrup, Mason K; Rollins, Harry W
2013-11-26
An additive comprising a phosphazene compound that has at least two reactive functional groups and at least one capping functional group bonded to phosphorus atoms of the phosphazene compound. One of the at least two reactive functional groups is configured to react with cellulose and the other of the at least two reactive functional groups is configured to react with a resin, such as an amine resin or a polycarboxylic acid resin. The at least one capping functional group is selected from the group consisting of a short chain ether group, an alkoxy group, or an aryloxy group. Also disclosed are an additive-resin admixture, a method of treating a wood product, and a wood product.
Embedded Sensors for Measuring Surface Regression
NASA Technical Reports Server (NTRS)
Gramer, Daniel J.; Taagen, Thomas J.; Vermaak, Anton G.
2006-01-01
non-eroding end of the sensor. The sensor signal can be transmitted from inside a high-pressure chamber to the ambient environment, using commercially available feedthrough connectors. Miniaturized internal recorders or wireless data transmission could also potentially be employed to eliminate the need for producing penetrations in the chamber case. The rungs are designed so that as each successive rung is eroded away, the resistance changes by an amount that yields a readily measurable signal larger than the background noise. (In addition, signal-conditioning techniques are used in processing the resistance readings to mitigate the effect of noise.) Hence, each discrete change of resistance serves to indicate the arrival of the regressing host material front at the known depth of the affected resistor rung. The average rate of regression between two adjacent resistors can be calculated simply as the distance between the resistors divided by the time interval between their resistance jumps. Advanced data reduction techniques have also been developed to establish the instantaneous surface position and regression rate when the regressing front is between rungs.
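The rate computation described above is simple to state: the average regression rate between adjacent rungs is the rung spacing divided by the time between their resistance jumps. A minimal sketch (the rung depths and jump times are made-up illustrative values, not data from the sensor):

```python
def regression_rates(depths_mm, jump_times_s):
    """Average surface-regression rate between adjacent resistor rungs:
    rung spacing divided by the interval between their resistance jumps."""
    assert len(depths_mm) == len(jump_times_s)
    rates = []
    for i in range(len(depths_mm) - 1):
        dd = depths_mm[i + 1] - depths_mm[i]        # distance between rungs
        dt = jump_times_s[i + 1] - jump_times_s[i]  # time between jumps
        rates.append(dd / dt)                       # mm/s over this interval
    return rates

# Hypothetical sensor: rungs every 2 mm, eroded at lengthening intervals,
# i.e. a decelerating regression front.
rates = regression_rates([0.0, 2.0, 4.0, 6.0], [0.0, 1.0, 2.5, 4.5])
```

The advanced data-reduction techniques mentioned in the abstract would interpolate between these discrete rung events to estimate the instantaneous front position; that step is not shown here.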
A regularization corrected score method for nonlinear regression models with covariate error.
Zucker, David M; Gorfine, Malka; Li, Yi; Tadesse, Mahlet G; Spiegelman, Donna
2013-03-01
Many regression analyses involve explanatory variables that are measured with error, and failing to account for this error is well known to lead to biased point and interval estimates of the regression coefficients. We present here a new general method for adjusting for covariate error. Our method consists of an approximate version of the Stefanski-Nakamura corrected score approach, using the method of regularization to obtain an approximate solution of the relevant integral equation. We develop the theory in the setting of classical likelihood models; this setting covers, for example, linear regression, nonlinear regression, logistic regression, and Poisson regression. The method is extremely general in terms of the types of measurement error models covered, and is a functional method in the sense of not involving assumptions on the distribution of the true covariate. We discuss the theoretical properties of the method and present simulation results in the logistic regression setting (univariate and multivariate). For illustration, we apply the method to data from the Harvard Nurses' Health Study concerning the relationship between physical activity and breast cancer mortality in the period following a diagnosis of breast cancer.
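The bias the authors refer to is easy to demonstrate by simulation in the simplest (linear) case covered by their framework: classical measurement error in a covariate attenuates the naive slope estimate by roughly var(x)/(var(x)+var(u)). This is only a sketch of the problem, not of the Stefanski-Nakamura corrected score or its regularized approximation; all parameter values are synthetic:

```python
import random

random.seed(42)
n, beta, err_sd = 20000, 2.0, 1.0   # true slope 2; measurement-error SD 1

x = [random.gauss(0.0, 1.0) for _ in range(n)]        # true covariate
w = [xi + random.gauss(0.0, err_sd) for xi in x]      # observed, with error
y = [beta * xi + random.gauss(0.0, 0.5) for xi in x]  # response

def ols_slope(u, v):
    """Ordinary least-squares slope of v on u."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    sxy = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    sxx = sum((a - mu) ** 2 for a in u)
    return sxy / sxx

slope_true_x = ols_slope(x, y)   # ~2.0: error-free covariate
slope_naive_w = ols_slope(w, y)  # ~1.0: attenuated by var(x)/(var(x)+var(u))
```

With unit variances for both the covariate and the error, the attenuation factor is 0.5, so the naive slope lands near 1.0 instead of the true 2.0, which is the kind of bias the corrected score method is designed to remove.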
Biomass Supply Logistics and Infrastructure
Sokhansanj, Shahabaddine
2009-04-01
The feedstock supply system encompasses numerous unit operations necessary to move lignocellulosic feedstock from the place where it is produced (in the field or on the stump) to the start of the conversion process (reactor throat) of the biorefinery. These unit operations, which include collection, storage, preprocessing, handling, and transportation, represent one of the largest technical and logistics challenges to the emerging lignocellulosic biorefining industry. This chapter briefly reviews methods of estimating the quantities of biomass, followed by harvesting and collection processes based on current practices for handling wet and dry forage materials. Storage and queuing are used to deal with seasonal harvest times, variable yields, and delivery schedules. Preprocessing can be as simple as grinding and formatting the biomass for increased bulk density or improved conversion efficiency, or it can be as complex as improving feedstock quality through fractionation, tissue separation, drying, blending, and densification. Handling and transportation consist of using a variety of transport equipment (truck, train, ship) to move the biomass from one point to another. The chapter also provides typical cost figures for harvest and processing of biomass.
Biomass supply logistics and infrastructure.
Sokhansanj, Shahabaddine; Hess, J Richard
2009-01-01
The feedstock supply system encompasses numerous unit operations necessary to move lignocellulosic feedstock from the place where it is produced (in the field or on the stump) to the start of the conversion process (reactor throat) of the biorefinery. These unit operations, which include collection, storage, preprocessing, handling, and transportation, represent one of the largest technical and logistics challenges to the emerging lignocellulosic biorefining industry. This chapter briefly reviews the methods of estimating the quantities of biomass, followed by harvesting and collection processes based on current practices for handling wet and dry forage materials. Storage and queuing are used to deal with seasonal harvest times, variable yields, and delivery schedules. Preprocessing can be as simple as grinding and formatting the biomass for increased bulk density or improved conversion efficiency, or it can be as complex as improving feedstock quality through fractionation, tissue separation, drying, blending, and densification. Handling and transportation consist of using a variety of transport equipment (truck, train, ship) to move the biomass from one point to another. The chapter also provides typical cost figures for harvest and processing of biomass.
FUTURE LOGISTICS AND OPERATIONAL ADAPTABILITY
Houck, Roger P.
2009-10-01
While we cannot predict the future, we can ascertain trends and examine them through the use of alternative futures methodologies and tools. From a logistics perspective, we know that many different futures are possible, all of which are obviously dependent on decisions we make in the present. As professional logisticians we are obligated to provide the field - our Soldiers - with our best professional opinion of what will result in success on the battlefield. Our view of the future should take history and contemporary conflict into account, but it must also consider that continuity with the past cannot be taken for granted. If we are too focused on past and current experience, then our vision of the future will be limited indeed. On the one hand, the future must be explained in language that does not defy common sense. On the other hand, the pace of change is such that we must conduct qualitative and quantitative trend analyses, forecasting, and explorative scenario development in ways that allow for significant breaks - or "shocks" - that may "change the game". We will need capabilities and solutions that are constantly evolving - and improving - to match the operational tempo of a radically changing threat environment. For those who provide quartermaster services, this article will briefly examine what this means from the perspective of creating what might be termed a preferred future.
The Automated Logistics Element Planning System (ALEPS)
NASA Technical Reports Server (NTRS)
Schwaab, Douglas G.
1991-01-01
The design and functions of ALEPS (Automated Logistics Element Planning System), a computer system that will automate planning and decision support for Space Station Freedom Logistics Elements (LEs) resupply and return operations, are described. ALEPS provides data management, planning, analysis, monitoring, interfacing, and flight certification in support of LE flight load planning activities. The prototype ALEPS algorithm development is also described.
Visualizing the Logistic Map with a Microcontroller
ERIC Educational Resources Information Center
Serna, Juan D.; Joshi, Amitabh
2012-01-01
The logistic map is one of the simplest nonlinear dynamical systems that clearly exhibits the route to chaos. In this paper, we explore the evolution of the logistic map using an open-source microcontroller connected to an array of light-emitting diodes (LEDs). We divide the one-dimensional domain interval [0,1] into ten equal parts, and associate…
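The visualization the authors describe, iterating the logistic map and lighting one of ten LEDs according to which tenth of [0, 1] the current value falls in, reduces to a few lines of code (shown here in Python rather than microcontroller firmware; the parameter values are illustrative):

```python
def logistic_map(r, x0, n):
    """Iterate x_{k+1} = r * x_k * (1 - x_k) and return the orbit."""
    orbit, x = [], x0
    for _ in range(n):
        x = r * x * (1 - x)
        orbit.append(x)
    return orbit

def led_index(x, n_leds=10):
    """Which of the n_leds equal subintervals of [0, 1] contains x."""
    return min(int(x * n_leds), n_leds - 1)

# r = 3.2: a stable 2-cycle, so only two LEDs ever light after transients.
# r = 4.0: chaos, and the orbit wanders through essentially all ten bins.
cycle = [led_index(x) for x in logistic_map(3.2, 0.3, 200)[100:]]
chaos = [led_index(x) for x in logistic_map(4.0, 0.3, 2000)[100:]]
```

Sweeping r from about 2.8 to 4.0 while watching which LEDs light reproduces the period-doubling route to chaos on the hardware.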
Antibiotic lock therapy: review of technique and logistical challenges
Justo, Julie Ann; Bookstaver, P Brandon
2014-01-01
Antibiotic lock therapy (ALT) for the prevention and treatment of catheter-related bloodstream infections is a simple strategy in theory, yet its real-world application may be delayed or avoided due to technical questions and/or logistical challenges. This review focuses on these latter aspects of ALT, including preparation information for a variety of antibiotic lock solutions (ie, aminoglycosides, beta-lactams, fluoroquinolones, folate antagonists, glycopeptides, glycylcyclines, lipopeptides, oxazolidinones, polymyxins, and tetracyclines) and common clinical issues surrounding ALT administration. Detailed data regarding concentrations, additives, stability/compatibility, and dwell times are summarized. Logistical challenges such as lock preparation procedures, use of additives (eg, heparin, citrate, or ethylenediaminetetraacetic acid), timing of initiation and therapy duration, optimal dwell time and catheter accessibility, and risks of ALT are also described. Development of local protocols is recommended in order to avoid these potential barriers and encourage utilization of ALT where appropriate. PMID:25548523
The Automated Logistics Element Planning System (ALEPS)
NASA Technical Reports Server (NTRS)
Schwaab, Douglas G.
1992-01-01
ALEPS, which is being developed to provide the SSF program with a computer system to automate logistics resupply/return cargo load planning and verification, is presented. ALEPS will make it possible to simultaneously optimize both the resupply flight load plan and the return flight reload plan for any of the logistics carriers. In the verification mode ALEPS will support the carrier's flight readiness reviews and control proper execution of the approved plans. It will also support the SSF inventory management system by providing electronic block updates to the inventory database on the cargo arriving at or departing the station aboard a logistics carrier. A prototype drawer packing algorithm is described which is capable of generating solutions for 3D packing of cargo items into a logistics carrier storage accommodation. It is concluded that ALEPS will provide the capability to generate and modify optimized loading plans for the logistics elements fleet.
[Factors associated with the addition of salt to prepared food].
de Castro, Raquel da Silva Assunção; Giatti, Luana; Barreto, Sandhi Maria
2014-05-01
The scope of this research was to investigate the potential differences between men and women in the addition of salt to prepared food. The study included 47,557 individuals aged 18 to 64 participating in the Risk and Protection Factors for Chronic Disease Surveillance System by Telephone Interview carried out in 26 Brazilian state capitals and the Federal District in 2006. Differences between men and women were tested by the chi-square test, and the association magnitudes between the dependent and independent variables were estimated by the odds ratio obtained by multiple logistic regression analysis. The prevalence of the addition of salt to prepared food was 8.3%, being higher among men (9.8% vs. 6.9%, p < 0.01). After adjustment, the addition of salt to prepared food was higher in individuals with self-rated fair to poor health, reporting cardiovascular disease, and living in the North of Brazil. Hypertensive individuals reported adding less salt to prepared food. Educational level was not associated with salt usage. Men add more salt than women. Public health policies aimed at reducing salt intake by the population should take into account the gender differences in salt intake and the factors that contribute to such differences.
Regression analysis of cytopathological data
Whittemore, A.S.; McLarty, J.W.; Fortson, N.; Anderson, K.
1982-12-01
Epithelial cells from the human body are frequently labelled according to one of several ordered levels of abnormality, ranging from normal to malignant. The label of the most abnormal cell in a specimen determines the score for the specimen. This paper presents a model for the regression of specimen scores against continuous and discrete variables, as in host exposure to carcinogens. Application to data and tests for adequacy of model fit are illustrated using sputum specimens obtained from a cohort of former asbestos workers.
Logistics Modeling for Lunar Exploration Systems
NASA Technical Reports Server (NTRS)
Andraschko, Mark R.; Merrill, R. Gabe; Earle, Kevin D.
2008-01-01
The extensive logistics required to support extended crewed operations in space make effective modeling of logistics requirements and deployment critical to predicting the behavior of human lunar exploration systems. This paper discusses the software that has been developed as part of the Campaign Manifest Analysis Tool in support of strategic analysis activities under the Constellation Architecture Team - Lunar. The described logistics module enables definition of logistics requirements across multiple surface locations and allows for the transfer of logistics between those locations. A key feature of the module is the loading algorithm that is used to efficiently load logistics by type into carriers and then onto landers. Attention is given to the capabilities and limitations of this loading algorithm, particularly with regard to surface transfers. These capabilities are described within the context of the object-oriented software implementation, with details provided on the applicability of using this approach to model other human exploration scenarios. Some challenges of incorporating probabilistics into this type of logistics analysis model are discussed at a high level.
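The loading algorithm itself is not reproduced in the abstract, but the two-stage idea, loading logistics items into carriers and then carriers onto landers, can be sketched as a greedy first-fit packing. All item names, masses, and capacities below are hypothetical, and a real manifest tool would add volume limits, type constraints, and surface-transfer logic:

```python
def first_fit(items, bin_capacity):
    """Greedy first-fit packing: place each (name, mass) item into the
    first bin with enough remaining capacity, opening a new bin if none fits."""
    bins = []   # each bin: {"remaining": capacity left, "items": names}
    for name, mass in items:
        for b in bins:
            if b["remaining"] >= mass:
                b["items"].append(name)
                b["remaining"] -= mass
                break
        else:
            bins.append({"remaining": bin_capacity - mass, "items": [name]})
    return bins

# Stage 1: pack logistics items (by type) into carriers of 500 kg capacity.
items = [("food", 300), ("water", 250), ("spares", 180), ("science", 120),
         ("crew_provisions", 90)]
carriers = first_fit(items, 500)

# Stage 2: pack the loaded carriers onto landers of 1000 kg capacity.
carrier_masses = [("carrier_%d" % i, 500 - b["remaining"])
                  for i, b in enumerate(carriers)]
landers = first_fit(carrier_masses, 1000)
```

First-fit is a simple heuristic; the campaign tool described above would need to trade packing efficiency against delivery timing and surface location constraints.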
ISS Logistics Hardware Disposition and Metrics Validation
NASA Technical Reports Server (NTRS)
Rogers, Toneka R.
2010-01-01
I was assigned to the Logistics Division of the International Space Station (ISS)/Spacecraft Processing Directorate. The Division consists of eight NASA engineers and specialists who oversee the logistics portion of the Checkout, Assembly, and Payload Processing Services (CAPPS) contract. Boeing, its sub-contractors, and the Boeing Prime contract out of Johnson Space Center provide the Integrated Logistics Support for the ISS activities at Kennedy Space Center. Essentially they ensure that spares are available to support flight hardware processing and the associated ground support equipment (GSE). Boeing maintains a depot for electrical, mechanical, and structural modification and/or repair capability as required. My assigned task was to learn the project management techniques utilized by NASA and its contractors to provide an efficient and effective logistics support infrastructure to the ISS program. Within the Space Station Processing Facility (SSPF) I was exposed to logistics support components, such as the NASA Spacecraft Services Depot (NSSD) capabilities, mission processing tools and techniques, and warehouse support issues, required for integrating Space Station elements at the Kennedy Space Center. I also supported the identification of near-term ISS hardware and Ground Support Equipment (GSE) candidates for excessing/disposition prior to October 2010, and the validation of several logistics metrics used by the contractor to measure logistics support effectiveness.
Multiatlas segmentation as nonparametric regression.
Awate, Suyash P; Whitaker, Ross T
2014-09-01
This paper proposes a novel theoretical framework to model and analyze the statistical characteristics of a wide range of segmentation methods that incorporate a database of label maps or atlases; such methods are termed as label fusion or multiatlas segmentation. We model these multiatlas segmentation problems as nonparametric regression problems in the high-dimensional space of image patches. We analyze the nonparametric estimator's convergence behavior that characterizes expected segmentation error as a function of the size of the multiatlas database. We show that this error has an analytic form involving several parameters that are fundamental to the specific segmentation problem (determined by the chosen anatomical structure, imaging modality, registration algorithm, and label-fusion algorithm). We describe how to estimate these parameters and show that several human anatomical structures exhibit the trends modeled analytically. We use these parameter estimates to optimize the regression estimator. We show that the expected error for large database sizes is well predicted by models learned on small databases. Thus, a few expert segmentations can help predict the database sizes required to keep the expected error below a specified tolerance level. Such cost-benefit analysis is crucial for deploying clinical multiatlas segmentation systems.
Multiatlas Segmentation as Nonparametric Regression
Awate, Suyash P.; Whitaker, Ross T.
2015-01-01
This paper proposes a novel theoretical framework to model and analyze the statistical characteristics of a wide range of segmentation methods that incorporate a database of label maps or atlases; such methods are termed as label fusion or multiatlas segmentation. We model these multiatlas segmentation problems as nonparametric regression problems in the high-dimensional space of image patches. We analyze the nonparametric estimator’s convergence behavior that characterizes expected segmentation error as a function of the size of the multiatlas database. We show that this error has an analytic form involving several parameters that are fundamental to the specific segmentation problem (determined by the chosen anatomical structure, imaging modality, registration algorithm, and label-fusion algorithm). We describe how to estimate these parameters and show that several human anatomical structures exhibit the trends modeled analytically. We use these parameter estimates to optimize the regression estimator. We show that the expected error for large database sizes is well predicted by models learned on small databases. Thus, a few expert segmentations can help predict the database sizes required to keep the expected error below a specified tolerance level. Such cost-benefit analysis is crucial for deploying clinical multiatlas segmentation systems. PMID:24802528
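The regression view taken in the two records above can be made concrete with a toy label-fusion step: each target patch's label is a nonparametric (nearest-neighbor, majority-vote) regression on the most similar atlas patches. The tiny 1-D "patches" and labels below are fabricated purely for illustration; real multiatlas pipelines operate on registered 3-D image patches:

```python
def fuse_label(target_patch, atlas_patches, atlas_labels, k=3):
    """kNN label fusion: nonparametric regression of label on patch
    appearance, using squared Euclidean distance between patch vectors."""
    def dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    ranked = sorted(range(len(atlas_patches)),
                    key=lambda i: dist(target_patch, atlas_patches[i]))
    votes = [atlas_labels[i] for i in ranked[:k]]
    return max(set(votes), key=votes.count)   # majority vote

# Toy atlas database: bright patches labelled 1 (structure), dark ones 0.
atlas_patches = [[0.9, 0.8, 0.9], [0.8, 0.9, 0.7], [0.1, 0.2, 0.1],
                 [0.2, 0.1, 0.2], [0.85, 0.9, 0.8]]
atlas_labels = [1, 1, 0, 0, 1]

label = fuse_label([0.8, 0.85, 0.9], atlas_patches, atlas_labels, k=3)
```

The paper's contribution is the convergence analysis of exactly this kind of estimator: how the expected labelling error shrinks as the atlas database grows.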
Mapping geogenic radon potential by regression kriging.
Pásztor, László; Szabó, Katalin Zsuzsanna; Szatmári, Gábor; Laborczi, Annamária; Horváth, Ákos
2016-02-15
Radon (²²²Rn) gas is produced in the radioactive decay chain of uranium (²³⁸U), an element that is naturally present in soils. Radon is transported mainly by diffusion and convection mechanisms through the soil, depending mainly on the physical and meteorological parameters of the soil, and can enter and accumulate in buildings. Health risks originating from indoor radon concentration can be attributed to natural factors and are characterized by the geogenic radon potential (GRP). Identification of areas with high health risk requires spatial modeling, that is, mapping of radon risk. In addition to geology and meteorology, physical soil properties play a significant role in the determination of GRP. In order to compile a reliable GRP map for a model area in Central Hungary, spatial auxiliary information representing GRP-forming environmental factors was taken into account to support the spatial inference of the locally measured GRP values. Since the number of measured sites was limited, efficient spatial prediction methodologies were sought to construct a reliable map for a larger area. Regression kriging (RK) was applied for the interpolation, using spatially exhaustive auxiliary data on soil, geology, topography, land use, and climate. RK divides the spatial inference into two parts. First, the deterministic component of the target variable is determined by a regression model. The residuals of the multiple linear regression analysis represent the spatially varying but dependent stochastic component, which is interpolated by kriging. The final map is the sum of the two component predictions. Overall accuracy of the map was tested by leave-one-out cross-validation. Furthermore, the spatial reliability of the resultant map is estimated by the calculation of the 90% prediction interval of the local prediction values. The applicability of the applied method, as well as that of the map, is discussed briefly.
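The two-part structure of regression kriging, a deterministic regression component plus kriged residuals, can be sketched in a few dozen lines. Everything below is synthetic: the covariance model, one stand-in auxiliary variable, and the data; the study itself uses exhaustive soil, geology, topography, land-use, and climate covariates:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 'measured' target at scattered sites: trend on one auxiliary
# variable plus noise (standing in for locally measured GRP values).
n = 80
coords = rng.uniform(0, 10, size=(n, 2))              # site locations
covar = coords[:, 0] * 0.5 + rng.normal(0, 0.1, n)    # auxiliary variable
z = 2.0 + 1.5 * covar + rng.normal(0, 0.3, n)         # target variable

# Part 1: regression (deterministic component).
X = np.column_stack([np.ones(n), covar])
beta, *_ = np.linalg.lstsq(X, z, rcond=None)
resid = z - X @ beta

# Part 2: simple kriging of the residuals with an exponential covariance.
def cov(h, sill=0.09, corr_range=3.0):
    return sill * np.exp(-h / corr_range)

D = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
C = cov(D) + 1e-9 * np.eye(n)        # tiny jitter for numerical stability

def rk_predict(pt, pt_covar):
    """RK prediction = regression trend + kriged residual at pt."""
    c0 = cov(np.linalg.norm(coords - pt, axis=1))
    w = np.linalg.solve(C, c0)       # simple-kriging weights
    return (beta[0] + beta[1] * pt_covar) + w @ resid

pred = rk_predict(np.array([5.0, 5.0]), 2.5)
```

Leave-one-out cross-validation, as in the paper, would repeat `rk_predict` with each site withheld in turn and compare against the held-out measurement.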
Recognition of caudal regression syndrome.
Boulas, Mari M
2009-04-01
Caudal regression syndrome, also referred to as caudal dysplasia and sacral agenesis syndrome, is a rare congenital malformation characterized by varying degrees of developmental failure early in gestation. It involves the lower extremities, the lumbar and coccygeal vertebrae, and corresponding segments of the spinal cord. This is a rare disorder, and the true pathogenesis is unclear. The etiology is thought to be related to maternal diabetes, genetic predisposition, and vascular hypoperfusion, but no single causative factor has been determined. Fetal diagnostic tools allow for early recognition of the syndrome, and careful examination of the newborn is essential to determine the extent of the disorder. Associated organ system dysfunction depends on the severity of the disease. Related defects are structural, and systemic problems involving the respiratory, cardiac, gastrointestinal, urinary, orthopedic, and neurologic systems can be present in varying degrees of severity and in different combinations. A multidisciplinary approach to management is crucial. Because the primary pathology is irreversible, treatment is only supportive.
Practical Session: Multiple Linear Regression
NASA Astrophysics Data System (ADS)
Clausel, M.; Grégoire, G.
2014-12-01
Three exercises are proposed to illustrate multiple linear regression. In the first, one investigates the influence of several factors on atmospheric pollution. It has been proposed by D. Chessel and A.B. Dufour at Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr33.pdf) and is based on data from 20 U.S. cities. Exercise 2 is an introduction to model selection, whereas Exercise 3 provides a first example of analysis of variance. Exercises 2 and 3 have been proposed by A. Dalalyan at ENPC (see Exercises 2 and 3 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_5.pdf).
Featured Partner: Saddle Creek Logistics Services
This EPA fact sheet spotlights Saddle Creek Logistics as a SmartWay partner committed to sustainability in reducing greenhouse gas emissions and air pollution caused by freight transportation, partly by growing its compressed natural gas (CNG) vehicles for
Lumbar herniated disc: spontaneous regression
Yüksel, Kasım Zafer
2017-01-01
Background Low back pain is a frequent condition that results in substantial disability and causes admission of patients to neurosurgery clinics. The aim of this study was to evaluate and present the therapeutic outcomes in lumbar disc hernia (LDH) patients treated by means of a conservative approach, consisting of bed rest and medical therapy. Methods This retrospective cohort study was carried out in the neurosurgery departments of hospitals in Kahramanmaraş city; 23 patients diagnosed with LDH at the levels of L3−L4, L4−L5 or L5−S1 were enrolled. Results The average age was 38.4 ± 8.0 years, and the chief complaint was low back pain and sciatica radiating to one or both lower extremities. Conservative treatment was administered. Neurological examination findings, durations of treatment, and intervals until symptomatic recovery were recorded. Lasègue tests and neurosensory examination revealed that mild neurological deficits existed in 16 of our patients. Previously, 5 patients had received physiotherapy and 7 patients had been on medical treatment. The numbers of patients with LDH at the levels of L3−L4, L4−L5, and L5−S1 were 1, 13, and 9, respectively. All patients reported that they had benefited from medical treatment and bed rest, and radiologic improvement was observed simultaneously on MRI scans. The average duration until symptomatic recovery and/or regression of LDH symptoms was 13.6 ± 5.4 months (range: 5−22). Conclusions It should be kept in mind that lumbar disc hernias can regress with medical treatment and rest, without surgery, and there should be an awareness that these patients can recover radiologically. This must be taken into account during decision making for surgical intervention in LDH patients devoid of indications for emergency surgery. PMID:28119770
A Gibbs sampler for multivariate linear regression
NASA Astrophysics Data System (ADS)
Mantz, Adam B.
2016-04-01
Kelly described an efficient algorithm, using Gibbs sampling, for performing linear regression in the fairly general case where non-zero measurement errors exist for both the covariates and response variables, where these measurements may be correlated (for the same data point), where the response variable is affected by intrinsic scatter in addition to measurement error, and where the prior distribution of covariates is modelled by a flexible mixture of Gaussians rather than assumed to be uniform. Here, I extend the Kelly algorithm in two ways. First, the procedure is generalized to the case of multiple response variables. Secondly, I describe how to model the prior distribution of covariates using a Dirichlet process, which can be thought of as a Gaussian mixture where the number of mixture components is learned from the data. I present an example of multivariate regression using the extended algorithm, namely fitting scaling relations of the gas mass, temperature, and luminosity of dynamically relaxed galaxy clusters as a function of their mass and redshift. An implementation of the Gibbs sampler in the R language, called LRGS, is provided.
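The flavor of the Gibbs sampler can be conveyed with a stripped-down version for one covariate, one response, no measurement errors, and flat/Jeffreys priors: each parameter is drawn in turn from its full conditional. This deliberately omits the measurement-error handling, intrinsic scatter, and the Gaussian-mixture/Dirichlet-process covariate model that make Kelly's algorithm and LRGS useful; it is only a sketch of the Gibbs mechanics on synthetic data:

```python
import math
import random

random.seed(1)

# Synthetic data: y = 1 + 2 x + N(0, 0.5^2).
n = 200
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [1.0 + 2.0 * xi + random.gauss(0.0, 0.5) for xi in x]

a, b, s2 = 0.0, 0.0, 1.0        # state: intercept, slope, noise variance
sxx = sum(xi * xi for xi in x)
draws = []
for it in range(3000):
    # a | b, s2 ~ Normal( mean(y - b x), s2 / n )
    a = random.gauss(sum(yi - b * xi for xi, yi in zip(x, y)) / n,
                     math.sqrt(s2 / n))
    # b | a, s2 ~ Normal( sum x (y - a) / sum x^2, s2 / sum x^2 )
    b = random.gauss(sum(xi * (yi - a) for xi, yi in zip(x, y)) / sxx,
                     math.sqrt(s2 / sxx))
    # s2 | a, b ~ Inv-Gamma(n/2, SSR/2) under the Jeffreys prior 1/s2.
    ssr = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    s2 = 1.0 / random.gammavariate(n / 2.0, 2.0 / ssr)
    if it >= 500:               # discard burn-in
        draws.append((a, b, s2))

a_mean = sum(d[0] for d in draws) / len(draws)
b_mean = sum(d[1] for d in draws) / len(draws)
```

Extending this to multiple responses and a flexible covariate prior, as in the article, adds more conditional draws to the same loop rather than changing its structure.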
Scientific Progress or Regress in Sports Physiology?
Böning, Dieter
2016-11-01
In modern societies there is strong belief in scientific progress, but, unfortunately, a parallel partial regress occurs because of often avoidable mistakes. Mistakes are mainly forgetting, erroneous theories, errors in experiments and manuscripts, prejudice, selected publication of "positive" results, and fraud. An example of forgetting is that methods introduced decades ago are used without knowing the underlying theories: Basic articles are no longer read or cited. This omission may cause incorrect interpretation of results. For instance, false use of actual base excess instead of standard base excess for calculation of the number of hydrogen ions leaving the muscles raised the idea that an unknown fixed acid is produced in addition to lactic acid during exercise. An erroneous theory led to the conclusion that lactate is not the anion of a strong acid but a buffer. Mistakes occur after incorrect application of a method, after exclusion of unwelcome values, during evaluation of measurements by false calculations, or during preparation of manuscripts. Co-authors, as well as reviewers, do not always carefully read papers before publication. Peer reviewers might be biased against a hypothesis or an author. A general problem is selected publication of positive results. An example of fraud in sports medicine is the presence of doped subjects in groups of investigated athletes. To reduce regress, it is important that investigators search both original and recent articles on a topic and conscientiously examine the data. All co-authors and reviewers should read the text thoroughly and inspect all tables and figures in a manuscript.
Investigating the strategic antecedents of agility in humanitarian logistics.
L'Hermitte, Cécile; Brooks, Benjamin; Bowles, Marcus; Tatham, Peter H
2016-12-16
This study investigates the strategic antecedents of operational agility in humanitarian logistics. It began by identifying the particular actions to be taken at the strategic level of a humanitarian organisation to support field-level agility. Next, quantitative data (n=59) were collected on four strategic-level capabilities (being purposeful, action-focused, collaborative, and learning-oriented) and on operational agility (field responsiveness and flexibility). Using a quantitative analysis, the study tested the relationship between organisational capacity building and operational agility and found that the four strategic-level capabilities are fundamental building blocks of agility. Collectively they account for 52 per cent of the ability of humanitarian logisticians to deal with ongoing changes and disruptions in the field. This study emphasises the need for researchers and practitioners to embrace a broader perspective of agility in humanitarian logistics. In addition, it highlights the inherently strategic nature of agility, the development of which involves focusing simultaneously on multiple drivers.
Sustainment and Logistics in Better Buying Power
2015-07-01
Defense AT&L: July–August 2015. Sustainment and Logistics in Better Buying Power. David J. Berteau. Berteau is Assistant Secretary of Defense...Better Buying Power (BBP) in 2010, its key sustainment initiative has focused on Performance-Based Logistics (PBL). With the updated guidance...for BBP 3.0 issued April 9, it is worth expanding the view of these updated initiatives through the sustainment prism. This article finds that
Logistics: Supply Based or Distribution Based?
2007-04-01
supply chain management, the need to adapt systems as conditions and capabilities change may not be clear, or the root causes of problems may not be...logistics-system and supply-chain design and management positions. They need to become adept at integrating the full range of options available to...distribution-based and supply-based designs, the Army, in conjunction with its joint supply-chain partners, should seek optimal, balanced logistics-system designs that it can adapt quickly to changing conditions.
Logistics Handbook for Strategic Mobility Planning
1992-09-01
AD-A257 075. MTMCTEA Reference 92-700-2: Logistics Handbook for Strategic Mobility Planning, September 1992. Military Traffic Management Command Transportation Engineering Agency (MTMCTEA) transportation and transportability studies. D. REFERENCES. Source references
Cunningham, Marc; Brown, Niquelle; Sacher, Suzy; Hatch, Benjamin; Inglis, Andrew; Aronovich, Dana
2015-01-01
Background: Contraceptive prevalence rate (CPR) is a vital indicator used by country governments, international donors, and other stakeholders for measuring progress in family planning programs against country targets and global initiatives as well as for estimating health outcomes. Because of the need for more frequent CPR estimates than population-based surveys currently provide, alternative approaches for estimating CPRs are being explored, including using contraceptive logistics data. Methods: Using data from the Demographic and Health Surveys (DHS) in 30 countries, population data from the United States Census Bureau International Database, and logistics data from the Procurement Planning and Monitoring Report (PPMR) and the Pipeline Monitoring and Procurement Planning System (PipeLine), we developed and evaluated 3 models to generate country-level, public-sector contraceptive prevalence estimates for injectable contraceptives, oral contraceptives, and male condoms. Models included: direct estimation through existing couple-years of protection (CYP) conversion factors, bivariate linear regression, and multivariate linear regression. Model evaluation consisted of comparing the referent DHS prevalence rates for each short-acting method with the model-generated prevalence rate using multiple metrics, including mean absolute error and proportion of countries where the modeled prevalence rate for each method was within 1, 2, or 5 percentage points of the DHS referent value. Results: For the methods studied, family planning use estimates from public-sector logistics data were correlated with those from the DHS, validating the quality and accuracy of current public-sector logistics data. Logistics data for oral and injectable contraceptives were significantly associated (P<.05) with the referent DHS values for both bivariate and multivariate models. For condoms, however, that association was only significant for the bivariate model. With the exception of the CYP
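The direct CYP-based estimation model described above can be sketched as follows. The conversion factors are commonly cited illustrative CYP values, and the function name and structure are assumptions, not the study's exact specification:

```python
# Sketch of direct prevalence estimation from logistics data via
# couple-years of protection (CYP). The factors below are commonly cited
# illustrative values, not necessarily those used in the study.
CYP_FACTORS = {
    "condom": 120,      # condoms distributed per couple-year of protection
    "oral": 15,         # pill cycles per CYP
    "injectable": 4,    # injectable doses per CYP
}

def cyp_prevalence(quantity_distributed, method, women_reproductive_age):
    """Estimate public-sector contraceptive prevalence (%) for one method:
    convert quantities distributed to CYP, then divide by the number of
    women of reproductive age."""
    cyp = quantity_distributed / CYP_FACTORS[method]
    return 100.0 * cyp / women_reproductive_age
```

For example, 1.5 million pill cycles distributed in a population of 1 million women of reproductive age would correspond to a public-sector oral-contraceptive prevalence estimate of 10%.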
Logistics Reduction Technologies for Exploration Missions
NASA Technical Reports Server (NTRS)
Broyan, James L., Jr.; Ewert, Michael K.; Fink, Patrick W.
2014-01-01
Human exploration missions under study are very limited by the launch mass capacity of existing and planned vehicles. The logistical mass of crew items is typically considered separate from the vehicle structure, habitat outfitting, and life support systems. Consequently, crew item logistical mass is typically competing with vehicle systems for mass allocation. NASA's Advanced Exploration Systems (AES) Logistics Reduction and Repurposing (LRR) Project is developing five logistics technologies guided by a systems engineering cradle-to-grave approach to enable used crew items to augment vehicle systems. Specifically, AES LRR is investigating the direct reduction of clothing mass, the repurposing of logistical packaging, the use of autonomous logistics management technologies, the processing of spent crew items to benefit radiation shielding and water recovery, and the conversion of trash to propulsion gases. The systematic implementation of these types of technologies will increase launch mass efficiency by enabling items to be used for secondary purposes and improve the habitability of the vehicle as the mission duration increases. This paper provides a description and the challenges of the five technologies under development and the estimated overall mission benefits of each technology.
Early development and regression in Rett syndrome.
Lee, J Y L; Leonard, H; Piek, J P; Downs, J
2013-12-01
This study utilized developmental profiling to examine symptoms in 14 girls with genetically confirmed Rett syndrome whose families were participating in the Australian Rett syndrome or InterRett database. Regression was mostly characterized by loss of hand and/or communication skills (13/14), except for one girl who demonstrated slowing of skill development. Social withdrawal and inconsolable crying often developed simultaneously (9/14), with social withdrawal lasting a shorter duration than inconsolable crying. Previously acquired gross motor skills declined in just over half of the sample (8/14), mostly observed as a loss of balance. Early abnormalities such as vomiting and strabismus were also seen. Our findings provide additional insight into the early clinical profile of Rett syndrome.
ISS Update: Logistics Reduction and Repurposing (Part 1)
Public Affairs Officer Brandi Dean interviews Sarah Shull, Deputy Project Manager Logistics Reduction and Repurposing. Shull, who is with the Advanced Exploration Systems, discusses the Logistics t...
ISS Update: Logistics Reduction and Repurposing (Part 2)
Public Affairs Officer Brandi Dean interviews Sarah Shull, Deputy Project Manager Logistics Reduction and Repurposing. Shull, who is with the Advanced Exploration Systems, discusses the Logistics t...
Genetics Home Reference: caudal regression syndrome
Caudal regression syndrome is estimated to occur in 1 to ... parts of the skeleton, gastrointestinal system, and genitourinary ... caudal regression syndrome results from the presence of an abnormal ...
Generalization of the logistic distribution in the dynamic model of wind direction
NASA Astrophysics Data System (ADS)
Kaplya, E. V.
2016-12-01
A statistical regularity in the dynamics of wind direction has been found. The density distribution of the increment of the wind-direction angle has been approximated using a generalized advanced logistic distribution. The advanced logistic distribution involves an additional power-law parameter. The parameters of the approximation function have been computed from experimental data using the method of least squares. The consistency of the proposed function with meteorological data has been tested using Pearson's chi-squared test and the Kolmogorov test.
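A minimal sketch of the procedure described above, assuming the Type I generalized logistic family with CDF F(x; a) = (1 + e^(-x))^(-a), where a is the extra power-law parameter, and a simple grid-search least-squares fit to a histogram plus a Kolmogorov-type goodness-of-fit statistic. The paper's exact parameterization may differ:

```python
import math
import random

def glogistic_cdf(x, alpha):
    """Type I generalized logistic CDF: (1 + e^-x)^-alpha (an assumption)."""
    return (1.0 + math.exp(-x)) ** (-alpha)

def glogistic_pdf(x, alpha):
    e = math.exp(-x)
    return alpha * e * (1.0 + e) ** (-(alpha + 1.0))

def sample(alpha, n, seed=1):
    """Inverse-CDF sampling: x = -ln(u^(-1/alpha) - 1)."""
    rng = random.Random(seed)
    return [-math.log(rng.uniform(1e-12, 1 - 1e-12) ** (-1.0 / alpha) - 1.0)
            for _ in range(n)]

def fit_alpha_least_squares(data, bins=40):
    """Grid-search least squares of the pdf against a histogram density."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        counts[min(int((x - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    density = [c / (len(data) * width) for c in counts]
    best, best_err = None, float("inf")
    a = 0.2
    while a <= 5.0:
        err = sum((glogistic_pdf(x, a) - d) ** 2
                  for x, d in zip(centers, density))
        if err < best_err:
            best, best_err = a, err
        a += 0.02
    return best

def ks_statistic(data, alpha):
    """Kolmogorov statistic: max gap between empirical and model CDF."""
    xs = sorted(data)
    n = len(xs)
    return max(max(abs((i + 1) / n - glogistic_cdf(x, alpha)),
                   abs(i / n - glogistic_cdf(x, alpha)))
               for i, x in enumerate(xs))
```

With a = 1 the family reduces to the ordinary logistic distribution, which is one way to sanity-check the fit on symmetric data.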
Semiparametric regression during 2003–2007
Ruppert, David; Wand, M.P.; Carroll, Raymond J.
2010-01-01
Semiparametric regression is a fusion between parametric regression and nonparametric regression that integrates low-rank penalized splines, mixed model and hierarchical Bayesian methodology – thus allowing more streamlined handling of longitudinal and spatial correlation. We review progress in the field over the five-year period between 2003 and 2007. We find semiparametric regression to be a vibrant field with substantial involvement and activity, continual enhancement and widespread application. PMID:20305800
Ahn, Jae Joon; Kim, Young Min; Yoo, Keunje; Park, Joonhong; Oh, Kyong Joo
2012-11-01
For groundwater conservation and management, it is important to accurately assess groundwater pollution vulnerability. This study proposed an integrated model using ridge regression and a genetic algorithm (GA) to effectively select the major hydro-geological parameters influencing groundwater pollution vulnerability in an aquifer. The GA-Ridge regression method determined that depth to water, net recharge, topography, and the impact of vadose zone media were the hydro-geological parameters that influenced trichloroethene pollution vulnerability in a Korean aquifer. When using these selected hydro-geological parameters, the accuracy was improved for various statistical nonlinear and artificial intelligence (AI) techniques, such as multinomial logistic regression, decision trees, artificial neural networks, and case-based reasoning. These results provide a proof of concept that the GA-Ridge regression is effective at determining influential hydro-geological parameters for the pollution vulnerability of an aquifer, and in turn, improves the AI performance in assessing groundwater pollution vulnerability.
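The ridge-shrinkage step at the core of the GA-Ridge method can be sketched with the closed-form estimator beta = (X'X + lambda*I)^-1 X'y. The GA search over hydro-geological parameter subsets is omitted, and all data and names here are illustrative:

```python
# Closed-form ridge regression on small matrices, pure Python.
# This sketches only the ridge component of the GA-Ridge approach.

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def solve(A, b):
    """Gaussian elimination with partial pivoting; solves A x = b."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c]
                              for c in range(r + 1, n))) / M[r][r]
    return x

def ridge(X, y, lam):
    """beta = (X'X + lam*I)^-1 X'y; lam = 0 recovers ordinary least squares."""
    Xt = [list(r) for r in zip(*X)]
    XtX = matmul(Xt, X)
    for i in range(len(XtX)):
        XtX[i][i] += lam  # ridge penalty shrinks the coefficients
    Xty = [sum(Xt[i][j] * y[j] for j in range(len(y)))
           for i in range(len(Xt))]
    return solve(XtX, Xty)
```

Increasing the penalty shrinks the coefficient vector toward zero, which is what stabilizes the estimates when correlated hydro-geological predictors are present.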
Yang, Shun-hua; Zhang, Hai-tao; Guo, Long; Ren, Yan
2015-06-01
Relative elevation and stream power index were selected as auxiliary variables based on correlation analysis for mapping soil organic matter. Geographically weighted regression Kriging (GWRK) and regression Kriging (RK) were used for spatial interpolation of soil organic matter and compared with ordinary Kriging (OK), which acts as a control. The results indicated that soil organic matter was significantly positively correlated with relative elevation, whilst it had a significantly negative correlation with stream power index. Semivariance analysis showed that both soil organic matter content and its residuals (including the ordinary least squares regression residual and the GWR residual) had strong spatial autocorrelation. Interpolation accuracies of the different methods were estimated based on a data set of 98 validation samples. Results showed that the mean error (ME), mean absolute error (MAE) and root mean square error (RMSE) of RK were respectively 39.2%, 17.7% and 20.6% lower than the corresponding values of OK, with a relative improvement (RI) of 20.63. GWRK showed a similar tendency, with its ME, MAE and RMSE respectively 60.6%, 23.7% and 27.6% lower than those of OK, with an RI of 59.79. Therefore, both RK and GWRK significantly improved the accuracy of OK interpolation of soil organic matter due to their incorporation of auxiliary variables. In addition, GWRK performed markedly better than RK in this study, and its improved performance should be attributed to the consideration of sample spatial locations.
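The validation metrics used above (ME, MAE, RMSE) are standard and can be sketched as below; the relative-improvement definition here (percent RMSE reduction versus the OK reference) is an assumption, since the abstract does not state its formula:

```python
import math

def mean_error(obs, pred):
    """ME: average signed error (bias); near zero means unbiased."""
    return sum(p - o for o, p in zip(obs, pred)) / len(obs)

def mean_abs_error(obs, pred):
    """MAE: average magnitude of the errors."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)

def rmse(obs, pred):
    """RMSE: root mean square error; penalizes large errors more."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))

def relative_improvement(rmse_ref, rmse_new):
    """Assumed RI definition: percent RMSE reduction vs. the reference (OK)."""
    return 100.0 * (rmse_ref - rmse_new) / rmse_ref
```

Computed over the 98 validation samples, these metrics would yield the kind of RK-versus-OK and GWRK-versus-OK comparisons reported above.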
Bayesian Unimodal Density Regression for Causal Inference
ERIC Educational Resources Information Center
Karabatsos, George; Walker, Stephen G.
2011-01-01
Karabatsos and Walker (2011) introduced a new Bayesian nonparametric (BNP) regression model. Through analyses of real and simulated data, they showed that the BNP regression model outperforms other parametric and nonparametric regression models of common use, in terms of predictive accuracy of the outcome (dependent) variable. The other,…
Developmental Regression in Autism Spectrum Disorders
ERIC Educational Resources Information Center
Rogers, Sally J.
2004-01-01
The occurrence of developmental regression in autism is one of the more puzzling features of this disorder. Although several studies have documented the validity of parental reports of regression using home videos, accumulating data suggest that most children who demonstrate regression also demonstrated previous, subtle, developmental differences.…
Regression Analysis by Example. 5th Edition
ERIC Educational Resources Information Center
Chatterjee, Samprit; Hadi, Ali S.
2012-01-01
Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. "Regression Analysis by Example, Fifth Edition" has been expanded and thoroughly…
Synthesizing Regression Results: A Factored Likelihood Method
ERIC Educational Resources Information Center
Wu, Meng-Jia; Becker, Betsy Jane
2013-01-01
Regression methods are widely used by researchers in many fields, yet methods for synthesizing regression results are scarce. This study proposes using a factored likelihood method, originally developed to handle missing data, to appropriately synthesize regression models involving different predictors. This method uses the correlations reported…
Streamflow forecasting using functional regression
NASA Astrophysics Data System (ADS)
Masselot, Pierre; Dabo-Niang, Sophie; Chebana, Fateh; Ouarda, Taha B. M. J.
2016-07-01
Streamflow, as a natural phenomenon, is continuous in time, and so are the meteorological variables that influence its variability. In practice, it can be of interest to forecast the whole flow curve instead of individual points (daily or hourly). To this end, this paper introduces functional linear models and adapts them to hydrological forecasting. More precisely, functional linear models are regression models based on curves instead of single values. They allow one to consider the whole process instead of a limited number of time points or features. We apply these models to analyse the flow volume and the whole streamflow curve during a given period using precipitation curves. The functional model is shown to lead to encouraging results. Its potential to detect special features that would otherwise have been hard to see is pointed out. The functional model is also compared with the artificial neural network approach, and the advantages and disadvantages of both models are discussed. Finally, future research directions involving the functional model in hydrology are presented.
Survival analysis and Cox regression.
Benítez-Parejo, N; Rodríguez del Águila, M M; Pérez-Vicente, S
2011-01-01
The data provided by clinical trials are often expressed in terms of survival. The analysis of survival comprises a series of statistical analytical techniques in which the measurements analysed represent the time elapsed between a given exposure and the outcome of a certain event. Despite the name of these techniques, the outcome in question does not necessarily have to be either survival or death, and may be healing versus no healing, relief versus pain, complication versus no complication, relapse versus no relapse, etc. The present article describes the analysis of survival from both a descriptive perspective, based on the Kaplan-Meier estimation method, and in terms of bivariate comparisons using the log-rank statistic. Likewise, a description is provided of the Cox regression models for the study of risk factors or covariables associated to the probability of survival. These models are defined in both simple and multiple forms, and a description is provided of how they are calculated and how the postulates for application are checked - accompanied by illustrating examples with the shareware application R.
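The descriptive Kaplan-Meier step discussed above can be sketched as a product-limit calculation (Cox regression itself requires iterative partial-likelihood maximization and is omitted here); the data and variable names are illustrative:

```python
# Minimal Kaplan-Meier product-limit estimator, pure Python.
# At each event time t: S(t) = S(t-) * (1 - deaths_at_t / n_at_risk).

def kaplan_meier(times, events):
    """times: follow-up times; events: 1 = event observed, 0 = censored.
    Returns a list of (time, survival probability) at each event time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        at_t = sum(1 for tt, _ in data if tt == t)
        if deaths:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= at_t  # events and censorings both leave the risk set
        i += at_t
    return curve
```

Note that censored observations reduce the risk set without forcing a step in the curve, which is exactly why survival methods are preferred over naive proportions when follow-up is incomplete.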
Estimating equivalence with quantile regression
Cade, B.S.
2011-01-01
Equivalence testing and corresponding confidence interval estimates are used to provide more enlightened statistical statements about parameter estimates by relating them to intervals of effect sizes deemed to be of scientific or practical importance rather than just to an effect size of zero. Equivalence tests and confidence interval estimates are based on a null hypothesis that a parameter estimate is either outside (inequivalence hypothesis) or inside (equivalence hypothesis) an equivalence region, depending on the question of interest and assignment of risk. The former approach, often referred to as bioequivalence testing, is often used in regulatory settings because it reverses the burden of proof compared to a standard test of significance, following a precautionary principle for environmental protection. Unfortunately, many applications of equivalence testing focus on establishing average equivalence by estimating differences in means of distributions that do not have homogeneous variances. I discuss how to compare equivalence across quantiles of distributions using confidence intervals on quantile regression estimates that detect differences in heterogeneous distributions missed by focusing on means. I used one-tailed confidence intervals based on inequivalence hypotheses in a two-group treatment-control design for estimating bioequivalence of arsenic concentrations in soils at an old ammunition testing site and bioequivalence of vegetation biomass at a reclaimed mining site. Two-tailed confidence intervals based both on inequivalence and equivalence hypotheses were used to examine quantile equivalence for negligible trends over time for a continuous exponential model of amphibian abundance. © 2011 by the Ecological Society of America.
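The quantile-equivalence idea can be sketched with a percentile bootstrap on a quantile difference in a two-group design. This is a simplified stand-in for the quantile-regression confidence intervals the paper uses, and all names, data, and thresholds are illustrative:

```python
import random

def quantile(xs, q):
    """Sample quantile with linear interpolation between order statistics."""
    s = sorted(xs)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])

def quantile_diff_ci(treat, control, q, n_boot=2000, level=0.90, seed=7):
    """Percentile-bootstrap CI for the treatment-control difference at
    quantile q (a simplified surrogate for quantile-regression intervals)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        t = [rng.choice(treat) for _ in treat]
        c = [rng.choice(control) for _ in control]
        diffs.append(quantile(t, q) - quantile(c, q))
    diffs.sort()
    a = (1.0 - level) / 2.0
    return diffs[int(a * n_boot)], diffs[int((1.0 - a) * n_boot) - 1]

def equivalent(ci, region):
    """Equivalence is concluded only if the whole CI lies inside +/- region."""
    return -region <= ci[0] and ci[1] <= region
```

Repeating this across several quantiles (e.g., q = 0.1, 0.5, 0.9) is what lets heterogeneous-variance differences show up that a means-only comparison would miss.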
NASA Space Exploration Logistics Workshop Proceedings
NASA Technical Reports Server (NTRS)
de Weck, Olivier; Evans, William A.; Parrish, Joe; James, Sarah
2006-01-01
As NASA has embarked on a new Vision for Space Exploration, there is new energy and focus around the area of manned space exploration. These activities encompass the design of new vehicles such as the Crew Exploration Vehicle (CEV) and Crew Launch Vehicle (CLV) and the identification of commercial opportunities for space transportation services, as well as continued operations of the Space Shuttle and the International Space Station. Reaching the Moon and eventually Mars with a mix of both robotic and human explorers for short-term missions is a formidable challenge in itself. How to achieve this in a safe, efficient and long-term sustainable way is yet another question. The challenge is not only one of vehicle design, launch, and operations but also one of space logistics. Oftentimes, logistical issues are not given enough consideration upfront relative to the large share of operating budgets they consume. In this context, a group of 54 experts in space logistics met for a two-day workshop to discuss the following key questions: 1. What is the current state of the art in space logistics, in terms of architectures, concepts, and technologies as well as enabling processes? 2. What are the main challenges for space logistics for future human exploration of the Moon and Mars, at the intersection of engineering and space operations? 3. What lessons can be drawn from past successes and failures in human space flight logistics? 4. What lessons and connections do we see from terrestrial analogies as well as activities in other areas, such as U.S. military logistics? 5. What key advances are required to enable long-term success in the context of a future interplanetary supply chain? These proceedings summarize the outcomes of the workshop, reference particular presentations, panels and breakout sessions, and record specific observations that should help guide future efforts.
Metal Additive Manufacturing: A Review
NASA Astrophysics Data System (ADS)
Frazier, William E.
2014-06-01
This paper reviews the state of the art of an important, rapidly emerging manufacturing technology that is alternatively called additive manufacturing (AM), direct digital manufacturing, free-form fabrication, or 3D printing. A broad contextual overview of metallic AM is provided. AM has the potential to revolutionize the global parts manufacturing and logistics landscape. It enables distributed manufacturing and the production of parts on demand while offering the potential to reduce cost, energy consumption, and carbon footprint. This paper explores the materials science, processes, and business considerations associated with achieving these performance gains. It is concluded that a paradigm shift is required in order to fully exploit the potential of AM.