Wrong Signs in Regression Coefficients
NASA Technical Reports Server (NTRS)
McGee, Holly
1999-01-01
When using parametric cost estimation, it is important to note the possibility of the regression coefficients having the wrong sign. A wrong sign is defined as a sign on the regression coefficient opposite to the researcher's intuition and experience. Some possible causes for the wrong sign discussed in this paper are a small range of x's, leverage points, missing variables, multicollinearity, and computational error. Additionally, techniques for determining the cause of the wrong sign are given.
Standards for Standardized Logistic Regression Coefficients
ERIC Educational Resources Information Center
Menard, Scott
2011-01-01
Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…
Investigating bias in squared regression structure coefficients
Nimon, Kim F.; Zientek, Linda R.; Thompson, Bruce
2015-01-01
The importance of structure coefficients and analogs of regression weights for analysis within the general linear model (GLM) has been well-documented. The purpose of this study was to investigate bias in squared structure coefficients in the context of multiple regression and to determine if a formula that had been shown to correct for bias in squared Pearson correlation coefficients and coefficients of determination could be used to correct for bias in squared regression structure coefficients. Using data from a Monte Carlo simulation, this study found that squared regression structure coefficients corrected with Pratt's formula produced less biased estimates and might be more accurate and stable estimates of population squared regression structure coefficients than estimates with no such corrections. While our findings are in line with prior literature that identified multicollinearity as a predictor of bias in squared regression structure coefficients but not coefficients of determination, the findings from this study are unique in that the level of predictive power, number of predictors, and sample size were also observed to contribute bias in squared regression structure coefficients. PMID:26217273
Investigating bias in squared regression structure coefficients.
Nimon, Kim F; Zientek, Linda R; Thompson, Bruce
2015-01-01
The importance of structure coefficients and analogs of regression weights for analysis within the general linear model (GLM) has been well-documented. The purpose of this study was to investigate bias in squared structure coefficients in the context of multiple regression and to determine if a formula that had been shown to correct for bias in squared Pearson correlation coefficients and coefficients of determination could be used to correct for bias in squared regression structure coefficients. Using data from a Monte Carlo simulation, this study found that squared regression structure coefficients corrected with Pratt's formula produced less biased estimates and might be more accurate and stable estimates of population squared regression structure coefficients than estimates with no such corrections. While our findings are in line with prior literature that identified multicollinearity as a predictor of bias in squared regression structure coefficients but not coefficients of determination, the findings from this study are unique in that the level of predictive power, number of predictors, and sample size were also observed to contribute bias in squared regression structure coefficients.
Parametric modeling of quantile regression coefficient functions.
Frumento, Paolo; Bottai, Matteo
2016-03-01
Estimating the conditional quantiles of outcome variables of interest is frequent in many research areas, and quantile regression is foremost among the utilized methods. The coefficients of a quantile regression model depend on the order of the quantile being estimated. For example, the coefficients for the median are generally different from those of the 10th centile. In this article, we describe an approach to modeling the regression coefficients as parametric functions of the order of the quantile. This approach may have advantages in terms of parsimony, efficiency, and may expand the potential of statistical modeling. Goodness-of-fit measures and testing procedures are discussed, and the results of a simulation study are presented. We apply the method to analyze the data that motivated this work. The described method is implemented in the qrcm R package.
On regression adjustment for the propensity score.
Vansteelandt, S; Daniel, R M
2014-10-15
Propensity scores are widely adopted in observational research because they enable adjustment for high-dimensional confounders without requiring models for their association with the outcome of interest. The results of statistical analyses based on stratification, matching or inverse weighting by the propensity score are therefore less susceptible to model extrapolation than those based solely on outcome regression models. This is attractive because extrapolation in outcome regression models may be alarming, yet difficult to diagnose, when the exposed and unexposed individuals have very different covariate distributions. Standard regression adjustment for the propensity score forms an alternative to the aforementioned propensity score methods, but the benefits of this are less clear because it still involves modelling the outcome in addition to the propensity score. In this article, we develop novel insights into the properties of this adjustment method. We demonstrate that standard tests of the null hypothesis of no exposure effect (based on robust variance estimators), as well as particular standardised effects obtained from such adjusted regression models, are robust against misspecification of the outcome model when a propensity score model is correctly specified; they are thus not vulnerable to the aforementioned problem of extrapolation. We moreover propose efficient estimators for these standardised effects, which retain a useful causal interpretation even when the propensity score model is misspecified, provided the outcome regression model is correctly specified.
Biases and Standard Errors of Standardized Regression Coefficients
ERIC Educational Resources Information Center
Yuan, Ke-Hai; Chan, Wai
2011-01-01
The paper obtains consistent standard errors (SE) and biases of order O(1/n) for the sample standardized regression coefficients with both random and given predictors. Analytical results indicate that the formulas for SEs given in popular text books are consistent only when the population value of the regression coefficient is zero. The sample…
Penalized spline estimation for functional coefficient regression models.
Cao, Yanrong; Lin, Haiqun; Wu, Tracy Z; Yu, Yan
2010-04-01
The functional coefficient regression models assume that the regression coefficients vary with some "threshold" variable, providing appreciable flexibility in capturing the underlying dynamics in data and avoiding the so-called "curse of dimensionality" in multivariate nonparametric estimation. We first investigate the estimation, inference, and forecasting for the functional coefficient regression models with dependent observations via penalized splines. The P-spline approach, as a direct ridge regression shrinkage type global smoothing method, is computationally efficient and stable. With established fixed-knot asymptotics, inference is readily available. Exact inference can be obtained for fixed smoothing parameter λ, which is most appealing for finite samples. Our penalized spline approach gives an explicit model expression, which also enables multi-step-ahead forecasting via simulations. Furthermore, we examine different methods of choosing the important smoothing parameter λ: modified multi-fold cross-validation (MCV), generalized cross-validation (GCV), and an extension of empirical bias bandwidth selection (EBBS) to P-splines. In addition, we implement smoothing parameter selection using mixed model framework through restricted maximum likelihood (REML) for P-spline functional coefficient regression models with independent observations. The P-spline approach also easily allows different smoothness for different functional coefficients, which is enabled by assigning different penalty λ accordingly. We demonstrate the proposed approach by both simulation examples and a real data application.
Some Improvements in Confidence Intervals for Standardized Regression Coefficients.
Dudgeon, Paul
2017-03-13
Yuan and Chan (Psychometrika 76:670-690, 2011. doi: 10.1007/S11336-011-9224-6 ) derived consistent confidence intervals for standardized regression coefficients under fixed and random score assumptions. Jones and Waller (Psychometrika 80:365-378, 2015. doi: 10.1007/S11336-013-9380-Y ) extended these developments to circumstances where data are non-normal by examining confidence intervals based on Browne's (Br J Math Stat Psychol 37:62-83, 1984. doi: 10.1111/j.2044-8317.1984.tb00789.x ) asymptotic distribution-free (ADF) theory. Seven different heteroscedastic-consistent (HC) estimators were investigated in the current study as potentially better solutions for constructing confidence intervals on standardized regression coefficients under non-normality. Normal theory, ADF, and HC estimators were evaluated in a Monte Carlo simulation. Findings confirmed the superiority of the HC3 (MacKinnon and White, J Econ 35:305-325, 1985. doi: 10.1016/0304-4076(85)90158-7 ) and HC5 (Cribari-Neto and Da Silva, Adv Stat Anal 95:129-146, 2011. doi: 10.1007/s10182-010-0141-2 ) interval estimators over Jones and Waller's ADF estimator under all conditions investigated, as well as over the normal theory method. The HC5 estimator was more robust in a restricted set of conditions over the HC3 estimator. Some possible extensions of HC estimators to other effect size measures are considered for future developments.
Coercively Adjusted Auto Regression Model for Forecasting in Epilepsy EEG
Kim, Sun-Hee; Faloutsos, Christos; Yang, Hyung-Jeong
2013-01-01
Recently, data with complex characteristics such as epilepsy electroencephalography (EEG) time series has emerged. Epilepsy EEG data has special characteristics including nonlinearity, nonnormality, and nonperiodicity. Therefore, it is important to find a suitable forecasting method that covers these special characteristics. In this paper, we propose a coercively adjusted autoregression (CA-AR) method that forecasts future values from a multivariable epilepsy EEG time series. We use the technique of random coefficients, which forcefully adjusts the coefficients with −1 and 1. The fractal dimension is used to determine the order of the CA-AR model. We applied the CA-AR method reflecting special characteristics of data to forecast the future value of epilepsy EEG data. Experimental results show that when compared to previous methods, the proposed method can forecast faster and accurately. PMID:23710252
Parametric expressions for the adjusted Hargreaves coefficient in Eastern Spain
NASA Astrophysics Data System (ADS)
Martí, Pau; Zarzo, Manuel; Vanderlinden, Karl; Girona, Joan
2015-10-01
The application of simple empirical equations for estimating reference evapotranspiration (ETo) is the only alternative in many cases to robust approaches with high input requirements, especially at the local scale. In particular, temperature-based approaches present a high potential applicability, among others, because temperature might explain a high amount of ETo variability, and also because it can be measured easily and is one of the most available climatic inputs. One of the most well-known temperature-based approaches, the Hargreaves (HG) equation, requires a preliminary local calibration that is usually performed through an adjustment of the HG coefficient (AHC). Nevertheless, these calibrations are site-specific, and cannot be extrapolated to other locations. So, they become useless in many situations, because they are derived from already available benchmarks based on more robust methods, which will be applied in practice. Therefore, the development of accurate equations for estimating AHC at local scale becomes a relevant task. This paper analyses the performance of calibrated and non-calibrated HG equations at 30 stations in Eastern Spain at daily, weekly, fortnightly and monthly scales. Moreover, multiple linear regression was applied for estimating AHC based on different inputs, and the resulting equations yielded higher performance accuracy than the non-calibrated HG estimates. The approach relying on the ratio mean temperature to temperature range did not provide suitable AHC estimations, and was highly improved by splitting it into two independent predictors. Temperature-based equations were improved by incorporating geographical inputs. Finally, the model relying on temperature and geographic inputs was further improved by incorporating wind speed, even just with simple qualitative information about wind category (e.g. poorly vs. highly windy). The accuracy of the calibrated and non-calibrated HG estimates increased for longer time steps (daily
An evaluation of bias in propensity score-adjusted non-linear regression models.
Wan, Fei; Mitra, Nandita
2016-04-19
Propensity score methods are commonly used to adjust for observed confounding when estimating the conditional treatment effect in observational studies. One popular method, covariate adjustment of the propensity score in a regression model, has been empirically shown to be biased in non-linear models. However, no compelling underlying theoretical reason has been presented. We propose a new framework to investigate bias and consistency of propensity score-adjusted treatment effects in non-linear models that uses a simple geometric approach to forge a link between the consistency of the propensity score estimator and the collapsibility of non-linear models. Under this framework, we demonstrate that adjustment of the propensity score in an outcome model results in the decomposition of observed covariates into the propensity score and a remainder term. Omission of this remainder term from a non-collapsible regression model leads to biased estimates of the conditional odds ratio and conditional hazard ratio, but not for the conditional rate ratio. We further show, via simulation studies, that the bias in these propensity score-adjusted estimators increases with larger treatment effect size, larger covariate effects, and increasing dissimilarity between the coefficients of the covariates in the treatment model versus the outcome model.
Interpreting Bivariate Regression Coefficients: Going beyond the Average
ERIC Educational Resources Information Center
Halcoussis, Dennis; Phillips, G. Michael
2010-01-01
Statistics, econometrics, investment analysis, and data analysis classes often review the calculation of several types of averages, including the arithmetic mean, geometric mean, harmonic mean, and various weighted averages. This note shows how each of these can be computed using a basic regression framework. By recognizing when a regression model…
Procedures for adjusting regional regression models of urban-runoff quality using local data
Hoos, A.B.; Sisolak, J.K.
1993-01-01
Statistical operations termed model-adjustment procedures (MAP?s) can be used to incorporate local data into existing regression models to improve the prediction of urban-runoff quality. Each MAP is a form of regression analysis in which the local data base is used as a calibration data set. Regression coefficients are determined from the local data base, and the resulting `adjusted? regression models can then be used to predict storm-runoff quality at unmonitored sites. The response variable in the regression analyses is the observed load or mean concentration of a constituent in storm runoff for a single storm. The set of explanatory variables used in the regression analyses is different for each MAP, but always includes the predicted value of load or mean concentration from a regional regression model. The four MAP?s examined in this study were: single-factor regression against the regional model prediction, P, (termed MAP-lF-P), regression against P,, (termed MAP-R-P), regression against P, and additional local variables (termed MAP-R-P+nV), and a weighted combination of P, and a local-regression prediction (termed MAP-W). The procedures were tested by means of split-sample analysis, using data from three cities included in the Nationwide Urban Runoff Program: Denver, Colorado; Bellevue, Washington; and Knoxville, Tennessee. The MAP that provided the greatest predictive accuracy for the verification data set differed among the three test data bases and among model types (MAP-W for Denver and Knoxville, MAP-lF-P and MAP-R-P for Bellevue load models, and MAP-R-P+nV for Bellevue concentration models) and, in many cases, was not clearly indicated by the values of standard error of estimate for the calibration data set. A scheme to guide MAP selection, based on exploratory data analysis of the calibration data set, is presented and tested. The MAP?s were tested for sensitivity to the size of a calibration data set. As expected, predictive accuracy of all MAP?s for
Buchner, Florian; Wasem, Jürgen; Schillo, Sonja
2017-01-01
Risk equalization formulas have been refined since their introduction about two decades ago. Because of the complexity and the abundance of possible interactions between the variables used, hardly any interactions are considered. A regression tree is used to systematically search for interactions, a methodologically new approach in risk equalization. Analyses are based on a data set of nearly 2.9 million individuals from a major German social health insurer. A two-step approach is applied: In the first step a regression tree is built on the basis of the learning data set. Terminal nodes characterized by more than one morbidity-group-split represent interaction effects of different morbidity groups. In the second step the 'traditional' weighted least squares regression equation is expanded by adding interaction terms for all interactions detected by the tree, and regression coefficients are recalculated. The resulting risk adjustment formula shows an improvement in the adjusted R(2) from 25.43% to 25.81% on the evaluation data set. Predictive ratios are calculated for subgroups affected by the interactions. The R(2) improvement detected is only marginal. According to the sample level performance measures used, not involving a considerable number of morbidity interactions forms no relevant loss in accuracy. Copyright © 2015 John Wiley & Sons, Ltd.
Parametric modeling of quantile regression coefficient functions with censored and truncated data.
Frumento, Paolo; Bottai, Matteo
2017-02-09
Quantile regression coefficient functions describe how the coefficients of a quantile regression model depend on the order of the quantile. A method for parametric modeling of quantile regression coefficient functions was discussed in a recent article. The aim of the present work is to extend the existing framework to censored and truncated data. We propose an estimator and derive its asymptotic properties. We discuss goodness-of-fit measures, present simulation results, and analyze the data that motivated this article. The described estimator has been implemented in the R package qrcm.
Assessing Longitudinal Change: Adjustment for Regression to the Mean Effects
ERIC Educational Resources Information Center
Rocconi, Louis M.; Ethington, Corinna A.
2009-01-01
Pascarella (J Coll Stud Dev 47:508-520, 2006) has called for an increase in use of longitudinal data with pretest-posttest design when studying effects on college students. However, such designs that use multiple measures to document change are vulnerable to an important threat to internal validity, regression to the mean. Herein, we discuss a…
Return period adjustment for runoff coefficients based on analysis in undeveloped Texas watersheds
Dhakal, Nirajan; Fang, Xing; Asquith, William H.; Cleveland, Theodore G.; Thompson, David B.
2013-01-01
The rational method for peak discharge (Qp) estimation was introduced in the 1880s. The runoff coefficient (C) is a key parameter for the rational method that has an implicit meaning of rate proportionality, and the C has been declared a function of the annual return period by various researchers. Rate-based runoff coefficients as a function of the return period, C(T), were determined for 36 undeveloped watersheds in Texas using peak discharge frequency from previously published regional regression equations and rainfall intensity frequency for return periods T of 2, 5, 10, 25, 50, and 100 years. The C(T) values and return period adjustments C(T)/C(T=10 year) determined in this study are most applicable to undeveloped watersheds. The return period adjustments determined for the Texas watersheds in this study and those extracted from prior studies of non-Texas data exceed values from well-known literature such as design manuals and textbooks. Most importantly, the return period adjustments exceed values currently recognized in Texas Department of Transportation design guidance when T>10 years.
Algamal, Zakariya Yahya; Lee, Muhammad Hisyam
2015-12-01
Cancer classification and gene selection in high-dimensional data have been popular research topics in genetics and molecular biology. Recently, adaptive regularized logistic regression using the elastic net regularization, which is called the adaptive elastic net, has been successfully applied in high-dimensional cancer classification to tackle both estimating the gene coefficients and performing gene selection simultaneously. The adaptive elastic net originally used elastic net estimates as the initial weight, however, using this weight may not be preferable for certain reasons: First, the elastic net estimator is biased in selecting genes. Second, it does not perform well when the pairwise correlations between variables are not high. Adjusted adaptive regularized logistic regression (AAElastic) is proposed to address these issues and encourage grouping effects simultaneously. The real data results indicate that AAElastic is significantly consistent in selecting genes compared to the other three competitor regularization methods. Additionally, the classification performance of AAElastic is comparable to the adaptive elastic net and better than other regularization methods. Thus, we can conclude that AAElastic is a reliable adaptive regularized logistic regression method in the field of high-dimensional cancer classification.
A Tutorial on Calculating and Interpreting Regression Coefficients in Health Behavior Research
ERIC Educational Resources Information Center
Stellefson, Michael L.; Hanik, Bruce W.; Chaney, Beth H.; Chaney, J. Don
2008-01-01
Regression analyses are frequently employed by health educators who conduct empirical research examining a variety of health behaviors. Within regression, there are a variety of coefficients produced, which are not always easily understood and/or articulated by health education researchers. It is important to not only understand what these…
ERIC Educational Resources Information Center
Yan, Jun; Aseltine, Robert H., Jr.; Harel, Ofer
2013-01-01
Comparing regression coefficients between models when one model is nested within another is of great practical interest when two explanations of a given phenomenon are specified as linear models. The statistical problem is whether the coefficients associated with a given set of covariates change significantly when other covariates are added into…
NASA Technical Reports Server (NTRS)
Kalton, G.
1983-01-01
A number of surveys were conducted to study the relationship between the level of aircraft or traffic noise exposure experienced by people living in a particular area and their annoyance with it. These surveys generally employ a clustered sample design which affects the precision of the survey estimates. Regression analysis of annoyance on noise measures and other variables is often an important component of the survey analysis. Formulae are presented for estimating the standard errors of regression coefficients and ratio of regression coefficients that are applicable with a two- or three-stage clustered sample design. Using a simple cost function, they also determine the optimum allocation of the sample across the stages of the sample design for the estimation of a regression coefficient.
Roesch, Scott C; Vaughn, Allison A; Aldridge, Arianna A; Villodas, Feion
2009-10-01
Many researchers underscore the importance of coping in the daily lives of adolescents, yet very few studies measure this and related constructs at this level. Using a daily diary approach to stress and coping, the current study evaluated a series of mediational coping models in a sample of low-income minority adolescents (N = 89). Specifically, coping was hypothesized to mediate the relationship between attributional style (and dimensions) and daily affect. Using random coefficient regression modeling, the relationship between (a) the locus of causality dimension and positive affect was completely mediated by the use of acceptance and humor as coping strategies; (b) the stability dimension and positive affect was completely mediated by the use of both problem-solving and positive thinking; and (c) the stability dimension and negative affect was partially mediated by the use of religious coping. In addition, the locus of causality and stability (but not globality) dimensions were also directly related to affect. However, the relationship between pessimistic explanatory style and affect was not mediated by coping. Consistent with previous research, these findings suggest that attributions are both directly and indirectly related to indices of affect or adjustment. Thus, attributions may not only influence the type of coping strategy employed, but may also serve as coping strategies themselves.
Adjustment of regional regression equations for urban storm-runoff quality using at-site data
Barks, C.S.
1996-01-01
Regional regression equations have been developed to estimate urban storm-runoff loads and mean concentrations using a national data base. Four statistical methods using at-site data to adjust the regional equation predictions were developed to provide better local estimates. The four adjustment procedures are a single-factor adjustment, a regression of the observed data against the predicted values, a regression of the observed values against the predicted values and additional local independent variables, and a weighted combination of a local regression with the regional prediction. Data collected at five representative storm-runoff sites during 22 storms in Little Rock, Arkansas, were used to verify, and, when appropriate, adjust the regional regression equation predictions. Comparison of observed values of stormrunoff loads and mean concentrations to the predicted values from the regional regression equations for nine constituents (chemical oxygen demand, suspended solids, total nitrogen as N, total ammonia plus organic nitrogen as N, total phosphorus as P, dissolved phosphorus as P, total recoverable copper, total recoverable lead, and total recoverable zinc) showed large prediction errors ranging from 63 percent to more than several thousand percent. Prediction errors for 6 of the 18 regional regression equations were less than 100 percent and could be considered reasonable for water-quality prediction equations. The regression adjustment procedure was used to adjust five of the regional equation predictions to improve the predictive accuracy. For seven of the regional equations the observed and the predicted values are not significantly correlated. Thus neither the unadjusted regional equations nor any of the adjustments were appropriate. The mean of the observed values was used as a simple estimator when the regional equation predictions and adjusted predictions were not appropriate.
Using Raw VAR Regression Coefficients to Build Networks can be Misleading.
Bulteel, Kirsten; Tuerlinckx, Francis; Brose, Annette; Ceulemans, Eva
2016-01-01
Many questions in the behavioral sciences focus on the causal interplay of a number of variables across time. To reveal the dynamic relations between the variables, their (auto- or cross-) regressive effects across time may be inspected by fitting a lag-one vector autoregressive, or VAR(1), model and visualizing the resulting regression coefficients as the edges of a weighted directed network. Usually, the raw VAR(1) regression coefficients are drawn, but we argue that this may yield misleading network figures and characteristics because of two problems. First, the raw regression coefficients are sensitive to scale and variance differences among the variables and therefore may lack comparability, which is needed if one wants to calculate, for example, centrality measures. Second, they only represent the unique direct effects of the variables, which may give a distorted picture when variables correlate strongly. To deal with these problems, we propose to use other VAR(1)-based measures as edges. Specifically, to solve the comparability issue, the standardized VAR(1) regression coefficients can be displayed. Furthermore, relative importance metrics can be computed to include direct as well as shared and indirect effects into the network.
Hwang, J T Gene; Nettleton, Dan
2002-01-01
Estimates of the locations and effects of quantitative trait loci (QTL) can be obtained by regressing phenotype on marker genotype. Under certain basic conditions, the signs of regression coefficients flanking QTL must be the same. There is no guarantee, however, that the signs of the regression coefficient estimates will be the same. We use sign inconsistency to describe the situation in which there is disagreement between the signs of the estimated regression coefficients flanking QTL. The presence of sign inconsistency can undermine the effectiveness of QTL mapping strategies that presume intervals whose markers have regression coefficient estimates of differing sign to be devoid of QTL. This article investigates the likelihood of sign inconsistency under various conditions. We derive an analytic expression for the approximate probability of sign inconsistency in the single-QTL case. We also examine sign inconsistency probabilities when multiple QTL are present through simulation. We have discovered that the probability of sign inconsistency can be unacceptably high, even when the conditions for QTL detection are otherwise quite favorable. PMID:11973322
Exact Analysis of Squared Cross-Validity Coefficient in Predictive Regression Models.
Shieh, Gwowen
2009-01-01
In regression analysis, the notion of population validity is of theoretical interest for describing the usefulness of the underlying regression model, whereas the presumably more important concept of population cross-validity represents the predictive effectiveness for the regression equation in future research. It appears that the inference procedures of the squared multiple correlation coefficient have been extensively developed. In contrast, a full range of statistical methods for the analysis of the squared cross-validity coefficient is considerably far from complete. This article considers a distinct expression for the definition of the squared cross-validity coefficient as the direct connection and monotone transformation to the squared multiple correlation coefficient. Therefore, all the currently available exact methods for interval estimation, power calculation, and sample size determination of the squared multiple correlation coefficient are naturally modified and extended to the analysis of the squared cross-validity coefficient. The adequacies of the existing approximate procedures and the suggested exact method are evaluated through a Monte Carlo study. Furthermore, practical applications in areas of psychology and management are presented to illustrate the essential features of the proposed methodologies. The first empirical example uses 6 control variables related to driver characteristics and traffic congestion and their relation to stress in bus drivers, and the second example relates skills, cognitive performance, and personality to team performance measures. The results in this article can facilitate the recommended practice of cross-validation in psychological and other areas of social science research.
Li, Min; Zhou, Tong; Song, Yanan
2016-07-01
A grain size characterization method based on energy attenuation coefficient spectrum and support vector regression (SVR) is proposed. First, the spectra of the first and second back-wall echoes are cut into several frequency bands to calculate the energy attenuation coefficient spectrum. Second, the frequency band that is sensitive to grain size variation is determined. Finally, a statistical model between the energy attenuation coefficient in the sensitive frequency band and average grain size is established through SVR. Experimental verification is conducted on austenitic stainless steel. The average relative error of the predicted grain size is 5.65%, which is better than that of conventional methods.
Covariate-adjusted confidence interval for the intraclass correlation coefficient.
Shoukri, Mohamed M; Donner, Allan; El-Dali, Abdelmoneim
2013-09-01
A crucial step in designing a new study is to estimate the required sample size. For a design involving cluster sampling, the appropriate sample size depends on the so-called design effect, which is a function of the average cluster size and the intracluster correlation coefficient (ICC). It is well-known that under the framework of hierarchical and generalized linear models, a reduction in residual error may be achieved by including risk factors as covariates. In this paper we show that the covariate design, indicating whether the covariates are measured at the cluster level or at the within-cluster subject level affects the estimation of the ICC, and hence the design effect. Therefore, the distinction between these two types of covariates should be made at the design stage. In this paper we use the nested-bootstrap method to assess the accuracy of the estimated ICC for continuous and binary response variables under different covariate structures. The codes of two SAS macros are made available by the authors for interested readers to facilitate the construction of confidence intervals for the ICC. Moreover, using Monte Carlo simulations we evaluate the relative efficiency of the estimators and evaluate the accuracy of the coverage probabilities of a 95% confidence interval on the population ICC. The methodology is illustrated using a published data set of blood pressure measurements taken on family members.
1991-03-01
Adjusted Estimators for Variance 1Redutilol in Computer Simutlation by Riichiardl L. R’ r March, 1991 D~issertation Advisor: Peter A.W. Lewis Approved for...OF NONLINEAR CONTROLS AND REGRESSION-ADJUSTED ESTIMATORS FOR VARIANCE REDUCTION IN COMPUTER SIMULATION 12. Personal Author(s) Richard L. Ressler 13a...necessary and identify by block number) This dissertation develops new techniques for variance reduction in computer simulation. It demonstrates that
Zheng, Qi; Peng, Limin
2016-01-01
Quantile regression provides a flexible platform for evaluating covariate effects on different segments of the conditional distribution of response. As the effects of covariates may change with quantile level, contemporaneously examining a spectrum of quantiles is expected to have a better capacity to identify variables with either partial or full effects on the response distribution, as compared to focusing on a single quantile. Under this motivation, we study a general adaptively weighted LASSO penalization strategy in the quantile regression setting, where a continuum of quantile index is considered and coefficients are allowed to vary with quantile index. We establish the oracle properties of the resulting estimator of coefficient function. Furthermore, we formally investigate a BIC-type uniform tuning parameter selector and show that it can ensure consistent model selection. Our numerical studies confirm the theoretical findings and illustrate an application of the new variable selection procedure. PMID:28008212
Comparison of the Properties of Regression and Categorical Risk-Adjustment Models
Averill, Richard F.; Muldoon, John H.; Hughes, John S.
2016-01-01
Clinical risk-adjustment, the ability to standardize the comparison of individuals with different health needs, is based upon 2 main alternative approaches: regression models and clinical categorical models. In this article, we examine the impact of the differences in the way these models are constructed on end user applications. PMID:26945302
ERIC Educational Resources Information Center
Olejnik, Stephen; Mills, Jamie; Keselman, Harvey
2000-01-01
Evaluated the use of Mallow's C(p) and Wherry's adjusted R squared (R. Wherry, 1931) statistics to select a final model from a pool of model solutions using computer generated data. Neither statistic identified the underlying regression model any better than, and usually less well than, the stepwise selection method, which itself was poor for…
Franklin, Jessica M; Eddings, Wesley; Glynn, Robert J; Schneeweiss, Sebastian
2015-10-01
Selection and measurement of confounders is critical for successful adjustment in nonrandomized studies. Although the principles behind confounder selection are now well established, variable selection for confounder adjustment remains a difficult problem in practice, particularly in secondary analyses of databases. We present a simulation study that compares the high-dimensional propensity score algorithm for variable selection with approaches that utilize direct adjustment for all potential confounders via regularized regression, including ridge regression and lasso regression. Simulations were based on 2 previously published pharmacoepidemiologic cohorts and used the plasmode simulation framework to create realistic simulated data sets with thousands of potential confounders. Performance of methods was evaluated with respect to bias and mean squared error of the estimated effects of a binary treatment. Simulation scenarios varied the true underlying outcome model, treatment effect, prevalence of exposure and outcome, and presence of unmeasured confounding. Across scenarios, high-dimensional propensity score approaches generally performed better than regularized regression approaches. However, including the variables selected by lasso regression in a regular propensity score model also performed well and may provide a promising alternative variable selection method.
Adjusting for Cell Type Composition in DNA Methylation Data Using a Regression-Based Approach.
Jones, Meaghan J; Islam, Sumaiya A; Edgar, Rachel D; Kobor, Michael S
2017-01-01
Analysis of DNA methylation in a population context has the potential to uncover novel gene and environment interactions as well as markers of health and disease. In order to find such associations it is important to control for factors which may mask or alter DNA methylation signatures. Since tissue of origin and coinciding cell type composition are major contributors to DNA methylation patterns, and can easily confound important findings, it is vital to adjust DNA methylation data for such differences across individuals. Here we describe the use of a regression method to adjust for cell type composition in DNA methylation data. We specifically discuss what information is required to adjust for cell type composition and then provide detailed instructions on how to perform cell type adjustment on high dimensional DNA methylation data. This method has been applied mainly to Illumina 450K data, but can also be adapted to pyrosequencing or genome-wide bisulfite sequencing data.
Yoneoka, Daisuke; Henmi, Masayuki
2016-12-16
Recently, the number of regression models has dramatically increased in several academic fields. However, within the context of meta-analysis, synthesis methods for such models have not been developed in a commensurate trend. One of the difficulties hindering the development is the disparity in sets of covariates among literature models. If the sets of covariates differ across models, interpretation of coefficients will differ, thereby making it difficult to synthesize them. Moreover, previous synthesis methods for regression models, such as multivariate meta-analysis, often have problems because covariance matrix of coefficients (i.e. within-study correlations) or individual patient data are not necessarily available. This study, therefore, proposes a brief explanation regarding a method to synthesize linear regression models under different covariate sets by using a generalized least squares method involving bias correction terms. Especially, we also propose an approach to recover (at most) threecorrelations of covariates, which is required for the calculation of the bias term without individual patient data. Copyright © 2016 John Wiley & Sons, Ltd.
Domain selection for the varying coefficient model via local polynomial regression
Kong, Dehan; Bondell, Howard; Wu, Yichao
2014-01-01
In this article, we consider the varying coefficient model, which allows the relationship between the predictors and response to vary across the domain of interest, such as time. In applications, it is possible that certain predictors only affect the response in particular regions and not everywhere. This corresponds to identifying the domain where the varying coefficient is nonzero. Towards this goal, local polynomial smoothing and penalized regression are incorporated into one framework. Asymptotic properties of our penalized estimators are provided. Specifically, the estimators enjoy the oracle properties in the sense that they have the same bias and asymptotic variance as the local polynomial estimators as if the sparsity is known as a priori. The choice of appropriate bandwidth and computational algorithms are discussed. The proposed method is examined via simulations and a real data example. PMID:25506112
Using Quantile and Asymmetric Least Squares Regression for Optimal Risk Adjustment.
Lorenz, Normann
2016-06-13
In this paper, we analyze optimal risk adjustment for direct risk selection (DRS). Integrating insurers' activities for risk selection into a discrete choice model of individuals' health insurance choice shows that DRS has the structure of a contest. For the contest success function (csf) used in most of the contest literature (the Tullock-csf), optimal transfers for a risk adjustment scheme have to be determined by means of a restricted quantile regression, irrespective of whether insurers are primarily engaged in positive DRS (attracting low risks) or negative DRS (repelling high risks). This is at odds with the common practice of determining transfers by means of a least squares regression. However, this common practice can be rationalized for a new csf, but only if positive and negative DRSs are equally important; if they are not, optimal transfers have to be calculated by means of a restricted asymmetric least squares regression. Using data from German and Swiss health insurers, we find considerable differences between the three types of regressions. Optimal transfers therefore critically depend on which csf represents insurers' incentives for DRS and, if it is not the Tullock-csf, whether insurers are primarily engaged in positive or negative DRS. Copyright © 2016 John Wiley & Sons, Ltd.
Abad, Cesar C. C.; Barros, Ronaldo V.; Bertuzzi, Romulo; Gagliardi, João F. L.; Lima-Silva, Adriano E.; Lambert, Mike I.
2016-01-01
Abstract The aim of this study was to verify the power of VO2max, peak treadmill running velocity (PTV), and running economy (RE), unadjusted or allometrically adjusted, in predicting 10 km running performance. Eighteen male endurance runners performed: 1) an incremental test to exhaustion to determine VO2max and PTV; 2) a constant submaximal run at 12 km·h−1 on an outdoor track for RE determination; and 3) a 10 km running race. Unadjusted (VO2max, PTV and RE) and adjusted variables (VO2max0.72, PTV0.72 and RE0.60) were investigated through independent multiple regression models to predict 10 km running race time. There were no significant correlations between 10 km running time and either the adjusted or unadjusted VO2max. Significant correlations (p < 0.01) were found between 10 km running time and adjusted and unadjusted RE and PTV, providing models with effect size > 0.84 and power > 0.88. The allometrically adjusted predictive model was composed of PTV0.72 and RE0.60 and explained 83% of the variance in 10 km running time with a standard error of the estimate (SEE) of 1.5 min. The unadjusted model composed of a single PVT accounted for 72% of the variance in 10 km running time (SEE of 1.9 min). Both regression models provided powerful estimates of 10 km running time; however, the unadjusted PTV may provide an uncomplicated estimation. PMID:28149382
ERIC Educational Resources Information Center
Quinino, Roberto C.; Reis, Edna A.; Bessegato, Lupercio F.
2013-01-01
This article proposes the use of the coefficient of determination as a statistic for hypothesis testing in multiple linear regression based on distributions acquired by beta sampling. (Contains 3 figures.)
An Efficient Elastic Net with Regression Coefficients Method for Variable Selection of Spectrum Data
Liu, Wenya; Li, Qi
2017-01-01
Using the spectrum data for quality prediction always suffers from noise and colinearity, so variable selection method plays an important role to deal with spectrum data. An efficient elastic net with regression coefficients method (Enet-BETA) is proposed to select the significant variables of the spectrum data in this paper. The proposed Enet-BETA method can not only select important variables to make the quality easy to interpret, but also can improve the stability and feasibility of the built model. Enet-BETA method is not prone to overfitting because of the reduction of redundant variables realized by elastic net method. Hypothesis testing is used to further simplify the model and provide a better insight into the nature of process. The experimental results prove that the proposed Enet-BETA method outperforms the other methods in terms of prediction performance and model interpretation. PMID:28152003
Liu, Wenya; Li, Qi
2017-01-01
Using the spectrum data for quality prediction always suffers from noise and colinearity, so variable selection method plays an important role to deal with spectrum data. An efficient elastic net with regression coefficients method (Enet-BETA) is proposed to select the significant variables of the spectrum data in this paper. The proposed Enet-BETA method can not only select important variables to make the quality easy to interpret, but also can improve the stability and feasibility of the built model. Enet-BETA method is not prone to overfitting because of the reduction of redundant variables realized by elastic net method. Hypothesis testing is used to further simplify the model and provide a better insight into the nature of process. The experimental results prove that the proposed Enet-BETA method outperforms the other methods in terms of prediction performance and model interpretation.
Thomas, Laine; Stefanski, Leonard A.; Davidian, Marie
2013-01-01
In clinical studies, covariates are often measured with error due to biological fluctuations, device error and other sources. Summary statistics and regression models that are based on mismeasured data will differ from the corresponding analysis based on the “true” covariate. Statistical analysis can be adjusted for measurement error, however various methods exhibit a tradeo between convenience and performance. Moment Adjusted Imputation (MAI) is method for measurement error in a scalar latent variable that is easy to implement and performs well in a variety of settings. In practice, multiple covariates may be similarly influenced by biological fluctuastions, inducing correlated multivariate measurement error. The extension of MAI to the setting of multivariate latent variables involves unique challenges. Alternative strategies are described, including a computationally feasible option that is shown to perform well. PMID:24072947
Wang, Na; Makhmalbaf, Atefe; Srivastava, Viraj; Hathaway, John E.
2016-11-23
This paper presents a new technique for and the results of normalizing building energy consumption to enable a fair comparison among various types of buildings located near different weather stations across the U.S. The method was developed for the U.S. Building Energy Asset Score, a whole-building energy efficiency rating system focusing on building envelope, mechanical systems, and lighting systems. The Asset Score is calculated based on simulated energy use under standard operating conditions. Existing weather normalization methods such as those based on heating and cooling degrees days are not robust enough to adjust all climatic factors such as humidity and solar radiation. In this work, over 1000 sets of climate coefficients were developed to separately adjust building heating, cooling, and fan energy use at each weather station in the United States. This paper also presents a robust, standardized weather station mapping based on climate similarity rather than choosing the closest weather station. This proposed simulated-based climate adjustment was validated through testing on several hundreds of thousands of modeled buildings. Results indicated the developed climate coefficients can isolate and adjust for the impacts of local climate for asset rating.
1990-05-01
0.499 ( 0.015) 40.674 .379 218 MAXFRONH 165.697 0.781 ( 0.015) 32.011 .615 702 . • • poll I I SIMPLE BIVARIATE REGRESSIC:’S -- MALE NBER VARI E VARIABLE...103.788 0.212 (0.007) 22.431 .299 122 WSHTSTOM 0.022 0.969 (0.032) 22.434 .299 129 WRISHTST -88.486 0.656 (0.009) 14.425 .710 HOB Y.LRIA VARIABLE
Lyles, Robert H; Tang, Li; Superak, Hillary M; King, Caroline C; Celentano, David D; Lo, Yungtai; Sobel, Jack D
2011-07-01
Misclassification of binary outcome variables is a known source of potentially serious bias when estimating adjusted odds ratios. Although researchers have described frequentist and Bayesian methods for dealing with the problem, these methods have seldom fully bridged the gap between statistical research and epidemiologic practice. In particular, there have been few real-world applications of readily grasped and computationally accessible methods that make direct use of internal validation data to adjust for differential outcome misclassification in logistic regression. In this paper, we illustrate likelihood-based methods for this purpose that can be implemented using standard statistical software. Using main study and internal validation data from the HIV Epidemiology Research Study, we demonstrate how misclassification rates can depend on the values of subject-specific covariates, and we illustrate the importance of accounting for this dependence. Simulation studies confirm the effectiveness of the maximum likelihood approach. We emphasize clear exposition of the likelihood function itself, to permit the reader to easily assimilate appended computer code that facilitates sensitivity analyses as well as the efficient handling of main/external and main/internal validation-study data. These methods are readily applicable under random cross-sectional sampling, and we discuss the extent to which the main/internal analysis remains appropriate under outcome-dependent (case-control) sampling.
Zhang, Y J; Xue, F X; Bai, Z P
2017-03-06
The impact of maternal air pollution exposure on offspring health has received much attention. Precise and feasible exposure estimation is particularly important for clarifying exposure-response relationships and reducing heterogeneity among studies. Temporally-adjusted land use regression (LUR) models are exposure assessment methods developed in recent years that have the advantage of having high spatial-temporal resolution. Studies on the health effects of outdoor air pollution exposure during pregnancy have been increasingly carried out using this model. In China, research applying LUR models was done mostly at the model construction stage, and findings from related epidemiological studies were rarely reported. In this paper, the sources of heterogeneity and research progress of meta-analysis research on the associations between air pollution and adverse pregnancy outcomes were analyzed. The methods of the characteristics of temporally-adjusted LUR models were introduced. The current epidemiological studies on adverse pregnancy outcomes that applied this model were systematically summarized. Recommendations for the development and application of LUR models in China are presented. This will encourage the implementation of more valid exposure predictions during pregnancy in large-scale epidemiological studies on the health effects of air pollution in China.
A note on permutation tests of significance for multiple regression coefficients.
Long, Michael A; Berry, Kenneth J; Mielke, Paul W
2007-04-01
In the vast majority of psychological research utilizing multiple regression analysis, asymptotic probability values are reported. This paper demonstrates that asymptotic estimates of standard errors provided by multiple regression are not always accurate. A resampling permutation procedure is used to estimate the standard errors. In some cases the results differ substantially from the traditional least squares regression estimates.
Standardized Regression Coefficients as Indices of Effect Sizes in Meta-Analysis
ERIC Educational Resources Information Center
Kim, Rae Seon
2011-01-01
When conducting a meta-analysis, it is common to find many collected studies that report regression analyses, because multiple regression analysis is widely used in many fields. Meta-analysis uses effect sizes drawn from individual studies as a means of synthesizing a collection of results. However, indices of effect size from regression analyses…
Exact Analysis of Squared Cross-Validity Coefficient in Predictive Regression Models
ERIC Educational Resources Information Center
Shieh, Gwowen
2009-01-01
In regression analysis, the notion of population validity is of theoretical interest for describing the usefulness of the underlying regression model, whereas the presumably more important concept of population cross-validity represents the predictive effectiveness for the regression equation in future research. It appears that the inference…
Doron, J; Martinent, G
2016-06-23
Understanding more about the stress process is important for the performance of athletes during stressful situations. Grounded in Lazarus's (1991, 1999, 2000) CMRT of emotion, this study tracked longitudinally the relationships between cognitive appraisal, coping, emotions, and performance in nine elite fencers across 14 international matches (representing 619 momentary assessments) using a naturalistic, video-assisted methodology. A series of hierarchical linear modeling analyses were conducted to: (a) explore the relationships between cognitive appraisals (challenge and threat), coping strategies (task- and disengagement oriented coping), emotions (positive and negative) and objective performance; (b) ascertain whether the relationship between appraisal and emotion was mediated by coping; and (c) examine whether the relationship between appraisal and objective performance was mediated by emotion and coping. The results of the random coefficient regression models showed: (a) positive relationships between challenge appraisal, task-oriented coping, positive emotions, and performance, as well as between threat appraisal, disengagement-oriented coping and negative emotions; (b) that disengagement-oriented coping partially mediated the relationship between threat and negative emotions, whereas task-oriented coping partially mediated the relationship between challenge and positive emotions; and (c) that disengagement-oriented coping mediated the relationship between threat and performance, whereas task-oriented coping and positive emotions partially mediated the relationship between challenge and performance. As a whole, this study furthered knowledge during sport performance situations of Lazarus's (1999) claim that these psychological constructs exist within a conceptual unit. Specifically, our findings indicated that the ways these constructs are inter-related influence objective performance within competitive settings.
Kleinman, Lawrence C; Norton, Edward C
2009-01-01
Objective To develop and validate a general method (called regression risk analysis) to estimate adjusted risk measures from logistic and other nonlinear multiple regression models. We show how to estimate standard errors for these estimates. These measures could supplant various approximations (e.g., adjusted odds ratio [AOR]) that may diverge, especially when outcomes are common. Study Design Regression risk analysis estimates were compared with internal standards as well as with Mantel–Haenszel estimates, Poisson and log-binomial regressions, and a widely used (but flawed) equation to calculate adjusted risk ratios (ARR) from AOR. Data Collection Data sets produced using Monte Carlo simulations. Principal Findings Regression risk analysis accurately estimates ARR and differences directly from multiple regression models, even when confounders are continuous, distributions are skewed, outcomes are common, and effect size is large. It is statistically sound and intuitive, and has properties favoring it over other methods in many cases. Conclusions Regression risk analysis should be the new standard for presenting findings from multiple regression analysis of dichotomous outcomes for cross-sectional, cohort, and population-based case–control studies, particularly when outcomes are common or effect size is large. PMID:18793213
ERIC Educational Resources Information Center
Jiang, Ying Hong; Smith, Philip L.
This Monte Carlo study explored relationships among standard and unstandardized regression coefficients, structural coefficients, multiple R_ squared, and significance level of predictors for a variety of linear regression scenarios. Ten regression models with three predictors were included, and four conditions were varied that were expected to…
Lim, Jongguk; Kim, Giyoung; Mo, Changyeun; Kim, Moon S; Chao, Kuanglin; Qin, Jianwei; Fu, Xiaping; Baek, Insuck; Cho, Byoung-Kwan
2016-05-01
Illegal use of nitrogen-rich melamine (C3H6N6) to boost perceived protein content of food products such as milk, infant formula, frozen yogurt, pet food, biscuits, and coffee drinks has caused serious food safety problems. Conventional methods to detect melamine in foods, such as Enzyme-linked immunosorbent assay (ELISA), High-performance liquid chromatography (HPLC), and Gas chromatography-mass spectrometry (GC-MS), are sensitive but they are time-consuming, expensive, and labor-intensive. In this research, near-infrared (NIR) hyperspectral imaging technique combined with regression coefficient of partial least squares regression (PLSR) model was used to detect melamine particles in milk powders easily and quickly. NIR hyperspectral reflectance imaging data in the spectral range of 990-1700nm were acquired from melamine-milk powder mixture samples prepared at various concentrations ranging from 0.02% to 1%. PLSR models were developed to correlate the spectral data (independent variables) with melamine concentration (dependent variables) in melamine-milk powder mixture samples. PLSR models applying various pretreatment methods were used to reconstruct the two-dimensional PLS images. PLS images were converted to the binary images to detect the suspected melamine pixels in milk powder. As the melamine concentration was increased, the numbers of suspected melamine pixels of binary images were also increased. These results suggested that NIR hyperspectral imaging technique and the PLSR model can be regarded as an effective tool to detect melamine particles in milk powders.
Li, J.; Gray, B.R.; Bates, D.M.
2008-01-01
Partitioning the variance of a response by design levels is challenging for binomial and other discrete outcomes. Goldstein (2003) proposed four definitions for variance partitioning coefficients (VPC) under a two-level logistic regression model. In this study, we explicitly derived formulae for multi-level logistic regression model and subsequently studied the distributional properties of the calculated VPCs. Using simulations and a vegetation dataset, we demonstrated associations between different VPC definitions, the importance of methods for estimating VPCs (by comparing VPC obtained using Laplace and penalized quasilikehood methods), and bivariate dependence between VPCs calculated at different levels. Such an empirical study lends an immediate support to wider applications of VPC in scientific data analysis.
HE, PENG; ERIKSSON, FRANK; SCHEIKE, THOMAS H.; ZHANG, MEI-JIE
2015-01-01
With competing risks data, one often needs to assess the treatment and covariate effects on the cumulative incidence function. Fine and Gray proposed a proportional hazards regression model for the subdistribution of a competing risk with the assumption that the censoring distribution and the covariates are independent. Covariate-dependent censoring sometimes occurs in medical studies. In this paper, we study the proportional hazards regression model for the subdistribution of a competing risk with proper adjustments for covariate-dependent censoring. We consider a covariate-adjusted weight function by fitting the Cox model for the censoring distribution and using the predictive probability for each individual. Our simulation study shows that the covariate-adjusted weight estimator is basically unbiased when the censoring time depends on the covariates, and the covariate-adjusted weight approach works well for the variance estimator as well. We illustrate our methods with bone marrow transplant data from the Center for International Blood and Marrow Transplant Research (CIBMTR). Here cancer relapse and death in complete remission are two competing risks. PMID:27034534
Barks, C.S.
1995-01-01
Storm-runoff water-quality data were used to verify and, when appropriate, adjust regional regression models previously developed to estimate urban storm- runoff loads and mean concentrations in Little Rock, Arkansas. Data collected at 5 representative sites during 22 storms from June 1992 through January 1994 compose the Little Rock data base. Comparison of observed values (0) of storm-runoff loads and mean concentrations to the predicted values (Pu) from the regional regression models for nine constituents (chemical oxygen demand, suspended solids, total nitrogen, total ammonia plus organic nitrogen as nitrogen, total phosphorus, dissolved phosphorus, total recoverable copper, total recoverable lead, and total recoverable zinc) shows large prediction errors ranging from 63 to several thousand percent. Prediction errors for six of the regional regression models are less than 100 percent, and can be considered reasonable for water-quality models. Differences between 0 and Pu are due to variability in the Little Rock data base and error in the regional models. Where applicable, a model adjustment procedure (termed MAP-R-P) based upon regression with 0 against Pu was applied to improve predictive accuracy. For 11 of the 18 regional water-quality models, 0 and Pu are significantly correlated, that is much of the variation in 0 is explained by the regional models. Five of these 11 regional models consistently overestimate O; therefore, MAP-R-P can be used to provide a better estimate. For the remaining seven regional models, 0 and Pu are not significanfly correlated, thus neither the unadjusted regional models nor the MAP-R-P is appropriate. A simple estimator, such as the mean of the observed values may be used if the regression models are not appropriate. Standard error of estimate of the adjusted models ranges from 48 to 130 percent. Calibration results may be biased due to the limited data set sizes in the Little Rock data base. The relatively large values of
Li, Li; Brumback, Babette A; Weppelmann, Thomas A; Morris, J Glenn; Ali, Afsar
2016-08-15
Motivated by an investigation of the effect of surface water temperature on the presence of Vibrio cholerae in water samples collected from different fixed surface water monitoring sites in Haiti in different months, we investigated methods to adjust for unmeasured confounding due to either of the two crossed factors site and month. In the process, we extended previous methods that adjust for unmeasured confounding due to one nesting factor (such as site, which nests the water samples from different months) to the case of two crossed factors. First, we developed a conditional pseudolikelihood estimator that eliminates fixed effects for the levels of each of the crossed factors from the estimating equation. Using the theory of U-Statistics for independent but non-identically distributed vectors, we show that our estimator is consistent and asymptotically normal, but that its variance depends on the nuisance parameters and thus cannot be easily estimated. Consequently, we apply our estimator in conjunction with a permutation test, and we investigate use of the pigeonhole bootstrap and the jackknife for constructing confidence intervals. We also incorporate our estimator into a diagnostic test for a logistic mixed model with crossed random effects and no unmeasured confounding. For comparison, we investigate between-within models extended to two crossed factors. These generalized linear mixed models include covariate means for each level of each factor in order to adjust for the unmeasured confounding. We conduct simulation studies, and we apply the methods to the Haitian data. Copyright © 2016 John Wiley & Sons, Ltd.
Stratton, Kelly G; Cook, Andrea J; Jackson, Lisa A; Nelson, Jennifer C
2015-03-30
Sequential methods are well established for randomized clinical trials (RCTs), and their use in observational settings has increased with the development of national vaccine and drug safety surveillance systems that monitor large healthcare databases. Observational safety monitoring requires that sequential testing methods be better equipped to incorporate confounder adjustment and accommodate rare adverse events. New methods designed specifically for observational surveillance include a group sequential likelihood ratio test that uses exposure matching and generalized estimating equations approach that involves regression adjustment. However, little is known about the statistical performance of these methods or how they compare to RCT methods in both observational and rare outcome settings. We conducted a simulation study to determine the type I error, power and time-to-surveillance-end of group sequential likelihood ratio test, generalized estimating equations and RCT methods that construct group sequential Lan-DeMets boundaries using data from a matched (group sequential Lan-DeMets-matching) or unmatched regression (group sequential Lan-DeMets-regression) setting. We also compared the methods using data from a multisite vaccine safety study. All methods had acceptable type I error, but regression methods were more powerful, faster at detecting true safety signals and less prone to implementation difficulties with rare events than exposure matching methods. Method performance also depended on the distribution of information and extent of confounding by site. Our results suggest that choice of sequential method, especially the confounder control strategy, is critical in rare event observational settings. These findings provide guidance for choosing methods in this context and, in particular, suggest caution when conducting exposure matching.
Adjustment of minimum seismic shear coefficient considering site effects for long-period structures
NASA Astrophysics Data System (ADS)
Guan, Minsheng; Du, Hongbiao; Cui, Jie; Zeng, Qingli; Jiang, Haibo
2016-06-01
Minimum seismic base shear is a key factor employed in the seismic design of long-period structures, which is specified in some of the major national seismic building codes viz. ASCE7-10, NZS1170.5 and GB50011-2010. In current Chinese seismic design code GB50011-2010, however, effects of soil types on the minimum seismic shear coefficient are not considered, which causes problems for long-period structures sited in hard or rock soil to meet the minimum base shear requirement. This paper aims to modify the current minimum seismic shear coefficient by taking into account site effects. For this purpose, effective peak acceleration (EPA) is used as a representation for the ordinate value of the design response spectrum at the plateau. A large amount of earthquake records, for which EPAs are calculated, are examined through the statistical analysis by considering soil conditions as well as the seismic fortification intensities. The study indicates that soil types have a significant effect on the spectral ordinates at the plateau as well as the minimum seismic shear coefficient. Modified factors related to the current minimum seismic shear coefficient are preliminarily suggested for each site class. It is shown that the modified seismic shear coefficients are more effective to the determination of minimum seismic base shear of long-period structures.
Ho Hoang, Khai-Long; Mombaur, Katja
2015-10-15
Dynamic modeling of the human body is an important tool to investigate the fundamentals of the biomechanics of human movement. To model the human body in terms of a multi-body system, it is necessary to know the anthropometric parameters of the body segments. For young healthy subjects, several data sets exist that are widely used in the research community, e.g. the tables provided by de Leva. None such comprehensive anthropometric parameter sets exist for elderly people. It is, however, well known that body proportions change significantly during aging, e.g. due to degenerative effects in the spine, such that parameters for young people cannot be used for realistically simulating the dynamics of elderly people. In this study, regression equations are derived from the inertial parameters, center of mass positions, and body segment lengths provided by de Leva to be adjustable to the changes in proportion of the body parts of male and female humans due to aging. Additional adjustments are made to the reference points of the parameters for the upper body segments as they are chosen in a more practicable way in the context of creating a multi-body model in a chain structure with the pelvis representing the most proximal segment.
Fujita, A; Takabatake, H; Tagaki, S; Sohda, T; Sekine, K
1996-03-01
To evaluate the effect of chemotherapy on QOL, the survival period was categorized by 3 intervals: one in the hospital for chemotherapy (TOX), on an outpatient basis (TWiST Time without Symptom and Toxicity), and in the hospital for conservative therapy (REL). Coefficients showing the QOL level were expressed as ut, uw and ur. If uw was 1 and ut and ur were plotted at less than 1, ut TOX+uwTWiST+urREL could be a quality-adjusted value relative to TWiST (Q-TWiST). One hundred five patients with stage IV non-small cell lung cancer were included. Sixty-five were given chemotherapy, and the other 40 were not. The observation period was 2 years. Q-TWiST values for age, sex, PS, histology and chemotherapy were calculated. Their quantification was performed employing a regression tree type method. Chemotherapy contributed to Q-TWiST when ut approached 1 i.e., no side effect was supposed). When ut was less than 0.5, PS and sex had an appreciable role.
ERIC Educational Resources Information Center
Coskuntuncel, Orkun
2013-01-01
The purpose of this study is two-fold; the first aim being to show the effect of outliers on the widely used least squares regression estimator in social sciences. The second aim is to compare the classical method of least squares with the robust M-estimator using the "determination of coefficient" (R[superscript 2]). For this purpose,…
Huo, Yuankai; Aboud, Katherine; Kang, Hakmook; Cutting, Laurie E; Landman, Bennett A
2016-10-01
Understanding brain volumetry is essential to understand neurodevelopment and disease. Historically, age-related changes have been studied in detail for specific age ranges (e.g., early childhood, teen, young adults, elderly, etc.) or more sparsely sampled for wider considerations of lifetime aging. Recent advancements in data sharing and robust processing have made available considerable quantities of brain images from normal, healthy volunteers. However, existing analysis approaches have had difficulty addressing (1) complex volumetric developments on the large cohort across the life time (e.g., beyond cubic age trends), (2) accounting for confound effects, and (3) maintaining an analysis framework consistent with the general linear model (GLM) approach pervasive in neuroscience. To address these challenges, we propose to use covariate-adjusted restricted cubic spline (C-RCS) regression within a multi-site cross-sectional framework. This model allows for flexible consideration of non-linear age-associated patterns while accounting for traditional covariates and interaction effects. As a demonstration of this approach on lifetime brain aging, we derive normative volumetric trajectories and 95% confidence intervals from 5111 healthy patients from 64 sites while accounting for confounding sex, intracranial volume and field strength effects. The volumetric results are shown to be consistent with traditional studies that have explored more limited age ranges using single-site analyses. This work represents the first integration of C-RCS with neuroimaging and the derivation of structural covariance networks (SCNs) from a large study of multi-site, cross-sectional data.
Dashtbozorgi, Zahra; Golmohammadi, Hassan
2010-12-01
The main aim of this study was the development of a quantitative structure-property relationship method using an artificial neural network (ANN) for predicting the water-to-wet butyl acetate partition coefficients of organic solutes. As a first step, a genetic algorithm-multiple linear regression model was developed; the descriptors appearing in this model were considered as inputs for the ANN. These descriptors are principal moment of inertia C (I(C)), area-weighted surface charge of hydrogen-bonding donor atoms (HACA-2), Kier and Hall index (order 2) ((2)χ), Balaban index (J), minimum bond order of a C atom (P(C)) and relative negative-charged SA (RNCS). Then a 6-4-1 neural network was generated for the prediction of water-to-wet butyl acetate partition coefficients of 76 organic solutes. By comparing the results obtained from multiple linear regression and ANN models, it can be seen that statistical parameters (Fisher ratio, correlation coefficient and standard error) of the ANN model are better than that regression model, which indicates that nonlinear model can simulate the relationship between the structural descriptors and the partition coefficients of the investigated molecules more accurately.
Jen, Min-Hua; Bottle, Alex; Kirkwood, Graham; Johnston, Ron; Aylin, Paul
2011-09-01
We have previously described a system for monitoring a number of healthcare outcomes using case-mix adjustment models. It is desirable to automate the model fitting process in such a system if monitoring covers a large number of outcome measures or subgroup analyses. Our aim was to compare the performance of three different variable selection strategies: "manual", "automated" backward elimination and re-categorisation, and including all variables at once, irrespective of their apparent importance, with automated re-categorisation. Logistic regression models for predicting in-hospital mortality and emergency readmission within 28 days were fitted to an administrative database for 78 diagnosis groups and 126 procedures from 1996 to 2006 for National Health Services hospital trusts in England. The performance of models was assessed with Receiver Operating Characteristic (ROC) c statistics, (measuring discrimination) and Brier score (assessing the average of the predictive accuracy). Overall, discrimination was similar for diagnoses and procedures and consistently better for mortality than for emergency readmission. Brier scores were generally low overall (showing higher accuracy) and were lower for procedures than diagnoses, with a few exceptions for emergency readmission within 28 days. Among the three variable selection strategies, the automated procedure had similar performance to the manual method in almost all cases except low-risk groups with few outcome events. For the rapid generation of multiple case-mix models we suggest applying automated modelling to reduce the time required, in particular when examining different outcomes of large numbers of procedures and diseases in routinely collected administrative health data.
Jones, Jeff A; Waller, Niels G
2015-06-01
Yuan and Chan (Psychometrika, 76, 670-690, 2011) recently showed how to compute the covariance matrix of standardized regression coefficients from covariances. In this paper, we describe a method for computing this covariance matrix from correlations. Next, we describe an asymptotic distribution-free (ADF; Browne in British Journal of Mathematical and Statistical Psychology, 37, 62-83, 1984) method for computing the covariance matrix of standardized regression coefficients. We show that the ADF method works well with nonnormal data in moderate-to-large samples using both simulated and real-data examples. R code (R Development Core Team, 2012) is available from the authors or through the Psychometrika online repository for supplementary materials.
Kumagai, Etsushi; Homma, Koki; Kuroda, Eiki; Shimono, Hiroyuki
2016-11-01
The rising atmospheric CO2 concentration ([CO2 ]) can increase crop productivity, but there are likely to be intraspecific variations in the response. To meet future world food demand, screening for genotypes with high [CO2 ] responsiveness will be a useful option, but there is no criterion for high [CO2 ] responsiveness. We hypothesized that the Finlay-Wilkinson regression coefficient (RC) (for the relationship between a genotype's yield versus the mean yield of all genotypes in a specific environment) could serve as a pre-screening criterion for identifying genotypes that respond strongly to elevated [CO2 ]. We collected datasets on the yield of 6 rice and 10 soybean genotypes along environmental gradients and compared their responsiveness to elevated [CO2 ] based on the regression coefficients (i.e. the increases of yield per 100 µmol mol(-1) [CO2 ]) identified in previous reports. We found significant positive correlations between the RCs and the responsiveness of yield to elevated [CO2 ] in both rice and soybean. This result raises the possibility that the coefficient of the Finlay-Wilkinson relationship could be used as a pre-screening criterion for [CO2 ] responsiveness.
ERIC Educational Resources Information Center
Tipton, Elizabeth; Pustejovsky, James E.
2015-01-01
Randomized experiments are commonly used to evaluate the effectiveness of educational interventions. The goal of the present investigation is to develop small-sample corrections for multiple contrast hypothesis tests (i.e., F-tests) such as the omnibus test of meta-regression fit or a test for equality of three or more levels of a categorical…
Wang, Qianggang; Zhou, Niancheng; Lou, Xiaoxuan; Chen, Xu
2014-01-01
Unbalanced grid faults will lead to several drawbacks in the output power quality of photovoltaic generation (PV) converters, such as power fluctuation, current amplitude swell, and a large quantity of harmonics. The aim of this paper is to propose a flexible AC current generation method by selecting coefficients to overcome these problems in an optimal way. Three coefficients are brought in to tune the output current reference within the required limits of the power quality (the current harmonic distortion, the AC current peak, the power fluctuation, and the DC voltage fluctuation). Through the optimization algorithm, the coefficients can be determined aiming to generate the minimum integrated amplitudes of the active and reactive power references with the constraints of the inverter current and DC voltage fluctuation. Dead-beat controller is utilized to track the optimal current reference in a short period. The method has been verified in PSCAD/EMTDC software.
Wang, Qianggang; Zhou, Niancheng; Lou, Xiaoxuan; Chen, Xu
2014-01-01
Unbalanced grid faults will lead to several drawbacks in the output power quality of photovoltaic generation (PV) converters, such as power fluctuation, current amplitude swell, and a large quantity of harmonics. The aim of this paper is to propose a flexible AC current generation method by selecting coefficients to overcome these problems in an optimal way. Three coefficients are brought in to tune the output current reference within the required limits of the power quality (the current harmonic distortion, the AC current peak, the power fluctuation, and the DC voltage fluctuation). Through the optimization algorithm, the coefficients can be determined aiming to generate the minimum integrated amplitudes of the active and reactive power references with the constraints of the inverter current and DC voltage fluctuation. Dead-beat controller is utilized to track the optimal current reference in a short period. The method has been verified in PSCAD/EMTDC software. PMID:25243215
Golmohammadi, Hassan
2009-11-30
A quantitative structure-property relationship (QSPR) study was performed to develop models those relate the structure of 141 organic compounds to their octanol-water partition coefficients (log P(o/w)). A genetic algorithm was applied as a variable selection tool. Modeling of log P(o/w) of these compounds as a function of theoretically derived descriptors was established by multiple linear regression (MLR), partial least squares (PLS), and artificial neural network (ANN). The best selected descriptors that appear in the models are: atomic charge weighted partial positively charged surface area (PPSA-3), fractional atomic charge weighted partial positive surface area (FPSA-3), minimum atomic partial charge (Qmin), molecular volume (MV), total dipole moment of molecule (mu), maximum antibonding contribution of a molecule orbital in the molecule (MAC), and maximum free valency of a C atom in the molecule (MFV). The result obtained showed the ability of developed artificial neural network to prediction of partition coefficients of organic compounds. Also, the results revealed the superiority of ANN over the MLR and PLS models.
Wang, Xiaoli; Wu, Shuangsheng; MacIntyre, C. Raina; Zhang, Hongbin; Shi, Weixian; Peng, Xiaomin; Duan, Wei; Yang, Peng; Zhang, Yi; Wang, Quanyi
2015-01-01
Serfling-type periodic regression models have been widely used to identify and analyse epidemic of influenza. In these approaches, the baseline is traditionally determined using cleaned historical non-epidemic data. However, we found that the previous exclusion of epidemic seasons was empirical, since year-year variations in the seasonal pattern of activity had been ignored. Therefore, excluding fixed ‘epidemic’ months did not seem reasonable. We made some adjustments in the rule of epidemic-period removal to avoid potentially subjective definition of the start and end of epidemic periods. We fitted the baseline iteratively. Firstly, we established a Serfling regression model based on the actual observations without any removals. After that, instead of manually excluding a predefined ‘epidemic’ period (the traditional method), we excluded observations which exceeded a calculated boundary. We then established Serfling regression once more using the cleaned data and excluded observations which exceeded a calculated boundary. We repeated this process until the R2 value stopped to increase. In addition, the definitions of the onset of influenza epidemic were heterogeneous, which might make it impossible to accurately evaluate the performance of alternative approaches. We then used this modified model to detect the peak timing of influenza instead of the onset of epidemic and compared this model with traditional Serfling models using observed weekly case counts of influenza-like illness (ILIs), in terms of sensitivity, specificity and lead time. A better performance was observed. In summary, we provide an adjusted Serfling model which may have improved performance over traditional models in early warning at arrival of peak timing of influenza. PMID:25756205
Agogo, George O
2017-01-01
Measurement error in exposure variables is a serious impediment in epidemiological studies that relate exposures to health outcomes. In nutritional studies, interest could be in the association between long-term dietary intake and disease occurrence. Long-term intake is usually assessed with food frequency questionnaire (FFQ), which is prone to recall bias. Measurement error in FFQ-reported intakes leads to bias in parameter estimate that quantifies the association. To adjust for bias in the association, a calibration study is required to obtain unbiased intake measurements using a short-term instrument such as 24-hour recall (24HR). The 24HR intakes are used as response in regression calibration to adjust for bias in the association. For foods not consumed daily, 24HR-reported intakes are usually characterized by excess zeroes, right skewness, and heteroscedasticity posing serious challenge in regression calibration modeling. We proposed a zero-augmented calibration model to adjust for measurement error in reported intake, while handling excess zeroes, skewness, and heteroscedasticity simultaneously without transforming 24HR intake values. We compared the proposed calibration method with the standard method and with methods that ignore measurement error by estimating long-term intake with 24HR and FFQ-reported intakes. The comparison was done in real and simulated datasets. With the 24HR, the mean increase in mercury level per ounce fish intake was about 0.4; with the FFQ intake, the increase was about 1.2. With both calibration methods, the mean increase was about 2.0. Similar trend was observed in the simulation study. In conclusion, the proposed calibration method performs at least as good as the standard method.
Kjelstrom, L.C.
1995-01-01
Previously developed U.S. Geological Survey regional regression models of runoff and 11 chemical constituents were evaluated to assess their suitability for use in urban areas in Boise and Garden City. Data collected in the study area were used to develop adjusted regional models of storm-runoff volumes and mean concentrations and loads of chemical oxygen demand, dissolved and suspended solids, total nitrogen and total ammonia plus organic nitrogen as nitrogen, total and dissolved phosphorus, and total recoverable cadmium, copper, lead, and zinc. Explanatory variables used in these models were drainage area, impervious area, land-use information, and precipitation data. Mean annual runoff volume and loads at the five outfalls were estimated from 904 individual storms during 1976 through 1993. Two methods were used to compute individual storm loads. The first method used adjusted regional models of storm loads and the second used adjusted regional models for mean concentration and runoff volume. For large storms, the first method seemed to produce excessively high loads for some constituents and the second method provided more reliable results for all constituents except suspended solids. The first method provided more reliable results for large storms for suspended solids.
ERIC Educational Resources Information Center
Walton, Joseph M.; And Others
1978-01-01
Ridge regression is an approach to the problem of large standard errors of regression estimates of intercorrelated regressors. The effect of ridge regression on the estimated squared multiple correlation coefficient is discussed and illustrated. (JKS)
Asquith, William H.; Roussel, Meghan C.
2009-01-01
Annual peak-streamflow frequency estimates are needed for flood-plain management; for objective assessment of flood risk; for cost-effective design of dams, levees, and other flood-control structures; and for design of roads, bridges, and culverts. Annual peak-streamflow frequency represents the peak streamflow for nine recurrence intervals of 2, 5, 10, 25, 50, 100, 200, 250, and 500 years. Common methods for estimation of peak-streamflow frequency for ungaged or unmonitored watersheds are regression equations for each recurrence interval developed for one or more regions; such regional equations are the subject of this report. The method is based on analysis of annual peak-streamflow data from U.S. Geological Survey streamflow-gaging stations (stations). Beginning in 2007, the U.S. Geological Survey, in cooperation with the Texas Department of Transportation and in partnership with Texas Tech University, began a 3-year investigation concerning the development of regional equations to estimate annual peak-streamflow frequency for undeveloped watersheds in Texas. The investigation focuses primarily on 638 stations with 8 or more years of data from undeveloped watersheds and other criteria. The general approach is explicitly limited to the use of L-moment statistics, which are used in conjunction with a technique of multi-linear regression referred to as PRESS minimization. The approach used to develop the regional equations, which was refined during the investigation, is referred to as the 'L-moment-based, PRESS-minimized, residual-adjusted approach'. For the approach, seven unique distributions are fit to the sample L-moments of the data for each of 638 stations and trimmed means of the seven results of the distributions for each recurrence interval are used to define the station specific, peak-streamflow frequency. As a first iteration of regression, nine weighted-least-squares, PRESS-minimized, multi-linear regression equations are computed using the watershed
Fadeyi, Michael; Tran, Tin
2013-01-01
Primary immunodeficiency disease (PIDD) is an inherited disorder characterized by an inadequate immune system. The most common type of PIDD is antibody deficiency. Patients with this disorder lack the ability to make functional immunoglobulin G (IgG) and require lifelong IgG replacement therapy to prevent serious bacterial infections. The current standard therapy for PIDD is intravenous immunoglobulin (IVIG) infusions, but IVIG might not be appropriate for all patients. For this reason, subcutaneous immunoglobulin (SCIG) has emerged as an alternative to IVIG. A concern for physicians is the precise SCIG dose that should be prescribed, because there are pharmacokinetic differences between IVIG and SCIG. Manufacturers of SCIG 10% and 20% liquid (immune globulin subcutaneous [human]) recommend a dose-adjustment coefficient (DAC). Both strengths are currently approved by the FDA. This DAC is to be used when patients are switched from IVIG to SCIG. In this article, we propose another dosing method that uses a higher ratio of IVIG to SCIG and an incremental adjustment based on clinical status, body weight, and the presence of concurrent diseases. PMID:24391400
NASA Astrophysics Data System (ADS)
Grégoire, G.
2014-12-01
The logistic regression originally is intended to explain the relationship between the probability of an event and a set of covariables. The model's coefficients can be interpreted via the odds and odds ratio, which are presented in introduction of the chapter. The observations are possibly got individually, then we speak of binary logistic regression. When they are grouped, the logistic regression is said binomial. In our presentation we mainly focus on the binary case. For statistical inference the main tool is the maximum likelihood methodology: we present the Wald, Rao and likelihoods ratio results and their use to compare nested models. The problems we intend to deal with are essentially the same as in multiple linear regression: testing global effect, individual effect, selection of variables to build a model, measure of the fitness of the model, prediction of new values… . The methods are demonstrated on data sets using R. Finally we briefly consider the binomial case and the situation where we are interested in several events, that is the polytomous (multinomial) logistic regression and the particular case of ordinal logistic regression.
Chan, Kung-Sik; Jiao, Feiran; Mikulski, Marek A.; Gerke, Alicia; Guo, Junfeng; Newell, John D; Hoffman, Eric A.; Thompson, Brad; Lee, Chang Hyun; Fuortes, Laurence J.
2015-01-01
Rationale and Objectives We evaluated the role of automated quantitative computed tomography (CT) scan interpretation algorithm in detecting Interstitial Lung Disease (ILD) and/or emphysema in a sample of elderly subjects with mild lung disease.ypothesized that the quantification and distributions of CT attenuation values on lung CT, over a subset of Hounsfield Units (HU) range [−1000 HU, 0 HU], can differentiate early or mild disease from normal lung. Materials and Methods We compared results of quantitative spiral rapid end-exhalation (functional residual capacity; FRC) and end-inhalation (total lung capacity; TLC) CT scan analyses in 52 subjects with radiographic evidence of mild fibrotic lung disease to 17 normal subjects. Several CT value distributions were explored, including (i) that from the peripheral lung taken at TLC (with peels at 15 or 65mm), (ii) the ratio of (i) to that from the core of lung, and (iii) the ratio of (ii) to its FRC counterpart. We developed a fused-lasso logistic regression model that can automatically identify sub-intervals of [−1000 HU, 0 HU] over which a CT value distribution provides optimal discrimination between abnormal and normal scans. Results The fused-lasso logistic regression model based on (ii) with 15 mm peel identified the relative frequency of CT values over [−1000, −900] and that over [−450,−200] HU as a means of discriminating abnormal versus normal, resulting in a zero out-sample false positive rate and 15%false negative rate of that was lowered to 12% by pooling information. Conclusions We demonstrated the potential usefulness of this novel quantitative imaging analysis method in discriminating ILD and/or emphysema from normal lungs. PMID:26776294
Shahlaei, M; Fassihi, A; Saghaie, L; Zare, A
2014-01-01
A quantiatative structure property relationship (QSPR) treatment was used to a data set consisting of diverse 3-hydroxypyridine-4-one derivatives to relate the logarithmic function of octanol:water partition coefficients (denoted by log po/w) with theoretical molecular descriptors. Evaluation of a test set of 6 compounds with the developed partial least squares (PLS) model revealed that this model is reliable with a good predictability. Since the QSPR study was performed on the basis of theoretical descriptors calculated completely from the molecular structures, the proposed model could potentially provide useful information about the activity of the studied compounds. Various tests and criteria such as leave-one-out cross validation, leave-many-out cross validation, and also criteria suggested by Tropsha were employed to examine the predictability and robustness of the developed model.
Laubender, Ruediger P; Bender, Ralf
2014-02-28
Recently, Laubender and Bender (Stat. Med. 2010; 29: 851-859) applied the average risk difference (RD) approach to estimate adjusted RD and corresponding number needed to treat measures in the Cox proportional hazards model. We calculated standard errors and confidence intervals by using bootstrap techniques. In this paper, we develop asymptotic variance estimates of the adjusted RD measures and corresponding asymptotic confidence intervals within the counting process theory and evaluated them in a simulation study. We illustrate the use of the asymptotic confidence intervals by means of data of the Düsseldorf Obesity Mortality Study.
Granato, Gregory E.
2006-01-01
The Kendall-Theil Robust Line software (KTRLine-version 1.0) is a Visual Basic program that may be used with the Microsoft Windows operating system to calculate parameters for robust, nonparametric estimates of linear-regression coefficients between two continuous variables. The KTRLine software was developed by the U.S. Geological Survey, in cooperation with the Federal Highway Administration, for use in stochastic data modeling with local, regional, and national hydrologic data sets to develop planning-level estimates of potential effects of highway runoff on the quality of receiving waters. The Kendall-Theil robust line was selected because this robust nonparametric method is resistant to the effects of outliers and nonnormality in residuals that commonly characterize hydrologic data sets. The slope of the line is calculated as the median of all possible pairwise slopes between points. The intercept is calculated so that the line will run through the median of input data. A single-line model or a multisegment model may be specified. The program was developed to provide regression equations with an error component for stochastic data generation because nonparametric multisegment regression tools are not available with the software that is commonly used to develop regression models. The Kendall-Theil robust line is a median line and, therefore, may underestimate total mass, volume, or loads unless the error component or a bias correction factor is incorporated into the estimate. Regression statistics such as the median error, the median absolute deviation, the prediction error sum of squares, the root mean square error, the confidence interval for the slope, and the bias correction factor for median estimates are calculated by use of nonparametric methods. These statistics, however, may be used to formulate estimates of mass, volume, or total loads. The program is used to read a two- or three-column tab-delimited input file with variable names in the first row and
Factor Scores, Structure Coefficients, and Communality Coefficients
ERIC Educational Resources Information Center
Goodwyn, Fara
2012-01-01
This paper presents heuristic explanations of factor scores, structure coefficients, and communality coefficients. Common misconceptions regarding these topics are clarified. In addition, (a) the regression (b) Bartlett, (c) Anderson-Rubin, and (d) Thompson methods for calculating factor scores are reviewed. Syntax necessary to execute all four…
Multiple linear regression analysis
NASA Technical Reports Server (NTRS)
Edwards, T. R.
1980-01-01
Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.
NASA Astrophysics Data System (ADS)
Liberman, Neomi; Ben-David Kolikant, Yifat; Beeri, Catriel
2012-09-01
Due to a program reform in Israel, experienced CS high-school teachers faced the need to master and teach a new programming paradigm. This situation served as an opportunity to explore the relationship between teachers' content knowledge (CK) and their pedagogical content knowledge (PCK). This article focuses on three case studies, with emphasis on one of them. Using observations and interviews, we examine how the teachers, we observed taught and what development of their teaching occurred as a result of their teaching experience, if at all. Our findings suggest that this situation creates a new hybrid state of teachers, which we term "regressed experts." These teachers incorporate in their professional practice some elements typical of novices and some typical of experts. We also found that these teachers' experience, although established when teaching a different CK, serve as a leverage to improve their knowledge and understanding of aspects of the new content.
Error bounds in cascading regressions
Karlinger, M.R.; Troutman, B.M.
1985-01-01
Cascading regressions is a technique for predicting a value of a dependent variable when no paired measurements exist to perform a standard regression analysis. Biases in coefficients of a cascaded-regression line as well as error variance of points about the line are functions of the correlation coefficient between dependent and independent variables. Although this correlation cannot be computed because of the lack of paired data, bounds can be placed on errors through the required properties of the correlation coefficient. The potential meansquared error of a cascaded-regression prediction can be large, as illustrated through an example using geomorphologic data. ?? 1985 Plenum Publishing Corporation.
Background stratified Poisson regression analysis of cohort data.
Richardson, David B; Langholz, Bryan
2012-03-01
Background stratified Poisson regression is an approach that has been used in the analysis of data derived from a variety of epidemiologically important studies of radiation-exposed populations, including uranium miners, nuclear industry workers, and atomic bomb survivors. We describe a novel approach to fit Poisson regression models that adjust for a set of covariates through background stratification while directly estimating the radiation-disease association of primary interest. The approach makes use of an expression for the Poisson likelihood that treats the coefficients for stratum-specific indicator variables as 'nuisance' variables and avoids the need to explicitly estimate the coefficients for these stratum-specific parameters. Log-linear models, as well as other general relative rate models, are accommodated. This approach is illustrated using data from the Life Span Study of Japanese atomic bomb survivors and data from a study of underground uranium miners. The point estimate and confidence interval obtained from this 'conditional' regression approach are identical to the values obtained using unconditional Poisson regression with model terms for each background stratum. Moreover, it is shown that the proposed approach allows estimation of background stratified Poisson regression models of non-standard form, such as models that parameterize latency effects, as well as regression models in which the number of strata is large, thereby overcoming the limitations of previously available statistical software for fitting background stratified Poisson regression models.
Soh, Chang-Heok; Harrington, David P; Zaslavsky, Alan M
2008-03-01
When variable selection with stepwise regression and model fitting are conducted on the same data set, competition for inclusion in the model induces a selection bias in coefficient estimators away from zero. In proportional hazards regression with right-censored data, selection bias inflates the absolute value of parameter estimate of selected parameters, while the omission of other variables may shrink coefficients toward zero. This paper explores the extent of the bias in parameter estimates from stepwise proportional hazards regression and proposes a bootstrap method, similar to those proposed by Miller (Subset Selection in Regression, 2nd edn. Chapman & Hall/CRC, 2002) for linear regression, to correct for selection bias. We also use bootstrap methods to estimate the standard error of the adjusted estimators. Simulation results show that substantial biases could be present in uncorrected stepwise estimators and, for binary covariates, could exceed 250% of the true parameter value. The simulations also show that the conditional mean of the proposed bootstrap bias-corrected parameter estimator, given that a variable is selected, is moved closer to the unconditional mean of the standard partial likelihood estimator in the chosen model, and to the population value of the parameter. We also explore the effect of the adjustment on estimates of log relative risk, given the values of the covariates in a selected model. The proposed method is illustrated with data sets in primary biliary cirrhosis and in multiple myeloma from the Eastern Cooperative Oncology Group.
Precision Efficacy Analysis for Regression.
ERIC Educational Resources Information Center
Brooks, Gordon P.
When multiple linear regression is used to develop a prediction model, sample size must be large enough to ensure stable coefficients. If the derivation sample size is inadequate, the model may not predict well for future subjects. The precision efficacy analysis for regression (PEAR) method uses a cross- validity approach to select sample sizes…
Transfer Learning Based on Logistic Regression
NASA Astrophysics Data System (ADS)
Paul, A.; Rottensteiner, F.; Heipke, C.
2015-08-01
In this paper we address the problem of classification of remote sensing images in the framework of transfer learning with a focus on domain adaptation. The main novel contribution is a method for transductive transfer learning in remote sensing on the basis of logistic regression. Logistic regression is a discriminative probabilistic classifier of low computational complexity, which can deal with multiclass problems. This research area deals with methods that solve problems in which labelled training data sets are assumed to be available only for a source domain, while classification is needed in the target domain with different, yet related characteristics. Classification takes place with a model of weight coefficients for hyperplanes which separate features in the transformed feature space. In term of logistic regression, our domain adaptation method adjusts the model parameters by iterative labelling of the target test data set. These labelled data features are iteratively added to the current training set which, at the beginning, only contains source features and, simultaneously, a number of source features are deleted from the current training set. Experimental results based on a test series with synthetic and real data constitutes a first proof-of-concept of the proposed method.
Practical Session: Simple Linear Regression
NASA Astrophysics Data System (ADS)
Clausel, M.; Grégoire, G.
2014-12-01
Two exercises are proposed to illustrate the simple linear regression. The first one is based on the famous Galton's data set on heredity. We use the lm R command and get coefficients estimates, standard error of the error, R2, residuals …In the second example, devoted to data related to the vapor tension of mercury, we fit a simple linear regression, predict values, and anticipate on multiple linear regression. This pratical session is an excerpt from practical exercises proposed by A. Dalalyan at EPNC (see Exercises 1 and 2 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_4.pdf).
Some Simple Computational Formulas for Multiple Regression
ERIC Educational Resources Information Center
Aiken, Lewis R., Jr.
1974-01-01
Short-cut formulas are presented for direct computation of the beta weights, the standard errors of the beta weights, and the multiple correlation coefficient for multiple regression problems involving three independent variables and one dependent variable. (Author)
ERIC Educational Resources Information Center
Pedrini, D. T.; Pedrini, Bonnie C.
Regression, another mechanism studied by Sigmund Freud, has had much research, e.g., hypnotic regression, frustration regression, schizophrenic regression, and infra-human-animal regression (often directly related to fixation). Many investigators worked with hypnotic age regression, which has a long history, going back to Russian reflexologists.…
Rank regression: an alternative regression approach for data with outliers.
Chen, Tian; Tang, Wan; Lu, Ying; Tu, Xin
2014-10-01
Linear regression models are widely used in mental health and related health services research. However, the classic linear regression analysis assumes that the data are normally distributed, an assumption that is not met by the data obtained in many studies. One method of dealing with this problem is to use semi-parametric models, which do not require that the data be normally distributed. But semi-parametric models are quite sensitive to outlying observations, so the generated estimates are unreliable when study data includes outliers. In this situation, some researchers trim the extreme values prior to conducting the analysis, but the ad-hoc rules used for data trimming are based on subjective criteria so different methods of adjustment can yield different results. Rank regression provides a more objective approach to dealing with non-normal data that includes outliers. This paper uses simulated and real data to illustrate this useful regression approach for dealing with outliers and compares it to the results generated using classical regression models and semi-parametric regression models.
Complementary Log Regression for Sufficient-Cause Modeling of Epidemiologic Data.
Lin, Jui-Hsiang; Lee, Wen-Chung
2016-12-13
The logistic regression model is the workhorse of epidemiological data analysis. The model helps to clarify the relationship between multiple exposures and a binary outcome. Logistic regression analysis is readily implemented using existing statistical software, and this has contributed to it becoming a routine procedure for epidemiologists. In this paper, the authors focus on a causal model which has recently received much attention from the epidemiologic community, namely, the sufficient-component cause model (causal-pie model). The authors show that the sufficient-component cause model is associated with a particular 'link' function: the complementary log link. In a complementary log regression, the exponentiated coefficient of a main-effect term corresponds to an adjusted 'peril ratio', and the coefficient of a cross-product term can be used directly to test for causal mechanistic interaction (sufficient-cause interaction). The authors provide detailed instructions on how to perform a complementary log regression using existing statistical software and use three datasets to illustrate the methodology. Complementary log regression is the model of choice for sufficient-cause analysis of binary outcomes. Its implementation is as easy as conventional logistic regression.
... structural alignment and improve your body's physical function. Low back pain, neck pain and headache are the most common ... treated. Chiropractic adjustment can be effective in treating low back pain, although much of the research done shows only ...
... from other people Skipped heartbeats and other physical complaints Trembling or twitching To have adjustment disorder, you ... ADAM Health Solutions. About MedlinePlus Site Map FAQs Customer Support Get email updates Subscribe to RSS Follow ...
Interpreting Multiple Logistic Regression Coefficients in Prospective Observational Studies
1982-11-01
prompted close examination of the issue at a workshop on hypertriglyceridemia where some of the cautions and perspectives given in this paper were...longevity," Circulation, 34, 679-697, (1966). 19. Lippel, K., Tyroler, H., Eder, H., Gotto, A., and Vahouny, G. "Rela- tionship of hypertriglyceridemia
A regularization corrected score method for nonlinear regression models with covariate error.
Zucker, David M; Gorfine, Malka; Li, Yi; Tadesse, Mahlet G; Spiegelman, Donna
2013-03-01
Many regression analyses involve explanatory variables that are measured with error, and failing to account for this error is well known to lead to biased point and interval estimates of the regression coefficients. We present here a new general method for adjusting for covariate error. Our method consists of an approximate version of the Stefanski-Nakamura corrected score approach, using the method of regularization to obtain an approximate solution of the relevant integral equation. We develop the theory in the setting of classical likelihood models; this setting covers, for example, linear regression, nonlinear regression, logistic regression, and Poisson regression. The method is extremely general in terms of the types of measurement error models covered, and is a functional method in the sense of not involving assumptions on the distribution of the true covariate. We discuss the theoretical properties of the method and present simulation results in the logistic regression setting (univariate and multivariate). For illustration, we apply the method to data from the Harvard Nurses' Health Study concerning the relationship between physical activity and breast cancer mortality in the period following a diagnosis of breast cancer.
Harry, H.H.
1988-03-11
Abstract and method for the adjustment and alignment of shafts in high power devices. A plurality of adjacent rotatable angled cylinders are positioned between a base and the shaft to be aligned which when rotated introduce an axial offset. The apparatus is electrically conductive and constructed of a structurally rigid material. The angled cylinders allow the shaft such as the center conductor in a pulse line machine to be offset in any desired alignment position within the range of the apparatus. 3 figs.
Harry, Herbert H.
1989-01-01
Apparatus and method for the adjustment and alignment of shafts in high power devices. A plurality of adjacent rotatable angled cylinders are positioned between a base and the shaft to be aligned which when rotated introduce an axial offset. The apparatus is electrically conductive and constructed of a structurally rigid material. The angled cylinders allow the shaft such as the center conductor in a pulse line machine to be offset in any desired alignment position within the range of the apparatus.
[From clinical judgment to linear regression model.
Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O
2013-01-01
When we think about mathematical models, such as linear regression model, we think that these terms are only used by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and has normal distribution. Stated in another way, the regression is used to predict a measure based on the knowledge of at least one other variable. Linear regression has as it's first objective to determine the slope or inclination of the regression line: Y = a + bx, where "a" is the intercept or regression constant and it is equivalent to "Y" value when "X" equals 0 and "b" (also called slope) indicates the increase or decrease that occurs when the variable "x" increases or decreases in one unit. In the regression line, "b" is called regression coefficient. The coefficient of determination (R(2)) indicates the importance of independent variables in the outcome.
Kolasa-Wiecek, Alicja
2015-04-01
The energy sector in Poland is the source of 81% of greenhouse gas (GHG) emissions. Poland, among other European Union countries, occupies a leading position with regard to coal consumption. Polish energy sector actively participates in efforts to reduce GHG emissions to the atmosphere, through a gradual decrease of the share of coal in the fuel mix and development of renewable energy sources. All evidence which completes the knowledge about issues related to GHG emissions is a valuable source of information. The article presents the results of modeling of GHG emissions which are generated by the energy sector in Poland. For a better understanding of the quantitative relationship between total consumption of primary energy and greenhouse gas emission, multiple stepwise regression model was applied. The modeling results of CO2 emissions demonstrate a high relationship (0.97) with the hard coal consumption variable. Adjustment coefficient of the model to actual data is high and equal to 95%. The backward step regression model, in the case of CH4 emission, indicated the presence of hard coal (0.66), peat and fuel wood (0.34), solid waste fuels, as well as other sources (-0.64) as the most important variables. The adjusted coefficient is suitable and equals R2=0.90. For N2O emission modeling the obtained coefficient of determination is low and equal to 43%. A significant variable influencing the amount of N2O emission is the peat and wood fuel consumption.
Ben Shabat, Yael; Shitzer, Avraham
2012-07-01
Facial heat exchange convection coefficients were estimated from experimental data in cold and windy ambient conditions applicable to wind chill calculations. Measured facial temperature datasets, that were made available to this study, originated from 3 separate studies involving 18 male and 6 female subjects. Most of these data were for a -10°C ambient environment and wind speeds in the range of 0.2 to 6 m s(-1). Additional single experiments were for -5°C, 0°C and 10°C environments and wind speeds in the same range. Convection coefficients were estimated for all these conditions by means of a numerical facial heat exchange model, applying properties of biological tissues and a typical facial diameter of 0.18 m. Estimation was performed by adjusting the guessed convection coefficients in the computed facial temperatures, while comparing them to measured data, to obtain a satisfactory fit (r(2) > 0.98, in most cases). In one of the studies, heat flux meters were additionally used. Convection coefficients derived from these meters closely approached the estimated values for only the male subjects. They differed significantly, by about 50%, when compared to the estimated female subjects' data. Regression analysis was performed for just the -10°C ambient temperature, and the range of experimental wind speeds, due to the limited availability of data for other ambient temperatures. The regressed equation was assumed in the form of the equation underlying the "new" wind chill chart. Regressed convection coefficients, which closely duplicated the measured data, were consistently higher than those calculated by this equation, except for one single case. The estimated and currently used convection coefficients are shown to diverge exponentially from each other, as wind speed increases. This finding casts considerable doubts on the validity of the convection coefficients that are used in the computation of the "new" wind chill chart and their applicability to humans in
Regressive systemic sclerosis.
Black, C; Dieppe, P; Huskisson, T; Hart, F D
1986-01-01
Systemic sclerosis is a disease which usually progresses or reaches a plateau with persistence of symptoms and signs. Regression is extremely unusual. Four cases of established scleroderma are described in which regression is well documented. The significance of this observation and possible mechanisms of disease regression are discussed. Images PMID:3718012
Tharrington, Arnold N.
2015-09-09
The NCCS Regression Test Harness is a software package that provides a framework to perform regression and acceptance testing on NCCS High Performance Computers. The package is written in Python and has only the dependency of a Subversion repository to store the regression tests.
Unitary Response Regression Models
ERIC Educational Resources Information Center
Lipovetsky, S.
2007-01-01
The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…
Relationship between Multiple Regression and Selected Multivariable Methods.
ERIC Educational Resources Information Center
Schumacker, Randall E.
The relationship of multiple linear regression to various multivariate statistical techniques is discussed. The importance of the standardized partial regression coefficient (beta weight) in multiple linear regression as it is applied in path, factor, LISREL, and discriminant analyses is emphasized. The multivariate methods discussed in this paper…
The Geometry of Enhancement in Multiple Regression
ERIC Educational Resources Information Center
Waller, Niels G.
2011-01-01
In linear multiple regression, "enhancement" is said to occur when R[superscript 2] = b[prime]r greater than r[prime]r, where b is a p x 1 vector of standardized regression coefficients and r is a p x 1 vector of correlations between a criterion y and a set of standardized regressors, x. When p = 1 then b [is congruent to] r and…
Ehrsam, Eric; Kallini, Joseph R.; Lebas, Damien; Modiano, Philippe; Cotten, Hervé
2016-01-01
Fully regressive melanoma is a phenomenon in which the primary cutaneous melanoma becomes completely replaced by fibrotic components as a result of host immune response. Although 10 to 35 percent of cases of cutaneous melanomas may partially regress, fully regressive melanoma is very rare; only 47 cases have been reported in the literature to date. AH of the cases of fully regressive melanoma reported in the literature were diagnosed in conjunction with metastasis on a patient. The authors describe a case of fully regressive melanoma without any metastases at the time of its diagnosis. Characteristic findings on dermoscopy, as well as the absence of melanoma on final biopsy, confirmed the diagnosis. PMID:27672418
Meteorological adjustment of yearly mean values for air pollutant concentration comparison
NASA Technical Reports Server (NTRS)
Sidik, S. M.; Neustadter, H. E.
1976-01-01
Using multiple linear regression analysis, models which estimate mean concentrations of Total Suspended Particulate (TSP), sulfur dioxide, and nitrogen dioxide as a function of several meteorologic variables, two rough economic indicators, and a simple trend in time are studied. Meteorologic data were obtained and do not include inversion heights. The goodness of fit of the estimated models is partially reflected by the squared coefficient of multiple correlation which indicates that, at the various sampling stations, the models accounted for about 23 to 47 percent of the total variance of the observed TSP concentrations. If the resulting model equations are used in place of simple overall means of the observed concentrations, there is about a 20 percent improvement in either: (1) predicting mean concentrations for specified meteorological conditions; or (2) adjusting successive yearly averages to allow for comparisons devoid of meteorological effects. An application to source identification is presented using regression coefficients of wind velocity predictor variables.
Introduction to the use of regression models in epidemiology.
Bender, Ralf
2009-01-01
Regression modeling is one of the most important statistical techniques used in analytical epidemiology. By means of regression models the effect of one or several explanatory variables (e.g., exposures, subject characteristics, risk factors) on a response variable such as mortality or cancer can be investigated. From multiple regression models, adjusted effect estimates can be obtained that take the effect of potential confounders into account. Regression methods can be applied in all epidemiologic study designs so that they represent a universal tool for data analysis in epidemiology. Different kinds of regression models have been developed in dependence on the measurement scale of the response variable and the study design. The most important methods are linear regression for continuous outcomes, logistic regression for binary outcomes, Cox regression for time-to-event data, and Poisson regression for frequencies and rates. This chapter provides a nontechnical introduction to these regression models with illustrating examples from cancer research.
Commonality Analysis for the Regression Case.
ERIC Educational Resources Information Center
Murthy, Kavita
Commonality analysis is a procedure for decomposing the coefficient of determination (R superscript 2) in multiple regression analyses into the percent of variance in the dependent variable associated with each independent variable uniquely, and the proportion of explained variance associated with the common effects of predictors in various…
Gerber, Samuel; Rubel, Oliver; Bremer, Peer -Timo; Pascucci, Valerio; Whitaker, Ross T.
2012-01-19
This paper introduces a novel partition-based regression approach that incorporates topological information. Partition-based regression typically introduces a quality-of-fit-driven decomposition of the domain. The emphasis in this work is on a topologically meaningful segmentation. Thus, the proposed regression approach is based on a segmentation induced by a discrete approximation of the Morse–Smale complex. This yields a segmentation with partitions corresponding to regions of the function with a single minimum and maximum that are often well approximated by a linear model. This approach yields regression models that are amenable to interpretation and have good predictive capacity. Typically, regression estimates are quantified by their geometrical accuracy. For the proposed regression, an important aspect is the quality of the segmentation itself. Thus, this article introduces a new criterion that measures the topological accuracy of the estimate. The topological accuracy provides a complementary measure to the classical geometrical error measures and is very sensitive to overfitting. The Morse–Smale regression is compared to state-of-the-art approaches in terms of geometry and topology and yields comparable or improved fits in many cases. Finally, a detailed study on climate-simulation data demonstrates the application of the Morse–Smale regression. Supplementary Materials are available online and contain an implementation of the proposed approach in the R package msr, an analysis and simulations on the stability of the Morse–Smale complex approximation, and additional tables for the climate-simulation study.
Improved Regression Calibration
ERIC Educational Resources Information Center
Skrondal, Anders; Kuha, Jouni
2012-01-01
The likelihood for generalized linear models with covariate measurement error cannot in general be expressed in closed form, which makes maximum likelihood estimation taxing. A popular alternative is regression calibration which is computationally efficient at the cost of inconsistent estimation. We propose an improved regression calibration…
Gerber, Samuel; Rübel, Oliver; Bremer, Peer-Timo; Pascucci, Valerio; Whitaker, Ross T.
2012-01-01
This paper introduces a novel partition-based regression approach that incorporates topological information. Partition-based regression typically introduce a quality-of-fit-driven decomposition of the domain. The emphasis in this work is on a topologically meaningful segmentation. Thus, the proposed regression approach is based on a segmentation induced by a discrete approximation of the Morse-Smale complex. This yields a segmentation with partitions corresponding to regions of the function with a single minimum and maximum that are often well approximated by a linear model. This approach yields regression models that are amenable to interpretation and have good predictive capacity. Typically, regression estimates are quantified by their geometrical accuracy. For the proposed regression, an important aspect is the quality of the segmentation itself. Thus, this paper introduces a new criterion that measures the topological accuracy of the estimate. The topological accuracy provides a complementary measure to the classical geometrical error measures and is very sensitive to over-fitting. The Morse-Smale regression is compared to state-of-the-art approaches in terms of geometry and topology and yields comparable or improved fits in many cases. Finally, a detailed study on climate-simulation data demonstrates the application of the Morse-Smale regression. Supplementary materials are available online and contain an implementation of the proposed approach in the R package msr, an analysis and simulations on the stability of the Morse-Smale complex approximation and additional tables for the climate-simulation study. PMID:23687424
Barros, Márcio Vinícius Lins de; Arancibia, Ana Elisa Loyola; Costa, Ana Paula; Bueno, Fernando Brito; Martins, Marcela Aparecida Corrêa; Magalhães, Maria Cláudia; Silva, José Luiz Padilha; Bastos, Marcos de
2016-04-01
Deep venous thrombosis (DVT) management includes prediction rule evaluation to define standard pretest DVT probabilities in symptomatic patients. The aim of this study was to evaluate the incremental usefulness of hormonal therapy to the Wells prediction rules for DVT in women. We studied women undertaking compressive ultrasound scanning for suspected DVT. We adjusted the Wells score for DVT, taking into account the β-coefficients of the logistic regression model. Data discrimination was evaluated by the receiver operating characteristic (ROC) curve. The adjusted score calibration was assessed graphically and by the Hosmer-Lemeshow test. Reclassification tables and the net reclassification index were used for the adjusted score comparison with the Wells score for DVT. We observed 461 women including 103 DVT events. The mean age was 56 years (±21 years). The adjusted logistic regression model included hormonal therapy and six Wells prediction rules for DVT. The adjusted score weights ranged from -4 to 4. Hosmer-Lemeshow test showed a nonsignificant P value (0.69) and the calibration graph showed no differences between the expected and the observed values. The area under the ROC curve was 0.92 [95% confidence interval (CI) 0.90-0.95] for the adjusted model and 0.87 (95% CI 0.84-0.91) for the Wells score for DVT (Delong test, P value < 0.01). Net reclassification index for the adjusted score was 0.22 (95% CI 0.11-0.33, P value < 0.01). Our results suggest an incremental usefulness of hormonal therapy as an independent DVT prediction rule in women compared with the Wells score for DVT. The adjusted score must be evaluated in different populations before clinical use.
Schmid, Matthias; Wickler, Florian; Maloney, Kelly O.; Mitchell, Richard; Fenske, Nora; Mayr, Andreas
2013-01-01
Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1). Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fit a beta regression model is to use maximum likelihood estimation with subsequent AIC-based variable selection. As an alternative to this established - yet unstable - approach, we propose a new estimation technique called boosted beta regression. With boosted beta regression estimation and variable selection can be carried out simultaneously in a highly efficient way. Additionally, both the mean and the variance of a percentage response can be modeled using flexible nonlinear covariate effects. As a consequence, the new method accounts for common problems such as overdispersion and non-binomial variance structures. PMID:23626706
Spatial regression analysis on 32 years total column ozone data
NASA Astrophysics Data System (ADS)
Knibbe, J. S.; van der A, R. J.; de Laat, A. T. J.
2014-02-01
Multiple-regressions analysis have been performed on 32 years of total ozone column data that was spatially gridded with a 1° × 1.5° resolution. The total ozone data consists of the MSR (Multi Sensor Reanalysis; 1979-2008) and two years of assimilated SCIAMACHY ozone data (2009-2010). The two-dimensionality in this data-set allows us to perform the regressions locally and investigate spatial patterns of regression coefficients and their explanatory power. Seasonal dependencies of ozone on regressors are included in the analysis. A new physically oriented model is developed to parameterize stratospheric ozone. Ozone variations on non-seasonal timescales are parameterized by explanatory variables describing the solar cycle, stratospheric aerosols, the quasi-biennial oscillation (QBO), El Nino (ENSO) and stratospheric alternative halogens (EESC). For several explanatory variables, seasonally adjusted versions of these explanatory variables are constructed to account for the difference in their effect on ozone throughout the year. To account for seasonal variation in ozone, explanatory variables describing the polar vortex, geopotential height, potential vorticity and average day length are included. Results of this regression model are compared to that of similar analysis based on a more commonly applied statistically oriented model. The physically oriented model provides spatial patterns in the regression results for each explanatory variable. The EESC has a significant depleting effect on ozone at high and mid-latitudes, the solar cycle affects ozone positively mostly at the Southern Hemisphere, stratospheric aerosols affect ozone negatively at high Northern latitudes, the effect of QBO is positive and negative at the tropics and mid to high-latitudes respectively and ENSO affects ozone negatively between 30° N and 30° S, particularly at the Pacific. The contribution of explanatory variables describing seasonal ozone variation is generally large at mid to high
George: Gaussian Process regression
NASA Astrophysics Data System (ADS)
Foreman-Mackey, Daniel
2015-11-01
George is a fast and flexible library, implemented in C++ with Python bindings, for Gaussian Process regression useful for accounting for correlated noise in astronomical datasets, including those for transiting exoplanet discovery and characterization and stellar population modeling.
Understanding poisson regression.
Hayat, Matthew J; Higgins, Melinda
2014-04-01
Nurse investigators often collect study data in the form of counts. Traditional methods of data analysis have historically approached analysis of count data either as if the count data were continuous and normally distributed or with dichotomization of the counts into the categories of occurred or did not occur. These outdated methods for analyzing count data have been replaced with more appropriate statistical methods that make use of the Poisson probability distribution, which is useful for analyzing count data. The purpose of this article is to provide an overview of the Poisson distribution and its use in Poisson regression. Assumption violations for the standard Poisson regression model are addressed with alternative approaches, including addition of an overdispersion parameter or negative binomial regression. An illustrative example is presented with an application from the ENSPIRE study, and regression modeling of comorbidity data is included for illustrative purposes.
Robustness of Land-Use Regression Models Developed from Mobile Air Pollutant Measurements.
Hatzopoulou, Marianne; Valois, Marie-France; Levy, Ilan; Mihele, Cristian; Lu, Gang; Bagg, Scott; Minet, Laura; Brook, Jeffrey Robert
2017-02-27
Land-Use Regression (LUR) models are useful for resolving fine scale spatial variations in average air pollutant concentrations across urban areas. With the rise of mobile air pollution campaigns, characterized by short-term monitoring and large spatial extents, it is important to investigate the effects of sampling protocols on the resulting LUR. In this study a mobile lab was used to repeatedly visit a large number of locations (~1800), defined by road segments, to derive average concentrations across the city of Montreal, Canada. We hypothesize that the robustness of the LUR from these data depends upon how many independent, random times each location is visited (Nvis) and the number of locations (Nloc) used in model development and that these parameters can be optimized. By performing multiple LURs on random sets of locations, we assessed the robustness of the LUR through consistency in adjusted R2 (i.e., coefficient of variation, CV) and in regression coefficients among different models. As Nloc increased, R2adj became less variable; for Nloc=100 vs. Nloc=300 the CV in R2adj for ultrafine particles decreased from 0.088 to 0.029 and from 0.115 to 0.076 for NO2. The CV in the R2adj also decreased as Nvis increased from 6 to 16; from 0.090 to 0.014 for UFP. As Nloc and Nvis increase, the variability in the coefficient sizes across the different model realizations were also seen to decrease.
Gas-film coefficients for streams
Rathbun, R.E.; Tai, D.Y.
1983-01-01
Equations for predicting the gas-film coefficient for the volatilization of organic solutes from streams are developed. The film coefficient is a function of windspeed and water temperature. The dependence of the coefficient on windspeed is determined from published information on the evaporation of water from a canal. The dependence of the coefficient on temperature is determined from laboratory studies on the evaporation of water. Procedures for adjusting the coefficients for different organic solutes are based on the molecular diffusion coefficient and the molecular weight. The molecular weight procedure is easiest to use because of the availability of molecular weights. However, the theoretical basis of the procedure is questionable. The diffusion coefficient procedure is supported by considerable data. Questions, however, remain regarding the exact dependence of the film coefficint on the diffusion coefficient. It is suggested that the diffusion coefficient procedure with a 0.68-power dependence be used when precise estimate of the gas-film coefficient are needed and that the molecular weight procedure be used when only approximate estimates are needed.
Parametric regression model for survival data: Weibull regression model as an example
2016-01-01
Weibull regression model is one of the most popular forms of parametric regression model that it provides estimate of baseline hazard function, as well as coefficients for covariates. Because of technical difficulties, Weibull regression model is seldom used in medical literature as compared to the semi-parametric proportional hazard model. To make clinical investigators familiar with Weibull regression model, this article introduces some basic knowledge on Weibull regression model and then illustrates how to fit the model with R software. The SurvRegCensCov package is useful in converting estimated coefficients to clinical relevant statistics such as hazard ratio (HR) and event time ratio (ETR). Model adequacy can be assessed by inspecting Kaplan-Meier curves stratified by categorical variable. The eha package provides an alternative method to model Weibull regression model. The check.dist() function helps to assess goodness-of-fit of the model. Variable selection is based on the importance of a covariate, which can be tested using anova() function. Alternatively, backward elimination starting from a full model is an efficient way for model development. Visualization of Weibull regression model after model development is interesting that it provides another way to report your findings. PMID:28149846
[Understanding logistic regression].
El Sanharawi, M; Naudet, F
2013-10-01
Logistic regression is one of the most common multivariate analysis models utilized in epidemiology. It allows the measurement of the association between the occurrence of an event (qualitative dependent variable) and factors susceptible to influence it (explicative variables). The choice of explicative variables that should be included in the logistic regression model is based on prior knowledge of the disease physiopathology and the statistical association between the variable and the event, as measured by the odds ratio. The main steps for the procedure, the conditions of application, and the essential tools for its interpretation are discussed concisely. We also discuss the importance of the choice of variables that must be included and retained in the regression model in order to avoid the omission of important confounding factors. Finally, by way of illustration, we provide an example from the literature, which should help the reader test his or her knowledge.
A SEMIPARAMETRIC BAYESIAN MODEL FOR CIRCULAR-LINEAR REGRESSION
We present a Bayesian approach to regress a circular variable on a linear predictor. The regression coefficients are assumed to have a nonparametric distribution with a Dirichlet process prior. The semiparametric Bayesian approach gives added flexibility to the model and is usefu...
Practical Session: Logistic Regression
NASA Astrophysics Data System (ADS)
Clausel, M.; Grégoire, G.
2014-12-01
An exercise is proposed to illustrate the logistic regression. One investigates the different risk factors in the apparition of coronary heart disease. It has been proposed in Chapter 5 of the book of D.G. Kleinbaum and M. Klein, "Logistic Regression", Statistics for Biology and Health, Springer Science Business Media, LLC (2010) and also by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr341.pdf). This example is based on data given in the file evans.txt coming from http://www.sph.emory.edu/dkleinb/logreg3.htm#data.
Union Support Recovery in High-Dimensional Multivariate Regression
2008-08-01
view the Lasso as a shrinkage estimator to be compared to traditional least squares or ridge regression ; in this case, it is natural to study the `2... instance , in a hierarchical regression model, groups of regression coefficients may be required to be zero or non-zero in a blockwise manner; for example...Neural Information Processing Systems, 18. MIT Press, Cambridge, MA. Bach, F. (2008). Consistency of the group Lasso and multiple kernel learning
Ridge Regression: A Regression Procedure for Analyzing correlated Independent Variables
ERIC Educational Resources Information Center
Rakow, Ernest A.
1978-01-01
Ridge regression is a technique used to ameliorate the problem of highly correlated independent variables in multiple regression analysis. This paper explains the fundamentals of ridge regression and illustrates its use. (JKS)
Modern Regression Discontinuity Analysis
ERIC Educational Resources Information Center
Bloom, Howard S.
2012-01-01
This article provides a detailed discussion of the theory and practice of modern regression discontinuity (RD) analysis for estimating the effects of interventions or treatments. Part 1 briefly chronicles the history of RD analysis and summarizes its past applications. Part 2 explains how in theory an RD analysis can identify an average effect of…
Explorations in Statistics: Regression
ERIC Educational Resources Information Center
Curran-Everett, Douglas
2011-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This seventh installment of "Explorations in Statistics" explores regression, a technique that estimates the nature of the relationship between two things for which we may only surmise a mechanistic or predictive…
Rodríguez-Ramilo, Silvia Teresa; García-Cortés, Luis Alberto; González-Recio, Oscar
2014-01-01
Genome-enhanced genotypic evaluations are becoming popular in several livestock species. For this purpose, the combination of the pedigree-based relationship matrix with a genomic similarities matrix between individuals is a common approach. However, the weight placed on each matrix has been so far established with ad hoc procedures, without formal estimation thereof. In addition, when using marker- and pedigree-based relationship matrices together, the resulting combined relationship matrix needs to be adjusted to the same scale in reference to the base population. This study proposes a semi-parametric Bayesian method for combining marker- and pedigree-based information on genome-enabled predictions. A kernel matrix from a reproducing kernel Hilbert spaces regression model was used to combine genomic and genealogical information in a semi-parametric scenario, avoiding inversion and adjustment complications. In addition, the weights on marker- versus pedigree-based information were inferred from a Bayesian model with Markov chain Monte Carlo. The proposed method was assessed involving a large number of SNPs and a large reference population. Five phenotypes, including production and type traits of dairy cattle were evaluated. The reliability of the genome-based predictions was assessed using the correlation, regression coefficient and mean squared error between the predicted and observed values. The results indicated that when a larger weight was given to the pedigree-based relationship matrix the correlation coefficient was lower than in situations where more weight was given to genomic information. Importantly, the posterior means of the inferred weight were near the maximum of 1. The behavior of the regression coefficient and the mean squared error was similar to the performance of the correlation, that is, more weight to the genomic information provided a regression coefficient closer to one and a smaller mean squared error. Our results also indicated a greater
Ahearn, Elizabeth A.
2010-01-01
contrast, the Rearing and Growth (July-October) bioperiod had the largest standard errors, ranging from 30.9 to 156 percent. The adjusted coefficient of determination of the equations ranged from 77.5 to 99.4 percent with medians of 98.5 and 90.6 percent to predict the 25- and 99-percent exceedances, respectively. Descriptive information on the streamgages used in the regression, measured basin and climatic characteristics, and estimated flow-duration statistics are provided in this report. Flow-duration statistics and the 32 regression equations for estimating flow-duration statistics in Connecticut are stored on the U.S. Geological Survey World Wide Web application ?StreamStats? (http://water.usgs.gov/osw/streamstats/index.html). The regression equations developed in this report can be used to produce unbiased estimates of select flow exceedances statewide.
Modified Biserial Correlation Coefficients.
ERIC Educational Resources Information Center
Kraemer, Helena Chmura
1981-01-01
Asymptotic distribution theory of Brogden's form of biserial correlation coefficient is derived and large sample estimates of its standard error obtained. Its relative efficiency to the biserial correlation coefficient is examined. Recommendations for choice of estimator of biserial correlation are presented. (Author/JKS)
ERIC Educational Resources Information Center
Waller, Niels; Jones, Jeff
2011-01-01
We describe methods for assessing all possible criteria (i.e., dependent variables) and subsets of criteria for regression models with a fixed set of predictors, x (where x is an n x 1 vector of independent variables). Our methods build upon the geometry of regression coefficients (hereafter called regression weights) in n-dimensional space. For a…
Calculating a Stepwise Ridge Regression.
ERIC Educational Resources Information Center
Morris, John D.
1986-01-01
Although methods for using ordinary least squares regression computer programs to calculate a ridge regression are available, the calculation of a stepwise ridge regression requires a special purpose algorithm and computer program. The correct stepwise ridge regression procedure is given, and a parallel FORTRAN computer program is described.…
Orthogonal Regression: A Teaching Perspective
ERIC Educational Resources Information Center
Carr, James R.
2012-01-01
A well-known approach to linear least squares regression is that which involves minimizing the sum of squared orthogonal projections of data points onto the best fit line. This form of regression is known as orthogonal regression, and the linear model that it yields is known as the major axis. A similar method, reduced major axis regression, is…
Steganalysis using logistic regression
NASA Astrophysics Data System (ADS)
Lubenko, Ivans; Ker, Andrew D.
2011-02-01
We advocate Logistic Regression (LR) as an alternative to the Support Vector Machine (SVM) classifiers commonly used in steganalysis. LR offers more information than traditional SVM methods - it estimates class probabilities as well as providing a simple classification - and can be adapted more easily and efficiently for multiclass problems. Like SVM, LR can be kernelised for nonlinear classification, and it shows comparable classification accuracy to SVM methods. This work is a case study, comparing accuracy and speed of SVM and LR classifiers in detection of LSB Matching and other related spatial-domain image steganography, through the state-of-art 686-dimensional SPAM feature set, in three image sets.
On the use of beta coefficients in meta-analysis.
Peterson, Robert A; Brown, Steven P
2005-01-01
This research reports an investigation of the use of standardized regression (beta) coefficients in meta-analyses that use correlation coefficients as the effect-size metric. The investigation consisted of analyzing more than 1,700 corresponding beta coefficients and correlation coefficients harvested from published studies. Results indicate that, under certain conditions, using knowledge of corresponding beta coefficients to input missing correlations (effect sizes) generally produces relatively accurate and precise population effect-size estimates. Potential benefits from applying this knowledge include smaller sampling errors because of increased numbers of effect sizes and smaller non-sampling errors because of the inclusion of a broader array of research designs.
Kramer, S.
1996-12-31
In many real-world domains the task of machine learning algorithms is to learn a theory for predicting numerical values. In particular several standard test domains used in Inductive Logic Programming (ILP) are concerned with predicting numerical values from examples and relational and mostly non-determinate background knowledge. However, so far no ILP algorithm except one can predict numbers and cope with nondeterminate background knowledge. (The only exception is a covering algorithm called FORS.) In this paper we present Structural Regression Trees (SRT), a new algorithm which can be applied to the above class of problems. SRT integrates the statistical method of regression trees into ILP. It constructs a tree containing a literal (an atomic formula or its negation) or a conjunction of literals in each node, and assigns a numerical value to each leaf. SRT provides more comprehensible results than purely statistical methods, and can be applied to a class of problems most other ILP systems cannot handle. Experiments in several real-world domains demonstrate that the approach is competitive with existing methods, indicating that the advantages are not at the expense of predictive accuracy.
Coefficients of Effective Length.
ERIC Educational Resources Information Center
Edwards, Roger H.
1981-01-01
Under certain conditions, a validity Coefficient of Effective Length (CEL) can produce highly misleading results. A modified coefficent is suggested for use when empirical studies indicate that underlying assumptions have been violated. (Author/BW)
Neither fixed nor random: weighted least squares meta-regression.
Stanley, T D; Doucouliagos, Hristos
2017-03-01
Our study revisits and challenges two core conventional meta-regression estimators: the prevalent use of 'mixed-effects' or random-effects meta-regression analysis and the correction of standard errors that defines fixed-effects meta-regression analysis (FE-MRA). We show how and explain why an unrestricted weighted least squares MRA (WLS-MRA) estimator is superior to conventional random-effects (or mixed-effects) meta-regression when there is publication (or small-sample) bias that is as good as FE-MRA in all cases and better than fixed effects in most practical applications. Simulations and statistical theory show that WLS-MRA provides satisfactory estimates of meta-regression coefficients that are practically equivalent to mixed effects or random effects when there is no publication bias. When there is publication selection bias, WLS-MRA always has smaller bias than mixed effects or random effects. In practical applications, an unrestricted WLS meta-regression is likely to give practically equivalent or superior estimates to fixed-effects, random-effects, and mixed-effects meta-regression approaches. However, random-effects meta-regression remains viable and perhaps somewhat preferable if selection for statistical significance (publication bias) can be ruled out and when random, additive normal heterogeneity is known to directly affect the 'true' regression coefficient. Copyright © 2016 John Wiley & Sons, Ltd.
Covariate-Adjusted Precision Matrix Estimation with an Application in Genetical Genomics
Cai, T. Tony; Li, Hongzhe; Liu, Weidong; Xie, Jichun
2017-01-01
Summary Motivated by analysis of genetical genomics data, we introduce a sparse high dimensional multivariate regression model for studying conditional independence relationships among a set of genes adjusting for possible genetic effects. The precision matrix in the model specifies a covariate-adjusted Gaussian graph, which presents the conditional dependence structure of gene expression after the confounding genetic effects on gene expression are taken into account. We present a covariate-adjusted precision matrix estimation method using a constrained ℓ1 minimization, which can be easily implemented by linear programming. Asymptotic convergence rates in various matrix norms and sign consistency are established for the estimators of the regression coefficients and the precision matrix, allowing both the number of genes and the number of the genetic variants to diverge. Simulation shows that the proposed method results in significant improvements in both precision matrix estimation and graphical structure selection when compared to the standard Gaussian graphical model assuming constant means. The proposed method is also applied to analyze a yeast genetical genomics data for the identification of the gene network among a set of genes in the mitogen-activated protein kinase pathway.
Sheehan, Kenneth R.; Strager, Michael P.; Welsh, Stuart
2013-01-01
Stream habitat assessments are commonplace in fish management, and often involve nonspatial analysis methods for quantifying or predicting habitat, such as ordinary least squares regression (OLS). Spatial relationships, however, often exist among stream habitat variables. For example, water depth, water velocity, and benthic substrate sizes within streams are often spatially correlated and may exhibit spatial nonstationarity or inconsistency in geographic space. Thus, analysis methods should address spatial relationships within habitat datasets. In this study, OLS and a recently developed method, geographically weighted regression (GWR), were used to model benthic substrate from water depth and water velocity data at two stream sites within the Greater Yellowstone Ecosystem. For data collection, each site was represented by a grid of 0.1 m2 cells, where actual values of water depth, water velocity, and benthic substrate class were measured for each cell. Accuracies of regressed substrate class data by OLS and GWR methods were calculated by comparing maps, parameter estimates, and determination coefficient r 2. For analysis of data from both sites, Akaike’s Information Criterion corrected for sample size indicated the best approximating model for the data resulted from GWR and not from OLS. Adjusted r 2 values also supported GWR as a better approach than OLS for prediction of substrate. This study supports GWR (a spatial analysis approach) over nonspatial OLS methods for prediction of habitat for stream habitat assessments.
NASA Technical Reports Server (NTRS)
Kuhl, Mark R.
1990-01-01
Current navigation requirements depend on a geometric dilution of precision (GDOP) criterion. As long as the GDOP stays below a specific value, navigation requirements are met. The GDOP will exceed the specified value when the measurement geometry becomes too collinear. A new signal processing technique, called Ridge Regression Processing, can reduce the effects of nearly collinear measurement geometry; thereby reducing the inflation of the measurement errors. It is shown that the Ridge signal processor gives a consistently better mean squared error (MSE) in position than the Ordinary Least Mean Squares (OLS) estimator. The applicability of this technique is currently being investigated to improve the following areas: receiver autonomous integrity monitoring (RAIM), coverage requirements, availability requirements, and precision approaches.
Moderation analysis using a two-level regression model.
Yuan, Ke-Hai; Cheng, Ying; Maxwell, Scott
2014-10-01
Moderation analysis is widely used in social and behavioral research. The most commonly used model for moderation analysis is moderated multiple regression (MMR) in which the explanatory variables of the regression model include product terms, and the model is typically estimated by least squares (LS). This paper argues for a two-level regression model in which the regression coefficients of a criterion variable on predictors are further regressed on moderator variables. An algorithm for estimating the parameters of the two-level model by normal-distribution-based maximum likelihood (NML) is developed. Formulas for the standard errors (SEs) of the parameter estimates are provided and studied. Results indicate that, when heteroscedasticity exists, NML with the two-level model gives more efficient and more accurate parameter estimates than the LS analysis of the MMR model. When error variances are homoscedastic, NML with the two-level model leads to essentially the same results as LS with the MMR model. Most importantly, the two-level regression model permits estimating the percentage of variance of each regression coefficient that is due to moderator variables. When applied to data from General Social Surveys 1991, NML with the two-level model identified a significant moderation effect of race on the regression of job prestige on years of education while LS with the MMR model did not. An R package is also developed and documented to facilitate the application of the two-level model.
Assessing risk factors for periodontitis using regression
NASA Astrophysics Data System (ADS)
Lobo Pereira, J. A.; Ferreira, Maria Cristina; Oliveira, Teresa
2013-10-01
Multivariate statistical analysis is indispensable to assess the associations and interactions between different factors and the risk of periodontitis. Among others, regression analysis is a statistical technique widely used in healthcare to investigate and model the relationship between variables. In our work we study the impact of socio-demographic, medical and behavioral factors on periodontal health. Using regression, linear and logistic models, we can assess the relevance, as risk factors for periodontitis disease, of the following independent variables (IVs): Age, Gender, Diabetic Status, Education, Smoking status and Plaque Index. The multiple linear regression analysis model was built to evaluate the influence of IVs on mean Attachment Loss (AL). Thus, the regression coefficients along with respective p-values will be obtained as well as the respective p-values from the significance tests. The classification of a case (individual) adopted in the logistic model was the extent of the destruction of periodontal tissues defined by an Attachment Loss greater than or equal to 4 mm in 25% (AL≥4mm/≥25%) of sites surveyed. The association measures include the Odds Ratios together with the correspondent 95% confidence intervals.
NASA Technical Reports Server (NTRS)
Snyder, G. Jeffrey (Inventor)
2015-01-01
A high temperature Seebeck coefficient measurement apparatus and method with various features to minimize typical sources of errors is described. Common sources of temperature and voltage measurement errors which may impact accurate measurement are identified and reduced. Applying the identified principles, a high temperature Seebeck measurement apparatus and method employing a uniaxial, four-point geometry is described to operate from room temperature up to 1300K. These techniques for non-destructive Seebeck coefficient measurements are simple to operate, and are suitable for bulk samples with a broad range of physical types and shapes.
JKTLD: Limb darkening coefficients
NASA Astrophysics Data System (ADS)
Southworth, John
2015-11-01
JKTLD outputs theoretically-calculated limb darkening (LD) strengths for equations (LD laws) which predict the amount of LD as a function of the part of the star being observed. The coefficients of these laws are obtained by bilinear interpolation (in effective temperature and surface gravity) in published tables of coefficients calculated from stellar model atmospheres by several researchers. Many observations of stars require the strength of limb darkening (LD) to be estimated, which can be done using theoretical models of stellar atmospheres; JKTLD can help in these circumstances.
Use of Multiple Regression in Counseling Psychology Research: A Flexible Data-Analytic Strategy.
ERIC Educational Resources Information Center
Wampold, Bruce E.; Freund, Richard D.
1987-01-01
Explains multiple regression, demonstrates its flexibility for analyzing data from various designs, and discusses interpretation of results from multiple regression analysis. Presents regression equations for single independent variable and for two or more independent variables, followed by a discussion of coefficients related to these. Compares…
Brink, Carsten; Bernchou, Uffe; Bertelsen, Anders; Hansen, Olfred; Schytte, Tine; Bentzen, Soren M.
2014-07-15
Purpose: Large interindividual variations in volume regression of non-small cell lung cancer (NSCLC) are observable on standard cone beam computed tomography (CBCT) during fractionated radiation therapy. Here, a method for automated assessment of tumor volume regression is presented and its potential use in response adapted personalized radiation therapy is evaluated empirically. Methods and Materials: Automated deformable registration with calculation of the Jacobian determinant was applied to serial CBCT scans in a series of 99 patients with NSCLC. Tumor volume at the end of treatment was estimated on the basis of the first one third and two thirds of the scans. The concordance between estimated and actual relative volume at the end of radiation therapy was quantified by Pearson's correlation coefficient. On the basis of the estimated relative volume, the patients were stratified into 2 groups having volume regressions below or above the population median value. Kaplan-Meier plots of locoregional disease-free rate and overall survival in the 2 groups were used to evaluate the predictive value of tumor regression during treatment. Cox proportional hazards model was used to adjust for other clinical characteristics. Results: Automatic measurement of the tumor regression from standard CBCT images was feasible. Pearson's correlation coefficient between manual and automatic measurement was 0.86 in a sample of 9 patients. Most patients experienced tumor volume regression, and this could be quantified early into the treatment course. Interestingly, patients with pronounced volume regression had worse locoregional tumor control and overall survival. This was significant on patient with non-adenocarcinoma histology. Conclusions: Evaluation of routinely acquired CBCT images during radiation therapy provides biological information on the specific tumor. This could potentially form the basis for personalized response adaptive therapy.
A generalized concordance correlation coefficient for continuous and categorical data.
King, T S; Chinchilli, V M
2001-07-30
This paper discusses a generalized version of the concordance correlation coefficient for agreement data. The concordance correlation coefficient evaluates the accuracy and precision between two measures, and is based on the expected value of the squared function of distance. We have generalized this coefficient by applying alternative functions of distance to produce more robust versions of the concordance correlation coefficient. In this paper we extend the application of this class of estimators to categorical data as well, and demonstrate similarities to the kappa and weighted kappa statistics. We also introduce a stratified concordance correlation coefficient which adjusts for explanatory factors, and an extended concordance correlation coefficient which measures agreement among more than two responses. With these extensions, the generalized concordance correlation coefficient provides a unifying approach to assessing agreement among two or more measures that are either continuous or categorical in scale.
ERIC Educational Resources Information Center
Lowe, Terry J.
Reporting and interpretation of effect sizes and structure coefficients in multiple regression results are important for good practice. The purpose of this study was to investigate the use and interpretation of effect sizes (ES) and structure coefficients in multiple regression analyses in two mathematics and science education journals. Published…
Evaluating differential effects using regression interactions and regression mixture models
Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung
2015-01-01
Research increasingly emphasizes understanding differential effects. This paper focuses on understanding regression mixture models, a relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their formulation, and their assumptions are compared using Monte Carlo simulations and real data analysis. The capabilities of regression mixture models are described and specific issues to be addressed when conducting regression mixtures are proposed. The paper aims to clarify the role that regression mixtures can take in the estimation of differential effects and increase awareness of the benefits and potential pitfalls of this approach. Regression mixture models are shown to be a potentially effective exploratory method for finding differential effects when these effects can be defined by a small number of classes of respondents who share a typical relationship between a predictor and an outcome. It is also shown that the comparison between regression mixture models and interactions becomes substantially more complex as the number of classes increases. It is argued that regression interactions are well suited for direct tests of specific hypotheses about differential effects and regression mixtures provide a useful approach for exploring effect heterogeneity given adequate samples and study design. PMID:26556903
NASA Astrophysics Data System (ADS)
Taguas, Encarnación; Nadal-Romero, Estela; Ayuso, José L.; Casalí, Javier; Cid, Patricio; Dafonte, Jorge; Duarte, Antonio C.; Giménez, Rafael; Giráldez, Juan V.; Gómez-Macpherson, Helena; Gómez, José A.; González-Hidalgo, J. Carlos; Lucía, Ana; Mateos, Luciano; Rodríguez-Blanco, M. Luz; Schnabel, Susanne; Serrano-Muela, M. Pilar; Lana-Renault, Noemí; Mercedes Taboada-Castro, M.; Taboada-Castro, M. Teresa
2016-04-01
Analysis of storm rainfall-runoff data is essential to improve our understanding of catchment hydrology and to validate models supporting hydrological planning. In a context of climate change, statistical and process-based models are helpful to explore different scenarios which might be represented by simple parameters such as volumetric runoff coefficient. In this work, rainfall-runoff event datasets collected at 17 rural catchments in the Iberian Peninsula were studied. The objectives were: i) to describe hydrological patterns/variability of the relation rainfall-runoff; ii) to explore different methodologies to quantify representative volumetric runoff coefficients. Firstly, the criteria used to define an event were examined in order to standardize the analysis. Linear regression adjustments and statistics of the rainfall-runoff relations were examined to identify possible common patterns. In addition, a principal component analysis was applied to evaluate the variability among catchments based on their physical attributes. Secondly, runoff coefficients at event temporal scale were calculated following different methods. Median, mean, Hawkinś graphic method (Hawkins, 1993), reference values for engineering project of Prevert (TRAGSA, 1994) and the ratio of cumulated runoff and cumulated precipitation of the event that generated runoff (Rcum) were compared. Finally, the relations between the most representative volumetric runoff coefficients with the physical features of the catchments were explored using multiple linear regressions. The mean volumetric runoff coefficient in the studied catchments was 0.18, whereas the median was 0.15, both with variation coefficients greater than 100%. In 6 catchments, rainfall-runoff linear adjustments presented coefficient of determination greater than 0.60 (p < 0.001) while in 5, it was lesser than 0.40. The slope of the linear adjustments for agricultural catchments located in areas with the lowest annual precipitation were
Biostatistics Series Module 6: Correlation and Linear Regression.
Hazra, Avijit; Gogtay, Nithya
2016-01-01
Correlation and linear regression are the most commonly used techniques for quantifying the association between two numeric variables. Correlation quantifies the strength of the linear relationship between paired variables, expressing this as a correlation coefficient. If both variables x and y are normally distributed, we calculate Pearson's correlation coefficient (r). If normality assumption is not met for one or both variables in a correlation analysis, a rank correlation coefficient, such as Spearman's rho (ρ) may be calculated. A hypothesis test of correlation tests whether the linear relationship between the two variables holds in the underlying population, in which case it returns a P < 0.05. A 95% confidence interval of the correlation coefficient can also be calculated for an idea of the correlation in the population. The value r(2) denotes the proportion of the variability of the dependent variable y that can be attributed to its linear relation with the independent variable x and is called the coefficient of determination. Linear regression is a technique that attempts to link two correlated variables x and y in the form of a mathematical equation (y = a + bx), such that given the value of one variable the other may be predicted. In general, the method of least squares is applied to obtain the equation of the regression line. Correlation and linear regression analysis are based on certain assumptions pertaining to the data sets. If these assumptions are not met, misleading conclusions may be drawn. The first assumption is that of linear relationship between the two variables. A scatter plot is essential before embarking on any correlation-regression analysis to show that this is indeed the case. Outliers or clustering within data sets can distort the correlation coefficient value. Finally, it is vital to remember that though strong correlation can be a pointer toward causation, the two are not synonymous.
Hunter, Paul R
2009-12-01
Household water treatment (HWT) is being widely promoted as an appropriate intervention for reducing the burden of waterborne disease in poor communities in developing countries. A recent study has raised concerns about the effectiveness of HWT, in part because of concerns over the lack of blinding and in part because of considerable heterogeneity in the reported effectiveness of randomized controlled trials. This study set out to attempt to investigate the causes of this heterogeneity and so identify factors associated with good health gains. Studies identified in an earlier systematic review and meta-analysis were supplemented with more recently published randomized controlled trials. A total of 28 separate studies of randomized controlled trials of HWT with 39 intervention arms were included in the analysis. Heterogeneity was studied using the "metareg" command in Stata. Initial analyses with single candidate predictors were undertaken and all variables significant at the P < 0.2 level were included in a final regression model. Further analyses were done to estimate the effect of the interventions over time by MonteCarlo modeling using @Risk and the parameter estimates from the final regression model. The overall effect size of all unblinded studies was relative risk = 0.56 (95% confidence intervals 0.51-0.63), but after adjusting for bias due to lack of blinding the effect size was much lower (RR = 0.85, 95% CI = 0.76-0.97). Four main variables were significant predictors of effectiveness of intervention in a multipredictor meta regression model: Log duration of study follow-up (regression coefficient of log effect size = 0.186, standard error (SE) = 0.072), whether or not the study was blinded (coefficient 0.251, SE 0.066) and being conducted in an emergency setting (coefficient -0.351, SE 0.076) were all significant predictors of effect size in the final model. Compared to the ceramic filter all other interventions were much less effective (Biosand 0.247, 0
1990-05-01
0.255 -0.120 -0.129 49 (ELRHGHT) ELBOW REST HEIGHT 0.094 0.540 0.525 92 (SHOUELLT) SHOULDER-ELBOW LENGTH 0.548 0.546 82 (NECKCRCB) NECK ...WSCIRCON) WAIST CIRCUMFERENCE, OMPHALION -0.059 -0.064 81 (NECKCIRC) NECK CIRCUMFERENCE -0.123 S.E. OF ESTIMATE 14.911 12.485 11.881 11.640 11.497...NECKHTLT) NECK HEIGHT, LATERAL 1.012 0.993 0.797 0.615 0.564 90 (SCYEDPTH) SCYE DEPTH 0.199 0.197 0.275 0.290 100 (STATURE) STATURE 0.184 0.180 0.205 7
Regression Models For Saffron Yields in Iran
NASA Astrophysics Data System (ADS)
S. H, Sanaeinejad; S. N, Hosseini
Saffron is an important crop in social and economical aspects in Khorassan Province (Northeast of Iran). In this research wetried to evaluate trends of saffron yield in recent years and to study the relationship between saffron yield and the climate change. A regression analysis was used to predict saffron yield based on 20 years of yield data in Birjand, Ghaen and Ferdows cities.Climatologically data for the same periods was provided by database of Khorassan Climatology Center. Climatologically data includedtemperature, rainfall, relative humidity and sunshine hours for ModelI, and temperature and rainfall for Model II. The results showed the coefficients of determination for Birjand, Ferdows and Ghaen for Model I were 0.69, 0.50 and 0.81 respectively. Also coefficients of determination for the same cities for model II were 0.53, 0.50 and 0.72 respectively. Multiple regression analysisindicated that among weather variables, temperature was the key parameter for variation ofsaffron yield. It was concluded that increasing temperature at spring was the main cause of declined saffron yield during recent years across the province. Finally, yield trend was predicted for the last 5 years using time series analysis.
Identification of high leverage points in binary logistic regression
NASA Astrophysics Data System (ADS)
Fitrianto, Anwar; Wendy, Tham
2016-10-01
Leverage points are those which measures uncommon observations in x space of regression diagnostics. Detection of high leverage points plays a vital role because it is responsible in masking outlier. In regression, high observations which made at extreme in the space of explanatory variables and they are far apart from the average of the data are called as leverage points. In this project, a method for identification of high leverage point in logistic regression was shown using numerical example. We investigate the effects of high leverage point in the logistic regression model. The comparison of the result in the model with and without leverage model is being discussed. Some graphical analyses based on the result of the analysis are presented. We found that the presence of HLP have effect on the hii, estimated probability, estimated coefficients, p-value of variable, odds ratio and regression equation.
NASA Astrophysics Data System (ADS)
Liu, Pudong; Shi, Runhe; Wang, Hong; Bai, Kaixu; Gao, Wei
2014-10-01
Leaf pigments are key elements for plant photosynthesis and growth. Traditional manual sampling of these pigments is labor-intensive and costly, which also has the difficulty in capturing their temporal and spatial characteristics. The aim of this work is to estimate photosynthetic pigments at large scale by remote sensing. For this purpose, inverse model were proposed with the aid of stepwise multiple linear regression (SMLR) analysis. Furthermore, a leaf radiative transfer model (i.e. PROSPECT model) was employed to simulate the leaf reflectance where wavelength varies from 400 to 780 nm at 1 nm interval, and then these values were treated as the data from remote sensing observations. Meanwhile, simulated chlorophyll concentration (Cab), carotenoid concentration (Car) and their ratio (Cab/Car) were taken as target to build the regression model respectively. In this study, a total of 4000 samples were simulated via PROSPECT with different Cab, Car and leaf mesophyll structures as 70% of these samples were applied for training while the last 30% for model validation. Reflectance (r) and its mathematic transformations (1/r and log (1/r)) were all employed to build regression model respectively. Results showed fair agreements between pigments and simulated reflectance with all adjusted coefficients of determination (R2) larger than 0.8 as 6 wavebands were selected to build the SMLR model. The largest value of R2 for Cab, Car and Cab/Car are 0.8845, 0.876 and 0.8765, respectively. Meanwhile, mathematic transformations of reflectance showed little influence on regression accuracy. We concluded that it was feasible to estimate the chlorophyll and carotenoids and their ratio based on statistical model with leaf reflectance data.
Evaluating Differential Effects Using Regression Interactions and Regression Mixture Models
ERIC Educational Resources Information Center
Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung
2015-01-01
Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their…
Comparison of regression methods for modeling intensive care length of stay.
Verburg, Ilona W M; de Keizer, Nicolette F; de Jonge, Evert; Peek, Niels
2014-01-01
Intensive care units (ICUs) are increasingly interested in assessing and improving their performance. ICU Length of Stay (LoS) could be seen as an indicator for efficiency of care. However, little consensus exists on which prognostic method should be used to adjust ICU LoS for case-mix factors. This study compared the performance of different regression models when predicting ICU LoS. We included data from 32,667 unplanned ICU admissions to ICUs participating in the Dutch National Intensive Care Evaluation (NICE) in the year 2011. We predicted ICU LoS using eight regression models: ordinary least squares regression on untransformed ICU LoS,LoS truncated at 30 days and log-transformed LoS; a generalized linear model with a Gaussian distribution and a logarithmic link function; Poisson regression; negative binomial regression; Gamma regression with a logarithmic link function; and the original and recalibrated APACHE IV model, for all patients together and for survivors and non-survivors separately. We assessed the predictive performance of the models using bootstrapping and the squared Pearson correlation coefficient (R2), root mean squared prediction error (RMSPE), mean absolute prediction error (MAPE) and bias. The distribution of ICU LoS was skewed to the right with a median of 1.7 days (interquartile range 0.8 to 4.0) and a mean of 4.2 days (standard deviation 7.9). The predictive performance of the models was between 0.09 and 0.20 for R2, between 7.28 and 8.74 days for RMSPE, between 3.00 and 4.42 days for MAPE and between -2.99 and 1.64 days for bias. The predictive performance was slightly better for survivors than for non-survivors. We were disappointed in the predictive performance of the regression models and conclude that it is difficult to predict LoS of unplanned ICU admissions using patient characteristics at admission time only.
Galloway, Joel M.
2014-01-01
The Red River of the North (hereafter referred to as “Red River”) Basin is an important hydrologic region where water is a valuable resource for the region’s economy. Continuous water-quality monitors have been operated by the U.S. Geological Survey, in cooperation with the North Dakota Department of Health, Minnesota Pollution Control Agency, City of Fargo, City of Moorhead, City of Grand Forks, and City of East Grand Forks at the Red River at Fargo, North Dakota, from 2003 through 2012 and at Grand Forks, N.Dak., from 2007 through 2012. The purpose of the monitoring was to provide a better understanding of the water-quality dynamics of the Red River and provide a way to track changes in water quality. Regression equations were developed that can be used to estimate concentrations and loads for dissolved solids, sulfate, chloride, nitrate plus nitrite, total phosphorus, and suspended sediment using explanatory variables such as streamflow, specific conductance, and turbidity. Specific conductance was determined to be a significant explanatory variable for estimating dissolved solids concentrations at the Red River at Fargo and Grand Forks. The regression equations provided good relations between dissolved solid concentrations and specific conductance for the Red River at Fargo and at Grand Forks, with adjusted coefficients of determination of 0.99 and 0.98, respectively. Specific conductance, log-transformed streamflow, and a seasonal component were statistically significant explanatory variables for estimating sulfate in the Red River at Fargo and Grand Forks. Regression equations provided good relations between sulfate concentrations and the explanatory variables, with adjusted coefficients of determination of 0.94 and 0.89, respectively. For the Red River at Fargo and Grand Forks, specific conductance, streamflow, and a seasonal component were statistically significant explanatory variables for estimating chloride. For the Red River at Grand Forks, a time
Teaching Students Not to Dismiss the Outermost Observations in Regressions
ERIC Educational Resources Information Center
Kasprowicz, Tomasz; Musumeci, Jim
2015-01-01
One econometric rule of thumb is that greater dispersion in observations of the independent variable improves estimates of regression coefficients and therefore produces better results, i.e., lower standard errors of the estimates. Nevertheless, students often seem to mistrust precisely the observations that contribute the most to this greater…
Enhance-Synergism and Suppression Effects in Multiple Regression
ERIC Educational Resources Information Center
Lipovetsky, Stan; Conklin, W. Michael
2004-01-01
Relations between pairwise correlations and the coefficient of multiple determination in regression analysis are considered. The conditions for the occurrence of enhance-synergism and suppression effects when multiple determination becomes bigger than the total of squared correlations of the dependent variable with the regressors are discussed. It…
On the misinterpretation of the correlation coefficient in pharmaceutical sciences.
Sonnergaard, J M
2006-09-14
The correlation coefficient is often used and more often misused as a universal parameter expressing the quality in linear regression analysis. The popularity of this dimensionless quantity is evident as it is easy to communicate and considered to be unproblematic to comprehend. However, illustrative examples will demonstrate that the correlation coefficient is highly ineffective as a stand-alone quantity without reference to the number of observations, the pattern of the data and the slope of the regression line. Much more efficient quality methodologies are available where the correct technique depends on the purpose of the investigation. These relevant and precise methods in quality assurance of linear regression as alternative to the correlation coefficient are presented.
Variable Selection in Semiparametric Regression Modeling.
Li, Runze; Liang, Hua
2008-01-01
In this paper, we are concerned with how to select significant variables in semiparametric modeling. Variable selection for semiparametric regression models consists of two components: model selection for nonparametric components and select significant variables for parametric portion. Thus, it is much more challenging than that for parametric models such as linear models and generalized linear models because traditional variable selection procedures including stepwise regression and the best subset selection require model selection to nonparametric components for each submodel. This leads to very heavy computational burden. In this paper, we propose a class of variable selection procedures for semiparametric regression models using nonconcave penalized likelihood. The newly proposed procedures are distinguished from the traditional ones in that they delete insignificant variables and estimate the coefficients of significant variables simultaneously. This allows us to establish the sampling properties of the resulting estimate. We first establish the rate of convergence of the resulting estimate. With proper choices of penalty functions and regularization parameters, we then establish the asymptotic normality of the resulting estimate, and further demonstrate that the proposed procedures perform as well as an oracle procedure. Semiparametric generalized likelihood ratio test is proposed to select significant variables in the nonparametric component. We investigate the asymptotic behavior of the proposed test and demonstrate its limiting null distribution follows a chi-squared distribution, which is independent of the nuisance parameters. Extensive Monte Carlo simulation studies are conducted to examine the finite sample performance of the proposed variable selection procedures.
Logistic Regression: Concept and Application
ERIC Educational Resources Information Center
Cokluk, Omay
2010-01-01
The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…
NASA Technical Reports Server (NTRS)
Jacobsen, R. T.; Stewart, R. B.; Crain, R. W., Jr.; Rose, G. L.; Myers, A. F.
1976-01-01
A method was developed for establishing a rational choice of the terms to be included in an equation of state with a large number of adjustable coefficients. The methods presented were developed for use in the determination of an equation of state for oxygen and nitrogen. However, a general application of the methods is possible in studies involving the determination of an optimum polynomial equation for fitting a large number of data points. The data considered in the least squares problem are experimental thermodynamic pressure-density-temperature data. Attention is given to a description of stepwise multiple regression and the use of stepwise regression in the determination of an equation of state for oxygen and nitrogen.
Estimation of octanol/water partition coefficients using LSER parameters
Luehrs, Dean C.; Hickey, James P.; Godbole, Kalpana A.; Rogers, Tony N.
1998-01-01
The logarithms of octanol/water partition coefficients, logKow, were regressed against the linear solvation energy relationship (LSER) parameters for a training set of 981 diverse organic chemicals. The standard deviation for logKow was 0.49. The regression equation was then used to estimate logKow for a test of 146 chemicals which included pesticides and other diverse polyfunctional compounds. Thus the octanol/water partition coefficient may be estimated by LSER parameters without elaborate software but only moderate accuracy should be expected.
Williams-Sether, Tara; Gross, Tara A.
2016-02-09
Seasonal mean daily flow data from 119 U.S. Geological Survey streamflow-gaging stations in North Dakota; the surrounding states of Montana, Minnesota, and South Dakota; and the Canadian provinces of Manitoba and Saskatchewan with 10 or more years of unregulated flow record were used to develop regression equations for flow duration, n-day high flow and n-day low flow using ordinary least-squares and Tobit regression techniques. Regression equations were developed for seasonal flow durations at the 10th, 25th, 50th, 75th, and 90th percent exceedances; the 1-, 7-, and 30-day seasonal mean high flows for the 10-, 25-, and 50-year recurrence intervals; and the 1-, 7-, and 30-day seasonal mean low flows for the 2-, 5-, and 10-year recurrence intervals. Basin and climatic characteristics determined to be significant explanatory variables in one or more regression equations included drainage area, percentage of basin drainage area that drains to isolated lakes and ponds, ruggedness number, stream length, basin compactness ratio, minimum basin elevation, precipitation, slope ratio, stream slope, and soil permeability. The adjusted coefficient of determination for the n-day high-flow regression equations ranged from 55.87 to 94.53 percent. The Chi2 values for the duration regression equations ranged from 13.49 to 117.94, whereas the Chi2 values for the n-day low-flow regression equations ranged from 4.20 to 49.68.
Profile local linear estimation of generalized semiparametric regression model for longitudinal data
Sun, Liuquan; Zhou, Jie
2013-01-01
This paper studies the generalized semiparametric regression model for longitudinal data where the covariate effects are constant for some and time-varying for others. Different link functions can be used to allow more flexible modelling of longitudinal data. The nonparametric components of the model are estimated using a local linear estimating equation and the parametric components are estimated through a profile estimating function. The method automatically adjusts for heterogeneity of sampling times, allowing the sampling strategy to depend on the past sampling history as well as possibly time-dependent covariates without specifically model such dependence. A K -fold cross-validation bandwidth selection is proposed as a working tool for locating an appropriate bandwidth. A criteria for selecting the link function is proposed to provide better fit of the data. Large sample properties of the proposed estimators are investigated. Large sample pointwise and simultaneous confidence intervals for the regression coefficients are constructed. Formal hypothesis testing procedures are proposed to check for the covariate effects and whether the effects are time-varying. A simulation study is conducted to examine the finite sample performances of the proposed estimation and hypothesis testing procedures. The methods are illustrated with a data example. PMID:23471814
Parisi Kern, Andrea; Ferreira Dias, Michele; Piva Kulakowski, Marlova; Paulo Gomes, Luciana
2015-05-01
Reducing construction waste is becoming a key environmental issue in the construction industry. The quantification of waste generation rates in the construction sector is an invaluable management tool in supporting mitigation actions. However, the quantification of waste can be a difficult process because of the specific characteristics and the wide range of materials used in different construction projects. Large variations are observed in the methods used to predict the amount of waste generated because of the range of variables involved in construction processes and the different contexts in which these methods are employed. This paper proposes a statistical model to determine the amount of waste generated in the construction of high-rise buildings by assessing the influence of design process and production system, often mentioned as the major culprits behind the generation of waste in construction. Multiple regression was used to conduct a case study based on multiple sources of data of eighteen residential buildings. The resulting statistical model produced dependent (i.e. amount of waste generated) and independent variables associated with the design and the production system used. The best regression model obtained from the sample data resulted in an adjusted R(2) value of 0.694, which means that it predicts approximately 69% of the factors involved in the generation of waste in similar constructions. Most independent variables showed a low determination coefficient when assessed in isolation, which emphasizes the importance of assessing their joint influence on the response (dependent) variable.
Weaver, Virginia M.; Vargas, Gonzalo García; Silbergeld, Ellen K.; Rothenberg, Stephen J.; Fadrowski, Jeffrey J.; Rubio-Andrade, Marisela; Parsons, Patrick J.; Steuerwald, Amy J.; and others
2014-07-15
Positive associations between urine toxicant levels and measures of glomerular filtration rate (GFR) have been reported recently in a range of populations. The explanation for these associations, in a direction opposite that of traditional nephrotoxicity, is uncertain. Variation in associations by urine concentration adjustment approach has also been observed. Associations of urine cadmium, thallium and uranium in models of serum creatinine- and cystatin-C-based estimated GFR (eGFR) were examined using multiple linear regression in a cross-sectional study of adolescents residing near a lead smelter complex. Urine concentration adjustment approaches compared included urine creatinine, urine osmolality and no adjustment. Median age, blood lead and urine cadmium, thallium and uranium were 13.9 years, 4.0 μg/dL, 0.22, 0.27 and 0.04 g/g creatinine, respectively, in 512 adolescents. Urine cadmium and thallium were positively associated with serum creatinine-based eGFR only when urine creatinine was used to adjust for urine concentration (β coefficient=3.1 mL/min/1.73 m{sup 2}; 95% confidence interval=1.4, 4.8 per each doubling of urine cadmium). Weaker positive associations, also only with urine creatinine adjustment, were observed between these metals and serum cystatin-C-based eGFR and between urine uranium and serum creatinine-based eGFR. Additional research using non-creatinine-based methods of adjustment for urine concentration is necessary. - Highlights: • Positive associations between urine metals and creatinine-based eGFR are unexpected. • Optimal approach to urine concentration adjustment for urine biomarkers uncertain. • We compared urine concentration adjustment methods. • Positive associations observed only with urine creatinine adjustment. • Additional research using non-creatinine-based methods of adjustment needed.
An improved multiple linear regression and data analysis computer program package
NASA Technical Reports Server (NTRS)
Sidik, S. M.
1972-01-01
NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.
Semisupervised Clustering by Iterative Partition and Regression with Neuroscience Applications.
Qian, Guoqi; Wu, Yuehua; Ferrari, Davide; Qiao, Puxue; Hollande, Frédéric
2016-01-01
Regression clustering is a mixture of unsupervised and supervised statistical learning and data mining method which is found in a wide range of applications including artificial intelligence and neuroscience. It performs unsupervised learning when it clusters the data according to their respective unobserved regression hyperplanes. The method also performs supervised learning when it fits regression hyperplanes to the corresponding data clusters. Applying regression clustering in practice requires means of determining the underlying number of clusters in the data, finding the cluster label of each data point, and estimating the regression coefficients of the model. In this paper, we review the estimation and selection issues in regression clustering with regard to the least squares and robust statistical methods. We also provide a model selection based technique to determine the number of regression clusters underlying the data. We further develop a computing procedure for regression clustering estimation and selection. Finally, simulation studies are presented for assessing the procedure, together with analyzing a real data set on RGB cell marking in neuroscience to illustrate and interpret the method.
Semisupervised Clustering by Iterative Partition and Regression with Neuroscience Applications
Qian, Guoqi; Wu, Yuehua; Ferrari, Davide; Qiao, Puxue; Hollande, Frédéric
2016-01-01
Regression clustering is a mixture of unsupervised and supervised statistical learning and data mining method which is found in a wide range of applications including artificial intelligence and neuroscience. It performs unsupervised learning when it clusters the data according to their respective unobserved regression hyperplanes. The method also performs supervised learning when it fits regression hyperplanes to the corresponding data clusters. Applying regression clustering in practice requires means of determining the underlying number of clusters in the data, finding the cluster label of each data point, and estimating the regression coefficients of the model. In this paper, we review the estimation and selection issues in regression clustering with regard to the least squares and robust statistical methods. We also provide a model selection based technique to determine the number of regression clusters underlying the data. We further develop a computing procedure for regression clustering estimation and selection. Finally, simulation studies are presented for assessing the procedure, together with analyzing a real data set on RGB cell marking in neuroscience to illustrate and interpret the method. PMID:27212939
Remotely Adjustable Hydraulic Pump
NASA Technical Reports Server (NTRS)
Kouns, H. H.; Gardner, L. D.
1987-01-01
Outlet pressure adjusted to match varying loads. Electrohydraulic servo has positioned sleeve in leftmost position, adjusting outlet pressure to maximum value. Sleeve in equilibrium position, with control land covering control port. For lowest pressure setting, sleeve shifted toward right by increased pressure on sleeve shoulder from servovalve. Pump used in aircraft and robots, where hydraulic actuators repeatedly turned on and off, changing pump load frequently and over wide range.
NASA Technical Reports Server (NTRS)
Ashby, George C., Jr.; Robbins, W. Eugene; Horsley, Lewis A.
1991-01-01
Probe readily positionable in core of uniform flow in hypersonic wind tunnel. Formed of pair of mating cylindrical housings: transducer housing and pitot-tube housing. Pitot tube supported by adjustable wedge fairing attached to top of pitot-tube housing with semicircular foot. Probe adjusted both radially and circumferentially. In addition, pressure-sensing transducer cooled internally by water or other cooling fluid passing through annulus of cooling system.
Weighted triangulation adjustment
Anderson, Walter L.
1969-01-01
The variation of coordinates method is employed to perform a weighted least squares adjustment of horizontal survey networks. Geodetic coordinates are required for each fixed and adjustable station. A preliminary inverse geodetic position computation is made for each observed line. Weights associated with each observed equation for direction, azimuth, and distance are applied in the formation of the normal equations in-the least squares adjustment. The number of normal equations that may be solved is twice the number of new stations and less than 150. When the normal equations are solved, shifts are produced at adjustable stations. Previously computed correction factors are applied to the shifts and a most probable geodetic position is found for each adjustable station. Pinal azimuths and distances are computed. These may be written onto magnetic tape for subsequent computation of state plane or grid coordinates. Input consists of punch cards containing project identification, program options, and position and observation information. Results listed include preliminary and final positions, residuals, observation equations, solution of the normal equations showing magnitudes of shifts, and a plot of each adjusted and fixed station. During processing, data sets containing irrecoverable errors are rejected and the type of error is listed. The computer resumes processing of additional data sets.. Other conditions cause warning-errors to be issued, and processing continues with the current data set.
2015-01-01
Land use regression (LUR) models have been used to assess air pollutant exposure, but limited evidence exists on whether location-specific LUR models are applicable to other locations (transferability) or general models are applicable to smaller areas (generalizability). We tested transferability and generalizability of spatial-temporal LUR models of hourly particle number concentration (PNC) for Boston-area (MA, U.S.A.) urban neighborhoods near Interstate 93. Four neighborhood-specific regression models and one Boston-area model were developed from mobile monitoring measurements (34–46 days/neighborhood over one year each). Transferability was tested by applying each neighborhood-specific model to the other neighborhoods; generalizability was tested by applying the Boston-area model to each neighborhood. Both the transferability and generalizability of models were tested with and without neighborhood-specific calibration. Important PNC predictors (adjusted-R2 = 0.24–0.43) included wind speed and direction, temperature, highway traffic volume, and distance from the highway edge. Direct model transferability was poor (R2 < 0.17). Locally-calibrated transferred models (R2 = 0.19–0.40) and the Boston-area model (adjusted-R2 = 0.26, range: 0.13–0.30) performed similarly to neighborhood-specific models; however, some coefficients of locally calibrated transferred models were uninterpretable. Our results show that transferability of neighborhood-specific LUR models of hourly PNC was limited, but that a general model performed acceptably in multiple areas when calibrated with local data. PMID:25867675
Regional regression of flood characteristics employing historical information
Tasker, Gary D.; Stedinger, J.R.
1987-01-01
Streamflow gauging networks provide hydrologic information for use in estimating the parameters of regional regression models. The regional regression models can be used to estimate flood statistics, such as the 100 yr peak, at ungauged sites as functions of drainage basin characteristics. A recent innovation in regional regression is the use of a generalized least squares (GLS) estimator that accounts for unequal station record lengths and sample cross correlation among the flows. However, this technique does not account for historical flood information. A method is proposed here to adjust this generalized least squares estimator to account for possible information about historical floods available at some stations in a region. The historical information is assumed to be in the form of observations of all peaks above a threshold during a long period outside the systematic record period. A Monte Carlo simulation experiment was performed to compare the GLS estimator adjusted for historical floods with the unadjusted GLS estimator and the ordinary least squares estimator. Results indicate that using the GLS estimator adjusted for historical information significantly improves the regression model. ?? 1987.
Weaver, Virginia M.; Vargas, Gonzalo García; Silbergeld, Ellen K.; Rothenberg, Stephen J.; Fadrowski, Jeffrey J.; Rubio-Andrade, Marisela; Parsons, Patrick J.; Steuerwald, Amy J.; Navas-Acien, Ana; Guallar, Eliseo
2014-01-01
Positive associations between urine toxicant levels and measures of glomerular filtration rate (GFR) have been reported recently in a range of populations. The explanation for these associations, in a direction opposite that of traditional nephrotoxicity, is uncertain. Variation in associations by urine concentration adjustment approach has also been observed. Associations of urine cadmium, thallium and uranium in models of serum creatinine- and cystatin-C-based estimated GFR (eGFR) were examined using multiple linear regression in a cross-sectional study of adolescents residing near a lead smelter complex. Urine concentration adjustment approaches compared included urine creatinine, urine osmolality and no adjustment. Median age, blood lead and urine cadmium, thallium and uranium were 13.9 years, 4.0 μg/dL, 0.22, 0.27 and 0.04 g/g creatinine, respectively, in 512 adolescents. Urine cadmium and thallium were positively associated with serum creatinine-based eGFR only when urine creatinine was used to adjust for urine concentration (β coefficient=3.1 mL/min/1.73 m2; 95% confidence interval=1.4, 4.8 per each doubling of urine cadmium). Weaker positive associations, also only with urine creatinine adjustment, were observed between these metals and serum cystatin-C-based eGFR and between urine uranium and serum creatinine-based eGFR. Additional research using non-creatinine-based methods of adjustment for urine concentration is necessary. PMID:24815335
Joint regression analysis of correlated data using Gaussian copulas.
Song, Peter X-K; Li, Mingyao; Yuan, Ying
2009-03-01
This article concerns a new joint modeling approach for correlated data analysis. Utilizing Gaussian copulas, we present a unified and flexible machinery to integrate separate one-dimensional generalized linear models (GLMs) into a joint regression analysis of continuous, discrete, and mixed correlated outcomes. This essentially leads to a multivariate analogue of the univariate GLM theory and hence an efficiency gain in the estimation of regression coefficients. The availability of joint probability models enables us to develop a full maximum likelihood inference. Numerical illustrations are focused on regression models for discrete correlated data, including multidimensional logistic regression models and a joint model for mixed normal and binary outcomes. In the simulation studies, the proposed copula-based joint model is compared to the popular generalized estimating equations, which is a moment-based estimating equation method to join univariate GLMs. Two real-world data examples are used in the illustration.
Multiple Regression and Its Discontents
ERIC Educational Resources Information Center
Snell, Joel C.; Marsh, Mitchell
2012-01-01
Multiple regression is part of a larger statistical strategy originated by Gauss. The authors raise questions about the theory and suggest some changes that would make room for Mandelbrot and Serendipity.
Recirculating valve lash adjuster
Stoody, R.R.
1987-02-24
This patent describes an internal combustion engine with a valve assembly of the type including overhead valves supported by a cylinder head for opening and closing movements in a substantially vertical direction and a rotatable overhead camshaft thereabove lubricated by engine oil pumped by an engine oil pump. A hydraulic lash adjuster with an internal reservoir therein is solely supplied with run-off lubricating oil from the camshaft which oil is pumped into the internal reservoir of the lash adjuster by self-pumping operation of the lash adjuster produced by lateral forces thereon by the rotative operation of the camshaft comprising: a housing of the lash adjuster including an axially extending bore therethrough with a lower wall means of the housing closing the lower end thereof; a first plunger member being closely slidably received in the bore of the housing and having wall means defining a fluid filled power chamber with the lower wall means of the housing; and a second plunger member of the lash adjuster having a portion being loosely slidably received and extending into the bore of the housing for reciprocation therein. Another portion extends upwardly from the housing to operatively receive alternating side-to-side force inputs from operation of the camshaft.
Incremental learning for ν-Support Vector Regression.
Gu, Bin; Sheng, Victor S; Wang, Zhijie; Ho, Derek; Osman, Said; Li, Shuo
2015-07-01
The ν-Support Vector Regression (ν-SVR) is an effective regression learning algorithm, which has the advantage of using a parameter ν on controlling the number of support vectors and adjusting the width of the tube automatically. However, compared to ν-Support Vector Classification (ν-SVC) (Schölkopf et al., 2000), ν-SVR introduces an additional linear term into its objective function. Thus, directly applying the accurate on-line ν-SVC algorithm (AONSVM) to ν-SVR will not generate an effective initial solution. It is the main challenge to design an incremental ν-SVR learning algorithm. To overcome this challenge, we propose a special procedure called initial adjustments in this paper. This procedure adjusts the weights of ν-SVC based on the Karush-Kuhn-Tucker (KKT) conditions to prepare an initial solution for the incremental learning. Combining the initial adjustments with the two steps of AONSVM produces an exact and effective incremental ν-SVR learning algorithm (INSVR). Theoretical analysis has proven the existence of the three key inverse matrices, which are the cornerstones of the three steps of INSVR (including the initial adjustments), respectively. The experiments on benchmark datasets demonstrate that INSVR can avoid the infeasible updating paths as far as possible, and successfully converges to the optimal solution. The results also show that INSVR is faster than batch ν-SVR algorithms with both cold and warm starts.
2011-01-01
Background Several regression models have been proposed for estimation of isometric joint torque using surface electromyography (SEMG) signals. Common issues related to torque estimation models are degradation of model accuracy with passage of time, electrode displacement, and alteration of limb posture. This work compares the performance of the most commonly used regression models under these circumstances, in order to assist researchers with identifying the most appropriate model for a specific biomedical application. Methods Eleven healthy volunteers participated in this study. A custom-built rig, equipped with a torque sensor, was used to measure isometric torque as each volunteer flexed and extended his wrist. SEMG signals from eight forearm muscles, in addition to wrist joint torque data were gathered during the experiment. Additional data were gathered one hour and twenty-four hours following the completion of the first data gathering session, for the purpose of evaluating the effects of passage of time and electrode displacement on accuracy of models. Acquired SEMG signals were filtered, rectified, normalized and then fed to models for training. Results It was shown that mean adjusted coefficient of determination (Ra2) values decrease between 20%-35% for different models after one hour while altering arm posture decreased mean Ra2 values between 64% to 74% for different models. Conclusions Model estimation accuracy drops significantly with passage of time, electrode displacement, and alteration of limb posture. Therefore model retraining is crucial for preserving estimation accuracy. Data resampling can significantly reduce model training time without losing estimation accuracy. Among the models compared, ordinary least squares linear regression model (OLS) was shown to have high isometric torque estimation accuracy combined with very short training times. PMID:21943179
Eugster, Patrick; Sennhauser, Michèle; Zweifel, Peter
2010-07-01
When premiums are community-rated, risk adjustment (RA) serves to mitigate competitive insurers' incentive to select favorable risks. However, unless fully prospective, it also undermines their incentives for efficiency. By capping its volume, one may try to counteract this tendency, exposing insurers to some financial risk. This in term runs counter the quest to refine the RA formula, which would increase RA volume. Specifically, the adjuster, "Hospitalization or living in a nursing home during the previous year" will be added in Switzerland starting 2012. This paper investigates how to minimize the opportunity cost of capping RA in terms of increased incentives for risk selection.
Quantile Regression Models for Current Status Data.
Ou, Fang-Shu; Zeng, Donglin; Cai, Jianwen
2016-11-01
Current status data arise frequently in demography, epidemiology, and econometrics where the exact failure time cannot be determined but is only known to have occurred before or after a known observation time. We propose a quantile regression model to analyze current status data, because it does not require distributional assumptions and the coefficients can be interpreted as direct regression effects on the distribution of failure time in the original time scale. Our model assumes that the conditional quantile of failure time is a linear function of covariates. We assume conditional independence between the failure time and observation time. An M-estimator is developed for parameter estimation which is computed using the concave-convex procedure and its confidence intervals are constructed using a subsampling method. Asymptotic properties for the estimator are derived and proven using modern empirical process theory. The small sample performance of the proposed method is demonstrated via simulation studies. Finally, we apply the proposed method to analyze data from the Mayo Clinic Study of Aging.
Panel regressions to estimate low-flow response to rainfall variability in ungaged basins
NASA Astrophysics Data System (ADS)
Bassiouni, Maoya; Vogel, Richard M.; Archfield, Stacey A.
2016-12-01
Multicollinearity and omitted-variable bias are major limitations to developing multiple linear regression models to estimate streamflow characteristics in ungaged areas and varying rainfall conditions. Panel regression is used to overcome limitations of traditional regression methods, and obtain reliable model coefficients, in particular to understand the elasticity of streamflow to rainfall. Using annual rainfall and selected basin characteristics at 86 gaged streams in the Hawaiian Islands, regional regression models for three stream classes were developed to estimate the annual low-flow duration discharges. Three panel-regression structures (random effects, fixed effects, and pooled) were compared to traditional regression methods, in which space is substituted for time. Results indicated that panel regression generally was able to reproduce the temporal behavior of streamflow and reduce the standard errors of model coefficients compared to traditional regression, even for models in which the unobserved heterogeneity between streams is significant and the variance inflation factor for rainfall is much greater than 10. This is because both spatial and temporal variability were better characterized in panel regression. In a case study, regional rainfall elasticities estimated from panel regressions were applied to ungaged basins on Maui, using available rainfall projections to estimate plausible changes in surface-water availability and usable stream habitat for native species. The presented panel-regression framework is shown to offer benefits over existing traditional hydrologic regression methods for developing robust regional relations to investigate streamflow response in a changing climate.
Gravitational Wave Emulation Using Gaussian Process Regression
NASA Astrophysics Data System (ADS)
Doctor, Zoheyr; Farr, Ben; Holz, Daniel
2017-01-01
Parameter estimation (PE) for gravitational wave signals from compact binary coalescences (CBCs) requires reliable template waveforms which span the parameter space. Waveforms from numerical relativity are accurate but computationally expensive, so approximate templates are typically used for PE. These `approximants', while quick to compute, can introduce systematic errors and bias PE results. We describe a machine learning method for generating CBC waveforms and uncertainties using existing accurate waveforms as a training set. Coefficients of a reduced order waveform model are computed and each treated as arising from a Gaussian process. These coefficients and their uncertainties are then interpolated using Gaussian process regression (GPR). As a proof of concept, we construct a training set of approximant waveforms (rather than NR waveforms) in the two-dimensional space of chirp mass and mass ratio and interpolate new waveforms with GPR. We demonstrate that the mismatch between interpolated waveforms and approximants is below the 1% level for an appropriate choice of training set and GPR kernel hyperparameters.
Nagai, Mika; Konno, Yoshihiro; Satsukawa, Masahiro; Yamashita, Shinji; Yoshinari, Kouichi
2016-08-01
Drug-drug interactions (DDIs) via cytochrome P450 (P450) induction are one clinical problem leading to increased risk of adverse effects and the need for dosage adjustments and additional therapeutic monitoring. In silico models for predicting P450 induction are useful for avoiding DDI risk. In this study, we have established regression models for CYP3A4 and CYP2B6 induction in human hepatocytes using several physicochemical parameters for a set of azole compounds with different P450 induction as characteristics as model compounds. To obtain a well-correlated regression model, the compounds for CYP3A4 or CYP2B6 induction were independently selected from the tested azole compounds using principal component analysis with fold-induction data. Both of the multiple linear regression models obtained for CYP3A4 and CYP2B6 induction are represented by different sets of physicochemical parameters. The adjusted coefficients of determination for these models were of 0.8 and 0.9, respectively. The fold-induction of the validation compounds, another set of 12 azole-containing compounds, were predicted within twofold limits for both CYP3A4 and CYP2B6. The concordance for the prediction of CYP3A4 induction was 87% with another validation set, 23 marketed drugs. However, the prediction of CYP2B6 induction tended to be overestimated for these marketed drugs. The regression models show that lipophilicity mostly contributes to CYP3A4 induction, whereas not only the lipophilicity but also the molecular polarity is important for CYP2B6 induction. Our regression models, especially that for CYP3A4 induction, might provide useful methods to avoid potent CYP3A4 or CYP2B6 inducers during the lead optimization stage without performing induction assays in human hepatocytes.
XRA image segmentation using regression
NASA Astrophysics Data System (ADS)
Jin, Jesse S.
1996-04-01
Segmentation is an important step in image analysis. Thresholding is one of the most important approaches. There are several difficulties in segmentation, such as automatic selecting threshold, dealing with intensity distortion and noise removal. We have developed an adaptive segmentation scheme by applying the Central Limit Theorem in regression. A Gaussian regression is used to separate the distribution of background from foreground in a single peak histogram. The separation will help to automatically determine the threshold. A small 3 by 3 widow is applied and the modal of the local histogram is used to overcome noise. Thresholding is based on local weighting, where regression is used again for parameter estimation. A connectivity test is applied to the final results to remove impulse noise. We have applied the algorithm to x-ray angiogram images to extract brain arteries. The algorithm works well for single peak distribution where there is no valley in the histogram. The regression provides a method to apply knowledge in clustering. Extending regression for multiple-level segmentation needs further investigation.
ERIC Educational Resources Information Center
Fong, Duncan K. H.; Ebbes, Peter; DeSarbo, Wayne S.
2012-01-01
Multiple regression is frequently used across the various social sciences to analyze cross-sectional data. However, it can often times be challenging to justify the assumption of common regression coefficients across all respondents. This manuscript presents a heterogeneous Bayesian regression model that enables the estimation of…
Psychological Adjustment and Homosexuality.
ERIC Educational Resources Information Center
Gonsiorek, John C.
In this paper, the diverse literature bearing on the topic of homosexuality and psychological adjustment is critically reviewed and synthesized. The first chapter discusses the most crucial methodological issue in this area, the problem of sampling. The kinds of samples used to date are critically examined, and some suggestions for improved…
NASA Technical Reports Server (NTRS)
1986-01-01
Corning Glass Works' Serengeti Driver sunglasses are unique in that their lenses self-adjust and filter light while suppressing glare. They eliminate more than 99% of the ultraviolet rays in sunlight. The frames are based on the NASA Anthropometric Source Book.
Hunter, Steven L.
2002-01-01
An inclinometer utilizing synchronous demodulation for high resolution and electronic offset adjustment provides a wide dynamic range without any moving components. A device encompassing a tiltmeter and accompanying electronic circuitry provides quasi-leveled tilt sensors that detect highly resolved tilt change without signal saturation.
Chen, Qiang; Mei, Kun; Dahlgren, Randy A; Wang, Ting; Gong, Jian; Zhang, Minghua
2016-12-01
As an important regulator of pollutants in overland flow and interflow, land use has become an essential research component for determining the relationships between surface water quality and pollution sources. This study investigated the use of ordinary least squares (OLS) and geographically weighted regression (GWR) models to identify the impact of land use and population density on surface water quality in the Wen-Rui Tang River watershed of eastern China. A manual variable excluding-selecting method was explored to resolve multicollinearity issues. Standard regression coefficient analysis coupled with cluster analysis was introduced to determine which variable had the greatest influence on water quality. Results showed that: (1) Impact of land use on water quality varied with spatial and seasonal scales. Both positive and negative effects for certain land-use indicators were found in different subcatchments. (2) Urban land was the dominant factor influencing N, P and chemical oxygen demand (COD) in highly urbanized regions, but the relationship was weak as the pollutants were mainly from point sources. Agricultural land was the primary factor influencing N and P in suburban and rural areas; the relationship was strong as the pollutants were mainly from agricultural surface runoff. Subcatchments located in suburban areas were identified with urban land as the primary influencing factor during the wet season while agricultural land was identified as a more prevalent influencing factor during the dry season. (3) Adjusted R(2) values in OLS models using the manual variable excluding-selecting method averaged 14.3% higher than using stepwise multiple linear regressions. However, the corresponding GWR models had adjusted R(2) ~59.2% higher than the optimal OLS models, confirming that GWR models demonstrated better prediction accuracy. Based on our findings, water resource protection policies should consider site-specific land-use conditions within each watershed to
Survival Data and Regression Models
NASA Astrophysics Data System (ADS)
Grégoire, G.
2014-12-01
We start this chapter by introducing some basic elements for the analysis of censored survival data. Then we focus on right censored data and develop two types of regression models. The first one concerns the so-called accelerated failure time models (AFT), which are parametric models where a function of a parameter depends linearly on the covariables. The second one is a semiparametric model, where the covariables enter in a multiplicative form in the expression of the hazard rate function. The main statistical tool for analysing these regression models is the maximum likelihood methodology and, in spite we recall some essential results about the ML theory, we refer to the chapter "Logistic Regression" for a more detailed presentation.
Regressive evolution in Astyanax cavefish.
Jeffery, William R
2009-01-01
A diverse group of animals, including members of most major phyla, have adapted to life in the perpetual darkness of caves. These animals are united by the convergence of two regressive phenotypes, loss of eyes and pigmentation. The mechanisms of regressive evolution are poorly understood. The teleost Astyanax mexicanus is of special significance in studies of regressive evolution in cave animals. This species includes an ancestral surface dwelling form and many con-specific cave-dwelling forms, some of which have evolved their recessive phenotypes independently. Recent advances in Astyanax development and genetics have provided new information about how eyes and pigment are lost during cavefish evolution; namely, they have revealed some of the molecular and cellular mechanisms involved in trait modification, the number and identity of the underlying genes and mutations, the molecular basis of parallel evolution, and the evolutionary forces driving adaptation to the cave environment.
Beyond Multiple Regression: Using Commonality Analysis to Better Understand R[superscript 2] Results
ERIC Educational Resources Information Center
Warne, Russell T.
2011-01-01
Multiple regression is one of the most common statistical methods used in quantitative educational research. Despite the versatility and easy interpretability of multiple regression, it has some shortcomings in the detection of suppressor variables and for somewhat arbitrarily assigning values to the structure coefficients of correlated…
A New Test of Linear Hypotheses in OLS Regression under Heteroscedasticity of Unknown Form
ERIC Educational Resources Information Center
Cai, Li; Hayes, Andrew F.
2008-01-01
When the errors in an ordinary least squares (OLS) regression model are heteroscedastic, hypothesis tests involving the regression coefficients can have Type I error rates that are far from the nominal significance level. Asymptotically, this problem can be rectified with the use of a heteroscedasticity-consistent covariance matrix (HCCM)…
Hidden Connections between Regression Models of Strain-Gage Balance Calibration Data
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert
2013-01-01
Hidden connections between regression models of wind tunnel strain-gage balance calibration data are investigated. These connections become visible whenever balance calibration data is supplied in its design format and both the Iterative and Non-Iterative Method are used to process the data. First, it is shown how the regression coefficients of the fitted balance loads of a force balance can be approximated by using the corresponding regression coefficients of the fitted strain-gage outputs. Then, data from the manual calibration of the Ames MK40 six-component force balance is chosen to illustrate how estimates of the regression coefficients of the fitted balance loads can be obtained from the regression coefficients of the fitted strain-gage outputs. The study illustrates that load predictions obtained by applying the Iterative or the Non-Iterative Method originate from two related regression solutions of the balance calibration data as long as balance loads are given in the design format of the balance, gage outputs behave highly linear, strict statistical quality metrics are used to assess regression models of the data, and regression model term combinations of the fitted loads and gage outputs can be obtained by a simple variable exchange.
Confidence Intervals for an Effect Size Measure in Multiple Linear Regression
ERIC Educational Resources Information Center
Algina, James; Keselman, H. J.; Penfield, Randall D.
2007-01-01
The increase in the squared multiple correlation coefficient ([Delta]R[squared]) associated with a variable in a regression equation is a commonly used measure of importance in regression analysis. The coverage probability that an asymptotic and percentile bootstrap confidence interval includes [Delta][rho][squared] was investigated. As expected,…
ERIC Educational Resources Information Center
Kane, Michael T.; Mroch, Andrew A.
2010-01-01
In evaluating the relationship between two measures across different groups (i.e., in evaluating "differential validity") it is necessary to examine differences in correlation coefficients and in regression lines. Ordinary least squares (OLS) regression is the standard method for fitting lines to data, but its criterion for optimal fit…
Parametrically guided estimation in nonparametric varying coefficient models with quasi-likelihood
Davenport, Clemontina A.; Maity, Arnab; Wu, Yichao
2015-01-01
Varying coefficient models allow us to generalize standard linear regression models to incorporate complex covariate effects by modeling the regression coefficients as functions of another covariate. For nonparametric varying coefficients, we can borrow the idea of parametrically guided estimation to improve asymptotic bias. In this paper, we develop a guided estimation procedure for the nonparametric varying coefficient models. Asymptotic properties are established for the guided estimators and a method of bandwidth selection via bias-variance tradeoff is proposed. We compare the performance of the guided estimator with that of the unguided estimator via both simulation and real data examples. PMID:26146469
Spatial regression analysis on 32 years of total column ozone data
NASA Astrophysics Data System (ADS)
Knibbe, J. S.; van der A, R. J.; de Laat, A. T. J.
2014-08-01
Multiple-regression analyses have been performed on 32 years of total ozone column data that was spatially gridded with a 1 × 1.5° resolution. The total ozone data consist of the MSR (Multi Sensor Reanalysis; 1979-2008) and 2 years of assimilated SCIAMACHY (SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY) ozone data (2009-2010). The two-dimensionality in this data set allows us to perform the regressions locally and investigate spatial patterns of regression coefficients and their explanatory power. Seasonal dependencies of ozone on regressors are included in the analysis. A new physically oriented model is developed to parameterize stratospheric ozone. Ozone variations on nonseasonal timescales are parameterized by explanatory variables describing the solar cycle, stratospheric aerosols, the quasi-biennial oscillation (QBO), El Niño-Southern Oscillation (ENSO) and stratospheric alternative halogens which are parameterized by the effective equivalent stratospheric chlorine (EESC). For several explanatory variables, seasonally adjusted versions of these explanatory variables are constructed to account for the difference in their effect on ozone throughout the year. To account for seasonal variation in ozone, explanatory variables describing the polar vortex, geopotential height, potential vorticity and average day length are included. Results of this regression model are compared to that of a similar analysis based on a more commonly applied statistically oriented model. The physically oriented model provides spatial patterns in the regression results for each explanatory variable. The EESC has a significant depleting effect on ozone at mid- and high latitudes, the solar cycle affects ozone positively mostly in the Southern Hemisphere, stratospheric aerosols affect ozone negatively at high northern latitudes, the effect of QBO is positive and negative in the tropics and mid- to high latitudes, respectively, and ENSO affects ozone negatively
Cactus: An Introduction to Regression
ERIC Educational Resources Information Center
Hyde, Hartley
2008-01-01
When the author first used "VisiCalc," the author thought it a very useful tool when he had the formulas. But how could he design a spreadsheet if there was no known formula for the quantities he was trying to predict? A few months later, the author relates he learned to use multiple linear regression software and suddenly it all clicked into…
Multiple Regression: A Leisurely Primer.
ERIC Educational Resources Information Center
Daniel, Larry G.; Onwuegbuzie, Anthony J.
Multiple regression is a useful statistical technique when the researcher is considering situations in which variables of interest are theorized to be multiply caused. It may also be useful in those situations in which the researchers is interested in studies of predictability of phenomena of interest. This paper provides an introduction to…
Weighting Regressions by Propensity Scores
ERIC Educational Resources Information Center
Freedman, David A.; Berk, Richard A.
2008-01-01
Regressions can be weighted by propensity scores in order to reduce bias. However, weighting is likely to increase random error in the estimates, and to bias the estimated standard errors downward, even when selection mechanisms are well understood. Moreover, in some cases, weighting will increase the bias in estimated causal parameters. If…
Quantile Regression with Censored Data
ERIC Educational Resources Information Center
Lin, Guixian
2009-01-01
The Cox proportional hazards model and the accelerated failure time model are frequently used in survival data analysis. They are powerful, yet have limitation due to their model assumptions. Quantile regression offers a semiparametric approach to model data with possible heterogeneity. It is particularly powerful for censored responses, where the…
Coefficient Alpha: A Reliability Coefficient for the 21st Century?
ERIC Educational Resources Information Center
Yang, Yanyun; Green, Samuel B.
2011-01-01
Coefficient alpha is almost universally applied to assess reliability of scales in psychology. We argue that researchers should consider alternatives to coefficient alpha. Our preference is for structural equation modeling (SEM) estimates of reliability because they are informative and allow for an empirical evaluation of the assumptions…
Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors
Woodard, Dawn B.; Crainiceanu, Ciprian; Ruppert, David
2013-01-01
We propose a new method for regression using a parsimonious and scientifically interpretable representation of functional predictors. Our approach is designed for data that exhibit features such as spikes, dips, and plateaus whose frequency, location, size, and shape varies stochastically across subjects. We propose Bayesian inference of the joint functional and exposure models, and give a method for efficient computation. We contrast our approach with existing state-of-the-art methods for regression with functional predictors, and show that our method is more effective and efficient for data that include features occurring at varying locations. We apply our methodology to a large and complex dataset from the Sleep Heart Health Study, to quantify the association between sleep characteristics and health outcomes. Software and technical appendices are provided in online supplemental materials. PMID:24293988
NASA Technical Reports Server (NTRS)
Malin, Jane T.; Schrenkenghost, Debra K.
2001-01-01
The Adjustable Autonomy Testbed (AAT) is a simulation-based testbed located in the Intelligent Systems Laboratory in the Automation, Robotics and Simulation Division at NASA Johnson Space Center. The purpose of the testbed is to support evaluation and validation of prototypes of adjustable autonomous agent software for control and fault management for complex systems. The AA T project has developed prototype adjustable autonomous agent software and human interfaces for cooperative fault management. This software builds on current autonomous agent technology by altering the architecture, components and interfaces for effective teamwork between autonomous systems and human experts. Autonomous agents include a planner, flexible executive, low level control and deductive model-based fault isolation. Adjustable autonomy is intended to increase the flexibility and effectiveness of fault management with an autonomous system. The test domain for this work is control of advanced life support systems for habitats for planetary exploration. The CONFIG hybrid discrete event simulation environment provides flexible and dynamically reconfigurable models of the behavior of components and fluids in the life support systems. Both discrete event and continuous (discrete time) simulation are supported, and flows and pressures are computed globally. This provides fast dynamic simulations of interacting hardware systems in closed loops that can be reconfigured during operations scenarios, producing complex cascading effects of operations and failures. Current object-oriented model libraries support modeling of fluid systems, and models have been developed of physico-chemical and biological subsystems for processing advanced life support gases. In FY01, water recovery system models will be developed.
Cutburth, Ronald W.; Silva, Leonard L.
1988-01-01
An improved mounting stage of the type used for the detection of laser beams is disclosed. A stage center block is mounted on each of two opposite sides by a pair of spaced ball bearing tracks which provide stability as well as simplicity. The use of the spaced ball bearing pairs in conjunction with an adjustment screw which also provides support eliminates extraneous stabilization components and permits maximization of the area of the center block laser transmission hole.
The Impact of Nonignorable Missing Data on the Inference of Regression Coefficients.
ERIC Educational Resources Information Center
Min, Kyung-Seok; Frank, Kenneth A.
Various statistical methods have been available to deal with missing data problems, but the difficulty is that they are based on somewhat restrictive assumptions that missing patterns are known or can be modeled with auxiliary information. This paper treats the presence of missing cases from the viewpoint that generalization as a sample does not…
ERIC Educational Resources Information Center
Wilson, Pamela W.; And Others
The purpose of this study was to present an empirical correction of the KR21 (Kuder Richardson test reliability) formula that not only yields a closer approximation to the numerical value of the KR20 without overestimation, but also simplifies computation. This correction was accomplished by introducing several correction factors to the numerator…
Direction of Effects in Multiple Linear Regression Models.
Wiedermann, Wolfgang; von Eye, Alexander
2015-01-01
Previous studies analyzed asymmetric properties of the Pearson correlation coefficient using higher than second order moments. These asymmetric properties can be used to determine the direction of dependence in a linear regression setting (i.e., establish which of two variables is more likely to be on the outcome side) within the framework of cross-sectional observational data. Extant approaches are restricted to the bivariate regression case. The present contribution extends the direction of dependence methodology to a multiple linear regression setting by analyzing distributional properties of residuals of competing multiple regression models. It is shown that, under certain conditions, the third central moments of estimated regression residuals can be used to decide upon direction of effects. In addition, three different approaches for statistical inference are discussed: a combined D'Agostino normality test, a skewness difference test, and a bootstrap difference test. Type I error and power of the procedures are assessed using Monte Carlo simulations, and an empirical example is provided for illustrative purposes. In the discussion, issues concerning the quality of psychological data, possible extensions of the proposed methods to the fourth central moment of regression residuals, and potential applications are addressed.
Regression Verification Using Impact Summaries
NASA Technical Reports Server (NTRS)
Backes, John; Person, Suzette J.; Rungta, Neha; Thachuk, Oksana
2013-01-01
Regression verification techniques are used to prove equivalence of syntactically similar programs. Checking equivalence of large programs, however, can be computationally expensive. Existing regression verification techniques rely on abstraction and decomposition techniques to reduce the computational effort of checking equivalence of the entire program. These techniques are sound but not complete. In this work, we propose a novel approach to improve scalability of regression verification by classifying the program behaviors generated during symbolic execution as either impacted or unimpacted. Our technique uses a combination of static analysis and symbolic execution to generate summaries of impacted program behaviors. The impact summaries are then checked for equivalence using an o-the-shelf decision procedure. We prove that our approach is both sound and complete for sequential programs, with respect to the depth bound of symbolic execution. Our evaluation on a set of sequential C artifacts shows that reducing the size of the summaries can help reduce the cost of software equivalence checking. Various reduction, abstraction, and compositional techniques have been developed to help scale software verification techniques to industrial-sized systems. Although such techniques have greatly increased the size and complexity of systems that can be checked, analysis of large software systems remains costly. Regression analysis techniques, e.g., regression testing [16], regression model checking [22], and regression verification [19], restrict the scope of the analysis by leveraging the differences between program versions. These techniques are based on the idea that if code is checked early in development, then subsequent versions can be checked against a prior (checked) version, leveraging the results of the previous analysis to reduce analysis cost of the current version. Regression verification addresses the problem of proving equivalence of closely related program
Psychosocial adjustment to ALS: a longitudinal study.
Matuz, Tamara; Birbaumer, Niels; Hautzinger, Martin; Kübler, Andrea
2015-01-01
For the current study the Lazarian stress-coping theory and the appendant model of psychosocial adjustment to chronic illness and disabilities (Pakenham, 1999) has shaped the foundation for identifying determinants of adjustment to ALS. We aimed to investigate the evolution of psychosocial adjustment to ALS and to determine its long-term predictors. A longitudinal study design with four measurement time points was therefore, used to assess patients' quality of life, depression, and stress-coping model related aspects, such as illness characteristics, social support, cognitive appraisals, and coping strategies during a period of 2 years. Regression analyses revealed that 55% of the variance of severity of depressive symptoms and 47% of the variance in quality of life at T2 was accounted for by all the T1 predictor variables taken together. On the level of individual contributions, protective buffering, and appraisal of own coping potential accounted for a significant percentage in the variance in severity of depressive symptoms, whereas problem management coping strategies explained variance in quality of life scores. Illness characteristics at T2 did not explain any variance of both adjustment outcomes. Overall, the pattern of the longitudinal results indicated stable depressive symptoms and quality of life indices reflecting a successful adjustment to the disease across four measurement time points during a period of about two years. Empirical evidence is provided for the predictive value of social support, cognitive appraisals, and coping strategies, but not illness parameters such as severity and duration for adaptation to ALS. The current study contributes to a better conceptualization of adjustment, allowing us to provide evidence-based support beyond medical and physical intervention for people with ALS.
Psychosocial adjustment to ALS: a longitudinal study
Matuz, Tamara; Birbaumer, Niels; Hautzinger, Martin; Kübler, Andrea
2015-01-01
For the current study the Lazarian stress-coping theory and the appendant model of psychosocial adjustment to chronic illness and disabilities (Pakenham, 1999) has shaped the foundation for identifying determinants of adjustment to ALS. We aimed to investigate the evolution of psychosocial adjustment to ALS and to determine its long-term predictors. A longitudinal study design with four measurement time points was therefore, used to assess patients' quality of life, depression, and stress-coping model related aspects, such as illness characteristics, social support, cognitive appraisals, and coping strategies during a period of 2 years. Regression analyses revealed that 55% of the variance of severity of depressive symptoms and 47% of the variance in quality of life at T2 was accounted for by all the T1 predictor variables taken together. On the level of individual contributions, protective buffering, and appraisal of own coping potential accounted for a significant percentage in the variance in severity of depressive symptoms, whereas problem management coping strategies explained variance in quality of life scores. Illness characteristics at T2 did not explain any variance of both adjustment outcomes. Overall, the pattern of the longitudinal results indicated stable depressive symptoms and quality of life indices reflecting a successful adjustment to the disease across four measurement time points during a period of about two years. Empirical evidence is provided for the predictive value of social support, cognitive appraisals, and coping strategies, but not illness parameters such as severity and duration for adaptation to ALS. The current study contributes to a better conceptualization of adjustment, allowing us to provide evidence-based support beyond medical and physical intervention for people with ALS. PMID:26441696
Tensor Regression with Applications in Neuroimaging Data Analysis
Zhou, Hua; Li, Lexin; Zhu, Hongtu
2013-01-01
Classical regression methods treat covariates as a vector and estimate a corresponding vector of regression coefficients. Modern applications in medical imaging generate covariates of more complex form such as multidimensional arrays (tensors). Traditional statistical and computational methods are proving insufficient for analysis of these high-throughput data due to their ultrahigh dimensionality as well as complex structure. In this article, we propose a new family of tensor regression models that efficiently exploit the special structure of tensor covariates. Under this framework, ultrahigh dimensionality is reduced to a manageable level, resulting in efficient estimation and prediction. A fast and highly scalable estimation algorithm is proposed for maximum likelihood estimation and its associated asymptotic properties are studied. Effectiveness of the new methods is demonstrated on both synthetic and real MRI imaging data. PMID:24791032
Spatial regression with covariate measurement error: A semiparametric approach.
Huque, Md Hamidul; Bondell, Howard D; Carroll, Raymond J; Ryan, Louise M
2016-09-01
Spatial data have become increasingly common in epidemiology and public health research thanks to advances in GIS (Geographic Information Systems) technology. In health research, for example, it is common for epidemiologists to incorporate geographically indexed data into their studies. In practice, however, the spatially defined covariates are often measured with error. Naive estimators of regression coefficients are attenuated if measurement error is ignored. Moreover, the classical measurement error theory is inapplicable in the context of spatial modeling because of the presence of spatial correlation among the observations. We propose a semiparametric regression approach to obtain bias-corrected estimates of regression parameters and derive their large sample properties. We evaluate the performance of the proposed method through simulation studies and illustrate using data on Ischemic Heart Disease (IHD). Both simulation and practical application demonstrate that the proposed method can be effective in practice.
Multiple imputation for cure rate quantile regression with censored data.
Wu, Yuanshan; Yin, Guosheng
2017-03-01
The main challenge in the context of cure rate analysis is that one never knows whether censored subjects are cured or uncured, or whether they are susceptible or insusceptible to the event of interest. Considering the susceptible indicator as missing data, we propose a multiple imputation approach to cure rate quantile regression for censored data with a survival fraction. We develop an iterative algorithm to estimate the conditionally uncured probability for each subject. By utilizing this estimated probability and Bernoulli sample imputation, we can classify each subject as cured or uncured, and then employ the locally weighted method to estimate the quantile regression coefficients with only the uncured subjects. Repeating the imputation procedure multiple times and taking an average over the resultant estimators, we obtain consistent estimators for the quantile regression coefficients. Our approach relaxes the usual global linearity assumption, so that we can apply quantile regression to any particular quantile of interest. We establish asymptotic properties for the proposed estimators, including both consistency and asymptotic normality. We conduct simulation studies to assess the finite-sample performance of the proposed multiple imputation method and apply it to a lung cancer study as an illustration.
Combining biomarkers for classification with covariate adjustment.
Kim, Soyoung; Huang, Ying
2017-03-09
Combining multiple markers can improve classification accuracy compared with using a single marker. In practice, covariates associated with markers or disease outcome can affect the performance of a biomarker or biomarker combination in the population. The covariate-adjusted receiver operating characteristic (ROC) curve has been proposed as a tool to tease out the covariate effect in the evaluation of a single marker; this curve characterizes the classification accuracy solely because of the marker of interest. However, research on the effect of covariates on the performance of marker combinations and on how to adjust for the covariate effect when combining markers is still lacking. In this article, we examine the effect of covariates on classification performance of linear marker combinations and propose to adjust for covariates in combining markers by maximizing the nonparametric estimate of the area under the covariate-adjusted ROC curve. The proposed method provides a way to estimate the best linear biomarker combination that is robust to risk model assumptions underlying alternative regression-model-based methods. The proposed estimator is shown to be consistent and asymptotically normally distributed. We conduct simulations to evaluate the performance of our estimator in cohort and case/control designs and compare several different weighting strategies during estimation with respect to efficiency. Our estimator is also compared with alternative regression-model-based estimators or estimators that maximize the empirical area under the ROC curve, with respect to bias and efficiency. We apply the proposed method to a biomarker study from an human immunodeficiency virus vaccine trial. Copyright © 2017 John Wiley & Sons, Ltd.
3D Regression Heat Map Analysis of Population Study Data.
Klemm, Paul; Lawonn, Kai; Glaßer, Sylvia; Niemann, Uli; Hegenscheid, Katrin; Völzke, Henry; Preim, Bernhard
2016-01-01
Epidemiological studies comprise heterogeneous data about a subject group to define disease-specific risk factors. These data contain information (features) about a subject's lifestyle, medical status as well as medical image data. Statistical regression analysis is used to evaluate these features and to identify feature combinations indicating a disease (the target feature). We propose an analysis approach of epidemiological data sets by incorporating all features in an exhaustive regression-based analysis. This approach combines all independent features w.r.t. a target feature. It provides a visualization that reveals insights into the data by highlighting relationships. The 3D Regression Heat Map, a novel 3D visual encoding, acts as an overview of the whole data set. It shows all combinations of two to three independent features with a specific target disease. Slicing through the 3D Regression Heat Map allows for the detailed analysis of the underlying relationships. Expert knowledge about disease-specific hypotheses can be included into the analysis by adjusting the regression model formulas. Furthermore, the influences of features can be assessed using a difference view comparing different calculation results. We applied our 3D Regression Heat Map method to a hepatic steatosis data set to reproduce results from a data mining-driven analysis. A qualitative analysis was conducted on a breast density data set. We were able to derive new hypotheses about relations between breast density and breast lesions with breast cancer. With the 3D Regression Heat Map, we present a visual overview of epidemiological data that allows for the first time an interactive regression-based analysis of large feature sets with respect to a disease.
Interaction Models for Functional Regression
USSET, JOSEPH; STAICU, ANA-MARIA; MAITY, ARNAB
2015-01-01
A functional regression model with a scalar response and multiple functional predictors is proposed that accommodates two-way interactions in addition to their main effects. The proposed estimation procedure models the main effects using penalized regression splines, and the interaction effect by a tensor product basis. Extensions to generalized linear models and data observed on sparse grids or with measurement error are presented. A hypothesis testing procedure for the functional interaction effect is described. The proposed method can be easily implemented through existing software. Numerical studies show that fitting an additive model in the presence of interaction leads to both poor estimation performance and lost prediction power, while fitting an interaction model where there is in fact no interaction leads to negligible losses. The methodology is illustrated on the AneuRisk65 study data. PMID:26744549
Sparse Regression by Projection and Sparse Discriminant Analysis.
Qi, Xin; Luo, Ruiyan; Carroll, Raymond J; Zhao, Hongyu
2015-04-01
Recent years have seen active developments of various penalized regression methods, such as LASSO and elastic net, to analyze high dimensional data. In these approaches, the direction and length of the regression coefficients are determined simultaneously. Due to the introduction of penalties, the length of the estimates can be far from being optimal for accurate predictions. We introduce a new framework, regression by projection, and its sparse version to analyze high dimensional data. The unique nature of this framework is that the directions of the regression coefficients are inferred first, and the lengths and the tuning parameters are determined by a cross validation procedure to achieve the largest prediction accuracy. We provide a theoretical result for simultaneous model selection consistency and parameter estimation consistency of our method in high dimension. This new framework is then generalized such that it can be applied to principal components analysis, partial least squares and canonical correlation analysis. We also adapt this framework for discriminant analysis. Compared to the existing methods, where there is relatively little control of the dependency among the sparse components, our method can control the relationships among the components. We present efficient algorithms and related theory for solving the sparse regression by projection problem. Based on extensive simulations and real data analysis, we demonstrate that our method achieves good predictive performance and variable selection in the regression setting, and the ability to control relationships between the sparse components leads to more accurate classification. In supplemental materials available online, the details of the algorithms and theoretical proofs, and R codes for all simulation studies are provided.
Astronomical Methods for Nonparametric Regression
NASA Astrophysics Data System (ADS)
Steinhardt, Charles L.; Jermyn, Adam
2017-01-01
I will discuss commonly used techniques for nonparametric regression in astronomy. We find that several of them, particularly running averages and running medians, are generically biased, asymmetric between dependent and independent variables, and perform poorly in recovering the underlying function, even when errors are present only in one variable. We then examine less-commonly used techniques such as Multivariate Adaptive Regressive Splines and Boosted Trees and find them superior in bias, asymmetry, and variance both theoretically and in practice under a wide range of numerical benchmarks. In this context the chief advantage of the common techniques is runtime, which even for large datasets is now measured in microseconds compared with milliseconds for the more statistically robust techniques. This points to a tradeoff between bias, variance, and computational resources which in recent years has shifted heavily in favor of the more advanced methods, primarily driven by Moore's Law. Along these lines, we also propose a new algorithm which has better overall statistical properties than all techniques examined thus far, at the cost of significantly worse runtime, in addition to providing guidance on choosing the nonparametric regression technique most suitable to any specific problem. We then examine the more general problem of errors in both variables and provide a new algorithm which performs well in most cases and lacks the clear asymmetry of existing non-parametric methods, which fail to account for errors in both variables.
NASA Technical Reports Server (NTRS)
Farley, Gary L.
1994-01-01
Local characteristics of fabrics varied to suit special applications. Adjustable reed machinery proposed for use in weaving fabrics in various net shapes, widths, yarn spacings, and yarn angles. Locations of edges of fabric and configuration of warp and filling yarns varied along fabric to obtain specified properties. In machinery, reed wires mounted in groups on sliders, mounted on lengthwise rails in reed frame. Mechanisms incorporated to move sliders lengthwise, parallel to warp yarns, by sliding them along rails; move sliders crosswise by translating reed frame rails perpendicular to warp yarns; and crosswise by spreading reed rails within group. Profile of reed wires in group on each slider changed.
Deep Human Parsing with Active Template Regression.
Liang, Xiaodan; Liu, Si; Shen, Xiaohui; Yang, Jianchao; Liu, Luoqi; Dong, Jian; Lin, Liang; Yan, Shuicheng
2015-12-01
In this work, the human parsing task, namely decomposing a human image into semantic fashion/body regions, is formulated as an active template regression (ATR) problem, where the normalized mask of each fashion/body item is expressed as the linear combination of the learned mask templates, and then morphed to a more precise mask with the active shape parameters, including position, scale and visibility of each semantic region. The mask template coefficients and the active shape parameters together can generate the human parsing results, and are thus called the structure outputs for human parsing. The deep Convolutional Neural Network (CNN) is utilized to build the end-to-end relation between the input human image and the structure outputs for human parsing. More specifically, the structure outputs are predicted by two separate networks. The first CNN network is with max-pooling, and designed to predict the template coefficients for each label mask, while the second CNN network is without max-pooling to preserve sensitivity to label mask position and accurately predict the active shape parameters. For a new image, the structure outputs of the two networks are fused to generate the probability of each label for each pixel, and super-pixel smoothing is finally used to refine the human parsing result. Comprehensive evaluations on a large dataset well demonstrate the significant superiority of the ATR framework over other state-of-the-arts for human parsing. In particular, the F1-score reaches 64.38 percent by our ATR framework, significantly higher than 44.76 percent based on the state-of-the-art algorithm [28].
The Impact of Financial Sophistication on Adjustable Rate Mortgage Ownership
ERIC Educational Resources Information Center
Smith, Hyrum; Finke, Michael S.; Huston, Sandra J.
2011-01-01
The influence of a financial sophistication scale on adjustable-rate mortgage (ARM) borrowing is explored. Descriptive statistics and regression analysis using recent data from the Survey of Consumer Finances reveal that ARM borrowing is driven by both the least and most financially sophisticated households but for different reasons. Less…
Effects of Relational Authenticity on Adjustment to College
ERIC Educational Resources Information Center
Lenz, A. Stephen; Holman, Rachel L.; Lancaster, Chloe; Gotay, Stephanie G.
2016-01-01
The authors examined the association between relational health and student adjustment to college. Data were collected from 138 undergraduate students completing their 1st semester at a large university in the mid-southern United States. Regression analysis indicated that higher levels of relational authenticity were a predictor of success during…
Kokuhu, Takatoshi; Fukushima, Keizo; Ushigome, Hidetaka; Yoshimura, Norio; Sugioka, Nobuyuki
2013-01-01
The optimal use and monitoring of cyclosporine A (CyA) have remained unclear and the current strategy of CyA treatment requires frequent dose adjustment following an empirical initial dosage adjusted for total body weight (TBW). The primary aim of this study was to evaluate age and anthropometric parameters as predictors for dose adjustment of CyA; and the secondary aim was to compare the usefulness of the concentration at predose (C0) and 2-hour postdose (C2) monitoring. An open-label, non-randomized, retrospective study was performed in 81 renal transplant patients in Japan during 2001-2010. The relationships between the area under the blood concentration-time curve (AUC0-9) of CyA and its C0 or C2 level were assessed with a linear regression analysis model. In addition to age, 7 anthropometric parameters were tested as predictors for AUC0-9 of CyA: TBW, height (HT), body mass index (BMI), body surface area (BSA), ideal body weight (IBW), lean body weight (LBW), and fat free mass (FFM). Correlations between AUC0-9 of CyA and these parameters were also analyzed with a linear regression model. The rank order of the correlation coefficient was C0 > C2 (C0; r=0.6273, C2; r=0.5562). The linear regression analyses between AUC0-9 of CyA and candidate parameters indicated their potential usefulness from the following rank order: IBW > FFM > HT > BSA > LBW > TBW > BMI > Age. In conclusion, after oral administration, C2 monitoring has a large variation and could be at high risk for overdosing. Therefore, after oral dosing of CyA, it was not considered to be a useful approach for single monitoring, but should rather be used with C0 monitoring. The regression analyses between AUC0-9 of CyA and anthropometric parameters indicated that IBW was potentially the superior predictor for dose adjustment of CyA in an empiric strategy using TBW (IBW; r=0.5181, TBW; r=0.3192); however, this finding seems to lack the pharmacokinetic rationale and thus warrants further basic and clinical
Cytoplasmic hydrogen ion diffusion coefficient.
al-Baldawi, N F; Abercrombie, R F
1992-01-01
The apparent cytoplasmic proton diffusion coefficient was measured using pH electrodes and samples of cytoplasm extracted from the giant neuron of a marine invertebrate. By suddenly changing the pH at one surface of the sample and recording the relaxation of pH within the sample, an apparent diffusion coefficient of 1.4 +/- 0.5 x 10(-6) cm2/s (N = 7) was measured in the acidic or neutral range of pH (6.0-7.2). This value is approximately 5x lower than the diffusion coefficient of the mobile pH buffers (approximately 8 x 10(-6) cm2/s) and approximately 68x lower than the diffusion coefficient of the hydronium ion (93 x 10(-6) cm2/s). A mobile pH buffer (approximately 15% of the buffering power) and an immobile buffer (approximately 85% of the buffering power) could quantitatively account for the results at acidic or neutral pH. At alkaline pH (8.2-8.6), the apparent proton diffusion coefficient increased to 4.1 +/- 0.8 x 10(-6) cm2/s (N = 7). This larger diffusion coefficient at alkaline pH could be explained quantitatively by the enhanced buffering power of the mobile amino acids. Under the conditions of these experiments, it is unlikely that hydroxide movement influences the apparent hydrogen ion diffusion coefficient. PMID:1617134
Pakenham, Kenneth I; Samios, Christina; Sofronoff, Kate
2005-05-01
The present study examined the applicability of the double ABCX model of family adjustment in explaining maternal adjustment to caring for a child diagnosed with Asperger syndrome. Forty-seven mothers completed questionnaires at a university clinic while their children were participating in an anxiety intervention. The children were aged between 10 and 12 years. Results of correlations showed that each of the model components was related to one or more domains of maternal adjustment in the direction predicted, with the exception of problem-focused coping. Hierarchical regression analyses demonstrated that, after controlling for the effects of relevant demographics, stressor severity, pile-up of demands and coping were related to adjustment. Findings indicate the utility of the double ABCX model in guiding research into parental adjustment when caring for a child with Asperger syndrome. Limitations of the study and clinical implications are discussed.
Confidence Intervals for Squared Semipartial Correlation Coefficients: The Effect of Nonnormality
ERIC Educational Resources Information Center
Algina, James; Keselman, H. J.; Penfield, Randall D.
2010-01-01
The increase in the squared multiple correlation coefficient ([delta]R[superscript 2]) associated with a variable in a regression equation is a commonly used measure of importance in regression analysis. Algina, Keselman, and Penfield found that intervals based on asymptotic principles were typically very inaccurate, even though the sample size…
Use of age-adjusted rates of suicide in time series studies in Israel.
Bridges, F Stephen; Tankersley, William B
2009-01-01
Durkheim's modified theory of suicide was examined to explore how consistent it was in predicting Israeli rates of suicide from 1965 to 1997 when using age-adjusted rates rather than crude ones. In this time-series study, Israeli male and female rates of suicide increased and decreased, respectively, between 1965 and 1997. Conforming to Durkheim's modified theory, the Israeli male rate of suicide was lower in years when rates of marriage and birth are higher, while rates of suicide are higher in years when rates of divorce are higher, the opposite to that of Israeli women. The corrected regression coefficients suggest that the Israeli female rate of suicide remained lower in years when rate of divorce is higher, again the opposite suggested by Durkheim's modified theory. These results may indicate that divorce affects the mental health of Israeli women as suggested by their lower rate of suicide. Perhaps the "multiple roles held by Israeli females creates suicidogenic stress" and divorce provides some sense of stress relief, mentally speaking. The results were not as consistent with predictions suggested by Durkheim's modified theory of suicide as were rates from the United States for the same period nor were they consistent with rates based on "crude" suicide data. Thus, using age-adjusted rates of suicide had an influence on the prediction of the Israeli rate of suicide during this period.
Continuously adjustable Pulfrich spectacles
NASA Astrophysics Data System (ADS)
Jacobs, Ken; Karpf, Ron
2011-03-01
A number of Pulfrich 3-D movies and TV shows have been produced, but the standard implementation has inherent drawbacks. The movie and TV industries have correctly concluded that the standard Pulfrich 3-D implementation is not a useful 3-D technique. Continuously Adjustable Pulfrich Spectacles (CAPS) is a new implementation of the Pulfrich effect that allows any scene containing movement in a standard 2-D movie, which are most scenes, to be optionally viewed in 3-D using inexpensive viewing specs. Recent scientific results in the fields of human perception, optoelectronics, video compression and video format conversion are translated into a new implementation of Pulfrich 3- D. CAPS uses these results to continuously adjust to the movie so that the viewing spectacles always conform to the optical density that optimizes the Pulfrich stereoscopic illusion. CAPS instantly provides 3-D immersion to any moving scene in any 2-D movie. Without the glasses, the movie will appear as a normal 2-D image. CAPS work on any viewing device, and with any distribution medium. CAPS is appropriate for viewing Internet streamed movies in 3-D.
An assessment of precipitation adjustment and feedback computation methods
NASA Astrophysics Data System (ADS)
Richardson, T. B.; Samset, B. H.; Andrews, T.; Myhre, G.; Forster, P. M.
2016-10-01
The precipitation adjustment and feedback framework is a useful tool for understanding global and regional precipitation changes. However, there is no definitive method for making the decomposition. In this study we highlight important differences which arise in results due to methodological choices. The responses to five different forcing agents (CO2, CH4, SO4, black carbon, and solar insolation) are analyzed using global climate model simulations. Three decomposition methods are compared: using fixed sea surface temperature experiments (fSST), regressing transient climate change after an abrupt forcing (regression), and separating based on timescale using the first year of coupled simulations (YR1). The YR1 method is found to incorporate significant SST-driven feedbacks into the adjustment and is therefore not suitable for making the decomposition. Globally, the regression and fSST methods produce generally consistent results; however, the regression values are dependent on the number of years analyzed and have considerably larger uncertainties. Regionally, there are substantial differences between methods. The pattern of change calculated using regression reverses sign in many regions as the number of years analyzed increases. This makes it difficult to establish what effects are included in the decomposition. The fSST method provides a more clear-cut separation in terms of what physical drivers are included in each component. The fSST results are less affected by methodological choices and exhibit much less variability. We find that the precipitation adjustment is weakly affected by the choice of SST climatology.
Note on a Confidence Interval for the Squared Semipartial Correlation Coefficient
ERIC Educational Resources Information Center
Algina, James; Keselman, Harvey J.; Penfield, Randall J.
2008-01-01
A squared semipartial correlation coefficient ([Delta]R[superscript 2]) is the increase in the squared multiple correlation coefficient that occurs when a predictor is added to a multiple regression model. Prior research has shown that coverage probability for a confidence interval constructed by using a modified percentile bootstrap method with…
Bias associated with using the estimated propensity score as a regression covariate.
Hade, Erinn M; Lu, Bo
2014-01-15
The use of propensity score methods to adjust for selection bias in observational studies has become increasingly popular in public health and medical research. A substantial portion of studies using propensity score adjustment treat the propensity score as a conventional regression predictor. Through a Monte Carlo simulation study, Austin and colleagues. investigated the bias associated with treatment effect estimation when the propensity score is used as a covariate in nonlinear regression models, such as logistic regression and Cox proportional hazards models. We show that the bias exists even in a linear regression model when the estimated propensity score is used and derive the explicit form of the bias. We also conduct an extensive simulation study to compare the performance of such covariate adjustment with propensity score stratification, propensity score matching, inverse probability of treatment weighted method, and nonparametric functional estimation using splines. The simulation scenarios are designed to reflect real data analysis practice. Instead of specifying a known parametric propensity score model, we generate the data by considering various degrees of overlap of the covariate distributions between treated and control groups. Propensity score matching excels when the treated group is contained within a larger control pool, while the model-based adjustment may have an edge when treated and control groups do not have too much overlap. Overall, adjusting for the propensity score through stratification or matching followed by regression or using splines, appears to be a good practical strategy.
Anderson, S.C.; Kupfer, J.A.; Wilson, R.R.; Cooper, R.J.
2000-01-01
The purpose of this research was to develop a model that could be used to provide a spatial representation of uneven-aged silvicultural treatments on forest crown area. We began by developing species-specific linear regression equations relating tree DBH to crown area for eight bottomland tree species at White River National Wildlife Refuge, Arkansas, USA. The relationships were highly significant for all species, with coefficients of determination (r(2)) ranging from 0.37 for Ulmus crassifolia to nearly 0.80 for Quercus nuttalliii and Taxodium distichum. We next located and measured the diameters of more than 4000 stumps from a single tree-group selection timber harvest. Stump locations were recorded with respect to an established gl id point system and entered into a Geographic Information System (ARC/INFO). The area occupied by the crown of each logged individual was then estimated by using the stump dimensions (adjusted to DBHs) and the regression equations relating tree DBH to crown area. Our model projected that the selection cuts removed roughly 300 m(2) of basal area from the logged sites resulting in the loss of approximate to 55 000 m(2) of crown area. The model developed in this research represents a tool that can be used in conjunction with remote sensing applications to assist in forest inventory and management, as well as to estimate the impacts of selective timber harvest on wildlife.
A technique to measure rotordynamic coefficients in hydrostatic bearings
NASA Astrophysics Data System (ADS)
Capaldi, Russell J.
1993-11-01
An experimental technique is described for measuring the rotordynamic coefficients of fluid film journal bearings. The bearing tester incorporates a double-spool shaft assembly that permits independent control over the journal spin speed and the frequency of an adjustable-magnitude circular orbit. This configuration yields data that enables determination of the full linear anisotropic rotordynamic coefficient matrices. The dynamic force measurements were made simultaneously with two independent systems, one with piezoelectric load cells and the other with strain gage load cells. Some results are presented for a four-recess, oil-fed hydrostatic journal bearing.
A technique to measure rotordynamic coefficients in hydrostatic bearings
NASA Technical Reports Server (NTRS)
Capaldi, Russell J.
1993-01-01
An experimental technique is described for measuring the rotordynamic coefficients of fluid film journal bearings. The bearing tester incorporates a double-spool shaft assembly that permits independent control over the journal spin speed and the frequency of an adjustable-magnitude circular orbit. This configuration yields data that enables determination of the full linear anisotropic rotordynamic coefficient matrices. The dynamic force measurements were made simultaneously with two independent systems, one with piezoelectric load cells and the other with strain gage load cells. Some results are presented for a four-recess, oil-fed hydrostatic journal bearing.
A regressive model analysis of congenital sensorineural deafness in German Dalmatian dogs.
Juraschko, Kathrin; Meyer-Lindenberg, Andrea; Nolte, Ingo; Distl, Ottmar
2003-08-01
The objective of the present study was to analyze the mode of inheritance for congenital sensorineural deafness (CSD) in German Dalmatian dogs by consideration of association between phenotypic breed characteristics and CSD. Segregation analysis with regressive logistic models was employed to test for different mechanisms of genetic transmission. Data were obtained from all three Dalmatian kennel clubs associated with the German Association for Dog Breeding and Husbandry (VDH). CSD was tested by veterinary practitioners using standardized protocols for Brainstem Auditory-Evoked Response (BAER). The sample included 1899 Dalmatian dogs from 354 litters in 169 different kennels. BAER testing results were from the years 1986 to 1999. Pedigree information was available for up to seven generations. The segregation analysis showed that a mixed monogenic-polygenic model including eye color as covariate among all other tested models best explained the segregation of affected animals in the pedigrees. The recessive major gene segregated in dogs with blue and brown eye color as well as in dogs with and without pigmented coat patches. Models which took into account the occurrence of patches, percentage of puppies tested per litter, or inbreeding coefficient gave no better adjustment to the most general (saturated) model. A procedure for the simultaneous prediction of breeding values and the estimation of genotype probabilities for CSD is expected to improve breeding programs significantly.
Regression analysis of cytopathological data
Whittemore, A.S.; McLarty, J.W.; Fortson, N.; Anderson, K.
1982-12-01
Epithelial cells from the human body are frequently labelled according to one of several ordered levels of abnormality, ranging from normal to malignant. The label of the most abnormal cell in a specimen determines the score for the specimen. This paper presents a model for the regression of specimen scores against continuous and discrete variables, as in host exposure to carcinogens. Application to data and tests for adequacy of model fit are illustrated using sputum specimens obtained from a cohort of former asbestos workers.
Functional constraints on phenomenological coefficients
NASA Astrophysics Data System (ADS)
Klika, Václav; Pavelka, Michal; Benziger, Jay B.
2017-02-01
Thermodynamic fluxes (diffusion fluxes, heat flux, etc.) are often proportional to thermodynamic forces (gradients of chemical potentials, temperature, etc.) via the matrix of phenomenological coefficients. Onsager's relations imply that the matrix is symmetric, which reduces the number of unknown coefficients is reduced. In this article we demonstrate that for a class of nonequilibrium thermodynamic models in addition to Onsager's relations the phenomenological coefficients must share the same functional dependence on the local thermodynamic state variables. Thermodynamic models and experimental data should be validated through consistency with the functional constraint. We present examples of coupled heat and mass transport (thermodiffusion) and coupled charge and mass transport (electro-osmotic drag). Additionally, these newly identified constraints further reduce the number of experiments needed to describe the phenomenological coefficient.
Regression analysis for solving diagnosis problem of children's health
NASA Astrophysics Data System (ADS)
Cherkashina, Yu A.; Gerget, O. M.
2016-04-01
The paper includes results of scientific researches. These researches are devoted to the application of statistical techniques, namely, regression analysis, to assess the health status of children in the neonatal period based on medical data (hemostatic parameters, parameters of blood tests, the gestational age, vascular-endothelial growth factor) measured at 3-5 days of children's life. In this paper a detailed description of the studied medical data is given. A binary logistic regression procedure is discussed in the paper. Basic results of the research are presented. A classification table of predicted values and factual observed values is shown, the overall percentage of correct recognition is determined. Regression equation coefficients are calculated, the general regression equation is written based on them. Based on the results of logistic regression, ROC analysis was performed, sensitivity and specificity of the model are calculated and ROC curves are constructed. These mathematical techniques allow carrying out diagnostics of health of children providing a high quality of recognition. The results make a significant contribution to the development of evidence-based medicine and have a high practical importance in the professional activity of the author.
Optimization of Regression Models of Experimental Data Using Confirmation Points
NASA Technical Reports Server (NTRS)
Ulbrich, N.
2010-01-01
A new search metric is discussed that may be used to better assess the predictive capability of different math term combinations during the optimization of a regression model of experimental data. The new search metric can be determined for each tested math term combination if the given experimental data set is split into two subsets. The first subset consists of data points that are only used to determine the coefficients of the regression model. The second subset consists of confirmation points that are exclusively used to test the regression model. The new search metric value is assigned after comparing two values that describe the quality of the fit of each subset. The first value is the standard deviation of the PRESS residuals of the data points. The second value is the standard deviation of the response residuals of the confirmation points. The greater of the two values is used as the new search metric value. This choice guarantees that both standard deviations are always less or equal to the value that is used during the optimization. Experimental data from the calibration of a wind tunnel strain-gage balance is used to illustrate the application of the new search metric. The new search metric ultimately generates an optimized regression model that was already tested at regression model independent confirmation points before it is ever used to predict an unknown response from a set of regressors.
copCAR: A Flexible Regression Model for Areal Data.
Hughes, John
2015-09-16
Non-Gaussian spatial data are common in many fields. When fitting regressions for such data, one needs to account for spatial dependence to ensure reliable inference for the regression coefficients. The two most commonly used regression models for spatially aggregated data are the automodel and the areal generalized linear mixed model (GLMM). These models induce spatial dependence in different ways but share the smoothing approach, which is intuitive but problematic. This article develops a new regression model for areal data. The new model is called copCAR because it is copula-based and employs the areal GLMM's conditional autoregression (CAR). copCAR overcomes many of the drawbacks of the automodel and the areal GLMM. Specifically, copCAR (1) is flexible and intuitive, (2) permits positive spatial dependence for all types of data, (3) permits efficient computation, and (4) provides reliable spatial regression inference and information about dependence strength. An implementation is provided by R package copCAR, which is available from the Comprehensive R Archive Network, and supplementary materials are available online.
Interactive natural image segmentation via spline regression.
Xiang, Shiming; Nie, Feiping; Zhang, Chunxia; Zhang, Changshui
2009-07-01
This paper presents an interactive algorithm for segmentation of natural images. The task is formulated as a problem of spline regression, in which the spline is derived in Sobolev space and has a form of a combination of linear and Green's functions. Besides its nonlinear representation capability, one advantage of this spline in usage is that, once it has been constructed, no parameters need to be tuned to data. We define this spline on the user specified foreground and background pixels, and solve its parameters (the combination coefficients of functions) from a group of linear equations. To speed up spline construction, K-means clustering algorithm is employed to cluster the user specified pixels. By taking the cluster centers as representatives, this spline can be easily constructed. The foreground object is finally cut out from its background via spline interpolation. The computational complexity of the proposed algorithm is linear in the number of the pixels to be segmented. Experiments on diverse natural images, with comparison to existing algorithms, illustrate the validity of our method.
Olson, Scott A.; with a section by Veilleux, Andrea G.
2014-01-01
This report provides estimates of flood discharges at selected annual exceedance probabilities (AEPs) for streamgages in and adjacent to Vermont and equations for estimating flood discharges at AEPs of 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent (recurrence intervals of 2-, 5-, 10-, 25-, 50-, 100-, 200-, and 500-years, respectively) for ungaged, unregulated, rural streams in Vermont. The equations were developed using generalized least-squares regression. Flood-frequency and drainage-basin characteristics from 145 streamgages were used in developing the equations. The drainage-basin characteristics used as explanatory variables in the regression equations include drainage area, percentage of wetland area, and the basin-wide mean of the average annual precipitation. The average standard errors of prediction for estimating the flood discharges at the 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent AEP with these equations are 34.9, 36.0, 38.7, 42.4, 44.9, 47.3, 50.7, and 55.1 percent, respectively. Flood discharges at selected AEPs for streamgages were computed by using the Expected Moments Algorithm. To improve estimates of the flood discharges for given exceedance probabilities at streamgages in Vermont, a new generalized skew coefficient was developed. The new generalized skew for the region is a constant, 0.44. The mean square error of the generalized skew coefficient is 0.078. This report describes a technique for using results from the regression equations to adjust an AEP discharge computed from a streamgage record. This report also describes a technique for using a drainage-area adjustment to estimate flood discharge at a selected AEP for an ungaged site upstream or downstream from a streamgage. The final regression equations and the flood-discharge frequency data used in this study will be available in StreamStats. StreamStats is a World Wide Web application providing automated regression-equation solutions for user-selected sites on streams.
Multiatlas segmentation as nonparametric regression.
Awate, Suyash P; Whitaker, Ross T
2014-09-01
This paper proposes a novel theoretical framework to model and analyze the statistical characteristics of a wide range of segmentation methods that incorporate a database of label maps or atlases; such methods are termed as label fusion or multiatlas segmentation. We model these multiatlas segmentation problems as nonparametric regression problems in the high-dimensional space of image patches. We analyze the nonparametric estimator's convergence behavior that characterizes expected segmentation error as a function of the size of the multiatlas database. We show that this error has an analytic form involving several parameters that are fundamental to the specific segmentation problem (determined by the chosen anatomical structure, imaging modality, registration algorithm, and label-fusion algorithm). We describe how to estimate these parameters and show that several human anatomical structures exhibit the trends modeled analytically. We use these parameter estimates to optimize the regression estimator. We show that the expected error for large database sizes is well predicted by models learned on small databases. Thus, a few expert segmentations can help predict the database sizes required to keep the expected error below a specified tolerance level. Such cost-benefit analysis is crucial for deploying clinical multiatlas segmentation systems.
Multiatlas Segmentation as Nonparametric Regression
Awate, Suyash P.; Whitaker, Ross T.
2015-01-01
This paper proposes a novel theoretical framework to model and analyze the statistical characteristics of a wide range of segmentation methods that incorporate a database of label maps or atlases; such methods are termed as label fusion or multiatlas segmentation. We model these multiatlas segmentation problems as nonparametric regression problems in the high-dimensional space of image patches. We analyze the nonparametric estimator’s convergence behavior that characterizes expected segmentation error as a function of the size of the multiatlas database. We show that this error has an analytic form involving several parameters that are fundamental to the specific segmentation problem (determined by the chosen anatomical structure, imaging modality, registration algorithm, and label-fusion algorithm). We describe how to estimate these parameters and show that several human anatomical structures exhibit the trends modeled analytically. We use these parameter estimates to optimize the regression estimator. We show that the expected error for large database sizes is well predicted by models learned on small databases. Thus, a few expert segmentations can help predict the database sizes required to keep the expected error below a specified tolerance level. Such cost-benefit analysis is crucial for deploying clinical multiatlas segmentation systems. PMID:24802528
Judd, Linda J.; Asquith, William H.; Slade, Raymond M.
1996-01-01
One technique to estimate generalized skew coefficients involved the use of regression equations developed for each of eight regions in Texas, and the other involved development of a statewide map of generalized skew coefficients. The weighted mean of the weighted mean standard errors of the regression equations for the eight regions is 0.36 log10 skew units, and the weighted mean standard error of the map is 0.35 log10 skew units. The technique based on the map is preferred for estimating generalized skew coefficients because of its smooth transition from one region of the State to another.
VARYING COEFFICIENT MODELS FOR DATA WITH AUTO-CORRELATED ERROR PROCESS
Chen, Zhao; Li, Runze; Li, Yan
2014-01-01
Varying coefficient model has been popular in the literature. In this paper, we propose a profile least squares estimation procedure to its regression coefficients when its random error is an auto-regressive (AR) process. We further study the asymptotic properties of the proposed procedure, and establish the asymptotic normality for the resulting estimate. We show that the resulting estimate for the regression coefficients has the same asymptotic bias and variance as the local linear estimate for varying coefficient models with independent and identically distributed observations. We apply the SCAD variable selection procedure (Fan and Li, 2001) to reduce model complexity of the AR error process. Numerical comparison and finite sample performance of the resulting estimate are examined by Monte Carlo studies. Our simulation results demonstrate the proposed procedure is much more efficient than the one ignoring the error correlation. The proposed methodology is illustrated by a real data example. PMID:25908899
VARYING COEFFICIENT MODELS FOR DATA WITH AUTO-CORRELATED ERROR PROCESS.
Chen, Zhao; Li, Runze; Li, Yan
2015-04-01
Varying coefficient model has been popular in the literature. In this paper, we propose a profile least squares estimation procedure to its regression coefficients when its random error is an auto-regressive (AR) process. We further study the asymptotic properties of the proposed procedure, and establish the asymptotic normality for the resulting estimate. We show that the resulting estimate for the regression coefficients has the same asymptotic bias and variance as the local linear estimate for varying coefficient models with independent and identically distributed observations. We apply the SCAD variable selection procedure (Fan and Li, 2001) to reduce model complexity of the AR error process. Numerical comparison and finite sample performance of the resulting estimate are examined by Monte Carlo studies. Our simulation results demonstrate the proposed procedure is much more efficient than the one ignoring the error correlation. The proposed methodology is illustrated by a real data example.
Recognition of caudal regression syndrome.
Boulas, Mari M
2009-04-01
Caudal regression syndrome, also referred to as caudal dysplasia and sacral agenesis syndrome, is a rare congenital malformation characterized by varying degrees of developmental failure early in gestation. It involves the lower extremities, the lumbar and coccygeal vertebrae, and corresponding segments of the spinal cord. This is a rare disorder, and true pathogenesis is unclear. The etiology is thought to be related to maternal diabetes, genetic predisposition, and vascular hypoperfusion, but no true causative factor has been determined. Fetal diagnostic tools allow for early recognition of the syndrome, and careful examination of the newborn is essential to determine the extent of the disorder. Associated organ system dysfunction depends on the severity of the disease. Related defects are structural, and systematic problems including respiratory, cardiac, gastrointestinal, urinary, orthopedic, and neurologic can be present in varying degrees of severity and in different combinations. A multidisciplinary approach to management is crucial. Because the primary pathology is irreversible, treatment is only supportive.
Practical Session: Multiple Linear Regression
NASA Astrophysics Data System (ADS)
Clausel, M.; Grégoire, G.
2014-12-01
Three exercises are proposed to illustrate the simple linear regression. In the first one investigates the influence of several factors on atmospheric pollution. It has been proposed by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr33.pdf) and is based on data coming from 20 cities of U.S. Exercise 2 is an introduction to model selection whereas Exercise 3 provides a first example of analysis of variance. Exercises 2 and 3 have been proposed by A. Dalalyan at ENPC (see Exercises 2 and 3 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_5.pdf).
Lumbar herniated disc: spontaneous regression
Yüksel, Kasım Zafer
2017-01-01
Background Low back pain is a frequent condition that results in substantial disability and causes admission of patients to neurosurgery clinics. To evaluate and present the therapeutic outcomes in lumbar disc hernia (LDH) patients treated by means of a conservative approach, consisting of bed rest and medical therapy. Methods This retrospective cohort was carried out in the neurosurgery departments of hospitals in Kahramanmaraş city and 23 patients diagnosed with LDH at the levels of L3−L4, L4−L5 or L5−S1 were enrolled. Results The average age was 38.4 ± 8.0 and the chief complaint was low back pain and sciatica radiating to one or both lower extremities. Conservative treatment was administered. Neurological examination findings, durations of treatment and intervals until symptomatic recovery were recorded. Laségue tests and neurosensory examination revealed that mild neurological deficits existed in 16 of our patients. Previously, 5 patients had received physiotherapy and 7 patients had been on medical treatment. The number of patients with LDH at the level of L3−L4, L4−L5, and L5−S1 were 1, 13, and 9, respectively. All patients reported that they had benefit from medical treatment and bed rest, and radiologic improvement was observed simultaneously on MRI scans. The average duration until symptomatic recovery and/or regression of LDH symptoms was 13.6 ± 5.4 months (range: 5−22). Conclusions It should be kept in mind that lumbar disc hernias could regress with medical treatment and rest without surgery, and there should be an awareness that these patients could recover radiologically. This condition must be taken into account during decision making for surgical intervention in LDH patients devoid of indications for emergent surgery. PMID:28119770
This Infographic shows the National Cancer Institute SEER Incidence Trends. The graphs show the Average Annual Percent Change (AAPC) 2002-2011. For Men, Thyroid: 5.3*,Liver & IBD: 3.6*, Melanoma: 2.3*, Kidney: 2.0*, Myeloma: 1.9*, Pancreas: 1.2*, Leukemia: 0.9*, Oral Cavity: 0.5, Non-Hodgkin Lymphoma: 0.3*, Esophagus: -0.1, Brain & ONS: -0.2*, Bladder: -0.6*, All Sites: -1.1*, Stomach: -1.7*, Larynx: -1.9*, Prostate: -2.1*, Lung & Bronchus: -2.4*, and Colon & Rectum: -3/0*. For Women, Thyroid: 5.8*, Liver & IBD: 2.9*, Myeloma: 1.8*, Kidney: 1.6*, Melanoma: 1.5, Corpus & Uterus: 1.3*, Pancreas: 1.1*, Leukemia: 0.6*, Brain & ONS: 0, Non-Hodgkin Lymphoma: -0.1, All Sites: -0.1, Breast: -0.3, Stomach: -0.7*, Oral Cavity: -0.7*, Bladder: -0.9*, Ovary: -0.9*, Lung & Bronchus: -1.0*, Cervix: -2.4*, and Colon & Rectum: -2.7*. * AAPC is significantly different from zero (p<.05). Rates were adjusted for reporting delay in the registry. www.cancer.gov Source: Special section of the Annual Report to the Nation on the Status of Cancer, 1975-2011.
Nonlinear Hydrostatic Adjustment.
NASA Astrophysics Data System (ADS)
Bannon, Peter R.
1996-12-01
The final equilibrium state of Lamb's hydrostatic adjustment problem is found for finite amplitude heating. Lamb's problem consists of the response of a compressible atmosphere to an instantaneous, horizontally homogeneous heating. Results are presented for both isothermal and nonisothermal atmospheres.As in the linear problem, the fluid displacements are confined to the heated layer and to the region aloft with no displacement of the fluid below the heating. The region above the heating is displaced uniformly upward for heating and downward for cooling. The amplitudes of the displacements are larger for cooling than for warming.Examination of the energetics reveals that the fraction of the heat deposited into the acoustic modes increases linearly with the amplitude of the heating. This fraction is typically small (e.g., 0.06% for a uniform warming of 1 K) and is essentially independent of the lapse rate of the base-state atmosphere. In contrast a fixed fraction of the available energy generated by the heating goes into the acoustic modes. This fraction (e.g., 12% for a standard tropospheric lapse rate) agrees with the linear result and increases with increasing stability of the base-state atmosphere.The compressible results are compared to solutions using various forms of the soundproof equations. None of the soundproof equations predict the finite amplitude solutions accurately. However, in the small amplitude limit, only the equations for deep convection advanced by Dutton and Fichtl predict the thermodynamic state variables accurately for a nonisothermal base-state atmosphere.
Kuiper, Gerhardus J A J M; Houben, Rik; Wetzels, Rick J H; Verhezen, Paul W M; van Oerle, Rene; Ten Cate, Hugo; Henskens, Yvonne M C; Lancé, Marcus D
2017-01-09
Low platelet counts and hematocrit levels hinder whole blood point-of-care testing of platelet function. Thus far, no reference ranges for MEA (multiple electrode aggregometry) and PFA-100 (platelet function analyzer 100) devices exist for low ranges. Through dilution methods of volunteer whole blood, platelet function at low ranges of platelet count and hematocrit levels was assessed on MEA for four agonists and for PFA-100 in two cartridges. Using (multiple) regression analysis, 95% reference intervals were computed for these low ranges. Low platelet counts affected MEA in a positive correlation (all agonists showed r(2) ≥ 0.75) and PFA-100 in an inverse correlation (closure times were prolonged with lower platelet counts). Lowered hematocrit did not affect MEA testing, except for arachidonic acid activation (ASPI), which showed a weak positive correlation (r(2) = 0.14). Closure time on PFA-100 testing was inversely correlated with hematocrit for both cartridges. Regression analysis revealed different 95% reference intervals in comparison with originally established intervals for both MEA and PFA-100 in low platelet or hematocrit conditions. Multiple regression analysis of ASPI and both tests on the PFA-100 for combined low platelet and hematocrit conditions revealed that only PFA-100 testing should be adjusted for both thrombocytopenia and anemia. 95% reference intervals were calculated using multiple regression analysis. However, coefficients of determination of PFA-100 were poor, and some variance remained unexplained. Thus, in this pilot study using (multiple) regression analysis, we could establish reference intervals of platelet function in anemia and thrombocytopenia conditions on PFA-100 and in thrombocytopenia conditions on MEA.
Orthogonality of spherical harmonic coefficients
NASA Technical Reports Server (NTRS)
Mcleod, M. G.
1980-01-01
Orthogonality relations are obtained for the spherical harmonic coefficients of functions defined on the surface of a sphere. Following a brief discussion of the orthogonality of Fourier series coefficients, consideration is given to the values averaged over all orientations of the coordinate system of the spherical harmonic coefficients of a function defined on the surface of a sphere that can be expressed in terms of Legendre polynomials for the special case where the function is the sum of two delta functions located at two different points on the sphere, and for the case of an essentially arbitrary function. It is noted that the orthogonality relations derived have found applications in statistical studies of the geomagnetic field.
Transport coefficients of gluonic fluid
Das, Santosh K.; Alam, Jan-e
2011-06-01
The shear ({eta}) and bulk ({zeta}) viscous coefficients have been evaluated for a gluonic fluid. The elastic, gg{yields}gg and the inelastic, number nonconserving, gg{yields}ggg processes have been considered as the dominant perturbative processes in evaluating the viscous coefficients to entropy density (s) ratios. Recently the processes: gg{yields}ggg has been revisited and a correction to the widely used Gunion-Bertsch (GB) formula has been obtained. The {eta} and {zeta} have been evaluated for gluonic fluid with the formula recently derived. At large {alpha}{sub s} the value of {eta}/s approaches its lower bound, {approx}1/4{pi}.
Seebeck coefficient of one electron
Durrani, Zahid A. K.
2014-03-07
The Seebeck coefficient of one electron, driven thermally into a semiconductor single-electron box, is investigated theoretically. With a finite temperature difference ΔT between the source and charging island, a single electron can charge the island in equilibrium, directly generating a Seebeck effect. Seebeck coefficients for small and finite ΔT are calculated and a thermally driven Coulomb staircase is predicted. Single-electron Seebeck oscillations occur with increasing ΔT, as one electron at a time charges the box. A method is proposed for experimental verification of these effects.
A regression model analysis of longitudinal dental caries data.
Ringelberg, M L; Tonascia, J A
1976-03-01
Longitudinal data on caries experience were derived from the reexamination and interview of a cohort of 306 subjects with an average follow-up period of 33 years after the baseline examination. Analysis of the data was accomplished by the use of contingency tables utilizing enumeration statistics compared with a multiple regression analysis. The analyses indicated a strong association of caries experience at one point in time with the caries experience of that same person earlier in life. The regression model approach offers adjustment of any given independent variable for the effect of all other independent variables, providing a powerful means of bias reduction. The model is also useful in separating out the specific effect of an independent variable over and above the contribution of other variables. The model used explained 35% of the variability in the DMFS scores recorded. Similar models could be useful adjuncts in the analyses of dental epidemiologic data.
Model building strategy for logistic regression: purposeful selection.
Zhang, Zhongheng
2016-03-01
Logistic regression is one of the most commonly used models to account for confounders in medical literature. The article introduces how to perform purposeful selection model building strategy with R. I stress on the use of likelihood ratio test to see whether deleting a variable will have significant impact on model fit. A deleted variable should also be checked for whether it is an important adjustment of remaining covariates. Interaction should be checked to disentangle complex relationship between covariates and their synergistic effect on response variable. Model should be checked for the goodness-of-fit (GOF). In other words, how the fitted model reflects the real data. Hosmer-Lemeshow GOF test is the most widely used for logistic regression model.
Genetics Home Reference: caudal regression syndrome
... of a genetic condition? Genetic and Rare Diseases Information Center Frequency Caudal regression syndrome is estimated to occur in 1 to ... parts of the skeleton, gastrointestinal system, and genitourinary ... caudal regression syndrome results from the presence of an abnormal ...
Regressions during reading: The cost depends on the cause.
Eskenazi, Michael A; Folk, Jocelyn R
2016-11-21
The direction and duration of eye movements during reading is predominantly determined by cognitive and linguistic processing, but some low-level oculomotor effects also influence the duration and direction of eye movements. One such effect is inhibition of return (IOR), which results in an increased latency to return attention to a target that has been previously attended (Posner & Cohen, Attention and Performance X: Control of Language Processes, 32, 531-556, 1984). Although this is a low level effect, it has also been found in the complex task of reading (Henderson & Luke, Psychonomic Bulletin & Review, 19(6), 1101-1107, 2012; Rayner, Juhasz, Ashby, & Clifton, Vision Research, 43(9), 1027-1034, 2003). The purpose of the current study was to isolate the potentially different causes of regressive eye movements: to adjust for oculomotor error and to assist with comprehension difficulties. We found that readers demonstrated an IOR effect when regressions were caused by oculomotor error, but not when regressions were caused by comprehension difficulties. The results suggest that IOR is primarily associated with low-level oculomotor control of eye movements, and that regressive eye movements that are controlled by comprehension processes are not subject to IOR effects. The results have implications for understanding the relationship between oculomotor and cognitive control of eye movements and for models of eye movement control.
Semiparametric regression during 2003–2007*
Ruppert, David; Wand, M.P.; Carroll, Raymond J.
2010-01-01
Semiparametric regression is a fusion between parametric regression and nonparametric regression that integrates low-rank penalized splines, mixed model and hierarchical Bayesian methodology – thus allowing more streamlined handling of longitudinal and spatial correlation. We review progress in the field over the five-year period between 2003 and 2007. We find semiparametric regression to be a vibrant field with substantial involvement and activity, continual enhancement and widespread application. PMID:20305800
Intuitionistic Fuzzy Weighted Linear Regression Model with Fuzzy Entropy under Linear Restrictions.
Kumar, Gaurav; Bajaj, Rakesh Kumar
2014-01-01
In fuzzy set theory, it is well known that a triangular fuzzy number can be uniquely determined through its position and entropies. In the present communication, we extend this concept on triangular intuitionistic fuzzy number for its one-to-one correspondence with its position and entropies. Using the concept of fuzzy entropy the estimators of the intuitionistic fuzzy regression coefficients have been estimated in the unrestricted regression model. An intuitionistic fuzzy weighted linear regression (IFWLR) model with some restrictions in the form of prior information has been considered. Further, the estimators of regression coefficients have been obtained with the help of fuzzy entropy for the restricted/unrestricted IFWLR model by assigning some weights in the distance function.
A linear regression solution to the spatial autocorrelation problem
NASA Astrophysics Data System (ADS)
Griffith, Daniel A.
The Moran Coefficient spatial autocorrelation index can be decomposed into orthogonal map pattern components. This decomposition relates it directly to standard linear regression, in which corresponding eigenvectors can be used as predictors. This paper reports comparative results between these linear regressions and their auto-Gaussian counterparts for the following georeferenced data sets: Columbus (Ohio) crime, Ottawa-Hull median family income, Toronto population density, southwest Ohio unemployment, Syracuse pediatric lead poisoning, and Glasgow standard mortality rates, and a small remotely sensed image of the High Peak district. This methodology is extended to auto-logistic and auto-Poisson situations, with selected data analyses including percentage of urban population across Puerto Rico, and the frequency of SIDs cases across North Carolina. These data analytic results suggest that this approach to georeferenced data analysis offers considerable promise.
Bayesian median regression for temporal gene expression data
NASA Astrophysics Data System (ADS)
Yu, Keming; Vinciotti, Veronica; Liu, Xiaohui; 't Hoen, Peter A. C.
2007-09-01
Most of the existing methods for the identification of biologically interesting genes in a temporal expression profiling dataset do not fully exploit the temporal ordering in the dataset and are based on normality assumptions for the gene expression. In this paper, we introduce a Bayesian median regression model to detect genes whose temporal profile is significantly different across a number of biological conditions. The regression model is defined by a polynomial function where both time and condition effects as well as interactions between the two are included. MCMC-based inference returns the posterior distribution of the polynomial coefficients. From this a simple Bayes factor test is proposed to test for significance. The estimation of the median rather than the mean, and within a Bayesian framework, increases the robustness of the method compared to a Hotelling T2-test previously suggested. This is shown on simulated data and on muscular dystrophy gene expression data.
A statistical test for the equality of differently adjusted incidence rate ratios.
Hoffmann, Kurt; Pischon, Tobias; Schulz, Mandy; Schulze, Matthias B; Ray, Jennifer; Boeing, Heiner
2008-03-01
An incidence rate ratio (IRR) is a meaningful effect measure in epidemiology if it is adjusted for all important confounders. For evaluation of the impact of adjustment, adjusted IRRs should be compared with crude IRRs. The aim of this methodological study was to present a statistical approach for testing the equality of adjusted and crude IRRs and to derive a confidence interval for the ratio of the two IRRs. The method can be extended to compare two differently adjusted IRRs and, thus, to evaluate the effect of additional adjustment. The method runs immediately on existing software. To illustrate the application of this approach, the authors studied adjusted IRRs for two risk factors of type 2 diabetes using data from the European Prospective Investigation into Cancer and Nutrition-Potsdam Study from 2005. The statistical method described may be helpful as an additional tool for analyzing epidemiologic cohort data and for interpreting results obtained from Cox regression models with adjustment for different covariates.
Area-to-point regression kriging for pan-sharpening
NASA Astrophysics Data System (ADS)
Wang, Qunming; Shi, Wenzhong; Atkinson, Peter M.
2016-04-01
Pan-sharpening is a technique to combine the fine spatial resolution panchromatic (PAN) band with the coarse spatial resolution multispectral bands of the same satellite to create a fine spatial resolution multispectral image. In this paper, area-to-point regression kriging (ATPRK) is proposed for pan-sharpening. ATPRK considers the PAN band as the covariate. Moreover, ATPRK is extended with a local approach, called adaptive ATPRK (AATPRK), which fits a regression model using a local, non-stationary scheme such that the regression coefficients change across the image. The two geostatistical approaches, ATPRK and AATPRK, were compared to the 13 state-of-the-art pan-sharpening approaches summarized in Vivone et al. (2015) in experiments on three separate datasets. ATPRK and AATPRK produced more accurate pan-sharpened images than the 13 benchmark algorithms in all three experiments. Unlike the benchmark algorithms, the two geostatistical solutions precisely preserved the spectral properties of the original coarse data. Furthermore, ATPRK can be enhanced by a local scheme in AATRPK, in cases where the residuals from a global regression model are such that their spatial character varies locally.
NASA Technical Reports Server (NTRS)
Chandra, N.
1974-01-01
Numerical coefficients required to express the angular distribution for the rotationally elastic or inelastic scattering of electrons from a diatomic molecule were tabulated for the case of nitrogen and in the energy range from 0.20 eV to 10.0 eV. Five different rotational states are considered.
Identities for generalized hypergeometric coefficients
Biedenharn, L.C.; Louck, J.D.
1991-01-01
Generalizations of hypergeometric functions to arbitrarily many symmetric variables are discussed, along with their associated hypergeometric coefficients, and the setting within which these generalizations arose. Identities generalizing the Euler identity for {sub 2}F{sub 1}, the Saalschuetz identity, and two generalizations of the {sub 4}F{sub 3} Bailey identity, among others, are given. 16 refs.
Effective Viscosity Coefficient of Nanosuspensions
NASA Astrophysics Data System (ADS)
Rudyak, V. Ya.; Belkin, A. A.; Egorov, V. V.
2008-12-01
Systematic calculations of the effective viscosity coefficient of nanosuspensions have been performed using the molecular dynamics method. It is established that the viscosity of a nanosuspension depends not only on the volume concentration of the nanoparticles but also on their mass and diameter. Differences from Einstein's relation are found even for nanosuspensions with a low particle concentration.
Integer Solutions of Binomial Coefficients
ERIC Educational Resources Information Center
Gilbertson, Nicholas J.
2016-01-01
A good formula is like a good story, rich in description, powerful in communication, and eye-opening to readers. The formula presented in this article for determining the coefficients of the binomial expansion of (x + y)n is one such "good read." The beauty of this formula is in its simplicity--both describing a quantitative situation…
Bayesian Unimodal Density Regression for Causal Inference
ERIC Educational Resources Information Center
Karabatsos, George; Walker, Stephen G.
2011-01-01
Karabatsos and Walker (2011) introduced a new Bayesian nonparametric (BNP) regression model. Through analyses of real and simulated data, they showed that the BNP regression model outperforms other parametric and nonparametric regression models of common use, in terms of predictive accuracy of the outcome (dependent) variable. The other,…
Developmental Regression in Autism Spectrum Disorders
ERIC Educational Resources Information Center
Rogers, Sally J.
2004-01-01
The occurrence of developmental regression in autism is one of the more puzzling features of this disorder. Although several studies have documented the validity of parental reports of regression using home videos, accumulating data suggest that most children who demonstrate regression also demonstrated previous, subtle, developmental differences.…
Regression Analysis by Example. 5th Edition
ERIC Educational Resources Information Center
Chatterjee, Samprit; Hadi, Ali S.
2012-01-01
Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. "Regression Analysis by Example, Fifth Edition" has been expanded and thoroughly…
Synthesizing Regression Results: A Factored Likelihood Method
ERIC Educational Resources Information Center
Wu, Meng-Jia; Becker, Betsy Jane
2013-01-01
Regression methods are widely used by researchers in many fields, yet methods for synthesizing regression results are scarce. This study proposes using a factored likelihood method, originally developed to handle missing data, to appropriately synthesize regression models involving different predictors. This method uses the correlations reported…
Streamflow forecasting using functional regression
NASA Astrophysics Data System (ADS)
Masselot, Pierre; Dabo-Niang, Sophie; Chebana, Fateh; Ouarda, Taha B. M. J.
2016-07-01
Streamflow, as a natural phenomenon, is continuous in time and so are the meteorological variables which influence its variability. In practice, it can be of interest to forecast the whole flow curve instead of points (daily or hourly). To this end, this paper introduces the functional linear models and adapts it to hydrological forecasting. More precisely, functional linear models are regression models based on curves instead of single values. They allow to consider the whole process instead of a limited number of time points or features. We apply these models to analyse the flow volume and the whole streamflow curve during a given period by using precipitations curves. The functional model is shown to lead to encouraging results. The potential of functional linear models to detect special features that would have been hard to see otherwise is pointed out. The functional model is also compared to the artificial neural network approach and the advantages and disadvantages of both models are discussed. Finally, future research directions involving the functional model in hydrology are presented.
Survival analysis and Cox regression.
Benítez-Parejo, N; Rodríguez del Águila, M M; Pérez-Vicente, S
2011-01-01
The data provided by clinical trials are often expressed in terms of survival. The analysis of survival comprises a series of statistical analytical techniques in which the measurements analysed represent the time elapsed between a given exposure and the outcome of a certain event. Despite the name of these techniques, the outcome in question does not necessarily have to be either survival or death, and may be healing versus no healing, relief versus pain, complication versus no complication, relapse versus no relapse, etc. The present article describes the analysis of survival from both a descriptive perspective, based on the Kaplan-Meier estimation method, and in terms of bivariate comparisons using the log-rank statistic. Likewise, a description is provided of the Cox regression models for the study of risk factors or covariables associated to the probability of survival. These models are defined in both simple and multiple forms, and a description is provided of how they are calculated and how the postulates for application are checked - accompanied by illustrating examples with the shareware application R.
Estimating equivalence with quantile regression
Cade, B.S.
2011-01-01
Equivalence testing and corresponding confidence interval estimates are used to provide more enlightened statistical statements about parameter estimates by relating them to intervals of effect sizes deemed to be of scientific or practical importance rather than just to an effect size of zero. Equivalence tests and confidence interval estimates are based on a null hypothesis that a parameter estimate is either outside (inequivalence hypothesis) or inside (equivalence hypothesis) an equivalence region, depending on the question of interest and assignment of risk. The former approach, often referred to as bioequivalence testing, is often used in regulatory settings because it reverses the burden of proof compared to a standard test of significance, following a precautionary principle for environmental protection. Unfortunately, many applications of equivalence testing focus on establishing average equivalence by estimating differences in means of distributions that do not have homogeneous variances. I discuss how to compare equivalence across quantiles of distributions using confidence intervals on quantile regression estimates that detect differences in heterogeneous distributions missed by focusing on means. I used one-tailed confidence intervals based on inequivalence hypotheses in a two-group treatment-control design for estimating bioequivalence of arsenic concentrations in soils at an old ammunition testing site and bioequivalence of vegetation biomass at a reclaimed mining site. Two-tailed confidence intervals based both on inequivalence and equivalence hypotheses were used to examine quantile equivalence for negligible trends over time for a continuous exponential model of amphibian abundance. ?? 2011 by the Ecological Society of America.
Forcing Regression through a Given Point Using Any Familiar Computational Routine.
1983-03-01
single independent variable; e.g., X = ( gHb ) 1/2 sin 2ab has two distinct carriers (Hb and ab) but is one independent variable (see example problem 1...and height of breaking waves (cib, Hb, respectively). v -20.7 m ( gHb )1/2 sin 2ab() The coefficient of proportionality (20.7) is based on typical... gHb )11 2 sin 2ab Regress Y on X to determine the best estimate of the coefficient of proportionality between X and Y. CORRECT RESULTS: Regression
Kala, Abhishek K.; Tiwari, Chetan; Mikler, Armin R.
2017-01-01
Background The primary aim of the study reported here was to determine the effectiveness of utilizing local spatial variations in environmental data to uncover the statistical relationships between West Nile Virus (WNV) risk and environmental factors. Because least squares regression methods do not account for spatial autocorrelation and non-stationarity of the type of spatial data analyzed for studies that explore the relationship between WNV and environmental determinants, we hypothesized that a geographically weighted regression model would help us better understand how environmental factors are related to WNV risk patterns without the confounding effects of spatial non-stationarity. Methods We examined commonly mapped environmental factors using both ordinary least squares regression (LSR) and geographically weighted regression (GWR). Both types of models were applied to examine the relationship between WNV-infected dead bird counts and various environmental factors for those locations. The goal was to determine which approach yielded a better predictive model. Results LSR efforts lead to identifying three environmental variables that were statistically significantly related to WNV infected dead birds (adjusted R2 = 0.61): stream density, road density, and land surface temperature. GWR efforts increased the explanatory value of these three environmental variables with better spatial precision (adjusted R2 = 0.71). Conclusions The spatial granularity resulting from the geographically weighted approach provides a better understanding of how environmental spatial heterogeneity is related to WNV risk as implied by WNV infected dead birds, which should allow improved planning of public health management strategies. PMID:28367364
Partitioning coefficients between olivine and silicate melts
NASA Astrophysics Data System (ADS)
Bédard, J. H.
2005-08-01
Variation of Nernst partition coefficients ( D) between olivine and silicate melts cannot be neglected when modeling partial melting and fractional crystallization. Published natural and experimental olivine/liquidD data were examined for covariation with pressure, temperature, olivine forsterite content, and melt SiO 2, H 2O, MgO and MgO/MgO + FeO total. Values of olivine/liquidD generally increase with decreasing temperature and melt MgO content, and with increasing melt SiO 2 content, but generally show poor correlations with other variables. Multi-element olivine/liquidD profiles calculated from regressions of D REE-Sc-Y vs. melt MgO content are compared to results of the Lattice Strain Model to link melt MgO and: D0 (the strain compensated partition coefficient), EM3+ (Young's Modulus), and r0 (the size of the M site). Ln D0 varies linearly with Ln MgO in the melt; EM3+ varies linearly with melt MgO, with a dog-leg at ca. 1.5% MgO; and r0 remains constant at 0.807 Å. These equations are then used to calculate olivine/liquidD for these elements using the Lattice Strain Model. These empirical parameterizations of olivine/liquidD variations yield results comparable to experimental or natural partitioning data, and can easily be integrated into existing trace element modeling algorithms. The olivine/liquidD data suggest that basaltic melts in equilibrium with pure olivine may acquire small negative Ta-Hf-Zr-Ti anomalies, but that negative Nb anomalies are unlikely to develop. Misfits between results of the Lattice Strain Model and most light rare earth and large ion lithophile partitioning data suggest that kinetic effects may limit the lower value of D for extremely incompatible elements in natural situations characterized by high cooling/crystallization rates.
Developmental regression in autism spectrum disorder.
Al Backer, Nouf Backer
2015-01-01
The occurrence of developmental regression in autism spectrum disorder (ASD) is one of the most puzzling phenomena of this disorder. A little is known about the nature and mechanism of developmental regression in ASD. About one-third of young children with ASD lose some skills during the preschool period, usually speech, but sometimes also nonverbal communication, social or play skills are also affected. There is a lot of evidence suggesting that most children who demonstrate regression also had previous, subtle, developmental differences. It is difficult to predict the prognosis of autistic children with developmental regression. It seems that the earlier development of social, language, and attachment behaviors followed by regression does not predict the later recovery of skills or better developmental outcomes. The underlying mechanisms that lead to regression in autism are unknown. The role of subclinical epilepsy in the developmental regression of children with autism remains unclear.
Spousal Adjustment to Myocardial Infarction.
ERIC Educational Resources Information Center
Ziglar, Elisa J.
This paper reviews the literature on the stresses and coping strategies of spouses of patients with myocardial infarction (MI). It attempts to identify specific problem areas of adjustment for the spouse and to explore the effects of spousal adjustment on patient recovery. Chapter one provides an overview of the importance in examining the…
Transport coefficients of quantum plasmas
Bennaceur, D.; Khalfaoui, A.H. )
1993-09-01
Transport coefficients of fully ionized plasmas with a weakly coupled, completely degenerate electron gas and classical ions with a wide range of coupling strength are expressed within the Bloch transport equation. Using the Kohler variational principle the collision integral of the quantum Boltzmann equation is derived, which accounts for quantum effects through collective plasma oscillations. The physical implications of the results are investigated through comparisons with other theories. For practical applications, electrical and thermal conductivities are derived in simple analytical formulas. The relation between these two transport coefficients is expressed in an explicit form, giving a generalized Wiedemann-Franz law, where the Lorentz ratio is a dependent function of the coupling parameter and the degree of degeneracy of the plasma.
High temperature Seebeck coefficient metrology
Martin, J.; Tritt, T.; Uher, C.
2010-12-15
We present an overview of the challenges and practices of thermoelectric metrology on bulk materials at high temperature (300 to 1300 K). The Seebeck coefficient, when combined with thermal and electrical conductivity, is an essential property measurement for evaluating the potential performance of novel thermoelectric materials. However, there is some question as to which measurement technique(s) provides the most accurate determination of the Seebeck coefficient at high temperature. This has led to the implementation of nonideal practices that have further complicated the confirmation of reported high ZT materials. To ensure meaningful interlaboratory comparison of data, thermoelectric measurements must be reliable, accurate, and consistent. This article will summarize and compare the relevant measurement techniques and apparatus designs required to effectively manage uncertainty, while also providing a reference resource of previous advances in high temperature thermoelectric metrology.
Consistent transport coefficients in astrophysics
NASA Technical Reports Server (NTRS)
Fontenla, Juan M.; Rovira, M.; Ferrofontan, C.
1986-01-01
A consistent theory for dealing with transport phenomena in stellar atmospheres starting with the kinetic equations and introducing three cases (LTE, partial LTE, and non-LTE) was developed. The consistent hydrodynamical equations were presented for partial-LTE, the transport coefficients defined, and a method shown to calculate them. The method is based on the numerical solution of kinetic equations considering Landau, Boltzmann, and Focker-Planck collision terms. Finally a set of results for the transport coefficients derived for a partially ionized hydrogen gas with radiation was shown, considering ionization and recombination as well as elastic collisions. The results obtained imply major changes is some types of theoretical model calculations and can resolve some important current problems concerning energy and mass balance in the solar atmosphere. It is shown that energy balance in the lower solar transition region can be fully explained by means of radiation losses and conductive flux.
Study of Dispersion Coefficient Channel
NASA Astrophysics Data System (ADS)
Akiyama, K. R.; Bressan, C. K.; Pires, M. S. G.; Canno, L. M.; Ribeiro, L. C. L. J.
2016-08-01
The issue of water pollution has worsened in recent times due to releases, intentional or not, of pollutants in natural water bodies. This causes several studies about the distribution of pollutants are carried out. The water quality models have been developed and widely used today as a preventative tool, ie to try to predict what will be the concentration distribution of constituent along a body of water in spatial and temporal scale. To understand and use such models, it is necessary to know some concepts of hydraulic high on their application, including the longitudinal dispersion coefficient. This study aims to conduct a theoretical and experimental study of the channel dispersion coefficient, yielding more information about their direct determination in the literature.
Portable vapor diffusion coefficient meter
Ho, Clifford K.
2007-06-12
An apparatus for measuring the effective vapor diffusion coefficient of a test vapor diffusing through a sample of porous media contained within a test chamber. A chemical sensor measures the time-varying concentration of vapor that has diffused a known distance through the porous media. A data processor contained within the apparatus compares the measured sensor data with analytical predictions of the response curve based on the transient diffusion equation using Fick's Law, iterating on the choice of an effective vapor diffusion coefficient until the difference between the predicted and measured curves is minimized. Optionally, a purge fluid can forced through the porous media, permitting the apparatus to also measure a gas-phase permeability. The apparatus can be made lightweight, self-powered, and portable for use in the field.
Subsonic Aircraft With Regression and Neural-Network Approximators Designed
NASA Technical Reports Server (NTRS)
Patnaik, Surya N.; Hopkins, Dale A.
2004-01-01
At the NASA Glenn Research Center, NASA Langley Research Center's Flight Optimization System (FLOPS) and the design optimization testbed COMETBOARDS with regression and neural-network-analysis approximators have been coupled to obtain a preliminary aircraft design methodology. For a subsonic aircraft, the optimal design, that is the airframe-engine combination, is obtained by the simulation. The aircraft is powered by two high-bypass-ratio engines with a nominal thrust of about 35,000 lbf. It is to carry 150 passengers at a cruise speed of Mach 0.8 over a range of 3000 n mi and to operate on a 6000-ft runway. The aircraft design utilized a neural network and a regression-approximations-based analysis tool, along with a multioptimizer cascade algorithm that uses sequential linear programming, sequential quadratic programming, the method of feasible directions, and then sequential quadratic programming again. Optimal aircraft weight versus the number of design iterations is shown. The central processing unit (CPU) time to solution is given. It is shown that the regression-method-based analyzer exhibited a smoother convergence pattern than the FLOPS code. The optimum weight obtained by the approximation technique and the FLOPS code differed by 1.3 percent. Prediction by the approximation technique exhibited no error for the aircraft wing area and turbine entry temperature, whereas it was within 2 percent for most other parameters. Cascade strategy was required by FLOPS as well as the approximators. The regression method had a tendency to hug the data points, whereas the neural network exhibited a propensity to follow a mean path. The performance of the neural network and regression methods was considered adequate. It was at about the same level for small, standard, and large models with redundancy ratios (defined as the number of input-output pairs to the number of unknown coefficients) of 14, 28, and 57, respectively. In an SGI octane workstation (Silicon Graphics
Convection coefficients at building surfaces
NASA Astrophysics Data System (ADS)
Kammerud, R. C.; Altmayer, E.; Bauman, F. S.; Gadgil, A.; Bohn, M.
1982-09-01
Correlations relating the rate of heat transfer from the surfaces of rooms to the enclosed air are being developed, based on empirical and analytic examinations of convection in enclosures. The correlations express the heat transfer rate in terms of boundary conditions relating to room geometry and surface temperatures. Work to date indicates that simple convection coefficient calculation techniques can be developed, which significantly improve accuracy of heat transfer predictions in comparison with the standard calculations recommended by ASHRAE.
Measurement of reaeration coefficients for selected Florida streams
Hampson, P.S.; Coffin, J.E.
1989-01-01
A total of 29 separate reaeration coefficient determinations were performed on 27 subreaches of 12 selected Florida streams between October 1981 and May 1985. Measurements performed prior to June 1984 were made using the peak and area methods with ethylene and propane as the tracer gases. Later measurements utilized the steady-state method with propane as the only tracer gas. The reaeration coefficients ranged from 1.07 to 45.9 days with a mean estimated probable error of +/16.7%. Ten predictive equations (compiled from the literature) were also evaluated using the measured coefficients. The most representative equation was one of the energy dissipation type with a standard error of 60.3%. Seven of the 10 predictive additional equations were modified using the measured coefficients and nonlinear regression techniques. The most accurate of the developed equations was also of the energy dissipation form and had a standard error of 54.9%. For 5 of the 13 subreaches in which both ethylene and propane were used, the ethylene data resulted in substantially larger reaeration coefficient values which were rejected. In these reaches, ethylene concentrations were probably significantly affected by one or more electrophilic addition reactions known to occur in aqueous media. (Author 's abstract)
Time series regression model for infectious disease and weather.
Imai, Chisato; Armstrong, Ben; Chalabi, Zaid; Mangtani, Punam; Hashizume, Masahiro
2015-10-01
Time series regression has been developed and long used to evaluate the short-term associations of air pollution and weather with mortality or morbidity of non-infectious diseases. The application of the regression approaches from this tradition to infectious diseases, however, is less well explored and raises some new issues. We discuss and present potential solutions for five issues often arising in such analyses: changes in immune population, strong autocorrelations, a wide range of plausible lag structures and association patterns, seasonality adjustments, and large overdispersion. The potential approaches are illustrated with datasets of cholera cases and rainfall from Bangladesh and influenza and temperature in Tokyo. Though this article focuses on the application of the traditional time series regression to infectious diseases and weather factors, we also briefly introduce alternative approaches, including mathematical modeling, wavelet analysis, and autoregressive integrated moving average (ARIMA) models. Modifications proposed to standard time series regression practice include using sums of past cases as proxies for the immune population, and using the logarithm of lagged disease counts to control autocorrelation due to true contagion, both of which are motivated from "susceptible-infectious-recovered" (SIR) models. The complexity of lag structures and association patterns can often be informed by biological mechanisms and explored by using distributed lag non-linear models. For overdispersed models, alternative distribution models such as quasi-Poisson and negative binomial should be considered. Time series regression can be used to investigate dependence of infectious diseases on weather, but may need modifying to allow for features specific to this context.
Incremental Aerodynamic Coefficient Database for the USA2
NASA Technical Reports Server (NTRS)
Richardson, Annie Catherine
2016-01-01
In March through May of 2016, a wind tunnel test was conducted by the Aerosciences Branch (EV33) to visually study the unsteady aerodynamic behavior over multiple transition geometries for the Universal Stage Adapter 2 (USA2) in the MSFC Aerodynamic Research Facility's Trisonic Wind Tunnel (TWT). The purpose of the test was to make a qualitative comparison of the transonic flow field in order to provide a recommended minimum transition radius for manufacturing. Additionally, 6 Degree of Freedom force and moment data for each configuration tested was acquired in order to determine the geometric effects on the longitudinal aerodynamic coefficients (Normal Force, Axial Force, and Pitching Moment). In order to make a quantitative comparison of the aerodynamic effects of the USA2 transition geometry, the aerodynamic coefficient data collected during the test was parsed and incorporated into a database for each USA2 configuration tested. An incremental aerodynamic coefficient database was then developed using the generated databases for each USA2 geometry as a function of Mach number and angle of attack. The final USA2 coefficient increments will be applied to the aerodynamic coefficients of the baseline geometry to adjust the Space Launch System (SLS) integrated launch vehicle force and moment database based on the transition geometry of the USA2.
Liu, Cong; Kolarik, Barbara; Gunnarsen, Lars; Zhang, Yinping
2015-10-20
Polychlorinated biphenyls (PCBs) have been found to be persistent in the environment and possibly harmful. Many buildings are characterized with high PCB concentrations. Knowledge about partitioning between primary sources and building materials is critical for exposure assessment and practical remediation of PCB contamination. This study develops a C-depth method to determine diffusion coefficient (D) and partition coefficient (K), two key parameters governing the partitioning process. For concrete, a primary material studied here, relative standard deviations of results among five data sets are 5%-22% for K and 42-66% for D. Compared with existing methods, C-depth method overcomes the inability to obtain unique estimation for nonlinear regression and does not require assumed correlations for D and K among congeners. Comparison with a more sophisticated two-term approach implies significant uncertainty for D, and smaller uncertainty for K. However, considering uncertainties associated with sampling and chemical analysis, and impact of environmental factors, the results are acceptable for engineering applications. This was supported by good agreement between model prediction and measurement. Sensitivity analysis indicated that effective diffusion distance, contacting time of materials with primary sources, and depth of measured concentrations are critical for determining D, and PCB concentration in primary sources is critical for K.
Comparative analysis of regression and artificial neural network models for wind speed prediction
NASA Astrophysics Data System (ADS)
Bilgili, Mehmet; Sahin, Besir
2010-11-01
In this study, wind speed was modeled by linear regression (LR), nonlinear regression (NLR) and artificial neural network (ANN) methods. A three-layer feedforward artificial neural network structure was constructed and a backpropagation algorithm was used for the training of ANNs. To get a successful simulation, firstly, the correlation coefficients between all of the meteorological variables (wind speed, ambient temperature, atmospheric pressure, relative humidity and rainfall) were calculated taking two variables in turn for each calculation. All independent variables were added to the simple regression model. Then, the method of stepwise multiple regression was applied for the selection of the “best” regression equation (model). Thus, the best independent variables were selected for the LR and NLR models and also used in the input layer of the ANN. The results obtained by all methods were compared to each other. Finally, the ANN method was found to provide better performance than the LR and NLR methods.
Sternberg, Maya R.; Schleicher, Rosemary L.; Pfeiffer, Christine M.
2016-01-01
The collection of papers in this journal supplement provides insight into the association of various covariates with concentrations of biochemical indicators of diet and nutrition (biomarkers), beyond age, race and sex using linear regression. We studied 10 specific sociodemographic and lifestyle covariates in combination with 29 biomarkers from NHANES 2003–2006 for persons ≥20 y. The covariates were organized into 2 chunks, sociodemographic (age, sex, race-ethnicity, education, and income) and lifestyle (dietary supplement use, smoking, alcohol consumption, BMI, and physical activity) and fit in hierarchical fashion using each chunk or set of related variables to determine how covariates, jointly, are related to biomarker concentrations. In contrast to many regression modeling applications, all variables were retained in a full regression model regardless of statistical significance to preserve the interpretation of the statistical properties of beta coefficients, P-values and CI, and to keep the interpretation consistent across a set of biomarkers. The variables were pre-selected prior to data analysis and the data analysis plan was designed at the outset to minimize the reporting of false positive findings by limiting the amount of preliminary hypothesis testing. While we generally found that demographic differences seen in biomarkers were over- or under-estimated when ignoring other key covariates, the demographic differences generally remained statistically significant after adjusting for sociodemographic and lifestyle variables. These papers are intended to provide a foundation to researchers to help them generate hypotheses for future studies or data analyses and/or develop predictive regression models using the wealth of NHANES data. PMID:23596165
Process modeling with the regression network.
van der Walt, T; Barnard, E; van Deventer, J
1995-01-01
A new connectionist network topology called the regression network is proposed. The structural and underlying mathematical features of the regression network are investigated. Emphasis is placed on the intricacies of the optimization process for the regression network and some measures to alleviate these difficulties of optimization are proposed and investigated. The ability of the regression network algorithm to perform either nonparametric or parametric optimization, as well as a combination of both, is also highlighted. It is further shown how the regression network can be used to model systems which are poorly understood on the basis of sparse data. A semi-empirical regression network model is developed for a metallurgical processing operation (a hydrocyclone classifier) by building mechanistic knowledge into the connectionist structure of the regression network model. Poorly understood aspects of the process are provided for by use of nonparametric regions within the structure of the semi-empirical connectionist model. The performance of the regression network model is compared to the corresponding generalization performance results obtained by some other nonparametric regression techniques.
Quantile regression applied to spectral distance decay
Rocchini, D.; Cade, B.S.
2008-01-01
Remotely sensed imagery has long been recognized as a powerful support for characterizing and estimating biodiversity. Spectral distance among sites has proven to be a powerful approach for detecting species composition variability. Regression analysis of species similarity versus spectral distance allows us to quantitatively estimate the amount of turnover in species composition with respect to spectral and ecological variability. In classical regression analysis, the residual sum of squares is minimized for the mean of the dependent variable distribution. However, many ecological data sets are characterized by a high number of zeroes that add noise to the regression model. Quantile regressions can be used to evaluate trend in the upper quantiles rather than a mean trend across the whole distribution of the dependent variable. In this letter, we used ordinary least squares (OLS) and quantile regressions to estimate the decay of species similarity versus spectral distance. The achieved decay rates were statistically nonzero (p < 0.01), considering both OLS and quantile regressions. Nonetheless, the OLS regression estimate of the mean decay rate was only half the decay rate indicated by the upper quantiles. Moreover, the intercept value, representing the similarity reached when the spectral distance approaches zero, was very low compared with the intercepts of the upper quantiles, which detected high species similarity when habitats are more similar. In this letter, we demonstrated the power of using quantile regressions applied to spectral distance decay to reveal species diversity patterns otherwise lost or underestimated by OLS regression. ?? 2008 IEEE.
Geodesic least squares regression on information manifolds
Verdoolaege, Geert
2014-12-05
We present a novel regression method targeted at situations with significant uncertainty on both the dependent and independent variables or with non-Gaussian distribution models. Unlike the classic regression model, the conditional distribution of the response variable suggested by the data need not be the same as the modeled distribution. Instead they are matched by minimizing the Rao geodesic distance between them. This yields a more flexible regression method that is less constrained by the assumptions imposed through the regression model. As an example, we demonstrate the improved resistance of our method against some flawed model assumptions and we apply this to scaling laws in magnetic confinement fusion.
Adjustable Induction-Heating Coil
NASA Technical Reports Server (NTRS)
Ellis, Rod; Bartolotta, Paul
1990-01-01
Improved design for induction-heating work coil facilitates optimization of heating in different metal specimens. Three segments adjusted independently to obtain desired distribution of temperature. Reduces time needed to achieve required temperature profiles.
Nozato, Hideaki; Morita, Shigeru; Goto, Motoshi
2005-07-15
Recycling coefficients of carbon, aluminum, and titanium were evaluated using a new technique combining impurity pellet injection with high-spatial resolution bremsstrahlung measurement in hydrogen and helium plasmas on the large helical device. The recycling coefficient of impurities was investigated by measuring absolute intensities with the visible bremsstrahlung array. The time evolution of the bremsstrahlung signals was modeled by an impurity transport code adjusting the diffusion coefficient, convective velocity, and recycling coefficient. As a result, a finite value of the recycling coefficient was required in the case of carbon, whereas aluminum and titanium were explained as nonrecycled particles. It was also clarified that the recycling coefficient of carbon had a larger value in hydrogen plasmas (R=0.5-0.65) than in helium plasmas (R=0-0.2), suggesting the formation of hydrogen molecules.
A Modified Gauss-Jordan Procedure as an Alternative to Iterative Procedures in Multiple Regression.
ERIC Educational Resources Information Center
Roscoe, John T.; Kittleson, Howard M.
Correlation matrices involving linear dependencies are common in educational research. In such matrices, there is no unique solution for the multiple regression coefficients. Although computer programs using iterative techniques are used to overcome this problem, these techniques possess certain disadvantages. Accordingly, a modified Gauss-Jordan…
Life adjustment correlates of physical self-concepts.
Sonstroem, R J; Potts, S A
1996-05-01
This research tested relationships between physical self-concepts and contemporary measures of life adjustment. University students (119 females, 126 males) completed the Physical Self-Perception Profile assessing self-concepts of sport competence, physical condition, attractive body, strength, and general physical self-worth. Multiple regression found significant associations (P < 0.05 to P < 0.001) in hypothesized directions between physical self-concepts and positive affect, negative affect, depression, and health complaints in 17 of 20 analyses. Thirteen of these relationships remained significant when controlling for the Bonferroni effect. Hierarchical multiple regression examined the unique contribution of physical self-perceptions in predicting each adjustment variable after accounting for the effects of global self-esteem and two measures of social desirability. Physical self-concepts significantly improved associations with life adjustment (P < 0.05 to P < 0.05) in three of the eight analyses across gender and approached significance in three others. These data demonstrate that self-perceptions of physical competence in college students are essentially related to life adjustment, independent of the effects of social desirability and global self-esteem. These links are mainly with perceptions of sport competence in males and with perceptions of physical condition, attractive body, and general physical self-worth in both males and females.
Integrating Risk Adjustment and Enrollee Premiums in Health Plan Payment
McGuire, Thomas G.; Glazer, Jacob; Newhouse, Joseph P.; Normand, Sharon-Lise; Shi, Julie; Sinaiko, Anna D.; Zuvekas, Samuel
2013-01-01
In two important health policy contexts – private plans in Medicare and the new state-run “Exchanges” created as part of the Affordable Care Act (ACA) – plan payments come from two sources: risk-adjusted payments from a Regulator and premiums charged to individual enrollees. This paper derives principles for integrating risk-adjusted payments and premium policy in individual health insurance markets based on fitting total plan payments to health plan costs per person as closely as possible. A least squares regression including both health status and variables used in premiums reveals the weights a Regulator should put on risk adjusters when markets determine premiums. We apply the methods to an Exchange-eligible population drawn from the Medical Expenditure Panel Survey (MEPS). PMID:24308878
Exploring Mexican American adolescent romantic relationship profiles and adjustment
Moosmann, Danyel A.V.; Roosa, Mark W.
2015-01-01
Although Mexican Americans are the largest ethnic minority group in the nation, knowledge is limited regarding this population's adolescent romantic relationships. This study explored whether 12th grade Mexican Americans’ (N = 218; 54% female) romantic relationship characteristics, cultural values, and gender created unique latent classes and if so, whether they were linked to adjustment. Latent class analyses suggested three profiles including, relatively speaking, higher, satisfactory, and lower quality romantic relationships. Regression analyses indicated these profiles had distinct associations with adjustment. Specifically, adolescents with higher and satisfactory quality romantic relationships reported greater future family expectations, higher self-esteem, and fewer externalizing symptoms than those with lower quality romantic relationships. Similarly, adolescents with higher quality romantic relationships reported greater academic self-efficacy and fewer sexual partners than those with lower quality romantic relationships. Overall, results suggested higher quality romantic relationships were most optimal for adjustment. Future research directions and implications are discussed. PMID:26141198
Predicting cognitive data from medical images using sparse linear regression.
Kandel, Benjamin M; Wolk, David A; Gee, James C; Avants, Brian
2013-01-01
We present a new framework for predicting cognitive or other continuous-variable data from medical images. Current methods of probing the connection between medical images and other clinical data typically use voxel-based mass univariate approaches. These approaches do not take into account the multivariate, network-based interactions between the various areas of the brain and do not give readily interpretable metrics that describe how strongly cognitive function is related to neuroanatomical structure. On the other hand, high-dimensional machine learning techniques do not typically provide a direct method for discovering which parts of the brain are used for making predictions. We present a framework, based on recent work in sparse linear regression, that addresses both drawbacks of mass univariate approaches, while preserving the direct spatial interpretability that they provide. In addition, we present a novel optimization algorithm that adapts the conjugate gradient method for sparse regression on medical imaging data. This algorithm produces coefficients that are more interpretable than existing sparse regression techniques.
Spatial Autocorrelation Approaches to Testing Residuals from Least Squares Regression.
Chen, Yanguang
2016-01-01
In geo-statistics, the Durbin-Watson test is frequently employed to detect the presence of residual serial correlation from least squares regression analyses. However, the Durbin-Watson statistic is only suitable for ordered time or spatial series. If the variables comprise cross-sectional data coming from spatial random sampling, the test will be ineffectual because the value of Durbin-Watson's statistic depends on the sequence of data points. This paper develops two new statistics for testing serial correlation of residuals from least squares regression based on spatial samples. By analogy with the new form of Moran's index, an autocorrelation coefficient is defined with a standardized residual vector and a normalized spatial weight matrix. Then by analogy with the Durbin-Watson statistic, two types of new serial correlation indices are constructed. As a case study, the two newly presented statistics are applied to a spatial sample of 29 China's regions. These results show that the new spatial autocorrelation models can be used to test the serial correlation of residuals from regression analysis. In practice, the new statistics can make up for the deficiencies of the Durbin-Watson test.
Barros, Aluísio JD; Hirakata, Vânia N
2003-01-01
Background Cross-sectional studies with binary outcomes analyzed by logistic regression are frequent in the epidemiological literature. However, the odds ratio can importantly overestimate the prevalence ratio, the measure of choice in these studies. Also, controlling for confounding is not equivalent for the two measures. In this paper we explore alternatives for modeling data of such studies with techniques that directly estimate the prevalence ratio. Methods We compared Cox regression with constant time at risk, Poisson regression and log-binomial regression against the standard Mantel-Haenszel estimators. Models with robust variance estimators in Cox and Poisson regressions and variance corrected by the scale parameter in Poisson regression were also evaluated. Results Three outcomes, from a cross-sectional study carried out in Pelotas, Brazil, with different levels of prevalence were explored: weight-for-age deficit (4%), asthma (31%) and mother in a paid job (52%). Unadjusted Cox/Poisson regression and Poisson regression with scale parameter adjusted by deviance performed worst in terms of interval estimates. Poisson regression with scale parameter adjusted by χ2 showed variable performance depending on the outcome prevalence. Cox/Poisson regression with robust variance, and log-binomial regression performed equally well when the model was correctly specified. Conclusions Cox or Poisson regression with robust variance and log-binomial regression provide correct estimates and are a better alternative for the analysis of cross-sectional studies with binary outcomes than logistic regression, since the prevalence ratio is more interpretable and easier to communicate to non-specialists than the odds ratio. However, precautions are needed to avoid estimation problems in specific situations. PMID:14567763
Applying land use regression model to estimate spatial variation of PM₂.₅ in Beijing, China.
Wu, Jiansheng; Li, Jiacheng; Peng, Jian; Li, Weifeng; Xu, Guang; Dong, Chengcheng
2015-05-01
Fine particulate matter (PM2.5) is the major air pollutant in Beijing, posing serious threats to human health. Land use regression (LUR) has been widely used in predicting spatiotemporal variation of ambient air-pollutant concentrations, though restricted to the European and North American context. We aimed to estimate spatiotemporal variations of PM2.5 by building separate LUR models in Beijing. Hourly routine PM2.5 measurements were collected at 35 sites from 4th March 2013 to 5th March 2014. Seventy-seven predictor variables were generated in GIS, including street network, land cover, population density, catering services distribution, bus stop density, intersection density, and others. Eight LUR models were developed on annual, seasonal, peak/non-peak, and incremental concentration subsets. The annual mean concentration across all sites is 90.7 μg/m(3) (SD = 13.7). PM2.5 shows more temporal variation than spatial variation, indicating the necessity of building different models to capture spatiotemporal trends. The adjusted R (2) of these models range between 0.43 and 0.65. Most LUR models are driven by significant predictors including major road length, vegetation, and water land use. Annual outdoor exposure in Beijing is as high as 96.5 μg/m(3). This is among the first LUR studies implemented in a seriously air-polluted Chinese context, which generally produce acceptable results and reliable spatial air-pollution maps. Apart from the models for winter and incremental concentration, LUR models are driven by similar variables, suggesting that the spatial variations of PM2.5 remain steady for most of the time. Temporal variations are explained by the intercepts, and spatial variations in the measurements determine the strength of variable coefficients in our models.
A Robust Multiple Correlation Coefficient for the Rank Analysis of Linear Models.
1983-09-01
A multiple correlation coefficient is discussed to measure the degree of association between a random variable Y and a set of random variables X sub...approach of analyzing linear models in a regression, prediction context. The population parameter equals the classical multiple correlation ... coefficient if the multivariate normal model holds but would be more robust for departures from this model. Some results are given on the consistency of the sample estimate and on a test for independence. (Author)
FITTING OF THE DATA FOR DIFFUSION COEFFICIENTS IN UNSATURATED POROUS MEDIA
B. Bullard
1999-05-01
The purpose of this calculation is to evaluate diffusion coefficients in unsaturated porous media for use in the TSPA-VA analyses. Using experimental data, regression techniques were used to curve fit the diffusion coefficient in unsaturated porous media as a function of volumetric water content. This calculation substantiates the model fit used in Total System Performance Assessment-1995 An Evaluation of the Potential Yucca Mountain Repository (TSPA-1995), Section 6.5.4.
NASA Astrophysics Data System (ADS)
Mitra, Ashis; Majumdar, Prabal Kumar; Bannerjee, Debamalya
2013-03-01
This paper presents a comparative analysis of two modeling methodologies for the prediction of air permeability of plain woven handloom cotton fabrics. Four basic fabric constructional parameters namely ends per inch, picks per inch, warp count and weft count have been used as inputs for artificial neural network (ANN) and regression models. Out of the four regression models tried, interaction model showed very good prediction performance with a meager mean absolute error of 2.017 %. However, ANN models demonstrated superiority over the regression models both in terms of correlation coefficient and mean absolute error. The ANN model with 10 nodes in the single hidden layer showed very good correlation coefficient of 0.982 and 0.929 and mean absolute error of only 0.923 and 2.043 % for training and testing data respectively.
Suppression Situations in Multiple Linear Regression
ERIC Educational Resources Information Center
Shieh, Gwowen
2006-01-01
This article proposes alternative expressions for the two most prevailing definitions of suppression without resorting to the standardized regression modeling. The formulation provides a simple basis for the examination of their relationship. For the two-predictor regression, the author demonstrates that the previous results in the literature are…
Regression Analysis: Legal Applications in Institutional Research
ERIC Educational Resources Information Center
Frizell, Julie A.; Shippen, Benjamin S., Jr.; Luna, Andrew L.
2008-01-01
This article reviews multiple regression analysis, describes how its results should be interpreted, and instructs institutional researchers on how to conduct such analyses using an example focused on faculty pay equity between men and women. The use of multiple regression analysis will be presented as a method with which to compare salaries of…
Principles of Quantile Regression and an Application
ERIC Educational Resources Information Center
Chen, Fang; Chalhoub-Deville, Micheline
2014-01-01
Newer statistical procedures are typically introduced to help address the limitations of those already in practice or to deal with emerging research needs. Quantile regression (QR) is introduced in this paper as a relatively new methodology, which is intended to overcome some of the limitations of least squares mean regression (LMR). QR is more…
Three-Dimensional Modeling in Linear Regression.
ERIC Educational Resources Information Center
Herman, James D.
Linear regression examines the relationship between one or more independent (predictor) variables and a dependent variable. By using a particular formula, regression determines the weights needed to minimize the error term for a given set of predictors. With one predictor variable, the relationship between the predictor and the dependent variable…
A Practical Guide to Regression Discontinuity
ERIC Educational Resources Information Center
Jacob, Robin; Zhu, Pei; Somers, Marie-Andrée; Bloom, Howard
2012-01-01
Regression discontinuity (RD) analysis is a rigorous nonexperimental approach that can be used to estimate program impacts in situations in which candidates are selected for treatment based on whether their value for a numeric rating exceeds a designated threshold or cut-point. Over the last two decades, the regression discontinuity approach has…
Regression Analysis and the Sociological Imagination
ERIC Educational Resources Information Center
De Maio, Fernando
2014-01-01
Regression analysis is an important aspect of most introductory statistics courses in sociology but is often presented in contexts divorced from the central concerns that bring students into the discipline. Consequently, we present five lesson ideas that emerge from a regression analysis of income inequality and mortality in the USA and Canada.
Does an active adjustment of aerodynamic drag make sense?
NASA Astrophysics Data System (ADS)
Maciejewski, Marek
2016-09-01
The article concerns evaluation of the possible impact of the gap between the tractor and semitrailer on the aerodynamic drag coefficient. The aim here is not to adjust this distance depending on the geometrical shape of the tractor and trailer, but depending solely on the speed of articulated vehicle. All the tests have form of numerical simulations. The method of simulation is briefly explained in the article. It considers various issues such as the range and objects of tests as well as the test conditions. The initial (pre-adaptive) and final (after adaptation process) computational meshes have been presented as illustrations. Some of the results have been presented in the form of run chart showing the change of value of aerodynamic drag coefficients in time, for different geometric configurations defined by a clearance gap between the tractor and semitrailer. The basis for a detailed analysis and conclusions were the averaged (in time) aerodynamic drag coefficients as a function of the clearance gap.
Atherosclerotic plaque regression: fact or fiction?
Shanmugam, Nesan; Román-Rego, Ana; Ong, Peter; Kaski, Juan Carlos
2010-08-01
Coronary artery disease is the major cause of death in the western world. The formation and rapid progression of atheromatous plaques can lead to serious cardiovascular events in patients with atherosclerosis. The better understanding, in recent years, of the mechanisms leading to atheromatous plaque growth and disruption and the availability of powerful HMG CoA-reductase inhibitors (statins) has permitted the consideration of plaque regression as a realistic therapeutic goal. This article reviews the existing evidence underpinning current therapeutic strategies aimed at achieving atherosclerotic plaque regression. In this review we also discuss imaging modalities for the assessment of plaque regression, predictors of regression and whether plaque regression is associated with a survival benefit.
Should metacognition be measured by logistic regression?
Rausch, Manuel; Zehetleitner, Michael
2017-03-01
Are logistic regression slopes suitable to quantify metacognitive sensitivity, i.e. the efficiency with which subjective reports differentiate between correct and incorrect task responses? We analytically show that logistic regression slopes are independent from rating criteria in one specific model of metacognition, which assumes (i) that rating decisions are based on sensory evidence generated independently of the sensory evidence used for primary task responses and (ii) that the distributions of evidence are logistic. Given a hierarchical model of metacognition, logistic regression slopes depend on rating criteria. According to all considered models, regression slopes depend on the primary task criterion. A reanalysis of previous data revealed that massive numbers of trials are required to distinguish between hierarchical and independent models with tolerable accuracy. It is argued that researchers who wish to use logistic regression as measure of metacognitive sensitivity need to control the primary task criterion and rating criteria.
Higher order asymptotics for negative binomial regression inferences from RNA-sequencing data.
Di, Yanming; Emerson, Sarah C; Schafer, Daniel W; Kimbrel, Jeffrey A; Chang, Jeff H
2013-03-26
RNA sequencing (RNA-Seq) is the current method of choice for characterizing transcriptomes and quantifying gene expression changes. This next generation sequencing-based method provides unprecedented depth and resolution. The negative binomial (NB) probability distribution has been shown to be a useful model for frequencies of mapped RNA-Seq reads and consequently provides a basis for statistical analysis of gene expression. Negative binomial exact tests are available for two-group comparisons but do not extend to negative binomial regression analysis, which is important for examining gene expression as a function of explanatory variables and for adjusted group comparisons accounting for other factors. We address the adequacy of available large-sample tests for the small sample sizes typically available from RNA-Seq studies and consider a higher-order asymptotic (HOA) adjustment to likelihood ratio tests. We demonstrate that 1) the HOA-adjusted likelihood ratio test is practically indistinguishable from the exact test in situations where the exact test is available, 2) the type I error of the HOA test matches the nominal specification in regression settings we examined via simulation, and 3) the power of the likelihood ratio test does not appear to be affected by the HOA adjustment. This work helps clarify the accuracy of the unadjusted likelihood ratio test and the degree of improvement available with the HOA adjustment. Furthermore, the HOA test may be preferable even when the exact test is available because it does not require ad hoc library size adjustments.
Haem, Elham; Harling, Kajsa; Ayatollahi, Seyyed Mohammad Taghi; Zare, Najaf; Karlsson, Mats O
2017-02-01
One important aim in population pharmacokinetics (PK) and pharmacodynamics is identification and quantification of the relationships between the parameters and covariates. Lasso has been suggested as a technique for simultaneous estimation and covariate selection. In linear regression, it has been shown that Lasso possesses no oracle properties, which means it asymptotically performs as though the true underlying model was given in advance. Adaptive Lasso (ALasso) with appropriate initial weights is claimed to possess oracle properties; however, it can lead to poor predictive performance when there is multicollinearity between covariates. This simulation study implemented a new version of ALasso, called adjusted ALasso (AALasso), to take into account the ratio of the standard error of the maximum likelihood (ML) estimator to the ML coefficient as the initial weight in ALasso to deal with multicollinearity in non-linear mixed-effect models. The performance of AALasso was compared with that of ALasso and Lasso. PK data was simulated in four set-ups from a one-compartment bolus input model. Covariates were created by sampling from a multivariate standard normal distribution with no, low (0.2), moderate (0.5) or high (0.7) correlation. The true covariates influenced only clearance at different magnitudes. AALasso, ALasso and Lasso were compared in terms of mean absolute prediction error and error of the estimated covariate coefficient. The results show that AALasso performed better in small data sets, even in those in which a high correlation existed between covariates. This makes AALasso a promising method for covariate selection in nonlinear mixed-effect models.
Almost efficient estimation of relative risk regression
Fitzmaurice, Garrett M.; Lipsitz, Stuart R.; Arriaga, Alex; Sinha, Debajyoti; Greenberg, Caprice; Gawande, Atul A.
2014-01-01
Relative risks (RRs) are often considered the preferred measures of association in prospective studies, especially when the binary outcome of interest is common. In particular, many researchers regard RRs to be more intuitively interpretable than odds ratios. Although RR regression is a special case of generalized linear models, specifically with a log link function for the binomial (or Bernoulli) outcome, the resulting log-binomial regression does not respect the natural parameter constraints. Because log-binomial regression does not ensure that predicted probabilities are mapped to the [0,1] range, maximum likelihood (ML) estimation is often subject to numerical instability that leads to convergence problems. To circumvent these problems, a number of alternative approaches for estimating RR regression parameters have been proposed. One approach that has been widely studied is the use of Poisson regression estimating equations. The estimating equations for Poisson regression yield consistent, albeit inefficient, estimators of the RR regression parameters. We consider the relative efficiency of the Poisson regression estimator and develop an alternative, almost efficient estimator for the RR regression parameters. The proposed method uses near-optimal weights based on a Maclaurin series (Taylor series expanded around zero) approximation to the true Bernoulli or binomial weight function. This yields an almost efficient estimator while avoiding convergence problems. We examine the asymptotic relative efficiency of the proposed estimator for an increase in the number of terms in the series. Using simulations, we demonstrate the potential for convergence problems with standard ML estimation of the log-binomial regression model and illustrate how this is overcome using the proposed estimator. We apply the proposed estimator to a study of predictors of pre-operative use of beta blockers among patients undergoing colorectal surgery after diagnosis of colon cancer. PMID
SCALE Continuous-Energy Eigenvalue Sensitivity Coefficient Calculations
Perfetti, Christopher M.; Rearden, Bradley T.; Martin, William R.
2016-02-25
Sensitivity coefficients describe the fractional change in a system response that is induced by changes to system parameters and nuclear data. The Tools for Sensitivity and UNcertainty Analysis Methodology Implementation (TSUNAMI) code within the SCALE code system makes use of eigenvalue sensitivity coefficients for an extensive number of criticality safety applications, including quantifying the data-induced uncertainty in the eigenvalue of critical systems, assessing the neutronic similarity between different critical systems, and guiding nuclear data adjustment studies. The need to model geometrically complex systems with improved fidelity and the desire to extend TSUNAMI analysis to advanced applications has motivated the developmentmore » of a methodology for calculating sensitivity coefficients in continuous-energy (CE) Monte Carlo applications. The Contributon-Linked eigenvalue sensitivity/Uncertainty estimation via Tracklength importance CHaracterization (CLUTCH) and Iterated Fission Probability (IFP) eigenvalue sensitivity methods were recently implemented in the CE-KENO framework of the SCALE code system to enable TSUNAMI-3D to perform eigenvalue sensitivity calculations using continuous-energy Monte Carlo methods. This work provides a detailed description of the theory behind the CLUTCH method and describes in detail its implementation. This work explores the improvements in eigenvalue sensitivity coefficient accuracy that can be gained through the use of continuous-energy sensitivity methods and also compares several sensitivity methods in terms of computational efficiency and memory requirements.« less
SCALE Continuous-Energy Eigenvalue Sensitivity Coefficient Calculations
Perfetti, Christopher M.; Rearden, Bradley T.; Martin, William R.
2016-02-25
Sensitivity coefficients describe the fractional change in a system response that is induced by changes to system parameters and nuclear data. The Tools for Sensitivity and UNcertainty Analysis Methodology Implementation (TSUNAMI) code within the SCALE code system makes use of eigenvalue sensitivity coefficients for an extensive number of criticality safety applications, including quantifying the data-induced uncertainty in the eigenvalue of critical systems, assessing the neutronic similarity between different critical systems, and guiding nuclear data adjustment studies. The need to model geometrically complex systems with improved fidelity and the desire to extend TSUNAMI analysis to advanced applications has motivated the development of a methodology for calculating sensitivity coefficients in continuous-energy (CE) Monte Carlo applications. The Contributon-Linked eigenvalue sensitivity/Uncertainty estimation via Tracklength importance CHaracterization (CLUTCH) and Iterated Fission Probability (IFP) eigenvalue sensitivity methods were recently implemented in the CE-KENO framework of the SCALE code system to enable TSUNAMI-3D to perform eigenvalue sensitivity calculations using continuous-energy Monte Carlo methods. This work provides a detailed description of the theory behind the CLUTCH method and describes in detail its implementation. This work explores the improvements in eigenvalue sensitivity coefficient accuracy that can be gained through the use of continuous-energy sensitivity methods and also compares several sensitivity methods in terms of computational efficiency and memory requirements.
Wang, Ming; Flanders, W Dana; Bostick, Roberd M; Long, Qi
2012-12-20
Measurement error is common in epidemiological and biomedical studies. When biomarkers are measured in batches or groups, measurement error is potentially correlated within each batch or group. In regression analysis, most existing methods are not applicable in the presence of batch-specific measurement error in predictors. We propose a robust conditional likelihood approach to account for batch-specific error in predictors when batch effect is additive and the predominant source of error, which requires no assumptions on the distribution of measurement error. Although a regression model with batch as a categorical covariable yields the same parameter estimates as the proposed conditional likelihood approach for linear regression, this result does not hold in general for all generalized linear models, in particular, logistic regression. Our simulation studies show that the conditional likelihood approach achieves better finite sample performance than the regression calibration approach or a naive approach without adjustment for measurement error. In the case of logistic regression, our proposed approach is shown to also outperform the regression approach with batch as a categorical covariate. In addition, we also examine a 'hybrid' approach combining the conditional likelihood method and the regression calibration method, which is shown in simulations to achieve good performance in the presence of both batch-specific and measurement-specific errors. We illustrate our method by using data from a colorectal adenoma study.
NASA Astrophysics Data System (ADS)
Zhang, Ying; Bi, Peng; Hiller, Janet
2008-01-01
This is the first study to identify appropriate regression models for the association between climate variation and salmonellosis transmission. A comparison between different regression models was conducted using surveillance data in Adelaide, South Australia. By using notified salmonellosis cases and climatic variables from the Adelaide metropolitan area over the period 1990-2003, four regression methods were examined: standard Poisson regression, autoregressive adjusted Poisson regression, multiple linear regression, and a seasonal autoregressive integrated moving average (SARIMA) model. Notified salmonellosis cases in 2004 were used to test the forecasting ability of the four models. Parameter estimation, goodness-of-fit and forecasting ability of the four regression models were compared. Temperatures occurring 2 weeks prior to cases were positively associated with cases of salmonellosis. Rainfall was also inversely related to the number of cases. The comparison of the goodness-of-fit and forecasting ability suggest that the SARIMA model is better than the other three regression models. Temperature and rainfall may be used as climatic predictors of salmonellosis cases in regions with climatic characteristics similar to those of Adelaide. The SARIMA model could, thus, be adopted to quantify the relationship between climate variations and salmonellosis transmission.
A visual detection model for DCT coefficient quantization
NASA Technical Reports Server (NTRS)
Ahumada, Albert J., Jr.; Watson, Andrew B.
1994-01-01
The discrete cosine transform (DCT) is widely used in image compression and is part of the JPEG and MPEG compression standards. The degree of compression and the amount of distortion in the decompressed image are controlled by the quantization of the transform coefficients. The standards do not specify how the DCT coefficients should be quantized. One approach is to set the quantization level for each coefficient so that the quantization error is near the threshold of visibility. Results from previous work are combined to form the current best detection model for DCT coefficient quantization noise. This model predicts sensitivity as a function of display parameters, enabling quantization matrices to be designed for display situations varying in luminance, veiling light, and spatial frequency related conditions (pixel size, viewing distance, and aspect ratio). It also allows arbitrary color space directions for the representation of color. A model-based method of optimizing the quantization matrix for an individual image was developed. The model described above provides visual thresholds for each DCT frequency. These thresholds are adjusted within each block for visual light adaptation and contrast masking. For given quantization matrix, the DCT quantization errors are scaled by the adjusted thresholds to yield perceptual errors. These errors are pooled nonlinearly over the image to yield total perceptual error. With this model one may estimate the quantization matrix for a particular image that yields minimum bit rate for a given total perceptual error, or minimum perceptual error for a given bit rate. Custom matrices for a number of images show clear improvement over image-independent matrices. Custom matrices are compatible with the JPEG standard, which requires transmission of the quantization matrix.
Adjusting to Chronic Health Conditions.
Helgeson, Vicki S; Zajdel, Melissa
2017-01-03
Research on adjustment to chronic disease is critical in today's world, in which people are living longer lives, but lives are increasingly likely to be characterized by one or more chronic illnesses. Chronic illnesses may deteriorate, enter remission, or fluctuate, but their defining characteristic is that they persist. In this review, we first examine the effects of chronic disease on one's sense of self. Then we review categories of factors that influence how one adjusts to chronic illness, with particular emphasis on the impact of these factors on functional status and psychosocial adjustment. We begin with contextual factors, including demographic variables such as sex and race, as well as illness dimensions such as stigma and illness identity. We then examine a set of dispositional factors that influence chronic illness adjustment, organizing these into resilience and vulnerability factors. Resilience factors include cognitive adaptation indicators, personality variables, and benefit-finding. Vulnerability factors include a pessimistic attributional style, negative gender-related traits, and rumination. We then turn to social environmental variables, including both supportive and unsupportive interactions. Finally, we review chronic illness adjustment within the context of dyadic coping. We conclude by examining potential interactions among these classes of variables and outlining a set of directions for future research.
Li, L; Kleinman, K; Gillman, M W
2014-12-01
We implemented six confounding adjustment methods: (1) covariate-adjusted regression, (2) propensity score (PS) regression, (3) PS stratification, (4) PS matching with two calipers, (5) inverse probability weighting and (6) doubly robust estimation to examine the associations between the body mass index (BMI) z-score at 3 years and two separate dichotomous exposure measures: exclusive breastfeeding v. formula only (n=437) and cesarean section v. vaginal delivery (n=1236). Data were drawn from a prospective pre-birth cohort study, Project Viva. The goal is to demonstrate the necessity and usefulness, and approaches for multiple confounding adjustment methods to analyze observational data. Unadjusted (univariate) and covariate-adjusted linear regression associations of breastfeeding with BMI z-score were -0.33 (95% CI -0.53, -0.13) and -0.24 (-0.46, -0.02), respectively. The other approaches resulted in smaller n (204-276) because of poor overlap of covariates, but CIs were of similar width except for inverse probability weighting (75% wider) and PS matching with a wider caliper (76% wider). Point estimates ranged widely, however, from -0.01 to -0.38. For cesarean section, because of better covariate overlap, the covariate-adjusted regression estimate (0.20) was remarkably robust to all adjustment methods, and the widths of the 95% CIs differed less than in the breastfeeding example. Choice of covariate adjustment method can matter. Lack of overlap in covariate structure between exposed and unexposed participants in observational studies can lead to erroneous covariate-adjusted estimates and confidence intervals. We recommend inspecting covariate overlap and using multiple confounding adjustment methods. Similar results bring reassurance. Contradictory results suggest issues with either the data or the analytic method.
Kleinman, Ken; Gillman, Matthew W.
2014-01-01
We implemented 6 confounding adjustment methods: 1) covariate-adjusted regression, 2) propensity score (PS) regression, 3) PS stratification, 4) PS matching with two calipers, 5) inverse-probability-weighting, and 6) doubly-robust estimation to examine the associations between the BMI z-score at 3 years and two separate dichotomous exposure measures: exclusive breastfeeding versus formula only (N = 437) and cesarean section versus vaginal delivery (N = 1236). Data were drawn from a prospective pre-birth cohort study, Project Viva. The goal is to demonstrate the necessity and usefulness, and approaches for multiple confounding adjustment methods to analyze observational data. Unadjusted (univariate) and covariate-adjusted linear regression associations of breastfeeding with BMI z-score were −0.33 (95% CI −0.53, −0.13) and −0.24 (−0.46, −0.02), respectively. The other approaches resulted in smaller N (204 to 276) because of poor overlap of covariates, but CIs were of similar width except for inverse-probability-weighting (75% wider) and PS matching with a wider caliper (76% wider). Point estimates ranged widely, however, from −0.01 to −0.38. For cesarean section, because of better covariate overlap, the covariate-adjusted regression estimate (0.20) was remarkably robust to all adjustment methods, and the widths of the 95% CIs differed less than in the breastfeeding example. Choice of covariate adjustment method can matter. Lack of overlap in covariate structure between exposed and unexposed participants in observational studies can lead to erroneous covariate-adjusted estimates and confidence intervals. We recommend inspecting covariate overlap and using multiple confounding adjustment methods. Similar results bring reassurance. Contradictory results suggest issues with either the data or the analytic method. PMID:25171142
Busch, Robyn M; Lineweaver, Tara T; Ferguson, Lisa; Haut, Jennifer S
2015-06-01
Reliable change indices (RCIs) and standardized regression-based (SRB) change score norms permit evaluation of meaningful changes in test scores following treatment interventions, like epilepsy surgery, while accounting for test-retest reliability, practice effects, score fluctuations due to error, and relevant clinical and demographic factors. Although these methods are frequently used to assess cognitive change after epilepsy surgery in adults, they have not been widely applied to examine cognitive change in children with epilepsy. The goal of the current study was to develop RCIs and SRB change score norms for use in children with epilepsy. Sixty-three children with epilepsy (age range: 6-16; M=10.19, SD=2.58) underwent comprehensive neuropsychological evaluations at two time points an average of 12 months apart. Practice effect-adjusted RCIs and SRB change score norms were calculated for all cognitive measures in the battery. Practice effects were quite variable across the neuropsychological measures, with the greatest differences observed among older children, particularly on the Children's Memory Scale and Wisconsin Card Sorting Test. There was also notable variability in test-retest reliabilities across measures in the battery, with coefficients ranging from 0.14 to 0.92. Reliable change indices and SRB change score norms for use in assessing meaningful cognitive change in children following epilepsy surgery are provided for measures with reliability coefficients above 0.50. This is the first study to provide RCIs and SRB change score norms for a comprehensive neuropsychological battery based on a large sample of children with epilepsy. Tables to aid in evaluating cognitive changes in children who have undergone epilepsy surgery are provided for clinical use. An Excel sheet to perform all relevant calculations is also available to interested clinicians or researchers.
Technology Transfer Automated Retrieval System (TEKTRAN)
Illegal use of nitrogen-rich melamine (C3H6N6) to boost perceived protein content of food products such as milk, infant formula, frozen yogurt, pet food, biscuits, and coffee drinks has caused serious food safety problems. Conventional methods to detect melamine in foods, such as Enzyme-linked immun...
NASA Astrophysics Data System (ADS)
Pradhan, Biswajeet
2010-05-01
This paper presents the results of the cross-validation of a multivariate logistic regression model using remote sensing data and GIS for landslide hazard analysis on the Penang, Cameron, and Selangor areas in Malaysia. Landslide locations in the study areas were identified by interpreting aerial photographs and satellite images, supported by field surveys. SPOT 5 and Landsat TM satellite imagery were used to map landcover and vegetation index, respectively. Maps of topography, soil type, lineaments and land cover were constructed from the spatial datasets. Ten factors which influence landslide occurrence, i.e., slope, aspect, curvature, distance from drainage, lithology, distance from lineaments, soil type, landcover, rainfall precipitation, and normalized difference vegetation index (ndvi), were extracted from the spatial database and the logistic regression coefficient of each factor was computed. Then the landslide hazard was analysed using the multivariate logistic regression coefficients derived not only from the data for the respective area but also using the logistic regression coefficients calculated from each of the other two areas (nine hazard maps in all) as a cross-validation of the model. For verification of the model, the results of the analyses were then compared with the field-verified landslide locations. Among the three cases of the application of logistic regression coefficient in the same study area, the case of Selangor based on the Selangor logistic regression coefficients showed the highest accuracy (94%), where as Penang based on the Penang coefficients showed the lowest accuracy (86%). Similarly, among the six cases from the cross application of logistic regression coefficient in other two areas, the case of Selangor based on logistic coefficient of Cameron showed highest (90%) prediction accuracy where as the case of Penang based on the Selangor logistic regression coefficients showed the lowest accuracy (79%). Qualitatively, the cross
Regression modeling of ground-water flow
Cooley, R.L.; Naff, R.L.
1985-01-01
Nonlinear multiple regression methods are developed to model and analyze groundwater flow systems. Complete descriptions of regression methodology as applied to groundwater flow models allow scientists and engineers engaged in flow modeling to apply the methods to a wide range of problems. Organization of the text proceeds from an introduction that discusses the general topic of groundwater flow modeling, to a review of basic statistics necessary to properly apply regression techniques, and then to the main topic: exposition and use of linear and nonlinear regression to model groundwater flow. Statistical procedures are given to analyze and use the regression models. A number of exercises and answers are included to exercise the student on nearly all the methods that are presented for modeling and statistical analysis. Three computer programs implement the more complex methods. These three are a general two-dimensional, steady-state regression model for flow in an anisotropic, heterogeneous porous medium, a program to calculate a measure of model nonlinearity with respect to the regression parameters, and a program to analyze model errors in computed dependent variables such as hydraulic head. (USGS)
MCCB warm adjustment testing concept
NASA Astrophysics Data System (ADS)
Erdei, Z.; Horgos, M.; Grib, A.; Preradović, D. M.; Rodic, V.
2016-08-01
This paper presents an experimental investigation in to operating of thermal protection device behavior from an MCCB (Molded Case Circuit Breaker). One of the main functions of the circuit breaker is to assure protection for the circuits where mounted in for possible overloads of the circuit. The tripping mechanism for the overload protection is based on a bimetal movement during a specific time frame. This movement needs to be controlled and as a solution to control this movement we choose the warm adjustment concept. This concept is meant to improve process capability control and final output. The warm adjustment device design will create a unique adjustment of the bimetal position for each individual breaker, determined when the testing current will flow thru a phase which needs to trip in a certain amount of time. This time is predetermined due to scientific calculation for all standard types of amperages and complies with the IEC 60497 standard requirements.
The emission coefficient of uranium plasmas
NASA Technical Reports Server (NTRS)
Schneider, R. T.; Campbell, H. D.; Mack, J. M.
1973-01-01
The emission coefficient for uranium plasmas (Temperature: 8000 K) was measured for the wavelength range (200 A - 6000 A). The results are compared to theory and other measurements. The absorption coefficient for the same wavelength interval is also given.
Relative risk regression analysis of epidemiologic data.
Prentice, R L
1985-11-01
Relative risk regression methods are described. These methods provide a unified approach to a range of data analysis problems in environmental risk assessment and in the study of disease risk factors more generally. Relative risk regression methods are most readily viewed as an outgrowth of Cox's regression and life model. They can also be viewed as a regression generalization of more classical epidemiologic procedures, such as that due to Mantel and Haenszel. In the context of an epidemiologic cohort study, relative risk regression methods extend conventional survival data methods and binary response (e.g., logistic) regression models by taking explicit account of the time to disease occurrence while allowing arbitrary baseline disease rates, general censorship, and time-varying risk factors. This latter feature is particularly relevant to many environmental risk assessment problems wherein one wishes to relate disease rates at a particular point in time to aspects of a preceding risk factor history. Relative risk regression methods also adapt readily to time-matched case-control studies and to certain less standard designs. The uses of relative risk regression methods are illustrated and the state of development of these procedures is discussed. It is argued that asymptotic partial likelihood estimation techniques are now well developed in the important special case in which the disease rates of interest have interpretations as counting process intensity functions. Estimation of relative risks processes corresponding to disease rates falling outside this class has, however, received limited attention. The general area of relative risk regression model criticism has, as yet, not been thoroughly studied, though a number of statistical groups are studying such features as tests of fit, residuals, diagnostics and graphical procedures. Most such studies have been restricted to exponential form relative risks as have simulation studies of relative risk estimation
Comparable-Worth Adjustments: Yes--Comparable-Worth Adjustments: No.
ERIC Educational Resources Information Center
Galloway, Sue; O'Neill, June
1985-01-01
Two essays address the issue of pay equity and present opinions favoring and opposing comparable-worth adjustments. Movement of women out of traditionally female jobs, the limits of "equal pay," fairness of comparable worth and market-based wages, implementation and efficiency of comparable worth system, and alternatives to comparable…
Technology Transfer Automated Retrieval System (TEKTRAN)
In precision agriculture regression has been used widely to quality the relationship between soil attributes and other environmental variables. However, spatial correlation existing in soil samples usually makes the regression model suboptimal. In this study, a regression-kriging method was attemp...
NASA Astrophysics Data System (ADS)
Darnah
2016-04-01
Poisson regression has been used if the response variable is count data that based on the Poisson distribution. The Poisson distribution assumed equal dispersion. In fact, a situation where count data are over dispersion or under dispersion so that Poisson regression inappropriate because it may underestimate the standard errors and overstate the significance of the regression parameters, and consequently, giving misleading inference about the regression parameters. This paper suggests the generalized Poisson regression model to handling over dispersion and under dispersion on the Poisson regression model. The Poisson regression model and generalized Poisson regression model will be applied the number of filariasis cases in East Java. Based regression Poisson model the factors influence of filariasis are the percentage of families who don't behave clean and healthy living and the percentage of families who don't have a healthy house. The Poisson regression model occurs over dispersion so that we using generalized Poisson regression. The best generalized Poisson regression model showing the factor influence of filariasis is percentage of families who don't have healthy house. Interpretation of result the model is each additional 1 percentage of families who don't have healthy house will add 1 people filariasis patient.
Regressive language in severe head injury.
Thomsen, I V; Skinhoj, E
1976-09-01
In a follow-up study of 50 patients with severe head injuries three patients had echolalia. One patient with initially global aphasia had echolalia for some weeks when he started talking. Another patient with severe diffuse brain damage, dementia, and emotional regression had echolalia. The dysfunction was considered a detour performance. In the third patient echolalia and palilalia were details in a total pattern of regression lasting for months. The patient, who had extensive frontal atrophy secondary to a very severe head trauma, presented an extreme state of regression returning to a foetal-body pattern and behaving like a baby.
Regression of altitude-produced cardiac hypertrophy.
NASA Technical Reports Server (NTRS)
Sizemore, D. A.; Mcintyre, T. W.; Van Liere, E. J.; Wilson , M. F.
1973-01-01
The rate of regression of cardiac hypertrophy with time has been determined in adult male albino rats. The hypertrophy was induced by intermittent exposure to simulated high altitude. The percentage hypertrophy was much greater (46%) in the right ventricle than in the left (16%). The regression could be adequately fitted to a single exponential function with a half-time of 6.73 plus or minus 0.71 days (90% CI). There was no significant difference in the rates of regression for the two ventricles.
A Note on the Dynamic Correlation Coefficient.
1977-11-04
The use of the dynamic correlation coefficient as a test of spuriousness in longitudinal designs was examined. It was shown that given conditions of...spuriousness and perfect stationarity, the dynamic correlation coefficient was positively, rather than inversely, related to spuriousness. It was...recommended that the dynamic correlation coefficient not be used in the future as a test of spuriousness. (Author)
Soccer Ball Lift Coefficients via Trajectory Analysis
ERIC Educational Resources Information Center
Goff, John Eric; Carre, Matt J.
2010-01-01
We performed experiments in which a soccer ball was launched from a machine while two high-speed cameras recorded portions of the trajectory. Using the trajectory data and published drag coefficients, we extracted lift coefficients for a soccer ball. We determined lift coefficients for a wide range of spin parameters, including several spin…
M-Bonomial Coefficients and Their Identities
ERIC Educational Resources Information Center
Asiru, Muniru A.
2010-01-01
In this note, we introduce M-bonomial coefficients or (M-bonacci binomial coefficients). These are similar to the binomial and the Fibonomial (or Fibonacci-binomial) coefficients and can be displayed in a triangle similar to Pascal's triangle from which some identities become obvious.
Federal Register 2010, 2011, 2012, 2013, 2014
2010-06-11
... Surface Transportation Board Railroad Cost Recovery Procedures--Productivity Adjustment; Quarterly Rail... Railroads that the Board restate the previously published productivity adjustment for the 2003-2007 averaging period (2007 productivity adjustment) so that it tracks the 2007 productivity adjustment...
A new bivariate negative binomial regression model
NASA Astrophysics Data System (ADS)
Faroughi, Pouya; Ismail, Noriszura
2014-12-01
This paper introduces a new form of bivariate negative binomial (BNB-1) regression which can be fitted to bivariate and correlated count data with covariates. The BNB regression discussed in this study can be fitted to bivariate and overdispersed count data with positive, zero or negative correlations. The joint p.m.f. of the BNB1 distribution is derived from the product of two negative binomial marginals with a multiplicative factor parameter. Several testing methods were used to check overdispersion and goodness-of-fit of the model. Application of BNB-1 regression is illustrated on Malaysian motor insurance dataset. The results indicated that BNB-1 regression has better fit than bivariate Poisson and BNB-2 models with regards to Akaike information criterion.
An introduction to multilevel regression models.
Austin, P C; Goel, V; van Walraven, C
2001-01-01
Data in health research are frequently structured hierarchically. For example, data may consist of patients nested within physicians, who in turn may be nested in hospitals or geographic regions. Fitting regression models that ignore the hierarchical structure of the data can lead to false inferences being drawn from the data. Implementing a statistical analysis that takes into account the hierarchical structure of the data requires special methodologies. In this paper, we introduce the concept of hierarchically structured data, and present an introduction to hierarchical regression models. We then compare the performance of a traditional regression model with that of a hierarchical regression model on a dataset relating test utilization at the annual health exam with patient and physician characteristics. In comparing the resultant models, we see that false inferences can be drawn by ignoring the structure of the data.
Multiple Instance Regression with Structured Data
NASA Technical Reports Server (NTRS)
Wagstaff, Kiri L.; Lane, Terran; Roper, Alex
2008-01-01
This slide presentation reviews the use of multiple instance regression with structured data from multiple and related data sets. It applies the concept to a practical problem, that of estimating crop yield using remote sensed country wide weekly observations.
Bayesian Comparison of Two Regression Lines.
ERIC Educational Resources Information Center
Tsutakawa, Robert K.
1978-01-01
A Bayesian solution is presented for the Johnson-Neyman problem (whether or not the distance between two regression lines is statistically significant over a finite interval of the independent variable). (Author/CTM)
TWSVR: Regression via Twin Support Vector Machine.
Khemchandani, Reshma; Goyal, Keshav; Chandra, Suresh
2016-02-01
Taking motivation from Twin Support Vector Machine (TWSVM) formulation, Peng (2010) attempted to propose Twin Support Vector Regression (TSVR) where the regressor is obtained via solving a pair of quadratic programming problems (QPPs). In this paper we argue that TSVR formulation is not in the true spirit of TWSVM. Further, taking motivation from Bi and Bennett (2003), we propose an alternative approach to find a formulation for Twin Support Vector Regression (TWSVR) which is in the true spirit of TWSVM. We show that our proposed TWSVR can be derived from TWSVM for an appropriately constructed classification problem. To check the efficacy of our proposed TWSVR we compare its performance with TSVR and classical Support Vector Regression(SVR) on various regression datasets.
Marginal longitudinal semiparametric regression via penalized splines
Kadiri, M. Al; Carroll, R.J.; Wand, M.P.
2010-01-01
We study the marginal longitudinal nonparametric regression problem and some of its semiparametric extensions. We point out that, while several elaborate proposals for efficient estimation have been proposed, a relative simple and straightforward one, based on penalized splines, has not. After describing our approach, we then explain how Gibbs sampling and the BUGS software can be used to achieve quick and effective implementation. Illustrations are provided for nonparametric regression and additive models. PMID:21037941
Discriminative Elastic-Net Regularized Linear Regression.
Zhang, Zheng; Lai, Zhihui; Xu, Yong; Shao, Ling; Wu, Jian; Xie, Guo-Sen
2017-03-01
In this paper, we aim at learning compact and discriminative linear regression models. Linear regression has been widely used in different problems. However, most of the existing linear regression methods exploit the conventional zero-one matrix as the regression targets, which greatly narrows the flexibility of the regression model. Another major limitation of these methods is that the learned projection matrix fails to precisely project the image features to the target space due to their weak discriminative capability. To this end, we present an elastic-net regularized linear regression (ENLR) framework, and develop two robust linear regression models which possess the following special characteristics. First, our methods exploit two particular strategies to enlarge the margins of different classes by relaxing the strict binary targets into a more feasible variable matrix. Second, a robust elastic-net regularization of singular values is introduced to enhance the compactness and effectiveness of the learned projection matrix. Third, the resulting optimization problem of ENLR has a closed-form solution in each iteration, which can be solved efficiently. Finally, rather than directly exploiting the projection matrix for recognition, our methods employ the transformed features as the new discriminate representations to make final image classification. Compared with the traditional linear regression model and some of its variants, our method is much more accurate in image classification. Extensive experiments conducted on publicly available data sets well demonstrate that the proposed framework can outperform the state-of-the-art methods. The MATLAB codes of our methods can be available at http://www.yongxu.org/lunwen.html.
Adjustable Optical-Fiber Attenuator
NASA Technical Reports Server (NTRS)
Buzzetti, Mike F.
1994-01-01
Adjustable fiber-optic attenuator utilizes bending loss to reduce strength of light transmitted along it. Attenuator functions without introducing measurable back-reflection or insertion loss. Relatively insensitive to vibration and changes in temperature. Potential applications include cable television, telephone networks, other signal-distribution networks, and laboratory instrumentation.
Dyadic Adjustment: An Ecosystemic Examination.
ERIC Educational Resources Information Center
Wilson, Stephan M.; Larson, Jeffry H.; McCulloch, B. Jan; Stone, Katherine L.
1997-01-01
Examines the relationship of background, individual, and family influences on dyadic adjustment, using an ecological perspective. Data from 102 married couples were used. Age at marriage for husbands, emotional health for wives, and number of marriage and family problems as well as family life satisfaction for both were related to dyadic…
Problems of Adjustment to School.
ERIC Educational Resources Information Center
Bartolini, Leandro A.
This paper, one of several written for a comprehensive policy study of early childhood education in Illinois, examines and summarizes the literature on the problems of young children in adjusting to starting school full-time and describes the nature and extent of their difficulties in relation to statewide educational policy. The review of studies…
Economic Pressures and Family Adjustment.
ERIC Educational Resources Information Center
Haccoun, Dorothy Markiewicz; Ledingham, Jane E.
The relationships between economic stress on the family and child and parental adjustment were examined for a sample of 199 girls and boys in grades one, four, and seven. These associations were examined separately for families in which both parents were present and in which mothers only were at home. Economic stress was associated with boys'…
Delgado, J; Liao, J C
1992-01-01
The methodology previously developed for determining the Flux Control Coefficients [Delgado & Liao (1992) Biochem. J. 282, 919-927] is extended to the calculation of metabolite Concentration Control Coefficients. It is shown that the transient metabolite concentrations are related by a few algebraic equations, attributed to mass balance, stoichiometric constraints, quasi-equilibrium or quasi-steady states, and kinetic regulations. The coefficients in these relations can be estimated using linear regression, and can be used to calculate the Control Coefficients. The theoretical basis and two examples are discussed. Although the methodology is derived based on the linear approximation of enzyme kinetics, it yields reasonably good estimates of the Control Coefficients for systems with non-linear kinetics. PMID:1497632
Variable-Domain Functional Regression for Modeling ICU Data.
Gellar, Jonathan E; Colantuoni, Elizabeth; Needham, Dale M; Crainiceanu, Ciprian M
2014-12-01
We introduce a class of scalar-on-function regression models with subject-specific functional predictor domains. The fundamental idea is to consider a bivariate functional parameter that depends both on the functional argument and on the width of the functional predictor domain. Both parametric and nonparametric models are introduced to fit the functional coefficient. The nonparametric model is theoretically and practically invariant to functional support transformation, or support registration. Methods were motivated by and applied to a study of association between daily measures of the Intensive Care Unit (ICU) Sequential Organ Failure Assessment (SOFA) score and two outcomes: in-hospital mortality, and physical impairment at hospital discharge among survivors. Methods are generally applicable to a large number of new studies that record a continuous variables over unequal domains.
Second virial coefficients for chain molecules
Bokis, C.P.; Donohue, M.D. . Dept. of Chemical Engineering); Hall, C.K. . Dept. of Chemical Engineering)
1994-01-01
The importance of having accurate second virial coefficients in phase equilibrium calculations, especially for the calculation of dew points, is discussed. The square-well potentials results in a simple but inaccurate equation for the second virial coefficient for small, spherical molecules such as argon. Here, the authors present a new equation for the second virial coefficient of both spherical molecules and chain molecules which is written in a form similar to that for the square-well potential. This new equation is accurate in comparison to Monte Carlo simulation data on second virial coefficients for square-well chain molecules and with second virial coefficients obtained from experiments on n-alkanes.
[Iris movement mediates pupillary membrane regression].
Morizane, Yuki
2007-11-01
In the course of mammalian lens development, a transient capillary meshwork called as the pupillary membrane (PM) forms. It is located in the pupil area to nourish the anterior surface of the lens, and then regresses to clear the optical path. Although the involvement of the apoptotic process has been reported in PM regression, the initiating factor remains unknown. We initially found that regression of the PM coincided with the development of iris motility, and that iris movement caused cessation and resumption of blood flow within the PM. Therefore, we investigated whether the development of the capacity of the iris to constrict and dilate can function as an essential signal that induces apoptosis in the PM. Continuous inhibition of iris movement with mydriatic agents suppressed apoptosis of the PM and resulted in the persistence of PM in rats. The distribution of apoptotic cells in the regressing PM was diffuse and showed no apparent localization. These results indicated that iris movement induced regression of the PM by changing the blood flow within it. This study suggests the importance of the physiological interactions between tissues-in this case, the iris and the PM-as a signal to advance vascular regression during organ development.
Multiple-Instance Regression with Structured Data
NASA Technical Reports Server (NTRS)
Wagstaff, Kiri L.; Lane, Terran; Roper, Alex
2008-01-01
We present a multiple-instance regression algorithm that models internal bag structure to identify the items most relevant to the bag labels. Multiple-instance regression (MIR) operates on a set of bags with real-valued labels, each containing a set of unlabeled items, in which the relevance of each item to its bag label is unknown. The goal is to predict the labels of new bags from their contents. Unlike previous MIR methods, MI-ClusterRegress can operate on bags that are structured in that they contain items drawn from a number of distinct (but unknown) distributions. MI-ClusterRegress simultaneously learns a model of the bag's internal structure, the relevance of each item, and a regression model that accurately predicts labels for new bags. We evaluated this approach on the challenging MIR problem of crop yield prediction from remote sensing data. MI-ClusterRegress provided predictions that were more accurate than those obtained with non-multiple-instance approaches or MIR methods that do not model the bag structure.
Analysis of oscillatory motion of a light airplane at high values of lift coefficient
NASA Technical Reports Server (NTRS)
Batterson, J. G.
1983-01-01
A modified stepwise regression is applied to flight data from a light research air-plane operating at high angles at attack. The well-known phenomenon referred to as buckling or porpoising is analyzed and modeled using both power series and spline expansions of the aerodynamic force and moment coefficients associated with the longitudinal equations of motion.
[Evaluation of estimation of prevalence ratio using bayesian log-binomial regression model].
Gao, W L; Lin, H; Liu, X N; Ren, X W; Li, J S; Shen, X P; Zhu, S L
2017-03-10
To evaluate the estimation of prevalence ratio (PR) by using bayesian log-binomial regression model and its application, we estimated the PR of medical care-seeking prevalence to caregivers' recognition of risk signs of diarrhea in their infants by using bayesian log-binomial regression model in Openbugs software. The results showed that caregivers' recognition of infant' s risk signs of diarrhea was associated significantly with a 13% increase of medical care-seeking. Meanwhile, we compared the differences in PR's point estimation and its interval estimation of medical care-seeking prevalence to caregivers' recognition of risk signs of diarrhea and convergence of three models (model 1: not adjusting for the covariates; model 2: adjusting for duration of caregivers' education, model 3: adjusting for distance between village and township and child month-age based on model 2) between bayesian log-binomial regression model and conventional log-binomial regression model. The results showed that all three bayesian log-binomial regression models were convergence and the estimated PRs were 1.130(95%CI: 1.005-1.265), 1.128(95%CI: 1.001-1.264) and 1.132(95%CI: 1.004-1.267), respectively. Conventional log-binomial regression model 1 and model 2 were convergence and their PRs were 1.130(95% CI: 1.055-1.206) and 1.126(95% CI: 1.051-1.203), respectively, but the model 3 was misconvergence, so COPY method was used to estimate PR, which was 1.125 (95%CI: 1.051-1.200). In addition, the point estimation and interval estimation of PRs from three bayesian log-binomial regression models differed slightly from those of PRs from conventional log-binomial regression model, but they had a good consistency in estimating PR. Therefore, bayesian log-binomial regression model can effectively estimate PR with less misconvergence and have more advantages in application compared with conventional log-binomial regression model.
Detection of Cutting Tool Wear using Statistical Analysis and Regression Model
NASA Astrophysics Data System (ADS)
Ghani, Jaharah A.; Rizal, Muhammad; Nuawi, Mohd Zaki; Haron, Che Hassan Che; Ramli, Rizauddin
2010-10-01
This study presents a new method for detecting the cutting tool wear based on the measured cutting force signals. A statistical-based method called Integrated Kurtosis-based Algorithm for Z-Filter technique, called I-kaz was used for developing a regression model and 3D graphic presentation of I-kaz 3D coefficient during machining process. The machining tests were carried out using a CNC turning machine Colchester Master Tornado T4 in dry cutting condition. A Kistler 9255B dynamometer was used to measure the cutting force signals, which were transmitted, analyzed, and displayed in the DasyLab software. Various force signals from machining operation were analyzed, and each has its own I-kaz 3D coefficient. This coefficient was examined and its relationship with flank wear lands (VB) was determined. A regression model was developed due to this relationship, and results of the regression model shows that the I-kaz 3D coefficient value decreases as tool wear increases. The result then is used for real time tool wear monitoring.
Cefalu, Matthew; Dominici, Francesca
2014-07-01
In environmental epidemiology, we are often faced with 2 challenges. First, an exposure prediction model is needed to estimate the exposure to an agent of interest, ideally at the individual level. Second, when estimating the health effect associated with the exposure, confounding adjustment is needed in the health-effects regression model. The current literature addresses these 2 challenges separately. That is, methods that account for measurement error in the predicted exposure often fail to acknowledge the possibility of confounding, whereas methods designed to control confounding often fail to acknowledge that the exposure has been predicted. In this article, we consider exposure prediction and confounding adjustment in a health-effects regression model simultaneously. Using theoretical arguments and simulation studies, we show that the bias of a health-effect estimate is influenced by the exposure prediction model, the type of confounding adjustment used in the health-effects regression model, and the relationship between these 2. Moreover, we argue that even with a health-effects regression model that properly adjusts for confounding, the use of a predicted exposure can bias the health-effect estimate unless all confounders included in the health-effects regression model are also included in the exposure prediction model. While these results of this article were motivated by studies of environmental contaminants, they apply more broadly to any context where an exposure needs to be predicted.
Favorable Selection, Risk Adjustment, and the Medicare Advantage Program
Morrisey, Michael A; Kilgore, Meredith L; Becker, David J; Smith, Wilson; Delzell, Elizabeth
2013-01-01
Objectives To examine the effects of changes in payment and risk adjustment on (1) the annual enrollment and switching behavior of Medicare Advantage (MA) beneficiaries, and (2) the relative costliness of MA enrollees and disenrollees. Data From 1999 through 2008 national Medicare claims data from the 5 percent longitudinal sample of Parts A and B expenditures. Study Design Retrospective, fixed effects regression analysis of July enrollment and year-long switching into and out of MA. Similar regression analysis of the costliness of those switching into (out of) MA in the 6 months prior to enrollment (after disenrollment) relative to nonswitchers in the same county over the same period. Findings Payment generosity and more sophisticated risk adjustment were associated with substantial increases in MA enrollment and decreases in disenrollment. Claims experience of those newly switching into MA was not affected by any of the policy reforms, but disenrollment became increasingly concentrated among high-cost beneficiaries. Conclusions Enrollment is very sensitive to payment levels. The use of more sophisticated risk adjustment did not alter favorable selection into MA, but it did affect the costliness of disenrollees. PMID:23088500
Poisson Regression Analysis of Illness and Injury Surveillance Data
Frome E.L., Watkins J.P., Ellis E.D.
2012-12-12
The Department of Energy (DOE) uses illness and injury surveillance to monitor morbidity and assess the overall health of the work force. Data collected from each participating site include health events and a roster file with demographic information. The source data files are maintained in a relational data base, and are used to obtain stratified tables of health event counts and person time at risk that serve as the starting point for Poisson regression analysis. The explanatory variables that define these tables are age, gender, occupational group, and time. Typical response variables of interest are the number of absences due to illness or injury, i.e., the response variable is a count. Poisson regression methods are used to describe the effect of the explanatory variables on the health event rates using a log-linear main effects model. Results of fitting the main effects model are summarized in a tabular and graphical form and interpretation of model parameters is provided. An analysis of deviance table is used to evaluate the importance of each of the explanatory variables on the event rate of interest and to determine if interaction terms should be considered in the analysis. Although Poisson regression methods are widely used in the analysis of count data, there are situations in which over-dispersion occurs. This could be due to lack-of-fit of the regression model, extra-Poisson variation, or both. A score test statistic and regression diagnostics are used to identify over-dispersion. A quasi-likelihood method of moments procedure is used to evaluate and adjust for extra-Poisson variation when necessary. Two examples are presented using respiratory disease absence rates at two DOE sites to illustrate the methods and interpretation of the results. In the first example the Poisson main effects model is adequate. In the second example the score test indicates considerable over-dispersion and a more detailed analysis attributes the over-dispersion to extra
Chang, Chun-Ming; Yin, Wen-Yao; Wei, Chang-Kao; Wu, Chin-Chia; Su, Yu-Chieh; Yu, Chia-Hui; Lee, Ching-Chih
2016-01-01
Background Identification of patients at risk of death from cancer surgery should aid in preoperative preparation. The purpose of this study is to assess and adjust the age-adjusted Charlson comorbidity index (ACCI) to identify cancer patients with increased risk of perioperative mortality. Methods We identified 156,151 patients undergoing surgery for one of the ten common cancers between 2007 and 2011 in the Taiwan National Health Insurance Research Database. Half of the patients were randomly selected, and a multivariate logistic regression analysis was used to develop an adjusted-ACCI score for estimating the risk of 90-day mortality by variables from the original ACCI. The score was validated. The association between the score and perioperative mortality was analyzed. Results The adjusted-ACCI score yield a better discrimination on mortality after cancer surgery than the original ACCI score, with c-statics of 0.75 versus 0.71. Over 80 years of age, 70–80 years, and renal disease had the strongest impact on mortality, hazard ratios 8.40, 3.63, and 3.09 (P < 0.001), respectively. The overall 90-day mortality rates in the entire cohort varied from 0.9%, 2.9%, 7.0%, and 13.2% in four risk groups stratifying by the adjusted-ACCI score; the adjusted hazard ratio for score 4–7, 8–11, and ≥ 12 was 2.84, 6.07, and 11.17 (P < 0.001), respectively, in 90-day mortality compared to score 0–3. Conclusions The adjusted-ACCI score helps to identify patients with a higher risk of 90-day mortality after cancer surgery. It might be particularly helpful for preoperative evaluation of patients over 80 years of age. PMID:26848761
Molloy, Cynthia A; Morrow, Ardythe L; Meinzen-Derr, Jareen; Dawson, Geraldine; Bernier, Raphael; Dunn, Michelle; Hyman, Susan L; McMahon, William M; Goudie-Nice, Julie; Hepburn, Susan; Minshew, Nancy; Rogers, Sally; Sigman, Marian; Spence, M Anne; Tager-Flusberg, Helen; Volkmar, Fred R; Lord, Catherine
2006-04-01
A multicenter study of 308 children with Autism Spectrum Disorder (ASD) was conducted through the Collaborative Programs of Excellence in Autism (CPEA), sponsored by the National Institute of Child Health and Human Development, to compare the family history of autoimmune disorders in children with ASD with and without a history of regression. A history of regression was determined from the results of the Autism Diagnostic Interview-Revised (ADI-R). Family history of autoimmune disorders was obtained by telephone interview. Regression was significantly associated with a family history of autoimmune disorders (adjusted OR=1.89; 95% CI: 1.17, 3.10). The only specific autoimmune disorder found to be associated with regression was autoimmune thyroid disease (adjusted OR=2.09; 95% CI: 1.28, 3.41).
Predicting Salt Permeability Coefficients in Highly Swollen, Highly Charged Ion Exchange Membranes.
Kamcev, Jovan; Paul, Donald R; Manning, Gerald S; Freeman, Benny D
2017-02-01
This study presents a framework for predicting salt permeability coefficients in ion exchange membranes in contact with an aqueous salt solution. The model, based on the solution-diffusion mechanism, was tested using experimental salt permeability data for a series of commercial ion exchange membranes. Equilibrium salt partition coefficients were calculated using a thermodynamic framework (i.e., Donnan theory), incorporating Manning's counterion condensation theory to calculate ion activity coefficients in the membrane phase and the Pitzer model to calculate ion activity coefficients in the solution phase. The model predicted NaCl partition coefficients in a cation exchange membrane and two anion exchange membranes, as well as MgCl2 partition coefficients in a cation exchange membrane, remarkably well at higher external salt concentrations (>0.1 M) and reasonably well at lower external salt concentrations (<0.1 M) with no adjustable parameters. Membrane ion diffusion coefficients were calculated using a combination of the Mackie and Meares model, which assumes ion diffusion in water-swollen polymers is affected by a tortuosity factor, and a model developed by Manning to account for electrostatic effects. Agreement between experimental and predicted salt diffusion coefficients was good with no adjustable parameters. Calculated salt partition and diffusion coefficients were combined within the framework of the solution-diffusion model to predict salt permeability coefficients. Agreement between model and experimental data was remarkably good. Additionally, a simplified version of the model was used to elucidate connections between membrane structure (e.g., fixed charge group concentration) and salt transport properties.
Time course for tail regression during metamorphosis of the ascidian Ciona intestinalis.
Matsunobu, Shohei; Sasakura, Yasunori
2015-09-01
In most ascidians, the tadpole-like swimming larvae dramatically change their body-plans during metamorphosis and develop into sessile adults. The mechanisms of ascidian metamorphosis have been researched and debated for many years. Until now information on the detailed time course of the initiation and completion of each metamorphic event has not been described. One dramatic and important event in ascidian metamorphosis is tail regression, in which ascidian larvae lose their tails to adjust themselves to sessile life. In the present study, we measured the time associated with tail regression in the ascidian Ciona intestinalis. Larvae are thought to acquire competency for each metamorphic event in certain developmental periods. We show that the timing with which the competence for tail regression is acquired is determined by the time since hatching, and this timing is not affected by the timing of post-hatching events such as adhesion. Because larvae need to adhere to substrates with their papillae to induce tail regression, we measured the duration for which larvae need to remain adhered in order to initiate tail regression and the time needed for the tail to regress. Larvae acquire the ability to adhere to substrates before they acquire tail regression competence. We found that when larvae adhered before they acquired tail regression competence, they were able to remember the experience of adhesion until they acquired the ability to undergo tail regression. The time course of the events associated with tail regression provides a valuable reference, upon which the cellular and molecular mechanisms of ascidian metamorphosis can be elucidated.
Reconstruction of missing daily streamflow data using dynamic regression models
NASA Astrophysics Data System (ADS)
Tencaliec, Patricia; Favre, Anne-Catherine; Prieur, Clémentine; Mathevet, Thibault
2015-12-01
River discharge is one of the most important quantities in hydrology. It provides fundamental records for water resources management and climate change monitoring. Even very short data-gaps in this information can cause extremely different analysis outputs. Therefore, reconstructing missing data of incomplete data sets is an important step regarding the performance of the environmental models, engineering, and research applications, thus it presents a great challenge. The objective of this paper is to introduce an effective technique for reconstructing missing daily discharge data when one has access to only daily streamflow data. The proposed procedure uses a combination of regression and autoregressive integrated moving average models (ARIMA) called dynamic regression model. This model uses the linear relationship between neighbor and correlated stations and then adjusts the residual term by fitting an ARIMA structure. Application of the model to eight daily streamflow data for the Durance river watershed showed that the model yields reliable estimates for the missing data in the time series. Simulation studies were also conducted to evaluate the performance of the procedure.
Assessment of Weighted Quantile Sum Regression for Modeling Chemical Mixtures and Cancer Risk
Czarnota, Jenna; Gennings, Chris; Wheeler, David C
2015-01-01
In evaluation of cancer risk related to environmental chemical exposures, the effect of many chemicals on disease is ultimately of interest. However, because of potentially strong correlations among chemicals that occur together, traditional regression methods suffer from collinearity effects, including regression coefficient sign reversal and variance inflation. In addition, penalized regression methods designed to remediate collinearity may have limitations in selecting the truly bad actors among many correlated components. The recently proposed method of weighted quantile sum (WQS) regression attempts to overcome these problems by estimating a body burden index, which identifies important chemicals in a mixture of correlated environmental chemicals. Our focus was on assessing through simulation studies the accuracy of WQS regression in detecting subsets of chemicals associated with health outcomes (binary and continuous) in site-specific analyses and in non-site-specific analyses. We also evaluated the performance of the penalized regression methods of lasso, adaptive lasso, and elastic net in correctly classifying chemicals as bad actors or unrelated to the outcome. We based the simulation study on data from the National Cancer Institute Surveillance Epidemiology and End Results Program (NCI-SEER) case–control study of non-Hodgkin lymphoma (NHL) to achieve realistic exposure situations. Our results showed that WQS regression had good sensitivity and specificity across a variety of conditions considered in this study. The shrinkage methods had a tendency to incorrectly identify a large number of components, especially in the case of strong association with the outcome. PMID:26005323
Assessment of weighted quantile sum regression for modeling chemical mixtures and cancer risk.
Czarnota, Jenna; Gennings, Chris; Wheeler, David C
2015-01-01
In evaluation of cancer risk related to environmental chemical exposures, the effect of many chemicals on disease is ultimately of interest. However, because of potentially strong correlations among chemicals that occur together, traditional regression methods suffer from collinearity effects, including regression coefficient sign reversal and variance inflation. In addition, penalized regression methods designed to remediate collinearity may have limitations in selecting the truly bad actors among many correlated components. The recently proposed method of weighted quantile sum (WQS) regression attempts to overcome these problems by estimating a body burden index, which identifies important chemicals in a mixture of correlated environmental chemicals. Our focus was on assessing through simulation studies the accuracy of WQS regression in detecting subsets of chemicals associated with health outcomes (binary and continuous) in site-specific analyses and in non-site-specific analyses. We also evaluated the performance of the penalized regression methods of lasso, adaptive lasso, and elastic net in correctly classifying chemicals as bad actors or unrelated to the outcome. We based the simulation study on data from the National Cancer Institute Surveillance Epidemiology and End Results Program (NCI-SEER) case-control study of non-Hodgkin lymphoma (NHL) to achieve realistic exposure situations. Our results showed that WQS regression had good sensitivity and specificity across a variety of conditions considered in this study. The shrinkage methods had a tendency to incorrectly identify a large number of components, especially in the case of strong association with the outcome.
How to use linear regression and correlation in quantitative method comparison studies.
Twomey, P J; Kroll, M H
2008-04-01
Linear regression methods try to determine the best linear relationship between data points while correlation coefficients assess the association (as opposed to agreement) between the two methods. Linear regression and correlation play an important part in the interpretation of quantitative method comparison studies. Their major strength is that they are widely known and as a result both are employed in the vast majority of method comparison studies. While previously performed by hand, the availability of statistical packages means that regression analysis is usually performed by software packages including MS Excel, with or without the software programe Analyze-it as well as by other software packages. Such techniques need to be employed in a way that compares the agreement between the two methods examined and more importantly, because we are dealing with individual patients, whether the degree of agreement is clinically acceptable. Despite their use for many years, there is a lot of ignorance about the validity as well as the pros and cons of linear regression and correlation techniques. This review article describes the types of linear regression and regression (parametric and non-parametric methods) and the necessary general and specific requirements. The selection of the type of regression depends on where one has been trained, the tradition of the laboratory and the availability of adequate software.
Artificial neural network and regression models for flow velocity at sediment incipient deposition
NASA Astrophysics Data System (ADS)
Safari, Mir-Jafar-Sadegh; Aksoy, Hafzullah; Mohammadi, Mirali
2016-10-01
A set of experiments for the determination of flow characteristics at sediment incipient deposition has been carried out in a trapezoidal cross-section channel. Using experimental data, a regression model is developed for computing velocity of flow in a trapezoidal cross-section channel at the incipient deposition condition and is presented together with already available regression models of rectangular, circular, and U-shape channels. A generalized regression model is also provided by combining the available data of any cross-section. For comparison of the models, a powerful tool, the artificial neural network (ANN) is used for modelling incipient deposition of sediment in rigid boundary channels. Three different ANN techniques, namely, the feed-forward back propagation (FFBP), generalized regression (GR), and radial basis function (RBF), are applied using six input variables; flow discharge, flow depth, channel bed slope, hydraulic radius, relative specific mass of sediment and median size of sediment particles; all taken from laboratory experiments. Hydrodynamic forces acting on sediment particles in the flow are considered in the regression models indirectly for deriving particle Froude number and relative particle size, both being dimensionless. The accuracy of the models is studied by the root mean square error (RMSE), the mean absolute percentage error (MAPE), the discrepancy ratio (Dr) and the concordance coefficient (CC). Evaluation of the models finds ANN models superior and some regression models with an acceptable performance. Therefore, it is concluded that appropriately constructed ANN and regression models can be developed and used for the rigid boundary channel design.
Does religiosity help Muslims adjust to death?: a research note.
Hossain, Mohammad Samir; Siddique, Mohammad Zakaria
2008-01-01
Death is the end of life. But Muslims believe death is an event between two lives, not an absolute cessation of life. Thus religiosity may influence Muslims differently about death. To explore the impact of religious perception, thus religiosity, a cross-sectional, descriptive, analytic and correlational study was conducted on 150 Muslims. Self-declared healthy Muslims equally from both sexes (N = 150, Age range--20 to 50 years, Minimum education--Bachelor) were selected by stratified sampling and randomly under each stratum. Subjects, divided in five levels of religiosity, were assessed and scored for the presence of maladjustment symptoms and stage of adjustment with death. ANOVA and correlation coefficient was applied on the sets of data collected. All statistical tests were done at the level of 95% confidence (P < 0.05). Final results were higher than the table values used for ANOVA and correlation coefficient yielded P values of < 0.05, < 0.01, and < 0.001. Religiosity as a criterion of Muslims influenced the quality of adjustment with death positively. So we hypothesized that religiosity may help Muslims adjust to death.
THE REGRESSION MODEL OF IRAN LIBRARIES ORGANIZATIONAL CLIMATE
Jahani, Mohammad Ali; Yaminfirooz, Mousa; Siamian, Hasan
2015-01-01
Background: The purpose of this study was to drawing a regression model of organizational climate of central libraries of Iran’s universities. Methods: This study is an applied research. The statistical population of this study consisted of 96 employees of the central libraries of Iran’s public universities selected among the 117 universities affiliated to the Ministry of Health by Stratified Sampling method (510 people). Climate Qual localized questionnaire was used as research tools. For predicting the organizational climate pattern of the libraries is used from the multivariate linear regression and track diagram. Results: of the 9 variables affecting organizational climate, 5 variables of innovation, teamwork, customer service, psychological safety and deep diversity play a major role in prediction of the organizational climate of Iran’s libraries. The results also indicate that each of these variables with different coefficient have the power to predict organizational climate but the climate score of psychological safety (0.94) plays a very crucial role in predicting the organizational climate. Track diagram showed that five variables of teamwork, customer service, psychological safety, deep diversity and innovation directly effects on the organizational climate variable that contribution of the team work from this influence is more than any other variables. Conclusions: Of the indicator of the organizational climate of climateQual, the contribution of the team work from this influence is more than any other variables that reinforcement of teamwork in academic libraries can be more effective in improving the organizational climate of this type libraries. PMID:26622203
Testing of a Fiber Optic Wear, Erosion and Regression Sensor
NASA Technical Reports Server (NTRS)
Korman, Valentin; Polzin, Kurt A.
2011-01-01
The nature of the physical processes and harsh environments associated with erosion and wear in propulsion environments makes their measurement and real-time rate quantification difficult. A fiber optic sensor capable of determining the wear (regression, erosion, ablation) associated with these environments has been developed and tested in a number of different applications to validate the technique. The sensor consists of two fiber optics that have differing attenuation coefficients and transmit light to detectors. The ratio of the two measured intensities can be correlated to the lengths of the fiber optic lines, and if the fibers and the host parent material in which they are embedded wear at the same rate the remaining length of fiber provides a real-time measure of the wear process. Testing in several disparate situations has been performed, with the data exhibiting excellent qualitative agreement with the theoretical description of the process and when a separate calibrated regression measurement is available good quantitative agreement is obtained as well. The light collected by the fibers can also be used to optically obtain the spectra and measure the internal temperature of the wear layer.
Bayesian latent factor regression for functional and longitudinal data.
Montagna, Silvia; Tokdar, Surya T; Neelon, Brian; Dunson, David B
2012-12-01
In studies involving functional data, it is commonly of interest to model the impact of predictors on the distribution of the curves, allowing flexible effects on not only the mean curve but also the distribution about the mean. Characterizing the curve for each subject as a linear combination of a high-dimensional set of potential basis functions, we place a sparse latent factor regression model on the basis coefficients. We induce basis selection by choosing a shrinkage prior that allows many of the loadings to be close to zero. The number of latent factors is treated as unknown through a highly-efficient, adaptive-blocked Gibbs sampler. Predictors are included on the latent variables level, while allowing different predictors to impact different latent factors. This model induces a framework for functional response regression in which the distribution of the curves is allowed to change flexibly with predictors. The performance is assessed through simulation studies and the methods are applied to data on blood pressure trajectories during pregnancy.
12 CFR 19.240 - Inflation adjustments.
Code of Federal Regulations, 2012 CFR
2012-01-01
... 12 Banks and Banking 1 2012-01-01 2012-01-01 false Inflation adjustments. 19.240 Section 19.240... PROCEDURE Civil Money Penalty Inflation Adjustments § 19.240 Inflation adjustments. (a) The maximum amount of each civil money penalty within the OCC's jurisdiction is adjusted in accordance with the...
12 CFR 19.240 - Inflation adjustments.
Code of Federal Regulations, 2010 CFR
2010-01-01
... 12 Banks and Banking 1 2010-01-01 2010-01-01 false Inflation adjustments. 19.240 Section 19.240... PROCEDURE Civil Money Penalty Inflation Adjustments § 19.240 Inflation adjustments. (a) The maximum amount... Civil Penalties Inflation Adjustment Act of 1990 (28 U.S.C. 2461 note) as follows: ER10NO08.001 (b)...
12 CFR 19.240 - Inflation adjustments.
Code of Federal Regulations, 2011 CFR
2011-01-01
... 12 Banks and Banking 1 2011-01-01 2011-01-01 false Inflation adjustments. 19.240 Section 19.240... PROCEDURE Civil Money Penalty Inflation Adjustments § 19.240 Inflation adjustments. (a) The maximum amount... Civil Penalties Inflation Adjustment Act of 1990 (28 U.S.C. 2461 note) as follows: ER10NO08.001 (b)...
Adjusting to University: The Hong Kong Experience
ERIC Educational Resources Information Center
Yau, Hon Keung; Sun, Hongyi; Cheng, Alison Lai Fong
2012-01-01
Students' adjustment to the university environment is an important factor in predicting university outcomes and is crucial to their future achievements. University support to students' transition to university life can be divided into three dimensions: academic adjustment, social adjustment and psychological adjustment. However, these…
MULTILINEAR TENSOR REGRESSION FOR LONGITUDINAL RELATIONAL DATA.
Hoff, Peter D
2015-09-01
A fundamental aspect of relational data, such as from a social network, is the possibility of dependence among the relations. In particular, the relations between members of one pair of nodes may have an effect on the relations between members of another pair. This article develops a type of regression model to estimate such effects in the context of longitudinal and multivariate relational data, or other data that can be represented in the form of a tensor. The model is based on a general multilinear tensor regression model, a special case of which is a tensor autoregression model in which the tensor of relations at one time point are parsimoniously regressed on relations from previous time points. This is done via a separable, or Kronecker-structured, regression parameter along with a separable covariance model. In the context of an analysis of longitudinal multivariate relational data, it is shown how the multilinear tensor regression model can represent patterns that often appear in relational and network data, such as reciprocity and transitivity.
Hyperglycemia impairs atherosclerosis regression in mice.
Gaudreault, Nathalie; Kumar, Nikit; Olivas, Victor R; Eberlé, Delphine; Stephens, Kyle; Raffai, Robert L
2013-12-01
Diabetic patients are known to be more susceptible to atherosclerosis and its associated cardiovascular complications. However, the effects of hyperglycemia on atherosclerosis regression remain unclear. We hypothesized that hyperglycemia impairs atherosclerosis regression by modulating the biological function of lesional macrophages. HypoE (Apoe(h/h)Mx1-Cre) mice express low levels of apolipoprotein E (apoE) and develop atherosclerosis when fed a high-fat diet. Atherosclerosis regression occurs in these mice upon plasma lipid lowering induced by a change in diet and the restoration of apoE expression. We examined the morphological characteristics of regressed lesions and assessed the biological function of lesional macrophages isolated with laser-capture microdissection in euglycemic and hyperglycemic HypoE mice. Hyperglycemia induced by streptozotocin treatment impaired lesion size reduction (36% versus 14%) and lipid loss (38% versus 26%) after the reversal of hyperlipidemia. However, decreases in lesional macrophage content and remodeling in both groups of mice were similar. Gene expression analysis revealed that hyperglycemia impaired cholesterol transport by modulating ATP-binding cassette A1, ATP-binding cassette G1, scavenger receptor class B family member (CD36), scavenger receptor class B1, and wound healing pathways in lesional macrophages during atherosclerosis regression. Hyperglycemia impairs both reduction in size and loss of lipids from atherosclerotic lesions upon plasma lipid lowering without significantly affecting the remodeling of the vascular wall.
Regression models for estimating coseismic landslide displacement
Jibson, R.W.
2007-01-01
Newmark's sliding-block model is widely used to estimate coseismic slope performance. Early efforts to develop simple regression models to estimate Newmark displacement were based on analysis of the small number of strong-motion records then available. The current availability of a much larger set of strong-motion records dictates that these regression equations be updated. Regression equations were generated using data derived from a collection of 2270 strong-motion records from 30 worldwide earthquakes. The regression equations predict Newmark displacement in terms of (1) critical acceleration ratio, (2) critical acceleration ratio and earthquake magnitude, (3) Arias intensity and critical acceleration, and (4) Arias intensity and critical acceleration ratio. These equations are well constrained and fit the data well (71% < R2 < 88%), but they have standard deviations of about 0.5 log units, such that the range defined by the mean ?? one standard deviation spans about an order of magnitude. These regression models, therefore, are not recommended for use in site-specific design, but rather for regional-scale seismic landslide hazard mapping or for rapid preliminary screening of sites. ?? 2007 Elsevier B.V. All rights reserved.
Pawel, David; Leggett, Richard Wayne; Eckerman, Keith F; Nelson, Christopher
2007-01-01
Federal Guidance Report No. 13 (FGR 13) provides risk coefficients for estimation of the risk of cancer due to low-level exposure to each of more than 800 radionuclides. Uncertainties in risk coefficients were quantified in FGR 13 for 33 cases (exposure to each of 11 radionuclides by each of three exposure pathways) on the basis of sensitivity analyses in which various combinations of plausible biokinetic, dosimetric, and radiation risk models were used to generate alternative risk coefficients. The present report updates the uncertainty analysis in FGR 13 for the cases of inhalation and ingestion of radionuclides and expands the analysis to all radionuclides addressed in that report. The analysis indicates that most risk coefficients for inhalation or ingestion of radionuclides are determined within a factor of 5 or less by current information. That is, application of alternate plausible biokinetic and dosimetric models and radiation risk models (based on the linear, no-threshold hypothesis with an adjustment for the dose and dose rate effectiveness factor) is unlikely to change these coefficients by more than a factor of 5. In this analysis the assessed uncertainty in the radiation risk model was found to be the main determinant of the uncertainty category for most risk coefficients, but conclusions concerning the relative contributions of risk and dose models to the total uncertainty in a risk coefficient may depend strongly on the method of assessing uncertainties in the risk model.
Coverage-adjusted entropy estimation.
Vu, Vincent Q; Yu, Bin; Kass, Robert E
2007-09-20
Data on 'neural coding' have frequently been analyzed using information-theoretic measures. These formulations involve the fundamental and generally difficult statistical problem of estimating entropy. We review briefly several methods that have been advanced to estimate entropy and highlight a method, the coverage-adjusted entropy estimator (CAE), due to Chao and Shen that appeared recently in the environmental statistics literature. This method begins with the elementary Horvitz-Thompson estimator, developed for sampling from a finite population, and adjusts for the potential new species that have not yet been observed in the sample-these become the new patterns or 'words' in a spike train that have not yet been observed. The adjustment is due to I. J. Good, and is called the Good-Turing coverage estimate. We provide a new empirical regularization derivation of the coverage-adjusted probability estimator, which shrinks the maximum likelihood estimate. We prove that the CAE is consistent and first-order optimal, with rate O(P)(1/log n), in the class of distributions with finite entropy variance and that, within the class of distributions with finite qth moment of the log-likelihood, the Good-Turing coverage estimate and the total probability of unobserved words converge at rate O(P)(1/(log n)(q)). We then provide a simulation study of the estimator with standard distributions and examples from neuronal data, where observations are dependent. The results show that, with a minor modification, the CAE performs much better than the MLE and is better than the best upper bound estimator, due to Paninski, when the number of possible words m is unknown or infinite.
Aerosol Angstrom Absorption Coefficient Comparisons during MILAGRO.
NASA Astrophysics Data System (ADS)
Marley, N. A.; Marchany-Rivera, A.; Kelley, K. L.; Mangu, A.; Gaffney, J. S.
2007-12-01
aerosol Angstrom absorption exponents by linear regression over the entire UV-visible spectral range. These results are compared to results obtained from the absorbance measurements obtained in the field. The differences in calculated Angstrom absorption exponents between the field and laboratory measurements are attributed partly to the differences in time resolution of the sample collection resulting in heavier particle pileup on the filter surface of the 12-hour samples. Some differences in calculated results can also be attributed to the presence of narrow band absorbers below 400 nm that do not fall in the wavelengths covered by the 7 wavelengths of the aethalometer. 1. Marley, N.A., J.S. Gaffney, J.C. Baird, C.A. Blazer, P.J. Drayton, and J.E. Frederick, "The determination of scattering and absorption coefficients of size-fractionated aerosols for radiative transfer calculations." Aerosol Sci. Technol., 34, 535-549, (2001). This work was conducted as part of the Department of Energy's Atmospheric Science Program as part of the Megacity Aerosol Experiment - Mexico City during MILAGRO. This research was supported by the Office of Science (BER), U.S. Department of Energy Grant No. DE-FG02-07ER64329. We also wish to thank Mexican Scientists and students for their assistance from the Instituto Mexicano de Petroleo (IMP) and CENICA.
Spontaneous skin regression and predictors of skin regression in Thai scleroderma patients.
Foocharoen, Chingching; Mahakkanukrauh, Ajanee; Suwannaroj, Siraphop; Nanagara, Ratanavadee
2011-09-01
Skin tightness is a major clinical manifestation of systemic sclerosis (SSc). Importantly for both clinicians and patients, spontaneous regression of the fibrosis process has been documented. The purpose of this study is to identify the incidence and related clinical characteristics of spontaneous regression among Thai SSc patients. A historical cohort with 4 years of follow-up was performed among SSc patients over 15 years of age diagnosed with SSc between January 1, 2005 and December 31, 2006 in Khon Kaen, Thailand. The start date was the date of the first symptom and the end date was the date of the skin score ≤2. To estimate the respective probability of regression and to assess the associated factors, the Kaplan-Meier method and Cox regression analysis was used. One hundred seventeen cases of SSc were included with a female to male ratio of 1.5:1. Thirteen patients (11.1%) experienced regression. The incidence rate of spontaneous skin regression was 0.31 per 100 person-months and the average duration of SSc at the time of regression was 35.9±15.6 months (range, 15.7-60 months). The factors that negatively correlated with regression were (a) diffuse cutaneous type, (b) Raynaud's phenomenon, (c) esophageal dysmotility, and (d) colchicine treatment at onset with a respective hazard ratio (HR) of 0.19, 0.19, 0.26, and 0.20. By contrast, the factor that positively correlated with regression was active alveolitis with cyclophosphamide therapy at onset with an HR of 4.23 (95% CI, 1.23-14.10). After regression analysis, only Raynaud's phenomenon at onset and diffuse cutaneous type had a significantly negative correlation to regression. A spontaneous regression of the skin fibrosis process was not uncommon among Thai SSc patients. The factors suggesting a poor predictor for cutaneous manifestation were Raynaud's phenomenon, diffuse cutaneous type while early cyclophosphamide therapy might be related to a better skin outcome.
Multiple regression approach to optimize drilling operations in the Arabian Gulf area
Al-Betairi, E.A.; Moussa, M.M.; Al-Otaibi, S.
1988-03-01
This paper reports a successful application of multiple regression analysis, supported by a detailed statistical study to verify the Bourgoyne and Young model. The model estimates the optimum penetration rate (ROP), weight on bit (WOB), and rotary speed under the effect of controllable and uncontrollable factors. Field data from three wells in the Arabian Gulf were used and emphasized the validity of this model. The model coefficients are sensitive to the number of points included. The correlation coefficients and multicollinearity sensitivity of each drilling parameter on the ROP are studied.
Computing aspects of power for multiple regression.
Dunlap, William P; Xin, Xue; Myers, Leann
2004-11-01
Rules of thumb for power in multiple regression research abound. Most such rules dictate the necessary sample size, but they are based only upon the number of predictor variables, usually ignoring other critical factors necessary to compute power accurately. Other guides to power in multiple regression typically use approximate rather than precise equations for the underlying distribution; entail complex preparatory computations; require interpolation with tabular presentation formats; run only under software such as Mathmatica or SAS that may not be immediately available to the user; or are sold to the user as parts of power computation packages. In contrast, the program we offer herein is immediately downloadable at no charge, runs under Windows, is interactive, self-explanatory, flexible to fit the user's own regression problems, and is as accurate as single precision computation ordinarily permits.
Uncertainty quantification in DIC with Kriging regression
NASA Astrophysics Data System (ADS)
Wang, Dezhi; DiazDelaO, F. A.; Wang, Weizhuo; Lin, Xiaoshan; Patterson, Eann A.; Mottershead, John E.
2016-03-01
A Kriging regression model is developed as a post-processing technique for the treatment of measurement uncertainty in classical subset-based Digital Image Correlation (DIC). Regression is achieved by regularising the sample-point correlation matrix using a local, subset-based, assessment of the measurement error with assumed statistical normality and based on the Sum of Squared Differences (SSD) criterion. This leads to a Kriging-regression model in the form of a Gaussian process representing uncertainty on the Kriging estimate of the measured displacement field. The method is demonstrated using numerical and experimental examples. Kriging estimates of displacement fields are shown to be in excellent agreement with 'true' values for the numerical cases and in the experimental example uncertainty quantification is carried out using the Gaussian random process that forms part of the Kriging model. The root mean square error (RMSE) on the estimated displacements is produced and standard deviations on local strain estimates are determined.
Sicras-Mainar, Antoni; Navarro-Artieda, Ruth; Blanca-Tamayo, Milagrosa; Velasco-Velasco, Soledad; Escribano-Herranz, Esperanza; Llopart-López, Josep Ramon; Violan-Fors, Concepción; Vilaseca-Llobet, Josep Maria; Sánchez-Fontcuberta, Encarna; Benavent-Areu, Jaume; Flor-Serra, Ferran; Aguado-Jodar, Alba; Rodríguez-López, Daniel; Prados-Torres, Alejandra; Estelrich-Bennasar, Jose
2009-01-01
Background The main objective of this study is to measure the relationship between morbidity, direct health care costs and the degree of clinical effectiveness (resolution) of health centres and health professionals by the retrospective application of Adjusted Clinical Groups in a Spanish population setting. The secondary objectives are to determine the factors determining inadequate correlations and the opinion of health professionals on these instruments. Methods/Design We will carry out a multi-centre, retrospective study using patient records from 15 primary health care centres and population data bases. The main measurements will be: general variables (age and sex, centre, service [family medicine, paediatrics], and medical unit), dependent variables (mean number of visits, episodes and direct costs), co-morbidity (Johns Hopkins University Adjusted Clinical Groups Case-Mix System) and effectiveness. The totality of centres/patients will be considered as the standard for comparison. The efficiency index for visits, tests (laboratory, radiology, others), referrals, pharmaceutical prescriptions and total will be calculated as the ratio: observed variables/variables expected by indirect standardization. The model of cost/patient/year will differentiate fixed/semi-fixed (visits) costs of the variables for each patient attended/year (N = 350,000 inhabitants). The mean relative weights of the cost of care will be obtained. The effectiveness will be measured using a set of 50 indicators of process, efficiency and/or health results, and an adjusted synthetic index will be constructed (method: percentile 50). The correlation between the efficiency (relative-weights) and synthetic (by centre and physician) indices will be established using the coefficient of determination. The opinion/degree of acceptance of physicians (N = 1,000) will be measured using a structured questionnaire including various dimensions. Statistical analysis: multiple regression analysis (procedure
Predictors of sociocultural adjustment among sojourning Malaysian students in Britain.
Swami, Viren
2009-08-01
The process of cross-cultural migration may be particularly difficult for students travelling overseas for further or higher education, especially where qualitative differences exist between the home and host nations. The present study examined the sociocultural adjustment of sojourning Malaysian students in Britain. Eighty-one Malay and 110 Chinese students enrolled in various courses answered a self-report questionnaire that examined various aspects of sociocultural adjustment. A series of one-way analyses of variance showed that Malay participants experienced poorer sociocultural adjustment in comparison with their Chinese counterparts. They were also less likely than Chinese students to have contact with co-nationals and host nationals, more likely to perceive their actual experience in Britain as worse than they had expected, and more likely to perceive greater cultural distance and greater discrimination. The results of regression analyses showed that, for Malay participants, perceived discrimination accounted for the greatest proportion of variance in sociocultural adjustment (73%), followed by English language proficiency (10%) and contact with host nationals (4%). For Chinese participants, English language proficiency was the strongest predictor of sociocultural adjustment (54%), followed by expectations of life in Britain (18%) and contact with host nationals (3%). By contrast, participants' sex, age, and length of residence failed to emerge as significant predictors for either ethnic group. Possible explanations for this pattern of findings are discussed, including the effects of Islamophobia on Malay-Muslims in Britain, possible socioeconomic differences between Malay and Chinese students, and personality differences between the two ethnic groups. The results are further discussed in relation to practical steps that can be taken to improve the sociocultural adjustment of sojourning students in Britain.
A tutorial on Bayesian Normal linear regression
NASA Astrophysics Data System (ADS)
Klauenberg, Katy; Wübbeler, Gerd; Mickan, Bodo; Harris, Peter; Elster, Clemens
2015-12-01
Regression is a common task in metrology and often applied to calibrate instruments, evaluate inter-laboratory comparisons or determine fundamental constants, for example. Yet, a regression model cannot be uniquely formulated as a measurement function, and consequently the Guide to the Expression of Uncertainty in Measurement (GUM) and its supplements are not applicable directly. Bayesian inference, however, is well suited to regression tasks, and has the advantage of accounting for additional a priori information, which typically robustifies analyses. Furthermore, it is anticipated that future revisions of the GUM shall also embrace the Bayesian view. Guidance on Bayesian inference for regression tasks is largely lacking in metrology. For linear regression models with Gaussian measurement errors this tutorial gives explicit guidance. Divided into three steps, the tutorial first illustrates how a priori knowledge, which is available from previous experiments, can be translated into prior distributions from a specific class. These prior distributions have the advantage of yielding analytical, closed form results, thus avoiding the need to apply numerical methods such as Markov Chain Monte Carlo. Secondly, formulas for the posterior results are given, explained and illustrated, and software implementations are provided. In the third step, Bayesian tools are used to assess the assumptions behind the suggested approach. These three steps (prior elicitation, posterior calculation, and robustness to prior uncertainty and model adequacy) are critical to Bayesian inference. The general guidance given here for Normal linear regression tasks is accompanied by a simple, but real-world, metrological example. The calibration of a flow device serves as a running example and illustrates the three steps. It is shown that prior knowledge from previous calibrations of the same sonic nozzle enables robust predictions even for extrapolations.
Recursive prescription for logarithmic jet rate coefficients
NASA Astrophysics Data System (ADS)
Gerwick, Erik
2013-11-01
We derive a recursion relation for the analytic leading logarithmic coefficients of a final state gluon cascade. We demonstrate the potential of our method by analytically computing the rate coefficients for the emission of up to 80 gluons in both the exclusive-kt (Durham) and generalized inclusive-kt class of jet algorithms. There is a particularly simple form for the ratios of resolved coefficients. We suggest potential applications for our method including the efficient generation of shower histories.
46 CFR 45.55 - Freeboard coefficient.
Code of Federal Regulations, 2014 CFR
2014-10-01
... 46 Shipping 2 2014-10-01 2014-10-01 false Freeboard coefficient. 45.55 Section 45.55 Shipping... § 45.55 Freeboard coefficient. (a) For ships less than 350 feet in length (L), the freeboard coefficient is P 1 in the formula: P 1=P+A[(L/D)-(L/D s)] where P is a factor, which is a function of...
46 CFR 45.55 - Freeboard coefficient.
Code of Federal Regulations, 2010 CFR
2010-10-01
... 46 Shipping 2 2010-10-01 2010-10-01 false Freeboard coefficient. 45.55 Section 45.55 Shipping... § 45.55 Freeboard coefficient. (a) For ships less than 350 feet in length (L), the freeboard coefficient is P 1 in the formula: P 1=P+A[(L/D)-(L/D s)] where P is a factor, which is a function of...
46 CFR 45.55 - Freeboard coefficient.
Code of Federal Regulations, 2012 CFR
2012-10-01
... 46 Shipping 2 2012-10-01 2012-10-01 false Freeboard coefficient. 45.55 Section 45.55 Shipping... § 45.55 Freeboard coefficient. (a) For ships less than 350 feet in length (L), the freeboard coefficient is P 1 in the formula: P 1=P+A[(L/D)-(L/D s)] where P is a factor, which is a function of...
46 CFR 45.55 - Freeboard coefficient.
Code of Federal Regulations, 2013 CFR
2013-10-01
... 46 Shipping 2 2013-10-01 2013-10-01 false Freeboard coefficient. 45.55 Section 45.55 Shipping... § 45.55 Freeboard coefficient. (a) For ships less than 350 feet in length (L), the freeboard coefficient is P 1 in the formula: P 1=P+A[(L/D)-(L/D s)] where P is a factor, which is a function of...
46 CFR 45.55 - Freeboard coefficient.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 46 Shipping 2 2011-10-01 2011-10-01 false Freeboard coefficient. 45.55 Section 45.55 Shipping... § 45.55 Freeboard coefficient. (a) For ships less than 350 feet in length (L), the freeboard coefficient is P 1 in the formula: P 1=P+A[(L/D)-(L/D s)] where P is a factor, which is a function of...
Jones, Lauren A; Campbell, Jonathan M
2010-01-01
We investigated correlates of language regression for children diagnosed with autism spectrum disorders (ASD). Using archival data, children diagnosed with ASD (N = 114, M age = 41.4 months) were divided into four groups based on language development (i.e., regression, plateau, general delay, no delay) and compared on developmental, adaptive behavior, symptom severity, and behavioral adjustment variables. Few overall differences emerged between groups, including similar non-language developmental history, equal risk for seizure disorder, and comparable behavioral adjustment. Groups did not differ with respect to autism symptomatology as measured by the Autism Diagnostic Observation Schedule and Autism Diagnostic Interview-Revised. Language plateau was associated with better adaptive social skills as measured by the Vineland Adaptive Behavior Scales. Implications and study limitations are discussed.
Zou, G Y; Donner, Allan
2013-12-01
The Poisson regression model using a sandwich variance estimator has become a viable alternative to the logistic regression model for the analysis of prospective studies with independent binary outcomes. The primary advantage of this approach is that it readily provides covariate-adjusted risk ratios and associated standard errors. In this article, the model is extended to studies with correlated binary outcomes as arise in longitudinal or cluster randomization studies. The key step involves a cluster-level grouping strategy for the computation of the middle term in the sandwich estimator. For a single binary exposure variable without covariate adjustment, this approach results in risk ratio estimates and standard errors that are identical to those found in the survey sampling literature. Simulation results suggest that it is reliable for studies with correlated binary data, provided the total number of clusters is at least 50. Data from observational and cluster randomized studies are used to illustrate the methods.
NASA Astrophysics Data System (ADS)
Zhan, Liwei; Li, Chengwei
2017-02-01
A hybrid PSO-SVM-based model is proposed to predict the friction coefficient between aircraft tire and coating. The presented hybrid model combines a support vector machine (SVM) with particle swarm optimization (PSO) technique. SVM has been adopted to solve regression problems successfully. Its regression accuracy is greatly related to optimizing parameters such as the regularization constant C , the parameter gamma γ corresponding to RBF kernel and the epsilon parameter \\varepsilon in the SVM training procedure. However, the friction coefficient which is predicted based on SVM has yet to be explored between aircraft tire and coating. The experiment reveals that drop height and tire rotational speed are the factors affecting friction coefficient. Bearing in mind, the friction coefficient can been predicted using the hybrid PSO-SVM-based model by the measured friction coefficient between aircraft tire and coating. To compare regression accuracy, a grid search (GS) method and a genetic algorithm (GA) are used to optimize the relevant parameters (C , γ and \\varepsilon ), respectively. The regression accuracy could be reflected by the coefficient of determination ({{R}2} ). The result shows that the hybrid PSO-RBF-SVM-based model has better accuracy compared with the GS-RBF-SVM- and GA-RBF-SVM-based models. The agreement of this model (PSO-RBF-SVM) with experiment data confirms its good performance.
Na, Hyunjoo; Dancy, Barbara L; Park, Chang
2015-06-01
The study's purpose was to explore whether frequency of cyberbullying victimization, cognitive appraisals, and coping strategies were associated with psychological adjustments among college student cyberbullying victims. A convenience sample of 121 students completed questionnaires. Linear regression analyses found frequency of cyberbullying victimization, cognitive appraisals, and coping strategies respectively explained 30%, 30%, and 27% of the variance in depression, anxiety, and self-esteem. Frequency of cyberbullying victimization and approach and avoidance coping strategies were associated with psychological adjustments, with avoidance coping strategies being associated with all three psychological adjustments. Interventions should focus on teaching cyberbullying victims to not use avoidance coping strategies.
Salience Assignment for Multiple-Instance Regression
NASA Technical Reports Server (NTRS)
Wagstaff, Kiri L.; Lane, Terran
2007-01-01
We present a Multiple-Instance Learning (MIL) algorithm for determining the salience of each item in each bag with respect to the bag's real-valued label. We use an alternating-projections constrained optimization approach to simultaneously learn a regression model and estimate all salience values. We evaluate this algorithm on a significant real-world problem, crop yield modeling, and demonstrate that it provides more extensive, intuitive, and stable salience models than Primary-Instance Regression, which selects a single relevant item from each bag.
The Lax-Onsager regression `theorem' revisited
NASA Astrophysics Data System (ADS)
Lax, Melvin
2000-05-01
It is stated by Ford and O'Connell in this festschrift issue and elsewhere that "there is no quantum regression theorem" although Lax "obtained a formula for correlation in a driven quantum system that has come to be called the quantum regression theorem". This produces a puzzle: "How can it be that a non-existent theorem gives correct results?" Clarification will be provided in this paper by a description of the Lax procedure, with a quantitative estimate of the error for a damped harmonic oscillator based on expressions published in the 1960's.
Measurement of effective air diffusion coefficients for trichloroethene in undisturbed soil cores.
Bartelt-Hunt, Shannon L; Smith, James A
2002-06-01
In this study, we measure effective diffusion coefficients for trichloroethene in undisturbed soil samples taken from Picatinny Arsenal, New Jersey. The measured effective diffusion coefficients ranged from 0.0053 to 0.0609 cm2/s over a range of air-filled porosity of 0.23-0.49. The experimental data were compared to several previously published relations that predict diffusion coefficients as a function of air-filled porosity and porosity. A multiple linear regression analysis was developed to determine if a modification of the exponents in Millington's [Science 130 (1959) 100] relation would better fit the experimental data. The literature relations appeared to generally underpredict the effective diffusion coefficient for the soil cores studied in this work. Inclusion of a particle-size distribution parameter, d10, did not significantly improve the fit of the linear regression equation. The effective diffusion coefficient and porosity data were used to recalculate estimates of diffusive flux through the subsurface made in a previous study performed at the field site. It was determined that the method of calculation used in the previous study resulted in an underprediction of diffusive flux from the subsurface. We conclude that although Millington's [Science 130 (1959) 100] relation works well to predict effective diffusion coefficients in homogeneous soils with relatively uniform particle-size distributions, it may be inaccurate for many natural soils with heterogeneous structure and/or non-uniform particle-size distributions.
Measurement of effective air diffusion coefficients for trichloroethene in undisturbed soil cores
NASA Astrophysics Data System (ADS)
Bartelt-Hunt, Shannon L.; Smith, James A.
2002-06-01
In this study, we measure effective diffusion coefficients for trichloroethene in undisturbed soil samples taken from Picatinny Arsenal, New Jersey. The measured effective diffusion coefficients ranged from 0.0053 to 0.0609 cm 2/s over a range of air-filled porosity of 0.23-0.49. The experimental data were compared to several previously published relations that predict diffusion coefficients as a function of air-filled porosity and porosity. A multiple linear regression analysis was developed to determine if a modification of the exponents in Millington's [Science 130 (1959) 100] relation would better fit the experimental data. The literature relations appeared to generally underpredict the effective diffusion coefficient for the soil cores studied in this work. Inclusion of a particle-size distribution parameter, d10, did not significantly improve the fit of the linear regression equation. The effective diffusion coefficient and porosity data were used to recalculate estimates of diffusive flux through the subsurface made in a previous study performed at the field site. It was determined that the method of calculation used in the previous study resulted in an underprediction of diffusive flux from the subsurface. We conclude that although Millington's [Science 130 (1959) 100] relation works well to predict effective diffusion coefficients in homogeneous soils with relatively uniform particle-size distributions, it may be inaccurate for many natural soils with heterogeneous structure and/or non-uniform particle-size distributions.
Moyer, Douglas; Hirsch, Robert M.; Hyer, Kenneth
2012-01-01
Nutrient and sediment fluxes and changes in fluxes over time are key indicators that water resource managers can use to assess the progress being made in improving the structure and function of the Chesapeake Bay ecosystem. The U.S. Geological Survey collects annual nutrient (nitrogen and phosphorus) and sediment flux data and computes trends that describe the extent to which water-quality conditions are changing within the major Chesapeake Bay tributaries. Two regression-based approaches were compared for estimating annual nutrient and sediment fluxes and for characterizing how these annual fluxes are changing over time. The two regression models compared are the traditionally used ESTIMATOR and the newly developed Weighted Regression on Time, Discharge, and Season (WRTDS). The model comparison focused on answering three questions: (1) What are the differences between the functional form and construction of each model? (2) Which model produces estimates of flux with the greatest accuracy and least amount of bias? (3) How different would the historical estimates of annual flux be if WRTDS had been used instead of ESTIMATOR? One additional point of comparison between the two models is how each model determines trends in annual flux once the year-to-year variations in discharge have been determined. All comparisons were made using total nitrogen, nitrate, total phosphorus, orthophosphorus, and suspended-sediment concentration data collected at the nine U.S. Geological Survey River Input Monitoring stations located on the Susquehanna, Potomac, James, Rappahannock, Appomattox, Pamunkey, Mattaponi, Patuxent, and Choptank Rivers in the Chesapeake Bay watershed. Two model characteristics that uniquely distinguish ESTIMATOR and WRTDS are the fundamental model form and the determination of model coefficients. ESTIMATOR and WRTDS both predict water-quality constituent concentration by developing a linear relation between the natural logarithm of observed constituent
NASA Astrophysics Data System (ADS)
Juan Collados-Lara, Antonio; Pardo-Iguzquiza, Eulogio; Pulido-Velazquez, David
2016-04-01
The estimation of Snow Water Equivalent (SWE) is essential for an appropriate assessment of the available water resources in Alpine catchment. The hydrologic regime in these areas is dominated by the storage of water in the snowpack, which is discharged to rivers throughout the melt season. An accurate estimation of the resources will be necessary for an appropriate analysis of the system operation alternatives using basin scale management models. In order to obtain an appropriate estimation of the SWE we need to know the spatial distribution snowpack and snow density within the Snow Cover Area (SCA). Data for these snow variables can be extracted from in-situ point measurements and air-borne/space-borne remote sensing observations. Different interpolation and simulation techniques have been employed for the estimation of the cited variables. In this paper we propose to estimate snowpack from a reduced number of ground-truth data (1 or 2 campaigns per year with 23 observation point from 2000-2014) and MODIS satellite-based observations in the Sierra Nevada Mountain (Southern Spain). Regression based methodologies has been used to study snowpack distribution using different kind of explicative variables: geographic, topographic, climatic. 40 explicative variables were considered: the longitude, latitude, altitude, slope, eastness, northness, radiation, maximum upwind slope and some mathematical transformation of each of them [Ln(v), (v)^-1; (v)^2; (v)^0.5). Eight different structure of regression models have been tested (combining 1, 2, 3 or 4 explicative variables). Y=B0+B1Xi (1); Y=B0+B1XiXj (2); Y=B0+B1Xi+B2Xj (3); Y=B0+B1Xi+B2XjXl (4); Y=B0+B1XiXk+B2XjXl (5); Y=B0+B1Xi+B2Xj+B3Xl (6); Y=B0+B1Xi+B2Xj+B3XlXk (7); Y=B0+B1Xi+B2Xj+B3Xl+B4Xk (8). Where: Y is the snow depth; (Xi, Xj, Xl, Xk) are the prediction variables (any of the 40 variables); (B0, B1, B2, B3) are the coefficients to be estimated. The ground data are employed to calibrate the multiple regressions. In
van Mierlo, Trevor; Hyatt, Douglas; Ching, Andrew T
2016-01-01
Digital Health Social Networks (DHSNs) are common; however, there are few metrics that can be used to identify participation inequality. The objective of this study was to investigate whether the Gini coefficient, an economic measure of statistical dispersion traditionally used to measure income inequality, could be employed to measure DHSN inequality. Quarterly Gini coefficients were derived from four long-standing DHSNs. The combined data set included 625,736 posts that were generated from 15,181 actors over 18,671 days. The range of actors (8-2323), posts (29-28,684), and Gini coefficients (0.15-0.37) varied. Pearson correlations indicated statistically significant associations between number of actors and number of posts (0.527-0.835, p < .001), and Gini coefficients and number of posts (0.342-0.725, p < .001). However, the association between Gini coefficient and number of actors was only statistically significant for the addiction networks (0.619 and 0.276, p < .036). Linear regression models had positive but mixed R(2) results (0.333-0.527). In all four regression models, the association between Gini coefficient and posts was statistically significant (t = 3.346-7.381, p < .002). However, unlike the Pearson correlations, the association between Gini coefficient and number of actors was only statistically significant in the two mental health networks (t = -4.305 and -5.934, p < .000). The Gini coefficient is helpful in measuring shifts in DHSN inequality. However, as a standalone metric, the Gini coefficient does not indicate optimal numbers or ratios of actors to posts, or effective network engagement. Further, mixed-methods research investigating quantitative performance metrics is required.
A step-by-step regressed pediatric kidney depth formula validated by a reasonable index
Hongwei, Si; Yingmao, Chen; Li, Li; Guangyu, Ma; Liuhai, Shen; Zhifang, Wu; Mingzhe, Shao; Sijin, Li
2017-01-01
Abstract In predicting pediatric kidney depth, we are especially interested in that the errors of most estimates are within a narrow range. Therefore, this study was intended to use the proportion of estimates within a range of −5 to 5 mm (P5 mm) to evaluate the formulas and tried to regress a kidney depth formula for children. The enrolled children aged from 1 to 19 years were randomly sampled into group A and group B (75% and 25% of all recruits, respectively). Using data of the group A, the test formula was regressed by nonlinear regression and subsequently Passing & Bablok regression, and validated in group B. The Raynaud, Gordon, Tonnesen, Taylor, and the test formulas were evaluated in the 2 groups. Accuracy was evaluated by bias, absolute bias, and P5 mm; and precision was evaluated by correlation coefficient. In addition, root-mean square error was used as a mixed index for both accuracy and precision. Body weight, height, and age did not have significant differences between the 2 groups. In the nonlinear regression, coefficients of the formula (kidney depth = a × weight/height + b × age) from group A were in narrower 95% confidence intervals. After the Passing & Bablok regression, biases of left and right kidney estimates were significantly decreased. In the evaluation of formulas, the test formula was obviously better than other formulas mentioned above, and P5 mm for left and right kidneys was about 60%. Among children younger than 10 years, P5 mm was even more than 70% for left and right kidney depths. To predict pediatric kidney depth, accuracy and precision of a step-by-step regressed formula were better than the 4 “standard” formulas. PMID:28353617
Demonstration of a Fiber Optic Regression Probe
NASA Technical Reports Server (NTRS)
Korman, Valentin; Polzin, Kurt A.
2010-01-01
The capability to provide localized, real-time monitoring of material regression rates in various applications has the potential to provide a new stream of data for development testing of various components and systems, as well as serving as a monitoring tool in flight applications. These applications include, but are not limited to, the regression of a combusting solid fuel surface, the ablation of the throat in a chemical rocket or the heat shield of an aeroshell, and the monitoring of erosion in long-life plasma thrusters. The rate of regression in the first application is very fast, while the second and third are increasingly slower. A recent fundamental sensor development effort has led to a novel regression, erosion, and ablation sensor technology (REAST). The REAST sensor allows for measurement of real-time surface erosion rates at a discrete surface location. The sensor is optical, using two different, co-located fiber-optics to perform the regression measurement. The disparate optical transmission properties of the two fiber-optics makes it possible to measure the regression rate by monitoring the relative light attenuation through the fibers. As the fibers regress along with the parent material in which they are embedded, the relative light intensities through the two fibers changes, providing a measure of the regression rate. The optical nature of the system makes it relatively easy to use in a variety of harsh, high temperature environments, and it is also unaffected by the presence of electric and magnetic fields. In addition, the sensor could be used to perform optical spectroscopy on the light emitted by a process and collected by fibers, giving localized measurements of various properties. The capability to perform an in-situ measurement of material regression rates is useful in addressing a variety of physical issues in various applications. An in-situ measurement allows for real-time data regarding the erosion rates, providing a quick method for
Li, Yanming; Zhu, Ji
2015-01-01
Summary We propose a multivariate sparse group lasso variable selection and estimation method for data with high-dimensional predictors as well as high-dimensional response variables. The method is carried out through a penalized multivariate multiple linear regression model with an arbitrary group structure for the regression coefficient matrix. It suits many biology studies well in detecting associations between multiple traits and multiple predictors, with each trait and each predictor embedded in some biological functioning groups such as genes, pathways or brain regions. The method is able to effectively remove unimportant groups as well as unimportant individual coefficients within important groups, particularly for large p small n problems, and is flexible in handling various complex group structures such as overlapping or nested or multilevel hierarchical structures. The method is evaluated through extensive simulations with comparisons to the conventional lasso and group lasso methods, and is applied to an eQTL association study. PMID:25732839
Smooth Scalar-on-Image Regression via Spatial Bayesian Variable Selection
Goldsmith, Jeff; Huang, Lei; Crainiceanu, Ciprian M.
2013-01-01
We develop scalar-on-image regression models when images are registered multidimensional manifolds. We propose a fast and scalable Bayes inferential procedure to estimate the image coefficient. The central idea is the combination of an Ising prior distribution, which controls a latent binary indicator map, and an intrinsic Gaussian Markov random field, which controls the smoothness of the nonzero coefficients. The model is fit using a single-site Gibbs sampler, which allows fitting within minutes for hundreds of subjects with predictor images containing thousands of locations. The code is simple and is provided in less than one page in the Appendix. We apply this method to a neuroimaging study where cognitive outcomes are regressed on measures of white matter microstructure at every voxel of the corpus callosum for hundreds of subjects. PMID:24729670
NASA Astrophysics Data System (ADS)
Martin, Joshua; Wong-Ng, Winnie; Caillat, Thierry; Yonenaga, I.; Green, Martin L.
2014-05-01
The Seebeck coefficient is the most widely measured property specific to thermoelectric materials. There is currently no consensus on measurement protocols, and researchers employ a variety of techniques to measure the Seebeck coefficient. The implementation of standardized measurement protocols and the use of reliable Seebeck Coefficient Standard Reference Materials (SRMs®) will allow the accurate interlaboratory comparison and validation of materials data, thereby accelerating the development and commercialization of more efficient thermoelectric materials and devices. To enable members of the thermoelectric materials community the means to calibrate Seebeck coefficient measurement equipment, NIST certified SRM® 3451 "Low Temperature Seebeck Coefficient Standard (10 K to 390 K)". Due to different practical requirements in instrumentation, sample contact methodology, and thermal stability, a complementary SRM® is required for the high temperature regime (300 K to 900 K). The principal requirement of a SRM® for the Seebeck coefficient at high temperature is thermocyclic stability. We therefore characterized the thermocyclic behavior of the Seebeck coefficient for a series of candidate materials: constantan, p-type single crystal SiGe, and p-type polycrystalline SiGe, by measuring the temperature dependence of the Seebeck coefficient as a function of 10 sequential thermal cycles, between 300 K and 900 K. We employed multiple regression analysis to interpolate and analyze the thermocyclic variability in the measurement curves.
Creativity and Regression on the Rorschach.
ERIC Educational Resources Information Center
Lazar, Billie S.
This paper describes the results of a study to further test and replicate previous studies partially supporting Kris's view that creativity is a regression in the service of the ego. For this sample of 42 female art and business college students, it was predicted that (1) highly creative Ss (measured by the Torrance Tests) produce more, and more…
Locating the Extrema of Fungible Regression Weights
ERIC Educational Resources Information Center
Waller, Niels G.; Jones, Jeff A.
2009-01-01
In a multiple regression analysis with three or more predictors, every set of alternate weights belongs to an infinite class of "fungible weights" (Waller, Psychometrica, "in press") that yields identical "SSE" (sum of squared errors) and R[superscript 2] values. When the R[superscript 2] using the alternate weights is a fixed value, fungible…
Predicting Social Trust with Binary Logistic Regression
ERIC Educational Resources Information Center
Adwere-Boamah, Joseph; Hufstedler, Shirley
2015-01-01
This study used binary logistic regression to predict social trust with five demographic variables from a national sample of adult individuals who participated in The General Social Survey (GSS) in 2012. The five predictor variables were respondents' highest degree earned, race, sex, general happiness and the importance of personally assisting…
Invariant Ordering of Item-Total Regressions
ERIC Educational Resources Information Center
Tijmstra, Jesper; Hessen, David J.; van der Heijden, Peter G. M.; Sijtsma, Klaas
2011-01-01
A new observable consequence of the property of invariant item ordering is presented, which holds under Mokken's double monotonicity model for dichotomous data. The observable consequence is an invariant ordering of the item-total regressions. Kendall's measure of concordance "W" and a weighted version of this measure are proposed as measures for…
Superquantile Regression: Theory, Algorithms, and Applications
2014-12-01
buffered reliability, uncertainty quantification, surrogate estimation, superquantile tracking, dualization of risk 147 Unclassified Unclassified...series of numerical examples that show some of the ap- plication of superquantile regression, such as superquantile tracking and surrogate estimation...dissertation by surrogate estimation. It usually occurs when the explanatory random variable is beyond our direct control, but the dependence between the
A Skew-Normal Mixture Regression Model
ERIC Educational Resources Information Center
Liu, Min; Lin, Tsung-I
2014-01-01
A challenge associated with traditional mixture regression models (MRMs), which rest on the assumption of normally distributed errors, is determining the number of unobserved groups. Specifically, even slight deviations from normality can lead to the detection of spurious classes. The current work aims to (a) examine how sensitive the commonly…
Regression Segmentation for M³ Spinal Images.
Wang, Zhijie; Zhen, Xiantong; Tay, KengYeow; Osman, Said; Romano, Walter; Li, Shuo
2015-08-01
Clinical routine often requires to analyze spinal images of multiple anatomic structures in multiple anatomic planes from multiple imaging modalities (M(3)). Unfortunately, existing methods for segmenting spinal images are still limited to one specific structure, in one specific plane or from one specific modality (S(3)). In this paper, we propose a novel approach, Regression Segmentation, that is for the first time able to segment M(3) spinal images in one single unified framework. This approach formulates the segmentation task innovatively as a boundary regression problem: modeling a highly nonlinear mapping function from substantially diverse M(3) images directly to desired object boundaries. Leveraging the advancement of sparse kernel machines, regression segmentation is fulfilled by a multi-dimensional support vector regressor (MSVR) which operates in an implicit, high dimensional feature space where M(3) diversity and specificity can be systematically categorized, extracted, and handled. The proposed regression segmentation approach was thoroughly tested on images from 113 clinical subjects including both disc and vertebral structures, in both sagittal and axial planes, and from both MRI and CT modalities. The overall result reaches a high dice similarity index (DSI) 0.912 and a low boundary distance (BD) 0.928 mm. With our unified and expendable framework, an efficient clinical tool for M(3) spinal image segmentation can be easily achieved, and will substantially benefit the diagnosis and treatment of spinal diseases.
Categorical Variables in Multiple Regression: Some Cautions.
ERIC Educational Resources Information Center
O'Grady, Kevin E.; Medoff, Deborah R.
1988-01-01
Limitations of dummy coding and nonsense coding as methods of coding categorical variables for use as predictors in multiple regression analysis are discussed. The combination of these approaches often yields estimates and tests of significance that are not intended by researchers for inclusion in their models. (SLD)
Revisiting Regression in Autism: Heller's "Dementia Infantilis"
ERIC Educational Resources Information Center
Westphal, Alexander; Schelinski, Stefanie; Volkmar, Fred; Pelphrey, Kevin
2013-01-01
Theodor Heller first described a severe regression of adaptive function in normally developing children, something he termed dementia infantilis, over one 100 years ago. Dementia infantilis is most closely related to the modern diagnosis, childhood disintegrative disorder. We translate Heller's paper, Uber Dementia Infantilis, and discuss…
A Spline Regression Model for Latent Variables
ERIC Educational Resources Information Center
Harring, Jeffrey R.
2014-01-01
Spline (or piecewise) regression models have been used in the past to account for patterns in observed data that exhibit distinct phases. The changepoint or knot marking the shift from one phase to the other, in many applications, is an unknown parameter to be estimated. As an extension of this framework, this research considers modeling the…
Model building in nonproportional hazard regression.
Rodríguez-Girondo, Mar; Kneib, Thomas; Cadarso-Suárez, Carmen; Abu-Assi, Emad
2013-12-30
Recent developments of statistical methods allow for a very flexible modeling of covariates affecting survival times via the hazard rate, including also the inspection of possible time-dependent associations. Despite their immediate appeal in terms of flexibility, these models typically introduce additional difficulties when a subset of covariates and the corresponding modeling alternatives have to be chosen, that is, for building the most suitable model for given data. This is particularly true when potentially time-varying associations are given. We propose to conduct a piecewise exponential representation of the original survival data to link hazard regression with estimation schemes based on of the Poisson likelihood to make recent advances for model building in exponential family regression accessible also in the nonproportional hazard regression context. A two-stage stepwise selection approach, an approach based on doubly penalized likelihood, and a componentwise functional gradient descent approach are adapted to the piecewise exponential regression problem. These three techniques were compared via an intensive simulation study. An application to prognosis after discharge for patients who suffered a myocardial infarction supplements the simulation to demonstrate the pros and cons of the approaches in real data analyses.
Prediction of dynamical systems by symbolic regression
NASA Astrophysics Data System (ADS)
Quade, Markus; Abel, Markus; Shafi, Kamran; Niven, Robert K.; Noack, Bernd R.
2016-07-01
We study the modeling and prediction of dynamical systems based on conventional models derived from measurements. Such algorithms are highly desirable in situations where the underlying dynamics are hard to model from physical principles or simplified models need to be found. We focus on symbolic regression methods as a part of machine learning. These algorithms are capable of learning an analytically tractable model from data, a highly valuable property. Symbolic regression methods can be considered as generalized regression methods. We investigate two particular algorithms, the so-called fast function extraction which is a generalized linear regression algorithm, and genetic programming which is a very general method. Both are able to combine functions in a certain way such that a good model for the prediction of the temporal evolution of a dynamical system can be identified. We illustrate the algorithms by finding a prediction for the evolution of a harmonic oscillator based on measurements, by detecting an arriving front in an excitable system, and as a real-world application, the prediction of solar power production based on energy production observations at a given site together with the weather forecast.
Prediction of dynamical systems by symbolic regression.
Quade, Markus; Abel, Markus; Shafi, Kamran; Niven, Robert K; Noack, Bernd R
2016-07-01
We study the modeling and prediction of dynamical systems based on conventional models derived from measurements. Such algorithms are highly desirable in situations where the underlying dynamics are hard to model from physical principles or simplified models need to be found. We focus on symbolic regression methods as a part of machine learning. These algorithms are capable of learning an analytically tractable model from data, a highly valuable property. Symbolic regression methods can be considered as generalized regression methods. We investigate two particular algorithms, the so-called fast function extraction which is a generalized linear regression algorithm, and genetic programming which is a very general method. Both are able to combine functions in a certain way such that a good model for the prediction of the temporal evolution of a dynamical system can be identified. We illustrate the algorithms by finding a prediction for the evolution of a harmonic oscillator based on measurements, by detecting an arriving front in an excitable system, and as a real-world application, the prediction of solar power production based on energy production observations at a given site together with the weather forecast.
A Constrained Linear Estimator for Multiple Regression
ERIC Educational Resources Information Center
Davis-Stober, Clintin P.; Dana, Jason; Budescu, David V.
2010-01-01
"Improper linear models" (see Dawes, Am. Psychol. 34:571-582, "1979"), such as equal weighting, have garnered interest as alternatives to standard regression models. We analyze the general circumstances under which these models perform well by recasting a class of "improper" linear models as "proper" statistical models with a single predictor. We…
Assumptions of Multiple Regression: Correcting Two Misconceptions
ERIC Educational Resources Information Center
Williams, Matt N.; Gomez Grajales, Carlos Alberto; Kurkiewicz, Dason
2013-01-01
In 2002, an article entitled "Four assumptions of multiple regression that researchers should always test" by Osborne and Waters was published in "PARE." This article has gone on to be viewed more than 275,000 times (as of August 2013), and it is one of the first results displayed in a Google search for "regression…
The Shadow Side of Regressive Groups.
ERIC Educational Resources Information Center
McClure, Bud A.
1994-01-01
Contends that inability of groups to address conflict, encourage dissenting views, and face their negative characteristics can result in destructive behavior toward others that remains largely outside awareness of individual members. Examines regressive group characteristics; behavior of United States during Persian Gulf War is used to highlight…
Moving the Bar: Transformations in Linear Regression.
ERIC Educational Resources Information Center
Miranda, Janet
The assumption that is most important to the hypothesis testing procedure of multiple linear regression is the assumption that the residuals are normally distributed, but this assumption is not always tenable given the realities of some data sets. When normal distribution of the residuals is not met, an alternative method can be initiated. As an…
1990-05-01
the middle of the protruding point of shoulder from the the relaxed abdomen neck to the tip of the of a seated subject. shoulder. Acropodioa: The tip...MNidsouldcr:. The Q (2 point on top of the right shoulder midway Midpatcllao The between the neck anterior point halfway (right trapezius point) between...rightlatm-r, and left ’ Midspine: A line lateral Anterior and down the center of lateral points at the -" the back. base of the neck . Olecranon
Inflation Adjustments for Defense Acquisition
2014-10-01
Harmon Daniel B. Levine Stanley A. Horowitz, Project Leader INSTITUTE FOR DEFENSE ANALYSES 4850 Mark Center Drive Alexandria, Virginia 22311-1882 Approved...T U T E F O R D E F E N S E A N A L Y S E S IDA Document D-5112 Inflation Adjustments for Defense Acquisition Bruce R. Harmon Daniel B. Levine...might do a better job? The focus of the study is on aircraft procurement. By way of terminology , “cost index,” “price index,” and “deflator” are used
Adjustable extender for instrument module
Sevec, J.B.; Stein, A.D.
1975-11-01
A blank extender module used to mount an instrument module in front of its console for repair or test purposes has been equipped with a rotatable mount and means for locking the mount at various angles of rotation for easy accessibility. The rotatable mount includes a horizontal conduit supported by bearings within the blank module. The conduit is spring-biased in a retracted position within the blank module and in this position a small gear mounted on the conduit periphery is locked by a fixed pawl. The conduit and instrument mount can be pulled into an extended position with the gear clearing the pawl to permit rotation and adjustment of the instrument.
Montgomery, M E; White, M E; Martin, S W
1987-01-01
Results from discriminant analysis and logistic regression were compared using two data sets from a study on predictors of coliform mastitis in dairy cows. Both techniques selected the same set of variables as important predictors and were of nearly equal value in classifying cows as having, or not having mastitis. The logistic regression model made fewer classification errors. The magnitudes of the effects were considerably different for some variables. Given the failure to meet the underlying assumptions of discriminant analysis, the coefficients from logistic regression are preferable. PMID:3453271
Drag and energy accommodation coefficients during sunspot maximum
NASA Astrophysics Data System (ADS)
Pardini, Carmen; Anselmo, Luciano; Moe, Kenneth; Moe, Mildred M.
over 30-day intervals from October 1999 to December 2002, was found to be 2.16. When the measurement was adjusted upward by 12 % to correct for the bias in the density model, the resulting physical drag coefficient became 2.42. Comparison of this result with the calculated dependence of the drag coefficient of SNOE on the energy accommodation coefficient has revealed agreement for α = 0.97. This high an accommodation coefficient implies that a considerable portion of the satellite surface had oxygen atoms adsorbed on it. This is physically reasonable because during sunspot maximum the increased solar UV radiation dissociates more molecular oxygen and the UV heating causes the atmosphere to expand to higher altitudes. The analysis was carried out for several satellites, having perigee altitudes between 300 and 800 km, during the last solar cycle maximum. This sample included a subset of Surrey satellites, for which the main characteristics (i.e. size, shape, mass, attitude, etc.) were known, and some spherical, or nearly spherical, objects.
Commentary on Coefficient Alpha: A Cautionary Tale
ERIC Educational Resources Information Center
Green, Samuel B.; Yang, Yanyun
2009-01-01
The general use of coefficient alpha to assess reliability should be discouraged on a number of grounds. The assumptions underlying coefficient alpha are unlikely to hold in practice, and violation of these assumptions can result in nontrivial negative or positive bias. Structural equation modeling was discussed as an informative process both to…
Implications of NGA for NEHRP site coefficients
Borcherdt, Roger D.
2012-01-01
Three proposals are provided to update tables 11.4-1 and 11.4-2 of Minimum Design Loads for Buildings and Other Structures (7-10), by the American Society of Civil Engineers (2010) (ASCE/SEI 7-10), with site coefficients implied directly by NGA (Next Generation Attenuation) ground motion prediction equations (GMPEs). Proposals include a recommendation to use straight-line interpolation to infer site coefficients at intermediate values of ̅vs (average shear velocity). Site coefficients are recommended to ensure consistency with ASCE/SEI 7-10 MCER (Maximum Considered Earthquake) seismic-design maps and simplified site-specific design spectra procedures requiring site classes with associated tabulated site coefficients and a reference site class with unity site coefficients. Recommended site coefficients are confirmed by independent observations of average site amplification coefficients inferred with respect to an average ground condition consistent with that used for the MCER maps. The NGA coefficients recommended for consideration are implied directly by the NGA GMPEs and do not require introduction of additional models.
Coefficient Alpha and Reliability of Scale Scores
ERIC Educational Resources Information Center
Almehrizi, Rashid S.
2013-01-01
The majority of large-scale assessments develop various score scales that are either linear or nonlinear transformations of raw scores for better interpretations and uses of assessment results. The current formula for coefficient alpha (a; the commonly used reliability coefficient) only provides internal consistency reliability estimates of raw…
Decay of (p,q)-Fourier coefficients.
Edmunds, David E; Gurka, Petr; Lang, Jan
2014-10-08
We show that essentially the speed of decay of the Fourier sine coefficients of a function in a Lebesgue space is comparable to that of the corresponding coefficients with respect to the basis formed by the generalized sine functions sin p,q .
A gain-coefficient switched Alexandrite laser
NASA Astrophysics Data System (ADS)
Lee, Chris J.; van der Slot, Peter J. M.; Boller, Klaus-J.
2013-01-01
We report on a gain-coefficient switched Alexandrite laser. An electro-optic modulator is used to switch between high and low gain states by making use of the polarization dependent gain of Alexandrite. In gain-coefficient switched mode, the laser produces 85 ns pulses with a pulse energy of 240 mJ at a repetition rate of 5 Hz.
Coefficient Alpha Bootstrap Confidence Interval under Nonnormality
ERIC Educational Resources Information Center
Padilla, Miguel A.; Divers, Jasmin; Newton, Matthew
2012-01-01
Three different bootstrap methods for estimating confidence intervals (CIs) for coefficient alpha were investigated. In addition, the bootstrap methods were compared with the most promising coefficient alpha CI estimation methods reported in the literature. The CI methods were assessed through a Monte Carlo simulation utilizing conditions…
Calculator program set up for film coefficients
Gracey, J.O.; Teter, D.L.
1982-11-15
Describes a mechanized computation scheme for the film coefficients used in heat transfer calculations designed for the Texas Instruments TI-59 programmable calculator. Presents tables showing application conditions (small diagram included) and the corresponding heat transfer equations for 10 heat flow situations; symbols used; user instructions, a complete film coefficient program; and storage assignments. Example problem and corresponding printout are given.
Meta-Analysis of Coefficient Alpha
ERIC Educational Resources Information Center
Rodriguez, Michael C.; Maeda, Yukiko
2006-01-01
The meta-analysis of coefficient alpha across many studies is becoming more common in psychology by a methodology labeled reliability generalization. Existing reliability generalization studies have not used the sampling distribution of coefficient alpha for precision weighting and other common meta-analytic procedures. A framework is provided for…
Disability and Coping as Predictors of Psychological Adjustment to Rheumatoid Arthritis.
ERIC Educational Resources Information Center
Revenson, Tracey A.; Felton, Barbara J.
1989-01-01
Examined degree to which self-reported functional disability and coping efforts contributed to psychological adjustment among 45 rheumatoid arthritis patients over six months. Hierarchical multiple regression analyses indicated that increases in disability were related to decreased acceptance of illness and increased negative affect, while coping…
Native American Racial Identity Development and College Adjustment at Two-Year Institutions
ERIC Educational Resources Information Center
Watson, Joshua C.
2009-01-01
In this study, a series of simultaneous multiple regression analyses were conducted to examine the relationship between racial identity development and college adjustment for a sample of 76 Choctaw community college students in the South. Results indicated that 3 of the 4 racial identity statuses (dissonance, immersion-emersion, and…
Teaching Practices and the Promotion of Achievement and Adjustment in First Grade
ERIC Educational Resources Information Center
Perry, Kathryn E.; Donohue, Kathleen M.; Weinstein, Rhona S.
2007-01-01
The effects of teacher practices in promoting student academic achievement, behavioral adjustment, and feelings of competence were investigated in a prospective study of 257 children in 14 first grade classrooms. Using hierarchical linear modeling and regression techniques, observed teaching practices in the fall were explored as predictors of…
Parenting Styles, Drug Use, and Children's Adjustment in Families of Young Adults.
ERIC Educational Resources Information Center
Kandel, Denise B.
1990-01-01
Examined childrearing practices and child adjustment in longitudinal cohort of young adults for whom detailed drug histories were available. Maternal drug use retained statistically significant unique effect on child control problems when other parental variables were entered simultaneously in multiple regression equation and was one of two…
A Study of Perfectionism, Attachment, and College Student Adjustment: Testing Mediational Models.
ERIC Educational Resources Information Center
Hood, Camille A.; Kubal, Anne E.; Pfaller, Joan; Rice, Kenneth G.
Mediational models predicting college students' adjustment were tested using regression analyses. Contemporary adult attachment theory was employed to explore the cognitive/affective mechanisms by which adult attachment and perfectionism affect various aspects of psychological functioning. Consistent with theoretical expectations, results…
Polymer-water partition coefficients in polymeric passive samplers.
Asgarpour Khansary, Milad; Shirazian, Saeed; Asadollahzadeh, Mehdi
2017-01-01
Passive samplers are of the most applied methods and tools for measuring concentration of hydrophobic organic compounds in water (c 1(W) ) in which the polymer-water partition coefficients (D) are of fundamental importance for reliability of measurements. Due to the cost and time associated with the experimental researches, development of a predictive method for estimation and evaluation of performance of polymeric passive samplers for various hydrophobic organic compounds is highly needed and valuable. For this purpose, in this work, following the fundamental chemical thermodynamic equations governing the concerned local equilibrium, successful attempts were made to establish a theoretical model of polymer-water partition coefficients. Flory-Huggins model based on the Hansen solubility parameters was used for calculation of activity coefficients. The method was examined for reliability of calculations using collected data of three polymeric passive samplers and ten compounds. A regression model of form ln(D) = 0.707ln(c 1(p) ) - 2.7391 with an R (2) = 0.9744 was obtained to relate the polymer-water partition coefficients (D) and concentration of hydrophobic organic compounds in passive sampler (c 1(p) ). It was also found that polymer-water partition coefficients are related to the concentration of hydrophobic organic compounds in water (c 1(W) ) as ln(D) = 2.412ln(c 1(p) ) - 9.348. Based on the results, the tie lines of concentration for hydrophobic organic compounds in passive sampler (c 1(p) ) and concentration of hydrophobic organic compounds in water (c 1(W) ) are in the form of ln(c 1(W) ) = 0.293ln(c 1(p) ) + 2.734. The composition of water sample and the interaction parameters of dissolved compound-water and dissolved compound-polymer, temperature, etc. actively influence the values of partition coefficient. The discrepancy observed over experimental data can be simply justified based on the local condition of sampling sites which alter
Lee, Wonyul; Liu, Yufeng
2012-10-01
Multivariate regression is a common statistical tool for practical problems. Many multivariate regression techniques are designed for univariate response cases. For problems with multiple response variables available, one common approach is to apply the univariate response regression technique separately on each response variable. Although it is simple and popular, the univariate response approach ignores the joint information among response variables. In this paper, we propose three new methods for utilizing joint information among response variables. All methods are in a penalized likelihood framework with weighted L(1) regularization. The proposed methods provide sparse estimators of conditional inverse co-variance matrix of response vector given explanatory variables as well as sparse estimators of regression parameters. Our first approach is to estimate the regression coefficients with plug-in estimated inverse covariance matrices, and our second approach is to estimate the inverse covariance matrix with plug-in estimated regression parameters. Our third approach is to estimate both simultaneously. Asymptotic properties of these methods are explored. Our numerical examples demonstrate that the proposed methods perform competitively in terms of prediction, variable selection, as well as inverse covariance matrix estimation.
Evaluation of Regression Models of Balance Calibration Data Using an Empirical Criterion
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert; Volden, Thomas R.
2012-01-01
An empirical criterion for assessing the significance of individual terms of regression models of wind tunnel strain gage balance outputs is evaluated. The criterion is based on the percent contribution of a regression model term. It considers a term to be significant if its percent contribution exceeds the empirical threshold of 0.05%. The criterion has the advantage that it can easily be computed using the regression coefficients of the gage outputs and the load capacities of the balance. First, a definition of the empirical criterion is provided. Then, it is compared with an alternate statistical criterion that is widely used in regression analysis. Finally, calibration data sets from a variety of balances are used to illustrate the connection between the empirical and the statistical criterion. A review of these results indicated that the empirical criterion seems to be suitable for a crude assessment of the significance of a regression model term as the boundary between a significant and an insignificant term cannot be defined very well. Therefore, regression model term reduction should only be performed by using the more universally applicable statistical criterion.
Deng, Yangyang; Parajuli, Prem B.
2011-08-10
Evaluation of economic feasibility of a bio-gasification facility needs understanding of its unit cost under different production capacities. The objective of this study was to evaluate the unit cost of syngas production at capacities from 60 through 1800Nm 3/h using an economic model with three regression analysis techniques (simple regression, reciprocal regression, and log-log regression). The preliminary result of this study showed that reciprocal regression analysis technique had the best fit curve between per unit cost and production capacity, with sum of error squares (SES) lower than 0.001 and coefficient of determination of (R 2) 0.996. The regression analysis techniques determined the minimum unit cost of syngas production for micro-scale bio-gasification facilities of $0.052/Nm 3, under the capacity of 2,880 Nm 3/h. The results of this study suggest that to reduce cost, facilities should run at a high production capacity. In addition, the contribution of this technique could be the new categorical criterion to evaluate micro-scale bio-gasification facility from the perspective of economic analysis.
Demenais, F M
1991-01-01
Statistical models have been developed to delineate the major-gene and non-major-gene factors accounting for the familial aggregation of complex diseases. The mixed model assumes an underlying liability to the disease, to which a major gene, a multifactorial component, and random environment contribute independently. Affection is defined by a threshold on the liability scale. The regressive logistic models assume that the logarithm of the odds of being affected is a linear function of major genotype, phenotypes of antecedents and other covariates. An equivalence between these two approaches cannot be derived analytically. I propose a formulation of the regressive logistic models on the supposition of an underlying liability model of disease. Relatives are assumed to have correlated liabilities to the disease; affected persons have liabilities exceeding an estimable threshold. Under the assumption that the correlation structure of the relatives' liabilities follows a regressive model, the regression coefficients on antecedents are expressed in terms of the relevant familial correlations. A parsimonious parameterization is a consequence of the assumed liability model, and a one-to-one correspondence with the parameters of the mixed model can be established. The logits, derived under the class A regressive model and under the class D regressive model, can be extended to include a large variety of patterns of family dependence, as well as gene-environment interactions. PMID:1897524
Estimation of the simple correlation coefficient.
Shieh, Gwowen
2010-11-01
This article investigates some unfamiliar properties of the Pearson product-moment correlation coefficient for the estimation of simple correlation coefficient. Although Pearson's r is biased, except for limited situations, and the minimum variance unbiased estimator has been proposed in the literature, researchers routinely employ the sample correlation coefficient in their practical applications, because of its simplicity and popularity. In order to support such practice, this study examines the mean squared errors of r and several prominent formulas. The results reveal specific situations in which the sample correlation coefficient performs better than the unbiased and nearly unbiased estimators, facilitating recommendation of r as an effect size index for the strength of linear association between two variables. In addition, related issues of estimating the squared simple correlation coefficient are also considered.
An agreement coefficient for image comparison
Ji, L.; Gallo, K.
2006-01-01
Combination of datasets acquired from different sensor systems is necessary to construct a long time-series dataset for remotely sensed land-surface variables. Assessment of the agreement of the data derived from various sources is an important issue in understanding the data continuity through the time-series. Some traditional measures, including correlation coefficient, coefficient of determination, mean absolute error, and root mean square error, are not always optimal for evaluating the data agreement. For this reason, we developed a new agreement coefficient for comparing two different images. The agreement coefficient has the following properties: non-dimensional, bounded, symmetric, and distinguishable between systematic and unsystematic differences. The paper provides examples of agreement analyses for hypothetical data and actual remotely sensed data. The results demonstrate that the agreement coefficient does include the above properties, and therefore is a useful tool for image comparison. ?? 2006 American Society for Photogrammetry and Remote Sensing.
Jung, Kyung-Won; Ahn, Kyu-Hong
2016-01-01
The present study is focused on the application of recovered coagulant (RC) by acidification from drinking water treatment residuals for both adjusting the initial pH and aiding coagulant in electrocoagulation. To do this, real cotton textile wastewater was used as a target pollutant, and decolorization and chemical oxygen demand (COD) removal efficiency were monitored. A preliminary test indicated that a stainless steel electrode combined with RC significantly accelerated decolorization and COD removal efficiencies, by about 52% and 56%, respectively, even at an operating time of 5 min. A single electrocoagulation system meanwhile requires at least 40 min to attain the similar removal performances. Subsequently, the interactive effect of three independent variables (applied voltage, initial pH, and reaction time) on the response variables (decolorization and COD removal) was evaluated, and these parameters were statistically optimized using the response surface methodology. Analysis of variance showed a high coefficient of determination values (decolorization, R(2) = 0.9925 and COD removal, R(2) = 0.9973) and satisfactory prediction second-order polynomial quadratic regression models. Average decolorization and COD removal of 89.52% and 94.14%, respectively, were achieved, corresponding to 97.8% and 98.1% of the predicted values under statistically optimized conditions. The results suggest that the RC effectively played a dual role of both adjusting the initial pH and aiding coagulant in the electrocoagulation process.
Confidence interval of difference of proportions in logistic regression in presence of covariates.
Reeve, Russell
2016-03-16
Comparison of treatment differences in incidence rates is an important objective of many clinical trials. However, often the proportion is affected by covariates, and the adjustment of the predicted proportion is made using logistic regression. It is desirable to estimate the treatment differences in proportions adjusting for the covariates, similarly to the comparison of adjusted means in analysis of variance. Because of the correlation between the point estimates in the different treatment groups, the standard methods for constructing confidence intervals are inadequate. The problem is more difficult in the binary case, as the comparison is not uniquely defined, and the sampling distribution more difficult to analyze. Four procedures for analyzing the data are presented, which expand upon existing methods and generalize the link function. It is shown that, among the four methods studied, the resampling method based on the exact distribution function yields a coverage rate closest to the nominal.
JOINT STRUCTURE SELECTION AND ESTIMATION IN THE TIME-VARYING COEFFICIENT COX MODEL
Xiao, Wei; Lu, Wenbin; Zhang, Hao Helen
2016-01-01
Time-varying coefficient Cox model has been widely studied and popularly used in survival data analysis due to its flexibility for modeling covariate effects. It is of great practical interest to accurately identify the structure of covariate effects in a time-varying coefficient Cox model, i.e. covariates with null effect, constant effect and truly time-varying effect, and estimate the corresponding regression coefficients. Combining the ideas of local polynomial smoothing and group nonnegative garrote, we develop a new penalization approach to achieve such goals. Our method is able to identify the underlying true model structure with probability tending to one and simultaneously estimate the time-varying coefficients consistently. The asymptotic normalities of the resulting estimators are also established. We demonstrate the performance of our method using simulations and an application to the primary biliary cirrhosis data. PMID:27540275
A Two-step Estimation Approach for Logistic Varying Coefficient Modeling of Longitudinal Data.
Dong, Jun; Estes, Jason P; Li, Gang; Şentürk, Damla
2016-07-01
Varying coefficient models are useful for modeling longitudinal data and have been extensively studied in the past decade. Motivated by commonly encountered dichotomous outcomes in medical and health cohort studies, we propose a two-step method to estimate the regression coefficient functions in a logistic varying coefficient model for a longitudinal binary outcome. The model depicts time-varying covariate effects without imposing stringent parametric assumptions. The proposed estimation is simple and can be conveniently implemented using existing statistical packages such as SAS and R. We study asymptotic properties of the proposed estimators which lead to asymptotic inference and also develop bootstrap inferential procedures to test whether the coefficient functions are indeed time-varying or are equal to zero. The proposed methodology is illustrated with the analysis of a smoking cessation data set. Simulations are used to evaluate the performance of the proposed method compared to an alternative estimation method based on local maximum likelihood.
Embedded Sensors for Measuring Surface Regression
NASA Technical Reports Server (NTRS)
Gramer, Daniel J.; Taagen, Thomas J.; Vermaak, Anton G.
2006-01-01
The development and evaluation of new hybrid and solid rocket motors requires accurate characterization of the propellant surface regression as a function of key operational parameters. These characteristics establish the propellant flow rate and are prime design drivers affecting the propulsion system geometry, size, and overall performance. There is a similar need for the development of advanced ablative materials, and the use of conventional ablatives exposed to new operational environments. The Miniature Surface Regression Sensor (MSRS) was developed to serve these applications. It is designed to be cast or embedded in the material of interest and regresses along with it. During this process, the resistance of the sensor is related to its instantaneous length, allowing the real-time thickness of the host material to be established. The time derivative of this data reveals the instantaneous surface regression rate. The MSRS could also be adapted to perform similar measurements for a variety of other host materials when it is desired to monitor thicknesses and/or regression rate for purposes of safety, operational control, or research. For example, the sensor could be used to monitor the thicknesses of brake linings or racecar tires and indicate when they need to be replaced. At the time of this reporting, over 200 of these sensors have been installed into a variety of host materials. An MSRS can be made in either of two configurations, denoted ladder and continuous (see Figure 1). A ladder MSRS includes two highly electrically conductive legs, across which narrow strips of electrically resistive material are placed at small increments of length. These strips resemble the rungs of a ladder and are electrically equivalent to many tiny resistors connected in parallel. A substrate material provides structural support for the legs and rungs. The instantaneous sensor resistance is read by an external signal conditioner via wires attached to the conductive legs on the
Logistic models--an odd(s) kind of regression.
Jupiter, Daniel C
2013-01-01
The logistic regression model bears some similarity to the multivariable linear regression with which we are familiar. However, the differences are great enough to warrant a discussion of the need for and interpretation of logistic regression.
Non-crossing weighted kernel quantile regression with right censored data.
Bang, Sungwan; Eo, Soo-Heang; Cho, Yong Mee; Jhun, Myoungshic; Cho, HyungJun
2016-01-01
Regarding survival data analysis in regression modeling, multiple conditional quantiles are useful summary statistics to assess covariate effects on survival times. In this study, we consider an estimation problem of multiple nonlinear quantile functions with right censored survival data. To account for censoring in estimating a nonlinear quantile function, weighted kernel quantile regression (WKQR) has been developed by using the kernel trick and inverse-censoring-probability weights. However, the individually estimated quantile functions based on the WKQR often cross each other and consequently violate the basic properties of quantiles. To avoid this problem of quantile crossing, we propose the non-crossing weighted kernel quantile regression (NWKQR), which estimates multiple nonlinear conditional quantile functions simultaneously by enforcing the non-crossing constraints on kernel coefficients. The numerical results are presented to demonstrate the competitive performance of the proposed NWKQR over the WKQR.
Statistical methods for astronomical data with upper limits. II - Correlation and regression
NASA Technical Reports Server (NTRS)
Isobe, T.; Feigelson, E. D.; Nelson, P. I.
1986-01-01
Statistical methods for calculating correlations and regressions in bivariate censored data where the dependent variable can have upper or lower limits are presented. Cox's regression and the generalization of Kendall's rank correlation coefficient provide significant levels of correlations, and the EM algorithm, under the assumption of normally distributed errors, and its nonparametric analog using the Kaplan-Meier estimator, give estimates for the slope of a regression line. Monte Carlo simulations demonstrate that survival analysis is reliable in determining correlations between luminosities at different bands. Survival analysis is applied to CO emission in infrared galaxies, X-ray emission in radio galaxies, H-alpha emission in cooling cluster cores, and radio emission in Seyfert galaxies.
Azarang, Leyla; Scheike, Thomas; de Uña-Álvarez, Jacobo
2017-02-26
In this work, we present direct regression analysis for the transition probabilities in the possibly non-Markov progressive illness-death model. The method is based on binomial regression, where the response is the indicator of the occupancy for the given state along time. Randomly weighted score equations that are able to remove the bias due to censoring are introduced. By solving these equations, one can estimate the possibly time-varying regression coefficients, which have an immediate interpretation as covariate effects on the transition probabilities. The performance of the proposed estimator is investigated through simulations. We apply the method to data from the Registry of Systematic Lupus Erythematosus RELESSER, a multicenter registry created by the Spanish Society of Rheumatology. Specifically, we investigate the effect of age at Lupus diagnosis, sex, and ethnicity on the probability of damage and death along time. Copyright © 2017 John Wiley & Sons, Ltd.
Adjusting the Contour of Reflector Panels
NASA Technical Reports Server (NTRS)
Palmer, W. B.; Giebler, M. M.
1984-01-01
Postfabrication adjustment of contour of panels for reflector, such as parabolic reflector for radio antennas, possible with simple mechanism consisting of threaded stud, two nuts, and flexure. Contours adjusted manually.
48 CFR 1450.103 - Contract adjustments.
Code of Federal Regulations, 2010 CFR
2010-10-01
... 48 Federal Acquisition Regulations System 5 2010-10-01 2010-10-01 false Contract adjustments. 1450.103 Section 1450.103 Federal Acquisition Regulations System DEPARTMENT OF THE INTERIOR CONTRACT... Contract adjustments....
First Year Adjustment in the Secondary School.
ERIC Educational Resources Information Center
Loosemore, Jean Ann
1978-01-01
This study investigated the relationship between adjustment to secondary school and 17 cognitive and noncognitive variables, including intelligence (verbal and nonverbal reasoning), academic achievement, extraversion-introversion, stable/unstable, social adjustment, endeavor, age, sex, and school form. (CP)
NASA Astrophysics Data System (ADS)
Polat, Esra; Gunay, Suleyman
2013-10-01
One of the problems encountered in Multiple Linear Regression (MLR) is multicollinearity, which causes the overestimation of the regression parameters and increase of the variance of these parameters. Hence, in case of multicollinearity presents, biased estimation procedures such as classical Principal Component Regression (CPCR) and Partial Least Squares Regression (PLSR) are then performed. SIMPLS algorithm is the leading PLSR algorithm because of its speed, efficiency and results are easier to interpret. However, both of the CPCR and SIMPLS yield very unreliable results when the data set contains outlying observations. Therefore, Hubert and Vanden Branden (2003) have been presented a robust PCR (RPCR) method and a robust PLSR (RPLSR) method called RSIMPLS. In RPCR, firstly, a robust Principal Component Analysis (PCA) method for high-dimensional data on the independent variables is applied, then, the dependent variables are regressed on the scores using a robust regression method. RSIMPLS has been constructed from a robust covariance matrix for high-dimensional data and robust linear regression. The purpose of this study is to show the usage of RPCR and RSIMPLS methods on an econometric data set, hence, making a comparison of two methods on an inflation model of Turkey. The considered methods have been compared in terms of predictive ability and goodness of fit by using a robust Root Mean Squared Error of Cross-validation (R-RMSECV), a robust R2 value and Robust Component Selection (RCS) statistic.
Theoretical calculation of Joule-Thomson coefficient by using third virial coefficient
NASA Astrophysics Data System (ADS)
Mamedov, Bahtiyar Akber; Somuncu, Elif; Askerov, Iskender M.
2017-02-01
The Joule-Thomson coefficient has been theoretical investigated by using third virial coefficient. Established expressions enable us accurate and rapid calculations of Joule-Thomson coefficient. As seen from numerical results the analytical expressions for third virial coefficients are a very useful, giving a very fast method to calculate other thermodynamics properties of gasses. As an example, the calculation results have been successfully tested by using various literature data.
Experimental Determination of Infrared Extinction Coefficients of Interplanetary Dust Particles
NASA Technical Reports Server (NTRS)
Spann, J. F., Jr.; Abbas, M. M.
1998-01-01
This technique is based on irradiating a single isolated charged dust particle suspended in balance by an electric field, and measuring the scattered radiation as a function of angle. The observed scattered intensity profile at a specific wavelength obtained for a dust particle of known composition is compared with Mie theory calculations, and the variable parameters relating to the particle size and complex refractive index are adjusted for a best fit between the two profiles. This leads to a simultaneous determination of the particle radius, the complex refractive index, and the scattering and extinction coefficients. The results of these experiments can be utilized to examine the IRAS and DIRBE (Diffuse Infrared Background Experiment) infrared data sets in order to determine the dust particle physical characteristics and distributions by using infrared models and inversion techniques. This technique may also be employed for investigation of the rotational bursting phenomena whereby large size cosmic and interplanetary particles are believed to fragment into smaller dust particles.
Generalized adjustment by least squares ( GALS).
Elassal, A.A.
1983-01-01
The least-squares principle is universally accepted as the basis for adjustment procedures in the allied fields of geodesy, photogrammetry and surveying. A prototype software package for Generalized Adjustment by Least Squares (GALS) is described. The package is designed to perform all least-squares-related functions in a typical adjustment program. GALS is capable of supporting development of adjustment programs of any size or degree of complexity. -Author
Use of Multiple Correlation Analysis and Multiple Regression Analysis.
ERIC Educational Resources Information Center
Huberty, Carl J.; Petoskey, Martha D.
1999-01-01
Distinguishes between multiple correlation and multiple regression analysis. Illustrates suggested information reporting methods and reviews the use of regression methods when dealing with problems of missing data. (SK)
49 CFR 393.53 - Automatic brake adjusters and brake adjustment indicators.
Code of Federal Regulations, 2013 CFR
2013-10-01
... brake adjustment indicators. (a) Automatic brake adjusters (hydraulic brake systems). Each commercial motor vehicle manufactured on or after October 20, 1993, and equipped with a hydraulic brake...
A note on the use of multiple linear regression in molecular ecology.
Frasier, Timothy R
2016-03-01
Multiple linear regression analyses (also often referred to as generalized linear models--GLMs, or generalized linear mixed models--GLMMs) are widely used in the analysis of data in molecular ecology, often to assess the relative effects of genetic characteristics on individual fitness or traits, or how environmental characteristics influence patterns of genetic differentiation. However, the coefficients resulting from multiple regression analyses are sometimes misinterpreted, which can lead to incorrect interpretations and conclusions within individual studies, and can propagate to wider-spread errors in the general understanding of a topic. The primary issue revolves around the interpretation of coefficients for independent variables when interaction terms are also included in the analyses. In this scenario, the coefficients associated with each independent variable are often interpreted as the independent effect of each predictor variable on the predicted variable. However, this interpretation is incorrect. The correct interpretation is that these coefficients represent the effect of each predictor variable on the predicted variable when all other predictor variables are zero. This difference may sound subtle, but the ramifications cannot be overstated. Here, my goals are to raise awareness of this issue, to demonstrate and emphasize the problems that can result and to provide alternative approaches for obtaining the desired information.
Lasso adjustments of treatment effect estimates in randomized experiments
Bloniarz, Adam; Liu, Hanzhong; Zhang, Cun-Hui; Sekhon, Jasjeet S.; Yu, Bin
2016-01-01
We provide a principled way for investigators to analyze randomized experiments when the number of covariates is large. Investigators often use linear multivariate regression to analyze randomized experiments instead of simply reporting the difference of means between treatment and control groups. Their aim is to reduce the variance of the estimated treatment effect by adjusting for covariates. If there are a large number of covariates relative to the number of observations, regression may perform poorly because of overfitting. In such cases, the least absolute shrinkage and selection operator (Lasso) may be helpful. We study the resulting Lasso-based treatment effect estimator under the Neyman–Rubin model of randomized experiments. We present theoretical conditions that guarantee that the estimator is more efficient than the simple difference-of-means estimator, and we provide a conservative estimator of the asymptotic variance, which can yield tighter confidence intervals than the difference-of-means estimator. Simulation and data examples show that Lasso-based adjustment can be advantageous even when the number of covariates is less than the number of observations. Specifically, a variant using Lasso for selection and ordinary least squares (OLS) for estimation performs particularly well, and it chooses a smoothing parameter based on combined performance of Lasso and OLS. PMID:27382153
Path-counting formulas for generalized kinship coefficients and condensed identity coefficients.
Cheng, En; Ozsoyoglu, Z Meral
2014-01-01
An important computation on pedigree data is the calculation of condensed identity coefficients, which provide a complete description of the degree of relatedness of two individuals. The applications of condensed identity coefficients range from genetic counseling to disease tracking. Condensed identity coefficients can be computed using linear combinations of generalized kinship coefficients for two, three, four individuals, and two pairs of individuals and there are recursive formulas for computing those generalized kinship coefficients (Karigl, 1981). Path-counting formulas have been proposed for the (generalized) kinship coefficients for two (three) individuals but there have been no path-counting formulas for the other generalized kinship coefficients. It has also been shown that the computation of the (generalized) kinship coefficients for two (three) individuals using path-counting formulas is efficient for large pedigrees, together with path encoding schemes tailored for pedigree graphs. In this paper, we propose a framework for deriving path-counting formulas for generalized kinship coefficients. Then, we present the path-counting formulas for all generalized kinship coefficients for which there are recursive formulas and which are sufficient for computing condensed identity coefficients. We also perform experiments to compare the efficiency of our method with the recursive method for computing condensed identity coefficients on large pedigrees.
ERIC Educational Resources Information Center
Green, Samuel B.; Yang, Yanyun
2015-01-01
In the lead article, Davenport, Davison, Liou, & Love demonstrate the relationship among homogeneity, internal consistency, and coefficient alpha, and also distinguish among them. These distinctions are important because too often coefficient alpha--a reliability coefficient--is interpreted as an index of homogeneity or internal consistency.…
A locally adaptive kernel regression method for facies delineation
NASA Astrophysics Data System (ADS)
Fernàndez-Garcia, D.; Barahona-Palomo, M.; Henri, C. V.; Sanchez-Vila, X.
2015-12-01
Facies delineation is defined as the separation of geological units with distinct intrinsic characteristics (grain size, hydraulic conductivity, mineralogical composition). A major challenge in this area stems from the fact that only a few scattered pieces of hydrogeological information are available to delineate geological facies. Several methods to delineate facies are available in the literature, ranging from those based only on existing hard data, to those including secondary data or external knowledge about sedimentological patterns. This paper describes a methodology to use kernel regression methods as an effective tool for facies delineation. The method uses both the spatial and the actual sampled values to produce, for each individual hard data point, a locally adaptive steering kernel function, self-adjusting the principal directions of the local anisotropic kernels to the direction of highest local spatial correlation. The method is shown to outperform the nearest neighbor classification method in a number of synthetic aquifers whenever the available number of hard data is small and randomly distributed in space. In the case of exhaustive sampling, the steering kernel regression method converges to the true solution. Simulations ran in a suite of synthetic examples are used to explore the selection of kernel parameters in typical field settings. It is shown that, in practice, a rule of thumb can be used to obtain suboptimal results. The performance of the method is demonstrated to significantly improve when external information regarding facies proportions is incorporated. Remarkably, the method allows for a reasonable reconstruction of the facies connectivity patterns, shown in terms of breakthrough curves performance.