Yamazaki, Takeshi; Takeda, Hisato; Hagiya, Koichi; Yamaguchi, Satoshi; Sasaki, Osamu
2018-03-13
Because lactation periods in dairy cows lengthen with increasing total milk production, it is important to predict individual productivities after 305 days in milk (DIM) to determine the optimal lactation period. We therefore examined whether the random regression (RR) coefficients from 306 to 450 DIM (M2) can be predicted from those during the first 305 DIM (M1) by using a random regression model. We analyzed test-day milk records from 85,690 Holstein cows in their first lactations and 131,727 cows in their later (second to fifth) lactations. Data in M1 and M2 were analyzed separately by using different single-trait RR animal models. We then performed a multiple regression analysis of the RR coefficients of M2 on those of M1 during the first and later lactations. First-order Legendre polynomials were practical covariates of random regression for the milk yields of M2. All RR coefficients for the additive genetic (AG) effect and the intercept for the permanent environmental (PE) effect of M2 had moderate to strong correlations with the intercept for the AG effect of M1. The coefficients of determination for multiple regression of the combined intercepts for the AG and PE effects of M2 on the coefficients for the AG effect of M1 were moderate to high. The daily milk yields of M2 predicted by using the RR coefficients for the AG effect of M1 were highly correlated with those obtained by using the coefficients of M2. Milk production after 305 DIM can be predicted by using the RR coefficient estimates of the AG effect during the first 305 DIM.
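A minimal sketch of the two-step workflow the abstract describes, on simulated arrays (cow counts, coefficient orders, and effect sizes are invented, not the study's estimates): multiple regression of M2 random-regression coefficients on M1 additive genetic coefficients, then reconstruction of 306-450 DIM curves from first-order Legendre covariates.

```python
# Hypothetical sketch: regress extended-lactation (M2) random-regression
# coefficients on first-305-DIM (M1) additive genetic coefficients.
import numpy as np
from numpy.polynomial import legendre
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_cows = 500

# M1 additive genetic RR coefficients (intercept + two higher-order terms)
m1_ag = rng.normal(size=(n_cows, 3))
# M2 coefficients (intercept + slope of a first-order Legendre polynomial),
# simulated here as correlated with the M1 intercept
m2 = np.column_stack([0.8 * m1_ag[:, 0] + rng.normal(scale=0.4, size=n_cows),
                      0.3 * m1_ag[:, 0] + rng.normal(scale=0.6, size=n_cows)])

# Multiple regression of each M2 coefficient on the M1 coefficients
fit = LinearRegression().fit(m1_ag, m2)
print("R^2 of M2 coefficients on M1 AG coefficients:", fit.score(m1_ag, m2))

# Rebuild daily yield deviation curves for 306-450 DIM from the predicted
# M2 coefficients, using first-order Legendre covariates on standardized DIM
dim = np.arange(306, 451)
t = 2 * (dim - 306) / (450 - 306) - 1          # scale DIM to [-1, 1]
basis = legendre.legvander(t, 1)               # columns: P0, P1
pred_curves = fit.predict(m1_ag) @ basis.T     # shape: n_cows x n_days
print(pred_curves.shape)
```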
Biases and Standard Errors of Standardized Regression Coefficients
ERIC Educational Resources Information Center
Yuan, Ke-Hai; Chan, Wai
2011-01-01
The paper obtains consistent standard errors (SE) and biases of order O(1/n) for the sample standardized regression coefficients with both random and given predictors. Analytical results indicate that the formulas for SEs given in popular text books are consistent only when the population value of the regression coefficient is zero. The sample…
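A simple way to see the issue analyzed in this abstract is to compare the OLS standard errors computed on standardized variables (the usual textbook shortcut) with case-resampling bootstrap standard errors that treat the predictors as random. The sketch below uses simulated data and is illustrative only.

```python
# Minimal sketch (hypothetical data): naive textbook SEs of standardized
# regression coefficients versus bootstrap SEs with random predictors.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
x = rng.normal(size=(n, 2))
y = 0.5 * x[:, 0] + 0.2 * x[:, 1] + rng.normal(size=n)

def std_coefs(x, y):
    xz = (x - x.mean(0)) / x.std(0)
    yz = (y - y.mean()) / y.std()
    return sm.OLS(yz, sm.add_constant(xz)).fit()

fit = std_coefs(x, y)
print("standardized betas:", fit.params[1:])
print("textbook (OLS) SEs:", fit.bse[1:])

# Bootstrap SE: resample cases, re-standardize within each resample
boot = np.array([std_coefs(x[idx], y[idx]).params[1:]
                 for idx in rng.integers(0, n, size=(2000, n))])
print("bootstrap SEs:     ", boot.std(0))
```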
Neither fixed nor random: weighted least squares meta-regression.
Stanley, T D; Doucouliagos, Hristos
2017-03-01
Our study revisits and challenges two core conventional meta-regression estimators: the prevalent use of 'mixed-effects' or random-effects meta-regression analysis, and the correction of standard errors that defines fixed-effects meta-regression analysis (FE-MRA). We show how and explain why an unrestricted weighted least squares MRA (WLS-MRA) estimator is superior to conventional random-effects (or mixed-effects) meta-regression when there is publication (or small-sample) bias, is as good as FE-MRA in all cases, and is better than fixed effects in most practical applications. Simulations and statistical theory show that WLS-MRA provides satisfactory estimates of meta-regression coefficients that are practically equivalent to mixed effects or random effects when there is no publication bias. When there is publication selection bias, WLS-MRA always has smaller bias than mixed effects or random effects. In practical applications, an unrestricted WLS meta-regression is likely to give practically equivalent or superior estimates to fixed-effects, random-effects, and mixed-effects meta-regression approaches. However, random-effects meta-regression remains viable and perhaps somewhat preferable if selection for statistical significance (publication bias) can be ruled out and when random, additive normal heterogeneity is known to directly affect the 'true' regression coefficient. Copyright © 2016 John Wiley & Sons, Ltd.
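A hedged sketch of an unrestricted WLS meta-regression on simulated studies (not the authors' code): effect sizes are regressed on a moderator with inverse-variance weights, and the multiplicative dispersion is estimated rather than fixed at 1 as in FE-MRA.

```python
# Unrestricted WLS meta-regression sketch on simulated study-level data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
k = 40                                    # number of studies
se = rng.uniform(0.05, 0.4, size=k)       # study standard errors
moderator = rng.normal(size=k)
effect = (0.2 + 0.1 * moderator
          + rng.normal(scale=se)          # sampling error
          + rng.normal(scale=0.1, size=k))  # extra heterogeneity

X = sm.add_constant(moderator)
wls = sm.WLS(effect, X, weights=1.0 / se**2).fit()
print(wls.params)   # meta-regression coefficients
print(wls.scale)    # unrestricted dispersion; FE-MRA would fix this at 1
```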
Boligon, A A; Baldi, F; Mercadante, M E Z; Lobo, R B; Pereira, R J; Albuquerque, L G
2011-06-28
We quantified the potential increase in accuracy of expected breeding value for weights of Nelore cattle, from birth to mature age, using multi-trait and random regression models on Legendre polynomials and B-spline functions. A total of 87,712 weight records from 8144 females were used, recorded every three months from birth to mature age from the Nelore Brazil Program. For random regression analyses, all female weight records from birth to eight years of age (data set I) were considered. From this general data set, a subset was created (data set II), which included only nine weight records: at birth, weaning, 365 and 550 days of age, and 2, 3, 4, 5, and 6 years of age. Data set II was analyzed using random regression and multi-trait models. The model of analysis included the contemporary group as a fixed effect and age of dam as a linear and quadratic covariable. In the random regression analyses, average growth trends were modeled using a cubic regression on orthogonal polynomials of age. Residual variances were modeled by a step function with five classes. Legendre polynomials of fourth and sixth order were utilized to model the direct genetic and animal permanent environmental effects, respectively, while third-order Legendre polynomials were considered for maternal genetic and maternal permanent environmental effects. Quadratic polynomials were applied to model all random effects in random regression models on B-spline functions. Direct genetic and animal permanent environmental effects were modeled using three segments or five coefficients, and genetic maternal and maternal permanent environmental effects were modeled with one segment or three coefficients in the random regression models on B-spline functions. For both data sets (I and II), animals ranked differently according to expected breeding value obtained by random regression or multi-trait models. With random regression models, the highest gains in accuracy were obtained at ages with a low number of weight records. The results indicate that random regression models provide more accurate expected breeding values than the traditional finite multi-trait models. Thus, higher genetic responses are expected for beef cattle growth traits by replacing a multi-trait model with random regression models for genetic evaluation. B-spline functions could be applied as an alternative to Legendre polynomials to model covariance functions for weights from birth to mature age.
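For readers unfamiliar with random regression, the sketch below fits a random intercept and slope on age per animal with statsmodels MixedLM on simulated longitudinal weights; it is a simplified analogue, not the study's REML animal model with Legendre or B-spline covariance functions.

```python
# Illustrative random-regression fit (random intercept + slope per animal).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n_animals, n_rec = 200, 8
animal = np.repeat(np.arange(n_animals), n_rec)
age = np.tile(np.linspace(0, 1, n_rec), n_animals)      # standardized age
u0 = rng.normal(scale=20, size=n_animals)               # animal intercepts
u1 = rng.normal(scale=10, size=n_animals)               # animal slopes
weight = (150 + 300 * age + u0[animal] + u1[animal] * age
          + rng.normal(scale=15, size=animal.size))
df = pd.DataFrame({"animal": animal, "age": age, "weight": weight})

model = smf.mixedlm("weight ~ age", df, groups=df["animal"], re_formula="~age")
result = model.fit()
print(result.cov_re)   # estimated (co)variances of random intercept and slope
```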
Analyzing degradation data with a random effects spline regression model
Fugate, Michael Lynn; Hamada, Michael Scott; Weaver, Brian Phillip
2017-03-17
This study proposes using a random effects spline regression model to analyze degradation data. Spline regression avoids having to specify a parametric function for the true degradation of an item. A distribution for the spline regression coefficients captures the variation of the true degradation curves from item to item. We illustrate the proposed methodology with a real example using a Bayesian approach. The Bayesian approach allows prediction of degradation of a population over time and estimation of reliability is easy to perform.
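A rough two-stage stand-in for the idea (the paper uses a Bayesian hierarchical model rather than per-item fits): fit a B-spline regression to each item's degradation record and summarize how the spline coefficients vary from item to item. Data, basis size, and noise levels below are hypothetical.

```python
# Two-stage approximation: per-item spline fits, then coefficient spread.
import numpy as np
from patsy import dmatrix

rng = np.random.default_rng(4)
t = np.linspace(0.0, 10.0, 25)                          # inspection times
basis = np.asarray(dmatrix("bs(t, df=6)", {"t": t}))    # intercept + 6 B-spline terms

coefs = []
for item in range(30):
    item_shift = rng.normal(scale=0.3)                  # item-to-item variation
    y = 0.5 * t + 0.2 * np.sin(t) + item_shift + rng.normal(scale=0.2, size=t.size)
    b, *_ = np.linalg.lstsq(basis, y, rcond=None)       # spline fit for this item
    coefs.append(b)

coefs = np.array(coefs)
print("mean spline coefficients:   ", np.round(coefs.mean(0), 3))
print("item-to-item coefficient SD:", np.round(coefs.std(0), 3))
```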
The Bayesian group lasso for confounded spatial data
Hefley, Trevor J.; Hooten, Mevin B.; Hanks, Ephraim M.; Russell, Robin E.; Walsh, Daniel P.
2017-01-01
Generalized linear mixed models for spatial processes are widely used in applied statistics. In many applications of the spatial generalized linear mixed model (SGLMM), the goal is to obtain inference about regression coefficients while achieving optimal predictive ability. When implementing the SGLMM, multicollinearity among covariates and the spatial random effects can make computation challenging and influence inference. We present a Bayesian group lasso prior with a single tuning parameter that can be chosen to optimize predictive ability of the SGLMM and jointly regularize the regression coefficients and spatial random effect. We implement the group lasso SGLMM using efficient Markov chain Monte Carlo (MCMC) algorithms and demonstrate how multicollinearity among covariates and the spatial random effect can be monitored as a derived quantity. To test our method, we compared several parameterizations of the SGLMM using simulated data and two examples from plant ecology and disease ecology. In all examples, problematic levels of multicollinearity occurred and influenced sampling efficiency and inference. We found that the group lasso prior resulted in roughly twice the effective sample size for MCMC samples of regression coefficients and can have higher and less variable predictive accuracy based on out-of-sample data when compared to the standard SGLMM.
Genetic parameters for stayability to consecutive calvings in Zebu cattle.
Silva, D O; Santana, M L; Ayres, D R; Menezes, G R O; Silva, L O C; Nobre, P R C; Pereira, R J
2017-12-22
Longer-lived cows tend to be more profitable and the stayability trait is a selection criterion correlated to longevity. An alternative to the traditional approach to evaluate stayability is its definition based on consecutive calvings, whose main advantage is the more accurate evaluation of young bulls. However, no study using this alternative approach has been conducted for Zebu breeds. Therefore, the objective of this study was to compare linear random regression models to fit stayability to consecutive calvings of Guzerá, Nelore and Tabapuã cows and to estimate genetic parameters for this trait in the respective breeds. Data up to the eighth calving were used. The models included the fixed effects of age at first calving and year-season of birth of the cow and the random effects of contemporary group, additive genetic, permanent environmental and residual. Random regressions were modeled by orthogonal Legendre polynomials of order 1 to 4 (2 to 5 coefficients) for contemporary group, additive genetic and permanent environmental effects. Using Deviance Information Criterion as the selection criterion, the model with 4 regression coefficients for each effect was the most adequate for the Nelore and Tabapuã breeds and the model with 5 coefficients is recommended for the Guzerá breed. For Guzerá, heritabilities ranged from 0.05 to 0.08, showing a quadratic trend with a peak between the fourth and sixth calving. For the Nelore and Tabapuã breeds, the estimates ranged from 0.03 to 0.07 and from 0.03 to 0.08, respectively, and increased with increasing calving number. The additive genetic correlations exhibited a similar trend among breeds and were higher for stayability between closer calvings. Even between more distant calvings (second v. eighth), stayability showed a moderate to high genetic correlation, which was 0.77, 0.57 and 0.79 for the Guzerá, Nelore and Tabapuã breeds, respectively. For Guzerá, when the models with 4 or 5 regression coefficients were compared, the rank correlations between predicted breeding values for the intercept were always higher than 0.99, indicating the possibility of practical application of the least parameterized model. In conclusion, the model with 4 random regression coefficients is recommended for the genetic evaluation of stayability to consecutive calvings in Zebu cattle.
Random effects coefficient of determination for mixed and meta-analysis models
Demidenko, Eugene; Sargent, James; Onega, Tracy
2011-01-01
The key feature of a mixed model is the presence of random effects. We have developed a coefficient, called the random effects coefficient of determination, Rr2, that estimates the proportion of the conditional variance of the dependent variable explained by random effects. This coefficient takes values from 0 to 1 and indicates how strong the random effects are. The difference from the earlier suggested fixed effects coefficient of determination is emphasized. If Rr2 is close to 0, there is weak support for random effects in the model because the reduction of the variance of the dependent variable due to random effects is small; consequently, random effects may be ignored and the model simplifies to standard linear regression. A value of Rr2 away from 0 indicates evidence of variance reduction in support of the mixed model. If Rr2 is close to 1, the variance of the random effects is very large and the random effects turn into free fixed effects; the model can then be estimated using the dummy variable approach. We derive explicit formulas for Rr2 in three special cases: the random intercept model, the growth curve model, and the meta-analysis model. Theoretical results are illustrated with three mixed model examples: (1) travel time to the nearest cancer center for women with breast cancer in the U.S., (2) cumulative time watching alcohol related scenes in movies among young U.S. teens, as a risk factor for early drinking onset, and (3) the classic example of the meta-analysis model for combination of 13 studies on tuberculosis vaccine. PMID:23750070
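The exact Rr2 formulas are derived in the paper; the sketch below only fits a random-intercept model to simulated clustered data and forms the intuitive variance ratio var(random effect) / (var(random effect) + residual variance), the ingredient such a coefficient is built from in the random intercept case.

```python
# Random-intercept variance components and their ratio (illustrative only).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
groups = np.repeat(np.arange(50), 10)
u = rng.normal(scale=2.0, size=50)                  # group random effects
y = 1.0 + u[groups] + rng.normal(scale=1.0, size=groups.size)
df = pd.DataFrame({"g": groups, "y": y})

res = smf.mixedlm("y ~ 1", df, groups=df["g"]).fit()
var_u = res.cov_re.iloc[0, 0]                       # random-effect variance
var_e = res.scale                                   # residual variance
print("variance ratio:", var_u / (var_u + var_e))   # close to 4 / (4 + 1) = 0.8
```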
Revisiting crash spatial heterogeneity: A Bayesian spatially varying coefficients approach.
Xu, Pengpeng; Huang, Helai; Dong, Ni; Wong, S C
2017-01-01
This study was performed to investigate the spatially varying relationships between crash frequency and related risk factors. A Bayesian spatially varying coefficients model was elaborately introduced as a methodological alternative to simultaneously account for the unstructured and spatially structured heterogeneity of the regression coefficients in predicting crash frequencies. The proposed method was appealing in that the parameters were modeled via a conditional autoregressive prior distribution, which involved a single set of random effects and a spatial correlation parameter with extreme values corresponding to pure unstructured or pure spatially correlated random effects. A case study using a three-year crash dataset from Hillsborough County, Florida, was conducted to illustrate the proposed model. Empirical analysis confirmed the presence of both unstructured and spatially correlated variations in the effects of contributory factors on severe crash occurrences. The findings also suggested that ignoring spatially structured heterogeneity may result in biased parameter estimates and incorrect inferences, while assuming the regression coefficients to be spatially clustered only is probably subject to the issue of over-smoothness. Copyright © 2016 Elsevier Ltd. All rights reserved.
Rovadoscki, Gregori A; Petrini, Juliana; Ramirez-Diaz, Johanna; Pertile, Simone F N; Pertille, Fábio; Salvian, Mayara; Iung, Laiza H S; Rodriguez, Mary Ana P; Zampar, Aline; Gaya, Leila G; Carvalho, Rachel S B; Coelho, Antonio A D; Savino, Vicente J M; Coutinho, Luiz L; Mourão, Gerson B
2016-09-01
Repeated measures from the same individual have been analyzed by using repeatability and finite dimension models under univariate or multivariate analyses. However, in the last decade, the use of random regression models for genetic studies with longitudinal data has become more common. Thus, the aim of this research was to estimate genetic parameters for body weight of four experimental chicken lines by using univariate random regression models. Body weight data from hatching to 84 days of age (n = 34,730) from four experimental free-range chicken lines (7P, Caipirão da ESALQ, Caipirinha da ESALQ and Carijó Barbado) were used. The analysis model included the fixed effects of contemporary group (gender and rearing system), fixed regression coefficients for age at measurement, and random regression coefficients for permanent environmental effects and additive genetic effects. Heterogeneous variances for residual effects were considered, and one residual variance was assigned for each of six subclasses of age at measurement. Random regression curves were modeled by using Legendre polynomials of the second and third orders, with the best model chosen based on the Akaike Information Criterion, Bayesian Information Criterion, and restricted maximum likelihood. Multivariate analyses under the same animal mixed model were also performed for the validation of the random regression models. The Legendre polynomials of second order were better for describing the growth curves of the lines studied. Moderate to high heritabilities (h2 = 0.15 to 0.98) were estimated for body weight between one and 84 days of age, suggesting that body weight at all ages can be used as a selection criterion. Genetic correlations among body weight records obtained through multivariate analyses ranged from 0.18 to 0.96, 0.12 to 0.89, 0.06 to 0.96, and 0.28 to 0.96 in 7P, Caipirão da ESALQ, Caipirinha da ESALQ, and Carijó Barbado chicken lines, respectively. Results indicate that genetic gain for body weight can be achieved by selection. Also, selection for body weight at 42 days of age can be maintained as a selection criterion. © 2016 Poultry Science Association Inc.
NASA Technical Reports Server (NTRS)
Tomberlin, T. J.
1985-01-01
Research studies of residents' responses to noise consist of interviews with samples of individuals who are drawn from a number of different compact study areas. The statistical techniques developed here provide a basis for the sample design decisions in such studies. These techniques are suitable for a wide range of sample survey applications. A sample may consist of a random sample of residents selected from a sample of compact study areas, or in a more complex design, of a sample of residents selected from a sample of larger areas (e.g., cities). The techniques may be applied to estimates of the effects on annoyance of noise level, numbers of noise events, the time-of-day of the events, ambient noise levels, or other factors. Methods are provided for determining, in advance, how accurately these effects can be estimated for different sample sizes and study designs. Using a simple cost function, they also provide for optimum allocation of the sample across the stages of the design for estimating these effects. These techniques are developed via a regression model in which the regression coefficients are assumed to be random, with components of variance associated with the various stages of a multi-stage sample design.
[How to fit and interpret multilevel models using SPSS].
Pardo, Antonio; Ruiz, Miguel A; San Martín, Rafael
2007-05-01
Hierarchical or multilevel models are used to analyse data when cases belong to known groups and sample units are selected both from the individual level and from the group level. In this work, the multilevel models most commonly discussed in the statistical literature are described, explaining how to fit these models using the SPSS program (version 11 or later) and how to interpret the outcomes of the analysis. Five particular models are described, fitted, and interpreted: (1) one-way analysis of variance with random effects, (2) regression analysis with means-as-outcomes, (3) one-way analysis of covariance with random effects, (4) regression analysis with random coefficients, and (5) regression analysis with means- and slopes-as-outcomes. All models are explained, trying to make them understandable to researchers in health and behaviour sciences.
Testing a single regression coefficient in high dimensional linear models.
Lan, Wei; Zhong, Ping-Shou; Li, Runze; Wang, Hansheng; Tsai, Chih-Ling
2016-11-01
In linear regression models with high dimensional data, the classical z-test (or t-test) for testing the significance of each single regression coefficient is no longer applicable. This is mainly because the number of covariates exceeds the sample size. In this paper, we propose a simple and novel alternative by introducing the Correlated Predictors Screening (CPS) method to control for predictors that are highly correlated with the target covariate. Accordingly, the classical ordinary least squares approach can be employed to estimate the regression coefficient associated with the target covariate. In addition, we demonstrate that the resulting estimator is consistent and asymptotically normal even if the random errors are heteroscedastic. This enables us to apply the z-test to assess the significance of each covariate. Based on the p-value obtained from testing the significance of each covariate, we further conduct multiple hypothesis testing by controlling the false discovery rate at the nominal level. Then, we show that the multiple hypothesis testing achieves consistent model selection. Simulation studies and empirical examples are presented to illustrate the finite sample performance and the usefulness of the proposed method, respectively.
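A hedged sketch of the screening-then-OLS idea on simulated high-dimensional data (the authors' exact CPS screening rule and test may differ): keep the few predictors most correlated with the target covariate as controls, fit OLS, and use a heteroscedasticity-robust z-test for the target coefficient.

```python
# Screening + OLS sketch for testing one coefficient when p > n.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n, p = 100, 500
X = rng.normal(size=(n, p))
X[:, 1] = 0.7 * X[:, 0] + 0.3 * rng.normal(size=n)     # correlated with the target
y = 1.0 * X[:, 0] + rng.normal(size=n) * (1 + 0.5 * np.abs(X[:, 0]))

target = 0
others = np.delete(np.arange(p), target)
corr = np.abs([np.corrcoef(X[:, target], X[:, j])[0, 1] for j in others])
controls = others[np.argsort(corr)[::-1][:5]]          # top-5 screened controls

design = sm.add_constant(X[:, [target, *controls]])
fit = sm.OLS(y, design).fit(cov_type="HC3")            # robust to heteroscedastic errors
print("z =", fit.params[1] / fit.bse[1], " p =", fit.pvalues[1])
```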
Smooth Scalar-on-Image Regression via Spatial Bayesian Variable Selection
Goldsmith, Jeff; Huang, Lei; Crainiceanu, Ciprian M.
2013-01-01
We develop scalar-on-image regression models when images are registered multidimensional manifolds. We propose a fast and scalable Bayes inferential procedure to estimate the image coefficient. The central idea is the combination of an Ising prior distribution, which controls a latent binary indicator map, and an intrinsic Gaussian Markov random field, which controls the smoothness of the nonzero coefficients. The model is fit using a single-site Gibbs sampler, which allows fitting within minutes for hundreds of subjects with predictor images containing thousands of locations. The code is simple and is provided in less than one page in the Appendix. We apply this method to a neuroimaging study where cognitive outcomes are regressed on measures of white matter microstructure at every voxel of the corpus callosum for hundreds of subjects. PMID:24729670
To, Minh-Son; Prakash, Shivesh; Poonnoose, Santosh I; Bihari, Shailesh
2018-05-01
The study uses meta-regression analysis to quantify the dose-dependent effects of statin pharmacotherapy on vasospasm, delayed ischemic neurologic deficits (DIND), and mortality in aneurysmal subarachnoid hemorrhage. Prospective and retrospective observational studies and randomized controlled trials (RCTs) were retrieved by a systematic database search. Summary estimates were expressed as absolute risk (AR) for a given statin dose or control (placebo). Meta-regression using inverse variance weighting and robust variance estimation was performed to assess the effect of statin dose on transformed AR in a random effects model. Dose-dependence of predicted AR with 95% confidence interval (CI) was recovered by using Miller's Freeman-Tukey inverse. The database search and study selection criteria yielded 18 studies (2594 patients) for analysis. These included 12 RCTs, 4 retrospective observational studies, and 2 prospective observational studies. Twelve studies investigated simvastatin, whereas the remaining studies investigated atorvastatin, pravastatin, or pitavastatin, with simvastatin-equivalent doses ranging from 20 to 80 mg. Meta-regression revealed dose-dependent reductions in Freeman-Tukey-transformed AR of vasospasm (slope coefficient -0.00404, 95% CI -0.00720 to -0.00087; P = 0.0321), DIND (slope coefficient -0.00316, 95% CI -0.00586 to -0.00047; P = 0.0392), and mortality (slope coefficient -0.00345, 95% CI -0.00623 to -0.00067; P = 0.0352). The present meta-regression provides weak evidence for dose-dependent reductions in vasospasm, DIND and mortality associated with acute statin use after aneurysmal subarachnoid hemorrhage. However, the analysis was limited by substantial heterogeneity among individual studies. Higher dosing strategies are a potential consideration for future RCTs. Copyright © 2018 Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Choi, Kilchan; Seltzer, Michael
2010-01-01
In studies of change in education and numerous other fields, interest often centers on how differences in the status of individuals at the start of a period of substantive interest relate to differences in subsequent change. In this article, the authors present a fully Bayesian approach to estimating three-level Hierarchical Models in which latent…
On marker-based parentage verification via non-linear optimization.
Boerner, Vinzent
2017-06-15
Parentage verification by molecular markers is mainly based on short tandem repeat markers. Single nucleotide polymorphisms (SNPs) as bi-allelic markers have become the markers of choice for genotyping projects. Thus, the subsequent step is to use SNP genotypes for parentage verification as well. Recent developments of algorithms such as evaluating opposing homozygous SNP genotypes have drawbacks, for example the inability of rejecting all animals of a sample of potential parents. This paper describes an algorithm for parentage verification by constrained regression which overcomes the latter limitation and proves to be very fast and accurate even when the number of SNPs is as low as 50. The algorithm was tested on a sample of 14,816 animals with 50, 100 and 500 SNP genotypes randomly selected from 40k genotypes. The samples of putative parents of these animals contained either five random animals, or four random animals and the true sire. Parentage assignment was performed by ranking of regression coefficients, or by setting a minimum threshold for regression coefficients. The assignment quality was evaluated by the power of assignment (PA) and the power of exclusion (PE). If the sample of putative parents contained the true sire and parentage was assigned by coefficient ranking, PA and PE were both higher than 0.99 for the 500 and 100 SNP genotypes, and higher than 0.98 for the 50 SNP genotypes. When parentage was assigned by a coefficient threshold, PE was higher than 0.99 regardless of the number of SNPs, but PA decreased from 0.99 (500 SNPs) to 0.97 (100 SNPs) and 0.92 (50 SNPs). If the sample of putative parents did not contain the true sire and parentage was rejected using a coefficient threshold, the algorithm achieved a PE of 1 (500 SNPs), 0.99 (100 SNPs) and 0.97 (50 SNPs). The algorithm described here is easy to implement, fast and accurate, and is able to assign parentage using genomic marker data with a size as low as 50 SNPs.
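A toy version of the constrained-regression idea (the paper's exact optimization, genotype coding, and thresholds may differ): regress the offspring's SNP vector on candidate-parent genotypes with non-negative coefficients and rank candidates by their coefficients.

```python
# Toy parentage check via non-negative least squares on SNP genotypes.
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(7)
n_snp, n_cand = 100, 5
candidates = rng.integers(0, 3, size=(n_snp, n_cand)).astype(float)  # 0/1/2 genotypes

# Offspring inherits half its dosage from candidate 2 plus a random dam
dam = rng.integers(0, 3, size=n_snp)
offspring = (candidates[:, 2] + dam) / 2.0 + rng.normal(scale=0.2, size=n_snp)

coef, _ = nnls(candidates, offspring)
print("regression coefficients per candidate:", np.round(coef, 3))
print("best-ranked candidate:", int(np.argmax(coef)))   # expected: 2
```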
Analysis of a Split-Plot Experimental Design Applied to a Low-Speed Wind Tunnel Investigation
NASA Technical Reports Server (NTRS)
Erickson, Gary E.
2013-01-01
A procedure to analyze a split-plot experimental design featuring two input factors, two levels of randomization, and two error structures in a low-speed wind tunnel investigation of a small-scale model of a fighter airplane configuration is described in this report. Standard commercially-available statistical software was used to analyze the test results obtained in a randomization-restricted environment often encountered in wind tunnel testing. The input factors were differential horizontal stabilizer incidence and the angle of attack. The response variables were the aerodynamic coefficients of lift, drag, and pitching moment. Using split-plot terminology, the whole plot, or difficult-to-change, factor was the differential horizontal stabilizer incidence, and the subplot, or easy-to-change, factor was the angle of attack. The whole plot and subplot factors were both tested at three levels. Degrees of freedom for the whole plot error were provided by replication in the form of three blocks, or replicates, which were intended to simulate three consecutive days of wind tunnel facility operation. The analysis was conducted in three stages, which yielded the estimated mean squares, multiple regression function coefficients, and corresponding tests of significance for all individual terms at the whole plot and subplot levels for the three aerodynamic response variables. The estimated regression functions included main effects and two-factor interaction for the lift coefficient, main effects, two-factor interaction, and quadratic effects for the drag coefficient, and only main effects for the pitching moment coefficient.
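A rough mixed-model approximation of the split-plot error structure described above (a sketch with invented numbers, not the report's staged ANOVA): the whole-plot error appears as a random block-by-stabilizer variance component and the subplot error as the residual.

```python
# Split-plot-style analysis sketch with a variance component for whole plots.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(8)
blocks, stab_levels, aoa_levels = 3, 3, 3
rows = [(b, s, a) for b in range(blocks)
        for s in range(stab_levels) for a in range(aoa_levels)]
df = pd.DataFrame(rows, columns=["block", "stab", "aoa"])
wp_err = rng.normal(scale=0.02, size=(blocks, stab_levels))    # whole-plot error
df["CL"] = (0.10 * df["aoa"] + 0.02 * df["stab"]               # invented lift response
            + wp_err[df["block"], df["stab"]]
            + rng.normal(scale=0.01, size=len(df)))            # subplot (residual) error

# Whole-plot factor: stabilizer incidence; subplot factor: angle of attack.
# The block-by-stabilizer variance component plays the role of the whole-plot error.
m = smf.mixedlm("CL ~ C(stab) + C(aoa)", df, groups=df["block"],
                vc_formula={"wholeplot": "0 + C(stab)"})
print(m.fit().summary())
```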
Prediction models for clustered data: comparison of a random intercept and standard regression model
Bouwmeester, Walter; Twisk, Jos W R; Kappen, Teus H; van Klei, Wilton A; Moons, Karel G M; Vergouwe, Yvonne
2013-01-01
Background When study data are clustered, standard regression analysis is considered inappropriate and analytical techniques for clustered data need to be used. For prediction research in which the interest of predictor effects is on the patient level, random effect regression models are probably preferred over standard regression analysis. It is well known that the random effect parameter estimates and the standard logistic regression parameter estimates are different. Here, we compared random effect and standard logistic regression models for their ability to provide accurate predictions. Methods Using an empirical study on 1642 surgical patients at risk of postoperative nausea and vomiting, who were treated by one of 19 anesthesiologists (clusters), we developed prognostic models either with standard or random intercept logistic regression. External validity of these models was assessed in new patients from other anesthesiologists. We supported our results with simulation studies using intra-class correlation coefficients (ICC) of 5%, 15%, or 30%. Standard performance measures and measures adapted for the clustered data structure were estimated. Results The model developed with random effect analysis showed better discrimination than the standard approach, if the cluster effects were used for risk prediction (standard c-index of 0.69 versus 0.66). In the external validation set, both models showed similar discrimination (standard c-index 0.68 versus 0.67). The simulation study confirmed these results. For datasets with a high ICC (≥15%), model calibration was only adequate in external subjects, if the used performance measure assumed the same data structure as the model development method: standard calibration measures showed good calibration for the standard developed model, calibration measures adapting the clustered data structure showed good calibration for the prediction model with random intercept. Conclusion The models with random intercept discriminate better than the standard model only if the cluster effect is used for predictions. The prediction model with random intercept had good calibration within clusters. PMID:23414436
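A rough illustration of the comparison on simulated clustered data; for brevity the random-intercept logistic model is approximated here by cluster dummies (the "fixed cluster effects" shortcut), which is not the study's estimation method but shows how using cluster effects changes the apparent c-index.

```python
# Standard logistic regression versus a cluster-aware model, compared by c-index.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(9)
n, n_clusters = 1600, 19
cluster = rng.integers(0, n_clusters, size=n)
u = rng.normal(scale=0.8, size=n_clusters)          # cluster (anesthesiologist) effects
x = rng.normal(size=n)                              # patient-level predictor
p = 1.0 / (1.0 + np.exp(-(-1.0 + 1.2 * x + u[cluster])))
df = pd.DataFrame({"y": rng.binomial(1, p), "x": x, "cluster": cluster})

standard = smf.logit("y ~ x", df).fit(disp=0)
with_cluster = smf.logit("y ~ x + C(cluster)", df).fit(disp=0)

print("c-index, standard model:      ", round(roc_auc_score(df["y"], standard.predict(df)), 3))
print("c-index, with cluster effects:", round(roc_auc_score(df["y"], with_cluster.predict(df)), 3))
```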
Wang, Wei; Griswold, Michael E
2016-11-30
The random effect Tobit model is a regression model that accommodates both left- and/or right-censoring and within-cluster dependence of the outcome variable. Regression coefficients of random effect Tobit models have conditional interpretations on a constructed latent dependent variable and do not provide inference of overall exposure effects on the original outcome scale. Marginalized random effects model (MREM) permits likelihood-based estimation of marginal mean parameters for the clustered data. For random effect Tobit models, we extend the MREM to marginalize over both the random effects and the normal space and boundary components of the censored response to estimate overall exposure effects at population level. We also extend the 'Average Predicted Value' method to estimate the model-predicted marginal means for each person under different exposure status in a designated reference group by integrating over the random effects and then use the calculated difference to assess the overall exposure effect. The maximum likelihood estimation is proposed utilizing a quasi-Newton optimization algorithm with Gauss-Hermite quadrature to approximate the integration of the random effects. We use these methods to carefully analyze two real datasets. Copyright © 2016 John Wiley & Sons, Ltd.
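A sketch of the Gauss-Hermite step only (not the authors' full marginalized likelihood): the marginal mean of a left-censored-at-zero outcome with a normal random intercept, integrating the random effect out by Hermite quadrature, and the implied overall exposure effect on the original scale. All parameter values are invented.

```python
# Gauss-Hermite marginalization of a random intercept for a censored mean.
import numpy as np
from scipy.stats import norm

def censored_mean(mu, sigma):
    # E[max(0, Z)] for Z ~ N(mu, sigma^2)
    return mu * norm.cdf(mu / sigma) + sigma * norm.pdf(mu / sigma)

def marginal_mean(xbeta, sigma_b, sigma_e, n_nodes=30):
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    b = np.sqrt(2.0) * sigma_b * nodes               # change of variables for N(0, sigma_b^2)
    vals = censored_mean(xbeta + b, sigma_e)
    return np.sum(weights * vals) / np.sqrt(np.pi)

# Overall exposure effect on the original (censored) outcome scale
effect = (marginal_mean(xbeta=1.0, sigma_b=0.7, sigma_e=1.0)
          - marginal_mean(xbeta=0.4, sigma_b=0.7, sigma_e=1.0))
print("marginal exposure effect:", round(effect, 3))
```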
Regression-based adaptive sparse polynomial dimensional decomposition for sensitivity analysis
NASA Astrophysics Data System (ADS)
Tang, Kunkun; Congedo, Pietro; Abgrall, Remi
2014-11-01
Polynomial dimensional decomposition (PDD) is employed in this work for global sensitivity analysis and uncertainty quantification of stochastic systems subject to a large number of random input variables. Due to the intimate structure between PDD and Analysis-of-Variance, PDD is able to provide simpler and more direct evaluation of the Sobol' sensitivity indices, when compared to polynomial chaos (PC). Unfortunately, the number of PDD terms grows exponentially with respect to the size of the input random vector, which makes the computational cost of the standard method unaffordable for real engineering applications. In order to address this curse-of-dimensionality problem, this work proposes a variance-based adaptive strategy aiming to build a cheap meta-model by sparse-PDD with PDD coefficients computed by regression. During this adaptive procedure, the model representation by PDD only contains few terms, so that the cost to resolve repeatedly the linear system of the least-square regression problem is negligible. The size of the final sparse-PDD representation is much smaller than the full PDD, since only significant terms are eventually retained. Consequently, far fewer calls to the deterministic model are required to compute the final PDD coefficients.
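A toy univariate-term analogue of computing expansion coefficients by least-squares regression (not the adaptive sparse-PDD algorithm itself): build Legendre terms per input, solve the regression problem, and count the terms that would be retained at an arbitrary threshold.

```python
# Regression-based estimation of polynomial expansion coefficients (toy example).
import numpy as np
from numpy.polynomial import legendre

rng = np.random.default_rng(10)
n_samples, n_vars, degree = 200, 3, 3
X = rng.uniform(-1, 1, size=(n_samples, n_vars))
y = X[:, 0] + 0.5 * X[:, 1]**2 + 0.1 * X[:, 0] * X[:, 2] + 0.05 * rng.normal(size=n_samples)

# Constant term plus univariate Legendre terms for each input variable
columns = [np.ones(n_samples)]
for j in range(n_vars):
    columns.append(legendre.legvander(X[:, j], degree)[:, 1:])   # drop duplicate P0
A = np.column_stack(columns)

coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # regression estimate of the coefficients
print("retained terms (|coef| > 0.02):", int(np.sum(np.abs(coef) > 0.02)))
print("coefficients:", np.round(coef, 3))
```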
Hewitt, Angela L.; Popa, Laurentiu S.; Pasalar, Siavash; Hendrix, Claudia M.
2011-01-01
Encoding of movement kinematics in Purkinje cell simple spike discharge has important implications for hypotheses of cerebellar cortical function. Several outstanding questions remain regarding representation of these kinematic signals. It is uncertain whether kinematic encoding occurs in unpredictable, feedback-dependent tasks or kinematic signals are conserved across tasks. Additionally, there is a need to understand the signals encoded in the instantaneous discharge of single cells without averaging across trials or time. To address these questions, this study recorded Purkinje cell firing in monkeys trained to perform a manual random tracking task in addition to circular tracking and center-out reach. Random tracking provides for extensive coverage of kinematic workspaces. Direction and speed errors are significantly greater during random than circular tracking. Cross-correlation analyses comparing hand and target velocity profiles show that hand velocity lags target velocity during random tracking. Correlations between simple spike firing from 120 Purkinje cells and hand position, velocity, and speed were evaluated with linear regression models including a time constant, τ, as a measure of the firing lead/lag relative to the kinematic parameters. Across the population, velocity accounts for the majority of simple spike firing variability (63 ± 30% of Radj2), followed by position (28 ± 24% of Radj2) and speed (11 ± 19% of Radj2). Simple spike firing often leads hand kinematics. Comparison of regression models based on averaged vs. nonaveraged firing and kinematics reveals lower Radj2 values for nonaveraged data; however, regression coefficients and τ values are highly similar. Finally, for most cells, model coefficients generated from random tracking accurately estimate simple spike firing in either circular tracking or center-out reach. These findings imply that the cerebellum controls movement kinematics, consistent with a forward internal model that predicts upcoming limb kinematics. PMID:21795616
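A simple lag-scan sketch on simulated signals: shift the firing rate relative to hand velocity and pick the lag with the highest R^2, mirroring the idea of estimating a lead/lag time constant tau. The 10 ms bin width and lead value are invented.

```python
# Lag scan: find the shift that maximizes the firing-velocity R^2.
import numpy as np

rng = np.random.default_rng(11)
dt, n = 0.01, 2000                                   # 10 ms bins, 20 s of data
velocity = np.convolve(rng.normal(size=n), np.ones(25) / 25, mode="same")
true_lead = 10                                       # firing leads velocity by 100 ms
firing = 5 + 2 * np.roll(velocity, -true_lead) + rng.normal(scale=0.5, size=n)

def r2_at_lag(lag):
    shifted = np.roll(firing, lag)                   # positive lag = firing leads
    keep = slice(50, n - 50)                         # drop wrapped edges
    r = np.corrcoef(shifted[keep], velocity[keep])[0, 1]
    return r * r

lags = np.arange(-30, 31)
best = lags[np.argmax([r2_at_lag(lag) for lag in lags])]
print("estimated tau:", best * dt, "s")              # about +0.10 s (firing leads)
```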
Davies, Simon J.C.; Mulsant, Benoit H.; Flint, Alastair J.; Rothschild, Anthony J.; Whyte, Ellen M.; Meyers, Barnett S.
2014-01-01
Background There are conflicting results on the impact of anxiety on depression outcomes. The impact of anxiety has not been studied in major depression with psychotic features (“psychotic depression”). Aims We assessed the impact of specific anxiety symptoms and disorders on the outcomes of psychotic depression. Methods We analyzed data from the Study of Pharmacotherapy for Psychotic Depression that randomized 259 younger and older participants to either olanzapine plus placebo or olanzapine plus sertraline. We assessed the impact of specific anxiety symptoms from the Brief Psychiatric Rating Scale (“tension”, “anxiety” and “somatic concerns” and a composite anxiety score) and diagnoses (panic disorder and GAD) on psychotic depression outcomes using linear or logistic regression. Age, gender, education and benzodiazepine use (at baseline and end) were included as covariates. Results Anxiety symptoms at baseline and anxiety disorder diagnoses differentially impacted outcomes. On adjusted linear regression there was an association between improvement in depressive symptoms and both baseline “tension” (coefficient = 0.784; 95% CI: 0.169–1.400; p = 0.013) and the composite anxiety score (regression coefficient = 0.348; 95% CI: 0.064–0.632; p = 0.017). There was an interaction between “tension” and treatment group, with better responses in those randomized to combination treatment if they had high baseline anxiety scores (coefficient = 1.309; 95% CI: 0.105–2.514; p = 0.033). In contrast, panic disorder was associated with worse clinical outcomes (coefficient = −3.858; 95% CI: –7.281 to −0.434; p = 0.027) regardless of treatment. Conclusions Our results suggest that analysis of the impact of anxiety on depression outcome needs to differentiate psychic and somatic symptoms. PMID:24656524
NASA Astrophysics Data System (ADS)
Rock, N. M. S.; Duffy, T. R.
REGRES allows a range of regression equations to be calculated for paired sets of data values in which both variables are subject to error (i.e. neither is the "independent" variable). Nonparametric regressions, based on medians of all possible pairwise slopes and intercepts, are treated in detail. Estimated slopes and intercepts are output, along with confidence limits, Spearman and Kendall rank correlation coefficients. Outliers can be rejected with user-determined stringency. Parametric regressions can be calculated for any value of λ (the ratio of the variances of the random errors for y and x)—including: (1) major axis ( λ = 1); (2) reduced major axis ( λ = variance of y/variance of x); (3) Y on X ( λ = infinity); or (4) X on Y ( λ = 0) solutions. Pearson linear correlation coefficients also are output. REGRES provides an alternative to conventional isochron assessment techniques where bivariate normal errors cannot be assumed, or weighting methods are inappropriate.
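The nonparametric regression REGRES implements is in the spirit of the Theil-Sen estimator (median of all pairwise slopes); scipy provides this directly, along with the rank correlations the abstract mentions. This is a generic sketch, not the REGRES program.

```python
# Median-of-pairwise-slopes (Theil-Sen) regression with rank correlations.
import numpy as np
from scipy import stats

rng = np.random.default_rng(12)
x = np.linspace(0, 10, 40)
y = 2.0 + 0.7 * x + rng.normal(scale=0.5, size=x.size)
y[5] += 8.0                                          # one outlier

slope, intercept, lo, hi = stats.theilslopes(y, x, alpha=0.95)
print("median pairwise slope:", slope, " 95% CI:", (lo, hi))
print("intercept:", intercept)

rho, _ = stats.spearmanr(x, y)
tau, _ = stats.kendalltau(x, y)
print("Spearman rho:", rho, " Kendall tau:", tau)
```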
NASA Astrophysics Data System (ADS)
Wilson, Barry T.; Knight, Joseph F.; McRoberts, Ronald E.
2018-03-01
Imagery from the Landsat Program has been used frequently as a source of auxiliary data for modeling land cover, as well as a variety of attributes associated with tree cover. With ready access to all scenes in the archive since 2008 due to the USGS Landsat Data Policy, new approaches to deriving such auxiliary data from dense Landsat time series are required. Several methods have previously been developed for use with finer temporal resolution imagery (e.g. AVHRR and MODIS), including image compositing and harmonic regression using Fourier series. The manuscript presents a study using Minnesota, USA, during the years 2009-2013 as the study area and timeframe. The study examined the relative predictive power of land cover models, in particular those related to tree cover, using predictor variables based solely on composite imagery versus those using estimated harmonic regression coefficients. The study used two common non-parametric modeling approaches (i.e. k-nearest neighbors and random forests) for fitting classification and regression models of multiple attributes measured on USFS Forest Inventory and Analysis plots using all available Landsat imagery for the study area and timeframe. The estimated Fourier coefficients developed by harmonic regression of tasseled cap transformation time series data were shown to be correlated with land cover, including tree cover. Regression models using estimated Fourier coefficients as predictor variables showed a two- to threefold increase in explained variance for a small set of continuous response variables, relative to comparable models using monthly image composites. Similarly, the overall accuracies of classification models using the estimated Fourier coefficients were approximately 10-20 percentage points higher than those of the models using the image composites, with corresponding individual class accuracies between 6 and 45 percentage points higher.
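A toy version of the workflow on simulated per-pixel time series (not Landsat data): estimate harmonic (Fourier) coefficients by least squares for each pixel, then use those coefficients as predictor variables in a random forest.

```python
# Harmonic regression coefficients as predictors for a random forest.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(13)
doy = np.sort(rng.integers(1, 366, size=40))             # acquisition days of year
omega = 2 * np.pi * doy / 365.25
H = np.column_stack([np.ones_like(omega), np.cos(omega), np.sin(omega)])

n_pixels = 300
tree_cover = rng.uniform(0, 100, size=n_pixels)
coefs = np.empty((n_pixels, 3))
for i in range(n_pixels):
    amplitude = 0.2 + 0.004 * tree_cover[i]              # cover drives seasonality here
    series = 0.3 + amplitude * np.sin(omega) + rng.normal(scale=0.05, size=doy.size)
    coefs[i], *_ = np.linalg.lstsq(H, series, rcond=None)  # harmonic coefficients

rf = RandomForestRegressor(n_estimators=200, oob_score=True, random_state=0)
rf.fit(coefs, tree_cover)
print("out-of-bag R^2 using Fourier coefficients:", round(rf.oob_score_, 3))
```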
Improvement of Storm Forecasts Using Gridded Bayesian Linear Regression for Northeast United States
NASA Astrophysics Data System (ADS)
Yang, J.; Astitha, M.; Schwartz, C. S.
2017-12-01
Bayesian linear regression (BLR) is a post-processing technique in which regression coefficients are derived and used to correct raw forecasts based on pairs of observation-model values. This study presents the development and application of a gridded Bayesian linear regression (GBLR) as a new post-processing technique to improve numerical weather prediction (NWP) of rain and wind storm forecasts over northeast United States. Ten controlled variables produced from ten ensemble members of the National Center for Atmospheric Research (NCAR) real-time prediction system are used for a GBLR model. In the GBLR framework, leave-one-storm-out cross-validation is utilized to study the performances of the post-processing technique in a database composed of 92 storms. To estimate the regression coefficients of the GBLR, optimization procedures that minimize the systematic and random error of predicted atmospheric variables (wind speed, precipitation, etc.) are implemented for the modeled-observed pairs of training storms. The regression coefficients calculated for meteorological stations of the National Weather Service are interpolated back to the model domain. An analysis of forecast improvements based on error reductions during the storms will demonstrate the value of GBLR approach. This presentation will also illustrate how the variances are optimized for the training partition in GBLR and discuss the verification strategy for grid points where no observations are available. The new post-processing technique is successful in improving wind speed and precipitation storm forecasts using past event-based data and has the potential to be implemented in real-time.
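A bare-bones conjugate-prior Bayesian linear regression update, the core calculation behind this kind of post-processing; the prior, noise variance, and variable names below are invented and do not reflect the NCAR ensemble or the GBLR configuration.

```python
# Conjugate Bayesian linear regression: posterior coefficients from forecast-observation pairs.
import numpy as np

rng = np.random.default_rng(14)
n = 120
X = np.column_stack([np.ones(n), rng.normal(10.0, 3.0, size=n)])   # [1, raw forecast]
obs = 1.5 + 0.8 * X[:, 1] + rng.normal(scale=1.2, size=n)          # observed wind speed

sigma2 = 1.2 ** 2                       # assumed observation-error variance
prior_prec = np.eye(2) / 10.0           # weak N(0, 10 I) prior on the coefficients

post_cov = np.linalg.inv(X.T @ X / sigma2 + prior_prec)
post_mean = post_cov @ (X.T @ obs / sigma2)
print("posterior regression coefficients:", post_mean)

new_forecast = np.array([1.0, 12.0])    # [1, raw model forecast] at a new time
print("corrected forecast:", new_forecast @ post_mean)
```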
Solving large test-day models by iteration on data and preconditioned conjugate gradient.
Lidauer, M; Strandén, I; Mäntysaari, E A; Pösö, J; Kettunen, A
1999-12-01
A preconditioned conjugate gradient method was implemented into an iteration-on-data program for the estimation of breeding values, and its convergence characteristics were studied. An algorithm was used as a reference in which one fixed effect was solved by the Gauss-Seidel method, and other effects were solved by a second-order Jacobi method. Implementation of the preconditioned conjugate gradient required storing four vectors (size equal to number of unknowns in the mixed model equations) in random access memory and reading the data at each round of iteration. The preconditioner comprised diagonal blocks of the coefficient matrix. Comparison of algorithms was based on solutions of mixed model equations obtained by a single-trait animal model and a single-trait, random regression test-day model. Data sets for both models used milk yield records of primiparous Finnish dairy cows. Animal model data comprised 665,629 lactation milk yields, and random regression test-day model data comprised 6,732,765 test-day milk yields. Both models included pedigree information of 1,099,622 animals. The animal model (random regression test-day model) required 122 (305) rounds of iteration to converge with the reference algorithm, but only 88 (149) were required with the preconditioned conjugate gradient. To solve the random regression test-day model with the preconditioned conjugate gradient required 237 megabytes of random access memory and took 14% of the computation time needed by the reference algorithm.
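A generic preconditioned conjugate gradient with a diagonal (Jacobi) preconditioner, to make the algorithm concrete; note the paper's preconditioner uses diagonal blocks of the mixed-model-equation coefficient matrix and re-reads the data each round instead of storing the matrix.

```python
# Generic preconditioned conjugate gradient (Jacobi preconditioner).
import numpy as np

def pcg(A, b, M_inv_diag, tol=1e-8, max_iter=1000):
    x = np.zeros_like(b)
    r = b - A @ x
    z = M_inv_diag * r                  # apply preconditioner M^{-1}
    p = z.copy()
    rz = r @ z
    for it in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            return x, it + 1
        z = M_inv_diag * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p       # update search direction
        rz = rz_new
    return x, max_iter

rng = np.random.default_rng(15)
Q = rng.normal(size=(200, 200))
A = Q @ Q.T + 200 * np.eye(200)         # symmetric positive definite test matrix
b = rng.normal(size=200)
x, n_iter = pcg(A, b, 1.0 / np.diag(A))
print("iterations:", n_iter, " residual:", np.linalg.norm(A @ x - b))
```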
NASA Astrophysics Data System (ADS)
Cambra-López, María; Winkel, Albert; Mosquera, Julio; Ogink, Nico W. M.; Aarnink, André J. A.
2015-06-01
The objective of this study was to compare co-located real-time light scattering devices and equivalent gravimetric samplers in poultry and pig houses for PM10 mass concentration, and to develop animal-specific calibration factors for light scattering samplers. These results will contribute to evaluating the comparability of different sampling instruments for PM10 concentrations. Paired DustTrak light scattering devices (DustTrak aerosol monitor, TSI, U.S.) and PM10 gravimetric cyclone samplers were used for measuring PM10 mass concentrations during 24 h periods (from noon to noon) inside animal houses. Sampling was conducted in 32 animal houses in the Netherlands, including broilers, broiler breeders, layers in floor and in aviary systems, turkeys, piglets, growing-finishing pigs in traditional and low emission housing with dry and liquid feed, and sows in individual and group housing. A total of 119 pairs of 24 h measurements (55 for poultry and 64 for pigs) were recorded and analyzed using linear regression analysis. Deviations between samplers were calculated and discussed. In poultry, cyclone sampler and DustTrak data fitted well to a linear regression, with a regression coefficient equal to 0.41, an intercept of 0.16 mg m-3 and a correlation coefficient of 0.91 (excluding turkeys). Results in turkeys showed a regression coefficient equal to 1.1 (P = 0.49), an intercept of 0.06 mg m-3 (P < 0.0001) and a correlation coefficient of 0.98. In pigs, we found a regression coefficient equal to 0.61, an intercept of 0.05 mg m-3 and a correlation coefficient of 0.84. Measured PM10 concentrations using DustTraks were clearly underestimated (approx. by a factor 2) in both poultry and pig housing systems compared with cyclone pre-separators. Absolute, relative, and random deviations increased with concentration. DustTrak light scattering devices should be self-calibrated to investigate PM10 mass concentrations accurately in animal houses. We recommend linear regression equations as animal-specific calibration factors for DustTraks instead of manufacturer calibration factors, especially in heavily dusty environments such as animal houses.
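A sketch of deriving an animal-house calibration from paired 24 h measurements, with invented values chosen to mimic the reported underestimation: regress gravimetric (cyclone) PM10 on DustTrak PM10 and apply the fitted line to new optical readings.

```python
# Deriving a linear calibration from paired optical and gravimetric PM10 data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(16)
cyclone = rng.uniform(0.3, 4.0, size=55)                            # gravimetric PM10, mg m-3
dusttrak = 0.16 + 0.41 * cyclone + rng.normal(scale=0.08, size=55)  # optical reads low

# Calibration: predict the gravimetric-equivalent concentration from the DustTrak reading
res = stats.linregress(dusttrak, cyclone)
print(f"PM10 ~ {res.intercept:.2f} + {res.slope:.2f} * DustTrak  (r = {res.rvalue:.2f})")

new_readings = np.array([0.5, 1.0, 2.0])
print("calibrated PM10:", np.round(res.intercept + res.slope * new_readings, 2))
```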
Ristić-Medić, Danijela; Dullemeijer, Carla; Tepsić, Jasna; Petrović-Oggiano, Gordana; Popović, Tamara; Arsić, Aleksandra; Glibetić, Marija; Souverein, Olga W; Collings, Rachel; Cavelaars, Adriënne; de Groot, Lisette; van't Veer, Pieter; Gurinović, Mirjana
2014-03-01
The objective of this systematic review was to identify studies investigating iodine intake and biomarkers of iodine status, to assess the data of the selected studies, and to estimate dose-response relationships using meta-analysis. All randomized controlled trials, prospective cohort studies, nested case-control studies, and cross-sectional studies that supplied or measured dietary iodine and measured iodine biomarkers were included. The overall pooled regression coefficient (β) and the standard error of β were calculated by random-effects meta-analysis on a double-log scale, using the calculated intake-status regression coefficient (β) for each individual study. The results of pooled randomized controlled trials indicated that the doubling of dietary iodine intake increased urinary iodine concentrations by 14% in children and adolescents, by 57% in adults and the elderly, and by 81% in pregnant women. The dose-response relationship between iodine intake and biomarkers of iodine status indicated a 12% decrease in thyroid-stimulating hormone and a 31% decrease in thyroglobulin in pregnant women. The model of dose-response quantification used to describe the relationship between iodine intake and biomarkers of iodine status may be useful for providing complementary evidence to support recommendations for iodine intake in different population groups.
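A small numeric illustration of the double-log scale used for pooling: a coefficient beta on log-log axes turns a doubling of intake into a proportional change of 2^beta - 1 in the biomarker, so the reported percentage changes imply the betas below.

```python
# Back-calculating double-log regression coefficients from reported doubling effects.
import numpy as np

# Reported changes in urinary iodine per doubling of intake: children 14%,
# adults and the elderly 57%, pregnant women 81%.
for group, pct in [("children", 0.14), ("adults", 0.57), ("pregnant women", 0.81)]:
    beta = np.log2(1.0 + pct)
    print(f"{group}: doubling effect {pct:.0%} -> implied beta = {beta:.3f}")
```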
Sharma, Ashok K; Srivastava, Gopal N; Roy, Ankita; Sharma, Vineet K
2017-01-01
The experimental methods for the prediction of molecular toxicity are tedious and time-consuming tasks. Thus, the computational approaches could be used to develop alternative methods for toxicity prediction. We have developed a tool for the prediction of molecular toxicity along with the aqueous solubility and permeability of any molecule/metabolite. Using a comprehensive and curated set of toxin molecules as a training set, the different chemical and structural based features such as descriptors and fingerprints were exploited for feature selection, optimization and development of machine learning based classification and regression models. The compositional differences in the distribution of atoms were apparent between toxins and non-toxins, and hence, the molecular features were used for the classification and regression. On 10-fold cross-validation, the descriptor-based, fingerprint-based and hybrid-based classification models showed similar accuracy (93%) and Matthews's correlation coefficient (0.84). The performances of all the three models were comparable (Matthews's correlation coefficient = 0.84-0.87) on the blind dataset. In addition, the regression-based models using descriptors as input features were also compared and evaluated on the blind dataset. Random forest based regression model for the prediction of solubility performed better (R2 = 0.84) than the multi-linear regression (MLR) and partial least square regression (PLSR) models, whereas, the partial least squares based regression model for the prediction of permeability (caco-2) performed better (R2 = 0.68) in comparison to the random forest and MLR based regression models. The performance of final classification and regression models was evaluated using the two validation datasets including the known toxins and commonly used constituents of health products, which attests to its accuracy. The ToxiM web server would be a highly useful and reliable tool for the prediction of toxicity, solubility, and permeability of small molecules.
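A quick sketch of the two regression approaches compared above, run on synthetic descriptor data rather than the ToxiM training set: cross-validated random forest and partial least squares regressions.

```python
# Random forest versus PLS regression on synthetic molecular descriptors.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(17)
n_mol, n_desc = 400, 60
X = rng.normal(size=(n_mol, n_desc))                 # molecular descriptors
y = X[:, :5] @ np.array([1.0, -0.8, 0.5, 0.3, -0.2]) + 0.3 * rng.normal(size=n_mol)

rf = RandomForestRegressor(n_estimators=300, random_state=0)
pls = PLSRegression(n_components=5)

print("random forest 10-fold CV R^2:", cross_val_score(rf, X, y, cv=10, scoring="r2").mean())
print("PLS regression 10-fold CV R^2:", cross_val_score(pls, X, y, cv=10, scoring="r2").mean())
```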
Panel regressions to estimate low-flow response to rainfall variability in ungaged basins
Bassiouni, Maoya; Vogel, Richard M.; Archfield, Stacey A.
2016-01-01
Multicollinearity and omitted-variable bias are major limitations to developing multiple linear regression models to estimate streamflow characteristics in ungaged areas and varying rainfall conditions. Panel regression is used to overcome limitations of traditional regression methods, and obtain reliable model coefficients, in particular to understand the elasticity of streamflow to rainfall. Using annual rainfall and selected basin characteristics at 86 gaged streams in the Hawaiian Islands, regional regression models for three stream classes were developed to estimate the annual low-flow duration discharges. Three panel-regression structures (random effects, fixed effects, and pooled) were compared to traditional regression methods, in which space is substituted for time. Results indicated that panel regression generally was able to reproduce the temporal behavior of streamflow and reduce the standard errors of model coefficients compared to traditional regression, even for models in which the unobserved heterogeneity between streams is significant and the variance inflation factor for rainfall is much greater than 10. This is because both spatial and temporal variability were better characterized in panel regression. In a case study, regional rainfall elasticities estimated from panel regressions were applied to ungaged basins on Maui, using available rainfall projections to estimate plausible changes in surface-water availability and usable stream habitat for native species. The presented panel-regression framework is shown to offer benefits over existing traditional hydrologic regression methods for developing robust regional relations to investigate streamflow response in a changing climate.
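As a rough illustration of the three panel structures compared above (pooled, fixed effects, random effects), the following Python sketch fits each to a synthetic basin-by-year table with statsmodels; the variable names, the synthetic data, and the use of a mixed model for the random-effects structure are assumptions made for illustration only.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)
    n_basins, n_years = 30, 20                                   # synthetic basin-by-year panel
    basin = np.repeat(np.arange(n_basins), n_years)
    log_rain = rng.normal(7.0, 0.3, basin.size)
    basin_effect = rng.normal(0.0, 0.5, n_basins)[basin]         # unobserved heterogeneity between streams
    log_q = -2.0 + 1.2 * log_rain + basin_effect + rng.normal(0, 0.2, basin.size)
    df = pd.DataFrame({"basin": basin, "log_rain": log_rain, "log_q": log_q})

    pooled = smf.ols("log_q ~ log_rain", data=df).fit()                  # pooled: ignores the panel structure
    fixed = smf.ols("log_q ~ log_rain + C(basin)", data=df).fit()        # fixed effects via basin dummies
    random_fx = smf.mixedlm("log_q ~ log_rain", data=df, groups=df["basin"]).fit()  # random basin intercepts

    # The log_rain coefficient approximates the rainfall elasticity of low flow.
    print("pooled :", round(pooled.params["log_rain"], 3), "se", round(pooled.bse["log_rain"], 3))
    print("fixed  :", round(fixed.params["log_rain"], 3), "se", round(fixed.bse["log_rain"], 3))
    print("random :", round(random_fx.params["log_rain"], 3), "se", round(random_fx.bse["log_rain"], 3))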
Robust, Adaptive Functional Regression in Functional Mixed Model Framework.
Zhu, Hongxiao; Brown, Philip J; Morris, Jeffrey S
2011-09-01
Functional data are increasingly encountered in scientific studies, and their high dimensionality and complexity lead to many analytical challenges. Various methods for functional data analysis have been developed, including functional response regression methods that involve regression of a functional response on univariate/multivariate predictors with nonparametrically represented functional coefficients. In existing methods, however, the functional regression can be sensitive to outlying curves and outlying regions of curves, so is not robust. In this paper, we introduce a new Bayesian method, robust functional mixed models (R-FMM), for performing robust functional regression within the general functional mixed model framework, which includes multiple continuous or categorical predictors and random effect functions accommodating potential between-function correlation induced by the experimental design. The underlying model involves a hierarchical scale mixture model for the fixed effects, random effect and residual error functions. These modeling assumptions across curves result in robust nonparametric estimators of the fixed and random effect functions which down-weight outlying curves and regions of curves, and produce statistics that can be used to flag global and local outliers. These assumptions also lead to distributions across wavelet coefficients that have outstanding sparsity and adaptive shrinkage properties, with great flexibility for the data to determine the sparsity and the heaviness of the tails. Together with the down-weighting of outliers, these within-curve properties lead to fixed and random effect function estimates that appear in our simulations to be remarkably adaptive in their ability to remove spurious features yet retain true features of the functions. We have developed general code to implement this fully Bayesian method that is automatic, requiring the user to only provide the functional data and design matrices. It is efficient enough to handle large data sets, and yields posterior samples of all model parameters that can be used to perform desired Bayesian estimation and inference. Although we present details for a specific implementation of the R-FMM using specific distributional choices in the hierarchical model, 1D functions, and wavelet transforms, the method can be applied more generally using other heavy-tailed distributions, higher dimensional functions (e.g. images), and using other invertible transformations as alternatives to wavelets.
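The R-FMM itself is a Bayesian, wavelet-space hierarchical model; the short sketch below only illustrates the underlying idea of down-weighting outlying observations, using a Huber M-estimator from statsmodels as a stand-in rather than the authors' scale-mixture formulation.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    x = np.linspace(0, 1, 200)
    y = 2.0 + 3.0 * x + rng.normal(0, 0.3, x.size)
    y[::25] += 8.0                                            # inject a few outlying observations

    X = sm.add_constant(x)
    ols = sm.OLS(y, X).fit()                                  # sensitive to the outliers
    rlm = sm.RLM(y, X, M=sm.robust.norms.HuberT()).fit()      # down-weights the outliers

    print("OLS slope:", round(ols.params[1], 3))
    print("robust slope:", round(rlm.params[1], 3))
    print("smallest robust weights:", np.round(np.sort(rlm.weights)[:5], 2))  # flags outlying points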
Use of Thematic Mapper for water quality assessment
NASA Technical Reports Server (NTRS)
Horn, E. M.; Morrissey, L. A.
1984-01-01
The evaluation of simulated TM data obtained on an ER-2 aircraft at twenty-five predesignated sample sites for mapping water quality factors such as conductivity, pH, suspended solids, turbidity, temperature, and depth is discussed. Using a multiple regression on the seven TM bands, an equation is developed for suspended solids. TM bands 1, 2, 3, 4, and 6 are used with log-transformed conductivity in a multiple regression. The regression equations are assessed for a high coefficient of determination (R-squared) and statistical significance. Confidence intervals about the mean regression point are calculated to assess the robustness of the regressions used for mapping conductivity, turbidity, and suspended solids, and cross-validation is conducted by regressing random subsamples of sites and comparing the resulting range of R-squared values.
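A minimal sketch of the two steps described above, assuming simulated band values in place of the actual ER-2 data: an ordinary least-squares regression of log conductivity on the TM bands, followed by regressions on random subsamples of sites to examine the spread of R-squared.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    n_sites, n_bands = 25, 5                      # TM bands 1, 2, 3, 4 and 6 in the study
    bands = rng.normal(100, 15, size=(n_sites, n_bands))
    log_cond = 0.5 + bands @ np.array([0.01, -0.02, 0.015, 0.0, 0.005]) + rng.normal(0, 0.1, n_sites)

    X = sm.add_constant(bands)
    full_fit = sm.OLS(log_cond, X).fit()
    print("full-sample R-squared:", round(full_fit.rsquared, 3))

    # Cross-validation by regressing random subsamples of sites.
    r2 = []
    for _ in range(200):
        idx = rng.choice(n_sites, size=20, replace=False)
        r2.append(sm.OLS(log_cond[idx], X[idx]).fit().rsquared)
    print("subsample R-squared range:", round(min(r2), 3), "to", round(max(r2), 3))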
Punzo, Antonio; Ingrassia, Salvatore; Maruotti, Antonello
2018-04-22
A time-varying latent variable model is proposed to jointly analyze multivariate mixed-support longitudinal data. The proposal can be viewed as an extension of hidden Markov regression models with fixed covariates (HMRMFCs), which is the state of the art for modelling longitudinal data, with a special focus on the underlying clustering structure. HMRMFCs are inadequate for applications in which a clustering structure can be identified in the distribution of the covariates, as the clustering is independent from the covariates distribution. Here, hidden Markov regression models with random covariates are introduced by explicitly specifying state-specific distributions for the covariates, with the aim of improving the recovering of the clusters in the data with respect to a fixed covariates paradigm. The hidden Markov regression models with random covariates class is defined focusing on the exponential family, in a generalized linear model framework. Model identifiability conditions are sketched, an expectation-maximization algorithm is outlined for parameter estimation, and various implementation and operational issues are discussed. Properties of the estimators of the regression coefficients, as well as of the hidden path parameters, are evaluated through simulation experiments and compared with those of HMRMFCs. The method is applied to physical activity data. Copyright © 2018 John Wiley & Sons, Ltd.
Hewitt, Angela L; Popa, Laurentiu S; Pasalar, Siavash; Hendrix, Claudia M; Ebner, Timothy J
2011-11-01
Encoding of movement kinematics in Purkinje cell simple spike discharge has important implications for hypotheses of cerebellar cortical function. Several outstanding questions remain regarding representation of these kinematic signals. It is uncertain whether kinematic encoding occurs in unpredictable, feedback-dependent tasks or kinematic signals are conserved across tasks. Additionally, there is a need to understand the signals encoded in the instantaneous discharge of single cells without averaging across trials or time. To address these questions, this study recorded Purkinje cell firing in monkeys trained to perform a manual random tracking task in addition to circular tracking and center-out reach. Random tracking provides for extensive coverage of kinematic workspaces. Direction and speed errors are significantly greater during random than circular tracking. Cross-correlation analyses comparing hand and target velocity profiles show that hand velocity lags target velocity during random tracking. Correlations between simple spike firing from 120 Purkinje cells and hand position, velocity, and speed were evaluated with linear regression models including a time constant, τ, as a measure of the firing lead/lag relative to the kinematic parameters. Across the population, velocity accounts for the majority of simple spike firing variability (63 ± 30% of the adjusted R2), followed by position (28 ± 24% of the adjusted R2) and speed (11 ± 19% of the adjusted R2). Simple spike firing often leads hand kinematics. Comparison of regression models based on averaged vs. nonaveraged firing and kinematics reveals lower adjusted R2 values for nonaveraged data; however, regression coefficients and τ values are highly similar. Finally, for most cells, model coefficients generated from random tracking accurately estimate simple spike firing in either circular tracking or center-out reach. These findings imply that the cerebellum controls movement kinematics, consistent with a forward internal model that predicts upcoming limb kinematics.
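A simplified sketch of the lead/lag analysis described above: regress synthetic firing on hand velocity while scanning a lag τ and keep the τ that maximizes R2. Only velocity is used here, whereas the study modeled position, velocity, and speed jointly; all signals are simulated.

    import numpy as np

    rng = np.random.default_rng(3)
    dt, n = 0.01, 5000                              # 10 ms bins, one synthetic session
    vel = np.convolve(rng.normal(0, 1, n), np.ones(50) / 50, mode="same")   # smoothed hand velocity
    true_lead = 10                                  # firing leads kinematics by 10 bins (100 ms)
    rate = 20 + 5 * np.roll(vel, -true_lead) + rng.normal(0, 0.5, n)

    def r_squared(lag):
        # Regress firing at time t on velocity at time t + lag (positive lag = firing leads).
        y, x = rate[:n - lag], vel[lag:]
        X = np.column_stack([np.ones_like(x), x])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        return 1.0 - resid.var() / y.var()

    lags = range(0, 40)
    best = max(lags, key=r_squared)
    print("estimated tau:", round(best * dt, 3), "s, R2 =", round(r_squared(best), 3))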
Extrapolating intensified forest inventory data to the surrounding landscape using landsat
Evan B. Brooks; John W. Coulston; Valerie A. Thomas; Randolph H. Wynne
2015-01-01
In 2011, a collection of spatially intensified plots was established on three of the Experimental Forests and Ranges (EFRs) sites with the intent of facilitating FIA program objectives for regional extrapolation. Characteristic coefficients from harmonic regression (HR) analysis of associated Landsat stacks are used as inputs into a conditional random forests model to...
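The abstract is truncated, but the workflow it outlines (characteristic harmonic-regression coefficients from a Landsat time series feeding a random forest) can be sketched roughly as follows; a standard random forest stands in for the conditional random forests model mentioned above, and the band, response variable, and harmonic order are assumptions.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(4)

    def harmonic_coeffs(doy, values, period=365.25):
        # First-order harmonic regression: value ~ a0 + a1*cos(wt) + b1*sin(wt) + trend.
        w = 2 * np.pi * doy / period
        X = np.column_stack([np.ones_like(w), np.cos(w), np.sin(w), doy / period])
        beta, *_ = np.linalg.lstsq(X, values, rcond=None)
        return beta                                   # characteristic HR coefficients

    # Synthetic stack: 300 plots/pixels, 40 scene dates, plus a plot-level response (e.g. biomass).
    n_pix, n_scenes = 300, 40
    doy = np.sort(rng.uniform(0, 5 * 365, n_scenes))
    amp = rng.uniform(0.1, 0.5, n_pix)
    ndvi = 0.4 + amp[:, None] * np.cos(2 * np.pi * doy / 365.25) + rng.normal(0, 0.03, (n_pix, n_scenes))
    response = 50 + 120 * amp + rng.normal(0, 5, n_pix)

    features = np.array([harmonic_coeffs(doy, ndvi[i]) for i in range(n_pix)])
    rf = RandomForestRegressor(n_estimators=300, oob_score=True, random_state=0).fit(features, response)
    print("out-of-bag R2:", round(rf.oob_score_, 3))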
Simulation of land use change in the Three Gorges Reservoir Area based on CART-CA
NASA Astrophysics Data System (ADS)
Yuan, Min
2018-05-01
This study proposes a new method to simulate spatiotemporally complex, multiple land uses by using a cellular automaton (CA) model based on the classification and regression tree (CART) algorithm. In this model, the CART algorithm is used to calculate the land-class conversion probability, which is combined with a neighborhood factor and a random factor to derive the cellular transition rules. In the land-use dynamic simulation of the Three Gorges Reservoir area from 2000 to 2010, the overall Kappa coefficient is 0.8014 and the overall accuracy is 0.8821, indicating satisfactory simulation results.
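A toy sketch of the CART-CA transition rule described above: a CART model supplies the per-cell conversion probability, which is combined with a 3x3 neighborhood factor and a stochastic perturbation before thresholding. The driver layers, the perturbation formula, and the threshold are illustrative assumptions, not the paper's calibration.

    import numpy as np
    from scipy.ndimage import convolve
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(5)
    size = 100
    land = (rng.random((size, size)) < 0.3).astype(int)       # 1 = converted class, 0 = other
    slope = rng.random((size, size))                          # illustrative driver layers
    dist_road = rng.random((size, size))

    # CART supplies the land-class conversion probability from the driver variables.
    X = np.column_stack([slope.ravel(), dist_road.ravel()])
    cart = DecisionTreeClassifier(max_depth=6, random_state=0).fit(X, land.ravel())
    p_cart = cart.predict_proba(X)[:, 1].reshape(size, size)

    # Neighborhood factor: share of converted cells in the 3x3 Moore neighborhood.
    kernel = np.ones((3, 3)); kernel[1, 1] = 0
    neigh = convolve(land.astype(float), kernel, mode="constant") / 8.0

    # Stochastic perturbation (one common CA choice) and the combined transition rule.
    rand_factor = 1.0 + (-np.log(rng.random((size, size)))) ** 0.2
    p_total = p_cart * neigh * rand_factor
    new_land = np.where((land == 0) & (p_total > 0.3), 1, land)   # one CA iteration
    print("cells converted in this iteration:", int((new_land != land).sum()))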
Belief in complementary and alternative medicine is related to age and paranormal beliefs in adults.
Van den Bulck, Jan; Custers, Kathleen
2010-04-01
The use of complementary and alternative medicine (CAM) is widespread, even among people who use conventional medicine. Positive beliefs about CAM are common among physicians and medical students. Little is known about the beliefs regarding CAM among the general public. Among science students, belief in CAM was predicted by belief in the paranormal. In a cross-sectional study, 712 randomly selected adults (>18 years old) responded to the CAM Health Belief Questionnaire (CHBQ) and a paranormal beliefs scale. CAM beliefs were very prevalent in this sample of adult Flemish men and women. Zero-order correlations indicated that belief in CAM was associated with age (r = 0.173, P < 0.001), level of education (r = -0.079, P = 0.039), social desirability (r = -0.119, P = 0.002), and paranormal belief (r = 0.365, P < 0.001). In a multivariate model, two variables predicted CAM beliefs. Support for CAM increased with age (regression coefficient: 0.01; 95% confidence interval (CI): 0.006 to 0.014), but the strongest relationship existed between support for CAM and beliefs in the paranormal. Paranormal beliefs accounted for 14% of the variance of the CAM beliefs (regression coefficient: 0.376; 95% CI: 0.30 to 0.44). The level of education (regression coefficient: 0.06; 95% CI: -0.014 to 0.129) and social desirability (regression coefficient: -0.023; 95% CI: -0.048 to 0.026) did not make a significant contribution to the explained variance (<0.1%, P = 0.867). Support of CAM was very prevalent in this Flemish adult population. CAM beliefs were strongly associated with paranormal beliefs.
Sullivan, Sarah; Lewis, Glyn; Mohr, Christine; Herzig, Daniela; Corcoran, Rhiannon; Drake, Richard; Evans, Jonathan
2014-01-01
There is some cross-sectional evidence that theory of mind ability is associated with social functioning in those with psychosis but the direction of this relationship is unknown. This study investigates the longitudinal association between both theory of mind and psychotic symptoms and social functioning outcome in first-episode psychosis. Fifty-four people with first-episode psychosis were followed up at 6 and 12 months. Random effects regression models were used to estimate the stability of theory of mind over time and the association between baseline theory of mind and psychotic symptoms and social functioning outcome. Neither baseline theory of mind ability (regression coefficients: Hinting test 1.07 95% CI -0.74, 2.88; Visual Cartoon test -2.91 95% CI -7.32, 1.51) nor baseline symptoms (regression coefficients: positive symptoms -0.04 95% CI -1.24, 1.16; selected negative symptoms -0.15 95% CI -2.63, 2.32) were associated with social functioning outcome. There was evidence that theory of mind ability was stable over time, (regression coefficients: Hinting test 5.92 95% CI -6.66, 8.92; Visual Cartoon test score 0.13 95% CI -0.17, 0.44). Neither baseline theory of mind ability nor psychotic symptoms are associated with social functioning outcome. Further longitudinal work is needed to understand the origin of social functioning deficits in psychosis.
Modified Regression Correlation Coefficient for Poisson Regression Model
NASA Astrophysics Data System (ADS)
Kaengthong, Nattacha; Domthong, Uthumporn
2017-09-01
This study focuses on indicators of the predictive power of the Generalized Linear Model (GLM), which are widely used but often subject to some restrictions. We are interested in the regression correlation coefficient for a Poisson regression model. This measure of predictive power is defined by the relationship between the dependent variable (Y) and the expected value of the dependent variable given the independent variables [E(Y|X)], where the dependent variable follows a Poisson distribution. The purpose of this research was to modify the regression correlation coefficient for the Poisson regression model. We also compare the proposed modified regression correlation coefficient with the traditional regression correlation coefficient in the case of two or more independent variables and in the presence of multicollinearity among the independent variables. The results show that the proposed regression correlation coefficient outperforms the traditional one in terms of bias and root mean square error (RMSE).
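The traditional regression correlation coefficient referred to above is the correlation between Y and the fitted E(Y|X); a minimal computation for a Poisson GLM is sketched below on simulated, deliberately collinear data. The modified coefficient proposed in the paper is not reproduced here.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(6)
    n = 500
    x1 = rng.normal(size=n)
    x2 = 0.8 * x1 + 0.2 * rng.normal(size=n)          # deliberately collinear with x1
    mu = np.exp(0.3 + 0.5 * x1 - 0.2 * x2)
    y = rng.poisson(mu)

    X = sm.add_constant(np.column_stack([x1, x2]))
    fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()

    # Traditional regression correlation coefficient: corr(Y, fitted E(Y|X)).
    r = np.corrcoef(y, fit.fittedvalues)[0, 1]
    print("regression correlation coefficient:", round(r, 3))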
Waller, Niels G
2016-01-01
For a fixed set of standardized regression coefficients and a fixed coefficient of determination (R-squared), an infinite number of predictor correlation matrices will satisfy the implied quadratic form. I call such matrices fungible correlation matrices. In this article, I describe an algorithm for generating positive definite (PD), positive semidefinite (PSD), or indefinite (ID) fungible correlation matrices that have a random or fixed smallest eigenvalue. The underlying equations of this algorithm are reviewed from both algebraic and geometric perspectives. Two simulation studies illustrate that fungible correlation matrices can be profitably used in Monte Carlo research. The first study uses PD fungible correlation matrices to compare penalized regression algorithms. The second study uses ID fungible correlation matrices to compare matrix-smoothing algorithms. R code for generating fungible correlation matrices is presented in the supplemental materials.
MANCOVA for one way classification with homogeneity of regression coefficient vectors
NASA Astrophysics Data System (ADS)
Mokesh Rayalu, G.; Ravisankar, J.; Mythili, G. Y.
2017-11-01
MANOVA and MANCOVA are the extensions of the univariate ANOVA and ANCOVA techniques to multidimensional or vector-valued observations. The assumption of a Gaussian distribution is replaced with a multivariate Gaussian distribution for the observation vectors and residual terms in the statistical models of these techniques. The objective of MANCOVA is to determine whether statistically reliable mean differences between groups remain after adjusting for the covariates. When randomized assignment of samples or subjects to groups is not possible, multivariate analysis of covariance (MANCOVA) provides statistical matching of groups by adjusting the dependent variables as if all subjects had scored the same on the covariates. In this research article, the MANCOVA technique is extended to a larger number of covariates, and the homogeneity of the regression coefficient vectors is also tested.
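A hedged sketch of the usual preliminary check, testing homogeneity of regression coefficient vectors via group-by-covariate interactions in statsmodels' MANOVA before fitting the MANCOVA itself; the data and variable names are simulated, and this is only the standard interaction test, not the extension developed in the article.

    import numpy as np
    import pandas as pd
    from statsmodels.multivariate.manova import MANOVA

    rng = np.random.default_rng(7)
    n_per, groups = 40, ["A", "B", "C"]
    df = pd.DataFrame({
        "group": np.repeat(groups, n_per),
        "cov1": rng.normal(size=3 * n_per),
        "cov2": rng.normal(size=3 * n_per),
    })
    df["y1"] = 1.0 + 0.6 * df.cov1 + 0.3 * df.cov2 + rng.normal(0, 1, 3 * n_per)
    df["y2"] = 0.5 - 0.4 * df.cov1 + 0.2 * df.cov2 + rng.normal(0, 1, 3 * n_per)

    # Homogeneity of regression: the group x covariate interactions should be non-significant.
    homog = MANOVA.from_formula("y1 + y2 ~ group * (cov1 + cov2)", data=df)
    print(homog.mv_test())

    # If homogeneity holds, fit the MANCOVA proper (common slopes, covariate-adjusted group effect).
    mancova = MANOVA.from_formula("y1 + y2 ~ group + cov1 + cov2", data=df)
    print(mancova.mv_test())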
Wagner, Brian J.; Gorelick, Steven M.
1986-01-01
A simulation nonlinear multiple-regression methodology for estimating parameters that characterize the transport of contaminants is developed and demonstrated. Finite difference contaminant transport simulation is combined with a nonlinear weighted least squares multiple-regression procedure. The technique provides optimal parameter estimates and gives statistics for assessing the reliability of these estimates under certain general assumptions about the distributions of the random measurement errors. Monte Carlo analysis is used to estimate parameter reliability for a hypothetical homogeneous soil column for which concentration data contain large random measurement errors. The value of data collected spatially versus data collected temporally was investigated for estimation of velocity, dispersion coefficient, effective porosity, first-order decay rate, and zero-order production. The use of spatial data gave estimates that were 2–3 times more reliable than estimates based on temporal data for all parameters except velocity. Comparison of estimated linear and nonlinear confidence intervals based upon Monte Carlo analysis showed that the linear approximation is poor for dispersion coefficient and zero-order production coefficient when data are collected over time. In addition, examples demonstrate transport parameter estimation for two real one-dimensional systems. First, the longitudinal dispersivity and effective porosity of an unsaturated soil are estimated using laboratory column data. We compare the reliability of estimates based upon data from individual laboratory experiments versus estimates based upon pooled data from several experiments. Second, the simulation nonlinear regression procedure is extended to include an additional governing equation that describes delayed storage during contaminant transport. The model is applied to analyze the trends, variability, and interrelationship of parameters in a mountain stream in northern California.
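A rough sketch of the simulation-regression idea for a one-dimensional column: fit noisy breakthrough data by weighted nonlinear least squares. For brevity, the leading term of an Ogata-Banks-type analytical solution replaces the finite-difference simulator, and only velocity and the dispersion coefficient are estimated; the weights, parameter values, and data are assumptions.

    import numpy as np
    from scipy.optimize import curve_fit
    from scipy.special import erfc

    L = 0.5                                      # observation point along the column (m)

    def breakthrough(t, v, D, c0=1.0):
        # Leading term of the Ogata-Banks solution for continuous injection;
        # the second (exponential) term is neglected here for simplicity.
        return 0.5 * c0 * erfc((L - v * t) / (2.0 * np.sqrt(D * t)))

    rng = np.random.default_rng(8)
    t = np.linspace(0.1, 50, 60)                          # hours
    true_v, true_D = 0.02, 0.001
    sigma = 0.05                                          # large random measurement error
    c_obs = breakthrough(t, true_v, true_D) + rng.normal(0, sigma, t.size)

    popt, pcov = curve_fit(breakthrough, t, c_obs, p0=[0.03, 0.005],
                           sigma=np.full(t.size, sigma), absolute_sigma=True,
                           bounds=([1e-4, 1e-6], [1.0, 1.0]))
    se = np.sqrt(np.diag(pcov))
    print("v =", round(popt[0], 4), "+/-", round(se[0], 4))
    print("D =", round(popt[1], 5), "+/-", round(se[1], 5))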
The mycotic ulcer treatment trial: a randomized trial comparing natamycin vs voriconazole.
Prajna, N Venkatesh; Krishnan, Tiruvengada; Mascarenhas, Jeena; Rajaraman, Revathi; Prajna, Lalitha; Srinivasan, Muthiah; Raghavan, Anita; Oldenburg, Catherine E; Ray, Kathryn J; Zegans, Michael E; McLeod, Stephen D; Porco, Travis C; Acharya, Nisha R; Lietman, Thomas M
2013-04-01
To compare topical natamycin vs voriconazole in the treatment of filamentous fungal keratitis. This phase 3, double-masked, multicenter trial was designed to randomize 368 patients to voriconazole (1%) or natamycin (5%), applied topically every hour while awake until reepithelialization, then 4 times daily for at least 3 weeks. Eligibility included smear-positive filamentous fungal ulcer and visual acuity of 20/40 to 20/400. The primary outcome was best spectacle-corrected visual acuity at 3 months; secondary outcomes included corneal perforation and/or therapeutic penetrating keratoplasty. A total of 940 patients were screened and 323 were enrolled. Causative organisms included Fusarium (128 patients [40%]), Aspergillus (54 patients [17%]), and other filamentous fungi (141 patients [43%]). Natamycin-treated cases had significantly better 3-month best spectacle-corrected visual acuity than voriconazole-treated cases (regression coefficient=-0.18 logMAR; 95% CI, -0.30 to -0.05; P=.006). Natamycin-treated cases were less likely to have perforation or require therapeutic penetrating keratoplasty (odds ratio=0.42; 95% CI, 0.22 to 0.80; P=.009). Fusarium cases fared better with natamycin than with voriconazole (regression coefficient=-0.41 logMAR; 95% CI, -0.61 to -0.20; P<.001; odds ratio for perforation=0.06; 95% CI, 0.01 to 0.28; P<.001), while non-Fusarium cases fared similarly (regression coefficient=-0.02 logMAR; 95% CI, -0.17 to 0.13; P=.81; odds ratio for perforation=1.08; 95% CI, 0.48 to 2.43; P=.86). Natamycin treatment was associated with significantly better clinical and microbiological outcomes than voriconazole treatment for smear-positive filamentous fungal keratitis, with much of the difference attributable to improved results in Fusarium cases. Voriconazole should not be used as monotherapy in filamentous keratitis. clinicaltrials.gov Identifier: NCT00996736
Investigating bias in squared regression structure coefficients
Nimon, Kim F.; Zientek, Linda R.; Thompson, Bruce
2015-01-01
The importance of structure coefficients and analogs of regression weights for analysis within the general linear model (GLM) has been well-documented. The purpose of this study was to investigate bias in squared structure coefficients in the context of multiple regression and to determine if a formula that had been shown to correct for bias in squared Pearson correlation coefficients and coefficients of determination could be used to correct for bias in squared regression structure coefficients. Using data from a Monte Carlo simulation, this study found that squared regression structure coefficients corrected with Pratt's formula produced less biased estimates and might be more accurate and stable estimates of population squared regression structure coefficients than estimates with no such corrections. While our findings are in line with prior literature that identified multicollinearity as a predictor of bias in squared regression structure coefficients but not coefficients of determination, the findings from this study are unique in that the level of predictive power, number of predictors, and sample size were also observed to contribute bias in squared regression structure coefficients. PMID:26217273
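Squared regression structure coefficients are squared correlations between each predictor and the regression's predicted scores; a minimal computation on simulated collinear predictors is sketched below. Pratt's bias-correction formula is not reproduced here.

    import numpy as np

    rng = np.random.default_rng(9)
    n, p = 120, 3
    R = np.array([[1.0, 0.6, 0.3],
                  [0.6, 1.0, 0.4],
                  [0.3, 0.4, 1.0]])                 # correlated (multicollinear) predictors
    X = rng.multivariate_normal(np.zeros(p), R, size=n)
    y = X @ np.array([0.5, 0.2, 0.1]) + rng.normal(0, 1, n)

    Xc = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    yhat = Xc @ beta

    r2 = np.corrcoef(y, yhat)[0, 1] ** 2
    structure = np.array([np.corrcoef(X[:, j], yhat)[0, 1] for j in range(p)])
    print("R-squared:", round(r2, 3))
    print("squared structure coefficients:", np.round(structure ** 2, 3))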
Howard, Jeremy T; Jiao, Shihui; Tiezzi, Francesco; Huang, Yijian; Gray, Kent A; Maltecca, Christian
2015-05-30
Feed intake and growth are economically important traits in swine production. Previous genome wide association studies (GWAS) have utilized average daily gain or daily feed intake to identify regions that impact growth and feed intake across time. The use of longitudinal models in GWAS studies, such as random regression, allows for SNPs having a heterogeneous effect across the trajectory to be characterized. The objective of this study is therefore to conduct a single step GWAS (ssGWAS) on the animal polynomial coefficients for feed intake and growth. Corrected daily feed intake (DFIAdj) and average daily weight measurements (DBWAvg) on 8981 (n=525,240 observations) and 5643 (n=283,607 observations) animals were utilized in a random regression model using Legendre polynomials (order=2) and a relationship matrix that included genotyped and un-genotyped animals. A ssGWAS was conducted on the animal polynomial coefficients (intercept, linear and quadratic) for animals with genotypes (DFIAdj: n=855; DBWAvg: n=590). Regions were characterized based on the variance of 10-SNP sliding-window GEBV (WGEBV). A bootstrap analysis (n=1000) was conducted to declare significance. Heritability estimates across the trajectory ranged from 0.34 to 0.52 for DBWAvg and from 0.07 to 0.23 for DFIAdj. Genetic correlations across age classes were large and positive for both DBWAvg and DFIAdj, although age classes at the beginning had a small to moderate genetic correlation with age classes towards the end of the trajectory for both traits. The WGEBV variance explained by significant regions (P<0.001) for each polynomial coefficient ranged from 0.2 to 0.9% for DBWAvg and from 0.3 to 1.01% for DFIAdj. The WGEBV variance explained by significant regions for the trajectory was 1.54 and 1.95% for DBWAvg and DFIAdj. Both traits identified candidate genes with functions related to metabolite and energy homeostasis, glucose and insulin signaling and behavior. We have identified regions of the genome that have an impact on the intercept, linear and quadratic terms for DBWAvg and DFIAdj. These results provide preliminary evidence that individual growth and feed intake trajectories are impacted by different regions of the genome at different times.
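A toy sketch of the windowing step described above: given estimated SNP effects for one random-regression coefficient and a genotype matrix, the variance across animals of each 10-SNP sliding-window GEBV is computed. Genotypes and effects are simulated, and the bootstrap significance test is omitted.

    import numpy as np

    rng = np.random.default_rng(10)
    n_animals, n_snps, window = 600, 2000, 10
    Z = rng.binomial(2, 0.3, size=(n_animals, n_snps)).astype(float)   # 0/1/2 genotype codes
    snp_eff = rng.normal(0, 0.01, n_snps)
    snp_eff[500:510] += 0.15                      # a region with a real effect on, say, the intercept term

    total_var = (Z @ snp_eff).var()
    wgebv_var = np.empty(n_snps - window + 1)
    for start in range(n_snps - window + 1):
        w = slice(start, start + window)
        wgebv_var[start] = (Z[:, w] @ snp_eff[w]).var()

    pct = 100 * wgebv_var / total_var             # % of GEBV variance explained per window
    top = int(np.argmax(pct))
    print("top window starts at SNP", top, "explaining", round(pct[top], 2), "% of variance")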
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tang, Kunkun; Congedo, Pietro M.
The Polynomial Dimensional Decomposition (PDD) is employed in this work for the global sensitivity analysis and uncertainty quantification (UQ) of stochastic systems subject to a moderate to large number of input random variables. Due to the intimate connection between the PDD and the Analysis of Variance (ANOVA) approaches, PDD is able to provide a simpler and more direct evaluation of the Sobol' sensitivity indices, when compared to the Polynomial Chaos expansion (PC). Unfortunately, the number of PDD terms grows exponentially with respect to the size of the input random vector, which makes the computational cost of standard methods unaffordable for real engineering applications. In order to address the problem of the curse of dimensionality, this work proposes essentially variance-based adaptive strategies aiming to build a cheap meta-model (i.e. surrogate model) by employing the sparse PDD approach with its coefficients computed by regression. Three levels of adaptivity are carried out in this paper: 1) the truncated dimensionality for ANOVA component functions, 2) the active dimension technique especially for second- and higher-order parameter interactions, and 3) the stepwise regression approach designed to retain only the most influential polynomials in the PDD expansion. During this adaptive procedure featuring stepwise regressions, the surrogate model representation keeps containing few terms, so that the cost to resolve repeatedly the linear systems of the least-squares regression problem is negligible. The size of the finally obtained sparse PDD representation is much smaller than the one of the full expansion, since only significant terms are eventually retained. Consequently, a much smaller number of calls to the deterministic model is required to compute the final PDD coefficients.
Fischer, A; Friggens, N C; Berry, D P; Faverdin, P
2018-07-01
The ability to properly assess and accurately phenotype true differences in feed efficiency among dairy cows is key to the development of breeding programs for improving feed efficiency. The variability among individuals in feed efficiency is commonly characterised by the residual intake approach. Residual feed intake is represented by the residuals of a linear regression of intake on the corresponding quantities of the biological functions that consume (or release) energy. However, the residuals include both, model fitting and measurement errors as well as any variability in cow efficiency. The objective of this study was to isolate the individual animal variability in feed efficiency from the residual component. Two separate models were fitted, in one the standard residual energy intake (REI) was calculated as the residual of a multiple linear regression of lactation average net energy intake (NEI) on lactation average milk energy output, average metabolic BW, as well as lactation loss and gain of body condition score. In the other, a linear mixed model was used to simultaneously fit fixed linear regressions and random cow levels on the biological traits and intercept using fortnight repeated measures for the variables. This method split the predicted NEI in two parts: one quantifying the population mean intercept and coefficients, and one quantifying cow-specific deviations in the intercept and coefficients. The cow-specific part of predicted NEI was assumed to isolate true differences in feed efficiency among cows. NEI and associated energy expenditure phenotypes were available for the first 17 fortnights of lactation from 119 Holstein cows; all fed a constant energy-rich diet. Mixed models fitting cow-specific intercept and coefficients to different combinations of the aforementioned energy expenditure traits, calculated on a fortnightly basis, were compared. The variance of REI estimated with the lactation average model represented only 8% of the variance of measured NEI. Among all compared mixed models, the variance of the cow-specific part of predicted NEI represented between 53% and 59% of the variance of REI estimated from the lactation average model or between 4% and 5% of the variance of measured NEI. The remaining 41% to 47% of the variance of REI estimated with the lactation average model may therefore reflect model fitting errors or measurement errors. In conclusion, the use of a mixed model framework with cow-specific random regressions seems to be a promising method to isolate the cow-specific component of REI in dairy cows.
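A hedged sketch of the mixed-model idea: regress fortnightly net energy intake on a reduced set of energy sinks with both population-level coefficients and cow-specific random deviations, then treat the cow-specific part of the prediction as the efficiency signal. The column names, the two retained sinks, and the simulated data are assumptions, not the study's model.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(11)
    n_cows, n_fortnights = 60, 17
    cow = np.repeat(np.arange(n_cows), n_fortnights)
    milk_e = rng.normal(120, 15, cow.size)                        # milk energy output (arbitrary units)
    mbw = np.repeat(rng.normal(130, 10, n_cows), n_fortnights)    # metabolic body weight
    cow_eff = np.repeat(rng.normal(0, 3, n_cows), n_fortnights)   # true cow-specific efficiency
    nei = 30 + 0.9 * milk_e + 0.4 * mbw + cow_eff + rng.normal(0, 4, cow.size)
    df = pd.DataFrame({"cow": cow, "nei": nei, "milk_e": milk_e, "mbw": mbw})

    # Population-level regression plus a cow-specific random intercept and milk_e slope.
    fit = smf.mixedlm("nei ~ milk_e + mbw", data=df, groups=df["cow"], re_formula="~milk_e").fit()

    # Cow-specific part of predicted NEI: random intercept plus random slope deviation.
    re = fit.random_effects                   # dict: cow id -> [intercept deviation, milk_e deviation]
    cow_part = np.array([re[c].to_numpy() @ [1.0, m] for c, m in zip(df.cow, df.milk_e)])
    print("share of NEI variance that is cow-specific:", round(cow_part.var() / df.nei.var(), 3))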
ORACLE INEQUALITIES FOR THE LASSO IN THE COX MODEL
Huang, Jian; Sun, Tingni; Ying, Zhiliang; Yu, Yi; Zhang, Cun-Hui
2013-01-01
We study the absolute penalized maximum partial likelihood estimator in sparse, high-dimensional Cox proportional hazards regression models where the number of time-dependent covariates can be larger than the sample size. We establish oracle inequalities based on natural extensions of the compatibility and cone invertibility factors of the Hessian matrix at the true regression coefficients. Similar results based on an extension of the restricted eigenvalue can be also proved by our method. However, the presented oracle inequalities are sharper since the compatibility and cone invertibility factors are always greater than the corresponding restricted eigenvalue. In the Cox regression model, the Hessian matrix is based on time-dependent covariates in censored risk sets, so that the compatibility and cone invertibility factors, and the restricted eigenvalue as well, are random variables even when they are evaluated for the Hessian at the true regression coefficients. Under mild conditions, we prove that these quantities are bounded from below by positive constants for time-dependent covariates, including cases where the number of covariates is of greater order than the sample size. Consequently, the compatibility and cone invertibility factors can be treated as positive constants in our oracle inequalities. PMID:24086091
Homogenization Issues in the Combustion of Heterogeneous Solid Propellants
NASA Technical Reports Server (NTRS)
Chen, M.; Buckmaster, J.; Jackson, T. L.; Massa, L.
2002-01-01
We examine random packs of discs or spheres, models for ammonium-perchlorate-in-binder propellants, and discuss their average properties. An analytical strategy is described for calculating the mean or effective heat conduction coefficient in terms of the heat conduction coefficients of the individual components, and the results are verified by comparison with those of direct numerical simulations (dns) for both 2-D (disc) and 3-D (sphere) packs across which a temperature difference is applied. Similarly, when the surface regression speed of each component is related to the surface temperature via a simple Arrhenius law, an analytical strategy is developed for calculating an effective Arrhenius law for the combination, and these results are verified using dns in which a uniform heat flux is applied to the pack surface, causing it to regress. These results are needed for homogenization strategies necessary for fully integrated 2-D or 3-D simulations of heterogeneous propellant combustion.
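The paper's analytical homogenization strategy is not reproduced here; the snippet below merely illustrates the kind of quantities involved, computing simple series/parallel (Wiener) bounds on the effective heat-conduction coefficient of a two-component pack and a naive volume-weighted blend of two Arrhenius surface-regression laws. All numerical values are placeholders.

    import numpy as np

    # Placeholder component properties (not actual AP/binder values).
    phi_ap, phi_b = 0.7, 0.3              # volume fractions of AP particles and binder
    k_ap, k_b = 0.5, 0.2                  # heat conduction coefficients (W/m/K)

    # Wiener bounds: parallel (arithmetic) and series (harmonic) mixing rules.
    k_parallel = phi_ap * k_ap + phi_b * k_b
    k_series = 1.0 / (phi_ap / k_ap + phi_b / k_b)
    print("effective k bounded between", round(k_series, 3), "and", round(k_parallel, 3))

    # Naive surface-fraction blend of two Arrhenius regression laws r = A*exp(-E/(R*Ts)).
    R_gas, Ts = 8.314, 900.0
    A_ap, E_ap = 1.0e3, 8.0e4
    A_b, E_b = 5.0e2, 6.0e4
    r_eff = phi_ap * A_ap * np.exp(-E_ap / (R_gas * Ts)) + phi_b * A_b * np.exp(-E_b / (R_gas * Ts))
    print("blended regression speed at Ts = 900 K:", r_eff)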
Standards for Standardized Logistic Regression Coefficients
ERIC Educational Resources Information Center
Menard, Scott
2011-01-01
Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…
Rosenblum, Michael; van der Laan, Mark J.
2010-01-01
Models, such as logistic regression and Poisson regression models, are often used to estimate treatment effects in randomized trials. These models leverage information in variables collected before randomization, in order to obtain more precise estimates of treatment effects. However, there is the danger that model misspecification will lead to bias. We show that certain easy to compute, model-based estimators are asymptotically unbiased even when the working model used is arbitrarily misspecified. Furthermore, these estimators are locally efficient. As a special case of our main result, we consider a simple Poisson working model containing only main terms; in this case, we prove the maximum likelihood estimate of the coefficient corresponding to the treatment variable is an asymptotically unbiased estimator of the marginal log rate ratio, even when the working model is arbitrarily misspecified. This is the log-linear analog of ANCOVA for linear models. Our results demonstrate one application of targeted maximum likelihood estimation. PMID:20628636
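A small sketch of the special case described above: a main-terms Poisson working model fitted to simulated randomized-trial data whose true outcome model is deliberately misspecified, with the treatment coefficient compared against the empirical marginal log rate ratio. The data-generating values are arbitrary.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(12)
    n = 4000
    treat = rng.binomial(1, 0.5, n)                    # randomized assignment
    age = rng.normal(50, 10, n)                        # baseline covariate
    # True outcome model is deliberately NOT log-linear in age (working model misspecified).
    mu = np.exp(-1.0 + 0.4 * treat + 0.002 * (age - 50) ** 2)
    y = rng.poisson(mu)

    # Main-terms Poisson working model: the treatment coefficient targets the marginal log rate ratio.
    X = sm.add_constant(np.column_stack([treat, age]))
    fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()
    print("model-based estimate:", round(fit.params[1], 3))
    print("empirical log rate ratio:", round(np.log(y[treat == 1].mean() / y[treat == 0].mean()), 3))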
NASA Astrophysics Data System (ADS)
Tang, Kunkun; Congedo, Pietro M.; Abgrall, Rémi
2016-06-01
The Polynomial Dimensional Decomposition (PDD) is employed in this work for the global sensitivity analysis and uncertainty quantification (UQ) of stochastic systems subject to a moderate to large number of input random variables. Due to the intimate connection between the PDD and the Analysis of Variance (ANOVA) approaches, PDD is able to provide a simpler and more direct evaluation of the Sobol' sensitivity indices, when compared to the Polynomial Chaos expansion (PC). Unfortunately, the number of PDD terms grows exponentially with respect to the size of the input random vector, which makes the computational cost of standard methods unaffordable for real engineering applications. In order to address the problem of the curse of dimensionality, this work proposes essentially variance-based adaptive strategies aiming to build a cheap meta-model (i.e. surrogate model) by employing the sparse PDD approach with its coefficients computed by regression. Three levels of adaptivity are carried out in this paper: 1) the truncated dimensionality for ANOVA component functions, 2) the active dimension technique especially for second- and higher-order parameter interactions, and 3) the stepwise regression approach designed to retain only the most influential polynomials in the PDD expansion. During this adaptive procedure featuring stepwise regressions, the surrogate model representation keeps containing few terms, so that the cost to resolve repeatedly the linear systems of the least-squares regression problem is negligible. The size of the finally obtained sparse PDD representation is much smaller than the one of the full expansion, since only significant terms are eventually retained. Consequently, a much smaller number of calls to the deterministic model is required to compute the final PDD coefficients.
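The adaptive PDD machinery is involved; the fragment below sketches only the last ingredient mentioned above, a forward stepwise regression that retains the most influential polynomials of a toy surrogate built from univariate Legendre terms in two inputs. The basis, truncation order, and selection rule are simplifications, not the paper's algorithm.

    import numpy as np
    from numpy.polynomial import legendre

    rng = np.random.default_rng(13)
    n, order = 200, 4
    xi = rng.uniform(-1, 1, size=(n, 2))                       # two standardized random inputs
    y = 1.0 + 2.0 * xi[:, 0] + 0.7 * legendre.legval(xi[:, 1], [0, 0, 1]) + 0.05 * rng.normal(size=n)

    # Candidate basis: univariate Legendre polynomials of each input up to `order`.
    cols, names = [np.ones(n)], ["1"]
    for j in range(2):
        V = legendre.legvander(xi[:, j], order)                # P_0..P_order evaluated at input j
        for d in range(1, order + 1):
            cols.append(V[:, d]); names.append(f"P{d}(x{j})")
    Phi = np.column_stack(cols)

    def rss(idx):
        beta, *_ = np.linalg.lstsq(Phi[:, idx], y, rcond=None)
        r = y - Phi[:, idx] @ beta
        return float(r @ r)

    # Forward stepwise regression: greedily retain the most influential polynomials.
    selected = [0]                                             # constant term always kept
    for _ in range(3):
        best = min((k for k in range(Phi.shape[1]) if k not in selected),
                   key=lambda k: rss(selected + [k]))
        selected.append(best)
    print("retained terms:", [names[k] for k in selected])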
Does higher education protect against obesity? Evidence using Mendelian randomization.
Böckerman, Petri; Viinikainen, Jutta; Pulkki-Råback, Laura; Hakulinen, Christian; Pitkänen, Niina; Lehtimäki, Terho; Pehkonen, Jaakko; Raitakari, Olli T
2017-08-01
The aim of this explorative study was to examine the effect of education on obesity using Mendelian randomization. Participants (N=2011) were from the on-going nationally representative Young Finns Study (YFS) that began in 1980 when six cohorts (aged 30, 33, 36, 39, 42 and 45 in 2007) were recruited. The average value of BMI (kg/m 2 ) measurements in 2007 and 2011 and genetic information were linked to comprehensive register-based information on the years of education in 2007. We first used a linear regression (Ordinary Least Squares, OLS) to estimate the relationship between education and BMI. To identify a causal relationship, we exploited Mendelian randomization and used a genetic score as an instrument for education. The genetic score was based on 74 genetic variants that genome-wide association studies (GWASs) have found to be associated with the years of education. Because the genotypes are randomly assigned at conception, the instrument causes exogenous variation in the years of education and thus enables identification of causal effects. The years of education in 2007 were associated with lower BMI in 2007/2011 (regression coefficient (b)=-0.22; 95% Confidence Intervals [CI]=-0.29, -0.14) according to the linear regression results. The results based on Mendelian randomization suggests that there may be a negative causal effect of education on BMI (b=-0.84; 95% CI=-1.77, 0.09). The findings indicate that education could be a protective factor against obesity in advanced countries. Copyright © 2017 Elsevier Inc. All rights reserved.
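A minimal two-stage least squares sketch of the Mendelian randomization design described above, on simulated data in which a genetic score instruments years of education for a BMI outcome. The manual second stage shown here ignores the standard-error correction that a dedicated IV routine would apply.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(14)
    n = 2000
    score = rng.normal(size=n)                       # polygenic score for education (instrument)
    confound = rng.normal(size=n)                    # unobserved confounder
    educ = 12 + 1.0 * score + 1.5 * confound + rng.normal(0, 2, n)
    bmi = 30 - 0.5 * educ + 2.0 * confound + rng.normal(0, 2, n)   # true causal effect: -0.5

    # Naive OLS is biased by the confounder.
    ols = sm.OLS(bmi, sm.add_constant(educ)).fit()

    # Two-stage least squares: predict education from the score, then regress BMI on the prediction.
    stage1 = sm.OLS(educ, sm.add_constant(score)).fit()
    stage2 = sm.OLS(bmi, sm.add_constant(stage1.fittedvalues)).fit()

    print("OLS estimate:", round(ols.params[1], 3))
    print("2SLS (MR) estimate:", round(stage2.params[1], 3))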
Naserkheil, Masoumeh; Miraie-Ashtiani, Seyed Reza; Nejati-Javaremi, Ardeshir; Son, Jihyun; Lee, Deukhwan
2016-12-01
The objective of this study was to estimate the genetic parameters of milk protein yields in Iranian Holstein dairy cattle. A total of 1,112,082 test-day milk protein yield records of 167,269 first lactation Holstein cows, calved from 1990 to 2010, were analyzed. Estimates of the variance components, heritability, and genetic correlations for milk protein yields were obtained using a random regression test-day model. Milking times, herd, age of recording, year, and month of recording were included as fixed effects in the model. Additive genetic and permanent environmental random effects for the lactation curve were taken into account by applying orthogonal Legendre polynomials of the fourth order in the model. The lowest and highest additive genetic variances were estimated at the beginning and end of lactation, respectively. Permanent environmental variance was higher at both extremes. Residual variance was lowest at the middle of the lactation and contrarily, heritability increased during this period. Maximum heritability was found during the 12th lactation stage (0.213±0.007). Genetic, permanent, and phenotypic correlations among test-days decreased as the interval between consecutive test-days increased. A relatively large data set was used in this study; therefore, the estimated (co)variance components for random regression coefficients could be used for national genetic evaluation of dairy cattle in Iran.
Novaković, Romana; Geelen, Anouk; Ristić-Medić, Danijela; Nikolić, Marina; Souverein, Olga W; McNulty, Helene; Duffy, Maresa; Hoey, Leane; Dullemeijer, Carla; Renkema, Jacoba M S; Gurinović, Mirjana; Glibetić, Marija; de Groot, Lisette C P G M; Van't Veer, Pieter
2018-06-07
Dietary reference values for folate intake vary widely across Europe. MEDLINE and Embase through November 2016 were searched for data on the association between folate intake and biomarkers (serum/plasma folate, red blood cell [RBC] folate, plasma homocysteine) from observational studies in healthy adults and elderly. The regression coefficient of biomarkers on intake (β) was extracted from each study, and the overall and stratified pooled β and SE (β) were obtained by random effects meta-analysis on a double log scale. These dose-response estimates may be used to derive folate intake reference values. For every doubling in folate intake, the changes in serum/plasma folate, RBC folate and plasma homocysteine were +22, +21, and -16% respectively. The overall pooled regression coefficients were β = 0.29 (95% CI 0.21-0.37) for serum/plasma folate (26 estimates from 17 studies), β = 0.28 (95% CI 0.21-0.36) for RBC (13 estimates from 11 studies), and β = -0.21 (95% CI -0.31 to -0.11) for plasma homocysteine (10 estimates from 6 studies). These estimates along with those from randomized controlled trials can be used for underpinning dietary recommendations for folate in adults and elderly. © 2018 S. Karger AG, Basel.
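A compact DerSimonian-Laird sketch of the pooling step described above: study-specific log-log regression coefficients and their standard errors are combined into a random-effects summary, and the pooled slope is translated into the percentage change per doubling of intake. The study-level numbers are invented.

    import numpy as np

    # Hypothetical study-specific slopes of log(biomarker) on log(folate intake) and their SEs.
    beta = np.array([0.31, 0.24, 0.35, 0.22, 0.29, 0.33])
    se = np.array([0.06, 0.05, 0.09, 0.04, 0.07, 0.08])

    w_fixed = 1.0 / se**2
    beta_fixed = np.sum(w_fixed * beta) / np.sum(w_fixed)

    # DerSimonian-Laird between-study variance (tau^2), then random-effects weights.
    q = np.sum(w_fixed * (beta - beta_fixed) ** 2)
    c = np.sum(w_fixed) - np.sum(w_fixed**2) / np.sum(w_fixed)
    tau2 = max(0.0, (q - (len(beta) - 1)) / c)
    w_re = 1.0 / (se**2 + tau2)
    beta_re = np.sum(w_re * beta) / np.sum(w_re)
    se_re = np.sqrt(1.0 / np.sum(w_re))

    print("pooled beta (random effects):", round(beta_re, 3), "+/-", round(1.96 * se_re, 3))
    # On the double log scale, a doubling of intake changes the biomarker by 2**beta - 1.
    print("predicted % change per doubling of intake:", round(100 * (2**beta_re - 1), 1), "%")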
Forster, Jeri E.; MaWhinney, Samantha; Ball, Erika L.; Fairclough, Diane
2011-01-01
Dropout is common in longitudinal clinical trials and when the probability of dropout depends on unobserved outcomes even after conditioning on available data, it is considered missing not at random and therefore nonignorable. To address this problem, mixture models can be used to account for the relationship between a longitudinal outcome and dropout. We propose a Natural Spline Varying-coefficient mixture model (NSV), which is a straightforward extension of the parametric Conditional Linear Model (CLM). We assume that the outcome follows a varying-coefficient model conditional on a continuous dropout distribution. Natural cubic B-splines are used to allow the regression coefficients to semiparametrically depend on dropout and inference is therefore more robust. Additionally, this method is computationally stable and relatively simple to implement. We conduct simulation studies to evaluate performance and compare methodologies in settings where the longitudinal trajectories are linear and dropout time is observed for all individuals. Performance is assessed under conditions where model assumptions are both met and violated. In addition, we compare the NSV to the CLM and a standard random-effects model using an HIV/AIDS clinical trial with probable nonignorable dropout. The simulation studies suggest that the NSV is an improvement over the CLM when dropout has a nonlinear dependence on the outcome. PMID:22101223
Tests of Hypotheses Arising In the Correlated Random Coefficient Model*
Heckman, James J.; Schmierer, Daniel
2010-01-01
This paper examines the correlated random coefficient model. It extends the analysis of Swamy (1971), who pioneered the uncorrelated random coefficient model in economics. We develop the properties of the correlated random coefficient model and derive a new representation of the variance of the instrumental variable estimator for that model. We develop tests of the validity of the correlated random coefficient model against the null hypothesis of the uncorrelated random coefficient model. PMID:21170148
Kaambwa, Billingsley; Bryan, Stirling; Billingham, Lucinda
2012-06-27
Missing data is a common statistical problem in healthcare datasets from populations of older people. Some argue that arbitrarily assuming the mechanism responsible for the missingness and therefore the method for dealing with this missingness is not the best option, but is this always true? This paper explores what happens when extra information that suggests that a particular mechanism is responsible for missing data is disregarded and methods for dealing with the missing data are chosen arbitrarily. Regression models based on 2,533 intermediate care (IC) patients from the largest evaluation of IC done and published in the UK to date were used to explain variation in costs, EQ-5D and Barthel index. Three methods for dealing with missingness were utilised, each assuming a different mechanism as being responsible for the missing data: complete case analysis (assuming missing completely at random, MCAR), multiple imputation (assuming missing at random, MAR) and a Heckman selection model (assuming missing not at random, MNAR). Differences in results were gauged by examining the signs of coefficients as well as the sizes of both coefficients and associated standard errors. Extra information strongly suggested that missing cost data were MCAR. The results show that MCAR and MAR-based methods yielded similar results with sizes of most coefficients and standard errors differing by less than 3.4% while those based on MNAR-based methods were statistically different (up to 730% bigger). Significant variables in all regression models also had the same direction of influence on costs. All three mechanisms of missingness were shown to be potential causes of the missing EQ-5D and Barthel data. The method chosen to deal with missing data did not seem to have any significant effect on the results for these data as they led to broadly similar conclusions with sizes of coefficients and standard errors differing by less than 54% and 322%, respectively. Arbitrary selection of methods to deal with missing data should be avoided. Using extra information gathered during the data collection exercise about the cause of missingness to guide this selection would be more appropriate.
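A hedged sketch of comparing a complete-case regression with a multiple-imputation analysis under a MAR-type mechanism, using scikit-learn's IterativeImputer run several times and pooled with Rubin's rules; this stands in for the MICE and Heckman procedures used in the study, and all variable names and data are simulated.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    from sklearn.experimental import enable_iterative_imputer  # noqa: F401
    from sklearn.impute import IterativeImputer

    rng = np.random.default_rng(15)
    n = 1000
    barthel = rng.normal(60, 15, n)
    cost = 2000 + 15 * barthel + rng.normal(0, 300, n)
    # MAR mechanism: cost is more likely missing for low Barthel scores (an observed variable).
    miss = rng.random(n) < 1 / (1 + np.exp((barthel - 45) / 5))
    df = pd.DataFrame({"cost": np.where(miss, np.nan, cost), "barthel": barthel})

    # Complete-case analysis.
    cc = df.dropna()
    cc_fit = sm.OLS(cc.cost, sm.add_constant(cc.barthel)).fit()

    # Multiple imputation: 20 stochastic imputations, pooled with Rubin's rules.
    coefs, variances = [], []
    for m in range(20):
        imp = IterativeImputer(sample_posterior=True, random_state=m)
        filled = pd.DataFrame(imp.fit_transform(df), columns=df.columns)
        fit = sm.OLS(filled.cost, sm.add_constant(filled.barthel)).fit()
        coefs.append(fit.params["barthel"]); variances.append(fit.bse["barthel"] ** 2)
    qbar, ubar, b = np.mean(coefs), np.mean(variances), np.var(coefs, ddof=1)
    print("complete case:", round(cc_fit.params["barthel"], 2), "se", round(cc_fit.bse["barthel"], 2))
    print("multiple imputation:", round(qbar, 2), "se", round(np.sqrt(ubar + (1 + 1 / 20) * b), 2))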
Mainou, Maria; Madenidou, Anastasia-Vasiliki; Liakos, Aris; Paschos, Paschalis; Karagiannis, Thomas; Bekiari, Eleni; Vlachaki, Efthymia; Wang, Zhen; Murad, Mohammad Hassan; Kumar, Shaji; Tsapas, Apostolos
2017-06-01
We performed a systematic review and meta-regression analysis of randomized control trials to investigate the association between response to initial treatment and survival outcomes in patients with newly diagnosed multiple myeloma (MM). Response outcomes included complete response (CR) and the combined outcome of CR or very good partial response (VGPR), while survival outcomes were overall survival (OS) and progression-free survival (PFS). We used random-effect meta-regression models and conducted sensitivity analyses based on definition of CR and study quality. Seventy-two trials were included in the systematic review, 63 of which contributed data in meta-regression analyses. There was no association between OS and CR in patients without autologous stem cell transplant (ASCT) (regression coefficient: .02, 95% confidence interval [CI] -0.06, 0.10), in patients undergoing ASCT (-.11, 95% CI -0.44, 0.22) and in trials comparing ASCT with non-ASCT patients (.04, 95% CI -0.29, 0.38). Similarly, OS did not correlate with the combined metric of CR or VGPR, and no association was evident between response outcomes and PFS. Sensitivity analyses yielded similar results. This meta-regression analysis suggests that there is no association between conventional response outcomes and survival in patients with newly diagnosed MM. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Ridge: a computer program for calculating ridge regression estimates
Donald E. Hilt; Donald W. Seegrist
1977-01-01
Least-squares coefficients for multiple-regression models may be unstable when the independent variables are highly correlated. Ridge regression is a biased estimation procedure that produces stable estimates of the coefficients. Ridge regression is discussed, and a computer program for calculating the ridge coefficients is presented.
Hunter, Paul R
2009-12-01
Household water treatment (HWT) is being widely promoted as an appropriate intervention for reducing the burden of waterborne disease in poor communities in developing countries. A recent study has raised concerns about the effectiveness of HWT, in part because of concerns over the lack of blinding and in part because of considerable heterogeneity in the reported effectiveness of randomized controlled trials. This study set out to attempt to investigate the causes of this heterogeneity and so identify factors associated with good health gains. Studies identified in an earlier systematic review and meta-analysis were supplemented with more recently published randomized controlled trials. A total of 28 separate studies of randomized controlled trials of HWT with 39 intervention arms were included in the analysis. Heterogeneity was studied using the "metareg" command in Stata. Initial analyses with single candidate predictors were undertaken and all variables significant at the P < 0.2 level were included in a final regression model. Further analyses were done to estimate the effect of the interventions over time by Monte Carlo modeling using @Risk and the parameter estimates from the final regression model. The overall effect size of all unblinded studies was relative risk = 0.56 (95% confidence intervals 0.51-0.63), but after adjusting for bias due to lack of blinding the effect size was much lower (RR = 0.85, 95% CI = 0.76-0.97). Four main variables were significant predictors of effectiveness of intervention in a multipredictor meta-regression model: Log duration of study follow-up (regression coefficient of log effect size = 0.186, standard error (SE) = 0.072), whether or not the study was blinded (coefficient 0.251, SE 0.066) and being conducted in an emergency setting (coefficient -0.351, SE 0.076) were all significant predictors of effect size in the final model. Compared to the ceramic filter all other interventions were much less effective (Biosand 0.247, 0.073; chlorine and safe water storage 0.295, 0.061; combined coagulant-chlorine 0.2349, 0.067; SODIS 0.302, 0.068). A Monte Carlo model predicted that over 12 months ceramic filters were likely to be still effective at reducing disease, whereas SODIS, chlorination, and coagulation-chlorination had little if any benefit. Indeed these three interventions are predicted to have the same or less effect than what may be expected due purely to reporting bias in unblinded studies. With the currently available evidence, ceramic filters are the most effective form of HWT in the long term; disinfection-only interventions, including SODIS, appear to have poor if any long-term public health benefit.
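A sketch of a weighted meta-regression of log effect size on study-level predictors, analogous to the 'metareg' analysis described above but implemented as an inverse-variance weighted least-squares fit without a between-study variance component; the study-level data are invented.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(16)
    k = 39                                             # intervention arms
    df = pd.DataFrame({
        "log_rr": rng.normal(-0.4, 0.3, k),            # log relative risk of diarrhoeal disease
        "se": rng.uniform(0.05, 0.3, k),
        "log_followup": np.log(rng.uniform(3, 24, k)), # months of follow-up
        "blinded": rng.binomial(1, 0.3, k),
        "emergency": rng.binomial(1, 0.2, k),
    })

    # Inverse-variance weighted meta-regression of log effect size on study characteristics.
    fit = smf.wls("log_rr ~ log_followup + blinded + emergency",
                  data=df, weights=1.0 / df.se**2).fit()
    print(fit.params)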
Liu, Xian; Engel, Charles C
2012-12-20
Researchers often encounter longitudinal health data characterized by three or more ordinal or nominal categories. Random-effects multinomial logit models are generally applied to account for potential lack of independence inherent in such clustered data. When parameter estimates are used to describe longitudinal processes, however, random effects, both between and within individuals, need to be retransformed for correctly predicting outcome probabilities. This study attempts to go beyond existing work by developing a retransformation method that derives longitudinal growth trajectories of unbiased health probabilities. We estimated variances of the predicted probabilities by using the delta method. Additionally, we transformed the covariates' regression coefficients on the multinomial logit scale, which are not substantively meaningful in themselves, into conditional effects on the predicted probabilities. The empirical illustration uses the longitudinal data from the Asset and Health Dynamics among the Oldest Old. Our analysis compared three sets of the predicted probabilities of three health states at six time points, obtained from, respectively, the retransformation method, the best linear unbiased prediction, and the fixed-effects approach. The results demonstrate that neglect of retransforming random errors in the random-effects multinomial logit model results in severely biased longitudinal trajectories of health probabilities as well as overestimated effects of covariates on the probabilities. Copyright © 2012 John Wiley & Sons, Ltd.
Hewson, Caroline J.; Dohoo, Ian R.
2006-01-01
Factors affecting the postincisional use of analgesics for ovariohysterectomy (OVH) in dogs and cats were assessed by using data collected from 280 Canadian veterinarians, as part of a national, randomized mail survey (response rate 57.8%). Predictors of analgesic usage identified by logistic regression included the presence of at least 1 animal health technician (AHT) per 2 veterinarians (OR = 2.3, P = 0.004), and the veterinarians’ perception of the pain caused by surgery without analgesia (OR = 1.5, P < 0.001). Linear regression identified the following predictors of veterinarians’ perception of pain: the presence of more than 1 AHT per 2 veterinarians (coefficient = 0.42, P = 0.048) and the number of years since graduation (coefficient = −0.073, P < 0.001). Some of these risk factors are similar to those identified in 1994. The results suggest that continuing education may help to increase analgesic usage. Other important contributors may be client education and a valid method of pain assessment. PMID:16734371
Babcock, Chad; Finley, Andrew O.; Bradford, John B.; Kolka, Randall K.; Birdsey, Richard A.; Ryan, Michael G.
2015-01-01
Many studies and production inventory systems have shown the utility of coupling covariates derived from Light Detection and Ranging (LiDAR) data with forest variables measured on georeferenced inventory plots through regression models. The objective of this study was to propose and assess the use of a Bayesian hierarchical modeling framework that accommodates both residual spatial dependence and non-stationarity of model covariates through the introduction of spatial random effects. We explored this objective using four forest inventory datasets that are part of the North American Carbon Program, each comprising point-referenced measures of above-ground forest biomass and discrete LiDAR. For each dataset, we considered at least five regression model specifications of varying complexity. Models were assessed based on goodness of fit criteria and predictive performance using a 10-fold cross-validation procedure. Results showed that the addition of spatial random effects to the regression model intercept improved fit and predictive performance in the presence of substantial residual spatial dependence. Additionally, in some cases, allowing either some or all regression slope parameters to vary spatially, via the addition of spatial random effects, further improved model fit and predictive performance. In other instances, models showed improved fit but decreased predictive performance—indicating over-fitting and underscoring the need for cross-validation to assess predictive ability. The proposed Bayesian modeling framework provided access to pixel-level posterior predictive distributions that were useful for uncertainty mapping, diagnosing spatial extrapolation issues, revealing missing model covariates, and discovering locally significant parameters.
Gjerde, Hallvard; Verstraete, Alain
2010-02-25
To study several methods for estimating the prevalence of high blood concentrations of tetrahydrocannabinol and amphetamine in a population of drug users by analysing oral fluid (saliva). Five methods were compared, including simple calculation procedures dividing the drug concentrations in oral fluid by average or median oral fluid/blood (OF/B) drug concentration ratios or linear regression coefficients, and more complex Monte Carlo simulations. Populations of 311 cannabis users and 197 amphetamine users from the Rosita-2 Project were studied. The results of a feasibility study suggested that the Monte Carlo simulations might give better accuracies than simple calculations if good data on OF/B ratios are available. When using only 20 randomly selected OF/B ratios, a Monte Carlo simulation gave the best accuracy but not the best precision. Dividing by the OF/B regression coefficient gave acceptable accuracy and precision, and was therefore the best method. None of the methods gave acceptable accuracy if the prevalence of high blood drug concentrations was less than 15%. Dividing the drug concentration in oral fluid by the OF/B regression coefficient gave an acceptable estimation of high blood drug concentrations in a population, and may therefore give valuable additional information on possible drug impairment, e.g. in roadside surveys of drugs and driving. If good data on the distribution of OF/B ratios are available, a Monte Carlo simulation may give better accuracy. 2009 Elsevier Ireland Ltd. All rights reserved.
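A minimal sketch of two of the estimation strategies compared above, using made-up concentrations and ratios: (i) dividing each oral-fluid concentration by a single OF/B regression coefficient, and (ii) a Monte Carlo variant that divides by OF/B ratios resampled from a small reference set. The threshold and all numerical values are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(2)
of_conc = rng.lognormal(mean=3.0, sigma=1.0, size=311)    # simulated oral-fluid concentrations
ofb_ratios = rng.lognormal(mean=2.0, sigma=0.6, size=20)  # 20 observed OF/B ratios (simulated)
ofb_coef = 8.0                                            # hypothetical OF/B regression coefficient
threshold = 3.0                                           # hypothetical "high" blood concentration cut-off

# (i) simple division by the regression coefficient
prev_simple = np.mean(of_conc / ofb_coef > threshold)

# (ii) Monte Carlo: each subject's blood level is simulated many times
sim = of_conc[:, None] / rng.choice(ofb_ratios, size=(of_conc.size, 5000))
prev_mc = np.mean(sim > threshold)

print(f"simple estimate: {prev_simple:.2%}, Monte Carlo estimate: {prev_mc:.2%}")
```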
Neither fixed nor random: weighted least squares meta-analysis.
Stanley, T D; Doucouliagos, Hristos
2015-06-15
This study challenges two core conventional meta-analysis methods: fixed effect and random effects. We show how and explain why an unrestricted weighted least squares estimator is superior to conventional random-effects meta-analysis when there is publication (or small-sample) bias and better than a fixed-effect weighted average if there is heterogeneity. Statistical theory and simulations of effect sizes, log odds ratios and regression coefficients demonstrate that this unrestricted weighted least squares estimator provides satisfactory estimates and confidence intervals that are comparable to random effects when there is no publication (or small-sample) bias and identical to fixed-effect meta-analysis when there is no heterogeneity. When there is publication selection bias, the unrestricted weighted least squares approach dominates random effects; when there is excess heterogeneity, it is clearly superior to fixed-effect meta-analysis. In practical applications, an unrestricted weighted least squares weighted average will often provide superior estimates to both conventional fixed and random effects. Copyright © 2015 John Wiley & Sons, Ltd.
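A minimal sketch of the unrestricted WLS estimator described above, on simulated effect sizes: the effects are regressed on a constant with inverse-variance weights, and the multiplicative residual variance is estimated from the data rather than fixed at 1 (fixed effect) or modelled as an additive component (random effects). The point estimate therefore coincides with the fixed-effect average while the confidence interval reflects any excess heterogeneity.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
k = 40
se = rng.uniform(0.05, 0.4, k)                               # per-study standard errors
effects = 0.3 + rng.normal(0, 0.1, k) + rng.normal(0, se)    # heterogeneous true effects plus sampling error

# Unrestricted WLS: inverse-variance weights, scale estimated from the residuals.
wls = sm.WLS(effects, np.ones((k, 1)), weights=1.0 / se**2).fit()
print("WLS weighted average:", wls.params[0])
print("95% CI:", wls.conf_int()[0])
```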
Maas, Iris L; Nolte, Sandra; Walter, Otto B; Berger, Thomas; Hautzinger, Martin; Hohagen, Fritz; Lutz, Wolfgang; Meyer, Björn; Schröder, Johanna; Späth, Christina; Klein, Jan Philipp; Moritz, Steffen; Rose, Matthias
2017-02-01
To compare treatment effect estimates obtained from a regression discontinuity (RD) design with results from an actual randomized controlled trial (RCT). Data from an RCT (EVIDENT), which studied the effect of an Internet intervention on depressive symptoms measured with the Patient Health Questionnaire (PHQ-9), were used to perform an RD analysis, in which treatment allocation was determined by a cutoff value at baseline (PHQ-9 = 10). A linear regression model was fitted to the data, selecting participants above the cutoff who had received the intervention (n = 317) and control participants below the cutoff (n = 187). The outcome was the PHQ-9 sum score 12 weeks after baseline. Robustness of the effect estimate was studied; the estimate was compared with the RCT treatment effect. The final regression model showed a regression coefficient of -2.29 [95% confidence interval (CI): -3.72 to -0.85] compared with a treatment effect found in the RCT of -1.57 (95% CI: -2.07 to -1.07). Although the estimates obtained from the two designs are not equal, their confidence intervals overlap, suggesting that an RD design can be a valid alternative to RCTs. This finding is particularly important for situations where an RCT may not be feasible or ethical, as is often the case in clinical research settings. Copyright © 2016 Elsevier Inc. All rights reserved.
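A minimal sketch of the regression discontinuity analysis described above, on simulated data: treatment is assigned by a baseline PHQ-9 cutoff of 10, and the treatment effect is read off the coefficient of the treatment indicator in a linear model that also adjusts for the centred assignment variable. Variable names and the true effect are invented for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n = 500
baseline = rng.integers(0, 25, n).astype(float)                  # baseline PHQ-9 sum score
treated = (baseline >= 10).astype(int)                           # allocation determined by the cutoff
outcome = 0.6 * baseline - 2.0 * treated + rng.normal(0, 3, n)   # true treatment effect = -2

df = pd.DataFrame({"phq12wk": outcome, "treated": treated,
                   "centred": baseline - 10})
fit = smf.ols("phq12wk ~ treated + centred", data=df).fit()
print(fit.params["treated"], fit.conf_int().loc["treated"].values)
```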
A Small and Slim Coaxial Probe for Single Rice Grain Moisture Sensing
You, Kok Yeow; Mun, Hou Kit; You, Li Ling; Salleh, Jamaliah; Abbas, Zulkifly
2013-01-01
A moisture detection method for single rice grains using a slim and small open-ended coaxial probe is presented. The coaxial probe is suitable for the nondestructive measurement of moisture values in rice grains ranging from 9.5% to 26%. Empirical polynomial models are developed to predict the gravimetric moisture content of rice based on reflection coefficients measured with a vector network analyzer. The relationship between the reflection coefficient and relative permittivity was also modeled using a regression method and expressed as a polynomial, whose coefficients were obtained by fitting data from Finite Element-based simulation. In addition, the designed single-rice-grain sample holder and experimental set-up are described. The measurement of single rice grains in this study is more precise than the conventional measurement of bulk rice grains, as the random air gaps present in bulk rice grains are excluded. PMID:23493127
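A minimal sketch of the calibration step described above, with simulated measurements standing in for the vector-network-analyzer data: moisture content is regressed on the reflection-coefficient magnitude using a low-order polynomial, mirroring the empirical polynomial models mentioned in the abstract.

```python
import numpy as np

rng = np.random.default_rng(5)
moisture = rng.uniform(9.5, 26.0, 60)                     # % moisture content (simulated samples)
refl = 0.9 - 0.015 * moisture + 2e-4 * moisture**2 + rng.normal(0, 0.005, 60)  # |reflection coefficient|

coeffs = np.polyfit(refl, moisture, deg=2)                # empirical calibration polynomial
predict = np.poly1d(coeffs)
print("polynomial coefficients:", coeffs)
print("predicted moisture at |S11| = 0.65:", predict(0.65))
```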
ERIC Educational Resources Information Center
Dolan, Conor V.; Wicherts, Jelte M.; Molenaar, Peter C. M.
2004-01-01
We consider the question of how variation in the number and reliability of indicators affects the power to reject the hypothesis that the regression coefficients are zero in latent linear regression analysis. We show that power remains constant as long as the coefficient of determination remains unchanged. Any increase in the number of indicators…
Grieve, Richard; Nixon, Richard; Thompson, Simon G
2010-01-01
Cost-effectiveness analyses (CEA) may be undertaken alongside cluster randomized trials (CRTs) where randomization is at the level of the cluster (for example, the hospital or primary care provider) rather than the individual. Costs (and outcomes) within clusters may be correlated so that the assumption made by standard bivariate regression models, that observations are independent, is incorrect. This study develops a flexible modeling framework to acknowledge the clustering in CEA that use CRTs. The authors extend previous Bayesian bivariate models for CEA of multicenter trials to recognize the specific form of clustering in CRTs. They develop new Bayesian hierarchical models (BHMs) that allow mean costs and outcomes, and also variances, to differ across clusters. They illustrate how each model can be applied using data from a large (1732 cases, 70 primary care providers) CRT evaluating alternative interventions for reducing postnatal depression. The analyses compare cost-effectiveness estimates from BHMs with standard bivariate regression models that ignore the data hierarchy. The BHMs show high levels of cost heterogeneity across clusters (intracluster correlation coefficient, 0.17). Compared with standard regression models, the BHMs yield substantially increased uncertainty surrounding the cost-effectiveness estimates, and altered point estimates. The authors conclude that ignoring clustering can lead to incorrect inferences. The BHMs that they present offer a flexible modeling framework that can be applied more generally to CEA that use CRTs.
Job strain and resting heart rate: a cross-sectional study in a Swedish random working sample.
Eriksson, Peter; Schiöler, Linus; Söderberg, Mia; Rosengren, Annika; Torén, Kjell
2016-03-05
Numerous studies have reported an association between stressful working conditions and cardiovascular disease. However, more evidence is needed, and the etiological mechanisms are unknown. Elevated resting heart rate has emerged as a possible risk factor for cardiovascular disease, but little is known about the relation to work-related stress. This study therefore investigated the association between job strain, job control, and job demands and resting heart rate. We conducted a cross-sectional survey of randomly selected men and women in Västra Götalandsregionen, Sweden (a county in western Sweden) (n = 1552). Information about job strain, job demands, job control, heart rate and covariates was collected during the period 2001-2004 as part of the INTERGENE/ADONIX research project. Six different linear regression models were used with adjustments for gender, age, BMI, smoking, education, and physical activity in the fully adjusted model. Job strain was operationalized as the log-transformed ratio of job demands over job control in the statistical analyses. No associations were seen between resting heart rate and job demands. Job strain was associated with elevated resting heart rate in the unadjusted model (linear regression coefficient 1.26, 95 % CI 0.14 to 2.38), but not in any of the extended models. Low job control was associated with elevated resting heart rate after adjustments for gender, age, BMI, and smoking (linear regression coefficient -0.18, 95 % CI -0.30 to -0.02). However, there were no significant associations in the fully adjusted model. Low job control and job strain, but not job demands, were associated with elevated resting heart rate. However, the observed associations were modest and may be explained by confounding effects.
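A minimal sketch of the exposure definition used above, on simulated data: job strain is operationalized as the log of the demands/control ratio and entered into a linear regression of resting heart rate together with a few of the covariates mentioned. Variable names and effect sizes are invented.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)
n = 1552
df = pd.DataFrame({
    "demands": rng.uniform(1, 5, n),
    "control": rng.uniform(1, 5, n),
    "age": rng.integers(25, 65, n),
    "bmi": rng.normal(26, 4, n),
    "female": rng.integers(0, 2, n),
})
df["strain"] = np.log(df["demands"] / df["control"])        # log-transformed demands/control ratio
df["rhr"] = 70 + 1.2 * df["strain"] + 0.05 * df["age"] + rng.normal(0, 8, n)

print(smf.ols("rhr ~ strain + age + bmi + female", data=df).fit().params)
```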
Liu, Quan; Ma, Li; Fan, Shou-Zen; Abbod, Maysam F; Shieh, Jiann-Shing
2018-01-01
Estimating the depth of anaesthesia (DoA) in operations has always been a challenging issue due to the underlying complexity of the brain mechanisms. Electroencephalogram (EEG) signals are undoubtedly the most widely used signals for measuring DoA. In this paper, a novel EEG-based index is proposed to evaluate DoA for 24 patients receiving general anaesthesia with different levels of unconsciousness. Sample Entropy (SampEn) algorithm was utilised in order to acquire the chaotic features of the signals. After calculating the SampEn from the EEG signals, Random Forest was utilised for developing learning regression models with Bispectral index (BIS) as the target. Correlation coefficient, mean absolute error, and area under the curve (AUC) were used to verify the perioperative performance of the proposed method. Validation comparisons with typical nonstationary signal analysis methods (i.e., recurrence analysis and permutation entropy) and regression methods (i.e., neural network and support vector machine) were conducted. To further verify the accuracy and validity of the proposed methodology, the data is divided into four unconsciousness-level groups on the basis of BIS levels. Subsequently, analysis of variance (ANOVA) was applied to the corresponding index (i.e., regression output). Results indicate that the correlation coefficient improved to 0.72 ± 0.09 after filtering and to 0.90 ± 0.05 after regression from the initial values of 0.51 ± 0.17. Similarly, the final mean absolute error dramatically declined to 5.22 ± 2.12. In addition, the ultimate AUC increased to 0.98 ± 0.02, and the ANOVA analysis indicates that each of the four groups of different anaesthetic levels demonstrated significant difference from the nearest levels. Furthermore, the Random Forest output was extensively linear in relation to BIS, thus with better DoA prediction accuracy. In conclusion, the proposed method provides a concrete basis for monitoring patients' anaesthetic level during surgeries.
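A minimal sketch of the regression step described above: a Random Forest maps Sample Entropy features extracted from EEG epochs to BIS values, and performance is summarised by the correlation coefficient and mean absolute error. The SampEn feature values below are simulated placeholders rather than features computed from real EEG.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
sampen = rng.uniform(0.2, 1.6, size=(2000, 1))            # per-epoch SampEn values (placeholder)
bis = 20 + 50 * sampen[:, 0] + rng.normal(0, 6, 2000)     # surrogate BIS target

X_tr, X_te, y_tr, y_te = train_test_split(sampen, bis, random_state=0)
rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_tr, y_tr)
pred = rf.predict(X_te)

print("correlation coefficient:", np.corrcoef(pred, y_te)[0, 1])
print("mean absolute error:", np.mean(np.abs(pred - y_te)))
```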
2014-01-01
Background: Support vector regression (SVR) and Gaussian process regression (GPR) were used for the analysis of electroanalytical experimental data to estimate diffusion coefficients. Results: For simulated cyclic voltammograms based on the EC, Eqr, and EqrC mechanisms these regression algorithms in combination with nonlinear kernel/covariance functions yielded diffusion coefficients with higher accuracy as compared to the standard approach of calculating diffusion coefficients relying on the Nicholson-Shain equation. The level of accuracy achieved by SVR and GPR is virtually independent of the rate constants governing the respective reaction steps. Further, the reduction of high-dimensional voltammetric signals by manual selection of typical voltammetric peak features decreased the performance of both regression algorithms compared to a reduction by downsampling or principal component analysis. After training on simulated data sets, diffusion coefficients were estimated by the regression algorithms for experimental data comprising voltammetric signals for three organometallic complexes. Conclusions: Estimated diffusion coefficients closely matched the values determined by the parameter fitting method, but reduced the required computational time considerably for one of the reaction mechanisms. The automated processing of voltammograms according to the regression algorithms yields better results than the conventional analysis of peak-related data. PMID:24987463
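A minimal sketch of the two regression algorithms named above, applied to simulated feature vectors rather than actual simulated voltammograms: support vector regression with an RBF kernel and Gaussian process regression with an RBF covariance plus noise term are trained to predict a (log-scaled) diffusion coefficient.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(8)
X = rng.normal(size=(200, 10))                              # downsampled voltammogram features (simulated)
log_D = -9 + 0.5 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(0, 0.05, 200)

svr = SVR(kernel="rbf", C=10.0).fit(X, log_D)
gpr = GaussianProcessRegressor(kernel=RBF() + WhiteKernel()).fit(X, log_D)

print("SVR prediction:", svr.predict(X[:1]))
print("GPR prediction (mean, std):", gpr.predict(X[:1], return_std=True))
```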
Interpreting Regression Results: beta Weights and Structure Coefficients are Both Important.
ERIC Educational Resources Information Center
Thompson, Bruce
Various realizations have led to less frequent use of the "OVA" methods (analysis of variance--ANOVA--among others) and to more frequent use of general linear model approaches such as regression. However, too few researchers understand all the various coefficients produced in regression. This paper explains these coefficients and their…
NASA Astrophysics Data System (ADS)
Pradhan, Biswajeet
2010-05-01
This paper presents the results of the cross-validation of a multivariate logistic regression model using remote sensing data and GIS for landslide hazard analysis on the Penang, Cameron, and Selangor areas in Malaysia. Landslide locations in the study areas were identified by interpreting aerial photographs and satellite images, supported by field surveys. SPOT 5 and Landsat TM satellite imagery were used to map landcover and vegetation index, respectively. Maps of topography, soil type, lineaments and land cover were constructed from the spatial datasets. Ten factors which influence landslide occurrence, i.e., slope, aspect, curvature, distance from drainage, lithology, distance from lineaments, soil type, landcover, rainfall precipitation, and normalized difference vegetation index (NDVI), were extracted from the spatial database and the logistic regression coefficient of each factor was computed. Then the landslide hazard was analysed using the multivariate logistic regression coefficients derived not only from the data for the respective area but also using the logistic regression coefficients calculated from each of the other two areas (nine hazard maps in all) as a cross-validation of the model. For verification of the model, the results of the analyses were then compared with the field-verified landslide locations. Among the three cases of the application of logistic regression coefficients in the same study area, the case of Selangor based on the Selangor logistic regression coefficients showed the highest accuracy (94%), whereas Penang based on the Penang coefficients showed the lowest accuracy (86%). Similarly, among the six cases from the cross-application of logistic regression coefficients in the other two areas, the case of Selangor based on the logistic regression coefficients of Cameron showed the highest (90%) prediction accuracy, whereas the case of Penang based on the Selangor logistic regression coefficients showed the lowest accuracy (79%). Qualitatively, the cross-application model yields reasonable results which can be used for preliminary landslide hazard mapping.
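A minimal sketch of the cross-application scheme described above, on simulated data: a logistic regression is fitted on the conditioning factors of one area and its coefficients are then applied to a second area, mirroring the nine-map cross-validation in the paper. Factor values, areas and accuracies are invented.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(9)

def make_area(n, shift=0.0):
    X = rng.normal(shift, 1.0, size=(n, 10))      # ten landslide conditioning factors (simulated)
    p = 1 / (1 + np.exp(-(X[:, 0] + 0.5 * X[:, 1] - 0.3 * X[:, 2])))
    return X, rng.binomial(1, p)                  # 1 = landslide cell, 0 = stable cell

X_a, y_a = make_area(3000)                        # e.g. area A (hypothetical)
X_b, y_b = make_area(3000, shift=0.2)             # e.g. area B (hypothetical)

model = LogisticRegression(max_iter=1000).fit(X_a, y_a)
print("within-area AUC:", roc_auc_score(y_a, model.predict_proba(X_a)[:, 1]))
print("cross-area AUC :", roc_auc_score(y_b, model.predict_proba(X_b)[:, 1]))
```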
Nielsen, Jannie; Bahendeka, Silver K; Whyte, Susan R; Meyrowitsch, Dan W; Bygbjerg, Ib C; Witte, Daniel R
2017-09-21
Prevention of type 2 diabetes (T2D) has been successfully established in randomised clinical trials. However, the best methods for the translation of this evidence into effective population-wide interventions remain unclear. To assess whether households could be a target for T2D prevention and screening, we investigated the resemblance of T2D risk factors at household level and by type of familial dyadic relationship in a rural Ugandan community. This cross-sectional household-based study included 437 individuals ≥13 years of age from 90 rural households in south-western Uganda. Resemblances in glycosylated haemoglobin (HbA1c), anthropometry, blood pressure, fitness status and sitting time were analysed using a general mixed model with random effects (by household or dyad) to calculate household intraclass correlation coefficients (ICCs) and dyadic regression coefficients. Logistic regression with household as a random effect was used to calculate the ORs for individuals having a condition or risk factor if another household member had the same condition. The strongest degree of household member resemblance in T2D risk factors was seen in relation to fitness status (ICC=0.24), HbA1c (ICC=0.18) and systolic blood pressure (ICC=0.11). Regarding dyadic resemblance, the highest standardised regression coefficient was seen in fitness status for spouses (0.54, 95% CI 0.32 to 0.76), parent-offspring (0.41, 95% CI 0.28 to 0.54) and siblings (0.41, 95% CI 0.25 to 0.57). Overall, parent-offspring and sibling pairs were the dyads with the strongest resemblance, followed by spouses. The marked degree of resemblance in T2D risk factors at household level and between spouses, parent-offspring and sibling dyads suggests that shared behavioural and environmental factors may influence risk factor levels among cohabiting individuals, which points to the potential of the household setting for screening and prevention of T2D. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
NASA Astrophysics Data System (ADS)
Sulistianingsih, E.; Kiftiah, M.; Rosadi, D.; Wahyuni, H.
2017-04-01
Gross Domestic Product (GDP) is an indicator of economic growth in a region. GDP forms panel data, consisting of cross-sectional and time-series observations. Panel regression is a tool that can be used to analyse such data. There are three models in panel regression, namely the Common Effect Model (CEM), the Fixed Effect Model (FEM) and the Random Effect Model (REM). The best model is chosen based on the results of the Chow test, Hausman test and Lagrange multiplier test. This research uses panel regression to analyse the effect of palm oil production, export, and government consumption on the GDP of five districts in West Kalimantan, namely Sanggau, Sintang, Sambas, Ketapang and Bengkayang. Based on the results of the analyses, it is concluded that REM, whose adjusted coefficient of determination is 0.823, is the best model in this case. According to the results, only export and government consumption influence the GDP of the districts.
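A minimal sketch of the three panel specifications named above (CEM, FEM, REM) on simulated district-level data. The linearmodels package is an assumed implementation choice, and the Hausman-type comparison at the end is computed by hand from the fixed- and random-effects estimates as a crude stand-in for the formal Chow/Hausman/Lagrange multiplier test sequence.

```python
import numpy as np
import pandas as pd
from linearmodels.panel import PooledOLS, PanelOLS, RandomEffects

rng = np.random.default_rng(10)
districts, years = 5, 12
idx = pd.MultiIndex.from_product(
    [["Sanggau", "Sintang", "Sambas", "Ketapang", "Bengkayang"],
     range(2005, 2005 + years)],
    names=["district", "year"])
df = pd.DataFrame({
    "production": rng.normal(100, 10, districts * years),
    "export": rng.normal(50, 5, districts * years),
    "gov_consumption": rng.normal(30, 3, districts * years),
}, index=idx)
df["gdp"] = 2 * df["production"] + df["export"] + rng.normal(0, 5, len(df))
df["const"] = 1.0

slopes = ["production", "export", "gov_consumption"]
cem = PooledOLS(df["gdp"], df[["const"] + slopes]).fit()          # common effect model
fem = PanelOLS(df["gdp"], df[slopes], entity_effects=True).fit()  # fixed effect model
rem = RandomEffects(df["gdp"], df[["const"] + slopes]).fit()      # random effect model

# Rough Hausman-type statistic comparing FEM and REM slope estimates
# (can be unstable or negative in small simulated samples).
b = fem.params[slopes] - rem.params[slopes]
v = fem.cov.loc[slopes, slopes] - rem.cov.loc[slopes, slopes]
hausman = float(b @ np.linalg.inv(v.values) @ b)
print(cem.rsquared, fem.rsquared, rem.rsquared, hausman)
```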
On the Occurrence of Standardized Regression Coefficients Greater than One.
ERIC Educational Resources Information Center
Deegan, John, Jr.
1978-01-01
It is demonstrated here that standardized regression coefficients greater than one can legitimately occur. Furthermore, the relationship between the occurrence of such coefficients and the extent of multicollinearity present among the set of predictor variables in an equation is examined. Comments on the interpretation of these coefficients are…
Chen, Ying-Jen; Ho, Meng-Yang; Chen, Kwan-Ju; Hsu, Chia-Fen; Ryu, Shan-Jin
2009-08-01
The aims of the present study were to (i) investigate if traditional Chinese word reading ability can be used for estimating premorbid general intelligence; and (ii) to provide multiple regression equations for estimating premorbid performance on Raven's Standard Progressive Matrices (RSPM), using age, years of education and Chinese Graded Word Reading Test (CGWRT) scores as predictor variables. Four hundred and twenty-six healthy volunteers (201 male, 225 female), aged 16-93 years (mean +/- SD, 41.92 +/- 18.19 years) undertook the tests individually under supervised conditions. Seventy percent of subjects were randomly allocated to the derivation group (n = 296), and the rest to the validation group (n = 130). RSPM score was positively correlated with CGWRT score and years of education. RSPM and CGWRT scores and years of education were also inversely correlated with age, but the declining trend for RSPM performance against age was steeper than that for CGWRT performance. Separate multiple regression equations were derived for estimating RSPM scores using different combinations of age, years of education, and CGWRT score for both groups. The multiple regression coefficient of each equation ranged from 0.71 to 0.80 with the standard error of estimate between 7 and 8 RSPM points. When fitting the data of one group to the equations derived from its counterpart group, the cross-validation multiple regression coefficients ranged from 0.71 to 0.79. There were no significant differences in the 'predicted-obtained' RSPM discrepancies between any equations. The regression equations derived in the present study may provide a basis for estimating premorbid RSPM performance.
NASA Technical Reports Server (NTRS)
Kalton, G.
1983-01-01
A number of surveys were conducted to study the relationship between the level of aircraft or traffic noise exposure experienced by people living in a particular area and their annoyance with it. These surveys generally employ a clustered sample design which affects the precision of the survey estimates. Regression analysis of annoyance on noise measures and other variables is often an important component of the survey analysis. Formulae are presented for estimating the standard errors of regression coefficients and ratio of regression coefficients that are applicable with a two- or three-stage clustered sample design. Using a simple cost function, they also determine the optimum allocation of the sample across the stages of the sample design for the estimation of a regression coefficient.
The Outlier Detection for Ordinal Data Using Scaling Technique of Regression Coefficients
NASA Astrophysics Data System (ADS)
Adnan, Arisman; Sugiarto, Sigit
2017-06-01
The aim of this study is to detect outliers by using the coefficients of ordinal logistic regression (OLR) for the case of k-category responses, where scores range from 1 (the best) to 8 (the worst). We detect them by using the sum of moduli of the ordinal regression coefficients calculated by the jackknife technique. This technique is improved by scaling the regression coefficients to their means. The R language was used on a set of ordinal data from a reference distribution. Furthermore, we compare this approach with studentised residual plots from the jackknife technique for ANOVA (analysis of variance) and OLR. This study shows that the jackknife technique, along with proper scaling, can reveal outliers in ordinal regression reasonably well.
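A minimal sketch of the jackknife idea described above, on simulated ordinal responses: each observation is left out in turn, an ordinal logistic regression is refitted, and the sum of moduli of the regression coefficients (here scaled to their mean, one simple reading of the scaling step) is recorded; observations that shift this statistic markedly are flagged. statsmodels' OrderedModel is an assumed implementation choice, not the R code used in the paper.

```python
import numpy as np
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

rng = np.random.default_rng(11)
n, p = 80, 3
X = rng.normal(size=(n, p))
latent = X @ np.array([1.0, -0.5, 0.3]) + rng.logistic(size=n)
y = pd.cut(latent, bins=8, labels=False)          # ordinal scores 0 (best) .. 7 (worst)

def coef_sum(mask):
    # Refit the ordinal logit and return the sum of moduli of the slope
    # coefficients, scaled to their mean.
    res = OrderedModel(y[mask], X[mask], distr="logit").fit(method="bfgs", disp=False)
    beta = res.params[:p]                         # slope coefficients come before the thresholds
    return np.abs(beta / beta.mean()).sum()

full = coef_sum(np.ones(n, dtype=bool))
jack = np.array([coef_sum(np.arange(n) != i) for i in range(n)])
print("candidate outliers:", np.where(np.abs(jack - full) > 3 * jack.std())[0])
```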
Refraction data survey: 2nd generation correlation of myopia.
Greene, Peter R; Medina, Antonio
2016-10-01
The objective herein is to provide refraction data, myopia progression rate, prevalence, and 1st and 2nd generation correlations, relevant to whether myopia is random or inherited. First- and second-generation ocular refraction data are assembled from N = 34 families, with an average of 2.8 children per family. From this group, data are available from N = 165 subjects. Inter-generation regressions are performed on all the data sets, including correlation coefficient r and myopia prevalence [%]. Prevalence of myopia is [M] = 38.5 %. Prevalence of high myopes with |R| >6 D is [M-] = 20.5 %. Average refraction is -7.52 D ± 1.31 D (N = 33). Regression parameters are calculated for all the data sets, yielding correlation coefficients in the range r = 0.48-0.72 for some groups of myopes and high myopes, fathers to daughters, and mothers to sons. Also of interest, some categories show essentially no correlation, -0.20 < r < 0.20, indicating that the refractive errors occur randomly. Time series results show myopia diopter rates of -0.50 D/year.
NASA Astrophysics Data System (ADS)
Indarsih, Indrati, Ch. Rini
2016-02-01
In this paper, we define the variance of fuzzy random variables through alpha levels. We present a theorem showing that the variance of a fuzzy random variable is a fuzzy number. We consider a multi-objective linear programming (MOLP) problem with fuzzy random objective function coefficients. We solve the problem by a variance approach, which transforms the MOLP problem with fuzzy random objective function coefficients into an MOLP problem with fuzzy objective function coefficients. By weighting methods, we obtain a linear program with fuzzy coefficients, which we solve by a simplex method for fuzzy linear programming.
Morfeld, Peter; Spallek, Michael
2015-01-01
Vermeulen et al. 2014 published a meta-regression analysis of three relevant epidemiological US studies (Steenland et al. 1998, Garshick et al. 2012, Silverman et al. 2012) that estimated the association between occupational diesel engine exhaust (DEE) exposure and lung cancer mortality. The DEE exposure was measured as cumulative exposure to estimated respirable elemental carbon in μg/m(3)-years. Vermeulen et al. 2014 found a statistically significant dose-response association and described elevated lung cancer risks even at very low exposures. We performed an extended re-analysis using different modelling approaches (fixed and random effects regression analyses, Greenland/Longnecker method) and explored the impact of varying input data (modified coefficients of Garshick et al. 2012, results from Crump et al. 2015 replacing Silverman et al. 2012, modified analysis of Moehner et al. 2013). We reproduced the individual and main meta-analytical results of Vermeulen et al. 2014. However, our analysis demonstrated a heterogeneity of the baseline relative risk levels between the three studies. This heterogeneity was reduced after the coefficients of Garshick et al. 2012 were modified while the dose coefficient dropped by an order of magnitude for this study and was far from being significant (P = 0.6). A (non-significant) threshold estimate for the cumulative DEE exposure was found at 150 μg/m(3)-years when extending the meta-analyses of the three studies by hockey-stick regression modelling (including the modified coefficients for Garshick et al. 2012). The data used by Vermeulen and colleagues led to the highest relative risk estimate across all sensitivity analyses performed. The lowest relative risk estimate was found after exclusion of the explorative study by Steenland et al. 1998 in a meta-regression analysis of Garshick et al. 2012 (modified), Silverman et al. 2012 (modified according to Crump et al. 2015) and Möhner et al. 2013. The meta-coefficient was estimated to be about 10-20 % of the main effect estimate in Vermeulen et al. 2014 in this analysis. The findings of Vermeulen et al. 2014 should not be used without reservations in any risk assessments. This is particularly true for the low end of the exposure scale.
Villar Balboa, Iván; Carrillo Muñoz, Ricard; Regí Bosque, Meritxell; Marzo Castillejo, Mercè; Arcusa Villacampa, Núria; Segundo Yagüe, Marta
2014-04-01
To describe the relationship between individual or combined prognostic factors in the multidimensional classifications (BODE and ADO), and health-related quality of life (HRQOL) in patients with chronic obstructive pulmonary disease (COPD). Cross-sectional descriptive study. Primary care. Systematic random sample of 102 patients diagnosed with COPD, excluding those patients with acute exacerbation, dementia, terminal illness or those who receive home care. Demographic variables, smoking habits, body mass index and number of exacerbations. Comorbidity. Degree of dyspnea. Respiratory function tests. Exercise capacity. The BODE index and the ADO index. The EuroQol-5D questionnaire (EQ-5D), and visual analogue scale (VAS). EQ-5D: mobility: 43.9%; personal care: 13.3%; daily-life activities: 29.6%; pain/discomfort: 55.1%; anxiety/depression: 37.8%, and 34.7% VAS ≤ 60%. Exacerbations: mobility, OR: 1.85 (95%CI: 1.08-3.20); personal care, OR: 2.12 (95%CI: 1.3-4.76); daily-life activities, OR: 2.35 (95%CI: 1.17-4.71); VAS, regression coefficient: -3.50 (95%CI: -6.31 to -0.70). Dyspnea: mobility, OR: 4.47 (95%CI: 1.39-14.42); daily-life activities, OR: 7.71 (95%CI: 2.03-12.34); VAS, regression coefficient: -7.15 (95%CI: -11.71 to -2.59). BODE: mobility, OR: 1.53 (95%CI: 1.15-2.02); personal care, OR: 2.08 (95%CI: 1.40-3.11); daily-life activities, OR: 1.97 (95%CI: 1.38-2.80); VAS, regression coefficient: -3.96 (95%CI: -5.51 to -2.42). ADO: mobility, OR: 2.42 (95%CI: 1.39-4.20); personal care, OR: 3.21 (95%CI: 1.67-6.18); daily-life activities, OR: 3.17 (95%CI: 1.69-5.93); VAS, regression coefficient: -3.53 (95%CI: -5.57 to -1.49). The BODE index and the ADO index showed a significant association with HRQOL. Exacerbations and dyspnea were the best individual factors related to HRQOL. Copyright © 2013 Elsevier España, S.L. All rights reserved.
Bayesian Estimation of Multivariate Latent Regression Models: Gauss versus Laplace
ERIC Educational Resources Information Center
Culpepper, Steven Andrew; Park, Trevor
2017-01-01
A latent multivariate regression model is developed that employs a generalized asymmetric Laplace (GAL) prior distribution for regression coefficients. The model is designed for high-dimensional applications where an approximate sparsity condition is satisfied, such that many regression coefficients are near zero after accounting for all the model…
Viability estimation of pepper seeds using time-resolved photothermal signal characterization
NASA Astrophysics Data System (ADS)
Kim, Ghiseok; Kim, Geon-Hee; Lohumi, Santosh; Kang, Jum-Soon; Cho, Byoung-Kwan
2014-11-01
We used an infrared thermal signal measurement system together with photothermal signal and image reconstruction techniques to estimate the viability of pepper seeds. Photothermal signals from healthy and aged seeds were measured for seven aging periods (24, 48, 72, 96, 120, 144, and 168 h) using an infrared camera and analyzed by a regression method. The photothermal signals were regressed using a two-term exponential decay curve with two amplitudes and two time variables (lifetimes) as regression coefficients. The regression coefficients of the fitted curve showed significant differences for each seed group, depending on the aging times. In addition, the viability of a single seed was estimated by imaging its regression coefficient, which was reconstructed from the measured photothermal signals. The time-resolved photothermal characteristics, along with the regression coefficient images, can be used to discriminate the aged or dead pepper seeds from the healthy seeds.
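A minimal sketch of the regression step described above: a two-term exponential decay with two amplitudes and two lifetimes is fitted to a decay signal with scipy's curve_fit. The signal here is simulated rather than measured from seeds.

```python
import numpy as np
from scipy.optimize import curve_fit

def two_term_decay(t, a1, tau1, a2, tau2):
    # Two amplitudes (a1, a2) and two lifetimes (tau1, tau2) as regression coefficients.
    return a1 * np.exp(-t / tau1) + a2 * np.exp(-t / tau2)

rng = np.random.default_rng(12)
t = np.linspace(0, 10, 200)                                   # time after excitation (arbitrary units)
signal = two_term_decay(t, 1.2, 0.8, 0.4, 4.0) + rng.normal(0, 0.01, t.size)

params, cov = curve_fit(two_term_decay, t, signal, p0=[1, 1, 0.5, 5])
print("a1, tau1, a2, tau2 =", np.round(params, 3))
```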
Ready-to-use supplementary food increases fat mass and BMI in Haitian school-aged children.
Iannotti, Lora L; Henretty, Nicole M; Delnatus, Jacques Raymond; Previl, Windy; Stehl, Tom; Vorkoper, Susan; Bodden, Jaime; Maust, Amanda; Smidt, Rachel; Nash, Marilyn L; Tamimie, Courtney A; Owen, Bridget C; Wolff, Patricia B
2015-04-01
In Haiti and other countries, large-scale investments in school feeding programs have been made with marginal evidence of nutrition outcomes. We aimed to examine the effectiveness of a fortified ready-to-use supplementary food (RUSF), Mamba, on reducing anemia and improving body composition in school-aged children, compared with an unfortified cereal bar, Tablet Yo, and a control group. A cluster-randomized trial with children ages 3-13 y (n = 1167) was conducted in the north of Haiti. Six schools were matched and randomized to the control group, Tablet Yo group (42 g, 165 kcal), or Mamba group (50 g, 260 kcal, and >75% of the RDA for critical micronutrients). Children in the supplementation groups received the snack daily for 100 d, and all were followed longitudinally for hemoglobin concentrations, anthropometry, and bioelectrical impedance measures: baseline (December 2012), midline (March 2013), and endline (June 2013). Parent surveys were conducted at baseline and endline to examine secondary outcomes of morbidities and dietary intakes. Longitudinal regression modeling using generalized least squares and logit with random effects tested the main effects. At baseline, 14.0% of children were stunted, 14.5% underweight, 9.1% thin, and 73% anemic. Fat mass percentage (mean ± SD) was 8.1% ± 4.3% for boys and 12.5% ± 4.4% for girls. In longitudinal modeling, Mamba supplementation increased body mass index z score (regression coefficient ± SEE) 0.25 ± 0.06, fat mass 0.45 ± 0.14 kg, and percentage fat mass 1.28% ± 0.27% compared with control at each time point (P < 0.001). Among boys, Mamba increased fat mass (regression coefficient ± SEE) 0.73 ± 0.19 kg and fat-free mass 0.62 ± 0.34 kg compared with control (P < 0.001). Mamba reduced the odds of developing anemia by 28% compared to control (adjusted OR: 0.72; 95% CI: 0.57, 0.91; P < 0.001). No treatment effect was found for hemoglobin concentration. To our knowledge, this is the first study to give evidence of body composition effects from an RUSF in school-aged children. © 2015 American Society for Nutrition.
Grajeda, Laura M; Ivanescu, Andrada; Saito, Mayuko; Crainiceanu, Ciprian; Jaganath, Devan; Gilman, Robert H; Crabtree, Jean E; Kelleher, Dermott; Cabrera, Lilia; Cama, Vitaliano; Checkley, William
2016-01-01
Childhood growth is a cornerstone of pediatric research. Statistical models need to consider individual trajectories to adequately describe growth outcomes. Specifically, well-defined longitudinal models are essential to characterize both population and subject-specific growth. Linear mixed-effect models with cubic regression splines can account for the nonlinearity of growth curves and provide reasonable estimators of population and subject-specific growth, velocity and acceleration. We provide a stepwise approach that builds from simple to complex models, and accounts for the intrinsic complexity of the data. We start with standard cubic splines regression models and build up to a model that includes subject-specific random intercepts and slopes and residual autocorrelation. We then compared cubic regression splines with linear piecewise splines, with varying numbers and positions of knots. Statistical code is provided to ensure reproducibility and improve dissemination of methods. Models are applied to longitudinal height measurements in a cohort of 215 Peruvian children followed from birth until their fourth year of life. Unexplained variability, as measured by the variance of the regression model, was reduced from 7.34 when using ordinary least squares to 0.81 (p < 0.001) when using a linear mixed-effects model with random slopes and a first order continuous autoregressive error term. There was substantial heterogeneity in both the intercept (p < 0.001) and slopes (p < 0.001) of the individual growth trajectories. We also identified important serial correlation within the structure of the data (ρ = 0.66; 95 % CI 0.64 to 0.68; p < 0.001), which we modeled with a first order continuous autoregressive error term as evidenced by the variogram of the residuals and by a lack of association among residuals. The final model provides a parametric linear regression equation for both estimation and prediction of population- and individual-level growth in height. We show that cubic regression splines are superior to linear regression splines for the case of a small number of knots in both estimation and prediction with the full linear mixed effect model (AIC 19,352 vs. 19,598, respectively). While the regression parameters are more complex to interpret in the former, we argue that inference for any problem depends more on the estimated curve or differences in curves rather than the coefficients. Moreover, use of cubic regression splines provides biologically meaningful growth velocity and acceleration curves despite increased complexity in coefficient interpretation. Through this stepwise approach, we provide a set of tools to model longitudinal childhood data for non-statisticians using linear mixed-effect models.
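A minimal sketch of the modelling strategy described above, on simulated heights: a linear mixed-effects model with a cubic regression spline for age (patsy's cr() basis) and a child-specific random intercept and slope. The continuous first-order autoregressive residual term used in the paper is omitted because statsmodels' MixedLM does not model serial correlation directly; this is a simplification, not the authors' code.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(13)
children, visits = 50, 12
age = np.tile(np.linspace(0.1, 4.0, visits), children)
child = np.repeat(np.arange(children), visits)
u0 = rng.normal(0, 2, children)[child]          # child-specific intercept deviations
u1 = rng.normal(0, 0.5, children)[child]        # child-specific slope deviations
height = 50 + 15 * np.log1p(age) + u0 + u1 * age + rng.normal(0, 0.9, age.size)
df = pd.DataFrame({"height": height, "age": age, "child": child})

# Cubic regression spline for age plus random intercepts and slopes per child.
model = smf.mixedlm("height ~ cr(age, df=5)", df, groups="child", re_formula="~age")
print(model.fit().summary())
```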
Fujino, Yoshihisa; Kubo, Tatsuhiko; Kunimoto, Masamizu; Tabata, Hidetoshi; Tsuchiya, Takuto; Kadowaki, Koji; Nakamura, Takehiro; Oyama, Ichiro
2013-01-01
We examined the contextual effect of workplace social capital on systolic blood pressure (SBP). Cross-sectional study of a conglomerate comprising 58 workplaces in Japan. Of the 5844 workers, 5368 were recruited. Individuals who received drugs for hypertension (n=531) and who lacked information on any variable (n=167) were excluded from the analyses, leaving 4735 individuals (3281 men and 1454 women) for inclusion. The outcome was systolic blood pressure. The contextual effect of workplace social capital on SBP was examined using a multilevel regression analysis with a random intercept. Coworker support had a contextual effect at the workplace level (coefficient=-1.97, p=0.043), while a lack of trust in coworkers (coefficient=0.27, p=0.039) and a lack of helpfulness from coworkers (coefficient=0.28, p=0.002) were associated with SBP. The present study suggested that social capital at the workplace level has beneficial effects on SBP.
Penalized spline estimation for functional coefficient regression models.
Cao, Yanrong; Lin, Haiqun; Wu, Tracy Z; Yu, Yan
2010-04-01
The functional coefficient regression models assume that the regression coefficients vary with some "threshold" variable, providing appreciable flexibility in capturing the underlying dynamics in data and avoiding the so-called "curse of dimensionality" in multivariate nonparametric estimation. We first investigate the estimation, inference, and forecasting for the functional coefficient regression models with dependent observations via penalized splines. The P-spline approach, as a direct ridge regression shrinkage type global smoothing method, is computationally efficient and stable. With established fixed-knot asymptotics, inference is readily available. Exact inference can be obtained for fixed smoothing parameter λ, which is most appealing for finite samples. Our penalized spline approach gives an explicit model expression, which also enables multi-step-ahead forecasting via simulations. Furthermore, we examine different methods of choosing the important smoothing parameter λ: modified multi-fold cross-validation (MCV), generalized cross-validation (GCV), and an extension of empirical bias bandwidth selection (EBBS) to P-splines. In addition, we implement smoothing parameter selection using mixed model framework through restricted maximum likelihood (REML) for P-spline functional coefficient regression models with independent observations. The P-spline approach also easily allows different smoothness for different functional coefficients, which is enabled by assigning different penalty λ accordingly. We demonstrate the proposed approach by both simulation examples and a real data application.
Hidden Connections between Regression Models of Strain-Gage Balance Calibration Data
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert
2013-01-01
Hidden connections between regression models of wind tunnel strain-gage balance calibration data are investigated. These connections become visible whenever balance calibration data is supplied in its design format and both the Iterative and Non-Iterative Method are used to process the data. First, it is shown how the regression coefficients of the fitted balance loads of a force balance can be approximated by using the corresponding regression coefficients of the fitted strain-gage outputs. Then, data from the manual calibration of the Ames MK40 six-component force balance is chosen to illustrate how estimates of the regression coefficients of the fitted balance loads can be obtained from the regression coefficients of the fitted strain-gage outputs. The study illustrates that load predictions obtained by applying the Iterative or the Non-Iterative Method originate from two related regression solutions of the balance calibration data as long as balance loads are given in the design format of the balance, gage outputs behave highly linear, strict statistical quality metrics are used to assess regression models of the data, and regression model term combinations of the fitted loads and gage outputs can be obtained by a simple variable exchange.
Auras, Silke; Ostermann, Thomas; de Cruppé, Werner; Bitzer, Eva-Maria; Diel, Franziska; Geraedts, Max
2016-12-01
The study aimed to illustrate the effect of the patients' sex, age, self-rated health and medical practice specialization on patient satisfaction. Secondary analysis of patient survey data using multilevel analysis (generalized linear mixed model, medical practice as random effect) using a sequential modelling strategy. We examined the effects of the patients' sex, age, self-rated health and medical practice specialization on four patient satisfaction dimensions: medical practice organization, information, interaction, professional competence. The study was performed in 92 German medical practices providing ambulatory care in general medicine, internal medicine or gynaecology. In total, 9888 adult patients participated in a patient survey using the validated 'questionnaire on satisfaction with ambulatory care-quality from the patient perspective [ZAP]'. We calculated four models for each satisfaction dimension, revealing regression coefficients with 95% confidence intervals (CIs) for all independent variables, and using Wald Chi-Square statistic for each modelling step (model validity) and LR-Tests to compare the models of each step with the previous model. The patients' sex and age had a weak effect (maximum regression coefficient 1.09, CI 0.39; 1.80), and the patients' self-rated health had the strongest positive effect (maximum regression coefficient 7.66, CI 6.69; 8.63) on satisfaction ratings. The effect of medical practice specialization was heterogeneous. All factors studied, specifically the patients' self-rated health, affected patient satisfaction. Adjustment should always be considered because it improves the comparability of patient satisfaction in medical practices with atypically varying patient populations and increases the acceptance of comparisons. © The Author 2016. Published by Oxford University Press in association with the International Society for Quality in Health Care. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
Wrong Signs in Regression Coefficients
NASA Technical Reports Server (NTRS)
McGee, Holly
1999-01-01
When using parametric cost estimation, it is important to note the possibility of the regression coefficients having the wrong sign. A wrong sign is defined as a sign on the regression coefficient opposite to the researcher's intuition and experience. Some possible causes for the wrong sign discussed in this paper are a small range of x's, leverage points, missing variables, multicollinearity, and computational error. Additionally, techniques for determining the cause of the wrong sign are given.
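A minimal sketch of one of the causes listed above, multicollinearity: two nearly collinear predictors both have a true positive effect, yet the least-squares fit frequently attaches a negative coefficient to one of them.

```python
import numpy as np

rng = np.random.default_rng(14)
n = 50
x1 = rng.normal(size=n)
x2 = x1 + 0.02 * rng.normal(size=n)             # almost an exact copy of x1
y = 1.0 * x1 + 1.0 * x2 + rng.normal(0, 1, n)   # both true coefficients are +1

X = np.column_stack([np.ones(n), x1, x2])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
# With predictors this collinear, one of b1, b2 often comes out with the "wrong" sign.
print("intercept, b1, b2 =", np.round(beta, 2))
```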
NASA Astrophysics Data System (ADS)
Nishidate, Izumi; Wiswadarma, Aditya; Hase, Yota; Tanaka, Noriyuki; Maeda, Takaaki; Niizeki, Kyuichi; Aizu, Yoshihisa
2011-08-01
In order to visualize melanin and blood concentrations and oxygen saturation in human skin tissue, a simple imaging technique based on multispectral diffuse reflectance images acquired at six wavelengths (500, 520, 540, 560, 580, and 600 nm) was developed. The technique utilizes multiple regression analysis aided by Monte Carlo simulation for diffuse reflectance spectra. Using the absorbance spectrum as a response variable and the extinction coefficients of melanin, oxygenated hemoglobin, and deoxygenated hemoglobin as predictor variables, multiple regression analysis provides regression coefficients. Concentrations of melanin and total blood are then determined from the regression coefficients using conversion vectors that are deduced numerically in advance, while oxygen saturation is obtained directly from the regression coefficients. Experiments with a tissue-like agar gel phantom validated the method. In vivo experiments on the skin of the human hand during upper-limb occlusion and of the inner forearm exposed to UV irradiation demonstrated the ability of the method to evaluate physiological reactions of human skin tissue.
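A minimal sketch of the spectral regression step described above for a single pixel: the absorbance at the six wavelengths is regressed on extinction coefficients of melanin, oxygenated and deoxygenated hemoglobin, and oxygen saturation is formed directly from the fitted coefficients. The extinction values are placeholders, not tabulated constants, and the Monte-Carlo-derived conversion vectors for absolute concentrations are omitted.

```python
import numpy as np

wavelengths = [500, 520, 540, 560, 580, 600]              # nm
# Columns: melanin, oxygenated Hb, deoxygenated Hb (illustrative values, arbitrary units).
E = np.array([[1.9, 0.55, 0.70],
              [1.7, 0.60, 0.65],
              [1.5, 1.00, 0.95],
              [1.3, 0.85, 1.05],
              [1.2, 1.10, 0.90],
              [1.0, 0.25, 0.30]])

rng = np.random.default_rng(15)
true_mix = np.array([0.8, 0.5, 0.3])                      # simulated chromophore contributions
absorbance = E @ true_mix + rng.normal(0, 0.01, len(wavelengths))

X = np.column_stack([np.ones(len(wavelengths)), E])       # intercept + extinction predictors
coeffs, *_ = np.linalg.lstsq(X, absorbance, rcond=None)
oxy, deoxy = coeffs[2], coeffs[3]
print("regression coefficients:", np.round(coeffs, 3))
print("estimated oxygen saturation:", oxy / (oxy + deoxy))
```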
Ham, Joo-ho; Park, Hun-Young; Kim, Youn-ho; Bae, Sang-kon; Ko, Byung-hoon
2017-01-01
[Purpose] The purpose of this study was to develop a regression model to estimate the heart rate at the lactate threshold (HRLT) and the heart rate at the ventilatory threshold (HRVT) using the heart rate threshold (HRT), and to test the validity of the regression model. [Methods] We performed a graded exercise test with a treadmill in 220 normal individuals (men: 112, women: 108) aged 20–59 years. HRT, HRLT, and HRVT were measured in all subjects. A regression model was developed to estimate HRLT and HRVT using HRT with 70% of the data (men: 79, women: 76) through randomization (7:3), with the Bernoulli trial. The validity of the regression model developed with the remaining 30% of the data (men: 33, women: 32) was also examined. [Results] Based on the regression coefficient, we found that the independent variable HRT was a significant variable in all regression models. The adjusted R2 of the developed regression models averaged about 70%, and the standard error of estimation of the validity test results was 11 bpm, which is similar to that of the developed model. [Conclusion] These results suggest that HRT is a useful parameter for predicting HRLT and HRVT. PMID:29036765
Spatial Autocorrelation Approaches to Testing Residuals from Least Squares Regression.
Chen, Yanguang
2016-01-01
In geo-statistics, the Durbin-Watson test is frequently employed to detect the presence of residual serial correlation from least squares regression analyses. However, the Durbin-Watson statistic is only suitable for ordered time or spatial series. If the variables comprise cross-sectional data coming from spatial random sampling, the test will be ineffectual because the value of Durbin-Watson's statistic depends on the sequence of data points. This paper develops two new statistics for testing serial correlation of residuals from least squares regression based on spatial samples. By analogy with the new form of Moran's index, an autocorrelation coefficient is defined with a standardized residual vector and a normalized spatial weight matrix. Then by analogy with the Durbin-Watson statistic, two types of new serial correlation indices are constructed. As a case study, the two newly presented statistics are applied to a spatial sample of 29 China's regions. These results show that the new spatial autocorrelation models can be used to test the serial correlation of residuals from regression analysis. In practice, the new statistics can make up for the deficiencies of the Durbin-Watson test.
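A minimal sketch of the residual-autocorrelation idea described above: residuals from an ordinary least-squares fit on spatial sample points are standardised and combined with a normalised spatial weight matrix to give a Moran-type serial correlation coefficient. The inverse-distance weights are an illustrative choice, not the exact statistic proposed in the paper.

```python
import numpy as np

rng = np.random.default_rng(16)
n = 29
coords = rng.uniform(0, 10, size=(n, 2))                  # locations of the spatial sample
x = rng.normal(size=n)
y = 2 + 1.5 * x + rng.normal(0, 1, n)

X = np.column_stack([np.ones(n), x])
resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
z = (resid - resid.mean()) / resid.std()                  # standardised residual vector

d = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
with np.errstate(divide="ignore"):
    W = np.where(d > 0, 1.0 / d, 0.0)                     # inverse-distance weights, zero diagonal
W /= W.sum()                                              # normalised spatial weight matrix

moran_like = n * float(z @ W @ z) / float(z @ z)
print("Moran-type autocorrelation of residuals:", moran_like)
```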
Nattee, Cholwich; Khamsemanan, Nirattaya; Lawtrakul, Luckhana; Toochinda, Pisanu; Hannongbua, Supa
2017-01-01
Malaria is still one of the most serious diseases in tropical regions. This is due in part to the high resistance against available drugs for the inhibition of parasites, Plasmodium, the cause of the disease. New potent compounds with high clinical utility are urgently needed. In this work, we created a novel model using a regression tree to study structure-activity relationships and predict the inhibition constant, Ki, of three different antimalarial analogues (Trimethoprim, Pyrimethamine, and Cycloguanil) based on their molecular descriptors. To the best of our knowledge, this work is the first attempt to study the structure-activity relationships of all three analogues combined. The most relevant descriptors and appropriate parameters of the regression tree are harvested using extremely randomized trees. These descriptors are water accessible surface area, Log of the aqueous solubility, total hydrophobic van der Waals surface area, and molecular refractivity. Out of all possible combinations of these selected parameters and descriptors, the tree with the strongest coefficient of determination is selected to be our prediction model. Predicted Ki values from the proposed model show a strong coefficient of determination, R2 = 0.996, to experimental Ki values. From the structure of the regression tree, compounds with high accessible surface area of all hydrophobic atoms (ASA_H) and low aqueous solubility of inhibitors (Log S) generally possess low Ki values. Our prediction model can also be utilized as a screening test for new antimalarial drug compounds which may reduce the time and expenses for new drug development. New compounds with high predicted Ki should be excluded from further drug development. It is also our inference that a threshold of ASA_H greater than 575.80 and Log S less than or equal to -4.36 is a sufficient condition for a new compound to possess a low Ki. Copyright © 2016 Elsevier Inc. All rights reserved.
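A minimal sketch of the modelling pipeline described above: extremely randomized trees rank candidate molecular descriptors, and a single regression tree is then fitted on the retained descriptors to predict a (log-scaled) inhibition constant. Descriptor values and names are simulated, not the four descriptors selected in the paper.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(17)
n, p = 60, 12
X = rng.normal(size=(n, p))                    # candidate molecular descriptors (simulated)
log_ki = 2.0 - 1.5 * X[:, 0] + 0.8 * X[:, 3] + rng.normal(0, 0.2, n)

# Extremely randomized trees rank the descriptors by importance.
ranker = ExtraTreesRegressor(n_estimators=500, random_state=0).fit(X, log_ki)
top = np.argsort(ranker.feature_importances_)[::-1][:4]   # keep the four most relevant

# Single regression tree on the retained descriptors.
tree = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X[:, top], log_ki)
print("selected descriptor indices:", top)
print("R2 of the regression tree:", tree.score(X[:, top], log_ki))
```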
Application of random effects to the study of resource selection by animals.
Gillies, Cameron S; Hebblewhite, Mark; Nielsen, Scott E; Krawchuk, Meg A; Aldridge, Cameron L; Frair, Jacqueline L; Saher, D Joanne; Stevens, Cameron E; Jerde, Christopher L
2006-07-01
1. Resource selection estimated by logistic regression is used increasingly in studies to identify critical resources for animal populations and to predict species occurrence. 2. Most frequently, individual animals are monitored and pooled to estimate population-level effects without regard to group or individual-level variation. Pooling assumes that both observations and their errors are independent, and resource selection is constant given individual variation in resource availability. 3. Although researchers have identified ways to minimize autocorrelation, variation between individuals caused by differences in selection or available resources, including functional responses in resource selection, have not been well addressed. 4. Here we review random-effects models and their application to resource selection modelling to overcome these common limitations. We present a simple case study of an analysis of resource selection by grizzly bears in the foothills of the Canadian Rocky Mountains with and without random effects. 5. Both categorical and continuous variables in the grizzly bear model differed in interpretation, both in statistical significance and coefficient sign, depending on how a random effect was included. We used a simulation approach to clarify the application of random effects under three common situations for telemetry studies: (a) discrepancies in sample sizes among individuals; (b) differences among individuals in selection where availability is constant; and (c) differences in availability with and without a functional response in resource selection. 6. We found that random intercepts accounted for unbalanced sample designs, and models with random intercepts and coefficients improved model fit given the variation in selection among individuals and functional responses in selection. Our empirical example and simulations demonstrate how including random effects in resource selection models can aid interpretation and address difficult assumptions limiting their generality. This approach will allow researchers to appropriately estimate marginal (population) and conditional (individual) responses, and account for complex grouping, unbalanced sample designs and autocorrelation.
Kajbafnezhad, H; Ahadi, H; Heidarie, A; Askari, P; Enayati, M
2012-10-01
The aim of this study was to predict athletic success motivation from mental skills, emotional intelligence and its components. The research sample consisted of 153 male athletes who were selected through random multistage sampling. The subjects completed the Mental Skills Questionnaire, the Bar-On Emotional Intelligence Questionnaire and the Perception of Sport Success Questionnaire. Data were analyzed using the Pearson correlation coefficient and multiple regression. Regression analysis showed that, of the two variables mental skill and emotional intelligence, mental skill was the better predictor of athletic success motivation and of the participants' success rate. The results also showed that among all the components of emotional intelligence, self-respect had a significantly higher ability to predict athletic success motivation. The use of psychological skills and emotional intelligence as mediating, regulating and organizing factors leads to improved performance and can help athletes make suitable and effective decisions for reaching a desired goal.
Chaurasia, Ashok; Harel, Ofer
2015-02-10
Tests for regression coefficients such as global, local, and partial F-tests are common in applied research. In the framework of multiple imputation, there are several papers addressing tests for regression coefficients. However, for simultaneous hypothesis testing, the existing methods are computationally intensive because they involve calculation with vectors and (inversion of) matrices. In this paper, we propose a simple method based on the scalar entity, coefficient of determination, to perform (global, local, and partial) F-tests with multiply imputed data. The proposed method is evaluated using simulated data and applied to suicide prevention data. Copyright © 2014 John Wiley & Sons, Ltd.
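For reference, the complete-data version of the scalar relationship the paper exploits, a global F-test computed from the coefficient of determination alone, can be sketched as follows; the multiple-imputation pooling rules proposed in the paper are not reproduced here.

```python
from scipy import stats

def global_f_test(r_squared, n_obs, n_predictors):
    """Global F-test of H0: all slope coefficients are zero,
    computed from the coefficient of determination alone."""
    df1 = n_predictors
    df2 = n_obs - n_predictors - 1
    f_stat = (r_squared / df1) / ((1.0 - r_squared) / df2)
    p_value = stats.f.sf(f_stat, df1, df2)
    return f_stat, p_value

# Example: R^2 = 0.30 from a model with 4 predictors and 100 observations.
print(global_f_test(0.30, 100, 4))
```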
Rights, Jason D; Sterba, Sonya K
2016-11-01
Multilevel data structures are common in the social sciences. Often, such nested data are analysed with multilevel models (MLMs) in which heterogeneity between clusters is modelled by continuously distributed random intercepts and/or slopes. Alternatively, the non-parametric multilevel regression mixture model (NPMM) can accommodate the same nested data structures through discrete latent class variation. The purpose of this article is to delineate analytic relationships between NPMM and MLM parameters that are useful for understanding the indirect interpretation of the NPMM as a non-parametric approximation of the MLM, with relaxed distributional assumptions. We define how seven standard and non-standard MLM specifications can be indirectly approximated by particular NPMM specifications. We provide formulas showing how the NPMM can serve as an approximation of the MLM in terms of intraclass correlation, random coefficient means and (co)variances, heteroscedasticity of residuals at level 1, and heteroscedasticity of residuals at level 2. Further, we discuss how these relationships can be useful in practice. The specific relationships are illustrated with simulated graphical demonstrations, and direct and indirect interpretations of NPMM classes are contrasted. We provide an R function to aid in implementing and visualizing an indirect interpretation of NPMM classes. An empirical example is presented and future directions are discussed. © 2016 The British Psychological Society.
Jacob, Benjamin J; Krapp, Fiorella; Ponce, Mario; Gottuzzo, Eduardo; Griffith, Daniel A; Novak, Robert J
2010-05-01
Spatial autocorrelation is problematic for classical hierarchical cluster detection tests commonly used in multi-drug resistant tuberculosis (MDR-TB) analyses, as considerable random error can occur. Therefore, when MDR-TB clusters are spatially autocorrelated the assumption that the clusters are independently random is invalid. In this research, a product moment correlation coefficient (i.e., the Moran's coefficient) was used to quantify local spatial variation in multiple clinical and environmental predictor variables sampled in San Juan de Lurigancho, Lima, Peru. Initially, QuickBird 0.61 m data, encompassing visible bands and the near infra-red bands, were selected to synthesize images of land cover attributes of the study site. Data on residential addresses of individual patients with smear-positive MDR-TB were geocoded, prevalence rates calculated and then digitally overlaid onto the satellite data within a 2 km buffer of 31 georeferenced health centers, using a 10 m² grid-based algorithm. Geographical information system (GIS)-gridded measurements of each health center were generated based on preliminary base maps of the georeferenced data aggregated to block groups and census tracts within each buffered area. A three-dimensional model of the study site was constructed based on a digital elevation model (DEM) to determine terrain covariates associated with the sampled MDR-TB covariates. Pearson's correlation was used to evaluate the linear relationship between the DEM and the sampled MDR-TB data. A SAS/GIS® module was then used to calculate univariate statistics and to perform linear and non-linear regression analyses using the sampled predictor variables. The estimates generated from a global autocorrelation analysis were then spatially decomposed into empirical orthogonal bases using a negative binomial regression with a non-homogeneous mean. Results of the DEM analyses indicated a statistically non-significant, linear relationship between georeferenced health centers and the sampled covariate elevation. The data exhibited positive spatial autocorrelation, and the decomposition of Moran's coefficient into uncorrelated, orthogonal map pattern components revealed global spatial heterogeneities necessary to capture latent autocorrelation in the MDR-TB model. It was thus shown that Poisson regression analyses and spatial eigenvector mapping can elucidate the mechanics of MDR-TB transmission by prioritizing clinical and environmental-sampled predictor variables for identifying high-risk populations.
Balemans, Astrid C J; van Wely, Leontien; Becher, Jules G; Dallmeijer, Annet J
2015-07-01
A vicious circle of decreased physical fitness, early fatigue, and low physical activity levels (PAL) is thought to affect children with cerebral palsy (CP). However, the relationship of changes in physical fitness to changes in PAL and fatigue is unclear. The objective of this study was to investigate the associations among changes in physical fitness, walking-related PAL, and fatigue in children with CP. This study was a secondary analysis of a randomized controlled trial with measurements at baseline, 6 months (after the intervention period), and 12 months. Twenty-four children with bilateral spastic CP and 22 with unilateral spastic CP, aged 7 to 13 years, all walking, participated in this study. Physical fitness was measured by aerobic capacity, anaerobic threshold, anaerobic capacity, and isometric and functional muscle strength. Walking-related PAL was measured using an ankle-worn activity monitor for 1 week. Fatigue was determined with the Pediatric Quality of Life (PedsQL) Multidimensional Fatigue Scale. Longitudinal associations were analyzed by random coefficient regression analysis. In children with bilateral CP, all fitness parameters showed a positive, significant association with walking-related PAL, whereas no associations between physical fitness and walking-related PAL were seen in children with unilateral CP. No clinically relevant association between physical fitness and fatigue was found. Although random coefficient regression analysis can be used to investigate longitudinal associations between parameters, a causal relationship cannot be determined. The actual direction of the association between physical fitness and walking-related PAL, therefore, remains inconclusive. Children with bilateral spastic CP might benefit from improved physical fitness to increase their PAL or vice versa, although this is not the case in children with unilateral CP. There appears to be no relationship between physical fitness and self-reported fatigue in children with CP. Interventions aimed at improving PAL may be differently targeted in children with either bilateral or unilateral CP. © 2015 American Physical Therapy Association.
Experiment Design for Complex VTOL Aircraft with Distributed Propulsion and Tilt Wing
NASA Technical Reports Server (NTRS)
Murphy, Patrick C.; Landman, Drew
2015-01-01
Selected experimental results from a wind tunnel study of a subscale VTOL concept with distributed propulsion and tilt lifting surfaces are presented. The vehicle complexity and automated test facility were ideal for use with a randomized designed experiment. Design of Experiments and Response Surface Methods were invoked to produce run-efficient, statistically rigorous regression models with minimized prediction error. Static tests were conducted at the NASA Langley 12-Foot Low-Speed Tunnel to model all six aerodynamic coefficients over a large flight envelope. This work supports investigations at NASA Langley in developing advanced configurations, simulations, and advanced control systems.
ERIC Educational Resources Information Center
Yan, Jun; Aseltine, Robert H., Jr.; Harel, Ofer
2013-01-01
Comparing regression coefficients between models when one model is nested within another is of great practical interest when two explanations of a given phenomenon are specified as linear models. The statistical problem is whether the coefficients associated with a given set of covariates change significantly when other covariates are added into…
Zhang, Hualing
2014-03-01
To learn the characteristics and mutual relations of self-esteem, self-harmony and interpersonal harmony among university students, in order to provide a basis for mental health education. With a stratified cluster random sampling method, a questionnaire survey was conducted in 820 university students from 16 classes of four universities, chosen from 30 universities in Anhui Province. The Rosenberg Self-esteem Scale, the Self-harmony Scale and the Interpersonal-harmony Diagnostic Scale were used for assessment. Self-esteem of university students had an average score of (30.71 +/- 4.77), higher than the theoretical median of 25, and statistically significant differences existed in the dimensions of gender (P = 0.004), origin (P = 0.038) and only-child status (P = 0.005). University students' self-harmony had an average score of (98.66 +/- 8.69); 112 students (13.7%) were in the low-score group, 442 (53.95%) in the middle-score group and 265 (32.33%) in the high-score group. There was no statistical significance in the total score of self-harmony or in the score differences of most subscales with respect to gender and origin, but statistical significance did exist in the dimension of only-child status (P = 0.004). The difference between university students from urban and rural areas on the "stereotype" subscale was statistically significant (P = 0.006). Every dimension of self-esteem, self-harmony and interpersonal harmony was correlated and statistically significant. Multiple regression analysis found that when self-esteem was entered as a variable, the variance in interpersonal conversation explained by self-harmony dropped from 22.6% to 12%, with the standardized regression coefficient changing from 0.087 to 0.035; the explained variance for trouble in interpersonal dating fell from 27.6% to 13.1%, with the standardized regression coefficient changing from 0.104 to 0.019; that for difficulty in treating people fell from 30.9% to 15%, with the standardized regression coefficient changing from 0.079 to 0.020; and that for problems of heterosexual contact fell from 23.4% to 17.3%, with the standardized regression coefficient changing from 0.095 to 0.024. Self-esteem was a mediator variable between self-harmony and interpersonal harmony. By cultivating university students' self-esteem to achieve self-harmony and interpersonal harmony, their mental health can be improved.
An instrumental variable random-coefficients model for binary outcomes
Chesher, Andrew; Rosen, Adam M
2014-01-01
In this paper, we study a random-coefficients model for a binary outcome. We allow for the possibility that some or even all of the explanatory variables are arbitrarily correlated with the random coefficients, thus permitting endogeneity. We assume the existence of observed instrumental variables Z that are jointly independent with the random coefficients, although we place no structure on the joint determination of the endogenous variable X and instruments Z, as would be required for a control function approach. The model fits within the spectrum of generalized instrumental variable models, and we thus apply identification results from our previous studies of such models to the present context, demonstrating their use. Specifically, we characterize the identified set for the distribution of random coefficients in the binary response model with endogeneity via a collection of conditional moment inequalities, and we investigate the structure of these sets by way of numerical illustration. PMID:25798048
Lee, Soo Yee; Mediani, Ahmed; Maulidiani, Maulidiani; Khatib, Alfi; Ismail, Intan Safinar; Zawawi, Norhasnida; Abas, Faridah
2018-01-01
Neptunia oleracea is a plant consumed as a vegetable and which has been used as a folk remedy for several diseases. Herein, two regression models (partial least squares, PLS; and random forest, RF) in a metabolomics approach were compared and applied to the evaluation of the relationship between phenolics and bioactivities of N. oleracea. In addition, the effects of different extraction conditions on the phenolic constituents were assessed by pattern recognition analysis. Comparison of the PLS and RF showed that RF exhibited poorer generalization and hence poorer predictive performance. Both the regression coefficient of PLS and the variable importance of RF revealed that quercetin and kaempferol derivatives, caffeic acid and vitexin-2-O-rhamnoside were significant towards the tested bioactivities. Furthermore, principal component analysis (PCA) and partial least squares-discriminant analysis (PLS-DA) results showed that sonication and absolute ethanol are the preferable extraction method and ethanol ratio, respectively, to produce N. oleracea extracts with high phenolic levels and therefore high DPPH scavenging and α-glucosidase inhibitory activities. Both PLS and RF are useful regression models in metabolomics studies. This work provides insight into the performance of different multivariate data analysis tools and the effects of different extraction conditions on the extraction of desired phenolics from plants. © 2017 Society of Chemical Industry.
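A hedged sketch of the kind of PLS-versus-RF comparison described above, using scikit-learn on simulated data; the matrix sizes, targets, and settings are illustrative assumptions rather than the authors' metabolomics data.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
X = rng.normal(size=(80, 25))                                                # mock phenolic intensities
y = X[:, :3] @ np.array([1.0, -0.5, 0.8]) + rng.normal(scale=0.3, size=80)   # mock bioactivity

pls = PLSRegression(n_components=3)
rf = RandomForestRegressor(n_estimators=300, random_state=0)

print("PLS CV R^2:", cross_val_score(pls, X, y, cv=5, scoring="r2").mean())
print("RF  CV R^2:", cross_val_score(rf, X, y, cv=5, scoring="r2").mean())

# PLS regression coefficients play the role of the 'variable importance' discussed above.
pls.fit(X, y)
print("largest |PLS coefficient|:", np.abs(pls.coef_).max())
```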
Fujino, Yoshihisa; Kubo, Tatsuhiko; Kunimoto, Masamizu; Tabata, Hidetoshi; Tsuchiya, Takuto; Kadowaki, Koji; Nakamura, Takehiro; Oyama, Ichiro
2013-01-01
Objectives We examined the contextual effect of workplace social capital on systolic blood pressure (SBP). Design Cross-sectional. Setting A conglomerate from 58 workplaces in Japan. Participants Of the 5844 workers at a Japanese conglomerate from 58 workplaces, 5368 were recruited. Individuals who received drugs for hypertension (n=531) and who lacked information on any variable (n=167) were excluded from the analyses, leaving 4735 individuals (3281 men and 1454 women) for inclusion. Primary and secondary outcome measures Systolic blood pressure. Results The contextual effect of workplace social capital on SBP was examined using a multilevel regression analysis with a random intercept. Coworker support had a contextual effect at the workplace level (coefficient=−1.97, p=0.043), while a lack of trust for coworkers (coefficient=0.27, p=0.039) and lack of helpfulness from coworkers were associated with SBP (coefficient=0.28, p=0.002). Conclusions The present study suggested that social capital at the workplace level has beneficial effects on SBP. PMID:23386581
NASA Astrophysics Data System (ADS)
Tsangaratos, Paraskevas; Ilia, Ioanna; Loupasakis, Constantinos; Papadakis, Michalis; Karimalis, Antonios
2017-04-01
The main objective of the present study was to apply two machine learning methods for the production of a landslide susceptibility map in the Finikas catchment basin, located in North Peloponnese, Greece, and to compare their results. Specifically, Logistic Regression and Random Forest were utilized, based on a database of 40 sites classified into two categories, non-landslide and landslide areas, that were separated into a training dataset (70% of the total data) and a validation dataset (remaining 30%). The identification of the areas was established by analyzing airborne imagery, extensive field investigation and the examination of previous research studies. Six landslide-related variables were analyzed, namely: lithology, elevation, slope, aspect, distance to rivers and distance to faults. Within the Finikas catchment basin most of the reported landslides were located along the road network and within the residential complexes, classified as rotational and translational slides and rockfalls, mainly caused by the physical conditions and the general geotechnical behavior of the geological formations that cover the area. Each landslide susceptibility map was reclassified by applying the Geometric Interval classification technique into five classes, namely: very low susceptibility, low susceptibility, moderate susceptibility, high susceptibility, and very high susceptibility. The comparison and validation of the outcomes of each model were achieved using statistical evaluation measures, the receiver operating characteristic and the area under the success and predictive rate curves. The computation process was carried out using RStudio, an integrated development environment for the R language, and ArcGIS 10.1 for compiling the data and producing the landslide susceptibility maps. The outcomes of the Logistic Regression analysis indicated that the highest b coefficients were allocated to lithology and slope, at 2.8423 and 1.5841, respectively. From the estimation of the mean decrease in Gini coefficient and the mean decrease in accuracy performed during the application of Random Forest, the most important variable was slope, followed by lithology, aspect, elevation, distance from the river network, and distance from faults, while the most frequently used variables during the training phase were aspect (21.45%), slope (20.53%) and lithology (19.84%). The outcomes of the analysis are consistent with previous studies concerning the area of research, which have indicated the high influence of lithology and slope on the manifestation of landslides. A high percentage of landslide occurrences has been observed in Plio-Pleistocene sediments, flysch formations, and Cretaceous limestone. The presence of landslides has also been associated with the degree of weathering and fragmentation, the orientation of the discontinuity surfaces and the intense morphological relief. The most accurate model was Random Forest, which correctly identified 92.00% of the instances during the training phase, followed by Logistic Regression at 89.00%. The same pattern of accuracy was observed during the validation phase, in which Random Forest achieved a classification accuracy of 93.00%, while the Logistic Regression model achieved an accuracy of 91.00%. In conclusion, the outcomes of the study could be a useful cartographic product for local authorities and government agencies during the implementation of successful decision-making and land use planning strategies.
Keywords: Landslide Susceptibility, Logistic Regression, Random Forest, GIS, Greece.
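The following scikit-learn sketch mirrors the comparison described above, logistic regression versus random forest with a 70/30 split and AUC validation, using placeholder values for the six conditioning factors; it is an illustration, not the study's dataset or tuning.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(5)
factors = ["lithology", "elevation", "slope", "aspect", "dist_rivers", "dist_faults"]
X = rng.normal(size=(40, len(factors)))                                           # mock conditioning factors
y = (X[:, 2] + 0.5 * X[:, 0] + rng.normal(scale=0.5, size=40) > 0).astype(int)    # 1 = landslide site

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0, stratify=y)

lr = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_tr, y_tr)

for name, model in [("Logistic Regression", lr), ("Random Forest", rf)]:
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: validation AUC = {auc:.2f}")

# Variable importance analogous to the mean decrease in Gini reported above.
print(dict(zip(factors, np.round(rf.feature_importances_, 3))))
```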
[Hyperspectral Estimation of Apple Tree Canopy LAI Based on SVM and RF Regression].
Han, Zhao-ying; Zhu, Xi-cun; Fang, Xian-yi; Wang, Zhuo-yuan; Wang, Ling; Zhao, Geng-Xing; Jiang, Yuan-mao
2016-03-01
Leaf area index (LAI) is a dynamic index of crop population size. Hyperspectral technology can be used to estimate apple canopy LAI rapidly and nondestructively, providing a reference for monitoring tree growth and estimating yield. Red Fuji apple trees at the full fruit-bearing stage were the research objects. The canopy spectral reflectance and LAI values of ninety apple trees in thirty orchards were measured with an ASD Fieldspec3 spectrometer and an LAI-2200 over two consecutive years in the Qixia research area of Shandong Province. The optimal vegetation indices were selected by correlation analysis of the original spectral reflectance and the vegetation indices. Models for predicting LAI were built with two multivariate regression methods: support vector machine (SVM) and random forest (RF). The new vegetation indices GNDVI527, NDVI676, RVI682, FD-NVI656 and GRVI517, together with the two previously established main vegetation indices NDVI670 and NDVI705, were in good accordance with LAI. In the RF regression model, the calibration-set determination coefficient C-R² of 0.920 and the validation-set determination coefficient V-R² of 0.889 were higher than those of the SVM regression model by 0.045 and 0.033, respectively. The calibration-set root mean square error C-RMSE of 0.249 and the validation-set root mean square error V-RMSE of 0.236 were lower than those of the SVM regression model by 0.054 and 0.058, respectively. The C-RPD of the calibration set and the V-RPD of the validation set reached 3.363 and 2.520, higher than those of the SVM regression model by 0.598 and 0.262, respectively. The slopes of the measured-versus-predicted scatterplot trend lines for the calibration and validation sets (C-S and V-S) were close to 1. The estimation results of the RF regression model were better than those of the SVM, and the RF regression model can be used to estimate the LAI of Red Fuji apple trees in the full fruit period.
Tools to Support Interpreting Multiple Regression in the Face of Multicollinearity
Kraha, Amanda; Turner, Heather; Nimon, Kim; Zientek, Linda Reichwein; Henson, Robin K.
2012-01-01
While multicollinearity may increase the difficulty of interpreting multiple regression (MR) results, it should not cause undue problems for the knowledgeable researcher. In the current paper, we argue that rather than using one technique to investigate regression results, researchers should consider multiple indices to understand the contributions that predictors make not only to a regression model, but to each other as well. Some of the techniques to interpret MR effects include, but are not limited to, correlation coefficients, beta weights, structure coefficients, all possible subsets regression, commonality coefficients, dominance weights, and relative importance weights. This article will review a set of techniques to interpret MR effects, identify the elements of the data on which the methods focus, and identify statistical software to support such analyses. PMID:22457655
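Two of the indices listed above, standardized beta weights and structure coefficients (the correlation of each predictor with the predicted scores), can be computed directly, as in this numpy sketch on deliberately collinear simulated predictors.

```python
import numpy as np

rng = np.random.default_rng(11)
n = 200
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=n)             # collinear with x1
y = 0.5 * x1 + 0.3 * x2 + rng.normal(size=n)

Z = np.column_stack([x1, x2])
Zs = (Z - Z.mean(0)) / Z.std(0)                       # standardized predictors
ys = (y - y.mean()) / y.std()

beta = np.linalg.lstsq(Zs, ys, rcond=None)[0]         # standardized beta weights
y_hat = Zs @ beta
structure = np.array([np.corrcoef(Zs[:, j], y_hat)[0, 1] for j in range(Zs.shape[1])])

print("beta weights:      ", np.round(beta, 3))
print("structure coeffs.: ", np.round(structure, 3))
```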
Adjusting for Confounding in Early Postlaunch Settings: Going Beyond Logistic Regression Models.
Schmidt, Amand F; Klungel, Olaf H; Groenwold, Rolf H H
2016-01-01
Postlaunch data on medical treatments can be analyzed to explore adverse events or relative effectiveness in real-life settings. These analyses are often complicated by the number of potential confounders and the possibility of model misspecification. We conducted a simulation study to compare the performance of logistic regression, propensity score, disease risk score, and stabilized inverse probability weighting methods to adjust for confounding. Model misspecification was induced in the independent derivation dataset. We evaluated performance using relative bias and confidence interval coverage of the true effect, among other metrics. At low events per coefficient (1.0 and 0.5), the logistic regression estimates had a large relative bias (greater than -100%). Bias of the disease risk score estimates was at most 13.48% and 18.83%. For the propensity score model, this was 8.74% and >100%, respectively. At events per coefficient of 1.0 and 0.5, inverse probability weighting frequently failed or reduced to a crude regression, resulting in biases of -8.49% and 24.55%. Coverage of logistic regression estimates became less than the nominal level at events per coefficient ≤5. For the disease risk score, inverse probability weighting, and propensity score, coverage became less than nominal at events per coefficient ≤2.5, ≤1.0, and ≤1.0, respectively. Bias of misspecified disease risk score models was 16.55%. In settings with low events/exposed subjects per coefficient, disease risk score methods can be useful alternatives to logistic regression models, especially when propensity score models cannot be used. Despite the better performance of disease risk score methods than logistic regression and propensity score models in small events-per-coefficient settings, bias and coverage still deviated from nominal levels.
Yoneoka, Daisuke; Henmi, Masayuki
2017-11-30
Recently, the number of clinical prediction models sharing the same regression task has increased in the medical literature. However, evidence synthesis methodologies that use the results of these regression models have not been sufficiently studied, particularly in meta-analysis settings where only regression coefficients are available. One of the difficulties lies in the differences between the categorization schemes of continuous covariates across different studies. In general, categorization methods using cutoff values are study specific across available models, even if they focus on the same covariates of interest. Differences in the categorization of covariates could lead to serious bias in the estimated regression coefficients and thus in subsequent syntheses. To tackle this issue, we developed synthesis methods for linear regression models with different categorization schemes of covariates. A 2-step approach to aggregate the regression coefficient estimates is proposed. The first step is to estimate the joint distribution of covariates by introducing a latent sampling distribution, which uses one set of individual participant data to estimate the marginal distribution of covariates with categorization. The second step is to use a nonlinear mixed-effects model with correction terms for the bias due to categorization to estimate the overall regression coefficients. Especially in terms of precision, numerical simulations show that our approach outperforms conventional methods, which only use studies with common covariates or ignore the differences between categorization schemes. The method developed in this study is also applied to a series of WHO epidemiologic studies on white blood cell counts. Copyright © 2017 John Wiley & Sons, Ltd.
A Bayesian ridge regression analysis of congestion's impact on urban expressway safety.
Shi, Qi; Abdel-Aty, Mohamed; Lee, Jaeyoung
2016-03-01
With the rapid growth of traffic in urban areas, concerns about congestion and traffic safety have been heightened. This study leveraged both the Automatic Vehicle Identification (AVI) system and the Microwave Vehicle Detection System (MVDS) installed on an expressway in Central Florida to explore how congestion impacts crash occurrence in urban areas. Multiple congestion measures from the two systems were developed. To ensure more precise estimates of congestion's effects, the traffic data were aggregated into peak and non-peak hours. Multicollinearity among traffic parameters was examined. The results showed the presence of multicollinearity, especially during peak hours. As a response, ridge regression was introduced to cope with this issue. Poisson models with uncorrelated random effects, correlated random effects, and both correlated random effects and random parameters were constructed within the Bayesian framework. It was proven that correlated random effects could significantly enhance model performance. The random parameters model has similar goodness-of-fit compared with the model with only correlated random effects. However, by accounting for the unobserved heterogeneity, more variables were found to be significantly related to crash frequency. The models indicated that congestion increased crash frequency during peak hours, while during non-peak hours it was not a major crash contributing factor. Using the random parameter model, the three congestion measures were compared. It was found that all congestion indicators had similar effects, while the Congestion Index (CI) derived from MVDS data was a better congestion indicator for safety analysis. Also, analyses showed that segments with higher congestion intensity could increase not only property damage only (PDO) crashes but also more severe crashes. In addition, the issues regarding the necessity to incorporate a specific congestion indicator for congestion's effects on safety and to account for the multicollinearity between explanatory variables were also discussed. By including a specific congestion indicator, the model performance significantly improved. When comparing models with and without ridge regression, the magnitude of the coefficients was altered in the presence of multicollinearity. These conclusions suggest that the use of an appropriate congestion measure and consideration of multicollinearity among the variables would improve the models and our understanding of the effects of congestion on traffic safety. Copyright © 2015 Elsevier Ltd. All rights reserved.
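As a hedged aside, the stabilizing effect of a ridge penalty under multicollinearity can be illustrated with scikit-learn on simulated, strongly correlated congestion measures; this is not the authors' Bayesian hierarchical Poisson specification, only the core idea of shrinking correlated coefficients.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(21)
n = 300
congestion = rng.normal(size=n)
# Three strongly correlated congestion measures (multicollinearity).
X = np.column_stack([congestion + 0.05 * rng.normal(size=n) for _ in range(3)])
crash_freq = 2.0 + 0.8 * congestion + rng.normal(scale=0.5, size=n)   # mock crash outcome

ols = LinearRegression().fit(X, crash_freq)
ridge = Ridge(alpha=10.0).fit(X, crash_freq)

print("OLS coefficients:  ", np.round(ols.coef_, 2))    # unstable, may even flip signs
print("Ridge coefficients:", np.round(ridge.coef_, 2))  # shrunk toward a shared, stable effect
```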
Wen, L; Bowen, C R; Hartman, G L
2017-10-01
Dispersal of urediniospores by wind is the primary means of spread for Phakopsora pachyrhizi, the cause of soybean rust. Our research focused on the short-distance movement of urediniospores from within the soybean canopy and up to 61 m from field-grown rust-infected soybean plants. Environmental variables were used to develop and compare models including least absolute shrinkage and selection operator (LASSO) regression, zero-inflated Poisson/regular Poisson regression, random forest, and neural network to describe deposition of urediniospores collected in passive and active traps. All four models identified distance of trap from source, humidity, temperature, wind direction, and wind speed as the five most important variables influencing short-distance movement of urediniospores. The random forest model provided the best predictions, explaining 76.1 and 86.8% of the total variation in the passive- and active-trap datasets, respectively. The prediction accuracies, based on the correlation coefficient (r) between predicted and true values, were 0.83 (P < 0.0001) and 0.94 (P < 0.0001) for the passive- and active-trap datasets, respectively. Overall, multiple machine learning techniques identified the most important variables for making the most accurate predictions of short-distance movement of P. pachyrhizi urediniospores.
Austin, Peter C.; Stryhn, Henrik; Leckie, George; Merlo, Juan
2017-01-01
Multilevel data occur frequently in many research areas like health services research and epidemiology. A suitable way to analyze such data is through the use of multilevel regression models. These models incorporate cluster‐specific random effects that allow one to partition the total variation in the outcome into between‐cluster variation and between‐individual variation. The magnitude of the effect of clustering provides a measure of the general contextual effect. When outcomes are binary or time‐to‐event in nature, the general contextual effect can be quantified by measures of heterogeneity like the median odds ratio or the median hazard ratio, respectively, which can be calculated from a multilevel regression model. Outcomes that are integer counts denoting the number of times that an event occurred are common in epidemiological and medical research. The median (incidence) rate ratio in multilevel Poisson regression for counts that corresponds to the median odds ratio or median hazard ratio for binary or time‐to‐event outcomes respectively is relatively unknown and is rarely used. The median rate ratio is the median relative change in the rate of the occurrence of the event when comparing identical subjects from 2 randomly selected different clusters that are ordered by rate. We also describe how the variance partition coefficient, which denotes the proportion of the variation in the outcome that is attributable to between‐cluster differences, can be computed with count outcomes. We illustrate the application and interpretation of these measures in a case study analyzing the rate of hospital readmission in patients discharged from hospital with a diagnosis of heart failure. PMID:29114926
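Assuming the latent-variable formula commonly used for the median odds ratio carries over by analogy to the median rate ratio, it can be computed from the random-intercept variance on the log-rate scale as in the sketch below; treat this as an illustrative convention rather than the paper's exact derivation.

```python
import math
from scipy.stats import norm

def median_rate_ratio(cluster_variance):
    """Median (incidence) rate ratio from the between-cluster variance
    of a random intercept on the log-rate scale."""
    return math.exp(math.sqrt(2.0 * cluster_variance) * norm.ppf(0.75))

# Example: a multilevel Poisson model with random-intercept variance 0.25.
print(round(median_rate_ratio(0.25), 2))   # roughly 1.61
```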
ERIC Educational Resources Information Center
Quinino, Roberto C.; Reis, Edna A.; Bessegato, Lupercio F.
2013-01-01
This article proposes the use of the coefficient of determination as a statistic for hypothesis testing in multiple linear regression based on distributions acquired by beta sampling. (Contains 3 figures.)
SPSS macros to compare any two fitted values from a regression model.
Weaver, Bruce; Dubois, Sacha
2012-12-01
In regression models with first-order terms only, the coefficient for a given variable is typically interpreted as the change in the fitted value of Y for a one-unit increase in that variable, with all other variables held constant. Therefore, each regression coefficient represents the difference between two fitted values of Y. But the coefficients represent only a fraction of the possible fitted value comparisons that might be of interest to researchers. For many fitted value comparisons that are not captured by any of the regression coefficients, common statistical software packages do not provide the standard errors needed to compute confidence intervals or carry out statistical tests, particularly in more complex models that include interactions, polynomial terms, or regression splines. We describe two SPSS macros that implement a matrix algebra method for comparing any two fitted values from a regression model. The !OLScomp and !MLEcomp macros are for use with models fitted via ordinary least squares and maximum likelihood estimation, respectively. The output from the macros includes the standard error of the difference between the two fitted values, a 95% confidence interval for the difference, and a corresponding statistical test with its p-value.
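The underlying matrix algebra is language-agnostic; a numpy sketch for the OLS case, with hypothetical covariate patterns, is shown below (the SPSS macros themselves are not reproduced).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(13)
n, p = 120, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
y = X @ np.array([1.0, 0.5, -0.2, 0.3]) + rng.normal(scale=0.8, size=n)

beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta
df = n - X.shape[1]
sigma2 = resid @ resid / df
cov_beta = sigma2 * np.linalg.inv(X.T @ X)            # covariance of the OLS coefficients

# Two arbitrary covariate patterns whose fitted values we want to compare.
x_a = np.array([1.0, 1.0, 0.0, 2.0])
x_b = np.array([1.0, -1.0, 1.0, 2.0])
c = x_a - x_b                                         # contrast vector

diff = c @ beta
se = np.sqrt(c @ cov_beta @ c)
t = diff / se
ci = (diff - stats.t.ppf(0.975, df) * se, diff + stats.t.ppf(0.975, df) * se)
p_value = 2 * stats.t.sf(abs(t), df)
print(f"difference = {diff:.3f}, SE = {se:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f}), p = {p_value:.4f}")
```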
NASA Astrophysics Data System (ADS)
Setiyorini, Anis; Suprijadi, Jadi; Handoko, Budhi
2017-03-01
Geographically Weighted Regression (GWR) is a regression model that takes into account the effect of spatial heterogeneity. In applications of GWR, inference on regression coefficients is often of interest, as is estimation and prediction of the response variable. Empirical research has demonstrated that local correlation between explanatory variables can lead to estimated regression coefficients in GWR that are strongly correlated, a condition named multicollinearity. This in turn results in large standard errors for the estimated regression coefficients and is hence problematic for inference on relationships between variables. Geographically Weighted Lasso (GWL) is a method capable of dealing with spatial heterogeneity and local multicollinearity in spatial data sets. GWL is a further development of the GWR method that adds a LASSO (Least Absolute Shrinkage and Selection Operator) constraint to parameter estimation. In this study, GWL is applied with a fixed exponential kernel weight matrix to build a poverty model for Java Island, Indonesia. The results of applying GWL to the poverty datasets show that this method stabilizes regression coefficients in the presence of multicollinearity and produces lower prediction and estimation error of the response variable than GWR does.
An improved multiple linear regression and data analysis computer program package
NASA Technical Reports Server (NTRS)
Sidik, S. M.
1972-01-01
NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.
Global estimation of long-term persistence in annual river runoff
NASA Astrophysics Data System (ADS)
Markonis, Y.; Moustakis, Y.; Nasika, C.; Sychova, P.; Dimitriadis, P.; Hanel, M.; Máca, P.; Papalexiou, S. M.
2018-03-01
Long-term persistence (LTP) of annual river runoff is a topic of ongoing hydrological research, due to its implications for water resources management. Here, we estimate its strength, measured by the Hurst coefficient H, in 696 annual, globally distributed, streamflow records with at least 80 years of data. We use three estimation methods (maximum likelihood estimator, Whittle estimator and least squares variance), resulting in similar mean values of H close to 0.65. Subsequently, we explore potential factors influencing H by two linear (Spearman's rank correlation, multiple linear regression) and two non-linear (self-organizing maps, random forests) techniques. Catchment area is found to be crucial for medium to larger watersheds, while climatic controls, such as the aridity index, have a higher impact on smaller ones. Our findings indicate that long-term persistence is weaker than found in other studies, suggesting that enhanced LTP is encountered in large-catchment rivers, where the effect of spatial aggregation is more intense. However, we also show that the estimated values of H can be reproduced by a short-term persistence stochastic model such as an auto-regressive AR(1) process. A direct consequence is that some of the most common methods for the estimation of the H coefficient might not be suitable for discriminating short- and long-term persistence even in long observational records.
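One family of estimators related to the least-squares-variance approach named above, the aggregated-variance method, can be sketched in numpy as follows; the block sizes and the AR(1) benchmark series are illustrative choices, not the study's configuration.

```python
import numpy as np

def hurst_aggvar(x, block_sizes=(2, 4, 8, 16, 32)):
    """Estimate the Hurst coefficient H by the aggregated-variance method:
    Var(block means) scales as m^(2H - 2), so H = 1 + slope / 2."""
    x = np.asarray(x, dtype=float)
    log_m, log_v = [], []
    for m in block_sizes:
        k = len(x) // m
        if k < 2:
            continue                                   # need at least two blocks
        means = x[:k * m].reshape(k, m).mean(axis=1)
        log_m.append(np.log(m))
        log_v.append(np.log(means.var(ddof=1)))
    slope = np.polyfit(log_m, log_v, 1)[0]
    return 1.0 + slope / 2.0

# Benchmark: an AR(1) series (short-term persistence only) of annual-record length.
rng = np.random.default_rng(17)
n, phi = 120, 0.3
ar1 = np.zeros(n)
for t in range(1, n):
    ar1[t] = phi * ar1[t - 1] + rng.normal()
print(f"estimated H for AR(1): {hurst_aggvar(ar1):.2f}")   # can exceed 0.5 even without LTP
```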
NASA Astrophysics Data System (ADS)
Zhan, Liwei; Li, Chengwei
2017-02-01
A hybrid PSO-SVM-based model is proposed to predict the friction coefficient between aircraft tire and coating. The presented hybrid model combines a support vector machine (SVM) with the particle swarm optimization (PSO) technique. SVM has been adopted to solve regression problems successfully. Its regression accuracy depends greatly on the optimization of parameters such as the regularization constant C, the RBF kernel parameter γ, and the epsilon parameter ε in the SVM training procedure. However, SVM-based prediction of the friction coefficient between aircraft tire and coating has yet to be explored. The experiment reveals that drop height and tire rotational speed are the factors affecting the friction coefficient. With this in mind, the friction coefficient can be predicted by the hybrid PSO-SVM-based model from the measured friction coefficients between aircraft tire and coating. To compare regression accuracy, a grid search (GS) method and a genetic algorithm (GA) are used to optimize the relevant parameters (C, γ and ε), respectively. The regression accuracy is reflected by the coefficient of determination (R²). The result shows that the hybrid PSO-RBF-SVM-based model has better accuracy compared with the GS-RBF-SVM- and GA-RBF-SVM-based models. The agreement of this model (PSO-RBF-SVM) with experimental data confirms its good performance.
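Of the three tuning strategies compared above, the grid search (GS) baseline is the simplest to sketch with scikit-learn over the three SVR hyperparameters named in the abstract (C, γ, ε); the drop-height and rotational-speed features and the friction values below are placeholders, and the PSO and GA variants are not implemented here.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(9)
X = rng.uniform([0.1, 10], [1.0, 200], size=(60, 2))       # [drop height, tire rotational speed]
y = 0.6 - 0.2 * X[:, 0] + 0.001 * X[:, 1] + rng.normal(scale=0.02, size=60)   # mock friction coefficient

param_grid = {
    "svr__C": [1, 10, 100],
    "svr__gamma": [0.01, 0.1, 1.0],
    "svr__epsilon": [0.001, 0.01, 0.1],
}
pipe = make_pipeline(StandardScaler(), SVR(kernel="rbf"))
search = GridSearchCV(pipe, param_grid, cv=5, scoring="r2").fit(X, y)
print(search.best_params_, round(search.best_score_, 3))   # best (C, gamma, epsilon) and CV R^2
```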
Mookprom, S; Boonkum, W; Kunhareang, S; Siripanya, S; Duangjinda, M
2017-02-01
The objective of this research is to investigate appropriate random regression models with various covariance functions for the genetic evaluation of test-day egg production. Data included 7,884 monthly egg production records from 657 Thai native chickens (Pradu Hang Dam) that were obtained during the first to sixth generation and were born during 2007 to 2014 at the Research and Development Network Center for Animal Breeding (Native Chickens), Khon Kaen University. Average annual and monthly egg productions were 117 ± 41 and 10.20 ± 6.40 eggs, respectively. Nine random regression models were analyzed using the Wilmink function (WM), the Koops and Grossman function (KG), Legendre polynomial functions of second, third, and fourth order (LG2, LG3, LG4), and spline functions with 4, 5, 6, and 8 knots (SP4, SP5, SP6, and SP8). All covariance functions were nested within the same additive genetic and permanent environmental random effects, and the variance components were estimated by Restricted Maximum Likelihood (REML). In model comparisons, mean square error (MSE) and the coefficient of determination (R²) were used to measure goodness of fit, and the correlation between observed and predicted values [Formula: see text] was used to assess cross-validated predictive ability. We found that the covariance functions of SP5, SP6, and SP8 proved appropriate for the genetic evaluation of the egg production curves of Thai native chickens. The estimated heritability of monthly egg production ranged from 0.07 to 0.39, and the highest heritability was found during the first to third months of egg production. In conclusion, spline functions within monthly egg production can be applied to breeding programs for the improvement of both egg number and persistence of egg production. © 2016 Poultry Science Association Inc.
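For orientation, Legendre polynomial covariates of the kind used in the LG2-LG4 models can be generated as in the numpy sketch below; the normalization and the test-day grid are common conventions assumed for illustration, not necessarily the exact parameterization used in the study, and the REML variance-component estimation itself is not shown.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_covariates(dim, dim_min=1, dim_max=365, order=3):
    """Legendre polynomial covariates of a test-day record,
    with days (or months) in production standardized to [-1, 1]."""
    t = -1.0 + 2.0 * (np.asarray(dim, dtype=float) - dim_min) / (dim_max - dim_min)
    cols = []
    for j in range(order + 1):
        coef = np.zeros(j + 1)
        coef[j] = 1.0                                   # select the degree-j Legendre polynomial
        norm = np.sqrt((2.0 * j + 1.0) / 2.0)           # normalization common in RR models
        cols.append(norm * legendre.legval(t, coef))
    return np.column_stack(cols)

# Covariates for monthly test days over a laying period, third-order fit (LG3-style).
print(legendre_covariates(np.arange(30, 361, 30), order=3).shape)   # (12, 4)
```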
Estimation of the Nonlinear Random Coefficient Model when Some Random Effects Are Separable
ERIC Educational Resources Information Center
du Toit, Stephen H. C.; Cudeck, Robert
2009-01-01
A method is presented for marginal maximum likelihood estimation of the nonlinear random coefficient model when the response function has some linear parameters. This is done by writing the marginal distribution of the repeated measures as a conditional distribution of the response given the nonlinear random effects. The resulting distribution…
Kawalilak, C E; Lanovaz, J L; Johnston, J D; Kontulainen, S A
2014-09-01
To assess the linearity and sex-specificity of damping coefficients used in a single-damper model (SDM) when predicting impact forces during the worst-case falling scenario from fall heights up to 25 cm. Using 3-dimensional motion tracking and an integrated force plate, impact forces and impact velocities were assessed from 10 young adults (5 males; 5 females), falling from planted knees onto outstretched arms, from a random order of drop heights: 3, 5, 7, 10, 15, 20, and 25 cm. We assessed the linearity and sex-specificity between impact forces and impact velocities across all fall heights using an analysis of variance linearity test and linear regression, respectively. Significance was accepted at P < 0.05. The association between impact forces and impact velocities up to 25 cm was linear (P = 0.02). Damping coefficients appeared sex-specific (males: 627 Ns/m, R² = 0.70; females: 421 Ns/m, R² = 0.81; sexes combined: 532 Ns/m, R² = 0.61). A linear damping coefficient used in the SDM proved valid for predicting impact forces from fall heights up to 25 cm. Results suggested the use of sex-specific damping coefficients when estimating impact force using the SDM and calculating the factor-of-risk for wrist fractures.
Zhao, Yu Xi; Xie, Ping; Sang, Yan Fang; Wu, Zi Yi
2018-04-01
Hydrological processes are temporally dependent. Hydrological time series that include dependence components do not meet the data consistency assumption for hydrological computation. Both of these factors cause great difficulty for water research. Given the existence of hydrological dependence variability, we proposed a correlation-coefficient-based method for evaluating the significance of hydrological dependence, based on auto-regression models. By calculating the correlation coefficient between the original series and its dependence component and selecting reasonable thresholds of the correlation coefficient, this method divides the significance of dependence into five degrees: no variability, weak variability, mid variability, strong variability, and drastic variability. By deducing the relationship between the correlation coefficient and the auto-correlation coefficients of each order of the series, we found that the correlation coefficient is mainly determined by the magnitude of the auto-correlation coefficients from order 1 to order p, which clarifies the theoretical basis of this method. With first-order and second-order auto-regression models as examples, the reasonableness of the deduced formula was verified through Monte Carlo experiments on the relationship between the correlation coefficient and the auto-correlation coefficient. This method was used to analyze three observed hydrological time series. The results indicated the coexistence of stochastic and dependence characteristics in hydrological processes.
Adaptive threshold shearlet transform for surface microseismic data denoising
NASA Astrophysics Data System (ADS)
Tang, Na; Zhao, Xian; Li, Yue; Zhu, Dan
2018-06-01
Random noise suppression plays an important role in microseismic data processing. Microseismic data are often corrupted by strong random noise, which directly influences the identification and location of microseismic events. The shearlet transform is a new multiscale transform that can effectively process low-magnitude microseismic data. In the shearlet domain, because valid signals and random noise have different distributions, shearlet coefficients can be shrunk by thresholding. The threshold is therefore vital in suppressing random noise. Conventional threshold denoising algorithms usually use the same threshold to process all coefficients, which causes inefficient noise suppression or loss of valid signal. In order to solve these problems, we propose the adaptive threshold shearlet transform (ATST) for surface microseismic data denoising. In the new algorithm, we first calculate a fundamental threshold for each directional subband. Within each subband, an adjustment factor is obtained from each coefficient and its neighboring coefficients, in order to adaptively regulate the fundamental threshold for different shearlet coefficients. Finally, the adaptive threshold is applied to the shearlet coefficients. The experimental denoising results on synthetic records and field data illustrate that the proposed method exhibits better performance in suppressing random noise and preserving valid signal than the conventional shearlet denoising method.
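Shearlet transforms require a dedicated library, so the sketch below applies only the adaptive-threshold idea to a generic array of subband coefficients: a universal-style base threshold is scaled by a neighbourhood-energy adjustment factor, shrinking the threshold where local energy suggests signal and keeping it large where noise dominates. The adjustment rule and all data here are assumptions for illustration, not the authors' exact formula.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def adaptive_soft_threshold(coeffs, noise_sigma, window=3):
    """Soft-threshold subband coefficients with a locally adapted threshold."""
    base = noise_sigma * np.sqrt(2.0 * np.log(coeffs.size))           # universal base threshold
    local_energy = uniform_filter(coeffs ** 2, size=window)           # neighbourhood energy
    # Adjustment factor in (0, 1]: near 1 where only noise is present,
    # small where the local energy (likely signal) is high.
    adjustment = noise_sigma ** 2 / (local_energy + noise_sigma ** 2)
    threshold = base * adjustment
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - threshold, 0.0)

# Toy subband: a sparse 'event' plus random noise.
rng = np.random.default_rng(2)
subband = rng.normal(scale=0.1, size=(64, 64))
subband[30:34, 20:40] += 1.0                                          # simulated microseismic arrival
denoised = adaptive_soft_threshold(subband, noise_sigma=0.1)
```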
The Mycotic Ulcer Treatment Trial
Prajna, N. Venkatesh; Krishnan, Tiruvengada; Mascarenhas, Jeena; Rajaraman, Revathi; Prajna, Lalitha; Srinivasan, Muthiah; Raghavan, Anita; Oldenburg, Catherine E.; Ray, Kathryn J.; Zegans, Michael E.; McLeod, Stephen D.; Porco, Travis C.; Acharya, Nisha R.; Lietman, Thomas M.
2013-01-01
Objective To compare topical natamycin vs voriconazole in the treatment of filamentous fungal keratitis. Methods This phase 3, double-masked, multicenter trial was designed to randomize 368 patients to voriconazole (1%) or natamycin (5%), applied topically every hour while awake until reepithelialization, then 4 times daily for at least 3 weeks. Eligibility included smear-positive filamentous fungal ulcer and visual acuity of 20/40 to 20/400. Main Outcome Measures The primary outcome was best spectacle-corrected visual acuity at 3 months; secondary outcomes included corneal perforation and/or therapeutic penetrating keratoplasty. Results A total of 940 patients were screened and 323 were enrolled. Causative organisms included Fusarium (128 patients [40%]), Aspergillus (54 patients [17%]), and other filamentous fungi (141 patients [43%]). Natamycin-treated cases had significantly better 3-month best spectacle-corrected visual acuity than voriconazole-treated cases (regression coefficient=−0.18 logMAR; 95% CI, −0.30 to −0.05; P=.006). Natamycin-treated cases were less likely to have perforation or require therapeutic penetrating keratoplasty (odds ratio=0.42; 95% CI, 0.22 to 0.80; P=.009). Fusarium cases fared better with natamycin than with voriconazole (regression coefficient=−0.41 logMAR; 95% CI, −0.61 to −0.20; P<.001; odds ratio for perforation=0.06; 95% CI, 0.01 to 0.28; P<.001), while non-Fusarium cases fared similarly (regression coefficient=−0.02 logMAR; 95% CI, −0.17 to 0.13; P=.81; odds ratio for perforation=1.08; 95% CI, 0.48 to 2.43; P=.86). Conclusions Natamycin treatment was associated with significantly better clinical and microbiological outcomes than voriconazole treatment for smear-positive filamentous fungal keratitis, with much of the difference attributable to improved results in Fusarium cases. Application to Clinical Practice Voriconazole should not be used as monotherapy in filamentous keratitis. Trial Registration clinicaltrials.gov Identifier: NCT00996736 PMID:23710492
NASA Astrophysics Data System (ADS)
Sanchez Rivera, Yamil
The purpose of this study is to add to what we know about the affective domain and to create a valid instrument for future studies. The Motivation to Learn Science (MLS) Inventory is based on Krathwohl's Taxonomy of Affective Behaviors (Krathwohl et al., 1964). The results of the Exploratory Factor Analysis (EFA) and Confirmatory Factor Analysis (CFA) demonstrated that the MLS Inventory is a valid and reliable instrument. Therefore, the MLS Inventory is a uni-dimensional instrument composed of 9 items with convergent validity (no divergence). The instrument had a high Cronbach's alpha value of .898 in the EFA analysis and .919 in the CFA analysis. Factor loadings on the 9 items ranged from .617 to .800. Standardized regression weights ranged from .639 to .835 in the CFA analysis. Various indices (RMSEA = .033; NFI = .987; GFI = .985; CFI = 1.000) demonstrated a good fit of the proposed model. Hierarchical linear modeling was used to analyze data in which students' motivation to learn science scores (level 1) were nested within teachers (level 2). The analysis was geared toward identifying whether teachers' use of affective behavior (a level-2 classroom variable) was significantly related to students' MLS scores (the level-1 criterion variable). Model testing proceeded in three phases: an intercept-only model, a means-as-outcome model, and a random-coefficient regression model. The intercept-only model revealed an intraclass correlation coefficient of .224 with an estimated reliability of .726. The data therefore suggested that only 22.4% of the variance in MLS scores is between classes and the remaining 77.6% is at the student level. Because of the significant variance in MLS scores, χ²(62.756, p < .0001), teachers' TAB scores were added as a level-2 predictor. The regression coefficient was non-significant (p > .05). Therefore, the teachers' self-reported use of affective behaviors was not a significant predictor of students' motivation to learn science.
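A minimal statsmodels sketch of the intercept-only (unconditional) model and the intraclass correlation described above, on simulated placeholder data; MixedLM is used here in place of the HLM software, and the column names and simulated effect sizes are assumptions.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n_teachers, n_students = 30, 25
teacher = np.repeat(np.arange(n_teachers), n_students)
teacher_effect = rng.normal(scale=0.6, size=n_teachers)[teacher]
mls = 3.5 + teacher_effect + rng.normal(scale=1.0, size=teacher.size)   # mock MLS scores
data = pd.DataFrame({"mls": mls, "teacher": teacher})

# Intercept-only (unconditional) multilevel model: mls ~ 1 with a random intercept per teacher.
fit = smf.mixedlm("mls ~ 1", data, groups=data["teacher"]).fit()
between = fit.cov_re.iloc[0, 0]          # between-teacher variance
within = fit.scale                       # residual (student-level) variance
icc = between / (between + within)
print(f"intraclass correlation: {icc:.3f}")
```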
Threshold regression to accommodate a censored covariate.
Qian, Jing; Chiou, Sy Han; Maye, Jacqueline E; Atem, Folefac; Johnson, Keith A; Betensky, Rebecca A
2018-06-22
In several common study designs, regression modeling is complicated by the presence of censored covariates. Examples of such covariates include maternal age of onset of dementia that may be right censored in an Alzheimer's amyloid imaging study of healthy subjects, metabolite measurements that are subject to limit of detection censoring in a case-control study of cardiovascular disease, and progressive biomarkers whose baseline values are of interest, but are measured post-baseline in longitudinal neuropsychological studies of Alzheimer's disease. We propose threshold regression approaches for linear regression models with a covariate that is subject to random censoring. Threshold regression methods allow for immediate testing of the significance of the effect of a censored covariate. In addition, they provide for unbiased estimation of the regression coefficient of the censored covariate. We derive the asymptotic properties of the resulting estimators under mild regularity conditions. Simulations demonstrate that the proposed estimators have good finite-sample performance, and often offer improved efficiency over existing methods. We also derive a principled method for selection of the threshold. We illustrate the approach in application to an Alzheimer's disease study that investigated brain amyloid levels in older individuals, as measured through positron emission tomography scans, as a function of maternal age of dementia onset, with adjustment for other covariates. We have developed an R package, censCov, for implementation of our method, available at CRAN. © 2018, The International Biometric Society.
Association between economic fluctuations and road mortality in OECD countries.
Chen, Gang
2014-08-01
Using longitudinal data from 32 Organization for Economic Co-operation and Development (OECD) countries (1970-2010), this article investigates the association between annual variations in road mortality and economic fluctuations. Two regression models (fixed-effects and random-coefficients) were adopted for estimation. The cross-country analyses suggested that road mortality is pro-cyclical and that the cyclicality is symmetric. Based on data from 32 OECD countries, a 1% increase in economic growth is associated, on average, with a 1.1% increase in road mortality, and vice versa. © The Author 2014. Published by Oxford University Press on behalf of the European Public Health Association. All rights reserved.
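The random-coefficients specification mentioned above can be sketched as a mixed model with a country-specific intercept and slope for economic growth. The sketch below uses simulated data and assumed variable names (country, gdp_growth, log_mortality), not the OECD panel itself.

```python
# Illustrative random-coefficients (random slope) model on a simulated country panel.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
countries = np.repeat(np.arange(32), 41)                    # 32 countries, 41 years
growth = rng.normal(2, 2, countries.size)                   # annual GDP growth (%)
slope = 0.011 + rng.normal(0, 0.003, 32)[countries]         # country-specific elasticity
log_mort = 2.0 + rng.normal(0, 0.2, 32)[countries] + slope * growth \
           + rng.normal(0, 0.05, countries.size)
df = pd.DataFrame({"country": countries, "gdp_growth": growth, "log_mortality": log_mort})

# Random intercept and random slope for gdp_growth by country.
fit = smf.mixedlm("log_mortality ~ gdp_growth", df,
                  groups=df["country"], re_formula="~gdp_growth").fit()
print(fit.params["gdp_growth"])   # average change in log mortality per 1% growth
```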
Detection of Cutting Tool Wear using Statistical Analysis and Regression Model
NASA Astrophysics Data System (ADS)
Ghani, Jaharah A.; Rizal, Muhammad; Nuawi, Mohd Zaki; Haron, Che Hassan Che; Ramli, Rizauddin
2010-10-01
This study presents a new method for detecting cutting tool wear based on measured cutting force signals. A statistics-based method, the Integrated Kurtosis-based Algorithm for Z-Filter technique (I-kaz), was used to develop a regression model and a 3D graphic presentation of the I-kaz 3D coefficient during the machining process. The machining tests were carried out on a Colchester Master Tornado T4 CNC turning machine under dry cutting conditions. A Kistler 9255B dynamometer was used to measure the cutting force signals, which were transmitted, analyzed, and displayed in the DasyLab software. Various force signals from the machining operation were analyzed, each with its own I-kaz 3D coefficient. This coefficient was examined and its relationship with the flank wear land (VB) was determined. A regression model was developed from this relationship, and its results show that the I-kaz 3D coefficient decreases as tool wear increases. The result can then be used for real-time tool wear monitoring.
SCI model structure determination program (OSR) user's guide. [optimal subset regression
NASA Technical Reports Server (NTRS)
1979-01-01
The computer program OSR (Optimal Subset Regression), which estimates models for rotorcraft body and rotor force and moment coefficients, is described. The technique is based on a subset regression algorithm. Given time histories of aerodynamic coefficients, aerodynamic variables, and control inputs, the program computes correlations between the various time histories, and the model structure determination is based on these correlations. Inputs and outputs of the program are given.
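As a loose analogue of the subset-regression idea (not the OSR program itself), the sketch below exhaustively fits OLS models on every subset of a few candidate regressors and keeps the subset with the best adjusted R-squared; data and variable names are simulated placeholders.

```python
# Exhaustive best-subset regression on simulated data.
from itertools import combinations
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n, names = 200, ["x1", "x2", "x3", "x4"]
X = rng.normal(size=(n, 4))
y = 1.0 + 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(0, 1, n)   # only x1 and x3 matter

best = None
for k in range(1, len(names) + 1):
    for cols in combinations(range(len(names)), k):
        fit = sm.OLS(y, sm.add_constant(X[:, list(cols)])).fit()
        if best is None or fit.rsquared_adj > best[0]:
            best = (fit.rsquared_adj, [names[c] for c in cols])
print("best subset:", best[1], "adjusted R2:", round(best[0], 3))
```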
A Structural Modeling Approach to a Multilevel Random Coefficients Model.
ERIC Educational Resources Information Center
Rovine, Michael J.; Molenaar, Peter C. M.
2000-01-01
Presents a method for estimating the random coefficients model using covariance structure modeling and allowing one to estimate both fixed and random effects. The method is applied to real and simulated data, including marriage data from J. Belsky and M. Rovine (1990). (SLD)
Ecotoxicology of phenylphosphonothioates.
Francis, B M; Hansen, L G; Fukuto, T R; Lu, P Y; Metcalf, R L
1980-01-01
The phenylphosphonothioate insecticides EPN and leptophos, and several analogs, were evaluated with respect to their delayed neurotoxic effects in hens and their environmental behavior in a terrestrial-aquatic model ecosystem. Acute toxicity to insects was highly correlated with Σσ of the substituted phenyl group (regression coefficient r = -0.91), acute toxicity to mammals was slightly less well correlated (regression coefficient r = -0.71), and neurotoxicity was poorly correlated with Σσ (regression coefficient r = -0.35). Both EPN and leptophos were markedly more persistent and bioaccumulative in the model ecosystem than parathion. Desbromoleptophos, a contaminant and metabolite of leptophos, was a highly stable and persistent terminal residue of leptophos. PMID:6159210
Feedback on oral presentations during pediatric clerkships: a randomized controlled trial.
Sox, Colin M; Dell, Michael; Phillipi, Carrie A; Cabral, Howard J; Vargas, Gabriela; Lewin, Linda O
2014-11-01
To measure the effects of participating in structured oral presentation evaluation sessions early in pediatric clerkships on students' subsequent presentations, we conducted a single-blind, 3-arm, cluster randomized controlled trial during pediatric clerkships at Boston University School of Medicine, University of Maryland School of Medicine, Oregon Health & Science University, and Case Western Reserve University School of Medicine. Blocks of students at each school were randomly assigned to (1) no formal presentation feedback (control) or to a small-group presentation feedback session early in the pediatric clerkship in which students gave live presentations and received feedback from faculty who rated their presentations using either (2) a single-item (simple) or (3) an 18-item (detailed) evaluation form. At the end of the clerkship, the overall quality of subjects' presentations was rated by faculty blinded to randomization status, and subjects reported whether their presentations had improved. Analyses included multivariable linear and logistic regressions clustered on clerkship block that controlled for medical school. A total of 476 participants were evenly divided into the 3 arms, which had similar characteristics. Compared with controls, presentation quality was significantly associated with participating in detailed (coefficient: 0.38; 95% confidence interval [CI]: 0.07-0.69) but not simple (coefficient: 0.16; 95% CI: -0.12 to 0.43) feedback sessions. Similarly, student self-report of presentation improvement was significantly associated with participating in detailed (odds ratio: 2.16; 95% CI: 1.11-4.18) but not simple (odds ratio: 1.89; 95% CI: 0.91-3.93) feedback sessions. Small-group presentation feedback sessions led by faculty using a detailed evaluation form resulted in clerkship students delivering oral presentations of higher quality compared with controls. Copyright © 2014 by the American Academy of Pediatrics.
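A minimal sketch of an analysis "clustered on clerkship block" as described above: an OLS model with cluster-robust standard errors on simulated data. The arm coding, effect sizes, and column names are assumptions, not the trial's data.

```python
# OLS with cluster-robust standard errors for a cluster-randomized design.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
block = np.repeat(np.arange(60), 8)              # 60 clerkship blocks, 8 students each
arm = rng.integers(0, 3, 60)[block]              # 0=control, 1=simple, 2=detailed (by block)
quality = 3.5 + 0.16 * (arm == 1) + 0.38 * (arm == 2) \
          + rng.normal(0, 0.3, 60)[block] + rng.normal(0, 0.8, block.size)
df = pd.DataFrame({"quality": quality, "arm": arm, "block": block})

fit = smf.ols("quality ~ C(arm)", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["block"]})
print(fit.summary().tables[1])
```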
Kitagawa, Yasuhisa; Teramoto, Tamio; Daida, Hiroyuki
2012-01-01
We evaluated the impact of adherence to preferable behaviors on serum lipid control, assessed by a self-reported questionnaire, in high-risk patients taking pravastatin for the primary prevention of coronary artery disease. High-risk patients taking pravastatin were followed for 2 years. Questionnaire surveys comprising 21 questions, including 18 questions concerning awareness of health and the current status of diet, exercise, and drug therapy, were conducted at baseline and after 1 year. Potential domains were established by factor analysis of the questionnaire results, and adherence scores were calculated for each domain. The relationship between adherence scores and lipid values during the 1-year treatment period was analyzed for each domain using multiple regression analysis. A total of 5,792 patients taking pravastatin were included in the analysis. Multiple regression analysis showed a significant correlation for "Intake of high fat/cholesterol/sugar foods" (regression coefficient -0.58, p=0.0105) and "Adherence to instructions for drug therapy" (regression coefficient -6.61, p<0.0001). Low-density lipoprotein cholesterol (LDL-C) values were significantly lower in patients whose adherence score in the "Awareness of health" domain increased than in those whose score decreased. There was a significant correlation between high-density lipoprotein cholesterol (HDL-C) values and "Awareness of health" (regression coefficient 0.26; p=0.0037), "Preferable dietary behaviors" (regression coefficient 0.75; p<0.0001), and "Exercise" (regression coefficient 0.73; p=0.0002). Similar relations were seen for triglycerides. A high awareness of one's health, a positive attitude toward lipid-lowering treatment including diet and exercise, and high adherence to drug therapy are related to favorable overall lipid control, even in patients under treatment with pravastatin.
Tsai, Alexander C.; Tomlinson, Mark; Comulada, W. Scott; Rotheram-Borus, Mary Jane
2016-01-01
Background Violence against women by intimate partners remains unacceptably common worldwide. The evidence base for the assumed psychological impacts of intimate partner violence (IPV) is derived primarily from studies conducted in high-income countries. A recently published systematic review identified 13 studies linking IPV to incident depression, none of which were conducted in sub-Saharan Africa. To address this gap in the literature, we analyzed longitudinal data collected during the course of a 3-y cluster-randomized trial with the aim of estimating the association between IPV and depression symptom severity. Methods and Findings We conducted a secondary analysis of population-based, longitudinal data collected from 1,238 pregnant women during a 3-y cluster-randomized trial of a home visiting intervention in Cape Town, South Africa. Surveys were conducted at baseline, 6 mo, 18 mo, and 36 mo (85% retention). The primary explanatory variable of interest was exposure to four types of physical IPV in the past year. Depression symptom severity was measured using the Xhosa version of the ten-item Edinburgh Postnatal Depression Scale. In a pooled cross-sectional multivariable regression model adjusting for potentially confounding time-fixed and time-varying covariates, lagged IPV intensity had a statistically significant association with depression symptom severity (regression coefficient b = 1.04; 95% CI, 0.61–1.47), with estimates from a quantile regression model showing greater adverse impacts at the upper end of the conditional depression distribution. Fitting a fixed effects regression model accounting for all time-invariant confounding (e.g., history of childhood sexual abuse) yielded similar findings (b = 1.54; 95% CI, 1.13–1.96). The magnitudes of the coefficients indicated that a one–standard-deviation increase in IPV intensity was associated with a 12.3% relative increase in depression symptom severity over the same time period. The most important limitations of our study include exposure assessment that lacked measurement of sexual violence, which could have caused us to underestimate the severity of exposure; the extended latency period in the lagged analysis, which could have caused us to underestimate the strength of the association; and outcome assessment that was limited to the use of a screening instrument for depression symptom severity. Conclusions In this secondary analysis of data from a population-based, 3-y cluster-randomized controlled trial, IPV had a statistically significant association with depression symptom severity. The estimated associations were relatively large in magnitude, consistent with findings from high-income countries, and robust to potential confounding by time-invariant factors. Intensive health sector responses to reduce IPV and improve women’s mental health should be explored. PMID:26784110
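The fixed-effects model described above can be illustrated by the within transformation, which removes all time-invariant confounding by demeaning each woman's repeated measurements. The sketch below uses simulated data and hypothetical names (pid, ipv_lag, epds), and omits the cluster-robust standard errors a full analysis would require.

```python
# Fixed-effects (within) estimator via person-level demeaning on simulated panel data.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(4)
pid = np.repeat(np.arange(300), 4)                  # 4 survey waves per woman
alpha = rng.normal(0, 2, 300)[pid]                  # time-invariant heterogeneity
ipv = rng.poisson(1.0, pid.size)                    # lagged IPV intensity (toy)
epds = 8 + alpha + 1.5 * ipv + rng.normal(0, 2, pid.size)
df = pd.DataFrame({"pid": pid, "ipv_lag": ipv, "epds": epds})

# Within transformation: subtract each person's own means, then fit OLS.
demeaned = df[["ipv_lag", "epds"]] - df.groupby("pid")[["ipv_lag", "epds"]].transform("mean")
fe_fit = sm.OLS(demeaned["epds"], demeaned[["ipv_lag"]]).fit()
print(fe_fit.params["ipv_lag"])   # within-person association estimate
```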
ERIC Educational Resources Information Center
Kong, Nan
2007-01-01
In multivariate statistics, the linear relationship among random variables has been fully explored in the past. This paper looks into the dependence of one group of random variables on another group of random variables using (conditional) entropy. A new measure, called the K-dependence coefficient or dependence coefficient, is defined using…
Zhao, Yang; Zhang, Xue Qing; Bian, Xiao Dong
2018-01-01
To investigate the early recruitment processes of fishery resources in the Bohai Sea, geographically weighted regression (GWR) was introduced into the habitat suitability index (HSI) model. An HSI GWR model for larval Japanese halfbeak in the Bohai Sea was established with four environmental variables: sea surface temperature (SST), sea surface salinity (SSS), water depth (DEP), and chlorophyll a concentration (Chl a). The simulation showed that the four variables performed differently in August 2015. SST and Chl a were global variables and had little impact on HSI, with regression coefficients of -0.027 and 0.006, respectively. SSS and DEP were local variables and had larger impacts on HSI, with average absolute regression coefficients of 0.075 and 0.129, respectively. In the central Bohai Sea, SSS showed a negative correlation with HSI, and the most negative correlation coefficient was -0.3. In contrast, SSS was positively but weakly correlated with HSI in the three bays of the Bohai Sea, where the largest correlation coefficient was 0.1. DEP and HSI were negatively correlated over the entire Bohai Sea, and more strongly so in the three bays than in the central Bohai Sea, with the most negative correlation coefficient of -0.16 in the three bays. The Poisson regression coefficient of the HSI GWR model was 0.705, consistent with field measurements. The approach could therefore provide a new method for future research on fish habitats.
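Geographically weighted regression can be sketched, under simplifying assumptions, as a set of locally weighted least-squares fits with a Gaussian distance kernel, so that a coefficient such as the one for depth can vary across locations. The data, variable layout, and bandwidth below are toy values, not the Bohai Sea survey.

```python
# Hand-rolled GWR sketch: one weighted least-squares fit per location.
import numpy as np

rng = np.random.default_rng(5)
n = 400
coords = rng.uniform(0, 100, size=(n, 2))                      # station locations
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])     # intercept, "SSS", "DEP"
true_b_dep = -0.1 - 0.001 * coords[:, 0]                       # spatially varying effect
y = 0.05 * X[:, 1] + true_b_dep * X[:, 2] + rng.normal(0, 0.05, n)

def gwr_coefficients(X, y, coords, bandwidth=20.0):
    betas = np.empty((len(y), X.shape[1]))
    for i, c in enumerate(coords):
        d2 = ((coords - c) ** 2).sum(axis=1)
        w = np.exp(-d2 / (2 * bandwidth ** 2))                 # Gaussian kernel weights
        W = np.diag(w)
        betas[i] = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)   # local WLS fit
    return betas

local_betas = gwr_coefficients(X, y, coords)
print("range of local DEP coefficients:",
      local_betas[:, 2].min().round(3), local_betas[:, 2].max().round(3))
```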
Efficient sampling of complex network with modified random walk strategies
NASA Astrophysics Data System (ADS)
Xie, Yunya; Chang, Shuhua; Zhang, Zhipeng; Zhang, Mi; Yang, Lei
2018-02-01
We present two novel random walk strategies: the choosing seed node (CSN) random walk and the no-retracing (NR) random walk. Unlike classical random walk sampling, the CSN and NR strategies focus on the influence of the seed node choice and of path overlap, respectively. The three random walk samplings are applied to the Erdős-Rényi (ER), Barabási-Albert (BA), Watts-Strogatz (WS), and weighted USAir networks, and the major properties of the sampled subnets, such as sampling efficiency, degree distribution, average degree, and average clustering coefficient, are studied. Similar conclusions are reached with all three random walk strategies. First, networks with small scale and simple structure are conducive to sampling. Second, the average degree and the average clustering coefficient of the sampled subnet tend toward the corresponding values of the original networks within a limited number of steps. Third, all degree distributions of the subnets are slightly biased toward the high-degree side. However, the NR strategy performs better for the average clustering coefficient of the subnet. In the real weighted USAir network, characteristics such as the larger clustering coefficient and the fluctuation of the degree distribution are reproduced well by these random walk strategies.
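A plausible reading of the no-retracing (NR) strategy, assuming it simply forbids the walker from immediately stepping back along the edge it just used, can be sketched with networkx as follows; the Barabási-Albert test graph and step count are arbitrary choices, not the paper's setup.

```python
# No-retracing random walk sampler on a toy network (networkx).
import random
import networkx as nx

def no_retracing_walk(G, seed_node, steps):
    sampled = {seed_node}
    prev, current = None, seed_node
    for _ in range(steps):
        neighbors = [v for v in G.neighbors(current) if v != prev]
        if not neighbors:                       # dead end: fall back to allowing retracing
            neighbors = list(G.neighbors(current))
        prev, current = current, random.choice(neighbors)
        sampled.add(current)
    return sampled

G = nx.barabasi_albert_graph(1000, 3, seed=42)
subnet = G.subgraph(no_retracing_walk(G, seed_node=0, steps=500))
print(len(subnet), "nodes sampled; average clustering:", nx.average_clustering(subnet))
```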
Genetic parameters of Legendre polynomials for first parity lactation curves.
Pool, M H; Janss, L L; Meuwissen, T H
2000-11-01
Variance components of the covariance function coefficients in a random regression test-day model were estimated by Legendre polynomials up to a fifth order for first-parity records of Dutch dairy cows using Gibbs sampling. Two Legendre polynomials of equal order were used to model the random part of the lactation curve, one for the genetic component and one for permanent environment. Test-day records from cows registered between 1990 and 1996 and collected by regular milk recording were available. For the data set, 23,700 complete lactations were selected from 475 herds, with daughters of 262 sires. Because the application of a random regression model is limited by computing capacity, we investigated the minimum order needed to fit the variance structure in the data sufficiently. Predictions of genetic and permanent environmental variance structures were compared with bivariate estimates on 30-d intervals. A third-order or higher polynomial modeled the shape of the variance curves over DIM with sufficient accuracy for the genetic and permanent environmental parts. The genetic correlation structure was also fitted with sufficient accuracy by a third-order polynomial, but a fourth order was needed for the permanent environmental component. Because equal orders are suggested in the literature, a fourth-order Legendre polynomial is recommended in this study. However, a rank of three for the genetic covariance matrix and of four for permanent environment allows a simpler covariance function with a reduced number of parameters, based on the eigenvalues and eigenvectors.
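The Legendre-polynomial covariates used in such random regression test-day models can be built by mapping days in milk onto [-1, 1] and evaluating the polynomials up to the chosen order. The sketch below uses standard, unnormalized Legendre polynomials; normalization conventions vary between implementations, and the test days shown are arbitrary.

```python
# Build Legendre polynomial covariates for a random regression test-day model.
import numpy as np
from numpy.polynomial import legendre

def legendre_covariates(dim, dim_min=5, dim_max=305, order=4):
    t = 2.0 * (dim - dim_min) / (dim_max - dim_min) - 1.0     # standardize DIM to [-1, 1]
    # Column j holds the j-th Legendre polynomial evaluated at each test day.
    return np.column_stack([legendre.legval(t, np.eye(order + 1)[j])
                            for j in range(order + 1)])

dim = np.array([5, 35, 65, 125, 185, 245, 305])
Z = legendre_covariates(dim)
print(Z.round(3))   # design columns for the genetic / permanent environmental curves
```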
Han, Min; Zhang, Yong; Sun, Shujuan; Wang, Zhongsu; Wang, Jiangrong; Xie, Xinxing; Gao, Mei; Yin, Xiangcui; Hou, Yinglong
2013-10-01
This study was designed to assess whether angiotensin-converting enzyme inhibitors (ACEIs) and angiotensin receptor blockers (ARBs) can prevent the recurrence of atrial fibrillation (AF). A systematic literature search of PubMed, EMBASE, and the Cochrane Controlled Trials Register up to 2012 was performed to identify randomized controlled trials on the prevention of AF recurrence with renin-angiotensin system blockade therapy. Subgroup analysis and meta-regression were performed, and publication bias was checked with a funnel plot and Egger's test. Twenty-one randomized controlled trials including 13,184 patients with AF were identified. Overall, the recurrence of AF was significantly reduced in patients using ACEIs/ARBs (odds ratio [OR], 0.43; 95% confidence interval [CI], 0.32-0.56; P < 0.00001), particularly in the irbesartan subgroup (OR, 0.38; 95% CI, 0.21-0.68; P = 0.001) and in patients receiving an antiarrhythmic drug (AAD) (OR, 0.37; 95% CI, 0.29-0.48; P < 0.00001); there was no significant difference between ACEIs and ARBs (ACEIs: OR, 0.42; 95% CI, 0.31-0.57; ARBs: OR, 0.42; 95% CI, 0.31-0.57). Moreover, meta-regression showed that the benefit of ACEIs/ARBs was correlated with systolic blood pressure (regression coefficient: -0.0700257, P = 0.000) in patients not receiving AADs. ACEIs/ARBs are effective for the secondary prevention of AF, especially in patients receiving AADs and in those with hypertension.
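The pooling of odds ratios across trials can be illustrated with a standard DerSimonian-Laird random-effects calculation on toy log odds ratios; the numbers below are illustrative and are not taken from the review.

```python
# DerSimonian-Laird random-effects pooling of trial log odds ratios (toy data).
import numpy as np
from scipy import stats

log_or = np.log(np.array([0.45, 0.38, 0.60, 0.35, 0.50]))
se = np.array([0.20, 0.25, 0.18, 0.30, 0.22])

w_fixed = 1.0 / se**2
mu_fixed = np.sum(w_fixed * log_or) / np.sum(w_fixed)
Q = np.sum(w_fixed * (log_or - mu_fixed) ** 2)                    # heterogeneity statistic
tau2 = max(0.0, (Q - (len(log_or) - 1)) /
           (w_fixed.sum() - (w_fixed**2).sum() / w_fixed.sum()))  # between-trial variance
w_re = 1.0 / (se**2 + tau2)
mu_re = np.sum(w_re * log_or) / np.sum(w_re)
se_re = np.sqrt(1.0 / np.sum(w_re))
ci = np.exp(mu_re + np.array([-1, 1]) * stats.norm.ppf(0.975) * se_re)
print("pooled OR:", np.exp(mu_re).round(2), "95% CI:", ci.round(2))
```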
Strategic Use of Random Subsample Replication and a Coefficient of Factor Replicability
ERIC Educational Resources Information Center
Katzenmeyer, William G.; Stenner, A. Jackson
1975-01-01
The problem of demonstrating replicability of factor structure across random variables is addressed. Procedures are outlined which combine the use of random subsample replication strategies with the correlations between factor score estimates across replicate pairs to generate a coefficient of replicability and confidence intervals associated with…
Austin, Peter C; Wagner, Philippe; Merlo, Juan
2017-03-15
Multilevel data occurs frequently in many research areas like health services research and epidemiology. A suitable way to analyze such data is through the use of multilevel regression models (MLRM). MLRM incorporate cluster-specific random effects which allow one to partition the total individual variance into between-cluster variation and between-individual variation. Statistically, MLRM account for the dependency of the data within clusters and provide correct estimates of uncertainty around regression coefficients. Substantively, the magnitude of the effect of clustering provides a measure of the General Contextual Effect (GCE). When outcomes are binary, the GCE can also be quantified by measures of heterogeneity like the Median Odds Ratio (MOR) calculated from a multilevel logistic regression model. Time-to-event outcomes within a multilevel structure occur commonly in epidemiological and medical research. However, the Median Hazard Ratio (MHR) that corresponds to the MOR in multilevel (i.e., 'frailty') Cox proportional hazards regression is rarely used. Analogously to the MOR, the MHR is the median relative change in the hazard of the occurrence of the outcome when comparing identical subjects from two randomly selected different clusters that are ordered by risk. We illustrate the application and interpretation of the MHR in a case study analyzing the hazard of mortality in patients hospitalized for acute myocardial infarction at hospitals in Ontario, Canada. We provide R code for computing the MHR. The MHR is a useful and intuitive measure for expressing cluster heterogeneity in the outcome and, thereby, estimating general contextual effects in multilevel survival analysis. © 2016 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
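Assuming the MHR described above takes the same functional form as the MOR, i.e. exp(sqrt(2·sigma²)·Φ⁻¹(0.75)) applied to the between-cluster (frailty) variance from the multilevel Cox model, a worked example looks like this; the variance value used is hypothetical.

```python
# Worked example of the median hazard ratio (MHR) from a frailty variance.
import numpy as np
from scipy.stats import norm

def median_hazard_ratio(sigma2_cluster):
    # MHR = exp( sqrt(2 * sigma^2) * Phi^{-1}(0.75) ), analogous to the MOR.
    return np.exp(np.sqrt(2.0 * sigma2_cluster) * norm.ppf(0.75))

print(median_hazard_ratio(0.25))   # e.g. a between-hospital variance of 0.25 gives MHR ~ 1.6
```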
Huang, Shi; MacKinnon, David P.; Perrino, Tatiana; Gallo, Carlos; Cruden, Gracelyn; Brown, C Hendricks
2016-01-01
Mediation analysis often requires larger sample sizes than main effect analysis to achieve the same statistical power. Combining results across similar trials may be the only practical option for increasing statistical power for mediation analysis in some situations. In this paper, we propose a method to estimate: 1) marginal means for mediation path a, the relation of the independent variable to the mediator; 2) marginal means for path b, the relation of the mediator to the outcome, across multiple trials; and 3) the between-trial variance-covariance matrix, based on a bivariate normal distribution. We present the statistical theory and an R computer program to combine regression coefficients from multiple trials to estimate a combined mediated effect and confidence interval under a random effects model. Values of coefficients a and b, along with their standard errors from each trial, are the input for the method. This marginal-likelihood-based approach with Monte Carlo confidence intervals provides more accurate inference than the standard meta-analytic approach. We discuss computational issues, apply the method to two real-data examples, and make recommendations for the use of the method in different settings. PMID:28239330
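A simplified sketch of a Monte Carlo confidence interval for a combined mediated effect a*b: here the pooled path estimates are drawn independently from normal distributions (the paper's method uses a bivariate normal with a between-trial covariance structure), and all numbers are toy values.

```python
# Monte Carlo confidence interval for a mediated effect a*b (toy pooled estimates).
import numpy as np

rng = np.random.default_rng(6)
a_hat, se_a = 0.40, 0.08          # pooled path a (independent variable -> mediator)
b_hat, se_b = 0.30, 0.06          # pooled path b (mediator -> outcome)

draws_a = rng.normal(a_hat, se_a, 100_000)
draws_b = rng.normal(b_hat, se_b, 100_000)
ab = draws_a * draws_b            # sampled distribution of the mediated effect
lo, hi = np.percentile(ab, [2.5, 97.5])
print(f"a*b = {a_hat * b_hat:.3f}, 95% Monte Carlo CI ({lo:.3f}, {hi:.3f})")
```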
The reliability of the Australasian Triage Scale: a meta-analysis
Ebrahimi, Mohsen; Heydari, Abbas; Mazlom, Reza; Mirhaghi, Amir
2015-01-01
BACKGROUND: Although the Australasian Triage Scale (ATS) was developed two decades ago, its reliability has not been defined; we therefore present a meta-analysis of the reliability of the ATS to reveal the extent to which the ATS is reliable. DATA SOURCES: Electronic databases were searched to March 2014. The included studies were those that reported sample size, reliability coefficients, and an adequate description of the ATS reliability assessment. The guidelines for reporting reliability and agreement studies (GRRAS) were used. Two reviewers independently examined abstracts and extracted data. The effect size was obtained by the z-transformation of reliability coefficients. Data were pooled with random-effects models, and meta-regression was done based on the method of moments estimator. RESULTS: Six studies were ultimately included. The pooled coefficient for the ATS was substantial, at 0.428 (95% CI 0.340-0.509). The rate of mis-triage was less than 50%. Agreement on the adult version is higher than on the pediatric version. CONCLUSION: The ATS has shown an acceptable level of overall reliability in the emergency department, but it needs further development to reach almost perfect agreement. PMID:26056538
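To illustrate only the z-transformation step described above (the meta-analysis itself used random-effects pooling and a method-of-moments meta-regression), the sketch below Fisher-z-transforms some illustrative reliability coefficients, pools them with approximate inverse-variance weights, and back-transforms the result; coefficients, sample sizes, and weights are all assumptions, not the review's data.

```python
# Fisher z-transformation pooling of reliability coefficients (illustrative only).
import numpy as np

kappa = np.array([0.40, 0.35, 0.52, 0.45, 0.38, 0.47])   # toy reliability coefficients
n = np.array([150, 90, 200, 120, 80, 160])                # toy sample sizes

z = np.arctanh(kappa)                                     # Fisher z transform
w = n - 3                                                 # approximate inverse-variance weights
z_pooled = np.sum(w * z) / np.sum(w)
se = np.sqrt(1.0 / np.sum(w))
pooled, lo, hi = np.tanh([z_pooled, z_pooled - 1.96 * se, z_pooled + 1.96 * se])
print(f"pooled coefficient {pooled:.3f} (95% CI {lo:.3f}-{hi:.3f})")
```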
Note on coefficient matrices from stochastic Galerkin methods for random diffusion equations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhou Tao, E-mail: tzhou@lsec.cc.ac.c; Tang Tao, E-mail: ttang@hkbu.edu.h
2010-11-01
In a recent work by Xiu and Shen [D. Xiu, J. Shen, Efficient stochastic Galerkin methods for random diffusion equations, J. Comput. Phys. 228 (2009) 266-281], the Galerkin methods are used to solve stochastic diffusion equations in random media, where some properties for the coefficient matrix of the resulting system are provided. They also posed an open question on the properties of the coefficient matrix. In this work, we will provide some results related to the open question.
Mikulich-Gilbertson, Susan K; Wagner, Brandie D; Grunwald, Gary K; Riggs, Paula D; Zerbe, Gary O
2018-01-01
Medical research is often designed to investigate changes in a collection of response variables that are measured repeatedly on the same subjects. The multivariate generalized linear mixed model (MGLMM) can be used to evaluate random coefficient associations (e.g., simple correlations, partial regression coefficients) among outcomes that may be non-normal and differently distributed by specifying a multivariate normal distribution for their random effects and then evaluating the latent relationship between them. Empirical Bayes predictors are readily available for each subject from any mixed model and are observable and hence plottable. Here, we evaluate whether second-stage association analyses of empirical Bayes predictors from an MGLMM provide a good approximation and visual representation of these latent association analyses, using medical examples and simulations. Additionally, we compare these results with association analyses of empirical Bayes predictors generated from separate mixed models for each outcome, a procedure that could circumvent computational problems that arise when the dimension of the joint covariance matrix of random effects is large and prohibits estimation of latent associations. As has been shown in other analytic contexts, the p-values for all second-stage coefficients determined by naively assuming normality of the empirical Bayes predictors provide a good approximation to p-values determined via permutation analysis. Analyzing interrelated outcomes with separate models in the first stage and then associating the resulting empirical Bayes predictors in a second stage yields mean and covariance parameter estimates that differ from the maximum likelihood estimates generated by an MGLMM. The potential for erroneous inference from using results from these separate models increases as the magnitude of the association among the outcomes increases. Thus, if computable, scatterplots of the conditionally independent empirical Bayes predictors from an MGLMM are always preferable to scatterplots of empirical Bayes predictors generated by separate models, unless the true association between outcomes is zero.
Using a Grocery List Is Associated With a Healthier Diet and Lower BMI Among Very High-Risk Adults.
Dubowitz, Tamara; Cohen, Deborah A; Huang, Christina Y; Beckman, Robin A; Collins, Rebecca L
2015-01-01
Examine whether use of a grocery list is associated with healthier diet and weight among food desert residents. Cross-sectional analysis of in-person interview data from randomly selected household food shoppers in 2 low-income, primarily African American urban neighborhoods in Pittsburgh, PA, with limited access to healthy foods. Multivariate ordinary least-squares regressions conducted among 1,372 participants, controlling for sociodemographic factors and other potential confounding variables, indicated that although most of the sample (78%) was overweight or obese, consistently using a list was associated with lower body mass index (based on measured height and weight) (adjusted multivariate coefficient = 0.095) and higher dietary quality (based on the Healthy Eating Index-2005) (adjusted multivariate coefficient = 0.103) (P < .05). Shopping with a list may be a useful tool for low-income individuals to improve diet or decrease body mass index. Copyright © 2015 Society for Nutrition Education and Behavior. Published by Elsevier Inc. All rights reserved.
Baba, Hiromi; Takahara, Jun-ichi; Yamashita, Fumiyoshi; Hashida, Mitsuru
2015-11-01
The solvent effect on skin permeability is important for assessing the effectiveness and toxicological risk of new dermatological formulations in pharmaceuticals and cosmetics development. The solvent effect occurs by diverse mechanisms, which could be elucidated by efficient and reliable prediction models. However, such prediction models have been hampered by the small variety of permeants and mixture components archived in databases and by low predictive performance. Here, we propose a solution to both problems. We first compiled a novel large database of 412 samples from 261 structurally diverse permeants and 31 solvents reported in the literature. The data were carefully screened to ensure their collection under consistent experimental conditions. To construct a high-performance predictive model, we then applied support vector regression (SVR) and random forest (RF) with greedy stepwise descriptor selection to our database. The models were internally and externally validated. The SVR achieved higher performance statistics than RF. The (externally validated) determination coefficient, root mean square error, and mean absolute error of SVR were 0.899, 0.351, and 0.268, respectively. Moreover, because all descriptors are fully computational, our method can predict as-yet unsynthesized compounds. Our high-performance prediction model offers an attractive alternative to permeability experiments for pharmaceutical and cosmetic candidate screening and optimizing skin-permeable topical formulations.
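The SVR-versus-RF comparison described above can be sketched with scikit-learn on a synthetic descriptor matrix standing in for the permeability database; the model settings, sample size, and data are placeholders, not the authors' tuned models.

```python
# Cross-validated comparison of support vector regression and random forest regression.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

X, y = make_regression(n_samples=412, n_features=50, n_informative=15,
                       noise=5.0, random_state=0)

models = {
    "SVR": make_pipeline(StandardScaler(), SVR(C=10.0, epsilon=0.1)),
    "RF": RandomForestRegressor(n_estimators=300, random_state=0),
}
for name, model in models.items():
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"{name}: mean cross-validated R2 = {r2.mean():.3f}")
```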
Confidence Intervals for Squared Semipartial Correlation Coefficients: The Effect of Nonnormality
ERIC Educational Resources Information Center
Algina, James; Keselman, H. J.; Penfield, Randall D.
2010-01-01
The increase in the squared multiple correlation coefficient ([delta]R[superscript 2]) associated with a variable in a regression equation is a commonly used measure of importance in regression analysis. Algina, Keselman, and Penfield found that intervals based on asymptotic principles were typically very inaccurate, even though the sample size…
Estimation of octanol/water partition coefficients using LSER parameters
Luehrs, Dean C.; Hickey, James P.; Godbole, Kalpana A.; Rogers, Tony N.
1998-01-01
The logarithms of octanol/water partition coefficients, logKow, were regressed against the linear solvation energy relationship (LSER) parameters for a training set of 981 diverse organic chemicals. The standard deviation for logKow was 0.49. The regression equation was then used to estimate logKow for a test set of 146 chemicals, which included pesticides and other diverse polyfunctional compounds. Thus, the octanol/water partition coefficient may be estimated from LSER parameters without elaborate software, but only moderate accuracy should be expected.
Thermal requirements of Dermanyssus gallinae (De Geer, 1778) (Acari: Dermanyssidae).
Tucci, Edna Clara; do Prado, Angelo P; de Araújo, Raquel Pires
2008-01-01
The thermal requirements for development of Dermanyssus gallinae were studied under laboratory conditions at 15, 20, 25, 30 and 35 degrees C, a 12-h photoperiod and 60-85% RH. The thermal requirements for D. gallinae were as follows. Preoviposition: base temperature 3.4 degrees C, thermal constant (k) 562.85 degree-hours, determination coefficient (R2) 0.59, regression equation Y = -0.006035 + 0.001777x. Egg: base temperature 10.60 degrees C, thermal constant (k) 689.65 degree-hours, determination coefficient (R2) 0.94, regression equation Y = -0.015367 + 0.001450x. Larva: base temperature 9.82 degrees C, thermal constant (k) 464.91 degree-hours, determination coefficient (R2) 0.87, regression equation Y = -0.021123 + 0.002151x. Protonymph: base temperature 10.17 degrees C, thermal constant (k) 504.49 degree-hours, determination coefficient (R2) 0.90, regression equation Y = -0.020152 + 0.001982x. Deutonymph: base temperature 11.80 degrees C, thermal constant (k) 501.11 degree-hours, determination coefficient (R2) 0.99, regression equation Y = -0.023555 + 0.001996x. The results obtained showed that 15 to 42 generations of Dermanyssus gallinae may occur during the year in the State of São Paulo, as estimated based on isotherm charts. Dermanyssus gallinae may develop continually in the State of São Paulo, with a population decrease in the winter. There were differences between the developmental stages of D. gallinae in relation to thermal requirements.
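The base temperatures and thermal constants above follow from the fitted lines in the usual degree-day way, assuming Y is the development rate: the base temperature is -a/b and the thermal constant is 1/b (here in degree-hours). A quick check for the egg-stage coefficients quoted above:

```python
# Worked check of the degree-day relationships implied by the egg-stage regression Y = a + b*T.
a, b = -0.015367, 0.001450          # egg-stage regression coefficients from the abstract
base_temperature = -a / b           # ~10.6 degrees C, matching the reported value
thermal_constant = 1.0 / b          # ~689.7 degree-hours, matching the reported value
print(round(base_temperature, 2), round(thermal_constant, 1))
```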
Biostatistics Series Module 6: Correlation and Linear Regression.
Hazra, Avijit; Gogtay, Nithya
2016-01-01
Correlation and linear regression are the most commonly used techniques for quantifying the association between two numeric variables. Correlation quantifies the strength of the linear relationship between paired variables, expressing this as a correlation coefficient. If both variables x and y are normally distributed, we calculate Pearson's correlation coefficient (r). If the normality assumption is not met for one or both variables in a correlation analysis, a rank correlation coefficient, such as Spearman's rho (ρ), may be calculated. A hypothesis test of correlation tests whether the linear relationship between the two variables holds in the underlying population, in which case it returns a P < 0.05. A 95% confidence interval of the correlation coefficient can also be calculated for an idea of the correlation in the population. The value r2 denotes the proportion of the variability of the dependent variable y that can be attributed to its linear relation with the independent variable x and is called the coefficient of determination. Linear regression is a technique that attempts to link two correlated variables x and y in the form of a mathematical equation (y = a + bx), such that given the value of one variable the other may be predicted. In general, the method of least squares is applied to obtain the equation of the regression line. Correlation and linear regression analysis are based on certain assumptions pertaining to the data sets. If these assumptions are not met, misleading conclusions may be drawn. The first assumption is that of a linear relationship between the two variables. A scatter plot is essential before embarking on any correlation-regression analysis to show that this is indeed the case. Outliers or clustering within data sets can distort the correlation coefficient value. Finally, it is vital to remember that though strong correlation can be a pointer toward causation, the two are not synonymous.
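A short worked example of the quantities described above (Pearson's r, the coefficient of determination r2, and the least-squares line y = a + bx) on simulated data:

```python
# Pearson correlation and least-squares regression line on simulated paired data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
x = rng.normal(50, 10, 80)
y = 5 + 0.8 * x + rng.normal(0, 6, 80)

r, p_value = stats.pearsonr(x, y)            # correlation coefficient and its P value
fit = stats.linregress(x, y)                 # slope b, intercept a, r, P, standard error
print(f"r = {r:.2f}, r2 = {r**2:.2f}, "
      f"y = {fit.intercept:.1f} + {fit.slope:.2f} x, P = {p_value:.3g}")
```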
Machine Learning Estimation of Atom Condensed Fukui Functions.
Zhang, Qingyou; Zheng, Fangfang; Zhao, Tanfeng; Qu, Xiaohui; Aires-de-Sousa, João
2016-02-01
To enable the fast estimation of atom condensed Fukui functions, machine learning algorithms were trained with databases of DFT pre-calculated values for ca. 23,000 atoms in organic molecules. The problem was approached as the ranking of atom types with the Bradley-Terry (BT) model, and as the regression of the Fukui function. Random Forests (RF) were trained to predict the condensed Fukui function, to rank atoms in a molecule, and to classify atoms as high/low Fukui function. Atomic descriptors were based on counts of atom types in spheres around the kernel atom. The BT coefficients assigned to atom types enabled the identification (93-94% accuracy) of the atom with the highest Fukui function in pairs of atoms in the same molecule with differences ≥0.1. In whole molecules, the atom with the top Fukui function could be recognized in ca. 50% of the cases and, on average, about 3 of the top 4 atoms could be recognized in a shortlist of 4. Regression RF yielded predictions for test sets with R2 = 0.68-0.69, improving the ability of BT coefficients to rank atoms in a molecule. Atom classification (as high/low Fukui function) was obtained with RF with sensitivity of 55-61% and specificity of 94-95%. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Thompson, Andrew; Sullivan, Sarah; Barley, Maddi; Moore, Laurence; Rogers, Paul; Sipos, Attila; Harrison, Glynn
2010-06-01
Educational workbooks have been used in psychiatry to influence patient but not clinician behaviour. Targeted education interventions to change prescribing practice in other areas of medicine have only looked at changes in prescribing and not attitudes or beliefs related to the prescribing. We aimed to examine whether clinicians' beliefs about a common prescribing issue in psychiatry (antipsychotic polypharmacy prescription) changed alongside behaviour as a result of a complex intervention. Medical and nursing staff were recruited from 19 general adult psychiatry units in the south-west of the UK as part of a cluster randomized controlled trial. A questionnaire was used to assess beliefs on the prescribing of antipsychotic polypharmacy as a secondary outcome before and after completion of a cognitive behavioural 'self-help' style workbook (one part of a complex intervention). A factor analysis suggested three dimensions of the questionnaire that corresponded to predetermined themes. The data were analysed using a random-effects regression model (adjusting for clustering) controlling for possible confounders. There was a significant change in beliefs on both of the factors: antipsychotic polypharmacy (coefficient = -0.89, P < 0.01) and rapid tranquilization (coefficient = -0.68, P = 0.01) specifically targeted by the workbook. There was a modest but statistically significant change in antipsychotic polypharmacy prescribing (odds ratio 0.43, 95% confidence intervals 0.21-0.90). The workbook appeared to change staff beliefs about antipsychotic polypharmacy, but achieving substantial changes in clinician behaviour may require further exploration of other factors important in complex prescribing issues.
Auvinen, Anssi; Moss, Sue M; Tammela, Teuvo L J; Taari, Kimmo; Roobol, Monique J; Schröder, Fritz H; Bangma, Chris H; Carlsson, Sigrid; Aus, Gunnar; Zappa, Marco; Puliti, Donella; Denis, Louis J; Nelen, Vera; Kwiatkowski, Maciej; Randazzo, Marco; Paez, Alvaro; Lujan, Marcos; Hugosson, Jonas
2016-01-01
Purpose The balance of benefits and harms in prostate cancer screening has not been sufficiently characterized. We related indicators of mortality reduction and overdetection by center within the European Randomized Study of Screening for Prostate Cancer (ERSPC). Experimental Design We analyzed the absolute mortality reduction, expressed as the number needed to invite (NNI = 1/absolute risk reduction; indicating how many men had to be randomized to the screening arm to avert one prostate cancer death), and the absolute excess of prostate cancer detection, expressed as the number needed for overdetection (NNO = 1/absolute excess incidence; indicating the number of men invited per additional prostate cancer case), and compared their relationship across the seven ERSPC centers. Results Both the absolute mortality reduction (NNI) and the absolute overdetection (NNO) varied widely between the centers: NNI 200-7000 and NNO 16-69. The extent of overdiagnosis and the mortality reduction were closely associated (correlation coefficient r = 0.76; weighted linear regression coefficient β = 33, 95% CI 5-62; R2 = 0.72). For each averted prostate cancer death at 13 years of follow-up, 12-36 excess cases had to be detected in the various centers. Conclusions The differences between the ERSPC centers likely reflect variations in prostate cancer incidence and mortality, as well as in screening protocol and performance. The strong interrelation between the benefits and harms suggests that efforts to maximize the mortality effect are bound to increase overdiagnosis and might be improved by focusing on high-risk populations. The optimal balance between screening intensity and risk of overdiagnosis remains unclear. PMID:26289069
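The two indicators defined above reduce to simple reciprocals, as the toy calculation below shows; the rates used are illustrative and are not ERSPC values.

```python
# Worked example of number needed to invite (NNI) and number needed for overdetection (NNO).
def nni(risk_control, risk_screened):
    # NNI = 1 / absolute risk reduction in prostate cancer mortality.
    return 1.0 / (risk_control - risk_screened)

def nno(incidence_screened, incidence_control):
    # NNO = 1 / absolute excess incidence (extra cases detected per man invited).
    return 1.0 / (incidence_screened - incidence_control)

print(round(nni(0.0050, 0.0040)))    # ARR of 0.1 percentage points -> NNI = 1000
print(round(nno(0.090, 0.060)))      # 3 extra cases per 100 invited -> NNO ~ 33
```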
Prevalence and Consequences of the Proximal Junctional Kyphosis After Spinal Deformity Surgery
Yan, Chunda; Li, Yong; Yu, Zhange
2016-01-01
The aim of this study was to estimate the prevalence and patient outcomes of proximal junctional kyphosis (PJK) in pediatric and adolescent patients who received surgical interventions for the treatment of a spinal deformity. Literature was searched in electronic databases, and studies were selected according to precise eligibility criteria. Percent prevalence values of PJK in individual studies were pooled to obtain a weighted effect size under the random effects model. Subgroup and meta-regression analyses were performed to appraise the factors affecting PJK prevalence. Twenty-six studies (2024 patients) were included in this meta-analysis. The average age of the patients was 13.8 ± 2.75 years, and 32 ± 20% were male. Average follow-up was 51.6 ± 38.8 (range 17 ± 13 to 218 ± 60) months. Overall, the percent prevalence of PJK (95% confidence interval) was 11.02 (10.5, 11.5)%; P < 0.00001, and prevalence was inversely associated with age (meta-regression coefficient: -1.607 [-2.86, -0.36]; P = 0.014). The revision surgery rate in patients with PJK was 10%. The prevalence of PJK was positively associated with the proximal junctional angle at last follow-up (coefficient: 2.248; P = 0.012) and with the change in the proximal junctional angle from surgery to last follow-up (coefficient: 2.139; P = 0.014), but not with the preoperative proximal junctional angle. The prevalence of PJK in pediatric and adolescent patients is 11%. About 10% of those affected require revision surgery. PMID:27196453
Genetic analyses of stillbirth in relation to litter size using random regression models.
Chen, C Y; Misztal, I; Tsuruta, S; Herring, W O; Holl, J; Culbertson, M
2010-12-01
Estimates of genetic parameters for number of stillborns (NSB) in relation to litter size (LS) were obtained with random regression models (RRM). Data were collected from 4 purebred Duroc nucleus farms between 2004 and 2008. Two data sets, with 6,575 litters for the first parity (P1) and 6,259 litters for the second to fifth parities (P2-5) and a total of 8,217 and 5,066 animals in the pedigree, were analyzed separately. Number of stillborns was studied as a trait at the sow level. Fixed effects were contemporary groups (farm-year-season) and fixed cubic regression coefficients on LS with Legendre polynomials. Models for P2-5 included the fixed effect of parity. Random effects were additive genetic effects for both data sets, with permanent environmental effects included for P2-5. Random effects were modeled with Legendre polynomials (RRM-L), linear splines (RRM-S), and degree 0 B-splines (RRM-BS), with regressions on LS. For P1, the order of polynomial, the number of knots, and the number of intervals used for the respective models were quadratic, 3, and 3, respectively. For P2-5, the same parameters were linear, 2, and 2, respectively. Heterogeneous residual variances were considered in the models. For P1, estimates of heritability were 12 to 15%, 5 to 6%, and 6 to 7% at LS 5, 9, and 13, respectively. For P2-5, estimates were 15 to 17%, 4 to 5%, and 4 to 6% at LS 6, 9, and 12, respectively. For P1, average estimates of genetic correlations between LS 5 and 9, 5 and 13, and 9 and 13 were 0.53, -0.29, and 0.65, respectively. For P2-5, the same estimates, averaged over RRM-L and RRM-S, were 0.75, -0.21, and 0.50, respectively. For RRM-BS with 2 intervals, the correlation between LS 5 to 7 and LS 8 to 13 was 0.66. Parameters obtained with the 3 RRM revealed the nonlinear relationship between the additive genetic effect on NSB and the environmental deviation of LS. The negative correlations between the 2 extreme LS may indicate different genetic bases for the incidence of stillbirth.
Factor Scores, Structure Coefficients, and Communality Coefficients
ERIC Educational Resources Information Center
Goodwyn, Fara
2012-01-01
This paper presents heuristic explanations of factor scores, structure coefficients, and communality coefficients. Common misconceptions regarding these topics are clarified. In addition, (a) the regression (b) Bartlett, (c) Anderson-Rubin, and (d) Thompson methods for calculating factor scores are reviewed. Syntax necessary to execute all four…
Multiple linear regression analysis
NASA Technical Reports Server (NTRS)
Edwards, T. R.
1980-01-01
Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.
Pavlik, Valory N; Chan, Wenyaw; Hyman, David J; Feldman, Penny; Ogedegbe, Gbenga; Schwartz, Joseph E; McDonald, Margaret; Einhorn, Paula; Tobin, Jonathan N
2015-01-01
African-Americans (AAs) have a high prevalence of hypertension, and their blood pressure (BP) control on treatment still lags behind that of other groups. In 2004, NHLBI funded five projects that aimed to evaluate clinically feasible interventions to effect changes in medical care delivery leading to an increased proportion of AA patients with controlled BP. Three of the groups performed a pooled analysis of trial results to determine: 1) the magnitude of the combined intervention effect; and 2) how the pooled results could inform the methodology for future health-system-level BP interventions. Using a cluster randomized design, the trials enrolled AAs with uncontrolled hypertension to test interventions targeting a combination of patient and clinician behaviors. The 12-month systolic BP (SBP) and diastolic BP (DBP) effects of intervention or control cluster assignment were assessed using mixed-effects longitudinal regression modeling. A total of 2,015 patients representing 352 clusters participated across the three trials. Pooled BP slopes followed a quadratic pattern, with an initial decline followed by a rise toward baseline, and did not differ significantly between intervention and control clusters (SBP linear coefficient = -2.60 ± 0.21 mmHg per month, p < 0.001; quadratic coefficient = 0.167 ± 0.02 mmHg per month squared, p < 0.001; group × linear time interaction coefficient = 0.145 ± 0.293, p = 0.622; group × quadratic time interaction coefficient = -0.017 ± 0.026, p = 0.525). Results were similar for DBP. The individual sites did not have significant intervention effects when analyzed separately. Investigators planning behavioral trials to improve BP control in health systems serving AAs should plan for small effect sizes and employ a "run-in" period in which BP can be expected to improve in both experimental and control clusters.
Buhl, Sussi F; Andersen, Aino L; Andersen, Jens R; Andersen, Ove; Jensen, Jens-Erik B; Rasmussen, Anne Mette L; Pedersen, Mette M; Damkjær, Lars; Gilkes, Hanne; Petersen, Janne
2016-02-01
Stress metabolism is associated with accelerated loss of muscle, which has large consequences for the old medical patient. The aim of this study was to investigate whether an intervention combining protein and resistance training was more effective in counteracting loss of muscle than standard care. Secondary outcomes were changes in muscle strength, functional ability, and body weight. Twenty-nine acutely admitted old (>65 years) patients were randomly assigned to the intervention (n = 14) or to standard care (n = 15). The Intervention Group received 1.7 g protein/kg/day during admission, and a daily protein supplement (18.8 g protein) and resistance training 3 times per week for the 12 weeks following discharge. Muscle mass was assessed by Dual-energy X-ray Absorptiometry. Muscle strength was assessed by Hand Grip Strength and the Chair Stand Test. Functional ability was assessed by the de Morton Mobility Index, the Functional Recovery Score, and the New Mobility Score. Changes in outcomes from the time of admission to three months after discharge were analysed by linear regression analysis. The intention-to-treat analysis showed no significant effect of the intervention on lean mass (unadjusted: β-coefficient = -1.28, P = 0.32; adjusted for gender: β-coefficient = -0.02, P = 0.99; adjusted for baseline lean mass: β-coefficient = -0.31, P = 0.80). The de Morton Mobility Index significantly increased in the Control Group (β-coefficient = -11.43, CI: 0.72-22.13, P = 0.04). No other differences were found. No significant effect on muscle mass was observed in this group of acutely ill old medical patients. High compliance was achieved with the dietary intervention, but resistance training was challenging. Clinical trials identifier NCT02077491. Copyright © 2015 Elsevier Ltd and European Society for Clinical Nutrition and Metabolism. All rights reserved.
The prediction of food additives in the fruit juice based on electronic nose with chemometrics.
Qiu, Shanshan; Wang, Jun
2017-09-01
Food additives are added to products to enhance their taste and to preserve flavor or appearance. While their use should be restricted to achieving a technological benefit, the content of food additives should also be strictly controlled. In this study, an E-nose was applied as an alternative to traditional monitoring technologies for determining two food additives, benzoic acid and chitosan. For quantitative monitoring, support vector machine (SVM), random forest (RF), extreme learning machine (ELM), and partial least squares regression (PLSR) methods were applied to establish regression models between E-nose signals and the amount of food additives in fruit juices. The monitoring models based on ELM and RF reached higher correlation coefficients (R2) and lower root mean square errors (RMSEs) than the models based on PLSR and SVM. This work indicates that an E-nose combined with RF or ELM can be a cost-effective, easy-to-build, and rapid detection system for food additive monitoring. Copyright © 2017 Elsevier Ltd. All rights reserved.
Ozone and sulfur dioxide effects on three tall fescue cultivars
DOE Office of Scientific and Technical Information (OSTI.GOV)
Flagler, R.B.; Youngner, V.B.
Although many reports have been published concerning differential susceptibility of various crops and/or cultivars to air pollutants, most have used foliar injury instead of marketable yield as the factor that determined susceptibility for the crop. In an examination of screening in terms of marketable yield, three cultivars of tall fescue (Festuca arundinacea Schreb.), 'Alta,' 'Fawn,' and 'Kentucky 31,' were exposed to 0-0.40 ppm O3 or 0-0.50 ppm SO2 for 6 h/d, once a week, for 7 and 9 weeks, respectively. The experimental design was a randomized complete block with three replications. Statistical analysis was by standard analysis of variance and regression techniques. Three variables were analyzed: top dry weight (yield), tiller number, and weight per tiller. Ozone had a significant effect on all three variables. Significant linear decreases in yield and weight per tiller occurred with increasing O3 concentrations. Linear regressions of these variables on O3 concentration produced significantly different regression coefficients. The coefficient for Kentucky 31 was significantly greater than that for Alta or Fawn, which did not differ from each other. This indicated that Kentucky 31 was more susceptible to O3 than either of the other cultivars. Percent reductions in dry weight for the three cultivars at the highest O3 level were 35, 44, and 53%, respectively, for Fawn, Alta, and Kentucky 31. For weight per tiller, Kentucky 31 had a higher percent reduction than the other cultivars (59 vs. 46 and 44%). Tiller number was generally increased by O3, but this variable was not useful for determining differential susceptibility to the pollutant. Sulfur dioxide treatments produced no significant effects on any of the variables analyzed.
Metal ion levels and lymphocyte counts: ASR hip resurfacing prosthesis vs. standard THA
2013-01-01
Background and purpose Wear particles from metal-on-metal arthroplasties are under suspicion for adverse effects both locally and systemically, and the DePuy ASR Hip Resurfacing System (RHA) has above-average failure rates. We compared lymphocyte counts in RHA and total hip arthroplasty (THA) and investigated whether cobalt and chromium ions affected the lymphocyte counts. Method In a randomized controlled trial, we followed 19 RHA patients and 19 THA patients. Lymphocyte subsets and chromium and cobalt ion concentrations were measured at baseline, at 8 weeks, at 6 months, and at 1 and 2 years. Results The T-lymphocyte counts for both implant types declined over the 2-year period. This decline was statistically significant for CD3+CD8+ in the THA group, with a regression coefficient of -0.04 × 10⁹ cells/year (95% CI: -0.08 to -0.01). Regression analysis indicated a depressive effect of cobalt ions in particular on T-cells, with 2-year whole-blood cobalt regression coefficients for CD3+ of -0.10 (95% CI: -0.16 to -0.04) × 10⁹ cells/parts per billion (ppb), for CD3+CD4+ of -0.06 (-0.09 to -0.03) × 10⁹ cells/ppb, and for CD3+CD8+ of -0.02 (-0.03 to -0.00) × 10⁹ cells/ppb. Interpretation Circulating T-lymphocyte levels may decline after surgery, regardless of implant type. Metal ions, particularly cobalt, may have a general depressive effect on T- and B-lymphocyte levels. Registered with ClinicalTrials.gov under # NCT01113762 PMID:23597114
Melbourne Chambers, R; Morrison-Levy, N; Chang, S; Tapper, J; Walker, S; Tulloch-Reid, M
2014-04-01
We conducted a case-control study of 33 Jamaican children 7 to 12 years old with uncomplicated epilepsy and 33 of their classroom peers matched for age and gender to determine whether epilepsy resulted in differences in cognitive ability and school achievement, and whether socioeconomic status or the home environment had a moderating effect on any differences. Intelligence, language, memory, attention, executive function, and mathematics ability were assessed using selected tests from the NEPSY, WISC-R, TEA-Ch, WRAT3-expanded, and Raven's Coloured Progressive Matrices. The child's environment at home was measured using the Middle Childhood HOME inventory. Socioeconomic status was determined from a combination of household crowding, possessions, and sanitation. We compared the characteristics of the cases and controls and used random effects regression models (using the matched pair as the cluster) to examine the relationship between cognition and epilepsy. We found that there was no significant difference in IQ, but children with epilepsy had lower scores on tests of memory (p<0.05), language (p<0.05), and attention (p<0.01) compared with their controls. In random effects models, epilepsy status had a significant effect on memory (coefficient=-0.14, CI: -0.23, -0.05), language (coefficient=-0.13, CI: -0.23, -0.04), and mathematics ability (coefficient=-0.01, CI: -0.02, -0.00). Adjustment for the home environment and socioeconomic status and inclusion of interaction terms for these variables did not alter these effects. In conclusion, we found that epilepsy status in Jamaican children has a significant effect on performance on tests of memory, language, and mathematics and that this effect is not modified or explained by socioeconomic status or the child's home environment. Copyright © 2014 Elsevier Inc. All rights reserved.
Testing for gene-environment interaction under exposure misspecification.
Sun, Ryan; Carroll, Raymond J; Christiani, David C; Lin, Xihong
2017-11-09
Complex interplay between genetic and environmental factors characterizes the etiology of many diseases. Modeling gene-environment (GxE) interactions is often challenged by the unknown functional form of the environment term in the true data-generating mechanism. We study the impact of misspecification of the environmental exposure effect on inference for the GxE interaction term in linear and logistic regression models. We first examine the asymptotic bias of the GxE interaction regression coefficient, allowing for confounders as well as arbitrary misspecification of the exposure and confounder effects. For linear regression, we show that under gene-environment independence and some confounder-dependent conditions, when the environment effect is misspecified, the regression coefficient of the GxE interaction can be unbiased. However, inference on the GxE interaction is still often incorrect. In logistic regression, we show that the regression coefficient is generally biased if the genetic factor is associated with the outcome directly or indirectly. Further, we show that the standard robust sandwich variance estimator for the GxE interaction does not perform well in practical GxE studies, and we provide an alternative testing procedure that has better finite sample properties. © 2017, The International Biometric Society.
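As a rough illustration of the setting studied here (simulated data; the effect sizes, sample size, and quadratic form of the true environment effect are assumptions, not taken from the paper), the sketch below fits a working model with a linear main effect of E and reads off the GxE interaction estimate with a heteroskedasticity-robust sandwich standard error.

```python
# Fit a GxE interaction model whose main effect of E is misspecified (true effect is quadratic).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 2000
G = rng.binomial(2, 0.3, n)                          # genotype (0/1/2), independent of E
E = rng.normal(size=n)
y = 0.5 * G + 0.8 * E ** 2 + 0.2 * G * E + rng.normal(size=n)   # true E effect is quadratic
df = pd.DataFrame({"y": y, "G": G, "E": E})

fit = smf.ols("y ~ G * E", data=df).fit(cov_type="HC3")         # working model: linear in E
print("GxE estimate:", fit.params["G:E"], "sandwich SE:", fit.bse["G:E"])
```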
Hayashi, K; Yamada, T; Sawa, T
2015-03-01
The return or Poincaré plot is a non-linear analytical approach in a two-dimensional plane, where a timed signal is plotted against itself after a time delay. Its scatter pattern reflects the randomness and variability in the signals. Quantification of a Poincaré plot of the electroencephalogram has potential for determining anaesthesia depth. We quantified the degree of dispersion (i.e. standard deviation, SD) along the diagonal line of the electroencephalogram Poincaré plot (termed SD1/SD2), and compared SD1/SD2 values with spectral edge frequency 95 (SEF95) and bispectral index values. The regression analysis showed a tight linear regression equation with a coefficient of determination (R(2)) value of 0.904 (p < 0.0001) between the Poincaré index (SD1/SD2) and SEF95, and a moderate linear regression equation between SD1/SD2 and the bispectral index (R(2) = 0.346, p < 0.0001). Quantification of the Poincaré plot correlates tightly with SEF95, reflecting anaesthesia-dependent changes in electroencephalogram oscillation. © 2014 The Association of Anaesthetists of Great Britain and Ireland.
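For orientation, a minimal sketch (with an arbitrary stand-in signal rather than EEG, and an assumed lag) of how a Poincaré plot is commonly quantified: SD1 measures dispersion perpendicular to the line of identity, SD2 the dispersion along it, and their ratio is the kind of index used above.

```python
# Quantify a Poincaré plot of a signal via SD1, SD2 and their ratio.
import numpy as np

rng = np.random.default_rng(2)
x = np.sin(np.linspace(0, 40, 2000)) + 0.2 * rng.normal(size=2000)  # stand-in signal

lag = 1
x1, x2 = x[:-lag], x[lag:]                            # plot x(t+lag) against x(t)
sd1 = np.std((x2 - x1) / np.sqrt(2), ddof=1)          # spread perpendicular to identity line
sd2 = np.std((x2 + x1) / np.sqrt(2), ddof=1)          # spread along identity line
print("SD1/SD2 =", sd1 / sd2)
```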
Heritability estimations for inner muscular fat in Hereford cattle using random regressions
USDA-ARS's Scientific Manuscript database
Random regressions make it possible to obtain genetic predictions and parameter estimates across a gradient of environments, allowing a more accurate and beneficial use of animals as breeders in specific environments. The objective of this study was to use random regression models to estimate heritabil...
ERIC Educational Resources Information Center
Mugrage, Beverly; And Others
Three ridge regression solutions are compared with ordinary least squares regression and with principal components regression using all components. Ridge regression, particularly the Lawless-Wang solution, out-performed ordinary least squares regression and the principal components solution on the criteria of stability of coefficient and closeness…
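A minimal sketch (toy collinear data, not the study's) of the three estimators being compared: ordinary least squares, ridge regression, and principal components regression retaining all components (which reproduces the OLS fit); the ridge penalty value is an arbitrary assumption.

```python
# Contrast OLS, ridge, and all-components PCR on near-collinear data.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 5))
X[:, 4] = X[:, 0] + 0.01 * rng.normal(size=100)       # near-collinear predictor
y = X @ np.array([1.0, 0.5, 0.0, -0.5, 1.0]) + rng.normal(size=100)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)
pcr = make_pipeline(PCA(n_components=5), LinearRegression()).fit(X, y)

print("OLS R2  :", ols.score(X, y))
print("PCR R2  :", pcr.score(X, y))                   # identical to OLS when all components kept
print("OLS coef :", np.round(ols.coef_, 2))
print("Ridge coef:", np.round(ridge.coef_, 2))        # shrunken, more stable under collinearity
```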
Is the Non-Dipole Magnetic Field Random?
NASA Technical Reports Server (NTRS)
Walker, Andrew D.; Backus, George E.
1996-01-01
Statistical modelling of the Earth's magnetic field B has a long history. In particular, the spherical harmonic coefficients of scalar fields derived from B can be treated as Gaussian random variables. In this paper, we give examples of highly organized fields whose spherical harmonic coefficients pass tests for independent Gaussian random variables. The fact that coefficients at some depth may be usefully summarized as independent samples from a normal distribution need not imply that there really is some physical, random process at that depth. In fact, the field can be extremely structured and still be regarded for some purposes as random. In this paper, we examined the radial magnetic field B(sub r) produced by the core, but the results apply to any scalar field on the core-mantle boundary (CMB) which determines B outside the CMB.
Cruz, Antonio M; Barr, Cameron; Puñales-Pozo, Elsa
2008-01-01
The main goals of this research were to build a predictor for a turnaround time (TAT) indicator for estimating its values, and to use a numerical clustering technique for finding possible causes of undesirable TAT values. The following stages were used: domain understanding, data characterisation and sample reduction, and insight characterisation. A multiple linear regression predictor of the TAT indicator and clustering techniques were used for improving corrective maintenance task efficiency in a clinical engineering department (CED). The indicator being studied was turnaround time (TAT). Multiple linear regression was used for building a predictive TAT value model. The variables contributing to such a model were clinical engineering department response time (CE(rt), 0.415 positive coefficient), stock service response time (Stock(rt), 0.734 positive coefficient), priority level (0.21 positive coefficient) and service time (0.06 positive coefficient). The regression process showed heavy reliance on Stock(rt), CE(rt) and priority, in that order. Clustering techniques revealed the main causes of high TAT values. This examination has provided a means for analysing current technical service quality and effectiveness. In doing so, it has demonstrated a process for identifying areas and methods of improvement and a model against which to analyse these methods' effectiveness.
Shrinkage regression-based methods for microarray missing value imputation.
Wang, Hsiuying; Chiu, Chia-Chun; Wu, Yi-Ching; Wu, Wei-Sheng
2013-01-01
Missing values commonly occur in microarray data, which usually contain more than 5% missing values, with up to 90% of genes affected. Inaccurate missing value estimation reduces the power of downstream microarray data analyses. Many types of methods have been developed to estimate missing values. Among them, the regression-based methods are very popular and have been shown to perform better than the other types of methods in many testing microarray datasets. To further improve the performance of the regression-based methods, we propose shrinkage regression-based methods. Our methods take advantage of the correlation structure in the microarray data and select similar genes for the target gene by Pearson correlation coefficients. In addition, our methods incorporate the least squares principle, utilize a shrinkage estimation approach to adjust the coefficients of the regression model, and then use the new coefficients to estimate missing values. Simulation results show that the proposed methods provide more accurate missing value estimation in six testing microarray datasets than the existing regression-based methods do. Imputation of missing values is a very important aspect of microarray data analyses because most of the downstream analyses require a complete dataset. Therefore, exploring accurate and efficient methods for estimating missing values has become an essential issue. Since our proposed shrinkage regression-based methods can provide accurate missing value estimation, they are competitive alternatives to the existing regression-based methods.
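A minimal sketch of the general recipe described above, not the authors' exact estimator: neighbours are chosen by absolute Pearson correlation, a least squares fit is made on the complete arrays, and the coefficients are shrunk by a fixed factor before prediction (the shrinkage factor, gene/array sizes, and the single missing entry are all assumptions).

```python
# Impute one missing microarray entry from the k most correlated genes with shrunken coefficients.
import numpy as np

rng = np.random.default_rng(4)
expr = rng.normal(size=(50, 30))                      # 50 genes x 30 arrays (complete, for simplicity)
target, miss_col, k, shrink = 0, 5, 5, 0.9            # gene 0 "missing" in array 5; shrink factor assumed

others = np.delete(np.arange(expr.shape[0]), target)
r = [abs(np.corrcoef(expr[target], expr[g])[0, 1]) for g in others]
similar = others[np.argsort(r)[-k:]]                  # k most similar genes by |Pearson r|

train_cols = np.delete(np.arange(expr.shape[1]), miss_col)
A = expr[similar][:, train_cols].T                    # design matrix built from the similar genes
b = expr[target, train_cols]
coef, *_ = np.linalg.lstsq(A, b, rcond=None)          # least squares coefficients
imputed = expr[similar][:, miss_col] @ (shrink * coef)  # predict with shrunken coefficients
print("imputed value:", imputed)
```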
NASA Astrophysics Data System (ADS)
Mangla, Rohit; Kumar, Shashi; Nandy, Subrata
2016-05-01
SAR and LiDAR remote sensing have already shown the potential of active sensors for forest parameter retrieval. A SAR sensor in its fully polarimetric mode has the advantage of retrieving the scattering properties of different components of forest structure, and LiDAR has the capability to measure structural information with very high accuracy. This study focused on retrieval of forest aboveground biomass (AGB) using Terrestrial Laser Scanner (TLS) based point clouds and scattering properties of forest vegetation obtained from decomposition modelling of RISAT-1 fully polarimetric SAR data. TLS data were acquired for 14 plots of the Timli forest range, Uttarakhand, India. The forest area is dominated by Sal trees, and random sampling with a plot size of 0.1 ha (31.62 m × 31.62 m) was adopted for TLS and field data collection. RISAT-1 data were processed to retrieve SAR-based variables, and TLS point cloud based 3D imaging was done to retrieve LiDAR-based variables. Surface scattering, double-bounce scattering, volume scattering, helix and wire scattering were the SAR-based variables retrieved from polarimetric decomposition. Tree heights and stem diameters were used as LiDAR-based variables, retrieved from single-tree vertical height and least squares circle fit methods, respectively. All the variables obtained for forest plots were used as input to a machine learning based Random Forest Regression Model, which was developed in this study for forest AGB estimation. Modelled output for forest AGB showed reliable accuracy (RMSE = 27.68 t/ha) and a good coefficient of determination (0.63) was obtained through the linear regression between modelled AGB and field-estimated AGB. The sensitivity analysis showed that the model was most sensitive to the major contributing variables (stem diameter and volume scattering), and these variables were measured with two different remote sensing techniques. This study strongly recommends the integration of SAR and LiDAR data for forest AGB estimation.
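A minimal sketch (synthetic stand-ins, not the RISAT-1/TLS data; plot count, value ranges, and the toy AGB relation are assumptions) of the modelling step described above: a random forest regression of plot AGB on SAR scattering components and TLS-derived structure variables, evaluated with out-of-bag predictions.

```python
# Random forest regression of plot AGB on SAR and TLS-derived predictors.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score, mean_squared_error

rng = np.random.default_rng(5)
n = 60                                                # hypothetical plots
X = np.column_stack([
    rng.uniform(0, 1, n),                             # surface scattering
    rng.uniform(0, 1, n),                             # double-bounce scattering
    rng.uniform(0, 1, n),                             # volume scattering
    rng.uniform(10, 60, n),                           # stem diameter (cm) from TLS
    rng.uniform(10, 35, n),                           # tree height (m) from TLS
])
agb = 3.5 * X[:, 3] + 40 * X[:, 2] + rng.normal(scale=15, size=n)   # t/ha, toy relation

rf = RandomForestRegressor(n_estimators=500, oob_score=True, random_state=0).fit(X, agb)
pred = rf.oob_prediction_                             # out-of-bag predictions as a cheap validation
print("RMSE:", mean_squared_error(agb, pred) ** 0.5, "R2:", r2_score(agb, pred))
print("importances:", np.round(rf.feature_importances_, 2))
```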
The microcomputer scientific software series 2: general linear model--regression.
Harold M. Rauscher
1983-01-01
The general linear model regression (GLMR) program provides the microcomputer user with a sophisticated regression analysis capability. The output provides a regression ANOVA table, estimators of the regression model coefficients, their confidence intervals, confidence intervals around the predicted Y-values, residuals for plotting, a check for multicollinearity, a...
Perturbed effects at radiation physics
NASA Astrophysics Data System (ADS)
Külahcı, Fatih; Şen, Zekâi
2013-09-01
Perturbation methodology is applied in order to assess the behavior of the linear attenuation coefficient, mass attenuation coefficient and cross-section when there are random components in the basic variables, such as the radiation amounts frequently used in radiation physics and chemistry. Additionally, the layer attenuation coefficient (LAC) and perturbed LAC (PLAC) are proposed for different contact materials. Perturbation methodology provides the opportunity to obtain results with random deviations from the average behavior of each variable that enters the whole mathematical expression. The basic photon intensity variation expression, the inverse exponential power law (the Beer-Lambert law), is adopted for the exposition of the perturbation method. Perturbed results are presented not only in terms of the mean but also in terms of the standard deviation and the correlation coefficients. Such perturbation expressions allow one to assess small random variability in the basic variables.
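For concreteness, a sketch (notation assumed, not taken verbatim from the paper) of the attenuation law referred to above and its low-order perturbation when the attenuation coefficient carries a zero-mean random component:

```latex
\[
  I = I_0 \, e^{-\mu x}, \qquad
  \mu = \bar{\mu} + \mu', \quad \mathrm{E}[\mu'] = 0,
\]
\[
  I \approx I_0 \, e^{-\bar{\mu} x}\left(1 - \mu' x + \tfrac{1}{2}\mu'^2 x^2\right),
  \qquad
  \mathrm{E}[I] \approx I_0 \, e^{-\bar{\mu} x}\left(1 + \tfrac{1}{2}\sigma_{\mu}^2 x^2\right),
\]
% so the mean intensity, its standard deviation, and correlations with the input variables
% can all be read off from the perturbed expansion rather than from the average law alone.
```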
Lowery, Michael G; Calfin, Brenda; Yeh, Shu-Jen; Doan, Tao; Shain, Eric; Hanna, Charles; Hohs, Ronald; Kantor, Stan; Lindberg, John; Khalil, Omar S
2006-01-01
We used the effect of temperature on the localized reflectance of human skin to assess the role of noise sources in the correlation between the temperature-induced fractional change in optical density of human skin (DeltaOD(T)) and blood glucose concentration [BG]. Two temperature-controlled optical probes at 30 degrees C contacted the skin; one was then cooled by -10 degrees C, the other was heated by +10 degrees C. DeltaOD(T) upon cooling or heating was correlated with capillary [BG] of diabetic volunteers over a period of three days. Calibration models from the first two days were used to predict [BG] on the third day. We examined the conditions under which the correlation coefficient (R2) for predicting [BG] on a third day ranked higher than the R2 values resulting from fitting permutations of randomized [BG] to the same DeltaOD(T) values. It was possible to establish a four-term linear regression correlation between DeltaOD(T) upon cooling and [BG] with a correlation coefficient higher than that of an established noise threshold in diabetic patients who were mostly female with less than 20 years of diabetes duration. The ability to predict [BG] values with a correlation coefficient above biological and body-interface noise varied between the cases of cooling and heating.
Innovating patient care delivery: DSRIP's interrupted time series analysis paradigm.
Shenoy, Amrita G; Begley, Charles E; Revere, Lee; Linder, Stephen H; Daiger, Stephen P
2017-12-08
Adoption of a Medicaid Section 1115 waiver is one of the many ways of innovating the healthcare delivery system. The Delivery System Reform Incentive Payment (DSRIP) pool, one of the two funding pools of the waiver, has four categories: infrastructure development, program innovation and redesign, quality improvement reporting and, lastly, bringing about population health improvement. A metric of the fourth category, the preventable hospitalization (PH) rate, was analyzed in the context of eight conditions for two time periods, pre-reporting years (2010-2012) and post-reporting years (2013-2015), for two hospital cohorts, DSRIP participating and non-participating hospitals. The study explains how DSRIP impacted PH rates for eight conditions for both hospital cohorts within the two time periods. Eight PH rates were regressed as the dependent variable with time, intervention and post-DSRIP intervention as independent variables. PH rates of the eight conditions were then consolidated into one rate for regression on the above independent variables to evaluate the overall impact of DSRIP. An interrupted time series regression was performed after accounting for auto-correlation, stationarity and seasonality in the dataset. In the individual regression models, PH rates showed statistically significant coefficients for seven out of eight conditions in DSRIP participating hospitals. In the combined regression model, the PH rate showed a statistically significant decrease, with negative regression coefficients in DSRIP participating hospitals compared to positive/increased regression coefficients in DSRIP non-participating hospitals. Several macro- and micro-level factors likely contributed to DSRIP participating hospitals outperforming DSRIP non-participating hospitals. Healthcare organization/provider collaboration, support from healthcare professionals, DSRIP's design, state reimbursement and coordination in care delivery methods may have contributed to the likely success of DSRIP. IV, a retrospective cohort study based on longitudinal data. Copyright © 2017 Elsevier Inc. All rights reserved.
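A minimal sketch of a segmented interrupted time series regression of the sort described (simulated monthly PH-like rates; the time grid, break point, and effect sizes are assumptions), with level-change and slope-change terms and autocorrelation-robust (Newey-West) standard errors.

```python
# Segmented interrupted time series regression on a simulated monthly rate series.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)
t = np.arange(72)                                     # 72 hypothetical months (2010-2015)
post = (t >= 36).astype(int)                          # reporting period starts at t = 36
rate = 50 - 0.05 * t - 4 * post - 0.15 * post * (t - 36) + rng.normal(scale=1.5, size=72)
df = pd.DataFrame({"rate": rate, "t": t, "post": post, "t_post": post * (t - 36)})

fit = smf.ols("rate ~ t + post + t_post", data=df).fit(
    cov_type="HAC", cov_kwds={"maxlags": 6})          # robust to serial correlation
print(fit.params[["post", "t_post"]])                 # level change and post-intervention slope change
```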
NASA Astrophysics Data System (ADS)
Lai, Xiaoming; Zhu, Qing; Zhou, Zhiwen; Liao, Kaihua
2017-12-01
In this study, seven random combination sampling strategies were applied to investigate the uncertainties in estimating the hillslope mean soil water content (SWC) and the correlation coefficients between the SWC and soil/terrain properties on a tea + bamboo hillslope. One of the sampling strategies is global random sampling and the other six are stratified random sampling on the top, middle, toe, top + mid, top + toe and mid + toe slope positions. When each sampling strategy was applied, sample sizes were gradually reduced and each sample size contained 3000 replicates. Under each sample size of each sampling strategy, the relative errors (REs) and coefficients of variation (CVs) of the estimated hillslope mean SWC and correlation coefficients between the SWC and soil/terrain properties were calculated to quantify the accuracy and uncertainty. The results showed that the uncertainty of the estimations decreased as the sample size increased. However, larger sample sizes were required to reduce the uncertainty in correlation coefficient estimation than in hillslope mean SWC estimation. Under global random sampling, 12 randomly sampled sites on this hillslope were adequate to estimate the hillslope mean SWC with RE and CV ≤10%. However, at least 72 randomly sampled sites were needed to ensure that the estimated correlation coefficients had REs and CVs ≤10%. Among all sampling strategies, reducing sampling sites on the middle slope had the least influence on the estimation of the hillslope mean SWC and correlation coefficients. Under this strategy, 60 sites (10 on the middle slope and 50 on the top and toe slopes) were enough to ensure that the estimated correlation coefficients had REs and CVs ≤10%. This suggests that when designing SWC sampling, the proportion of sites on the middle slope can be reduced to 16.7% of the total number of sites. Findings of this study will be useful for optimal SWC sampling design.
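A minimal sketch of the resampling scheme described above (a toy field with assumed site count and SWC distribution): for each sample size, draw many random subsets of sites, estimate the field-mean SWC, and summarize accuracy by the relative error (RE) and coefficient of variation (CV) of the estimates.

```python
# Monte Carlo subsampling to quantify RE and CV of the estimated mean SWC at several sample sizes.
import numpy as np

rng = np.random.default_rng(7)
swc = rng.normal(0.30, 0.05, size=100)                # "true" SWC at 100 hypothetical sites
true_mean = swc.mean()

for n_sites in (6, 12, 24, 48):
    est = np.array([swc[rng.choice(swc.size, n_sites, replace=False)].mean()
                    for _ in range(3000)])            # 3000 replicates, as in the study design
    re = 100 * np.abs(est - true_mean).mean() / true_mean
    cv = 100 * est.std(ddof=1) / est.mean()
    print(f"n = {n_sites:2d}  RE = {re:.1f}%  CV = {cv:.1f}%")
```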
Improved estimates of partial volume coefficients from noisy brain MRI using spatial context.
Manjón, José V; Tohka, Jussi; Robles, Montserrat
2010-11-01
This paper addresses the problem of accurate voxel-level estimation of tissue proportions in the human brain magnetic resonance imaging (MRI). Due to the finite resolution of acquisition systems, MRI voxels can contain contributions from more than a single tissue type. The voxel-level estimation of this fractional content is known as partial volume coefficient estimation. In the present work, two new methods to calculate the partial volume coefficients under noisy conditions are introduced and compared with current similar methods. Concretely, a novel Markov Random Field model allowing sharp transitions between partial volume coefficients of neighbouring voxels and an advanced non-local means filtering technique are proposed to reduce the errors due to random noise in the partial volume coefficient estimation. In addition, a comparison was made to find out how the different methodologies affect the measurement of the brain tissue type volumes. Based on the obtained results, the main conclusions are that (1) both Markov Random Field modelling and non-local means filtering improved the partial volume coefficient estimation results, and (2) non-local means filtering was the better of the two strategies for partial volume coefficient estimation. Copyright 2010 Elsevier Inc. All rights reserved.
Poor methodological quality and reporting standards of systematic reviews in burn care management.
Wasiak, Jason; Tyack, Zephanie; Ware, Robert; Goodwin, Nicholas; Faggion, Clovis M
2017-10-01
The methodological and reporting quality of burn-specific systematic reviews has not been established. The aim of this study was to evaluate the methodological quality of systematic reviews in burn care management. Computerised searches were performed in Ovid MEDLINE, Ovid EMBASE and The Cochrane Library through to February 2016 for systematic reviews relevant to burn care using medical subject and free-text terms such as 'burn', 'systematic review' or 'meta-analysis'. Additional studies were identified by hand-searching five discipline-specific journals. Two authors independently screened papers, extracted data and evaluated methodological quality using the 11-item A Measurement Tool to Assess Systematic Reviews (AMSTAR) tool and reporting quality using the 27-item Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) checklist. Characteristics of systematic reviews associated with methodological and reporting quality were identified. Descriptive statistics and linear regression identified features associated with improved methodological quality. A total of 60 systematic reviews met the inclusion criteria. Six of the 11 AMSTAR items, reporting on 'a priori' design, duplicate study selection, grey literature, included/excluded studies, publication bias and conflict of interest, were reported in less than 50% of the systematic reviews. Of the 27 items listed for PRISMA, 13 items, reporting on the introduction, methods, results and discussion, were addressed in less than 50% of systematic reviews. Multivariable analyses showed that systematic reviews with higher methodological or reporting quality incorporated a meta-analysis (AMSTAR regression coefficient 2.1; 95% CI: 1.1, 3.1; PRISMA regression coefficient 6.3; 95% CI: 3.8, 8.7), were published in The Cochrane Library (AMSTAR regression coefficient 2.9; 95% CI: 1.6, 4.2; PRISMA regression coefficient 6.1; 95% CI: 3.1, 9.2), and included a randomised controlled trial (AMSTAR regression coefficient 1.4; 95% CI: 0.4, 2.4; PRISMA regression coefficient 3.4; 95% CI: 0.9, 5.8). The methodological and reporting quality of systematic reviews in burn care requires further improvement, with stricter adherence by authors to the PRISMA checklist and AMSTAR tool. © 2016 Medicalhelplines.com Inc and John Wiley & Sons Ltd.
Possibility of modifying the growth trajectory in Raeini Cashmere goat.
Ghiasi, Heydar; Mokhtari, M S
2018-03-27
The objective of this study was to investigate the possibility of modifying the growth trajectory in the Raeini Cashmere goat breed. In total, 13,193 records on live body weight collected from 4788 Raeini Cashmere goats were used. According to Akaike's information criterion (AIC), the single-trait random regression model that included a fourth-order Legendre polynomial for the direct and maternal genetic effects and the maternal and individual permanent environmental effects was the best model for estimating (co)variance components. The matrices of eigenvectors of the (co)variances between random regression coefficients for the direct additive genetic effect were used to calculate eigenfunctions, and different eigenvector indices were also constructed. The results showed that the first eigenvalue explained 79.90% of the total genetic variance. Therefore, changes in body weight obtained by selecting on the first eigenfunction will be achieved rapidly. Selection based on the first eigenvector will produce favorable positive genetic gains for all body weights considered from birth to 12 months of age. For modifying the growth trajectory in the Raeini Cashmere goat, selection should be based on the second eigenfunction. The second eigenvalue accounted for 14.41% of the total genetic variance for body weights, which is low in comparison with the genetic variance explained by the first eigenvalue. Complex patterns of genetic change in the growth trajectory were observed under the third and fourth eigenfunctions, and a low amount of genetic variance was explained by the third and fourth eigenvalues.
Wilson, Kathryn M.; Vesper, Hubert W.; Tocco, Paula; Sampson, Laura; Rosén, Johan; Hellenäs, Karl-Erik; Törnqvist, Margareta; Willett, Walter C.
2011-01-01
Objective Acrylamide, a probable human carcinogen, is formed during high-heat cooking of many common foods. The validity of food frequency questionnaire (FFQ) measures of acrylamide intake has not been established. We assessed the validity of acrylamide intake calculated from an FFQ using a biomarker of acrylamide exposure. Methods We calculated acrylamide intake from an FFQ in the Nurses' Health Study II. We measured hemoglobin adducts of acrylamide and its metabolite, glycidamide, in a random sample of 296 women. Correlation and regression analyses were used to assess the relationship between acrylamide intake and adducts. Results The correlation between acrylamide intake and the sum of acrylamide and glycidamide adducts was 0.31 (95% CI: 0.20 – 0.41), adjusted for laboratory batch, energy intake, and age. Further adjustment for BMI, alcohol intake, and correction for random within-person measurement error in adducts gave a correlation of 0.34 (CI: 0.23 – 0.45). The intraclass correlation coefficient for the sum of adducts was 0.77 in blood samples collected 1 to 3 years apart in a subset of 45 women. Intake of several foods significantly predicted adducts in multiple regression. Conclusions Acrylamide intake and hemoglobin adducts of acrylamide and glycidamide were moderately correlated. Within-person consistency in adducts was high over time. PMID:18855107
Kheirabadi, Khabat; Rashidi, Amir; Alijani, Sadegh; Imumorin, Ikhide
2014-11-01
We compared the goodness of fit of three mathematical functions (Legendre polynomials, the Lidauer-Mäntysaari function and the Wilmink function) for describing the lactation curve of primiparous Iranian Holstein cows by using multiple-trait random regression models (MT-RRM). Lactational submodels provided the largest daily additive genetic (AG) and permanent environmental (PE) variance estimates at the end and at the onset of lactation, respectively, as well as low genetic correlations between peripheral test-day records. For all models, heritability estimates were highest at the end of lactation (245 to 305 days) and ranged from 0.05 to 0.26, 0.03 to 0.12 and 0.04 to 0.24 for milk, fat and protein yields, respectively. Generally, the genetic correlations between traits depended on how far apart the test days were and on whether the records for any two traits were on the same day. On average, genetic correlations between milk and fat were the lowest and those between fat and protein were intermediate, while those between milk and protein were the highest. Results from all criteria (Akaike's and Schwarz's Bayesian information criteria, and -2*logarithm of the likelihood function) suggested that a model with 2 and 5 coefficients of Legendre polynomials for the AG and PE effects, respectively, was the most adequate for fitting the data. © 2014 Japanese Society of Animal Science.
Solid harmonic wavelet scattering for predictions of molecule properties
NASA Astrophysics Data System (ADS)
Eickenberg, Michael; Exarchakis, Georgios; Hirn, Matthew; Mallat, Stéphane; Thiry, Louis
2018-06-01
We present a machine learning algorithm for the prediction of molecule properties inspired by ideas from density functional theory (DFT). Using Gaussian-type orbital functions, we create surrogate electronic densities of the molecule from which we compute invariant "solid harmonic scattering coefficients" that account for different types of interactions at different scales. Multilinear regressions of various physical properties of molecules are computed from these invariant coefficients. Numerical experiments show that these regressions have near state-of-the-art performance, even with relatively few training examples. Predictions over small sets of scattering coefficients can reach a DFT precision while being interpretable.
Kupek, Emil
2006-03-15
Structural equation modelling (SEM) has been increasingly used in medical statistics for solving a system of related regression equations. However, a great obstacle to its wider use has been its difficulty in handling categorical variables within the framework of generalised linear models. A large data set with a known structure among two related outcomes and three independent variables was generated to investigate the use of Yule's transformation of the odds ratio (OR) into the Q-metric by (OR-1)/(OR+1) to approximate Pearson's correlation coefficients between binary variables, whose covariance structure can then be analysed by SEM. The percentage of correctly classified events and non-events was compared with the classification obtained by logistic regression. The performance of SEM based on the Q-metric was also checked on a small (N = 100) random sample of the generated data and on a real data set. SEM successfully recovered the generated model structure. SEM of the real data suggested a significant influence of a latent confounding variable which would not have been detectable by standard logistic regression. SEM classification performance was broadly similar to that of logistic regression. The analysis of binary data can be greatly enhanced by Yule's transformation of odds ratios into an estimated correlation matrix that can be further analysed by SEM. The interpretation of results is aided by expressing them as odds ratios, which are the most frequently used measure of effect in medical statistics.
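A minimal sketch of Yule's Q transformation mentioned above: an odds ratio between two binary variables is mapped to (OR - 1)/(OR + 1), which approximates their correlation and can be assembled into a correlation matrix for SEM software.

```python
# Yule's Q transformation of an odds ratio into an approximate binary-variable correlation.
def yule_q(odds_ratio: float) -> float:
    """Return (OR - 1)/(OR + 1), an approximation to the correlation of two binary variables."""
    return (odds_ratio - 1.0) / (odds_ratio + 1.0)

print(yule_q(1.0))    # 0.0  -> no association
print(yule_q(4.0))    # 0.6  -> strong positive association
print(yule_q(0.25))   # -0.6 -> strong negative association
```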
Gene expression models for prediction of longitudinal dispersion coefficient in streams
NASA Astrophysics Data System (ADS)
Sattar, Ahmed M. A.; Gharabaghi, Bahram
2015-05-01
Longitudinal dispersion is the key hydrologic process that governs transport of pollutants in natural streams. It is critical for spill action centers to be able to predict the pollutant travel time and break-through curves accurately following accidental spills in urban streams. This study presents a novel gene expression model for longitudinal dispersion developed using 150 published data sets of geometric and hydraulic parameters in natural streams in the United States, Canada, Europe, and New Zealand. The training and testing of the model were accomplished using randomly-selected 67% (100 data sets) and 33% (50 data sets) of the data sets, respectively. Gene expression programming (GEP) is used to develop empirical relations between the longitudinal dispersion coefficient and various control variables, including the Froude number which reflects the effect of reach slope, aspect ratio, and the bed material roughness on the dispersion coefficient. Two GEP models have been developed, and the prediction uncertainties of the developed GEP models are quantified and compared with those of existing models, showing improved prediction accuracy in favor of GEP models. Finally, a parametric analysis is performed for further verification of the developed GEP models. The main reason for the higher accuracy of the GEP models compared to the existing regression models is that exponents of the key variables (aspect ratio and bed material roughness) are not constants but a function of the Froude number. The proposed relations are both simple and accurate and can be effectively used to predict the longitudinal dispersion coefficients in natural streams.
Factors in Variability of Serial Gabapentin Concentrations in Elderly Patients with Epilepsy.
Conway, Jeannine M; Eberly, Lynn E; Collins, Joseph F; Macias, Flavia M; Ramsay, R Eugene; Leppik, Ilo E; Birnbaum, Angela K
2017-10-01
To characterize and quantify the variability of serial gabapentin concentrations in elderly patients with epilepsy. This study included 83 patients (age ≥ 60 yrs) from an 18-center randomized double-blind double-dummy parallel study from the Veterans Affairs Cooperative 428 Study. All patients were taking 1500 mg/day gabapentin. Within-person coefficient of variation (CV) in gabapentin concentrations, measured weekly to bimonthly for up to 52 weeks, then quarterly, was computed. The impacts of patient characteristics on gabapentin concentrations (linear mixed model) and on CV (linear regression) were estimated. A total of 482 gabapentin concentration measurements were available for analysis. Gabapentin concentrations and intrapatient CVs ranged from 0.5 to 22.6 μg/ml (mean 7.9 μg/ml, standard deviation [SD] 4.1 μg/ml) and 2% to 79% (mean 27.9%, SD 15.3%), respectively, across all visits. Intrapatient CV was higher by 7.3% for those with a body mass index of ≥ 30 kg/m2 (coefficient = 7.3, p=0.04). CVs were on average 0.5% higher for each 1-unit higher CV in creatinine clearance (coefficient = 0.5, p=0.03) and 1.2% higher for each 1-hour longer mean time after dose (coefficient = 1.2, p=0.04). Substantial intrapatient variability in serial gabapentin concentration was noted in elderly patients with epilepsy. Creatinine clearance, time of sampling relative to dose, and obesity were found to be positively associated with variability. © 2017 Pharmacotherapy Publications, Inc.
Manafiazar, G; McFadden, T; Goonewardene, L; Okine, E; Basarab, J; Li, P; Wang, Z
2013-01-01
Residual Feed Intake (RFI) is a measure of energy efficiency. Developing an appropriate model to predict expected energy intake while accounting for multifunctional energy requirements of metabolic body weight (MBW), empty body weight (EBW), milk production energy requirements (MPER), and their nonlinear lactation profiles, is the key to successful prediction of RFI in dairy cattle. Individual daily actual energy intake and monthly body weight of 281 first-lactation dairy cows from 1 to 305 d in milk were recorded at the Dairy Research and Technology Centre of the University of Alberta (Edmonton, AB, Canada); individual monthly milk yield and compositions were obtained from the Dairy Herd Improvement Program. Combinations of different orders (1-5) of fixed (F) and random (R) factors were fitted using Legendre polynomial regression to model the nonlinear lactation profiles of MBW, EBW, and MPER over 301 d. The F5R3, F5R3, and F5R2 (subscripts indicate the order fitted) models were selected, based on the combination of the log-likelihood ratio test and the Bayesian information criterion, as the best prediction equations for MBW, EBW, and MPER, respectively. The selected models were used to predict daily individual values for these traits. To consider the body reserve changes, the differences of predicted EBW between 2 consecutive days were considered as the EBW change between these days. The smoothed total 301-d actual energy intake was then linearly regressed on the total 301-d predicted traits of MBW, EBW change, and MPER to obtain the first-lactation RFI (coefficient of determination=0.68). The mean of predicted daily average lactation RFI was 0 and ranged from -6.58 to 8.64 Mcal of NE(L)/d. Fifty-one percent of the animals had an RFI value below the mean (efficient) and 49% of them had an RFI value above the mean (inefficient). These results indicate that the first-lactation RFI can be predicted from its component traits with a reasonable coefficient of determination. The predicted RFI could be used in the dairy breeding program to increase profitability by selecting animals that are genetically superior in energy efficiency based on RFI, or through routinely measured traits, which are genetically correlated with RFI. Copyright © 2013 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Kargoll, Boris; Omidalizarandi, Mohammad; Loth, Ina; Paffenholz, Jens-André; Alkhatib, Hamza
2018-03-01
In this paper, we investigate a linear regression time series model of possibly outlier-afflicted observations and autocorrelated random deviations. This colored noise is represented by a covariance-stationary autoregressive (AR) process, in which the independent error components follow a scaled (Student's) t-distribution. This error model allows for the stochastic modeling of multiple outliers and for an adaptive robust maximum likelihood (ML) estimation of the unknown regression and AR coefficients, the scale parameter, and the degree of freedom of the t-distribution. This approach is meant to be an extension of known estimators, which tend to focus only on the regression model, or on the AR error model, or on normally distributed errors. For the purpose of ML estimation, we derive an expectation conditional maximization either algorithm, which leads to an easy-to-implement version of iteratively reweighted least squares. The estimation performance of the algorithm is evaluated via Monte Carlo simulations for a Fourier as well as a spline model in connection with AR colored noise models of different orders and with three different sampling distributions generating the white noise components. We apply the algorithm to a vibration dataset recorded by a high-accuracy, single-axis accelerometer, focusing on the evaluation of the estimated AR colored noise model.
Galloway, Joel M.
2014-01-01
The Red River of the North (hereafter referred to as “Red River”) Basin is an important hydrologic region where water is a valuable resource for the region’s economy. Continuous water-quality monitors have been operated by the U.S. Geological Survey, in cooperation with the North Dakota Department of Health, Minnesota Pollution Control Agency, City of Fargo, City of Moorhead, City of Grand Forks, and City of East Grand Forks at the Red River at Fargo, North Dakota, from 2003 through 2012 and at Grand Forks, N.Dak., from 2007 through 2012. The purpose of the monitoring was to provide a better understanding of the water-quality dynamics of the Red River and provide a way to track changes in water quality. Regression equations were developed that can be used to estimate concentrations and loads for dissolved solids, sulfate, chloride, nitrate plus nitrite, total phosphorus, and suspended sediment using explanatory variables such as streamflow, specific conductance, and turbidity. Specific conductance was determined to be a significant explanatory variable for estimating dissolved solids concentrations at the Red River at Fargo and Grand Forks. The regression equations provided good relations between dissolved solid concentrations and specific conductance for the Red River at Fargo and at Grand Forks, with adjusted coefficients of determination of 0.99 and 0.98, respectively. Specific conductance, log-transformed streamflow, and a seasonal component were statistically significant explanatory variables for estimating sulfate in the Red River at Fargo and Grand Forks. Regression equations provided good relations between sulfate concentrations and the explanatory variables, with adjusted coefficients of determination of 0.94 and 0.89, respectively. For the Red River at Fargo and Grand Forks, specific conductance, streamflow, and a seasonal component were statistically significant explanatory variables for estimating chloride. For the Red River at Grand Forks, a time component also was a statistically significant explanatory variable for estimating chloride. The regression equations for chloride at the Red River at Fargo provided a fair relation between chloride concentrations and the explanatory variables, with an adjusted coefficient of determination of 0.66 and the equation for the Red River at Grand Forks provided a relatively good relation between chloride concentrations and the explanatory variables, with an adjusted coefficient of determination of 0.77. Turbidity and streamflow were statistically significant explanatory variables for estimating nitrate plus nitrite concentrations at the Red River at Fargo and turbidity was the only statistically significant explanatory variable for estimating nitrate plus nitrite concentrations at Grand Forks. The regression equation for the Red River at Fargo provided a relatively poor relation between nitrate plus nitrite concentrations, turbidity, and streamflow, with an adjusted coefficient of determination of 0.46. The regression equation for the Red River at Grand Forks provided a fair relation between nitrate plus nitrite concentrations and turbidity, with an adjusted coefficient of determination of 0.73. Some of the variability that was not explained by the equations might be attributed to different sources contributing nitrates to the stream at different times. Turbidity, streamflow, and a seasonal component were statistically significant explanatory variables for estimating total phosphorus at the Red River at Fargo and Grand Forks. 
The regression equation for the Red River at Fargo provided a relatively fair relation between total phosphorus concentrations, turbidity, streamflow, and season, with an adjusted coefficient of determination of 0.74. The regression equation for the Red River at Grand Forks provided a good relation between total phosphorus concentrations, turbidity, streamflow, and season, with an adjusted coefficient of determination of 0.87. For the Red River at Fargo, turbidity and streamflow were statistically significant explanatory variables for estimating suspended-sediment concentrations. For the Red River at Grand Forks, turbidity was the only statistically significant explanatory variable for estimating suspended-sediment concentration. The regression equation at the Red River at Fargo provided a good relation between suspended-sediment concentration, turbidity, and streamflow, with an adjusted coefficient of determination of 0.95. The regression equation for the Red River at Grand Forks provided a good relation between suspended-sediment concentration and turbidity, with an adjusted coefficient of determination of 0.96.
NASA Astrophysics Data System (ADS)
Yoshida, Kenichiro; Nishidate, Izumi; Ojima, Nobutoshi; Iwata, Kayoko
2014-01-01
To quantitatively evaluate skin chromophores over a wide region of curved skin surface, we propose an approach that suppresses the effect of the shading-derived error in the reflectance on the estimation of chromophore concentrations, without sacrificing the accuracy of that estimation. In our method, we use multiple regression analysis, assuming the absorbance spectrum as the response variable and the extinction coefficients of melanin, oxygenated hemoglobin, and deoxygenated hemoglobin as the predictor variables. The concentrations of melanin and total hemoglobin are determined from the multiple regression coefficients using compensation formulae (CF) based on the diffuse reflectance spectra derived from a Monte Carlo simulation. To suppress the shading-derived error, we investigated three different combinations of multiple regression coefficients for the CF. In vivo measurements with the forearm skin demonstrated that the proposed approach can reduce the estimation errors that are due to shading-derived errors in the reflectance. With the best combination of multiple regression coefficients, we estimated that the ratio of the error to the chromophore concentrations is about 10%. The proposed method does not require any measurements or assumptions about the shape of the subjects; this is an advantage over other studies related to the reduction of shading-derived errors.
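A minimal sketch (made-up spectra, not the authors' Monte Carlo-derived values; the wavelength grid and extinction curves are assumptions) of the regression step described above: the skin absorbance spectrum is regressed on the extinction spectra of melanin, oxygenated and deoxygenated hemoglobin, plus an intercept that absorbs a wavelength-independent shading offset.

```python
# Multiple regression of an absorbance spectrum on chromophore extinction spectra.
import numpy as np

rng = np.random.default_rng(8)
wl = np.linspace(500, 600, 50)                        # wavelengths (nm), assumed grid
eps_mel = np.exp(-(wl - 500) / 80)                    # stand-in extinction spectra
eps_hbo = np.exp(-((wl - 560) / 15) ** 2)
eps_hb = np.exp(-((wl - 555) / 25) ** 2)

absorbance = (0.8 * eps_mel + 0.3 * eps_hbo + 0.1 * eps_hb
              + 0.05 + 0.01 * rng.normal(size=wl.size))   # 0.05 mimics a shading offset

X = np.column_stack([np.ones_like(wl), eps_mel, eps_hbo, eps_hb])
coef, *_ = np.linalg.lstsq(X, absorbance, rcond=None)
print("multiple regression coefficients:", np.round(coef, 3))  # inputs to the compensation formulae
```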
Correlation and simple linear regression.
Zou, Kelly H; Tuncali, Kemal; Silverman, Stuart G
2003-06-01
In this tutorial article, the concepts of correlation and regression are reviewed and demonstrated. The authors review and compare two correlation coefficients, the Pearson correlation coefficient and the Spearman rho, for measuring linear and nonlinear relationships between two continuous variables. In the case of measuring the linear relationship between a predictor and an outcome variable, simple linear regression analysis is conducted. These statistical concepts are illustrated by using a data set from published literature to assess a computed tomography-guided interventional technique. These statistical methods are important for exploring the relationships between variables and can be applied to many radiologic studies.
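A minimal worked example (synthetic data, not the CT-guided intervention data set used in the tutorial) of the two correlation coefficients and the simple linear regression it discusses.

```python
# Pearson and Spearman correlations plus a simple linear regression on synthetic data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
x = rng.uniform(0, 10, 40)                            # predictor
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=40)    # outcome, linear in x plus noise

print("Pearson r   :", stats.pearsonr(x, y)[0])       # linear association
print("Spearman rho:", stats.spearmanr(x, y)[0])      # monotonic (rank-based) association

slope, intercept, r, p, se = stats.linregress(x, y)   # simple linear regression
print(f"y = {intercept:.2f} + {slope:.2f} x,  p = {p:.3g}")
```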
Exact Analysis of Squared Cross-Validity Coefficient in Predictive Regression Models
ERIC Educational Resources Information Center
Shieh, Gwowen
2009-01-01
In regression analysis, the notion of population validity is of theoretical interest for describing the usefulness of the underlying regression model, whereas the presumably more important concept of population cross-validity represents the predictive effectiveness for the regression equation in future research. It appears that the inference…
Random diffusion and leverage effect in financial markets.
Perelló, Josep; Masoliver, Jaume
2003-03-01
We prove that Brownian market models with random diffusion coefficients provide an exact measure of the leverage effect [J-P. Bouchaud et al., Phys. Rev. Lett. 87, 228701 (2001)]. This empirical fact asserts that past returns are anticorrelated with the future diffusion coefficient. Several models with random diffusion have been suggested, but without a quantitative study of the leverage effect. Our analysis allows us to fully estimate all parameters involved and enables a deeper study of correlated random diffusion models that may have practical implications for many aspects of financial markets.
NASA Astrophysics Data System (ADS)
Wheeler, David C.; Waller, Lance A.
2009-03-01
In this paper, we compare and contrast a Bayesian spatially varying coefficient process (SVCP) model with a geographically weighted regression (GWR) model for the estimation of the potentially spatially varying regression effects of alcohol outlets and illegal drug activity on violent crime in Houston, Texas. In addition, we focus on the inherent coefficient shrinkage properties of the Bayesian SVCP model as a way to address increased coefficient variance that follows from collinearity in GWR models. We outline the advantages of the Bayesian model in terms of reducing inflated coefficient variance, enhanced model flexibility, and more formal measuring of model uncertainty for prediction. We find spatially varying effects for alcohol outlets and drug violations, but the amount of variation depends on the type of model used. For the Bayesian model, this variation is controllable through the amount of prior influence placed on the variance of the coefficients. For example, the spatial pattern of coefficients is similar for the GWR and Bayesian models when a relatively large prior variance is used in the Bayesian model.
Genetic background in partitioning of metabolizable energy efficiency in dairy cows.
Mehtiö, T; Negussie, E; Mäntysaari, P; Mäntysaari, E A; Lidauer, M H
2018-05-01
The main objective of this study was to assess the genetic differences in metabolizable energy efficiency and efficiency in partitioning metabolizable energy in different pathways: maintenance, milk production, and growth in primiparous dairy cows. Repeatability models for residual energy intake (REI) and metabolizable energy intake (MEI) were compared and the genetic and permanent environmental variations in MEI were partitioned into its energy sinks using random regression models. We proposed 2 new feed efficiency traits: metabolizable energy efficiency (MEE), which is formed by modeling MEI fitting regressions on energy sinks [metabolic body weight (BW^0.75), energy-corrected milk, body weight gain, and body weight loss] directly; and partial MEE (pMEE), where the model for MEE is extended with regressions on energy sinks nested within additive genetic and permanent environmental effects. The data used were collected from Luke's experimental farms Rehtijärvi and Minkiö between 1998 and 2014. There were altogether 12,350 weekly MEI records on 495 primiparous Nordic Red dairy cows from wk 2 to 40 of lactation. Heritability estimates for REI and MEE were moderate, 0.33 and 0.26, respectively. The estimate of the residual variance was smaller for MEE than for REI, indicating that analyzing weekly MEI observations simultaneously with energy sinks is preferable. Model validation based on Akaike's information criterion showed that pMEE models fitted the data even better and also resulted in smaller residual variance estimates. However, models that included random regression on BW^0.75 converged slowly. The resulting genetic standard deviation estimate from the pMEE coefficient for milk production was 0.75 MJ of MEI/kg of energy-corrected milk. The derived partial heritabilities for energy efficiency in maintenance, milk production, and growth were 0.02, 0.06, and 0.04, respectively, indicating that some genetic variation may exist in the efficiency of using metabolizable energy for different pathways in dairy cows. Copyright © 2018 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
2014-01-01
Background Meta-regression is becoming increasingly used to model study level covariate effects. However this type of statistical analysis presents many difficulties and challenges. Here two methods for calculating confidence intervals for the magnitude of the residual between-study variance in random effects meta-regression models are developed. A further suggestion for calculating credible intervals using informative prior distributions for the residual between-study variance is presented. Methods Two recently proposed and, under the assumptions of the random effects model, exact methods for constructing confidence intervals for the between-study variance in random effects meta-analyses are extended to the meta-regression setting. The use of Generalised Cochran heterogeneity statistics is extended to the meta-regression setting and a Newton-Raphson procedure is developed to implement the Q profile method for meta-analysis and meta-regression. WinBUGS is used to implement informative priors for the residual between-study variance in the context of Bayesian meta-regressions. Results Results are obtained for two contrasting examples, where the first example involves a binary covariate and the second involves a continuous covariate. Intervals for the residual between-study variance are wide for both examples. Conclusions Statistical methods, and R computer software, are available to compute exact confidence intervals for the residual between-study variance under the random effects model for meta-regression. These frequentist methods are almost as easily implemented as their established counterparts for meta-analysis. Bayesian meta-regressions are also easily performed by analysts who are comfortable using WinBUGS. Estimates of the residual between-study variance in random effects meta-regressions should be routinely reported and accompanied by some measure of their uncertainty. Confidence and/or credible intervals are well-suited to this purpose. PMID:25196829
Resonance energy transfer process in nanogap-based dual-color random lasing
NASA Astrophysics Data System (ADS)
Shi, Xiaoyu; Tong, Junhua; Liu, Dahe; Wang, Zhaona
2017-04-01
The resonance energy transfer (RET) process between Rhodamine 6G and oxazine in nanogap-based random systems is systematically studied by revealing the variations and fluctuations of the RET coefficients with pump power density. Three working regions are thus demonstrated in the dual-color random systems: stable fluorescence, dynamic laser, and stable laser. The stable RET coefficients in the fluorescence and lasing regions are generally different and depend greatly on the donor concentration and the donor-acceptor ratio. These results may provide a way to reveal the regularities of energy distribution in the random system and to design tunable multi-color coherent random lasers for colorful imaging.
Chung, Sang M; Lee, David J; Hand, Austin; Young, Philip; Vaidyanathan, Jayabharathi; Sahajwalla, Chandrahas
2015-12-01
The study evaluated whether the renal function decline rate per year with age in adults varies between two primary statistical analyses: cross-sectional (CS), using one observation per subject, and longitudinal (LT), using multiple observations per subject over time. A total of 16628 records (3946 subjects; age range 30-92 years) of creatinine clearance and relevant demographic data were used. On average, four samples per subject were collected over up to 2364 days (mean: 793 days). A simple linear regression model and a random coefficient model were selected for the CS and LT analyses, respectively. The renal function decline rates per year were 1.33 and 0.95 ml/min/year for the CS and LT analyses, respectively, and were slower when the repeated individual measurements were considered. The study confirms that the rates differ depending on the statistical analysis, and that a statistically robust longitudinal model with a proper sampling design provides reliable individual as well as population estimates of the renal function decline rate per year with age in adults. In conclusion, our findings indicate that one should be cautious in interpreting the renal function decline rate with aging because its estimate is highly dependent on the statistical analysis. From our analyses, a population longitudinal analysis (e.g. a random coefficient model) is recommended if individualization is critical, such as a dose adjustment based on renal function during chronic therapy. Copyright © 2015 John Wiley & Sons, Ltd.
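A minimal sketch (simulated creatinine-clearance-like data, not the study's records; subject count, visit schedule, and decline rate are assumptions) contrasting the two analyses: a cross-sectional OLS on one record per subject versus a longitudinal random coefficient (mixed-effects) model on the repeated measures.

```python
# Cross-sectional OLS vs. longitudinal random coefficient model for an age-related decline.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(10)
rows = []
for subj in range(300):
    base = rng.normal(100, 15)                        # subject's level at age 30
    slope = rng.normal(-1.0, 0.4)                     # subject-specific decline per year
    age0 = rng.uniform(30, 80)
    for visit in range(4):                            # four repeated measurements per subject
        age = age0 + visit * 1.5
        rows.append({"subj": subj, "age": age,
                     "crcl": base + slope * (age - 30) + rng.normal(scale=3)})
df = pd.DataFrame(rows)
df["age_c"] = df["age"] - 30                          # center age to stabilize the mixed model

cs = smf.ols("crcl ~ age_c", data=df.groupby("subj").first()).fit()          # one row per subject
lt = smf.mixedlm("crcl ~ age_c", df, groups="subj", re_formula="~age_c").fit()  # random slopes
print("cross-sectional slope:", round(cs.params["age_c"], 2))
print("longitudinal slope   :", round(lt.fe_params["age_c"], 2))
```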
SPSS and SAS programs for comparing Pearson correlations and OLS regression coefficients.
Weaver, Bruce; Wuensch, Karl L
2013-09-01
Several procedures that use summary data to test hypotheses about Pearson correlations and ordinary least squares regression coefficients have been described in various books and articles. To our knowledge, however, no single resource describes all of the most common tests. Furthermore, many of these tests have not yet been implemented in popular statistical software packages such as SPSS and SAS. In this article, we describe all of the most common tests and provide SPSS and SAS programs to perform them. When they are applicable, our code also computes 100 × (1 - α)% confidence intervals corresponding to the tests. For testing hypotheses about independent regression coefficients, we demonstrate one method that uses summary data and another that uses raw data (i.e., Potthoff analysis). When the raw data are available, the latter method is preferred, because use of summary data entails some loss of precision due to rounding.
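Although the article itself provides SPSS and SAS programs, one of the summary-data tests it covers can be sketched briefly; the formula below is the standard large-sample z-test for comparing regression coefficients from independent samples (an assumption of standard practice, not code reproduced from the article).

```python
# z-test for the difference between two regression coefficients estimated in independent samples.
import numpy as np
from scipy import stats

def compare_independent_slopes(b1, se1, b2, se2):
    """Two-sided z-test of H0: beta1 == beta2 using only summary data (estimates and SEs)."""
    z = (b1 - b2) / np.sqrt(se1 ** 2 + se2 ** 2)
    return z, 2 * stats.norm.sf(abs(z))

z, p = compare_independent_slopes(0.42, 0.10, 0.15, 0.08)   # hypothetical summary values
print(f"z = {z:.2f}, p = {p:.3f}")
```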
NASA Astrophysics Data System (ADS)
Zhai, Mengting; Chen, Yan; Li, Jing; Zhou, Jun
2017-12-01
The molecular electronegativity distance vector (MEDV-13) was used to describe the molecular structure of benzyl ether diamidine derivatives in this paper. Based on MEDV-13, a three-parameter (M3, M15, M47) QSAR model of insecticidal activity (pIC50) for 60 benzyl ether diamidine derivatives was constructed by leaps-and-bounds regression (LBR). The traditional correlation coefficient (R) and the cross-validation correlation coefficient (RCV) were 0.975 and 0.971, respectively. The robustness of the regression model was validated by the jackknife method; the correlation coefficients R were between 0.971 and 0.983. Meanwhile, the independent variables in the model were tested and showed no autocorrelation. The regression results indicate that the model has good robustness and predictive capability. The research provides theoretical guidance for the development of a new generation of efficient, low-toxicity anti-African trypanosomiasis drugs.
Zheng, Qi; Peng, Limin
2016-01-01
Quantile regression provides a flexible platform for evaluating covariate effects on different segments of the conditional distribution of response. As the effects of covariates may change with quantile level, contemporaneously examining a spectrum of quantiles is expected to have a better capacity to identify variables with either partial or full effects on the response distribution, as compared to focusing on a single quantile. Under this motivation, we study a general adaptively weighted LASSO penalization strategy in the quantile regression setting, where a continuum of quantile index is considered and coefficients are allowed to vary with quantile index. We establish the oracle properties of the resulting estimator of coefficient function. Furthermore, we formally investigate a BIC-type uniform tuning parameter selector and show that it can ensure consistent model selection. Our numerical studies confirm the theoretical findings and illustrate an application of the new variable selection procedure. PMID:28008212
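A plain, unweighted analogue of this idea can be sketched with off-the-shelf L1-penalized quantile regression fitted over a grid of quantile levels; the adaptive weighting and the BIC-type uniform tuning studied in the paper are not reproduced here, and the data and penalty level are invented.

```python
# Sketch: lasso-penalized quantile regression across a grid of quantile levels,
# inspecting which coefficients are active at each level.
import numpy as np
from sklearn.linear_model import QuantileRegressor

rng = np.random.default_rng(1)
n, p = 400, 6
X = rng.normal(size=(n, p))
# x0 shifts the whole response distribution; x1 mainly affects its spread,
# so its effect shows up more strongly at upper quantiles.
y = 1.0 + 0.8 * X[:, 0] + rng.gamma(2.0, 1.0 + 0.5 * np.abs(X[:, 1]), size=n)

for tau in (0.25, 0.5, 0.75, 0.9):
    fit = QuantileRegressor(quantile=tau, alpha=0.05, solver="highs").fit(X, y)
    active = np.flatnonzero(np.abs(fit.coef_) > 1e-8)
    print(f"tau={tau:.2f}  coef={np.round(fit.coef_, 2)}  active={active}")
```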
Statistical Analysis for Multisite Trials Using Instrumental Variables with Random Coefficients
ERIC Educational Resources Information Center
Raudenbush, Stephen W.; Reardon, Sean F.; Nomi, Takako
2012-01-01
Multisite trials can clarify the average impact of a new program and the heterogeneity of impacts across sites. Unfortunately, in many applications, compliance with treatment assignment is imperfect. For these applications, we propose an instrumental variable (IV) model with person-specific and site-specific random coefficients. Site-specific IV…
Lenselink, Eelke B; Ten Dijke, Niels; Bongers, Brandon; Papadatos, George; van Vlijmen, Herman W T; Kowalczyk, Wojtek; IJzerman, Adriaan P; van Westen, Gerard J P
2017-08-14
The increase of publicly available bioactivity data in recent years has fueled and catalyzed research in chemogenomics, data mining, and modeling approaches. As a direct result, over the past few years a multitude of different methods have been reported and evaluated, such as target fishing, nearest neighbor similarity-based methods, and Quantitative Structure Activity Relationship (QSAR)-based protocols. However, such studies are typically conducted on different datasets, using different validation strategies, and different metrics. In this study, different methods were compared using one single standardized dataset obtained from ChEMBL, which is made available to the public, using standardized metrics (BEDROC and Matthews Correlation Coefficient). Specifically, the performance of Naïve Bayes, Random Forests, Support Vector Machines, Logistic Regression, and Deep Neural Networks was assessed using QSAR and proteochemometric (PCM) methods. All methods were validated using both a random split validation and a temporal validation, with the latter being a more realistic benchmark of expected prospective performance. Deep Neural Networks were the top-performing classifiers, highlighting their added value over more conventional methods. Moreover, the best method ('DNN_PCM') performed significantly better, at almost one standard deviation above the mean performance. Furthermore, multi-task and PCM implementations were shown to improve performance over single-task Deep Neural Networks, whereas target prediction performed almost two standard deviations below the mean performance. Random Forests, Support Vector Machines, and Logistic Regression performed around the mean. Finally, using an ensemble of DNNs, alongside additional tuning, enhanced the relative performance by another 27% (compared with the unoptimized 'DNN_PCM'). By providing the data and the protocols, this study offers a standardized set for testing and evaluating different machine learning algorithms in the context of multi-task learning.
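The evaluation protocol can be illustrated in miniature by scoring two common classifiers with the Matthews correlation coefficient under both a random split and a date-based split; the features and the 'year' field below are synthetic stand-ins, not the ChEMBL data or the authors' descriptors.

```python
# Sketch: comparing classifiers under a random split and a "temporal" split,
# scored with the Matthews correlation coefficient (MCC).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import matthews_corrcoef

X, y = make_classification(n_samples=2000, n_features=30, n_informative=10,
                           random_state=0)
# Assumed stand-in for a publication-date field used for temporal validation.
year = np.sort(np.random.default_rng(0).integers(2005, 2015, size=len(y)))

models = {"RF": RandomForestClassifier(n_estimators=200, random_state=0),
          "LR": LogisticRegression(max_iter=1000)}

Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)
cut = year < 2012                      # train on earlier records, test on later
splits = {"random": (Xtr, ytr, Xte, yte),
          "temporal": (X[cut], y[cut], X[~cut], y[~cut])}

for split, (xa, ya, xb, yb) in splits.items():
    for name, model in models.items():
        mcc = matthews_corrcoef(yb, model.fit(xa, ya).predict(xb))
        print(f"{split:8s} {name}: MCC = {mcc:.3f}")
```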
2014-01-01
Background Exposure measurement error is a concern in long-term PM2.5 health studies using ambient concentrations as exposures. We assessed error magnitude by estimating calibration coefficients as the association between personal PM2.5 exposures from validation studies and typically available surrogate exposures. Methods Daily personal and ambient PM2.5, and when available sulfate, measurements were compiled from nine cities, over 2 to 12 days. True exposure was defined as personal exposure to PM2.5 of ambient origin. Since PM2.5 of ambient origin could only be determined for five cities, personal exposure to total PM2.5 was also considered. Surrogate exposures were estimated as ambient PM2.5 at the nearest monitor or predicted outside subjects’ homes. We estimated calibration coefficients by regressing true on surrogate exposures in random effects models. Results When monthly-averaged personal PM2.5 of ambient origin was used as the true exposure, calibration coefficients equaled 0.31 (95% CI:0.14, 0.47) for nearest monitor and 0.54 (95% CI:0.42, 0.65) for outdoor home predictions. Between-city heterogeneity was not found for outdoor home PM2.5 for either true exposure. Heterogeneity was significant for nearest monitor PM2.5, for both true exposures, but not after adjusting for city-average motor vehicle number for total personal PM2.5. Conclusions Calibration coefficients were <1, consistent with previously reported chronic health risks using nearest monitor exposures being under-estimated when ambient concentrations are the exposure of interest. Calibration coefficients were closer to 1 for outdoor home predictions, likely reflecting less spatial error. Further research is needed to determine how our findings can be incorporated in future health studies. PMID:24410940
Yoneoka, Daisuke; Henmi, Masayuki
2017-06-01
Recently, the number of regression models has increased dramatically in several academic fields. However, within the context of meta-analysis, synthesis methods for such models have not developed at a commensurate pace. One of the difficulties hindering this development is the disparity in covariate sets among the models in the literature. If the sets of covariates differ across models, the interpretation of the coefficients differs as well, making it difficult to synthesize them. Moreover, previous synthesis methods for regression models, such as multivariate meta-analysis, often run into problems because the covariance matrix of the coefficients (i.e., the within-study correlations) or individual patient data are not necessarily available. This study therefore proposes a method to synthesize linear regression models fitted with different covariate sets by using a generalized least squares method involving bias correction terms. In particular, we also propose an approach to recover (at most) three correlations of covariates, which are required for calculating the bias term without individual patient data. Copyright © 2016 John Wiley & Sons, Ltd.
Population heterogeneity in the salience of multiple risk factors for adolescent delinquency.
Lanza, Stephanie T; Cooper, Brittany R; Bray, Bethany C
2014-03-01
To present mixture regression analysis as an alternative to more standard regression analysis for predicting adolescent delinquency. We demonstrate how mixture regression analysis allows for the identification of population subgroups defined by the salience of multiple risk factors. We identified population subgroups (i.e., latent classes) of individuals based on their coefficients in a regression model predicting adolescent delinquency from eight previously established risk indices drawn from the community, school, family, peer, and individual levels. The study included N = 37,763 10th-grade adolescents who participated in the Communities That Care Youth Survey. Standard, zero-inflated, and mixture Poisson and negative binomial regression models were considered. Standard and mixture negative binomial regression models were selected as optimal. The five-class regression model was interpreted based on the class-specific regression coefficients, indicating that risk factors had varying salience across classes of adolescents. Standard regression showed that all risk factors were significantly associated with delinquency. Mixture regression provided more nuanced information, suggesting a unique set of risk factors that were salient for different subgroups of adolescents. Implications for the design of subgroup-specific interventions are discussed. Copyright © 2014 Society for Adolescent Health and Medicine. Published by Elsevier Inc. All rights reserved.
Retro-regression--another important multivariate regression improvement.
Randić, M
2001-01-01
We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when in a stepwise regression a descriptor is included or excluded from a regression. The consequence is an unpredictable change of the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. This process typically causes at different steps of the stepwise regression a replacement of several previously used descriptors by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined showing how it resolves the ambiguities associated with both "nightmares" of the first and the second kind of MRA.
[From clinical judgment to the linear regression model].
Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O
2013-01-01
When we think about mathematical models, such as the linear regression model, we tend to assume that these terms are only used by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and has a normal distribution. Stated another way, regression is used to predict a measure based on the knowledge of at least one other variable. The first objective of linear regression is to determine the slope or inclination of the regression line: Y = a + bX, where "a" is the intercept or regression constant, equivalent to the value of "Y" when "X" equals 0, and "b" (also called the slope) indicates the increase or decrease in "Y" that occurs when the variable "X" increases or decreases by one unit. In the regression line, "b" is called the regression coefficient. The coefficient of determination (R²) indicates the importance of the independent variables in the outcome.
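A minimal numerical illustration of these quantities, on invented data rather than any clinical dataset, is sketched below.

```python
# Minimal illustration of Y = a + bX: fit, then read off the intercept a,
# the regression coefficient b, and the coefficient of determination R².
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
x = rng.uniform(20, 70, 100)               # e.g., age in years (made up)
y = 80 + 0.6 * x + rng.normal(0, 5, 100)   # e.g., some quantitative outcome

fit = sm.OLS(y, sm.add_constant(x)).fit()
a, b = fit.params                          # intercept and slope
print(f"Y = {a:.1f} + {b:.2f}*X,  R² = {fit.rsquared:.2f}")
# b is the expected change in Y per one-unit increase in X;
# a is the predicted Y when X = 0.
```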
Interpretation of commonly used statistical regression models.
Kasza, Jessica; Wolfe, Rory
2014-01-01
A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.
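As a small illustration of coefficient interpretation for one of these models, the sketch below fits a logistic regression on invented data and converts the coefficient to an odds ratio; it is generic usage, not the respiratory study's analysis.

```python
# Sketch: the logistic regression coefficient is a log odds ratio, so
# exp(coef) is the odds ratio per one-unit increase in the predictor.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
exposure = rng.normal(size=500)
logit_p = -0.5 + 0.7 * exposure
outcome = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

fit = sm.Logit(outcome, sm.add_constant(exposure)).fit(disp=False)
coef = fit.params[1]
ci_low, ci_high = fit.conf_int()[1]
print(f"log-odds coefficient = {coef:.2f}")
print(f"odds ratio = {np.exp(coef):.2f} "
      f"(95% CI {np.exp(ci_low):.2f}-{np.exp(ci_high):.2f})")
```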
Zhao, Zeng-hui; Wang, Wei-ming; Gao, Xin; Yan, Ji-xing
2013-01-01
According to the geological characteristics of the Xinjiang Ili mine in western China, a physical model of interstratified strata composed of soft rock and a hard coal seam was established. Selecting the tunnel position, deformation modulus, and strength parameters of each layer as influencing factors, the sensitivity coefficient of roadway deformation to each parameter was first analyzed based on a Mohr-Coulomb strain-softening model and nonlinear elastic-plastic finite element analysis. The effects of the factors that showed high sensitivity were then discussed further. Finally, a regression model for the relationship between roadway displacements and multiple factors was obtained by equivalent linear regression under multiple factors. The results show that roadway deformation is highly sensitive to the depth of the coal seam under the floor, which should be considered in the layout of the coal roadway; the deformation modulus and strength of the coal seam and floor have a great influence on the global stability of the tunnel; by contrast, roadway deformation is not sensitive to the mechanical parameters of the soft roof; and roadway deformation under random combinations of multiple factors can be deduced from the regression model. These conclusions provide theoretical guidance for the layout and stability maintenance of coal roadways. PMID:24459447
Mills, Britain A.; Caetano, Raul; Bernstein, Ira H.
2011-01-01
This study compares the demographic predictors of items assessing attitudes towards drinking across Hispanic national groups. Data were from the 2006 Hispanic Americans Baseline Alcohol Survey (HABLAS), which used a multistage cluster sample design to interview 5,224 individuals randomly selected from the household population in Miami, New York, Philadelphia, Houston, and Los Angeles. Predictive invariance of demographic predictors of alcohol attitudes over four Hispanic national groups (Puerto Rican, Cuban, Mexican, and South/Central Americans) was examined using multiple-group seemingly unrelated probit regression. The analyses examined whether the influence of various demographic predictors varied across the Hispanic national groups in their regression coefficients, item intercepts, and error correlations. The hypothesis of predictive invariance was supported. Hispanic groups did not differ in how demographic predictors related to individual attitudinal items (regression slopes were invariant). In addition, the groups did not differ in attitudinal endorsement rates once demographic covariates were taken into account (item intercepts were invariant). Although Hispanic groups have different attitudes about alcohol, the influence of multiple demographic characteristics on alcohol attitudes operates similarly across Hispanic groups. Future models of drinking behavior in adult Hispanics need not posit moderating effects of group on the relation between these background characteristics and attitudes. PMID:25379120
A SEMIPARAMETRIC BAYESIAN MODEL FOR CIRCULAR-LINEAR REGRESSION
We present a Bayesian approach to regress a circular variable on a linear predictor. The regression coefficients are assumed to have a nonparametric distribution with a Dirichlet process prior. The semiparametric Bayesian approach gives added flexibility to the model and is useful…
Wong, Ken; Smalarz, Amy; Wu, Ning; Boulanger, Luke; Wogen, Jenifer
2011-01-01
Care management processes (CMP) may be implemented in health systems to improve chronic disease quality of care. The objective of this study was to assess the relationship between the presence of hypertension-specific CMP and blood pressure (BP) control among hypertensive patients within selected physician organizations in the USA. A modified version of the Physician Practice Connection Readiness Survey (PPC-RS), developed by the National Committee for Quality Assurance (NCQA), was administered to chief medical officers at 28 US-based physician organizations in 2010. Hypertension-specific survey items were added to the PPC-RS and focused on medication fill compliance, chronic disease management, and patient self-management. Demographic and clinical cross-sectional data from a random sample of 300 hypertensive patients aged 18 years or older were collected at each site. Physician site and patient characteristics were reported. Regression models were used to assess the relationship between hypertension-specific physician practices and patient BP control. Eligible patients had at least a 1-year history of care with the physician organization and had an encounter within the year preceding data collection. Of the 28 participating sites, most had electronic medical records with full functionality (71.4%) and had more than 50 staff members (78.6%). Across all sites, approximately 61% of patients had controlled BP. Regression analyses found that practices that used physician education as an effort to improve medication fill compliance demonstrated improvement in BP control (change in systolic BP: beta coefficient = -1.366, P = .034; change in diastolic BP: beta coefficient = -0.859, P = .056). The use of a systematic process to screen or assess patients for hypertension as a risk factor was also associated with improvements in BP control (change in diastolic BP: beta coefficient = -0.860, P = .006). In addition, physician practices that maintained a list of hypertensive patients along with the patients' associated clinical data demonstrated better BP control (currently controlled BP: beta coefficient = 0.282, P = .034; currently uncontrolled BP: beta coefficient = -0.292, P = .023). However, use of the following practices had a negative correlation with BP control: case management (change in systolic BP: beta coefficient = 1.649, P = .022; change in diastolic BP: beta coefficient = 0.910, P = .078), follow-up for missed appointments (change in systolic BP: beta coefficient = 0.937, P = .041; change in diastolic BP: beta coefficient = 0.165, P = .627), adoption of written evidence-based standards of care to treat hypertension (change in systolic BP: beta coefficient = 0.985, P = .032; change in diastolic BP: beta coefficient = 0.346, P = .305), and checklists for tests and interventions (change in systolic BP: beta coefficient = 1.586, P = .004; change in diastolic BP: beta coefficient = 0.938, P = .019). Findings from this multisite study provide evidence that the presence of some hypertension-specific CMP in physician organizations may be associated with better BP outcomes among hypertensive patients. In particular, patients may benefit from physician efforts to improve medication fill compliance as well as organizational monitoring of hypertensive patients and their clinical data. Further research is warranted to better assess the relationship between CMP and treatment of chronic diseases such as hypertension over time. Copyright © 2011 American Society of Hypertension. Published by Elsevier Inc. All rights reserved.
Yao, Xin; Niu, Yandong; Li, Youzhi; Zou, Dongsheng; Ding, Xiaohui; Bian, Hualin
2018-05-09
Bioaccumulation of five heavy metals (Cd, Cu, Mn, Pb, and Zn) in six organs (panicle, leaf, stem, root, rhizome, and bud) of the emergent, perennial plant species Miscanthus sacchariflorus was investigated to estimate the plant's potential for accumulating heavy metals in the wetlands of Dongting Lake. We found the highest Cd concentrations in the panicles and leaves, the highest Cu and Mn in the roots, the highest Pb in the panicles, and the highest Zn in the panicles and buds. In contrast, the lowest Cd concentrations were detected in the stems, roots, and buds; the lowest Cu concentrations in the leaves and stems; the lowest Mn concentrations in the panicles, rhizomes, and buds; the lowest Pb concentrations in the stems; and the lowest Zn concentrations in the leaves, stems, and rhizomes. Mean Cu concentration in the plant showed a positive regression coefficient with plot elevation, soil organic matter content, and soil Cu concentration, whereas it showed a negative regression coefficient with soil moisture and electrolyte leakage. Mean Mn concentration showed positive and negative regression coefficients with soil organic matter and soil moisture, respectively. Mean Pb concentration exhibited a positive regression coefficient with plot elevation and soil total P concentration, and Zn concentration showed a positive regression coefficient with soil available P and total P concentrations. However, there was no significant regression coefficient between mean Cd concentration in the plant and the investigated environmental parameters. Stems and roots were the main organs involved in heavy metal accumulation from the environment. The mean quantities of heavy metals accumulated in the plant tissues were 2.2 mg Cd, 86.7 mg Cu, 290.3 mg Mn, 15.9 mg Pb, and 307 mg Zn per square meter. In the Dongting Lake wetlands, 0.7 × 10³ kg Cd, 22.9 × 10³ kg Cu, 77.5 × 10³ kg Mn, 3.1 × 10³ kg Pb, and 95.9 × 10³ kg Zn per year were accumulated by aboveground organs and removed from the lake through harvesting for paper manufacture.
Direct Breakthrough Curve Prediction From Statistics of Heterogeneous Conductivity Fields
NASA Astrophysics Data System (ADS)
Hansen, Scott K.; Haslauer, Claus P.; Cirpka, Olaf A.; Vesselinov, Velimir V.
2018-01-01
This paper presents a methodology to predict the shape of solute breakthrough curves in heterogeneous aquifers at early times and/or under high degrees of heterogeneity, both cases in which the classical macrodispersion theory may not be applicable. The methodology relies on the observation that breakthrough curves in heterogeneous media are generally well described by lognormal distributions, and mean breakthrough times can be predicted analytically. The log-variance of solute arrival is thus sufficient to completely specify the breakthrough curves, and this is calibrated as a function of aquifer heterogeneity and dimensionless distance from a source plane by means of Monte Carlo analysis and statistical regression. Using the ensemble of simulated groundwater flow and solute transport realizations employed to calibrate the predictive regression, reliability estimates for the prediction are also developed. Additional theoretical contributions include heuristics for the time until an effective macrodispersion coefficient becomes applicable, and also an expression for its magnitude that applies in highly heterogeneous systems. It is seen that the results here represent a way to derive continuous time random walk transition distributions from physical considerations rather than from empirical field calibration.
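A minimal sketch of the resulting prediction step is given below; the mean arrival time and log-variance are placeholders, not values from the paper's analytical prediction or calibrated regression.

```python
# Sketch: once the mean arrival time and the log-variance of arrival are known,
# a lognormal breakthrough curve is fully specified.
import numpy as np
from scipy import stats

mean_arrival = 120.0   # e.g., days (placeholder for the analytical prediction)
log_variance = 0.35    # variance of log arrival time (placeholder calibration)

sigma = np.sqrt(log_variance)
mu = np.log(mean_arrival) - 0.5 * log_variance   # ensures E[T] = mean_arrival
btc = stats.lognorm(s=sigma, scale=np.exp(mu))   # scipy's parameterization

t = np.linspace(1, 500, 5)
print("arrival-time density at t :", np.round(btc.pdf(t), 5))
print("cumulative breakthrough   :", np.round(btc.cdf(t), 4))
```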
Guertler, Diana; Vandelanotte, Corneel; Short, Camille; Alley, Stephanie; Schoeppe, Stephanie; Duncan, Mitch J.
2015-01-01
Objective: This study aims to examine the relationship of lifestyle behaviors (physical activity, work and non-work sitting time, sleep quality, and sleep duration) with presenteeism while controlling for sociodemographics, work- and health-related variables. Methods: Data were collected from 710 workers (aged 20 to 76 years; 47.9% women) from randomly selected Australian adults who completed an online survey. Linear regression was used to examine the relationship between lifestyle behaviors and presenteeism. Results: Poorer sleep quality (standardized regression coefficients [B] = 0.112; P < 0.05), suboptimal duration (B = 0.081; P < 0.05), and lower work sitting time (B = −0.086; P < 0.05) were significantly associated with higher presenteeism when controlling for all lifestyle behaviors. Engaging in three risky lifestyle behaviors was associated with higher presenteeism (B = 0.150; P < 0.01) compared with engaging in none or one. Conclusions: The results of this study highlight the importance of sleep behaviors for presenteeism and call for behavioral interventions that simultaneously address sleep in conjunction with other activity-related behaviors. PMID:25742538
Singer, Donald A.; Menzie, W.D.; Cheng, Qiuming; Bonham-Carter, G. F.
2005-01-01
Estimating numbers of undiscovered mineral deposits is a fundamental part of assessing mineral resources. Some statistical tools can act as guides to low variance, unbiased estimates of the number of deposits. The primary guide is that the estimates must be consistent with the grade and tonnage models. Another statistical guide is the deposit density (i.e., the number of deposits per unit area of permissive rock in well-explored control areas). Preliminary estimates and confidence limits of the number of undiscovered deposits in a tract of given area may be calculated using linear regression and refined using frequency distributions with appropriate parameters. A Poisson distribution leads to estimates having lower relative variances than the regression estimates and implies a random distribution of deposits. Coefficients of variation are used to compare uncertainties of negative binomial, Poisson, or MARK3 empirical distributions that have the same expected number of deposits as the deposit density. Statistical guides presented here allow simple yet robust estimation of the number of undiscovered deposits in permissive terranes.
NASA Astrophysics Data System (ADS)
Hammud, Hassan H.; Ghannoum, Amer; Masoud, Mamdouh S.
2006-02-01
Sixteen Schiff bases obtained from the condensation of benzaldehyde or salicylaldehyde with various amines (aniline, 4-carboxyaniline, phenylhydrazine, 2,4-dinitrophenylhydrazine, ethylenediamine, hydrazine, o-phenylenediamine and 2,6-pyridinediamine) are studied with UV-vis spectroscopy to observe the effect of solvents, substituents and other structural factors on the spectra. The bands involving different electronic transitions are interpreted. Computerized analysis and multiple regression techniques were applied to calculate the regression and correlation coefficients based on the equation that relates peak position λmax to the solvent parameters that depend on the H-bonding ability, refractive index and dielectric constant of solvents.
ERIC Educational Resources Information Center
Coskuntuncel, Orkun
2013-01-01
The purpose of this study is two-fold; the first aim being to show the effect of outliers on the widely used least squares regression estimator in social sciences. The second aim is to compare the classical method of least squares with the robust M-estimator using the "determination of coefficient" (R[superscript 2]). For this purpose,…
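A rough sketch of that comparison on synthetic data with a few outliers (the cited study compares the fits via the coefficient of determination) might look like the following; it is not the study's data or software.

```python
# Sketch: ordinary least squares versus a robust M-estimator (Huber weights)
# on data containing a handful of gross outliers.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
x = rng.uniform(0, 10, 60)
y = 2.0 + 1.5 * x + rng.normal(0, 1, 60)
y[:4] += 25                       # a few gross outliers

X = sm.add_constant(x)
ols = sm.OLS(y, X).fit()
rlm = sm.RLM(y, X, M=sm.robust.norms.HuberT()).fit()

print("OLS slope     :", round(ols.params[1], 2))
print("Huber M slope :", round(rlm.params[1], 2))
```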
Anantha M. Prasad; Louis R. Iverson; Andy Liaw; Andy Liaw
2006-01-01
We evaluated four statistical models - Regression Tree Analysis (RTA), Bagging Trees (BT), Random Forests (RF), and Multivariate Adaptive Regression Splines (MARS) - for predictive vegetation mapping under current and future climate scenarios according to the Canadian Climate Centre global circulation model.
Multicollinearity and Regression Analysis
NASA Astrophysics Data System (ADS)
Daoud, Jamal I.
2017-12-01
In regression analysis, correlation between the response and the predictor(s) is expected, but correlation among the predictors themselves is undesirable. The number of predictors included in the regression model depends on many factors, among them historical data and experience; in the end, the selection of the most important predictors is left to the researcher. Multicollinearity is a phenomenon in which two or more predictors are correlated; when this happens, the standard errors of the coefficients increase [8]. Increased standard errors mean that the coefficients for some or all independent variables may be found not to be significantly different from zero. In other words, by overinflating the standard errors, multicollinearity makes some variables statistically insignificant when they should be significant. In this paper we focus on multicollinearity, its causes, and its consequences for the reliability of the regression model.
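A small sketch of how this inflation shows up in practice, using variance inflation factors on a deliberately collinear design (synthetic data), is:

```python
# Sketch: detecting multicollinearity with variance inflation factors (VIF).
# x2 is built as a near-copy of x1, so the VIFs of x1 and x2 blow up.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(5)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.05, size=200)   # almost collinear with x1
x3 = rng.normal(size=200)
X = sm.add_constant(pd.DataFrame({"x1": x1, "x2": x2, "x3": x3}))

for i, name in enumerate(X.columns):
    print(f"VIF({name}) = {variance_inflation_factor(X.values, i):.1f}")
```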
QSAR modeling of flotation collectors using principal components extracted from topological indices.
Natarajan, R; Nirdosh, Inderjit; Basak, Subhash C; Mills, Denise R
2002-01-01
Several topological indices were calculated for substituted cupferrons that were tested as collectors for the froth flotation of uranium. Principal component analysis (PCA) was used for data reduction. Seven principal components (PCs) were found to account for 98.6% of the variance among the computed indices. The principal components thus extracted were used in stepwise regression analyses to construct regression models for the prediction of the separation efficiencies (Es) of the collectors. A two-parameter model with a correlation coefficient of 0.889 and a three-parameter model with a correlation coefficient of 0.913 were formed. PCs were found to be better than the partition coefficient for forming regression equations, and inclusion of an electronic parameter such as the Hammett sigma or quantum mechanically derived electronic charges on the chelating atoms did not improve the correlation coefficient significantly. The method was extended to model the separation efficiencies of mercaptobenzothiazoles (MBT) and aminothiophenols (ATP) used in the flotation of lead and zinc ores, respectively. Five principal components were found to explain 99% of the data variability in each series. A three-parameter equation with a correlation coefficient of 0.985 and a two-parameter equation with a correlation coefficient of 0.926 were obtained for MBT and ATP, respectively. The amenability of the separation efficiencies of chelating collectors to QSAR modeling using PCs based on topological indices might guide the selection of collectors for synthesis and testing from a virtual database.
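The general PCA-then-regression workflow can be sketched as follows; the descriptors and responses are random placeholders, not the flotation data, and the stepwise selection is replaced by simply taking the leading components.

```python
# Sketch: reduce a block of correlated descriptors with PCA, then regress the
# response on the leading principal components.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(6)
descriptors = rng.normal(size=(30, 20))   # 30 collectors, 20 topological indices
response = rng.normal(size=30)            # separation efficiency Es (placeholder)

Z = StandardScaler().fit_transform(descriptors)
pca = PCA(n_components=7).fit(Z)
scores = pca.transform(Z)
print("variance explained by 7 PCs:", pca.explained_variance_ratio_.sum().round(3))

model = LinearRegression().fit(scores[:, :3], response)    # three-PC model
r = np.corrcoef(model.predict(scores[:, :3]), response)[0, 1]
print("correlation coefficient of 3-PC model:", round(r, 3))
```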
Cortés-Castell, Ernesto; Juste, Mercedes; Palazón-Bru, Antonio; Monge, Laura; Sánchez-Ferrer, Francisco; Rizo-Baeza, María Mercedes
2017-01-01
Dual-energy X-ray absorptiometry (DXA) provides separate measurements of fat mass, fat-free mass and bone mass, and is a quick, accurate, and safe technique, yet one that is not readily available in routine clinical practice. Consequently, we aimed to develop statistical formulas to predict fat mass (%) and fat mass index (FMI) with simple parameters (age, sex, weight and height). We conducted a retrospective observational cross-sectional study in 416 overweight or obese patients aged 4-18 years that involved assessing adiposity by DXA (fat mass percentage and FMI), body mass index (BMI), sex and age. We randomly divided the sample into two parts (construction and validation). In the construction sample, we developed formulas to predict fat mass and FMI using linear multiple regression models. The formulas were validated in the other sample, calculating the intraclass correlation coefficient via bootstrapping. The fat mass percentage formula had a coefficient of determination of 0.65. This value was 0.86 for FMI. In the validation, the constructed formulas had an intraclass correlation coefficient of 0.77 for fat mass percentage and 0.92 for FMI. Our predictive formulas accurately predicted fat mass and FMI with simple parameters (BMI, sex and age) in children with overweight and obesity. The proposed methodology could be applied in other fields. Further studies are needed to externally validate these formulas.
Parametric regression model for survival data: Weibull regression model as an example
2016-01-01
The Weibull regression model is one of the most popular parametric regression models, in that it provides an estimate of the baseline hazard function as well as coefficients for covariates. Because of technical difficulties, the Weibull regression model is seldom used in the medical literature compared with the semi-parametric proportional hazards model. To make clinical investigators familiar with the Weibull regression model, this article introduces some basic knowledge about the model and then illustrates how to fit it with the R software. The SurvRegCensCov package is useful for converting estimated coefficients to clinically relevant statistics such as the hazard ratio (HR) and event time ratio (ETR). Model adequacy can be assessed by inspecting Kaplan-Meier curves stratified by a categorical variable. The eha package provides an alternative way to fit the Weibull regression model, and the check.dist() function helps to assess the goodness of fit of the model. Variable selection is based on the importance of a covariate, which can be tested using the anova() function; alternatively, backward elimination starting from a full model is an efficient way to develop the model. Visualizing the Weibull regression model after model development is also worthwhile, as it provides another way to report the findings. PMID:28149846
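For readers working in Python rather than R, a rough analogue of the same model is sketched below, assuming the lifelines package is available; it is not a substitute for the R workflow described in the article, and it uses a demo dataset shipped with lifelines rather than clinical data.

```python
# Sketch: a Weibull accelerated-failure-time regression fitted with lifelines.
import numpy as np
from lifelines import WeibullAFTFitter
from lifelines.datasets import load_rossi

df = load_rossi()                      # recidivism demo data shipped with lifelines
aft = WeibullAFTFitter()
aft.fit(df, duration_col="week", event_col="arrest")
aft.print_summary()                    # coefficients on the log-time scale

# Coefficients are accelerated failure time effects: exp(coef) multiplies the
# expected event time, which is one route to an event time ratio (ETR).
print(np.exp(aft.params_))
```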
Strand, Matthew; Sillau, Stefan; Grunwald, Gary K; Rabinovitch, Nathan
2014-02-10
Regression calibration provides a way to obtain unbiased estimators of fixed effects in regression models when one or more predictors are measured with error. Recent development of measurement error methods has focused on models that include interaction terms between measured-with-error predictors, and separately, methods for estimation in models that account for correlated data. In this work, we derive explicit and novel forms of regression calibration estimators and associated asymptotic variances for longitudinal models that include interaction terms, when data from instrumental and unbiased surrogate variables are available but not the actual predictors of interest. The longitudinal data are fit using linear mixed models that contain random intercepts and account for serial correlation and unequally spaced observations. The motivating application involves a longitudinal study of exposure to two pollutants (predictors) - outdoor fine particulate matter and cigarette smoke - and their association in interactive form with levels of a biomarker of inflammation, leukotriene E4 (LTE4, outcome) in asthmatic children. Because the exposure concentrations could not be directly observed, we used measurements from a fixed outdoor monitor and urinary cotinine concentrations as instrumental variables, and we used concentrations of fine ambient particulate matter and cigarette smoke measured with error by personal monitors as unbiased surrogate variables. We applied the derived regression calibration methods to estimate coefficients of the unobserved predictors and their interaction, allowing for direct comparison of toxicity of the different pollutants. We used simulations to verify accuracy of inferential methods based on asymptotic theory. Copyright © 2013 John Wiley & Sons, Ltd.
Standardized Regression Coefficients as Indices of Effect Sizes in Meta-Analysis
ERIC Educational Resources Information Center
Kim, Rae Seon
2011-01-01
When conducting a meta-analysis, it is common to find many collected studies that report regression analyses, because multiple regression analysis is widely used in many fields. Meta-analysis uses effect sizes drawn from individual studies as a means of synthesizing a collection of results. However, indices of effect size from regression analyses…
NASA Astrophysics Data System (ADS)
Suciu, N.; Vamos, C.; Vereecken, H.; Vanderborght, J.; Hardelauf, H.
2003-04-01
When the small-scale transport is modeled by a Wiener process and the large-scale heterogeneity by a random velocity field, the effective coefficients, Deff, can be decomposed into the local coefficient, D, a contribution of the random advection, Dadv, and a contribution of the randomness of the trajectory of the plume center of mass, Dcm: Deff = D + Dadv - Dcm. The coefficient Dadv is similar to that introduced by Taylor in 1921, and more recent works associate it with thermodynamic equilibrium. The "ergodic hypothesis" says that over large time intervals Dcm vanishes and the effect of the heterogeneity is described by Dadv = Deff - D. In this work we investigate numerically the long-time behavior of the effective coefficients as well as the validity of the ergodic hypothesis. The transport in every realization of the velocity field is modeled with the Global Random Walk algorithm, which is able to track as many particles as necessary to achieve a statistically reliable simulation of the process. Averages over realizations are further used to estimate mean coefficients and standard deviations. In order to remain within the framework of most theoretical approaches, the velocity field was generated in a linear approximation, and the logarithm of the hydraulic conductivity was taken to have an exponentially decaying correlation with variance equal to 0.1. Our results show that even under these idealized conditions the effective coefficients tend to asymptotically constant values only when the plume has traveled thousands of correlation lengths (while first-order theories usually predict Fickian behavior after tens of correlation lengths) and that the ergodicity conditions are still far from being met.
The solar wind effect on cosmic rays and solar activity
NASA Technical Reports Server (NTRS)
Fujimoto, K.; Kojima, H.; Murakami, K.
1985-01-01
The relation of cosmic ray intensity to solar wind velocity is investigated using neutron monitor data from Kiel and Deep River. The analysis shows that the regression coefficient of the average intensity for a time interval on the corresponding average velocity is negative, and that its absolute value increases monotonically with the averaging interval tau, from -0.5% per 100 km/s for tau = 1 day to -1.1% per 100 km/s for tau = 27 days. Beyond tau = 27 days the coefficient becomes almost constant, independent of the value of tau. The analysis also shows that this tau-dependence of the regression coefficient varies with the solar activity.
NASA Astrophysics Data System (ADS)
Wang, Gang-Jin; Xie, Chi; Chen, Shou; Yang, Jiao-Jiao; Yang, Ming-Yan
2013-09-01
In this study, we first build two empirical cross-correlation matrices for the US stock market by two different methods, namely the Pearson correlation coefficient and the detrended cross-correlation coefficient (DCCA coefficient). Then, combining the two matrices with the method of random matrix theory (RMT), we investigate the statistical properties of cross-correlations in the US stock market. We choose the daily closing prices of 462 constituent stocks of the S&P 500 index as the research objects and select the sample data from January 3, 2005 to August 31, 2012. In the empirical analysis, we examine the statistical properties of the cross-correlation coefficients, the distribution of eigenvalues, the distribution of eigenvector components, and the inverse participation ratio. From the two methods, we find some new results on cross-correlations in the US stock market that differ from the conclusions reached by previous studies. The empirical cross-correlation matrices constructed by the DCCA coefficient show several interesting properties at different time scales, which are useful for risk management and optimal portfolio selection, especially for diversifying the asset portfolio. Finding the theoretical eigenvalue distribution of a completely random matrix R for the DCCA coefficient remains an interesting and meaningful problem, because it does not obey the Marčenko-Pastur distribution.
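A minimal sketch of the Pearson-matrix side of this analysis, on simulated returns rather than the S&P 500 data, compares the empirical eigenvalues with the Marčenko-Pastur bulk; no closed-form counterpart is used for the DCCA matrix, which is exactly the open question noted above.

```python
# Sketch: eigenvalue spectrum of an empirical Pearson correlation matrix
# compared against the Marchenko-Pastur bounds for purely random data.
import numpy as np

rng = np.random.default_rng(7)
N, T = 100, 1000                          # stocks, trading days (simulated)
returns = rng.normal(size=(T, N))
C = np.corrcoef(returns, rowvar=False)
eigvals = np.linalg.eigvalsh(C)

Q = T / N
lam_min = (1 - np.sqrt(1 / Q)) ** 2
lam_max = (1 + np.sqrt(1 / Q)) ** 2
outside = np.sum((eigvals < lam_min) | (eigvals > lam_max))
print(f"MP bounds: [{lam_min:.3f}, {lam_max:.3f}]")
print(f"eigenvalues outside the random bulk: {outside} of {N}")
```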
Measuring multivariate association and beyond
Josse, Julie; Holmes, Susan
2017-01-01
Simple correlation coefficients between two variables have been generalized in many ways to measure association between two matrices. Coefficients such as the RV coefficient, the distance covariance (dCov) coefficient, and kernel-based coefficients are being used by different research communities. Scientists use these coefficients to test whether two random vectors are linked. Once such association has been ascertained through testing, a next step, often ignored, is to explore and uncover the association's underlying patterns. This article provides a survey of various measures of dependence between random vectors and tests of independence, and emphasizes the connections and differences between the various approaches. After providing definitions of the coefficients and associated tests, we present recent improvements that enhance their statistical properties and ease of interpretation. We summarize multi-table approaches and provide scenarios where the indices can provide useful summaries of heterogeneous multi-block data. We illustrate these different strategies on several examples of real data and suggest directions for future research. PMID:29081877
Kaitaniemi, Pekka
2008-04-09
Allometric equations are widely used in many branches of biological science. The potential information content of the normalization constant b in allometric equations of the form Y = bX^a has, however, remained largely neglected. To demonstrate the potential for utilizing this information, I generated a large number of artificial datasets that resembled those frequently encountered in biological studies, i.e., relatively small samples including measurement error or uncontrolled variation. The value of X was allowed to vary randomly within limits describing different data ranges, and a was set to a fixed theoretical value. The constant b was set to a range of values describing the effect of a continuous environmental variable. In addition, a normally distributed random error was added to the values of both X and Y. Two different approaches were then used to model the data. The traditional approach estimated both a and b using a regression model, whereas an alternative approach set the exponent a at its theoretical value and only estimated the value of b. Both approaches produced virtually the same model fit, with less than 0.3% difference in the coefficient of determination. Only the alternative approach was able to precisely reproduce the effect of the environmental variable, which was largely lost among noise variation when using the traditional approach. The results show how the value of b can be used as a source of valuable biological information if an appropriate regression model is selected.
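The two modeling approaches can be sketched on synthetic data as follows, with a = 0.75 standing in for the theoretical exponent; the numbers are illustrative, not the article's simulation settings.

```python
# Sketch: estimating the normalization constant b in Y = b * X**a, either by
# fitting both a and b on the log-log scale or by fixing a at a theoretical
# value and estimating only b.
import numpy as np

rng = np.random.default_rng(8)
a_theory = 0.75
X = rng.uniform(1, 50, 40)
b_true = 2.0 * np.exp(rng.normal(0, 0.05, 40))      # environment-driven b
Y = b_true * X**a_theory * np.exp(rng.normal(0, 0.1, 40))

# Traditional approach: estimate both a and b by OLS on log-log data.
slope, intercept = np.polyfit(np.log(X), np.log(Y), 1)
print("free fit : a =", round(slope, 3), " b =", round(np.exp(intercept), 3))

# Alternative approach: fix a at its theoretical value, estimate only b.
b_hat = np.exp(np.mean(np.log(Y) - a_theory * np.log(X)))
print("fixed a  : b =", round(b_hat, 3))
```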
Kim, Sungduk; Chen, Ming-Hui; Ibrahim, Joseph G.; Shah, Arvind K.; Lin, Jianxin
2013-01-01
In this paper, we propose a class of Box-Cox transformation regression models with multidimensional random effects for analyzing multivariate responses for individual patient data (IPD) in meta-analysis. Our modeling formulation uses a multivariate normal response meta-analysis model with multivariate random effects, in which each response is allowed to have its own Box-Cox transformation. Prior distributions are specified for the Box-Cox transformation parameters as well as the regression coefficients in this complex model, and the Deviance Information Criterion (DIC) is used to select the best transformation model. Since the model is quite complex, a novel Monte Carlo Markov chain (MCMC) sampling scheme is developed to sample from the joint posterior of the parameters. This model is motivated by a very rich dataset comprising 26 clinical trials involving cholesterol lowering drugs where the goal is to jointly model the three dimensional response consisting of Low Density Lipoprotein Cholesterol (LDL-C), High Density Lipoprotein Cholesterol (HDL-C), and Triglycerides (TG) (LDL-C, HDL-C, TG). Since the joint distribution of (LDL-C, HDL-C, TG) is not multivariate normal and in fact quite skewed, a Box-Cox transformation is needed to achieve normality. In the clinical literature, these three variables are usually analyzed univariately: however, a multivariate approach would be more appropriate since these variables are correlated with each other. A detailed analysis of these data is carried out using the proposed methodology. PMID:23580436
Kim, Sungduk; Chen, Ming-Hui; Ibrahim, Joseph G; Shah, Arvind K; Lin, Jianxin
2013-10-15
In this paper, we propose a class of Box-Cox transformation regression models with multidimensional random effects for analyzing multivariate responses for individual patient data in meta-analysis. Our modeling formulation uses a multivariate normal response meta-analysis model with multivariate random effects, in which each response is allowed to have its own Box-Cox transformation. Prior distributions are specified for the Box-Cox transformation parameters as well as the regression coefficients in this complex model, and the deviance information criterion is used to select the best transformation model. Because the model is quite complex, we develop a novel Monte Carlo Markov chain sampling scheme to sample from the joint posterior of the parameters. This model is motivated by a very rich dataset comprising 26 clinical trials involving cholesterol-lowering drugs where the goal is to jointly model the three-dimensional response consisting of low density lipoprotein cholesterol (LDL-C), high density lipoprotein cholesterol (HDL-C), and triglycerides (TG) (LDL-C, HDL-C, TG). Because the joint distribution of (LDL-C, HDL-C, TG) is not multivariate normal and in fact quite skewed, a Box-Cox transformation is needed to achieve normality. In the clinical literature, these three variables are usually analyzed univariately; however, a multivariate approach would be more appropriate because these variables are correlated with each other. We carry out a detailed analysis of these data by using the proposed methodology. Copyright © 2013 John Wiley & Sons, Ltd.
Aarestrup, Cecilie; Bonnesen, Camilla T; Thygesen, Lau C; Krarup, Anne F; Waagstein, Anne B; Jensen, Poul D; Bentzen, Joan
2014-02-01
To examine the effect of an educational intervention on sunbed use and intentions and attitudes toward sunbed use in 14- to 18-year-olds at continuation schools. We randomized 33 continuation schools either to receive the educational intervention (n = 16) or to be controls (n = 17). Intervention schools received an e-magazine addressing the health risks of sunbed use. Information on behavior and intentions and attitudes toward sunbed use was gathered through self-administered questionnaires before the intervention and at 6 months as a follow-up. The effect of the intervention was examined by multilevel linear regression and logistic regression. Sunbed use was significantly lower at follow-up among pupils at intervention schools versus pupils at control schools (girls: odds ratio .60, 95% confidence interval .42-.86; boys: odds ratio .58, 95% confidence interval .35-.96). The intervention had no effect on intention to use sunbeds or attitudes toward sunbed use. The analyses revealed a significant impact of school on attitudes toward sunbeds; the intraclass correlation coefficient was estimated to be 6.0% and 7.8% for girls and boys, respectively. The findings from the present study provide new evidence of a positive effect of an educational intervention on sunbed use among pupils aged 14-18 years at continuation schools. Copyright © 2014 Society for Adolescent Health and Medicine. Published by Elsevier Inc. All rights reserved.
Císař, Petr; Labbé, Laurent; Souček, Pavel; Pelissier, Pablo; Kerneis, Thierry
2018-01-01
The main aim of this study was to develop a new objective method for evaluating the impacts of different diets on live fish skin using image-based features. In total, one hundred and sixty rainbow trout (Oncorhynchus mykiss) were fed either a fish-meal based diet (80 fish) or a 100% plant-based diet (80 fish) and photographed using a consumer-grade digital camera. Twenty-three colour features and four texture features were extracted. Four different classification methods were used to evaluate fish diets: Random forest (RF), Support vector machine (SVM), Logistic regression (LR) and k-Nearest neighbours (k-NN). The SVM with radial basis kernel provided the best classifier, with a correct classification rate (CCR) of 82% and a Kappa coefficient of 0.65. Although both the LR and RF methods were less accurate than the SVM, they achieved good classification, with CCRs of 75% and 70% respectively. The k-NN was the least accurate (40%) classification model. Overall, it can be concluded that consumer-grade digital cameras can be employed as fast, accurate and non-invasive sensors for classifying rainbow trout based on their diets. Furthermore, there was a close association between image-based features and the diet the fish received during cultivation. These procedures can be used as non-invasive, accurate and precise approaches for monitoring fish status during cultivation by evaluating the diet's effects on fish skin. PMID:29596375
Saberioon, Mohammadmehdi; Císař, Petr; Labbé, Laurent; Souček, Pavel; Pelissier, Pablo; Kerneis, Thierry
2018-03-29
The main aim of this study was to develop a new objective method for evaluating the impacts of different diets on live fish skin using image-based features. In total, one hundred and sixty rainbow trout (Oncorhynchus mykiss) were fed either a fish-meal based diet (80 fish) or a 100% plant-based diet (80 fish) and photographed using a consumer-grade digital camera. Twenty-three colour features and four texture features were extracted. Four different classification methods were used to evaluate fish diets: Random forest (RF), Support vector machine (SVM), Logistic regression (LR) and k-Nearest neighbours (k-NN). The SVM with radial basis kernel provided the best classifier, with a correct classification rate (CCR) of 82% and a Kappa coefficient of 0.65. Although both the LR and RF methods were less accurate than the SVM, they achieved good classification, with CCRs of 75% and 70% respectively. The k-NN was the least accurate (40%) classification model. Overall, it can be concluded that consumer-grade digital cameras can be employed as fast, accurate and non-invasive sensors for classifying rainbow trout based on their diets. Furthermore, there was a close association between image-based features and the diet the fish received during cultivation. These procedures can be used as non-invasive, accurate and precise approaches for monitoring fish status during cultivation by evaluating the diet's effects on fish skin.
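A generic sketch of the classification-and-scoring step is given below; the features are random placeholders, not the extracted colour and texture descriptors, and the numbers will not reproduce the reported CCR or Kappa.

```python
# Sketch: an RBF-kernel SVM scored by correct classification rate (accuracy)
# and Cohen's kappa, mirroring the metrics used in the abstract.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, cohen_kappa_score

rng = np.random.default_rng(9)
features = rng.normal(size=(160, 27))       # 23 colour + 4 texture features
diet = np.repeat([0, 1], 80)                # fish-meal vs plant-based diet
features[diet == 1, :5] += 0.8              # give the classifier a signal to find

Xtr, Xte, ytr, yte = train_test_split(features, diet, test_size=0.3,
                                      stratify=diet, random_state=0)
pred = SVC(kernel="rbf", gamma="scale").fit(Xtr, ytr).predict(Xte)
print("CCR  :", round(accuracy_score(yte, pred), 2))
print("Kappa:", round(cohen_kappa_score(yte, pred), 2))
```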
Interpretation of the Coefficients in the Fit y = at + bx + c
ERIC Educational Resources Information Center
Farnsworth, David L.
2006-01-01
The goals of this note are to derive formulas for the coefficients a and b in the least-squares regression plane y = at + bx + c for observations (t[subscript]i,x[subscript]i,y[subscript]i), i = 1, 2, ..., n, and to present meanings for the coefficients a and b. In this note, formulas for the coefficients a and b in the least-squares fit are…
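For reference, the plane's coefficients can also be computed directly with a linear least-squares solve; this is generic numerical practice on invented data, not the note's derivation.

```python
# Sketch: fitting the least-squares plane y = a*t + b*x + c.
import numpy as np

rng = np.random.default_rng(10)
t = rng.uniform(0, 10, 50)
x = rng.uniform(0, 10, 50)
y = 1.5 * t - 0.8 * x + 3.0 + rng.normal(0, 0.5, 50)

A = np.column_stack([t, x, np.ones_like(t)])
(a, b, c), *_ = np.linalg.lstsq(A, y, rcond=None)
print(f"y = {a:.2f}*t + {b:.2f}*x + {c:.2f}")
# a and b are the partial slopes of y with respect to t and x, each holding
# the other predictor fixed -- the interpretation the note develops.
```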
Li, Jin-ming; Zheng, Huai-jing; Wang, Lu-nan; Deng, Wei
2003-04-01
To establish a model for choosing controls of suitable concentration for internal quality control (IQC) of qualitative ELISA detection, and a method for consecutive plotting on the Levey-Jennings control chart when the reagent kit lot is changed. First, a series of control sera with 0.2, 0.5, 1.0, 2.0 and 5.0 ng/ml HBsAg, respectively, were assessed for within-run and between-run precision according to the NCCLS EP5 document. Then, a linear regression equation (y = bx + a) with the best correlation coefficient (r > 0.99) was established based on the S/CO values of the series of control sera. Finally, a control was considered suitable for IQC use if its S/CO value calculated from the equation (y = bx + a), minus three times the between-run CV multiplied by that S/CO value, remained above 1.0. For consecutive plotting on the Levey-Jennings control chart when the ELISA kit lot was changed, the new-lot kits were used to detect the same series of HBsAg control sera as above, and a new linear regression equation (y2 = b2x2 + a2) with the best correlation coefficient was obtained. The old equation (y1 = b1x1 + a1) could be obtained from the mean values of the precision assessment above. The S/CO value of a control serum detected with the new kit lot could then be converted to the value that would have been obtained with the old kit lot based on the factor y2/y1, so plotting on the original Levey-Jennings control chart could be continued. The within-run coefficients of variation (CV) of the ELISA method for control sera with 0.2, 0.5, 1.0, 2.0 and 5.0 ng/ml HBsAg were 11.08%, 9.49%, 9.83%, 9.18% and 7.25%, respectively, and the between-run CVs were 13.25%, 14.03%, 15.11%, 13.29% and 9.92%. The linear regression equation with the best correlation coefficient from a randomly chosen test was y = 3.509x + 0.180. A suitable concentration of control serum for IQC could be 0.5 ng/ml or 1.0 ng/ml. The linear regression equations from the old lot and two other new lots of the ELISA kits were y1 = 3.550(x1) + 0.226, y2 = 3.238(x2) + 0.388, and y3 = 3.428(x3) + 0.148, respectively, giving transfer factors of 0.960 (y2/y1) and 0.908 (y3/y1). The results show that the model established for selecting the IQC control serum concentration and for consecutive plotting on the control chart when the reagent lot is changed is effective and practical.
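A small numerical sketch of the lot-change step follows; the S/CO readings are invented, and only the procedure (fit both lots, form the transfer factor y2/y1, rescale new-lot readings) follows the abstract.

```python
# Sketch: computing the lot-change transfer factor for a Levey-Jennings chart.
import numpy as np

conc = np.array([0.2, 0.5, 1.0, 2.0, 5.0])          # HBsAg, ng/ml
sco_old = np.array([0.94, 2.01, 3.78, 7.33, 17.98])  # hypothetical old-lot readings
sco_new = np.array([1.04, 2.01, 3.63, 6.87, 16.59])  # hypothetical new-lot readings

b1, a1 = np.polyfit(conc, sco_old, 1)
b2, a2 = np.polyfit(conc, sco_new, 1)

control_conc = 1.0                                   # chosen IQC control, ng/ml
y1 = b1 * control_conc + a1
y2 = b2 * control_conc + a2
factor = y2 / y1                                     # transfer factor, as in the paper
print(f"old lot: y = {b1:.3f}x + {a1:.3f};  new lot: y = {b2:.3f}x + {a2:.3f}")
print(f"transfer factor y2/y1 = {factor:.3f}")

new_reading = 3.4                                    # S/CO measured with the new lot
print("value plotted on the old chart:", round(new_reading / factor, 2))
```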
Mixed models, linear dependency, and identification in age-period-cohort models.
O'Brien, Robert M
2017-07-20
This paper examines the identification problem in age-period-cohort models that use either linear or categorically coded ages, periods, and cohorts, or combinations of these parameterizations. These models are not identified using the traditional fixed-effects regression approach because of a linear dependency between the ages, periods, and cohorts. However, they can be identified if the researcher introduces a single just-identifying constraint on the model coefficients. The problem with such constraints is that the results can differ substantially depending on the constraint chosen. Somewhat surprisingly, age-period-cohort models that specify one or more of the ages and/or periods and/or cohorts as random effects are identified, without introducing an additional constraint. I label this identification as statistical model identification and show how it comes about in mixed models and why which effects are treated as fixed and which are treated as random can substantially change the estimates of the age, period, and cohort effects. Copyright © 2017 John Wiley & Sons, Ltd.
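The linear dependency at the heart of the fixed-effects problem is easy to exhibit numerically, as in the small sketch below (arbitrary simulated ages and periods).

```python
# Sketch: with cohort = period - age, the three linear terms are exactly
# collinear, so the fixed-effects design matrix loses a rank.
import numpy as np

rng = np.random.default_rng(11)
age = rng.integers(20, 80, 500)
period = rng.integers(1990, 2020, 500)
cohort = period - age                       # exact linear dependency

X = np.column_stack([np.ones(500), age, period, cohort])
print("columns:", X.shape[1], " rank:", np.linalg.matrix_rank(X))   # rank 3, not 4
```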
Roso, V M; Schenkel, F S; Miller, S P; Schaeffer, L R
2005-08-01
Breed additive, dominance, and epistatic loss effects are of concern in the genetic evaluation of a multibreed population. Multiple regression equations used for fitting these effects may show a high degree of multicollinearity among predictor variables. Typically, when strong linear relationships exist, the regression coefficients have large SE and are sensitive to changes in the data file and to the addition or deletion of variables in the model. Generalized ridge regression methods were applied to obtain stable estimates of direct and maternal breed additive, dominance, and epistatic loss effects in the presence of multicollinearity among predictor variables. Preweaning weight gains of beef calves in Ontario, Canada, from 1986 to 1999 were analyzed. The genetic model included fixed direct and maternal breed additive, dominance, and epistatic loss effects, fixed environmental effects of age of the calf, contemporary group, and age of the dam x sex of the calf, random additive direct and maternal genetic effects, and random maternal permanent environment effect. The degree and the nature of the multicollinearity were identified and ridge regression methods were used as an alternative to ordinary least squares (LS). Ridge parameters were obtained using two different objective methods: 1) generalized ridge estimator of Hoerl and Kennard (R1); and 2) bootstrap in combination with cross-validation (R2). Both ridge regression methods outperformed the LS estimator with respect to mean squared error of predictions (MSEP) and variance inflation factors (VIF) computed over 100 bootstrap samples. The MSEP of R1 and R2 were similar, and they were 3% less than the MSEP of LS. The average VIF of LS, R1, and R2 were equal to 26.81, 6.10, and 4.18, respectively. Ridge regression methods were particularly effective in decreasing the multicollinearity involving predictor variables of breed additive effects. Because of a high degree of confounding between estimates of maternal dominance and direct epistatic loss effects, it was not possible to compare the relative importance of these effects with a high level of confidence. The inclusion of epistatic loss effects in the additive-dominance model did not cause noticeable reranking of sires, dams, and calves based on across-breed EBV. More precise estimates of breed effects as a result of this study may result in more stable across-breed estimated breeding values over the years.
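A generic sketch of ridge regression stabilizing collinear predictors is given below, with the penalty chosen by ordinary cross-validation rather than by the Hoerl-Kennard or bootstrap estimators used in the study; the data are synthetic, not the preweaning weight records.

```python
# Sketch: OLS versus ridge regression on three nearly collinear predictors.
import numpy as np
from sklearn.linear_model import LinearRegression, RidgeCV

rng = np.random.default_rng(12)
n = 200
z = rng.normal(size=n)
X = np.column_stack([z + rng.normal(scale=0.05, size=n) for _ in range(3)])
y = X @ np.array([1.0, 1.0, 1.0]) + rng.normal(size=n)

ols = LinearRegression().fit(X, y)
ridge = RidgeCV(alphas=np.logspace(-3, 3, 25)).fit(X, y)
print("OLS coefficients  :", np.round(ols.coef_, 2))   # noisy, widely spread
print("Ridge coefficients:", np.round(ridge.coef_, 2)) # shrunken, more stable
```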
Equations of prediction for abdominal fat in brown egg-laying hens fed different diets.
Souza, C; Jaimes, J J B; Gewehr, C E
2017-06-01
The objective was to formulate equations for predicting the abdominal fat weight of laying hens in a noninvasive manner. Hens were fed different diets, and the external body measurements of the birds were used as regressors. We used 288 Hy-Line Brown laying hens, distributed in a completely randomized design in a factorial arrangement, submitted for 16 wk to 2 metabolizable energy levels (2,550 and 2,800 kcal/kg) and 3 levels of crude protein in the diet (150, 160, and 170 g/kg), totaling 6 treatments, with 48 hens each. Sixteen 92-wk-old hens per treatment were utilized to evaluate body weight, bird length, tarsus and sternum length, greater and lesser diameter of the tarsus, and, after slaughter, abdominal fat weight. The equations were obtained by using the evaluated measures as regressors in simple and multiple linear regression with the backward stepwise elimination method, with P < 0.10 for all variables remaining in the model. The abdominal fat weight predicted by the equations and the observed values for each bird were subjected to Pearson's correlation analysis. The equations generated by energy level showed coefficients of determination of 0.50 and 0.74 for 2,800 and 2,550 kcal/kg of metabolizable energy, respectively, with correlation coefficients of 0.71 and 0.84 and a highly significant correlation between the calculated and observed values of abdominal fat. For protein levels of 150, 160, and 170 g/kg in the diet, it was possible to obtain coefficients of determination of 0.75, 0.57, and 0.61, with correlation coefficients of 0.86, 0.75, and 0.78, respectively. For the general equation for predicting abdominal fat weight, the coefficient of determination was 0.62 and the correlation coefficient was 0.79. The equations for predicting abdominal fat weight in laying hens, based on external measurements of the birds, showed positive coefficients of determination and correlation coefficients, thus allowing researchers to determine abdominal fat weight in vivo. © 2016 Poultry Science Association Inc.
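As a rough illustration of the backward elimination step (dropping the least significant regressor until every remaining term satisfies P < 0.10), here is a sketch on simulated data. The measurement names and values are hypothetical, not the Hy-Line Brown records, and statsmodels is used as one possible implementation.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 96
# Hypothetical external measurements (not the study data).
df = pd.DataFrame({
    "body_weight": rng.normal(1.9, 0.2, n),
    "bird_length": rng.normal(44, 2, n),
    "sternum_length": rng.normal(13, 1, n),
    "tarsus_diameter": rng.normal(1.2, 0.1, n),
})
df["abdominal_fat"] = (60 * df["body_weight"] + 1.5 * df["bird_length"]
                       + rng.normal(0, 8, n))

def backward_eliminate(df, response, alpha=0.10):
    """Drop the least significant regressor until all p-values are below alpha."""
    predictors = [c for c in df.columns if c != response]
    while predictors:
        X = sm.add_constant(df[predictors])
        fit = sm.OLS(df[response], X).fit()
        pvals = fit.pvalues.drop("const")
        worst = pvals.idxmax()
        if pvals[worst] < alpha:
            return fit
        predictors.remove(worst)
    return None

final = backward_eliminate(df, "abdominal_fat")
print(final.summary().tables[1])
```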
Is the Professional Satisfaction of General Internists Associated with Patient Satisfaction?
Haas, Jennifer S; Cook, E Francis; Puopolo, Ann Louise; Burstin, Helen R; Cleary, Paul D; Brennan, Troyen A
2000-01-01
BACKGROUND The growth of managed care has raised a number of concerns about patient and physician satisfaction. An association between physicians' professional satisfaction and the satisfaction of their patients could suggest new types of organizational interventions to improve the satisfaction of both. OBJECTIVE To examine the relation between the satisfaction of general internists and their patients. DESIGN Cross-sectional surveys of patients and physicians. SETTING Eleven academically affiliated general internal medicine practices in the greater-Boston area. PARTICIPANTS A random sample of English-speaking and Spanish-speaking patients (n = 2,620) with at least one visit to their physician (n = 166) during the preceding year. MEASUREMENTS Patients' overall satisfaction with their health care, and their satisfaction with their most recent physician visit. MAIN RESULTS After adjustment, the patients of physicians who rated themselves to be very or extremely satisfied with their work had higher scores for overall satisfaction with their health care (regression coefficient 2.10; 95% confidence interval 0.73–3.48), and for satisfaction with their most recent physician visit (regression coefficient 1.23; 95% confidence interval 0.26–2.21). In addition, younger patients, those with better overall health status, and those cared for by a physician who worked part-time were significantly more likely to report better satisfaction with both measures. Minority patients and those with managed care insurance also reported lower overall satisfaction. CONCLUSIONS The patients of physicians who have higher professional satisfaction may themselves be more satisfied with their care. Further research will need to consider factors that may mediate the relation between patient and physician satisfaction. PMID:10672116
NASA Astrophysics Data System (ADS)
Ghadiriyan Arani, M.; Pahlavani, P.; Effati, M.; Noori Alamooti, F.
2017-09-01
Road traffic crashes, especially on highways, are a social problem that affects many lives. This paper therefore focuses on the highways of Atlanta, the capital and most populous city of the U.S. state of Georgia and the center of the ninth largest metropolitan area in the United States. Geographically weighted regression and general centrality criteria are the two aspects of traffic analysis used in this article. In the first step, to estimate crash intensity, the dual graph of the street and highway network is extracted so that general centrality criteria can be computed. From the resulting graph, the criteria are: degree, PageRank, random walk, eccentricity, closeness, betweenness, clustering coefficient, eigenvector, and straightness. The crash intensity of each highway is obtained by dividing the number of crashes on that highway by the total number of crashes. The criteria and crash intensities were then normalized, and the correlations among them were calculated to identify criteria that are not dependent on each other. The proposed hybrid approach is well suited to this regression problem because these centrality measures lead to a more desirable output. The R2 value for geographically weighted regression was 0.539 with a Gaussian kernel and 0.684 with a tricube kernel. The results showed that the tricube kernel is better for modeling crash intensity.
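Geographically weighted regression fits a separate weighted least squares at every location, with weights that decay with distance according to a kernel. The sketch below illustrates the idea with a Gaussian kernel on simulated segment coordinates and centrality scores; none of the names, values, or bandwidths come from the Atlanta data.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
coords = rng.uniform(0, 10, size=(n, 2))     # hypothetical highway-segment centroids
betweenness = rng.random(n)                  # hypothetical normalized centrality scores
closeness = rng.random(n)
# Crash intensity with a spatially varying effect of betweenness.
crash_intensity = ((0.2 + 0.05 * coords[:, 0]) * betweenness
                   + 0.3 * closeness + rng.normal(0, 0.05, n))

X = np.column_stack([np.ones(n), betweenness, closeness])

def gwr_coefficients(i, bandwidth=2.0):
    """Weighted least squares at location i with a Gaussian distance kernel."""
    d = np.linalg.norm(coords - coords[i], axis=1)
    w = np.exp(-0.5 * (d / bandwidth) ** 2)
    W = np.diag(w)
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ crash_intensity)

# The local coefficient of betweenness varies across the study area.
local_beta = np.array([gwr_coefficients(i)[1] for i in range(n)])
print("local betweenness effect: min %.2f, max %.2f" % (local_beta.min(), local_beta.max()))
```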
NASA Astrophysics Data System (ADS)
Webb, Mathew A.; Hall, Andrew; Kidd, Darren; Minansy, Budiman
2016-05-01
Assessment of local spatial climatic variability is important in the planning of planting locations for horticultural crops. This study investigated three regression-based calibration methods (i.e. traditional versus two optimized methods) to relate short-term 12-month data series from 170 temperature loggers and 4 weather station sites with data series from nearby long-term Australian Bureau of Meteorology climate stations. The techniques trialled to interpolate climatic temperature variables, such as frost risk, growing degree days (GDDs) and chill hours, were regression kriging (RK), regression trees (RTs) and random forests (RFs). All three calibration methods produced accurate results, with the RK-based calibration method delivering the most accurate validation measures: coefficients of determination (R2) of 0.92, 0.97 and 0.95 and root-mean-square errors of 1.30, 0.80 and 1.31 °C, for daily minimum, daily maximum and hourly temperatures, respectively. Compared with the traditional method of calibration using direct linear regression between short-term and long-term stations, the RK-based calibration method improved R2 and reduced root-mean-square error (RMSE) by at least 5 % and 0.47 °C for daily minimum temperature, 1 % and 0.23 °C for daily maximum temperature and 3 % and 0.33 °C for hourly temperature. Spatial modelling indicated insignificant differences between the interpolation methods, with the RK technique tending to be the slightly better method due to the high degree of spatial autocorrelation between logger sites.
Shafizadeh-Moghadam, Hossein; Valavi, Roozbeh; Shahabi, Himan; Chapi, Kamran; Shirzadi, Ataollah
2018-07-01
In this research, eight individual machine learning and statistical models are implemented and compared, and based on their results, seven ensemble models for flood susceptibility assessment are introduced. The individual models included artificial neural networks, classification and regression trees, flexible discriminant analysis, generalized linear model, generalized additive model, boosted regression trees, multivariate adaptive regression splines, and maximum entropy, and the ensemble models were Ensemble Model committee averaging (EMca), Ensemble Model confidence interval Inferior (EMciInf), Ensemble Model confidence interval Superior (EMciSup), Ensemble Model to estimate the coefficient of variation (EMcv), Ensemble Model to estimate the mean (EMmean), Ensemble Model to estimate the median (EMmedian), and Ensemble Model based on weighted mean (EMwmean). The data set covered 201 flood events in the Haraz watershed (Mazandaran province in Iran) and 10,000 randomly selected non-occurrence points. Among the individual models, the highest Area Under the Receiver Operating Characteristic curve (AUROC) belonged to boosted regression trees (0.975) and the lowest to the generalized linear model (0.642). On the other hand, the proposed EMmedian resulted in the highest accuracy (0.976) among all models. In spite of the outstanding performance of some individual models, the variability among their predictions was considerable. Therefore, to reduce uncertainty and to create more generalizable, more stable, and less sensitive models, ensemble forecasting approaches, and in particular the EMmedian, are recommended for flood susceptibility assessment. Copyright © 2018 Elsevier Ltd. All rights reserved.
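The EMmedian ensemble simply takes, at each location, the median of the member models' susceptibility predictions. A schematic version with placeholder prediction arrays is shown below; the model outputs here are random stand-ins, not the Haraz watershed results.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(3)
n_cells = 1000
y = rng.integers(0, 2, n_cells)            # flood / non-flood labels (placeholder)

# Placeholder susceptibility maps from eight individual models: noisy versions of y.
members = np.clip(y[None, :] + rng.normal(0, 0.8, size=(8, n_cells)), 0, 1)

em_median = np.median(members, axis=0)     # EMmedian
em_mean = members.mean(axis=0)             # EMmean

for name, pred in [("EMmedian", em_median), ("EMmean", em_mean)]:
    print(name, "AUROC = %.3f" % roc_auc_score(y, pred))
```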
ERIC Educational Resources Information Center
Bloom, Howard S.; Raudenbush, Stephen W.; Weiss, Michael J.; Porter, Kristin
2017-01-01
The present article considers a fundamental question in evaluation research: "By how much do program effects vary across sites?" The article first presents a theoretical model of cross-site impact variation and a related estimation model with a random treatment coefficient and fixed site-specific intercepts. This approach eliminates…
ERIC Educational Resources Information Center
Waller, Niels; Jones, Jeff
2011-01-01
We describe methods for assessing all possible criteria (i.e., dependent variables) and subsets of criteria for regression models with a fixed set of predictors, x (where x is an n x 1 vector of independent variables). Our methods build upon the geometry of regression coefficients (hereafter called regression weights) in n-dimensional space. For a…
MATIN: a random network coding based framework for high quality peer-to-peer live video streaming.
Barekatain, Behrang; Khezrimotlagh, Dariush; Aizaini Maarof, Mohd; Ghaeini, Hamid Reza; Salleh, Shaharuddin; Quintana, Alfonso Ariza; Akbari, Behzad; Cabrera, Alicia Triviño
2013-01-01
In recent years, Random Network Coding (RNC) has emerged as a promising solution for efficient Peer-to-Peer (P2P) video multicasting over the Internet. This is probably because RNC noticeably increases the error resiliency and throughput of the network. However, the high transmission overhead arising from sending a large coefficients vector as a header has been the most important challenge of RNC. Moreover, because the Gauss-Jordan elimination method is employed, considerable computational complexity can be imposed on peers in decoding the encoded blocks and checking linear dependency among the coefficients vectors. In order to address these challenges, this study introduces MATIN, a random network coding based framework for efficient P2P video streaming. MATIN includes a novel coefficients matrix generation method that guarantees there is no linear dependency in the generated coefficients matrix. Using the proposed framework, each peer encapsulates one instead of n coefficients entries into the generated encoded packet, which results in very low transmission overhead. It is also possible to obtain the inverted coefficients matrix using a small number of simple arithmetic operations. In this regard, peers sustain very low computational complexity. As a result, MATIN permits random network coding to be more efficient in P2P video streaming systems. The results obtained from simulation using OMNeT++ show that it substantially outperforms RNC with Gauss-Jordan elimination by providing better video quality on peers in terms of four important performance metrics: video distortion, dependency distortion, end-to-end delay, and initial startup delay.
Govindarajan, Parameswari; Schlewitz, Gudrun; Schliefke, Nathalie; Weisweiler, David; Alt, Volker; Thormann, Ulrich; Lips, Katrin Susanne; Wenisch, Sabine; Langheinrich, Alexander C.; Zahner, Daniel; Hemdan, Nasr Y.; Böcker, Wolfgang; Schnettler, Reinhard; Heiss, Christian
2013-01-01
Background Osteoporosis is a multi-factorial, chronic, skeletal disease highly prevalent in post-menopausal women and is influenced by hormonal and dietary factors. Because animal models are imperative for disease diagnostics, the present study establishes and evaluates enhanced osteoporosis obtained through combined ovariectomy and deficient diet by DEXA (dual-energy X-ray absorptiometry) for a prolonged time period. Material/Methods Sprague-Dawley rats were randomly divided into sham (laparotomized) and OVX-diet (ovariectomized and fed with deficient diet) groups. Different skeletal sites were scanned by DEXA at the following time points: M0 (baseline), M12 (12 months post-surgery), and M14 (14 months post-surgery). Parameters analyzed included BMD (bone mineral density), BMC (bone mineral content), bone area, and fat (%). Regression analysis was performed to determine the interrelationships between BMC, BMD, and bone area from M0 to M14. Results BMD and BMC were significantly lower in OVX-diet rats at M12 and M14 compared to sham rats. The Z-scores were below −5 in OVX-diet rats at M12, but still decreased at M14 in OVX-diet rats. Bone area and percent fat were significantly lower in OVX-diet rats at M14 compared to sham rats. The regression coefficients for BMD vs. bone area, BMC vs. bone area, and BMC vs. BMD of OVX-diet rats increased with time. This is explained by differential percent change in BMD, BMC, and bone area with respect to time and disease progression. Conclusions Combined ovariectomy and deficient diet in rats caused significant reduction of BMD, BMC, and bone area, with nearly 40% bone loss after 14 months, indicating the development of severe osteoporosis. An increasing regression coefficient of BMD vs. bone area with disease progression emphasizes bone area as an important parameter, along with BMD and BMC, for prediction of fracture risk. PMID:23446183
Kopprasch, Steffi; Dheban, Srirangan; Schuhmann, Kai; Xu, Aimin; Schulte, Klaus-Martin; Simeonovic, Charmaine J; Schwarz, Peter E H; Bornstein, Stefan R; Shevchenko, Andrej; Graessler, Juergen
2016-01-01
Glucolipotoxicity is a major pathophysiological mechanism in the development of insulin resistance and type 2 diabetes mellitus (T2D). We aimed to detect subtle changes in the circulating lipid profile by shotgun lipidomics analyses and to associate them with four different insulin sensitivity indices. The cross-sectional study comprised 90 men with a broad range of insulin sensitivity including normal glucose tolerance (NGT, n = 33), impaired glucose tolerance (IGT, n = 32) and newly detected T2D (n = 25). Prior to oral glucose challenge plasma was obtained and quantitatively analyzed for 198 lipid molecular species from 13 different lipid classes including triacylglycerols (TAGs), phosphatidylcholine plasmalogen/ether (PC O-s), sphingomyelins (SMs), and lysophosphatidylcholines (LPCs). To identify a lipidomic signature of individual insulin sensitivity we applied three data mining approaches, namely least absolute shrinkage and selection operator (LASSO), Support Vector Regression (SVR) and Random Forests (RF) for the following insulin sensitivity indices: homeostasis model of insulin resistance (HOMA-IR), glucose insulin sensitivity index (GSI), insulin sensitivity index (ISI), and disposition index (DI). The LASSO procedure offers higher prediction accuracy and easier interpretability than SVR and RF. After LASSO selection, the plasma lipidome explained from 3% (DI) to at most 53% (HOMA-IR) of the variability in the sensitivity indices. Among the lipid species with the highest positive LASSO regression coefficient were TAG 54:2 (HOMA-IR), PC O- 32:0 (GSI), and SM 40:3:1 (ISI). The highest negative regression coefficient was obtained for LPC 22:5 (HOMA-IR), TAG 51:1 (GSI), and TAG 58:6 (ISI). Although a substantial proportion of the lipid molecular species showed a significant correlation with insulin sensitivity indices we were able to identify a limited number of lipid metabolites of particular importance based on the LASSO approach. These few selected lipids with the closest connection to sensitivity indices may help to further improve disease risk prediction and disease and therapy monitoring.
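LASSO selection of a small panel of lipid species against an insulin sensitivity index can be sketched as below. The feature matrix and outcome are synthetic stand-ins for the 198 quantified lipids and HOMA-IR, and scikit-learn's cross-validated LASSO is used as one possible implementation.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(4)
n_subjects, n_lipids = 90, 198
# Synthetic lipid concentrations (log-normal, as concentrations often are).
X = rng.lognormal(mean=0.0, sigma=0.5, size=(n_subjects, n_lipids))
# Synthetic HOMA-IR driven by a handful of lipids plus noise.
homa_ir = 1.5 + 0.8 * X[:, 0] - 0.6 * X[:, 10] + 0.4 * X[:, 25] + rng.normal(0, 0.5, n_subjects)

Xs = StandardScaler().fit_transform(X)
lasso = LassoCV(cv=10, random_state=0).fit(Xs, homa_ir)

selected = np.flatnonzero(lasso.coef_)            # indices of retained lipid species
print("lipids retained:", selected)
print("explained variance R^2 = %.2f" % lasso.score(Xs, homa_ir))
```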
Macroscopic damping model for structural dynamics with random polycrystalline configurations
NASA Astrophysics Data System (ADS)
Yang, Yantao; Cui, Junzhi; Yu, Yifan; Xiang, Meizhen
2018-06-01
In this paper the macroscopic damping model for dynamical behavior of the structures with random polycrystalline configurations at micro-nano scales is established. First, the global motion equation of a crystal is decomposed into a set of motion equations with independent single degree of freedom (SDOF) along normal discrete modes, and then damping behavior is introduced into each SDOF motion. Through the interpolation of discrete modes, the continuous representation of damping effects for the crystal is obtained. Second, from energy conservation law the expression of the damping coefficient is derived, and the approximate formula of damping coefficient is given. Next, the continuous damping coefficient for polycrystalline cluster is expressed, the continuous dynamical equation with damping term is obtained, and then the concrete damping coefficients for a polycrystalline Cu sample are shown. Finally, by using statistical two-scale homogenization method, the macroscopic homogenized dynamical equation containing damping term for the structures with random polycrystalline configurations at micro-nano scales is set up.
Gonzalez-Vazquez, J P; Anta, Juan A; Bisquert, Juan
2009-11-28
The random walk numerical simulation (RWNS) method is used to compute diffusion coefficients for hopping transport in a fully disordered medium at finite carrier concentrations. We use Miller-Abrahams jumping rates and an exponential distribution of energies to compute the hopping times in the random walk simulation. The computed diffusion coefficient shows an exponential dependence with respect to Fermi-level and Arrhenius behavior with respect to temperature. This result indicates that there is a well-defined transport level implicit to the system dynamics. To establish the origin of this transport level we construct histograms to monitor the energies of the most visited sites. In addition, we construct "corrected" histograms where backward moves are removed. Since these moves do not contribute to transport, these histograms provide a better estimation of the effective transport level energy. The analysis of this concept in connection with the Fermi-level dependence of the diffusion coefficient and the regime of interest for the functioning of dye-sensitised solar cells is thoroughly discussed.
Interpreting Bivariate Regression Coefficients: Going beyond the Average
ERIC Educational Resources Information Center
Halcoussis, Dennis; Phillips, G. Michael
2010-01-01
Statistics, econometrics, investment analysis, and data analysis classes often review the calculation of several types of averages, including the arithmetic mean, geometric mean, harmonic mean, and various weighted averages. This note shows how each of these can be computed using a basic regression framework. By recognizing when a regression model…
Beyond Multiple Regression: Using Commonality Analysis to Better Understand R[superscript 2] Results
ERIC Educational Resources Information Center
Warne, Russell T.
2011-01-01
Multiple regression is one of the most common statistical methods used in quantitative educational research. Despite the versatility and easy interpretability of multiple regression, it has some shortcomings in the detection of suppressor variables and for somewhat arbitrarily assigning values to the structure coefficients of correlated…
Precision Efficacy Analysis for Regression.
ERIC Educational Resources Information Center
Brooks, Gordon P.
When multiple linear regression is used to develop a prediction model, sample size must be large enough to ensure stable coefficients. If the derivation sample size is inadequate, the model may not predict well for future subjects. The precision efficacy analysis for regression (PEAR) method uses a cross- validity approach to select sample sizes…
Dowd, Kieran P.; Harrington, Deirdre M.; Donnelly, Alan E.
2012-01-01
Background The activPAL has been identified as an accurate and reliable measure of sedentary behaviour. However, only limited information is available on the accuracy of the activPAL activity count function as a measure of physical activity, while no unit calibration of the activPAL has been completed to date. This study aimed to investigate the criterion validity of the activPAL, examine the concurrent validity of the activPAL, and perform and validate a value calibration of the activPAL in an adolescent female population. The performance of the activPAL in estimating posture was also compared with sedentary thresholds used with the ActiGraph accelerometer. Methodologies Thirty adolescent females (15 developmental; 15 cross-validation) aged 15–18 years performed 5 activities while wearing the activPAL, ActiGraph GT3X, and the Cosmed K4B2. A random coefficient statistics model examined the relationship between metabolic equivalent (MET) values and activPAL counts. Receiver operating characteristic analysis was used to determine activity thresholds and for cross-validation. The random coefficient statistics model showed a concordance correlation coefficient of 0.93 (standard error of the estimate = 1.13). An optimal moderate threshold of 2997 was determined using mixed regression, while an optimal vigorous threshold of 8229 was determined using receiver operating statistics. The activPAL count function demonstrated very high concurrent validity (r = 0.96, p<0.01) with the ActiGraph count function. Levels of agreement for sitting, standing, and stepping between direct observation and the activPAL and ActiGraph were 100%, 98.1%, 99.2% and 100%, 0%, 100%, respectively. Conclusions These findings suggest that the activPAL is a valid, objective measurement tool that can be used for both the measurement of physical activity and sedentary behaviours in an adolescent female population. PMID:23094069
40 CFR 53.34 - Test procedure for methods for PM10 and Class I methods for PM2.5.
Code of Federal Regulations, 2011 CFR
2011-07-01
... linear regression parameters (slope, intercept, and correlation coefficient) describing the relationship... correlation coefficient. (2) To pass the test for comparability, the slope, intercept, and correlation...
Su, Liyun; Zhao, Yanyong; Yan, Tianshun; Li, Fenglan
2012-01-01
Multivariate local polynomial fitting is applied to the multivariate linear heteroscedastic regression model. First, local polynomial fitting is applied to estimate the heteroscedastic function; the coefficients of the regression model are then obtained by using the generalized least squares method. One noteworthy feature of our approach is that we avoid testing for heteroscedasticity by improving the traditional two-stage method. Because local polynomial estimation is a non-parametric technique, it is unnecessary to know the form of the heteroscedastic function. Therefore, we can improve the estimation precision when the heteroscedastic function is unknown. Furthermore, we verify that the regression coefficients are asymptotically normal, based on numerical simulations and normal Q-Q plots of residuals. Finally, the simulation results and the local polynomial estimation of real data indicate that our approach is effective in finite-sample situations.
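A much-simplified, univariate sketch of the two-stage idea (estimate the variance function from squared residuals with a local polynomial smoother, then refit by generalized/weighted least squares) is given below. It illustrates the general approach on simulated data, not the authors' multivariate procedure.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 400
x = rng.uniform(0, 1, n)
sigma = 0.2 + 0.8 * x                      # heteroscedastic noise, increasing in x
y = 1.0 + 2.0 * x + rng.normal(0, sigma)

X = np.column_stack([np.ones(n), x])

# Stage 1: OLS fit, then a local-linear estimate of the conditional variance
# from the squared residuals.
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
r2 = (y - X @ beta_ols) ** 2

def local_linear(x0, xs, ys, h=0.1):
    """Local linear smoother value at x0 with a Gaussian kernel of bandwidth h."""
    w = np.exp(-0.5 * ((xs - x0) / h) ** 2)
    Xl = np.column_stack([np.ones_like(xs), xs - x0])
    W = np.diag(w)
    return np.linalg.solve(Xl.T @ W @ Xl, Xl.T @ W @ ys)[0]

var_hat = np.array([local_linear(xi, x, r2) for xi in x])
var_hat = np.clip(var_hat, 1e-3, None)

# Stage 2: generalized (weighted) least squares with the estimated variances.
W = np.diag(1.0 / var_hat)
beta_gls = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)

print("OLS coefficients:", np.round(beta_ols, 3))
print("GLS coefficients:", np.round(beta_gls, 3))
```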
NASA Technical Reports Server (NTRS)
Stolzer, Alan J.; Halford, Carl
2007-01-01
In a previous study, multiple regression techniques were applied to Flight Operations Quality Assurance-derived data to develop parsimonious model(s) for fuel consumption on the Boeing 757 airplane. The present study examined several data mining algorithms, including neural networks, on the fuel consumption problem and compared them to the multiple regression results obtained earlier. Using regression methods, parsimonious models were obtained that explained approximately 85% of the variation in fuel flow. In general data mining methods were more effective in predicting fuel consumption. Classification and Regression Tree methods reported correlation coefficients of .91 to .92, and General Linear Models and Multilayer Perceptron neural networks reported correlation coefficients of about .99. These data mining models show great promise for use in further examining large FOQA databases for operational and safety improvements.
NASA Astrophysics Data System (ADS)
Mitra, Ashis; Majumdar, Prabal Kumar; Bannerjee, Debamalya
2013-03-01
This paper presents a comparative analysis of two modeling methodologies for the prediction of air permeability of plain woven handloom cotton fabrics. Four basic fabric constructional parameters namely ends per inch, picks per inch, warp count and weft count have been used as inputs for artificial neural network (ANN) and regression models. Out of the four regression models tried, interaction model showed very good prediction performance with a meager mean absolute error of 2.017 %. However, ANN models demonstrated superiority over the regression models both in terms of correlation coefficient and mean absolute error. The ANN model with 10 nodes in the single hidden layer showed very good correlation coefficient of 0.982 and 0.929 and mean absolute error of only 0.923 and 2.043 % for training and testing data respectively.
Baldi, F; Alencar, M M; Albuquerque, L G
2010-12-01
The objective of this work was to estimate covariance functions using random regression models on B-spline functions of animal age, for weights from birth to adult age in Canchim cattle. Data comprised 49,011 records on 2435 females. The model of analysis included fixed effects of contemporary groups, age of dam as quadratic covariable and the population mean trend taken into account by a cubic regression on orthogonal polynomials of animal age. Residual variances were modelled through a step function with four classes. The direct and maternal additive genetic effects, and animal and maternal permanent environmental effects were included as random effects in the model. A total of seventeen analyses, considering linear, quadratic and cubic B-spline functions and up to seven knots, were carried out. B-spline functions of the same order were considered for all random effects. Random regression models on B-spline functions were compared with a random regression model on Legendre polynomials and with a multitrait model. Results from the different models of analysis were compared using the REML form of the Akaike Information criterion and Schwarz' Bayesian Information criterion. In addition, the variance components and genetic parameters estimated for each random regression model were also used as criteria to choose the most adequate model to describe the covariance structure of the data. A model fitting quadratic B-splines, with four knots or three segments for direct additive genetic effect and animal permanent environmental effect and two knots for maternal additive genetic effect and maternal permanent environmental effect, was the most adequate to describe the covariance structure of the data. Random regression models using B-spline functions as base functions fitted the data better than Legendre polynomials, especially at mature ages, but a larger number of parameters needs to be estimated with B-spline functions. © 2010 Blackwell Verlag GmbH.
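For reference, the Legendre-polynomial alternative in the comparison builds its random regression covariates by rescaling age to [-1, 1] and evaluating normalized Legendre polynomials at each record's age. A brief sketch follows; the age range and polynomial order are illustrative, not those of the Canchim analysis.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_covariates(age, age_min, age_max, order):
    """Evaluate normalized Legendre polynomials P_0..P_order at standardized ages."""
    t = 2.0 * (age - age_min) / (age_max - age_min) - 1.0   # map ages to [-1, 1]
    cols = []
    for k in range(order + 1):
        coef = np.zeros(k + 1)
        coef[k] = 1.0
        norm = np.sqrt((2 * k + 1) / 2.0)                    # normalization constant
        cols.append(norm * legendre.legval(t, coef))
    return np.column_stack(cols)

ages = np.array([1, 120, 240, 365, 550, 730], dtype=float)   # days, illustrative
Phi = legendre_covariates(ages, age_min=1, age_max=730, order=3)
print(np.round(Phi, 3))   # one row of covariates per record, used for the random regressions
```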
Correlation and prediction of dynamic human isolated joint strength from lean body mass
NASA Technical Reports Server (NTRS)
Pandya, Abhilash K.; Hasson, Scott M.; Aldridge, Ann M.; Maida, James C.; Woolford, Barbara J.
1992-01-01
A relationship between a person's lean body mass and the amount of maximum torque that can be produced with each isolated joint of the upper extremity was investigated. The maximum dynamic isolated joint torque (upper extremity) on 14 subjects was collected using a dynamometer multi-joint testing unit. These data were reduced to a table of coefficients of second degree polynomials, computed using a least squares regression method. All the coefficients were then organized into look-up tables, a compact and convenient storage/retrieval mechanism for the data set. Data from each joint, direction and velocity, were normalized with respect to that joint's average and merged into files (one for each curve for a particular joint). Regression was performed on each one of these files to derive a table of normalized population curve coefficients for each joint axis, direction, and velocity. In addition, a regression table which included all upper extremity joints was built which related average torque to lean body mass for an individual. These two tables are the basis of the regression model which allows the prediction of dynamic isolated joint torques from an individual's lean body mass.
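A schematic reconstruction of the look-up-table idea described above (a normalized second-degree torque curve per joint and direction, scaled by an average torque predicted from lean body mass) is sketched below. All numbers, joints, and curve shapes are invented for illustration and are not the measured NASA data.

```python
import numpy as np

rng = np.random.default_rng(6)

# Hypothetical calibration data: lean body mass (kg) vs. subject-average elbow-flexion torque (Nm).
lbm = rng.uniform(40, 80, 14)
avg_torque = 0.9 * lbm + rng.normal(0, 3, 14)
lbm_fit = np.polyfit(lbm, avg_torque, 1)              # average torque as a function of LBM

# Hypothetical normalized torque-velocity curve for one joint and direction:
# torque (relative to the joint average) declines with angular velocity (deg/s).
velocity = np.array([30, 60, 90, 120, 150, 180], dtype=float)
norm_torque = np.array([1.25, 1.15, 1.02, 0.92, 0.85, 0.80])
curve_fit = np.polyfit(velocity, norm_torque, 2)      # second-degree polynomial coefficients

lookup = {("elbow", "flexion"): curve_fit}            # coefficient look-up table

def predict_torque(joint, direction, omega, lean_body_mass):
    """Predicted torque = (LBM-predicted joint average) x (normalized curve value)."""
    avg = np.polyval(lbm_fit, lean_body_mass)
    return avg * np.polyval(lookup[(joint, direction)], omega)

print("predicted elbow flexion torque at 90 deg/s for 60 kg LBM: %.1f Nm"
      % predict_torque("elbow", "flexion", 90.0, 60.0))
```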
Sabetghadam, Samaneh; Ahmadi-Givi, Farhang
2014-01-01
Light extinction, the extent to which a light signal is attenuated per unit distance traveled in the absence of special weather conditions (e.g., fog and rain), can be expressed as the sum of the scattering and absorption effects of aerosols. In this paper, diurnal and seasonal variations of the extinction coefficient are investigated for the urban areas of Tehran from 2007 to 2009. Cases of visibility impairment that were concurrent with reports of fog, mist, precipitation, or relative humidity above 90% are filtered out. The mean value and standard deviation of daily extinction are 0.49 and 0.39 km(-1), respectively. The average is much higher than that in many other large cities in the world, indicating the rather poor air quality over Tehran. The extinction coefficient shows obvious diurnal variations in each season, with a peak in the morning that is more pronounced in the wintertime. Also, there is a very slight increasing trend in the annual variations of the atmospheric extinction coefficient, which suggests that air quality has deteriorated since 2007. The horizontal extinction coefficient decreased from January to July in each year and then increased between July and December, with the maximum value in the winter. Diurnal variation of extinction is often associated with small values at low relative humidity (RH), but extinction increases significantly at higher RH. Annual correlation analysis shows that there is a positive correlation between the extinction coefficient and RH, CO, PM10, SO2, and NO2 concentration, while a negative correlation exists between extinction and T, WS, and O3, implying their unfavorable impact on extinction variation. The extinction budget was derived from multiple regression equations using the regression coefficients. On average, 44% of the extinction is from suspended particles, 3% is from air molecules, about 5% is from NO2 absorption, 0.35% is from RH, and approximately 48% is unaccounted for, which may represent errors in the data as well as the contribution of other atmospheric constituents omitted from the analysis. A stronger regression equation is achieved in the summer, meaning that extinction is more predictable in this season from pollutant concentrations.
Li, Zhenghua; Cheng, Fansheng; Xia, Zhining
2011-01-01
The chemical structures of 114 polycyclic aromatic sulfur heterocycles (PASHs) have been studied by the molecular electronegativity-distance vector (MEDV). The linear relationships between the gas chromatographic retention index and the MEDV have been established by a multiple linear regression (MLR) model. Variable selection by stepwise multiple regression (SMR) and appraisal of the predictive ability of the optimized model by leave-one-out cross-validation showed that the model with a correlation coefficient (R) of 0.9947 and a cross-validated correlation coefficient (Rcv) of 0.9940 possessed the best statistical quality. Furthermore, when the 114 PASH compounds were divided into calibration and test sets in a 2:1 ratio, the statistical analysis showed that the models possess almost equal statistical quality, very similar regression coefficients, and good robustness. The quantitative structure-retention relationship (QSRR) model established may provide a convenient and powerful method for predicting the gas chromatographic retention of PASHs.
Cerebrospinal fluid norepinephrine and cognition in subjects across the adult age span
Wang, Lucy Y.; Murphy, Richard R.; Hanscom, Brett; Li, Ge; Millard, Steven P.; Petrie, Eric C.; Galasko, Douglas R.; Sikkema, Carl; Raskind, Murray A.; Wilkinson, Charles W.; Peskind, Elaine R.
2013-01-01
Adequate central nervous system noradrenergic activity enhances cognition, but excessive noradrenergic activity may have adverse effects on cognition. Previous studies have also demonstrated that noradrenergic activity is higher in older than younger adults. We aimed to determine relationships between cerebrospinal fluid (CSF) norepinephrine (NE) concentration and cognitive performance by using data from a CSF bank that includes samples from 258 cognitively normal participants aged 21–100 years. After adjusting for age, gender, education, and ethnicity, higher CSF NE levels (units of 100 pg/mL) are associated with poorer performance on tests of attention, processing speed, and executive function (Trail Making A: regression coefficient 1.5, standard error [SE] 0.77, p = 0.046; Trail Making B: regression coefficient 5.0, SE 2.2, p = 0.024; Stroop Word-Color Interference task: regression coefficient 6.1, SE 2.0, p = 0.003). Findings are consistent with the earlier literature relating excess noradrenergic activity with cognitive impairment. PMID:23639207
NASA Astrophysics Data System (ADS)
Seibert, Mathias; Merz, Bruno; Apel, Heiko
2017-03-01
The Limpopo Basin in southern Africa is prone to droughts which affect the livelihood of millions of people in South Africa, Botswana, Zimbabwe and Mozambique. Seasonal drought early warning is thus vital for the whole region. In this study, the predictability of hydrological droughts during the main runoff period from December to May is assessed using statistical approaches. Three methods (multiple linear models, artificial neural networks, random forest regression trees) are compared in terms of their ability to forecast streamflow with up to 12 months of lead time. The following four main findings result from the study. 1. There are stations in the basin at which standardised streamflow is predictable with lead times up to 12 months. The results show high inter-station differences of forecast skill but reach a coefficient of determination as high as 0.73 (cross validated). 2. A large range of potential predictors is considered in this study, comprising well-established climate indices, customised teleconnection indices derived from sea surface temperatures and antecedent streamflow as a proxy of catchment conditions. El Niño and customised indices, representing sea surface temperature in the Atlantic and Indian oceans, prove to be important teleconnection predictors for the region. Antecedent streamflow is a strong predictor in small catchments (with median 42 % explained variance), whereas teleconnections exert a stronger influence in large catchments. 3. Multiple linear models show the best forecast skill in this study and the greatest robustness compared to artificial neural networks and random forest regression trees, despite their capabilities to represent nonlinear relationships. 4. Employed in early warning, the models can be used to forecast a specific drought level. Even if the coefficient of determination is low, the forecast models have a skill better than a climatological forecast, which is shown by analysis of receiver operating characteristics (ROCs). Seasonal statistical forecasts in the Limpopo show promising results, and thus it is recommended to employ them as complementary to existing forecasts in order to strengthen preparedness for droughts.
A note on variance estimation in random effects meta-regression.
Sidik, Kurex; Jonkman, Jeffrey N
2005-01-01
For random effects meta-regression inference, variance estimation for the parameter estimates is discussed. Because estimated weights are used for meta-regression analysis in practice, the assumed or estimated covariance matrix used in meta-regression is not strictly correct, due to possible errors in estimating the weights. Therefore, this note investigates the use of a robust variance estimation approach for obtaining variances of the parameter estimates in random effects meta-regression inference. This method treats the assumed covariance matrix of the effect measure variables as a working covariance matrix. Using an example of meta-analysis data from clinical trials of a vaccine, the robust variance estimation approach is illustrated in comparison with two other methods of variance estimation. A simulation study is presented, comparing the three methods of variance estimation in terms of bias and coverage probability. We find that, despite the seeming suitability of the robust estimator for random effects meta-regression, the improved variance estimator of Knapp and Hartung (2003) yields the best performance among the three estimators, and thus may provide the best protection against errors in the estimated weights.
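The working-covariance idea can be illustrated with a small sandwich-estimator sketch: fit the weighted meta-regression with the usual inverse-variance weights, then compare the model-based covariance with a robust estimate that treats those weights only as a working choice. The effect sizes and variances below are simulated, not the vaccine-trial data, and this is one generic robust estimator rather than the specific estimators compared in the note.

```python
import numpy as np

rng = np.random.default_rng(7)
k = 30                                             # number of studies
x = rng.uniform(0, 1, k)                           # study-level moderator
v = rng.uniform(0.02, 0.2, k)                      # reported sampling variances
# Effect sizes with extra between-study heterogeneity not captured by v.
theta = 0.3 + 0.5 * x + rng.normal(0, np.sqrt(v)) + rng.normal(0, 0.15, k)

X = np.column_stack([np.ones(k), x])
w = 1.0 / v                                        # working (inverse-variance) weights
W = np.diag(w)

XtWX_inv = np.linalg.inv(X.T @ W @ X)
beta = XtWX_inv @ X.T @ W @ theta
resid = theta - X @ beta

cov_model = XtWX_inv                               # model-based: assumes the working weights are correct
meat = X.T @ np.diag(w ** 2 * resid ** 2) @ X      # robust "meat" uses squared residuals
cov_robust = XtWX_inv @ meat @ XtWX_inv            # sandwich covariance

print("slope estimate: %.3f" % beta[1])
print("model-based SE: %.3f, robust SE: %.3f"
      % (np.sqrt(cov_model[1, 1]), np.sqrt(cov_robust[1, 1])))
```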
The influence of statistical properties of Fourier coefficients on random Gaussian surfaces.
de Castro, C P; Luković, M; Andrade, R F S; Herrmann, H J
2017-05-16
Many examples of natural systems can be described by random Gaussian surfaces. Much can be learned by analyzing the Fourier expansion of the surfaces, from which it is possible to determine the corresponding Hurst exponent and consequently establish the presence of scale invariance. We show that this symmetry is not affected by the distribution of the modulus of the Fourier coefficients. Furthermore, we investigate the role of the Fourier phases of random surfaces. In particular, we show how the surface is affected by a non-uniform distribution of phases.
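The insensitivity of the scaling behaviour to the modulus distribution can be illustrated with a one-dimensional Fourier synthesis (a simplified analogue of the surfaces studied): generate profiles from a power-law spectrum with uniform random phases, using either a constant or a Rayleigh-distributed modulus, and compare the estimated Hurst exponents. The target exponent and lags below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(8)
N = 2 ** 14
H = 0.7                                             # target Hurst exponent
k = np.fft.rfftfreq(N)[1:]                          # positive frequencies

def synthesize(modulus):
    """1-D Gaussian profile from a power-law spectrum with uniform random phases."""
    amp = modulus * k ** (-(2 * H + 1) / 2.0)       # |c_k| ~ k^-(2H+1)/2
    phases = rng.uniform(0, 2 * np.pi, k.size)
    spectrum = np.concatenate([[0.0], amp * np.exp(1j * phases)])
    return np.fft.irfft(spectrum, n=N)

def increment_exponent(z, lags=(1, 2, 4, 8, 16, 32)):
    """Half the slope of log increment variance vs log lag, an estimate of H."""
    lags = np.array(lags)
    v = [np.var(z[l:] - z[:-l]) for l in lags]
    return np.polyfit(np.log(lags), np.log(v), 1)[0] / 2.0

z_const = synthesize(np.ones(k.size))                       # constant Fourier modulus
z_rayl = synthesize(rng.rayleigh(scale=1.0, size=k.size))   # random (Rayleigh) modulus

print("estimated H, constant modulus: %.2f" % increment_exponent(z_const))
print("estimated H, Rayleigh modulus: %.2f" % increment_exponent(z_rayl))
```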
Determining Sample Size for Accurate Estimation of the Squared Multiple Correlation Coefficient.
ERIC Educational Resources Information Center
Algina, James; Olejnik, Stephen
2000-01-01
Discusses determining sample size for estimation of the squared multiple correlation coefficient and presents regression equations that permit determination of the sample size for estimating this parameter for up to 20 predictor variables. (SLD)
NASA Astrophysics Data System (ADS)
Ben Shabat, Yael; Shitzer, Avraham
2012-07-01
Facial heat exchange convection coefficients were estimated from experimental data in cold and windy ambient conditions applicable to wind chill calculations. Measured facial temperature datasets that were made available to this study originated from 3 separate studies involving 18 male and 6 female subjects. Most of these data were for a -10°C ambient environment and wind speeds in the range of 0.2 to 6 m s-1. Additional single experiments were for -5°C, 0°C and 10°C environments and wind speeds in the same range. Convection coefficients were estimated for all these conditions by means of a numerical facial heat exchange model, applying properties of biological tissues and a typical facial diameter of 0.18 m. Estimation was performed by adjusting the guessed convection coefficients in the computed facial temperatures while comparing them to measured data, to obtain a satisfactory fit (r2 > 0.98, in most cases). In one of the studies, heat flux meters were additionally used. Convection coefficients derived from these meters closely approached the estimated values for only the male subjects. They differed significantly, by about 50%, when compared to the estimated female subjects' data. Regression analysis was performed for just the -10°C ambient temperature and the range of experimental wind speeds, due to the limited availability of data for other ambient temperatures. The regressed equation was assumed in the form of the equation underlying the "new" wind chill chart. Regressed convection coefficients, which closely duplicated the measured data, were consistently higher than those calculated by this equation, except for a single case. The estimated and currently used convection coefficients are shown to diverge exponentially from each other as wind speed increases. This finding casts considerable doubt on the validity of the convection coefficients that are used in the computation of the "new" wind chill chart and their applicability to humans in cold and windy environments.
Campbell, J Elliott; Moen, Jeremie C; Ney, Richard A; Schnoor, Jerald L
2008-03-01
Estimates of forest soil organic carbon (SOC) have applications in carbon science, soil quality studies, carbon sequestration technologies, and carbon trading. Forest SOC has been modeled using a regression coefficient methodology that applies mean SOC densities (mass/area) to broad forest regions. A higher resolution model is based on an approach that employs a geographic information system (GIS) with soil databases and satellite-derived landcover images. Despite this advancement, the regression approach remains the basis of current state and federal level greenhouse gas inventories. Both approaches are analyzed in detail for Wisconsin forest soils from 1983 to 2001, applying rigorous error-fixing algorithms to soil databases. Resulting SOC stock estimates are 20% larger when determined using the GIS method rather than the regression approach. Average annual rates of increase in SOC stocks are 3.6 and 1.0 million metric tons of carbon per year for the GIS and regression approaches respectively.
Radon-222 concentrations in ground water and soil gas on Indian reservations in Wisconsin
DeWild, John F.; Krohelski, James T.
1995-01-01
For sites with wells finished in the sand and gravel aquifer, the coefficient of determination (R2) of the regression of concentration of radon-222 in ground water as a function of well depth is 0.003 and the significance level is 0.32, which indicates that there is not a statistically significant relation between radon-222 concentrations in ground water and well depth. The coefficient of determination of the regression of radon-222 in ground water and soil gas is 0.19 and the root mean square error of the regression line is 271 picocuries per liter. Even though the significance level (0.036) indicates a statistical relation, the root mean square error of the regression is so large that the regression equation would not give reliable predictions. Because of an inadequate number of samples, similar statistical analyses could not be performed for sites with wells finished in the crystalline and sedimentary bedrock aquifers.
ERIC Educational Resources Information Center
Kane, Michael T.; Mroch, Andrew A.
2010-01-01
In evaluating the relationship between two measures across different groups (i.e., in evaluating "differential validity") it is necessary to examine differences in correlation coefficients and in regression lines. Ordinary least squares (OLS) regression is the standard method for fitting lines to data, but its criterion for optimal fit…
Incremental Net Effects in Multiple Regression
ERIC Educational Resources Information Center
Lipovetsky, Stan; Conklin, Michael
2005-01-01
A regular problem in regression analysis is estimating the comparative importance of the predictors in the model. This work considers the 'net effects', or shares of the predictors in the coefficient of the multiple determination, which is a widely used characteristic of the quality of a regression model. Estimation of the net effects can be a…
Simple and multiple linear regression: sample size considerations.
Hanley, James A
2016-11-01
The suggested "two subjects per variable" (2SPV) rule of thumb in the Austin and Steyerberg article is a chance to bring out some long-established and quite intuitive sample size considerations for both simple and multiple linear regression. This article distinguishes two of the major uses of regression models that imply very different sample size considerations, neither served well by the 2SPV rule. The first is etiological research, which contrasts mean Y levels at differing "exposure" (X) values and thus tends to focus on a single regression coefficient, possibly adjusted for confounders. The second research genre guides clinical practice. It addresses Y levels for individuals with different covariate patterns or "profiles." It focuses on the profile-specific (mean) Y levels themselves, estimating them via linear compounds of regression coefficients and covariates. By drawing on long-established closed-form variance formulae that lie beneath the standard errors in multiple regression, and by rearranging them for heuristic purposes, one arrives at quite intuitive sample size considerations for both research genres. Copyright © 2016 Elsevier Inc. All rights reserved.
Naughton, Corina; Feely, John; Bennett, Kathleen
2007-10-01
Interventions to promote prescribing of preventive therapies in patients with cardiovascular disease (CVD) or diabetes have reported variable success. (i) To evaluate the effect of prescribing feedback on GP practice using academic detailing compared to postal bulletin on prescribing of CVD preventive therapies in patients with CVD or diabetes at 3 and 6 months post intervention and (ii) to evaluate the intervention from a GP's perspective. Volunteer GP practices (n = 98) were randomized to receive individualized prescribing feedback via academic detailing (postal bulletin plus outreach visit) (n = 48) or postal bulletin (n = 50). The proportion of CVD or diabetic patients on statins and antiplatelet agents/warfarin pre- and post-intervention was calculated for each GP practice. Multivariate regression with a random effects model was used to compare differences between the groups adjusting for GP clustering and confounding factors. beta-Coefficients and 95% confidence intervals (CIs) are presented. There was a 3% increase in statin prescribing in CVD patients at 6 months post-intervention for both randomized groups, but there was no statistical difference between the groups (beta = 0.004; 95% CI = -0.01 to 0.02). Statin and antiplatelet/warfarin prescribing also increased in the diabetic population; there was no significant differences between the groups. GPs participating in the project expressed a high level of satisfaction with both interventions. Prescribing of preventive therapies increased in both randomized groups over the study period. But academic detailing did not have an additional effect on changing prescribing over the postal bulletin alone.
Bayesian dynamic modeling of time series of dengue disease case counts.
Martínez-Bello, Daniel Adyro; López-Quílez, Antonio; Torres-Prieto, Alexander
2017-07-01
The aim of this study is to model the association between weekly time series of dengue case counts and meteorological variables, in a high-incidence city of Colombia, applying Bayesian hierarchical dynamic generalized linear models over the period January 2008 to August 2015. Additionally, we evaluate the model's short-term performance for predicting dengue cases. The methodology shows dynamic Poisson log link models including constant or time-varying coefficients for the meteorological variables. Calendar effects were modeled using constant or first- or second-order random walk time-varying coefficients. The meteorological variables were modeled using constant coefficients and first-order random walk time-varying coefficients. We applied Markov Chain Monte Carlo simulations for parameter estimation, and deviance information criterion statistic (DIC) for model selection. We assessed the short-term predictive performance of the selected final model, at several time points within the study period using the mean absolute percentage error. The results showed the best model including first-order random walk time-varying coefficients for calendar trend and first-order random walk time-varying coefficients for the meteorological variables. Besides the computational challenges, interpreting the results implies a complete analysis of the time series of dengue with respect to the parameter estimates of the meteorological effects. We found small values of the mean absolute percentage errors at one or two weeks out-of-sample predictions for most prediction points, associated with low volatility periods in the dengue counts. We discuss the advantages and limitations of the dynamic Poisson models for studying the association between time series of dengue disease and meteorological variables. The key conclusion of the study is that dynamic Poisson models account for the dynamic nature of the variables involved in the modeling of time series of dengue disease, producing useful models for decision-making in public health.
Li, Ji; Gray, B.R.; Bates, D.M.
2008-01-01
Partitioning the variance of a response by design levels is challenging for binomial and other discrete outcomes. Goldstein (2003) proposed four definitions for variance partitioning coefficients (VPC) under a two-level logistic regression model. In this study, we explicitly derived formulae for the multi-level logistic regression model and subsequently studied the distributional properties of the calculated VPCs. Using simulations and a vegetation dataset, we demonstrated associations between different VPC definitions, the importance of methods for estimating VPCs (by comparing VPCs obtained using Laplace and penalized quasilikelihood methods), and bivariate dependence between VPCs calculated at different levels. Such an empirical study lends immediate support to wider application of VPCs in scientific data analysis.
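One of Goldstein's definitions, the latent-variable (threshold) formulation, has a simple closed form for a two-level logistic model: the level-1 residual variance is fixed at π²/3, so the VPC is σ²_u/(σ²_u + π²/3). The snippet below computes it; the example variance value is an assumption for illustration only.

```python
import math

def vpc_latent(sigma2_u: float) -> float:
    """Variance partition coefficient for a two-level logistic model under the
    latent-variable (threshold) definition: the level-1 residual variance of the
    standard logistic distribution is pi^2 / 3."""
    return sigma2_u / (sigma2_u + math.pi ** 2 / 3)

print(round(vpc_latent(0.5), 3))   # e.g. a cluster-level variance of 0.5 gives VPC ~ 0.132
```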
Interquantile Shrinkage in Regression Models
Jiang, Liewen; Wang, Huixia Judy; Bondell, Howard D.
2012-01-01
Conventional analysis using quantile regression typically focuses on fitting the regression model at different quantiles separately. However, in situations where the quantile coefficients share some common feature, joint modeling of multiple quantiles to accommodate the commonality often leads to more efficient estimation. One example of a common feature is that a predictor may have a constant effect over one region of quantile levels but varying effects in other regions. To automatically perform estimation and detection of the interquantile commonality, we develop two penalization methods. When the quantile slope coefficients indeed do not change across quantile levels, the proposed methods shrink the slopes towards a constant and thus improve estimation efficiency. We establish the oracle properties of the two proposed penalization methods. Through numerical investigations, we demonstrate that the proposed methods lead to estimates with efficiency competitive with or higher than standard quantile regression estimation in finite samples. Supplemental materials for the article are available online. PMID:24363546
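The conventional approach the article improves upon, fitting each quantile separately, can be sketched as follows; the data, formula and quantile grid are assumptions, and the penalized joint estimation proposed in the article is not implemented here.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 500
x = rng.uniform(0, 10, n)
y = 1.0 + 0.5 * x + rng.standard_t(df=5, size=n)   # slope truly constant across quantiles
df = pd.DataFrame({"x": x, "y": y})

# Separate quantile regressions; if the slope is constant across quantile levels,
# interquantile shrinkage would pool these estimates toward a common value.
for q in (0.1, 0.25, 0.5, 0.75, 0.9):
    fit = smf.quantreg("y ~ x", df).fit(q=q)
    print(f"tau={q:.2f}  slope={fit.params['x']:.3f}")
```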
Analysis of longitudinal "time series" data in toxicology.
Cox, C; Cory-Slechta, D A
1987-02-01
Studies focusing on chronic toxicity or on the time course of toxicant effect often involve repeated measurements or longitudinal observations of endpoints of interest. Experimental design considerations frequently necessitate between-group comparisons of the resulting trends. Typically, procedures such as the repeated-measures analysis of variance have been used for statistical analysis, even though the required assumptions may not be satisfied in some circumstances. This paper describes an alternative analytical approach which summarizes curvilinear trends by fitting cubic orthogonal polynomials to individual profiles of effect. The resulting regression coefficients serve as quantitative descriptors which can be subjected to group significance testing. Randomization tests based on medians are proposed to provide a comparison of treatment and control groups. Examples from the behavioral toxicology literature are considered, and the results are compared to more traditional approaches, such as repeated-measures analysis of variance.
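As a rough illustration of the two-stage idea described here, the sketch below fits a cubic polynomial in an orthogonal (Legendre) basis to each subject's time profile and then compares a group summary of one coefficient with a randomization test on medians. It is a minimal sketch with simulated data; the basis choice, sample sizes and the use of the cubic coefficient as the test statistic are assumptions rather than the paper's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(7)
t = np.linspace(-1, 1, 10)          # 10 repeated measurements, time rescaled to [-1, 1]

def simulate_group(n_subjects, curvature):
    # Each subject: a curvilinear effect profile plus measurement noise (toy data).
    return [curvature * t**3 - 0.5 * t**2 + rng.normal(0, 0.2, t.size)
            for _ in range(n_subjects)]

def cubic_coef(profile):
    # Fit a cubic in the orthogonal Legendre basis and keep the degree-3 coefficient.
    return np.polynomial.legendre.legfit(t, profile, deg=3)[3]

control = [cubic_coef(p) for p in simulate_group(12, curvature=0.3)]
treated = [cubic_coef(p) for p in simulate_group(12, curvature=0.9)]

# Randomization test on the difference in group medians of the cubic coefficient.
observed = np.median(treated) - np.median(control)
pooled = np.array(control + treated)
n_perm, count = 5000, 0
for _ in range(n_perm):
    perm = rng.permutation(pooled)
    diff = np.median(perm[len(control):]) - np.median(perm[:len(control)])
    if abs(diff) >= abs(observed):
        count += 1
print(f"observed median difference = {observed:.3f}, permutation p = {count / n_perm:.4f}")
```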
Tan, Kok Chooi; Lim, Hwee San; Matjafri, Mohd Zubir; Abdullah, Khiruddin
2012-06-01
Atmospheric corrections for multi-temporal optical satellite images are necessary, especially in change detection analyses such as normalized difference vegetation index (NDVI) ratioing. Abrupt change detection analysis using remote-sensing techniques requires radiometric congruity and atmospheric correction to monitor terrestrial surfaces over time. Two atmospheric correction methods were used for this study: relative radiometric normalization and the simplified method for atmospheric correction (SMAC) in the solar spectrum. A multi-temporal data set consisting of two sets of Landsat images from the period between 1991 and 2002 of Penang Island, Malaysia, was used to compare NDVI maps generated using the proposed atmospheric correction methods. Land surface temperature (LST) was retrieved using ATCOR3_T in PCI Geomatica 10.1 image processing software. Linear regression analysis was utilized to analyze the relationship between NDVI and LST. This study reveals that both of the proposed atmospheric correction methods yielded high accuracy, as shown by the linear correlation coefficients. To check the accuracy of the equation obtained through linear regression analysis for every single satellite image, 20 points were randomly chosen. The results showed that the SMAC method yielded a more consistent error when predicting the NDVI value from the equation derived through linear regression analysis. The average errors from both proposed atmospheric correction methods were less than 10%.
Early warnings for suicide attempt among Chinese rural population.
Lyu, Juncheng; Wang, Yingying; Shi, Hong; Zhang, Jie
2018-06-05
This study aimed to explore the main factors influencing attempted suicide and to establish an early warning model, in order to propose prevention strategies for attempted suicide. Data came from a large-scale case-control epidemiological survey. A sample of 659 serious suicide attempters was randomly recruited from 13 rural counties in China. Each case was matched with a community control for gender, age, and residence location. Face-to-face interviews were conducted for all cases and controls with the same structured questionnaire. Univariate logistic regression was applied to screen the factors, and multivariate logistic regression was used to identify the predictors. There were no statistical differences between suicide attempters and the community controls in gender, age, and residence location. The Cronbach's coefficients for all the scales used were above 0.675. The multivariate logistic regressions revealed 12 statistically significant variables predicting attempted suicide, including lower education, family history of suicide, poor health, mental problems, aspiration strain, hopelessness, impulsivity, depression, and negative life events. On the other hand, social support, coping skills, and a healthy community protected the rural residents from suicide attempt. The identified warning predictors have significant clinical meaning for the clinical psychiatrist. Crisis intervention strategies in rural China should be informed by the findings from this research. Education, social support, healthy communities, and strain reduction are all measures to decrease the likelihood of crises. Copyright © 2018. Published by Elsevier B.V.
A Portuguese value set for the SF-6D.
Ferreira, Lara N; Ferreira, Pedro L; Pereira, Luis N; Brazier, John; Rowen, Donna
2010-08-01
The SF-6D is a preference-based measure of health derived from the SF-36 that can be used for cost-effectiveness analysis using cost-per-quality-adjusted-life-year analysis. This study seeks to estimate system weights for the SF-6D for Portugal and to compare the results with the UK system weights. A sample of 55 health states defined by the SF-6D was valued by a representative random sample of the Portuguese population, stratified by sex and age (n = 140), using the Standard Gamble (SG). Several models are estimated at both the individual and aggregate levels for predicting health-state valuations. Models with main effects, with interaction effects, and with the constant forced to unity are presented. Random effects (RE) models are estimated using generalized least squares (GLS) regressions. Generalized estimating equations (GEE) are used to estimate RE models with the constant forced to unity. Estimations at the individual level were performed using 630 health-state valuations. Alternative functional forms are considered to account for the skewed distribution of health-state valuations. The models are analyzed in terms of their coefficients, overall fit, and ability to predict the SG values. The RE models estimated using GLS and through GEE produce significant coefficients, which are robust across model specifications. However, there are concerns regarding some inconsistent estimates, and so parsimonious consistent models were estimated. There is evidence of underprediction in some states corresponding to poor health. The results are consistent with the UK results. The models estimated provide preference-based quality-of-life weights for the Portuguese population when health status data have been collected using the SF-36. Although the sample was randomly drawn, findings should be treated with caution given the small sample size, even though the models were estimated at the individual level.
Remote sensing of PM2.5 from ground-based optical measurements
NASA Astrophysics Data System (ADS)
Li, S.; Joseph, E.; Min, Q.
2014-12-01
Remote sensing of particulate matter concentration with aerodynamic diameter smaller than 2.5 μm (PM2.5) using ground-based optical measurements of aerosols is investigated, based on 6 years of hourly average measurements of aerosol optical properties, PM2.5, ceilometer backscatter coefficients and meteorological factors from the Howard University Beltsville Campus facility (HUBC). The accuracy of quantitative retrieval of PM2.5 using aerosol optical depth (AOD) is limited by changes in aerosol size distribution and vertical distribution. In this study, ceilometer backscatter coefficients are used to provide vertical information on aerosol. It is found that the PM2.5-AOD ratio can vary greatly with aerosol vertical distribution. The ratio is also sensitive to mode parameters of the bimodal lognormal aerosol size distribution when the geometric mean radius for the fine mode is small. Two Angstrom exponents calculated from the three wavelengths of 415, 500 and 860 nm are found to represent aerosol size distributions better than a single Angstrom exponent. A regression model is proposed to assess the impacts of different factors on the retrieval of PM2.5. Compared to a simple linear regression model, the new model combining AOD and ceilometer backscatter markedly improves the fit to PM2.5. The contribution of further introducing Angstrom coefficients is apparent. Using combined measurements of AOD, ceilometer backscatter, Angstrom coefficients and meteorological parameters in the regression model yields a correlation coefficient of 0.79 between fitted and expected PM2.5.
Phung, Dung; Connell, Des; Rutherford, Shannon; Chu, Cordia
2017-06-01
A systematic review (SR) and meta-analysis cannot provide the endpoint answer for a chemical risk assessment (CRA). The objective of this study was to apply SR and meta-regression (MR) analysis to address this limitation, using a case study of cardiovascular risk from arsenic exposure in Vietnam. Published studies were searched from PubMed using the keywords arsenic exposure and cardiovascular diseases (CVD). Random-effects meta-regression was applied to model the linear relationship between arsenic concentration in water and risk of CVD, and the no-observable-adverse-effect level (NOAEL) was then identified from the regression function. The probabilistic risk assessment (PRA) technique was applied to characterize the risk of CVD due to arsenic exposure by estimating the overlapping coefficient between the dose-response and exposure distribution curves. The risks were evaluated for groundwater, treated water and drinking water. A total of 8 high-quality studies for dose-response and 12 studies for exposure data were included in the final analyses. The results of the MR suggested a NOAEL of 50 μg/L and a guideline of 5 μg/L for arsenic in water, values that are half of the NOAEL and guideline recommended by previous studies and authorities. The results of the PRA indicated that the probability of exceeding the CVD risk level was 52% for groundwater, 24% for treated water, and 10% for drinking water in Vietnam. The study found that systematic review and meta-regression can be considered an ideal method for chemical risk assessment because of its ability to answer the endpoint question of a CRA. Copyright © 2017 Elsevier Ltd. All rights reserved.
Singh, Preet Mohinder; Borle, Anuradha; Shah, Dipal; Sinha, Ashish; Makkar, Jeetinder Kaur; Trikha, Anjan; Goudra, Basavana Gouda
2016-04-01
Prophylactic continuous positive airway pressure (CPAP) can prevent pulmonary adverse events following upper abdominal surgeries. The present meta-regression evaluates and quantifies the effect of degree/duration of CPAP on the incidence of postoperative pulmonary events. Medical databases were searched for randomized controlled trials involving adult patients, comparing the outcome in those receiving prophylactic postoperative CPAP versus no CPAP, undergoing high-risk abdominal surgeries. Our meta-analysis evaluated the relationship between postoperative pulmonary complications and the use of CPAP. Furthermore, meta-regression was used to quantify the effect of cumulative duration and degree of CPAP on the measured outcomes. Seventy-three potentially relevant studies were identified, of which 11 had appropriate data, allowing us to compare a total of 362 and 363 patients in the CPAP and control groups, respectively. Qualitatively, odds ratios for CPAP showed a protective effect for pneumonia [0.39 (0.19-0.78)], atelectasis [0.51 (0.32-0.80)] and pulmonary complications [0.37 (0.24-0.56)] with zero heterogeneity. For prevention of pulmonary complications, the odds ratio was more favorable for continuous than for intermittent CPAP. Meta-regression demonstrated a positive correlation between the degree of CPAP and the incidence of pneumonia with a regression coefficient of +0.61 (95% CI 0.02-1.21, P = 0.048, τ² = 0.078, r² = 7.87%). Overall, adverse effects were similar with or without the use of CPAP. Prophylactic postoperative use of continuous CPAP significantly reduces the incidence of postoperative pneumonia, atelectasis and pulmonary complications in patients undergoing high-risk abdominal surgeries. Quantitatively, increasing the CPAP level does not necessarily enhance the protective effect against pneumonia; instead, the protective effect diminishes with increasing degree of CPAP.
Uechi, Ken; Asakura, Keiko; Ri, Yui; Masayasu, Shizuko; Sasaki, Satoshi
2016-02-01
Several estimation methods for 24-h sodium excretion using spot urine sample have been reported, but accurate estimation at the individual level remains difficult. We aimed to clarify the most accurate method of estimating 24-h sodium excretion with different numbers of available spot urine samples. A total of 370 participants from throughout Japan collected multiple 24-h urine and spot urine samples independently. Participants were allocated randomly into a development and a validation dataset. Two estimation methods were established in the development dataset using the two 24-h sodium excretion samples as reference: the 'simple mean method' estimated by multiplying the sodium-creatinine ratio by predicted 24-h creatinine excretion, whereas the 'regression method' employed linear regression analysis. The accuracy of the two methods was examined by comparing the estimated means and concordance correlation coefficients (CCC) in the validation dataset. Mean sodium excretion by the simple mean method with three spot urine samples was closest to that by 24-h collection (difference: -1.62 mmol/day). CCC with the simple mean method increased with an increased number of spot urine samples at 0.20, 0.31, and 0.42 using one, two, and three samples, respectively. This method with three spot urine samples yielded higher CCC than the regression method (0.40). When only one spot urine sample was available for each study participant, CCC was higher with the regression method (0.36). The simple mean method with three spot urine samples yielded the most accurate estimates of sodium excretion. When only one spot urine sample was available, the regression method was preferable.
Bowen, Stephen R; Chappell, Richard J; Bentzen, Søren M; Deveau, Michael A; Forrest, Lisa J; Jeraj, Robert
2012-01-01
Purpose To quantify associations between pre-radiotherapy and post-radiotherapy PET parameters via spatially resolved regression. Materials and methods Ten canine sinonasal cancer patients underwent PET/CT scans of [18F]FDG (FDGpre), [18F]FLT (FLTpre), and [61Cu]Cu-ATSM (Cu-ATSMpre). Following radiotherapy regimens of 50 Gy in 10 fractions, veterinary patients underwent FDG PET/CT scans at three months (FDGpost). Regression of standardized uptake values in baseline FDGpre, FLTpre and Cu-ATSMpre tumour voxels to those in FDGpost images was performed for linear, log-linear, generalized-linear and mixed-fit linear models. Goodness of fit in regression coefficients was assessed by R². Hypothesis testing of coefficients over the patient population was performed. Results Multivariate linear model fits of FDGpre to FDGpost were significantly positive over the population (FDGpost ~ 0.17 FDGpre, p=0.03), and classified slopes of RECIST non-responders and responders to be different (0.37 vs. 0.07, p=0.01). Generalized-linear model fits related FDGpre to FDGpost by a power law (FDGpost ~ FDGpre^0.93, p<0.001). Univariate mixture model fits of FDGpre improved R² from 0.17 to 0.52. Neither baseline FLT PET nor Cu-ATSM PET uptake contributed statistically significant multivariate regression coefficients. Conclusions Spatially resolved regression analysis indicates that pre-treatment FDG PET uptake is most strongly associated with three-month post-treatment FDG PET uptake in this patient population, though associations are histopathology-dependent. PMID:22682748
Bootstrap Methods: A Very Leisurely Look.
ERIC Educational Resources Information Center
Hinkle, Dennis E.; Winstead, Wayland H.
The Bootstrap method, a computer-intensive statistical method of estimation, is illustrated using a simple and efficient Statistical Analysis System (SAS) routine. The utility of the method for generating unknown parameters, including standard errors for simple statistics, regression coefficients, discriminant function coefficients, and factor…
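Although the ERIC record describes a SAS routine, the same idea can be sketched in a few lines of Python: resample cases with replacement, refit the regression, and take the spread of the refitted coefficient estimates as the bootstrap standard errors. The data and number of replicates below are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
x = rng.normal(size=n)
y = 2.0 + 1.5 * x + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

def ols_coefs(X, y):
    # Ordinary least squares estimates of intercept and slope.
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Case-resampling bootstrap: refit the regression on each resample and use the
# standard deviation of the refitted coefficients as their standard errors.
B = 2000
boot = np.empty((B, X.shape[1]))
for b in range(B):
    idx = rng.integers(0, n, size=n)
    boot[b] = ols_coefs(X[idx], y[idx])

print("coefficients:", ols_coefs(X, y).round(3))
print("bootstrap SEs:", boot.std(axis=0, ddof=1).round(3))
```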
"L"-Bivariate and "L"-Multivariate Association Coefficients. Research Report. ETS RR-08-40
ERIC Educational Resources Information Center
Kong, Nan; Lewis, Charles
2008-01-01
Given a system of multiple random variables, a new measure called the "L"-multivariate association coefficient is defined using (conditional) entropy. Unlike traditional correlation measures, the L-multivariate association coefficient measures the multiassociations or multirelations among the multiple variables in the given system; that…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dierauf, Timothy; Kurtz, Sarah; Riley, Evan
This paper provides a recommended method for evaluating the AC capacity of a photovoltaic (PV) generating station. It also presents companion guidance on setting the facility's capacity guarantee value. This is a principles-based approach that incorporates fundamental plant design parameters such as loss factors, module coefficients, and inverter constraints. This method has been used to prove contract guarantees for over 700 MW of installed projects. The method is transparent, and the results are deterministic. In contrast, current industry practices incorporate statistical regression where the empirical coefficients may only characterize the collected data. Though these methods may work well when extrapolation is not required, there are other situations where the empirical coefficients may not adequately model actual performance. The proposed Fundamentals Approach method provides consistent results even where regression methods start to lose fidelity.
A New Test of Linear Hypotheses in OLS Regression under Heteroscedasticity of Unknown Form
ERIC Educational Resources Information Center
Cai, Li; Hayes, Andrew F.
2008-01-01
When the errors in an ordinary least squares (OLS) regression model are heteroscedastic, hypothesis tests involving the regression coefficients can have Type I error rates that are far from the nominal significance level. Asymptotically, this problem can be rectified with the use of a heteroscedasticity-consistent covariance matrix (HCCM)…
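A heteroscedasticity-consistent covariance matrix is readily available in standard software; the sketch below shows one hedged way to obtain it in Python with statsmodels, using the HC3 estimator on simulated heteroscedastic data. The data-generating process and the choice of HC3 are assumptions, not details taken from the article.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 200
x = rng.uniform(0, 5, n)
y = 1.0 + 0.8 * x + rng.normal(scale=0.3 + 0.5 * x)   # error variance grows with x
X = sm.add_constant(x)

ols_fit = sm.OLS(y, X).fit()                  # classical (homoscedastic) standard errors
hc3_fit = sm.OLS(y, X).fit(cov_type="HC3")    # heteroscedasticity-consistent standard errors

print("classical SEs:", ols_fit.bse.round(4))
print("HC3 SEs:      ", hc3_fit.bse.round(4))
print("HC3 test on slope: t =", hc3_fit.tvalues[1].round(3), ", p =", hc3_fit.pvalues[1].round(4))
```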
Osinga, Rik; Babst, Doris; Bodmer, Elvira S; Link, Bjoern C; Fritsche, Elmar; Hug, Urs
2017-12-01
This work assessed both subjective and objective postoperative parameters after breast reduction surgery and compared assessments between patients and plastic surgeons. After an average postoperative observation period of 6.7 ± 2.7 (2 - 13) years, 159 out of 259 patients (61 %) were examined. The mean age at the time of surgery was 37 ± 14 (15 - 74) years. The postoperative anatomy of the breast and other anthropometric parameters were measured in cm with the patient in an upright position. The visual analogue scale (VAS) values for symmetry, size, shape, type of scar and overall satisfaction, both from the patient's and from four plastic surgeons' perspectives, were assessed and compared. Patients rated the postoperative result significantly better than surgeons. Good subjective ratings by patients for shape, symmetry and sensitivity correlated with high scores for overall assessment. Shape had the strongest influence on overall satisfaction (regression coefficient 0.357; p < 0.001), followed by symmetry (regression coefficient 0.239; p < 0.001) and sensitivity (regression coefficient 0.109; p = 0.040) of the breast. The better the subjective rating for symmetry by the patient, the smaller the measured difference in the jugulum-mamillary distance between left and right (regression coefficient -0.773; p = 0.002) and the smaller the difference in height of the lowest part of the breast between left and right (regression coefficient -0.465; p = 0.035). There was no significant correlation of age, weight, height, BMI, resected breast weight, postoperative breast size or type of scar with overall satisfaction. After breast reduction surgery, the long-term outcome is rated significantly better by patients than by plastic surgeons. Good subjective ratings by patients for shape, symmetry and sensitivity correlated with high scores for overall assessment. Shape had the strongest influence on overall satisfaction, followed by symmetry and sensitivity of the breast. Postoperative breast size, resection weight, type of scar, age and BMI had no significant influence. Symmetry was the only assessed subjective parameter of this study that could be objectified by postoperative measurements. Georg Thieme Verlag KG Stuttgart · New York.
Measurement of true ileal digestibility of phosphorus in some feed ingredients for broiler chickens.
Mutucumarana, R K; Ravindran, V; Ravindran, G; Cowieson, A J
2014-12-01
An experiment was conducted to estimate the true ileal digestibility of P in wheat, sorghum, soybean meal, and corn distiller's dried grains with solubles (DDGS) in broiler chickens. Four semipurified diets were formulated from each ingredient (wheat and sorghum: 236.5, 473, 709.5, and 946 g/kg; soybean meal and corn DDGS: 135, 270, 405, and 540 g/kg) to contain graded concentrations of nonphytate P. The experiment was conducted as a randomized complete block design with 4 weight blocks of 16 cages each (5 birds per cage). A total of 320 21-d-old broilers (Ross 308) were assigned to the 16 test diets with 4 replicates per diet. Apparent ileal digestibility coefficients of P were determined by the indicator method and the linear regression method was used to determine the true P digestibility coefficients. The results showed that the apparent ileal P digestibility coefficients of wheat-based diets were not influenced (P>0.05) by increasing dietary P concentrations, whereas those of diets based on sorghum, soybean meal, and corn DDGS differed (P<0.05) at different P concentrations. Apparent ileal P digestibility in broilers fed diets with soybean meal and corn DDGS linearly (P<0.001) increased with increasing P concentrations. True ileal P digestibility coefficients of wheat, sorghum, soybean meal, and corn DDGS were determined to be 0.464, 0.331, 0.798, and 0.727, respectively. Ileal endogenous P losses in birds fed diets with wheat, soybean meal, and corn DDGS were estimated to be 0.080, 0.609, and 0.418 g/kg DMI, respectively. In birds fed sorghum-based diets, endogenous P losses were estimated to be negative (-0.087 g/kg DMI). True digestible P contents of wheat, sorghum, soybean meal, and corn DDGS were determined to be 1.49, 0.78, 5.16, and 5.94 g/kg, respectively. The corresponding nonphytate P contents in wheat, sorghum, soybean meal, and corn DDGS were 1.11, 0.55, 2.15, and 4.36 g/kg, respectively. These differences between digestible P and nonphytate P contents may be suggestive, at least in part, of overestimation of P digestibility under the calcium-deficient conditions used in the regression method.
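The regression method referred to here can be sketched as follows: regress apparently digested P (intake minus ileal outflow) on P intake across the graded inclusion levels; the slope estimates the true digestibility coefficient and the negative of the intercept estimates the ileal endogenous P loss. The numbers below are made-up illustrative values, not the study's data.

```python
import numpy as np

# Hypothetical graded P intakes (g/kg DMI) and apparently digested P for one ingredient.
p_intake = np.array([1.0, 2.0, 3.0, 4.0])
p_digested = np.array([0.38, 0.85, 1.32, 1.78])   # intake minus ileal outflow (toy values)

# Linear regression of digested P on P intake:
#   slope     -> true ileal P digestibility coefficient
#   intercept -> minus the endogenous P loss (g/kg DMI)
slope, intercept = np.polyfit(p_intake, p_digested, deg=1)
print(f"true digestibility ~ {slope:.3f}, endogenous loss ~ {-intercept:.3f} g/kg DMI")
```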
Saavoss, Josh D; Koenig, Lane; Cher, Daniel J
2016-01-01
Sacroiliac joint (SIJ) dysfunction is associated with a marked decrease in quality of life. Increasing evidence supports minimally invasive SIJ fusion as a safe and effective procedure for the treatment of chronic SIJ dysfunction. The impact of SIJ fusion on worker productivity is not known. Regression modeling using data from the National Health Interview Survey was applied to determine the relationship between responses to selected interview questions related to function and economic outcomes. Regression coefficients were then applied to prospectively collected, individual patient data in a randomized trial of SIJ fusion (INSITE, NCT01681004) to estimate expected differences in economic outcomes across treatments. Patients who receive SIJ fusion using the iFuse Implant System® have an expected increase in the probability of working of 16% (95% confidence interval [CI] 11%-21%) relative to nonsurgical patients. The expected change in earnings across groups was US $3,128 (not statistically significant). Combining the two metrics, the annual increase in worker productivity given surgical vs nonsurgical care was $6,924 (95% CI $1,890-$11,945). For employees with chronic, severe SIJ dysfunction, minimally invasive SIJ fusion may improve worker productivity compared to nonsurgical treatment.
Li, Qian-Qian; Zhang, Da-Jun; Guo, Lan-Ting; Feng, Zheng-Zhi; Wu, Ming-Xia
2007-09-01
To explore the status of, and factors influencing, anxiety sensitivity among middle school students in Chongqing. 58 classes from 12 schools were randomly selected in four administrative districts of Chongqing city. A total of 2700 students were included in the final analysis, 48.5% from junior high school and 51.5% from senior high school, with 49.2% boys and 50.8% girls. The Chinese version of the Anxiety Sensitivity Index-Revision, the Adolescent Self-Rating Life Events Check List (ASLEC) and the State-Trait Anxiety Inventory (STAI) were used. (1) There was no significant difference between grade groups (P = 0.49). (2) Girls' anxiety sensitivity was consistently higher than boys' (P < 0.001). (3) Multiple linear regression showed that the factors influencing the degree of anxiety sensitivity were state anxiety, trait anxiety, life events, sex, stress from learning, etc. (standardized regression coefficients 0.258, 0.163, 0.112, 0.093, 0.124, -0.096, 0.096). The major factors influencing anxiety sensitivity included sex, stress from learning, life events, interpersonal relationships, state anxiety and trait anxiety.
Fan, Xiaochong; Ma, Minyu; Li, Zhisong; Gong, Shengkai; Zhang, Wei; Wen, Yuanyuan
2015-01-01
Objective: To study the relationship between the target effect-site concentration (Ce) of rocuronium and the degree of recovery from neuromuscular blockade in elderly patients. Methods: Fifty elderly patients (ASA grade II) scheduled for elective surgical procedures under general anaesthesia were randomly divided into two groups, A and B, with 25 cases in each group. The Ce of rocuronium for intubation was 3 μg·ml⁻¹ in both groups, and the Ce during operation was 0.8 and 1.0 μg·ml⁻¹ in groups A and B, respectively. When target-controlled infusion of rocuronium was stopped, without the administration of reversal agents for neuromuscular blockade, the relationship between Ce and the first twitch height (T1) was studied by regression analysis. Results: There was a significant linear relationship between Ce and T1, and there was no statistically significant difference in regression coefficient or intercept between groups A and B (P>0.05). Conclusion: The degree of recovery from neuromuscular blockade could be judged from the target effect-site concentration of rocuronium at the time of reversal from neuromuscular blockade in elderly patients. PMID:26629159
Estimating individual benefits of medical or behavioral treatments in severely ill patients.
Diaz, Francisco J
2017-01-01
There is a need for statistical methods appropriate for the analysis of clinical trials from a personalized-medicine viewpoint as opposed to the common statistical practice that simply examines average treatment effects. This article proposes an approach to quantifying, reporting and analyzing individual benefits of medical or behavioral treatments to severely ill patients with chronic conditions, using data from clinical trials. The approach is a new development of a published framework for measuring the severity of a chronic disease and the benefits treatments provide to individuals, which utilizes regression models with random coefficients. Here, a patient is considered to be severely ill if the patient's basal severity is close to one. This allows the derivation of a very flexible family of probability distributions of individual benefits that depend on treatment duration and the covariates included in the regression model. Our approach may enrich the statistical analysis of clinical trials of severely ill patients because it allows investigating the probability distribution of individual benefits in the patient population and the variables that influence it, and we can also measure the benefits achieved in specific patients including new patients. We illustrate our approach using data from a clinical trial of the anti-depressant imipramine.
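The framework is built on regression models with random coefficients (mixed-effects models). As a hedged illustration of that building block only, not of the article's benefit distributions, the sketch below fits a linear mixed model with a random intercept and a random slope for treatment duration using statsmodels; the data and variable names are invented.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(11)
n_patients, n_visits = 40, 6
rows = []
for pid in range(n_patients):
    b0 = rng.normal(0, 0.5)        # patient-specific severity offset
    b1 = rng.normal(-0.3, 0.15)    # patient-specific response to treatment duration
    for week in range(n_visits):
        score = 3.0 + b0 + b1 * week + rng.normal(0, 0.3)   # symptom severity (toy scale)
        rows.append({"patient": pid, "week": week, "severity": score})
data = pd.DataFrame(rows)

# Random-coefficient model: severity regressed on treatment duration,
# with the intercept and slope allowed to vary across patients.
model = smf.mixedlm("severity ~ week", data, groups=data["patient"], re_formula="~week")
result = model.fit()
print(result.summary())
```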
Approximating prediction uncertainty for random forest regression models
John W. Coulston; Christine E. Blinn; Valerie A. Thomas; Randolph H. Wynne
2016-01-01
Machine learning approaches such as random forests are increasingly used for the spatial modeling and mapping of continuous variables. Random forest is a non-parametric ensemble approach, and unlike traditional regression approaches there is no direct quantification of prediction error. Understanding prediction uncertainty is important when using model-based continuous maps as...
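One common heuristic for approximating prediction uncertainty is to use the spread of the individual tree predictions in the forest, as sketched below. This is a hedged illustration with simulated data and is not necessarily the approach developed by the authors.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.2, size=500)

rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(X, y)

X_new = np.array([[0.0], [2.5]])
per_tree = np.stack([tree.predict(X_new) for tree in rf.estimators_])   # (n_trees, n_points)

mean_pred = per_tree.mean(axis=0)
spread = per_tree.std(axis=0)   # between-tree spread as a rough uncertainty proxy
for x, m, s in zip(X_new.ravel(), mean_pred, spread):
    print(f"x={x:+.1f}  prediction={m:.3f}  tree-spread={s:.3f}")
```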
Irvine, Kathryn M.; Thornton, Jamie; Backus, Vickie M.; Hohmann, Matthew G.; Lehnhoff, Erik A.; Maxwell, Bruce D.; Michels, Kurt; Rew, Lisa
2013-01-01
Commonly in environmental and ecological studies, species distribution data are recorded as presence or absence throughout a spatial domain of interest. Field based studies typically collect observations by sampling a subset of the spatial domain. We consider the effects of six different adaptive and two non-adaptive sampling designs and choice of three binary models on both predictions to unsampled locations and parameter estimation of the regression coefficients (species–environment relationships). Our simulation study is unique compared to others to date in that we virtually sample a true known spatial distribution of a nonindigenous plant species, Bromus inermis. The census of B. inermis provides a good example of a species distribution that is both sparsely (1.9 % prevalence) and patchily distributed. We find that modeling the spatial correlation using a random effect with an intrinsic Gaussian conditionally autoregressive prior distribution was equivalent or superior to Bayesian autologistic regression in terms of predicting to un-sampled areas when strip adaptive cluster sampling was used to survey B. inermis. However, inferences about the relationships between B. inermis presence and environmental predictors differed between the two spatial binary models. The strip adaptive cluster designs we investigate provided a significant advantage in terms of Markov chain Monte Carlo chain convergence when trying to model a sparsely distributed species across a large area. In general, there was little difference in the choice of neighborhood, although the adaptive king was preferred when transects were randomly placed throughout the spatial domain.
Kocher, Katharina; Kowalski, Piotr; Kolokitha, Olga-Elpis; Katsaros, Christos; Fudalej, Piotr S
2016-05-01
To determine whether judgment of nasolabial esthetics in cleft lip and palate (CLP) is influenced by overall facial attractiveness. Experimental study. University of Bern, Switzerland. Seventy-two fused images (36 of boys, 36 of girls) were constructed. Each image comprised (1) the nasolabial region of a treated child with complete unilateral CLP (UCLP) and (2) the external facial features, i.e., the face with masked nasolabial region, of a noncleft child. Photographs of the nasolabial region of six boys and six girls with UCLP representing a wide range of esthetic outcomes, i.e., from very good to very poor appearance, were randomly chosen from a sample of 60 consecutively treated patients in whom nasolabial esthetics had been rated in a previous study. Photographs of external facial features of six boys and six girls without UCLP with various esthetics were randomly selected from patients' files. Eight lay raters evaluated the fused images using a 100-mm visual analogue scale. Method reliability was assessed by reevaluation of fused images after >1 month. A regression model was used to analyze which elements of facial esthetics influenced the perception of nasolabial appearance. Method reliability was good. A regression analysis demonstrated that only the appearance of the nasolabial area affected the esthetic scores of fused images (coefficient = -11.44; P < .001; R² = 0.464). The appearance of the external facial features did not influence perceptions of fused images. Cropping facial images for assessment of nasolabial appearance in CLP seems unnecessary. Instead, esthetic evaluation can be performed on images of full faces.
Jilani, Tanveer; Azam, Iqbal; Moiz, Bushra; Mehboobali, Naseema; Perwaiz Iqbal, Mohammad
2015-01-01
Hemoglobin levels slightly below the lower limit of normal are common in adults in the general population in developing countries. A few human studies have suggested the use of antioxidant vitamins in the correction of mild anemia. The objective of the present study was to investigate the association of vitamin E supplementation in mildly anemic healthy adults with post-supplemental blood hemoglobin levels in the general population of Karachi, Pakistan. In a single-blinded, placebo-controlled randomized trial, 124 mildly anemic subjects from the General Practitioners' Clinics and personnel of the Aga Khan University were randomized into intervention (n = 82) and control (n = 42) groups. In the intervention group, each subject was given vitamin E (400 mg) every day for a period of three months, while control group subjects received a placebo. Eighty-six subjects completed the trial. Fasting venous blood was collected at baseline and after three months of supplementation. Hemoglobin levels and serum/plasma concentrations of vitamin E, vitamin B12, folate, ferritin, serum transferrin receptor (sTfR), glucose, total cholesterol, triglycerides, LDL-cholesterol, HDL-cholesterol, creatinine, total antioxidant status and erythropoietin were measured and analyzed using repeated-measures ANOVA and multiple linear regression. The adjusted regression coefficients (β) and standard errors [SE(β)] of the significant predictors of post-supplemental hemoglobin levels were serum concentration of vitamin E (0.983 [0.095]), gender (-0.656 [0.244]), sTfR (-0.06 [0.02]) and baseline hemoglobin levels (0.768 [0.077]). The study showed a positive association between vitamin E supplementation and enhanced hemoglobin levels in mildly anemic adults.
Characteristics of low-slope streams that affect O2 transfer rates
Parker, Gene W.; Desimone, Leslie A.
1991-01-01
Multiple-regression techniques were used to derive the following equation for estimating reaeration coefficients in low-slope streams: K2 = 3.83 × MBAS^(-0.41) × SL^(0.20) × H^(-0.76), where K2 is the reaeration coefficient in base-e units per day; MBAS is the methylene blue active substances concentration in milligrams per liter; SL is the water-surface slope in feet per foot; and H is the mean-flow depth in feet. Fourteen hydraulic, physical, and water-quality characteristics were regressed against 29 measured reaeration coefficients for low-slope (water-surface slope less than 0.002 foot per foot) streams in Massachusetts and New York. Reaeration coefficients measured from May 1985 to October 1988 ranged from 0.2 to 11.0 base-e units per day for the 29 low-slope tracer studies. The concentration of methylene blue active substances is significant because it is thought to be an indicator of surfactant concentration, which could change the surface tension at the air-water interface.
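Under the reading of the flattened exponents adopted above, the regression equation can be evaluated directly; the sample input values below are arbitrary illustrations, not measurements from the study.

```python
def reaeration_k2(mbas_mg_per_l: float, slope_ft_per_ft: float, depth_ft: float) -> float:
    """Estimated reaeration coefficient K2 (base-e units per day) from the
    multiple-regression equation K2 = 3.83 * MBAS^-0.41 * SL^0.20 * H^-0.76."""
    return 3.83 * mbas_mg_per_l ** -0.41 * slope_ft_per_ft ** 0.20 * depth_ft ** -0.76

# Illustrative values: 0.1 mg/L MBAS, slope of 0.001 ft/ft, mean depth of 2 ft.
print(round(reaeration_k2(0.1, 0.001, 2.0), 2))   # falls within the observed 0.2-11 per-day range
```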
Ho, S C; Chan, S G; Yip, Y B; Chan, C S Y; Woo, J L F; Sham, A
2008-12-01
This 30-month study investigating bone change and its determinants in 438 perimenopausal Chinese women revealed that the fastest bone loss occurred in women undergoing the menopausal transition, but maintenance of body weight and physical fitness was beneficial for bone health. Soy protein intake also seemed to exert a protective effect. This 30-month follow-up study aims to investigate change in bone mineral density and its determinants in Hong Kong Chinese perimenopausal women. Four hundred and thirty-eight women aged 45 to 55 years were recruited through random telephone dialing and a primary care clinic. Bone mass, body composition and lifestyle measurements were obtained at baseline and at the 9-, 18- and 30-month follow-ups. Univariate and stepwise multiple regression analyses were performed with the regression coefficients of BMD/BMC (derived from baseline and follow-up measurements) as the outcome variables. Menopausal status was classified as premenopausal, transitional or postmenopausal. Menopausal status was the strongest determinant of bone changes. An annual bone loss of about 0.5% was observed among premenopausal women, 2% to 2.5% among transitional women, and about 1.5% among postmenopausal women. Multiple regression analyses revealed that a positive regression slope of body weight was protective against follow-up bone loss at all sites. Number of pregnancies, soy protein intake and walking were protective for total-body BMC. Higher baseline LM was also protective for femoral neck BMD. Maintenance of body weight and physical fitness was observed to have a protective effect against bone loss in Chinese perimenopausal women.
Moderation analysis using a two-level regression model.
Yuan, Ke-Hai; Cheng, Ying; Maxwell, Scott
2014-10-01
Moderation analysis is widely used in social and behavioral research. The most commonly used model for moderation analysis is moderated multiple regression (MMR) in which the explanatory variables of the regression model include product terms, and the model is typically estimated by least squares (LS). This paper argues for a two-level regression model in which the regression coefficients of a criterion variable on predictors are further regressed on moderator variables. An algorithm for estimating the parameters of the two-level model by normal-distribution-based maximum likelihood (NML) is developed. Formulas for the standard errors (SEs) of the parameter estimates are provided and studied. Results indicate that, when heteroscedasticity exists, NML with the two-level model gives more efficient and more accurate parameter estimates than the LS analysis of the MMR model. When error variances are homoscedastic, NML with the two-level model leads to essentially the same results as LS with the MMR model. Most importantly, the two-level regression model permits estimating the percentage of variance of each regression coefficient that is due to moderator variables. When applied to data from General Social Surveys 1991, NML with the two-level model identified a significant moderation effect of race on the regression of job prestige on years of education while LS with the MMR model did not. An R package is also developed and documented to facilitate the application of the two-level model.
Determinants of serum cadmium levels in a Northern Italy community: A cross-sectional study
DOE Office of Scientific and Technical Information (OSTI.GOV)
Filippini, Tommaso
Introduction: Cadmium (Cd) is a heavy metal and a serious environmental hazard to humans. Some uncertainties still exist about the major sources of Cd exposure in non-occupationally exposed subjects in addition to cigarette smoking, such as diet and outdoor air pollution. We sought to determine the influence of these sources on a biomarker of exposure, serum Cd concentration. Methods: We recruited 51 randomly selected residents from an Italian urban community, from whom we obtained detailed information about dietary and smoking habits, and a blood sample for serum Cd determination. We also assessed outdoor air Cd exposure by modeling outdoor air levels of particulate matter ≤10 µm (PM10) from motorized traffic at geocoded subjects' residences. Results: In the crude analysis, regression beta coefficients for dietary Cd, smoking and PM10 on serum Cd levels were 0.03 (95% CI -0.83 to 0.88), 6.96 (95% CI -0.02 to 13.95) and 0.62 (95% CI -0.19 to 1.43), respectively. In the adjusted analysis, regression beta coefficients were -0.34 (95% CI -1.40 to 0.71), 5.81 (95% CI -1.43 to 13.04) and 0.47 (95% CI -0.35 to 1.29), respectively. Conclusion: Cigarette smoking was the most important factor influencing serum Cd in our non-occupationally exposed population, as expected, while dietary Cd was not associated with this biomarker. Outdoor air pollution, as assessed through exposure to particulate matter generated by motorized traffic, was an additional source of Cd exposure. Highlights: • Smoking markedly increases serum Cd levels in non-occupationally exposed individuals. • Overall dietary Cd intake shows little association with serum Cd levels. • Air pollution from motorized traffic increases serum Cd levels.
Fantaguzzi, Catherine; Allen, Elizabeth; Miners, Alec; Christie, Deborah; Opondo, Charles; Sadique, Zia; Fletcher, Adam; Grieve, Richard; Bonell, Chris; Viner, Russell M; Legood, Rosa
2018-06-01
Associations between adolescent health-related quality of life (HRQoL), bullying, and aggression are not well understood. We used baseline data from a large cluster-randomized school trial to study the relationship between HRQoL, bullying experience, and other demographic factors. Cross-sectional self-reported questionnaires were collected pre-randomization from the ongoing INCLUSIVE trial and completed in the classroom. The Gatehouse Bullying Scale measured bullying victimization and the Edinburgh Study of Youth Transitions and Crime school misbehavior subscale (ESYTC) measured aggressive behaviors. HRQoL was assessed using the Child Health Utility 9 Dimensions (CHU-9D) and general quality of life using the Pediatric Quality of Life Inventory (PedsQL). Participants were a cohort of year 7 students (age 11-12 years) from 40 state secondary schools in England. Descriptive statistics for the CHU-9D and PedsQL were calculated using standard methods, with tests for differences in median scores by sex assessed using quantile regression. Correlation between the HRQoL measures was assessed using Spearman's rank correlation coefficient. Predictors of HRQoL were identified using univariate and multiple regressions. A total of 6667 students completed the questionnaire. The CHU-9D was correlated with the PedsQL (0.63, p < 0.001). The multivariable regression results suggest that being bullied frequently and upset by it was associated with a decrement of 0.108 in CHU-9D scores and a fall of 16.2 in PedsQL scores. Antisocial/aggressive behavior on the ESYTC scale was associated with a utility decrement of 0.004 and a fall of 0.5 on the PedsQL. Adolescents' involvement in bullying and aggression is a strong correlate of HRQoL. These data have important implications for the potential cost-effectiveness of reducing bullying and aggression in schools.
Search for Directed Networks by Different Random Walk Strategies
NASA Astrophysics Data System (ADS)
Zhu, Zi-Qi; Jin, Xiao-Ling; Huang, Zhi-Long
2012-03-01
A comparative study is carried out on the efficiency of five different random walk strategies searching on directed networks constructed from several typical complex networks. Because the differences in search efficiency of the strategies are rooted in network clustering, the clustering coefficient seen by a random walker on directed networks is defined and computed to be half of that of the corresponding undirected networks. The search processes are performed on directed networks based on the Erdős-Rényi, Watts-Strogatz, Barabási-Albert and clustered scale-free network models. It is found that the self-avoiding random walk strategy is the best search strategy for such directed networks. Compared to the unrestricted random walk strategy, path-iteration-avoiding random walks can also make the search process much more efficient. However, no-triangle-loop and no-quadrangle-loop random walks do not improve the search efficiency as expected, which differs from the behavior on undirected networks, since the clustering coefficient of directed networks is smaller than that of undirected networks.
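A self-avoiding random walk search of the kind compared here can be sketched with networkx; the graph model, sizes and target-search criterion below are assumptions for illustration only and do not reproduce the paper's network constructions.

```python
import random
import networkx as nx

random.seed(0)
G = nx.erdos_renyi_graph(200, 0.03, seed=1, directed=True)   # toy directed network
target = 42

def self_avoiding_search(G, start, target, max_steps=1000):
    """Walk to a random unvisited successor at each step; fall back to any
    successor if all have been visited. Returns the number of steps taken,
    or None if the target was not reached."""
    visited = {start}
    node = start
    for step in range(1, max_steps + 1):
        succ = list(G.successors(node))
        if not succ:
            return None   # dead end with no outgoing edges
        fresh = [v for v in succ if v not in visited]
        node = random.choice(fresh if fresh else succ)
        visited.add(node)
        if node == target:
            return step
    return None

steps = [self_avoiding_search(G, random.randrange(200), target) for _ in range(50)]
hits = [s for s in steps if s is not None]
print(f"reached target in {len(hits)}/50 walks, mean steps {sum(hits) / max(len(hits), 1):.1f}")
```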
Measurement of effective air diffusion coefficients for trichloroethene in undisturbed soil cores.
Bartelt-Hunt, Shannon L; Smith, James A
2002-06-01
In this study, we measure effective diffusion coefficients for trichloroethene in undisturbed soil samples taken from Picatinny Arsenal, New Jersey. The measured effective diffusion coefficients ranged from 0.0053 to 0.0609 cm²/s over a range of air-filled porosity of 0.23-0.49. The experimental data were compared to several previously published relations that predict diffusion coefficients as a function of air-filled porosity and porosity. A multiple linear regression analysis was developed to determine if a modification of the exponents in Millington's [Science 130 (1959) 100] relation would better fit the experimental data. The literature relations appeared to generally underpredict the effective diffusion coefficient for the soil cores studied in this work. Inclusion of a particle-size distribution parameter, d10, did not significantly improve the fit of the linear regression equation. The effective diffusion coefficient and porosity data were used to recalculate estimates of diffusive flux through the subsurface made in a previous study performed at the field site. It was determined that the method of calculation used in the previous study resulted in an underprediction of diffusive flux from the subsurface. We conclude that although Millington's [Science 130 (1959) 100] relation works well to predict effective diffusion coefficients in homogeneous soils with relatively uniform particle-size distributions, it may be inaccurate for many natural soils with heterogeneous structure and/or non-uniform particle-size distributions.
NASA Astrophysics Data System (ADS)
Yuste, S. B.; Abad, E.; Baumgaertner, A.
2016-07-01
We address the problem of diffusion on a comb whose teeth display varying lengths. Specifically, the length ℓ of each tooth is drawn from a probability distribution displaying power-law behavior at large ℓ, P(ℓ) ~ ℓ^{-(1+α)} (α > 0). To start with, we focus on the computation of the anomalous diffusion coefficient for the subdiffusive motion along the backbone. This quantity is subsequently used as an input to compute concentration recovery curves mimicking fluorescence recovery after photobleaching experiments in comblike geometries such as spiny dendrites. Our method is based on the mean-field description provided by the well-tested continuous time random-walk approach for the random-comb model, and the obtained analytical result for the diffusion coefficient is confirmed by numerical simulations of a random walk with finite steps in time and space along the backbone and the teeth. We subsequently incorporate retardation effects arising from binding-unbinding kinetics into our model and obtain a scaling law characterizing the corresponding change in the diffusion coefficient. Finally, we show that recovery curves obtained with the help of the analytical expression for the anomalous diffusion coefficient cannot be fitted perfectly by a model based on scaled Brownian motion, i.e., a standard diffusion equation with a time-dependent diffusion coefficient. However, differences between the exact curves and such fits are small, thereby providing justification for the practical use of models relying on scaled Brownian motion as a fitting procedure for recovery curves arising from particle diffusion in comblike systems.
Measurement of true ileal phosphorus digestibility in meat and bone meal for broiler chickens.
Mutucumarana, R K; Ravindran, V; Ravindran, G; Cowieson, A J
2015-07-01
An experiment was conducted to estimate the true ileal phosphorus (P) digestibility of 3 meat and bone meal samples (MBM-1, MBM-2, and MBM-3) for broiler chickens. Four semipurified diets were formulated from each sample to contain graded concentrations of P. The experiment was conducted as a completely randomized design with 6 replicates (6 birds per replicate) per dietary treatment. A total of 432 Ross 308 broilers were assigned at 21 d of age to the 12 test diets. The apparent ileal digestibility coefficient of P was determined by the indicator method, and the linear regression method was used to determine the true P digestibility coefficient. The apparent ileal digestibility coefficient of P in birds fed diets containing MBM-1 and MBM-2 was unaffected by increasing dietary concentrations of P (P > 0.05). The apparent ileal digestibility coefficient of P in birds fed the MBM-3 diets decreased with increasing P concentrations (linear, P < 0.001; quadratic, P < 0.01). In birds fed the MBM-1 and MBM-2 diets, ileal endogenous P losses were estimated to be 0.049 and 0.142 g/kg DM intake (DMI), respectively. In birds fed the MBM-3 diets, endogenous P loss was estimated to be negative (-0.370 g/kg DMI). True ileal P digestibility of MBM-1, MBM-2, and MBM-3 was determined to be 0.693, 0.608, and 0.420, respectively. True ileal P digestibility coefficients determined for MBM-1 and MBM-2 were similar (P > 0.05), but were higher (P < 0.05) than that for MBM-3. Total P and true digestible P contents of MBM-1, MBM-2, and MBM-3 were determined to be 37.5 and 26.0; 60.2 and 36.6; and 59.8 and 25.1 g/kg, respectively, on an as-fed basis. © 2015 Poultry Science Association Inc.
Oliveira, Thiara Castro de; Silva, Antônio Augusto Moura da; Santos, Cristiane de Jesus Nunes dos; Silva, Josenilde Sousa e; Conceição, Sueli Ismael Oliveira da
2010-12-01
To analyze factors associated with physical activity and the mean time spent in some sedentary activities among school-aged children. A cross-sectional study was carried out in 2005 in a random sample of 592 schoolchildren aged nine to 16 years in São Luís, Northeastern Brazil. Data were collected by means of a 24-Hour Physical Activity Recall questionnaire covering demographic and socioeconomic variables, physical activities practiced and time spent in certain sedentary activities. Physical activities were classified according to their metabolic equivalents (MET), and a physical activity index was estimated for each child. Sedentary lifestyle was estimated based on time spent watching television, playing videogames and on the computer/internet. The chi-square test was used to compare proportions and linear regression analysis was used to establish associations. Estimates were adjusted for the effect of the sampling design. The mean physical activity index was 605.73 MET-min/day (SD = 509.45). Schoolchildren who were male (coefficient = 134.57; 95%CI 50.77; 218.37), from public schools (coefficient = 94.08; 95%CI 12.54; 175.62) and in the 5th to 7th grades (coefficient = 95.01; 95%CI 8.10; 181.92) presented higher indices than girls, children from private schools and those in the 8th to 9th grades (p<0.05). On average, students spent 2.66 hours/day in sedentary activities. Time spent in sedentary activities was significantly lower for children aged nine to 11 years (coefficient = -0.49 hr/day; 95%CI -0.88; -0.10) and in lower socioeconomic classes (coefficient = -0.87; 95%CI -1.45; -0.30). Domestic chores (59.43%) and walking to school (58.43%) were the most common physical activities. Being female, attending private school and being in the 8th to 9th grades were factors associated with lower levels of physical activity. Younger schoolchildren and those from lower economic classes spent less time engaged in sedentary activities.
A Graph Theory Practice on Transformed Image: A Random Image Steganography
Thanikaiselvan, V.; Arulmozhivarman, P.; Subashanthini, S.; Amirtharajan, Rengarajan
2013-01-01
The modern information age is enriched with advanced network communication expertise but, at the same time, encounters countless security issues when dealing with secret and/or private information. The secure storage and transmission of secret information have become essential and have led to a deluge of research in this field. In this paper, an effort has been made to combine graceful graphs with the integer wavelet transform (IWT) to implement random image steganography for secure communication. The implementation begins by converting the cover image into wavelet coefficients through the IWT, followed by embedding the secret image in randomly selected coefficients through graph theory. Finally, the stego-image is obtained by applying the inverse IWT. This method provides a maximum peak signal-to-noise ratio (PSNR) of 44 dB for 266646 bits. Thus, the proposed method gives high imperceptibility through a high PSNR value, high embedding capacity in the cover image due to the adaptive embedding scheme, and high robustness against blind attacks through the graph-theoretic random selection of coefficients. PMID:24453857
ERIC Educational Resources Information Center
Frees, Edward W.; Kim, Jee-Seon
2006-01-01
Multilevel models are proven tools in social research for modeling complex, hierarchical systems. In multilevel modeling, statistical inference is based largely on quantification of random variables. This paper distinguishes among three types of random variables in multilevel modeling--model disturbances, random coefficients, and future response…
Learning accurate and interpretable models based on regularized random forests regression
2014-01-01
Background Many biology-related research works combine data from multiple sources in an effort to understand the underlying problems. It is important to find and interpret the most important information from these sources. Thus it would be beneficial to have an effective algorithm that can simultaneously extract decision rules and select critical features for good interpretation while preserving prediction performance. Methods In this study, we focus on regression problems for biological data where target outcomes are continuous. In general, models constructed from linear regression approaches are relatively easy to interpret. However, many practical biological applications are nonlinear in essence, where a direct linear relationship between input and output can rarely be found. Nonlinear regression techniques can reveal nonlinear relationships in data, but are generally hard for humans to interpret. We propose a rule-based regression algorithm that uses 1-norm regularized random forests. The proposed approach simultaneously extracts a small number of rules from generated random forests and eliminates unimportant features. Results We tested the approach on some biological data sets. The proposed approach is able to construct a significantly smaller set of regression rules using a subset of attributes while achieving prediction performance comparable to that of random forests regression. Conclusion It demonstrates high potential in aiding prediction and interpretation of nonlinear relationships of the subject being studied. PMID:25350120
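The general idea of selecting a sparse set of tree-derived rules with a 1-norm penalty can be sketched as follows: treat each leaf of a random forest as a rule indicator and fit a Lasso over those indicators. This is a hedged, simplified illustration (closer to a RuleFit-style construction) using scikit-learn, not the authors' algorithm; the dataset and all settings are assumptions.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso
from sklearn.preprocessing import OneHotEncoder

X, y = make_regression(n_samples=400, n_features=20, n_informative=5, noise=5.0,
                       random_state=0)

# Step 1: grow a shallow random forest; every leaf corresponds to a conjunction
# of feature conditions, i.e. an interpretable decision rule.
rf = RandomForestRegressor(n_estimators=50, max_depth=3, random_state=0).fit(X, y)

# Step 2: encode each sample by the leaves it falls into (one indicator per rule).
leaves = rf.apply(X)                                       # shape (n_samples, n_trees)
rules = OneHotEncoder().fit_transform(leaves).toarray()    # binary rule-membership matrix

# Step 3: 1-norm (Lasso) regression over the rule indicators keeps only a few rules.
lasso = Lasso(alpha=1.0).fit(rules, y)
n_kept = np.sum(lasso.coef_ != 0)
print(f"{rules.shape[1]} candidate rules, {n_kept} kept by the 1-norm penalty")
```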
The Geometry of Enhancement in Multiple Regression
ERIC Educational Resources Information Center
Waller, Niels G.
2011-01-01
In linear multiple regression, "enhancement" is said to occur when R² = b′r > r′r, where b is a p x 1 vector of standardized regression coefficients and r is a p x 1 vector of correlations between a criterion y and a set of standardized regressors, x. When p = 1, then b ≅ r and…
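A small numerical illustration of enhancement (my own numbers, not from the article): with two negatively correlated predictors that both correlate positively with the criterion, the squared multiple correlation b′r exceeds the sum of squared validities r′r.

```python
import numpy as np

Rxx = np.array([[1.0, -0.5],
                [-0.5, 1.0]])      # predictor intercorrelation matrix
r = np.array([0.5, 0.3])           # correlations of each predictor with y

b = np.linalg.solve(Rxx, r)        # standardized regression coefficients
print("b   =", b.round(3))         # [0.867, 0.733]
print("R^2 =", round(b @ r, 3))    # b'r = 0.653
print("r'r =", round(r @ r, 3))    # r'r = 0.340  ->  enhancement
```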
ERIC Educational Resources Information Center
Tong, Fuhui
2006-01-01
Background: An extensive body of research has favored the use of regression over other parametric analyses that are based on OVA. In the case of noteworthy regression results, researchers tend to explore the magnitude of beta weights for the respective predictors. Purpose: The purpose of this paper is to examine both beta weights and structure…
NASA Astrophysics Data System (ADS)
Mansouri, Edris; Feizi, Faranak; Jafari Rad, Alireza; Arian, Mehran
2018-03-01
This paper uses multivariate regression to create a mathematical model for iron skarn exploration in the Sarvian area, central Iran, as an approach to mineral prospectivity mapping (MPM). The main target of this paper is to apply multivariate regression analysis (as an MPM method) to map iron outcrops in the northeastern part of the study area in order to discover new iron deposits in other parts of the study area. Two types of multivariate regression models using two linear equations were employed to discover new mineral deposits. This method is one of the reliable methods for processing satellite images. ASTER satellite images (14 bands) were used as unique independent variables (UIVs), and iron outcrops were mapped as the dependent variable for MPM. According to the probability value (p value), the coefficient of determination (R2) and the adjusted coefficient of determination (Radj2), the second regression model (which consisted of multiple UIVs) fitted better than the other models. The accuracy of the model was confirmed by the iron outcrop map and geological observations. Based on field observation, iron mineralization occurs at the contact of limestone and intrusive rocks (skarn type).
Estimation Methods for Non-Homogeneous Regression - Minimum CRPS vs Maximum Likelihood
NASA Astrophysics Data System (ADS)
Gebetsberger, Manuel; Messner, Jakob W.; Mayr, Georg J.; Zeileis, Achim
2017-04-01
Non-homogeneous regression models are widely used to statistically post-process numerical weather prediction models. Such regression models correct for errors in mean and variance and are capable of forecasting a full probability distribution. To estimate the corresponding regression coefficients, CRPS minimization has been performed in many meteorological post-processing studies over the last decade. In contrast to maximum likelihood estimation, CRPS minimization is claimed to yield more calibrated forecasts. Theoretically, both scoring rules used as optimization criteria should be able to locate a similar and unknown optimum. Discrepancies might result from a wrong distributional assumption about the observed quantity. To address this theoretical concept, this study compares maximum likelihood and minimum CRPS estimation for different distributional assumptions. First, a synthetic case study shows that, for an appropriate distributional assumption, both estimation methods yield similar regression coefficients. The log-likelihood estimator is slightly more efficient. A real-world case study for surface temperature forecasts at different sites in Europe confirms these results but shows that surface temperature does not always follow the classical assumption of a Gaussian distribution. KEYWORDS: ensemble post-processing, maximum likelihood estimation, CRPS minimization, probabilistic temperature forecasting, distributional regression models
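The following sketch (my own toy data and parameterization, assuming a Gaussian predictive distribution with a linear mean and a log-linear standard deviation) contrasts the two estimators: the same four coefficients are obtained once by maximizing the Gaussian likelihood and once by minimizing the closed-form Gaussian CRPS.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(1)
x = rng.normal(size=500)                                   # e.g. an ensemble mean
y = 1.0 + 0.8 * x + rng.normal(scale=np.exp(0.2 + 0.3 * np.abs(x)))

def predict(p):
    return p[0] + p[1] * x, np.exp(p[2] + p[3] * np.abs(x))

def neg_loglik(p):
    mu, sigma = predict(p)
    return -norm.logpdf(y, mu, sigma).sum()

def mean_crps(p):                                          # closed-form Gaussian CRPS
    mu, sigma = predict(p)
    z = (y - mu) / sigma
    return (sigma * (z * (2 * norm.cdf(z) - 1) + 2 * norm.pdf(z)
                     - 1 / np.sqrt(np.pi))).mean()

p0 = np.zeros(4)
print("ML       :", minimize(neg_loglik, p0).x.round(3))
print("min CRPS :", minimize(mean_crps, p0).x.round(3))
```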
Guenole, Nigel; Brown, Anna
2014-01-01
We report a Monte Carlo study examining the effects of two strategies for handling measurement non-invariance – modeling and ignoring non-invariant items – on structural regression coefficients between latent variables measured with item response theory models for categorical indicators. These strategies were examined across four levels and three types of non-invariance – non-invariant loadings, non-invariant thresholds, and combined non-invariance on loadings and thresholds – in simple, partial, mediated and moderated regression models where the non-invariant latent variable occupied predictor, mediator, and criterion positions in the structural regression models. When non-invariance is ignored in the latent predictor, the focal group regression parameters are biased in the opposite direction to the difference in loadings and thresholds relative to the referent group (i.e., lower loadings and thresholds for the focal group lead to overestimated regression parameters). With criterion non-invariance, the focal group regression parameters are biased in the same direction as the difference in loadings and thresholds relative to the referent group. While unacceptable levels of parameter bias were confined to the focal group, bias occurred at considerably lower levels of ignored non-invariance than was previously recognized in referent and focal groups. PMID:25278911
Houwink, Elisa J.F.; Muijtjens, Arno M.M.; van Teeffelen, Sarah R.; Henneman, Lidewij; Rethans, Jan Joost; van der Jagt, Liesbeth E.J.; van Luijk, Scheltus J.; Dinant, Geert Jan; van der Vleuten, Cees; Cornel, Martina C.
2014-01-01
Purpose: General practitioners are increasingly called upon to deliver genetic services and could play a key role in translating potentially life-saving advancements in oncogenetic technologies to patient care. If general practitioners are to make an effective contribution in this area, their genetics competencies need to be upgraded. The aim of this study was to investigate whether oncogenetics training for general practitioners improves their genetic consultation skills. Methods: In this pragmatic, blinded, randomized controlled trial, the intervention consisted of a 4-h training (December 2011 and April 2012), covering oncogenetic consultation skills (family history, familial risk assessment, and efficient referral), attitude (medical ethical issues), and clinical knowledge required in primary-care consultations. Outcomes were measured using observation checklists by unannounced standardized patients and self-reported questionnaires. Results: Of 88 randomized general practitioners who initially agreed to participate, 56 completed all measurements. Key consultation skills significantly and substantially improved; regression coefficients after intervention were equivalent to 0.34 and 0.28 at 3-month follow-up, indicating a moderate effect size. Satisfaction and perceived applicability of newly learned skills were highly scored. Conclusion: The general practitioner–specific training proved to be a feasible, satisfactory, and clinically applicable method to improve oncogenetics consultation skills and could be used as an educational framework to inform future training activities with the ultimate aim of improving medical care. PMID:23722870
Linear regression metamodeling as a tool to summarize and present simulation model results.
Jalal, Hawre; Dowd, Bryan; Sainfort, François; Kuntz, Karen M
2013-10-01
Modelers lack a tool to systematically and clearly present complex model results, including those from sensitivity analyses. The objective was to propose linear regression metamodeling as a tool to increase transparency of decision analytic models and better communicate their results. We used a simplified cancer cure model to demonstrate our approach. The model computed the lifetime cost and benefit of 3 treatment options for cancer patients. We simulated 10,000 cohorts in a probabilistic sensitivity analysis (PSA) and regressed the model outcomes on the standardized input parameter values in a set of regression analyses. We used the regression coefficients to describe measures of sensitivity analyses, including threshold and parameter sensitivity analyses. We also compared the results of the PSA to deterministic full-factorial and one-factor-at-a-time designs. The regression intercept represented the estimated base-case outcome, and the other coefficients described the relative parameter uncertainty in the model. We defined simple relationships that compute the average and incremental net benefit of each intervention. Metamodeling produced outputs similar to traditional deterministic 1-way or 2-way sensitivity analyses but was more reliable since it used all parameter values. Linear regression metamodeling is a simple, yet powerful, tool that can assist modelers in communicating model characteristics and sensitivity analyses.
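A minimal sketch of the approach on a made-up toy model (not the authors' cancer cure model): simulate PSA draws, standardize the inputs, and regress the outcome on them; the intercept then approximates the expected (base-case) outcome and the slopes rank parameter influence.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 10_000
p_cure = rng.beta(20, 80, n)                              # uncertain inputs
cost_tx = rng.gamma(100, 50, n)
utility = rng.beta(70, 30, n)
net_benefit = 50_000 * 10 * p_cure * utility - cost_tx    # toy model output

X = np.column_stack([p_cure, cost_tx, utility])
Xz = (X - X.mean(axis=0)) / X.std(axis=0)                 # standardized inputs
meta = sm.OLS(net_benefit, sm.add_constant(Xz)).fit()
print(meta.params.round(1))   # [intercept ~ base case, slopes ~ parameter influence]
```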
Marginalized zero-altered models for longitudinal count data.
Tabb, Loni Philip; Tchetgen, Eric J Tchetgen; Wellenius, Greg A; Coull, Brent A
2016-10-01
Count data often exhibit more zeros than predicted by common count distributions like the Poisson or negative binomial. In recent years, there has been considerable interest in methods for analyzing zero-inflated count data in longitudinal or other correlated data settings. A common approach has been to extend zero-inflated Poisson models to include random effects that account for correlation among observations. However, these models have been shown to have a few drawbacks, including interpretability of regression coefficients and numerical instability of fitting algorithms even when the data arise from the assumed model. To address these issues, we propose a model that parameterizes the marginal associations between the count outcome and the covariates as easily interpretable log relative rates, while including random effects to account for correlation among observations. One of the main advantages of this marginal model is that it allows a basis upon which we can directly compare the performance of standard methods that ignore zero inflation with that of a method that explicitly takes zero inflation into account. We present simulations of these various model formulations in terms of bias and variance estimation. Finally, we apply the proposed approach to analyze toxicological data of the effect of emissions on cardiac arrhythmias.
Multiscale measurement error models for aggregated small area health data.
Aregay, Mehreteab; Lawson, Andrew B; Faes, Christel; Kirby, Russell S; Carroll, Rachel; Watjou, Kevin
2016-08-01
Spatial data are often aggregated from a finer (smaller) to a coarser (larger) geographical level. The process of data aggregation induces a scaling effect which smoothes the variation in the data. To address the scaling problem, multiscale models that link the convolution models at different scale levels via a shared random effect have been proposed. One of the main goals in aggregated health data is to investigate the relationship between predictors and an outcome at different geographical levels. In this paper, we extend multiscale models to examine whether a predictor effect at a finer level holds true at a coarser level. To adjust for predictor uncertainty due to aggregation, we applied measurement error models in the framework of the multiscale approach. To assess the benefit of using multiscale measurement error models, we compare the performance of multiscale models with and without measurement error in both real and simulated data. We found that ignoring the measurement error in multiscale models underestimates the regression coefficient, while it overestimates the variance of the spatially structured random effect. On the other hand, accounting for the measurement error in multiscale models provides a better model fit and unbiased parameter estimates. © The Author(s) 2016.
An efficient algorithm for generating random number pairs drawn from a bivariate normal distribution
NASA Technical Reports Server (NTRS)
Campbell, C. W.
1983-01-01
An efficient algorithm for generating random number pairs from a bivariate normal distribution was developed. Any desired values of the two means, two standard deviations, and correlation coefficient can be selected. Theoretically the technique is exact, and in practice its accuracy is limited only by the quality of the uniform-distribution random number generator, inaccuracies in computer function evaluation, and arithmetic. A FORTRAN routine was written to check the algorithm, and good accuracy was obtained. Some small errors in the correlation coefficient were observed to vary in a surprisingly regular manner. A simple model was developed which explained the qualitative aspects of the errors.
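A sketch of the standard construction consistent with the report's description (the exact FORTRAN details are assumed): Box-Muller pairs of independent standard normals are combined so that the second variate receives the desired correlation.

```python
import numpy as np

def bivariate_normal_pairs(n, mu1, mu2, s1, s2, rho, seed=None):
    rng = np.random.default_rng(seed)
    u1, u2 = rng.random(n), rng.random(n)
    z1 = np.sqrt(-2 * np.log(u1)) * np.cos(2 * np.pi * u2)   # Box-Muller
    z2 = np.sqrt(-2 * np.log(u1)) * np.sin(2 * np.pi * u2)
    x1 = mu1 + s1 * z1
    x2 = mu2 + s2 * (rho * z1 + np.sqrt(1 - rho ** 2) * z2)  # induce correlation
    return x1, x2

x1, x2 = bivariate_normal_pairs(100_000, 0.0, 1.0, 1.0, 2.0, rho=0.7, seed=3)
print(np.corrcoef(x1, x2)[0, 1])   # should be close to 0.7
```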
van Mierlo, Trevor; Hyatt, Douglas; Ching, Andrew T
2016-01-01
Digital Health Social Networks (DHSNs) are common; however, there are few metrics that can be used to identify participation inequality. The objective of this study was to investigate whether the Gini coefficient, an economic measure of statistical dispersion traditionally used to measure income inequality, could be employed to measure DHSN inequality. Quarterly Gini coefficients were derived from four long-standing DHSNs. The combined data set included 625,736 posts that were generated from 15,181 actors over 18,671 days. The range of actors (8-2323), posts (29-28,684), and Gini coefficients (0.15-0.37) varied. Pearson correlations indicated statistically significant associations between number of actors and number of posts (0.527-0.835, p < .001), and Gini coefficients and number of posts (0.342-0.725, p < .001). However, the association between Gini coefficient and number of actors was only statistically significant for the addiction networks (0.619 and 0.276, p < .036). Linear regression models had positive but mixed R2 results (0.333-0.527). In all four regression models, the association between Gini coefficient and posts was statistically significant (t = 3.346-7.381, p < .002). However, unlike the Pearson correlations, the association between Gini coefficient and number of actors was only statistically significant in the two mental health networks (t = -4.305 and -5.934, p < .000). The Gini coefficient is helpful in measuring shifts in DHSN inequality. However, as a standalone metric, the Gini coefficient does not indicate optimal numbers or ratios of actors to posts, or effective network engagement. Further, mixed-methods research investigating quantitative performance metrics is required.
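For reference, here is a small self-contained sketch (my own code and made-up counts) of the Gini coefficient applied to posts per actor, which is the quantity the study tracks quarterly.

```python
import numpy as np

def gini(counts):
    """Gini coefficient of a non-negative array (0 = equal, ~1 = maximal inequality)."""
    x = np.sort(np.asarray(counts, dtype=float))
    n = x.size
    lorenz = np.cumsum(x) / x.sum()
    return (n + 1 - 2 * lorenz.sum()) / n

posts_per_actor = np.array([1, 1, 2, 3, 5, 8, 40, 120])   # hypothetical quarter
print(round(gini(posts_per_actor), 3))                    # participation inequality
```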
Fluctuating Navier-Stokes equations for inelastic hard spheres or disks.
Brey, J Javier; Maynar, P; de Soria, M I García
2011-04-01
Starting from the fluctuating Boltzmann equation for smooth inelastic hard spheres or disks, closed equations for the fluctuating hydrodynamic fields to Navier-Stokes order are derived. This requires deriving constitutive relations for both the fluctuating fluxes and the correlations of the random forces. The former are identified as having the same form as the macroscopic average fluxes and involving the same transport coefficients. On the other hand, the random force terms exhibit two peculiarities as compared with their elastic limit for molecular systems. First, they are not white but have some finite relaxation time. Second, their amplitude is not determined by the macroscopic transport coefficients but involves new coefficients. ©2011 American Physical Society
The Analysis of Completely Randomized Factorial Experiments When Observations Are Lost at Random.
ERIC Educational Resources Information Center
Hummel, Thomas J.
An investigation was conducted of the characteristics of two estimation procedures and corresponding test statistics used in the analysis of completely randomized factorial experiments when observations are lost at random. For one estimator, contrast coefficients for cell means did not involve the cell frequencies. For the other, contrast…
The Study of Rain Specific Attenuation for the Prediction of Satellite Propagation in Malaysia
NASA Astrophysics Data System (ADS)
Mandeep, J. S.; Ng, Y. Y.; Abdullah, H.; Abdullah, M.
2010-06-01
Specific attenuation is the fundamental quantity in the calculation of rain attenuation for terrestrial and slant paths, expressed as rain attenuation per unit distance (dB/km). Specific attenuation is an important element in developing rain attenuation prediction models. This paper deals with the empirical determination of the power-law coefficients that allow the specific attenuation in dB/km to be calculated from the rain rate in mm/h. The main purpose of the paper is to obtain the coefficients k and α of the power-law relationship between specific attenuation and rain rate. Three years (1 January 2006 to 31 December 2008) of rain gauge and beacon data taken from USM, Nibong Tebal, were used for the empirical analysis of rain specific attenuation. The data presented are semi-empirical in nature. A year-to-year variation of the coefficients was observed, and the empirically measured data were compared with the ITU-R regression coefficients. The results indicated that the USM measurements vary significantly from the ITU-R predicted values. Hence, the ITU-R recommended regression coefficients for rain specific attenuation are not suitable for predicting rain attenuation in Malaysia.
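The empirical procedure reduces to estimating k and α in the power law γ = k·R^α. A minimal sketch with made-up numbers (not the USM measurements) is below; taking logarithms turns the fit into a simple linear regression.

```python
import numpy as np

rain_rate = np.array([10, 20, 40, 60, 80, 100, 120.0])     # mm/h (hypothetical)
gamma = np.array([0.6, 1.4, 3.1, 5.0, 6.9, 9.0, 11.1])     # dB/km (hypothetical)

# log(gamma) = log(k) + alpha * log(R)
alpha, log_k = np.polyfit(np.log(rain_rate), np.log(gamma), 1)
print(f"k = {np.exp(log_k):.4f}, alpha = {alpha:.3f}")      # compare with ITU-R values
```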
Merchantable sawlog and bole-length equations for the Northeastern United States
Daniel A. Yaussy; Martin E. Dale; Martin E. Dale
1991-01-01
A modified Richards growth model is used to develop species-specific coefficients for equations estimating the merchantable sawlog and bole lengths of trees from 25 species groups common to the Northeastern United States. These regression coefficients have been incorporated into the growth-and-yield simulation software, NE-TWIGS.
Leveraging EHRs to improve hospital performance: the role of management.
Adler-Milstein, Julia; Woody Scott, Kirstin; Jha, Ashish K
2014-11-01
Recent studies fail to find a consistent relationship between adoption of electronic health records (EHRs) and improved hospital performance. We sought to examine whether the quality of hospital management modifies the association between EHR adoption and outcomes related to cost and quality. Retrospective study of a random sample of US acute care hospitals. Management quality was assessed via phone interviews with clinical managers, predominantly from cardiac units, in a random sample of 325 hospitals using a validated scale of management practices in 4 areas: operations, performance monitoring, target setting, and talent management. American Hospital Association Information Technology Supplement data captured whether or not these hospitals had at least a basic EHR. Acute myocardial infarction (AMI) outcomes included risk-adjusted 30-day mortality, average length-of-stay, and average payment per discharge measured using MedPAR data. Ordinary least squares regressions assessed whether management quality modifies the relationship between EHR adoption and AMI outcomes. While we found no association between EHR adoption and our outcomes, management quality modified the relationship in the predicted direction. For length of stay, the coefficient on the interaction between EHR and management was -1.48 (P = .05), and for payment, it was -7786.74 (P = .014). We did not find strong evidence of effect modification for mortality (coefficient = -0.05; P = .37). Coupled with ongoing policy efforts to achieve nationwide EHR adoption is a growing unease that our national investment may not result in better, more efficient care. Our study is among the first to offer empirical evidence that management quality may help explain why some hospitals see substantial gains from EHR adoption while others do not.
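The effect-modification test amounts to an OLS model with an EHR-by-management interaction term. A minimal sketch with simulated hospitals and illustrative variable names (not the study data) follows.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 325
df = pd.DataFrame({"ehr": rng.integers(0, 2, n),        # basic EHR adopted (0/1)
                   "mgmt": rng.normal(3.0, 0.5, n)})    # management practice score
df["los"] = (5.0 - 0.1 * df["ehr"] - 1.0 * df["ehr"] * (df["mgmt"] - 3)
             + rng.normal(0, 1, n))                     # length of stay (toy outcome)

fit = smf.ols("los ~ ehr * mgmt", data=df).fit()
print(fit.params["ehr:mgmt"], fit.pvalues["ehr:mgmt"])  # negative interaction expected
```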
de la Loge, Christine; Tugaut, Béatrice; Fofana, Fatoumata; Lambert, Jérémy; Hennig, Michael; Tschiesner, Uta; Vahdati-Bolouri, Mitra; Segun Ismaila, Afisi; Suresh Punekar, Yogesh
2016-03-15
Background: This meta-analysis assessed the relationship between change from baseline (CFB) in spirometric measurements (trough forced expiratory volume in 1 second [FEV1] and FEV1 area under the curve [AUC]) and patient-reported outcomes (St. George's Respiratory Questionnaire total score [SGRQ] CFB, Transition Dyspnea Index [TDI] and exacerbation rates) after 6-12 months' follow-up, using study treatment-group level data. Methods: A systematic literature search was performed for randomized controlled trials of ≥24 weeks duration in adults with chronic obstructive pulmonary disease (COPD). Studies reporting ≥1 spirometric measurement and ≥1 patient-reported outcome (PRO) at baseline and at study endpoint were selected. The relationships between PROs and spirometric endpoints were assessed using the Pearson correlation coefficient and meta-regression. Results: Fifty-two studies (62,385 patients) were included. Primary weighted analysis conducted at the last assessment showed a large significant negative correlation (r, -0.68 [95% confidence interval (CI); -0.77, -0.57]) between trough FEV1 and SGRQ. Improvement of 100 mL in trough FEV1 corresponded to a 5.9 point reduction in SGRQ. Similarly, a reduction of 4 points on SGRQ corresponded to a 40 mL improvement in trough FEV1 (p<0.001). The weighted correlation coefficients of trough FEV1 with TDI, exacerbation rate (all) and exacerbation rate (moderate/severe) at the last assessment point were 0.57, -0.69 and -0.57, respectively (all p<0.05). For the analyses excluding placebo groups, the correlations of FEV1 with SGRQ and TDI were lower but significant. Conclusions: A strong association exists between changes in spirometric measurements and changes in PROs.
Comparing spatial regression to random forests for large environmental data sets
Environmental data may be “large” due to number of records, number of covariates, or both. Random forests has a reputation for good predictive performance when using many covariates, whereas spatial regression, when using reduced rank methods, has a reputatio...
Sørensen, By Ole H
2016-10-01
Organizational-level occupational health interventions have great potential to improve employees' health and well-being. However, they often compare unfavourably to individual-level interventions. This calls for improving methods for designing, implementing and evaluating organizational interventions. This paper presents and discusses the regression discontinuity design because, like the randomized control trial, it is a strong summative experimental design, but it typically fits organizational-level interventions better. The paper explores advantages and disadvantages of a regression discontinuity design with an embedded randomized control trial. It provides an example from an intervention study focusing on reducing sickness absence in 196 preschools. The paper demonstrates that such a design fits the organizational context, because it allows management to focus on organizations or workgroups with the most salient problems. In addition, organizations may accept an embedded randomized design because the organizations or groups with most salient needs receive obligatory treatment as part of the regression discontinuity design. Copyright © 2016 John Wiley & Sons, Ltd.
Background stratified Poisson regression analysis of cohort data.
Richardson, David B; Langholz, Bryan
2012-03-01
Background stratified Poisson regression is an approach that has been used in the analysis of data derived from a variety of epidemiologically important studies of radiation-exposed populations, including uranium miners, nuclear industry workers, and atomic bomb survivors. We describe a novel approach to fit Poisson regression models that adjust for a set of covariates through background stratification while directly estimating the radiation-disease association of primary interest. The approach makes use of an expression for the Poisson likelihood that treats the coefficients for stratum-specific indicator variables as 'nuisance' variables and avoids the need to explicitly estimate the coefficients for these stratum-specific parameters. Log-linear models, as well as other general relative rate models, are accommodated. This approach is illustrated using data from the Life Span Study of Japanese atomic bomb survivors and data from a study of underground uranium miners. The point estimate and confidence interval obtained from this 'conditional' regression approach are identical to the values obtained using unconditional Poisson regression with model terms for each background stratum. Moreover, it is shown that the proposed approach allows estimation of background stratified Poisson regression models of non-standard form, such as models that parameterize latency effects, as well as regression models in which the number of strata is large, thereby overcoming the limitations of previously available statistical software for fitting background stratified Poisson regression models.
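The paper's key point is that the conditional ('background stratified') fit reproduces the dose coefficient of an unconditional Poisson model with one indicator per stratum. The sketch below (my own simulated cohort) shows the unconditional version with statsmodels; the conditional likelihood trick itself is not implemented here.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 2000
df = pd.DataFrame({"stratum": rng.integers(0, 20, n),   # age/sex/period cells
                   "dose": rng.gamma(1.0, 0.1, n),      # dose (Sv)
                   "pyr": rng.uniform(1, 10, n)})       # person-years at risk
baseline = rng.uniform(0.001, 0.01, 20)[df["stratum"]]
df["cases"] = rng.poisson(df["pyr"] * baseline * np.exp(0.5 * df["dose"]))

fit = smf.glm("cases ~ dose + C(stratum)", data=df,
              family=sm.families.Poisson(), offset=np.log(df["pyr"])).fit()
print(fit.params["dose"])   # log relative rate per Sv, adjusted for background strata
```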
Advanced statistics: linear regression, part II: multiple linear regression.
Marill, Keith A
2004-01-01
The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.
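A compact sketch of the ideas discussed (simulated data, arbitrary variable names): a multiple linear regression with an interaction term, exact confidence intervals for the coefficients, and variance inflation factors as a quick multicollinearity check.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
n = 200
age = rng.normal(60, 10, n)
weight = rng.normal(80, 12, n)
bp = 90 + 0.5 * age + 0.3 * weight + 0.002 * age * weight + rng.normal(0, 8, n)
df = pd.DataFrame({"age": age, "weight": weight, "bp": bp})

fit = smf.ols("bp ~ age * weight", data=df).fit()
print(fit.params)                      # includes the age:weight interaction
print(fit.conf_int())                  # exact CIs for each coefficient

X = np.column_stack([np.ones(n), age, weight])
print([round(variance_inflation_factor(X, i), 2) for i in (1, 2)])
```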
Harrell-Williams, Leigh; Wolfe, Edward W
2014-01-01
Previous research has investigated the influence of sample size, model misspecification, test length, ability distribution offset, and generating model on the likelihood ratio difference test in applications of item response models. This study extended that research to the evaluation of dimensionality using the multidimensional random coefficients multinomial logit model (MRCMLM). Logistic regression analysis of simulated data reveals that sample size and test length have a large effect on the capacity of the LR difference test to correctly identify unidimensionality, with shorter tests and smaller sample sizes leading to smaller Type I error rates. Higher levels of simulated misfit resulted in fewer incorrect decisions than data with no or little misfit. However, Type I error rates indicate that the likelihood ratio difference test is not suitable under any of the simulated conditions for evaluating dimensionality in applications of the MRCMLM.
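The test statistic itself is simple; a minimal sketch with hypothetical log-likelihoods (the MRCMLM fits are not reproduced here) shows the chi-square comparison.

```python
from scipy.stats import chi2

llf_unidim, llf_multidim, extra_params = -10234.7, -10229.9, 2   # hypothetical fits
lr = 2 * (llf_multidim - llf_unidim)          # likelihood ratio difference statistic
p_value = chi2.sf(lr, df=extra_params)
print(lr, p_value)                            # small p -> reject unidimensionality
```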
Videodensitometric Methods for Cardiac Output Measurements
NASA Astrophysics Data System (ADS)
Mischi, Massimo; Kalker, Ton; Korsten, Erik
2003-12-01
Cardiac output is often measured by indicator dilution techniques, usually based on dye or cold saline injections. Developments of more stable ultrasound contrast agents (UCA) are leading to new noninvasive indicator dilution methods. However, several problems concerning the interpretation of dilution curves as detected by ultrasound transducers have arisen. This paper presents a method for blood flow measurements based on UCA dilution. Dilution curves are determined by real-time densitometric analysis of the video output of an ultrasound scanner and are automatically fitted by the Local Density Random Walk model. A new fitting algorithm based on multiple linear regression is developed. Calibration, that is, the relation between videodensity and UCA concentration, is modelled by in vitro experimentation. The flow measurement system is validated by in vitro perfusion of SonoVue contrast agent. The results show an accurate dilution curve fit and flow estimation, with determination coefficients larger than 0.95 and 0.99, respectively.
Ross, Michelle; Wakefield, Jon
2015-10-01
Two-phase study designs are appealing since they allow for the oversampling of rare sub-populations which improves efficiency. In this paper we describe a Bayesian hierarchical model for the analysis of two-phase data. Such a model is particularly appealing in a spatial setting in which random effects are introduced to model between-area variability. In such a situation, one may be interested in estimating regression coefficients or, in the context of small area estimation, in reconstructing the population totals by strata. The efficiency gains of the two-phase sampling scheme are compared to standard approaches using 2011 birth data from the research triangle area of North Carolina. We show that the proposed method can overcome small sample difficulties and improve on existing techniques. We conclude that the two-phase design is an attractive approach for small area estimation.
Gonzalez-Mulé, Erik; DeGeest, David S; McCormick, Brian W; Seong, Jee Young; Brown, Kenneth G
2014-09-01
Drawing on the group-norms theory of organizational citizenship behaviors and person-environment fit theory, we introduce and test a multilevel model of the effects of additive and dispersion composition models of team members' personality characteristics on group norms and individual helping behaviors. Our model was tested using regression and random coefficients modeling on 102 research and development teams. Results indicated that high mean levels of extraversion are positively related to individual helping behaviors through the mediating effect of cooperative group norms. Further, low variance on agreeableness (supplementary fit) and high variance on extraversion (complementary fit) promote the enactment of individual helping behaviors, but only the effects of extraversion were mediated by cooperative group norms. Implications of these findings for theories of helping behaviors in teams are discussed. PsycINFO Database Record (c) 2014 APA, all rights reserved.
Use of sand wave habitats by silver hake
Auster, P.J.; Lindholm, J.; Schaub, S.; Funnell, G.; Kaufman, L.S.; Valentine, P.C.
2003-01-01
Silver hake Merluccius bilinearis are common members of fish communities in sand wave habitats on Georges Bank and on Stellwagen Bank in the Gulf of Maine. Observations of fish size v. sand wave period showed that silver hake are not randomly distributed within sand wave landscapes. Regression analyses showed a significant positive relationship between sand wave period and fish length. Correlation coefficients, however, were low, suggesting other interactions with sand wave morphology, the range of current velocities, and available prey may also influence their distribution. Direct contact with sand wave habitats varied over diel periods, with more fish resting on the seafloor during daytime than at night. Social foraging, in the form of polarized groups of fish swimming in linear formations during crepuscular and daytime periods, was also observed. Sand wave habitats may provide shelter from current flows and mediate fish-prey interactions. ?? 2003 The Fisheries Society of the British Isles.
Deeb, Omar; Shaik, Basheerulla; Agrawal, Vijay K
2014-10-01
Quantitative Structure-Activity Relationship (QSAR) models for the binding affinity constants (log Ki) of 78 flavonoid ligands towards the benzodiazepine site of the GABA(A) receptor complex were developed using machine learning methods: artificial neural network (ANN) and support vector machine (SVM) techniques. The models obtained were compared with those obtained using multiple linear regression (MLR) analysis. Descriptor selection and model building were performed with 10-fold cross-validation using the training data set. The SVM and MLR coefficients of determination are 0.944 and 0.879, respectively, for the training set and are higher than those of the ANN models. Though the SVM model shows improved fitting of the training set, the ANN model was superior to SVM and MLR in predicting the test set. A randomization test was employed to check the suitability of the models.
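A minimal sketch of the comparison workflow using scikit-learn on a made-up descriptor matrix (not the flavonoid data): MLR, SVM and ANN regressors evaluated with 10-fold cross-validation.

```python
import numpy as np
from sklearn.model_selection import cross_val_score, KFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.neural_network import MLPRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(78, 20))                          # 78 ligands x 20 descriptors
log_ki = X[:, 0] - 0.5 * X[:, 1] ** 2 + rng.normal(0, 0.3, 78)

cv = KFold(n_splits=10, shuffle=True, random_state=1)
models = {"MLR": LinearRegression(),
          "SVM": make_pipeline(StandardScaler(), SVR(C=10.0)),
          "ANN": make_pipeline(StandardScaler(),
                               MLPRegressor(hidden_layer_sizes=(16,),
                                            max_iter=5000, random_state=1))}
for name, model in models.items():
    print(name, round(cross_val_score(model, X, log_ki, cv=cv, scoring="r2").mean(), 3))
```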
Tukiendorf, Andrzej; Mansournia, Mohammad Ali; Wydmański, Jerzy; Wolny-Rokicka, Edyta
2017-04-01
Background: Clinical datasets for epithelial ovarian cancer patients with brain metastases are usually small. When adequate case numbers are lacking, the resulting estimates of regression coefficients may be biased. One of the direct approaches to reduce such sparse-data bias is penalized estimation. Methods: A re-analysis of previously reported hazard ratios in diagnosed patients was performed using penalized Cox regression with a popular SAS package, providing additional software code for the statistical computational procedure. Results: It was found that the penalized approach can readily diminish sparse-data artefacts and radically reduce the magnitude of estimated regression coefficients. Conclusions: It was confirmed that classical statistical approaches may exaggerate regression estimates or distort study interpretations and conclusions. The results support the thesis that penalization via weakly informative priors and data augmentation is among the safest approaches to shrink sparse-data artefacts frequently occurring in epidemiological research.
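As an illustration in Python rather than SAS (my own simulated data; lifelines' ridge-type penalizer stands in for the penalized estimation the authors used), the sketch below shows how a penalty shrinks a sparse-data hazard ratio toward the null.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(5)
n = 60                                                 # deliberately small cohort
df = pd.DataFrame({"time": rng.exponential(12, n),
                   "event": rng.integers(0, 2, n),
                   "rare_exposure": (rng.random(n) < 0.15).astype(int)})

plain = CoxPHFitter().fit(df, duration_col="time", event_col="event")
shrunk = CoxPHFitter(penalizer=0.5).fit(df, duration_col="time", event_col="event")
print(np.exp(plain.params_["rare_exposure"]),          # unpenalized hazard ratio
      np.exp(shrunk.params_["rare_exposure"]))         # penalized (shrunk toward 1)
```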
Application of Temperature Sensitivities During Iterative Strain-Gage Balance Calibration Analysis
NASA Technical Reports Server (NTRS)
Ulbrich, N.
2011-01-01
A new method is discussed that may be used to correct wind tunnel strain-gage balance load predictions for the influence of residual temperature effects at the location of the strain-gages. The method was designed for the iterative analysis technique that is used in the aerospace testing community to predict balance loads from strain-gage outputs during a wind tunnel test. The new method implicitly applies temperature corrections to the gage outputs during the load iteration process. Therefore, it can use uncorrected gage outputs directly as input for the load calculations. The new method is applied in several steps. First, balance calibration data is analyzed in the usual manner assuming that the balance temperature was kept constant during the calibration. Then, the temperature difference relative to the calibration temperature is introduced as a new independent variable for each strain-gage output. Therefore, sensors must exist near the strain-gages so that the required temperature differences can be measured during the wind tunnel test. In addition, the format of the regression coefficient matrix needs to be extended so that it can support the new independent variables. In the next step, the extended regression coefficient matrix of the original calibration data is modified by using the manufacturer specified temperature sensitivity of each strain-gage as the regression coefficient of the corresponding temperature difference variable. Finally, the modified regression coefficient matrix is converted to a data reduction matrix that the iterative analysis technique needs for the calculation of balance loads. Original calibration data and modified check load data of NASA's MC60D balance are used to illustrate the new method.
Weibull crack density coefficient for polydimensional stress states
NASA Technical Reports Server (NTRS)
Gross, Bernard; Gyekenyesi, John P.
1989-01-01
A structural ceramic analysis and reliability evaluation code has recently been developed encompassing volume and surface flaw induced fracture, modeled by the two-parameter Weibull probability density function. A segment of the software involves computing the Weibull polydimensional stress state crack density coefficient from uniaxial stress experimental fracture data. The relationship of the polydimensional stress coefficient to the uniaxial stress coefficient is derived for a shear-insensitive material with a random surface flaw population.
NASA Astrophysics Data System (ADS)
Han, Hao; Zhang, Hao; Wei, Xinzhou; Moore, William; Liang, Zhengrong
2016-03-01
In this paper, we propose a low-dose computed tomography (LdCT) image reconstruction method with the help of prior knowledge learned from previous high-quality or normal-dose CT (NdCT) scans. The well-established statistical penalized weighted least squares (PWLS) algorithm was adopted for image reconstruction, where the penalty term was formulated by a texture-based Gaussian Markov random field (gMRF) model. The NdCT scan was first segmented into different tissue types by a feature vector quantization (FVQ) approach. Then, for each tissue type, a set of tissue-specific coefficients for the gMRF penalty was statistically learned from the NdCT image via multiple linear regression analysis. We also proposed a scheme to adaptively select the order of the gMRF model for coefficient prediction. The tissue-specific gMRF patterns learned from the NdCT image were finally used to form an adaptive MRF penalty for the PWLS reconstruction of the LdCT image. The proposed texture-adaptive PWLS image reconstruction algorithm was shown to be more effective at preserving image textures than the conventional PWLS image reconstruction algorithm, and we further demonstrated the gain of high-order MRF modeling for texture-preserving LdCT PWLS image reconstruction.
Predictors of surgeons' efficiency in the operating rooms.
Nakata, Yoshinori; Watanabe, Yuichi; Narimatsu, Hiroto; Yoshimura, Tatsuya; Otake, Hiroshi; Sawa, Tomohiro
2017-02-01
The sustainability of the Japanese healthcare system is questionable because of a huge fiscal debt. One of the solutions is to improve the efficiency of healthcare. The purpose of this study is to determine which factors are predictive of surgeons' efficiency scores. The authors collected data from all surgical procedures performed at Teikyo University Hospital from April 1 through September 30 in 2013-2015. The output-oriented Charnes-Cooper-Rhodes model of data envelopment analysis was employed to calculate each surgeon's efficiency score. Seven independent variables that may predict efficiency scores were selected: experience, medical school, surgical volume, gender, academic rank, surgical specialty, and the surgical fee schedule. Multiple regression analysis using a random-effects Tobit model was applied to the panel data. Data from a total of 8722 surgical cases were obtained over the 18-month study period, and 134 surgeons were analyzed. The only statistically significant coefficients were surgical specialty and surgical fee schedule (p = 0.000 and p = 0.016, respectively). Experience had some positive association with efficiency scores but did not reach statistical significance (p = 0.062). The other coefficients were not statistically significant. These results demonstrate that the surgical reimbursement system, not surgeons' personal characteristics, is a significant predictor of surgeons' efficiency.
Mohamadirizi, Soheila; Kordi, Masoumeh
2013-09-01
Menstruation signs are among the most common disorders in adolescents and are influenced by various environmental and psychosocial factors. This study aimed to define the association between menstruation signs and anxiety, depression, and stress in school girls in Mashhad in 2011-2012. This was a cross-sectional study on 407 high school girls in Mashhad who were selected through two-step random sampling. The students completed a questionnaire concerning demographic characteristics, menstruation, Depression, Anxiety, and Stress Scale of 21 questions (DASS-21), and menstruation signs in three phases of their menstruation. Data were analyzed by the statistical tests of Pearson correlation coefficient, Student's t-test, one-way analysis of variance (ANOVA), and regression through SPSS version 14. Based on the findings, 74% of the subjects reported pre-menstruation signs, 94% reported signs during bleeding, and 40.8% reported post-menstruation signs. About 44.3% of the subjects had anxiety, 45.5% had depression, and 47.2% had stress. In addition, Pearson correlation coefficient test showed a significant positive correlation between menstruation signs and depression, anxiety, and stress (P < 0.05). With regard to the association between menstruation signs and psycho-cognitive variables, prevention and treatment of these disorders by the authorities of education and training and the Ministry of Health are essential.
Bayesian dynamic modeling of time series of dengue disease case counts
López-Quílez, Antonio; Torres-Prieto, Alexander
2017-01-01
The aim of this study is to model the association between weekly time series of dengue case counts and meteorological variables, in a high-incidence city of Colombia, applying Bayesian hierarchical dynamic generalized linear models over the period January 2008 to August 2015. Additionally, we evaluate the models' short-term performance for predicting dengue cases. The methodology uses dynamic Poisson log-link models including constant or time-varying coefficients for the meteorological variables. Calendar effects were modeled using constant or first- or second-order random walk time-varying coefficients. The meteorological variables were modeled using constant coefficients and first-order random walk time-varying coefficients. We applied Markov Chain Monte Carlo simulations for parameter estimation, and the deviance information criterion (DIC) for model selection. We assessed the short-term predictive performance of the selected final model at several time points within the study period using the mean absolute percentage error. The best model included first-order random walk time-varying coefficients for the calendar trend and for the meteorological variables. Beyond the computational challenges, interpreting the results requires a complete analysis of the dengue time series with respect to the parameter estimates of the meteorological effects. We found small values of the mean absolute percentage error for one- or two-week out-of-sample predictions at most prediction points, associated with low-volatility periods in the dengue counts. We discuss the advantages and limitations of dynamic Poisson models for studying the association between time series of dengue disease and meteorological variables. The key conclusion of the study is that dynamic Poisson models account for the dynamic nature of the variables involved in the modeling of time series of dengue disease, producing useful models for decision-making in public health. PMID:28671941
Development of a Random Field Model for Gas Plume Detection in Multiple LWIR Images.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Heasler, Patrick G.
This report develops a random field model that describes gas plumes in LWIR remote sensing images. The random field model serves as a prior distribution that can be combined with LWIR data to produce a posterior that determines the probability that a gas plume exists in the scene and also maps the most probable location of any plume. The random field model is intended to work with a single pixel regression estimator--a regression model that estimates gas concentration on an individual pixel basis.
Martin, Guillaume; Magne, Marie-Angélina; Cristobal, Magali San
2017-01-01
The need to adapt to decrease farm vulnerability to adverse contextual events has been extensively discussed on a theoretical basis. We developed an integrated and operational method to assess farm vulnerability to multiple and interacting contextual changes and explain how this vulnerability can best be reduced according to farm configurations and farmers' technical adaptations over time. Our method considers farm vulnerability as a function of the raw measurements of vulnerability variables (e.g., economic efficiency of production), the slope of the linear regression of these measurements over time, and the residuals of this linear regression. The last two are extracted from linear mixed models considering a random regression coefficient (an intercept common to all farms), a global trend (a slope common to all farms), a random deviation from the general mean for each farm, and a random deviation from the general trend for each farm. Among all possible combinations, the lowest farm vulnerability is obtained through a combination of high values of measurements, a stable or increasing trend and low variability for all vulnerability variables considered. Our method enables relating the measurements, trends and residuals of vulnerability variables to explanatory variables that illustrate farm exposure to climatic and economic variability, initial farm configurations and farmers' technical adaptations over time. We applied our method to 19 cattle (beef, dairy, and mixed) farms over the period 2008-2013. Selected vulnerability variables, i.e., farm productivity and economic efficiency, varied greatly among cattle farms and across years, with means ranging from 43.0 to 270.0 kg protein/ha and 29.4-66.0% efficiency, respectively. No farm had a high level, stable or increasing trend and low residuals for both farm productivity and economic efficiency of production. Thus, the least vulnerable farms represented a compromise among measurement value, trend, and variability of both performances. No specific combination of farmers' practices emerged for reducing cattle farm vulnerability to climatic and economic variability. In the least vulnerable farms, the practices implemented (stocking rate, input use…) were more consistent with the objective of developing the properties targeted (efficiency, robustness…). Our method can be used to support farmers with sector-specific and local insights about most promising farm adaptations.
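A minimal sketch of the underlying mixed model (my own simulated farm-year data, illustrative variable names): a random intercept and a random slope on year per farm, from which farm-specific levels, trends and residual variability can be read off.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
farms, years = 19, np.arange(2008, 2014)
df = pd.DataFrame([(f, y) for f in range(farms) for y in years], columns=["farm", "year"])
df["t"] = df["year"] - 2008
level = rng.normal(0, 5, farms)[df["farm"]]             # farm-specific intercepts
trend = rng.normal(0, 1, farms)[df["farm"]]             # farm-specific slopes
df["efficiency"] = 45 + level + (1.0 + trend) * df["t"] + rng.normal(0, 2, len(df))

fit = smf.mixedlm("efficiency ~ t", df, groups=df["farm"], re_formula="~t").fit()
print(fit.params)             # overall intercept and global trend
print(fit.random_effects[0])  # one farm's deviation in intercept and slope
```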
Huynh-Tran, V H; Gilbert, H; David, I
2017-11-01
The objective of the present study was to compare a random regression model, usually used in genetic analyses of longitudinal data, with the structured antedependence (SAD) model to study the longitudinal feed conversion ratio (FCR) in growing Large White pigs and to propose criteria for animal selection when used for genetic evaluation. The study was based on data from 11,790 weekly FCR measures collected on 1,186 Large White male growing pigs. Random regression (RR) using orthogonal polynomial Legendre and SAD models was used to estimate genetic parameters and predict FCR-based EBV for each of the 10 wk of the test. The results demonstrated that the best SAD model (1 order of antedependence of degree 2 and a polynomial of degree 2 for the innovation variance for the genetic and permanent environmental effects, i.e., 12 parameters) provided a better fit for the data than RR with a quadratic function for the genetic and permanent environmental effects (13 parameters), with Bayesian information criteria values of -10,060 and -9,838, respectively. Heritabilities with the SAD model were higher than those of RR over the first 7 wk of the test. Genetic correlations between weeks were higher than 0.68 for short intervals between weeks and decreased to 0.08 for the SAD model and -0.39 for RR for the longest intervals. These differences in genetic parameters showed that, contrary to the RR approach, the SAD model does not suffer from border effect problems and can handle genetic correlations that tend to 0. Summarized breeding values were proposed for each approach as linear combinations of the individual weekly EBV weighted by the coefficients of the first or second eigenvector computed from the genetic covariance matrix of the additive genetic effects. These summarized breeding values isolated EBV trajectories over time, capturing either the average general value or the slope of the trajectory. Finally, applying the SAD model over a reduced period of time suggested that similar selection choices would result from the use of the records from the first 8 wk of the test. To conclude, the SAD model performed well for the genetic evaluation of longitudinal phenotypes.
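To make the 'summarized breeding value' idea concrete, the sketch below (entirely made-up covariance matrix and EBVs, not the Large White data) weights the weekly EBVs by the first and second eigenvectors of the genetic covariance matrix, separating overall level from trajectory slope.

```python
import numpy as np

rng = np.random.default_rng(4)
weeks = np.arange(10)
G = 0.5 ** np.abs(np.subtract.outer(weeks, weeks))          # hypothetical genetic covariance
ebv = rng.multivariate_normal(np.zeros(10), G, size=1186)   # weekly EBVs per animal

eigval, eigvec = np.linalg.eigh(G)                 # eigenvalues in ascending order
v1, v2 = eigvec[:, -1], eigvec[:, -2]              # first and second eigenvectors
summary_level = ebv @ v1                           # ~ average genetic level over the test
summary_trend = ebv @ v2                           # ~ slope of the EBV trajectory
print(summary_level[:3].round(2), summary_trend[:3].round(2))
```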
Use of Empirical Estimates of Shrinkage in Multiple Regression: A Caution.
ERIC Educational Resources Information Center
Kromrey, Jeffrey D.; Hines, Constance V.
1995-01-01
The accuracy of four empirical techniques to estimate shrinkage in multiple regression was studied through Monte Carlo simulation. None of the techniques provided unbiased estimates of the population squared multiple correlation coefficient, but the normalized jackknife and bootstrap techniques demonstrated marginally acceptable performance with…
Enhance-Synergism and Suppression Effects in Multiple Regression
ERIC Educational Resources Information Center
Lipovetsky, Stan; Conklin, W. Michael
2004-01-01
Relations between pairwise correlations and the coefficient of multiple determination in regression analysis are considered. The conditions for the occurrence of enhance-synergism and suppression effects when multiple determination becomes bigger than the total of squared correlations of the dependent variable with the regressors are discussed. It…
Soares, M P; Gaya, L G; Lorentz, L H; Batistel, F; Rovadoscki, G A; Ticiani, E; Zabot, V; Di Domenico, Q; Madureira, A P; Pértile, S F N
2011-09-06
Artificial insemination has been used to improve production in Brazilian dairy cattle; however, it can lead to problems due to increased inbreeding. To evaluate the effect of the magnitude of inbreeding coefficients on predicted transmitting abilities (PTAs) for milk traits of the Holstein and Jersey breeds, data on 392 Holstein and 92 Jersey sires used in Brazil were tabulated. Second-degree polynomial equations and points of maximum or minimum response were estimated to establish the regression of the variables as a function of the inbreeding coefficients. The mean inbreeding coefficient of the Holstein bulls was 5.10%; this did not significantly affect the PTAs for milk fat percentage, protein percentage and protein (P = 0.479, 0.058 and 0.087, respectively). However, the PTAs for milk yield and fat decreased significantly after reaching inbreeding coefficients of 6.43% (P = 0.034) and 5.75% (P = 0.007), respectively. The mean inbreeding coefficient of the Jersey bulls was 6.45%; the PTAs for milk yield, fat and protein, in pounds, decreased significantly after reaching inbreeding coefficients of 15.04, 9.83 and 12.82% (P < 0.001, P = 0.002, and P = 0.001, respectively). The linear regression was only significant for fat and protein percentages in the Jersey breed (P = 0.002 and P = 0.005, respectively). The PTAs of Holstein sires were more affected by smaller magnitudes of inbreeding coefficients than those of Jersey sires. It is necessary to monitor the inbreeding coefficients of sires used for artificial insemination in breeding schemes in Brazil, since the low genetic variability of the available sires may lead to reduced production.
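The reported turning points come from second-degree polynomial regressions; the sketch below (hypothetical numbers, not the Brazilian sire data) fits such a curve and locates its maximum at -b/(2a).

```python
import numpy as np

inbreeding = np.array([0, 2, 4, 6, 8, 10, 12, 14.0])              # % (hypothetical)
pta_milk = np.array([310, 330, 338, 342, 335, 320, 300, 272.0])   # lb (hypothetical)

a, b, c = np.polyfit(inbreeding, pta_milk, 2)    # PTA = a*F^2 + b*F + c
turning_point = -b / (2 * a)                     # inbreeding level where PTA peaks
print(f"PTA starts to decline at about {turning_point:.2f}% inbreeding")
```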
Braschel, Melissa C; Svec, Ivana; Darlington, Gerarda A; Donner, Allan
2016-04-01
Many investigators rely on previously published point estimates of the intraclass correlation coefficient rather than on their associated confidence intervals to determine the required size of a newly planned cluster randomized trial. Although confidence interval methods for the intraclass correlation coefficient that can be applied to community-based trials have been developed for a continuous outcome variable, fewer methods exist for a binary outcome variable. The aim of this study is to evaluate confidence interval methods for the intraclass correlation coefficient applied to binary outcomes in community intervention trials enrolling a small number of large clusters. Existing methods for confidence interval construction are examined and compared to a new ad hoc approach based on dividing clusters into a large number of smaller sub-clusters and subsequently applying existing methods to the resulting data. Monte Carlo simulation is used to assess the width and coverage of confidence intervals for the intraclass correlation coefficient based on Smith's large sample approximation of the standard error of the one-way analysis of variance estimator, an inverted modified Wald test for the Fleiss-Cuzick estimator, and intervals constructed using a bootstrap-t applied to a variance-stabilizing transformation of the intraclass correlation coefficient estimate. In addition, a new approach is applied in which clusters are randomly divided into a large number of smaller sub-clusters with the same methods applied to these data (with the exception of the bootstrap-t interval, which assumes large cluster sizes). These methods are also applied to a cluster randomized trial on adolescent tobacco use for illustration. When applied to a binary outcome variable in a small number of large clusters, existing confidence interval methods for the intraclass correlation coefficient provide poor coverage. However, confidence intervals constructed using the new approach combined with Smith's method provide nominal or close to nominal coverage when the intraclass correlation coefficient is small (<0.05), as is the case in most community intervention trials. This study concludes that when a binary outcome variable is measured in a small number of large clusters, confidence intervals for the intraclass correlation coefficient may be constructed by dividing existing clusters into sub-clusters (e.g. groups of 5) and using Smith's method. The resulting confidence intervals provide nominal or close to nominal coverage across a wide range of parameters when the intraclass correlation coefficient is small (<0.05). Application of this method should provide investigators with a better understanding of the uncertainty associated with a point estimator of the intraclass correlation coefficient used for determining the sample size needed for a newly designed community-based trial. © The Author(s) 2015.
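A rough sketch of the sub-clustering idea described above, using the standard one-way ANOVA estimator of the ICC as a simple stand-in for Smith's large-sample approach; the cluster sizes, prevalences, and sub-cluster size of 5 are all hypothetical.

```python
import numpy as np

def anova_icc(groups):
    """One-way ANOVA estimator of the ICC for a list of equal-sized groups."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    k = len(groups)
    m = len(groups[0])                      # assumes equal group sizes
    grand = np.mean(np.concatenate(groups))
    msb = m * sum((g.mean() - grand) ** 2 for g in groups) / (k - 1)
    msw = sum(((g - g.mean()) ** 2).sum() for g in groups) / (k * (m - 1))
    return (msb - msw) / (msb + (m - 1) * msw)

rng = np.random.default_rng(1)
# A few large clusters of binary outcomes (hypothetical community trial data).
clusters = [rng.binomial(1, p, size=200) for p in (0.10, 0.12, 0.09, 0.14)]

# Randomly divide each large cluster into sub-clusters of 5 and pool them.
sub_clusters = []
for c in clusters:
    c = rng.permutation(c)
    sub_clusters.extend(np.array_split(c, len(c) // 5))

print(f"ICC estimated from sub-clusters of 5: {anova_icc(sub_clusters):.4f}")
```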
Time trend of polycyclic aromatic hydrocarbon emission factors from motor vehicles
NASA Astrophysics Data System (ADS)
Tao, Shu; Shen, Huizhong; Wang, Rong; Sun, Kang
2010-05-01
Motor vehicles are an important emission source of polycyclic aromatic hydrocarbons (PAHs), particularly in urban areas. Motor vehicle emission factors (EFs) for individual PAH compounds reported in the literature vary by 4 to 5 orders of magnitude, leading to high uncertainty in emission estimation. In this study, the major factors affecting EFs were investigated and characterized by regression models. Based on the models developed, a country-level motor vehicle PAH emission inventory was compiled. It was found that country and model year are the most important factors affecting EFs for PAHs. The influence of these two factors can be quantified by a single parameter, per capita gross domestic product (purchasing power parity), which was used as the independent variable of the regression models. The models, developed using a randomly selected 80% of the measurements and tested with the remaining data, accounted for 28 to 48% of the variation in EFs for PAHs measured in 16 countries over 50 years. The regression coefficients of the EF prediction models were molecular-weight dependent. Motor vehicle emissions of PAHs from individual countries were calculated for 1985, 1995, 2005, 2015, and 2025; global emissions of total PAHs were 470, 390, and 430 Gg in 1985, 1995, and 2005, and are projected to be 290 and 130 Gg in 2015 and 2025, respectively. Emissions are currently passing their peak and will decrease owing to significant declines in China and other developing countries.
Liu, Yan; Salvendy, Gavriel
2009-05-01
This paper aims to demonstrate the effects of measurement errors on psychometric measurements in ergonomics studies. A variety of sources can cause random measurement errors in ergonomics studies, and these errors can distort virtually every statistic computed and lead investigators to erroneous conclusions. The effects of measurement errors on the five most widely used statistical analysis tools are discussed and illustrated: correlation, ANOVA, linear regression, factor analysis, and linear discriminant analysis. It is shown that measurement errors can greatly attenuate correlations between variables, reduce the statistical power of ANOVA, distort (overestimate, underestimate or even change the sign of) regression coefficients, understate the contributions of the most important factors in factor analysis, and weaken the significance of the discriminant function and the discrimination abilities of individual variables in discriminant analysis. The discussion is restricted to subjective scales and survey methods and their reliability estimates. Other methods applied in ergonomics research, such as physical and electrophysiological measurements and chemical and biomedical analysis methods, also have issues of measurement error, but they are beyond the scope of this paper. As there has been increasing interest in the development and testing of theories in ergonomics research, it has become very important for ergonomics researchers to understand the effects of measurement errors on their experimental results, which the authors believe is critical to progress in theory development and cumulative knowledge in the ergonomics field.
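The attenuation effect mentioned above follows the classical relation r_observed ≈ r_true · sqrt(reliability_x · reliability_y); the short simulation below (hypothetical reliabilities, unrelated to the paper's data) makes it concrete.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# True scores with a known correlation of 0.60.
r_true = 0.60
x_true = rng.normal(size=n)
y_true = r_true * x_true + np.sqrt(1 - r_true**2) * rng.normal(size=n)

# Add random measurement error so that each scale has reliability 0.70.
rel = 0.70
err_sd = np.sqrt((1 - rel) / rel)          # error variance relative to unit true-score variance
x_obs = x_true + err_sd * rng.normal(size=n)
y_obs = y_true + err_sd * rng.normal(size=n)

r_obs = np.corrcoef(x_obs, y_obs)[0, 1]
print(f"observed r = {r_obs:.3f}, classical prediction = {r_true * rel:.3f}")
```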
Influence of soil pH on the sorption of ionizable chemicals: modeling advances.
Franco, Antonio; Fu, Wenjing; Trapp, Stefan
2009-03-01
The soil-water distribution coefficient of ionizable chemicals (K(d)) depends on the soil acidity, mainly because the pH governs speciation. Using pH-specific K(d) values normalized to organic carbon (K(OC)) from the literature, a method was developed to estimate the K(OC) of monovalent organic acids and bases. The regression considers pH-dependent speciation and species-specific partition coefficients, calculated from the dissociation constant (pK(a)) and the octanol-water partition coefficient of the neutral molecule (log P(n)). Probably because of the lower pH near the organic colloid-water interface, the optimal pH for modeling dissociation was lower than the bulk soil pH. Knowledge of the soil pH allows calculation of the fractions of neutral and ionic molecules in the system, thus improving the existing regression for acids. The same approach was not successful for bases, for which pH has contrasting impacts on total sorption. In fact, the shortcomings of the model assumptions affect the predictive power for acids and for bases differently. We evaluated the accuracy and limitations of the regressions for their use in the environmental fate assessment of ionizable chemicals.
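The speciation step described above follows directly from the Henderson-Hasselbalch relation; the sketch below computes the neutral fraction of a monovalent acid and combines species-specific K(OC) values whose regression coefficients are placeholders, not the paper's fitted values.

```python
def neutral_fraction_acid(pH, pKa):
    """Fraction of a monovalent acid present as the neutral species (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))

def koc_acid(pH, pKa, logP_n, a=(0.5, 1.1), b=(0.1, 1.5)):
    """Hypothetical species-weighted K_OC: the (slope, intercept) pairs `a` (neutral species)
    and `b` (ionic species) are placeholder coefficients, not the published regression."""
    f_n = neutral_fraction_acid(pH, pKa)
    koc_neutral = 10.0 ** (a[0] * logP_n + a[1])
    koc_ionic = 10.0 ** (b[0] * logP_n + b[1])
    return f_n * koc_neutral + (1.0 - f_n) * koc_ionic

for pH in (4.5, 6.0, 7.5):
    print(pH, round(neutral_fraction_acid(pH, pKa=4.2), 3), round(koc_acid(pH, 4.2, logP_n=2.5), 1))
```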
The Use of Structure Coefficients to Address Multicollinearity in Sport and Exercise Science
ERIC Educational Resources Information Center
Yeatts, Paul E.; Barton, Mitch; Henson, Robin K.; Martin, Scott B.
2017-01-01
A common practice in general linear model (GLM) analyses is to interpret regression coefficients (e.g., standardized β weights) as indicators of variable importance. However, focusing solely on standardized beta weights may provide limited or erroneous information. For example, β weights become increasingly unreliable when predictor variables are…
Delgado, J; Liao, J C
1992-01-01
The methodology previously developed for determining the Flux Control Coefficients [Delgado & Liao (1992) Biochem. J. 282, 919-927] is extended to the calculation of metabolite Concentration Control Coefficients. It is shown that the transient metabolite concentrations are related by a few algebraic equations, attributed to mass balance, stoichiometric constraints, quasi-equilibrium or quasi-steady states, and kinetic regulations. The coefficients in these relations can be estimated using linear regression, and can be used to calculate the Control Coefficients. The theoretical basis and two examples are discussed. Although the methodology is derived based on the linear approximation of enzyme kinetics, it yields reasonably good estimates of the Control Coefficients for systems with non-linear kinetics. PMID:1497632
Monitoring Energy Balance in Breast Cancer Survivors Using a Mobile App: Reliability Study
Lozano-Lozano, Mario; Galiano-Castillo, Noelia; Martín-Martín, Lydia; Pace-Bedetti, Nicolás; Fernández-Lao, Carolina; Cantarero-Villanueva, Irene
2018-01-01
Background The majority of breast cancer survivors do not meet recommendations in terms of diet and physical activity. To address this problem, we developed a mobile health (mHealth) app for assessing and monitoring healthy lifestyles in breast cancer survivors, called the Energy Balance on Cancer (BENECA) mHealth system. The BENECA mHealth system is a novel and interactive mHealth app, which allows breast cancer survivors to engage themselves in their energy balance monitoring. BENECA was designed to facilitate adherence to healthy lifestyles in an easy and intuitive way. Objective The objective of the study was to assess the concurrent validity and test-retest reliability between the BENECA mHealth system and the gold standard assessment methods for diet and physical activity. Methods A reliability study was conducted with 20 breast cancer survivors. In the study, tri-axial accelerometers (ActiGraph GT3X+) were used as the gold standard for 8 consecutive days, in addition to two 24-hour dietary recalls, 4 dietary records, and sociodemographic questionnaires. Two-way random-effects intraclass correlation coefficients, a linear regression analysis, and a Passing-Bablok regression were calculated. Results The reliability estimates were very high for all variables (alpha≥.90). The lowest reliability was found for fruit and vegetable intake (alpha=.94). The agreement between the accelerometer and dietary assessment instruments and the BENECA system was very high (intraclass correlation coefficient=.90). We found a mean match rate of 93.51% between instruments and a mean phantom rate of 3.35%. The Passing-Bablok regression analysis did not show considerable bias in fat percentage, portions of fruits and vegetables, or minutes of moderate to vigorous physical activity. Conclusions The BENECA mHealth app could be a new tool to measure energy balance in breast cancer survivors in a reliable and simple way. Our results support the use of this technology not only to encourage changes in breast cancer survivors' lifestyles, but also to remotely monitor energy balance. Trial Registration ClinicalTrials.gov NCT02817724; https://clinicaltrials.gov/ct2/show/NCT02817724 (Archived by WebCite at http://www.webcitation.org/6xVY1buCc) PMID:29588273
2012-01-01
Background A discrete choice experiment (DCE) is a preference survey that asks participants to choose among product portfolios comparing key product characteristics across several choice tasks. Analysis of DCE data needs to account for within-participant correlation because choices from the same participant are likely to be similar. In this study, we empirically compared some commonly used statistical methods for analyzing DCE data while accounting for within-participant correlation, based on a survey of patient preferences for colorectal cancer (CRC) screening tests conducted in Hamilton, Ontario, Canada in 2002. Methods A two-stage DCE design was used to investigate the impact of six attributes on participants' preferences for a CRC screening test and willingness to undertake the test. We compared six models for clustered binary outcomes (logistic and probit regressions using cluster-robust standard errors (SE), random-effects and generalized estimating equation approaches) and three models for clustered nominal outcomes (multinomial logistic and probit regressions with cluster-robust SE and a random-effects multinomial logistic model). We also fitted a bivariate probit model with cluster-robust SE, treating the choices from the two stages as two correlated binary outcomes. The rank of relative importance between attributes and the estimates of the β coefficients within attributes were used to assess model robustness. Results In total, 468 participants, each completing 10 choice tasks, were analyzed. Similar results were obtained for the rank of relative importance and the β coefficients across models for the stage-one data evaluating participants' preferences for the test. The six attributes ranked from high to low as follows: cost, specificity, process, sensitivity, preparation and pain. However, the results differed across models for the stage-two data evaluating participants' willingness to undertake the tests. Little within-patient correlation (ICC ≈ 0) was found in the stage-one data, but substantial within-patient correlation existed (ICC = 0.659) in the stage-two data. Conclusions When a small clustering effect was present in the DCE data, results remained robust across statistical models. However, results varied when a larger clustering effect was present. Therefore, it is important to assess the robustness of the estimates via sensitivity analysis using different models for analyzing clustered data from DCE studies. PMID:22348526
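As one concrete way to account for within-participant correlation in clustered binary choices, the sketch below fits a GEE logistic model with an exchangeable working correlation using statsmodels; the data and variable names (choice, cost, sensitivity, participant) are hypothetical stand-ins for the survey's attributes, not the actual DCE design.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical long-format DCE data: one row per choice task per participant.
rng = np.random.default_rng(7)
n_part, n_tasks = 100, 10
df = pd.DataFrame({
    "participant": np.repeat(np.arange(n_part), n_tasks),
    "cost": rng.choice([10, 25, 50], size=n_part * n_tasks),
    "sensitivity": rng.choice([0.7, 0.8, 0.9], size=n_part * n_tasks),
})
u = np.repeat(rng.normal(scale=1.0, size=n_part), n_tasks)          # participant-level effect
logit = 1.0 - 0.04 * df["cost"] + 3.0 * (df["sensitivity"] - 0.8) + u
df["choice"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

# GEE logistic regression with an exchangeable within-participant correlation structure.
model = smf.gee("choice ~ cost + sensitivity", groups="participant", data=df,
                family=sm.families.Binomial(),
                cov_struct=sm.cov_struct.Exchangeable())
result = model.fit()
print(result.summary())
```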
Teaching Students Not to Dismiss the Outermost Observations in Regressions
ERIC Educational Resources Information Center
Kasprowicz, Tomasz; Musumeci, Jim
2015-01-01
One econometric rule of thumb is that greater dispersion in observations of the independent variable improves estimates of regression coefficients and therefore produces better results, i.e., lower standard errors of the estimates. Nevertheless, students often seem to mistrust precisely the observations that contribute the most to this greater…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mele, L.M.; Prodan, P.F.
1983-04-01
Hydrologic data were collected and analyzed for three coal refuse disposal sites in southern Illinois. The disposal sites were associated with underground mines and consisted of piles of coarse waste (gob) and slurry areas where fine waste rejected from coal washing was deposited. Prereclamation data were available for the Superior washer site in Macoupin County and the New Kathleen site in Perry County. Post-reclamation data were available for the Staunton 1 site in Macoupin County and the New Kathleen site. Data analyzed from each phase (i.e., pre- or post-reclamation) were limited to one year. Storm event runoff coefficients were calculated for each site. Average runoff coefficients were compared for sites within the same reclamation phase to determine the effects of topographical parameters such as gob pile slope and percentage of drainage basin covered by the gob pile. Average runoff coefficients were then compared for pre- and post-reclamation data. Multiple regression analyses were performed on rainfall-runoff data for each site to determine the significance of independent variables other than rainfall in determining runoff. A generalized regression equation corrected data for topographical differences and included only those independent variables that were significant at all sites. Regression coefficients were compared for pre- and post-reclamation sites. The results of rainfall-runoff analysis indicate that the runoff coefficient increases because of reclamation. It is hypothesized that this effect is due to the placement of a soil cover that is less permeable than gob or slurry and occurs despite reduction in slope and the establishment of vegetation.
van Mil, Anke C C M; Greyling, Arno; Zock, Peter L; Geleijnse, Johanna M; Hopman, Maria T; Mensink, Ronald P; Reesink, Koen D; Green, Daniel J; Ghiadoni, Lorenzo; Thijssen, Dick H
2016-09-01
Brachial artery flow-mediated dilation (FMD) is a popular technique to examine endothelial function in humans. Identifying volunteer and methodological factors related to variation in FMD is important to improve measurement accuracy and applicability. Volunteer-related and methodology-related parameters were collected in 672 volunteers from eight affiliated centres worldwide who underwent repeated measures of FMD. All centres adopted contemporary expert-consensus guidelines for FMD assessment. After calculating the coefficient of variation (%) of the FMD for each individual, we constructed quartiles (n = 168 per quartile). Based on two regression models (volunteer-related factors and methodology-related factors), statistically significant components of these two models were added to a final regression model (calculated as β-coefficient and R). This allowed us to identify factors that independently contributed to the variation in FMD%. The median coefficient of variation was 17.5%, with healthy volunteers demonstrating a coefficient of variation of 9.3%. Regression models revealed age (β = 0.248, P < 0.001), hypertension (β = 0.104, P < 0.001), dyslipidemia (β = 0.331, P < 0.001), time between measurements (β = 0.318, P < 0.001), lab experience (β = -0.133, P < 0.001) and baseline FMD% (β = 0.082, P < 0.05) as contributors to the coefficient of variation. After including all significant factors in the final model, we found that time between measurements, hypertension, baseline FMD% and lab experience with FMD independently predicted brachial artery variability (total R = 0.202). Although FMD% showed good reproducibility, larger variation was observed in conditions with longer time between measurements, hypertension, less experience and lower baseline FMD%. Accounting for these factors may reduce FMD% variability.
Regression Simulation Model. Appendix X. Users Manual,
1981-03-01
change as the prediction equations become refined. Whereas no notice will be provided when the changes are made, the programs will be modified such that... Regression Simulation Model Users Manual, submitted to The Great River... regression analysis and to establish a prediction equation (model). The prediction equation contains the partial regression coefficients (B-weights) which
Reimus, Paul W; Callahan, Timothy J; Ware, S Doug; Haga, Marc J; Counce, Dale A
2007-08-15
Diffusion cell experiments were conducted to measure nonsorbing solute matrix diffusion coefficients in forty-seven different volcanic rock matrix samples from eight different locations (with multiple depth intervals represented at several locations) at the Nevada Test Site. The solutes used in the experiments included bromide, iodide, pentafluorobenzoate (PFBA), and tritiated water ((3)HHO). The porosity and saturated permeability of most of the diffusion cell samples were measured to evaluate the correlation of these two variables with tracer matrix diffusion coefficients divided by the free-water diffusion coefficient (D(m)/D*). To investigate the influence of fracture coating minerals on matrix diffusion, ten of the diffusion cells represented paired samples from the same depth interval in which one sample contained a fracture surface with mineral coatings and the other sample consisted of only pure matrix. The log of (D(m)/D*) was found to be positively correlated with both the matrix porosity and the log of matrix permeability. A multiple linear regression analysis indicated that both parameters contributed significantly to the regression at the 95% confidence level. However, the log of the matrix diffusion coefficient was more highly-correlated with the log of matrix permeability than with matrix porosity, which suggests that matrix diffusion coefficients, like matrix permeabilities, have a greater dependence on the interconnectedness of matrix porosity than on the matrix porosity itself. The regression equation for the volcanic rocks was found to provide satisfactory predictions of log(D(m)/D*) for other types of rocks with similar ranges of matrix porosity and permeability as the volcanic rocks, but it did a poorer job predicting log(D(m)/D*) for rocks with lower porosities and/or permeabilities. The presence of mineral coatings on fracture walls did not appear to have a significant effect on matrix diffusion in the ten paired diffusion cell experiments.
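A minimal sketch of the kind of two-predictor regression described above, with made-up porosity, permeability, and diffusion values standing in for the diffusion-cell measurements:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 47

# Hypothetical matrix properties: porosity (fraction) and log10 permeability.
porosity = rng.uniform(0.05, 0.40, size=n)
log_perm = rng.uniform(-18, -13, size=n)

# Hypothetical log10(Dm/D*) generated so that both predictors contribute.
log_dm_ratio = -2.0 + 1.5 * porosity + 0.12 * log_perm + rng.normal(scale=0.1, size=n)

# Multiple linear regression: log10(Dm/D*) ~ porosity + log10(permeability).
X = np.column_stack([np.ones(n), porosity, log_perm])
coef, *_ = np.linalg.lstsq(X, log_dm_ratio, rcond=None)
pred = X @ coef
r2 = 1 - np.sum((log_dm_ratio - pred) ** 2) / np.sum((log_dm_ratio - log_dm_ratio.mean()) ** 2)
print("intercept, b_porosity, b_logk:", np.round(coef, 3), " R^2:", round(r2, 3))
```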
NASA Astrophysics Data System (ADS)
Rocha, Alby D.; Groen, Thomas A.; Skidmore, Andrew K.; Darvishzadeh, Roshanak; Willemen, Louise
2017-11-01
The growing number of narrow spectral bands in hyperspectral remote sensing improves the capacity to describe and predict biological processes in ecosystems. But it also poses a challenge for fitting empirical models based on such high-dimensional data, which often contain correlated and noisy predictors. As sample sizes for training and validating empirical models do not seem to be increasing at the same rate, overfitting has become a serious concern. Overly complex models lead to overfitting by capturing more than the underlying relationship, and also through fitting random noise in the data. Many regression techniques claim to overcome these problems by using different strategies to constrain complexity, such as limiting the number of terms in the model, creating latent variables or shrinking parameter coefficients. This paper proposes a new method, named Naïve Overfitting Index Selection (NOIS), which makes use of artificially generated spectra to quantify the relative model overfitting and to select an optimal model complexity supported by the data. The robustness of this new method is assessed by comparing it to a traditional model selection based on cross-validation. The optimal model complexity is determined for seven different regression techniques, such as partial least squares regression, support vector machine, artificial neural network and tree-based regressions, using five hyperspectral datasets. The NOIS method selects less complex models, which present accuracies similar to the cross-validation method. The NOIS method reduces the chance of overfitting, thereby avoiding models that give accurate predictions that are only valid for the data used and that are too complex to support inferences about the underlying process.
Predicting active-layer soil thickness using topographic variables at a small watershed scale
Li, Aidi; Tan, Xing; Wu, Wei; Liu, Hongbin; Zhu, Jie
2017-01-01
Knowledge about the spatial distribution of active-layer (AL) soil thickness is indispensable for ecological modeling, precision agriculture, and land resource management. However, it is difficult to obtain details on AL soil thickness using conventional soil survey methods. In this research, the objective is to investigate the possibility and accuracy of mapping the spatial distribution of AL soil thickness with a random forest (RF) model using terrain variables at a small watershed scale. A total of 1113 soil samples collected from the slope fields were randomly divided into calibration (770 soil samples) and validation (343 soil samples) sets. Seven terrain variables including elevation, aspect, relative slope position, valley depth, flow path length, slope height, and topographic wetness index were derived from a digital elevation model (DEM, 30 m). The RF model was compared with multiple linear regression (MLR), geographically weighted regression (GWR) and support vector machine (SVM) approaches based on the validation set. Model performance was evaluated by the precision criteria of mean error (ME), mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R2). Comparative results showed that RF outperformed the MLR, GWR and SVM models. The RF gave better values of ME (0.39 cm), MAE (7.09 cm), and RMSE (10.85 cm) and a higher R2 (62%). The sensitivity analysis demonstrated that the DEM had less uncertainty than the AL soil thickness. The outcome of the RF model indicated that elevation, flow path length and valley depth were the most important factors affecting AL soil thickness variability across the watershed. These results demonstrate that the RF model is a promising method for predicting the spatial distribution of AL soil thickness using terrain parameters. PMID:28877196
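A minimal sketch of this kind of terrain-based prediction, assuming scikit-learn and synthetic stand-ins for the seven terrain variables (the column names mirror those listed above, but the data are simulated, not the watershed samples):

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1113
X = pd.DataFrame({
    "elevation": rng.uniform(200, 800, n),
    "aspect": rng.uniform(0, 360, n),
    "rel_slope_pos": rng.uniform(0, 1, n),
    "valley_depth": rng.uniform(0, 60, n),
    "flow_path_len": rng.uniform(10, 500, n),
    "slope_height": rng.uniform(0, 80, n),
    "twi": rng.uniform(2, 15, n),
})
# Synthetic active-layer soil thickness (cm), loosely driven by a few terrain variables.
y = (60 - 0.04 * X["elevation"] + 0.05 * X["flow_path_len"]
     + 0.3 * X["valley_depth"] + rng.normal(scale=8, size=n))

X_cal, X_val, y_cal, y_val = train_test_split(X, y, test_size=343, random_state=42)
rf = RandomForestRegressor(n_estimators=500, random_state=42).fit(X_cal, y_cal)
pred = rf.predict(X_val)

me = np.mean(pred - y_val)                       # mean error
mae = mean_absolute_error(y_val, pred)
rmse = np.sqrt(mean_squared_error(y_val, pred))
print(f"ME={me:.2f}  MAE={mae:.2f}  RMSE={rmse:.2f}  R2={r2_score(y_val, pred):.2f}")
print(dict(zip(X.columns, np.round(rf.feature_importances_, 3))))
```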
Prevalence of kidney stones and associated risk factors in the Shunyi District of Beijing, China.
Jiang, Y G; He, L H; Luo, G T; Zhang, X D
2017-10-01
Kidney stone formation is a multifactorial condition that involves the interaction of environmental and genetic factors. The presence of kidney stones is strongly related to other diseases, which may result in a heavy economic and social burden. Clinical data on the prevalence of and influencing factors in kidney stone disease in the north of China are scarce. In this study, we explored the prevalence of kidney stones and potentially associated risk factors in the Shunyi District of Beijing, China. A population-based cross-sectional study was conducted from December 2011 to November 2012 in a northern area of China. Participants were interviewed in randomly selected towns. Univariate analysis of continuous and categorical variables was first performed by calculation of Spearman's correlation coefficient and the Pearson chi-squared statistic, respectively. Variables with statistical significance were further analysed by multivariate logistic regression to explore the potential influencing factors. A total of 3350 participants (1091 males and 2259 females) completed the survey, and the response rate was 99.67%. Among the participants, 3.61% were diagnosed with kidney stones. Univariate analysis showed that significant differences were evident in 31 variables. Blood and urine tests were performed in 100 randomly selected patients with kidney stones and 100 healthy controls. Serum creatinine, calcium, and uric acid were significantly different between the patients with kidney stones and healthy controls. Multivariate logistic regression revealed that being male (odds ratio = 102.681; 95% confidence interval, 1.062-9925.797), daily intake of white spirits (6.331; 1.204-33.282), and a history of urolithiasis (1797.775; 24.228-133,396.982) were factors potentially associated with kidney stone prevalence. Male gender, drinking white spirits, and a history of urolithiasis are potentially associated with kidney stone formation.
Zhang, Yiyan; Xin, Yi; Li, Qin; Ma, Jianshe; Li, Shuai; Lv, Xiaodan; Lv, Weiqi
2017-11-02
Various data mining algorithms continue to be proposed as related disciplines develop. These algorithms differ in their applicable scopes and performance. Hence, finding a suitable algorithm for a given dataset has become important for biomedical researchers seeking to solve practical problems promptly. In this paper, seven established algorithms, namely C4.5, support vector machine, AdaBoost, k-nearest neighbor, naïve Bayes, random forest, and logistic regression, were selected as the research objects. The seven algorithms were applied to the 12 most frequently accessed UCI public datasets with the task of classification, and their performances were compared through induction and analysis. The sample size, number of attributes, number of missing values, sample size of each class, correlation coefficients between variables, class entropy of the task variable, and the ratio of the largest to the smallest class size were calculated to characterize the 12 research datasets. The two ensemble algorithms reached high classification accuracy on most datasets. Moreover, random forest performed better than AdaBoost on the unbalanced dataset of the multi-class task. Simple algorithms, such as naïve Bayes and logistic regression, are suitable for small datasets with high correlation between the task variable and the other, non-task attribute variables. The k-nearest neighbor and C4.5 decision tree algorithms performed well on binary- and multi-class task datasets. The support vector machine was more adept on balanced small datasets of the binary-class task. No algorithm maintained the best performance on all datasets. The applicability of the seven data mining algorithms to datasets with different characteristics was summarized to provide a reference for biomedical researchers or beginners in different fields.
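A minimal sketch of this kind of benchmark, assuming scikit-learn and one of its bundled datasets rather than the 12 UCI datasets; scikit-learn's CART decision tree stands in for C4.5.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Stand-ins for the seven algorithms compared in the paper.
models = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "kNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "naive Bayes": GaussianNB(),
    "random forest": RandomForestClassifier(random_state=0),
    "logistic regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
}

for name, clf in models.items():
    scores = cross_val_score(clf, X, y, cv=10, scoring="accuracy")
    print(f"{name:20s} mean accuracy = {scores.mean():.3f}")
```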
Multiple imputation for cure rate quantile regression with censored data.
Wu, Yuanshan; Yin, Guosheng
2017-03-01
The main challenge in the context of cure rate analysis is that one never knows whether censored subjects are cured or uncured, or whether they are susceptible or insusceptible to the event of interest. Considering the susceptible indicator as missing data, we propose a multiple imputation approach to cure rate quantile regression for censored data with a survival fraction. We develop an iterative algorithm to estimate the conditionally uncured probability for each subject. By utilizing this estimated probability and Bernoulli sample imputation, we can classify each subject as cured or uncured, and then employ the locally weighted method to estimate the quantile regression coefficients with only the uncured subjects. Repeating the imputation procedure multiple times and taking an average over the resultant estimators, we obtain consistent estimators for the quantile regression coefficients. Our approach relaxes the usual global linearity assumption, so that we can apply quantile regression to any particular quantile of interest. We establish asymptotic properties for the proposed estimators, including both consistency and asymptotic normality. We conduct simulation studies to assess the finite-sample performance of the proposed multiple imputation method and apply it to a lung cancer study as an illustration. © 2016, The International Biometric Society.
Generalized and synthetic regression estimators for randomized branch sampling
David L. R. Affleck; Timothy G. Gregoire
2015-01-01
In felled-tree studies, ratio and regression estimators are commonly used to convert more readily measured branch characteristics to dry crown mass estimates. In some cases, data from multiple trees are pooled to form these estimates. This research evaluates the utility of both tactics in the estimation of crown biomass following randomized branch sampling (...
Multilevel covariance regression with correlated random effects in the mean and variance structure.
Quintero, Adrian; Lesaffre, Emmanuel
2017-09-01
Multivariate regression methods generally assume a constant covariance matrix for the observations. When a heteroscedastic model is needed, the parametric and nonparametric covariance regression approaches available in the literature can be restrictive. We propose a multilevel regression model for the mean and covariance structure, including random intercepts in both components and allowing for correlation between them. The implied conditional covariance function can differ across clusters as a result of the random effect in the variance structure. In addition, allowing for correlation between the random intercepts in the mean and covariance makes the model convenient for skewed responses. Furthermore, it permits us to analyse directly the relation between the mean response level and the variability in each cluster. Parameter estimation is carried out via Gibbs sampling. We compare the performance of our model to other covariance modelling approaches in a simulation study. Finally, the proposed model is applied to the RN4CAST dataset to identify the variables that impact burnout of nurses in Belgium. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Estimation of subsurface thermal structure using sea surface height and sea surface temperature
NASA Technical Reports Server (NTRS)
Kang, Yong Q. (Inventor); Jo, Young-Heon (Inventor); Yan, Xiao-Hai (Inventor)
2012-01-01
A method of determining a subsurface temperature in a body of water is disclosed. The method includes obtaining surface temperature anomaly data and surface height anomaly data of the body of water for a region of interest, and also obtaining subsurface temperature anomaly data for the region of interest at a plurality of depths. The method further includes regressing the obtained surface temperature anomaly data and surface height anomaly data for the region of interest with the obtained subsurface temperature anomaly data for the plurality of depths to generate regression coefficients, estimating a subsurface temperature at one or more other depths for the region of interest based on the generated regression coefficients and outputting the estimated subsurface temperature at the one or more other depths. Using the estimated subsurface temperature, signal propagation times and trajectories of marine life in the body of water are determined.
NASA Technical Reports Server (NTRS)
Rogers, R. H. (Principal Investigator)
1976-01-01
The author has identified the following significant results. Computer techniques were developed for mapping water quality parameters from LANDSAT data, using surface samples collected in an ongoing survey of water quality in Saginaw Bay. Chemical and biological parameters were measured on 31 July 1975 at 16 bay stations in concert with the LANDSAT overflight. Application of stepwise linear regression to nine of these parameters and the corresponding LANDSAT measurements for bands 4 and 5 only resulted in regression correlation coefficients that varied from 0.94 for temperature to 0.73 for Secchi depth. Regression equations expressed with the pair of bands 4 and 5, rather than the ratio band 4/band 5, provided higher correlation coefficients for all the water quality parameters studied (temperature, Secchi depth, chloride, conductivity, total Kjeldahl nitrogen, total phosphorus, chlorophyll a, total solids, and suspended solids).
Prediction of anthropometric foot characteristics in children.
Morrison, Stewart C; Durward, Brian R; Watt, Gordon F; Donaldson, Malcolm D C
2009-01-01
The establishment of growth reference values is needed in pediatric practice where pathologic conditions can have a detrimental effect on the growth and development of the pediatric foot. This study aims to use multiple regression to evaluate the effects of multiple predictor variables (height, age, body mass, and gender) on anthropometric characteristics of the peripubescent foot. Two hundred children aged 9 to 12 years were recruited, and three anthropometric measurements of the pediatric foot were recorded (foot length, forefoot width, and navicular height). Multiple regression analysis was conducted, and coefficients for gender, height, and body mass all had significant relationships for the prediction of forefoot width and foot length (P ≤ .05, r ≥ 0.7). The coefficients for gender and body mass were not significant for the prediction of navicular height (P ≥ .05), whereas height was (P ≤ .05). Normative growth reference values and prognostic regression equations are presented for the peripubescent foot.
Measurement of the absorption coefficient using the sound-intensity technique
NASA Technical Reports Server (NTRS)
Atwal, M.; Bernhard, R.
1984-01-01
The possibility of using the sound intensity technique to measure the absorption coefficient of a material is investigated. This technique measures the absorption coefficient by measuring the intensity incident on the sample and the net intensity reflected by the sample. Results obtained by this technique are compared with the standard techniques of measuring the change in reverberation time and the standing wave ratio in a tube, thereby calculating the random-incidence and normal-incidence absorption coefficients.
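Under one common energy-based definition consistent with the description above, the absorption coefficient follows directly from the incident and reflected intensities; a tiny illustration with made-up numbers:

```python
def absorption_coefficient(intensity_incident, intensity_reflected):
    """Fraction of incident sound energy absorbed by the sample:
    alpha = 1 - I_reflected / I_incident (energy-based definition)."""
    return 1.0 - intensity_reflected / intensity_incident

# Hypothetical intensities in W/m^2 measured in front of the sample.
print(absorption_coefficient(2.0e-4, 0.6e-4))   # alpha = 0.7
```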
Deorientation of PolSAR coherency matrix for volume scattering retrieval
NASA Astrophysics Data System (ADS)
Kumar, Shashi; Garg, R. D.; Kushwaha, S. P. S.
2016-05-01
Polarimetric SAR data have proven their potential to extract scattering information for different features appearing in a single resolution cell. Several decomposition modelling approaches have been developed to retrieve scattering information from PolSAR data. During scattering power decomposition based on physical scattering models, it becomes very difficult to distinguish volume scattering caused by randomly oriented vegetation from the scattering of oblique structures, which produce both double-bounce and volume scattering, because both are decomposed into the same scattering mechanism. The polarization orientation angle (POA) of an electromagnetic wave is one of the most important characteristics that changes due to scattering from the geometrical structure of topographic slopes, oriented urban areas and randomly oriented features such as vegetation cover. The shift in POA affects polarimetric radar signatures, so compensation of the polarization orientation shift is essential for accurate estimation of the scattering nature of a feature. The prime objectives of this work were to investigate the effect of the shift in POA on scattering information retrieval and to explore the effect of deorientation on the regression between field-estimated aboveground biomass (AGB) and volume scattering. Dudhwa National Park, U.P., India was selected as the study area, and fully polarimetric ALOS PALSAR data were used to retrieve scattering information from the forest area of Dudhwa National Park. Field data on DBH and tree height were collected for AGB estimation using stratified random sampling. AGB was estimated for 170 plots at different locations in the forest area. The Yamaguchi four-component decomposition modelling approach was utilized to retrieve surface, double-bounce, helix and volume scattering information. The shift in polarization orientation angle was estimated, and deorientation of the coherency matrix was performed to compensate for the POA shift. The effect of deorientation on the RGB color composite of the forest area is readily apparent. Overestimation of volume scattering and underestimation of double-bounce scattering were recorded for PolSAR decomposition without deorientation, whereas an increase in double-bounce scattering and a decrease in volume scattering were noticed after deorientation. This study was mainly focused on volume scattering retrieval and its relation with field-estimated AGB. The change in volume scattering after POA compensation of the PolSAR data was recorded, and volume scattering values were compared for all 170 forest plots for which field data were collected. A decrease in volume scattering after deorientation was noted for all plots. Regression between PolSAR decomposition-based volume scattering and AGB was performed. Before deorientation, the coefficient of determination (R2) between volume scattering and AGB was 0.225; after deorientation it improved to 0.613. This study recommends deorientation of PolSAR data for decomposition modelling to retrieve reliable volume scattering information from forest areas.
NeCamp, Timothy; Kilbourne, Amy; Almirall, Daniel
2017-08-01
Cluster-level dynamic treatment regimens can be used to guide sequential treatment decision-making at the cluster level in order to improve outcomes at the individual or patient-level. In a cluster-level dynamic treatment regimen, the treatment is potentially adapted and re-adapted over time based on changes in the cluster that could be impacted by prior intervention, including aggregate measures of the individuals or patients that compose it. Cluster-randomized sequential multiple assignment randomized trials can be used to answer multiple open questions preventing scientists from developing high-quality cluster-level dynamic treatment regimens. In a cluster-randomized sequential multiple assignment randomized trial, sequential randomizations occur at the cluster level and outcomes are observed at the individual level. This manuscript makes two contributions to the design and analysis of cluster-randomized sequential multiple assignment randomized trials. First, a weighted least squares regression approach is proposed for comparing the mean of a patient-level outcome between the cluster-level dynamic treatment regimens embedded in a sequential multiple assignment randomized trial. The regression approach facilitates the use of baseline covariates which is often critical in the analysis of cluster-level trials. Second, sample size calculators are derived for two common cluster-randomized sequential multiple assignment randomized trial designs for use when the primary aim is a between-dynamic treatment regimen comparison of the mean of a continuous patient-level outcome. The methods are motivated by the Adaptive Implementation of Effective Programs Trial which is, to our knowledge, the first-ever cluster-randomized sequential multiple assignment randomized trial in psychiatry.
Ran, Tao; Liu, Yong; Li, Hengzhi; Tang, Shaoxun; He, Zhixiong; Munteanu, Cristian R; González-Díaz, Humberto; Tan, Zhiliang; Zhou, Chuanshe
2016-07-27
The management of ruminant growth yield has economic importance. The current work presents a study of the spatiotemporal dynamic expression of Ghrelin and GHR at mRNA levels throughout the gastrointestinal tract (GIT) of kid goats under housing and grazing systems. The experiments show that the feeding system and age affected the expression of either Ghrelin or GHR through different mechanisms. Furthermore, the experimental data are used to build new Machine Learning models based on the Perturbation Theory, which can predict the effects of perturbations of Ghrelin and GHR mRNA expression on the growth yield. The models consider eight longitudinal GIT segments (rumen, abomasum, duodenum, jejunum, ileum, cecum, colon and rectum), seven time points (0, 7, 14, 28, 42, 56 and 70 d) and two feeding systems (Supplemental and Grazing feeding) as perturbations from the expected values of the growth yield. The best regression model was obtained using Random Forest, with a coefficient of determination R(2) of 0.781 for the test subset. The current results indicate that the non-linear regression model can accurately predict the growth yield and the key nodes during gastrointestinal development, which is helpful for optimizing feeding management strategies in ruminant production systems.
Association of serum uric acid with high-sensitivity C-reactive protein in postmenopausal women.
Raeisi, A; Ostovar, A; Vahdat, K; Rezaei, P; Darabi, H; Moshtaghi, D; Nabipour, I
2017-02-01
To explore the independent correlation between serum uric acid and low-grade inflammation (measured by high-sensitivity C-reactive protein, hs-CRP) in postmenopausal women. A total of 378 healthy Iranian postmenopausal women were randomly selected in a population-based study. Circulating hs-CRP levels were measured by a highly specific enzyme-linked immunosorbent assay, and an enzymatic colorimetric method was used to measure serum levels of uric acid. Pearson correlation coefficients, multiple linear regression and logistic regression models were used to analyze the association between uric acid and hs-CRP levels. A statistically significant correlation was seen between serum levels of uric acid and log-transformed circulating hs-CRP (r = 0.25, p < 0.001). After adjustment for age and cardiovascular risk factors (according to NCEP ATP III criteria), circulating hs-CRP levels were significantly associated with serum uric acid levels (β = 0.20, p < 0.001). After adjustment for age and cardiovascular risk factors, hs-CRP levels ≥3 mg/l were significantly associated with higher uric acid levels (odds ratio = 1.52, 95% confidence interval 1.18-1.96). Higher serum uric acid levels were positively and independently associated with circulating hs-CRP in healthy postmenopausal women.
Magura, Stephen; Cleland, Charles M; Tonigan, J Scott
2013-05-01
The objective of the study is to determine whether Alcoholics Anonymous (AA) participation leads to reduced drinking and problems related to drinking within Project MATCH (Matching Alcoholism Treatments to Client Heterogeneity), an existing national alcoholism treatment data set. The method used is structural equation modeling of panel data with cross-lagged partial regression coefficients. The main advantage of this technique for the analysis of AA outcomes is that potential reciprocal causation between AA participation and drinking behavior can be explicitly modeled through the specification of finite causal lags. For the outpatient subsample (n = 952), the results strongly support the hypothesis that AA attendance leads to increases in alcohol abstinence and reduces drinking/problems, whereas a causal effect in the reverse direction is unsupported. For the aftercare subsample (n = 774), the results are not as clear but also suggest that AA attendance leads to better outcomes. Although randomized controlled trials are the surest means of establishing causal relations between interventions and outcomes, such trials are rare in AA research for practical reasons. The current study successfully exploited the multiple data waves in Project MATCH to examine evidence of causality between AA participation and drinking outcomes. The study obtained unique statistical results supporting the effectiveness of AA primarily in the context of primary outpatient treatment for alcoholism.
Active microwave remote sensing of an anisotropic random medium layer
NASA Technical Reports Server (NTRS)
Lee, J. K.; Kong, J. A.
1985-01-01
A two-layer anisotropic random medium model has been developed to study the active remote sensing of the earth. The dyadic Green's function for a two-layer anisotropic medium is developed and used in conjunction with the first-order Born approximation to calculate the backscattering coefficients. It is shown that strong cross-polarization occurs in the single scattering process and is indispensable in the interpretation of radar measurements of sea ice at different frequencies, polarizations, and viewing angles. The effects of anisotropy on the angular responses of backscattering coefficients are also illustrated.
Modeling of Thermal Phase Noise in a Solid Core Photonic Crystal Fiber-Optic Gyroscope.
Song, Ningfang; Ma, Kun; Jin, Jing; Teng, Fei; Cai, Wei
2017-10-26
A theoretical model of the thermal phase noise in a square-wave modulated solid core photonic crystal fiber-optic gyroscope has been established, and then verified by measurements. The results demonstrate a good agreement between theory and experiment. The contribution of the thermal phase noise to the random walk coefficient of the gyroscope is derived. A fiber coil with 2.8 km length is used in the experimental solid core photonic crystal fiber-optic gyroscope, showing a random walk coefficient of 9.25 × 10⁻⁵ deg/√h.
A scattering model for forested area
NASA Technical Reports Server (NTRS)
Karam, M. A.; Fung, A. K.
1988-01-01
A forested area is modeled as a volume of randomly oriented and distributed disc-shaped or needle-shaped leaves shading a distribution of branches modeled as randomly oriented finite-length dielectric cylinders above an irregular soil surface. Since the radii of branches have a wide range of sizes, the model only requires the length of a branch to be large compared with its radius, which may be any size relative to the incident wavelength. In addition, the model also assumes the thickness of a disc-shaped leaf or the radius of a needle-shaped leaf is much smaller than the electromagnetic wavelength. The scattering phase matrices for disc, needle, and cylinder are developed in terms of the scattering amplitudes of the corresponding fields, which are computed by the forward scattering theorem. These quantities, along with the Kirchhoff scattering model for a randomly rough surface, are used in the standard radiative transfer formulation to compute the backscattering coefficient. Numerical illustrations for the backscattering coefficient are given as a function of the shading factor, incidence angle, leaf orientation distribution, branch orientation distribution, and the number density of leaves. Also illustrated are the properties of the extinction coefficient as a function of leaf and branch orientation distributions. Comparisons are made with measured backscattering coefficients from forested areas reported in the literature.
Testing homogeneity in Weibull-regression models.
Bolfarine, Heleno; Valença, Dione M
2005-10-01
In survival studies with families or geographical units it may be of interest to test whether such groups are homogeneous for given explanatory variables. In this paper we consider score-type tests for group homogeneity based on a mixing model in which the group effect is modelled as a random variable. As opposed to hazard-based frailty models, this model yields survival times that, conditional on the random effect, have an accelerated failure time representation. The test statistic requires only estimation of the conventional regression model without the random effect and does not require specifying the distribution of the random effect. The tests are derived for a Weibull regression model, and in the uncensored situation a closed form is obtained for the test statistic. A simulation study is used for comparing the power of the tests. The proposed tests are applied to real data sets with censored data.
Effect of Contact Damage on the Strength of Ceramic Materials.
1982-10-01
variables that are important to erosion, and a multivariate linear regression analysis is used to fit the data to the dimensional analysis. The... of Equations 7 and 8 by a multivariable regression analysis (room temperature data)...
Mean centering, multicollinearity, and moderators in multiple regression: The reconciliation redux.
Iacobucci, Dawn; Schneider, Matthew J; Popovich, Deidre L; Bakamitsos, Georgios A
2017-02-01
In this article, we attempt to clarify our statements regarding the effects of mean centering. In a multiple regression with predictors A, B, and A × B (where A × B serves as an interaction term), mean centering A and B prior to computing the product term can clarify the regression coefficients (which is good) and the overall model fit R2 will remain undisturbed (which is also good).
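A quick numerical check of the claim, assuming statsmodels and synthetic data: centering A and B before forming A × B changes the interpretation of the lower-order coefficients but leaves the model fit R2 (and the interaction coefficient) untouched.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({"A": rng.normal(5, 2, n), "B": rng.normal(10, 3, n)})
df["y"] = 1 + 0.5 * df["A"] + 0.3 * df["B"] + 0.2 * df["A"] * df["B"] + rng.normal(size=n)

# Raw predictors vs. mean-centered predictors, each with its product (interaction) term.
df["Ac"] = df["A"] - df["A"].mean()
df["Bc"] = df["B"] - df["B"].mean()

raw = smf.ols("y ~ A + B + A:B", data=df).fit()
centered = smf.ols("y ~ Ac + Bc + Ac:Bc", data=df).fit()

print("R2 raw      :", round(raw.rsquared, 6))
print("R2 centered :", round(centered.rsquared, 6))           # identical fit
print("interaction raw vs centered:",
      round(raw.params["A:B"], 4), round(centered.params["Ac:Bc"], 4))  # also identical
```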
NASA Astrophysics Data System (ADS)
Gholizadeh, H.; Robeson, S. M.
2015-12-01
Empirical models have been widely used to estimate global chlorophyll content from remotely sensed data. Here, we focus on the standard NASA empirical models that use blue-green band ratios. These band-ratio ocean color (OC) algorithms are in the form of fourth-order polynomials, and the parameters of these polynomials (i.e., coefficients) are estimated from the NASA bio-Optical Marine Algorithm Data set (NOMAD). Most of the points in this data set have been sampled from tropical and temperate regions. However, polynomial coefficients obtained from this data set are used to estimate chlorophyll content in all ocean regions, with different properties such as sea-surface temperature, salinity, and downwelling/upwelling patterns. Further, the polynomial terms in these models are highly correlated. In sum, the limitations of these empirical models are as follows: 1) the independent variables within the empirical models, in their current form, are correlated (multicollinear), and 2) current algorithms are global approaches based on the spatial stationarity assumption, so they are independent of location. The multicollinearity problem is resolved by using partial least squares (PLS). PLS, which transforms the data into a set of independent components, can be considered a combined form of principal component regression (PCR) and multiple regression. Geographically weighted regression (GWR) is also used to investigate the validity of the spatial stationarity assumption. GWR solves a regression model over each sample point by using the observations within its neighbourhood. The PLS results show that the empirical method underestimates chlorophyll content in high latitudes, including the Southern Ocean region (see Figure 1). Cluster analysis of the GWR coefficients also shows that the spatial stationarity assumption in empirical models is unlikely to be valid.
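A minimal sketch of the PLS step, assuming scikit-learn and synthetic band-ratio predictors in place of the NOMAD data: the correlated polynomial terms are projected onto a few orthogonal latent components before regression.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 300

# Highly correlated polynomial terms of a log blue-green band ratio (OC-style design matrix).
log_ratio = rng.uniform(-0.4, 0.6, n)
X = np.column_stack([log_ratio ** p for p in range(1, 5)])      # strongly multicollinear
log_chl = 0.3 - 2.5 * log_ratio + 1.0 * log_ratio ** 2 + rng.normal(scale=0.1, size=n)

X_tr, X_te, y_tr, y_te = train_test_split(X, log_chl, test_size=0.3, random_state=1)

# PLS regression with two latent components.
pls = PLSRegression(n_components=2).fit(X_tr, y_tr)
pred = pls.predict(X_te).ravel()
print("test R^2:", round(r2_score(y_te, pred), 3))
```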
Cui, Yang; Wang, Silong; Yan, Shaokui
2016-01-01
The phi coefficient directly depends on the frequencies of occurrence of organisms and has been widely used in vegetation ecology to analyse the associations of organisms with site groups, providing a characterization of ecological preference, but its application in soil ecology remains rare. Based on a single field experiment, this study assessed the applicability of the phi coefficient in indicating the habitat preferences of soil fauna, by comparing phi coefficient-based results with those of ordination methods in characterizing soil fauna-habitat (factor) relationships. Eight different habitats of soil fauna were implemented by reciprocal transfer of defaunated soil cores between two types of subtropical forests. Canonical correlation analysis (CCorA) showed that the ecological patterns of fauna-habitat relationships and inter-fauna taxa relationships expressed, respectively, by phi coefficients and by predicted abundances calculated from partial redundancy analysis (RDA) were extremely similar, and a highly significant relationship between the two datasets was observed (Pillai's trace statistic = 1.998, P = 0.007). In addition, highly positive correlations between phi coefficients and predicted abundances for Acari, Collembola, Nematode and Hemiptera were observed using linear regression analysis. Quantitative relationships between habitat preferences and soil chemical variables were also obtained by linear regression, and these were analogous to the results displayed in a partial RDA biplot. Our results suggest that the phi coefficient could be applicable on a local scale in evaluating habitat preferences of soil fauna at coarse taxonomic levels, and that the phi coefficient-based information, such as ecological preferences and the associated quantitative relationships with habitat factors, will be largely complementary to the results of ordination methods. The application of the phi coefficient in soil ecology may extend our knowledge about habitat preferences and distribution-abundance relationships, which will benefit the understanding of biodistributions and variations in community compositions in the soil. Similar studies at other sites and scales beyond our local site will be needed for further evaluation of the phi coefficient.
Suppressor Variables: The Difference between "Is" versus "Acting As"
ERIC Educational Resources Information Center
Ludlow, Larry; Klein, Kelsey
2014-01-01
Correlated predictors in regression models are a fact of life in applied social science research. The extent to which they are correlated will influence the estimates and statistics associated with the other variables they are modeled along with. These effects, for example, may include enhanced regression coefficients for the other variables--a…
Causal Models with Unmeasured Variables: An Introduction to LISREL.
ERIC Educational Resources Information Center
Wolfle, Lee M.
Whenever one uses ordinary least squares regression, one is making an implicit assumption that all of the independent variables have been measured without error. Such an assumption is obviously unrealistic for most social data. One approach for estimating such regression models is to measure implied coefficients between latent variables for which…
Random attractor of non-autonomous stochastic Boussinesq lattice system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhao, Min, E-mail: zhaomin1223@126.com; Zhou, Shengfan, E-mail: zhoushengfan@yahoo.com
2015-09-15
In this paper, we first consider the existence of a tempered random attractor for a second-order non-autonomous stochastic lattice dynamical system of nonlinear Boussinesq equations affected by time-dependent coupled coefficients, deterministic forces, and multiplicative white noise. Then, we establish the upper semicontinuity of the random attractors as the intensity of the noise approaches zero.
Using the Ridge Regression Procedures to Estimate the Multiple Linear Regression Coefficients
NASA Astrophysics Data System (ADS)
Gorgees, HazimMansoor; Mahdi, FatimahAssim
2018-05-01
This article compares the performance of different types of ordinary ridge regression estimators that have been proposed to estimate the regression parameters when near-exact linear relationships among the explanatory variables are present. For this situation we employ data obtained from the tagi gas filling company during the period 2008-2010. The main result is that the method based on the condition number performs better than the other stated methods, since it has the smallest mean square error (MSE).
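For illustration only, the following sketch applies ordinary ridge regression to a synthetic near-collinear design and uses the condition number as the diagnostic that motivates the method favoured above; it is not the authors' estimator or data.

```python
import numpy as np
from sklearn.linear_model import Ridge, LinearRegression

rng = np.random.default_rng(1)

# Near-exact linear relationship among explanatory variables (x2 ~ x1).
n = 100
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.01, size=n)    # almost collinear with x1
X = np.column_stack([x1, x2])
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(scale=0.5, size=n)

# A large condition number of X'X signals the ill-conditioning that
# motivates ridge regression.
print("condition number of X'X:", np.linalg.cond(X.T @ X))

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)          # alpha is the ridge penalty
print("OLS coefficients:  ", ols.coef_)
print("ridge coefficients:", ridge.coef_)
```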
Impact of Health Research Systems on Under-5 Mortality Rate: A Trend Analysis.
Yazdizadeh, Bahareh; Parsaeian, Mahboubeh; Majdzadeh, Reza; Nikooee, Sima
2016-11-26
Between 1990 and 2015, the under-5 mortality rate (U5MR) declined globally by 53%, from an estimated 91 deaths per 1000 live births to 43. The aim of this study was to determine the share of health research systems in this decrease alongside other influential factors. We used random effect regression models, including the 'random intercept' and 'random intercept and random slope' models, to analyze the panel data from 1990 to 2010. We selected the countries with U5MRs falling between the first and third quartiles in 1990. We used both the total articles (TA) and the number of child-specific articles (CSA) as a proxy of the health research system. In order to account for the impact of other factors, measles vaccination coverage (MVC) (as a proxy of health system performance), gross domestic product (GDP), human development index (HDI), and corruption perception index (CPI) (as proxies of development) were embedded in the model. Among all the models, the 'random intercept and random slope' models had lower residuals. The variables CSA, HDI, and time were significant, and the coefficient of CSA was estimated at -0.17; that is, with the addition of every 100 CSA, the U5MR decreased by 17 per 1000 live births. Although the number of CSA has contributed to the reduction of U5MR, the amount of its contribution is negligible compared to the countries' development. We recommend entering different types of research into the model separately in future work and including the variable of 'exchange between knowledge generator and user.' © 2017 The Author(s); Published by Kerman University of Medical Sciences. This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
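A minimal sketch of the two panel specifications named above ('random intercept' and 'random intercept and random slope'), assuming statsmodels and a hypothetical data frame with columns country, year, u5mr, csa and hdi; it is not the authors' model code.

```python
import pandas as pd
import statsmodels.formula.api as smf

def fit_panel_models(panel: pd.DataFrame):
    """Fit the two mixed-model specifications to country-year panel data."""
    # Random intercept only: countries differ in their baseline U5MR.
    ri = smf.mixedlm("u5mr ~ csa + hdi + year", data=panel,
                     groups=panel["country"]).fit()

    # Random intercept and random slope: the time trend also varies by country.
    ris = smf.mixedlm("u5mr ~ csa + hdi + year", data=panel,
                      groups=panel["country"], re_formula="~year").fit()
    return ri, ris
```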
Random walk study of electron motion in helium in crossed electromagnetic fields
NASA Technical Reports Server (NTRS)
Englert, G. W.
1972-01-01
Random walk theory, previously adapted to electron motion in the presence of an electric field, is extended to include a transverse magnetic field. In principle, the random walk approach avoids mathematical complexity and concomitant simplifying assumptions and permits determination of energy distributions and transport coefficients within the accuracy of available collisional cross section data. Application is made to a weakly ionized helium gas. Time of relaxation of electron energy distribution, determined by the random walk, is described by simple expressions based on energy exchange between the electron and an effective electric field. The restrictive effect of the magnetic field on electron motion, which increases the required number of collisions per walk to reach a terminal steady state condition, as well as the effect of the magnetic field on electron transport coefficients and mean energy can be quite adequately described by expressions involving only the Hall parameter.
Tay, Laura; Lim, Wee Shiong; Chan, Mark; Ali, Noorhazlina; Chong, Mei Sian
2016-01-01
Gait disorders are common in early dementia, with particularly pronounced dual-task deficits, contributing to the increased fall risk and mobility decline associated with cognitive impairment. This study examines the effects of a combined cognitive stimulation and physical exercise programme (MINDVital) on gait performance under single- and dual-task conditions in older adults with mild dementia. Thirty-nine patients with early dementia participated in a multi-disciplinary rehabilitation programme comprising both physical exercise and cognitive stimulation. The programme was conducted in 8-week cycles with participants attending once weekly, and all participants completed 2 successive cycles. Cognitive, functional performance and behavioural symptoms were assessed at baseline and at the end of each 8-week cycle. Gait speed was examined under both single- (Timed Up and Go and 6-metre walk tests) and dual-task (animal category and serial counting) conditions. A random effects model was performed for the independent effect of MINDVital on the primary outcome variable of gait speed under dual-task conditions. The mean age of patients enrolled in the rehabilitation programme was 79 ± 6.2 years; 25 (64.1%) had a diagnosis of Alzheimer's dementia, and 26 (66.7%) were receiving cognitive enhancer therapy. There was a significant improvement in cognitive performance [random effects coefficient (standard error) = 0.90 (0.31), p = 0.003] and in gait speed under both dual-task situations [animal category: random effects coefficient = 0.04 (0.02), p = 0.039; serial counting: random effects coefficient = 0.05 (0.02), p = 0.013], with reduced dual-task cost for gait speed [serial counting: random effects coefficient = -4.05 (2.35), p = 0.086] following successive MINDVital cycles. No significant improvement in single-task gait speed was observed. Improved cognitive performance over time was a significant determinant of changes in dual-task gait speed [random effects coefficients = 0.01 (0.005), p = 0.048, and 0.02 (0.005), p = 0.003 for category fluency and counting backwards, respectively]. A combined physical and cognitive rehabilitation programme leads to significant improvements in dual-task walking in early dementia, which may be driven by improvement in cognitive performance, given that single-task gait performance remained stable. © 2016 S. Karger AG, Basel.
ERIC Educational Resources Information Center
Strobl, Carolin; Malley, James; Tutz, Gerhard
2009-01-01
Recursive partitioning methods have become popular and widely used tools for nonparametric regression and classification in many scientific fields. Especially random forests, which can deal with large numbers of predictor variables even in the presence of complex interactions, have been applied successfully in genetics, clinical medicine, and…
Jeffrey T. Walton
2008-01-01
Three machine learning subpixel estimation methods (Cubist, Random Forests, and support vector regression) were applied to estimate urban cover. Urban forest canopy cover and impervious surface cover were estimated from Landsat-7 ETM+ imagery using a higher resolution cover map resampled to 30 m as training and reference data. Three different band combinations (...
An Empirical Comparison of Randomized Control Trials and Regression Discontinuity Estimations
ERIC Educational Resources Information Center
Barrera-Osorio, Felipe; Filmer, Deon; McIntyre, Joe
2014-01-01
Randomized controlled trials (RCTs) and regression discontinuity (RD) studies both provide estimates of causal effects. A major difference between the two is that RD only estimates local average treatment effects (LATE) near the cutoff point of the forcing variable. This has been cited as a drawback to RD designs (Cook & Wong, 2008).…
Davies, H T; Leslie, G; Pereira, S M; Webb, S A R
2008-03-01
To determine whether circuit life is influenced by a higher pre-dilution volume used in CVVH when compared with a lower pre-dilution volume approach in CVVHDF. A comparative crossover study. Cases were randomized to receive either CVVH or CVVHDF followed by the alternative treatment. All patients ≥18 years of age who required CRRT while in the ICU were eligible to participate, but were excluded if coagulopathic, thrombocytopenic or unable to receive heparin. On an intention-to-treat basis, 45 patients were randomized to receive either CVVH or CVVHDF followed by the alternative treatment. A 24-bed, tertiary, medical and surgical adult intensive care unit (ICU). Blood flow rate, vascular access device and insertion site, hemofilter, anticoagulation and machine hardware were standardized. An ultrafiltrate dose of 35 ml/kg/h delivered pre-filter was used for CVVH. A fixed pre-dilution volume of 600 ml/h with a dialysate dose of 1 L was used for CVVHDF. Thirty-one of the 45 participants received CVVH or CVVHDF followed by the alternative technique. There was a significant increase in circuit life in favor of CVVHDF (median = 16 h 5 min, range = 40 h 23 min) compared with CVVH (median = 6 h 35 min, range = 30 h 45 min). A Mann-Whitney U test was performed to compare circuit life between the two different CRRT modes (Z = -3.478, p < 0.001). Measurements of circuit life on the 93 circuits which survived to clotting (50 CVVH and 43 CVVHDF) were log transformed prior to undertaking a standard multiple regression analysis. None of the independent variables - activated prothrombin time (aPTT), platelet count, heparin dose, patient hematocrit or urea - had a partial correlation coefficient >0.09 (coefficient of determination = 0.117) or a linear relationship which could be associated with circuit life (p = 0.228). Pre-diluted CVVHDF appeared to have a longer circuit life when compared to high-volume pre-diluted CVVH. The choice of CRRT mode may be an important independent determinant of circuit life.
Delwiche, Stephen R; Reeves, James B
2010-01-01
In multivariate regression analysis of spectroscopy data, spectral preprocessing is often performed to reduce unwanted background information (offsets, sloped baselines) or accentuate absorption features in intrinsically overlapping bands. These procedures, also known as pretreatments, are commonly smoothing operations or derivatives. While such operations are often useful in reducing the number of latent variables of the actual decomposition and lowering residual error, they also run the risk of misleading the practitioner into accepting calibration equations that are poorly adapted to samples outside of the calibration. The current study developed a graphical method to examine this effect on partial least squares (PLS) regression calibrations of near-infrared (NIR) reflection spectra of ground wheat meal with two analytes, protein content and sodium dodecyl sulfate sedimentation (SDS) volume (an indicator of the quantity of the gluten proteins that contribute to strong doughs). These two properties were chosen because of their differing abilities to be modeled by NIR spectroscopy: excellent for protein content, fair for SDS sedimentation volume. To further demonstrate the potential pitfalls of preprocessing, an artificial component, a randomly generated value, was included in PLS regression trials. Savitzky-Golay (digital filter) smoothing, first-derivative, and second-derivative preprocess functions (5 to 25 centrally symmetric convolution points, derived from quadratic polynomials) were applied to PLS calibrations of 1 to 15 factors. The results demonstrated the danger of an over-reliance on preprocessing when (1) the number of samples used in a multivariate calibration is low (<50), (2) the spectral response of the analyte is weak, and (3) the goodness of the calibration is based on the coefficient of determination (R²) rather than a term based on residual error. The graphical method has application to the evaluation of other preprocess functions and various types of spectroscopy data.
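A brief sketch of the kind of pretreatment-plus-PLS pipeline discussed above, using SciPy's Savitzky-Golay filter and scikit-learn; the window size, derivative order and number of latent variables are illustrative, and the cross-validated RMSE is reported so the model is not judged on calibration R² alone.

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import mean_squared_error

def pls_with_pretreatment(spectra, y, window=11, deriv=1, n_components=8):
    """Savitzky-Golay pretreatment (quadratic polynomial) followed by PLS.

    spectra: array of shape (n_samples, n_wavelengths); y: reference values.
    Returns the cross-validated RMSE (RMSECV) of the calibration.
    """
    X = savgol_filter(spectra, window_length=window, polyorder=2,
                      deriv=deriv, axis=1)
    pls = PLSRegression(n_components=n_components)
    y_cv = cross_val_predict(pls, X, y, cv=10)
    return np.sqrt(mean_squared_error(y, y_cv))
```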
Scarneciu, Camelia C; Sangeorzan, Livia; Rus, Horatiu; Scarneciu, Vlad D; Varciu, Mihai S; Andreescu, Oana; Scarneciu, Ioan
2017-01-01
This study aimed to assess the incidence of pulmonary hypertension (PH) in newly diagnosed hyperthyroid patients and to find a simple model describing the complex functional relation between pulmonary hypertension in hyperthyroidism and the factors causing it. The 53 hyperthyroid patients (H-group) were evaluated mainly by echocardiography and compared with 35 euthyroid (E-group) and 25 healthy people (C-group). In order to identify the factors causing pulmonary hypertension, the statistical method of comparing arithmetic means was used. The functional relation between the two random variables (PAPs and each of the factors determining it within our research study) can be expressed by a linear or non-linear function. By applying the linear regression method described by a first-degree equation, the line of regression (linear model) was determined; by applying the non-linear regression method described by a second-degree equation, a parabola-type curve of regression (non-linear or polynomial model) was determined. We compared and validated these two models by calculating the coefficient of determination (criterion 1), comparing the residuals (criterion 2), applying the AIC criterion (criterion 3) and using the F-test (criterion 4). From the H-group, 47% have pulmonary hypertension that is completely reversible when euthyroidism is obtained. The factors causing pulmonary hypertension were identified: previously known - level of free thyroxine, pulmonary vascular resistance, cardiac output; newly identified in this study - pretreatment period, age, systolic blood pressure. According to the four criteria and to clinical judgment, we consider that the polynomial model (graphically, a parabola) is better than the linear one. The model that better describes the functional relation between pulmonary hypertension in hyperthyroidism and the factors identified in this study is therefore a second-degree polynomial equation whose graphical representation is a parabola.
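A minimal sketch of the linear-versus-polynomial comparison described above (criteria 1, 3 and 4), assuming statsmodels and SciPy on generic x and y arrays; the residual comparison of criterion 2 follows directly from the fitted models.

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

def compare_linear_vs_quadratic(x, y):
    """Fit y ~ x (linear) and y ~ x + x^2 (polynomial) and compare them."""
    X1 = sm.add_constant(np.column_stack([x]))
    X2 = sm.add_constant(np.column_stack([x, np.asarray(x) ** 2]))
    m1, m2 = sm.OLS(y, X1).fit(), sm.OLS(y, X2).fit()

    # Criteria 1 and 3: coefficient of determination and AIC.
    print("R^2:", m1.rsquared, m2.rsquared)
    print("AIC:", m1.aic, m2.aic)

    # Criterion 4: F-test for the extra quadratic term (nested models).
    f = (m1.ssr - m2.ssr) / (m2.ssr / m2.df_resid)
    p = stats.f.sf(f, 1, m2.df_resid)
    print("F = %.3f, p = %.4f" % (f, p))
    return m1, m2
```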
The importance of regional models in assessing canine cancer incidences in Switzerland.
Boo, Gianluca; Leyk, Stefan; Brunsdon, Christopher; Graf, Ramona; Pospischil, Andreas; Fabrikant, Sara Irina
2018-01-01
Fitting canine cancer incidences through a conventional regression model assumes constant statistical relationships across the study area in estimating the model coefficients. However, it is often more realistic to consider that these relationships may vary over space. Such a condition, known as spatial non-stationarity, implies that the model coefficients need to be estimated locally. In these kinds of local models, the geographic scale, or spatial extent, employed for coefficient estimation may also have a pervasive influence. This is because important variations in the local model coefficients across geographic scales may impact the understanding of local relationships. In this study, we fitted canine cancer incidences across Swiss municipal units through multiple regional models. We computed diagnostic summaries across the different regional models, and contrasted them with the diagnostics of the conventional regression model, using value-by-alpha maps and scalograms. The results of this comparative assessment enabled us to identify variations in the goodness-of-fit and coefficient estimates. We detected spatially non-stationary relationships, in particular, for the variables related to biological risk factors. These variations in the model coefficients were more important at small geographic scales, making a case for the need to model canine cancer incidences locally in contrast to more conventional global approaches. However, we contend that prior to undertaking local modeling efforts, a deeper understanding of the effects of geographic scale is needed to better characterize and identify local model relationships.
Phung, Dung; Tran, Phu Dac; Nguyen, Lien Huong; Do, Cuong Manh; Rutherford, Shannon; Chu, Cordia
2017-12-01
To address the burden of infectious diseases such as diarrhoea, the Vietnamese government has enacted the Law on Prevention and Control of Infectious Diseases (LPCID), in effect since July 2008. However, no evaluation of the impact of the LPCID has been conducted. This study aims to evaluate the impact of the LPCID on diarrhoeal control for the 5 years following its implementation in Vietnam. We used an interrupted time series design with segmented regression analysis to estimate the 'province-level' impact of the LPCID and then used random-effect meta-analysis to estimate the pooled effect sizes of the 'country-level' impact of the LPCID on diarrhoeal control throughout Vietnam. The results show that the impacts varied by province and were classified into four groups: 'positive impact, positive impact without sustainability, possibly positive impact, no or negative impact' of the LPCID. The meta-analysis indicated that the country-level impact of the LPCID became significant at 11 months after the LPCID took effect, with a decrease in the level of diarrhoea of 9.7% (coefficient, -0.097; 95% CI: -19.1 to -0.002) and a permanent downward trend in diarrhoea at a rate of 1.1% per month (coefficient, -0.011; 95% CI: -0.02 to -0.003), whereas the trend in diarrhoea before the LPCID took effect was unchanging (coefficient, 0.002; 95% CI, 0-0.004). At 12, 24, 36, 48 and 60 months after the LPCID implementation date, the levels of diarrhoea decreased by 10.9% (coefficient, -0.109; 95% CI: -0.203 to -0.015, P < 0.01), 21.8% (coefficient, -0.218; 95% CI: -0.338 to -0.098, P < 0.01), 31% (coefficient, -0.31; 95% CI: -0.474 to -0.145, P < 0.01), 46.8% (coefficient, -0.468; 95% CI: -0.667 to -0.27, P < 0.01) and 48.2% (coefficient, -0.482; 95% CI: -0.708 to -0.256, P < 0.01), respectively. The findings of this study reveal the effectiveness of the LPCID in reducing diarrhoea incidence in Vietnam. However, further studies should be conducted to better understand the cost-effectiveness, acceptability, and sustainability of each component of the LPCID. © The Author 2017. Published by Oxford University Press in association with The London School of Hygiene and Tropical Medicine. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
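A minimal sketch of a province-level segmented regression of the kind described above, assuming statsmodels and a hypothetical monthly series with a column log_diarrhoea; the level and trend-change coefficients from such per-province fits could then be pooled with a random-effects meta-analysis.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def segmented_regression(ts: pd.DataFrame, law_month: int):
    """Interrupted time series fit for one province.

    'time' captures the pre-existing trend, 'level' the step change when the
    law takes effect, and 'trend_after' the change in slope afterwards.
    """
    ts = ts.copy()
    ts["time"] = np.arange(len(ts))
    ts["level"] = (ts["time"] >= law_month).astype(int)
    ts["trend_after"] = np.maximum(0, ts["time"] - law_month)
    return smf.ols("log_diarrhoea ~ time + level + trend_after", data=ts).fit()
```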
Zer, Matan; Lindner, Arie; Greenstein, Alexander; Leibovici, Dan
2011-07-01
Academic careers of individual doctors are commonly evaluated by examining the number and quality of authored publications. Similarly, the extent and quality of medical research may be assessed nationwide by measuring the number of publications originating from the country of interest over time. This, in turn, may be indicative of the quality of medicine practiced. To evaluate the extent and quality of Israeli publications, we measured the rate and quality of medical publications originating from Israel over two decades in the fields of urology, cardiology and orthopedics, and compared the data to those of other countries. Leading journals in urology, cardiology, and orthopedics were selected. A Medline search (http://www.ncbi.ntm.nih.gov/sites/entrez) was conducted for all the publications originating in Israel between the years 1990-2009 in the selected journals. Data from Israel were compared to those from Italy, France, Germany, Egypt and Turkey. The change in the rate of publications was tested using linear regression. The quality of publications was calculated by multiplying the number of publications by the relevant impact factor. While the urology publication rate in Israel increased by 32.7% in the second study decade as compared with the first, the urology publication rates during the same time period from Italy, France, Germany, Egypt and Turkey increased by 199%, 115%, 184%, 180% and 227%, respectively. The regression coefficient for the urology publication rate was 0.51 for Israel, and 0.78, 0.95, 0.78, 0.87 and 0.97 for the other countries, respectively. The regression coefficient for the change in the quality of publications from Israel was 0.31, versus 0.81, 0.75, 0.92, 0.73, and 0.92 for the other countries, respectively. In cardiology, the Israeli publication rate increased by 26% during the second study decade, whereas in the other countries the increments were 46%, 35%, 76%, 80% and 309%, respectively. The regression coefficient for the Israeli publication rate was 0.45, and 0.78, 0.54, 0.62, 0.13 and 0.75 for the other countries, respectively. The regression coefficient for the quality of publications in Israel was 0.3, as opposed to 0.47, 0.36, 0.48, 0.01, and 0.78, respectively. Israeli publications in orthopedics increased by 9.3% during the second decade compared with the first. At the same time, the other countries increased their publication rates in orthopedics by 69%, 121%, 173%, 140% and 296%, respectively. The regression coefficient for the publication rate in orthopedics was 0.02 for Israel, and 0.62, 0.64, 0.78, 0.34 and 0.71 for the other countries, respectively. The regression coefficient for the quality of publications in Israel was 0.05, as opposed to 0.67, 0.62, 0.75, 0.31, and 0.66 in the other countries, respectively. Israel lags behind Italy, France, Germany, Egypt and Turkey with regard to the increase in both the number and the quality of medical publications in urology and orthopedics. While the rate and quality of Israeli publications in cardiology surpass those from Egypt, they lag behind those of all the other countries examined in the number of publications in this field. In a world of rapid progress and expansion of medical research, Israel has been stagnant in publications in three medical specialties, rendering it inferior to other nations.
Large signal-to-noise ratio quantification in MLE for ARARMAX models
NASA Astrophysics Data System (ADS)
Zou, Yiqun; Tang, Xiafei
2014-06-01
It has been shown that closed-loop linear system identification by the indirect method can generally be transferred to open-loop ARARMAX (AutoRegressive AutoRegressive Moving Average with eXogenous input) estimation. For such models, gradient-related optimisation with a large enough signal-to-noise ratio (SNR) can avoid the potential local convergence in maximum likelihood estimation. To ease the application of this condition, the threshold SNR needs to be quantified. In this paper, we construct the amplitude coefficient, which is equivalent to the SNR, and prove the finiteness of the threshold amplitude coefficient within the stability region. The quantification of the threshold is achieved by minimisation of an elaborately designed multi-variable cost function which unifies all the restrictions on the amplitude coefficient. The corresponding algorithm, based on two sets of physically realisable system input-output data, details the minimisation and also points out how to use the gradient-related method to estimate ARARMAX parameters when a local minimum is present because the SNR is small. The algorithm is then tested on a theoretical AutoRegressive Moving Average with eXogenous input model for derivation of the threshold and on a real gas turbine engine system for model identification. Finally, the graphical validation of the threshold on a two-dimensional plot is discussed.
Effects of Medical Insurance on the Health Status and Life Satisfaction of the Elderly
GU, Liubao; FENG, Huihui; JIN, Jian
2017-01-01
Background: Population aging has become increasingly serious in China. The demand for medical insurance among the elderly is increasing, and their health status and life satisfaction are becoming significant issues. This study investigates the effects of medical insurance on the health status and life satisfaction of the elderly. Methods: The national baseline survey data of the China Health and Retirement Longitudinal Survey in 2013 were used. An Ordered Probit Model was established. The effects of the medical insurance for urban employees, medical insurance for urban residents, and new rural cooperative medical insurance on the health status and life satisfaction of the elderly were investigated. Results: Medical insurance could facilitate the improvement of the health status and life satisfaction of the elderly. Accordingly, the health status and life satisfaction of the elderly who have medical insurance for urban residents improved significantly; the regression coefficients were 0.348 and 0.307. The corresponding regression coefficients for the medical insurance for urban employees were 0.189 and 0.236, and those for the new rural cooperative medical insurance were 0.170 and 0.188. Conclusion: Medical insurance can significantly improve the health status and life satisfaction of the elderly. This finding is of significance for the formulation of equitable medical security. PMID:29026784
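A minimal sketch of an ordered probit specification similar in spirit to the one described above, assuming a recent statsmodels version that provides OrderedModel and a hypothetical data frame with an ordered health outcome and insurance dummies; it is not the authors' model.

```python
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

def ordered_probit(df: pd.DataFrame):
    """df has hypothetical columns: health (ordered 1-5) plus the insurance
    dummies urban_resident, urban_employee and new_rural."""
    y = pd.Categorical(df["health"], ordered=True)
    X = df[["urban_resident", "urban_employee", "new_rural"]]
    # OrderedModel estimates the thresholds itself, so no constant is added.
    model = OrderedModel(y, X, distr="probit")
    return model.fit(method="bfgs", disp=False)
```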
Periodontal disease in children and adolescents with type 1 diabetes in Serbia.
Dakovic, Dragana; Pavlovic, Milos D
2008-06-01
The purpose of this study was to evaluate periodontal health in young patients with type 1 diabetes mellitus in Serbia. Periodontal disease was clinically assessed and compared in 187 children and adolescents (6 to 18 years of age) with type 1 diabetes mellitus and 178 control subjects without diabetes. Children and adolescents with type 1 diabetes mellitus had significantly more plaque, gingival inflammation, and periodontal destruction than control subjects. The main risk factors for periodontitis were diabetes (odds ratio [OR] = 2.78; 95% confidence interval [CI]: 1.42 to 5.44), bleeding/plaque ratio (OR = 1.25; 95% CI: 1.06 to 1.48), and age (OR = 1.10; 95% CI: 1.01 to 1.21). In case subjects, the number of teeth affected by periodontal destruction was associated with mean hemoglobin A1c (regression coefficient 0.17; P = 0.026), duration of diabetes (regression coefficient 0.19; P = 0.021), and bleeding/plaque ratio (regression coefficient 0.17; P = 0.021). Compared to children and adolescents without diabetes, periodontal disease is more prevalent and widespread in children and adolescents with type 1 diabetes mellitus and depends on the duration of disease, metabolic control, and the severity of gingival inflammation. Gingival inflammation in young patients with diabetes is more evident and more often results in periodontal destruction.
Social problem-solving among adolescents treated for depression.
Becker-Weidman, Emily G; Jacobs, Rachel H; Reinecke, Mark A; Silva, Susan G; March, John S
2010-01-01
Studies suggest that deficits in social problem-solving may be associated with increased risk of depression and suicidality in children and adolescents. It is unclear, however, which specific dimensions of social problem-solving are related to depression and suicidality among youth. Moreover, rational problem-solving strategies and problem-solving motivation may moderate or predict change in depression and suicidality among children and adolescents receiving treatment. The effects of social problem-solving on acute treatment outcomes were explored in a randomized controlled trial of 439 clinically depressed adolescents enrolled in the Treatment for Adolescents with Depression Study (TADS). Measures included the Children's Depression Rating Scale-Revised (CDRS-R), the Suicidal Ideation Questionnaire--Grades 7-9 (SIQ-Jr), and the Social Problem-Solving Inventory-Revised (SPSI-R). A random coefficients regression model was conducted to examine main and interaction effects of treatment and SPSI-R subscale scores on outcomes during the 12-week acute treatment stage. Negative problem orientation, positive problem orientation, and avoidant problem-solving style were non-specific predictors of depression severity. In terms of suicidality, avoidant problem-solving style and impulsiveness/carelessness style were predictors, whereas negative problem orientation and positive problem orientation were moderators of treatment outcome. Implications of these findings, limitations, and directions for future research are discussed. Copyright 2009 Elsevier Ltd. All rights reserved.
Zamani-Alavijeh, Fereshteh; Mojadam, Mehdi
2017-01-01
Background: Pediculosis is a common parasitic infestation in students worldwide, including in Iran. The condition is more prevalent in populous and deprived communities with poor personal hygiene. This study sought to assess the efficacy of peer education for adopting preventive behaviors against pediculosis in female elementary school students, based on the Health Belief Model (HBM). Methods: A total of 179 female fifth-grade students were selected using multistage random sampling and were randomly allocated to control and intervention groups. A standard questionnaire was designed and administered to collect baseline information. An educational intervention was then designed based on the conducted needs assessment. The educational program consisted of three sessions held by peers for the intervention group. The questionnaire was re-administered one month after the intervention. Independent and paired t-tests, Pearson's correlation coefficient, and regression analysis were applied as appropriate. Results: The two groups had no significant differences in the scores of knowledge, HBM constructs, or behavior before the intervention. After the intervention, however, the mean scores of all parameters significantly improved in the intervention group. Conclusion: Peer education based on the HBM is an effective strategy to promote preventive behaviors against pediculosis among fifth-grade female elementary school students in Iran. PMID:28072852
Urinary isoflavonoid excretion as a biomarker of dietary soy intake during two randomized soy trials
Morimoto, Yukiko; Beckford, Fanchon; Franke, Adrian A.; Maskarinec, Gertraud
2014-01-01
We evaluated urinary isoflavonoid excretion as a biomarker of dietary isoflavone intake during two randomized soy trials (13–24 months) among 256 premenopausal women with a total of 1,385 repeated urine samples. Participants consumed a high-soy diet (2 servings/day) and a low-soy diet (<3 servings/week), completed 7 unannounced 24-hour dietary recalls, and donated repeated urine samples, which were analyzed for isoflavonoid excretion by liquid chromatography methods. We computed correlation coefficients and applied logistic regression to estimate the area under the curve. Median daily dietary isoflavone intakes at baseline, during low- and high-soy diet were 0.5, 0.2, and 67.7 mg aglycone equivalents, respectively. The corresponding urinary isoflavonoid excretion values were 0.9, 1.1, and 43.9 nmol/mg creatinine. Across diets, urinary isoflavonoid excretion was significantly associated with dietary isoflavone intake (rs=0.51, AUC=0.85; p<0.0001) but not within diet periods (rs=0.05–0.06, AUC=0.565–0.573). Urinary isoflavonoid excretion is an excellent biomarker to discriminate between low- and high-soy diets across populations, but the association with dietary isoflavone intake is weak when the range of soy intake is small. PMID:24901088
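A short sketch of the two quantities reported above, a Spearman correlation between biomarker and intake and a logistic-regression AUC for discriminating diet periods; the function and argument names are illustrative, not the authors' code.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def biomarker_discrimination(excretion, intake_mg, high_soy_flag):
    """excretion: urinary isoflavonoid (nmol/mg creatinine); intake_mg:
    dietary isoflavone intake; high_soy_flag: 1 for high-soy diet periods."""
    rho, p = spearmanr(excretion, intake_mg)

    # AUC for discriminating high- vs low-soy periods from the biomarker.
    x = np.asarray(excretion, dtype=float).reshape(-1, 1)
    clf = LogisticRegression().fit(x, high_soy_flag)
    auc = roc_auc_score(high_soy_flag, clf.predict_proba(x)[:, 1])
    return rho, p, auc
```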
Application of the theory of reasoned action to environmental behaviors
DOE Office of Scientific and Technical Information (OSTI.GOV)
Duquette, R.D.
The applicability of Ajzen and Fishbein's Theory of Reasoned Action (1980) to environmental behaviors was examined. Trained interviewers conducted a telephone survey employing random digit dialing with random selection of individuals within households; 388 individuals completed interviews. A preliminary study was conducted to identify salient outcomes (advantages and disadvantages), referents (individuals or groups), and activities associated with protecting the environment. The main study questionnaire was based upon the most frequently identified outcomes, referents, and activities, using the procedures of Ajzen and Fishbein (1980). A pilot test indicated that the alpha coefficients of all subscales were greater than .70. In addition to the theory variables, the external variables of occupation and education were assessed. The relationships between the theory variables were examined using correlational and multiple regression techniques. Though weaker than in previous studies, all the theoretical relationships were in the hypothesized direction. Analysis of variance, used to examine the external variables, found significant differences among occupational groups and educational levels with regard to intention to protect the environment. Polluters scored lower on intention than individuals with non-polluting or not applicable occupations. Individuals with a high school diploma or less were lower on intention and were significantly less favorable toward protecting the environment than those with some college or a college degree.
Mazo Lopera, Mauricio A; Coombes, Brandon J; de Andrade, Mariza
2017-09-27
Gene-environment (GE) interaction has important implications in the etiology of complex diseases that are caused by a combination of genetic factors and environmental variables. Several authors have developed GE analysis in the context of independent subjects or longitudinal data using a gene-set. In this paper, we propose to analyze GE interaction for discrete and continuous phenotypes in family studies by incorporating the relatedness among the relatives for each family into a generalized linear mixed model (GLMM) and by using a gene-based variance component test. In addition, we deal with collinearity problems arising from linkage disequilibrium among single nucleotide polymorphisms (SNPs) by considering their coefficients as random effects under the null model estimation. We show that the best linear unbiased predictor (BLUP) of such random effects in the GLMM is equivalent to the ridge regression estimator. This equivalence provides a simple method to estimate the ridge penalty parameter in comparison to other computationally demanding estimation approaches based on cross-validation schemes. We evaluated the proposed test using simulation studies and applied it to real data from the Baependi Heart Study consisting of 76 families. Using our approach, we identified an interaction between BMI and the Peroxisome Proliferator-Activated Receptor Gamma (PPARG) gene associated with diabetes.
Tu, Yu-Kang; Krämer, Nicole; Lee, Wen-Chung
2012-07-01
In the analysis of trends in health outcomes, an ongoing issue is how to separate and estimate the effects of age, period, and cohort. As these 3 variables are perfectly collinear by definition, regression coefficients in a general linear model are not unique. In this tutorial, we review why identification is a problem, and how this problem may be tackled using partial least squares and principal components regression analyses. Both methods produce regression coefficients that fulfill the same collinearity constraint as the variables age, period, and cohort. We show that, because the constraint imposed by partial least squares and principal components regression is inherent in the mathematical relation among the 3 variables, this leads to more interpretable results. We use one dataset from a Taiwanese health-screening program to illustrate how to use partial least squares regression to analyze the trends in body heights with 3 continuous variables for age, period, and cohort. We then use another dataset of hepatocellular carcinoma mortality rates for Taiwanese men to illustrate how to use partial least squares regression to analyze tables with aggregated data. We use the second dataset to show the relation between the intrinsic estimator, a recently proposed method for the age-period-cohort analysis, and partial least squares regression. We also show that the inclusion of all indicator variables provides a more consistent approach. R code for our analyses is provided in the eAppendix.
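The authors provide R code in their eAppendix; as an independent illustration (not that code), the sketch below applies PLS to the perfectly collinear age, period and cohort columns so that the returned coefficients satisfy the same collinearity constraint discussed above.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

def apc_pls(age, period, outcome, n_components=2):
    """Age-period-cohort regression via partial least squares.

    cohort = period - age, so the three columns are perfectly collinear and
    an OLS fit would not have unique coefficients; PLS instead regresses on a
    small number of orthogonal components and maps the result back to
    coefficients for age, period and cohort.
    """
    cohort = np.asarray(period) - np.asarray(age)
    X = np.column_stack([age, period, cohort])
    pls = PLSRegression(n_components=n_components).fit(X, outcome)
    return pls.coef_.ravel()   # coefficients for age, period, cohort
```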
Single-image super-resolution based on Markov random field and contourlet transform
NASA Astrophysics Data System (ADS)
Wu, Wei; Liu, Zheng; Gueaieb, Wail; He, Xiaohai
2011-04-01
Learning-based methods are widely adopted in image super-resolution. In this paper, we propose a new learning-based approach using the contourlet transform and a Markov random field. The proposed algorithm employs the contourlet transform rather than the conventional wavelet to represent image features and takes into account the correlation between adjacent pixels or image patches through the Markov random field (MRF) model. The input low-resolution (LR) image is decomposed with the contourlet transform and fed to the MRF model together with the contourlet transform coefficients from the low- and high-resolution image pairs in the training set. The unknown high-frequency components/coefficients for the input low-resolution image are inferred by a belief propagation algorithm. Finally, the inverse contourlet transform converts the LR input and the inferred high-frequency coefficients into the super-resolved image. The effectiveness of the proposed method is demonstrated with experiments on facial, vehicle plate, and real scene images. A better visual quality is achieved in terms of peak signal-to-noise ratio and the image structural similarity measurement.
PET-CT image fusion using random forest and à-trous wavelet transform.
Seal, Ayan; Bhattacharjee, Debotosh; Nasipuri, Mita; Rodríguez-Esparragón, Dionisio; Menasalvas, Ernestina; Gonzalo-Martin, Consuelo
2018-03-01
New image fusion rules for multimodal medical images are proposed in this work. The image fusion rules are defined by a random forest learning algorithm and a translation-invariant à-trous wavelet transform (AWT). The proposed method is threefold. First, source images are decomposed into approximation and detail coefficients using the AWT. Second, a random forest is used to choose pixels from the approximation and detail coefficients for forming the approximation and detail coefficients of the fused image. Lastly, the inverse AWT is applied to reconstruct the fused image. All experiments have been performed on 198 slices of both computed tomography and positron emission tomography images of a patient. A traditional fusion method based on the Mallat wavelet transform has also been implemented on these slices. A new image fusion performance measure, along with 4 existing measures, is presented, which helps to compare the performance of two pixel-level fusion methods. The experimental results clearly indicate that the proposed method outperforms the traditional method in terms of visual and quantitative qualities and that the new measure is meaningful. Copyright © 2017 John Wiley & Sons, Ltd.
No rationale for 1 variable per 10 events criterion for binary logistic regression analysis.
van Smeden, Maarten; de Groot, Joris A H; Moons, Karel G M; Collins, Gary S; Altman, Douglas G; Eijkemans, Marinus J C; Reitsma, Johannes B
2016-11-24
Ten events per variable (EPV) is a widely advocated minimal criterion for sample size considerations in logistic regression analysis. Of three previous simulation studies that examined this minimal EPV criterion, only one supports the use of a minimum of 10 EPV. In this paper, we examine the reasons for the substantial differences between these extensive simulation studies. The current study uses Monte Carlo simulations to evaluate small-sample bias, coverage of confidence intervals and mean square error of logit coefficients. Logistic regression models fitted by maximum likelihood and a modified estimation procedure, known as Firth's correction, are compared. The results show that, besides EPV, the problems associated with low EPV depend on other factors such as the total sample size. It is also demonstrated that simulation results can be dominated by even a few simulated data sets for which the prediction of the outcome by the covariates is perfect ('separation'). We reveal that different approaches for identifying and handling separation lead to substantially different simulation results. We further show that Firth's correction can be used to improve the accuracy of regression coefficients and alleviate the problems associated with separation. The current evidence supporting EPV rules for binary logistic regression is weak. Given our findings, there is an urgent need for new research to provide guidance on sample size considerations for binary logistic regression analysis.
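A small Monte Carlo sketch in the spirit of the simulations described above, estimating the bias of a maximum likelihood logit coefficient at a chosen events-per-variable level; Firth's correction is not shown because it requires a dedicated package, and the event rate and effect size are illustrative assumptions.

```python
import numpy as np
import statsmodels.api as sm

def epv_bias(n_events=30, n_vars=3, beta=0.5, n_sim=500, seed=2):
    """Average bias of an ML logit coefficient at roughly n_events/n_vars EPV."""
    rng = np.random.default_rng(seed)
    est = []
    n = int(n_events / 0.2)                      # expected event fraction ~ 0.2
    for _ in range(n_sim):
        X = rng.normal(size=(n, n_vars))
        lin = -1.4 + beta * X[:, 0]              # gives roughly a 20% event rate
        y = rng.binomial(1, 1 / (1 + np.exp(-lin)))
        if y.sum() in (0, n):                    # skip degenerate samples
            continue
        try:
            fit = sm.Logit(y, sm.add_constant(X)).fit(disp=0)
            est.append(fit.params[1])
        except Exception:                        # e.g. perfect separation
            continue
    return np.mean(est) - beta                   # positive means overestimation
```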
Analysis of oscillatory motion of a light airplane at high values of lift coefficient
NASA Technical Reports Server (NTRS)
Batterson, J. G.
1983-01-01
A modified stepwise regression is applied to flight data from a light research airplane operating at high angles of attack. The well-known phenomenon referred to as buckling or porpoising is analyzed and modeled using both power series and spline expansions of the aerodynamic force and moment coefficients associated with the longitudinal equations of motion.
ERIC Educational Resources Information Center
Longford, Nicholas T.
Operational procedures for the Graduate Record Examinations Validity Study Service are reviewed, with emphasis on the problem of frequent occurrence of negative coefficients in the fitted within-department regressions obtained by the empirical Bayes method of H. I. Braun and D. Jones (1985). Several alterations of the operational procedures are…
Silva, F G; Torres, R A; Brito, L F; Euclydes, R F; Melo, A L P; Souza, N O; Ribeiro, J I; Rodrigues, M T
2013-12-11
The objective of this study was to identify the best random regression model using Legendre orthogonal polynomials for the genetic evaluation of Alpine goats and to estimate parameters for test-day milk yield. We analyzed 20,710 test-day milk yield records of 667 goats from the Goat Sector of the Universidade Federal de Viçosa. The evaluated models combined distinct fitting orders for the fixed (2-5), random genetic (1-7) and permanent environmental (1-7) curves and different numbers of classes of residual variance (2, 4, 5, and 6). WOMBAT software was used for all genetic analyses. The best random regression model using Legendre orthogonal polynomials for genetic evaluation of test-day milk yield of Alpine goats considered a fixed curve of order 4, a curve of additive genetic effects of order 2, a curve of permanent environmental effects of order 7, and a minimum of 5 classes of residual variance, because it was the most economical model among those equivalent to the complete model by the likelihood ratio test. Phenotypic variance and heritability were higher at the end of the lactation period, indicating that the length of lactation has a larger genetic component than the production peak and persistence. For genetic evaluation with random regression models, it is very important to use the best combination of fixed, additive genetic and permanent environmental regressions and number of classes of heterogeneous residual variance, thereby enhancing the precision and accuracy of the parameter estimates and the prediction of genetic values.
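A minimal sketch of how Legendre covariates for such a test-day model can be built (the mixed-model fitting itself was done in WOMBAT and is not reproduced here); the standardisation range and the omission of the usual normalisation constants are simplifying assumptions.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_covariates(dim, order, dim_min=5, dim_max=305):
    """Legendre polynomial covariates of days in milk, standardised to [-1, 1],
    as used for the fixed, additive genetic and permanent environmental curves."""
    t = -1 + 2 * (np.asarray(dim, dtype=float) - dim_min) / (dim_max - dim_min)
    # column j holds the value of the j-th Legendre polynomial at each test day
    return np.column_stack(
        [legendre.legval(t, np.eye(order + 1)[j]) for j in range(order + 1)]
    )

# e.g. covariates for a second-order genetic curve and a seventh-order PE curve
Z_genetic = legendre_covariates([10, 60, 150, 250], order=2)
Z_pe = legendre_covariates([10, 60, 150, 250], order=7)
```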
Selapa, N W; Nephawe, K A; Maiwashe, A; Norris, D
2012-02-08
The aim of this study was to estimate genetic parameters for body weights of individually fed beef bulls measured at centralized testing stations in South Africa using random regression models. Weekly body weights of Bonsmara bulls (N = 2919) tested between 1999 and 2003 were available for the analyses. The model included a fixed regression of the body weights on fourth-order orthogonal Legendre polynomials of the actual days on test (7, 14, 21, 28, 35, 42, 49, 56, 63, 70, 77, and 84) for starting age and contemporary group effects. Random regressions on fourth-order orthogonal Legendre polynomials of the actual days on test were included for additive genetic effects and additional uncorrelated random effects of the weaning-herd-year and the permanent environment of the animal. Residual effects were assumed to be independently distributed with heterogeneous variance for each test day. Variance ratios for additive genetic, permanent environment and weaning-herd-year for weekly body weights at different test days ranged from 0.26 to 0.29, 0.37 to 0.44 and 0.26 to 0.34, respectively. The weaning-herd-year was found to have a significant effect on the variation of body weights of bulls despite a 28-day adjustment period. Genetic correlations amongst body weights at different test days were high, ranging from 0.89 to 1.00. Heritability estimates were comparable to literature using multivariate models. Therefore, random regression model could be applied in the genetic evaluation of body weight of individually fed beef bulls in South Africa.
Calibrating random forests for probability estimation.
Dankowski, Theresa; Ziegler, Andreas
2016-09-30
Probabilities can be consistently estimated using random forests. It is, however, unclear how random forests should be updated to make predictions for other centers or at different time points. In this work, we present two approaches for updating random forests for probability estimation. The first method was proposed by Elkan and may be used for updating any machine learning approach yielding consistent probabilities, so-called probability machines. The second approach is a new strategy specifically developed for random forests. Using the terminal nodes, which represent conditional probabilities, the random forest is first translated to logistic regression models. These are, in turn, used for re-calibration. The two updating strategies were compared in a simulation study and are illustrated with data from the German Stroke Study Collaboration. In most simulation scenarios, both methods led to similar improvements. In the simulation scenario in which the stricter assumptions of Elkan's method were not met, the logistic regression-based re-calibration approach for random forests outperformed Elkan's method. It also performed better on the stroke data than Elkan's method. The strength of Elkan's method is its general applicability to any probability machine. However, if the strict assumptions underlying this approach are not met, the logistic regression-based approach is preferable for updating random forests for probability estimation. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
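As a simplified stand-in for the terminal-node translation the authors describe (not their algorithm), the sketch below re-calibrates random forest probabilities for a new center by fitting a logistic regression on the logit of the forest's predicted probability.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

def recalibrate_rf(rf: RandomForestClassifier, X_new, y_new):
    """Re-calibrate a fitted RF's probabilities for a new center.

    The RF probability is turned into a logit and used as the single covariate
    of a logistic model fitted on the new center's data; the returned function
    maps the original probabilities to updated ones.
    """
    p = np.clip(rf.predict_proba(X_new)[:, 1], 1e-6, 1 - 1e-6)
    logit = np.log(p / (1 - p)).reshape(-1, 1)
    recal = LogisticRegression().fit(logit, y_new)

    def predict_updated(X):
        q = np.clip(rf.predict_proba(X)[:, 1], 1e-6, 1 - 1e-6)
        return recal.predict_proba(np.log(q / (1 - q)).reshape(-1, 1))[:, 1]

    return predict_updated
```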
Advanced colorectal neoplasia risk stratification by penalized logistic regression.
Lin, Yunzhi; Yu, Menggang; Wang, Sijian; Chappell, Richard; Imperiale, Thomas F
2016-08-01
Colorectal cancer is the second leading cause of death from cancer in the United States. To facilitate the efficiency of colorectal cancer screening, there is a need to stratify risk for colorectal cancer among the 90% of US residents who are considered "average risk." In this article, we investigate such risk stratification rules for advanced colorectal neoplasia (colorectal cancer and advanced, precancerous polyps). We use a recently completed large cohort study of subjects who underwent a first screening colonoscopy. Logistic regression models have been used in the literature to estimate the risk of advanced colorectal neoplasia based on quantifiable risk factors. However, logistic regression may be prone to overfitting and instability in variable selection. Since most of the risk factors in our study have several categories, it was tempting to collapse these categories into fewer risk groups. We propose a penalized logistic regression method that automatically and simultaneously selects variables, groups categories, and estimates their coefficients by penalizing the L1-norm of both the coefficients and their differences. Hence, it encourages sparsity in the categories, i.e. grouping of the categories, and sparsity in the variables, i.e. variable selection. We apply the penalized logistic regression method to our data. The important variables are selected, with close categories simultaneously grouped, by penalized regression models with and without the interaction terms. The models are validated with 10-fold cross-validation. The receiver operating characteristic curves of the penalized regression models dominate the receiver operating characteristic curve of naive logistic regressions, indicating a superior discriminative performance. © The Author(s) 2013.
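The penalty described above, on both the coefficients and their differences, is not available off the shelf in scikit-learn; the sketch below uses a plain L1-penalized logistic regression on dummy-coded categories as a simpler stand-in that performs variable selection but not the category grouping a fused-lasso-type penalty would give.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

def l1_risk_model(df: pd.DataFrame, categorical_cols, outcome_col, C=0.5):
    """L1-penalized logistic regression on dummy-coded risk factor categories.

    Unlike the published method, this only encourages variable selection;
    grouping adjacent categories would additionally require penalizing the
    differences between their coefficients.
    """
    X = df[categorical_cols]
    y = df[outcome_col]
    model = make_pipeline(
        OneHotEncoder(handle_unknown="ignore"),
        LogisticRegression(penalty="l1", solver="liblinear", C=C),
    )
    return model.fit(X, y)
```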
Tanamas, Stephanie K; Teichtahl, Andrew J; Wluka, Anita E; Wang, Yuanyuan; Davies-Tuck, Miranda; Urquhart, Donna M; Jones, Graeme; Cicuttini, Flavia M
2010-05-10
Whilst patellofemoral pain is one of the most common musculoskeletal disorders presenting to orthopaedic clinics, sports clinics, and general practices, factors contributing to its development in the absence of a defined arthropathy, such as osteoarthritis (OA), are unclear. The aim of this cross-sectional study was to describe the relationships between parameters of patellofemoral geometry (patella inclination, sulcus angle and patella height) and knee pain and patella cartilage volume. 240 community-based adults aged 25-60 years were recruited to take part in a study of obesity and musculoskeletal health. Magnetic resonance imaging (MRI) of the dominant knee was used to determine the lateral condyle-patella angle, sulcus angle, and Insall-Salvati ratio, as well as patella cartilage and bone volumes. Pain was assessed by the Western Ontario and McMaster University Osteoarthritis Index (WOMAC) VA pain subscale. Increased lateral condyle-patella angle (increased medial patella inclination) was associated with a reduction in WOMAC pain score (Regression coefficient -1.57, 95% CI -3.05, -0.09) and increased medial patella cartilage volume (Regression coefficient 51.38 mm3, 95% CI 1.68, 101.08 mm3). Higher riding patella as indicated by increased Insall-Salvati ratio was associated with decreased medial patella cartilage volume (Regression coefficient -3187 mm3, 95% CI -5510, -864 mm3). There was a trend for increased lateral patella cartilage volume associated with increased (shallower) sulcus angle (Regression coefficient 43.27 mm3, 95% CI -2.43, 88.98 mm3). These results suggest both symptomatic and structural benefits associated with a more medially inclined patella while a high-riding patella may be detrimental to patella cartilage. This provides additional theoretical support for the current use of corrective strategies for patella malalignment that are aimed at medial patella translation, although longitudinal studies will be needed to further substantiate this.
ERIC Educational Resources Information Center
Moss, Brian G.; Yeaton, William H.; Lloyd, Jane E.
2014-01-01
Using a novel design approach, a randomized experiment (RE) was embedded within a regression discontinuity (RD) design (R-RE-D) to evaluate the impact of developmental mathematics at a large midwestern college ("n" = 2,122). Within a region of uncertainty near the cut-score, estimates of benefit from a prospective RE were closely…
Multiple Imputation of a Randomly Censored Covariate Improves Logistic Regression Analysis.
Atem, Folefac D; Qian, Jing; Maye, Jacqueline E; Johnson, Keith A; Betensky, Rebecca A
2016-01-01
Randomly censored covariates arise frequently in epidemiologic studies. The most commonly used methods, including complete case and single imputation or substitution, suffer from inefficiency and bias. They make strong parametric assumptions or they consider limit of detection censoring only. We employ multiple imputation, in conjunction with semi-parametric modeling of the censored covariate, to overcome these shortcomings and to facilitate robust estimation. We develop a multiple imputation approach for randomly censored covariates within the framework of a logistic regression model. We use the non-parametric estimate of the covariate distribution or the semiparametric Cox model estimate in the presence of additional covariates in the model. We evaluate this procedure in simulations, and compare its operating characteristics to those from the complete case analysis and a survival regression approach. We apply the procedures to an Alzheimer's study of the association between amyloid positivity and maternal age of onset of dementia. Multiple imputation achieves lower standard errors and higher power than the complete case approach under heavy and moderate censoring and is comparable under light censoring. The survival regression approach achieves the highest power among all procedures, but does not produce interpretable estimates of association. Multiple imputation offers a favorable alternative to complete case analysis and ad hoc substitution methods in the presence of randomly censored covariates within the framework of logistic regression.
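A minimal sketch of multiple imputation for a randomly right-censored covariate in a logistic regression, using the nonparametric (Kaplan-Meier) estimate of the covariate distribution and Rubin's rules for pooling; the simulated data, m = 20 imputations, and variable names are illustrative assumptions, not the authors' semiparametric procedure.

```python
import numpy as np
import statsmodels.api as sm
from lifelines import KaplanMeierFitter

rng = np.random.default_rng(1)
n = 400
x_true = rng.exponential(2.0, n)                 # covariate subject to censoring
c = rng.exponential(3.0, n)                      # random censoring times
obs = np.minimum(x_true, c)
event = (x_true <= c).astype(int)                # 1 = observed, 0 = censored
y = rng.binomial(1, 1 / (1 + np.exp(-(-1.0 + 0.5 * x_true))))

kmf = KaplanMeierFitter().fit(obs, event_observed=event)
t = kmf.survival_function_.index.values
s = kmf.survival_function_["KM_estimate"].values
support, pmf = t[1:], -np.diff(s)                # KM probability mass at each event time

def impute(c_val):
    """Draw from the KM-estimated distribution of X conditional on X > c_val."""
    mask = support > c_val
    if pmf[mask].sum() <= 0:
        return c_val                             # no mass beyond c: keep the censored value
    p = pmf[mask] / pmf[mask].sum()
    return rng.choice(support[mask], p=p)

m, est, var = 20, [], []
for _ in range(m):
    x_imp = np.where(event == 1, obs, [impute(v) for v in obs])
    res = sm.Logit(y, sm.add_constant(x_imp)).fit(disp=0)
    est.append(res.params[1])
    var.append(res.bse[1] ** 2)

qbar = np.mean(est)                              # Rubin's rules for pooling
B, W = np.var(est, ddof=1), np.mean(var)
se = np.sqrt(W + (1 + 1 / m) * B)
print(f"pooled log-odds ratio {qbar:.2f} (SE {se:.2f})")
```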
ERIC Educational Resources Information Center
Vasu, Ellen Storey
1978-01-01
The effects of the violation of the assumption of normality in the conditional distributions of the dependent variable, coupled with the condition of multicollinearity upon the outcome of testing the hypothesis that the regression coefficient equals zero, are investigated via a Monte Carlo study. (Author/JKS)
ERIC Educational Resources Information Center
Marland, Eric; Bossé, Michael J.; Rhoads, Gregory
2018-01-01
Rounding is a necessary step in many mathematical processes. We are taught early in our education about significant figures and how to properly round a number. So when we are given a data set and asked to find a regression line, we are inclined to offer the line with rounded coefficients to reflect our model. However, the effects are not as…
Modeling maximum daily temperature using a varying coefficient regression model
Han Li; Xinwei Deng; Dong-Yum Kim; Eric P. Smith
2014-01-01
Relationships between stream water and air temperatures are often modeled using linear or nonlinear regression methods. Despite a strong relationship between water and air temperatures and a variety of models that are effective for data summarized on a weekly basis, such models did not yield consistently good predictions for summaries such as daily maximum temperature...
NASA Astrophysics Data System (ADS)
Huttunen, Jani; Kokkola, Harri; Mielonen, Tero; Esa Juhani Mononen, Mika; Lipponen, Antti; Reunanen, Juha; Vilhelm Lindfors, Anders; Mikkonen, Santtu; Erkki Juhani Lehtinen, Kari; Kouremeti, Natalia; Bais, Alkiviadis; Niska, Harri; Arola, Antti
2016-07-01
In order to have a good estimate of the current forcing by anthropogenic aerosols, knowledge on past aerosol levels is needed. Aerosol optical depth (AOD) is a good measure for aerosol loading. However, dedicated measurements of AOD are only available from the 1990s onward. One option to lengthen the AOD time series beyond the 1990s is to retrieve AOD from surface solar radiation (SSR) measurements taken with pyranometers. In this work, we have evaluated several inversion methods designed for this task. We compared a look-up table method based on radiative transfer modelling, a non-linear regression method and four machine learning methods (Gaussian process, neural network, random forest and support vector machine) with AOD observations carried out with a sun photometer at an Aerosol Robotic Network (AERONET) site in Thessaloniki, Greece. Our results show that most of the machine learning methods produce AOD estimates comparable to the look-up table and non-linear regression methods. All of the applied methods produced AOD values that corresponded well to the AERONET observations with the lowest correlation coefficient value being 0.87 for the random forest method. While many of the methods tended to slightly overestimate low AODs and underestimate high AODs, neural network and support vector machine showed overall better correspondence for the whole AOD range. The differences in producing both ends of the AOD range seem to be caused by differences in the aerosol composition. High AODs were in most cases those with high water vapour content which might affect the aerosol single scattering albedo (SSA) through uptake of water into aerosols. Our study indicates that machine learning methods benefit from the fact that they do not constrain the aerosol SSA in the retrieval, whereas the LUT method assumes a constant value for it. This would also mean that machine learning methods could have potential in reproducing AOD from SSR even though SSA would have changed during the observation period.
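A rough sketch, with synthetic data standing in for the pyranometer-derived predictor and AERONET AOD, of comparing a non-linear regression fit against the machine-learning regressors mentioned above by their correlation with held-out observations.

```python
import numpy as np
from scipy.optimize import curve_fit
from sklearn.ensemble import RandomForestRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR

rng = np.random.default_rng(2)
ssr = rng.uniform(0.4, 1.0, 600)                     # normalized surface solar radiation (assumed)
aod = np.clip(-1.5 * np.log(ssr) + rng.normal(0, 0.05, ssr.size), 0, None)
X_tr, X_te, y_tr, y_te = train_test_split(ssr.reshape(-1, 1), aod, random_state=0)

def corr(pred, obs):
    return np.corrcoef(pred, obs)[0, 1]

# non-linear regression: AOD = a * ln(SSR) + b
popt, _ = curve_fit(lambda s, a, b: a * np.log(s) + b, X_tr.ravel(), y_tr)
print("non-linear regression r =", round(corr(popt[0] * np.log(X_te.ravel()) + popt[1], y_te), 3))

for name, model in [("random forest", RandomForestRegressor(random_state=0)),
                    ("support vector machine", SVR()),
                    ("neural network", MLPRegressor(max_iter=2000, random_state=0)),
                    ("Gaussian process", GaussianProcessRegressor())]:
    model.fit(X_tr, y_tr)
    print(name, "r =", round(corr(model.predict(X_te), y_te), 3))
```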
NASA Astrophysics Data System (ADS)
Valyaev, A. B.; Krivoshlykov, S. G.
1989-06-01
It is shown that the problem of investigating the mode composition of a partly coherent radiation beam in a randomly inhomogeneous medium can be reduced to a study of evolution of the energy of individual modes and of the coefficients of correlations between the modes. General expressions are obtained for the coupling coefficients of modes in a parabolic waveguide with a random microbending of the axis and an analysis is made of their evolution as a function of the excitation conditions. An estimate is obtained of the distance in which a steady-state energy distribution between the modes is established. Explicit expressions are obtained for the correlation function in the case when a waveguide is excited by off-axial Gaussian beams or Gauss-Hermite modes.
Modeling of Thermal Phase Noise in a Solid Core Photonic Crystal Fiber-Optic Gyroscope
Song, Ningfang; Ma, Kun; Jin, Jing; Teng, Fei; Cai, Wei
2017-01-01
A theoretical model of the thermal phase noise in a square-wave modulated solid core photonic crystal fiber-optic gyroscope has been established, and then verified by measurements. The results demonstrate a good agreement between theory and experiment. The contribution of the thermal phase noise to the random walk coefficient of the gyroscope is derived. A fiber coil with 2.8 km length is used in the experimental solid core photonic crystal fiber-optic gyroscope, showing a random walk coefficient of 9.25 × 10−5 deg/h. PMID:29072605
Bennett, Bradley C; Husby, Chad E
2008-03-28
Botanical pharmacopoeias are non-random subsets of floras, with some taxonomic groups over- or under-represented. Moerman [Moerman, D.E., 1979. Symbols and selectivity: a statistical analysis of Native American medical ethnobotany, Journal of Ethnopharmacology 1, 111-119] introduced linear regression/residual analysis to examine these patterns. However, regression, the commonly-employed analysis, suffers from several statistical flaws. We use contingency table and binomial analyses to examine patterns of Shuar medicinal plant use (from Amazonian Ecuador). We first analyzed the Shuar data using Moerman's approach, modified to better meet requirements of linear regression analysis. Second, we assessed the exact randomization contingency table test for goodness of fit. Third, we developed a binomial model to test for non-random selection of plants in individual families. Modified regression models (which accommodated assumptions of linear regression) reduced R(2) from 0.59 to 0.38, but did not eliminate all problems associated with regression analyses. Contingency table analyses revealed that the entire flora departs from the null model of equal proportions of medicinal plants in all families. In the binomial analysis, only 10 angiosperm families (of 115) differed significantly from the null model. These 10 families are largely responsible for patterns seen at higher taxonomic levels. Contingency table and binomial analyses offer an easy and statistically valid alternative to the regression approach.
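A small sketch of the binomial check described above: for each plant family, test whether the number of medicinal species departs from what a random draw from the whole flora would give. The counts below are made-up placeholders, not the Shuar data.

```python
from scipy.stats import binomtest

flora_total, medicinal_total = 3000, 600          # species in the flora / used medicinally (assumed)
p_overall = medicinal_total / flora_total
families = {"Asteraceae": (150, 48), "Piperaceae": (90, 31), "Poaceae": (120, 9)}  # (species, medicinal)

for name, (n_species, n_medicinal) in families.items():
    res = binomtest(n_medicinal, n_species, p_overall)
    direction = "over" if n_medicinal / n_species > p_overall else "under"
    print(f"{name}: {direction}-represented, two-sided p = {res.pvalue:.4f}")
```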
Does oral alprazolam affect ventilation? A randomised, double-blind, placebo-controlled trial.
Carraro, G E; Russi, E W; Buechi, S; Bloch, Konrad E
2009-05-01
The respiratory effects of benzodiazepines have been controversial. This investigation aimed to study the effects of oral alprazolam on ventilation. In a randomised, double-blind cross-over protocol, 20 healthy men ingested 1 mg of alprazolam or placebo in random order, 1 week apart. Ventilation was unobtrusively monitored by inductance plethysmography along with end-tidal PCO(2) and pulse oximetry 60-160 min after drug intake. Subjects were encouraged to keep their eyes open. Mean +/- SD minute ventilation 120 min after alprazolam and placebo was similar (6.21 +/- 0.71 vs 6.41 +/- 1.12 L/min, P = NS). End-tidal PCO(2) and oxygen saturation also did not differ between treatments. However, coefficients of variation of minute ventilation after alprazolam exceeded those after placebo (43 +/- 23% vs 31 +/- 13%, P < 0.05). More encouragements to keep the eyes open were required after alprazolam than after placebo (5.2 +/- 5.7 vs 1.3 +/- 2.3 calls, P < 0.05). In a multiple regression analysis, higher coefficients of variation of minute ventilation after alprazolam were related to a greater number of calls. Oral alprazolam in a mildly sedative dose has no clinically relevant effect on ventilation in healthy, awake men. The increased variability of ventilation on alprazolam seems related to vigilance fluctuations rather than to a direct drug effect on ventilation.
NASA Astrophysics Data System (ADS)
Meng, Q. Y.; Svendsgaard, D.; Kotchmar, D. J.; Pinto, J. P.
2012-09-01
Although positive associations between ambient NO2 concentrations and personal exposures have generally been found by exposure studies, the strength of the associations varied among studies. Differences in results could be related to differences in study design and in exposure factors. However, the effects of study design, exposure factors, and sampling and measurement errors on the strength of the personal-ambient associations have not been evaluated quantitatively in a systematic manner. A quantitative research synthesis was conducted to examine these issues based on peer-reviewed publications in the past 30 years. Factors affecting the strength of the personal-ambient associations across the studies were also examined with meta-regression. Ambient NO2 was found to be significantly associated with personal NO2 exposures, with estimates of 0.42, 0.16, and 0.72 for overall pooled, longitudinal and daily average correlation coefficients based on random-effects meta-analysis. This conclusion was robust after correction for publication bias with correlation coefficients of 0.37, 0.16 and 0.45. We found that season and some population characteristics, such as pre-existing disease, were significant factors affecting the strength of the personal-ambient associations. More meaningful and rigorous comparisons would be possible if greater detail were published on the study design (e.g. local and indoor sources, housing characteristics, etc.) and data quality (e.g., detection limits and percent of data above detection limits).
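A compact sketch of DerSimonian-Laird random-effects pooling of correlation coefficients (via Fisher's z), the kind of calculation behind the pooled personal-ambient NO2 associations above; the study-level r and n values are invented for illustration.

```python
import numpy as np

r = np.array([0.55, 0.30, 0.48, 0.20, 0.62])      # study correlations (assumed)
n = np.array([120, 80, 200, 60, 150])             # study sample sizes (assumed)

z = np.arctanh(r)                                  # Fisher z transform
v = 1.0 / (n - 3)                                  # within-study variances
w = 1.0 / v

z_fixed = np.sum(w * z) / np.sum(w)
Q = np.sum(w * (z - z_fixed) ** 2)                 # Cochran's Q
tau2 = max(0.0, (Q - (len(r) - 1)) / (np.sum(w) - np.sum(w ** 2) / np.sum(w)))

w_re = 1.0 / (v + tau2)                            # random-effects weights
z_re = np.sum(w_re * z) / np.sum(w_re)
se = np.sqrt(1.0 / np.sum(w_re))
lo, hi = np.tanh(z_re - 1.96 * se), np.tanh(z_re + 1.96 * se)
print(f"pooled r = {np.tanh(z_re):.2f} (95% CI {lo:.2f} to {hi:.2f}), tau^2 = {tau2:.3f}")
```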
Meylan, César M P; Cronin, John B; Oliver, Jon L; Hughes, Michael M G; Jidovtseff, Boris; Pinder, Shane
2015-03-01
The purpose of this study was to quantify the inter-session reliability of force-velocity-power profiling and estimated maximal strength in youth. Thirty-six males (11-15 years old) performed a ballistic supine leg press test at five randomized loads (80%, 100%, 120%, 140%, and 160% body mass) on three separate occasions. Peak and mean force, power, velocity, and peak displacement were collected with a linear position transducer attached to the weight stack. Mean values at each load were used to calculate different regression lines and estimate maximal strength, force, velocity, and power. All variables were found reliable (change in the mean [CIM] = - 1 to 14%; coefficient of variation [CV] = 3-18%; intraclass correlation coefficient [ICC] = 0.74-0.99), but were likely to benefit from a familiarization, apart from the unreliable maximal force/velocity ratio (CIM = 0-3%; CV = 23-25%; ICC = 0.35-0.54) and load at maximal power (CIM = - 1 to 2%; CV = 10-13%; ICC = 0.26-0.61). Isoinertial force-velocity-power profiling and maximal strength in youth can be assessed after a familiarization session. Such profiling may provide valuable insight into neuromuscular capabilities during growth and maturation and may be used to monitor specific training adaptations.
Lu, Jia-hui; Zhang, Yi-bo; Zhang, Zhuo-yong; Meng, Qing-fan; Guo, Wei-liang; Teng, Li-rong
2008-06-01
A calibration model (WT-RBFNN) combining wavelet transform (WT) and radial basis function neural network (RBFNN) was proposed for the synchronous and rapid determination of rifampicin and isoniazide in Rifampicin and Isoniazide tablets by near infrared reflectance spectroscopy (NIRS). The approximation coefficients were used as input data in RBFNN. The network parameters, including the number of hidden layer neurons and the spread constant (SC), were investigated. The WT-RBFNN model compressed the original spectral data, removed noise and background interference, and reduced randomness, so its predictive capability was well optimized. The root mean square errors of prediction (RMSEP) for the determination of rifampicin and isoniazide obtained from the optimum WT-RBFNN model are 0.00639 and 0.00587, and the root mean square errors of cross-validation (RMSECV) are 0.00604 and 0.00457, respectively, which are superior to those obtained by the optimum RBFNN and PLS models. The regression coefficients (R) between NIRS-predicted values and RP-HPLC values for rifampicin and isoniazide are 0.99522 and 0.99392, respectively, and the relative error is lower than 2.300%. It was verified that the WT-RBFNN model is a suitable approach to dealing with NIRS data. The proposed WT-RBFNN model is convenient, rapid, and pollution-free for the determination of rifampicin and isoniazide in tablets.
Mohamadirizi, Soheila; Kordi, Masoumeh
2013-01-01
Background: Menstruation signs are among the most common disorders in adolescents and are influenced by various environmental and psychosocial factors. This study aimed to define the association between menstruation signs and anxiety, depression, and stress in school girls in Mashhad in 2011-2012. Materials and Methods: This was a cross-sectional study on 407 high school girls in Mashhad who were selected through two-step random sampling. The students completed a questionnaire concerning demographic characteristics, menstruation, Depression, Anxiety, and Stress Scale of 21 questions (DASS-21), and menstruation signs in three phases of their menstruation. Data were analyzed by the statistical tests of Pearson correlation coefficient, Student's t-test, one-way analysis of variance (ANOVA), and regression through SPSS version 14. Results: Based on the findings, 74% of the subjects reported pre-menstruation signs, 94% reported signs during bleeding, and 40.8% reported post-menstruation signs. About 44.3% of the subjects had anxiety, 45.5% had depression, and 47.2% had stress. In addition, Pearson correlation coefficient test showed a significant positive correlation between menstruation signs and depression, anxiety, and stress (P < 0.05). Conclusion: With regard to the association between menstruation signs and psycho-cognitive variables, prevention and treatment of these disorders by the authorities of education and training and the Ministry of Health are essential. PMID:24403944
Estimating future burned areas under changing climate in the EU-Mediterranean countries.
Amatulli, Giuseppe; Camia, Andrea; San-Miguel-Ayanz, Jesús
2013-04-15
The impacts of climate change on forest fires have received increased attention in recent years at both continental and local scales. It is widely recognized that weather plays a key role in extreme fire situations. It is therefore of great interest to analyze projected changes in fire danger under climate change scenarios and to assess the consequent impacts of forest fires. In this study we estimated burned areas in the European Mediterranean (EU-Med) countries under past and future climate conditions. Historical (1985-2004) monthly burned areas in EU-Med countries were modeled by using the Canadian Fire Weather Index (CFWI). Monthly averages of the CFWI sub-indices were used as explanatory variables to estimate the monthly burned areas in each of the five most affected countries in Europe using three different modeling approaches (Multiple Linear Regression - MLR, Random Forest - RF, Multivariate Adaptive Regression Splines - MARS). MARS outperformed the other methods. Regression equations and significant coefficients of determination were obtained, although there were noticeable differences from country to country. Climatic conditions at the end of the 21st Century were simulated using results from the runs of the regional climate model HIRHAM in the European project PRUDENCE, considering two IPCC SRES scenarios (A2-B2). The MARS models were applied to both scenarios resulting in projected burned areas in each country and in the EU-Med region. Results showed that significant increases in total burned area, of 66% and 140%, can be expected in the EU-Med region under the A2 and B2 scenarios, respectively. Copyright © 2013 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Musa, Rosliza; Ali, Zalila; Baharum, Adam; Nor, Norlida Mohd
2017-08-01
The linear regression model assumes that all random error components are identically and independently distributed with constant variance. Hence, each data point provides equally precise information about the deterministic part of the total variation. In other words, the standard deviations of the error terms are constant over all values of the predictor variables. When the assumption of constant variance is violated, the ordinary least squares estimator of the regression coefficients loses its property of minimum variance in the class of linear and unbiased estimators. Weighted least squares estimation is often used to maximize the efficiency of parameter estimation. A procedure that treats all of the data equally would give less precisely measured points more influence than they should have and would give highly precise points too little influence. Optimizing the weighted fitting criterion to find the parameter estimates allows the weights to determine the contribution of each observation to the final parameter estimates. This study used a polynomial model with weighted least squares estimation to investigate paddy production of different paddy lots based on paddy cultivation characteristics and environmental characteristics in the area of Kedah and Perlis. The results indicated that factors affecting paddy production are mixture fertilizer application cycle, average temperature, the squared effect of average rainfall, the squared effect of pest and disease, the interaction between acreage and amount of mixture fertilizer, the interaction between paddy variety and NPK fertilizer application cycle and the interaction between pest and disease and NPK fertilizer application cycle.
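A brief sketch of a weighted least squares fit of a polynomial regression in the spirit of the model above; the single predictor, the error-variance pattern, and the weights (taken as the reciprocal of the error variance) are illustrative assumptions, not the paddy data.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
temp = rng.uniform(24, 34, 200)                                # average temperature (assumed predictor)
sigma = 0.2 + 0.1 * (temp - 24)                                # non-constant error standard deviation
yield_t = 5 + 0.8 * temp - 0.012 * temp ** 2 + rng.normal(0, sigma)

X = sm.add_constant(np.column_stack([temp, temp ** 2]))        # degree-2 polynomial terms
ols = sm.OLS(yield_t, X).fit()
wls = sm.WLS(yield_t, X, weights=1.0 / sigma ** 2).fit()       # weights = 1 / variance
print("OLS coefficients:", np.round(ols.params, 3))
print("WLS coefficients:", np.round(wls.params, 3))
print("WLS std errors  :", np.round(wls.bse, 3))
```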
Popa, Laurentiu S.; Hewitt, Angela L.; Ebner, Timothy J.
2012-01-01
The cerebellum has been implicated in processing motor errors required for online control of movement and motor learning. The dominant view is that Purkinje cell complex spike discharge signals motor errors. This study investigated whether errors are encoded in the simple spike discharge of Purkinje cells in monkeys trained to manually track a pseudo-randomly moving target. Four task error signals were evaluated based on cursor movement relative to target movement. Linear regression analyses based on firing residuals ensured that the modulation with a specific error parameter was independent of the other error parameters and kinematics. The results demonstrate that simple spike firing in lobules IV–VI is significantly correlated with position, distance and directional errors. Independent of the error signals, the same Purkinje cells encode kinematics. The strongest error modulation occurs at feedback timing. However, in 72% of cells at least one of the R2 temporal profiles resulting from regressing firing with individual errors exhibit two peak R2 values. For these bimodal profiles, the first peak is at a negative τ (lead) and a second peak at a positive τ (lag), implying that Purkinje cells encode both prediction and feedback about an error. For the majority of the bimodal profiles, the signs of the regression coefficients or preferred directions reverse at the times of the peaks. The sign reversal results in opposing simple spike modulation for the predictive and feedback components. Dual error representations may provide the signals needed to generate sensory prediction errors used to update a forward internal model. PMID:23115173
Recovering DC coefficients in block-based DCT.
Uehara, Takeyuki; Safavi-Naini, Reihaneh; Ogunbona, Philip
2006-11-01
It is a common approach for JPEG and MPEG encryption systems to provide higher protection for dc coefficients and less protection for ac coefficients. Some authors have employed a cryptographic encryption algorithm for the dc coefficients and left the ac coefficients to techniques based on random permutation lists which are known to be weak against known-plaintext and chosen-ciphertext attacks. In this paper we show that in block-based DCT, it is possible to recover dc coefficients from ac coefficients with reasonable image quality and show the insecurity of image encryption methods which rely on the encryption of dc values using a cryptoalgorithm. The method proposed in this paper combines dc recovery from ac coefficients and the fact that ac coefficients can be recovered using a chosen ciphertext attack. We demonstrate that a method proposed by Tang to encrypt and decrypt MPEG video can be completely broken.
Random forest models to predict aqueous solubility.
Palmer, David S; O'Boyle, Noel M; Glen, Robert C; Mitchell, John B O
2007-01-01
Random Forest regression (RF), Partial-Least-Squares (PLS) regression, Support Vector Machines (SVM), and Artificial Neural Networks (ANN) were used to develop QSPR models for the prediction of aqueous solubility, based on experimental data for 988 organic molecules. The Random Forest regression model predicted aqueous solubility more accurately than those created by PLS, SVM, and ANN and offered methods for automatic descriptor selection, an assessment of descriptor importance, and an in-parallel measure of predictive ability, all of which serve to recommend its use. The prediction of log molar solubility for an external test set of 330 molecules that are solid at 25 degrees C gave an r2 = 0.89 and RMSE = 0.69 log S units. For a standard data set selected from the literature, the model performed well with respect to other documented methods. Finally, the diversity of the training and test sets are compared to the chemical space occupied by molecules in the MDL drug data report, on the basis of molecular descriptors selected by the regression analysis.
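A toy sketch of a Random Forest regression QSPR workflow like the one described: fit on molecular descriptors, rank descriptor importances, and use the out-of-bag estimate as an in-parallel measure of predictive ability. The descriptors and solubilities here are random placeholders, not the 988-molecule data set.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
X = rng.normal(size=(988, 30))                       # 30 hypothetical molecular descriptors
logS = X[:, 0] - 0.5 * X[:, 3] + 0.2 * X[:, 7] + rng.normal(0, 0.5, 988)

X_tr, X_te, y_tr, y_te = train_test_split(X, logS, test_size=0.33, random_state=0)
rf = RandomForestRegressor(n_estimators=500, oob_score=True, random_state=0).fit(X_tr, y_tr)

pred = rf.predict(X_te)
print("OOB R^2  :", round(rf.oob_score_, 2))
print("test r^2 :", round(r2_score(y_te, pred), 2))
print("test RMSE:", round(mean_squared_error(y_te, pred) ** 0.5, 2))
print("top descriptors:", np.argsort(rf.feature_importances_)[::-1][:3])
```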
Sun, Zhi-Jing; Zhu, Lan; Liang, Maolian; Xu, Tao; Lang, Jing-He
2016-08-01
WeChat is a promising tool for capturing electronic data; however, no research has examined its use. This study evaluates the reliability and feasibility of WeChat for administering the Pelvic Floor Impact Questionnaire Short Form 7 questionnaire to women with pelvic floor disorders. Sixty-eight women undergoing pelvic floor rehabilitation were recruited between June and December 2015 and randomized to two groups in a crossover design. All participants completed two questionnaire formats. One group completed the paper version followed by the WeChat version; the other group completed the questionnaires in reverse order. Two weeks later, each group completed the two versions in reverse order. The WeChat version's reliability was assessed using intraclass correlation coefficients and test-retest reliability. Forty-two women (61.8%) preferred the WeChat format to the paper format, eight (11.8%) preferred the paper format, and 18 (26.5%) had no preference. The younger women preferred WeChat. Completion time was 116.5 (61.3) seconds for the WeChat version and 133.4 (107.0) seconds for the paper version, with no significant difference (P = 0.145). Age and education did not impact completion time (P > 0.05). Consistency between the WeChat and paper versions was excellent. The intraclass correlation coefficients of the Pelvic Floor Impact Questionnaire Short Form 7 and the three subscales ranged from 0.915 to 0.980. The Bland-Altman analysis and linear regression results also showed high consistency. The test-retest study had a Pearson's correlation coefficient of 0.908, demonstrating a strong correlation. WeChat-based questionnaires were well accepted by women with pelvic floor disorders and had good data quality and reliability.
van Hasselt, Tim J; Pickles, Oliver; Midgley-Hunt, Alex; Jiang, Chao Quiang; Zhang, Wei Sen; Cheng, Kar Keung; Thomas, Graham Neil; Lam, Tai Hing
2014-01-01
Green tea consumption has been associated with many prophylactic health benefits. This study examined for the first time associations between tea consumption and renal function in a Chinese population. Cross-sectional baseline data, including demographics, lifestyle, and weekly consumption of green, black, and oolong tea, were analyzed from 12,428 ambulatory subjects aged 50 to 85 years (67.3% female) who were randomly selected from the membership list of a community social and welfare association in Guangzhou, China. Associations between tea consumption and renal function were assessed using regression analyses to adjust for potential confounding factors. Renal function was assessed using the estimated glomerular filtration rate (eGFR) and in a subcohort of 1,910 participants using a spot urinary albumin-to-creatinine ratio. Six thousand eight hundred and seventy-two participants drank at least 1 type of tea. Oolong tea consumption was negatively associated with eGFR (β-coefficient -0.019, P = .025), but in a gender-stratified analysis this was not the case. In men, black tea was positively associated with eGFR (β-coefficient 0.037, P = .013), but not in women (β-coefficient -0.002, P = .856). Otherwise, no statistically significant consistent associations between the measures of renal function and consumption of green tea, black tea, or oolong tea individually or total tea consumption were identified. Overall there was no clear evidence to suggest any consistent association between renal function and tea consumption in this large population-based study of older Chinese individuals. Copyright © 2014 National Kidney Foundation, Inc. Published by Elsevier Inc. All rights reserved.
Regression discontinuity was a valid design for dichotomous outcomes in three randomized trials.
van Leeuwen, Nikki; Lingsma, Hester F; Mooijaart, Simon P; Nieboer, Daan; Trompet, Stella; Steyerberg, Ewout W
2018-06-01
Regression discontinuity (RD) is a quasi-experimental design that may provide valid estimates of treatment effects in case of continuous outcomes. We aimed to evaluate validity and precision in the RD design for dichotomous outcomes. We performed validation studies in three large randomized controlled trials (RCTs) (Corticosteroid Randomization After Significant Head injury [CRASH], the Global Utilization of Streptokinase and Tissue Plasminogen Activator for Occluded Coronary Arteries [GUSTO], and PROspective Study of Pravastatin in elderly individuals at risk of vascular disease [PROSPER]). To mimic the RD design, we selected patients above and below a cutoff (e.g., age 75 years) randomized to treatment and control, respectively. Adjusted logistic regression models using restricted cubic splines (RCS) and polynomials and local logistic regression models estimated the odds ratio (OR) for treatment, with 95% confidence intervals (CIs) to indicate precision. In CRASH, treatment increased mortality with OR 1.22 [95% CI 1.06-1.40] in the RCT. The RD estimates were 1.42 (0.94-2.16) and 1.13 (0.90-1.40) with RCS adjustment and local regression, respectively. In GUSTO, treatment reduced mortality (OR 0.83 [0.72-0.95]), with more extreme estimates in the RD analysis (OR 0.57 [0.35; 0.92] and 0.67 [0.51; 0.86]). In PROSPER, similar RCT and RD estimates were found, again with less precision in RD designs. We conclude that the RD design provides similar but substantially less precise treatment effect estimates compared with an RCT, with local regression being the preferred method of analysis. Copyright © 2018 Elsevier Inc. All rights reserved.
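A schematic sketch of the local logistic regression analysis of a regression discontinuity design described above: keep observations within a bandwidth of the cut-off, regress the outcome on treatment plus the centred assignment variable (with separate slopes on each side), and read off the treatment odds ratio. The data, cut-off, and bandwidth are assumed, not taken from CRASH, GUSTO, or PROSPER.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n, cutoff, bandwidth = 5000, 75.0, 5.0
age = rng.uniform(60, 90, n)                         # assignment variable (assumed)
treated = (age >= cutoff).astype(float)              # treatment assigned above the cut-off
logit = -2.0 + 0.05 * (age - cutoff) + 0.3 * treated
death = rng.binomial(1, 1 / (1 + np.exp(-logit)))

local = np.abs(age - cutoff) <= bandwidth            # local window around the cut-off
centred = age[local] - cutoff
X = sm.add_constant(np.column_stack([treated[local],
                                     centred,
                                     treated[local] * centred]))  # separate slopes per side
fit = sm.Logit(death[local], X).fit(disp=0)
ci = np.exp(fit.conf_int()[1])
print(f"treatment OR at the cut-off: {np.exp(fit.params[1]):.2f} (95% CI {ci[0]:.2f} to {ci[1]:.2f})")
```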
Wang, Xiaojing; Chen, Ming-Hui; Yan, Jun
2013-07-01
Cox models with time-varying coefficients offer great flexibility in capturing the temporal dynamics of covariate effects on event times, which could be hidden from a Cox proportional hazards model. Methodology development for varying coefficient Cox models, however, has been largely limited to right censored data; only limited work on interval censored data has been done. In most existing methods for varying coefficient models, analysts need to specify which covariate coefficients are time-varying and which are not at the time of fitting. We propose a dynamic Cox regression model for interval censored data in a Bayesian framework, where the coefficient curves are piecewise constant but the number of pieces and the jump points are covariate specific and estimated from the data. The model automatically determines the extent to which the temporal dynamics is needed for each covariate, resulting in smoother and more stable curve estimates. The posterior computation is carried out via an efficient reversible jump Markov chain Monte Carlo algorithm. Inference of each coefficient is based on an average of models with different number of pieces and jump points. A simulation study with three covariates, each with a coefficient of different degree in temporal dynamics, confirmed that the dynamic model is preferred to the existing time-varying model in terms of model comparison criteria through conditional predictive ordinate. When applied to a dental health data of children with age between 7 and 12 years, the dynamic model reveals that the relative risk of emergence of permanent tooth 24 between children with and without an infected primary predecessor is the highest at around age 7.5, and that it gradually reduces to one after age 11. These findings were not seen from the existing studies with Cox proportional hazards models.
NASA Astrophysics Data System (ADS)
Nishidate, Izumi; Abdul, Wares MD.; Ohtsu, Mizuki; Nakano, Kazuya; Haneishi, Hideaki
2018-02-01
We propose a method to estimate transcutaneous bilirubin, hemoglobin, and melanin based on the diffuse reflectance spectroscopy. In the proposed method, the Monte Carlo simulation-based multiple regression analysis for an absorbance spectrum in the visible wavelength region (460-590 nm) is used to specify the concentrations of bilirubin (Cbil), oxygenated hemoglobin (Coh), deoxygenated hemoglobin (Cdh), and melanin (Cm). Using the absorbance spectrum calculated from the measured diffuse reflectance spectrum as a response variable and the extinction coefficients of bilirubin, oxygenated hemoglobin, deoxygenated hemoglobin, and melanin, as predictor variables, multiple regression analysis provides regression coefficients. Concentrations of bilirubin, oxygenated hemoglobin, deoxygenated hemoglobin, and melanin, are then determined from the regression coefficients using conversion vectors that are numerically deduced in advance by the Monte Carlo simulations for light transport in skin. Total hemoglobin concentration (Cth) and tissue oxygen saturation (StO2) are simply calculated from the oxygenated hemoglobin and deoxygenated hemoglobin. In vivo animal experiments with bile duct ligation in rats demonstrated that the estimated Cbil is increased after ligation of bile duct and reaches to around 20 mg/dl at 72 h after the onset of the ligation, which corresponds to the reference value of Cbil measured by a commercially available transcutaneous bilirubin meter. We also performed in vivo experiments with rats while varying the fraction of inspired oxygen (FiO2). Coh and Cdh decreased and increased, respectively, as FiO2 decreased. Consequently, StO2 was dramatically decreased. The results in this study indicate potential of the method for simultaneous evaluation of multiple chromophores in skin tissue.
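A bare-bones sketch of the regression step described above: treat the measured absorbance spectrum as the response and the chromophore extinction spectra as predictors, and take the fitted coefficients as (scaled) estimates of chromophore content. The spectra below are synthetic stand-ins; the published method additionally converts the regression coefficients to concentrations with Monte Carlo-derived conversion vectors, which is not reproduced here.

```python
import numpy as np

wavelengths = np.arange(460, 591)                       # nm, the visible window used above
rng = np.random.default_rng(6)

def fake_extinction(center, width):                     # smooth placeholder extinction spectra
    return np.exp(-0.5 * ((wavelengths - center) / width) ** 2)

E = np.column_stack([fake_extinction(c, w) for c, w in
                     [(460, 40), (540, 15), (555, 20), (500, 80)]])  # bilirubin, HbO2, Hb, melanin (assumed shapes)
true_conc = np.array([0.8, 1.2, 0.6, 0.3])
absorbance = E @ true_conc + rng.normal(0, 0.01, wavelengths.size)

coef, *_ = np.linalg.lstsq(E, absorbance, rcond=None)   # multiple regression coefficients
print(np.round(coef, 2))                                # proportional to chromophore concentrations
```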
Informal Peer-Assisted Learning Groups Did Not Lead to Better Performance of Saudi Dental Students.
AbdelSalam, Maha; El Tantawi, Maha; Al-Ansari, Asim; AlAgl, Adel; Al-Harbi, Fahad
2017-01-01
To describe peer-assisted learning (PAL) groups formed by dental undergraduate students in a biomedical course and to investigate the association of individual and group characteristics with academic performance. In 2015, 92 fourth-year students (43 males and 49 females) in the College of Dentistry, University of Dammam, Saudi Arabia, were invited to form PAL groups to study a unit of a biomedical course. An examination was used to assess their knowledge after 2 weeks. In addition, a questionnaire and social network analysis were used to investigate (1) individual student attributes: gender, role, subject matter knowledge, grade in previous year, teaming with friends, previous communication with teammates, and content discussion, and (2) group attributes: group teacher's previous grade, number of colleagues with whom a student connected, teaming with friends, similarity of teammates' previous grades, and teacher having higher previous grades than other teammates. Regression analysis was used to assess the association of examination scores with individual and group attributes. The response rate was 80.4% (74 students: 36 males and 38 females). Students who previously scored grades A and B had higher examination scores than students with grades C/less (regression coefficient = 18.50 and 13.39) within the groups. Higher scores were not associated with working in groups including friends only (regression coefficient = 1.17) or when all students had similar previous grades (regression coefficient = 0.85). Students with previous high grades benefited to a greater extent from working in PAL groups. Similarity of teammates in PAL groups was not associated with better scores. © 2017 S. Karger AG, Basel.
Daily magnesium intake and serum magnesium concentration among Japanese people.
Akizawa, Yoriko; Koizumi, Sadayuki; Itokawa, Yoshinori; Ojima, Toshiyuki; Nakamura, Yosikazu; Tamura, Tarou; Kusaka, Yukinori
2008-01-01
The vitamins and minerals that are deficient in the daily diet of a normal adult remain unknown. To answer this question, we conducted a population survey focusing on the relationship between dietary magnesium intake and serum magnesium level. The subjects were 62 individuals from Fukui Prefecture who participated in the 1998 National Nutrition Survey. The survey investigated the physical status, nutritional status, and dietary data of the subjects. Holidays and special occasions were avoided, and a day when people are most likely to be on an ordinary diet was selected as the survey date. The mean (+/-standard deviation) daily magnesium intake was 322 (+/-132), 323 (+/-163), and 322 (+/-147) mg/day for men, women, and the entire group, respectively. The mean (+/-standard deviation) serum magnesium concentration was 20.69 (+/-2.83), 20.69 (+/-2.88), and 20.69 (+/-2.83) ppm for men, women, and the entire group, respectively. The distribution of serum magnesium concentration was normal. Dietary magnesium intake showed a log-normal distribution, which was then transformed by logarithmic conversion for examining the regression coefficients. The slope of the regression line between the serum magnesium concentration (Y ppm) and daily magnesium intake (X mg) was determined using the formula Y = 4.93 (log(10)X) + 8.49. The coefficient of correlation (r) was 0.29. A regression line (Y = 14.65X + 19.31) was observed between the daily intake of magnesium (Y mg) and serum magnesium concentration (X ppm). The coefficient of correlation was 0.28. The daily magnesium intake correlated with serum magnesium concentration, and a linear regression model between them was proposed.
Estimation of variance in Cox's regression model with shared gamma frailties.
Andersen, P K; Klein, J P; Knudsen, K M; Tabanera y Palacios, R
1997-12-01
The Cox regression model with a shared frailty factor allows for unobserved heterogeneity or for statistical dependence between the observed survival times. Estimation in this model when the frailties are assumed to follow a gamma distribution is reviewed, and we address the problem of obtaining variance estimates for regression coefficients, frailty parameter, and cumulative baseline hazards using the observed nonparametric information matrix. A number of examples are given comparing this approach with fully parametric inference in models with piecewise constant baseline hazards.
Practical Session: Simple Linear Regression
NASA Astrophysics Data System (ADS)
Clausel, M.; Grégoire, G.
2014-12-01
Two exercises are proposed to illustrate simple linear regression. The first one is based on Galton's famous data set on heredity. We use the R command lm and get coefficient estimates, the residual standard error, R2, residuals… In the second example, devoted to data related to the vapor tension of mercury, we fit a simple linear regression, predict values, and anticipate multiple linear regression. This practical session is an excerpt from practical exercises proposed by A. Dalalyan at ENPC (see Exercises 1 and 2 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_4.pdf).
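The exercise above uses R's lm(); the same simple linear regression steps — coefficient estimates, residual standard error, R2, residuals, and prediction — look like this in Python with statsmodels. The heights here are simulated stand-ins for Galton's data.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
parent = rng.normal(68, 1.8, 900)                       # mid-parent height, inches (simulated)
child = 24 + 0.64 * parent + rng.normal(0, 2.2, 900)    # regression toward the mean

fit = sm.OLS(child, sm.add_constant(parent)).fit()
print(fit.params)                                        # intercept and slope
print("residual SE:", round(np.sqrt(fit.mse_resid), 2), " R^2:", round(fit.rsquared, 3))
print("first residuals:", np.round(fit.resid[:3], 2))
print("prediction at 70 in:", round(fit.predict(np.array([[1.0, 70.0]]))[0], 1))
```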
Determination of suitable drying curve model for bread moisture loss during baking
NASA Astrophysics Data System (ADS)
Soleimani Pour-Damanab, A. R.; Jafary, A.; Rafiee, S.
2013-03-01
This study presents mathematical modelling of bread moisture loss, or drying, during baking in a conventional bread baking process. In order to estimate and select the appropriate moisture loss curve equation, 11 different models, semi-theoretical and empirical, were fitted to the experimental data by nonlinear regression analysis and compared according to their correlation coefficients, chi-squared test values, and root mean square errors. Of all the drying models, the Page model was selected as the best, based on its correlation coefficient, chi-squared, and root mean square error values and on its simplicity. The mean absolute estimation error of the proposed model, assessed by linear regression analysis, was 2.43% and 4.74% for the natural and forced convection modes, respectively.
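A short sketch of fitting the Page thin-layer model, MR(t) = exp(-k * t^n), to a moisture-ratio curve by nonlinear regression and then scoring it with the correlation coefficient, reduced chi-square, and RMSE used in the comparison above. The baking data here are simulated placeholders.

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(8)
t = np.linspace(1, 30, 16)                                        # baking time, min (assumed)
mr_obs = np.exp(-0.08 * t ** 1.3) + rng.normal(0, 0.01, t.size)   # observed moisture ratio (simulated)

page = lambda t, k, n: np.exp(-k * t ** n)                        # Page model
(k, n_exp), _ = curve_fit(page, t, mr_obs, p0=[0.05, 1.0])
mr_fit = page(t, k, n_exp)

resid = mr_obs - mr_fit
rmse = np.sqrt(np.mean(resid ** 2))
chi2 = np.sum(resid ** 2) / (t.size - 2)                          # reduced by degrees of freedom
r = np.corrcoef(mr_obs, mr_fit)[0, 1]
print(f"k = {k:.3f}, n = {n_exp:.2f}, r = {r:.4f}, chi2 = {chi2:.5f}, RMSE = {rmse:.4f}")
```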
Jahandideh, Samad; Abdolmaleki, Parviz; Movahedi, Mohammad Mehdi
2010-02-01
Various studies have been reported on the bioeffects of magnetic field exposure; however, no consensus or guideline is available for experimental designs relating to exposure conditions as yet. In this study, logistic regression (LR) and artificial neural networks (ANNs) were used in order to analyze and predict the melatonin excretion patterns in the rat exposed to extremely low frequency magnetic fields (ELF-MF). Subsequently, on a database containing 33 experiments, performances of LR and ANNs were compared through resubstitution and jackknife tests. Predictor variables were the more effective parameters and included frequency, polarization, exposure duration, and strength of magnetic fields. Also, five performance measures, including accuracy, sensitivity, specificity, Matthews correlation coefficient (MCC), and normalized percentage better than random (S), were used to evaluate the performance of the models. LR, as a conventional model, obtained poor prediction performance. Nonetheless, LR distinguished the duration of magnetic fields as a statistically significant parameter. Also, horizontal polarization of the magnetic field, which had the highest logit coefficient (parameter estimate) with a negative sign, was found to be the strongest indicator for experimental designs relating to exposure conditions. This means that each experiment with horizontal polarization of magnetic fields has a higher probability of resulting in a "not changed melatonin level" pattern. On the other hand, ANNs, a more powerful model that had not previously been applied to predicting melatonin excretion patterns in rats exposed to ELF-MF, showed high performance measure values and higher reliability, especially obtaining an MCC of 0.55 in the jackknife tests. The results showed that such predictor models are promising and may play a useful role in defining guidelines for experimental designs relating to exposure conditions. In conclusion, analysis of the bioelectromagnetic data could result in finding a relationship between electromagnetic fields and different biological processes. (c) 2009 Wiley-Liss, Inc.
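A quick sketch of the performance measures listed above (accuracy, sensitivity, specificity, MCC, and a "better than random" percentage) computed for a fitted logistic regression under resubstitution. The exposure-condition predictors are random placeholders, not the 33-experiment database, and the formula used for S is only one common normalization, assumed here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, matthews_corrcoef

rng = np.random.default_rng(9)
X = rng.normal(size=(200, 4))                 # frequency, polarization, duration, strength (assumed)
y = (X[:, 2] + 0.5 * X[:, 0] + rng.normal(0, 1, 200) > 0).astype(int)   # melatonin changed?

pred = LogisticRegression().fit(X, y).predict(X)           # resubstitution evaluation
tn, fp, fn, tp = confusion_matrix(y, pred).ravel()
acc = (tp + tn) / len(y)
sens, spec = tp / (tp + fn), tn / (tn + fp)
mcc = matthews_corrcoef(y, pred)
s = 2 * acc - 1                                             # "better than random" normalization (assumed form)
print(f"accuracy {acc:.2f}, sensitivity {sens:.2f}, specificity {spec:.2f}, MCC {mcc:.2f}, S {s:.2f}")
```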
Development of 1RM Prediction Equations for Bench Press in Moderately Trained Men.
Macht, Jordan W; Abel, Mark G; Mullineaux, David R; Yates, James W
2016-10-01
Macht, JW, Abel, MG, Mullineaux, DR, and Yates, JW. Development of 1RM prediction equations for bench press in moderately trained men. J Strength Cond Res 30(10): 2901-2906, 2016-There are a variety of established 1 repetition maximum (1RM) prediction equations, however, very few prediction equations use anthropometric characteristics exclusively or in part, to estimate 1RM strength. Therefore, the purpose of this study was to develop an original 1RM prediction equation for bench press using anthropometric and performance characteristics in moderately trained male subjects. Sixty male subjects (21.2 ± 2.4 years) completed a 1RM bench press and were randomly assigned a load to complete as many repetitions as possible. In addition, body composition, upper-body anthropometric characteristics, and handgrip strength were assessed. Regression analysis was used to develop a performance-based 1RM prediction equation: 1RM = 1.20 repetition weight + 2.19 repetitions to fatigue - 0.56 biacromial width (cm) + 9.6 (R = 0.99, standard error of estimate [SEE] = 3.5 kg). Regression analysis to develop a nonperformance-based 1RM prediction equation yielded: 1RM (kg) = 0.997 cross-sectional area (CSA) (cm) + 0.401 chest circumference (cm) - 0.385%fat - 0.185 arm length (cm) + 36.7 (R = 0.81, SEE = 13.0 kg). The performance prediction equations developed in this study had high validity coefficients, minimal mean bias, and small limits of agreement. The anthropometric equations had moderately high validity coefficient but larger limits of agreement. The practical applications of this study indicate that the inclusion of anthropometric characteristics and performance variables produce a valid prediction equation for 1RM strength. In addition, the CSA of the arm uses a simple nonperformance method of estimating the lifter's 1RM. This information may be used to predict the starting load for a lifter performing a 1RM prediction protocol or a 1RM testing protocol.
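The two published prediction equations above written out as plain functions for convenience. The coefficients are copied from the abstract; the kilogram units for repetition weight and 1RM, and the example inputs, are assumptions.

```python
def one_rm_performance(rep_weight_kg, reps_to_fatigue, biacromial_width_cm):
    """Performance-based equation from the abstract (R = 0.99, SEE = 3.5 kg)."""
    return 1.20 * rep_weight_kg + 2.19 * reps_to_fatigue - 0.56 * biacromial_width_cm + 9.6

def one_rm_anthropometric(arm_csa, chest_circ_cm, percent_fat, arm_length_cm):
    """Anthropometric equation from the abstract (R = 0.81, SEE = 13.0 kg); CSA units as given there."""
    return 0.997 * arm_csa + 0.401 * chest_circ_cm - 0.385 * percent_fat - 0.185 * arm_length_cm + 36.7

# hypothetical lifter: 80 kg x 6 reps to fatigue, 41 cm biacromial width
print(round(one_rm_performance(80, 6, 41), 1))
```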
Hulse, Anjana; Rai, Suahma; Prasanna Kumar, K M
2016-01-01
In children with type 1 diabetes, intensive diabetes management has been demonstrated to reduce long-term microvascular complications. At present, self-monitoring of blood glucose (SMBG) by patients at home and glycated hemoglobin estimation every 3 months are used to monitor glycemic control in children. Recently, the ambulatory glucose profile (AGP) has increasingly been used to study glycemic patterns in adults. However, the accuracy and reliability of AGP in children have not been evaluated yet. The aim was to assess the accuracy of AGP data in children with type 1 diabetes mellitus when compared with laboratory random blood sugar (RBS) levels, capillary blood glucose (CBG) measured by glucometer in the hospital, and SMBG monitored at home. Paired RBS, CBG, and AGP data were analyzed for 51 patients who wore AGP sensors for 2 weeks. Simultaneous venous and CBG samples were collected on day 1 and day 14. SMBG at home was checked and recorded by the patients for optimizing insulin doses. Accuracy measures (mean absolute deviation, mean absolute relative difference (MARD), and coefficient of linear regression) of AGP on RBS, CBG, and home-monitored SMBG were calculated. Seventy paired RBS, CBG, and AGP data and 362 paired home-monitored SMBG and AGP data were available. The MARD was 9.56% for AGP over RBS and 15.07% for AGP over CBG. The linear regression coefficient of AGP over RBS was 0.93 and that of AGP over CBG was 0.89 (P < 0.001). The accuracy of AGP over SMBG was evaluated over four ranges: <75, 76-140, 141-200, and >200 mg/dl. In this study, AGP data significantly correlate with RBS and CBG data in children with type 1 diabetes. However, a large number of samples in a research setting would help to document reproducibility of our results.
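A tiny sketch of the agreement statistics used above — mean absolute difference, mean absolute relative difference (MARD), and the linear regression coefficient of the sensor values on the reference values. The paired readings are invented examples, not the study data.

```python
import numpy as np

ref = np.array([95, 140, 180, 220, 75, 160], dtype=float)     # RBS/CBG reference values, mg/dl (assumed)
agp = np.array([101, 128, 170, 230, 80, 150], dtype=float)    # paired AGP sensor values (assumed)

mad = np.mean(np.abs(agp - ref))                              # mean absolute difference
mard = np.mean(np.abs(agp - ref) / ref) * 100                 # mean absolute relative difference, %
slope = np.polyfit(ref, agp, 1)[0]                            # regression coefficient of AGP on reference
print(f"MAD {mad:.1f} mg/dl, MARD {mard:.1f}%, regression coefficient {slope:.2f}")
```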
Kharuzhyk, S A
2015-01-01
To carry out a quantitative analysis of diffusion-weighted magnetic resonance images (DWI) in cancer of the cervix uteri (CCU) and to estimate the possibility of using the pretreatment measured diffusion coefficient (MDC) to predict the response to chemoradiation therapy (CRT). The investigation prospectively enrolled 46 women with morphologically verified Stages IB-IVB CCU. All the women underwent diffusion-weighted magnetic resonance imaging of pelvic organs before and after treatment. A semiautomatic method was used to determine tumor signal intensity (SI) on DWI at b 1000 s/mm2 (SI b1000) and tumor MDC. The reproducibility of MDC measurements was assessed in 16 randomly selected women. The investigators compared the pretreatment quantitative DWI measures in complete and incomplete regression (CR and IR) groups and between patients with and without tumor progression during follow-up. An association of MDC with progression-free and overall survivals (PFS and OS) was determined in the patients. A semiautomatic tumor segmentation framework could determine the pretreatment quantitative DWI measures with minimal time spent and high reproducibility. The mean tumor MDC was 0.82 +/- 0.14 x 10(-3) mm2/s. CR and IR were established in 28 and 18 women, respectively. The MDC < or = 0.83 x 10(-3) mm2/s predicted CR with a sensitivity of 64.3% and a specificity of 77.8% (p=0.007). The median follow-up was 47 months (range, 3-82 months). With the MDC < or = 0.86 x 10(-3) mm2/s, 5-year PFS was 74.1% versus 42.1% with a higher MDC (p=0.023) and 5-year OS was 70.4 and 40.6%, respectively (p=0.021). The survival difference was insignificant in relation to the degree of tumor regression. The pretreatment SI at b1000 was of no prognostic value. The pretreatment tumor MDC may serve as a biomarker for predicting the efficacy of CRT for CCU.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, B; Fujita, A; Buch, K
Purpose: To investigate the correlation between texture analysis-based model observer and human observer in the task of diagnosis of ischemic infarct in non-contrast head CT of adults. Methods: Non-contrast head CTs of five patients (2 M, 3 F; 58–83 y) with ischemic infarcts were retro-reconstructed using FBP and Adaptive Statistical Iterative Reconstruction (ASIR) of various levels (10–100%). Six neuro-radiologists reviewed each image and scored image quality for diagnosing acute infarcts by a 9-point Likert scale in a blinded test. These scores were averaged across the observers to produce the average human observer responses. The chief neuro-radiologist placed multiple ROIs over the infarcts. These ROIs were entered into a texture analysis software package. Forty-two features per image, including 11 GLRL, 5 GLCM, 4 GLGM, 9 Laws, and 13 2-D features, were computed and averaged over the images per dataset. The Fisher-coefficient (ratio of between-class variance to in-class variance) was calculated for each feature to identify the most discriminating features from each matrix that separate the different confidence scores most efficiently. The 15 features with the highest Fisher-coefficient were entered into linear multivariate regression for iterative modeling. Results: Multivariate regression analysis resulted in the best prediction model of the confidence scores after three iterations (df=11, F=11.7, p-value<0.0001). The model-predicted scores and human observer scores were highly correlated (R=0.88, R-sq=0.77). The root-mean-square and maximal residual were 0.21 and 0.44, respectively. The residual scatter plot appeared random, symmetric, and unbiased. Conclusion: For diagnosis of ischemic infarct in non-contrast head CT in adults, the predicted image quality scores from the texture analysis-based model observer were highly correlated with those of human observers for various noise levels. A texture-based model observer can characterize image quality of low-contrast, subtle texture changes in addition to human observers.
Liu, Xuling; Yu, Yang
2018-01-01
Background The aim of this study was to summarize and discuss the similarities and differences in inflammatory biomarkers in postoperative delirium (POD) and cognitive dysfunction (POCD). Methods A systematic retrieval of literature up to June 2017 in PubMed, Embase, the Cochrane Library, the China National Knowledge Infrastructure database, and the Wanfang database was conducted. Extracted data were analyzed with STATA (version 14). The standardized mean difference (SMD) and the 95% confidence interval (95% CI) of each indicator were calculated using a random effect model. We also performed tests of heterogeneity, sensitivity analysis, assessments of bias, and meta-regression in this meta-analysis. Results A total of 54 observational studies were included. By meta-analysis we found significantly increased C-reactive protein (CRP) (9 studies, SMD 0.883, 95% CI 0.130 to 1.637, P = 0.022 in POD; 10 studies, SMD -0.133, 95% CI -0.512 to 0.246, P = 0.429 in POCD) and interleukin (IL)-6 (7 studies, SMD 0.386, 95% CI 0.054 to 0.717, P = 0.022 in POD; 16 studies, SMD 0.089, 95% CI -0.133 to 0.311, P = 0.433 in POCD) concentrations in both POD and POCD patients. We also found that the SMDs of CRP and IL-6 from POCD patients were positively correlated with surgery type in the meta-regression (CRP: Coefficient = 1.555365, P = 0.001, 10 studies; IL-6: Coefficient = -0.6455521, P = 0.086, 16 studies). Conclusion Available evidence from medium-to-high quality observational studies suggests that POD and POCD are indeed correlated with the concentration of peripheral and cerebrospinal fluid (CSF) inflammatory markers. Some of these markers, such as CRP and IL-6, play roles in both POD and POCD, while others are specific to either one of them. PMID:29641605
UNSTEADY DISPERSION IN RANDOM INTERMITTENT FLOW
The longitudinal dispersion coefficient of a conservative tracer was calculated from flow tests in a dead-end pipe loop system. Flow conditions for these tests ranged from laminar to transitional flow, and from steady to intermittent and random. Two static mixers linked in series...
L.R. Iverson; A.M. Prasad; A. Liaw
2004-01-01
More and better machine learning tools are becoming available for landscape ecologists to aid in understanding species-environment relationships and to map probable species occurrence now and potentially into the future. To that end, we evaluated three statistical models: Regression Tree Analysis (RTA), Bagging Trees (BT) and Random Forest (RF) for their utility in...
ERIC Educational Resources Information Center
Trochim, William M. K.; And Others
1991-01-01
The regression-discontinuity design involving a treatment interaction effect (TIE), pretest-posttest functional form specification, and choice of point-of-estimation of the TIE are examined. Formulas for controlling the magnitude of TIE in simulations can be used for simulating the randomized experimental case where estimation is not at the…