A method for fitting regression splines with varying polynomial order in the linear mixed model.
Edwards, Lloyd J; Stewart, Paul W; MacDougall, James E; Helms, Ronald W
2006-02-15
The linear mixed model has become a widely used tool for longitudinal analysis of continuous variables. The use of regression splines in these models offers the analyst additional flexibility in the formulation of descriptive analyses, exploratory analyses and hypothesis-driven confirmatory analyses. We propose a method for fitting piecewise polynomial regression splines with varying polynomial order in the fixed effects and/or random effects of the linear mixed model. The polynomial segments are explicitly constrained by side conditions for continuity and some smoothness at the points where they join. By using a reparameterization of this explicitly constrained linear mixed model, an implicitly constrained linear mixed model is constructed that simplifies implementation of fixed-knot regression splines. The proposed approach is relatively simple, handles splines in one variable or multiple variables, and can be easily programmed using existing commercial software such as SAS or S-plus. The method is illustrated using two examples: an analysis of longitudinal viral load data from a study of subjects with acute HIV-1 infection and an analysis of 24-hour ambulatory blood pressure profiles.
DOT National Transportation Integrated Search
2016-09-01
We consider the problem of solving mixed random linear equations with k components. This is the noiseless setting of mixed linear regression. The goal is to estimate multiple linear models from mixed samples in the case where the labels (which sample...
Correlation and simple linear regression.
Eberly, Lynn E
2007-01-01
This chapter highlights important steps in using correlation and simple linear regression to address scientific questions about the association of two continuous variables with each other. These steps include estimation and inference, assessing model fit, the connection between regression and ANOVA, and study design. Examples in microbiology are used throughout. This chapter provides a framework that is helpful in understanding more complex statistical techniques, such as multiple linear regression, linear mixed effects models, logistic regression, and proportional hazards regression.
ERIC Educational Resources Information Center
Ker, H. W.
2014-01-01
Multilevel data are very common in educational research. Hierarchical linear models/linear mixed-effects models (HLMs/LMEs) are often utilized to analyze multilevel data nowadays. This paper discusses the problems of utilizing ordinary regressions for modeling multilevel educational data, compare the data analytic results from three regression…
Grajeda, Laura M; Ivanescu, Andrada; Saito, Mayuko; Crainiceanu, Ciprian; Jaganath, Devan; Gilman, Robert H; Crabtree, Jean E; Kelleher, Dermott; Cabrera, Lilia; Cama, Vitaliano; Checkley, William
2016-01-01
Childhood growth is a cornerstone of pediatric research. Statistical models need to consider individual trajectories to adequately describe growth outcomes. Specifically, well-defined longitudinal models are essential to characterize both population and subject-specific growth. Linear mixed-effect models with cubic regression splines can account for the nonlinearity of growth curves and provide reasonable estimators of population and subject-specific growth, velocity and acceleration. We provide a stepwise approach that builds from simple to complex models, and account for the intrinsic complexity of the data. We start with standard cubic splines regression models and build up to a model that includes subject-specific random intercepts and slopes and residual autocorrelation. We then compared cubic regression splines vis-à-vis linear piecewise splines, and with varying number of knots and positions. Statistical code is provided to ensure reproducibility and improve dissemination of methods. Models are applied to longitudinal height measurements in a cohort of 215 Peruvian children followed from birth until their fourth year of life. Unexplained variability, as measured by the variance of the regression model, was reduced from 7.34 when using ordinary least squares to 0.81 (p < 0.001) when using a linear mixed-effect models with random slopes and a first order continuous autoregressive error term. There was substantial heterogeneity in both the intercept (p < 0.001) and slopes (p < 0.001) of the individual growth trajectories. We also identified important serial correlation within the structure of the data (ρ = 0.66; 95 % CI 0.64 to 0.68; p < 0.001), which we modeled with a first order continuous autoregressive error term as evidenced by the variogram of the residuals and by a lack of association among residuals. The final model provides a parametric linear regression equation for both estimation and prediction of population- and individual-level growth in height. We show that cubic regression splines are superior to linear regression splines for the case of a small number of knots in both estimation and prediction with the full linear mixed effect model (AIC 19,352 vs. 19,598, respectively). While the regression parameters are more complex to interpret in the former, we argue that inference for any problem depends more on the estimated curve or differences in curves rather than the coefficients. Moreover, use of cubic regression splines provides biological meaningful growth velocity and acceleration curves despite increased complexity in coefficient interpretation. Through this stepwise approach, we provide a set of tools to model longitudinal childhood data for non-statisticians using linear mixed-effect models.
Radio Propagation Prediction Software for Complex Mixed Path Physical Channels
2006-08-14
63 4.4.6. Applied Linear Regression Analysis in the Frequency Range 1-50 MHz 69 4.4.7. Projected Scaling to...4.4.6. Applied Linear Regression Analysis in the Frequency Range 1-50 MHz In order to construct a comprehensive numerical algorithm capable of
Confidence Intervals for Assessing Heterogeneity in Generalized Linear Mixed Models
ERIC Educational Resources Information Center
Wagler, Amy E.
2014-01-01
Generalized linear mixed models are frequently applied to data with clustered categorical outcomes. The effect of clustering on the response is often difficult to practically assess partly because it is reported on a scale on which comparisons with regression parameters are difficult to make. This article proposes confidence intervals for…
Modeling containment of large wildfires using generalized linear mixed-model analysis
Mark Finney; Isaac C. Grenfell; Charles W. McHugh
2009-01-01
Billions of dollars are spent annually in the United States to contain large wildland fires, but the factors contributing to suppression success remain poorly understood. We used a regression model (generalized linear mixed-model) to model containment probability of individual fires, assuming that containment was a repeated-measures problem (fixed effect) and...
Zhang, Hanze; Huang, Yangxin; Wang, Wei; Chen, Henian; Langland-Orban, Barbara
2017-01-01
In longitudinal AIDS studies, it is of interest to investigate the relationship between HIV viral load and CD4 cell counts, as well as the complicated time effect. Most of common models to analyze such complex longitudinal data are based on mean-regression, which fails to provide efficient estimates due to outliers and/or heavy tails. Quantile regression-based partially linear mixed-effects models, a special case of semiparametric models enjoying benefits of both parametric and nonparametric models, have the flexibility to monitor the viral dynamics nonparametrically and detect the varying CD4 effects parametrically at different quantiles of viral load. Meanwhile, it is critical to consider various data features of repeated measurements, including left-censoring due to a limit of detection, covariate measurement error, and asymmetric distribution. In this research, we first establish a Bayesian joint models that accounts for all these data features simultaneously in the framework of quantile regression-based partially linear mixed-effects models. The proposed models are applied to analyze the Multicenter AIDS Cohort Study (MACS) data. Simulation studies are also conducted to assess the performance of the proposed methods under different scenarios.
Tutorial on Biostatistics: Linear Regression Analysis of Continuous Correlated Eye Data.
Ying, Gui-Shuang; Maguire, Maureen G; Glynn, Robert; Rosner, Bernard
2017-04-01
To describe and demonstrate appropriate linear regression methods for analyzing correlated continuous eye data. We describe several approaches to regression analysis involving both eyes, including mixed effects and marginal models under various covariance structures to account for inter-eye correlation. We demonstrate, with SAS statistical software, applications in a study comparing baseline refractive error between one eye with choroidal neovascularization (CNV) and the unaffected fellow eye, and in a study determining factors associated with visual field in the elderly. When refractive error from both eyes were analyzed with standard linear regression without accounting for inter-eye correlation (adjusting for demographic and ocular covariates), the difference between eyes with CNV and fellow eyes was 0.15 diopters (D; 95% confidence interval, CI -0.03 to 0.32D, p = 0.10). Using a mixed effects model or a marginal model, the estimated difference was the same but with narrower 95% CI (0.01 to 0.28D, p = 0.03). Standard regression for visual field data from both eyes provided biased estimates of standard error (generally underestimated) and smaller p-values, while analysis of the worse eye provided larger p-values than mixed effects models and marginal models. In research involving both eyes, ignoring inter-eye correlation can lead to invalid inferences. Analysis using only right or left eyes is valid, but decreases power. Worse-eye analysis can provide less power and biased estimates of effect. Mixed effects or marginal models using the eye as the unit of analysis should be used to appropriately account for inter-eye correlation and maximize power and precision.
BIODEGRADATION PROBABILITY PROGRAM (BIODEG)
The Biodegradation Probability Program (BIODEG) calculates the probability that a chemical under aerobic conditions with mixed cultures of microorganisms will biodegrade rapidly or slowly. It uses fragment constants developed using multiple linear and non-linear regressions and d...
Tutorial on Biostatistics: Linear Regression Analysis of Continuous Correlated Eye Data
Ying, Gui-shuang; Maguire, Maureen G; Glynn, Robert; Rosner, Bernard
2017-01-01
Purpose To describe and demonstrate appropriate linear regression methods for analyzing correlated continuous eye data. Methods We describe several approaches to regression analysis involving both eyes, including mixed effects and marginal models under various covariance structures to account for inter-eye correlation. We demonstrate, with SAS statistical software, applications in a study comparing baseline refractive error between one eye with choroidal neovascularization (CNV) and the unaffected fellow eye, and in a study determining factors associated with visual field data in the elderly. Results When refractive error from both eyes were analyzed with standard linear regression without accounting for inter-eye correlation (adjusting for demographic and ocular covariates), the difference between eyes with CNV and fellow eyes was 0.15 diopters (D; 95% confidence interval, CI −0.03 to 0.32D, P=0.10). Using a mixed effects model or a marginal model, the estimated difference was the same but with narrower 95% CI (0.01 to 0.28D, P=0.03). Standard regression for visual field data from both eyes provided biased estimates of standard error (generally underestimated) and smaller P-values, while analysis of the worse eye provided larger P-values than mixed effects models and marginal models. Conclusion In research involving both eyes, ignoring inter-eye correlation can lead to invalid inferences. Analysis using only right or left eyes is valid, but decreases power. Worse-eye analysis can provide less power and biased estimates of effect. Mixed effects or marginal models using the eye as the unit of analysis should be used to appropriately account for inter-eye correlation and maximize power and precision. PMID:28102741
Spatial Assessment of Model Errors from Four Regression Techniques
Lianjun Zhang; Jeffrey H. Gove; Jeffrey H. Gove
2005-01-01
Fomst modelers have attempted to account for the spatial autocorrelations among trees in growth and yield models by applying alternative regression techniques such as linear mixed models (LMM), generalized additive models (GAM), and geographicalIy weighted regression (GWR). However, the model errors are commonly assessed using average errors across the entire study...
Montoye, Alexander H K; Begum, Munni; Henning, Zachary; Pfeiffer, Karin A
2017-02-01
This study had three purposes, all related to evaluating energy expenditure (EE) prediction accuracy from body-worn accelerometers: (1) compare linear regression to linear mixed models, (2) compare linear models to artificial neural network models, and (3) compare accuracy of accelerometers placed on the hip, thigh, and wrists. Forty individuals performed 13 activities in a 90 min semi-structured, laboratory-based protocol. Participants wore accelerometers on the right hip, right thigh, and both wrists and a portable metabolic analyzer (EE criterion). Four EE prediction models were developed for each accelerometer: linear regression, linear mixed, and two ANN models. EE prediction accuracy was assessed using correlations, root mean square error (RMSE), and bias and was compared across models and accelerometers using repeated-measures analysis of variance. For all accelerometer placements, there were no significant differences for correlations or RMSE between linear regression and linear mixed models (correlations: r = 0.71-0.88, RMSE: 1.11-1.61 METs; p > 0.05). For the thigh-worn accelerometer, there were no differences in correlations or RMSE between linear and ANN models (ANN-correlations: r = 0.89, RMSE: 1.07-1.08 METs. Linear models-correlations: r = 0.88, RMSE: 1.10-1.11 METs; p > 0.05). Conversely, one ANN had higher correlations and lower RMSE than both linear models for the hip (ANN-correlation: r = 0.88, RMSE: 1.12 METs. Linear models-correlations: r = 0.86, RMSE: 1.18-1.19 METs; p < 0.05), and both ANNs had higher correlations and lower RMSE than both linear models for the wrist-worn accelerometers (ANN-correlations: r = 0.82-0.84, RMSE: 1.26-1.32 METs. Linear models-correlations: r = 0.71-0.73, RMSE: 1.55-1.61 METs; p < 0.01). For studies using wrist-worn accelerometers, machine learning models offer a significant improvement in EE prediction accuracy over linear models. Conversely, linear models showed similar EE prediction accuracy to machine learning models for hip- and thigh-worn accelerometers and may be viable alternative modeling techniques for EE prediction for hip- or thigh-worn accelerometers.
A Bayesian Semiparametric Latent Variable Model for Mixed Responses
ERIC Educational Resources Information Center
Fahrmeir, Ludwig; Raach, Alexander
2007-01-01
In this paper we introduce a latent variable model (LVM) for mixed ordinal and continuous responses, where covariate effects on the continuous latent variables are modelled through a flexible semiparametric Gaussian regression model. We extend existing LVMs with the usual linear covariate effects by including nonparametric components for nonlinear…
D.b.h./crown diameter relationships in mixed Appalachian hardwood stands
Neil I. Lamson; Neil I. Lamson
1987-01-01
Linear regression formulae for predicting crown diameter as a function of stem diameter are presented for nine species found in 50- to 80-year-old mixed hardwood stands in north-central West Virginia. Generally, crown diameter was closely related to tolerance; more tolerant species had larger crowns.
A comparison of methods for the analysis of binomial clustered outcomes in behavioral research.
Ferrari, Alberto; Comelli, Mario
2016-12-01
In behavioral research, data consisting of a per-subject proportion of "successes" and "failures" over a finite number of trials often arise. This clustered binary data are usually non-normally distributed, which can distort inference if the usual general linear model is applied and sample size is small. A number of more advanced methods is available, but they are often technically challenging and a comparative assessment of their performances in behavioral setups has not been performed. We studied the performances of some methods applicable to the analysis of proportions; namely linear regression, Poisson regression, beta-binomial regression and Generalized Linear Mixed Models (GLMMs). We report on a simulation study evaluating power and Type I error rate of these models in hypothetical scenarios met by behavioral researchers; plus, we describe results from the application of these methods on data from real experiments. Our results show that, while GLMMs are powerful instruments for the analysis of clustered binary outcomes, beta-binomial regression can outperform them in a range of scenarios. Linear regression gave results consistent with the nominal level of significance, but was overall less powerful. Poisson regression, instead, mostly led to anticonservative inference. GLMMs and beta-binomial regression are generally more powerful than linear regression; yet linear regression is robust to model misspecification in some conditions, whereas Poisson regression suffers heavily from violations of the assumptions when used to model proportion data. We conclude providing directions to behavioral scientists dealing with clustered binary data and small sample sizes. Copyright © 2016 Elsevier B.V. All rights reserved.
Baqué, Michèle; Amendt, Jens
2013-01-01
Developmental data of juvenile blow flies (Diptera: Calliphoridae) are typically used to calculate the age of immature stages found on or around a corpse and thus to estimate a minimum post-mortem interval (PMI(min)). However, many of those data sets don't take into account that immature blow flies grow in a non-linear fashion. Linear models do not supply a sufficient reliability on age estimates and may even lead to an erroneous determination of the PMI(min). According to the Daubert standard and the need for improvements in forensic science, new statistic tools like smoothing methods and mixed models allow the modelling of non-linear relationships and expand the field of statistical analyses. The present study introduces into the background and application of these statistical techniques by analysing a model which describes the development of the forensically important blow fly Calliphora vicina at different temperatures. The comparison of three statistical methods (linear regression, generalised additive modelling and generalised additive mixed modelling) clearly demonstrates that only the latter provided regression parameters that reflect the data adequately. We focus explicitly on both the exploration of the data--to assure their quality and to show the importance of checking it carefully prior to conducting the statistical tests--and the validation of the resulting models. Hence, we present a common method for evaluating and testing forensic entomological data sets by using for the first time generalised additive mixed models.
Factors associated with parasite dominance in fishes from Brazil.
Amarante, Cristina Fernandes do; Tassinari, Wagner de Souza; Luque, Jose Luis; Pereira, Maria Julia Salim
2016-06-14
The present study used regression models to evaluate the existence of factors that may influence the numerical parasite dominance with an epidemiological approximation. A database including 3,746 fish specimens and their respective parasites were used to evaluate the relationship between parasite dominance and biotic characteristics inherent to the studied hosts and the parasite taxa. Multivariate, classical, and mixed effects linear regression models were fitted. The calculations were performed using R software (95% CI). In the fitting of the classical multiple linear regression model, freshwater and planktivorous fish species and body length, as well as the species of the taxa Trematoda, Monogenea, and Hirudinea, were associated with parasite dominance. However, the fitting of the mixed effects model showed that the body length of the host and the species of the taxa Nematoda, Trematoda, Monogenea, Hirudinea, and Crustacea were significantly associated with parasite dominance. Studies that consider specific biological aspects of the hosts and parasites should expand the knowledge regarding factors that influence the numerical dominance of fish in Brazil. The use of a mixed model shows, once again, the importance of the appropriate use of a model correlated with the characteristics of the data to obtain consistent results.
GWAS with longitudinal phenotypes: performance of approximate procedures
Sikorska, Karolina; Montazeri, Nahid Mostafavi; Uitterlinden, André; Rivadeneira, Fernando; Eilers, Paul HC; Lesaffre, Emmanuel
2015-01-01
Analysis of genome-wide association studies with longitudinal data using standard procedures, such as linear mixed model (LMM) fitting, leads to discouragingly long computation times. There is a need to speed up the computations significantly. In our previous work (Sikorska et al: Fast linear mixed model computations for genome-wide association studies with longitudinal data. Stat Med 2012; 32.1: 165–180), we proposed the conditional two-step (CTS) approach as a fast method providing an approximation to the P-value for the longitudinal single-nucleotide polymorphism (SNP) effect. In the first step a reduced conditional LMM is fit, omitting all the SNP terms. In the second step, the estimated random slopes are regressed on SNPs. The CTS has been applied to the bone mineral density data from the Rotterdam Study and proved to work very well even in unbalanced situations. In another article (Sikorska et al: GWAS on your notebook: fast semi-parallel linear and logistic regression for genome-wide association studies. BMC Bioinformatics 2013; 14: 166), we suggested semi-parallel computations, greatly speeding up fitting many linear regressions. Combining CTS with fast linear regression reduces the computation time from several weeks to a few minutes on a single computer. Here, we explore further the properties of the CTS both analytically and by simulations. We investigate the performance of our proposal in comparison with a related but different approach, the two-step procedure. It is analytically shown that for the balanced case, under mild assumptions, the P-value provided by the CTS is the same as from the LMM. For unbalanced data and in realistic situations, simulations show that the CTS method does not inflate the type I error rate and implies only a minimal loss of power. PMID:25712081
ERIC Educational Resources Information Center
Pliszka, Steven R.; Matthews, Thomas L.; Braslow, Kenneth J.; Watson, Melissa A.
2006-01-01
Objective: To determine whether methylphenidate (MPH) and mixed salts amphetamine (MSA) have different effects on growth in children with attention-deficit/hyperactivity disorder. Method: Patients treated for at least 1 year with MPH or MSA were identified. A linear regression was performed to determine the effect of stimulant type, patient…
Bowen, Stephen R; Chappell, Richard J; Bentzen, Søren M; Deveau, Michael A; Forrest, Lisa J; Jeraj, Robert
2012-01-01
Purpose To quantify associations between pre-radiotherapy and post-radiotherapy PET parameters via spatially resolved regression. Materials and methods Ten canine sinonasal cancer patients underwent PET/CT scans of [18F]FDG (FDGpre), [18F]FLT (FLTpre), and [61Cu]Cu-ATSM (Cu-ATSMpre). Following radiotherapy regimens of 50 Gy in 10 fractions, veterinary patients underwent FDG PET/CT scans at three months (FDGpost). Regression of standardized uptake values in baseline FDGpre, FLTpre and Cu-ATSMpre tumour voxels to those in FDGpost images was performed for linear, log-linear, generalized-linear and mixed-fit linear models. Goodness-of-fit in regression coefficients was assessed by R2. Hypothesis testing of coefficients over the patient population was performed. Results Multivariate linear model fits of FDGpre to FDGpost were significantly positive over the population (FDGpost~0.17 FDGpre, p=0.03), and classified slopes of RECIST non-responders and responders to be different (0.37 vs. 0.07, p=0.01). Generalized-linear model fits related FDGpre to FDGpost by a linear power law (FDGpost~FDGpre0.93, p<0.001). Univariate mixture model fits of FDGpre improved R2 from 0.17 to 0.52. Neither baseline FLT PET nor Cu-ATSM PET uptake contributed statistically significant multivariate regression coefficients. Conclusions Spatially resolved regression analysis indicates that pre-treatment FDG PET uptake is most strongly associated with three-month post-treatment FDG PET uptake in this patient population, though associations are histopathology-dependent. PMID:22682748
Fischer, A; Friggens, N C; Berry, D P; Faverdin, P
2018-07-01
The ability to properly assess and accurately phenotype true differences in feed efficiency among dairy cows is key to the development of breeding programs for improving feed efficiency. The variability among individuals in feed efficiency is commonly characterised by the residual intake approach. Residual feed intake is represented by the residuals of a linear regression of intake on the corresponding quantities of the biological functions that consume (or release) energy. However, the residuals include both, model fitting and measurement errors as well as any variability in cow efficiency. The objective of this study was to isolate the individual animal variability in feed efficiency from the residual component. Two separate models were fitted, in one the standard residual energy intake (REI) was calculated as the residual of a multiple linear regression of lactation average net energy intake (NEI) on lactation average milk energy output, average metabolic BW, as well as lactation loss and gain of body condition score. In the other, a linear mixed model was used to simultaneously fit fixed linear regressions and random cow levels on the biological traits and intercept using fortnight repeated measures for the variables. This method split the predicted NEI in two parts: one quantifying the population mean intercept and coefficients, and one quantifying cow-specific deviations in the intercept and coefficients. The cow-specific part of predicted NEI was assumed to isolate true differences in feed efficiency among cows. NEI and associated energy expenditure phenotypes were available for the first 17 fortnights of lactation from 119 Holstein cows; all fed a constant energy-rich diet. Mixed models fitting cow-specific intercept and coefficients to different combinations of the aforementioned energy expenditure traits, calculated on a fortnightly basis, were compared. The variance of REI estimated with the lactation average model represented only 8% of the variance of measured NEI. Among all compared mixed models, the variance of the cow-specific part of predicted NEI represented between 53% and 59% of the variance of REI estimated from the lactation average model or between 4% and 5% of the variance of measured NEI. The remaining 41% to 47% of the variance of REI estimated with the lactation average model may therefore reflect model fitting errors or measurement errors. In conclusion, the use of a mixed model framework with cow-specific random regressions seems to be a promising method to isolate the cow-specific component of REI in dairy cows.
The PX-EM algorithm for fast stable fitting of Henderson's mixed model
Foulley, Jean-Louis; Van Dyk, David A
2000-01-01
This paper presents procedures for implementing the PX-EM algorithm of Liu, Rubin and Wu to compute REML estimates of variance covariance components in Henderson's linear mixed models. The class of models considered encompasses several correlated random factors having the same vector length e.g., as in random regression models for longitudinal data analysis and in sire-maternal grandsire models for genetic evaluation. Numerical examples are presented to illustrate the procedures. Much better results in terms of convergence characteristics (number of iterations and time required for convergence) are obtained for PX-EM relative to the basic EM algorithm in the random regression. PMID:14736399
Mendez, Javier; Monleon-Getino, Antonio; Jofre, Juan; Lucena, Francisco
2017-10-01
The present study aimed to establish the kinetics of the appearance of coliphage plaques using the double agar layer titration technique to evaluate the feasibility of using traditional coliphage plaque forming unit (PFU) enumeration as a rapid quantification method. Repeated measurements of the appearance of plaques of coliphages titrated according to ISO 10705-2 at different times were analysed using non-linear mixed-effects regression to determine the most suitable model of their appearance kinetics. Although this model is adequate, to simplify its applicability two linear models were developed to predict the numbers of coliphages reliably, using the PFU counts as determined by the ISO after only 3 hours of incubation. One linear model, when the number of plaques detected was between 4 and 26 PFU after 3 hours, had a linear fit of: (1.48 × Counts 3 h + 1.97); and the other, values >26 PFU, had a fit of (1.18 × Counts 3 h + 2.95). If the number of plaques detected was <4 PFU after 3 hours, we recommend incubation for (18 ± 3) hours. The study indicates that the traditional coliphage plating technique has a reasonable potential to provide results in a single working day without the need to invest in additional laboratory equipment.
Mosing, Martina; Waldmann, Andreas D.; MacFarlane, Paul; Iff, Samuel; Auer, Ulrike; Bohm, Stephan H.; Bettschart-Wolfensberger, Regula; Bardell, David
2016-01-01
This study evaluated the breathing pattern and distribution of ventilation in horses prior to and following recovery from general anaesthesia using electrical impedance tomography (EIT). Six horses were anaesthetised for 6 hours in dorsal recumbency. Arterial blood gas and EIT measurements were performed 24 hours before (baseline) and 1, 2, 3, 4, 5 and 6 hours after horses stood following anaesthesia. At each time point 4 representative spontaneous breaths were analysed. The percentage of the total breath length during which impedance remained greater than 50% of the maximum inspiratory impedance change (breath holding), the fraction of total tidal ventilation within each of four stacked regions of interest (ROI) (distribution of ventilation) and the filling time and inflation period of seven ROI evenly distributed over the dorso-ventral height of the lungs were calculated. Mixed effects multi-linear regression and linear regression were used and significance was set at p<0.05. All horses demonstrated inspiratory breath holding until 5 hours after standing. No change from baseline was seen for the distribution of ventilation during inspiration. Filling time and inflation period were more rapid and shorter in ventral and slower and longer in most dorsal ROI compared to baseline, respectively. In a mixed effects multi-linear regression, breath holding was significantly correlated with PaCO2 in both the univariate and multivariate regression. Following recovery from anaesthesia, horses showed inspiratory breath holding during which gas redistributed from ventral into dorsal regions of the lungs. This suggests auto-recruitment of lung tissue which would have been dependent and likely atelectic during anaesthesia. PMID:27331910
The Bayesian group lasso for confounded spatial data
Hefley, Trevor J.; Hooten, Mevin B.; Hanks, Ephraim M.; Russell, Robin E.; Walsh, Daniel P.
2017-01-01
Generalized linear mixed models for spatial processes are widely used in applied statistics. In many applications of the spatial generalized linear mixed model (SGLMM), the goal is to obtain inference about regression coefficients while achieving optimal predictive ability. When implementing the SGLMM, multicollinearity among covariates and the spatial random effects can make computation challenging and influence inference. We present a Bayesian group lasso prior with a single tuning parameter that can be chosen to optimize predictive ability of the SGLMM and jointly regularize the regression coefficients and spatial random effect. We implement the group lasso SGLMM using efficient Markov chain Monte Carlo (MCMC) algorithms and demonstrate how multicollinearity among covariates and the spatial random effect can be monitored as a derived quantity. To test our method, we compared several parameterizations of the SGLMM using simulated data and two examples from plant ecology and disease ecology. In all examples, problematic levels multicollinearity occurred and influenced sampling efficiency and inference. We found that the group lasso prior resulted in roughly twice the effective sample size for MCMC samples of regression coefficients and can have higher and less variable predictive accuracy based on out-of-sample data when compared to the standard SGLMM.
Analysis and generation of groundwater concentration time series
NASA Astrophysics Data System (ADS)
Crăciun, Maria; Vamoş, Călin; Suciu, Nicolae
2018-01-01
Concentration time series are provided by simulated concentrations of a nonreactive solute transported in groundwater, integrated over the transverse direction of a two-dimensional computational domain and recorded at the plume center of mass. The analysis of a statistical ensemble of time series reveals subtle features that are not captured by the first two moments which characterize the approximate Gaussian distribution of the two-dimensional concentration fields. The concentration time series exhibit a complex preasymptotic behavior driven by a nonstationary trend and correlated fluctuations with time-variable amplitude. Time series with almost the same statistics are generated by successively adding to a time-dependent trend a sum of linear regression terms, accounting for correlations between fluctuations around the trend and their increments in time, and terms of an amplitude modulated autoregressive noise of order one with time-varying parameter. The algorithm generalizes mixing models used in probability density function approaches. The well-known interaction by exchange with the mean mixing model is a special case consisting of a linear regression with constant coefficients.
Mixed effect Poisson log-linear models for clinical and epidemiological sleep hypnogram data
Swihart, Bruce J.; Caffo, Brian S.; Crainiceanu, Ciprian; Punjabi, Naresh M.
2013-01-01
Bayesian Poisson log-linear multilevel models scalable to epidemiological studies are proposed to investigate population variability in sleep state transition rates. Hierarchical random effects are used to account for pairings of subjects and repeated measures within those subjects, as comparing diseased to non-diseased subjects while minimizing bias is of importance. Essentially, non-parametric piecewise constant hazards are estimated and smoothed, allowing for time-varying covariates and segment of the night comparisons. The Bayesian Poisson regression is justified through a re-derivation of a classical algebraic likelihood equivalence of Poisson regression with a log(time) offset and survival regression assuming exponentially distributed survival times. Such re-derivation allows synthesis of two methods currently used to analyze sleep transition phenomena: stratified multi-state proportional hazards models and log-linear models with GEE for transition counts. An example data set from the Sleep Heart Health Study is analyzed. Supplementary material includes the analyzed data set as well as the code for a reproducible analysis. PMID:22241689
Fokkema, M; Smits, N; Zeileis, A; Hothorn, T; Kelderman, H
2017-10-25
Identification of subgroups of patients for whom treatment A is more effective than treatment B, and vice versa, is of key importance to the development of personalized medicine. Tree-based algorithms are helpful tools for the detection of such interactions, but none of the available algorithms allow for taking into account clustered or nested dataset structures, which are particularly common in psychological research. Therefore, we propose the generalized linear mixed-effects model tree (GLMM tree) algorithm, which allows for the detection of treatment-subgroup interactions, while accounting for the clustered structure of a dataset. The algorithm uses model-based recursive partitioning to detect treatment-subgroup interactions, and a GLMM to estimate the random-effects parameters. In a simulation study, GLMM trees show higher accuracy in recovering treatment-subgroup interactions, higher predictive accuracy, and lower type II error rates than linear-model-based recursive partitioning and mixed-effects regression trees. Also, GLMM trees show somewhat higher predictive accuracy than linear mixed-effects models with pre-specified interaction effects, on average. We illustrate the application of GLMM trees on an individual patient-level data meta-analysis on treatments for depression. We conclude that GLMM trees are a promising exploratory tool for the detection of treatment-subgroup interactions in clustered datasets.
Regression analysis using dependent Polya trees.
Schörgendorfer, Angela; Branscum, Adam J
2013-11-30
Many commonly used models for linear regression analysis force overly simplistic shape and scale constraints on the residual structure of data. We propose a semiparametric Bayesian model for regression analysis that produces data-driven inference by using a new type of dependent Polya tree prior to model arbitrary residual distributions that are allowed to evolve across increasing levels of an ordinal covariate (e.g., time, in repeated measurement studies). By modeling residual distributions at consecutive covariate levels or time points using separate, but dependent Polya tree priors, distributional information is pooled while allowing for broad pliability to accommodate many types of changing residual distributions. We can use the proposed dependent residual structure in a wide range of regression settings, including fixed-effects and mixed-effects linear and nonlinear models for cross-sectional, prospective, and repeated measurement data. A simulation study illustrates the flexibility of our novel semiparametric regression model to accurately capture evolving residual distributions. In an application to immune development data on immunoglobulin G antibodies in children, our new model outperforms several contemporary semiparametric regression models based on a predictive model selection criterion. Copyright © 2013 John Wiley & Sons, Ltd.
Log-normal frailty models fitted as Poisson generalized linear mixed models.
Hirsch, Katharina; Wienke, Andreas; Kuss, Oliver
2016-12-01
The equivalence of a survival model with a piecewise constant baseline hazard function and a Poisson regression model has been known since decades. As shown in recent studies, this equivalence carries over to clustered survival data: A frailty model with a log-normal frailty term can be interpreted and estimated as a generalized linear mixed model with a binary response, a Poisson likelihood, and a specific offset. Proceeding this way, statistical theory and software for generalized linear mixed models are readily available for fitting frailty models. This gain in flexibility comes at the small price of (1) having to fix the number of pieces for the baseline hazard in advance and (2) having to "explode" the data set by the number of pieces. In this paper we extend the simulations of former studies by using a more realistic baseline hazard (Gompertz) and by comparing the model under consideration with competing models. Furthermore, the SAS macro %PCFrailty is introduced to apply the Poisson generalized linear mixed approach to frailty models. The simulations show good results for the shared frailty model. Our new %PCFrailty macro provides proper estimates, especially in case of 4 events per piece. The suggested Poisson generalized linear mixed approach for log-normal frailty models based on the %PCFrailty macro provides several advantages in the analysis of clustered survival data with respect to more flexible modelling of fixed and random effects, exact (in the sense of non-approximate) maximum likelihood estimation, and standard errors and different types of confidence intervals for all variance parameters. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Using existing case-mix methods to fund trauma cases.
Monakova, Julia; Blais, Irene; Botz, Charles; Chechulin, Yuriy; Picciano, Gino; Basinski, Antoni
2010-01-01
Policymakers frequently face the need to increase funding in isolated and frequently heterogeneous (clinically and in terms of resource consumption) patient subpopulations. This article presents a methodologic solution for testing the appropriateness of using existing grouping and weighting methodologies for funding subsets of patients in the scenario where a case-mix approach is preferable to a flat-rate based payment system. Using as an example the subpopulation of trauma cases of Ontario lead trauma hospitals, the statistical techniques of linear and nonlinear regression models, regression trees, and spline models were applied to examine the fit of the existing case-mix groups and reference weights for the trauma cases. The analyses demonstrated that for funding Ontario trauma cases, the existing case-mix systems can form the basis for rational and equitable hospital funding, decreasing the need to develop a different grouper for this subset of patients. This study confirmed that Injury Severity Score is a poor predictor of costs for trauma patients. Although our analysis used the Canadian case-mix classification system and cost weights, the demonstrated concept of using existing case-mix systems to develop funding rates for specific subsets of patient populations may be applicable internationally.
Zhang, J; Feng, J-Y; Ni, Y-L; Wen, Y-J; Niu, Y; Tamba, C L; Yue, C; Song, Q; Zhang, Y-M
2017-06-01
Multilocus genome-wide association studies (GWAS) have become the state-of-the-art procedure to identify quantitative trait nucleotides (QTNs) associated with complex traits. However, implementation of multilocus model in GWAS is still difficult. In this study, we integrated least angle regression with empirical Bayes to perform multilocus GWAS under polygenic background control. We used an algorithm of model transformation that whitened the covariance matrix of the polygenic matrix K and environmental noise. Markers on one chromosome were included simultaneously in a multilocus model and least angle regression was used to select the most potentially associated single-nucleotide polymorphisms (SNPs), whereas the markers on the other chromosomes were used to calculate kinship matrix as polygenic background control. The selected SNPs in multilocus model were further detected for their association with the trait by empirical Bayes and likelihood ratio test. We herein refer to this method as the pLARmEB (polygenic-background-control-based least angle regression plus empirical Bayes). Results from simulation studies showed that pLARmEB was more powerful in QTN detection and more accurate in QTN effect estimation, had less false positive rate and required less computing time than Bayesian hierarchical generalized linear model, efficient mixed model association (EMMA) and least angle regression plus empirical Bayes. pLARmEB, multilocus random-SNP-effect mixed linear model and fast multilocus random-SNP-effect EMMA methods had almost equal power of QTN detection in simulation experiments. However, only pLARmEB identified 48 previously reported genes for 7 flowering time-related traits in Arabidopsis thaliana.
Cook, James P; Mahajan, Anubha; Morris, Andrew P
2017-02-01
Linear mixed models are increasingly used for the analysis of genome-wide association studies (GWAS) of binary phenotypes because they can efficiently and robustly account for population stratification and relatedness through inclusion of random effects for a genetic relationship matrix. However, the utility of linear (mixed) models in the context of meta-analysis of GWAS of binary phenotypes has not been previously explored. In this investigation, we present simulations to compare the performance of linear and logistic regression models under alternative weighting schemes in a fixed-effects meta-analysis framework, considering designs that incorporate variable case-control imbalance, confounding factors and population stratification. Our results demonstrate that linear models can be used for meta-analysis of GWAS of binary phenotypes, without loss of power, even in the presence of extreme case-control imbalance, provided that one of the following schemes is used: (i) effective sample size weighting of Z-scores or (ii) inverse-variance weighting of allelic effect sizes after conversion onto the log-odds scale. Our conclusions thus provide essential recommendations for the development of robust protocols for meta-analysis of binary phenotypes with linear models.
Huff, Andrew G; Hodges, James S; Kennedy, Shaun P; Kircher, Amy
2015-08-01
To protect and secure food resources for the United States, it is crucial to have a method to compare food systems' criticality. In 2007, the U.S. government funded development of the Food and Agriculture Sector Criticality Assessment Tool (FASCAT) to determine which food and agriculture systems were most critical to the nation. FASCAT was developed in a collaborative process involving government officials and food industry subject matter experts (SMEs). After development, data were collected using FASCAT to quantify threats, vulnerabilities, consequences, and the impacts on the United States from failure of evaluated food and agriculture systems. To examine FASCAT's utility, linear regression models were used to determine: (1) which groups of questions posed in FASCAT were better predictors of cumulative criticality scores; (2) whether the items included in FASCAT's criticality method or the smaller subset of FASCAT items included in DHS's risk analysis method predicted similar criticality scores. Akaike's information criterion was used to determine which regression models best described criticality, and a mixed linear model was used to shrink estimates of criticality for individual food and agriculture systems. The results indicated that: (1) some of the questions used in FASCAT strongly predicted food or agriculture system criticality; (2) the FASCAT criticality formula was a stronger predictor of criticality compared to the DHS risk formula; (3) the cumulative criticality formula predicted criticality more strongly than weighted criticality formula; and (4) the mixed linear regression model did not change the rank-order of food and agriculture system criticality to a large degree. © 2015 Society for Risk Analysis.
Correcting for population structure and kinship using the linear mixed model: theory and extensions.
Hoffman, Gabriel E
2013-01-01
Population structure and kinship are widespread confounding factors in genome-wide association studies (GWAS). It has been standard practice to include principal components of the genotypes in a regression model in order to account for population structure. More recently, the linear mixed model (LMM) has emerged as a powerful method for simultaneously accounting for population structure and kinship. The statistical theory underlying the differences in empirical performance between modeling principal components as fixed versus random effects has not been thoroughly examined. We undertake an analysis to formalize the relationship between these widely used methods and elucidate the statistical properties of each. Moreover, we introduce a new statistic, effective degrees of freedom, that serves as a metric of model complexity and a novel low rank linear mixed model (LRLMM) to learn the dimensionality of the correction for population structure and kinship, and we assess its performance through simulations. A comparison of the results of LRLMM and a standard LMM analysis applied to GWAS data from the Multi-Ethnic Study of Atherosclerosis (MESA) illustrates how our theoretical results translate into empirical properties of the mixed model. Finally, the analysis demonstrates the ability of the LRLMM to substantially boost the strength of an association for HDL cholesterol in Europeans.
The long-solved problem of the best-fit straight line: application to isotopic mixing lines
NASA Astrophysics Data System (ADS)
Wehr, Richard; Saleska, Scott R.
2017-01-01
It has been almost 50 years since York published an exact and general solution for the best-fit straight line to independent points with normally distributed errors in both x and y. York's solution is highly cited in the geophysical literature but almost unknown outside of it, so that there has been no ebb in the tide of books and papers wrestling with the problem. Much of the post-1969 literature on straight-line fitting has sown confusion not merely by its content but by its very existence. The optimal least-squares fit is already known; the problem is already solved. Here we introduce the non-specialist reader to York's solution and demonstrate its application in the interesting case of the isotopic mixing line, an analytical tool widely used to determine the isotopic signature of trace gas sources for the study of biogeochemical cycles. The most commonly known linear regression methods - ordinary least-squares regression (OLS), geometric mean regression (GMR), and orthogonal distance regression (ODR) - have each been recommended as the best method for fitting isotopic mixing lines. In fact, OLS, GMR, and ODR are all special cases of York's solution that are valid only under particular measurement conditions, and those conditions do not hold in general for isotopic mixing lines. Using Monte Carlo simulations, we quantify the biases in OLS, GMR, and ODR under various conditions and show that York's general - and convenient - solution is always the least biased.
ERIC Educational Resources Information Center
Malin, Heather; Han, Hyemin; Liauw, Indrawati
2017-01-01
This study investigated the effects of internal and demographic variables on civic development in late adolescence using the construct "civic purpose." We conducted surveys on civic engagement with 480 high school seniors, and surveyed them again 2 years later. Using multivariate regression and linear mixed models, we tested the main…
ERIC Educational Resources Information Center
Pratt, Charlotte; Webber, Larry S.; Baggett, Chris D.; Ward, Dianne; Pate, Russell R.; Murray, David; Lohman, Timothy; Lytle, Leslie; Elder, John P.
2008-01-01
This study describes the relationships between sedentary activity and body composition in 1,458 sixth-grade girls from 36 middle schools across the United States. Multivariate associations between sedentary activity and body composition were examined with regression analyses using general linear mixed models. Mean age, body mass index, and…
Curriculum-Based Measurement of Oral Reading: Quality of Progress Monitoring Outcomes
ERIC Educational Resources Information Center
Christ, Theodore J.; Zopluoglu, Cengiz; Long, Jeffery D.; Monaghen, Barbara D.
2012-01-01
Curriculum-based measurement of oral reading (CBM-R) is frequently used to set student goals and monitor student progress. This study examined the quality of growth estimates derived from CBM-R progress monitoring data. The authors used a linear mixed effects regression (LMER) model to simulate progress monitoring data for multiple levels of…
Understory response following varying levels of overstory removal in mixed conifer stands
Fabian C.C. Uzoh; Leroy K. Dolph; John R. Anstead
1997-01-01
Diameter growth rates of understory trees were measured for periods both before and after overstory removal on six study areas in northern California. All the species responded with increased diameter growth after adjusting to their new environments. Linear regression equations that predict post treatment diameter growth increment of the residual trees are presented...
Alan K. Swanson; Solomon Z. Dobrowski; Andrew O. Finley; James H. Thorne; Michael K. Schwartz
2013-01-01
The uncertainty associated with species distribution model (SDM) projections is poorly characterized, despite its potential value to decision makers. Error estimates from most modelling techniques have been shown to be biased due to their failure to account for spatial autocorrelation (SAC) of residual error. Generalized linear mixed models (GLMM) have the ability to...
Penalized nonparametric scalar-on-function regression via principal coordinates
Reiss, Philip T.; Miller, David L.; Wu, Pei-Shien; Hua, Wen-Yu
2016-01-01
A number of classical approaches to nonparametric regression have recently been extended to the case of functional predictors. This paper introduces a new method of this type, which extends intermediate-rank penalized smoothing to scalar-on-function regression. In the proposed method, which we call principal coordinate ridge regression, one regresses the response on leading principal coordinates defined by a relevant distance among the functional predictors, while applying a ridge penalty. Our publicly available implementation, based on generalized additive modeling software, allows for fast optimal tuning parameter selection and for extensions to multiple functional predictors, exponential family-valued responses, and mixed-effects models. In an application to signature verification data, principal coordinate ridge regression, with dynamic time warping distance used to define the principal coordinates, is shown to outperform a functional generalized linear model. PMID:29217963
Wu, Lingtao; Lord, Dominique
2017-05-01
This study further examined the use of regression models for developing crash modification factors (CMFs), specifically focusing on the misspecification in the link function. The primary objectives were to validate the accuracy of CMFs derived from the commonly used regression models (i.e., generalized linear models or GLMs with additive linear link functions) when some of the variables have nonlinear relationships and quantify the amount of bias as a function of the nonlinearity. Using the concept of artificial realistic data, various linear and nonlinear crash modification functions (CM-Functions) were assumed for three variables. Crash counts were randomly generated based on these CM-Functions. CMFs were then derived from regression models for three different scenarios. The results were compared with the assumed true values. The main findings are summarized as follows: (1) when some variables have nonlinear relationships with crash risk, the CMFs for these variables derived from the commonly used GLMs are all biased, especially around areas away from the baseline conditions (e.g., boundary areas); (2) with the increase in nonlinearity (i.e., nonlinear relationship becomes stronger), the bias becomes more significant; (3) the quality of CMFs for other variables having linear relationships can be influenced when mixed with those having nonlinear relationships, but the accuracy may still be acceptable; and (4) the misuse of the link function for one or more variables can also lead to biased estimates for other parameters. This study raised the importance of the link function when using regression models for developing CMFs. Copyright © 2017 Elsevier Ltd. All rights reserved.
Modeling Longitudinal Data Containing Non-Normal Within Subject Errors
NASA Technical Reports Server (NTRS)
Feiveson, Alan; Glenn, Nancy L.
2013-01-01
The mission of the National Aeronautics and Space Administration’s (NASA) human research program is to advance safe human spaceflight. This involves conducting experiments, collecting data, and analyzing data. The data are longitudinal and result from a relatively few number of subjects; typically 10 – 20. A longitudinal study refers to an investigation where participant outcomes and possibly treatments are collected at multiple follow-up times. Standard statistical designs such as mean regression with random effects and mixed–effects regression are inadequate for such data because the population is typically not approximately normally distributed. Hence, more advanced data analysis methods are necessary. This research focuses on four such methods for longitudinal data analysis: the recently proposed linear quantile mixed models (lqmm) by Geraci and Bottai (2013), quantile regression, multilevel mixed–effects linear regression, and robust regression. This research also provides computational algorithms for longitudinal data that scientists can directly use for human spaceflight and other longitudinal data applications, then presents statistical evidence that verifies which method is best for specific situations. This advances the study of longitudinal data in a broad range of applications including applications in the sciences, technology, engineering and mathematics fields.
Pre-natal exposures to cocaine and alcohol and physical growth patterns to age 8 years
Lumeng, Julie C.; Cabral, Howard J.; Gannon, Katherine; Heeren, Timothy; Frank, Deborah A.
2007-01-01
Two hundred and two primarily African American/Caribbean children (classified by maternal report and infant meconium as 38 heavier, 74 lighter and 89 not cocaine-exposed) were measured repeatedly from birth to age 8 years to assess whether there is an independent effect of prenatal cocaine exposure on physical growth patterns. Children with fetal alcohol syndrome identifiable at birth were excluded. At birth, cocaine and alcohol exposures were significantly and independently associated with lower weight, length and head circumference in cross-sectional multiple regression analyses. The relationship over time of pre-natal exposures to weight, height, and head circumference was then examined by multiple linear regression using mixed linear models including covariates: child’s gestational age, gender, ethnicity, age at assessment, current caregiver, birth mother’s use of alcohol, marijuana and tobacco during the pregnancy and pre-pregnancy weight (for child’s weight) and height (for child’s height and head circumference). The cocaine effects did not persist beyond infancy in piecewise linear mixed models, but a significant and independent negative effect of pre-natal alcohol exposure persisted for weight, height, and head circumference. Catch-up growth in cocaine-exposed infants occurred primarily by 6 months of age for all growth parameters, with some small fluctuations in growth rates in the preschool age range but no detectable differences between heavier versus unexposed nor lighter versus unexposed thereafter. PMID:17412558
Competing regression models for longitudinal data.
Alencar, Airlane P; Singer, Julio M; Rocha, Francisco Marcelo M
2012-03-01
The choice of an appropriate family of linear models for the analysis of longitudinal data is often a matter of concern for practitioners. To attenuate such difficulties, we discuss some issues that emerge when analyzing this type of data via a practical example involving pretest-posttest longitudinal data. In particular, we consider log-normal linear mixed models (LNLMM), generalized linear mixed models (GLMM), and models based on generalized estimating equations (GEE). We show how some special features of the data, like a nonconstant coefficient of variation, may be handled in the three approaches and evaluate their performance with respect to the magnitude of standard errors of interpretable and comparable parameters. We also show how different diagnostic tools may be employed to identify outliers and comment on available software. We conclude by noting that the results are similar, but that GEE-based models may be preferable when the goal is to compare the marginal expected responses. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Malloy, Elizabeth J; Morris, Jeffrey S; Adar, Sara D; Suh, Helen; Gold, Diane R; Coull, Brent A
2010-07-01
Frequently, exposure data are measured over time on a grid of discrete values that collectively define a functional observation. In many applications, researchers are interested in using these measurements as covariates to predict a scalar response in a regression setting, with interest focusing on the most biologically relevant time window of exposure. One example is in panel studies of the health effects of particulate matter (PM), where particle levels are measured over time. In such studies, there are many more values of the functional data than observations in the data set so that regularization of the corresponding functional regression coefficient is necessary for estimation. Additional issues in this setting are the possibility of exposure measurement error and the need to incorporate additional potential confounders, such as meteorological or co-pollutant measures, that themselves may have effects that vary over time. To accommodate all these features, we develop wavelet-based linear mixed distributed lag models that incorporate repeated measures of functional data as covariates into a linear mixed model. A Bayesian approach to model fitting uses wavelet shrinkage to regularize functional coefficients. We show that, as long as the exposure error induces fine-scale variability in the functional exposure profile and the distributed lag function representing the exposure effect varies smoothly in time, the model corrects for the exposure measurement error without further adjustment. Both these conditions are likely to hold in the environmental applications we consider. We examine properties of the method using simulations and apply the method to data from a study examining the association between PM, measured as hourly averages for 1-7 days, and markers of acute systemic inflammation. We use the method to fully control for the effects of confounding by other time-varying predictors, such as temperature and co-pollutants.
ERIC Educational Resources Information Center
Lazar, Ann A.; Zerbe, Gary O.
2011-01-01
Researchers often compare the relationship between an outcome and covariate for two or more groups by evaluating whether the fitted regression curves differ significantly. When they do, researchers need to determine the "significance region," or the values of the covariate where the curves significantly differ. In analysis of covariance (ANCOVA),…
ERIC Educational Resources Information Center
Van Norman, Ethan R.; Christ, Theodore J.; Zopluoglu, Cengiz
2013-01-01
This study examined the effect of baseline estimation on the quality of trend estimates derived from Curriculum Based Measurement of Oral Reading (CBM-R) progress monitoring data. The authors used a linear mixed effects regression (LMER) model to simulate progress monitoring data for schedules ranging from 6-20 weeks for datasets with high and low…
Advanced Statistical Analyses to Reduce Inconsistency of Bond Strength Data.
Minamino, T; Mine, A; Shintani, A; Higashi, M; Kawaguchi-Uemura, A; Kabetani, T; Hagino, R; Imai, D; Tajiri, Y; Matsumoto, M; Yatani, H
2017-11-01
This study was designed to clarify the interrelationship of factors that affect the value of microtensile bond strength (µTBS), focusing on nondestructive testing by which information of the specimens can be stored and quantified. µTBS test specimens were prepared from 10 noncarious human molars. Six factors of µTBS test specimens were evaluated: presence of voids at the interface, X-ray absorption coefficient of resin, X-ray absorption coefficient of dentin, length of dentin part, size of adhesion area, and individual differences of teeth. All specimens were observed nondestructively by optical coherence tomography and micro-computed tomography before µTBS testing. After µTBS testing, the effect of these factors on µTBS data was analyzed by the general linear model, linear mixed effects regression model, and nonlinear regression model with 95% confidence intervals. By the general linear model, a significant difference in individual differences of teeth was observed ( P < 0.001). A significantly positive correlation was shown between µTBS and length of dentin part ( P < 0.001); however, there was no significant nonlinearity ( P = 0.157). Moreover, a significantly negative correlation was observed between µTBS and size of adhesion area ( P = 0.001), with significant nonlinearity ( P = 0.014). No correlation was observed between µTBS and X-ray absorption coefficient of resin ( P = 0.147), and there was no significant nonlinearity ( P = 0.089). Additionally, a significantly positive correlation was observed between µTBS and X-ray absorption coefficient of dentin ( P = 0.022), with significant nonlinearity ( P = 0.036). A significant difference was also observed between the presence and absence of voids by linear mixed effects regression analysis. Our results showed correlations between various parameters of tooth specimens and µTBS data. To evaluate the performance of the adhesive more precisely, the effect of tooth variability and a method to reduce variation in bond strength values should also be considered.
NASA Technical Reports Server (NTRS)
Banse, Karl; Yong, Marina
1990-01-01
As a proxy for satellite CZCS observations and concurrent measurements of primary production rates, data from 138 stations occupied seasonally during 1967-1968 in the offshore eastern tropical Pacific were analyzed in terms of six temporal groups and our current regimes. Multiple linear regressions on column production Pt show that simulated satellite pigment is generally weakly correlated, but sometimes not correlated with Pt, and that incident irradiance, sea surface temperature, nitrate, transparency, and depths of mixed layer or nitracline assume little or no importance. After a proxy for the light-saturated chlorophyll-specific photosynthetic rate P(max) is added, the coefficient of determination ranges from 0.55 to 0.91 (median of 0.85) for the 10 cases. In stepwise multiple linear regressions the P(max) proxy is the best predictor for Pt.
The long-solved problem of the best-fit straight line: Application to isotopic mixing lines
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wehr, Richard; Saleska, Scott R.
It has been almost 50 years since York published an exact and general solution for the best-fit straight line to independent points with normally distributed errors in both x and y. York's solution is highly cited in the geophysical literature but almost unknown outside of it, so that there has been no ebb in the tide of books and papers wrestling with the problem. Much of the post-1969 literature on straight-line fitting has sown confusion not merely by its content but by its very existence. The optimal least-squares fit is already known; the problem is already solved. Here we introducemore » the non-specialist reader to York's solution and demonstrate its application in the interesting case of the isotopic mixing line, an analytical tool widely used to determine the isotopic signature of trace gas sources for the study of biogeochemical cycles. The most commonly known linear regression methods – ordinary least-squares regression (OLS), geometric mean regression (GMR), and orthogonal distance regression (ODR) – have each been recommended as the best method for fitting isotopic mixing lines. In fact, OLS, GMR, and ODR are all special cases of York's solution that are valid only under particular measurement conditions, and those conditions do not hold in general for isotopic mixing lines. Here, using Monte Carlo simulations, we quantify the biases in OLS, GMR, and ODR under various conditions and show that York's general – and convenient – solution is always the least biased.« less
The long-solved problem of the best-fit straight line: Application to isotopic mixing lines
Wehr, Richard; Saleska, Scott R.
2017-01-03
It has been almost 50 years since York published an exact and general solution for the best-fit straight line to independent points with normally distributed errors in both x and y. York's solution is highly cited in the geophysical literature but almost unknown outside of it, so that there has been no ebb in the tide of books and papers wrestling with the problem. Much of the post-1969 literature on straight-line fitting has sown confusion not merely by its content but by its very existence. The optimal least-squares fit is already known; the problem is already solved. Here we introducemore » the non-specialist reader to York's solution and demonstrate its application in the interesting case of the isotopic mixing line, an analytical tool widely used to determine the isotopic signature of trace gas sources for the study of biogeochemical cycles. The most commonly known linear regression methods – ordinary least-squares regression (OLS), geometric mean regression (GMR), and orthogonal distance regression (ODR) – have each been recommended as the best method for fitting isotopic mixing lines. In fact, OLS, GMR, and ODR are all special cases of York's solution that are valid only under particular measurement conditions, and those conditions do not hold in general for isotopic mixing lines. Here, using Monte Carlo simulations, we quantify the biases in OLS, GMR, and ODR under various conditions and show that York's general – and convenient – solution is always the least biased.« less
Dong, Ling-Bo; Liu, Zhao-Gang; Li, Feng-Ri; Jiang, Li-Chun
2013-09-01
By using the branch analysis data of 955 standard branches from 60 sampled trees in 12 sampling plots of Pinus koraiensis plantation in Mengjiagang Forest Farm in Heilongjiang Province of Northeast China, and based on the linear mixed-effect model theory and methods, the models for predicting branch variables, including primary branch diameter, length, and angle, were developed. Considering tree effect, the MIXED module of SAS software was used to fit the prediction models. The results indicated that the fitting precision of the models could be improved by choosing appropriate random-effect parameters and variance-covariance structure. Then, the correlation structures including complex symmetry structure (CS), first-order autoregressive structure [AR(1)], and first-order autoregressive and moving average structure [ARMA(1,1)] were added to the optimal branch size mixed-effect model. The AR(1) improved the fitting precision of branch diameter and length mixed-effect model significantly, but all the three structures didn't improve the precision of branch angle mixed-effect model. In order to describe the heteroscedasticity during building mixed-effect model, the CF1 and CF2 functions were added to the branch mixed-effect model. CF1 function improved the fitting effect of branch angle mixed model significantly, whereas CF2 function improved the fitting effect of branch diameter and length mixed model significantly. Model validation confirmed that the mixed-effect model could improve the precision of prediction, as compare to the traditional regression model for the branch size prediction of Pinus koraiensis plantation.
The Weight of Euro Coins: Its Distribution Might Not Be as Normal as You Would Expect
ERIC Educational Resources Information Center
Shkedy, Ziv; Aerts, Marc; Callaert, Herman
2006-01-01
Classical regression models, ANOVA models and linear mixed models are just three examples (out of many) in which the normal distribution of the response is an essential assumption of the model. In this paper we use a dataset of 2000 euro coins containing information (up to the milligram) about the weight of each coin, to illustrate that the…
ERIC Educational Resources Information Center
Daniels, Amy M.; Mandell, David S.
2013-01-01
This study estimated compliance with American Academy of Pediatrics (AAP) guidelines for well-child care and the association between compliance and age at diagnosis in a national sample of Medicaid-enrolled children with autism (N = 1,475). Mixed effects linear regression was used to assess the relationship between compliance and age at diagnosis.…
NASA Astrophysics Data System (ADS)
Fernández-Manso, O.; Fernández-Manso, A.; Quintano, C.
2014-09-01
Aboveground biomass (AGB) estimation from optical satellite data is usually based on regression models of original or synthetic bands. To overcome the poor relation between AGB and spectral bands due to mixed-pixels when a medium spatial resolution sensor is considered, we propose to base the AGB estimation on fraction images from Linear Spectral Mixture Analysis (LSMA). Our study area is a managed Mediterranean pine woodland (Pinus pinaster Ait.) in central Spain. A total of 1033 circular field plots were used to estimate AGB from Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) optical data. We applied Pearson correlation statistics and stepwise multiple regression to identify suitable predictors from the set of variables of original bands, fraction imagery, Normalized Difference Vegetation Index and Tasselled Cap components. Four linear models and one nonlinear model were tested. A linear combination of ASTER band 2 (red, 0.630-0.690 μm), band 8 (short wave infrared 5, 2.295-2.365 μm) and green vegetation fraction (from LSMA) was the best AGB predictor (Radj2=0.632, the root-mean-squared error of estimated AGB was 13.3 Mg ha-1 (or 37.7%), resulting from cross-validation), rather than other combinations of the above cited independent variables. Results indicated that using ASTER fraction images in regression models improves the AGB estimation in Mediterranean pine forests. The spatial distribution of the estimated AGB, based on a multiple linear regression model, may be used as baseline information for forest managers in future studies, such as quantifying the regional carbon budget, fuel accumulation or monitoring of management practices.
González, Juan R; Carrasco, Josep L; Armengol, Lluís; Villatoro, Sergi; Jover, Lluís; Yasui, Yutaka; Estivill, Xavier
2008-01-01
Background MLPA method is a potentially useful semi-quantitative method to detect copy number alterations in targeted regions. In this paper, we propose a method for the normalization procedure based on a non-linear mixed-model, as well as a new approach for determining the statistical significance of altered probes based on linear mixed-model. This method establishes a threshold by using different tolerance intervals that accommodates the specific random error variability observed in each test sample. Results Through simulation studies we have shown that our proposed method outperforms two existing methods that are based on simple threshold rules or iterative regression. We have illustrated the method using a controlled MLPA assay in which targeted regions are variable in copy number in individuals suffering from different disorders such as Prader-Willi, DiGeorge or Autism showing the best performace. Conclusion Using the proposed mixed-model, we are able to determine thresholds to decide whether a region is altered. These threholds are specific for each individual, incorporating experimental variability, resulting in improved sensitivity and specificity as the examples with real data have revealed. PMID:18522760
Solving large mixed linear models using preconditioned conjugate gradient iteration.
Strandén, I; Lidauer, M
1999-12-01
Continuous evaluation of dairy cattle with a random regression test-day model requires a fast solving method and algorithm. A new computing technique feasible in Jacobi and conjugate gradient based iterative methods using iteration on data is presented. In the new computing technique, the calculations in multiplication of a vector by a matrix were recorded to three steps instead of the commonly used two steps. The three-step method was implemented in a general mixed linear model program that used preconditioned conjugate gradient iteration. Performance of this program in comparison to other general solving programs was assessed via estimation of breeding values using univariate, multivariate, and random regression test-day models. Central processing unit time per iteration with the new three-step technique was, at best, one-third that needed with the old technique. Performance was best with the test-day model, which was the largest and most complex model used. The new program did well in comparison to other general software. Programs keeping the mixed model equations in random access memory required at least 20 and 435% more time to solve the univariate and multivariate animal models, respectively. Computations of the second best iteration on data took approximately three and five times longer for the animal and test-day models, respectively, than did the new program. Good performance was due to fast computing time per iteration and quick convergence to the final solutions. Use of preconditioned conjugate gradient based methods in solving large breeding value problems is supported by our findings.
NASA Technical Reports Server (NTRS)
Flynn, Clare; Pickering, Kenneth E.; Crawford, James H.; Lamsol, Lok; Krotkov, Nickolay; Herman, Jay; Weinheimer, Andrew; Chen, Gao; Liu, Xiong; Szykman, James;
2014-01-01
To investigate the ability of column (or partial column) information to represent surface air quality, results of linear regression analyses between surface mixing ratio data and column abundances for O3 and NO2 are presented for the July 2011 Maryland deployment of the DISCOVER-AQ mission. Data collected by the P-3B aircraft, ground-based Pandora spectrometers, Aura/OMI satellite instrument, and simulations for July 2011 from the CMAQ air quality model during this deployment provide a large and varied data set, allowing this problem to be approached from multiple perspectives. O3 columns typically exhibited a statistically significant and high degree of correlation with surface data (R(sup 2) > 0.64) in the P- 3B data set, a moderate degree of correlation (0.16 < R(sup 2) < 0.64) in the CMAQ data set, and a low degree of correlation (R(sup 2) < 0.16) in the Pandora and OMI data sets. NO2 columns typically exhibited a low to moderate degree of correlation with surface data in each data set. The results of linear regression analyses for O3 exhibited smaller errors relative to the observations than NO2 regressions. These results suggest that O3 partial column observations from future satellite instruments with sufficient sensitivity to the lower troposphere can be meaningful for surface air quality analysis.
NASA Astrophysics Data System (ADS)
Hapugoda, J. C.; Sooriyarachchi, M. R.
2017-09-01
Survival time of patients with a disease and the incidence of that particular disease (count) is frequently observed in medical studies with the data of a clustered nature. In many cases, though, the survival times and the count can be correlated in a way that, diseases that occur rarely could have shorter survival times or vice versa. Due to this fact, joint modelling of these two variables will provide interesting and certainly improved results than modelling these separately. Authors have previously proposed a methodology using Generalized Linear Mixed Models (GLMM) by joining the Discrete Time Hazard model with the Poisson Regression model to jointly model survival and count model. As Aritificial Neural Network (ANN) has become a most powerful computational tool to model complex non-linear systems, it was proposed to develop a new joint model of survival and count of Dengue patients of Sri Lanka by using that approach. Thus, the objective of this study is to develop a model using ANN approach and compare the results with the previously developed GLMM model. As the response variables are continuous in nature, Generalized Regression Neural Network (GRNN) approach was adopted to model the data. To compare the model fit, measures such as root mean square error (RMSE), absolute mean error (AME) and correlation coefficient (R) were used. The measures indicate the GRNN model fits the data better than the GLMM model.
Coupé, Christophe
2018-01-01
As statistical approaches are getting increasingly used in linguistics, attention must be paid to the choice of methods and algorithms used. This is especially true since they require assumptions to be satisfied to provide valid results, and because scientific articles still often fall short of reporting whether such assumptions are met. Progress is being, however, made in various directions, one of them being the introduction of techniques able to model data that cannot be properly analyzed with simpler linear regression models. We report recent advances in statistical modeling in linguistics. We first describe linear mixed-effects regression models (LMM), which address grouping of observations, and generalized linear mixed-effects models (GLMM), which offer a family of distributions for the dependent variable. Generalized additive models (GAM) are then introduced, which allow modeling non-linear parametric or non-parametric relationships between the dependent variable and the predictors. We then highlight the possibilities offered by generalized additive models for location, scale, and shape (GAMLSS). We explain how they make it possible to go beyond common distributions, such as Gaussian or Poisson, and offer the appropriate inferential framework to account for ‘difficult’ variables such as count data with strong overdispersion. We also demonstrate how they offer interesting perspectives on data when not only the mean of the dependent variable is modeled, but also its variance, skewness, and kurtosis. As an illustration, the case of phonemic inventory size is analyzed throughout the article. For over 1,500 languages, we consider as predictors the number of speakers, the distance from Africa, an estimation of the intensity of language contact, and linguistic relationships. We discuss the use of random effects to account for genealogical relationships, the choice of appropriate distributions to model count data, and non-linear relationships. Relying on GAMLSS, we assess a range of candidate distributions, including the Sichel, Delaporte, Box-Cox Green and Cole, and Box-Cox t distributions. We find that the Box-Cox t distribution, with appropriate modeling of its parameters, best fits the conditional distribution of phonemic inventory size. We finally discuss the specificities of phoneme counts, weak effects, and how GAMLSS should be considered for other linguistic variables. PMID:29713298
Coupé, Christophe
2018-01-01
As statistical approaches are getting increasingly used in linguistics, attention must be paid to the choice of methods and algorithms used. This is especially true since they require assumptions to be satisfied to provide valid results, and because scientific articles still often fall short of reporting whether such assumptions are met. Progress is being, however, made in various directions, one of them being the introduction of techniques able to model data that cannot be properly analyzed with simpler linear regression models. We report recent advances in statistical modeling in linguistics. We first describe linear mixed-effects regression models (LMM), which address grouping of observations, and generalized linear mixed-effects models (GLMM), which offer a family of distributions for the dependent variable. Generalized additive models (GAM) are then introduced, which allow modeling non-linear parametric or non-parametric relationships between the dependent variable and the predictors. We then highlight the possibilities offered by generalized additive models for location, scale, and shape (GAMLSS). We explain how they make it possible to go beyond common distributions, such as Gaussian or Poisson, and offer the appropriate inferential framework to account for 'difficult' variables such as count data with strong overdispersion. We also demonstrate how they offer interesting perspectives on data when not only the mean of the dependent variable is modeled, but also its variance, skewness, and kurtosis. As an illustration, the case of phonemic inventory size is analyzed throughout the article. For over 1,500 languages, we consider as predictors the number of speakers, the distance from Africa, an estimation of the intensity of language contact, and linguistic relationships. We discuss the use of random effects to account for genealogical relationships, the choice of appropriate distributions to model count data, and non-linear relationships. Relying on GAMLSS, we assess a range of candidate distributions, including the Sichel, Delaporte, Box-Cox Green and Cole, and Box-Cox t distributions. We find that the Box-Cox t distribution, with appropriate modeling of its parameters, best fits the conditional distribution of phonemic inventory size. We finally discuss the specificities of phoneme counts, weak effects, and how GAMLSS should be considered for other linguistic variables.
Fitzsimmons, Eric J; Kvam, Vanessa; Souleyrette, Reginald R; Nambisan, Shashi S; Bonett, Douglas G
2013-01-01
Despite recent improvements in highway safety in the United States, serious crashes on curves remain a significant problem. To assist in better understanding causal factors leading to this problem, this article presents and demonstrates a methodology for collection and analysis of vehicle trajectory and speed data for rural and urban curves using Z-configured road tubes. For a large number of vehicle observations at 2 horizontal curves located in Dexter and Ames, Iowa, the article develops vehicle speed and lateral position prediction models for multiple points along these curves. Linear mixed-effects models were used to predict vehicle lateral position and speed along the curves as explained by operational, vehicle, and environmental variables. Behavior was visually represented for an identified subset of "risky" drivers. Linear mixed-effect regression models provided the means to predict vehicle speed and lateral position while taking into account repeated observations of the same vehicle along horizontal curves. Speed and lateral position at point of entry were observed to influence trajectory and speed profiles. Rural horizontal curve site models are presented that indicate that the following variables were significant and influenced both vehicle speed and lateral position: time of day, direction of travel (inside or outside lane), and type of vehicle.
Goeyvaerts, Nele; Leuridan, Elke; Faes, Christel; Van Damme, Pierre; Hens, Niel
2015-09-10
Biomedical studies often generate repeated measures of multiple outcomes on a set of subjects. It may be of interest to develop a biologically intuitive model for the joint evolution of these outcomes while assessing inter-subject heterogeneity. Even though it is common for biological processes to entail non-linear relationships, examples of multivariate non-linear mixed models (MNMMs) are still fairly rare. We contribute to this area by jointly analyzing the maternal antibody decay for measles, mumps, rubella, and varicella, allowing for a different non-linear decay model for each infectious disease. We present a general modeling framework to analyze multivariate non-linear longitudinal profiles subject to censoring, by combining multivariate random effects, non-linear growth and Tobit regression. We explore the hypothesis of a common infant-specific mechanism underlying maternal immunity using a pairwise correlated random-effects approach and evaluating different correlation matrix structures. The implied marginal correlation between maternal antibody levels is estimated using simulations. The mean duration of passive immunity was less than 4 months for all diseases with substantial heterogeneity between infants. The maternal antibody levels against rubella and varicella were found to be positively correlated, while little to no correlation could be inferred for the other disease pairs. For some pairs, computational issues occurred with increasing correlation matrix complexity, which underlines the importance of further developing estimation methods for MNMMs. Copyright © 2015 John Wiley & Sons, Ltd.
Mixed models, linear dependency, and identification in age-period-cohort models.
O'Brien, Robert M
2017-07-20
This paper examines the identification problem in age-period-cohort models that use either linear or categorically coded ages, periods, and cohorts or combinations of these parameterizations. These models are not identified using the traditional fixed effect regression model approach because of a linear dependency between the ages, periods, and cohorts. However, these models can be identified if the researcher introduces a single just identifying constraint on the model coefficients. The problem with such constraints is that the results can differ substantially depending on the constraint chosen. Somewhat surprisingly, age-period-cohort models that specify one or more of ages and/or periods and/or cohorts as random effects are identified. This is the case without introducing an additional constraint. I label this identification as statistical model identification and show how statistical model identification comes about in mixed models and why which effects are treated as fixed and which are treated as random can substantially change the estimates of the age, period, and cohort effects. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.
A Tutorial on Multilevel Survival Analysis: Methods, Models and Applications
Austin, Peter C.
2017-01-01
Summary Data that have a multilevel structure occur frequently across a range of disciplines, including epidemiology, health services research, public health, education and sociology. We describe three families of regression models for the analysis of multilevel survival data. First, Cox proportional hazards models with mixed effects incorporate cluster-specific random effects that modify the baseline hazard function. Second, piecewise exponential survival models partition the duration of follow-up into mutually exclusive intervals and fit a model that assumes that the hazard function is constant within each interval. This is equivalent to a Poisson regression model that incorporates the duration of exposure within each interval. By incorporating cluster-specific random effects, generalised linear mixed models can be used to analyse these data. Third, after partitioning the duration of follow-up into mutually exclusive intervals, one can use discrete time survival models that use a complementary log–log generalised linear model to model the occurrence of the outcome of interest within each interval. Random effects can be incorporated to account for within-cluster homogeneity in outcomes. We illustrate the application of these methods using data consisting of patients hospitalised with a heart attack. We illustrate the application of these methods using three statistical programming languages (R, SAS and Stata). PMID:29307954
A Tutorial on Multilevel Survival Analysis: Methods, Models and Applications.
Austin, Peter C
2017-08-01
Data that have a multilevel structure occur frequently across a range of disciplines, including epidemiology, health services research, public health, education and sociology. We describe three families of regression models for the analysis of multilevel survival data. First, Cox proportional hazards models with mixed effects incorporate cluster-specific random effects that modify the baseline hazard function. Second, piecewise exponential survival models partition the duration of follow-up into mutually exclusive intervals and fit a model that assumes that the hazard function is constant within each interval. This is equivalent to a Poisson regression model that incorporates the duration of exposure within each interval. By incorporating cluster-specific random effects, generalised linear mixed models can be used to analyse these data. Third, after partitioning the duration of follow-up into mutually exclusive intervals, one can use discrete time survival models that use a complementary log-log generalised linear model to model the occurrence of the outcome of interest within each interval. Random effects can be incorporated to account for within-cluster homogeneity in outcomes. We illustrate the application of these methods using data consisting of patients hospitalised with a heart attack. We illustrate the application of these methods using three statistical programming languages (R, SAS and Stata).
Kim, Yoonsang; Choi, Young-Ku; Emery, Sherry
2013-08-01
Several statistical packages are capable of estimating generalized linear mixed models and these packages provide one or more of three estimation methods: penalized quasi-likelihood, Laplace, and Gauss-Hermite. Many studies have investigated these methods' performance for the mixed-effects logistic regression model. However, the authors focused on models with one or two random effects and assumed a simple covariance structure between them, which may not be realistic. When there are multiple correlated random effects in a model, the computation becomes intensive, and often an algorithm fails to converge. Moreover, in our analysis of smoking status and exposure to anti-tobacco advertisements, we have observed that when a model included multiple random effects, parameter estimates varied considerably from one statistical package to another even when using the same estimation method. This article presents a comprehensive review of the advantages and disadvantages of each estimation method. In addition, we compare the performances of the three methods across statistical packages via simulation, which involves two- and three-level logistic regression models with at least three correlated random effects. We apply our findings to a real dataset. Our results suggest that two packages-SAS GLIMMIX Laplace and SuperMix Gaussian quadrature-perform well in terms of accuracy, precision, convergence rates, and computing speed. We also discuss the strengths and weaknesses of the two packages in regard to sample sizes.
Kim, Yoonsang; Emery, Sherry
2013-01-01
Several statistical packages are capable of estimating generalized linear mixed models and these packages provide one or more of three estimation methods: penalized quasi-likelihood, Laplace, and Gauss-Hermite. Many studies have investigated these methods’ performance for the mixed-effects logistic regression model. However, the authors focused on models with one or two random effects and assumed a simple covariance structure between them, which may not be realistic. When there are multiple correlated random effects in a model, the computation becomes intensive, and often an algorithm fails to converge. Moreover, in our analysis of smoking status and exposure to anti-tobacco advertisements, we have observed that when a model included multiple random effects, parameter estimates varied considerably from one statistical package to another even when using the same estimation method. This article presents a comprehensive review of the advantages and disadvantages of each estimation method. In addition, we compare the performances of the three methods across statistical packages via simulation, which involves two- and three-level logistic regression models with at least three correlated random effects. We apply our findings to a real dataset. Our results suggest that two packages—SAS GLIMMIX Laplace and SuperMix Gaussian quadrature—perform well in terms of accuracy, precision, convergence rates, and computing speed. We also discuss the strengths and weaknesses of the two packages in regard to sample sizes. PMID:24288415
Wang, S; Martinez-Lage, M; Sakai, Y; Chawla, S; Kim, S G; Alonso-Basanta, M; Lustig, R A; Brem, S; Mohan, S; Wolf, R L; Desai, A; Poptani, H
2016-01-01
Early assessment of treatment response is critical in patients with glioblastomas. A combination of DTI and DSC perfusion imaging parameters was evaluated to distinguish glioblastomas with true progression from mixed response and pseudoprogression. Forty-one patients with glioblastomas exhibiting enhancing lesions within 6 months after completion of chemoradiation therapy were retrospectively studied. All patients underwent surgery after MR imaging and were histologically classified as having true progression (>75% tumor), mixed response (25%-75% tumor), or pseudoprogression (<25% tumor). Mean diffusivity, fractional anisotropy, linear anisotropy coefficient, planar anisotropy coefficient, spheric anisotropy coefficient, and maximum relative cerebral blood volume values were measured from the enhancing tissue. A multivariate logistic regression analysis was used to determine the best model for classification of true progression from mixed response or pseudoprogression. Significantly elevated maximum relative cerebral blood volume, fractional anisotropy, linear anisotropy coefficient, and planar anisotropy coefficient and decreased spheric anisotropy coefficient were observed in true progression compared with pseudoprogression (P < .05). There were also significant differences in maximum relative cerebral blood volume, fractional anisotropy, planar anisotropy coefficient, and spheric anisotropy coefficient measurements between mixed response and true progression groups. The best model to distinguish true progression from non-true progression (pseudoprogression and mixed) consisted of fractional anisotropy, linear anisotropy coefficient, and maximum relative cerebral blood volume, resulting in an area under the curve of 0.905. This model also differentiated true progression from mixed response with an area under the curve of 0.901. A combination of fractional anisotropy and maximum relative cerebral blood volume differentiated pseudoprogression from nonpseudoprogression (true progression and mixed) with an area under the curve of 0.807. DTI and DSC perfusion imaging can improve accuracy in assessing treatment response and may aid in individualized treatment of patients with glioblastomas. © 2016 by American Journal of Neuroradiology.
Pagniez, Dominique; Duhamel, Alain; Boulanger, Eric; Lessore de Sainte Foy, Celia; Beuscart, Jean-Baptiste
2017-08-31
Glucose is widely used as an osmotic agent in peritoneal dialysis (PD), but exerts untoward effects on the peritoneum. The potential protective effect of a reduced exposure to hypertonic glucose has never been investigated. The cohort of PD patients attending our center which tackled the challenge of a restricted use of hypertonic glucose solutions has been prospectively followed since 1992. Small-solute transport was assessed using an equivalent of the glucose peritoneal equilibration test after 6 months, and then every year. Study was stopped on July 1st, 2008, before use of biocompatible solutions. Repeated measures in patients treated with PD for 54 months were analyzed by using (1) the slopes of the linear regression for D 4 /D 0 ratios over time computed for each individual, and (2) a linear mixed model. In the study period, 44 patients were treated for a total of 2376 months, 2058 without hypertonic glucose. There was one episode of peritoneal infection every 18 patient-months. The mean of slopes of the linear regression for D 4 /D 0 ratios was found to be significantly positive (Student's test, p < .001) and the results of the mixed model reflected a similar significant increase for D 4 /D 0 ratios over time. These results reflected a significant decrease of small-solute transport. In this large series, minimizing the use of hypertonic glucose solutions was associated in patients on long term PD with an overall decrease of small-solute transport within 54 months, despite a high rate of peritoneal infection.
Liu, Zun-Lei; Yuan, Xing-Wei; Yan, Li-Ping; Yang, Lin-Lin; Cheng, Jia-Hua
2013-09-01
By using the 2008-2010 investigation data about the body condition of small yellow croaker in the offshore waters of southern Yellow Sea (SYS), open waters of northern East China Sea (NECS), and offshore waters of middle East China Sea (MECS), this paper analyzed the spatial heterogeneity of body length-body mass of juvenile and adult small yellow croakers by the statistical approaches of mean regression model and quantile regression model. The results showed that the residual standard errors from the analysis of covariance (ANCOVA) and the linear mixed-effects model were similar, and those from the simple linear regression were the highest. For the juvenile small yellow croakers, their mean body mass in SYS and NECS estimated by the mixed-effects mean regression model was higher than the overall average mass across the three regions, while the mean body mass in MECS was below the overall average. For the adult small yellow croakers, their mean body mass in NECS was higher than the overall average, while the mean body mass in SYS and MECS was below the overall average. The results from quantile regression indicated the substantial differences in the allometric relationships of juvenile small yellow croakers between SYS, NECS, and MECS, with the estimated mean exponent of the allometric relationship in SYS being 2.85, and the interquartile range being from 2.63 to 2.96, which indicated the heterogeneity of body form. The results from ANCOVA showed that the allometric body length-body mass relationships were significantly different between the 25th and 75th percentile exponent values (F=6.38, df=1737, P<0.01) and the 25th percentile and median exponent values (F=2.35, df=1737, P=0.039). The relationship was marginally different between the median and 75th percentile exponent values (F=2.21, df=1737, P=0.051). The estimated body length-body mass exponent of adult small yellow croakers in SYS was 3.01 (10th and 95th percentiles = 2.77 and 3.1, respectively). The estimated body length-body mass relationships were significantly different from the lower and upper quantiles of the exponent (F=3.31, df=2793, P=0.01) and the median and upper quantiles (F=3.56, df=2793, P<0.01), while no significant difference was observed between the lower and median quantiles (F=0.98, df=2793, P=0.43).
Hao, Xu; Yujun, Sun; Xinjie, Wang; Jin, Wang; Yao, Fu
2015-01-01
A multiple linear model was developed for individual tree crown width of Cunninghamia lanceolata (Lamb.) Hook in Fujian province, southeast China. Data were obtained from 55 sample plots of pure China-fir plantation stands. An Ordinary Linear Least Squares (OLS) regression was used to establish the crown width model. To adjust for correlations between observations from the same sample plots, we developed one level linear mixed-effects (LME) models based on the multiple linear model, which take into account the random effects of plots. The best random effects combinations for the LME models were determined by the Akaike's information criterion, the Bayesian information criterion and the -2logarithm likelihood. Heteroscedasticity was reduced by three residual variance functions: the power function, the exponential function and the constant plus power function. The spatial correlation was modeled by three correlation structures: the first-order autoregressive structure [AR(1)], a combination of first-order autoregressive and moving average structures [ARMA(1,1)], and the compound symmetry structure (CS). Then, the LME model was compared to the multiple linear model using the absolute mean residual (AMR), the root mean square error (RMSE), and the adjusted coefficient of determination (adj-R2). For individual tree crown width models, the one level LME model showed the best performance. An independent dataset was used to test the performance of the models and to demonstrate the advantage of calibrating LME models.
Composition and structure of Pinus koraiensis mixed forest respond to spatial climatic changes.
Zhang, Jingli; Zhou, Yong; Zhou, Guangsheng; Xiao, Chunwang
2014-01-01
Although some studies have indicated that climate changes can affect Pinus koraiensis mixed forest, the responses of composition and structure of Pinus koraiensis mixed forests to climatic changes are unknown and the key climatic factors controlling the composition and structure of Pinus koraiensis mixed forest are uncertain. Field survey was conducted in the natural Pinus koraiensis mixed forests along a latitudinal gradient and an elevational gradient in Northeast China. In order to build the mathematical models for simulating the relationships of compositional and structural attributes of the Pinus koraiensis mixed forest with climatic and non-climatic factors, stepwise linear regression analyses were performed, incorporating 14 dependent variables and the linear and quadratic components of 9 factors. All the selected new models were computed under the +2°C and +10% precipitation and +4°C and +10% precipitation scenarios. The Max Temperature of Warmest Month, Mean Temperature of Warmest Quarter and Precipitation of Wettest Month were observed to be key climatic factors controlling the stand densities and total basal areas of Pinus koraiensis mixed forest. Increased summer temperatures and precipitations strongly enhanced the stand densities and total basal areas of broadleaf trees but had little effect on Pinus koraiensis under the +2°C and +10% precipitation scenario and +4°C and +10% precipitation scenario. These results show that the Max Temperature of Warmest Month, Mean Temperature of Warmest Quarter and Precipitation of Wettest Month are key climatic factors which shape the composition and structure of Pinus koraiensis mixed forest. Although the Pinus koraiensis would persist, the current forests dominated by Pinus koraiensis in the region would all shift and become broadleaf-dominated forests due to the dramatic increase of broadleaf trees under the future global warming and increased precipitation.
Two biased estimation techniques in linear regression: Application to aircraft
NASA Technical Reports Server (NTRS)
Klein, Vladislav
1988-01-01
Several ways for detection and assessment of collinearity in measured data are discussed. Because data collinearity usually results in poor least squares estimates, two estimation techniques which can limit a damaging effect of collinearity are presented. These two techniques, the principal components regression and mixed estimation, belong to a class of biased estimation techniques. Detection and assessment of data collinearity and the two biased estimation techniques are demonstrated in two examples using flight test data from longitudinal maneuvers of an experimental aircraft. The eigensystem analysis and parameter variance decomposition appeared to be a promising tool for collinearity evaluation. The biased estimators had far better accuracy than the results from the ordinary least squares technique.
Ulibarri, Monica D; Hiller, Sarah P; Lozada, Remedios; Rangel, M Gudelia; Stockman, Jamila K; Silverman, Jay G; Ojeda, Victoria D
2013-01-01
This mixed methods study examined the prevalence and characteristics of physical and sexual abuse and depression symptoms among 624 injection drug-using female sex workers (FSW-IDUs) in Tijuana and Ciudad Juarez, Mexico; a subset of 47 from Tijuana also underwent qualitative interviews. Linear regressions identified correlates of current depression symptoms. In the interviews, FSW-IDUs identified drug use as a method of coping with the trauma they experienced from abuse that occurred before and after age 18 and during the course of sex work. In a multivariate linear regression model, two factors-ever experiencing forced sex and forced sex in the context of sex work-were significantly associated with higher levels of depression symptoms. Our findings suggest the need for integrated mental health and drug abuse services for FSW-IDUs addressing history of trauma as well as for further research on violence revictimization in the context of sex work in Mexico.
Hobbs, Brian P.; Sargent, Daniel J.; Carlin, Bradley P.
2014-01-01
Assessing between-study variability in the context of conventional random-effects meta-analysis is notoriously difficult when incorporating data from only a small number of historical studies. In order to borrow strength, historical and current data are often assumed to be fully homogeneous, but this can have drastic consequences for power and Type I error if the historical information is biased. In this paper, we propose empirical and fully Bayesian modifications of the commensurate prior model (Hobbs et al., 2011) extending Pocock (1976), and evaluate their frequentist and Bayesian properties for incorporating patient-level historical data using general and generalized linear mixed regression models. Our proposed commensurate prior models lead to preposterior admissible estimators that facilitate alternative bias-variance trade-offs than those offered by pre-existing methodologies for incorporating historical data from a small number of historical studies. We also provide a sample analysis of a colon cancer trial comparing time-to-disease progression using a Weibull regression model. PMID:24795786
Ulibarri, Monica D.; Hiller, Sarah P.; Lozada, Remedios; Rangel, M. Gudelia; Stockman, Jamila K.; Silverman, Jay G.; Ojeda, Victoria D.
2013-01-01
This mixed methods study examined the prevalence and characteristics of physical and sexual abuse and depression symptoms among 624 injection drug-using female sex workers (FSW-IDUs) in Tijuana and Ciudad Juarez, Mexico; a subset of 47 from Tijuana also underwent qualitative interviews. Linear regressions identified correlates of current depression symptoms. In the interviews, FSW-IDUs identified drug use as a method of coping with the trauma they experienced from abuse that occurred before and after age 18 and during the course of sex work. In a multivariate linear regression model, two factors—ever experiencing forced sex and forced sex in the context of sex work—were significantly associated with higher levels of depression symptoms. Our findings suggest the need for integrated mental health and drug abuse services for FSW-IDUs addressing history of trauma as well as for further research on violence revictimization in the context of sex work in Mexico. PMID:23737808
NASA Technical Reports Server (NTRS)
Patnaik, Surya N.; Guptill, James D.; Hopkins, Dale A.; Lavelle, Thomas M.
2000-01-01
The NASA Engine Performance Program (NEPP) can configure and analyze almost any type of gas turbine engine that can be generated through the interconnection of a set of standard physical components. In addition, the code can optimize engine performance by changing adjustable variables under a set of constraints. However, for engine cycle problems at certain operating points, the NEPP code can encounter difficulties: nonconvergence in the currently implemented Powell's optimization algorithm and deficiencies in the Newton-Raphson solver during engine balancing. A project was undertaken to correct these deficiencies. Nonconvergence was avoided through a cascade optimization strategy, and deficiencies associated with engine balancing were eliminated through neural network and linear regression methods. An approximation-interspersed cascade strategy was used to optimize the engine's operation over its flight envelope. Replacement of Powell's algorithm by the cascade strategy improved the optimization segment of the NEPP code. The performance of the linear regression and neural network methods as alternative engine analyzers was found to be satisfactory. This report considers two examples-a supersonic mixed-flow turbofan engine and a subsonic waverotor-topped engine-to illustrate the results, and it discusses insights gained from the improved version of the NEPP code.
NASA Technical Reports Server (NTRS)
Jones, Harrison P.; Branston, Detrick D.; Jones, Patricia B.; Popescu, Miruna D.
2002-01-01
An earlier study compared NASA/NSO Spectromagnetograph (SPM) data with spacecraft measurements of total solar irradiance (TSI) variations over a 1.5 year period in the declining phase of solar cycle 22. This paper extends the analysis to an eight-year period which also spans the rising and early maximum phases of cycle 23. The conclusions of the earlier work appear to be robust: three factors (sunspots, strong unipolar regions, and strong mixed polarity regions) describe most of the variation in the SPM record, but only the first two are associated with TSI. Additionally, the residuals of a linear multiple regression of TSI against SPM observations over the entire eight-year period show an unexplained, increasing, linear time variation with a rate of about 0.05 W m(exp -2) per year. Separate regressions for the periods before and after 1996 January 01 show no unexplained trends but differ substantially in regression parameters. This behavior may reflect a solar source of TSI variations beyond sunspots and faculae but more plausibly results from uncompensated non-solar effects in one or both of the TSI and SPM data sets.
Wang, Ke-Sheng; Liu, Xuefeng; Ategbole, Muyiwa; Xie, Xin; Liu, Ying; Xu, Chun; Xie, Changchun; Sha, Zhanxin
2017-01-01
Objective: Screening for colorectal cancer (CRC) can reduce disease incidence, morbidity, and mortality. However, few studies have investigated the urban-rural differences in social and behavioral factors influencing CRC screening. The objective of the study was to investigate the potential factors across urban-rural groups on the usage of CRC screening. Methods: A total of 38,505 adults (aged ≥40 years) were selected from the 2009 California Health Interview Survey (CHIS) data - the latest CHIS data on CRC screening. The weighted generalized linear mixed-model (WGLIMM) was used to deal with this hierarchical structure data. Weighted simple and multiple mixed logistic regression analyses in SAS ver. 9.4 were used to obtain the odds ratios (ORs) and their 95% confidence intervals (CIs). Results: The overall prevalence of CRC screening was 48.1% while the prevalence in four residence groups - urban, second city, suburban, and town/rural, were 45.8%, 46.9%, 53.7% and 50.1%, respectively. The results of WGLIMM analysis showed that there was residence effect (p<0.0001) and residence groups had significant interactions with gender, age group, education level, and employment status (p<0.05). Multiple logistic regression analysis revealed that age, race, marital status, education level, employment stats, binge drinking, and smoking status were associated with CRC screening (p<0.05). Stratified by residence regions, age and poverty level showed associations with CRC screening in all four residence groups. Education level was positively associated with CRC screening in second city and suburban. Infrequent binge drinking was associated with CRC screening in urban and suburban; while current smoking was a protective factor in urban and town/rural groups. Conclusions: Mixed models are useful to deal with the clustered survey data. Social factors and behavioral factors (binge drinking and smoking) were associated with CRC screening and the associations were affected by living areas such as urban and rural regions. PMID:28952708
Wang, Ke-Sheng; Liu, Xuefeng; Ategbole, Muyiwa; Xie, Xin; Liu, Ying; Xu, Chun; Xie, Changchun; Sha, Zhanxin
2017-09-27
Objective: Screening for colorectal cancer (CRC) can reduce disease incidence, morbidity, and mortality. However, few studies have investigated the urban-rural differences in social and behavioral factors influencing CRC screening. The objective of the study was to investigate the potential factors across urban-rural groups on the usage of CRC screening. Methods: A total of 38,505 adults (aged ≥40 years) were selected from the 2009 California Health Interview Survey (CHIS) data - the latest CHIS data on CRC screening. The weighted generalized linear mixed-model (WGLIMM) was used to deal with this hierarchical structure data. Weighted simple and multiple mixed logistic regression analyses in SAS ver. 9.4 were used to obtain the odds ratios (ORs) and their 95% confidence intervals (CIs). Results: The overall prevalence of CRC screening was 48.1% while the prevalence in four residence groups - urban, second city, suburban, and town/rural, were 45.8%, 46.9%, 53.7% and 50.1%, respectively. The results of WGLIMM analysis showed that there was residence effect (p<0.0001) and residence groups had significant interactions with gender, age group, education level, and employment status (p<0.05). Multiple logistic regression analysis revealed that age, race, marital status, education level, employment stats, binge drinking, and smoking status were associated with CRC screening (p<0.05). Stratified by residence regions, age and poverty level showed associations with CRC screening in all four residence groups. Education level was positively associated with CRC screening in second city and suburban. Infrequent binge drinking was associated with CRC screening in urban and suburban; while current smoking was a protective factor in urban and town/rural groups. Conclusions: Mixed models are useful to deal with the clustered survey data. Social factors and behavioral factors (binge drinking and smoking) were associated with CRC screening and the associations were affected by living areas such as urban and rural regions. Creative Commons Attribution License
Advanced statistics: linear regression, part I: simple linear regression.
Marill, Keith A
2004-01-01
Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.
Multivariate statistical approach to estimate mixing proportions for unknown end members
Valder, Joshua F.; Long, Andrew J.; Davis, Arden D.; Kenner, Scott J.
2012-01-01
A multivariate statistical method is presented, which includes principal components analysis (PCA) and an end-member mixing model to estimate unknown end-member hydrochemical compositions and the relative mixing proportions of those end members in mixed waters. PCA, together with the Hotelling T2 statistic and a conceptual model of groundwater flow and mixing, was used in selecting samples that best approximate end members, which then were used as initial values in optimization of the end-member mixing model. This method was tested on controlled datasets (i.e., true values of estimates were known a priori) and found effective in estimating these end members and mixing proportions. The controlled datasets included synthetically generated hydrochemical data, synthetically generated mixing proportions, and laboratory analyses of sample mixtures, which were used in an evaluation of the effectiveness of this method for potential use in actual hydrological settings. For three different scenarios tested, correlation coefficients (R2) for linear regression between the estimated and known values ranged from 0.968 to 0.993 for mixing proportions and from 0.839 to 0.998 for end-member compositions. The method also was applied to field data from a study of end-member mixing in groundwater as a field example and partial method validation.
Dey, Jacob K; Ishii, Masaru; Boahene, Kofi D O; Byrne, Patrick J; Ishii, Lisa E
2014-01-01
Determine the effect of facial reanimation surgery on observer-graded attractiveness and negative facial perception of patients with facial paralysis. Randomized controlled experiment. Ninety observers viewed images of paralyzed faces, smiling and in repose, before and after reanimation surgery, as well as normal comparison faces. Observers rated the attractiveness of each face and characterized the paralyzed faces by rating severity, disfigured/bothersome, and importance to repair. Iterated factor analysis indicated these highly correlated variables measure a common domain, so they were combined to create the disfigured, important to repair, bothersome, severity (DIBS) factor score. Mixed effects linear regression determined the effect of facial reanimation surgery on attractiveness and DIBS score. Facial paralysis induces an attractiveness penalty of 2.51 on a 10-point scale for faces in repose and 3.38 for smiling faces. Mixed effects linear regression showed that reanimation surgery improved attractiveness for faces both in repose and smiling by 0.84 (95% confidence interval [CI]: 0.67, 1.01) and 1.24 (95% CI: 1.07, 1.42) respectively. Planned hypothesis tests confirmed statistically significant differences in attractiveness ratings between postoperative and normal faces, indicating attractiveness was not completely normalized. Regression analysis also showed that reanimation surgery decreased DIBS by 0.807 (95% CI: 0.704, 0.911) for faces in repose and 0.989 (95% CI: 0.886, 1.093), an entire standard deviation, for smiling faces. Facial reanimation surgery increases attractiveness and decreases negative facial perception of patients with facial paralysis. These data emphasize the need to optimize reanimation surgery to restore not only function, but also symmetry and cosmesis to improve facial perception and patient quality of life. © 2013 The American Laryngological, Rhinological and Otological Society, Inc.
Lloyd-Jones, Luke R; Robinson, Matthew R; Yang, Jian; Visscher, Peter M
2018-04-01
Genome-wide association studies (GWAS) have identified thousands of loci that are robustly associated with complex diseases. The use of linear mixed model (LMM) methodology for GWAS is becoming more prevalent due to its ability to control for population structure and cryptic relatedness and to increase power. The odds ratio (OR) is a common measure of the association of a disease with an exposure ( e.g. , a genetic variant) and is readably available from logistic regression. However, when the LMM is applied to all-or-none traits it provides estimates of genetic effects on the observed 0-1 scale, a different scale to that in logistic regression. This limits the comparability of results across studies, for example in a meta-analysis, and makes the interpretation of the magnitude of an effect from an LMM GWAS difficult. In this study, we derived transformations from the genetic effects estimated under the LMM to the OR that only rely on summary statistics. To test the proposed transformations, we used real genotypes from two large, publicly available data sets to simulate all-or-none phenotypes for a set of scenarios that differ in underlying model, disease prevalence, and heritability. Furthermore, we applied these transformations to GWAS summary statistics for type 2 diabetes generated from 108,042 individuals in the UK Biobank. In both simulation and real-data application, we observed very high concordance between the transformed OR from the LMM and either the simulated truth or estimates from logistic regression. The transformations derived and validated in this study improve the comparability of results from prospective and already performed LMM GWAS on complex diseases by providing a reliable transformation to a common comparative scale for the genetic effects. Copyright © 2018 by the Genetics Society of America.
NASA Astrophysics Data System (ADS)
Qie, G.; Wang, G.; Wang, M.
2016-12-01
Mixed pixels and shadows due to buildings in urban areas impede accurate estimation and mapping of city vegetation carbon density. In most of previous studies, these factors are often ignored, which thus result in underestimation of city vegetation carbon density. In this study we presented an integrated methodology to improve the accuracy of mapping city vegetation carbon density. Firstly, we applied a linear shadow remove analysis (LSRA) on remotely sensed Landsat 8 images to reduce the shadow effects on carbon estimation. Secondly, we integrated a linear spectral unmixing analysis (LSUA) with a linear stepwise regression (LSR), a logistic model-based stepwise regression (LMSR) and k-Nearest Neighbors (kNN), and utilized and compared the integrated models on shadow-removed images to map vegetation carbon density. This methodology was examined in Shenzhen City of Southeast China. A data set from a total of 175 sample plots measured in 2013 and 2014 was used to train the models. The independent variables statistically significantly contributing to improving the fit of the models to the data and reducing the sum of squared errors were selected from a total of 608 variables derived from different image band combinations and transformations. The vegetation fraction from LSUA was then added into the models as an important independent variable. The estimates obtained were evaluated using a cross-validation method. Our results showed that higher accuracies were obtained from the integrated models compared with the ones using traditional methods which ignore the effects of mixed pixels and shadows. This study indicates that the integrated method has great potential on improving the accuracy of urban vegetation carbon density estimation. Key words: Urban vegetation carbon, shadow, spectral unmixing, spatial modeling, Landsat 8 images
KMgene: a unified R package for gene-based association analysis for complex traits.
Yan, Qi; Fang, Zhou; Chen, Wei; Stegle, Oliver
2018-02-09
In this report, we introduce an R package KMgene for performing gene-based association tests for familial, multivariate or longitudinal traits using kernel machine (KM) regression under a generalized linear mixed model (GLMM) framework. Extensive simulations were performed to evaluate the validity of the approaches implemented in KMgene. http://cran.r-project.org/web/packages/KMgene. qi.yan@chp.edu or wei.chen@chp.edu. Supplementary data are available at Bioinformatics online. © The Author(s) 2018. Published by Oxford University Press.
Ice cream structural elements that affect melting rate and hardness.
Muse, M R; Hartel, R W
2004-01-01
Statistical models were developed to reveal which structural elements of ice cream affect melting rate and hardness. Ice creams were frozen in a batch freezer with three types of sweetener, three levels of the emulsifier polysorbate 80, and two different draw temperatures to produce ice creams with a range of microstructures. Ice cream mixes were analyzed for viscosity, and finished ice creams were analyzed for air cell and ice crystal size, overrun, and fat destabilization. The ice phase volume of each ice cream were calculated based on the freezing point of the mix. Melting rate and hardness of each hardened ice cream was measured and correlated with the structural attributes by using analysis of variance and multiple linear regression. Fat destabilization, ice crystal size, and the consistency coefficient of the mix were found to affect the melting rate of ice cream, whereas hardness was influenced by ice phase volume, ice crystal size, overrun, fat destabilization, and the rheological properties of the mix.
"Mad or bad?": burden on caregivers of patients with personality disorders.
Bauer, Rita; Döring, Antje; Schmidt, Tanja; Spießl, Hermann
2012-12-01
The burden on caregivers of patients with personality disorders is often greatly underestimated or completely disregarded. Possibilities for caregiver support have rarely been assessed. Thirty interviews were conducted with caregivers of such patients to assess illness-related burden. Responses were analyzed with a mixed method of qualitative and quantitative analysis in a sequential design. Patient and caregiver data, including sociodemographic and disease-related variables, were evaluated with regression analysis and regression trees. Caregiver statements (n = 404) were summarized into 44 global statements. The most frequent global statements were worries about the burden on other family members (70.0%), poor cooperation with clinical centers and other institutions (60.0%), financial burden (56.7%), worry about the patient's future (53.3%), and dissatisfaction with the patient's treatment and rehabilitation (53.3%). Linear regression and regression tree analysis identified predictors for more burdened caregivers. Caregivers of patients with personality disorders experience a variety of burdens, some disorder specific. Yet these caregivers often receive little attention or support.
The effects of climate change on harp seals (Pagophilus groenlandicus).
Johnston, David W; Bowers, Matthew T; Friedlaender, Ari S; Lavigne, David M
2012-01-01
Harp seals (Pagophilus groenlandicus) have evolved life history strategies to exploit seasonal sea ice as a breeding platform. As such, individuals are prepared to deal with fluctuations in the quantity and quality of ice in their breeding areas. It remains unclear, however, how shifts in climate may affect seal populations. The present study assesses the effects of climate change on harp seals through three linked analyses. First, we tested the effects of short-term climate variability on young-of-the year harp seal mortality using a linear regression of sea ice cover in the Gulf of St. Lawrence against stranding rates of dead harp seals in the region during 1992 to 2010. A similar regression of stranding rates and North Atlantic Oscillation (NAO) index values was also conducted. These analyses revealed negative correlations between both ice cover and NAO conditions and seal mortality, indicating that lighter ice cover and lower NAO values result in higher mortality. A retrospective cross-correlation analysis of NAO conditions and sea ice cover from 1978 to 2011 revealed that NAO-related changes in sea ice may have contributed to the depletion of seals on the east coast of Canada during 1950 to 1972, and to their recovery during 1973 to 2000. This historical retrospective also reveals opposite links between neonatal mortality in harp seals in the Northeast Atlantic and NAO phase. Finally, an assessment of the long-term trends in sea ice cover in the breeding regions of harp seals across the entire North Atlantic during 1979 through 2011 using multiple linear regression models and mixed effects linear regression models revealed that sea ice cover in all harp seal breeding regions has been declining by as much as 6 percent per decade over the time series of available satellite data.
The Effects of Climate Change on Harp Seals (Pagophilus groenlandicus)
Johnston, David W.; Bowers, Matthew T.; Friedlaender, Ari S.; Lavigne, David M.
2012-01-01
Harp seals (Pagophilus groenlandicus) have evolved life history strategies to exploit seasonal sea ice as a breeding platform. As such, individuals are prepared to deal with fluctuations in the quantity and quality of ice in their breeding areas. It remains unclear, however, how shifts in climate may affect seal populations. The present study assesses the effects of climate change on harp seals through three linked analyses. First, we tested the effects of short-term climate variability on young-of-the year harp seal mortality using a linear regression of sea ice cover in the Gulf of St. Lawrence against stranding rates of dead harp seals in the region during 1992 to 2010. A similar regression of stranding rates and North Atlantic Oscillation (NAO) index values was also conducted. These analyses revealed negative correlations between both ice cover and NAO conditions and seal mortality, indicating that lighter ice cover and lower NAO values result in higher mortality. A retrospective cross-correlation analysis of NAO conditions and sea ice cover from 1978 to 2011 revealed that NAO-related changes in sea ice may have contributed to the depletion of seals on the east coast of Canada during 1950 to 1972, and to their recovery during 1973 to 2000. This historical retrospective also reveals opposite links between neonatal mortality in harp seals in the Northeast Atlantic and NAO phase. Finally, an assessment of the long-term trends in sea ice cover in the breeding regions of harp seals across the entire North Atlantic during 1979 through 2011 using multiple linear regression models and mixed effects linear regression models revealed that sea ice cover in all harp seal breeding regions has been declining by as much as 6 percent per decade over the time series of available satellite data. PMID:22238591
Koerner, Tess K; Zhang, Yang
2017-02-27
Neurophysiological studies are often designed to examine relationships between measures from different testing conditions, time points, or analysis techniques within the same group of participants. Appropriate statistical techniques that can take into account repeated measures and multivariate predictor variables are integral and essential to successful data analysis and interpretation. This work implements and compares conventional Pearson correlations and linear mixed-effects (LME) regression models using data from two recently published auditory electrophysiology studies. For the specific research questions in both studies, the Pearson correlation test is inappropriate for determining strengths between the behavioral responses for speech-in-noise recognition and the multiple neurophysiological measures as the neural responses across listening conditions were simply treated as independent measures. In contrast, the LME models allow a systematic approach to incorporate both fixed-effect and random-effect terms to deal with the categorical grouping factor of listening conditions, between-subject baseline differences in the multiple measures, and the correlational structure among the predictor variables. Together, the comparative data demonstrate the advantages as well as the necessity to apply mixed-effects models to properly account for the built-in relationships among the multiple predictor variables, which has important implications for proper statistical modeling and interpretation of human behavior in terms of neural correlates and biomarkers.
Coker, Freya; Williams, Cylie M; Taylor, Nicholas F; Caspers, Kirsten; McAlinden, Fiona; Wilton, Anita; Shields, Nora; Haines, Terry P
2018-05-10
This protocol considers three allied health staffing models across public health subacute hospitals. This quasi-experimental mixed-methods study, including qualitative process evaluation, aims to evaluate the impact of additional allied health services in subacute care, in rehabilitation and geriatric evaluation management settings, on patient, health service and societal outcomes. This health services research will analyse outcomes of patients exposed to different allied health models of care at three health services. Each health service will have a control ward (routine care) and an intervention ward (additional allied health). This project has two parts. Part 1: a whole of site data extraction for included wards. Outcome measures will include: length of stay, rate of readmissions, discharge destinations, community referrals, patient feedback and staff perspectives. Part 2: Functional Independence Measure scores will be collected every 2-3 days for the duration of 60 patient admissions.Data from part 1 will be analysed by linear regression analysis for continuous outcomes using patient-level data and logistic regression analysis for binary outcomes. Qualitative data will be analysed using a deductive thematic approach. For part 2, a linear mixed model analysis will be conducted using therapy service delivery and days since admission to subacute care as fixed factors in the model and individual participant as a random factor. Graphical analysis will be used to examine the growth curve of the model and transformations. The days since admission factor will be used to examine non-linear growth trajectories to determine if they lead to better model fit. Findings will be disseminated through local reports and to the Department of Health and Human Services Victoria. Results will be presented at conferences and submitted to peer-reviewed journals. The Monash Health Human Research Ethics committee approved this multisite research (HREC/17/MonH/144 and HREC/17/MonH/547). © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Mixed kernel function support vector regression for global sensitivity analysis
NASA Astrophysics Data System (ADS)
Cheng, Kai; Lu, Zhenzhou; Wei, Yuhao; Shi, Yan; Zhou, Yicheng
2017-11-01
Global sensitivity analysis (GSA) plays an important role in exploring the respective effects of input variables on an assigned output response. Amongst the wide sensitivity analyses in literature, the Sobol indices have attracted much attention since they can provide accurate information for most models. In this paper, a mixed kernel function (MKF) based support vector regression (SVR) model is employed to evaluate the Sobol indices at low computational cost. By the proposed derivation, the estimation of the Sobol indices can be obtained by post-processing the coefficients of the SVR meta-model. The MKF is constituted by the orthogonal polynomials kernel function and Gaussian radial basis kernel function, thus the MKF possesses both the global characteristic advantage of the polynomials kernel function and the local characteristic advantage of the Gaussian radial basis kernel function. The proposed approach is suitable for high-dimensional and non-linear problems. Performance of the proposed approach is validated by various analytical functions and compared with the popular polynomial chaos expansion (PCE). Results demonstrate that the proposed approach is an efficient method for global sensitivity analysis.
Mazo Lopera, Mauricio A; Coombes, Brandon J; de Andrade, Mariza
2017-09-27
Gene-environment (GE) interaction has important implications in the etiology of complex diseases that are caused by a combination of genetic factors and environment variables. Several authors have developed GE analysis in the context of independent subjects or longitudinal data using a gene-set. In this paper, we propose to analyze GE interaction for discrete and continuous phenotypes in family studies by incorporating the relatedness among the relatives for each family into a generalized linear mixed model (GLMM) and by using a gene-based variance component test. In addition, we deal with collinearity problems arising from linkage disequilibrium among single nucleotide polymorphisms (SNPs) by considering their coefficients as random effects under the null model estimation. We show that the best linear unbiased predictor (BLUP) of such random effects in the GLMM is equivalent to the ridge regression estimator. This equivalence provides a simple method to estimate the ridge penalty parameter in comparison to other computationally-demanding estimation approaches based on cross-validation schemes. We evaluated the proposed test using simulation studies and applied it to real data from the Baependi Heart Study consisting of 76 families. Using our approach, we identified an interaction between BMI and the Peroxisome Proliferator Activated Receptor Gamma ( PPARG ) gene associated with diabetes.
Huber, Stefan; Klein, Elise; Moeller, Korbinian; Willmes, Klaus
2015-10-01
In neuropsychological research, single-cases are often compared with a small control sample. Crawford and colleagues developed inferential methods (i.e., the modified t-test) for such a research design. In the present article, we suggest an extension of the methods of Crawford and colleagues employing linear mixed models (LMM). We first show that a t-test for the significance of a dummy coded predictor variable in a linear regression is equivalent to the modified t-test of Crawford and colleagues. As an extension to this idea, we then generalized the modified t-test to repeated measures data by using LMMs to compare the performance difference in two conditions observed in a single participant to that of a small control group. The performance of LMMs regarding Type I error rates and statistical power were tested based on Monte-Carlo simulations. We found that starting with about 15-20 participants in the control sample Type I error rates were close to the nominal Type I error rate using the Satterthwaite approximation for the degrees of freedom. Moreover, statistical power was acceptable. Therefore, we conclude that LMMs can be applied successfully to statistically evaluate performance differences between a single-case and a control sample. Copyright © 2015 Elsevier Ltd. All rights reserved.
Hossain, Ahmed; Beyene, Joseph
2014-01-01
This article compares baseline, average, and longitudinal data analysis methods for identifying genetic variants in genome-wide association study using the Genetic Analysis Workshop 18 data. We apply methods that include (a) linear mixed models with baseline measures, (b) random intercept linear mixed models with mean measures outcome, and (c) random intercept linear mixed models with longitudinal measurements. In the linear mixed models, covariates are included as fixed effects, whereas relatedness among individuals is incorporated as the variance-covariance structure of the random effect for the individuals. The overall strategy of applying linear mixed models decorrelate the data is based on Aulchenko et al.'s GRAMMAR. By analyzing systolic and diastolic blood pressure, which are used separately as outcomes, we compare the 3 methods in identifying a known genetic variant that is associated with blood pressure from chromosome 3 and simulated phenotype data. We also analyze the real phenotype data to illustrate the methods. We conclude that the linear mixed model with longitudinal measurements of diastolic blood pressure is the most accurate at identifying the known single-nucleotide polymorphism among the methods, but linear mixed models with baseline measures perform best with systolic blood pressure as the outcome.
Conditional Monte Carlo randomization tests for regression models.
Parhat, Parwen; Rosenberger, William F; Diao, Guoqing
2014-08-15
We discuss the computation of randomization tests for clinical trials of two treatments when the primary outcome is based on a regression model. We begin by revisiting the seminal paper of Gail, Tan, and Piantadosi (1988), and then describe a method based on Monte Carlo generation of randomization sequences. The tests based on this Monte Carlo procedure are design based, in that they incorporate the particular randomization procedure used. We discuss permuted block designs, complete randomization, and biased coin designs. We also use a new technique by Plamadeala and Rosenberger (2012) for simple computation of conditional randomization tests. Like Gail, Tan, and Piantadosi, we focus on residuals from generalized linear models and martingale residuals from survival models. Such techniques do not apply to longitudinal data analysis, and we introduce a method for computation of randomization tests based on the predicted rate of change from a generalized linear mixed model when outcomes are longitudinal. We show, by simulation, that these randomization tests preserve the size and power well under model misspecification. Copyright © 2014 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Lucifredi, A.; Mazzieri, C.; Rossi, M.
2000-05-01
Since the operational conditions of a hydroelectric unit can vary within a wide range, the monitoring system must be able to distinguish between the variations of the monitored variable caused by variations of the operation conditions and those due to arising and progressing of failures and misoperations. The paper aims to identify the best technique to be adopted for the monitoring system. Three different methods have been implemented and compared. Two of them use statistical techniques: the first, the linear multiple regression, expresses the monitored variable as a linear function of the process parameters (independent variables), while the second, the dynamic kriging technique, is a modified technique of multiple linear regression representing the monitored variable as a linear combination of the process variables in such a way as to minimize the variance of the estimate error. The third is based on neural networks. Tests have shown that the monitoring system based on the kriging technique is not affected by some problems common to the other two models e.g. the requirement of a large amount of data for their tuning, both for training the neural network and defining the optimum plane for the multiple regression, not only in the system starting phase but also after a trivial operation of maintenance involving the substitution of machinery components having a direct impact on the observed variable. Or, in addition, the necessity of different models to describe in a satisfactory way the different ranges of operation of the plant. The monitoring system based on the kriging statistical technique overrides the previous difficulties: it does not require a large amount of data to be tuned and is immediately operational: given two points, the third can be immediately estimated; in addition the model follows the system without adapting itself to it. The results of the experimentation performed seem to indicate that a model based on a neural network or on a linear multiple regression is not optimal, and that a different approach is necessary to reduce the amount of work during the learning phase using, when available, all the information stored during the initial phase of the plant to build the reference baseline, elaborating, if it is the case, the raw information available. A mixed approach using the kriging statistical technique and neural network techniques could optimise the result.
Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne
2012-01-01
In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models. PMID:23275882
Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne
2012-12-01
In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models.
40 CFR 60.667 - Chemicals affected by subpart NNN.
Code of Federal Regulations, 2010 CFR
2010-07-01
... alcohols, ethoxylated, mixed Linear alcohols, ethoxylated, and sulfated, sodium salt, mixed Linear alcohols, sulfated, sodium salt, mixed Linear alkylbenzene 123-01-3 Magnesium acetate 142-72-3 Maleic anhydride 108...
40 CFR 60.667 - Chemicals affected by subpart NNN.
Code of Federal Regulations, 2011 CFR
2011-07-01
... alcohols, ethoxylated, mixed Linear alcohols, ethoxylated, and sulfated, sodium salt, mixed Linear alcohols, sulfated, sodium salt, mixed Linear alkylbenzene 123-01-3 Magnesium acetate 142-72-3 Maleic anhydride 108...
Mixed conditional logistic regression for habitat selection studies.
Duchesne, Thierry; Fortin, Daniel; Courbin, Nicolas
2010-05-01
1. Resource selection functions (RSFs) are becoming a dominant tool in habitat selection studies. RSF coefficients can be estimated with unconditional (standard) and conditional logistic regressions. While the advantage of mixed-effects models is recognized for standard logistic regression, mixed conditional logistic regression remains largely overlooked in ecological studies. 2. We demonstrate the significance of mixed conditional logistic regression for habitat selection studies. First, we use spatially explicit models to illustrate how mixed-effects RSFs can be useful in the presence of inter-individual heterogeneity in selection and when the assumption of independence from irrelevant alternatives (IIA) is violated. The IIA hypothesis states that the strength of preference for habitat type A over habitat type B does not depend on the other habitat types also available. Secondly, we demonstrate the significance of mixed-effects models to evaluate habitat selection of free-ranging bison Bison bison. 3. When movement rules were homogeneous among individuals and the IIA assumption was respected, fixed-effects RSFs adequately described habitat selection by simulated animals. In situations violating the inter-individual homogeneity and IIA assumptions, however, RSFs were best estimated with mixed-effects regressions, and fixed-effects models could even provide faulty conclusions. 4. Mixed-effects models indicate that bison did not select farmlands, but exhibited strong inter-individual variations in their response to farmlands. Less than half of the bison preferred farmlands over forests. Conversely, the fixed-effect model simply suggested an overall selection for farmlands. 5. Conditional logistic regression is recognized as a powerful approach to evaluate habitat selection when resource availability changes. This regression is increasingly used in ecological studies, but almost exclusively in the context of fixed-effects models. Fitness maximization can imply differences in trade-offs among individuals, which can yield inter-individual differences in selection and lead to departure from IIA. These situations are best modelled with mixed-effects models. Mixed-effects conditional logistic regression should become a valuable tool for ecological research.
Kumar, K Vasanth
2007-04-02
Kinetic experiments were carried out for the sorption of safranin onto activated carbon particles. The kinetic data were fitted to pseudo-second order model of Ho, Sobkowsk and Czerwinski, Blanchard et al. and Ritchie by linear and non-linear regression methods. Non-linear method was found to be a better way of obtaining the parameters involved in the second order rate kinetic expressions. Both linear and non-linear regression showed that the Sobkowsk and Czerwinski and Ritchie's pseudo-second order models were the same. Non-linear regression analysis showed that both Blanchard et al. and Ho have similar ideas on the pseudo-second order model but with different assumptions. The best fit of experimental data in Ho's pseudo-second order expression by linear and non-linear regression method showed that Ho pseudo-second order model was a better kinetic expression when compared to other pseudo-second order kinetic expressions.
A Technique of Fuzzy C-Mean in Multiple Linear Regression Model toward Paddy Yield
NASA Astrophysics Data System (ADS)
Syazwan Wahab, Nur; Saifullah Rusiman, Mohd; Mohamad, Mahathir; Amira Azmi, Nur; Che Him, Norziha; Ghazali Kamardan, M.; Ali, Maselan
2018-04-01
In this paper, we propose a hybrid model which is a combination of multiple linear regression model and fuzzy c-means method. This research involved a relationship between 20 variates of the top soil that are analyzed prior to planting of paddy yields at standard fertilizer rates. Data used were from the multi-location trials for rice carried out by MARDI at major paddy granary in Peninsular Malaysia during the period from 2009 to 2012. Missing observations were estimated using mean estimation techniques. The data were analyzed using multiple linear regression model and a combination of multiple linear regression model and fuzzy c-means method. Analysis of normality and multicollinearity indicate that the data is normally scattered without multicollinearity among independent variables. Analysis of fuzzy c-means cluster the yield of paddy into two clusters before the multiple linear regression model can be used. The comparison between two method indicate that the hybrid of multiple linear regression model and fuzzy c-means method outperform the multiple linear regression model with lower value of mean square error.
Anderson, Carl A; McRae, Allan F; Visscher, Peter M
2006-07-01
Standard quantitative trait loci (QTL) mapping techniques commonly assume that the trait is both fully observed and normally distributed. When considering survival or age-at-onset traits these assumptions are often incorrect. Methods have been developed to map QTL for survival traits; however, they are both computationally intensive and not available in standard genome analysis software packages. We propose a grouped linear regression method for the analysis of continuous survival data. Using simulation we compare this method to both the Cox and Weibull proportional hazards models and a standard linear regression method that ignores censoring. The grouped linear regression method is of equivalent power to both the Cox and Weibull proportional hazards methods and is significantly better than the standard linear regression method when censored observations are present. The method is also robust to the proportion of censored individuals and the underlying distribution of the trait. On the basis of linear regression methodology, the grouped linear regression model is computationally simple and fast and can be implemented readily in freely available statistical software.
Neither fixed nor random: weighted least squares meta-regression.
Stanley, T D; Doucouliagos, Hristos
2017-03-01
Our study revisits and challenges two core conventional meta-regression estimators: the prevalent use of 'mixed-effects' or random-effects meta-regression analysis and the correction of standard errors that defines fixed-effects meta-regression analysis (FE-MRA). We show how and explain why an unrestricted weighted least squares MRA (WLS-MRA) estimator is superior to conventional random-effects (or mixed-effects) meta-regression when there is publication (or small-sample) bias that is as good as FE-MRA in all cases and better than fixed effects in most practical applications. Simulations and statistical theory show that WLS-MRA provides satisfactory estimates of meta-regression coefficients that are practically equivalent to mixed effects or random effects when there is no publication bias. When there is publication selection bias, WLS-MRA always has smaller bias than mixed effects or random effects. In practical applications, an unrestricted WLS meta-regression is likely to give practically equivalent or superior estimates to fixed-effects, random-effects, and mixed-effects meta-regression approaches. However, random-effects meta-regression remains viable and perhaps somewhat preferable if selection for statistical significance (publication bias) can be ruled out and when random, additive normal heterogeneity is known to directly affect the 'true' regression coefficient. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
Water mass mixing: The dominant control on the zinc distribution in the North Atlantic Ocean
NASA Astrophysics Data System (ADS)
Roshan, Saeed; Wu, Jingfeng
2015-07-01
Dissolved zinc (dZn) concentration was determined in the North Atlantic during the U.S. GEOTRACES 2010 and 2011 cruise (GOETRACES GA03). A relatively poor linear correlation (R2 = 0.756) was observed between dZn and silicic acid (Si), the slope of which was 0.0577 nM/µmol/kg. We attribute the relatively poor dZn-Si correlation to the following processes: (a) differential regeneration of zinc relative to silicic acid, (b) mixing of multiple water masses that have different Zn/Si, and (c) zinc sources such as sedimentary or hydrothermal. To quantitatively distinguish these possibilities, we use the results of Optimum Multi-Parameter Water Mass Analysis by Jenkins et al. (2015) to model the zinc distribution below 500 m. We hypothesized two scenarios: conservative mixing and regenerative mixing. The first scenario (conservative) could be modeled to results in a correlation with observations with a R2 = 0.846. In the second scenario, we took a Si-related regeneration into account, which could model the observations with a R2 = 0.867. Through this regenerative mixing scenario, we estimated a Zn/Si = 0.0548 nM/µmol/kg that may be more realistic than linear regression slope due to accounting for process b. However, this did not improve the model substantially (R2 = 0.867 versus0.846), which may indicate the insignificant effect of remineralization on the zinc distribution in this region. The relative weakness in the model-observation correlation (R2~0.85 for both scenarios) implies that processes (a) and (c) may be plausible. Furthermore, dZn in the upper 500 m exhibited a very poor correlation with apparent oxygen utilization, suggesting a minimal role for the organic matter-associated remineralization process.
Linear regression crash prediction models : issues and proposed solutions.
DOT National Transportation Integrated Search
2010-05-01
The paper develops a linear regression model approach that can be applied to : crash data to predict vehicle crashes. The proposed approach involves novice data aggregation : to satisfy linear regression assumptions; namely error structure normality ...
Koerner, Tess K.; Zhang, Yang
2017-01-01
Neurophysiological studies are often designed to examine relationships between measures from different testing conditions, time points, or analysis techniques within the same group of participants. Appropriate statistical techniques that can take into account repeated measures and multivariate predictor variables are integral and essential to successful data analysis and interpretation. This work implements and compares conventional Pearson correlations and linear mixed-effects (LME) regression models using data from two recently published auditory electrophysiology studies. For the specific research questions in both studies, the Pearson correlation test is inappropriate for determining strengths between the behavioral responses for speech-in-noise recognition and the multiple neurophysiological measures as the neural responses across listening conditions were simply treated as independent measures. In contrast, the LME models allow a systematic approach to incorporate both fixed-effect and random-effect terms to deal with the categorical grouping factor of listening conditions, between-subject baseline differences in the multiple measures, and the correlational structure among the predictor variables. Together, the comparative data demonstrate the advantages as well as the necessity to apply mixed-effects models to properly account for the built-in relationships among the multiple predictor variables, which has important implications for proper statistical modeling and interpretation of human behavior in terms of neural correlates and biomarkers. PMID:28264422
Traub, Meike; Lauer, Romy; Kesztyüs, Tibor; Wartha, Olivia; Steinacker, Jürgen Michael; Kesztyüs, Dorothea
2018-03-16
Regular breakfast and well-balanced soft drink, and screen media consumption are associated with a lower risk of overweight and obesity in schoolchildren. The aim of this research is the combined examination of these three parameters as influencing factors for longitudinal weight development in schoolchildren in order to adapt targeted preventive measures. In the course of the Baden-Württemberg Study, Germany, data from direct measurements (baseline (2010) and follow-up (2011)) at schools was available for 1733 primary schoolchildren aged 7.08 ± 0.6 years (50.8% boys). Anthropometric measurements of the children were taken according to ISAK-standards (International Standard for Anthropometric Assessment) by trained staff. Health and lifestyle characteristics of the children and their parents were assessed in questionnaires. A linear mixed effects regression analysis was conducted to examine influences on changes in waist-to-height-ratio (WHtR), weight, and body mass index (BMI) measures. A generalised linear mixed effects regression analysis was performed to identify the relationship between breakfast, soft drink and screen media consumption with the prevalence of overweight, obesity and abdominal obesity at follow-up. According to the regression analyses, skipping breakfast led to increased changes in WHtR, weight and BMI measures. Skipping breakfast and the overconsumption of screen media at baseline led to higher odds of abdominal obesity and overweight at follow-up. No significant association between soft drink consumption and weight development was found. Targeted prevention for healthy weight status and development in primary schoolchildren should aim towards promoting balanced breakfast habits and a reduction in screen media consumption. Future research on soft drink consumption is needed. Health promoting interventions should synergistically involve children, parents, and schools. The Baden-Württemberg Study is registered at the German Clinical Trials Register (DRKS) under the DRKS-ID: DRKS00000494 .
Comparison between Linear and Nonlinear Regression in a Laboratory Heat Transfer Experiment
ERIC Educational Resources Information Center
Gonçalves, Carine Messias; Schwaab, Marcio; Pinto, José Carlos
2013-01-01
In order to interpret laboratory experimental data, undergraduate students are used to perform linear regression through linearized versions of nonlinear models. However, the use of linearized models can lead to statistically biased parameter estimates. Even so, it is not an easy task to introduce nonlinear regression and show for the students…
Sources of variation for indoor nitrogen dioxide in rural residences of Ethiopia
2009-01-01
Background Unprocessed biomass fuel is the primary source of indoor air pollution (IAP) in developing countries. The use of biomass fuel has been linked with acute respiratory infections. This study assesses sources of variations associated with the level of indoor nitrogen dioxide (NO2). Materials and methods This study examines household factors affecting the level of indoor pollution by measuring NO2. Repeated measurements of NO2 were made using a passive diffusive sampler. A Saltzman colorimetric method using a spectrometer calibrated at 540 nm was employed to analyze the mass of NO2 on the collection filter that was then subjected to a mass transfer equation to calculate the level of NO2 for the 24 hours of sampling duration. Structured questionnaire was used to collect data on fuel use characteristics. Data entry and cleaning was done in EPI INFO version 6.04, while data was analyzed using SPSS version 15.0. Analysis of variance, multiple linear regression and linear mixed model were used to isolate determining factors contributing to the variation of NO2 concentration. Results A total of 17,215 air samples were fully analyzed during the study period. Wood and crop were principal source of household energy. Biomass fuel characteristics were strongly related to indoor NO2 concentration in one-way analysis of variance. There was variation in repeated measurements of indoor NO2 over time. In a linear mixed model regression analysis, highland setting, wet season, cooking, use of fire events at least twice a day, frequency of cooked food items, and interaction between ecology and season were predictors of indoor NO2 concentration. The volume of the housing unit and the presence of kitchen showed little relevance in the level of NO2 concentration. Conclusion Agro-ecology, season, purpose of fire events, frequency of fire activities, frequency of cooking and physical conditions of housing are predictors of NO2 concentration. Improved kitchen conditions and ventilation are highly recommended. PMID:19922645
Estimation of the linear mixed integrated Ornstein–Uhlenbeck model
Hughes, Rachael A.; Kenward, Michael G.; Sterne, Jonathan A. C.; Tilling, Kate
2017-01-01
ABSTRACT The linear mixed model with an added integrated Ornstein–Uhlenbeck (IOU) process (linear mixed IOU model) allows for serial correlation and estimation of the degree of derivative tracking. It is rarely used, partly due to the lack of available software. We implemented the linear mixed IOU model in Stata and using simulations we assessed the feasibility of fitting the model by restricted maximum likelihood when applied to balanced and unbalanced data. We compared different (1) optimization algorithms, (2) parameterizations of the IOU process, (3) data structures and (4) random-effects structures. Fitting the model was practical and feasible when applied to large and moderately sized balanced datasets (20,000 and 500 observations), and large unbalanced datasets with (non-informative) dropout and intermittent missingness. Analysis of a real dataset showed that the linear mixed IOU model was a better fit to the data than the standard linear mixed model (i.e. independent within-subject errors with constant variance). PMID:28515536
Case-mix groups for VA hospital-based home care.
Smith, M E; Baker, C R; Branch, L G; Walls, R C; Grimes, R M; Karklins, J M; Kashner, M; Burrage, R; Parks, A; Rogers, P
1992-01-01
The purpose of this study is to group hospital-based home care (HBHC) patients homogeneously by their characteristics with respect to cost of care to develop alternative case mix methods for management and reimbursement (allocation) purposes. Six Veterans Affairs (VA) HBHC programs in Fiscal Year (FY) 1986 that maximized patient, program, and regional variation were selected, all of which agreed to participate. All HBHC patients active in each program on October 1, 1987, in addition to all new admissions through September 30, 1988 (FY88), comprised the sample of 874 unique patients. Statistical methods include the use of classification and regression trees (CART software: Statistical Software; Lafayette, CA), analysis of variance, and multiple linear regression techniques. The resulting algorithm is a three-factor model that explains 20% of the cost variance (R2 = 20%, with a cross validation R2 of 12%). Similar classifications such as the RUG-II, which is utilized for VA nursing home and intermediate care, the VA outpatient resource allocation model, and the RUG-HHC, utilized in some states for reimbursing home health care in the private sector, explained less of the cost variance and, therefore, are less adequate for VA home care resource allocation.
Laboratory Headphone Studies of Human Response to Low-Amplitude Sonic Booms and Rattle Heard Indoors
NASA Technical Reports Server (NTRS)
Loubeau, Alexandra; Sullivan, Brenda M.; Klos, Jacob; Rathsam, Jonathan; Gavin, Joseph R.
2013-01-01
Human response to sonic booms heard indoors is affected by the generation of contact-induced rattle noise. The annoyance caused by sonic boom-induced rattle noise was studied in a series of psychoacoustics tests. Stimuli were divided into three categories and presented in three different studies: isolated rattles at the same calculated Perceived Level (PL), sonic booms combined with rattles with the mixed sound at a single PL, and sonic booms combined with rattles with the mixed sound at three different PL. Subjects listened to sounds over headphones and were asked to report their annoyance. Annoyance to different rattles was shown to vary significantly according to rattle object size. In addition, the combination of low-amplitude sonic booms and rattles can be more annoying than the sonic boom alone. Correlations and regression analyses for the combined sonic boom and rattle sounds identified the Moore and Glasberg Stationary Loudness (MGSL) metric as a primary predictor of annoyance for the tested sounds. Multiple linear regression models were developed to describe annoyance to the tested sounds, and simplifications for applicability to a wider range of sounds are presented.
The Application of the Cumulative Logistic Regression Model to Automated Essay Scoring
ERIC Educational Resources Information Center
Haberman, Shelby J.; Sinharay, Sandip
2010-01-01
Most automated essay scoring programs use a linear regression model to predict an essay score from several essay features. This article applied a cumulative logit model instead of the linear regression model to automated essay scoring. Comparison of the performances of the linear regression model and the cumulative logit model was performed on a…
An overview of longitudinal data analysis methods for neurological research.
Locascio, Joseph J; Atri, Alireza
2011-01-01
The purpose of this article is to provide a concise, broad and readily accessible overview of longitudinal data analysis methods, aimed to be a practical guide for clinical investigators in neurology. In general, we advise that older, traditional methods, including (1) simple regression of the dependent variable on a time measure, (2) analyzing a single summary subject level number that indexes changes for each subject and (3) a general linear model approach with a fixed-subject effect, should be reserved for quick, simple or preliminary analyses. We advocate the general use of mixed-random and fixed-effect regression models for analyses of most longitudinal clinical studies. Under restrictive situations or to provide validation, we recommend: (1) repeated-measure analysis of covariance (ANCOVA), (2) ANCOVA for two time points, (3) generalized estimating equations and (4) latent growth curve/structural equation models.
NASA Astrophysics Data System (ADS)
Gao, Xiangyun; An, Haizhong; Fang, Wei; Huang, Xuan; Li, Huajiao; Zhong, Weiqiong; Ding, Yinghui
2014-07-01
The linear regression parameters between two time series can be different under different lengths of observation period. If we study the whole period by the sliding window of a short period, the change of the linear regression parameters is a process of dynamic transmission over time. We tackle fundamental research that presents a simple and efficient computational scheme: a linear regression patterns transmission algorithm, which transforms linear regression patterns into directed and weighted networks. The linear regression patterns (nodes) are defined by the combination of intervals of the linear regression parameters and the results of the significance testing under different sizes of the sliding window. The transmissions between adjacent patterns are defined as edges, and the weights of the edges are the frequency of the transmissions. The major patterns, the distance, and the medium in the process of the transmission can be captured. The statistical results of weighted out-degree and betweenness centrality are mapped on timelines, which shows the features of the distribution of the results. Many measurements in different areas that involve two related time series variables could take advantage of this algorithm to characterize the dynamic relationships between the time series from a new perspective.
Gao, Xiangyun; An, Haizhong; Fang, Wei; Huang, Xuan; Li, Huajiao; Zhong, Weiqiong; Ding, Yinghui
2014-07-01
The linear regression parameters between two time series can be different under different lengths of observation period. If we study the whole period by the sliding window of a short period, the change of the linear regression parameters is a process of dynamic transmission over time. We tackle fundamental research that presents a simple and efficient computational scheme: a linear regression patterns transmission algorithm, which transforms linear regression patterns into directed and weighted networks. The linear regression patterns (nodes) are defined by the combination of intervals of the linear regression parameters and the results of the significance testing under different sizes of the sliding window. The transmissions between adjacent patterns are defined as edges, and the weights of the edges are the frequency of the transmissions. The major patterns, the distance, and the medium in the process of the transmission can be captured. The statistical results of weighted out-degree and betweenness centrality are mapped on timelines, which shows the features of the distribution of the results. Many measurements in different areas that involve two related time series variables could take advantage of this algorithm to characterize the dynamic relationships between the time series from a new perspective.
Spatial Characteristics of Small Green Spaces' Mitigating Effects on Microscopic Urban Heat Islands
NASA Astrophysics Data System (ADS)
Park, J.; Lee, D. K.; Jeong, W.; Kim, J. H.; Huh, K. Y.
2015-12-01
The purpose of the study is to find small greens' disposition, types and sizes to reduce air temperature effectively in urban blocks. The research sites were six high developed blocks in Seoul, Korea. Air temperature was measured with mobile loggers in clear daytime during summer, from August to September, at screen level. Also the measurement repeated over three times a day during three days by walking and circulating around the experimental blocks and the control blocks at the same time. By analyzing spatial characteristics, the averaged air temperatures were classified with three spaces, sunny spaces, building-shaded spaces and small green spaces by using Kruskal-Wallis Test; and small green spaces in 6 blocks were classified into their outward forms, polygonal or linear and single or mixed. The polygonal and mixed types of small green spaces mitigated averaged air temperature of each block which they belonged with a simple linear regression model with adjusted R2 = 0.90**. As the area and volume of these types increased, the effect of air temperature reduction (ΔT; Air temperature difference between sunny space and green space in a block) also increased in a linear relationship. The experimental range of this research is 100m2 ~ 2,000m2 of area, and 1,000m3 ~ 10,000m3 of volume of small green space. As a result, more than 300m2 and 2,300m3 of polygonal green spaces with mixed vegetation is required to lower 1°C; 650m2 and 5,000m3 of them to lower 2°C; about 2,000m2 and about 10,000m3 of them to lower 4°C air temperature reduction in an urban block.
Schaid, Daniel J
2010-01-01
Measures of genomic similarity are the basis of many statistical analytic methods. We review the mathematical and statistical basis of similarity methods, particularly based on kernel methods. A kernel function converts information for a pair of subjects to a quantitative value representing either similarity (larger values meaning more similar) or distance (smaller values meaning more similar), with the requirement that it must create a positive semidefinite matrix when applied to all pairs of subjects. This review emphasizes the wide range of statistical methods and software that can be used when similarity is based on kernel methods, such as nonparametric regression, linear mixed models and generalized linear mixed models, hierarchical models, score statistics, and support vector machines. The mathematical rigor for these methods is summarized, as is the mathematical framework for making kernels. This review provides a framework to move from intuitive and heuristic approaches to define genomic similarities to more rigorous methods that can take advantage of powerful statistical modeling and existing software. A companion paper reviews novel approaches to creating kernels that might be useful for genomic analyses, providing insights with examples [1]. Copyright © 2010 S. Karger AG, Basel.
Item Purification in Differential Item Functioning Using Generalized Linear Mixed Models
ERIC Educational Resources Information Center
Liu, Qian
2011-01-01
For this dissertation, four item purification procedures were implemented onto the generalized linear mixed model for differential item functioning (DIF) analysis, and the performance of these item purification procedures was investigated through a series of simulations. Among the four procedures, forward and generalized linear mixed model (GLMM)…
Du, Hongying; Wang, Jie; Yao, Xiaojun; Hu, Zhide
2009-01-01
The heuristic method (HM) and support vector machine (SVM) were used to construct quantitative structure-retention relationship models by a series of compounds to predict the gradient retention times of reversed-phase high-performance liquid chromatography (HPLC) in three different columns. The aims of this investigation were to predict the retention times of multifarious compounds, to find the main properties of the three columns, and to indicate the theory of separation procedures. In our method, we correlated the retention times of many diverse structural analytes in three columns (Symmetry C18, Chromolith, and SG-MIX) with their representative molecular descriptors, calculated from the molecular structures alone. HM was used to select the most important molecular descriptors and build linear regression models. Furthermore, non-linear regression models were built using the SVM method; the performance of the SVM models were better than that of the HM models, and the prediction results were in good agreement with the experimental values. This paper could give some insights into the factors that were likely to govern the gradient retention process of the three investigated HPLC columns, which could theoretically supervise the practical experiment.
Use of a tracing task to assess visuomotor performance for evidence of concussion and recuperation.
Kelty-Stephen, Damian G; Qureshi Ahmad, Mona; Stirling, Leia
2015-12-01
The likelihood of suffering a concussion while playing a contact sport ranges from 15-45% per year of play. These rates are highly variable as athletes seldom report concussive symptoms, or do not recognize their symptoms. We performed a prospective cohort study (n = 206, aged 10-17) to examine visuomotor tracing to determine the sensitivity for detecting neuromotor components of concussion. Tracing variability measures were investigated for a mean shift with presentation of concussion-related symptoms and a linear return toward baseline over subsequent return visits. Furthermore, previous research relating brain injury to the dissociation of smooth movements into "submovements" led to the expectation that cumulative micropause duration, a measure of motion continuity, might detect likelihood of injury. Separate linear mixed effects regressions of tracing measures indicated that 4 of the 5 tracing measures captured both short-term effects of injury and longer-term effects of recovery with subsequent visits. Cumulative micropause duration has a positive relationship with likelihood of participants having had a concussion. The present results suggest that future research should evaluate how well the coefficients for the tracing parameter in the logistic regression help to detect concussion in novel cases. (c) 2015 APA, all rights reserved).
Jesus, Gilmar Mercês de; Assis, Maria Alice Altenburg de; Kupek, Emil; Dias, Lizziane Andrade
2017-01-01
The quality control of data entry in computerized questionnaires is an important step in the validation of new instruments. The study assessed the consistency of recorded weight and height on the Food Intake and Physical Activity of School Children (Web-CAAFE) between repeated measures and against directly measured data. Students from the 2nd to the 5th grade (n = 390) had their weight and height directly measured and then filled out the Web-CAAFE. A subsample (n = 92) filled out the Web-CAAFE twice, three hours apart. The analysis included hierarchical linear regression, mixed linear regression model, to evaluate the bias, and intraclass correlation coefficient (ICC), to assess consistency. Univariate linear regression assessed the effect of gender, reading/writing performance, and computer/internet use and possession on residuals of fixed and random effects. The Web-CAAFE showed high values of ICC between repeated measures (body weight = 0.996, height = 0.937, body mass index - BMI = 0.972), and regarding the checked measures (body weight = 0.962, height = 0.882, BMI = 0.828). The difference between means of body weight, height, and BMI directly measured and recorded was 208 g, -2 mm, and 0.238 kg/m², respectively, indicating slight BMI underestimation due to underestimation of weight and overestimation of height. This trend was related to body weight and age. Height and weight data entered in the Web-CAAFE by children were highly correlated with direct measurements and with the repeated entry. The bias found was similar to validation studies of self-reported weight and height in comparison to direct measurements.
Assessing Discriminative Performance at External Validation of Clinical Prediction Models
Nieboer, Daan; van der Ploeg, Tjeerd; Steyerberg, Ewout W.
2016-01-01
Introduction External validation studies are essential to study the generalizability of prediction models. Recently a permutation test, focusing on discrimination as quantified by the c-statistic, was proposed to judge whether a prediction model is transportable to a new setting. We aimed to evaluate this test and compare it to previously proposed procedures to judge any changes in c-statistic from development to external validation setting. Methods We compared the use of the permutation test to the use of benchmark values of the c-statistic following from a previously proposed framework to judge transportability of a prediction model. In a simulation study we developed a prediction model with logistic regression on a development set and validated them in the validation set. We concentrated on two scenarios: 1) the case-mix was more heterogeneous and predictor effects were weaker in the validation set compared to the development set, and 2) the case-mix was less heterogeneous in the validation set and predictor effects were identical in the validation and development set. Furthermore we illustrated the methods in a case study using 15 datasets of patients suffering from traumatic brain injury. Results The permutation test indicated that the validation and development set were homogenous in scenario 1 (in almost all simulated samples) and heterogeneous in scenario 2 (in 17%-39% of simulated samples). Previously proposed benchmark values of the c-statistic and the standard deviation of the linear predictors correctly pointed at the more heterogeneous case-mix in scenario 1 and the less heterogeneous case-mix in scenario 2. Conclusion The recently proposed permutation test may provide misleading results when externally validating prediction models in the presence of case-mix differences between the development and validation population. To correctly interpret the c-statistic found at external validation it is crucial to disentangle case-mix differences from incorrect regression coefficients. PMID:26881753
Assessing Discriminative Performance at External Validation of Clinical Prediction Models.
Nieboer, Daan; van der Ploeg, Tjeerd; Steyerberg, Ewout W
2016-01-01
External validation studies are essential to study the generalizability of prediction models. Recently a permutation test, focusing on discrimination as quantified by the c-statistic, was proposed to judge whether a prediction model is transportable to a new setting. We aimed to evaluate this test and compare it to previously proposed procedures to judge any changes in c-statistic from development to external validation setting. We compared the use of the permutation test to the use of benchmark values of the c-statistic following from a previously proposed framework to judge transportability of a prediction model. In a simulation study we developed a prediction model with logistic regression on a development set and validated them in the validation set. We concentrated on two scenarios: 1) the case-mix was more heterogeneous and predictor effects were weaker in the validation set compared to the development set, and 2) the case-mix was less heterogeneous in the validation set and predictor effects were identical in the validation and development set. Furthermore we illustrated the methods in a case study using 15 datasets of patients suffering from traumatic brain injury. The permutation test indicated that the validation and development set were homogenous in scenario 1 (in almost all simulated samples) and heterogeneous in scenario 2 (in 17%-39% of simulated samples). Previously proposed benchmark values of the c-statistic and the standard deviation of the linear predictors correctly pointed at the more heterogeneous case-mix in scenario 1 and the less heterogeneous case-mix in scenario 2. The recently proposed permutation test may provide misleading results when externally validating prediction models in the presence of case-mix differences between the development and validation population. To correctly interpret the c-statistic found at external validation it is crucial to disentangle case-mix differences from incorrect regression coefficients.
Korany, Mohamed A; Gazy, Azza A; Khamis, Essam F; Ragab, Marwa A A; Kamal, Miranda F
2018-06-01
This study outlines two robust regression approaches, namely least median of squares (LMS) and iteratively re-weighted least squares (IRLS) to investigate their application in instrument analysis of nutraceuticals (that is, fluorescence quenching of merbromin reagent upon lipoic acid addition). These robust regression methods were used to calculate calibration data from the fluorescence quenching reaction (∆F and F-ratio) under ideal or non-ideal linearity conditions. For each condition, data were treated using three regression fittings: Ordinary Least Squares (OLS), LMS and IRLS. Assessment of linearity, limits of detection (LOD) and quantitation (LOQ), accuracy and precision were carefully studied for each condition. LMS and IRLS regression line fittings showed significant improvement in correlation coefficients and all regression parameters for both methods and both conditions. In the ideal linearity condition, the intercept and slope changed insignificantly, but a dramatic change was observed for the non-ideal condition and linearity intercept. Under both linearity conditions, LOD and LOQ values after the robust regression line fitting of data were lower than those obtained before data treatment. The results obtained after statistical treatment indicated that the linearity ranges for drug determination could be expanded to lower limits of quantitation by enhancing the regression equation parameters after data treatment. Analysis results for lipoic acid in capsules, using both fluorimetric methods, treated by parametric OLS and after treatment by robust LMS and IRLS were compared for both linearity conditions. Copyright © 2018 John Wiley & Sons, Ltd.
Martin, Guillaume; Magne, Marie-Angélina; Cristobal, Magali San
2017-01-01
The need to adapt to decrease farm vulnerability to adverse contextual events has been extensively discussed on a theoretical basis. We developed an integrated and operational method to assess farm vulnerability to multiple and interacting contextual changes and explain how this vulnerability can best be reduced according to farm configurations and farmers' technical adaptations over time. Our method considers farm vulnerability as a function of the raw measurements of vulnerability variables (e.g., economic efficiency of production), the slope of the linear regression of these measurements over time, and the residuals of this linear regression. The last two are extracted from linear mixed models considering a random regression coefficient (an intercept common to all farms), a global trend (a slope common to all farms), a random deviation from the general mean for each farm, and a random deviation from the general trend for each farm. Among all possible combinations, the lowest farm vulnerability is obtained through a combination of high values of measurements, a stable or increasing trend and low variability for all vulnerability variables considered. Our method enables relating the measurements, trends and residuals of vulnerability variables to explanatory variables that illustrate farm exposure to climatic and economic variability, initial farm configurations and farmers' technical adaptations over time. We applied our method to 19 cattle (beef, dairy, and mixed) farms over the period 2008-2013. Selected vulnerability variables, i.e., farm productivity and economic efficiency, varied greatly among cattle farms and across years, with means ranging from 43.0 to 270.0 kg protein/ha and 29.4-66.0% efficiency, respectively. No farm had a high level, stable or increasing trend and low residuals for both farm productivity and economic efficiency of production. Thus, the least vulnerable farms represented a compromise among measurement value, trend, and variability of both performances. No specific combination of farmers' practices emerged for reducing cattle farm vulnerability to climatic and economic variability. In the least vulnerable farms, the practices implemented (stocking rate, input use…) were more consistent with the objective of developing the properties targeted (efficiency, robustness…). Our method can be used to support farmers with sector-specific and local insights about most promising farm adaptations.
Martin, Guillaume; Magne, Marie-Angélina; Cristobal, Magali San
2017-01-01
The need to adapt to decrease farm vulnerability to adverse contextual events has been extensively discussed on a theoretical basis. We developed an integrated and operational method to assess farm vulnerability to multiple and interacting contextual changes and explain how this vulnerability can best be reduced according to farm configurations and farmers’ technical adaptations over time. Our method considers farm vulnerability as a function of the raw measurements of vulnerability variables (e.g., economic efficiency of production), the slope of the linear regression of these measurements over time, and the residuals of this linear regression. The last two are extracted from linear mixed models considering a random regression coefficient (an intercept common to all farms), a global trend (a slope common to all farms), a random deviation from the general mean for each farm, and a random deviation from the general trend for each farm. Among all possible combinations, the lowest farm vulnerability is obtained through a combination of high values of measurements, a stable or increasing trend and low variability for all vulnerability variables considered. Our method enables relating the measurements, trends and residuals of vulnerability variables to explanatory variables that illustrate farm exposure to climatic and economic variability, initial farm configurations and farmers’ technical adaptations over time. We applied our method to 19 cattle (beef, dairy, and mixed) farms over the period 2008–2013. Selected vulnerability variables, i.e., farm productivity and economic efficiency, varied greatly among cattle farms and across years, with means ranging from 43.0 to 270.0 kg protein/ha and 29.4–66.0% efficiency, respectively. No farm had a high level, stable or increasing trend and low residuals for both farm productivity and economic efficiency of production. Thus, the least vulnerable farms represented a compromise among measurement value, trend, and variability of both performances. No specific combination of farmers’ practices emerged for reducing cattle farm vulnerability to climatic and economic variability. In the least vulnerable farms, the practices implemented (stocking rate, input use…) were more consistent with the objective of developing the properties targeted (efficiency, robustness…). Our method can be used to support farmers with sector-specific and local insights about most promising farm adaptations. PMID:28900435
1974-01-01
REGRESSION MODEL - THE UNCONSTRAINED, LINEAR EQUALITY AND INEQUALITY CONSTRAINED APPROACHES January 1974 Nelson Delfino d’Avila Mascarenha;? Image...Report 520 DIGITAL IMAGE RESTORATION UNDER A REGRESSION MODEL THE UNCONSTRAINED, LINEAR EQUALITY AND INEQUALITY CONSTRAINED APPROACHES January...a two- dimensional form adequately describes the linear model . A dis- cretization is performed by using quadrature methods. By trans
Element enrichment factor calculation using grain-size distribution and functional data regression.
Sierra, C; Ordóñez, C; Saavedra, A; Gallego, J R
2015-01-01
In environmental geochemistry studies it is common practice to normalize element concentrations in order to remove the effect of grain size. Linear regression with respect to a particular grain size or conservative element is a widely used method of normalization. In this paper, the utility of functional linear regression, in which the grain-size curve is the independent variable and the concentration of pollutant the dependent variable, is analyzed and applied to detrital sediment. After implementing functional linear regression and classical linear regression models to normalize and calculate enrichment factors, we concluded that the former regression technique has some advantages over the latter. First, functional linear regression directly considers the grain-size distribution of the samples as the explanatory variable. Second, as the regression coefficients are not constant values but functions depending on the grain size, it is easier to comprehend the relationship between grain size and pollutant concentration. Third, regularization can be introduced into the model in order to establish equilibrium between reliability of the data and smoothness of the solutions. Copyright © 2014 Elsevier Ltd. All rights reserved.
Yokoo, Takeshi; Serai, Suraj D; Pirasteh, Ali; Bashir, Mustafa R; Hamilton, Gavin; Hernando, Diego; Hu, Houchun H; Hetterich, Holger; Kühn, Jens-Peter; Kukuk, Guido M; Loomba, Rohit; Middleton, Michael S; Obuchowski, Nancy A; Song, Ji Soo; Tang, An; Wu, Xinhuai; Reeder, Scott B; Sirlin, Claude B
2018-02-01
Purpose To determine the linearity, bias, and precision of hepatic proton density fat fraction (PDFF) measurements by using magnetic resonance (MR) imaging across different field strengths, imager manufacturers, and reconstruction methods. Materials and Methods This meta-analysis was performed in accordance with Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. A systematic literature search identified studies that evaluated the linearity and/or bias of hepatic PDFF measurements by using MR imaging (hereafter, MR imaging-PDFF) against PDFF measurements by using colocalized MR spectroscopy (hereafter, MR spectroscopy-PDFF) or the precision of MR imaging-PDFF. The quality of each study was evaluated by using the Quality Assessment of Studies of Diagnostic Accuracy 2 tool. De-identified original data sets from the selected studies were pooled. Linearity was evaluated by using linear regression between MR imaging-PDFF and MR spectroscopy-PDFF measurements. Bias, defined as the mean difference between MR imaging-PDFF and MR spectroscopy-PDFF measurements, was evaluated by using Bland-Altman analysis. Precision, defined as the agreement between repeated MR imaging-PDFF measurements, was evaluated by using a linear mixed-effects model, with field strength, imager manufacturer, reconstruction method, and region of interest as random effects. Results Twenty-three studies (1679 participants) were selected for linearity and bias analyses and 11 studies (425 participants) were selected for precision analyses. MR imaging-PDFF was linear with MR spectroscopy-PDFF (R 2 = 0.96). Regression slope (0.97; P < .001) and mean Bland-Altman bias (-0.13%; 95% limits of agreement: -3.95%, 3.40%) indicated minimal underestimation by using MR imaging-PDFF. MR imaging-PDFF was precise at the region-of-interest level, with repeatability and reproducibility coefficients of 2.99% and 4.12%, respectively. Field strength, imager manufacturer, and reconstruction method each had minimal effects on reproducibility. Conclusion MR imaging-PDFF has excellent linearity, bias, and precision across different field strengths, imager manufacturers, and reconstruction methods. © RSNA, 2017 Online supplemental material is available for this article. An earlier incorrect version of this article appeared online. This article was corrected on October 2, 2017.
Who Will Win?: Predicting the Presidential Election Using Linear Regression
ERIC Educational Resources Information Center
Lamb, John H.
2007-01-01
This article outlines a linear regression activity that engages learners, uses technology, and fosters cooperation. Students generated least-squares linear regression equations using TI-83 Plus[TM] graphing calculators, Microsoft[C] Excel, and paper-and-pencil calculations using derived normal equations to predict the 2004 presidential election.…
Hines, C J; Deddens, J A; Coble, J; Alavanja, M C R
2007-04-01
Fungicides are routinely applied to deciduous tree fruits for disease management. Seventy-four private orchard applicators enrolled in the Agricultural Health Study participated in the Orchard Fungicide Exposure Study in 2002-2003. During 144 days of observation, information was obtained on chemicals applied and applicator mixing, application, personal protective, and hygiene practices. At least half of the applicators had orchards with <100 trees. Air blast was the most frequent application method used (55%), followed by hand spray (44%). Rubber gloves were the most frequently worn protective equipment (68% mix; 59% apply), followed by respirators (45% mix; 49% apply), protective outerwear (36% mix; 37% apply), and rubber boots (35% mix; 36% apply). Eye protection was worn while mixing and applying on only 35% and 41% of the days, respectively. Bivariate analyses were performed using repeated logistic or repeated linear regression. Mean duration of mixing, pounds of captan applied, total acres sprayed, and number of tank mixes sprayed were greater for air blast than for hand spray (p < 0.05). Spraying from a tractor/vehicle without an enclosed cab was associated with wearing some type of coverall (p < 0.05). Applicators often did not wash their hands after mixing (77%), a finding not explained by glove use. Glove use during mixing was associated with younger age, while wearing long-sleeve shirts was associated with older age (p < 0.05 each). Self-reported unusually high fungicide exposures were more likely on days applicators performed repairs (p < 0.05). These data will be useful for evaluating fungicide exposure determinants among orchard applicators.
The microcomputer scientific software series 2: general linear model--regression.
Harold M. Rauscher
1983-01-01
The general linear model regression (GLMR) program provides the microcomputer user with a sophisticated regression analysis capability. The output provides a regression ANOVA table, estimators of the regression model coefficients, their confidence intervals, confidence intervals around the predicted Y-values, residuals for plotting, a check for multicollinearity, a...
Omedo, Irene; Mogeni, Polycarp; Bousema, Teun; Rockett, Kirk; Amambua-Ngwa, Alfred; Oyier, Isabella; C. Stevenson, Jennifer; Y. Baidjoe, Amrish; de Villiers, Etienne P.; Fegan, Greg; Ross, Amanda; Hubbart, Christina; Jeffreys, Anne; N. Williams, Thomas; Kwiatkowski, Dominic; Bejon, Philip
2017-01-01
Background: The first models of malaria transmission assumed a completely mixed and homogeneous population of parasites. Recent models include spatial heterogeneity and variably mixed populations. However, there are few empiric estimates of parasite mixing with which to parametize such models. Methods: Here we genotype 276 single nucleotide polymorphisms (SNPs) in 5199 P. falciparum isolates from two Kenyan sites (Kilifi county and Rachuonyo South district) and one Gambian site (Kombo coastal districts) to determine the spatio-temporal extent of parasite mixing, and use Principal Component Analysis (PCA) and linear regression to examine the relationship between genetic relatedness and distance in space and time for parasite pairs. Results: Using 107, 177 and 82 SNPs that were successfully genotyped in 133, 1602, and 1034 parasite isolates from The Gambia, Kilifi and Rachuonyo South district, respectively, we show that there are no discrete geographically restricted parasite sub-populations, but instead we see a diffuse spatio-temporal structure to parasite genotypes. Genetic relatedness of sample pairs is predicted by relatedness in space and time. Conclusions: Our findings suggest that targeted malaria control will benefit the surrounding community, but unfortunately also that emerging drug resistance will spread rapidly through the population. PMID:28612053
NASA Technical Reports Server (NTRS)
Merenyi, E.; Miller, J. S.; Singer, R. B.
1992-01-01
The linear mixing model approach was successfully applied to data sets of various natures. In these sets, the measured radiance could be assumed to be a linear combination of radiance contributions. The present work is an attempt to analyze a spectral image of Mars with linear mixing modeling.
Spectroscopic studies on Solvatochromism of mixed-chelate copper(II) complexes using MLR technique
NASA Astrophysics Data System (ADS)
Golchoubian, Hamid; Moayyedi, Golasa; Fazilati, Hakimeh
2012-01-01
Mixed-chelate copper(II) complexes with a general formula [Cu(acac)(diamine)]X where acac = acetylacetonate ion, diamine = N,N-dimethyl,N'-benzyl-1,2-diaminoethane and X = BPh 4-, PF 6-, ClO 4- and BF 4- have been prepared. The complexes were characterized on the basis of elemental analysis, molar conductance, UV-vis and IR spectroscopies. The complexes are solvatochromic and their solvatochromism were investigated by visible spectroscopy. All complexes demonstrated the positive solvatochromism and among the complexes [Cu(acac)(diamine)]BPh 4·H 2O showed the highest Δ νmax value. To explore the mechanism of interaction between solvent molecules and the complexes, different solvent parameters such as DN, AN, α and β using multiple linear regression (MLR) method were employed. The statistical results suggested that the DN parameter of the solvent plays a dominate contribution to the shift of the d-d absorption band of the complexes.
Samsuddin, Niza; Rampal, Krishna Gopal; Ismail, Noor Hassim; Abdullah, Nor Zamzila; Nasreen, Hashima E
2016-02-01
Research findings have linked exposure to pesticides to an increased risk of cardiovascular (CVS) diseases. Therefore, this study aimed to assess the impact of chronic mix-pesticides exposure on CVS hemodynamic parameters. A total of 198 male Malay pesticide-exposed and 195 male Malay nonexposed workers were examined. Data were collected through exposure-matrix assessment, questionnaire, blood analyses, and CVS assessment. Explanatory variables comprised of lipid profiles, paraoxonase 1 (PON1), and oxidized low-density lipoprotein (ox-LDL). Outcome measures comprised of brachial and aortic diastolic blood pressure (DBP) and systolic BP (SBP), heart rate, and pulse wave velocity (PWV). Linear regressions identified the B coefficient showing how many units of CVS parameters are associated with each unit of covariates. Diazoxonase was significantly lower and ox-LDL was higher among pesticide-exposed workers than the comparison group. The final multivariate linear regression model revealed that age, body mass index (BMI), smoking, and pesticide exposure were independent predictors of brachial and aortic DBP and SBP. Pesticide exposure was also associated with heart rate, but not with PWV. Lipid profiles, PON1 enzymes, and ox-LDL showed no association with any of the CVS parameters. Chronic mix-pesticide exposure among workers involved in mosquito control has possible association with depression of diazoxonase and the increase in ox-LDL, brachial and aortic DBP and SBP, and heart rate. This study raises concerns that those using pesticides may be exposed to hitherto unrecognized CVS risks among others. If this is confirmed by further studies, greater efforts will be needed to protect these workers. © American Journal of Hypertension, Ltd 2015. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Punzo, Antonio; Ingrassia, Salvatore; Maruotti, Antonello
2018-04-22
A time-varying latent variable model is proposed to jointly analyze multivariate mixed-support longitudinal data. The proposal can be viewed as an extension of hidden Markov regression models with fixed covariates (HMRMFCs), which is the state of the art for modelling longitudinal data, with a special focus on the underlying clustering structure. HMRMFCs are inadequate for applications in which a clustering structure can be identified in the distribution of the covariates, as the clustering is independent from the covariates distribution. Here, hidden Markov regression models with random covariates are introduced by explicitly specifying state-specific distributions for the covariates, with the aim of improving the recovering of the clusters in the data with respect to a fixed covariates paradigm. The hidden Markov regression models with random covariates class is defined focusing on the exponential family, in a generalized linear model framework. Model identifiability conditions are sketched, an expectation-maximization algorithm is outlined for parameter estimation, and various implementation and operational issues are discussed. Properties of the estimators of the regression coefficients, as well as of the hidden path parameters, are evaluated through simulation experiments and compared with those of HMRMFCs. The method is applied to physical activity data. Copyright © 2018 John Wiley & Sons, Ltd.
Boosting structured additive quantile regression for longitudinal childhood obesity data.
Fenske, Nora; Fahrmeir, Ludwig; Hothorn, Torsten; Rzehak, Peter; Höhle, Michael
2013-07-25
Childhood obesity and the investigation of its risk factors has become an important public health issue. Our work is based on and motivated by a German longitudinal study including 2,226 children with up to ten measurements on their body mass index (BMI) and risk factors from birth to the age of 10 years. We introduce boosting of structured additive quantile regression as a novel distribution-free approach for longitudinal quantile regression. The quantile-specific predictors of our model include conventional linear population effects, smooth nonlinear functional effects, varying-coefficient terms, and individual-specific effects, such as intercepts and slopes. Estimation is based on boosting, a computer intensive inference method for highly complex models. We propose a component-wise functional gradient descent boosting algorithm that allows for penalized estimation of the large variety of different effects, particularly leading to individual-specific effects shrunken toward zero. This concept allows us to flexibly estimate the nonlinear age curves of upper quantiles of the BMI distribution, both on population and on individual-specific level, adjusted for further risk factors and to detect age-varying effects of categorical risk factors. Our model approach can be regarded as the quantile regression analog of Gaussian additive mixed models (or structured additive mean regression models), and we compare both model classes with respect to our obesity data.
Wu, Xue; Sengupta, Kaushik
2018-03-19
This paper demonstrates a methodology to miniaturize THz spectroscopes into a single silicon chip by eliminating traditional solid-state architectural components such as complex tunable THz and optical sources, nonlinear mixing and amplifiers. The proposed method achieves this by extracting incident THz spectral signatures from the surface of an on-chip antenna itself. The information is sensed through the spectrally-sensitive 2D distribution of the impressed current surface under the THz incident field. By converting the antenna from a single-port to a massively multi-port architecture with integrated electronics and deep subwavelength sensing, THz spectral estimation is converted into a linear estimation problem. We employ rigorous regression techniques and analysis to demonstrate a single silicon chip system operating at room temperature across 0.04-0.99 THz with 10 MHz accuracy in spectrum estimation of THz tones across the entire spectrum.
An Overview of Longitudinal Data Analysis Methods for Neurological Research
Locascio, Joseph J.; Atri, Alireza
2011-01-01
The purpose of this article is to provide a concise, broad and readily accessible overview of longitudinal data analysis methods, aimed to be a practical guide for clinical investigators in neurology. In general, we advise that older, traditional methods, including (1) simple regression of the dependent variable on a time measure, (2) analyzing a single summary subject level number that indexes changes for each subject and (3) a general linear model approach with a fixed-subject effect, should be reserved for quick, simple or preliminary analyses. We advocate the general use of mixed-random and fixed-effect regression models for analyses of most longitudinal clinical studies. Under restrictive situations or to provide validation, we recommend: (1) repeated-measure analysis of covariance (ANCOVA), (2) ANCOVA for two time points, (3) generalized estimating equations and (4) latent growth curve/structural equation models. PMID:22203825
Naruse, Takashi; Taguchi, Atsuko; Kuwahara, Yuki; Nagata, Satoko; Sakai, Mahiro; Watai, Izumi; Murashima, Sachiyo
2015-05-01
This study evaluated the effect of a skill-mix programme intervention on work engagement in home visiting nurses. A skill-mix programme in which home visiting nurses are assisted by non-nursing workers is assumed to foster home visiting nurses' work engagement. Pre- and post-intervention evaluations of work engagement were conducted using self-administered questionnaires. A skill-mix programme was introduced in the intervention group of home visiting nurses. After 6 months, their pre- and post-intervention work engagement ratings were compared with those of a control group. Baseline questionnaires were returned by 174 home visiting nurses (44 in the intervention group, 130 in the control group). Post-intervention questionnaires were returned by 38 and 97 home visiting nurses from each group. The intervention group's average work engagement scores were 2.2 at baseline and 2.3 at post-intervention; the control group's were 3.3 and 2.6. Generalised linear regression showed significant between-group differences in score changes. The skill-mix programme might foster home visiting nurses' work engagement by improving the quality of care for each client. Future research is needed to explain the exact mechanisms that underlie its effectiveness. In order to improve the efficiency of services provided by home visiting nurses and foster their work engagement, skill-mix programmes might be beneficial. © 2014 John Wiley & Sons Ltd.
Wang, D Z; Wang, C; Shen, C F; Zhang, Y; Zhang, H; Song, G D; Xue, X D; Xu, Z L; Zhang, S; Jiang, G H
2017-05-10
We described the time trend of acute myocardial infarction (AMI) from 1999 to 2013 in Tianjin incidence rate with Cochran-Armitage trend (CAT) test and linear regression analysis, and the results were compared. Based on actual population, CAT test had much stronger statistical power than linear regression analysis for both overall incidence trend and age specific incidence trend (Cochran-Armitage trend P value
Yang, Ruiqi; Wang, Fei; Zhang, Jialing; Zhu, Chonglei; Fan, Limei
2015-05-19
To establish the reference values of thalamus, caudate nucleus and lenticular nucleus diameters through fetal thalamic transverse section. A total of 265 fetuses at our hospital were randomly selected from November 2012 to August 2014. And the transverse and length diameters of thalamus, caudate nucleus and lenticular nucleus were measured. SPSS 19.0 statistical software was used to calculate the regression curve of fetal diameter changes and gestational weeks of pregnancy. P < 0.05 was considered as having statistical significance. The linear regression equation of fetal thalamic length diameter and gestational week was: Y = 0.051X+0.201, R = 0.876, linear regression equation of thalamic transverse diameter and fetal gestational week was: Y = 0.031X+0.229, R = 0.817, linear regression equation of fetal head of caudate nucleus length diameter and gestational age was: Y = 0.033X+0.101, R = 0.722, linear regression equation of fetal head of caudate nucleus transverse diameter and gestational week was: R = 0.025 - 0.046, R = 0.711, linear regression equation of fetal lentiform nucleus length diameter and gestational week was: Y = 0.046+0.229, R = 0.765, linear regression equation of fetal lentiform nucleus diameter and gestational week was: Y = 0.025 - 0.05, R = 0.772. Ultrasonic measurement of diameter of fetal thalamus caudate nucleus, and lenticular nucleus through thalamic transverse section is simple and convenient. And measurements increase with fetal gestational weeks and there is linear regression relationship between them.
Local Linear Regression for Data with AR Errors.
Li, Runze; Li, Yan
2009-07-01
In many statistical applications, data are collected over time, and they are likely correlated. In this paper, we investigate how to incorporate the correlation information into the local linear regression. Under the assumption that the error process is an auto-regressive process, a new estimation procedure is proposed for the nonparametric regression by using local linear regression method and the profile least squares techniques. We further propose the SCAD penalized profile least squares method to determine the order of auto-regressive process. Extensive Monte Carlo simulation studies are conducted to examine the finite sample performance of the proposed procedure, and to compare the performance of the proposed procedures with the existing one. From our empirical studies, the newly proposed procedures can dramatically improve the accuracy of naive local linear regression with working-independent error structure. We illustrate the proposed methodology by an analysis of real data set.
Orthogonal Regression: A Teaching Perspective
ERIC Educational Resources Information Center
Carr, James R.
2012-01-01
A well-known approach to linear least squares regression is that which involves minimizing the sum of squared orthogonal projections of data points onto the best fit line. This form of regression is known as orthogonal regression, and the linear model that it yields is known as the major axis. A similar method, reduced major axis regression, is…
Xiaoqiu Zuo; Urs Buehlmann; R. Edward Thomas
2004-01-01
Solving the least-cost lumber grade mix problem allows dimension mills to minimize the cost of dimension part production. This problem, due to its economic importance, has attracted much attention from researchers and industry in the past. Most solutions used linear programming models and assumed that a simple linear relationship existed between lumber grade mix and...
Practical Session: Simple Linear Regression
NASA Astrophysics Data System (ADS)
Clausel, M.; Grégoire, G.
2014-12-01
Two exercises are proposed to illustrate the simple linear regression. The first one is based on the famous Galton's data set on heredity. We use the lm R command and get coefficients estimates, standard error of the error, R2, residuals …In the second example, devoted to data related to the vapor tension of mercury, we fit a simple linear regression, predict values, and anticipate on multiple linear regression. This pratical session is an excerpt from practical exercises proposed by A. Dalalyan at EPNC (see Exercises 1 and 2 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_4.pdf).
Equilibrium Climate Sensitivity Obtained From Multimillennial Runs of Two GFDL Climate Models
NASA Astrophysics Data System (ADS)
Paynter, D.; Frölicher, T. L.; Horowitz, L. W.; Silvers, L. G.
2018-02-01
Equilibrium climate sensitivity (ECS), defined as the long-term change in global mean surface air temperature in response to doubling atmospheric CO2, is usually computed from short atmospheric simulations over a mixed layer ocean, or inferred using a linear regression over a short-time period of adjustment. We report the actual ECS from multimillenial simulations of two Geophysical Fluid Dynamics Laboratory (GFDL) general circulation models (GCMs), ESM2M, and CM3 of 3.3 K and 4.8 K, respectively. Both values are 1 K higher than estimates for the same models reported in the Fifth Assessment Report of the Intergovernmental Panel on Climate Change obtained by regressing the Earth's energy imbalance against temperature. This underestimate is mainly due to changes in the climate feedback parameter (-α) within the first century after atmospheric CO2 has stabilized. For both GCMs it is possible to estimate ECS with linear regression to within 0.3 K by increasing CO2 at 1% per year to doubling and using years 51-350 after CO2 is constant. We show that changes in -α differ between the two GCMs and are strongly tied to the changes in both vertical velocity at 500 hPa (ω500) and estimated inversion strength that the GCMs experience during the progression toward the equilibrium. This suggests that while cloud physics parametrizations are important for determining the strength of -α, the substantially different atmospheric state resulting from a changed sea surface temperature pattern may be of equal importance.
NASA Astrophysics Data System (ADS)
Wang, Fei; Zhang, Yijun; Zheng, Dong; Xu, Liangtao; Zhang, Wenjuan; Meng, Qing
2017-10-01
A three-dimensional charge-discharge numerical model is used, in a semi-idealized mode, to simulate a thunder-storm cell. Characteristics of the graupel microphysics and vertical air motion associated with the lightning initiation are revealed, which could be useful in retrieving charge strength during lightning when no charge-discharge model is available. The results show that the vertical air motion at the lightning initiation sites ( W ini) has a cubic polynomial correlation with the maximum updraft of the storm cell ( W cell-max), with the adjusted regression coefficient R 2 of approximately 0.97. Meanwhile, the graupel mixing ratio at the lightning initiation sites ( q g-ini) has a linear correlation with the maximum graupel mixing ratio of the storm cell ( q g-cell-max) and the initiation height ( z ini), with the coefficients being 0.86 and 0.85, respectively. These linear correlations are more significant during the middle and late stages of lightning activity. A zero-charge zone, namely, the area with very low net charge density between the main positive and negative charge layers, appears above the area of q g-cell-max and below the upper edge of the graupel region, and is found to be an important area for lightning initiation. Inside the zero-charge zone, large electric intensity forms, and the ratio of q ice (ice crystal mixing ratio) to q g (graupel mixing ratio) illustrates an exponential relationship to q g-ini. These relationships provide valuable clues to more accurately locating the high-risk area of lightning initiation in thunderstorms when only dual-polarization radar data or outputs from numerical models without charging/discharging schemes are available. The results can also help understand the environmental conditions at lightning initiation sites.
Morse Code, Scrabble, and the Alphabet
ERIC Educational Resources Information Center
Richardson, Mary; Gabrosek, John; Reischman, Diann; Curtiss, Phyliss
2004-01-01
In this paper we describe an interactive activity that illustrates simple linear regression. Students collect data and analyze it using simple linear regression techniques taught in an introductory applied statistics course. The activity is extended to illustrate checks for regression assumptions and regression diagnostics taught in an…
Statistical classification of drug incidents due to look-alike sound-alike mix-ups.
Wong, Zoie Shui Yee
2016-06-01
It has been recognised that medication names that look or sound similar are a cause of medication errors. This study builds statistical classifiers for identifying medication incidents due to look-alike sound-alike mix-ups. A total of 227 patient safety incident advisories related to medication were obtained from the Canadian Patient Safety Institute's Global Patient Safety Alerts system. Eight feature selection strategies based on frequent terms, frequent drug terms and constituent terms were performed. Statistical text classifiers based on logistic regression, support vector machines with linear, polynomial, radial-basis and sigmoid kernels and decision tree were trained and tested. The models developed achieved an average accuracy of above 0.8 across all the model settings. The receiver operating characteristic curves indicated the classifiers performed reasonably well. The results obtained in this study suggest that statistical text classification can be a feasible method for identifying medication incidents due to look-alike sound-alike mix-ups based on a database of advisories from Global Patient Safety Alerts. © The Author(s) 2014.
Aircraft measurements of trace gases between Japan and Singapore in October of 1993, 1996, and 1997
NASA Astrophysics Data System (ADS)
Matsueda, Hidekazu; Inoue, Hisayuki Y.
Carbon dioxide (CO2), methane (CH4), and carbon monoxide (CO) mixing ratios were measured in discrete air samples from aircraft between Japan and Singapore in October. The mixing ratios of all trace gases at 9-12 km were enhanced over the South China Sea in 1997 compared with those in 1993 and 1996. Vertical distributions of all trace gases over Singapore in 1997 also showed largely elevated mixing ratios at all altitudes. These distributions indicate a wide outflow of trace gases from intense biomass burning in the southeast Asia regions in the very strong El Niño year. The enhanced trace gases showed a strong linear correlation between CH4 and CO, and between CO and CO2, with the regression slopes of 0.051 (ΔCH4 ppb/ΔCOppb) and 0.089 (ΔCOppb/ΔCO2ppb). The emission ratios are characteristic of fires with relatively lower combustion efficiency from the tropical rain forest and peat lands in Kalimantan and Sumatra of Indonesia.
Su, Nan-Yao; Lee, Sang-Hee
2008-04-01
Marked termites were released in a linear-connected foraging arena, and the spatial heterogeneity of their capture probabilities was averaged for both directions at distance r from release point to obtain a symmetrical distribution, from which the density function of directionally averaged capture probability P(x) was derived. We hypothesized that as marked termites move into the population and given sufficient time, the directionally averaged capture probability may reach an equilibrium P(e) over the distance r and thus satisfy the equal mixing assumption of the mark-recapture protocol. The equilibrium capture probability P(e) was used to estimate the population size N. The hypothesis was tested in a 50-m extended foraging arena to simulate the distance factor of field colonies of subterranean termites. Over the 42-d test period, the density functions of directionally averaged capture probability P(x) exhibited four phases: exponential decline phase, linear decline phase, equilibrium phase, and postequilibrium phase. The equilibrium capture probability P(e), derived as the intercept of the linear regression during the equilibrium phase, correctly projected N estimates that were not significantly different from the known number of workers in the arena. Because the area beneath the probability density function is a constant (50% in this study), preequilibrium regression parameters and P(e) were used to estimate the population boundary distance 1, which is the distance between the release point and the boundary beyond which the population is absent.
Advanced statistics: linear regression, part II: multiple linear regression.
Marill, Keith A
2004-01-01
The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.
NASA Astrophysics Data System (ADS)
Kang, Pilsang; Koo, Changhoi; Roh, Hokyu
2017-11-01
Since simple linear regression theory was established at the beginning of the 1900s, it has been used in a variety of fields. Unfortunately, it cannot be used directly for calibration. In practical calibrations, the observed measurements (the inputs) are subject to errors, and hence they vary, thus violating the assumption that the inputs are fixed. Therefore, in the case of calibration, the regression line fitted using the method of least squares is not consistent with the statistical properties of simple linear regression as already established based on this assumption. To resolve this problem, "classical regression" and "inverse regression" have been proposed. However, they do not completely resolve the problem. As a fundamental solution, we introduce "reversed inverse regression" along with a new methodology for deriving its statistical properties. In this study, the statistical properties of this regression are derived using the "error propagation rule" and the "method of simultaneous error equations" and are compared with those of the existing regression approaches. The accuracy of the statistical properties thus derived is investigated in a simulation study. We conclude that the newly proposed regression and methodology constitute the complete regression approach for univariate linear calibrations.
New methodology for modeling annual-aircraft emissions at airports
DOE Office of Scientific and Technical Information (OSTI.GOV)
Woodmansey, B.G.; Patterson, J.G.
An as-accurate-as-possible estimation of total-aircraft emissions are an essential component of any environmental-impact assessment done for proposed expansions at major airports. To determine the amount of emissions generated by aircraft using present models it is necessary to know the emission characteristics of all engines that are on all planes using the airport. However, the published data base does not cover all engine types and, therefore, a new methodology is needed to assist in estimating annual emissions from aircraft at airports. Linear regression equations relating quantity of emissions to aircraft weight using a known-fleet mix are developed in this paper. Total-annualmore » emissions for CO, NO[sub x], NMHC, SO[sub x], CO[sub 2], and N[sub 2]O are tabulated for Toronto's international airport for 1990. The regression equations are statistically significant for all emissions except for NMHC from large jets and NO[sub x] and NMHC for piston-engine aircraft. This regression model is a relatively simple, fast, and inexpensive method of obtaining an annual-emission inventory for an airport.« less
Anodic microbial community diversity as a predictor of the power output of microbial fuel cells.
Stratford, James P; Beecroft, Nelli J; Slade, Robert C T; Grüning, André; Avignone-Rossa, Claudio
2014-03-01
The relationship between the diversity of mixed-species microbial consortia and their electrogenic potential in the anodes of microbial fuel cells was examined using different diversity measures as predictors. Identical microbial fuel cells were sampled at multiple time-points. Biofilm and suspension communities were analysed by denaturing gradient gel electrophoresis to calculate the number and relative abundance of species. Shannon and Simpson indices and richness were examined for association with power using bivariate and multiple linear regression, with biofilm DNA as an additional variable. In simple bivariate regressions, the correlation of Shannon diversity of the biofilm and power is stronger (r=0.65, p=0.001) than between power and richness (r=0.39, p=0.076), or between power and the Simpson index (r=0.5, p=0.018). Using Shannon diversity and biofilm DNA as predictors of power, a regression model can be constructed (r=0.73, p<0.001). Ecological parameters such as the Shannon index are predictive of the electrogenic potential of microbial communities. Copyright © 2014 Elsevier Ltd. All rights reserved.
Covariate Selection for Multilevel Models with Missing Data
Marino, Miguel; Buxton, Orfeu M.; Li, Yi
2017-01-01
Missing covariate data hampers variable selection in multilevel regression settings. Current variable selection techniques for multiply-imputed data commonly address missingness in the predictors through list-wise deletion and stepwise-selection methods which are problematic. Moreover, most variable selection methods are developed for independent linear regression models and do not accommodate multilevel mixed effects regression models with incomplete covariate data. We develop a novel methodology that is able to perform covariate selection across multiply-imputed data for multilevel random effects models when missing data is present. Specifically, we propose to stack the multiply-imputed data sets from a multiple imputation procedure and to apply a group variable selection procedure through group lasso regularization to assess the overall impact of each predictor on the outcome across the imputed data sets. Simulations confirm the advantageous performance of the proposed method compared with the competing methods. We applied the method to reanalyze the Healthy Directions-Small Business cancer prevention study, which evaluated a behavioral intervention program targeting multiple risk-related behaviors in a working-class, multi-ethnic population. PMID:28239457
NASA Astrophysics Data System (ADS)
Zhao, Wei; Fan, Shaojia; Guo, Hai; Gao, Bo; Sun, Jiaren; Chen, Laiguo
2016-11-01
The quantile regression (QR) method has been increasingly introduced to atmospheric environmental studies to explore the non-linear relationship between local meteorological conditions and ozone mixing ratios. In this study, we applied QR for the first time, together with multiple linear regression (MLR), to analyze the dominant meteorological parameters influencing the mean, 10th percentile, 90th percentile and 99th percentile of maximum daily 8-h average (MDA8) ozone concentrations in 2000-2015 in Hong Kong. The dominance analysis (DA) was used to assess the relative importance of meteorological variables in the regression models. Results showed that the MLR models worked better at suburban and rural sites than at urban sites, and worked better in winter than in summer. QR models performed better in summer for 99th and 90th percentiles and performed better in autumn and winter for 10th percentile. And QR models also performed better in suburban and rural areas for 10th percentile. The top 3 dominant variables associated with MDA8 ozone concentrations, changing with seasons and regions, were frequently associated with the six meteorological parameters: boundary layer height, humidity, wind direction, surface solar radiation, total cloud cover and sea level pressure. Temperature rarely became a significant variable in any season, which could partly explain the peak of monthly average ozone concentrations in October in Hong Kong. And we found the effect of solar radiation would be enhanced during extremely ozone pollution episodes (i.e., the 99th percentile). Finally, meteorological effects on MDA8 ozone had no significant changes before and after the 2010 Asian Games.
Vajargah, Kianoush Fathi; Sadeghi-Bazargani, Homayoun; Mehdizadeh-Esfanjani, Robab; Savadi-Oskouei, Daryoush; Farhoudi, Mehdi
2012-01-01
The objective of the present study was to assess the comparable applicability of orthogonal projections to latent structures (OPLS) statistical model vs traditional linear regression in order to investigate the role of trans cranial doppler (TCD) sonography in predicting ischemic stroke prognosis. The study was conducted on 116 ischemic stroke patients admitted to a specialty neurology ward. The Unified Neurological Stroke Scale was used once for clinical evaluation on the first week of admission and again six months later. All data was primarily analyzed using simple linear regression and later considered for multivariate analysis using PLS/OPLS models through the SIMCA P+12 statistical software package. The linear regression analysis results used for the identification of TCD predictors of stroke prognosis were confirmed through the OPLS modeling technique. Moreover, in comparison to linear regression, the OPLS model appeared to have higher sensitivity in detecting the predictors of ischemic stroke prognosis and detected several more predictors. Applying the OPLS model made it possible to use both single TCD measures/indicators and arbitrarily dichotomized measures of TCD single vessel involvement as well as the overall TCD result. In conclusion, the authors recommend PLS/OPLS methods as complementary rather than alternative to the available classical regression models such as linear regression.
NASA Astrophysics Data System (ADS)
Liu, Zhaoxin; Zhao, Liaoying; Li, Xiaorun; Chen, Shuhan
2018-04-01
Owing to the limitation of spatial resolution of the imaging sensor and the variability of ground surfaces, mixed pixels are widesperead in hyperspectral imagery. The traditional subpixel mapping algorithms treat all mixed pixels as boundary-mixed pixels while ignoring the existence of linear subpixels. To solve this question, this paper proposed a new subpixel mapping method based on linear subpixel feature detection and object optimization. Firstly, the fraction value of each class is obtained by spectral unmixing. Secondly, the linear subpixel features are pre-determined based on the hyperspectral characteristics and the linear subpixel feature; the remaining mixed pixels are detected based on maximum linearization index analysis. The classes of linear subpixels are determined by using template matching method. Finally, the whole subpixel mapping results are iteratively optimized by binary particle swarm optimization algorithm. The performance of the proposed subpixel mapping method is evaluated via experiments based on simulated and real hyperspectral data sets. The experimental results demonstrate that the proposed method can improve the accuracy of subpixel mapping.
Quality of life in breast cancer patients--a quantile regression analysis.
Pourhoseingholi, Mohamad Amin; Safaee, Azadeh; Moghimi-Dehkordi, Bijan; Zeighami, Bahram; Faghihzadeh, Soghrat; Tabatabaee, Hamid Reza; Pourhoseingholi, Asma
2008-01-01
Quality of life study has an important role in health care especially in chronic diseases, in clinical judgment and in medical resources supplying. Statistical tools like linear regression are widely used to assess the predictors of quality of life. But when the response is not normal the results are misleading. The aim of this study is to determine the predictors of quality of life in breast cancer patients, using quantile regression model and compare to linear regression. A cross-sectional study conducted on 119 breast cancer patients that admitted and treated in chemotherapy ward of Namazi hospital in Shiraz. We used QLQ-C30 questionnaire to assessment quality of life in these patients. A quantile regression was employed to assess the assocciated factors and the results were compared to linear regression. All analysis carried out using SAS. The mean score for the global health status for breast cancer patients was 64.92+/-11.42. Linear regression showed that only grade of tumor, occupational status, menopausal status, financial difficulties and dyspnea were statistically significant. In spite of linear regression, financial difficulties were not significant in quantile regression analysis and dyspnea was only significant for first quartile. Also emotion functioning and duration of disease statistically predicted the QOL score in the third quartile. The results have demonstrated that using quantile regression leads to better interpretation and richer inference about predictors of the breast cancer patient quality of life.
Interpretation of commonly used statistical regression models.
Kasza, Jessica; Wolfe, Rory
2014-01-01
A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.
Elliott, Marc N; Cohea, Christopher W; Lehrman, William G; Goldstein, Elizabeth H; Cleary, Paul D; Giordano, Laura A; Beckett, Megan K; Zaslavsky, Alan M
2015-12-01
Measure HCAHPS improvement in hospitals participating in the second and fifth years of HCAHPS public reporting; determine whether change is greater for some hospital types. Surveys from 4,822,960 adult inpatients discharged July 2007-June 2008 or July 2010-June 2011 from 3,541 U.S. hospitals. Linear mixed-effect regression models with fixed effects for time, patient mix, and hospital characteristics (bedsize, ownership, Census division, teaching status, Critical Access status); random effects for hospitals and hospital-time interactions; fixed-effect interactions of hospital characteristics and patient characteristics (gender, health, education) with time predicted HCAHPS measures correcting for regression-to-the-mean biases. National probability sample of adult inpatients in any of four approved survey modes. HCAHPS scores increased by 2.8 percentage points from 2008 to 2011 in the most positive response category. Among the middle 95 percent of hospitals, changes ranged from a 5.1 percent decrease to a 10.2 percent gain overall. The greatest improvement was in for-profit and larger (200 or more beds) hospitals. Five years after HCAHPS public reporting began, meaningful improvement of patients' hospital care experiences continues, especially among initially low-scoring hospitals, reducing some gaps among hospitals. © Health Research and Educational Trust.
Alcohol Policy Comprehension, Compliance and Consequences Among Young Adult Restaurant Workers
Ames, Genevieve M.; Cunradi, Carol B.; Duke, Michael R.
2012-01-01
SUMMARY This study explores relationships between young adult restaurant employees' understanding and compliance with workplace alcohol control policies and consequences of alcohol policy violation. A mixed method analysis of 67 semi-structured interviews and 1,294 telephone surveys from restaurant chain employees found that alcohol policy details confused roughly a third of employees. Among current drinkers (n=1,093), multivariable linear regression analysis found that frequency of alcohol policy violation was positively associated with frequency of experiencing problems at work; perceived supervisor enforcement of alcohol policy was negatively associated with this outcome. Implications for preventing workplace alcohol-related problems include streamlining confusing alcohol policy guidelines. PMID:22984360
NASA Technical Reports Server (NTRS)
Cromwell, R. L.; Zanello, S. B.; Yarbough, P. O.; Ploutz-Snyder, R.; Taibbi, G.; Brewer, J. L.; Vizzeri, G.
2013-01-01
Mean IOP significantly increased while at 6deg HDT and returned towards pre-bed rest values upon leaving bed rest. While mean IOP increased during bed rest, it remained within the normal limits for subject safety. A diuretic shift and cardiovascular deconditioning occurs during in-bed rest, as expected. There was no demonstrable correlation between the largest change in IOP (pre/post) and cardiovascular measure changes (pre/post). Additional mixed effects linear regression modeling may reveal some subclinical physiological changes that might assist in describing the VIIP syndrome pathophysiology.
Gas detection by correlation spectroscopy employing a multimode diode laser.
Lou, Xiutao; Somesfalean, Gabriel; Zhang, Zhiguo
2008-05-01
A gas sensor based on the gas-correlation technique has been developed using a multimode diode laser (MDL) in a dual-beam detection scheme. Measurement of CO(2) mixed with CO as an interfering gas is successfully demonstrated using a 1570 nm tunable MDL. Despite overlapping absorption spectra and occasional mode hops, the interfering signals can be effectively excluded by a statistical procedure including correlation analysis and outlier identification. The gas concentration is retrieved from several pair-correlated signals by a linear-regression scheme, yielding a reliable and accurate measurement. This demonstrates the utility of the unsophisticated MDLs as novel light sources for gas detection applications.
Use of probabilistic weights to enhance linear regression myoelectric control
NASA Astrophysics Data System (ADS)
Smith, Lauren H.; Kuiken, Todd A.; Hargrove, Levi J.
2015-12-01
Objective. Clinically available prostheses for transradial amputees do not allow simultaneous myoelectric control of degrees of freedom (DOFs). Linear regression methods can provide simultaneous myoelectric control, but frequently also result in difficulty with isolating individual DOFs when desired. This study evaluated the potential of using probabilistic estimates of categories of gross prosthesis movement, which are commonly used in classification-based myoelectric control, to enhance linear regression myoelectric control. Approach. Gaussian models were fit to electromyogram (EMG) feature distributions for three movement classes at each DOF (no movement, or movement in either direction) and used to weight the output of linear regression models by the probability that the user intended the movement. Eight able-bodied and two transradial amputee subjects worked in a virtual Fitts’ law task to evaluate differences in controllability between linear regression and probability-weighted regression for an intramuscular EMG-based three-DOF wrist and hand system. Main results. Real-time and offline analyses in able-bodied subjects demonstrated that probability weighting improved performance during single-DOF tasks (p < 0.05) by preventing extraneous movement at additional DOFs. Similar results were seen in experiments with two transradial amputees. Though goodness-of-fit evaluations suggested that the EMG feature distributions showed some deviations from the Gaussian, equal-covariance assumptions used in this experiment, the assumptions were sufficiently met to provide improved performance compared to linear regression control. Significance. Use of probability weights can improve the ability to isolate individual during linear regression myoelectric control, while maintaining the ability to simultaneously control multiple DOFs.
Simplified large African carnivore density estimators from track indices.
Winterbach, Christiaan W; Ferreira, Sam M; Funston, Paul J; Somers, Michael J
2016-01-01
The range, population size and trend of large carnivores are important parameters to assess their status globally and to plan conservation strategies. One can use linear models to assess population size and trends of large carnivores from track-based surveys on suitable substrates. The conventional approach of a linear model with intercept may not intercept at zero, but may fit the data better than linear model through the origin. We assess whether a linear regression through the origin is more appropriate than a linear regression with intercept to model large African carnivore densities and track indices. We did simple linear regression with intercept analysis and simple linear regression through the origin and used the confidence interval for ß in the linear model y = αx + ß, Standard Error of Estimate, Mean Squares Residual and Akaike Information Criteria to evaluate the models. The Lion on Clay and Low Density on Sand models with intercept were not significant ( P > 0.05). The other four models with intercept and the six models thorough origin were all significant ( P < 0.05). The models using linear regression with intercept all included zero in the confidence interval for ß and the null hypothesis that ß = 0 could not be rejected. All models showed that the linear model through the origin provided a better fit than the linear model with intercept, as indicated by the Standard Error of Estimate and Mean Square Residuals. Akaike Information Criteria showed that linear models through the origin were better and that none of the linear models with intercept had substantial support. Our results showed that linear regression through the origin is justified over the more typical linear regression with intercept for all models we tested. A general model can be used to estimate large carnivore densities from track densities across species and study areas. The formula observed track density = 3.26 × carnivore density can be used to estimate densities of large African carnivores using track counts on sandy substrates in areas where carnivore densities are 0.27 carnivores/100 km 2 or higher. To improve the current models, we need independent data to validate the models and data to test for non-linear relationship between track indices and true density at low densities.
[From clinical judgment to linear regression model.
Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O
2013-01-01
When we think about mathematical models, such as linear regression model, we think that these terms are only used by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and has normal distribution. Stated in another way, the regression is used to predict a measure based on the knowledge of at least one other variable. Linear regression has as it's first objective to determine the slope or inclination of the regression line: Y = a + bx, where "a" is the intercept or regression constant and it is equivalent to "Y" value when "X" equals 0 and "b" (also called slope) indicates the increase or decrease that occurs when the variable "x" increases or decreases in one unit. In the regression line, "b" is called regression coefficient. The coefficient of determination (R 2 ) indicates the importance of independent variables in the outcome.
An R2 statistic for fixed effects in the linear mixed model.
Edwards, Lloyd J; Muller, Keith E; Wolfinger, Russell D; Qaqish, Bahjat F; Schabenberger, Oliver
2008-12-20
Statisticians most often use the linear mixed model to analyze Gaussian longitudinal data. The value and familiarity of the R(2) statistic in the linear univariate model naturally creates great interest in extending it to the linear mixed model. We define and describe how to compute a model R(2) statistic for the linear mixed model by using only a single model. The proposed R(2) statistic measures multivariate association between the repeated outcomes and the fixed effects in the linear mixed model. The R(2) statistic arises as a 1-1 function of an appropriate F statistic for testing all fixed effects (except typically the intercept) in a full model. The statistic compares the full model with a null model with all fixed effects deleted (except typically the intercept) while retaining exactly the same covariance structure. Furthermore, the R(2) statistic leads immediately to a natural definition of a partial R(2) statistic. A mixed model in which ethnicity gives a very small p-value as a longitudinal predictor of blood pressure (BP) compellingly illustrates the value of the statistic. In sharp contrast to the extreme p-value, a very small R(2) , a measure of statistical and scientific importance, indicates that ethnicity has an almost negligible association with the repeated BP outcomes for the study.
Convex set and linear mixing model
NASA Technical Reports Server (NTRS)
Xu, P.; Greeley, R.
1993-01-01
A major goal of optical remote sensing is to determine surface compositions of the earth and other planetary objects. For assessment of composition, single pixels in multi-spectral images usually record a mixture of the signals from various materials within the corresponding surface area. In this report, we introduce a closed and bounded convex set as a mathematical model for linear mixing. This model has a clear geometric implication because the closed and bounded convex set is a natural generalization of a triangle in n-space. The endmembers are extreme points of the convex set. Every point in the convex closure of the endmembers is a linear mixture of those endmembers, which is exactly how linear mixing is defined. With this model, some general criteria for selecting endmembers could be described. This model can lead to a better understanding of linear mixing models.
Wodschow, Kirstine; Hansen, Birgitte; Schullehner, Jörg; Ersbøll, Annette Kjær
2018-06-08
Concentrations and spatial variations of the four cations Na, K, Mg and Ca are known to some extent for groundwater and to a lesser extent for drinking water. Using Denmark as case, the purpose of this study was to analyze the spatial and temporal variations in the major cations in drinking water. The results will contribute to a better exposure estimation in future studies of the association between cations and diseases. Spatial and temporal variations and the association with aquifer types, were analyzed with spatial scan statistics, linear regression and a multilevel mixed-effects linear regression model. About 65,000 water samples of each cation (1980⁻2017) were included in the study. Results of mean concentrations were 31.4 mg/L, 3.5 mg/L, 12.1 mg/L and 84.5 mg/L for 1980⁻2017 for Na, K, Mg and Ca, respectively. An expected west-east trend in concentrations were confirmed, mainly explained by variations in aquifer types. The trend in concentration was stable for about 31⁻45% of the public water supply areas. It is therefore recommended that the exposure estimate in future health related studies not only be based on a single mean value, but that temporal and spatial variations should also be included.
Amador, Julia; Hartel, Rich; Rankin, Scott
2017-08-01
The purpose of this work was to investigate iciness perception and other sensory textural attributes of ice cream due to ice and fat structures and mix viscosity. Two studies were carried out varying processing conditions and mix formulation. In the 1st study, ice creams were collected at -3, -5, and -7.5 °C draw temperatures. These ice creams contained 0%, 0.1%, or 0.2% emulsifier, an 80:20 blend of mono- and diglycerides: polysorbate 80. In the 2nd study, ice creams were collected at -3 °C draw temperature and contained 0%, 0.2%, or 0.4% stabilizer, a blend of guar gum, locust bean gum, and carrageenan. Multiple linear regressions were used to determine relationships between ice crystal size, destabilized fat, and sensory iciness. In the ice and fat structure study, an inverse correlation was found between fat destabilization and sensory iciness. Ice creams with no difference in ice crystal size were perceived to be less icy with increasing amounts of destabilized fat. Destabilized fat correlated inversely with drip-through rate and sensory greasiness. In the ice cream mix viscosity study, an inverse correlation was found between mix viscosity and sensory iciness. Ice creams with no difference in ice crystal size were perceived to be less icy when formulated with higher mix viscosity. A positive correlation was found between mix viscosity and sensory greasiness. These results indicate that fat structures and mix viscosity have significant effects on ice cream microstructure and sensory texture including the reduction of iciness perception. © 2017 Institute of Food Technologists®.
Hemmila, April; McGill, Jim; Ritter, David
2008-03-01
To determine if changes in fingerprint infrared spectra linear with age can be found, partial least squares (PLS1) regression of 155 fingerprint infrared spectra against the person's age was constructed. The regression produced a linear model of age as a function of spectrum with a root mean square error of calibration of less than 4 years, showing an inflection at about 25 years of age. The spectral ranges emphasized by the regression do not correspond to the highest concentration constituents of the fingerprints. Separate linear regression models for old and young people can be constructed with even more statistical rigor. The success of the regression demonstrates that a combination of constituents can be found that changes linearly with age, with a significant shift around puberty.
Gimelfarb, A.; Willis, J. H.
1994-01-01
An experiment was conducted to investigate the offspring-parent regression for three quantitative traits (weight, abdominal bristles and wing length) in Drosophila melanogaster. Linear and polynomial models were fitted for the regressions of a character in offspring on both parents. It is demonstrated that responses by the characters to selection predicted by the nonlinear regressions may differ substantially from those predicted by the linear regressions. This is true even, and especially, if selection is weak. The realized heritability for a character under selection is shown to be determined not only by the offspring-parent regression but also by the distribution of the character and by the form and strength of selection. PMID:7828818
Linear and nonlinear regression techniques for simultaneous and proportional myoelectric control.
Hahne, J M; Biessmann, F; Jiang, N; Rehbaum, H; Farina, D; Meinecke, F C; Muller, K-R; Parra, L C
2014-03-01
In recent years the number of active controllable joints in electrically powered hand-prostheses has increased significantly. However, the control strategies for these devices in current clinical use are inadequate as they require separate and sequential control of each degree-of-freedom (DoF). In this study we systematically compare linear and nonlinear regression techniques for an independent, simultaneous and proportional myoelectric control of wrist movements with two DoF. These techniques include linear regression, mixture of linear experts (ME), multilayer-perceptron, and kernel ridge regression (KRR). They are investigated offline with electro-myographic signals acquired from ten able-bodied subjects and one person with congenital upper limb deficiency. The control accuracy is reported as a function of the number of electrodes and the amount and diversity of training data providing guidance for the requirements in clinical practice. The results showed that KRR, a nonparametric statistical learning method, outperformed the other methods. However, simple transformations in the feature space could linearize the problem, so that linear models could achieve similar performance as KRR at much lower computational costs. Especially ME, a physiologically inspired extension of linear regression represents a promising candidate for the next generation of prosthetic devices.
Unitary Response Regression Models
ERIC Educational Resources Information Center
Lipovetsky, S.
2007-01-01
The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…
An Expert System for the Evaluation of Cost Models
1990-09-01
contrast to the condition of equal error variance, called homoscedasticity. (Reference: Applied Linear Regression Models by John Neter - page 423...normal. (Reference: Applied Linear Regression Models by John Neter - page 125) Click Here to continue -> Autocorrelation Click Here for the index - Index...over time. Error terms correlated over time are said to be autocorrelated or serially correlated. (REFERENCE: Applied Linear Regression Models by John
Yoneoka, Daisuke; Henmi, Masayuki
2017-11-30
Recently, the number of clinical prediction models sharing the same regression task has increased in the medical literature. However, evidence synthesis methodologies that use the results of these regression models have not been sufficiently studied, particularly in meta-analysis settings where only regression coefficients are available. One of the difficulties lies in the differences between the categorization schemes of continuous covariates across different studies. In general, categorization methods using cutoff values are study specific across available models, even if they focus on the same covariates of interest. Differences in the categorization of covariates could lead to serious bias in the estimated regression coefficients and thus in subsequent syntheses. To tackle this issue, we developed synthesis methods for linear regression models with different categorization schemes of covariates. A 2-step approach to aggregate the regression coefficient estimates is proposed. The first step is to estimate the joint distribution of covariates by introducing a latent sampling distribution, which uses one set of individual participant data to estimate the marginal distribution of covariates with categorization. The second step is to use a nonlinear mixed-effects model with correction terms for the bias due to categorization to estimate the overall regression coefficients. Especially in terms of precision, numerical simulations show that our approach outperforms conventional methods, which only use studies with common covariates or ignore the differences between categorization schemes. The method developed in this study is also applied to a series of WHO epidemiologic studies on white blood cell counts. Copyright © 2017 John Wiley & Sons, Ltd.
Li, Yuan-Xiao; Zhao, Guang-Yong
2007-04-01
The objective of the present experiment was to study the relationship between in vitro utilizable true protein (uTP) and in vivo-uTP of sheep rations by regression analysis. A further aim was to analyse if in vivo-uTP of mixed rations could be predicted by regression analysis between in vitro-uTP and in vivo-uTP, using N-retention of sheep as important evaluation criteria of protein value. Three adult male sheep (body weight [BW] 46 + 1.3 kg) fitted with rumen cannulas and simple T-type duodenal cannulas were fed with twelve typical rations with graded levels of crude protein and true protein in four experiments according a 3 x 3 Latin square design. Each experimental period included an adaptation (7 days), a N balance trial (4 days) and a collection of duodenal digesta (3 days). During collection of duodenal digesta, polyethylene glycol and chromium oxide were used as dual markers for the measurement of duodenal digesta flow and calculation of the in vivo-uTP of duodenal digesta. The in vitro-uTP of the rations was determined using the in vitro incubation technique of Zhao and Lebzien (2000). It was found that both in vitro-uTP intake and in vivo-uTP intake were significantly correlated with N-retention (p < 0.001) and that there was a significant linear relationship between the content of in vitro-uTP and in vivo-uTP in rations (p < 0.001). Therefore, it was concluded that the used in vitro incubation technique is suitable for the determination of in vitro-uTP of mixed rations for sheep, and that the amount of in vivo-uTP can be predicted by regression between in vitro-uTP and in vivo-uTP.
Compound Identification Using Penalized Linear Regression on Metabolomics
Liu, Ruiqi; Wu, Dongfeng; Zhang, Xiang; Kim, Seongho
2014-01-01
Compound identification is often achieved by matching the experimental mass spectra to the mass spectra stored in a reference library based on mass spectral similarity. Because the number of compounds in the reference library is much larger than the range of mass-to-charge ratio (m/z) values so that the data become high dimensional data suffering from singularity. For this reason, penalized linear regressions such as ridge regression and the lasso are used instead of the ordinary least squares regression. Furthermore, two-step approaches using the dot product and Pearson’s correlation along with the penalized linear regression are proposed in this study. PMID:27212894
Young, Vicki L.; Cole, Amy; Lecky, Donna M.; Fettis, Dennis; Pritchard, Beth; Verlander, Neville Q.; Eley, Charlotte V.; McNulty, Cliodna A. M.
2017-01-01
Abstract Background: Delivering health topics in schools through peer education is known to be beneficial for all students involved. In this study, we have evaluated a peer-education workshop that aims to educate primary and secondary school students on hygiene, the spread of infection and antibiotics. Methods: Four schools in south-west England, in a range of localities, took part in peer-education workshops, with students completing before, after and knowledge-retention questionnaires. Mixed-effect logistic regression and mixed-effect linear regression were used to analyse the data. Data were analysed by topic, region and peer/non-peer-educator status. Qualitative interviews and focus groups with students and educators were conducted to assess changes in participants’ skills, confidence and behaviour. Results: Qualitative data indicated improvements in peer-educator skills and behaviour, including confidence, team-working and communication. There was a significant improvement in knowledge for all topics covered in the intervention, although this varied by region. In the antibiotics topic, peer-educators’ knowledge increased in the retention questionnaire, whereas non-peer-educators’ knowledge decreased. Knowledge declined in the retention questionnaires for the other topics, although this was mostly not significant. Conclusions: This study indicates that peer education is an effective way to educate young people on important topics around health and hygiene, and to concurrently improve communication skills. Its use should be encouraged across schools to help in the implementation of the National Institute for Health and Care Excellence (NICE) guidance that recommends children are taught in an age-appropriate manner about hygiene and antibiotics. PMID:28333334
Srinivas, Nuggehally R
2016-01-01
Although an optimized delivery of rivastigmine for management of Alzhiemer disease (AD) is provided by the transdermal patch, it is critical to establish a limited sampling strategy for the measurement of exposure of rivastigmine/NAP226-90. The relationship Cmax versus AUC0-24h for rivastigmine/NAP226-90 was established by regression models. The derived regression equations enabled the prediction AUC0-24h for rivastigmine and NAP226-90. Models were evaluated using statistical criteria. Mixed model was used to predict AUC0-24h for rivastigmine/NAP226-90 from time points such as 8 (C8h), 12 (C12h), and 18 (C18h) hours. Excellent correlation was established for between Cmax and AUC0-24h for rivastigmine and NAP226-90. AUC0-24h predictions of either rivastigmine or NAP226-90 were within 0.8- to 1.25-fold difference. The RMSE in the AUC0-24h predictions ranged from 17.6% to 21.9%, and the R for prediction were greater than 0.96 for both rivastigmine and NAP226-90. Use of mixed model for C8h, C12h, and C18h resulted in AUC0-24h within 1.5-fold difference for rivastigmine or NAP226-90. Cmax of rivastigmine and NAP226-90 was highly correlated with the corresponding AUC0-24h values confirming the role of a time point closer to Cmax for an effective AUC measurement of rivastigmine or the metabolite.
Breivik, Cathrine Nansdal; Nilsen, Roy Miodini; Myrseth, Erling; Pedersen, Paal Henning; Varughese, Jobin K; Chaudhry, Aqeel Asghar; Lund-Johansen, Morten
2013-07-01
There are few reports about the course of vestibular schwannoma (VS) patients following gamma knife radiosurgery (GKRS) compared with the course following conservative management (CM). In this study, we present prospectively collected data of 237 patients with unilateral VS extending outside the internal acoustic canal who received either GKRS (113) or CM (124). The aim was to measure the effect of GKRS compared with the natural course on tumor growth rate and hearing loss. Secondary end points were postinclusion additional treatment, quality of life (QoL), and symptom development. The patients underwent magnetic resonance imaging scans, clinical examination, and QoL assessment by SF-36 questionnaire. Statistics were performed by using Spearman correlation coefficient, Kaplan-Meier plot, Poisson regression model, mixed linear regression models, and mixed logistic regression models. Mean follow-up time was 55.0 months (26.1 standard deviation, range 10-132). Thirteen patients were lost to follow-up. Serviceable hearing was lost in 54 of 71 (76%) (CM) and 34 of 53 (64%) (GKRS) patients during the study period (not significant, log-rank test). There was a significant reduction in tumor volume over time in the GKRS group. The need for treatment following initial GKRS or CM differed at highly significant levels (log-rank test, P < .001). Symptom and QoL development did not differ significantly between the groups. In VS patients, GKRS reduces the tumor growth rate and thereby the incidence rate of new treatment about tenfold. Hearing is lost at similar rates in both groups. Symptoms and QoL seem not to be significantly affected by GKRS.
Control Variate Selection for Multiresponse Simulation.
1987-05-01
M. H. Knuter, Applied Linear Regression Mfodels, Richard D. Erwin, Inc., Homewood, Illinois, 1983. Neuts, Marcel F., Probability, Allyn and Bacon...1982. Neter, J., V. Wasserman, and M. H. Knuter, Applied Linear Regression .fodels, Richard D. Erwin, Inc., Homewood, Illinois, 1983. Neuts, Marcel F...Aspects of J%,ultivariate Statistical Theory, John Wiley and Sons, New York, New York, 1982. dY Neter, J., W. Wasserman, and M. H. Knuter, Applied Linear Regression Mfodels
ERIC Educational Resources Information Center
Kobrin, Jennifer L.; Sinharay, Sandip; Haberman, Shelby J.; Chajewski, Michael
2011-01-01
This study examined the adequacy of a multiple linear regression model for predicting first-year college grade point average (FYGPA) using SAT[R] scores and high school grade point average (HSGPA). A variety of techniques, both graphical and statistical, were used to examine if it is possible to improve on the linear regression model. The results…
Evaluation of parameters in mixed male DNA profiles for the Identifiler® multiplex system
HU, NA; CONG, BIN; GAO, TAO; HU, RONG; CHEN, YI; TANG, HUI; XUE, LUYAN; LI, SHUJIN; MA, CHUNLING
2014-01-01
The analysis of complex DNA mixtures is challenging for forensic DNA testing. Accurate and sensitive methods for profiling these samples are urgently required. In this study, we developed 11 groups of mixed male DNA samples (n=297) with scientific validation of D-value [>95% of D-values ≤0.1 with average peak height (APH) of the active alleles ≤2,500 rfu]. A strong linear correlation was detected between the peak height (PH) and peak area (PA) in the curve fit using the least squares method (P<2e-16). The Kruskal-Wallis rank-sum test revealed significant differences in the heterozygote balance ratio (Hb) at 16 short tandem repeat (STR) loci (P=0.0063) and 9 mixed gradients (P=0.02257). Locally weighted regression fitting of APH and Hb (inflection point at APH = 1,250 rfu) showed 92.74% of Hb >0.6 with the APH ≥1,250. The variation of Hb distribution in the different STR loci suggested the different forensic efficiencies of these loci. Allelic drop-out (ADO) correlated with the APH and mixed gradient. All ADOs had an APH of <1,000 rfu, and the number of ADO increased when the APH of mixed DNA profiles gradually decreased. These results strongly suggest that calibration parameters should be introduced to correct the deviation in the APH at each STR locus during the analysis of mixed DNA samples. PMID:24821391
High correlations between MRI brain volume measurements based on NeuroQuant® and FreeSurfer.
Ross, David E; Ochs, Alfred L; Tate, David F; Tokac, Umit; Seabaugh, John; Abildskov, Tracy J; Bigler, Erin D
2018-05-30
NeuroQuant ® (NQ) and FreeSurfer (FS) are commonly used computer-automated programs for measuring MRI brain volume. Previously they were reported to have high intermethod reliabilities but often large intermethod effect size differences. We hypothesized that linear transformations could be used to reduce the large effect sizes. This study was an extension of our previously reported study. We performed NQ and FS brain volume measurements on 60 subjects (including normal controls, patients with traumatic brain injury, and patients with Alzheimer's disease). We used two statistical approaches in parallel to develop methods for transforming FS volumes into NQ volumes: traditional linear regression, and Bayesian linear regression. For both methods, we used regression analyses to develop linear transformations of the FS volumes to make them more similar to the NQ volumes. The FS-to-NQ transformations based on traditional linear regression resulted in effect sizes which were small to moderate. The transformations based on Bayesian linear regression resulted in all effect sizes being trivially small. To our knowledge, this is the first report describing a method for transforming FS to NQ data so as to achieve high reliability and low effect size differences. Machine learning methods like Bayesian regression may be more useful than traditional methods. Copyright © 2018 Elsevier B.V. All rights reserved.
Determination of stress intensity factors for interface cracks under mixed-mode loading
NASA Technical Reports Server (NTRS)
Naik, Rajiv A.; Crews, John H., Jr.
1992-01-01
A simple technique was developed using conventional finite element analysis to determine stress intensity factors, K1 and K2, for interface cracks under mixed-mode loading. This technique involves the calculation of crack tip stresses using non-singular finite elements. These stresses are then combined and used in a linear regression procedure to calculate K1 and K2. The technique was demonstrated by calculating three different bimaterial combinations. For the normal loading case, the K's were within 2.6 percent of an exact solution. The normalized K's under shear loading were shown to be related to the normalized K's under normal loading. Based on these relations, a simple equation was derived for calculating K1 and K2 for mixed-mode loading from knowledge of the K's under normal loading. The equation was verified by computing the K's for a mixed-mode case with equal and normal shear loading. The correlation between exact and finite element solutions is within 3.7 percent. This study provides a simple procedure to compute K2/K1 ratio which has been used to characterize the stress state at the crack tip for various combinations of materials and loadings. Tests conducted over a range of K2/K1 ratios could be used to fully characterize interface fracture toughness.
Quantile Regression in the Study of Developmental Sciences
Petscher, Yaacov; Logan, Jessica A. R.
2014-01-01
Linear regression analysis is one of the most common techniques applied in developmental research, but only allows for an estimate of the average relations between the predictor(s) and the outcome. This study describes quantile regression, which provides estimates of the relations between the predictor(s) and outcome, but across multiple points of the outcome’s distribution. Using data from the High School and Beyond and U.S. Sustained Effects Study databases, quantile regression is demonstrated and contrasted with linear regression when considering models with: (a) one continuous predictor, (b) one dichotomous predictor, (c) a continuous and a dichotomous predictor, and (d) a longitudinal application. Results from each example exhibited the differential inferences which may be drawn using linear or quantile regression. PMID:24329596
Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L
2018-01-01
Aims A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R2), using R2 as the primary metric of assay agreement. However, the use of R2 alone does not adequately quantify constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. Methods We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing assays (NGS). NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Results Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. Conclusions The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. PMID:28747393
A SEMIPARAMETRIC BAYESIAN MODEL FOR CIRCULAR-LINEAR REGRESSION
We present a Bayesian approach to regress a circular variable on a linear predictor. The regression coefficients are assumed to have a nonparametric distribution with a Dirichlet process prior. The semiparametric Bayesian approach gives added flexibility to the model and is usefu...
Haem, Elham; Harling, Kajsa; Ayatollahi, Seyyed Mohammad Taghi; Zare, Najaf; Karlsson, Mats O
2017-02-01
One important aim in population pharmacokinetics (PK) and pharmacodynamics is identification and quantification of the relationships between the parameters and covariates. Lasso has been suggested as a technique for simultaneous estimation and covariate selection. In linear regression, it has been shown that Lasso possesses no oracle properties, which means it asymptotically performs as though the true underlying model was given in advance. Adaptive Lasso (ALasso) with appropriate initial weights is claimed to possess oracle properties; however, it can lead to poor predictive performance when there is multicollinearity between covariates. This simulation study implemented a new version of ALasso, called adjusted ALasso (AALasso), to take into account the ratio of the standard error of the maximum likelihood (ML) estimator to the ML coefficient as the initial weight in ALasso to deal with multicollinearity in non-linear mixed-effect models. The performance of AALasso was compared with that of ALasso and Lasso. PK data was simulated in four set-ups from a one-compartment bolus input model. Covariates were created by sampling from a multivariate standard normal distribution with no, low (0.2), moderate (0.5) or high (0.7) correlation. The true covariates influenced only clearance at different magnitudes. AALasso, ALasso and Lasso were compared in terms of mean absolute prediction error and error of the estimated covariate coefficient. The results show that AALasso performed better in small data sets, even in those in which a high correlation existed between covariates. This makes AALasso a promising method for covariate selection in nonlinear mixed-effect models.
Kumar, K Vasanth; Sivanesan, S
2006-08-25
Pseudo second order kinetic expressions of Ho, Sobkowsk and Czerwinski, Blanachard et al. and Ritchie were fitted to the experimental kinetic data of malachite green onto activated carbon by non-linear and linear method. Non-linear method was found to be a better way of obtaining the parameters involved in the second order rate kinetic expressions. Both linear and non-linear regression showed that the Sobkowsk and Czerwinski and Ritchie's pseudo second order model were the same. Non-linear regression analysis showed that both Blanachard et al. and Ho have similar ideas on the pseudo second order model but with different assumptions. The best fit of experimental data in Ho's pseudo second order expression by linear and non-linear regression method showed that Ho pseudo second order model was a better kinetic expression when compared to other pseudo second order kinetic expressions. The amount of dye adsorbed at equilibrium, q(e), was predicted from Ho pseudo second order expression and were fitted to the Langmuir, Freundlich and Redlich Peterson expressions by both linear and non-linear method to obtain the pseudo isotherms. The best fitting pseudo isotherm was found to be the Langmuir and Redlich Peterson isotherm. Redlich Peterson is a special case of Langmuir when the constant g equals unity.
Shirk, Andrew J; Landguth, Erin L; Cushman, Samuel A
2018-01-01
Anthropogenic migration barriers fragment many populations and limit the ability of species to respond to climate-induced biome shifts. Conservation actions designed to conserve habitat connectivity and mitigate barriers are needed to unite fragmented populations into larger, more viable metapopulations, and to allow species to track their climate envelope over time. Landscape genetic analysis provides an empirical means to infer landscape factors influencing gene flow and thereby inform such conservation actions. However, there are currently many methods available for model selection in landscape genetics, and considerable uncertainty as to which provide the greatest accuracy in identifying the true landscape model influencing gene flow among competing alternative hypotheses. In this study, we used population genetic simulations to evaluate the performance of seven regression-based model selection methods on a broad array of landscapes that varied by the number and type of variables contributing to resistance, the magnitude and cohesion of resistance, as well as the functional relationship between variables and resistance. We also assessed the effect of transformations designed to linearize the relationship between genetic and landscape distances. We found that linear mixed effects models had the highest accuracy in every way we evaluated model performance; however, other methods also performed well in many circumstances, particularly when landscape resistance was high and the correlation among competing hypotheses was limited. Our results provide guidance for which regression-based model selection methods provide the most accurate inferences in landscape genetic analysis and thereby best inform connectivity conservation actions. Published 2017. This article is a U.S. Government work and is in the public domain in the USA.
Queiroz, Valterlinda A O; Assis, Ana Marlúcia O; Pinheiro, Sandra Maria C; Ribeiro, Hugo C Ribeiro
2012-01-01
To investigate covariates that could affect the variation in mean length/age z scores in the first year of life of children born full term with normal birth weight. This was a prospective study of a cohort of mother-infant pairs recruited at public maternity units in two municipalities in the Brazilian state of Bahia, from March 2005 to October 2006. This paper reports the results for linear growth of 489 children who were followed-up for the first 12 months of their lives. A mixed-effect regression model was used to investigate the influence of covariates of mean length/age z score during the first year of life. The multivariate mixed effect analysis indicated that mothers not cohabiting with a partner (beta = 0.2347; p = 0.004) and increased duration of exclusive breastfeeding (beta = 0.0031; p < 0.001) had a positive impact, whereas mother's height less than 150 cm (beta = -0.4393; p < 0.001), birth weight of 2,500-2,999 g (beta = -0.8084; p < 0.001) and anemia in the child (beta = -0.0875; p < 0.001) all had a negative impact on the variation in estimated length/age z score. Therefore, the results of this study indicate that short maternal stature, birth weight < 3,000 g and anemia in the infant had a negative effect on linear growth during the first year of life, whereas longer duration of exclusive breastfeeding and mothers who did not cohabit with a partner had a positive effect.
2015-07-15
Long-term effects on cancer survivors’ quality of life of physical training versus physical training combined with cognitive-behavioral therapy ...COMPARISON OF NEURAL NETWORK AND LINEAR REGRESSION MODELS IN STATISTICALLY PREDICTING MENTAL AND PHYSICAL HEALTH STATUS OF BREAST...34Comparison of Neural Network and Linear Regression Models in Statistically Predicting Mental and Physical Health Status of Breast Cancer Survivors
Prediction of the Main Engine Power of a New Container Ship at the Preliminary Design Stage
NASA Astrophysics Data System (ADS)
Cepowski, Tomasz
2017-06-01
The paper presents mathematical relationships that allow us to forecast the estimated main engine power of new container ships, based on data concerning vessels built in 2005-2015. The presented approximations allow us to estimate the engine power based on the length between perpendiculars and the number of containers the ship will carry. The approximations were developed using simple linear regression and multivariate linear regression analysis. The presented relations have practical application for estimation of container ship engine power needed in preliminary parametric design of the ship. It follows from the above that the use of multiple linear regression to predict the main engine power of a container ship brings more accurate solutions than simple linear regression.
ERIC Educational Resources Information Center
Li, Deping; Oranje, Andreas
2007-01-01
Two versions of a general method for approximating standard error of regression effect estimates within an IRT-based latent regression model are compared. The general method is based on Binder's (1983) approach, accounting for complex samples and finite populations by Taylor series linearization. In contrast, the current National Assessment of…
Ernst, Anja F; Albers, Casper J
2017-01-01
Misconceptions about the assumptions behind the standard linear regression model are widespread and dangerous. These lead to using linear regression when inappropriate, and to employing alternative procedures with less statistical power when unnecessary. Our systematic literature review investigated employment and reporting of assumption checks in twelve clinical psychology journals. Findings indicate that normality of the variables themselves, rather than of the errors, was wrongfully held for a necessary assumption in 4% of papers that use regression. Furthermore, 92% of all papers using linear regression were unclear about their assumption checks, violating APA-recommendations. This paper appeals for a heightened awareness for and increased transparency in the reporting of statistical assumption checking.
Ernst, Anja F.
2017-01-01
Misconceptions about the assumptions behind the standard linear regression model are widespread and dangerous. These lead to using linear regression when inappropriate, and to employing alternative procedures with less statistical power when unnecessary. Our systematic literature review investigated employment and reporting of assumption checks in twelve clinical psychology journals. Findings indicate that normality of the variables themselves, rather than of the errors, was wrongfully held for a necessary assumption in 4% of papers that use regression. Furthermore, 92% of all papers using linear regression were unclear about their assumption checks, violating APA-recommendations. This paper appeals for a heightened awareness for and increased transparency in the reporting of statistical assumption checking. PMID:28533971
Estimating linear temporal trends from aggregated environmental monitoring data
Erickson, Richard A.; Gray, Brian R.; Eager, Eric A.
2017-01-01
Trend estimates are often used as part of environmental monitoring programs. These trends inform managers (e.g., are desired species increasing or undesired species decreasing?). Data collected from environmental monitoring programs is often aggregated (i.e., averaged), which confounds sampling and process variation. State-space models allow sampling variation and process variations to be separated. We used simulated time-series to compare linear trend estimations from three state-space models, a simple linear regression model, and an auto-regressive model. We also compared the performance of these five models to estimate trends from a long term monitoring program. We specifically estimated trends for two species of fish and four species of aquatic vegetation from the Upper Mississippi River system. We found that the simple linear regression had the best performance of all the given models because it was best able to recover parameters and had consistent numerical convergence. Conversely, the simple linear regression did the worst job estimating populations in a given year. The state-space models did not estimate trends well, but estimated population sizes best when the models converged. We found that a simple linear regression performed better than more complex autoregression and state-space models when used to analyze aggregated environmental monitoring data.
Jaworski, N W; Liu, D W; Li, D F; Stein, H H
2016-07-01
An experiment was conducted to determine effects on DE, ME, and NE for growing pigs of adding 15 or 30% wheat bran to a corn-soybean meal diet and to compare values for DE, ME, and NE calculated using the difference procedure with values obtained using linear regression. Eighteen barrows (54.4 ± 4.3 kg initial BW) were individually housed in metabolism crates. The experiment had 3 diets and 6 replicate pigs per diet. The control diet contained corn, soybean meal, and no wheat bran. Two additional diets were formulated by mixing 15 or 30% wheat bran with 85 or 70% of the control diet, respectively. The experimental period lasted 15 d. During the initial 7 d, pigs were adapted to their experimental diets and housed in metabolism crates and fed 573 kcal ME/kg BW per day. On d 8, metabolism crates with the pigs were moved into open-circuit respiration chambers for measurement of O consumption and CO and CH production. The feeding level was the same as in the adaptation period, and feces and urine were collected during this period. On d 13 and 14, pigs were fed 225 kcal ME/kg BW per day, and pigs were then fasted for 24 h to obtain fasting heat production. Results of the experiment indicated that the apparent total tract digestibility of DM, GE, crude fiber, ADF, and NDF linearly decreased ( ≤ 0.05) as wheat bran inclusion increased in the diets. The daily O consumption and CO and CH production by pigs fed increasing concentrations of wheat bran linearly decreased ( ≤ 0.05), resulting in a linear decrease ( ≤ 0.05) in heat production. The DE (3,454, 3,257, and 3,161 kcal/kg for diets containing 0, 15, and 30% wheat bran, respectively for diets containing 0, 15, and 30% wheat bran, respectively), ME (3,400, 3,209, and 3,091 kcal/kg for diets containing 0, 15, and 30% wheat bran, respectively), and NE (1,808, 1,575, and 1,458 kcal/kg for diets containing 0, 15, and 30% wheat bran, respectively) of diets decreased (linear, ≤ 0.05) as wheat bran inclusion increased. The DE, ME, and NE of wheat bran determined using the difference procedure were 2,168, 2,117, and 896 kcal/kg, respectively, and these values were within the 95% confidence interval of the DE (2,285 kcal/kg), ME (2,217 kcal/kg), and NE (961 kcal/kg) estimated by linear regression. In conclusion, increasing the inclusion of wheat bran in a corn-soybean meal based diet reduced energy and nutrient digestibility and heat production as well as DE, ME, and NE of diets, but values for DE, ME, and NE for wheat bran determined using the difference procedure were not different from values determined using linear regression.
Comparing The Effectiveness of a90/95 Calculations (Preprint)
2006-09-01
Nachtsheim, John Neter, William Li, Applied Linear Statistical Models , 5th ed., McGraw-Hill/Irwin, 2005 5. Mood, Graybill and Boes, Introduction...curves is based on methods that are only valid for ordinary linear regression. Requirements for a valid Ordinary Least-Squares Regression Model There... linear . For example is a linear model ; is not. 2. Uniform variance (homoscedasticity
Ding, Chuan; Chen, Peng; Jiao, Junfeng
2018-03-01
Although a growing body of literature focuses on the relationship between the built environment and pedestrian crashes, limited evidence is provided about the relative importance of many built environment attributes by accounting for their mutual interaction effects and their non-linear effects on automobile-involved pedestrian crashes. This study adopts the approach of Multiple Additive Poisson Regression Trees (MAPRT) to fill such gaps using pedestrian collision data collected from Seattle, Washington. Traffic analysis zones are chosen as the analytical unit. The effects of various factors on pedestrian crash frequency investigated include characteristics the of road network, street elements, land use patterns, and traffic demand. Density and the degree of mixed land use have major effects on pedestrian crash frequency, accounting for approximately 66% of the effects in total. More importantly, some factors show clear non-linear relationships with pedestrian crash frequency, challenging the linearity assumption commonly used in existing studies which employ statistical models. With various accurately identified non-linear relationships between the built environment and pedestrian crashes, this study suggests local agencies to adopt geo-spatial differentiated policies to establish a safe walking environment. These findings, especially the effective ranges of the built environment, provide evidence to support for transport and land use planning, policy recommendations, and road safety programs. Copyright © 2018 Elsevier Ltd. All rights reserved.
Performance characteristics of LOX-H2, tangential-entry, swirl-coaxial, rocket injectors
NASA Technical Reports Server (NTRS)
Howell, Doug; Petersen, Eric; Clark, Jim
1993-01-01
Development of a high performing swirl-coaxial injector requires an understanding of fundamental performance characteristics. This paper addresses the findings of studies on cold flow atomic characterizations which provided information on the influence of fluid properties and element operating conditions on the produced droplet sprays. These findings are applied to actual rocket conditions. The performance characteristics of swirl-coaxial injection elements under multi-element hot-fire conditions were obtained by analysis of combustion performance data from three separate test series. The injection elements are described and test results are analyzed using multi-variable linear regression. A direct comparison of test results indicated that reduced fuel injection velocity improved injection element performance through improved propellant mixing.
Meyer, Ilan H.; Schwartz, Sharon; Frost, David M.
2008-01-01
Despite its centrality to social stress theory, research on the social patterning of stress exposure and coping resources has been sparse and existing research shows conflicting results. We interviewed 396 gay, lesbian and bisexual, and 128 heterosexual people in New York City to examine variability in exposure to stress related to sexual orientation, gender, and race/ethnicity. Multiple linear regression showed clear support for the social stress hypothesis with regard to race/ethnic minority status, somewhat mixed support with regard to sexual orientation, and no support with regard to gender. We discuss this lack of parsimony in social stress explanations for health disparities. PMID:18433961
Correlation and simple linear regression.
Zou, Kelly H; Tuncali, Kemal; Silverman, Stuart G
2003-06-01
In this tutorial article, the concepts of correlation and regression are reviewed and demonstrated. The authors review and compare two correlation coefficients, the Pearson correlation coefficient and the Spearman rho, for measuring linear and nonlinear relationships between two continuous variables. In the case of measuring the linear relationship between a predictor and an outcome variable, simple linear regression analysis is conducted. These statistical concepts are illustrated by using a data set from published literature to assess a computed tomography-guided interventional technique. These statistical methods are important for exploring the relationships between variables and can be applied to many radiologic studies.
Hospital support services and the impacts of outsourcing on occupational health and safety.
Siganporia, Pearl; Astrakianakis, George; Alamgir, Hasanat; Ostry, Aleck; Nicol, Anne-Marie; Koehoorn, Mieke
2016-10-01
Outsourcing labor is linked to negative impacts on occupational health and safety (OHS). In British Columbia, Canada, provincial health care service providers outsource support services such as cleaners and food service workers (CFSWs) to external contractors. This study investigates the impact of outsourcing on the occupational health safety of hospital CFSWs through a mixed methods approach. Worker's compensation data for hospital CFSWs were analyzed by negative binomial and multiple linear regressions supplemented by iterative thematic analysis of telephone interviews of the same job groups. Non-significant decreases in injury rates and days lost per injury were observed in outsourced CFSWs post outsourcing. Significant decreases (P < 0.05) were observed in average costs per injury for cleaners post outsourcing. Outsourced workers interviewed implied instances of underreporting workplace injuries. This mixed methods study describes the impact of outsourcing on OHS of healthcare workers in British Columbia. Results will be helpful for policy-makers and workplace regulators to assess program effectiveness for outsourced workers.
Functional Additive Mixed Models
Scheipl, Fabian; Staicu, Ana-Maria; Greven, Sonja
2014-01-01
We propose an extensive framework for additive regression models for correlated functional responses, allowing for multiple partially nested or crossed functional random effects with flexible correlation structures for, e.g., spatial, temporal, or longitudinal functional data. Additionally, our framework includes linear and nonlinear effects of functional and scalar covariates that may vary smoothly over the index of the functional response. It accommodates densely or sparsely observed functional responses and predictors which may be observed with additional error and includes both spline-based and functional principal component-based terms. Estimation and inference in this framework is based on standard additive mixed models, allowing us to take advantage of established methods and robust, flexible algorithms. We provide easy-to-use open source software in the pffr() function for the R-package refund. Simulations show that the proposed method recovers relevant effects reliably, handles small sample sizes well and also scales to larger data sets. Applications with spatially and longitudinally observed functional data demonstrate the flexibility in modeling and interpretability of results of our approach. PMID:26347592
Functional Additive Mixed Models.
Scheipl, Fabian; Staicu, Ana-Maria; Greven, Sonja
2015-04-01
We propose an extensive framework for additive regression models for correlated functional responses, allowing for multiple partially nested or crossed functional random effects with flexible correlation structures for, e.g., spatial, temporal, or longitudinal functional data. Additionally, our framework includes linear and nonlinear effects of functional and scalar covariates that may vary smoothly over the index of the functional response. It accommodates densely or sparsely observed functional responses and predictors which may be observed with additional error and includes both spline-based and functional principal component-based terms. Estimation and inference in this framework is based on standard additive mixed models, allowing us to take advantage of established methods and robust, flexible algorithms. We provide easy-to-use open source software in the pffr() function for the R-package refund. Simulations show that the proposed method recovers relevant effects reliably, handles small sample sizes well and also scales to larger data sets. Applications with spatially and longitudinally observed functional data demonstrate the flexibility in modeling and interpretability of results of our approach.
Hospital support services and the impacts of outsourcing on occupational health and safety
Alamgir, Hasanat; Ostry, Aleck; Nicol, Anne-Marie; Koehoorn, Mieke
2016-01-01
Background Outsourcing labor is linked to negative impacts on occupational health and safety (OHS). In British Columbia, Canada, provincial health care service providers outsource support services such as cleaners and food service workers (CFSWs) to external contractors. Objectives This study investigates the impact of outsourcing on the occupational health safety of hospital CFSWs through a mixed methods approach. Methods Worker’s compensation data for hospital CFSWs were analyzed by negative binomial and multiple linear regressions supplemented by iterative thematic analysis of telephone interviews of the same job groups. Results Non-significant decreases in injury rates and days lost per injury were observed in outsourced CFSWs post outsourcing. Significant decreases (P < 0.05) were observed in average costs per injury for cleaners post outsourcing. Outsourced workers interviewed implied instances of underreporting workplace injuries. Conclusions This mixed methods study describes the impact of outsourcing on OHS of healthcare workers in British Columbia. Results will be helpful for policy-makers and workplace regulators to assess program effectiveness for outsourced workers. PMID:27696988
Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L
2018-02-01
A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R 2 ), using R 2 as the primary metric of assay agreement. However, the use of R 2 alone does not adequately quantify constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing assays (NGS). NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
2017-10-01
ENGINEERING CENTER GRAIN EVALUATION SOFTWARE TO NUMERICALLY PREDICT LINEAR BURN REGRESSION FOR SOLID PROPELLANT GRAIN GEOMETRIES Brian...author(s) and should not be construed as an official Department of the Army position, policy, or decision, unless so designated by other documentation...U.S. ARMY ARMAMENT RESEARCH, DEVELOPMENT AND ENGINEERING CENTER GRAIN EVALUATION SOFTWARE TO NUMERICALLY PREDICT LINEAR BURN REGRESSION FOR SOLID
Statistical Methodology for the Analysis of Repeated Duration Data in Behavioral Studies.
Letué, Frédérique; Martinez, Marie-José; Samson, Adeline; Vilain, Anne; Vilain, Coriandre
2018-03-15
Repeated duration data are frequently used in behavioral studies. Classical linear or log-linear mixed models are often inadequate to analyze such data, because they usually consist of nonnegative and skew-distributed variables. Therefore, we recommend use of a statistical methodology specific to duration data. We propose a methodology based on Cox mixed models and written under the R language. This semiparametric model is indeed flexible enough to fit duration data. To compare log-linear and Cox mixed models in terms of goodness-of-fit on real data sets, we also provide a procedure based on simulations and quantile-quantile plots. We present two examples from a data set of speech and gesture interactions, which illustrate the limitations of linear and log-linear mixed models, as compared to Cox models. The linear models are not validated on our data, whereas Cox models are. Moreover, in the second example, the Cox model exhibits a significant effect that the linear model does not. We provide methods to select the best-fitting models for repeated duration data and to compare statistical methodologies. In this study, we show that Cox models are best suited to the analysis of our data set.
Linear regression in astronomy. II
NASA Technical Reports Server (NTRS)
Feigelson, Eric D.; Babu, Gutti J.
1992-01-01
A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.
Unequal views of inequality: Cross-national support for redistribution 1985-2011.
VanHeuvelen, Tom
2017-05-01
This research examines public views on government responsibility to reduce income inequality, support for redistribution. While individual-level correlates of support for redistribution are relatively well understood, many questions remain at the country-level. Therefore, I examine how country-level characteristics affect aggregate support for redistribution. I test explanations of aggregate support using a unique dataset combining 18 waves of the International Social Survey Programme and European Social Survey. Results from mixed-effects logistic regression and fixed-effects linear regression models show two primary and contrasting effects. States that reduce inequality through bundles of tax and transfer policies are rewarded with more supportive publics. In contrast, economic development has a seemingly equivalent and dampening effect on public support. Importantly, the effect of economic development grows at higher levels of development, potentially overwhelming the amplifying effect of state redistribution. My results therefore suggest a fundamental challenge to proponents of egalitarian politics. Copyright © 2016 Elsevier Inc. All rights reserved.
Hanks, Ephraim M.; Schliep, Erin M.; Hooten, Mevin B.; Hoeting, Jennifer A.
2015-01-01
In spatial generalized linear mixed models (SGLMMs), covariates that are spatially smooth are often collinear with spatially smooth random effects. This phenomenon is known as spatial confounding and has been studied primarily in the case where the spatial support of the process being studied is discrete (e.g., areal spatial data). In this case, the most common approach suggested is restricted spatial regression (RSR) in which the spatial random effects are constrained to be orthogonal to the fixed effects. We consider spatial confounding and RSR in the geostatistical (continuous spatial support) setting. We show that RSR provides computational benefits relative to the confounded SGLMM, but that Bayesian credible intervals under RSR can be inappropriately narrow under model misspecification. We propose a posterior predictive approach to alleviating this potential problem and discuss the appropriateness of RSR in a variety of situations. We illustrate RSR and SGLMM approaches through simulation studies and an analysis of malaria frequencies in The Gambia, Africa.
A Constrained Linear Estimator for Multiple Regression
ERIC Educational Resources Information Center
Davis-Stober, Clintin P.; Dana, Jason; Budescu, David V.
2010-01-01
"Improper linear models" (see Dawes, Am. Psychol. 34:571-582, "1979"), such as equal weighting, have garnered interest as alternatives to standard regression models. We analyze the general circumstances under which these models perform well by recasting a class of "improper" linear models as "proper" statistical models with a single predictor. We…
The Relationship Between Oxygen Reserve Index and Arterial Partial Pressure of Oxygen During Surgery
Dorotta, Ihab L.; Wells, Briana; Juma, David; Applegate, Patricia M.
2016-01-01
BACKGROUND: The use of intraoperative pulse oximetry (Spo2) enhances hypoxia detection and is associated with fewer perioperative hypoxic events. However, Spo2 may be reported as 98% when arterial partial pressure of oxygen (Pao2) is as low as 70 mm Hg. Therefore, Spo2 may not provide advance warning of falling arterial oxygenation until Pao2 approaches this level. Multiwave pulse co-oximetry can provide a calculated oxygen reserve index (ORI) that may add to information from pulse oximetry when Spo2 is >98%. This study evaluates the ORI to Pao2 relationship during surgery. METHODS: We studied patients undergoing scheduled surgery in which arterial catheterization and intraoperative arterial blood gas analysis were planned. Data from multiple pulse co-oximetry sensors on each patient were continuously collected and stored on a research computer. Regression analysis was used to compare ORI with Pao2 obtained from each arterial blood gas measurement and changes in ORI with changes in Pao2 from sequential measurements. Linear mixed-effects regression models for repeated measures were then used to account for within-subject correlation across the repeatedly measured Pao2 and ORI and for the unequal time intervals of Pao2 determination over elapsed surgical time. Regression plots were inspected for ORI values corresponding to Pao2 of 100 and 150 mm Hg. ORI and Pao2 were compared using mixed-effects models with a subject-specific random intercept. RESULTS: ORI values and Pao2 measurements were obtained from intraoperative data collected from 106 patients. Regression analysis showed that the ORI to Pao2 relationship was stronger for Pao2 to 240 mm Hg (r2 = 0.536) than for Pao2 over 240 mm Hg (r2 = 0.0016). Measured Pao2 was ≥100 mm Hg for all ORI over 0.24. Measured Pao2 was ≥150 mm Hg in 96.6% of samples when ORI was over 0.55. A random intercept variance component linear mixed-effects model for repeated measures indicated that Pao2 was significantly related to ORI (β[95% confidence interval] = 0.002 [0.0019–0.0022]; P < 0.0001). A similar analysis indicated a significant relationship between change in Pao2 and change in ORI (β [95% confidence interval] = 0.0044 [0.0040–0.0048]; P < 0.0001). CONCLUSIONS: These findings suggest that ORI >0.24 can distinguish Pao2 ≥100 mm Hg when Spo2 is over 98%. Similarly, ORI > 0.55 appears to be a threshold to distinguish Pao2 ≥150 mm Hg. The usefulness of these values should be evaluated prospectively. Decreases in ORI to near 0.24 may provide advance indication of falling Pao2 approaching 100 mm Hg when Spo2 is >98%. The clinical utility of interventions based on continuous ORI monitoring should be studied prospectively. PMID:27007078
Applegate, Richard L; Dorotta, Ihab L; Wells, Briana; Juma, David; Applegate, Patricia M
2016-09-01
The use of intraoperative pulse oximetry (SpO2) enhances hypoxia detection and is associated with fewer perioperative hypoxic events. However, SpO2 may be reported as 98% when arterial partial pressure of oxygen (PaO2) is as low as 70 mm Hg. Therefore, SpO2 may not provide advance warning of falling arterial oxygenation until PaO2 approaches this level. Multiwave pulse co-oximetry can provide a calculated oxygen reserve index (ORI) that may add to information from pulse oximetry when SpO2 is >98%. This study evaluates the ORI to PaO2 relationship during surgery. We studied patients undergoing scheduled surgery in which arterial catheterization and intraoperative arterial blood gas analysis were planned. Data from multiple pulse co-oximetry sensors on each patient were continuously collected and stored on a research computer. Regression analysis was used to compare ORI with PaO2 obtained from each arterial blood gas measurement and changes in ORI with changes in PaO2 from sequential measurements. Linear mixed-effects regression models for repeated measures were then used to account for within-subject correlation across the repeatedly measured PaO2 and ORI and for the unequal time intervals of PaO2 determination over elapsed surgical time. Regression plots were inspected for ORI values corresponding to PaO2 of 100 and 150 mm Hg. ORI and PaO2 were compared using mixed-effects models with a subject-specific random intercept. ORI values and PaO2 measurements were obtained from intraoperative data collected from 106 patients. Regression analysis showed that the ORI to PaO2 relationship was stronger for PaO2 to 240 mm Hg (r = 0.536) than for PaO2 over 240 mm Hg (r = 0.0016). Measured PaO2 was ≥100 mm Hg for all ORI over 0.24. Measured PaO2 was ≥150 mm Hg in 96.6% of samples when ORI was over 0.55. A random intercept variance component linear mixed-effects model for repeated measures indicated that PaO2 was significantly related to ORI (β[95% confidence interval] = 0.002 [0.0019-0.0022]; P < 0.0001). A similar analysis indicated a significant relationship between change in PaO2 and change in ORI (β [95% confidence interval] = 0.0044 [0.0040-0.0048]; P < 0.0001). These findings suggest that ORI >0.24 can distinguish PaO2 ≥100 mm Hg when SpO2 is over 98%. Similarly, ORI > 0.55 appears to be a threshold to distinguish PaO2 ≥150 mm Hg. The usefulness of these values should be evaluated prospectively. Decreases in ORI to near 0.24 may provide advance indication of falling PaO2 approaching 100 mm Hg when SpO2 is >98%. The clinical utility of interventions based on continuous ORI monitoring should be studied prospectively.
Duane, B G; Freeman, R; Richards, D; Crosbie, S; Patel, P; White, S; Humphris, G
2017-03-01
To commission dental services for vulnerable (special care) patient groups effectively, consistently and fairly an evidence base is needed of the costs involved. The simplified Case Mixed Tool (sCMT) can assess treatment mode complexity for these patient groups. To determine if the sCMT can be used to identify costs of service provision. Patients (n=495) attending the Sussex Community NHS Trust Special Care Dental Service for care were assessed using the sCMT. sCMT score and costs (staffing, laboratory fees, etc.) besides patient age, whether a new patient and use of general anaesthetic/intravenous sedation. Statistical analysis (adjusted linear regression modelling) compared sCMT score and costs then sensitivity analyses of the costings to age, being a new patient and sedation use were undertaken. Regression tables were produced to present estimates of service costs. Costs increased with sCMT total scale and single item values in a predictable manner in all analyses except for 'cooperation'. Costs increased with the use of IV sedation; with each rising level of the sCMT, and with complexity in every sCMT category, except cooperation. Costs increased with increase in complexity of treatment mode as measured by sCMT scores. Measures such as the sCMT can provide predictions of the resource allocations required when commissioning special care dental services. Copyright© 2017 Dennis Barber Ltd.
On the design of classifiers for crop inventories
NASA Technical Reports Server (NTRS)
Heydorn, R. P.; Takacs, H. C.
1986-01-01
Crop proportion estimators that use classifications of satellite data to correct, in an additive way, a given estimate acquired from ground observations are discussed. A linear version of these estimators is optimal, in terms of minimum variance, when the regression of the ground observations onto the satellite observations in linear. When this regression is not linear, but the reverse regression (satellite observations onto ground observations) is linear, the estimator is suboptimal but still has certain appealing variance properties. In this paper expressions are derived for those regressions which relate the intercepts and slopes to conditional classification probabilities. These expressions are then used to discuss the question of classifier designs that can lead to low-variance crop proportion estimates. Variance expressions for these estimates in terms of classifier omission and commission errors are also derived.
NASA Astrophysics Data System (ADS)
Astuti, H. N.; Saputro, D. R. S.; Susanti, Y.
2017-06-01
MGWR model is combination of linear regression model and geographically weighted regression (GWR) model, therefore, MGWR model could produce parameter estimation that had global parameter estimation, and other parameter that had local parameter in accordance with its observation location. The linkage between locations of the observations expressed in specific weighting that is adaptive bi-square. In this research, we applied MGWR model with weighted adaptive bi-square for case of DHF in Surakarta based on 10 factors (variables) that is supposed to influence the number of people with DHF. The observation unit in the research is 51 urban villages and the variables are number of inhabitants, number of houses, house index, many public places, number of healthy homes, number of Posyandu, area width, level population density, welfare of the family, and high-region. Based on this research, we obtained 51 MGWR models. The MGWR model were divided into 4 groups with significant variable is house index as a global variable, an area width as a local variable and the remaining variables vary in each. Global variables are variables that significantly affect all locations, while local variables are variables that significantly affect a specific location.
Non-Linear Concentration-Response Relationships between Ambient Ozone and Daily Mortality.
Bae, Sanghyuk; Lim, Youn-Hee; Kashima, Saori; Yorifuji, Takashi; Honda, Yasushi; Kim, Ho; Hong, Yun-Chul
2015-01-01
Ambient ozone (O3) concentration has been reported to be significantly associated with mortality. However, linearity of the relationships and the presence of a threshold has been controversial. The aim of the present study was to examine the concentration-response relationship and threshold of the association between ambient O3 concentration and non-accidental mortality in 13 Japanese and Korean cities from 2000 to 2009. We selected Japanese and Korean cities which have population of over 1 million. We constructed Poisson regression models adjusting daily mean temperature, daily mean PM10, humidity, time trend, season, year, day of the week, holidays and yearly population. The association between O3 concentration and mortality was examined using linear, spline and linear-threshold models. The thresholds were estimated for each city, by constructing linear-threshold models. We also examined the city-combined association using a generalized additive mixed model. The mean O3 concentration did not differ greatly between Korea and Japan, which were 26.2 ppb and 24.2 ppb, respectively. Seven out of 13 cities showed better fits for the spline model compared with the linear model, supporting a non-linear relationships between O3 concentration and mortality. All of the 7 cities showed J or U shaped associations suggesting the existence of thresholds. The range of city-specific thresholds was from 11 to 34 ppb. The city-combined analysis also showed a non-linear association with a threshold around 30-40 ppb. We have observed non-linear concentration-response relationship with thresholds between daily mean ambient O3 concentration and daily number of non-accidental death in Japanese and Korean cities.
NASA Astrophysics Data System (ADS)
Zulvia, Pepi; Kurnia, Anang; Soleh, Agus M.
2017-03-01
Individual and environment are a hierarchical structure consist of units grouped at different levels. Hierarchical data structures are analyzed based on several levels, with the lowest level nested in the highest level. This modeling is commonly call multilevel modeling. Multilevel modeling is widely used in education research, for example, the average score of National Examination (UN). While in Indonesia UN for high school student is divided into natural science and social science. The purpose of this research is to develop multilevel and panel data modeling using linear mixed model on educational data. The first step is data exploration and identification relationships between independent and dependent variable by checking correlation coefficient and variance inflation factor (VIF). Furthermore, we use a simple model approach with highest level of the hierarchy (level-2) is regency/city while school is the lowest of hierarchy (level-1). The best model was determined by comparing goodness-of-fit and checking assumption from residual plots and predictions for each model. Our finding that for natural science and social science, the regression with random effects of regency/city and fixed effects of the time i.e multilevel model has better performance than the linear mixed model in explaining the variability of the dependent variable, which is the average scores of UN.
Patterson, Brent R.; Anderson, Morgan L.; Rodgers, Arthur R.; Vander Vennen, Lucas M.; Fryxell, John M.
2017-01-01
Woodland caribou (Rangifer tarandus caribou) in Ontario are a threatened species that have experienced a substantial retraction of their historic range. Part of their decline has been attributed to increasing densities of anthropogenic linear features such as trails, roads, railways, and hydro lines. These features have been shown to increase the search efficiency and kill rate of wolves. However, it is unclear whether selection for anthropogenic linear features is additive or compensatory to selection for natural (water) linear features which may also be used for travel. We studied the selection of water and anthropogenic linear features by 52 resident wolves (Canis lupus x lycaon) over four years across three study areas in northern Ontario that varied in degrees of forestry activity and human disturbance. We used Euclidean distance-based resource selection functions (mixed-effects logistic regression) at the seasonal range scale with random coefficients for distance to water linear features, primary/secondary roads/railways, and hydro lines, and tertiary roads to estimate the strength of selection for each linear feature and for several habitat types, while accounting for availability of each feature. Next, we investigated the trade-off between selection for anthropogenic and water linear features. Wolves selected both anthropogenic and water linear features; selection for anthropogenic features was stronger than for water during the rendezvous season. Selection for anthropogenic linear features increased with increasing density of these features on the landscape, while selection for natural linear features declined, indicating compensatory selection of anthropogenic linear features. These results have implications for woodland caribou conservation. Prey encounter rates between wolves and caribou seem to be strongly influenced by increasing linear feature densities. This behavioral mechanism–a compensatory functional response to anthropogenic linear feature density resulting in decreased use of natural travel corridors–has negative consequences for the viability of woodland caribou. PMID:29117234
Newton, Erica J; Patterson, Brent R; Anderson, Morgan L; Rodgers, Arthur R; Vander Vennen, Lucas M; Fryxell, John M
2017-01-01
Woodland caribou (Rangifer tarandus caribou) in Ontario are a threatened species that have experienced a substantial retraction of their historic range. Part of their decline has been attributed to increasing densities of anthropogenic linear features such as trails, roads, railways, and hydro lines. These features have been shown to increase the search efficiency and kill rate of wolves. However, it is unclear whether selection for anthropogenic linear features is additive or compensatory to selection for natural (water) linear features which may also be used for travel. We studied the selection of water and anthropogenic linear features by 52 resident wolves (Canis lupus x lycaon) over four years across three study areas in northern Ontario that varied in degrees of forestry activity and human disturbance. We used Euclidean distance-based resource selection functions (mixed-effects logistic regression) at the seasonal range scale with random coefficients for distance to water linear features, primary/secondary roads/railways, and hydro lines, and tertiary roads to estimate the strength of selection for each linear feature and for several habitat types, while accounting for availability of each feature. Next, we investigated the trade-off between selection for anthropogenic and water linear features. Wolves selected both anthropogenic and water linear features; selection for anthropogenic features was stronger than for water during the rendezvous season. Selection for anthropogenic linear features increased with increasing density of these features on the landscape, while selection for natural linear features declined, indicating compensatory selection of anthropogenic linear features. These results have implications for woodland caribou conservation. Prey encounter rates between wolves and caribou seem to be strongly influenced by increasing linear feature densities. This behavioral mechanism-a compensatory functional response to anthropogenic linear feature density resulting in decreased use of natural travel corridors-has negative consequences for the viability of woodland caribou.
Valid statistical approaches for analyzing sholl data: Mixed effects versus simple linear models.
Wilson, Machelle D; Sethi, Sunjay; Lein, Pamela J; Keil, Kimberly P
2017-03-01
The Sholl technique is widely used to quantify dendritic morphology. Data from such studies, which typically sample multiple neurons per animal, are often analyzed using simple linear models. However, simple linear models fail to account for intra-class correlation that occurs with clustered data, which can lead to faulty inferences. Mixed effects models account for intra-class correlation that occurs with clustered data; thus, these models more accurately estimate the standard deviation of the parameter estimate, which produces more accurate p-values. While mixed models are not new, their use in neuroscience has lagged behind their use in other disciplines. A review of the published literature illustrates common mistakes in analyses of Sholl data. Analysis of Sholl data collected from Golgi-stained pyramidal neurons in the hippocampus of male and female mice using both simple linear and mixed effects models demonstrates that the p-values and standard deviations obtained using the simple linear models are biased downwards and lead to erroneous rejection of the null hypothesis in some analyses. The mixed effects approach more accurately models the true variability in the data set, which leads to correct inference. Mixed effects models avoid faulty inference in Sholl analysis of data sampled from multiple neurons per animal by accounting for intra-class correlation. Given the widespread practice in neuroscience of obtaining multiple measurements per subject, there is a critical need to apply mixed effects models more widely. Copyright © 2017 Elsevier B.V. All rights reserved.
Linear regression analysis of survival data with missing censoring indicators.
Wang, Qihua; Dinse, Gregg E
2011-04-01
Linear regression analysis has been studied extensively in a random censorship setting, but typically all of the censoring indicators are assumed to be observed. In this paper, we develop synthetic data methods for estimating regression parameters in a linear model when some censoring indicators are missing. We define estimators based on regression calibration, imputation, and inverse probability weighting techniques, and we prove all three estimators are asymptotically normal. The finite-sample performance of each estimator is evaluated via simulation. We illustrate our methods by assessing the effects of sex and age on the time to non-ambulatory progression for patients in a brain cancer clinical trial.
An Analysis of COLA (Cost of Living Adjustment) Allocation within the United States Coast Guard.
1983-09-01
books Applied Linear Regression [Ref. 39], and Statistical Methods in Research and Production [Ref. 40], or any other book on regression. In the event...Indexes, Master’s Thesis, Air Force Institute of Technology, Wright-Patterson AFB, 1976. 39. Weisberg, Stanford, Applied Linear Regression , Wiley, 1980. 40
Testing hypotheses for differences between linear regression lines
Stanley J. Zarnoch
2009-01-01
Five hypotheses are identified for testing differences between simple linear regression lines. The distinctions between these hypotheses are based on a priori assumptions and illustrated with full and reduced models. The contrast approach is presented as an easy and complete method for testing for overall differences between the regressions and for making pairwise...
Graphical Description of Johnson-Neyman Outcomes for Linear and Quadratic Regression Surfaces.
ERIC Educational Resources Information Center
Schafer, William D.; Wang, Yuh-Yin
A modification of the usual graphical representation of heterogeneous regressions is described that can aid in interpreting significant regions for linear or quadratic surfaces. The standard Johnson-Neyman graph is a bivariate plot with the criterion variable on the ordinate and the predictor variable on the abscissa. Regression surfaces are drawn…
Teaching the Concept of Breakdown Point in Simple Linear Regression.
ERIC Educational Resources Information Center
Chan, Wai-Sum
2001-01-01
Most introductory textbooks on simple linear regression analysis mention the fact that extreme data points have a great influence on ordinary least-squares regression estimation; however, not many textbooks provide a rigorous mathematical explanation of this phenomenon. Suggests a way to fill this gap by teaching students the concept of breakdown…
Estimating monotonic rates from biological data using local linear regression.
Olito, Colin; White, Craig R; Marshall, Dustin J; Barneche, Diego R
2017-03-01
Accessing many fundamental questions in biology begins with empirical estimation of simple monotonic rates of underlying biological processes. Across a variety of disciplines, ranging from physiology to biogeochemistry, these rates are routinely estimated from non-linear and noisy time series data using linear regression and ad hoc manual truncation of non-linearities. Here, we introduce the R package LoLinR, a flexible toolkit to implement local linear regression techniques to objectively and reproducibly estimate monotonic biological rates from non-linear time series data, and demonstrate possible applications using metabolic rate data. LoLinR provides methods to easily and reliably estimate monotonic rates from time series data in a way that is statistically robust, facilitates reproducible research and is applicable to a wide variety of research disciplines in the biological sciences. © 2017. Published by The Company of Biologists Ltd.
[Application of artificial neural networks on the prediction of surface ozone concentrations].
Shen, Lu-Lu; Wang, Yu-Xuan; Duan, Lei
2011-08-01
Ozone is an important secondary air pollutant in the lower atmosphere. In order to predict the hourly maximum ozone one day in advance based on the meteorological variables for the Wanqingsha site in Guangzhou, Guangdong province, a neural network model (Multi-Layer Perceptron) and a multiple linear regression model were used and compared. Model inputs are meteorological parameters (wind speed, wind direction, air temperature, relative humidity, barometric pressure and solar radiation) of the next day and hourly maximum ozone concentration of the previous day. The OBS (optimal brain surgeon) was adopted to prune the neutral work, to reduce its complexity and to improve its generalization ability. We find that the pruned neural network has the capacity to predict the peak ozone, with an agreement index of 92.3%, the root mean square error of 0.0428 mg/m3, the R-square of 0.737 and the success index of threshold exceedance 77.0% (the threshold O3 mixing ratio of 0.20 mg/m3). When the neural classifier was added to the neural network model, the success index of threshold exceedance increased to 83.6%. Through comparison of the performance indices between the multiple linear regression model and the neural network model, we conclud that that neural network is a better choice to predict peak ozone from meteorological forecast, which may be applied to practical prediction of ozone concentration.
Locally linear regression for pose-invariant face recognition.
Chai, Xiujuan; Shan, Shiguang; Chen, Xilin; Gao, Wen
2007-07-01
The variation of facial appearance due to the viewpoint (/pose) degrades face recognition systems considerably, which is one of the bottlenecks in face recognition. One of the possible solutions is generating virtual frontal view from any given nonfrontal view to obtain a virtual gallery/probe face. Following this idea, this paper proposes a simple, but efficient, novel locally linear regression (LLR) method, which generates the virtual frontal view from a given nonfrontal face image. We first justify the basic assumption of the paper that there exists an approximate linear mapping between a nonfrontal face image and its frontal counterpart. Then, by formulating the estimation of the linear mapping as a prediction problem, we present the regression-based solution, i.e., globally linear regression. To improve the prediction accuracy in the case of coarse alignment, LLR is further proposed. In LLR, we first perform dense sampling in the nonfrontal face image to obtain many overlapped local patches. Then, the linear regression technique is applied to each small patch for the prediction of its virtual frontal patch. Through the combination of all these patches, the virtual frontal view is generated. The experimental results on the CMU PIE database show distinct advantage of the proposed method over Eigen light-field method.
Continuing Medical Education Speakers with High Evaluation Scores Use more Image-based Slides.
Ferguson, Ian; Phillips, Andrew W; Lin, Michelle
2017-01-01
Although continuing medical education (CME) presentations are common across health professions, it is unknown whether slide design is independently associated with audience evaluations of the speaker. Based on the conceptual framework of Mayer's theory of multimedia learning, this study aimed to determine whether image use and text density in presentation slides are associated with overall speaker evaluations. This retrospective analysis of six sequential CME conferences (two annual emergency medicine conferences over a three-year period) used a mixed linear regression model to assess whether post-conference speaker evaluations were associated with image fraction (percentage of image-based slides per presentation) and text density (number of words per slide). A total of 105 unique lectures were given by 49 faculty members, and 1,222 evaluations (70.1% response rate) were available for analysis. On average, 47.4% (SD=25.36) of slides had at least one educationally-relevant image (image fraction). Image fraction significantly predicted overall higher evaluation scores [F(1, 100.676)=6.158, p=0.015] in the mixed linear regression model. The mean (SD) text density was 25.61 (8.14) words/slide but was not a significant predictor [F(1, 86.293)=0.55, p=0.815]. Of note, the individual speaker [χ 2 (1)=2.952, p=0.003] and speaker seniority [F(3, 59.713)=4.083, p=0.011] significantly predicted higher scores. This is the first published study to date assessing the linkage between slide design and CME speaker evaluations by an audience of practicing clinicians. The incorporation of images was associated with higher evaluation scores, in alignment with Mayer's theory of multimedia learning. Contrary to this theory, however, text density showed no significant association, suggesting that these scores may be multifactorial. Professional development efforts should focus on teaching best practices in both slide design and presentation skills.
Marcotte, Thomas D.; Deutsch, Reena; Michael, Benedict Daniel; Franklin, Donald; Cookson, Debra Rosario; Bharti, Ajay R.; Grant, Igor; Letendre, Scott L.
2013-01-01
Background Neurocognitive (NC) impairment (NCI) occurs commonly in people living with HIV. Despite substantial effort, no biomarkers have been sufficiently validated for diagnosis and prognosis of NCI in the clinic. The goal of this project was to identify diagnostic or prognostic biomarkers for NCI in a comprehensively characterized HIV cohort. Methods Multidisciplinary case review selected 98 HIV-infected individuals and categorized them into four NC groups using normative data: stably normal (SN), stably impaired (SI), worsening (Wo), or improving (Im). All subjects underwent comprehensive NC testing, phlebotomy, and lumbar puncture at two timepoints separated by a median of 6.2 months. Eight biomarkers were measured in CSF and blood by immunoassay. Results were analyzed using mixed model linear regression and staged recursive partitioning. Results At the first visit, subjects were mostly middle-aged (median 45) white (58%) men (84%) who had AIDS (70%). Of the 73% who took antiretroviral therapy (ART), 54% had HIV RNA levels below 50 c/mL in plasma. Mixed model linear regression identified that only MCP-1 in CSF was associated with neurocognitive change group. Recursive partitioning models aimed at diagnosis (i.e., correctly classifying neurocognitive status at the first visit) were complex and required most biomarkers to achieve misclassification limits. In contrast, prognostic models were more efficient. A combination of three biomarkers (sCD14, MCP-1, SDF-1α) correctly classified 82% of Wo and SN subjects, including 88% of SN subjects. A combination of two biomarkers (MCP-1, TNF-α) correctly classified 81% of Im and SI subjects, including 100% of SI subjects. Conclusions This analysis of well-characterized individuals identified concise panels of biomarkers associated with NC change. Across all analyses, the two most frequently identified biomarkers were sCD14 and MCP-1, indicators of monocyte/macrophage activation. While the panels differed depending on the outcome and on the degree of misclassification, nearly all stable patients were correctly classified. PMID:24101401
Zeng, Yanni; Navarro, Pau; Fernandez-Pujals, Ana M; Hall, Lynsey S; Clarke, Toni-Kim; Thomson, Pippa A; Smith, Blair H; Hocking, Lynne J; Padmanabhan, Sandosh; Hayward, Caroline; MacIntyre, Donald J; Wray, Naomi R; Deary, Ian J; Porteous, David J; Haley, Chris S; McIntosh, Andrew M
2017-02-15
Genome-wide association studies (GWASs) of major depressive disorder (MDD) have identified few significant associations. Testing the aggregation of genetic variants, in particular biological pathways, may be more powerful. Regional heritability analysis can be used to detect genomic regions that contribute to disease risk. We integrated pathway analysis and multilevel regional heritability analyses in a pipeline designed to identify MDD-associated pathways. The pipeline was applied to two independent GWAS samples [Generation Scotland: The Scottish Family Health Study (GS:SFHS, N = 6455) and Psychiatric Genomics Consortium (PGC:MDD) (N = 18,759)]. A polygenic risk score (PRS) composed of single nucleotide polymorphisms from the pathway most consistently associated with MDD was created, and its accuracy to predict MDD, using area under the curve, logistic regression, and linear mixed model analyses, was tested. In GS:SFHS, four pathways were significantly associated with MDD, and two of these explained a significant amount of pathway-level regional heritability. In PGC:MDD, one pathway was significantly associated with MDD. Pathway-level regional heritability was significant in this pathway in one subset of PGC:MDD. For both samples the regional heritabilities were further localized to the gene and subregion levels. The NETRIN1 signaling pathway showed the most consistent association with MDD across the two samples. PRSs from this pathway showed competitive predictive accuracy compared with the whole-genome PRSs when using area under the curve statistics, logistic regression, and linear mixed model. These post-GWAS analyses highlight the value of combining multiple methods on multiple GWAS data for the identification of risk pathways for MDD. The NETRIN1 signaling pathway is identified as a candidate pathway for MDD and should be explored in further large population studies. Copyright © 2016 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.
Young, Vicki L; Cole, Amy; Lecky, Donna M; Fettis, Dennis; Pritchard, Beth; Verlander, Neville Q; Eley, Charlotte V; McNulty, Cliodna A M
2017-07-01
Delivering health topics in schools through peer education is known to be beneficial for all students involved. In this study, we have evaluated a peer-education workshop that aims to educate primary and secondary school students on hygiene, the spread of infection and antibiotics. Four schools in south-west England, in a range of localities, took part in peer-education workshops, with students completing before, after and knowledge-retention questionnaires. Mixed-effect logistic regression and mixed-effect linear regression were used to analyse the data. Data were analysed by topic, region and peer/non-peer-educator status. Qualitative interviews and focus groups with students and educators were conducted to assess changes in participants' skills, confidence and behaviour. Qualitative data indicated improvements in peer-educator skills and behaviour, including confidence, team-working and communication. There was a significant improvement in knowledge for all topics covered in the intervention, although this varied by region. In the antibiotics topic, peer-educators' knowledge increased in the retention questionnaire, whereas non-peer-educators' knowledge decreased. Knowledge declined in the retention questionnaires for the other topics, although this was mostly not significant. This study indicates that peer education is an effective way to educate young people on important topics around health and hygiene, and to concurrently improve communication skills. Its use should be encouraged across schools to help in the implementation of the National Institute for Health and Care Excellence (NICE) guidance that recommends children are taught in an age-appropriate manner about hygiene and antibiotics. © The Author 2017. Published by Oxford University Press on behalf of the British Society for Antimicrobial Chemotherapy.
Validation of ACG Case-mix for equitable resource allocation in Swedish primary health care.
Zielinski, Andrzej; Kronogård, Maria; Lenhoff, Håkan; Halling, Anders
2009-09-18
Adequate resource allocation is an important factor to ensure equity in health care. Previous reimbursement models have been based on age, gender and socioeconomic factors. An explanatory model based on individual need of primary health care (PHC) has not yet been used in Sweden to allocate resources. The aim of this study was to examine to what extent the ACG case-mix system could explain concurrent costs in Swedish PHC. Diagnoses were obtained from electronic PHC records of inhabitants in Blekinge County (approx. 150,000) listed with public PHC (approx. 120,000) for three consecutive years, 2004-2006. The inhabitants were then classified into six different resource utilization bands (RUB) using the ACG case-mix system. The mean costs for primary health care were calculated for each RUB and year. Using linear regression models and log-cost as dependent variable the adjusted R2 was calculated in the unadjusted model (gender) and in consecutive models where age, listing with specific PHC and RUB were added. In an additional model the ACG groups were added. Gender, age and listing with specific PHC explained 14.48-14.88% of the variance in individual costs for PHC. By also adding information on level of co-morbidity, as measured by the ACG case-mix system, to specific PHC the adjusted R2 increased to 60.89-63.41%. The ACG case-mix system explains patient costs in primary care to a high degree. Age and gender are important explanatory factors, but most of the variance in concurrent patient costs was explained by the ACG case-mix system.
Determinants of isocyanate exposures in auto body repair and refinishing shops.
Woskie, S R; Sparer, J; Gore, R J; Stowe, M; Bello, D; Liu, Y; Youngs, F; Redlich, C; Eisen, E; Cullen, M
2004-07-01
As part of the Survey of Painters and Repairers of Auto bodies by Yale (SPRAY), the determinants of isocyanate exposure in auto body repair shops were evaluated. Measurements (n = 380) of hexamethylene diisocyanate-based monomer and polyisocyanate and isophorone diisocyanate-based polyisocyanate were collected from 33 auto body shops. The median total reactive isocyanate concentrations expressed as mass concentration of the NCO functional group were: 206 microg NCO/m3 for spray operations; 0.93 microg NCO/m3 for samples collected in the vicinity of spray operations done on the shop floor (near spray); 0.05 microg NCO/m3 for office or other shop areas adjacent to spray areas (workplace background); 0.17 microg NCO/m3 for paint mixing and gun cleaning operations (mixing); 0.27 microg NCO/m3 for sanding operations. Exposure determinants for the sample NCO mass load were identified using linear regression, tobit regression and logistic regression models. For spray samples in a spray booth the significant determinants were the number of milliliters of NCO applied, the gallons of clear coat used by the shop each month and the type of spray booth used (custom built crossdraft, prefabricated crossdraft or downdraft/semi-downdraft). For near spray (bystander) samples, outdoor temperature >65 degrees F (18 degrees C) and shop size >5000 feet2 (465 m2) were significant determinants of exposure levels. For workplace background samples the shop annual income was the most important determinant. For sanding samples, the shop annual income and outdoor temperature >65 degrees F (18 degrees C) were the most significant determinants. Identification of these key exposure determinants will be useful in targeting exposure evaluation and control efforts to reduce isocyanate exposures.
Effect of Malmquist bias on correlation studies with IRAS data base
NASA Technical Reports Server (NTRS)
Verter, Frances
1993-01-01
The relationships between galaxy properties in the sample of Trinchieri et al. (1989) are reexamined with corrections for Malmquist bias. The linear correlations are tested and linear regressions are fit for log-log plots of L(FIR), L(H-alpha), and L(B) as well as ratios of these quantities. The linear correlations for Malmquist bias are corrected using the method of Verter (1988), in which each galaxy observation is weighted by the inverse of its sampling volume. The linear regressions are corrected for Malmquist bias by a new method invented here in which each galaxy observation is weighted by its sampling volume. The results of correlation and regressions among the sample are significantly changed in the anticipated sense that the corrected correlation confidences are lower and the corrected slopes of the linear regressions are lower. The elimination of Malmquist bias eliminates the nonlinear rise in luminosity that has caused some authors to hypothesize additional components of FIR emission.
Mikulich-Gilbertson, Susan K; Wagner, Brandie D; Grunwald, Gary K; Riggs, Paula D; Zerbe, Gary O
2018-01-01
Medical research is often designed to investigate changes in a collection of response variables that are measured repeatedly on the same subjects. The multivariate generalized linear mixed model (MGLMM) can be used to evaluate random coefficient associations (e.g. simple correlations, partial regression coefficients) among outcomes that may be non-normal and differently distributed by specifying a multivariate normal distribution for their random effects and then evaluating the latent relationship between them. Empirical Bayes predictors are readily available for each subject from any mixed model and are observable and hence, plotable. Here, we evaluate whether second-stage association analyses of empirical Bayes predictors from a MGLMM, provide a good approximation and visual representation of these latent association analyses using medical examples and simulations. Additionally, we compare these results with association analyses of empirical Bayes predictors generated from separate mixed models for each outcome, a procedure that could circumvent computational problems that arise when the dimension of the joint covariance matrix of random effects is large and prohibits estimation of latent associations. As has been shown in other analytic contexts, the p-values for all second-stage coefficients that were determined by naively assuming normality of empirical Bayes predictors provide a good approximation to p-values determined via permutation analysis. Analyzing outcomes that are interrelated with separate models in the first stage and then associating the resulting empirical Bayes predictors in a second stage results in different mean and covariance parameter estimates from the maximum likelihood estimates generated by a MGLMM. The potential for erroneous inference from using results from these separate models increases as the magnitude of the association among the outcomes increases. Thus if computable, scatterplots of the conditionally independent empirical Bayes predictors from a MGLMM are always preferable to scatterplots of empirical Bayes predictors generated by separate models, unless the true association between outcomes is zero.
A primer for biomedical scientists on how to execute model II linear regression analysis.
Ludbrook, John
2012-04-01
1. There are two very different ways of executing linear regression analysis. One is Model I, when the x-values are fixed by the experimenter. The other is Model II, in which the x-values are free to vary and are subject to error. 2. I have received numerous complaints from biomedical scientists that they have great difficulty in executing Model II linear regression analysis. This may explain the results of a Google Scholar search, which showed that the authors of articles in journals of physiology, pharmacology and biochemistry rarely use Model II regression analysis. 3. I repeat my previous arguments in favour of using least products linear regression analysis for Model II regressions. I review three methods for executing ordinary least products (OLP) and weighted least products (WLP) regression analysis: (i) scientific calculator and/or computer spreadsheet; (ii) specific purpose computer programs; and (iii) general purpose computer programs. 4. Using a scientific calculator and/or computer spreadsheet, it is easy to obtain correct values for OLP slope and intercept, but the corresponding 95% confidence intervals (CI) are inaccurate. 5. Using specific purpose computer programs, the freeware computer program smatr gives the correct OLP regression coefficients and obtains 95% CI by bootstrapping. In addition, smatr can be used to compare the slopes of OLP lines. 6. When using general purpose computer programs, I recommend the commercial programs systat and Statistica for those who regularly undertake linear regression analysis and I give step-by-step instructions in the Supplementary Information as to how to use loss functions. © 2011 The Author. Clinical and Experimental Pharmacology and Physiology. © 2011 Blackwell Publishing Asia Pty Ltd.
ERIC Educational Resources Information Center
Rocconi, Louis M.
2013-01-01
This study examined the differing conclusions one may come to depending upon the type of analysis chosen, hierarchical linear modeling or ordinary least squares (OLS) regression. To illustrate this point, this study examined the influences of seniors' self-reported critical thinking abilities three ways: (1) an OLS regression with the student…
NASA Astrophysics Data System (ADS)
Wang, Qiyuan; Huang, Rujin; Zhao, Zhuzi; Cao, Junji; Ni, Haiyan; Tie, Xuexi; Zhu, Chongshu; Shen, Zhenxing; Wang, Meng; Dai, Wenting; Han, Yongming; Zhang, Ningning; Prévôt, André S. H.
2017-04-01
The relationship between the refractory black carbon (rBC) aerosol mixing state and the atmospheric oxidation capacity was investigated to assess the possible influence of oxidants on the particles’ light absorption effects at two large cities in China. The number fraction of thickly-coated rBC particles (F rBC) was positively correlated with a measure of the oxidant concentrations (OX = O3 + NO2), indicating an enhancement of coated rBC particles under more oxidizing conditions. The slope of a linear regression of F rBC versus OX was 0.58% ppb-1 for Beijing and 0.84% ppb-1 for Xi’an, and these relationships provide some insights into the evolution of rBC mixing state in relation to atmospheric oxidation processes. The mass absorption cross-section of rBC (MACrBC) increased with OX during the daytime at Xi’an, at a rate of 0.26 m2 g-1 ppb-1, suggesting that more oxidizing conditions lead to internal mixing that enhances the light-absorbing capacity of rBC particles. Understanding the dependence of the increasing rates of F rBC and MACrBC as a function of OX may lead to improvements of climate models that deal with the warming effects, but more studies in different cities and seasons are needed to gauge the broader implications of these findings.
Sensitivity of Chemical Shift-Encoded Fat Quantification to Calibration of Fat MR Spectrum
Wang, Xiaoke; Hernando, Diego; Reeder, Scott B.
2015-01-01
Purpose To evaluate the impact of different fat spectral models on proton density fat-fraction (PDFF) quantification using chemical shift-encoded (CSE) MRI. Material and Methods Simulations and in vivo imaging were performed. In a simulation study, spectral models of fat were compared pairwise. Comparison of magnitude fitting and mixed fitting was performed over a range of echo times and fat fractions. In vivo acquisitions from 41 patients were reconstructed using 7 published spectral models of fat. T2-corrected STEAM-MRS was used as reference. Results Simulations demonstrate that imperfectly calibrated spectral models of fat result in biases that depend on echo times and fat fraction. Mixed fitting is more robust against this bias than magnitude fitting. Multi-peak spectral models showed much smaller differences among themselves than when compared to the single-peak spectral model. In vivo studies show all multi-peak models agree better (for mixed fitting, slope ranged from 0.967–1.045 using linear regression) with reference standard than the single-peak model (for mixed fitting, slope=0.76). Conclusion It is essential to use a multi-peak fat model for accurate quantification of fat with CSE-MRI. Further, fat quantification techniques using multi-peak fat models are comparable and no specific choice of spectral model is shown to be superior to the rest. PMID:25845713
Borsatti, L; Vieira, S L; Stefanello, C; Kindlein, L; Oviedo-Rondón, E O; Angel, C R
2018-01-01
A study was conducted to determine the AMEn contents of fat by-products from the soybean oil industry for broiler chickens. A total of 390 slow-feathering Cobb × Cobb 500 male broilers were randomly distributed into 13 treatments having 6 replicates of 5 birds each. Birds were fed a common starter diet from placement to 21 d. Experimental corn-soy diets were composed of four fat sources, added at 3 increasing levels each, and were fed from 21 to 28 d. Fat sources utilized were acidulated soybean soapstock (ASS), glycerol (GLY), lecithin (LEC), and a mixture (MIX) containing 85% ASS, 10% GLY and 5% LEC. A 4 × 3 + 1 factorial arrangement was used with 4 by-products (ASS, GLY, LEC, or MIX), 3 inclusion levels and 1 basal diet. Each of the four fat by-product sources was included in the diets as follow: 2% of by-products (98% basal + 2% by-product), 4% (96% basal + 4% by-product), or 6% (94% basal + 6% by-product). Birds were submitted to 94, 96, 98, and 100% of ad libitum feed intake; therefore, the differences in AMEn consumption were only due to the added by-product. Total excreta were collected twice daily for 72 h to determine apparent metabolizable energy contents starting at 25 d. The AMEn intake was regressed against feed intake and the slope was used to estimate AMEn values for each fat source. Linear regression equations (P < 0.05) estimated for each by-product were as follow: 7,153X - 451.9 for ASS; 3,916X - 68.2 for GLY; 7,051X - 448.3 for LEC, and 8,515X - 622.3 for MIX. Values of AMEn were 7,153, 3,916, 7,051, and 8,515 kcal/kg DM for ASS, GLY, LEC, and MIX, respectively. The present study generated AMEn for fat by-products data that can be used in poultry feed formulation. It also provides indications that, by adding the 3 by-products in the proportions present in the MIX, considerable economic advantage can be attained. © 2017 Poultry Science Association Inc.
NASA Astrophysics Data System (ADS)
Benhalouche, Fatima Zohra; Karoui, Moussa Sofiane; Deville, Yannick; Ouamri, Abdelaziz
2015-10-01
In this paper, a new Spectral-Unmixing-based approach, using Nonnegative Matrix Factorization (NMF), is proposed to locally multi-sharpen hyperspectral data by integrating a Digital Surface Model (DSM) obtained from LIDAR data. In this new approach, the nature of the local mixing model is detected by using the local variance of the object elevations. The hyper/multispectral images are explored using small zones. In each zone, the variance of the object elevations is calculated from the DSM data in this zone. This variance is compared to a threshold value and the adequate linear/linearquadratic spectral unmixing technique is used in the considered zone to independently unmix hyperspectral and multispectral data, using an adequate linear/linear-quadratic NMF-based approach. The obtained spectral and spatial information thus respectively extracted from the hyper/multispectral images are then recombined in the considered zone, according to the selected mixing model. Experiments based on synthetic hyper/multispectral data are carried out to evaluate the performance of the proposed multi-sharpening approach and literature linear/linear-quadratic approaches used on the whole hyper/multispectral data. In these experiments, real DSM data are used to generate synthetic data containing linear and linear-quadratic mixed pixel zones. The DSM data are also used for locally detecting the nature of the mixing model in the proposed approach. Globally, the proposed approach yields good spatial and spectral fidelities for the multi-sharpened data and significantly outperforms the used literature methods.
ERIC Educational Resources Information Center
Rocconi, Louis M.
2011-01-01
Hierarchical linear models (HLM) solve the problems associated with the unit of analysis problem such as misestimated standard errors, heterogeneity of regression and aggregation bias by modeling all levels of interest simultaneously. Hierarchical linear modeling resolves the problem of misestimated standard errors by incorporating a unique random…
ERIC Educational Resources Information Center
Preacher, Kristopher J.; Curran, Patrick J.; Bauer, Daniel J.
2006-01-01
Simple slopes, regions of significance, and confidence bands are commonly used to evaluate interactions in multiple linear regression (MLR) models, and the use of these techniques has recently been extended to multilevel or hierarchical linear modeling (HLM) and latent curve analysis (LCA). However, conducting these tests and plotting the…
Classical Testing in Functional Linear Models.
Kong, Dehan; Staicu, Ana-Maria; Maity, Arnab
2016-01-01
We extend four tests common in classical regression - Wald, score, likelihood ratio and F tests - to functional linear regression, for testing the null hypothesis, that there is no association between a scalar response and a functional covariate. Using functional principal component analysis, we re-express the functional linear model as a standard linear model, where the effect of the functional covariate can be approximated by a finite linear combination of the functional principal component scores. In this setting, we consider application of the four traditional tests. The proposed testing procedures are investigated theoretically for densely observed functional covariates when the number of principal components diverges. Using the theoretical distribution of the tests under the alternative hypothesis, we develop a procedure for sample size calculation in the context of functional linear regression. The four tests are further compared numerically for both densely and sparsely observed noisy functional data in simulation experiments and using two real data applications.
Classical Testing in Functional Linear Models
Kong, Dehan; Staicu, Ana-Maria; Maity, Arnab
2016-01-01
We extend four tests common in classical regression - Wald, score, likelihood ratio and F tests - to functional linear regression, for testing the null hypothesis, that there is no association between a scalar response and a functional covariate. Using functional principal component analysis, we re-express the functional linear model as a standard linear model, where the effect of the functional covariate can be approximated by a finite linear combination of the functional principal component scores. In this setting, we consider application of the four traditional tests. The proposed testing procedures are investigated theoretically for densely observed functional covariates when the number of principal components diverges. Using the theoretical distribution of the tests under the alternative hypothesis, we develop a procedure for sample size calculation in the context of functional linear regression. The four tests are further compared numerically for both densely and sparsely observed noisy functional data in simulation experiments and using two real data applications. PMID:28955155
Musuku, Adrien; Tan, Aimin; Awaiye, Kayode; Trabelsi, Fethi
2013-09-01
Linear calibration is usually performed using eight to ten calibration concentration levels in regulated LC-MS bioanalysis because a minimum of six are specified in regulatory guidelines. However, we have previously reported that two-concentration linear calibration is as reliable as or even better than using multiple concentrations. The purpose of this research is to compare two-concentration with multiple-concentration linear calibration through retrospective data analysis of multiple bioanalytical projects that were conducted in an independent regulated bioanalytical laboratory. A total of 12 bioanalytical projects were randomly selected: two validations and two studies for each of the three most commonly used types of sample extraction methods (protein precipitation, liquid-liquid extraction, solid-phase extraction). When the existing data were retrospectively linearly regressed using only the lowest and the highest concentration levels, no extra batch failure/QC rejection was observed and the differences in accuracy and precision between the original multi-concentration regression and the new two-concentration linear regression are negligible. Specifically, the differences in overall mean apparent bias (square root of mean individual bias squares) are within the ranges of -0.3% to 0.7% and 0.1-0.7% for the validations and studies, respectively. The differences in mean QC concentrations are within the ranges of -0.6% to 1.8% and -0.8% to 2.5% for the validations and studies, respectively. The differences in %CV are within the ranges of -0.7% to 0.9% and -0.3% to 0.6% for the validations and studies, respectively. The average differences in study sample concentrations are within the range of -0.8% to 2.3%. With two-concentration linear regression, an average of 13% of time and cost could have been saved for each batch together with 53% of saving in the lead-in for each project (the preparation of working standard solutions, spiking, and aliquoting). Furthermore, examples are given as how to evaluate the linearity over the entire concentration range when only two concentration levels are used for linear regression. To conclude, two-concentration linear regression is accurate and robust enough for routine use in regulated LC-MS bioanalysis and it significantly saves time and cost as well. Copyright © 2013 Elsevier B.V. All rights reserved.
A Linear Regression and Markov Chain Model for the Arabian Horse Registry
1993-04-01
as a tax deduction? Yes No T-4367 68 26. Regardless of previous equine tax deductions, do you consider your current horse activities to be... (Mark one...E L T-4367 A Linear Regression and Markov Chain Model For the Arabian Horse Registry Accesion For NTIS CRA&I UT 7 4:iC=D 5 D-IC JA" LI J:13tjlC,3 lO...the Arabian Horse Registry, which needed to forecast its future registration of purebred Arabian horses . A linear regression model was utilized to
DOE Office of Scientific and Technical Information (OSTI.GOV)
Siewicki, T.C.; Chandler, G.T.
1995-12-31
Eastern oysters (Crassostrea virginica) were continuously exposed to suspended {sup 14}C-fluoranthene spiked-sediment for either: (1) five days followed by 24 days deputation, or (2) 28 days exposure. Sediment less than 63 um contained fluoranthene concentrations one or ten times that measured at suburbanized sites in southeastern estuaries (133 or 1,300 ng/g). The data were evaluated both raw and normalized for tissue lipid and sediment organic carbon concentrations. Uptake rate constants were estimated using non-linear regression methods. Depuration rate constants were estimated by linear regression of the deputation phase following five-days exposure and as the second partial derivative of the non-linearmore » regression for the 28-day exposures. Uptake and deputation rate constants, bioconcentration factors and half-lives were similar regardless of exposure time, sediment fluoranthene concentration or use of data normalization. Uptake and deputation rate constants, bioconcentration factors and half-lives (days) were similar and low for all experiments, ranging from 0.02 to 0.10, 0.14 to 0.30, 0.09 to 0.46, and 2.4 to 5.0, respectively. Degradation by the mixed function oxidase system is not expected in oysters allowing the use of radiotracers for measuring very low concentrations of fluoranthene. The results suggest that short-term exposures followed by deputation are effective for estimating kinetic rate constants and that normalization provides little benefit in these controlled studies. The results further show that bioconcentration of sediment-associated fluoranthene, and possibly other polycyclic aromatic hydrocarbons, is very low compared to either dissolved forms or levels commonly used in regulatory actions.« less
Climate change at upper treeline: How do trees on the edge react to increasing temperatures?
NASA Astrophysics Data System (ADS)
Jochner, Matthias; Bugmann, Harald; Nötzli, Magdalena; Bigler, Christof
2017-04-01
Treeline ecotones are thought to be particularly sensitive to climate warming, and an alteration of their growth conditions may have important implications for the ecosystem services they supply in mountain regions. We use a novel approach to quantify effects of a changing climate on tree growth, using case studies in the European Alps. We compiled tree-ring data from almost 600 trees of four species at treeline in three climate regions of Switzerland. Temperature loggers installed along transects provided data for a precise interpolation of temperatures experienced by the sampled trees. To assess the influence of temperature on annual growth, we used linear mixed-effects models, allowing us to quantify effect sizes and to account for between-tree growth variability. After removing biological growth trends, we isolated temporal trends of ring-width indices. Furthermore, we fitted non-linear regression models to radial growth rates of individual years with temperature and tree age as predicting covariates for a fine-scale investigation of the temperature dependency of tree growth. For all species, climate-growth linear mixed-effects models indicated strong positive responses of ring-width indices to temperature in early summer and previous year's autumn, featuring considerable between-tree variability. All species showed positive ring-width index trends at treeline but different interactions with elevation: Larix decidua exhibited a declining ring-width index trend with decreasing elevation, whereas Picea abies, Pinus cembra and Pinus mugo showed increasing and/or stable trends. Not only reflected our findings the effects of ameliorated growth conditions, they might have also revealed suspected negative and positive feedbacks of climate change on growth, and increased the knowledge about the functional form and parameterization of the temperature dependency of tree growth.
Jameson, John; Smith, Peter; Harris, Gerald
2015-01-01
Osteogenesis Imperfecta is a genetic disorder resulting in bone fragility. The mechanisms behind this fragility are not well understood. In addition to characteristic bone mass deficiencies, research suggests that bone material properties are compromised in individuals with this disorder. However, little data exists regarding bone properties beyond the microstructural scale in individuals with this disorder. Specimens were obtained from long bone diaphyses of nine children with osteogenesis imperfecta during routine osteotomy procedures. Small rectangular beams, oriented longitudinally and transversely to the diaphyseal axis, were machined from these specimens and elastic modulus, yield strength, and maximum strength were measured in three-point bending. Intracortical vascular porosity, bone volume fraction, osteocyte lacuna density, and volumetric tissue mineral density were determined by synchrotron micro-computed tomography, and relationships among these mechanical properties and structural parameters were explored. Modulus and strength were on average 64–68% lower in the transverse vs. longitudinal beams (P<0.001, linear mixed model). Vascular porosity ranged between 3–42% of total bone volume. Longitudinal properties were associated negatively with porosity (P≤0.006, linear regressions). Mechanical properties, however, were not associated with osteocyte lacuna density or volumetric tissue mineral density (P≥0.167). Bone properties and structural parameters were not associated significantly with donor age (p≥0.225, linear mixed models). This study presents novel data regarding bone material strength in children with osteogenesis imperfecta. Results confirm that these properties are anisotropic. Elevated vascular porosity was observed in most specimens, and this parameter was associated with reduced bone material strength. These results offer insight towards understanding bone fragility and the role of intracortical porosity on the strength of bone tissue in children with osteogenesis imperfecta. PMID:24928496
Hu, Rongrong; Wang, Chenkun; Gu, Yangshun; Racette, Lyne
2016-01-01
Abstract Detection of progression is paramount to the clinical management of glaucoma. Our goal is to compare the performance of standard automated perimetry (SAP), short-wavelength automated perimetry (SWAP), and frequency-doubling technology (FDT) perimetry in monitoring glaucoma progression. Longitudinal data of paired SAP, SWAP, and FDT from 113 eyes with primary open-angle glaucoma enrolled in the Diagnostic Innovations in Glaucoma Study or the African Descent and Glaucoma Evaluation Study were included. Data from all tests were expressed in comparable units by converting the sensitivity from decibels to unitless contrast sensitivity and by expressing sensitivity values in percent of mean normal based on an independent dataset of 207 healthy eyes with aging deterioration taken into consideration. Pointwise linear regression analysis was performed and 3 criteria (conservative, moderate, and liberal) were used to define progression and improvement. Global mean sensitivity (MS) was fitted with linear mixed models. No statistically significant difference in the proportion of progressing and improving eyes was observed across tests using the conservative criterion. Fewer eyes showed improvement on SAP compared to SWAP and FDT using the moderate criterion; and FDT detected less progressing eyes than SAP and SWAP using the liberal criterion. The agreement between these test types was poor. The linear mixed model showed a progressing trend of global MS overtime for SAP and SWAP, but not for FDT. The baseline estimate of SWAP MS was significantly lower than SAP MS by 21.59% of mean normal. FDT showed comparable estimation of baseline MS with SAP. SWAP and FDT do not appear to have significant benefits over SAP in monitoring glaucoma progression. SAP, SWAP, and FDT may, however, detect progression in different glaucoma eyes. PMID:26886602
Blood biomarkers in male and female participants after an Ironman-distance triathlon.
Danielsson, Tom; Carlsson, Jörg; Schreyer, Hendrik; Ahnesjö, Jonas; Ten Siethoff, Lasse; Ragnarsson, Thony; Tugetam, Åsa; Bergman, Patrick
2017-01-01
While overall physical activity is clearly associated with a better short-term and long-term health, prolonged strenuous physical activity may result in a rise in acute levels of blood-biomarkers used in clinical practice for diagnosis of various conditions or diseases. In this study, we explored the acute effects of a full Ironman-distance triathlon on biomarkers related to heart-, liver-, kidney- and skeletal muscle damage immediately post-race and after one week's rest. We also examined if sex, age, finishing time and body composition influenced the post-race values of the biomarkers. A sample of 30 subjects was recruited (50% women) to the study. The subjects were evaluated for body composition and blood samples were taken at three occasions, before the race (T1), immediately after (T2) and one week after the race (T3). Linear regression models were fitted to analyse the independent contribution of sex and finishing time controlled for weight, body fat percentage and age, on the biomarkers at the termination of the race (T2). Linear mixed models were fitted to examine if the biomarkers differed between the sexes over time (T1-T3). Being male was a significant predictor of higher post-race (T2) levels of myoglobin, CK, and creatinine levels and body weight was negatively associated with myoglobin. In general, the models were unable to explain the variation of the dependent variables. In the linear mixed models, an interaction between time (T1-T3) and sex was seen for myoglobin and creatinine, in which women had a less pronounced response to the race. Overall women appear to tolerate the effects of prolonged strenuous physical activity better than men as illustrated by their lower values of the biomarkers both post-race as well as during recovery.
Albert, Carolyne; Jameson, John; Smith, Peter; Harris, Gerald
2014-09-01
Osteogenesis imperfecta is a genetic disorder resulting in bone fragility. The mechanisms behind this fragility are not well understood. In addition to characteristic bone mass deficiencies, research suggests that bone material properties are compromised in individuals with this disorder. However, little data exists regarding bone properties beyond the microstructural scale in individuals with this disorder. Specimens were obtained from long bone diaphyses of nine children with osteogenesis imperfecta during routine osteotomy procedures. Small rectangular beams, oriented longitudinally and transversely to the diaphyseal axis, were machined from these specimens and elastic modulus, yield strength, and maximum strength were measured in three-point bending. Intracortical vascular porosity, bone volume fraction, osteocyte lacuna density, and volumetric tissue mineral density were determined by synchrotron micro-computed tomography, and relationships among these mechanical properties and structural parameters were explored. Modulus and strength were on average 64-68% lower in the transverse vs. longitudinal beams (P<0.001, linear mixed model). Vascular porosity ranged between 3 and 42% of total bone volume. Longitudinal properties were associated negatively with porosity (P≤0.006, linear regressions). Mechanical properties, however, were not associated with osteocyte lacuna density or volumetric tissue mineral density (P≥0.167). Bone properties and structural parameters were not associated significantly with donor age (P≥0.225, linear mixed models). This study presents novel data regarding bone material strength in children with osteogenesis imperfecta. Results confirm that these properties are anisotropic. Elevated vascular porosity was observed in most specimens, and this parameter was associated with reduced bone material strength. These results offer insight toward understanding bone fragility and the role of intracortical porosity on the strength of bone tissue in children with osteogenesis imperfecta. Copyright © 2014 Elsevier Inc. All rights reserved.
An improved multiple linear regression and data analysis computer program package
NASA Technical Reports Server (NTRS)
Sidik, S. M.
1972-01-01
NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.
Schimmel, Martin; Christou, Panagiotis; Miyazaki, Hideo; Halazonetis, Demetrios; Herrmann, François R; Müller, Frauke
2015-08-01
Chewing efficiency may be evaluated using cohesive specimen, especially in elderly or dysphagic patients. The aim of this study was to evaluate three two-coloured chewing gums for a colour-mixing ability test and to validate a new purpose built software (ViewGum©). Dentate participants (dentate-group) and edentulous patients with mandibular two-implant overdentures (IOD-group) were recruited. First, the dentate-group chewed three different types of two-coloured gum (gum1-gum3) for 5, 10, 20, 30 and 50 chewing cycles. Subsequently the number of chewing cycles with the highest intra- and inter-rater agreement was determined visually by applying a scale (SA) and opto-electronically (ViewGum©, Bland-Altman analysis). The ViewGum© software determines semi-automatically the variance of hue (VOH); inadequate mixing presents with larger VOH than complete mixing. Secondly, the dentate-group and the IOD-group were compared. The dentate-group comprised 20 participants (10 female, 30.3±6.7 years); the IOD-group 15 participants (10 female, 74.6±8.3 years). Intra-rater and inter-rater agreement (SA) was very high at 20 chewing cycles (95.00-98.75%). Gums 1-3 showed different colour-mixing characteristics as a function of chewing cycles, gum1 showed a logarithmic association; gum2 and gum3 demonstrated more linear behaviours. However, the number of chewing cycles could be predicted in all specimens from VOH (all p<0.0001, mixed linear regression models). Both analyses proved discriminative to the dental state. ViewGum© proved to be a reliable and discriminative tool to opto-electronically assess chewing efficiency, given an elastic specimen is chewed for 20 cycles and could be recommended for the evaluation of chewing efficiency in a clinical and research setting. Chewing is a complex function of the oro-facial structures and the central nervous system. The application of the proposed assessments of the chewing function in geriatrics or special care dentistry could help visualising oro-functional or dental comorbidities in dysphagic patients or those suffering from protein-energy malnutrition. Copyright © 2015 The Authors. Published by Elsevier Ltd.. All rights reserved.
Effect of water-based recovery on blood lactate removal after high-intensity exercise.
Lucertini, Francesco; Gervasi, Marco; D'Amen, Giancarlo; Sisti, Davide; Rocchi, Marco Bruno Luigi; Stocchi, Vilberto; Benelli, Piero
2017-01-01
This study assessed the effectiveness of water immersion to the shoulders in enhancing blood lactate removal during active and passive recovery after short-duration high-intensity exercise. Seventeen cyclists underwent active water- and land-based recoveries and passive water and land-based recoveries. The recovery conditions lasted 31 minutes each and started after the identification of each cyclist's blood lactate accumulation peak, induced by a 30-second all-out sprint on a cycle ergometer. Active recoveries were performed on a cycle ergometer at 70% of the oxygen consumption corresponding to the lactate threshold (the control for the intensity was oxygen consumption), while passive recoveries were performed with subjects at rest and seated on the cycle ergometer. Blood lactate concentration was measured 8 times during each recovery condition and lactate clearance was modeled over a negative exponential function using non-linear regression. Actual active recovery intensity was compared to the target intensity (one sample t-test) and passive recovery intensities were compared between environments (paired sample t-tests). Non-linear regression parameters (coefficients of the exponential decay of lactate; predicted resting lactates; predicted delta decreases in lactate) were compared between environments (linear mixed model analyses for repeated measures) separately for the active and passive recovery modes. Active recovery intensities did not differ significantly from the target oxygen consumption, whereas passive recovery resulted in a slightly lower oxygen consumption when performed while immersed in water rather than on land. The exponential decay of blood lactate was not significantly different in water- or land-based recoveries in either active or passive recovery conditions. In conclusion, water immersion at 29°C would not appear to be an effective practice for improving post-exercise lactate removal in either the active or passive recovery modes.
Liu, Shu; Yu, Marco; Weinreb, Robert N; Lai, Gilda; Lam, Dennis Shun-Chiu; Leung, Christopher Kai-Shun
2014-05-02
We compared the detection of visual field progression and its rate of change between standard automated perimetry (SAP) and Matrix frequency doubling technology perimetry (FDTP) in glaucoma. We followed prospectively 217 eyes (179 glaucoma and 38 normal eyes) for SAP and FDTP testing at 4-month intervals for ≥36 months. Pointwise linear regression analysis was performed. A test location was considered progressing when the rate of change of visual sensitivity was ≤-1 dB/y for nonedge and ≤-2 dB/y for edge locations. Three criteria were used to define progression in an eye: ≥3 adjacent nonedge test locations (conservative), any three locations (moderate), and any two locations (liberal) progressed. The rate of change of visual sensitivity was calculated with linear mixed models. Of the 217 eyes, 6.1% and 3.9% progressed with the conservative criteria, 14.5% and 5.6% of eyes progressed with the moderate criteria, and 20.1% and 11.7% of eyes progressed with the liberal criteria by FDTP and SAP, respectively. Taking all test locations into consideration (total, 54 × 179 locations), FDTP detected more progressing locations (176) than SAP (103, P < 0.001). The rate of change of visual field mean deviation (MD) was significantly faster for FDTP (all with P < 0.001). No eyes showed progression in the normal group using the conservative and the moderate criteria. With a faster rate of change of visual sensitivity, FDTP detected more progressing eyes than SAP at a comparable level of specificity. Frequency doubling technology perimetry can provide a useful alternative to monitor glaucoma progression.
NASA Astrophysics Data System (ADS)
Kutzbach, L.; Schneider, J.; Sachs, T.; Giebels, M.; Nykänen, H.; Shurpali, N. J.; Martikainen, P. J.; Alm, J.; Wilmking, M.
2007-07-01
Closed (non-steady state) chambers are widely used for quantifying carbon dioxide (CO2) fluxes between soils or low-stature canopies and the atmosphere. It is well recognised that covering a soil or vegetation by a closed chamber inherently disturbs the natural CO2 fluxes by altering the concentration gradients between the soil, the vegetation and the overlying air. Thus, the driving factors of CO2 fluxes are not constant during the closed chamber experiment, and no linear increase or decrease of CO2 concentration over time within the chamber headspace can be expected. Nevertheless, linear regression has been applied for calculating CO2 fluxes in many recent, partly influential, studies. This approach was justified by keeping the closure time short and assuming the concentration change over time to be in the linear range. Here, we test if the application of linear regression is really appropriate for estimating CO2 fluxes using closed chambers over short closure times and if the application of nonlinear regression is necessary. We developed a nonlinear exponential regression model from diffusion and photosynthesis theory. This exponential model was tested with four different datasets of CO2 flux measurements (total number: 1764) conducted at three peatland sites in Finland and a tundra site in Siberia. The flux measurements were performed using transparent chambers on vegetated surfaces and opaque chambers on bare peat surfaces. Thorough analyses of residuals demonstrated that linear regression was frequently not appropriate for the determination of CO2 fluxes by closed-chamber methods, even if closure times were kept short. The developed exponential model was well suited for nonlinear regression of the concentration over time c(t) evolution in the chamber headspace and estimation of the initial CO2 fluxes at closure time for the majority of experiments. CO2 flux estimates by linear regression can be as low as 40% of the flux estimates of exponential regression for closure times of only two minutes and even lower for longer closure times. The degree of underestimation increased with increasing CO2 flux strength and is dependent on soil and vegetation conditions which can disturb not only the quantitative but also the qualitative evaluation of CO2 flux dynamics. The underestimation effect by linear regression was observed to be different for CO2 uptake and release situations which can lead to stronger bias in the daily, seasonal and annual CO2 balances than in the individual fluxes. To avoid serious bias of CO2 flux estimates based on closed chamber experiments, we suggest further tests using published datasets and recommend the use of nonlinear regression models for future closed chamber studies.
Nikoloulopoulos, Aristidis K
2017-10-01
A bivariate copula mixed model has been recently proposed to synthesize diagnostic test accuracy studies and it has been shown that it is superior to the standard generalized linear mixed model in this context. Here, we call trivariate vine copulas to extend the bivariate meta-analysis of diagnostic test accuracy studies by accounting for disease prevalence. Our vine copula mixed model includes the trivariate generalized linear mixed model as a special case and can also operate on the original scale of sensitivity, specificity, and disease prevalence. Our general methodology is illustrated by re-analyzing the data of two published meta-analyses. Our study suggests that there can be an improvement on trivariate generalized linear mixed model in fit to data and makes the argument for moving to vine copula random effects models especially because of their richness, including reflection asymmetric tail dependence, and computational feasibility despite their three dimensionality.
Allen, J.L.; Wesser, S.; Markon, C.J.; Winterberger, K.C.
2006-01-01
From 1989 to 2003, a widespread outbreak of spruce beetles (Dendroctonus rufipennis) in the Copper River Basin, Alaska, infested over 275,000 ha of forests in the region. During 1997 and 1998, we measured forest vegetation structure and composition on one hundred and thirty-six 20-m ?? 20-m plots to assess both the immediate stand and landscape level effects of the spruce beetle infestation. A photo-interpreted vegetation and infestation map was produced using color-infrared aerial photography at a scale of 1:40,000. We used linear regression to quantify the effects of the outbreak on forest structure and composition. White spruce (Picea glauca) canopy cover and basal area of medium-to-large trees [???15 cm diameter-at-breast height (1.3 m, dbh)] were reduced linearly as the number of trees attacked by spruce beetles increased. Black spruce (Picea mariana) and small diameter white spruce (<15 cm dbh) were infrequently attacked and killed by spruce beetles. This selective attack of mature white spruce reduced structural complexity of stands to earlier stages of succession and caused mixed tree species stands to lose their white spruce and become more homogeneous in overstory composition. Using the resulting regressions, we developed a transition matrix to describe changes in vegetation types under varying levels of spruce beetle infestations, and applied the model to the vegetation map. Prior to the outbreak, our study area was composed primarily of stands of mixed white and black spruce (29% of area) and pure white spruce (25%). However, the selective attack on white spruce caused many of these stands to transition to black spruce dominated stands (73% increase in area) or shrublands (26% increase in area). The post-infestation landscape was thereby composed of more even distributions of shrubland and white, black, and mixed spruce communities (17-22% of study area). Changes in the cover and composition of understory vegetation were less evident in this study. However, stands with the highest mortality due to spruce beetles had the lowest densities of white spruce seedlings suggesting a longer forest regeneration time without an increase in seedling germination, growth, or survival. ?? 2006 Elsevier B.V. All rights reserved.
Biostatistics Series Module 6: Correlation and Linear Regression.
Hazra, Avijit; Gogtay, Nithya
2016-01-01
Correlation and linear regression are the most commonly used techniques for quantifying the association between two numeric variables. Correlation quantifies the strength of the linear relationship between paired variables, expressing this as a correlation coefficient. If both variables x and y are normally distributed, we calculate Pearson's correlation coefficient ( r ). If normality assumption is not met for one or both variables in a correlation analysis, a rank correlation coefficient, such as Spearman's rho (ρ) may be calculated. A hypothesis test of correlation tests whether the linear relationship between the two variables holds in the underlying population, in which case it returns a P < 0.05. A 95% confidence interval of the correlation coefficient can also be calculated for an idea of the correlation in the population. The value r 2 denotes the proportion of the variability of the dependent variable y that can be attributed to its linear relation with the independent variable x and is called the coefficient of determination. Linear regression is a technique that attempts to link two correlated variables x and y in the form of a mathematical equation ( y = a + bx ), such that given the value of one variable the other may be predicted. In general, the method of least squares is applied to obtain the equation of the regression line. Correlation and linear regression analysis are based on certain assumptions pertaining to the data sets. If these assumptions are not met, misleading conclusions may be drawn. The first assumption is that of linear relationship between the two variables. A scatter plot is essential before embarking on any correlation-regression analysis to show that this is indeed the case. Outliers or clustering within data sets can distort the correlation coefficient value. Finally, it is vital to remember that though strong correlation can be a pointer toward causation, the two are not synonymous.
Biostatistics Series Module 6: Correlation and Linear Regression
Hazra, Avijit; Gogtay, Nithya
2016-01-01
Correlation and linear regression are the most commonly used techniques for quantifying the association between two numeric variables. Correlation quantifies the strength of the linear relationship between paired variables, expressing this as a correlation coefficient. If both variables x and y are normally distributed, we calculate Pearson's correlation coefficient (r). If normality assumption is not met for one or both variables in a correlation analysis, a rank correlation coefficient, such as Spearman's rho (ρ) may be calculated. A hypothesis test of correlation tests whether the linear relationship between the two variables holds in the underlying population, in which case it returns a P < 0.05. A 95% confidence interval of the correlation coefficient can also be calculated for an idea of the correlation in the population. The value r2 denotes the proportion of the variability of the dependent variable y that can be attributed to its linear relation with the independent variable x and is called the coefficient of determination. Linear regression is a technique that attempts to link two correlated variables x and y in the form of a mathematical equation (y = a + bx), such that given the value of one variable the other may be predicted. In general, the method of least squares is applied to obtain the equation of the regression line. Correlation and linear regression analysis are based on certain assumptions pertaining to the data sets. If these assumptions are not met, misleading conclusions may be drawn. The first assumption is that of linear relationship between the two variables. A scatter plot is essential before embarking on any correlation-regression analysis to show that this is indeed the case. Outliers or clustering within data sets can distort the correlation coefficient value. Finally, it is vital to remember that though strong correlation can be a pointer toward causation, the two are not synonymous. PMID:27904175
ERIC Educational Resources Information Center
Quinino, Roberto C.; Reis, Edna A.; Bessegato, Lupercio F.
2013-01-01
This article proposes the use of the coefficient of determination as a statistic for hypothesis testing in multiple linear regression based on distributions acquired by beta sampling. (Contains 3 figures.)
Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses.
Faul, Franz; Erdfelder, Edgar; Buchner, Axel; Lang, Albert-Georg
2009-11-01
G*Power is a free power analysis program for a variety of statistical tests. We present extensions and improvements of the version introduced by Faul, Erdfelder, Lang, and Buchner (2007) in the domain of correlation and regression analyses. In the new version, we have added procedures to analyze the power of tests based on (1) single-sample tetrachoric correlations, (2) comparisons of dependent correlations, (3) bivariate linear regression, (4) multiple linear regression based on the random predictor model, (5) logistic regression, and (6) Poisson regression. We describe these new features and provide a brief introduction to their scope and handling.
Rasmussen, Patrick P.; Gray, John R.; Glysson, G. Douglas; Ziegler, Andrew C.
2009-01-01
In-stream continuous turbidity and streamflow data, calibrated with measured suspended-sediment concentration data, can be used to compute a time series of suspended-sediment concentration and load at a stream site. Development of a simple linear (ordinary least squares) regression model for computing suspended-sediment concentrations from instantaneous turbidity data is the first step in the computation process. If the model standard percentage error (MSPE) of the simple linear regression model meets a minimum criterion, this model should be used to compute a time series of suspended-sediment concentrations. Otherwise, a multiple linear regression model using paired instantaneous turbidity and streamflow data is developed and compared to the simple regression model. If the inclusion of the streamflow variable proves to be statistically significant and the uncertainty associated with the multiple regression model results in an improvement over that for the simple linear model, the turbidity-streamflow multiple linear regression model should be used to compute a suspended-sediment concentration time series. The computed concentration time series is subsequently used with its paired streamflow time series to compute suspended-sediment loads by standard U.S. Geological Survey techniques. Once an acceptable regression model is developed, it can be used to compute suspended-sediment concentration beyond the period of record used in model development with proper ongoing collection and analysis of calibration samples. Regression models to compute suspended-sediment concentrations are generally site specific and should never be considered static, but they represent a set period in a continually dynamic system in which additional data will help verify any change in sediment load, type, and source.
NASA Astrophysics Data System (ADS)
Kutzbach, L.; Schneider, J.; Sachs, T.; Giebels, M.; Nykänen, H.; Shurpali, N. J.; Martikainen, P. J.; Alm, J.; Wilmking, M.
2007-11-01
Closed (non-steady state) chambers are widely used for quantifying carbon dioxide (CO2) fluxes between soils or low-stature canopies and the atmosphere. It is well recognised that covering a soil or vegetation by a closed chamber inherently disturbs the natural CO2 fluxes by altering the concentration gradients between the soil, the vegetation and the overlying air. Thus, the driving factors of CO2 fluxes are not constant during the closed chamber experiment, and no linear increase or decrease of CO2 concentration over time within the chamber headspace can be expected. Nevertheless, linear regression has been applied for calculating CO2 fluxes in many recent, partly influential, studies. This approach has been justified by keeping the closure time short and assuming the concentration change over time to be in the linear range. Here, we test if the application of linear regression is really appropriate for estimating CO2 fluxes using closed chambers over short closure times and if the application of nonlinear regression is necessary. We developed a nonlinear exponential regression model from diffusion and photosynthesis theory. This exponential model was tested with four different datasets of CO2 flux measurements (total number: 1764) conducted at three peatlands sites in Finland and a tundra site in Siberia. Thorough analyses of residuals demonstrated that linear regression was frequently not appropriate for the determination of CO2 fluxes by closed-chamber methods, even if closure times were kept short. The developed exponential model was well suited for nonlinear regression of the concentration over time c(t) evolution in the chamber headspace and estimation of the initial CO2 fluxes at closure time for the majority of experiments. However, a rather large percentage of the exponential regression functions showed curvatures not consistent with the theoretical model which is considered to be caused by violations of the underlying model assumptions. Especially the effects of turbulence and pressure disturbances by the chamber deployment are suspected to have caused unexplainable curvatures. CO2 flux estimates by linear regression can be as low as 40% of the flux estimates of exponential regression for closure times of only two minutes. The degree of underestimation increased with increasing CO2 flux strength and was dependent on soil and vegetation conditions which can disturb not only the quantitative but also the qualitative evaluation of CO2 flux dynamics. The underestimation effect by linear regression was observed to be different for CO2 uptake and release situations which can lead to stronger bias in the daily, seasonal and annual CO2 balances than in the individual fluxes. To avoid serious bias of CO2 flux estimates based on closed chamber experiments, we suggest further tests using published datasets and recommend the use of nonlinear regression models for future closed chamber studies.
NASA Astrophysics Data System (ADS)
Mahaboob, B.; Venkateswarlu, B.; Sankar, J. Ravi; Balasiddamuni, P.
2017-11-01
This paper uses matrix calculus techniques to obtain Nonlinear Least Squares Estimator (NLSE), Maximum Likelihood Estimator (MLE) and Linear Pseudo model for nonlinear regression model. David Pollard and Peter Radchenko [1] explained analytic techniques to compute the NLSE. However the present research paper introduces an innovative method to compute the NLSE using principles in multivariate calculus. This study is concerned with very new optimization techniques used to compute MLE and NLSE. Anh [2] derived NLSE and MLE of a heteroscedatistic regression model. Lemcoff [3] discussed a procedure to get linear pseudo model for nonlinear regression model. In this research article a new technique is developed to get the linear pseudo model for nonlinear regression model using multivariate calculus. The linear pseudo model of Edmond Malinvaud [4] has been explained in a very different way in this paper. David Pollard et.al used empirical process techniques to study the asymptotic of the LSE (Least-squares estimation) for the fitting of nonlinear regression function in 2006. In Jae Myung [13] provided a go conceptual for Maximum likelihood estimation in his work “Tutorial on maximum likelihood estimation
Karimi, Hamid Reza; Gao, Huijun
2008-07-01
A mixed H2/Hinfinity output-feedback control design methodology is presented in this paper for second-order neutral linear systems with time-varying state and input delays. Delay-dependent sufficient conditions for the design of a desired control are given in terms of linear matrix inequalities (LMIs). A controller, which guarantees asymptotic stability and a mixed H2/Hinfinity performance for the closed-loop system of the second-order neutral linear system, is then developed directly instead of coupling the model to a first-order neutral system. A Lyapunov-Krasovskii method underlies the LMI-based mixed H2/Hinfinity output-feedback control design using some free weighting matrices. The simulation results illustrate the effectiveness of the proposed methodology.
MULTIVARIATE LINEAR MIXED MODELS FOR MULTIPLE OUTCOMES. (R824757)
We propose a multivariate linear mixed (MLMM) for the analysis of multiple outcomes, which generalizes the latent variable model of Sammel and Ryan. The proposed model assumes a flexible correlation structure among the multiple outcomes, and allows a global test of the impact of ...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sun Wei; Huang, Guo H., E-mail: huang@iseis.org; Institute for Energy, Environment and Sustainable Communities, University of Regina, Regina, Saskatchewan, S4S 0A2
2012-06-15
Highlights: Black-Right-Pointing-Pointer Inexact piecewise-linearization-based fuzzy flexible programming is proposed. Black-Right-Pointing-Pointer It's the first application to waste management under multiple complexities. Black-Right-Pointing-Pointer It tackles nonlinear economies-of-scale effects in interval-parameter constraints. Black-Right-Pointing-Pointer It estimates costs more accurately than the linear-regression-based model. Black-Right-Pointing-Pointer Uncertainties are decreased and more satisfactory interval solutions are obtained. - Abstract: To tackle nonlinear economies-of-scale (EOS) effects in interval-parameter constraints for a representative waste management problem, an inexact piecewise-linearization-based fuzzy flexible programming (IPFP) model is developed. In IPFP, interval parameters for waste amounts and transportation/operation costs can be quantified; aspiration levels for net system costs, as well as tolerancemore » intervals for both capacities of waste treatment facilities and waste generation rates can be reflected; and the nonlinear EOS effects transformed from objective function to constraints can be approximated. An interactive algorithm is proposed for solving the IPFP model, which in nature is an interval-parameter mixed-integer quadratically constrained programming model. To demonstrate the IPFP's advantages, two alternative models are developed to compare their performances. One is a conventional linear-regression-based inexact fuzzy programming model (IPFP2) and the other is an IPFP model with all right-hand-sides of fussy constraints being the corresponding interval numbers (IPFP3). The comparison results between IPFP and IPFP2 indicate that the optimized waste amounts would have the similar patterns in both models. However, when dealing with EOS effects in constraints, the IPFP2 may underestimate the net system costs while the IPFP can estimate the costs more accurately. The comparison results between IPFP and IPFP3 indicate that their solutions would be significantly different. The decreased system uncertainties in IPFP's solutions demonstrate its effectiveness for providing more satisfactory interval solutions than IPFP3. Following its first application to waste management, the IPFP can be potentially applied to other environmental problems under multiple complexities.« less
GIS Tools to Estimate Average Annual Daily Traffic
DOT National Transportation Integrated Search
2012-06-01
This project presents five tools that were created for a geographical information system to estimate Annual Average Daily : Traffic using linear regression. Three of the tools can be used to prepare spatial data for linear regression. One tool can be...
Jose F. Negron; Willis C. Schaupp; Kenneth E. Gibson; John Anhold; Dawn Hansen; Ralph Thier; Phil Mocettini
1999-01-01
Data collected from Douglas-fir stands infected by the Douglas-fir beetle in Wyoming, Montana, Idaho, and Utah, were used to develop models to estimate amount of mortality in terms of basal area killed. Models were built using stepwise linear regression and regression tree approaches. Linear regression models using initial Douglas-fir basal area were built for all...
Ling, Ru; Liu, Jiawang
2011-12-01
To construct prediction model for health workforce and hospital beds in county hospitals of Hunan by multiple linear regression. We surveyed 16 counties in Hunan with stratified random sampling according to uniform questionnaires,and multiple linear regression analysis with 20 quotas selected by literature view was done. Independent variables in the multiple linear regression model on medical personnels in county hospitals included the counties' urban residents' income, crude death rate, medical beds, business occupancy, professional equipment value, the number of devices valued above 10 000 yuan, fixed assets, long-term debt, medical income, medical expenses, outpatient and emergency visits, hospital visits, actual available bed days, and utilization rate of hospital beds. Independent variables in the multiple linear regression model on county hospital beds included the the population of aged 65 and above in the counties, disposable income of urban residents, medical personnel of medical institutions in county area, business occupancy, the total value of professional equipment, fixed assets, long-term debt, medical income, medical expenses, outpatient and emergency visits, hospital visits, actual available bed days, utilization rate of hospital beds, and length of hospitalization. The prediction model shows good explanatory and fitting, and may be used for short- and mid-term forecasting.
Vu, Cung Khac; Nihei, Kurt; Johnson, Paul A; Guyer, Robert; Ten Cate, James A; Le Bas, Pierre-Yves; Larmat, Carene S
2014-12-30
A system and a method for investigating rock formations includes generating, by a first acoustic source, a first acoustic signal comprising a first plurality of pulses, each pulse including a first modulated signal at a central frequency; and generating, by a second acoustic source, a second acoustic signal comprising a second plurality of pulses. A receiver arranged within the borehole receives a detected signal including a signal being generated by a non-linear mixing process from the first-and-second acoustic signal in a non-linear mixing zone within the intersection volume. The method also includes-processing the received signal to extract the signal generated by the non-linear mixing process over noise or over signals generated by a linear interaction process, or both.
Watanabe, Hiroyuki; Miyazaki, Hiroyasu
2006-01-01
Over- and/or under-correction of QT intervals for changes in heart rate may lead to misleading conclusions and/or masking the potential of a drug to prolong the QT interval. This study examines a nonparametric regression model (Loess Smoother) to adjust the QT interval for differences in heart rate, with an improved fitness over a wide range of heart rates. 240 sets of (QT, RR) observations collected from each of 8 conscious and non-treated beagle dogs were used as the materials for investigation. The fitness of the nonparametric regression model to the QT-RR relationship was compared with four models (individual linear regression, common linear regression, and Bazett's and Fridericia's correlation models) with reference to Akaike's Information Criterion (AIC). Residuals were visually assessed. The bias-corrected AIC of the nonparametric regression model was the best of the models examined in this study. Although the parametric models did not fit, the nonparametric regression model improved the fitting at both fast and slow heart rates. The nonparametric regression model is the more flexible method compared with the parametric method. The mathematical fit for linear regression models was unsatisfactory at both fast and slow heart rates, while the nonparametric regression model showed significant improvement at all heart rates in beagle dogs.
Linear regression analysis: part 14 of a series on evaluation of scientific publications.
Schneider, Astrid; Hommel, Gerhard; Blettner, Maria
2010-11-01
Regression analysis is an important statistical method for the analysis of medical data. It enables the identification and characterization of relationships among multiple factors. It also enables the identification of prognostically relevant risk factors and the calculation of risk scores for individual prognostication. This article is based on selected textbooks of statistics, a selective review of the literature, and our own experience. After a brief introduction of the uni- and multivariable regression models, illustrative examples are given to explain what the important considerations are before a regression analysis is performed, and how the results should be interpreted. The reader should then be able to judge whether the method has been used correctly and interpret the results appropriately. The performance and interpretation of linear regression analysis are subject to a variety of pitfalls, which are discussed here in detail. The reader is made aware of common errors of interpretation through practical examples. Both the opportunities for applying linear regression analysis and its limitations are presented.
On-line analysis of algae in water by discrete three-dimensional fluorescence spectroscopy.
Zhao, Nanjing; Zhang, Xiaoling; Yin, Gaofang; Yang, Ruifang; Hu, Li; Chen, Shuang; Liu, Jianguo; Liu, Wenqing
2018-03-19
In view of the problem of the on-line measurement of algae classification, a method of algae classification and concentration determination based on the discrete three-dimensional fluorescence spectra was studied in this work. The discrete three-dimensional fluorescence spectra of twelve common species of algae belonging to five categories were analyzed, the discrete three-dimensional standard spectra of five categories were built, and the recognition, classification and concentration prediction of algae categories were realized by the discrete three-dimensional fluorescence spectra coupled with non-negative weighted least squares linear regression analysis. The results show that similarities between discrete three-dimensional standard spectra of different categories were reduced and the accuracies of recognition, classification and concentration prediction of the algae categories were significantly improved. By comparing with that of the chlorophyll a fluorescence excitation spectra method, the recognition accuracy rate in pure samples by discrete three-dimensional fluorescence spectra is improved 1.38%, and the recovery rate and classification accuracy in pure diatom samples 34.1% and 46.8%, respectively; the recognition accuracy rate of mixed samples by discrete-three dimensional fluorescence spectra is enhanced by 26.1%, the recovery rate of mixed samples with Chlorophyta 37.8%, and the classification accuracy of mixed samples with diatoms 54.6%.
Cheng, Xiao-Fei; Shi, Pei-Jian; Hui, Cang; Wang, Fu-Sheng; Liu, Guo-Hua; Li, Bai-Lian
2015-04-01
Moso bamboos (Phyllostachys edulis) are important forestry plants in southern China, with substantial roles to play in regional economic and ecological systems. Mixing broad-leaved forests and moso bamboos is a common management practice in China, and it is fundamental to elucidate the interactions between broad-leaved trees and moso bamboos for ensuring the sustainable provision of ecosystem services. We examine how the proportion of broad-leaved forest in a mixed managed zone, topology, and soil profile affects the effective productivity of moso bamboos (i.e., those with significant economic value), using linear regression and generalized additive models. Bamboo's diameter at breast height follows a Weibull distribution. The importance of these variables to bamboo productivity is, respectively, slope (25.9%), the proportion of broad-leaved forest (24.8%), elevation (23.3%), gravel content by volume (16.6%), slope location (8.3%), and soil layer thickness (1.2%). Highest productivity is found on the 25° slope, with a 600-m elevation, and 30% broad-leaved forest. As such, broad-leaved forest in the upper slope can have a strong influence on the effective productivity of moso bamboo, ranking only after slope and before elevation. These factors can be considered in future management practice.
Breastfeeding associated with higher lung function in African American youths with asthma.
Oh, Sam S; Du, Randal; Zeiger, Andrew M; McGarry, Meghan E; Hu, Donglei; Thakur, Neeta; Pino-Yanes, Maria; Galanter, Joshua M; Eng, Celeste; Nishimura, Katherine Keiko; Huntsman, Scott; Farber, Harold J; Meade, Kelley; Avila, Pedro; Serebrisky, Denise; Bibbins-Domingo, Kirsten; Lenoir, Michael A; Ford, Jean G; Brigino-Buenaventura, Emerita; Rodriguez-Cintron, William; Thyne, Shannon M; Sen, Saunak; Rodriguez-Santana, Jose R; Williams, Keoki; Kumar, Rajesh; Burchard, Esteban G
2017-10-01
In the United States, Puerto Ricans and African Americans have lower prevalence of breastfeeding and worse clinical outcomes for asthma compared with other racial/ethnic groups. We hypothesize that the history of breastfeeding is associated with increased forced expiratory volume in 1 second (FEV 1 ) % predicted and reduced asthma exacerbations in Latino and African American youths with asthma. As part of the Genes-environments & Admixture in Latino Americans (GALA II) Study and the Study of African Americans, asthma, Genes & Environments (SAGE II), we conducted case-only analyses in children and adolescents aged 8-21 years with asthma from four different racial/ethnic groups: African Americans (n = 426), Mexican Americans (n = 424), mixed/other Latinos (n = 255), and Puerto Ricans (n = 629). We investigated the association between any breastfeeding in infancy and FEV 1 % predicted using multivariable linear regression; Poisson regression was used to determine the association between breastfeeding and asthma exacerbations. Prevalence of breastfeeding was lower in African Americans (59.4%) and Puerto Ricans (54.9%) compared to Mexican Americans (76.2%) and mixed/other Latinos (66.9%; p < 0.001). After adjusting for covariates, breastfeeding was associated with a 3.58% point increase in FEV 1 % predicted (p = 0.01) and a 21% reduction in asthma exacerbations (p = 0.03) in African Americans only. Breastfeeding was associated with higher FEV 1 % predicted in asthma and reduced number of asthma exacerbations in African American youths, calling attention to continued support for breastfeeding.
Yang, Yingbao; Li, Xiaolong; Pan, Xin; Zhang, Yong; Cao, Chen
2017-01-01
Many downscaling algorithms have been proposed to address the issue of coarse-resolution land surface temperature (LST) derived from available satellite-borne sensors. However, few studies have focused on improving LST downscaling in urban areas with several mixed surface types. In this study, LST was downscaled by a multiple linear regression model between LST and multiple scale factors in mixed areas with three or four surface types. The correlation coefficients (CCs) between LST and the scale factors were used to assess the importance of the scale factors within a moving window. CC thresholds determined which factors participated in the fitting of the regression equation. The proposed downscaling approach, which involves an adaptive selection of the scale factors, was evaluated using the LST derived from four Landsat 8 thermal imageries of Nanjing City in different seasons. Results of the visual and quantitative analyses show that the proposed approach achieves relatively satisfactory downscaling results on 11 August, with coefficient of determination and root-mean-square error of 0.87 and 1.13 °C, respectively. Relative to other approaches, our approach shows the similar accuracy and the availability in all seasons. The best (worst) availability occurred in the region of vegetation (water). Thus, the approach is an efficient and reliable LST downscaling method. Future tasks include reliable LST downscaling in challenging regions and the application of our model in middle and low spatial resolutions. PMID:28368301
Strategies for improving water use efficiency of livestock production in rain-fed systems.
Kebebe, E G; Oosting, S J; Haileslassie, A; Duncan, A J; de Boer, I J M
2015-05-01
Livestock production is a major consumer of fresh water, and the influence of livestock production on global fresh water resources is increasing because of the growing demand for livestock products. Increasing water use efficiency of livestock production, therefore, can contribute to the overall water use efficiency of agriculture. Previous studies have reported significant variation in livestock water productivity (LWP) within and among farming systems. Underlying causes of this variation in LWP require further investigation. The objective of this paper was to identify the factors that explain the variation in LWP within and among farming systems in Ethiopia. We quantified LWP for various farms in mixed-crop livestock systems and explored the effect of household demographic characteristics and farm assets on LWP using ANOVA and multilevel mixed-effect linear regression. We focused on water used to cultivate feeds on privately owned agricultural lands. There was a difference in LWP among farming systems and wealth categories. Better-off households followed by medium households had the highest LWP, whereas poor households had the lowest LWP. The variation in LWP among wealth categories could be explained by the differences in the ownership of livestock and availability of family labor. Regression results showed that the age of the household head, the size of the livestock holding and availability of family labor affected LWP positively. The results suggest that water use efficiency could be improved by alleviating resource constraints such as access to farm labor and livestock assets, oxen in particular.
Prediction of monthly rainfall in Victoria, Australia: Clusterwise linear regression approach
NASA Astrophysics Data System (ADS)
Bagirov, Adil M.; Mahmood, Arshad; Barton, Andrew
2017-05-01
This paper develops the Clusterwise Linear Regression (CLR) technique for prediction of monthly rainfall. The CLR is a combination of clustering and regression techniques. It is formulated as an optimization problem and an incremental algorithm is designed to solve it. The algorithm is applied to predict monthly rainfall in Victoria, Australia using rainfall data with five input meteorological variables over the period of 1889-2014 from eight geographically diverse weather stations. The prediction performance of the CLR method is evaluated by comparing observed and predicted rainfall values using four measures of forecast accuracy. The proposed method is also compared with the CLR using the maximum likelihood framework by the expectation-maximization algorithm, multiple linear regression, artificial neural networks and the support vector machines for regression models using computational results. The results demonstrate that the proposed algorithm outperforms other methods in most locations.
Regression Model Term Selection for the Analysis of Strain-Gage Balance Calibration Data
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert Manfred; Volden, Thomas R.
2010-01-01
The paper discusses the selection of regression model terms for the analysis of wind tunnel strain-gage balance calibration data. Different function class combinations are presented that may be used to analyze calibration data using either a non-iterative or an iterative method. The role of the intercept term in a regression model of calibration data is reviewed. In addition, useful algorithms and metrics originating from linear algebra and statistics are recommended that will help an analyst (i) to identify and avoid both linear and near-linear dependencies between regression model terms and (ii) to make sure that the selected regression model of the calibration data uses only statistically significant terms. Three different tests are suggested that may be used to objectively assess the predictive capability of the final regression model of the calibration data. These tests use both the original data points and regression model independent confirmation points. Finally, data from a simplified manual calibration of the Ames MK40 balance is used to illustrate the application of some of the metrics and tests to a realistic calibration data set.
Generalized Multilevel Structural Equation Modeling
ERIC Educational Resources Information Center
Rabe-Hesketh, Sophia; Skrondal, Anders; Pickles, Andrew
2004-01-01
A unifying framework for generalized multilevel structural equation modeling is introduced. The models in the framework, called generalized linear latent and mixed models (GLLAMM), combine features of generalized linear mixed models (GLMM) and structural equation models (SEM) and consist of a response model and a structural model for the latent…
Random regression analyses using B-splines to model growth of Australian Angus cattle
Meyer, Karin
2005-01-01
Regression on the basis function of B-splines has been advocated as an alternative to orthogonal polynomials in random regression analyses. Basic theory of splines in mixed model analyses is reviewed, and estimates from analyses of weights of Australian Angus cattle from birth to 820 days of age are presented. Data comprised 84 533 records on 20 731 animals in 43 herds, with a high proportion of animals with 4 or more weights recorded. Changes in weights with age were modelled through B-splines of age at recording. A total of thirteen analyses, considering different combinations of linear, quadratic and cubic B-splines and up to six knots, were carried out. Results showed good agreement for all ages with many records, but fluctuated where data were sparse. On the whole, analyses using B-splines appeared more robust against "end-of-range" problems and yielded more consistent and accurate estimates of the first eigenfunctions than previous, polynomial analyses. A model fitting quadratic B-splines, with knots at 0, 200, 400, 600 and 821 days and a total of 91 covariance components, appeared to be a good compromise between detailedness of the model, number of parameters to be estimated, plausibility of results, and fit, measured as residual mean square error. PMID:16093011
Brumback, Babette A; Cai, Zhuangyu; Dailey, Amy B
2014-05-15
Reasons for health disparities may include neighborhood-level factors, such as availability of health services, social norms, and environmental determinants, as well as individual-level factors. Investigating health inequalities using nationally or locally representative data often requires an approach that can accommodate a complex sampling design, in which individuals have unequal probabilities of selection into the study. The goal of the present article is to review and compare methods of estimating or accounting for neighborhood influences with complex survey data. We considered 3 types of methods, each generalized for use with complex survey data: ordinary regression, conditional likelihood regression, and generalized linear mixed-model regression. The relative strengths and weaknesses of each method differ from one study to another; we provide an overview of the advantages and disadvantages of each method theoretically, in terms of the nature of the estimable associations and the plausibility of the assumptions required for validity, and also practically, via a simulation study and 2 epidemiologic data analyses. The first analysis addresses determinants of repeat mammography screening use using data from the 2005 National Health Interview Survey. The second analysis addresses disparities in preventive oral health care using data from the 2008 Florida Behavioral Risk Factor Surveillance System Survey.
Optimization of Orifice Geometry for Cross-Flow Mixing in a Cylindrical Duct
NASA Technical Reports Server (NTRS)
Sowa, W. A.; Kroll, J. T.; Samuelsen, G. S.; Holdeman, J. D.
1994-01-01
Mixing of gaseous jets in a cross-flow has significant applications in engineering, one example of which is the dilution zone of a gas turbine combustor. Despite years of study, the design of jet injection in combustors is largely based on practical experience. A series of experiments was undertaken to delineate the optimal mixer orifice geometry. A cross-flow to core-flow momentum-flux ratio of 40 and a mass flow ratio of 2.5 were selected as representative of an advanced design. An experimental test matrix was designed around three variables: the number of orifices, the orifice aspect ratio (long-to-short dimension), and the orifice angle. A regression analysis was performed on the data to arrive at an interpolating equation that predicted the mixing performance of orifice geometry combinations within the range of the test matrix parameters. Results indicate that mixture uniformity is a non-linear function of the number of orifices, the orifice aspect ratio, and the orifice angle. Optimum mixing occurs when the asymptotic mean jet trajectories are in the range of 0.35 less than r/R less than 0.5 (where r = 0 is at the mixer wall) at z/R = 1.0. At the optimum number of orifices, the difference between shallow-angled slots with large aspect ratios and round holes is minimal and either approach will lead to good mixing performance. At the optimum number of orifices, it appears possible to have two local optimums where one corresponds to an aspect ratio of 1.0 and the other to a high aspect ratio.
Leung, Michael; Bassani, Diego G; Racine-Poon, Amy; Goldenberg, Anna; Ali, Syed Asad; Kang, Gagandeep; Premkumar, Prasanna S; Roth, Daniel E
2017-09-10
Conditioning child growth measures on baseline accounts for regression to the mean (RTM). Here, we present the "conditional random slope" (CRS) model, based on a linear-mixed effects model that incorporates a baseline-time interaction term that can accommodate multiple data points for a child while also directly accounting for RTM. In two birth cohorts, we applied five approaches to estimate child growth velocities from 0 to 12 months to assess the effect of increasing data density (number of measures per child) on the magnitude of RTM of unconditional estimates, and the correlation and concordance between the CRS and four alternative metrics. Further, we demonstrated the differential effect of the choice of velocity metric on the magnitude of the association between infant growth and stunting at 2 years. RTM was minimally attenuated by increasing data density for unconditional growth modeling approaches. CRS and classical conditional models gave nearly identical estimates with two measures per child. Compared to the CRS estimates, unconditional metrics had moderate correlation (r = 0.65-0.91), but poor agreement in the classification of infants with relatively slow growth (kappa = 0.38-0.78). Estimates of the velocity-stunting association were the same for CRS and classical conditional models but differed substantially between conditional versus unconditional metrics. The CRS can leverage the flexibility of linear mixed models while addressing RTM in longitudinal analyses. © 2017 The Authors American Journal of Human Biology Published by Wiley Periodicals, Inc.
Scoring and staging systems using cox linear regression modeling and recursive partitioning.
Lee, J W; Um, S H; Lee, J B; Mun, J; Cho, H
2006-01-01
Scoring and staging systems are used to determine the order and class of data according to predictors. Systems used for medical data, such as the Child-Turcotte-Pugh scoring and staging systems for ordering and classifying patients with liver disease, are often derived strictly from physicians' experience and intuition. We construct objective and data-based scoring/staging systems using statistical methods. We consider Cox linear regression modeling and recursive partitioning techniques for censored survival data. In particular, to obtain a target number of stages we propose cross-validation and amalgamation algorithms. We also propose an algorithm for constructing scoring and staging systems by integrating local Cox linear regression models into recursive partitioning, so that we can retain the merits of both methods such as superior predictive accuracy, ease of use, and detection of interactions between predictors. The staging system construction algorithms are compared by cross-validation evaluation of real data. The data-based cross-validation comparison shows that Cox linear regression modeling is somewhat better than recursive partitioning when there are only continuous predictors, while recursive partitioning is better when there are significant categorical predictors. The proposed local Cox linear recursive partitioning has better predictive accuracy than Cox linear modeling and simple recursive partitioning. This study indicates that integrating local linear modeling into recursive partitioning can significantly improve prediction accuracy in constructing scoring and staging systems.
Bai, Lu; Chan, Ching-Yao; Liu, Pan; Xu, Chengcheng
2017-10-03
Electric bikes (e-bikes) have been one of the fastest growing trip modes in Southeast Asia over the past 2 decades. The increasing popularity of e-bikes raised some safety concerns regarding urban transport systems. The primary objective of this study was to identify whether and how the generalized linear regression model (GLM) could be used to relate cyclists' safety with various contributing factors when riding in a mid-block bike lane. The types of 2-wheeled vehicles in the study included bicycle-style electric bicycles (BSEBs), scooter-style electric bicycles (SSEBs), and regular bicycles (RBs). Traffic conflict technology was applied as a surrogate measure to evaluate the safety of 2-wheeled vehicles. The safety performance model was developed by adopting a generalized linear regression model for relating the frequency of rear-end conflicts between e-bikes and regular bikes to the operating speeds of BSEBs, SSEBs, and RBs in mid-block bike lanes. The frequency of rear-end conflicts between e-bikes and bikes increased with an increase in the operating speeds of e-bikes and the volume of e-bikes and bikes and decreased with an increase in the width of bike lanes. The large speed difference between e-bikes and bikes increased the frequency of rear-end conflicts between e-bikes and bikes in mid-block bike lanes. A 1% increase in the average operating speed of e-bikes would increase the expected number of rear-end conflicts between e-bikes and bikes by 1.48%. A 1% increase in the speed difference between e-bikes and bikes would increase the expected number of rear-end conflicts between e-bikes/bikes by 0.16%. The conflict frequency in mid-block bike lanes can be modeled using generalized linear regression models. The factors that significantly affected the frequency of rear-end conflicts included the operating speeds of e-bikes, the speed difference between e-bikes and regular bikes, the volume of e-bikes, the volume of bikes, and the width of bike lanes. The safety performance model can help better understand the causes of crash occurrences in mid-block bike lanes.
Scarneciu, Camelia C; Sangeorzan, Livia; Rus, Horatiu; Scarneciu, Vlad D; Varciu, Mihai S; Andreescu, Oana; Scarneciu, Ioan
2017-01-01
This study aimed at assessing the incidence of pulmonary hypertension (PH) at newly diagnosed hyperthyroid patients and at finding a simple model showing the complex functional relation between pulmonary hypertension in hyperthyroidism and the factors causing it. The 53 hyperthyroid patients (H-group) were evaluated mainly by using an echocardiographical method and compared with 35 euthyroid (E-group) and 25 healthy people (C-group). In order to identify the factors causing pulmonary hypertension the statistical method of comparing the values of arithmetical means is used. The functional relation between the two random variables (PAPs and each of the factors determining it within our research study) can be expressed by linear or non-linear function. By applying the linear regression method described by a first-degree equation the line of regression (linear model) has been determined; by applying the non-linear regression method described by a second degree equation, a parabola-type curve of regression (non-linear or polynomial model) has been determined. We made the comparison and the validation of these two models by calculating the determination coefficient (criterion 1), the comparison of residuals (criterion 2), application of AIC criterion (criterion 3) and use of F-test (criterion 4). From the H-group, 47% have pulmonary hypertension completely reversible when obtaining euthyroidism. The factors causing pulmonary hypertension were identified: previously known- level of free thyroxin, pulmonary vascular resistance, cardiac output; new factors identified in this study- pretreatment period, age, systolic blood pressure. According to the four criteria and to the clinical judgment, we consider that the polynomial model (graphically parabola- type) is better than the linear one. The better model showing the functional relation between the pulmonary hypertension in hyperthyroidism and the factors identified in this study is given by a polynomial equation of second degree where the parabola is its graphical representation.
Meng, Xia; Fu, Qingyan; Ma, Zongwei; Chen, Li; Zou, Bin; Zhang, Yan; Xue, Wenbo; Wang, Jinnan; Wang, Dongfang; Kan, Haidong; Liu, Yang
2016-01-01
Development of exposure assessment model is the key component for epidemiological studies concerning air pollution, but the evidence from China is limited. Therefore, a linear mixed effects (LME) model was established in this study in a Chinese metropolis by incorporating aerosol optical depth (AOD), meteorological information and the land use regression (LUR) model to predict ground PM10 levels on high spatiotemporal resolution. The cross validation (CV) R(2) and the RMSE of the LME model were 0.87 and 19.2 μg/m(3), respectively. The relative prediction error (RPE) of daily and annual mean predicted PM10 concentrations were 19.1% and 7.5%, respectively. This study was the first attempt in China to estimate both short-term and long-term variation of PM10 levels with high spatial resolution in a Chinese metropolis with the LME model. The results suggested that the LME model could provide exposure assessment for short-term and long-term epidemiological studies in China. Copyright © 2015 Elsevier Ltd. All rights reserved.
Online EEG artifact removal for BCI applications by adaptive spatial filtering.
Guarnieri, Roberto; Marino, Marco; Barban, Federico; Ganzetti, Marco; Mantini, Dante
2018-06-28
The performance of brain computer interfaces (BCIs) based on electroencephalography (EEG) data strongly depends on the effective attenuation of artifacts that are mixed in the recordings. To address this problem, we have developed a novel online EEG artifact removal method for BCI applications, which combines blind source separation (BSS) and regression (REG) analysis. The BSS-REG method relies on the availability of a calibration dataset of limited duration for the initialization of a spatial filter using BSS. Online artifact removal is implemented by dynamically adjusting the spatial filter in the actual experiment, based on a linear regression technique. Our results showed that the BSS-REG method is capable of attenuating different kinds of artifacts, including ocular and muscular, while preserving true neural activity. Thanks to its low computational requirements, BSS-REG can be applied to low-density as well as high-density EEG data. We argue that BSS-REG may enable the development of novel BCI applications requiring high-density recordings, such as source-based neurofeedback and closed-loop neuromodulation. © 2018 IOP Publishing Ltd.
McKone, Edward F; Velentgas, Priscilla; Swenson, Anna J; Goss, Christopher H
2015-09-01
The extent to which sweat chloride concentration predicts survival and clinical phenotype independently of CFTR genotype in cystic fibrosis is not well understood. We analyzed the US Cystic Fibrosis Foundation Patient Registry data using Cox regression to examine the relationship between sweat chloride concentration (<60, 60-<80, ≥80mmol/L), CFTR genotype (high and lower risk for lung function decline), and survival and mixed linear regression to examine the relationship between sweat chloride, CFTR genotype, and measures of lung function and growth. When included in the same model, CFTR genotype, but not sweat chloride, was independently associated with survival and with lung function, height, and BMI. Among patients with unclassified CFTR genotype, sweat chloride was an independent predictor of survival (<60 HR 0.53 [0.37, 0.77], 60-<80 0.51 [0.42, 0.63]). Sweat chloride concentration may be a useful predictor of mortality and clinical phenotype when CFTR genotype functional class is unclassified. Copyright © 2015 European Cystic Fibrosis Society. Published by Elsevier B.V. All rights reserved.
Hejri-Zarifi, Sudiyeh; Ahmadian-Kouchaksaraei, Zahra; Pourfarzad, Amir; Khodaparast, Mohammad Hossein Haddad
2014-12-01
Germinated palm date seeds were milled into two fractions: germ and residue. Dough rheological characteristics, baking (specific volume and sensory evaluation), and textural properties (at first day and during storage for 5 days) were determined in Barbari flat bread. Germ and residue fractions were incorporated at various levels ranged in 0.5-3 g/100 g of wheat flour. Water absorption, arrival time and gelatination temperature were decreased by germ fraction but accompanied by an increasing effect on the mixing tolerance index and degree of softening in most levels. Although improvement in dough stability was monitored but specific volume of bread was not affected by both fractions. Texture analysis of bread samples during 5 days of storage indicated that both fractions of germinated date seeds were able to diminish bread staling. Avrami non-linear regression equation was chosen as useful mathematical model to properly study bread hardening kinetics. In addition, principal component analysis (PCA) allowed discriminating among dough and bread specialties. Partial least squares regression (PLSR) models were applied to determine the relationships between sensory and instrumental data.
Evaluation of third-degree and fourth-degree laceration rates as quality indicators.
Friedman, Alexander M; Ananth, Cande V; Prendergast, Eri; D'Alton, Mary E; Wright, Jason D
2015-04-01
To examine the patterns and predictors of third-degree and fourth-degree laceration in women undergoing vaginal delivery. We identified a population-based cohort of women in the United States who underwent a vaginal delivery between 1998 and 2010 using the Nationwide Inpatient Sample. Multivariable log-linear regression models were developed to account for patient, obstetric, and hospital factors related to lacerations. Between-hospital variability of laceration rates was calculated using generalized log-linear mixed models. Among 7,096,056 women who underwent vaginal delivery in 3,070 hospitals, 3.3% (n=232,762) had a third-degree laceration and 1.1% (n=76,347) had a fourth-degree laceration. In an adjusted model for fourth-degree lacerations, important risk factors included shoulder dystocia and forceps and vacuum deliveries with and without episiotomy. Other demographic, obstetric, medical, and hospital variables, although statistically significant, were not major determinants of lacerations. Risk factors in a multivariable model for third-degree lacerations were similar to those in the fourth-degree model. Regression analysis of hospital rates (n=3,070) of lacerations demonstrated limited between-hospital variation. Risk of third-degree and fourth-degree laceration was most strongly related to operative delivery and shoulder dystocia. Between-hospital variation was limited. Given these findings and that the most modifiable practice related to lacerations would be reduction in operative vaginal deliveries (and a possible increase in cesarean delivery), third-degree and fourth-degree laceration rates may be a quality metric of limited utility.
Choudhary, Pushpa; Velaga, Nagendra R
2017-09-01
This study analysed and modelled the effects of conversation and texting (each with two difficulty levels) on driving performance of Indian drivers in terms of their mean speed and accident avoiding abilities; and further explored the relationship between speed reduction strategy of the drivers and their corresponding accident frequency. 100 drivers of three different age groups (young, mid-age and old-age) participated in the simulator study. Two sudden events of Indian context: unexpected crossing of pedestrians and joining of parked vehicles from road side, were simulated for estimating the accident probabilities. Generalized linear mixed models approach was used for developing linear regression models for mean speed and binary logistic regression models for accident probability. The results of the models showed that the drivers significantly compensated the increased workload by reducing their mean speed by 2.62m/s and 5.29m/s in the presence of conversation and texting tasks respectively. The logistic models for accident probabilities showed that the accident probabilities increased by 3 and 4 times respectively when the drivers were conversing or texting on a phone during driving. Further, the relationship between the speed reduction patterns and their corresponding accident frequencies showed that all the drivers compensated differently; but, among all the drivers, only few drivers, who compensated by reducing the speed by 30% or more, were able to fully offset the increased accident risk associated with the phone use. Copyright © 2017 Elsevier Ltd. All rights reserved.
Contribution of Sand-Associated Enterococci to Dry Weather Water Quality
2015-01-01
Culturable enterococci and a suite of environmental variables were collected during a predominantly dry summer at a beach impacted by nonpoint source pollution. These data were used to evaluate sands as a source of enterococci to nearshore waters, and to assess the relationship between environmental factors and dry-weather enterococci abundance. Best-fit multiple linear regressions used environmental variables to explain more than half of the observed variation in enterococci in water and dry sands. Notably, during dry weather the abundance of enterococci in dry sands at the mean high-tide line was significantly positively related to sand moisture content (ranging from <1–4%), and the daily mean ENT in water could be predicted by a linear regression with turbidity alone. Temperature was also positively correlated with ENT abundance in this study, which may indicate an important role of seasonal warming in temperate regions. Inundation by spring tides was the primary rewetting mechanism that sustained culturable enterococci populations in high-tide sands. Tidal forcing modulated the abundance of enterococci in the water, as both turbidity and enterococci were elevated during ebb and flood tides. The probability of samples violating the single-sample maximum was significantly greater when collected during periods with increased tidal range: spring ebb and flood tides. Tidal forcing also affected groundwater mixing zones, mobilizing enterococci from sand to water. These data show that routine monitoring programs using discrete enterococci measurements may be biased by tides and other environmental factors, providing a flawed basis for beach closure decisions. PMID:25479559
Contribution of sand-associated enterococci to dry weather water quality.
Halliday, Elizabeth; Ralston, David K; Gast, Rebecca J
2015-01-06
Culturable enterococci and a suite of environmental variables were collected during a predominantly dry summer at a beach impacted by nonpoint source pollution. These data were used to evaluate sands as a source of enterococci to nearshore waters, and to assess the relationship between environmental factors and dry-weather enterococci abundance. Best-fit multiple linear regressions used environmental variables to explain more than half of the observed variation in enterococci in water and dry sands. Notably, during dry weather the abundance of enterococci in dry sands at the mean high-tide line was significantly positively related to sand moisture content (ranging from <1-4%), and the daily mean ENT in water could be predicted by a linear regression with turbidity alone. Temperature was also positively correlated with ENT abundance in this study, which may indicate an important role of seasonal warming in temperate regions. Inundation by spring tides was the primary rewetting mechanism that sustained culturable enterococci populations in high-tide sands. Tidal forcing modulated the abundance of enterococci in the water, as both turbidity and enterococci were elevated during ebb and flood tides. The probability of samples violating the single-sample maximum was significantly greater when collected during periods with increased tidal range: spring ebb and flood tides. Tidal forcing also affected groundwater mixing zones, mobilizing enterococci from sand to water. These data show that routine monitoring programs using discrete enterococci measurements may be biased by tides and other environmental factors, providing a flawed basis for beach closure decisions.
Cigarette smoking and perception of a movie character in a film trailer.
Hanewinkel, Reiner
2009-01-01
To study the effects of smoking in a film trailer. Experimental study. Ten secondary schools in Northern Germany. A sample of 1051 adolescents with a mean (SD) age of 14.2 (1.8) years. Main Exposures Participants were randomized to view a 42-second film trailer in which the attractive female character either smoked for about 3 seconds or did not smoke. Perception of the character was measured via an 8-item semantic differential scale. Each item consisted of a polar-opposite pair (eg, "sexy/unsexy") divided on a 7-point scale. Responses to individual items were summed and averaged. This scale was named "attractiveness." The Cronbach alpha for the attractiveness rating was 0.85. Multilevel mixed-effects linear regression was used to test the effect of smoking in a film trailer. Smoking in the film trailer did not reach significance in the linear regression model (z = 0.73; P = .47). Smoking status of the recipient (z = 3.81; P < .001) and the interaction between smoking in the film trailer and smoking status of the recipient (z = 2.21; P = .03) both reached statistical significance. Ever smokers and never smokers did not differ in their perception of the female character in the nonsmoking film trailer. In the smoking film trailer, ever smokers judged the character significantly more attractive than never smokers. Even incidental smoking in a very short film trailer might strengthen the attractiveness of smokers in youth who have already tried their first cigarettes.
As a fast and effective technique, the multiple linear regression (MLR) method has been widely used in modeling and prediction of beach bacteria concentrations. Among previous works on this subject, however, several issues were insufficiently or inconsistently addressed. Those is...
A simplified competition data analysis for radioligand specific activity determination.
Venturino, A; Rivera, E S; Bergoc, R M; Caro, R A
1990-01-01
Non-linear regression and two-step linear fit methods were developed to determine the actual specific activity of 125I-ovine prolactin by radioreceptor self-displacement analysis. The experimental results obtained by the different methods are superposable. The non-linear regression method is considered to be the most adequate procedure to calculate the specific activity, but if its software is not available, the other described methods are also suitable.
Height and Weight Estimation From Anthropometric Measurements Using Machine Learning Regressions
Fernandes, Bruno J. T.; Roque, Alexandre
2018-01-01
Height and weight are measurements explored to tracking nutritional diseases, energy expenditure, clinical conditions, drug dosages, and infusion rates. Many patients are not ambulant or may be unable to communicate, and a sequence of these factors may not allow accurate estimation or measurements; in those cases, it can be estimated approximately by anthropometric means. Different groups have proposed different linear or non-linear equations which coefficients are obtained by using single or multiple linear regressions. In this paper, we present a complete study of the application of different learning models to estimate height and weight from anthropometric measurements: support vector regression, Gaussian process, and artificial neural networks. The predicted values are significantly more accurate than that obtained with conventional linear regressions. In all the cases, the predictions are non-sensitive to ethnicity, and to gender, if more than two anthropometric parameters are analyzed. The learning model analysis creates new opportunities for anthropometric applications in industry, textile technology, security, and health care. PMID:29651366
NASA Astrophysics Data System (ADS)
Samhouri, M.; Al-Ghandoor, A.; Fouad, R. H.
2009-08-01
In this study two techniques, for modeling electricity consumption of the Jordanian industrial sector, are presented: (i) multivariate linear regression and (ii) neuro-fuzzy models. Electricity consumption is modeled as function of different variables such as number of establishments, number of employees, electricity tariff, prevailing fuel prices, production outputs, capacity utilizations, and structural effects. It was found that industrial production and capacity utilization are the most important variables that have significant effect on future electrical power demand. The results showed that both the multivariate linear regression and neuro-fuzzy models are generally comparable and can be used adequately to simulate industrial electricity consumption. However, comparison that is based on the square root average squared error of data suggests that the neuro-fuzzy model performs slightly better for future prediction of electricity consumption than the multivariate linear regression model. Such results are in full agreement with similar work, using different methods, for other countries.
Carvalho, Carlos; Gomes, Danielo G.; Agoulmine, Nazim; de Souza, José Neuman
2011-01-01
This paper proposes a method based on multivariate spatial and temporal correlation to improve prediction accuracy in data reduction for Wireless Sensor Networks (WSN). Prediction of data not sent to the sink node is a technique used to save energy in WSNs by reducing the amount of data traffic. However, it may not be very accurate. Simulations were made involving simple linear regression and multiple linear regression functions to assess the performance of the proposed method. The results show a higher correlation between gathered inputs when compared to time, which is an independent variable widely used for prediction and forecasting. Prediction accuracy is lower when simple linear regression is used, whereas multiple linear regression is the most accurate one. In addition to that, our proposal outperforms some current solutions by about 50% in humidity prediction and 21% in light prediction. To the best of our knowledge, we believe that we are probably the first to address prediction based on multivariate correlation for WSN data reduction. PMID:22346626
Using the Ridge Regression Procedures to Estimate the Multiple Linear Regression Coefficients
NASA Astrophysics Data System (ADS)
Gorgees, HazimMansoor; Mahdi, FatimahAssim
2018-05-01
This article concerns with comparing the performance of different types of ordinary ridge regression estimators that have been already proposed to estimate the regression parameters when the near exact linear relationships among the explanatory variables is presented. For this situations we employ the data obtained from tagi gas filling company during the period (2008-2010). The main result we reached is that the method based on the condition number performs better than other methods since it has smaller mean square error (MSE) than the other stated methods.
Strand, Matthew; Sillau, Stefan; Grunwald, Gary K; Rabinovitch, Nathan
2014-02-10
Regression calibration provides a way to obtain unbiased estimators of fixed effects in regression models when one or more predictors are measured with error. Recent development of measurement error methods has focused on models that include interaction terms between measured-with-error predictors, and separately, methods for estimation in models that account for correlated data. In this work, we derive explicit and novel forms of regression calibration estimators and associated asymptotic variances for longitudinal models that include interaction terms, when data from instrumental and unbiased surrogate variables are available but not the actual predictors of interest. The longitudinal data are fit using linear mixed models that contain random intercepts and account for serial correlation and unequally spaced observations. The motivating application involves a longitudinal study of exposure to two pollutants (predictors) - outdoor fine particulate matter and cigarette smoke - and their association in interactive form with levels of a biomarker of inflammation, leukotriene E4 (LTE 4 , outcome) in asthmatic children. Because the exposure concentrations could not be directly observed, we used measurements from a fixed outdoor monitor and urinary cotinine concentrations as instrumental variables, and we used concentrations of fine ambient particulate matter and cigarette smoke measured with error by personal monitors as unbiased surrogate variables. We applied the derived regression calibration methods to estimate coefficients of the unobserved predictors and their interaction, allowing for direct comparison of toxicity of the different pollutants. We used simulations to verify accuracy of inferential methods based on asymptotic theory. Copyright © 2013 John Wiley & Sons, Ltd.
Time and frequency domain analysis of sampled data controllers via mixed operation equations
NASA Technical Reports Server (NTRS)
Frisch, H. P.
1981-01-01
Specification of the mathematical equations required to define the dynamic response of a linear continuous plant, subject to sampled data control, is complicated by the fact that the digital components of the control system cannot be modeled via linear ordinary differential equations. This complication can be overcome by introducing two new mathematical operations; namely, the operation of zero order hold and digial delay. It is shown that by direct utilization of these operations, a set of linear mixed operation equations can be written and used to define the dynamic response characteristics of the controlled system. It also is shown how these linear mixed operation equations lead, in an automatable manner, directly to a set of finite difference equations which are in a format compatible with follow on time and frequency domain analysis methods.
Weichenthal, Scott; Ryswyk, Keith Van; Goldstein, Alon; Bagg, Scott; Shekkarizfard, Maryam; Hatzopoulou, Marianne
2016-04-01
Existing evidence suggests that ambient ultrafine particles (UFPs) (<0.1µm) may contribute to acute cardiorespiratory morbidity. However, few studies have examined the long-term health effects of these pollutants owing in part to a need for exposure surfaces that can be applied in large population-based studies. To address this need, we developed a land use regression model for UFPs in Montreal, Canada using mobile monitoring data collected from 414 road segments during the summer and winter months between 2011 and 2012. Two different approaches were examined for model development including standard multivariable linear regression and a machine learning approach (kernel-based regularized least squares (KRLS)) that learns the functional form of covariate impacts on ambient UFP concentrations from the data. The final models included parameters for population density, ambient temperature and wind speed, land use parameters (park space and open space), length of local roads and rail, and estimated annual average NOx emissions from traffic. The final multivariable linear regression model explained 62% of the spatial variation in ambient UFP concentrations whereas the KRLS model explained 79% of the variance. The KRLS model performed slightly better than the linear regression model when evaluated using an external dataset (R(2)=0.58 vs. 0.55) or a cross-validation procedure (R(2)=0.67 vs. 0.60). In general, our findings suggest that the KRLS approach may offer modest improvements in predictive performance compared to standard multivariable linear regression models used to estimate spatial variations in ambient UFPs. However, differences in predictive performance were not statistically significant when evaluated using the cross-validation procedure. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.
Alzheimer's Disease Detection by Pseudo Zernike Moment and Linear Regression Classification.
Wang, Shui-Hua; Du, Sidan; Zhang, Yin; Phillips, Preetha; Wu, Le-Nan; Chen, Xian-Qing; Zhang, Yu-Dong
2017-01-01
This study presents an improved method based on "Gorji et al. Neuroscience. 2015" by introducing a relatively new classifier-linear regression classification. Our method selects one axial slice from 3D brain image, and employed pseudo Zernike moment with maximum order of 15 to extract 256 features from each image. Finally, linear regression classification was harnessed as the classifier. The proposed approach obtains an accuracy of 97.51%, a sensitivity of 96.71%, and a specificity of 97.73%. Our method performs better than Gorji's approach and five other state-of-the-art approaches. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Missing Data in Clinical Studies: Issues and Methods
Ibrahim, Joseph G.; Chu, Haitao; Chen, Ming-Hui
2012-01-01
Missing data are a prevailing problem in any type of data analyses. A participant variable is considered missing if the value of the variable (outcome or covariate) for the participant is not observed. In this article, various issues in analyzing studies with missing data are discussed. Particularly, we focus on missing response and/or covariate data for studies with discrete, continuous, or time-to-event end points in which generalized linear models, models for longitudinal data such as generalized linear mixed effects models, or Cox regression models are used. We discuss various classifications of missing data that may arise in a study and demonstrate in several situations that the commonly used method of throwing out all participants with any missing data may lead to incorrect results and conclusions. The methods described are applied to data from an Eastern Cooperative Oncology Group phase II clinical trial of liver cancer and a phase III clinical trial of advanced non–small-cell lung cancer. Although the main area of application discussed here is cancer, the issues and methods we discuss apply to any type of study. PMID:22649133
Zhai, Hong Lin; Zhai, Yue Yuan; Li, Pei Zhen; Tian, Yue Li
2013-01-21
A very simple approach to quantitative analysis is proposed based on the technology of digital image processing using three-dimensional (3D) spectra obtained by high-performance liquid chromatography coupled with a diode array detector (HPLC-DAD). As the region-based shape features of a grayscale image, Zernike moments with inherently invariance property were employed to establish the linear quantitative models. This approach was applied to the quantitative analysis of three compounds in mixed samples using 3D HPLC-DAD spectra, and three linear models were obtained, respectively. The correlation coefficients (R(2)) for training and test sets were more than 0.999, and the statistical parameters and strict validation supported the reliability of established models. The analytical results suggest that the Zernike moment selected by stepwise regression can be used in the quantitative analysis of target compounds. Our study provides a new idea for quantitative analysis using 3D spectra, which can be extended to the analysis of other 3D spectra obtained by different methods or instruments.
Final Report---Optimization Under Nonconvexity and Uncertainty: Algorithms and Software
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jeff Linderoth
2011-11-06
the goal of this work was to develop new algorithmic techniques for solving large-scale numerical optimization problems, focusing on problems classes that have proven to be among the most challenging for practitioners: those involving uncertainty and those involving nonconvexity. This research advanced the state-of-the-art in solving mixed integer linear programs containing symmetry, mixed integer nonlinear programs, and stochastic optimization problems. The focus of the work done in the continuation was on Mixed Integer Nonlinear Programs (MINLP)s and Mixed Integer Linear Programs (MILP)s, especially those containing a great deal of symmetry.
Shimotsu, Scott T; Jones-Webb, Rhonda J; MacLehose, Richard F; Nelson, Toben F; Forster, Jean L; Lytle, Leslie A
2013-10-01
The neighborhoods where people live can influence their drinking behavior. We hypothesized that living in a neighborhood with lower median income, higher alcohol outlet density, and only liquor stores and no grocery stores would be associated with higher alcohol consumption after adjusting for individual demographic and lifestyle factors. We used two self-report measures to assess alcohol consumption in a sample of 9959 adults living in a large Midwestern county: volume of alcohol consumed (count) and binge drinking (5 or more drinks vs.<5 drinks). We measured census tract median annual household income based on U.S. Census data. Alcohol outlet density was measured using the number of liquor stores divided by the census tract roadway miles. The mix of liquor and food stores in census tracts was assessed using a categorical variable based on the number of liquor and number of food stores using data from InfoUSA. Weighted hierarchical linear and Poisson regression were used to test our study hypothesis. Retail mix was associated with binge drinking. Individuals living in census tracts with only liquor stores had a 46% higher risk of binge drinking than individuals living in census tracts with food stores only after controlling for demographic and lifestyle factors. Census tract characteristics such as retail mix may partly explain variability in drinking behavior. Future research should explore the mix of stores, not just the over-concentration of liquor stores in census tracts. Published by Elsevier Ireland Ltd.
Kwan, Johnny S H; Kung, Annie W C; Sham, Pak C
2011-09-01
Selective genotyping can increase power in quantitative trait association. One example of selective genotyping is two-tail extreme selection, but simple linear regression analysis gives a biased genetic effect estimate. Here, we present a simple correction for the bias.
2013-01-01
application of the Hammett equation with the constants rph in the chemistry of organophosphorus compounds, Russ. Chem. Rev. 38 (1969) 795–811. [13...of oximes and OP compounds and the ability of oximes to reactivate OP- inhibited AChE. Multiple linear regression equations were analyzed using...phosphonate pairs, 21 oxime/ phosphoramidate pairs and 12 oxime/phosphate pairs. The best linear regression equation resulting from multiple regression anal
Flow of nanofluid past a Riga plate
NASA Astrophysics Data System (ADS)
Ahmad, Adeel; Asghar, Saleem; Afzal, Sumaira
2016-03-01
This paper studies the mixed convection boundary layer flow of a nanofluid past a vertical Riga plate in the presence of strong suction. The mathematical model incorporates the Brownian motion and thermophoresis effects due to nanofluid and the Grinberg-term for the wall parallel Lorentz force due to Riga plate. The analytical solution of the problem is presented using the perturbation method for small Brownian and thermophoresis diffusion parameters. The numerical solution is also presented to ensure the reliability of the asymptotic method. The comparison of the two solutions shows an excellent agreement. The correlation expressions for skin friction, Nusselt number and Sherwood number are developed by performing linear regression on the obtained numerical data. The effects of nanofluid and the Lorentz force due to Riga plate, on the skin friction are discussed.
General functioning predicts reward and punishment learning in schizophrenia.
Somlai, Zsuzsanna; Moustafa, Ahmed A; Kéri, Szabolcs; Myers, Catherine E; Gluck, Mark A
2011-04-01
Previous studies investigating feedback-driven reinforcement learning in patients with schizophrenia have provided mixed results. In this study, we explored the clinical predictors of reward and punishment learning using a probabilistic classification learning task. Patients with schizophrenia (n=40) performed similarly to healthy controls (n=30) on the classification learning task. However, more severe negative and general symptoms were associated with lower reward-learning performance, whereas poorer general psychosocial functioning was correlated with both lower reward- and punishment-learning performances. Multiple linear regression analyses indicated that general psychosocial functioning was the only significant predictor of reinforcement learning performance when education, antipsychotic dose, and positive, negative and general symptoms were included in the analysis. These results suggest a close relationship between reinforcement learning and general psychosocial functioning in schizophrenia. Published by Elsevier B.V.
Marek K. Jakubowksi; Qinghua Guo; Brandon Collins; Scott Stephens; Maggi Kelly
2013-01-01
We compared the ability of several classification and regression algorithms to predict forest stand structure metrics and standard surface fuel models. Our study area spans a dense, topographically complex Sierra Nevada mixed-conifer forest. We used clustering, regression trees, and support vector machine algorithms to analyze high density (average 9 pulses/m
Rajeswaran, Jeevanantham; Blackstone, Eugene H
2017-02-01
In medical sciences, we often encounter longitudinal temporal relationships that are non-linear in nature. The influence of risk factors may also change across longitudinal follow-up. A system of multiphase non-linear mixed effects model is presented to model temporal patterns of longitudinal continuous measurements, with temporal decomposition to identify the phases and risk factors within each phase. Application of this model is illustrated using spirometry data after lung transplantation using readily available statistical software. This application illustrates the usefulness of our flexible model when dealing with complex non-linear patterns and time-varying coefficients.
Model Selection with the Linear Mixed Model for Longitudinal Data
ERIC Educational Resources Information Center
Ryoo, Ji Hoon
2011-01-01
Model building or model selection with linear mixed models (LMMs) is complicated by the presence of both fixed effects and random effects. The fixed effects structure and random effects structure are codependent, so selection of one influences the other. Most presentations of LMM in psychology and education are based on a multilevel or…
Estimation of Complex Generalized Linear Mixed Models for Measurement and Growth
ERIC Educational Resources Information Center
Jeon, Minjeong
2012-01-01
Maximum likelihood (ML) estimation of generalized linear mixed models (GLMMs) is technically challenging because of the intractable likelihoods that involve high dimensional integrations over random effects. The problem is magnified when the random effects have a crossed design and thus the data cannot be reduced to small independent clusters. A…
Visual, Algebraic and Mixed Strategies in Visually Presented Linear Programming Problems.
ERIC Educational Resources Information Center
Shama, Gilli; Dreyfus, Tommy
1994-01-01
Identified and classified solution strategies of (n=49) 10th-grade students who were presented with linear programming problems in a predominantly visual setting in the form of a computerized game. Visual strategies were developed more frequently than either algebraic or mixed strategies. Appendix includes questionnaires. (Contains 11 references.)…
Mixed H∞ and passive control for linear switched systems via hybrid control approach
NASA Astrophysics Data System (ADS)
Zheng, Qunxian; Ling, Youzhu; Wei, Lisheng; Zhang, Hongbin
2018-03-01
This paper investigates the mixed H∞ and passive control problem for linear switched systems based on a hybrid control strategy. To solve this problem, first, a new performance index is proposed. This performance index can be viewed as the mixed weighted H∞ and passivity performance. Then, the hybrid controllers are used to stabilise the switched systems. The hybrid controllers consist of dynamic output-feedback controllers for every subsystem and state updating controllers at the switching instant. The design of state updating controllers not only depends on the pre-switching subsystem and the post-switching subsystem, but also depends on the measurable output signal. The hybrid controllers proposed in this paper can include some existing ones as special cases. Combine the multiple Lyapunov functions approach with the average dwell time technique, new sufficient conditions are obtained. Under the new conditions, the closed-loop linear switched systems are globally uniformly asymptotically stable with a mixed H∞ and passivity performance index. Moreover, the desired hybrid controllers can be constructed by solving a set of linear matrix inequalities. Finally, a numerical example and a practical example are given.
Specialization Agreements in the Council for Mutual Economic Assistance
1988-02-01
proportions to stabilize variance (S. Weisberg, Applied Linear Regression , 2nd ed., John Wiley & Sons, New York, 1985, p. 134). If the dependent...27, 1986, p. 3. Weisberg, S., Applied Linear Regression , 2nd ed., John Wiley & Sons, New York, 1985, p. 134. Wiles, P. J., Communist International
Due to the complexity of the processes contributing to beach bacteria concentrations, many researchers rely on statistical modeling, among which multiple linear regression (MLR) modeling is most widely used. Despite its ease of use and interpretation, there may be time dependence...
Data Transformations for Inference with Linear Regression: Clarifications and Recommendations
ERIC Educational Resources Information Center
Pek, Jolynn; Wong, Octavia; Wong, C. M.
2017-01-01
Data transformations have been promoted as a popular and easy-to-implement remedy to address the assumption of normally distributed errors (in the population) in linear regression. However, the application of data transformations introduces non-ignorable complexities which should be fully appreciated before their implementation. This paper adds to…
USING LINEAR AND POLYNOMIAL MODELS TO EXAMINE THE ENVIRONMENTAL STABILITY OF VIRUSES
The article presents the development of model equations for describing the fate of viral infectivity in environmental samples. Most of the models were based upon the use of a two-step linear regression approach. The first step employs regression of log base 10 transformed viral t...
Identifying the Factors That Influence Change in SEBD Using Logistic Regression Analysis
ERIC Educational Resources Information Center
Camilleri, Liberato; Cefai, Carmel
2013-01-01
Multiple linear regression and ANOVA models are widely used in applications since they provide effective statistical tools for assessing the relationship between a continuous dependent variable and several predictors. However these models rely heavily on linearity and normality assumptions and they do not accommodate categorical dependent…
Simple and multiple linear regression: sample size considerations.
Hanley, James A
2016-11-01
The suggested "two subjects per variable" (2SPV) rule of thumb in the Austin and Steyerberg article is a chance to bring out some long-established and quite intuitive sample size considerations for both simple and multiple linear regression. This article distinguishes two of the major uses of regression models that imply very different sample size considerations, neither served well by the 2SPV rule. The first is etiological research, which contrasts mean Y levels at differing "exposure" (X) values and thus tends to focus on a single regression coefficient, possibly adjusted for confounders. The second research genre guides clinical practice. It addresses Y levels for individuals with different covariate patterns or "profiles." It focuses on the profile-specific (mean) Y levels themselves, estimating them via linear compounds of regression coefficients and covariates. By drawing on long-established closed-form variance formulae that lie beneath the standard errors in multiple regression, and by rearranging them for heuristic purposes, one arrives at quite intuitive sample size considerations for both research genres. Copyright © 2016 Elsevier Inc. All rights reserved.
Jiang, Feng; Han, Ji-zhong
2018-01-01
Cross-domain collaborative filtering (CDCF) solves the sparsity problem by transferring rating knowledge from auxiliary domains. Obviously, different auxiliary domains have different importance to the target domain. However, previous works cannot evaluate effectively the significance of different auxiliary domains. To overcome this drawback, we propose a cross-domain collaborative filtering algorithm based on Feature Construction and Locally Weighted Linear Regression (FCLWLR). We first construct features in different domains and use these features to represent different auxiliary domains. Thus the weight computation across different domains can be converted as the weight computation across different features. Then we combine the features in the target domain and in the auxiliary domains together and convert the cross-domain recommendation problem into a regression problem. Finally, we employ a Locally Weighted Linear Regression (LWLR) model to solve the regression problem. As LWLR is a nonparametric regression method, it can effectively avoid underfitting or overfitting problem occurring in parametric regression methods. We conduct extensive experiments to show that the proposed FCLWLR algorithm is effective in addressing the data sparsity problem by transferring the useful knowledge from the auxiliary domains, as compared to many state-of-the-art single-domain or cross-domain CF methods. PMID:29623088
Yu, Xu; Lin, Jun-Yu; Jiang, Feng; Du, Jun-Wei; Han, Ji-Zhong
2018-01-01
Cross-domain collaborative filtering (CDCF) solves the sparsity problem by transferring rating knowledge from auxiliary domains. Obviously, different auxiliary domains have different importance to the target domain. However, previous works cannot evaluate effectively the significance of different auxiliary domains. To overcome this drawback, we propose a cross-domain collaborative filtering algorithm based on Feature Construction and Locally Weighted Linear Regression (FCLWLR). We first construct features in different domains and use these features to represent different auxiliary domains. Thus the weight computation across different domains can be converted as the weight computation across different features. Then we combine the features in the target domain and in the auxiliary domains together and convert the cross-domain recommendation problem into a regression problem. Finally, we employ a Locally Weighted Linear Regression (LWLR) model to solve the regression problem. As LWLR is a nonparametric regression method, it can effectively avoid underfitting or overfitting problem occurring in parametric regression methods. We conduct extensive experiments to show that the proposed FCLWLR algorithm is effective in addressing the data sparsity problem by transferring the useful knowledge from the auxiliary domains, as compared to many state-of-the-art single-domain or cross-domain CF methods.
Markov and semi-Markov switching linear mixed models used to identify forest tree growth components.
Chaubert-Pereira, Florence; Guédon, Yann; Lavergne, Christian; Trottier, Catherine
2010-09-01
Tree growth is assumed to be mainly the result of three components: (i) an endogenous component assumed to be structured as a succession of roughly stationary phases separated by marked change points that are asynchronous among individuals, (ii) a time-varying environmental component assumed to take the form of synchronous fluctuations among individuals, and (iii) an individual component corresponding mainly to the local environment of each tree. To identify and characterize these three components, we propose to use semi-Markov switching linear mixed models, i.e., models that combine linear mixed models in a semi-Markovian manner. The underlying semi-Markov chain represents the succession of growth phases and their lengths (endogenous component) whereas the linear mixed models attached to each state of the underlying semi-Markov chain represent-in the corresponding growth phase-both the influence of time-varying climatic covariates (environmental component) as fixed effects, and interindividual heterogeneity (individual component) as random effects. In this article, we address the estimation of Markov and semi-Markov switching linear mixed models in a general framework. We propose a Monte Carlo expectation-maximization like algorithm whose iterations decompose into three steps: (i) sampling of state sequences given random effects, (ii) prediction of random effects given state sequences, and (iii) maximization. The proposed statistical modeling approach is illustrated by the analysis of successive annual shoots along Corsican pine trunks influenced by climatic covariates. © 2009, The International Biometric Society.
Functional Mixed Effects Model for Small Area Estimation.
Maiti, Tapabrata; Sinha, Samiran; Zhong, Ping-Shou
2016-09-01
Functional data analysis has become an important area of research due to its ability of handling high dimensional and complex data structures. However, the development is limited in the context of linear mixed effect models, and in particular, for small area estimation. The linear mixed effect models are the backbone of small area estimation. In this article, we consider area level data, and fit a varying coefficient linear mixed effect model where the varying coefficients are semi-parametrically modeled via B-splines. We propose a method of estimating the fixed effect parameters and consider prediction of random effects that can be implemented using a standard software. For measuring prediction uncertainties, we derive an analytical expression for the mean squared errors, and propose a method of estimating the mean squared errors. The procedure is illustrated via a real data example, and operating characteristics of the method are judged using finite sample simulation studies.
Elastic properties and optical absorption studies of mixed alkali borogermanate glasses
NASA Astrophysics Data System (ADS)
Taqiullah, S. M.; Ahmmad, Shaik Kareem; Samee, M. A.; Rahman, Syed
2018-05-01
First time the mixed alkali effect (MAE) has been investigated in the glass system xNa2O-(30-x)Li2O-40B2O3- 30GeO2 (0≤x≤30 mol%) through density and optical absorption studies. The present glasses were prepared by melt quench technique. The density of the present glasses varies non-linearly exhibiting mixed alkali effect. Using the density data, the elastic moduli namely Young's modulus, bulk and shear modulus show strong linear dependence as a function of compositional parameter. From the absorption edge studies, the values of optical band gap energies for all transitions have been evaluated. It was established that the type of electronic transition in the present glass system is indirect allowed. The indirect optical band gap exhibit non-linear behavior with compositional parameter showing the mixed alkali effect.
Optimal clinical trial design based on a dichotomous Markov-chain mixed-effect sleep model.
Steven Ernest, C; Nyberg, Joakim; Karlsson, Mats O; Hooker, Andrew C
2014-12-01
D-optimal designs for discrete-type responses have been derived using generalized linear mixed models, simulation based methods and analytical approximations for computing the fisher information matrix (FIM) of non-linear mixed effect models with homogeneous probabilities over time. In this work, D-optimal designs using an analytical approximation of the FIM for a dichotomous, non-homogeneous, Markov-chain phase advanced sleep non-linear mixed effect model was investigated. The non-linear mixed effect model consisted of transition probabilities of dichotomous sleep data estimated as logistic functions using piecewise linear functions. Theoretical linear and nonlinear dose effects were added to the transition probabilities to modify the probability of being in either sleep stage. D-optimal designs were computed by determining an analytical approximation the FIM for each Markov component (one where the previous state was awake and another where the previous state was asleep). Each Markov component FIM was weighted either equally or by the average probability of response being awake or asleep over the night and summed to derive the total FIM (FIM(total)). The reference designs were placebo, 0.1, 1-, 6-, 10- and 20-mg dosing for a 2- to 6-way crossover study in six dosing groups. Optimized design variables were dose and number of subjects in each dose group. The designs were validated using stochastic simulation/re-estimation (SSE). Contrary to expectations, the predicted parameter uncertainty obtained via FIM(total) was larger than the uncertainty in parameter estimates computed by SSE. Nevertheless, the D-optimal designs decreased the uncertainty of parameter estimates relative to the reference designs. Additionally, the improvement for the D-optimal designs were more pronounced using SSE than predicted via FIM(total). Through the use of an approximate analytic solution and weighting schemes, the FIM(total) for a non-homogeneous, dichotomous Markov-chain phase advanced sleep model was computed and provided more efficient trial designs and increased nonlinear mixed-effects modeling parameter precision.
Esserman, Denise A.; Moore, Charity G.; Roth, Mary T.
2009-01-01
Older community dwelling adults often take multiple medications for numerous chronic diseases. Non-adherence to these medications can have a large public health impact. Therefore, the measurement and modeling of medication adherence in the setting of polypharmacy is an important area of research. We apply a variety of different modeling techniques (standard linear regression; weighted linear regression; adjusted linear regression; naïve logistic regression; beta-binomial (BB) regression; generalized estimating equations (GEE)) to binary medication adherence data from a study in a North Carolina based population of older adults, where each medication an individual was taking was classified as adherent or non-adherent. In addition, through simulation we compare these different methods based on Type I error rates, bias, power, empirical 95% coverage, and goodness of fit. We find that estimation and inference using GEE is robust to a wide variety of scenarios and we recommend using this in the setting of polypharmacy when adherence is dichotomously measured for multiple medications per person. PMID:20414358
Genetic Programming Transforms in Linear Regression Situations
NASA Astrophysics Data System (ADS)
Castillo, Flor; Kordon, Arthur; Villa, Carlos
The chapter summarizes the use of Genetic Programming (GP) inMultiple Linear Regression (MLR) to address multicollinearity and Lack of Fit (LOF). The basis of the proposed method is applying appropriate input transforms (model respecification) that deal with these issues while preserving the information content of the original variables. The transforms are selected from symbolic regression models with optimal trade-off between accuracy of prediction and expressional complexity, generated by multiobjective Pareto-front GP. The chapter includes a comparative study of the GP-generated transforms with Ridge Regression, a variant of ordinary Multiple Linear Regression, which has been a useful and commonly employed approach for reducing multicollinearity. The advantages of GP-generated model respecification are clearly defined and demonstrated. Some recommendations for transforms selection are given as well. The application benefits of the proposed approach are illustrated with a real industrial application in one of the broadest empirical modeling areas in manufacturing - robust inferential sensors. The chapter contributes to increasing the awareness of the potential of GP in statistical model building by MLR.
Naval Research Logistics Quarterly. Volume 28. Number 3,
1981-09-01
denotes component-wise maximum. f has antone (isotone) differences on C x D if for cl < c2 and d, < d2, NAVAL RESEARCH LOGISTICS QUARTERLY VOL. 28...or negative correlations and linear or nonlinear regressions. Given are the mo- ments to order two and, for special cases, (he regression function and...data sets. We designate this bnb distribution as G - B - N(a, 0, v). The distribution admits only of positive correlation and linear regressions
Shi, J Q; Wang, B; Will, E J; West, R M
2012-11-20
We propose a new semiparametric model for functional regression analysis, combining a parametric mixed-effects model with a nonparametric Gaussian process regression model, namely a mixed-effects Gaussian process functional regression model. The parametric component can provide explanatory information between the response and the covariates, whereas the nonparametric component can add nonlinearity. We can model the mean and covariance structures simultaneously, combining the information borrowed from other subjects with the information collected from each individual subject. We apply the model to dose-response curves that describe changes in the responses of subjects for differing levels of the dose of a drug or agent and have a wide application in many areas. We illustrate the method for the management of renal anaemia. An individual dose-response curve is improved when more information is included by this mechanism from the subject/patient over time, enabling a patient-specific treatment regime. Copyright © 2012 John Wiley & Sons, Ltd.
Automating approximate Bayesian computation by local linear regression.
Thornton, Kevin R
2009-07-07
In several biological contexts, parameter inference often relies on computationally-intensive techniques. "Approximate Bayesian Computation", or ABC, methods based on summary statistics have become increasingly popular. A particular flavor of ABC based on using a linear regression to approximate the posterior distribution of the parameters, conditional on the summary statistics, is computationally appealing, yet no standalone tool exists to automate the procedure. Here, I describe a program to implement the method. The software package ABCreg implements the local linear-regression approach to ABC. The advantages are: 1. The code is standalone, and fully-documented. 2. The program will automatically process multiple data sets, and create unique output files for each (which may be processed immediately in R), facilitating the testing of inference procedures on simulated data, or the analysis of multiple data sets. 3. The program implements two different transformation methods for the regression step. 4. Analysis options are controlled on the command line by the user, and the program is designed to output warnings for cases where the regression fails. 5. The program does not depend on any particular simulation machinery (coalescent, forward-time, etc.), and therefore is a general tool for processing the results from any simulation. 6. The code is open-source, and modular.Examples of applying the software to empirical data from Drosophila melanogaster, and testing the procedure on simulated data, are shown. In practice, the ABCreg simplifies implementing ABC based on local-linear regression.
NASA Astrophysics Data System (ADS)
Jakubowski, J.; Stypulkowski, J. B.; Bernardeau, F. G.
2017-12-01
The first phase of the Abu Hamour drainage and storm tunnel was completed in early 2017. The 9.5 km long, 3.7 m diameter tunnel was excavated with two Earth Pressure Balance (EPB) Tunnel Boring Machines from Herrenknecht. TBM operation processes were monitored and recorded by Data Acquisition and Evaluation System. The authors coupled collected TBM drive data with available information on rock mass properties, cleansed, completed with secondary variables and aggregated by weeks and shifts. Correlations and descriptive statistics charts were examined. Multivariate Linear Regression and CART regression tree models linking TBM penetration rate (PR), penetration per revolution (PPR) and field penetration index (FPI) with TBM operational and geotechnical characteristics were performed for the conditions of the weak/soft rock of Doha. Both regression methods are interpretable and the data were screened with different computational approaches allowing enriched insight. The primary goal of the analysis was to investigate empirical relations between multiple explanatory and responding variables, to search for best subsets of explanatory variables and to evaluate the strength of linear and non-linear relations. For each of the penetration indices, a predictive model coupling both regression methods was built and validated. The resultant models appeared to be stronger than constituent ones and indicated an opportunity for more accurate and robust TBM performance predictions.
Damman, Olga C; Stubbe, Janine H; Hendriks, Michelle; Arah, Onyebuchi A; Spreeuwenberg, Peter; Delnoij, Diana M J; Groenewegen, Peter P
2009-04-01
Ratings on the quality of healthcare from the consumer's perspective need to be adjusted for consumer characteristics to ensure fair and accurate comparisons between healthcare providers or health plans. Although multilevel analysis is already considered an appropriate method for analyzing healthcare performance data, it has rarely been used to assess case-mix adjustment of such data. The purpose of this article is to investigate whether multilevel regression analysis is a useful tool to detect case-mix adjusters in consumer assessment of healthcare. We used data on 11,539 consumers from 27 Dutch health plans, which were collected using the Dutch Consumer Quality Index health plan instrument. We conducted multilevel regression analyses of consumers' responses nested within health plans to assess the effects of consumer characteristics on consumer experience. We compared our findings to the results of another methodology: the impact factor approach, which combines the predictive effect of each case-mix variable with its heterogeneity across health plans. Both multilevel regression and impact factor analyses showed that age and education were the most important case-mix adjusters for consumer experience and ratings of health plans. With the exception of age, case-mix adjustment had little impact on the ranking of health plans. On both theoretical and practical grounds, multilevel modeling is useful for adequate case-mix adjustment and analysis of performance ratings.
Spectral-Spatial Shared Linear Regression for Hyperspectral Image Classification.
Haoliang Yuan; Yuan Yan Tang
2017-04-01
Classification of the pixels in hyperspectral image (HSI) is an important task and has been popularly applied in many practical applications. Its major challenge is the high-dimensional small-sized problem. To deal with this problem, lots of subspace learning (SL) methods are developed to reduce the dimension of the pixels while preserving the important discriminant information. Motivated by ridge linear regression (RLR) framework for SL, we propose a spectral-spatial shared linear regression method (SSSLR) for extracting the feature representation. Comparing with RLR, our proposed SSSLR has the following two advantages. First, we utilize a convex set to explore the spatial structure for computing the linear projection matrix. Second, we utilize a shared structure learning model, which is formed by original data space and a hidden feature space, to learn a more discriminant linear projection matrix for classification. To optimize our proposed method, an efficient iterative algorithm is proposed. Experimental results on two popular HSI data sets, i.e., Indian Pines and Salinas demonstrate that our proposed methods outperform many SL methods.
Simple linear and multivariate regression models.
Rodríguez del Águila, M M; Benítez-Parejo, N
2011-01-01
In biomedical research it is common to find problems in which we wish to relate a response variable to one or more variables capable of describing the behaviour of the former variable by means of mathematical models. Regression techniques are used to this effect, in which an equation is determined relating the two variables. While such equations can have different forms, linear equations are the most widely used form and are easy to interpret. The present article describes simple and multiple linear regression models, how they are calculated, and how their applicability assumptions are checked. Illustrative examples are provided, based on the use of the freely accessible R program. Copyright © 2011 SEICAP. Published by Elsevier Espana. All rights reserved.
Narayanan, Neethu; Gupta, Suman; Gajbhiye, V T; Manjaiah, K M
2017-04-01
A carboxy methyl cellulose-nano organoclay (nano montmorillonite modified with 35-45 wt % dimethyl dialkyl (C 14 -C 18 ) amine (DMDA)) composite was prepared by solution intercalation method. The prepared composite was characterized by infrared spectroscopy (FTIR), X-Ray diffraction spectroscopy (XRD) and scanning electron microscopy (SEM). The composite was utilized for its pesticide sorption efficiency for atrazine, imidacloprid and thiamethoxam. The sorption data was fitted into Langmuir and Freundlich isotherms using linear and non linear methods. The linear regression method suggested best fitting of sorption data into Type II Langmuir and Freundlich isotherms. In order to avoid the bias resulting from linearization, seven different error parameters were also analyzed by non linear regression method. The non linear error analysis suggested that the sorption data fitted well into Langmuir model rather than in Freundlich model. The maximum sorption capacity, Q 0 (μg/g) was given by imidacloprid (2000) followed by thiamethoxam (1667) and atrazine (1429). The study suggests that the degree of determination of linear regression alone cannot be used for comparing the best fitting of Langmuir and Freundlich models and non-linear error analysis needs to be done to avoid inaccurate results. Copyright © 2017 Elsevier Ltd. All rights reserved.
Alfvén wave interactions in the solar wind
NASA Astrophysics Data System (ADS)
Webb, G. M.; McKenzie, J. F.; Hu, Q.; le Roux, J. A.; Zank, G. P.
2012-11-01
Alfvén wave mixing (interaction) equations used in locally incompressible turbulence transport equations in the solar wind are analyzed from the perspective of linear wave theory. The connection between the wave mixing equations and non-WKB Alfven wave driven wind theories are delineated. We discuss the physical wave energy equation and the canonical wave energy equation for non-WKB Alfven waves and the WKB limit. Variational principles and conservation laws for the linear wave mixing equations for the Heinemann and Olbert non-WKB wind model are obtained. The connection with wave mixing equations used in locally incompressible turbulence transport in the solar wind are discussed.
Phase mixing versus nonlinear advection in drift-kinetic plasma turbulence
NASA Astrophysics Data System (ADS)
Schekochihin, A. A.; Parker, J. T.; Highcock, E. G.; Dellar, P. J.; Dorland, W.; Hammett, G. W.
2016-04-01
> A scaling theory of long-wavelength electrostatic turbulence in a magnetised, weakly collisional plasma (e.g. drift-wave turbulence driven by ion temperature gradients) is proposed, with account taken both of the nonlinear advection of the perturbed particle distribution by fluctuating flows and of its phase mixing, which is caused by the streaming of the particles along the mean magnetic field and, in a linear problem, would lead to Landau damping. It is found that it is possible to construct a consistent theory in which very little free energy leaks into high velocity moments of the distribution function, rendering the turbulent cascade in the energetically relevant part of the wavenumber space essentially fluid-like. The velocity-space spectra of free energy expressed in terms of Hermite-moment orders are steep power laws and so the free-energy content of the phase space does not diverge at infinitesimal collisionality (while it does for a linear problem); collisional heating due to long-wavelength perturbations vanishes in this limit (also in contrast with the linear problem, in which it occurs at the finite rate equal to the Landau damping rate). The ability of the free energy to stay in the low velocity moments of the distribution function is facilitated by the `anti-phase-mixing' effect, whose presence in the nonlinear system is due to the stochastic version of the plasma echo (the advecting velocity couples the phase-mixing and anti-phase-mixing perturbations). The partitioning of the wavenumber space between the (energetically dominant) region where this is the case and the region where linear phase mixing wins its competition with nonlinear advection is governed by the `critical balance' between linear and nonlinear time scales (which for high Hermite moments splits into two thresholds, one demarcating the wavenumber region where phase mixing predominates, the other where plasma echo does).
London Measure of Unplanned Pregnancy: guidance for its use as an outcome measure
Hall, Jennifer A; Barrett, Geraldine; Copas, Andrew; Stephenson, Judith
2017-01-01
Background The London Measure of Unplanned Pregnancy (LMUP) is a psychometrically validated measure of the degree of intention of a current or recent pregnancy. The LMUP is increasingly being used worldwide, and can be used to evaluate family planning or preconception care programs. However, beyond recommending the use of the full LMUP scale, there is no published guidance on how to use the LMUP as an outcome measure. Ordinal logistic regression has been recommended informally, but studies published to date have all used binary logistic regression and dichotomized the scale at different cut points. There is thus a need for evidence-based guidance to provide a standardized methodology for multivariate analysis and to enable comparison of results. This paper makes recommendations for the regression method for analysis of the LMUP as an outcome measure. Materials and methods Data collected from 4,244 pregnant women in Malawi were used to compare five regression methods: linear, logistic with two cut points, and ordinal logistic with either the full or grouped LMUP score. The recommendations were then tested on the original UK LMUP data. Results There were small but no important differences in the findings across the regression models. Logistic regression resulted in the largest loss of information, and assumptions were violated for the linear and ordinal logistic regression. Consequently, robust standard errors were used for linear regression and a partial proportional odds ordinal logistic regression model attempted. The latter could only be fitted for grouped LMUP score. Conclusion We recommend the linear regression model with robust standard errors to make full use of the LMUP score when analyzed as an outcome measure. Ordinal logistic regression could be considered, but a partial proportional odds model with grouped LMUP score may be required. Logistic regression is the least-favored option, due to the loss of information. For logistic regression, the cut point for un/planned pregnancy should be between nine and ten. These recommendations will standardize the analysis of LMUP data and enhance comparability of results across studies. PMID:28435343
A Second-Order Conditionally Linear Mixed Effects Model with Observed and Latent Variable Covariates
ERIC Educational Resources Information Center
Harring, Jeffrey R.; Kohli, Nidhi; Silverman, Rebecca D.; Speece, Deborah L.
2012-01-01
A conditionally linear mixed effects model is an appropriate framework for investigating nonlinear change in a continuous latent variable that is repeatedly measured over time. The efficacy of the model is that it allows parameters that enter the specified nonlinear time-response function to be stochastic, whereas those parameters that enter in a…
USDA-ARS?s Scientific Manuscript database
The mixed linear model (MLM) is currently among the most advanced and flexible statistical modeling techniques and its use in tackling problems in plant pathology has begun surfacing in the literature. The longitudinal MLM is a multivariate extension that handles repeatedly measured data, such as r...
1994-09-01
Institute of Technology, Wright- Patterson AFB OH, January 1994. 4. Neter, John and others. Applied Linear Regression Models. Boston: Irwin, 1989. 5...Technology, Wright-Patterson AFB OH 5 April 1994. 29. Neter, John and others. Applied Linear Regression Models. Boston: Irwin, 1989. 30. Office of
An Evaluation of the Automated Cost Estimating Integrated Tools (ACEIT) System
1989-09-01
residual and it is described as the residual divided by its standard deviation (13:App A,17). Neter, Wasserman, and Kutner, in Applied Linear Regression Models...others. Applied Linear Regression Models. Homewood IL: Irwin, 1983. 19. Raduchel, William J. "A Professional’s Perspective on User-Friendliness," Byte
A Simple and Convenient Method of Multiple Linear Regression to Calculate Iodine Molecular Constants
ERIC Educational Resources Information Center
Cooper, Paul D.
2010-01-01
A new procedure using a student-friendly least-squares multiple linear-regression technique utilizing a function within Microsoft Excel is described that enables students to calculate molecular constants from the vibronic spectrum of iodine. This method is advantageous pedagogically as it calculates molecular constants for ground and excited…
Conjoint Analysis: A Study of the Effects of Using Person Variables.
ERIC Educational Resources Information Center
Fraas, John W.; Newman, Isadore
Three statistical techniques--conjoint analysis, a multiple linear regression model, and a multiple linear regression model with a surrogate person variable--were used to estimate the relative importance of five university attributes for students in the process of selecting a college. The five attributes include: availability and variety of…
Fitting program for linear regressions according to Mahon (1996)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Trappitsch, Reto G.
2018-01-09
This program takes the users' Input data and fits a linear regression to it using the prescription presented by Mahon (1996). Compared to the commonly used York fit, this method has the correct prescription for measurement error propagation. This software should facilitate the proper fitting of measurements with a simple Interface.
How Robust Is Linear Regression with Dummy Variables?
ERIC Educational Resources Information Center
Blankmeyer, Eric
2006-01-01
Researchers in education and the social sciences make extensive use of linear regression models in which the dependent variable is continuous-valued while the explanatory variables are a combination of continuous-valued regressors and dummy variables. The dummies partition the sample into groups, some of which may contain only a few observations.…
Revisiting the Scale-Invariant, Two-Dimensional Linear Regression Method
ERIC Educational Resources Information Center
Patzer, A. Beate C.; Bauer, Hans; Chang, Christian; Bolte, Jan; Su¨lzle, Detlev
2018-01-01
The scale-invariant way to analyze two-dimensional experimental and theoretical data with statistical errors in both the independent and dependent variables is revisited by using what we call the triangular linear regression method. This is compared to the standard least-squares fit approach by applying it to typical simple sets of example data…
ERIC Educational Resources Information Center
Thompson, Russel L.
Homoscedasticity is an important assumption of linear regression. This paper explains what it is and why it is important to the researcher. Graphical and mathematical methods for testing the homoscedasticity assumption are demonstrated. Sources of homoscedasticity and types of homoscedasticity are discussed, and methods for correction are…
On the null distribution of Bayes factors in linear regression
USDA-ARS?s Scientific Manuscript database
We show that under the null, the 2 log (Bayes factor) is asymptotically distributed as a weighted sum of chi-squared random variables with a shifted mean. This claim holds for Bayesian multi-linear regression with a family of conjugate priors, namely, the normal-inverse-gamma prior, the g-prior, and...
Common pitfalls in statistical analysis: Linear regression analysis
Aggarwal, Rakesh; Ranganathan, Priya
2017-01-01
In a previous article in this series, we explained correlation analysis which describes the strength of relationship between two continuous variables. In this article, we deal with linear regression analysis which predicts the value of one continuous variable from another. We also discuss the assumptions and pitfalls associated with this analysis. PMID:28447022
Comparison of l₁-Norm SVR and Sparse Coding Algorithms for Linear Regression.
Zhang, Qingtian; Hu, Xiaolin; Zhang, Bo
2015-08-01
Support vector regression (SVR) is a popular function estimation technique based on Vapnik's concept of support vector machine. Among many variants, the l1-norm SVR is known to be good at selecting useful features when the features are redundant. Sparse coding (SC) is a technique widely used in many areas and a number of efficient algorithms are available. Both l1-norm SVR and SC can be used for linear regression. In this brief, the close connection between the l1-norm SVR and SC is revealed and some typical algorithms are compared for linear regression. The results show that the SC algorithms outperform the Newton linear programming algorithm, an efficient l1-norm SVR algorithm, in efficiency. The algorithms are then used to design the radial basis function (RBF) neural networks. Experiments on some benchmark data sets demonstrate the high efficiency of the SC algorithms. In particular, one of the SC algorithms, the orthogonal matching pursuit is two orders of magnitude faster than a well-known RBF network designing algorithm, the orthogonal least squares algorithm.
Imatoh, Takuya; Kamimura, Seiichiro; Miyazaki, Motonobu
2017-03-01
It has been reported that adipocytes secrete vascular endothelial growth factor. Therefore, we conducted a 5-year longitudinal epidemiological study to further elucidate the association between vascular endothelial growth factor levels and temporal changes in body mass index. Our study subjects were Japanese male workers, who had regular health check-ups. Vascular endothelial growth factor levels were measured at baseline. To examine the association between vascular endothelial growth factor levels and overweight, we calculated the odds ratio using a multivariate logistic regression model. Moreover, linear mixed effect models were used to assess the association between vascular endothelial growth factor level and temporal changes in body mass index during the 5-year follow-up period. Vascular endothelial growth factor levels were marginally higher in subjects with a body mass index greater than 25 kg/m 2 compared with in those with a body mass index less than 25 kg/m 2 (505.4 vs. 465.5 pg/mL, P = 0.1) and were weakly correlated with leptin levels (β: 0.05, P = 0.07). In multivariate logistic regression, subjects in the highest vascular endothelial growth factor quantile were significantly associated with an increased risk for overweight compared with those in the lowest quantile (odds ratio 1.65, 95 % confidential interval: 1.10-2.50). Moreover P for trend was significant (P for trend = 0.003). However, the linear mixed effect model revealed that vascular endothelial growth factor levels were not associated with changes in body mass index over a 5-year period (quantile 2, β: 0.06, P = 0.46; quantile 3, β: -0.06, P = 0.45; quantile 4, β: -0.10, P = 0.22; quantile 1 as reference). Our results suggested that high vascular endothelial growth factor levels were significantly associated with overweight in Japanese males but high vascular endothelial growth factor levels did not necessarily cause obesity.
NASA Astrophysics Data System (ADS)
Wu, Cheng; Zhen Yu, Jian
2018-03-01
Linear regression techniques are widely used in atmospheric science, but they are often improperly applied due to lack of consideration or inappropriate handling of measurement uncertainty. In this work, numerical experiments are performed to evaluate the performance of five linear regression techniques, significantly extending previous works by Chu and Saylor. The five techniques are ordinary least squares (OLS), Deming regression (DR), orthogonal distance regression (ODR), weighted ODR (WODR), and York regression (YR). We first introduce a new data generation scheme that employs the Mersenne twister (MT) pseudorandom number generator. The numerical simulations are also improved by (a) refining the parameterization of nonlinear measurement uncertainties, (b) inclusion of a linear measurement uncertainty, and (c) inclusion of WODR for comparison. Results show that DR, WODR and YR produce an accurate slope, but the intercept by WODR and YR is overestimated and the degree of bias is more pronounced with a low R2 XY dataset. The importance of a properly weighting parameter λ in DR is investigated by sensitivity tests, and it is found that an improper λ in DR can lead to a bias in both the slope and intercept estimation. Because the λ calculation depends on the actual form of the measurement error, it is essential to determine the exact form of measurement error in the XY data during the measurement stage. If a priori error in one of the variables is unknown, or the measurement error described cannot be trusted, DR, WODR and YR can provide the least biases in slope and intercept among all tested regression techniques. For these reasons, DR, WODR and YR are recommended for atmospheric studies when both X and Y data have measurement errors. An Igor Pro-based program (Scatter Plot) was developed to facilitate the implementation of error-in-variables regressions.
Afantitis, Antreas; Melagraki, Georgia; Sarimveis, Haralambos; Koutentis, Panayiotis A; Markopoulos, John; Igglessi-Markopoulou, Olga
2006-08-01
A quantitative-structure activity relationship was obtained by applying Multiple Linear Regression Analysis to a series of 80 1-[2-hydroxyethoxy-methyl]-6-(phenylthio) thymine (HEPT) derivatives with significant anti-HIV activity. For the selection of the best among 37 different descriptors, the Elimination Selection Stepwise Regression Method (ES-SWR) was utilized. The resulting QSAR model (R (2) (CV) = 0.8160; S (PRESS) = 0.5680) proved to be very accurate both in training and predictive stages.
Wavelet regression model in forecasting crude oil price
NASA Astrophysics Data System (ADS)
Hamid, Mohd Helmie; Shabri, Ani
2017-05-01
This study presents the performance of wavelet multiple linear regression (WMLR) technique in daily crude oil forecasting. WMLR model was developed by integrating the discrete wavelet transform (DWT) and multiple linear regression (MLR) model. The original time series was decomposed to sub-time series with different scales by wavelet theory. Correlation analysis was conducted to assist in the selection of optimal decomposed components as inputs for the WMLR model. The daily WTI crude oil price series has been used in this study to test the prediction capability of the proposed model. The forecasting performance of WMLR model were also compared with regular multiple linear regression (MLR), Autoregressive Moving Average (ARIMA) and Generalized Autoregressive Conditional Heteroscedasticity (GARCH) using root mean square errors (RMSE) and mean absolute errors (MAE). Based on the experimental results, it appears that the WMLR model performs better than the other forecasting technique tested in this study.
A Multiphase Non-Linear Mixed Effects Model: An Application to Spirometry after Lung Transplantation
Rajeswaran, Jeevanantham; Blackstone, Eugene H.
2014-01-01
In medical sciences, we often encounter longitudinal temporal relationships that are non-linear in nature. The influence of risk factors may also change across longitudinal follow-up. A system of multiphase non-linear mixed effects model is presented to model temporal patterns of longitudinal continuous measurements, with temporal decomposition to identify the phases and risk factors within each phase. Application of this model is illustrated using spirometry data after lung transplantation using readily available statistical software. This application illustrates the usefulness of our flexible model when dealing with complex non-linear patterns and time varying coefficients. PMID:24919830
A Re-appraisal of Olivine Sorting and Accumulation in Hawaiian Magmas.
NASA Astrophysics Data System (ADS)
Rhodes, J. M.
2002-12-01
Bowen never used the m-words (magma mixing) in his highly influential book "The Origin of the Igneous Rocks". Yet, in the past 20-30 years, magma mixing has been proposed as an important, almost ubiquitous, process at volcanoes in all tectonic environments ranging from oceanic basalts to large silicic magma bodies, and as the possible trigger of eruptions. Bowen regarded Hawaiian olivine basalts and picrites as the result of olivine accumulation in a lower MgO magma that was crystallizing and fractionating olivine. This, with variants, has been the party line ever since, the only debate being over the MgO content of the proposed parental magmas. Although magma mixing has been recognized as an important process in differentiated, low-MgO (below 7 percent), Hawaiian magmas, the wide range in MgO (7-30 percent) in Hawaiian olivine tholeiites and picrites is invariably attributed to olivine crystallization, fractionation and accumulation. In this paper I will re-evaluate this hypothesis using well-documented examples from Kilauea, Mauna Kea and Mauna Loa that exhibit well-defined, coherent linear trends of major oxides and trace elements with MgO . If olivine control is the only factor responsible for these trends, then the intersection of the regression lines for each trend should intersect olivine compositions at a common forsterite composition, corresponding to the average accumulated olivine in each of the magmas. In some cases (the ongoing Puu Oo eruption) this simple test holds and olivine fractionation and accumulation can clearly be shown to be the dominant process. In other examples from Mauna Kea and Mauna Loa (1852, 1868, 1950 eruptions, and Mauna Loa in general) the test does not hold, and a more complicated process is required. Additionally, for those magmas that fail the test, CaO/Al2O3 invariably decreases with decreasing MgO content. This should not happen if only olivine fractionation and accumulation are involved. The explanation for these linear trends that approach, but fail to intersect, appropriate olivine compositions is a combination of magma mixing accompanied by olivine crystallization and accumulation. One of the mixing components is a is a high-MgO (about13-15 percent) magma laden with olivine phenocrysts and xenocrysts and the other is a consanguineous low-MgO (about 7 percent) quasi "steady-state" magma, with a prior history of clinopyroxene and plagioclase fractionation.
Partitioning sources of variation in vertebrate species richness
Boone, R.B.; Krohn, W.B.
2000-01-01
Aim: To explore biogeographic patterns of terrestrial vertebrates in Maine, USA using techniques that would describe local and spatial correlations with the environment. Location: Maine, USA. Methods: We delineated the ranges within Maine (86,156 km2) of 275 species using literature and expert review. Ranges were combined into species richness maps, and compared to geomorphology, climate, and woody plant distributions. Methods were adapted that compared richness of all vertebrate classes to each environmental correlate, rather than assessing a single explanatory theory. We partitioned variation in species richness into components using tree and multiple linear regression. Methods were used that allowed for useful comparisons between tree and linear regression results. For both methods we partitioned variation into broad-scale (spatially autocorrelated) and fine-scale (spatially uncorrelated) explained and unexplained components. By partitioning variance, and using both tree and linear regression in analyses, we explored the degree of variation in species richness for each vertebrate group that Could be explained by the relative contribution of each environmental variable. Results: In tree regression, climate variation explained richness better (92% of mean deviance explained for all species) than woody plant variation (87%) and geomorphology (86%). Reptiles were highly correlated with environmental variation (93%), followed by mammals, amphibians, and birds (each with 84-82% deviance explained). In multiple linear regression, climate was most closely associated with total vertebrate richness (78%), followed by woody plants (67%) and geomorphology (56%). Again, reptiles were closely correlated with the environment (95%), followed by mammals (73%), amphibians (63%) and birds (57%). Main conclusions: Comparing variation explained using tree and multiple linear regression quantified the importance of nonlinear relationships and local interactions between species richness and environmental variation, identifying the importance of linear relationships between reptiles and the environment, and nonlinear relationships between birds and woody plants, for example. Conservation planners should capture climatic variation in broad-scale designs; temperatures may shift during climate change, but the underlying correlations between the environment and species richness will presumably remain.
Javed, Faizan; Chan, Gregory S H; Savkin, Andrey V; Middleton, Paul M; Malouf, Philip; Steel, Elizabeth; Mackie, James; Lovell, Nigel H
2009-01-01
This paper uses non-linear support vector regression (SVR) to model the blood volume and heart rate (HR) responses in 9 hemodynamically stable kidney failure patients during hemodialysis. Using radial bias function (RBF) kernels the non-parametric models of relative blood volume (RBV) change with time as well as percentage change in HR with respect to RBV were obtained. The e-insensitivity based loss function was used for SVR modeling. Selection of the design parameters which includes capacity (C), insensitivity region (e) and the RBF kernel parameter (sigma) was made based on a grid search approach and the selected models were cross-validated using the average mean square error (AMSE) calculated from testing data based on a k-fold cross-validation technique. Linear regression was also applied to fit the curves and the AMSE was calculated for comparison with SVR. For the model based on RBV with time, SVR gave a lower AMSE for both training (AMSE=1.5) as well as testing data (AMSE=1.4) compared to linear regression (AMSE=1.8 and 1.5). SVR also provided a better fit for HR with RBV for both training as well as testing data (AMSE=15.8 and 16.4) compared to linear regression (AMSE=25.2 and 20.1).
Wankel, Scott D.; Kendall, Carol; Paytan, Adina
2009-01-01
Nitrate (NO-3 concentrations and dual isotopic composition (??15N and ??18O) were measured during various seasons and tidal conditions in Elkhorn Slough to evaluate mixing of sources of NO-3 within this California estuary. We found the isotopic composition of NO-3 was influenced most heavily by mixing of two primary sources with unique isotopic signatures, a marine (Monterey Bay) and terrestrial agricultural runoff source (Old Salinas River). However, our attempt to use a simple two end-member mixing model to calculate the relative contribution of these two NO-3 sources to the Slough was complicated by periods of nonconservative behavior and/or the presence of additional sources, particularly during the dry season when NO-3 concentrations were low. Although multiple linear regression generally yielded good fits to the observed data, deviations from conservative mixing were still evident. After consideration of potential alternative sources, we concluded that deviations from two end-member mixing were most likely derived from interactions with marsh sediments in regions of the Slough where high rates of NO-3 uptake and nitrification result in NO-3 with low ?? 15N and high ??18O values. A simple steady state dual isotope model is used to illustrate the impact of cycling processes in an estuarine setting which may play a primary role in controlling NO -3 isotopic composition when and where cycling rates and water residence times are high. This work expands our understanding of nitrogen and oxygen isotopes as biogeochemical tools for investigating NO -3 sources and cycling in estuaries, emphasizing the role that cycling processes may play in altering isotopic composition. Copyright 2009 by the American Geophysical Union.
Cernuda, Carlos; Lughofer, Edwin; Klein, Helmut; Forster, Clemens; Pawliczek, Marcin; Brandstetter, Markus
2017-01-01
During the production process of beer, it is of utmost importance to guarantee a high consistency of the beer quality. For instance, the bitterness is an essential quality parameter which has to be controlled within the specifications at the beginning of the production process in the unfermented beer (wort) as well as in final products such as beer and beer mix beverages. Nowadays, analytical techniques for quality control in beer production are mainly based on manual supervision, i.e., samples are taken from the process and analyzed in the laboratory. This typically requires significant lab technicians efforts for only a small fraction of samples to be analyzed, which leads to significant costs for beer breweries and companies. Fourier transform mid-infrared (FT-MIR) spectroscopy was used in combination with nonlinear multivariate calibration techniques to overcome (i) the time consuming off-line analyses in beer production and (ii) already known limitations of standard linear chemometric methods, like partial least squares (PLS), for important quality parameters Speers et al. (J I Brewing. 2003;109(3):229-235), Zhang et al. (J I Brewing. 2012;118(4):361-367) such as bitterness, citric acid, total acids, free amino nitrogen, final attenuation, or foam stability. The calibration models are established with enhanced nonlinear techniques based (i) on a new piece-wise linear version of PLS by employing fuzzy rules for local partitioning the latent variable space and (ii) on extensions of support vector regression variants (-PLSSVR and ν-PLSSVR), for overcoming high computation times in high-dimensional problems and time-intensive and inappropriate settings of the kernel parameters. Furthermore, we introduce a new model selection scheme based on bagged ensembles in order to improve robustness and thus predictive quality of the final models. The approaches are tested on real-world calibration data sets for wort and beer mix beverages, and successfully compared to linear methods, showing a clear out-performance in most cases and being able to meet the model quality requirements defined by the experts at the beer company. Figure Workflow for calibration of non-Linear model ensembles from FT-MIR spectra in beer production .
Karabatsos, George
2017-02-01
Most of applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone and menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed from MATLAB Compiler. Currently, this package gives the user a choice from 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other menu options. This includes output of MCMC convergence analyses, and estimates of the model's posterior predictive distribution, for selected functionals and values of covariates. The software is illustrated through the BNP regression analysis of real data.
Post-processing through linear regression
NASA Astrophysics Data System (ADS)
van Schaeybroeck, B.; Vannitsem, S.
2011-03-01
Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast and multicollinearity. The regression schemes under consideration include the ordinary least-square (OLS) method, a new time-dependent Tikhonov regularization (TDTR) method, the total least-square method, a new geometric-mean regression (GM), a recently introduced error-in-variables (EVMOS) method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified. These techniques are applied in the context of the 63 Lorenz system, whose model version is affected by both initial condition and model errors. For short forecast lead times, the number and choice of predictors plays an important role. Contrarily to the other techniques, GM degrades when the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best member OLS with noise). At long lead times the regression schemes (EVMOS, TDTR) which yield the correct variability and the largest correlation between ensemble error and spread, should be preferred.
Linear regression metamodeling as a tool to summarize and present simulation model results.
Jalal, Hawre; Dowd, Bryan; Sainfort, François; Kuntz, Karen M
2013-10-01
Modelers lack a tool to systematically and clearly present complex model results, including those from sensitivity analyses. The objective was to propose linear regression metamodeling as a tool to increase transparency of decision analytic models and better communicate their results. We used a simplified cancer cure model to demonstrate our approach. The model computed the lifetime cost and benefit of 3 treatment options for cancer patients. We simulated 10,000 cohorts in a probabilistic sensitivity analysis (PSA) and regressed the model outcomes on the standardized input parameter values in a set of regression analyses. We used the regression coefficients to describe measures of sensitivity analyses, including threshold and parameter sensitivity analyses. We also compared the results of the PSA to deterministic full-factorial and one-factor-at-a-time designs. The regression intercept represented the estimated base-case outcome, and the other coefficients described the relative parameter uncertainty in the model. We defined simple relationships that compute the average and incremental net benefit of each intervention. Metamodeling produced outputs similar to traditional deterministic 1-way or 2-way sensitivity analyses but was more reliable since it used all parameter values. Linear regression metamodeling is a simple, yet powerful, tool that can assist modelers in communicating model characteristics and sensitivity analyses.
ERIC Educational Resources Information Center
McCluskey, James J.
1997-01-01
A study of 160 undergraduate journalism students trained to design projects (stacks) using HyperCard on Macintosh computers determined that right-brain dominant subjects outperformed left-brain and mixed-brain dominant subjects, whereas left-brain dominant subjects out performed mixed-brain dominant subjects in several areas. Recommends future…
Weigel, K A; VanRaden, P M; Norman, H D; Grosu, H
2017-12-01
In the early 1900s, breed society herdbooks had been established and milk-recording programs were in their infancy. Farmers wanted to improve the productivity of their cattle, but the foundations of population genetics, quantitative genetics, and animal breeding had not been laid. Early animal breeders struggled to identify genetically superior families using performance records that were influenced by local environmental conditions and herd-specific management practices. Daughter-dam comparisons were used for more than 30 yr and, although genetic progress was minimal, the attention given to performance recording, genetic theory, and statistical methods paid off in future years. Contemporary (herdmate) comparison methods allowed more accurate accounting for environmental factors and genetic progress began to accelerate when these methods were coupled with artificial insemination and progeny testing. Advances in computing facilitated the implementation of mixed linear models that used pedigree and performance data optimally and enabled accurate selection decisions. Sequencing of the bovine genome led to a revolution in dairy cattle breeding, and the pace of scientific discovery and genetic progress accelerated rapidly. Pedigree-based models have given way to whole-genome prediction, and Bayesian regression models and machine learning algorithms have joined mixed linear models in the toolbox of modern animal breeders. Future developments will likely include elucidation of the mechanisms of genetic inheritance and epigenetic modification in key biological pathways, and genomic data will be used with data from on-farm sensors to facilitate precision management on modern dairy farms. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Di Mauro, Michele; Dato, Guglielmo Mario Actis; Barili, Fabio; Gelsomino, Sandro; Santè, Pasquale; Corte, Alessandro Della; Carrozza, Antonio; Ratta, Ester Della; Cugola, Diego; Galletti, Lorenzo; Devotini, Roger; Casabona, Riccardo; Santini, Francesco; Salsano, Antonio; Scrofani, Roberto; Antona, Carlo; Botta, Luca; Russo, Claudio; Mancuso, Samuel; Rinaldi, Mauro; De Vincentiis, Carlo; Biondi, Andrea; Beghi, Cesare; Cappabianca, Giangiuseppe; Tarzia, Vincenzo; Gerosa, Gino; De Bonis, Michele; Pozzoli, Alberto; Nicolini, Francesco; Benassi, Filippo; Rosato, Francesco; Grasso, Elena; Livi, Ugolino; Sponga, Sandro; Pacini, Davide; Di Bartolomeo, Roberto; De Martino, Andrea; Bortolotti, Uberto; Onorati, Francesco; Faggian, Giuseppe; Lorusso, Roberto; Vizzardi, Enrico; Di Giammarco, Gabriele; Marinelli, Daniele; Villa, Emmanuel; Troise, Giovanni; Picichè, Marco; Musumeci, Francesco; Paparella, Domenico; Margari, Vito; Tritto, Francesco; Damiani, Girolamo; Scrascia, Giuseppe; Zaccaria, Salvatore; Renzulli, Attilio; Serraino, Giuseppe; Mariscalco, Giovanni; Maselli, Daniele; Foschi, Massimiliano; Parolari, Alessandro; Nappi, Giannantonio
2017-08-15
The aim of this large retrospective study was to provide a logistic risk model along an additive score to predict early mortality after surgical treatment of patients with heart valve or prosthesis infective endocarditis (IE). From 2000 to 2015, 2715 patients with native valve endocarditis (NVE) or prosthesis valve endocarditis (PVE) were operated on in 26 Italian Cardiac Surgery Centers. The relationship between early mortality and covariates was evaluated with logistic mixed effect models. Fixed effects are parameters associated with the entire population or with certain repeatable levels of experimental factors, while random effects are associated with individual experimental units (centers). Early mortality was 11.0% (298/2715); At mixed effect logistic regression the following variables were found associated with early mortality: age class, female gender, LVEF, preoperative shock, COPD, creatinine value above 2mg/dl, presence of abscess, number of treated valve/prosthesis (with respect to one treated valve/prosthesis) and the isolation of Staphylococcus aureus, Fungus spp., Pseudomonas Aeruginosa and other micro-organisms, while Streptococcus spp., Enterococcus spp. and other Staphylococci did not affect early mortality, as well as no micro-organisms isolation. LVEF was found linearly associated with outcomes while non-linear association between mortality and age was tested and the best model was found with a categorization into four classes (AUC=0.851). The following study provides a logistic risk model to predict early mortality in patients with heart valve or prosthesis infective endocarditis undergoing surgical treatment, called "The EndoSCORE". Copyright © 2017. Published by Elsevier B.V.
Environmental predictors of bovine Eimeria infection in western Kenya.
Makau, D N; Gitau, G K; Muchemi, G K; Thomas, L F; Cook, E A J; Wardrop, N A; Fèvre, E M; de Glanville, W A
2017-02-01
Eimeriosis is caused by a protozoan infection affecting most domestic animal species. Outbreaks in cattle are associated with various environmental factors in temperate climates but limited work has been done in tropical settings. The objective of this work was to determine the prevalence and environmental factors associated with bovine Eimeria spp. infection in a mixed farming area of western Kenya. A total of 983 cattle were sampled from 226 cattle-keeping households. Faecal samples were collected directly from the rectum via digital extraction and analysed for the presence of Eimeria spp. infection using the MacMaster technique. Individual and household level predictors of infection were explored using mixed effects logistic regression. The prevalence of individual animal Eimeria infection was 32.8% (95% CI 29.9-35.9). A positive linear relationship was found between risk of Eimeria infection and increasing temperature (OR = 1.4, 95% CI 1.06-1.86) and distance to areas at risk of flooding (OR = 1.49, 95% CI 1.17-1.91). There was weak evidence of non-linear relationship between Eimeria infection and the proportion of the area around a household that was classified as swamp (OR = 1.12, 95% CI 0.87-1.44; OR (quadratic term) = 0.85, 95% CI 0.73-1.00), and the sand content of the soil (OR = 1.18, 95% CI 0.91-1.53; OR (quadratic term) = 1.1, 95% CI 0.99-1.23). The risk of animal Eimeria spp. infection is influenced by a number of climatic and soil-associated conditions.
A comparison of bilingual education and generalist teachers' approaches to scientific biliteracy
NASA Astrophysics Data System (ADS)
Garza, Esther
The purpose of this study was to determine if educators were capitalizing on bilingual learners' use of their biliterate abilities to acquire scientific meaning and discourse that would formulate a scientific biliterate identity. Mixed methods were used to explore teachers' use of biliteracy and Funds of Knowledge (Moll, L., Amanti, C., Neff, D., & Gonzalez, N., 1992; Gonzales, Moll, & Amanti, 2005) from the students' Latino heritage while conducting science inquiry. The research study explored four constructs that conceptualized scientific biliteracy. The four constructs include science literacy, science biliteracy, reading comprehension strategies and students' cultural backgrounds. There were 156 4th-5th grade bilingual and general education teachers in South Texas that were surveyed using the Teacher Scientific Biliteracy Inventory (TSBI) and five teachers' science lessons were observed. Qualitative findings revealed that a variety of scientific biliteracy instructional strategies were frequently used in both bilingual and general education classrooms. The language used to deliver this instruction varied. A General Linear Model revealed that classroom assignment, bilingual or general education, had a significant effect on a teacher's instructional approach to employ scientific biliteracy. A simple linear regression found that the TSBI accounted for 17% of the variance on 4th grade reading benchmarks. Mixed methods results indicated that teachers were utilizing scientific biliteracy strategies in English, Spanish and/or both languages. Household items and science experimentation at home were encouraged by teachers to incorporate the students' cultural backgrounds. Finally, science inquiry was conducted through a universal approach to science learning versus a multicultural approach to science learning.
Liu, Danping; Yeung, Edwina H; McLain, Alexander C; Xie, Yunlong; Buck Louis, Germaine M; Sundaram, Rajeshwari
2017-09-01
Imperfect follow-up in longitudinal studies commonly leads to missing outcome data that can potentially bias the inference when the missingness is nonignorable; that is, the propensity of missingness depends on missing values in the data. In the Upstate KIDS Study, we seek to determine if the missingness of child development outcomes is nonignorable, and how a simple model assuming ignorable missingness would compare with more complicated models for a nonignorable mechanism. To correct for nonignorable missingness, the shared random effects model (SREM) jointly models the outcome and the missing mechanism. However, the computational complexity and lack of software packages has limited its practical applications. This paper proposes a novel two-step approach to handle nonignorable missing outcomes in generalized linear mixed models. We first analyse the missing mechanism with a generalized linear mixed model and predict values of the random effects; then, the outcome model is fitted adjusting for the predicted random effects to account for heterogeneity in the missingness propensity. Extensive simulation studies suggest that the proposed method is a reliable approximation to SREM, with a much faster computation. The nonignorability of missing data in the Upstate KIDS Study is estimated to be mild to moderate, and the analyses using the two-step approach or SREM are similar to the model assuming ignorable missingness. The two-step approach is a computationally straightforward method that can be conducted as sensitivity analyses in longitudinal studies to examine violations to the ignorable missingness assumption and the implications relative to health outcomes. © 2017 John Wiley & Sons Ltd.
Aptel, Florent; Sayous, Romain; Fortoul, Vincent; Beccat, Sylvain; Denis, Philippe
2010-12-01
To evaluate and compare the regional relationships between visual field sensitivity and retinal nerve fiber layer (RNFL) thickness as measured by spectral-domain optical coherence tomography (OCT) and scanning laser polarimetry. Prospective cross-sectional study. One hundred and twenty eyes of 120 patients (40 with healthy eyes, 40 with suspected glaucoma, and 40 with glaucoma) were tested on Cirrus-OCT, GDx VCC, and standard automated perimetry. Raw data on RNFL thickness were extracted for 256 peripapillary sectors of 1.40625 degrees each for the OCT measurement ellipse and 64 peripapillary sectors of 5.625 degrees each for the GDx VCC measurement ellipse. Correlations between peripapillary RNFL thickness in 6 sectors and visual field sensitivity in the 6 corresponding areas were evaluated using linear and logarithmic regression analysis. Receiver operating curve areas were calculated for each instrument. With spectral-domain OCT, the correlations (r(2)) between RNFL thickness and visual field sensitivity ranged from 0.082 (nasal RNFL and corresponding visual field area, linear regression) to 0.726 (supratemporal RNFL and corresponding visual field area, logarithmic regression). By comparison, with GDx-VCC, the correlations ranged from 0.062 (temporal RNFL and corresponding visual field area, linear regression) to 0.362 (supratemporal RNFL and corresponding visual field area, logarithmic regression). In pairwise comparisons, these structure-function correlations were generally stronger with spectral-domain OCT than with GDx VCC and with logarithmic regression than with linear regression. The largest areas under the receiver operating curve were seen for OCT superior thickness (0.963 ± 0.022; P < .001) in eyes with glaucoma and for OCT average thickness (0.888 ± 0.072; P < .001) in eyes with suspected glaucoma. The structure-function relationship was significantly stronger with spectral-domain OCT than with scanning laser polarimetry, and was better expressed logarithmically than linearly. Measurements with these 2 instruments should not be considered to be interchangeable. Copyright © 2010 Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Rule, David L.
Several regression methods were examined within the framework of weighted structural regression (WSR), comparing their regression weight stability and score estimation accuracy in the presence of outlier contamination. The methods compared are: (1) ordinary least squares; (2) WSR ridge regression; (3) minimum risk regression; (4) minimum risk 2;…
Unit Cohesion and the Surface Navy: Does Cohesion Affect Performance
1989-12-01
v. 68, 1968. Neter, J., Wasserman, W., and Kutner, M. H., Applied Linear Regression Models, 2d ed., Boston, MA: Irwin, 1989. Rand Corporation R-2607...Neter, J., Wasserman, W., and Kutner, M. H., Applied Linear Regression Models, 2d ed., Boston, MA: Irwin, 1989. SAS User’s Guide: Basics, Version 5 ed
1990-03-01
and M.H. Knuter. Applied Linear Regression Models. Homewood IL: Richard D. Erwin Inc., 1983. Pritsker, A. Alan B. Introduction to Simulation and SLAM...Control Variates in Simulation," European Journal of Operational Research, 42: (1989). Neter, J., W. Wasserman, and M.H. Xnuter. Applied Linear Regression Models
ERIC Educational Resources Information Center
Yan, Jun; Aseltine, Robert H., Jr.; Harel, Ofer
2013-01-01
Comparing regression coefficients between models when one model is nested within another is of great practical interest when two explanations of a given phenomenon are specified as linear models. The statistical problem is whether the coefficients associated with a given set of covariates change significantly when other covariates are added into…
Calibrated Peer Review for Interpreting Linear Regression Parameters: Results from a Graduate Course
ERIC Educational Resources Information Center
Enders, Felicity B.; Jenkins, Sarah; Hoverman, Verna
2010-01-01
Biostatistics is traditionally a difficult subject for students to learn. While the mathematical aspects are challenging, it can also be demanding for students to learn the exact language to use to correctly interpret statistical results. In particular, correctly interpreting the parameters from linear regression is both a vital tool and a…
ERIC Educational Resources Information Center
Richter, Tobias
2006-01-01
Most reading time studies using naturalistic texts yield data sets characterized by a multilevel structure: Sentences (sentence level) are nested within persons (person level). In contrast to analysis of variance and multiple regression techniques, hierarchical linear models take the multilevel structure of reading time data into account. They…
Some Applied Research Concerns Using Multiple Linear Regression Analysis.
ERIC Educational Resources Information Center
Newman, Isadore; Fraas, John W.
The intention of this paper is to provide an overall reference on how a researcher can apply multiple linear regression in order to utilize the advantages that it has to offer. The advantages and some concerns expressed about the technique are examined. A number of practical ways by which researchers can deal with such concerns as…
ERIC Educational Resources Information Center
Nelson, Dean
2009-01-01
Following the Guidelines for Assessment and Instruction in Statistics Education (GAISE) recommendation to use real data, an example is presented in which simple linear regression is used to evaluate the effect of the Montreal Protocol on atmospheric concentration of chlorofluorocarbons. This simple set of data, obtained from a public archive, can…
Quantum State Tomography via Linear Regression Estimation
Qi, Bo; Hou, Zhibo; Li, Li; Dong, Daoyi; Xiang, Guoyong; Guo, Guangcan
2013-01-01
A simple yet efficient state reconstruction algorithm of linear regression estimation (LRE) is presented for quantum state tomography. In this method, quantum state reconstruction is converted into a parameter estimation problem of a linear regression model and the least-squares method is employed to estimate the unknown parameters. An asymptotic mean squared error (MSE) upper bound for all possible states to be estimated is given analytically, which depends explicitly upon the involved measurement bases. This analytical MSE upper bound can guide one to choose optimal measurement sets. The computational complexity of LRE is O(d4) where d is the dimension of the quantum state. Numerical examples show that LRE is much faster than maximum-likelihood estimation for quantum state tomography. PMID:24336519
Applications of statistics to medical science, III. Correlation and regression.
Watanabe, Hiroshi
2012-01-01
In this third part of a series surveying medical statistics, the concepts of correlation and regression are reviewed. In particular, methods of linear regression and logistic regression are discussed. Arguments related to survival analysis will be made in a subsequent paper.
Estimating the energetic cost of feeding excess dietary nitrogen to dairy cows.
Reed, K F; Bonfá, H C; Dijkstra, J; Casper, D P; Kebreab, E
2017-09-01
Feeding N in excess of requirement could require the use of additional energy to metabolize excess protein, and to synthesize and excrete urea; however, the amount and fate of this energy is unknown. Little progress has been made on this topic in recent decades, so an extension of work published in 1970 was conducted to quantify the effect of excess N on ruminant energetics. In part 1 of this study, the results of previous work were replicated using a simple linear regression to estimate the effect of excess N on energy balance. In part 2, mixed model methodology and a larger data set were used to improve upon the previously reported linear regression methods. In part 3, heat production, retained energy, and milk energy replaced the composite energy balance variable previously proposed as the dependent variable to narrow the effect of excess N. In addition, rumen degradable and undegradable protein intakes were estimated using table values and included as covariates in part 3. Excess N had opposite and approximately equal effects on heat production (+4.1 to +7.6 kcal/g of excess N) and retained energy (-4.2 to -6.6 kcal/g of excess N) but had a larger negative effect on milk gross energy (-52 to -68 kcal/g of excess N). The results suggest that feeding excess N increases heat production, but more investigation is required to determine why excess N has such a large effect on milk gross energy production. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
On-line pachymetry outcome of ablation in aberration free mode TransPRK.
Adib-Moghaddam, Soheil; Arba-Mosquera, Samuel; Salmanian, Bahram; Omidvari, Amir-Houshang; Noorizadeh, Farsad
2014-01-01
There are many independent factors that influence the outcome of refractive surgeries, consisting of patient characteristics and environmental factors. We studied the accuracy of central ablation depth compared to online pachymetry results. A total of 153 eyes that underwent TransPRK at Bina Eye Hospital, Tehran, Iran, were evaluated from November 2010 to January 2012 in a retrospective cross-sectional study. The relevant data were registered and bivariate correlations and linear regression association were investigated statistically. The mean age was 29 ± 5 years. Distribution of refractive errors was as follows: compound myopic astigmatism 123 (80.4%), simple myopia 24 (15.7%), and mixed astigmatism 6 (3.9%). Mean ambient temperature and humidity levels intraoperatively were 23.49 ± 1.16°C and 28.91 ± 6.16%, respectively. There was a significant difference (p<0.001) between the preassumed central ablation depth (131.68 ± 32.72 µm) and the net level of ablation depth (measured by online pachymetry, 168.04 ± 41.47 µm). Temperature and humidity levels were not in any statistically significant correlation with the net amount of difference found. The backward linear regression was done to reveal the association between ablation depth and several variables. This study showed that there is deviation in optical coherence pachymetry online measurements done with SCHWIND AMARIS laser. Ambient temperature and humidity levels intraoperatively do not influence the outcome. However, basic structural characteristics of patients along with change in refractive index and corneal shrinkage because of corneal dehydration are associated with the differences.
AGSuite: Software to conduct feature analysis of artificial grammar learning performance.
Cook, Matthew T; Chubala, Chrissy M; Jamieson, Randall K
2017-10-01
To simplify the problem of studying how people learn natural language, researchers use the artificial grammar learning (AGL) task. In this task, participants study letter strings constructed according to the rules of an artificial grammar and subsequently attempt to discriminate grammatical from ungrammatical test strings. Although the data from these experiments are usually analyzed by comparing the mean discrimination performance between experimental conditions, this practice discards information about the individual items and participants that could otherwise help uncover the particular features of strings associated with grammaticality judgments. However, feature analysis is tedious to compute, often complicated, and ill-defined in the literature. Moreover, the data violate the assumption of independence underlying standard linear regression models, leading to Type I error inflation. To solve these problems, we present AGSuite, a free Shiny application for researchers studying AGL. The suite's intuitive Web-based user interface allows researchers to generate strings from a database of published grammars, compute feature measures (e.g., Levenshtein distance) for each letter string, and conduct a feature analysis on the strings using linear mixed effects (LME) analyses. The LME analysis solves the inflation of Type I errors that afflicts more common methods of repeated measures regression analysis. Finally, the software can generate a number of graphical representations of the data to support an accurate interpretation of results. We hope the ease and availability of these tools will encourage researchers to take full advantage of item-level variance in their datasets in the study of AGL. We moreover discuss the broader applicability of the tools for researchers looking to conduct feature analysis in any field.
NASA Astrophysics Data System (ADS)
Banse, Karl; Yong, Marina
1990-05-01
As a proxy for satellite (coastal zone color scanner) observations and concurrent measurements of primary production rates, data from 138 stations occupied seasonally during 1967-1968 in the offshore, eastern tropical Pacific were analyzed in terms of six temporal groups and four current regimes. In multiple linear regressions on column production Pt, we found that simulated satellite pigment is generally weakly correlated, but sometimes not correlated with Pt, and that incident irradiance, sea surface temperature, nitrate, transparency, and depths of mixed layer or nitracline assume little or no importance. After a proxy for the light-saturated chlorophyll-specific photosynthetic rate pmax is added, the coefficient of determination (r2) ranges from 0.55 to 0.91 (median of 0.85) for the 10 cases. In stepwise multiple linear regressions the pmax proxy is the best predictor for Pt. Pt can be calculated fairly accurately (on the average, within 10-20%) from satellite pigment, the 10% light depth, and station values (but not from regional or seasonal means) of the pmax proxy; for individual stations the precision is 35-84% (median of 57% for the 10 groupings; p = 0.05) of the means of observed values. At present, pmax cannot be estimated from space; in the data set it is not even highly correlated with irradiance, temperature, and nitrate at depth of occurrence. Therefore extant models for calculating Pt in this tropical ocean have inherent limits of accuracy as well as of precision owing to ignorance about a physiological parameter.
A phenomenological biological dose model for proton therapy based on linear energy transfer spectra.
Rørvik, Eivind; Thörnqvist, Sara; Stokkevåg, Camilla H; Dahle, Tordis J; Fjaera, Lars Fredrik; Ytre-Hauge, Kristian S
2017-06-01
The relative biological effectiveness (RBE) of protons varies with the radiation quality, quantified by the linear energy transfer (LET). Most phenomenological models employ a linear dependency of the dose-averaged LET (LET d ) to calculate the biological dose. However, several experiments have indicated a possible non-linear trend. Our aim was to investigate if biological dose models including non-linear LET dependencies should be considered, by introducing a LET spectrum based dose model. The RBE-LET relationship was investigated by fitting of polynomials from 1st to 5th degree to a database of 85 data points from aerobic in vitro experiments. We included both unweighted and weighted regression, the latter taking into account experimental uncertainties. Statistical testing was performed to decide whether higher degree polynomials provided better fits to the data as compared to lower degrees. The newly developed models were compared to three published LET d based models for a simulated spread out Bragg peak (SOBP) scenario. The statistical analysis of the weighted regression analysis favored a non-linear RBE-LET relationship, with the quartic polynomial found to best represent the experimental data (P = 0.010). The results of the unweighted regression analysis were on the borderline of statistical significance for non-linear functions (P = 0.053), and with the current database a linear dependency could not be rejected. For the SOBP scenario, the weighted non-linear model estimated a similar mean RBE value (1.14) compared to the three established models (1.13-1.17). The unweighted model calculated a considerably higher RBE value (1.22). The analysis indicated that non-linear models could give a better representation of the RBE-LET relationship. However, this is not decisive, as inclusion of the experimental uncertainties in the regression analysis had a significant impact on the determination and ranking of the models. As differences between the models were observed for the SOBP scenario, both non-linear LET spectrum- and linear LET d based models should be further evaluated in clinically realistic scenarios. © 2017 American Association of Physicists in Medicine.
Wennberg, Alexandra M V; Hagen, Clinton E; Edwards, Kelly; Roberts, Rosebud O; Machulda, Mary M; Knopman, David S; Petersen, Ronald C; Mielke, Michelle M
2018-06-05
To determine the cross-sectional and longitudinal associations between diabetes treatment type and cognitive outcomes among type II diabetics. We examined the association between metformin use, as compared to other diabetic treatment (ie, insulin, other oral medications, and diet/exercise) and cognitive test performance and mild cognitive impairment (MCI) diagnosis among 508 cognitively unimpaired at baseline type II diabetics enrolled in the Mayo Clinic Study of Aging. We created propensity scores to adjust for treatment effects. We used multivariate linear and logistic regression models to investigate the cross-sectional association between treatment type and cognitive test z scores, respectively. Mixed effects models and competing risk regression models were used to determine the longitudinal association between treatment type and change in cognitive test z scores and risk of developing incident MCI. In linear regression analyses adjusted for age, sex, education, body mass index, APOE ε4, insulin treatment, medical comorbidities, number of medications, duration of diabetes, and propensity score, we did not observe an association between metformin use and cognitive test performance. Additionally, we did not observe an association between metformin use and cognitive test performance over time (median = 3.7-year follow-up). Metformin was associated with an increased risk of MCI (subhazard ratio (SHR) = 2.75; 95% CI = 1.64, 4.63, P < .001). Similarly, other oral medications (SHR = 1.96; 95% CI = 1.19, 3.25; P = .009) and insulin (SHR = 3.17; 95% CI = 1.27, 7.92; P = .014) use were also associated with risk of MCI diagnosis. These findings suggest that metformin use, as compared to management of diabetes with other treatments, is not associated with cognitive test performance. However, metformin was associated with incident MCI diagnosis. Copyright © 2018 John Wiley & Sons, Ltd.
Wang, Huifang; Xiao, Bo; Wang, Mingyu; Shao, Ming'an
2013-01-01
Soil water retention parameters are critical to quantify flow and solute transport in vadose zone, while the presence of rock fragments remarkably increases their variability. Therefore a novel method for determining water retention parameters of soil-gravel mixtures is required. The procedure to generate such a model is based firstly on the determination of the quantitative relationship between the content of rock fragments and the effective saturation of soil-gravel mixtures, and then on the integration of this relationship with former analytical equations of water retention curves (WRCs). In order to find such relationships, laboratory experiments were conducted to determine WRCs of soil-gravel mixtures obtained with a clay loam soil mixed with shale clasts or pebbles in three size groups with various gravel contents. Data showed that the effective saturation of the soil-gravel mixtures with the same kind of gravels within one size group had a linear relation with gravel contents, and had a power relation with the bulk density of samples at any pressure head. Revised formulas for water retention properties of the soil-gravel mixtures are proposed to establish the water retention curved surface models of the power-linear functions and power functions. The analysis of the parameters obtained by regression and validation of the empirical models showed that they were acceptable by using either the measured data of separate gravel size group or those of all the three gravel size groups having a large size range. Furthermore, the regression parameters of the curved surfaces for the soil-gravel mixtures with a large range of gravel content could be determined from the water retention data of the soil-gravel mixtures with two representative gravel contents or bulk densities. Such revised water retention models are potentially applicable in regional or large scale field investigations of significantly heterogeneous media, where various gravel sizes and different gravel contents are present.
Wang, Huifang; Xiao, Bo; Wang, Mingyu; Shao, Ming'an
2013-01-01
Soil water retention parameters are critical to quantify flow and solute transport in vadose zone, while the presence of rock fragments remarkably increases their variability. Therefore a novel method for determining water retention parameters of soil-gravel mixtures is required. The procedure to generate such a model is based firstly on the determination of the quantitative relationship between the content of rock fragments and the effective saturation of soil-gravel mixtures, and then on the integration of this relationship with former analytical equations of water retention curves (WRCs). In order to find such relationships, laboratory experiments were conducted to determine WRCs of soil-gravel mixtures obtained with a clay loam soil mixed with shale clasts or pebbles in three size groups with various gravel contents. Data showed that the effective saturation of the soil-gravel mixtures with the same kind of gravels within one size group had a linear relation with gravel contents, and had a power relation with the bulk density of samples at any pressure head. Revised formulas for water retention properties of the soil-gravel mixtures are proposed to establish the water retention curved surface models of the power-linear functions and power functions. The analysis of the parameters obtained by regression and validation of the empirical models showed that they were acceptable by using either the measured data of separate gravel size group or those of all the three gravel size groups having a large size range. Furthermore, the regression parameters of the curved surfaces for the soil-gravel mixtures with a large range of gravel content could be determined from the water retention data of the soil-gravel mixtures with two representative gravel contents or bulk densities. Such revised water retention models are potentially applicable in regional or large scale field investigations of significantly heterogeneous media, where various gravel sizes and different gravel contents are present. PMID:23555040
Barzi, Federica; Jones, Graham R D; Hughes, Jaquelyne T; Lawton, Paul D; Hoy, Wendy; O'Dea, Kerin; Jerums, George; MacIsaac, Richard J; Cass, Alan; Maple-Brown, Louise J
2018-03-01
Being able to estimate kidney decline accurately is particularly important in Indigenous Australians, a population at increased risk of developing chronic kidney disease and end stage kidney disease. The aim of this analysis was to explore the trend of decline in estimated glomerular filtration rate (eGFR) over a four year period using multiple local creatinine measures, compared with estimates derived using centrally-measured enzymatic creatinine and with estimates derived using only two local measures. The eGFR study comprised a cohort of over 600 Aboriginal Australian participants recruited from over twenty sites in urban, regional and remote Australia across five strata of health, diabetes and kidney function. Trajectories of eGFR were explored on 385 participants with at least three local creatinine records using graphical methods that compared the linear trends fitted using linear mixed models with non-linear trends fitted using fractional polynomial equations. Temporal changes of local creatinine were also characterized using group-based modelling. Analyses were stratified by eGFR (<60; 60-89; 90-119 and ≥120ml/min/1.73m 2 ) and albuminuria categories (<3mg/mmol; 3-30mg/mmol; >30mg/mmol). Mean age of the participants was 48years, 64% were female and the median follow-up was 3years. Decline of eGFR was accurately estimated using simple linear regression models and locally measured creatinine was as good as centrally measured creatinine at predicting kidney decline in people with an eGFR<60 and an eGFR 60-90ml/min/1.73m 2 with albuminuria. Analyses showed that one baseline and one follow-up locally measured creatinine may be sufficient to estimate short term (up to four years) kidney function decline. The greatest yearly decline was estimated in those with eGFR 60-90 and macro-albuminuria: -6.21 (-8.20, -4.23) ml/min/1.73m 2 . Short term estimates of kidney function decline can be reliably derived using an easy to implement and simple to interpret linear mixed effect model. Locally measured creatinine did not differ to centrally measured creatinine, thus is an accurate cost-efficient and timely means to monitoring kidney function progression. Copyright © 2018 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.
Linear mixed-effects modeling approach to FMRI group analysis
Chen, Gang; Saad, Ziad S.; Britton, Jennifer C.; Pine, Daniel S.; Cox, Robert W.
2013-01-01
Conventional group analysis is usually performed with Student-type t-test, regression, or standard AN(C)OVA in which the variance–covariance matrix is presumed to have a simple structure. Some correction approaches are adopted when assumptions about the covariance structure is violated. However, as experiments are designed with different degrees of sophistication, these traditional methods can become cumbersome, or even be unable to handle the situation at hand. For example, most current FMRI software packages have difficulty analyzing the following scenarios at group level: (1) taking within-subject variability into account when there are effect estimates from multiple runs or sessions; (2) continuous explanatory variables (covariates) modeling in the presence of a within-subject (repeated measures) factor, multiple subject-grouping (between-subjects) factors, or the mixture of both; (3) subject-specific adjustments in covariate modeling; (4) group analysis with estimation of hemodynamic response (HDR) function by multiple basis functions; (5) various cases of missing data in longitudinal studies; and (6) group studies involving family members or twins. Here we present a linear mixed-effects modeling (LME) methodology that extends the conventional group analysis approach to analyze many complicated cases, including the six prototypes delineated above, whose analyses would be otherwise either difficult or unfeasible under traditional frameworks such as AN(C)OVA and general linear model (GLM). In addition, the strength of the LME framework lies in its flexibility to model and estimate the variance–covariance structures for both random effects and residuals. The intraclass correlation (ICC) values can be easily obtained with an LME model with crossed random effects, even at the presence of confounding fixed effects. The simulations of one prototypical scenario indicate that the LME modeling keeps a balance between the control for false positives and the sensitivity for activation detection. The importance of hypothesis formulation is also illustrated in the simulations. Comparisons with alternative group analysis approaches and the limitations of LME are discussed in details. PMID:23376789
Blood biomarkers in male and female participants after an Ironman-distance triathlon
Danielsson, Tom; Carlsson, Jörg; Schreyer, Hendrik; Ahnesjö, Jonas; Ten Siethoff, Lasse; Ragnarsson, Thony; Tugetam, Åsa
2017-01-01
Background While overall physical activity is clearly associated with a better short-term and long-term health, prolonged strenuous physical activity may result in a rise in acute levels of blood-biomarkers used in clinical practice for diagnosis of various conditions or diseases. In this study, we explored the acute effects of a full Ironman-distance triathlon on biomarkers related to heart-, liver-, kidney- and skeletal muscle damage immediately post-race and after one week’s rest. We also examined if sex, age, finishing time and body composition influenced the post-race values of the biomarkers. Methods A sample of 30 subjects was recruited (50% women) to the study. The subjects were evaluated for body composition and blood samples were taken at three occasions, before the race (T1), immediately after (T2) and one week after the race (T3). Linear regression models were fitted to analyse the independent contribution of sex and finishing time controlled for weight, body fat percentage and age, on the biomarkers at the termination of the race (T2). Linear mixed models were fitted to examine if the biomarkers differed between the sexes over time (T1-T3). Results Being male was a significant predictor of higher post-race (T2) levels of myoglobin, CK, and creatinine levels and body weight was negatively associated with myoglobin. In general, the models were unable to explain the variation of the dependent variables. In the linear mixed models, an interaction between time (T1-T3) and sex was seen for myoglobin and creatinine, in which women had a less pronounced response to the race. Conclusion Overall women appear to tolerate the effects of prolonged strenuous physical activity better than men as illustrated by their lower values of the biomarkers both post-race as well as during recovery. PMID:28609447
Terluin, Berend; de Boer, Michiel R; de Vet, Henrica C W
2016-01-01
The network approach to psychopathology conceives mental disorders as sets of symptoms causally impacting on each other. The strengths of the connections between symptoms are key elements in the description of those symptom networks. Typically, the connections are analysed as linear associations (i.e., correlations or regression coefficients). However, there is insufficient awareness of the fact that differences in variance may account for differences in connection strength. Differences in variance frequently occur when subgroups are based on skewed data. An illustrative example is a study published in PLoS One (2013;8(3):e59559) that aimed to test the hypothesis that the development of psychopathology through "staging" was characterized by increasing connection strength between mental states. Three mental states (negative affect, positive affect, and paranoia) were studied in severity subgroups of a general population sample. The connection strength was found to increase with increasing severity in six of nine models. However, the method used (linear mixed modelling) is not suitable for skewed data. We reanalysed the data using inverse Gaussian generalized linear mixed modelling, a method suited for positively skewed data (such as symptoms in the general population). The distribution of positive affect was normal, but the distributions of negative affect and paranoia were heavily skewed. The variance of the skewed variables increased with increasing severity. Reanalysis of the data did not confirm increasing connection strength, except for one of nine models. Reanalysis of the data did not provide convincing evidence in support of staging as characterized by increasing connection strength between mental states. Network researchers should be aware that differences in connection strength between symptoms may be caused by differences in variances, in which case they should not be interpreted as differences in impact of one symptom on another symptom.
Linear mixed-effects modeling approach to FMRI group analysis.
Chen, Gang; Saad, Ziad S; Britton, Jennifer C; Pine, Daniel S; Cox, Robert W
2013-06-01
Conventional group analysis is usually performed with Student-type t-test, regression, or standard AN(C)OVA in which the variance-covariance matrix is presumed to have a simple structure. Some correction approaches are adopted when assumptions about the covariance structure is violated. However, as experiments are designed with different degrees of sophistication, these traditional methods can become cumbersome, or even be unable to handle the situation at hand. For example, most current FMRI software packages have difficulty analyzing the following scenarios at group level: (1) taking within-subject variability into account when there are effect estimates from multiple runs or sessions; (2) continuous explanatory variables (covariates) modeling in the presence of a within-subject (repeated measures) factor, multiple subject-grouping (between-subjects) factors, or the mixture of both; (3) subject-specific adjustments in covariate modeling; (4) group analysis with estimation of hemodynamic response (HDR) function by multiple basis functions; (5) various cases of missing data in longitudinal studies; and (6) group studies involving family members or twins. Here we present a linear mixed-effects modeling (LME) methodology that extends the conventional group analysis approach to analyze many complicated cases, including the six prototypes delineated above, whose analyses would be otherwise either difficult or unfeasible under traditional frameworks such as AN(C)OVA and general linear model (GLM). In addition, the strength of the LME framework lies in its flexibility to model and estimate the variance-covariance structures for both random effects and residuals. The intraclass correlation (ICC) values can be easily obtained with an LME model with crossed random effects, even at the presence of confounding fixed effects. The simulations of one prototypical scenario indicate that the LME modeling keeps a balance between the control for false positives and the sensitivity for activation detection. The importance of hypothesis formulation is also illustrated in the simulations. Comparisons with alternative group analysis approaches and the limitations of LME are discussed in details. Published by Elsevier Inc.
Regression of non-linear coupling of noise in LIGO detectors
NASA Astrophysics Data System (ADS)
Da Silva Costa, C. F.; Billman, C.; Effler, A.; Klimenko, S.; Cheng, H.-P.
2018-03-01
In 2015, after their upgrade, the advanced Laser Interferometer Gravitational-Wave Observatory (LIGO) detectors started acquiring data. The effort to improve their sensitivity has never stopped since then. The goal to achieve design sensitivity is challenging. Environmental and instrumental noise couple to the detector output with different, linear and non-linear, coupling mechanisms. The noise regression method we use is based on the Wiener–Kolmogorov filter, which uses witness channels to make noise predictions. We present here how this method helped to determine complex non-linear noise couplings in the output mode cleaner and in the mirror suspension system of the LIGO detector.
The impact of international service on the development of volunteers' intercultural relations.
Lough, Benjamin J; Sherraden, Margaret Sherrard; McBride, Amanda Moore; Xiang, Xiaoling
2014-07-01
Approximately one million people from the United States perform international volunteer service each year, representing a significant flow of ideas, people, resources, and aid across international borders. This quasi-experimental study assesses the longitudinal impact of international volunteer service on volunteers' intercultural relations, international social capital, and concern about international affairs. Using linear mixed regression models that control for a counterfactual comparison group of individuals that did not travel abroad, international volunteers are more likely to report significant increases in international social capital and international concern two to three years after returning from service. Results indicate that intercultural relations may also continue to increase years after returning from service. International service may be a useful approach to helping people gain skills and networks that are needed in an increasingly global society. Copyright © 2014 Elsevier Inc. All rights reserved.
Pedraza-Flechas, Ana María; Lope, Virginia; Moreo, Pilar; Ascunce, Nieves; Miranda-García, Josefa; Vidal, Carmen; Sánchez-Contador, Carmen; Santamariña, Carmen; Pedraz-Pingarrón, Carmen; Llobet, Rafael; Aragonés, Nuria; Salas-Trejo, Dolores; Pollán, Marina; Pérez-Gómez, Beatriz
2017-05-01
We explored the relationship between sleep patterns and sleep disorders and mammographic density (MD), a marker of breast cancer risk. Participants in the DDM-Spain/var-DDM study, which included 2878 middle-aged Spanish women, were interviewed via telephone and asked questions on sleep characteristics. Two radiologists assessed MD in their left craneo-caudal mammogram, assisted by a validated semiautomatic-computer tool (DM-scan). We used log-transformed percentage MD as the dependent variable and fitted mixed linear regression models, including known confounding variables. Our results showed that neither sleeping patterns nor sleep disorders were associated with MD. However, women with frequent changes in their bedtime due to anxiety or depression had higher MD (e β :1.53;95%CI:1.04-2.26). Copyright © 2017 Elsevier B.V. All rights reserved.
The effects of guided inquiry instruction on student achievement in high school biology
NASA Astrophysics Data System (ADS)
Vass, Laszlo
The purpose of this quantitative, quasi-experimental study was to measure the effect of a student-centered instructional method called guided inquiry on the achievement of students in a unit of study in high school biology. The study used a non-random sample of 109 students, the control group of 55 students enrolled in high school one, received teacher centered instruction while the experimental group of 54 students enrolled at high school two received student-centered, guided inquiry instruction. The pretest-posttest design of the study analyzed scores using an independent t-test, a dependent t-test (p = <.001), an ANCOVA (p = .007), mixed method ANOVA (p = .024) and hierarchical linear regression (p = <.001). The experimental group that received guided inquiry instruction had statistically significantly higher achievement than the control group.
Exploring the effects of coexisting amyloid in subcortical vascular cognitive impairment.
Dao, Elizabeth; Hsiung, Ging-Yuek Robin; Sossi, Vesna; Jacova, Claudia; Tam, Roger; Dinelle, Katie; Best, John R; Liu-Ambrose, Teresa
2015-10-12
Mixed pathology, particularly Alzheimer's disease with cerebrovascular lesions, is reported as the second most common cause of dementia. Research on mixed dementia typically includes people with a primary AD diagnosis and hence, little is known about the effects of co-existing amyloid pathology in people with vascular cognitive impairment (VCI). The purpose of this study was to understand whether individual differences in amyloid pathology might explain variations in cognitive impairment among individuals with clinical subcortical VCI (SVCI). Twenty-two participants with SVCI completed an (11)C Pittsburgh compound B (PIB) position emission tomography (PET) scan to quantify global amyloid deposition. Cognitive function was measured using: 1) MOCA; 2) ADAS-Cog; 3) EXIT-25; and 4) specific executive processes including a) Digits Forward and Backwards Test, b) Stroop-Colour Word Test, and c) Trail Making Test. To assess the effect of amyloid deposition on cognitive function we conducted Pearson bivariate correlations to determine which cognitive measures to include in our regression models. Cognitive variables that were significantly correlated with PIB retention values were entered in a hierarchical multiple linear regression analysis to determine the unique effect of amyloid on cognitive function. We controlled for age, education, and ApoE ε4 status. Bivariate correlation results showed that PIB binding was significantly correlated with ADAS-Cog (p < 0.01) and MOCA (p < 0.01); increased PIB binding was associated with worse cognitive function on both cognitive measures. PIB binding was not significantly correlated with the EXIT-25 or with specific executive processes (p > 0.05). Regression analyses controlling for age, education, and ApoE ε4 status indicated an independent association between PIB retention and the ADAS-Cog (adjusted R-square change of 15.0%, Sig F Change = 0.03). PIB retention was also independently associated with MOCA scores (adjusted R-Square Change of 27.0%, Sig F Change = 0.02). We found that increased global amyloid deposition was significantly associated with greater memory and executive dysfunctions as measured by the ADAS-Cog and MOCA. Our findings point to the important role of co-existing amyloid deposition for cognitive function in those with a primary SVCI diagnosis. As such, therapeutic approaches targeting SVCI must consider the potential role of amyloid for the optimal care of those with mixed dementia. NCT01027858.
Goodarzi, Mohammad; Jensen, Richard; Vander Heyden, Yvan
2012-12-01
A Quantitative Structure-Retention Relationship (QSRR) is proposed to estimate the chromatographic retention of 83 diverse drugs on a Unisphere poly butadiene (PBD) column, using isocratic elutions at pH 11.7. Previous work has generated QSRR models for them using Classification And Regression Trees (CART). In this work, Ant Colony Optimization is used as a feature selection method to find the best molecular descriptors from a large pool. In addition, several other selection methods have been applied, such as Genetic Algorithms, Stepwise Regression and the Relief method, not only to evaluate Ant Colony Optimization as a feature selection method but also to investigate its ability to find the important descriptors in QSRR. Multiple Linear Regression (MLR) and Support Vector Machines (SVMs) were applied as linear and nonlinear regression methods, respectively, giving excellent correlation between the experimental, i.e. extrapolated to a mobile phase consisting of pure water, and predicted logarithms of the retention factors of the drugs (logk(w)). The overall best model was the SVM one built using descriptors selected by ACO. Copyright © 2012 Elsevier B.V. All rights reserved.
Zeiger, Sean; Hubbart, Jason A
2016-01-15
Suspended sediment (SS) remains the most pervasive water quality problem globally and yet, despite progress, SS process understanding remains relatively poor in watersheds with mixed-land-use practices. The main objective of the current work was to investigate relationships between suspended sediment and land use types at multiple spatial scales (n=5) using four years of suspended sediment data collected in a representative urbanized mixed-land-use (forest, agriculture, urban) watershed. Water samples were analyzed for SS using a nested-scale experimental watershed study design (n=836 samples×5 gauging sites). Kruskal-Wallis and Dunn's post-hoc multiple comparison tests were used to test for significant differences (CI=95%, p<0.05) in SS levels between gauging sites. Climate extremes (high precipitation/drought) were observed during the study period. Annual maximum SS concentrations exceeded 2387.6 mg/L. Median SS concentrations decreased by 60% from the agricultural headwaters to the rural/urban interface, and increased by 98% as urban land use increased. Multiple linear regression analysis results showed significant relationships between SS, annual total precipitation (positive correlate), forested land use (negative correlate), agricultural land use (negative correlate), and urban land use (negative correlate). Estimated annual SS yields ranged from 16.1 to 313.0 t km(-2) year(-1) mainly due to differences in annual total precipitation. Results highlight the need for additional studies, and point to the need for improved best management practices designed to reduce anthropogenic SS loading in mixed-land-use watersheds. Copyright © 2015 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Song, Ping; Chan, Chuen-Yu; Geng, Fuhai; Yu, Qiong; Guo, Yifei; Yu, Lingwei
2012-02-01
Measurements of air samples collected at four urban sites in Shanghai, Taizhou, Liyang and Lin'an and a rural site in Chongming Island of the Yangtze River Delta (YRD) region of China revealed noticeably elevated mixing ratios of methyl chloride (CH3Cl). Median CH3Cl mixing ratios reached 0.9-3.9 ppbv at the five sampling locations, significantly higher than most of those reported for other regions in the world. Especially at Liyang site and Taizhou site, CH3Cl exhibited quite high levels with mixing ratios ranging from 0.9 up to 25.9 ppbv (n = 28) and 0.7 up to 17.3 ppbv (n = 29), respectively. With good correlation with methylene chloride (CH2Cl2) and ethylene dichloride (EDC), abundant CH3Cl in urban Shanghai, was mainly associated with industrial activities, although biomass burnings exist widely in rural areas of east China. The high concentrations and large variation of CH3Cl and EDC simultaneously appeared at Liyang site. Spikes of CH3Cl and EDC concentrations as well as toluene/benzene (T/B) ratios frequently present in easterly airflows indicated an important contribution from emissions of chemical plants clustering in the east of Liyang. Different emission sources may contribute to ambient CH3Cl at Taizhou site, which was suggested by the two kinds of linear regressions of CH3Cl to some other compounds detected. The substantially elevated CH3Cl levels suggest significant influence of intensive industrial activities on the YRD atmosphere.
Pedersen, Birgith; Groenkjaer, Mette; Falkmer, Ursula; Delmar, Charlotte
Changes in weight and body composition among women during and after adjuvant antineoplastic treatment for breast cancer may influence long-term survival and quality of life. Research on factual weight changes is diverse and contrasting, and their influence on women's perception of body and self seems to be insufficiently explored. The aim of this study was to expand the understanding of the association between changes in weight and body composition and the women's perception of body and selves. A mixed-methods research design was used. Data consisted of weight and body composition measures from 95 women with breast cancer during 18 months past surgery. Twelve women from this cohort were interviewed individually at 12 months. Linear mixed model and logistic regression were used to estimate changes of repeated measures and odds ratio. Interviews were analyzed guided by existential phenomenology. Joint displays and integrative mixed-methods interpretation demonstrated that even small weight gains, extended waist, and weight loss were associated with fearing recurrence of breast cancer. Perceiving an ambiguous transforming body, the women moved between a unified body subject and the body as an object dissociated in "I" and "it" while fighting against or accepting the body changes. Integrating findings demonstrated that factual weight changes do not correspond with the perceived changes and may trigger existential threats. Transition to a new habitual body demand health practitioners to enter a joint narrative work to reveal how the changes impact on the women's body and self-perception independent of how they are displayed quantitatively.
Random effects coefficient of determination for mixed and meta-analysis models
Demidenko, Eugene; Sargent, James; Onega, Tracy
2011-01-01
The key feature of a mixed model is the presence of random effects. We have developed a coefficient, called the random effects coefficient of determination, Rr2, that estimates the proportion of the conditional variance of the dependent variable explained by random effects. This coefficient takes values from 0 to 1 and indicates how strong the random effects are. The difference from the earlier suggested fixed effects coefficient of determination is emphasized. If Rr2 is close to 0, there is weak support for random effects in the model because the reduction of the variance of the dependent variable due to random effects is small; consequently, random effects may be ignored and the model simplifies to standard linear regression. The value of Rr2 apart from 0 indicates the evidence of the variance reduction in support of the mixed model. If random effects coefficient of determination is close to 1 the variance of random effects is very large and random effects turn into free fixed effects—the model can be estimated using the dummy variable approach. We derive explicit formulas for Rr2 in three special cases: the random intercept model, the growth curve model, and meta-analysis model. Theoretical results are illustrated with three mixed model examples: (1) travel time to the nearest cancer center for women with breast cancer in the U.S., (2) cumulative time watching alcohol related scenes in movies among young U.S. teens, as a risk factor for early drinking onset, and (3) the classic example of the meta-analysis model for combination of 13 studies on tuberculosis vaccine. PMID:23750070
Shin, W; Mahmoud, S Y; Sakaie, K; Banks, S J; Lowe, M J; Phillips, M; Modic, M T; Bernick, C
2014-02-01
Traumatic brain injury is common in fighting athletes such as boxers, given the frequency of blows to the head. Because DTI is sensitive to microstructural changes in white matter, this technique is often used to investigate white matter integrity in patients with traumatic brain injury. We hypothesized that previous fight exposure would predict DTI abnormalities in fighting athletes after controlling for individual variation. A total of 74 boxers and 81 mixed martial arts fighters were included in the analysis and scanned by use of DTI. Individual information and data on fight exposures, including number of fights and knockouts, were collected. A multiple hierarchical linear regression model was used in region-of-interest analysis to test the hypothesis that fight-related exposure could predict DTI values separately in boxers and mixed martial arts fighters. Age, weight, and years of education were controlled to ensure that these factors would not account for the hypothesized effects. We found that the number of knockouts among boxers predicted increased longitudinal diffusivity and transversal diffusivity in white matter and subcortical gray matter regions, including corpus callosum, isthmus cingulate, pericalcarine, precuneus, and amygdala, leading to increased mean diffusivity and decreased fractional anisotropy in the corresponding regions. The mixed martial arts fighters had increased transversal diffusivity in the posterior cingulate. The number of fights did not predict any DTI measures in either group. These findings suggest that the history of fight exposure in a fighter population can be used to predict microstructural brain damage.
NASA Astrophysics Data System (ADS)
Terrano, Daniel; Tsuper, Ilona; Maraschky, Adam; Holland, Nolan; Streletzky, Kiril
Temperature sensitive nanoparticles were generated from a construct (H20F) of three chains of elastin-like polypeptides (ELP) linked to a negatively charged foldon domain. This ELP system was mixed at different ratios with linear chains of ELP (H40L) which lacks the foldon domain. The mixed system is soluble at room temperature and at a transition temperature (Tt) will form swollen micelles with the hydrophobic linear chains hidden inside. This system was studied using depolarized dynamic light scattering (DDLS) and static light scattering (SLS) to determine the size, shape, and internal structure of the mixed micelles. The mixed micelle in equal parts of H20F and H40L show a constant apparent hydrodynamic radius of 40-45 nm at the concentration window from 25:25 to 60:60 uM (1:1 ratio). At a fixed 50 uM concentration of the H20F, varying H40L concentration from 5 to 80 uM resulted in a linear growth in the hydrodynamic radius from about 11 to about 62 nm, along with a 1000-fold increase in VH signal. A possible simple model explaining the growth of the swollen micelles is considered. Lastly, the VH signal can indicate elongation in the geometry of the particle or could possibly be a result from anisotropic properties from the core of the micelle. SLS was used to study the molecular weight, and the radius of gyration of the micelle to help identify the structure and morphology of mixed micelles and the tangible cause of the VH signal.
Evaluating Differential Effects Using Regression Interactions and Regression Mixture Models
ERIC Educational Resources Information Center
Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung
2015-01-01
Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their…
SEMIPARAMETRIC QUANTILE REGRESSION WITH HIGH-DIMENSIONAL COVARIATES
Zhu, Liping; Huang, Mian; Li, Runze
2012-01-01
This paper is concerned with quantile regression for a semiparametric regression model, in which both the conditional mean and conditional variance function of the response given the covariates admit a single-index structure. This semiparametric regression model enables us to reduce the dimension of the covariates and simultaneously retains the flexibility of nonparametric regression. Under mild conditions, we show that the simple linear quantile regression offers a consistent estimate of the index parameter vector. This is a surprising and interesting result because the single-index model is possibly misspecified under the linear quantile regression. With a root-n consistent estimate of the index vector, one may employ a local polynomial regression technique to estimate the conditional quantile function. This procedure is computationally efficient, which is very appealing in high-dimensional data analysis. We show that the resulting estimator of the quantile function performs asymptotically as efficiently as if the true value of the index vector were known. The methodologies are demonstrated through comprehensive simulation studies and an application to a real dataset. PMID:24501536
Leadership development practices and hospital financial outcomes.
Crowe, Daniel; Garman, Andrew N; Li, Chien-Ching; Helton, Jeff; Anderson, Matthew M; Butler, Peter
2017-08-01
Affordable Care Act legislation is requiring leaders in US health systems to adapt to new and very different approaches to improving operating performance. Research from other industries suggests leadership development can be a helpful component of organizational change strategies; however, there is currently very little healthcare-specific research available to guide design and deployment. The goal of this exploratory study is to examine potential relationships between specific leadership development practices and health system financial outcomes. Results from the National Center for Healthcare Leadership survey of leadership development practices were correlated with hospital and health system financial performance data from the 2013 Medicare Cost Reports. A general linear regression model, controlling for payer mix, case-mix index, and bed size, was used to assess possible relationships between leadership practices and three financial performance metrics: operating margin, days cash on hand, and debt to capitalization. Statistically significant associations were found between hospital-level operating margins and 5 of the 11 leadership practices as well as the composite score. Relationships at the health system level, however, were not statistically significant. Results provide preliminary evidence of an association between hospital financial performance and investments made in developing their leaders.
Thelen, Sven; Miekisch, Wolfram; Halmer, Daniel; Schubert, Jochen; Hering, Peter; Mürtz, Manfred
2008-04-15
Comparison of two different methods for the measurement of ethane at the parts-per-billion (ppb) level is reported. We used cavity leak-out spectroscopy (CALOS) in the 3 microm wavelength region and gas chromatography-flame ionization detection (GC-FID) for the analysis of various gas samples containing ethane fractions in synthetic air. Intraday and interday reproducibilities were studied. Intercomparing the results of two series involving seven samples with ethane mixing ratios ranging from 0.5 to 100 ppb, we found a reasonable agreement between both methods. The scatter plot of GC-FID data versus CALOS data yields a linear regression slope of 1.07 +/- 0.03. Furthermore, some of the ethane mixtures were checked over the course of 1 year, which proved the long-term stability of the ethane mixing ratio. We conclude that CALOS shows equivalent ethane analysis precision compared to GC-FID, with the significant advantage of a much higher time resolution (<1 s) since there is no requirement for sample preconcentration. This opens new analytical possibilities, e.g., for real-time monitoring of ethane traces in exhaled human breath.
Utility of Thermal Infrared Satellite Data For Urban Landscapes
NASA Astrophysics Data System (ADS)
Xian, G.; Crane, M.; Granneman, B.
2006-12-01
Urban landscapes are comprised of a variety of surfaces that are characterized by contrasting radiative, thermal, aerodynamic, and moisture properties. These different surfaces possess diverse physical and thermal attributes that directly influence surface energy balance and our ability to determine surface characteristics in urban areas. Reflectance properties obtained from satellite imagery have proven useful for mapping urban land use and land cover change, as well as ecosystem health. Landsat reflectance bands are commonly used in regression tree models to generate linear equations that correspond to distinct land surface materials. However, urban land cover is generally a heterogeneous mix of bare soil, vegetation, rock, and anthropogenic impervious surfaces. Surface temperature obtained from satellite thermal infrared bands provides valuable information about surface biophysical properties and radiant thermal characteristics of land cover elements, especially for urban environments. This study demonstrates the improved characterization of land cover conditions for Seattle, Washington, and Las Vegas, Nevada, that were achieved by using both the reflectance and thermal bands of Landsat Enhanced Thematic Mapper Plus (ETM+) data. Including the thermal band in the image analysis increased the accuracy of discriminating cover types in heterogeneous landscapes with extreme contrasts, especially for mixed pixels at the urban interface.
Identifying linkages between land use, geomorphology, and aquatic habitat in a mixed-use watershed.
McIlroy, Susan K; Montagne, Cliff; Jones, Clain A; McGlynn, Brian L
2008-11-01
The potential impacts of land use on large woody debris (LWD) were examined in Sourdough Creek Watershed, a rapidly growing area encompassing Bozeman, Montana, USA. We identified six land classes within a 250 m buffer extending on either side of Sourdough Creek and assessed aquatic habitat and geomorphologic variables within each class. All LWD pieces were counted, and we examined 14 other variables, including undercut bank, sinuosity, and substrate composition. LWD numbers were generally low and ranged from 0 to 8.2 pieces per 50 m of stream. Linear regression showed that LWD increased with distance from headwaters, riparian forest width, and sinuosity in four of the six land classes. Statistically significant differences between land classes for many aquatic habitat and geomorphologic variables indicated the impacts of different land uses on stream structure. We also found that practices such as active wood removal played a key role in LWD abundance. This finding suggests that managers should prioritize public education and outreach concerning the importance of in-stream wood, especially in mixed-use watersheds where wood is removed for either aesthetic reasons or to prevent stream flooding.
NASA Astrophysics Data System (ADS)
Mathias, Simon A.; Wen, Zhang
2015-05-01
This article presents a numerical study to investigate the combined role of partial well penetration (PWP) and non-Darcy effects concerning the performance of groundwater production wells. A finite difference model is developed in MATLAB to solve the two-dimensional mixed-type boundary value problem associated with flow to a partially penetrating well within a cylindrical confined aquifer. Non-Darcy effects are incorporated using the Forchheimer equation. The model is verified by comparison to results from existing semi-analytical solutions concerning the same problem but assuming Darcy's law. A sensitivity analysis is presented to explore the problem of concern. For constant pressure production, Non-Darcy effects lead to a reduction in production rate, as compared to an equivalent problem solved using Darcy's law. For fully penetrating wells, this reduction in production rate becomes less significant with time. However, for partially penetrating wells, the reduction in production rate persists for much larger times. For constant production rate scenarios, the combined effect of PWP and non-Darcy flow takes the form of a constant additional drawdown term. An approximate solution for this loss term is obtained by performing linear regression on the modeling results.
Predictors of microbial agents in dust and respiratory health in the Ecrhs.
Tischer, Christina; Zock, Jan-Paul; Valkonen, Maria; Doekes, Gert; Guerra, Stefano; Heederik, Dick; Jarvis, Deborah; Norbäck, Dan; Olivieri, Mario; Sunyer, Jordi; Svanes, Cecilie; Täubel, Martin; Thiering, Elisabeth; Verlato, Giuseppe; Hyvärinen, Anne; Heinrich, Joachim
2015-05-02
Dampness and mould exposure have been repeatedly associated with respiratory health. However, less is known about the specific agents provoking or arresting health effects in adult populations. We aimed to assess predictors of microbial agents in mattress dust throughout Europe and to investigate associations between microbial exposures, home characteristics and respiratory health. Seven different fungal and bacterial parameters were assessed in mattress dust from 956 adult ECRHS II participants in addition to interview based home characteristics. Associations between microbial parameters and the asthma score and lung function were examined using mixed negative binomial regression and linear mixed models, respectively. Indoor dampness and pet keeping were significant predictors for higher microbial agent concentrations in mattress dust. Current mould and condensation in the bedroom were significantly associated with lung function decline and current mould at home was positively associated with the asthma score. Higher concentrations of muramic acid were associated with higher mean ratios of the asthma score (aMR 1.37, 95%CI 1.17-1.61). There was no evidence for any association between fungal and bacterial components and lung function. Indoor dampness was associated with microbial levels in mattress dust which in turn was positively associated with asthma symptoms.
NASA Astrophysics Data System (ADS)
Zhao, H.; Hao, Y.; Liu, X.; Hou, M.; Zhao, X.
2018-04-01
Hyperspectral remote sensing is a completely non-invasive technology for measurement of cultural relics, and has been successfully applied in identification and analysis of pigments of Chinese historical paintings. Although the phenomenon of mixing pigments is very usual in Chinese historical paintings, the quantitative analysis of the mixing pigments in the ancient paintings is still unsolved. In this research, we took two typical mineral pigments, vermilion and stone yellow as example, made precisely mixed samples using these two kinds of pigments, and measured their spectra in the laboratory. For the mixing spectra, both fully constrained least square (FCLS) method and derivative of ratio spectroscopy (DRS) were performed. Experimental results showed that the mixing spectra of vermilion and stone yellow had strong nonlinear mixing characteristics, but at some bands linear unmixing could also achieve satisfactory results. DRS using strong linear bands can reach much higher accuracy than that of FCLS using full bands.
Prediction of siRNA potency using sparse logistic regression.
Hu, Wei; Hu, John
2014-06-01
RNA interference (RNAi) can modulate gene expression at post-transcriptional as well as transcriptional levels. Short interfering RNA (siRNA) serves as a trigger for the RNAi gene inhibition mechanism, and therefore is a crucial intermediate step in RNAi. There have been extensive studies to identify the sequence characteristics of potent siRNAs. One such study built a linear model using LASSO (Least Absolute Shrinkage and Selection Operator) to measure the contribution of each siRNA sequence feature. This model is simple and interpretable, but it requires a large number of nonzero weights. We have introduced a novel technique, sparse logistic regression, to build a linear model using single-position specific nucleotide compositions which has the same prediction accuracy of the linear model based on LASSO. The weights in our new model share the same general trend as those in the previous model, but have only 25 nonzero weights out of a total 84 weights, a 54% reduction compared to the previous model. Contrary to the linear model based on LASSO, our model suggests that only a few positions are influential on the efficacy of the siRNA, which are the 5' and 3' ends and the seed region of siRNA sequences. We also employed sparse logistic regression to build a linear model using dual-position specific nucleotide compositions, a task LASSO is not able to accomplish well due to its high dimensional nature. Our results demonstrate the superiority of sparse logistic regression as a technique for both feature selection and regression over LASSO in the context of siRNA design.
Approximating a nonlinear advanced-delayed equation from acoustics
NASA Astrophysics Data System (ADS)
Teodoro, M. Filomena
2016-10-01
We approximate the solution of a particular non-linear mixed type functional differential equation from physiology, the mucosal wave model of the vocal oscillation during phonation. The mathematical equation models a superficial wave propagating through the tissues. The numerical scheme is adapted from the work presented in [1, 2, 3], using homotopy analysis method (HAM) to solve the non linear mixed type equation under study.
We investigated the use of output from Bayesian stable isotope mixing models as constraints for a linear inverse food web model of a temperate intertidal seagrass system in the Marennes-Oléron Bay, France. Linear inverse modeling (LIM) is a technique that estimates a complete net...
Improving the Power of GWAS and Avoiding Confounding from Population Stratification with PC-Select
Tucker, George; Price, Alkes L.; Berger, Bonnie
2014-01-01
Using a reduced subset of SNPs in a linear mixed model can improve power for genome-wide association studies, yet this can result in insufficient correction for population stratification. We propose a hybrid approach using principal components that does not inflate statistics in the presence of population stratification and improves power over standard linear mixed models. PMID:24788602
Robust extraction of baseline signal of atmospheric trace species using local regression
NASA Astrophysics Data System (ADS)
Ruckstuhl, A. F.; Henne, S.; Reimann, S.; Steinbacher, M.; Vollmer, M. K.; O'Doherty, S.; Buchmann, B.; Hueglin, C.
2012-11-01
The identification of atmospheric trace species measurements that are representative of well-mixed background air masses is required for monitoring atmospheric composition change at background sites. We present a statistical method based on robust local regression that is well suited for the selection of background measurements and the estimation of associated baseline curves. The bootstrap technique is applied to calculate the uncertainty in the resulting baseline curve. The non-parametric nature of the proposed approach makes it a very flexible data filtering method. Application to carbon monoxide (CO) measured from 1996 to 2009 at the high-alpine site Jungfraujoch (Switzerland, 3580 m a.s.l.), and to measurements of 1,1-difluoroethane (HFC-152a) from Jungfraujoch (2000 to 2009) and Mace Head (Ireland, 1995 to 2009) demonstrates the feasibility and usefulness of the proposed approach. The determined average annual change of CO at Jungfraujoch for the 1996 to 2009 period as estimated from filtered annual mean CO concentrations is -2.2 ± 1.1 ppb yr-1. For comparison, the linear trend of unfiltered CO measurements at Jungfraujoch for this time period is -2.9 ± 1.3 ppb yr-1.
NASA Astrophysics Data System (ADS)
Hadley, Brian Christopher
This dissertation assessed remotely sensed data and geospatial modeling technique(s) to map the spatial distribution of total above-ground biomass present on the surface of the Savannah River National Laboratory's (SRNL) Mixed Waste Management Facility (MWMF) hazardous waste landfill. Ordinary least squares (OLS) regression, regression kriging, and tree-structured regression were employed to model the empirical relationship between in-situ measured Bahia (Paspalum notatum Flugge) and Centipede [Eremochloa ophiuroides (Munro) Hack.] grass biomass against an assortment of explanatory variables extracted from fine spatial resolution passive optical and LIDAR remotely sensed data. Explanatory variables included: (1) discrete channels of visible, near-infrared (NIR), and short-wave infrared (SWIR) reflectance, (2) spectral vegetation indices (SVI), (3) spectral mixture analysis (SMA) modeled fractions, (4) narrow-band derivative-based vegetation indices, and (5) LIDAR derived topographic variables (i.e. elevation, slope, and aspect). Results showed that a linear combination of the first- (1DZ_DGVI), second- (2DZ_DGVI), and third-derivative of green vegetation indices (3DZ_DGVI) calculated from hyperspectral data recorded over the 400--960 nm wavelengths of the electromagnetic spectrum explained the largest percentage of statistical variation (R2 = 0.5184) in the total above-ground biomass measurements. In general, the topographic variables did not correlate well with the MWMF biomass data, accounting for less than five percent of the statistical variation. It was concluded that tree-structured regression represented the optimum geospatial modeling technique due to a combination of model performance and efficiency/flexibility factors.
Trepczynski, Adam; Kutzner, Ines; Bergmann, Georg; Taylor, William R; Heller, Markus O
2014-05-01
The external knee adduction moment (EAM) is often considered a surrogate measure of the distribution of loads across the tibiofemoral joint during walking. This study was undertaken to quantify the relationship between the EAM and directly measured medial tibiofemoral contact forces (Fmed ) in a sample of subjects across a spectrum of activities. The EAM for 9 patients who underwent total knee replacement was calculated using inverse dynamics analysis, while telemetric implants provided Fmed for multiple repetitions of 10 activities, including walking, stair negotiation, sit-to-stand activities, and squatting. The effects of the factors "subject" and "activity" on the relationships between Fmed and EAM were quantified using mixed-effects regression analyses in terms of the root mean square error (RMSE) and the slope of the regression. Across subjects and activities a good correlation between peak EAM and Fmed values was observed, with an overall R(2) value of 0.88. However, the slope of the linear regressions varied between subjects by up to a factor of 2. At peak EAM and Fmed , the RMSE of the regression across all subjects was 35% body weight (%BW), while the maximum error was 127 %BW. The relationship between EAM and Fmed is generally good but varies considerably across subjects and activities. These findings emphasize the limitation of relying solely on the EAM to infer medial joint loading when excessive directed cocontraction of muscles exists and call for further investigations into the soft tissue-related mechanisms that modulate the internal forces at the knee. Copyright © 2014 by the American College of Rheumatology.
Predictive and mechanistic multivariate linear regression models for reaction development
Santiago, Celine B.; Guo, Jing-Yao
2018-01-01
Multivariate Linear Regression (MLR) models utilizing computationally-derived and empirically-derived physical organic molecular descriptors are described in this review. Several reports demonstrating the effectiveness of this methodological approach towards reaction optimization and mechanistic interrogation are discussed. A detailed protocol to access quantitative and predictive MLR models is provided as a guide for model development and parameter analysis. PMID:29719711
Adding a Parameter Increases the Variance of an Estimated Regression Function
ERIC Educational Resources Information Center
Withers, Christopher S.; Nadarajah, Saralees
2011-01-01
The linear regression model is one of the most popular models in statistics. It is also one of the simplest models in statistics. It has received applications in almost every area of science, engineering and medicine. In this article, the authors show that adding a predictor to a linear model increases the variance of the estimated regression…
Using nonlinear quantile regression to estimate the self-thinning boundary curve
Quang V. Cao; Thomas J. Dean
2015-01-01
The relationship between tree size (quadratic mean diameter) and tree density (number of trees per unit area) has been a topic of research and discussion for many decades. Starting with Reineke in 1933, the maximum size-density relationship, on a log-log scale, has been assumed to be linear. Several techniques, including linear quantile regression, have been employed...
Simultaneous spectrophotometric determination of salbutamol and bromhexine in tablets.
Habib, I H I; Hassouna, M E M; Zaki, G A
2005-03-01
Typical anti-mucolytic drugs called salbutamol hydrochloride and bromhexine sulfate encountered in tablets were determined simultaneously either by using linear regression at zero-crossing wavelengths of the first derivation of UV-spectra or by application of multiple linear partial least squares regression method. The results obtained by the two proposed mathematical methods were compared with those obtained by the HPLC technique.
Laurens, L M L; Wolfrum, E J
2013-12-18
One of the challenges associated with microalgal biomass characterization and the comparison of microalgal strains and conversion processes is the rapid determination of the composition of algae. We have developed and applied a high-throughput screening technology based on near-infrared (NIR) spectroscopy for the rapid and accurate determination of algal biomass composition. We show that NIR spectroscopy can accurately predict the full composition using multivariate linear regression analysis of varying lipid, protein, and carbohydrate content of algal biomass samples from three strains. We also demonstrate a high quality of predictions of an independent validation set. A high-throughput 96-well configuration for spectroscopy gives equally good prediction relative to a ring-cup configuration, and thus, spectra can be obtained from as little as 10-20 mg of material. We found that lipids exhibit a dominant, distinct, and unique fingerprint in the NIR spectrum that allows for the use of single and multiple linear regression of respective wavelengths for the prediction of the biomass lipid content. This is not the case for carbohydrate and protein content, and thus, the use of multivariate statistical modeling approaches remains necessary.
Zhang, Xin; Liu, Pan; Chen, Yuguang; Bai, Lu; Wang, Wei
2014-01-01
The primary objective of this study was to identify whether the frequency of traffic conflicts at signalized intersections can be modeled. The opposing left-turn conflicts were selected for the development of conflict predictive models. Using data collected at 30 approaches at 20 signalized intersections, the underlying distributions of the conflicts under different traffic conditions were examined. Different conflict-predictive models were developed to relate the frequency of opposing left-turn conflicts to various explanatory variables. The models considered include a linear regression model, a negative binomial model, and separate models developed for four traffic scenarios. The prediction performance of different models was compared. The frequency of traffic conflicts follows a negative binominal distribution. The linear regression model is not appropriate for the conflict frequency data. In addition, drivers behaved differently under different traffic conditions. Accordingly, the effects of conflicting traffic volumes on conflict frequency vary across different traffic conditions. The occurrences of traffic conflicts at signalized intersections can be modeled using generalized linear regression models. The use of conflict predictive models has potential to expand the uses of surrogate safety measures in safety estimation and evaluation.
Standards for Standardized Logistic Regression Coefficients
ERIC Educational Resources Information Center
Menard, Scott
2011-01-01
Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…
Image interpolation via regularized local linear regression.
Liu, Xianming; Zhao, Debin; Xiong, Ruiqin; Ma, Siwei; Gao, Wen; Sun, Huifang
2011-12-01
The linear regression model is a very attractive tool to design effective image interpolation schemes. Some regression-based image interpolation algorithms have been proposed in the literature, in which the objective functions are optimized by ordinary least squares (OLS). However, it is shown that interpolation with OLS may have some undesirable properties from a robustness point of view: even small amounts of outliers can dramatically affect the estimates. To address these issues, in this paper we propose a novel image interpolation algorithm based on regularized local linear regression (RLLR). Starting with the linear regression model where we replace the OLS error norm with the moving least squares (MLS) error norm leads to a robust estimator of local image structure. To keep the solution stable and avoid overfitting, we incorporate the l(2)-norm as the estimator complexity penalty. Moreover, motivated by recent progress on manifold-based semi-supervised learning, we explicitly consider the intrinsic manifold structure by making use of both measured and unmeasured data points. Specifically, our framework incorporates the geometric structure of the marginal probability distribution induced by unmeasured samples as an additional local smoothness preserving constraint. The optimal model parameters can be obtained with a closed-form solution by solving a convex optimization problem. Experimental results on benchmark test images demonstrate that the proposed method achieves very competitive performance with the state-of-the-art interpolation algorithms, especially in image edge structure preservation. © 2011 IEEE
2016-01-01
Understanding the relationship between physiological measurements from human subjects and their demographic data is important within both the biometric and forensic domains. In this paper we explore the relationship between measurements of the human hand and a range of demographic features. We assess the ability of linear regression and machine learning classifiers to predict demographics from hand features, thereby providing evidence on both the strength of relationship and the key features underpinning this relationship. Our results show that we are able to predict sex, height, weight and foot size accurately within various data-range bin sizes, with machine learning classification algorithms out-performing linear regression in most situations. In addition, we identify the features used to provide these relationships applicable across multiple applications. PMID:27806075
Miguel-Hurtado, Oscar; Guest, Richard; Stevenage, Sarah V; Neil, Greg J; Black, Sue
2016-01-01
Understanding the relationship between physiological measurements from human subjects and their demographic data is important within both the biometric and forensic domains. In this paper we explore the relationship between measurements of the human hand and a range of demographic features. We assess the ability of linear regression and machine learning classifiers to predict demographics from hand features, thereby providing evidence on both the strength of relationship and the key features underpinning this relationship. Our results show that we are able to predict sex, height, weight and foot size accurately within various data-range bin sizes, with machine learning classification algorithms out-performing linear regression in most situations. In addition, we identify the features used to provide these relationships applicable across multiple applications.
Kumar, K Vasanth; Porkodi, K; Rocha, F
2008-01-15
A comparison of linear and non-linear regression method in selecting the optimum isotherm was made to the experimental equilibrium data of basic red 9 sorption by activated carbon. The r(2) was used to select the best fit linear theoretical isotherm. In the case of non-linear regression method, six error functions namely coefficient of determination (r(2)), hybrid fractional error function (HYBRID), Marquardt's percent standard deviation (MPSD), the average relative error (ARE), sum of the errors squared (ERRSQ) and sum of the absolute errors (EABS) were used to predict the parameters involved in the two and three parameter isotherms and also to predict the optimum isotherm. Non-linear regression was found to be a better way to obtain the parameters involved in the isotherms and also the optimum isotherm. For two parameter isotherm, MPSD was found to be the best error function in minimizing the error distribution between the experimental equilibrium data and predicted isotherms. In the case of three parameter isotherm, r(2) was found to be the best error function to minimize the error distribution structure between experimental equilibrium data and theoretical isotherms. The present study showed that the size of the error function alone is not a deciding factor to choose the optimum isotherm. In addition to the size of error function, the theory behind the predicted isotherm should be verified with the help of experimental data while selecting the optimum isotherm. A coefficient of non-determination, K(2) was explained and was found to be very useful in identifying the best error function while selecting the optimum isotherm.
Otomaru, Takafumi; Sumita, Yuka I; Chang, Qingan; Fueki, Kenji; Igarashi, Yoshimasa; Taniguchi, Hisashi
2009-07-01
Several previous reports have described factors that affect masticatory function. However, there are no known predictors that affect the food mixing ability of the masticatory function, and it has been impossible to predict masticatory function in mandibulectomy and/or glossectomy patients. The purpose of the present study was to develop a numerical formula that could predict the food mixing ability of the masticatory function among mandibulectomy and/or glossectomy patients. The null hypothesis of the study was that five predictors, namely mandibulectomy, mandibular continuity, number of residual mandibular teeth, occlusal units and tongue movement score, were unable to account for the mixing ability index (MAI) in mandibulectomy and/or glossectomy patients. The subjects were 20 patients who had undergone mandibulectomy and/or glossectomy. The above-described five predictors were assessed. Tongue movement was evaluated with a tongue movement test and the MAI was evaluated with a mixing ability test. Multiple regression analysis was used to examine whether the five predictors affected the MAI after prosthetic treatment. A regression equation was determined for the five predictors (R(2)=0.83; adjusted R(2)=0.77; p<0.001). The obtained regression equation could successfully account for the MAI in mandibulectomy and/or glossectomy patients.
Applied Multiple Linear Regression: A General Research Strategy
ERIC Educational Resources Information Center
Smith, Brandon B.
1969-01-01
Illustrates some of the basic concepts and procedures for using regression analysis in experimental design, analysis of variance, analysis of covariance, and curvilinear regression. Applications to evaluation of instruction and vocational education programs are illustrated. (GR)
A comparison of two gears for quantifying abundance of lotic-dwelling crayfish
Williams, Kristi; Brewer, Shannon K.; Ellersieck, Mark R.
2014-01-01
Crayfish (saddlebacked crayfish, Orconectes medius) catch was compared using a kick seine applied two different ways with a 1-m2 quadrat sampler (with known efficiency and bias in riffles) from three small streams in the Missouri Ozarks. Triplicate samples (one of each technique) were taken from two creeks and one headwater stream (n=69 sites) over a two-year period. General linear mixed models showed the number of crayfish collected using the quadrat sampler was greater than the number collected using either of the two seine techniques. However, there was no significant interaction with gear suggesting year, stream size, and channel unit type did not relate to different catches of crayfish by gear type. Variation in catch among gears was similar, as was the proportion of young-of-year individuals across samples taken with different gears or techniques. Negative binomial linear regression provided the appropriate relation between the gears which allows correction factors to be applied, if necessary, to relate catches by the kick seine to those of the quadrat sampler. The kick seine appears to be a reasonable substitute to the quadrat sampler in these shallow streams, with the advantage of ease of use and shorter time required per sample.
Multivariate predictors of music perception and appraisal by adult cochlear implant users.
Gfeller, Kate; Oleson, Jacob; Knutson, John F; Breheny, Patrick; Driscoll, Virginia; Olszewski, Carol
2008-02-01
The research examined whether performance by adult cochlear implant recipients on a variety of recognition and appraisal tests derived from real-world music could be predicted from technological, demographic, and life experience variables, as well as speech recognition scores. A representative sample of 209 adults implanted between 1985 and 2006 participated. Using multiple linear regression models and generalized linear mixed models, sets of optimal predictor variables were selected that effectively predicted performance on a test battery that assessed different aspects of music listening. These analyses established the importance of distinguishing between the accuracy of music perception and the appraisal of musical stimuli when using music listening as an index of implant success. Importantly, neither device type nor processing strategy predicted music perception or music appraisal. Speech recognition performance was not a strong predictor of music perception, and primarily predicted music perception when the test stimuli included lyrics. Additionally, limitations in the utility of speech perception in predicting musical perception and appraisal underscore the utility of music perception as an alternative outcome measure for evaluating implant outcomes. Music listening background, residual hearing (i.e., hearing aid use), cognitive factors, and some demographic factors predicted several indices of perceptual accuracy or appraisal of music.
Familial associations with paratuberculosis ELISA results in Texas Longhorn cattle.
Osterstock, Jason B; Fosgate, Geoffrey T; Cohen, Noah D; Derr, James N; Manning, Elizabeth J B; Collins, Michael T; Roussel, Allen J
2008-05-25
The objective of this cross-sectional study was to estimate familial associations with paratuberculosis ELISA status in beef cattle. Texas Longhorn cattle (n=715) greater than 2years of age were sampled for paratuberculosis testing using ELISA and fecal culture. Diagnostic test results were indicative of substantial numbers of false-positive serological reactions consistent with environmental exposure to non-MAP Mycobacterium spp. Associations between ancestors and paratuberculosis ELISA status of offspring were assessed using conditional logistic regression. The association between ELISA status of the dam and her offspring was assessed using linear mixed-effect models. Significant associations were identified between some ancestors and offspring ELISA status. The odds of being classified as "suspect" or greater based on ELISA results were 4.6 times greater for offspring of dams with similarly increased S:P ratios. A significant positive linear association was also observed between dam and offspring log-transformed S:P ratios. Results indicate that there is familial aggregation of paratuberculosis ELISA results in beef cattle and suggest that genetic selection based on paratuberculosis ELISA status may decrease seroprevalence. However, genetic selection may have minimal effect on paratuberculosis control in herds with exposure to non-MAP Mycobacterium spp.
A componential model of human interaction with graphs: 1. Linear regression modeling
NASA Technical Reports Server (NTRS)
Gillan, Douglas J.; Lewis, Robert
1994-01-01
Task analyses served as the basis for developing the Mixed Arithmetic-Perceptual (MA-P) model, which proposes (1) that people interacting with common graphs to answer common questions apply a set of component processes-searching for indicators, encoding the value of indicators, performing arithmetic operations on the values, making spatial comparisons among indicators, and repsonding; and (2) that the type of graph and user's task determine the combination and order of the components applied (i.e., the processing steps). Two experiments investigated the prediction that response time will be linearly related to the number of processing steps according to the MA-P model. Subjects used line graphs, scatter plots, and stacked bar graphs to answer comparison questions and questions requiring arithmetic calculations. A one-parameter version of the model (with equal weights for all components) and a two-parameter version (with different weights for arithmetic and nonarithmetic processes) accounted for 76%-85% of individual subjects' variance in response time and 61%-68% of the variance taken across all subjects. The discussion addresses possible modifications in the MA-P model, alternative models, and design implications from the MA-P model.
Statistical and Biophysical Models for Predicting Total and Outdoor Water Use in Los Angeles
NASA Astrophysics Data System (ADS)
Mini, C.; Hogue, T. S.; Pincetl, S.
2012-04-01
Modeling water demand is a complex exercise in the choice of the functional form, techniques and variables to integrate in the model. The goal of the current research is to identify the determinants that control total and outdoor residential water use in semi-arid cities and to utilize that information in the development of statistical and biophysical models that can forecast spatial and temporal urban water use. The City of Los Angeles is unique in its highly diverse socio-demographic, economic and cultural characteristics across neighborhoods, which introduces significant challenges in modeling water use. Increasing climate variability also contributes to uncertainties in water use predictions in urban areas. Monthly individual water use records were acquired from the Los Angeles Department of Water and Power (LADWP) for the 2000 to 2010 period. Study predictors of residential water use include socio-demographic, economic, climate and landscaping variables at the zip code level collected from US Census database. Climate variables are estimated from ground-based observations and calculated at the centroid of each zip code by inverse-distance weighting method. Remotely-sensed products of vegetation biomass and landscape land cover are also utilized. Two linear regression models were developed based on the panel data and variables described: a pooled-OLS regression model and a linear mixed effects model. Both models show income per capita and the percentage of landscape areas in each zip code as being statistically significant predictors. The pooled-OLS model tends to over-estimate higher water use zip codes and both models provide similar RMSE values.Outdoor water use was estimated at the census tract level as the residual between total water use and indoor use. This residual is being compared with the output from a biophysical model including tree and grass cover areas, climate variables and estimates of evapotranspiration at very high spatial resolution. A genetic algorithm based model (Shuffled Complex Evolution-UA; SCE-UA) is also being developed to provide estimates of the predictions and parameters uncertainties and to compare against the linear regression models. Ultimately, models will be selected to undertake predictions for a range of climate change and landscape scenarios. Finally, project results will contribute to a better understanding of water demand to help predict future water use and implement targeted landscaping conservation programs to maintain sustainable water needs for a growing population under uncertain climate variability.
Khalil, Mohamed H.; Shebl, Mostafa K.; Kosba, Mohamed A.; El-Sabrout, Karim; Zaki, Nesma
2016-01-01
Aim: This research was conducted to determine the most affecting parameters on hatchability of indigenous and improved local chickens’ eggs. Materials and Methods: Five parameters were studied (fertility, early and late embryonic mortalities, shape index, egg weight, and egg weight loss) on four strains, namely Fayoumi, Alexandria, Matrouh, and Montazah. Multiple linear regression was performed on the studied parameters to determine the most influencing one on hatchability. Results: The results showed significant differences in commercial and scientific hatchability among strains. Alexandria strain has the highest significant commercial hatchability (80.70%). Regarding the studied strains, highly significant differences in hatching chick weight among strains were observed. Using multiple linear regression analysis, fertility made the greatest percent contribution (71.31%) to hatchability, and the lowest percent contributions were made by shape index and egg weight loss. Conclusion: A prediction of hatchability using multiple regression analysis could be a good tool to improve hatchability percentage in chickens. PMID:27651666
Predicting recycling behaviour: Comparison of a linear regression model and a fuzzy logic model.
Vesely, Stepan; Klöckner, Christian A; Dohnal, Mirko
2016-03-01
In this paper we demonstrate that fuzzy logic can provide a better tool for predicting recycling behaviour than the customarily used linear regression. To show this, we take a set of empirical data on recycling behaviour (N=664), which we randomly divide into two halves. The first half is used to estimate a linear regression model of recycling behaviour, and to develop a fuzzy logic model of recycling behaviour. As the first comparison, the fit of both models to the data included in estimation of the models (N=332) is evaluated. As the second comparison, predictive accuracy of both models for "new" cases (hold-out data not included in building the models, N=332) is assessed. In both cases, the fuzzy logic model significantly outperforms the regression model in terms of fit. To conclude, when accurate predictions of recycling and possibly other environmental behaviours are needed, fuzzy logic modelling seems to be a promising technique. Copyright © 2015 Elsevier Ltd. All rights reserved.
Semiparametric regression during 2003–2007*
Ruppert, David; Wand, M.P.; Carroll, Raymond J.
2010-01-01
Semiparametric regression is a fusion between parametric regression and nonparametric regression that integrates low-rank penalized splines, mixed model and hierarchical Bayesian methodology – thus allowing more streamlined handling of longitudinal and spatial correlation. We review progress in the field over the five-year period between 2003 and 2007. We find semiparametric regression to be a vibrant field with substantial involvement and activity, continual enhancement and widespread application. PMID:20305800
Bennett, Bradley C; Husby, Chad E
2008-03-28
Botanical pharmacopoeias are non-random subsets of floras, with some taxonomic groups over- or under-represented. Moerman [Moerman, D.E., 1979. Symbols and selectivity: a statistical analysis of Native American medical ethnobotany, Journal of Ethnopharmacology 1, 111-119] introduced linear regression/residual analysis to examine these patterns. However, regression, the commonly-employed analysis, suffers from several statistical flaws. We use contingency table and binomial analyses to examine patterns of Shuar medicinal plant use (from Amazonian Ecuador). We first analyzed the Shuar data using Moerman's approach, modified to better meet requirements of linear regression analysis. Second, we assessed the exact randomization contingency table test for goodness of fit. Third, we developed a binomial model to test for non-random selection of plants in individual families. Modified regression models (which accommodated assumptions of linear regression) reduced R(2) to from 0.59 to 0.38, but did not eliminate all problems associated with regression analyses. Contingency table analyses revealed that the entire flora departs from the null model of equal proportions of medicinal plants in all families. In the binomial analysis, only 10 angiosperm families (of 115) differed significantly from the null model. These 10 families are largely responsible for patterns seen at higher taxonomic levels. Contingency table and binomial analyses offer an easy and statistically valid alternative to the regression approach.
High linearity current communicating passive mixer employing a simple resistor bias
NASA Astrophysics Data System (ADS)
Rongjiang, Liu; Guiliang, Guo; Yuepeng, Yan
2013-03-01
A high linearity current communicating passive mixer including the mixing cell and transimpedance amplifier (TIA) is introduced. It employs the resistor in the TIA to reduce the source voltage and the gate voltage of the mixing cell. The optimum linearity and the maximum symmetric switching operation are obtained at the same time. The mixer is implemented in a 0.25 μm CMOS process. The test shows that it achieves an input third-order intercept point of 13.32 dBm, conversion gain of 5.52 dB, and a single sideband noise figure of 20 dB.
An Application to the Prediction of LOD Change Based on General Regression Neural Network
NASA Astrophysics Data System (ADS)
Zhang, X. H.; Wang, Q. J.; Zhu, J. J.; Zhang, H.
2011-07-01
Traditional prediction of the LOD (length of day) change was based on linear models, such as the least square model and the autoregressive technique, etc. Due to the complex non-linear features of the LOD variation, the performances of the linear model predictors are not fully satisfactory. This paper applies a non-linear neural network - general regression neural network (GRNN) model to forecast the LOD change, and the results are analyzed and compared with those obtained with the back propagation neural network and other models. The comparison shows that the performance of the GRNN model in the prediction of the LOD change is efficient and feasible.
Hossein-Zadeh, Navid Ghavi
2016-08-01
The aim of this study was to compare seven non-linear mathematical models (Brody, Wood, Dhanoa, Sikka, Nelder, Rook and Dijkstra) to examine their efficiency in describing the lactation curves for milk fat to protein ratio (FPR) in Iranian buffaloes. Data were 43 818 test-day records for FPR from the first three lactations of Iranian buffaloes which were collected on 523 dairy herds in the period from 1996 to 2012 by the Animal Breeding Center of Iran. Each model was fitted to monthly FPR records of buffaloes using the non-linear mixed model procedure (PROC NLMIXED) in SAS and the parameters were estimated. The models were tested for goodness of fit using Akaike's information criterion (AIC), Bayesian information criterion (BIC) and log maximum likelihood (-2 Log L). The Nelder and Sikka mixed models provided the best fit of lactation curve for FPR in the first and second lactations of Iranian buffaloes, respectively. However, Wood, Dhanoa and Sikka mixed models provided the best fit of lactation curve for FPR in the third parity buffaloes. Evaluation of first, second and third lactation features showed that all models, except for Dijkstra model in the third lactation, under-predicted test time at which daily FPR was minimum. On the other hand, minimum FPR was over-predicted by all equations. Evaluation of the different models used in this study indicated that non-linear mixed models were sufficient for fitting test-day FPR records of Iranian buffaloes.
Linear regression techniques for use in the EC tracer method of secondary organic aerosol estimation
NASA Astrophysics Data System (ADS)
Saylor, Rick D.; Edgerton, Eric S.; Hartsell, Benjamin E.
A variety of linear regression techniques and simple slope estimators are evaluated for use in the elemental carbon (EC) tracer method of secondary organic carbon (OC) estimation. Linear regression techniques based on ordinary least squares are not suitable for situations where measurement uncertainties exist in both regressed variables. In the past, regression based on the method of Deming [1943. Statistical Adjustment of Data. Wiley, London] has been the preferred choice for EC tracer method parameter estimation. In agreement with Chu [2005. Stable estimate of primary OC/EC ratios in the EC tracer method. Atmospheric Environment 39, 1383-1392], we find that in the limited case where primary non-combustion OC (OC non-comb) is assumed to be zero, the ratio of averages (ROA) approach provides a stable and reliable estimate of the primary OC-EC ratio, (OC/EC) pri. In contrast with Chu [2005. Stable estimate of primary OC/EC ratios in the EC tracer method. Atmospheric Environment 39, 1383-1392], however, we find that the optimal use of Deming regression (and the more general York et al. [2004. Unified equations for the slope, intercept, and standard errors of the best straight line. American Journal of Physics 72, 367-375] regression) provides excellent results as well. For the more typical case where OC non-comb is allowed to obtain a non-zero value, we find that regression based on the method of York is the preferred choice for EC tracer method parameter estimation. In the York regression technique, detailed information on uncertainties in the measurement of OC and EC is used to improve the linear best fit to the given data. If only limited information is available on the relative uncertainties of OC and EC, then Deming regression should be used. On the other hand, use of ROA in the estimation of secondary OC, and thus the assumption of a zero OC non-comb value, generally leads to an overestimation of the contribution of secondary OC to total measured OC.
NASA Astrophysics Data System (ADS)
Huang, Wen Deng; Chen, Guang De; Yuan, Zhao Lin; Yang, Chuang Hua; Ye, Hong Gang; Wu, Ye Long
2016-02-01
The theoretical investigations of the interface optical phonons, electron-phonon couplings and its ternary mixed effects in zinc-blende spherical quantum dots are obtained by using the dielectric continuum model and modified random-element isodisplacement model. The features of dispersion curves, electron-phonon coupling strengths, and its ternary mixed effects for interface optical phonons in a single zinc-blende GaN/AlxGa1-xN spherical quantum dot are calculated and discussed in detail. The numerical results show that there are three branches of interface optical phonons. One branch exists in low frequency region; another two branches exist in high frequency region. The interface optical phonons with small quantum number l have more important contributions to the electron-phonon interactions. It is also found that ternary mixed effects have important influences on the interface optical phonon properties in a single zinc-blende GaN/AlxGa1-xN quantum dot. With the increase of Al component, the interface optical phonon frequencies appear linear changes, and the electron-phonon coupling strengths appear non-linear changes in high frequency region. But in low frequency region, the frequencies appear non-linear changes, and the electron-phonon coupling strengths appear linear changes.
Fuertes, Elaine; Flohr, Carsten; Silverberg, Jonathan I; Standl, Marie; Strachan, David P
2017-06-01
We sought to examine the relationship globally between UVR dose exposure and current eczema prevalences. ISAAC Phase Three provided data on eczema prevalence for 13- to 14-year-olds in 214 centers in 87 countries and for 6- to 7-year-olds in 132 centers in 57 countries. Linear and nonlinear associations between (natural log transformed) eczema prevalence and the mean, maximum, minimum, standard deviation, and range of monthly UV dose exposures were assessed using linear mixed-effects regression models. For the 13- to 14-year-olds, the country-level eczema prevalence was positively and linearly associated with country-level monthly mean (prevalence ratio = 1.31 [95% confidence interval = 1.05-1.63] per kJ/m 2 ) and minimum (1.25 [1.06-1.47] per kJ/m 2 ) UVR dose exposure. Linear and nonlinear associations were also observed for other metrics of UV. Results were similar in trend, but nonsignificant, for the fewer centers with 6- to 7-year-olds (e.g., 1.24 [0.96-1.59] per kJ/m 2 for country-level monthly mean UVR). No consistent within-country associations were observed (e.g., 1.05 [0.89-1.23] and 0.92 [0.71-1.18] per kJ/m 2 for center-level monthly mean UVR for the 13- to 14- and 6- to 7-year-olds, respectively). These ecological results support a role for UVR exposure in explaining some of the variation in global childhood eczema prevalence. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Vossoughi, Mehrdad; Ayatollahi, S M T; Towhidi, Mina; Ketabchi, Farzaneh
2012-03-22
The summary measure approach (SMA) is sometimes the only applicable tool for the analysis of repeated measurements in medical research, especially when the number of measurements is relatively large. This study aimed to describe techniques based on summary measures for the analysis of linear trend repeated measures data and then to compare performances of SMA, linear mixed model (LMM), and unstructured multivariate approach (UMA). Practical guidelines based on the least squares regression slope and mean of response over time for each subject were provided to test time, group, and interaction effects. Through Monte Carlo simulation studies, the efficacy of SMA vs. LMM and traditional UMA, under different types of covariance structures, was illustrated. All the methods were also employed to analyze two real data examples. Based on the simulation and example results, it was found that the SMA completely dominated the traditional UMA and performed convincingly close to the best-fitting LMM in testing all the effects. However, the LMM was not often robust and led to non-sensible results when the covariance structure for errors was misspecified. The results emphasized discarding the UMA which often yielded extremely conservative inferences as to such data. It was shown that summary measure is a simple, safe and powerful approach in which the loss of efficiency compared to the best-fitting LMM was generally negligible. The SMA is recommended as the first choice to reliably analyze the linear trend data with a moderate to large number of measurements and/or small to moderate sample sizes.
Yang, Xiaowei; Nie, Kun
2008-03-15
Longitudinal data sets in biomedical research often consist of large numbers of repeated measures. In many cases, the trajectories do not look globally linear or polynomial, making it difficult to summarize the data or test hypotheses using standard longitudinal data analysis based on various linear models. An alternative approach is to apply the approaches of functional data analysis, which directly target the continuous nonlinear curves underlying discretely sampled repeated measures. For the purposes of data exploration, many functional data analysis strategies have been developed based on various schemes of smoothing, but fewer options are available for making causal inferences regarding predictor-outcome relationships, a common task seen in hypothesis-driven medical studies. To compare groups of curves, two testing strategies with good power have been proposed for high-dimensional analysis of variance: the Fourier-based adaptive Neyman test and the wavelet-based thresholding test. Using a smoking cessation clinical trial data set, this paper demonstrates how to extend the strategies for hypothesis testing into the framework of functional linear regression models (FLRMs) with continuous functional responses and categorical or continuous scalar predictors. The analysis procedure consists of three steps: first, apply the Fourier or wavelet transform to the original repeated measures; then fit a multivariate linear model in the transformed domain; and finally, test the regression coefficients using either adaptive Neyman or thresholding statistics. Since a FLRM can be viewed as a natural extension of the traditional multiple linear regression model, the development of this model and computational tools should enhance the capacity of medical statistics for longitudinal data.
NASA Astrophysics Data System (ADS)
Gonçalves, Karen dos Santos; Winkler, Mirko S.; Benchimol-Barbosa, Paulo Roberto; de Hoogh, Kees; Artaxo, Paulo Eduardo; de Souza Hacon, Sandra; Schindler, Christian; Künzli, Nino
2018-07-01
Epidemiological studies generally use particulate matter measurements with diameter less 2.5 μm (PM2.5) from monitoring networks. Satellite aerosol optical depth (AOD) data has considerable potential in predicting PM2.5 concentrations, and thus provides an alternative method for producing knowledge regarding the level of pollution and its health impact in areas where no ground PM2.5 measurements are available. This is the case in the Brazilian Amazon rainforest region where forest fires are frequent sources of high pollution. In this study, we applied a non-linear model for predicting PM2.5 concentration from AOD retrievals using interaction terms between average temperature, relative humidity, sine, cosine of date in a period of 365,25 days and the square of the lagged relative residual. Regression performance statistics were tested comparing the goodness of fit and R2 based on results from linear regression and non-linear regression for six different models. The regression results for non-linear prediction showed the best performance, explaining on average 82% of the daily PM2.5 concentrations when considering the whole period studied. In the context of Amazonia, it was the first study predicting PM2.5 concentrations using the latest high-resolution AOD products also in combination with the testing of a non-linear model performance. Our results permitted a reliable prediction considering the AOD-PM2.5 relationship and set the basis for further investigations on air pollution impacts in the complex context of Brazilian Amazon Region.
Kohli, Nidhi; Sullivan, Amanda L; Sadeh, Shanna; Zopluoglu, Cengiz
2015-04-01
Effective instructional planning and intervening rely heavily on accurate understanding of students' growth, but relatively few researchers have examined mathematics achievement trajectories, particularly for students with special needs. We applied linear, quadratic, and piecewise linear mixed-effects models to identify the best-fitting model for mathematics development over elementary and middle school and to ascertain differences in growth trajectories of children with learning disabilities relative to their typically developing peers. The analytic sample of 2150 students was drawn from the Early Childhood Longitudinal Study - Kindergarten Cohort, a nationally representative sample of United States children who entered kindergarten in 1998. We first modeled students' mathematics growth via multiple mixed-effects models to determine the best fitting model of 9-year growth and then compared the trajectories of students with and without learning disabilities. Results indicate that the piecewise linear mixed-effects model captured best the functional form of students' mathematics trajectories. In addition, there were substantial achievement gaps between students with learning disabilities and students with no disabilities, and their trajectories differed such that students without disabilities progressed at a higher rate than their peers who had learning disabilities. The results underscore the need for further research to understand how to appropriately model students' mathematics trajectories and the need for attention to mathematics achievement gaps in policy. Copyright © 2015 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.
Johnson, A P; Godden, S M; Royster, E; Zuidhof, S; Miller, B; Sorg, J
2016-01-01
The study objective was to compare the efficacy of 2 commercial dry cow mastitis formulations containing cloxacillin benzathine or ceftiofur hydrochloride. Quarter-level outcomes included prevalence of intramammary infection (IMI) postcalving, risk for cure of preexisting infections, risk for acquiring a new IMI during the dry period, and risk for clinical mastitis between dry off and 100 d in milk (DIM). Cow-level outcomes included the risk for clinical mastitis and the risk for removal from the herd between dry off and 100 DIM, as well as Dairy Herd Improvement Association (DHIA) test-day milk component and production measures between calving and 100 DIM. A total of 799 cows from 4 Wisconsin dairy herds were enrolled at dry off and randomized to 1 of the 2 commercial dry cow therapy (DCT) treatments: cloxacillin benzathine (DC; n=401) or ceftiofur hydrochloride (SM; n=398). Aseptic quarter milk samples were collected for routine bacteriological culture before DCT at dry off and again at 0 to 10 DIM. Data describing clinical mastitis cases and DHIA test-day results were retrieved from on-farm electronic records. The overall crude quarter-level prevalence of IMI at dry off was 34.7% and was not different between treatment groups. Ninety-six percent of infections at dry off were of gram-positive organisms, with coagulase-negative Staphylococcus and Aerococcus spp. isolated most frequently. Mixed logistic regression analysis showed no difference between treatments as to the risk for presence of IMI at 0 to 10 DIM (DC=22.4%, SM=19.9%) or on the risk for acquiring a new IMI between dry off and 0 to 10 DIM (DC=16.6%, SM=14.1%). Noninferiority analysis and mixed logistic regression analysis both showed no treatment difference in risk for a cure between dry off and 0 to 10 DIM (DC=84.8%, SM=85.7%). Cox proportional hazards regression showed no difference between treatments in quarter-level risk for clinical mastitis (DC=1.99%, SM=2.96%), cow-level risk for clinical mastitis (DC=17.0%, SM=15.3%), or on risk for removal from the herd (DC=10.7%, SM=10.3%) between dry off and 100 DIM. Finally, multivariable linear regression with repeated measures showed no overall no difference between treatments in DHIA test-day somatic cell count linear score (DC=2.19, SM=2.22), butterfat test (DC=3.84%, SM=3.86%), protein test (DC=3.02%, SM=3.02%), or 305-d mature-equivalent milk production (DC=11,817 kg, SM=11,932 kg) between calving and 100 DIM. In conclusion, DC was noninferior to SM in effecting a cure, and there was no difference in efficacy between these 2 DCT formulations as related to all other udder health or cow performance measures evaluated between dry off and 100 DIM. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Senn, Stephen; Graf, Erika; Caputo, Angelika
2007-12-30
Stratifying and matching by the propensity score are increasingly popular approaches to deal with confounding in medical studies investigating effects of a treatment or exposure. A more traditional alternative technique is the direct adjustment for confounding in regression models. This paper discusses fundamental differences between the two approaches, with a focus on linear regression and propensity score stratification, and identifies points to be considered for an adequate comparison. The treatment estimators are examined for unbiasedness and efficiency. This is illustrated in an application to real data and supplemented by an investigation on properties of the estimators for a range of underlying linear models. We demonstrate that in specific circumstances the propensity score estimator is identical to the effect estimated from a full linear model, even if it is built on coarser covariate strata than the linear model. As a consequence the coarsening property of the propensity score-adjustment for a one-dimensional confounder instead of a high-dimensional covariate-may be viewed as a way to implement a pre-specified, richly parametrized linear model. We conclude that the propensity score estimator inherits the potential for overfitting and that care should be taken to restrict covariates to those relevant for outcome. Copyright (c) 2007 John Wiley & Sons, Ltd.
Non-Linear Approach in Kinesiology Should Be Preferred to the Linear--A Case of Basketball.
Trninić, Marko; Jeličić, Mario; Papić, Vladan
2015-07-01
In kinesiology, medicine, biology and psychology, in which research focus is on dynamical self-organized systems, complex connections exist between variables. Non-linear nature of complex systems has been discussed and explained by the example of non-linear anthropometric predictors of performance in basketball. Previous studies interpreted relations between anthropometric features and measures of effectiveness in basketball by (a) using linear correlation models, and by (b) including all basketball athletes in the same sample of participants regardless of their playing position. In this paper the significance and character of linear and non-linear relations between simple anthropometric predictors (AP) and performance criteria consisting of situation-related measures of effectiveness (SE) in basketball were determined and evaluated. The sample of participants consisted of top-level junior basketball players divided in three groups according to their playing time (8 minutes and more per game) and playing position: guards (N = 42), forwards (N = 26) and centers (N = 40). Linear (general model) and non-linear (general model) regression models were calculated simultaneously and separately for each group. The conclusion is viable: non-linear regressions are frequently superior to linear correlations when interpreting actual association logic among research variables.
Fenske, Nora; Burns, Jacob; Hothorn, Torsten; Rehfuess, Eva A.
2013-01-01
Background Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. Objective We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. Design Using cross-sectional data for children aged 0–24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. Results At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. Conclusions Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role. PMID:24223839
Fenske, Nora; Burns, Jacob; Hothorn, Torsten; Rehfuess, Eva A
2013-01-01
Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. Using cross-sectional data for children aged 0-24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role.
Aqil, Muhammad; Kita, Ichiro; Yano, Akira; Nishiyama, Soichi
2007-10-01
Traditionally, the multiple linear regression technique has been one of the most widely used models in simulating hydrological time series. However, when the nonlinear phenomenon is significant, the multiple linear will fail to develop an appropriate predictive model. Recently, neuro-fuzzy systems have gained much popularity for calibrating the nonlinear relationships. This study evaluated the potential of a neuro-fuzzy system as an alternative to the traditional statistical regression technique for the purpose of predicting flow from a local source in a river basin. The effectiveness of the proposed identification technique was demonstrated through a simulation study of the river flow time series of the Citarum River in Indonesia. Furthermore, in order to provide the uncertainty associated with the estimation of river flow, a Monte Carlo simulation was performed. As a comparison, a multiple linear regression analysis that was being used by the Citarum River Authority was also examined using various statistical indices. The simulation results using 95% confidence intervals indicated that the neuro-fuzzy model consistently underestimated the magnitude of high flow while the low and medium flow magnitudes were estimated closer to the observed data. The comparison of the prediction accuracy of the neuro-fuzzy and linear regression methods indicated that the neuro-fuzzy approach was more accurate in predicting river flow dynamics. The neuro-fuzzy model was able to improve the root mean square error (RMSE) and mean absolute percentage error (MAPE) values of the multiple linear regression forecasts by about 13.52% and 10.73%, respectively. Considering its simplicity and efficiency, the neuro-fuzzy model is recommended as an alternative tool for modeling of flow dynamics in the study area.
González-Aparicio, I; Hidalgo, J; Baklanov, A; Padró, A; Santa-Coloma, O
2013-07-01
There is extensive evidence of the negative impacts on health linked to the rise of the regional background of particulate matter (PM) 10 levels. These levels are often increased over urban areas becoming one of the main air pollution concerns. This is the case on the Bilbao metropolitan area, Spain. This study describes a data-driven model to diagnose PM10 levels in Bilbao at hourly intervals. The model is built with a training period of 7-year historical data covering different urban environments (inland, city centre and coastal sites). The explanatory variables are quantitative-log [NO2], temperature, short-wave incoming radiation, wind speed and direction, specific humidity, hour and vehicle intensity-and qualitative-working days/weekends, season (winter/summer), the hour (from 00 to 23 UTC) and precipitation/no precipitation. Three different linear regression models are compared: simple linear regression; linear regression with interaction terms (INT); and linear regression with interaction terms following the Sawa's Bayesian Information Criteria (INT-BIC). Each type of model is calculated selecting two different periods: the training (it consists of 6 years) and the testing dataset (it consists of 1 year). The results of each type of model show that the INT-BIC-based model (R(2) = 0.42) is the best. Results were R of 0.65, 0.63 and 0.60 for the city centre, inland and coastal sites, respectively, a level of confidence similar to the state-of-the art methodology. The related error calculated for longer time intervals (monthly or seasonal means) diminished significantly (R of 0.75-0.80 for monthly means and R of 0.80 to 0.98 at seasonally means) with respect to shorter periods.
O'Leary, Neil; Chauhan, Balwantray C; Artes, Paul H
2012-10-01
To establish a method for estimating the overall statistical significance of visual field deterioration from an individual patient's data, and to compare its performance to pointwise linear regression. The Truncated Product Method was used to calculate a statistic S that combines evidence of deterioration from individual test locations in the visual field. The overall statistical significance (P value) of visual field deterioration was inferred by comparing S with its permutation distribution, derived from repeated reordering of the visual field series. Permutation of pointwise linear regression (PoPLR) and pointwise linear regression were evaluated in data from patients with glaucoma (944 eyes, median mean deviation -2.9 dB, interquartile range: -6.3, -1.2 dB) followed for more than 4 years (median 10 examinations over 8 years). False-positive rates were estimated from randomly reordered series of this dataset, and hit rates (proportion of eyes with significant deterioration) were estimated from the original series. The false-positive rates of PoPLR were indistinguishable from the corresponding nominal significance levels and were independent of baseline visual field damage and length of follow-up. At P < 0.05, the hit rates of PoPLR were 12, 29, and 42%, at the fifth, eighth, and final examinations, respectively, and at matching specificities they were consistently higher than those of pointwise linear regression. In contrast to population-based progression analyses, PoPLR provides a continuous estimate of statistical significance for visual field deterioration individualized to a particular patient's data. This allows close control over specificity, essential for monitoring patients in clinical practice and in clinical trials.
ERIC Educational Resources Information Center
Liou, Pey-Yan
2009-01-01
The current study examines three regression models: OLS (ordinary least square) linear regression, Poisson regression, and negative binomial regression for analyzing count data. Simulation results show that the OLS regression model performed better than the others, since it did not produce more false statistically significant relationships than…
Use of AMMI and linear regression models to analyze genotype-environment interaction in durum wheat.
Nachit, M M; Nachit, G; Ketata, H; Gauch, H G; Zobel, R W
1992-03-01
The joint durum wheat (Triticum turgidum L var 'durum') breeding program of the International Maize and Wheat Improvement Center (CIMMYT) and the International Center for Agricultural Research in the Dry Areas (ICARDA) for the Mediterranean region employs extensive multilocation testing. Multilocation testing produces significant genotype-environment (GE) interaction that reduces the accuracy for estimating yield and selecting appropriate germ plasm. The sum of squares (SS) of GE interaction was partitioned by linear regression techniques into joint, genotypic, and environmental regressions, and by Additive Main effects and the Multiplicative Interactions (AMMI) model into five significant Interaction Principal Component Axes (IPCA). The AMMI model was more effective in partitioning the interaction SS than the linear regression technique. The SS contained in the AMMI model was 6 times higher than the SS for all three regressions. Postdictive assessment recommended the use of the first five IPCA axes, while predictive assessment AMMI1 (main effects plus IPCA1). After elimination of random variation, AMMI1 estimates for genotypic yields within sites were more precise than unadjusted means. This increased precision was equivalent to increasing the number of replications by a factor of 3.7.
Lorenzo-Seva, Urbano; Ferrando, Pere J
2011-03-01
We provide an SPSS program that implements currently recommended techniques and recent developments for selecting variables in multiple linear regression analysis via the relative importance of predictors. The approach consists of: (1) optimally splitting the data for cross-validation, (2) selecting the final set of predictors to be retained in the equation regression, and (3) assessing the behavior of the chosen model using standard indices and procedures. The SPSS syntax, a short manual, and data files related to this article are available as supplemental materials from brm.psychonomic-journals.org/content/supplemental.
NASA Astrophysics Data System (ADS)
Gusriani, N.; Firdaniza
2018-03-01
The existence of outliers on multiple linear regression analysis causes the Gaussian assumption to be unfulfilled. If the Least Square method is forcedly used on these data, it will produce a model that cannot represent most data. For that, we need a robust regression method against outliers. This paper will compare the Minimum Covariance Determinant (MCD) method and the TELBS method on secondary data on the productivity of phytoplankton, which contains outliers. Based on the robust determinant coefficient value, MCD method produces a better model compared to TELBS method.
Cleaning products and short-term respiratory effects among female cleaners with asthma.
Vizcaya, David; Mirabelli, Maria C; Gimeno, David; Antó, Josep-Maria; Delclos, George L; Rivera, Marcela; Orriols, Ramon; Arjona, Lourdes; Burgos, Felip; Zock, Jan-Paul
2015-11-01
We evaluated the short-term effects of exposure to cleaning products on lung function and respiratory symptoms among professional cleaning women. Twenty-one women with current asthma and employed as professional cleaners participated in a 15-day panel study. During 312 person-days of data collection, participants self-reported their use of cleaning products and respiratory symptoms in daily diaries and recorded their forced expiratory volume in 1 s (FEV1) and peak expiratory flow (PEF) three times per day using a handheld spirometer. We evaluated associations of cleaning product use with upper and lower respiratory tract symptoms using Poisson mixed regression models and with changes in FEV1 and PEF using linear mixed regression analyses. Participants reported using an average of 2.4 cleaning products per day, with exposure to at least one strong irritant (eg, ammonia, bleach, hydrochloric acid) on 56% of person-days. Among participants without atopy, lower respiratory tract symptoms were associated with the use of hydrochloric acid and detergents. Measurements of FEV1 and PEF taken in the evening were 174 mL (95% CI 34 to 314) and 37 L/min (CI 4 to 70), respectively, lower on days when three or more sprays were used. Evening and next morning FEV1 were both lower following the use of hydrochloric acid (-616 and -526 mL, respectively) and solvents (-751 and -1059 mL, respectively). Diurnal variation in FEV1 and PEF increased on days when ammonia and lime-scale removers were used. The use of specific cleaning products at work, mainly irritants and sprays, may exacerbate asthma. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
Comparison of multi-subject ICA methods for analysis of fMRI data
Erhardt, Erik Barry; Rachakonda, Srinivas; Bedrick, Edward; Allen, Elena; Adali, Tülay; Calhoun, Vince D.
2010-01-01
Spatial independent component analysis (ICA) applied to functional magnetic resonance imaging (fMRI) data identifies functionally connected networks by estimating spatially independent patterns from their linearly mixed fMRI signals. Several multi-subject ICA approaches estimating subject-specific time courses (TCs) and spatial maps (SMs) have been developed, however there has not yet been a full comparison of the implications of their use. Here, we provide extensive comparisons of four multi-subject ICA approaches in combination with data reduction methods for simulated and fMRI task data. For multi-subject ICA, the data first undergo reduction at the subject and group levels using principal component analysis (PCA). Comparisons of subject-specific, spatial concatenation, and group data mean subject-level reduction strategies using PCA and probabilistic PCA (PPCA) show that computationally intensive PPCA is equivalent to PCA, and that subject-specific and group data mean subject-level PCA are preferred because of well-estimated TCs and SMs. Second, aggregate independent components are estimated using either noise free ICA or probabilistic ICA (PICA). Third, subject-specific SMs and TCs are estimated using back-reconstruction. We compare several direct group ICA (GICA) back-reconstruction approaches (GICA1-GICA3) and an indirect back-reconstruction approach, spatio-temporal regression (STR, or dual regression). Results show the earlier group ICA (GICA1) approximates STR, however STR has contradictory assumptions and may show mixed-component artifacts in estimated SMs. Our evidence-based recommendation is to use GICA3, introduced here, with subject-specific PCA and noise-free ICA, providing the most robust and accurate estimated SMs and TCs in addition to offering an intuitive interpretation. PMID:21162045