Hemmila, April; McGill, Jim; Ritter, David
2008-03-01
To determine whether fingerprint infrared spectra change linearly with age, a partial least squares (PLS1) regression of 155 fingerprint infrared spectra against the person's age was constructed. The regression produced a linear model of age as a function of spectrum with a root mean square error of calibration of less than 4 years, showing an inflection at about 25 years of age. The spectral ranges emphasized by the regression do not correspond to the highest concentration constituents of the fingerprints. Separate linear regression models for old and young people can be constructed with even more statistical rigor. The success of the regression demonstrates that a combination of constituents can be found that changes linearly with age, with a significant shift around puberty.
Lunt, Mark
2015-07-01
In the first article in this series we explored the use of linear regression to predict an outcome variable from a number of predictive factors. It assumed that the predictive factors were measured on an interval scale. However, this article shows how categorical variables can also be included in a linear regression model, enabling predictions to be made separately for different groups and allowing for testing the hypothesis that the outcome differs between groups. The use of interaction terms to measure whether the effect of a particular predictor variable differs between groups is also explained. An alternative approach to testing the difference between groups of the effect of a given predictor, which consists of measuring the effect in each group separately and seeing whether the statistical significance differs between the groups, is shown to be misleading.
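The idea of a categorical predictor plus an interaction term can be sketched as follows. This is a minimal illustration and not the article's own analysis: the function name fit_group_interaction, the 0/1 group coding, and the noise-free synthetic data are all assumptions made for the example. The coefficient on x*group estimates how much the slope differs between the two groups, which is the quantity an interaction test examines.

```python
import numpy as np

def fit_group_interaction(x, group, y):
    """Least-squares fit of y = b0 + b1*x + b2*group + b3*(x*group).

    group is a 0/1 dummy variable (hypothetical coding for two groups).
    b2 is the difference in intercept and b3 the difference in slope
    between group 1 and the reference group 0.
    """
    x = np.asarray(x, dtype=float)
    g = np.asarray(group, dtype=float)
    design = np.column_stack([np.ones_like(x), x, g, x * g])
    coef, *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
    return coef

# Synthetic, noise-free data (assumed values, for illustration only).
x = np.array([0.0, 1.0, 2.0, 3.0, 0.0, 1.0, 2.0, 3.0])
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y = 1.0 + 2.0 * x + 0.5 * group + 1.5 * x * group

coef = fit_group_interaction(x, group, y)
slope_group0 = coef[1]            # slope in the reference group
slope_group1 = coef[1] + coef[3]  # slope in the other group
```

Fitting one model with an interaction term, rather than comparing significance across two separate fits, gives a direct estimate of the slope difference.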
Linear and nonlinear regression techniques for simultaneous and proportional myoelectric control.
Hahne, J M; Biessmann, F; Jiang, N; Rehbaum, H; Farina, D; Meinecke, F C; Muller, K-R; Parra, L C
2014-03-01
In recent years the number of active controllable joints in electrically powered hand-prostheses has increased significantly. However, the control strategies for these devices in current clinical use are inadequate as they require separate and sequential control of each degree-of-freedom (DoF). In this study we systematically compare linear and nonlinear regression techniques for an independent, simultaneous and proportional myoelectric control of wrist movements with two DoF. These techniques include linear regression, mixture of linear experts (ME), multilayer-perceptron, and kernel ridge regression (KRR). They are investigated offline with electro-myographic signals acquired from ten able-bodied subjects and one person with congenital upper limb deficiency. The control accuracy is reported as a function of the number of electrodes and the amount and diversity of training data, providing guidance for the requirements in clinical practice. The results showed that KRR, a nonparametric statistical learning method, outperformed the other methods. However, simple transformations in the feature space could linearize the problem, so that linear models could achieve performance similar to KRR at much lower computational costs. In particular, ME, a physiologically inspired extension of linear regression, represents a promising candidate for the next generation of prosthetic devices.
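For readers unfamiliar with kernel ridge regression, the contrast with a plain linear fit can be sketched as below. This is a generic illustration on a synthetic sine curve, not the study's EMG pipeline; the RBF kernel choice, the values of gamma and lam, and all function names are assumptions made for the example.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq_dist = (np.sum(A ** 2, axis=1)[:, None]
               + np.sum(B ** 2, axis=1)[None, :]
               - 2.0 * A @ B.T)
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

def krr_fit(X, y, gamma=1.0, lam=1e-3):
    """Solve (K + lam*I) alpha = y for the dual coefficients alpha."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def krr_predict(X_train, alpha, X_new, gamma=1.0):
    """Predict as a kernel-weighted sum over the training points."""
    return rbf_kernel(X_new, X_train, gamma) @ alpha

# Synthetic nonlinear target (assumed data, for illustration only).
X = np.linspace(-3.0, 3.0, 40).reshape(-1, 1)
y = np.sin(X).ravel()

alpha = krr_fit(X, y, gamma=0.5, lam=1e-3)
krr_pred = krr_predict(X, alpha, X, gamma=0.5)
krr_rmse = float(np.sqrt(np.mean((krr_pred - y) ** 2)))

# Ordinary straight-line fit on the same data for comparison.
slope, intercept = np.polyfit(X.ravel(), y, 1)
lin_rmse = float(np.sqrt(np.mean((slope * X.ravel() + intercept - y) ** 2)))
```

The kernel fit requires solving an n-by-n system and keeping all training points for prediction, which is one reason a linear model on suitably transformed features can be cheaper to run.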
Estimating linear temporal trends from aggregated environmental monitoring data
Erickson, Richard A.; Gray, Brian R.; Eager, Eric A.
2017-01-01
Trend estimates are often used as part of environmental monitoring programs. These trends inform managers (e.g., are desired species increasing or undesired species decreasing?). Data collected from environmental monitoring programs is often aggregated (i.e., averaged), which confounds sampling and process variation. State-space models allow sampling variation and process variations to be separated. We used simulated time-series to compare linear trend estimations from three state-space models, a simple linear regression model, and an auto-regressive model. We also compared the performance of these five models to estimate trends from a long term monitoring program. We specifically estimated trends for two species of fish and four species of aquatic vegetation from the Upper Mississippi River system. We found that the simple linear regression had the best performance of all the given models because it was best able to recover parameters and had consistent numerical convergence. Conversely, the simple linear regression did the worst job estimating populations in a given year. The state-space models did not estimate trends well, but estimated population sizes best when the models converged. We found that a simple linear regression performed better than more complex autoregression and state-space models when used to analyze aggregated environmental monitoring data.
Wang, Zheng-Xin; Hao, Peng; Yao, Pei-Yi
2017-01-01
The non-linear relationship between provincial economic growth and carbon emissions is investigated by using panel smooth transition regression (PSTR) models. The research indicates that, on the condition of separately taking Gross Domestic Product per capita (GDPpc), energy structure (Es), and urbanisation level (Ul) as transition variables, three models all reject the null hypothesis of a linear relationship, i.e., a non-linear relationship exists. The results show that the three models all contain only one transition function but different numbers of location parameters. The model taking GDPpc as the transition variable has two location parameters, while the other two models separately considering Es and Ul as the transition variables both contain one location parameter. The three models applied in the study all favourably describe the non-linear relationship between economic growth and CO2 emissions in China. It also can be seen that the conversion rate of the influence of Ul on per capita CO2 emissions is significantly higher than those of GDPpc and Es on per capita CO2 emissions. PMID:29236083
Wang, Zheng-Xin; Hao, Peng; Yao, Pei-Yi
2017-12-13
The non-linear relationship between provincial economic growth and carbon emissions is investigated by using panel smooth transition regression (PSTR) models. The research indicates that, on the condition of separately taking Gross Domestic Product per capita (GDPpc), energy structure (Es), and urbanisation level (Ul) as transition variables, three models all reject the null hypothesis of a linear relationship, i.e., a non-linear relationship exists. The results show that the three models all contain only one transition function but different numbers of location parameters. The model taking GDPpc as the transition variable has two location parameters, while the other two models separately considering Es and Ul as the transition variables both contain one location parameter. The three models applied in the study all favourably describe the non-linear relationship between economic growth and CO₂ emissions in China. It also can be seen that the conversion rate of the influence of Ul on per capita CO₂ emissions is significantly higher than those of GDPpc and Es on per capita CO₂ emissions.
Brand, Tilman; Samkange-Zeeb, Florence; Ellert, Ute; Keil, Thomas; Krist, Lilian; Dragano, Nico; Jöckel, Karl-Heinz; Razum, Oliver; Reiss, Katharina; Greiser, Karin Halina; Zimmermann, Heiko; Becher, Heiko; Zeeb, Hajo
2017-06-01
We assessed the association between acculturation and health-related quality of life (HRQoL) among persons with a Turkish migrant background in Germany. 1226 adults of Turkish origin were recruited in four German cities. Acculturation was assessed using the Frankfurt Acculturation Scale resulting in four groups (integration, assimilation, separation and marginalization). Short Form-8 physical and mental components were used to assess the HRQoL. Associations were analysed with linear regression models. Of the respondents, 20% were classified as integrated, 29% assimilated, 29% separated and 19% as marginalized. Separation was associated with poorer physical and mental health (linear regression coefficient (RC) = -2.3, 95% CI -3.9 to -0.8 and RC = -2.4, 95% CI -4.4 to -0.5, respectively; reference: integration). Marginalization was associated with poorer mental health in descendants of migrants (RC = -6.4, 95% CI -12.0 to -0.8; reference: integration). Separation and marginalization are associated with a poorer HRQoL. Policies should support the integration of migrants, and health promotion interventions should target separated and marginalized migrants to improve their HRQoL.
Lee, Kyung Hee; Kang, Seung Kwan; Goo, Jin Mo; Lee, Jae Sung; Cheon, Gi Jeong; Seo, Seongho; Hwang, Eui Jin
2017-03-01
To compare the relationship between Ktrans from DCE-MRI and K1 from dynamic 13N-NH3-PET, with simultaneous and separate MR/PET in the VX-2 rabbit carcinoma model. MR/PET was performed simultaneously and separately, 14 and 15 days after VX-2 tumor implantation at the paravertebral muscle. The Ktrans and K1 values were estimated using an in-house software program. The relationships between Ktrans and K1 were analyzed using Pearson's correlation coefficients and linear/non-linear regression functions. Assuming a linear relationship, Ktrans and K1 exhibited moderate positive correlations with both simultaneous (r=0.54-0.57) and separate (r=0.53-0.69) imaging. However, while the Ktrans and K1 from separate imaging were linearly correlated, those from simultaneous imaging exhibited a non-linear relationship. The amount of change in K1 associated with a unit increase in Ktrans varied depending on Ktrans values. The relationship between Ktrans and K1 may be misinterpreted with separate MR and PET acquisition.
Optimizing Support Vector Machine Parameters with Genetic Algorithm for Credit Risk Assessment
NASA Astrophysics Data System (ADS)
Manurung, Jonson; Mawengkang, Herman; Zamzami, Elviawaty
2017-12-01
Support vector machine (SVM) is a popular classification method known to have strong generalization capabilities. SVM can address both classification and regression problems, using either linear or nonlinear kernels. However, a weakness of SVM is that its optimal parameter values are difficult to determine. SVM calculates the best linear separator on the input feature space according to the training data. To classify data which are not linearly separable, SVM uses the kernel trick to transform the data into linearly separable data in a higher-dimensional feature space. The kernel trick can use various kinds of kernel functions, such as linear, polynomial, radial basis function (RBF) and sigmoid kernels. Each function has parameters which affect the accuracy of SVM classification. To solve this problem, genetic algorithms are proposed as the search algorithm for the optimal parameter values, thus increasing the classification accuracy of SVM. The data are taken from the UCI repository of machine learning databases: Australian Credit Approval. The results show that the combination of SVM and genetic algorithms is effective in improving classification accuracy. Genetic algorithms have been shown to be effective in systematically finding optimal kernel parameters for SVM, instead of randomly selected kernel parameters. The best accuracy on the data was improved from the following values: linear kernel 85.12%, polynomial 81.76%, RBF 77.22%, sigmoid 78.70%. However, for bigger data sizes, this method is not practical because it takes a lot of time.
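The four kernel functions named in this abstract can be written out as below, using their common textbook forms. The parameter names gamma, coef0 and degree and their default values are assumptions chosen for illustration; they are the kind of parameters a search procedure such as a genetic algorithm would tune, and the abstract does not state the exact parameterisation used.

```python
import numpy as np

def linear_kernel(x, z):
    """k(x, z) = x . z"""
    return float(np.dot(x, z))

def polynomial_kernel(x, z, gamma=1.0, coef0=1.0, degree=2):
    """k(x, z) = (gamma * x . z + coef0) ** degree"""
    return float((gamma * np.dot(x, z) + coef0) ** degree)

def rbf_kernel(x, z, gamma=0.5):
    """k(x, z) = exp(-gamma * ||x - z||^2)"""
    diff = np.asarray(x, dtype=float) - np.asarray(z, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))

def sigmoid_kernel(x, z, gamma=1.0, coef0=0.0):
    """k(x, z) = tanh(gamma * x . z + coef0)"""
    return float(np.tanh(gamma * np.dot(x, z) + coef0))

# Two orthogonal unit vectors (assumed inputs, for illustration only).
x = np.array([1.0, 0.0])
z = np.array([0.0, 1.0])
kernel_values = {
    "linear": linear_kernel(x, z),
    "polynomial": polynomial_kernel(x, z),
    "rbf": rbf_kernel(x, z),
    "sigmoid": sigmoid_kernel(x, z),
}
```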
Application of linear regression analysis in accuracy assessment of rolling force calculations
NASA Astrophysics Data System (ADS)
Poliak, E. I.; Shim, M. K.; Kim, G. S.; Choo, W. Y.
1998-10-01
Efficient operation of the computational models employed in process control systems requires periodic assessment of the accuracy of their predictions. Linear regression is proposed as a tool which allows systematic and random prediction errors to be separated from those related to measurements. A quantitative characteristic of the model predictive ability is introduced in addition to standard statistical tests for model adequacy. Rolling force calculations are considered as an example for the application. However, the outlined approach can be used to assess the performance of any computational model.
Clustering performance comparison using K-means and expectation maximization algorithms.
Jung, Yong Gyu; Kang, Min Soo; Heo, Jun
2014-11-14
Clustering is an important means of data mining based on separating data categories by similar features. Unlike the classification algorithm, clustering belongs to the unsupervised type of algorithms. Two representatives of the clustering algorithms are the K-means and the expectation maximization (EM) algorithm. Linear regression analysis was extended to the category-type dependent variable, while logistic regression was achieved using a linear combination of independent variables. To predict the possibility of occurrence of an event, a statistical approach is used. However, the classification of all data by means of logistic regression analysis cannot guarantee the accuracy of the results. In this paper, the logistic regression analysis is applied to EM clusters and the K-means clustering method for quality assessment of red wine, and a method is proposed for ensuring the accuracy of the classification results.
Zhang, Xin; Liu, Pan; Chen, Yuguang; Bai, Lu; Wang, Wei
2014-01-01
The primary objective of this study was to identify whether the frequency of traffic conflicts at signalized intersections can be modeled. The opposing left-turn conflicts were selected for the development of conflict predictive models. Using data collected at 30 approaches at 20 signalized intersections, the underlying distributions of the conflicts under different traffic conditions were examined. Different conflict-predictive models were developed to relate the frequency of opposing left-turn conflicts to various explanatory variables. The models considered include a linear regression model, a negative binomial model, and separate models developed for four traffic scenarios. The prediction performance of different models was compared. The frequency of traffic conflicts follows a negative binomial distribution. The linear regression model is not appropriate for the conflict frequency data. In addition, drivers behaved differently under different traffic conditions. Accordingly, the effects of conflicting traffic volumes on conflict frequency vary across different traffic conditions. The occurrences of traffic conflicts at signalized intersections can be modeled using generalized linear regression models. The use of conflict predictive models has potential to expand the uses of surrogate safety measures in safety estimation and evaluation.
Pattern Recognition Analysis of Age-Related Retinal Ganglion Cell Signatures in the Human Eye
Yoshioka, Nayuta; Zangerl, Barbara; Nivison-Smith, Lisa; Khuu, Sieu K.; Jones, Bryan W.; Pfeiffer, Rebecca L.; Marc, Robert E.; Kalloniatis, Michael
2017-01-01
Purpose: To characterize macular ganglion cell layer (GCL) changes with age and provide a framework to assess changes in ocular disease. This study used data clustering to analyze macular GCL patterns from optical coherence tomography (OCT) in a large cohort of subjects without ocular disease. Methods: Single eyes of 201 patients evaluated at the Centre for Eye Health (Sydney, Australia) were retrospectively enrolled (age range, 20–85); 8 × 8 grid locations obtained from Spectralis OCT macular scans were analyzed with unsupervised classification into statistically separable classes sharing common GCL thickness and change with age. The resulting classes and gridwise data were fitted with linear and segmented linear regression curves. Additionally, normalized data were analyzed to determine regression as a percentage. Accuracy of each model was examined through comparison of predicted 50-year-old equivalent macular GCL thickness for the entire cohort to a true 50-year-old reference cohort. Results: Pattern recognition clustered GCL thickness across the macula into five to eight spatially concentric classes. F-test demonstrated segmented linear regression to be the most appropriate model for macular GCL change. The pattern recognition–derived and normalized model revealed less difference between the predicted macular GCL thickness and the reference cohort (average ± SD 0.19 ± 0.92 and −0.30 ± 0.61 μm) than a gridwise model (average ± SD 0.62 ± 1.43 μm). Conclusions: Pattern recognition successfully identified statistically separable macular areas that undergo a segmented linear reduction with age. This regression model better predicted macular GCL thickness. The various unique spatial patterns revealed by pattern recognition combined with core GCL thickness data provide a framework to analyze GCL loss in ocular disease. PMID:28632847
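A segmented (two-piece) linear regression of the kind mentioned in this abstract can be sketched with a hinge term and a grid search over candidate breakpoints. This is one common way to fit such a model and is not necessarily the procedure used in the study; the function name fit_segmented, the candidate grid, and the thickness values below are hypothetical and chosen only so the example runs.

```python
import numpy as np

def fit_segmented(x, y, candidates):
    """Fit y = b0 + b1*x + b2*max(x - c, 0) for each candidate breakpoint c.

    Returns (sse, c, coef) for the breakpoint with the smallest sum of
    squared errors. b1 is the slope before the breakpoint and b1 + b2
    the slope after it.
    """
    best = None
    for c in candidates:
        design = np.column_stack([np.ones_like(x), x, np.maximum(x - c, 0.0)])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        sse = float(np.sum((design @ coef - y) ** 2))
        if best is None or sse < best[0]:
            best = (sse, float(c), coef)
    return best

# Hypothetical, noise-free data: slow decline before age 50, faster after.
age = np.arange(20.0, 86.0)
thickness = 40.0 - 0.05 * age - 0.2 * np.maximum(age - 50.0, 0.0)

sse, breakpoint, coef = fit_segmented(age, thickness, np.arange(30.0, 76.0, 5.0))
```

Comparing the residual sum of squares of this model against a single straight line is the basis of an F-test for whether the extra segment is warranted.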
Impact of divorce on the quality of life in school-age children.
Eymann, Alfredo; Busaniche, Julio; Llera, Julián; De Cunto, Carmen; Wahren, Carlos
2009-01-01
To assess psychosocial quality of life in school-age children of divorced parents. A cross-sectional survey was conducted at the pediatric outpatient clinic of a community hospital. Children 5 to 12 years old from married families and divorced families were included. Child quality of life was assessed through maternal reports using a Child Health Questionnaire-Parent Form 50. A multiple linear regression model was constructed including clinically relevant variables significant on univariate analysis (beta coefficient and 95%CI). Three hundred and thirty families were invited to participate and 313 completed the questionnaire. Univariate analysis showed that quality of life was significantly associated with parental separation, child sex, time spent with the father, standard of living, and maternal education. In a multiple linear regression model, quality of life scores decreased in boys -4.5 (-6.8 to -2.3) and increased for time spent with the father 0.09 (0.01 to 0.2). In divorced families, multiple linear regression showed that quality of life scores increased when parents had separated by mutual agreement 6.1 (2.7 to 9.4), when the mother had university level education 5.9 (1.7 to 10.1) and for each year elapsed since separation 0.6 (0.2 to 1.1), whereas scores decreased in boys -5.4 (-9.5 to -1.3) and for each one-year increment of maternal age -0.4 (-0.7 to -0.05). Children's psychosocial quality of life was affected by divorce. The Child Health Questionnaire can be useful to detect a decline in the psychosocial quality of life.
DOA Finding with Support Vector Regression Based Forward-Backward Linear Prediction.
Pan, Jingjing; Wang, Yide; Le Bastard, Cédric; Wang, Tianzhen
2017-05-27
Direction-of-arrival (DOA) estimation has drawn considerable attention in array signal processing, particularly with coherent signals and a limited number of snapshots. Forward-backward linear prediction (FBLP) is able to directly deal with coherent signals. Support vector regression (SVR) is robust with small samples. This paper proposes the combination of the advantages of FBLP and SVR in the estimation of DOAs of coherent incoming signals with low snapshots. The performance of the proposed method is validated with numerical simulations in coherent scenarios, in terms of different angle separations, numbers of snapshots, and signal-to-noise ratios (SNRs). Simulation results show the effectiveness of the proposed method.
Non-Linear Approach in Kinesiology Should Be Preferred to the Linear--A Case of Basketball.
Trninić, Marko; Jeličić, Mario; Papić, Vladan
2015-07-01
In kinesiology, medicine, biology and psychology, in which research focus is on dynamical self-organized systems, complex connections exist between variables. The non-linear nature of complex systems has been discussed and explained by the example of non-linear anthropometric predictors of performance in basketball. Previous studies interpreted relations between anthropometric features and measures of effectiveness in basketball by (a) using linear correlation models, and by (b) including all basketball athletes in the same sample of participants regardless of their playing position. In this paper the significance and character of linear and non-linear relations between simple anthropometric predictors (AP) and performance criteria consisting of situation-related measures of effectiveness (SE) in basketball were determined and evaluated. The sample of participants consisted of top-level junior basketball players divided into three groups according to their playing time (8 minutes and more per game) and playing position: guards (N = 42), forwards (N = 26) and centers (N = 40). Linear (general model) and non-linear (general model) regression models were calculated simultaneously and separately for each group. The conclusion is viable: non-linear regressions are frequently superior to linear correlations when interpreting actual association logic among research variables.
A Solution to Separation and Multicollinearity in Multiple Logistic Regression
Shen, Jianzhao; Gao, Sujuan
2010-01-01
In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27–38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. Ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither method solves the problem addressed by the other. In this paper, we propose a double penalized maximum likelihood estimator combining Firth’s penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using current screening data from a community-based dementia study. PMID:20376286
A Solution to Separation and Multicollinearity in Multiple Logistic Regression.
Shen, Jianzhao; Gao, Sujuan
2008-10-01
In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27-38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. Ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither method solves the problem addressed by the other. In this paper, we propose a double penalized maximum likelihood estimator combining Firth's penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using current screening data from a community-based dementia study.
Miller, Nathan; Prevatt, Frances
2017-10-01
The purpose of this study was to reexamine the latent structure of ADHD and sluggish cognitive tempo (SCT) due to issues with construct validity. Two proposed changes to the construct include viewing hyperactivity and sluggishness (hypoactivity) as a single continuum of activity level, and viewing inattention as a separate dimension from activity level. Data were collected from 1,398 adults using Amazon's MTurk. A new scale measuring activity level was developed, and scores of Inattention were regressed onto scores of Activity Level using curvilinear regression. The Activity Level scale showed acceptable levels of internal consistency, normality, and unimodality. Curvilinear regression indicates that a quadratic (curvilinear) model accurately explains a small but significant portion of the variance in levels of inattention. Hyperactivity and hypoactivity may be viewed as a continuum, rather than separate disorders. Inattention may have a U-shaped relationship with activity level. Linear analyses may be insufficient and inaccurate for studying ADHD.
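A quadratic (curvilinear) regression of the kind described here can be sketched with an ordinary polynomial fit. The variable names and the noise-free synthetic scores below are assumptions for illustration and are not the study's data. A positive coefficient on the squared term gives a U shape, and the vertex marks the activity level at which predicted inattention is lowest.

```python
import numpy as np

# Hypothetical, noise-free scores: inattention is lowest at activity = 1.
activity = np.linspace(-3.0, 5.0, 41)
inattention = 2.0 + (activity - 1.0) ** 2

# np.polyfit returns coefficients from the highest power down: [a, b, c]
# for the model a*x**2 + b*x + c.
a, b, c = np.polyfit(activity, inattention, 2)

# For a > 0 the parabola opens upward (U shape) with its minimum at -b / (2a).
vertex = -b / (2.0 * a)

# A straight-line fit to the same data for comparison; it cannot capture
# the curvature, so its residual error is larger.
lin_coef = np.polyfit(activity, inattention, 1)
lin_sse = float(np.sum((np.polyval(lin_coef, activity) - inattention) ** 2))
quad_sse = float(np.sum((np.polyval([a, b, c], activity) - inattention) ** 2))
```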
Regression analysis using dependent Polya trees.
Schörgendorfer, Angela; Branscum, Adam J
2013-11-30
Many commonly used models for linear regression analysis force overly simplistic shape and scale constraints on the residual structure of data. We propose a semiparametric Bayesian model for regression analysis that produces data-driven inference by using a new type of dependent Polya tree prior to model arbitrary residual distributions that are allowed to evolve across increasing levels of an ordinal covariate (e.g., time, in repeated measurement studies). By modeling residual distributions at consecutive covariate levels or time points using separate, but dependent Polya tree priors, distributional information is pooled while allowing for broad pliability to accommodate many types of changing residual distributions. We can use the proposed dependent residual structure in a wide range of regression settings, including fixed-effects and mixed-effects linear and nonlinear models for cross-sectional, prospective, and repeated measurement data. A simulation study illustrates the flexibility of our novel semiparametric regression model to accurately capture evolving residual distributions. In an application to immune development data on immunoglobulin G antibodies in children, our new model outperforms several contemporary semiparametric regression models based on a predictive model selection criterion.
2015-10-30
pressure values onto the SD card. The addition of free and open-source Arduino libraries allowed for the seamless integration of the shield into the...alert the user when replacing the separator is necessary. Methods: A sensor was built to measure and record differential pressure values within the...from the transducers during simulated blockages were transformed into pressure values using linear regression equations from the calibration data
Multiple regression technique for Pth degree polynominals with and without linear cross products
NASA Technical Reports Server (NTRS)
Davis, J. W.
1973-01-01
A multiple regression technique was developed by which the nonlinear behavior of specified independent variables can be related to a given dependent variable. The polynomial expression can be of Pth degree and can incorporate N independent variables. Two cases are treated such that mathematical models can be studied both with and without linear cross products. The resulting surface fits can be used to summarize trends for a given phenomenon and provide a mathematical relationship for subsequent analysis. To implement this technique, separate computer programs were developed for the case without linear cross products and for the case incorporating such cross products which evaluate the various constants in the model regression equation. In addition, the significance of the estimated regression equation is considered and the standard deviation, the F statistic, the maximum absolute percent error, and the average of the absolute values of the percent of error evaluated. The computer programs and their manner of utilization are described. Sample problems are included to illustrate the use and capability of the technique which show the output formats and typical plots comparing computer results to each set of input data.
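The two cases described in this abstract can be sketched by building the regression design matrix with and without cross-product columns. This is an illustrative reading and not the original computer programs: it assumes that "linear cross products" means the pairwise products x_i * x_j of distinct variables, and the function name polynomial_design and the synthetic data are hypothetical.

```python
import numpy as np
from itertools import combinations

def polynomial_design(X, degree, cross_products=False):
    """Design matrix with an intercept and powers 1..degree of each variable.

    Column order: intercept, then x_1, x_1^2, ..., x_1^P, x_2, ..., x_N^P.
    If cross_products is True, the pairwise products x_i * x_j (i < j)
    are appended (an assumed interpretation of "linear cross products").
    """
    X = np.asarray(X, dtype=float)
    n_samples, n_vars = X.shape
    cols = [np.ones(n_samples)]
    for j in range(n_vars):
        for p in range(1, degree + 1):
            cols.append(X[:, j] ** p)
    if cross_products:
        for i, j in combinations(range(n_vars), 2):
            cols.append(X[:, i] * X[:, j])
    return np.column_stack(cols)

# Synthetic, noise-free data with a known cross-product effect.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(30, 2))
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1] ** 2 + 4.0 * X[:, 0] * X[:, 1]

design = polynomial_design(X, degree=2, cross_products=True)
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
```

With N = 2 variables and P = 2, the matrix has five columns without cross products and six with them, so the second model can represent the x_1 * x_2 term that the first cannot.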
Almalik, Osama; Nijhuis, Michiel B; van den Heuvel, Edwin R
2014-01-01
Shelf-life estimation usually requires that at least three registration batches are tested for stability at multiple storage conditions. The shelf-life estimates are often obtained by linear regression analysis per storage condition, an approach implicitly suggested by ICH guideline Q1E. A linear regression analysis combining all data from multiple storage conditions was recently proposed in the literature when variances are homogeneous across storage conditions. The combined analysis is expected to perform better than the separate analysis per storage condition, since pooling data would lead to an improved estimate of the variation and higher numbers of degrees of freedom, but this is not evident for shelf-life estimation. Indeed, the two approaches treat the observed initial batch results, the intercepts in the model, and poolability of batches differently, which may eliminate or reduce the expected advantage of the combined approach with respect to the separate approach. Therefore, a simulation study was performed to compare the distribution of simulated shelf-life estimates on several characteristics between the two approaches and to quantify the difference in shelf-life estimates. In general, the combined statistical analysis does estimate the true shelf life more consistently and precisely than the analysis per storage condition, but it did not outperform the separate analysis in all circumstances.
Genomic prediction based on data from three layer lines using non-linear regression models.
Huang, Heyun; Windig, Jack J; Vereijken, Addie; Calus, Mario P L
2014-11-06
Most studies on genomic prediction with reference populations that include multiple lines or breeds have used linear models. Data heterogeneity due to using multiple populations may conflict with model assumptions used in linear regression methods. In an attempt to alleviate potential discrepancies between assumptions of linear models and multi-population data, two types of alternative models were used: (1) a multi-trait genomic best linear unbiased prediction (GBLUP) model that modelled trait by line combinations as separate but correlated traits and (2) non-linear models based on kernel learning. These models were compared to conventional linear models for genomic prediction for two lines of brown layer hens (B1 and B2) and one line of white hens (W1). The three lines each had 1004 to 1023 training and 238 to 240 validation animals. Prediction accuracy was evaluated by estimating the correlation between observed phenotypes and predicted breeding values. When the training dataset included only data from the evaluated line, non-linear models yielded at best a similar accuracy as linear models. In some cases, when adding a distantly related line, the linear models showed a slight decrease in performance, while non-linear models generally showed no change in accuracy. When only information from a closely related line was used for training, linear models and non-linear radial basis function (RBF) kernel models performed similarly. The multi-trait GBLUP model took advantage of the estimated genetic correlations between the lines. Combining linear and non-linear models improved the accuracy of multi-line genomic prediction. Linear models and non-linear RBF models performed very similarly for genomic prediction, despite the expectation that non-linear models could deal better with the heterogeneous multi-population data. 
This heterogeneity of the data can be overcome by modelling trait by line combinations as separate but correlated traits, which avoids the occasional occurrence of large negative accuracies when the evaluated line was not included in the training dataset. Furthermore, when using a multi-line training dataset, non-linear models provided information on the genotype data that was complementary to the linear models, which indicates that the underlying data distributions of the three studied lines were indeed heterogeneous.
Analytical three-point Dixon method: With applications for spiral water-fat imaging.
Wang, Dinghui; Zwart, Nicholas R; Li, Zhiqiang; Schär, Michael; Pipe, James G
2016-02-01
The goal of this work is to present a new three-point analytical approach with flexible even or uneven echo increments for water-fat separation and to evaluate its feasibility with spiral imaging. Two sets of possible solutions of water and fat are first found analytically. Then, two field maps of the B0 inhomogeneity are obtained by linear regression. The initial identification of the true solution is facilitated by the root-mean-square error of the linear regression and the incorporation of a fat spectrum model. The resolved field map after a region-growing algorithm is refined iteratively for spiral imaging. The final water and fat images are recalculated using a joint water-fat separation and deblurring algorithm. Successful implementations were demonstrated with three-dimensional gradient-echo head imaging and single breathhold abdominal imaging. Spiral, high-resolution T1-weighted brain images were shown with comparable sharpness to the reference Cartesian images. With appropriate choices of uneven echo increments, it is feasible to resolve the aliasing of the field map voxel-wise. High-quality water-fat spiral imaging can be achieved with the proposed approach. © 2015 Wiley Periodicals, Inc.
Revisiting tests for neglected nonlinearity using artificial neural networks.
Cho, Jin Seo; Ishida, Isao; White, Halbert
2011-05-01
Tests for regression neglected nonlinearity based on artificial neural networks (ANNs) have so far been studied by separately analyzing the two ways in which the null of regression linearity can hold. This implies that the asymptotic behavior of general ANN-based tests for neglected nonlinearity is still an open question. Here we analyze a convenient ANN-based quasi-likelihood ratio statistic for testing neglected nonlinearity, paying careful attention to both components of the null. We derive the asymptotic null distribution under each component separately and analyze their interaction. Somewhat remarkably, it turns out that the previously known asymptotic null distribution for the type 1 case still applies, but under somewhat stronger conditions than previously recognized. We present Monte Carlo experiments corroborating our theoretical results and showing that standard methods can yield misleading inference when our new, stronger regularity conditions are violated.
NASA Astrophysics Data System (ADS)
Kawase, H.; Nakano, K.
2015-12-01
We investigated the characteristics of strong ground motions separated from acceleration Fourier spectra and acceleration response spectra of 5% damping calculated from weak and moderate ground motions observed by K-NET, KiK-net, and the JMA Shindokei Network in Japan using the generalized spectral inversion method. The separation method used the outcrop motions at YMGH01 as reference where we extracted site responses due to shallow weathered layers. We include events with JMA magnitude equal to or larger than 4.5 observed from 1996 to 2011. We find that our frequency-dependent Q values are comparable to those of previous studies. From the corner frequencies of Fourier source spectra, we calculate Brune's stress parameters and find a clear magnitude dependence, in which smaller events tend to spread over a wider range while maintaining the same maximum value. We confirm that this is exactly the case for several mainshock-aftershock sequences. The average stress parameters for crustal earthquakes are much smaller than those of subduction-zone earthquakes, which can be explained by their depth dependence. We then compared the strong motion characteristics based on the acceleration response spectra and found that the separated characteristics of strong ground motions are different, especially in the lower frequency range below 1 Hz. These differences come from the difference between Fourier spectra and response spectra found in the observed data; that is, predominant components in the high-frequency range of the Fourier spectra increase the response in the lower-frequency range, where the Fourier amplitude is small, because a strong high-frequency component acts as an impulse on a single-degree-of-freedom system. After the separation of the source terms for 5% damping response spectra we can obtain regression coefficients with respect to the magnitude, which lead to a new GMPE as shown in Fig. 1 on the left.
Although stress drops for inland earthquakes are 1/7 of those for subduction-zone earthquakes, linear regression works quite well. After this linear regression we correlate the residuals with Brune's stress parameters of the corresponding events, as shown in Fig. 1 on the right for the case of 1 Hz. We found a quite good linear correlation, which makes the aleatoric uncertainty 40 to 60% smaller than the original.
Schilling, K.E.; Wolter, C.F.
2005-01-01
Nineteen variables, including precipitation, soils and geology, land use, and basin morphologic characteristics, were evaluated to develop Iowa regression models to predict total streamflow (Q), base flow (Qb), storm flow (Qs) and base flow percentage (%Qb) in gauged and ungauged watersheds in the state. Discharge records from a set of 33 watersheds across the state for the 1980 to 2000 period were separated into Qb and Qs. Multiple linear regression found that 75.5 percent of long term average Q was explained by rainfall, sand content, and row crop percentage variables, whereas 88.5 percent of Qb was explained by these three variables plus permeability and floodplain area variables. Qs was explained by average rainfall and %Qb was a function of row crop percentage, permeability, and basin slope variables. Regional regression models developed for long term average Q and Qb were adapted to annual rainfall and showed good correlation between measured and predicted values. Combining the regression model for Q with an estimate of mean annual nitrate concentration, a map of potential nitrate loads in the state was produced. Results from this study have important implications for understanding geomorphic and land use controls on streamflow and base flow in Iowa watersheds and similar agriculture dominated watersheds in the glaciated Midwest. (JAWRA) (Copyright © 2005).
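Statements such as "75.5 percent of long term average Q was explained" refer to the coefficient of determination of the fitted regression. A minimal sketch of that calculation follows; the function name and the toy numbers are hypothetical and are not values from the study.

```python
def r_squared(observed, predicted):
    """Fraction of the variance in `observed` accounted for by `predicted`."""
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)                 # total sum of squares
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))     # residual sum of squares
    if ss_tot == 0:
        raise ValueError("observed values must not all be equal")
    return 1 - ss_res / ss_tot

# Hypothetical streamflow values (arbitrary units) and two sets of predictions.
toy_observed = [1.0, 2.0, 3.0, 4.0]
perfect_fit = r_squared(toy_observed, [1.0, 2.0, 3.0, 4.0])
mean_only_fit = r_squared(toy_observed, [2.5, 2.5, 2.5, 2.5])
```

A model that predicts only the mean of the observations scores 0, and a model that reproduces every observation scores 1.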
Simultaneous multiple non-crossing quantile regression estimation using kernel constraints
Liu, Yufeng; Wu, Yichao
2011-01-01
Quantile regression (QR) is a very useful statistical tool for learning the relationship between the response variable and covariates. For many applications, one often needs to estimate multiple conditional quantile functions of the response variable given covariates. Although one can estimate multiple quantiles separately, it is of great interest to estimate them simultaneously. One advantage of simultaneous estimation is that multiple quantiles can share strength among them to gain better estimation accuracy than individually estimated quantile functions. Another important advantage of joint estimation is the feasibility of incorporating simultaneous non-crossing constraints of QR functions. In this paper, we propose a new kernel-based multiple QR estimation technique, namely simultaneous non-crossing quantile regression (SNQR). We use kernel representations for QR functions and apply constraints on the kernel coefficients to avoid crossing. Both unregularised and regularised SNQR techniques are considered. Asymptotic properties such as asymptotic normality of linear SNQR and oracle properties of the sparse linear SNQR are developed. Our numerical results demonstrate the competitive performance of our SNQR over the original individual QR estimation. PMID:22190842
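The two ingredients the abstract relies on, the quantile (check or "pinball") loss and the non-crossing requirement, can be sketched as below. This is not the SNQR estimator itself, only an illustration of the loss it builds on and of what "crossing" means; all names and numbers are hypothetical.

```python
def pinball_loss(y_true, y_pred, tau):
    """Average check loss at quantile level tau, with 0 < tau < 1."""
    if not 0 < tau < 1:
        raise ValueError("tau must lie strictly between 0 and 1")
    total = 0.0
    for y, q in zip(y_true, y_pred):
        residual = y - q
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += tau * residual if residual >= 0 else (tau - 1) * residual
    return total / len(y_true)

def quantiles_cross(pred_lower, pred_upper):
    """True if the lower-level quantile prediction exceeds the upper one anywhere."""
    return any(lo > hi for lo, hi in zip(pred_lower, pred_upper))

# Hypothetical predictions for the 0.1 and 0.9 quantiles at two covariate values.
median_loss = pinball_loss([1.0, 3.0], [2.0, 2.0], tau=0.5)
crossing = quantiles_cross([1.0, 2.0], [2.0, 1.5])
```

Fitting each quantile level separately minimises this loss one level at a time and can produce crossing curves. A joint estimator adds constraints so that a check like `quantiles_cross` never returns `True`.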
Lafuente, Victoria; Herrera, Luis J; Pérez, María del Mar; Val, Jesús; Negueruela, Ignacio
2015-08-15
In this work, near infrared spectroscopy (NIR) and an acoustic measure (AWETA) (two non-destructive methods) were applied in Prunus persica fruit 'Calrico' (n = 260) to predict Magness-Taylor (MT) firmness. Separate and combined use of these measures was evaluated and compared using partial least squares (PLS) and least squares support vector machine (LS-SVM) regression methods. Also, a mutual-information-based variable selection method, seeking to find the most significant variables to produce optimal accuracy of the regression models, was applied to a joint set of variables (NIR wavelengths and AWETA measure). The newly proposed combined NIR-AWETA model gave good values of the determination coefficient (R²) for the PLS and LS-SVM methods (0.77 and 0.78, respectively), improving the reliability of MT firmness prediction in comparison with separate NIR and AWETA predictions. The three variables selected by the variable selection method (AWETA measure plus NIR wavelengths 675 and 697 nm) achieved R² values of 0.76 and 0.77 for PLS and LS-SVM, respectively. These results indicated that the proposed mutual-information-based variable selection algorithm was a powerful tool for the selection of the most relevant variables. © 2014 Society of Chemical Industry.
Advanced statistics: linear regression, part I: simple linear regression.
Marill, Keith A
2004-01-01
Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.
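The method of least squares and the notion of leverage mentioned in this abstract can be sketched in a few lines. The function name and the toy data are hypothetical; the formulas are the standard closed-form expressions for a single predictor.

```python
def fit_simple_linear(x, y):
    """Least-squares line y = intercept + slope * x, plus the leverage of each point."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Leverage grows with the squared distance of x_i from the mean of x.
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    return slope, intercept, leverage

# Hypothetical data lying exactly on y = 2x + 1.
slope, intercept, leverage = fit_simple_linear([0, 1, 2, 3], [1, 3, 5, 7])
```

The leverages sum to 2, the number of fitted parameters, and the points at the extremes of x carry the largest values.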
Iorgulescu, E; Voicu, V A; Sârbu, C; Tache, F; Albu, F; Medvedovici, A
2016-08-01
The influence of the experimental variability (instrumental repeatability, instrumental intermediate precision and sample preparation variability) and data pre-processing (normalization, peak alignment, background subtraction) on the discrimination power of multivariate data analysis methods (Principal Component Analysis, PCA, and Cluster Analysis, CA) as well as a new algorithm based on linear regression was studied. Data used in the study were obtained through positive or negative ion monitoring electrospray mass spectrometry (+/-ESI/MS) and reversed phase liquid chromatography/UV spectrometric detection (RPLC/UV) applied to green tea extracts. Extractions in ethanol and heated water infusion were used as sample preparation procedures. The multivariate methods were directly applied to mass spectra and chromatograms, involving strictly a holistic comparison of shapes, without assignment of any structural identity to compounds. An alternative data interpretation based on linear regression analysis mutually applied to data series is also discussed. Slopes, intercepts and correlation coefficients produced by the linear regression analysis applied on pairs of very large experimental data series successfully retain information resulting from high frequency instrumental acquisition rates, obviously better defining the profiles being compared. Consequently, each type of sample or comparison between samples produces in the Cartesian space an ellipsoidal volume defined by the normal variation intervals of the slope, intercept and correlation coefficient. Distances between volumes graphically illustrate (dis)similarities between compared data. The instrumental intermediate precision had the major effect on the discrimination power of the multivariate data analysis methods.
Mass spectra produced through ionization from liquid state in atmospheric pressure conditions of bulk complex mixtures resulting from extracted materials of natural origins provided an excellent data basis for multivariate analysis methods, equivalent to data resulting from chromatographic separations. The alternative evaluation of very large data series based on linear regression analysis produced information equivalent to results obtained through application of PCA and CA. Copyright © 2016 Elsevier B.V. All rights reserved.
Gurnani, Ashita S; John, Samantha E; Gavett, Brandon E
2015-05-01
The current study developed regression-based normative adjustments for a bi-factor model of the Brief Test of Adult Cognition by Telephone (BTACT). Archival data from the Midlife Development in the United States-II Cognitive Project were used to develop eight separate linear regression models that predicted bi-factor BTACT scores, accounting for age, education, gender, and occupation, alone and in various combinations. All regression models provided statistically significant fit to the data. A three-predictor regression model fit best and accounted for 32.8% of the variance in the global bi-factor BTACT score. The fit of the regression models was not improved by gender. Eight different regression models are presented to allow the user flexibility in applying demographic corrections to the bi-factor BTACT scores. Occupation corrections, while not widely used, may provide useful demographic adjustments for adult populations or for those individuals who have attained an occupational status not commensurate with expected educational attainment. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Darsazan, Bahar; Shafaati, Alireza; Mortazavi, Seyed Alireza; Zarghi, Afshin
2017-01-01
A simple and reliable stability-indicating RP-HPLC method was developed and validated for analysis of adefovir dipivoxil (ADV). The chromatographic separation was performed on a C18 column using a mixture of acetonitrile-citrate buffer (10 mM at pH 5.2) 36:64 (%v/v) as mobile phase, at a flow rate of 1.5 mL/min. Detection was carried out at 260 nm and a sharp peak was obtained for ADV at a retention time of 5.8 ± 0.01 min. No interferences were observed from its stress degradation products. The method was validated according to the international guidelines. Linear regression analysis of data for the calibration plot showed a linear relationship between peak area and concentration over the range of 0.5-16 μg/mL; the regression coefficient was 0.9999 and the linear regression equation was y = 24844x - 2941.3. The detection (LOD) and quantification (LOQ) limits were 0.12 and 0.35 μg/mL, respectively. The results proved the method was fast (analysis time less than 7 min), precise, reproducible, and accurate for analysis of ADV over a wide range of concentration. The proposed specific method was used for routine quantification of ADV in pharmaceutical bulk and a tablet dosage form.
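A calibration line of this kind is used in reverse: a measured peak area is converted back to a concentration. The sketch below uses the slope, intercept, and linear range reported in the abstract; the function name, the range check, and the example peak area are illustrative assumptions.

```python
SLOPE = 24844.0                 # peak-area units per (ug/mL), from the reported equation
INTERCEPT = -2941.3             # peak-area units, from the reported equation
LINEAR_RANGE = (0.5, 16.0)      # ug/mL, reported calibration range

def concentration_from_area(peak_area):
    """Invert y = SLOPE * x + INTERCEPT; also report whether x lies in the calibrated range."""
    concentration = (peak_area - INTERCEPT) / SLOPE
    low, high = LINEAR_RANGE
    return concentration, low <= concentration <= high

# Hypothetical peak area, chosen to correspond to 8 ug/mL on the reported line.
conc, in_range = concentration_from_area(195810.7)
```

Results outside the calibrated range are flagged, not silently extrapolated, since the reported linearity was only validated between 0.5 and 16 μg/mL.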
Sulfur Mustard Induces Apoptosis in Lung Epithelial Cells via a Caspase Amplification Loop
2010-01-01
analysis using antibodies specific for executioner caspase-3. The positions of the immunoreactive proteins are indicated. Results shown are representative...respectively. The emission at 460 nm from each sample was plotted against time, and linear regression analysis was used to determine the initial velocity...follows, **p<0.01, ***p<0.001. .4. Immunoblot analysis SDS-PAGE and transfer of separated proteins to nitrocellulose membranes were performed according to
Anti-TNF levels in cord blood at birth are associated with anti-TNF type.
Kanis, Shannon L; de Lima, Alison; van der Ent, Cokkie; Rizopoulos, Dimitris; van der Woude, C Janneke
2018-05-15
Pregnancy guidelines for women with Inflammatory Bowel Disease (IBD) provide recommendations regarding anti-TNF cessation during pregnancy, in order to limit fetal exposure. Although infliximab (IFX) leads to higher anti-TNF concentrations in cord blood than adalimumab (ADA), recommendations are similar. We aimed to demonstrate the effect of anti-TNF cessation during pregnancy on fetal exposure, for IFX and ADA separately. We conducted a prospective single center cohort study. Women with IBD, using IFX or ADA, were followed-up during pregnancy. In case of sustained disease remission, anti-TNF was stopped in the third trimester. At birth, anti-TNF concentration was measured in cord blood. A linear regression model was developed to demonstrate anti-TNF concentration in cord blood at birth. In addition, outcomes such as disease activity, pregnancy outcomes and 1-year health outcomes of infants were collected. We included 131 pregnancies that resulted in a live birth (73 IFX, 58 ADA). At birth, 94 cord blood samples were obtained (52 IFX, 42 ADA), showing significantly higher levels of IFX than ADA (p<0.0001). Anti-TNF type and stop week were used in the linear regression model. During the third trimester, IFX transportation over the placenta increases exponentially; ADA transportation, however, is limited and increases in a linear fashion. Overall, health outcomes were comparable. Our linear regression model shows that ADA may be continued longer during pregnancy as transportation over the placenta is lower than IFX. This may reduce relapse risk of the mother without increasing fetal anti-TNF exposure.
NASA Technical Reports Server (NTRS)
Jones, Harrison P.; Branston, Detrick D.; Jones, Patricia B.; Popescu, Miruna D.
2002-01-01
An earlier study compared NASA/NSO Spectromagnetograph (SPM) data with spacecraft measurements of total solar irradiance (TSI) variations over a 1.5 year period in the declining phase of solar cycle 22. This paper extends the analysis to an eight-year period which also spans the rising and early maximum phases of cycle 23. The conclusions of the earlier work appear to be robust: three factors (sunspots, strong unipolar regions, and strong mixed polarity regions) describe most of the variation in the SPM record, but only the first two are associated with TSI. Additionally, the residuals of a linear multiple regression of TSI against SPM observations over the entire eight-year period show an unexplained, increasing, linear time variation with a rate of about 0.05 W m^-2 per year. Separate regressions for the periods before and after 1996 January 01 show no unexplained trends but differ substantially in regression parameters. This behavior may reflect a solar source of TSI variations beyond sunspots and faculae but more plausibly results from uncompensated non-solar effects in one or both of the TSI and SPM data sets.
Correlation and simple linear regression.
Eberly, Lynn E
2007-01-01
This chapter highlights important steps in using correlation and simple linear regression to address scientific questions about the association of two continuous variables with each other. These steps include estimation and inference, assessing model fit, the connection between regression and ANOVA, and study design. Examples in microbiology are used throughout. This chapter provides a framework that is helpful in understanding more complex statistical techniques, such as multiple linear regression, linear mixed effects models, logistic regression, and proportional hazards regression.
Duvall, Susanne W.; Erickson, Sarah J.; MacLean, Peggy; Lowe, Jean R.
2014-01-01
The goal was to identify perinatal predictors of early executive dysfunction in preschoolers born very low birth weight. Fifty-seven preschoolers completed three executive function tasks (Dimensional Change Card Sort-Separated (inhibition, working memory and cognitive flexibility), Bear Dragon (inhibition and working memory) and Gift Delay Open (inhibition)). Relationships between executive function and perinatal medical severity factors (gestational age, days on ventilation, size for gestational age, maternal steroids and number of surgeries), and chronological age were investigated by multiple linear regression and logistic regression. Different perinatal medical severity factors were predictive of executive function tasks, with gestational age predicting Bear Dragon and Gift Open; and number of surgeries and maternal steroids predicting performance on Dimensional Change Card Sort-Separated. By understanding the relationship between perinatal medical severity factors and preschool executive outcomes, we may be able to identify children at highest risk for future executive dysfunction, thereby focusing targeted early intervention services. PMID:25117418
Recruiting Older Youths: Insights from a New Survey of Army Recruits
2014-01-01
remaining in the service at the time to be considered for promotion 8. the unconditional probability of achieving the military grade of E-5 at four years...of service 9. the unconditional probability of achieving the military grade of E-5 at six years of service. We examined both the total effects of...career outcomes for Army enlistees. These effects are computed from separate linear probability regression models that include only dummy variables
Signal and noise extraction from analog memory elements for neuromorphic computing.
Gong, N; Idé, T; Kim, S; Boybat, I; Sebastian, A; Narayanan, V; Ando, T
2018-05-29
Dense crossbar arrays of non-volatile memory (NVM) can potentially enable massively parallel and highly energy-efficient neuromorphic computing systems. The key requirements for the NVM elements are continuous (analog-like) conductance tuning capability and switching symmetry with acceptable noise levels. However, most NVM devices show non-linear and asymmetric switching behaviors. Such non-linear behaviors render separation of signal and noise extremely difficult with conventional characterization techniques. In this study, we establish a practical methodology based on Gaussian process regression to address this issue. The methodology is agnostic to switching mechanisms and applicable to various NVM devices. We show a tradeoff between switching symmetry and signal-to-noise ratio for HfO2-based resistive random access memory. Then, we characterize 1000 phase-change memory devices based on Ge2Sb2Te5 and separate total variability into device-to-device variability and inherent randomness from individual devices. These results highlight the usefulness of our methodology to realize ideal NVM devices for neuromorphic computing.
Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne
2012-01-01
In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models. PMID:23275882
Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne
2012-12-01
In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models.
NASA Astrophysics Data System (ADS)
Grotti, Marco; Abelmoschi, Maria Luisa; Soggia, Francesco; Tiberiade, Christian; Frache, Roberto
2000-12-01
The multivariate effects of Na, K, Mg and Ca as nitrates on the electrothermal atomisation of manganese, cadmium and iron were studied by multiple linear regression modelling. Since the models proved to efficiently predict the effects of the considered matrix elements in a wide range of concentrations, they were applied to correct the interferences occurring in the determination of trace elements in seawater after pre-concentration of the analytes. In order to obtain a statistically significant number of samples, a large volume of the certified seawater reference materials CASS-3 and NASS-3 was treated with Chelex-100 resin; then, the chelating resin was separated from the solution, divided into several sub-samples, each of them was eluted with nitric acid and analysed by electrothermal atomic absorption spectrometry (for trace element determinations) and inductively coupled plasma optical emission spectrometry (for matrix element determinations). To minimise any other systematic error besides that due to matrix effects, accuracy of the pre-concentration step and contamination levels of the procedure were checked by inductively coupled plasma mass spectrometric measurements. Analytical results obtained by applying the multiple linear regression models were compared with those obtained with other calibration methods, such as external calibration using acid-based standards, external calibration using matrix-matched standards and the analyte addition technique. Empirical models proved to efficiently reduce interferences occurring in the analysis of real samples, allowing an improvement of accuracy better than for other calibration methods.
Woo, John H; Wang, Sumei; Melhem, Elias R; Gee, James C; Cucchiara, Andrew; McCluskey, Leo; Elman, Lauren
2014-01-01
To assess the relationship between clinically assessed Upper Motor Neuron (UMN) disease in Amyotrophic Lateral Sclerosis (ALS) and local diffusion alterations measured in the brain corticospinal tract (CST) by a tractography-driven template-space region-of-interest (ROI) analysis of Diffusion Tensor Imaging (DTI). This cross-sectional study included 34 patients with ALS, on whom DTI was performed. Clinical measures were separately obtained including the Penn UMN Score, a summary metric based upon standard clinical methods. After normalizing all DTI data to a population-specific template, tractography was performed to determine a region-of-interest (ROI) outlining the CST, in which average Mean Diffusivity (MD) and Fractional Anisotropy (FA) were estimated. Linear regression analyses were used to investigate associations of DTI metrics (MD, FA) with clinical measures (Penn UMN Score, ALSFRS-R, duration-of-disease), along with age, sex, handedness, and El Escorial category as covariates. For MD, the regression model was significant (p = 0.02), and the only significant predictors were the Penn UMN Score (p = 0.005) and age (p = 0.03). The FA regression model was also significant (p = 0.02); the only significant predictor was the Penn UMN Score (p = 0.003). Measured by the template-space ROI method, both MD and FA were linearly associated with the Penn UMN Score, supporting the hypothesis that DTI alterations reflect UMN pathology as assessed by the clinical examination.
Ye, Xin; Beck, Travis W; DeFreitas, Jason M; Wages, Nathan P
2015-04-01
The aim of this study was to compare the acute effects of concentric versus eccentric exercise on motor control strategies. Fifteen men performed six sets of 10 repetitions of maximal concentric exercises or eccentric isokinetic exercises with their dominant elbow flexors on separate experimental visits. Before and after the exercise, maximal strength testing and submaximal trapezoid isometric contractions (40% of the maximal force) were performed. Both exercise conditions caused significant strength loss in the elbow flexors, but the loss was greater following the eccentric exercise (t=2.401, P=.031). The surface electromyographic signals obtained from the submaximal trapezoid isometric contractions were decomposed into individual motor unit action potential trains. For each submaximal trapezoid isometric contraction, the relationship between the average motor unit firing rate and the recruitment threshold was examined using linear regression analysis. In contrast to the concentric exercise, which did not cause significant changes in the mean linear slope coefficient and y-intercept of the linear regression line, the eccentric exercise resulted in a lower mean linear slope and an increased mean y-intercept, thereby indicating that increasing the firing rates of low-threshold motor units may be more important than recruiting high-threshold motor units to compensate for eccentric exercise-induced strength loss. Copyright © 2014 Elsevier B.V. All rights reserved.
Ventilation-Perfusion Relationships Following Experimental Pulmonary Contusion
2007-06-14
696.7 6.1 to 565.0 24.3 Hounsfield units ), as did VOL (4.3 0.5 to 33.5 3.2%). Multivariate linear regression of MGSD, VOL, VD/VT, and QS vs. PaO2...parenchyma was separated into four regions based on the Hounsfield unit (HU) ranges reported by Gattinoni et al. (23) via a segmentation process executed...determined by repeated measures ANOVA. CT, computed tomography; MGSD, mean gray-scale density of the entire lung by CT scan; HU, Hounsfield units
LAS bioconcentration is isomer specific
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tolls, J.; Haller, M.; Graaf, I. de
1995-12-31
The authors measured parent compound specific bioconcentration data for linear alkylbenzene sulfonates in Pimephales promelas. They did so by using cold, custom synthesized sulfophenyl alkanes. They observed that, within homologous series of isomers, the uptake rate constants (k_1) and the bioconcentration factor (BCF) increase with increasing number of carbon atoms in the alkyl chain (n_C-atoms). In contrast, the elimination rate constant k_2 appears to be independent of the alkyl chain length. Regressions of log BCF vs n_C-atoms yielded different slopes for the homologous groups of the 5- and the 2-sulfophenyl alkane isomers. Regression of all log BCF-data vs log 1/CMC yielded a good description of the data. However, when regressing the data for both homologous series separately again very different slopes are obtained. The results therefore indicate that hydrophobicity-bioconcentration relationships may be different for different homologous groups of sulfophenyl alkanes.
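The link between the rate constants and the BCF can be written out under the common first-order, one-compartment uptake model. The abstract does not state which model was used, so the model and the symbols C_f (concentration in fish) and C_w (concentration in water) are assumptions introduced here for illustration.

```latex
\begin{align}
\frac{dC_f}{dt} &= k_1 C_w - k_2 C_f, \\
\mathrm{BCF} &= \left.\frac{C_f}{C_w}\right|_{\text{steady state}} = \frac{k_1}{k_2}, \\
\log \mathrm{BCF} &= \log k_1 - \log k_2 .
\end{align}
```

Under this model, if the elimination rate constant does not depend on the alkyl chain length, the slope of log BCF against the number of carbon atoms equals the slope of the log uptake rate constant against the number of carbon atoms. This is consistent with the reported observation that uptake and BCF rise together.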
A Fast Gradient Method for Nonnegative Sparse Regression With Self-Dictionary
NASA Astrophysics Data System (ADS)
Gillis, Nicolas; Luce, Robert
2018-01-01
A nonnegative matrix factorization (NMF) can be computed efficiently under the separability assumption, which asserts that all the columns of the given input data matrix belong to the cone generated by a (small) subset of them. The provably most robust methods to identify these conic basis columns are based on nonnegative sparse regression and self dictionaries, and require the solution of large-scale convex optimization problems. In this paper we study a particular nonnegative sparse regression model with self dictionary. As opposed to previously proposed models, this model yields a smooth optimization problem where the sparsity is enforced through linear constraints. We show that the Euclidean projection on the polyhedron defined by these constraints can be computed efficiently, and propose a fast gradient method to solve our model. We compare our algorithm with several state-of-the-art methods on synthetic data sets and real-world hyperspectral images.
Kumar, K Vasanth
2007-04-02
Kinetic experiments were carried out for the sorption of safranin onto activated carbon particles. The kinetic data were fitted to pseudo-second order model of Ho, Sobkowsk and Czerwinski, Blanchard et al. and Ritchie by linear and non-linear regression methods. Non-linear method was found to be a better way of obtaining the parameters involved in the second order rate kinetic expressions. Both linear and non-linear regression showed that the Sobkowsk and Czerwinski and Ritchie's pseudo-second order models were the same. Non-linear regression analysis showed that both Blanchard et al. and Ho have similar ideas on the pseudo-second order model but with different assumptions. The best fit of experimental data in Ho's pseudo-second order expression by linear and non-linear regression method showed that Ho pseudo-second order model was a better kinetic expression when compared to other pseudo-second order kinetic expressions.
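The contrast between linear and non-linear fitting of a pseudo-second-order model can be sketched as follows. This is a minimal illustration, not the author's procedure: it assumes the commonly used form q_t = k q_e^2 t / (1 + k q_e t) with its linearization t/q_t = 1/(k q_e^2) + t/q_e, the data values are invented, and the Gauss-Newton routine with step halving is a stand-in for whatever non-linear solver was actually used.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def pso(t, qe, k):
    """Pseudo-second-order uptake (assumed form)."""
    return k * qe * qe * t / (1.0 + k * qe * t)

def sse(t, q, qe, k):
    """Sum of squared errors in q, the untransformed response."""
    return sum((qi - pso(ti, qe, k)) ** 2 for ti, qi in zip(t, q))

def fit_linearized(t, q):
    """Fit t/q = 1/(k qe^2) + t/qe by a straight line."""
    slope, intercept = fit_line(t, [ti / qi for ti, qi in zip(t, q)])
    return 1.0 / slope, slope * slope / intercept  # qe, k

def fit_nonlinear(t, q, qe, k, iters=100):
    """Gauss-Newton on the untransformed data, with step halving."""
    best = sse(t, q, qe, k)
    for _ in range(iters):
        a11 = a12 = a22 = b1 = b2 = 0.0
        for ti, qi in zip(t, q):
            d = 1.0 + k * qe * ti
            r = qi - pso(ti, qe, k)
            j1 = k * qe * ti * (2.0 + k * qe * ti) / d ** 2  # d f / d qe
            j2 = qe * qe * ti / d ** 2                       # d f / d k
            a11 += j1 * j1; a12 += j1 * j2; a22 += j2 * j2
            b1 += j1 * r; b2 += j2 * r
        det = a11 * a22 - a12 * a12
        if det == 0.0:
            break
        dqe = (a22 * b1 - a12 * b2) / det
        dk = (a11 * b2 - a12 * b1) / det
        step, improved = 1.0, False
        while step > 1e-8:
            nqe, nk = qe + step * dqe, k + step * dk
            if nqe > 0 and nk > 0:
                trial = sse(t, q, nqe, nk)
                if trial < best:  # accept only steps that reduce the error
                    qe, k, best, improved = nqe, nk, trial, True
                    break
            step /= 2.0
        if not improved:
            break
    return qe, k

# Invented data: exact curve for qe = 50, k = 0.002, times small noise factors.
t = [5.0, 10.0, 20.0, 40.0, 60.0, 90.0, 120.0]
noise = [1.03, 0.98, 1.02, 0.99, 1.01, 0.995, 1.005]
q_exact = [pso(ti, 50.0, 0.002) for ti in t]
q_noisy = [qi * f for qi, f in zip(q_exact, noise)]

qe_lin, k_lin = fit_linearized(t, q_noisy)
qe_nl, k_nl = fit_nonlinear(t, q_noisy, qe_lin, k_lin)
sse_lin = sse(t, q_noisy, qe_lin, k_lin)
sse_nl = sse(t, q_noisy, qe_nl, k_nl)
```

Because the non-linear fit starts from the linearized estimates and only accepts error-reducing steps, its error in q can never exceed that of the linearized fit, which is one way to read the abstract's preference for the non-linear method.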
NASA Astrophysics Data System (ADS)
Lisenko, S. A.; Kugeiko, M. M.
2013-01-01
The ability to determine noninvasively microphysical parameters (MPPs) of skin characteristic of malignant melanoma was demonstrated. The MPPs were the melanin content in dermis, saturation of tissue with blood vessels, and concentration and effective size of tissue scatterers. The proposed method was based on spatially resolved spectral measurements of skin diffuse reflectance and multiple regressions between linearly independent measurement components and skin MPPs. The regressions were established by modeling radiation transfer in skin with a wide variation of its MPPs. Errors in the determination of skin MPPs were estimated using fiber-optic measurements of its diffuse reflectance at wavelengths of commercially available semiconductor diode lasers (578, 625, 660, 760, and 806 nm) at source-detector separations of 0.23-1.38 mm.
Inferring gene regression networks with model trees
2010-01-01
Background Novel strategies are required in order to handle the huge amount of data produced by microarray technologies. To infer gene regulatory networks, the first step is to find direct regulatory relationships between genes, building the so-called gene co-expression networks. They are typically generated using correlation statistics as pairwise similarity measures. Correlation-based methods are very useful for determining whether two genes have a strong global similarity, but they do not detect local similarities. Results We propose model trees as a method to identify gene interaction networks. While correlation-based methods analyze each pair of genes, in our approach we generate a single regression tree for each gene from the remaining genes. Finally, a graph of all the relationships among output and input genes is built, taking into account whether each pair of genes is statistically significant. For this reason we apply a statistical procedure to control the false discovery rate. The performance of our approach, named REGNET, is experimentally tested on two well-known data sets: the Saccharomyces cerevisiae and E. coli data sets. First, the biological coherence of the results is tested. Second, the E. coli transcriptional network (in the Regulon database) is used as a control to compare the results to those of a correlation-based method. This experiment shows that REGNET performs more accurately at detecting true gene associations than the Pearson and Spearman zeroth and first-order correlation-based methods. Conclusions REGNET generates gene association networks from gene expression data, and differs from correlation-based methods in that the relationship between one gene and others is calculated simultaneously. Model trees are very useful techniques for estimating the numerical values of the target genes by linear regression functions. They are very often more precise than linear regression models because they can fit different linear regressions to separate areas of the search space, favoring the inference of localized similarities over a more global similarity. Furthermore, experimental results show the good performance of REGNET. PMID:20950452
A Technique of Fuzzy C-Mean in Multiple Linear Regression Model toward Paddy Yield
NASA Astrophysics Data System (ADS)
Syazwan Wahab, Nur; Saifullah Rusiman, Mohd; Mohamad, Mahathir; Amira Azmi, Nur; Che Him, Norziha; Ghazali Kamardan, M.; Ali, Maselan
2018-04-01
In this paper, we propose a hybrid model that combines a multiple linear regression model with the fuzzy c-means method. The research examined the relationship between paddy yield and 20 topsoil variates analyzed prior to planting at standard fertilizer rates. Data were taken from the multi-location rice trials carried out by MARDI at major paddy granaries in Peninsular Malaysia from 2009 to 2012. Missing observations were estimated using mean estimation techniques. The data were analyzed using a multiple linear regression model and a combination of the multiple linear regression model and the fuzzy c-means method. Analysis of normality and multicollinearity indicates that the data are normally distributed, without multicollinearity among the independent variables. Fuzzy c-means analysis clusters the paddy yield into two clusters before the multiple linear regression model is applied. The comparison between the two methods indicates that the hybrid of the multiple linear regression model and the fuzzy c-means method outperforms the multiple linear regression model alone, with a lower mean square error.
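The clustering step of such a hybrid can be sketched with a one-dimensional fuzzy c-means routine. This is an illustrative sketch only: the yield values, the fuzzifier m = 2, and the choice of initial centers are assumptions, and the abstract does not state how the authors configured the method.

```python
def fuzzy_c_means_1d(values, centers, m=2.0, iters=100, tol=1e-9):
    """Fuzzy c-means on scalar data; returns final centers and memberships."""
    centers = list(centers)
    memberships = []
    for _ in range(iters):
        memberships = []
        for v in values:
            d = [abs(v - c) for c in centers]
            if 0.0 in d:
                # Point coincides with a center: give it full membership there.
                row = [1.0 if di == 0.0 else 0.0 for di in d]
                total = sum(row)
                row = [r / total for r in row]
            else:
                # Standard update: u_j is proportional to d_j^(-2/(m-1)).
                inv = [di ** (-2.0 / (m - 1.0)) for di in d]
                total = sum(inv)
                row = [x / total for x in inv]
            memberships.append(row)
        new_centers = []
        for j in range(len(centers)):
            w = [row[j] ** m for row in memberships]
            new_centers.append(sum(wi * v for wi, v in zip(w, values)) / sum(w))
        shift = max(abs(a - b) for a, b in zip(new_centers, centers))
        centers = new_centers
        if shift < tol:
            break
    # Memberships were computed from the centers of the last full pass.
    return centers, memberships

# Hypothetical yields forming two obvious groups.
yields = [1.0, 1.2, 0.8, 10.0, 10.3, 9.7]
centers, memberships = fuzzy_c_means_1d(yields, [min(yields), max(yields)])
labels = [row.index(max(row)) for row in memberships]
```

In a hybrid of the kind described, each cluster's observations would then be passed to its own multiple linear regression; that second stage is omitted here.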
Anderson, Carl A; McRae, Allan F; Visscher, Peter M
2006-07-01
Standard quantitative trait loci (QTL) mapping techniques commonly assume that the trait is both fully observed and normally distributed. When considering survival or age-at-onset traits these assumptions are often incorrect. Methods have been developed to map QTL for survival traits; however, they are both computationally intensive and not available in standard genome analysis software packages. We propose a grouped linear regression method for the analysis of continuous survival data. Using simulation we compare this method to both the Cox and Weibull proportional hazards models and a standard linear regression method that ignores censoring. The grouped linear regression method is of equivalent power to both the Cox and Weibull proportional hazards methods and is significantly better than the standard linear regression method when censored observations are present. The method is also robust to the proportion of censored individuals and the underlying distribution of the trait. On the basis of linear regression methodology, the grouped linear regression model is computationally simple and fast and can be implemented readily in freely available statistical software.
Linear regression crash prediction models : issues and proposed solutions.
DOT National Transportation Integrated Search
2010-05-01
The paper develops a linear regression model approach that can be applied to crash data to predict vehicle crashes. The proposed approach involves novice data aggregation to satisfy linear regression assumptions; namely error structure normality ...
Comparison between Linear and Nonlinear Regression in a Laboratory Heat Transfer Experiment
ERIC Educational Resources Information Center
Gonçalves, Carine Messias; Schwaab, Marcio; Pinto, José Carlos
2013-01-01
In order to interpret laboratory experimental data, undergraduate students are used to performing linear regression through linearized versions of nonlinear models. However, the use of linearized models can lead to statistically biased parameter estimates. Even so, it is not an easy task to introduce nonlinear regression and show for the students…
Density determination of nail polishes and paint chips using magnetic levitation
NASA Astrophysics Data System (ADS)
Huang, Peggy P.
Trace evidence is often small, easily overlooked, and difficult to analyze. This study describes a nondestructive method to separate and accurately determine the density of trace evidence samples, specifically nail polish and paint chip using magnetic levitation (MagLev). By determining the levitation height of each sample in the MagLev device, the density of the sample is back extrapolated using a standard density bead linear regression line. The results show that MagLev distinguishes among eight clear nail polishes, including samples from the same manufacturer; separates select colored nail polishes from the same manufacturer; can determine the density range of household paint chips; and shows limited levitation for unknown paint chips. MagLev provides a simple, affordable, and nondestructive means of determining density. The addition of co-solutes to the paramagnetic solution to expand the density range may result in greater discriminatory power and separation and lead to further applications of this technique.
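The back-extrapolation from levitation height to density can be sketched as an inverse prediction from a bead calibration line. The bead densities and heights below are hypothetical, and regressing height on density (rather than the reverse) is an assumption made for illustration.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def density_from_height(height, slope, intercept):
    """Invert the calibration line height = intercept + slope * density."""
    return (height - intercept) / slope

# Hypothetical standard density beads: density in g/cm^3, height in mm.
bead_density = [1.00, 1.05, 1.10, 1.15, 1.20]
bead_height = [40.0, 32.5, 25.0, 17.5, 10.0]

slope, intercept = fit_line(bead_density, bead_height)
sample_density = density_from_height(28.0, slope, intercept)
```

A sample levitating outside the range of bead heights would require extrapolation beyond the calibration line, which is where the limited levitation reported for some paint chips becomes a practical constraint.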
Interpersonal Guilt and Substance Use in College Students
Locke, Geoffrey W.; Shilkret, Robert; Everett, Joyce E.; Petry, Nancy M.
2016-01-01
The college years are a time for developing independence and separating from one’s family, and it is also a time in which substance use often escalates. This study examined the relationships between use of substances and interpersonal guilt, an emotion that can arise from feelings about separation, among 1,979 college students. Regular users of alcohol, cigarettes, cannabis, and other illicit drugs were compared with non-regular users of each substance. Sequential linear regression, controlling for confounding variables, examined relationships between regular use of each substance and scores on a guilt index. Risky drinkers and daily smokers had significantly more interpersonal guilt than their peers who did not regularly use these substances. In contrast, regular cannabis users had significantly less guilt than non-regular cannabis users. These data suggest that substance use among college students may be related to interpersonal guilt and family separation issues, and this relationship may vary across substances. PMID:24579980
The impact of patient autonomy on older adults with asthma.
Karamched, Keerthi R; Hao, Wei; Song, Peter X; Carpenter, Laurie; Steinberg, Joel; Baptist, Alan P
2018-05-03
Understanding patient preferences and desire for involvement in making medical decisions is important when managing chronic conditions. Previous studies have utilized the Autonomy Preference Index (API) in younger asthmatic patients to evaluate these preferences. To identify factors associated with autonomy, and to determine if autonomy is related to asthma outcomes among older adults. 189 older adults (>55 yr) with persistent asthma were included. Preferences for autonomy were assessed using the API, with a higher score indicating higher desire for autonomy. Scores were separated into two domains of 'information seeking' and 'decision making' preferences. The separated scores were correlated with asthma outcomes and demographic variables. To control for confounding factors, a linear regression analysis was performed. Higher 'decision making' preference scores correlated with female gender (p=0.007), higher education level (p=0.01), and lower depression scores (p=0.04). Regarding outcomes, 'decision making' scores positively correlated with asthma quality of life questionnaire (AQLQ) scores (p=0.01). On linear regression analysis, the AQLQ score remained significantly associated with 'decision making' preference scores (p=0.03). There was no association with asthma control test scores, spirometry values, and healthcare utilization. 'Information seeking' preference scores correlated with education level (p=0.03), but there was no correlation with asthma outcomes. Older asthmatic adults with a greater desire for involvement in decision making have a higher asthma related quality of life. Future studies with the intention to increase patient autonomy may help establish a causal relationship. Copyright © 2018. Published by Elsevier Inc.
Multi-sensory landscape assessment: the contribution of acoustic perception to landscape evaluation.
Gan, Yonghong; Luo, Tao; Breitung, Werner; Kang, Jian; Zhang, Tianhai
2014-12-01
In this paper, the contribution of visual and acoustic preference to multi-sensory landscape evaluation was quantitatively compared. The real landscapes were treated as dual-sensory ambiance and separated into visual landscape and soundscape. Both were evaluated by 63 respondents in laboratory conditions. The analysis of the relationship between respondents' visual and acoustic preference as well as their respective contribution to landscape preference showed that (1) some common attributes are universally identified in assessing visual, aural and audio-visual preference, such as naturalness or degree of human disturbance; (2) with acoustic and visual preferences as variables, a multi-variate linear regression model can satisfactorily predict landscape preference (R² = 0.740), while the coefficients of determination for a unitary linear regression model were 0.345 and 0.720 for visual and acoustic preference as predicting factors, respectively; (3) acoustic preference played a much more important role in landscape evaluation than visual preference in this study (the former is about 4.5 times the latter), which strongly suggests a rethinking of the role of soundscape in environment perception research and landscape planning practice.
The Application of the Cumulative Logistic Regression Model to Automated Essay Scoring
ERIC Educational Resources Information Center
Haberman, Shelby J.; Sinharay, Sandip
2010-01-01
Most automated essay scoring programs use a linear regression model to predict an essay score from several essay features. This article applied a cumulative logit model instead of the linear regression model to automated essay scoring. Comparison of the performances of the linear regression model and the cumulative logit model was performed on a…
The effect of clouds on the earth's radiation budget
NASA Technical Reports Server (NTRS)
Ziskin, Daniel; Strobel, Darrell F.
1991-01-01
The radiative fluxes from the Earth Radiation Budget Experiment (ERBE) and the cloud properties from the International Satellite Cloud Climatology Project (ISCCP) over Indonesia for the months of June and July of 1985 and 1986 were analyzed to determine the cloud sensitivity coefficients. The method involved a linear least squares regression between coincident flux and cloud coverage measurements. The calculated slope is identified as the cloud sensitivity. It was found that the correlations between the total cloud fraction and radiation parameters were modest. However, correlations between cloud fraction and IR flux were improved by separating clouds by height. Likewise, correlations between the visible flux and cloud fractions were improved by distinguishing clouds based on optical depth. Correlations between the net fluxes and cloud fractions segregated by either height or optical depth were somewhat improved. When clouds were classified in terms of both their height and optical depth, correlations among all the radiation components were improved. Mean cloud sensitivities based on the regression of radiative fluxes against cloud types separated by height and optical depth are presented. Results are compared to a one-dimensional radiation model with a simple cloud parameterization scheme.
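The definition of cloud sensitivity as a regression slope can be sketched as below. The flux and cloud-fraction values are synthetic and exactly linear, chosen only to make the arithmetic visible; they are not ERBE or ISCCP data.

```python
import math

def sensitivity(cloud_fraction, flux):
    """Least-squares slope (the sensitivity) and Pearson correlation."""
    n = len(cloud_fraction)
    mx = sum(cloud_fraction) / n
    my = sum(flux) / n
    sxx = sum((x - mx) ** 2 for x in cloud_fraction)
    syy = sum((y - my) ** 2 for y in flux)
    sxy = sum((x - mx) * (y - my) for x, y in zip(cloud_fraction, flux))
    return sxy / sxx, sxy / math.sqrt(sxx * syy)

# Synthetic example: flux = 280 - 60 * cloud fraction (W m^-2).
cloud = [0.1, 0.3, 0.5, 0.7, 0.9]
ir_flux = [274.0, 262.0, 250.0, 238.0, 226.0]

slope, r = sensitivity(cloud, ir_flux)
```

Applying the same function separately to cloud fractions grouped by height or optical depth would give one sensitivity per cloud class, in the spirit of the separation described above.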
NASA Astrophysics Data System (ADS)
Gao, Xiangyun; An, Haizhong; Fang, Wei; Huang, Xuan; Li, Huajiao; Zhong, Weiqiong; Ding, Yinghui
2014-07-01
The linear regression parameters between two time series can be different under different lengths of observation period. If we study the whole period by the sliding window of a short period, the change of the linear regression parameters is a process of dynamic transmission over time. We tackle fundamental research that presents a simple and efficient computational scheme: a linear regression patterns transmission algorithm, which transforms linear regression patterns into directed and weighted networks. The linear regression patterns (nodes) are defined by the combination of intervals of the linear regression parameters and the results of the significance testing under different sizes of the sliding window. The transmissions between adjacent patterns are defined as edges, and the weights of the edges are the frequency of the transmissions. The major patterns, the distance, and the medium in the process of the transmission can be captured. The statistical results of weighted out-degree and betweenness centrality are mapped on timelines, which shows the features of the distribution of the results. Many measurements in different areas that involve two related time series variables could take advantage of this algorithm to characterize the dynamic relationships between the time series from a new perspective.
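The idea of turning sliding-window regressions into a directed, weighted network can be sketched as follows. This simplified version defines a pattern only by the binned slope; the significance-test component of the pattern definition is omitted, and the function names, window size, and bin width are illustrative assumptions.

```python
from collections import Counter

def slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

def pattern_network(x, y, window, bin_width):
    """Label each window by its binned slope and count pattern transitions."""
    patterns = []
    for start in range(len(x) - window + 1):
        s = slope(x[start:start + window], y[start:start + window])
        patterns.append(round(s / bin_width))  # node: slope interval index
    # Directed edges between consecutive windows, weighted by frequency.
    edges = Counter(zip(patterns, patterns[1:]))
    return patterns, edges

# Toy series whose slope changes from 1 to 3 partway through.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [0.0, 1.0, 2.0, 3.0, 6.0, 9.0, 12.0, 15.0]

patterns, edges = pattern_network(x, y, window=3, bin_width=1.0)
```

Self-loops such as the edge from a pattern to itself record persistence of a regression regime, while edges between different patterns record the transitions that the abstract calls transmissions.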
NASA Astrophysics Data System (ADS)
Cambra-López, María; Winkel, Albert; Mosquera, Julio; Ogink, Nico W. M.; Aarnink, André J. A.
2015-06-01
The objective of this study was to compare co-located real-time light scattering devices and equivalent gravimetric samplers in poultry and pig houses for PM10 mass concentration, and to develop animal-specific calibration factors for light scattering samplers. These results will contribute to evaluate the comparability of different sampling instruments for PM10 concentrations. Paired DustTrak light scattering device (DustTrak aerosol monitor, TSI, U.S.) and PM10 gravimetric cyclone sampler were used for measuring PM10 mass concentrations during 24 h periods (from noon to noon) inside animal houses. Sampling was conducted in 32 animal houses in the Netherlands, including broilers, broiler breeders, layers in floor and in aviary system, turkeys, piglets, growing-finishing pigs in traditional and low emission housing with dry and liquid feed, and sows in individual and group housing. A total of 119 pairs of 24 h measurements (55 for poultry and 64 for pigs) were recorded and analyzed using linear regression analysis. Deviations between samplers were calculated and discussed. In poultry, cyclone sampler and DustTrak data fitted well to a linear regression, with a regression coefficient equal to 0.41, an intercept of 0.16 mg m-3 and a correlation coefficient of 0.91 (excluding turkeys). Results in turkeys showed a regression coefficient equal to 1.1 (P = 0.49), an intercept of 0.06 mg m-3 (P < 0.0001) and a correlation coefficient of 0.98. In pigs, we found a regression coefficient equal to 0.61, an intercept of 0.05 mg m-3 and a correlation coefficient of 0.84. Measured PM10 concentrations using DustTraks were clearly underestimated (approx. by a factor 2) in both poultry and pig housing systems compared with cyclone pre-separators. Absolute, relative, and random deviations increased with concentration. DustTrak light scattering devices should be self-calibrated to investigate PM10 mass concentrations accurately in animal houses. 
We recommend linear regression equations as animal-specific calibration factors for DustTraks instead of manufacturer calibration factors, especially in heavily dusty environments such as animal houses.
Method and Excel VBA Algorithm for Modeling Master Recession Curve Using Trigonometry Approach.
Posavec, Kristijan; Giacopetti, Marco; Materazzi, Marco; Birk, Steffen
2017-11-01
A new method was developed and implemented into an Excel Visual Basic for Applications (VBA) algorithm utilizing trigonometry laws in an innovative way to overlap recession segments of time series and create master recession curves (MRCs). Based on a trigonometry approach, the algorithm horizontally translates succeeding recession segments of time series, placing their vertex, that is, the highest recorded value of each recession segment, directly onto the appropriate connection line defined by measurement points of a preceding recession segment. The new method and algorithm continue the development of methods and algorithms for the generation of MRC, where the first published method was based on a multiple linear/nonlinear regression model approach (Posavec et al. 2006). The newly developed trigonometry-based method was tested on real case study examples and compared with the previously published multiple linear/nonlinear regression model-based method. The results show that in some cases, that is, for some time series, the trigonometry-based method creates narrower overlaps of the recession segments, resulting in higher coefficients of determination R², while in other cases the multiple linear/nonlinear regression model-based method remains superior. The Excel VBA algorithm for modeling MRC using the trigonometry approach is implemented into a spreadsheet tool (MRCTools v3.0 written by and available from Kristijan Posavec, Zagreb, Croatia) containing the previously published VBA algorithms for MRC generation and separation. All algorithms within the MRCTools v3.0 are open access and available free of charge, supporting the idea of running science on available, open, and free of charge software. © 2017, National Ground Water Association.
Bageshwar, Deepak; Khanvilkar, Vineeta; Kadam, Vilasrao
2011-01-01
A specific, precise and stability indicating high-performance thin-layer chromatographic method for simultaneous estimation of pantoprazole sodium and itopride hydrochloride in pharmaceutical formulations was developed and validated. The method employed TLC aluminium plates precoated with silica gel 60F254 as the stationary phase. The solvent system consisted of methanol:water:ammonium acetate; 4.0:1.0:0.5 (v/v/v). This system was found to give compact and dense spots for both itopride hydrochloride (Rf value of 0.55±0.02) and pantoprazole sodium (Rf value of 0.85±0.04). Densitometric analysis of both drugs was carried out in the reflectance–absorbance mode at 289 nm. The linear regression analysis data for the calibration plots showed a good linear relationship with R2=0.9988±0.0012 in the concentration range of 100–400 ng for pantoprazole sodium. Also, the linear regression analysis data for the calibration plots showed a good linear relationship with R2=0.9990±0.0008 in the concentration range of 200–1200 ng for itopride hydrochloride. The method was validated for specificity, precision, robustness and recovery. Statistical analysis proves that the method is repeatable and selective for the estimation of both the said drugs. As the method could effectively separate the drug from its degradation products, it can be employed as a stability indicating method. PMID:29403710
Bageshwar, Deepak; Khanvilkar, Vineeta; Kadam, Vilasrao
2011-11-01
A specific, precise and stability indicating high-performance thin-layer chromatographic method for simultaneous estimation of pantoprazole sodium and itopride hydrochloride in pharmaceutical formulations was developed and validated. The method employed TLC aluminium plates precoated with silica gel 60F254 as the stationary phase. The solvent system consisted of methanol:water:ammonium acetate; 4.0:1.0:0.5 (v/v/v). This system was found to give compact and dense spots for both itopride hydrochloride (Rf value of 0.55±0.02) and pantoprazole sodium (Rf value of 0.85±0.04). Densitometric analysis of both drugs was carried out in the reflectance-absorbance mode at 289 nm. The linear regression analysis data for the calibration plots showed a good linear relationship with R2=0.9988±0.0012 in the concentration range of 100-400 ng for pantoprazole sodium. Also, the linear regression analysis data for the calibration plots showed a good linear relationship with R2=0.9990±0.0008 in the concentration range of 200-1200 ng for itopride hydrochloride. The method was validated for specificity, precision, robustness and recovery. Statistical analysis proves that the method is repeatable and selective for the estimation of both the said drugs. As the method could effectively separate the drug from its degradation products, it can be employed as a stability indicating method.
Korany, Mohamed A; Gazy, Azza A; Khamis, Essam F; Ragab, Marwa A A; Kamal, Miranda F
2018-06-01
This study outlines two robust regression approaches, namely least median of squares (LMS) and iteratively re-weighted least squares (IRLS) to investigate their application in instrument analysis of nutraceuticals (that is, fluorescence quenching of merbromin reagent upon lipoic acid addition). These robust regression methods were used to calculate calibration data from the fluorescence quenching reaction (∆F and F-ratio) under ideal or non-ideal linearity conditions. For each condition, data were treated using three regression fittings: Ordinary Least Squares (OLS), LMS and IRLS. Assessment of linearity, limits of detection (LOD) and quantitation (LOQ), accuracy and precision were carefully studied for each condition. LMS and IRLS regression line fittings showed significant improvement in correlation coefficients and all regression parameters for both methods and both conditions. In the ideal linearity condition, the intercept and slope changed insignificantly, but a dramatic change was observed for the non-ideal condition and linearity intercept. Under both linearity conditions, LOD and LOQ values after the robust regression line fitting of data were lower than those obtained before data treatment. The results obtained after statistical treatment indicated that the linearity ranges for drug determination could be expanded to lower limits of quantitation by enhancing the regression equation parameters after data treatment. Analysis results for lipoic acid in capsules, using both fluorimetric methods, treated by parametric OLS and after treatment by robust LMS and IRLS were compared for both linearity conditions. Copyright © 2018 John Wiley & Sons, Ltd.
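How iteratively re-weighted least squares tempers an outlying calibration point can be sketched as below. The Huber weight function, the tuning constant 1.345, and the MAD-based scale are assumptions chosen for illustration, since the abstract does not state which weighting scheme was used; the calibration data are invented and LMS is not shown.

```python
from statistics import median

def weighted_line(x, y, w):
    """Weighted least squares for y = intercept + slope * x."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def irls_line(x, y, c=1.345, iters=50):
    """IRLS with Huber weights and a MAD-based residual scale."""
    w = [1.0] * len(x)
    for _ in range(iters):
        slope, intercept = weighted_line(x, y, w)
        r = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
        med = median(r)
        scale = median([abs(ri - med) for ri in r]) / 0.6745
        scale = max(scale, 1e-12)  # guard against a zero scale
        # Full weight for small residuals, down-weighting for large ones.
        w = [1.0 if abs(ri) <= c * scale else c * scale / abs(ri) for ri in r]
    return weighted_line(x, y, w)

# Invented calibration data: y = 2x + 1 with one gross outlier at x = 9.
x = [float(i) for i in range(10)]
y = [2.0 * xi + 1.0 for xi in x]
y[9] += 30.0

ols_slope, ols_intercept = weighted_line(x, y, [1.0] * len(x))
robust_slope, robust_intercept = irls_line(x, y)
```

In this toy case the ordinary least squares slope is pulled well above 2 by the single outlier, while the re-weighted fit returns close to the slope of the remaining points.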
1974-01-01
Digital Image Restoration Under a Regression Model: The Unconstrained, Linear Equality and Inequality Constrained Approaches. January 1974. Nelson Delfino d'Avila Mascarenhas. Image... Report 520. ... a two-dimensional form adequately describes the linear model. A discretization is performed by using quadrature methods. By trans...
Element enrichment factor calculation using grain-size distribution and functional data regression.
Sierra, C; Ordóñez, C; Saavedra, A; Gallego, J R
2015-01-01
In environmental geochemistry studies it is common practice to normalize element concentrations in order to remove the effect of grain size. Linear regression with respect to a particular grain size or conservative element is a widely used method of normalization. In this paper, the utility of functional linear regression, in which the grain-size curve is the independent variable and the concentration of pollutant the dependent variable, is analyzed and applied to detrital sediment. After implementing functional linear regression and classical linear regression models to normalize and calculate enrichment factors, we concluded that the former regression technique has some advantages over the latter. First, functional linear regression directly considers the grain-size distribution of the samples as the explanatory variable. Second, as the regression coefficients are not constant values but functions depending on the grain size, it is easier to comprehend the relationship between grain size and pollutant concentration. Third, regularization can be introduced into the model in order to establish equilibrium between reliability of the data and smoothness of the solutions. Copyright © 2014 Elsevier Ltd. All rights reserved.
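The classical (non-functional) normalization mentioned above can be sketched as a regression of concentration on a grain-size normalizer, followed by a ratio of observed to predicted concentration. Defining the enrichment factor as that ratio, the use of percent fines as the normalizer, and all data values are assumptions made for illustration; the functional regression variant is not shown.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def enrichment_factor(fines, concentration, slope, intercept):
    """Observed concentration divided by the regression-predicted baseline."""
    return concentration / (intercept + slope * fines)

# Hypothetical background samples: percent fines vs concentration (mg/kg).
background_fines = [10.0, 20.0, 30.0, 40.0, 50.0]
background_conc = [10.0, 15.0, 20.0, 25.0, 30.0]

slope, intercept = fit_line(background_fines, background_conc)
ef = enrichment_factor(40.0, 50.0, slope, intercept)
```

A value near 1 would indicate a concentration explained by grain size alone, while the hypothetical sample here carries twice the concentration predicted for its fines content.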
Who Will Win?: Predicting the Presidential Election Using Linear Regression
ERIC Educational Resources Information Center
Lamb, John H.
2007-01-01
This article outlines a linear regression activity that engages learners, uses technology, and fosters cooperation. Students generated least-squares linear regression equations using TI-83 Plus[TM] graphing calculators, Microsoft[C] Excel, and paper-and-pencil calculations using derived normal equations to predict the 2004 presidential election.…
Tøndel, Kristin; Indahl, Ulf G; Gjuvsland, Arne B; Vik, Jon Olav; Hunter, Peter; Omholt, Stig W; Martens, Harald
2011-06-01
Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback loops. 
HC-PLSR is a promising approach for metamodelling in systems biology, especially for highly nonlinear or non-monotone parameter to phenotype maps. The algorithm can be flexibly adjusted to suit the complexity of the dynamic model behaviour, inviting automation in the metamodelling of complex systems.
2011-01-01
Background: Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Results: Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback loops.
Conclusions: HC-PLSR is a promising approach for metamodelling in systems biology, especially for highly nonlinear or non-monotone parameter to phenotype maps. The algorithm can be flexibly adjusted to suit the complexity of the dynamic model behaviour, inviting automation in the metamodelling of complex systems. PMID: 21627852
The microcomputer scientific software series 2: general linear model--regression.
Harold M. Rauscher
1983-01-01
The general linear model regression (GLMR) program provides the microcomputer user with a sophisticated regression analysis capability. The output provides a regression ANOVA table, estimators of the regression model coefficients, their confidence intervals, confidence intervals around the predicted Y-values, residuals for plotting, a check for multicollinearity, a...
Analyzing industrial energy use through ordinary least squares regression models
NASA Astrophysics Data System (ADS)
Golden, Allyson Katherine
Extensive research has been performed using regression analysis and calibrated simulations to create baseline energy consumption models for residential buildings and commercial institutions. However, few attempts have been made to discuss the applicability of these methodologies to establish baseline energy consumption models for industrial manufacturing facilities. In the few studies of industrial facilities, the presented linear change-point and degree-day regression analyses illustrate ideal cases. It follows that there is a need in the established literature to discuss the methodologies and to determine their applicability for establishing baseline energy consumption models of industrial manufacturing facilities. The thesis determines the effectiveness of simple inverse linear statistical regression models when establishing baseline energy consumption models for industrial manufacturing facilities. Ordinary least squares change-point and degree-day regression methods are used to create baseline energy consumption models for nine different case studies of industrial manufacturing facilities located in the southeastern United States. The influence of ambient dry-bulb temperature and production on total facility energy consumption is observed. The energy consumption behavior of industrial manufacturing facilities is only sometimes sufficiently explained by temperature, production, or a combination of the two variables. This thesis also provides methods for generating baseline energy models that are straightforward and accessible to anyone in the industrial manufacturing community. The methods outlined in this thesis may be easily replicated by anyone that possesses basic spreadsheet software and general knowledge of the relationship between energy consumption and weather, production, or other influential variables. 
With the help of simple inverse linear regression models, industrial manufacturing facilities may better understand their energy consumption and production behavior, and identify opportunities for energy and cost savings. This thesis study also utilizes change-point and degree-day baseline energy models to disaggregate facility annual energy consumption into separate industrial end-user categories. The baseline energy model provides a suitable and economical alternative to sub-metering individual manufacturing equipment. One case study describes the conjoined use of baseline energy models and facility information gathered during a one-day onsite visit to perform an end-point energy analysis of an injection molding facility conducted by the Alabama Industrial Assessment Center. Applying baseline regression model results to the end-point energy analysis allowed the AIAC to better approximate the annual energy consumption of the facility's HVAC system.
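The change-point approach described in this abstract can be sketched as follows. This is a minimal illustration, not the thesis's procedure: it assumes a three-parameter cooling model, energy = base + slope × max(T − T_cp, 0), and picks the change-point temperature by a grid search over candidate values. The function names, the candidate grid, and the data are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b, sse)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse


def fit_change_point(temps, energy, candidates):
    """Grid search for the change-point temperature (assumed model form).

    For each candidate T_cp, regress energy on max(T - T_cp, 0) and keep
    the candidate with the smallest sum of squared errors.
    Returns (t_cp, base, slope, sse).
    """
    best = None
    for t_cp in candidates:
        hinge = [max(t - t_cp, 0.0) for t in temps]
        if max(hinge) == min(hinge):
            continue  # regressor is constant, slope is not identifiable
        base, slope, sse = fit_line(hinge, energy)
        if best is None or sse < best[3]:
            best = (t_cp, base, slope, sse)
    return best


# Hypothetical monthly data: mean dry-bulb temperature and facility energy use.
temps = [5, 10, 15, 20, 25, 30, 35]
energy = [100, 100, 100, 110, 135, 160, 185]
t_cp, base, slope, sse = fit_change_point(temps, energy, range(10, 31))
print(t_cp, round(base, 3), round(slope, 3))
```

The grid search keeps the example within reach of basic tools, in the spirit of the spreadsheet-level accessibility the abstract emphasises; a production model would likely also consider heating slopes and production as a second regressor.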
Pistonesi, Marcelo F; Di Nezio, María S; Centurión, María E; Lista, Adriana G; Fragoso, Wallace D; Pontes, Márcio J C; Araújo, Mário C U; Band, Beatriz S Fernández
2010-12-15
In this study, a novel, simple, and efficient spectrofluorimetric method to determine directly and simultaneously five phenolic compounds (hydroquinone, resorcinol, phenol, m-cresol and p-cresol) in air samples is presented. For this purpose, variable selection by the successive projections algorithm (SPA) is used in order to obtain simple multiple linear regression (MLR) models based on a small subset of wavelengths. For comparison, partial least squares (PLS) regression is also employed on the full spectrum. The concentrations of the calibration matrix ranged from 0.02 to 0.2 mg L⁻¹ for hydroquinone, from 0.05 to 0.6 mg L⁻¹ for resorcinol, and from 0.05 to 0.4 mg L⁻¹ for phenol, m-cresol and p-cresol; incidentally, such ranges are in accordance with the Argentinean environmental legislation. To verify the accuracy of the proposed method, a recovery study on real air samples from a smoking environment was carried out with satisfactory results (94-104%). The advantage of the proposed method is that it requires only spectrofluorimetric measurements of samples and chemometric modeling for the simultaneous determination of five phenols. Air is simply sampled and no sample pre-treatment is needed (i.e., separation steps and derivatization reagents are avoided), which saves considerable time. Copyright © 2010 Elsevier B.V. All rights reserved.
Discrimination of serum Raman spectroscopy between normal and colorectal cancer
NASA Astrophysics Data System (ADS)
Li, Xiaozhou; Yang, Tianyue; Yu, Ting; Li, Siqi
2011-07-01
Raman spectroscopy of tissues has been widely studied for the diagnosis of various cancers, but biofluids were seldom used as the analyte because of the low concentration. Herein, Raman spectra of serum from 30 normal people, 46 colon cancer patients, and 44 rectum cancer patients were measured and analyzed. The information of Raman peaks (intensity and width) and that of the fluorescence background (baseline function coefficients) were selected as parameters for statistical analysis. Principal component regression (PCR) and partial least squares regression (PLSR) were applied to the selected parameters separately to assess the performance of the parameters. PCR performed better than PLSR on our spectral data. Linear discriminant analysis (LDA) was then applied to the principal components (PCs) from the two regression methods on the selected parameters, and diagnostic accuracies of 88% and 83% were obtained. The conclusion is that the selected features preserve the information of the original spectra well and that Raman spectroscopy of serum has potential for the diagnosis of colorectal cancer.
Wang, D Z; Wang, C; Shen, C F; Zhang, Y; Zhang, H; Song, G D; Xue, X D; Xu, Z L; Zhang, S; Jiang, G H
2017-05-10
We described the time trend of the acute myocardial infarction (AMI) incidence rate in Tianjin from 1999 to 2013 using the Cochran-Armitage trend (CAT) test and linear regression analysis, and compared the results. Based on the actual population, the CAT test had much stronger statistical power than linear regression analysis for both the overall incidence trend and the age-specific incidence trend (Cochran-Armitage trend P value
Yang, Ruiqi; Wang, Fei; Zhang, Jialing; Zhu, Chonglei; Fan, Limei
2015-05-19
To establish reference values for the diameters of the thalamus, caudate nucleus and lenticular nucleus measured through the fetal thalamic transverse section. A total of 265 fetuses at our hospital were randomly selected from November 2012 to August 2014, and the transverse and length diameters of the thalamus, caudate nucleus and lenticular nucleus were measured. SPSS 19.0 statistical software was used to calculate the regression curve of fetal diameter against gestational week. P < 0.05 was considered statistically significant. The linear regression equations against gestational week (X) were: thalamic length diameter, Y = 0.051X + 0.201, R = 0.876; thalamic transverse diameter, Y = 0.031X + 0.229, R = 0.817; head of caudate nucleus length diameter, Y = 0.033X + 0.101, R = 0.722; head of caudate nucleus transverse diameter, Y = 0.025X - 0.046, R = 0.711; lentiform nucleus length diameter, Y = 0.046X + 0.229, R = 0.765; lentiform nucleus diameter, Y = 0.025X - 0.05, R = 0.772. Ultrasonic measurement of the diameters of the fetal thalamus, caudate nucleus, and lenticular nucleus through the thalamic transverse section is simple and convenient. The measurements increase with gestational week, and there is a linear regression relationship between them.
Local Linear Regression for Data with AR Errors.
Li, Runze; Li, Yan
2009-07-01
In many statistical applications, data are collected over time, and they are likely correlated. In this paper, we investigate how to incorporate the correlation information into local linear regression. Under the assumption that the error process is an auto-regressive process, a new estimation procedure is proposed for nonparametric regression using the local linear regression method and profile least squares techniques. We further propose the SCAD penalized profile least squares method to determine the order of the auto-regressive process. Extensive Monte Carlo simulation studies are conducted to examine the finite sample performance of the proposed procedures and to compare their performance with that of the existing one. In our empirical studies, the newly proposed procedures dramatically improve on the accuracy of naive local linear regression with a working-independent error structure. We illustrate the proposed methodology with an analysis of a real data set.
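For orientation, the naive baseline mentioned in this abstract, local linear regression under a working-independence error structure, can be sketched as below. This is not the authors' profile least squares procedure and it ignores the auto-regressive errors entirely. The Gaussian kernel, the bandwidth `h`, and the function name are assumptions made for illustration.

```python
import math


def local_linear(x0, xs, ys, h):
    """Naive local linear estimate of the regression function at x0.

    Fits a weighted least squares line in (x - x0) with Gaussian kernel
    weights of bandwidth h and returns the fitted intercept, which is the
    estimate at x0. Errors are treated as independent (working independence).
    """
    weights = [math.exp(-0.5 * ((x - x0) / h) ** 2) for x in xs]
    s0 = sum(weights)
    s1 = sum(w * (x - x0) for w, x in zip(weights, xs))
    s2 = sum(w * (x - x0) ** 2 for w, x in zip(weights, xs))
    t0 = sum(w * y for w, y in zip(weights, ys))
    t1 = sum(w * (x - x0) * y for w, x, y in zip(weights, xs, ys))
    # Solve the 2x2 weighted normal equations for the intercept.
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 ** 2)


# Hypothetical data: a smooth curve sampled on a regular grid.
xs = [i / 10 for i in range(0, 51)]
ys = [math.sin(x) for x in xs]
print(round(local_linear(2.0, xs, ys, h=0.3), 4))
```

A useful sanity check is that the estimator reproduces exactly linear data regardless of bandwidth, since a straight line is fitted locally.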
Orthogonal Regression: A Teaching Perspective
ERIC Educational Resources Information Center
Carr, James R.
2012-01-01
A well-known approach to linear least squares regression is that which involves minimizing the sum of squared orthogonal projections of data points onto the best fit line. This form of regression is known as orthogonal regression, and the linear model that it yields is known as the major axis. A similar method, reduced major axis regression, is…
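A minimal sketch of the two fits named in this abstract, assuming the standard closed-form slopes: the major axis (orthogonal regression) slope from the sums of squares and cross-products, and the reduced major axis slope as the signed ratio of standard deviations. The function names and data are hypothetical, and the code assumes the cross-product sum is nonzero.

```python
import math


def _centered_sums(xs, ys):
    """Return means and centered sums of squares and cross-products."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return mean_x, mean_y, sxx, syy, sxy


def major_axis(xs, ys):
    """Orthogonal regression: minimises squared perpendicular distances."""
    mean_x, mean_y, sxx, syy, sxy = _centered_sums(xs, ys)
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return slope, mean_y - slope * mean_x


def reduced_major_axis(xs, ys):
    """Reduced major axis: slope is sign(sxy) * sqrt(syy / sxx)."""
    mean_x, mean_y, sxx, syy, sxy = _centered_sums(xs, ys)
    slope = math.copysign(math.sqrt(syy / sxx), sxy)
    return slope, mean_y - slope * mean_x


# Hypothetical noisy measurements.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.9, 5.2, 6.8, 9.1, 11.0]
print(major_axis(xs, ys))
print(reduced_major_axis(xs, ys))
```

Both lines pass through the centroid of the data, and both coincide with the ordinary least squares line when the points are exactly collinear.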
Practical Session: Simple Linear Regression
NASA Astrophysics Data System (ADS)
Clausel, M.; Grégoire, G.
2014-12-01
Two exercises are proposed to illustrate simple linear regression. The first one is based on Galton's famous data set on heredity. We use the lm R command and get coefficient estimates, standard error of the error, R², residuals… In the second example, devoted to data related to the vapor tension of mercury, we fit a simple linear regression, predict values, and anticipate multiple linear regression. This practical session is an excerpt from practical exercises proposed by A. Dalalyan at ENPC (see Exercises 1 and 2 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_4.pdf).
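The quantities listed in this abstract (coefficient estimates, residual standard error, R², residuals) can be computed by hand as a rough counterpart to what R's lm reports. The sketch below uses only the Python standard library; the function name and the small data set are hypothetical and are not Galton's data.

```python
import math


def simple_linear_regression(xs, ys):
    """Least squares fit of y = intercept + slope * x with basic diagnostics."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    sse = sum(r ** 2 for r in residuals)
    sst = sum((y - mean_y) ** 2 for y in ys)
    return {
        "slope": slope,
        "intercept": intercept,
        "residuals": residuals,
        "r_squared": 1 - sse / sst,
        # Residual standard error uses n - 2 degrees of freedom.
        "residual_se": math.sqrt(sse / (n - 2)),
    }


# Hypothetical data for illustration only.
fit = simple_linear_regression([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
print(round(fit["slope"], 3), round(fit["intercept"], 3), round(fit["r_squared"], 4))
```

Predictions for new inputs follow directly as `fit["intercept"] + fit["slope"] * x_new`.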
NASA Astrophysics Data System (ADS)
Venedikov, A. P.; Arnoso, J.; Cai, W.; Vieira, R.; Tan, S.; Velez, E. J.
2006-01-01
A 12-year series (1992-2004) of strain measurements recorded in the Geodynamics Laboratory of Lanzarote is investigated. Through a tidal analysis, the non-tidal component of the data is separated in order to use it for studying signals useful for monitoring the volcanic activity on the island. This component contains various perturbations of meteorological and oceanic origin, which should be eliminated in order to make the useful signals discernible. The paper is devoted to the estimation and elimination of the effect of the air temperature inside the station, which strongly dominates the strainmeter data. To solve this task, a regression model is applied that includes a linear relation with the temperature and time-dependent polynomials. A set of parameters enters the regression nonlinearly; these are estimated by a properly applied Bayesian approach. The results obtained are: the regression coefficient of the strain data on temperature, equal to (-367.4 ± 0.8) × 10⁻⁹ °C⁻¹; the curve of the non-tidal component reduced by the effect of the temperature; and a polynomial approximation of the reduced curve. The technique used here can be helpful to investigators in the domain of earthquake and volcano monitoring. However, the fundamental and extremely difficult problem of what kind of signals in the reduced curves might be useful in this field is not considered here.
Morse Code, Scrabble, and the Alphabet
ERIC Educational Resources Information Center
Richardson, Mary; Gabrosek, John; Reischman, Diann; Curtiss, Phyliss
2004-01-01
In this paper we describe an interactive activity that illustrates simple linear regression. Students collect data and analyze it using simple linear regression techniques taught in an introductory applied statistics course. The activity is extended to illustrate checks for regression assumptions and regression diagnostics taught in an…
Does linear separability really matter? Complex visual search is explained by simple search
Vighneshvel, T.; Arun, S. P.
2013-01-01
Visual search in real life involves complex displays with a target among multiple types of distracters, but in the laboratory, it is often tested using simple displays with identical distracters. Can complex search be understood in terms of simple searches? This link may not be straightforward if complex search has emergent properties. One such property is linear separability, whereby search is hard when a target cannot be separated from its distracters using a single linear boundary. However, evidence in favor of linear separability is based on testing stimulus configurations in an external parametric space that need not be related to their true perceptual representation. We therefore set out to assess whether linear separability influences complex search at all. Our null hypothesis was that complex search performance depends only on classical factors such as target-distracter similarity and distracter homogeneity, which we measured using simple searches. Across three experiments involving a variety of artificial and natural objects, differences between linearly separable and nonseparable searches were explained using target-distracter similarity and distracter heterogeneity. Further, simple searches accurately predicted complex search regardless of linear separability (r = 0.91). Our results show that complex search is explained by simple search, refuting the widely held belief that linear separability influences visual search. PMID:24029822
Yousefi, Siamak; Balasubramanian, Madhusudhanan; Goldbaum, Michael H; Medeiros, Felipe A; Zangwill, Linda M; Weinreb, Robert N; Liebmann, Jeffrey M; Girkin, Christopher A; Bowd, Christopher
2016-05-01
To validate Gaussian mixture-model with expectation maximization (GEM) and variational Bayesian independent component analysis mixture-models (VIM) for detecting glaucomatous progression along visual field (VF) defect patterns (GEM-progression of patterns (POP) and VIM-POP). To compare GEM-POP and VIM-POP with other methods. GEM and VIM models separated cross-sectional abnormal VFs from 859 eyes and normal VFs from 1117 eyes into abnormal and normal clusters. Clusters were decomposed into independent axes. The confidence limit (CL) of stability was established for each axis with a set of 84 stable eyes. Sensitivity for detecting progression was assessed in a sample of 83 eyes with known progressive glaucomatous optic neuropathy (PGON). Eyes were classified as progressed if any defect pattern progressed beyond the CL of stability. Performance of GEM-POP and VIM-POP was compared to point-wise linear regression (PLR), permutation analysis of PLR (PoPLR), and linear regression (LR) of mean deviation (MD), and visual field index (VFI). Sensitivity and specificity for detecting glaucomatous VFs were 89.9% and 93.8%, respectively, for GEM and 93.0% and 97.0%, respectively, for VIM. Receiver operating characteristic (ROC) curve areas for classifying progressed eyes were 0.82 for VIM-POP, 0.86 for GEM-POP, 0.81 for PoPLR, 0.69 for LR of MD, and 0.76 for LR of VFI. GEM-POP was significantly more sensitive to PGON than PoPLR and linear regression of MD and VFI in our sample, while providing localized progression information. Detection of glaucomatous progression can be improved by assessing longitudinal changes in localized patterns of glaucomatous defect identified by unsupervised machine learning.
Jaime-Pérez, José Carlos; Jiménez-Castillo, Raúl Alberto; Vázquez-Hernández, Karina Elizabeth; Salazar-Riojas, Rosario; Méndez-Ramírez, Nereida; Gómez-Almaguer, David
2017-10-01
Advances in automated cell separators have improved the efficiency of plateletpheresis and the possibility of obtaining double products (DP). We assessed cell processor accuracy of predicted platelet (PLT) yields with the goal of a better prediction of DP collections. This retrospective proof-of-concept study included 302 plateletpheresis procedures performed on a Trima Accel v6.0 at the apheresis unit of a hematology department. Donor variables, software predicted yield and actual PLT yield were statistically evaluated. Software prediction was optimized by linear regression analysis and its optimal cut-off to obtain a DP assessed by receiver operating characteristic curve (ROC) modeling. Three hundred and two plateletpheresis procedures were performed; in 271 (89.7%) occasions, donors were men and in 31 (10.3%) women. Pre-donation PLT count had the best direct correlation with actual PLT yield (r = 0.486, P < .001). Means of software machine-derived values differed significantly from actual PLT yield, 4.72 × 10¹¹ vs. 6.12 × 10¹¹, respectively (P < .001). The following equation was developed to adjust these values: actual PLT yield = 0.221 + (1.254 × theoretical platelet yield). ROC curve model showed an optimal apheresis device software prediction cut-off of 4.65 × 10¹¹ to obtain a DP, with a sensitivity of 82.2%, specificity of 93.3%, and an area under the curve (AUC) of 0.909. Trima Accel v6.0 software consistently underestimated PLT yields. Simple correction derived from linear regression analysis accurately corrected this underestimation and ROC analysis identified a precise cut-off to reliably predict a DP. © 2016 Wiley Periodicals, Inc.
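As a worked illustration of the correction equation and cut-off reported in this abstract, the sketch below applies them to the reported mean software prediction. Yields are taken in units of 10¹¹ platelets; the function names are hypothetical, and treating a prediction equal to the cut-off as positive is an assumption, since the abstract does not say whether the threshold is inclusive.

```python
def corrected_plt_yield(predicted_yield):
    """Apply the reported adjustment: actual = 0.221 + 1.254 * predicted.

    Both values are in units of 10^11 platelets.
    """
    return 0.221 + 1.254 * predicted_yield


def predicts_double_product(predicted_yield, cutoff=4.65):
    """Flag a likely double product using the reported ROC cut-off.

    The inclusive comparison (>=) is an assumption for illustration.
    """
    return predicted_yield >= cutoff


mean_prediction = 4.72  # reported mean software-predicted yield
print(round(corrected_plt_yield(mean_prediction), 2))
print(predicts_double_product(mean_prediction))
```

Applied to the mean prediction of 4.72, the equation gives roughly 6.14, close to the reported mean actual yield of 6.12, which is consistent with the abstract's statement that the software underestimated yields.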
Da Costa, M J; Colson, G; Frost, T J; Halley, J; Pesti, G M
2017-07-01
The objective of this analysis was to evaluate the effects of raising broilers under sex separate and straight-run conditions for 2 broiler genetic lines. One-day-old Ross 308 and Ross 708 chicks (n = 1,344) were sex separated and placed in 48 pens according to rearing type: sex separate (28 males or 28 females) or straight-run (14 males + 14 females). There were 3 dietary phases: starter (zero to 17 d), grower (17 to 32 d), and finisher (32 to 48 d). Bird individual BW and group feed intakes were measured at 12, 17, 25, 32, 42, and 48 d to evaluate performance. At 33, 43, and 49 d, 4 birds per pen (straight-run pens: 2 males + 2 females) were sampled for carcass yield evaluation. Data were analyzed using linear and non-linear regression in order to estimate feed intake and cut-up weights at 3 separate market weights (1,700, 2,700, and 3,700 g). Returns over feed cost were estimated for a 1.8 million broiler complex for each rearing system and under 9 feed/meat price scenarios. Overall, rearing birds that were sex separated resulted in extra income that ranged from $48,824 to $330,300 per week, depending on the market targeted and feed and meat price scenarios. Sex separation was shown to be especially important in disadvantageous scenarios in which feed prices were high. Gains from sex separation were markedly higher for the Ross 708 than for the Ross 308 broilers. Bird variability also was evaluated at the 3 separate market ages under narrow ranges of BW that were targeted. Straight-run birds decreased the number of birds present in the desired range. Depending on market weight, straight-run rearing resulted in 9.1 to 16.6% fewer birds than sex separate rearing to meet marketing goals. It was concluded that sex separation can result in increased company profitability and have possible beneficial effects at the processing plant due to increased bird uniformity. © 2017 Poultry Science Association Inc.
Zhang, Yu; Yu, Haixia; Wu, Yujiao; Zhao, Wenyan; Yang, Min; Jing, Huanwang; Chen, Anjia
2014-10-01
In this paper, a new capillary electrophoresis (CE) separation and detection method was developed for the chiral separation of the four major Cinchona alkaloids (quinine/quinidine and cinchonine/cinchonidine) using hydroxypropyl-β-cyclodextrin (HP-β-CD) and chiral ionic liquid ([TBA][L-ASP]) as selectors. Separation parameters such as buffer concentrations, pH, HP-β-CD and chiral ionic liquid concentrations, capillary temperature, and separation voltage were investigated. After optimization of separation conditions, baseline separation of the three analytes (cinchonidine, quinine, cinchonine) was achieved in fewer than 7 min in ammonium acetate background electrolyte (pH 5.0) with the addition of HP-β-CD in a concentration of 40 mM and [TBA][L-ASP] of 14 mM, while the baseline separation of cinchonine and quinidine was not obtained. Therefore, the first-order derivative electropherogram was applied for resolving overlapping peaks. Regression equations revealed a good linear relationship between peak areas in first-order derivative electropherograms and concentrations of the two diastereomer pairs. The results not only indicated that the first-order derivative electropherogram was effective in determination of a low content component and of those not fully separated from adjacent ones, but also showed that the ionic liquid appeared to be a very promising chiral selector in CE. Copyright © 2014 Elsevier Inc. All rights reserved.
Advanced statistics: linear regression, part II: multiple linear regression.
Marill, Keith A
2004-01-01
The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.
NASA Astrophysics Data System (ADS)
Kang, Pilsang; Koo, Changhoi; Roh, Hokyu
2017-11-01
Since simple linear regression theory was established at the beginning of the 1900s, it has been used in a variety of fields. Unfortunately, it cannot be used directly for calibration. In practical calibrations, the observed measurements (the inputs) are subject to errors, and hence they vary, thus violating the assumption that the inputs are fixed. Therefore, in the case of calibration, the regression line fitted using the method of least squares is not consistent with the statistical properties of simple linear regression as already established based on this assumption. To resolve this problem, "classical regression" and "inverse regression" have been proposed. However, they do not completely resolve the problem. As a fundamental solution, we introduce "reversed inverse regression" along with a new methodology for deriving its statistical properties. In this study, the statistical properties of this regression are derived using the "error propagation rule" and the "method of simultaneous error equations" and are compared with those of the existing regression approaches. The accuracy of the statistical properties thus derived is investigated in a simulation study. We conclude that the newly proposed regression and methodology constitute the complete regression approach for univariate linear calibrations.
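To make the two existing approaches named in this abstract concrete, the sketch below contrasts "classical" calibration (fit the measurement on the standard, then invert the line) with "inverse" calibration (fit the standard on the measurement and predict directly). It does not implement the authors' "reversed inverse regression", whose details are not given here. Function names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def classical_estimate(standards, readings, new_reading):
    """Classical calibration: regress readings on standards, then invert."""
    a, b = fit_line(standards, readings)
    return (new_reading - a) / b


def inverse_estimate(standards, readings, new_reading):
    """Inverse calibration: regress standards on readings, predict directly."""
    c, d = fit_line(readings, standards)
    return c + d * new_reading


# Hypothetical calibration data: known standards and noisy instrument readings.
standards = [1.0, 2.0, 3.0, 4.0, 5.0]
readings = [3.1, 4.9, 7.2, 8.8, 11.1]
print(round(classical_estimate(standards, readings, 7.0), 3))
print(round(inverse_estimate(standards, readings, 7.0), 3))
```

With noisy data the two estimates generally differ, which is the discrepancy the abstract sets out to resolve; with exactly collinear data they agree.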
A comparison of methods for the analysis of binomial clustered outcomes in behavioral research.
Ferrari, Alberto; Comelli, Mario
2016-12-01
In behavioral research, data consisting of a per-subject proportion of "successes" and "failures" over a finite number of trials often arise. These clustered binary data are usually non-normally distributed, which can distort inference if the usual general linear model is applied and sample size is small. A number of more advanced methods are available, but they are often technically challenging, and a comparative assessment of their performance in behavioral setups has not been performed. We studied the performance of some methods applicable to the analysis of proportions, namely linear regression, Poisson regression, beta-binomial regression and Generalized Linear Mixed Models (GLMMs). We report on a simulation study evaluating power and Type I error rate of these models in hypothetical scenarios met by behavioral researchers; in addition, we describe results from the application of these methods to data from real experiments. Our results show that, while GLMMs are powerful instruments for the analysis of clustered binary outcomes, beta-binomial regression can outperform them in a range of scenarios. Linear regression gave results consistent with the nominal level of significance, but was overall less powerful. Poisson regression, instead, mostly led to anticonservative inference. GLMMs and beta-binomial regression are generally more powerful than linear regression; yet linear regression is robust to model misspecification in some conditions, whereas Poisson regression suffers heavily from violations of the assumptions when used to model proportion data. We conclude by providing directions to behavioral scientists dealing with clustered binary data and small sample sizes. Copyright © 2016 Elsevier B.V. All rights reserved.
Zhou, Qing-he; Zhu, Bo; Wei, Chang-na; Yan, Min
2016-03-24
Studies have shown that abdominal girth and vertebral column length have high predictive value for spinal spread after administering a dose of plain bupivacaine. We designed a study to identify the specific correlations between abdominal girth, vertebral column length and the dose of 0.5% plain bupivacaine that provides a minimum upper block level (T12) and a suitable upper block level (T10) for lower limb surgeries. A suitable dose of 0.5% plain bupivacaine was administered intrathecally between the L3 and L4 vertebrae for lower limb surgeries. If the upper cephalad spread of the patient by loss of pinprick discrimination was T12 or T10, the patient was enrolled in this study. Five patient variables and intrathecal plain bupivacaine dose were recorded. Linear regression and multiple regression analyses were performed. Totals of 111 patients and 121 patients who lost pinprick discrimination at T12 and T10, respectively, were analyzed in this study. Linear regression analysis showed that only abdominal girth and plain bupivacaine dose were strongly correlated (r = -0.827 for T12, r = -0.806 for T10; both p < 0.0001). Multiple linear regression analysis showed that both abdominal girth and vertebral column length were the key determinants of plain bupivacaine dose (both p < 0.0001). R² was 0.874 and 0.860 for the loss of pinprick discrimination at T12 and T10, respectively. Our data indicated that vertebral column length and abdominal girth were strongly correlated with the dosage of intrathecal plain bupivacaine for the loss of pinprick discrimination at T12 and T10. The two regression equations were YT12 = 3.547 + 0.045X1 - 0.044X2 and YT10 = 3.848 + 0.047X1 - 0.046X2 (Y, 0.5% plain bupivacaine volume; X1, vertebral column length; and X2, abdominal girth), which can, to a great extent, accurately predict the minimum and the suitable intrathecal bupivacaine dose for lower limb surgery, respectively.
Vajargah, Kianoush Fathi; Sadeghi-Bazargani, Homayoun; Mehdizadeh-Esfanjani, Robab; Savadi-Oskouei, Daryoush; Farhoudi, Mehdi
2012-01-01
The objective of the present study was to assess the comparable applicability of the orthogonal projections to latent structures (OPLS) statistical model vs traditional linear regression in order to investigate the role of transcranial Doppler (TCD) sonography in predicting ischemic stroke prognosis. The study was conducted on 116 ischemic stroke patients admitted to a specialty neurology ward. The Unified Neurological Stroke Scale was used once for clinical evaluation in the first week of admission and again six months later. All data were primarily analyzed using simple linear regression and later considered for multivariate analysis using PLS/OPLS models through the SIMCA P+12 statistical software package. The linear regression analysis results used for the identification of TCD predictors of stroke prognosis were confirmed through the OPLS modeling technique. Moreover, in comparison to linear regression, the OPLS model appeared to have higher sensitivity in detecting the predictors of ischemic stroke prognosis and detected several more predictors. Applying the OPLS model made it possible to use both single TCD measures/indicators and arbitrarily dichotomized measures of TCD single vessel involvement as well as the overall TCD result. In conclusion, the authors recommend PLS/OPLS methods as complementary rather than alternative to the available classical regression models such as linear regression.
Quality of life in breast cancer patients--a quantile regression analysis.
Pourhoseingholi, Mohamad Amin; Safaee, Azadeh; Moghimi-Dehkordi, Bijan; Zeighami, Bahram; Faghihzadeh, Soghrat; Tabatabaee, Hamid Reza; Pourhoseingholi, Asma
2008-01-01
Quality of life study has an important role in health care, especially in chronic diseases, in clinical judgment and in the supply of medical resources. Statistical tools like linear regression are widely used to assess the predictors of quality of life, but when the response is not normal the results are misleading. The aim of this study is to determine the predictors of quality of life in breast cancer patients using a quantile regression model and to compare the results with linear regression. A cross-sectional study was conducted on 119 breast cancer patients who were admitted and treated in the chemotherapy ward of Namazi hospital in Shiraz. We used the QLQ-C30 questionnaire to assess quality of life in these patients. A quantile regression was employed to assess the associated factors and the results were compared to linear regression. All analyses were carried out using SAS. The mean score for the global health status for breast cancer patients was 64.92 ± 11.42. Linear regression showed that only grade of tumor, occupational status, menopausal status, financial difficulties and dyspnea were statistically significant. In contrast to linear regression, financial difficulties were not significant in the quantile regression analysis and dyspnea was significant only for the first quartile. Also, emotional functioning and duration of disease statistically predicted the QOL score in the third quartile. The results demonstrate that using quantile regression leads to better interpretation and richer inference about predictors of breast cancer patients' quality of life.
Interpretation of commonly used statistical regression models.
Kasza, Jessica; Wolfe, Rory
2014-01-01
A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.
Bushon, R.N.; Brady, A.M.; Likirdopulos, C.A.; Cireddu, J.V.
2009-01-01
Aims: The aim of this study was to examine a rapid method for detecting Escherichia coli and enterococci in recreational water. Methods and Results: Water samples were assayed for E. coli and enterococci by traditional and immunomagnetic separation/adenosine triphosphate (IMS/ATP) methods. Three sample treatments were evaluated for the IMS/ATP method: double filtration, single filtration, and direct analysis. Pearson's correlation analysis showed strong, significant, linear relations between IMS/ATP and traditional methods for all sample treatments; strongest linear correlations were with the direct analysis (r = 0.62 and 0.77 for E. coli and enterococci, respectively). Additionally, simple linear regression was used to estimate bacteria concentrations as a function of IMS/ATP results. The correct classification of water-quality criteria was 67% for E. coli and 80% for enterococci. Conclusions: The IMS/ATP method is a viable alternative to traditional methods for faecal-indicator bacteria. Significance and Impact of the Study: The IMS/ATP method addresses critical public health needs for the rapid detection of faecal-indicator contamination and has potential for satisfying US legislative mandates requiring methods to detect bathing water contamination in 2 h or less. Moreover, IMS/ATP equipment is considerably less costly and more portable than that for molecular methods, making the method suitable for field applications. © 2009 The Authors.
Du, Hongying; Wang, Jie; Yao, Xiaojun; Hu, Zhide
2009-01-01
The heuristic method (HM) and support vector machine (SVM) were used to construct quantitative structure-retention relationship models from a series of compounds to predict the gradient retention times of reversed-phase high-performance liquid chromatography (HPLC) in three different columns. The aims of this investigation were to predict the retention times of multifarious compounds, to find the main properties of the three columns, and to indicate the theory of separation procedures. In our method, we correlated the retention times of many structurally diverse analytes in three columns (Symmetry C18, Chromolith, and SG-MIX) with their representative molecular descriptors, calculated from the molecular structures alone. HM was used to select the most important molecular descriptors and build linear regression models. Furthermore, non-linear regression models were built using the SVM method; the performance of the SVM models was better than that of the HM models, and the prediction results were in good agreement with the experimental values. This paper could give some insights into the factors that were likely to govern the gradient retention process of the three investigated HPLC columns, which could theoretically guide the practical experiment.
Gouvêa, Ana Cristina M S; Melo, Armindo; Santiago, Manuela C P A; Peixoto, Fernanda M; Freitas, Vitor; Godoy, Ronoel L O; Ferreira, Isabel M P L V O
2015-10-15
Neomitranthes obscura (DC.) N. Silveira is a Brazilian fruit belonging to the Myrtaceae family that contains anthocyanins in the peel and was studied for the first time in this work. Delphinidin-3-O-galactoside, delphinidin-3-O-glucoside, cyanidin-3-O-galactoside, cyanidin-3-O-glucoside, cyanidin-3-O-arabinoside, petunidin-3-O-glucoside, pelargonidin-3-O-glucoside, peonidin-3-O-galactoside, peonidin-3-O-glucoside and cyanidin-3-O-xyloside were separated and identified by LC/DAD/MS and by co-elution with standards. Reliable quantification of anthocyanins in the mature fruits was performed by HPLC/DAD using a weighted linear regression model from 0.05 to 50 mg of cyanidin-3-O-glucoside L(-1) because it gave better fit quality than least squares linear regression. Good precision and accuracy were obtained. The total anthocyanin content of mature fruits was 263.6 ± 8.2 mg of cyanidin-3-O-glucoside equivalents 100 g(-1) fresh weight, which was in the same range found in the literature for anthocyanin-rich fruits. Copyright © 2015. Published by Elsevier Ltd.
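The weighted calibration fit mentioned in this abstract can be sketched as follows. The abstract does not state which weighting scheme was used, so the 1/x² weights below are an assumption, and the function name and calibration points are hypothetical, not data from the study.

```python
def weighted_line_fit(x, y, w):
    """Weighted least squares for y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / sw
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y)
              for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Made-up calibration points spanning 0.05-50 mg/L (not data from the study).
conc = [0.05, 0.5, 5.0, 50.0]
signal = [1.0 + 2.0 * c for c in conc]
weights = [1.0 / c ** 2 for c in conc]  # assumed 1/x^2 weighting
intercept, slope = weighted_line_fit(conc, signal, weights)
```

Weights of this kind give low-concentration standards more influence, which is one common reason a weighted fit is preferred when the calibration range spans several orders of magnitude.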
Use of a tracing task to assess visuomotor performance for evidence of concussion and recuperation.
Kelty-Stephen, Damian G; Qureshi Ahmad, Mona; Stirling, Leia
2015-12-01
The likelihood of suffering a concussion while playing a contact sport ranges from 15-45% per year of play. These rates are highly variable as athletes seldom report concussive symptoms, or do not recognize their symptoms. We performed a prospective cohort study (n = 206, aged 10-17) to examine visuomotor tracing to determine the sensitivity for detecting neuromotor components of concussion. Tracing variability measures were investigated for a mean shift with presentation of concussion-related symptoms and a linear return toward baseline over subsequent return visits. Furthermore, previous research relating brain injury to the dissociation of smooth movements into "submovements" led to the expectation that cumulative micropause duration, a measure of motion continuity, might detect likelihood of injury. Separate linear mixed effects regressions of tracing measures indicated that 4 of the 5 tracing measures captured both short-term effects of injury and longer-term effects of recovery with subsequent visits. Cumulative micropause duration has a positive relationship with likelihood of participants having had a concussion. The present results suggest that future research should evaluate how well the coefficients for the tracing parameter in the logistic regression help to detect concussion in novel cases. (c) 2015 APA, all rights reserved.
Analysis of Sequence Data Under Multivariate Trait-Dependent Sampling.
Tao, Ran; Zeng, Donglin; Franceschini, Nora; North, Kari E; Boerwinkle, Eric; Lin, Dan-Yu
2015-06-01
High-throughput DNA sequencing allows for the genotyping of common and rare variants for genetic association studies. At the present time and for the foreseeable future, it is not economically feasible to sequence all individuals in a large cohort. A cost-effective strategy is to sequence those individuals with extreme values of a quantitative trait. We consider the design under which the sampling depends on multiple quantitative traits. Under such trait-dependent sampling, standard linear regression analysis can result in bias of parameter estimation, inflation of type I error, and loss of power. We construct a likelihood function that properly reflects the sampling mechanism and utilizes all available data. We implement a computationally efficient EM algorithm and establish the theoretical properties of the resulting maximum likelihood estimators. Our methods can be used to perform separate inference on each trait or simultaneous inference on multiple traits. We pay special attention to gene-level association tests for rare variants. We demonstrate the superiority of the proposed methods over standard linear regression through extensive simulation studies. We provide applications to the Cohorts for Heart and Aging Research in Genomic Epidemiology Targeted Sequencing Study and the National Heart, Lung, and Blood Institute Exome Sequencing Project.
Khamanga, Sandile M; Walker, Roderick B
2011-01-15
An accurate, sensitive and specific high performance liquid chromatography-electrochemical detection (HPLC-ECD) method that was developed and validated for captopril (CPT) is presented. Separation was achieved using a Phenomenex(®) Luna 5 μm (C(18)) column and a mobile phase comprised of phosphate buffer (adjusted to pH 3.0): acetonitrile in a ratio of 70:30 (v/v). Detection was accomplished using a full scan multi channel ESA Coulometric detector in the "oxidative-screen" mode with the upstream electrode (E(1)) set at +600 mV and the downstream (analytical) electrode (E(2)) set at +950 mV, while the potential of the guard cell was maintained at +1050 mV. The detector gain was set at 300. Experimental design using central composite design (CCD) was used to facilitate method development. Mobile phase pH, molarity and concentration of acetonitrile (ACN) were considered the critical factors to be studied to establish the retention time of CPT and cyclizine (CYC) that was used as the internal standard. Twenty experiments including centre points were undertaken and a quadratic model was derived for the retention time for CPT using the experimental data. The method was validated for linearity, accuracy, precision, limits of quantitation and detection, as per the ICH guidelines. The system was found to produce sharp and well-resolved peaks for CPT and CYC with retention times of 3.08 and 7.56 min, respectively. Linear regression analysis for the calibration curve showed a good linear relationship with a regression coefficient of 0.978 in the concentration range of 2-70 μg/mL. The linear regression equation was y=0.0131x+0.0275. The limits of quantitation (LOQ) and detection (LOD) were found to be 2.27 and 0.6 μg/mL, respectively. The method was used to analyze CPT in tablets. The wide range for linearity, accuracy, sensitivity, short retention time and composition of the mobile phase indicated that this method is better for the quantification of CPT than the pharmacopoeial methods. Copyright © 2010 Elsevier B.V. All rights reserved.
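As an illustration of how a calibration line like the one reported above is used, the sketch below inverts y = 0.0131x + 0.0275 to recover a concentration from a measured response. The abstract does not say what y measures, so it is treated here as a generic detector response; the function and constant names are hypothetical.

```python
# Calibration line reported in the abstract: y = 0.0131x + 0.0275,
# stated as linear over 2-70 ug/mL. All names below are illustrative.
SLOPE = 0.0131
INTERCEPT = 0.0275
RANGE_UG_PER_ML = (2.0, 70.0)


def concentration_from_response(y):
    """Invert the calibration line; return None outside the stated range."""
    x = (y - INTERCEPT) / SLOPE
    low, high = RANGE_UG_PER_ML
    if x < low or x > high:
        return None  # extrapolation beyond the calibrated range is not supported
    return x


# Response that the line predicts for a 10 ug/mL standard.
response_at_10 = SLOPE * 10.0 + INTERCEPT
estimate = concentration_from_response(response_at_10)
```

Returning None outside 2-70 μg/mL reflects the usual practice of not reporting values extrapolated beyond the calibrated range.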
Interpersonal guilt and substance use in college students.
Locke, Geoffrey W; Shilkret, Robert; Everett, Joyce E; Petry, Nancy M
2015-01-01
The college years are a time for developing independence and separating from one's family, and they are also a time in which substance use often escalates. This study examined the relationships between use of substances and interpersonal guilt, an emotion that can arise from feelings about separation among college students. In total, 1865 college students completed a survey evaluating substance use and interpersonal guilt. Regular users of alcohol, cigarettes, cannabis, and other illicit drugs were compared with nonregular users of each substance. Sequential linear regression, controlling for confounding variables, examined relationships between regular use of each substance and scores on a guilt index. Risky drinkers and daily smokers had significantly more interpersonal guilt than their peers who did not regularly use these substances. In contrast, regular cannabis users had significantly less guilt than nonregular cannabis users. These data suggest that substance use among college students may be related to interpersonal guilt and family separation issues, and this relationship may vary across substances.
Use of probabilistic weights to enhance linear regression myoelectric control
NASA Astrophysics Data System (ADS)
Smith, Lauren H.; Kuiken, Todd A.; Hargrove, Levi J.
2015-12-01
Objective. Clinically available prostheses for transradial amputees do not allow simultaneous myoelectric control of degrees of freedom (DOFs). Linear regression methods can provide simultaneous myoelectric control, but frequently also result in difficulty with isolating individual DOFs when desired. This study evaluated the potential of using probabilistic estimates of categories of gross prosthesis movement, which are commonly used in classification-based myoelectric control, to enhance linear regression myoelectric control. Approach. Gaussian models were fit to electromyogram (EMG) feature distributions for three movement classes at each DOF (no movement, or movement in either direction) and used to weight the output of linear regression models by the probability that the user intended the movement. Eight able-bodied and two transradial amputee subjects worked in a virtual Fitts’ law task to evaluate differences in controllability between linear regression and probability-weighted regression for an intramuscular EMG-based three-DOF wrist and hand system. Main results. Real-time and offline analyses in able-bodied subjects demonstrated that probability weighting improved performance during single-DOF tasks (p < 0.05) by preventing extraneous movement at additional DOFs. Similar results were seen in experiments with two transradial amputees. Though goodness-of-fit evaluations suggested that the EMG feature distributions showed some deviations from the Gaussian, equal-covariance assumptions used in this experiment, the assumptions were sufficiently met to provide improved performance compared to linear regression control. Significance. Use of probability weights can improve the ability to isolate individual DOFs during linear regression myoelectric control, while maintaining the ability to simultaneously control multiple DOFs.
NASA Astrophysics Data System (ADS)
Hegazy, Maha A.; Lotfy, Hayam M.; Rezk, Mamdouh R.; Omran, Yasmin Rostom
2015-04-01
Smart and novel spectrophotometric and chemometric methods have been developed and validated for the simultaneous determination of a binary mixture of chloramphenicol (CPL) and dexamethasone sodium phosphate (DSP) in presence of interfering substances without prior separation. The first method depends upon derivative subtraction coupled with constant multiplication. The second one is ratio difference method at optimum wavelengths which were selected after applying derivative transformation method via multiplying by a decoding spectrum in order to cancel the contribution of non labeled interfering substances. The third method relies on partial least squares with regression model updating. They are so simple that they do not require any preliminary separation steps. Accuracy, precision and linearity ranges of these methods were determined. Moreover, specificity was assessed by analyzing synthetic mixtures of both drugs. The proposed methods were successfully applied for analysis of both drugs in their pharmaceutical formulation. The obtained results have been statistically compared to that of an official spectrophotometric method to give a conclusion that there is no significant difference between the proposed methods and the official ones with respect to accuracy and precision.
Simplified large African carnivore density estimators from track indices.
Winterbach, Christiaan W; Ferreira, Sam M; Funston, Paul J; Somers, Michael J
2016-01-01
The range, population size and trend of large carnivores are important parameters to assess their status globally and to plan conservation strategies. One can use linear models to assess population size and trends of large carnivores from track-based surveys on suitable substrates. The conventional approach of a linear model with intercept may not intercept at zero, but may fit the data better than a linear model through the origin. We assess whether a linear regression through the origin is more appropriate than a linear regression with intercept to model large African carnivore densities and track indices. We did simple linear regression with intercept analysis and simple linear regression through the origin and used the confidence interval for ß in the linear model y = αx + ß, Standard Error of Estimate, Mean Squares Residual and Akaike Information Criteria to evaluate the models. The Lion on Clay and Low Density on Sand models with intercept were not significant (P > 0.05). The other four models with intercept and the six models through the origin were all significant (P < 0.05). The models using linear regression with intercept all included zero in the confidence interval for ß and the null hypothesis that ß = 0 could not be rejected. All models showed that the linear model through the origin provided a better fit than the linear model with intercept, as indicated by the Standard Error of Estimate and Mean Square Residuals. Akaike Information Criteria showed that linear models through the origin were better and that none of the linear models with intercept had substantial support. Our results showed that linear regression through the origin is justified over the more typical linear regression with intercept for all models we tested. A general model can be used to estimate large carnivore densities from track densities across species and study areas. The formula observed track density = 3.26 × carnivore density can be used to estimate densities of large African carnivores using track counts on sandy substrates in areas where carnivore densities are 0.27 carnivores/100 km² or higher. To improve the current models, we need independent data to validate the models and data to test for a non-linear relationship between track indices and true density at low densities.
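A regression through the origin, as discussed in this abstract, has a closed-form slope, and the reported general model can be inverted to estimate density from a track count. The sketch below assumes the standard least-squares estimator for a no-intercept model; the function names are hypothetical, and only the factor 3.26 comes from the abstract.

```python
def slope_through_origin(x, y):
    """Least-squares slope for y = b*x (no intercept): b = sum(xy) / sum(x^2)."""
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least one non-zero value")
    return sum(xi * yi for xi, yi in zip(x, y)) / sxx


def density_from_tracks(track_density, slope=3.26):
    """Invert track density = slope * carnivore density (slope from the abstract)."""
    return track_density / slope
```

For example, `density_from_tracks(3.26)` returns 1.0 in whatever density units the track index was calibrated against. The abstract limits the formula to sandy substrates and to densities of 0.27 carnivores/100 km² or higher, so estimates below that value should be treated with caution.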
[From clinical judgment to linear regression model].
Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O
2013-01-01
When we think about mathematical models, such as the linear regression model, we think that these terms are only used by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and has a normal distribution. Stated in another way, the regression is used to predict a measure based on the knowledge of at least one other variable. Linear regression has as its first objective to determine the slope or inclination of the regression line: Y = a + bx, where "a" is the intercept or regression constant and is equivalent to the "Y" value when "X" equals 0, and "b" (also called slope) indicates the increase or decrease that occurs when the variable "x" increases or decreases by one unit. In the regression line, "b" is called the regression coefficient. The coefficient of determination (R²) indicates the importance of independent variables in the outcome.
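The quantities named in this abstract (intercept a, slope b and the coefficient of determination) can be computed directly from paired observations. The following is a minimal sketch using the ordinary least-squares formulas; the function name is hypothetical and no data from the article are used.

```python
def fit_line(x, y):
    """Ordinary least squares for Y = a + b*x; returns (a, b, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                      # change in Y per one-unit change in x
    a = mean_y - b * mean_x            # value of Y when x equals 0
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot  # share of variance in Y explained by x
    return a, b, r_squared
```

The function assumes that x and y each contain at least two distinct values; otherwise the divisions by sxx or ss_tot fail.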
Gimelfarb, A.; Willis, J. H.
1994-01-01
An experiment was conducted to investigate the offspring-parent regression for three quantitative traits (weight, abdominal bristles and wing length) in Drosophila melanogaster. Linear and polynomial models were fitted for the regressions of a character in offspring on both parents. It is demonstrated that responses by the characters to selection predicted by the nonlinear regressions may differ substantially from those predicted by the linear regressions. This is true even, and especially, if selection is weak. The realized heritability for a character under selection is shown to be determined not only by the offspring-parent regression but also by the distribution of the character and by the form and strength of selection. PMID:7828818
Jandera, Pavel; Vyňuchalová, Kateřina; Nečilová, Kateřina
2013-11-22
Combined effects of temperature and mobile-phase composition on retention and separation selectivity of phenolic acids and flavonoid compounds were studied in liquid chromatography on a polydentate Blaze C8 silica based column. The temperature effects on the retention can be described by van't Hoff equation. Good linearity of lnk versus 1/T graphs indicates that the retention is controlled by a single mechanism in the mobile phase and temperature range studied. Enthalpic and entropic contributions to the retention were calculated from the regression lines. Generally, enthalpic contributions control the retention at lower temperatures and in mobile phases with lower concentrations of methanol in water. Semi-empirical retention models describe the simultaneous effects of temperature and the volume fraction of the organic solvent in the mobile phase. Using the linear free energy-retention model, selective dipolarity/polarizability, hydrogen-bond donor, hydrogen-bond acceptor and molecular size contributions to retention were estimated at various mobile phase compositions and temperatures. In addition to mobile phase gradients, temperature programming can be used to reduce separation times. Copyright © 2013 Elsevier B.V. All rights reserved.
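The van't Hoff analysis described above regresses ln k on 1/T and reads the enthalpic contribution from the slope. The sketch below assumes the usual form ln k = -ΔH/(RT) + ΔS/R + ln φ, where φ is the phase ratio; the abstract does not print the equation, and the function name and the synthetic data are hypothetical.

```python
import math

R_GAS = 8.314462618  # molar gas constant, J/(mol*K)


def vant_hoff_fit(temps_kelvin, retention_factors):
    """Fit ln k = slope * (1/T) + intercept; return (delta_h, intercept).

    delta_h = -slope * R is the enthalpy term in J/mol. The intercept equals
    delta_S/R + ln(phase ratio), so the entropy term cannot be separated
    without knowing the phase ratio.
    """
    x = [1.0 / t for t in temps_kelvin]
    y = [math.log(k) for k in retention_factors]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope * R_GAS, intercept


# Synthetic retention factors generated with delta_h = -10 kJ/mol (made up).
temps = [293.15, 303.15, 313.15, 323.15]
ks = [math.exp(10000.0 / (R_GAS * t) - 2.0) for t in temps]
delta_h, intercept = vant_hoff_fit(temps, ks)
```

A straight ln k versus 1/T plot is what the abstract takes as evidence of a single retention mechanism; curvature would make a single slope, and hence a single ΔH, a poor summary.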
Unitary Response Regression Models
ERIC Educational Resources Information Center
Lipovetsky, S.
2007-01-01
The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…
An Expert System for the Evaluation of Cost Models
1990-09-01
contrast to the condition of equal error variance, called homoscedasticity. (Reference: Applied Linear Regression Models by John Neter - page 423...normal. (Reference: Applied Linear Regression Models by John Neter - page 125) Click Here to continue -> Autocorrelation Click Here for the index - Index...over time. Error terms correlated over time are said to be autocorrelated or serially correlated. (REFERENCE: Applied Linear Regression Models by John
Compound Identification Using Penalized Linear Regression on Metabolomics
Liu, Ruiqi; Wu, Dongfeng; Zhang, Xiang; Kim, Seongho
2014-01-01
Compound identification is often achieved by matching the experimental mass spectra to the mass spectra stored in a reference library based on mass spectral similarity. Because the number of compounds in the reference library is much larger than the range of mass-to-charge ratio (m/z) values, the data become high-dimensional and suffer from singularity. For this reason, penalized linear regressions such as ridge regression and the lasso are used instead of ordinary least squares regression. Furthermore, two-step approaches using the dot product and Pearson’s correlation along with the penalized linear regression are proposed in this study. PMID:27212894
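The singularity problem and the ridge remedy mentioned above can be shown on a deliberately tiny case. The sketch below is not the authors' method: it assumes two predictors, no intercept and the textbook ridge estimator (XᵀX + λI)⁻¹Xᵀy, and every name and number in it is hypothetical.

```python
def ridge_two_features(X, y, lam):
    """Solve (X^T X + lam*I) b = X^T y for two predictors, no intercept."""
    a11 = sum(r[0] * r[0] for r in X) + lam
    a12 = sum(r[0] * r[1] for r in X)
    a22 = sum(r[1] * r[1] for r in X) + lam
    c1 = sum(r[0] * yi for r, yi in zip(X, y))
    c2 = sum(r[1] * yi for r, yi in zip(X, y))
    det = a11 * a22 - a12 * a12
    if det == 0:
        raise ZeroDivisionError("normal equations are singular; increase lam")
    return (a22 * c1 - a12 * c2) / det, (a11 * c2 - a12 * c1) / det


# Two identical columns: ordinary least squares (lam = 0) has no unique solution.
X = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
y = [2.0, 4.0, 6.0]
b1, b2 = ridge_two_features(X, y, lam=1.0)  # the penalty makes the system solvable
```

With the penalty in place the two duplicated predictors share the effect equally (both coefficients equal 28/29 here), whereas calling the function with `lam=0.0` raises because the unpenalized system is singular.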
Control Variate Selection for Multiresponse Simulation.
1987-05-01
M. H. Knuter, Applied Linear Regression Models, Richard D. Erwin, Inc., Homewood, Illinois, 1983. Neuts, Marcel F., Probability, Allyn and Bacon...1982. Neter, J., W. Wasserman, and M. H. Knuter, Applied Linear Regression Models, Richard D. Erwin, Inc., Homewood, Illinois, 1983. Neuts, Marcel F...Aspects of Multivariate Statistical Theory, John Wiley and Sons, New York, New York, 1982. Neter, J., W. Wasserman, and M. H. Knuter, Applied Linear Regression Models
ERIC Educational Resources Information Center
Kobrin, Jennifer L.; Sinharay, Sandip; Haberman, Shelby J.; Chajewski, Michael
2011-01-01
This study examined the adequacy of a multiple linear regression model for predicting first-year college grade point average (FYGPA) using SAT[R] scores and high school grade point average (HSGPA). A variety of techniques, both graphical and statistical, were used to examine if it is possible to improve on the linear regression model. The results…
High correlations between MRI brain volume measurements based on NeuroQuant® and FreeSurfer.
Ross, David E; Ochs, Alfred L; Tate, David F; Tokac, Umit; Seabaugh, John; Abildskov, Tracy J; Bigler, Erin D
2018-05-30
NeuroQuant ® (NQ) and FreeSurfer (FS) are commonly used computer-automated programs for measuring MRI brain volume. Previously they were reported to have high intermethod reliabilities but often large intermethod effect size differences. We hypothesized that linear transformations could be used to reduce the large effect sizes. This study was an extension of our previously reported study. We performed NQ and FS brain volume measurements on 60 subjects (including normal controls, patients with traumatic brain injury, and patients with Alzheimer's disease). We used two statistical approaches in parallel to develop methods for transforming FS volumes into NQ volumes: traditional linear regression, and Bayesian linear regression. For both methods, we used regression analyses to develop linear transformations of the FS volumes to make them more similar to the NQ volumes. The FS-to-NQ transformations based on traditional linear regression resulted in effect sizes which were small to moderate. The transformations based on Bayesian linear regression resulted in all effect sizes being trivially small. To our knowledge, this is the first report describing a method for transforming FS to NQ data so as to achieve high reliability and low effect size differences. Machine learning methods like Bayesian regression may be more useful than traditional methods. Copyright © 2018 Elsevier B.V. All rights reserved.
Quantile Regression in the Study of Developmental Sciences
Petscher, Yaacov; Logan, Jessica A. R.
2014-01-01
Linear regression analysis is one of the most common techniques applied in developmental research, but only allows for an estimate of the average relations between the predictor(s) and the outcome. This study describes quantile regression, which provides estimates of the relations between the predictor(s) and outcome, but across multiple points of the outcome’s distribution. Using data from the High School and Beyond and U.S. Sustained Effects Study databases, quantile regression is demonstrated and contrasted with linear regression when considering models with: (a) one continuous predictor, (b) one dichotomous predictor, (c) a continuous and a dichotomous predictor, and (d) a longitudinal application. Results from each example exhibited the differential inferences which may be drawn using linear or quantile regression. PMID:24329596
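The contrast drawn above between modelling the average and modelling other points of the outcome distribution comes from the loss function: quantile regression minimises the check (pinball) loss instead of squared error. The sketch below shows the simplest case, a model with no predictors, where the minimiser is a sample quantile. The function names and the data are hypothetical and are not taken from the databases named in the abstract.

```python
def pinball_loss(y, prediction, tau):
    """Average check loss of a constant prediction at quantile level tau."""
    total = 0.0
    for yi in y:
        residual = yi - prediction
        # Under-predictions cost tau per unit, over-predictions cost (1 - tau).
        total += tau * residual if residual >= 0 else (tau - 1.0) * residual
    return total / len(y)


def best_constant(y, tau):
    """Search the observed values for the constant minimising the check loss."""
    return min(sorted(set(y)), key=lambda c: pinball_loss(y, c, tau))


# Made-up outcome with one extreme value.
scores = [1, 2, 3, 4, 100]
median_fit = best_constant(scores, 0.5)
lower_fit = best_constant(scores, 0.25)
mean_fit = sum(scores) / len(scores)
```

Here the mean (22.0) is pulled toward the extreme value while the fitted median stays at 3, which is the kind of differential inference the abstract describes. Full quantile regression with predictors minimises the same loss over regression coefficients, typically by linear programming, which is not shown.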
Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L
2018-01-01
Aims: A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R²), using R² as the primary metric of assay agreement. However, the use of R² alone does not adequately quantify constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. Methods: We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing assays (NGS). NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Results: Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. The Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. Conclusions: The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. PMID:28747393
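The point that R² alone can hide a constant error is easy to demonstrate with the basic Bland-Altman quantities. The sketch below computes the mean difference (bias) and the conventional 95% limits of agreement, bias ± 1.96 standard deviations of the differences. The function name and the paired values are hypothetical, and the Deming regression used in the study is not shown.

```python
import statistics


def bland_altman(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)   # constant error between the two methods
    sd = statistics.stdev(diffs)    # spread of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Made-up paired results: method A reads exactly 5 units above method B.
reference = [1.0, 2.0, 3.0, 4.0]
candidate = [6.0, 7.0, 8.0, 9.0]
bias, lower, upper = bland_altman(candidate, reference)
```

These two series are perfectly correlated, so a regression of one on the other gives R² = 1, yet the bias of 5.0 shows that the methods disagree by a constant offset.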
A SEMIPARAMETRIC BAYESIAN MODEL FOR CIRCULAR-LINEAR REGRESSION
We present a Bayesian approach to regress a circular variable on a linear predictor. The regression coefficients are assumed to have a nonparametric distribution with a Dirichlet process prior. The semiparametric Bayesian approach gives added flexibility to the model and is usefu...
Fernandes, David Douglas Sousa; Gomes, Adriano A; Costa, Gean Bezerra da; Silva, Gildo William B da; Véras, Germano
2011-12-15
This work evaluates the use of the visible and near-infrared (NIR) ranges, separately and combined, to determine the biodiesel content in biodiesel/diesel blends using Multiple Linear Regression (MLR) with variable selection by the Successive Projections Algorithm (SPA). Full-spectrum models employing Partial Least Squares (PLS), variable selection by Stepwise (SW) regression coupled with MLR, and PLS models with variable selection by Jack-Knife (Jk) were compared with the proposed methodology. Several preprocessing methods were evaluated; the Savitzky-Golay derivative with a second-order polynomial and a 17-point window was chosen for the NIR and visible-NIR ranges, with offset correction. A total of 100 blends with biodiesel content between 5 and 50% (v/v) were prepared from ten biodiesel samples. In the NIR and visible regions the best model was SPA-MLR, using only two and eight wavelengths with RMSEP of 0.6439% (v/v) and 0.5741, respectively, while in the visible-NIR region the best model was SW-MLR, using five wavelengths with an RMSEP of 0.9533% (v/v). Results indicate that both spectral ranges evaluated showed potential for developing a rapid and nondestructive method to quantify biodiesel in blends with mineral diesel. Finally, the improvement in prediction error obtained with the variable selection procedure was significant. Copyright © 2011 Elsevier B.V. All rights reserved.
Physician leadership styles and effectiveness: an empirical study.
Xirasagar, Sudha; Samuels, Michael E; Stoskopf, Carleen H
2005-12-01
The authors study the association between physician leadership styles and leadership effectiveness. Executive directors of community health centers were surveyed (269 respondents; response rate = 40.9 percent) for their perceptions of the medical director's leadership behaviors and effectiveness, using an adapted Multifactor Leadership Questionnaire (43 items on a 0-4 point Likert-type scale), with additional questions on demographics and the center's clinical goals and achievements. The authors hypothesize that transformational leadership would be more positively associated with executive directors' ratings of effectiveness, satisfaction with the leader, and subordinate extra effort, as well as the center's clinical goal achievement, than transactional or laissez-faire leadership. Separate ordinary least squares regressions were used to model each of the effectiveness measures, and general linear model regression was used to model clinical goal achievement. Results support the hypothesis and suggest that physician leadership development using the transformational leadership model may result in improved health care quality and cost control.
Kumar, K Vasanth; Sivanesan, S
2006-08-25
The pseudo second order kinetic expressions of Ho, Sobkowski and Czerwinski, Blanchard et al. and Ritchie were fitted to experimental kinetic data for the adsorption of malachite green onto activated carbon by linear and non-linear methods. The non-linear method was found to be a better way of obtaining the parameters of the second order rate kinetic expressions. Both linear and non-linear regression showed that the Sobkowski and Czerwinski and the Ritchie pseudo second order models were the same. Non-linear regression analysis showed that Blanchard et al. and Ho have similar ideas on the pseudo second order model but with different assumptions. The best fit of the experimental data to Ho's pseudo second order expression by both linear and non-linear regression showed that Ho's model was a better kinetic expression than the other pseudo second order kinetic expressions. The amount of dye adsorbed at equilibrium, q(e), was predicted from Ho's pseudo second order expression and fitted to the Langmuir, Freundlich and Redlich-Peterson expressions by both linear and non-linear methods to obtain the pseudo isotherms. The best-fitting pseudo isotherms were found to be the Langmuir and Redlich-Peterson isotherms. The Redlich-Peterson isotherm reduces to the Langmuir isotherm when the constant g equals unity.
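As an illustration of the linear method described in this abstract, here is a minimal sketch. It assumes Ho's pseudo second order expression in its commonly cited form, q_t = k q_e^2 t / (1 + k q_e t), whose linearized form is t/q_t = 1/(k q_e^2) + t/q_e. The function names and the synthetic data are hypothetical and are not taken from the study.

```python
def ho_qt(t, qe, k):
    """Uptake q_t at time t under the assumed Ho pseudo second order form."""
    return k * qe**2 * t / (1.0 + k * qe * t)


def fit_ho_linear(times, uptakes):
    """Fit a least-squares line through (t, t/q_t).

    For the linearized form, slope = 1/qe and intercept = 1/(k * qe^2).
    """
    xs = list(times)
    ys = [t / q for t, q in zip(times, uptakes)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    qe = 1.0 / slope
    k = slope**2 / intercept  # because intercept = slope^2 / k
    return qe, k


# Synthetic, noise-free data (illustrative values only, not from the study).
times = [5.0, 10.0, 20.0, 40.0, 80.0, 160.0]
uptakes = [ho_qt(t, qe=50.0, k=0.002) for t in times]
qe_hat, k_hat = fit_ho_linear(times, uptakes)
```

On noise-free data the linearized fit recovers the parameters used to generate it. With measurement noise, the t/q_t transformation reweights the errors, which may be one reason a direct non-linear fit of q_t can give different parameter estimates.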
NASA Astrophysics Data System (ADS)
Hapugoda, J. C.; Sooriyarachchi, M. R.
2017-09-01
The survival time of patients with a disease and the incidence of that disease (count) are frequently observed in medical studies with clustered data. In many cases the survival times and the count can be correlated, such that diseases that occur rarely have shorter survival times, or vice versa. Because of this, joint modelling of these two variables can provide interesting and improved results compared with modelling them separately. The authors have previously proposed a methodology using Generalized Linear Mixed Models (GLMM), joining the Discrete Time Hazard model with the Poisson Regression model to jointly model survival and count. As the Artificial Neural Network (ANN) has become a powerful computational tool for modelling complex non-linear systems, it was proposed to develop a new joint model of survival and count for Dengue patients in Sri Lanka using that approach. The objective of this study is therefore to develop a model using the ANN approach and compare the results with the previously developed GLMM model. As the response variables are continuous in nature, the Generalized Regression Neural Network (GRNN) approach was adopted to model the data. To compare the model fit, measures such as root mean square error (RMSE), absolute mean error (AME) and correlation coefficient (R) were used. The measures indicate that the GRNN model fits the data better than the GLMM model.
Topsakal, Vedat; Fransen, Erik; Schmerber, Sébastien; Declau, Frank; Yung, Matthew; Gordts, Frans; Van Camp, Guy; Van de Heyning, Paul
2006-09-01
To report the preoperative audiometric profile of surgically confirmed otosclerosis. Retrospective, multicenter study. Four tertiary referral centers. One thousand sixty-four surgically confirmed patients with otosclerosis. Therapeutic ear surgery for hearing improvement. Preoperative audiometric air conduction (AC) and bone conduction (BC) hearing thresholds were obtained retrospectively for 1064 patients with otosclerosis. A cross-sectional multiple linear regression analysis was performed on audiometric data of affected ears. Influences of age and sex were analyzed and age-related typical audiograms were created. Bone conduction thresholds were corrected for Carhart effect and presbyacusis; in addition, we tested whether a separate cochlear otosclerosis component existed. Corrected thresholds were then analyzed separately for progression of cochlear otosclerosis. The study population consisted of 35% men and 65% women (mean age, 44 yr). The mean pure-tone average at 0.5, 1, and 2 kHz was 57 dB hearing level. Multiple linear regression analysis showed significant progression for all measured AC and BC thresholds. The average annual threshold deterioration for AC was 0.45 dB/yr and the annual threshold deterioration for BC was 0.37 dB/yr. The average annual gap expansion was 0.08 dB/yr. The BC thresholds corrected for Carhart effect and presbyacusis remained significantly different from zero, but only showed progression at 2 kHz. The preoperative audiological profile of otosclerosis is described. There is a significant sensorineural component in patients with otosclerosis planned for stapedotomy, which is worse than age-related hearing loss by itself. Deterioration rates of AC and BC thresholds have been reported, which can be helpful in clinical practice and might also guide the characterization of allegedly different phenotypes for familial and sporadic otosclerosis.
Sorimachi, Kenji; Okayasu, Teiji
2015-01-01
The complete vertebrate mitochondrial genome consists of 13 coding genes. We used this genome to investigate the existence of natural selection in vertebrate evolution. From the complete mitochondrial genomes, we predicted nucleotide contents and then separated these values into coding and non-coding regions. When nucleotide contents of a coding or non-coding region were plotted against the nucleotide content of the complete mitochondrial genomes, we obtained linear regression lines only between homonucleotides and their analogs. On every plot using G or A content purine, G content in aquatic vertebrates was higher than that in terrestrial vertebrates, while A content in aquatic vertebrates was lower than that in terrestrial vertebrates. Based on these relationships, vertebrates were separated into two groups, terrestrial and aquatic. However, using C or T content pyrimidine, clear separation between these two groups was not obtained. The hagfish (Eptatretus burgeri) was further separated from both terrestrial and aquatic vertebrates. Based on these results, nucleotide content relationships predicted from the complete vertebrate mitochondrial genomes reveal the existence of natural selection based on evolutionary separation between terrestrial and aquatic vertebrate groups. In addition, we propose that separation of the two groups might be linked to ammonia detoxification based on high G and low A contents, which encode Glu rich and Lys poor proteins.
Fischer, A; Friggens, N C; Berry, D P; Faverdin, P
2018-07-01
The ability to properly assess and accurately phenotype true differences in feed efficiency among dairy cows is key to the development of breeding programs for improving feed efficiency. The variability among individuals in feed efficiency is commonly characterised by the residual intake approach. Residual feed intake is represented by the residuals of a linear regression of intake on the corresponding quantities of the biological functions that consume (or release) energy. However, the residuals include both model fitting and measurement errors as well as any variability in cow efficiency. The objective of this study was to isolate the individual animal variability in feed efficiency from the residual component. Two separate models were fitted. In one, the standard residual energy intake (REI) was calculated as the residual of a multiple linear regression of lactation average net energy intake (NEI) on lactation average milk energy output, average metabolic BW, as well as lactation loss and gain of body condition score. In the other, a linear mixed model was used to simultaneously fit fixed linear regressions and random cow levels on the biological traits and intercept using fortnightly repeated measures for the variables. This method split the predicted NEI into two parts: one quantifying the population mean intercept and coefficients, and one quantifying cow-specific deviations in the intercept and coefficients. The cow-specific part of predicted NEI was assumed to isolate true differences in feed efficiency among cows. NEI and associated energy expenditure phenotypes were available for the first 17 fortnights of lactation from 119 Holstein cows, all fed a constant energy-rich diet. Mixed models fitting cow-specific intercept and coefficients to different combinations of the aforementioned energy expenditure traits, calculated on a fortnightly basis, were compared.
The variance of REI estimated with the lactation average model represented only 8% of the variance of measured NEI. Among all compared mixed models, the variance of the cow-specific part of predicted NEI represented between 53% and 59% of the variance of REI estimated from the lactation average model or between 4% and 5% of the variance of measured NEI. The remaining 41% to 47% of the variance of REI estimated with the lactation average model may therefore reflect model fitting errors or measurement errors. In conclusion, the use of a mixed model framework with cow-specific random regressions seems to be a promising method to isolate the cow-specific component of REI in dairy cows.
The Relationship between TOC and pH with Exchangeable Heavy Metal Levels in Lithuanian Podzols
NASA Astrophysics Data System (ADS)
Khaledian, Yones; Pereira, Paulo; Brevik, Eric C.; Pundyte, Neringa; Paliulis, Dainius
2017-04-01
Heavy metals can have a negative impact on public and environmental health. The objective of this study was to investigate the relationship between total organic carbon (TOC) and pH with exchangeable heavy metals (Pb, Cd, Cu and Zn) in order to predict exchangeable heavy metal content in soils sampled near Panevėžys and Kaunas, Lithuania. Principal component regression (PCR) and nonlinear regression methods were tested to find the statistical relationship between TOC and pH with heavy metals. The results of PCR [R2 = 0.68, RMSE = 0.07] and non-linear regression [R2 = 0.74, RMSE= 0.065] (pH with TOC and exchangeable parameters) were statistically significant. However, this was not observed in the relationships of pH and TOC separately with exchangeable heavy metals. The results indicated that pH had a higher correlation with exchangeable heavy metals (non-linear regression [R2 = 0.72, RMSE= 0.066]) than TOC with heavy metals [R2 = 0.30, RMSE= 0.004]. It can be concluded that even though there was a strong relationship between TOC and pH with exchangeable metals, the metal mobility (exchangeable metals) can be explained by pH better than TOC in this study. Finally, manipulating soil pH could likely be productive to assess and control heavy metals when financial and time limitations exist (Khaledian et al. 2016). Reference(s) Khaledian Y, Pereira P, Brevik E.C, Pundyte N, Paliulis D. 2016. The Influence of Organic Carbon and pH on Heavy Metals, Potassium, and Magnesium Levels in Lithuanian Podzols. Land Degradation and Development. DOI: 10.1002/ldr.2638
2015-07-15
Long-term effects on cancer survivors’ quality of life of physical training versus physical training combined with cognitive-behavioral therapy ...COMPARISON OF NEURAL NETWORK AND LINEAR REGRESSION MODELS IN STATISTICALLY PREDICTING MENTAL AND PHYSICAL HEALTH STATUS OF BREAST..."Comparison of Neural Network and Linear Regression Models in Statistically Predicting Mental and Physical Health Status of Breast Cancer Survivors
Prediction of the Main Engine Power of a New Container Ship at the Preliminary Design Stage
NASA Astrophysics Data System (ADS)
Cepowski, Tomasz
2017-06-01
The paper presents mathematical relationships that allow us to forecast the estimated main engine power of new container ships, based on data concerning vessels built in 2005-2015. The presented approximations allow us to estimate the engine power based on the length between perpendiculars and the number of containers the ship will carry. The approximations were developed using simple linear regression and multivariate linear regression analysis. The presented relations have practical application for estimation of container ship engine power needed in preliminary parametric design of the ship. It follows from the above that the use of multiple linear regression to predict the main engine power of a container ship brings more accurate solutions than simple linear regression.
The swan-song phenomenon: last-works effects for 172 classical composers.
Simonton, D K
1989-03-01
Creative individuals approaching their final years of life may undergo a transformation in outlook that is reflected in their last works. This hypothesized effect was quantitatively assessed for an extensive sample of 1,919 works by 172 classical composers. The works were independently gauged on seven aesthetic attributes (melodic originality, melodic variation, repertoire popularity, aesthetic significance, listener accessibility, performance duration, and thematic size), and potential last-works effects were operationally defined two separate ways (linearly and exponentially). Statistical controls were introduced for both longitudinal changes (linear, quadratic, and cubic age functions) and individual differences (eminence and lifetime productivity). Hierarchical regression analyses indicated that composers' swan songs tend to score lower in melodic originality and performance duration but higher in repertoire popularity and aesthetic significance. These last-works effects survive control for total compositional output, eminence, and most significantly, the composer's age when the last works were created.
ERIC Educational Resources Information Center
Li, Deping; Oranje, Andreas
2007-01-01
Two versions of a general method for approximating standard error of regression effect estimates within an IRT-based latent regression model are compared. The general method is based on Binder's (1983) approach, accounting for complex samples and finite populations by Taylor series linearization. In contrast, the current National Assessment of…
Ernst, Anja F; Albers, Casper J
2017-01-01
Misconceptions about the assumptions behind the standard linear regression model are widespread and dangerous. These lead to using linear regression when inappropriate, and to employing alternative procedures with less statistical power when unnecessary. Our systematic literature review investigated employment and reporting of assumption checks in twelve clinical psychology journals. Findings indicate that normality of the variables themselves, rather than of the errors, was wrongfully held for a necessary assumption in 4% of papers that use regression. Furthermore, 92% of all papers using linear regression were unclear about their assumption checks, violating APA-recommendations. This paper appeals for a heightened awareness for and increased transparency in the reporting of statistical assumption checking. PMID:28533971
Passive dendrites enable single neurons to compute linearly non-separable functions.
Cazé, Romain Daniel; Humphries, Mark; Gutkin, Boris
2013-01-01
Local supra-linear summation of excitatory inputs occurring in pyramidal cell dendrites, the so-called dendritic spikes, results in independent spiking dendritic sub-units, which turn pyramidal neurons into two-layer neural networks capable of computing linearly non-separable functions, such as the exclusive OR. Other neuron classes, such as interneurons, may possess only a few independent dendritic sub-units, or only passive dendrites where input summation is purely sub-linear, and where dendritic sub-units are only saturating. To determine if such neurons can also compute linearly non-separable functions, we enumerate, for a given parameter range, the Boolean functions implementable by a binary neuron model with a linear sub-unit and either a single spiking or a saturating dendritic sub-unit. We then analytically generalize these numerical results to an arbitrary number of non-linear sub-units. First, we show that a single non-linear dendritic sub-unit, in addition to the somatic non-linearity, is sufficient to compute linearly non-separable functions. Second, we analytically prove that, with a sufficient number of saturating dendritic sub-units, a neuron can compute all functions computable with purely excitatory inputs. Third, we show that these linearly non-separable functions can be implemented with at least two strategies: one where a dendritic sub-unit is sufficient to trigger a somatic spike; another where somatic spiking requires the cooperation of multiple dendritic sub-units. We formally prove that implementing the latter architecture is possible with both types of dendritic sub-units whereas the former is only possible with spiking dendrites. Finally, we show how linearly non-separable functions can be computed by a generic two-compartment biophysical model and a realistic neuron model of the cerebellar stellate cell interneuron. 
Taken together our results demonstrate that passive dendrites are sufficient to enable neurons to compute linearly non-separable functions. PMID:23468600
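The second strategy in this abstract, where somatic spiking requires the cooperation of multiple saturating sub-units, can be illustrated with a toy Boolean model. This is a minimal sketch under stated assumptions, not the authors' neuron model: four binary excitatory inputs, two dendritic sub-units that each saturate at 1, and a somatic threshold of 2. All names are hypothetical. The function computed, (x1 OR x2) AND (x3 OR x4), is linearly non-separable, and the brute-force search over a small integer weight grid is only a sanity check of that fact.

```python
from itertools import product


def saturating_subunit(inputs, cap=1):
    """Sub-linear dendritic summation: the output saturates at `cap`."""
    return min(sum(inputs), cap)


def two_subunit_neuron(x):
    """Binary neuron whose soma fires only when both saturating sub-units are active."""
    d1 = saturating_subunit(x[0:2])
    d2 = saturating_subunit(x[2:4])
    return int(d1 + d2 >= 2)


def target(x):
    """(x1 OR x2) AND (x3 OR x4)."""
    return int((x[0] or x[1]) and (x[2] or x[3]))


all_inputs = list(product((0, 1), repeat=4))

# The two-sub-unit neuron reproduces the target on every input pattern.
neuron_matches = all(two_subunit_neuron(x) == target(x) for x in all_inputs)

# No single linear threshold unit on this small integer grid computes the target.
linear_solutions = [
    (w, theta)
    for w in product(range(-2, 3), repeat=4)
    for theta in range(-8, 9)
    if all(
        int(sum(wi * xi for wi, xi in zip(w, x)) >= theta) == target(x)
        for x in all_inputs
    )
]
```

The empty search result reflects a general argument: a linear unit would need w1 + w3 and w2 + w4 to each reach the threshold while w1 + w2 and w3 + w4 each stay below it, and adding these inequalities pairwise gives a contradiction.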
Xia, Z Y; Zhai, X D; Liu, B B; Zheng, Z; Zhao, L L; Mo, Y N
2017-02-01
To analyze the relationship among electrical conductivity (EC), total volatile basic nitrogen (TVB-N), which is an index of decomposition rate for meat production, and postmortem interval (PMI), and to explore the feasibility of EC as an index of cadaveric skeletal muscle decomposition rate and lay the foundation for PMI estimation. Healthy Sprague-Dawley rats were sacrificed by cervical vertebrae dislocation and kept at 28 ℃. Muscle of the rear limbs was removed at different PMI and homogenized in deionized water, and skeletal muscle extraction liquid with a mass concentration of 0.1 g/mL was prepared. EC and TVB-N of the extraction liquid were determined separately. The correlation between EC (x₁) and TVB-N (x₂) was analyzed, and their regression function was established. The relationship between PMI (y) and each of these two parameters was studied, and the regression functions were established separately. The change trends of EC and TVB-N of the extraction liquid at different PMI were almost the same, and there was a linear positive correlation between them. The regression equation was x₂ = 0.14x₁ - 164.91 (R² = 0.982). EC and TVB-N of skeletal muscle changed significantly with PMI, and the regression functions were y = 19.38x₁³ - 370.68x₁² + 2526.03x₁ - 717.06 (R² = 0.994) and y = 2.56x₂³ - 48.39x₂² + 330.60x₂ - 255.04 (R² = 0.997), respectively. EC and TVB-N of rat postmortem skeletal muscle show similar change trends, which can be used as an index for the decomposition rate of cadaveric skeletal muscle and provide a method for further study of late PMI estimation. Copyright© by the Editorial Department of Journal of Forensic Medicine
Comparing The Effectiveness of a90/95 Calculations (Preprint)
2006-09-01
Nachtsheim, John Neter, William Li, Applied Linear Statistical Models, 5th ed., McGraw-Hill/Irwin, 2005 5. Mood, Graybill and Boes, Introduction...curves is based on methods that are only valid for ordinary linear regression. Requirements for a valid Ordinary Least-Squares Regression Model There...linear. For example is a linear model; is not. 2. Uniform variance (homoscedasticity
Predicting the Retention Behavior of Specific O-Linked Glycopeptides.
Badgett, Majors J; Boyes, Barry; Orlando, Ron
2017-09-01
O-Linked glycosylation is a common post-translational modification that can alter the overall structure, polarity, and function of proteins. Reverse-phase (RP) chromatography is the most common chromatographic approach to analyze O-glycosylated peptides and their unmodified counterparts, even though this approach often does not provide adequate separation of these two species. Hydrophilic interaction liquid chromatography (HILIC) can be a solution to this problem, as the polar glycan interacts with the polar stationary phase and potentially offers the ability to resolve the peptide from its modified form(s). In this paper, HILIC is used to separate peptides with O-N-acetylgalactosamine (O-GalNAc), O-N-acetylglucosamine (O-GlcNAc), and O-fucose additions from their native forms, and coefficients representing the extent of hydrophilicity were derived using linear regression analysis as a means to predict the retention times of peptides with these modifications. PMID:28785176
Tu, Yu-Kang; Krämer, Nicole; Lee, Wen-Chung
2012-07-01
In the analysis of trends in health outcomes, an ongoing issue is how to separate and estimate the effects of age, period, and cohort. As these 3 variables are perfectly collinear by definition, regression coefficients in a general linear model are not unique. In this tutorial, we review why identification is a problem, and how this problem may be tackled using partial least squares and principal components regression analyses. Both methods produce regression coefficients that fulfill the same collinearity constraint as the variables age, period, and cohort. We show that, because the constraint imposed by partial least squares and principal components regression is inherent in the mathematical relation among the 3 variables, this leads to more interpretable results. We use one dataset from a Taiwanese health-screening program to illustrate how to use partial least squares regression to analyze the trends in body heights with 3 continuous variables for age, period, and cohort. We then use another dataset of hepatocellular carcinoma mortality rates for Taiwanese men to illustrate how to use partial least squares regression to analyze tables with aggregated data. We use the second dataset to show the relation between the intrinsic estimator, a recently proposed method for the age-period-cohort analysis, and partial least squares regression. We also show that the inclusion of all indicator variables provides a more consistent approach. R code for our analyses is provided in the eAppendix.
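The identification problem described in this tutorial abstract can be made concrete with a small numerical sketch. It assumes the usual definition cohort = period - age and a linear model with one coefficient per variable. The coefficient values and data rows are made up for illustration.

```python
def predict(age, period, coefs):
    """Linear predictor with separate age, period and cohort slopes."""
    cohort = period - age  # birth cohort is fully determined by age and period
    b_age, b_period, b_cohort = coefs
    return b_age * age + b_period * period + b_cohort * cohort


# Hypothetical (age, period) observations.
rows = [(30, 2000), (40, 2000), (30, 2010), (55, 2015)]

coefs_a = (0.5, 0.2, -0.1)

# Shift the coefficients along the direction (+delta, -delta, +delta).
# Because age - period + cohort = 0, the predictions cannot change.
delta = 3.0
coefs_b = (coefs_a[0] + delta, coefs_a[1] - delta, coefs_a[2] + delta)

preds_a = [predict(a, p, coefs_a) for a, p in rows]
preds_b = [predict(a, p, coefs_b) for a, p in rows]
max_gap = max(abs(u - v) for u, v in zip(preds_a, preds_b))
```

Two very different coefficient vectors fit the data equally well, so least squares alone cannot choose between them. Some additional constraint, such as the one implied by partial least squares or principal components regression, is needed to pick a single solution.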
Correlation and simple linear regression.
Zou, Kelly H; Tuncali, Kemal; Silverman, Stuart G
2003-06-01
In this tutorial article, the concepts of correlation and regression are reviewed and demonstrated. The authors review and compare two correlation coefficients, the Pearson correlation coefficient and the Spearman rho, for measuring linear and nonlinear relationships between two continuous variables. In the case of measuring the linear relationship between a predictor and an outcome variable, simple linear regression analysis is conducted. These statistical concepts are illustrated by using a data set from published literature to assess a computed tomography-guided interventional technique. These statistical methods are important for exploring the relationships between variables and can be applied to many radiologic studies.
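To make the contrast between the two coefficients concrete, here is a minimal standard-library sketch. The data are synthetic (y = x cubed, a monotonic but nonlinear relationship) and the helper names are hypothetical. Spearman's rho is computed as the Pearson correlation of the ranks.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rho as the Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


x = [1, 2, 3, 4, 5, 6]
y = [v**3 for v in x]  # monotonic but not linear

r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
```

For these data the rank-based coefficient reaches its maximum because the relationship is perfectly monotonic, while the Pearson coefficient stays below 1 because the relationship is not a straight line.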
Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L
2018-02-01
A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R²), using R² as the primary metric of assay agreement. However, the use of R² alone does not adequately quantify constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing assays (NGS). NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
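The point that R² alone can hide constant and proportional error is easy to show on synthetic data. The sketch below is a simplification under stated assumptions: it uses ordinary least squares rather than the Deming regression mentioned in the abstract, the conventional 1.96 standard deviation limits of agreement, and made-up values in which the test assay reads 10% high plus a constant offset of 2. All names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def ols_fit(xs, ys):
    """Ordinary least-squares slope, intercept and R^2 of ys on xs."""
    mean_x, mean_y = mean(xs), mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy / sqrt(sxx * syy)) ** 2
    return slope, intercept, r_squared


def bland_altman(reference, test):
    """Mean difference (bias) and 1.96-SD limits of agreement."""
    diffs = [t - r for r, t in zip(reference, test)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Synthetic values: proportional error of 10% plus a constant error of 2.
reference = [10.0, 20.0, 30.0, 40.0, 50.0]
test_assay = [1.1 * r + 2.0 for r in reference]

slope, intercept, r_squared = ols_fit(reference, test_assay)
bias, lower_limit, upper_limit = bland_altman(reference, test_assay)
```

Here R² is essentially 1 even though the two methods disagree systematically. The slope differing from 1 points to proportional error, the nonzero intercept to constant error, and the Bland-Altman bias gives the average disagreement in the assay's own units.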
2017-10-01
ENGINEERING CENTER GRAIN EVALUATION SOFTWARE TO NUMERICALLY PREDICT LINEAR BURN REGRESSION FOR SOLID PROPELLANT GRAIN GEOMETRIES Brian...author(s) and should not be construed as an official Department of the Army position, policy, or decision, unless so designated by other documentation...U.S. ARMY ARMAMENT RESEARCH, DEVELOPMENT AND ENGINEERING CENTER GRAIN EVALUATION SOFTWARE TO NUMERICALLY PREDICT LINEAR BURN REGRESSION FOR SOLID
Linear regression in astronomy. II
NASA Technical Reports Server (NTRS)
Feigelson, Eric D.; Babu, Gutti J.
1992-01-01
A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.
Hanke, Alexander T; Tsintavi, Eleni; Ramirez Vazquez, Maria Del Pilar; van der Wielen, Luuk A M; Verhaert, Peter D E M; Eppink, Michel H M; van de Sandt, Emile J A X; Ottens, Marcel
2016-09-01
Knowledge-based development of chromatographic separation processes requires efficient techniques to determine the physicochemical properties of the product and the impurities to be removed. These characterization techniques are usually divided into approaches that determine molecular properties, such as charge, hydrophobicity and size, or molecular interactions with auxiliary materials, commonly in the form of adsorption isotherms. In this study we demonstrate the application of a three-dimensional liquid chromatography approach to a clarified cell homogenate containing a therapeutic enzyme. Each separation dimension determines a molecular property relevant to the chromatographic behavior of each component. Matching the peaks across the different separation dimensions and against a high-resolution reference chromatogram allows the determined parameters to be assigned to pseudo-components, which in turn allows the most promising technique for the removal of each impurity to be determined. More detailed process design using mechanistic models requires isotherm parameters. For this purpose, the second dimension consists of multiple linear gradient separations on columns in a high-throughput screening compatible format, which allow regression of isotherm parameters with an average standard error of 8%. © 2016 American Institute of Chemical Engineers Biotechnol. Prog., 32:1283-1291, 2016.
A Constrained Linear Estimator for Multiple Regression
ERIC Educational Resources Information Center
Davis-Stober, Clintin P.; Dana, Jason; Budescu, David V.
2010-01-01
"Improper linear models" (see Dawes, Am. Psychol. 34:571-582, "1979"), such as equal weighting, have garnered interest as alternatives to standard regression models. We analyze the general circumstances under which these models perform well by recasting a class of "improper" linear models as "proper" statistical models with a single predictor. We…
On the design of classifiers for crop inventories
NASA Technical Reports Server (NTRS)
Heydorn, R. P.; Takacs, H. C.
1986-01-01
Crop proportion estimators that use classifications of satellite data to correct, in an additive way, a given estimate acquired from ground observations are discussed. A linear version of these estimators is optimal, in terms of minimum variance, when the regression of the ground observations onto the satellite observations is linear. When this regression is not linear, but the reverse regression (satellite observations onto ground observations) is linear, the estimator is suboptimal but still has certain appealing variance properties. In this paper expressions are derived for those regressions which relate the intercepts and slopes to conditional classification probabilities. These expressions are then used to discuss the question of classifier designs that can lead to low-variance crop proportion estimates. Variance expressions for these estimates in terms of classifier omission and commission errors are also derived.
Patterns of shading tolerance determined from experimental ...
An extensive review of the experimental literature on seagrass shading evaluated the relationship between experimental light reductions, duration of experiment and seagrass response metrics to determine whether there were consistent statistical patterns. There were highly significant linear relationships of both percent biomass and percent shoot density reduction versus percent light reduction (versus controls), although unexplained variation in the data was high. Duration of exposure affected extent of response for both metrics, but was more clearly a factor in biomass response. Both biomass and shoot density showed linear responses to duration of light reduction for treatments 60%. Unexplained variation was again high, and greater for shoot density than biomass. With few exceptions, regressions of both biomass and shoot density on light reduction for individual species and for genera were statistically significant, but also tended to show high degrees of variability in data. Multivariate regressions that included both percent light reduction and duration of reduction as dependent variables increased the percentage of variation explained in almost every case. Analysis of response data by seagrass life history category (Colonizing, Opportunistic, Persistent) did not yield clearly separate response relationships in most cases. Biomass tended to show somewhat less variation in response to light reduction than shoot density, and of the two, may be the prefe
Dubey, S. K.; Duddelly, S.; Jangala, H.; Saha, R. N.
2013-01-01
A reliable, rapid and sensitive isocratic reverse phase high-performance liquid chromatography method has been developed and validated for assay of ketorolac tromethamine in tablets and ophthalmic dosage forms using diclofenac sodium as an internal standard. An isocratic separation of ketorolac tromethamine was achieved on Oyster BDS (150×4.6 mm i.d., 5 μm particle size) column using mobile phase of methanol:acetonitrile:sodium dihydrogen phosphate (20 mM; pH 5.5) (50:10:40, %v/v) at a flow rate of 1.0 ml/min. The eluents were monitored at 322 nm for ketorolac and at 282 nm for diclofenac sodium with a photodiode array detector. The retention times of ketorolac and diclofenac sodium were found to be 1.9 min and 4.6 min, respectively. Response was a linear function of drug concentration in the range of 0.01-15 μg/ml (R² = 0.994; linear regression model using weighting factor 1/x²) with a limit of detection and quantification of 0.002 μg/ml and 0.007 μg/ml, respectively. The % recovery and % relative standard deviation values indicated the method was accurate and precise. PMID:23901166
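A calibration line fitted with a 1/x² weighting factor, as in the abstract above, can be sketched as a weighted least-squares fit. The function name `weighted_linear_fit` and the calibration values below are hypothetical and chosen only to span the stated 0.01-15 μg/ml range; they are not data from the study.

```python
def weighted_linear_fit(x, y, w):
    """Weighted least-squares line y = a + b*x; returns (slope, intercept)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted mean of x
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean of y
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical calibration standards (concentration in ug/ml, detector response).
conc = [0.01, 0.1, 1.0, 5.0, 15.0]
response = [3.0 * c + 0.5 for c in conc]

# Weighting factor 1/x^2: low-concentration standards dominate the fit.
weights = [1.0 / c ** 2 for c in conc]
slope, intercept = weighted_linear_fit(conc, response, weights)
```

With these weights the lowest standards carry almost all of the influence, so an error at the top of the range barely moves the intercept. That is the usual motivation for 1/x² weighting when a calibration spans several orders of magnitude.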
NASA Astrophysics Data System (ADS)
Laborda, Francisco; Medrano, Jesús; Castillo, Juan R.
2004-06-01
The quality of the quantitative results obtained from transient signals in high-performance liquid chromatography-inductively coupled plasma mass spectrometry (HPLC-ICPMS) and flow injection-inductively coupled plasma mass spectrometry (FI-ICPMS) was investigated under multielement conditions. Quantification methods were based on multiple-point calibration by simple and weighted linear regression, and double-point calibration (measurement of the baseline and one standard). An uncertainty model, which includes the main sources of uncertainty from FI-ICPMS and HPLC-ICPMS (signal measurement, sample flow rate and injection volume), was developed to estimate peak area uncertainties and statistical weights used in weighted linear regression. The behaviour of the ICPMS instrument was characterized in order to be considered in the model, concluding that the instrument works as a concentration detector when it is used to monitor transient signals from flow injection or chromatographic separations. Proper quantification by the three calibration methods was achieved when compared to reference materials, and the double-point calibration gave results of the same quality as the multiple-point calibration while shortening the calibration time. Relative expanded uncertainties ranged from 10-20% for concentrations around the LOQ to 5% for concentrations higher than 100 times the LOQ.
Quatman-Yates, Catherine; Bonnette, Scott; Gupta, Resmi; Hugentobler, Jason A; Wade, Shari L; Glauser, Tracy A; Ittenbach, Richard F; Paterno, Mark V; Riley, Michael A
2018-04-01
This study aimed to provide insight into the development of postural control abilities in youth. A total of 276 typically developing adolescents (155 males, 121 females) with a mean age of 13.23 years (range of 7.11-18.80) were recruited for participation. Subjects performed two-minute quiet standing trials in bipedal stance on a force plate. Center of pressure (COP) trajectories were quantified using Sample Entropy (SampEn) in the anterior-posterior direction (SampEn-AP), SampEn in the medial-lateral direction (SampEn-ML), and Path Length (PL) measures. Three separate linear regression analyses were conducted to predict the relationship between age and each of the response variables after adjusting for individuals' physical characteristics. Linear regression models showed an inverse relationship between age and entropy measures after adjusting for body mass index. Results indicated that chronological age was predictive of entropy and path length patterns. Specifically, older adolescents exhibited less center of pressure displacement (smaller path length) and less complex, more regular center of pressure displacement patterns (lower SampEn-AP and SampEn-ML values) compared to the younger children. These findings support prior studies suggesting that developmental changes in postural control abilities may continue throughout adolescence and into adulthood. Copyright © 2018 Elsevier B.V. All rights reserved.
Mapping of the DLQI scores to EQ-5D utility values using ordinal logistic regression.
Ali, Faraz Mahmood; Kay, Richard; Finlay, Andrew Y; Piguet, Vincent; Kupfer, Joerg; Dalgard, Florence; Salek, M Sam
2017-11-01
The Dermatology Life Quality Index (DLQI) and the European Quality of Life-5 Dimension (EQ-5D) are separate measures that may be used to gather health-related quality of life (HRQoL) information from patients. The EQ-5D is a generic measure from which health utility estimates can be derived, whereas the DLQI is a specialty-specific measure to assess HRQoL. To reduce the burden of multiple measures being administered and to enable a more disease-specific calculation of health utility estimates, we explored an established mathematical technique known as ordinal logistic regression (OLR) to develop an appropriate model to map DLQI data to EQ-5D-based health utility estimates. Retrospective data from 4010 patients were randomly divided five times into two groups for the derivation and testing of the mapping model. Split-half cross-validation was utilized resulting in a total of ten ordinal logistic regression models for each of the five EQ-5D dimensions against age, sex, and all ten items of the DLQI. Using Monte Carlo simulation, predicted health utility estimates were derived and compared against those observed. This method was repeated for both OLR and a previously tested mapping methodology based on linear regression. The model was shown to be highly predictive and its repeated fitting demonstrated a stable model using OLR as well as linear regression. The mean differences between OLR-predicted health utility estimates and observed health utility estimates ranged from 0.0024 to 0.0239 across the ten modeling exercises, with an average overall difference of 0.0120 (a 1.6% underestimate, not of clinical importance). This modeling framework developed in this study will enable researchers to calculate EQ-5D health utility estimates from a specialty-specific study population, reducing patient and economic burden.
Bowen, Stephen R; Chappell, Richard J; Bentzen, Søren M; Deveau, Michael A; Forrest, Lisa J; Jeraj, Robert
2012-01-01
Purpose: To quantify associations between pre-radiotherapy and post-radiotherapy PET parameters via spatially resolved regression. Materials and methods: Ten canine sinonasal cancer patients underwent PET/CT scans of [18F]FDG (FDGpre), [18F]FLT (FLTpre), and [61Cu]Cu-ATSM (Cu-ATSMpre). Following radiotherapy regimens of 50 Gy in 10 fractions, veterinary patients underwent FDG PET/CT scans at three months (FDGpost). Regression of standardized uptake values in baseline FDGpre, FLTpre and Cu-ATSMpre tumour voxels to those in FDGpost images was performed for linear, log-linear, generalized-linear and mixed-fit linear models. Goodness-of-fit in regression coefficients was assessed by R². Hypothesis testing of coefficients over the patient population was performed. Results: Multivariate linear model fits of FDGpre to FDGpost were significantly positive over the population (FDGpost ~ 0.17 FDGpre, p=0.03), and classified slopes of RECIST non-responders and responders to be different (0.37 vs. 0.07, p=0.01). Generalized-linear model fits related FDGpre to FDGpost by a linear power law (FDGpost ~ FDGpre^0.93, p<0.001). Univariate mixture model fits of FDGpre improved R² from 0.17 to 0.52. Neither baseline FLT PET nor Cu-ATSM PET uptake contributed statistically significant multivariate regression coefficients. Conclusions: Spatially resolved regression analysis indicates that pre-treatment FDG PET uptake is most strongly associated with three-month post-treatment FDG PET uptake in this patient population, though associations are histopathology-dependent. PMID:22682748
Amen, Daniel G; Willeumier, Kristen; Omalu, Bennet; Newberg, Andrew; Raghavendra, Cauligi; Raji, Cyrus A
2016-04-25
National Football League (NFL) players are exposed to multiple head collisions during their careers. Increasing awareness of the adverse long-term effects of repetitive head trauma has raised substantial concern among players, medical professionals, and the general public. To determine whether low perfusion in specific brain regions on neuroimaging can accurately separate professional football players from healthy controls. A cohort of retired and current NFL players (n = 161) were recruited in a longitudinal study starting in 2009 with ongoing interval follow up. A healthy control group (n = 124) was separately recruited for comparison. Assessments included medical examinations, neuropsychological tests, and perfusion neuroimaging with single photon emission computed tomography (SPECT). Perfusion estimates of each scan were quantified using a standard atlas. We hypothesized that hypoperfusion particularly in the orbital frontal, anterior cingulate, anterior temporal, hippocampal, amygdala, insular, caudate, superior/mid occipital, and cerebellar sub-regions alone would reliably separate controls from NFL players. Cerebral perfusion differences were calculated using a one-way ANOVA and diagnostic separation was determined with discriminant and automatic linear regression predictive models. NFL players showed lower cerebral perfusion on average (p < 0.01) in 36 brain regions. The discriminant analysis subsequently distinguished NFL players from controls with 90% sensitivity, 86% specificity, and 94% accuracy (95% CI 95-99). Automatic linear modeling achieved similar results. Inclusion of age and clinical co-morbidities did not improve diagnostic classification. Specific brain regions commonly damaged in traumatic brain injury show abnormally low perfusion on SPECT in professional NFL players. These same regions alone can distinguish this group from healthy subjects with high diagnostic accuracy. 
This study carries implications for the neurological safety of NFL players.
Amen, Daniel G.; Willeumier, Kristen; Omalu, Bennet; Newberg, Andrew; Raghavendra, Cauligi; Raji, Cyrus A.
2016-01-01
Background: National Football League (NFL) players are exposed to multiple head collisions during their careers. Increasing awareness of the adverse long-term effects of repetitive head trauma has raised substantial concern among players, medical professionals, and the general public. Objective: To determine whether low perfusion in specific brain regions on neuroimaging can accurately separate professional football players from healthy controls. Method: A cohort of retired and current NFL players (n = 161) were recruited in a longitudinal study starting in 2009 with ongoing interval follow up. A healthy control group (n = 124) was separately recruited for comparison. Assessments included medical examinations, neuropsychological tests, and perfusion neuroimaging with single photon emission computed tomography (SPECT). Perfusion estimates of each scan were quantified using a standard atlas. We hypothesized that hypoperfusion particularly in the orbital frontal, anterior cingulate, anterior temporal, hippocampal, amygdala, insular, caudate, superior/mid occipital, and cerebellar sub-regions alone would reliably separate controls from NFL players. Cerebral perfusion differences were calculated using a one-way ANOVA and diagnostic separation was determined with discriminant and automatic linear regression predictive models. Results: NFL players showed lower cerebral perfusion on average (p < 0.01) in 36 brain regions. The discriminant analysis subsequently distinguished NFL players from controls with 90% sensitivity, 86% specificity, and 94% accuracy (95% CI 95-99). Automatic linear modeling achieved similar results. Inclusion of age and clinical co-morbidities did not improve diagnostic classification. Conclusion: Specific brain regions commonly damaged in traumatic brain injury show abnormally low perfusion on SPECT in professional NFL players. These same regions alone can distinguish this group from healthy subjects with high diagnostic accuracy. 
This study carries implications for the neurological safety of NFL players. PMID:27128374
Linear regression analysis of survival data with missing censoring indicators.
Wang, Qihua; Dinse, Gregg E
2011-04-01
Linear regression analysis has been studied extensively in a random censorship setting, but typically all of the censoring indicators are assumed to be observed. In this paper, we develop synthetic data methods for estimating regression parameters in a linear model when some censoring indicators are missing. We define estimators based on regression calibration, imputation, and inverse probability weighting techniques, and we prove all three estimators are asymptotically normal. The finite-sample performance of each estimator is evaluated via simulation. We illustrate our methods by assessing the effects of sex and age on the time to non-ambulatory progression for patients in a brain cancer clinical trial.
An Analysis of COLA (Cost of Living Adjustment) Allocation within the United States Coast Guard.
1983-09-01
books Applied Linear Regression [Ref. 39], and Statistical Methods in Research and Production [Ref. 40], or any other book on regression. In the event...Indexes, Master’s Thesis, Air Force Institute of Technology, Wright-Patterson AFB, 1976. 39. Weisberg, Sanford, Applied Linear Regression, Wiley, 1980. 40
Testing hypotheses for differences between linear regression lines
Stanley J. Zarnoch
2009-01-01
Five hypotheses are identified for testing differences between simple linear regression lines. The distinctions between these hypotheses are based on a priori assumptions and illustrated with full and reduced models. The contrast approach is presented as an easy and complete method for testing for overall differences between the regressions and for making pairwise...
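The full-versus-reduced-model comparison can be illustrated for one common case: testing whether two groups share a single line (coincident lines) against the alternative that each group has its own slope and intercept. This sketch is an assumption about how such a test is typically set up and is not taken from the paper; the function names are hypothetical. The full model has four parameters, the reduced model two, so the statistic is referred to an F distribution with 2 and n - 4 degrees of freedom.

```python
def fit_rss(x, y):
    """Fit y = a + b*x by ordinary least squares; return the residual sum of squares."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


def coincidence_f(x1, y1, x2, y2):
    """F statistic for H0: both groups follow one common regression line."""
    # Full model: separate slope and intercept per group (4 parameters).
    rss_full = fit_rss(x1, y1) + fit_rss(x2, y2)
    # Reduced model: one line fitted to the pooled data (2 parameters).
    rss_reduced = fit_rss(list(x1) + list(x2), list(y1) + list(y2))
    df_full = len(x1) + len(x2) - 4
    return ((rss_reduced - rss_full) / 2) / (rss_full / df_full)
```

Testing equal slopes alone uses the same recipe with a reduced model that keeps separate intercepts but a common slope, which changes the numerator degrees of freedom to 1.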
Graphical Description of Johnson-Neyman Outcomes for Linear and Quadratic Regression Surfaces.
ERIC Educational Resources Information Center
Schafer, William D.; Wang, Yuh-Yin
A modification of the usual graphical representation of heterogeneous regressions is described that can aid in interpreting significant regions for linear or quadratic surfaces. The standard Johnson-Neyman graph is a bivariate plot with the criterion variable on the ordinate and the predictor variable on the abscissa. Regression surfaces are drawn…
Teaching the Concept of Breakdown Point in Simple Linear Regression.
ERIC Educational Resources Information Center
Chan, Wai-Sum
2001-01-01
Most introductory textbooks on simple linear regression analysis mention the fact that extreme data points have a great influence on ordinary least-squares regression estimation; however, not many textbooks provide a rigorous mathematical explanation of this phenomenon. Suggests a way to fill this gap by teaching students the concept of breakdown…
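The influence of a single extreme point can be shown numerically. The sketch below is not from the article: it contrasts the ordinary least-squares slope, whose breakdown point is zero in the limit of large samples, with the Theil-Sen slope (the median of all pairwise slopes), a standard robust alternative chosen here purely as an illustration. The data are synthetic.

```python
from itertools import combinations
from statistics import median


def ols_slope(x, y):
    """Slope of the ordinary least-squares line."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def theil_sen_slope(x, y):
    """Median of the slopes through every pair of points with distinct x."""
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]
    ]
    return median(slopes)


# Synthetic data on the line y = 2x, then the same data with one corrupted value.
x = [float(i) for i in range(10)]
y_clean = [2.0 * xi for xi in x]
y_bad = y_clean[:-1] + [1e6]

ols_clean, ols_bad = ols_slope(x, y_clean), ols_slope(x, y_bad)
ts_clean, ts_bad = theil_sen_slope(x, y_clean), theil_sen_slope(x, y_bad)
```

Making the corrupted value larger moves the least-squares slope without bound, while the median of pairwise slopes is unchanged as long as fewer than half of the pairwise slopes are contaminated.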
Method and Apparatus for Separating Particles by Dielectrophoresis
NASA Technical Reports Server (NTRS)
Pant, Kapil (Inventor); Wang, Yi (Inventor); Bhatt, Ketan (Inventor); Prabhakarpandian, Balabhasker (Inventor)
2014-01-01
Particle separation apparatus separate particles and particle populations using dielectrophoretic (DEP) forces generated by one or more pairs of electrically coupled electrodes separated by a gap. Particles suspended in a fluid are separated by DEP forces generated by the at least one electrode pair at the gap as they travel over a separation zone comprising the electrode pair. Selected particles are deflected relative to the flow of incoming particles by DEP forces that are affected by controlling applied potential, gap width, and the angle of linear gaps with respect to fluid flow. The gap between an electrode pair may be a single linear gap of constant width, a single linear gap having variable width, or be in the form of two or more linear gaps having constant or variable gap width and different angles with respect to one another and to the flow.
Estimating monotonic rates from biological data using local linear regression.
Olito, Colin; White, Craig R; Marshall, Dustin J; Barneche, Diego R
2017-03-01
Accessing many fundamental questions in biology begins with empirical estimation of simple monotonic rates of underlying biological processes. Across a variety of disciplines, ranging from physiology to biogeochemistry, these rates are routinely estimated from non-linear and noisy time series data using linear regression and ad hoc manual truncation of non-linearities. Here, we introduce the R package LoLinR, a flexible toolkit to implement local linear regression techniques to objectively and reproducibly estimate monotonic biological rates from non-linear time series data, and demonstrate possible applications using metabolic rate data. LoLinR provides methods to easily and reliably estimate monotonic rates from time series data in a way that is statistically robust, facilitates reproducible research and is applicable to a wide variety of research disciplines in the biological sciences. © 2017. Published by The Company of Biologists Ltd.
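The general idea of replacing manual truncation with an objective search over local fits can be sketched as follows. This is not LoLinR's algorithm or API (the package is written in R and uses its own ranking of local regressions); the function names, the fixed window width, and the use of R² as the selection criterion are all assumptions made for illustration.

```python
import math


def ols(x, y):
    """Return (slope, intercept, r_squared) of the least-squares line."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # A perfectly flat window carries no rate information; score it 0.
    r2 = 1.0 - rss / syy if syy > 0 else 0.0
    return slope, intercept, r2


def best_local_slope(t, y, width):
    """Fit every contiguous window of `width` points; return (r2, start, slope) of the best."""
    best = None
    for start in range(len(t) - width + 1):
        slope, _, r2 = ols(t[start:start + width], y[start:start + width])
        if best is None or r2 > best[0]:
            best = (r2, start, slope)
    return best


# Synthetic trace: linear decline at rate -0.5 up to t = 10, then exponential levelling off.
t = [float(i) for i in range(20)]
y = [10 - 0.5 * ti if ti <= 10 else 5 * math.exp(-0.5 * (ti - 10)) for ti in t]
r2, start, slope = best_local_slope(t, y, width=6)
```

Because every candidate window is scored by the same rule, the choice of the linear region is reproducible rather than dependent on where an analyst decides to cut the series.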
Li, Cun-Yu; Wu, Xin; Gu, Jia-Mei; Li, Hong-Yang; Peng, Guo-Ping
2018-04-01
Based on the molecular sieving and solution-diffusion effect in nanofiltration separation, the correlation between initial concentration and mass transfer coefficient of three typical phenolic acids from Salvia miltiorrhiza was fitted to analyze the relationship among mass transfer coefficient, molecular weight and concentration. The experiment showed a linear relationship between operation pressure and membrane flux. Meanwhile, the membrane flux gradually decayed with increasing solute concentration. On the basis of the molecular sieving and solution-diffusion effect, the mass transfer coefficient and initial concentration of the three phenolic acids showed a power function relationship, and the regression coefficients were all greater than 0.9. The mass transfer coefficient and molecular weight of the three phenolic acids were negatively correlated with each other, and the order from high to low was protocatechualdehyde > rosmarinic acid > salvianolic acid B. The separation mechanism of nanofiltration for phenolic acids was further clarified through the analysis of the correlation of molecular weight and nanofiltration mass transfer coefficient. The findings provide references for nanofiltration separation, especially for traditional Chinese medicine with phenolic acids. Copyright© by the Chinese Pharmaceutical Association.
Locally linear regression for pose-invariant face recognition.
Chai, Xiujuan; Shan, Shiguang; Chen, Xilin; Gao, Wen
2007-07-01
The variation of facial appearance due to the viewpoint (/pose) degrades face recognition systems considerably, which is one of the bottlenecks in face recognition. One of the possible solutions is generating virtual frontal view from any given nonfrontal view to obtain a virtual gallery/probe face. Following this idea, this paper proposes a simple, but efficient, novel locally linear regression (LLR) method, which generates the virtual frontal view from a given nonfrontal face image. We first justify the basic assumption of the paper that there exists an approximate linear mapping between a nonfrontal face image and its frontal counterpart. Then, by formulating the estimation of the linear mapping as a prediction problem, we present the regression-based solution, i.e., globally linear regression. To improve the prediction accuracy in the case of coarse alignment, LLR is further proposed. In LLR, we first perform dense sampling in the nonfrontal face image to obtain many overlapped local patches. Then, the linear regression technique is applied to each small patch for the prediction of its virtual frontal patch. Through the combination of all these patches, the virtual frontal view is generated. The experimental results on the CMU PIE database show distinct advantage of the proposed method over Eigen light-field method.
Quantitative analysis of the major constituents of St John's wort with HPLC-ESI-MS.
Chandrasekera, Dhammitha H; Welham, Kevin J; Ashton, David; Middleton, Richard; Heinrich, Michael
2005-12-01
A method was developed to profile the major constituents of St John's wort extracts using high-performance liquid chromatography-electrospray mass spectrometry (HPLC-ESI-MS). The objective was to simultaneously separate, identify and quantify hyperforin, hypericin, pseudohypericin, rutin, hyperoside, isoquercetrin, quercitrin and chlorogenic acid using HPLC-MS. Quantification was performed using an external standardisation method with reference standards. The method consisted of two protocols: one for the analysis of flavonoids and glycosides and the other for the analysis of the more lipophilic hypericins and hyperforin. Both protocols used a reverse phase Luna phenyl hexyl column. The separation of the flavonoids and glycosides was achieved within 35 min and that of the hypericins and hyperforin within 9 min. The linear response range in ESI-MS was established for each compound and all had linear regression coefficient values greater than 0.97. Both protocols proved to be very specific for the constituents analysed. MS analysis showed no other signals within the analyte peaks. The method was robust and applicable to alcoholic tinctures, tablet/capsule extracts in various solvents and herb extracts. The method was applied to evaluate the phytopharmaceutical quality of St John's wort preparations available in the UK in order to test the method and investigate if they contain at least the main constituents and at what concentrations.
Effect of Malmquist bias on correlation studies with IRAS data base
NASA Technical Reports Server (NTRS)
Verter, Frances
1993-01-01
The relationships between galaxy properties in the sample of Trinchieri et al. (1989) are reexamined with corrections for Malmquist bias. The linear correlations are tested and linear regressions are fit for log-log plots of L(FIR), L(H-alpha), and L(B) as well as ratios of these quantities. The linear correlations are corrected for Malmquist bias using the method of Verter (1988), in which each galaxy observation is weighted by the inverse of its sampling volume. The linear regressions are corrected for Malmquist bias by a new method invented here in which each galaxy observation is weighted by its sampling volume. The results of correlation and regressions among the sample are significantly changed in the anticipated sense that the corrected correlation confidences are lower and the corrected slopes of the linear regressions are lower. The elimination of Malmquist bias eliminates the nonlinear rise in luminosity that has caused some authors to hypothesize additional components of FIR emission.
A primer for biomedical scientists on how to execute model II linear regression analysis.
Ludbrook, John
2012-04-01
1. There are two very different ways of executing linear regression analysis. One is Model I, when the x-values are fixed by the experimenter. The other is Model II, in which the x-values are free to vary and are subject to error. 2. I have received numerous complaints from biomedical scientists that they have great difficulty in executing Model II linear regression analysis. This may explain the results of a Google Scholar search, which showed that the authors of articles in journals of physiology, pharmacology and biochemistry rarely use Model II regression analysis. 3. I repeat my previous arguments in favour of using least products linear regression analysis for Model II regressions. I review three methods for executing ordinary least products (OLP) and weighted least products (WLP) regression analysis: (i) scientific calculator and/or computer spreadsheet; (ii) specific purpose computer programs; and (iii) general purpose computer programs. 4. Using a scientific calculator and/or computer spreadsheet, it is easy to obtain correct values for OLP slope and intercept, but the corresponding 95% confidence intervals (CI) are inaccurate. 5. Using specific purpose computer programs, the freeware computer program smatr gives the correct OLP regression coefficients and obtains 95% CI by bootstrapping. In addition, smatr can be used to compare the slopes of OLP lines. 6. When using general purpose computer programs, I recommend the commercial programs systat and Statistica for those who regularly undertake linear regression analysis and I give step-by-step instructions in the Supplementary Information as to how to use loss functions. © 2011 The Author. Clinical and Experimental Pharmacology and Physiology. © 2011 Blackwell Publishing Asia Pty Ltd.
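The point estimates for ordinary least products regression are simple to compute, which is consistent with the remark that a calculator or spreadsheet gives correct values for the slope and intercept. In this minimal sketch the OLP slope is taken as the ratio of the standard deviations of y and x, signed by the correlation, which is the standard definition of the least products (geometric mean) line; the function name `olp_fit` is hypothetical, and confidence intervals are deliberately left out because the abstract notes that simple formulas for them are inaccurate.

```python
import math


def olp_fit(x, y):
    """Ordinary least products (Model II) line; returns (slope, intercept)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    # Slope magnitude is sd(y)/sd(x); its sign follows the sign of the covariance.
    slope = math.copysign(math.sqrt(syy / sxx), sxy)
    return slope, my - slope * mx
```

Unlike the Model I least-squares line, this fit is symmetric: regressing x on y gives the reciprocal slope, so the fitted line does not depend on which variable is treated as the predictor.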
Cronin, Matthew A.; Amstrup, Steven C.; Durner, George M.; Noel, Lynn E.; McDonald, Trent L.; Ballard, Warren B.
1998-01-01
There is concern that caribou (Rangifer tarandus) may avoid roads and facilities (i.e., infrastructure) in the Prudhoe Bay oil field (PBOF) in northern Alaska, and that this avoidance can have negative effects on the animals. We quantified the relationship between caribou distribution and PBOF infrastructure during the post-calving period (mid-June to mid-August) with aerial surveys from 1990 to 1995. We conducted four to eight surveys per year with complete coverage of the PBOF. We identified active oil field infrastructure and used a geographic information system (GIS) to construct ten 1 km wide concentric intervals surrounding the infrastructure. We tested whether caribou distribution is related to distance from infrastructure with a chi-squared habitat utilization-availability analysis and log-linear regression. We considered bulls, calves, and total caribou of all sex/age classes separately. The habitat utilization-availability analysis indicated there was no consistent trend of attraction to or avoidance of infrastructure. Caribou frequently were more abundant than expected in the intervals close to infrastructure, and this trend was more pronounced for bulls and for total caribou of all sex/age classes than for calves. Log-linear regression (with Poisson error structure) of numbers of caribou and distance from infrastructure was also done, with and without combining data into the 1 km distance intervals. The analysis without intervals revealed no relationship between caribou distribution and distance from oil field infrastructure, or between caribou distribution and Julian date, year, or distance from the Beaufort Sea coast. The log-linear regression with caribou combined into distance intervals showed the density of bulls and total caribou of all sex/age classes declined with distance from infrastructure. 
Our results indicate that during the post-calving period: 1) caribou distribution is largely unrelated to distance from infrastructure; 2) caribou regularly use habitats in the PBOF; 3) caribou often occur close to infrastructure; and 4) caribou do not appear to avoid oil field infrastructure.
Physical Function in Older Men With Hyperkyphosis
Harrison, Stephanie L.; Fink, Howard A.; Marshall, Lynn M.; Orwoll, Eric; Barrett-Connor, Elizabeth; Cawthon, Peggy M.; Kado, Deborah M.
2015-01-01
Background. Age-related hyperkyphosis has been associated with poor physical function and is a well-established predictor of adverse health outcomes in older women, but its impact on health in older men is less well understood. Methods. We conducted a cross-sectional study to evaluate the association of hyperkyphosis and physical function in 2,363 men, aged 71–98 (M = 79) from the Osteoporotic Fractures in Men Study. Kyphosis was measured using the Rancho Bernardo Study block method. Measurements of grip strength and lower extremity function, including gait speed over 6 m, narrow walk (measure of dynamic balance), repeated chair stands ability and time, and lower extremity power (Nottingham Power Rig) were included separately as primary outcomes. We investigated associations of kyphosis and each outcome in age-adjusted and multivariable linear or logistic regression models, controlling for age, clinic, education, race, bone mineral density, height, weight, diabetes, and physical activity. Results. In multivariate linear regression, we observed a dose-related response of worse scores on each lower extremity physical function test as number of blocks increased, p for trend ≤.001. Using a cutoff of ≥4 blocks, 20% (N = 469) of men were characterized with hyperkyphosis. In multivariate logistic regression, men with hyperkyphosis had increased odds (range 1.5–1.8) of being in the worst quartile of performing lower extremity physical function tasks (p < .001 for each outcome). Kyphosis was not associated with grip strength in any multivariate analysis. Conclusions. Hyperkyphosis is associated with impaired lower extremity physical function in older men. Further studies are needed to determine the direction of causality. PMID:25431353
Monopole and dipole estimation for multi-frequency sky maps by linear regression
NASA Astrophysics Data System (ADS)
Wehus, I. K.; Fuskeland, U.; Eriksen, H. K.; Banday, A. J.; Dickinson, C.; Ghosh, T.; Górski, K. M.; Lawrence, C. R.; Leahy, J. P.; Maino, D.; Reich, P.; Reich, W.
2017-01-01
We describe a simple but efficient method for deriving a consistent set of monopole and dipole corrections for multi-frequency sky map data sets, allowing robust parametric component separation with the same data set. The computational core of this method is linear regression between pairs of frequency maps, often called T-T plots. Individual contributions from monopole and dipole terms are determined by performing the regression locally in patches on the sky, while the degeneracy between different frequencies is lifted whenever the dominant foreground component exhibits a significant spatial spectral index variation. Based on this method, we present two different, but each internally consistent, sets of monopole and dipole coefficients for the nine-year WMAP, Planck 2013, SFD 100 μm, Haslam 408 MHz and Reich & Reich 1420 MHz maps. The two sets have been derived with different analysis assumptions and data selection, and provide an estimate of residual systematic uncertainties. In general, our values are in good agreement with previously published results. Among the most notable results are a relative dipole between the WMAP and Planck experiments of 10-15μK (depending on frequency), an estimate of the 408 MHz map monopole of 8.9 ± 1.3 K, and a non-zero dipole in the 1420 MHz map of 0.15 ± 0.03 K pointing towards Galactic coordinates (l,b) = (308°,-36°) ± 14°. These values represent the sum of any instrumental and data processing offsets, as well as any Galactic or extra-Galactic component that is spectrally uniform over the full sky.
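The T-T plot regression described above can be illustrated with a minimal sketch: within a single sky patch, the pixel values of one frequency map are regressed on those of another, so that the slope reflects the spectral scaling of the dominant foreground and the intercept reflects the relative offset between the maps. The function name `tt_regression`, the synthetic patch, and all numerical values are assumptions made for illustration; this is not the authors' pipeline.

```python
import numpy as np

def tt_regression(map_a, map_b):
    """Fit map_b ~ slope * map_a + offset over the pixels of one patch."""
    slope, offset = np.polyfit(map_a, map_b, deg=1)
    return slope, offset

# Synthetic patch (assumed values): one foreground signal observed at two
# frequencies, with a scaling of 0.5 and a relative offset of 3.0.
rng = np.random.default_rng(0)
foreground = rng.uniform(10.0, 50.0, size=500)
map_a = foreground
map_b = 0.5 * foreground + 3.0 + rng.normal(0.0, 0.1, size=500)

slope, offset = tt_regression(map_a, map_b)
```

Repeating such a fit patch by patch is what would allow offsets to be compared across the sky; how the patches are chosen and combined is left out of this sketch.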
ERIC Educational Resources Information Center
Rocconi, Louis M.
2013-01-01
This study examined the differing conclusions one may come to depending upon the type of analysis chosen, hierarchical linear modeling or ordinary least squares (OLS) regression. To illustrate this point, this study examined the influences of seniors' self-reported critical thinking abilities three ways: (1) an OLS regression with the student…
Linear separability in superordinate natural language concepts.
Ruts, Wim; Storms, Gert; Hampton, James
2004-01-01
Two experiments are reported in which linear separability was investigated in superordinate natural language concept pairs (e.g., toiletry-sewing gear). Representations of the exemplars of semantically related concept pairs were derived in two to five dimensions using multidimensional scaling (MDS) of similarities based on possession of the concept features. Next, category membership, obtained from an exemplar generation study (in Experiment 1) and from a forced-choice classification task (in Experiment 2) was predicted from the coordinates of the MDS representation using log linear analysis. The results showed that all natural kind concept pairs were perfectly linearly separable, whereas artifact concept pairs showed several violations. Clear linear separability of natural language concept pairs is in line with independent cue models. The violations in the artifact pairs, however, yield clear evidence against the independent cue models.
ERIC Educational Resources Information Center
Rocconi, Louis M.
2011-01-01
Hierarchical linear models (HLM) solve the problems associated with the unit of analysis problem such as misestimated standard errors, heterogeneity of regression and aggregation bias by modeling all levels of interest simultaneously. Hierarchical linear modeling resolves the problem of misestimated standard errors by incorporating a unique random…
ERIC Educational Resources Information Center
Preacher, Kristopher J.; Curran, Patrick J.; Bauer, Daniel J.
2006-01-01
Simple slopes, regions of significance, and confidence bands are commonly used to evaluate interactions in multiple linear regression (MLR) models, and the use of these techniques has recently been extended to multilevel or hierarchical linear modeling (HLM) and latent curve analysis (LCA). However, conducting these tests and plotting the…
Classical Testing in Functional Linear Models
Kong, Dehan; Staicu, Ana-Maria; Maity, Arnab
2016-01-01
We extend four tests common in classical regression - Wald, score, likelihood ratio and F tests - to functional linear regression, for testing the null hypothesis, that there is no association between a scalar response and a functional covariate. Using functional principal component analysis, we re-express the functional linear model as a standard linear model, where the effect of the functional covariate can be approximated by a finite linear combination of the functional principal component scores. In this setting, we consider application of the four traditional tests. The proposed testing procedures are investigated theoretically for densely observed functional covariates when the number of principal components diverges. Using the theoretical distribution of the tests under the alternative hypothesis, we develop a procedure for sample size calculation in the context of functional linear regression. The four tests are further compared numerically for both densely and sparsely observed noisy functional data in simulation experiments and using two real data applications. PMID:28955155
Breakfast intake among adults with type 2 diabetes: is bigger better?
Jarvandi, Soghra; Schootman, Mario; Racette, Susan B.
2015-01-01
Objective To assess the association between breakfast energy and total daily energy intake among individuals with type 2 diabetes. Design Cross-sectional study. Daily energy intake was computed from a 24-h dietary recall. Multiple regression models were used to estimate the association between daily energy intake (dependent variable) and quartiles of energy intake at breakfast (independent variable) expressed as either absolute or relative (% of total daily energy intake) terms. Orthogonal polynomial contrasts were used to test for linear and quadratic trends. Models were controlled for sex, age, race/ethnicity, body mass index, physical activity and smoking. In addition, we used separate multiple regression models to test the effect of quartiles of absolute and relative breakfast energy on intake at lunch, dinner, and snacks. Setting The 1999–2004 National Health and Nutrition Examination Survey (NHANES). Subjects Participants aged ≥ 30 years with self-reported history of diabetes (N = 1,146). Results Daily energy intake increased as absolute breakfast energy intake increased (linear trend, P < 0.0001; quadratic trend, P = 0.02), but decreased as relative breakfast energy intake increased (linear trend, P < 0.0001). In addition, while higher quartiles of absolute breakfast intake had no associations with energy intake at subsequent meals, higher quartiles of relative breakfast intake were associated with lower energy intake during all subsequent meals and snacks (P < 0.05). Conclusions Consuming a breakfast that provided less energy or comprised a greater proportion of daily energy intake was associated with lower total daily energy intake in adults with type 2 diabetes. PMID:25529061
Cabral, Ana Caroline; Stark, Jonathan S; Kolm, Hedda E; Martins, César C
2018-04-01
Sewage input and the relationship between chemical markers (linear alkylbenzenes and coprostanol) and fecal indicator bacteria (FIB; Escherichia coli and enterococci) were evaluated in order to establish threshold values for chemical markers in suspended particulate matter (SPM) as indicators of sewage contamination in two subtropical estuaries in South Atlantic Brazil. Neither chemical marker showed a linear relationship with FIB, owing to high spatial microbiological variability. However, microbiological water quality was related to coprostanol values when analyzed by logistic regression, indicating that linear models may not be the best representation of the relationship between the two classes of indicators. Logistic regression was performed with all data and separately for two sampling seasons, using 800 and 100 MPN 100 mL⁻¹ of E. coli and enterococci, respectively, as the microbiological limits of sewage contamination. Threshold values of coprostanol varied depending on the FIB and season, ranging between 1.00 and 2.23 μg g⁻¹ SPM. The threshold values of coprostanol for SPM are higher and more variable than those suggested in the literature for sediments (0.10-0.50 μg g⁻¹), probably due to the higher concentration of coprostanol in SPM than in sediment. Temperature may affect the relationship between microbiological indicators and coprostanol, since the threshold value of coprostanol found here was similar to that of tropical areas but lower than those found during winter in temperate areas, reinforcing the idea that threshold values should be calibrated for different climatic conditions. Copyright © 2018 Elsevier Ltd. All rights reserved.
Musuku, Adrien; Tan, Aimin; Awaiye, Kayode; Trabelsi, Fethi
2013-09-01
Linear calibration is usually performed using eight to ten calibration concentration levels in regulated LC-MS bioanalysis because a minimum of six are specified in regulatory guidelines. However, we have previously reported that two-concentration linear calibration is as reliable as or even better than using multiple concentrations. The purpose of this research is to compare two-concentration with multiple-concentration linear calibration through retrospective data analysis of multiple bioanalytical projects that were conducted in an independent regulated bioanalytical laboratory. A total of 12 bioanalytical projects were randomly selected: two validations and two studies for each of the three most commonly used types of sample extraction methods (protein precipitation, liquid-liquid extraction, solid-phase extraction). When the existing data were retrospectively linearly regressed using only the lowest and the highest concentration levels, no extra batch failure/QC rejection was observed and the differences in accuracy and precision between the original multi-concentration regression and the new two-concentration linear regression are negligible. Specifically, the differences in overall mean apparent bias (square root of mean individual bias squares) are within the ranges of -0.3% to 0.7% and 0.1-0.7% for the validations and studies, respectively. The differences in mean QC concentrations are within the ranges of -0.6% to 1.8% and -0.8% to 2.5% for the validations and studies, respectively. The differences in %CV are within the ranges of -0.7% to 0.9% and -0.3% to 0.6% for the validations and studies, respectively. The average differences in study sample concentrations are within the range of -0.8% to 2.3%. With two-concentration linear regression, an average of 13% of time and cost could have been saved for each batch together with 53% of saving in the lead-in for each project (the preparation of working standard solutions, spiking, and aliquoting). 
Furthermore, examples are given as how to evaluate the linearity over the entire concentration range when only two concentration levels are used for linear regression. To conclude, two-concentration linear regression is accurate and robust enough for routine use in regulated LC-MS bioanalysis and it significantly saves time and cost as well. Copyright © 2013 Elsevier B.V. All rights reserved.
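As a rough illustration of two-concentration linear calibration, the sketch below draws the calibration line through the lowest and highest standards only and then back-calculates a sample concentration from its response. The function names, the standard concentrations, and the instrument responses are hypothetical values chosen for the example, not data from the study.

```python
def two_point_calibration(conc_low, resp_low, conc_high, resp_high):
    """Return slope and intercept of the line through the two standards."""
    slope = (resp_high - resp_low) / (conc_high - conc_low)
    intercept = resp_low - slope * conc_low
    return slope, intercept

def back_calculate(response, slope, intercept):
    """Convert an instrument response back to a concentration."""
    return (response - intercept) / slope

# Hypothetical standards at 1 and 1000 ng/mL with made-up responses.
slope, intercept = two_point_calibration(1.0, 0.012, 1000.0, 10.002)

# Hypothetical QC sample response, converted to a concentration.
qc_conc = back_calculate(5.007, slope, intercept)
```

Comparing such back-calculated QC concentrations with those from a multi-level regression is the kind of check the retrospective analysis relies on; acceptance criteria are not modeled here.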
A Linear Regression and Markov Chain Model for the Arabian Horse Registry
1993-04-01
... as a tax deduction? (Yes/No) 26. Regardless of previous equine tax deductions, do you consider your current horse activities to be... (Mark one... the Arabian Horse Registry, which needed to forecast its future registration of purebred Arabian horses. A linear regression model was utilized to
An improved multiple linear regression and data analysis computer program package
NASA Technical Reports Server (NTRS)
Sidik, S. M.
1972-01-01
NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.
NASA Astrophysics Data System (ADS)
Kutzbach, L.; Schneider, J.; Sachs, T.; Giebels, M.; Nykänen, H.; Shurpali, N. J.; Martikainen, P. J.; Alm, J.; Wilmking, M.
2007-07-01
Closed (non-steady state) chambers are widely used for quantifying carbon dioxide (CO2) fluxes between soils or low-stature canopies and the atmosphere. It is well recognised that covering a soil or vegetation by a closed chamber inherently disturbs the natural CO2 fluxes by altering the concentration gradients between the soil, the vegetation and the overlying air. Thus, the driving factors of CO2 fluxes are not constant during the closed chamber experiment, and no linear increase or decrease of CO2 concentration over time within the chamber headspace can be expected. Nevertheless, linear regression has been applied for calculating CO2 fluxes in many recent, partly influential, studies. This approach was justified by keeping the closure time short and assuming the concentration change over time to be in the linear range. Here, we test if the application of linear regression is really appropriate for estimating CO2 fluxes using closed chambers over short closure times and if the application of nonlinear regression is necessary. We developed a nonlinear exponential regression model from diffusion and photosynthesis theory. This exponential model was tested with four different datasets of CO2 flux measurements (total number: 1764) conducted at three peatland sites in Finland and a tundra site in Siberia. The flux measurements were performed using transparent chambers on vegetated surfaces and opaque chambers on bare peat surfaces. Thorough analyses of residuals demonstrated that linear regression was frequently not appropriate for the determination of CO2 fluxes by closed-chamber methods, even if closure times were kept short. The developed exponential model was well suited for nonlinear regression of the concentration over time c(t) evolution in the chamber headspace and estimation of the initial CO2 fluxes at closure time for the majority of experiments. 
CO2 flux estimates by linear regression can be as low as 40% of the flux estimates of exponential regression for closure times of only two minutes and even lower for longer closure times. The degree of underestimation increased with increasing CO2 flux strength and is dependent on soil and vegetation conditions which can disturb not only the quantitative but also the qualitative evaluation of CO2 flux dynamics. The underestimation effect by linear regression was observed to be different for CO2 uptake and release situations which can lead to stronger bias in the daily, seasonal and annual CO2 balances than in the individual fluxes. To avoid serious bias of CO2 flux estimates based on closed chamber experiments, we suggest further tests using published datasets and recommend the use of nonlinear regression models for future closed chamber studies.
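The underestimation by linear regression can be reproduced on a noise-free toy curve. The sketch below assumes a generic saturating exponential for the headspace concentration, c(t) = c_inf + (c0 - c_inf) * exp(-k t), which is a simplification chosen for illustration and not the authors' exact parameterization; the parameter values are likewise assumed. The initial flux is proportional to the slope at t = 0, while a straight-line fit over the whole closure period returns a smaller slope.

```python
import numpy as np

# Assumed parameters: start and asymptotic concentration (ppm), rate (1/s).
c0, c_inf, k = 380.0, 800.0, 0.01

# Concentration sampled every 5 s over a two-minute closure.
t = np.arange(0.0, 121.0, 5.0)
c = c_inf + (c0 - c_inf) * np.exp(-k * t)

# True initial rate of change, i.e. the derivative of c(t) at t = 0.
initial_slope = k * (c_inf - c0)

# Slope of an ordinary least-squares straight line through the same data.
linear_slope = np.polyfit(t, c, 1)[0]

# Fraction of the initial slope recovered by the linear fit.
ratio = linear_slope / initial_slope
```

With these assumed values the linear fit recovers only a little over half of the initial slope; the size of the gap depends on the product of `k` and the closure time.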
Breast feeding and resilience against psychosocial stress.
Montgomery, S M; Ehlin, A; Sacker, A
2006-12-01
Some early life exposures may result in a well controlled stress response, which can reduce stress related anxiety. Breast feeding may be a marker of some relevant exposures. To assess whether breast feeding is associated with modification of the relation between parental divorce and anxiety. Observational study using longitudinal birth cohort data. Linear regression was used to assess whether breast feeding modifies the association of parental divorce/separation with anxiety using stratification and interaction testing. Data were obtained from the 1970 British Cohort Study, which is following the lives of those born in one week in 1970 and living in Great Britain. This study uses information collected at birth and at ages 5 and 10 years for 8958 subjects. Class teachers answered a question on anxiety among 10 year olds using an analogue scale (range 0-50) that was log transformed to minimise skewness. Among 5672 non-breast fed subjects, parental divorce/separation was associated with a statistically significantly raised risk of anxiety, with a regression coefficient (95% CI) of 9.4 (6.1 to 12.8). Among the breast fed group this association was much lower: 2.2 (-2.6 to 7.0). Interaction testing confirmed statistically significant effect modification by breast feeding, independent of simultaneous adjustment for multiple potential confounding factors, producing an interaction coefficient of -7.0 (-12.8 to -1.2), indicating a 7% reduction in anxiety after adjustment. Breast feeding is associated with resilience against the psychosocial stress linked with parental divorce/separation. This could be because breast feeding is a marker of exposures related to maternal characteristics and parent-child interaction.
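Interaction testing of the kind described above can be sketched with a linear model that contains an exposure indicator, a modifier indicator, and their product. The data below are made-up values for eight hypothetical subjects and bear no relation to the cohort data; the variable names are assumptions. The coefficient of the product term measures how much the exposure effect changes when the modifier is present.

```python
import numpy as np

# Hypothetical binary indicators and outcomes (not data from the study).
exposure = np.array([0, 0, 1, 1, 0, 0, 1, 1], dtype=float)
modifier = np.array([0, 1, 0, 1, 0, 1, 0, 1], dtype=float)
outcome = np.array([4.8, 6.2, 9.1, 6.9, 5.2, 5.8, 8.9, 7.1])

# Design matrix: intercept, main effects, and the interaction term.
X = np.column_stack([np.ones_like(exposure), exposure, modifier,
                     exposure * modifier])
coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)
b0, b_exposure, b_modifier, b_interaction = coef

# Exposure effect within each stratum of the modifier.
effect_without_modifier = b_exposure
effect_with_modifier = b_exposure + b_interaction
```

A negative `b_interaction` means the exposure effect is smaller in the stratum where the modifier is present; a real analysis would also report its confidence interval and adjust for confounders.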
Ruan, Xiaofang; Zhang, Ruisheng; Yao, Xiaojun; Liu, Mancang; Fan, Botao
2007-03-01
Alkylphenols are a group of persistent pollutants in the environment that can adversely disturb the human endocrine system. It is therefore important to effectively separate and measure the alkylphenols. To guide the chromatographic analysis of these compounds in practice, a quantitative relationship between the molecular structure and the retention time of alkylphenols needs to be developed. In this study, topological, constitutional, geometrical, electrostatic and quantum-chemical descriptors of 44 alkylphenols were calculated using the CODESSA software, and these descriptors were pre-selected using the heuristic method. As a result, a three-descriptor linear model (LM) was developed to describe the relationship between the molecular structure and the retention time of alkylphenols. A non-linear regression model based on a support vector machine (SVM) was also developed using the same three descriptors. The correlation coefficients (R²) for the LM and SVM were 0.98 and 0.92, and the corresponding root-mean-square errors were 0.99 and 2.77, respectively. A comparison of the stability and prediction ability of the two models showed that the linear model was the better method for describing the quantitative relationship between the retention time of alkylphenols and the molecular structure. The results suggest that the linear model could be applied to the chromatographic analysis of alkylphenols with known molecular structural parameters.
Biostatistics Series Module 6: Correlation and Linear Regression
Hazra, Avijit; Gogtay, Nithya
2016-01-01
Correlation and linear regression are the most commonly used techniques for quantifying the association between two numeric variables. Correlation quantifies the strength of the linear relationship between paired variables, expressing this as a correlation coefficient. If both variables x and y are normally distributed, we calculate Pearson's correlation coefficient (r). If normality assumption is not met for one or both variables in a correlation analysis, a rank correlation coefficient, such as Spearman's rho (ρ) may be calculated. A hypothesis test of correlation tests whether the linear relationship between the two variables holds in the underlying population, in which case it returns a P < 0.05. A 95% confidence interval of the correlation coefficient can also be calculated for an idea of the correlation in the population. The value r² denotes the proportion of the variability of the dependent variable y that can be attributed to its linear relation with the independent variable x and is called the coefficient of determination. Linear regression is a technique that attempts to link two correlated variables x and y in the form of a mathematical equation (y = a + bx), such that given the value of one variable the other may be predicted. In general, the method of least squares is applied to obtain the equation of the regression line. Correlation and linear regression analysis are based on certain assumptions pertaining to the data sets. If these assumptions are not met, misleading conclusions may be drawn. The first assumption is that of linear relationship between the two variables. A scatter plot is essential before embarking on any correlation-regression analysis to show that this is indeed the case. Outliers or clustering within data sets can distort the correlation coefficient value. Finally, it is vital to remember that though strong correlation can be a pointer toward causation, the two are not synonymous. PMID:27904175
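The quantities named in this abstract can be computed in a few lines. The sketch below uses a small made-up data set (an assumption for illustration) to obtain Pearson's r, the least-squares line y = a + bx, and the coefficient of determination, and it shows that in simple linear regression the coefficient of determination equals the square of r.

```python
import numpy as np

# Made-up paired observations for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Pearson's correlation coefficient.
r = np.corrcoef(x, y)[0, 1]

# Least-squares regression line y = a + b*x (polyfit returns slope first).
b, a = np.polyfit(x, y, 1)

# Coefficient of determination from the residual and total sums of squares.
y_hat = a + b * x
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot
```

A scatter plot of `x` against `y` with the fitted line overlaid would be the natural companion to these numbers, as the abstract recommends inspecting linearity before interpreting r.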
ERIC Educational Resources Information Center
Quinino, Roberto C.; Reis, Edna A.; Bessegato, Lupercio F.
2013-01-01
This article proposes the use of the coefficient of determination as a statistic for hypothesis testing in multiple linear regression based on distributions acquired by beta sampling. (Contains 3 figures.)
Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses.
Faul, Franz; Erdfelder, Edgar; Buchner, Axel; Lang, Albert-Georg
2009-11-01
G*Power is a free power analysis program for a variety of statistical tests. We present extensions and improvements of the version introduced by Faul, Erdfelder, Lang, and Buchner (2007) in the domain of correlation and regression analyses. In the new version, we have added procedures to analyze the power of tests based on (1) single-sample tetrachoric correlations, (2) comparisons of dependent correlations, (3) bivariate linear regression, (4) multiple linear regression based on the random predictor model, (5) logistic regression, and (6) Poisson regression. We describe these new features and provide a brief introduction to their scope and handling.
Rasmussen, Patrick P.; Gray, John R.; Glysson, G. Douglas; Ziegler, Andrew C.
2009-01-01
In-stream continuous turbidity and streamflow data, calibrated with measured suspended-sediment concentration data, can be used to compute a time series of suspended-sediment concentration and load at a stream site. Development of a simple linear (ordinary least squares) regression model for computing suspended-sediment concentrations from instantaneous turbidity data is the first step in the computation process. If the model standard percentage error (MSPE) of the simple linear regression model meets a minimum criterion, this model should be used to compute a time series of suspended-sediment concentrations. Otherwise, a multiple linear regression model using paired instantaneous turbidity and streamflow data is developed and compared to the simple regression model. If the inclusion of the streamflow variable proves to be statistically significant and the uncertainty associated with the multiple regression model results in an improvement over that for the simple linear model, the turbidity-streamflow multiple linear regression model should be used to compute a suspended-sediment concentration time series. The computed concentration time series is subsequently used with its paired streamflow time series to compute suspended-sediment loads by standard U.S. Geological Survey techniques. Once an acceptable regression model is developed, it can be used to compute suspended-sediment concentration beyond the period of record used in model development with proper ongoing collection and analysis of calibration samples. Regression models to compute suspended-sediment concentrations are generally site specific and should never be considered static, but they represent a set period in a continually dynamic system in which additional data will help verify any change in sediment load, type, and source.
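The model-selection step described above can be sketched as follows: fit a simple regression of suspended-sediment concentration on turbidity, fit a multiple regression that adds streamflow, and compare their uncertainty. The synthetic data, the helper `fit_ols`, and the use of residual standard error as a stand-in for the model standard percentage error are all assumptions for illustration; the significance test on the streamflow coefficient that the abstract calls for is not included.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares with an intercept.

    Returns the coefficients and the residual standard error."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    dof = len(y) - A.shape[1]
    rse = np.sqrt(resid @ resid / dof)
    return coef, rse

# Synthetic calibration samples (assumed values, arbitrary units).
rng = np.random.default_rng(1)
n = 60
turbidity = rng.uniform(5.0, 200.0, n)
streamflow = rng.uniform(1.0, 50.0, n)
ssc = 2.0 * turbidity + 1.5 * streamflow + rng.normal(0.0, 5.0, n)

# Simple model: turbidity only. Multiple model: turbidity and streamflow.
coef_simple, rse_simple = fit_ols(turbidity, ssc)
coef_multi, rse_multi = fit_ols(np.column_stack([turbidity, streamflow]), ssc)

# Prefer the multiple model only if it reduces the residual error.
use_multiple = rse_multi < rse_simple
```

Because the synthetic concentrations were generated with a streamflow term, the multiple model wins here; with real calibration samples the simpler turbidity-only model may be adequate.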
NASA Astrophysics Data System (ADS)
Kutzbach, L.; Schneider, J.; Sachs, T.; Giebels, M.; Nykänen, H.; Shurpali, N. J.; Martikainen, P. J.; Alm, J.; Wilmking, M.
2007-11-01
Closed (non-steady state) chambers are widely used for quantifying carbon dioxide (CO2) fluxes between soils or low-stature canopies and the atmosphere. It is well recognised that covering a soil or vegetation by a closed chamber inherently disturbs the natural CO2 fluxes by altering the concentration gradients between the soil, the vegetation and the overlying air. Thus, the driving factors of CO2 fluxes are not constant during the closed chamber experiment, and no linear increase or decrease of CO2 concentration over time within the chamber headspace can be expected. Nevertheless, linear regression has been applied for calculating CO2 fluxes in many recent, partly influential, studies. This approach has been justified by keeping the closure time short and assuming the concentration change over time to be in the linear range. Here, we test if the application of linear regression is really appropriate for estimating CO2 fluxes using closed chambers over short closure times and if the application of nonlinear regression is necessary. We developed a nonlinear exponential regression model from diffusion and photosynthesis theory. This exponential model was tested with four different datasets of CO2 flux measurements (total number: 1764) conducted at three peatlands sites in Finland and a tundra site in Siberia. Thorough analyses of residuals demonstrated that linear regression was frequently not appropriate for the determination of CO2 fluxes by closed-chamber methods, even if closure times were kept short. The developed exponential model was well suited for nonlinear regression of the concentration over time c(t) evolution in the chamber headspace and estimation of the initial CO2 fluxes at closure time for the majority of experiments. However, a rather large percentage of the exponential regression functions showed curvatures not consistent with the theoretical model which is considered to be caused by violations of the underlying model assumptions. 
Especially the effects of turbulence and pressure disturbances by the chamber deployment are suspected to have caused unexplainable curvatures. CO2 flux estimates by linear regression can be as low as 40% of the flux estimates of exponential regression for closure times of only two minutes. The degree of underestimation increased with increasing CO2 flux strength and was dependent on soil and vegetation conditions which can disturb not only the quantitative but also the qualitative evaluation of CO2 flux dynamics. The underestimation effect by linear regression was observed to be different for CO2 uptake and release situations which can lead to stronger bias in the daily, seasonal and annual CO2 balances than in the individual fluxes. To avoid serious bias of CO2 flux estimates based on closed chamber experiments, we suggest further tests using published datasets and recommend the use of nonlinear regression models for future closed chamber studies.
Separation in Logistic Regression: Causes, Consequences, and Control.
Mansournia, Mohammad Ali; Geroldinger, Angelika; Greenland, Sander; Heinze, Georg
2018-04-01
Separation is encountered in regression models with a discrete outcome (such as logistic regression) where the covariates perfectly predict the outcome. It is most frequent under the same conditions that lead to small-sample and sparse-data bias, such as presence of a rare outcome, rare exposures, highly correlated covariates, or covariates with strong effects. In theory, separation will produce infinite estimates for some coefficients. In practice, however, separation may be unnoticed or mishandled because of software limits in recognizing and handling the problem and in notifying the user. We discuss causes of separation in logistic regression and describe how common software packages deal with it. We then describe methods that remove separation, focusing on the same penalized-likelihood techniques used to address more general sparse-data problems. These methods improve accuracy, avoid software problems, and allow interpretation as Bayesian analyses with weakly informative priors. We discuss likelihood penalties, including some that can be implemented easily with any software package, and their relative advantages and disadvantages. We provide an illustration of ideas and methods using data from a case-control study of contraceptive practices and urinary tract infection.
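The practical consequence of separation can be seen in a toy example. In the sketch below, a single covariate perfectly predicts the outcome, so the unpenalized maximum-likelihood slope keeps growing the longer the optimizer runs, whereas a penalized fit settles at a finite value. The ridge (L2) penalty used here is a simple stand-in chosen for brevity and is an assumption; it is not necessarily one of the penalties the authors recommend. The data and the function `fit_logistic` are likewise hypothetical.

```python
import numpy as np

def fit_logistic(x, y, penalty=0.0, steps=5000, lr=0.1):
    """Gradient ascent on the log-likelihood of a slope-only logistic model.

    An optional L2 penalty (penalty / 2) * beta**2 is subtracted."""
    beta = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-beta * x))
        grad = np.sum((y - p) * x) - penalty * beta
        beta += lr * grad
    return beta

# Perfectly separated data: every negative x has y = 0, every positive x has y = 1.
x = np.array([-3.0, -2.0, -1.0, 1.0, 2.0, 3.0])
y = np.array([0, 0, 0, 1, 1, 1])

# Unpenalized fit: the estimate depends on how long the optimizer runs.
beta_short = fit_logistic(x, y, steps=500)
beta_long = fit_logistic(x, y, steps=5000)

# Penalized fit: the estimate is stable regardless of the number of steps.
beta_penalized_a = fit_logistic(x, y, penalty=1.0, steps=500)
beta_penalized_b = fit_logistic(x, y, penalty=1.0, steps=5000)
```

Software that stops after a fixed number of iterations would report whichever large value it reached, which is one way separation can go unnoticed.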
Overhead longwave infrared hyperspectral material identification using radiometric models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zelinski, M. E.
Material detection algorithms used in hyperspectral data processing are computationally efficient but can produce relatively high numbers of false positives. Material identification performed as a secondary processing step on detected pixels can help separate true and false positives. This paper presents a material identification processing chain for longwave infrared hyperspectral data of solid materials collected from airborne platforms. The algorithms utilize unwhitened radiance data and an iterative algorithm that determines the temperature, humidity, and ozone of the atmospheric profile. Pixel unmixing is done using constrained linear regression and Bayesian Information Criteria for model selection. The resulting product includes an optimal atmospheric profile and full radiance material model that includes material temperature, abundance values, and several fit statistics. A logistic regression method utilizing all model parameters to improve identification is also presented. This paper details the processing chain and provides justification for the algorithms used. Several examples are provided using modeled data at different noise levels.
Global correlation of topographic heights and gravity anomalies
NASA Technical Reports Server (NTRS)
Roufosse, M. C.
1977-01-01
The short wavelength features were obtained by subtracting a calculated 24th-degree-and-order field from observed data written in 1 deg x 1 deg squares. The correlation between the two residual fields was examined by a program of linear regression. When run on a worldwide scale over oceans and continents separately, the program did not exhibit any correlation; this can be explained by the fact that the worldwide autocorrelation function for residual gravity anomalies falls off much faster as a function of distance than does that for residual topographic heights. The situation was different when the program was used in restricted areas, of the order of 5 deg x 5 deg square. For 30% of the world, fair-to-good correlations were observed, mostly over continents. The slopes of the regression lines are proportional to apparent densities, which offer a large spectrum of values that are being interpreted in terms of features in the upper mantle consistent with available heat-flow, gravity, and seismic data.
NASA Astrophysics Data System (ADS)
Mahaboob, B.; Venkateswarlu, B.; Sankar, J. Ravi; Balasiddamuni, P.
2017-11-01
This paper uses matrix calculus techniques to obtain the Nonlinear Least Squares Estimator (NLSE), the Maximum Likelihood Estimator (MLE) and a linear pseudo model for the nonlinear regression model. David Pollard and Peter Radchenko [1] explained analytic techniques to compute the NLSE. However, the present research paper introduces an innovative method to compute the NLSE using principles of multivariate calculus. This study is concerned with very new optimization techniques used to compute the MLE and NLSE. Anh [2] derived the NLSE and MLE of a heteroscedastic regression model. Lemcoff [3] discussed a procedure to get a linear pseudo model for the nonlinear regression model. In this research article a new technique is developed to get the linear pseudo model for the nonlinear regression model using multivariate calculus. The linear pseudo model of Edmond Malinvaud [4] has been explained in a very different way in this paper. David Pollard et al. used empirical process techniques to study the asymptotics of the LSE (least-squares estimation) for the fitting of a nonlinear regression function in 2006. In Jae Myung [13] provided a go conceptual for Maximum likelihood estimation in his work “Tutorial on maximum likelihood estimation
Performance characteristics of LOX-H2, tangential-entry, swirl-coaxial, rocket injectors
NASA Technical Reports Server (NTRS)
Howell, Doug; Petersen, Eric; Clark, Jim
1993-01-01
Development of a high performing swirl-coaxial injector requires an understanding of fundamental performance characteristics. This paper addresses the findings of studies on cold flow atomic characterizations which provided information on the influence of fluid properties and element operating conditions on the produced droplet sprays. These findings are applied to actual rocket conditions. The performance characteristics of swirl-coaxial injection elements under multi-element hot-fire conditions were obtained by analysis of combustion performance data from three separate test series. The injection elements are described and test results are analyzed using multi-variable linear regression. A direct comparison of test results indicated that reduced fuel injection velocity improved injection element performance through improved propellant mixing.
A method for fitting regression splines with varying polynomial order in the linear mixed model.
Edwards, Lloyd J; Stewart, Paul W; MacDougall, James E; Helms, Ronald W
2006-02-15
The linear mixed model has become a widely used tool for longitudinal analysis of continuous variables. The use of regression splines in these models offers the analyst additional flexibility in the formulation of descriptive analyses, exploratory analyses and hypothesis-driven confirmatory analyses. We propose a method for fitting piecewise polynomial regression splines with varying polynomial order in the fixed effects and/or random effects of the linear mixed model. The polynomial segments are explicitly constrained by side conditions for continuity and some smoothness at the points where they join. By using a reparameterization of this explicitly constrained linear mixed model, an implicitly constrained linear mixed model is constructed that simplifies implementation of fixed-knot regression splines. The proposed approach is relatively simple, handles splines in one variable or multiple variables, and can be easily programmed using existing commercial software such as SAS or S-plus. The method is illustrated using two examples: an analysis of longitudinal viral load data from a study of subjects with acute HIV-1 infection and an analysis of 24-hour ambulatory blood pressure profiles.
Peng, Xuejun; Sternberg, Ethan; Dolphin, David
2002-01-01
A method for the separation of benzoporphyrin derivative mono- and diacid (BPDMA, BPDDA) enantiomers by laser induced fluorescence-capillary electrophoresis (LIF-CE) has been developed. By using 300 mM borate buffer, pH 9.2, 25 mM sodium cholate and 10% acetonitrile as electrolyte, +10 kV electrokinetic sampling injection of 2 s and an applied +20 kV voltage across the ends of a 37 cm capillary (30 cm to the detector, 50 μm ID), all six BPD stereoisomers were baseline-separated within 20 min. Formation constants, free electrophoretic and complexation mobilities with borate and cholate were determined based on dynamic complexation capillary electrophoresis theory. The BPD enantiomers can be quantitatively determined in the range of 10⁻² to 10⁻⁵ mg mL⁻¹. The correlation coefficients (r²) of the least-squares linear regression analysis of the BPD enantiomers are in the range of 0.9914-0.9997. Their limits of detection are 2.18-3.5 x 10⁻³ mg mL⁻¹. The relative standard deviations for the separation were 2.90-4.64% (n = 10). In comparison with high-performance liquid chromatography (HPLC), CE has better resolution and efficiency. This separation method was successfully applied to the BPD enantiomers obtained from a matrix of bovine serum and from liposomally formulated material as well as from studies with rat, dog and human microsomes.
Pesonen, Anu-Katriina; Räikkönen, Katri; Feldt, Kimmo; Heinonen, Kati; Osmond, Clive; Phillips, David I W; Barker, David J P; Eriksson, Johan G; Kajantie, Eero
2010-06-01
Animal models have linked early maternal separation with lifelong changes in hypothalamic-pituitary-adrenocortical (HPA) axis activity. Although this is paralleled in human studies, this is often in the context of other life adversities, for example, divorce or adoption, and it is not known whether early separation in the absence of these factors has long term effects on the HPA axis. The Finnish experience in World War II created a natural experiment to test whether separation from a father serving in the armed forces or from both parents due to war evacuation are associated with alterations in HPA axis response to psychosocial stress in late adulthood. 282 subjects (M=63.5 years, SD=2.5), of whom 85 were non-separated, 129 were separated from their father, and 68 were separated from both their caregivers during WWII, were enlisted to participate in a Trier Social Stress Test (TSST), during which we measured salivary cortisol and, for 215 individuals, plasma cortisol and ACTH concentrations. We used mixed models to study whether parental separation is associated with salivary and plasma cortisol or plasma ACTH reactivity, and linear regressions to analyse differences in the baseline, or incremental area under the cortisol or ACTH curves. Participants separated from their father did not differ significantly from non-separated participants. However, those separated from both parents had higher average salivary cortisol and plasma ACTH concentrations across all time points compared to the non-separated group. They also had higher salivary cortisol reactivity to the TSST. Separated women had higher baselines in plasma cortisol and ACTH, whereas men had higher reactivity in response to stress during the TSST. Participants who had experienced the separation in early childhood were more affected than children separated during infancy or school age. Separation from parents during childhood may alter an individual's stress physiology much later in adult life. 
Copyright 2009 Elsevier Ltd. All rights reserved.
GIS Tools to Estimate Average Annual Daily Traffic
DOT National Transportation Integrated Search
2012-06-01
This project presents five tools that were created for a geographical information system to estimate Annual Average Daily Traffic using linear regression. Three of the tools can be used to prepare spatial data for linear regression. One tool can be...
Jose F. Negron; Willis C. Schaupp; Kenneth E. Gibson; John Anhold; Dawn Hansen; Ralph Thier; Phil Mocettini
1999-01-01
Data collected from Douglas-fir stands infected by the Douglas-fir beetle in Wyoming, Montana, Idaho, and Utah, were used to develop models to estimate amount of mortality in terms of basal area killed. Models were built using stepwise linear regression and regression tree approaches. Linear regression models using initial Douglas-fir basal area were built for all...
Ling, Ru; Liu, Jiawang
2011-12-01
To construct a prediction model for health workforce and hospital beds in county hospitals of Hunan by multiple linear regression. We surveyed 16 counties in Hunan, selected by stratified random sampling, using uniform questionnaires, and performed multiple linear regression analysis with 20 indicators selected by literature review. Independent variables in the multiple linear regression model for medical personnel in county hospitals included the counties' urban residents' income, crude death rate, medical beds, business occupancy, professional equipment value, the number of devices valued above 10 000 yuan, fixed assets, long-term debt, medical income, medical expenses, outpatient and emergency visits, hospital visits, actual available bed days, and utilization rate of hospital beds. Independent variables in the multiple linear regression model for county hospital beds included the population aged 65 and above in the counties, disposable income of urban residents, medical personnel of medical institutions in the county area, business occupancy, the total value of professional equipment, fixed assets, long-term debt, medical income, medical expenses, outpatient and emergency visits, hospital visits, actual available bed days, utilization rate of hospital beds, and length of hospitalization. The prediction model shows good explanatory power and fit, and may be used for short- and mid-term forecasting.
Watanabe, Hiroyuki; Miyazaki, Hiroyasu
2006-01-01
Over- and/or under-correction of QT intervals for changes in heart rate may lead to misleading conclusions and/or masking the potential of a drug to prolong the QT interval. This study examines a nonparametric regression model (Loess Smoother) to adjust the QT interval for differences in heart rate, with an improved fitness over a wide range of heart rates. 240 sets of (QT, RR) observations collected from each of 8 conscious and non-treated beagle dogs were used as the materials for investigation. The fitness of the nonparametric regression model to the QT-RR relationship was compared with four models (individual linear regression, common linear regression, and Bazett's and Fridericia's correlation models) with reference to Akaike's Information Criterion (AIC). Residuals were visually assessed. The bias-corrected AIC of the nonparametric regression model was the best of the models examined in this study. Although the parametric models did not fit, the nonparametric regression model improved the fitting at both fast and slow heart rates. The nonparametric regression model is the more flexible method compared with the parametric method. The mathematical fit for linear regression models was unsatisfactory at both fast and slow heart rates, while the nonparametric regression model showed significant improvement at all heart rates in beagle dogs.
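For orientation, the two fixed-exponent heart-rate corrections named in this abstract can be written in a few lines. Bazett's formula divides QT by the square root of RR, and Fridericia's divides it by the cube root, with RR in seconds. The sketch below applies both to synthetic (QT, RR) pairs that are invented for illustration and follow a cube-root law exactly. They are not data from the study, and the helper names are assumptions.

```python
import numpy as np

def bazett(qt_ms, rr_s):
    """Bazett correction: QTc = QT / RR^(1/2), RR in seconds."""
    return qt_ms / np.sqrt(rr_s)

def fridericia(qt_ms, rr_s):
    """Fridericia correction: QTc = QT / RR^(1/3), RR in seconds."""
    return qt_ms / np.cbrt(rr_s)

# Synthetic observations (hypothetical): QT follows 240 * RR^(1/3) exactly.
rr = np.linspace(0.4, 1.2, 9)          # RR intervals in seconds
qt = 240.0 * np.cbrt(rr)               # QT intervals in milliseconds

qtc_bazett = bazett(qt, rr)
qtc_fridericia = fridericia(qt, rr)

# A well-matched correction leaves no residual dependence on RR,
# so the spread of QTc across heart rates should be close to zero.
print("Bazett QTc spread (ms):    ", np.ptp(qtc_bazett))
print("Fridericia QTc spread (ms):", np.ptp(qtc_fridericia))
```

On these synthetic data the Fridericia values are constant while the Bazett values still vary with RR. This is the kind of over- or under-correction that a mismatched fixed-exponent formula can introduce, and it is one motivation for fitting the QT-RR relationship more flexibly.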
Improved estimation of PM2.5 using Lagrangian satellite-measured aerosol optical depth
NASA Astrophysics Data System (ADS)
Olivas Saunders, Rolando
Suspended particulate matter (aerosols) with aerodynamic diameters less than 2.5 μm (PM2.5) has negative effects on human health, plays an important role in climate change and also causes the corrosion of structures by acid deposition. Accurate estimates of PM2.5 concentrations are thus relevant in air quality, epidemiology, cloud microphysics and climate forcing studies. Aerosol optical depth (AOD) retrieved by the Moderate Resolution Imaging Spectroradiometer (MODIS) satellite instrument has been used as an empirical predictor to estimate ground-level concentrations of PM2.5. These estimates usually have large uncertainties and errors. The main objective of this work is to assess the value of using upwind (Lagrangian) MODIS-AOD as predictors in empirical models of PM2.5. The upwind locations of the Lagrangian AOD were estimated using modeled backward air trajectories. Since the specification of an arrival elevation is somewhat arbitrary, trajectories were calculated to arrive at four different elevations at ten measurement sites within the continental United States. A systematic examination revealed trajectory model calculations to be sensitive to starting elevation. With a 500 m difference in starting elevation, the 48-hr mean horizontal separation of trajectory endpoints was 326 km. When the difference in starting elevation was doubled and tripled to 1000 m and 1500 m, the mean horizontal separation of trajectory endpoints approximately doubled and tripled to 627 km and 886 km, respectively. A seasonal dependence of this sensitivity was also found: the smallest mean horizontal separation of trajectory endpoints was exhibited during the summer and the largest separations during the winter. A daily average AOD product was generated and coupled to the trajectory model in order to determine AOD values upwind of the measurement sites during the period 2003-2007.
Empirical models that included in situ AOD and upwind AOD as predictors of PM2.5 were generated by multivariate linear regressions using the least squares method. The multivariate models showed improved performance over the single variable regression (PM2.5 and in situ AOD) models. The statistical significance of the improvement of the multivariate models over the single variable regression models was tested using the extra sum of squares principle. In many cases, even when the R-squared was high for the multivariate models, the improvement over the single models was not statistically significant. The R-squared of these multivariate models varied with respect to seasons, with the best performance occurring during the summer months. A set of seasonal categorical variables was included in the regressions to exploit this variability. The multivariate regression models that included these categorical seasonal variables performed better than the models that didn't account for seasonal variability. Furthermore, 71% of these regressions exhibited improvement over the single variable models that was statistically significant at a 95% confidence level.
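The extra sum of squares principle mentioned above compares a reduced model with a full model that nests it. The statistic is F = ((RSS_reduced - RSS_full) / q) / (RSS_full / (n - p_full)), where q is the number of added predictors and p_full is the number of coefficients in the full model. The sketch below is a minimal version under stated assumptions: the data are synthetic, and the variable names (`aod_local`, `aod_upwind`, `pm25`) and coefficients are invented for illustration, not taken from the study.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares of the ordinary least squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def extra_ss_f(X_reduced, X_full, y):
    """Extra-sum-of-squares F statistic for nested linear models."""
    n, p_full = X_full.shape
    q = p_full - X_reduced.shape[1]            # number of added predictors
    rss_r, rss_f = rss(X_reduced, y), rss(X_full, y)
    return ((rss_r - rss_f) / q) / (rss_f / (n - p_full))

# Synthetic data (hypothetical): PM2.5 depends on both local and upwind AOD.
rng = np.random.default_rng(0)
n = 200
aod_local = rng.uniform(0.0, 1.0, n)
aod_upwind = rng.uniform(0.0, 1.0, n)
pm25 = 5.0 + 20.0 * aod_local + 10.0 * aod_upwind + rng.normal(0.0, 2.0, n)

ones = np.ones(n)
X_reduced = np.column_stack([ones, aod_local])              # single-variable model
X_full = np.column_stack([ones, aod_local, aod_upwind])     # multivariate model

f_stat = extra_ss_f(X_reduced, X_full, pm25)
print("F statistic for adding upwind AOD:", f_stat)
```

The statistic is referred to an F distribution with (q, n - p_full) degrees of freedom. For q = 1 and about 200 observations the 5% critical value is roughly 3.9, so a much larger F indicates that the added predictor improves the fit by more than chance alone would explain.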
Linear regression analysis: part 14 of a series on evaluation of scientific publications.
Schneider, Astrid; Hommel, Gerhard; Blettner, Maria
2010-11-01
Regression analysis is an important statistical method for the analysis of medical data. It enables the identification and characterization of relationships among multiple factors. It also enables the identification of prognostically relevant risk factors and the calculation of risk scores for individual prognostication. This article is based on selected textbooks of statistics, a selective review of the literature, and our own experience. After a brief introduction of the uni- and multivariable regression models, illustrative examples are given to explain what the important considerations are before a regression analysis is performed, and how the results should be interpreted. The reader should then be able to judge whether the method has been used correctly and interpret the results appropriately. The performance and interpretation of linear regression analysis are subject to a variety of pitfalls, which are discussed here in detail. The reader is made aware of common errors of interpretation through practical examples. Both the opportunities for applying linear regression analysis and its limitations are presented.
Grajeda, Laura M; Ivanescu, Andrada; Saito, Mayuko; Crainiceanu, Ciprian; Jaganath, Devan; Gilman, Robert H; Crabtree, Jean E; Kelleher, Dermott; Cabrera, Lilia; Cama, Vitaliano; Checkley, William
2016-01-01
Childhood growth is a cornerstone of pediatric research. Statistical models need to consider individual trajectories to adequately describe growth outcomes. Specifically, well-defined longitudinal models are essential to characterize both population and subject-specific growth. Linear mixed-effect models with cubic regression splines can account for the nonlinearity of growth curves and provide reasonable estimators of population and subject-specific growth, velocity and acceleration. We provide a stepwise approach that builds from simple to complex models, and accounts for the intrinsic complexity of the data. We start with standard cubic splines regression models and build up to a model that includes subject-specific random intercepts and slopes and residual autocorrelation. We then compared cubic regression splines vis-à-vis linear piecewise splines, and with varying number of knots and positions. Statistical code is provided to ensure reproducibility and improve dissemination of methods. Models are applied to longitudinal height measurements in a cohort of 215 Peruvian children followed from birth until their fourth year of life. Unexplained variability, as measured by the variance of the regression model, was reduced from 7.34 when using ordinary least squares to 0.81 (p < 0.001) when using a linear mixed-effect model with random slopes and a first order continuous autoregressive error term. There was substantial heterogeneity in both the intercept (p < 0.001) and slopes (p < 0.001) of the individual growth trajectories. We also identified important serial correlation within the structure of the data (ρ = 0.66; 95% CI 0.64 to 0.68; p < 0.001), which we modeled with a first order continuous autoregressive error term as evidenced by the variogram of the residuals and by a lack of association among residuals. The final model provides a parametric linear regression equation for both estimation and prediction of population- and individual-level growth in height.
We show that cubic regression splines are superior to linear regression splines for the case of a small number of knots in both estimation and prediction with the full linear mixed-effect model (AIC 19,352 vs. 19,598, respectively). While the regression parameters are more complex to interpret in the former, we argue that inference for any problem depends more on the estimated curve or differences in curves rather than the coefficients. Moreover, use of cubic regression splines provides biologically meaningful growth velocity and acceleration curves despite increased complexity in coefficient interpretation. Through this stepwise approach, we provide a set of tools to model longitudinal childhood data for non-statisticians using linear mixed-effect models.
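The comparison of linear and cubic regression splines by AIC can be sketched with ordinary least squares and a truncated power basis. This is a deliberately simplified stand-in: it omits the random effects and autoregressive errors of the mixed model described above, and the growth curve, knot positions, and function names are assumptions chosen for illustration, not the study's data or code.

```python
import numpy as np

def spline_basis(t, knots, degree):
    """Truncated power basis: 1, t, ..., t^degree, then (t - k)_+^degree per knot."""
    cols = [t ** d for d in range(degree + 1)]
    cols += [np.clip(t - k, 0.0, None) ** degree for k in knots]
    return np.column_stack(cols)

def fit_and_aic(t, y, knots, degree):
    """OLS fit of a fixed-knot regression spline; returns (beta, RSS, AIC)."""
    X = spline_basis(t, knots, degree)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    n, k = X.shape
    aic = n * np.log(rss / n) + 2 * k    # Gaussian AIC up to an additive constant
    return beta, rss, aic

# Synthetic height-for-age curve (hypothetical), smooth and decelerating.
rng = np.random.default_rng(1)
age = np.linspace(0.0, 4.0, 100)                       # years
height = 50.0 + 30.0 * np.log1p(age) + rng.normal(0.0, 0.1, age.size)
knots = [1.0, 2.0, 3.0]

_, rss_linear, aic_linear = fit_and_aic(age, height, knots, degree=1)
_, rss_cubic, aic_cubic = fit_and_aic(age, height, knots, degree=3)

print("linear spline: RSS", rss_linear, "AIC", aic_linear)
print("cubic spline:  RSS", rss_cubic, "AIC", aic_cubic)
```

With few knots the piecewise-linear fit cannot follow the curvature of a smooth growth curve, so the cubic spline attains the lower AIC here despite its two extra coefficients. The cubic basis also has continuous first and second derivatives, which is what makes velocity and acceleration curves available.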
Prediction of monthly rainfall in Victoria, Australia: Clusterwise linear regression approach
NASA Astrophysics Data System (ADS)
Bagirov, Adil M.; Mahmood, Arshad; Barton, Andrew
2017-05-01
This paper develops the Clusterwise Linear Regression (CLR) technique for prediction of monthly rainfall. The CLR is a combination of clustering and regression techniques. It is formulated as an optimization problem and an incremental algorithm is designed to solve it. The algorithm is applied to predict monthly rainfall in Victoria, Australia using rainfall data with five input meteorological variables over the period of 1889-2014 from eight geographically diverse weather stations. The prediction performance of the CLR method is evaluated by comparing observed and predicted rainfall values using four measures of forecast accuracy. The proposed method is also compared with the CLR using the maximum likelihood framework by the expectation-maximization algorithm, multiple linear regression, artificial neural networks and the support vector machines for regression models using computational results. The results demonstrate that the proposed algorithm outperforms other methods in most locations.
NASA Astrophysics Data System (ADS)
Wang, Weibao; Overall, Gary; Riggs, Travis; Silveston-Keith, Rebecca; Whitney, Julie; Chiu, George; Allebach, Jan P.
2013-01-01
Assessment of macro-uniformity is a capability that is important for the development and manufacture of printer products. Our goal is to develop a metric that will predict macro-uniformity, as judged by human subjects, by scanning and analyzing printed pages. We consider two different machine learning frameworks for the metric: linear regression and the support vector machine. We have implemented the image quality ruler, based on the recommendations of the INCITS W1.1 macro-uniformity team. Using 12 subjects at Purdue University and 20 subjects at Lexmark, evenly balanced with respect to gender, we conducted subjective evaluations with a set of 35 uniform b/w prints from seven different printers with five levels of tint coverage. Our results suggest that the image quality ruler method provides a reliable means to assess macro-uniformity. We then defined and implemented separate features to measure graininess, mottle, large area variation, jitter, and large-scale non-uniformity. The algorithms that we used are largely based on ISO image quality standards. Finally, we used these features computed for a set of test pages and the subjects' image quality ruler assessments of these pages to train the two different predictors - one based on linear regression and the other based on the support vector machine (SVM). Using five-fold cross-validation, we confirmed the efficacy of our predictor.
Restoring method for missing data of spatial structural stress monitoring based on correlation
NASA Astrophysics Data System (ADS)
Zhang, Zeyu; Luo, Yaozhi
2017-07-01
Long-term monitoring of spatial structures is of great importance for the full understanding of their performance and safety. Gaps in the monitoring data record affect the data analysis and safety assessment of the structure. Based on the long-term monitoring data of the steel structure of the Hangzhou Olympic Center Stadium, the correlation between the stress changes at the measuring points is studied, and an interpolation method for the missing stress data is proposed. Stress data from correlated measuring points, taken from the 3 months of the season in which the data are missing, are selected for fitting the correlation. Daytime and nighttime data are fitted separately for interpolation. For simple linear regression, when a single point's correlation coefficient is 0.9 or more, the average interpolation error is about 5%. For multiple linear regression, the interpolation accuracy does not increase significantly once the number of correlated points exceeds 6. The stress baseline value of each construction step should be calculated before interpolating missing data in the construction stage, and the average error is within 10%. The interpolation error for continuous missing data is slightly larger than that for discrete missing data. The data missing rate for this method should preferably not exceed 30%. Finally, a measuring point's missing monitoring data is restored to verify the validity of the method.
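The single-point case described in this abstract, restoring missing values at one measuring point from a strongly correlated neighbor by simple linear regression, might look roughly as follows. The function name `restore_missing`, the 0.9 correlation threshold as a hard check, and the stress values are assumptions for illustration only. The study's actual procedure (seasonal windows, separate day and night fits, baseline handling) is not reproduced here.

```python
import numpy as np

def restore_missing(target, reference, min_corr=0.9):
    """Fill NaN gaps in `target` using a linear fit on a correlated `reference` series."""
    target = np.asarray(target, dtype=float)
    reference = np.asarray(reference, dtype=float)
    observed = ~np.isnan(target)
    # Correlation is computed only on samples where the target was recorded.
    r = np.corrcoef(reference[observed], target[observed])[0, 1]
    if abs(r) < min_corr:
        raise ValueError(f"correlation {r:.3f} is below the threshold {min_corr}")
    slope, intercept = np.polyfit(reference[observed], target[observed], 1)
    restored = target.copy()
    restored[~observed] = intercept + slope * reference[~observed]
    return restored, r

# Hypothetical stress readings (MPa) at two correlated measuring points.
reference = np.array([10.0, 12.0, 15.0, 11.0, 14.0, 18.0, 16.0])
target = 2.0 * reference + 3.0
target[[2, 5]] = np.nan                 # simulate two missing samples

restored, corr = restore_missing(target, reference)
print("correlation on observed samples:", corr)
print("restored series:", restored)
```

The multiple-regression variant would replace the one-column fit with a least squares fit on several reference points. According to the abstract, accuracy gains level off beyond about six of them.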
Circulating fibrinogen but not D-dimer level is associated with vital exhaustion in school teachers.
Kudielka, Brigitte M; Bellingrath, Silja; von Känel, Roland
2008-07-01
Meta-analyses have established elevated fibrinogen and D-dimer levels in the circulation as biological risk factors for the development and progression of coronary artery disease (CAD). Here, we investigated whether vital exhaustion (VE), a known psychosocial risk factor for CAD, is associated with fibrinogen and D-dimer levels in a sample of apparently healthy school teachers. The teaching profession has been proposed as a potentially high stressful occupation due to enhanced psychosocial stress at the workplace. Plasma fibrinogen and D-dimer levels were measured in 150 middle-aged male and female teachers derived from the first year of the Trier-Teacher-Stress-Study. Log-transformed levels were analyzed using linear regression. Results yielded a significant association between VE and fibrinogen (p = 0.02) but not D-dimer controlling for relevant covariates. Further investigation of possible interaction effects resulted in a significant association between fibrinogen and the interaction term "VE x gender" (p = 0.05). In a secondary analysis, we reran linear regression models for males and females separately. Gender-specific results revealed that the association between fibrinogen and VE remained significant in males but not females. In sum, the present data support the notion that fibrinogen levels are positively related to VE. Elevated fibrinogen might be one biological pathway by which chronic work stress may impact on teachers' cardiovascular health in the long run.
Regression Model Term Selection for the Analysis of Strain-Gage Balance Calibration Data
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert Manfred; Volden, Thomas R.
2010-01-01
The paper discusses the selection of regression model terms for the analysis of wind tunnel strain-gage balance calibration data. Different function class combinations are presented that may be used to analyze calibration data using either a non-iterative or an iterative method. The role of the intercept term in a regression model of calibration data is reviewed. In addition, useful algorithms and metrics originating from linear algebra and statistics are recommended that will help an analyst (i) to identify and avoid both linear and near-linear dependencies between regression model terms and (ii) to make sure that the selected regression model of the calibration data uses only statistically significant terms. Three different tests are suggested that may be used to objectively assess the predictive capability of the final regression model of the calibration data. These tests use both the original data points and regression model independent confirmation points. Finally, data from a simplified manual calibration of the Ames MK40 balance is used to illustrate the application of some of the metrics and tests to a realistic calibration data set.
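One common metric for spotting near-linear dependencies between regression model terms is the variance inflation factor (VIF), defined for term j as 1 / (1 - R_j^2), where R_j^2 comes from regressing term j on all other terms. Whether the paper uses exactly this metric is an assumption here. The sketch below only illustrates the general idea on synthetic regressors with invented names.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF of each column of X (columns are candidate model terms, no intercept column)."""
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        # Regress column j on an intercept plus all remaining columns.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
        resid = X[:, j] - others @ beta
        tss = np.sum((X[:, j] - X[:, j].mean()) ** 2)
        r_squared = 1.0 - (resid @ resid) / tss
        vifs[j] = 1.0 / (1.0 - r_squared)
    return vifs

# Synthetic regressors (hypothetical): c is almost exactly a + b, d is unrelated.
rng = np.random.default_rng(42)
n = 50
a = rng.normal(size=n)
b = rng.normal(size=n)
c = a + b + 1e-3 * rng.normal(size=n)   # near-linear dependency
d = rng.normal(size=n)
X = np.column_stack([a, b, c, d])

vifs = variance_inflation_factors(X)
print("VIFs for a, b, c, d:", vifs)
```

Terms involved in the near-dependency (a, b and c) receive very large VIFs, while the unrelated term d stays near 1. A frequently quoted rule of thumb treats values above about 10 as a warning sign, though the appropriate threshold depends on the application.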
Analysis of separation test for automatic brake adjuster based on linear radon transformation
NASA Astrophysics Data System (ADS)
Luo, Zai; Jiang, Wensong; Guo, Bin; Fan, Weijun; Lu, Yi
2015-01-01
The linear Radon transformation is applied to extract inflection points for an online test system under noisy conditions. The linear Radon transformation is strongly resistant to noise and interference because it fits the online test curve in several parts, which makes it easy to handle consecutive inflection points. We applied the linear Radon transformation to the separation test system to determine the separating clearance of the automatic brake adjuster. The experimental results show that the feature point extraction error of the gradient maximum optimal method is approximately ±0.100, while that of the linear Radon transformation method can reach ±0.010, a lower error. In addition, the linear Radon transformation is robust.
Huang, Yu; Griffin, Michael J
2014-01-01
This study investigated the prediction of the discomfort caused by simultaneous noise and vibration from the discomfort caused by noise and the discomfort caused by vibration when they are presented separately. A total of 24 subjects used absolute magnitude estimation to report their discomfort caused by seven levels of noise (70-88 dBA SEL), seven magnitudes of vibration (0.146-2.318 ms^-1.75) and all 49 possible combinations of these noise and vibration stimuli. Vibration did not significantly influence judgements of noise discomfort, but noise reduced vibration discomfort by an amount that increased with increasing noise level, consistent with a 'masking effect' of noise on judgements of vibration discomfort. A multiple linear regression model or a root-sums-of-squares model predicted the discomfort caused by combined noise and vibration, but the root-sums-of-squares model is more convenient and provided a more accurate prediction of the discomfort produced by combined noise and vibration.
Ichikawa, Akio; Ono, Hiroshi; Furuta, Kenjiro; Shiotsuki, Takahiro; Shinoda, Tetsuro
2007-08-17
Juvenile hormone III (JH III) racemate was prepared from methyl (2E,6E)-farnesoate via epoxidation with 3-chloroperbenzoic acid (mCPBA). Enantioselective separation of JH III was conducted using normal-phase high-performance liquid chromatography (HPLC) on a chiral stationary phase. [²H₃]Methyl (2E,6E)-farnesoate was also prepared from (2E,6E)-farnesoic acid and [²H₄]methanol (methanol-d4) using 1-(3-dimethylaminopropyl)-3-ethylcarbodiimide hydrochloride (EDC) and 4-dimethylaminopyridine (DMAP); the conjugated double bond underwent isomerization to some degree. Epoxidation of [²H₃]methyl (2E,6E)-farnesoate with mCPBA gave a novel deuterium-substituted internal standard [²H₃]JH III (JH III-d3). The standard curve was produced by linear regression using the peak area ratios of JH III and JH III-d3 in liquid chromatography-mass spectrometry (LC-MS).
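An internal-standard calibration of the kind described above regresses the analyte-to-standard peak area ratio on the known analyte amount, then inverts the fitted line for unknown samples. The sketch below shows the arithmetic only. All concentrations, peak areas, and the helper name `quantify` are hypothetical values chosen for illustration, not measurements from the study.

```python
import numpy as np

# Hypothetical calibration standards: known analyte amounts (arbitrary units)
# spiked with a constant amount of deuterated internal standard.
conc = np.array([1.0, 2.0, 5.0, 10.0, 20.0])
area_internal_std = np.full(conc.size, 1000.0)
area_analyte = (0.05 * conc + 0.01) * area_internal_std   # idealized response

# Standard curve: peak area ratio (analyte / internal standard) versus amount.
ratio = area_analyte / area_internal_std
slope, intercept = np.polyfit(conc, ratio, 1)

def quantify(area_analyte_unknown, area_internal_std_unknown):
    """Invert the standard curve to estimate the analyte amount in a sample."""
    sample_ratio = area_analyte_unknown / area_internal_std_unknown
    return (sample_ratio - intercept) / slope

estimate = quantify(260.0, 1000.0)
print("slope:", slope, "intercept:", intercept)
print("estimated amount in unknown sample:", estimate)
```

Working with the ratio rather than the raw analyte area is what lets the internal standard compensate for sample losses and variation in instrument response, since both peaks are affected in the same way.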
Brown, K M; Middaugh, S J; Haythornthwaite, J A; Bielory, L
2001-04-01
It was expected that stress and anxiety would be related to Raynaud's phenomenon (RP) attack characteristics when mild outdoor temperatures produced partial or no digital vasoconstriction. Hypotheses were that in warmer temperature categories, compared to those below 40 degrees F, higher stress or anxiety would be associated with more frequent, severe, and painful attacks. The Raynaud's Treatment Study recruited 313 participants with primary RP. Outcomes were attack rate, severity, and pain. Predictors were average daily outdoor temperature, stress, anxiety, age, gender, and a stress-by-temperature or an anxiety-by-temperature interaction. Outcomes were tested separately in multiple linear regression models. Stress and anxiety were tested in separate models. Stress was not a significant predictor of RP attack characteristics. Higher anxiety was related to more frequent attacks above 60 degrees F. It was also related to greater attack severity at all temperatures, and to greater pain above 60 degrees F and between 40 degrees and 49.9 degrees F.
Integrating uniform design and response surface methodology to optimize thiacloprid suspension
Li, Bei-xing; Wang, Wei-chang; Zhang, Xian-peng; Zhang, Da-xia; Mu, Wei; Liu, Feng
2017-01-01
A model 25% suspension concentrate (SC) of thiacloprid was adopted to evaluate an integrative approach of uniform design and response surface methodology. Tersperse2700, PE1601, xanthan gum and veegum were the four experimental factors, and the aqueous separation ratio and viscosity were the two dependent variables. Linear and quadratic polynomial models of stepwise regression and partial least squares were adopted to test the fit of the experimental data. Verification tests revealed satisfactory agreement between the experimental and predicted data. The measured values for the aqueous separation ratio and viscosity were 3.45% and 278.8 mPa·s, respectively, and the relative errors of the predicted values were 9.57% and 2.65%, respectively (prepared under the proposed conditions). Comprehensive benefits could also be obtained by appropriately adjusting the amount of certain adjuvants based on practical requirements. Integrating uniform design and response surface methodology is an effective strategy for optimizing SC formulas. PMID:28383036
Scoring and staging systems using Cox linear regression modeling and recursive partitioning.
Lee, J W; Um, S H; Lee, J B; Mun, J; Cho, H
2006-01-01
Scoring and staging systems are used to determine the order and class of data according to predictors. Systems used for medical data, such as the Child-Turcotte-Pugh scoring and staging systems for ordering and classifying patients with liver disease, are often derived strictly from physicians' experience and intuition. We construct objective and data-based scoring/staging systems using statistical methods. We consider Cox linear regression modeling and recursive partitioning techniques for censored survival data. In particular, to obtain a target number of stages we propose cross-validation and amalgamation algorithms. We also propose an algorithm for constructing scoring and staging systems by integrating local Cox linear regression models into recursive partitioning, so that we can retain the merits of both methods such as superior predictive accuracy, ease of use, and detection of interactions between predictors. The staging system construction algorithms are compared by cross-validation evaluation of real data. The data-based cross-validation comparison shows that Cox linear regression modeling is somewhat better than recursive partitioning when there are only continuous predictors, while recursive partitioning is better when there are significant categorical predictors. The proposed local Cox linear recursive partitioning has better predictive accuracy than Cox linear modeling and simple recursive partitioning. This study indicates that integrating local linear modeling into recursive partitioning can significantly improve prediction accuracy in constructing scoring and staging systems.
Scarneciu, Camelia C; Sangeorzan, Livia; Rus, Horatiu; Scarneciu, Vlad D; Varciu, Mihai S; Andreescu, Oana; Scarneciu, Ioan
2017-01-01
This study aimed to assess the incidence of pulmonary hypertension (PH) in newly diagnosed hyperthyroid patients and to find a simple model showing the complex functional relation between pulmonary hypertension in hyperthyroidism and the factors causing it. The 53 hyperthyroid patients (H-group) were evaluated mainly by echocardiography and compared with 35 euthyroid (E-group) and 25 healthy people (C-group). To identify the factors causing pulmonary hypertension, the statistical method of comparing arithmetic means was used. The functional relation between the two random variables (PAPs and each of the factors determining it within our research study) can be expressed by a linear or non-linear function. By applying the linear regression method described by a first-degree equation, the line of regression (linear model) was determined; by applying the non-linear regression method described by a second-degree equation, a parabola-type curve of regression (non-linear or polynomial model) was determined. We compared and validated these two models by calculating the determination coefficient (criterion 1), comparing the residuals (criterion 2), applying the Akaike information criterion (AIC) (criterion 3) and using the F-test (criterion 4). In the H-group, 47% had pulmonary hypertension that was completely reversible once euthyroidism was obtained. The factors causing pulmonary hypertension were identified. Previously known factors: level of free thyroxine, pulmonary vascular resistance, cardiac output. New factors identified in this study: pretreatment period, age, systolic blood pressure. According to the four criteria and to clinical judgment, we consider that the polynomial model (graphically, parabola-type) is better than the linear one.
The better model showing the functional relation between the pulmonary hypertension in hyperthyroidism and the factors identified in this study is given by a polynomial equation of second degree where the parabola is its graphical representation.
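The model comparison described in this abstract (a first-degree versus a second-degree fit, judged by the determination coefficient, residuals, AIC and an F-test) can be sketched as below. This is an illustrative sketch with invented data, not the study's analysis: the helper names are hypothetical, the AIC is the Gaussian least-squares form n·ln(RSS/n) + 2k up to an additive constant, and only the partial F statistic is computed, not its p-value.

```python
import math


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def polyfit(x, y, degree):
    """Least-squares polynomial coefficients [c0, c1, ...] via the normal equations."""
    k = degree + 1
    ata = [[sum(xi ** (i + j) for xi in x) for j in range(k)] for i in range(k)]
    aty = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(k)]
    return solve(ata, aty)


def rss(x, y, coef):
    """Residual sum of squares of the polynomial with coefficients coef."""
    return sum(
        (yi - sum(c * xi ** p for p, c in enumerate(coef))) ** 2
        for xi, yi in zip(x, y)
    )


def compare(x, y):
    """Return R^2 and AIC for linear and quadratic fits, plus the partial F statistic."""
    n = len(x)
    mean_y = sum(y) / n
    tss = sum((yi - mean_y) ** 2 for yi in y)
    out = {}
    for name, degree in (("linear", 1), ("quadratic", 2)):
        r = rss(x, y, polyfit(x, y, degree))
        k = degree + 1  # number of fitted coefficients
        out[name] = {"rss": r, "r2": 1 - r / tss, "aic": n * math.log(r / n) + 2 * k}
    # F statistic for adding the squared term: 1 extra parameter, n - 3 residual df.
    rss_lin, rss_quad = out["linear"]["rss"], out["quadratic"]["rss"]
    out["f_stat"] = (rss_lin - rss_quad) / (rss_quad / (n - 3))
    return out


# Invented, visibly curved data for illustration only.
x_obs = [1, 2, 3, 4, 5, 6, 7, 8]
y_obs = [2.1, 3.9, 7.2, 11.8, 18.1, 26.0, 35.2, 46.1]
print(compare(x_obs, y_obs))
```

Because the linear model is nested in the quadratic one, R² can only go up when the squared term is added; AIC and the F statistic are the criteria that penalize or test the extra parameter.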
Kanamori, Shogo; Castro, Marcia C; Sow, Seydou; Matsuno, Rui; Cissokho, Alioune; Jimba, Masamine
2016-01-01
The 5S method is a lean management tool for workplace organization, with 5S being an abbreviation for five Japanese words that translate to English as Sort, Set in Order, Shine, Standardize, and Sustain. In Senegal, the 5S intervention program was implemented in 10 health centers in two regions between 2011 and 2014. The objective was to identify the impact of the 5S intervention program on the satisfaction of clients (patients and caretakers) who visited the health centers. A standardized 5S intervention protocol was implemented in the health centers using a quasi-experimental separate pre-post samples design (four intervention and three control health facilities). A questionnaire with 10 five-point Likert items was used to measure client satisfaction. Linear regression analysis was conducted to identify the intervention's effect on the client satisfaction scores, represented by an equally weighted average of the 10 Likert items (Cronbach's alpha = 0.83). Additional regression analyses were conducted to identify the intervention's effect on the scores of each Likert item. Backward stepwise linear regression (n = 1,928) indicated a statistically significant effect of the 5S intervention, represented by an increase of 0.19 points in the client satisfaction scores in the intervention group, 6 to 8 months after the intervention (p = 0.014). Additional regression analyses showed significant score increases of 0.44 (p = 0.002), 0.14 (p = 0.002), 0.06 (p = 0.019), and 0.17 (p = 0.044) points on four items, which were, respectively, healthcare staff members' communication, explanations about illnesses or cases, consultation duration, and clients' overall satisfaction. The 5S has the potential to improve client satisfaction at resource-poor health facilities and could therefore be recommended as a strategic option for improving the quality of healthcare service in low- and middle-income countries.
To explore more effective intervention modalities, further studies need to address the mechanisms by which 5S leads to attitude changes in healthcare staff.
As a fast and effective technique, the multiple linear regression (MLR) method has been widely used in modeling and prediction of beach bacteria concentrations. Among previous works on this subject, however, several issues were insufficiently or inconsistently addressed. Those is...
Spatial interpolation schemes of daily precipitation for hydrologic modeling
Hwang, Y.; Clark, M.R.; Rajagopalan, B.; Leavesley, G.
2012-01-01
Distributed hydrologic models typically require spatial estimates of precipitation interpolated from sparsely located observational points to the specific grid points. We compare and contrast the performance of regression-based statistical methods for the spatial estimation of precipitation in two hydrologically different basins and confirm that widely used regression-based estimation schemes fail to describe the realistic spatial variability of the daily precipitation field. The methods assessed are: (1) inverse distance weighted average; (2) multiple linear regression (MLR); (3) climatological MLR (CMLR); and (4) locally weighted polynomial regression (LWP). In order to improve the performance of the interpolations, the authors propose a two-step regression technique for effective daily precipitation estimation. In this simple two-step estimation process, precipitation occurrence is first generated via a logistic regression model before the amount of precipitation is estimated separately on wet days. This process generated the precipitation occurrence, amount, and spatial correlation effectively. A distributed hydrologic model (PRMS) was used for the impact analysis in daily time step simulation. Multiple simulations suggested noticeable differences between the input alternatives generated by three different interpolation schemes. Differences are shown in overall simulation error against the observations, degree of explained variability, and seasonal volumes. Simulated streamflows also showed different characteristics in mean, maximum, minimum, and peak flows. Given the same parameter optimization technique, LWP input showed the least streamflow error in the Alapaha basin and CMLR input showed the least error (still very close to LWP) in the Animas basin. All of the two-step interpolation inputs resulted in lower streamflow error compared to the directly interpolated inputs. © 2011 Springer-Verlag.
A simplified competition data analysis for radioligand specific activity determination.
Venturino, A; Rivera, E S; Bergoc, R M; Caro, R A
1990-01-01
Non-linear regression and two-step linear fit methods were developed to determine the actual specific activity of 125I-ovine prolactin by radioreceptor self-displacement analysis. The experimental results obtained by the different methods are superposable. The non-linear regression method is considered to be the most adequate procedure to calculate the specific activity, but if its software is not available, the other described methods are also suitable.
Height and Weight Estimation From Anthropometric Measurements Using Machine Learning Regressions
Fernandes, Bruno J. T.; Roque, Alexandre
2018-01-01
Height and weight are measurements used to track nutritional diseases, energy expenditure, clinical conditions, drug dosages, and infusion rates. Many patients are not ambulant or may be unable to communicate, and a combination of these factors may not allow accurate estimation or measurement; in those cases, height and weight can be estimated approximately by anthropometric means. Different groups have proposed different linear or non-linear equations whose coefficients are obtained by using single or multiple linear regressions. In this paper, we present a complete study of the application of different learning models to estimate height and weight from anthropometric measurements: support vector regression, Gaussian process, and artificial neural networks. The predicted values are significantly more accurate than those obtained with conventional linear regressions. In all cases, the predictions are insensitive to ethnicity, and to gender if more than two anthropometric parameters are analyzed. The learning model analysis creates new opportunities for anthropometric applications in industry, textile technology, security, and health care. PMID:29651366
NASA Astrophysics Data System (ADS)
Samhouri, M.; Al-Ghandoor, A.; Fouad, R. H.
2009-08-01
In this study two techniques, for modeling electricity consumption of the Jordanian industrial sector, are presented: (i) multivariate linear regression and (ii) neuro-fuzzy models. Electricity consumption is modeled as function of different variables such as number of establishments, number of employees, electricity tariff, prevailing fuel prices, production outputs, capacity utilizations, and structural effects. It was found that industrial production and capacity utilization are the most important variables that have significant effect on future electrical power demand. The results showed that both the multivariate linear regression and neuro-fuzzy models are generally comparable and can be used adequately to simulate industrial electricity consumption. However, comparison that is based on the square root average squared error of data suggests that the neuro-fuzzy model performs slightly better for future prediction of electricity consumption than the multivariate linear regression model. Such results are in full agreement with similar work, using different methods, for other countries.
Carvalho, Carlos; Gomes, Danielo G.; Agoulmine, Nazim; de Souza, José Neuman
2011-01-01
This paper proposes a method based on multivariate spatial and temporal correlation to improve prediction accuracy in data reduction for Wireless Sensor Networks (WSN). Prediction of data not sent to the sink node is a technique used to save energy in WSNs by reducing the amount of data traffic. However, it may not be very accurate. Simulations were made involving simple linear regression and multiple linear regression functions to assess the performance of the proposed method. The results show a higher correlation between gathered inputs when compared to time, which is an independent variable widely used for prediction and forecasting. Prediction accuracy is lower when simple linear regression is used, whereas multiple linear regression is the most accurate one. In addition to that, our proposal outperforms some current solutions by about 50% in humidity prediction and 21% in light prediction. To the best of our knowledge, we believe that we are probably the first to address prediction based on multivariate correlation for WSN data reduction. PMID:22346626
ERIC Educational Resources Information Center
Laird, Robert D.; Weems, Carl F.
2011-01-01
Research on informant discrepancies has increasingly utilized difference scores. This article demonstrates the statistical equivalence of regression models using difference scores (raw or standardized) and regression models using separate scores for each informant to show that interpretations should be consistent with both models. First,…
Using the Ridge Regression Procedures to Estimate the Multiple Linear Regression Coefficients
NASA Astrophysics Data System (ADS)
Gorgees, HazimMansoor; Mahdi, FatimahAssim
2018-05-01
This article is concerned with comparing the performance of different types of ordinary ridge regression estimators that have already been proposed to estimate the regression parameters when near-exact linear relationships among the explanatory variables are present. For this situation we employ the data obtained from tagi gas filling company during the period (2008-2010). The main result we reached is that the method based on the condition number performs better than the other methods since it has a smaller mean square error (MSE) than the other stated methods.
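The ridge estimator and the condition number mentioned in this abstract can be illustrated with a small two-predictor sketch. The abstract does not give the authors' rule for choosing the ridge constant, so the code below only shows the generic estimator (X'X + kI)⁻¹X'y and how adding k lowers the condition number of a nearly collinear design. The function names and data are invented for the example, and the predictors and response are assumed to be centred so no intercept is needed.

```python
import math


def center(values):
    """Subtract the mean so the model needs no intercept."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def ridge_2d(x1, x2, y, k):
    """Ridge estimate (X'X + kI)^-1 X'y for two predictors, no intercept."""
    a = sum(u * u for u in x1) + k
    b = sum(u * v for u, v in zip(x1, x2))
    d = sum(v * v for v in x2) + k
    g1 = sum(u * t for u, t in zip(x1, y))
    g2 = sum(v * t for v, t in zip(x2, y))
    det = a * d - b * b
    return ((d * g1 - b * g2) / det, (a * g2 - b * g1) / det)


def condition_number(x1, x2, k=0.0):
    """Largest over smallest eigenvalue of X'X + kI (2x2 symmetric case)."""
    a = sum(u * u for u in x1) + k
    b = sum(u * v for u, v in zip(x1, x2))
    d = sum(v * v for v in x2) + k
    half_trace = (a + d) / 2
    root = math.sqrt(((a - d) / 2) ** 2 + b * b)
    return (half_trace + root) / (half_trace - root)


# Invented, nearly collinear predictors for illustration only.
x1 = center([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = center([1.1, 1.9, 3.2, 3.9, 5.1, 5.8])
y = center([2.0, 4.1, 6.2, 7.9, 10.1, 11.9])

for k in (0.0, 0.1, 1.0):
    beta = ridge_2d(x1, x2, y, k)
    print(k, beta, condition_number(x1, x2, k))
```

Setting k = 0 recovers ordinary least squares; increasing k shrinks the coefficient vector and stabilizes it at the cost of some bias, which is the trade-off the MSE comparison in the abstract is measuring.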
Cyst-based measurements for assessing lymphangioleiomyomatosis in computed tomography
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lo, P., E-mail: pechinlo@mednet.edu.ucla; Brown, M. S.; Kim, H.
Purpose: To investigate the efficacy of a new family of measurements made on individual pulmonary cysts extracted from computed tomography (CT) for assessing the severity of lymphangioleiomyomatosis (LAM). Methods: CT images were analyzed using thresholding to identify a cystic region of interest from chest CT of LAM patients. Individual cysts were then extracted from the cystic region by the watershed algorithm, which separates individual cysts based on subtle edges within the cystic regions. A family of measurements were then computed, which quantify the amount, distribution, and boundary appearance of the cysts. Sequential floating feature selection was used to select a small subset of features for quantification of the severity of LAM. Adjusted R² from multiple linear regression and R² from linear regression against measurements from spirometry were used to compare the performance of our proposed measurements with currently used density based CT measurements in the literature, namely, the relative area measure and the D measure. Results: Volumetric CT data, performed at total lung capacity and residual volume, from a total of 49 subjects enrolled in the MILES trial were used in our study. Our proposed measures had adjusted R² ranging from 0.42 to 0.59 when regressing against the spirometry measures, with p < 0.05. For previously used density based CT measurements in the literature, the best R² was 0.46 (for only one instance), with the majority being lower than 0.3 or p > 0.05. Conclusions: The proposed family of CT-based cyst measurements have better correlation with spirometric measures than previously used density based CT measurements. They show potential as a sensitive tool for quantitatively assessing the severity of LAM.
Sundh, Josefin; Ställberg, Björn; Lisspers, Karin; Kämpe, Mary; Janson, Christer; Montgomery, Scott
2016-01-01
The COPD Assessment Test (CAT) and the Clinical COPD Questionnaire (CCQ) are both clinically useful health status instruments. The main objective was to compare CAT and CCQ measurement instruments. CAT and CCQ forms were completed by 432 randomly selected primary and secondary care patients with a COPD diagnosis. Correlation and linear regression analyses of CAT and CCQ were performed. Standardised scores were created for the CAT and CCQ scores, and separate multiple linear regression analyses for CAT and CCQ examined associations with sex, age (≤ 60, 61-70 and >70 years), exacerbations (≥ 1 vs 0 in the previous year), body mass index (BMI), heart disease, anxiety/depression and lung function (subgroup with n = 246). CAT and CCQ correlated well (r = 0.88, p < 0.0001), as did CAT ≥ 10 and CCQ ≥ 1 (r = 0.78, p < 0.0001). CCQ 1.0 corresponded to CAT 9.93 and CAT 10 to CCQ 1.29. Both instruments were associated with BMI < 20 (standardised adjusted regression coefficient (95%CI) for CAT 0.56 (0.18 to 0.93) and CCQ 0.56 (0.20 to 0.92)), exacerbations (CAT 0.77 (0.58 to 0.95) and CCQ 0.94 (0.76 to 1.12)), heart disease (CAT 0.38 (0.17 to 0.59) and CCQ 0.23 (0.03 to 0.43)), anxiety/depression (CAT 0.35 (0.15 to 0.56) and CCQ 0.41 (0.21 to 0.60)) and COPD stage (CAT 0.19 (0.05 to 0.34) and CCQ 0.22 (0.07 to 0.36)). CAT and CCQ correlate well with each other. Heart disease, anxiety/depression, underweight, exacerbations, and low lung function are associated with worse health status assessed by both instruments.
The Association of Fever with Total Mechanical Ventilation Time in Critically Ill Patients.
Park, Dong Won; Egi, Moritoki; Nishimura, Masaji; Chang, Youjin; Suh, Gee Young; Lim, Chae Man; Kim, Jae Yeol; Tada, Keiichi; Matsuo, Koichi; Takeda, Shinhiro; Tsuruta, Ryosuke; Yokoyama, Takeshi; Kim, Seon Ok; Koh, Younsuck
2016-12-01
This research aims to investigate the impact of fever on total mechanical ventilation time (TVT) in critically ill patients. Subgroup analysis was conducted using a previous prospective, multicenter observational study. We included mechanically ventilated patients for more than 24 hours from 10 Korean and 15 Japanese intensive care units (ICU), and recorded maximal body temperature under the support of mechanical ventilation (MAX(MV)). To assess the independent association of MAX(MV) with TVT, we used propensity-matched analysis in a total of 769 survived patients with medical or surgical admission, separately. Together with multiple linear regression analysis to evaluate the association between the severity of fever and TVT, the effect of MAX(MV) on ventilator-free days was also observed by quantile regression analysis in all subjects including non-survivors. After propensity score matching, a MAX(MV) ≥ 37.5°C was significantly associated with longer mean TVT by 5.4 days in medical admission, and by 1.2 days in surgical admission, compared to those with MAX(MV) of 36.5°C to 37.4°C. In multivariate linear regression analysis, patients with three categories of fever (MAX(MV) of 37.5°C to 38.4°C, 38.5°C to 39.4°C, and ≥ 39.5°C) sustained a significantly longer duration of TVT than those with normal range of MAX(MV) in both categories of ICU admission. A significant association between MAX(MV) and mechanical ventilator-free days was also observed in all enrolled subjects. Fever may be a detrimental factor to prolong TVT in mechanically ventilated patients. These findings suggest that fever in mechanically ventilated patients might be associated with worse mechanical ventilation outcome.
Physical function in older men with hyperkyphosis.
Katzman, Wendy B; Harrison, Stephanie L; Fink, Howard A; Marshall, Lynn M; Orwoll, Eric; Barrett-Connor, Elizabeth; Cawthon, Peggy M; Kado, Deborah M
2015-05-01
Age-related hyperkyphosis has been associated with poor physical function and is a well-established predictor of adverse health outcomes in older women, but its impact on health in older men is less well understood. We conducted a cross-sectional study to evaluate the association of hyperkyphosis and physical function in 2,363 men, aged 71-98 (M = 79) from the Osteoporotic Fractures in Men Study. Kyphosis was measured using the Rancho Bernardo Study block method. Measurements of grip strength and lower extremity function, including gait speed over 6 m, narrow walk (measure of dynamic balance), repeated chair stands ability and time, and lower extremity power (Nottingham Power Rig) were included separately as primary outcomes. We investigated associations of kyphosis and each outcome in age-adjusted and multivariable linear or logistic regression models, controlling for age, clinic, education, race, bone mineral density, height, weight, diabetes, and physical activity. In multivariate linear regression, we observed a dose-related response of worse scores on each lower extremity physical function test as number of blocks increased, p for trend ≤.001. Using a cutoff of ≥4 blocks, 20% (N = 469) of men were characterized with hyperkyphosis. In multivariate logistic regression, men with hyperkyphosis had increased odds (range 1.5-1.8) of being in the worst quartile of performing lower extremity physical function tasks (p < .001 for each outcome). Kyphosis was not associated with grip strength in any multivariate analysis. Hyperkyphosis is associated with impaired lower extremity physical function in older men. Further studies are needed to determine the direction of causality. © The Author 2014. Published by Oxford University Press on behalf of The Gerontological Society of America. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
UK asbestos imports and mortality due to idiopathic pulmonary fibrosis.
Barber, C M; Wiggans, R E; Young, C; Fishwick, D
2016-03-01
Previous studies have demonstrated that the rising mortality due to mesothelioma and asbestosis can be predicted from historic asbestos usage. Mortality due to idiopathic pulmonary fibrosis (IPF) is also rising, without any apparent explanation. To compare mortality due to these conditions and examine the relationship between mortality and national asbestos imports. Mortality data for IPF and asbestosis in England and Wales were available from the Office for National Statistics. Data for mesothelioma deaths in England and Wales and historic UK asbestos import data were available from the Health & Safety Executive. The numbers of annual deaths due to each condition were plotted separately by gender, against UK asbestos imports 48 years earlier. Linear regression models were constructed. For mesothelioma and IPF, there was a significant linear relationship between the number of male and female deaths each year and historic UK asbestos imports. For asbestosis mortality, a similar relationship was found for male but not female deaths. The annual numbers of deaths due to asbestosis in both sexes were lower than for IPF and mesothelioma. The strength of the association between IPF mortality and historic asbestos imports was similar to that seen in an established asbestos-related disease, i.e. mesothelioma. This finding could in part be explained by diagnostic difficulties in separating asbestosis from IPF and highlights the need for a more accurate method of assessing lifetime occupational asbestos exposure. © Crown copyright 2015.
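The lagged comparison described in this abstract (annual deaths against imports 48 years earlier, followed by a linear fit) can be sketched as below. This is only an illustration of the alignment step under invented numbers: the function names and the year-keyed dictionaries are hypothetical and are not the study's data or code.

```python
def lagged_pairs(exposure_by_year, outcome_by_year, lag):
    """Pair each outcome year with the exposure recorded `lag` years earlier."""
    pairs = []
    for year in sorted(outcome_by_year):
        earlier = year - lag
        if earlier in exposure_by_year:
            pairs.append((exposure_by_year[earlier], outcome_by_year[year]))
    return pairs


def fit_line(pairs):
    """Ordinary least-squares fit of outcome = slope * exposure + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented series for illustration only (arbitrary units).
imports = {1950: 100.0, 1951: 110.0, 1952: 125.0, 1953: 140.0}
deaths = {1998: 510.0, 1999: 545.0, 2000: 600.0, 2001: 660.0}

pairs = lagged_pairs(imports, deaths, lag=48)
print(pairs)
print(fit_line(pairs))
```

Years without a matching exposure value are simply dropped, so the number of points in the regression depends on how far back the exposure series reaches.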
Mukozhiwa, S Y; Khamanga, S M M; Walker, R B
2017-09-01
A capillary zone electrophoresis (CZE) method for the quantitation of captopril (CPT) using UV detection was developed. Influence of electrolyte concentration and system variables on electrophoretic separation was evaluated and a central composite design (CCD) was used to optimize the method. Variables investigated were pH, molarity, applied voltage and capillary length. The influence of sodium metabisulphite on the stability of test solutions was also investigated. The use of sodium metabisulphite prevented degradation of CPT over 24 hours. A fused uncoated silica capillary of 67.5 cm total and 57.5 cm effective length was used for analysis. The applied voltage and capillary length affected the migration time of CPT significantly. A 20 mM phosphate buffer adjusted to pH 7.0 was used as running buffer and an applied voltage of 23.90 kV was suitable to effect a separation. The optimized electrophoretic conditions produced sharp, well-resolved peaks for CPT and sodium metabisulphite. Linear regression analysis of the response for CPT standards revealed the method was linear (R² = 0.9995) over the range 5-70 μg/mL. The limits of quantitation and detection were 5 and 1.5 μg/mL, respectively. A simple, rapid and reliable CZE method has been developed and successfully applied to the analysis of commercially available CPT products.
Weichenthal, Scott; Ryswyk, Keith Van; Goldstein, Alon; Bagg, Scott; Shekkarizfard, Maryam; Hatzopoulou, Marianne
2016-04-01
Existing evidence suggests that ambient ultrafine particles (UFPs) (<0.1 µm) may contribute to acute cardiorespiratory morbidity. However, few studies have examined the long-term health effects of these pollutants owing in part to a need for exposure surfaces that can be applied in large population-based studies. To address this need, we developed a land use regression model for UFPs in Montreal, Canada using mobile monitoring data collected from 414 road segments during the summer and winter months between 2011 and 2012. Two different approaches were examined for model development including standard multivariable linear regression and a machine learning approach (kernel-based regularized least squares (KRLS)) that learns the functional form of covariate impacts on ambient UFP concentrations from the data. The final models included parameters for population density, ambient temperature and wind speed, land use parameters (park space and open space), length of local roads and rail, and estimated annual average NOx emissions from traffic. The final multivariable linear regression model explained 62% of the spatial variation in ambient UFP concentrations whereas the KRLS model explained 79% of the variance. The KRLS model performed slightly better than the linear regression model when evaluated using an external dataset (R² = 0.58 vs. 0.55) or a cross-validation procedure (R² = 0.67 vs. 0.60). In general, our findings suggest that the KRLS approach may offer modest improvements in predictive performance compared to standard multivariable linear regression models used to estimate spatial variations in ambient UFPs. However, differences in predictive performance were not statistically significant when evaluated using the cross-validation procedure. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.
Green, Kimberly T.; Beckham, Jean C.; Youssef, Nagy; Elbogen, Eric B.
2013-01-01
Objective: The present study sought to investigate the longitudinal effects of psychological resilience against alcohol misuse adjusting for socio-demographic factors, trauma-related variables, and self-reported history of alcohol abuse. Methodology: Data were from National Post-Deployment Adjustment Study (NPDAS) participants who completed both a baseline and one-year follow-up survey (N=1090). Survey questionnaires measured combat exposure, probable posttraumatic stress disorder (PTSD), psychological resilience, and alcohol misuse, all of which were measured at two discrete time periods (baseline and one-year follow-up). Baseline resilience and change in resilience (increased or decreased) were utilized as independent variables in separate models evaluating alcohol misuse at the one-year follow-up. Results: Multiple linear regression analyses controlled for age, gender, level of educational attainment, combat exposure, PTSD symptom severity, and self-reported alcohol abuse. Accounting for these covariates, findings revealed that lower baseline resilience, younger age, male gender, and self-reported alcohol abuse were related to alcohol misuse at the one-year follow-up. A separate regression analysis, adjusting for the same covariates, revealed a relationship between change in resilience (from baseline to the one-year follow-up) and alcohol misuse at the one-year follow-up. The regression model evaluating these variables in a subset of the sample in which all the participants had been deployed to Iraq and/or Afghanistan was consistent with findings involving the overall era sample. Finally, logistic regression analyses of the one-year follow-up data yielded similar results to the baseline and resilience change models. Conclusions: These findings suggest that increased psychological resilience is inversely related to alcohol misuse and is protective against alcohol misuse over time.
Additionally, it supports the conceptualization of resilience as a process which evolves over time. Moreover, our results underscore the importance of assessing resilience as part of alcohol use screening for preventing alcohol misuse in Iraq and Afghanistan era military veterans. PMID:24090625
Alzheimer's Disease Detection by Pseudo Zernike Moment and Linear Regression Classification.
Wang, Shui-Hua; Du, Sidan; Zhang, Yin; Phillips, Preetha; Wu, Le-Nan; Chen, Xian-Qing; Zhang, Yu-Dong
2017-01-01
This study presents an improved method based on "Gorji et al. Neuroscience. 2015" by introducing a relatively new classifier, linear regression classification. Our method selects one axial slice from each 3D brain image and employs pseudo Zernike moments with a maximum order of 15 to extract 256 features from each image. Finally, linear regression classification is used as the classifier. The proposed approach obtains an accuracy of 97.51%, a sensitivity of 96.71%, and a specificity of 97.73%. Our method performs better than Gorji's approach and five other state-of-the-art approaches. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Effect of water-based recovery on blood lactate removal after high-intensity exercise.
Lucertini, Francesco; Gervasi, Marco; D'Amen, Giancarlo; Sisti, Davide; Rocchi, Marco Bruno Luigi; Stocchi, Vilberto; Benelli, Piero
2017-01-01
This study assessed the effectiveness of water immersion to the shoulders in enhancing blood lactate removal during active and passive recovery after short-duration high-intensity exercise. Seventeen cyclists underwent active water- and land-based recoveries and passive water and land-based recoveries. The recovery conditions lasted 31 minutes each and started after the identification of each cyclist's blood lactate accumulation peak, induced by a 30-second all-out sprint on a cycle ergometer. Active recoveries were performed on a cycle ergometer at 70% of the oxygen consumption corresponding to the lactate threshold (the control for the intensity was oxygen consumption), while passive recoveries were performed with subjects at rest and seated on the cycle ergometer. Blood lactate concentration was measured 8 times during each recovery condition and lactate clearance was modeled over a negative exponential function using non-linear regression. Actual active recovery intensity was compared to the target intensity (one sample t-test) and passive recovery intensities were compared between environments (paired sample t-tests). Non-linear regression parameters (coefficients of the exponential decay of lactate; predicted resting lactates; predicted delta decreases in lactate) were compared between environments (linear mixed model analyses for repeated measures) separately for the active and passive recovery modes. Active recovery intensities did not differ significantly from the target oxygen consumption, whereas passive recovery resulted in a slightly lower oxygen consumption when performed while immersed in water rather than on land. The exponential decay of blood lactate was not significantly different in water- or land-based recoveries in either active or passive recovery conditions. In conclusion, water immersion at 29°C would not appear to be an effective practice for improving post-exercise lactate removal in either the active or passive recovery modes.
Smooth individual level covariates adjustment in disease mapping.
Huque, Md Hamidul; Anderson, Craig; Walton, Richard; Woolford, Samuel; Ryan, Louise
2018-05-01
Spatial models for disease mapping should ideally account for covariates measured both at individual and area levels. The newly available "indiCAR" model fits the popular conditional autoregressive (CAR) model by accommodating both individual and group level covariates while adjusting for spatial correlation in the disease rates. This algorithm has been shown to be effective but assumes log-linear associations between individual level covariates and outcome. In many studies, the relationship between individual level covariates and the outcome may be non-log-linear, and methods to track such nonlinearity between individual level covariate and outcome in spatial regression modeling are not well developed. In this paper, we propose a new algorithm, smooth-indiCAR, to fit an extension to the popular conditional autoregressive model that can accommodate both linear and nonlinear individual level covariate effects while adjusting for group level covariates and spatial correlation in the disease rates. In this formulation, the effect of a continuous individual level covariate is accommodated via penalized splines. We describe a two-step estimation procedure to obtain reliable estimates of individual and group level covariate effects where both individual and group level covariate effects are estimated separately. This distributed computing framework enhances its application in the Big Data domain with a large number of individual/group level covariates. We evaluate the performance of smooth-indiCAR through simulation. Our results indicate that the smooth-indiCAR method provides reliable estimates of all regression and random effect parameters. We illustrate our proposed methodology with an analysis of data on neutropenia admissions in New South Wales (NSW), Australia. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Kwan, Johnny S H; Kung, Annie W C; Sham, Pak C
2011-09-01
Selective genotyping can increase power in quantitative trait association. One example of selective genotyping is two-tail extreme selection, but simple linear regression analysis gives a biased genetic effect estimate. Here, we present a simple correction for the bias.
Srivastava, Nishi; Srivastava, Amit; Srivastava, Sharad; Rawat, Ajay Kumar Singh; Khan, Abdul Rahman
2016-03-01
A rapid, sensitive, selective and robust quantitative densitometric high-performance thin-layer chromatographic method was developed and validated for separation and quantification of syringic acid (SYA) and kaempferol (KML) in the hydrolyzed extracts of Bergenia ciliata and Bergenia stracheyi. The separation was performed on silica gel 60F254 high-performance thin-layer chromatography plates using toluene : ethyl acetate : formic acid (5 : 4 : 1, v/v/v) as the mobile phase. The quantification of SYA and KML was carried out using a densitometric reflection/absorption mode at 290 nm. Dense spots of SYA and KML appeared on the developed plate at retention factor values of 0.61 ± 0.02 and 0.70 ± 0.01, respectively. A precise and accurate quantification was performed using linear regression analysis by plotting the peak area vs concentration 100-600 ng/band (correlation coefficient: r = 0.997, coefficient of determination: R² = 0.996) for SYA and 100-600 ng/band (correlation coefficient: r = 0.995, coefficient of determination: R² = 0.991) for KML. The developed method was validated in terms of accuracy, recovery and inter- and intraday study as per International Conference on Harmonisation guidelines. The limit of detection and limit of quantification of SYA and KML were determined, respectively, as 91.63, 142.26 and 277.67, 431.09 ng. The statistical data analysis showed that the method is reproducible and selective for the estimation of SYA and KML in extracts of B. ciliata and B. stracheyi. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
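A calibration line of peak area against concentration, as used here, can be sketched as follows. The peak areas are invented for illustration and are not the study's data; the function name is also hypothetical. For a straight-line fit with an intercept, the coefficient of determination equals the square of the correlation coefficient.

```python
import math

def calibration_fit(conc, area):
    """Ordinary least-squares line area = intercept + slope * conc, plus r and R^2."""
    n = len(conc)
    x_bar = sum(conc) / n
    y_bar = sum(area) / n
    sxx = sum((x - x_bar) ** 2 for x in conc)
    syy = sum((y - y_bar) ** 2 for y in area)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(conc, area))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r, r * r

# Hypothetical standards spanning 100-600 ng/band (areas in arbitrary units).
conc = [100, 200, 300, 400, 500, 600]
area = [1020, 2110, 2950, 4080, 4930, 6050]
slope, intercept, r, r2 = calibration_fit(conc, area)

# An unknown band is quantified by inverting the fitted line.
unknown_area = 3500
print((unknown_area - intercept) / slope)
```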
2013-01-01
application of the Hammett equation with the constants rph in the chemistry of organophosphorus compounds, Russ. Chem. Rev. 38 (1969) 795–811. [13...of oximes and OP compounds and the ability of oximes to reactivate OP- inhibited AChE. Multiple linear regression equations were analyzed using...phosphonate pairs, 21 oxime/ phosphoramidate pairs and 12 oxime/phosphate pairs. The best linear regression equation resulting from multiple regression anal
A novel encoding scheme for effective biometric discretization: Linearly Separable Subcode.
Lim, Meng-Hui; Teoh, Andrew Beng Jin
2013-02-01
Separability in a code is crucial in guaranteeing a decent Hamming-distance separation among the codewords. In multibit biometric discretization where a code is used for quantization-intervals labeling, separability is necessary for preserving distance dissimilarity when feature components are mapped from a discrete space to a Hamming space. In this paper, we examine separability of Binary Reflected Gray Code (BRGC) encoding and reveal its inadequacy in tackling interclass variation during the discrete-to-binary mapping, leading to a tradeoff between classification performance and entropy of binary output. To overcome this drawback, we put forward two encoding schemes exhibiting full-ideal and near-ideal separability capabilities, known as Linearly Separable Subcode (LSSC) and Partially Linearly Separable Subcode (PLSSC), respectively. These encoding schemes convert the conventional entropy-performance tradeoff into an entropy-redundancy tradeoff in the increase of code length. Extensive experimental results vindicate the superiority of our schemes over the existing encoding schemes in discretization performance. This opens up possibilities of achieving much greater classification performance with high output entropy.
Specialization Agreements in the Council for Mutual Economic Assistance
1988-02-01
proportions to stabilize variance (S. Weisberg, Applied Linear Regression, 2nd ed., John Wiley & Sons, New York, 1985, p. 134). If the dependent...27, 1986, p. 3. Weisberg, S., Applied Linear Regression, 2nd ed., John Wiley & Sons, New York, 1985, p. 134. Wiles, P. J., Communist International
Radio Propagation Prediction Software for Complex Mixed Path Physical Channels
2006-08-14
63 4.4.6. Applied Linear Regression Analysis in the Frequency Range 1-50 MHz 69 4.4.7. Projected Scaling to...4.4.6. Applied Linear Regression Analysis in the Frequency Range 1-50 MHz In order to construct a comprehensive numerical algorithm capable of
Due to the complexity of the processes contributing to beach bacteria concentrations, many researchers rely on statistical modeling, among which multiple linear regression (MLR) modeling is most widely used. Despite its ease of use and interpretation, there may be time dependence...
Data Transformations for Inference with Linear Regression: Clarifications and Recommendations
ERIC Educational Resources Information Center
Pek, Jolynn; Wong, Octavia; Wong, C. M.
2017-01-01
Data transformations have been promoted as a popular and easy-to-implement remedy to address the assumption of normally distributed errors (in the population) in linear regression. However, the application of data transformations introduces non-ignorable complexities which should be fully appreciated before their implementation. This paper adds to…
USING LINEAR AND POLYNOMIAL MODELS TO EXAMINE THE ENVIRONMENTAL STABILITY OF VIRUSES
The article presents the development of model equations for describing the fate of viral infectivity in environmental samples. Most of the models were based upon the use of a two-step linear regression approach. The first step employs regression of log base 10 transformed viral t...
Identifying the Factors That Influence Change in SEBD Using Logistic Regression Analysis
ERIC Educational Resources Information Center
Camilleri, Liberato; Cefai, Carmel
2013-01-01
Multiple linear regression and ANOVA models are widely used in applications since they provide effective statistical tools for assessing the relationship between a continuous dependent variable and several predictors. However these models rely heavily on linearity and normality assumptions and they do not accommodate categorical dependent…
Simple and multiple linear regression: sample size considerations.
Hanley, James A
2016-11-01
The suggested "two subjects per variable" (2SPV) rule of thumb in the Austin and Steyerberg article is a chance to bring out some long-established and quite intuitive sample size considerations for both simple and multiple linear regression. This article distinguishes two of the major uses of regression models that imply very different sample size considerations, neither served well by the 2SPV rule. The first is etiological research, which contrasts mean Y levels at differing "exposure" (X) values and thus tends to focus on a single regression coefficient, possibly adjusted for confounders. The second research genre guides clinical practice. It addresses Y levels for individuals with different covariate patterns or "profiles." It focuses on the profile-specific (mean) Y levels themselves, estimating them via linear compounds of regression coefficients and covariates. By drawing on long-established closed-form variance formulae that lie beneath the standard errors in multiple regression, and by rearranging them for heuristic purposes, one arrives at quite intuitive sample size considerations for both research genres. Copyright © 2016 Elsevier Inc. All rights reserved.
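As a rough illustration of the kind of closed-form variance formula the article draws on, the sketch below uses the textbook result for a single predictor, Var(slope) = sigma^2 / sum((x - x_bar)^2) = sigma^2 / (n * sd_x^2), with sd_x computed using divisor n. The function names and numbers are hypothetical, and the article's own rearrangements may differ.

```python
import math

def slope_se(sigma, sd_x, n):
    """Standard error of the slope in simple linear regression.

    sigma: residual standard deviation of Y; sd_x: SD of X (divisor n).
    """
    return sigma / (sd_x * math.sqrt(n))

def required_n(sigma, sd_x, target_se):
    """Smallest n whose slope standard error is at most target_se."""
    return math.ceil((sigma / (sd_x * target_se)) ** 2)

# Hypothetical planning values: residual SD 10, exposure SD 2, target SE 0.5.
print(required_n(10.0, 2.0, 0.5))
print(slope_se(10.0, 2.0, 100))
```

The formula makes the intuition explicit: precision improves with more subjects and with a wider spread of exposure values, and worsens with noisier outcomes.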
Breast feeding and resilience against psychosocial stress
Montgomery, S M; Ehlin, A; Sacker, A
2006-01-01
Background Some early life exposures may result in a well controlled stress response, which can reduce stress related anxiety. Breast feeding may be a marker of some relevant exposures. Aims To assess whether breast feeding is associated with modification of the relation between parental divorce and anxiety. Methods Observational study using longitudinal birth cohort data. Linear regression was used to assess whether breast feeding modifies the association of parental divorce/separation with anxiety using stratification and interaction testing. Data were obtained from the 1970 British Cohort Study, which is following the lives of those born in one week in 1970 and living in Great Britain. This study uses information collected at birth and at ages 5 and 10 years for 8958 subjects. Class teachers answered a question on anxiety among 10 year olds using an analogue scale (range 0–50) that was log transformed to minimise skewness. Results Among 5672 non‐breast fed subjects, parental divorce/separation was associated with a statistically significantly raised risk of anxiety, with a regression coefficient (95% CI) of 9.4 (6.1 to 12.8). Among the breast fed group this association was much lower: 2.2 (−2.6 to 7.0). Interaction testing confirmed statistically significant effect modification by breast feeding, independent of simultaneous adjustment for multiple potential confounding factors, producing an interaction coefficient of −7.0 (−12.8 to −1.2), indicating a 7% reduction in anxiety after adjustment. Conclusions Breast feeding is associated with resilience against the psychosocial stress linked with parental divorce/separation. This could be because breast feeding is a marker of exposures related to maternal characteristics and parent–child interaction. PMID:16887859
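The interaction test mentioned in the Methods can be illustrated in its simplest unadjusted form. With two binary predictors and their product, the linear model y = b0 + b1*divorce + b2*breastfed + b3*divorce*breastfed is saturated, so the least-squares coefficients are simple functions of the four group means and b3 is a difference of differences. The data below are invented and the function names are hypothetical; the study's own models also adjusted for confounders.

```python
def cell_mean(rows, divorced, breastfed):
    """Mean outcome in one of the four exposure groups."""
    vals = [y for d, b, y in rows if d == divorced and b == breastfed]
    return sum(vals) / len(vals)

def interaction_coefficients(rows):
    """OLS coefficients (b0, b1, b2, b3) of the saturated two-by-two model."""
    m00 = cell_mean(rows, 0, 0)
    m10 = cell_mean(rows, 1, 0)
    m01 = cell_mean(rows, 0, 1)
    m11 = cell_mean(rows, 1, 1)
    b0 = m00                        # reference group mean
    b1 = m10 - m00                  # divorce effect when not breast fed
    b2 = m01 - m00                  # breast feeding effect without divorce
    b3 = (m11 - m01) - (m10 - m00)  # how the divorce effect changes with breast feeding
    return b0, b1, b2, b3

# Hypothetical rows: (divorced, breastfed, anxiety score).
rows = [
    (0, 0, 9.0), (0, 0, 11.0),
    (1, 0, 18.0), (1, 0, 20.0),
    (0, 1, 9.5), (0, 1, 10.5),
    (1, 1, 11.0), (1, 1, 13.0),
]
print(interaction_coefficients(rows))
```

A negative b3 means the association between divorce and the outcome is weaker in the breast fed group, which is the pattern the abstract reports.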
Yu, Xu; Lin, Jun-Yu; Jiang, Feng; Du, Jun-Wei; Han, Ji-Zhong
2018-01-01
Cross-domain collaborative filtering (CDCF) solves the sparsity problem by transferring rating knowledge from auxiliary domains. Obviously, different auxiliary domains have different importance to the target domain. However, previous works cannot evaluate effectively the significance of different auxiliary domains. To overcome this drawback, we propose a cross-domain collaborative filtering algorithm based on Feature Construction and Locally Weighted Linear Regression (FCLWLR). We first construct features in different domains and use these features to represent different auxiliary domains. Thus the weight computation across different domains can be converted as the weight computation across different features. Then we combine the features in the target domain and in the auxiliary domains together and convert the cross-domain recommendation problem into a regression problem. Finally, we employ a Locally Weighted Linear Regression (LWLR) model to solve the regression problem. As LWLR is a nonparametric regression method, it can effectively avoid underfitting or overfitting problem occurring in parametric regression methods. We conduct extensive experiments to show that the proposed FCLWLR algorithm is effective in addressing the data sparsity problem by transferring the useful knowledge from the auxiliary domains, as compared to many state-of-the-art single-domain or cross-domain CF methods. PMID:29623088
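The LWLR step can be sketched in one dimension. This is a generic illustration of locally weighted linear regression with a Gaussian kernel, not the FCLWLR algorithm itself, which works on constructed multi-domain features. The function name, the bandwidth tau, and the data are assumptions.

```python
import math

def lwlr_predict(x0, xs, ys, tau=1.0):
    """Predict y at x0 from a weighted least-squares line fitted around x0.

    Points near x0 get weights close to 1; distant points get weights near 0.
    """
    w = [math.exp(-((x - x0) ** 2) / (2 * tau ** 2)) for x in xs]
    sw = sum(w)
    x_bar = sum(wi * x for wi, x in zip(w, xs)) / sw
    y_bar = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - x_bar) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - x_bar) * (y - y_bar) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx
    return y_bar + slope * (x0 - x_bar)

# Hypothetical curved data: a single global line fits poorly, a local one follows the curve.
xs = [i * 0.5 for i in range(21)]   # 0.0, 0.5, ..., 10.0
ys = [x ** 2 for x in xs]
print(lwlr_predict(3.0, xs, ys, tau=0.5))
```

A separate line is fitted for every query point, which is what makes the method nonparametric: no single global set of coefficients is ever estimated.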
Esserman, Denise A.; Moore, Charity G.; Roth, Mary T.
2009-01-01
Older community dwelling adults often take multiple medications for numerous chronic diseases. Non-adherence to these medications can have a large public health impact. Therefore, the measurement and modeling of medication adherence in the setting of polypharmacy is an important area of research. We apply a variety of different modeling techniques (standard linear regression; weighted linear regression; adjusted linear regression; naïve logistic regression; beta-binomial (BB) regression; generalized estimating equations (GEE)) to binary medication adherence data from a study in a North Carolina based population of older adults, where each medication an individual was taking was classified as adherent or non-adherent. In addition, through simulation we compare these different methods based on Type I error rates, bias, power, empirical 95% coverage, and goodness of fit. We find that estimation and inference using GEE is robust to a wide variety of scenarios and we recommend using this in the setting of polypharmacy when adherence is dichotomously measured for multiple medications per person. PMID:20414358
Genetic Programming Transforms in Linear Regression Situations
NASA Astrophysics Data System (ADS)
Castillo, Flor; Kordon, Arthur; Villa, Carlos
The chapter summarizes the use of Genetic Programming (GP) in Multiple Linear Regression (MLR) to address multicollinearity and Lack of Fit (LOF). The basis of the proposed method is applying appropriate input transforms (model respecification) that deal with these issues while preserving the information content of the original variables. The transforms are selected from symbolic regression models with optimal trade-off between accuracy of prediction and expressional complexity, generated by multiobjective Pareto-front GP. The chapter includes a comparative study of the GP-generated transforms with Ridge Regression, a variant of ordinary Multiple Linear Regression, which has been a useful and commonly employed approach for reducing multicollinearity. The advantages of GP-generated model respecification are clearly defined and demonstrated. Some recommendations for transforms selection are given as well. The application benefits of the proposed approach are illustrated with a real industrial application in one of the broadest empirical modeling areas in manufacturing - robust inferential sensors. The chapter contributes to increasing the awareness of the potential of GP in statistical model building by MLR.
Naval Research Logistics Quarterly. Volume 28. Number 3,
1981-09-01
denotes component-wise maximum. f has antitone (isotone) differences on C x D if for c1 < c2 and d1 < d2, NAVAL RESEARCH LOGISTICS QUARTERLY VOL. 28...or negative correlations and linear or nonlinear regressions. Given are the moments to order two and, for special cases, the regression function and...data sets. We designate this bnb distribution as G - B - N(a, 0, v). The distribution admits only of positive correlation and linear regressions
Automating approximate Bayesian computation by local linear regression.
Thornton, Kevin R
2009-07-07
In several biological contexts, parameter inference often relies on computationally-intensive techniques. "Approximate Bayesian Computation", or ABC, methods based on summary statistics have become increasingly popular. A particular flavor of ABC based on using a linear regression to approximate the posterior distribution of the parameters, conditional on the summary statistics, is computationally appealing, yet no standalone tool exists to automate the procedure. Here, I describe a program to implement the method. The software package ABCreg implements the local linear-regression approach to ABC. The advantages are: 1. The code is standalone, and fully-documented. 2. The program will automatically process multiple data sets, and create unique output files for each (which may be processed immediately in R), facilitating the testing of inference procedures on simulated data, or the analysis of multiple data sets. 3. The program implements two different transformation methods for the regression step. 4. Analysis options are controlled on the command line by the user, and the program is designed to output warnings for cases where the regression fails. 5. The program does not depend on any particular simulation machinery (coalescent, forward-time, etc.), and therefore is a general tool for processing the results from any simulation. 6. The code is open-source, and modular. Examples of applying the software to empirical data from Drosophila melanogaster, and testing the procedure on simulated data, are shown. In practice, ABCreg simplifies implementing ABC based on local-linear regression.
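The local linear-regression flavor of ABC can be sketched on a toy problem. This is not ABCreg's code or interface. It is a minimal illustration with one parameter and one summary statistic, using Epanechnikov weights on the accepted draws and no parameter transformation. The function names, the uniform prior, and the toy simulator are all assumptions.

```python
import random

def abc_local_linear(s_obs, prior_draw, simulate, n_sims=5000, accept_frac=0.1, seed=1):
    """Rejection ABC followed by a weighted linear-regression adjustment."""
    rng = random.Random(seed)
    sims = []
    for _ in range(n_sims):
        theta = prior_draw(rng)
        s = simulate(theta, rng)
        sims.append((abs(s - s_obs), theta, s))
    sims.sort()
    kept = sims[: int(accept_frac * n_sims)]
    delta = kept[-1][0]  # tolerance: largest accepted distance

    # Epanechnikov weights: draws closest to the observed summary count most.
    w = [1.0 - (d / delta) ** 2 for d, _, _ in kept]
    theta = [t for _, t, _ in kept]
    ds = [s - s_obs for _, _, s in kept]

    # Weighted least squares of theta on (s - s_obs).
    sw = sum(w)
    ds_bar = sum(wi * x for wi, x in zip(w, ds)) / sw
    th_bar = sum(wi * t for wi, t in zip(w, theta)) / sw
    sxx = sum(wi * (x - ds_bar) ** 2 for wi, x in zip(w, ds))
    sxy = sum(wi * (x - ds_bar) * (t - th_bar) for wi, x, t in zip(w, ds, theta))
    beta = sxy / sxx

    # Project each accepted draw to where it would sit if s had equalled s_obs.
    adjusted = [t - beta * x for t, x in zip(theta, ds)]
    post_mean = sum(wi * a for wi, a in zip(w, adjusted)) / sw
    return adjusted, w, post_mean

# Toy problem (assumed): infer a normal mean from the sample mean of 20 draws.
def prior_draw(rng):
    return rng.uniform(-5.0, 5.0)

def simulate(theta, rng):
    data = [rng.gauss(theta, 1.0) for _ in range(20)]
    return sum(data) / len(data)

adjusted, weights, post_mean = abc_local_linear(1.0, prior_draw, simulate)
print(len(adjusted), round(post_mean, 2))
```

The regression step is what lets a fairly loose acceptance tolerance still give a usable posterior approximation, because it corrects each accepted draw for the gap between its simulated summary and the observed one.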
NASA Astrophysics Data System (ADS)
Jakubowski, J.; Stypulkowski, J. B.; Bernardeau, F. G.
2017-12-01
The first phase of the Abu Hamour drainage and storm tunnel was completed in early 2017. The 9.5 km long, 3.7 m diameter tunnel was excavated with two Earth Pressure Balance (EPB) Tunnel Boring Machines from Herrenknecht. TBM operation processes were monitored and recorded by Data Acquisition and Evaluation System. The authors coupled collected TBM drive data with available information on rock mass properties, cleansed, completed with secondary variables and aggregated by weeks and shifts. Correlations and descriptive statistics charts were examined. Multivariate Linear Regression and CART regression tree models linking TBM penetration rate (PR), penetration per revolution (PPR) and field penetration index (FPI) with TBM operational and geotechnical characteristics were performed for the conditions of the weak/soft rock of Doha. Both regression methods are interpretable and the data were screened with different computational approaches allowing enriched insight. The primary goal of the analysis was to investigate empirical relations between multiple explanatory and responding variables, to search for best subsets of explanatory variables and to evaluate the strength of linear and non-linear relations. For each of the penetration indices, a predictive model coupling both regression methods was built and validated. The resultant models appeared to be stronger than constituent ones and indicated an opportunity for more accurate and robust TBM performance predictions.
Absolute determination of copper and silver in ancient coins using 14 MeV neutrons
NASA Astrophysics Data System (ADS)
Chalouhi, Ch.; Hourani, E.; Loos, R.; Melki, S.
1982-09-01
A method for absolute determination of copper and silver in ancient coins is described. Activation analysis by 14 MeV neutrons is performed. In the experimental procedure emphasis is placed on corrections for neutron and gamma attenuation. In the analytical procedure, a multilinear regression calculation is used to separate different contributions to the 511 keV gamma peak. The precision in the absolute determination of Cu and Ag is better than 2% in recent coins of definite shapes, whereas it is somewhat lower in ancient coins of irregular shapes. The method was applied to ancient coins provided by the Museum of the American University of Beirut. Overall consistency and suitability of the method were obtained.
Lau, Ying; Wong, Daniel Fu Keung; Wang, Yuqiong; Kwong, Dennis Ho Keung; Wang, Ying
2014-10-01
A community-based sample of 755 pregnant Chinese women was recruited to test the direct and moderating effects of social support in mitigating perceived stress associated with antenatal depressive or anxiety symptoms. The Social Support Rating Scale, the Perceived Stress Scale, the Edinburgh Postnatal Depression Scale and the Zung Self-Rating Anxiety Scale were used. Social support was found to have direct effects and moderating effects on the women's perceived stress on antenatal depressive and anxiety symptoms in multiple linear regression models. This knowledge of the separate effects of social support on behavioral health is important to psychiatric nurses in planning preventive interventions. Copyright © 2014 Elsevier Inc. All rights reserved.
Yokoyama, H; Matsumoto, M; Shiraishi, H; Ishii, H
2000-04-01
We established a high performance liquid chromatography system that allowed simultaneous quantification of various retinoids. We applied the retinoids to a high performance liquid chromatography system with a silica gel absorption column. Samples were separated by the system with a binary multistep gradient with two kinds of solvent that contained n-hexane, 2-propanol, and glacial acetic acid in different ratios. Each retinoid was detected at a wavelength of 350 nm. This condition allowed separation of 13-cis-retinoic acid, 9-cis-retinoic acid, all-trans-retinoic acid, 13-cis-retinol, all-trans-retinol, all-trans-4-oxo-retinoic acid, and 13-cis-4-oxo-retinoic acid as distinct single peaks. Each retinoid was also analyzed separately and its retention time determined. To ascertain the reliability of this system for retinoid quantification, retinoids at various concentrations were applied to the system. We observed linearity between the concentration and area under the curve of the peak for each retinoid by linear least-squares regression analysis up to 2.5 ng/ml for all retinoic acids and up to 5 ng/ml for all retinols. There was no significant scattering in tests of within-day reproducibility or day-to-day reproducibility. Using this system, we examined effects of light exposure on isomerization of retinoids. When retinoids were exposed to room light for 2 hr, the amounts of all but 13-cis-retinol changed significantly. In particular, the amounts of all-trans-retinoic acid and 9-cis-retinoic acid were reduced by 40% and 60%, respectively. The HPLC system established in this study should be useful for studying the oxidation pathway of retinol to retinoic acid. A light-shielded condition is required when particular retinoic acids are analyzed.
Spectral-Spatial Shared Linear Regression for Hyperspectral Image Classification.
Haoliang Yuan; Yuan Yan Tang
2017-04-01
Classification of the pixels in hyperspectral image (HSI) is an important task and has been popularly applied in many practical applications. Its major challenge is the high-dimensional small-sized problem. To deal with this problem, many subspace learning (SL) methods have been developed to reduce the dimension of the pixels while preserving the important discriminant information. Motivated by the ridge linear regression (RLR) framework for SL, we propose a spectral-spatial shared linear regression method (SSSLR) for extracting the feature representation. Compared with RLR, our proposed SSSLR has the following two advantages. First, we utilize a convex set to explore the spatial structure for computing the linear projection matrix. Second, we utilize a shared structure learning model, which is formed by original data space and a hidden feature space, to learn a more discriminant linear projection matrix for classification. To optimize our proposed method, an efficient iterative algorithm is proposed. Experimental results on two popular HSI data sets, i.e., Indian Pines and Salinas, demonstrate that our proposed methods outperform many SL methods.
Mikulich-Gilbertson, Susan K; Wagner, Brandie D; Grunwald, Gary K; Riggs, Paula D; Zerbe, Gary O
2018-01-01
Medical research is often designed to investigate changes in a collection of response variables that are measured repeatedly on the same subjects. The multivariate generalized linear mixed model (MGLMM) can be used to evaluate random coefficient associations (e.g. simple correlations, partial regression coefficients) among outcomes that may be non-normal and differently distributed by specifying a multivariate normal distribution for their random effects and then evaluating the latent relationship between them. Empirical Bayes predictors are readily available for each subject from any mixed model and are observable and hence, plottable. Here, we evaluate whether second-stage association analyses of empirical Bayes predictors from a MGLMM, provide a good approximation and visual representation of these latent association analyses using medical examples and simulations. Additionally, we compare these results with association analyses of empirical Bayes predictors generated from separate mixed models for each outcome, a procedure that could circumvent computational problems that arise when the dimension of the joint covariance matrix of random effects is large and prohibits estimation of latent associations. As has been shown in other analytic contexts, the p-values for all second-stage coefficients that were determined by naively assuming normality of empirical Bayes predictors provide a good approximation to p-values determined via permutation analysis. Analyzing outcomes that are interrelated with separate models in the first stage and then associating the resulting empirical Bayes predictors in a second stage results in different mean and covariance parameter estimates from the maximum likelihood estimates generated by a MGLMM. The potential for erroneous inference from using results from these separate models increases as the magnitude of the association among the outcomes increases. Thus if computable, scatterplots of the conditionally independent empirical Bayes predictors from a MGLMM are always preferable to scatterplots of empirical Bayes predictors generated by separate models, unless the true association between outcomes is zero.
Simple linear and multivariate regression models.
Rodríguez del Águila, M M; Benítez-Parejo, N
2011-01-01
In biomedical research it is common to find problems in which we wish to relate a response variable to one or more variables capable of describing the behaviour of the former variable by means of mathematical models. Regression techniques are used to this effect, in which an equation is determined relating the two variables. While such equations can have different forms, linear equations are the most widely used form and are easy to interpret. The present article describes simple and multiple linear regression models, how they are calculated, and how their applicability assumptions are checked. Illustrative examples are provided, based on the use of the freely accessible R program. Copyright © 2011 SEICAP. Published by Elsevier Espana. All rights reserved.
Rybolt, Thomas R; Bivona, Kevin T; Thomas, Howard E; O'Dell, Casey M
2009-10-01
Gas-solid chromatography was used to determine B2s (gas-solid virial coefficient) values for eight molecular adsorbates interacting with a carbon powder (Carbopack B, Supelco). B2s values were determined by multiple size variant injections within the temperature range of 313-553 K. The molecular adsorbates included: carbon dioxide (CO2); tetrafluoromethane (CF4); hexafluoroethane (C2F6); 1,1-difluoroethane (C2H4F2); 1-chloro-1,1-difluoroethane (C2H3ClF2); dichlorodifluoromethane (CCl2F2); trichlorofluoromethane (CCl3F); and 1,1,1-trichloroethane (C2H3Cl3). Two of these molecules are of special interest because they are "super greenhouse gases". The global warming potential, GWP, for CF4 is 6500 and for C2F6 is 9200 relative to the reference value of 1 for CO2. The GWP index considers both radiative blocking and molecular lifetime. For these and other industrial greenhouse gases, adsorptive trapping on a carbonaceous solid, which depends on molecule-surface binding energy, could avoid atmospheric release. The temperature variations of the gas-solid virial coefficients in conjunction with van't Hoff plots were used to find the experimental adsorption energy or binding energy values (E*) for each adsorbate. A molecular mechanics based, rough-surface model was used to calculate the molecule-surface binding energy (Ecal*) using augmented MM2 parameters. The surface model consisted of parallel graphene layers with two separated nanostructures each containing 17 benzene rings arranged in linear strips. The separation of the parallel nanostructures had been optimized in a prior study to appropriately represent molecule-surface interactions for Carbopack B. Linear regressions of E* versus Ecal* for the current data set of eight molecules and the same surface model gave E* = 0.926 Ecal* and r² = 0.956. A combined set of the current and prior Carbopack B adsorbates studied (linear alkanes, branched alkanes, cyclic alkanes, ethers, and halogenated hydrocarbons) gave a data set with 33 molecules and a regression of E* = 0.991 Ecal* and r² = 0.968. These results indicated a good correlation between the experimental and the MM2 computed molecule-surface binding energies.
Narayanan, Neethu; Gupta, Suman; Gajbhiye, V T; Manjaiah, K M
2017-04-01
A carboxy methyl cellulose-nano organoclay (nano montmorillonite modified with 35-45 wt % dimethyl dialkyl (C14-C18) amine (DMDA)) composite was prepared by solution intercalation method. The prepared composite was characterized by infrared spectroscopy (FTIR), X-Ray diffraction spectroscopy (XRD) and scanning electron microscopy (SEM). The composite was utilized for its pesticide sorption efficiency for atrazine, imidacloprid and thiamethoxam. The sorption data was fitted into Langmuir and Freundlich isotherms using linear and non-linear methods. The linear regression method suggested best fitting of sorption data into Type II Langmuir and Freundlich isotherms. In order to avoid the bias resulting from linearization, seven different error parameters were also analyzed by non-linear regression method. The non-linear error analysis suggested that the sorption data fitted well into Langmuir model rather than in Freundlich model. The maximum sorption capacity, Q0 (μg/g) was given by imidacloprid (2000) followed by thiamethoxam (1667) and atrazine (1429). The study suggests that the degree of determination of linear regression alone cannot be used for comparing the best fitting of Langmuir and Freundlich models and non-linear error analysis needs to be done to avoid inaccurate results. Copyright © 2017 Elsevier Ltd. All rights reserved.
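The linearized Langmuir fit can be sketched as follows. The sketch assumes the form commonly labelled Type II, 1/qe = 1/Q0 + (1/(b*Q0)) * (1/Ce), obtained from qe = Q0*b*Ce / (1 + b*Ce). The abstract does not spell out the equation, and the function name, concentrations, and parameter values are hypothetical.

```python
def langmuir_type2_fit(ce, qe):
    """Fit the Type II linearized Langmuir isotherm by ordinary least squares.

    Regresses 1/qe on 1/Ce; the intercept is 1/Q0 and the slope is 1/(b*Q0).
    """
    x = [1.0 / c for c in ce]
    y = [1.0 / q for q in qe]
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    q0 = 1.0 / intercept
    b = intercept / slope
    return q0, b

# Hypothetical noise-free sorption data generated from Q0 = 2000 ug/g, b = 0.05.
ce = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
qe = [2000.0 * 0.05 * c / (1.0 + 0.05 * c) for c in ce]
q0_hat, b_hat = langmuir_type2_fit(ce, qe)
print(q0_hat, b_hat)
```

With noisy data the reciprocal transform inflates errors at low concentrations, which is the linearization bias the abstract addresses with non-linear error analysis.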
London Measure of Unplanned Pregnancy: guidance for its use as an outcome measure
Hall, Jennifer A; Barrett, Geraldine; Copas, Andrew; Stephenson, Judith
2017-01-01
Background The London Measure of Unplanned Pregnancy (LMUP) is a psychometrically validated measure of the degree of intention of a current or recent pregnancy. The LMUP is increasingly being used worldwide, and can be used to evaluate family planning or preconception care programs. However, beyond recommending the use of the full LMUP scale, there is no published guidance on how to use the LMUP as an outcome measure. Ordinal logistic regression has been recommended informally, but studies published to date have all used binary logistic regression and dichotomized the scale at different cut points. There is thus a need for evidence-based guidance to provide a standardized methodology for multivariate analysis and to enable comparison of results. This paper makes recommendations for the regression method for analysis of the LMUP as an outcome measure. Materials and methods Data collected from 4,244 pregnant women in Malawi were used to compare five regression methods: linear, logistic with two cut points, and ordinal logistic with either the full or grouped LMUP score. The recommendations were then tested on the original UK LMUP data. Results There were small but no important differences in the findings across the regression models. Logistic regression resulted in the largest loss of information, and assumptions were violated for the linear and ordinal logistic regression. Consequently, robust standard errors were used for linear regression and a partial proportional odds ordinal logistic regression model attempted. The latter could only be fitted for grouped LMUP score. Conclusion We recommend the linear regression model with robust standard errors to make full use of the LMUP score when analyzed as an outcome measure. Ordinal logistic regression could be considered, but a partial proportional odds model with grouped LMUP score may be required. Logistic regression is the least-favored option, due to the loss of information. 
For logistic regression, the cut point for un/planned pregnancy should be between nine and ten. These recommendations will standardize the analysis of LMUP data and enhance comparability of results across studies. PMID:28435343
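The recommended cut point can be expressed as a small helper. This is a sketch only: it assumes the LMUP total score is an integer from 0 to 12 (a range not stated in the abstract), and the function name is hypothetical.

```python
def is_planned(lmup_score):
    """Dichotomize an LMUP score with the cut point between 9 and 10.

    Assumes scores are integers in the range 0-12 (an assumption here).
    Returns True for scores of 10 or more, False for scores of 9 or less.
    """
    if not 0 <= lmup_score <= 12:
        raise ValueError("LMUP score is assumed to lie between 0 and 12")
    return lmup_score >= 10


# Binary outcome for a few hypothetical scores, e.g. for logistic regression
outcomes = [is_planned(s) for s in (3, 9, 10, 12)]
```

Because this collapses thirteen possible scores into two categories, it discards the information that the abstract cites as the reason logistic regression is the least-favored option.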
1994-09-01
Institute of Technology, Wright-Patterson AFB OH, January 1994. 4. Neter, John and others. Applied Linear Regression Models. Boston: Irwin, 1989. 5...Technology, Wright-Patterson AFB OH 5 April 1994. 29. Neter, John and others. Applied Linear Regression Models. Boston: Irwin, 1989. 30. Office of
An Evaluation of the Automated Cost Estimating Integrated Tools (ACEIT) System
1989-09-01
residual and it is described as the residual divided by its standard deviation (13:App A,17). Neter, Wasserman, and Kutner, in Applied Linear Regression Models...others. Applied Linear Regression Models. Homewood IL: Irwin, 1983. 19. Raduchel, William J. "A Professional's Perspective on User-Friendliness," Byte
A Simple and Convenient Method of Multiple Linear Regression to Calculate Iodine Molecular Constants
ERIC Educational Resources Information Center
Cooper, Paul D.
2010-01-01
A new procedure using a student-friendly least-squares multiple linear-regression technique utilizing a function within Microsoft Excel is described that enables students to calculate molecular constants from the vibronic spectrum of iodine. This method is advantageous pedagogically as it calculates molecular constants for ground and excited…
Conjoint Analysis: A Study of the Effects of Using Person Variables.
ERIC Educational Resources Information Center
Fraas, John W.; Newman, Isadore
Three statistical techniques--conjoint analysis, a multiple linear regression model, and a multiple linear regression model with a surrogate person variable--were used to estimate the relative importance of five university attributes for students in the process of selecting a college. The five attributes include: availability and variety of…
Fitting program for linear regressions according to Mahon (1996)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Trappitsch, Reto G.
2018-01-09
This program takes the users' input data and fits a linear regression to it using the prescription presented by Mahon (1996). Compared to the commonly used York fit, this method has the correct prescription for measurement error propagation. This software should facilitate the proper fitting of measurements with a simple interface.
How Robust Is Linear Regression with Dummy Variables?
ERIC Educational Resources Information Center
Blankmeyer, Eric
2006-01-01
Researchers in education and the social sciences make extensive use of linear regression models in which the dependent variable is continuous-valued while the explanatory variables are a combination of continuous-valued regressors and dummy variables. The dummies partition the sample into groups, some of which may contain only a few observations.…
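To make the role of a dummy variable concrete, the sketch below fits a least-squares line to a response with a single 0/1 dummy as the only regressor. In that special case the intercept equals the mean of the reference group and the slope equals the difference between the two group means. The data and names are hypothetical, and the example says nothing about the robustness question the paper studies.

```python
def ols(x, y):
    """Simple least-squares line; returns (slope, intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical sample: dummy = 0 for the reference group, 1 for a small group
dummy = [0, 0, 0, 1, 1]
response = [1.0, 2.0, 3.0, 5.0, 7.0]

slope, intercept = ols(dummy, response)
mean_ref = sum(r for d, r in zip(dummy, response) if d == 0) / dummy.count(0)
mean_grp = sum(r for d, r in zip(dummy, response) if d == 1) / dummy.count(1)
```

When a group contains only a few observations, as here, its coefficient rests on very little data, so a single unusual value can move it substantially.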
Revisiting the Scale-Invariant, Two-Dimensional Linear Regression Method
ERIC Educational Resources Information Center
Patzer, A. Beate C.; Bauer, Hans; Chang, Christian; Bolte, Jan; Sülzle, Detlev
2018-01-01
The scale-invariant way to analyze two-dimensional experimental and theoretical data with statistical errors in both the independent and dependent variables is revisited by using what we call the triangular linear regression method. This is compared to the standard least-squares fit approach by applying it to typical simple sets of example data…
ERIC Educational Resources Information Center
Thompson, Russel L.
Homoscedasticity is an important assumption of linear regression. This paper explains what it is and why it is important to the researcher. Graphical and mathematical methods for testing the homoscedasticity assumption are demonstrated. Sources of homoscedasticity and types of homoscedasticity are discussed, and methods for correction are…
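One common mathematical check of this assumption is the Breusch-Pagan test in its studentized (Koenker) form, sketched below for a single predictor. This is offered as an illustration and is not necessarily one of the methods the paper demonstrates. The statistic is n times the R² from regressing the squared residuals on the predictor, and it is compared with a chi-squared distribution with one degree of freedom (5% critical value about 3.84). Function names and data are hypothetical.

```python
def ols(x, y):
    """Simple least-squares line; returns (slope, intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(x, y):
    """Coefficient of determination of the least-squares line of y on x."""
    slope, intercept = ols(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def breusch_pagan_lm(x, y):
    """Studentized Breusch-Pagan statistic: n * R^2 of residual^2 on x."""
    slope, intercept = ols(x, y)
    sq_resid = [(yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y)]
    return len(x) * r_squared(x, sq_resid)


# Hypothetical data whose scatter grows with x
x = [1, 1, 2, 2, 3, 3]
y = [2.0, 0.0, 4.0, 0.0, 6.0, 0.0]
lm = breusch_pagan_lm(x, y)
```

A large statistic indicates that the residual variance changes with the predictor, that is, that the homoscedasticity assumption is doubtful.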
On the null distribution of Bayes factors in linear regression
USDA-ARS's Scientific Manuscript database
We show that under the null, the 2 log (Bayes factor) is asymptotically distributed as a weighted sum of chi-squared random variables with a shifted mean. This claim holds for Bayesian multi-linear regression with a family of conjugate priors, namely, the normal-inverse-gamma prior, the g-prior, and...
Common pitfalls in statistical analysis: Linear regression analysis
Aggarwal, Rakesh; Ranganathan, Priya
2017-01-01
In a previous article in this series, we explained correlation analysis which describes the strength of relationship between two continuous variables. In this article, we deal with linear regression analysis which predicts the value of one continuous variable from another. We also discuss the assumptions and pitfalls associated with this analysis. PMID:28447022
Comparison of l₁-Norm SVR and Sparse Coding Algorithms for Linear Regression.
Zhang, Qingtian; Hu, Xiaolin; Zhang, Bo
2015-08-01
Support vector regression (SVR) is a popular function estimation technique based on Vapnik's concept of support vector machine. Among many variants, the l1-norm SVR is known to be good at selecting useful features when the features are redundant. Sparse coding (SC) is a technique widely used in many areas and a number of efficient algorithms are available. Both l1-norm SVR and SC can be used for linear regression. In this brief, the close connection between the l1-norm SVR and SC is revealed and some typical algorithms are compared for linear regression. The results show that the SC algorithms outperform the Newton linear programming algorithm, an efficient l1-norm SVR algorithm, in efficiency. The algorithms are then used to design the radial basis function (RBF) neural networks. Experiments on some benchmark data sets demonstrate the high efficiency of the SC algorithms. In particular, one of the SC algorithms, the orthogonal matching pursuit is two orders of magnitude faster than a well-known RBF network designing algorithm, the orthogonal least squares algorithm.
NASA Astrophysics Data System (ADS)
Lusiana, Evellin Dewi
2017-12-01
The parameters of the binary probit regression model are commonly estimated using the maximum likelihood estimation (MLE) method. However, MLE has a limitation when the binary data contain separation. Separation is the condition in which one or several independent variables exactly group the categories of the binary response. It causes the MLE estimators to become non-convergent, so that they cannot be used in modeling. One way to resolve separation is to use Firth's approach instead. This research has two aims. The first is to identify the chance of separation occurring in the binary probit regression model under the MLE method and under Firth's approach. The second is to compare the performance of the binary probit regression estimators obtained by the MLE method and by Firth's approach using the RMSE criterion. Both are carried out by simulation under different sample sizes. The results showed that the chance of separation occurring under the MLE method for small sample sizes is higher than under Firth's approach. For larger sample sizes, the probability decreases and is nearly identical between the MLE method and Firth's approach. Meanwhile, Firth's estimators have smaller RMSE than the MLEs, especially for smaller sample sizes, while for larger sample sizes the RMSEs are not much different. This means that Firth's estimators outperformed the MLE estimators.
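The notion of separation described above can be checked directly in the simplest case. The sketch below tests for complete separation with a single numeric predictor: every value in one response category lies strictly below every value in the other. It is an illustrative simplification, since the study concerns models that may have several predictors, and the function name is hypothetical.

```python
def has_complete_separation(x, y):
    """True if a single predictor x perfectly splits the 0/1 response y."""
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    if not x0 or not x1:
        raise ValueError("both response categories must be present")
    # Separated if the two groups of x values do not overlap at all
    return max(x0) < min(x1) or max(x1) < min(x0)


# Hypothetical samples
separated = has_complete_separation([1, 2, 3, 4], [0, 0, 1, 1])
overlapping = has_complete_separation([1, 3, 2, 4], [0, 0, 1, 1])
```

When the check returns True, the likelihood keeps increasing as the coefficient grows without bound, which is why the MLE fails to converge in that situation.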
García-Esquinas, Esther; Pérez-Gómez, Beatriz; Fernández, Mario Antonio; Pérez-Meixeira, Ana María; Gil, Elisa; de Paz, Concha; Iriso, Andrés; Sanz, Juan Carlos; Astray, Jenaro; Cisneros, Margot; de Santos, Amparo; Asensio, Angel; García-Sagredo, José Miguel; García, José Frutos; Vioque, Jesus; Pollán, Marina; López-Abente, Gonzalo; González, Maria José; Martínez, Mercedes; Bohigas, Pedro Arias; Pastor, Roberto; Aragonés, Nuria
2011-09-01
Although breastfeeding is the ideal way of nurturing infants, it can be a source of exposure to toxicants. This study reports the concentration of Hg, Pb and Cd in breast milk from a sample of women drawn from the general population of the Madrid Region, and explores the association between metal levels and socio-demographic factors, lifestyle habits, diet and environmental exposures, including tobacco smoke, exposure at home and occupational exposures. Breast milk was obtained from 100 women (20 mL) at around the third week postpartum. Pb, Cd and Hg levels were determined using Atomic Absorption Spectrometry. Metal levels were log-transformed due to non-normal distribution. Their association with the variables collected by questionnaire was assessed using linear regression models. Separate models were fitted for Hg, Pb and Cd, using univariate linear regression in a first step. Secondly, multivariate linear regression models were adjusted introducing potential confounders specific for each metal. Finally, a test for trend was performed in order to evaluate possible dose-response relationships between metal levels and changes in variables categories. Geometric mean Hg, Pb and Cd content in milk were 0.53 μg L(-1), 15.56 μg L(-1), and 1.31 μg L(-1), respectively. Decreases in Hg levels in older women and in those with a previous history of pregnancies and lactations suggested clearance of this metal over lifetime, though differences were not statistically significant, probably due to limited sample size. Lead concentrations increased with greater exposure to motor vehicle traffic and higher potato consumption. Increased Cd levels were associated with type of lactation and tended to increase with tobacco smoking. Surveillance for the presence of heavy metals in human milk is needed. Smoking and dietary habits are the main factors linked to heavy metal levels in breast milk. 
Our results reinforce the need to strengthen national food safety programs and to further promote avoidance of unhealthy behaviors such as smoking during pregnancy. Copyright © 2011 Elsevier Ltd. All rights reserved.
García-Ramos, Amador; Pestaña-Melero, Francisco L; Pérez-Castilla, Alejandro; Rojas, Francisco J; Gregory Haff, G
2018-05-01
García-Ramos, A, Pestaña-Melero, FL, Pérez-Castilla, A, Rojas, FJ, and Haff, GG. Mean velocity vs. mean propulsive velocity vs. peak velocity: which variable determines bench press relative load with higher reliability? J Strength Cond Res 32(5): 1273-1279, 2018-This study aimed to compare between 3 velocity variables (mean velocity [MV], mean propulsive velocity [MPV], and peak velocity [PV]): (a) the linearity of the load-velocity relationship, (b) the accuracy of general regression equations to predict relative load (%1RM), and (c) the between-session reliability of the velocity attained at each percentage of the 1-repetition maximum (%1RM). The full load-velocity relationship of 30 men was evaluated by means of linear regression models in the concentric-only and eccentric-concentric bench press throw (BPT) variants performed with a Smith machine. The 2 sessions of each BPT variant were performed within the same week separated by 48-72 hours. The main findings were as follows: (a) the MV showed the strongest linearity of the load-velocity relationship (median r = 0.989 for concentric-only BPT and 0.993 for eccentric-concentric BPT), followed by MPV (median r = 0.983 for concentric-only BPT and 0.980 for eccentric-concentric BPT), and finally PV (median r = 0.974 for concentric-only BPT and 0.969 for eccentric-concentric BPT); (b) the accuracy of the general regression equations to predict relative load (%1RM) from movement velocity was higher for MV (SEE = 3.80-4.76%1RM) than for MPV (SEE = 4.91-5.56%1RM) and PV (SEE = 5.36-5.77%1RM); and (c) the PV showed the lowest within-subjects coefficient of variation (3.50%-3.87%), followed by MV (4.05%-4.93%), and finally MPV (5.11%-6.03%). Taken together, these results suggest that the MV could be the most appropriate variable for monitoring the relative load (%1RM) in the BPT exercise performed in a Smith machine.
Relationships between use of television during meals and children's food consumption patterns.
Coon, K A; Goldberg, J; Rogers, B L; Tucker, K L
2001-01-01
We examined relationships between the presence of television during meals and children's food consumption patterns to test whether children's overall food consumption patterns, including foods not normally advertised, vary systematically with the extent to which television is part of normal mealtime routines. Ninety-one parent-child pairs from suburbs adjacent to Washington, DC, recruited via advertisements and word of mouth, participated. Children were in the fourth, fifth, or sixth grades. Socioeconomic data and information on television use were collected during survey interviews. Three nonconsecutive 24-hour dietary recalls, conducted with each child, were used to construct nutrient and food intake outcome variables. Independent sample t tests were used to compare mean food and nutrient intakes of children from families in which the television was usually on during 2 or more meals (n = 41) to those of children from families in which the television was either never on or only on during one meal (n = 50). Multiple linear regression models, controlling for socioeconomic factors and other covariates, were used to test strength of associations between television and children's consumption of food groups and nutrients. Children from families with high television use derived, on average, 6% more of their total daily energy intake from meats; 5% more from pizza, salty snacks, and soda; and nearly 5% less of their energy intake from fruits, vegetables, and juices than did children from families with low television use. Associations between television and children's consumption of food groups remained statistically significant in multiple linear regression models that controlled for socioeconomic factors and other covariates. Children from high television families derived less of their total energy from carbohydrate and consumed twice as much caffeine as children from low television families. 
There continued to be a significant association between television and children's consumption of caffeine when these relationships were tested in multiple linear regression models. The dietary patterns of children from families in which television viewing is a normal part of meal routines may include fewer fruits and vegetables and more pizzas, snack foods, and sodas than the dietary patterns of children from families in which television viewing and eating are separate activities.
NASA Astrophysics Data System (ADS)
Wu, Cheng; Zhen Yu, Jian
2018-03-01
Linear regression techniques are widely used in atmospheric science, but they are often improperly applied due to lack of consideration or inappropriate handling of measurement uncertainty. In this work, numerical experiments are performed to evaluate the performance of five linear regression techniques, significantly extending previous works by Chu and Saylor. The five techniques are ordinary least squares (OLS), Deming regression (DR), orthogonal distance regression (ODR), weighted ODR (WODR), and York regression (YR). We first introduce a new data generation scheme that employs the Mersenne twister (MT) pseudorandom number generator. The numerical simulations are also improved by (a) refining the parameterization of nonlinear measurement uncertainties, (b) inclusion of a linear measurement uncertainty, and (c) inclusion of WODR for comparison. Results show that DR, WODR and YR produce an accurate slope, but the intercept by WODR and YR is overestimated and the degree of bias is more pronounced with a low-R² XY dataset. The importance of a proper weighting parameter λ in DR is investigated by sensitivity tests, and it is found that an improper λ in DR can lead to a bias in both the slope and intercept estimation. Because the λ calculation depends on the actual form of the measurement error, it is essential to determine the exact form of measurement error in the XY data during the measurement stage. If a priori error in one of the variables is unknown, or the measurement error described cannot be trusted, DR, WODR and YR can provide the least biases in slope and intercept among all tested regression techniques. For these reasons, DR, WODR and YR are recommended for atmospheric studies when both X and Y data have measurement errors. An Igor Pro-based program (Scatter Plot) was developed to facilitate the implementation of error-in-variables regressions.
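For orientation, the closed-form Deming estimator can be sketched in a few lines. The parameter `delta` below is assumed to be the ratio of the Y error variance to the X error variance; the λ used in the paper may be defined differently (for example as the reciprocal), so treat the mapping as an assumption. With `delta = 1` the fit reduces to orthogonal regression. This is not the Igor Pro program mentioned above.

```python
import math


def deming_fit(x, y, delta=1.0):
    """Deming regression; delta = var(y errors) / var(x errors) (assumed)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x) / (n - 1)
    syy = sum((yi - my) ** 2 for yi in y) / (n - 1)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)
    if sxy == 0:
        raise ValueError("x and y are uncorrelated; slope is undefined")
    diff = syy - delta * sxx
    slope = (diff + math.sqrt(diff * diff + 4.0 * delta * sxy * sxy)) / (2.0 * sxy)
    return slope, my - slope * mx


# Hypothetical, noise-free data on the line y = 2x + 1
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
slope, intercept = deming_fit(x, y, delta=1.0)
```

Unlike OLS, the result depends on the assumed error-variance ratio, which is why the abstract stresses choosing the weighting parameter from the actual form of the measurement error.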
Afantitis, Antreas; Melagraki, Georgia; Sarimveis, Haralambos; Koutentis, Panayiotis A; Markopoulos, John; Igglessi-Markopoulou, Olga
2006-08-01
A quantitative structure-activity relationship was obtained by applying Multiple Linear Regression Analysis to a series of 80 1-[2-hydroxyethoxy-methyl]-6-(phenylthio) thymine (HEPT) derivatives with significant anti-HIV activity. For the selection of the best among 37 different descriptors, the Elimination Selection Stepwise Regression Method (ES-SWR) was utilized. The resulting QSAR model (R²(CV) = 0.8160; S(PRESS) = 0.5680) proved to be very accurate both in training and predictive stages.
Wavelet regression model in forecasting crude oil price
NASA Astrophysics Data System (ADS)
Hamid, Mohd Helmie; Shabri, Ani
2017-05-01
This study presents the performance of the wavelet multiple linear regression (WMLR) technique in daily crude oil forecasting. The WMLR model was developed by integrating the discrete wavelet transform (DWT) and the multiple linear regression (MLR) model. The original time series was decomposed into sub-time series with different scales by wavelet theory. Correlation analysis was conducted to assist in the selection of optimal decomposed components as inputs for the WMLR model. The daily WTI crude oil price series has been used in this study to test the prediction capability of the proposed model. The forecasting performance of the WMLR model was also compared with regular multiple linear regression (MLR), Autoregressive Integrated Moving Average (ARIMA) and Generalized Autoregressive Conditional Heteroscedasticity (GARCH) models using root mean square error (RMSE) and mean absolute error (MAE). Based on the experimental results, it appears that the WMLR model performs better than the other forecasting techniques tested in this study.
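The two error measures used for the comparison are straightforward to compute. The sketch below shows both; the function names and the sample price series are hypothetical and unrelated to the study's data.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical daily prices and forecasts (illustrative only)
actual = [50.0, 51.0, 52.0]
forecast = [50.0, 51.0, 55.0]
print(rmse(actual, forecast), mae(actual, forecast))
```

RMSE penalizes large misses more heavily than MAE, so reporting both gives a fuller picture of forecast quality.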
Partitioning sources of variation in vertebrate species richness
Boone, R.B.; Krohn, W.B.
2000-01-01
Aim: To explore biogeographic patterns of terrestrial vertebrates in Maine, USA using techniques that would describe local and spatial correlations with the environment. Location: Maine, USA. Methods: We delineated the ranges within Maine (86,156 km²) of 275 species using literature and expert review. Ranges were combined into species richness maps, and compared to geomorphology, climate, and woody plant distributions. Methods were adapted that compared richness of all vertebrate classes to each environmental correlate, rather than assessing a single explanatory theory. We partitioned variation in species richness into components using tree and multiple linear regression. Methods were used that allowed for useful comparisons between tree and linear regression results. For both methods we partitioned variation into broad-scale (spatially autocorrelated) and fine-scale (spatially uncorrelated) explained and unexplained components. By partitioning variance, and using both tree and linear regression in analyses, we explored the degree of variation in species richness for each vertebrate group that could be explained by the relative contribution of each environmental variable. Results: In tree regression, climate variation explained richness better (92% of mean deviance explained for all species) than woody plant variation (87%) and geomorphology (86%). Reptiles were highly correlated with environmental variation (93%), followed by mammals, amphibians, and birds (each with 84-82% deviance explained). In multiple linear regression, climate was most closely associated with total vertebrate richness (78%), followed by woody plants (67%) and geomorphology (56%). Again, reptiles were closely correlated with the environment (95%), followed by mammals (73%), amphibians (63%) and birds (57%). 
Main conclusions: Comparing variation explained using tree and multiple linear regression quantified the importance of nonlinear relationships and local interactions between species richness and environmental variation, identifying the importance of linear relationships between reptiles and the environment, and nonlinear relationships between birds and woody plants, for example. Conservation planners should capture climatic variation in broad-scale designs; temperatures may shift during climate change, but the underlying correlations between the environment and species richness will presumably remain.
Javed, Faizan; Chan, Gregory S H; Savkin, Andrey V; Middleton, Paul M; Malouf, Philip; Steel, Elizabeth; Mackie, James; Lovell, Nigel H
2009-01-01
This paper uses non-linear support vector regression (SVR) to model the blood volume and heart rate (HR) responses in 9 hemodynamically stable kidney failure patients during hemodialysis. Using radial basis function (RBF) kernels, non-parametric models of relative blood volume (RBV) change with time as well as percentage change in HR with respect to RBV were obtained. The ε-insensitive loss function was used for SVR modeling. Selection of the design parameters, which include capacity (C), insensitivity region (ε) and the RBF kernel parameter (sigma), was made based on a grid search approach, and the selected models were cross-validated using the average mean square error (AMSE) calculated from testing data based on a k-fold cross-validation technique. Linear regression was also applied to fit the curves and the AMSE was calculated for comparison with SVR. For the model based on RBV with time, SVR gave a lower AMSE for both training (AMSE=1.5) as well as testing data (AMSE=1.4) compared to linear regression (AMSE=1.8 and 1.5). SVR also provided a better fit for HR with RBV for both training as well as testing data (AMSE=15.8 and 16.4) compared to linear regression (AMSE=25.2 and 20.1).
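The cross-validated error measure described above can be illustrated for the linear regression baseline. The sketch below averages the test-fold mean square error over k folds; it assigns observations to folds by index modulo k, which is an assumption made for simplicity, and it does not implement SVR or the grid search. Function names and data are hypothetical.

```python
def ols(x, y):
    """Simple least-squares line; returns (slope, intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def kfold_amse(x, y, k):
    """Average test-fold mean square error of a linear fit over k folds."""
    n = len(x)
    fold_mse = []
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        slope, intercept = ols([x[i] for i in train], [y[i] for i in train])
        errors = [(y[i] - (intercept + slope * x[i])) ** 2 for i in test]
        fold_mse.append(sum(errors) / len(errors))
    return sum(fold_mse) / k


# Hypothetical noise-free data: the held-out error should be essentially zero
x = [float(i) for i in range(10)]
y = [3.0 * xi + 1.0 for xi in x]
amse = kfold_amse(x, y, k=5)
```

In a grid search, the same routine would be evaluated once per candidate parameter setting and the setting with the lowest average error retained.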
Prunier, J G; Colyn, M; Legendre, X; Nimon, K F; Flamand, M C
2015-01-01
Direct gradient analyses in spatial genetics provide unique opportunities to describe the inherent complexity of genetic variation in wildlife species and are the object of many methodological developments. However, multicollinearity among explanatory variables is a systemic issue in multivariate regression analyses and is likely to cause serious difficulties in properly interpreting results of direct gradient analyses, with the risk of erroneous conclusions, misdirected research and inefficient or counterproductive conservation measures. Using simulated data sets along with linear and logistic regressions on distance matrices, we illustrate how commonality analysis (CA), a detailed variance-partitioning procedure that was recently introduced in the field of ecology, can be used to deal with nonindependence among spatial predictors. By decomposing model fit indices into unique and common (or shared) variance components, CA allows identifying the location and magnitude of multicollinearity, revealing spurious correlations and thus thoroughly improving the interpretation of multivariate regressions. Despite a few inherent limitations, especially in the case of resistance model optimization, this review highlights the great potential of CA to account for complex multicollinearity patterns in spatial genetics and identifies future applications and lines of research. We strongly urge spatial geneticists to systematically investigate commonalities when performing direct gradient analyses. © 2014 John Wiley & Sons Ltd.
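The decomposition into unique and common components can be shown for the two-predictor case. The sketch below works on plain numeric vectors, not distance matrices, and uses the standard identity for the two-predictor R² in terms of pairwise correlations. Function names and data are hypothetical.

```python
import math


def pearson_r(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    return sab / math.sqrt(saa * sbb)


def commonality_two_predictors(x1, x2, y):
    """Split the full-model R^2 into unique and common components."""
    r1, r2, r12 = pearson_r(x1, y), pearson_r(x2, y), pearson_r(x1, x2)
    total = (r1 * r1 + r2 * r2 - 2.0 * r1 * r2 * r12) / (1.0 - r12 * r12)
    return {
        "unique_x1": total - r2 * r2,   # gain from adding x1 to x2 alone
        "unique_x2": total - r1 * r1,   # gain from adding x2 to x1 alone
        "common": r1 * r1 + r2 * r2 - total,
        "total": total,
    }


# Hypothetical uncorrelated predictors: the common component should vanish
x1 = [1.0, -1.0, 1.0, -1.0]
x2 = [1.0, 1.0, -1.0, -1.0]
y = [2.0 * a + b for a, b in zip(x1, x2)]
parts = commonality_two_predictors(x1, x2, y)
```

With correlated predictors the common component becomes non-zero (and can be negative in the presence of suppression), which is the signal used to locate multicollinearity.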
Validity and validation of expert (Q)SAR systems.
Hulzebos, E; Sijm, D; Traas, T; Posthumus, R; Maslankiewicz, L
2005-08-01
At a recent workshop in Setubal (Portugal) principles were drafted to assess the suitability of (quantitative) structure-activity relationships ((Q)SARs) for assessing the hazards and risks of chemicals. In the present study we applied some of the Setubal principles to test the validity of three (Q)SAR expert systems and validate the results. These principles include a mechanistic basis, the availability of a training set and validation. ECOSAR, BIOWIN and DEREK for Windows have a mechanistic or empirical basis. ECOSAR has a training set for each QSAR. For half of the structural fragments the number of chemicals in the training set is >4. Based on structural fragments and log Kow, ECOSAR uses linear regression to predict ecotoxicity. Validating ECOSAR for three 'valid' classes results in predictivity of ≥ 64%. BIOWIN uses (non-)linear regressions to predict the probability of biodegradability based on fragments and molecular weight. It has a large training set and predicts non-ready biodegradability well. DEREK for Windows predictions are supported by a mechanistic rationale and literature references. The structural alerts in this program have been developed with a training set of positive and negative toxicity data. However, to support the prediction only a limited number of chemicals in the training set is presented to the user. DEREK for Windows predicts effects by 'if-then' reasoning. The program predicts best for mutagenicity and carcinogenicity. Each structural fragment in ECOSAR and DEREK for Windows needs to be evaluated and validated separately.
Howell, Kathryn H; Thurston, Idia B; Hasselle, Amanda J; Decker, Kristina; Jamison, Lacy E
2018-04-01
Children are frequently present in homes in which intimate partner violence (IPV) occurs. Following exposure to IPV, children may develop behavioral health difficulties, struggle with regulating emotions, or exhibit aggression. Despite the negative outcomes associated with witnessing IPV, many children also display resilience. Guided by Bronfenbrenner's bioecological model, this study examined person-level, process-level (microsystem), and context-level (mesosystem) factors associated with positive and negative functioning among youth exposed to IPV. Participants were 118 mothers who reported on their 6- to 14-year-old children. All mothers experienced severe physical, psychological, and/or sexual IPV in the past 6 months. Linear regression modeling was conducted separately for youth maladaptive functioning and prosocial skills. The linear regression model for maladaptive functioning was significant, F(6, 110) = 9.32, p < .001, adj. R² = 27%, with more severe IPV (β = .18, p < .05) and more negative parenting practices (β = .34, p < .001) associated with worse child outcomes. The model for prosocial skills was also significant, F(6, 110) = 3.34, p < .01, adj. R² = 14%, with less negative parenting practices (β = -.26, p < .001) and greater community connectedness (β = .17, p < .05) linked to more prosocial skills. These findings provide critical knowledge on specific mutable factors associated with positive and negative functioning among children in the context of IPV exposure. Such factors could be incorporated into strength-based interventions following family violence.
Yazdani, Kamran; Rahimi-Movaghar, Afarin; Nedjat, Saharnaz; Ghalichi, Leila; Khalili, Malahat
2015-01-01
Since Tehran University of Medical Sciences (TUMS) has the oldest and the largest number of research centers among all Iranian medical universities, this study was conducted to evaluate the scientific output of research centers affiliated with TUMS using scientometric indices and the factors affecting it. Moreover, a number of scientometric indicators were introduced. This cross-sectional study was performed to evaluate the 5-year scientific performance of research centers of TUMS. Data were collected through questionnaires, annual evaluation reports of the Ministry of Health, and the Scopus database. We used appropriate measures of central tendency and variation for descriptive analyses. Moreover, univariable and multivariable linear regression were used to evaluate the effect of independent factors on the scientific output of the centers. The medians of the numbers of papers and books during a 5-year period were 150.5 and 2.5, respectively. The median of the "articles per researcher" was 19.1. Based on multiple linear regression, younger centers (p=0.001), having a separate budget line (p=0.016), and number of research personnel (p<0.001) had a direct significant correlation with the number of articles, while real properties had a reverse significant correlation with it (p=0.004). The results can help policy makers and research managers to allocate sufficient resources to improve the current situation of the centers. Newly adopted and effective scientometric indices are suggested for evaluating the scientific outputs and functions of these centers.
NASA Astrophysics Data System (ADS)
Mills, Leila A.
This study examines middle school students' perceptions of a future career in a science, math, engineering, or technology (STEM) career field. Gender, grade, predispositions to STEM contents, and learner dispositions are examined for changing perceptions and development in career-related choice behavior. Student perceptions as measured by validated measurement instruments are analyzed pre and post participation in a STEM intervention energy-monitoring program that was offered in several U.S. middle schools during the 2009-2010, 2010-2011 school years. A multiple linear regression (MLR) model, developed by incorporating predictors identified by an examination of the literature and a hypothesis-generating pilot study for prediction of STEM career interest, is introduced. Theories on the career choice development process from authors such as Ginzberg, Eccles, and Lent are examined as the basis for recognition of career concept development among students. Multiple linear regression statistics, correlation analysis, and analyses of means are used to examine student data from two separate program years. Study research questions focus on predictive ability, RSQ, of MLR models by gender/grade, and significance of model predictors in order to determine the most significant predictors of STEM career interest, and changes in students' perceptions pre and post program participation. Analysis revealed increases in the perceptions of a science career, decreases in perceptions of a STEM career, increase of the significance of science and mathematics to predictive models, and significant increases in students' perceptions of creative tendencies.
Early life stress and physical and psychosocial functioning in late adulthood.
Alastalo, Hanna; von Bonsdorff, Mikaela B; Räikkönen, Katri; Pesonen, Anu-Katriina; Osmond, Clive; Barker, David J P; Heinonen, Kati; Kajantie, Eero; Eriksson, Johan G
2013-01-01
Severe stress experienced in early life may have long-term effects on adult physiological and psychological health and well-being. We studied physical and psychosocial functioning in late adulthood in subjects separated temporarily from their parents in childhood during World War II. The 1803 participants belong to the Helsinki Birth Cohort Study, born 1934-44. Of them, 267 (14.8%) had been evacuated abroad in childhood during WWII and the remaining subjects served as controls. Physical and psychosocial functioning was assessed with the Short Form 36 scale (SF-36) between 2001 and 2004. A test for trends was based on linear regression. All analyses were adjusted for age at clinical examination, social class in childhood and adulthood, smoking, alcohol intake, physical activity, body mass index, cardiovascular disease and diabetes. Physical functioning in late adulthood was lower among the separated men compared to non-separated men (b = -0.40, 95% confidence interval [95% CI]: -0.71 to -0.08). Those men separated in school age (>7 years) and those separated for a duration over 2 years had the highest risk for lower physical functioning (b = -0.89, 95% CI: -1.58 to -0.20, and b = -0.65, 95% CI: -1.25 to -0.05, respectively). Men separated for a duration over 2 years also had lower psychosocial functioning (b = -0.70, 95% CI: -1.35 to -0.06). These differences in physical and psychosocial functioning were not observed among women. Early life stress may increase the risk for impaired physical functioning in late adulthood among men. Timing and duration of the separation influenced the physical and psychosocial functioning in late adulthood.
Strand, Matthew; Sillau, Stefan; Grunwald, Gary K; Rabinovitch, Nathan
2014-02-10
Regression calibration provides a way to obtain unbiased estimators of fixed effects in regression models when one or more predictors are measured with error. Recent development of measurement error methods has focused on models that include interaction terms between measured-with-error predictors, and separately, methods for estimation in models that account for correlated data. In this work, we derive explicit and novel forms of regression calibration estimators and associated asymptotic variances for longitudinal models that include interaction terms, when data from instrumental and unbiased surrogate variables are available but not the actual predictors of interest. The longitudinal data are fit using linear mixed models that contain random intercepts and account for serial correlation and unequally spaced observations. The motivating application involves a longitudinal study of exposure to two pollutants (predictors) - outdoor fine particulate matter and cigarette smoke - and their association in interactive form with levels of a biomarker of inflammation, leukotriene E4 (LTE4, outcome) in asthmatic children. Because the exposure concentrations could not be directly observed, we used measurements from a fixed outdoor monitor and urinary cotinine concentrations as instrumental variables, and we used concentrations of fine ambient particulate matter and cigarette smoke measured with error by personal monitors as unbiased surrogate variables. We applied the derived regression calibration methods to estimate coefficients of the unobserved predictors and their interaction, allowing for direct comparison of toxicity of the different pollutants. We used simulations to verify accuracy of inferential methods based on asymptotic theory. Copyright © 2013 John Wiley & Sons, Ltd.
Post-processing through linear regression
NASA Astrophysics Data System (ADS)
van Schaeybroeck, B.; Vannitsem, S.
2011-03-01
Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast and multicollinearity. The regression schemes under consideration include the ordinary least-square (OLS) method, a new time-dependent Tikhonov regularization (TDTR) method, the total least-square method, a new geometric-mean regression (GM), a recently introduced error-in-variables (EVMOS) method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified. These techniques are applied in the context of the Lorenz '63 system, whose model version is affected by both initial condition and model errors. For short forecast lead times, the number and choice of predictors play an important role. In contrast to the other techniques, GM degrades when the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best member OLS with noise). At long lead times the regression schemes (EVMOS, TDTR) that yield the correct variability and the largest correlation between ensemble error and spread should be preferred.
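To make the "optimal variability" criterion concrete, the following minimal sketch compares single-predictor OLS with geometric-mean (GM) regression on synthetic forecast/observation pairs. The data, seed, and function names are hypothetical, and the GM slope is taken in its textbook single-predictor form, sign(r)·σ_y/σ_x, which may differ in detail from the formulation used by the authors.

```python
import math
import random

def moments(x, y):
    """Return means, standard deviations, and Pearson correlation of x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return mx, my, math.sqrt(sxx), math.sqrt(syy), sxy / math.sqrt(sxx * syy)

def correct(forecasts, mx, my, slope):
    """Apply the linear correction y_hat = my + slope * (x - mx)."""
    return [my + slope * (f - mx) for f in forecasts]

def variance(values):
    """Population variance (divides by n, matching moments())."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

rng = random.Random(1)  # fixed seed so the synthetic example is reproducible
obs = [rng.gauss(0.0, 1.0) for _ in range(2000)]
# Hypothetical biased, noisy forecast of the observed quantity.
fcst = [1.0 + 0.8 * o + rng.gauss(0.0, 0.6) for o in obs]

mx, my, sx, sy, r = moments(fcst, obs)
ols_slope = r * sy / sx               # minimises the mean squared error
gm_slope = math.copysign(sy / sx, r)  # geometric-mean regression slope

ols_corrected = correct(fcst, mx, my, ols_slope)
gm_corrected = correct(fcst, mx, my, gm_slope)

print("observation variance  :", variance(obs))
print("OLS-corrected variance:", variance(ols_corrected))  # r**2 times obs variance
print("GM-corrected variance :", variance(gm_corrected))   # matches obs variance
```

In this single-predictor setting the OLS-corrected forecast is under-dispersed by a factor r², whereas the GM correction reproduces the observed variance; that is the trade-off between error correction and variability the abstract refers to.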
Linear regression metamodeling as a tool to summarize and present simulation model results.
Jalal, Hawre; Dowd, Bryan; Sainfort, François; Kuntz, Karen M
2013-10-01
Modelers lack a tool to systematically and clearly present complex model results, including those from sensitivity analyses. The objective was to propose linear regression metamodeling as a tool to increase transparency of decision analytic models and better communicate their results. We used a simplified cancer cure model to demonstrate our approach. The model computed the lifetime cost and benefit of 3 treatment options for cancer patients. We simulated 10,000 cohorts in a probabilistic sensitivity analysis (PSA) and regressed the model outcomes on the standardized input parameter values in a set of regression analyses. We used the regression coefficients to describe measures of sensitivity analyses, including threshold and parameter sensitivity analyses. We also compared the results of the PSA to deterministic full-factorial and one-factor-at-a-time designs. The regression intercept represented the estimated base-case outcome, and the other coefficients described the relative parameter uncertainty in the model. We defined simple relationships that compute the average and incremental net benefit of each intervention. Metamodeling produced outputs similar to traditional deterministic 1-way or 2-way sensitivity analyses but was more reliable since it used all parameter values. Linear regression metamodeling is a simple, yet powerful, tool that can assist modelers in communicating model characteristics and sensitivity analyses.
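The idea can be sketched on a toy problem. The decision model, parameter distributions, and all names below are hypothetical stand-ins (the abstract's cancer cure model is not specified here); the sketch only shows the mechanics of regressing simulated outcomes on standardized inputs.

```python
import random
import statistics

WTP = 50_000.0      # hypothetical willingness to pay per life-year
LIFE_YEARS = 10.0   # hypothetical life-years gained if cured

def net_benefit(p_cure, cost):
    """Toy decision model: monetary value of cure minus treatment cost."""
    return WTP * LIFE_YEARS * p_cure - cost

# Probabilistic sensitivity analysis: sample the uncertain inputs.
rng = random.Random(0)
n = 10_000
p_cure = [rng.gauss(0.40, 0.05) for _ in range(n)]
cost = [rng.gauss(20_000.0, 3_000.0) for _ in range(n)]
outcome = [net_benefit(p, c) for p, c in zip(p_cure, cost)]

def standardize(values):
    """Centre on the sample mean and scale by the sample standard deviation."""
    m, s = statistics.fmean(values), statistics.stdev(values)
    return [(v - m) / s for v in values]

z1, z2 = standardize(p_cure), standardize(cost)

# Standardized inputs have mean zero (up to rounding), so the OLS intercept
# is the mean outcome, i.e. the estimated base-case value.
intercept = statistics.fmean(outcome)
yc = [y - intercept for y in outcome]

# Solve the 2x2 normal equations for the two slopes with Cramer's rule.
s11 = sum(a * a for a in z1)
s22 = sum(b * b for b in z2)
s12 = sum(a * b for a, b in zip(z1, z2))
s1y = sum(a * y for a, y in zip(z1, yc))
s2y = sum(b * y for b, y in zip(z2, yc))
det = s11 * s22 - s12 * s12
b_p = (s1y * s22 - s2y * s12) / det
b_cost = (s2y * s11 - s1y * s12) / det

print("base-case net benefit:", intercept)
print("change per 1 SD of p_cure:", b_p)
print("change per 1 SD of cost  :", b_cost)
```

Because the inputs are standardized, each coefficient reads as the change in net benefit per one standard deviation of that parameter, so the magnitudes can be ranked directly as a summary of parameter sensitivity.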
Aptel, Florent; Sayous, Romain; Fortoul, Vincent; Beccat, Sylvain; Denis, Philippe
2010-12-01
To evaluate and compare the regional relationships between visual field sensitivity and retinal nerve fiber layer (RNFL) thickness as measured by spectral-domain optical coherence tomography (OCT) and scanning laser polarimetry. Prospective cross-sectional study. One hundred and twenty eyes of 120 patients (40 with healthy eyes, 40 with suspected glaucoma, and 40 with glaucoma) were tested on Cirrus-OCT, GDx VCC, and standard automated perimetry. Raw data on RNFL thickness were extracted for 256 peripapillary sectors of 1.40625 degrees each for the OCT measurement ellipse and 64 peripapillary sectors of 5.625 degrees each for the GDx VCC measurement ellipse. Correlations between peripapillary RNFL thickness in 6 sectors and visual field sensitivity in the 6 corresponding areas were evaluated using linear and logarithmic regression analysis. Receiver operating curve areas were calculated for each instrument. With spectral-domain OCT, the correlations (r²) between RNFL thickness and visual field sensitivity ranged from 0.082 (nasal RNFL and corresponding visual field area, linear regression) to 0.726 (supratemporal RNFL and corresponding visual field area, logarithmic regression). By comparison, with GDx-VCC, the correlations ranged from 0.062 (temporal RNFL and corresponding visual field area, linear regression) to 0.362 (supratemporal RNFL and corresponding visual field area, logarithmic regression). In pairwise comparisons, these structure-function correlations were generally stronger with spectral-domain OCT than with GDx VCC and with logarithmic regression than with linear regression. The largest areas under the receiver operating curve were seen for OCT superior thickness (0.963 ± 0.022; P < .001) in eyes with glaucoma and for OCT average thickness (0.888 ± 0.072; P < .001) in eyes with suspected glaucoma.
The structure-function relationship was significantly stronger with spectral-domain OCT than with scanning laser polarimetry, and was better expressed logarithmically than linearly. Measurements with these 2 instruments should not be considered to be interchangeable. Copyright © 2010 Elsevier Inc. All rights reserved.
NASA Technical Reports Server (NTRS)
Stano, Geoffrey T.; Fuelberg, Henry E.; Roeder, William P.
2010-01-01
This research addresses the 45th Weather Squadron's (45WS) need for improved guidance regarding lightning cessation at Cape Canaveral Air Force Station and Kennedy Space Center (KSC). KSC's Lightning Detection and Ranging (LDAR) network was the primary observational tool to investigate both cloud-to-ground and intracloud lightning. Five statistical and empirical schemes were created from LDAR, sounding, and radar parameters derived from 116 storms. Four of the five schemes were unsuitable for operational use since lightning advisories would be canceled prematurely, leading to safety risks to personnel. These include a correlation and regression tree analysis, three variants of multiple linear regression, event time trending, and the time delay between the greatest height of the maximum dBZ value to the last flash. These schemes failed to adequately forecast the maximum interval, the greatest time between any two flashes in the storm. The majority of storms had a maximum interval less than 10 min, which biased the schemes toward small values. Success was achieved with the percentile method (PM) by separating the maximum interval into percentiles for the 100 dependent storms.
Online EEG artifact removal for BCI applications by adaptive spatial filtering.
Guarnieri, Roberto; Marino, Marco; Barban, Federico; Ganzetti, Marco; Mantini, Dante
2018-06-28
The performance of brain computer interfaces (BCIs) based on electroencephalography (EEG) data strongly depends on the effective attenuation of artifacts that are mixed in the recordings. To address this problem, we have developed a novel online EEG artifact removal method for BCI applications, which combines blind source separation (BSS) and regression (REG) analysis. The BSS-REG method relies on the availability of a calibration dataset of limited duration for the initialization of a spatial filter using BSS. Online artifact removal is implemented by dynamically adjusting the spatial filter in the actual experiment, based on a linear regression technique. Our results showed that the BSS-REG method is capable of attenuating different kinds of artifacts, including ocular and muscular, while preserving true neural activity. Thanks to its low computational requirements, BSS-REG can be applied to low-density as well as high-density EEG data. We argue that BSS-REG may enable the development of novel BCI applications requiring high-density recordings, such as source-based neurofeedback and closed-loop neuromodulation. © 2018 IOP Publishing Ltd.
ERIC Educational Resources Information Center
Rule, David L.
Several regression methods were examined within the framework of weighted structural regression (WSR), comparing their regression weight stability and score estimation accuracy in the presence of outlier contamination. The methods compared are: (1) ordinary least squares; (2) WSR ridge regression; (3) minimum risk regression; (4) minimum risk 2;…
Unit Cohesion and the Surface Navy: Does Cohesion Affect Performance
1989-12-01
v. 68, 1968. Neter, J., Wasserman, W., and Kutner, M. H., Applied Linear Regression Models, 2d ed., Boston, MA: Irwin, 1989. Rand Corporation R-2607...Neter, J., Wasserman, W., and Kutner, M. H., Applied Linear Regression Models, 2d ed., Boston, MA: Irwin, 1989. SAS User’s Guide: Basics, Version 5 ed
1990-03-01
and M.H. Kutner. Applied Linear Regression Models. Homewood IL: Richard D. Irwin Inc., 1983. Pritsker, A. Alan B. Introduction to Simulation and SLAM...Control Variates in Simulation," European Journal of Operational Research, 42: (1989). Neter, J., W. Wasserman, and M.H. Kutner. Applied Linear Regression Models
ERIC Educational Resources Information Center
Yan, Jun; Aseltine, Robert H., Jr.; Harel, Ofer
2013-01-01
Comparing regression coefficients between models when one model is nested within another is of great practical interest when two explanations of a given phenomenon are specified as linear models. The statistical problem is whether the coefficients associated with a given set of covariates change significantly when other covariates are added into…
Calibrated Peer Review for Interpreting Linear Regression Parameters: Results from a Graduate Course
ERIC Educational Resources Information Center
Enders, Felicity B.; Jenkins, Sarah; Hoverman, Verna
2010-01-01
Biostatistics is traditionally a difficult subject for students to learn. While the mathematical aspects are challenging, it can also be demanding for students to learn the exact language to use to correctly interpret statistical results. In particular, correctly interpreting the parameters from linear regression is both a vital tool and a…
ERIC Educational Resources Information Center
Richter, Tobias
2006-01-01
Most reading time studies using naturalistic texts yield data sets characterized by a multilevel structure: Sentences (sentence level) are nested within persons (person level). In contrast to analysis of variance and multiple regression techniques, hierarchical linear models take the multilevel structure of reading time data into account. They…
Some Applied Research Concerns Using Multiple Linear Regression Analysis.
ERIC Educational Resources Information Center
Newman, Isadore; Fraas, John W.
The intention of this paper is to provide an overall reference on how a researcher can apply multiple linear regression in order to utilize the advantages that it has to offer. The advantages and some concerns expressed about the technique are examined. A number of practical ways by which researchers can deal with such concerns as…
ERIC Educational Resources Information Center
Nelson, Dean
2009-01-01
Following the Guidelines for Assessment and Instruction in Statistics Education (GAISE) recommendation to use real data, an example is presented in which simple linear regression is used to evaluate the effect of the Montreal Protocol on atmospheric concentration of chlorofluorocarbons. This simple set of data, obtained from a public archive, can…
Quantum State Tomography via Linear Regression Estimation
Qi, Bo; Hou, Zhibo; Li, Li; Dong, Daoyi; Xiang, Guoyong; Guo, Guangcan
2013-01-01
A simple yet efficient state reconstruction algorithm of linear regression estimation (LRE) is presented for quantum state tomography. In this method, quantum state reconstruction is converted into a parameter estimation problem of a linear regression model and the least-squares method is employed to estimate the unknown parameters. An asymptotic mean squared error (MSE) upper bound for all possible states to be estimated is given analytically, which depends explicitly upon the involved measurement bases. This analytical MSE upper bound can guide one to choose optimal measurement sets. The computational complexity of LRE is O(d^4), where d is the dimension of the quantum state. Numerical examples show that LRE is much faster than maximum-likelihood estimation for quantum state tomography. PMID:24336519
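For a single qubit the conversion to a linear regression problem can be sketched as follows. This is an illustrative special case under stated assumptions, not the authors' implementation: the state is written as ρ = (I + r·σ)/2, projective measurements are simulated along four hypothetical tetrahedral axes n_k, so that 2·P(+|n_k) − 1 = n_k·r is linear in the Bloch vector r, and no positivity constraint is enforced on the estimate.

```python
import math
import random

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def least_squares(X, y):
    """Ordinary least squares via the normal equations (X^T X) theta = X^T y."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yk for row, yk in zip(X, y)) for i in range(p)]
    return solve_linear(xtx, xty)

S = 1.0 / math.sqrt(3.0)
# Hypothetical measurement axes (vertices of a tetrahedron on the Bloch sphere).
DIRECTIONS = [(S, S, S), (S, -S, -S), (-S, S, -S), (-S, -S, S)]

def simulate_frequencies(bloch, shots, rng):
    """Simulate the relative frequency of the '+' outcome along each axis."""
    freqs = []
    for axis in DIRECTIONS:
        p_up = 0.5 * (1.0 + sum(a * b for a, b in zip(axis, bloch)))
        ups = sum(1 for _ in range(shots) if rng.random() < p_up)
        freqs.append(ups / shots)
    return freqs

def estimate_bloch(freqs):
    """Regress the responses 2f - 1 on the measurement axes."""
    y = [2.0 * f - 1.0 for f in freqs]
    return least_squares([list(axis) for axis in DIRECTIONS], y)

true_bloch = (0.3, -0.2, 0.5)
freqs = simulate_frequencies(true_bloch, shots=20_000, rng=random.Random(0))
estimate = estimate_bloch(freqs)
print("estimated Bloch vector:", estimate)
```

The estimate is obtained from one linear solve rather than an iterative likelihood maximisation, which is the source of the speed advantage the abstract describes.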
Morrell, Glen R; Ikizler, Talat A; Chen, Xiaorui; Heilbrun, Marta E; Wei, Guo; Boucher, Robert; Beddhu, Srinivasan
2016-07-01
We investigate whether psoas or paraspinous muscle area measured on a single L4-L5 image is a useful measure of whole lean body mass (LBM) compared to dedicated midthigh magnetic resonance imaging (MRI). Observational study. Outpatient dialysis units and a research clinic. One hundred five adult participants on maintenance hemodialysis. No control group was used. Psoas muscle area, paraspinous muscle area, and midthigh muscle area (MTMA) were measured by magnetic resonance imaging. LBM was measured by dual-energy absorptiometry scan. In separate multivariable linear regression models, psoas, paraspinous, and MTMA were associated with increase in LBM. In separate multivariate logistic regression models, C statistics for diagnosis of sarcopenia (defined as <25th percentile of LBM) were 0.69 for paraspinous muscle area, 0.81 for psoas muscle area, and 0.89 for MTMA. With sarcopenia defined as <10th percentile of LBM, the corresponding C statistics were 0.71, 0.92, and 0.94. We conclude that psoas muscle area provides a good measure of whole-body muscle mass, better than paraspinous muscle area but slightly inferior to midthigh measurement. Hence, in body composition studies a single axial MR image at the L4-L5 level can be used to provide information on both fat and muscle and may eliminate the need for time-consuming measurement of muscle area in the thigh. Copyright © 2016 National Kidney Foundation, Inc. Published by Elsevier Inc. All rights reserved.
Applications of statistics to medical science, III. Correlation and regression.
Watanabe, Hiroshi
2012-01-01
In this third part of a series surveying medical statistics, the concepts of correlation and regression are reviewed. In particular, methods of linear regression and logistic regression are discussed. Arguments related to survival analysis will be made in a subsequent paper.
Linear and nonlinear spectroscopy from quantum master equations.
Fetherolf, Jonathan H; Berkelbach, Timothy C
2017-12-28
We investigate the accuracy of the second-order time-convolutionless (TCL2) quantum master equation for the calculation of linear and nonlinear spectroscopies of multichromophore systems. We show that even for systems with non-adiabatic coupling, the TCL2 master equation predicts linear absorption spectra that are accurate over an extremely broad range of parameters and well beyond what would be expected based on the perturbative nature of the approach; non-equilibrium population dynamics calculated with TCL2 for identical parameters are significantly less accurate. For third-order (two-dimensional) spectroscopy, the importance of population dynamics and the violation of the so-called quantum regression theorem degrade the accuracy of TCL2 dynamics. To correct these failures, we combine the TCL2 approach with a classical ensemble sampling of slow microscopic bath degrees of freedom, leading to an efficient hybrid quantum-classical scheme that displays excellent accuracy over a wide range of parameters. In the spectroscopic setting, the success of such a hybrid scheme can be understood through its separate treatment of homogeneous and inhomogeneous broadening. Importantly, the presented approach has the computational scaling of TCL2, with the modest addition of an embarrassingly parallel prefactor associated with ensemble sampling. The presented approach can be understood as a generalized inhomogeneous cumulant expansion technique, capable of treating multilevel systems with non-adiabatic dynamics.
A phenomenological biological dose model for proton therapy based on linear energy transfer spectra.
Rørvik, Eivind; Thörnqvist, Sara; Stokkevåg, Camilla H; Dahle, Tordis J; Fjaera, Lars Fredrik; Ytre-Hauge, Kristian S
2017-06-01
The relative biological effectiveness (RBE) of protons varies with the radiation quality, quantified by the linear energy transfer (LET). Most phenomenological models employ a linear dependency of the dose-averaged LET (LETd) to calculate the biological dose. However, several experiments have indicated a possible non-linear trend. Our aim was to investigate if biological dose models including non-linear LET dependencies should be considered, by introducing a LET spectrum based dose model. The RBE-LET relationship was investigated by fitting of polynomials from 1st to 5th degree to a database of 85 data points from aerobic in vitro experiments. We included both unweighted and weighted regression, the latter taking into account experimental uncertainties. Statistical testing was performed to decide whether higher degree polynomials provided better fits to the data as compared to lower degrees. The newly developed models were compared to three published LETd-based models for a simulated spread out Bragg peak (SOBP) scenario. The statistical analysis of the weighted regression analysis favored a non-linear RBE-LET relationship, with the quartic polynomial found to best represent the experimental data (P = 0.010). The results of the unweighted regression analysis were on the borderline of statistical significance for non-linear functions (P = 0.053), and with the current database a linear dependency could not be rejected. For the SOBP scenario, the weighted non-linear model estimated a similar mean RBE value (1.14) compared to the three established models (1.13-1.17). The unweighted model calculated a considerably higher RBE value (1.22). The analysis indicated that non-linear models could give a better representation of the RBE-LET relationship. However, this is not decisive, as inclusion of the experimental uncertainties in the regression analysis had a significant impact on the determination and ranking of the models.
As differences between the models were observed for the SOBP scenario, both non-linear LET spectrum-based and linear LETd-based models should be further evaluated in clinically realistic scenarios. © 2017 American Association of Physicists in Medicine.
Regression of non-linear coupling of noise in LIGO detectors
NASA Astrophysics Data System (ADS)
Da Silva Costa, C. F.; Billman, C.; Effler, A.; Klimenko, S.; Cheng, H.-P.
2018-03-01
In 2015, after their upgrade, the advanced Laser Interferometer Gravitational-Wave Observatory (LIGO) detectors started acquiring data. The effort to improve their sensitivity has never stopped since then. The goal to achieve design sensitivity is challenging. Environmental and instrumental noise couple to the detector output with different, linear and non-linear, coupling mechanisms. The noise regression method we use is based on the Wiener–Kolmogorov filter, which uses witness channels to make noise predictions. We present here how this method helped to determine complex non-linear noise couplings in the output mode cleaner and in the mirror suspension system of the LIGO detector.
He, Kang-Hao; Zou, Xiao-Li; Liu, Xiang; Zeng, Hong-Yan
2012-01-01
A method using reversed phase high performance liquid chromatography (RP-HPLC) coupled with diode array detector (DAD) was developed for the simultaneous determination of canthaxanthin and astaxanthin in egg yolks. Samples were extracted with acetonitrile in ultrasonic bath for 20 minutes and then purified by freezing-lipid filtration and solid phase extraction (SPE). After being vaporized to dryness by nitrogen blowing and made up to volume with methanol, the extract solution was chromatographically separated in C18 column with a unitary mobile phase consisting of acetonitrile. The proposed method was validated in terms of linearity, precision, accuracy, and limit of detection (LOD). Regression analysis revealed a good linearity between peak area of each analyte and its concentration (r ≥ 0.998). The intra- and inter-day relative standard deviations (RSDs) were less than 3.6% and 5.2%, respectively. LODs of canthaxanthin and astaxanthin were 0.035 and 0.027 µg/mL (S/N = 3). The average recoveries of canthaxanthin and astaxanthin were 91.5% and 88.7%. The proposed method is simple, fast and easy to apply.
Structured chaos in a devil's staircase of the Josephson junction.
Shukrinov, Yu M; Botha, A E; Medvedeva, S Yu; Kolahchi, M R; Irie, A
2014-09-01
The phase dynamics of Josephson junctions (JJs) under external electromagnetic radiation is studied through numerical simulations. Current-voltage characteristics, Lyapunov exponents, and Poincaré sections are analyzed in detail. It is found that the subharmonic Shapiro steps at certain parameters are separated by structured chaotic windows. By performing a linear regression on the linear part of the data, a fractal dimension of D = 0.868 is obtained, with an uncertainty of ±0.012. The chaotic regions exhibit scaling similarity, and it is shown that the devil's staircase of the system can form a backbone that unifies and explains the highly correlated and structured chaotic behavior. These features suggest a system possessing multiple complete devil's staircases. The onset of chaos for subharmonic steps occurs through the Feigenbaum period doubling scenario. Universality in the sequence of periodic windows is also demonstrated. Finally, the influence of the radiation and JJ parameters on the structured chaos is investigated, and it is concluded that the structured chaos is a stable formation over a wide range of parameter values.
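The abstract does not state which quantities were regressed to obtain D, so the sketch below only illustrates the general technique of reading a fractal dimension off the slope of a log-log line. As a stand-in, it box-counts the middle-third Cantor set (whose dimension, ln 2 / ln 3, is known) rather than the Josephson-junction staircase; all names are hypothetical.

```python
import math
from itertools import product

def cantor_points(depth):
    """Left endpoints of the depth-level Cantor intervals, as integers in [0, 3**depth)."""
    return [sum(d * 3 ** (depth - 1 - i) for i, d in enumerate(digits))
            for digits in product((0, 2), repeat=depth)]

def box_count(points, depth, level):
    """Count boxes of width 3**-level (on the unit interval) that contain a point."""
    box = 3 ** (depth - level)  # box width in integer units
    return len({p // box for p in points})

def fit_line(x, y):
    """Simple linear regression; returns slope, intercept, and slope standard error."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    stderr = math.sqrt(rss / (n - 2) / sxx)
    return slope, intercept, stderr

DEPTH = 8
points = cantor_points(DEPTH)
levels = range(1, DEPTH + 1)
log_inv_eps = [k * math.log(3.0) for k in levels]                      # log(1 / box size)
log_count = [math.log(box_count(points, DEPTH, k)) for k in levels]   # log N(box size)

# The slope of log N against log(1 / box size) estimates the dimension.
dimension, intercept, stderr = fit_line(log_inv_eps, log_count)
print(f"estimated dimension: {dimension:.4f} (slope standard error {stderr:.1e})")
```

For measured or simulated data the points fall on a line only over a limited scaling range, which is why the abstract restricts the fit to "the linear part of the data" and reports the slope together with its uncertainty.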
Transient Modeling of Hybrid Rocket Low Frequency Instabilities
NASA Technical Reports Server (NTRS)
Karabeyoglu, M. Arif; DeZilwa, Shane; Cantwell, Brian; Zilliac, Greg
2003-01-01
A comprehensive dynamic model of a hybrid rocket has been developed in order to understand and predict the transient behavior including instabilities. A linearized version of the transient model predicted the low-frequency chamber pressure oscillations that are commonly observed in hybrids. The source of the instabilities is based on a complex coupling of thermal transients in the solid fuel, wall heat transfer blocking due to fuel regression rate and the transients in the boundary layer that forms on the fuel surface. The oscillation frequencies predicted by the linearized theory are in very good agreement with 43 motor test results obtained from the hybrid propulsion literature. The motor test results used in the comparison cover a very wide spectrum of parameters including: 1) four separate research and development programs, 2) three different oxidizers (LOX, GOX, N2O), 3) a wide range of motor dimensions (i.e. from 5 inch diameter to 72 inch diameter) and operating conditions and 4) several fuel formulations. A simple universal scaling formula for the frequency of the primary oscillation mode is suggested.
Using Bar Velocity to Predict the Maximum Dynamic Strength in the Half-Squat Exercise.
Loturco, Irineu; Pereira, Lucas A; Cal Abad, Cesar C; Gil, Saulo; Kitamura, Katia; Kobal, Ronaldo; Nakamura, Fábio Y
2016-07-01
To determine whether athletes from different sport disciplines present similar mean propulsive velocity (MPV) in the half-squat (HS) during submaximal and maximal tests, enabling prediction of 1-repetition maximum (1-RM) from MPV at any given submaximal load. Sixty-four male athletes, comprising American football, rugby, and soccer players; sprinters and jumpers; and combat-sport strikers attended 2 testing sessions separated by 2-4 wk. On the first visit, a standardized 1-RM test was performed. On the second, athletes performed HSs on Smith-machine equipment, using relative percentages of 1-RM to determine the respective MPV of submaximal and maximal loads. Linear regression established the relationship between MPV and percentage of 1-RM. A very strong linear relationship (R² ≈ 0.96) was observed between the MPV and the percentages of HS 1-RM, resulting in the following equation: %HS 1-RM = -105.05 × MPV + 131.75. The MPV at HS 1-RM was ~0.3 m/s. This equation can be used to predict HS 1-RM on a Smith machine with a high degree of accuracy.
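The reported equation can be applied as in the sketch below. The coefficients are those quoted in the abstract; the function names and the example load are hypothetical, and dividing the lifted load by the predicted fraction of 1-RM is an assumed (though common) way of turning the relative load into an absolute estimate.

```python
def percent_1rm_from_mpv(mpv_m_per_s: float) -> float:
    """Relative load (% of half-squat 1-RM) from mean propulsive velocity in m/s."""
    return -105.05 * mpv_m_per_s + 131.75

def estimate_1rm(load_kg: float, mpv_m_per_s: float) -> float:
    """Estimate the 1-RM (kg) from one submaximal set: load divided by its %1-RM."""
    percent = percent_1rm_from_mpv(mpv_m_per_s)
    if percent <= 0.0:
        raise ValueError("velocity is outside the range covered by the equation")
    return load_kg / (percent / 100.0)

# Consistency check with the abstract: an MPV of about 0.3 m/s should map to ~100 %.
print(round(percent_1rm_from_mpv(0.30), 1))

# Hypothetical athlete moving 100 kg at 0.60 m/s.
print(round(estimate_1rm(100.0, 0.60), 1))
```

The equation was derived on a Smith machine, so estimates for free-weight half-squats would be an extrapolation.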
Modeling demand for public transit services in rural areas
DOE Office of Scientific and Technical Information (OSTI.GOV)
Attaluri, P.; Seneviratne, P.N.; Javid, M.
1997-05-01
Accurate estimates of demand are critical for planning, designing, and operating public transit systems. Previous research has demonstrated that the expected demand in rural areas is a function of both demographic and transit system variables. Numerous models have been proposed to describe the relationship between the aforementioned variables. However, most of them are site specific and their validity over time and space is not reported or perhaps has not been tested. Moreover, input variables in some cases are extremely difficult to quantify. In this article, the estimation of demand using the generalized linear modeling technique is discussed. Two separate models, one for fixed-route and another for demand-responsive services, are presented. These models, calibrated with data from systems in nine different states, are used to demonstrate the appropriateness and validity of generalized linear models compared to the regression models. They explain over 70% of the variation in expected demand for fixed-route services and 60% of the variation in expected demand for demand-responsive services. It was found that the models are spatially transferable and that data for calibration are easily obtainable.
Structured chaos in a devil's staircase of the Josephson junction
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shukrinov, Yu. M.; Botha, A. E., E-mail: bothaae@unisa.ac.za; Medvedeva, S. Yu.
2014-09-01
The phase dynamics of Josephson junctions (JJs) under external electromagnetic radiation is studied through numerical simulations. Current-voltage characteristics, Lyapunov exponents, and Poincaré sections are analyzed in detail. It is found that the subharmonic Shapiro steps at certain parameters are separated by structured chaotic windows. By performing a linear regression on the linear part of the data, a fractal dimension of D = 0.868 is obtained, with an uncertainty of ±0.012. The chaotic regions exhibit scaling similarity, and it is shown that the devil's staircase of the system can form a backbone that unifies and explains the highly correlated and structured chaotic behavior. These features suggest a system possessing multiple complete devil's staircases. The onset of chaos for subharmonic steps occurs through the Feigenbaum period doubling scenario. Universality in the sequence of periodic windows is also demonstrated. Finally, the influence of the radiation and JJ parameters on the structured chaos is investigated, and it is concluded that the structured chaos is a stable formation over a wide range of parameter values.
Goodarzi, Mohammad; Jensen, Richard; Vander Heyden, Yvan
2012-12-01
A Quantitative Structure-Retention Relationship (QSRR) is proposed to estimate the chromatographic retention of 83 diverse drugs on a Unisphere poly butadiene (PBD) column, using isocratic elutions at pH 11.7. Previous work has generated QSRR models for them using Classification And Regression Trees (CART). In this work, Ant Colony Optimization is used as a feature selection method to find the best molecular descriptors from a large pool. In addition, several other selection methods have been applied, such as Genetic Algorithms, Stepwise Regression and the Relief method, not only to evaluate Ant Colony Optimization as a feature selection method but also to investigate its ability to find the important descriptors in QSRR. Multiple Linear Regression (MLR) and Support Vector Machines (SVMs) were applied as linear and nonlinear regression methods, respectively, giving excellent correlation between the experimental, i.e. extrapolated to a mobile phase consisting of pure water, and predicted logarithms of the retention factors of the drugs (logk(w)). The overall best model was the SVM one built using descriptors selected by ACO. Copyright © 2012 Elsevier B.V. All rights reserved.
Pandya, Jui J; Sanyal, Mallika; Shrivastav, Pranav S
2017-09-01
A new, simple, accurate and precise high-performance thin-layer chromatographic method has been developed and validated for simultaneous determination of an anthelmintic drug, albendazole, and its active metabolite, albendazole sulfoxide. Planar chromatographic separation was performed on an aluminum-backed layer of silica gel 60G F254 using a mixture of toluene-acetonitrile-glacial acetic acid (7.0:2.9:0.1, v/v/v) as the mobile phase. For quantitation, the separated spots were scanned densitometrically at 225 nm. The retention factors (Rf) obtained under the established conditions were 0.76 ± 0.01 and 0.50 ± 0.01 and the regression plots were linear (r2 ≥ 0.9997) in the concentration ranges 50-350 and 100-700 ng/band for albendazole and albendazole sulfoxide, respectively. The method was validated for linearity, specificity, accuracy (recovery) and precision, repeatability, stability and robustness. The limit of detection and limit of quantitation found were 9.84 and 29.81 ng/band for albendazole and 21.60 and 65.45 ng/band for albendazole sulfoxide, respectively. For plasma samples, solid-phase extraction of analytes yielded mean extraction recoveries of 87.59 and 87.13% for albendazole and albendazole sulfoxide, respectively. The method was successfully applied for the analysis of albendazole in pharmaceutical formulations with accuracy ≥99.32%. Copyright © 2017 John Wiley & Sons, Ltd.
Evaluating Differential Effects Using Regression Interactions and Regression Mixture Models
ERIC Educational Resources Information Center
Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung
2015-01-01
Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their…
SEMIPARAMETRIC QUANTILE REGRESSION WITH HIGH-DIMENSIONAL COVARIATES
Zhu, Liping; Huang, Mian; Li, Runze
2012-01-01
This paper is concerned with quantile regression for a semiparametric regression model, in which both the conditional mean and conditional variance function of the response given the covariates admit a single-index structure. This semiparametric regression model enables us to reduce the dimension of the covariates and simultaneously retains the flexibility of nonparametric regression. Under mild conditions, we show that the simple linear quantile regression offers a consistent estimate of the index parameter vector. This is a surprising and interesting result because the single-index model is possibly misspecified under the linear quantile regression. With a root-n consistent estimate of the index vector, one may employ a local polynomial regression technique to estimate the conditional quantile function. This procedure is computationally efficient, which is very appealing in high-dimensional data analysis. We show that the resulting estimator of the quantile function performs asymptotically as efficiently as if the true value of the index vector were known. The methodologies are demonstrated through comprehensive simulation studies and an application to a real dataset. PMID:24501536
Multi-model ensemble combinations of the water budget in the East/Japan Sea
NASA Astrophysics Data System (ADS)
HAN, S.; Hirose, N.; Usui, N.; Miyazawa, Y.
2016-02-01
The water balance of East/Japan Sea is determined mainly by inflow and outflow through the Korea/Tsushima, Tsugaru and Soya/La Perouse Straits. However, the volume transports measured at three straits remain quantitatively unbalanced. This study examined the seasonal variation of the volume transport using the multiple linear regression and ridge regression of multi-model ensemble (MME) methods to estimate physically consistent circulation in East/Japan Sea by using four different data assimilation models. The MME outperformed all of the single models by reducing uncertainties, especially the multicollinearity problem with the ridge regression. However, the regression constants turned out to be inconsistent with each other if the MME was applied separately for each strait. The MME for a connected system was thus performed to find common constants for these straits. The estimation of this MME was found to be similar to the MME result of sea level difference (SLD). The estimated mean transport (2.42 Sv) was smaller than the measurement data at the Korea/Tsushima Strait, but the calibrated transport of the Tsugaru Strait (1.63 Sv) was larger than the observed data. The MME results of transport and SLD also suggested that the standard deviation (STD) of the Korea/Tsushima Strait is larger than the STD of the observation, whereas the estimated results were almost identical to that observed for the Tsugaru and Soya/La Perouse Straits. The similarity between MME results enhances the reliability of the present MME estimation.
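A minimal sketch of why ridge regression helps when ensemble members are strongly collinear, as the abstract describes. The data are synthetic, the four "models" are invented stand-ins for the data assimilation products, and the closed-form estimator without an intercept is an assumption of this example rather than the authors' procedure:

```python
import numpy as np


def ridge_weights(X, y, lam):
    """Closed-form ridge solution: (X'X + lam*I)^(-1) X'y.
    With lam = 0 this reduces to ordinary least squares."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)


rng = np.random.default_rng(0)
truth = rng.normal(size=200)  # synthetic "observed" transport anomaly

# Four synthetic model outputs: each is the truth plus small independent
# noise, so the columns of X are highly correlated (multicollinearity).
X = np.column_stack([truth + 0.05 * rng.normal(size=200) for _ in range(4)])
y = truth + 0.05 * rng.normal(size=200)

w_ols = ridge_weights(X, y, lam=0.0)
w_ridge = ridge_weights(X, y, lam=1.0)

# The penalty shrinks the weight vector, which stabilises the combination.
print(np.linalg.norm(w_ols), np.linalg.norm(w_ridge))
```

The penalty value `lam=1.0` is arbitrary here; in practice it would be chosen by cross-validation or a similar criterion.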
Multi-model ensemble estimation of volume transport through the straits of the East/Japan Sea
NASA Astrophysics Data System (ADS)
Han, Sooyeon; Hirose, Naoki; Usui, Norihisa; Miyazawa, Yasumasa
2016-01-01
The volume transports measured at the Korea/Tsushima, Tsugaru, and Soya/La Perouse Straits remain quantitatively inconsistent. However, data assimilation models at least provide a self-consistent budget despite subtle differences among the models. This study examined the seasonal variation of the volume transport using the multiple linear regression and ridge regression of multi-model ensemble (MME) methods to estimate more accurately transport at these straits by using four different data assimilation models. The MME outperformed all of the single models by reducing uncertainties, especially the multicollinearity problem with the ridge regression. However, the regression constants turned out to be inconsistent with each other if the MME was applied separately for each strait. The MME for a connected system was thus performed to find common constants for these straits. The estimation of this MME was found to be similar to the MME result of sea level difference (SLD). The estimated mean transport (2.43 Sv) was smaller than the measurement data at the Korea/Tsushima Strait, but the calibrated transport of the Tsugaru Strait (1.63 Sv) was larger than the observed data. The MME results of transport and SLD also suggested that the standard deviation (STD) of the Korea/Tsushima Strait is larger than the STD of the observation, whereas the estimated results were almost identical to that observed for the Tsugaru and Soya/La Perouse Straits. The similarity between MME results enhances the reliability of the present MME estimation.
Anderson, Emma L; Tilling, Kate; Fraser, Abigail; Macdonald-Wallis, Corrie; Emmett, Pauline; Cribb, Victoria; Northstone, Kate; Lawlor, Debbie A; Howe, Laura D
2013-07-01
Methods for the assessment of changes in dietary intake across the life course are underdeveloped. We demonstrate the use of linear-spline multilevel models to summarize energy-intake trajectories through childhood and adolescence and their application as exposures, outcomes, or mediators. The Avon Longitudinal Study of Parents and Children assessed children's dietary intake several times between ages 3 and 13 years, using both food frequency questionnaires (FFQs) and 3-day food diaries. We estimated energy-intake trajectories for 12,032 children using linear-spline multilevel models. We then assessed the associations of these trajectories with maternal body mass index (BMI), and later offspring BMI, and also their role in mediating the relation between maternal and offspring BMIs. Models estimated average and individual energy intake at 3 years, and linear changes in energy intake from age 3 to 7 years and from age 7 to 13 years. By including the exposure (in this example, maternal BMI) in the multilevel model, we were able to estimate the average energy-intake trajectories across levels of the exposure. When energy-intake trajectories are the exposure for a later outcome (in this case offspring BMI) or a mediator (between maternal and offspring BMI), results were similar, whether using a two-step process (exporting individual-level intercepts and slopes from multilevel models and using these in linear regression/path analysis), or a single-step process (multivariate multilevel models). Trajectories were similar when FFQs and food diaries were assessed either separately, or when combined into one model. Linear-spline multilevel models provide useful summaries of trajectories of dietary intake that can be used as an exposure, outcome, or mediator.
Prediction of siRNA potency using sparse logistic regression.
Hu, Wei; Hu, John
2014-06-01
RNA interference (RNAi) can modulate gene expression at post-transcriptional as well as transcriptional levels. Short interfering RNA (siRNA) serves as a trigger for the RNAi gene inhibition mechanism, and therefore is a crucial intermediate step in RNAi. There have been extensive studies to identify the sequence characteristics of potent siRNAs. One such study built a linear model using LASSO (Least Absolute Shrinkage and Selection Operator) to measure the contribution of each siRNA sequence feature. This model is simple and interpretable, but it requires a large number of nonzero weights. We have introduced a novel technique, sparse logistic regression, to build a linear model using single-position specific nucleotide compositions which has the same prediction accuracy as the linear model based on LASSO. The weights in our new model share the same general trend as those in the previous model, but have only 25 nonzero weights out of a total of 84 weights, a 54% reduction compared to the previous model. Contrary to the linear model based on LASSO, our model suggests that only a few positions are influential on the efficacy of the siRNA, which are the 5' and 3' ends and the seed region of siRNA sequences. We also employed sparse logistic regression to build a linear model using dual-position specific nucleotide compositions, a task LASSO is not able to accomplish well due to its high dimensional nature. Our results demonstrate the superiority of sparse logistic regression as a technique for both feature selection and regression over LASSO in the context of siRNA design.
NASA Astrophysics Data System (ADS)
Validi, AbdoulAhad
2014-03-01
This study introduces a non-intrusive approach in the context of low-rank separated representation to construct a surrogate of high-dimensional stochastic functions, e.g., PDEs/ODEs, in order to decrease the computational cost of Markov Chain Monte Carlo simulations in Bayesian inference. The surrogate model is constructed via a regularized alternative least-square regression with Tikhonov regularization using a roughening matrix computing the gradient of the solution, in conjunction with a perturbation-based error indicator to detect optimal model complexities. The model approximates a vector of a continuous solution at discrete values of a physical variable. The required number of random realizations to achieve a successful approximation linearly depends on the function dimensionality. The computational cost of the model construction is quadratic in the number of random inputs, which potentially tackles the curse of dimensionality in high-dimensional stochastic functions. Furthermore, this vector-valued separated representation-based model, in comparison to the available scalar-valued case, leads to a significant reduction in the cost of approximation by an order of magnitude equal to the vector size. The performance of the method is studied through its application to three numerical examples including a 41-dimensional elliptic PDE and a 21-dimensional cavity flow.
Tarasova, Irina A; Goloborodko, Anton A; Perlova, Tatyana Y; Pridatchenko, Marina L; Gorshkov, Alexander V; Evreinov, Victor V; Ivanov, Alexander R; Gorshkov, Mikhail V
2015-07-07
The theory of critical chromatography for biomacromolecules (BioLCCC) describes polypeptide retention in reversed-phase HPLC using the basic principles of statistical thermodynamics. However, whether this theory correctly depicts a variety of empirical observations and laws introduced for peptide chromatography over the last decades remains to be determined. In this study, by comparing theoretical results with experimental data, we demonstrate that the BioLCCC: (1) fits the empirical dependence of the polypeptide retention on the amino acid sequence length with R(2) > 0.99 and allows in silico determination of the linear regression coefficients of the log-length correction in the additive model for arbitrary sequences and lengths and (2) predicts the distribution coefficients of polypeptides with an accuracy from 0.98 to 0.99 R(2). The latter enables direct calculation of the retention factors for given solvent compositions and modeling of the migration dynamics of polypeptides separated under isocratic or gradient conditions. The obtained results demonstrate that the suggested theory correctly relates the main aspects of polypeptide separation in reversed-phase HPLC.
Meng, Jiang; Leung, Kelvin Sze-Yin; Dong, Xiao-Ping; Zhou, Yi-Sheng; Jiang, Zhi-Hong; Zhao, Zhong-Zhen
2009-12-01
An on-line high performance liquid chromatography (HPLC)-diode array detector (DAD)-electrospray ionization mass spectrometry (ESI-MS) method has been developed to quantify simultaneously eight bioactive chemical components in Houttuynia cordata Thunb and related Saururaceae medicinal plants. Simultaneous separation of these eight compounds was achieved on a C(18) analytical column with gradient elution of acetonitrile and 0.2% acetic acid (v/v) at a flow rate of 0.6 mL/min, with detection at 280 nm. These eight compounds were completely separated within 90 min. A good linear regression relationship (r(2)>0.9978) within test ranges was shown in all calibration curves. Good repeatability for the quantification of these eight compounds in H. cordata was also demonstrated in this method, with intra- and inter-day variations less than 3.0%. The method established was successfully applied to quantify eight bioactive compounds in closely related species of H. cordata, which provides a new basis for quality assessment of H. cordata.
Lotfy, Hayam Mahmoud; Hegazy, Maha A; Rezk, Mamdouh R; Omran, Yasmin Rostom
2014-05-21
Two smart and novel spectrophotometric methods namely; absorbance subtraction (AS) and amplitude modulation (AM) were developed and validated for the determination of a binary mixture of timolol maleate (TIM) and dorzolamide hydrochloride (DOR) in presence of benzalkonium chloride without prior separation, using unified regression equation. Additionally, simple, specific, accurate and precise spectrophotometric methods manipulating ratio spectra were developed and validated for simultaneous determination of the binary mixture namely; simultaneous ratio subtraction (SRS), ratio difference (RD), ratio subtraction (RS) coupled with extended ratio subtraction (EXRS), constant multiplication method (CM) and mean centering of ratio spectra (MCR). The proposed spectrophotometric procedures do not require any separation steps. Accuracy, precision and linearity ranges of the proposed methods were determined and the specificity was assessed by analyzing synthetic mixtures of both drugs. They were applied to their pharmaceutical formulation and the results obtained were statistically compared to that of a reported spectrophotometric method. The statistical comparison showed that there is no significant difference between the proposed methods and the reported one regarding both accuracy and precision. Copyright © 2014 Elsevier B.V. All rights reserved.
Kinetic rate constant prediction supports the conformational selection mechanism of protein binding.
Moal, Iain H; Bates, Paul A
2012-01-01
The prediction of protein-protein kinetic rate constants provides a fundamental test of our understanding of molecular recognition, and will play an important role in the modeling of complex biological systems. In this paper, a feature selection and regression algorithm is applied to mine a large set of molecular descriptors and construct simple models for association and dissociation rate constants using empirical data. Using separate test data for validation, the predicted rate constants can be combined to calculate binding affinity with accuracy matching that of state of the art empirical free energy functions. The models show that the rate of association is linearly related to the proportion of unbound proteins in the bound conformational ensemble relative to the unbound conformational ensemble, indicating that the binding partners must adopt a geometry near to that of the bound prior to binding. Mirroring the conformational selection and population shift mechanism of protein binding, the models provide a strong separate line of evidence for the preponderance of this mechanism in protein-protein binding, complementing structural and theoretical studies.
Predictive and mechanistic multivariate linear regression models for reaction development
Santiago, Celine B.; Guo, Jing-Yao
2018-01-01
Multivariate Linear Regression (MLR) models utilizing computationally-derived and empirically-derived physical organic molecular descriptors are described in this review. Several reports demonstrating the effectiveness of this methodological approach towards reaction optimization and mechanistic interrogation are discussed. A detailed protocol to access quantitative and predictive MLR models is provided as a guide for model development and parameter analysis. PMID:29719711
Adding a Parameter Increases the Variance of an Estimated Regression Function
ERIC Educational Resources Information Center
Withers, Christopher S.; Nadarajah, Saralees
2011-01-01
The linear regression model is one of the most popular models in statistics. It is also one of the simplest models in statistics. It has received applications in almost every area of science, engineering and medicine. In this article, the authors show that adding a predictor to a linear model increases the variance of the estimated regression…
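A small numerical illustration of the claim, under the assumption of ordinary least squares with homoscedastic errors, where the variance of the fitted value at a point x0 is σ² · x0ᵀ(XᵀX)⁻¹x0. The data and the evaluation point are synthetic and chosen only for illustration:

```python
import numpy as np


def prediction_variance_factor(X, x0):
    """Return x0' (X'X)^(-1) x0, i.e. Var(y_hat(x0)) / sigma^2 under OLS."""
    return float(x0 @ np.linalg.solve(X.T @ X, x0))


rng = np.random.default_rng(1)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)

# Nested designs: the larger one adds a single predictor (x2).
X_small = np.column_stack([np.ones(n), x1])
X_big = np.column_stack([np.ones(n), x1, x2])

# Evaluate both models at the same point; the larger model also needs
# a value for the added predictor.
x0_small = np.array([1.0, 0.5])
x0_big = np.array([1.0, 0.5, -0.3])

v_small = prediction_variance_factor(X_small, x0_small)
v_big = prediction_variance_factor(X_big, x0_big)
print(v_small, v_big)
```

The variance factor for the larger design is never smaller than for the nested one, which follows from the block-inverse (Schur complement) form of (XᵀX)⁻¹.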
Using nonlinear quantile regression to estimate the self-thinning boundary curve
Quang V. Cao; Thomas J. Dean
2015-01-01
The relationship between tree size (quadratic mean diameter) and tree density (number of trees per unit area) has been a topic of research and discussion for many decades. Starting with Reineke in 1933, the maximum size-density relationship, on a log-log scale, has been assumed to be linear. Several techniques, including linear quantile regression, have been employed...
Simultaneous spectrophotometric determination of salbutamol and bromhexine in tablets.
Habib, I H I; Hassouna, M E M; Zaki, G A
2005-03-01
Typical anti-mucolytic drugs called salbutamol hydrochloride and bromhexine sulfate encountered in tablets were determined simultaneously either by using linear regression at zero-crossing wavelengths of the first derivation of UV-spectra or by application of multiple linear partial least squares regression method. The results obtained by the two proposed mathematical methods were compared with those obtained by the HPLC technique.
Laurens, L M L; Wolfrum, E J
2013-12-18
One of the challenges associated with microalgal biomass characterization and the comparison of microalgal strains and conversion processes is the rapid determination of the composition of algae. We have developed and applied a high-throughput screening technology based on near-infrared (NIR) spectroscopy for the rapid and accurate determination of algal biomass composition. We show that NIR spectroscopy can accurately predict the full composition using multivariate linear regression analysis of varying lipid, protein, and carbohydrate content of algal biomass samples from three strains. We also demonstrate a high quality of predictions of an independent validation set. A high-throughput 96-well configuration for spectroscopy gives equally good prediction relative to a ring-cup configuration, and thus, spectra can be obtained from as little as 10-20 mg of material. We found that lipids exhibit a dominant, distinct, and unique fingerprint in the NIR spectrum that allows for the use of single and multiple linear regression of respective wavelengths for the prediction of the biomass lipid content. This is not the case for carbohydrate and protein content, and thus, the use of multivariate statistical modeling approaches remains necessary.
Standards for Standardized Logistic Regression Coefficients
ERIC Educational Resources Information Center
Menard, Scott
2011-01-01
Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…
Image interpolation via regularized local linear regression.
Liu, Xianming; Zhao, Debin; Xiong, Ruiqin; Ma, Siwei; Gao, Wen; Sun, Huifang
2011-12-01
The linear regression model is a very attractive tool to design effective image interpolation schemes. Some regression-based image interpolation algorithms have been proposed in the literature, in which the objective functions are optimized by ordinary least squares (OLS). However, it is shown that interpolation with OLS may have some undesirable properties from a robustness point of view: even small amounts of outliers can dramatically affect the estimates. To address these issues, in this paper we propose a novel image interpolation algorithm based on regularized local linear regression (RLLR). Starting with the linear regression model where we replace the OLS error norm with the moving least squares (MLS) error norm leads to a robust estimator of local image structure. To keep the solution stable and avoid overfitting, we incorporate the l(2)-norm as the estimator complexity penalty. Moreover, motivated by recent progress on manifold-based semi-supervised learning, we explicitly consider the intrinsic manifold structure by making use of both measured and unmeasured data points. Specifically, our framework incorporates the geometric structure of the marginal probability distribution induced by unmeasured samples as an additional local smoothness preserving constraint. The optimal model parameters can be obtained with a closed-form solution by solving a convex optimization problem. Experimental results on benchmark test images demonstrate that the proposed method achieves very competitive performance with the state-of-the-art interpolation algorithms, especially in image edge structure preservation. © 2011 IEEE
Flexible cue combination in the guidance of attention in visual search
Brand, John; Oriet, Chris; Johnson, Aaron P.; Wolfe, Jeremy M.
2014-01-01
Hodsoll and Humphreys (2001) have assessed the relative contributions of stimulus-driven and user-driven knowledge on linearly- and nonlinearly separable search. However, the target feature used to determine linear separability in their task (i.e., target size) was required to locate the target. In the present work, we investigated the contributions of stimulus-driven and user-driven knowledge when a linearly- or nonlinearly-separable feature is available but not required for target identification. We asked observers to complete a series of standard color X orientation conjunction searches in which target size was either linearly- or nonlinearly separable from the size of the distractors. When guidance by color X orientation and by size information are both available, observers rely on whichever information results in the best search efficiency. This is the case irrespective of whether we provide target foreknowledge by blocking stimulus conditions, suggesting that feature information is used in both a stimulus-driven and user-driven fashion. PMID:25463553
Miguel-Hurtado, Oscar; Guest, Richard; Stevenage, Sarah V; Neil, Greg J; Black, Sue
2016-01-01
Understanding the relationship between physiological measurements from human subjects and their demographic data is important within both the biometric and forensic domains. In this paper we explore the relationship between measurements of the human hand and a range of demographic features. We assess the ability of linear regression and machine learning classifiers to predict demographics from hand features, thereby providing evidence on both the strength of relationship and the key features underpinning this relationship. Our results show that we are able to predict sex, height, weight and foot size accurately within various data-range bin sizes, with machine learning classification algorithms out-performing linear regression in most situations. In addition, we identify the features used to provide these relationships applicable across multiple applications. PMID:27806075
Kumar, K Vasanth; Porkodi, K; Rocha, F
2008-01-15
A comparison of linear and non-linear regression method in selecting the optimum isotherm was made to the experimental equilibrium data of basic red 9 sorption by activated carbon. The r(2) was used to select the best fit linear theoretical isotherm. In the case of non-linear regression method, six error functions namely coefficient of determination (r(2)), hybrid fractional error function (HYBRID), Marquardt's percent standard deviation (MPSD), the average relative error (ARE), sum of the errors squared (ERRSQ) and sum of the absolute errors (EABS) were used to predict the parameters involved in the two and three parameter isotherms and also to predict the optimum isotherm. Non-linear regression was found to be a better way to obtain the parameters involved in the isotherms and also the optimum isotherm. For two parameter isotherm, MPSD was found to be the best error function in minimizing the error distribution between the experimental equilibrium data and predicted isotherms. In the case of three parameter isotherm, r(2) was found to be the best error function to minimize the error distribution structure between experimental equilibrium data and theoretical isotherms. The present study showed that the size of the error function alone is not a deciding factor to choose the optimum isotherm. In addition to the size of error function, the theory behind the predicted isotherm should be verified with the help of experimental data while selecting the optimum isotherm. A coefficient of non-determination, K(2) was explained and was found to be very useful in identifying the best error function while selecting the optimum isotherm.
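The abstract names the error functions but does not give their formulas. The sketch below uses the definitions commonly quoted in the sorption literature, which is an assumption here; `q_meas` and `q_calc` are hypothetical names for measured and model-predicted equilibrium uptake, and `n_params` is the number of isotherm parameters:

```python
import math


def errsq(q_meas, q_calc):
    """Sum of the errors squared."""
    return sum((c - m) ** 2 for m, c in zip(q_meas, q_calc))


def eabs(q_meas, q_calc):
    """Sum of the absolute errors."""
    return sum(abs(c - m) for m, c in zip(q_meas, q_calc))


def are(q_meas, q_calc):
    """Average relative error, in percent."""
    n = len(q_meas)
    return 100.0 / n * sum(abs((c - m) / m) for m, c in zip(q_meas, q_calc))


def hybrid(q_meas, q_calc, n_params):
    """Hybrid fractional error function, in percent."""
    n = len(q_meas)
    return 100.0 / (n - n_params) * sum(
        (m - c) ** 2 / m for m, c in zip(q_meas, q_calc)
    )


def mpsd(q_meas, q_calc, n_params):
    """Marquardt's percent standard deviation."""
    n = len(q_meas)
    s = sum(((m - c) / m) ** 2 for m, c in zip(q_meas, q_calc))
    return 100.0 * math.sqrt(s / (n - n_params))


# Toy values (not from the study): one point is off by 1 unit.
q_meas = [1.0, 2.0, 4.0]
q_calc = [2.0, 2.0, 4.0]
print(errsq(q_meas, q_calc), eabs(q_meas, q_calc), are(q_meas, q_calc))
```

In a non-linear fit, any one of these functions would serve as the objective to minimise over the isotherm parameters, which is why the choice of error function can change the selected isotherm.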
Air cooled turbine component having an internal filtration system
Beeck, Alexander R [Orlando, FL
2012-05-15
A centrifugal particle separator is provided for removing particles such as microscopic dirt or dust particles from the compressed cooling air before it reaches and cools the turbine blades or turbine vanes of a turbine engine. The centrifugal particle separator structure has a substantially cylindrical body with an inlet arranged on a periphery of the substantially cylindrical body. Cooling air enters the centrifugal particle separator through the separator inlet port with a linear velocity. When the cooling air impinges on the substantially cylindrical body, the linear velocity is transformed into a rotational velocity, separating microscopic particles from the cooling air. Microscopic dust particles exit the centrifugal particle separator through a conical outlet and are returned to a working medium.
Applied Multiple Linear Regression: A General Research Strategy
ERIC Educational Resources Information Center
Smith, Brandon B.
1969-01-01
Illustrates some of the basic concepts and procedures for using regression analysis in experimental design, analysis of variance, analysis of covariance, and curvilinear regression. Applications to evaluation of instruction and vocational education programs are illustrated. (GR)
Khalil, Mohamed H.; Shebl, Mostafa K.; Kosba, Mohamed A.; El-Sabrout, Karim; Zaki, Nesma
2016-01-01
Aim: This research was conducted to determine the most affecting parameters on hatchability of indigenous and improved local chickens’ eggs. Materials and Methods: Five parameters were studied (fertility, early and late embryonic mortalities, shape index, egg weight, and egg weight loss) on four strains, namely Fayoumi, Alexandria, Matrouh, and Montazah. Multiple linear regression was performed on the studied parameters to determine the most influencing one on hatchability. Results: The results showed significant differences in commercial and scientific hatchability among strains. Alexandria strain has the highest significant commercial hatchability (80.70%). Regarding the studied strains, highly significant differences in hatching chick weight among strains were observed. Using multiple linear regression analysis, fertility made the greatest percent contribution (71.31%) to hatchability, and the lowest percent contributions were made by shape index and egg weight loss. Conclusion: A prediction of hatchability using multiple regression analysis could be a good tool to improve hatchability percentage in chickens. PMID:27651666
Predicting recycling behaviour: Comparison of a linear regression model and a fuzzy logic model.
Vesely, Stepan; Klöckner, Christian A; Dohnal, Mirko
2016-03-01
In this paper we demonstrate that fuzzy logic can provide a better tool for predicting recycling behaviour than the customarily used linear regression. To show this, we take a set of empirical data on recycling behaviour (N=664), which we randomly divide into two halves. The first half is used to estimate a linear regression model of recycling behaviour, and to develop a fuzzy logic model of recycling behaviour. As the first comparison, the fit of both models to the data included in estimation of the models (N=332) is evaluated. As the second comparison, predictive accuracy of both models for "new" cases (hold-out data not included in building the models, N=332) is assessed. In both cases, the fuzzy logic model significantly outperforms the regression model in terms of fit. To conclude, when accurate predictions of recycling and possibly other environmental behaviours are needed, fuzzy logic modelling seems to be a promising technique. Copyright © 2015 Elsevier Ltd. All rights reserved.
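The split-sample design described in the abstract above (estimate on one random half, then check predictive accuracy on the hold-out half) can be illustrated with a minimal sketch. It assumes synthetic data and a one-predictor least-squares line; the study's survey data, its regression specification, and the fuzzy logic model are not reproduced, and the names `split_half`, `fit_line`, and `rmse` are hypothetical.

```python
import random


def split_half(n, seed=0):
    """Randomly split the indices 0..n-1 into two halves."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    half = n // 2
    return idx[:half], idx[half:]


def fit_line(xs, ys):
    """Least-squares intercept and slope for y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmse(xs, ys, intercept, slope):
    """Root mean square error of the fitted line on (xs, ys)."""
    sq = [(y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)]
    return (sum(sq) / len(sq)) ** 0.5


# Synthetic stand-in for the survey data (N = 664, as in the abstract).
rng = random.Random(1)
x = [rng.uniform(0.0, 10.0) for _ in range(664)]
y = [2.0 + 0.5 * xi + rng.gauss(0.0, 1.0) for xi in x]

# Estimate on the first half only, then evaluate on both halves.
train_idx, test_idx = split_half(len(x))
a, b = fit_line([x[i] for i in train_idx], [y[i] for i in train_idx])

fit_error = rmse([x[i] for i in train_idx], [y[i] for i in train_idx], a, b)
holdout_error = rmse([x[i] for i in test_idx], [y[i] for i in test_idx], a, b)
print(f"in-sample RMSE: {fit_error:.3f}, hold-out RMSE: {holdout_error:.3f}")
```

Comparing the two error figures separates fit to the estimation data from accuracy on "new" cases, which is the distinction the abstract draws between its first and second comparisons.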
Bennett, Bradley C; Husby, Chad E
2008-03-28
Botanical pharmacopoeias are non-random subsets of floras, with some taxonomic groups over- or under-represented. Moerman [Moerman, D.E., 1979. Symbols and selectivity: a statistical analysis of Native American medical ethnobotany, Journal of Ethnopharmacology 1, 111-119] introduced linear regression/residual analysis to examine these patterns. However, regression, the commonly employed analysis, suffers from several statistical flaws. We use contingency table and binomial analyses to examine patterns of Shuar medicinal plant use (from Amazonian Ecuador). We first analyzed the Shuar data using Moerman's approach, modified to better meet requirements of linear regression analysis. Second, we assessed the exact randomization contingency table test for goodness of fit. Third, we developed a binomial model to test for non-random selection of plants in individual families. Modified regression models (which accommodated assumptions of linear regression) reduced R² from 0.59 to 0.38, but did not eliminate all problems associated with regression analyses. Contingency table analyses revealed that the entire flora departs from the null model of equal proportions of medicinal plants in all families. In the binomial analysis, only 10 angiosperm families (of 115) differed significantly from the null model. These 10 families are largely responsible for patterns seen at higher taxonomic levels. Contingency table and binomial analyses offer an easy and statistically valid alternative to the regression approach.
An Application to the Prediction of LOD Change Based on General Regression Neural Network
NASA Astrophysics Data System (ADS)
Zhang, X. H.; Wang, Q. J.; Zhu, J. J.; Zhang, H.
2011-07-01
Traditional prediction of the LOD (length of day) change was based on linear models, such as the least squares model and the autoregressive technique. Due to the complex non-linear features of the LOD variation, the performance of the linear model predictors is not fully satisfactory. This paper applies a non-linear neural network, the general regression neural network (GRNN) model, to forecast the LOD change, and the results are analyzed and compared with those obtained with the back propagation neural network and other models. The comparison shows that the GRNN model is efficient and feasible for predicting the LOD change.
DOT National Transportation Integrated Search
2016-09-01
We consider the problem of solving mixed random linear equations with k components. This is the noiseless setting of mixed linear regression. The goal is to estimate multiple linear models from mixed samples in the case where the labels (which sample...
Linear regression techniques for use in the EC tracer method of secondary organic aerosol estimation
NASA Astrophysics Data System (ADS)
Saylor, Rick D.; Edgerton, Eric S.; Hartsell, Benjamin E.
A variety of linear regression techniques and simple slope estimators are evaluated for use in the elemental carbon (EC) tracer method of secondary organic carbon (OC) estimation. Linear regression techniques based on ordinary least squares are not suitable for situations where measurement uncertainties exist in both regressed variables. In the past, regression based on the method of Deming [1943. Statistical Adjustment of Data. Wiley, London] has been the preferred choice for EC tracer method parameter estimation. In agreement with Chu [2005. Stable estimate of primary OC/EC ratios in the EC tracer method. Atmospheric Environment 39, 1383-1392], we find that in the limited case where primary non-combustion OC (OC_non-comb) is assumed to be zero, the ratio of averages (ROA) approach provides a stable and reliable estimate of the primary OC-EC ratio, (OC/EC)_pri. In contrast with Chu [2005. Stable estimate of primary OC/EC ratios in the EC tracer method. Atmospheric Environment 39, 1383-1392], however, we find that the optimal use of Deming regression (and the more general York et al. [2004. Unified equations for the slope, intercept, and standard errors of the best straight line. American Journal of Physics 72, 367-375] regression) provides excellent results as well. For the more typical case where OC_non-comb is allowed to take a non-zero value, we find that regression based on the method of York is the preferred choice for EC tracer method parameter estimation. In the York regression technique, detailed information on uncertainties in the measurement of OC and EC is used to improve the linear best fit to the given data. If only limited information is available on the relative uncertainties of OC and EC, then Deming regression should be used. On the other hand, use of ROA in the estimation of secondary OC, and thus the assumption of a zero OC_non-comb value, generally leads to an overestimation of the contribution of secondary OC to total measured OC.
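Three of the slope estimators named in the abstract above can be compared in a minimal sketch: ordinary least squares, Deming regression, and the ratio of averages. The Deming slope uses the standard closed form with an assumed known ratio `delta` of the y-error variance to the x-error variance. The function names and the EC/OC values are hypothetical illustrations, not data from the study, and York regression (which needs per-point uncertainties) is not implemented.

```python
import math


def _moments(xs, ys):
    """Means and centered sums of squares and cross-products."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return mean_x, mean_y, sxx, syy, sxy


def ols_slope(xs, ys):
    """Ordinary least squares slope: treats x as error-free."""
    _, _, sxx, _, sxy = _moments(xs, ys)
    return sxy / sxx


def deming_fit(xs, ys, delta=1.0):
    """Deming intercept and slope; delta = var(y errors) / var(x errors).

    Requires a non-zero cross-product sum (sxy != 0).
    """
    mean_x, mean_y, sxx, syy, sxy = _moments(xs, ys)
    diff = syy - delta * sxx
    slope = (diff + math.sqrt(diff ** 2 + 4.0 * delta * sxy ** 2)) / (2.0 * sxy)
    return mean_y - slope * mean_x, slope


def ratio_of_averages(xs, ys):
    """Slope through the origin estimated as mean(y) / mean(x)."""
    return (sum(ys) / len(ys)) / (sum(xs) / len(xs))


# Hypothetical EC and OC concentrations, for illustration only.
ec = [0.5, 1.0, 1.5, 2.0, 2.5]
oc = [1.2, 2.1, 2.9, 4.2, 4.9]

print("OLS slope:        ", ols_slope(ec, oc))
print("Deming (a, b):    ", deming_fit(ec, oc, delta=1.0))
print("Ratio of averages:", ratio_of_averages(ec, oc))
```

The ratio of averages forces the line through the origin, which corresponds to the zero OC_non-comb assumption discussed in the abstract; Deming regression keeps an intercept while allowing for error in both variables.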
Gender, marital status, and commercially prepared food expenditure.
Kroshus, Emily
2008-01-01
Assess how per capita expenditure on commercially prepared food as a proportion of total food expenditure varies by the sex and marital status of the head of the household. Prospective cohort study, data collected by the United States Bureau of Labor Statistics 2004 Consumer Expenditure Survey. United States. Randomly selected nationally representative sample of 5744 US citizens. Per capita spending on commercially prepared food (dependent variable) for every $1 increase in total per capita food spending (independent variable). Linear regressions run separately for each permutation of gender and marital status (never married, married, divorced/separated). Proportionate per capita household expenditure on commercially prepared food was found to vary by marital status and gender. Households headed by unmarried men (both divorced/separated and never married) spent a significantly greater proportion of their food budget on commercially prepared food than their married male peers (38% and 60% higher, respectively). Regardless of marital status, households headed by women were found to spend approximately one-third of their total food budget on commercially prepared foods outside the home. Households headed by never married men spent 63% more per capita than those headed by never married women and households headed by divorced or separated men spent 37% more than those headed by divorced or separated women. Marital status is significantly related to the dietary patterns of households headed by men. In light of the high rates of divorce, separation, and delay of marriage, marriage cannot be considered an inclusive or permanent solution to changing male eating patterns. It is important that nutrition educators learn more about the dietary patterns of households headed by males outside the institution of marriage.
Kanamori, Shogo; Castro, Marcia C.; Sow, Seydou; Matsuno, Rui; Cissokho, Alioune; Jimba, Masamine
2016-01-01
Background The 5S method is a lean management tool for workplace organization, with 5S being an abbreviation for five Japanese words that translate to English as Sort, Set in Order, Shine, Standardize, and Sustain. In Senegal, the 5S intervention program was implemented in 10 health centers in two regions between 2011 and 2014. Objective To identify the impact of the 5S intervention program on the satisfaction of clients (patients and caretakers) who visited the health centers. Design A standardized 5S intervention protocol was implemented in the health centers using a quasi-experimental separate pre-post samples design (four intervention and three control health facilities). A questionnaire with 10 five-point Likert items was used to measure client satisfaction. Linear regression analysis was conducted to identify the intervention's effect on the client satisfaction scores, represented by an equally weighted average of the 10 Likert items (Cronbach's alpha=0.83). Additional regression analyses were conducted to identify the intervention's effect on the scores of each Likert item. Results Backward stepwise linear regression (n=1,928) indicated a statistically significant effect of the 5S intervention, represented by an increase of 0.19 points in the client satisfaction scores in the intervention group, 6 to 8 months after the intervention (p=0.014). Additional regression analyses showed significant score increases of 0.44 (p=0.002), 0.14 (p=0.002), 0.06 (p=0.019), and 0.17 (p=0.044) points on four items, which were, respectively, healthcare staff members’ communication, explanations about illnesses or cases, consultation duration, and clients’ overall satisfaction. Conclusions The 5S has the potential to improve client satisfaction at resource-poor health facilities and could therefore be recommended as a strategic option for improving the quality of healthcare service in low- and middle-income countries. To explore more effective intervention modalities, further studies need to address the mechanisms by which 5S leads to attitude changes in healthcare staff. PMID:27900932
Yang, Xiaowei; Nie, Kun
2008-03-15
Longitudinal data sets in biomedical research often consist of large numbers of repeated measures. In many cases, the trajectories do not look globally linear or polynomial, making it difficult to summarize the data or test hypotheses using standard longitudinal data analysis based on various linear models. An alternative approach is to apply the approaches of functional data analysis, which directly target the continuous nonlinear curves underlying discretely sampled repeated measures. For the purposes of data exploration, many functional data analysis strategies have been developed based on various schemes of smoothing, but fewer options are available for making causal inferences regarding predictor-outcome relationships, a common task seen in hypothesis-driven medical studies. To compare groups of curves, two testing strategies with good power have been proposed for high-dimensional analysis of variance: the Fourier-based adaptive Neyman test and the wavelet-based thresholding test. Using a smoking cessation clinical trial data set, this paper demonstrates how to extend the strategies for hypothesis testing into the framework of functional linear regression models (FLRMs) with continuous functional responses and categorical or continuous scalar predictors. The analysis procedure consists of three steps: first, apply the Fourier or wavelet transform to the original repeated measures; then fit a multivariate linear model in the transformed domain; and finally, test the regression coefficients using either adaptive Neyman or thresholding statistics. Since a FLRM can be viewed as a natural extension of the traditional multiple linear regression model, the development of this model and computational tools should enhance the capacity of medical statistics for longitudinal data.
NASA Astrophysics Data System (ADS)
Gonçalves, Karen dos Santos; Winkler, Mirko S.; Benchimol-Barbosa, Paulo Roberto; de Hoogh, Kees; Artaxo, Paulo Eduardo; de Souza Hacon, Sandra; Schindler, Christian; Künzli, Nino
2018-07-01
Epidemiological studies generally use measurements of particulate matter with diameter less than 2.5 μm (PM2.5) from monitoring networks. Satellite aerosol optical depth (AOD) data has considerable potential in predicting PM2.5 concentrations, and thus provides an alternative method for producing knowledge regarding the level of pollution and its health impact in areas where no ground PM2.5 measurements are available. This is the case in the Brazilian Amazon rainforest region, where forest fires are frequent sources of high pollution. In this study, we applied a non-linear model for predicting PM2.5 concentration from AOD retrievals using interaction terms between average temperature, relative humidity, sine and cosine of date with a period of 365.25 days, and the square of the lagged relative residual. Regression performance was assessed by comparing the goodness of fit and R2 of linear and non-linear regression for six different models. The non-linear prediction showed the best performance, explaining on average 82% of the daily PM2.5 concentrations over the whole period studied. In the context of Amazonia, this was the first study to predict PM2.5 concentrations using the latest high-resolution AOD products in combination with testing of non-linear model performance. Our results permitted a reliable prediction based on the AOD-PM2.5 relationship and set the basis for further investigations on air pollution impacts in the complex context of the Brazilian Amazon Region.
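The seasonal terms mentioned in the abstract above (sine and cosine of date with a period of 365.25 days) can be sketched as follows. Only the period comes from the abstract. The reference date, the function names, and the way the interaction terms are combined in `feature_row` are assumptions made for illustration, and the lagged-residual term is omitted because the abstract does not specify it.

```python
import math
from datetime import date

PERIOD_DAYS = 365.25  # period stated in the abstract


def seasonal_terms(day, origin=date(2000, 1, 1)):
    """Sine and cosine of the date; `origin` is an arbitrary reference day."""
    t = (day - origin).days
    angle = 2.0 * math.pi * t / PERIOD_DAYS
    return math.sin(angle), math.cos(angle)


def feature_row(aod, temperature, humidity, day):
    """Hypothetical design row: intercept, AOD, and AOD interactions.

    The exact model structure is not given in the abstract; this layout
    is an assumption used only to show what interaction terms look like.
    """
    s, c = seasonal_terms(day)
    return [1.0, aod, aod * temperature, aod * humidity, aod * s, aod * c]


row = feature_row(aod=0.8, temperature=27.5, humidity=0.85, day=date(2014, 9, 15))
print(row)
```

Using a sine and cosine pair lets a linear fit place the seasonal peak at any time of year, since any phase shift can be written as a weighted sum of the two terms.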
Senn, Stephen; Graf, Erika; Caputo, Angelika
2007-12-30
Stratifying and matching by the propensity score are increasingly popular approaches to deal with confounding in medical studies investigating effects of a treatment or exposure. A more traditional alternative technique is the direct adjustment for confounding in regression models. This paper discusses fundamental differences between the two approaches, with a focus on linear regression and propensity score stratification, and identifies points to be considered for an adequate comparison. The treatment estimators are examined for unbiasedness and efficiency. This is illustrated in an application to real data and supplemented by an investigation on properties of the estimators for a range of underlying linear models. We demonstrate that in specific circumstances the propensity score estimator is identical to the effect estimated from a full linear model, even if it is built on coarser covariate strata than the linear model. As a consequence the coarsening property of the propensity score (adjustment for a one-dimensional confounder instead of a high-dimensional covariate) may be viewed as a way to implement a pre-specified, richly parametrized linear model. We conclude that the propensity score estimator inherits the potential for overfitting and that care should be taken to restrict covariates to those relevant for outcome. Copyright (c) 2007 John Wiley & Sons, Ltd.
Dai, Xiaoping; Han, Yuping; Zhang, Xiaohong; Hu, Wei; Huang, Liangji; Duan, Wenpei; Li, Siyi; Liu, Xiaolu; Wang, Qian
2017-09-01
A better understanding of willingness to separate waste and waste separation behaviour can aid the design and improvement of waste management policies. Based on intercept questionnaire survey data from undergraduate students and residents in Zhengzhou City, China, this article compared factors affecting the willingness and behaviour of students and residents to participate in waste separation using two binary logistic regression models. Improvement opportunities for waste separation were also discussed. Binary logistic regression results indicate that knowledge of and attitude to waste separation and acceptance of waste education significantly affect the willingness of undergraduate students to separate waste, and demographic factors, such as gender, age, education level, and income, significantly affect the willingness of residents to do so. Presence of waste-specific bins and attitude to waste separation are drivers of waste separation behaviour for both students and residents. Improved education about waste separation and improved facilities are effective in stimulating waste separation, and charging for unsorted waste may be an effective way to improve it in Zhengzhou.
The relation between anxiety and BMI - is it all in our curves?
Haghighi, Mohammad; Jahangard, Leila; Ahmadpanah, Mohammad; Bajoghli, Hafez; Holsboer-Trachsler, Edith; Brand, Serge
2016-01-30
The relation between anxiety and excessive weight is unclear. The aims of the present study were three-fold: First, we examined the association between anxiety and Body Mass Index (BMI). Second, we examined this association separately for female and male participants. Third, we examined both linear and non-linear associations between anxiety and BMI. The BMI of 92 patients suffering from anxiety disorders (mean age: M=27.52; 57% females) was assessed. Patients completed the Beck Anxiety Inventory. Both linear and non-linear correlations were computed for the sample as a whole and separately by gender. No gender differences were observed in anxiety scores or BMI. No linear correlation between anxiety scores and BMI was observed. In contrast, a non-linear correlation showed an inverted U-shaped association, with lower anxiety scores both for lower and very high BMI indices, and higher anxiety scores for medium to high BMI indices. Separate computations revealed no differences between males and females. The pattern of results suggests that the association between BMI and anxiety is complex and more accurately captured with non-linear correlations. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Fenske, Nora; Burns, Jacob; Hothorn, Torsten; Rehfuess, Eva A.
2013-01-01
Background Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. Objective We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. Design Using cross-sectional data for children aged 0–24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. Results At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. Conclusions Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role. PMID:24223839
Aqil, Muhammad; Kita, Ichiro; Yano, Akira; Nishiyama, Soichi
2007-10-01
Traditionally, the multiple linear regression technique has been one of the most widely used models in simulating hydrological time series. However, when the nonlinear phenomenon is significant, multiple linear regression will fail to produce an appropriate predictive model. Recently, neuro-fuzzy systems have gained much popularity for calibrating nonlinear relationships. This study evaluated the potential of a neuro-fuzzy system as an alternative to the traditional statistical regression technique for the purpose of predicting flow from a local source in a river basin. The effectiveness of the proposed identification technique was demonstrated through a simulation study of the river flow time series of the Citarum River in Indonesia. Furthermore, in order to quantify the uncertainty associated with the estimation of river flow, a Monte Carlo simulation was performed. As a comparison, a multiple linear regression analysis that was being used by the Citarum River Authority was also examined using various statistical indices. The simulation results using 95% confidence intervals indicated that the neuro-fuzzy model consistently underestimated the magnitude of high flow while the low and medium flow magnitudes were estimated closer to the observed data. The comparison of the prediction accuracy of the neuro-fuzzy and linear regression methods indicated that the neuro-fuzzy approach was more accurate in predicting river flow dynamics. The neuro-fuzzy model was able to improve the root mean square error (RMSE) and mean absolute percentage error (MAPE) values of the multiple linear regression forecasts by about 13.52% and 10.73%, respectively. Considering its simplicity and efficiency, the neuro-fuzzy model is recommended as an alternative tool for modeling of flow dynamics in the study area.
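The two accuracy indices named in the abstract above, RMSE and MAPE, and the relative improvement of one forecast over another can be sketched as below. The function names and the flow values are hypothetical; they are not the Citarum River data, and the neuro-fuzzy model itself is not reproduced.

```python
def rmse(observed, predicted):
    """Root mean square error."""
    sq = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return (sum(sq) / len(sq)) ** 0.5


def mape(observed, predicted):
    """Mean absolute percentage error, in percent (observed must be non-zero)."""
    rel = [abs(o - p) / abs(o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(rel) / len(rel)


def percent_improvement(baseline_error, candidate_error):
    """Reduction of the candidate's error relative to the baseline, in percent."""
    return 100.0 * (baseline_error - candidate_error) / baseline_error


# Hypothetical observed flows and two competing forecasts.
observed = [120.0, 150.0, 90.0, 200.0]
baseline_forecast = [110.0, 165.0, 100.0, 180.0]
candidate_forecast = [115.0, 158.0, 95.0, 188.0]

base_rmse = rmse(observed, baseline_forecast)
cand_rmse = rmse(observed, candidate_forecast)
print("RMSE improvement (%):", percent_improvement(base_rmse, cand_rmse))
print("MAPE improvement (%):",
      percent_improvement(mape(observed, baseline_forecast),
                          mape(observed, candidate_forecast)))
```

RMSE penalizes large errors more heavily, while MAPE weights each error by the size of the observed flow, so the two indices can rank models differently when errors concentrate in high-flow periods.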
González-Aparicio, I; Hidalgo, J; Baklanov, A; Padró, A; Santa-Coloma, O
2013-07-01
There is extensive evidence of negative health impacts linked to the rise of regional background levels of particulate matter (PM10). These levels are often higher over urban areas, making them one of the main air pollution concerns. This is the case in the Bilbao metropolitan area, Spain. This study describes a data-driven model to diagnose PM10 levels in Bilbao at hourly intervals. The model is built with a training period of 7 years of historical data covering different urban environments (inland, city centre and coastal sites). The explanatory variables are quantitative (log [NO2], temperature, short-wave incoming radiation, wind speed and direction, specific humidity, hour and vehicle intensity) and qualitative (working days/weekends, season (winter/summer), the hour (from 00 to 23 UTC) and precipitation/no precipitation). Three different linear regression models are compared: simple linear regression; linear regression with interaction terms (INT); and linear regression with interaction terms selected following Sawa's Bayesian Information Criterion (INT-BIC). Each type of model is calculated over two different periods: a training dataset of 6 years and a testing dataset of 1 year. The results show that the INT-BIC-based model (R² = 0.42) is the best. Results were R of 0.65, 0.63 and 0.60 for the city centre, inland and coastal sites, respectively, a level of confidence similar to the state-of-the-art methodology. The related error calculated for longer time intervals (monthly or seasonal means) diminished significantly (R of 0.75-0.80 for monthly means and R of 0.80 to 0.98 for seasonal means) with respect to shorter periods.
O'Leary, Neil; Chauhan, Balwantray C; Artes, Paul H
2012-10-01
To establish a method for estimating the overall statistical significance of visual field deterioration from an individual patient's data, and to compare its performance to pointwise linear regression. The Truncated Product Method was used to calculate a statistic S that combines evidence of deterioration from individual test locations in the visual field. The overall statistical significance (P value) of visual field deterioration was inferred by comparing S with its permutation distribution, derived from repeated reordering of the visual field series. Permutation of pointwise linear regression (PoPLR) and pointwise linear regression were evaluated in data from patients with glaucoma (944 eyes, median mean deviation -2.9 dB, interquartile range: -6.3, -1.2 dB) followed for more than 4 years (median 10 examinations over 8 years). False-positive rates were estimated from randomly reordered series of this dataset, and hit rates (proportion of eyes with significant deterioration) were estimated from the original series. The false-positive rates of PoPLR were indistinguishable from the corresponding nominal significance levels and were independent of baseline visual field damage and length of follow-up. At P < 0.05, the hit rates of PoPLR were 12, 29, and 42%, at the fifth, eighth, and final examinations, respectively, and at matching specificities they were consistently higher than those of pointwise linear regression. In contrast to population-based progression analyses, PoPLR provides a continuous estimate of statistical significance for visual field deterioration individualized to a particular patient's data. This allows close control over specificity, essential for monitoring patients in clinical practice and in clinical trials.
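Two building blocks of the procedure described in the abstract above can be sketched in a few lines: a truncated-product statistic that combines pointwise p-values, and a permutation p-value obtained by comparing the observed statistic with values from reordered series. This is a minimal sketch under stated assumptions, not the authors' implementation. The truncation threshold `tau`, the use of the negative log of the product, the "plus one" convention in the permutation p-value, and both function names are assumptions. The pointwise p-values would come from a linear regression of sensitivity on time at each test location, which is not computed here.

```python
import math


def truncated_product_stat(p_values, tau=0.05):
    """Combine evidence as S = -sum(log p) over p-values at or below tau.

    Larger S means stronger combined evidence; S is 0 when no p-value
    falls at or below the threshold.
    """
    return -sum(math.log(p) for p in p_values if p <= tau)


def permutation_p_value(observed_stat, permuted_stats):
    """Share of statistics at least as large as the observed one.

    The observed ordering is counted as one of the permutations, so the
    result can never be exactly zero.
    """
    exceed = sum(1 for s in permuted_stats if s >= observed_stat)
    return (exceed + 1) / (len(permuted_stats) + 1)


# Hypothetical pointwise p-values for one visual field series.
pointwise_p = [0.01, 0.50, 0.04, 0.30]
s_observed = truncated_product_stat(pointwise_p)

# Hypothetical statistics from four reordered versions of the same series.
s_permuted = [0.0, 3.1, 9.2, 0.0]
print("S =", s_observed, "overall p =", permutation_p_value(s_observed, s_permuted))
```

In practice the permuted statistics would be produced by shuffling the order of the examinations many times and recomputing the pointwise regressions each time, which is what makes the resulting significance specific to the individual patient's variability.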
ERIC Educational Resources Information Center
Liou, Pey-Yan
2009-01-01
The current study examines three regression models for analyzing count data: OLS (ordinary least squares) linear regression, Poisson regression, and negative binomial regression. Simulation results show that the OLS regression model performed better than the others, since it did not produce more false statistically significant relationships than…
Rasmussen, Andrew; Cissé, Aïcha; Han, Ying; Roubeni, Sonia
2018-02-12
Immigrants make up large proportions of many low-income neighborhoods, but have been largely ignored in the neighborhood safety literature. We examined perceived safety's association with migration using a six-item, child-specific measure of parents' perceptions of school-aged (5-12 years of age) children's safety in a sample of 93 West African immigrant parents in New York City. Aims of the study were (a) to identify pre-migration correlates (e.g., trauma in home countries), (b) to identify migration-related correlates (e.g., immigration status, time spent separated from children during migration), and (c) to identify pre-migration and migration correlates that accounted for variance after controlling for non-migration-related correlates (e.g., neighborhood crime, parents' psychological distress). In a linear regression model, children's safety was associated with borough of residence, greater English ability, less emotional distress, less parenting difficulty, and a history of child separation. Parents' and children's gender, parents' immigration status, and the number of contacts in the U.S. pre-migration and pre-migration trauma were not associated with children's safety. That child separation was positively associated with safety perceptions suggests that the processes that facilitate parent-child separation might be reconceptualized as strengths for transnational families. Integrating migration-related factors into the discussion of neighborhood safety for immigrant populations allows for more nuanced views of immigrant families' well-being in host countries. © Society for Community Research and Action 2018.
Use of AMMI and linear regression models to analyze genotype-environment interaction in durum wheat.
Nachit, M M; Nachit, G; Ketata, H; Gauch, H G; Zobel, R W
1992-03-01
The joint durum wheat (Triticum turgidum L var 'durum') breeding program of the International Maize and Wheat Improvement Center (CIMMYT) and the International Center for Agricultural Research in the Dry Areas (ICARDA) for the Mediterranean region employs extensive multilocation testing. Multilocation testing produces significant genotype-environment (GE) interaction that reduces the accuracy for estimating yield and selecting appropriate germ plasm. The sum of squares (SS) of GE interaction was partitioned by linear regression techniques into joint, genotypic, and environmental regressions, and by the Additive Main effects and Multiplicative Interactions (AMMI) model into five significant Interaction Principal Component Axes (IPCA). The AMMI model was more effective in partitioning the interaction SS than the linear regression technique. The SS contained in the AMMI model was 6 times higher than the SS for all three regressions. Postdictive assessment recommended the use of the first five IPCA axes, while predictive assessment recommended AMMI1 (main effects plus IPCA1). After elimination of random variation, AMMI1 estimates for genotypic yields within sites were more precise than unadjusted means. This increased precision was equivalent to increasing the number of replications by a factor of 3.7.
Lorenzo-Seva, Urbano; Ferrando, Pere J
2011-03-01
We provide an SPSS program that implements currently recommended techniques and recent developments for selecting variables in multiple linear regression analysis via the relative importance of predictors. The approach consists of: (1) optimally splitting the data for cross-validation, (2) selecting the final set of predictors to be retained in the regression equation, and (3) assessing the behavior of the chosen model using standard indices and procedures. The SPSS syntax, a short manual, and data files related to this article are available as supplemental materials from brm.psychonomic-journals.org/content/supplemental.
NASA Astrophysics Data System (ADS)
Gusriani, N.; Firdaniza
2018-03-01
The presence of outliers in multiple linear regression analysis violates the Gaussian assumption. If the least squares method is nevertheless applied to such data, it will produce a model that cannot represent most of the data. A regression method that is robust against outliers is therefore needed. This paper compares the Minimum Covariance Determinant (MCD) method and the TELBS method on secondary data on the productivity of phytoplankton, which contain outliers. Based on the robust coefficient of determination, the MCD method produces a better model than the TELBS method.
Limit cycles in planar piecewise linear differential systems with nonregular separation line
NASA Astrophysics Data System (ADS)
Cardin, Pedro Toniol; Torregrosa, Joan
2016-12-01
In this paper we deal with planar piecewise linear differential systems defined in two zones. We consider the case when the two linear zones are angular sectors of angles α and 2π − α, respectively, for α ∈ (0, π). We study the problem of determining lower bounds for the number of isolated periodic orbits in such systems using Melnikov functions. These limit cycles appear studying higher order piecewise linear perturbations of a linear center. It is proved that the maximum number of limit cycles that can appear up to a sixth order perturbation is five. Moreover, for these values of α, we prove the existence of systems with four limit cycles up to fifth order and, for α = π/2, we provide an explicit example with five up to sixth order. In general, the nonregular separation line increases the number of periodic orbits in comparison with the case where the two zones are separated by a straight line.
Orthogonal Projection in Teaching Regression and Financial Mathematics
ERIC Educational Resources Information Center
Kachapova, Farida; Kachapov, Ilias
2010-01-01
Two improvements in teaching linear regression are suggested. The first is to include the population regression model at the beginning of the topic. The second is to use a geometric approach: to interpret the regression estimate as an orthogonal projection and the estimation error as the distance (which is minimized by the projection). Linear…
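The geometric reading described in this abstract can be checked numerically. The following is a minimal sketch, not material from the article: the data values are invented for illustration, and the helper names fit_line and dot are hypothetical. It fits a simple least-squares line and confirms that the residual vector is orthogonal to both columns of the design matrix (the constant column and the predictor), which is what it means for the fitted values to be an orthogonal projection of the response.

```python
def fit_line(x, y):
    """Ordinary least-squares intercept and slope for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


# Invented illustration data (not taken from the article).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(x, y)
fitted = [intercept + slope * xi for xi in x]
residual = [yi - fi for yi, fi in zip(y, fitted)]
ones = [1.0] * len(x)

# Orthogonality: the residual has zero inner product with each column.
print("residual . ones =", dot(residual, ones))
print("residual . x    =", dot(residual, x))

# Minimal distance: any other line lies farther from y than the projection.
other = [(intercept + 0.1) + (slope - 0.05) * xi for xi in x]
other_residual = [yi - oi for yi, oi in zip(y, other)]
print("SSE at projection:", dot(residual, residual))
print("SSE at other line:", dot(other_residual, other_residual))
```

Both inner products come out at rounding-error size, and the squared distance for the perturbed line is larger, matching the description of the estimation error as a minimized distance.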
Logistic models--an odd(s) kind of regression.
Jupiter, Daniel C
2013-01-01
The logistic regression model bears some similarity to the multivariable linear regression with which we are familiar. However, the differences are great enough to warrant a discussion of the need for and interpretation of logistic regression. Copyright © 2013 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.
Wang, Huifang; Xiao, Bo; Wang, Mingyu; Shao, Ming'an
2013-01-01
Soil water retention parameters are critical to quantify flow and solute transport in vadose zone, while the presence of rock fragments remarkably increases their variability. Therefore a novel method for determining water retention parameters of soil-gravel mixtures is required. The procedure to generate such a model is based firstly on the determination of the quantitative relationship between the content of rock fragments and the effective saturation of soil-gravel mixtures, and then on the integration of this relationship with former analytical equations of water retention curves (WRCs). In order to find such relationships, laboratory experiments were conducted to determine WRCs of soil-gravel mixtures obtained with a clay loam soil mixed with shale clasts or pebbles in three size groups with various gravel contents. Data showed that the effective saturation of the soil-gravel mixtures with the same kind of gravels within one size group had a linear relation with gravel contents, and had a power relation with the bulk density of samples at any pressure head. Revised formulas for water retention properties of the soil-gravel mixtures are proposed to establish the water retention curved surface models of the power-linear functions and power functions. The analysis of the parameters obtained by regression and validation of the empirical models showed that they were acceptable by using either the measured data of separate gravel size group or those of all the three gravel size groups having a large size range. Furthermore, the regression parameters of the curved surfaces for the soil-gravel mixtures with a large range of gravel content could be determined from the water retention data of the soil-gravel mixtures with two representative gravel contents or bulk densities. 
Such revised water retention models are potentially applicable in regional or large scale field investigations of significantly heterogeneous media, where various gravel sizes and different gravel contents are present. PMID:23555040
Handy elementary algebraic properties of the geometry of entanglement
NASA Astrophysics Data System (ADS)
Blair, Howard A.; Alsing, Paul M.
2013-05-01
The space of separable states of a quantum system is a hyperbolic surface in a high dimensional linear space, which we call the separation surface, within the exponentially high dimensional linear space containing the quantum states of an n component multipartite quantum system. A vector in the linear space is representable as an n-dimensional hypermatrix with respect to bases of the component linear spaces. A vector will be on the separation surface iff every determinant of every 2-dimensional, 2-by-2 submatrix of the hypermatrix vanishes. This highly rigid constraint can be tested merely in time asymptotically proportional to d, where d is the dimension of the state space of the system due to the extreme interdependence of the 2-by-2 submatrices. The constraint on 2-by-2 determinants entails an elementary closed-form formula for a parametric characterization of the entire separation surface with d-1 parameters in the characterization. The state of a factor of a partially separable state can be calculated in time asymptotically proportional to the dimension of the state space of the component. If all components of the system have approximately the same dimension, the time complexity of calculating a component state as a function of the parameters is asymptotically proportional to the time required to sort the basis. Metric-based entanglement measures of pure states are characterized in terms of the separation hypersurface.
Hunt, E R; Martin, F C; Running, S W
1991-01-01
Simulation models of ecosystem processes may be necessary to separate the long-term effects of climate change on forest productivity from the effects of year-to-year variations in climate. The objective of this study was to compare simulated annual stem growth with measured annual stem growth from 1930 to 1982 for a uniform stand of ponderosa pine (Pinus ponderosa Dougl.) in Montana, USA. The model, FOREST-BGC, was used to simulate growth assuming leaf area index (LAI) was either constant or increasing. The measured stem annual growth increased exponentially over time; the differences between the simulated and measured stem carbon accumulations were not large. Growth trends were removed from both the measured and simulated annual increments of stem carbon to enhance the year-to-year variations in growth resulting from climate. The detrended increments from the increasing LAI simulation fit the detrended increments of the stand data over time with an R(2) of 0.47; the R(2) increased to 0.65 when the previous year's simulated detrended increment was included with the current year's simulated increment to account for autocorrelation. Stepwise multiple linear regression of the detrended increments of the stand data versus monthly meteorological variables had an R(2) of 0.37, and the R(2) increased to 0.47 when the previous year's meteorological data were included to account for autocorrelation. Thus, FOREST-BGC was more sensitive to the effects of year-to-year climate variation on annual stem growth than were multiple linear regression models.
Fogelholm, M; Kanerva, N; Männistö, S
2015-09-01
High consumption of meat has been linked with the risk for obesity and chronic diseases. This could partly be explained by the association between meat and lower-quality diet. We studied whether high intake of red and processed meat was associated with lower-quality dietary habits, assessed against selected nutrients, other food groups and total diet. Moreover, we studied whether meat consumption was associated with obesity, after adjustment for all identified associations between meat and food consumption. The nationally representative cross-sectional study population consisted of 2190 Finnish men and 2530 women, aged 25-74 years. Food consumption over the previous 12 months was assessed using a validated 131-item Food Frequency Questionnaire. Associations between nutrients, foods, a modified Baltic Sea Diet Score and meat consumption (quintile classification) were analysed using linear regression. The models were adjusted for age and energy intake and additionally for education, physical activity and smoking. High consumption of red and processed meat was inversely associated with fruits, whole grain and nuts, and positively with potatoes, oil and coffee in both sexes. Results separately for the two types of meat were essentially similar. In a linear regression analysis, high consumption of meat was positively associated with body mass index in both men and women, even when using a model adjusted for all foods with a significant association with meat consumption in both sexes identified in this study. The association between meat consumption and a lower-quality diet may complicate studies on meat and health.
Yazdani, Kamran; Rahimi-Movaghar, Afarin; Nedjat, Saharnaz; Ghalichi, Leila; Khalili, Malahat
2015-01-01
Background: Since Tehran University of Medical Sciences (TUMS) has the oldest and highest number of research centers among all Iranian medical universities, this study was conducted to evaluate scientific output of research centers affiliated to Tehran University of Medical Sciences (TUMS) using scientometric indices and the affecting factors. Moreover, a number of scientometric indicators were introduced. Methods: This cross-sectional study was performed to evaluate a 5-year scientific performance of research centers of TUMS. Data were collected through questionnaires, annual evaluation reports of the Ministry of Health, and also from Scopus database. We used appropriate measures of central tendency and variation for descriptive analyses. Moreover, uni- and multi-variable linear regression were used to evaluate the effect of independent factors on the scientific output of the centers. Results: The medians of the numbers of papers and books during a 5-year period were 150.5 and 2.5 respectively. The median of the "articles per researcher" was 19.1. Based on multiple linear regression, younger age centers (p=0.001), having a separate budget line (p=0.016), and number of research personnel (p<0.001) had a direct significant correlation with the number of articles while real properties had a reverse significant correlation with it (p=0.004). Conclusion: The results can help policy makers and research managers to allocate sufficient resources to improve current situation of the centers. Newly adopted and effective scientometric indices are suggested to be used to evaluate scientific outputs and functions of these centers. PMID:26157724
Habitat characteristics affecting fish assemblages on a Hawaiian coral reef
Friedlander, A.M.; Parrish, J.D.
1998-01-01
Habitat characteristics of a reef were examined as potential influences on fish assemblage structure, using underwater visual census to estimate numbers and biomass of all fishes visible on 42 benthic transects and making quantitative measurements of 13 variables of the corresponding physical habitat and sessile biota. Fish assemblages in the diverse set of benthic habitats were grouped by detrended correspondence analysis, and associated with six major habitat types. Statistical differences were shown between a number of these habitat types for various ensemble variables of the fish assemblages. Overall, both for complete assemblages and for component major trophic and mobility guilds, these variables tended to have higher values where reef substratum was more structurally or topographically complex, and closer to reef edges. When study sites were separately divided into five depth strata, the deeper strata tended to have statistically higher values of ensemble variables for the fish assemblages. Patterns with depth varied among the various trophic and mobility guilds. Multiple linear regression models indicated that for the complete assemblages and for most trophic and mobility guilds, a large part of the variability for most ensemble variables was explained by measures of holes in the substratum, with important contributions from measured substratum rugosity and depth. A strong linear relationship found by regression of mean fish length on mean volume of holes in the reef surface emphasized the importance of shelter for fish assemblages. Results of this study may have practical applications in designing reserve areas as well as theoretical value in helping to explain the organization of reef fish assemblages.
Inverse Association between Air Pressure and Rheumatoid Arthritis Synovitis
Furu, Moritoshi; Nakabo, Shuichiro; Ohmura, Koichiro; Nakashima, Ran; Imura, Yoshitaka; Yukawa, Naoichiro; Yoshifuji, Hajime; Matsuda, Fumihiko; Ito, Hiromu; Fujii, Takao; Mimori, Tsuneyo
2014-01-01
Rheumatoid arthritis (RA) is a bone destructive autoimmune disease. Many patients with RA recognize fluctuations of their joint synovitis according to changes of air pressure, but the correlations between them have never been addressed in large-scale association studies. To address this point we recruited large-scale assessments of RA activity in a Japanese population, and performed an association analysis. Here, a total of 23,064 assessments of RA activity from 2,131 patients were obtained from the KURAMA (Kyoto University Rheumatoid Arthritis Management Alliance) database. Detailed correlations between air pressure and joint swelling or tenderness were analyzed separately for each of the 326 patients with more than 20 assessments to regulate intra-patient correlations. Association studies were also performed for seven consecutive days to identify the strongest correlations. Standardized multiple linear regression analysis was performed to evaluate independent influences from other meteorological factors. As a result, components of composite measures for RA disease activity revealed suggestive negative associations with air pressure. The 326 patients displayed significant negative mean correlations between air pressure and swellings or the sum of swellings and tenderness (p = 0.00068 and 0.00011, respectively). Among the seven consecutive days, the most significant mean negative correlations were observed for air pressure three days before evaluations of RA synovitis (p = 1.7×10^-7, 0.00027, and 8.3×10^-8, for swellings, tenderness and the sum of them, respectively). Standardized multiple linear regression analysis revealed these associations were independent from humidity and temperature. Our findings suggest that air pressure is inversely associated with synovitis in patients with RA. PMID:24454853
Astudillo, Mariana; Kuendig, Hervé; Centeno-Gil, Adriana; Wicki, Matthias; Gmel, Gerhard
2014-09-01
This study investigated the associations of alcohol outlet density with specific alcohol outcomes (consumption and consequences) among young men in Switzerland and assessed the possible geographically related variations. Alcohol consumption and drinking consequences were measured in a 2010-2011 study assessing substance use risk factors (Cohort Study on Substance Use Risk Factors) among 5519 young Swiss men. Outlet density was based on the number of on- and off-premise outlets in the district of residence. Linear regression models were run separately for drinking level, heavy episodic drinking (HED) and drinking consequences. Geographically weighted regression models were estimated when variations were recorded at the district level. No consistent association was found between outlet density and drinking consequences. A positive association between drinking level and HED with on-premise outlet density was found. Geographically weighted regressions were run for drinking level and HED. The predicted values for HED were higher in the southwest part of Switzerland (French-speaking part). Among Swiss young men, the density of outlets and, in particular, the abundance of bars, clubs and other on-premise outlets was associated with drinking level and HED, even when drinking consequences were not significantly affected. These findings support the idea that outlet density needs to be considered when developing and implementing regional-based prevention initiatives. © 2014 Australasian Professional Society on Alcohol and other Drugs.
Analysis of Learning Curve Fitting Techniques.
1987-09-01
1986. 15. Neter, John and others. Applied Linear Regression Models. Homewood IL: Irwin, 19-33. 16. SAS User’s Guide: Basics, Version 5 Edition. SAS... Linear Regression Techniques (15:23-52). Random errors are assumed to be normally distributed when using ordinary least-squares, according to Johnston...lot estimated by the improvement curve formula. For a more detailed explanation of the ordinary least-squares technique, see Neter, et al., Applied
On vertical profile of ozone at Syowa
NASA Technical Reports Server (NTRS)
Chubachi, Shigeru
1994-01-01
The difference in the vertical ozone profile at Syowa between 1966-1981 and 1982-1988 is shown. The month-height cross section of the slope of the linear regressions between ozone partial pressure and 100-mb temperature is also shown. The vertically integrated values of the slopes are in close agreement with the slopes calculated by linear regression of Dobson total ozone on 100-mb temperature in the period of 1982-1988.
Kovačević, Strahinja; Karadžić, Milica; Podunavac-Kuzmanović, Sanja; Jevrić, Lidija
2018-01-01
The present study is based on the quantitative structure-activity relationship (QSAR) analysis of binding affinity toward human prion protein (huPrP C ) of quinacrine, pyridine dicarbonitrile, diphenylthiazole and diphenyloxazole analogs applying different linear and non-linear chemometric regression techniques, including univariate linear regression, multiple linear regression, partial least squares regression and artificial neural networks. The QSAR analysis distinguished molecular lipophilicity as an important factor that contributes to the binding affinity. Principal component analysis was used in order to reveal similarities or dissimilarities among the studied compounds. The analysis of in silico absorption, distribution, metabolism, excretion and toxicity (ADMET) parameters was conducted. The ranking of the studied analogs on the basis of their ADMET parameters was done applying the sum of ranking differences, as a relatively new chemometric method. The main aim of the study was to reveal the most important molecular features whose changes lead to the changes in the binding affinities of the studied compounds. Another point of view on the binding affinity of the most promising analogs was established by application of molecular docking analysis. The results of the molecular docking were proven to be in agreement with the experimental outcome. Copyright © 2017 Elsevier B.V. All rights reserved.
Classification of sodium MRI data of cartilage using machine learning.
Madelin, Guillaume; Poidevin, Frederick; Makrymallis, Antonios; Regatte, Ravinder R
2015-11-01
To assess the possible utility of machine learning for classifying subjects with and subjects without osteoarthritis using sodium magnetic resonance imaging data. Theory: Support vector machine, k-nearest neighbors, naïve Bayes, discriminant analysis, linear regression, logistic regression, neural networks, decision tree, and tree bagging were tested. Sodium magnetic resonance imaging with and without fluid suppression by inversion recovery was acquired on the knee cartilage of 19 controls and 28 osteoarthritis patients. Sodium concentrations were measured in regions of interest in the knee for both acquisitions. Mean (MEAN) and standard deviation (STD) of these concentrations were measured in each region of interest, and the minimum, maximum, and mean of these two measurements were calculated over all regions of interest for each subject. The resulting 12 variables per subject were used as predictors for classification. Either Min [STD] alone, or in combination with Mean [MEAN] or Min [MEAN], all from fluid suppressed data, were the best predictors with an accuracy >74%, mainly with linear logistic regression and linear support vector machine. Other good classifiers include discriminant analysis, linear regression, and naïve Bayes. Machine learning is a promising technique for classifying osteoarthritis patients and controls from sodium magnetic resonance imaging data. © 2014 Wiley Periodicals, Inc.
Claessens, T E; Georgakopoulos, D; Afanasyeva, M; Vermeersch, S J; Millar, H D; Stergiopulos, N; Westerhof, N; Verdonck, P R; Segers, P
2006-04-01
The linear time-varying elastance theory is frequently used to describe the change in ventricular stiffness during the cardiac cycle. The concept assumes that all isochrones (i.e., curves that connect pressure-volume data occurring at the same time) are linear and have a common volume intercept. Of specific interest is the steepest isochrone, the end-systolic pressure-volume relationship (ESPVR), of which the slope serves as an index for cardiac contractile function. Pressure-volume measurements, achieved with a combined pressure-conductance catheter in the left ventricle of 13 open-chest anesthetized mice, showed a marked curvilinearity of the isochrones. We therefore analyzed the shape of the isochrones by using six regression algorithms (two linear, two quadratic, and two logarithmic, each with a fixed or time-varying intercept) and discussed the consequences for the elastance concept. Our main observations were 1) the volume intercept varies considerably with time; 2) isochrones are equally well described by using quadratic or logarithmic regression; 3) linear regression with a fixed intercept shows poor correlation (R(2) < 0.75) during isovolumic relaxation and early filling; and 4) logarithmic regression is superior in estimating the fixed volume intercept of the ESPVR. In conclusion, the linear time-varying elastance fails to provide a sufficiently robust model to account for changes in pressure and volume during the cardiac cycle in the mouse ventricle. A new framework accounting for the nonlinear shape of the isochrones needs to be developed.
Lopes, Marta B; Calado, Cecília R C; Figueiredo, Mário A T; Bioucas-Dias, José M
2017-06-01
The monitoring of biopharmaceutical products using Fourier transform infrared (FT-IR) spectroscopy relies on calibration techniques involving the acquisition of spectra of bioprocess samples along the process. The most commonly used method for that purpose is partial least squares (PLS) regression, under the assumption that a linear model is valid. Despite being successful in the presence of small nonlinearities, linear methods may fail in the presence of strong nonlinearities. This paper studies the potential usefulness of nonlinear regression methods for predicting, from in situ near-infrared (NIR) and mid-infrared (MIR) spectra acquired in high-throughput mode, biomass and plasmid concentrations in Escherichia coli DH5-α cultures producing the plasmid model pVAX-LacZ. The linear methods PLS and ridge regression (RR) are compared with their kernel (nonlinear) versions, kPLS and kRR, as well as with the (also nonlinear) relevance vector machine (RVM) and Gaussian process regression (GPR). For the systems studied, RR provided better predictive performances compared to the remaining methods. Moreover, the results point to further investigation based on larger data sets whenever differences in predictive accuracy between a linear method and its kernelized version could not be found. The use of nonlinear methods, however, shall be judged regarding the additional computational cost required to tune their additional parameters, especially when the less computationally demanding linear methods herein studied are able to successfully monitor the variables under study.
Linear solvation energy relationships in normal phase chromatography based on gradient separations.
Wu, Di; Lucy, Charles A
2017-09-22
Coupling the modified Soczewiński model and one gradient run, a gradient method was developed to build a linear solvation energy relationship (LSER) for normal phase chromatography. The gradient method was tested on dinitroanilinopropyl (DNAP) and silica columns with hexane/dichloromethane (DCM) mobile phases. LSER models built based on the gradient separation agree with those derived from a series of isocratic separations. Both models have similar LSER coefficients and comparable goodness of fit, but the LSER model based on gradient separation required fewer trial and error experiments. Copyright © 2017 Elsevier B.V. All rights reserved.
Application of General Regression Neural Network to the Prediction of LOD Change
NASA Astrophysics Data System (ADS)
Zhang, Xiao-Hong; Wang, Qi-Jie; Zhu, Jian-Jun; Zhang, Hao
2012-01-01
Traditional methods for predicting the change in length of day (LOD change) are mainly based on linear models, such as the least squares model and the autoregression model. However, the LOD change contains complicated non-linear factors, and the predictions of linear models are often unsatisfactory. Therefore, a non-linear neural network, the general regression neural network (GRNN), is applied to the prediction of the LOD change, and the result is compared with the predictions obtained with the BP (back propagation) neural network model and other models. The comparison shows that applying the GRNN to the prediction of the LOD change is highly effective and feasible.
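For readers unfamiliar with the GRNN, its output is a Gaussian-kernel weighted average of the training targets, with a single smoothing parameter. The sketch below illustrates that idea only; it is not the authors' implementation. The series is a synthetic sine wave standing in for real LOD data, and the lag count, the smoothing width sigma, and the function name grnn_predict are assumptions made for the example.

```python
import math


def grnn_predict(patterns, targets, query, sigma):
    """GRNN output: Gaussian-kernel weighted average of training targets.

    patterns: list of input vectors, targets: matching outputs,
    query: input vector to predict for, sigma: smoothing width (assumed).
    """
    weights = []
    for pattern in patterns:
        squared_distance = sum((a - b) ** 2 for a, b in zip(pattern, query))
        weights.append(math.exp(-squared_distance / (2.0 * sigma ** 2)))
    total = sum(weights)  # may underflow to 0 if sigma is far too small
    return sum(w * t for w, t in zip(weights, targets)) / total


# Synthetic stand-in series (NOT real length-of-day data).
series = [math.sin(0.3 * t) for t in range(40)]

# Each training pattern holds three consecutive values; the target is the next one.
lags = 3
patterns = [series[i:i + lags] for i in range(len(series) - lags)]
targets = [series[i + lags] for i in range(len(series) - lags)]

# Predict the value that follows the last three observations.
pred = grnn_predict(patterns, targets, series[-lags:], sigma=0.2)
print("one-step-ahead prediction:", pred)

# With a narrow kernel, querying a stored pattern nearly reproduces its target.
recall = grnn_predict(patterns, targets, patterns[10], sigma=0.01)
print("recall of stored pattern:", recall, "target:", targets[10])
```

Because the prediction is a weighted average, it always lies between the smallest and largest training target, so a GRNN of this form cannot extrapolate beyond the range it has seen; the choice of sigma controls how local the averaging is.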
Werner, Jan; Griebeler, Eva Maria
2014-01-01
We tested if growth rates of recent taxa are unequivocally separated between endotherms and ectotherms, and compared these to dinosaurian growth rates. We therefore performed linear regression analyses on the log-transformed maximum growth rate against log-transformed body mass at maximum growth for extant altricial birds, precocial birds, eutherians, marsupials, reptiles, fishes and dinosaurs. Regression models of precocial birds (and fishes) strongly differed from Case's study (1978), which is often used to compare dinosaurian growth rates to those of extant vertebrates. For all taxonomic groups, the slope of 0.75 expected from the Metabolic Theory of Ecology was statistically supported. To compare growth rates between taxonomic groups we therefore used regressions with this fixed slope and group-specific intercepts. On average, maximum growth rates of ectotherms were about 10 (reptiles) to 20 (fishes) times (in comparison to mammals) or even 45 (reptiles) to 100 (fishes) times (in comparison to birds) lower than in endotherms. While on average all taxa were clearly separated from each other, individual growth rates overlapped between several taxa and even between endotherms and ectotherms. Dinosaurs had growth rates intermediate between similar sized/scaled-up reptiles and mammals, but a much lower rate than scaled-up birds. All dinosaurian growth rates were within the range of extant reptiles and mammals, and were lower than those of birds. Under the assumption that growth rate and metabolic rate are indeed linked, our results suggest two alternative interpretations. Compared to other sauropsids, the growth rates of studied dinosaurs clearly indicate that they had an ectothermic rather than an endothermic metabolic rate. 
Compared to other vertebrate growth rates, the overall high variability in growth rates of extant groups and the high overlap between individual growth rates of endothermic and ectothermic extant species make it impossible to rule out either of the two thermoregulation strategies for studied dinosaurs. PMID:24586409
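The fixed-slope comparison described in this abstract has a simple closed form: when the slope of a log-log regression is held at 0.75, the least-squares intercept of each group is the mean of log(rate) − 0.75·log(mass), and the difference between two intercepts gives the ratio of growth rates at equal body mass. The sketch below shows that calculation only. The two groups and all numbers are synthetic, generated so that the true ratio is 10; they are not data from the study, and the function name is hypothetical.

```python
import math

FIXED_SLOPE = 0.75  # slope expected from the Metabolic Theory of Ecology


def fixed_slope_intercept(masses, rates, slope=FIXED_SLOPE):
    """Least-squares intercept of log10(rate) on log10(mass) with the slope fixed."""
    offsets = [math.log10(r) - slope * math.log10(m) for m, r in zip(masses, rates)]
    return sum(offsets) / len(offsets)


# Synthetic groups (illustration only): rates follow c * mass**0.75 exactly.
masses = [10.0, 100.0, 1000.0, 10000.0]
group_a_rates = [10.0 * m ** 0.75 for m in masses]  # hypothetical faster group
group_b_rates = [1.0 * m ** 0.75 for m in masses]   # hypothetical slower group

intercept_a = fixed_slope_intercept(masses, group_a_rates)
intercept_b = fixed_slope_intercept(masses, group_b_rates)

# The intercept difference on the log10 scale converts to a rate ratio.
ratio = 10.0 ** (intercept_a - intercept_b)
print("group A grows about", round(ratio, 3), "times faster at equal mass")
```

Fixing the slope is what makes the group intercepts directly comparable; with freely fitted slopes the ratio between groups would change with body mass.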
Dairy manure nutrient analysis using quick tests.
Singh, A; Bicudo, J R
2005-05-01
Rapid on-farm assessment of manure nutrient content can be achieved with the use of quick tests. These tests can be used to indirectly measure the nutrient content in animal slurries immediately before manure is applied on agricultural fields. The objective of this study was to assess the reliability of hydrometers, electrical conductivity meter and pens, and Agros N meter against standard laboratory methods. Manure samples were collected from 34 dairy farms in the Mammoth Cave area in central Kentucky. Regression equations were developed for combined and individual counties located in the area (Barren, Hart and Monroe). Our results indicated that accuracy in nutrient estimation could be improved if separate linear regressions were developed for farms with similar facilities in a county. Direct hydrometer estimates of total nitrogen were among the most accurate when separate regression equations were developed for each county (R² = 0.61, 0.93, and 0.74 for Barren, Hart and Monroe county, respectively). Reasonably accurate estimates (R² > 0.70) were also obtained for total nitrogen and total phosphorus using hydrometers, either by relating specific gravity to nutrient content or to total solids content. Estimation of ammoniacal nitrogen with Agros N meter and electrical conductivity meter/pens correlated well with standard laboratory determinations, especially while using the individual data sets from Hart County (R² = 0.70 to 0.87). This study indicates that quick test calibration equations developed for a small area or region where farms are similar in terms of manure handling and management, housing, and feed ration are more appropriate than "universal" equations usually developed with combined data sets. Accuracy is expected to improve if individual farms develop their own calibration curves. Nevertheless, we suggest confidence intervals always be specified for nutrients estimated through quick testing for any specific region, county, or farm.
Estimating effects of limiting factors with regression quantiles
Cade, B.S.; Terrell, J.W.; Schroeder, R.L.
1999-01-01
In a recent Concepts paper in Ecology, Thomson et al. emphasized that assumptions of conventional correlation and regression analyses fundamentally conflict with the ecological concept of limiting factors, and they called for new statistical procedures to address this problem. The analytical issue is that unmeasured factors may be the active limiting constraint and may induce a pattern of unequal variation in the biological response variable through an interaction with the measured factors. Consequently, changes near the maxima, rather than at the center of response distributions, are better estimates of the effects expected when the observed factor is the active limiting constraint. Regression quantiles provide estimates for linear models fit to any part of a response distribution, including near the upper bounds, and require minimal assumptions about the form of the error distribution. Regression quantiles extend the concept of one-sample quantiles to the linear model by solving an optimization problem of minimizing an asymmetric function of absolute errors. Rank-score tests for regression quantiles provide tests of hypotheses and confidence intervals for parameters in linear models with heteroscedastic errors, conditions likely to occur in models of limiting ecological relations. We used selected regression quantiles (e.g., 5th, 10th, ..., 95th) and confidence intervals to test hypotheses that parameters equal zero for estimated changes in average annual acorn biomass due to forest canopy cover of oak (Quercus spp.) and oak species diversity. Regression quantiles also were used to estimate changes in glacier lily (Erythronium grandiflorum) seedling numbers as a function of lily flower numbers, rockiness, and pocket gopher (Thomomys talpoides fossor) activity, data that motivated the query by Thomson et al. for new statistical procedures. 
Both example applications showed that effects of limiting factors estimated by changes in some upper regression quantile (e.g., 90-95th) were greater than if effects were estimated by changes in the means from standard linear model procedures. Estimating a range of regression quantiles (e.g., 5-95th) provides a comprehensive description of biological response patterns for exploratory and inferential analyses in observational studies of limiting factors, especially when sampling large spatial and temporal scales.
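The "asymmetric function of absolute errors" mentioned above can be made concrete with a small sketch. It is an illustration under stated assumptions, not the authors' procedure: the data are synthetic and wedge-shaped, the function names are hypothetical, and the fit uses a brute-force search over lines through pairs of observations (relying on the known property that a regression quantile solution for a two-parameter line can be found among lines passing through two data points) instead of the linear-programming solvers used in practice.

```python
from itertools import combinations

def check_loss(residual, tau):
    """Asymmetric absolute loss: weight tau above the line, (1 - tau) below it."""
    return tau * residual if residual >= 0 else (tau - 1) * residual

def quantile_line(xs, ys, tau):
    """Brute-force regression quantile for y = a + b*x (small data sets only)."""
    best = None
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        if x1 == x2:
            continue  # a vertical pair does not define a line y = a + b*x
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        loss = sum(check_loss(y - (intercept + slope * x), tau) for x, y in zip(xs, ys))
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    return best[1], best[2]

# Synthetic wedge: at each x the response is 0, x/2 or x, so only the upper
# bound tracks the measured factor closely (a limiting-factor pattern).
xs, ys = [], []
for x in range(1, 11):
    for fraction in (0.0, 0.5, 1.0):
        xs.append(float(x))
        ys.append(fraction * x)

intercept_90, slope_90 = quantile_line(xs, ys, tau=0.9)
intercept_50, slope_50 = quantile_line(xs, ys, tau=0.5)
```

On this synthetic wedge the 90th-percentile slope is twice the median slope, which mirrors the abstract's point that changes near the upper bound can exceed changes at the center of the response distribution.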
Pfeiffer, R M; Riedl, R
2015-08-15
We assess the asymptotic bias of estimates of exposure effects conditional on covariates when summary scores of confounders, instead of the confounders themselves, are used to analyze observational data. First, we study regression models for cohort data that are adjusted for summary scores. Second, we derive the asymptotic bias for case-control studies when cases and controls are matched on a summary score, and then analyzed either using conditional logistic regression or by unconditional logistic regression adjusted for the summary score. Two scores, the propensity score (PS) and the disease risk score (DRS), are studied in detail. For cohort analysis, when regression models are adjusted for the PS, the estimated conditional treatment effect is unbiased only for linear models, or at the null for non-linear models. Adjustment of cohort data for DRS yields unbiased estimates only for linear regression; all other estimates of exposure effects are biased. Matching cases and controls on DRS and analyzing them using conditional logistic regression yields unbiased estimates of exposure effect, whereas adjusting for the DRS in unconditional logistic regression yields biased estimates, even under the null hypothesis of no association. Matching cases and controls on the PS yields unbiased estimates only under the null for both conditional and unconditional logistic regression, adjusted for the PS. We study the bias for various confounding scenarios and compare our asymptotic results with those from simulations with limited sample sizes. To create realistic correlations among multiple confounders, we also based simulations on a real dataset. Copyright © 2015 John Wiley & Sons, Ltd.
Nagarajan, Mahesh B.; Huber, Markus B.; Schlossbauer, Thomas; Leinsinger, Gerda; Krol, Andrzej; Wismüller, Axel
2014-01-01
Objective: While dimension reduction has been previously explored in computer aided diagnosis (CADx) as an alternative to feature selection, previous implementations of its integration into CADx do not ensure strict separation between training and test data required for the machine learning task. This compromises the integrity of the independent test set, which serves as the basis for evaluating classifier performance. Methods and Materials: We propose, implement and evaluate an improved CADx methodology where strict separation is maintained. This is achieved by subjecting the training data alone to dimension reduction; the test data is subsequently processed with out-of-sample extension methods. Our approach is demonstrated in the research context of classifying small diagnostically challenging lesions annotated on dynamic breast magnetic resonance imaging (MRI) studies. The lesions were dynamically characterized through topological feature vectors derived from Minkowski functionals. These feature vectors were then subject to dimension reduction with different linear and non-linear algorithms applied in conjunction with out-of-sample extension techniques. This was followed by classification through supervised learning with support vector regression. Area under the receiver-operating characteristic curve (AUC) was evaluated as the metric of classifier performance. Results: Of the feature vectors investigated, the best performance was observed with Minkowski functional ’perimeter’ while comparable performance was observed with ’area’. Of the dimension reduction algorithms tested with ’perimeter’, the best performance was observed with Sammon’s mapping (0.84 ± 0.10) while comparable performance was achieved with exploratory observation machine (0.82 ± 0.09) and principal component analysis (0.80 ± 0.10). 
Conclusions: The results reported in this study with the proposed CADx methodology present a significant improvement over previous results reported with such small lesions on dynamic breast MRI. In particular, non-linear algorithms for dimension reduction exhibited better classification performance than linear approaches, when integrated into our CADx methodology. We also note that while dimension reduction techniques may not necessarily provide an improvement in classification performance over feature selection, they do allow for a higher degree of feature compaction. PMID:24355697
Mohd Yusof, Mohd Yusmiaidil Putera; Cauwels, Rita; Deschepper, Ellen; Martens, Luc
2015-08-01
Third molar development (TMD) has been widely utilized as one of the radiographic methods for dental age estimation. By using the same radiograph of the same individual, third molar eruption (TME) information can be incorporated into the TMD regression model. This study aims to evaluate the performance of dental age estimation in individual method models and the combined model (TMD and TME) based on the classic regressions of multiple linear and principal component analysis. A sample of 705 digital panoramic radiographs of Malay sub-adults aged between 14.1 and 23.8 years was collected. The techniques described by Gleiser and Hunt (modified by Kohler) and Olze were employed to stage the TMD and TME, respectively. The data were divided to develop three respective models based on the two regressions of multiple linear and principal component analysis. The trained models were then validated on the test sample and the accuracy of age prediction was compared between each model. The coefficient of determination (R²) and root mean square error (RMSE) were calculated. In both genders, adjusted R² increased in the linear regressions of the combined model compared with the individual models. An overall decrease in RMSE was detected in the combined model compared with TMD (0.03-0.06) and TME (0.2-0.8). In principal component regression, the combined model exhibited a low adjusted R² and a high RMSE, except in males. Dental age estimation is better predicted using the combined model in multiple linear regression. Copyright © 2015 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
Job satisfaction of primary care physicians in Switzerland: an observational study.
Goetz, Katja; Jossen, Marianne; Szecsenyi, Joachim; Rosemann, Thomas; Hahn, Karolin; Hess, Sigrid
2016-10-01
Job satisfaction of physicians is an important issue for performance of a health care system. The aim of the study was to evaluate the job satisfaction of primary care physicians in Switzerland and to explore associations between overall job satisfaction, individual characteristics and satisfaction with aspects of work within the practice separated by gender. This cross-sectional study was based on a job satisfaction survey. Data were collected from 176 primary care physicians working in 91 primary care practices. Job satisfaction was measured with the 10-item Warr-Cook-Wall job satisfaction scale. Stepwise linear regression analysis was performed for physicians separated by gender. The response rate was 92.6%. Primary care physicians reported the highest level of satisfaction with 'freedom of working method' (mean = 6.45) and the lowest satisfaction for 'hours of work' (mean = 5.38) and 'income' (mean = 5.49). Moreover, some aspects of job satisfaction were rated higher by female physicians than male physicians. Within the stepwise regression analysis, the aspect 'opportunity to use abilities' (β = 0.644) showed the highest association to overall job satisfaction for male physicians while for female physicians it was income (β = 0.733). The presented results contribute to an understanding of factors that influence levels of satisfaction of female and male physicians. Therefore, research and intervention about job satisfaction should consider gender as well as the stereotypes that come along with these social roles. © The Author 2016. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Morrell, Glen R.; Ikizler, Talat A.; Chen, Xiaorui; Heilbrun, Marta E.; Wei, Guo; Boucher, Robert; Beddhu, Srinivasan
2016-01-01
Objective: We investigate whether psoas or paraspinous muscle area measured on a single L4–5 image is a useful measure of whole lean body mass compared to dedicated mid-thigh magnetic resonance imaging (MRI). Design: Observational study. Setting: Outpatient dialysis units and a research clinic. Subjects: 105 adult participants on maintenance hemodialysis. No control group was used. Exposure variables: Psoas muscle area, paraspinous muscle area, and mid-thigh muscle area (MTMA) were measured by MRI. Main outcome measure: Lean body mass was measured by dual-energy absorptiometry (DEXA) scan. Results: In separate multivariable linear regression models, psoas, paraspinous, and mid-thigh muscle area were associated with increase in lean body mass. In separate multivariate logistic regression models, c-statistics for diagnosis of sarcopenia (defined as < 25th percentile of lean body mass) were 0.69 for paraspinous muscle area, 0.81 for psoas muscle area, and 0.89 for mid-thigh muscle area. With sarcopenia defined as < 10th percentile of lean body mass, the corresponding c-statistics were 0.71, 0.92, and 0.94. Conclusions: We conclude that psoas muscle area provides a good measure of whole body muscle mass, better than paraspinous muscle area but slightly inferior to mid-thigh measurement. Hence, in body composition studies a single axial MR image at the L4–L5 level can be used to provide information on both fat and muscle and may eliminate the need for time-consuming measurement of muscle area in the thigh. PMID:26994780
40 CFR 1066.220 - Linearity verification for chassis dynamometer systems.
Code of Federal Regulations, 2014 CFR
2014-07-01
... dynamometer speed and torque at least as frequently as indicated in Table 1 of § 1066.215. The intent of... linear regression and the linearity criteria specified in Table 1 of this section. (b) Performance requirements. If a measurement system does not meet the applicable linearity criteria in Table 1 of this...
ERIC Educational Resources Information Center
Hovardas, Tasos
2016-01-01
Although ecological systems at varying scales involve non-linear interactions, learners insist on thinking in a linear fashion when they deal with ecological phenomena. The overall objective of the present contribution was to propose a hypothetical learning progression for developing non-linear reasoning in prey-predator systems and to provide…
ERIC Educational Resources Information Center
Ker, H. W.
2014-01-01
Multilevel data are very common in educational research. Hierarchical linear models/linear mixed-effects models (HLMs/LMEs) are now often used to analyze multilevel data. This paper discusses the problems of utilizing ordinary regressions for modeling multilevel educational data, compares the data analytic results from three regression…
Artes, Paul H; Crabb, David P
2010-01-01
To investigate why the specificity of the Moorfields Regression Analysis (MRA) of the Heidelberg Retina Tomograph (HRT) varies with disc size, and to derive accurate normative limits for neuroretinal rim area to address this problem. Two datasets from healthy subjects (Manchester, UK, n = 88; Halifax, Nova Scotia, Canada, n = 75) were used to investigate the physiological relationship between the optic disc and neuroretinal rim area. Normative limits for rim area were derived by quantile regression (QR) and compared with those of the MRA (derived by linear regression). Logistic regression analyses were performed to quantify the association between disc size and positive classifications with the MRA, as well as with the QR-derived normative limits. In both datasets, the specificity of the MRA depended on optic disc size. The odds of observing a borderline or outside-normal-limits classification increased by approximately 10% for each 0.1 mm² increase in disc area (P < 0.1). The lower specificity of the MRA with large optic discs could be explained by the failure of linear regression to model the extremes of the rim area distribution (observations far from the mean). In comparison, the normative limits predicted by QR were larger for smaller discs (less specific, more sensitive), and smaller for larger discs, such that false-positive rates became independent of optic disc size. Normative limits derived by quantile regression appear to remove the size-dependence of specificity with the MRA. Because quantile regression does not rely on the restrictive assumptions of standard linear regression, it may be a more appropriate method for establishing normative limits in other clinical applications where the underlying distributions are nonnormal or have nonconstant variance.
van Vught, Anneke J A H; Heitmann, Berit L; Nieuwenhuizen, Arie G; Veldhorst, Margriet A B; Andersen, Lars Bo; Hasselstrom, Henriette; Brummer, Robert-Jan M; Westerterp-Plantenga, Margriet S
2010-05-01
Growth hormone (GH) affects linear growth and body composition, by increasing the secretion of insulin-like growth factor-I (IGF-I), muscle protein synthesis and lipolysis. The intake of protein (PROT) as well as the specific amino acids arginine (ARG) and lysine (LYS) stimulates GH/IGF-I secretion. The present paper aimed to investigate associations between PROT intake as well as intake of the specific amino acids ARG and LYS, and subsequent 3-year-change in linear growth and body composition among 6-year-old children. Children's data were collected from Copenhagen (Denmark), during 2001-2002, and again 3 years later. Boys and girls were separated into normal weight and overweight, based on BMI quintiles. Fat-free mass index (FFMI) and fat mass index (FMI) were calculated. Associations between change (Delta) in height, FMI and FFMI, respectively, and habitual PROT intake as well as ARG and LYS were analysed by multiple linear regressions, adjusted for baseline height, FMI or FFMI and energy intake, age, physical activity and socio-economic status. Eighteen schools in two suburban communities in the Copenhagen (Denmark) area participated in the study. In all, 223 children's data were collected for the present study. High ARG intake was associated with linear growth (beta = 1.09 (se 0.54), P = 0.05) among girls. Furthermore, in girls, DeltaFMI had a stronger inverse association with high ARG intake, if it was combined with high LYS intake, instead of low LYS intake (P = 0.03). No associations were found in boys. Conclusion: In prepubertal girls, linear growth may be influenced by habitual ARG intake and body fat gain may be relatively prevented over time by the intake of the amino acids ARG and LYS.
Assessment of the Uniqueness of Wind Tunnel Strain-Gage Balance Load Predictions
NASA Technical Reports Server (NTRS)
Ulbrich, N.
2016-01-01
A new test was developed to assess the uniqueness of wind tunnel strain-gage balance load predictions that are obtained from regression models of calibration data. The test helps balance users to gain confidence in load predictions of non-traditional balance designs. It also makes it possible to better evaluate load predictions of traditional balances that are not used as originally intended. The test works for both the Iterative and Non-Iterative Methods that are used in the aerospace testing community for the prediction of balance loads. It is based on the hypothesis that the total number of independently applied balance load components must always match the total number of independently measured bridge outputs or bridge output combinations. This hypothesis is supported by a control volume analysis of the inputs and outputs of a strain-gage balance. It is concluded from the control volume analysis that the loads and bridge outputs of a balance calibration data set must separately be tested for linear independence because it cannot always be guaranteed that a linearly independent load component set will result in linearly independent bridge output measurements. Simple linear math models for the loads and bridge outputs in combination with the variance inflation factor are used to test for linear independence. A highly unique and reversible mapping between the applied load component set and the measured bridge output set is guaranteed to exist if the maximum variance inflation factor of both sets is less than the literature recommended threshold of five. Data from the calibration of a six-component force balance is used to illustrate the application of the new test to real-world data.
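The variance inflation factor (VIF) check described above can be sketched in a few lines. This is an illustrative assumption of how such a screen might look, not the procedure from the report: the function names and the toy columns are hypothetical, and the VIFs are taken as the diagonal of the inverse correlation matrix, which is equivalent to computing 1 / (1 - R²) from regressing each column on the others.

```python
import math

def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long data columns."""
    n = len(columns[0])
    scaled = []
    for col in columns:
        mean = sum(col) / n
        dev = [v - mean for v in col]
        norm = math.sqrt(sum(d * d for d in dev))
        scaled.append([d / norm for d in dev])
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in scaled] for ci in scaled]

def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (small matrices only)."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def variance_inflation_factors(columns):
    """VIF of each column = corresponding diagonal entry of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]

# Hypothetical "load" columns: three mutually uncorrelated components.
independent = [[1.0, -1.0, 1.0, -1.0], [1.0, 1.0, -1.0, -1.0], [1.0, -1.0, -1.0, 1.0]]
# Hypothetical "output" columns: the third is almost the sum of the first two.
near_dependent = [[1.0, -1.0, 1.0, -1.0], [1.0, 1.0, -1.0, -1.0], [2.1, -0.1, -0.1, -1.9]]

THRESHOLD = 5.0  # threshold quoted in the abstract
independent_ok = max(variance_inflation_factors(independent)) < THRESHOLD
near_dependent_ok = max(variance_inflation_factors(near_dependent)) < THRESHOLD
```

In this toy case the first set passes the screen and the second fails it, which illustrates why the abstract asks for the loads and the bridge outputs to be tested separately.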
Spector, Logan G; Davies, Stella M; Robison, Leslie L; Hilden, Joanne M; Roesler, Michelle; Ross, Julie A
2007-01-01
Leukemias with MLL gene rearrangements predominate in infants (<1 year of age), but not in older children, and may have a distinct etiology. High birth weight, higher birth order, and prior fetal loss have, with varying consistency, been associated with infant leukemia, but no studies have reported results with respect to MLL status. Here, we report for the first time such an analysis. During 1999 to 2003, mothers of 240 incident cases (113 MLL(+), 80 MLL(-), and 47 indeterminate) and 255 random digit dialed controls completed a telephone interview. Odds ratios and 95% confidence intervals for quartile of birth weight, birth order, gestational age, maternal age at delivery, prior fetal loss, pre-pregnancy body mass index, and weight gain during pregnancy were obtained using unconditional logistic regression; P for linear trend was obtained by modeling continuous variables. There was a borderline significant linear trend of increasing birth weight with MLL(+) (P = 0.06), but not MLL(-) (P = 0.93), infant leukemia. Increasing birth order showed a significant inverse linear trend, independent of birth weight, with MLL(+) (P = 0.01), but not MLL(-) (P = 0.18), infant leukemia. Other variables of interest were not notably associated with infant leukemia regardless of MLL status. This investigation further supports the contention that molecularly defined subtypes of infant leukemia have separate etiologies.
NASA Technical Reports Server (NTRS)
MCKissick, Burnell T. (Technical Monitor); Plassman, Gerald E.; Mall, Gerald H.; Quagliano, John R.
2005-01-01
Linear multivariable regression models for predicting day and night Eddy Dissipation Rate (EDR) from available meteorological data sources are defined and validated. Model definition is based on a combination of 1997-2000 Dallas/Fort Worth (DFW) data sources, EDR from Aircraft Vortex Spacing System (AVOSS) deployment data, and regression variables primarily from corresponding Automated Surface Observation System (ASOS) data. Model validation is accomplished through EDR predictions on a similar combination of 1994-1995 Memphis (MEM) AVOSS and ASOS data. Model forms include an intercept plus a single term of fixed optimal power for each of these regression variables: 30-minute forward averaged mean and variance of near-surface wind speed and temperature, variance of wind direction, and a discrete cloud cover metric. Distinct day and night models, regressing on EDR and the natural log of EDR respectively, yield best performance and avoid model discontinuity over day/night data boundaries.
NASA Astrophysics Data System (ADS)
Chu, Hone-Jay; Kong, Shish-Jeng; Chang, Chih-Hua
2018-03-01
The turbidity (TB) of a water body varies with time and space. Water quality is traditionally estimated via linear regression based on satellite images. However, estimating and mapping water quality require a spatio-temporal nonstationary model, while TB mapping necessitates the use of geographically and temporally weighted regression (GTWR) and geographically weighted regression (GWR) models, both of which are more precise than linear regression. Given the temporal nonstationary models for mapping water quality, GTWR offers the best option for estimating regional water quality. Compared with GWR, GTWR provides highly reliable information for water quality mapping, boasts a relatively high goodness of fit, improves the explanation of variance from 44% to 87%, and shows a sufficient space-time explanatory power. The seasonal patterns of TB and the main spatial patterns of TB variability can be identified using the estimated TB maps from GTWR and by conducting an empirical orthogonal function (EOF) analysis.
NASA Astrophysics Data System (ADS)
Underwood, Kristen L.; Rizzo, Donna M.; Schroth, Andrew W.; Dewoolkar, Mandar M.
2017-12-01
Given the variable biogeochemical, physical, and hydrological processes driving fluvial sediment and nutrient export, the water science and management communities need data-driven methods to identify regions prone to production and transport under variable hydrometeorological conditions. We use Bayesian analysis to segment concentration-discharge linear regression models for total suspended solids (TSS) and particulate and dissolved phosphorus (PP, DP) using 22 years of monitoring data from 18 Lake Champlain watersheds. Bayesian inference was leveraged to estimate segmented regression model parameters and identify threshold position. The identified threshold positions demonstrated a considerable range below and above the median discharge, which has been used previously as the default breakpoint in segmented regression models to discern differences between pre- and post-threshold export regimes. We then applied a Self-Organizing Map (SOM), which partitioned the watersheds into clusters of TSS, PP, and DP export regimes using watershed characteristics, as well as Bayesian regression intercepts and slopes. A SOM defined two clusters of high-flux basins, one where PP flux was predominantly episodic and hydrologically driven; and another in which the sediment and nutrient sourcing and mobilization were more bimodal, resulting from both hydrologic processes at post-threshold discharges and reactive processes (e.g., nutrient cycling or lateral/vertical exchanges of fine sediment) at pre-threshold discharges. A separate DP SOM defined two high-flux clusters exhibiting a bimodal concentration-discharge response, but driven by differing land use. Our novel framework shows promise as a tool with broad management application that provides insights into landscape drivers of riverine solute and sediment export.
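The idea of locating a threshold in a concentration-discharge relation, instead of fixing it at the median discharge, can be illustrated with a simple sketch. Note the substitution: the abstract uses Bayesian inference, whereas this example uses a plain least-squares grid search over candidate split points and fits two unconnected lines. The function names and the synthetic log-scale data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, sum of squared errors)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, sse

def best_breakpoint(xs, ys, min_points=3):
    """Try every split of the sorted data and keep the one with the lowest total SSE."""
    pairs = sorted(zip(xs, ys))
    best = None
    for k in range(min_points, len(pairs) - min_points + 1):
        left_x, left_y = zip(*pairs[:k])
        right_x, right_y = zip(*pairs[k:])
        _, slope_low, sse_low = fit_line(left_x, left_y)
        _, slope_high, sse_high = fit_line(right_x, right_y)
        total = sse_low + sse_high
        if best is None or total < best[0]:
            best = (total, pairs[k][0], slope_low, slope_high)
    return best[1], best[2], best[3]

# Synthetic log-discharge vs log-concentration data with a regime change at q = 5.
log_q = [float(q) for q in range(10)]
log_c = [0.2 * q if q < 5 else 1.5 + 1.5 * (q - 5) for q in log_q]

threshold, slope_pre, slope_post = best_breakpoint(log_q, log_c)
```

Here the search recovers the split at q = 5 with a shallow pre-threshold slope and a steep post-threshold slope; a Bayesian treatment would additionally give uncertainty on the threshold position.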
Mental chronometry with simple linear regression.
Chen, J Y
1997-10-01
Typically, mental chronometry is performed by means of introducing an independent variable postulated to affect selectively some stage of a presumed multistage process. However, the effect could be a global one that spreads proportionally over all stages of the process. Currently, there is no method to test this possibility although simple linear regression might serve the purpose. In the present study, the regression approach was tested with tasks (memory scanning and mental rotation) that involved a selective effect and with a task (word superiority effect) that involved a global effect, by the dominant theories. The results indicate (1) the manipulation of the size of a memory set or of angular disparity affects the intercept of the regression function that relates the times for memory scanning with different set sizes or for mental rotation with different angular disparities and (2) the manipulation of context affects the slope of the regression function that relates the times for detecting a target character under word and nonword conditions. These ratify the regression approach as a useful method for doing mental chronometry.
El Yazbi, Fawzy A.; Hassan, Ekram M.; Khamis, Essam F.; Ragab, Marwa A.A.; Hamdy, Mohamed M.A.
2016-01-01
A validated and highly selective high-performance thin-layer chromatography (HPTLC) method was developed for the determination of ketorolac tromethamine (KTC) with phenylephrine hydrochloride (PHE) (Mixture 1) and with febuxostat (FBX) (Mixture 2) in bulk drug and in combined dosage forms. The proposed method was based on HPTLC separation of the drugs followed by densitometric measurements of their spots at 273 and 320 nm for Mixtures 1 and 2, respectively. The separation was carried out on Merck HPTLC aluminum sheets of silica gel 60 F254 using chloroform–methanol–ammonia (7:3:0.1, v/v) and (7.5:2.5:0.1, v/v) as mobile phase for KTC/PHE and KTC/FBX mixtures, respectively. Linear regression lines were obtained over the concentration ranges 0.20–0.60 and 0.60–1.95 µg band⁻¹ for KTC and PHE (Mixture 1), respectively, and 0.10–1.00 and 0.25–2.50 µg band⁻¹ for KTC and FBX (Mixture 2), respectively, with correlation coefficients higher than 0.999. The method was successfully applied to the analysis of the two drugs in their synthetic mixtures and in their dosage forms. The mean percentage recoveries were in the range of 98–102%, and the RSD did not exceed 2%. The method was validated according to ICH guidelines and showed good performances in terms of linearity, sensitivity, precision, accuracy and stability. PMID:26847918
Guan, Yongtao; Li, Yehua; Sinha, Rajita
2011-01-01
In a cocaine dependence treatment study, we use linear and nonlinear regression models to model posttreatment cocaine craving scores and first cocaine relapse time. A subset of the covariates are summary statistics derived from baseline daily cocaine use trajectories, such as baseline cocaine use frequency and average daily use amount. These summary statistics are subject to estimation error and can therefore cause biased estimators for the regression coefficients. Unlike classical measurement error problems, the error we encounter here is heteroscedastic with an unknown distribution, and there are no replicates for the error-prone variables or instrumental variables. We propose two robust methods to correct for the bias: a computationally efficient method-of-moments-based method for linear regression models and a subsampling extrapolation method that is generally applicable to both linear and nonlinear regression models. Simulations and an application to the cocaine dependence treatment data are used to illustrate the efficacy of the proposed methods. Asymptotic theory and variance estimation for the proposed subsampling extrapolation method and some additional simulation results are described in the online supplementary material. PMID:21984854
Temporal discrimination threshold with healthy aging.
Ramos, Vesper Fe Marie Llaneza; Esquenazi, Alina; Villegas, Monica Anne Faye; Wu, Tianxia; Hallett, Mark
2016-07-01
The temporal discrimination threshold (TDT) is the shortest interstimulus interval at which a subject can perceive successive stimuli as separate. To investigate the effects of aging on TDT, we studied tactile TDT using the method of limits with 120% of sensory threshold in each hand for each of 100 healthy volunteers, equally divided among men and women, across 10 age groups, from 18 to 79 years. Linear regression analysis showed that age was significantly related to left-hand mean, right-hand mean, and mean of 2 hands with R-square equal to 0.08, 0.164, and 0.132, respectively. Reliability analysis indicated that the 3 measures had fair-to-good reliability (intraclass correlation coefficient: 0.4-0.8). We conclude that TDT is affected by age and has fair-to-good reproducibility using our technique. Published by Elsevier Inc.
Kim, Dae-Hee; Choi, Jae-Hun; Lim, Myung-Eun; Park, Soo-Jun
2008-01-01
This paper suggests a method of correcting the distance between an ambient intelligence display and a user based on linear regression and smoothing, by which distance information of a user who approaches the display can be accurately output even in an unanticipated condition using a passive infrared (PIR) sensor and an ultrasonic device. The developed system consists of an ambient intelligence display, an ultrasonic transmitter, and a sensor gateway. The modules communicate with each other through RF (radio frequency) communication. The ambient intelligence display includes an ultrasonic receiver and a PIR sensor for motion detection. In particular, this system dynamically selects and applies algorithms such as smoothing or linear regression to the current input data through a judgment process that uses the previous reliable data stored in a queue. In addition, we implemented GUI software in Java for real-time location tracking and an ambient intelligence display.
How is the weather? Forecasting inpatient glycemic control
Saulnier, George E; Castro, Janna C; Cook, Curtiss B; Thompson, Bithika M
2017-01-01
Aim: Apply methods of damped trend analysis to forecast inpatient glycemic control. Method: Observed and calculated point-of-care blood glucose data trends were determined over 62 weeks. Mean absolute percent error was used to calculate differences between observed and forecasted values. Comparisons were drawn between model results and linear regression forecasting. Results: The forecasted mean glucose trends observed during the first 24 and 48 weeks of projections compared favorably to the results provided by linear regression forecasting. However, in some scenarios, the damped trend method changed inferences compared with linear regression. In all scenarios, mean absolute percent error values remained below the 10% accepted by demand industries. Conclusion: Results indicate that forecasting methods historically applied within demand industries can project future inpatient glycemic control. Additional study is needed to determine if forecasting is useful in the analyses of other glucometric parameters and, if so, how to apply the techniques to quality improvement. PMID:29134125
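The error measure named in this abstract, mean absolute percent error (MAPE), can be sketched as follows. This is a minimal illustration and not the authors' code; the function name and the weekly glucose values are hypothetical and are not data from the study.

```python
def mean_absolute_percent_error(observed, forecast):
    """Return MAPE in percent. Observed values must be nonzero."""
    if len(observed) != len(forecast) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((o - f) / o) for o, f in zip(observed, forecast))
    return 100.0 * total / len(observed)


# Hypothetical weekly mean glucose values (mg/dL), for illustration only.
observed = [160.0, 158.0, 162.0, 155.0]
forecast = [152.0, 158.0, 170.1, 155.0]

mape = mean_absolute_percent_error(observed, forecast)
print(f"MAPE = {mape:.2f}%")
```

A forecast would meet the threshold cited in the abstract when the returned value stays below 10.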
Lee, Eunjee; Zhu, Hongtu; Kong, Dehan; Wang, Yalin; Giovanello, Kelly Sullivan; Ibrahim, Joseph G
2015-01-01
The aim of this paper is to develop a Bayesian functional linear Cox regression model (BFLCRM) with both functional and scalar covariates. This new development is motivated by establishing the likelihood of conversion to Alzheimer’s disease (AD) in 346 patients with mild cognitive impairment (MCI) enrolled in the Alzheimer’s Disease Neuroimaging Initiative 1 (ADNI-1) and the early markers of conversion. These 346 MCI patients were followed over 48 months, with 161 MCI participants progressing to AD at 48 months. The functional linear Cox regression model was used to establish that functional covariates including hippocampus surface morphology and scalar covariates including brain MRI volumes, cognitive performance (ADAS-Cog), and APOE status can accurately predict time to onset of AD. Posterior computation proceeds via an efficient Markov chain Monte Carlo algorithm. A simulation study is performed to evaluate the finite sample performance of BFLCRM. PMID:26900412
Liquid electrolyte informatics using an exhaustive search with linear regression.
Sodeyama, Keitaro; Igarashi, Yasuhiko; Nakayama, Tomofumi; Tateyama, Yoshitaka; Okada, Masato
2018-06-14
Exploring new liquid electrolyte materials is a fundamental target for developing new high-performance lithium-ion batteries. In contrast to solid materials, disordered liquid solution properties have been less studied by data-driven information techniques. Here, we examined the estimation accuracy and efficiency of three information techniques, multiple linear regression (MLR), least absolute shrinkage and selection operator (LASSO), and exhaustive search with linear regression (ES-LiR), by using coordination energy and melting point as test liquid properties. We then confirmed that ES-LiR gives the most accurate estimation among the techniques. We also found that ES-LiR can provide the relationship between the "prediction accuracy" and "calculation cost" of the properties via a weight diagram of descriptors. This technique makes it possible to choose the balance between "accuracy" and "cost" when searching a huge number of new materials.
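The idea behind an exhaustive search with linear regression can be sketched as fitting ordinary least squares on every non-empty subset of descriptors and ranking the subsets by a validation error. The sketch below assumes leave-one-out cross-validation as the ranking criterion and uses synthetic descriptors; the abstract does not state the authors' exact criterion, and all names here are hypothetical.

```python
from itertools import combinations

import numpy as np


def loo_cv_error(X, y):
    """Leave-one-out mean squared error of OLS with an intercept."""
    n = len(y)
    errors = []
    for i in range(n):
        keep = np.arange(n) != i
        A = np.column_stack([np.ones(keep.sum()), X[keep]])
        coef, *_ = np.linalg.lstsq(A, y[keep], rcond=None)
        pred = np.concatenate([[1.0], X[i]]) @ coef
        errors.append((y[i] - pred) ** 2)
    return float(np.mean(errors))


def exhaustive_search(X, y):
    """Score every non-empty descriptor subset; return (error, subset) pairs, best first."""
    n_desc = X.shape[1]
    results = []
    for k in range(1, n_desc + 1):
        for subset in combinations(range(n_desc), k):
            results.append((loo_cv_error(X[:, list(subset)], y), subset))
    return sorted(results)


# Synthetic data (assumption): only descriptors 0 and 2 drive the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
y = 2.0 * X[:, 0] - 3.0 * X[:, 2] + 0.01 * rng.normal(size=30)

ranking = exhaustive_search(X, y)
best_error, best_subset = ranking[0]
print(best_subset, best_error)
```

The number of models grows as 2^p - 1 for p descriptors, which is why the trade-off between accuracy and calculation cost mentioned in the abstract matters.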
Huang, Jian; Zhang, Cun-Hui
2013-01-01
The ℓ1-penalized method, or the Lasso, has emerged as an important tool for the analysis of large data sets. Many important results have been obtained for the Lasso in linear regression which have led to a deeper understanding of high-dimensional statistical problems. In this article, we consider a class of weighted ℓ1-penalized estimators for convex loss functions of a general form, including the generalized linear models. We study the estimation, prediction, selection and sparsity properties of the weighted ℓ1-penalized estimator in sparse, high-dimensional settings where the number of predictors p can be much larger than the sample size n. Adaptive Lasso is considered as a special case. A multistage method is developed to approximate concave regularized estimation by applying an adaptive Lasso recursively. We provide prediction and estimation oracle inequalities for single- and multi-stage estimators, a general selection consistency theorem, and an upper bound for the dimension of the Lasso estimator. Important models including the linear regression, logistic regression and log-linear models are used throughout to illustrate the applications of the general results. PMID:24348100
STRONG ORACLE OPTIMALITY OF FOLDED CONCAVE PENALIZED ESTIMATION.
Fan, Jianqing; Xue, Lingzhou; Zou, Hui
2014-06-01
Folded concave penalization methods have been shown to enjoy the strong oracle property for high-dimensional sparse estimation. However, a folded concave penalization problem usually has multiple local solutions and the oracle property is established only for one of the unknown local solutions. A challenging fundamental issue still remains that it is not clear whether the local optimum computed by a given optimization algorithm possesses those nice theoretical properties. To close this important theoretical gap in over a decade, we provide a unified theory to show explicitly how to obtain the oracle solution via the local linear approximation algorithm. For a folded concave penalized estimation problem, we show that as long as the problem is localizable and the oracle estimator is well behaved, we can obtain the oracle estimator by using the one-step local linear approximation. In addition, once the oracle estimator is obtained, the local linear approximation algorithm converges, namely it produces the same estimator in the next iteration. The general theory is demonstrated by using four classical sparse estimation problems, i.e., sparse linear regression, sparse logistic regression, sparse precision matrix estimation and sparse quantile regression.
STRONG ORACLE OPTIMALITY OF FOLDED CONCAVE PENALIZED ESTIMATION
Fan, Jianqing; Xue, Lingzhou; Zou, Hui
2014-01-01
Folded concave penalization methods have been shown to enjoy the strong oracle property for high-dimensional sparse estimation. However, a folded concave penalization problem usually has multiple local solutions and the oracle property is established only for one of the unknown local solutions. A challenging fundamental issue still remains that it is not clear whether the local optimum computed by a given optimization algorithm possesses those nice theoretical properties. To close this important theoretical gap in over a decade, we provide a unified theory to show explicitly how to obtain the oracle solution via the local linear approximation algorithm. For a folded concave penalized estimation problem, we show that as long as the problem is localizable and the oracle estimator is well behaved, we can obtain the oracle estimator by using the one-step local linear approximation. In addition, once the oracle estimator is obtained, the local linear approximation algorithm converges, namely it produces the same estimator in the next iteration. The general theory is demonstrated by using four classical sparse estimation problems, i.e., sparse linear regression, sparse logistic regression, sparse precision matrix estimation and sparse quantile regression. PMID:25598560
Dehghani, Mahdieh; Shadkam, Elaheh; Ahrari, Farzaneh; Dehghani, Mahboobe
2018-04-01
Age estimation in adults is an important issue in forensic science. This study aimed to estimate the chronological age of Iranians by means of pulp/tooth area ratio (AR) of canines in digital panoramic radiographs. The sample consisted of panoramic radiographs of 271 male and female subjects aged 16-64 years. The pulp/tooth area ratio (AR) of upper and lower canines was calculated by AutoCAD software. Data were subjected to correlation and regression analysis. There was a significant and inverse correlation between age and pulp/tooth area ratio of upper and lower canines (r=-0.794 for upper canine and r=-0.282 for lower canine; p-value<0.001). Linear regression equations were derived separately for upper, lower and both canines. The mean difference between actual and estimated age using upper canine was 6.07±1.7. The results showed that the pulp/tooth area ratios of canines are a reliable method for age estimation in Iranians. The pulp/tooth area ratio of upper canine was better correlated with chronological age than that of lower canine. Copyright © 2018 Elsevier B.V. All rights reserved.
Volberg, Rachel A; McNamara, Lauren M; Carris, Kari L
2018-06-01
While population surveys have been carried out in numerous jurisdictions internationally, little has been done to assess the relative strength of different risk factors that may contribute to the development of problem gambling. This is an important preparatory step for future research on the etiology of problem gambling. Using data from the 2006 California Problem Gambling Prevalence Survey, a telephone survey of adult California residents that used the NODS to assess respondents for gambling problems, binary logistic regression analysis was used to identify demographic characteristics, health-related behaviors, and gambling participation variables that statistically predicted the odds of being a problem or pathological gambler. In a separate approach, linear regression analysis was used to assess the impact of changes in these variables on the severity of the disorder. In both of the final models, the greatest statistical predictor of problem gambling status was past year Internet gambling. Furthermore, the unique finding of a significant interaction between physical or mental disability, Internet gambling, and problem gambling highlights the importance of exploring the interactions between different forms of gambling, the experience of mental and physical health issues, and the development of problem gambling using a longitudinal lens.
Bolduc, F.; Afton, A.D.
2008-01-01
Wetland use by waterbirds is highly dependent on water depth, and depth requirements generally vary among species. Furthermore, water depth within wetlands often varies greatly over time due to unpredictable hydrological events, making comparisons of waterbird abundance among wetlands difficult as effects of habitat variables and water depth are confounded. Species-specific relationships between bird abundance and water depth necessarily are non-linear; thus, we developed a methodology to correct waterbird abundance for variation in water depth, based on the non-parametric regression of these two variables. Accordingly, we used the difference between observed and predicted abundances from non-parametric regression (analogous to parametric residuals) as an estimate of bird abundance at equivalent water depths. We scaled this difference to levels of observed and predicted abundances using the formula: ((observed - predicted abundance)/(observed + predicted abundance)) × 100. This estimate also corresponds to the observed:predicted abundance ratio, which allows easy interpretation of results. We illustrated this methodology using two hypothetical species that differed in water depth and wetland preferences. Comparisons of wetlands, using both observed and relative corrected abundances, indicated that relative corrected abundance adequately separates the effect of water depth from the effect of wetlands. © 2008 Elsevier B.V.
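The scaling formula in this abstract can be written out directly. In the sketch below the function name and the counts are hypothetical, and returning 0 when both abundances are zero is an assumption, since the abstract does not say how that case is handled. With r = observed/predicted, the index equals 100 × (r - 1)/(r + 1), which is the link to the observed:predicted ratio that the abstract mentions.

```python
def relative_corrected_abundance(observed, predicted):
    """((observed - predicted) / (observed + predicted)) * 100, bounded by -100 and 100."""
    total = observed + predicted
    if total == 0:
        # Assumption: no birds observed and none predicted is treated as no deviation.
        return 0.0
    return 100.0 * (observed - predicted) / total


# Hypothetical counts: predicted values would come from a non-parametric
# regression of abundance on water depth.
above = relative_corrected_abundance(30, 10)   # more birds than depth alone predicts
below = relative_corrected_abundance(10, 30)   # fewer birds than depth alone predicts
print(above, below)
```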
NASA Astrophysics Data System (ADS)
de Souza Pereira, Francisca Rocha; Kampel, Milton; Cunha-Lignon, Marilia
2016-07-01
The potential use of phased array type L-band synthetic aperture radar (PALSAR) data for discriminating distinct physiographic mangrove types with different forest structure developments in a subtropical mangrove forest located in Cananéia on the Southern coast of São Paulo, Brazil, is investigated. The basin and fringe physiographic types and the structural development of mangrove vegetation were identified with the application of the Kruskal-Wallis statistical test to the SAR backscatter values of 10 incoherent attributes. The best results for separating basin from fringe types were obtained using copolarized HH, cross-polarized HV, and the biomass index (BMI). Mangrove structural parameters were also estimated using multiple linear regressions. BMI and canopy structure index were used as explanatory variables for canopy height, mean height, and mean diameter at breast height regression models, with significant R² = 0.69, 0.73, and 0.67, respectively. The current study indicates that SAR L-band images can be used as a tool to discriminate physiographic types and to characterize mangrove forests. The results are relevant considering the growing availability of freely distributed SAR images that can be used more widely for analysis, monitoring, and conservation of the mangrove ecosystem.
NASA Astrophysics Data System (ADS)
Haris, A.; Nafian, M.; Riyanto, A.
2017-07-01
The Danish North Sea fields consist of several formations (Ekofisk, Tor, and Cromer Knoll) that range in age from the Paleocene to the Miocene. In this study, seismic and well log data sets are integrated to determine the chalk sand distribution in a Danish North Sea field. The integration is performed using seismic inversion analysis and seismic multi-attribute analysis. The seismic inversion algorithm used to derive acoustic impedance (AI) is a model-based technique. The derived AI is then used as an external attribute for the input of the multi-attribute analysis. Moreover, the multi-attribute analysis is used to generate linear and non-linear transformations among well log properties. For the linear model, the transformation is selected by weighting with step-wise linear regression (SWR), while the non-linear model uses probabilistic neural networks (PNN). The porosity estimated by PNN fits the well log data better than the SWR results. This result is understandable because PNN performs non-linear regression, so the relationship between the attribute data and the predicted log data can be optimized. The distribution of chalk sand has been successfully identified and characterized by porosity values ranging from 23% up to 30%.
Non-Asymptotic Oracle Inequalities for the High-Dimensional Cox Regression via Lasso.
Kong, Shengchun; Nan, Bin
2014-01-01
We consider finite sample properties of the regularized high-dimensional Cox regression via lasso. Existing literature focuses on linear models or generalized linear models with Lipschitz loss functions, where the empirical risk functions are the summations of independent and identically distributed (iid) losses. The summands in the negative log partial likelihood function for censored survival data, however, are neither iid nor Lipschitz. We first approximate the negative log partial likelihood function by a sum of iid non-Lipschitz terms, then derive the non-asymptotic oracle inequalities for the lasso penalized Cox regression using pointwise arguments to tackle the difficulties caused by lacking iid Lipschitz losses.
Non-Asymptotic Oracle Inequalities for the High-Dimensional Cox Regression via Lasso
Kong, Shengchun; Nan, Bin
2013-01-01
We consider finite sample properties of the regularized high-dimensional Cox regression via lasso. Existing literature focuses on linear models or generalized linear models with Lipschitz loss functions, where the empirical risk functions are the summations of independent and identically distributed (iid) losses. The summands in the negative log partial likelihood function for censored survival data, however, are neither iid nor Lipschitz. We first approximate the negative log partial likelihood function by a sum of iid non-Lipschitz terms, then derive the non-asymptotic oracle inequalities for the lasso penalized Cox regression using pointwise arguments to tackle the difficulties caused by lacking iid Lipschitz losses. PMID:24516328
Functional Relationships and Regression Analysis.
ERIC Educational Resources Information Center
Preece, Peter F. W.
1978-01-01
Using a degenerate multivariate normal model for the distribution of organismic variables, the form of least-squares regression analysis required to estimate a linear functional relationship between variables is derived. It is suggested that the two conventional regression lines may be considered to describe functional, not merely statistical,…
Isolating and Examining Sources of Suppression and Multicollinearity in Multiple Linear Regression
ERIC Educational Resources Information Center
Beckstead, Jason W.
2012-01-01
The presence of suppression (and multicollinearity) in multiple regression analysis complicates interpretation of predictor-criterion relationships. The mathematical conditions that produce suppression in regression analysis have received considerable attention in the methodological literature but until now nothing in the way of an analytic…
Suppression Situations in Multiple Linear Regression
ERIC Educational Resources Information Center
Shieh, Gwowen
2006-01-01
This article proposes alternative expressions for the two most prevailing definitions of suppression without resorting to the standardized regression modeling. The formulation provides a simple basis for the examination of their relationship. For the two-predictor regression, the author demonstrates that the previous results in the literature are…
Wang, Kan; Gan, Xuhua; Tang, Xinyun; Wang, Shuo; Tan, Huarong
2010-02-01
Kombucha is a health tonic. D-saccharic acid-1,4-lactone (DSL), a component of kombucha, inhibits the activity of glucuronidase, an enzyme indirectly related to cancers. To date, there is no efficient method to determine the content of DSL in kombucha samples. In this paper, we report a rapid and simple method for the separation and determination of DSL in kombucha samples, using the high-performance capillary electrophoresis (HPCE) method with diode array detection (DAD). With optimized conditions, DSL can be separated in a 50 cm length capillary at a separation voltage of 20 kV in 40 mmol/L borax buffer (pH 6.5) containing 30 mmol/L SDS and 15% methanol (v/v). DSL was quantified by ultraviolet absorption at λ = 190 nm. The relationship between the peak areas and the DSL concentrations, in a specified working range with linear response, was determined by first-order polynomial regression over the range 50-1500 µg/mL with a detection limit of 17.5 µg/mL. Our method demonstrated excellent reproducibility and accuracy with relative standard deviations (RSD) of less than 5% for DSL content (n=5). This is the first report to determine DSL by HPCE. We have successfully applied this method to determine DSL in kombucha samples in various fermented conditions. © 2009 Elsevier B.V. All rights reserved.
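A first-order polynomial calibration of the kind described here fits peak area = slope × concentration + intercept to standards and then inverts the line for an unknown sample. The sketch below uses invented standards spanning the stated 50-1500 µg/mL range; the peak areas and function names are hypothetical and are not values from the paper.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_area(area, slope, intercept):
    """Invert the calibration line to estimate concentration from peak area."""
    return (area - intercept) / slope


# Hypothetical standards: concentration in µg/mL, peak area in arbitrary units.
conc = [50.0, 250.0, 500.0, 1000.0, 1500.0]
area = [110.0, 510.0, 1010.0, 2010.0, 3010.0]

slope, intercept = fit_line(conc, area)
unknown = concentration_from_area(810.0, slope, intercept)
print(slope, intercept, unknown)
```

Estimates are only meaningful for peak areas that fall inside the calibrated working range.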
Genetic evaluation of aspects of temperament in Nellore-Angus calves.
Riley, D G; Gill, C A; Herring, A D; Riggs, P K; Sawyer, J E; Lunt, D K; Sanders, J O
2014-08-01
The objective of this work was to estimate heritability of each of 5 subjectively measured aspects of temperament of cattle and the genetic correlations of pairs of those traits. From 2003 to 2013, Nellore-Angus F2 and F3 calves (n = 1,816) were evaluated for aspects of temperament at an average 259 d of age, which was approximately 2 mo after weaning. Calves were separated from a group and subjectively scored from 1 (calm, good temperament) to 9 (wild, poor temperament) for aggressiveness (willingness to hit an evaluator), nervousness, flightiness, gregariousness (willingness to separate from the group), and a distinct overall score by 4 evaluators. Data were analyzed using threshold and linear models with additive genetic random effects. Two-trait animal models (nonthreshold) included the additive genetic covariance for pairs of traits and were used to estimate additive genetic correlations. Contemporary groups (n = 104) represented calves penned together for evaluation on given evaluation days. Heifers had greater (worse) means for all traits than steers (P < 0.05). The regression of score on age in days was included in final models for flightiness (P = 0.05; -0.006 ± 0.003) and gregariousness (P = 0.025; -0.007 ± 0.003). Estimates of heritability were large (0.51, 0.4, 0.45, 0.49, and 0.47 for aggressiveness, nervousness, flightiness, gregariousness, and overall temperament, respectively; SE = 0.07 for each). The ability to use this methodology to distinctly separate different aspects of calf temperament appeared to be limited, as estimates of additive genetic correlations were near unity for all pairs of traits; estimates of phenotypic correlation ranged from 0.88 ± 0.01 to 0.99 ± 0.002 for pairs of traits. 
Distinct subsequent analyses indicated a significant negative relationship of 4 of the various temperament scores with weight at weaning (regression coefficients ranged from -0.008 ± 0.002 for nervousness, flightiness, and gregariousness to -0.003 ± 0.002 for aggressiveness). In subsequent analyses, the regression of temperament trait on sequence of evaluation within a pen was highly significant and solutions ranged from 0.05 ± 0.007 for aggressiveness to 0.08 ± 0.007 for all other traits. The apparent large additive genetic variance for any one of these traits may be useful in identification of genes responsible for differences in cattle temperament.
Yokoi, Masayuki; Tashiro, Takao
2014-01-01
We studied how the separation of dispensing and prescribing of medicines between pharmacies and clinics (the “separation system”) can reduce internal medicine costs. To do so, we obtained publicly available data by searching electronic databases and official web pages of the Japanese government and non-profit public service corporations on the Internet. For Japanese medical institutions, participation in the separation system is optional. Consequently, the expansion rate of the separation system for each of the administrative districts is highly variable. The data were subjected to multiple regression analysis; daily internal medicines were the objective variable and expansion rate of the separation system was the explanatory variable. A multiple regression analysis revealed that the expansion rate of the separation system and the rate of replacing brand name medicine with generic medicine showed a significant negative partial correlation with daily internal medicine costs. Thus, the separation system was as effective in reducing medicine costs as the use of generic medicines. Because of its medical economic efficiency, the separation system should be expanded, especially in Asian countries in which the system is underdeveloped. PMID:24999122
Yokoi, Masayuki; Tashiro, Takao
2014-04-07
We studied how the separation of dispensing and prescribing of medicines between pharmacies and clinics (the "separation system") can reduce internal medicine costs. To do so, we obtained publicly available data by searching electronic databases and official web pages of the Japanese government and non-profit public service corporations on the Internet. For Japanese medical institutions, participation in the separation system is optional. Consequently, the expansion rate of the separation system for each of the administrative districts is highly variable. The data were subjected to multiple regression analysis; daily internal medicines were the objective variable and expansion rate of the separation system was the explanatory variable. A multiple regression analysis revealed that the expansion rate of the separation system and the rate of replacing brand name medicine with generic medicine showed a significant negative partial correlation with daily internal medicine costs. Thus, the separation system was as effective in reducing medicine costs as the use of generic medicines. Because of its medical economic efficiency, the separation system should be expanded, especially in Asian countries in which the system is underdeveloped.
Marrero-Ponce, Yovani; Medina-Marrero, Ricardo; Castillo-Garit, Juan A; Romero-Zaldivar, Vicente; Torrens, Francisco; Castro, Eduardo A
2005-04-15
A novel approach to bio-macromolecular design from a linear algebra point of view is introduced. A protein's total (whole protein) and local (one or more amino acid) linear indices are a new set of bio-macromolecular descriptors of relevance to protein QSAR/QSPR studies. These amino-acid level biochemical descriptors are based on the calculation of linear maps on $\mathbb{R}^n$ [$f_k(x_{mi}): \mathbb{R}^n \to \mathbb{R}^n$] in the canonical basis. These bio-macromolecular indices are calculated from the kth power of the macromolecular pseudograph alpha-carbon atom adjacency matrix. Total linear indices are linear functionals on $\mathbb{R}^n$. That is, the kth total linear indices are linear maps from $\mathbb{R}^n$ to the scalar field $\mathbb{R}$ [$f_k(x_m): \mathbb{R}^n \to \mathbb{R}$]. Thus, the kth total linear indices are calculated by summing the amino-acid linear indices of all amino acids in the protein molecule. A study of the protein stability effects for a complete set of alanine substitutions in the Arc repressor illustrates this approach. A quantitative model that discriminates near wild-type stability alanine mutants from the reduced-stability ones in a training series was obtained. This model permitted the correct classification of 97.56% (40/41) and 91.67% (11/12) of proteins in the training and test set, respectively. It shows a high Matthews correlation coefficient (MCC=0.952) for the training set and an MCC=0.837 for the external prediction set. Additionally, canonical regression analysis corroborated the statistical quality of the classification model ($R_{canc}$=0.824). This analysis was also used to compute biological stability canonical scores for each Arc alanine mutant. On the other hand, the linear piecewise regression model compared favorably with respect to the linear regression one on predicting the melting temperature (tm) of the Arc alanine mutants. The linear model explains almost 81% of the variance of the experimental tm (R=0.90 and s=4.29) and the LOO press statistics evidenced its predictive ability ($q^2$=0.72 and $s_{cv}$=4.79). 
Moreover, the TOMOCOMD-CAMPS method produced a linear piecewise regression (R=0.97) between protein backbone descriptors and tm values for alanine mutants of the Arc repressor. A break-point value of 51.87 degrees C characterized two mutant clusters and coincided perfectly with the experimental scale. For this reason, we can use the linear discriminant analysis and piecewise models in combination to classify and predict the stability of the mutant Arc homodimers. These models also permitted the interpretation of the driving forces of such folding process, indicating that topologic/topographic protein backbone interactions control the stability profile of wild-type Arc and its alanine mutants.
HYDRORECESSION: A toolbox for streamflow recession analysis
NASA Astrophysics Data System (ADS)
Arciniega, S.
2015-12-01
Streamflow recession curves are hydrological signatures that allow the study of the relationship between groundwater storage and baseflow and/or low flows at the catchment scale. Recent studies have shown that streamflow recession analysis can be quite sensitive to the combination of different models, extraction techniques and parameter estimation methods. In order to better characterize streamflow recession curves, new methodologies combining multiple approaches have been recommended. The HYDRORECESSION toolbox, presented here, is a Matlab graphical user interface developed to analyse streamflow recession time series with the support of different tools that parameterize linear and nonlinear storage-outflow relationships through four of the most useful recession models (Maillet, Boussinesq, Coutagne and Wittenberg). The toolbox includes four parameter-fitting techniques (linear regression, lower envelope, data binning and mean squared error) and three different methods to extract hydrograph recession segments (Vogel, Brutsaert and Aksoy). In addition, the toolbox has a module that separates the baseflow component from the observed hydrograph using the inverse reservoir algorithm. Potential applications provided by HYDRORECESSION include model parameter analysis, hydrological regionalization and classification, baseflow index estimates, catchment-scale recharge and low-flows modelling, among others. HYDRORECESSION is freely available for non-commercial and academic purposes.
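As an illustration of fitting a recession model by linear regression, the sketch below fits the exponential (linear-reservoir) Maillet form Q(t) = Q0 · exp(-alpha · t) by regressing ln Q on t. It is written in Python rather than Matlab and is not the toolbox's own code; the function name and the synthetic recession segment are assumptions.

```python
import math


def fit_maillet(times, flows):
    """Fit Q(t) = q0 * exp(-alpha * t) by least squares on ln(Q) versus t."""
    logs = [math.log(q) for q in flows]      # flows must be strictly positive
    n = len(times)
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    stt = sum((t - mean_t) ** 2 for t in times)
    stl = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs))
    slope = stl / stt
    q0 = math.exp(mean_l - slope * mean_t)
    return q0, -slope                        # alpha is the negated slope


# Synthetic 10-day recession segment (assumption): Q0 = 10 m3/s, alpha = 0.05 per day.
times = list(range(10))
flows = [10.0 * math.exp(-0.05 * t) for t in times]

q0, alpha = fit_maillet(times, flows)
print(q0, alpha)
```

The nonlinear models listed in the abstract would need a different transformation or a direct nonlinear fit, which this sketch does not cover.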
Faraji, Mohammad; Noorbakhsh, Roya; Shafieyan, Hooshang; Ramezani, Mohammadkazem
2018-02-01
A QuEChERS-based methodology was developed for the simultaneous identification and quantification of acetamiprid, imidacloprid, and spirotetramat and their relevant metabolites in pistachio by liquid chromatography-tandem mass spectrometry for the first time. First, sample extraction was done with MeCN:citrate buffer:NaHCO3 followed by phase separation with the addition of MgSO4:NaCl. The supernatant was then cleaned by a primary-secondary amine (PSA), GCB, and MgSO4. The proposed method provides linearity in the range of 5-200 µg L⁻¹, and the linear regression coefficients were higher than 0.99. LOD and LOQ were 2 and 5 µg kg⁻¹ for the studied insecticides, respectively, with the exception of imidacloprid-olefin (5 and 10 µg kg⁻¹). Acceptable recoveries (91-110%) were obtained for all the analytes with good intra- and inter-precisions (0.4 ≤ RSD ≤ 11.0). The method was then used for the pistachio samples collected from a field trial to estimate the maximum residue limits (MRLs) in the next step. Copyright © 2017 Elsevier Ltd. All rights reserved.
Principles of proportional recovery after stroke generalize to neglect and aphasia.
Marchi, N A; Ptak, R; Di Pietro, M; Schnider, A; Guggisberg, A G
2017-08-01
Motor recovery after stroke can be characterized into two different patterns. A majority of patients recover about 70% of initial impairment, whereas some patients with severe initial deficits show little or no improvement. Here, we investigated whether recovery from visuospatial neglect and aphasia is also separated into two different groups and whether similar proportions of recovery can be expected for the two cognitive functions. We assessed 35 patients with neglect and 14 patients with aphasia at 3 weeks and 3 months after stroke using standardized tests. Recovery patterns were classified with hierarchical clustering and the proportion of recovery was estimated from initial impairment using a linear regression analysis. Patients were reliably clustered into two different groups. For patients in the first cluster (n = 40), recovery followed a linear model where improvement was proportional to initial impairment and achieved 71% of maximal possible recovery for both cognitive deficits. Patients in the second cluster (n = 9) exhibited poor recovery (<25% of initial impairment). Our findings indicate that improvement from neglect or aphasia after stroke shows the same dichotomy and proportionality as observed in motor recovery. This is suggestive of common underlying principles of plasticity, which apply to motor and cognitive functions. © 2017 EAN.
NASA Technical Reports Server (NTRS)
Remsberg, Ellis E.
2009-01-01
Fourteen-year time series of mesospheric and upper stratospheric temperatures from the Halogen Occultation Experiment (HALOE) are analyzed and reported. The data have been binned according to ten-degree wide latitude zones from 40S to 40N and at 10 altitudes from 43 to 80 km, for a total of 90 separate time series. Multiple linear regression (MLR) analysis techniques have been applied to those time series. This study focuses on resolving their 11-yr solar cycle (or SC-like) responses and their linear trend terms. Findings for T(z) from HALOE are compared directly with published results from ground-based Rayleigh lidar and rocketsonde measurements. SC-like responses from HALOE compare well with those from lidar station data at low latitudes. The cooling trends from HALOE also agree reasonably well with those from the lidar data for the concurrent decade. Cooling trends of the lower mesosphere from HALOE are not as large as those from rocketsondes and from lidar station time series of the previous two decades, presumably because the changes in the upper stratospheric ozone were near zero during the HALOE time period and did not affect those trends.
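A multiple linear regression that separates a linear trend from an 11-yr periodic response can be sketched as below. The design matrix (constant, linear trend, and one sine/cosine pair with an 11-yr period) is an assumption made for illustration; the abstract does not list the full set of terms used in the HALOE analysis, and the temperature series here is synthetic.

```python
import numpy as np


def fit_trend_and_cycle(t_years, temps, period=11.0):
    """Least-squares fit of temps = c + trend * t + a * sin(w t) + b * cos(w t)."""
    w = 2.0 * np.pi / period
    design = np.column_stack([
        np.ones_like(t_years),   # constant term
        t_years,                 # linear trend (K per year)
        np.sin(w * t_years),     # periodic response, sine part
        np.cos(w * t_years),     # periodic response, cosine part
    ])
    coef, *_ = np.linalg.lstsq(design, temps, rcond=None)
    const, trend, a, b = coef
    return float(const), float(trend), float(np.hypot(a, b))


# Synthetic monthly series over 14 years (assumption): -0.1 K/yr trend, 1.5 K cycle amplitude.
t_years = np.arange(0.0, 14.0, 1.0 / 12.0)
temps = 250.0 - 0.1 * t_years + 1.5 * np.sin(2.0 * np.pi * t_years / 11.0)

const, trend, amplitude = fit_trend_and_cycle(t_years, temps)
print(const, trend, amplitude)
```

With a record only slightly longer than one cycle, the trend and periodic terms are partly correlated, so fits to noisy real data would carry larger uncertainties than this noise-free example suggests.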
Predicting U.S. Army Reserve Unit Manning Using Market Demographics
2015-06-01
develops linear regression, classification tree, and logistic regression models to determine the ability of the location to support manning requirements... logistic regression model delivers predictive results that allow decision-makers to identify locations with a high probability of meeting unit...manning requirements. The recommendation of this thesis is that the USAR implement the logistic regression model.
Real, J; Cleries, R; Forné, C; Roso-Llorach, A; Martínez-Sánchez, J M
In medicine and biomedical research, statistical techniques like logistic, linear, Cox and Poisson regression are widely known. The main objective is to describe the evolution of multivariate techniques used in observational studies indexed in PubMed (1970-2013), and to check the requirements of the STROBE guidelines in the author guidelines in Spanish journals indexed in PubMed. A targeted PubMed search was performed to identify papers that used logistic, linear, Cox and Poisson models. Furthermore, a review was also made of the author guidelines of journals published in Spain and indexed in PubMed and Web of Science. Only 6.1% of the indexed manuscripts included a term related to multivariate analysis, increasing from 0.14% in 1980 to 12.3% in 2013. In 2013, 6.7, 2.5, 3.5, and 0.31% of the manuscripts contained terms related to logistic, linear, Cox and Poisson regression, respectively. On the other hand, 12.8% of journals' author guidelines explicitly recommend following the STROBE guidelines, and 35.9% recommend the CONSORT guideline. A low percentage of Spanish scientific journals indexed in PubMed include the STROBE statement requirement in the author guidelines. Multivariate regression models such as logistic, linear, Cox and Poisson regression are increasingly used in published observational studies, both at the international level and in journals published in Spanish. Copyright © 2015 Sociedad Española de Médicos de Atención Primaria (SEMERGEN). Publicado por Elsevier España, S.L.U. All rights reserved.
Wu, Lingtao; Lord, Dominique
2017-05-01
This study further examined the use of regression models for developing crash modification factors (CMFs), specifically focusing on the misspecification in the link function. The primary objectives were to validate the accuracy of CMFs derived from the commonly used regression models (i.e., generalized linear models or GLMs with additive linear link functions) when some of the variables have nonlinear relationships and quantify the amount of bias as a function of the nonlinearity. Using the concept of artificial realistic data, various linear and nonlinear crash modification functions (CM-Functions) were assumed for three variables. Crash counts were randomly generated based on these CM-Functions. CMFs were then derived from regression models for three different scenarios. The results were compared with the assumed true values. The main findings are summarized as follows: (1) when some variables have nonlinear relationships with crash risk, the CMFs for these variables derived from the commonly used GLMs are all biased, especially around areas away from the baseline conditions (e.g., boundary areas); (2) with the increase in nonlinearity (i.e., nonlinear relationship becomes stronger), the bias becomes more significant; (3) the quality of CMFs for other variables having linear relationships can be influenced when mixed with those having nonlinear relationships, but the accuracy may still be acceptable; and (4) the misuse of the link function for one or more variables can also lead to biased estimates for other parameters. This study raised the importance of the link function when using regression models for developing CMFs. Copyright © 2017 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Dai, Xiaoqian; Tian, Jie; Chen, Zhe
2010-03-01
Parametric images can represent both spatial distribution and quantification of the biological and physiological parameters of tracer kinetics. The linear least squares (LLS) method is a well-established linear regression method for generating parametric images by fitting compartment models with good computational efficiency. However, bias exists in LLS-based parameter estimates, owing to the noise present in tissue time activity curves (TTACs) that propagates as correlated error in the LLS linearized equations. To address this problem, a volume-wise principal component analysis (PCA) based method is proposed. In this method, the dynamic PET data are first pre-transformed to standardize noise variance, because PCA is a data-driven technique and cannot by itself separate signals from noise. Second, the volume-wise PCA is applied to the PET data. The signals can be mostly represented by the first few principal components (PCs) and the noise is left in the subsequent PCs. Then the noise-reduced data are obtained from the first few PCs by applying 'inverse PCA'. The data should also be transformed back according to the pre-transformation method used in the first step to maintain the scale of the original data set. Finally, the new data set is used to generate parametric images using the LLS estimation method. Compared with other noise-removal methods, the proposed method can achieve high statistical reliability in the generated parametric images. The effectiveness of the method is demonstrated both with computer simulation and with a clinical dynamic FDG PET study.
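The PCA truncation and 'inverse PCA' steps described above can be sketched as follows. This is a minimal illustration on synthetic time-activity curves: the noise-standardizing pre-transformation and the final LLS fit are omitted, and the function name `pca_denoise`, the curve shapes, and the noise level are all assumptions made for the example.

```python
import numpy as np

def pca_denoise(data, n_components):
    """Rebuild data (voxels x frames) from its first n_components PCs."""
    mean = data.mean(axis=0, keepdims=True)
    centered = data - mean
    # Thin SVD: rows of vt are the principal directions in frame space.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    k = n_components
    # 'Inverse PCA': reconstruct from the leading k components only.
    return (u[:, :k] * s[:k]) @ vt[:k] + mean

rng = np.random.default_rng(2)
frames = np.linspace(0.0, 1.0, 20)
# Two synthetic temporal patterns mixed across 500 hypothetical voxels.
basis = np.vstack([1.0 - np.exp(-5.0 * frames), np.exp(-3.0 * frames)])
weights = rng.uniform(0.5, 2.0, size=(500, 2))
clean = weights @ basis
noisy = clean + rng.normal(0.0, 0.1, size=clean.shape)

denoised = pca_denoise(noisy, n_components=2)
err_noisy = float(np.mean((noisy - clean) ** 2))
err_denoised = float(np.mean((denoised - clean) ** 2))
print(f"MSE before: {err_noisy:.4f}, after: {err_denoised:.4f}")
```

Because the synthetic signal lives in a two-dimensional subspace, keeping two components retains the signal while discarding most of the noise; with real data the number of components to keep is a tuning choice.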
Linear regression models for solvent accessibility prediction in proteins.
Wagner, Michael; Adamczak, Rafał; Porollo, Aleksey; Meller, Jarosław
2005-04-01
The relative solvent accessibility (RSA) of an amino acid residue in a protein structure is a real number that represents the solvent exposed surface area of this residue in relative terms. The problem of predicting the RSA from the primary amino acid sequence can therefore be cast as a regression problem. Nevertheless, RSA prediction has so far typically been cast as a classification problem. Consequently, various machine learning techniques have been used within the classification framework to predict whether a given amino acid exceeds some (arbitrary) RSA threshold and would thus be predicted to be "exposed," as opposed to "buried." We have recently developed novel methods for RSA prediction using nonlinear regression techniques which provide accurate estimates of the real-valued RSA and outperform classification-based approaches with respect to commonly used two-class projections. However, while their performance seems to provide a significant improvement over previously published approaches, these Neural Network (NN) based methods are computationally expensive to train and involve several thousand parameters. In this work, we develop alternative regression models for RSA prediction which are computationally much less expensive, involve orders-of-magnitude fewer parameters, and are still competitive in terms of prediction quality. In particular, we investigate several regression models for RSA prediction using linear L1-support vector regression (SVR) approaches as well as standard linear least squares (LS) regression. Using rigorously derived validation sets of protein structures and extensive cross-validation analysis, we compare the performance of the SVR with that of LS regression and NN-based methods. In particular, we show that the flexibility of the SVR (as encoded by metaparameters such as the error insensitivity and the error penalization terms) can be very beneficial to optimize the prediction accuracy for buried residues. 
We conclude that the simple and computationally much more efficient linear SVR performs comparably to nonlinear models and thus can be used in order to facilitate further attempts to design more accurate RSA prediction methods, with applications to fold recognition and de novo protein structure prediction methods.
Regression Commonality Analysis: A Technique for Quantitative Theory Building
ERIC Educational Resources Information Center
Nimon, Kim; Reio, Thomas G., Jr.
2011-01-01
When it comes to multiple linear regression analysis (MLR), it is common for social and behavioral science researchers to rely predominately on beta weights when evaluating how predictors contribute to a regression model. Presenting an underutilized statistical technique, this article describes how organizational researchers can use commonality…
Precision Efficacy Analysis for Regression.
ERIC Educational Resources Information Center
Brooks, Gordon P.
When multiple linear regression is used to develop a prediction model, sample size must be large enough to ensure stable coefficients. If the derivation sample size is inadequate, the model may not predict well for future subjects. The precision efficacy analysis for regression (PEAR) method uses a cross-validity approach to select sample sizes…
ERIC Educational Resources Information Center
Jurs, Stephen; And Others
The scree test and its linear regression technique are reviewed, and results of its use in factor analysis and Delphi data sets are described. The scree test was originally a visual approach for making judgments about eigenvalues, which considered the relationships of the eigenvalues to one another as well as their actual values. The graph that is…
Zong, Shi-Yu; Han, Han; Wang, Bing; Li, Ning; Dong, Tina Ting-Xia; Zhang, Tong; Tsim, Karl W K
2015-12-04
A reliable ultra-high-performance liquid chromatography-electrospray ionization-tandem mass spectrometry (UHPLC-ESI-MS/MS) method for the fast simultaneous determination of 13 nucleosides and nucleobases in Cordyceps sinensis (C. sinensis) with 2-chloroadenosine as internal standard was developed and validated. Samples were ultrasonically extracted in an ice bath thrice, and the optimum analyte separation was performed on an ACQUITY UPLC(TM) HSS C18 column (100 mm × 2.1 mm, 1.8 μm) with gradient elution. All targeted analytes were separated in 5.5 min. Furthermore, all calibration curves showed good linear regression (r > 0.9970) within the test ranges, and the limits of quantitation and detection of the 13 analytes were less than 150 and 75 ng/mL, respectively. The relative standard deviations (RSDs) of intra- and inter-day precisions were <6.23%. Recoveries of the quantified analytes ranged within 85.3%-117.3%, with RSD < 6.18%. The developed UHPLC-ESI-MS/MS method was successfully applied to determine nucleosides and nucleobases in 11 batches of C. sinensis samples from different regions in China. The range for the total content in the analyzed samples was 1329-2057 µg/g.
Abushoffa, Adel M; Fillet, Marianne; Hubert, Phillipe; Crommen, Jacques
2002-03-01
The single-isomer polyanionic cyclodextrin (CD) derivative heptakis-6-sulfato-beta-cyclodextrin (HSbetaCD) has been tested as chiral additive for the enantioseparation of non-steroidal anti-inflammatory drugs, such as fenoprofen, flurbiprofen, ibuprofen and ketoprofen, in capillary electrophoresis, using a pH 2.5 phosphoric acid-triethanolamine buffer in the reversed polarity mode. In most cases, the enantiomers of these acidic compounds, present in uncharged form at that pH, were only poorly resolved with HSbetaCD alone. However, the use of HSbetaCD in combination with the neutral CD derivative, heptakis-(2,3,6-tri-O-methyl)-beta-cyclodextrin (TMbetaCD), which has a particularly high enantioselectivity towards these compounds, has led to complete enantioresolution in reasonably low migration times in most cases. Affinity constants for the enantiomers with the two cyclodextrins were determined, using linear regression in a two-step approach. Affinity constants with the charged HSbetaCD were first calculated in single systems while those with the neutral TMbetaCD were determined in dual systems. Selectivity for the enantiomeric separation of these compounds in dual CD systems could be predicted using recently developed mathematical models.
Madarang, Krish J; Kang, Joo-Hyon
2014-06-01
Stormwater runoff has been identified as a source of pollution for the environment, especially for receiving waters. In order to quantify and manage the impacts of stormwater runoff on the environment, predictive models and mathematical models have been developed. Predictive tools such as regression models have been widely used to predict stormwater discharge characteristics. Storm event characteristics, such as antecedent dry days (ADD), have been related to response variables, such as pollutant loads and concentrations. However, whether ADD should be considered an important variable in predicting stormwater discharge characteristics has been controversial among many studies. In this study, we examined the accuracy of general linear regression models in predicting discharge characteristics of roadway runoff. A total of 17 storm events were monitored in two highway segments, located in Gwangju, Korea. Data from the monitoring were used to calibrate United States Environmental Protection Agency's Storm Water Management Model (SWMM). The calibrated SWMM was simulated for 55 storm events, and the results of total suspended solid (TSS) discharge loads and event mean concentrations (EMC) were extracted. From these data, linear regression models were developed. R² and p-values of the regression of ADD for both TSS loads and EMCs were investigated. Results showed that pollutant loads were better predicted than pollutant EMC in the multiple regression models. Regression may not provide the true effect of site-specific characteristics, due to uncertainty in the data. Copyright © 2014 The Research Centre for Eco-Environmental Sciences, Chinese Academy of Sciences. Published by Elsevier B.V. All rights reserved.
Birthweight Related Factors in Northwestern Iran: Using Quantile Regression Method.
Fallah, Ramazan; Kazemnejad, Anoshirvan; Zayeri, Farid; Shoghli, Alireza
2015-11-18
Birthweight is one of the most important predicting indicators of the health status in adulthood. Having a balanced birthweight is one of the priorities of the health system in most of the industrial and developed countries. This indicator is used to assess the growth and health status of the infants. The aim of this study was to assess the birthweight of the neonates by using quantile regression in Zanjan province. This analytical descriptive study was carried out using pre-registered (March 2010 - March 2012) data of neonates in urban/rural health centers of Zanjan province using multiple-stage cluster sampling. Data were analyzed using multiple linear regression and quantile regression methods and SAS 9.2 statistical software. Of 8456 newborn babies, 4146 (49%) were female. The mean age of the mothers was 27.1±5.4 years. The mean birthweight of the neonates was 3104 ± 431 grams. Five hundred and seventy-three (6.8%) of the neonates weighed less than 2500 grams. In all quantiles, gestational age of neonates (p<0.05), weight and educational level of the mothers (p<0.05) showed a linear significant relationship with the birthweight of the neonates. However, sex and birth rank of the neonates, mother's age, place of residence (urban/rural) and career were not significant in all quantiles (p>0.05). This study revealed the results of multiple linear regression and quantile regression were not identical. We strictly recommend the use of quantile regression when an asymmetric response variable or data with outliers is available.
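To show why least-squares and quantile regression can disagree when the response contains outliers, the sketch below fits both to synthetic data. It uses a simple iteratively reweighted least squares approximation to median regression rather than any particular statistical package; the helper names `pinball_loss` and `quantile_fit`, the data, and the outlier pattern are all assumptions for illustration and are unrelated to the study's SAS analysis.

```python
import numpy as np

def pinball_loss(residuals, tau):
    """Check (pinball) loss, the objective that quantile regression minimizes."""
    return float(np.sum(np.where(residuals >= 0,
                                 tau * residuals,
                                 (tau - 1.0) * residuals)))

def quantile_fit(X, y, tau, n_iter=100, eps=1e-6):
    """Approximate quantile regression by iteratively reweighted least squares."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start from the OLS fit
    for _ in range(n_iter):
        r = y - X @ beta
        # Weights chosen so that the weighted squared loss mimics the check loss.
        w = np.where(r >= 0, tau, 1.0 - tau) / np.maximum(np.abs(r), eps)
        Xw = X * w[:, None]
        beta = np.linalg.solve(X.T @ Xw, Xw.T @ y)
    return beta

rng = np.random.default_rng(3)
n = 200
x = rng.uniform(0.0, 10.0, size=n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, size=n)
y[:20] += 30.0  # 10% gross outliers, all on the high side

X = np.column_stack([np.ones(n), x])
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_med = quantile_fit(X, y, tau=0.5)

loss_ols = pinball_loss(y - X @ beta_ols, 0.5)
loss_med = pinball_loss(y - X @ beta_med, 0.5)
print("OLS coefficients:   ", beta_ols)
print("median coefficients:", beta_med)
```

In this setup the outliers pull the least-squares intercept upward, while the median fit stays close to the bulk of the data. For quantiles other than 0.5 the reweighting used here is only a heuristic, and a dedicated solver would be the safer choice.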
Henrard, S; Speybroeck, N; Hermans, C
2015-11-01
Haemophilia is a rare genetic haemorrhagic disease characterized by partial or complete deficiency of coagulation factor VIII, for haemophilia A, or IX, for haemophilia B. As in any other medical research domain, the field of haemophilia research is increasingly concerned with finding factors associated with binary or continuous outcomes through multivariable models. Traditional models include multiple logistic regressions, for binary outcomes, and multiple linear regressions for continuous outcomes. Yet these regression models are at times difficult to implement, especially for non-statisticians, and can be difficult to interpret. The present paper sought to didactically explain how, why, and when to use classification and regression tree (CART) analysis for haemophilia research. The CART method is non-parametric and non-linear, based on the repeated partitioning of a sample into subgroups based on a certain criterion. Breiman developed this method in 1984. Classification trees (CTs) are used to analyse categorical outcomes and regression trees (RTs) to analyse continuous ones. The CART methodology has become increasingly popular in the medical field, yet only a few examples of studies using this methodology specifically in haemophilia have to date been published. Two examples using CART analysis and previously published in this field are didactically explained in detail. There is increasing interest in using CART analysis in the health domain, primarily due to its ease of implementation, use, and interpretation, thus facilitating medical decision-making. This method should be promoted for analysing continuous or categorical outcomes in haemophilia, when applicable. © 2015 John Wiley & Sons Ltd.
Birthweight Related Factors in Northwestern Iran: Using Quantile Regression Method
Fallah, Ramazan; Kazemnejad, Anoshirvan; Zayeri, Farid; Shoghli, Alireza
2016-01-01
Introduction: Birthweight is one of the most important predicting indicators of the health status in adulthood. Having a balanced birthweight is one of the priorities of the health system in most of the industrial and developed countries. This indicator is used to assess the growth and health status of the infants. The aim of this study was to assess the birthweight of the neonates by using quantile regression in Zanjan province. Methods: This analytical descriptive study was carried out using pre-registered (March 2010 - March 2012) data of neonates in urban/rural health centers of Zanjan province using multiple-stage cluster sampling. Data were analyzed using multiple linear regression and quantile regression methods and SAS 9.2 statistical software. Results: Of 8456 newborn babies, 4146 (49%) were female. The mean age of the mothers was 27.1±5.4 years. The mean birthweight of the neonates was 3104 ± 431 grams. Five hundred and seventy-three (6.8%) of the neonates weighed less than 2500 grams. In all quantiles, gestational age of neonates (p<0.05), weight and educational level of the mothers (p<0.05) showed a linear significant relationship with the birthweight of the neonates. However, sex and birth rank of the neonates, mother's age, place of residence (urban/rural) and career were not significant in all quantiles (p>0.05). Conclusion: This study revealed the results of multiple linear regression and quantile regression were not identical. We strictly recommend the use of quantile regression when an asymmetric response variable or data with outliers is available. PMID:26925889
Some comparisons of complexity in dictionary-based and linear computational models.
Gnecco, Giorgio; Kůrková, Věra; Sanguineti, Marcello
2011-03-01
Neural networks provide a more flexible approximation of functions than traditional linear regression. In the latter, one can only adjust the coefficients in linear combinations of fixed sets of functions, such as orthogonal polynomials or Hermite functions, while for neural networks, one may also adjust the parameters of the functions which are being combined. However, some useful properties of linear approximators (such as uniqueness, homogeneity, and continuity of best approximation operators) are not satisfied by neural networks. Moreover, optimization of parameters in neural networks becomes more difficult than in linear regression. Experimental results suggest that these drawbacks of neural networks are offset by substantially lower model complexity, allowing accuracy of approximation even in high-dimensional cases. We give some theoretical results comparing requirements on model complexity for two types of approximators, the traditional linear ones and so called variable-basis types, which include neural networks, radial, and kernel models. We compare upper bounds on worst-case errors in variable-basis approximation with lower bounds on such errors for any linear approximator. Using methods from nonlinear approximation and integral representations tailored to computational units, we describe some cases where neural networks outperform any linear approximator. Copyright © 2010 Elsevier Ltd. All rights reserved.
Montoye, Alexander H K; Begum, Munni; Henning, Zachary; Pfeiffer, Karin A
2017-02-01
This study had three purposes, all related to evaluating energy expenditure (EE) prediction accuracy from body-worn accelerometers: (1) compare linear regression to linear mixed models, (2) compare linear models to artificial neural network models, and (3) compare accuracy of accelerometers placed on the hip, thigh, and wrists. Forty individuals performed 13 activities in a 90 min semi-structured, laboratory-based protocol. Participants wore accelerometers on the right hip, right thigh, and both wrists and a portable metabolic analyzer (EE criterion). Four EE prediction models were developed for each accelerometer: linear regression, linear mixed, and two ANN models. EE prediction accuracy was assessed using correlations, root mean square error (RMSE), and bias and was compared across models and accelerometers using repeated-measures analysis of variance. For all accelerometer placements, there were no significant differences for correlations or RMSE between linear regression and linear mixed models (correlations: r = 0.71-0.88, RMSE: 1.11-1.61 METs; p > 0.05). For the thigh-worn accelerometer, there were no differences in correlations or RMSE between linear and ANN models (ANN-correlations: r = 0.89, RMSE: 1.07-1.08 METs. Linear models-correlations: r = 0.88, RMSE: 1.10-1.11 METs; p > 0.05). Conversely, one ANN had higher correlations and lower RMSE than both linear models for the hip (ANN-correlation: r = 0.88, RMSE: 1.12 METs. Linear models-correlations: r = 0.86, RMSE: 1.18-1.19 METs; p < 0.05), and both ANNs had higher correlations and lower RMSE than both linear models for the wrist-worn accelerometers (ANN-correlations: r = 0.82-0.84, RMSE: 1.26-1.32 METs. Linear models-correlations: r = 0.71-0.73, RMSE: 1.55-1.61 METs; p < 0.01). For studies using wrist-worn accelerometers, machine learning models offer a significant improvement in EE prediction accuracy over linear models. 
Conversely, linear models showed similar EE prediction accuracy to machine learning models for hip- and thigh-worn accelerometers and may be viable alternative modeling techniques for EE prediction for hip- or thigh-worn accelerometers.
Diagnosis of Enzyme Inhibition Using Excel Solver: A Combined Dry and Wet Laboratory Exercise
ERIC Educational Resources Information Center
Dias, Albino A.; Pinto, Paula A.; Fraga, Irene; Bezerra, Rui M. F.
2014-01-01
In enzyme kinetic studies, linear transformations of the Michaelis-Menten equation, such as the Lineweaver-Burk double-reciprocal transformation, present some constraints. The linear transformation distorts the experimental error and the relationship between "x" and "y" axes; consequently, linear regression of transformed data…
No rationale for 1 variable per 10 events criterion for binary logistic regression analysis.
van Smeden, Maarten; de Groot, Joris A H; Moons, Karel G M; Collins, Gary S; Altman, Douglas G; Eijkemans, Marinus J C; Reitsma, Johannes B
2016-11-24
Ten events per variable (EPV) is a widely advocated minimal criterion for sample size considerations in logistic regression analysis. Of three previous simulation studies that examined this minimal EPV criterion only one supports the use of a minimum of 10 EPV. In this paper, we examine the reasons for substantial differences between these extensive simulation studies. The current study uses Monte Carlo simulations to evaluate small sample bias, coverage of confidence intervals and mean square error of logit coefficients. Logistic regression models fitted by maximum likelihood and a modified estimation procedure, known as Firth's correction, are compared. The results show that besides EPV, the problems associated with low EPV depend on other factors such as the total sample size. It is also demonstrated that simulation results can be dominated by even a few simulated data sets for which the prediction of the outcome by the covariates is perfect ('separation'). We reveal that different approaches for identifying and handling separation lead to substantially different simulation results. We further show that Firth's correction can be used to improve the accuracy of regression coefficients and alleviate the problems associated with separation. The current evidence supporting EPV rules for binary logistic regression is weak. Given our findings, there is an urgent need for new research to provide guidance for supporting sample size considerations for binary logistic regression analysis.
Su, Liyun; Zhao, Yanyong; Yan, Tianshun; Li, Fenglan
2012-01-01
Multivariate local polynomial fitting is applied to the multivariate linear heteroscedastic regression model. First, local polynomial fitting is applied to estimate the heteroscedastic function; then the coefficients of the regression model are obtained by the generalized least squares method. One noteworthy feature of our approach is that we avoid testing for heteroscedasticity by improving the traditional two-stage method. Because local polynomial estimation is a non-parametric technique, it is unnecessary to know the form of the heteroscedastic function. Therefore, we can improve the estimation precision when the heteroscedastic function is unknown. Furthermore, we verify that the regression coefficients are asymptotically normal based on numerical simulations and normal Q-Q plots of residuals. Finally, the simulation results and the local polynomial estimation of real data indicate that our approach is effective in finite-sample situations.
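A plain two-stage version of this idea can be sketched with one predictor: fit ordinary least squares, smooth the squared residuals with a local linear (degree-1 local polynomial) estimator to obtain a variance function, and refit with weights equal to the inverse estimated variance. The kernel, bandwidth, variance floor, and synthetic data are all assumptions chosen for the example; this is not the authors' improved procedure.

```python
import numpy as np

def local_linear(x0, x, z, bandwidth):
    """Local linear estimate of E[z | x = x0] with Gaussian kernel weights."""
    w = np.exp(-0.5 * ((x - x0) / bandwidth) ** 2)
    A = np.column_stack([np.ones_like(x), x - x0])
    Aw = A * w[:, None]
    coef = np.linalg.solve(A.T @ Aw, Aw.T @ z)
    return coef[0]  # the local intercept is the fitted value at x0

rng = np.random.default_rng(4)
n = 400
x = rng.uniform(0.0, 5.0, size=n)
sigma = 0.5 + 0.5 * x  # true noise scale, treated as unknown by the fit
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, size=n) * sigma

X = np.column_stack([np.ones(n), x])

# Stage 1: OLS, then smooth squared residuals to estimate the variance function.
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
resid2 = (y - X @ beta_ols) ** 2
var_hat = np.array([local_linear(x0, x, resid2, bandwidth=0.5) for x0 in x])
# Floor the estimate so that no observation receives an extreme weight.
var_hat = np.maximum(var_hat, 0.05 * resid2.mean())

# Stage 2: generalized (here, weighted) least squares with weights 1 / variance.
Xw = X / var_hat[:, None]
beta_gls = np.linalg.solve(X.T @ Xw, Xw.T @ y)
print("OLS:", beta_ols, "GLS:", beta_gls)
```

The form of the variance function is never specified in the fit, which is the practical appeal of the non-parametric first stage.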
Improvement of Storm Forecasts Using Gridded Bayesian Linear Regression for Northeast United States
NASA Astrophysics Data System (ADS)
Yang, J.; Astitha, M.; Schwartz, C. S.
2017-12-01
Bayesian linear regression (BLR) is a post-processing technique in which regression coefficients are derived and used to correct raw forecasts based on pairs of observation-model values. This study presents the development and application of a gridded Bayesian linear regression (GBLR) as a new post-processing technique to improve numerical weather prediction (NWP) of rain and wind storm forecasts over the northeast United States. Ten controlled variables produced from ten ensemble members of the National Center for Atmospheric Research (NCAR) real-time prediction system are used for a GBLR model. In the GBLR framework, leave-one-storm-out cross-validation is utilized to study the performance of the post-processing technique in a database composed of 92 storms. To estimate the regression coefficients of the GBLR, optimization procedures that minimize the systematic and random error of predicted atmospheric variables (wind speed, precipitation, etc.) are implemented for the modeled-observed pairs of training storms. The regression coefficients calculated for meteorological stations of the National Weather Service are interpolated back to the model domain. An analysis of forecast improvements based on error reductions during the storms will demonstrate the value of the GBLR approach. This presentation will also illustrate how the variances are optimized for the training partition in GBLR and discuss the verification strategy for grid points where no observations are available. The new post-processing technique is successful in improving wind speed and precipitation storm forecasts using past event-based data and has the potential to be implemented in real-time.
Francisco, Fabiane Lacerda; Saviano, Alessandro Morais; Almeida, Túlia de Souza Botelho; Lourenço, Felipe Rebello
2016-05-01
Microbiological assays are widely used to estimate the relative potencies of antibiotics in order to guarantee the efficacy, safety, and quality of drug products. Despite the advantages of turbidimetric bioassays when compared to other methods, they have limitations concerning the linearity and range of the dose-response curve determination. Here, we proposed to use partial least squares (PLS) regression to solve these limitations and to improve the prediction of relative potencies of antibiotics. Kinetic-reading microplate turbidimetric bioassays for apramycin and vancomycin were performed using Escherichia coli (ATCC 8739) and Bacillus subtilis (ATCC 6633), respectively. Microbial growths were measured as absorbance up to 180 and 300 min for apramycin and vancomycin turbidimetric bioassays, respectively. Conventional dose-response curves (absorbances or area under the microbial growth curve vs. log of antibiotic concentration) showed significant regression; however, there was significant deviation from linearity. Thus, they could not be used for relative potency estimations. PLS regression allowed us to construct a predictive model for estimating the relative potencies of apramycin and vancomycin without over-fitting and it improved the linear range of the turbidimetric bioassay. In addition, PLS regression provided predictions of relative potencies equivalent to those obtained from agar diffusion official methods. Therefore, we conclude that PLS regression may be used to estimate the relative potencies of antibiotics with significant advantages when compared to conventional dose-response curve determination. Copyright © 2016 Elsevier B.V. All rights reserved.
ERIC Educational Resources Information Center
Dolan, Conor V.; Wicherts, Jelte M.; Molenaar, Peter C. M.
2004-01-01
We consider the question of how variation in the number and reliability of indicators affects the power to reject the hypothesis that the regression coefficients are zero in latent linear regression analysis. We show that power remains constant as long as the coefficient of determination remains unchanged. Any increase in the number of indicators…
NASA Technical Reports Server (NTRS)
Whitlock, C. H.; Kuo, C. Y.
1979-01-01
The objective of this paper is to define optical physics and/or environmental conditions under which the linear multiple-regression should be applicable. An investigation of the signal-response equations is conducted and the concept is tested by application to actual remote sensing data from a laboratory experiment performed under controlled conditions. Investigation of the signal-response equations shows that the exact solution for a number of optical physics conditions is of the same form as a linearized multiple-regression equation, even if nonlinear contributions from surface reflections, atmospheric constituents, or other water pollutants are included. Limitations on achieving this type of solution are defined.
1981-09-01
corresponds to the same square footage that consumed the electrical energy. 3. The basic assumptions of multiple linear regression, as enumerated in...7. Data related to the sample of bases is assumed to be representative of bases in the population. Limitations Basic limitations on this research were... Ratemaking --Overview. Rand Report R-5894, Santa Monica CA, May 1977. Chatterjee, Samprit, and Bertram Price. Regression Analysis by Example. New York: John
Short separation channel location impacts the performance of short channel regression in NIRS
Gagnon, Louis; Cooper, Robert J.; Yücel, Meryem A.; Perdue, Katherine L.; Greve, Douglas N.; Boas, David A.
2011-01-01
Near-Infrared Spectroscopy (NIRS) allows the recovery of cortical oxy- and deoxyhemoglobin changes associated with evoked brain activity. NIRS is a back-reflection measurement making it very sensitive to the superficial layers of the head, i.e. the skin and the skull, where systemic interference occurs. As a result, the NIRS signal is strongly contaminated with systemic interference of superficial origin. A recent approach to overcome this problem has been the use of additional short source-detector separation optodes as regressors. Since these additional measurements are mainly sensitive to superficial layers in adult humans, they can be used to remove the systemic interference present in longer separation measurements, improving the recovery of the cortical hemodynamic response function (HRF). One question that remains to be answered is whether or not a short separation measurement is required in close proximity to each long separation NIRS channel. Here, we show that the systemic interference occurring in the superficial layers of the human head is inhomogeneous across the surface of the scalp. As a result, the improvement obtained by using a short separation optode decreases as the relative distance between the short and the long measurement is increased. NIRS data was acquired on 6 human subjects both at rest and during a motor task consisting of finger tapping. The effect of distance between the short and the long channel was first quantified by recovering a synthetic hemodynamic response added over the resting-state data. The effect was also observed in the functional data collected during the finger tapping task. Together, these results suggest that the short separation measurement must be located as close as 1.5 cm from the standard NIRS channel in order to provide an improvement which is of practical use. 
In this case, the improvement in Contrast-to-Noise Ratio (CNR) compared to a standard General Linear Model (GLM) procedure without using any small separation optode reached 50 % for HbO and 100 % for HbR. Using small separations located farther than 2 cm away resulted in mild or negligible improvements only. PMID:21945793
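The core idea of short-channel regression described above can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the authors' pipeline: the function name `regress_out_short_channel`, the synthetic sinusoidal "systemic" signal, the Gaussian "cortical" response, and the 0.8 coupling factor are all hypothetical choices made for the example.

```python
import numpy as np

def regress_out_short_channel(long_signal, short_signal):
    """Remove the part of long_signal that is linearly explained by short_signal."""
    long_signal = np.asarray(long_signal, dtype=float)
    short_signal = np.asarray(short_signal, dtype=float)
    # Design matrix: intercept plus the short-separation regressor.
    X = np.column_stack([np.ones_like(short_signal), short_signal])
    beta, *_ = np.linalg.lstsq(X, long_signal, rcond=None)
    # Subtract only the short-channel contribution; the baseline is kept.
    return long_signal - beta[1] * short_signal, float(beta[1])

# Synthetic data (hypothetical): superficial interference seen by both channels,
# plus an evoked response seen only by the long channel.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 60.0, 600)
systemic = np.sin(2 * np.pi * 0.1 * t)
cortical = np.exp(-0.5 * ((t - 30.0) / 3.0) ** 2)
short = systemic + 0.01 * rng.standard_normal(t.size)
long_ = cortical + 0.8 * systemic + 0.01 * rng.standard_normal(t.size)

cleaned, scale = regress_out_short_channel(long_, short)
```

In this toy setting the short channel sees exactly the interference that contaminates the long channel. The abstract's point is that this assumption weakens as the two measurements move apart on the scalp.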
Study on power grid characteristics in summer based on Linear regression analysis
NASA Astrophysics Data System (ADS)
Tang, Jin-hui; Liu, You-fei; Liu, Juan; Liu, Qiang; Liu, Zhuan; Xu, Xi
2018-05-01
Correlation analysis of power load and temperature is the precondition and foundation for accurate load prediction, and a great deal of research has been devoted to it. This paper constructs a linear correlation model between temperature and power load, and then examines the correlation between fault maintenance work orders and power load. Data for Jiangxi province in summer 2017, including temperature, power load, and fault maintenance work orders, were used for data analysis and mining. The linear regression models established in this paper are intended to support electricity load growth forecasting, fault repair work order review, analysis of distribution network operating weaknesses, and related work.
Separation and reconstruction of high pressure water-jet reflective sound signal based on ICA
NASA Astrophysics Data System (ADS)
Yang, Hongtao; Sun, Yuling; Li, Meng; Zhang, Dongsu; Wu, Tianfeng
2011-12-01
The impact of a high pressure water-jet on targets made of different materials produces different reflected mixed sounds. To reconstruct the distribution of the reflected sound signals along the linear detection line accurately and to separate the environmental noise effectively, the mixed sound signals acquired by a linear microphone array were processed by ICA. The basic principle of ICA and the FastICA algorithm are described in detail. A simulation experiment was designed. The environmental noise signal was simulated using band-limited white noise and the reflected sound signal was simulated using a pulse signal. The attenuation of the reflected sound signal over different transmission distances was simulated by weighting the sound signal with different contingencies. The mixed sound signals acquired by the linear microphone array were synthesized from the above simulated signals and were whitened and separated by ICA. The final results verified that the environmental noise can be separated and the sound distribution along the detection line can be reconstructed effectively.
Interpreting Regression Results: beta Weights and Structure Coefficients are Both Important.
ERIC Educational Resources Information Center
Thompson, Bruce
Various realizations have led to less frequent use of the "OVA" methods (analysis of variance--ANOVA--among others) and to more frequent use of general linear model approaches such as regression. However, too few researchers understand all the various coefficients produced in regression. This paper explains these coefficients and their…
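As a rough illustration of the two kinds of coefficients named in the title: beta weights are the regression coefficients obtained from standardized variables, and structure coefficients are the correlations between each predictor and the predicted outcome scores. The sketch below uses synthetic, deliberately correlated predictors; all variable names and data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x1 = rng.standard_normal(n)
x2 = 0.7 * x1 + 0.3 * rng.standard_normal(n)      # correlated with x1
y = 1.0 * x1 + 0.2 * x2 + rng.standard_normal(n)

def zscore(a):
    """Standardize columns to mean 0 and sample standard deviation 1."""
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)

Z = zscore(np.column_stack([x1, x2]))
zy = zscore(y)

# Beta weights: least-squares coefficients on standardized variables.
beta, *_ = np.linalg.lstsq(Z, zy, rcond=None)
y_hat = Z @ beta

# Multiple correlation R and structure coefficients r(x_j, y_hat).
multiple_r = np.corrcoef(zy, y_hat)[0, 1]
structure = np.array([np.corrcoef(Z[:, j], y_hat)[0, 1] for j in range(Z.shape[1])])
zero_order = np.array([np.corrcoef(Z[:, j], zy)[0, 1] for j in range(Z.shape[1])])
```

With correlated predictors, a variable can have a small beta weight and still a large structure coefficient, which is why looking at both can matter. In ordinary least squares each structure coefficient equals the predictor's zero-order correlation with the outcome divided by R.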
Spatial Assessment of Model Errors from Four Regression Techniques
Lianjun Zhang; Jeffrey H. Gove; Jeffrey H. Gove
2005-01-01
Forest modelers have attempted to account for the spatial autocorrelations among trees in growth and yield models by applying alternative regression techniques such as linear mixed models (LMM), generalized additive models (GAM), and geographically weighted regression (GWR). However, the model errors are commonly assessed using average errors across the entire study...
Quantile Regression in the Study of Developmental Sciences
ERIC Educational Resources Information Center
Petscher, Yaacov; Logan, Jessica A. R.
2014-01-01
Linear regression analysis is one of the most common techniques applied in developmental research, but only allows for an estimate of the average relations between the predictor(s) and the outcome. This study describes quantile regression, which provides estimates of the relations between the predictor(s) and outcome, but across multiple points of…
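The difference between modeling the mean and modeling a quantile comes down to the loss function: quantile regression minimizes the asymmetric "check" (pinball) loss instead of squared error. The sketch below fits only a constant (an intercept-only model) by grid search, to show that the minimizer is the sample quantile. The function names, the grid, and the data are hypothetical, and a real analysis would use a dedicated quantile regression routine.

```python
import numpy as np

def pinball_loss(y, pred, tau):
    """Mean check loss for quantile level tau (0 < tau < 1)."""
    resid = y - pred
    return float(np.mean(np.maximum(tau * resid, (tau - 1.0) * resid)))

def fit_constant_quantile(y, tau, grid):
    """Intercept-only quantile 'regression' by brute-force grid search."""
    losses = [pinball_loss(y, c, tau) for c in grid]
    return float(grid[int(np.argmin(losses))])

y = np.array([1.0, 2.0, 3.0, 4.0, 100.0])   # one extreme value
grid = np.linspace(0.0, 100.0, 10001)

q50 = fit_constant_quantile(y, 0.5, grid)   # median-type estimate
q90 = fit_constant_quantile(y, 0.9, grid)   # upper-tail estimate
mean_estimate = float(y.mean())             # what squared error would give
```

Here the mean is pulled to 22 by the single extreme value, while the tau = 0.5 fit stays near the median of 3. Repeating the fit at several values of tau gives the picture across multiple points of the outcome distribution that the abstract refers to.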
Maintenance Operations in Mission Oriented Protective Posture Level IV (MOPPIV)
1987-10-01
Repair FADAC Printed Circuit Board ... 3. Data Analysis Techniques ... a. Multiple Linear Regression ... ANALYSIS/DISCUSSION ... 1. Example of Regression Analysis ... 2. Regression results for all tasks ... TABLE 9. Task Grouping for Analysis ... TABLE 10. Remove/Replace H60A3 Power Pack ... TABLE
Experimental and computational prediction of glass transition temperature of drugs.
Alzghoul, Ahmad; Alhalaweh, Amjad; Mahlin, Denny; Bergström, Christel A S
2014-12-22
Glass transition temperature (Tg) is an important inherent property of an amorphous solid material which is usually determined experimentally. In this study, the relation between Tg and melting temperature (Tm) was evaluated using a data set of 71 structurally diverse druglike compounds. Further, in silico models for prediction of Tg were developed based on calculated molecular descriptors and linear (multilinear regression, partial least-squares, principal component regression) and nonlinear (neural network, support vector regression) modeling techniques. The models based on Tm predicted Tg with an RMSE of 19.5 K for the test set. Among the five computational models developed herein the support vector regression gave the best result with RMSE of 18.7 K for the test set using only four chemical descriptors. Hence, two different models that predict Tg of drug-like molecules with high accuracy were developed. If Tm is available, a simple linear regression can be used to predict Tg. However, the results also suggest that support vector regression and calculated molecular descriptors can predict Tg with equal accuracy, already before compound synthesis.
NASA Astrophysics Data System (ADS)
Wibowo, Wahyu; Wene, Chatrien; Budiantara, I. Nyoman; Permatasari, Erma Oktania
2017-03-01
Multiresponse semiparametric regression is a simultaneous-equation regression model that fuses parametric and nonparametric models. The regression model comprises several models, each with two components, parametric and nonparametric. The model used here has a linear function as the parametric component and a truncated polynomial spline as the nonparametric component. The model can handle both linear and nonlinear relationships between the responses and the sets of predictor variables. The aim of this paper is to demonstrate the application of the regression model to the effect of regional socio-economic conditions on the use of information technology. More specifically, the response variables are the percentage of households with access to the internet and the percentage of households with a personal computer. The predictor variables are the percentage of literate people, the percentage of electrification, and the percentage of economic growth. Based on identification of the relationship between the response and predictor variables, economic growth is treated as a nonparametric predictor and the others as parametric predictors. The results show that multiresponse semiparametric regression can be applied well, as indicated by the high coefficient of determination of 90 percent.
Louys, Julien; Meloro, Carlo; Elton, Sarah; Ditchfield, Peter; Bishop, Laura C
2015-01-01
We test the performance of two models that use mammalian communities to reconstruct multivariate palaeoenvironments. While both models exploit the correlation between mammal communities (defined in terms of functional groups) and arboreal heterogeneity, the first uses a multiple multivariate regression of community structure and arboreal heterogeneity, while the second uses a linear regression of the principal components of each ecospace. The success of these methods means the palaeoenvironment of a particular locality can be reconstructed in terms of the proportions of heavy, moderate, light, and absent tree canopy cover. The linear regression is less biased, and more precisely and accurately reconstructs heavy tree canopy cover than the multiple multivariate model. However, the multiple multivariate model performs better than the linear regression for all other canopy cover categories. Both models consistently perform better than randomly generated reconstructions. We apply both models to the palaeocommunity of the Upper Laetolil Beds, Tanzania. Our reconstructions indicate that there was very little heavy tree cover at this site (likely less than 10%), with the palaeo-landscape instead comprising a mixture of light and absent tree cover. These reconstructions help resolve the previous conflicting palaeoecological reconstructions made for this site. Copyright © 2014 Elsevier Ltd. All rights reserved.
Cruz, Antonio M; Barr, Cameron; Puñales-Pozo, Elsa
2008-01-01
The main goals of this research were to build a predictor for a turnaround time (TAT) indicator and to use a numerical clustering technique to find possible causes of undesirable TAT values. The following stages were used: domain understanding, data characterisation and sample reduction, and insight characterisation. A multiple linear regression predictor of the TAT indicator and clustering techniques were used to improve corrective maintenance task efficiency in a clinical engineering department (CED). The variables contributing to the regression model were clinical engineering department response time (CE(rt), 0.415 positive coefficient), stock service response time (Stock(rt), 0.734 positive coefficient), priority level (0.21 positive coefficient) and service time (0.06 positive coefficient). The regression process showed heavy reliance on Stock(rt), CE(rt) and priority, in that order. Clustering techniques revealed the main causes of high TAT values. This examination has provided a means for analysing current technical service quality and effectiveness. In doing so, it has demonstrated a process for identifying areas and methods of improvement and a model against which to analyse these methods' effectiveness.
A New SEYHAN's Approach in Case of Heterogeneity of Regression Slopes in ANCOVA.
Ankarali, Handan; Cangur, Sengul; Ankarali, Seyit
2018-06-01
In this study, a new approach named SEYHAN is suggested for cases in which the assumptions of linearity and homogeneity of regression slopes of conventional ANCOVA are not met, so that conventional ANCOVA can be used instead of robust or nonlinear ANCOVA. The proposed approach involves transforming the continuous covariate into a categorical structure when the relationship between the covariate and the dependent variable is nonlinear and the regression slopes are not homogeneous. A simulated data set was used to explain SEYHAN's approach. In this approach, after the MARS method was used to categorize the covariate, we performed conventional ANCOVA in each subgroup formed according to the knot values, as well as a two-factor analysis of variance. The first model is simpler than the second model, which includes an interaction term. Since the model with the interaction effect has more subjects, the power of the test also increases and the existing significant difference is revealed better. With the help of this approach, nonlinearity and heterogeneity of regression slopes are no longer a problem for data analysis with the conventional linear ANCOVA model. It can be used quickly and efficiently in the presence of one or more covariates.
The Influential Effect of Blending, Bump, Changing Period, and Eclipsing Cepheids on the Leavitt Law
NASA Astrophysics Data System (ADS)
García-Varela, A.; Muñoz, J. R.; Sabogal, B. E.; Vargas Domínguez, S.; Martínez, J.
2016-06-01
The investigation of the nonlinearity of the Leavitt law (LL) is a topic that began more than seven decades ago, when some of the studies in this field found that the LL has a break at about 10 days. The goal of this work is to investigate a possible statistical cause of this nonlinearity. By applying linear regressions to OGLE-II and OGLE-IV data, we find that to obtain the LL by using linear regression, robust techniques to deal with influential points and/or outliers are needed instead of the ordinary least-squares regression traditionally used. In particular, by using M- and MM-regressions we establish firmly and without doubt the linearity of the LL in the Large Magellanic Cloud, without rejecting or excluding Cepheid data from the analysis. This implies that light curves of Cepheids suggesting blending, bumps, eclipses, or period changes do not affect the LL for this galaxy. For the Small Magellanic Cloud, when including Cepheids of this kind, it is not possible to find an adequate model, probably because of the geometry of the galaxy. In that case, a possible influence of these stars could exist.
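To make the contrast between ordinary least squares and M-regression concrete, here is a minimal sketch of a Huber-type M-estimator fitted by iteratively reweighted least squares. It is an assumption-laden illustration, not the estimator used in the paper: the abstract does not state which M- or MM-estimator settings were used, and the function name `huber_irls`, the tuning constant 1.345, and the synthetic data are choices made only for this example. MM-regression is not shown.

```python
import numpy as np

def huber_irls(x, y, delta=1.345, n_iter=50):
    """Fit y = b0 + b1*x with Huber weights via iteratively reweighted least squares."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]   # start from the OLS fit
    for _ in range(n_iter):
        resid = y - X @ beta
        # Robust residual scale from the median absolute deviation (MAD).
        scale = max(np.median(np.abs(resid - np.median(resid))) / 0.6745, 1e-12)
        u = np.abs(resid) / scale
        # Huber weights: 1 inside the threshold, delta/u beyond it.
        w = delta / np.maximum(u, delta)
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta

# Synthetic linear relation (true slope 3) with five influential points.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = 2.0 + 3.0 * x + 0.05 * rng.standard_normal(x.size)
y[-5:] += 5.0

ols_slope = np.polyfit(x, y, 1)[0]
huber_slope = huber_irls(x, y)[1]
```

The robust fit keeps every point in the sample but limits how much any single large residual can pull on the line. This mirrors the abstract's point that unusual Cepheids need not be excluded from the analysis.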
Zhang, Hanze; Huang, Yangxin; Wang, Wei; Chen, Henian; Langland-Orban, Barbara
2017-01-01
In longitudinal AIDS studies, it is of interest to investigate the relationship between HIV viral load and CD4 cell counts, as well as the complicated time effect. Most common models used to analyze such complex longitudinal data are based on mean regression, which fails to provide efficient estimates in the presence of outliers and/or heavy tails. Quantile regression-based partially linear mixed-effects models, a special case of semiparametric models enjoying the benefits of both parametric and nonparametric models, have the flexibility to monitor the viral dynamics nonparametrically and detect the varying CD4 effects parametrically at different quantiles of viral load. Meanwhile, it is critical to consider various data features of repeated measurements, including left-censoring due to a limit of detection, covariate measurement error, and asymmetric distribution. In this research, we first establish Bayesian joint models that account for all these data features simultaneously in the framework of quantile regression-based partially linear mixed-effects models. The proposed models are applied to analyze the Multicenter AIDS Cohort Study (MACS) data. Simulation studies are also conducted to assess the performance of the proposed methods under different scenarios.
Relationships between Circulating and Intraprostatic Sex Steroid Hormone Concentrations.
Cook, Michael B; Stanczyk, Frank Z; Wood, Shannon N; Pfeiffer, Ruth M; Hafi, Muhannad; Veneroso, Carmela C; Lynch, Barlow; Falk, Roni T; Zhou, Cindy Ke; Niwa, Shelley; Emanuel, Eric; Gao, Yu-Tang; Hemstreet, George P; Zolfghari, Ladan; Carroll, Peter R; Manyak, Michael J; Sesterhann, Isabell A; Levine, Paul H; Hsing, Ann W
2017-11-01
Background: Sex hormones have been implicated in prostate carcinogenesis, yet epidemiologic studies have not provided substantiating evidence. We tested the hypothesis that circulating concentrations of sex steroid hormones reflect intraprostatic concentrations using serum and adjacent microscopically verified benign prostate tissue from prostate cancer cases. Methods: Incident localized prostate cancer cases scheduled for surgery were invited to participate. Consented participants completed surveys, and provided resected tissues and blood. Histologic assessment of the ends of fresh frozen tissue confirmed adjacent microscopically verified benign pathology. Sex steroid hormones in sera and tissues were extracted, chromatographically separated, and then quantitated by radioimmunoassays. Linear regression was used to account for variations in intraprostatic hormone concentrations by age, body mass index, race, and study site, and subsequently to assess relationships with serum hormone concentrations. Gleason score (from adjacent tumor tissue), race, and age were assessed as potential effect modifiers. Results: Circulating sex steroid hormone concentrations had low-to-moderate correlations with, and explained small proportions of variations in, intraprostatic sex steroid hormone concentrations. Androstane-3α,17β-diol glucuronide (3α-diol G) explained the highest variance of tissue concentrations of 3α-diol G (linear regression r² = 0.21), followed by serum testosterone and tissue dihydrotestosterone (r² = 0.10), and then serum estrone and tissue estrone (r² = 0.09). There was no effect modification by Gleason score, race, or age. Conclusions: Circulating concentrations of sex steroid hormones are poor surrogate measures of the intraprostatic hormonal milieu.
Impact: The high exposure misclassification provided by circulating sex steroid hormone concentrations for intraprostatic levels may partly explain the lack of any consistent association of circulating hormones with prostate cancer risk. Cancer Epidemiol Biomarkers Prev; 26(11); 1660-6. ©2017 AACR.
Estimating the effect of multiple environmental stressors on coral bleaching and mortality
Welle, Paul D.; Small, Mitchell J.; Doney, Scott C.; Azevedo, Inês L.
2017-01-01
Coral cover has been declining in recent decades due to increased temperatures and environmental stressors. However, the extent to which different stressors contribute both individually and in concert to bleaching and mortality is still very uncertain. We develop and use a novel regression approach, using non-linear parametric models that control for unobserved time invariant effects to estimate the effects on coral bleaching and mortality due to temperature, solar radiation, depth, hurricanes and anthropogenic stressors using historical data from a large bleaching event in 2005 across the Caribbean. Two separate models are created, one to predict coral bleaching, and the other to predict near-term mortality. A large ensemble of supporting data is assembled to control for omitted variable bias and improve fit, and a significant improvement in fit is observed from univariate linear regression based on temperature alone. The results suggest that climate stressors (temperature and radiation) far outweighed direct anthropogenic stressors (using distance from shore and nearby human population density as a proxy for such stressors) in driving coral health outcomes during the 2005 event. Indeed, temperature was found to play a role ~4 times greater in both the bleaching and mortality response than population density across their observed ranges. The empirical models tested in this study have large advantages over ordinary least squares: they offer unbiased estimates for censored data, correct for spatial correlation, and are capable of handling more complex relationships between dependent and independent variables. The models offer a framework for preparing for future warming events and climate change; guiding monitoring and attribution of other bleaching and mortality events regionally and around the globe; and informing adaptive management and conservation efforts. PMID:28472031
Chen, Tzu-An; Baranowski, Janice; Thompson, Deborah; Baranowski, Tom
2013-01-01
Background: Children's physical activity (PA) is inversely associated with children's weight status. Parents may be an important influence on children's PA by restricting sedentary time or supporting PA. The aim of this study was to investigate the association of PA and screen-media–related [television (TV) and videogame] parenting practices with children's PA. Methods: Secondary analyses of baseline data were performed from an intervention with 9- to 12-year-olds who received active or inactive videogames (n=83) to promote PA. Children's PA was assessed with 1 week of accelerometry at baseline. Parents reported their PA, TV, and videogame parenting practices and child's bedroom screen-media availability. Associations were investigated using Spearman's partial correlations and linear regressions. Results: Although several TV and videogame parenting practices were significantly intercorrelated, only a few significant correlations existed between screen-media and PA parenting practices. In linear regression models, restrictive TV parenting practices were associated with greater child sedentary time (p=0.03) and less moderate-to-vigorous PA (MVPA; p=0.01). PA logistic support parenting practices were associated with greater child MVPA (p=0.03). Increased availability of screen-media equipment in the child's bedroom was associated with more sedentary time (p=0.02) and less light PA (p=0.01) and MVPA (p=0.05) in all three models. Conclusion: In this cross-sectional sample, restrictive screen-media and supportive PA parenting practices had opposite associations with children's PA. Longitudinal and experimental child PA studies should assess PA and screen-media parenting separately to understand how parents influence their child's PA behaviors and whether the child's baseline PA or screen media behaviors affect the parent's use of parenting practices. Recommendations to remove screens from children's bedrooms may also affect their PA. PMID:24028564
Use of iDXA spine scans to evaluate total and visceral abdominal fat.
Bea, J W; Hsu, C-H; Blew, R M; Irving, A P; Caan, B J; Kwan, M L; Abraham, I; Going, S B
2018-01-01
Abdominal fat may be a better predictor than body mass index (BMI) for risk of metabolically-related diseases, such as diabetes, cardiovascular disease, and some cancers. We sought to validate the percent fat reported on dual energy X-ray absorptiometry (DXA) regional spine scans (spine fat fraction, SFF) against abdominal fat obtained from total body scans using the iDXA machine (General Electric, Madison, WI), as previously done on the Prodigy model. Total body scans and regional spine scans were completed on the same day (N = 50). In alignment with the Prodigy-based study, the following regions of interest (ROI) were assessed from total body scans and compared to the SFF from regional spine scans: total abdominal fat at (1) lumbar vertebrae L2-L4 and (2) L2-Iliac Crest (L2-IC); (3) total trunk fat; and (4) visceral fat in the android region. Separate linear regression models were used to predict each total body scan ROI from SFF; models were validated by bootstrapping. The sample was 84% female, with a mean age of 38.5 ± 17.4 years and a mean BMI of 23.0 ± 3.8 kg/m². The SFF, adjusted for BMI, predicted L2-L4 and L2-IC total abdominal fat (%; Adj. R²: 0.90) and total trunk fat (%; Adj. R²: 0.88) well; visceral fat (%) adjusted R² was 0.83. Linear regression models adjusted for additional participant characteristics resulted in similar adjusted R² values. This replication of the strong correlation between SFF and abdominal fat measures on the iDXA in a new population confirms the previous Prodigy model findings and improves generalizability. © 2017 Wiley Periodicals, Inc.
Does childhood motor skill proficiency predict adolescent fitness?
Barnett, Lisa M; Van Beurden, Eric; Morgan, Philip J; Brooks, Lyndon O; Beard, John R
2008-12-01
To determine whether childhood fundamental motor skill proficiency predicts subsequent adolescent cardiorespiratory fitness. In 2000, children's proficiency in a battery of skills was assessed as part of an elementary school-based intervention. Participants were followed up during 2006/2007 as part of the Physical Activity and Skills Study, and cardiorespiratory fitness was measured using the Multistage Fitness Test. Linear regression was used to examine the relationship between childhood fundamental motor skill proficiency and adolescent cardiorespiratory fitness controlling for gender. Composite object control (kick, catch, throw) and locomotor skill (hop, side gallop, vertical jump) scores were constructed for analysis. A separate linear regression examined the ability of the sprint run to predict cardiorespiratory fitness. Of the 928 original intervention participants, 481 were in 28 schools, 276 (57%) of whom were assessed. Two hundred and forty-four students (88.4%) completed the fitness test. One hundred and twenty-seven were females (52.1%), 60.1% of whom were in grade 10 and 39.0% were in grade 11. As children, almost all 244 completed each motor assessment, except for the sprint run (n = 154, 55.8%). The mean composite skill score in 2000 was 17.7 (SD 5.1). In 2006/2007, the mean number of laps on the Multistage Fitness Test was 50.5 (SD 24.4). Object control proficiency in childhood, adjusting for gender (P = 0.000), was associated with adolescent cardiorespiratory fitness (P = 0.012), accounting for 26% of fitness variation. Children with good object control skills are more likely to become fit adolescents. Fundamental motor skill development in childhood may be an important component of interventions aiming to promote long-term fitness.
Houde, Francis; Cabana, François; Léonard, Guillaume
2016-01-01
Previous studies have revealed a weak to moderate relationship between pain and disability in individuals suffering from low back pain (LBP). However, to our knowledge, no studies have evaluated if this relationship is different between young and older adults. The objective of this descriptive, cross-sectional study was to determine whether the relationship between LBP intensity and physical disability is different between young and older adults. Pain intensity (measured with a visual analog scale) and physical disability scores (measured with the Oswestry Disability Index) were collected from the medical files of 164 patients with LBP. Separate Pearson correlation coefficients were calculated between these 2 variables for young (mean age 40 ± 6 years, n = 82) and older (62 ± 9 years, n = 82) individuals and a Fisher r-to-z transformation was used to test for group differences in the strength of the relationship. Linear regression analyses were also performed to determine whether the slope of the association was different between the 2 groups. A significant and positive association was found between pain intensity and disability for both young and older individuals. However, the correlation was stronger in the young group (r = 0.66; P < .01) than in the older group (r = 0.44; P < .01) (Fisher Z = 2.03; P < .05). The linear regression model also revealed that the slope of the relationship was steeper in the young group (P < .05). Although both young and older individuals showed a significant association between pain intensity and disability, the relationship between these 2 variables was more tenuous in older individuals than in young patients. Future research is essential to identify the factors underlying this age-related difference.
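The group comparison reported above can be approximately reproduced from the rounded values in the abstract. The sketch below applies the standard Fisher r-to-z test for two independent correlations; the function name `fisher_z_test` is hypothetical, and the two-sided p-value is an assumption about how the test was reported.

```python
import math

def fisher_z_test(r1, n1, r2, n2):
    """Compare two independent Pearson correlations via Fisher's r-to-z transform."""
    z1, z2 = math.atanh(r1), math.atanh(r2)           # r-to-z transform
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))   # standard error of z1 - z2
    z = (z1 - z2) / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))  # normal two-sided p-value
    return z, p_two_sided

# Rounded inputs taken from the abstract: r = 0.66 (n = 82) vs r = 0.44 (n = 82).
z, p = fisher_z_test(0.66, 82, 0.44, 82)
```

With these rounded inputs the statistic comes out at about 2.0 with p below .05, close to the reported Z = 2.03. A small discrepancy is expected because the published correlations are rounded to two decimals.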
Assessing the role of pavement macrotexture in preventing crashes on highways.
Pulugurtha, Srinivas S; Kusam, Prasanna R; Patel, Kuvleshay J
2010-02-01
The objective of this article is to assess the role of pavement macrotexture in preventing crashes on highways in the State of North Carolina. Laser profilometer data obtained from the North Carolina Department of Transportation (NCDOT) for highways comprising four corridors are processed to calculate pavement macrotexture at 100-m (approximately 330-ft) sections according to the American Society for Testing and Materials (ASTM) standards. Crash data collected over the same lengths of the corridors were integrated with the calculated pavement macrotexture for each section. Scatterplots were generated to assess the role of pavement macrotexture on crashes and logarithm of crashes. Regression analyses were conducted by considering predictor variables such as million vehicle miles of travel (as a function of traffic volume and length), the number of interchanges, the number of at-grade intersections, the number of grade-separated interchanges, and the number of bridges, culverts, and overhead signs along with pavement macrotexture to study the statistical significance of relationship between pavement macrotexture and crashes (both linear and log-linear) when compared to other predictor variables. Scatterplots and regression analysis conducted indicate a more statistically significant relationship between pavement macrotexture and logarithm of crashes than between pavement macrotexture and crashes. The coefficient for pavement macrotexture, in general, is negative, indicating that the number of crashes or logarithm of crashes decreases as it increases. The relation between pavement macrotexture and logarithm of crashes is generally stronger than between most other predictor variables and crashes or logarithm of crashes. Based on results obtained, it can be concluded that maintaining pavement macrotexture greater than or equal to 1.524 mm (0.06 in.) as a threshold limit would possibly reduce crashes and provide safe transportation to road users on highways.
Maternal and neonatal vitamin D status, genotype and childhood celiac disease.
Mårild, Karl; Tapia, German; Haugen, Margareta; Dahl, Sandra R; Cohen, Arieh S; Lundqvist, Marika; Lie, Benedicte A; Stene, Lars C; Størdal, Ketil
2017-01-01
Low concentration of 25-hydroxyvitamin D during pregnancy may be associated with offspring autoimmune disorders. Little is known about environmental triggers except gluten for celiac disease, a common immune-mediated disorder where seasonality of birth has been reported as a risk factor. We therefore aimed to test whether low maternal and neonatal 25-hydroxyvitamin D predicted higher risk of childhood celiac disease. In this Norwegian nationwide pregnancy cohort (n = 113,053) and nested case-control study, we analyzed 25-hydroxyvitamin D in maternal blood from mid-pregnancy, postpartum and cord plasma of 416 children who developed celiac disease and 570 randomly selected controls. Mothers and children were genotyped for established celiac disease and vitamin D metabolism variants. We used mixed linear regression models and logistic regression to study associations. There was no significant difference in average 25-hydroxyvitamin D between cases and controls (63.1 and 62.1 nmol/l, respectively, p = 0.28), and no significant linear trend (adjusted odds ratio per 10 nM increase 1.05, 95% CI: 0.93-1.17). Results were similar when analyzing the mid-pregnancy, postpartum or cord plasma separately. Genetic variants for vitamin D deficiency were not associated with celiac disease (odds ratio per risk allele of the child, 1.00; 95% CI, 0.90 to 1.10, odds ratio per risk allele of the mother 0.94; 95% CI 0.85 to 1.04). Vitamin D intake in pregnancy or by the child in early life did not predict later celiac disease. Adjustment for established genetic risk markers for celiac disease gave similar results. We found no support for the hypothesis that maternal or neonatal vitamin D status is related to the risk of childhood celiac disease.
NASA Astrophysics Data System (ADS)
Gilmore, A. M.
2015-12-01
This study describes a method based on simultaneous absorbance and fluorescence excitation-emission mapping for rapidly and accurately monitoring dissolved organic carbon concentration and disinfection by-product formation potential for surface water sourced drinking water treatment. The method enables real-time monitoring of the Dissolved Organic Carbon (DOC), absorbance at 254 nm (UVA), the Specific UV Absorbance (SUVA) as well as the Simulated Distribution System Trihalomethane (THM) Formation Potential (SDS-THMFP) for the source and treated water among other component parameters. The method primarily involves Parallel Factor Analysis (PARAFAC) decomposition of the high and lower molecular weight humic and fulvic organic component concentrations. The DOC calibration method involves calculating a single slope factor (with the intercept fixed at 0 mg/l) by linear regression for the UVA divided by the ratio of the high and low molecular weight component concentrations. This method thus corrects for the changes in the molecular weight component composition as a function of the source water composition and coagulation treatment effects. The SDS-THMFP calibration involves a multiple linear regression of the DOC, organic component ratio, chlorine residual, pH and alkalinity. Both the DOC and SDS-THMFP correlations over a period of 18 months exhibited adjusted correlation coefficients with r² > 0.969. The parameters can be reported as a function of compliance rules associated with required % removals of DOC (as a function of alkalinity) and predicted maximum contaminant levels (MCL) of THMs. The single instrument method, which is compatible with continuous flow monitoring or grab sampling, provides a rapid (2-3 minute) and precise indicator of drinking water disinfectant treatability without the need for separate UV photometric and DOC meter measurements or independent THM determinations.
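The single-slope calibration with the intercept fixed at zero, as described in the abstract above, corresponds to least-squares regression through the origin, where the slope is sum(x*y) / sum(x*x). The following is a minimal sketch of that calculation only; the function name and the numbers are hypothetical placeholders and are not data or code from the study.

```python
def slope_through_origin(x, y):
    """Least-squares slope b for the model y = b * x (intercept fixed at 0)."""
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least one nonzero value")
    return sum(xi * yi for xi, yi in zip(x, y)) / sxx


# Hypothetical values: x stands in for the corrected UVA predictor,
# y for the reference DOC concentration in mg/l.
x = [0.5, 1.0, 1.5, 2.0]
y = [1.0, 2.0, 3.0, 4.0]
b = slope_through_origin(x, y)
print(f"slope factor: {b:.3f}")
```

Fixing the intercept at zero forces the fitted line through the origin, so the usual R² interpretation differs from that of a model with a free intercept.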
Limbers, Christine A; Young, Danielle
2015-05-01
Executive functions play a critical role in regulating eating behaviors and have been shown to be associated with overeating which over time can result in overweight and obesity. There has been a paucity of research examining the associations among healthy dietary behaviors and executive functions utilizing behavioral rating scales of executive functioning. The objective of the present cross-sectional study was to evaluate the associations among fruit and vegetable consumption, intake of foods high in saturated fat, and executive functions using the Behavioral Rating Inventory of Executive Functioning-Adult Version. A total of 240 university students completed the Behavioral Rating Inventory of Executive Functioning-Adult Version, the 26-Item Eating Attitudes Test, and the Diet subscale of the Summary of Diabetes Self-Care Activities Questionnaire. Multiple linear regression analysis was conducted with two separate models in which fruit and vegetable consumption and saturated fat intake were the outcomes. Demographic variables, body mass index, and eating styles were controlled for in the analysis. Better initiation skills were associated with greater intake of fruits and vegetables in the last 7 days (standardized beta = -0.17; p < 0.05). Stronger inhibitory control was associated with less consumption of high fat foods in the last 7 days (standardized beta = 0.20; p < 0.05) in the multiple linear regression analysis. Executive functions that predict fruit and vegetable consumption are distinct from those that predict avoidance of foods high in saturated fat. Future research should investigate whether continued skill enhancement in initiation and inhibition following standard behavioral interventions improves long-term maintenance of weight loss. © The Author(s) 2015.
Müller, Christian; Schillert, Arne; Röthemeier, Caroline; Trégouët, David-Alexandre; Proust, Carole; Binder, Harald; Pfeiffer, Norbert; Beutel, Manfred; Lackner, Karl J.; Schnabel, Renate B.; Tiret, Laurence; Wild, Philipp S.; Blankenberg, Stefan
2016-01-01
Technical variation plays an important role in microarray-based gene expression studies, and batch effects explain a large proportion of this noise. It is therefore mandatory to eliminate technical variation while maintaining biological variability. Several strategies have been proposed for the removal of batch effects, although they have not been evaluated in large-scale longitudinal gene expression data. In this study, we aimed at identifying a suitable method for batch effect removal in a large study of microarray-based longitudinal gene expression. Monocytic gene expression was measured in 1092 participants of the Gutenberg Health Study at baseline and 5-year follow up. Replicates of selected samples were measured at both time points to identify technical variability. Deming regression, Passing-Bablok regression, linear mixed models, non-linear models as well as ReplicateRUV and ComBat were applied to eliminate batch effects between replicates. In a second step, quantile normalization prior to batch effect correction was performed for each method. Technical variation between batches was evaluated by principal component analysis. Associations between body mass index and transcriptomes were calculated before and after batch removal. Results from association analyses were compared to evaluate maintenance of biological variability. Quantile normalization, separately performed in each batch, combined with ComBat successfully reduced batch effects and maintained biological variability. ReplicateRUV performed perfectly in the replicate data subset of the study, but failed when applied to all samples. All other methods did not substantially reduce batch effects in the replicate data subset. Quantile normalization plus ComBat appears to be a valuable approach for batch correction in longitudinal gene expression data. PMID:27272489
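Deming regression, one of the methods compared in the abstract above, fits a straight line when both variables carry measurement error. A minimal generic sketch follows, assuming a known ratio `delta` of the y-error variance to the x-error variance (1.0 gives orthogonal regression). It is not the study's batch-correction pipeline, and the replicate values are invented for illustration.

```python
import math


def deming(x, y, delta=1.0):
    """Return (slope, intercept) of a Deming regression line.

    delta is the assumed ratio of the error variance in y to that in x.
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x) / (n - 1)
    syy = sum((yi - ybar) ** 2 for yi in y) / (n - 1)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / (n - 1)
    if sxy == 0:
        raise ValueError("x and y are uncorrelated; slope is undefined")
    diff = syy - delta * sxx
    slope = (diff + math.sqrt(diff ** 2 + 4 * delta * sxy ** 2)) / (2 * sxy)
    return slope, ybar - slope * xbar


# Hypothetical replicate measurements of the same samples in two batches.
batch_a = [1.0, 2.0, 3.0, 4.0, 5.0]
batch_b = [3.0, 5.0, 7.0, 9.0, 11.0]
slope, intercept = deming(batch_a, batch_b)
print(f"batch_b = {slope:.2f} * batch_a + {intercept:.2f}")
```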
Costa, Patrício; de Carvalho-Filho, Marco Antonio; Schweller, Marcelo; Thiemann, Pia; Salgueira, Ana; Benson, John; Costa, Manuel João; Quince, Thelma
2017-06-01
Understanding medical student empathy is important to future patient care; however, the definition and development of clinical empathy remain unclear. The authors sought to examine the underlying constructs of two of the most widely used self-report instruments, Davis's Interpersonal Reactivity Index (IRI) and the Jefferson Scale of Empathy version for medical students (JSE-S), as well as the distinctions and associations between these instruments. Between 2007 and 2014, the authors administered the IRI and JSE-S in three separate studies in five countries (Brazil, Ireland, New Zealand, Portugal, and the United Kingdom). They collected data from 3,069 undergraduate medical students and performed exploratory factor analyses, correlation analyses, and multiple linear regression analyses. Exploratory factor analysis yielded identical results in each country, confirming the subscale structures of each instrument. Results of correlation analyses indicated significant but weak correlations (r = 0.313) between the total IRI and JSE-S scores. All intercorrelations of IRI and JSE-S subscale scores were statistically significant but weak (range r = -0.040 to 0.306). Multiple linear regression models revealed that the IRI subscales were weak predictors of all JSE-S subscale and total scores. The IRI subscales explained between 9.0% and 15.3% of variance for JSE-S subscales and 19.5% for JSE-S total score. The IRI and JSE-S are only weakly related, suggesting that they may measure different constructs. To better understand this distinction, more studies using both instruments and involving students at different stages in their medical education, as well as more longitudinal and qualitative studies, are needed.
Regidor, Enrique; Pascual, Cruz; Giráldez-García, Carolina; Galindo, Silvia; Martínez, David; Kunst, Anton E
2015-12-01
To evaluate the effect of tobacco prices and the implementation of smoke-free legislation on smoking cessation in Spain, by educational level, across the period 1993-2012. National Health Surveys data for the above two decades were used to calculate smoking cessation in people aged 25-64 years. The relationship between tobacco prices and smoking quit-ratio was estimated using multiple linear regression adjusted for time and the presence of smoke-free legislation. The immediate as well as the longer-term impact of the 2006 smoke-free law on quit-ratio was estimated using segmented linear regression analysis. The analyses were performed separately in men and women with high and low education, respectively. No relationship was observed between tobacco prices and smoking quit-ratio, except in women having a low educational level, among whom a rise in price was associated with a decrease in quit-ratio. The smoke-free law altered the smoking quit-ratio in the short term and altered also pre-existing trends. Smoking quit-ratio increased immediately after the ban - though this increase was significant only among women with a low educational level - and then decreased in subsequent years except among men with a high educational level. A clear relationship between tobacco prices and smoking quit-ratio was not observed in a recent period. After the implementation of smoke-free legislation the trend in the quit ratio in most of the socio-economic groups was different from the trend observed before implementation, so existing inequalities in smoking quit-ratio were not widened or narrowed. Copyright © 2015 Elsevier B.V. All rights reserved.
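Segmented linear regression of the kind mentioned in the abstract above is often set up as an interrupted time series: a baseline trend, a step term for the immediate change at the break, and a slope-change term for the longer-term effect. The sketch below assumes annual observations for 1993-2012 with a break at 2006 (both taken from the abstract). The coefficient values and the generated series are invented, and this is not the authors' model specification.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def design(years, break_year):
    """Columns: intercept, time, post-break step, post-break slope change."""
    rows = []
    for yr in years:
        post = 1.0 if yr >= break_year else 0.0
        rows.append([1.0, float(yr - years[0]), post, post * (yr - break_year)])
    return rows


years = list(range(1993, 2013))
rows = design(years, 2006)
true_coef = [30.0, 0.5, 2.0, -0.8]  # invented values, noise-free for clarity
quit_ratio = [sum(c * v for c, v in zip(true_coef, r)) for r in rows]
coef = ols(rows, quit_ratio)
print("level change:", round(coef[2], 3), "slope change:", round(coef[3], 3))
```

The third coefficient estimates the immediate shift at the break and the fourth the change in trend afterwards, which mirrors the abstract's distinction between immediate and longer-term impact.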
Sörös, Peter; Bachmann, Katharina; Lam, Alexandra P; Kanat, Manuela; Hoxhaj, Eliza; Matthies, Swantje; Feige, Bernd; Müller, Helge H O; Thiel, Christiane; Philipsen, Alexandra
2017-01-01
Attention-deficit/hyperactivity disorder (ADHD) in adulthood is a serious and frequent psychiatric disorder with the core symptoms inattention, impulsivity, and hyperactivity. The principal aim of this study was to investigate associations between brain morphology, i.e., cortical thickness and volumes of subcortical gray matter, and individual symptom severity in adult ADHD. Surface-based brain morphometry was performed in 35 women and 29 men with ADHD using FreeSurfer. Linear regressions were calculated between cortical thickness and the volumes of subcortical gray matter and the inattention, hyperactivity, and impulsivity subscales of the Conners Adult ADHD Rating Scales (CAARS). Two separate analyses were performed. For the first analysis, age was included as an additional regressor. For the second analysis, both age and severity of depression were included as additional regressors. Study participants were recruited between June 2012 and January 2014. Linear regression identified an area in the left occipital cortex of men, covering parts of the middle occipital sulcus and gyrus, in which the score on the CAARS inattention subscale predicted increased mean cortical thickness [F(1,27) = 26.27, p < 0.001, adjusted R² = 0.4744]. No significant associations were found between cortical thickness and the scores on CAARS subscales in women. No significant associations were found between the volumes of subcortical gray matter and the scores on CAARS subscales in either men or women. These results remained stable when severity of depression was included as an additional regressor, together with age. Increased cortical thickness in the left occipital cortex may represent a mechanism to compensate for dysfunctional attentional networks in male adult ADHD patients.
NASA Astrophysics Data System (ADS)
Reyer, D.; Philipp, S. L.
2014-09-01
Information about geomechanical and physical rock properties, particularly uniaxial compressive strength (UCS), is needed for geomechanical model development and updating with logging-while-drilling methods to minimise costs and risks of the drilling process. The following parameters with importance at different stages of geothermal exploitation and drilling are presented for typical sedimentary and volcanic rocks of the Northwest German Basin (NWGB): physical (P wave velocities, porosity, and bulk and grain density) and geomechanical parameters (UCS, static Young's modulus, destruction work and indirect tensile strength both perpendicular and parallel to bedding) for 35 rock samples from quarries and 14 core samples of sandstones and carbonate rocks. With regression analyses (linear and non-linear), empirical relations are developed to predict UCS values from all other parameters. Analyses focus on sedimentary rocks and were repeated separately for clastic rock samples or carbonate rock samples as well as for outcrop samples or core samples. Empirical relations have high statistical significance for Young's modulus, tensile strength and destruction work; for physical properties, there is a wider scatter of data and prediction of UCS is less precise. For most relations, properties of core samples plot within the scatter of outcrop samples and lie within the 90% prediction bands of developed regression functions. The results indicate the applicability of empirical relations that are based on outcrop data to questions related to drilling operations when the database contains a sufficient number of samples with varying rock properties. The presented equations may help to predict UCS values for sedimentary rocks at depth, and thus develop suitable geomechanical models for the adaptation of the drilling strategy to rock mechanical conditions in the NWGB.
School league tables: a new population based predictor of dental restorative treatment need.
Crowley, Evelyn; O'Brien, Graham; Marcenes, Wagner
2003-06-01
To test whether dental restorative treatment need was related to the school league tables and level of social deprivation of the school ward. An ecological study using clinical data aggregated at school level, collected in the school dental screening examinations (1996-97), National Census (1991) and the results of the UK school league tables--Key Stage 2 SATs (1996-97). State primary schools in the Greenwich District of SE London, UK (1996-97). 12,854 pupils (6-11 years of age) in 62 schools. The percentage of 6 to 11 year old pupils per school requiring dental restorative treatment. Deprivation as measured by the overall Jarman Under Privileged Area Index (UPA) of the school ward was not associated with dental restorative treatment need (p > 0.05). Only two components of the Jarman Index, level of unemployment and the number of lone parent families in the school ward were found to be significantly associated with dental restorative treatment need (p < 0.05). Results of stepwise multiple linear regression analysis showed that the association with the school league table results in all three subjects, English, Mathematics and Science remained statistically significant after adjusting for levels of unemployment and single parents. Results of multiple linear regression analysis showed that a high level of dental restorative treatment need was significantly associated with poor school league table results in English, Mathematics and Science (p < 0.05) after adjusting for the overall Jarman score of the school ward. A separate analysis for the 11-year-old pupils aggregated by school (n = 46 schools) gave similar results. Aggregate measures of academic achievement may be a potential indicator of dental restorative treatment need.
Fasting time and vitamin B12 levels in a community-based population.
Orton, Dennis J; Naugler, Christopher; Sadrzadeh, S M Hossein
2016-07-01
Vitamin B12, also known as cobalamin (Cbl), is an essential vitamin whose deficiency manifests with numerous severe but non-specific symptoms. Assessing Cbl status often requires fasting, although this requirement is not standard between institutions. This study evaluated the impact of fasting on Cbl levels in a large community-based cohort in an effort to promote standardization of Cbl testing between sites. Laboratory data for Cbl, fasting time, patient age and sex were obtained from the laboratory information service of Calgary Laboratory Services (CLS) for the period of April 2011 to June 2015. CLS is the sole supplier of laboratory services in the Southern Alberta region in Canada (population, approximately 1.4 million). To investigate potential sex-specific effects of fasting on Cbl levels, males and females were analyzed separately using linear regression models. A total of 346,957 individual patient results (196,849 females, 146,085 males) were obtained. The mean plasma Cbl level was 386.5 (±195.6) pmol/L and 412.0 (±220.8) pmol/L for males and females, respectively. Linear regression analysis showed fasting had no significant association with Cbl levels in females; however, a statistically significant decrease of 0.9 pmol/L per hour of fasting (p < 0.001) was noted in males. The broad population variance in Cbl suggests the slight gender-specific differences noted in this study are insignificant. Despite this, fasting has the potential to contribute to higher rates of Cbl deficiency in men. Together, these data suggest fasting should be excluded as a requirement for evaluating plasma Cbl. Copyright © 2016 Elsevier B.V. All rights reserved.
Kumar, K Vasanth
2006-10-11
Batch kinetic experiments were carried out for the sorption of methylene blue onto activated carbon. The experimental kinetics were fitted to the pseudo first-order and pseudo second-order kinetic models by a linear and a non-linear method. The five different types of Ho pseudo second-order expression are discussed. The linear least-squares method and a trial-and-error non-linear method of estimating the pseudo second-order rate kinetic parameters were compared. The sorption process was found to follow both the pseudo first-order and the pseudo second-order kinetic models. The present investigation showed that it is inappropriate to use the type 1 and type pseudo second-order expressions, as proposed by Ho and by Blanchard et al. respectively, for predicting the kinetic rate constants and the initial sorption rate for the studied system. Three possible alternative linear expressions (type 2 to type 4) that better predict the initial sorption rate and kinetic rate constants for the studied system (methylene blue/activated carbon) were proposed. The linear method was found to check only the hypothesis instead of verifying the kinetic model. The non-linear regression method was found to be the more appropriate method to determine the rate kinetic parameters.
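The pseudo second-order model is commonly written as q_t = k * q_e^2 * t / (1 + k * q_e * t), and one widely used linearization plots t/q_t against t, giving slope 1/q_e and intercept 1/(k * q_e^2). The sketch below illustrates only that linearized fit, on noise-free synthetic data generated from assumed parameter values; it does not reproduce the paper's data or its comparison of the five expression types.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, ybar - slope * xbar


def pso_uptake(t, k, qe):
    """Pseudo second-order uptake q_t at time t."""
    return k * qe ** 2 * t / (1.0 + k * qe * t)


# Assumed "true" parameters, used only to generate synthetic data.
k_true, qe_true = 0.05, 2.0
times = [5.0, 10.0, 20.0, 40.0, 80.0]
q = [pso_uptake(t, k_true, qe_true) for t in times]

# Linearized fit: t/q_t = 1/(k*qe^2) + t/qe
slope, intercept = fit_line(times, [t / qt for t, qt in zip(times, q)])
qe_hat = 1.0 / slope
k_hat = slope ** 2 / intercept
print(f"qe = {qe_hat:.3f}, k = {k_hat:.4f}")
```

With real, noisy measurements the transformation t/q_t also transforms the error structure, which is one reason a direct non-linear fit can give different parameter estimates from a linearized one.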
Adjusted variable plots for Cox's proportional hazards regression model.
Hall, C B; Zeger, S L; Bandeen-Roche, K J
1996-01-01
Adjusted variable plots are useful in linear regression for outlier detection and for qualitative evaluation of the fit of a model. In this paper, we extend adjusted variable plots to Cox's proportional hazards model for possibly censored survival data. We propose three different plots: a risk level adjusted variable (RLAV) plot in which each observation in each risk set appears, a subject level adjusted variable (SLAV) plot in which each subject is represented by one point, and an event level adjusted variable (ELAV) plot in which the entire risk set at each failure event is represented by a single point. The latter two plots are derived from the RLAV by combining multiple points. In each plot, the regression coefficient and standard error from a Cox proportional hazards regression are obtained by a simple linear regression through the origin fit to the coordinates of the pictured points. The plots are illustrated with a reanalysis of a dataset of 65 patients with multiple myeloma.
NASA Astrophysics Data System (ADS)
Sahabiev, I. A.; Ryazanov, S. S.; Kolcova, T. G.; Grigoryan, B. R.
2018-03-01
The three most common techniques to interpolate soil properties at a field scale—ordinary kriging (OK), regression kriging with multiple linear regression drift model (RK + MLR), and regression kriging with principal component regression drift model (RK + PCR)—were examined. The results of the performed study were compiled into an algorithm of choosing the most appropriate soil mapping technique. Relief attributes were used as the auxiliary variables. When spatial dependence of a target variable was strong, the OK method showed more accurate interpolation results, and the inclusion of the auxiliary data resulted in an insignificant improvement in prediction accuracy. According to the algorithm, the RK + PCR method effectively eliminates multicollinearity of explanatory variables. However, if the number of predictors is less than ten, the probability of multicollinearity is reduced, and application of the PCR becomes irrational. In that case, the multiple linear regression should be used instead.
Jupiter, Daniel C
2012-01-01
In this first of a series of statistical methodology commentaries for the clinician, we discuss the use of multivariate linear regression. Copyright © 2012 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.
An evaluation of bias in propensity score-adjusted non-linear regression models.
Wan, Fei; Mitra, Nandita
2018-03-01
Propensity score methods are commonly used to adjust for observed confounding when estimating the conditional treatment effect in observational studies. One popular method, covariate adjustment of the propensity score in a regression model, has been empirically shown to be biased in non-linear models. However, no compelling underlying theoretical reason has been presented. We propose a new framework to investigate bias and consistency of propensity score-adjusted treatment effects in non-linear models that uses a simple geometric approach to forge a link between the consistency of the propensity score estimator and the collapsibility of non-linear models. Under this framework, we demonstrate that adjustment of the propensity score in an outcome model results in the decomposition of observed covariates into the propensity score and a remainder term. Omission of this remainder term from a non-collapsible regression model leads to biased estimates of the conditional odds ratio and conditional hazard ratio, but not for the conditional rate ratio. We further show, via simulation studies, that the bias in these propensity score-adjusted estimators increases with larger treatment effect size, larger covariate effects, and increasing dissimilarity between the coefficients of the covariates in the treatment model versus the outcome model.
Estimation of stature using hand and foot dimensions in Slovak adults.
Uhrová, Petra; Beňuš, Radoslav; Masnicová, Soňa; Obertová, Zuzana; Kramárová, Daniela; Kyselicová, Klaudia; Dörnhöferová, Michaela; Bodoriková, Silvia; Neščáková, Eva
2015-03-01
Hand and foot dimensions used for stature estimation help to formulate a biological profile in the process of personal identification. Morphological variability of hands and feet shows the importance of generating population-specific equations to estimate stature. The stature, hand length, hand breadth, foot length and foot breadth of 250 young Slovak males and females, aged 18-24 years, were measured according to standard anthropometric procedures. The data were statistically analyzed using an independent t-test for sex and bilateral differences. The Pearson correlation coefficient was used for assessing the relationship between stature and hand/foot parameters, and subsequently linear regression analysis was used to estimate stature. The results revealed significant sex differences in hand and foot dimensions as well as in stature (p < 0.05). There was a positive and statistically significant correlation between stature and all measurements in both sexes (p < 0.01). The highest correlation coefficient was found for foot length in males (r = 0.71) as well as in females (r = 0.63). Regression equations were computed separately for each sex. The accuracy of stature prediction ranged from ±4.6 to ±6.1 cm. The results of this study indicate that hand and foot dimensions can be used to estimate stature in the Slovak population for forensic purposes. The regression equations can be of use for stature estimation particularly in cases of dismembered bodies. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
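A stature equation of the kind described above is a simple linear regression of stature on one dimension, typically reported together with a standard error of estimate (SEE) that conveys prediction accuracy. A minimal sketch follows, assuming the SEE is computed as sqrt(SSR / (n - 2)); the four measurements are invented for illustration and are not data or equations from the study.

```python
import math


def fit_with_see(x, y):
    """Return (slope, intercept, standard error of estimate) for y on x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    ssr = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, math.sqrt(ssr / (n - 2))


# Hypothetical foot lengths and statures, both in cm.
foot_length = [24.0, 25.0, 26.0, 27.0]
stature = [173.0, 174.0, 177.0, 182.0]
slope, intercept, see = fit_with_see(foot_length, stature)
print(f"stature = {intercept:.1f} + {slope:.2f} * foot length (SEE {see:.2f} cm)")
```

Fitting the function separately on male and female subsets would give sex-specific equations, in line with the abstract's approach.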
Modification of the USLE K factor for soil erodibility assessment on calcareous soils in Iran
NASA Astrophysics Data System (ADS)
Ostovari, Yaser; Ghorbani-Dashtaki, Shoja; Bahrami, Hossein-Ali; Naderi, Mehdi; Dematte, Jose Alexandre M.; Kerry, Ruth
2016-11-01
The measurement of soil erodibility (K) in the field is tedious, time-consuming and expensive; therefore, its prediction through pedotransfer functions (PTFs) could be far less costly and time-consuming. The aim of this study was to develop new PTFs to estimate the K factor using multiple linear regression, Mamdani fuzzy inference systems, and artificial neural networks. For this purpose, K was measured in 40 erosion plots with natural rainfall. Various soil properties including the soil particle size distribution, calcium carbonate equivalent, organic matter, permeability, and wet-aggregate stability were measured. The results showed that the mean measured K was 0.014 t h MJ⁻¹ mm⁻¹ and 2.08 times less than the estimated mean K (0.030 t h MJ⁻¹ mm⁻¹) using the USLE model. Permeability, wet-aggregate stability, very fine sand, and calcium carbonate were selected as independent variables by forward stepwise regression in order to assess the ability of multiple linear regression, Mamdani fuzzy inference systems and artificial neural networks to predict K. The calcium carbonate equivalent, which is not accounted for in the USLE model, had a significant impact on K in multiple linear regression due to its strong influence on the stability of aggregates and soil permeability. Statistical indices in validation and calibration datasets determined that the artificial neural networks method with the highest R², lowest RMSE, and lowest ME was the best model for estimating the K factor. A strong correlation (R² = 0.81, n = 40, p < 0.05) between the estimated K from multiple linear regression and measured K indicates that the use of calcium carbonate equivalent as a predictor variable gives a better estimation of K in areas with calcareous soils.