Non-Asymptotic Oracle Inequalities for the High-Dimensional Cox Regression via Lasso.
Kong, Shengchun; Nan, Bin
2014-01-01
We consider finite sample properties of the regularized high-dimensional Cox regression via lasso. Existing literature focuses on linear models or generalized linear models with Lipschitz loss functions, where the empirical risk functions are the summations of independent and identically distributed (iid) losses. The summands in the negative log partial likelihood function for censored survival data, however, are neither iid nor Lipschitz.We first approximate the negative log partial likelihood function by a sum of iid non-Lipschitz terms, then derive the non-asymptotic oracle inequalities for the lasso penalized Cox regression using pointwise arguments to tackle the difficulties caused by lacking iid Lipschitz losses.
Non-Asymptotic Oracle Inequalities for the High-Dimensional Cox Regression via Lasso
Kong, Shengchun; Nan, Bin
2013-01-01
We consider finite sample properties of the regularized high-dimensional Cox regression via lasso. Existing literature focuses on linear models or generalized linear models with Lipschitz loss functions, where the empirical risk functions are the summations of independent and identically distributed (iid) losses. The summands in the negative log partial likelihood function for censored survival data, however, are neither iid nor Lipschitz.We first approximate the negative log partial likelihood function by a sum of iid non-Lipschitz terms, then derive the non-asymptotic oracle inequalities for the lasso penalized Cox regression using pointwise arguments to tackle the difficulties caused by lacking iid Lipschitz losses. PMID:24516328
Sparse partial least squares regression for simultaneous dimension reduction and variable selection
Chun, Hyonho; Keleş, Sündüz
2010-01-01
Partial least squares regression has been an alternative to ordinary least squares for handling multicollinearity in several areas of scientific research since the 1960s. It has recently gained much attention in the analysis of high dimensional genomic data. We show that known asymptotic consistency of the partial least squares estimator for a univariate response does not hold with the very large p and small n paradigm. We derive a similar result for a multivariate response regression with partial least squares. We then propose a sparse partial least squares formulation which aims simultaneously to achieve good predictive performance and variable selection by producing sparse linear combinations of the original predictors. We provide an efficient implementation of sparse partial least squares regression and compare it with well-known variable selection and dimension reduction approaches via simulation experiments. We illustrate the practical utility of sparse partial least squares regression in a joint analysis of gene expression and genomewide binding data. PMID:20107611
Sparse brain network using penalized linear regression
NASA Astrophysics Data System (ADS)
Lee, Hyekyoung; Lee, Dong Soo; Kang, Hyejin; Kim, Boong-Nyun; Chung, Moo K.
2011-03-01
Sparse partial correlation is a useful connectivity measure for brain networks when it is difficult to compute the exact partial correlation in the small-n large-p setting. In this paper, we formulate the problem of estimating partial correlation as a sparse linear regression with a l1-norm penalty. The method is applied to brain network consisting of parcellated regions of interest (ROIs), which are obtained from FDG-PET images of the autism spectrum disorder (ASD) children and the pediatric control (PedCon) subjects. To validate the results, we check their reproducibilities of the obtained brain networks by the leave-one-out cross validation and compare the clustered structures derived from the brain networks of ASD and PedCon.
Zhang, Hanze; Huang, Yangxin; Wang, Wei; Chen, Henian; Langland-Orban, Barbara
2017-01-01
In longitudinal AIDS studies, it is of interest to investigate the relationship between HIV viral load and CD4 cell counts, as well as the complicated time effect. Most of common models to analyze such complex longitudinal data are based on mean-regression, which fails to provide efficient estimates due to outliers and/or heavy tails. Quantile regression-based partially linear mixed-effects models, a special case of semiparametric models enjoying benefits of both parametric and nonparametric models, have the flexibility to monitor the viral dynamics nonparametrically and detect the varying CD4 effects parametrically at different quantiles of viral load. Meanwhile, it is critical to consider various data features of repeated measurements, including left-censoring due to a limit of detection, covariate measurement error, and asymmetric distribution. In this research, we first establish a Bayesian joint models that accounts for all these data features simultaneously in the framework of quantile regression-based partially linear mixed-effects models. The proposed models are applied to analyze the Multicenter AIDS Cohort Study (MACS) data. Simulation studies are also conducted to assess the performance of the proposed methods under different scenarios.
Balabin, Roman M; Smirnov, Sergey V
2011-04-29
During the past several years, near-infrared (near-IR/NIR) spectroscopy has increasingly been adopted as an analytical tool in various fields from petroleum to biomedical sectors. The NIR spectrum (above 4000 cm(-1)) of a sample is typically measured by modern instruments at a few hundred of wavelengths. Recently, considerable effort has been directed towards developing procedures to identify variables (wavelengths) that contribute useful information. Variable selection (VS) or feature selection, also called frequency selection or wavelength selection, is a critical step in data analysis for vibrational spectroscopy (infrared, Raman, or NIRS). In this paper, we compare the performance of 16 different feature selection methods for the prediction of properties of biodiesel fuel, including density, viscosity, methanol content, and water concentration. The feature selection algorithms tested include stepwise multiple linear regression (MLR-step), interval partial least squares regression (iPLS), backward iPLS (BiPLS), forward iPLS (FiPLS), moving window partial least squares regression (MWPLS), (modified) changeable size moving window partial least squares (CSMWPLS/MCSMWPLSR), searching combination moving window partial least squares (SCMWPLS), successive projections algorithm (SPA), uninformative variable elimination (UVE, including UVE-SPA), simulated annealing (SA), back-propagation artificial neural networks (BP-ANN), Kohonen artificial neural network (K-ANN), and genetic algorithms (GAs, including GA-iPLS). Two linear techniques for calibration model building, namely multiple linear regression (MLR) and partial least squares regression/projection to latent structures (PLS/PLSR), are used for the evaluation of biofuel properties. A comparison with a non-linear calibration model, artificial neural networks (ANN-MLP), is also provided. Discussion of gasoline, ethanol-gasoline (bioethanol), and diesel fuel data is presented. The results of other spectroscopic techniques application, such as Raman, ultraviolet-visible (UV-vis), or nuclear magnetic resonance (NMR) spectroscopies, can be greatly improved by an appropriate feature selection choice. Copyright © 2011 Elsevier B.V. All rights reserved.
Robust Head-Pose Estimation Based on Partially-Latent Mixture of Linear Regressions.
Drouard, Vincent; Horaud, Radu; Deleforge, Antoine; Ba, Sileye; Evangelidis, Georgios
2017-03-01
Head-pose estimation has many applications, such as social event analysis, human-robot and human-computer interaction, driving assistance, and so forth. Head-pose estimation is challenging, because it must cope with changing illumination conditions, variabilities in face orientation and in appearance, partial occlusions of facial landmarks, as well as bounding-box-to-face alignment errors. We propose to use a mixture of linear regressions with partially-latent output. This regression method learns to map high-dimensional feature vectors (extracted from bounding boxes of faces) onto the joint space of head-pose angles and bounding-box shifts, such that they are robustly predicted in the presence of unobservable phenomena. We describe in detail the mapping method that combines the merits of unsupervised manifold learning techniques and of mixtures of regressions. We validate our method with three publicly available data sets and we thoroughly benchmark four variants of the proposed algorithm with several state-of-the-art head-pose estimation methods.
Hemmila, April; McGill, Jim; Ritter, David
2008-03-01
To determine if changes in fingerprint infrared spectra linear with age can be found, partial least squares (PLS1) regression of 155 fingerprint infrared spectra against the person's age was constructed. The regression produced a linear model of age as a function of spectrum with a root mean square error of calibration of less than 4 years, showing an inflection at about 25 years of age. The spectral ranges emphasized by the regression do not correspond to the highest concentration constituents of the fingerprints. Separate linear regression models for old and young people can be constructed with even more statistical rigor. The success of the regression demonstrates that a combination of constituents can be found that changes linearly with age, with a significant shift around puberty.
Simultaneous spectrophotometric determination of salbutamol and bromhexine in tablets.
Habib, I H I; Hassouna, M E M; Zaki, G A
2005-03-01
Typical anti-mucolytic drugs called salbutamol hydrochloride and bromhexine sulfate encountered in tablets were determined simultaneously either by using linear regression at zero-crossing wavelengths of the first derivation of UV-spectra or by application of multiple linear partial least squares regression method. The results obtained by the two proposed mathematical methods were compared with those obtained by the HPLC technique.
Chaurasia, Ashok; Harel, Ofer
2015-02-10
Tests for regression coefficients such as global, local, and partial F-tests are common in applied research. In the framework of multiple imputation, there are several papers addressing tests for regression coefficients. However, for simultaneous hypothesis testing, the existing methods are computationally intensive because they involve calculation with vectors and (inversion of) matrices. In this paper, we propose a simple method based on the scalar entity, coefficient of determination, to perform (global, local, and partial) F-tests with multiply imputed data. The proposed method is evaluated using simulated data and applied to suicide prevention data. Copyright © 2014 John Wiley & Sons, Ltd.
Enhancement of partial robust M-regression (PRM) performance using Bisquare weight function
NASA Astrophysics Data System (ADS)
Mohamad, Mazni; Ramli, Norazan Mohamed; Ghani@Mamat, Nor Azura Md; Ahmad, Sanizah
2014-09-01
Partial Least Squares (PLS) regression is a popular regression technique for handling multicollinearity in low and high dimensional data which fits a linear relationship between sets of explanatory and response variables. Several robust PLS methods are proposed to accommodate the classical PLS algorithms which are easily affected with the presence of outliers. The recent one was called partial robust M-regression (PRM). Unfortunately, the use of monotonous weighting function in the PRM algorithm fails to assign appropriate and proper weights to large outliers according to their severity. Thus, in this paper, a modified partial robust M-regression is introduced to enhance the performance of the original PRM. A re-descending weight function, known as Bisquare weight function is recommended to replace the fair function in the PRM. A simulation study is done to assess the performance of the modified PRM and its efficiency is also tested in both contaminated and uncontaminated simulated data under various percentages of outliers, sample sizes and number of predictors.
On vertical profile of ozone at Syowa
NASA Technical Reports Server (NTRS)
Chubachi, Shigeru
1994-01-01
The difference in the vertical ozone profile at Syowa between 1966-1981 and 1982-1988 is shown. The month-height cross section of the slope of the linear regressions between ozone partial pressure and 100-mb temperature is also shown. The vertically integrated values of the slopes are in close agreement with the slopes calculated by linear regression of Dobson total ozone on 100-mb temperature in the period of 1982-1988.
Golmohammadi, Hassan
2009-11-30
A quantitative structure-property relationship (QSPR) study was performed to develop models those relate the structure of 141 organic compounds to their octanol-water partition coefficients (log P(o/w)). A genetic algorithm was applied as a variable selection tool. Modeling of log P(o/w) of these compounds as a function of theoretically derived descriptors was established by multiple linear regression (MLR), partial least squares (PLS), and artificial neural network (ANN). The best selected descriptors that appear in the models are: atomic charge weighted partial positively charged surface area (PPSA-3), fractional atomic charge weighted partial positive surface area (FPSA-3), minimum atomic partial charge (Qmin), molecular volume (MV), total dipole moment of molecule (mu), maximum antibonding contribution of a molecule orbital in the molecule (MAC), and maximum free valency of a C atom in the molecule (MFV). The result obtained showed the ability of developed artificial neural network to prediction of partition coefficients of organic compounds. Also, the results revealed the superiority of ANN over the MLR and PLS models. Copyright 2009 Wiley Periodicals, Inc.
Shimizu, Yu; Yoshimoto, Junichiro; Takamura, Masahiro; Okada, Go; Okamoto, Yasumasa; Yamawaki, Shigeto; Doya, Kenji
2017-01-01
In diagnostic applications of statistical machine learning methods to brain imaging data, common problems include data high-dimensionality and co-linearity, which often cause over-fitting and instability. To overcome these problems, we applied partial least squares (PLS) regression to resting-state functional magnetic resonance imaging (rs-fMRI) data, creating a low-dimensional representation that relates symptoms to brain activity and that predicts clinical measures. Our experimental results, based upon data from clinically depressed patients and healthy controls, demonstrated that PLS and its kernel variants provided significantly better prediction of clinical measures than ordinary linear regression. Subsequent classification using predicted clinical scores distinguished depressed patients from healthy controls with 80% accuracy. Moreover, loading vectors for latent variables enabled us to identify brain regions relevant to depression, including the default mode network, the right superior frontal gyrus, and the superior motor area. PMID:28700672
Tu, Yu-Kang; Krämer, Nicole; Lee, Wen-Chung
2012-07-01
In the analysis of trends in health outcomes, an ongoing issue is how to separate and estimate the effects of age, period, and cohort. As these 3 variables are perfectly collinear by definition, regression coefficients in a general linear model are not unique. In this tutorial, we review why identification is a problem, and how this problem may be tackled using partial least squares and principal components regression analyses. Both methods produce regression coefficients that fulfill the same collinearity constraint as the variables age, period, and cohort. We show that, because the constraint imposed by partial least squares and principal components regression is inherent in the mathematical relation among the 3 variables, this leads to more interpretable results. We use one dataset from a Taiwanese health-screening program to illustrate how to use partial least squares regression to analyze the trends in body heights with 3 continuous variables for age, period, and cohort. We then use another dataset of hepatocellular carcinoma mortality rates for Taiwanese men to illustrate how to use partial least squares regression to analyze tables with aggregated data. We use the second dataset to show the relation between the intrinsic estimator, a recently proposed method for the age-period-cohort analysis, and partial least squares regression. We also show that the inclusion of all indicator variables provides a more consistent approach. R code for our analyses is provided in the eAppendix.
Francisco, Fabiane Lacerda; Saviano, Alessandro Morais; Almeida, Túlia de Souza Botelho; Lourenço, Felipe Rebello
2016-05-01
Microbiological assays are widely used to estimate the relative potencies of antibiotics in order to guarantee the efficacy, safety, and quality of drug products. Despite of the advantages of turbidimetric bioassays when compared to other methods, it has limitations concerning the linearity and range of the dose-response curve determination. Here, we proposed to use partial least squares (PLS) regression to solve these limitations and to improve the prediction of relative potencies of antibiotics. Kinetic-reading microplate turbidimetric bioassays for apramacyin and vancomycin were performed using Escherichia coli (ATCC 8739) and Bacillus subtilis (ATCC 6633), respectively. Microbial growths were measured as absorbance up to 180 and 300min for apramycin and vancomycin turbidimetric bioassays, respectively. Conventional dose-response curves (absorbances or area under the microbial growth curve vs. log of antibiotic concentration) showed significant regression, however there were significant deviation of linearity. Thus, they could not be used for relative potency estimations. PLS regression allowed us to construct a predictive model for estimating the relative potencies of apramycin and vancomycin without over-fitting and it improved the linear range of turbidimetric bioassay. In addition, PLS regression provided predictions of relative potencies equivalent to those obtained from agar diffusion official methods. Therefore, we conclude that PLS regression may be used to estimate the relative potencies of antibiotics with significant advantages when compared to conventional dose-response curve determination. Copyright © 2016 Elsevier B.V. All rights reserved.
London Measure of Unplanned Pregnancy: guidance for its use as an outcome measure
Hall, Jennifer A; Barrett, Geraldine; Copas, Andrew; Stephenson, Judith
2017-01-01
Background The London Measure of Unplanned Pregnancy (LMUP) is a psychometrically validated measure of the degree of intention of a current or recent pregnancy. The LMUP is increasingly being used worldwide, and can be used to evaluate family planning or preconception care programs. However, beyond recommending the use of the full LMUP scale, there is no published guidance on how to use the LMUP as an outcome measure. Ordinal logistic regression has been recommended informally, but studies published to date have all used binary logistic regression and dichotomized the scale at different cut points. There is thus a need for evidence-based guidance to provide a standardized methodology for multivariate analysis and to enable comparison of results. This paper makes recommendations for the regression method for analysis of the LMUP as an outcome measure. Materials and methods Data collected from 4,244 pregnant women in Malawi were used to compare five regression methods: linear, logistic with two cut points, and ordinal logistic with either the full or grouped LMUP score. The recommendations were then tested on the original UK LMUP data. Results There were small but no important differences in the findings across the regression models. Logistic regression resulted in the largest loss of information, and assumptions were violated for the linear and ordinal logistic regression. Consequently, robust standard errors were used for linear regression and a partial proportional odds ordinal logistic regression model attempted. The latter could only be fitted for grouped LMUP score. Conclusion We recommend the linear regression model with robust standard errors to make full use of the LMUP score when analyzed as an outcome measure. Ordinal logistic regression could be considered, but a partial proportional odds model with grouped LMUP score may be required. Logistic regression is the least-favored option, due to the loss of information. For logistic regression, the cut point for un/planned pregnancy should be between nine and ten. These recommendations will standardize the analysis of LMUP data and enhance comparability of results across studies. PMID:28435343
Doubly robust estimation of generalized partial linear models for longitudinal data with dropouts.
Lin, Huiming; Fu, Bo; Qin, Guoyou; Zhu, Zhongyi
2017-12-01
We develop a doubly robust estimation of generalized partial linear models for longitudinal data with dropouts. Our method extends the highly efficient aggregate unbiased estimating function approach proposed in Qu et al. (2010) to a doubly robust one in the sense that under missing at random (MAR), our estimator is consistent when either the linear conditional mean condition is satisfied or a model for the dropout process is correctly specified. We begin with a generalized linear model for the marginal mean, and then move forward to a generalized partial linear model, allowing for nonparametric covariate effect by using the regression spline smoothing approximation. We establish the asymptotic theory for the proposed method and use simulation studies to compare its finite sample performance with that of Qu's method, the complete-case generalized estimating equation (GEE) and the inverse-probability weighted GEE. The proposed method is finally illustrated using data from a longitudinal cohort study. © 2017, The International Biometric Society.
Shah, A A; Xing, W W; Triantafyllidis, V
2017-04-01
In this paper, we develop reduced-order models for dynamic, parameter-dependent, linear and nonlinear partial differential equations using proper orthogonal decomposition (POD). The main challenges are to accurately and efficiently approximate the POD bases for new parameter values and, in the case of nonlinear problems, to efficiently handle the nonlinear terms. We use a Bayesian nonlinear regression approach to learn the snapshots of the solutions and the nonlinearities for new parameter values. Computational efficiency is ensured by using manifold learning to perform the emulation in a low-dimensional space. The accuracy of the method is demonstrated on a linear and a nonlinear example, with comparisons with a global basis approach.
Xing, W. W.; Triantafyllidis, V.
2017-01-01
In this paper, we develop reduced-order models for dynamic, parameter-dependent, linear and nonlinear partial differential equations using proper orthogonal decomposition (POD). The main challenges are to accurately and efficiently approximate the POD bases for new parameter values and, in the case of nonlinear problems, to efficiently handle the nonlinear terms. We use a Bayesian nonlinear regression approach to learn the snapshots of the solutions and the nonlinearities for new parameter values. Computational efficiency is ensured by using manifold learning to perform the emulation in a low-dimensional space. The accuracy of the method is demonstrated on a linear and a nonlinear example, with comparisons with a global basis approach. PMID:28484327
NASA Astrophysics Data System (ADS)
Polat, Esra; Gunay, Suleyman
2013-10-01
One of the problems encountered in Multiple Linear Regression (MLR) is multicollinearity, which causes the overestimation of the regression parameters and increase of the variance of these parameters. Hence, in case of multicollinearity presents, biased estimation procedures such as classical Principal Component Regression (CPCR) and Partial Least Squares Regression (PLSR) are then performed. SIMPLS algorithm is the leading PLSR algorithm because of its speed, efficiency and results are easier to interpret. However, both of the CPCR and SIMPLS yield very unreliable results when the data set contains outlying observations. Therefore, Hubert and Vanden Branden (2003) have been presented a robust PCR (RPCR) method and a robust PLSR (RPLSR) method called RSIMPLS. In RPCR, firstly, a robust Principal Component Analysis (PCA) method for high-dimensional data on the independent variables is applied, then, the dependent variables are regressed on the scores using a robust regression method. RSIMPLS has been constructed from a robust covariance matrix for high-dimensional data and robust linear regression. The purpose of this study is to show the usage of RPCR and RSIMPLS methods on an econometric data set, hence, making a comparison of two methods on an inflation model of Turkey. The considered methods have been compared in terms of predictive ability and goodness of fit by using a robust Root Mean Squared Error of Cross-validation (R-RMSECV), a robust R2 value and Robust Component Selection (RCS) statistic.
Robust estimation for partially linear models with large-dimensional covariates
Zhu, LiPing; Li, RunZe; Cui, HengJian
2014-01-01
We are concerned with robust estimation procedures to estimate the parameters in partially linear models with large-dimensional covariates. To enhance the interpretability, we suggest implementing a noncon-cave regularization method in the robust estimation procedure to select important covariates from the linear component. We establish the consistency for both the linear and the nonlinear components when the covariate dimension diverges at the rate of o(n), where n is the sample size. We show that the robust estimate of linear component performs asymptotically as well as its oracle counterpart which assumes the baseline function and the unimportant covariates were known a priori. With a consistent estimator of the linear component, we estimate the nonparametric component by a robust local linear regression. It is proved that the robust estimate of nonlinear component performs asymptotically as well as if the linear component were known in advance. Comprehensive simulation studies are carried out and an application is presented to examine the finite-sample performance of the proposed procedures. PMID:24955087
Robust estimation for partially linear models with large-dimensional covariates.
Zhu, LiPing; Li, RunZe; Cui, HengJian
2013-10-01
We are concerned with robust estimation procedures to estimate the parameters in partially linear models with large-dimensional covariates. To enhance the interpretability, we suggest implementing a noncon-cave regularization method in the robust estimation procedure to select important covariates from the linear component. We establish the consistency for both the linear and the nonlinear components when the covariate dimension diverges at the rate of [Formula: see text], where n is the sample size. We show that the robust estimate of linear component performs asymptotically as well as its oracle counterpart which assumes the baseline function and the unimportant covariates were known a priori. With a consistent estimator of the linear component, we estimate the nonparametric component by a robust local linear regression. It is proved that the robust estimate of nonlinear component performs asymptotically as well as if the linear component were known in advance. Comprehensive simulation studies are carried out and an application is presented to examine the finite-sample performance of the proposed procedures.
NASA Astrophysics Data System (ADS)
Eyarkai Nambi, Vijayaram; Thangavel, Kuladaisamy; Manickavasagan, Annamalai; Shahir, Sultan
2017-01-01
Prediction of ripeness level in climacteric fruits is essential for post-harvest handling. An index capable of predicting ripening level with minimum inputs would be highly beneficial to the handlers, processors and researchers in fruit industry. A study was conducted with Indian mango cultivars to develop a ripeness index and associated model. Changes in physicochemical, colour and textural properties were measured throughout the ripening period and the period was classified into five stages (unripe, early ripe, partially ripe, ripe and over ripe). Multivariate regression techniques like partial least square regression, principal component regression and multi linear regression were compared and evaluated for its prediction. Multi linear regression model with 12 parameters was found more suitable in ripening prediction. Scientific variable reduction method was adopted to simplify the developed model. Better prediction was achieved with either 2 or 3 variables (total soluble solids, colour and acidity). Cross validation was done to increase the robustness and it was found that proposed ripening index was more effective in prediction of ripening stages. Three-variable model would be suitable for commercial applications where reasonable accuracies are sufficient. However, 12-variable model can be used to obtain more precise results in research and development applications.
ELASTIC NET FOR COX'S PROPORTIONAL HAZARDS MODEL WITH A SOLUTION PATH ALGORITHM.
Wu, Yichao
2012-01-01
For least squares regression, Efron et al. (2004) proposed an efficient solution path algorithm, the least angle regression (LAR). They showed that a slight modification of the LAR leads to the whole LASSO solution path. Both the LAR and LASSO solution paths are piecewise linear. Recently Wu (2011) extended the LAR to generalized linear models and the quasi-likelihood method. In this work we extend the LAR further to handle Cox's proportional hazards model. The goal is to develop a solution path algorithm for the elastic net penalty (Zou and Hastie (2005)) in Cox's proportional hazards model. This goal is achieved in two steps. First we extend the LAR to optimizing the log partial likelihood plus a fixed small ridge term. Then we define a path modification, which leads to the solution path of the elastic net regularized log partial likelihood. Our solution path is exact and piecewise determined by ordinary differential equation systems.
Using Remote Sensing Data to Evaluate Surface Soil Properties in Alabama Ultisols
NASA Technical Reports Server (NTRS)
Sullivan, Dana G.; Shaw, Joey N.; Rickman, Doug; Mask, Paul L.; Luvall, Jeff
2005-01-01
Evaluation of surface soil properties via remote sensing could facilitate soil survey mapping, erosion prediction and allocation of agrochemicals for precision management. The objective of this study was to evaluate the relationship between soil spectral signature and surface soil properties in conventionally managed row crop systems. High-resolution RS data were acquired over bare fields in the Coastal Plain, Appalachian Plateau, and Ridge and Valley provinces of Alabama using the Airborne Terrestrial Applications Sensor multispectral scanner. Soils ranged from sandy Kandiudults to fine textured Rhodudults. Surface soil samples (0-1 cm) were collected from 163 sampling points for soil organic carbon, particle size distribution, and citrate dithionite extractable iron content. Surface roughness, soil water content, and crusting were also measured during sampling. Two methods of analysis were evaluated: 1) multiple linear regression using common spectral band ratios, and 2) partial least squares regression. Our data show that thermal infrared spectra are highly, linearly related to soil organic carbon, sand and clay content. Soil organic carbon content was the most difficult to quantify in these highly weathered systems, where soil organic carbon was generally less than 1.2%. Estimates of sand and clay content were best using partial least squares regression at the Valley site, explaining 42-59% of the variability. In the Coastal Plain, sandy surfaces prone to crusting limited estimates of sand and clay content via partial least squares and regression with common band ratios. Estimates of iron oxide content were a function of mineralogy and best accomplished using specific band ratios, with regression explaining 36-65% of the variability at the Valley and Coastal Plain sites, respectively.
ELASTIC NET FOR COX’S PROPORTIONAL HAZARDS MODEL WITH A SOLUTION PATH ALGORITHM
Wu, Yichao
2012-01-01
For least squares regression, Efron et al. (2004) proposed an efficient solution path algorithm, the least angle regression (LAR). They showed that a slight modification of the LAR leads to the whole LASSO solution path. Both the LAR and LASSO solution paths are piecewise linear. Recently Wu (2011) extended the LAR to generalized linear models and the quasi-likelihood method. In this work we extend the LAR further to handle Cox’s proportional hazards model. The goal is to develop a solution path algorithm for the elastic net penalty (Zou and Hastie (2005)) in Cox’s proportional hazards model. This goal is achieved in two steps. First we extend the LAR to optimizing the log partial likelihood plus a fixed small ridge term. Then we define a path modification, which leads to the solution path of the elastic net regularized log partial likelihood. Our solution path is exact and piecewise determined by ordinary differential equation systems. PMID:23226932
Partial Least Squares Regression Models for the Analysis of Kinase Signaling.
Bourgeois, Danielle L; Kreeger, Pamela K
2017-01-01
Partial least squares regression (PLSR) is a data-driven modeling approach that can be used to analyze multivariate relationships between kinase networks and cellular decisions or patient outcomes. In PLSR, a linear model relating an X matrix of dependent variables and a Y matrix of independent variables is generated by extracting the factors with the strongest covariation. While the identified relationship is correlative, PLSR models can be used to generate quantitative predictions for new conditions or perturbations to the network, allowing for mechanisms to be identified. This chapter will provide a brief explanation of PLSR and provide an instructive example to demonstrate the use of PLSR to analyze kinase signaling.
Structured penalties for functional linear models-partially empirical eigenvectors for regression.
Randolph, Timothy W; Harezlak, Jaroslaw; Feng, Ziding
2012-01-01
One of the challenges with functional data is incorporating geometric structure, or local correlation, into the analysis. This structure is inherent in the output from an increasing number of biomedical technologies, and a functional linear model is often used to estimate the relationship between the predictor functions and scalar responses. Common approaches to the problem of estimating a coefficient function typically involve two stages: regularization and estimation. Regularization is usually done via dimension reduction, projecting onto a predefined span of basis functions or a reduced set of eigenvectors (principal components). In contrast, we present a unified approach that directly incorporates geometric structure into the estimation process by exploiting the joint eigenproperties of the predictors and a linear penalty operator. In this sense, the components in the regression are 'partially empirical' and the framework is provided by the generalized singular value decomposition (GSVD). The form of the penalized estimation is not new, but the GSVD clarifies the process and informs the choice of penalty by making explicit the joint influence of the penalty and predictors on the bias, variance and performance of the estimated coefficient function. Laboratory spectroscopy data and simulations are used to illustrate the concepts.
Muradian, Kh K; Utko, N O; Mozzhukhina, T H; Pishel', I M; Litoshenko, O Ia; Bezrukov, V V; Fraĭfel'd, V E
2002-01-01
Correlative and regressive relations between the gaseous exchange, thermoregulation and mitochondrial protein content were analyzed by two- and three-dimensional statistics in mice. It has been shown that the pair wise linear methods of analysis did not reveal any significant correlation between the parameters under exploration. However, it became evident at three-dimensional and non-linear plotting for which the coefficients of multivariable correlation reached and even exceeded 0.7-0.8. The calculations based on partial differentiation of the multivariable regression equations allow to conclude that at certain values of VO2, VCO2 and body temperature negative relations between the systems of gaseous exchange and thermoregulation become dominating.
NASA Astrophysics Data System (ADS)
Shi, Jinfei; Zhu, Songqing; Chen, Ruwen
2017-12-01
An order selection method based on multiple stepwise regressions is proposed for General Expression of Nonlinear Autoregressive model which converts the model order problem into the variable selection of multiple linear regression equation. The partial autocorrelation function is adopted to define the linear term in GNAR model. The result is set as the initial model, and then the nonlinear terms are introduced gradually. Statistics are chosen to study the improvements of both the new introduced and originally existed variables for the model characteristics, which are adopted to determine the model variables to retain or eliminate. So the optimal model is obtained through data fitting effect measurement or significance test. The simulation and classic time-series data experiment results show that the method proposed is simple, reliable and can be applied to practical engineering.
Experimental and computational prediction of glass transition temperature of drugs.
Alzghoul, Ahmad; Alhalaweh, Amjad; Mahlin, Denny; Bergström, Christel A S
2014-12-22
Glass transition temperature (Tg) is an important inherent property of an amorphous solid material which is usually determined experimentally. In this study, the relation between Tg and melting temperature (Tm) was evaluated using a data set of 71 structurally diverse druglike compounds. Further, in silico models for prediction of Tg were developed based on calculated molecular descriptors and linear (multilinear regression, partial least-squares, principal component regression) and nonlinear (neural network, support vector regression) modeling techniques. The models based on Tm predicted Tg with an RMSE of 19.5 K for the test set. Among the five computational models developed herein the support vector regression gave the best result with RMSE of 18.7 K for the test set using only four chemical descriptors. Hence, two different models that predict Tg of drug-like molecules with high accuracy were developed. If Tm is available, a simple linear regression can be used to predict Tg. However, the results also suggest that support vector regression and calculated molecular descriptors can predict Tg with equal accuracy, already before compound synthesis.
Kovačević, Strahinja; Karadžić, Milica; Podunavac-Kuzmanović, Sanja; Jevrić, Lidija
2018-01-01
The present study is based on the quantitative structure-activity relationship (QSAR) analysis of binding affinity toward human prion protein (huPrP C ) of quinacrine, pyridine dicarbonitrile, diphenylthiazole and diphenyloxazole analogs applying different linear and non-linear chemometric regression techniques, including univariate linear regression, multiple linear regression, partial least squares regression and artificial neural networks. The QSAR analysis distinguished molecular lipophilicity as an important factor that contributes to the binding affinity. Principal component analysis was used in order to reveal similarities or dissimilarities among the studied compounds. The analysis of in silico absorption, distribution, metabolism, excretion and toxicity (ADMET) parameters was conducted. The ranking of the studied analogs on the basis of their ADMET parameters was done applying the sum of ranking differences, as a relatively new chemometric method. The main aim of the study was to reveal the most important molecular features whose changes lead to the changes in the binding affinities of the studied compounds. Another point of view on the binding affinity of the most promising analogs was established by application of molecular docking analysis. The results of the molecular docking were proven to be in agreement with the experimental outcome. Copyright © 2017 Elsevier B.V. All rights reserved.
Zheng, Xueying; Qin, Guoyou; Tu, Dongsheng
2017-05-30
Motivated by the analysis of quality of life data from a clinical trial on early breast cancer, we propose in this paper a generalized partially linear mean-covariance regression model for longitudinal proportional data, which are bounded in a closed interval. Cholesky decomposition of the covariance matrix for within-subject responses and generalized estimation equations are used to estimate unknown parameters and the nonlinear function in the model. Simulation studies are performed to evaluate the performance of the proposed estimation procedures. Our new model is also applied to analyze the data from the cancer clinical trial that motivated this research. In comparison with available models in the literature, the proposed model does not require specific parametric assumptions on the density function of the longitudinal responses and the probability function of the boundary values and can capture dynamic changes of time or other interested variables on both mean and covariance of the correlated proportional responses. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.
Hyper-Spectral Image Analysis With Partially Latent Regression and Spatial Markov Dependencies
NASA Astrophysics Data System (ADS)
Deleforge, Antoine; Forbes, Florence; Ba, Sileye; Horaud, Radu
2015-09-01
Hyper-spectral data can be analyzed to recover physical properties at large planetary scales. This involves resolving inverse problems which can be addressed within machine learning, with the advantage that, once a relationship between physical parameters and spectra has been established in a data-driven fashion, the learned relationship can be used to estimate physical parameters for new hyper-spectral observations. Within this framework, we propose a spatially-constrained and partially-latent regression method which maps high-dimensional inputs (hyper-spectral images) onto low-dimensional responses (physical parameters such as the local chemical composition of the soil). The proposed regression model comprises two key features. Firstly, it combines a Gaussian mixture of locally-linear mappings (GLLiM) with a partially-latent response model. While the former makes high-dimensional regression tractable, the latter enables to deal with physical parameters that cannot be observed or, more generally, with data contaminated by experimental artifacts that cannot be explained with noise models. Secondly, spatial constraints are introduced in the model through a Markov random field (MRF) prior which provides a spatial structure to the Gaussian-mixture hidden variables. Experiments conducted on a database composed of remotely sensed observations collected from the Mars planet by the Mars Express orbiter demonstrate the effectiveness of the proposed model.
Lopes, Marta B; Calado, Cecília R C; Figueiredo, Mário A T; Bioucas-Dias, José M
2017-06-01
The monitoring of biopharmaceutical products using Fourier transform infrared (FT-IR) spectroscopy relies on calibration techniques involving the acquisition of spectra of bioprocess samples along the process. The most commonly used method for that purpose is partial least squares (PLS) regression, under the assumption that a linear model is valid. Despite being successful in the presence of small nonlinearities, linear methods may fail in the presence of strong nonlinearities. This paper studies the potential usefulness of nonlinear regression methods for predicting, from in situ near-infrared (NIR) and mid-infrared (MIR) spectra acquired in high-throughput mode, biomass and plasmid concentrations in Escherichia coli DH5-α cultures producing the plasmid model pVAX-LacZ. The linear methods PLS and ridge regression (RR) are compared with their kernel (nonlinear) versions, kPLS and kRR, as well as with the (also nonlinear) relevance vector machine (RVM) and Gaussian process regression (GPR). For the systems studied, RR provided better predictive performances compared to the remaining methods. Moreover, the results point to further investigation based on larger data sets whenever differences in predictive accuracy between a linear method and its kernelized version could not be found. The use of nonlinear methods, however, shall be judged regarding the additional computational cost required to tune their additional parameters, especially when the less computationally demanding linear methods herein studied are able to successfully monitor the variables under study.
Data-driven discovery of partial differential equations.
Rudy, Samuel H; Brunton, Steven L; Proctor, Joshua L; Kutz, J Nathan
2017-04-01
We propose a sparse regression method capable of discovering the governing partial differential equation(s) of a given system by time series measurements in the spatial domain. The regression framework relies on sparsity-promoting techniques to select the nonlinear and partial derivative terms of the governing equations that most accurately represent the data, bypassing a combinatorially large search through all possible candidate models. The method balances model complexity and regression accuracy by selecting a parsimonious model via Pareto analysis. Time series measurements can be made in an Eulerian framework, where the sensors are fixed spatially, or in a Lagrangian framework, where the sensors move with the dynamics. The method is computationally efficient, robust, and demonstrated to work on a variety of canonical problems spanning a number of scientific domains including Navier-Stokes, the quantum harmonic oscillator, and the diffusion equation. Moreover, the method is capable of disambiguating between potentially nonunique dynamical terms by using multiple time series taken with different initial data. Thus, for a traveling wave, the method can distinguish between a linear wave equation and the Korteweg-de Vries equation, for instance. The method provides a promising new technique for discovering governing equations and physical laws in parameterized spatiotemporal systems, where first-principles derivations are intractable.
Linear Least Squares for Correlated Data
NASA Technical Reports Server (NTRS)
Dean, Edwin B.
1988-01-01
Throughout the literature authors have consistently discussed the suspicion that regression results were less than satisfactory when the independent variables were correlated. Camm, Gulledge, and Womer, and Womer and Marcotte provide excellent applied examples of these concerns. Many authors have obtained partial solutions for this problem as discussed by Womer and Marcotte and Wonnacott and Wonnacott, which result in generalized least squares algorithms to solve restrictive cases. This paper presents a simple but relatively general multivariate method for obtaining linear least squares coefficients which are free of the statistical distortion created by correlated independent variables.
NASA Astrophysics Data System (ADS)
Leroux, Romain; Chatellier, Ludovic; David, Laurent
2018-01-01
This article is devoted to the estimation of time-resolved particle image velocimetry (TR-PIV) flow fields using a time-resolved point measurements of a voltage signal obtained by hot-film anemometry. A multiple linear regression model is first defined to map the TR-PIV flow fields onto the voltage signal. Due to the high temporal resolution of the signal acquired by the hot-film sensor, the estimates of the TR-PIV flow fields are obtained with a multiple linear regression method called orthonormalized partial least squares regression (OPLSR). Subsequently, this model is incorporated as the observation equation in an ensemble Kalman filter (EnKF) applied on a proper orthogonal decomposition reduced-order model to stabilize it while reducing the effects of the hot-film sensor noise. This method is assessed for the reconstruction of the flow around a NACA0012 airfoil at a Reynolds number of 1000 and an angle of attack of {20}°. Comparisons with multi-time delay-modified linear stochastic estimation show that both the OPLSR and EnKF combined with OPLSR are more accurate as they produce a much lower relative estimation error, and provide a faithful reconstruction of the time evolution of the velocity flow fields.
Lin, Zhaozhou; Zhang, Qiao; Liu, Ruixin; Gao, Xiaojie; Zhang, Lu; Kang, Bingya; Shi, Junhan; Wu, Zidan; Gui, Xinjing; Li, Xuelin
2016-01-25
To accurately, safely, and efficiently evaluate the bitterness of Traditional Chinese Medicines (TCMs), a robust predictor was developed using robust partial least squares (RPLS) regression method based on data obtained from an electronic tongue (e-tongue) system. The data quality was verified by the Grubb's test. Moreover, potential outliers were detected based on both the standardized residual and score distance calculated for each sample. The performance of RPLS on the dataset before and after outlier detection was compared to other state-of-the-art methods including multivariate linear regression, least squares support vector machine, and the plain partial least squares regression. Both R² and root-mean-squares error (RMSE) of cross-validation (CV) were recorded for each model. With four latent variables, a robust RMSECV value of 0.3916 with bitterness values ranging from 0.63 to 4.78 were obtained for the RPLS model that was constructed based on the dataset including outliers. Meanwhile, the RMSECV, which was calculated using the models constructed by other methods, was larger than that of the RPLS model. After six outliers were excluded, the performance of all benchmark methods markedly improved, but the difference between the RPLS model constructed before and after outlier exclusion was negligible. In conclusion, the bitterness of TCM decoctions can be accurately evaluated with the RPLS model constructed using e-tongue data.
Wan, Jian; Chen, Yi-Chieh; Morris, A Julian; Thennadil, Suresh N
2017-07-01
Near-infrared (NIR) spectroscopy is being widely used in various fields ranging from pharmaceutics to the food industry for analyzing chemical and physical properties of the substances concerned. Its advantages over other analytical techniques include available physical interpretation of spectral data, nondestructive nature and high speed of measurements, and little or no need for sample preparation. The successful application of NIR spectroscopy relies on three main aspects: pre-processing of spectral data to eliminate nonlinear variations due to temperature, light scattering effects and many others, selection of those wavelengths that contribute useful information, and identification of suitable calibration models using linear/nonlinear regression . Several methods have been developed for each of these three aspects and many comparative studies of different methods exist for an individual aspect or some combinations. However, there is still a lack of comparative studies for the interactions among these three aspects, which can shed light on what role each aspect plays in the calibration and how to combine various methods of each aspect together to obtain the best calibration model. This paper aims to provide such a comparative study based on four benchmark data sets using three typical pre-processing methods, namely, orthogonal signal correction (OSC), extended multiplicative signal correction (EMSC) and optical path-length estimation and correction (OPLEC); two existing wavelength selection methods, namely, stepwise forward selection (SFS) and genetic algorithm optimization combined with partial least squares regression for spectral data (GAPLSSP); four popular regression methods, namely, partial least squares (PLS), least absolute shrinkage and selection operator (LASSO), least squares support vector machine (LS-SVM), and Gaussian process regression (GPR). The comparative study indicates that, in general, pre-processing of spectral data can play a significant role in the calibration while wavelength selection plays a marginal role and the combination of certain pre-processing, wavelength selection, and nonlinear regression methods can achieve superior performance over traditional linear regression-based calibration.
Tøndel, Kristin; Indahl, Ulf G; Gjuvsland, Arne B; Vik, Jon Olav; Hunter, Peter; Omholt, Stig W; Martens, Harald
2011-06-01
Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback loops. HC-PLSR is a promising approach for metamodelling in systems biology, especially for highly nonlinear or non-monotone parameter to phenotype maps. The algorithm can be flexibly adjusted to suit the complexity of the dynamic model behaviour, inviting automation in the metamodelling of complex systems.
2011-01-01
Background Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Results Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback loops. Conclusions HC-PLSR is a promising approach for metamodelling in systems biology, especially for highly nonlinear or non-monotone parameter to phenotype maps. The algorithm can be flexibly adjusted to suit the complexity of the dynamic model behaviour, inviting automation in the metamodelling of complex systems. PMID:21627852
Henrard, S; Speybroeck, N; Hermans, C
2015-11-01
Haemophilia is a rare genetic haemorrhagic disease characterized by partial or complete deficiency of coagulation factor VIII, for haemophilia A, or IX, for haemophilia B. As in any other medical research domain, the field of haemophilia research is increasingly concerned with finding factors associated with binary or continuous outcomes through multivariable models. Traditional models include multiple logistic regressions, for binary outcomes, and multiple linear regressions for continuous outcomes. Yet these regression models are at times difficult to implement, especially for non-statisticians, and can be difficult to interpret. The present paper sought to didactically explain how, why, and when to use classification and regression tree (CART) analysis for haemophilia research. The CART method is non-parametric and non-linear, based on the repeated partitioning of a sample into subgroups based on a certain criterion. Breiman developed this method in 1984. Classification trees (CTs) are used to analyse categorical outcomes and regression trees (RTs) to analyse continuous ones. The CART methodology has become increasingly popular in the medical field, yet only a few examples of studies using this methodology specifically in haemophilia have to date been published. Two examples using CART analysis and previously published in this field are didactically explained in details. There is increasing interest in using CART analysis in the health domain, primarily due to its ease of implementation, use, and interpretation, thus facilitating medical decision-making. This method should be promoted for analysing continuous or categorical outcomes in haemophilia, when applicable. © 2015 John Wiley & Sons Ltd.
ERIC Educational Resources Information Center
Bobbett, Gordon C.; And Others
The relationships among factors reported on school district (SD) report cards were studied for 121 Tennessee SDs. The report cards provided data on student outcomes (achievement test scores) and SD characteristics. Relationships were studied through linear regression, Pearson product moment correlation, and Guttman's partial correlation. Six…
Data-driven discovery of partial differential equations
Rudy, Samuel H.; Brunton, Steven L.; Proctor, Joshua L.; Kutz, J. Nathan
2017-01-01
We propose a sparse regression method capable of discovering the governing partial differential equation(s) of a given system by time series measurements in the spatial domain. The regression framework relies on sparsity-promoting techniques to select the nonlinear and partial derivative terms of the governing equations that most accurately represent the data, bypassing a combinatorially large search through all possible candidate models. The method balances model complexity and regression accuracy by selecting a parsimonious model via Pareto analysis. Time series measurements can be made in an Eulerian framework, where the sensors are fixed spatially, or in a Lagrangian framework, where the sensors move with the dynamics. The method is computationally efficient, robust, and demonstrated to work on a variety of canonical problems spanning a number of scientific domains including Navier-Stokes, the quantum harmonic oscillator, and the diffusion equation. Moreover, the method is capable of disambiguating between potentially nonunique dynamical terms by using multiple time series taken with different initial data. Thus, for a traveling wave, the method can distinguish between a linear wave equation and the Korteweg–de Vries equation, for instance. The method provides a promising new technique for discovering governing equations and physical laws in parameterized spatiotemporal systems, where first-principles derivations are intractable. PMID:28508044
NASA Astrophysics Data System (ADS)
Talebpour, Zahra; Tavallaie, Roya; Ahmadi, Seyyed Hamid; Abdollahpour, Assem
2010-09-01
In this study, a new method for the simultaneous determination of penicillin G salts in pharmaceutical mixture via FT-IR spectroscopy combined with chemometrics was investigated. The mixture of penicillin G salts is a complex system due to similar analytical characteristics of components. Partial least squares (PLS) and radial basis function-partial least squares (RBF-PLS) were used to develop the linear and nonlinear relation between spectra and components, respectively. The orthogonal signal correction (OSC) preprocessing method was used to correct unexpected information, such as spectral overlapping and scattering effects. In order to compare the influence of OSC on PLS and RBF-PLS models, the optimal linear (PLS) and nonlinear (RBF-PLS) models based on conventional and OSC preprocessed spectra were established and compared. The obtained results demonstrated that OSC clearly enhanced the performance of both RBF-PLS and PLS calibration models. Also in the case of some nonlinear relation between spectra and component, OSC-RBF-PLS gave satisfactory results than OSC-PLS model which indicated that the OSC was helpful to remove extrinsic deviations from linearity without elimination of nonlinear information related to component. The chemometric models were tested on an external dataset and finally applied to the analysis commercialized injection product of penicillin G salts.
NASA Technical Reports Server (NTRS)
Flynn, Clare; Pickering, Kenneth E.; Crawford, James H.; Lamsol, Lok; Krotkov, Nickolay; Herman, Jay; Weinheimer, Andrew; Chen, Gao; Liu, Xiong; Szykman, James;
2014-01-01
To investigate the ability of column (or partial column) information to represent surface air quality, results of linear regression analyses between surface mixing ratio data and column abundances for O3 and NO2 are presented for the July 2011 Maryland deployment of the DISCOVER-AQ mission. Data collected by the P-3B aircraft, ground-based Pandora spectrometers, Aura/OMI satellite instrument, and simulations for July 2011 from the CMAQ air quality model during this deployment provide a large and varied data set, allowing this problem to be approached from multiple perspectives. O3 columns typically exhibited a statistically significant and high degree of correlation with surface data (R(sup 2) > 0.64) in the P- 3B data set, a moderate degree of correlation (0.16 < R(sup 2) < 0.64) in the CMAQ data set, and a low degree of correlation (R(sup 2) < 0.16) in the Pandora and OMI data sets. NO2 columns typically exhibited a low to moderate degree of correlation with surface data in each data set. The results of linear regression analyses for O3 exhibited smaller errors relative to the observations than NO2 regressions. These results suggest that O3 partial column observations from future satellite instruments with sufficient sensitivity to the lower troposphere can be meaningful for surface air quality analysis.
Lin, Zhaozhou; Zhang, Qiao; Liu, Ruixin; Gao, Xiaojie; Zhang, Lu; Kang, Bingya; Shi, Junhan; Wu, Zidan; Gui, Xinjing; Li, Xuelin
2016-01-01
To accurately, safely, and efficiently evaluate the bitterness of Traditional Chinese Medicines (TCMs), a robust predictor was developed using robust partial least squares (RPLS) regression method based on data obtained from an electronic tongue (e-tongue) system. The data quality was verified by the Grubb’s test. Moreover, potential outliers were detected based on both the standardized residual and score distance calculated for each sample. The performance of RPLS on the dataset before and after outlier detection was compared to other state-of-the-art methods including multivariate linear regression, least squares support vector machine, and the plain partial least squares regression. Both R2 and root-mean-squares error (RMSE) of cross-validation (CV) were recorded for each model. With four latent variables, a robust RMSECV value of 0.3916 with bitterness values ranging from 0.63 to 4.78 were obtained for the RPLS model that was constructed based on the dataset including outliers. Meanwhile, the RMSECV, which was calculated using the models constructed by other methods, was larger than that of the RPLS model. After six outliers were excluded, the performance of all benchmark methods markedly improved, but the difference between the RPLS model constructed before and after outlier exclusion was negligible. In conclusion, the bitterness of TCM decoctions can be accurately evaluated with the RPLS model constructed using e-tongue data. PMID:26821026
Home on the Range: Factors Explaining Partial Migration of African Buffalo in a Tropical Environment
Naidoo, Robin; Du Preez, Pierre; Stuart-Hill, Greg; Jago, Mark; Wegmann, Martin
2012-01-01
Partial migration (when only some individuals in a population undertake seasonal migrations) is common in many species and geographical contexts. Despite the development of modern statistical methods for analyzing partial migration, there have been no studies on what influences partial migration in tropical environments. We present research on factors affecting partial migration in African buffalo (Syncerus caffer) in northeastern Namibia. Our dataset is derived from 32 satellite tracking collars, spans 4 years and contains over 35,000 locations. We used remotely sensed data to quantify various factors that buffalo experience in the dry season when making decisions on whether and how far to migrate, including potential man-made and natural barriers, as well as spatial and temporal heterogeneity in environmental conditions. Using an information-theoretic, non-linear regression approach, our analyses showed that buffalo in this area can be divided into 4 migratory classes: migrants, non-migrants, dispersers, and a new class that we call “expanders”. Multimodel inference from least-squares regressions of wet season movements showed that environmental conditions (rainfall, fires, woodland cover, vegetation biomass), distance to the nearest barrier (river, fence, cultivated area) and social factors (age, size of herd at capture) were all important in explaining variation in migratory behaviour. The relative contributions of these variables to partial migration have not previously been assessed for ungulates in the tropics. Understanding the factors driving migratory decisions of wildlife will lead to better-informed conservation and land-use decisions in this area. PMID:22570722
2008-03-01
was introduced, Vroom (1964) performed a partial review of the turnover literature. His modest review of the literature found a consistent...for the moderators job satisfaction and organizational commitment while controlling for rank and gender. Linear regressions were used to determine...if the relationship between OPTEMPO and turnover intentions were significant. When accounting for job satisfaction and organizational commitment the
Weather Impact on Airport Arrival Meter Fix Throughput
NASA Technical Reports Server (NTRS)
Wang, Yao
2017-01-01
Time-based flow management provides arrival aircraft schedules based on arrival airport conditions, airport capacity, required spacing, and weather conditions. In order to meet a scheduled time at which arrival aircraft can cross an airport arrival meter fix prior to entering the airport terminal airspace, air traffic controllers make regulations on air traffic. Severe weather may create an airport arrival bottleneck if one or more of airport arrival meter fixes are partially or completely blocked by the weather and the arrival demand has not been reduced accordingly. Under these conditions, aircraft are frequently being put in holding patterns until they can be rerouted. A model that predicts the weather impacted meter fix throughput may help air traffic controllers direct arrival flows into the airport more efficiently, minimizing arrival meter fix congestion. This paper presents an analysis of air traffic flows across arrival meter fixes at the Newark Liberty International Airport (EWR). Several scenarios of weather impacted EWR arrival fix flows are described. Furthermore, multiple linear regression and regression tree ensemble learning approaches for translating multiple sector Weather Impacted Traffic Indexes (WITI) to EWR arrival meter fix throughputs are examined. These weather translation models are developed and validated using the EWR arrival flight and weather data for the period of April-September in 2014. This study also compares the performance of the regression tree ensemble with traditional multiple linear regression models for estimating the weather impacted throughputs at each of the EWR arrival meter fixes. For all meter fixes investigated, the results from the regression tree ensemble weather translation models show a stronger correlation between model outputs and observed meter fix throughputs than that produced from multiple linear regression method.
Rahman, Md. Jahanur; Shamim, Abu Ahmed; Klemm, Rolf D. W.; Labrique, Alain B.; Rashid, Mahbubur; Christian, Parul; West, Keith P.
2017-01-01
Birth weight, length and circumferences of the head, chest and arm are key measures of newborn size and health in developing countries. We assessed maternal socio-demographic factors associated with multiple measures of newborn size in a large rural population in Bangladesh using partial least squares (PLS) regression method. PLS regression, combining features from principal component analysis and multiple linear regression, is a multivariate technique with an ability to handle multicollinearity while simultaneously handling multiple dependent variables. We analyzed maternal and infant data from singletons (n = 14,506) born during a double-masked, cluster-randomized, placebo-controlled maternal vitamin A or β-carotene supplementation trial in rural northwest Bangladesh. PLS regression results identified numerous maternal factors (parity, age, early pregnancy MUAC, living standard index, years of education, number of antenatal care visits, preterm delivery and infant sex) significantly (p<0.001) associated with newborn size. Among them, preterm delivery had the largest negative influence on newborn size (Standardized β = -0.29 − -0.19; p<0.001). Scatter plots of the scores of first two PLS components also revealed an interaction between newborn sex and preterm delivery on birth size. PLS regression was found to be more parsimonious than both ordinary least squares regression and principal component regression. It also provided more stable estimates than the ordinary least squares regression and provided the effect measure of the covariates with greater accuracy as it accounts for the correlation among the covariates and outcomes. Therefore, PLS regression is recommended when either there are multiple outcome measurements in the same study, or the covariates are correlated, or both situations exist in a dataset. PMID:29261760
Kabir, Alamgir; Rahman, Md Jahanur; Shamim, Abu Ahmed; Klemm, Rolf D W; Labrique, Alain B; Rashid, Mahbubur; Christian, Parul; West, Keith P
2017-01-01
Birth weight, length and circumferences of the head, chest and arm are key measures of newborn size and health in developing countries. We assessed maternal socio-demographic factors associated with multiple measures of newborn size in a large rural population in Bangladesh using partial least squares (PLS) regression method. PLS regression, combining features from principal component analysis and multiple linear regression, is a multivariate technique with an ability to handle multicollinearity while simultaneously handling multiple dependent variables. We analyzed maternal and infant data from singletons (n = 14,506) born during a double-masked, cluster-randomized, placebo-controlled maternal vitamin A or β-carotene supplementation trial in rural northwest Bangladesh. PLS regression results identified numerous maternal factors (parity, age, early pregnancy MUAC, living standard index, years of education, number of antenatal care visits, preterm delivery and infant sex) significantly (p<0.001) associated with newborn size. Among them, preterm delivery had the largest negative influence on newborn size (Standardized β = -0.29 - -0.19; p<0.001). Scatter plots of the scores of first two PLS components also revealed an interaction between newborn sex and preterm delivery on birth size. PLS regression was found to be more parsimonious than both ordinary least squares regression and principal component regression. It also provided more stable estimates than the ordinary least squares regression and provided the effect measure of the covariates with greater accuracy as it accounts for the correlation among the covariates and outcomes. Therefore, PLS regression is recommended when either there are multiple outcome measurements in the same study, or the covariates are correlated, or both situations exist in a dataset.
Grandke, Fabian; Singh, Priyanka; Heuven, Henri C M; de Haan, Jorn R; Metzler, Dirk
2016-08-24
Association studies are an essential part of modern plant breeding, but are limited for polyploid crops. The increased number of possible genotype classes complicates the differentiation between them. Available methods are limited with respect to the ploidy level or data producing technologies. While genotype classification is an established noise reduction step in diploids, it gains complexity with increasing ploidy levels. Eventually, the errors produced by misclassifications exceed the benefits of genotype classes. Alternatively, continuous genotype values can be used for association analysis in higher polyploids. We associated continuous genotypes to three different traits and compared the results to the output of the genotype caller SuperMASSA. Linear, Bayesian and partial least squares regression were applied, to determine if the use of continuous genotypes is limited to a specific method. A disease, a flowering and a growth trait with h (2) of 0.51, 0.78 and 0.91 were associated with a hexaploid chrysanthemum genotypes. The data set consisted of 55,825 probes and 228 samples. We were able to detect associating probes using continuous genotypes for multiple traits, using different regression methods. The identified probe sets were overlapping, but not identical between the methods. Baysian regression was the most restrictive method, resulting in ten probes for one trait and none for the others. Linear and partial least squares regression led to numerous associating probes. Association based on genotype classes resulted in similar values, but missed several significant probes. A simulation study was used to successfully validate the number of associating markers. Association of various phenotypic traits with continuous genotypes is successful with both uni- and multivariate regression methods. Genotype calling does not improve the association and shows no advantages in this study. Instead, use of continuous genotypes simplifies the analysis, saves computational time and results more potential markers.
Plans-Rubió, Pedro; Navas, Encarna; Godoy, Pere; Carmona, Gloria; Domínguez, Angela; Jané, Mireia; Muñoz-Almagro, Carmen; Brotons, Pedro
2018-05-14
The aim of this study was to assess direct health costs in children with pertussis aged 0-9 years who were vaccinated, partially vaccinated, and unvaccinated during childhood, and to assess the association between pertussis costs and pertussis vaccination in Catalonia (Spain) in 2012-2013. Direct healthcare costs included pertussis treatment, pertussis detection, and preventive chemotherapy of contacts. Pertussis patients were considered vaccinated when they had received 4-5 doses, and unvaccinated or partially vaccinated when they had received 0-3 doses of vaccine. The Chi square test and the odds ratios were used to compare percentages and the t test was used to compare mean pertussis costs in different groups, considering a p < 0.05 as statistically significant. The correlation between pertussis costs and study variables was assessed using the Spearman's ρ, with a p < 0.05 as statistically significant. Multiple linear regression analysis (IBM-SPSS program) was used to quantify the association of pertussis vaccination and other study variables with pertussis costs. Vaccinated children with pertussis aged 0-9 years had significantly lower odds ratios of hospitalizations (OR 0.02, p < 0.001), laboratory confirmation (OR 0.21, p < 0.001), and severe disease (OR 0.02, p < 0.001) than unvaccinated or partially vaccinated children with pertussis of the same age. Mean direct healthcare costs were significantly lower (p < 0.001) in vaccinated patients (€190.6) than in unvaccinated patients (€3550.8), partially vaccinated patients (€1116.9), and unvaccinated/partially vaccinated patients (€2330). Multivariable linear regression analysis showed that pertussis vaccination with 4-5 doses was associated with a non-significant reduction of pertussis costs of €107.9 per case after taking into account the effect of other study variables, and €200 per case after taking into account pertussis severity. Direct healthcare costs were lower in children with pertussis aged 0-9 years vaccinated with 4-5 doses of acellular vaccines than in unvaccinated or partially vaccinated children with pertussis of the same age.
Guertin, Kristin A; Loftfield, Erikka; Boca, Simina M; Sampson, Joshua N; Moore, Steven C; Xiao, Qian; Huang, Wen-Yi; Xiong, Xiaoqin; Freedman, Neal D; Cross, Amanda J; Sinha, Rashmi
2015-05-01
Coffee intake may be inversely associated with colorectal cancer; however, previous studies have been inconsistent. Serum coffee metabolites are integrated exposure measures that may clarify associations with cancer and elucidate underlying mechanisms. Our aims were 2-fold as follows: 1) to identify serum metabolites associated with coffee intake and 2) to examine these metabolites in relation to colorectal cancer. In a nested case-control study of 251 colorectal cancer cases and 247 matched control subjects from the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial, we conducted untargeted metabolomics analyses of baseline serum by using ultrahigh-performance liquid-phase chromatography-tandem mass spectrometry and gas chromatography-mass spectrometry. Usual coffee intake was self-reported in a food-frequency questionnaire. We used partial Pearson correlations and linear regression to identify serum metabolites associated with coffee intake and conditional logistic regression to evaluate associations between coffee metabolites and colorectal cancer. After Bonferroni correction for multiple comparisons (P = 0.05 ÷ 657 metabolites), 29 serum metabolites were positively correlated with coffee intake (partial correlation coefficients: 0.18-0.61; P < 7.61 × 10(-5)); serum metabolites most highly correlated with coffee intake (partial correlation coefficients >0.40) included trigonelline (N'-methylnicotinate), quinate, and 7 unknown metabolites. Of 29 serum metabolites, 8 metabolites were directly related to caffeine metabolism, and 3 of these metabolites, theophylline (OR for 90th compared with 10th percentiles: 0.44; 95% CI: 0.25, 0.79; P-linear trend = 0.006), caffeine (OR for 90th compared with 10th percentiles: 0.56; 95% CI: 0.35, 0.89; P-linear trend = 0.015), and paraxanthine (OR for 90th compared with 10th percentiles: 0.58; 95% CI: 0.36, 0.94; P-linear trend = 0.027), were inversely associated with colorectal cancer. Serum metabolites can distinguish coffee drinkers from nondrinkers; some caffeine-related metabolites were inversely associated with colorectal cancer and should be studied further to clarify the role of coffee in the cause of colorectal cancer. The Prostate, Lung, Colorectal, and Ovarian trial was registered at clinicaltrials.gov as NCT00002540. © 2015 American Society for Nutrition.
Liao, Zhipeng; Yoda, Nobuhiro; Chen, Junning; Zheng, Keke; Sasaki, Keiichi; Swain, Michael V; Li, Qing
2017-04-01
This paper aimed to develop a clinically validated bone remodeling algorithm by integrating bone's dynamic properties in a multi-stage fashion based on a four-year clinical follow-up of implant treatment. The configurational effects of fixed partial dentures (FPDs) were explored using a multi-stage remodeling rule. Three-dimensional real-time occlusal loads during maximum voluntary clenching were measured with a piezoelectric force transducer and were incorporated into a computerized tomography-based finite element mandibular model. Virtual X-ray images were generated based on simulation and statistically correlated with clinical data using linear regressions. The strain energy density-driven remodeling parameters were regulated over the time frame considered. A linear single-stage bone remodeling algorithm, with a single set of constant remodeling parameters, was found to poorly fit with clinical data through linear regression (low [Formula: see text] and R), whereas a time-dependent multi-stage algorithm better simulated the remodeling process (high [Formula: see text] and R) against the clinical results. The three-implant-supported and distally cantilevered FPDs presented noticeable and continuous bone apposition, mainly adjacent to the cervical and apical regions. The bridged and mesially cantilevered FPDs showed bone resorption or no visible bone formation in some areas. Time-dependent variation of bone remodeling parameters is recommended to better correlate remodeling simulation with clinical follow-up. The position of FPD pontics plays a critical role in mechanobiological functionality and bone remodeling. Caution should be exercised when selecting the cantilever FPD due to the risk of overloading bone resorption.
Stamey, Timothy C.
1998-01-01
Simple and reliable methods for estimating hourly streamflow are needed for the calibration and verification of a Chattahoochee River basin model between Buford Dam and Franklin, Ga. The river basin model is being developed by Georgia Department of Natural Resources, Environmental Protection Division, as part of their Chattahoochee River Modeling Project. Concurrent streamflow data collected at 19 continuous-record, and 31 partial-record streamflow stations, were used in ordinary least-squares linear regression analyses to define estimating equations, and in verifying drainage-area prorations. The resulting regression or drainage-area ratio estimating equations were used to compute hourly streamflow at the partial-record stations. The coefficients of determination (r-squared values) for the regression estimating equations ranged from 0.90 to 0.99. Observed and estimated hourly and daily streamflow data were computed for May 1, 1995, through October 31, 1995. Comparisons of observed and estimated daily streamflow data for 12 continuous-record tributary stations, that had available streamflow data for all or part of the period from May 1, 1995, to October 31, 1995, indicate that the mean error of estimate for the daily streamflow was about 25 percent.
Incremental online learning in high dimensions.
Vijayakumar, Sethu; D'Souza, Aaron; Schaal, Stefan
2005-12-01
Locally weighted projection regression (LWPR) is a new algorithm for incremental nonlinear function approximation in high-dimensional spaces with redundant and irrelevant input dimensions. At its core, it employs nonparametric regression with locally linear models. In order to stay computationally efficient and numerically robust, each local model performs the regression analysis with a small number of univariate regressions in selected directions in input space in the spirit of partial least squares regression. We discuss when and how local learning techniques can successfully work in high-dimensional spaces and review the various techniques for local dimensionality reduction before finally deriving the LWPR algorithm. The properties of LWPR are that it (1) learns rapidly with second-order learning methods based on incremental training, (2) uses statistically sound stochastic leave-one-out cross validation for learning without the need to memorize training data, (3) adjusts its weighting kernels based on only local information in order to minimize the danger of negative interference of incremental learning, (4) has a computational complexity that is linear in the number of inputs, and (5) can deal with a large number of-possibly redundant-inputs, as shown in various empirical evaluations with up to 90 dimensional data sets. For a probabilistic interpretation, predictive variance and confidence intervals are derived. To our knowledge, LWPR is the first truly incremental spatially localized learning method that can successfully and efficiently operate in very high-dimensional spaces.
NASA Astrophysics Data System (ADS)
Wu, W.; Chen, G. Y.; Kang, R.; Xia, J. C.; Huang, Y. P.; Chen, K. J.
2017-07-01
During slaughtering and further processing, chicken carcasses are inevitably contaminated by microbial pathogen contaminants. Due to food safety concerns, many countries implement a zero-tolerance policy that forbids the placement of visibly contaminated carcasses in ice-water chiller tanks during processing. Manual detection of contaminants is labor consuming and imprecise. Here, a successive projections algorithm (SPA)-multivariable linear regression (MLR) classifier based on an optimal performance threshold was developed for automatic detection of contaminants on chicken carcasses. Hyperspectral images were obtained using a hyperspectral imaging system. A regression model of the classifier was established by MLR based on twelve characteristic wavelengths (505, 537, 561, 562, 564, 575, 604, 627, 656, 665, 670, and 689 nm) selected by SPA , and the optimal threshold T = 1 was obtained from the receiver operating characteristic (ROC) analysis. The SPA-MLR classifier provided the best detection results when compared with the SPA-partial least squares (PLS) regression classifier and the SPA-least squares supported vector machine (LS-SVM) classifier. The true positive rate (TPR) of 100% and the false positive rate (FPR) of 0.392% indicate that the SPA-MLR classifier can utilize spatial and spectral information to effectively detect contaminants on chicken carcasses.
Elkhoudary, Mahmoud M; Naguib, Ibrahim A; Abdel Salam, Randa A; Hadad, Ghada M
2017-05-01
Four accurate, sensitive and reliable stability indicating chemometric methods were developed for the quantitative determination of Agomelatine (AGM) whether in pure form or in pharmaceutical formulations. Two supervised learning machines' methods; linear artificial neural networks (PC-linANN) preceded by principle component analysis and linear support vector regression (linSVR), were compared with two principle component based methods; principle component regression (PCR) as well as partial least squares (PLS) for the spectrofluorimetric determination of AGM and its degradants. The results showed the benefits behind using linear learning machines' methods and the inherent merits of their algorithms in handling overlapped noisy spectral data especially during the challenging determination of AGM alkaline and acidic degradants (DG1 and DG2). Relative mean squared error of prediction (RMSEP) for the proposed models in the determination of AGM were 1.68, 1.72, 0.68 and 0.22 for PCR, PLS, SVR and PC-linANN; respectively. The results showed the superiority of supervised learning machines' methods over principle component based methods. Besides, the results suggested that linANN is the method of choice for determination of components in low amounts with similar overlapped spectra and narrow linearity range. Comparison between the proposed chemometric models and a reported HPLC method revealed the comparable performance and quantification power of the proposed models.
ERIC Educational Resources Information Center
And Others; Werts, Charles E.
1979-01-01
It is shown how partial covariance, part and partial correlation, and regression weights can be estimated and tested for significance by means of a factor analytic model. Comparable partial covariance, correlations, and regression weights have identical significance tests. (Author)
Hao, Yong; Sun, Xu-Dong; Yang, Qiang
2012-12-01
Variables selection strategy combined with local linear embedding (LLE) was introduced for the analysis of complex samples by using near infrared spectroscopy (NIRS). Three methods include Monte Carlo uninformation variable elimination (MCUVE), successive projections algorithm (SPA) and MCUVE connected with SPA were used for eliminating redundancy spectral variables. Partial least squares regression (PLSR) and LLE-PLSR were used for modeling complex samples. The results shown that MCUVE can both extract effective informative variables and improve the precision of models. Compared with PLSR models, LLE-PLSR models can achieve more accurate analysis results. MCUVE combined with LLE-PLSR is an effective modeling method for NIRS quantitative analysis.
Noninvasive and fast measurement of blood glucose in vivo by near infrared (NIR) spectroscopy
NASA Astrophysics Data System (ADS)
Jintao, Xue; Liming, Ye; Yufei, Liu; Chunyan, Li; Han, Chen
2017-05-01
This research was to develop a method for noninvasive and fast blood glucose assay in vivo. Near-infrared (NIR) spectroscopy, a more promising technique compared to other methods, was investigated in rats with diabetes and normal rats. Calibration models are generated by two different multivariate strategies: partial least squares (PLS) as linear regression method and artificial neural networks (ANN) as non-linear regression method. The PLS model was optimized individually by considering spectral range, spectral pretreatment methods and number of model factors, while the ANN model was studied individually by selecting spectral pretreatment methods, parameters of network topology, number of hidden neurons, and times of epoch. The results of the validation showed the two models were robust, accurate and repeatable. Compared to the ANN model, the performance of the PLS model was much better, with lower root mean square error of validation (RMSEP) of 0.419 and higher correlation coefficients (R) of 96.22%.
Liu, Weijian; Wang, Yilong; Chen, Yuanchen; Tao, Shu; Liu, Wenxin
2017-07-01
The total concentrations and component profiles of polycyclic aromatic hydrocarbons (PAHs) in ambient air, surface soil and wheat grain collected from wheat fields near a large steel-smelting manufacturer in Northern China were determined. Based on the specific isomeric ratios of paired species in ambient air, principle component analysis and multivariate linear regression, the main emission source of local PAHs was identified as a mixture of industrial and domestic coal combustion, biomass burning and traffic exhaust. The total organic carbon (TOC) fraction was considerably correlated with the total and individual PAH concentrations in surface soil. The total concentrations of PAHs in wheat grain were relatively low, with dominant low molecular weight constituents, and the compositional profile was more similar to that in ambient air than in topsoil. Combined with more significant results from partial correlation and linear regression models, the contribution from air PAHs to grain PAHs may be greater than that from soil PAHs. Copyright © 2016. Published by Elsevier B.V.
Predicting major element mineral/melt equilibria - A statistical approach
NASA Technical Reports Server (NTRS)
Hostetler, C. J.; Drake, M. J.
1980-01-01
Empirical equations have been developed for calculating the mole fractions of NaO0.5, MgO, AlO1.5, SiO2, KO0.5, CaO, TiO2, and FeO in a solid phase of initially unknown identity given only the composition of the coexisting silicate melt. The approach involves a linear multivariate regression analysis in which solid composition is expressed as a Taylor series expansion of the liquid compositions. An internally consistent precision of approximately 0.94 is obtained, that is, the nature of the liquidus phase in the input data set can be correctly predicted for approximately 94% of the entries. The composition of the liquidus phase may be calculated to better than 5 mol % absolute. An important feature of this 'generalized solid' model is its reversibility; that is, the dependent and independent variables in the linear multivariate regression may be inverted to permit prediction of the composition of a silicate liquid produced by equilibrium partial melting of a polymineralic source assemblage.
O'Neill, William; Penn, Richard; Werner, Michael; Thomas, Justin
2015-06-01
Estimation of stochastic process models from data is a common application of time series analysis methods. Such system identification processes are often cast as hypothesis testing exercises whose intent is to estimate model parameters and test them for statistical significance. Ordinary least squares (OLS) regression and the Levenberg-Marquardt algorithm (LMA) have proven invaluable computational tools for models being described by non-homogeneous, linear, stationary, ordinary differential equations. In this paper we extend stochastic model identification to linear, stationary, partial differential equations in two independent variables (2D) and show that OLS and LMA apply equally well to these systems. The method employs an original nonparametric statistic as a test for the significance of estimated parameters. We show gray scale and color images are special cases of 2D systems satisfying a particular autoregressive partial difference equation which estimates an analogous partial differential equation. Several applications to medical image modeling and classification illustrate the method by correctly classifying demented and normal OLS models of axial magnetic resonance brain scans according to subject Mini Mental State Exam (MMSE) scores. Comparison with 13 image classifiers from the literature indicates our classifier is at least 14 times faster than any of them and has a classification accuracy better than all but one. Our modeling method applies to any linear, stationary, partial differential equation and the method is readily extended to 3D whole-organ systems. Further, in addition to being a robust image classifier, estimated image models offer insights into which parameters carry the most diagnostic image information and thereby suggest finer divisions could be made within a class. Image models can be estimated in milliseconds which translate to whole-organ models in seconds; such runtimes could make real-time medicine and surgery modeling possible.
Optimizing methods for linking cinematic features to fMRI data.
Kauttonen, Janne; Hlushchuk, Yevhen; Tikka, Pia
2015-04-15
One of the challenges of naturalistic neurosciences using movie-viewing experiments is how to interpret observed brain activations in relation to the multiplicity of time-locked stimulus features. As previous studies have shown less inter-subject synchronization across viewers of random video footage than story-driven films, new methods need to be developed for analysis of less story-driven contents. To optimize the linkage between our fMRI data collected during viewing of a deliberately non-narrative silent film 'At Land' by Maya Deren (1944) and its annotated content, we combined the method of elastic-net regularization with the model-driven linear regression and the well-established data-driven independent component analysis (ICA) and inter-subject correlation (ISC) methods. In the linear regression analysis, both IC and region-of-interest (ROI) time-series were fitted with time-series of a total of 36 binary-valued and one real-valued tactile annotation of film features. The elastic-net regularization and cross-validation were applied in the ordinary least-squares linear regression in order to avoid over-fitting due to the multicollinearity of regressors, the results were compared against both the partial least-squares (PLS) regression and the un-regularized full-model regression. Non-parametric permutation testing scheme was applied to evaluate the statistical significance of regression. We found statistically significant correlation between the annotation model and 9 ICs out of 40 ICs. Regression analysis was also repeated for a large set of cubic ROIs covering the grey matter. Both IC- and ROI-based regression analyses revealed activations in parietal and occipital regions, with additional smaller clusters in the frontal lobe. Furthermore, we found elastic-net based regression more sensitive than PLS and un-regularized regression since it detected a larger number of significant ICs and ROIs. Along with the ISC ranking methods, our regression analysis proved a feasible method for ordering the ICs based on their functional relevance to the annotated cinematic features. The novelty of our method is - in comparison to the hypothesis-driven manual pre-selection and observation of some individual regressors biased by choice - in applying data-driven approach to all content features simultaneously. We found especially the combination of regularized regression and ICA useful when analyzing fMRI data obtained using non-narrative movie stimulus with a large set of complex and correlated features. Copyright © 2015. Published by Elsevier Inc.
Hunt, Julie E A; Stodart, Clare; Ferguson, Richard A
2016-07-01
Previous investigations to establish factors influencing the blood flow restriction (BFR) stimulus have determined cuff pressures required for complete arterial occlusion, which does not reflect the partial restriction prescribed for this training technique. This study aimed to establish characteristics that should be accounted for when prescribing cuff pressures required for partial BFR. Fifty participants were subjected to incremental blood flow restriction of the upper and lower limbs by proximal pneumatic cuff inflation. Popliteal and brachial artery diameter, blood velocity and blood flow was assessed with Doppler ultrasound. Height, body mass, limb circumference, muscle-bone cross-sectional area, adipose thickness (AT) and arterial blood pressure were measured and used in different models of hierarchical linear regression to predict the pressure at which 60 % BFR (partial occlusion) occurred. Combined analysis revealed a difference in cuff pressures required to elicit 60 % BFR in the popliteal (111 ± 12 mmHg) and brachial arteries (101 ± 12 mmHg). MAP (r = 0.58) and AT (r = -0.45) were the largest independent determinants of lower and upper body partial occlusion pressures. However, greater variance was explained by upper and lower limb regression models composed of DBP and BMI (48 %), and arm AT and DBP (30 %), respectively. Limb circumference has limited impact on the cuff pressure required for partial blood flow restriction which is in contrast to its recognised relationship with complete arterial occlusion. The majority of the variance in partial occlusion pressure remains unexplained by the predictor variables assessed in the present study.
Guertin, Kristin A; Loftfield, Erikka; Boca, Simina M; Sampson, Joshua N; Moore, Steven C; Xiao, Qian; Huang, Wen-Yi; Xiong, Xiaoqin; Freedman, Neal D; Cross, Amanda J; Sinha, Rashmi
2015-01-01
Background: Coffee intake may be inversely associated with colorectal cancer; however, previous studies have been inconsistent. Serum coffee metabolites are integrated exposure measures that may clarify associations with cancer and elucidate underlying mechanisms. Objectives: Our aims were 2-fold as follows: 1) to identify serum metabolites associated with coffee intake and 2) to examine these metabolites in relation to colorectal cancer. Design: In a nested case-control study of 251 colorectal cancer cases and 247 matched control subjects from the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial, we conducted untargeted metabolomics analyses of baseline serum by using ultrahigh-performance liquid-phase chromatography–tandem mass spectrometry and gas chromatography–mass spectrometry. Usual coffee intake was self-reported in a food-frequency questionnaire. We used partial Pearson correlations and linear regression to identify serum metabolites associated with coffee intake and conditional logistic regression to evaluate associations between coffee metabolites and colorectal cancer. Results: After Bonferroni correction for multiple comparisons (P = 0.05 ÷ 657 metabolites), 29 serum metabolites were positively correlated with coffee intake (partial correlation coefficients: 0.18–0.61; P < 7.61 × 10−5); serum metabolites most highly correlated with coffee intake (partial correlation coefficients >0.40) included trigonelline (N′-methylnicotinate), quinate, and 7 unknown metabolites. Of 29 serum metabolites, 8 metabolites were directly related to caffeine metabolism, and 3 of these metabolites, theophylline (OR for 90th compared with 10th percentiles: 0.44; 95% CI: 0.25, 0.79; P-linear trend = 0.006), caffeine (OR for 90th compared with 10th percentiles: 0.56; 95% CI: 0.35, 0.89; P-linear trend = 0.015), and paraxanthine (OR for 90th compared with 10th percentiles: 0.58; 95% CI: 0.36, 0.94; P-linear trend = 0.027), were inversely associated with colorectal cancer. Conclusions: Serum metabolites can distinguish coffee drinkers from nondrinkers; some caffeine-related metabolites were inversely associated with colorectal cancer and should be studied further to clarify the role of coffee in the cause of colorectal cancer. The Prostate, Lung, Colorectal, and Ovarian trial was registered at clinicaltrials.gov as NCT00002540. PMID:25762808
Lin, Lixin; Wang, Yunjia; Teng, Jiyao; Wang, Xuchen
2016-02-01
Hyperspectral estimation of soil organic matter (SOM) in coal mining regions is an important tool for enhancing fertilization in soil restoration programs. The correlation--partial least squares regression (PLSR) method effectively solves the information loss problem of correlation--multiple linear stepwise regression, but results of the correlation analysis must be optimized to improve precision. This study considers the relationship between spectral reflectance and SOM based on spectral reflectance curves of soil samples collected from coal mining regions. Based on the major absorption troughs in the 400-1006 nm spectral range, PLSR analysis was performed using 289 independent bands of the second derivative (SDR) with three levels and measured SOM values. A wavelet-correlation-PLSR (W-C-PLSR) model was then constructed. By amplifying useful information that was previously obscured by noise, the W-C-PLSR model was optimal for estimating SOM content, with smaller prediction errors in both calibration (R(2) = 0.970, root mean square error (RMSEC) = 3.10, and mean relative error (MREC) = 8.75) and validation (RMSEV = 5.85 and MREV = 14.32) analyses, as compared with other models. Results indicate that W-C-PLSR has great potential to estimate SOM in coal mining regions.
A novel multi-target regression framework for time-series prediction of drug efficacy.
Li, Haiqing; Zhang, Wei; Chen, Ying; Guo, Yumeng; Li, Guo-Zheng; Zhu, Xiaoxin
2017-01-18
Excavating from small samples is a challenging pharmacokinetic problem, where statistical methods can be applied. Pharmacokinetic data is special due to the small samples of high dimensionality, which makes it difficult to adopt conventional methods to predict the efficacy of traditional Chinese medicine (TCM) prescription. The main purpose of our study is to obtain some knowledge of the correlation in TCM prescription. Here, a novel method named Multi-target Regression Framework to deal with the problem of efficacy prediction is proposed. We employ the correlation between the values of different time sequences and add predictive targets of previous time as features to predict the value of current time. Several experiments are conducted to test the validity of our method and the results of leave-one-out cross-validation clearly manifest the competitiveness of our framework. Compared with linear regression, artificial neural networks, and partial least squares, support vector regression combined with our framework demonstrates the best performance, and appears to be more suitable for this task.
A novel multi-target regression framework for time-series prediction of drug efficacy
Li, Haiqing; Zhang, Wei; Chen, Ying; Guo, Yumeng; Li, Guo-Zheng; Zhu, Xiaoxin
2017-01-01
Excavating from small samples is a challenging pharmacokinetic problem, where statistical methods can be applied. Pharmacokinetic data is special due to the small samples of high dimensionality, which makes it difficult to adopt conventional methods to predict the efficacy of traditional Chinese medicine (TCM) prescription. The main purpose of our study is to obtain some knowledge of the correlation in TCM prescription. Here, a novel method named Multi-target Regression Framework to deal with the problem of efficacy prediction is proposed. We employ the correlation between the values of different time sequences and add predictive targets of previous time as features to predict the value of current time. Several experiments are conducted to test the validity of our method and the results of leave-one-out cross-validation clearly manifest the competitiveness of our framework. Compared with linear regression, artificial neural networks, and partial least squares, support vector regression combined with our framework demonstrates the best performance, and appears to be more suitable for this task. PMID:28098186
NASA Astrophysics Data System (ADS)
Chen, Hua-cai; Chen, Xing-dan; Lu, Yong-jun; Cao, Zhi-qiang
2006-01-01
Near infrared (NIR) reflectance spectroscopy was used to develop a fast determination method for total ginsenosides in Ginseng (Panax Ginseng) powder. The spectra were analyzed with multiplicative signal correction (MSC) correlation method. The best correlative spectra region with the total ginsenosides content was 1660 nm~1880 nm and 2230nm~2380 nm. The NIR calibration models of ginsenosides were built with multiple linear regression (MLR), principle component regression (PCR) and partial least squares (PLS) regression respectively. The results showed that the calibration model built with PLS combined with MSC and the optimal spectrum region was the best one. The correlation coefficient and the root mean square error of correction validation (RMSEC) of the best calibration model were 0.98 and 0.15% respectively. The optimal spectrum region for calibration was 1204nm~2014nm. The result suggested that using NIR to rapidly determinate the total ginsenosides content in ginseng powder were feasible.
Fowler, Stephanie M; Schmidt, Heinar; van de Ven, Remy; Wynn, Peter; Hopkins, David L
2014-12-01
A Raman spectroscopic hand held device was used to predict shear force (SF) of 80 fresh lamb m. longissimus lumborum (LL) at 1 and 5days post mortem (PM). Traditional predictors of SF including sarcomere length (SL), particle size (PS), cooking loss (CL), percentage myofibrillar breaks and pH were also measured. SF values were regressed against Raman spectra using partial least squares regression and against the traditional predictors using linear regression. The best prediction of shear force values used spectra at 1day PM to predict shear force at 1day which gave a root mean square error of prediction (RMSEP) of 13.6 (Null=14.0) and the R(2) between observed and cross validated predicted values was 0.06 (R(2)cv). Overall, for fresh LL, the predictability SF, by either the Raman hand held probe or traditional predictors was low. Copyright © 2014 Elsevier Ltd. All rights reserved.
Posa, Mihalj; Pilipović, Ana; Lalić, Mladena; Popović, Jovan
2011-02-15
Linear dependence between temperature (t) and retention coefficient (k, reversed phase HPLC) of bile acids is obtained. Parameters (a, intercept and b, slope) of the linear function k=f(t) highly correlate with bile acids' structures. Investigated bile acids form linear congeneric groups on a principal component (calculated from k=f(t)) score plot that are in accordance with conformations of the hydroxyl and oxo groups in a bile acid steroid skeleton. Partition coefficient (K(p)) of nitrazepam in bile acids' micelles is investigated. Nitrazepam molecules incorporated in micelles show modified bioavailability (depo effect, higher permeability, etc.). Using multiple linear regression method QSAR models of nitrazepams' partition coefficient, K(p) are derived on the temperatures of 25°C and 37°C. For deriving linear regression models on both temperatures experimentally obtained lipophilicity parameters are included (PC1 from data k=f(t)) and in silico descriptors of the shape of a molecule while on the higher temperature molecular polarisation is introduced. This indicates the fact that the incorporation mechanism of nitrazepam in BA micelles changes on the higher temperatures. QSAR models are derived using partial least squares method as well. Experimental parameters k=f(t) are shown to be significant predictive variables. Both QSAR models are validated using cross validation and internal validation method. PLS models have slightly higher predictive capability than MLR models. Copyright © 2010 Elsevier B.V. All rights reserved.
Aryee, Samuel; Walumbwa, Fred O; Seidu, Emmanuel Y M; Otaye, Lilian E
2012-03-01
We proposed and tested a multilevel model, underpinned by empowerment theory, that examines the processes linking high-performance work systems (HPWS) and performance outcomes at the individual and organizational levels of analyses. Data were obtained from 37 branches of 2 banking institutions in Ghana. Results of hierarchical regression analysis revealed that branch-level HPWS relates to empowerment climate. Additionally, results of hierarchical linear modeling that examined the hypothesized cross-level relationships revealed 3 salient findings. First, experienced HPWS and empowerment climate partially mediate the influence of branch-level HPWS on psychological empowerment. Second, psychological empowerment partially mediates the influence of empowerment climate and experienced HPWS on service performance. Third, service orientation moderates the psychological empowerment-service performance relationship such that the relationship is stronger for those high rather than low in service orientation. Last, ordinary least squares regression results revealed that branch-level HPWS influences branch-level market performance through cross-level and individual-level influences on service performance that emerges at the branch level as aggregated service performance.
Sato, Kohei; Sadamoto, Tomoko; Hirasawa, Ai; Oue, Anna; Subudhi, Andrew W; Miyazawa, Taiki; Ogoh, Shigehiko
2012-01-01
Arterial CO2 serves as a mediator of cerebral blood flow (CBF), and its relative influence on the regulation of CBF is defined as cerebral CO2 reactivity. Our previous studies have demonstrated that there are differences in CBF responses to physiological stimuli (i.e. dynamic exercise and orthostatic stress) between arteries in humans. These findings suggest that dynamic CBF regulation and cerebral CO2 reactivity may be different in the anterior and posterior cerebral circulation. The aim of this study was to identify cerebral CO2 reactivity by measuring blood flow and examine potential differences in CO2 reactivity between the internal carotid artery (ICA), external carotid artery (ECA) and vertebral artery (VA). In 10 healthy young subjects, we evaluated the ICA, ECA, and VA blood flow responses by duplex ultrasonography (Vivid-e, GE Healthcare), and mean blood flow velocity in middle cerebral artery (MCA) and basilar artery (BA) by transcranial Doppler (Vivid-7, GE healthcare) during two levels of hypercapnia (3% and 6% CO2), normocapnia and hypocapnia to estimate CO2 reactivity. To characterize cerebrovascular reactivity to CO2, we used both exponential and linear regression analysis between CBF and estimated partial pressure of arterial CO2, calculated by end-tidal partial pressure of CO2. CO2 reactivity in VA was significantly lower than in ICA (coefficient of exponential regression 0.021 ± 0.008 vs. 0.030 ± 0.008; slope of linear regression 2.11 ± 0.84 vs. 3.18 ± 1.09% mmHg−1: VA vs. ICA, P < 0.01). Lower CO2 reactivity in the posterior cerebral circulation was persistent in distal intracranial arteries (exponent 0.023 ± 0.006 vs. 0.037 ± 0.009; linear 2.29 ± 0.56 vs. 3.31 ± 0.87% mmHg−1: BA vs. MCA). In contrast, CO2 reactivity in ECA was markedly lower than in the intra-cerebral circulation (exponent 0.006 ± 0.007; linear 0.63 ± 0.64% mmHg−1, P < 0.01). These findings indicate that vertebro-basilar circulation has lower CO2 reactivity than internal carotid circulation, and that CO2 reactivity of the external carotid circulation is markedly diminished compared to that of the cerebral circulation, which may explain different CBF responses to physiological stress. PMID:22526884
Sharma, Ashok K; Srivastava, Gopal N; Roy, Ankita; Sharma, Vineet K
2017-01-01
The experimental methods for the prediction of molecular toxicity are tedious and time-consuming tasks. Thus, the computational approaches could be used to develop alternative methods for toxicity prediction. We have developed a tool for the prediction of molecular toxicity along with the aqueous solubility and permeability of any molecule/metabolite. Using a comprehensive and curated set of toxin molecules as a training set, the different chemical and structural based features such as descriptors and fingerprints were exploited for feature selection, optimization and development of machine learning based classification and regression models. The compositional differences in the distribution of atoms were apparent between toxins and non-toxins, and hence, the molecular features were used for the classification and regression. On 10-fold cross-validation, the descriptor-based, fingerprint-based and hybrid-based classification models showed similar accuracy (93%) and Matthews's correlation coefficient (0.84). The performances of all the three models were comparable (Matthews's correlation coefficient = 0.84-0.87) on the blind dataset. In addition, the regression-based models using descriptors as input features were also compared and evaluated on the blind dataset. Random forest based regression model for the prediction of solubility performed better ( R 2 = 0.84) than the multi-linear regression (MLR) and partial least square regression (PLSR) models, whereas, the partial least squares based regression model for the prediction of permeability (caco-2) performed better ( R 2 = 0.68) in comparison to the random forest and MLR based regression models. The performance of final classification and regression models was evaluated using the two validation datasets including the known toxins and commonly used constituents of health products, which attests to its accuracy. The ToxiM web server would be a highly useful and reliable tool for the prediction of toxicity, solubility, and permeability of small molecules.
Sharma, Ashok K.; Srivastava, Gopal N.; Roy, Ankita; Sharma, Vineet K.
2017-01-01
The experimental methods for the prediction of molecular toxicity are tedious and time-consuming tasks. Thus, the computational approaches could be used to develop alternative methods for toxicity prediction. We have developed a tool for the prediction of molecular toxicity along with the aqueous solubility and permeability of any molecule/metabolite. Using a comprehensive and curated set of toxin molecules as a training set, the different chemical and structural based features such as descriptors and fingerprints were exploited for feature selection, optimization and development of machine learning based classification and regression models. The compositional differences in the distribution of atoms were apparent between toxins and non-toxins, and hence, the molecular features were used for the classification and regression. On 10-fold cross-validation, the descriptor-based, fingerprint-based and hybrid-based classification models showed similar accuracy (93%) and Matthews's correlation coefficient (0.84). The performances of all the three models were comparable (Matthews's correlation coefficient = 0.84–0.87) on the blind dataset. In addition, the regression-based models using descriptors as input features were also compared and evaluated on the blind dataset. Random forest based regression model for the prediction of solubility performed better (R2 = 0.84) than the multi-linear regression (MLR) and partial least square regression (PLSR) models, whereas, the partial least squares based regression model for the prediction of permeability (caco-2) performed better (R2 = 0.68) in comparison to the random forest and MLR based regression models. The performance of final classification and regression models was evaluated using the two validation datasets including the known toxins and commonly used constituents of health products, which attests to its accuracy. The ToxiM web server would be a highly useful and reliable tool for the prediction of toxicity, solubility, and permeability of small molecules. PMID:29249969
Reliable two-dimensional phase unwrapping method using region growing and local linear estimation.
Zhou, Kun; Zaitsev, Maxim; Bao, Shanglian
2009-10-01
In MRI, phase maps can provide useful information about parameters such as field inhomogeneity, velocity of blood flow, and the chemical shift between water and fat. As phase is defined in the (-pi,pi] range, however, phase wraps often occur, which complicates image analysis and interpretation. This work presents a two-dimensional phase unwrapping algorithm that uses quality-guided region growing and local linear estimation. The quality map employs the variance of the second-order partial derivatives of the phase as the quality criterion. Phase information from unwrapped neighboring pixels is used to predict the correct phase of the current pixel using a linear regression method. The algorithm was tested on both simulated and real data, and is shown to successfully unwrap phase images that are corrupted by noise and have rapidly changing phase. (c) 2009 Wiley-Liss, Inc.
Agier, Lydiane; Portengen, Lützen; Chadeau-Hyam, Marc; Basagaña, Xavier; Giorgis-Allemand, Lise; Siroux, Valérie; Robinson, Oliver; Vlaanderen, Jelle; González, Juan R; Nieuwenhuijsen, Mark J; Vineis, Paolo; Vrijheid, Martine; Slama, Rémy; Vermeulen, Roel
2016-12-01
The exposome constitutes a promising framework to improve understanding of the effects of environmental exposures on health by explicitly considering multiple testing and avoiding selective reporting. However, exposome studies are challenged by the simultaneous consideration of many correlated exposures. We compared the performances of linear regression-based statistical methods in assessing exposome-health associations. In a simulation study, we generated 237 exposure covariates with a realistic correlation structure and with a health outcome linearly related to 0 to 25 of these covariates. Statistical methods were compared primarily in terms of false discovery proportion (FDP) and sensitivity. On average over all simulation settings, the elastic net and sparse partial least-squares regression showed a sensitivity of 76% and an FDP of 44%; Graphical Unit Evolutionary Stochastic Search (GUESS) and the deletion/substitution/addition (DSA) algorithm revealed a sensitivity of 81% and an FDP of 34%. The environment-wide association study (EWAS) underperformed these methods in terms of FDP (average FDP, 86%) despite a higher sensitivity. Performances decreased considerably when assuming an exposome exposure matrix with high levels of correlation between covariates. Correlation between exposures is a challenge for exposome research, and the statistical methods investigated in this study were limited in their ability to efficiently differentiate true predictors from correlated covariates in a realistic exposome context. Although GUESS and DSA provided a marginally better balance between sensitivity and FDP, they did not outperform the other multivariate methods across all scenarios and properties examined, and computational complexity and flexibility should also be considered when choosing between these methods. Citation: Agier L, Portengen L, Chadeau-Hyam M, Basagaña X, Giorgis-Allemand L, Siroux V, Robinson O, Vlaanderen J, González JR, Nieuwenhuijsen MJ, Vineis P, Vrijheid M, Slama R, Vermeulen R. 2016. A systematic comparison of linear regression-based statistical methods to assess exposome-health associations. Environ Health Perspect 124:1848-1856; http://dx.doi.org/10.1289/EHP172.
Dawson, Debra Ann; Lam, Jack; Lewis, Lindsay B.; Carbonell, Felix; Mendola, Janine D.
2016-01-01
Abstract Numerous studies have demonstrated functional magnetic resonance imaging (fMRI)-based resting-state functional connectivity (RSFC) between cortical areas. Recent evidence suggests that synchronous fluctuations in blood oxygenation level-dependent fMRI reflect functional organization at a scale finer than that of visual areas. In this study, we investigated whether RSFCs within and between lower visual areas are retinotopically organized and whether retinotopically organized RSFC merely reflects cortical distance. Subjects underwent retinotopic mapping and separately resting-state fMRI. Visual areas V1, V2, and V3, were subdivided into regions of interest (ROIs) according to quadrants and visual field eccentricity. Functional connectivity (FC) was computed based on Pearson's linear correlation (correlation), and Pearson's linear partial correlation (correlation between two time courses after the time courses from all other regions in the network are regressed out). Within a quadrant, within visual areas, all correlation and nearly all partial correlation FC measures showed statistical significance. Consistently in V1, V2, and to a lesser extent in V3, correlation decreased with increasing eccentricity separation. Consistent with previously reported monkey anatomical connectivity, correlation/partial correlation values between regions from adjacent areas (V1-V2 and V2-V3) were higher than those between nonadjacent areas (V1-V3). Within a quadrant, partial correlation showed consistent significance between regions from two different areas with the same or adjacent eccentricities. Pairs of ROIs with similar eccentricity showed higher correlation/partial correlation than pairs distant in eccentricity. Between dorsal and ventral quadrants, partial correlation between common and adjacent eccentricity regions within a visual area showed statistical significance; this extended to more distant eccentricity regions in V1. Within and between quadrants, correlation decreased approximately linearly with increasing distances separating the tested ROIs. Partial correlation showed a more complex dependence on cortical distance: it decreased exponentially with increasing distance within a quadrant, but was best fit by a quadratic function between quadrants. We conclude that RSFCs within and between lower visual areas are retinotopically organized. Correlation-based FC is nonselectively high across lower visual areas, even between regions that do not share direct anatomical connections. The mechanisms likely involve network effects caused by the dense anatomical connectivity within this network and projections from higher visual areas. FC based on partial correlation, which minimizes network effects, follows expectations based on direct anatomical connections in the monkey visual cortex better than correlation. Last, partial correlation-based retinotopically organized RSFC reflects more than cortical distance effects. PMID:26415043
Dawson, Debra Ann; Lam, Jack; Lewis, Lindsay B; Carbonell, Felix; Mendola, Janine D; Shmuel, Amir
2016-02-01
Numerous studies have demonstrated functional magnetic resonance imaging (fMRI)-based resting-state functional connectivity (RSFC) between cortical areas. Recent evidence suggests that synchronous fluctuations in blood oxygenation level-dependent fMRI reflect functional organization at a scale finer than that of visual areas. In this study, we investigated whether RSFCs within and between lower visual areas are retinotopically organized and whether retinotopically organized RSFC merely reflects cortical distance. Subjects underwent retinotopic mapping and separately resting-state fMRI. Visual areas V1, V2, and V3, were subdivided into regions of interest (ROIs) according to quadrants and visual field eccentricity. Functional connectivity (FC) was computed based on Pearson's linear correlation (correlation), and Pearson's linear partial correlation (correlation between two time courses after the time courses from all other regions in the network are regressed out). Within a quadrant, within visual areas, all correlation and nearly all partial correlation FC measures showed statistical significance. Consistently in V1, V2, and to a lesser extent in V3, correlation decreased with increasing eccentricity separation. Consistent with previously reported monkey anatomical connectivity, correlation/partial correlation values between regions from adjacent areas (V1-V2 and V2-V3) were higher than those between nonadjacent areas (V1-V3). Within a quadrant, partial correlation showed consistent significance between regions from two different areas with the same or adjacent eccentricities. Pairs of ROIs with similar eccentricity showed higher correlation/partial correlation than pairs distant in eccentricity. Between dorsal and ventral quadrants, partial correlation between common and adjacent eccentricity regions within a visual area showed statistical significance; this extended to more distant eccentricity regions in V1. Within and between quadrants, correlation decreased approximately linearly with increasing distances separating the tested ROIs. Partial correlation showed a more complex dependence on cortical distance: it decreased exponentially with increasing distance within a quadrant, but was best fit by a quadratic function between quadrants. We conclude that RSFCs within and between lower visual areas are retinotopically organized. Correlation-based FC is nonselectively high across lower visual areas, even between regions that do not share direct anatomical connections. The mechanisms likely involve network effects caused by the dense anatomical connectivity within this network and projections from higher visual areas. FC based on partial correlation, which minimizes network effects, follows expectations based on direct anatomical connections in the monkey visual cortex better than correlation. Last, partial correlation-based retinotopically organized RSFC reflects more than cortical distance effects.
NASA Astrophysics Data System (ADS)
Boucher, Thomas F.; Ozanne, Marie V.; Carmosino, Marco L.; Dyar, M. Darby; Mahadevan, Sridhar; Breves, Elly A.; Lepore, Kate H.; Clegg, Samuel M.
2015-05-01
The ChemCam instrument on the Mars Curiosity rover is generating thousands of LIBS spectra and bringing interest in this technique to public attention. The key to interpreting Mars or any other types of LIBS data are calibrations that relate laboratory standards to unknowns examined in other settings and enable predictions of chemical composition. Here, LIBS spectral data are analyzed using linear regression methods including partial least squares (PLS-1 and PLS-2), principal component regression (PCR), least absolute shrinkage and selection operator (lasso), elastic net, and linear support vector regression (SVR-Lin). These were compared against results from nonlinear regression methods including kernel principal component regression (K-PCR), polynomial kernel support vector regression (SVR-Py) and k-nearest neighbor (kNN) regression to discern the most effective models for interpreting chemical abundances from LIBS spectra of geological samples. The results were evaluated for 100 samples analyzed with 50 laser pulses at each of five locations averaged together. Wilcoxon signed-rank tests were employed to evaluate the statistical significance of differences among the nine models using their predicted residual sum of squares (PRESS) to make comparisons. For MgO, SiO2, Fe2O3, CaO, and MnO, the sparse models outperform all the others except for linear SVR, while for Na2O, K2O, TiO2, and P2O5, the sparse methods produce inferior results, likely because their emission lines in this energy range have lower transition probabilities. The strong performance of the sparse methods in this study suggests that use of dimensionality-reduction techniques as a preprocessing step may improve the performance of the linear models. Nonlinear methods tend to overfit the data and predict less accurately, while the linear methods proved to be more generalizable with better predictive performance. These results are attributed to the high dimensionality of the data (6144 channels) relative to the small number of samples studied. The best-performing models were SVR-Lin for SiO2, MgO, Fe2O3, and Na2O, lasso for Al2O3, elastic net for MnO, and PLS-1 for CaO, TiO2, and K2O. Although these differences in model performance between methods were identified, most of the models produce comparable results when p ≤ 0.05 and all techniques except kNN produced statistically-indistinguishable results. It is likely that a combination of models could be used together to yield a lower total error of prediction, depending on the requirements of the user.
Gierlinger, Notburga; Luss, Saskia; König, Christian; Konnerth, Johannes; Eder, Michaela; Fratzl, Peter
2010-01-01
The functional characteristics of plant cell walls depend on the composition of the cell wall polymers, as well as on their highly ordered architecture at scales from a few nanometres to several microns. Raman spectra of wood acquired with linear polarized laser light include information about polymer composition as well as the alignment of cellulose microfibrils with respect to the fibre axis (microfibril angle). By changing the laser polarization direction in 3 degrees steps, the dependency between cellulose and laser orientation direction was investigated. Orientation-dependent changes of band height ratios and spectra were described by quadratic linear regression and partial least square regressions, respectively. Using the models and regressions with high coefficients of determination (R(2) > 0.99) microfibril orientation was predicted in the S1 and S2 layers distinguished by the Raman imaging approach in cross-sections of spruce normal, opposite, and compression wood. The determined microfibril angle (MFA) in the different S2 layers ranged from 0 degrees to 49.9 degrees and was in coincidence with X-ray diffraction determination. With the prerequisite of geometric sample and laser alignment, exact MFA prediction can complete the picture of the chemical cell wall design gained by the Raman imaging approach at the micron level in all plant tissues.
Advanced statistics: linear regression, part I: simple linear regression.
Marill, Keith A
2004-01-01
Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.
Optimizing the physical ergonomics indices for the use of partial pressure suits.
Ding, Li; Li, Xianxue; Hedge, Alan; Hu, Huimin; Feathers, David; Qin, Zhifeng; Xiao, Huajun; Xue, Lihao; Zhou, Qianxiang
2015-03-01
This study developed an ergonomic evaluation system for the design of high-altitude partial pressure suits (PPSs). A total of twenty-one Chinese males participated in the experiment which tested three types of ergonomics indices (manipulative mission, operational reach and operational strength) were studied using a three-dimensional video-based motion capture system, a target-pointing board, a hand dynamometer, and a step-tread apparatus. In total, 36 ergonomics indices were evaluated and optimized using regression and fitting analysis. Some indices that were found to be linearly related and redundant were removed from the study. An optimal ergonomics index system was established that can be used to conveniently and quickly evaluate the performance of different pressurized/non-pressurized suit designs. The resulting ergonomics index system will provide a theoretical basis and practical guidance for mission planners, suit designers and engineers to design equipment for human use, and to aid in assessing partial pressure suits. Copyright © 2014 Elsevier Ltd and The Ergonomics Society. All rights reserved.
Kim, Dae Keun; Jang, Yujin; Lee, Jaeseon; Hong, Helen; Kim, Ki Hong; Shin, Tae Young; Jung, Dae Chul; Choi, Young Deuk; Rha, Koon Ho
2015-12-01
To analyze long-term changes in both kidneys, and to predict renal function and contralateral hypertrophy after robot-assisted partial nephrectomy. A total of 62 patients underwent robot-assisted partial nephrectomy, and renal parenchymal volume was calculated using three-dimensional semi-automatic segmentation technology. Patients were evaluated within 1 month preoperatively, and postoperatively at 6 months, 1 year and continued up to 2-year follow up. Linear regression models were used to identify the factors predicting variables that correlated with estimated glomerular filtration rate changes and contralateral hypertrophy 2 years after robot-assisted partial nephrectomy. The median global estimated glomerular filtration rate changes were -10.4%, -11.9%, and -2.4% at 6 months, 1 and 2 years post-robot-assisted partial nephrectomy, respectively. The ipsilateral kidney median parenchymal volume changes were -24%, -24.4%, and -21% at 6 months, 1 and 2 years post-robot-assisted partial nephrectomy, respectively. The contralateral renal volume changes were 2.3%, 9.6% and 12.9%, respectively. On multivariable linear analysis, preoperative estimated glomerular filtration rate was the best predictive factor for global estimated glomerular filtration rate change on 2 years post-robot-assisted partial nephrectomy (B -0.452; 95% confidence interval -0.84 to -0.14; P = 0.021), whereas the parenchymal volume loss rate (B -0.43; 95% confidence interval -0.89 to -0.15; P = 0.017) and tumor size (B 5.154; 95% confidence interval -0.11 to 9.98; P = 0.041) were the significant predictive factors for the degree of contralateral renal hypertrophy on 2 years post-robot-assisted partial nephrectomy. Preoperative estimated glomerular filtration rate significantly affects post-robot-assisted partial nephrectomy renal function. Renal mass size and renal parenchyma volume loss correlates with compensatory hypertrophy of the contralateral kidney. Contralateral hypertrophy of the renal parenchyma compensates for the functional loss of the ipsilateral kidney. © 2015 The Japanese Urological Association.
Zhu, Hongyan; Chu, Bingquan; Fan, Yangyang; Tao, Xiaoya; Yin, Wenxin; He, Yong
2017-08-10
We investigated the feasibility and potentiality of determining firmness, soluble solids content (SSC), and pH in kiwifruits using hyperspectral imaging, combined with variable selection methods and calibration models. The images were acquired by a push-broom hyperspectral reflectance imaging system covering two spectral ranges. Weighted regression coefficients (BW), successive projections algorithm (SPA) and genetic algorithm-partial least square (GAPLS) were compared and evaluated for the selection of effective wavelengths. Moreover, multiple linear regression (MLR), partial least squares regression and least squares support vector machine (LS-SVM) were developed to predict quality attributes quantitatively using effective wavelengths. The established models, particularly SPA-MLR, SPA-LS-SVM and GAPLS-LS-SVM, performed well. The SPA-MLR models for firmness (R pre = 0.9812, RPD = 5.17) and SSC (R pre = 0.9523, RPD = 3.26) at 380-1023 nm showed excellent performance, whereas GAPLS-LS-SVM was the optimal model at 874-1734 nm for predicting pH (R pre = 0.9070, RPD = 2.60). Image processing algorithms were developed to transfer the predictive model in every pixel to generate prediction maps that visualize the spatial distribution of firmness and SSC. Hence, the results clearly demonstrated that hyperspectral imaging has the potential as a fast and non-invasive method to predict the quality attributes of kiwifruits.
Yu, Peigen; Low, Mei Yin; Zhou, Weibiao
2018-01-01
In order to develop products that would be preferred by consumers, the effects of the chemical compositions of ready-to-drink green tea beverages on consumer liking were studied through regression analyses. Green tea model systems were prepared by dosing solutions of 0.1% green tea extract with differing concentrations of eight flavour keys deemed to be important for green tea aroma and taste, based on a D-optimal experimental design, before undergoing commercial sterilisation. Sensory evaluation of the green tea model system was carried out using an untrained consumer panel to obtain hedonic liking scores of the samples. Regression models were subsequently trained to objectively predict the consumer liking scores of the green tea model systems. A linear partial least squares (PLS) regression model was developed to describe the effects of the eight flavour keys on consumer liking, with a coefficient of determination (R 2 ) of 0.733, and a root-mean-square error (RMSE) of 3.53%. The PLS model was further augmented with an artificial neural network (ANN) to establish a PLS-ANN hybrid model. The established hybrid model was found to give a better prediction of consumer liking scores, based on its R 2 (0.875) and RMSE (2.41%). Copyright © 2017 Elsevier Ltd. All rights reserved.
Methods for estimating low-flow statistics for Massachusetts streams
Ries, Kernell G.; Friesz, Paul J.
2000-01-01
Methods and computer software are described in this report for determining flow duration, low-flow frequency statistics, and August median flows. These low-flow statistics can be estimated for unregulated streams in Massachusetts using different methods depending on whether the location of interest is at a streamgaging station, a low-flow partial-record station, or an ungaged site where no data are available. Low-flow statistics for streamgaging stations can be estimated using standard U.S. Geological Survey methods described in the report. The MOVE.1 mathematical method and a graphical correlation method can be used to estimate low-flow statistics for low-flow partial-record stations. The MOVE.1 method is recommended when the relation between measured flows at a partial-record station and daily mean flows at a nearby, hydrologically similar streamgaging station is linear, and the graphical method is recommended when the relation is curved. Equations are presented for computing the variance and equivalent years of record for estimates of low-flow statistics for low-flow partial-record stations when either a single or multiple index stations are used to determine the estimates. The drainage-area ratio method or regression equations can be used to estimate low-flow statistics for ungaged sites where no data are available. The drainage-area ratio method is generally as accurate as or more accurate than regression estimates when the drainage-area ratio for an ungaged site is between 0.3 and 1.5 times the drainage area of the index data-collection site. Regression equations were developed to estimate the natural, long-term 99-, 98-, 95-, 90-, 85-, 80-, 75-, 70-, 60-, and 50-percent duration flows; the 7-day, 2-year and the 7-day, 10-year low flows; and the August median flow for ungaged sites in Massachusetts. Streamflow statistics and basin characteristics for 87 to 133 streamgaging stations and low-flow partial-record stations were used to develop the equations. The streamgaging stations had from 2 to 81 years of record, with a mean record length of 37 years. The low-flow partial-record stations had from 8 to 36 streamflow measurements, with a median of 14 measurements. All basin characteristics were determined from digital map data. The basin characteristics that were statistically significant in most of the final regression equations were drainage area, the area of stratified-drift deposits per unit of stream length plus 0.1, mean basin slope, and an indicator variable that was 0 in the eastern region and 1 in the western region of Massachusetts. The equations were developed by use of weighted-least-squares regression analyses, with weights assigned proportional to the years of record and inversely proportional to the variances of the streamflow statistics for the stations. Standard errors of prediction ranged from 70.7 to 17.5 percent for the equations to predict the 7-day, 10-year low flow and 50-percent duration flow, respectively. The equations are not applicable for use in the Southeast Coastal region of the State, or where basin characteristics for the selected ungaged site are outside the ranges of those for the stations used in the regression analyses. A World Wide Web application was developed that provides streamflow statistics for data collection stations from a data base and for ungaged sites by measuring the necessary basin characteristics for the site and solving the regression equations. Output provided by the Web application for ungaged sites includes a map of the drainage-basin boundary determined for the site, the measured basin characteristics, the estimated streamflow statistics, and 90-percent prediction intervals for the estimates. An equation is provided for combining regression and correlation estimates to obtain improved estimates of the streamflow statistics for low-flow partial-record stations. An equation is also provided for combining regression and drainage-area ratio estimates to obtain improved e
NASA Astrophysics Data System (ADS)
Li, Jiangtong; Luo, Yongdao; Dai, Honglin
2018-01-01
Water is the source of life and the essential foundation of all life. With the development of industrialization, the phenomenon of water pollution is becoming more and more frequent, which directly affects the survival and development of human. Water quality detection is one of the necessary measures to protect water resources. Ultraviolet (UV) spectral analysis is an important research method in the field of water quality detection, which partial least squares regression (PLSR) analysis method is becoming predominant technology, however, in some special cases, PLSR's analysis produce considerable errors. In order to solve this problem, the traditional principal component regression (PCR) analysis method was improved by using the principle of PLSR in this paper. The experimental results show that for some special experimental data set, improved PCR analysis method performance is better than PLSR. The PCR and PLSR is the focus of this paper. Firstly, the principal component analysis (PCA) is performed by MATLAB to reduce the dimensionality of the spectral data; on the basis of a large number of experiments, the optimized principal component is extracted by using the principle of PLSR, which carries most of the original data information. Secondly, the linear regression analysis of the principal component is carried out with statistic package for social science (SPSS), which the coefficients and relations of principal components can be obtained. Finally, calculating a same water spectral data set by PLSR and improved PCR, analyzing and comparing two results, improved PCR and PLSR is similar for most data, but improved PCR is better than PLSR for data near the detection limit. Both PLSR and improved PCR can be used in Ultraviolet spectral analysis of water, but for data near the detection limit, improved PCR's result better than PLSR.
Variable screening via quantile partial correlation
Ma, Shujie; Tsai, Chih-Ling
2016-01-01
In quantile linear regression with ultra-high dimensional data, we propose an algorithm for screening all candidate variables and subsequently selecting relevant predictors. Specifically, we first employ quantile partial correlation for screening, and then we apply the extended Bayesian information criterion (EBIC) for best subset selection. Our proposed method can successfully select predictors when the variables are highly correlated, and it can also identify variables that make a contribution to the conditional quantiles but are marginally uncorrelated or weakly correlated with the response. Theoretical results show that the proposed algorithm can yield the sure screening set. By controlling the false selection rate, model selection consistency can be achieved theoretically. In practice, we proposed using EBIC for best subset selection so that the resulting model is screening consistent. Simulation studies demonstrate that the proposed algorithm performs well, and an empirical example is presented. PMID:28943683
Saha, Rajib; Misra, Raghunath; Saha, Indranil
2015-10-01
To assess the quality of life among thalassemic children and to find out association of quality of life (QOL) with the socio-demographic factors, and clinico-therapeutic profile. This cross sectional descriptive epidemiological study was conducted from July 2011 through June 2012 on 365 admitted thalassemic patients of 5 to 12 y of age in the Burdwan Medical College and Hospital. Parents of the children were interviewed using Paediatric Quality of Life Inventory 4.0 Generic Core Scale. Statistically significant variables in bivariate analysis were considered for correlation matrix where independent variables were found inter related. So, partial correlation was done and statistically significant variables in partial correlation were considered for linear regression. The mean age of 365 thalassemic children was 8.3 ± 2.4 y. Multiple linear regressions predicted that only 70.5 % variation of total summary score depended on duration since splenectomy (31.2 % variation), last pre transfusion Hb level (20.7 %), family history of thalassemia (17.3 %) and frequency of blood transfusions (1.3 %). After splenectomy, thalassemic children could lead a better quality of life upto 5 y only. The betterment of the quality of life needs maintaining pre transfusion Hb level above 7 g/dl. Previous experience of the disease among the family members enriches the awareness among them and helps them to take correct decisions timely about the child and that leads to better QOL. More awareness regarding the maintenance of pre transfusion Hb level should be built up among parents and families where such disease has occurred for the first time.
Ferreira, Marcel T; Leite, Nathalie C; Cardoso, Claudia R L; Salles, Gil F
2015-05-01
The correlates of serial changes in aortic stiffness in patients with diabetes have never been investigated. We aimed to examine the importance of glycemic control on progression/regression of carotid-femoral pulse wave velocity (cf-PWV) in type 2 diabetes. In a prospective study, two cf-PWV measurements were performed with the Complior equipment in 417 patients with type 2 diabetes over a mean follow-up of 4.2 years. Clinical laboratory data were obtained at baseline and throughout follow-up. Multivariable linear/logistic regressions assessed the independent correlates of changes in cf-PWV. Median cf-PWV increase was 0.11 m/s per year (1.1% per year). Overall, 212 patients (51%) increased/persisted with high cf-PWV, while 205 (49%) reduced/persisted with low cf-PWV. Multivariate linear regression demonstrated direct associations between cf-PWV changes and mean HbA1c during follow-up (partial correlation 0.14, P = 0.005). On logistic regression, a mean HbA1c ≥7.5% (58 mmol/mol) was associated with twofold higher odds of having increased/persistently high cf-PWV during follow-up. Furthermore, the rate of HbA1c reduction relative to baseline levels was inversely associated with cf-PWV changes (partial correlation -0.11, P = 0.011) and associated with reduced risk of having increased/persistently high aortic stiffness (odds ratio 0.82 [95% CI 0.69-0.96]; P = 0.017). Other independent correlates of progression in aortic stiffness were increases in systolic blood pressure and heart rate between the two cf-PWV measurements, older age, female sex, and presence of dyslipidemia and retinopathy. Better glycemic control, together with reductions in blood pressure and heart rate, was the most important correlate to attenuate/prevent progression of aortic stiffness in patients with type 2 diabetes. © 2015 by the American Diabetes Association. Readers may use this article as long as the work is properly cited, the use is educational and not for profit, and the work is not altered.
Penn, Richard; Werner, Michael; Thomas, Justin
2015-01-01
Background Estimation of stochastic process models from data is a common application of time series analysis methods. Such system identification processes are often cast as hypothesis testing exercises whose intent is to estimate model parameters and test them for statistical significance. Ordinary least squares (OLS) regression and the Levenberg-Marquardt algorithm (LMA) have proven invaluable computational tools for models being described by non-homogeneous, linear, stationary, ordinary differential equations. Methods In this paper we extend stochastic model identification to linear, stationary, partial differential equations in two independent variables (2D) and show that OLS and LMA apply equally well to these systems. The method employs an original nonparametric statistic as a test for the significance of estimated parameters. Results We show gray scale and color images are special cases of 2D systems satisfying a particular autoregressive partial difference equation which estimates an analogous partial differential equation. Several applications to medical image modeling and classification illustrate the method by correctly classifying demented and normal OLS models of axial magnetic resonance brain scans according to subject Mini Mental State Exam (MMSE) scores. Comparison with 13 image classifiers from the literature indicates our classifier is at least 14 times faster than any of them and has a classification accuracy better than all but one. Conclusions Our modeling method applies to any linear, stationary, partial differential equation and the method is readily extended to 3D whole-organ systems. Further, in addition to being a robust image classifier, estimated image models offer insights into which parameters carry the most diagnostic image information and thereby suggest finer divisions could be made within a class. Image models can be estimated in milliseconds which translate to whole-organ models in seconds; such runtimes could make real-time medicine and surgery modeling possible. PMID:26029638
The Hydrothermal Chemistry of Gold, Arsenic, Antimony, Mercury and Silver
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bessinger, Brad; Apps, John A.
2003-03-23
A comprehensive thermodynamic database based on the Helgeson-Kirkham-Flowers (HKF) equation of state was developed for metal complexes in hydrothermal systems. Because this equation of state has been shown to accurately predict standard partial molal thermodynamic properties of aqueous species at elevated temperatures and pressures, this study provides the necessary foundation for future exploration into transport and depositional processes in polymetallic ore deposits. The HKF equation of state parameters for gold, arsenic, antimony, mercury, and silver sulfide and hydroxide complexes were derived from experimental equilibrium constants using nonlinear regression calculations. In order to ensure that the resulting parameters were internally consistent,more » those experiments utilizing incompatible thermodynamic data were re-speciated prior to regression. Because new experimental studies were used to revise the HKF parameters for H2S0 and HS-1, those metal complexes for which HKF parameters had been previously derived were also updated. It was found that predicted thermodynamic properties of metal complexes are consistent with linear correlations between standard partial molal thermodynamic properties. This result allowed assessment of several complexes for which experimental data necessary to perform regression calculations was limited. Oxygen fugacity-temperature diagrams were calculated to illustrate how thermodynamic data improves our understanding of depositional processes. Predicted thermodynamic properties were used to investigate metal transport in Carlin-type gold deposits. Assuming a linear relationship between temperature and pressure, metals are predicted to predominantly be transported as sulfide complexes at a total aqueous sulfur concentration of 0.05 m. Also, the presence of arsenic and antimony mineral phases in the deposits are shown to restrict mineralization within a limited range of chemical conditions. Finally, at a lesser aqueous sulfur concentration of 0.01 m, host rock sulfidation can explain the origin of arsenic and antimony minerals within the paragenetic sequence.« less
Choi, Y K; Park, D Y; Kim, Y
2014-11-01
Many health issues have been reported to be associated with poor nutritional status. We sought to examine the association between nutritional intake and oral health status in elderly people. The association between perceived disability in mastication and prosthodontic status was analysed using multiple logistic regression. Multiple linear regression was used to analyse the association between prosthodontic status and nutritional intake. The elderly subjects with partial or full dentures reported chewing difficulties 1.62-fold more frequently (95% CI: 1.06-2.49) than those with natural teeth or a fixed prosthesis after adjusting for gender, TMD (temporomandibular disorder), household income and education level. Additionally, daily nutritional intakes of energy, protein, fat, ash, calcium, phosphorus and thiamine were decreased significantly in elderly with partial or full dentures compared with those with no prosthesis or with a fixed prosthesis (P < 0.05). Our findings underline oral health status and perceived disability in mastication are associated with dietary imbalances in the elderly. We suggest that the evaluation of patients' nutritional status should be considered as a part of an overall plan for dental hygiene care. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Correlation and simple linear regression.
Eberly, Lynn E
2007-01-01
This chapter highlights important steps in using correlation and simple linear regression to address scientific questions about the association of two continuous variables with each other. These steps include estimation and inference, assessing model fit, the connection between regression and ANOVA, and study design. Examples in microbiology are used throughout. This chapter provides a framework that is helpful in understanding more complex statistical techniques, such as multiple linear regression, linear mixed effects models, logistic regression, and proportional hazards regression.
Igne, Benoît; Drennen, James K; Anderson, Carl A
2014-01-01
Changes in raw materials and process wear and tear can have significant effects on the prediction error of near-infrared calibration models. When the variability that is present during routine manufacturing is not included in the calibration, test, and validation sets, the long-term performance and robustness of the model will be limited. Nonlinearity is a major source of interference. In near-infrared spectroscopy, nonlinearity can arise from light path-length differences that can come from differences in particle size or density. The usefulness of support vector machine (SVM) regression to handle nonlinearity and improve the robustness of calibration models in scenarios where the calibration set did not include all the variability present in test was evaluated. Compared to partial least squares (PLS) regression, SVM regression was less affected by physical (particle size) and chemical (moisture) differences. The linearity of the SVM predicted values was also improved. Nevertheless, although visualization and interpretation tools have been developed to enhance the usability of SVM-based methods, work is yet to be done to provide chemometricians in the pharmaceutical industry with a regression method that can supplement PLS-based methods.
Discrimination of serum Raman spectroscopy between normal and colorectal cancer
NASA Astrophysics Data System (ADS)
Li, Xiaozhou; Yang, Tianyue; Yu, Ting; Li, Siqi
2011-07-01
Raman spectroscopy of tissues has been widely studied for the diagnosis of various cancers, but biofluids were seldom used as the analyte because of the low concentration. Herein, serum of 30 normal people, 46 colon cancer, and 44 rectum cancer patients were measured Raman spectra and analyzed. The information of Raman peaks (intensity and width) and that of the fluorescence background (baseline function coefficients) were selected as parameters for statistical analysis. Principal component regression (PCR) and partial least square regression (PLSR) were used on the selected parameters separately to see the performance of the parameters. PCR performed better than PLSR in our spectral data. Then linear discriminant analysis (LDA) was used on the principal components (PCs) of the two regression method on the selected parameters, and a diagnostic accuracy of 88% and 83% were obtained. The conclusion is that the selected features can maintain the information of original spectra well and Raman spectroscopy of serum has the potential for the diagnosis of colorectal cancer.
Does Group-Level Commitment Predict Employee Well-Being?: A Prospective Analysis.
Clausen, Thomas; Christensen, Karl Bang; Nielsen, Karina
2015-11-01
To investigate the links between group-level affective organizational commitment (AOC) and individual-level psychological well-being, self-reported sickness absence, and sleep disturbances. A total of 5085 care workers from 301 workgroups in the Danish eldercare services participated in both waves of the study (T1 [2005] and T2 [2006]). The three outcomes were analyzed using linear multilevel regression analysis, multilevel Poisson regression analysis, and multilevel logistic regression analysis, respectively. Group-level AOC (T1) significantly predicted individual-level psychological well-being, self-reported sickness absence, and sleep disturbances (T2). The association between group-level AOC (T1) and psychological well-being (T2) was fully mediated by individual-level AOC (T1), and the associations between group-level AOC (T1) and self-reported sickness absence and sleep disturbances (T2) were partially mediated by individual-level AOC (T1). Group-level AOC is an important predictor of employee well-being in contemporary health care organizations.
Baratieri, Sabrina C; Barbosa, Juliana M; Freitas, Matheus P; Martins, José A
2006-01-23
A multivariate method of analysis of nystatin and metronidazole in a semi-solid matrix, based on diffuse reflectance NIR measurements and partial least squares regression, is reported. The product, a vaginal cream used in the antifungal and antibacterial treatment, is usually, quantitatively analyzed through microbiological tests (nystatin) and HPLC technique (metronidazole), according to pharmacopeial procedures. However, near infrared spectroscopy has demonstrated to be a valuable tool for content determination, given the rapidity and scope of the method. In the present study, it was successfully applied in the prediction of nystatin (even in low concentrations, ca. 0.3-0.4%, w/w, which is around 100,000 IU/5g) and metronidazole contents, as demonstrated by some figures of merit, namely linearity, precision (mean and repeatability) and accuracy.
Constructing general partial differential equations using polynomial and neural networks.
Zjavka, Ladislav; Pedrycz, Witold
2016-01-01
Sum fraction terms can approximate multi-variable functions on the basis of discrete observations, replacing a partial differential equation definition with polynomial elementary data relation descriptions. Artificial neural networks commonly transform the weighted sum of inputs to describe overall similarity relationships of trained and new testing input patterns. Differential polynomial neural networks form a new class of neural networks, which construct and solve an unknown general partial differential equation of a function of interest with selected substitution relative terms using non-linear multi-variable composite polynomials. The layers of the network generate simple and composite relative substitution terms whose convergent series combinations can describe partial dependent derivative changes of the input variables. This regression is based on trained generalized partial derivative data relations, decomposed into a multi-layer polynomial network structure. The sigmoidal function, commonly used as a nonlinear activation of artificial neurons, may transform some polynomial items together with the parameters with the aim to improve the polynomial derivative term series ability to approximate complicated periodic functions, as simple low order polynomials are not able to fully make up for the complete cycles. The similarity analysis facilitates substitutions for differential equations or can form dimensional units from data samples to describe real-world problems. Copyright © 2015 Elsevier Ltd. All rights reserved.
Gierlinger, Notburga; Luss, Saskia; König, Christian; Konnerth, Johannes; Eder, Michaela; Fratzl, Peter
2010-01-01
The functional characteristics of plant cell walls depend on the composition of the cell wall polymers, as well as on their highly ordered architecture at scales from a few nanometres to several microns. Raman spectra of wood acquired with linear polarized laser light include information about polymer composition as well as the alignment of cellulose microfibrils with respect to the fibre axis (microfibril angle). By changing the laser polarization direction in 3° steps, the dependency between cellulose and laser orientation direction was investigated. Orientation-dependent changes of band height ratios and spectra were described by quadratic linear regression and partial least square regressions, respectively. Using the models and regressions with high coefficients of determination (R2 > 0.99) microfibril orientation was predicted in the S1 and S2 layers distinguished by the Raman imaging approach in cross-sections of spruce normal, opposite, and compression wood. The determined microfibril angle (MFA) in the different S2 layers ranged from 0° to 49.9° and was in coincidence with X-ray diffraction determination. With the prerequisite of geometric sample and laser alignment, exact MFA prediction can complete the picture of the chemical cell wall design gained by the Raman imaging approach at the micron level in all plant tissues. PMID:20007198
NASA Astrophysics Data System (ADS)
Arantes Camargo, Livia; Marques, José, Jr.
2015-04-01
The prediction of erodibility using indirect methods such as diffuse reflectance spectroscopy could facilitate the characterization of the spatial variability in large areas and optimize implementation of conservation practices. The aim of this study was to evaluate the prediction of interrill erodibility (Ki) and rill erodibility (Kr) by means of iron oxides content and soil color using multiple linear regression and diffuse reflectance spectroscopy (DRS) using regression analysis by least squares partial (PLSR). The soils were collected from three geomorphic surfaces and analyzed for chemical, physical and mineralogical properties, plus scanned in the spectral range from the visible and infrared. Maps of spatial distribution of Ki and Kr were built with the values calculated by the calibrated models that obtained the best accuracy using geostatistics. Interrill-rill erodibility presented negative correlation with iron extracted by dithionite-citrate-bicarbonate, hematite, and chroma, confirming the influence of iron oxides in soil structural stability. Hematite and hue were the attributes that most contributed in calibration models by multiple linear regression for the prediction of Ki (R2 = 0.55) and Kr (R2 = 0.53). The diffuse reflectance spectroscopy via PLSR allowed to predict Interrill-rill erodibility with high accuracy (R2adj = 0.76, 0.81 respectively and RPD> 2.0) in the range of the visible spectrum (380-800 nm) and the characterization of the spatial variability of these attributes by geostatistics.
Pistonesi, Marcelo F; Di Nezio, María S; Centurión, María E; Lista, Adriana G; Fragoso, Wallace D; Pontes, Márcio J C; Araújo, Mário C U; Band, Beatriz S Fernández
2010-12-15
In this study, a novel, simple, and efficient spectrofluorimetric method to determine directly and simultaneously five phenolic compounds (hydroquinone, resorcinol, phenol, m-cresol and p-cresol) in air samples is presented. For this purpose, variable selection by the successive projections algorithm (SPA) is used in order to obtain simple multiple linear regression (MLR) models based on a small subset of wavelengths. For comparison, partial least square (PLS) regression is also employed in full-spectrum. The concentrations of the calibration matrix ranged from 0.02 to 0.2 mg L(-1) for hydroquinone, from 0.05 to 0.6 mg L(-1) for resorcinol, and from 0.05 to 0.4 mg L(-1) for phenol, m-cresol and p-cresol; incidentally, such ranges are in accordance with the Argentinean environmental legislation. To verify the accuracy of the proposed method a recovery study on real air samples of smoking environment was carried out with satisfactory results (94-104%). The advantage of the proposed method is that it requires only spectrofluorimetric measurements of samples and chemometric modeling for simultaneous determination of five phenols. With it, air is simply sampled and no pre-treatment sample is needed (i.e., separation steps and derivatization reagents are avoided) that means a great saving of time. Copyright © 2010 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Mathias, Simon A.; Wen, Zhang
2015-05-01
This article presents a numerical study to investigate the combined role of partial well penetration (PWP) and non-Darcy effects concerning the performance of groundwater production wells. A finite difference model is developed in MATLAB to solve the two-dimensional mixed-type boundary value problem associated with flow to a partially penetrating well within a cylindrical confined aquifer. Non-Darcy effects are incorporated using the Forchheimer equation. The model is verified by comparison to results from existing semi-analytical solutions concerning the same problem but assuming Darcy's law. A sensitivity analysis is presented to explore the problem of concern. For constant pressure production, Non-Darcy effects lead to a reduction in production rate, as compared to an equivalent problem solved using Darcy's law. For fully penetrating wells, this reduction in production rate becomes less significant with time. However, for partially penetrating wells, the reduction in production rate persists for much larger times. For constant production rate scenarios, the combined effect of PWP and non-Darcy flow takes the form of a constant additional drawdown term. An approximate solution for this loss term is obtained by performing linear regression on the modeling results.
Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne
2012-01-01
In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models. PMID:23275882
Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne
2012-12-01
In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models.
Developing a dengue forecast model using machine learning: A case study in China.
Guo, Pi; Liu, Tao; Zhang, Qin; Wang, Li; Xiao, Jianpeng; Zhang, Qingying; Luo, Ganfeng; Li, Zhihao; He, Jianfeng; Zhang, Yonghui; Ma, Wenjun
2017-10-01
In China, dengue remains an important public health issue with expanded areas and increased incidence recently. Accurate and timely forecasts of dengue incidence in China are still lacking. We aimed to use the state-of-the-art machine learning algorithms to develop an accurate predictive model of dengue. Weekly dengue cases, Baidu search queries and climate factors (mean temperature, relative humidity and rainfall) during 2011-2014 in Guangdong were gathered. A dengue search index was constructed for developing the predictive models in combination with climate factors. The observed year and week were also included in the models to control for the long-term trend and seasonality. Several machine learning algorithms, including the support vector regression (SVR) algorithm, step-down linear regression model, gradient boosted regression tree algorithm (GBM), negative binomial regression model (NBM), least absolute shrinkage and selection operator (LASSO) linear regression model and generalized additive model (GAM), were used as candidate models to predict dengue incidence. Performance and goodness of fit of the models were assessed using the root-mean-square error (RMSE) and R-squared measures. The residuals of the models were examined using the autocorrelation and partial autocorrelation function analyses to check the validity of the models. The models were further validated using dengue surveillance data from five other provinces. The epidemics during the last 12 weeks and the peak of the 2014 large outbreak were accurately forecasted by the SVR model selected by a cross-validation technique. Moreover, the SVR model had the consistently smallest prediction error rates for tracking the dynamics of dengue and forecasting the outbreaks in other areas in China. The proposed SVR model achieved a superior performance in comparison with other forecasting techniques assessed in this study. The findings can help the government and community respond early to dengue epidemics.
Hashami, Hilal Al; Bataclan, Maria F; Mathew, Mariam; Krishnan, Lalitha
2010-01-01
Caudal regression syndrome is a rare fetal condition of diabetic pregnancy. Although the exact mechanism is not known, hyperglycaemia during embryogenesis seems to act as a teratogen. Independently, caudal regression syndrome (CRS), agenesis of the corpus callosum (ACC) and partial lobar holoprosencephaly (HPE) have been reported in infants of diabetic mothers. To our knowledge, a combination of all these three conditions has not been reported so far. PMID:21509087
Hashami, Hilal Al; Bataclan, Maria F; Mathew, Mariam; Krishnan, Lalitha
2010-04-01
Caudal regression syndrome is a rare fetal condition of diabetic pregnancy. Although the exact mechanism is not known, hyperglycaemia during embryogenesis seems to act as a teratogen. Independently, caudal regression syndrome (CRS), agenesis of the corpus callosum (ACC) and partial lobar holoprosencephaly (HPE) have been reported in infants of diabetic mothers. To our knowledge, a combination of all these three conditions has not been reported so far.
Estimation of water quality by UV/Vis spectrometry in the framework of treated wastewater reuse.
Carré, Erwan; Pérot, Jean; Jauzein, Vincent; Lin, Liming; Lopez-Ferber, Miguel
2017-07-01
The aim of this study is to investigate the potential of ultraviolet/visible (UV/Vis) spectrometry as a complementary method for routine monitoring of reclaimed water production. Robustness of the models and compliance of their sensitivity with current quality limits are investigated. The following indicators are studied: total suspended solids (TSS), turbidity, chemical oxygen demand (COD) and nitrate. Partial least squares regression (PLSR) is used to find linear correlations between absorbances and indicators of interest. Artificial samples are made by simulating a sludge leak on the wastewater treatment plant and added to the original dataset, then divided into calibration and prediction datasets. The models are built on the calibration set, and then tested on the prediction set. The best models are developed with: PLSR for COD (R pred 2 = 0.80), TSS (R pred 2 = 0.86) and turbidity (R pred 2 = 0.96), and with a simple linear regression from absorbance at 208 nm (R pred 2 = 0.95) for nitrate concentration. The input of artificial data significantly enhances the robustness of the models. The sensitivity of the UV/Vis spectrometry monitoring system developed is compatible with quality requirements of reclaimed water production processes.
Outsourcing primary health care services--how politicians explain the grounds for their decisions.
Laamanen, Ritva; Simonsen-Rehn, Nina; Suominen, Sakari; Øvretveit, John; Brommels, Mats
2008-12-01
To explore outsourcing of primary health care (PHC) services in four municipalities in Finland with varying amounts and types of outsourcing: a Southern municipality (SM) which contracted all PHC services to a not-for-profit voluntary organization, and Eastern (EM), South-Western (SWM) and Western (WM) municipalities which had contracted out only a few services to profit or public organizations. A mail survey to all municipality politicians (response rate 52%, N=101) in 2004. Data were analyzed using cross-tabulations, Spearman correlation and linear regression analyses. Politicians were willing to outsource PHC services only partially, and many problems relating to outsourcing were reported. Politicians in all municipalities were least likely to outsource preventive services. A multiple linear regression model showed that reported preference to outsource in EM and in SWM was lower than in SM, and also lower among politicians from "leftist" political parties than "rightist" political parties. Perceived difficulties in local health policy issues were related to reduced preference to outsource. The model explained 27% of the variance of the inclination to outsource PHC services. The findings highlight how important it is to take into account local health policy issues when assessing service-provision models.
Factors associated with arterial stiffness in children aged 9-10 years
Batista, Milena Santos; Mill, José Geraldo; Pereira, Taisa Sabrina Silva; Fernandes, Carolina Dadalto Rocha; Molina, Maria del Carmen Bisi
2015-01-01
OBJECTIVE To analyze the factors associated with stiffness of the great arteries in prepubertal children. METHODS This study with convenience sample of 231 schoolchildren aged 9-10 years enrolled in public and private schools in Vitória, ES, Southeastern Brazil, in 2010-2011. Anthropometric and hemodynamic data, blood pressure, and pulse wave velocity in the carotid-femoral segment were obtained. Data on current and previous health conditions were obtained by questionnaire and notes on the child’s health card. Multiple linear regression was applied to identify the partial and total contribution of the factors in determining the pulse wave velocity values. RESULTS Among the students, 50.2% were female and 55.4% were 10 years old. Among those classified in the last tertile of pulse wave velocity, 60.0% were overweight, with higher mean blood pressure, waist circumference, and waist-to-height ratio. Birth weight was not associated with pulse wave velocity. After multiple linear regression analysis, body mass index (BMI) and diastolic blood pressure remained in the model. CONCLUSIONS BMI was the most important factor in determining arterial stiffness in children aged 9-10 years. PMID:25902563
Katsarov, Plamen; Gergov, Georgi; Alin, Aylin; Pilicheva, Bissera; Al-Degs, Yahya; Simeonov, Vasil; Kassarova, Margarita
2018-03-01
The prediction power of partial least squares (PLS) and multivariate curve resolution-alternating least squares (MCR-ALS) methods have been studied for simultaneous quantitative analysis of the binary drug combination - doxylamine succinate and pyridoxine hydrochloride. Analysis of first-order UV overlapped spectra was performed using different PLS models - classical PLS1 and PLS2 as well as partial robust M-regression (PRM). These linear models were compared to MCR-ALS with equality and correlation constraints (MCR-ALS-CC). All techniques operated within the full spectral region and extracted maximum information for the drugs analysed. The developed chemometric methods were validated on external sample sets and were applied to the analyses of pharmaceutical formulations. The obtained statistical parameters were satisfactory for calibration and validation sets. All developed methods can be successfully applied for simultaneous spectrophotometric determination of doxylamine and pyridoxine both in laboratory-prepared mixtures and commercial dosage forms.
Role of social support in adolescent suicidal ideation and suicide attempts.
Miller, Adam Bryant; Esposito-Smythers, Christianne; Leichtweis, Richard N
2015-03-01
The present study examined the relative contributions of perceptions of social support from parents, close friends, and school on current suicidal ideation (SI) and suicide attempt (SA) history in a clinical sample of adolescents. Participants were 143 adolescents (64% female; 81% white; range, 12-18 years; M = 15.38; standard deviation = 1.43) admitted to a partial hospitalization program. Data were collected with well-validated assessments and a structured clinical interview. Main and interactive effects of perceptions of social support on SI were tested with linear regression. Main and interactive effects of social support on the odds of SA were tested with logistic regression. Results from the linear regression analysis revealed that perceptions of lower school support independently predicted greater severity of SI, accounting for parent and close friend support. Further, the relationship between lower perceived school support and SI was the strongest among those who perceived lower versus higher parental support. Results from the logistic regression analysis revealed that perceptions of lower parental support independently predicted SA history, accounting for school and close friend support. Further, those who perceived lower support from school and close friends reported the greatest odds of an SA history. Results address a significant gap in the social support and suicide literature by demonstrating that perceptions of parent and school support are relatively more important than peer support in understanding suicidal thoughts and history of suicidal behavior. Results suggest that improving social support across these domains may be important in suicide prevention efforts. Copyright © 2015 Society for Adolescent Health and Medicine. Published by Elsevier Inc. All rights reserved.
Refractive Status at Birth: Its Relation to Newborn Physical Parameters at Birth and Gestational Age
Varghese, Raji Mathew; Sreenivas, Vishnubhatla; Puliyel, Jacob Mammen; Varughese, Sara
2009-01-01
Background Refractive status at birth is related to gestational age. Preterm babies have myopia which decreases as gestational age increases and term babies are known to be hypermetropic. This study looked at the correlation of refractive status with birth weight in term and preterm babies, and with physical indicators of intra-uterine growth such as the head circumference and length of the baby at birth. Methods All babies delivered at St. Stephens Hospital and admitted in the nursery were eligible for the study. Refraction was performed within the first week of life. 0.8% tropicamide with 0.5% phenylephrine was used to achieve cycloplegia and paralysis of accommodation. 599 newborn babies participated in the study. Data pertaining to the right eye is utilized for all the analyses except that for anisometropia where the two eyes were compared. Growth parameters were measured soon after birth. Simple linear regression analysis was performed to see the association of refractive status, (mean spherical equivalent (MSE), astigmatism and anisometropia) with each of the study variables, namely gestation, length, weight and head circumference. Subsequently, multiple linear regression was carried out to identify the independent predictors for each of the outcome parameters. Results Simple linear regression showed a significant relation between all 4 study variables and refractive error but in multiple regression only gestational age and weight were related to refractive error. The partial correlation of weight with MSE adjusted for gestation was 0.28 and that of gestation with MSE adjusted for weight was 0.10. Birth weight had a higher correlation to MSE than gestational age. Conclusion This is the first study to look at refractive error against all these growth parameters, in preterm and term babies at birth. It would appear from this study that birth weight rather than gestation should be used as criteria for screening for refractive error, especially in developing countries where the incidence of intrauterine malnutrition is higher. PMID:19214228
Use of partial least squares regression to impute SNP genotypes in Italian cattle breeds.
Dimauro, Corrado; Cellesi, Massimo; Gaspa, Giustino; Ajmone-Marsan, Paolo; Steri, Roberto; Marras, Gabriele; Macciotta, Nicolò P P
2013-06-05
The objective of the present study was to test the ability of the partial least squares regression technique to impute genotypes from low density single nucleotide polymorphisms (SNP) panels i.e. 3K or 7K to a high density panel with 50K SNP. No pedigree information was used. Data consisted of 2093 Holstein, 749 Brown Swiss and 479 Simmental bulls genotyped with the Illumina 50K Beadchip. First, a single-breed approach was applied by using only data from Holstein animals. Then, to enlarge the training population, data from the three breeds were combined and a multi-breed analysis was performed. Accuracies of genotypes imputed using the partial least squares regression method were compared with those obtained by using the Beagle software. The impact of genotype imputation on breeding value prediction was evaluated for milk yield, fat content and protein content. In the single-breed approach, the accuracy of imputation using partial least squares regression was around 90 and 94% for the 3K and 7K platforms, respectively; corresponding accuracies obtained with Beagle were around 85% and 90%. Moreover, computing time required by the partial least squares regression method was on average around 10 times lower than computing time required by Beagle. Using the partial least squares regression method in the multi-breed resulted in lower imputation accuracies than using single-breed data. The impact of the SNP-genotype imputation on the accuracy of direct genomic breeding values was small. The correlation between estimates of genetic merit obtained by using imputed versus actual genotypes was around 0.96 for the 7K chip. Results of the present work suggested that the partial least squares regression imputation method could be useful to impute SNP genotypes when pedigree information is not available.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kwon, Deukwoo; Little, Mark P.; Miller, Donald L.
Purpose: To determine more accurate regression formulas for estimating peak skin dose (PSD) from reference air kerma (RAK) or kerma-area product (KAP). Methods: After grouping of the data from 21 procedures into 13 clinically similar groups, assessments were made of optimal clustering using the Bayesian information criterion to obtain the optimal linear regressions of (log-transformed) PSD vs RAK, PSD vs KAP, and PSD vs RAK and KAP. Results: Three clusters of clinical groups were optimal in regression of PSD vs RAK, seven clusters of clinical groups were optimal in regression of PSD vs KAP, and six clusters of clinical groupsmore » were optimal in regression of PSD vs RAK and KAP. Prediction of PSD using both RAK and KAP is significantly better than prediction of PSD with either RAK or KAP alone. The regression of PSD vs RAK provided better predictions of PSD than the regression of PSD vs KAP. The partial-pooling (clustered) method yields smaller mean squared errors compared with the complete-pooling method.Conclusion: PSD distributions for interventional radiology procedures are log-normal. Estimates of PSD derived from RAK and KAP jointly are most accurate, followed closely by estimates derived from RAK alone. Estimates of PSD derived from KAP alone are the least accurate. Using a stochastic search approach, it is possible to cluster together certain dissimilar types of procedures to minimize the total error sum of squares.« less
Kumar, K Vasanth
2007-04-02
Kinetic experiments were carried out for the sorption of safranin onto activated carbon particles. The kinetic data were fitted to pseudo-second order model of Ho, Sobkowsk and Czerwinski, Blanchard et al. and Ritchie by linear and non-linear regression methods. Non-linear method was found to be a better way of obtaining the parameters involved in the second order rate kinetic expressions. Both linear and non-linear regression showed that the Sobkowsk and Czerwinski and Ritchie's pseudo-second order models were the same. Non-linear regression analysis showed that both Blanchard et al. and Ho have similar ideas on the pseudo-second order model but with different assumptions. The best fit of experimental data in Ho's pseudo-second order expression by linear and non-linear regression method showed that Ho pseudo-second order model was a better kinetic expression when compared to other pseudo-second order kinetic expressions.
Ciura, Krzesimir; Belka, Mariusz; Kawczak, Piotr; Bączek, Tomasz; Markuszewski, Michał J; Nowakowska, Joanna
2017-09-05
The objective of this paper is to build QSRR/QSAR model for predicting the blood-brain barrier (BBB) permeability. The obtained models are based on salting-out thin layer chromatography (SOTLC) constants and calculated molecular descriptors. Among chromatographic methods SOTLC was chosen, since the mobile phases are free of organic solvent. As consequences, there are less toxic, and have lower environmental impact compared to classical reserved phases liquid chromatography (RPLC). During the study three stationary phase silica gel, cellulose plates and neutral aluminum oxide were examined. The model set of solutes presents a wide range of log BB values, containing compounds which cross the BBB readily and molecules poorly distributed to the brain including drugs acting on the nervous system as well as peripheral acting drugs. Additionally, the comparison of three regression models: multiple linear regression (MLR), partial least-squares (PLS) and orthogonal partial least squares (OPLS) were performed. The designed QSRR/QSAR models could be useful to predict BBB of systematically synthesized newly compounds in the drug development pipeline and are attractive alternatives of time-consuming and demanding directed methods for log BB measurement. The study also shown that among several regression techniques, significant differences can be obtained in models performance, measured by R 2 and Q 2 , hence it is strongly suggested to evaluate all available options as MLR, PLS and OPLS. Copyright © 2017 Elsevier B.V. All rights reserved.
Boulos, Mark I; Murray, Brian J; Muir, Ryan T; Gao, Fuqiang; Szilagyi, Gregory M; Huroy, Menal; Kiss, Alexander; Walters, Arthur S; Black, Sandra E; Lim, Andrew S; Swartz, Richard H
2017-03-01
Emerging evidence suggests that periodic limb movements (PLMs) may contribute to the development of cerebrovascular disease. White matter hyperintensities (WMHs), a widely accepted biomarker for cerebral small vessel disease, are associated with incident stroke and death. We evaluated the association between increased PLM indices and WMH burden in patients presenting with stroke or transient ischemic attack (TIA), while controlling for vascular risk factors and stroke severity. Thirty patients presenting within 2 weeks of a first-ever minor stroke or high-risk TIA were prospectively recruited. PLM severity was measured with polysomnography. WMH burden was quantified using the Age Related White Matter Changes (ARWMC) scale based on neuroimaging. Partial Spearman's rank-order correlations and multiple linear regression models tested the association between WMH burden and PLM severity. Greater WMH burden was correlated with elevated PLM index and stroke volume. Partial Spearman's rank-order correlations demonstrated that the relationship between WMH burden and PLM index persisted despite controlling for vascular risk factors. Multivariate linear regression models revealed that PLM index was a significant predictor of an elevated ARWMC score while controlling for age, stroke volume, stroke severity, hypertension, and apnea-hypopnea index. The quantity of PLMs was associated with WMH burden in patients with first-ever minor stroke or TIA. PLMs may be a risk factor for or marker of WMH burden, even after considering vascular risk factors and stroke severity. These results invite further investigation of PLMs as a potentially useful target to reduce WMH and stroke burden. © Sleep Research Society (SRS) 2016. All rights reserved. For permissions, please email: journals.permissions@oup.com
Sperm quality variables as indicators of bull fertility may be breed dependent.
Morrell, Jane M; Nongbua, Thanapol; Valeanu, Sabina; Lima Verde, Isabel; Lundstedt-Enkel, Katrin; Edman, Anders; Johannisson, Anders
2017-10-01
A means of discriminating among bulls of high fertility based on sperm quality is needed by breeding centers. The objective of the study was to examine parameters of sperm quality in bulls of known fertility to identify useful indicators of fertility. Frozen semen was available from bulls of known fertility (Viking Genetics, Skara, Sweden): Swedish Red (n=31), Holstein (n=25) and Others (one each of Charolais, Limousin, Blonde, SKB). After thawing, the sperm samples were analyzed for motility (computer assisted sperm analysis), plasma membrane integrity, chromatin integrity, acrosome status, mitochondrial activity and reactive oxygen species. A fertility index score based on the adjusted 56-day non-return rate for >1000 inseminations was available for each bull. Multivariate data analysis (Partial Least Squares Regression and Orthogonal Partial Least Squares Regression) was performed to identify variables related to fertility; Pearson univariate correlations were made on the parameters of interest. Breed of bull affected the relationship of sperm quality variables and fertility index score, as follows: Swedish Red: %DNA Fragmentation Index, r=-0.56, P<0.01; intact plasma membrane, r=0.40, P<0.05; membrane damaged, not acrosome reacted, r=-0.6, P<0.01; Linearity, r=0.37, P<0.05; there was a trend towards significance for Wobble, r=0.34, P=0.08. Holstein: Linearity was significant r=0.46, P<0.05; there was a trend towards significance for Wobble, r=0.45, P=0.08. In conclusion, breed has a greater effect on sperm quality than previously realized; different parameters of sperm quality are needed to indicate potential fertility in different breeds. Copyright © 2017 Elsevier B.V. All rights reserved.
Racial/Ethnic Minority Youth With Recent-Onset Type 1 Diabetes Have Poor Prognostic Factors.
Redondo, Maria Jose; Libman, Ingrid; Cheng, Peiyao; Kollman, Craig; Tosur, Mustafa; Gal, Robin L; Bacha, Fida; Klingensmith, Georgeanna J; Clements, Mark
2018-05-01
To compare races/ethnicities for characteristics, at type 1 diabetes diagnosis and during the first 3 years postdiagnosis, known to influence long-term health outcomes. We analyzed 927 Pediatric Diabetes Consortium (PDC) participants <19 years old (631 non-Hispanic white [NHW], 216 Hispanic, and 80 African American [AA]) diagnosed with type 1 diabetes and followed for a median of 3.0 years (interquartile range 2.2-3.6). Demographic and clinical data were collected from medical records and patient/parent interviews. Partial remission period or "honeymoon" was defined as insulin dose-adjusted hemoglobin A 1c (IDAA1c) ≤9.0%. We used logistic, linear, and multinomial regression models, as well as repeated-measures logistic and linear regression models. Models were adjusted for known confounders. AA subjects, compared with NHW, at diagnosis, were in a higher age- and sex-adjusted BMI percentile (BMI%), had more advanced pubertal development, and had higher frequency of presentation in diabetic ketoacidosis, largely explained by socioeconomic factors. During the first 3 years, AA subjects were more likely to have hypertension and severe hypoglycemia events; had trajectories with higher hemoglobin A 1c , BMI%, insulin doses, and IDAA1c; and were less likely to enter the partial remission period. Hispanics, compared with NHWs, had higher BMI% at diagnosis and over the three subsequent years. During the 3 years postdiagnosis, Hispanics had higher prevalence of dyslipidemia and maintained trajectories of higher insulin doses and IDAA1c. Youth of minority race/ethnicity have increased markers of poor prognosis of type 1 diabetes at diagnosis and 3 years postdiagnosis, possibly contributing to higher risk of long-term diabetes complications compared with NHWs. © 2018 by the American Diabetes Association.
Rha, Koon H; Abdel Raheem, Ali; Park, Sung Y; Kim, Kwang H; Kim, Hyung J; Koo, Kyo C; Choi, Young D; Jung, Byung H; Lee, Sang K; Lee, Won K; Krishnan, Jayram; Shin, Tae Y; Cho, Jin-Seon
2017-11-01
To assess the correlation of the resected and ischaemic volume (RAIV), which is a preoperatively calculated volume of nephron loss, with the amount of postoperative renal function (PRF) decline after minimally invasive partial nephrectomy (PN) in a multi-institutional dataset. We identified 348 patients from March 2005 to December 2013 at six institutions. Data on all cases of laparoscopic (n = 85) and robot-assisted PN (n = 263) performed were retrospectively gathered. Univariable and multivariable linear regression analyses were used to identify the associations between various time points of PRF and the RAIV, as a continuous variable. The mean (sd) RAIV was 24.2 (29.2) cm 3 . The mean preoperative estimated glomerular filtration rate (eGFR) and the eGFRs at postoperative day 1, 6 and 36 months after PN were 91.0 and 76.8, 80.2 and 87.7 mL/min/1.73 m 2 , respectively. In multivariable linear regression analysis, the amount of decline in PRF at follow-up was significantly correlated with the RAIV (β 0.261, 0.165, 0.260 at postoperative day 1, 6 and 36 months after PN, respectively). This study has the limitation of its retrospective nature. Preoperatively calculated RAIV significantly correlates with the amount of decline in PRF during long-term follow-up. The RAIV could lead our research to the level of prediction of the amount of PRF decline after PN and thus would be appropriate for assessing the technical advantages of emerging techniques. © 2017 The Authors BJU International © 2017 BJU International Published by John Wiley & Sons Ltd.
A Technique of Fuzzy C-Mean in Multiple Linear Regression Model toward Paddy Yield
NASA Astrophysics Data System (ADS)
Syazwan Wahab, Nur; Saifullah Rusiman, Mohd; Mohamad, Mahathir; Amira Azmi, Nur; Che Him, Norziha; Ghazali Kamardan, M.; Ali, Maselan
2018-04-01
In this paper, we propose a hybrid model which is a combination of multiple linear regression model and fuzzy c-means method. This research involved a relationship between 20 variates of the top soil that are analyzed prior to planting of paddy yields at standard fertilizer rates. Data used were from the multi-location trials for rice carried out by MARDI at major paddy granary in Peninsular Malaysia during the period from 2009 to 2012. Missing observations were estimated using mean estimation techniques. The data were analyzed using multiple linear regression model and a combination of multiple linear regression model and fuzzy c-means method. Analysis of normality and multicollinearity indicate that the data is normally scattered without multicollinearity among independent variables. Analysis of fuzzy c-means cluster the yield of paddy into two clusters before the multiple linear regression model can be used. The comparison between two method indicate that the hybrid of multiple linear regression model and fuzzy c-means method outperform the multiple linear regression model with lower value of mean square error.
Anderson, Carl A; McRae, Allan F; Visscher, Peter M
2006-07-01
Standard quantitative trait loci (QTL) mapping techniques commonly assume that the trait is both fully observed and normally distributed. When considering survival or age-at-onset traits these assumptions are often incorrect. Methods have been developed to map QTL for survival traits; however, they are both computationally intensive and not available in standard genome analysis software packages. We propose a grouped linear regression method for the analysis of continuous survival data. Using simulation we compare this method to both the Cox and Weibull proportional hazards models and a standard linear regression method that ignores censoring. The grouped linear regression method is of equivalent power to both the Cox and Weibull proportional hazards methods and is significantly better than the standard linear regression method when censored observations are present. The method is also robust to the proportion of censored individuals and the underlying distribution of the trait. On the basis of linear regression methodology, the grouped linear regression model is computationally simple and fast and can be implemented readily in freely available statistical software.
Meteorological adjustment of yearly mean values for air pollutant concentration comparison
NASA Technical Reports Server (NTRS)
Sidik, S. M.; Neustadter, H. E.
1976-01-01
Using multiple linear regression analysis, models which estimate mean concentrations of Total Suspended Particulate (TSP), sulfur dioxide, and nitrogen dioxide as a function of several meteorologic variables, two rough economic indicators, and a simple trend in time are studied. Meteorologic data were obtained and do not include inversion heights. The goodness of fit of the estimated models is partially reflected by the squared coefficient of multiple correlation which indicates that, at the various sampling stations, the models accounted for about 23 to 47 percent of the total variance of the observed TSP concentrations. If the resulting model equations are used in place of simple overall means of the observed concentrations, there is about a 20 percent improvement in either: (1) predicting mean concentrations for specified meteorological conditions; or (2) adjusting successive yearly averages to allow for comparisons devoid of meteorological effects. An application to source identification is presented using regression coefficients of wind velocity predictor variables.
Sibling dilution hypothesis: a regression surface analysis.
Marjoribanks, K
2001-08-01
This study examined relationships between sibship size (the number of children in a family), birth order, and measures of academic performance, academic self-concept, and educational aspirations at different levels of family educational resources. As part of a national longitudinal study of Australian secondary school students data were collected from 2,530 boys and 2,450 girls in Years 9 and 10. Regression surfaces were constructed from models that included terms to account for linear, interaction, and curvilinear associations among the variables. Analysis suggests the general propositions (a) family educational resources have significant associations with children's school-related outcomes at different levels of sibling variables, the relationships for girls being curvilinear, and (b) sibling variables continue to have small significant associations with affective and cognitive outcomes, after taking into account variations in family educational resources. That is, the investigation provides only partial support for the sibling dilution hypothesis.
Linear regression crash prediction models : issues and proposed solutions.
DOT National Transportation Integrated Search
2010-05-01
The paper develops a linear regression model approach that can be applied to : crash data to predict vehicle crashes. The proposed approach involves novice data aggregation : to satisfy linear regression assumptions; namely error structure normality ...
Comparison between Linear and Nonlinear Regression in a Laboratory Heat Transfer Experiment
ERIC Educational Resources Information Center
Gonçalves, Carine Messias; Schwaab, Marcio; Pinto, José Carlos
2013-01-01
In order to interpret laboratory experimental data, undergraduate students are used to perform linear regression through linearized versions of nonlinear models. However, the use of linearized models can lead to statistically biased parameter estimates. Even so, it is not an easy task to introduce nonlinear regression and show for the students…
Application of near-infrared spectroscopy in the detection of fat-soluble vitamins in premix feed
NASA Astrophysics Data System (ADS)
Jia, Lian Ping; Tian, Shu Li; Zheng, Xue Cong; Jiao, Peng; Jiang, Xun Peng
2018-02-01
Vitamin is the organic compound and necessary for animal physiological maintenance. The rapid determination of the content of different vitamins in premix feed can help to achieve accurate diets and efficient feeding. Compared with high-performance liquid chromatography and other wet chemical methods, near-infrared spectroscopy is a fast, non-destructive, non-polluting method. 168 samples of premix feed were collected and the contents of vitamin A, vitamin E and vitamin D3 were detected by the standard method. The near-infrared spectra of samples ranging from 10 000 to 4 000 cm-1 were obtained. Partial least squares regression (PLSR) and support vector machine regression (SVMR) were used to construct the quantitative model. The results showed that the RMSEP of PLSR model of vitamin A, vitamin E and vitamin D3 were 0.43×107 IU/kg, 0.09×105 IU/kg and 0.17×107 IU/kg, respectively. The RMSEP of SVMR model was 0.45×107 IU/kg, 0.11×105 IU/kg and 0.18×107 IU/kg. Compared with nonlinear regression method (SVMR), linear regression method (PLSR) is more suitable for the quantitative analysis of vitamins in premix feed.
Shrinkage Estimation of Varying Covariate Effects Based On Quantile Regression
Peng, Limin; Xu, Jinfeng; Kutner, Nancy
2013-01-01
Varying covariate effects often manifest meaningful heterogeneity in covariate-response associations. In this paper, we adopt a quantile regression model that assumes linearity at a continuous range of quantile levels as a tool to explore such data dynamics. The consideration of potential non-constancy of covariate effects necessitates a new perspective for variable selection, which, under the assumed quantile regression model, is to retain variables that have effects on all quantiles of interest as well as those that influence only part of quantiles considered. Current work on l1-penalized quantile regression either does not concern varying covariate effects or may not produce consistent variable selection in the presence of covariates with partial effects, a practical scenario of interest. In this work, we propose a shrinkage approach by adopting a novel uniform adaptive LASSO penalty. The new approach enjoys easy implementation without requiring smoothing. Moreover, it can consistently identify the true model (uniformly across quantiles) and achieve the oracle estimation efficiency. We further extend the proposed shrinkage method to the case where responses are subject to random right censoring. Numerical studies confirm the theoretical results and support the utility of our proposals. PMID:25332515
NASA Astrophysics Data System (ADS)
Khazaei, Ardeshir; Sarmasti, Negin; Seyf, Jaber Yousefi
2016-03-01
Quantitative structure activity relationship were used to study a series of curcumin-related compounds with inhibitory effect on prostate cancer PC-3 cells, pancreas cancer Panc-1 cells, and colon cancer HT-29 cells. Sphere exclusion method was used to split data set in two categories of train and test set. Multiple linear regression, principal component regression and partial least squares were used as the regression methods. In other hand, to investigate the effect of feature selection methods, stepwise, Genetic algorithm, and simulated annealing were used. In two cases (PC-3 cells and Panc-1 cells), the best models were generated by a combination of multiple linear regression and stepwise (PC-3 cells: r2 = 0.86, q2 = 0.82, pred_r2 = 0.93, and r2m (test) = 0.43, Panc-1 cells: r2 = 0.85, q2 = 0.80, pred_r2 = 0.71, and r2m (test) = 0.68). For the HT-29 cells, principal component regression with stepwise (r2 = 0.69, q2 = 0.62, pred_r2 = 0.54, and r2m (test) = 0.41) is the best method. The QSAR study reveals descriptors which have crucial role in the inhibitory property of curcumin-like compounds. 6ChainCount, T_C_C_1, and T_O_O_7 are the most important descriptors that have the greatest effect. With a specific end goal to design and optimization of novel efficient curcumin-related compounds it is useful to introduce heteroatoms such as nitrogen, oxygen, and sulfur atoms in the chemical structure (reduce the contribution of T_C_C_1 descriptor) and increase the contribution of 6ChainCount and T_O_O_7 descriptors. Models can be useful in the better design of some novel curcumin-related compounds that can be used in the treatment of prostate, pancreas, and colon cancers.
Wan Mohd Azam, Wan Mohd Yunus; Din, Normah Che; Ahmad, Mahadir; Ghazali, Shazli Ezzat; Ibrahim, Norhayati; Said, Zaini; Ghazali, Ahmad Rohi; Shahar, Suzana; Razali, Rosdinom; Maniam, T
2013-04-01
Loneliness has long been known to have strong association with depression. The relationship between loneliness and depression, however, has been associated with other risk factors including social support. The aim of this paper is to describe the role of social support in the association between loneliness and depression. This cross-sectional study examined the mediating effects of social support among 161 community-based elderly in agricultural settlement of a rural area in Sungai Tengi, Malaysia. Subjects were investigated with De Jong Gierveld Loneliness Scale, Geriatric Depression Scale and Medical Outcome Survey Social Support Survey. Data were analyzed using Pearson correlation, linear and hierarchical regression. Results indicated that social support partially mediated the relationship between loneliness and depression. This suggests that social support affects the linear association between loneliness and depression in the elderly. Copyright © 2013 Wiley Publishing Asia Pty Ltd.
Ghanem, Eman; Hopfer, Helene; Navarro, Andrea; Ritzer, Maxwell S; Mahmood, Lina; Fredell, Morgan; Cubley, Ashley; Bolen, Jessica; Fattah, Rabia; Teasdale, Katherine; Lieu, Linh; Chua, Tedmund; Marini, Federico; Heymann, Hildegarde; Anslyn, Eric V
2015-05-20
Differential sensing using synthetic receptors as mimics of the mammalian senses of taste and smell is a powerful approach for the analysis of complex mixtures. Herein, we report on the effectiveness of a cross-reactive, supramolecular, peptide-based sensing array in differentiating and predicting the composition of red wine blends. Fifteen blends of Cabernet Sauvignon, Merlot and Cabernet Franc, in addition to the mono varietals, were used in this investigation. Linear Discriminant Analysis (LDA) showed a clear differentiation of blends based on tannin concentration and composition where certain mono varietals like Cabernet Sauvignon seemed to contribute less to the overall characteristics of the blend. Partial Least Squares (PLS) Regression and cross validation were used to build a predictive model for the responses of the receptors to eleven binary blends and the three mono varietals. The optimized model was later used to predict the percentage of each mono varietal in an independent test set composted of four tri-blends with a 15% average error. A partial least square regression model using the mouth-feel and taste descriptive sensory attributes of the wine blends revealed a strong correlation of the receptors to perceived astringency, which is indicative of selective binding to polyphenols in wine.
Estimating Biochemical Parameters of Tea (camellia Sinensis (L.)) Using Hyperspectral Techniques
NASA Astrophysics Data System (ADS)
Bian, M.; Skidmore, A. K.; Schlerf, M.; Liu, Y.; Wang, T.
2012-07-01
Tea (Camellia Sinensis (L.)) is an important economic crop and the market price of tea depends largely on its quality. This research aims to explore the potential of hyperspectral remote sensing on predicting the concentration of biochemical components, namely total tea polyphenols, as indicators of tea quality at canopy scale. Experiments were carried out for tea plants growing in the field and greenhouse. Partial least squares regression (PLSR), which has proven to be the one of the most successful empirical approach, was performed to establish the relationship between reflectance and biochemical concentration across six tea varieties in the field. Moreover, a novel integrated approach involving successive projections algorithms as band selection method and neural networks was developed and applied to detect the concentration of total tea polyphenols for one tea variety, in order to explore and model complex nonlinearity relationships between independent (wavebands) and dependent (biochemicals) variables. The good prediction accuracies (r2 > 0.8 and relative RMSEP < 10 %) achieved for tea plants using both linear (partial lease squares regress) and nonlinear (artificial neural networks) modelling approaches in this study demonstrates the feasibility of using airborne and spaceborne sensors to cover wide areas of tea plantation for in situ monitoring of tea quality cheaply and rapidly.
Rapid detection of talcum powder in tea using FT-IR spectroscopy coupled with chemometrics
Li, Xiaoli; Zhang, Yuying; He, Yong
2016-01-01
This paper investigated the feasibility of Fourier transform infrared transmission (FT-IR) spectroscopy to detect talcum powder illegally added in tea based on chemometric methods. Firstly, 210 samples of tea powder with 13 dose levels of talcum powder were prepared for FT-IR spectra acquirement. In order to highlight the slight variations in FT-IR spectra, smoothing, normalize and standard normal variate (SNV) were employed to preprocess the raw spectra. Among them, SNV preprocessing had the best performance with high correlation of prediction (RP = 0.948) and low root mean square error of prediction (RMSEP = 0.108) of partial least squares (PLS) model. Then 18 characteristic wavenumbers were selected based on a hybrid of backward interval partial least squares (biPLS) regression, competitive adaptive reweighted sampling (CARS) algorithm and successive projections algorithm (SPA). These characteristic wavenumbers only accounted for 0.64% of the full wavenumbers. Following that, 18 characteristic wavenumbers were used to build linear and nonlinear determination models by PLS regression and extreme learning machine (ELM), respectively. The optimal model with RP = 0.963 and RMSEP = 0.137 was achieved by ELM algorithm. These results demonstrated that FT-IR spectroscopy with chemometrics could be used successfully to detect talcum powder in tea. PMID:27468701
Effects of perceived stress and uplifts on inflammation and coagulability.
Jain, Shamini; Mills, Paul J; von Känel, Roland; Hong, Suzi; Dimsdale, Joel E
2007-01-01
We investigated whether depressed mood and chronic hassles and uplifts predicted levels of hemostasis markers D-Dimer and type-1 plasminogen activator inhibitor (PAI-1), as well as the proinflammatory markers interleukin-6 (IL-6) and soluble intercellular adhesion molecule-1 (sICAM-1) in 108 healthy individuals. One hundred eight African-American and Euro-American men and women were studied (58 men, 50 women; mean age = 36.5 +/- 8 years). D-Dimer, PAI-1, IL-6, and sICAM-1 plasma levels were analyzed from fasting venous blood samples. Data were analyzed via hierarchical linear regression and followed with partial correlation analysis. Regression analyses combined with partial correlation analyses suggested that increases in hassle frequency predicted elevated levels of sICAM-1 (p= .034), and increases in hassle severity predicted elevated levels of D-Dimer (p= .017). Increases in uplift intensity predicted lower levels of PAI-1 (p= .004) as well as showed a trend for decreased IL-6 (p= .069). Depressed mood did not significantly predict any dependent variable. These results were independent of sociodemographic, biological, and other related mood variables. The findings suggest that for even relatively healthy persons, increased perceptions of hassles are independently associated with greater inflammation and hypercoagulability, whereas increased perceptions of uplifts are independently associated with decreased hypercoagulability.
Linear algebraic theory of partial coherence: discrete fields and measures of partial coherence.
Ozaktas, Haldun M; Yüksel, Serdar; Kutay, M Alper
2002-08-01
A linear algebraic theory of partial coherence is presented that allows precise mathematical definitions of concepts such as coherence and incoherence. This not only provides new perspectives and insights but also allows us to employ the conceptual and algebraic tools of linear algebra in applications. We define several scalar measures of the degree of partial coherence of an optical field that are zero for full incoherence and unity for full coherence. The mathematical definitions are related to our physical understanding of the corresponding concepts by considering them in the context of Young's experiment.
The Application of the Cumulative Logistic Regression Model to Automated Essay Scoring
ERIC Educational Resources Information Center
Haberman, Shelby J.; Sinharay, Sandip
2010-01-01
Most automated essay scoring programs use a linear regression model to predict an essay score from several essay features. This article applied a cumulative logit model instead of the linear regression model to automated essay scoring. Comparison of the performances of the linear regression model and the cumulative logit model was performed on a…
Nonparametric estimation and testing of fixed effects panel data models
Henderson, Daniel J.; Carroll, Raymond J.; Li, Qi
2009-01-01
In this paper we consider the problem of estimating nonparametric panel data models with fixed effects. We introduce an iterative nonparametric kernel estimator. We also extend the estimation method to the case of a semiparametric partially linear fixed effects model. To determine whether a parametric, semiparametric or nonparametric model is appropriate, we propose test statistics to test between the three alternatives in practice. We further propose a test statistic for testing the null hypothesis of random effects against fixed effects in a nonparametric panel data regression model. Simulations are used to examine the finite sample performance of the proposed estimators and the test statistics. PMID:19444335
Non-linear duality invariant partially massless models?
Cherney, D.; Deser, S.; Waldron, A.; ...
2015-12-15
We present manifestly duality invariant, non-linear, equations of motion for maximal depth, partially massless higher spins. These are based on a first order, Maxwell-like formulation of the known partially massless systems. Lastly, our models mimic Dirac–Born–Infeld theory but it is unclear whether they are Lagrangian.
On Partial Fraction Decompositions by Repeated Polynomial Divisions
ERIC Educational Resources Information Center
Man, Yiu-Kwong
2017-01-01
We present a method for finding partial fraction decompositions of rational functions with linear or quadratic factors in the denominators by means of repeated polynomial divisions. This method does not involve differentiation or solving linear equations for obtaining the unknown partial fraction coefficients, which is very suitable for either…
Fernandes, David Douglas Sousa; Gomes, Adriano A; Costa, Gean Bezerra da; Silva, Gildo William B da; Véras, Germano
2011-12-15
This work is concerned of evaluate the use of visible and near-infrared (NIR) range, separately and combined, to determine the biodiesel content in biodiesel/diesel blends using Multiple Linear Regression (MLR) and variable selection by Successive Projections Algorithm (SPA). Full spectrum models employing Partial Least Squares (PLS) and variables selection by Stepwise (SW) regression coupled with Multiple Linear Regression (MLR) and PLS models also with variable selection by Jack-Knife (Jk) were compared the proposed methodology. Several preprocessing were evaluated, being chosen derivative Savitzky-Golay with second-order polynomial and 17-point window for NIR and visible-NIR range, with offset correction. A total of 100 blends with biodiesel content between 5 and 50% (v/v) prepared starting from ten sample of biodiesel. In the NIR and visible region the best model was the SPA-MLR using only two and eight wavelengths with RMSEP of 0.6439% (v/v) and 0.5741 respectively, while in the visible-NIR region the best model was the SW-MLR using five wavelengths and RMSEP of 0.9533% (v/v). Results indicate that both spectral ranges evaluated showed potential for developing a rapid and nondestructive method to quantify biodiesel in blends with mineral diesel. Finally, one can still mention that the improvement in terms of prediction error obtained with the procedure for variables selection was significant. Copyright © 2011 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Gao, Xiangyun; An, Haizhong; Fang, Wei; Huang, Xuan; Li, Huajiao; Zhong, Weiqiong; Ding, Yinghui
2014-07-01
The linear regression parameters between two time series can be different under different lengths of observation period. If we study the whole period by the sliding window of a short period, the change of the linear regression parameters is a process of dynamic transmission over time. We tackle fundamental research that presents a simple and efficient computational scheme: a linear regression patterns transmission algorithm, which transforms linear regression patterns into directed and weighted networks. The linear regression patterns (nodes) are defined by the combination of intervals of the linear regression parameters and the results of the significance testing under different sizes of the sliding window. The transmissions between adjacent patterns are defined as edges, and the weights of the edges are the frequency of the transmissions. The major patterns, the distance, and the medium in the process of the transmission can be captured. The statistical results of weighted out-degree and betweenness centrality are mapped on timelines, which shows the features of the distribution of the results. Many measurements in different areas that involve two related time series variables could take advantage of this algorithm to characterize the dynamic relationships between the time series from a new perspective.
Gao, Xiangyun; An, Haizhong; Fang, Wei; Huang, Xuan; Li, Huajiao; Zhong, Weiqiong; Ding, Yinghui
2014-07-01
The linear regression parameters between two time series can be different under different lengths of observation period. If we study the whole period by the sliding window of a short period, the change of the linear regression parameters is a process of dynamic transmission over time. We tackle fundamental research that presents a simple and efficient computational scheme: a linear regression patterns transmission algorithm, which transforms linear regression patterns into directed and weighted networks. The linear regression patterns (nodes) are defined by the combination of intervals of the linear regression parameters and the results of the significance testing under different sizes of the sliding window. The transmissions between adjacent patterns are defined as edges, and the weights of the edges are the frequency of the transmissions. The major patterns, the distance, and the medium in the process of the transmission can be captured. The statistical results of weighted out-degree and betweenness centrality are mapped on timelines, which shows the features of the distribution of the results. Many measurements in different areas that involve two related time series variables could take advantage of this algorithm to characterize the dynamic relationships between the time series from a new perspective.
Fowler, Stephanie M; Ponnampalam, Eric N; Schmidt, Heinar; Wynn, Peter; Hopkins, David L
2015-12-01
A hand held Raman spectroscopic device was used to predict intramuscular fat (IMF) levels and the major fatty acid (FA) groups of fresh intact ovine M. longissimus lumborum (LL). IMF levels were determined using the Soxhlet method, while FA analysis was conducted using a rapid (KOH in water, methanol and sulphuric acid in water) extraction procedure. IMF levels and FA values were regressed against Raman spectra using partial least squares regression and against each other using linear regression. The results indicate that there is potential to predict PUFA (R(2)=0.93) and MUFA (R(2)=0.54) as well as SFA values that had been adjusted for IMF content (R(2)=0.54). However, this potential was significantly reduced when correlations between predicted and observed values were determined by cross validation (R(2)cv=0.21-0.00). Overall, the prediction of major FA groups using Raman spectra was more precise (relative reductions in error of 0.3-40.8%) compared to the null models. Copyright © 2015 Elsevier Ltd. All rights reserved.
Han, Fu Liang; Li, Zheng; Xu, Yan
2015-12-01
Monomeric anthocyanin contributions to young red wine color were investigated using partial least square regression (PLSR) and aqueous alcohol solutions in this study. Results showed that the correlation between the anthocyanin concentration and the solution color fitted in a quadratic regression rather than linear or cubic regression. Malvidin-3-O-glucoside was estimated to show the highest contribution to young red wine color according to its concentration in wine, whereas peonidin-3-O-glucoside in its concentration contributed the least. The PLSR suggested that delphinidin-3-O-glucoside and peonidin-3-O-glucoside under the same concentration resulted in a stronger color of young red wine compared with malvidin-3-O-glucoside. These estimates were further confirmed by their color in aqueous alcohol solutions. These results suggested that delphinidin-3-O-glucoside and peonidin-3-O-glucoside were primary anthocyanins to enhance young red wine color by increasing their concentrations. This study could provide an alternative approach to improve young red wine color by adjusting anthocyanin composition and concentration. © 2015 Institute of Food Technologists®
Qiu, Shanshan; Wang, Jun; Gao, Liping
2014-07-09
An electronic nose (E-nose) and an electronic tongue (E-tongue) have been used to characterize five types of strawberry juices based on processing approaches (i.e., microwave pasteurization, steam blanching, high temperature short time pasteurization, frozen-thawed, and freshly squeezed). Juice quality parameters (vitamin C, pH, total soluble solid, total acid, and sugar/acid ratio) were detected by traditional measuring methods. Multivariate statistical methods (linear discriminant analysis (LDA) and partial least squares regression (PLSR)) and neural networks (Random Forest (RF) and Support Vector Machines) were employed to qualitative classification and quantitative regression. E-tongue system reached higher accuracy rates than E-nose did, and the simultaneous utilization did have an advantage in LDA classification and PLSR regression. According to cross-validation, RF has shown outstanding and indisputable performances in the qualitative and quantitative analysis. This work indicates that the simultaneous utilization of E-nose and E-tongue can discriminate processed fruit juices and predict quality parameters successfully for the beverage industry.
NASA Astrophysics Data System (ADS)
Chen, Pengfei; Jing, Qi
2017-02-01
An assumption that the non-linear method is more reasonable than the linear method when canopy reflectance is used to establish the yield prediction model was proposed and tested in this study. For this purpose, partial least squares regression (PLSR) and artificial neural networks (ANN), represented linear and non-linear analysis method, were applied and compared for wheat yield prediction. Multi-period Landsat-8 OLI images were collected at two different wheat growth stages, and a field campaign was conducted to obtain grain yields at selected sampling sites in 2014. The field data were divided into a calibration database and a testing database. Using calibration data, a cross-validation concept was introduced for the PLSR and ANN model construction to prevent over-fitting. All models were tested using the test data. The ANN yield-prediction model produced R2, RMSE and RMSE% values of 0.61, 979 kg ha-1, and 10.38%, respectively, in the testing phase, performing better than the PLSR yield-prediction model, which produced R2, RMSE, and RMSE% values of 0.39, 1211 kg ha-1, and 12.84%, respectively. Non-linear method was suggested as a better method for yield prediction.
Chizewski, Michael G; Chiu, Loren Z F
2012-05-01
Joint angle is the relative rotation between two segments where one is a reference and assumed to be non-moving. However, rotation of the reference segment will influence the system's spatial orientation and joint angle. The purpose of this investigation was to determine the contribution of leg and calcaneal rotations to ankle rotation in a weight-bearing task. Forty-eight individuals performed partial squats recorded using a 3D motion capture system. Markers on the calcaneus and leg were used to model leg and calcaneal segment, and ankle joint rotations. Multiple linear regression was used to determine the contribution of leg and calcaneal segment rotations to ankle joint dorsiflexion. Regression models for left (R(2)=0.97) and right (R(2)=0.97) ankle dorsiflexion were significant. Sagittal plane leg rotation had a positive influence (left: β=1.411; right: β=1.418) while sagittal plane calcaneal rotation had a negative influence (left: β=-0.573; right: β=-0.650) on ankle dorsiflexion. Sagittal plane rotations of the leg and calcaneus were positively correlated (left: r=0.84, P<0.001; right: r=0.80, P<0.001). During a partial squat, the calcaneus rotates forward. Simultaneous forward calcaneal rotation with ankle dorsiflexion reduces total ankle dorsiflexion angle. Rear foot posture is reoriented during a partial squat, allowing greater leg rotation in the sagittal plane. Segment rotations may provide greater insight into movement mechanics that cannot be explained via joint rotations alone. Copyright © 2012 Elsevier B.V. All rights reserved.
Korany, Mohamed A; Gazy, Azza A; Khamis, Essam F; Ragab, Marwa A A; Kamal, Miranda F
2018-06-01
This study outlines two robust regression approaches, namely least median of squares (LMS) and iteratively re-weighted least squares (IRLS) to investigate their application in instrument analysis of nutraceuticals (that is, fluorescence quenching of merbromin reagent upon lipoic acid addition). These robust regression methods were used to calculate calibration data from the fluorescence quenching reaction (∆F and F-ratio) under ideal or non-ideal linearity conditions. For each condition, data were treated using three regression fittings: Ordinary Least Squares (OLS), LMS and IRLS. Assessment of linearity, limits of detection (LOD) and quantitation (LOQ), accuracy and precision were carefully studied for each condition. LMS and IRLS regression line fittings showed significant improvement in correlation coefficients and all regression parameters for both methods and both conditions. In the ideal linearity condition, the intercept and slope changed insignificantly, but a dramatic change was observed for the non-ideal condition and linearity intercept. Under both linearity conditions, LOD and LOQ values after the robust regression line fitting of data were lower than those obtained before data treatment. The results obtained after statistical treatment indicated that the linearity ranges for drug determination could be expanded to lower limits of quantitation by enhancing the regression equation parameters after data treatment. Analysis results for lipoic acid in capsules, using both fluorimetric methods, treated by parametric OLS and after treatment by robust LMS and IRLS were compared for both linearity conditions. Copyright © 2018 John Wiley & Sons, Ltd.
Plant leaf chlorophyll content retrieval based on a field imaging spectroscopy system.
Liu, Bo; Yue, Yue-Min; Li, Ru; Shen, Wen-Jing; Wang, Ke-Lin
2014-10-23
A field imaging spectrometer system (FISS; 380-870 nm and 344 bands) was designed for agriculture applications. In this study, FISS was used to gather spectral information from soybean leaves. The chlorophyll content was retrieved using a multiple linear regression (MLR), partial least squares (PLS) regression and support vector machine (SVM) regression. Our objective was to verify the performance of FISS in a quantitative spectral analysis through the estimation of chlorophyll content and to determine a proper quantitative spectral analysis method for processing FISS data. The results revealed that the derivative reflectance was a more sensitive indicator of chlorophyll content and could extract content information more efficiently than the spectral reflectance, which is more significant for FISS data compared to ASD (analytical spectral devices) data, reducing the corresponding RMSE (root mean squared error) by 3.3%-35.6%. Compared with the spectral features, the regression methods had smaller effects on the retrieval accuracy. A multivariate linear model could be the ideal model to retrieve chlorophyll information with a small number of significant wavelengths used. The smallest RMSE of the chlorophyll content retrieved using FISS data was 0.201 mg/g, a relative reduction of more than 30% compared with the RMSE based on a non-imaging ASD spectrometer, which represents a high estimation accuracy compared with the mean chlorophyll content of the sampled leaves (4.05 mg/g). Our study indicates that FISS could obtain both spectral and spatial detailed information of high quality. Its image-spectrum-in-one merit promotes the good performance of FISS in quantitative spectral analyses, and it can potentially be widely used in the agricultural sector.
Plant Leaf Chlorophyll Content Retrieval Based on a Field Imaging Spectroscopy System
Liu, Bo; Yue, Yue-Min; Li, Ru; Shen, Wen-Jing; Wang, Ke-Lin
2014-01-01
A field imaging spectrometer system (FISS; 380–870 nm and 344 bands) was designed for agriculture applications. In this study, FISS was used to gather spectral information from soybean leaves. The chlorophyll content was retrieved using a multiple linear regression (MLR), partial least squares (PLS) regression and support vector machine (SVM) regression. Our objective was to verify the performance of FISS in a quantitative spectral analysis through the estimation of chlorophyll content and to determine a proper quantitative spectral analysis method for processing FISS data. The results revealed that the derivative reflectance was a more sensitive indicator of chlorophyll content and could extract content information more efficiently than the spectral reflectance, which is more significant for FISS data compared to ASD (analytical spectral devices) data, reducing the corresponding RMSE (root mean squared error) by 3.3%–35.6%. Compared with the spectral features, the regression methods had smaller effects on the retrieval accuracy. A multivariate linear model could be the ideal model to retrieve chlorophyll information with a small number of significant wavelengths used. The smallest RMSE of the chlorophyll content retrieved using FISS data was 0.201 mg/g, a relative reduction of more than 30% compared with the RMSE based on a non-imaging ASD spectrometer, which represents a high estimation accuracy compared with the mean chlorophyll content of the sampled leaves (4.05 mg/g). Our study indicates that FISS could obtain both spectral and spatial detailed information of high quality. Its image-spectrum-in-one merit promotes the good performance of FISS in quantitative spectral analyses, and it can potentially be widely used in the agricultural sector. PMID:25341439
Predicting heavy metal concentrations in soils and plants using field spectrophotometry
NASA Astrophysics Data System (ADS)
Muradyan, V.; Tepanosyan, G.; Asmaryan, Sh.; Sahakyan, L.; Saghatelyan, A.; Warner, T. A.
2017-09-01
Aim of this study is to predict heavy metal (HM) concentrations in soils and plants using field remote sensing methods. The studied sites were an industrial town of Kajaran and city of Yerevan. The research also included sampling of soils and leaves of two tree species exposed to different pollution levels and determination of contents of HM in lab conditions. The obtained spectral values were then collated with contents of HM in Kajaran soils and the tree leaves sampled in Yerevan, and statistical analysis was done. Consequently, Zn and Pb have a negative correlation coefficient (p <0.01) in a 2498 nm spectral range for soils. Pb has a significantly higher correlation at red edge for plants. A regression models and artificial neural network (ANN) for HM prediction were developed. Good results were obtained for the best stress sensitive spectral band ANN (R2 0.9, RPD 2.0), Simple Linear Regression (SLR) and Partial Least Squares Regression (PLSR) (R2 0.7, RPD 1.4) models. Multiple Linear Regression (MLR) model was not applicable to predict Pb and Zn concentrations in soils in this research. Almost all full spectrum PLS models provide good calibration and validation results (RPD>1.4). Full spectrum ANN models are characterized by excellent calibration R2, rRMSE and RPD (0.9; 0.1 and >2.5 respectively). For prediction of Pb and Ni contents in plants SLR and PLS models were used. The latter provide almost the same results. Our findings indicate that it is possible to make coarse direct estimation of HM content in soils and plants using rapid and economic reflectance spectroscopy.
Factor Analysis of Linear Type Traits and Their Relation with Longevity in Brazilian Holstein Cattle
Kern, Elisandra Lurdes; Cobuci, Jaime Araújo; Costa, Cláudio Napolis; Pimentel, Concepta Margaret McManus
2014-01-01
In this study we aimed to evaluate the reduction in dimensionality of 20 linear type traits and more final score in 14,943 Holstein cows in Brazil using factor analysis, and indicate their relationship with longevity and 305 d first lactation milk production. Low partial correlations (−0.19 to 0.38), the medium to high Kaiser sampling mean (0.79) and the significance of the Bartlett sphericity test (p<0.001), indicated correlations between type traits and the suitability of these data for a factor analysis, after the elimination of seven traits. Two factors had autovalues greater than one. The first included width and height of posterior udder, udder texture, udder cleft, loin strength, bone quality and final score. The second included stature, top line, chest width, body depth, fore udder attachment, angularity and final score. The linear regression of the factors on several measures of longevity and 305 d milk production showed that selection considering only the first factor should lead to improvements in longevity and 305 milk production. PMID:25050015
Partially Flipped Linear Algebra: A Team-Based Approach
ERIC Educational Resources Information Center
Carney, Debra; Ormes, Nicholas; Swanson, Rebecca
2015-01-01
In this article we describe a partially flipped Introductory Linear Algebra course developed by three faculty members at two different universities. We give motivation for our partially flipped design and describe our implementation in detail. Two main features of our course design are team-developed preview videos and related in-class activities.…
1974-01-01
REGRESSION MODEL - THE UNCONSTRAINED, LINEAR EQUALITY AND INEQUALITY CONSTRAINED APPROACHES January 1974 Nelson Delfino d’Avila Mascarenha;? Image...Report 520 DIGITAL IMAGE RESTORATION UNDER A REGRESSION MODEL THE UNCONSTRAINED, LINEAR EQUALITY AND INEQUALITY CONSTRAINED APPROACHES January...a two- dimensional form adequately describes the linear model . A dis- cretization is performed by using quadrature methods. By trans
Element enrichment factor calculation using grain-size distribution and functional data regression.
Sierra, C; Ordóñez, C; Saavedra, A; Gallego, J R
2015-01-01
In environmental geochemistry studies it is common practice to normalize element concentrations in order to remove the effect of grain size. Linear regression with respect to a particular grain size or conservative element is a widely used method of normalization. In this paper, the utility of functional linear regression, in which the grain-size curve is the independent variable and the concentration of pollutant the dependent variable, is analyzed and applied to detrital sediment. After implementing functional linear regression and classical linear regression models to normalize and calculate enrichment factors, we concluded that the former regression technique has some advantages over the latter. First, functional linear regression directly considers the grain-size distribution of the samples as the explanatory variable. Second, as the regression coefficients are not constant values but functions depending on the grain size, it is easier to comprehend the relationship between grain size and pollutant concentration. Third, regularization can be introduced into the model in order to establish equilibrium between reliability of the data and smoothness of the solutions. Copyright © 2014 Elsevier Ltd. All rights reserved.
Lafuente, Victoria; Herrera, Luis J; Pérez, María del Mar; Val, Jesús; Negueruela, Ignacio
2015-08-15
In this work, near infrared spectroscopy (NIR) and an acoustic measure (AWETA) (two non-destructive methods) were applied in Prunus persica fruit 'Calrico' (n = 260) to predict Magness-Taylor (MT) firmness. Separate and combined use of these measures was evaluated and compared using partial least squares (PLS) and least squares support vector machine (LS-SVM) regression methods. Also, a mutual-information-based variable selection method, seeking to find the most significant variables to produce optimal accuracy of the regression models, was applied to a joint set of variables (NIR wavelengths and AWETA measure). The newly proposed combined NIR-AWETA model gave good values of the determination coefficient (R(2)) for PLS and LS-SVM methods (0.77 and 0.78, respectively), improving the reliability of MT firmness prediction in comparison with separate NIR and AWETA predictions. The three variables selected by the variable selection method (AWETA measure plus NIR wavelengths 675 and 697 nm) achieved R(2) values 0.76 and 0.77, PLS and LS-SVM. These results indicated that the proposed mutual-information-based variable selection algorithm was a powerful tool for the selection of the most relevant variables. © 2014 Society of Chemical Industry.
Agreement evaluation of AVHRR and MODIS 16-day composite NDVI data sets
Ji, Lei; Gallo, Kevin P.; Eidenshink, Jeffery C.; Dwyer, John L.
2008-01-01
Satellite-derived normalized difference vegetation index (NDVI) data have been used extensively to detect and monitor vegetation conditions at regional and global levels. A combination of NDVI data sets derived from AVHRR and MODIS can be used to construct a long NDVI time series that may also be extended to VIIRS. Comparative analysis of NDVI data derived from AVHRR and MODIS is critical to understanding the data continuity through the time series. In this study, the AVHRR and MODIS 16-day composite NDVI products were compared using regression and agreement analysis methods. The analysis shows a high agreement between the AVHRR-NDVI and MODIS-NDVI observed from 2002 and 2003 for the conterminous United States, but the difference between the two data sets is appreciable. Twenty per cent of the total difference between the two data sets is due to systematic difference, with the remainder due to unsystematic difference. The systematic difference can be eliminated with a linear regression-based transformation between two data sets, and the unsystematic difference can be reduced partially by applying spatial filters to the data. We conclude that the continuity of NDVI time series from AVHRR to MODIS is satisfactory, but a linear transformation between the two sets is recommended.
Who Will Win?: Predicting the Presidential Election Using Linear Regression
ERIC Educational Resources Information Center
Lamb, John H.
2007-01-01
This article outlines a linear regression activity that engages learners, uses technology, and fosters cooperation. Students generated least-squares linear regression equations using TI-83 Plus[TM] graphing calculators, Microsoft[C] Excel, and paper-and-pencil calculations using derived normal equations to predict the 2004 presidential election.…
NASA Astrophysics Data System (ADS)
Herath, Imali Kaushalya; Ye, Xuchun; Wang, Jianli; Bouraima, Abdel-Kabirou
2018-02-01
Reference evapotranspiration (ETr) is one of the important parameters in the hydrological cycle. The spatio-temporal variation of ETr and other meteorological parameters that influence ETr were investigated in the Jialing River Basin (JRB), China. The ETr was estimated using the CROPWAT 8.0 computer model based on the Penman-Montieth equation for the period 1964-2014. Mean temperature (MT), relative humidity (RH), sunshine duration (SD), and wind speed (WS) were the main input parameters of CROPWAT while 12 meteorological stations were evaluated. Linear regression and Mann-Kendall methods were applied to study the spatio-temporal trends while the inverse distance weighted (IDW) method was used to identify the spatial distribution of ETr. Stepwise regression and partial correlation methods were used to identify the meteorological variables that most significantly influenced the changes in ETr. The highest annual ETr was found in the northern part of the basin, whereas the lowest rate was recorded in the western part. In the autumn, the highest ETr was recorded in the southeast part of JRB. The annual ETr reflected neither significant increasing nor decreasing trends. Except for the summer, ETr is slightly increasing in other seasons. The MT significantly increased whereas SD and RH were significantly decreased during the 50-year period. Partial correlation and stepwise regression methods found that the impact of meteorological parameters on ETr varies on an annual and seasonal basis while SD, MT, and RH contributed to the changes of annual and seasonal ETr in the JRB.
Yoo, Kwangsun; Rosenberg, Monica D; Hsu, Wei-Ting; Zhang, Sheng; Li, Chiang-Shan R; Scheinost, Dustin; Constable, R Todd; Chun, Marvin M
2018-02-15
Connectome-based predictive modeling (CPM; Finn et al., 2015; Shen et al., 2017) was recently developed to predict individual differences in traits and behaviors, including fluid intelligence (Finn et al., 2015) and sustained attention (Rosenberg et al., 2016a), from functional brain connectivity (FC) measured with fMRI. Here, using the CPM framework, we compared the predictive power of three different measures of FC (Pearson's correlation, accordance, and discordance) and two different prediction algorithms (linear and partial least square [PLS] regression) for attention function. Accordance and discordance are recently proposed FC measures that respectively track in-phase synchronization and out-of-phase anti-correlation (Meskaldji et al., 2015). We defined connectome-based models using task-based or resting-state FC data, and tested the effects of (1) functional connectivity measure and (2) feature-selection/prediction algorithm on individualized attention predictions. Models were internally validated in a training dataset using leave-one-subject-out cross-validation, and externally validated with three independent datasets. The training dataset included fMRI data collected while participants performed a sustained attention task and rested (N = 25; Rosenberg et al., 2016a). The validation datasets included: 1) data collected during performance of a stop-signal task and at rest (N = 83, including 19 participants who were administered methylphenidate prior to scanning; Farr et al., 2014a; Rosenberg et al., 2016b), 2) data collected during Attention Network Task performance and rest (N = 41, Rosenberg et al., in press), and 3) resting-state data and ADHD symptom severity from the ADHD-200 Consortium (N = 113; Rosenberg et al., 2016a). Models defined using all combinations of functional connectivity measure (Pearson's correlation, accordance, and discordance) and prediction algorithm (linear and PLS regression) predicted attentional abilities, with correlations between predicted and observed measures of attention as high as 0.9 for internal validation, and 0.6 for external validation (all p's < 0.05). Models trained on task data outperformed models trained on rest data. Pearson's correlation and accordance features generally showed a small numerical advantage over discordance features, while PLS regression models were usually better than linear regression models. Overall, in addition to correlation features combined with linear models (Rosenberg et al., 2016a), it is useful to consider accordance features and PLS regression for CPM. Copyright © 2017 Elsevier Inc. All rights reserved.
The microcomputer scientific software series 2: general linear model--regression.
Harold M. Rauscher
1983-01-01
The general linear model regression (GLMR) program provides the microcomputer user with a sophisticated regression analysis capability. The output provides a regression ANOVA table, estimators of the regression model coefficients, their confidence intervals, confidence intervals around the predicted Y-values, residuals for plotting, a check for multicollinearity, a...
Combined hydraulic and regenerative braking system
Venkataperumal, R.R.; Mericle, G.E.
1979-08-09
A combined hydraulic and regenerative braking system and method for an electric vehicle is disclosed. The braking system is responsive to the applied hydraulic pressure in a brake line to control the braking of the vehicle to be completely hydraulic up to a first level of brake line pressure, to be partially hydraulic at a constant braking force and partially regenerative at a linearly increasing braking force from the first level of applied brake line pressure to a higher second level of brake line pressure, to be partially hydraulic at a linearly increasing braking force and partially regenerative at a linearly decreasing braking force from the second level of applied line pressure to a third and higher level of applied line pressure, and to be completely hydraulic at a linearly increasing braking force from the third level to all higher applied levels of line pressure.
Combined hydraulic and regenerative braking system
Venkataperumal, Rama R.; Mericle, Gerald E.
1981-06-02
A combined hydraulic and regenerative braking system and method for an electric vehicle, with the braking system being responsive to the applied hydraulic pressure in a brake line to control the braking of the vehicle to be completely hydraulic up to a first level of brake line pressure, to be partially hydraulic at a constant braking force and partially regenerative at a linearly increasing braking force from the first level of applied brake line pressure to a higher second level of brake line pressure, to be partially hydraulic at a linearly increasing braking force and partially regenerative at a linearly decreasing braking force from the second level of applied line pressure to a third and higher level of applied line pressure, and to be completely hydraulic at a linearly increasing braking force from the third level to all higher applied levels of line pressure.
An Extension of the Partial Credit Model with an Application to the Measurement of Change.
ERIC Educational Resources Information Center
Fischer, Gerhard H.; Ponocny, Ivo
1994-01-01
An extension to the partial credit model, the linear partial credit model, is considered under the assumption of a certain linear decomposition of the item x category parameters into basic parameters. A conditional maximum likelihood algorithm for estimating basic parameters is presented and illustrated with simulation and an empirical study. (SLD)
Shafie, Suhaidi; Kawahito, Shoji; Halin, Izhal Abdul; Hasan, Wan Zuha Wan
2009-01-01
The partial charge transfer technique can expand the dynamic range of a CMOS image sensor by synthesizing two types of signal, namely the long and short accumulation time signals. However the short accumulation time signal obtained from partial transfer operation suffers of non-linearity with respect to the incident light. In this paper, an analysis of the non-linearity in partial charge transfer technique has been carried, and the relationship between dynamic range and the non-linearity is studied. The results show that the non-linearity is caused by two factors, namely the current diffusion, which has an exponential relation with the potential barrier, and the initial condition of photodiodes in which it shows that the error in the high illumination region increases as the ratio of the long to the short accumulation time raises. Moreover, the increment of the saturation level of photodiodes also increases the error in the high illumination region.
Developing a dengue forecast model using machine learning: A case study in China
Zhang, Qin; Wang, Li; Xiao, Jianpeng; Zhang, Qingying; Luo, Ganfeng; Li, Zhihao; He, Jianfeng; Zhang, Yonghui; Ma, Wenjun
2017-01-01
Background In China, dengue remains an important public health issue with expanded areas and increased incidence recently. Accurate and timely forecasts of dengue incidence in China are still lacking. We aimed to use the state-of-the-art machine learning algorithms to develop an accurate predictive model of dengue. Methodology/Principal findings Weekly dengue cases, Baidu search queries and climate factors (mean temperature, relative humidity and rainfall) during 2011–2014 in Guangdong were gathered. A dengue search index was constructed for developing the predictive models in combination with climate factors. The observed year and week were also included in the models to control for the long-term trend and seasonality. Several machine learning algorithms, including the support vector regression (SVR) algorithm, step-down linear regression model, gradient boosted regression tree algorithm (GBM), negative binomial regression model (NBM), least absolute shrinkage and selection operator (LASSO) linear regression model and generalized additive model (GAM), were used as candidate models to predict dengue incidence. Performance and goodness of fit of the models were assessed using the root-mean-square error (RMSE) and R-squared measures. The residuals of the models were examined using the autocorrelation and partial autocorrelation function analyses to check the validity of the models. The models were further validated using dengue surveillance data from five other provinces. The epidemics during the last 12 weeks and the peak of the 2014 large outbreak were accurately forecasted by the SVR model selected by a cross-validation technique. Moreover, the SVR model had the consistently smallest prediction error rates for tracking the dynamics of dengue and forecasting the outbreaks in other areas in China. Conclusion and significance The proposed SVR model achieved a superior performance in comparison with other forecasting techniques assessed in this study. The findings can help the government and community respond early to dengue epidemics. PMID:29036169
Wang, D Z; Wang, C; Shen, C F; Zhang, Y; Zhang, H; Song, G D; Xue, X D; Xu, Z L; Zhang, S; Jiang, G H
2017-05-10
We described the time trend of acute myocardial infarction (AMI) from 1999 to 2013 in Tianjin incidence rate with Cochran-Armitage trend (CAT) test and linear regression analysis, and the results were compared. Based on actual population, CAT test had much stronger statistical power than linear regression analysis for both overall incidence trend and age specific incidence trend (Cochran-Armitage trend P value
Hordge, LaQuana N; McDaniel, Kiara L; Jones, Derick D; Fakayode, Sayo O
2016-05-15
The endocrine disruption property of estrogens necessitates the immediate need for effective monitoring and development of analytical protocols for their analyses in biological and human specimens. This study explores the first combined utility of a steady-state fluorescence spectroscopy and multivariate partial-least-square (PLS) regression analysis for the simultaneous determination of two estrogens (17α-ethinylestradiol (EE) and norgestimate (NOR)) concentrations in bovine serum albumin (BSA) and human serum albumin (HSA) samples. The influence of EE and NOR concentrations and temperature on the emission spectra of EE-HSA EE-BSA, NOR-HSA, and NOR-BSA complexes was also investigated. The binding of EE with HSA and BSA resulted in increase in emission characteristics of HSA and BSA and a significant blue spectra shift. In contrast, the interaction of NOR with HSA and BSA quenched the emission characteristics of HSA and BSA. The observed emission spectral shifts preclude the effective use of traditional univariate regression analysis of fluorescent data for the determination of EE and NOR concentrations in HSA and BSA samples. Multivariate partial-least-squares (PLS) regression analysis was utilized to correlate the changes in emission spectra with EE and NOR concentrations in HSA and BSA samples. The figures-of-merit of the developed PLS regression models were excellent, with limits of detection as low as 1.6×10(-8) M for EE and 2.4×10(-7) M for NOR and good linearity (R(2)>0.994985). The PLS models correctly predicted EE and NOR concentrations in independent validation HSA and BSA samples with a root-mean-square-percent-relative-error (RMS%RE) of less than 6.0% at physiological condition. On the contrary, the use of univariate regression resulted in poor predictions of EE and NOR in HSA and BSA samples, with RMS%RE larger than 40% at physiological conditions. High accuracy, low sensitivity, simplicity, low-cost with no prior analyte extraction or separation required makes this method promising, compelling, and attractive alternative for the rapid determination of estrogen concentrations in biomedical and biological specimens, pharmaceuticals, or environmental samples. Published by Elsevier B.V.
Yang, Ruiqi; Wang, Fei; Zhang, Jialing; Zhu, Chonglei; Fan, Limei
2015-05-19
To establish the reference values of thalamus, caudate nucleus and lenticular nucleus diameters through fetal thalamic transverse section. A total of 265 fetuses at our hospital were randomly selected from November 2012 to August 2014. And the transverse and length diameters of thalamus, caudate nucleus and lenticular nucleus were measured. SPSS 19.0 statistical software was used to calculate the regression curve of fetal diameter changes and gestational weeks of pregnancy. P < 0.05 was considered as having statistical significance. The linear regression equation of fetal thalamic length diameter and gestational week was: Y = 0.051X+0.201, R = 0.876, linear regression equation of thalamic transverse diameter and fetal gestational week was: Y = 0.031X+0.229, R = 0.817, linear regression equation of fetal head of caudate nucleus length diameter and gestational age was: Y = 0.033X+0.101, R = 0.722, linear regression equation of fetal head of caudate nucleus transverse diameter and gestational week was: R = 0.025 - 0.046, R = 0.711, linear regression equation of fetal lentiform nucleus length diameter and gestational week was: Y = 0.046+0.229, R = 0.765, linear regression equation of fetal lentiform nucleus diameter and gestational week was: Y = 0.025 - 0.05, R = 0.772. Ultrasonic measurement of diameter of fetal thalamus caudate nucleus, and lenticular nucleus through thalamic transverse section is simple and convenient. And measurements increase with fetal gestational weeks and there is linear regression relationship between them.
Local Linear Regression for Data with AR Errors.
Li, Runze; Li, Yan
2009-07-01
In many statistical applications, data are collected over time, and they are likely correlated. In this paper, we investigate how to incorporate the correlation information into the local linear regression. Under the assumption that the error process is an auto-regressive process, a new estimation procedure is proposed for the nonparametric regression by using local linear regression method and the profile least squares techniques. We further propose the SCAD penalized profile least squares method to determine the order of auto-regressive process. Extensive Monte Carlo simulation studies are conducted to examine the finite sample performance of the proposed procedure, and to compare the performance of the proposed procedures with the existing one. From our empirical studies, the newly proposed procedures can dramatically improve the accuracy of naive local linear regression with working-independent error structure. We illustrate the proposed methodology by an analysis of real data set.
Orthogonal Regression: A Teaching Perspective
ERIC Educational Resources Information Center
Carr, James R.
2012-01-01
A well-known approach to linear least squares regression is that which involves minimizing the sum of squared orthogonal projections of data points onto the best fit line. This form of regression is known as orthogonal regression, and the linear model that it yields is known as the major axis. A similar method, reduced major axis regression, is…
Practical Session: Simple Linear Regression
NASA Astrophysics Data System (ADS)
Clausel, M.; Grégoire, G.
2014-12-01
Two exercises are proposed to illustrate the simple linear regression. The first one is based on the famous Galton's data set on heredity. We use the lm R command and get coefficients estimates, standard error of the error, R2, residuals …In the second example, devoted to data related to the vapor tension of mercury, we fit a simple linear regression, predict values, and anticipate on multiple linear regression. This pratical session is an excerpt from practical exercises proposed by A. Dalalyan at EPNC (see Exercises 1 and 2 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_4.pdf).
Seasonal forecasting of high wind speeds over Western Europe
NASA Astrophysics Data System (ADS)
Palutikof, J. P.; Holt, T.
2003-04-01
As financial losses associated with extreme weather events escalate, there is interest from end users in the forestry and insurance industries, for example, in the development of seasonal forecasting models with a long lead time. This study uses exceedences of the 90th, 95th, and 99th percentiles of daily maximum wind speed over the period 1958 to present to derive predictands of winter wind extremes. The source data is the 6-hourly NCEP Reanalysis gridded surface wind field. Predictor variables include principal components of Atlantic sea surface temperature and several indices of climate variability, including the NAO and SOI. Lead times of up to a year are considered, in monthly increments. Three regression techniques are evaluated; multiple linear regression (MLR), principal component regression (PCR), and partial least squares regression (PLS). PCR and PLS proved considerably superior to MLR with much lower standard errors. PLS was chosen to formulate the predictive model since it offers more flexibility in experimental design and gave slightly better results than PCR. The results indicate that winter windiness can be predicted with considerable skill one year ahead for much of coastal Europe, but that this deteriorates rapidly in the hinterland. The experiment succeeded in highlighting PLS as a very useful method for developing more precise forecasting models, and in identifying areas of high predictability.
Morse Code, Scrabble, and the Alphabet
ERIC Educational Resources Information Center
Richardson, Mary; Gabrosek, John; Reischman, Diann; Curtiss, Phyliss
2004-01-01
In this paper we describe an interactive activity that illustrates simple linear regression. Students collect data and analyze it using simple linear regression techniques taught in an introductory applied statistics course. The activity is extended to illustrate checks for regression assumptions and regression diagnostics taught in an…
Bin Ratio-Based Histogram Distances and Their Application to Image Classification.
Hu, Weiming; Xie, Nianhua; Hu, Ruiguang; Ling, Haibin; Chen, Qiang; Yan, Shuicheng; Maybank, Stephen
2014-12-01
Large variations in image background may cause partial matching and normalization problems for histogram-based representations, i.e., the histograms of the same category may have bins which are significantly different, and normalization may produce large changes in the differences between corresponding bins. In this paper, we deal with this problem by using the ratios between bin values of histograms, rather than bin values' differences which are used in the traditional histogram distances. We propose a bin ratio-based histogram distance (BRD), which is an intra-cross-bin distance, in contrast with previous bin-to-bin distances and cross-bin distances. The BRD is robust to partial matching and histogram normalization, and captures correlations between bins with only a linear computational complexity. We combine the BRD with the ℓ1 histogram distance and the χ(2) histogram distance to generate the ℓ1 BRD and the χ(2) BRD, respectively. These combinations exploit and benefit from the robustness of the BRD under partial matching and the robustness of the ℓ1 and χ(2) distances to small noise. We propose a method for assessing the robustness of histogram distances to partial matching. The BRDs and logistic regression-based histogram fusion are applied to image classification. The experimental results on synthetic data sets show the robustness of the BRDs to partial matching, and the experiments on seven benchmark data sets demonstrate promising results of the BRDs for image classification.
Marchese Robinson, Richard L; Palczewska, Anna; Palczewski, Jan; Kidley, Nathan
2017-08-28
The ability to interpret the predictions made by quantitative structure-activity relationships (QSARs) offers a number of advantages. While QSARs built using nonlinear modeling approaches, such as the popular Random Forest algorithm, might sometimes be more predictive than those built using linear modeling approaches, their predictions have been perceived as difficult to interpret. However, a growing number of approaches have been proposed for interpreting nonlinear QSAR models in general and Random Forest in particular. In the current work, we compare the performance of Random Forest to those of two widely used linear modeling approaches: linear Support Vector Machines (SVMs) (or Support Vector Regression (SVR)) and partial least-squares (PLS). We compare their performance in terms of their predictivity as well as the chemical interpretability of the predictions using novel scoring schemes for assessing heat map images of substructural contributions. We critically assess different approaches for interpreting Random Forest models as well as for obtaining predictions from the forest. We assess the models on a large number of widely employed public-domain benchmark data sets corresponding to regression and binary classification problems of relevance to hit identification and toxicology. We conclude that Random Forest typically yields comparable or possibly better predictive performance than the linear modeling approaches and that its predictions may also be interpreted in a chemically and biologically meaningful way. In contrast to earlier work looking at interpretation of nonlinear QSAR models, we directly compare two methodologically distinct approaches for interpreting Random Forest models. The approaches for interpreting Random Forest assessed in our article were implemented using open-source programs that we have made available to the community. These programs are the rfFC package ( https://r-forge.r-project.org/R/?group_id=1725 ) for the R statistical programming language and the Python program HeatMapWrapper [ https://doi.org/10.5281/zenodo.495163 ] for heat map generation.
Douglas, R K; Nawar, S; Alamar, M C; Mouazen, A M; Coulon, F
2018-03-01
Visible and near infrared spectrometry (vis-NIRS) coupled with data mining techniques can offer fast and cost-effective quantitative measurement of total petroleum hydrocarbons (TPH) in contaminated soils. Literature showed however significant differences in the performance on the vis-NIRS between linear and non-linear calibration methods. This study compared the performance of linear partial least squares regression (PLSR) with a nonlinear random forest (RF) regression for the calibration of vis-NIRS when analysing TPH in soils. 88 soil samples (3 uncontaminated and 85 contaminated) collected from three sites located in the Niger Delta were scanned using an analytical spectral device (ASD) spectrophotometer (350-2500nm) in diffuse reflectance mode. Sequential ultrasonic solvent extraction-gas chromatography (SUSE-GC) was used as reference quantification method for TPH which equal to the sum of aliphatic and aromatic fractions ranging between C 10 and C 35 . Prior to model development, spectra were subjected to pre-processing including noise cut, maximum normalization, first derivative and smoothing. Then 65 samples were selected as calibration set and the remaining 20 samples as validation set. Both vis-NIR spectrometry and gas chromatography profiles of the 85 soil samples were subjected to RF and PLSR with leave-one-out cross-validation (LOOCV) for the calibration models. Results showed that RF calibration model with a coefficient of determination (R 2 ) of 0.85, a root means square error of prediction (RMSEP) 68.43mgkg -1 , and a residual prediction deviation (RPD) of 2.61 outperformed PLSR (R 2 =0.63, RMSEP=107.54mgkg -1 and RDP=2.55) in cross-validation. These results indicate that RF modelling approach is accounting for the nonlinearity of the soil spectral responses hence, providing significantly higher prediction accuracy compared to the linear PLSR. It is recommended to adopt the vis-NIRS coupled with RF modelling approach as a portable and cost effective method for the rapid quantification of TPH in soils. Copyright © 2017 Elsevier B.V. All rights reserved.
Advanced statistics: linear regression, part II: multiple linear regression.
Marill, Keith A
2004-01-01
The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.
NASA Astrophysics Data System (ADS)
Kang, Pilsang; Koo, Changhoi; Roh, Hokyu
2017-11-01
Since simple linear regression theory was established at the beginning of the 1900s, it has been used in a variety of fields. Unfortunately, it cannot be used directly for calibration. In practical calibrations, the observed measurements (the inputs) are subject to errors, and hence they vary, thus violating the assumption that the inputs are fixed. Therefore, in the case of calibration, the regression line fitted using the method of least squares is not consistent with the statistical properties of simple linear regression as already established based on this assumption. To resolve this problem, "classical regression" and "inverse regression" have been proposed. However, they do not completely resolve the problem. As a fundamental solution, we introduce "reversed inverse regression" along with a new methodology for deriving its statistical properties. In this study, the statistical properties of this regression are derived using the "error propagation rule" and the "method of simultaneous error equations" and are compared with those of the existing regression approaches. The accuracy of the statistical properties thus derived is investigated in a simulation study. We conclude that the newly proposed regression and methodology constitute the complete regression approach for univariate linear calibrations.
Developing GIS-based eastern equine encephalitis vector-host models in Tuskegee, Alabama.
Jacob, Benjamin G; Burkett-Cadena, Nathan D; Luvall, Jeffrey C; Parcak, Sarah H; McClure, Christopher J W; Estep, Laura K; Hill, Geoffrey E; Cupp, Eddie W; Novak, Robert J; Unnasch, Thomas R
2010-02-24
A site near Tuskegee, Alabama was examined for vector-host activities of eastern equine encephalomyelitis virus (EEEV). Land cover maps of the study site were created in ArcInfo 9.2 from QuickBird data encompassing visible and near-infrared (NIR) band information (0.45 to 0.72 microm) acquired July 15, 2008. Georeferenced mosquito and bird sampling sites, and their associated land cover attributes from the study site, were overlaid onto the satellite data. SAS 9.1.4 was used to explore univariate statistics and to generate regression models using the field and remote-sampled mosquito and bird data. Regression models indicated that Culex erracticus and Northern Cardinals were the most abundant mosquito and bird species, respectively. Spatial linear prediction models were then generated in Geostatistical Analyst Extension of ArcGIS 9.2. Additionally, a model of the study site was generated, based on a Digital Elevation Model (DEM), using ArcScene extension of ArcGIS 9.2. For total mosquito count data, a first-order trend ordinary kriging process was fitted to the semivariogram at a partial sill of 5.041 km, nugget of 6.325 km, lag size of 7.076 km, and range of 31.43 km, using 12 lags. For total adult Cx. erracticus count, a first-order trend ordinary kriging process was fitted to the semivariogram at a partial sill of 5.764 km, nugget of 6.114 km, lag size of 7.472 km, and range of 32.62 km, using 12 lags. For the total bird count data, a first-order trend ordinary kriging process was fitted to the semivariogram at a partial sill of 4.998 km, nugget of 5.413 km, lag size of 7.549 km and range of 35.27 km, using 12 lags. For the Northern Cardinal count data, a first-order trend ordinary kriging process was fitted to the semivariogram at a partial sill of 6.387 km, nugget of 5.935 km, lag size of 8.549 km and a range of 41.38 km, using 12 lags. Results of the DEM analyses indicated a statistically significant inverse linear relationship between total sampled mosquito data and elevation (R2 = -.4262; p < .0001), with a standard deviation (SD) of 10.46, and total sampled bird data and elevation (R2 = -.5111; p < .0001), with a SD of 22.97. DEM statistics also indicated a significant inverse linear relationship between total sampled Cx. erracticus data and elevation (R2 = -.4711; p < .0001), with a SD of 11.16, and the total sampled Northern Cardinal data and elevation (R2 = -.5831; p < .0001), SD of 11.42. These data demonstrate that GIS/remote sensing models and spatial statistics can capture space-varying functional relationships between field-sampled mosquito and bird parameters for determining risk for EEEV transmission.
A comparison of methods for the analysis of binomial clustered outcomes in behavioral research.
Ferrari, Alberto; Comelli, Mario
2016-12-01
In behavioral research, data consisting of a per-subject proportion of "successes" and "failures" over a finite number of trials often arise. This clustered binary data are usually non-normally distributed, which can distort inference if the usual general linear model is applied and sample size is small. A number of more advanced methods is available, but they are often technically challenging and a comparative assessment of their performances in behavioral setups has not been performed. We studied the performances of some methods applicable to the analysis of proportions; namely linear regression, Poisson regression, beta-binomial regression and Generalized Linear Mixed Models (GLMMs). We report on a simulation study evaluating power and Type I error rate of these models in hypothetical scenarios met by behavioral researchers; plus, we describe results from the application of these methods on data from real experiments. Our results show that, while GLMMs are powerful instruments for the analysis of clustered binary outcomes, beta-binomial regression can outperform them in a range of scenarios. Linear regression gave results consistent with the nominal level of significance, but was overall less powerful. Poisson regression, instead, mostly led to anticonservative inference. GLMMs and beta-binomial regression are generally more powerful than linear regression; yet linear regression is robust to model misspecification in some conditions, whereas Poisson regression suffers heavily from violations of the assumptions when used to model proportion data. We conclude providing directions to behavioral scientists dealing with clustered binary data and small sample sizes. Copyright © 2016 Elsevier B.V. All rights reserved.
Pietrzak, Robert H.; Goldstein, Risë B.; Southwick, Steven M.; Grant, Bridget F.
2011-01-01
Background/Objectives Trauma exposure and posttraumatic stress disorder (PTSD) may increase risk for medical conditions in older adults. We present findings on past-year medical conditions associated with lifetime trauma exposure, and full and partial PTSD, in a nationally representative sample of U.S. older adults. Design, Setting, Participants, and Measurements Face-to-face diagnostic interviews were conducted with 9,463 adults aged 60 and older in the Wave 2 National Epidemiologic Survey on Alcohol and Related Conditions. Logistic regression analyses adjusting for sociodemographics and psychiatric comorbidity evaluated associations between PTSD status and past-year medical disorders; linear regression models evaluated associations with past-month physical functioning. Results After adjustment for sociodemographic characteristics and comorbid lifetime mood, anxiety, substance use, attention-deficit/hyperactivity, and personality disorders, respondents with lifetime PTSD were more likely than trauma controls to report being diagnosed by a healthcare professional with hypertension, angina pectoris, tachycardia, other heart disease, stomach ulcer, gastritis, and arthritis (odds ratios [ORs]=1.3–1.8); they also scored lower on a measure of physical functioning than controls and respondents with partial PTSD. Respondents with lifetime partial PTSD were more likely than controls to report past-year diagnoses of gastritis (OR=1.7), angina pectoris (OR=1.5), and arthritis (OR=1.4), and reported worse physical functioning. Number of lifetime traumatic event types was associated with most of the medical conditions assessed; adjustment for these events reduced the magnitudes of and rendered non-significant most associations between PTSD status and medical conditions. Conclusion Older adults with lifetime PTSD have elevated rates of several physical health conditions, many of which are chronic disorders of aging, and poorer physical functioning. Older adults with lifetime partial PTSD have elevated rates of gastritis, angina pectoris, and arthritis, and poorer physical functioning. PMID:22283516
Ghasemi, Jahan B; Safavi-Sohi, Reihaneh; Barbosa, Euzébio G
2012-02-01
A quasi 4D-QSAR has been carried out on a series of potent Gram-negative LpxC inhibitors. This approach makes use of the molecular dynamics (MD) trajectories and topology information retrieved from the GROMACS package. This new methodology is based on the generation of a conformational ensemble profile, CEP, for each compound instead of only one conformation, followed by the calculation intermolecular interaction energies at each grid point considering probes and all aligned conformations resulting from MD simulations. These interaction energies are independent variables employed in a QSAR analysis. The comparison of the proposed methodology to comparative molecular field analysis (CoMFA) formalism was performed. This methodology explores jointly the main features of CoMFA and 4D-QSAR models. Step-wise multiple linear regression was used for the selection of the most informative variables. After variable selection, multiple linear regression (MLR) and partial least squares (PLS) methods used for building the regression models. Leave-N-out cross-validation (LNO), and Y-randomization were performed in order to confirm the robustness of the model in addition to analysis of the independent test set. Best models provided the following statistics: [Formula in text] (PLS) and [Formula in text] (MLR). Docking study was applied to investigate the major interactions in protein-ligand complex with CDOCKER algorithm. Visualization of the descriptors of the best model helps us to interpret the model from the chemical point of view, supporting the applicability of this new approach in rational drug design.
NASA Astrophysics Data System (ADS)
Jintao, Xue; Yufei, Liu; Liming, Ye; Chunyan, Li; Quanwei, Yang; Weiying, Wang; Yun, Jing; Minxiang, Zhang; Peng, Li
2018-01-01
Near-Infrared Spectroscopy (NIRS) was first used to develop a method for rapid and simultaneous determination of 5 active alkaloids (berberine, coptisine, palmatine, epiberberine and jatrorrhizine) in 4 parts (rhizome, fibrous root, stem and leaf) of Coptidis Rhizoma. A total of 100 samples from 4 main places of origin were collected and studied. With HPLC analysis values as calibration reference, the quantitative analysis of 5 marker components was performed by two different modeling methods, partial least-squares (PLS) regression as linear regression and artificial neural networks (ANN) as non-linear regression. The results indicated that the 2 types of models established were robust, accurate and repeatable for five active alkaloids, and the ANN models was more suitable for the determination of berberine, coptisine and palmatine while the PLS model was more suitable for the analysis of epiberberine and jatrorrhizine. The performance of the optimal models was achieved as follows: the correlation coefficient (R) for berberine, coptisine, palmatine, epiberberine and jatrorrhizine was 0.9958, 0.9956, 0.9959, 0.9963 and 0.9923, respectively; the root mean square error of validation (RMSEP) was 0.5093, 0.0578, 0.0443, 0.0563 and 0.0090, respectively. Furthermore, for the comprehensive exploitation and utilization of plant resource of Coptidis Rhizoma, the established NIR models were used to analysis the content of 5 active alkaloids in 4 parts of Coptidis Rhizoma and 4 main origin of places. This work demonstrated that NIRS may be a promising method as routine screening for off-line fast analysis or on-line quality assessment of traditional Chinese medicine (TCM).
Vajargah, Kianoush Fathi; Sadeghi-Bazargani, Homayoun; Mehdizadeh-Esfanjani, Robab; Savadi-Oskouei, Daryoush; Farhoudi, Mehdi
2012-01-01
The objective of the present study was to assess the comparable applicability of orthogonal projections to latent structures (OPLS) statistical model vs traditional linear regression in order to investigate the role of trans cranial doppler (TCD) sonography in predicting ischemic stroke prognosis. The study was conducted on 116 ischemic stroke patients admitted to a specialty neurology ward. The Unified Neurological Stroke Scale was used once for clinical evaluation on the first week of admission and again six months later. All data was primarily analyzed using simple linear regression and later considered for multivariate analysis using PLS/OPLS models through the SIMCA P+12 statistical software package. The linear regression analysis results used for the identification of TCD predictors of stroke prognosis were confirmed through the OPLS modeling technique. Moreover, in comparison to linear regression, the OPLS model appeared to have higher sensitivity in detecting the predictors of ischemic stroke prognosis and detected several more predictors. Applying the OPLS model made it possible to use both single TCD measures/indicators and arbitrarily dichotomized measures of TCD single vessel involvement as well as the overall TCD result. In conclusion, the authors recommend PLS/OPLS methods as complementary rather than alternative to the available classical regression models such as linear regression.
Partial least squares (PLS) analysis offers a number of advantages over the more traditionally used regression analyses applied in landscape ecology, particularly for determining the associations among multiple constituents of surface water and landscape configuration. Common dat...
Quality of life in breast cancer patients--a quantile regression analysis.
Pourhoseingholi, Mohamad Amin; Safaee, Azadeh; Moghimi-Dehkordi, Bijan; Zeighami, Bahram; Faghihzadeh, Soghrat; Tabatabaee, Hamid Reza; Pourhoseingholi, Asma
2008-01-01
Quality of life study has an important role in health care especially in chronic diseases, in clinical judgment and in medical resources supplying. Statistical tools like linear regression are widely used to assess the predictors of quality of life. But when the response is not normal the results are misleading. The aim of this study is to determine the predictors of quality of life in breast cancer patients, using quantile regression model and compare to linear regression. A cross-sectional study conducted on 119 breast cancer patients that admitted and treated in chemotherapy ward of Namazi hospital in Shiraz. We used QLQ-C30 questionnaire to assessment quality of life in these patients. A quantile regression was employed to assess the assocciated factors and the results were compared to linear regression. All analysis carried out using SAS. The mean score for the global health status for breast cancer patients was 64.92+/-11.42. Linear regression showed that only grade of tumor, occupational status, menopausal status, financial difficulties and dyspnea were statistically significant. In spite of linear regression, financial difficulties were not significant in quantile regression analysis and dyspnea was only significant for first quartile. Also emotion functioning and duration of disease statistically predicted the QOL score in the third quartile. The results have demonstrated that using quantile regression leads to better interpretation and richer inference about predictors of the breast cancer patient quality of life.
Interpretation of commonly used statistical regression models.
Kasza, Jessica; Wolfe, Rory
2014-01-01
A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.
Use of probabilistic weights to enhance linear regression myoelectric control
NASA Astrophysics Data System (ADS)
Smith, Lauren H.; Kuiken, Todd A.; Hargrove, Levi J.
2015-12-01
Objective. Clinically available prostheses for transradial amputees do not allow simultaneous myoelectric control of degrees of freedom (DOFs). Linear regression methods can provide simultaneous myoelectric control, but frequently also result in difficulty with isolating individual DOFs when desired. This study evaluated the potential of using probabilistic estimates of categories of gross prosthesis movement, which are commonly used in classification-based myoelectric control, to enhance linear regression myoelectric control. Approach. Gaussian models were fit to electromyogram (EMG) feature distributions for three movement classes at each DOF (no movement, or movement in either direction) and used to weight the output of linear regression models by the probability that the user intended the movement. Eight able-bodied and two transradial amputee subjects worked in a virtual Fitts’ law task to evaluate differences in controllability between linear regression and probability-weighted regression for an intramuscular EMG-based three-DOF wrist and hand system. Main results. Real-time and offline analyses in able-bodied subjects demonstrated that probability weighting improved performance during single-DOF tasks (p < 0.05) by preventing extraneous movement at additional DOFs. Similar results were seen in experiments with two transradial amputees. Though goodness-of-fit evaluations suggested that the EMG feature distributions showed some deviations from the Gaussian, equal-covariance assumptions used in this experiment, the assumptions were sufficiently met to provide improved performance compared to linear regression control. Significance. Use of probability weights can improve the ability to isolate individual during linear regression myoelectric control, while maintaining the ability to simultaneously control multiple DOFs.
WAVELET-DOMAIN REGRESSION AND PREDICTIVE INFERENCE IN PSYCHIATRIC NEUROIMAGING
Reiss, Philip T.; Huo, Lan; Zhao, Yihong; Kelly, Clare; Ogden, R. Todd
2016-01-01
An increasingly important goal of psychiatry is the use of brain imaging data to develop predictive models. Here we present two contributions to statistical methodology for this purpose. First, we propose and compare a set of wavelet-domain procedures for fitting generalized linear models with scalar responses and image predictors: sparse variants of principal component regression and of partial least squares, and the elastic net. Second, we consider assessing the contribution of image predictors over and above available scalar predictors, in particular via permutation tests and an extension of the idea of confounding to the case of functional or image predictors. Using the proposed methods, we assess whether maps of a spontaneous brain activity measure, derived from functional magnetic resonance imaging, can meaningfully predict presence or absence of attention deficit/hyperactivity disorder (ADHD). Our results shed light on the role of confounding in the surprising outcome of the recent ADHD-200 Global Competition, which challenged researchers to develop algorithms for automated image-based diagnosis of the disorder. PMID:27330652
Kuriakose, Saji; Joe, I Hubert
2013-11-01
Determination of the authenticity of essential oils has become more significant, in recent years, following some illegal adulteration and contamination scandals. The present investigative study focuses on the application of near infrared spectroscopy to detect sample authenticity and quantify economic adulteration of sandalwood oils. Several data pre-treatments are investigated for calibration and prediction using partial least square regression (PLSR). The quantitative data analysis is done using a new spectral approach - full spectrum or sequential spectrum. The optimum number of PLS components is obtained according to the lowest root mean square error of calibration (RMSEC=0.00009% v/v). The lowest root mean square error of prediction (RMSEP=0.00016% v/v) in the test set and the highest coefficient of determination (R(2)=0.99989) are used as the evaluation tools for the best model. A nonlinear method, locally weighted regression (LWR), is added to extract nonlinear information and to compare with the linear PLSR model. Copyright © 2013 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Kuriakose, Saji; Joe, I. Hubert
2013-11-01
Determination of the authenticity of essential oils has become more significant, in recent years, following some illegal adulteration and contamination scandals. The present investigative study focuses on the application of near infrared spectroscopy to detect sample authenticity and quantify economic adulteration of sandalwood oils. Several data pre-treatments are investigated for calibration and prediction using partial least square regression (PLSR). The quantitative data analysis is done using a new spectral approach - full spectrum or sequential spectrum. The optimum number of PLS components is obtained according to the lowest root mean square error of calibration (RMSEC = 0.00009% v/v). The lowest root mean square error of prediction (RMSEP = 0.00016% v/v) in the test set and the highest coefficient of determination (R2 = 0.99989) are used as the evaluation tools for the best model. A nonlinear method, locally weighted regression (LWR), is added to extract nonlinear information and to compare with the linear PLSR model.
Linear and nonlinear methods in modeling the aqueous solubility of organic compounds.
Catana, Cornel; Gao, Hua; Orrenius, Christian; Stouten, Pieter F W
2005-01-01
Solubility data for 930 diverse compounds have been analyzed using linear Partial Least Square (PLS) and nonlinear PLS methods, Continuum Regression (CR), and Neural Networks (NN). 1D and 2D descriptors from MOE package in combination with E-state or ISIS keys have been used. The best model was obtained using linear PLS for a combination between 22 MOE descriptors and 65 ISIS keys. It has a correlation coefficient (r2) of 0.935 and a root-mean-square error (RMSE) of 0.468 log molar solubility (log S(w)). The model validated on a test set of 177 compounds not included in the training set has r2 0.911 and RMSE 0.475 log S(w). The descriptors were ranked according to their importance, and at the top of the list have been found the 22 MOE descriptors. The CR model produced results as good as PLS, and because of the way in which cross-validation has been done it is expected to be a valuable tool in prediction besides PLS model. The statistics obtained using nonlinear methods did not surpass those got with linear ones. The good statistic obtained for linear PLS and CR recommends these models to be used in prediction when it is difficult or impossible to make experimental measurements, for virtual screening, combinatorial library design, and efficient leads optimization.
Sorokin, Igor; Feustel, Paul J; O'Malley, Rebecca L
2017-10-01
The purpose of the study was to compare utilization and predictors of partial nephrectomy (PN) in the pre- and post-guideline eras. American Board of Urology certification/recertification operative logs were reviewed from 2003 to 2014. Nephrectomy cases were extracted using Current Procedural Terminology codes. The cases were then stratified according to pre-guidelines (2003-October 2009) and post-guidelines (November 2009-2014). Multivariable logistic regression was used to evaluate patient, surgeon, and practice characteristics as predictors of PN. A general linear model with regression analysis was used to evaluate the change in PN over time relative to the incidence of renal cell carcinoma (RCC). We identified 20,402 and 20,729 nephrectomies in the pre- and post-guidelines eras, respectively. In multivariable analysis, the post-guidelines group was more likely to undergo PN (odds ratio, 1.87; P < .001). The pre- as well as post-guidelines groups had a higher likelihood of undergoing PN with an open approach, higher-volume surgeons, and younger patient age (P < .05). Surgeon subspecialty and US region were no longer significant factors after guidelines publication. Number of PN normalized to the incidence of RCC continued to increase over time (0.14%/y; R 2 = 0.77; P < .001). Partial nephrectomy in the post-guidelines era is no longer confined to urological subspecialists or certain densely populated US regions. Although rates of PN continue to increase relative to the recently decreasing overall incidence of RCC, the slope has leveled off somewhat. This is likely related to clinical intricacies of the best treatment modality and technologic advances rather than changes related to guidelines publication. Published by Elsevier Inc.
Sadler, Susannah; Angus, Colin; Gavens, Lucy; Gillespie, Duncan; Holmes, John; Hamilton, Jean; Brennan, Alan; Meier, Petra
2017-05-01
In many countries, conflicting gradients in alcohol consumption and alcohol-associated mortality have been observed. To understand this 'alcohol harm paradox' we analysed the socio-economic gradient in alcohol-associated hospital admissions to test whether it was greater in conditions which were: (1) chronic (associated with long-term drinking) and partially alcohol-attributable, (2) chronic and wholly alcohol-attributable, (3) acute (associated with intoxication) and partially alcohol-attributable and (4) acute and wholly alcohol-attributable. Our aim was to clarify how (1) drinking patterns (e.g. intoxication linked to acute admissions or dependence linked to chronic conditions) and (2) non-alcohol causes (e.g. smoking and poor diet which are risks for partially alcohol-attributable conditions) contribute to the paradox. Regression analysis testing the modifying effects of condition-group (1-4 above) and sex on the relationship between area-based deprivation and admissions. England, April 2010-March 2013. A total of 9 239 629 English hospital admissions where a primary or secondary cause was one of 36 alcohol-associated conditions. Admissions by condition and deciles of Index of Multiple Deprivation (IMD). Socio-economic gradient measured as the relative index of inequality (RII, the slope of a linear regression of IMD on admissions adjusted for overall admission rate). Conditions were categorized by ICD-10 code. A socio-economic gradient in hospitalizations was seen for all conditions, except partially attributable chronic conditions. The gradient was significantly steeper for conditions which were wholly attributable to alcohol and for acute conditions than for conditions partially alcohol-attributable and for chronic conditions. Gradients were steeper for men than for women in cases of wholly alcohol attributable conditions. There is a socio-economic gradient in English hospital admission for most alcohol-associated conditions. The greatest inequalities are in conditions associated with alcohol dependence, such as liver disease and mental and behavioural conditions, and in acute conditions, such as alcohol poisoning and assault. Socio-economic differences in harmful drinking patterns (dependence and intoxication) may contribute to the 'alcohol harm paradox'. © 2016 Society for the Study of Addiction.
Kernel Partial Least Squares for Nonlinear Regression and Discrimination
NASA Technical Reports Server (NTRS)
Rosipal, Roman; Clancy, Daniel (Technical Monitor)
2002-01-01
This paper summarizes recent results on applying the method of partial least squares (PLS) in a reproducing kernel Hilbert space (RKHS). A previously proposed kernel PLS regression model was proven to be competitive with other regularized regression methods in RKHS. The family of nonlinear kernel-based PLS models is extended by considering the kernel PLS method for discrimination. Theoretical and experimental results on a two-class discrimination problem indicate usefulness of the method.
Partial least squares (PLS) analysis offers a number of advantages over the more traditionally used regression analyses applied in landscape ecology to study the associations among constituents of surface water and landscapes. Common data problems in ecological studies include: s...
Simplified large African carnivore density estimators from track indices.
Winterbach, Christiaan W; Ferreira, Sam M; Funston, Paul J; Somers, Michael J
2016-01-01
The range, population size and trend of large carnivores are important parameters to assess their status globally and to plan conservation strategies. One can use linear models to assess population size and trends of large carnivores from track-based surveys on suitable substrates. The conventional approach of a linear model with intercept may not intercept at zero, but may fit the data better than linear model through the origin. We assess whether a linear regression through the origin is more appropriate than a linear regression with intercept to model large African carnivore densities and track indices. We did simple linear regression with intercept analysis and simple linear regression through the origin and used the confidence interval for ß in the linear model y = αx + ß, Standard Error of Estimate, Mean Squares Residual and Akaike Information Criteria to evaluate the models. The Lion on Clay and Low Density on Sand models with intercept were not significant ( P > 0.05). The other four models with intercept and the six models thorough origin were all significant ( P < 0.05). The models using linear regression with intercept all included zero in the confidence interval for ß and the null hypothesis that ß = 0 could not be rejected. All models showed that the linear model through the origin provided a better fit than the linear model with intercept, as indicated by the Standard Error of Estimate and Mean Square Residuals. Akaike Information Criteria showed that linear models through the origin were better and that none of the linear models with intercept had substantial support. Our results showed that linear regression through the origin is justified over the more typical linear regression with intercept for all models we tested. A general model can be used to estimate large carnivore densities from track densities across species and study areas. The formula observed track density = 3.26 × carnivore density can be used to estimate densities of large African carnivores using track counts on sandy substrates in areas where carnivore densities are 0.27 carnivores/100 km 2 or higher. To improve the current models, we need independent data to validate the models and data to test for non-linear relationship between track indices and true density at low densities.
[From clinical judgment to linear regression model.
Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O
2013-01-01
When we think about mathematical models, such as linear regression model, we think that these terms are only used by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and has normal distribution. Stated in another way, the regression is used to predict a measure based on the knowledge of at least one other variable. Linear regression has as it's first objective to determine the slope or inclination of the regression line: Y = a + bx, where "a" is the intercept or regression constant and it is equivalent to "Y" value when "X" equals 0 and "b" (also called slope) indicates the increase or decrease that occurs when the variable "x" increases or decreases in one unit. In the regression line, "b" is called regression coefficient. The coefficient of determination (R 2 ) indicates the importance of independent variables in the outcome.
Asumadu-Sarkodie, Samuel; Owusu, Phebe Asantewaa
2017-03-01
In this study, the impact of energy, agriculture, macroeconomic and human-induced indicators on environmental pollution from 1971 to 2011 is investigated using the statistically inspired modification of partial least squares (SIMPLS) regression model. There was evidence of a linear relationship between energy, agriculture, macroeconomic and human-induced indicators and carbon dioxide emissions. Evidence from the SIMPLS regression shows that a 1% increase in crop production index will reduce carbon dioxide emissions by 0.71%. Economic growth increased by 1% will reduce carbon dioxide emissions by 0.46%, which means that an increase in Ghana's economic growth may lead to a reduction in environmental pollution. The increase in electricity production from hydroelectric sources by 1% will reduce carbon dioxide emissions by 0.30%; thus, increasing renewable energy sources in Ghana's energy portfolio will help mitigate carbon dioxide emissions. Increasing enteric emissions by 1% will increase carbon dioxide emissions by 4.22%, and a 1% increase in the nitrogen content of manure management will increase carbon dioxide emissions by 6.69%. The SIMPLS regression forecasting exhibited a 5% MAPE from the prediction of carbon dioxide emissions.
Talpur, M Younis; Kara, Huseyin; Sherazi, S T H; Ayyildiz, H Filiz; Topkafa, Mustafa; Arslan, Fatma Nur; Naz, Saba; Durmaz, Fatih; Sirajuddin
2014-11-01
Single bounce attenuated total reflectance (SB-ATR) Fourier transform infrared (FTIR) spectroscopy in conjunction with chemometrics was used for accurate determination of free fatty acid (FFA), peroxide value (PV), iodine value (IV), conjugated diene (CD) and conjugated triene (CT) of cottonseed oil (CSO) during potato chips frying. Partial least square (PLS), stepwise multiple linear regression (SMLR), principal component regression (PCR) and simple Beer׳s law (SBL) were applied to develop the calibrations for simultaneous evaluation of five stated parameters of cottonseed oil (CSO) during frying of French frozen potato chips at 170°C. Good regression coefficients (R(2)) were achieved for FFA, PV, IV, CD and CT with value of >0.992 by PLS, SMLR, PCR, and SBL. Root mean square error of prediction (RMSEP) was found to be less than 1.95% for all determinations. Result of the study indicated that SB-ATR FTIR in combination with multivariate chemometrics could be used for accurate and simultaneous determination of different parameters during the frying process without using any toxic organic solvent. Copyright © 2014 Elsevier B.V. All rights reserved.
Liu, Fei; Ye, Lanhan; Peng, Jiyu; Song, Kunlin; Shen, Tingting; Zhang, Chu; He, Yong
2018-02-27
Fast detection of heavy metals is very important for ensuring the quality and safety of crops. Laser-induced breakdown spectroscopy (LIBS), coupled with uni- and multivariate analysis, was applied for quantitative analysis of copper in three kinds of rice (Jiangsu rice, regular rice, and Simiao rice). For univariate analysis, three pre-processing methods were applied to reduce fluctuations, including background normalization, the internal standard method, and the standard normal variate (SNV). Linear regression models showed a strong correlation between spectral intensity and Cu content, with an R 2 more than 0.97. The limit of detection (LOD) was around 5 ppm, lower than the tolerance limit of copper in foods. For multivariate analysis, partial least squares regression (PLSR) showed its advantage in extracting effective information for prediction, and its sensitivity reached 1.95 ppm, while support vector machine regression (SVMR) performed better in both calibration and prediction sets, where R c 2 and R p 2 reached 0.9979 and 0.9879, respectively. This study showed that LIBS could be considered as a constructive tool for the quantification of copper contamination in rice.
Ye, Lanhan; Song, Kunlin; Shen, Tingting
2018-01-01
Fast detection of heavy metals is very important for ensuring the quality and safety of crops. Laser-induced breakdown spectroscopy (LIBS), coupled with uni- and multivariate analysis, was applied for quantitative analysis of copper in three kinds of rice (Jiangsu rice, regular rice, and Simiao rice). For univariate analysis, three pre-processing methods were applied to reduce fluctuations, including background normalization, the internal standard method, and the standard normal variate (SNV). Linear regression models showed a strong correlation between spectral intensity and Cu content, with an R2 more than 0.97. The limit of detection (LOD) was around 5 ppm, lower than the tolerance limit of copper in foods. For multivariate analysis, partial least squares regression (PLSR) showed its advantage in extracting effective information for prediction, and its sensitivity reached 1.95 ppm, while support vector machine regression (SVMR) performed better in both calibration and prediction sets, where Rc2 and Rp2 reached 0.9979 and 0.9879, respectively. This study showed that LIBS could be considered as a constructive tool for the quantification of copper contamination in rice. PMID:29495445
NASA Astrophysics Data System (ADS)
Ying, Yibin; Liu, Yande; Fu, Xiaping; Lu, Huishan
2005-11-01
The artificial neural networks (ANNs) have been used successfully in applications such as pattern recognition, image processing, automation and control. However, majority of today's applications of ANNs is back-propagate feed-forward ANN (BP-ANN). In this paper, back-propagation artificial neural networks (BP-ANN) were applied for modeling soluble solid content (SSC) of intact pear from their Fourier transform near infrared (FT-NIR) spectra. One hundred and sixty-four pear samples were used to build the calibration models and evaluate the models predictive ability. The results are compared to the classical calibration approaches, i.e. principal component regression (PCR), partial least squares (PLS) and non-linear PLS (NPLS). The effects of the optimal methods of training parameters on the prediction model were also investigated. BP-ANN combine with principle component regression (PCR) resulted always better than the classical PCR, PLS and Weight-PLS methods, from the point of view of the predictive ability. Based on the results, it can be concluded that FT-NIR spectroscopy and BP-ANN models can be properly employed for rapid and nondestructive determination of fruit internal quality.
Analytics of Radioactive Materials Released in the Fukushima Daiichi Nuclear Accident
DOE Office of Scientific and Technical Information (OSTI.GOV)
Egarievwe, Stephen U.; Nuclear Engineering Department, University of Tennessee, Knoxville, TN; Coble, Jamie B.
The 2011 Fukushima Daiichi nuclear accident in Japan resulted in the release of radioactive materials into the atmosphere, the nearby sea, and the surrounding land. Following the accident, several meteorological models were used to predict the transport of the radioactive materials to other continents such as North America and Europe. Also of high importance is the dispersion of radioactive materials locally and within Japan. Based on the International Atomic Energy Agency (IAEA) Convention on Early Notification of a nuclear accident, several radiological data sets were collected on the accident by the Japanese authorities. Among the radioactive materials monitored, are I-131more » and Cs-137 which form the major contributions to the contamination of drinking water. The radiation dose in the atmosphere was also measured. It is impractical to measure contamination and radiation dose in every place of interest. Therefore, modeling helps to predict contamination and radiation dose. Some modeling studies that have been reported in the literature include the simulation of transport and deposition of I-131 and Cs-137 from the accident, Cs-137 deposition and contamination of Japanese soils, and preliminary estimates of I-131 and Cs-137 discharged from the plant into the atmosphere. In this paper, we present statistical analytics of I-131 and Cs-137 with the goal of predicting gamma dose from the Fukushima Daiichi nuclear accident. The data sets used in our study were collected from the IAEA Fukushima Monitoring Database. As part of this study, we investigated several regression models to find the best algorithm for modeling the gamma dose. The modeling techniques used in our study include linear regression, principal component regression (PCR), partial least square (PLS) regression, and ridge regression. Our preliminary results on the first set of data showed that the linear regression model with one variable was the best with a root mean square error of 0.0133 μSv/h, compared to 0.0210 μSv/h for PCR, 0.231 μSv/h for ridge regression L-curve, 0.0856 μSv/h for PLS, and 0.0860 μSv/h for ridge regression cross validation. Complete results using the full datasets for these models will also be presented. (authors)« less
Gimelfarb, A.; Willis, J. H.
1994-01-01
An experiment was conducted to investigate the offspring-parent regression for three quantitative traits (weight, abdominal bristles and wing length) in Drosophila melanogaster. Linear and polynomial models were fitted for the regressions of a character in offspring on both parents. It is demonstrated that responses by the characters to selection predicted by the nonlinear regressions may differ substantially from those predicted by the linear regressions. This is true even, and especially, if selection is weak. The realized heritability for a character under selection is shown to be determined not only by the offspring-parent regression but also by the distribution of the character and by the form and strength of selection. PMID:7828818
Linear and nonlinear regression techniques for simultaneous and proportional myoelectric control.
Hahne, J M; Biessmann, F; Jiang, N; Rehbaum, H; Farina, D; Meinecke, F C; Muller, K-R; Parra, L C
2014-03-01
In recent years the number of active controllable joints in electrically powered hand-prostheses has increased significantly. However, the control strategies for these devices in current clinical use are inadequate as they require separate and sequential control of each degree-of-freedom (DoF). In this study we systematically compare linear and nonlinear regression techniques for an independent, simultaneous and proportional myoelectric control of wrist movements with two DoF. These techniques include linear regression, mixture of linear experts (ME), multilayer-perceptron, and kernel ridge regression (KRR). They are investigated offline with electro-myographic signals acquired from ten able-bodied subjects and one person with congenital upper limb deficiency. The control accuracy is reported as a function of the number of electrodes and the amount and diversity of training data providing guidance for the requirements in clinical practice. The results showed that KRR, a nonparametric statistical learning method, outperformed the other methods. However, simple transformations in the feature space could linearize the problem, so that linear models could achieve similar performance as KRR at much lower computational costs. Especially ME, a physiologically inspired extension of linear regression represents a promising candidate for the next generation of prosthetic devices.
Cui, Yang; Wang, Silong; Yan, Shaokui
2016-01-01
Phi coefficient directly depends on the frequencies of occurrence of organisms and has been widely used in vegetation ecology to analyse the associations of organisms with site groups, providing a characterization of ecological preference, but its application in soil ecology remains rare. Based on a single field experiment, this study assessed the applicability of phi coefficient in indicating the habitat preferences of soil fauna, through comparing phi coefficient-induced results with those of ordination methods in charactering soil fauna-habitat(factors) relationships. Eight different habitats of soil fauna were implemented by reciprocal transfer of defaunated soil cores between two types of subtropical forests. Canonical correlation analysis (CCorA) showed that ecological patterns of fauna-habitat relationships and inter-fauna taxa relationships expressed, respectively, by phi coefficients and predicted abundances calculated from partial redundancy analysis (RDA), were extremely similar, and a highly significant relationship between the two datasets was observed (Pillai's trace statistic = 1.998, P = 0.007). In addition, highly positive correlations between phi coefficients and predicted abundances for Acari, Collembola, Nematode and Hemiptera were observed using linear regression analysis. Quantitative relationships between habitat preferences and soil chemical variables were also obtained by linear regression, which were analogous to the results displayed in a partial RDA biplot. Our results suggest that phi coefficient could be applicable on a local scale in evaluating habitat preferences of soil fauna at coarse taxonomic levels, and that the phi coefficient-induced information, such as ecological preferences and the associated quantitative relationships with habitat factors, will be largely complementary to the results of ordination methods. The application of phi coefficient in soil ecology may extend our knowledge about habitat preferences and distribution-abundance relationships, which will benefit the understanding of biodistributions and variations in community compositions in the soil. Similar studies in other places and scales apart from our local site will be need for further evaluation of phi coefficient.
Cui, Yang; Wang, Silong; Yan, Shaokui
2016-01-01
Phi coefficient directly depends on the frequencies of occurrence of organisms and has been widely used in vegetation ecology to analyse the associations of organisms with site groups, providing a characterization of ecological preference, but its application in soil ecology remains rare. Based on a single field experiment, this study assessed the applicability of phi coefficient in indicating the habitat preferences of soil fauna, through comparing phi coefficient-induced results with those of ordination methods in charactering soil fauna-habitat(factors) relationships. Eight different habitats of soil fauna were implemented by reciprocal transfer of defaunated soil cores between two types of subtropical forests. Canonical correlation analysis (CCorA) showed that ecological patterns of fauna-habitat relationships and inter-fauna taxa relationships expressed, respectively, by phi coefficients and predicted abundances calculated from partial redundancy analysis (RDA), were extremely similar, and a highly significant relationship between the two datasets was observed (Pillai's trace statistic = 1.998, P = 0.007). In addition, highly positive correlations between phi coefficients and predicted abundances for Acari, Collembola, Nematode and Hemiptera were observed using linear regression analysis. Quantitative relationships between habitat preferences and soil chemical variables were also obtained by linear regression, which were analogous to the results displayed in a partial RDA biplot. Our results suggest that phi coefficient could be applicable on a local scale in evaluating habitat preferences of soil fauna at coarse taxonomic levels, and that the phi coefficient-induced information, such as ecological preferences and the associated quantitative relationships with habitat factors, will be largely complementary to the results of ordination methods. The application of phi coefficient in soil ecology may extend our knowledge about habitat preferences and distribution-abundance relationships, which will benefit the understanding of biodistributions and variations in community compositions in the soil. Similar studies in other places and scales apart from our local site will be need for further evaluation of phi coefficient. PMID:26930593
Martins, Douglas K; Stauffer, Ryan M; Thompson, Anne M; Halliday, Hannah S; Kollonige, Debra; Joseph, Everette; Weinheimer, Andrew J
The current network of ground-based monitors for ozone (O 3 ) is limited due to the spatial heterogeneity of O 3 at the surface. Satellite measurements can provide a solution to this limitation, but the lack of sensitivity of satellites to O 3 within the boundary layer causes large uncertainties in satellite retrievals at the near-surface. The vertical variability of O 3 was investigated using ozonesondes collected as part of NASA's D eriving I nformation on S urface Conditions from CO lumn and VER tically Resolved Observations Relevant to A ir Q uality (DISCOVER-AQ) campaign during July 2011 in the Baltimore, MD/Washington D.C. metropolitan area. A subset of the ozonesonde measurements was corrected for a known bias from the electrochemical solution strength using new procedures based on laboratory and field tests. A significant correlation of O 3 over the two sites with ozonesonde measurements (Edgewood and Beltsville, MD) was observed between the mid-troposphere (7-10 km) and the near-surface (1-3 km). A linear regression model based on the partial column amounts of O 3 within these subregions was developed to calculate the near-surface O 3 using mid-tropospheric satellite measurements from the Tropospheric Emission Spectrometer (TES) onboard the Aura spacecraft. The uncertainties of the calculated near-surface O 3 using TES mid-tropospheric satellite retrievals and a linear regression model were less than 20 %, which is less than that of the observed variability of O 3 at the surface in this region. These results utilize a region of the troposphere to which existing satellites are more sensitive compared to the boundary layer and can provide information of O 3 at the near-surface using existing satellite infrastructure and algorithms.
Andri, Bertyl; Dispas, Amandine; Marini, Roland Djang'Eing'a; Hubert, Philippe; Sassiat, Patrick; Al Bakain, Ramia; Thiébaut, Didier; Vial, Jérôme
2017-03-31
This work presents a first attempt to establish a model of the retention behaviour for pharmaceutical compounds in gradient mode SFC. For this purpose, multivariate statistics were applied on the basis of data gathered with the Design of Experiment (DoE) methodology. It permitted to build optimally the experiments needed, and served as a basis for providing relevant physicochemical interpretation of the effects observed. Data gathered over a broad experimental domain enabled the establishment of well-fit linear models of the retention of the individual compounds in presence of methanol as co-solvent. These models also allowed the appreciation of the impact of each experimental parameter and their factorial combinations. This approach was carried out with two organic modifiers (i.e. methanol and ethanol) and provided comparable results. Therefore, it demonstrates the feasibility to model retention in gradient mode SFC for individual compounds as a function of the experimental conditions. This approach also permitted to highlight the predominant effect of some parameters (e.g. gradient slope and pressure) on the retention of compounds. Because building of individual models of retention was possible, the next step considered the establishment of a global model of the retention to predict the behaviour of given compounds on the basis of, on the one side, the physicochemical descriptors of the compounds (e.g. Linear Solvation Energy Relationship (LSER) descriptors) and, on the other side, of the experimental conditions. This global model was established by means of partial least squares regression for the selected compounds, in an experimental domain defined by the Design of Experiment (DoE) methodology. Assessment of the model's predictive capabilities revealed satisfactory agreement between predicted and actual retention (i.e. R 2 =0.942, slope=1.004) of the assessed compounds, which is unprecedented in the field. Copyright © 2017 Elsevier B.V. All rights reserved.
García-Hermoso, Antonio; Carrillo, Hugo Alejandro; González-Ruíz, Katherine; Vivas, Andrés; Triana-Reina, Héctor Reynaldo; Martínez-Torres, Javier; Prieto-Benavidez, Daniel Humberto; Correa-Bautista, Jorge Enrique; Ramos-Sepúlveda, Jeison Alexander; Villa-González, Emilio; Peterson, Mark D; Ramírez-Vélez, Robinson
2017-01-01
The purpose of this study was two-fold: to analyze the association between muscular fitness (MF) and clustering of metabolic syndrome (MetS) components, and to determine if fatness parameters mediate the association between MF and MetS clustering in Colombian collegiate students. This cross-sectional study included a total of 886 (51.9% women) healthy collegiate students (21.4 ± 3.3 years old). Standing broad jump and isometric handgrip dynamometry were used as indicators of lower and upper body MF, respectively. Also, a MF score was computed by summing the standardized values of both tests, and used to classify adults as fit or unfit. We also assessed fat mass, body mass index, waist-to-height ratio, and abdominal visceral fat, and categorized individuals as low and high fat using international cut-offs. A MetS cluster score was derived by calculating the sum of the sample-specific z-scores from the triglycerides, HDL cholesterol, fasting glucose, waist circumference, and arterial blood pressure. Linear regression models were used to examine whether the association between MF and MetS cluster was mediated by the fatness parameters. Data were collected from 2013 to 2016 and the analysis was done in 2016. Findings revealed that the best profiles (fit + low fat) were associated with lower levels of the MetS clustering (p <0.001 in the four fatness parameters), compared with unfit and fat (unfit + high fat) counterparts. Linear regression models indicated a partial mediating effect for fatness parameters in the association of MF with MetS clustering. Our findings indicate that efforts to improve MF in young adults may decrease MetS risk partially through an indirect effect on improvements to adiposity levels. Thus, weight reduction should be taken into account as a complementary goal to improvements in MF within exercise programs.
Carrillo, Hugo Alejandro; González-Ruíz, Katherine; Vivas, Andrés; Triana-Reina, Héctor Reynaldo; Martínez-Torres, Javier; Prieto-Benavidez, Daniel Humberto; Ramos-Sepúlveda, Jeison Alexander; Villa-González, Emilio
2017-01-01
The purpose of this study was two-fold: to analyze the association between muscular fitness (MF) and clustering of metabolic syndrome (MetS) components, and to determine if fatness parameters mediate the association between MF and MetS clustering in Colombian collegiate students. This cross-sectional study included a total of 886 (51.9% women) healthy collegiate students (21.4 ± 3.3 years old). Standing broad jump and isometric handgrip dynamometry were used as indicators of lower and upper body MF, respectively. Also, a MF score was computed by summing the standardized values of both tests, and used to classify adults as fit or unfit. We also assessed fat mass, body mass index, waist-to-height ratio, and abdominal visceral fat, and categorized individuals as low and high fat using international cut-offs. A MetS cluster score was derived by calculating the sum of the sample-specific z-scores from the triglycerides, HDL cholesterol, fasting glucose, waist circumference, and arterial blood pressure. Linear regression models were used to examine whether the association between MF and MetS cluster was mediated by the fatness parameters. Data were collected from 2013 to 2016 and the analysis was done in 2016. Findings revealed that the best profiles (fit + low fat) were associated with lower levels of the MetS clustering (p <0.001 in the four fatness parameters), compared with unfit and fat (unfit + high fat) counterparts. Linear regression models indicated a partial mediating effect for fatness parameters in the association of MF with MetS clustering. Our findings indicate that efforts to improve MF in young adults may decrease MetS risk partially through an indirect effect on improvements to adiposity levels. Thus, weight reduction should be taken into account as a complementary goal to improvements in MF within exercise programs. PMID:28296952
Unitary Response Regression Models
ERIC Educational Resources Information Center
Lipovetsky, S.
2007-01-01
The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…
An Expert System for the Evaluation of Cost Models
1990-09-01
contrast to the condition of equal error variance, called homoscedasticity. (Reference: Applied Linear Regression Models by John Neter - page 423...normal. (Reference: Applied Linear Regression Models by John Neter - page 125) Click Here to continue -> Autocorrelation Click Here for the index - Index...over time. Error terms correlated over time are said to be autocorrelated or serially correlated. (REFERENCE: Applied Linear Regression Models by John
Teoh, Shao Thing; Kitamura, Miki; Nakayama, Yasumune; Putri, Sastia; Mukai, Yukio; Fukusaki, Eiichiro
2016-08-01
In recent years, the advent of high-throughput omics technology has made possible a new class of strain engineering approaches, based on identification of possible gene targets for phenotype improvement from omic-level comparison of different strains or growth conditions. Metabolomics, with its focus on the omic level closest to the phenotype, lends itself naturally to this semi-rational methodology. When a quantitative phenotype such as growth rate under stress is considered, regression modeling using multivariate techniques such as partial least squares (PLS) is often used to identify metabolites correlated with the target phenotype. However, linear modeling techniques such as PLS require a consistent metabolite-phenotype trend across the samples, which may not be the case when outliers or multiple conflicting trends are present in the data. To address this, we proposed a data-mining strategy that utilizes random sample consensus (RANSAC) to select subsets of samples with consistent trends for construction of better regression models. By applying a combination of RANSAC and PLS (RANSAC-PLS) to a dataset from a previous study (gas chromatography/mass spectrometry metabolomics data and 1-butanol tolerance of 19 yeast mutant strains), new metabolites were indicated to be correlated with tolerance within certain subsets of the samples. The relevance of these metabolites to 1-butanol tolerance were then validated from single-deletion strains of corresponding metabolic genes. The results showed that RANSAC-PLS is a promising strategy to identify unique metabolites that provide additional hints for phenotype improvement, which could not be detected by traditional PLS modeling using the entire dataset. Copyright © 2016 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.
Model averaging and muddled multimodel inferences.
Cade, Brian S
2015-09-01
Three flawed practices associated with model averaging coefficients for predictor variables in regression models commonly occur when making multimodel inferences in analyses of ecological data. Model-averaged regression coefficients based on Akaike information criterion (AIC) weights have been recommended for addressing model uncertainty but they are not valid, interpretable estimates of partial effects for individual predictors when there is multicollinearity among the predictor variables. Multicollinearity implies that the scaling of units in the denominators of the regression coefficients may change across models such that neither the parameters nor their estimates have common scales, therefore averaging them makes no sense. The associated sums of AIC model weights recommended to assess relative importance of individual predictors are really a measure of relative importance of models, with little information about contributions by individual predictors compared to other measures of relative importance based on effects size or variance reduction. Sometimes the model-averaged regression coefficients for predictor variables are incorrectly used to make model-averaged predictions of the response variable when the models are not linear in the parameters. I demonstrate the issues with the first two practices using the college grade point average example extensively analyzed by Burnham and Anderson. I show how partial standard deviations of the predictor variables can be used to detect changing scales of their estimates with multicollinearity. Standardizing estimates based on partial standard deviations for their variables can be used to make the scaling of the estimates commensurate across models, a necessary but not sufficient condition for model averaging of the estimates to be sensible. A unimodal distribution of estimates and valid interpretation of individual parameters are additional requisite conditions. The standardized estimates or equivalently the t statistics on unstandardized estimates also can be used to provide more informative measures of relative importance than sums of AIC weights. Finally, I illustrate how seriously compromised statistical interpretations and predictions can be for all three of these flawed practices by critiquing their use in a recent species distribution modeling technique developed for predicting Greater Sage-Grouse (Centrocercus urophasianus) distribution in Colorado, USA. These model averaging issues are common in other ecological literature and ought to be discontinued if we are to make effective scientific contributions to ecological knowledge and conservation of natural resources.
Model averaging and muddled multimodel inferences
Cade, Brian S.
2015-01-01
Three flawed practices associated with model averaging coefficients for predictor variables in regression models commonly occur when making multimodel inferences in analyses of ecological data. Model-averaged regression coefficients based on Akaike information criterion (AIC) weights have been recommended for addressing model uncertainty but they are not valid, interpretable estimates of partial effects for individual predictors when there is multicollinearity among the predictor variables. Multicollinearity implies that the scaling of units in the denominators of the regression coefficients may change across models such that neither the parameters nor their estimates have common scales, therefore averaging them makes no sense. The associated sums of AIC model weights recommended to assess relative importance of individual predictors are really a measure of relative importance of models, with little information about contributions by individual predictors compared to other measures of relative importance based on effects size or variance reduction. Sometimes the model-averaged regression coefficients for predictor variables are incorrectly used to make model-averaged predictions of the response variable when the models are not linear in the parameters. I demonstrate the issues with the first two practices using the college grade point average example extensively analyzed by Burnham and Anderson. I show how partial standard deviations of the predictor variables can be used to detect changing scales of their estimates with multicollinearity. Standardizing estimates based on partial standard deviations for their variables can be used to make the scaling of the estimates commensurate across models, a necessary but not sufficient condition for model averaging of the estimates to be sensible. A unimodal distribution of estimates and valid interpretation of individual parameters are additional requisite conditions. The standardized estimates or equivalently the tstatistics on unstandardized estimates also can be used to provide more informative measures of relative importance than sums of AIC weights. Finally, I illustrate how seriously compromised statistical interpretations and predictions can be for all three of these flawed practices by critiquing their use in a recent species distribution modeling technique developed for predicting Greater Sage-Grouse (Centrocercus urophasianus) distribution in Colorado, USA. These model averaging issues are common in other ecological literature and ought to be discontinued if we are to make effective scientific contributions to ecological knowledge and conservation of natural resources.
Liu, Fei; Feng, Lei; Lou, Bing-gan; Sun, Guang-ming; Wang, Lian-ping; He, Yong
2010-07-01
The combinational-stimulated bands were used to develop linear and nonlinear calibrations for the early detection of sclerotinia of oilseed rape (Brassica napus L.). Eighty healthy and 100 Sclerotinia leaf samples were scanned, and different preprocessing methods combined with successive projections algorithm (SPA) were applied to develop partial least squares (PLS) discriminant models, multiple linear regression (MLR) and least squares-support vector machine (LS-SVM) models. The results indicated that the optimal full-spectrum PLS model was achieved by direct orthogonal signal correction (DOSC), then De-trending and Raw spectra with correct recognition ratio of 100%, 95.7% and 95.7%, respectively. When using combinational-stimulated bands, the optimal linear models were SPA-MLR (DOSC) and SPA-PLS (DOSC) with correct recognition ratio of 100%. All SPA-LSSVM models using DOSC, De-trending and Raw spectra achieved perfect results with recognition of 100%. The overall results demonstrated that it was feasible to use combinational-stimulated bands for the early detection of Sclerotinia of oilseed rape, and DOSC-SPA was a powerful way for informative wavelength selection. This method supplied a new approach to the early detection and portable monitoring instrument of sclerotinia.
Hess, Glen W.
2002-01-01
Techniques for estimating monthly streamflow-duration characteristics at ungaged and partial-record sites in central Nevada have been updated. These techniques were developed using streamflow records at six continuous-record sites, basin physical and climatic characteristics, and concurrent streamflow measurements at four partial-record sites. Two methods, the basin-characteristic method and the concurrent-measurement method, were developed to provide estimating techniques for selected streamflow characteristics at ungaged and partial-record sites in central Nevada. In the first method, logarithmic-regression analyses were used to relate monthly mean streamflows (from all months and by month) from continuous-record gaging sites of various percent exceedence levels or monthly mean streamflows (by month) to selected basin physical and climatic variables at ungaged sites. Analyses indicate that the total drainage area and percent of drainage area at altitudes greater than 10,000 feet are the most significant variables. For the equations developed from all months of monthly mean streamflow, the coefficient of determination averaged 0.84 and the standard error of estimate of the relations for the ungaged sites averaged 72 percent. For the equations derived from monthly means by month, the coefficient of determination averaged 0.72 and the standard error of estimate of the relations averaged 78 percent. If standard errors are compared, the relations developed in this study appear generally to be less accurate than those developed in a previous study. However, the new relations are based on additional data and the slight increase in error may be due to the wider range of streamflow for a longer period of record, 1995-2000. In the second method, streamflow measurements at partial-record sites were correlated with concurrent streamflows at nearby gaged sites by the use of linear-regression techniques. Statistical measures of results using the second method typically indicated greater accuracy than for the first method. However, to make estimates for individual months, the concurrent-measurement method requires several years additional streamflow data at more partial-record sites. Thus, exceedence values for individual months are not yet available due to the low number of concurrent-streamflow-measurement data available. Reliability, limitations, and applications of both estimating methods are described herein.
Compound Identification Using Penalized Linear Regression on Metabolomics
Liu, Ruiqi; Wu, Dongfeng; Zhang, Xiang; Kim, Seongho
2014-01-01
Compound identification is often achieved by matching the experimental mass spectra to the mass spectra stored in a reference library based on mass spectral similarity. Because the number of compounds in the reference library is much larger than the range of mass-to-charge ratio (m/z) values so that the data become high dimensional data suffering from singularity. For this reason, penalized linear regressions such as ridge regression and the lasso are used instead of the ordinary least squares regression. Furthermore, two-step approaches using the dot product and Pearson’s correlation along with the penalized linear regression are proposed in this study. PMID:27212894
RRegrs: an R package for computer-aided model selection with multiple regression models.
Tsiliki, Georgia; Munteanu, Cristian R; Seoane, Jose A; Fernandez-Lozano, Carlos; Sarimveis, Haralambos; Willighagen, Egon L
2015-01-01
Predictive regression models can be created with many different modelling approaches. Choices need to be made for data set splitting, cross-validation methods, specific regression parameters and best model criteria, as they all affect the accuracy and efficiency of the produced predictive models, and therefore, raising model reproducibility and comparison issues. Cheminformatics and bioinformatics are extensively using predictive modelling and exhibit a need for standardization of these methodologies in order to assist model selection and speed up the process of predictive model development. A tool accessible to all users, irrespectively of their statistical knowledge, would be valuable if it tests several simple and complex regression models and validation schemes, produce unified reports, and offer the option to be integrated into more extensive studies. Additionally, such methodology should be implemented as a free programming package, in order to be continuously adapted and redistributed by others. We propose an integrated framework for creating multiple regression models, called RRegrs. The tool offers the option of ten simple and complex regression methods combined with repeated 10-fold and leave-one-out cross-validation. Methods include Multiple Linear regression, Generalized Linear Model with Stepwise Feature Selection, Partial Least Squares regression, Lasso regression, and Support Vector Machines Recursive Feature Elimination. The new framework is an automated fully validated procedure which produces standardized reports to quickly oversee the impact of choices in modelling algorithms and assess the model and cross-validation results. The methodology was implemented as an open source R package, available at https://www.github.com/enanomapper/RRegrs, by reusing and extending on the caret package. The universality of the new methodology is demonstrated using five standard data sets from different scientific fields. Its efficiency in cheminformatics and QSAR modelling is shown with three use cases: proteomics data for surface-modified gold nanoparticles, nano-metal oxides descriptor data, and molecular descriptors for acute aquatic toxicity data. The results show that for all data sets RRegrs reports models with equal or better performance for both training and test sets than those reported in the original publications. Its good performance as well as its adaptability in terms of parameter optimization could make RRegrs a popular framework to assist the initial exploration of predictive models, and with that, the design of more comprehensive in silico screening applications.Graphical abstractRRegrs is a computer-aided model selection framework for R multiple regression models; this is a fully validated procedure with application to QSAR modelling.
Control Variate Selection for Multiresponse Simulation.
1987-05-01
M. H. Knuter, Applied Linear Regression Mfodels, Richard D. Erwin, Inc., Homewood, Illinois, 1983. Neuts, Marcel F., Probability, Allyn and Bacon...1982. Neter, J., V. Wasserman, and M. H. Knuter, Applied Linear Regression .fodels, Richard D. Erwin, Inc., Homewood, Illinois, 1983. Neuts, Marcel F...Aspects of J%,ultivariate Statistical Theory, John Wiley and Sons, New York, New York, 1982. dY Neter, J., W. Wasserman, and M. H. Knuter, Applied Linear Regression Mfodels
ERIC Educational Resources Information Center
Kobrin, Jennifer L.; Sinharay, Sandip; Haberman, Shelby J.; Chajewski, Michael
2011-01-01
This study examined the adequacy of a multiple linear regression model for predicting first-year college grade point average (FYGPA) using SAT[R] scores and high school grade point average (HSGPA). A variety of techniques, both graphical and statistical, were used to examine if it is possible to improve on the linear regression model. The results…
High correlations between MRI brain volume measurements based on NeuroQuant® and FreeSurfer.
Ross, David E; Ochs, Alfred L; Tate, David F; Tokac, Umit; Seabaugh, John; Abildskov, Tracy J; Bigler, Erin D
2018-05-30
NeuroQuant ® (NQ) and FreeSurfer (FS) are commonly used computer-automated programs for measuring MRI brain volume. Previously they were reported to have high intermethod reliabilities but often large intermethod effect size differences. We hypothesized that linear transformations could be used to reduce the large effect sizes. This study was an extension of our previously reported study. We performed NQ and FS brain volume measurements on 60 subjects (including normal controls, patients with traumatic brain injury, and patients with Alzheimer's disease). We used two statistical approaches in parallel to develop methods for transforming FS volumes into NQ volumes: traditional linear regression, and Bayesian linear regression. For both methods, we used regression analyses to develop linear transformations of the FS volumes to make them more similar to the NQ volumes. The FS-to-NQ transformations based on traditional linear regression resulted in effect sizes which were small to moderate. The transformations based on Bayesian linear regression resulted in all effect sizes being trivially small. To our knowledge, this is the first report describing a method for transforming FS to NQ data so as to achieve high reliability and low effect size differences. Machine learning methods like Bayesian regression may be more useful than traditional methods. Copyright © 2018 Elsevier B.V. All rights reserved.
Quantile Regression in the Study of Developmental Sciences
Petscher, Yaacov; Logan, Jessica A. R.
2014-01-01
Linear regression analysis is one of the most common techniques applied in developmental research, but only allows for an estimate of the average relations between the predictor(s) and the outcome. This study describes quantile regression, which provides estimates of the relations between the predictor(s) and outcome, but across multiple points of the outcome’s distribution. Using data from the High School and Beyond and U.S. Sustained Effects Study databases, quantile regression is demonstrated and contrasted with linear regression when considering models with: (a) one continuous predictor, (b) one dichotomous predictor, (c) a continuous and a dichotomous predictor, and (d) a longitudinal application. Results from each example exhibited the differential inferences which may be drawn using linear or quantile regression. PMID:24329596
Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L
2018-01-01
Aims A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R2), using R2 as the primary metric of assay agreement. However, the use of R2 alone does not adequately quantify constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. Methods We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing assays (NGS). NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Results Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. Conclusions The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. PMID:28747393
A SEMIPARAMETRIC BAYESIAN MODEL FOR CIRCULAR-LINEAR REGRESSION
We present a Bayesian approach to regress a circular variable on a linear predictor. The regression coefficients are assumed to have a nonparametric distribution with a Dirichlet process prior. The semiparametric Bayesian approach gives added flexibility to the model and is usefu...
Posterior corneal topographic changes after partial flap during laser in situ keratomileusis
Sharma, N; Rani, A; Balasubramanya, R; Vajpayee, R B; Pandey, R M
2003-01-01
Aim: To study the posterior corneal topographic changes in eyes with partial flaps during laser assisted in situ keratomileusis (LASIK). Methods: Case records of 16 patients, who had partial flap in one eye during LASIK (group 1) and uncomplicated surgery in the other eye (group 2), were studied. Following occurrence of partial flap intraoperatively, laser ablation was abandoned in all the eyes. A 160/180 μm flap was attempted during the initial procedure using the Hansatome microkeratome (Bausch & Lomb Surgicals, Munich, Germany). LASIK surgery in all cases was performed using a 180 μm plate, at the mean interval of 4.16 (SD 1.5) months following the initial procedure. None of the eyes had intraoperative complication during LASIK. Relative posterior corneal surface elevation above the best fit sphere (BFS) before the initial procedure, before, and after LASIK were compared using the Orbscan slit scanning corneal topography/pachymetry system. Results: Posterior corneal elevation was comparable in the two groups, both preoperatively (group 1; 16.4 (4.8) μm, group 2; 16.1 (4.8) μm) and after final surgery (group 1; 57.2 (15.6) μm, group 2; 54.3 (13.1) μm). In group 1 after occurrence of partial flap, the posterior corneal elevation was 16.9 (4.4) μm, and this increase was not significant statistically (p=0.4). On multiple linear regression analysis, residual bed thickness (p<0.001) was independently the significant determinant of final posterior corneal elevation in both groups. Conclusion: The inadvertent occurrence of partial flap during LASIK procedure does not contribute to the increase in posterior corneal elevation. PMID:12543743
Kumar, K Vasanth; Sivanesan, S
2006-08-25
Pseudo second order kinetic expressions of Ho, Sobkowsk and Czerwinski, Blanachard et al. and Ritchie were fitted to the experimental kinetic data of malachite green onto activated carbon by non-linear and linear method. Non-linear method was found to be a better way of obtaining the parameters involved in the second order rate kinetic expressions. Both linear and non-linear regression showed that the Sobkowsk and Czerwinski and Ritchie's pseudo second order model were the same. Non-linear regression analysis showed that both Blanachard et al. and Ho have similar ideas on the pseudo second order model but with different assumptions. The best fit of experimental data in Ho's pseudo second order expression by linear and non-linear regression method showed that Ho pseudo second order model was a better kinetic expression when compared to other pseudo second order kinetic expressions. The amount of dye adsorbed at equilibrium, q(e), was predicted from Ho pseudo second order expression and were fitted to the Langmuir, Freundlich and Redlich Peterson expressions by both linear and non-linear method to obtain the pseudo isotherms. The best fitting pseudo isotherm was found to be the Langmuir and Redlich Peterson isotherm. Redlich Peterson is a special case of Langmuir when the constant g equals unity.
NASA Technical Reports Server (NTRS)
Hunt, L. R.; Villarreal, Ramiro
1987-01-01
System theorists understand that the same mathematical objects which determine controllability for nonlinear control systems of ordinary differential equations (ODEs) also determine hypoellipticity for linear partial differentail equations (PDEs). Moreover, almost any study of ODE systems begins with linear systems. It is remarkable that Hormander's paper on hypoellipticity of second order linear p.d.e.'s starts with equations due to Kolmogorov, which are shown to be analogous to the linear PDEs. Eigenvalue placement by state feedback for a controllable linear system can be paralleled for a Kolmogorov equation if an appropriate type of feedback is introduced. Results concerning transformations of nonlinear systems to linear systems are similar to results for transforming a linear PDE to a Kolmogorov equation.
Income analysis of goat farmers on the farmers group in district of Serdang Bedagai
NASA Astrophysics Data System (ADS)
Manurung, J. N.; Hasnudi; Supriana, T.
2018-02-01
The farmers group are expected to reduce the production cost of goat breeding and improve the income of farmers which impact on the welfare of goat farmers. This research aim to analyze the factors that influence the income of farmers group, in sub-district Dolok Masihul Pegajahan, and Dolok Merawan, Serdang Bedagai. The method used is survey method with 90 respondents. Data was analysed by multiple linear regression. The result showed, simultaneously goat cost, sale price of goat, fixed cost and variable cost had significant effect on income of goat farmers. Partially, goat cost, variable cost and sale price of goat had significant effect on income of goat farmers, while fixed cost had no significant effect.
Visual feature extraction from voxel-weighted averaging of stimulus images in 2 fMRI studies.
Hart, Corey B; Rose, William J
2013-11-01
Multiple studies have provided evidence for distributed object representation in the brain, with several recent experiments leveraging basis function estimates for partial image reconstruction from fMRI data. Using a novel combination of statistical decomposition, generalized linear models, and stimulus averaging on previously examined image sets and Bayesian regression of recorded fMRI activity during presentation of these data sets, we identify a subset of relevant voxels that appear to code for covarying object features. Using a technique we term "voxel-weighted averaging," we isolate image filters that these voxels appear to implement. The results, though very cursory, appear to have significant implications for hierarchical and deep-learning-type approaches toward the understanding of neural coding and representation.
NASA Astrophysics Data System (ADS)
Prapaipong, Panjai; Shock, Everett L.; Koretsky, Carla M.
1999-10-01
By combining results from regression and correlation methods, standard state thermodynamic properties for aqueous complexes between metal cations and divalent organic acid ligands (oxalate, malonate, succinate, glutarate, and adipate) are evaluated and applied to geochemical processes. Regression of experimental standard-state equilibrium constants with the revised Helgeson-Kirkham-Flowers (HKF) equation of state yields standard partial molal entropies (S¯°) of aqueous metal-organic complexes, which allow determination of thermodynamic properties of the complexes at elevated temperatures. In cases where S¯° is not available from either regression or calorimetric measurement, the values of S¯° can be estimated from a linear correlation between standard partial molal entropies of association (ΔS¯°r) and standard partial molal entropies of aqueous cations (S¯°M). The correlation is independent of cation charge, which makes it possible to predict S¯° for complexes between divalent organic acids and numerous metal cations. Similarly, correlations between standard Gibbs free energies of association of metal-organic complexes (ΔḠ°r) and Gibbs free energies of formation (ΔḠ°f) for divalent metal cations allow estimates of standard-state equilibrium constants where experimental data are not available. These correlations are found to be a function of ligand structure and cation charge. Predicted equilibrium constants for dicarboxylate complexes of numerous cations were included with those for inorganic and other organic complexes to study the effects of dicarboxylate complexes on the speciation of metals and organic acids in oil-field brines. Relatively low concentrations of oxalic and malonic acids affect the speciation of cations more than similar concentrations of succinic, glutaric, and adipic acids. However, the extent to which metal-dicarboxylate complexes contribute to the speciation of dissolved metals depends on the type of dicarboxylic acid ligand; relative concentration of inorganic, mono-, and dicarboxylate ligands; and the type of metal cation. As an example, in the same solution, dicarboxylic acids have a greater influence on the speciation of Fe+2 and Mg+2 than on the speciation of Zn+2 and Mn+2.
2015-07-15
Long-term effects on cancer survivors’ quality of life of physical training versus physical training combined with cognitive-behavioral therapy ...COMPARISON OF NEURAL NETWORK AND LINEAR REGRESSION MODELS IN STATISTICALLY PREDICTING MENTAL AND PHYSICAL HEALTH STATUS OF BREAST...34Comparison of Neural Network and Linear Regression Models in Statistically Predicting Mental and Physical Health Status of Breast Cancer Survivors
Prediction of the Main Engine Power of a New Container Ship at the Preliminary Design Stage
NASA Astrophysics Data System (ADS)
Cepowski, Tomasz
2017-06-01
The paper presents mathematical relationships that allow us to forecast the estimated main engine power of new container ships, based on data concerning vessels built in 2005-2015. The presented approximations allow us to estimate the engine power based on the length between perpendiculars and the number of containers the ship will carry. The approximations were developed using simple linear regression and multivariate linear regression analysis. The presented relations have practical application for estimation of container ship engine power needed in preliminary parametric design of the ship. It follows from the above that the use of multiple linear regression to predict the main engine power of a container ship brings more accurate solutions than simple linear regression.
Liley, Helen; Zhang, Ju; Firth, Elwyn; Fernandez, Justin; Besier, Thor
2017-11-01
Population variance in bone shape is an important consideration when applying the results of subject-specific computational models to a population. In this letter, we demonstrate the ability of partial least squares regression to provide an improved shape prediction of the equine third metacarpal epiphysis, using two easily obtained measurements.
Scarborough, Donna Moxley; Linderman, Shannon; Berkson, Eric M.; Oh, Luke S.
2017-01-01
Objectives: Unilateral partial squat tasks are often used to assess athletes’ lower extremity (LE) neuromuscular control. Single squat biomechanics such as lateral drop of the non-stance limb’s pelvis have been linked to knee injury risk. Yet, there are limited studies on the factors contributing to pelvic instability during the unilateral partial squat such as anatomical alignment of the knee and hip strength. The purpose of this study was 1) to assess the influence of leg dominance on pelvic drop among female athletes during the repeated unilateral partial squat activity and 2) to investigate the contributions that lower limb kinematics and hip strength have on pelvis drop. Methods: 42 female athletes (27= softball pitchers, 15=gymnasts, avg age=16.48 ± 2.54 years) underwent lower limb assessment. The quadriceps angle (Q angle) and the average of 3 trials for hip abduction and extension strength (handheld dynamometer measurements) were used for analyses. 3D biomechanical analysis of the repeated unilateral partial squat activity followed using a 20 motion capture camera system which created a 15 segment model of each subject. The subject stood on one leg at the lateral edge of a 17.78 cm box with hands placed on the hips and squatted so that the free hanging contralateral limb came as close to the ground without contact for 5 continuous repetitions. One trial for each limb was performed. Peak pelvic drop and ankle, knee and hip angles and torques (normalized by weight) at this time point were calculated using Visual 3D (C-Motion) biomechanical software. Paired T-test, Spearman correlations and multiple regression model statistical analyses were performed. Results: Peak pelvic drop during the unilateral partial squat did not differ significantly on the basis of limb dominance (p=0.831, Dom: -3.40 ± 5.10° , ND: -3.46 ± 4.44°). Peak pelvic drop displayed a Spearman correlation with the functional measure of hip abduction/adduction (ABD/ADD) angle (rs= 0.627, p< 0.001) (Figure 1). No association was noted between peak pelvic drop and anatomical measures of Q angle or isometric hip extension strength. A multiple regression was performed to predict pelvis drop angle from the following 6 variables: isometric hip ABD strength, hip ABD/ADD angle, hip internal/external rotation angle, ankle supination/pronation (S/P) angle, height and weight. These variables statistically predicted pelvis drop, F(6,73) = 17.848, p < .0005, R2 = 0.595. The strongest combined predictor variables for pelvic drop in the female athletes were hip abduction/ adduction angle and strength followed by subject’s weight and ankle S/P angle (Table 1). Conclusion: Peak pelvic drop during the repeated unilateral partial squat activity did not correlate significantly with Q angle and hip extension strength. Instead, peak pelvic drop appears more related to a combination of biomechanical limb positioning, hip ABD strength and subject demographics. The regression model run on the repeated unilateral partial squat demonstrates predictive power of this dynamic assessment tool based on kinematic measures across multiple joints. Results could guide clinician screening for excessive pelvic drop in female athletes and based on the predictive model make recommendations for corrective conditioning to help prevent knee injury and guide return to sport following LE surgery. Table 1: Multivariate linear regression model for pelvic drop, Isometric hip strength and lower extremity kinematics during repeated partial squat activity among female athletes. Variable P Isometric hip abduction strength 0.034* Hip Abduction/Adduction Angle <0.001* Hip Internal/External Rotation Angle 0.936 Ankle Internal/External Rotation Angle 0.072 Height 0.398 Weight 0.011* * Level of significance established at p<0.05
NASA Astrophysics Data System (ADS)
Muller, Sybrand Jacobus; van Niekerk, Adriaan
2016-07-01
Soil salinity often leads to reduced crop yield and quality and can render soils barren. Irrigated areas are particularly at risk due to intensive cultivation and secondary salinization caused by waterlogging. Regular monitoring of salt accumulation in irrigation schemes is needed to keep its negative effects under control. The dynamic spatial and temporal characteristics of remote sensing can provide a cost-effective solution for monitoring salt accumulation at irrigation scheme level. This study evaluated a range of pan-fused SPOT-5 derived features (spectral bands, vegetation indices, image textures and image transformations) for classifying salt-affected areas in two distinctly different irrigation schemes in South Africa, namely Vaalharts and Breede River. The relationship between the input features and electro conductivity measurements were investigated using regression modelling (stepwise linear regression, partial least squares regression, curve fit regression modelling) and supervised classification (maximum likelihood, nearest neighbour, decision tree analysis, support vector machine and random forests). Classification and regression trees and random forest were used to select the most important features for differentiating salt-affected and unaffected areas. The results showed that the regression analyses produced weak models (<0.4 R squared). Better results were achieved using the supervised classifiers, but the algorithms tend to over-estimate salt-affected areas. A key finding was that none of the feature sets or classification algorithms stood out as being superior for monitoring salt accumulation at irrigation scheme level. This was attributed to the large variations in the spectral responses of different crops types at different growing stages, coupled with their individual tolerances to saline conditions.
ERIC Educational Resources Information Center
Li, Deping; Oranje, Andreas
2007-01-01
Two versions of a general method for approximating standard error of regression effect estimates within an IRT-based latent regression model are compared. The general method is based on Binder's (1983) approach, accounting for complex samples and finite populations by Taylor series linearization. In contrast, the current National Assessment of…
Ernst, Anja F; Albers, Casper J
2017-01-01
Misconceptions about the assumptions behind the standard linear regression model are widespread and dangerous. These lead to using linear regression when inappropriate, and to employing alternative procedures with less statistical power when unnecessary. Our systematic literature review investigated employment and reporting of assumption checks in twelve clinical psychology journals. Findings indicate that normality of the variables themselves, rather than of the errors, was wrongfully held for a necessary assumption in 4% of papers that use regression. Furthermore, 92% of all papers using linear regression were unclear about their assumption checks, violating APA-recommendations. This paper appeals for a heightened awareness for and increased transparency in the reporting of statistical assumption checking.
Ernst, Anja F.
2017-01-01
Misconceptions about the assumptions behind the standard linear regression model are widespread and dangerous. These lead to using linear regression when inappropriate, and to employing alternative procedures with less statistical power when unnecessary. Our systematic literature review investigated employment and reporting of assumption checks in twelve clinical psychology journals. Findings indicate that normality of the variables themselves, rather than of the errors, was wrongfully held for a necessary assumption in 4% of papers that use regression. Furthermore, 92% of all papers using linear regression were unclear about their assumption checks, violating APA-recommendations. This paper appeals for a heightened awareness for and increased transparency in the reporting of statistical assumption checking. PMID:28533971
Optimal moving grids for time-dependent partial differential equations
NASA Technical Reports Server (NTRS)
Wathen, A. J.
1989-01-01
Various adaptive moving grid techniques for the numerical solution of time-dependent partial differential equations were proposed. The precise criterion for grid motion varies, but most techniques will attempt to give grids on which the solution of the partial differential equation can be well represented. Moving grids are investigated on which the solutions of the linear heat conduction and viscous Burgers' equation in one space dimension are optimally approximated. Precisely, the results of numerical calculations of optimal moving grids for piecewise linear finite element approximation of partial differential equation solutions in the least squares norm.
Estimating linear temporal trends from aggregated environmental monitoring data
Erickson, Richard A.; Gray, Brian R.; Eager, Eric A.
2017-01-01
Trend estimates are often used as part of environmental monitoring programs. These trends inform managers (e.g., are desired species increasing or undesired species decreasing?). Data collected from environmental monitoring programs is often aggregated (i.e., averaged), which confounds sampling and process variation. State-space models allow sampling variation and process variations to be separated. We used simulated time-series to compare linear trend estimations from three state-space models, a simple linear regression model, and an auto-regressive model. We also compared the performance of these five models to estimate trends from a long term monitoring program. We specifically estimated trends for two species of fish and four species of aquatic vegetation from the Upper Mississippi River system. We found that the simple linear regression had the best performance of all the given models because it was best able to recover parameters and had consistent numerical convergence. Conversely, the simple linear regression did the worst job estimating populations in a given year. The state-space models did not estimate trends well, but estimated population sizes best when the models converged. We found that a simple linear regression performed better than more complex autoregression and state-space models when used to analyze aggregated environmental monitoring data.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Siewicki, T.C.; Chandler, G.T.
1995-12-31
Eastern oysters (Crassostrea virginica) were continuously exposed to suspended {sup 14}C-fluoranthene spiked-sediment for either: (1) five days followed by 24 days deputation, or (2) 28 days exposure. Sediment less than 63 um contained fluoranthene concentrations one or ten times that measured at suburbanized sites in southeastern estuaries (133 or 1,300 ng/g). The data were evaluated both raw and normalized for tissue lipid and sediment organic carbon concentrations. Uptake rate constants were estimated using non-linear regression methods. Depuration rate constants were estimated by linear regression of the deputation phase following five-days exposure and as the second partial derivative of the non-linearmore » regression for the 28-day exposures. Uptake and deputation rate constants, bioconcentration factors and half-lives were similar regardless of exposure time, sediment fluoranthene concentration or use of data normalization. Uptake and deputation rate constants, bioconcentration factors and half-lives (days) were similar and low for all experiments, ranging from 0.02 to 0.10, 0.14 to 0.30, 0.09 to 0.46, and 2.4 to 5.0, respectively. Degradation by the mixed function oxidase system is not expected in oysters allowing the use of radiotracers for measuring very low concentrations of fluoranthene. The results suggest that short-term exposures followed by deputation are effective for estimating kinetic rate constants and that normalization provides little benefit in these controlled studies. The results further show that bioconcentration of sediment-associated fluoranthene, and possibly other polycyclic aromatic hydrocarbons, is very low compared to either dissolved forms or levels commonly used in regulatory actions.« less
Comparison of Conventional and ANN Models for River Flow Forecasting
NASA Astrophysics Data System (ADS)
Jain, A.; Ganti, R.
2011-12-01
Hydrological models are useful in many water resources applications such as flood control, irrigation and drainage, hydro power generation, water supply, erosion and sediment control, etc. Estimates of runoff are needed in many water resources planning, design development, operation and maintenance activities. River flow is generally estimated using time series or rainfall-runoff models. Recently, soft artificial intelligence tools such as Artificial Neural Networks (ANNs) have become popular for research purposes but have not been extensively adopted in operational hydrological forecasts. There is a strong need to develop ANN models based on real catchment data and compare them with the conventional models. In this paper, a comparative study has been carried out for river flow forecasting using the conventional and ANN models. Among the conventional models, multiple linear, and non linear regression, and time series models of auto regressive (AR) type have been developed. Feed forward neural network model structure trained using the back propagation algorithm, a gradient search method, was adopted. The daily river flow data derived from Godavari Basin @ Polavaram, Andhra Pradesh, India have been employed to develop all the models included here. Two inputs, flows at two past time steps, (Q(t-1) and Q(t-2)) were selected using partial auto correlation analysis for forecasting flow at time t, Q(t). A wide range of error statistics have been used to evaluate the performance of all the models developed in this study. It has been found that the regression and AR models performed comparably, and the ANN model performed the best amongst all the models investigated in this study. It is concluded that ANN model should be adopted in real catchments for hydrological modeling and forecasting.
Kikui, Miki; Kida, Momoyo; Kosaka, Takayuki; Yamamoto, Masaaki; Yoshimuta, Yoko; Yasui, Sakae; Nokubi, Takashi; Maeda, Yoshinobu; Kokubo, Yoshihiro; Watanabe, Makoto; Miyamoto, Yoshihiro
2015-01-01
Abstract There are numerous reports on the relationship between regular utilization of dental care services and oral health, but most are based on questionnaires and subjective evaluation. Few have objectively evaluated masticatory performance and its relationship to utilization of dental care services. The purpose of this study was to identify the effect of regular utilization of dental services on masticatory performance. The subjects consisted of 1804 general residents of Suita City, Osaka Prefecture (760 men and 1044 women, mean age 66.5 ± 7.9 years). Regular utilization of dental services and oral hygiene habits (frequency of toothbrushing and use of interdental aids) was surveyed, and periodontal status, occlusal support, and masticatory performance were measured. Masticatory performance was evaluated by a chewing test using gummy jelly. The correlation between age, sex, regular dental utilization, oral hygiene habits, periodontal status or occlusal support, and masticatory performance was analyzed using Spearman's correlation test and t‐test. In addition, multiple linear regression analysis was carried out to investigate the relationship of regular dental utilization with masticatory performance after controlling for other factors. Masticatory performance was significantly correlated to age when using Spearman's correlation test, and to regular dental utilization, periodontal status, or occlusal support with t‐test. Multiple linear regression analysis showed that regular utilization of dental services was significantly related to masticatory performance even after adjusting for age, sex, oral hygiene habits, periodontal status, and occlusal support (standardized partial regression coefficient β = 0.055). These findings suggested that the regular utilization of dental care services is an important factor influencing masticatory performance in a Japanese urban population. PMID:29744141
Kikui, Miki; Ono, Takahiro; Kida, Momoyo; Kosaka, Takayuki; Yamamoto, Masaaki; Yoshimuta, Yoko; Yasui, Sakae; Nokubi, Takashi; Maeda, Yoshinobu; Kokubo, Yoshihiro; Watanabe, Makoto; Miyamoto, Yoshihiro
2015-12-01
There are numerous reports on the relationship between regular utilization of dental care services and oral health, but most are based on questionnaires and subjective evaluation. Few have objectively evaluated masticatory performance and its relationship to utilization of dental care services. The purpose of this study was to identify the effect of regular utilization of dental services on masticatory performance. The subjects consisted of 1804 general residents of Suita City, Osaka Prefecture (760 men and 1044 women, mean age 66.5 ± 7.9 years). Regular utilization of dental services and oral hygiene habits (frequency of toothbrushing and use of interdental aids) was surveyed, and periodontal status, occlusal support, and masticatory performance were measured. Masticatory performance was evaluated by a chewing test using gummy jelly. The correlation between age, sex, regular dental utilization, oral hygiene habits, periodontal status or occlusal support, and masticatory performance was analyzed using Spearman's correlation test and t -test. In addition, multiple linear regression analysis was carried out to investigate the relationship of regular dental utilization with masticatory performance after controlling for other factors. Masticatory performance was significantly correlated to age when using Spearman's correlation test, and to regular dental utilization, periodontal status, or occlusal support with t -test. Multiple linear regression analysis showed that regular utilization of dental services was significantly related to masticatory performance even after adjusting for age, sex, oral hygiene habits, periodontal status, and occlusal support (standardized partial regression coefficient β = 0.055). These findings suggested that the regular utilization of dental care services is an important factor influencing masticatory performance in a Japanese urban population.
NASA Astrophysics Data System (ADS)
He, Anhua; Singh, Ramesh P.; Sun, Zhaohua; Ye, Qing; Zhao, Gang
2016-07-01
The earth tide, atmospheric pressure, precipitation and earthquake fluctuations, especially earthquake greatly impacts water well levels, thus anomalous co-seismic changes in ground water levels have been observed. In this paper, we have used four different models, simple linear regression (SLR), multiple linear regression (MLR), principal component analysis (PCA) and partial least squares (PLS) to compute the atmospheric pressure and earth tidal effects on water level. Furthermore, we have used the Akaike information criterion (AIC) to study the performance of various models. Based on the lowest AIC and sum of squares for error values, the best estimate of the effects of atmospheric pressure and earth tide on water level is found using the MLR model. However, MLR model does not provide multicollinearity between inputs, as a result the atmospheric pressure and earth tidal response coefficients fail to reflect the mechanisms associated with the groundwater level fluctuations. On the premise of solving serious multicollinearity of inputs, PLS model shows the minimum AIC value. The atmospheric pressure and earth tidal response coefficients show close response with the observation using PLS model. The atmospheric pressure and the earth tidal response coefficients are found to be sensitive to the stress-strain state using the observed data for the period 1 April-8 June 2008 of Chuan 03# well. The transient enhancement of porosity of rock mass around Chuan 03# well associated with the Wenchuan earthquake (Mw = 7.9 of 12 May 2008) that has taken its original pre-seismic level after 13 days indicates that the co-seismic sharp rise of water well could be induced by static stress change, rather than development of new fractures.
Quantification of trace metals in infant formula premixes using laser-induced breakdown spectroscopy
NASA Astrophysics Data System (ADS)
Cama-Moncunill, Raquel; Casado-Gavalda, Maria P.; Cama-Moncunill, Xavier; Markiewicz-Keszycka, Maria; Dixit, Yash; Cullen, Patrick J.; Sullivan, Carl
2017-09-01
Infant formula is a human milk substitute generally based upon fortified cow milk components. In order to mimic the composition of breast milk, trace elements such as copper, iron and zinc are usually added in a single operation using a premix. The correct addition of premixes must be verified to ensure that the target levels in infant formulae are achieved. In this study, a laser-induced breakdown spectroscopy (LIBS) system was assessed as a fast validation tool for trace element premixes. LIBS is a promising emission spectroscopic technique for elemental analysis, which offers real-time analyses, little to no sample preparation and ease of use. LIBS was employed for copper and iron determinations of premix samples ranging approximately from 0 to 120 mg/kg Cu/1640 mg/kg Fe. LIBS spectra are affected by several parameters, hindering subsequent quantitative analyses. This work aimed at testing three matrix-matched calibration approaches (simple-linear regression, multi-linear regression and partial least squares regression (PLS)) as means for precision and accuracy enhancement of LIBS quantitative analysis. All calibration models were first developed using a training set and then validated with an independent test set. PLS yielded the best results. For instance, the PLS model for copper provided a coefficient of determination (R2) of 0.995 and a root mean square error of prediction (RMSEP) of 14 mg/kg. Furthermore, LIBS was employed to penetrate through the samples by repetitively measuring the same spot. Consequently, LIBS spectra can be obtained as a function of sample layers. This information was used to explore whether measuring deeper into the sample could reduce possible surface-contaminant effects and provide better quantifications.
Comparing The Effectiveness of a90/95 Calculations (Preprint)
2006-09-01
Nachtsheim, John Neter, William Li, Applied Linear Statistical Models , 5th ed., McGraw-Hill/Irwin, 2005 5. Mood, Graybill and Boes, Introduction...curves is based on methods that are only valid for ordinary linear regression. Requirements for a valid Ordinary Least-Squares Regression Model There... linear . For example is a linear model ; is not. 2. Uniform variance (homoscedasticity
NASA Technical Reports Server (NTRS)
Title, A. M.
1978-01-01
Filter includes partial polarizer between birefrigent elements. Plastic film on partial polarizer compensates for any polarization rotation by partial polarizer. Two quarter-wave plates change incident, linearly polarized light into elliptically polarized light.
Correlation and simple linear regression.
Zou, Kelly H; Tuncali, Kemal; Silverman, Stuart G
2003-06-01
In this tutorial article, the concepts of correlation and regression are reviewed and demonstrated. The authors review and compare two correlation coefficients, the Pearson correlation coefficient and the Spearman rho, for measuring linear and nonlinear relationships between two continuous variables. In the case of measuring the linear relationship between a predictor and an outcome variable, simple linear regression analysis is conducted. These statistical concepts are illustrated by using a data set from published literature to assess a computed tomography-guided interventional technique. These statistical methods are important for exploring the relationships between variables and can be applied to many radiologic studies.
Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L
2018-02-01
A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R 2 ), using R 2 as the primary metric of assay agreement. However, the use of R 2 alone does not adequately quantify constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing assays (NGS). NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Flow-covariate prediction of stream pesticide concentrations.
Mosquin, Paul L; Aldworth, Jeremy; Chen, Wenlin
2018-01-01
Potential peak functions (e.g., maximum rolling averages over a given duration) of annual pesticide concentrations in the aquatic environment are important exposure parameters (or target quantities) for ecological risk assessments. These target quantities require accurate concentration estimates on nonsampled days in a monitoring program. We examined stream flow as a covariate via universal kriging to improve predictions of maximum m-day (m = 1, 7, 14, 30, 60) rolling averages and the 95th percentiles of atrazine concentration in streams where data were collected every 7 or 14 d. The universal kriging predictions were evaluated against the target quantities calculated directly from the daily (or near daily) measured atrazine concentration at 32 sites (89 site-yr) as part of the Atrazine Ecological Monitoring Program in the US corn belt region (2008-2013) and 4 sites (62 site-yr) in Ohio by the National Center for Water Quality Research (1993-2008). Because stream flow data are strongly skewed to the right, 3 transformations of the flow covariate were considered: log transformation, short-term flow anomaly, and normalized Box-Cox transformation. The normalized Box-Cox transformation resulted in predictions of the target quantities that were comparable to those obtained from log-linear interpolation (i.e., linear interpolation on the log scale) for 7-d sampling. However, the predictions appeared to be negatively affected by variability in regression coefficient estimates across different sample realizations of the concentration time series. Therefore, revised models incorporating seasonal covariates and partially or fully constrained regression parameters were investigated, and they were found to provide much improved predictions in comparison with those from log-linear interpolation for all rolling average measures. Environ Toxicol Chem 2018;37:260-273. © 2017 SETAC. © 2017 SETAC.
Allodji, Rodrigue S; Schwartz, Boris; Diallo, Ibrahima; Agbovon, Césaire; Laurier, Dominique; de Vathaire, Florent
2015-08-01
Analyses of the Life Span Study (LSS) of Japanese atomic bombing survivors have routinely incorporated corrections for additive classical measurement errors using regression calibration. Recently, several studies reported that the efficiency of the simulation-extrapolation method (SIMEX) is slightly more accurate than the simple regression calibration method (RCAL). In the present paper, the SIMEX and RCAL methods have been used to address errors in atomic bomb survivor dosimetry on solid cancer and leukaemia mortality risk estimates. For instance, it is shown that using the SIMEX method, the ERR/Gy is increased by an amount of about 29 % for all solid cancer deaths using a linear model compared to the RCAL method, and the corrected EAR 10(-4) person-years at 1 Gy (the linear terms) is decreased by about 8 %, while the corrected quadratic term (EAR 10(-4) person-years/Gy(2)) is increased by about 65 % for leukaemia deaths based on a linear-quadratic model. The results with SIMEX method are slightly higher than published values. The observed differences were probably due to the fact that with the RCAL method the dosimetric data were partially corrected, while all doses were considered with the SIMEX method. Therefore, one should be careful when comparing the estimated risks and it may be useful to use several correction techniques in order to obtain a range of corrected estimates, rather than to rely on a single technique. This work will enable to improve the risk estimates derived from LSS data, and help to make more reliable the development of radiation protection standards.
Estelles-Lopez, Lucia; Ropodi, Athina; Pavlidis, Dimitris; Fotopoulou, Jenny; Gkousari, Christina; Peyrodie, Audrey; Panagou, Efstathios; Nychas, George-John; Mohareb, Fady
2017-09-01
Over the past decade, analytical approaches based on vibrational spectroscopy, hyperspectral/multispectral imagining and biomimetic sensors started gaining popularity as rapid and efficient methods for assessing food quality, safety and authentication; as a sensible alternative to the expensive and time-consuming conventional microbiological techniques. Due to the multi-dimensional nature of the data generated from such analyses, the output needs to be coupled with a suitable statistical approach or machine-learning algorithms before the results can be interpreted. Choosing the optimum pattern recognition or machine learning approach for a given analytical platform is often challenging and involves a comparative analysis between various algorithms in order to achieve the best possible prediction accuracy. In this work, "MeatReg", a web-based application is presented, able to automate the procedure of identifying the best machine learning method for comparing data from several analytical techniques, to predict the counts of microorganisms responsible of meat spoilage regardless of the packaging system applied. In particularly up to 7 regression methods were applied and these are ordinary least squares regression, stepwise linear regression, partial least square regression, principal component regression, support vector regression, random forest and k-nearest neighbours. MeatReg" was tested with minced beef samples stored under aerobic and modified atmosphere packaging and analysed with electronic nose, HPLC, FT-IR, GC-MS and Multispectral imaging instrument. Population of total viable count, lactic acid bacteria, pseudomonads, Enterobacteriaceae and B. thermosphacta, were predicted. As a result, recommendations of which analytical platforms are suitable to predict each type of bacteria and which machine learning methods to use in each case were obtained. The developed system is accessible via the link: www.sorfml.com. Copyright © 2017 Elsevier Ltd. All rights reserved.
2017-10-01
ENGINEERING CENTER GRAIN EVALUATION SOFTWARE TO NUMERICALLY PREDICT LINEAR BURN REGRESSION FOR SOLID PROPELLANT GRAIN GEOMETRIES Brian...author(s) and should not be construed as an official Department of the Army position, policy, or decision, unless so designated by other documentation...U.S. ARMY ARMAMENT RESEARCH, DEVELOPMENT AND ENGINEERING CENTER GRAIN EVALUATION SOFTWARE TO NUMERICALLY PREDICT LINEAR BURN REGRESSION FOR SOLID
Jordan, Nika; Zakrajšek, Jure; Bohanec, Simona; Roškar, Robert; Grabnar, Iztok
2018-05-01
The aim of the present research is to show that the methodology of Design of Experiments can be applied to stability data evaluation, as they can be seen as multi-factor and multi-level experimental designs. Linear regression analysis is usually an approach for analyzing stability data, but multivariate statistical methods could also be used to assess drug stability during the development phase. Data from a stability study for a pharmaceutical product with hydrochlorothiazide (HCTZ) as an unstable drug substance was used as a case example in this paper. The design space of the stability study was modeled using Umetrics MODDE 10.1 software. We showed that a Partial Least Squares model could be used for a multi-dimensional presentation of all data generated in a stability study and for determination of the relationship among factors that influence drug stability. It might also be used for stability predictions and potentially for the optimization of the extent of stability testing needed to determine shelf life and storage conditions, which would be time and cost-effective for the pharmaceutical industry.
Low-flow frequency analyses for streams in west-central Florida
Hammett, K.M.
1985-01-01
The log-Pearson type III distribution was used for defining low-flow frequency at 116 continuous-record streamflow stations in west-central Florida. Frequency distributions were calculated for 1, 3, 7, 14, 30, 60, 90, 120, and 183 consecutive-day periods for recurrence intervals of 2, 5, 10, and 20 years. Discharge measurements at more than 100 low-flow partial-record stations and miscellaneous discharge-measurement stations were correlated with concurrent daily mean discharge at continuous-record stations. Estimates of the 7-day, 2-year; 7-day, 10-year; 30-day, 2-year; and 30-day, 10-year discharges were made for most of the low-flow partial-record and miscellaneous discharge-measurement stations based on those correlations. Multiple linear-regression analysis was used in an attempt to mathematically relate low-flow frequency data to basin characteristics. The resulting equations showed an apparent bias and were considered unsatisfactory for use in estimating low-flow characteristics. Maps of the 7-day, 10-year and 30-day, 10-year low flows are presented. Techniques that can be used to estimate low-flow characteristics at an ungaged site are also provided. (USGS)
NASA Astrophysics Data System (ADS)
Shao, Yongni; Jiang, Linjun; Zhou, Hong; Pan, Jian; He, Yong
2016-04-01
In our study, the feasibility of using visible/near infrared hyperspectral imaging technology to detect the changes of the internal components of Chlorella pyrenoidosa so as to determine the varieties of pesticides (such as butachlor, atrazine and glyphosate) at three concentrations (0.6 mg/L, 3 mg/L, 15 mg/L) was investigated. Three models (partial least squares discriminant analysis combined with full wavelengths, FW-PLSDA; partial least squares discriminant analysis combined with competitive adaptive reweighted sampling algorithm, CARS-PLSDA; linear discrimination analysis combined with regression coefficients, RC-LDA) were built by the hyperspectral data of Chlorella pyrenoidosa to find which model can produce the most optimal result. The RC-LDA model, which achieved an average correct classification rate of 97.0% was more superior than FW-PLSDA (72.2%) and CARS-PLSDA (84.0%), and it proved that visible/near infrared hyperspectral imaging could be a rapid and reliable technique to identify pesticide varieties. It also proved that microalgae can be a very promising medium to indicate characteristics of pesticides.
Linear regression in astronomy. II
NASA Technical Reports Server (NTRS)
Feigelson, Eric D.; Babu, Gutti J.
1992-01-01
A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.
Van Hoorenbeeck, K; Franckx, H; Debode, P; Aerts, P; Ramet, J; Van Gaal, L F; Desager, K N; De Backer, W A; Verhulst, S L
2013-07-01
Sleep-disordered breathing (SDB) is prevalent in obesity. Weight loss is one of the most effective treatment options. The aim was to assess the association of SDB and metabolic disruption before and after weight loss. Obese adolescents were included when entering an in-patient weight loss program. Fasting blood analysis was performed at baseline and after 4-6 months. Sleep screening was done at baseline and at follow-up in case of baseline SDB. 224 obese adolescents were included. Median age was 15.5 years (10.1-18.0) and mean BMI z-score was 2.74 ± 0.42. About 30% had SDB at baseline (N = 68). High-density lipoprotein (HDL)-cholesterol was associated with mean nocturnal oxygen saturation (
Korean Version of Child Perceptions Questionnaire and Dental Caries among Korean Children
Shin, Hye-Sun; Han, Dong-Hun; Shin, Myung-Seop; Lee, Hyun-Jin; Kim, Mi-Sun; Kim, Hyun-Duck
2015-01-01
Although dental caries has been a major oral health problem for children, the association between dental caries and oral health related quality of life has been still controversial. This study aims to evaluate the association between the Korean version of the Child Perceptions Questionnaire (K-CPQ) and dental caries among Korean children. Eight hundred one school children aged 8 to 14 years participated in this study. After the K-CPQ was validated we performed an association study. The K-CPQ was self-reported. Dental caries were evaluated by dentists using the World Health Organization Index. Correlation analyses (intraclass correlation coefficient, Cronbach’s alpha and Pearson’s correlation coefficient [r]) and linear regression models (partial r) including age, gender and type of school were applied. Untreated deciduous dental caries was associated with the K-CPQ8-10 overall score (partial r = 0.15, P <0.05). The link was highlighted in the domains of functional limitation and emotional well-being. Filled teeth due to caries (FT) was associated with the K-CPQ11-14 overall domain (partial r = 0.14, P = 0.002) as well as with the oral symptoms domain (partial r = 0.16, P = 0.001). This association was highlighted among public school children. Our data indicate that K-CPQ was independently associated with dental caries. The K-CPQ could be a practical tool to evaluate the subjective oral health among Korean children aged 8 to 14. PMID:25675410
A Constrained Linear Estimator for Multiple Regression
ERIC Educational Resources Information Center
Davis-Stober, Clintin P.; Dana, Jason; Budescu, David V.
2010-01-01
"Improper linear models" (see Dawes, Am. Psychol. 34:571-582, "1979"), such as equal weighting, have garnered interest as alternatives to standard regression models. We analyze the general circumstances under which these models perform well by recasting a class of "improper" linear models as "proper" statistical models with a single predictor. We…
Wu, Jibo
2016-01-01
In this article, a generalized difference-based ridge estimator is proposed for the vector parameter in a partial linear model when the errors are dependent. It is supposed that some additional linear constraints may hold to the whole parameter space. Its mean-squared error matrix is compared with the generalized restricted difference-based estimator. Finally, the performance of the new estimator is explained by a simulation study and a numerical example.
On the design of classifiers for crop inventories
NASA Technical Reports Server (NTRS)
Heydorn, R. P.; Takacs, H. C.
1986-01-01
Crop proportion estimators that use classifications of satellite data to correct, in an additive way, a given estimate acquired from ground observations are discussed. A linear version of these estimators is optimal, in terms of minimum variance, when the regression of the ground observations onto the satellite observations in linear. When this regression is not linear, but the reverse regression (satellite observations onto ground observations) is linear, the estimator is suboptimal but still has certain appealing variance properties. In this paper expressions are derived for those regressions which relate the intercepts and slopes to conditional classification probabilities. These expressions are then used to discuss the question of classifier designs that can lead to low-variance crop proportion estimates. Variance expressions for these estimates in terms of classifier omission and commission errors are also derived.
Peng, Ying; Li, Su-Ning; Pei, Xuexue; Hao, Kun
2018-03-01
Amultivariate regression statisticstrategy was developed to clarify multi-components content-effect correlation ofpanaxginseng saponins extract and predict the pharmacological effect by components content. In example 1, firstly, we compared pharmacological effects between panax ginseng saponins extract and individual saponin combinations. Secondly, we examined the anti-platelet aggregation effect in seven different saponin combinations of ginsenoside Rb1, Rg1, Rh, Rd, Ra3 and notoginsenoside R1. Finally, the correlation between anti-platelet aggregation and the content of multiple components was analyzed by a partial least squares algorithm. In example 2, firstly, 18 common peaks were identified in ten different batches of panax ginseng saponins extracts from different origins. Then, we investigated the anti-myocardial ischemia reperfusion injury effects of the ten different panax ginseng saponins extracts. Finally, the correlation between the fingerprints and the cardioprotective effects was analyzed by a partial least squares algorithm. Both in example 1 and 2, the relationship between the components content and pharmacological effect was modeled well by the partial least squares regression equations. Importantly, the predicted effect curve was close to the observed data of dot marked on the partial least squares regression model. This study has given evidences that themulti-component content is a promising information for predicting the pharmacological effects of traditional Chinese medicine.
Semiantichains and Unichain Coverings in Direct Products of Partial Orders.
1980-09-01
34 Discrete Math . 5 (1973), 305-337. 13) G. B. Dantsig and A. J. 11offman, "Dilworth’s theorem on partially ordered sets,* in Linear Inequalities and Related...Sperner theorem,* Discrete Math . 17 (1977), 281-289. 118) A. J. Hoffman, ’The role of unimodularity in applying linear inequalities to combinatorial
Pyrogenic carbon distribution in mineral topsoils of the northeastern United States
Jauss, Verena; Sullivan, Patrick J.; Sanderman, Jonathan; Smith, David; Lehmann, Johannes
2017-01-01
Due to its slow turnover rates in soil, pyrogenic carbon (PyC) is considered an important C pool and relevant to climate change processes. Therefore, the amounts of soil PyC were compared to environmental covariates over an area of 327,757 km2 in the northeastern United States in order to understand the controls on PyC distribution over large areas. Topsoil (defined as the soil A horizon, after removal of any organic horizons) samples were collected at 165 field sites in a generalised random tessellation stratified design that corresponded to approximately 1 site per 1600 km2 and PyC was estimated from diffuse reflectance mid-infrared spectroscopy measurements using a partial least-squares regression analysis in conjunction with a large database of PyC measurements based on a solid-state 13C nuclear magnetic resonance spectroscopy technique. Three spatial models were applied to the data in order to relate critical environmental covariates to the changes in spatial density of PyC over the landscape. Regional mean density estimates of PyC were 11.0 g kg− 1 (0.84 Gg km− 2) for Ordinary Kriging, 25.8 g kg− 1(12.2 Gg km− 2) for Multivariate Linear Regression, and 26.1 g kg− 1 (12.4 Gg km− 2) for Bayesian Regression Kriging. Akaike Information Criterion (AIC) indicated that the Multivariate Linear Regression model performed best (AIC = 842.6; n = 165) compared to Ordinary Kriging (AIC = 982.4) and Bayesian Regression Kriging (AIC = 979.2). Soil PyC concentrations correlated well with total soil sulphur (P < 0.001; n = 165), plant tissue lignin (P = 0.003), and drainage class (P = 0.008). This suggests the opportunity of including related environmental parameters in the spatial assessment of PyC in soils. Better estimates of the contribution of PyC to the global carbon cycle will thus also require more accurate assessments of these covariates.
NASA Technical Reports Server (NTRS)
Abarbanel, Saul; Gottlieb, David; Carpenter, Mark H.
1994-01-01
It has been previously shown that the temporal integration of hyperbolic partial differential equations (PDE's) may, because of boundary conditions, lead to deterioration of accuracy of the solution. A procedure for removal of this error in the linear case has been established previously. In the present paper we consider hyperbolic (PDE's) (linear and non-linear) whose boundary treatment is done via the SAT-procedure. A methodology is present for recovery of the full order of accuracy, and has been applied to the case of a 4th order explicit finite difference scheme.
Bowen, Stephen R; Chappell, Richard J; Bentzen, Søren M; Deveau, Michael A; Forrest, Lisa J; Jeraj, Robert
2012-01-01
Purpose To quantify associations between pre-radiotherapy and post-radiotherapy PET parameters via spatially resolved regression. Materials and methods Ten canine sinonasal cancer patients underwent PET/CT scans of [18F]FDG (FDGpre), [18F]FLT (FLTpre), and [61Cu]Cu-ATSM (Cu-ATSMpre). Following radiotherapy regimens of 50 Gy in 10 fractions, veterinary patients underwent FDG PET/CT scans at three months (FDGpost). Regression of standardized uptake values in baseline FDGpre, FLTpre and Cu-ATSMpre tumour voxels to those in FDGpost images was performed for linear, log-linear, generalized-linear and mixed-fit linear models. Goodness-of-fit in regression coefficients was assessed by R2. Hypothesis testing of coefficients over the patient population was performed. Results Multivariate linear model fits of FDGpre to FDGpost were significantly positive over the population (FDGpost~0.17 FDGpre, p=0.03), and classified slopes of RECIST non-responders and responders to be different (0.37 vs. 0.07, p=0.01). Generalized-linear model fits related FDGpre to FDGpost by a linear power law (FDGpost~FDGpre0.93, p<0.001). Univariate mixture model fits of FDGpre improved R2 from 0.17 to 0.52. Neither baseline FLT PET nor Cu-ATSM PET uptake contributed statistically significant multivariate regression coefficients. Conclusions Spatially resolved regression analysis indicates that pre-treatment FDG PET uptake is most strongly associated with three-month post-treatment FDG PET uptake in this patient population, though associations are histopathology-dependent. PMID:22682748
Linear regression analysis of survival data with missing censoring indicators.
Wang, Qihua; Dinse, Gregg E
2011-04-01
Linear regression analysis has been studied extensively in a random censorship setting, but typically all of the censoring indicators are assumed to be observed. In this paper, we develop synthetic data methods for estimating regression parameters in a linear model when some censoring indicators are missing. We define estimators based on regression calibration, imputation, and inverse probability weighting techniques, and we prove all three estimators are asymptotically normal. The finite-sample performance of each estimator is evaluated via simulation. We illustrate our methods by assessing the effects of sex and age on the time to non-ambulatory progression for patients in a brain cancer clinical trial.
NASA Astrophysics Data System (ADS)
Taylor, James S., Jr.; Davis, P. S.; Wolff, Lawrence B.
2003-09-01
Research has shown that naturally occurring light outdoors and underwater is partially linearly polarized. The polarized components can be combined to form an image that describes the polarization of the light in the scene. This image is known as the degree of linear polarization (DOLP) image or partial polarization image. These naturally occurring polarization signatures can provide a diver or an unmanned underwater vehicle (UUV) with more information to detect, classify, and identify threats such as obstacles and/or mines in the shallow water environment. The SHallow water Real-time IMaging Polarimeter (SHRIMP), recently developed under sponsorship of Dr. Tom Swean at the Office of Naval Research (Code 321OE), can measure underwater partial polarization imagery. This sensor is a passive, three-channel device that simultaneously measures the three components of the Stokes vector needed to determine the partial linear polarization of the scene. The testing of this sensor has been completed and the data has been analyzed. This paper presents performance results from the field-testing and quantifies the gain provided by the partial polarization signature of targets in the Very Shallow Water (VSW) and Surf Zone (SZ) regions.
An Analysis of COLA (Cost of Living Adjustment) Allocation within the United States Coast Guard.
1983-09-01
books Applied Linear Regression [Ref. 39], and Statistical Methods in Research and Production [Ref. 40], or any other book on regression. In the event...Indexes, Master’s Thesis, Air Force Institute of Technology, Wright-Patterson AFB, 1976. 39. Weisberg, Stanford, Applied Linear Regression , Wiley, 1980. 40
Testing hypotheses for differences between linear regression lines
Stanley J. Zarnoch
2009-01-01
Five hypotheses are identified for testing differences between simple linear regression lines. The distinctions between these hypotheses are based on a priori assumptions and illustrated with full and reduced models. The contrast approach is presented as an easy and complete method for testing for overall differences between the regressions and for making pairwise...
Graphical Description of Johnson-Neyman Outcomes for Linear and Quadratic Regression Surfaces.
ERIC Educational Resources Information Center
Schafer, William D.; Wang, Yuh-Yin
A modification of the usual graphical representation of heterogeneous regressions is described that can aid in interpreting significant regions for linear or quadratic surfaces. The standard Johnson-Neyman graph is a bivariate plot with the criterion variable on the ordinate and the predictor variable on the abscissa. Regression surfaces are drawn…
Teaching the Concept of Breakdown Point in Simple Linear Regression.
ERIC Educational Resources Information Center
Chan, Wai-Sum
2001-01-01
Most introductory textbooks on simple linear regression analysis mention the fact that extreme data points have a great influence on ordinary least-squares regression estimation; however, not many textbooks provide a rigorous mathematical explanation of this phenomenon. Suggests a way to fill this gap by teaching students the concept of breakdown…
Estimating monotonic rates from biological data using local linear regression.
Olito, Colin; White, Craig R; Marshall, Dustin J; Barneche, Diego R
2017-03-01
Accessing many fundamental questions in biology begins with empirical estimation of simple monotonic rates of underlying biological processes. Across a variety of disciplines, ranging from physiology to biogeochemistry, these rates are routinely estimated from non-linear and noisy time series data using linear regression and ad hoc manual truncation of non-linearities. Here, we introduce the R package LoLinR, a flexible toolkit to implement local linear regression techniques to objectively and reproducibly estimate monotonic biological rates from non-linear time series data, and demonstrate possible applications using metabolic rate data. LoLinR provides methods to easily and reliably estimate monotonic rates from time series data in a way that is statistically robust, facilitates reproducible research and is applicable to a wide variety of research disciplines in the biological sciences. © 2017. Published by The Company of Biologists Ltd.
Locally linear regression for pose-invariant face recognition.
Chai, Xiujuan; Shan, Shiguang; Chen, Xilin; Gao, Wen
2007-07-01
The variation of facial appearance due to the viewpoint (/pose) degrades face recognition systems considerably, which is one of the bottlenecks in face recognition. One of the possible solutions is generating virtual frontal view from any given nonfrontal view to obtain a virtual gallery/probe face. Following this idea, this paper proposes a simple, but efficient, novel locally linear regression (LLR) method, which generates the virtual frontal view from a given nonfrontal face image. We first justify the basic assumption of the paper that there exists an approximate linear mapping between a nonfrontal face image and its frontal counterpart. Then, by formulating the estimation of the linear mapping as a prediction problem, we present the regression-based solution, i.e., globally linear regression. To improve the prediction accuracy in the case of coarse alignment, LLR is further proposed. In LLR, we first perform dense sampling in the nonfrontal face image to obtain many overlapped local patches. Then, the linear regression technique is applied to each small patch for the prediction of its virtual frontal patch. Through the combination of all these patches, the virtual frontal view is generated. The experimental results on the CMU PIE database show distinct advantage of the proposed method over Eigen light-field method.
Vindimian, Éric; Garric, Jeanne; Flammarion, Patrick; Thybaud, Éric; Babut, Marc
1999-10-01
The evaluation of the ecotoxicity of effluents requires a battery of biological tests on several species. In order to derive a summary parameter from such a battery, a single endpoint was calculated for all the tests: the EC10, obtained by nonlinear regression, with bootstrap evaluation of the confidence intervals. Principal component analysis was used to characterize and visualize the correlation between the tests. The table of the toxicity of the effluents was then submitted to a panel of experts, who classified the effluents according to the test results. Partial least squares (PLS) regression was used to fit the average value of the experts' judgements to the toxicity data, using a simple equation. Furthermore, PLS regression on partial data sets and other considerations resulted in an optimum battery, with two chronic tests and one acute test. The index is intended to be used for the classification of effluents based on their toxicity to aquatic species. Copyright © 1999 SETAC.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vindimian, E.; Garric, J.; Flammarion, P.
1999-10-01
The evaluation of the ecotoxicity of effluents requires a battery of biological tests on several species. In order to derive a summary parameter from such a battery, a single endpoint was calculated for all the tests: the EC10, obtained by nonlinear regression, with bootstrap evaluation of the confidence intervals. Principal component analysis was used to characterize and visualize the correlation between the tests. The table of the toxicity of the effluents was then submitted to a panel of experts, who classified the effluents according to the test results. Partial least squares (PLS) regression was used to fit the average valuemore » of the experts' judgments to the toxicity data, using a simple equation. Furthermore, PLS regression on partial data sets and other considerations resulted in an optimum battery, with two chronic tests and one acute test. The index is intended to be used for the classification of effluents based on their toxicity to aquatic species.« less
USDA-ARS?s Scientific Manuscript database
Purpose: The aim of this study was to develop a technique for the non-destructive and rapid prediction of the moisture content in red pepper powder using near-infrared (NIR) spectroscopy and a partial least squares regression (PLSR) model. Methods: Three red pepper powder products were separated in...
USDA-ARS?s Scientific Manuscript database
A technique of using multiple calibration sets in partial least squares regression (PLS) was proposed to improve the quantitative determination of ammonia from open-path Fourier transform infrared spectra. The spectra were measured near animal farms, and the path-integrated concentration of ammonia...
Hejri-Zarifi, Sudiyeh; Ahmadian-Kouchaksaraei, Zahra; Pourfarzad, Amir; Khodaparast, Mohammad Hossein Haddad
2014-12-01
Germinated palm date seeds were milled into two fractions: germ and residue. Dough rheological characteristics, baking (specific volume and sensory evaluation), and textural properties (at first day and during storage for 5 days) were determined in Barbari flat bread. Germ and residue fractions were incorporated at various levels ranged in 0.5-3 g/100 g of wheat flour. Water absorption, arrival time and gelatination temperature were decreased by germ fraction but accompanied by an increasing effect on the mixing tolerance index and degree of softening in most levels. Although improvement in dough stability was monitored but specific volume of bread was not affected by both fractions. Texture analysis of bread samples during 5 days of storage indicated that both fractions of germinated date seeds were able to diminish bread staling. Avrami non-linear regression equation was chosen as useful mathematical model to properly study bread hardening kinetics. In addition, principal component analysis (PCA) allowed discriminating among dough and bread specialties. Partial least squares regression (PLSR) models were applied to determine the relationships between sensory and instrumental data.
Scampicchio, Matteo; Mimmo, Tanja; Capici, Calogero; Huck, Christian; Innocente, Nadia; Drusch, Stephan; Cesco, Stefano
2012-11-14
Stable isotope values were used to develop a new analytical approach enabling the simultaneous identification of milk samples either processed with different heating regimens or from different geographical origins. The samples consisted of raw, pasteurized (HTST), and ultrapasteurized (UHT) milk from different Italian origins. The approach consisted of the analysis of the isotope ratio of δ¹³C and δ¹⁵N for the milk samples and their fractions (fat, casein, and whey). The main finding of this work is that as the heat processing affects the composition of the milk fractions, changes in δ¹³C and δ¹⁵N were also observed. These changes were used as markers to develop pattern recognition maps based on principal component analysis and supervised classification models, such as linear discriminant analysis (LDA), multivariate regression (MLR), principal component regression (PCR), and partial least-squares (PLS). The results give proof of the concept that isotope ratio mass spectroscopy can discriminate simultaneously between milk samples according to their geographical origin and type of processing.
Effect of Malmquist bias on correlation studies with IRAS data base
NASA Technical Reports Server (NTRS)
Verter, Frances
1993-01-01
The relationships between galaxy properties in the sample of Trinchieri et al. (1989) are reexamined with corrections for Malmquist bias. The linear correlations are tested and linear regressions are fit for log-log plots of L(FIR), L(H-alpha), and L(B) as well as ratios of these quantities. The linear correlations for Malmquist bias are corrected using the method of Verter (1988), in which each galaxy observation is weighted by the inverse of its sampling volume. The linear regressions are corrected for Malmquist bias by a new method invented here in which each galaxy observation is weighted by its sampling volume. The results of correlation and regressions among the sample are significantly changed in the anticipated sense that the corrected correlation confidences are lower and the corrected slopes of the linear regressions are lower. The elimination of Malmquist bias eliminates the nonlinear rise in luminosity that has caused some authors to hypothesize additional components of FIR emission.
A primer for biomedical scientists on how to execute model II linear regression analysis.
Ludbrook, John
2012-04-01
1. There are two very different ways of executing linear regression analysis. One is Model I, when the x-values are fixed by the experimenter. The other is Model II, in which the x-values are free to vary and are subject to error. 2. I have received numerous complaints from biomedical scientists that they have great difficulty in executing Model II linear regression analysis. This may explain the results of a Google Scholar search, which showed that the authors of articles in journals of physiology, pharmacology and biochemistry rarely use Model II regression analysis. 3. I repeat my previous arguments in favour of using least products linear regression analysis for Model II regressions. I review three methods for executing ordinary least products (OLP) and weighted least products (WLP) regression analysis: (i) scientific calculator and/or computer spreadsheet; (ii) specific purpose computer programs; and (iii) general purpose computer programs. 4. Using a scientific calculator and/or computer spreadsheet, it is easy to obtain correct values for OLP slope and intercept, but the corresponding 95% confidence intervals (CI) are inaccurate. 5. Using specific purpose computer programs, the freeware computer program smatr gives the correct OLP regression coefficients and obtains 95% CI by bootstrapping. In addition, smatr can be used to compare the slopes of OLP lines. 6. When using general purpose computer programs, I recommend the commercial programs systat and Statistica for those who regularly undertake linear regression analysis and I give step-by-step instructions in the Supplementary Information as to how to use loss functions. © 2011 The Author. Clinical and Experimental Pharmacology and Physiology. © 2011 Blackwell Publishing Asia Pty Ltd.
ERIC Educational Resources Information Center
Rocconi, Louis M.
2013-01-01
This study examined the differing conclusions one may come to depending upon the type of analysis chosen, hierarchical linear modeling or ordinary least squares (OLS) regression. To illustrate this point, this study examined the influences of seniors' self-reported critical thinking abilities three ways: (1) an OLS regression with the student…
Parsa, Azin; Ibrahim, Norliza; Hassan, Bassam; Motroni, Alessandro; van der Stelt, Paul; Wismeijer, Daniel
2012-01-01
To assess the reliability of cone beam computed tomography (CBCT) voxel gray value measurements using Hounsfield units (HU) derived from multislice computed tomography (MSCT) as a clinical reference (gold standard). Ten partially edentulous human mandibular cadavers were scanned by two types of computed tomography (CT) modalities: multislice CT and cone beam CT. On MSCT scans, eight regions of interest (ROI) designating the site for preoperative implant placement were selected in each mandible. The datasets from both CT systems were matched using a three-dimensional (3D) registration algorithm. The mean voxel gray values of the region around the implant sites were compared between MSCT and CBCT. Significant differences between the mean gray values obtained by CBCT and HU by MSCT were found. In all the selected ROIs, CBCT showed higher mean values than MSCT. A strong correlation (R=0.968) between mean voxel gray values of CBCT and mean HU of MSCT was determined. Voxel gray values from CBCT deviate from actual HU units. However, a strong linear correlation exists, which may permit deriving actual HU units from CBCT using linear regression models.
Riemannian multi-manifold modeling and clustering in brain networks
NASA Astrophysics Data System (ADS)
Slavakis, Konstantinos; Salsabilian, Shiva; Wack, David S.; Muldoon, Sarah F.; Baidoo-Williams, Henry E.; Vettel, Jean M.; Cieslak, Matthew; Grafton, Scott T.
2017-08-01
This paper introduces Riemannian multi-manifold modeling in the context of brain-network analytics: Brainnetwork time-series yield features which are modeled as points lying in or close to a union of a finite number of submanifolds within a known Riemannian manifold. Distinguishing disparate time series amounts thus to clustering multiple Riemannian submanifolds. To this end, two feature-generation schemes for brain-network time series are put forth. The first one is motivated by Granger-causality arguments and uses an auto-regressive moving average model to map low-rank linear vector subspaces, spanned by column vectors of appropriately defined observability matrices, to points into the Grassmann manifold. The second one utilizes (non-linear) dependencies among network nodes by introducing kernel-based partial correlations to generate points in the manifold of positivedefinite matrices. Based on recently developed research on clustering Riemannian submanifolds, an algorithm is provided for distinguishing time series based on their Riemannian-geometry properties. Numerical tests on time series, synthetically generated from real brain-network structural connectivity matrices, reveal that the proposed scheme outperforms classical and state-of-the-art techniques in clustering brain-network states/structures.
Piecewise exponential survival times and analysis of case-cohort data.
Li, Yan; Gail, Mitchell H; Preston, Dale L; Graubard, Barry I; Lubin, Jay H
2012-06-15
Case-cohort designs select a random sample of a cohort to be used as control with cases arising from the follow-up of the cohort. Analyses of case-cohort studies with time-varying exposures that use Cox partial likelihood methods can be computer intensive. We propose a piecewise-exponential approach where Poisson regression model parameters are estimated from a pseudolikelihood and the corresponding variances are derived by applying Taylor linearization methods that are used in survey research. The proposed approach is evaluated using Monte Carlo simulations. An illustration is provided using data from the Alpha-Tocopherol, Beta-Carotene Cancer Prevention Study of male smokers in Finland, where a case-cohort study of serum glucose level and pancreatic cancer was analyzed. Copyright © 2012 John Wiley & Sons, Ltd.
The influences of consumer characteristics on the amount of rice consumption
NASA Astrophysics Data System (ADS)
Supriana, T.; Pane, TC
2018-02-01
This study aimed to analyze the characteristics of rice consumers and the influences of consumer characteristics on the amount of rice consumption. The research areas were determined purposively in the sub-districts with the most significant population in Medan City. The analytical methods used were descriptive and multiple linear regression analysis. The results showed that consumers in the study areas have various characteristics, concerning age, income, family size, health, and education. Simultaneously, characteristics of rice consumers have the significant effect on the amount of rice consumed. Partially, age and the number of family members have the significant effect on the amount of rice consumed. The implications of this research are, need different policies toward consumers of rice based on their income strata. Rice policies cannot be generalized.
Fast neutron irradiation for locally advanced pancreatic cancer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Smith, F.P.; Schein, P.S.; MacDonald, J.S.
1981-11-01
Nineteen patients with locally advanced pancreatic cancer and one patient with islet cell cancer were treated with 1700-1500 neutron rad alone or in combination with 5-fluorouracil to exploit the theoretic advantages of higher linear energy of transfer, and lower oxygen enhancement ratio of neutrons. Only 5 of 14 (36%) obtained partial tumor regression. The median survival for all patients with pancreatic cancer was 6 months, which is less than that reported with 5-fluorouracil and conventional photon irradiation. Gastrointestinal toxicity was considerable; hemorhagic gastritis in five patients, colitis in two and esophagitis in one. One patient developed radiation myelitis. We therefore,more » caution any enthusiasm for this modality of therapy until clear evidence of a therapeutic advantage over photon therapy is demonstrated in controlled clinical trials.« less
Wang, Wen; Li, Nianfeng
2015-06-01
To measure retinol binding protein 4 (RBP4) levels in serum and bile and to analyze their relationship with insulin resistance, dyslipidemia or cholesterol saturation index (CSI). A total of 60 patients with gallstone were divided into a diabetes group (n=30) and a control group (n=30). The concentrations of RBP4 in serum and bile were detected by enzyme-linked immunosorbent assay (ELISA). Enzyme colorimetric method was used to measure the concentration of biliary cholesterol, bile acid and phospholipid. Biliary CSI was calculated by Carey table. Partial correlation and multiple linear regression analysis were used to evaluate the correlation between the RBP4 levels in serum or bile and the above indexes. The RBP4 concentrations in serum and bile in the diabetes group were significantly elevated compared with those in the control group (both P<0.01). There was no significant difference in the serum total bile acid (TBA), serum triglyceride (TG), serum high-density lipoprotein (HDL), bile TBA, bile total cholesterol (TC) , bile phospholipids and bile CSI between the 2 groups (all P>0.05); but the serum TC, low density lipoprotein (LDL), fasting blood glucose (FBG), fasting insulin (FINS), and homeostasis model assessment for insulin resistance (HOMA-IR) in the diabetes group were significantly increased compared to those in the control group (all P<0.05). The partial correlation analysis, which was adjusted by age, showed that the bile RBP4 was positively correlated with body mass index (BMI), waist circumference (WC), FINS, FBG, TC, LDL and HOMA-IR (r=0.283, 0.405, 0.685, 0.667, 0.553, 0.424 and 0.735, respectively), and the serum RBP4 was also positively correlated with the WC, FINS, FBG, TC, LDL and HOMA-IR (r=0.317, 0.734, 0.609, 0.528, 0.386 and 0.751, respectively). Stepwise multivariate linear regression analysis suggested that the HOMA-IR, BMI and WC were independently correlated with the level of bile RBP4 (multiple regression equation: Ybile RBP4=2.372XHOMA-IR+0.420XBMI+0.178XWC-26.813), and the serum RBP4 level was correlated with the HOMA-IR and WC independently (multiple regression equation: Yserum RBP4=2.832XHOMA-IR +0.235XWC-20.128). Multiple regression equations showed that HOMA-IR was the strongest correlation factor with RBP4. RBP4 concentrations in serum and bile in the diabetes group are significantly higher than those in the control group. HOMA-IR, BMI and WC are independently correlated with the level of bile RBP4. HOMA-IR and WC are independently correlated with the serum RBP4 level. HOMA-IR is the strongest correlation factor with RBP4. RBP4 might play an important role in the course of gallstone formation in Type 2 diabetes mellitus.
ERIC Educational Resources Information Center
Rocconi, Louis M.
2011-01-01
Hierarchical linear models (HLM) solve the problems associated with the unit of analysis problem such as misestimated standard errors, heterogeneity of regression and aggregation bias by modeling all levels of interest simultaneously. Hierarchical linear modeling resolves the problem of misestimated standard errors by incorporating a unique random…
ERIC Educational Resources Information Center
Preacher, Kristopher J.; Curran, Patrick J.; Bauer, Daniel J.
2006-01-01
Simple slopes, regions of significance, and confidence bands are commonly used to evaluate interactions in multiple linear regression (MLR) models, and the use of these techniques has recently been extended to multilevel or hierarchical linear modeling (HLM) and latent curve analysis (LCA). However, conducting these tests and plotting the…
Classical Testing in Functional Linear Models.
Kong, Dehan; Staicu, Ana-Maria; Maity, Arnab
2016-01-01
We extend four tests common in classical regression - Wald, score, likelihood ratio and F tests - to functional linear regression, for testing the null hypothesis, that there is no association between a scalar response and a functional covariate. Using functional principal component analysis, we re-express the functional linear model as a standard linear model, where the effect of the functional covariate can be approximated by a finite linear combination of the functional principal component scores. In this setting, we consider application of the four traditional tests. The proposed testing procedures are investigated theoretically for densely observed functional covariates when the number of principal components diverges. Using the theoretical distribution of the tests under the alternative hypothesis, we develop a procedure for sample size calculation in the context of functional linear regression. The four tests are further compared numerically for both densely and sparsely observed noisy functional data in simulation experiments and using two real data applications.
Classical Testing in Functional Linear Models
Kong, Dehan; Staicu, Ana-Maria; Maity, Arnab
2016-01-01
We extend four tests common in classical regression - Wald, score, likelihood ratio and F tests - to functional linear regression, for testing the null hypothesis, that there is no association between a scalar response and a functional covariate. Using functional principal component analysis, we re-express the functional linear model as a standard linear model, where the effect of the functional covariate can be approximated by a finite linear combination of the functional principal component scores. In this setting, we consider application of the four traditional tests. The proposed testing procedures are investigated theoretically for densely observed functional covariates when the number of principal components diverges. Using the theoretical distribution of the tests under the alternative hypothesis, we develop a procedure for sample size calculation in the context of functional linear regression. The four tests are further compared numerically for both densely and sparsely observed noisy functional data in simulation experiments and using two real data applications. PMID:28955155
Mechanisms behind the estimation of photosynthesis traits from leaf reflectance observations
NASA Astrophysics Data System (ADS)
Dechant, Benjamin; Cuntz, Matthias; Doktor, Daniel; Vohland, Michael
2016-04-01
Many studies have investigated the reflectance-based estimation of leaf chlorophyll, water and dry matter contents of plants. Only few studies focused on photosynthesis traits, however. The maximum potential uptake of carbon dioxide under given environmental conditions is determined mainly by RuBisCO activity, limiting carboxylation, or the speed of photosynthetic electron transport. These two main limitations are represented by the maximum carboxylation capacity, V cmax,25, and the maximum electron transport rate, Jmax,25. These traits were estimated from leaf reflectance before but the mechanisms underlying the estimation remain rather speculative. The aim of this study was therefore to reveal the mechanisms behind reflectance-based estimation of V cmax,25 and Jmax,25. Leaf reflectance, photosynthetic response curves as well as nitrogen content per area, Narea, and leaf mass per area, LMA, were measured on 37 deciduous tree species. V cmax,25 and Jmax,25 were determined from the response curves. Partial Least Squares (PLS) regression models for the two photosynthesis traits V cmax,25 and Jmax,25 as well as Narea and LMA were studied using a cross-validation approach. Analyses of linear regression models based on Narea and other leaf traits estimated via PROSPECT inversion, PLS regression coefficients and model residuals were conducted in order to reveal the mechanisms behind the reflectance-based estimation. We found that V cmax,25 and Jmax,25 can be estimated from leaf reflectance with good to moderate accuracy for a large number of species and different light conditions. The dominant mechanism behind the estimations was the strong relationship between photosynthesis traits and leaf nitrogen content. This was concluded from very strong relationships between PLS regression coefficients, the model residuals as well as the prediction performance of Narea- based linear regression models compared to PLS regression models. While the PLS regression model for V cmax,25 was fully based on the correlation to Narea, the PLS regression model for Jmax,25 was not entirely based on it. Analyses of the contributions of different parts of the reflectance spectrum revealed that the information contributing to the Jmax,25 PLS regression model in addition to the main source of information, Narea, was mainly located in the visible part of the spectrum (500-900 nm). Estimated chlorophyll content could be excluded as potential source of this extra information. The PLS regression coefficients of the Jmax,25 model indicated possible contributions from chlorophyll fluorescence and cytochrome f content. In summary, we found that the main mechanism behind the estimation of V cmax,25 and Jmax,25 from leaf reflectance observations is the correlation to Narea but that there is additional information related to Jmax,25 mainly in the visible part of the spectrum.
Salturk, Cuneyt; Karakurt, Zuhal; Takir, Huriye Berk; Balci, Merih; Kargin, Feyza; Mocin, Ozlem Yazıcıoglu; Gungor, Gokay; Ozmen, Ipek; Oztas, Selahattin; Yalcinsoy, Murat; Evin, Ruya; Ozturk, Murat; Adiguzel, Nalan
2015-01-01
The objective of this study was to compare the change in 6-minute walking distance (6MWD) in 1 year as an indicator of exercise capacity among patients undergoing home non-invasive mechanical ventilation (NIMV) due to chronic hypercapnic respiratory failure (CHRF) caused by different etiologies. This retrospective cohort study was conducted in a tertiary pulmonary disease hospital in patients who had completed 1-year follow-up under home NIMV because of CHRF with different etiologies (ie, chronic obstructive pulmonary disease [COPD], obesity hypoventilation syndrome [OHS], kyphoscoliosis [KS], and diffuse parenchymal lung disease [DPLD]), between January 2011 and January 2012. The results of arterial blood gas (ABG) analyses and spirometry, and 6MWD measurements with 12-month interval were recorded from the patient files, in addition to demographics, comorbidities, and body mass indices. The groups were compared in terms of 6MWD via analysis of variance (ANOVA) and multiple linear regression (MLR) analysis (independent variables: analysis age, sex, baseline 6MWD, baseline forced expiratory volume in 1 second, and baseline partial carbon dioxide pressure, in reference to COPD group). A total of 105 patients with a mean age (± standard deviation) of 61±12 years of whom 37 had COPD, 34 had OHS, 20 had KS, and 14 had DPLD were included in statistical analysis. There were no significant differences between groups in the baseline and delta values of ABG and spirometry findings. Both univariate ANOVA and MLR showed that the OHS group had the lowest baseline 6MWD and the highest decrease in 1 year (linear regression coefficient -24.48; 95% CI -48.74 to -0.21, P=0.048); while the KS group had the best baseline values and the biggest improvement under home NIMV (linear regression coefficient 26.94; 95% CI -3.79 to 57.66, P=0.085). The 6MWD measurements revealed improvement in exercise capacity test in CHRF patients receiving home NIMV treatment on long-term depends on etiological diagnoses.
Puchtel, I.S.; Walker, R.J.; James, O.B.; Kring, D.A.
2008-01-01
To characterize the compositions of materials accreted to the Earth-Moon system between about 4.5 and 3.8 Ga, we have determined Os isotopic compositions and some highly siderophile element (HSE: Re, Os, Ir, Ru, Pt, and Pd) abundances in 48 subsamples of six lunar breccias. These are: Apollo 17 poikilitic melt breccias 72395 and 76215; Apollo 17 aphanitic melt breccias 73215 and 73255; Apollo 14 polymict breccia 14321; and lunar meteorite NWA482, a crystallized impact melt. Plots of Ir versus other HSE define excellent linear correlations, indicating that all data sets likely represent dominantly two-component mixtures of a low-HSE target, presumably endogenous component, and a high-HSE, presumably exogenous component. Linear regressions of these trends yield intercepts that are statistically indistinguishable from zero for all HSE, except for Ru and Pd in two samples. The slopes of the linear regressions are insensitive to target rock contributions of Ru and Pd of the magnitude observed; thus, the trendline slopes approximate the elemental ratios present in the impactor components contributed to these rocks. The 187Os/188Os and regression-derived elemental ratios for the Apollo 17 aphanitic melt breccias and the lunar meteorite indicate that the impactor components in these samples have close affinities to chondritic meteorites. The HSE in the Apollo 17 aphanitic melt breccias, however, might partially or entirely reflect the HSE characteristics of HSE-rich granulitic breccia clasts that were incorporated in the impact melt at the time of its creation. In this case, the HSE characteristics of these rocks may reflect those of an impactor that predated the impact event that led to the creation of the melt breccias. The impactor components in the Apollo 17 poikilitic melt breccias and in the Apollo 14 breccia have higher 187Os/188Os, Pt/Ir, and Ru/Ir and lower Os/Ir than most chondrites. These compositions suggest that the impactors they represent were chemically distinct from known chondrite types, and possibly represent a type of primitive material not currently delivered to Earth as meteorites. ?? 2008 Elsevier Ltd.
Musuku, Adrien; Tan, Aimin; Awaiye, Kayode; Trabelsi, Fethi
2013-09-01
Linear calibration is usually performed using eight to ten calibration concentration levels in regulated LC-MS bioanalysis because a minimum of six are specified in regulatory guidelines. However, we have previously reported that two-concentration linear calibration is as reliable as or even better than using multiple concentrations. The purpose of this research is to compare two-concentration with multiple-concentration linear calibration through retrospective data analysis of multiple bioanalytical projects that were conducted in an independent regulated bioanalytical laboratory. A total of 12 bioanalytical projects were randomly selected: two validations and two studies for each of the three most commonly used types of sample extraction methods (protein precipitation, liquid-liquid extraction, solid-phase extraction). When the existing data were retrospectively linearly regressed using only the lowest and the highest concentration levels, no extra batch failure/QC rejection was observed and the differences in accuracy and precision between the original multi-concentration regression and the new two-concentration linear regression are negligible. Specifically, the differences in overall mean apparent bias (square root of mean individual bias squares) are within the ranges of -0.3% to 0.7% and 0.1-0.7% for the validations and studies, respectively. The differences in mean QC concentrations are within the ranges of -0.6% to 1.8% and -0.8% to 2.5% for the validations and studies, respectively. The differences in %CV are within the ranges of -0.7% to 0.9% and -0.3% to 0.6% for the validations and studies, respectively. The average differences in study sample concentrations are within the range of -0.8% to 2.3%. With two-concentration linear regression, an average of 13% of time and cost could have been saved for each batch together with 53% of saving in the lead-in for each project (the preparation of working standard solutions, spiking, and aliquoting). Furthermore, examples are given as how to evaluate the linearity over the entire concentration range when only two concentration levels are used for linear regression. To conclude, two-concentration linear regression is accurate and robust enough for routine use in regulated LC-MS bioanalysis and it significantly saves time and cost as well. Copyright © 2013 Elsevier B.V. All rights reserved.
A Linear Regression and Markov Chain Model for the Arabian Horse Registry
1993-04-01
as a tax deduction? Yes No T-4367 68 26. Regardless of previous equine tax deductions, do you consider your current horse activities to be... (Mark one...E L T-4367 A Linear Regression and Markov Chain Model For the Arabian Horse Registry Accesion For NTIS CRA&I UT 7 4:iC=D 5 D-IC JA" LI J:13tjlC,3 lO...the Arabian Horse Registry, which needed to forecast its future registration of purebred Arabian horses . A linear regression model was utilized to
An improved multiple linear regression and data analysis computer program package
NASA Technical Reports Server (NTRS)
Sidik, S. M.
1972-01-01
NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.
Curran, Christopher A.; Eng, Ken; Konrad, Christopher P.
2012-01-01
Regional low-flow regression models for estimating Q7,10 at ungaged stream sites are developed from the records of daily discharge at 65 continuous gaging stations (including 22 discontinued gaging stations) for the purpose of evaluating explanatory variables. By incorporating the base-flow recession time constant τ as an explanatory variable in the regression model, the root-mean square error for estimating Q7,10 at ungaged sites can be lowered to 72 percent (for known values of τ), which is 42 percent less than if only basin area and mean annual precipitation are used as explanatory variables. If partial-record sites are included in the regression data set, τ must be estimated from pairs of discharge measurements made during continuous periods of declining low flows. Eight measurement pairs are optimal for estimating τ at partial-record sites, and result in a lowering of the root-mean square error by 25 percent. A low-flow survey strategy that includes paired measurements at partial-record sites requires additional effort and planning beyond a standard strategy, but could be used to enhance regional estimates of τ and potentially reduce the error of regional regression models for estimating low-flow characteristics at ungaged sites.
Prediction and causal reasoning in planning
NASA Technical Reports Server (NTRS)
Dean, T.; Boddy, M.
1987-01-01
Nonlinear planners are often touted as having an efficiency advantage over linear planners. The reason usually given is that nonlinear planners, unlike their linear counterparts, are not forced to make arbitrary commitments to the order in which actions are to be performed. This ability to delay commitment enables nonlinear planners to solve certain problems with far less effort than would be required of linear planners. Here, it is argued that this advantage is bought with a significant reduction in the ability of a nonlinear planner to accurately predict the consequences of actions. Unfortunately, the general problem of predicting the consequences of a partially ordered set of actions is intractable. In gaining the predictive power of linear planners, nonlinear planners sacrifice their efficiency advantage. There are, however, other advantages to nonlinear planning (e.g., the ability to reason about partial orders and incomplete information) that make it well worth the effort needed to extend nonlinear methods. A framework is supplied for causal inference that supports reasoning about partially ordered events and actions whose effects depend upon the context in which they are executed. As an alternative to a complete but potentially exponential-time algorithm, researchers provide a provably sound polynomial-time algorithm for predicting the consequences of partially ordered events.
Johnson, Brent A
2009-10-01
We consider estimation and variable selection in the partial linear model for censored data. The partial linear model for censored data is a direct extension of the accelerated failure time model, the latter of which is a very important alternative model to the proportional hazards model. We extend rank-based lasso-type estimators to a model that may contain nonlinear effects. Variable selection in such partial linear model has direct application to high-dimensional survival analyses that attempt to adjust for clinical predictors. In the microarray setting, previous methods can adjust for other clinical predictors by assuming that clinical and gene expression data enter the model linearly in the same fashion. Here, we select important variables after adjusting for prognostic clinical variables but the clinical effects are assumed nonlinear. Our estimator is based on stratification and can be extended naturally to account for multiple nonlinear effects. We illustrate the utility of our method through simulation studies and application to the Wisconsin prognostic breast cancer data set.
NASA Astrophysics Data System (ADS)
Kutzbach, L.; Schneider, J.; Sachs, T.; Giebels, M.; Nykänen, H.; Shurpali, N. J.; Martikainen, P. J.; Alm, J.; Wilmking, M.
2007-07-01
Closed (non-steady state) chambers are widely used for quantifying carbon dioxide (CO2) fluxes between soils or low-stature canopies and the atmosphere. It is well recognised that covering a soil or vegetation by a closed chamber inherently disturbs the natural CO2 fluxes by altering the concentration gradients between the soil, the vegetation and the overlying air. Thus, the driving factors of CO2 fluxes are not constant during the closed chamber experiment, and no linear increase or decrease of CO2 concentration over time within the chamber headspace can be expected. Nevertheless, linear regression has been applied for calculating CO2 fluxes in many recent, partly influential, studies. This approach was justified by keeping the closure time short and assuming the concentration change over time to be in the linear range. Here, we test if the application of linear regression is really appropriate for estimating CO2 fluxes using closed chambers over short closure times and if the application of nonlinear regression is necessary. We developed a nonlinear exponential regression model from diffusion and photosynthesis theory. This exponential model was tested with four different datasets of CO2 flux measurements (total number: 1764) conducted at three peatland sites in Finland and a tundra site in Siberia. The flux measurements were performed using transparent chambers on vegetated surfaces and opaque chambers on bare peat surfaces. Thorough analyses of residuals demonstrated that linear regression was frequently not appropriate for the determination of CO2 fluxes by closed-chamber methods, even if closure times were kept short. The developed exponential model was well suited for nonlinear regression of the concentration over time c(t) evolution in the chamber headspace and estimation of the initial CO2 fluxes at closure time for the majority of experiments. CO2 flux estimates by linear regression can be as low as 40% of the flux estimates of exponential regression for closure times of only two minutes and even lower for longer closure times. The degree of underestimation increased with increasing CO2 flux strength and is dependent on soil and vegetation conditions which can disturb not only the quantitative but also the qualitative evaluation of CO2 flux dynamics. The underestimation effect by linear regression was observed to be different for CO2 uptake and release situations which can lead to stronger bias in the daily, seasonal and annual CO2 balances than in the individual fluxes. To avoid serious bias of CO2 flux estimates based on closed chamber experiments, we suggest further tests using published datasets and recommend the use of nonlinear regression models for future closed chamber studies.
Deep Wavelet Scattering for Quantum Energy Regression
NASA Astrophysics Data System (ADS)
Hirn, Matthew
Physical functionals are usually computed as solutions of variational problems or from solutions of partial differential equations, which may require huge computations for complex systems. Quantum chemistry calculations of ground state molecular energies is such an example. Indeed, if x is a quantum molecular state, then the ground state energy E0 (x) is the minimum eigenvalue solution of the time independent Schrödinger Equation, which is computationally intensive for large systems. Machine learning algorithms do not simulate the physical system but estimate solutions by interpolating values provided by a training set of known examples {(xi ,E0 (xi) } i <= n . However, precise interpolations may require a number of examples that is exponential in the system dimension, and are thus intractable. This curse of dimensionality may be circumvented by computing interpolations in smaller approximation spaces, which take advantage of physical invariants. Linear regressions of E0 over a dictionary Φ ={ϕk } k compute an approximation E 0 as: E 0 (x) =∑kwkϕk (x) , where the weights {wk } k are selected to minimize the error between E0 and E 0 on the training set. The key to such a regression approach then lies in the design of the dictionary Φ. It must be intricate enough to capture the essential variability of E0 (x) over the molecular states x of interest, while simple enough so that evaluation of Φ (x) is significantly less intensive than a direct quantum mechanical computation (or approximation) of E0 (x) . In this talk we present a novel dictionary Φ for the regression of quantum mechanical energies based on the scattering transform of an intermediate, approximate electron density representation ρx of the state x. The scattering transform has the architecture of a deep convolutional network, composed of an alternating sequence of linear filters and nonlinear maps. Whereas in many deep learning tasks the linear filters are learned from the training data, here the physical properties of E0 (invariance to isometric transformations of the state x, stable to deformations of x) are leveraged to design a collection of linear filters ρx *ψλ for an appropriate wavelet ψ. These linear filters are composed with the nonlinear modulus operator, and the process is iterated upon so that at each layer stable, invariant features are extracted: ϕk (x) = ∥ | | ρx *ψλ1 | * ψλ2 | * ... *ψλm ∥ , k = (λ1 , ... ,λm) , m = 1 , 2 , ... The scattering transform thus encodes not only interactions at multiple scales (in the first layer, m = 1), but also features that encode complex phenomena resulting from a cascade of interactions across scales (in subsequent layers, m >= 2). Numerical experiments give state of the art accuracy over data bases of organic molecules, while theoretical results guarantee performance for the component of the ground state energy resulting from Coulombic interactions. Supported by the ERC InvariantClass 320959 Grant.
Sensitivity Analysis of the Integrated Medical Model for ISS Programs
NASA Technical Reports Server (NTRS)
Goodenow, D. A.; Myers, J. G.; Arellano, J.; Boley, L.; Garcia, Y.; Saile, L.; Walton, M.; Kerstman, E.; Reyes, D.; Young, M.
2016-01-01
Sensitivity analysis estimates the relative contribution of the uncertainty in input values to the uncertainty of model outputs. Partial Rank Correlation Coefficient (PRCC) and Standardized Rank Regression Coefficient (SRRC) are methods of conducting sensitivity analysis on nonlinear simulation models like the Integrated Medical Model (IMM). The PRCC method estimates the sensitivity using partial correlation of the ranks of the generated input values to each generated output value. The partial part is so named because adjustments are made for the linear effects of all the other input values in the calculation of correlation between a particular input and each output. In SRRC, standardized regression-based coefficients measure the sensitivity of each input, adjusted for all the other inputs, on each output. Because the relative ranking of each of the inputs and outputs is used, as opposed to the values themselves, both methods accommodate the nonlinear relationship of the underlying model. As part of the IMM v4.0 validation study, simulations are available that predict 33 person-missions on ISS and 111 person-missions on STS. These simulated data predictions feed the sensitivity analysis procedures. The inputs to the sensitivity procedures include the number occurrences of each of the one hundred IMM medical conditions generated over the simulations and the associated IMM outputs: total quality time lost (QTL), number of evacuations (EVAC), and number of loss of crew lives (LOCL). The IMM team will report the results of using PRCC and SRRC on IMM v4.0 predictions of the ISS and STS missions created as part of the external validation study. Tornado plots will assist in the visualization of the condition-related input sensitivities to each of the main outcomes. The outcomes of this sensitivity analysis will drive review focus by identifying conditions where changes in uncertainty could drive changes in overall model output uncertainty. These efforts are an integral part of the overall verification, validation, and credibility review of IMM v4.0.
Takatsu, Yasuo; Ueyama, Tsuyoshi; Miyati, Tosiaki; Yamamura, Kenichirou
2016-12-01
The image characteristics in dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) depend on the partial Fourier fraction and contrast medium concentration. These characteristics were assessed and the modulation transfer function (MTF) was calculated by computer simulation. A digital phantom was created from signal intensity data acquired at different contrast medium concentrations on a breast model. The frequency images [created by fast Fourier transform (FFT)] were divided into 512 parts and rearranged to form a new image. The inverse FFT of this image yielded the MTF. From the reference data, three linear models (low, medium, and high) and three exponential models (slow, medium, and rapid) of the signal intensity were created. Smaller partial Fourier fractions, and higher gradients in the linear models, corresponded to faster MTF decline. The MTF more gradually decreased in the exponential models than in the linear models. The MTF, which reflects the image characteristics in DCE-MRI, was more degraded as the partial Fourier fraction decreased.
Quantum Private Comparison Protocol with Linear Optics
NASA Astrophysics Data System (ADS)
Luo, Qing-bin; Yang, Guo-wu; She, Kun; Li, Xiaoyu
2016-12-01
In this paper, we propose an innovative quantum private comparison(QPC) protocol based on partial Bell-state measurement from the view of linear optics, which enabling two parties to compare the equality of their private information with the help of a semi-honest third party. Partial Bell-state measurement has been realized by using only linear optical elements in experimental measurement-device-independent quantum key distribution(MDI-QKD) schemes, which makes us believe that our protocol can be realized in the near future. The security analysis shows that the participants will not leak their private information.
Biostatistics Series Module 6: Correlation and Linear Regression.
Hazra, Avijit; Gogtay, Nithya
2016-01-01
Correlation and linear regression are the most commonly used techniques for quantifying the association between two numeric variables. Correlation quantifies the strength of the linear relationship between paired variables, expressing this as a correlation coefficient. If both variables x and y are normally distributed, we calculate Pearson's correlation coefficient ( r ). If normality assumption is not met for one or both variables in a correlation analysis, a rank correlation coefficient, such as Spearman's rho (ρ) may be calculated. A hypothesis test of correlation tests whether the linear relationship between the two variables holds in the underlying population, in which case it returns a P < 0.05. A 95% confidence interval of the correlation coefficient can also be calculated for an idea of the correlation in the population. The value r 2 denotes the proportion of the variability of the dependent variable y that can be attributed to its linear relation with the independent variable x and is called the coefficient of determination. Linear regression is a technique that attempts to link two correlated variables x and y in the form of a mathematical equation ( y = a + bx ), such that given the value of one variable the other may be predicted. In general, the method of least squares is applied to obtain the equation of the regression line. Correlation and linear regression analysis are based on certain assumptions pertaining to the data sets. If these assumptions are not met, misleading conclusions may be drawn. The first assumption is that of linear relationship between the two variables. A scatter plot is essential before embarking on any correlation-regression analysis to show that this is indeed the case. Outliers or clustering within data sets can distort the correlation coefficient value. Finally, it is vital to remember that though strong correlation can be a pointer toward causation, the two are not synonymous.
Biostatistics Series Module 6: Correlation and Linear Regression
Hazra, Avijit; Gogtay, Nithya
2016-01-01
Correlation and linear regression are the most commonly used techniques for quantifying the association between two numeric variables. Correlation quantifies the strength of the linear relationship between paired variables, expressing this as a correlation coefficient. If both variables x and y are normally distributed, we calculate Pearson's correlation coefficient (r). If normality assumption is not met for one or both variables in a correlation analysis, a rank correlation coefficient, such as Spearman's rho (ρ) may be calculated. A hypothesis test of correlation tests whether the linear relationship between the two variables holds in the underlying population, in which case it returns a P < 0.05. A 95% confidence interval of the correlation coefficient can also be calculated for an idea of the correlation in the population. The value r2 denotes the proportion of the variability of the dependent variable y that can be attributed to its linear relation with the independent variable x and is called the coefficient of determination. Linear regression is a technique that attempts to link two correlated variables x and y in the form of a mathematical equation (y = a + bx), such that given the value of one variable the other may be predicted. In general, the method of least squares is applied to obtain the equation of the regression line. Correlation and linear regression analysis are based on certain assumptions pertaining to the data sets. If these assumptions are not met, misleading conclusions may be drawn. The first assumption is that of linear relationship between the two variables. A scatter plot is essential before embarking on any correlation-regression analysis to show that this is indeed the case. Outliers or clustering within data sets can distort the correlation coefficient value. Finally, it is vital to remember that though strong correlation can be a pointer toward causation, the two are not synonymous. PMID:27904175
ERIC Educational Resources Information Center
Quinino, Roberto C.; Reis, Edna A.; Bessegato, Lupercio F.
2013-01-01
This article proposes the use of the coefficient of determination as a statistic for hypothesis testing in multiple linear regression based on distributions acquired by beta sampling. (Contains 3 figures.)
Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses.
Faul, Franz; Erdfelder, Edgar; Buchner, Axel; Lang, Albert-Georg
2009-11-01
G*Power is a free power analysis program for a variety of statistical tests. We present extensions and improvements of the version introduced by Faul, Erdfelder, Lang, and Buchner (2007) in the domain of correlation and regression analyses. In the new version, we have added procedures to analyze the power of tests based on (1) single-sample tetrachoric correlations, (2) comparisons of dependent correlations, (3) bivariate linear regression, (4) multiple linear regression based on the random predictor model, (5) logistic regression, and (6) Poisson regression. We describe these new features and provide a brief introduction to their scope and handling.
Rasmussen, Patrick P.; Gray, John R.; Glysson, G. Douglas; Ziegler, Andrew C.
2009-01-01
In-stream continuous turbidity and streamflow data, calibrated with measured suspended-sediment concentration data, can be used to compute a time series of suspended-sediment concentration and load at a stream site. Development of a simple linear (ordinary least squares) regression model for computing suspended-sediment concentrations from instantaneous turbidity data is the first step in the computation process. If the model standard percentage error (MSPE) of the simple linear regression model meets a minimum criterion, this model should be used to compute a time series of suspended-sediment concentrations. Otherwise, a multiple linear regression model using paired instantaneous turbidity and streamflow data is developed and compared to the simple regression model. If the inclusion of the streamflow variable proves to be statistically significant and the uncertainty associated with the multiple regression model results in an improvement over that for the simple linear model, the turbidity-streamflow multiple linear regression model should be used to compute a suspended-sediment concentration time series. The computed concentration time series is subsequently used with its paired streamflow time series to compute suspended-sediment loads by standard U.S. Geological Survey techniques. Once an acceptable regression model is developed, it can be used to compute suspended-sediment concentration beyond the period of record used in model development with proper ongoing collection and analysis of calibration samples. Regression models to compute suspended-sediment concentrations are generally site specific and should never be considered static, but they represent a set period in a continually dynamic system in which additional data will help verify any change in sediment load, type, and source.
NASA Astrophysics Data System (ADS)
Kutzbach, L.; Schneider, J.; Sachs, T.; Giebels, M.; Nykänen, H.; Shurpali, N. J.; Martikainen, P. J.; Alm, J.; Wilmking, M.
2007-11-01
Closed (non-steady state) chambers are widely used for quantifying carbon dioxide (CO2) fluxes between soils or low-stature canopies and the atmosphere. It is well recognised that covering a soil or vegetation by a closed chamber inherently disturbs the natural CO2 fluxes by altering the concentration gradients between the soil, the vegetation and the overlying air. Thus, the driving factors of CO2 fluxes are not constant during the closed chamber experiment, and no linear increase or decrease of CO2 concentration over time within the chamber headspace can be expected. Nevertheless, linear regression has been applied for calculating CO2 fluxes in many recent, partly influential, studies. This approach has been justified by keeping the closure time short and assuming the concentration change over time to be in the linear range. Here, we test if the application of linear regression is really appropriate for estimating CO2 fluxes using closed chambers over short closure times and if the application of nonlinear regression is necessary. We developed a nonlinear exponential regression model from diffusion and photosynthesis theory. This exponential model was tested with four different datasets of CO2 flux measurements (total number: 1764) conducted at three peatlands sites in Finland and a tundra site in Siberia. Thorough analyses of residuals demonstrated that linear regression was frequently not appropriate for the determination of CO2 fluxes by closed-chamber methods, even if closure times were kept short. The developed exponential model was well suited for nonlinear regression of the concentration over time c(t) evolution in the chamber headspace and estimation of the initial CO2 fluxes at closure time for the majority of experiments. However, a rather large percentage of the exponential regression functions showed curvatures not consistent with the theoretical model which is considered to be caused by violations of the underlying model assumptions. Especially the effects of turbulence and pressure disturbances by the chamber deployment are suspected to have caused unexplainable curvatures. CO2 flux estimates by linear regression can be as low as 40% of the flux estimates of exponential regression for closure times of only two minutes. The degree of underestimation increased with increasing CO2 flux strength and was dependent on soil and vegetation conditions which can disturb not only the quantitative but also the qualitative evaluation of CO2 flux dynamics. The underestimation effect by linear regression was observed to be different for CO2 uptake and release situations which can lead to stronger bias in the daily, seasonal and annual CO2 balances than in the individual fluxes. To avoid serious bias of CO2 flux estimates based on closed chamber experiments, we suggest further tests using published datasets and recommend the use of nonlinear regression models for future closed chamber studies.
USE IT OR LOSE IT: EAT AND EXERCISE DURING RADIOTHERAPY OR CHEMORADIOTHERAPY FOR PHARYNGEAL CANCERS
Hutcheson, Katherine A.; Bhayani, Mihir K.; Beadle, Beth M.; Gold, Kathryn A.; Shinn, Eileen H.; Lai, Stephen Y.; Lewin, Jan
2014-01-01
Objective Proactive swallowing therapy promotes ongoing use of the swallowing mechanism during radiotherapy through 2 goals: eat and exercise. The purpose of this study was to evaluate the independent effects of maintaining oral intake throughout treatment and preventive swallowing exercise. Design Retrospective observational study. Setting The University of Texas MD Anderson Cancer Center, Houston. Patients The study included 497 patients treated with definitive radiotherapy (RT) or chemoradiation (CRT) for pharyngeal cancer (458 oropharynx, 39 hypopharynx) between 2002 and 2008. Main Outcome Measures Swallowing-related endpoints were: final diet after RT/CRT and length of gastrostomy-dependence. Primary independent variables included per oral (PO) status at the end of RT/CRT (nothing per oral [NPO], partial PO, 100% PO) and swallowing exercise adherence. Multiple linear regression and ordered logistic regression models were analyzed. Results At the conclusion of RT/CRT, 131 (26%) were NPO, 74% were PO (167 [34%] partial, 199 [40%] full). Fifty-eight percent (286/497) reported adherence to swallowing exercises. Maintenance of PO intake during RT/CRT and swallowing exercise adherence were independently associated (p<0.05) with better long-term diet after RT/CRT and shorter length of gastrostomy dependence in models adjusted for tumor and treatment burden. Conclusions Data indicate independent, positive associations between maintenance of PO intake throughout RT/CRT and swallowing exercise adherence with long-term swallowing outcomes. Patients who either eat or exercise fare better than those who do neither. Patients who both eat and exercise have the highest return to a regular diet and shortest gastrostomy dependence. PMID:24051544
Rapid Quantitative Determination of Squalene in Shark Liver Oils by Raman and IR Spectroscopy.
Hall, David W; Marshall, Susan N; Gordon, Keith C; Killeen, Daniel P
2016-01-01
Squalene is sourced predominantly from shark liver oils and to a lesser extent from plants such as olives. It is used for the production of surfactants, dyes, sunscreen, and cosmetics. The economic value of shark liver oil is directly related to the squalene content, which in turn is highly variable and species-dependent. Presented here is a validated gas chromatography-mass spectrometry analysis method for the quantitation of squalene in shark liver oils, with an accuracy of 99.0 %, precision of 0.23 % (standard deviation), and linearity of >0.999. The method has been used to measure the squalene concentration of 16 commercial shark liver oils. These reference squalene concentrations were related to infrared (IR) and Raman spectra of the same oils using partial least squares regression. The resultant models were suitable for the rapid quantitation of squalene in shark liver oils, with cross-validation r (2) values of >0.98 and root mean square errors of validation of ≤4.3 % w/w. Independent test set validation of these models found mean absolute deviations of the 4.9 and 1.0 % w/w for the IR and Raman models, respectively. Both techniques were more accurate than results obtained by an industrial refractive index analysis method, which is used for rapid, cheap quantitation of squalene in shark liver oils. In particular, the Raman partial least squares regression was suited to quantitative squalene analysis. The intense and highly characteristic Raman bands of squalene made quantitative analysis possible irrespective of the lipid matrix.
Estimation of Relative Economic Weights of Hanwoo Carcass Traits Based on Carcass Market Price
Choy, Yun Ho; Park, Byoung Ho; Choi, Tae Jung; Choi, Jae Gwan; Cho, Kwang Hyun; Lee, Seung Soo; Choi, You Lim; Koh, Kyung Chul; Kim, Hyo Sun
2012-01-01
The objective of this study was to estimate economic weights of Hanwoo carcass traits that can be used to build economic selection indexes for selection of seedstocks. Data from carcass measures for determining beef yield and quality grades were collected and provided by the Korean Institute for Animal Products Quality Evaluation (KAPE). Out of 1,556,971 records, 476,430 records collected from 13 abattoirs from 2008 to 2010 after deletion of outlying observations were used to estimate relative economic weights of bid price per kg carcass weight on cold carcass weight (CW), eye muscle area (EMA), backfat thickness (BF) and marbling score (MS) and the phenotypic relationships among component traits. Price of carcass tended to increase linearly as yield grades or quality grades, in marginal or in combination, increased. Partial regression coefficients for MS, EMA, BF, and for CW in original scales were +948.5 won/score, +27.3 won/cm2, −95.2 won/mm and +7.3 won/kg when all three sex categories were taken into account. Among four grade determining traits, relative economic weight of MS was the greatest. Variations in partial regression coefficients by sex categories were great but the trends in relative weights for each carcass measures were similar. Relative economic weights of four traits in integer values when standardized measures were fit into covariance model were +4:+1:−1:+1 for MS:EMA:BF:CW. Further research is required to account for the cost of production per unit carcass weight or per unit production under different economic situations. PMID:25049531
NASA Astrophysics Data System (ADS)
Hvilshøj, S.; Jensen, K. H.; Barlebo, H. C.; Madsen, B.
1999-08-01
Inverse numerical modeling was applied to analyze pumping tests of partially penetrating wells carried out in three wells established in an unconfined aquifer in Vejen, Denmark, where extensive field investigations had previously been carried out, including tracer tests, mini-slug tests, and other hydraulic tests. Drawdown data from multiple piezometers located at various horizontal and vertical distances from the pumping well were included in the optimization. Horizontal and vertical hydraulic conductivities, specific storage, and specific yield were estimated, assuming that the aquifer was either a homogeneous system with vertical anisotropy or composed of two or three layers of different hydraulic properties. In two out of three cases, a more accurate interpretation was obtained for a multi-layer model defined on the basis of lithostratigraphic information obtained from geological descriptions of sediment samples, gammalogs, and flow-meter tests. Analysis of the pumping tests resulted in values for horizontal hydraulic conductivities that are in good accordance with those obtained from slug tests and mini-slug tests. Besides the horizontal hydraulic conductivity, it is possible to determine the vertical hydraulic conductivity, specific yield, and specific storage based on a pumping test of a partially penetrating well. The study demonstrates that pumping tests of partially penetrating wells can be analyzed using inverse numerical models. The model used in the study was a finite-element flow model combined with a non-linear regression model. Such a model can accommodate more geological information and complex boundary conditions, and the parameter-estimation procedure can be formalized to obtain optimum estimates of hydraulic parameters and their standard deviations.
NASA Astrophysics Data System (ADS)
Mahaboob, B.; Venkateswarlu, B.; Sankar, J. Ravi; Balasiddamuni, P.
2017-11-01
This paper uses matrix calculus techniques to obtain Nonlinear Least Squares Estimator (NLSE), Maximum Likelihood Estimator (MLE) and Linear Pseudo model for nonlinear regression model. David Pollard and Peter Radchenko [1] explained analytic techniques to compute the NLSE. However the present research paper introduces an innovative method to compute the NLSE using principles in multivariate calculus. This study is concerned with very new optimization techniques used to compute MLE and NLSE. Anh [2] derived NLSE and MLE of a heteroscedatistic regression model. Lemcoff [3] discussed a procedure to get linear pseudo model for nonlinear regression model. In this research article a new technique is developed to get the linear pseudo model for nonlinear regression model using multivariate calculus. The linear pseudo model of Edmond Malinvaud [4] has been explained in a very different way in this paper. David Pollard et.al used empirical process techniques to study the asymptotic of the LSE (Least-squares estimation) for the fitting of nonlinear regression function in 2006. In Jae Myung [13] provided a go conceptual for Maximum likelihood estimation in his work “Tutorial on maximum likelihood estimation
Electric current-producing device having sulfone-based electrolyte
Angell, Charles Austen; Sun, Xiao-Guang
2010-11-16
Electrolytic solvents and applications of such solvents including electric current-producing devices. For example, a solvent can include a sulfone compound of R1--SO2--R2, with R1 being an alkyl group and R2 a partially oxygenated alkyl group, to exhibit high chemical and thermal stability and high oxidation resistance. For another example, a battery can include, between an anode and a cathode, an electrolyte which includes ionic electrolyte salts and a non-aqueous electrolyte solvent which includes a non-symmetrical, non-cyclic sulfone. The sulfone has a formula of R1--SO2--R2, wherein R1 is a linear or branched alkyl or partially or fully fluorinated linear or branched alkyl group having 1 to 7 carbon atoms, and R2 is a linear or branched or partially or fully fluorinated linear or branched oxygen containing alkyl group having 1 to 7 carbon atoms. The electrolyte can include an electrolyte co-solvent and an electrolyte additive for protective layer formation.
A method for fitting regression splines with varying polynomial order in the linear mixed model.
Edwards, Lloyd J; Stewart, Paul W; MacDougall, James E; Helms, Ronald W
2006-02-15
The linear mixed model has become a widely used tool for longitudinal analysis of continuous variables. The use of regression splines in these models offers the analyst additional flexibility in the formulation of descriptive analyses, exploratory analyses and hypothesis-driven confirmatory analyses. We propose a method for fitting piecewise polynomial regression splines with varying polynomial order in the fixed effects and/or random effects of the linear mixed model. The polynomial segments are explicitly constrained by side conditions for continuity and some smoothness at the points where they join. By using a reparameterization of this explicitly constrained linear mixed model, an implicitly constrained linear mixed model is constructed that simplifies implementation of fixed-knot regression splines. The proposed approach is relatively simple, handles splines in one variable or multiple variables, and can be easily programmed using existing commercial software such as SAS or S-plus. The method is illustrated using two examples: an analysis of longitudinal viral load data from a study of subjects with acute HIV-1 infection and an analysis of 24-hour ambulatory blood pressure profiles.
GIS Tools to Estimate Average Annual Daily Traffic
DOT National Transportation Integrated Search
2012-06-01
This project presents five tools that were created for a geographical information system to estimate Annual Average Daily : Traffic using linear regression. Three of the tools can be used to prepare spatial data for linear regression. One tool can be...
Jose F. Negron; Willis C. Schaupp; Kenneth E. Gibson; John Anhold; Dawn Hansen; Ralph Thier; Phil Mocettini
1999-01-01
Data collected from Douglas-fir stands infected by the Douglas-fir beetle in Wyoming, Montana, Idaho, and Utah, were used to develop models to estimate amount of mortality in terms of basal area killed. Models were built using stepwise linear regression and regression tree approaches. Linear regression models using initial Douglas-fir basal area were built for all...
Ling, Ru; Liu, Jiawang
2011-12-01
To construct prediction model for health workforce and hospital beds in county hospitals of Hunan by multiple linear regression. We surveyed 16 counties in Hunan with stratified random sampling according to uniform questionnaires,and multiple linear regression analysis with 20 quotas selected by literature view was done. Independent variables in the multiple linear regression model on medical personnels in county hospitals included the counties' urban residents' income, crude death rate, medical beds, business occupancy, professional equipment value, the number of devices valued above 10 000 yuan, fixed assets, long-term debt, medical income, medical expenses, outpatient and emergency visits, hospital visits, actual available bed days, and utilization rate of hospital beds. Independent variables in the multiple linear regression model on county hospital beds included the the population of aged 65 and above in the counties, disposable income of urban residents, medical personnel of medical institutions in county area, business occupancy, the total value of professional equipment, fixed assets, long-term debt, medical income, medical expenses, outpatient and emergency visits, hospital visits, actual available bed days, utilization rate of hospital beds, and length of hospitalization. The prediction model shows good explanatory and fitting, and may be used for short- and mid-term forecasting.
Optimal moving grids for time-dependent partial differential equations
NASA Technical Reports Server (NTRS)
Wathen, A. J.
1992-01-01
Various adaptive moving grid techniques for the numerical solution of time-dependent partial differential equations were proposed. The precise criterion for grid motion varies, but most techniques will attempt to give grids on which the solution of the partial differential equation can be well represented. Moving grids are investigated on which the solutions of the linear heat conduction and viscous Burgers' equation in one space dimension are optimally approximated. Precisely, the results of numerical calculations of optimal moving grids for piecewise linear finite element approximation of PDE solutions in the least-squares norm are reported.
NASA Astrophysics Data System (ADS)
Fang, Wei; Huang, Shengzhi; Huang, Qiang; Huang, Guohe; Meng, Erhao; Luan, Jinkai
2018-06-01
In this study, reference evapotranspiration (ET0) forecasting models are developed for the least economically developed regions subject to meteorological data scarcity. Firstly, the partial mutual information (PMI) capable of capturing the linear and nonlinear dependence is investigated regarding its utility to identify relevant predictors and exclude those that are redundant through the comparison with partial linear correlation. An efficient input selection technique is crucial for decreasing model data requirements. Then, the interconnection between global climate indices and regional ET0 is identified. Relevant climatic indices are introduced as additional predictors to comprise information regarding ET0, which ought to be provided by meteorological data unavailable. The case study in the Jing River and Beiluo River basins, China, reveals that PMI outperforms the partial linear correlation in excluding the redundant information, favouring the yield of smaller predictor sets. The teleconnection analysis identifies the correlation between Nino 1 + 2 and regional ET0, indicating influences of ENSO events on the evapotranspiration process in the study area. Furthermore, introducing Nino 1 + 2 as predictors helps to yield more accurate ET0 forecasts. A model performance comparison also shows that non-linear stochastic models (SVR or RF with input selection through PMI) do not always outperform linear models (MLR with inputs screen by linear correlation). However, the former can offer quite comparable performance depending on smaller predictor sets. Therefore, efforts such as screening model inputs through PMI and incorporating global climatic indices interconnected with ET0 can benefit the development of ET0 forecasting models suitable for data-scarce regions.
Watanabe, Hiroyuki; Miyazaki, Hiroyasu
2006-01-01
Over- and/or under-correction of QT intervals for changes in heart rate may lead to misleading conclusions and/or masking the potential of a drug to prolong the QT interval. This study examines a nonparametric regression model (Loess Smoother) to adjust the QT interval for differences in heart rate, with an improved fitness over a wide range of heart rates. 240 sets of (QT, RR) observations collected from each of 8 conscious and non-treated beagle dogs were used as the materials for investigation. The fitness of the nonparametric regression model to the QT-RR relationship was compared with four models (individual linear regression, common linear regression, and Bazett's and Fridericia's correlation models) with reference to Akaike's Information Criterion (AIC). Residuals were visually assessed. The bias-corrected AIC of the nonparametric regression model was the best of the models examined in this study. Although the parametric models did not fit, the nonparametric regression model improved the fitting at both fast and slow heart rates. The nonparametric regression model is the more flexible method compared with the parametric method. The mathematical fit for linear regression models was unsatisfactory at both fast and slow heart rates, while the nonparametric regression model showed significant improvement at all heart rates in beagle dogs.
Linear regression analysis: part 14 of a series on evaluation of scientific publications.
Schneider, Astrid; Hommel, Gerhard; Blettner, Maria
2010-11-01
Regression analysis is an important statistical method for the analysis of medical data. It enables the identification and characterization of relationships among multiple factors. It also enables the identification of prognostically relevant risk factors and the calculation of risk scores for individual prognostication. This article is based on selected textbooks of statistics, a selective review of the literature, and our own experience. After a brief introduction of the uni- and multivariable regression models, illustrative examples are given to explain what the important considerations are before a regression analysis is performed, and how the results should be interpreted. The reader should then be able to judge whether the method has been used correctly and interpret the results appropriately. The performance and interpretation of linear regression analysis are subject to a variety of pitfalls, which are discussed here in detail. The reader is made aware of common errors of interpretation through practical examples. Both the opportunities for applying linear regression analysis and its limitations are presented.
Non-linear dynamics of compound sawteeth in tokamaks
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ahn, J.-H., E-mail: jae-heon.ahn@polytechnique.edu; Garbet, X.; Sabot, R.
2016-05-15
Compound sawteeth is studied with the XTOR-2F code. Non-linear full 3D magnetohydrodynamic simulations show that the plasma hot core is radially displaced and rotates during the partial crash, but is not fully expelled out of the q = 1 surface. Partial crashes occur when the radius of the q = 1 surface exceeds a critical value, at fixed poloidal beta. This critical value depends on the plasma elongation. The partial crash time is larger than the collapse time of an ordinary sawtooth, likely due to a weaker diamagnetic stabilization. This suggests that partial crashes result from a competition between destabilizing effects such as themore » q = 1 radius and diamagnetic stabilization.« less
A computer program for uncertainty analysis integrating regression and Bayesian methods
Lu, Dan; Ye, Ming; Hill, Mary C.; Poeter, Eileen P.; Curtis, Gary
2014-01-01
This work develops a new functionality in UCODE_2014 to evaluate Bayesian credible intervals using the Markov Chain Monte Carlo (MCMC) method. The MCMC capability in UCODE_2014 is based on the FORTRAN version of the differential evolution adaptive Metropolis (DREAM) algorithm of Vrugt et al. (2009), which estimates the posterior probability density function of model parameters in high-dimensional and multimodal sampling problems. The UCODE MCMC capability provides eleven prior probability distributions and three ways to initialize the sampling process. It evaluates parametric and predictive uncertainties and it has parallel computing capability based on multiple chains to accelerate the sampling process. This paper tests and demonstrates the MCMC capability using a 10-dimensional multimodal mathematical function, a 100-dimensional Gaussian function, and a groundwater reactive transport model. The use of the MCMC capability is made straightforward and flexible by adopting the JUPITER API protocol. With the new MCMC capability, UCODE_2014 can be used to calculate three types of uncertainty intervals, which all can account for prior information: (1) linear confidence intervals which require linearity and Gaussian error assumptions and typically 10s–100s of highly parallelizable model runs after optimization, (2) nonlinear confidence intervals which require a smooth objective function surface and Gaussian observation error assumptions and typically 100s–1,000s of partially parallelizable model runs after optimization, and (3) MCMC Bayesian credible intervals which require few assumptions and commonly 10,000s–100,000s or more partially parallelizable model runs. Ready access allows users to select methods best suited to their work, and to compare methods in many circumstances.
Davies, M A
2015-10-01
Salicylic acid (SA) is a widely used active in anti-acne face wash products. Only about 1-2% of the total dose is actually deposited on skin during washing, and more efficient deposition systems are sought. The objective of this work was to develop an improved method, including data analysis, to measure deposition of SA from wash-off formulae. Full fluorescence excitation-emission matrices (EEMs) were acquired for non-invasive measurement of deposition of SA from wash-off products. Multivariate data analysis methods - parallel factor analysis and N-way partial least-squares regression - were used to develop and compare deposition models on human volunteers and porcine skin. Although both models are useful, there are differences between them. First, the range of linear response to dosages of SA was 60 μg cm(-2) in vivo compared to 25 μg cm(-2) on porcine skin. Second, the actual shape of the SA band was different between substrates. The methods employed in this work highlight the utility of the use of EEMs, in conjunction with multivariate analysis tools such as parallel factor analysis and multiway partial least-squares calibration, in determining sources of spectral variability in skin and quantification of exogenous species deposited on skin. The human model exhibited the widest range of linearity, but porcine model is still useful up to deposition levels of 25 μg cm(-2) or used with nonlinear calibration models. © 2015 Society of Cosmetic Scientists and the Société Française de Cosmétologie.
Grajeda, Laura M; Ivanescu, Andrada; Saito, Mayuko; Crainiceanu, Ciprian; Jaganath, Devan; Gilman, Robert H; Crabtree, Jean E; Kelleher, Dermott; Cabrera, Lilia; Cama, Vitaliano; Checkley, William
2016-01-01
Childhood growth is a cornerstone of pediatric research. Statistical models need to consider individual trajectories to adequately describe growth outcomes. Specifically, well-defined longitudinal models are essential to characterize both population and subject-specific growth. Linear mixed-effect models with cubic regression splines can account for the nonlinearity of growth curves and provide reasonable estimators of population and subject-specific growth, velocity and acceleration. We provide a stepwise approach that builds from simple to complex models, and account for the intrinsic complexity of the data. We start with standard cubic splines regression models and build up to a model that includes subject-specific random intercepts and slopes and residual autocorrelation. We then compared cubic regression splines vis-à-vis linear piecewise splines, and with varying number of knots and positions. Statistical code is provided to ensure reproducibility and improve dissemination of methods. Models are applied to longitudinal height measurements in a cohort of 215 Peruvian children followed from birth until their fourth year of life. Unexplained variability, as measured by the variance of the regression model, was reduced from 7.34 when using ordinary least squares to 0.81 (p < 0.001) when using a linear mixed-effect models with random slopes and a first order continuous autoregressive error term. There was substantial heterogeneity in both the intercept (p < 0.001) and slopes (p < 0.001) of the individual growth trajectories. We also identified important serial correlation within the structure of the data (ρ = 0.66; 95 % CI 0.64 to 0.68; p < 0.001), which we modeled with a first order continuous autoregressive error term as evidenced by the variogram of the residuals and by a lack of association among residuals. The final model provides a parametric linear regression equation for both estimation and prediction of population- and individual-level growth in height. We show that cubic regression splines are superior to linear regression splines for the case of a small number of knots in both estimation and prediction with the full linear mixed effect model (AIC 19,352 vs. 19,598, respectively). While the regression parameters are more complex to interpret in the former, we argue that inference for any problem depends more on the estimated curve or differences in curves rather than the coefficients. Moreover, use of cubic regression splines provides biological meaningful growth velocity and acceleration curves despite increased complexity in coefficient interpretation. Through this stepwise approach, we provide a set of tools to model longitudinal childhood data for non-statisticians using linear mixed-effect models.
Prediction of monthly rainfall in Victoria, Australia: Clusterwise linear regression approach
NASA Astrophysics Data System (ADS)
Bagirov, Adil M.; Mahmood, Arshad; Barton, Andrew
2017-05-01
This paper develops the Clusterwise Linear Regression (CLR) technique for prediction of monthly rainfall. The CLR is a combination of clustering and regression techniques. It is formulated as an optimization problem and an incremental algorithm is designed to solve it. The algorithm is applied to predict monthly rainfall in Victoria, Australia using rainfall data with five input meteorological variables over the period of 1889-2014 from eight geographically diverse weather stations. The prediction performance of the CLR method is evaluated by comparing observed and predicted rainfall values using four measures of forecast accuracy. The proposed method is also compared with the CLR using the maximum likelihood framework by the expectation-maximization algorithm, multiple linear regression, artificial neural networks and the support vector machines for regression models using computational results. The results demonstrate that the proposed algorithm outperforms other methods in most locations.
Regression Model Term Selection for the Analysis of Strain-Gage Balance Calibration Data
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert Manfred; Volden, Thomas R.
2010-01-01
The paper discusses the selection of regression model terms for the analysis of wind tunnel strain-gage balance calibration data. Different function class combinations are presented that may be used to analyze calibration data using either a non-iterative or an iterative method. The role of the intercept term in a regression model of calibration data is reviewed. In addition, useful algorithms and metrics originating from linear algebra and statistics are recommended that will help an analyst (i) to identify and avoid both linear and near-linear dependencies between regression model terms and (ii) to make sure that the selected regression model of the calibration data uses only statistically significant terms. Three different tests are suggested that may be used to objectively assess the predictive capability of the final regression model of the calibration data. These tests use both the original data points and regression model independent confirmation points. Finally, data from a simplified manual calibration of the Ames MK40 balance is used to illustrate the application of some of the metrics and tests to a realistic calibration data set.
Perception and Modeling of Affective Qualities of Musical Instrument Sounds across Pitch Registers.
McAdams, Stephen; Douglas, Chelsea; Vempala, Naresh N
2017-01-01
Composers often pick specific instruments to convey a given emotional tone in their music, partly due to their expressive possibilities, but also due to their timbres in specific registers and at given dynamic markings. Of interest to both music psychology and music informatics from a computational point of view is the relation between the acoustic properties that give rise to the timbre at a given pitch and the perceived emotional quality of the tone. Musician and nonmusician listeners were presented with 137 tones produced at a fixed dynamic marking (forte) playing tones at pitch class D# across each instrument's entire pitch range and with different playing techniques for standard orchestral instruments drawn from the brass, woodwind, string, and pitched percussion families. They rated each tone on six analogical-categorical scales in terms of emotional valence (positive/negative and pleasant/unpleasant), energy arousal (awake/tired), tension arousal (excited/calm), preference (like/dislike), and familiarity. Linear mixed models revealed interactive effects of musical training, instrument family, and pitch register, with non-linear relations between pitch register and several dependent variables. Twenty-three audio descriptors from the Timbre Toolbox were computed for each sound and analyzed in two ways: linear partial least squares regression (PLSR) and nonlinear artificial neural net modeling. These two analyses converged in terms of the importance of various spectral, temporal, and spectrotemporal audio descriptors in explaining the emotion ratings, but some differences also emerged. Different combinations of audio descriptors make major contributions to the three emotion dimensions, suggesting that they are carried by distinct acoustic properties. Valence is more positive with lower spectral slopes, a greater emergence of strong partials, and an amplitude envelope with a sharper attack and earlier decay. Higher tension arousal is carried by brighter sounds, more spectral variation and more gentle attacks. Greater energy arousal is associated with brighter sounds, with higher spectral centroids and slower decrease of the spectral slope, as well as with greater spectral emergence. The divergences between linear and nonlinear approaches are discussed.
Perception and Modeling of Affective Qualities of Musical Instrument Sounds across Pitch Registers
McAdams, Stephen; Douglas, Chelsea; Vempala, Naresh N.
2017-01-01
Composers often pick specific instruments to convey a given emotional tone in their music, partly due to their expressive possibilities, but also due to their timbres in specific registers and at given dynamic markings. Of interest to both music psychology and music informatics from a computational point of view is the relation between the acoustic properties that give rise to the timbre at a given pitch and the perceived emotional quality of the tone. Musician and nonmusician listeners were presented with 137 tones produced at a fixed dynamic marking (forte) playing tones at pitch class D# across each instrument's entire pitch range and with different playing techniques for standard orchestral instruments drawn from the brass, woodwind, string, and pitched percussion families. They rated each tone on six analogical-categorical scales in terms of emotional valence (positive/negative and pleasant/unpleasant), energy arousal (awake/tired), tension arousal (excited/calm), preference (like/dislike), and familiarity. Linear mixed models revealed interactive effects of musical training, instrument family, and pitch register, with non-linear relations between pitch register and several dependent variables. Twenty-three audio descriptors from the Timbre Toolbox were computed for each sound and analyzed in two ways: linear partial least squares regression (PLSR) and nonlinear artificial neural net modeling. These two analyses converged in terms of the importance of various spectral, temporal, and spectrotemporal audio descriptors in explaining the emotion ratings, but some differences also emerged. Different combinations of audio descriptors make major contributions to the three emotion dimensions, suggesting that they are carried by distinct acoustic properties. Valence is more positive with lower spectral slopes, a greater emergence of strong partials, and an amplitude envelope with a sharper attack and earlier decay. Higher tension arousal is carried by brighter sounds, more spectral variation and more gentle attacks. Greater energy arousal is associated with brighter sounds, with higher spectral centroids and slower decrease of the spectral slope, as well as with greater spectral emergence. The divergences between linear and nonlinear approaches are discussed. PMID:28228741
Mason, Ross; Kapoor, Anil; Liu, Zhihui; Saarela, Olli; Tanguay, Simon; Jewett, Michael; Finelli, Antonio; Lacombe, Louis; Kawakami, Jun; Moore, Ronald; Morash, Christopher; Black, Peter; Rendon, Ricardo A
2016-11-01
Patients who undergo surgical management of renal cell carcinoma (RCC) are at risk for chronic kidney disease and its sequelae. This study describes the natural history of renal function after radical and partial nephrectomy and explores factors associated with postoperative decline in renal function. This is a multi-institutional cohort study of patients in the Canadian Kidney Cancer Information System who underwent partial or radical nephrectomy for RCC. Estimated glomerular filtration rate (eGFR) and stage of chronic kidney disease were determined preoperatively and at 3, 12, and 24 months postoperatively. Linear regression was used to determine the association between postoperative eGFR and type of surgery (radical vs. partial), duration of ischemia, ischemia type (warm vs. cold), and tumor size. With a median follow-up of 26 months, 1,379 patients were identified from the Canadian Kidney Cancer Information System database including 665 and 714 who underwent partial and radical nephrectomy, respectively. Patients undergoing radical nephrectomy had a lower eGFR (mean = 19ml/min/1.73m 2 lower) at 3, 12, and 24 months postoperatively (P<0.001). Decline in renal function occurred early and remained stable throughout follow-up. A lower preoperative eGFR and increasing age were also associated with a lower postoperative eGFR (P<0.01). Ischemia type and duration were not predictive of postoperative decline in eGFR (P>0.05). Severe renal failure (eGFR<30ml/min/1.73m 2 ) developed postoperatively in 12.5% and 4.1% of radical and partial nephrectomy patients, respectively (P<0.001). After the initial postoperative decline, renal function remains stable in patients undergoing surgery for RCC. Patients undergoing radical nephrectomy have a greater long-term reduction in renal function compared with those undergoing partial nephrectomy. Ischemia duration and type are not predictive of postoperative renal function when adhering to generally short ischemia durations. Copyright © 2016 Elsevier Inc. All rights reserved.
Ahlgren, André; Wirestam, Ronnie; Petersen, Esben Thade; Ståhlberg, Freddy; Knutsson, Linda
2014-09-01
Quantitative perfusion MRI based on arterial spin labeling (ASL) is hampered by partial volume effects (PVEs), arising due to voxel signal cross-contamination between different compartments. To address this issue, several partial volume correction (PVC) methods have been presented. Most previous methods rely on segmentation of a high-resolution T1 -weighted morphological image volume that is coregistered to the low-resolution ASL data, making the result sensitive to errors in the segmentation and coregistration. In this work, we present a methodology for partial volume estimation and correction, using only low-resolution ASL data acquired with the QUASAR sequence. The methodology consists of a T1 -based segmentation method, with no spatial priors, and a modified PVC method based on linear regression. The presented approach thus avoids prior assumptions about the spatial distribution of brain compartments, while also avoiding coregistration between different image volumes. Simulations based on a digital phantom as well as in vivo measurements in 10 volunteers were used to assess the performance of the proposed segmentation approach. The simulation results indicated that QUASAR data can be used for robust partial volume estimation, and this was confirmed by the in vivo experiments. The proposed PVC method yielded probable perfusion maps, comparable to a reference method based on segmentation of a high-resolution morphological scan. Corrected gray matter (GM) perfusion was 47% higher than uncorrected values, suggesting a significant amount of PVEs in the data. Whereas the reference method failed to completely eliminate the dependence of perfusion estimates on the volume fraction, the novel approach produced GM perfusion values independent of GM volume fraction. The intra-subject coefficient of variation of corrected perfusion values was lowest for the proposed PVC method. As shown in this work, low-resolution partial volume estimation in connection with ASL perfusion estimation is feasible, and provides a promising tool for decoupling perfusion and tissue volume. Copyright © 2014 John Wiley & Sons, Ltd.
Scoring and staging systems using cox linear regression modeling and recursive partitioning.
Lee, J W; Um, S H; Lee, J B; Mun, J; Cho, H
2006-01-01
Scoring and staging systems are used to determine the order and class of data according to predictors. Systems used for medical data, such as the Child-Turcotte-Pugh scoring and staging systems for ordering and classifying patients with liver disease, are often derived strictly from physicians' experience and intuition. We construct objective and data-based scoring/staging systems using statistical methods. We consider Cox linear regression modeling and recursive partitioning techniques for censored survival data. In particular, to obtain a target number of stages we propose cross-validation and amalgamation algorithms. We also propose an algorithm for constructing scoring and staging systems by integrating local Cox linear regression models into recursive partitioning, so that we can retain the merits of both methods such as superior predictive accuracy, ease of use, and detection of interactions between predictors. The staging system construction algorithms are compared by cross-validation evaluation of real data. The data-based cross-validation comparison shows that Cox linear regression modeling is somewhat better than recursive partitioning when there are only continuous predictors, while recursive partitioning is better when there are significant categorical predictors. The proposed local Cox linear recursive partitioning has better predictive accuracy than Cox linear modeling and simple recursive partitioning. This study indicates that integrating local linear modeling into recursive partitioning can significantly improve prediction accuracy in constructing scoring and staging systems.
Decomposing associations between acculturation and drinking in Mexican Americans
Mills, Britain A.; Caetano, Raul
2011-01-01
Background Acculturation to life in the United States is a known predictor of Hispanic drinking behavior. We compare the ability of 2 theoretical models of this effect – sociocultural theory and general stress theory – to account for associations between acculturation and drinking in a sample of Mexican Americans. Limitations of previous evaluations of these theoretical models are addressed by using a broader range of hypothesized cognitive mediators and a more direct measure of acculturative stress. In addition, we explore nonlinearities as possible underpinnings of attenuated acculturation effects among males. Methods Respondents (N = 2,595, current drinker N = 1,351) were interviewed as part of 2 recent multistage probability samples in a study of drinking behavior among Mexican Americans in the United States. The ability of norms, drinking motives, alcohol expectancies, and acculturation stress to account for relations between acculturation and drinking outcomes (volume and heavy drinking days) were assessed with a hierarchical linear regression strategy. Nonlinear trends were assessed by modeling quadratic effects of acculturation and acculturation stress on cognitive mediators and drinking outcomes. Results Consistent with previous findings, acculturation effects on drinking outcomes were stronger for females than males. Among females, only drinking motives explained acculturation associations with volume or heavy drinking days. Among males, acculturation was linked to increases in norms, and norms were positive predictors of drinking outcomes. However, adjusted effects of acculturation were non-existent or trending in a negative direction, which counter-acted this indirect normative influence. Acculturation stress did not explain positive associations between acculturation and drinking. Conclusions Stress and alcohol outcome expectancies play little role in the positive linear association between acculturation and drinking outcomes, but drinking motives appears to at least partially account for this effect. Consistent with recent reports, these results challenge stress models of linear acculturation effects on drinking outcomes and provide (partial) support for sociocultural models. Inconsistent mediation patterns – rather than nonlinearities – represented a more plausible statistical description of why acculturation-drinking associations are weakened among males. PMID:22316139
Decomposing associations between acculturation and drinking in Mexican Americans.
Mills, Britain A; Caetano, Raul
2012-07-01
Acculturation to life in the United States is a known predictor of Hispanic drinking behavior. We compare the ability of 2 theoretical models of this effect-sociocultural theory and general stress theory-to account for associations between acculturation and drinking in a sample of Mexican Americans. Limitations of previous evaluations of these theoretical models are addressed using a broader range of hypothesized cognitive mediators and a more direct measure of acculturative stress. In addition, we explore nonlinearities as possible underpinnings of attenuated acculturation effects among men. Respondents (N = 2,595, current drinker N = 1,351) were interviewed as part of 2 recent multistage probability samples in a study of drinking behavior among Mexican Americans in the United States. The ability of norms, drinking motives, alcohol expectancies, and acculturation stress to account for relations between acculturation and drinking outcomes (volume and heavy drinking days) were assessed with a hierarchical linear regression strategy. Nonlinear trends were assessed by modeling quadratic effects of acculturation and acculturation stress on cognitive mediators and drinking outcomes. Consistent with previous findings, acculturation effects on drinking outcomes were stronger for women than men. Among women, only drinking motives explained acculturation associations with volume or heavy drinking days. Among men, acculturation was linked to increases in norms, and norms were positive predictors of drinking outcomes. However, adjusted effects of acculturation were nonexistent or trending in a negative direction, which counteracted this indirect normative influence. Acculturation stress did not explain the positive associations between acculturation and drinking. Stress and alcohol outcome expectancies play little role in the positive linear association between acculturation and drinking outcomes, but drinking motives appear to at least partially account for this effect. Consistent with recent reports, these results challenge stress models of linear acculturation effects on drinking outcomes and provide (partial) support for sociocultural models. Inconsistent mediation patterns-rather than nonlinearities-represented a more plausible statistical description of why acculturation-drinking associations are weakened among men. Copyright © 2012 by the Research Society on Alcoholism.
Rasouli, Zolaikha; Ghavami, Raouf
2016-08-05
Vanillin (VA), vanillic acid (VAI) and syringaldehyde (SIA) are important food additives as flavor enhancers. The current study for the first time is devote to the application of partial least square (PLS-1), partial robust M-regression (PRM) and feed forward neural networks (FFNNs) as linear and nonlinear chemometric methods for the simultaneous detection of binary and ternary mixtures of VA, VAI and SIA using data extracted directly from UV-spectra with overlapped peaks of individual analytes. Under the optimum experimental conditions, for each compound a linear calibration was obtained in the concentration range of 0.61-20.99 [LOD=0.12], 0.67-23.19 [LOD=0.13] and 0.73-25.12 [LOD=0.15] μgmL(-1) for VA, VAI and SIA, respectively. Four calibration sets of standard samples were designed by combination of a full and fractional factorial designs with the use of the seven and three levels for each factor for binary and ternary mixtures, respectively. The results of this study reveal that both the methods of PLS-1 and PRM are similar in terms of predict ability each binary mixtures. The resolution of ternary mixture has been accomplished by FFNNs. Multivariate curve resolution-alternating least squares (MCR-ALS) was applied for the description of spectra from the acid-base titration systems each individual compound, i.e. the resolution of the complex overlapping spectra as well as to interpret the extracted spectral and concentration profiles of any pure chemical species identified. Evolving factor analysis (EFA) and singular value decomposition (SVD) were used to distinguish the number of chemical species. Subsequently, their corresponding dissociation constants were derived. Finally, FFNNs has been used to detection active compounds in real and spiked water samples. Copyright © 2016 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Rasouli, Zolaikha; Ghavami, Raouf
2016-08-01
Vanillin (VA), vanillic acid (VAI) and syringaldehyde (SIA) are important food additives as flavor enhancers. The current study for the first time is devote to the application of partial least square (PLS-1), partial robust M-regression (PRM) and feed forward neural networks (FFNNs) as linear and nonlinear chemometric methods for the simultaneous detection of binary and ternary mixtures of VA, VAI and SIA using data extracted directly from UV-spectra with overlapped peaks of individual analytes. Under the optimum experimental conditions, for each compound a linear calibration was obtained in the concentration range of 0.61-20.99 [LOD = 0.12], 0.67-23.19 [LOD = 0.13] and 0.73-25.12 [LOD = 0.15] μg mL- 1 for VA, VAI and SIA, respectively. Four calibration sets of standard samples were designed by combination of a full and fractional factorial designs with the use of the seven and three levels for each factor for binary and ternary mixtures, respectively. The results of this study reveal that both the methods of PLS-1 and PRM are similar in terms of predict ability each binary mixtures. The resolution of ternary mixture has been accomplished by FFNNs. Multivariate curve resolution-alternating least squares (MCR-ALS) was applied for the description of spectra from the acid-base titration systems each individual compound, i.e. the resolution of the complex overlapping spectra as well as to interpret the extracted spectral and concentration profiles of any pure chemical species identified. Evolving factor analysis (EFA) and singular value decomposition (SVD) were used to distinguish the number of chemical species. Subsequently, their corresponding dissociation constants were derived. Finally, FFNNs has been used to detection active compounds in real and spiked water samples.
Scarneciu, Camelia C; Sangeorzan, Livia; Rus, Horatiu; Scarneciu, Vlad D; Varciu, Mihai S; Andreescu, Oana; Scarneciu, Ioan
2017-01-01
This study aimed at assessing the incidence of pulmonary hypertension (PH) at newly diagnosed hyperthyroid patients and at finding a simple model showing the complex functional relation between pulmonary hypertension in hyperthyroidism and the factors causing it. The 53 hyperthyroid patients (H-group) were evaluated mainly by using an echocardiographical method and compared with 35 euthyroid (E-group) and 25 healthy people (C-group). In order to identify the factors causing pulmonary hypertension the statistical method of comparing the values of arithmetical means is used. The functional relation between the two random variables (PAPs and each of the factors determining it within our research study) can be expressed by linear or non-linear function. By applying the linear regression method described by a first-degree equation the line of regression (linear model) has been determined; by applying the non-linear regression method described by a second degree equation, a parabola-type curve of regression (non-linear or polynomial model) has been determined. We made the comparison and the validation of these two models by calculating the determination coefficient (criterion 1), the comparison of residuals (criterion 2), application of AIC criterion (criterion 3) and use of F-test (criterion 4). From the H-group, 47% have pulmonary hypertension completely reversible when obtaining euthyroidism. The factors causing pulmonary hypertension were identified: previously known- level of free thyroxin, pulmonary vascular resistance, cardiac output; new factors identified in this study- pretreatment period, age, systolic blood pressure. According to the four criteria and to the clinical judgment, we consider that the polynomial model (graphically parabola- type) is better than the linear one. The better model showing the functional relation between the pulmonary hypertension in hyperthyroidism and the factors identified in this study is given by a polynomial equation of second degree where the parabola is its graphical representation.
Judd, Kevin
2013-12-01
Many physical and biochemical systems are well modelled as a network of identical non-linear dynamical elements with linear coupling between them. An important question is how network structure affects chaotic dynamics, for example, by patterns of synchronisation and coherence. It is shown that small networks can be characterised precisely into patterns of exact synchronisation and large networks characterised by partial synchronisation at the local and global scale. Exact synchronisation modes are explained using tools of symmetry groups and invariance, and partial synchronisation is explained by finite-time shadowing of exact synchronisation modes.
As a fast and effective technique, the multiple linear regression (MLR) method has been widely used in modeling and prediction of beach bacteria concentrations. Among previous works on this subject, however, several issues were insufficiently or inconsistently addressed. Those is...
A simplified competition data analysis for radioligand specific activity determination.
Venturino, A; Rivera, E S; Bergoc, R M; Caro, R A
1990-01-01
Non-linear regression and two-step linear fit methods were developed to determine the actual specific activity of 125I-ovine prolactin by radioreceptor self-displacement analysis. The experimental results obtained by the different methods are superposable. The non-linear regression method is considered to be the most adequate procedure to calculate the specific activity, but if its software is not available, the other described methods are also suitable.
Height and Weight Estimation From Anthropometric Measurements Using Machine Learning Regressions
Fernandes, Bruno J. T.; Roque, Alexandre
2018-01-01
Height and weight are measurements explored to tracking nutritional diseases, energy expenditure, clinical conditions, drug dosages, and infusion rates. Many patients are not ambulant or may be unable to communicate, and a sequence of these factors may not allow accurate estimation or measurements; in those cases, it can be estimated approximately by anthropometric means. Different groups have proposed different linear or non-linear equations which coefficients are obtained by using single or multiple linear regressions. In this paper, we present a complete study of the application of different learning models to estimate height and weight from anthropometric measurements: support vector regression, Gaussian process, and artificial neural networks. The predicted values are significantly more accurate than that obtained with conventional linear regressions. In all the cases, the predictions are non-sensitive to ethnicity, and to gender, if more than two anthropometric parameters are analyzed. The learning model analysis creates new opportunities for anthropometric applications in industry, textile technology, security, and health care. PMID:29651366
NASA Astrophysics Data System (ADS)
Samhouri, M.; Al-Ghandoor, A.; Fouad, R. H.
2009-08-01
In this study two techniques, for modeling electricity consumption of the Jordanian industrial sector, are presented: (i) multivariate linear regression and (ii) neuro-fuzzy models. Electricity consumption is modeled as function of different variables such as number of establishments, number of employees, electricity tariff, prevailing fuel prices, production outputs, capacity utilizations, and structural effects. It was found that industrial production and capacity utilization are the most important variables that have significant effect on future electrical power demand. The results showed that both the multivariate linear regression and neuro-fuzzy models are generally comparable and can be used adequately to simulate industrial electricity consumption. However, comparison that is based on the square root average squared error of data suggests that the neuro-fuzzy model performs slightly better for future prediction of electricity consumption than the multivariate linear regression model. Such results are in full agreement with similar work, using different methods, for other countries.
Carvalho, Carlos; Gomes, Danielo G.; Agoulmine, Nazim; de Souza, José Neuman
2011-01-01
This paper proposes a method based on multivariate spatial and temporal correlation to improve prediction accuracy in data reduction for Wireless Sensor Networks (WSN). Prediction of data not sent to the sink node is a technique used to save energy in WSNs by reducing the amount of data traffic. However, it may not be very accurate. Simulations were made involving simple linear regression and multiple linear regression functions to assess the performance of the proposed method. The results show a higher correlation between gathered inputs when compared to time, which is an independent variable widely used for prediction and forecasting. Prediction accuracy is lower when simple linear regression is used, whereas multiple linear regression is the most accurate one. In addition to that, our proposal outperforms some current solutions by about 50% in humidity prediction and 21% in light prediction. To the best of our knowledge, we believe that we are probably the first to address prediction based on multivariate correlation for WSN data reduction. PMID:22346626
NASA Astrophysics Data System (ADS)
Darwish, Hany W.; Hassan, Said A.; Salem, Maissa Y.; El-Zeany, Badr A.
2013-09-01
Four simple, accurate and specific methods were developed and validated for the simultaneous estimation of Amlodipine (AML), Valsartan (VAL) and Hydrochlorothiazide (HCT) in commercial tablets. The derivative spectrophotometric methods include Derivative Ratio Zero Crossing (DRZC) and Double Divisor Ratio Spectra-Derivative Spectrophotometry (DDRS-DS) methods, while the multivariate calibrations used are Principal Component Regression (PCR) and Partial Least Squares (PLSs). The proposed methods were applied successfully in the determination of the drugs in laboratory-prepared mixtures and in commercial pharmaceutical preparations. The validity of the proposed methods was assessed using the standard addition technique. The linearity of the proposed methods is investigated in the range of 2-32, 4-44 and 2-20 μg/mL for AML, VAL and HCT, respectively.
[Job Demands-Resources, exhaustion and work engagement in a long-term care institution].
Conway, P M; Neri, L; Campanini, P; Francioli, L; Camerino, D; Punzi, S; Fichera, G P; Costa, G
2012-01-01
In this study, we aimed at testing the main hypotheses of the Job Demands-Resources model (JD-R) in a sample of employees (n = 205, mainly healthcare workers) of a long-term care institution located in Northern Italy. Hierarchical linear regression analyses show that almost all job demands considered were significantly associated with higher general psycho-physical exhaustion (beta ranging from 0.14 to 0.29), whereas more unfavourable scores in all job resources were associated with lower work engagement (from -0.27 to -0.51). However, also significant cross-over associations were observed, mainly between job resources and exhaustion, with effect sizes comparable with those found for the relationships between job demands and exhaustion. Hence, our study only partially supports the JD-R model. Implications of results for work-related stress management are finally discussed.
Using the Ridge Regression Procedures to Estimate the Multiple Linear Regression Coefficients
NASA Astrophysics Data System (ADS)
Gorgees, HazimMansoor; Mahdi, FatimahAssim
2018-05-01
This article concerns with comparing the performance of different types of ordinary ridge regression estimators that have been already proposed to estimate the regression parameters when the near exact linear relationships among the explanatory variables is presented. For this situations we employ the data obtained from tagi gas filling company during the period (2008-2010). The main result we reached is that the method based on the condition number performs better than other methods since it has smaller mean square error (MSE) than the other stated methods.
Weichenthal, Scott; Ryswyk, Keith Van; Goldstein, Alon; Bagg, Scott; Shekkarizfard, Maryam; Hatzopoulou, Marianne
2016-04-01
Existing evidence suggests that ambient ultrafine particles (UFPs) (<0.1µm) may contribute to acute cardiorespiratory morbidity. However, few studies have examined the long-term health effects of these pollutants owing in part to a need for exposure surfaces that can be applied in large population-based studies. To address this need, we developed a land use regression model for UFPs in Montreal, Canada using mobile monitoring data collected from 414 road segments during the summer and winter months between 2011 and 2012. Two different approaches were examined for model development including standard multivariable linear regression and a machine learning approach (kernel-based regularized least squares (KRLS)) that learns the functional form of covariate impacts on ambient UFP concentrations from the data. The final models included parameters for population density, ambient temperature and wind speed, land use parameters (park space and open space), length of local roads and rail, and estimated annual average NOx emissions from traffic. The final multivariable linear regression model explained 62% of the spatial variation in ambient UFP concentrations whereas the KRLS model explained 79% of the variance. The KRLS model performed slightly better than the linear regression model when evaluated using an external dataset (R(2)=0.58 vs. 0.55) or a cross-validation procedure (R(2)=0.67 vs. 0.60). In general, our findings suggest that the KRLS approach may offer modest improvements in predictive performance compared to standard multivariable linear regression models used to estimate spatial variations in ambient UFPs. However, differences in predictive performance were not statistically significant when evaluated using the cross-validation procedure. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.
Application of near-infrared spectroscopy for the rapid quality assessment of Radix Paeoniae Rubra
NASA Astrophysics Data System (ADS)
Zhan, Hao; Fang, Jing; Tang, Liying; Yang, Hongjun; Li, Hua; Wang, Zhuju; Yang, Bin; Wu, Hongwei; Fu, Meihong
2017-08-01
Near-infrared (NIR) spectroscopy with multivariate analysis was used to quantify gallic acid, catechin, albiflorin, and paeoniflorin in Radix Paeoniae Rubra, and the feasibility to classify the samples originating from different areas was investigated. A new high-performance liquid chromatography method was developed and validated to analyze gallic acid, catechin, albiflorin, and paeoniflorin in Radix Paeoniae Rubra as the reference. Partial least squares (PLS), principal component regression (PCR), and stepwise multivariate linear regression (SMLR) were performed to calibrate the regression model. Different data pretreatments such as derivatives (1st and 2nd), multiplicative scatter correction, standard normal variate, Savitzky-Golay filter, and Norris derivative filter were applied to remove the systematic errors. The performance of the model was evaluated according to the root mean square of calibration (RMSEC), root mean square error of prediction (RMSEP), root mean square error of cross-validation (RMSECV), and correlation coefficient (r). The results show that compared to PCR and SMLR, PLS had a lower RMSEC, RMSECV, and RMSEP and higher r for all the four analytes. PLS coupled with proper pretreatments showed good performance in both the fitting and predicting results. Furthermore, the original areas of Radix Paeoniae Rubra samples were partly distinguished by principal component analysis. This study shows that NIR with PLS is a reliable, inexpensive, and rapid tool for the quality assessment of Radix Paeoniae Rubra.
Alzheimer's Disease Detection by Pseudo Zernike Moment and Linear Regression Classification.
Wang, Shui-Hua; Du, Sidan; Zhang, Yin; Phillips, Preetha; Wu, Le-Nan; Chen, Xian-Qing; Zhang, Yu-Dong
2017-01-01
This study presents an improved method based on "Gorji et al. Neuroscience. 2015" by introducing a relatively new classifier-linear regression classification. Our method selects one axial slice from 3D brain image, and employed pseudo Zernike moment with maximum order of 15 to extract 256 features from each image. Finally, linear regression classification was harnessed as the classifier. The proposed approach obtains an accuracy of 97.51%, a sensitivity of 96.71%, and a specificity of 97.73%. Our method performs better than Gorji's approach and five other state-of-the-art approaches. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
NASA Astrophysics Data System (ADS)
Das, Bappa; Sahoo, Rabi N.; Pargal, Sourabh; Krishna, Gopal; Verma, Rakesh; Chinnusamy, Viswanathan; Sehgal, Vinay K.; Gupta, Vinod K.; Dash, Sushanta K.; Swain, Padmini
2018-03-01
In the present investigation, the changes in sucrose, reducing and total sugar content due to water-deficit stress in rice leaves were modeled using visible, near infrared (VNIR) and shortwave infrared (SWIR) spectroscopy. The objectives of the study were to identify the best vegetation indices and suitable multivariate technique based on precise analysis of hyperspectral data (350 to 2500 nm) and sucrose, reducing sugar and total sugar content measured at different stress levels from 16 different rice genotypes. Spectral data analysis was done to identify suitable spectral indices and models for sucrose estimation. Novel spectral indices in near infrared (NIR) range viz. ratio spectral index (RSI) and normalised difference spectral indices (NDSI) sensitive to sucrose, reducing sugar and total sugar content were identified which were subsequently calibrated and validated. The RSI and NDSI models had R2 values of 0.65, 0.71 and 0.67; RPD values of 1.68, 1.95 and 1.66 for sucrose, reducing sugar and total sugar, respectively for validation dataset. Different multivariate spectral models such as artificial neural network (ANN), multivariate adaptive regression splines (MARS), multiple linear regression (MLR), partial least square regression (PLSR), random forest regression (RFR) and support vector machine regression (SVMR) were also evaluated. The best performing multivariate models for sucrose, reducing sugars and total sugars were found to be, MARS, ANN and MARS, respectively with respect to RPD values of 2.08, 2.44, and 1.93. Results indicated that VNIR and SWIR spectroscopy combined with multivariate calibration can be used as a reliable alternative to conventional methods for measurement of sucrose, reducing sugars and total sugars of rice under water-deficit stress as this technique is fast, economic, and noninvasive.
Kwan, Johnny S H; Kung, Annie W C; Sham, Pak C
2011-09-01
Selective genotyping can increase power in quantitative trait association. One example of selective genotyping is two-tail extreme selection, but simple linear regression analysis gives a biased genetic effect estimate. Here, we present a simple correction for the bias.
Gröbner Bases and Generation of Difference Schemes for Partial Differential Equations
NASA Astrophysics Data System (ADS)
Gerdt, Vladimir P.; Blinkov, Yuri A.; Mozzhilkin, Vladimir V.
2006-05-01
In this paper we present an algorithmic approach to the generation of fully conservative difference schemes for linear partial differential equations. The approach is based on enlargement of the equations in their integral conservation law form by extra integral relations between unknown functions and their derivatives, and on discretization of the obtained system. The structure of the discrete system depends on numerical approximation methods for the integrals occurring in the enlarged system. As a result of the discretization, a system of linear polynomial difference equations is derived for the unknown functions and their partial derivatives. A difference scheme is constructed by elimination of all the partial derivatives. The elimination can be achieved by selecting a proper elimination ranking and by computing a Gröbner basis of the linear difference ideal generated by the polynomials in the discrete system. For these purposes we use the difference form of Janet-like Gröbner bases and their implementation in Maple. As illustration of the described methods and algorithms, we construct a number of difference schemes for Burgers and Falkowich-Karman equations and discuss their numerical properties.
Peripheral Refraction, Peripheral Eye Length, and Retinal Shape in Myopia.
Verkicharla, Pavan K; Suheimat, Marwan; Schmid, Katrina L; Atchison, David A
2016-09-01
To investigate how peripheral refraction and peripheral eye length are related to retinal shape. Relative peripheral refraction (RPR) and relative peripheral eye length (RPEL) were determined in 36 young adults (M +0.75D to -5.25D) along horizontal and vertical visual field meridians out to ±35° and ±30°, respectively. Retinal shape was determined in terms of vertex radius of curvature Rv, asphericity Q, and equivalent radius of curvature REq using a partial coherence interferometry method involving peripheral eye lengths and model eye raytracing. Second-order polynomial fits were applied to RPR and RPEL as functions of visual field position. Linear regressions were determined for the fits' second order coefficients and for retinal shape estimates as functions of central spherical refraction. Linear regressions investigated relationships of RPR and RPEL with retinal shape estimates. Peripheral refraction, peripheral eye lengths, and retinal shapes were significantly affected by meridian and refraction. More positive (hyperopic) relative peripheral refraction, more negative RPELs, and steeper retinas were found along the horizontal than along the vertical meridian and in myopes than in emmetropes. RPR and RPEL, as represented by their second-order fit coefficients, correlated significantly with retinal shape represented by REq. Effects of meridian and refraction on RPR and RPEL patterns are consistent with effects on retinal shape. Patterns derived from one of these predict the others: more positive (hyperopic) RPR predicts more negative RPEL and steeper retinas, more negative RPEL predicts more positive relative peripheral refraction and steeper retinas, and steeper retinas derived from peripheral eye lengths predict more positive RPR.
Glutamatergic system abnormalities in posttraumatic stress disorder.
Nishi, Daisuke; Hashimoto, Kenji; Noguchi, Hiroko; Hamazaki, Kei; Hamazaki, Tomohito; Matsuoka, Yutaka
2015-12-01
Accumulating evidence suggests involvement of the glutamatergic system in the biological mechanisms of posttraumatic stress disorder (PTSD), but few studies have demonstrated an association between glutamatergic system abnormalities and PTSD diagnosis or severity. We aimed to examine whether abnormalities in serum glutamate and in the glutamine/glutamate ratio were associated with PTSD diagnosis and severity in severely injured patients at risk for PTSD and major depressive disorder (MDD). This is a nested case-control study in TPOP (Tachikawa project for prevention of posttraumatic stress disorder with polyunsaturated fatty acid) trial. Diagnosis and severity of PTSD were assessed 3 months after the accidents using the Clinician-Administered PTSD Scale. The associations of glutamate levels and the glutamine/glutamate ratio with diagnosis and severity of PTSD and MDD were investigated by univariate and multiple linear regression analyses. Ninety-seven of 110 participants (88 %) completed assessments at 3 months. Serum glutamate levels were significantly higher for participants with full or partial PTSD than for participants without PTSD (p = 0.049) and for participants with MDD than for participants without MDD (p = 0.048). Multiple linear regression analyses showed serum glutamate levels were significantly positively associated with PTSD severity (p = 0.02) and MDD severity (p = 0.03). The glutamine/glutamate ratio was also significantly inversely associated with PTSD severity (p = 0.03), but not with MDD severity (p = 0.07). These findings suggest that the glutamatergic system may play a major role in the pathogenesis of PTSD and the need for new treatments targeting the glutamatergic system to be developed for PTSD.
2013-01-01
application of the Hammett equation with the constants rph in the chemistry of organophosphorus compounds, Russ. Chem. Rev. 38 (1969) 795–811. [13...of oximes and OP compounds and the ability of oximes to reactivate OP- inhibited AChE. Multiple linear regression equations were analyzed using...phosphonate pairs, 21 oxime/ phosphoramidate pairs and 12 oxime/phosphate pairs. The best linear regression equation resulting from multiple regression anal
USDA-ARS?s Scientific Manuscript database
In multivariate regression analysis of spectroscopy data, spectral preprocessing is often performed to reduce unwanted background information (offsets, sloped baselines) or accentuate absorption features in intrinsically overlapping bands. These procedures, also known as pretreatments, are commonly ...
NASA Astrophysics Data System (ADS)
Hegazy, Maha A.; Lotfy, Hayam M.; Rezk, Mamdouh R.; Omran, Yasmin Rostom
2015-04-01
Smart and novel spectrophotometric and chemometric methods have been developed and validated for the simultaneous determination of a binary mixture of chloramphenicol (CPL) and dexamethasone sodium phosphate (DSP) in presence of interfering substances without prior separation. The first method depends upon derivative subtraction coupled with constant multiplication. The second one is ratio difference method at optimum wavelengths which were selected after applying derivative transformation method via multiplying by a decoding spectrum in order to cancel the contribution of non labeled interfering substances. The third method relies on partial least squares with regression model updating. They are so simple that they do not require any preliminary separation steps. Accuracy, precision and linearity ranges of these methods were determined. Moreover, specificity was assessed by analyzing synthetic mixtures of both drugs. The proposed methods were successfully applied for analysis of both drugs in their pharmaceutical formulation. The obtained results have been statistically compared to that of an official spectrophotometric method to give a conclusion that there is no significant difference between the proposed methods and the official ones with respect to accuracy and precision.
Fernández-Novales, Juan; López, María-Isabel; González-Caballero, Virginia; Ramírez, Pilar; Sánchez, María-Teresa
2011-06-01
Volumic mass-a key component of must quality control tests during alcoholic fermentation-is of great interest to the winemaking industry. Transmitance near-infrared (NIR) spectra of 124 must samples over the range of 200-1,100-nm were obtained using a miniature spectrometer. The performance of this instrument to predict volumic mass was evaluated using partial least squares (PLS) regression and multiple linear regression (MLR). The validation statistics coefficient of determination (r(2)) and the standard error of prediction (SEP) were r(2) = 0.98, n = 31 and r(2) = 0.96, n = 31, and SEP = 5.85 and 7.49 g/dm(3) for PLS and MLR equations developed to fit reference data for volumic mass and spectral data. Comparison of results from MLR and PLS demonstrates that a MLR model with six significant wavelengths (P < 0.05) fit volumic mass data to transmittance (1/T) data slightly worse than a more sophisticated PLS model using the full scanning range. The results suggest that NIR spectroscopy is a suitable technique for predicting volumic mass during alcoholic fermentation, and that a low-cost NIR instrument can be used for this purpose.
Consideration of species community composition in statistical ...
Diseases are increasing in marine ecosystems, and these increases have been attributed to a number of environmental factors including climate change, pollution, and overfishing. However, many studies pool disease prevalence into taxonomic groups, disregarding host species composition when comparing sites or assessing environmental impacts on patterns of disease presence. We used simulated data under a known environmental effect to assess the ability of standard statistical methods (binomial and linear regression, ANOVA) to detect a significant environmental effect on pooled disease prevalence with varying species abundance distributions and relative susceptibilities to disease. When one species was more susceptible to a disease and both species only partially overlapped in their distributions, models tended to produce a greater number of false positives (Type I error). Differences in disease risk between regions or along an environmental gradient tended to be underestimated, or even in the wrong direction, when highly susceptible taxa had reduced abundances in impacted sites, a situation likely to be common in nature. Including relative abundance as an additional variable in regressions improved model accuracy, but tended to be conservative, producing more false negatives (Type II error) when species abundance was strongly correlated with the environmental effect. Investigators should be cautious of underlying assumptions of species similarity in susceptib
Indrehus, Oddny; Aralt, Tor Tybring
2005-04-01
Aerosol, NO and CO concentration, temperature, air humidity, air flow and number of running ventilation fans were measured by continuous analysers every minute for a whole week for six different one-week periods spread over ten months in 2001 and 2002 at measuring stations in the 7860 m long tunnel. The ventilation control system was mainly based on aerosol measurements taken by optical scatter sensors. The ventilation turned out to be satisfactory according to Norwegian air quality standards for road tunnels; however, there was some uncertainty concerning the NO2 levels. The air humidity and temperature inside the tunnel were highly influenced by the outside metrological conditions. Statistical models for NO concentration were developed and tested; correlations between predicted and measured NO were 0.81 for a partial least squares regression (PLS1) model based on CO and aerosol, and 0.77 for a linear regression model based only on aerosol. Hence, the ventilation control system should not solely be based on aerosol measurements. Since NO2 is the hazardous polluter, modelling NO2 concentration rather than NO should be preferred in any further optimising of the ventilation control.
Párta, László; Zalai, Dénes; Borbély, Sándor; Putics, Akos
2014-02-01
The application of dielectric spectroscopy was frequently investigated as an on-line cell culture monitoring tool; however, it still requires supportive data and experience in order to become a robust technique. In this study, dielectric spectroscopy was used to predict viable cell density (VCD) at industrially relevant high levels in concentrated fed-batch culture of Chinese hamster ovary cells producing a monoclonal antibody for pharmaceutical purposes. For on-line dielectric spectroscopy measurements, capacitance was scanned within a wide range of frequency values (100-19,490 kHz) in six parallel cell cultivation batches. Prior to detailed mathematical analysis of the collected data, principal component analysis (PCA) was applied to compare dielectric behavior of the cultivations. PCA analysis resulted in detecting measurement disturbances. By using the measured spectroscopic data, partial least squares regression (PLS), Cole-Cole, and linear modeling were applied and compared in order to predict VCD. The Cole-Cole and the PLS model provided reliable prediction over the entire cultivation including both the early and decline phases of cell growth, while the linear model failed to estimate VCD in the later, declining cultivation phase. In regards to the measurement error sensitivity, remarkable differences were shown among PLS, Cole-Cole, and linear modeling. VCD prediction accuracy could be improved in the runs with measurement disturbances by first derivative pre-treatment in PLS and by parameter optimization of the Cole-Cole modeling.
Jin, Xiaoli; Shi, Chunhai; Yu, Chang Yeon; ...
2017-05-19
Leaf water content is one of the most common physiological parameters limiting efficiency of photosynthesis and biomass productivity in plants including Miscanthus. Therefore, it is of great significance to determine or predict the water content quickly and non-destructively. In this study, we explored the relationship between leaf water content and diffuse reflectance spectra in Miscanthus. Three multivariate calibrations including partial least squares (PLS), least squares support vector machine regression (LSSVR), and radial basis function (RBF) neural network (NN) were developed for the models of leaf water content determination. The non-linear models including RBF_LSSVR and RBF_NN showed higher accuracy than themore » PLS and Lin_LSSVR models. Moreover, 75 sensitive wavelengths were identified to be closely associated with the leaf water content in Miscanthus. The RBF_LSSVR and RBF_NN models for predicting leaf water content, based on 75 characteristic wavelengths, obtained the high determination coefficients of 0.9838 and 0.9899, respectively. The results indicated the non-linear models were more accurate than the linear models using both wavelength intervals. These results demonstrated that visible and near-infrared (VIS/NIR) spectroscopy combined with RBF_LSSVR or RBF_NN is a useful, non-destructive tool for determinations of the leaf water content in Miscanthus, and thus very helpful for development of drought-resistant varieties in Miscanthus.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jin, Xiaoli; Shi, Chunhai; Yu, Chang Yeon
Leaf water content is one of the most common physiological parameters limiting efficiency of photosynthesis and biomass productivity in plants including Miscanthus. Therefore, it is of great significance to determine or predict the water content quickly and non-destructively. In this study, we explored the relationship between leaf water content and diffuse reflectance spectra in Miscanthus. Three multivariate calibrations including partial least squares (PLS), least squares support vector machine regression (LSSVR), and radial basis function (RBF) neural network (NN) were developed for the models of leaf water content determination. The non-linear models including RBF_LSSVR and RBF_NN showed higher accuracy than themore » PLS and Lin_LSSVR models. Moreover, 75 sensitive wavelengths were identified to be closely associated with the leaf water content in Miscanthus. The RBF_LSSVR and RBF_NN models for predicting leaf water content, based on 75 characteristic wavelengths, obtained the high determination coefficients of 0.9838 and 0.9899, respectively. The results indicated the non-linear models were more accurate than the linear models using both wavelength intervals. These results demonstrated that visible and near-infrared (VIS/NIR) spectroscopy combined with RBF_LSSVR or RBF_NN is a useful, non-destructive tool for determinations of the leaf water content in Miscanthus, and thus very helpful for development of drought-resistant varieties in Miscanthus.« less
Qiu, F Y; Tian, R L; Qiang, Y; He, K P; Liu, H R; Zhang, W; Song, H
2016-05-01
To investigate the relationship between occupational chronic psychological stress with heat shock protein 70 (HSP70) and tumor necrosis factor-alpha (TNF-α). Using case-control study design, we selected 622 cases in 20 to 60 years old and unrelated patients with metabolic syndrome as the case group between October 2011 and October 2012 at two hospitals of Ningxia hui autonomous region. At the same time, we selected 600 healthy people from health check-up crowd in the above two hospitals as control group. The the research objects were sex, age, nation, height, weight, smoking, drinking, exercise, and so on. After informed consent, all the research objects were collected fasting venous blood samples 10 ml in order to proceed laboratory testing of biochemical indicators. The expression of HSP70 and TNF-α in serum was determined by ELISA. Using the revised occupational stress inventory (OSI) to survey the occupational chronic psychological stress factors and stress level of research object. The correlation of occupational chronic psychological stress scores with HSP70 and TNF-α was investigated by partial correlation analysis. We built a multivariate linear regression equation With HSP70 and TNF alpha as the independent variable and occupational chronic psychological stress scores as the dependent variable, using equation of the determination coefficient R(2) to judge the degree of fitting equation. The total points of chronic stress factors in all respondents was (136.65±16.19). Among them, the mild stress level group was 313, moderate was 588, severe was 321, chronic heart stress factors scores were (119.96±13.30), (135.33±3.23), (155.33±13.55) points, respectively. In the case group subjects, the expression of HSP70 in mild, moderate and severe occupational chronic psychological stress levels were (29.88±30.08), (36.38±30.08), (27.16±23.77) ng/ml (F=6.85, P=0.001). The control group were (27.64±9.89), (39.78±29.77), (3.94±3.09) ng/ml (F=125.71, P<0.001). Multiple linear regression analysis showed the expression of psychological stress and HSP70 was a negative linear relationship, while positive linear relationship with TNF-α, the fitting of the regression equation was y=-0.07x1+ 0.011x2+ 136.88. Partial correlation analysis results showed that the occupational chronic psychological stress scores and negatively correlated with HSP70 (r=-0.11, P<0.001) and was positively related with TNF-α (r=0.11, P<0.001). In all survey respondents, the expression of HSP70 in mild, moderate and severe occupational chronic psychological stress levels group were (28.49±20.10), (37.99±29.96), (17.98±21.77) ng/ml (F=64.08, P<0.001). The expression of TNF-α were (133.61±129.51), (171.23±133.69), (169.31±196.09) pg/ml (F=6.93, P=0.001). The expression levels of HSP70 and TNF-α in serum were affected by occupational chronic psychological stress. While the level of occupational chronic psychological stress increased, the expression level of HSP70 in serum reduced, the expression level of TNF-α raised.
Specialization Agreements in the Council for Mutual Economic Assistance
1988-02-01
proportions to stabilize variance (S. Weisberg, Applied Linear Regression , 2nd ed., John Wiley & Sons, New York, 1985, p. 134). If the dependent...27, 1986, p. 3. Weisberg, S., Applied Linear Regression , 2nd ed., John Wiley & Sons, New York, 1985, p. 134. Wiles, P. J., Communist International
Radio Propagation Prediction Software for Complex Mixed Path Physical Channels
2006-08-14
63 4.4.6. Applied Linear Regression Analysis in the Frequency Range 1-50 MHz 69 4.4.7. Projected Scaling to...4.4.6. Applied Linear Regression Analysis in the Frequency Range 1-50 MHz In order to construct a comprehensive numerical algorithm capable of
Due to the complexity of the processes contributing to beach bacteria concentrations, many researchers rely on statistical modeling, among which multiple linear regression (MLR) modeling is most widely used. Despite its ease of use and interpretation, there may be time dependence...
Data Transformations for Inference with Linear Regression: Clarifications and Recommendations
ERIC Educational Resources Information Center
Pek, Jolynn; Wong, Octavia; Wong, C. M.
2017-01-01
Data transformations have been promoted as a popular and easy-to-implement remedy to address the assumption of normally distributed errors (in the population) in linear regression. However, the application of data transformations introduces non-ignorable complexities which should be fully appreciated before their implementation. This paper adds to…
USING LINEAR AND POLYNOMIAL MODELS TO EXAMINE THE ENVIRONMENTAL STABILITY OF VIRUSES
The article presents the development of model equations for describing the fate of viral infectivity in environmental samples. Most of the models were based upon the use of a two-step linear regression approach. The first step employs regression of log base 10 transformed viral t...
Identifying the Factors That Influence Change in SEBD Using Logistic Regression Analysis
ERIC Educational Resources Information Center
Camilleri, Liberato; Cefai, Carmel
2013-01-01
Multiple linear regression and ANOVA models are widely used in applications since they provide effective statistical tools for assessing the relationship between a continuous dependent variable and several predictors. However these models rely heavily on linearity and normality assumptions and they do not accommodate categorical dependent…
Simple and multiple linear regression: sample size considerations.
Hanley, James A
2016-11-01
The suggested "two subjects per variable" (2SPV) rule of thumb in the Austin and Steyerberg article is a chance to bring out some long-established and quite intuitive sample size considerations for both simple and multiple linear regression. This article distinguishes two of the major uses of regression models that imply very different sample size considerations, neither served well by the 2SPV rule. The first is etiological research, which contrasts mean Y levels at differing "exposure" (X) values and thus tends to focus on a single regression coefficient, possibly adjusted for confounders. The second research genre guides clinical practice. It addresses Y levels for individuals with different covariate patterns or "profiles." It focuses on the profile-specific (mean) Y levels themselves, estimating them via linear compounds of regression coefficients and covariates. By drawing on long-established closed-form variance formulae that lie beneath the standard errors in multiple regression, and by rearranging them for heuristic purposes, one arrives at quite intuitive sample size considerations for both research genres. Copyright © 2016 Elsevier Inc. All rights reserved.
Jiang, Feng; Han, Ji-zhong
2018-01-01
Cross-domain collaborative filtering (CDCF) solves the sparsity problem by transferring rating knowledge from auxiliary domains. Obviously, different auxiliary domains have different importance to the target domain. However, previous works cannot evaluate effectively the significance of different auxiliary domains. To overcome this drawback, we propose a cross-domain collaborative filtering algorithm based on Feature Construction and Locally Weighted Linear Regression (FCLWLR). We first construct features in different domains and use these features to represent different auxiliary domains. Thus the weight computation across different domains can be converted as the weight computation across different features. Then we combine the features in the target domain and in the auxiliary domains together and convert the cross-domain recommendation problem into a regression problem. Finally, we employ a Locally Weighted Linear Regression (LWLR) model to solve the regression problem. As LWLR is a nonparametric regression method, it can effectively avoid underfitting or overfitting problem occurring in parametric regression methods. We conduct extensive experiments to show that the proposed FCLWLR algorithm is effective in addressing the data sparsity problem by transferring the useful knowledge from the auxiliary domains, as compared to many state-of-the-art single-domain or cross-domain CF methods. PMID:29623088
Yu, Xu; Lin, Jun-Yu; Jiang, Feng; Du, Jun-Wei; Han, Ji-Zhong
2018-01-01
Cross-domain collaborative filtering (CDCF) solves the sparsity problem by transferring rating knowledge from auxiliary domains. Obviously, different auxiliary domains have different importance to the target domain. However, previous works cannot evaluate effectively the significance of different auxiliary domains. To overcome this drawback, we propose a cross-domain collaborative filtering algorithm based on Feature Construction and Locally Weighted Linear Regression (FCLWLR). We first construct features in different domains and use these features to represent different auxiliary domains. Thus the weight computation across different domains can be converted as the weight computation across different features. Then we combine the features in the target domain and in the auxiliary domains together and convert the cross-domain recommendation problem into a regression problem. Finally, we employ a Locally Weighted Linear Regression (LWLR) model to solve the regression problem. As LWLR is a nonparametric regression method, it can effectively avoid underfitting or overfitting problem occurring in parametric regression methods. We conduct extensive experiments to show that the proposed FCLWLR algorithm is effective in addressing the data sparsity problem by transferring the useful knowledge from the auxiliary domains, as compared to many state-of-the-art single-domain or cross-domain CF methods.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yang, Qichun; Zhang, Xuesong; Xu, Xingya
Riverine carbon cycling is an important, but insufficiently investigated component of the global carbon cycle. Analyses of environmental controls on riverine carbon cycling are critical for improved understanding of mechanisms regulating carbon processing and storage along the terrestrial-aquatic continuum. Here, we compile and analyze riverine dissolved organic carbon (DOC) concentration data from 1402 United States Geological Survey (USGS) gauge stations to examine the spatial variability and environmental controls of DOC concentrations in the United States (U.S.) surface waters. DOC concentrations exhibit high spatial variability, with an average of 6.42 ± 6.47 mg C/ L (Mean ± Standard Deviation). In general,more » high DOC concentrations occur in the Upper Mississippi River basin and the Southeastern U.S., while low concentrations are mainly distributed in the Western U.S. Single-factor analysis indicates that slope of drainage areas, wetlands, forests, percentage of first-order streams, and instream nutrients (such as nitrogen and phosphorus) pronouncedly influence DOC concentrations, but the explanatory power of each bivariate model is lower than 35%. Analyses based on the general multi-linear regression models suggest DOC concentrations are jointly impacted by multiple factors. Soil properties mainly show positive correlations with DOC concentrations; forest and shrub lands have positive correlations with DOC concentrations, but urban area and croplands demonstrate negative impacts; total instream phosphorus and dam density correlate positively with DOC concentrations. Notably, the relative importance of these environmental controls varies substantially across major U.S. water resource regions. In addition, DOC concentrations and environmental controls also show significant variability from small streams to large rivers, which may be caused by changing carbon sources and removal rates by river orders. In sum, our results reveal that general multi-linear regression analysis of twenty one terrestrial and aquatic environmental factors can partially explain (56%) the DOC concentration variation. In conclusion, this study highlights the complexity of the interactions among these environmental factors in determining DOC concentrations, thus calls for processes-based, non-linear methodologies to constrain uncertainties in riverine DOC cycling.« less
Time-varying trends of global vegetation activity
NASA Astrophysics Data System (ADS)
Pan, N.; Feng, X.; Fu, B.
2016-12-01
Vegetation plays an important role in regulating the energy change, water cycle and biochemical cycle in terrestrial ecosystems. Monitoring the dynamics of vegetation activity and understanding their driving factors have been an important issue in global change research. Normalized Difference Vegetation Index (NDVI), an indicator of vegetation activity, has been widely used in investigating vegetation changes at regional and global scales. Most studies utilized linear regression or piecewise linear regression approaches to obtain an averaged changing rate over a certain time span, with an implicit assumption that the trend didn't change over time during that period. However, no evidence shows that this assumption is right for the non-linear and non-stationary NDVI time series. In this study, we adopted the multidimensional ensemble empirical mode decomposition (MEEMD) method to extract the time-varying trends of NDVI from original signals without any a priori assumption of their functional form. Our results show that vegetation trends are spatially and temporally non-uniform during 1982-2013. Most vegetated area exhibited greening trends in the 1980s. Nevertheless, the area with greening trends decreased over time since the early 1990s, and the greening trends have stalled or even reversed in many places. Regions with browning trends were mainly located in southern low latitudes in the 1980s, whose area decreased before the middle 1990s and then increased at an accelerated rate. The greening-to-browning reversals were widespread across all continents except Oceania (43% of the vegetated areas), most of which happened after the middle 1990s. In contrast, the browning-to-greening reversals occurred in smaller area and earlier time. The area with monotonic greening and browning trends accounted for 33% and 5% of the vegetated area, respectively. By performing partial correlation analyses between NDVI and climatic elements (temperature, precipitation and cloud cover) and analyzing the MEEMD-extracted trends of these climatic elements, we discussed possible driving factors of the time-varying trends of NDVI in several specific regions where trend reversals occurred.
Li, Min; Zhang, Lu; Yao, Xiaolong; Jiang, Xingyu
2017-01-01
The emerging membrane introduction mass spectrometry technique has been successfully used to detect benzene, toluene, ethyl benzene and xylene (BTEX), while overlapped spectra have unfortunately hindered its further application to the analysis of mixtures. Multivariate calibration, an efficient method to analyze mixtures, has been widely applied. In this paper, we compared univariate and multivariate analyses for quantification of the individual components of mixture samples. The results showed that the univariate analysis creates poor models with regression coefficients of 0.912, 0.867, 0.440 and 0.351 for BTEX, respectively. For multivariate analysis, a comparison to the partial-least squares (PLS) model shows that the orthogonal partial-least squares (OPLS) regression exhibits an optimal performance with regression coefficients of 0.995, 0.999, 0.980 and 0.976, favorable calibration parameters (RMSEC and RMSECV) and a favorable validation parameter (RMSEP). Furthermore, the OPLS exhibits a good recovery of 73.86 - 122.20% and relative standard deviation (RSD) of the repeatability of 1.14 - 4.87%. Thus, MIMS coupled with the OPLS regression provides an optimal approach for a quantitative BTEX mixture analysis in monitoring and predicting water pollution.
Lin, Feng-Chang; Zhu, Jun
2012-01-01
We develop continuous-time models for the analysis of environmental or ecological monitoring data such that subjects are observed at multiple monitoring time points across space. Of particular interest are additive hazards regression models where the baseline hazard function can take on flexible forms. We consider time-varying covariates and take into account spatial dependence via autoregression in space and time. We develop statistical inference for the regression coefficients via partial likelihood. Asymptotic properties, including consistency and asymptotic normality, are established for parameter estimates under suitable regularity conditions. Feasible algorithms utilizing existing statistical software packages are developed for computation. We also consider a simpler additive hazards model with homogeneous baseline hazard and develop hypothesis testing for homogeneity. A simulation study demonstrates that the statistical inference using partial likelihood has sound finite-sample properties and offers a viable alternative to maximum likelihood estimation. For illustration, we analyze data from an ecological study that monitors bark beetle colonization of red pines in a plantation of Wisconsin.
Fremond, L; Bouché, O; Diébold, M D; Demange, L; Zeitoun, P; Thiefin, G
1995-01-01
Barrett's oesophagus is a premalignant condition. The possibility of eradicating at least partially the metaplastic epithelium has been reported recently. In this case report, a patient with Barrett's oesophagus complicated by high grade dysplasia and focal adenocarcinoma was treated by Nd:Yag laser then high dose rate intraluminal irradiation while on omeprazole 40 mg/day. A partial eradication of Barrett's oesophagus and a transient tumoural regression were obtained. Histologically, residual specialized-type glandular tissue was observed beneath regenerative squamous epithelium. Four months after intraluminal irradiation, a local tumoural recurrence was detected while the area of restored squamous epithelium was unchanged on omeprazole 40 mg/day. This indicates that physical destruction of Barrett's oesophagus associated with potent antisecretory treatment can induce a regression of the metaplastic epithelium, even in presence of high grade dysplasia. The persistence of specialized-type glands beneath the squamous epithelium raises important issues about its potential malignant degeneration.
Esserman, Denise A.; Moore, Charity G.; Roth, Mary T.
2009-01-01
Older community dwelling adults often take multiple medications for numerous chronic diseases. Non-adherence to these medications can have a large public health impact. Therefore, the measurement and modeling of medication adherence in the setting of polypharmacy is an important area of research. We apply a variety of different modeling techniques (standard linear regression; weighted linear regression; adjusted linear regression; naïve logistic regression; beta-binomial (BB) regression; generalized estimating equations (GEE)) to binary medication adherence data from a study in a North Carolina based population of older adults, where each medication an individual was taking was classified as adherent or non-adherent. In addition, through simulation we compare these different methods based on Type I error rates, bias, power, empirical 95% coverage, and goodness of fit. We find that estimation and inference using GEE is robust to a wide variety of scenarios and we recommend using this in the setting of polypharmacy when adherence is dichotomously measured for multiple medications per person. PMID:20414358
Genetic Programming Transforms in Linear Regression Situations
NASA Astrophysics Data System (ADS)
Castillo, Flor; Kordon, Arthur; Villa, Carlos
The chapter summarizes the use of Genetic Programming (GP) inMultiple Linear Regression (MLR) to address multicollinearity and Lack of Fit (LOF). The basis of the proposed method is applying appropriate input transforms (model respecification) that deal with these issues while preserving the information content of the original variables. The transforms are selected from symbolic regression models with optimal trade-off between accuracy of prediction and expressional complexity, generated by multiobjective Pareto-front GP. The chapter includes a comparative study of the GP-generated transforms with Ridge Regression, a variant of ordinary Multiple Linear Regression, which has been a useful and commonly employed approach for reducing multicollinearity. The advantages of GP-generated model respecification are clearly defined and demonstrated. Some recommendations for transforms selection are given as well. The application benefits of the proposed approach are illustrated with a real industrial application in one of the broadest empirical modeling areas in manufacturing - robust inferential sensors. The chapter contributes to increasing the awareness of the potential of GP in statistical model building by MLR.
Naval Research Logistics Quarterly. Volume 28. Number 3,
1981-09-01
denotes component-wise maximum. f has antone (isotone) differences on C x D if for cl < c2 and d, < d2, NAVAL RESEARCH LOGISTICS QUARTERLY VOL. 28...or negative correlations and linear or nonlinear regressions. Given are the mo- ments to order two and, for special cases, (he regression function and...data sets. We designate this bnb distribution as G - B - N(a, 0, v). The distribution admits only of positive correlation and linear regressions
Automating approximate Bayesian computation by local linear regression.
Thornton, Kevin R
2009-07-07
In several biological contexts, parameter inference often relies on computationally-intensive techniques. "Approximate Bayesian Computation", or ABC, methods based on summary statistics have become increasingly popular. A particular flavor of ABC based on using a linear regression to approximate the posterior distribution of the parameters, conditional on the summary statistics, is computationally appealing, yet no standalone tool exists to automate the procedure. Here, I describe a program to implement the method. The software package ABCreg implements the local linear-regression approach to ABC. The advantages are: 1. The code is standalone, and fully-documented. 2. The program will automatically process multiple data sets, and create unique output files for each (which may be processed immediately in R), facilitating the testing of inference procedures on simulated data, or the analysis of multiple data sets. 3. The program implements two different transformation methods for the regression step. 4. Analysis options are controlled on the command line by the user, and the program is designed to output warnings for cases where the regression fails. 5. The program does not depend on any particular simulation machinery (coalescent, forward-time, etc.), and therefore is a general tool for processing the results from any simulation. 6. The code is open-source, and modular.Examples of applying the software to empirical data from Drosophila melanogaster, and testing the procedure on simulated data, are shown. In practice, the ABCreg simplifies implementing ABC based on local-linear regression.
NASA Astrophysics Data System (ADS)
Jakubowski, J.; Stypulkowski, J. B.; Bernardeau, F. G.
2017-12-01
The first phase of the Abu Hamour drainage and storm tunnel was completed in early 2017. The 9.5 km long, 3.7 m diameter tunnel was excavated with two Earth Pressure Balance (EPB) Tunnel Boring Machines from Herrenknecht. TBM operation processes were monitored and recorded by Data Acquisition and Evaluation System. The authors coupled collected TBM drive data with available information on rock mass properties, cleansed, completed with secondary variables and aggregated by weeks and shifts. Correlations and descriptive statistics charts were examined. Multivariate Linear Regression and CART regression tree models linking TBM penetration rate (PR), penetration per revolution (PPR) and field penetration index (FPI) with TBM operational and geotechnical characteristics were performed for the conditions of the weak/soft rock of Doha. Both regression methods are interpretable and the data were screened with different computational approaches allowing enriched insight. The primary goal of the analysis was to investigate empirical relations between multiple explanatory and responding variables, to search for best subsets of explanatory variables and to evaluate the strength of linear and non-linear relations. For each of the penetration indices, a predictive model coupling both regression methods was built and validated. The resultant models appeared to be stronger than constituent ones and indicated an opportunity for more accurate and robust TBM performance predictions.
Note: Nonpolar solute partial molar volume response to attractive interactions with water.
Williams, Steven M; Ashbaugh, Henry S
2014-01-07
The impact of attractive interactions on the partial molar volumes of methane-like solutes in water is characterized using molecular simulations. Attractions account for a significant 20% volume drop between a repulsive Weeks-Chandler-Andersen and full Lennard-Jones description of methane interactions. The response of the volume to interaction perturbations is characterized by linear fits to our simulations and a rigorous statistical thermodynamic expression for the derivative of the volume to increasing attractions. While a weak non-linear response is observed, an average effective slope accurately captures the volume decrease. This response, however, is anticipated to become more non-linear with increasing solute size.
Note: Nonpolar solute partial molar volume response to attractive interactions with water
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, Steven M.; Ashbaugh, Henry S., E-mail: hanka@tulane.edu
2014-01-07
The impact of attractive interactions on the partial molar volumes of methane-like solutes in water is characterized using molecular simulations. Attractions account for a significant 20% volume drop between a repulsive Weeks-Chandler-Andersen and full Lennard-Jones description of methane interactions. The response of the volume to interaction perturbations is characterized by linear fits to our simulations and a rigorous statistical thermodynamic expression for the derivative of the volume to increasing attractions. While a weak non-linear response is observed, an average effective slope accurately captures the volume decrease. This response, however, is anticipated to become more non-linear with increasing solute size.
Spectral-Spatial Shared Linear Regression for Hyperspectral Image Classification.
Haoliang Yuan; Yuan Yan Tang
2017-04-01
Classification of the pixels in hyperspectral image (HSI) is an important task and has been popularly applied in many practical applications. Its major challenge is the high-dimensional small-sized problem. To deal with this problem, lots of subspace learning (SL) methods are developed to reduce the dimension of the pixels while preserving the important discriminant information. Motivated by ridge linear regression (RLR) framework for SL, we propose a spectral-spatial shared linear regression method (SSSLR) for extracting the feature representation. Comparing with RLR, our proposed SSSLR has the following two advantages. First, we utilize a convex set to explore the spatial structure for computing the linear projection matrix. Second, we utilize a shared structure learning model, which is formed by original data space and a hidden feature space, to learn a more discriminant linear projection matrix for classification. To optimize our proposed method, an efficient iterative algorithm is proposed. Experimental results on two popular HSI data sets, i.e., Indian Pines and Salinas demonstrate that our proposed methods outperform many SL methods.
Simple linear and multivariate regression models.
Rodríguez del Águila, M M; Benítez-Parejo, N
2011-01-01
In biomedical research it is common to find problems in which we wish to relate a response variable to one or more variables capable of describing the behaviour of the former variable by means of mathematical models. Regression techniques are used to this effect, in which an equation is determined relating the two variables. While such equations can have different forms, linear equations are the most widely used form and are easy to interpret. The present article describes simple and multiple linear regression models, how they are calculated, and how their applicability assumptions are checked. Illustrative examples are provided, based on the use of the freely accessible R program. Copyright © 2011 SEICAP. Published by Elsevier Espana. All rights reserved.
NASA Astrophysics Data System (ADS)
Whiteley, J. P.
2017-10-01
Large, incompressible elastic deformations are governed by a system of nonlinear partial differential equations. The finite element discretisation of these partial differential equations yields a system of nonlinear algebraic equations that are usually solved using Newton's method. On each iteration of Newton's method, a linear system must be solved. We exploit the structure of the Jacobian matrix to propose a preconditioner, comprising two steps. The first step is the solution of a relatively small, symmetric, positive definite linear system using the preconditioned conjugate gradient method. This is followed by a small number of multigrid V-cycles for a larger linear system. Through the use of exemplar elastic deformations, the preconditioner is demonstrated to facilitate the iterative solution of the linear systems arising. The number of GMRES iterations required has only a very weak dependence on the number of degrees of freedom of the linear systems.
Narayanan, Neethu; Gupta, Suman; Gajbhiye, V T; Manjaiah, K M
2017-04-01
A carboxy methyl cellulose-nano organoclay (nano montmorillonite modified with 35-45 wt % dimethyl dialkyl (C 14 -C 18 ) amine (DMDA)) composite was prepared by solution intercalation method. The prepared composite was characterized by infrared spectroscopy (FTIR), X-Ray diffraction spectroscopy (XRD) and scanning electron microscopy (SEM). The composite was utilized for its pesticide sorption efficiency for atrazine, imidacloprid and thiamethoxam. The sorption data was fitted into Langmuir and Freundlich isotherms using linear and non linear methods. The linear regression method suggested best fitting of sorption data into Type II Langmuir and Freundlich isotherms. In order to avoid the bias resulting from linearization, seven different error parameters were also analyzed by non linear regression method. The non linear error analysis suggested that the sorption data fitted well into Langmuir model rather than in Freundlich model. The maximum sorption capacity, Q 0 (μg/g) was given by imidacloprid (2000) followed by thiamethoxam (1667) and atrazine (1429). The study suggests that the degree of determination of linear regression alone cannot be used for comparing the best fitting of Langmuir and Freundlich models and non-linear error analysis needs to be done to avoid inaccurate results. Copyright © 2017 Elsevier Ltd. All rights reserved.
Women's Endorsement of Models of Sexual Response: Correlates and Predictors.
Nowosielski, Krzysztof; Wróbel, Beata; Kowalczyk, Robert
2016-02-01
Few studies have investigated endorsement of female sexual response models, and no single model has been accepted as a normative description of women's sexual response. The aim of the study was to establish how women from a population-based sample endorse current theoretical models of the female sexual response--the linear models and circular model (partial and composite Basson models)--as well as predictors of endorsement. Accordingly, 174 heterosexual women aged 18-55 years were included in a cross-sectional study: 74 women diagnosed with female sexual dysfunction (FSD) based on DSM-5 criteria and 100 non-dysfunctional women. The description of sexual response models was used to divide subjects into four subgroups: linear (Masters-Johnson and Kaplan models), circular (partial Basson model), mixed (linear and circular models in similar proportions, reflective of the composite Basson model), and a different model. Women were asked to choose which of the models best described their pattern of sexual response and how frequently they engaged in each model. Results showed that 28.7% of women endorsed the linear models, 19.5% the partial Basson model, 40.8% the composite Basson model, and 10.9% a different model. Women with FSD endorsed the partial Basson model and a different model more frequently than did non-dysfunctional controls. Individuals who were dissatisfied with a partner as a lover were more likely to endorse a different model. Based on the results, we concluded that the majority of women endorsed a mixed model combining the circular response with the possibility of an innate desire triggering a linear response. Further, relationship difficulties, not FSD, predicted model endorsement.
Song, Man-Kyu; Ha, Jee Hyun; Ryu, Seung-Ho; Yu, Jaehak
2012-01-01
Objective This study aims to analyze how much heart rate variability (HRV) indices discriminatively respond to age and severity of sleep apnea in the obstructive sleep apnea syndrome (OSAS). Methods 176 male OSAS patients were classified into four groups according to their age and apnea-hypopnea index (AHI). The HRV indices were compared via analysis of covariance (ANCOVA). In particular, the partial correlation method was performed to identify the most statistically significant HRV indices in the time and frequency domains. Stepwise multiple linear regressions were further executed to examine the effects of age, AHI, body mass index (BMI), systolic blood pressure (SBP), diastolic blood pressure (DBP), and sleep parameters on the significant HRV indices. Results The partial correlation analysis yielded the NN50 count (defined as the number of adjacent R-wave to R-wave intervals differing by more than 50 ms) and low frequency/high frequency (LF/HF) ratio to be two most statistically significant HRV indices in both time and frequency domains. The two indices showed significant differences between the groups. The NN50 count was affected by age (p<0.001) and DBP (p=0.039), while the LF/HF ratio was affected by AHI (p<0.001), the amount of Stage 2 sleep (p=0.005), and age (p=0.021) in the order named in the regression analysis. Conclusion The NN50 count more sensitively responded to age than to AHI, suggesting that the index is mainly associated with an age-related parasympathetic system. On the contrary, the LF/HF ratio responded to AHI more sensitively than to age, suggesting that it is mainly associated with a sympathetic tone likely reflecting the severity of sleep apnea. PMID:22396687
Jethwa, Krishan R; Kahila, Mohamed M; Mara, Kristin C; Harmsen, William S; Routman, David M; Pumper, Geralyn M; Corbin, Kimberly S; Sloan, Jeff A; Ruddy, Kathryn J; Hieken, Tina J; Park, Sean S; Mutter, Robert W
2018-05-01
Accelerated partial breast irradiation (APBI) and whole breast irradiation (WBI) are treatment options for early-stage breast cancer. The purpose of this study was to compare patient-reported-outcomes (PRO) between patients receiving multi-channel intra-cavitary brachytherapy APBI or WBI. Between 2012 and 2015, 131 patients with ductal carcinoma in situ (DCIS) or early stage invasive breast cancer were treated with adjuvant APBI (64) or WBI (67) and participated in a PRO questionnaire. The linear analog scale assessment (LASA), harvard breast cosmesis scale (HBCS), PRO-common terminology criteria for adverse events- PRO (PRO-CTCAE), and breast cancer treatment outcome scale (BCTOS) were used to assess quality of life (QoL), pain, fatigue, aesthetic and functional status, and breast cosmesis. Comparisons of PROs were performed using t-tests, Wilcoxon rank-sum, Chi square, Fisher exact test, and regression methods. Median follow-up from completion of radiotherapy and questionnaire completion was 13.3 months. There was no significant difference in QoL, pain, or fatigue severity, as assessed by the LASA, between treatment groups (p > 0.05). No factors were found to be predictive of overall QoL on regression analysis. BCTOS health-related QoL scores were similar between treatment groups (p = 0.52).The majority of APBI and WBI patients reported excellent/good breast cosmesis, 88.5% versus 93.7% (p = 0.37). Skin color change (p = 0.011) and breast elevation (p = 0.01) relative to baseline were more common in the group receiving WBI. APBI and WBI were both associated with favorable patient-reported outcomes in early follow-up. APBI resulted in a lesser degree of patient-reported skin color change and breast elevation relative to baseline.
Chen, Yen-Po; Prashar, Ankush; Hocking, Paul M; Erichsen, Jonathan T; To, Chi Ho; Schaeffel, Frank; Guggenheim, Jeremy A
2010-02-01
There is considerable variation in the degree of form-deprivation myopia (FDM) induced in chickens by a uniform treatment regimen. Sex and pretreatment eye size have been found to be predictive of the rate of FD-induced eye growth. Therefore, this study was undertaken to test whether the greater rate of myopic eye growth in males is a consequence of their larger eyes or of some other aspect of their sex. Monocular FDM was induced in 4-day-old White Leghorn chicks for 4 days. Changes in ocular component dimensions and refractive error were assessed by A-scan ultrasonography and retinoscopy, respectively. Sex identification of chicks was performed by DNA test. Relationships between traits were assessed by multiple regression. FD produced (mean +/- SD) 13.47 +/- 3.12 D of myopia and 0.47 +/- 0.14 mm of vitreous chamber elongation. The level of induced myopia was not significantly different between the sexes, but the males had larger eyes initially and showed greater myopic eye growth than did the females. In multiple linear regression analysis, the partial correlation between sex and the degree of induced eye growth remained significant (P = 0.008) after adjustment for eye size, whereas the partial correlation between initial eye size and the degree of induced eye growth was no longer significant after adjustment for sex (P = 0.11). After adjustment for other factors, the chicks' sex accounted for 6.4% of the variation in FD-induced vitreous chamber elongation. The sex of the chick influences the rate of experimentally induced myopic eye growth, independent of its effects on eye size.
Razi-Asrami, Mahboobeh; Ghasemi, Jahan B; Amiri, Nayereh; Sadeghi, Seyed Jamal
2017-04-01
In this paper, a simple, fast, and inexpensive method is introduced for the simultaneous spectrophotometric determination of crystal violet (CV) and malachite green (MG) contents in aquatic samples using partial least squares regression (PLS) as a multivariate calibration technique after preconcentration by graphene oxide (GO). The method was based on the sorption and desorption of analytes onto GO and direct determination by ultraviolet-visible spectrophotometric techniques. GO was synthesized according to Hummers method. To characterize the shape and structure of GO, FT-IR, SEM, and XRD were used. The effective factors on the extraction efficiency such as pH, extraction time, and the amount of adsorbent were optimized using central composite design. The optimum values of these factors were 6, 15 min, and 12 mg, respectively. The maximum capacity of GO for the adsorption of CV and MG was 63.17 and 77.02 mg g -1 , respectively. Preconcentration factors and extraction recoveries were obtained and were 19.6, 98% for CV and 20, 100% for MG, respectively. LOD and linear dynamic ranges for CV and MG were 0.009, 0.03-0.3, 0.015, and 0.05-0.5 (μg mL -1 ), respectively. The intra-day and inter-day relative standard deviations were 1.99 and 0.58 for CV and 1.69 and 3.13 for MG at the concentration level of 50 ng mL -1 , respectively. Finally, the proposed DSPE/PLS method was successfully applied for the simultaneous determination of the trace amount of CV and MG in the real water samples.
Freye, Chris E; Fitz, Brian D; Billingsley, Matthew C; Synovec, Robert E
2016-06-01
The chemical composition and several physical properties of RP-1 fuels were studied using comprehensive two-dimensional (2D) gas chromatography (GC×GC) coupled with flame ionization detection (FID). A "reversed column" GC×GC configuration was implemented with a RTX-wax column on the first dimension ((1)D), and a RTX-1 as the second dimension ((2)D). Modulation was achieved using a high temperature diaphragm valve mounted directly in the oven. Using leave-one-out cross-validation (LOOCV), the summed GC×GC-FID signal of three compound-class selective 2D regions (alkanes, cycloalkanes, and aromatics) was regressed against previously measured ASTM derived values for these compound classes, yielding root mean square errors of cross validation (RMSECV) of 0.855, 0.734, and 0.530mass%, respectively. For comparison, using partial least squares (PLS) analysis with LOOCV, the GC×GC-FID signal of the entire 2D separations was regressed against the same ASTM values, yielding a linear trend for the three compound classes (alkanes, cycloalkanes, and aromatics), yielding RMSECV values of 1.52, 2.76, and 0.945 mass%, respectively. Additionally, a more detailed PLS analysis was undertaken of the compounds classes (n-alkanes, iso-alkanes, mono-, di-, and tri-cycloalkanes, and aromatics), and of physical properties previously determined by ASTM methods (such as net heat of combustion, hydrogen content, density, kinematic viscosity, sustained boiling temperature and vapor rise temperature). Results from these PLS studies using the relatively simple to use and inexpensive GC×GC-FID instrumental platform are compared to previously reported results using the GC×GC-TOFMS instrumental platform. Copyright © 2016 Elsevier B.V. All rights reserved.
Study on the social adaptation of Chinese children with down syndrome.
Wang, Yan-Xia; Mao, Shan-Shan; Xie, Chun-Hong; Qin, Yu-Feng; Zhu, Zhi-Wei; Zhan, Jian-Ying; Shao, Jie; Li, Rong; Zhao, Zheng-Yan
2007-06-30
To evaluate social adjustment and related factors among Chinese children with Down syndrome (DS). A structured interview and Peabody Picture Vocabulary Test (PPVT) were conducted with a group of 36 DS children with a mean age of 106.28 months, a group of 30 normally-developing children matched for mental age (MA) and a group of 40 normally-developing children matched for chronological age (CA). Mean scores of social adjustment were compared between the three groups, and partial correlations and stepwise multiple regression models were used to further explore related factors. There was no difference between the DS group and the MA group in terms of communication skills. However, the DS group scored much better than the MA group in self-dependence, locomotion, work skills, socialization and self-management. Children in the CA group achieved significantly higher scores in all aspects of social adjustment than the DS children. Partial correlations indicate a relationship between social adjustment and the PPVT raw score and also between social adjustment and age (significant r ranging between 0.24 and 0.92). A stepwise linear regression analysis showed that family structure was the main predictor of social adjustment. Newborn history was also a predictor of work skills, communication, socialization and self-management. Parental education was found to account for 8% of self-dependence. Maternal education explained 6% of the variation in locomotion. Although limited by the small sample size, these results indicate that Chinese DS children have better social adjustment skills when compared to their mental-age-matched normally-developing peers, but that the Chinese DS children showed aspects of adaptive development that differed from Western DS children. Analyses of factors related to social adjustment suggest that effective early intervention may improve social adaptability.
1994-09-01
Institute of Technology, Wright- Patterson AFB OH, January 1994. 4. Neter, John and others. Applied Linear Regression Models. Boston: Irwin, 1989. 5...Technology, Wright-Patterson AFB OH 5 April 1994. 29. Neter, John and others. Applied Linear Regression Models. Boston: Irwin, 1989. 30. Office of
An Evaluation of the Automated Cost Estimating Integrated Tools (ACEIT) System
1989-09-01
residual and it is described as the residual divided by its standard deviation (13:App A,17). Neter, Wasserman, and Kutner, in Applied Linear Regression Models...others. Applied Linear Regression Models. Homewood IL: Irwin, 1983. 19. Raduchel, William J. "A Professional’s Perspective on User-Friendliness," Byte
A Simple and Convenient Method of Multiple Linear Regression to Calculate Iodine Molecular Constants
ERIC Educational Resources Information Center
Cooper, Paul D.
2010-01-01
A new procedure using a student-friendly least-squares multiple linear-regression technique utilizing a function within Microsoft Excel is described that enables students to calculate molecular constants from the vibronic spectrum of iodine. This method is advantageous pedagogically as it calculates molecular constants for ground and excited…
Conjoint Analysis: A Study of the Effects of Using Person Variables.
ERIC Educational Resources Information Center
Fraas, John W.; Newman, Isadore
Three statistical techniques--conjoint analysis, a multiple linear regression model, and a multiple linear regression model with a surrogate person variable--were used to estimate the relative importance of five university attributes for students in the process of selecting a college. The five attributes include: availability and variety of…
Fitting program for linear regressions according to Mahon (1996)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Trappitsch, Reto G.
2018-01-09
This program takes the users' Input data and fits a linear regression to it using the prescription presented by Mahon (1996). Compared to the commonly used York fit, this method has the correct prescription for measurement error propagation. This software should facilitate the proper fitting of measurements with a simple Interface.
How Robust Is Linear Regression with Dummy Variables?
ERIC Educational Resources Information Center
Blankmeyer, Eric
2006-01-01
Researchers in education and the social sciences make extensive use of linear regression models in which the dependent variable is continuous-valued while the explanatory variables are a combination of continuous-valued regressors and dummy variables. The dummies partition the sample into groups, some of which may contain only a few observations.…
Revisiting the Scale-Invariant, Two-Dimensional Linear Regression Method
ERIC Educational Resources Information Center
Patzer, A. Beate C.; Bauer, Hans; Chang, Christian; Bolte, Jan; Su¨lzle, Detlev
2018-01-01
The scale-invariant way to analyze two-dimensional experimental and theoretical data with statistical errors in both the independent and dependent variables is revisited by using what we call the triangular linear regression method. This is compared to the standard least-squares fit approach by applying it to typical simple sets of example data…
ERIC Educational Resources Information Center
Thompson, Russel L.
Homoscedasticity is an important assumption of linear regression. This paper explains what it is and why it is important to the researcher. Graphical and mathematical methods for testing the homoscedasticity assumption are demonstrated. Sources of homoscedasticity and types of homoscedasticity are discussed, and methods for correction are…
On the null distribution of Bayes factors in linear regression
USDA-ARS?s Scientific Manuscript database
We show that under the null, the 2 log (Bayes factor) is asymptotically distributed as a weighted sum of chi-squared random variables with a shifted mean. This claim holds for Bayesian multi-linear regression with a family of conjugate priors, namely, the normal-inverse-gamma prior, the g-prior, and...
Common pitfalls in statistical analysis: Linear regression analysis
Aggarwal, Rakesh; Ranganathan, Priya
2017-01-01
In a previous article in this series, we explained correlation analysis which describes the strength of relationship between two continuous variables. In this article, we deal with linear regression analysis which predicts the value of one continuous variable from another. We also discuss the assumptions and pitfalls associated with this analysis. PMID:28447022
The Relationship Between Oxygen Reserve Index and Arterial Partial Pressure of Oxygen During Surgery
Dorotta, Ihab L.; Wells, Briana; Juma, David; Applegate, Patricia M.
2016-01-01
BACKGROUND: The use of intraoperative pulse oximetry (Spo2) enhances hypoxia detection and is associated with fewer perioperative hypoxic events. However, Spo2 may be reported as 98% when arterial partial pressure of oxygen (Pao2) is as low as 70 mm Hg. Therefore, Spo2 may not provide advance warning of falling arterial oxygenation until Pao2 approaches this level. Multiwave pulse co-oximetry can provide a calculated oxygen reserve index (ORI) that may add to information from pulse oximetry when Spo2 is >98%. This study evaluates the ORI to Pao2 relationship during surgery. METHODS: We studied patients undergoing scheduled surgery in which arterial catheterization and intraoperative arterial blood gas analysis were planned. Data from multiple pulse co-oximetry sensors on each patient were continuously collected and stored on a research computer. Regression analysis was used to compare ORI with Pao2 obtained from each arterial blood gas measurement and changes in ORI with changes in Pao2 from sequential measurements. Linear mixed-effects regression models for repeated measures were then used to account for within-subject correlation across the repeatedly measured Pao2 and ORI and for the unequal time intervals of Pao2 determination over elapsed surgical time. Regression plots were inspected for ORI values corresponding to Pao2 of 100 and 150 mm Hg. ORI and Pao2 were compared using mixed-effects models with a subject-specific random intercept. RESULTS: ORI values and Pao2 measurements were obtained from intraoperative data collected from 106 patients. Regression analysis showed that the ORI to Pao2 relationship was stronger for Pao2 to 240 mm Hg (r2 = 0.536) than for Pao2 over 240 mm Hg (r2 = 0.0016). Measured Pao2 was ≥100 mm Hg for all ORI over 0.24. Measured Pao2 was ≥150 mm Hg in 96.6% of samples when ORI was over 0.55. A random intercept variance component linear mixed-effects model for repeated measures indicated that Pao2 was significantly related to ORI (β[95% confidence interval] = 0.002 [0.0019–0.0022]; P < 0.0001). A similar analysis indicated a significant relationship between change in Pao2 and change in ORI (β [95% confidence interval] = 0.0044 [0.0040–0.0048]; P < 0.0001). CONCLUSIONS: These findings suggest that ORI >0.24 can distinguish Pao2 ≥100 mm Hg when Spo2 is over 98%. Similarly, ORI > 0.55 appears to be a threshold to distinguish Pao2 ≥150 mm Hg. The usefulness of these values should be evaluated prospectively. Decreases in ORI to near 0.24 may provide advance indication of falling Pao2 approaching 100 mm Hg when Spo2 is >98%. The clinical utility of interventions based on continuous ORI monitoring should be studied prospectively. PMID:27007078
Applegate, Richard L; Dorotta, Ihab L; Wells, Briana; Juma, David; Applegate, Patricia M
2016-09-01
The use of intraoperative pulse oximetry (SpO2) enhances hypoxia detection and is associated with fewer perioperative hypoxic events. However, SpO2 may be reported as 98% when arterial partial pressure of oxygen (PaO2) is as low as 70 mm Hg. Therefore, SpO2 may not provide advance warning of falling arterial oxygenation until PaO2 approaches this level. Multiwave pulse co-oximetry can provide a calculated oxygen reserve index (ORI) that may add to information from pulse oximetry when SpO2 is >98%. This study evaluates the ORI to PaO2 relationship during surgery. We studied patients undergoing scheduled surgery in which arterial catheterization and intraoperative arterial blood gas analysis were planned. Data from multiple pulse co-oximetry sensors on each patient were continuously collected and stored on a research computer. Regression analysis was used to compare ORI with PaO2 obtained from each arterial blood gas measurement and changes in ORI with changes in PaO2 from sequential measurements. Linear mixed-effects regression models for repeated measures were then used to account for within-subject correlation across the repeatedly measured PaO2 and ORI and for the unequal time intervals of PaO2 determination over elapsed surgical time. Regression plots were inspected for ORI values corresponding to PaO2 of 100 and 150 mm Hg. ORI and PaO2 were compared using mixed-effects models with a subject-specific random intercept. ORI values and PaO2 measurements were obtained from intraoperative data collected from 106 patients. Regression analysis showed that the ORI to PaO2 relationship was stronger for PaO2 to 240 mm Hg (r = 0.536) than for PaO2 over 240 mm Hg (r = 0.0016). Measured PaO2 was ≥100 mm Hg for all ORI over 0.24. Measured PaO2 was ≥150 mm Hg in 96.6% of samples when ORI was over 0.55. A random intercept variance component linear mixed-effects model for repeated measures indicated that PaO2 was significantly related to ORI (β[95% confidence interval] = 0.002 [0.0019-0.0022]; P < 0.0001). A similar analysis indicated a significant relationship between change in PaO2 and change in ORI (β [95% confidence interval] = 0.0044 [0.0040-0.0048]; P < 0.0001). These findings suggest that ORI >0.24 can distinguish PaO2 ≥100 mm Hg when SpO2 is over 98%. Similarly, ORI > 0.55 appears to be a threshold to distinguish PaO2 ≥150 mm Hg. The usefulness of these values should be evaluated prospectively. Decreases in ORI to near 0.24 may provide advance indication of falling PaO2 approaching 100 mm Hg when SpO2 is >98%. The clinical utility of interventions based on continuous ORI monitoring should be studied prospectively.
Comparison of l₁-Norm SVR and Sparse Coding Algorithms for Linear Regression.
Zhang, Qingtian; Hu, Xiaolin; Zhang, Bo
2015-08-01
Support vector regression (SVR) is a popular function estimation technique based on Vapnik's concept of support vector machine. Among many variants, the l1-norm SVR is known to be good at selecting useful features when the features are redundant. Sparse coding (SC) is a technique widely used in many areas and a number of efficient algorithms are available. Both l1-norm SVR and SC can be used for linear regression. In this brief, the close connection between the l1-norm SVR and SC is revealed and some typical algorithms are compared for linear regression. The results show that the SC algorithms outperform the Newton linear programming algorithm, an efficient l1-norm SVR algorithm, in efficiency. The algorithms are then used to design the radial basis function (RBF) neural networks. Experiments on some benchmark data sets demonstrate the high efficiency of the SC algorithms. In particular, one of the SC algorithms, the orthogonal matching pursuit is two orders of magnitude faster than a well-known RBF network designing algorithm, the orthogonal least squares algorithm.
An Improved Heaviside Approach to Partial Fraction Expansion and Its Applications
ERIC Educational Resources Information Center
Man, Yiu-Kwong
2009-01-01
In this note, we present an improved Heaviside approach to compute the partial fraction expansions of proper rational functions. This method uses synthetic divisions to determine the unknown partial fraction coefficients successively, without the need to use differentiation or to solve a system of linear equations. Examples of its applications in…
NASA Astrophysics Data System (ADS)
Wu, Cheng; Zhen Yu, Jian
2018-03-01
Linear regression techniques are widely used in atmospheric science, but they are often improperly applied due to lack of consideration or inappropriate handling of measurement uncertainty. In this work, numerical experiments are performed to evaluate the performance of five linear regression techniques, significantly extending previous works by Chu and Saylor. The five techniques are ordinary least squares (OLS), Deming regression (DR), orthogonal distance regression (ODR), weighted ODR (WODR), and York regression (YR). We first introduce a new data generation scheme that employs the Mersenne twister (MT) pseudorandom number generator. The numerical simulations are also improved by (a) refining the parameterization of nonlinear measurement uncertainties, (b) inclusion of a linear measurement uncertainty, and (c) inclusion of WODR for comparison. Results show that DR, WODR and YR produce an accurate slope, but the intercept by WODR and YR is overestimated and the degree of bias is more pronounced with a low R2 XY dataset. The importance of a properly weighting parameter λ in DR is investigated by sensitivity tests, and it is found that an improper λ in DR can lead to a bias in both the slope and intercept estimation. Because the λ calculation depends on the actual form of the measurement error, it is essential to determine the exact form of measurement error in the XY data during the measurement stage. If a priori error in one of the variables is unknown, or the measurement error described cannot be trusted, DR, WODR and YR can provide the least biases in slope and intercept among all tested regression techniques. For these reasons, DR, WODR and YR are recommended for atmospheric studies when both X and Y data have measurement errors. An Igor Pro-based program (Scatter Plot) was developed to facilitate the implementation of error-in-variables regressions.
Afantitis, Antreas; Melagraki, Georgia; Sarimveis, Haralambos; Koutentis, Panayiotis A; Markopoulos, John; Igglessi-Markopoulou, Olga
2006-08-01
A quantitative-structure activity relationship was obtained by applying Multiple Linear Regression Analysis to a series of 80 1-[2-hydroxyethoxy-methyl]-6-(phenylthio) thymine (HEPT) derivatives with significant anti-HIV activity. For the selection of the best among 37 different descriptors, the Elimination Selection Stepwise Regression Method (ES-SWR) was utilized. The resulting QSAR model (R (2) (CV) = 0.8160; S (PRESS) = 0.5680) proved to be very accurate both in training and predictive stages.
Wavelet regression model in forecasting crude oil price
NASA Astrophysics Data System (ADS)
Hamid, Mohd Helmie; Shabri, Ani
2017-05-01
This study presents the performance of wavelet multiple linear regression (WMLR) technique in daily crude oil forecasting. WMLR model was developed by integrating the discrete wavelet transform (DWT) and multiple linear regression (MLR) model. The original time series was decomposed to sub-time series with different scales by wavelet theory. Correlation analysis was conducted to assist in the selection of optimal decomposed components as inputs for the WMLR model. The daily WTI crude oil price series has been used in this study to test the prediction capability of the proposed model. The forecasting performance of WMLR model were also compared with regular multiple linear regression (MLR), Autoregressive Moving Average (ARIMA) and Generalized Autoregressive Conditional Heteroscedasticity (GARCH) using root mean square errors (RMSE) and mean absolute errors (MAE). Based on the experimental results, it appears that the WMLR model performs better than the other forecasting technique tested in this study.
Partitioning sources of variation in vertebrate species richness
Boone, R.B.; Krohn, W.B.
2000-01-01
Aim: To explore biogeographic patterns of terrestrial vertebrates in Maine, USA using techniques that would describe local and spatial correlations with the environment. Location: Maine, USA. Methods: We delineated the ranges within Maine (86,156 km2) of 275 species using literature and expert review. Ranges were combined into species richness maps, and compared to geomorphology, climate, and woody plant distributions. Methods were adapted that compared richness of all vertebrate classes to each environmental correlate, rather than assessing a single explanatory theory. We partitioned variation in species richness into components using tree and multiple linear regression. Methods were used that allowed for useful comparisons between tree and linear regression results. For both methods we partitioned variation into broad-scale (spatially autocorrelated) and fine-scale (spatially uncorrelated) explained and unexplained components. By partitioning variance, and using both tree and linear regression in analyses, we explored the degree of variation in species richness for each vertebrate group that Could be explained by the relative contribution of each environmental variable. Results: In tree regression, climate variation explained richness better (92% of mean deviance explained for all species) than woody plant variation (87%) and geomorphology (86%). Reptiles were highly correlated with environmental variation (93%), followed by mammals, amphibians, and birds (each with 84-82% deviance explained). In multiple linear regression, climate was most closely associated with total vertebrate richness (78%), followed by woody plants (67%) and geomorphology (56%). Again, reptiles were closely correlated with the environment (95%), followed by mammals (73%), amphibians (63%) and birds (57%). Main conclusions: Comparing variation explained using tree and multiple linear regression quantified the importance of nonlinear relationships and local interactions between species richness and environmental variation, identifying the importance of linear relationships between reptiles and the environment, and nonlinear relationships between birds and woody plants, for example. Conservation planners should capture climatic variation in broad-scale designs; temperatures may shift during climate change, but the underlying correlations between the environment and species richness will presumably remain.
Caicedo, Alexander; Varon, Carolina; Hunyadi, Borbala; Papademetriou, Maria; Tachtsidis, Ilias; Van Huffel, Sabine
2016-01-01
Clinical data is comprised by a large number of synchronously collected biomedical signals that are measured at different locations. Deciphering the interrelationships of these signals can yield important information about their dependence providing some useful clinical diagnostic data. For instance, by computing the coupling between Near-Infrared Spectroscopy signals (NIRS) and systemic variables the status of the hemodynamic regulation mechanisms can be assessed. In this paper we introduce an algorithm for the decomposition of NIRS signals into additive components. The algorithm, SIgnal DEcomposition base on Obliques Subspace Projections (SIDE-ObSP), assumes that the measured NIRS signal is a linear combination of the systemic measurements, following the linear regression model y = Ax + ϵ . SIDE-ObSP decomposes the output such that, each component in the decomposition represents the sole linear influence of one corresponding regressor variable. This decomposition scheme aims at providing a better understanding of the relation between NIRS and systemic variables, and to provide a framework for the clinical interpretation of regression algorithms, thereby, facilitating their introduction into clinical practice. SIDE-ObSP combines oblique subspace projections (ObSP) with the structure of a mean average system in order to define adequate signal subspaces. To guarantee smoothness in the estimated regression parameters, as observed in normal physiological processes, we impose a Tikhonov regularization using a matrix differential operator. We evaluate the performance of SIDE-ObSP by using a synthetic dataset, and present two case studies in the field of cerebral hemodynamics monitoring using NIRS. In addition, we compare the performance of this method with other system identification techniques. In the first case study data from 20 neonates during the first 3 days of life was used, here SIDE-ObSP decoupled the influence of changes in arterial oxygen saturation from the NIRS measurements, facilitating the use of NIRS as a surrogate measure for cerebral blood flow (CBF). The second case study used data from a 3-years old infant under Extra Corporeal Membrane Oxygenation (ECMO), here SIDE-ObSP decomposed cerebral/peripheral tissue oxygenation, as a sum of the partial contributions from different systemic variables, facilitating the comparison between the effects of each systemic variable on the cerebral/peripheral hemodynamics.
Javed, Faizan; Chan, Gregory S H; Savkin, Andrey V; Middleton, Paul M; Malouf, Philip; Steel, Elizabeth; Mackie, James; Lovell, Nigel H
2009-01-01
This paper uses non-linear support vector regression (SVR) to model the blood volume and heart rate (HR) responses in 9 hemodynamically stable kidney failure patients during hemodialysis. Using radial bias function (RBF) kernels the non-parametric models of relative blood volume (RBV) change with time as well as percentage change in HR with respect to RBV were obtained. The e-insensitivity based loss function was used for SVR modeling. Selection of the design parameters which includes capacity (C), insensitivity region (e) and the RBF kernel parameter (sigma) was made based on a grid search approach and the selected models were cross-validated using the average mean square error (AMSE) calculated from testing data based on a k-fold cross-validation technique. Linear regression was also applied to fit the curves and the AMSE was calculated for comparison with SVR. For the model based on RBV with time, SVR gave a lower AMSE for both training (AMSE=1.5) as well as testing data (AMSE=1.4) compared to linear regression (AMSE=1.8 and 1.5). SVR also provided a better fit for HR with RBV for both training as well as testing data (AMSE=15.8 and 16.4) compared to linear regression (AMSE=25.2 and 20.1).
Den Otter, Willem; Hack, Margot; Jacobs, John J L; Tan, Jurgen F V; Rozendaal, Lawrence; Van Moorselaar, R Jeroen A
2015-02-01
To improve the treatment of transmissible venereal tumors (TVTs) in dogs with intratumoral injections of interleukin-2 (IL-2). We treated 13 dogs with 18 natural TVTs with IL-2. The tumors were treated with intratumoral application of 2×10(6) units IL-2. Three months after injection of IL-2, the tumors in 2/13 dogs had regressed completely, those in 1/13 had regressed partially, and 4/13 dogs had stable disease. Local IL-2 treatment of TVT is therapeutically effective, as indicated by complete regression (CR), partial regression (PR) and stable disease (SD) of the tumors of 7 out of 13 dogs. In addition, we observed that the intratumoral treatment with IL-2 did not cause any toxic side-effects. Copyright© 2015 International Institute of Anticancer Research (Dr. John G. Delinassios), All rights reserved.
Roemer, Frank W; Kwoh, C Kent; Hannon, Michael J; Hunter, David J; Eckstein, Felix; Grago, Jason; Boudreau, Robert M; Englund, Martin; Guermazi, Ali
2017-01-01
To assess whether partial meniscectomy is associated with increased risk of radiographic osteoarthritis (ROA) and worsening cartilage damage in the following year. We studied 355 knees from the Osteoarthritis Initiative that developed ROA (Kellgren-Lawrence grade ≥ 2), which were matched with control knees. The MR images were assessed using the semi-quantitative MOAKS system. Conditional logistic regression was applied to estimate risk of incident ROA. Logistic regression was used to assess the risk of worsening cartilage damage in knees with partial meniscectomy that developed ROA. In the group with incident ROA, 4.4 % underwent partial meniscectomy during the year prior to the case-defining visit, compared with none of the knees that did not develop ROA. All (n = 31) knees that had partial meniscectomy and 58.9 % (n = 165) of the knees with prevalent meniscal damage developed ROA (OR = 2.51, 95 % CI [1.73, 3.64]). In knees that developed ROA, partial meniscectomy was associated with an increased risk of worsening cartilage damage (OR = 4.51, 95 % CI [1.53, 13.33]). The probability of having had partial meniscectomy was higher in knees that developed ROA. When looking only at knees that developed ROA, partial meniscectomy was associated with greater risk of worsening cartilage damage. • Partial meniscectomy is a controversial treatment option for degenerative meniscal tears. • Partial meniscectomy is strongly associated with incident osteoarthritis within 1 year. • Partial meniscectomy is associated with increased risk of worsening cartilage damage.
USDA-ARS?s Scientific Manuscript database
Dairy cattle feed efficiency (FE) can be defined as the ability to convert DMI into milk energy (MILKE) and maintenance or metabolic body weight (MBW). In other words, DMI is conditional on MILKE and MBW (DMI|MILKE,MBW). These partial regressions or partial efficiencies (PE) of DMI on MILKE and MBW ...
Radon exposure and leukaemia in adulthood.
Viel, J F
1993-08-01
Positive associations between leukaemia and radon concentrations have been observed in England, Scotland and Wales, and Canada. Results of a similar study for the populations of 41 French administrative areas ('départements') are reported for 1984-1986. The average indoor radon and gamma ray concentrations per 'département' range from 12 to 147 Bq.m-3 and from 28 to 142 nG.h-1, respectively. Acute lymphoid leukaemia mortality rate is similar to the national level, whereas an excess of acute myeloid leukaemia deaths is observed. According to Poisson regression models and modified tests for partial correlation, acute myeloid leukaemia mortality is significantly and positively related to indoor radon concentration whether or not adjustment is made for indoor gamma ray dose, socioeconomic status and linear gradient. This result reinforces the evidence that indoor exposure to high levels of radon is a leukaemic environmental hazard.
NASA Astrophysics Data System (ADS)
Kumavat, Hemraj Ramdas
2016-09-01
The compressive stress-strain behavior and mechanical properties of clay brick masonry and its constituents clay bricks and mortar, have been studied by several laboratory tests. Using linear regression analysis, a analytical model has been proposed for obtaining the stress-strain curves for masonry that can be used in the analysis and design procedures. The model requires only the compressive strengths of bricks and mortar as input data, which can be easily obtained experimentally. Development of analytical model from the obtained experimental results of Young's modulus and compressive strength. Simple relationships have been identified for obtaining the modulus of elasticity of bricks, mortar, and masonry from their corresponding compressive strengths. It was observed that the proposed analytical model clearly demonstrates a reasonably good prediction of the stress-strain curves when compared with the experimental curves.
Schneider, Margaret L.; Kwan, Bethany M.
2013-01-01
Objectives To further understanding of the factors influencing adolescents’ motivations for physical activity, the relationship of variables derived from Self-Determination Theory to adolescents’ affective response to exercise was examined. Design Correlational. Method Adolescents (N = 182) self-reported psychological needs satisfaction (perceived competence, relatedness, and autonomy) and intrinsic motivation related to exercise. In two clinic visits, adolescents reported their affect before, during, and after a moderate-intensity and a hard-intensity exercise task. Results Affective response to exercise and psychological needs satisfaction independently contributed to the prediction of intrinsic motivation in hierarchical linear regression models. The association between affective response to exercise and intrinsic motivation was partially mediated by psychological needs satisfaction. Conclusions Intrinsic motivation for exercise among adolescents may be enhanced when the environment supports perceived competence, relatedness, and autonomy, and when adolescents participate in activities that they find enjoyable. PMID:24015110
Correlates of psychological distress, burnout, and resilience among Chinese female nurses
ZOU, Guiyuan; SHEN, Xiuying; TIAN, Xiaohong; LIU, Chunqin; LI, Guopeng; KONG, Linghua; LI, Ping
2016-01-01
The present survey investigated the association between resilience, burnout and psychological distress among Chinese female nurses. A total of 366 female nurses were enrolled in our study. A series of self-reported questionnaires that dispose of the following constructs: psychological distress, burnout, and resilience were estimated. The hierarchical linear regression models were used to evaluate the mediating effect of resilience on the relationship between burnout and psychological distress. Results of the survey showed 85.5% nurses experienced psychological distress. Resilience was negatively related to psychological distress and burnout whereas burnout was positively associated with psychological distress. Mediation analysis revealed that resilience could partially mediate the relationship between the dimensions of emotional exhaustion, depersonalization, and psychological distress. This study highlights the mediator of resilience between burnout and psychological distress of female nurses. As such, interventions that attend to resilience training may be the focus for future clinical and research endeavors. PMID:27021058
Lewis, Marilyn W; Cavanagh, Paul K; Ahn, Grace; Yoshioka, Marianne R
2008-06-01
Prior history of trauma may sensitize individuals to subsequent trauma, including terrorist attacks. Using a convenience sample of secondary, cross-sectional data, pregnant women were grouped based on lifetime interpersonal violence history. Cumulative risk theory was used to evaluate the association of lifetime interpersonal violence history and subjective impact of the September 11, 2001 (9/11) terrorists attacks. Using hierarchical linear regression, cumulative risk theory was partially supported. Women with a history of only one type of interpfersonal violence reported greater effect of 9/11 than did women without a history, but women with both types of violence did not report a greater effect of 9/11 compared to women endorsing history of one type. These data corroborate the literature in that level of exposure to terrorist-related trauma predicts subjective reaction to the attacks. Future research with a larger sample and standardized instruments is warranted.
Darwish, Hany W; Hassan, Said A; Salem, Maissa Y; El-Zeany, Badr A
2013-09-01
Four simple, accurate and specific methods were developed and validated for the simultaneous estimation of Amlodipine (AML), Valsartan (VAL) and Hydrochlorothiazide (HCT) in commercial tablets. The derivative spectrophotometric methods include Derivative Ratio Zero Crossing (DRZC) and Double Divisor Ratio Spectra-Derivative Spectrophotometry (DDRS-DS) methods, while the multivariate calibrations used are Principal Component Regression (PCR) and Partial Least Squares (PLSs). The proposed methods were applied successfully in the determination of the drugs in laboratory-prepared mixtures and in commercial pharmaceutical preparations. The validity of the proposed methods was assessed using the standard addition technique. The linearity of the proposed methods is investigated in the range of 2-32, 4-44 and 2-20 μg/mL for AML, VAL and HCT, respectively. Copyright © 2013 Elsevier B.V. All rights reserved.
Roemer, Frank W.; Kwoh, C. Kent; Hannon, Michael J.; Hunter, David J.; Eckstein, Felix; Grago, Jason; Boudreau, Robert M.; Englund, Martin; Guermazi, Ali
2016-01-01
Objectives To assess whether partial meniscectomy is associated with increased risk of radiographic osteoarthritis (ROA) and worsening cartilage damage in the following year. Methods We studied 355 knees from the Osteoarthritis Initiative that developed ROA (Kellgren-Lawrence grade ≥ 2), which were matched with control knees. The MR images were assessed using the semi-quantitative MOAKS system. Conditional logistic regression was applied to estimate risk of incident ROA. Logistic regression was used to assess the risk of worsening cartilage damage in knees with partial meniscectomy that developed ROA. Results In the group with incident ROA, 4.4% underwent partial meniscectomy during the year prior to the case-defining visit, compared with none of the knees that did not develop ROA. All (n=31) knees that had partial meniscectomy and 58.9% (n=165) of the knees with prevalent meniscal damage developed ROA (OR=2.51, 95% CI [1.73, 3.64]). In knees that developed ROA, partial meniscectomy was associated with an increased risk of worsening cartilage damage (OR=4.51, 95% CI [1.53, 13.33]). Conclusions The probability of having had partial meniscectomy was higher in knees that developed ROA. When looking only at knees that developed ROA, partial meniscectomy was associated with greater risk of worsening cartilage damage. PMID:27121931
Transmission of linearly polarized light in seawater: implications for polarization signaling.
Shashar, Nadav; Sabbah, Shai; Cronin, Thomas W
2004-09-01
Partially linearly polarized light is abundant in the oceans. The natural light field is partially polarized throughout the photic range, and some objects and animals produce a polarization pattern of their own. Many polarization-sensitive marine animals take advantage of the polarization information, using it for tasks ranging from navigation and finding food to communication. In such tasks, the distance to which the polarization information propagates is of great importance. Using newly designed polarization sensors, we measured the changes in linear polarization underwater as a function of distance from a standard target. In the relatively clear waters surrounding coral reefs, partial (%) polarization decreased exponentially as a function of distance from the target, resulting in a 50% reduction of partial polarization at a distance of 1.25-3 m, depending on water quality. Based on these measurements, we predict that polarization sensitivity will be most useful for short-range (in the order of meters) visual tasks in water and less so for detecting objects, signals, or structures from far away. Navigation and body orientation based on the celestial polarization pattern are predicted to be limited to shallow waters as well, while navigation based on the solar position is possible through a deeper range.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xu, Siyao; Yan, Huirong; Lazarian, A., E-mail: syxu@pku.edu.cn, E-mail: huirong.yan@desy.de, E-mail: lazarian@astro.wisc.edu
2016-08-01
We study the damping processes of both incompressible and compressible magnetohydrodynamic (MHD) turbulence in a partially ionized medium. We start from the linear analysis of MHD waves, applying both single-fluid and two-fluid treatments. The damping rates derived from the linear analysis are then used in determining the damping scales of MHD turbulence. The physical connection between the damping scale of MHD turbulence and the cutoff boundary of linear MHD waves is investigated. We find two branches of slow modes propagating in ions and neutrals, respectively, below the damping scale of slow MHD turbulence, and offer a thorough discussion of theirmore » propagation and dissipation behavior. Our analytical results are shown to be applicable in a variety of partially ionized interstellar medium (ISM) phases and the solar chromosphere. The importance of neutral viscosity in damping the Alfvenic turbulence in the interstellar warm neutral medium and the solar chromosphere is demonstrated. As a significant astrophysical utility, we introduce damping effects to the propagation of cosmic rays in partially ionized ISM. The important role of turbulence damping in both transit-time damping and gyroresonance is identified.« less
Techniques for estimating magnitude and frequency of peak flows for Pennsylvania streams
Stuckey, Marla H.; Reed, Lloyd A.
2000-01-01
Regression equations for estimating the magnitude and frequency of floods on ungaged streams in Pennsylvania with drainage areas less that 2,000 square miles were developed on the basis of peak-flow data collected at 313 streamflow-gaging stations. All streamflow-gaging stations used in the development of the equations had 10 or more years of record and include active and discontinued continuous-record and crest-stage partial-record streamflow-gaging stations. Regional regression equations were developed for flood flows expected every 10, 25, 50, 100, and 500 years by the use of a weighted multiple linear regression model.The State was divided into two regions. The largest region, Region A, encompasses about 78 percent of Pennsylvania. The smaller region, Region B, includes only the northwestern part of the State. Basin characteristics used in the regression equations for Region A are drainage area, percentage of forest cover, percentage of urban development, percentage of basin underlain by carbonate bedrock, and percentage of basin controlled by lakes, swamps, and reservoirs. Basin characteristics used in the regression equations for Region B are drainage area and percentage of basin controlled by lakes, swamps, and reservoirs. The coefficient of determination (R2) values for the five flood-frequency equations for Region A range from 0.93 to 0.82, and for Region B, the range is from 0.96 to 0.89.While the regression equations can be used to predict the magnitude and frequency of peak flows for most streams in the State, they should not be used for streams with drainage areas greater than 2,000 square miles or less than 1.5 square miles, for streams that drain extensively mined areas, or for stream reaches immediately below flood-control reservoirs. In addition, the equations presented for Region B should not be used if the stream drains a basin with more than 5 percent urban development.
Post-processing through linear regression
NASA Astrophysics Data System (ADS)
van Schaeybroeck, B.; Vannitsem, S.
2011-03-01
Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast and multicollinearity. The regression schemes under consideration include the ordinary least-square (OLS) method, a new time-dependent Tikhonov regularization (TDTR) method, the total least-square method, a new geometric-mean regression (GM), a recently introduced error-in-variables (EVMOS) method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified. These techniques are applied in the context of the 63 Lorenz system, whose model version is affected by both initial condition and model errors. For short forecast lead times, the number and choice of predictors plays an important role. Contrarily to the other techniques, GM degrades when the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best member OLS with noise). At long lead times the regression schemes (EVMOS, TDTR) which yield the correct variability and the largest correlation between ensemble error and spread, should be preferred.
Linear regression metamodeling as a tool to summarize and present simulation model results.
Jalal, Hawre; Dowd, Bryan; Sainfort, François; Kuntz, Karen M
2013-10-01
Modelers lack a tool to systematically and clearly present complex model results, including those from sensitivity analyses. The objective was to propose linear regression metamodeling as a tool to increase transparency of decision analytic models and better communicate their results. We used a simplified cancer cure model to demonstrate our approach. The model computed the lifetime cost and benefit of 3 treatment options for cancer patients. We simulated 10,000 cohorts in a probabilistic sensitivity analysis (PSA) and regressed the model outcomes on the standardized input parameter values in a set of regression analyses. We used the regression coefficients to describe measures of sensitivity analyses, including threshold and parameter sensitivity analyses. We also compared the results of the PSA to deterministic full-factorial and one-factor-at-a-time designs. The regression intercept represented the estimated base-case outcome, and the other coefficients described the relative parameter uncertainty in the model. We defined simple relationships that compute the average and incremental net benefit of each intervention. Metamodeling produced outputs similar to traditional deterministic 1-way or 2-way sensitivity analyses but was more reliable since it used all parameter values. Linear regression metamodeling is a simple, yet powerful, tool that can assist modelers in communicating model characteristics and sensitivity analyses.
NASA Astrophysics Data System (ADS)
Solimun
2017-05-01
The aim of this research is to model survival data from kidney-transplant patients using the partial least squares (PLS)-Cox regression, which can both meet and not meet the no-multicollinearity assumption. The secondary data were obtained from research entitled "Factors affecting the survival of kidney-transplant patients". The research subjects comprised 250 patients. The predictor variables consisted of: age (X1), sex (X2); two categories, prior hemodialysis duration (X3), diabetes (X4); two categories, prior transplantation number (X5), number of blood transfusions (X6), discrepancy score (X7), use of antilymphocyte globulin(ALG) (X8); two categories, while the response variable was patient survival time (in months). Partial least squares regression is a model that connects the predictor variables X and the response variable y and it initially aims to determine the relationship between them. Results of the above analyses suggest that the survival of kidney transplant recipients ranged from 0 to 55 months, with 62% of the patients surviving until they received treatment that lasted for 55 months. The PLS-Cox regression analysis results revealed that patients' age and the use of ALG significantly affected the survival time of patients. The factor of patients' age (X1) in the PLS-Cox regression model merely affected the failure probability by 1.201. This indicates that the probability of dying for elderly patients with a kidney transplant is 1.152 times higher than that for younger patients.
Huang, Jehn-Yu; Pekmezci, Melike; Mesiwala, Nisreen; Kao, Andrew; Lin, Shan
2011-02-01
To evaluate the capability of the optic disc, peripapillary retinal nerve fiber layer (P-RNFL), macular inner retinal layer (M-IRL) parameters, and their combination obtained by Fourier-domain optical coherent tomography (OCT) in differentiating a glaucoma suspect from perimetric glaucoma. Two hundred and twenty eyes from 220 patients were enrolled in this study. The optic disc morphology, P-RNFL, and M-IRL were assessed by the Fourier-domain OCT (RTVue OCT, Model RT100, Optovue, Fremont, CA). A linear discriminant function was generated by stepwise linear discriminant analysis on the basis of OCT parameters and demographic factors. The diagnostic power of these parameters was evaluated with receiver operating characteristic (ROC) curve analysis. The diagnostic power in the clinically relevant range (specificity ≥ 80%) was presented as the partial area under the ROC curve (partial AROC). The individual OCT parameter with the largest AROC and partial AROC in the high specificity (≥ 80%) range were cup/disc vertical ratio (AROC = 0.854 and partial AROC = 0.142) for the optic disc parameters, average thickness (AROC = 0.919 and partial AROC = 0.147) for P-RNFL parameters, inferior hemisphere thickness (AROC = 0.871 and partial AROC = 0.138) for M-IRL parameters, respectively. The linear discriminant function further enhanced the ability in detecting perimetric glaucoma (AROC = 0.970 and partial AROC = 0.172). Average P-RNFL thickness is the optimal individual OCT parameter to detect perimetric glaucoma. Simultaneous evaluation on disc morphology, P-RNFL, and M-IRL thickness can improve the diagnostic accuracy in diagnosing glaucoma.
FINITE DIFFERENCE THEORY, * LINEAR ALGEBRA , APPLIED MATHEMATICS, APPROXIMATION(MATHEMATICS), BOUNDARY VALUE PROBLEMS, COMPUTATIONS, HYPERBOLAS, MATHEMATICAL MODELS, NUMERICAL ANALYSIS, PARTIAL DIFFERENTIAL EQUATIONS, STABILITY.
Aptel, Florent; Sayous, Romain; Fortoul, Vincent; Beccat, Sylvain; Denis, Philippe
2010-12-01
To evaluate and compare the regional relationships between visual field sensitivity and retinal nerve fiber layer (RNFL) thickness as measured by spectral-domain optical coherence tomography (OCT) and scanning laser polarimetry. Prospective cross-sectional study. One hundred and twenty eyes of 120 patients (40 with healthy eyes, 40 with suspected glaucoma, and 40 with glaucoma) were tested on Cirrus-OCT, GDx VCC, and standard automated perimetry. Raw data on RNFL thickness were extracted for 256 peripapillary sectors of 1.40625 degrees each for the OCT measurement ellipse and 64 peripapillary sectors of 5.625 degrees each for the GDx VCC measurement ellipse. Correlations between peripapillary RNFL thickness in 6 sectors and visual field sensitivity in the 6 corresponding areas were evaluated using linear and logarithmic regression analysis. Receiver operating curve areas were calculated for each instrument. With spectral-domain OCT, the correlations (r(2)) between RNFL thickness and visual field sensitivity ranged from 0.082 (nasal RNFL and corresponding visual field area, linear regression) to 0.726 (supratemporal RNFL and corresponding visual field area, logarithmic regression). By comparison, with GDx-VCC, the correlations ranged from 0.062 (temporal RNFL and corresponding visual field area, linear regression) to 0.362 (supratemporal RNFL and corresponding visual field area, logarithmic regression). In pairwise comparisons, these structure-function correlations were generally stronger with spectral-domain OCT than with GDx VCC and with logarithmic regression than with linear regression. The largest areas under the receiver operating curve were seen for OCT superior thickness (0.963 ± 0.022; P < .001) in eyes with glaucoma and for OCT average thickness (0.888 ± 0.072; P < .001) in eyes with suspected glaucoma. The structure-function relationship was significantly stronger with spectral-domain OCT than with scanning laser polarimetry, and was better expressed logarithmically than linearly. Measurements with these 2 instruments should not be considered to be interchangeable. Copyright © 2010 Elsevier Inc. All rights reserved.
Damage-control laparoscopic partial cholecystectomy with an endoscopic linear stapler.
Özçınar, Beyza; Memişoğlu, Ecem; Gök, Ali Fuat Kaan; Ağcaoğlu, Orhan; Yanar, Fatih; İlhan, Mehmet; Yanar, Hakan Teoman; Günay, Kayıhan
2017-01-01
Several damage-control procedures have been described in the literature in case of severe Calot's triangle inflammation and fibrosis. In this report, we describe patients who underwent laparoscopic partial cholecystectomy using an endoscopic linear stapler. Five patients with acute cholecystitis underwent laparoscopic partial cholecystectomy in our clinic between January - December 2011. All patients had severe fibrosis and inflammation of Calot's triangle. The anterior and posterior walls of the gallbladder were totally resected if possible. The gallbladder was transected at its neck or Hartmann's pouch, leaving a remnant gallbladder pouch behind. Five patients had laparoscopic partial cholecystectomy with an endoscopic linear stapler. The main symptom of all patients on admission to the emergency room was abdominal pain. The mean time for the surgical procedure was 140 minutes (range, 120-180 minutes). Inflammation and fibrosis of Calot's triangle was detected in all patients during surgery and a phlegmonous gallbladder was detected in one patient. Surgical drains were used in all patients and no biliary leakage was detected. Remnant common bile duct calculi were detected in one patient and this patient underwent endoscopic retrograde cholangiopancreatography one month after surgery. When a reliable view of Calot's triangle cannot be obtained due to severe inflammation and fibrosis during laparoscopy, laparoscopic partial cholecystectomy seems to be a safe and feasible alternative to open surgery with an acceptable morbidity rate.
ERIC Educational Resources Information Center
Rule, David L.
Several regression methods were examined within the framework of weighted structural regression (WSR), comparing their regression weight stability and score estimation accuracy in the presence of outlier contamination. The methods compared are: (1) ordinary least squares; (2) WSR ridge regression; (3) minimum risk regression; (4) minimum risk 2;…
Unit Cohesion and the Surface Navy: Does Cohesion Affect Performance
1989-12-01
v. 68, 1968. Neter, J., Wasserman, W., and Kutner, M. H., Applied Linear Regression Models, 2d ed., Boston, MA: Irwin, 1989. Rand Corporation R-2607...Neter, J., Wasserman, W., and Kutner, M. H., Applied Linear Regression Models, 2d ed., Boston, MA: Irwin, 1989. SAS User’s Guide: Basics, Version 5 ed
1990-03-01
and M.H. Knuter. Applied Linear Regression Models. Homewood IL: Richard D. Erwin Inc., 1983. Pritsker, A. Alan B. Introduction to Simulation and SLAM...Control Variates in Simulation," European Journal of Operational Research, 42: (1989). Neter, J., W. Wasserman, and M.H. Xnuter. Applied Linear Regression Models
ERIC Educational Resources Information Center
Yan, Jun; Aseltine, Robert H., Jr.; Harel, Ofer
2013-01-01
Comparing regression coefficients between models when one model is nested within another is of great practical interest when two explanations of a given phenomenon are specified as linear models. The statistical problem is whether the coefficients associated with a given set of covariates change significantly when other covariates are added into…
Calibrated Peer Review for Interpreting Linear Regression Parameters: Results from a Graduate Course
ERIC Educational Resources Information Center
Enders, Felicity B.; Jenkins, Sarah; Hoverman, Verna
2010-01-01
Biostatistics is traditionally a difficult subject for students to learn. While the mathematical aspects are challenging, it can also be demanding for students to learn the exact language to use to correctly interpret statistical results. In particular, correctly interpreting the parameters from linear regression is both a vital tool and a…
ERIC Educational Resources Information Center
Richter, Tobias
2006-01-01
Most reading time studies using naturalistic texts yield data sets characterized by a multilevel structure: Sentences (sentence level) are nested within persons (person level). In contrast to analysis of variance and multiple regression techniques, hierarchical linear models take the multilevel structure of reading time data into account. They…
Some Applied Research Concerns Using Multiple Linear Regression Analysis.
ERIC Educational Resources Information Center
Newman, Isadore; Fraas, John W.
The intention of this paper is to provide an overall reference on how a researcher can apply multiple linear regression in order to utilize the advantages that it has to offer. The advantages and some concerns expressed about the technique are examined. A number of practical ways by which researchers can deal with such concerns as…
ERIC Educational Resources Information Center
Nelson, Dean
2009-01-01
Following the Guidelines for Assessment and Instruction in Statistics Education (GAISE) recommendation to use real data, an example is presented in which simple linear regression is used to evaluate the effect of the Montreal Protocol on atmospheric concentration of chlorofluorocarbons. This simple set of data, obtained from a public archive, can…
Quantum State Tomography via Linear Regression Estimation
Qi, Bo; Hou, Zhibo; Li, Li; Dong, Daoyi; Xiang, Guoyong; Guo, Guangcan
2013-01-01
A simple yet efficient state reconstruction algorithm of linear regression estimation (LRE) is presented for quantum state tomography. In this method, quantum state reconstruction is converted into a parameter estimation problem of a linear regression model and the least-squares method is employed to estimate the unknown parameters. An asymptotic mean squared error (MSE) upper bound for all possible states to be estimated is given analytically, which depends explicitly upon the involved measurement bases. This analytical MSE upper bound can guide one to choose optimal measurement sets. The computational complexity of LRE is O(d4) where d is the dimension of the quantum state. Numerical examples show that LRE is much faster than maximum-likelihood estimation for quantum state tomography. PMID:24336519
Applications of statistics to medical science, III. Correlation and regression.
Watanabe, Hiroshi
2012-01-01
In this third part of a series surveying medical statistics, the concepts of correlation and regression are reviewed. In particular, methods of linear regression and logistic regression are discussed. Arguments related to survival analysis will be made in a subsequent paper.
ERIC Educational Resources Information Center
Man, Yiu-Kwong
2012-01-01
Partial fraction decomposition is a useful technique often taught at senior secondary or undergraduate levels to handle integrations, inverse Laplace transforms or linear ordinary differential equations, etc. In recent years, an improved Heaviside's approach to partial fraction decomposition was introduced and developed by the author. An important…
NASA Technical Reports Server (NTRS)
Toomarian, N.; Fijany, A.; Barhen, J.
1993-01-01
Evolutionary partial differential equations are usually solved by decretization in time and space, and by applying a marching in time procedure to data and algorithms potentially parallelized in the spatial domain.
A phenomenological biological dose model for proton therapy based on linear energy transfer spectra.
Rørvik, Eivind; Thörnqvist, Sara; Stokkevåg, Camilla H; Dahle, Tordis J; Fjaera, Lars Fredrik; Ytre-Hauge, Kristian S
2017-06-01
The relative biological effectiveness (RBE) of protons varies with the radiation quality, quantified by the linear energy transfer (LET). Most phenomenological models employ a linear dependency of the dose-averaged LET (LET d ) to calculate the biological dose. However, several experiments have indicated a possible non-linear trend. Our aim was to investigate if biological dose models including non-linear LET dependencies should be considered, by introducing a LET spectrum based dose model. The RBE-LET relationship was investigated by fitting of polynomials from 1st to 5th degree to a database of 85 data points from aerobic in vitro experiments. We included both unweighted and weighted regression, the latter taking into account experimental uncertainties. Statistical testing was performed to decide whether higher degree polynomials provided better fits to the data as compared to lower degrees. The newly developed models were compared to three published LET d based models for a simulated spread out Bragg peak (SOBP) scenario. The statistical analysis of the weighted regression analysis favored a non-linear RBE-LET relationship, with the quartic polynomial found to best represent the experimental data (P = 0.010). The results of the unweighted regression analysis were on the borderline of statistical significance for non-linear functions (P = 0.053), and with the current database a linear dependency could not be rejected. For the SOBP scenario, the weighted non-linear model estimated a similar mean RBE value (1.14) compared to the three established models (1.13-1.17). The unweighted model calculated a considerably higher RBE value (1.22). The analysis indicated that non-linear models could give a better representation of the RBE-LET relationship. However, this is not decisive, as inclusion of the experimental uncertainties in the regression analysis had a significant impact on the determination and ranking of the models. As differences between the models were observed for the SOBP scenario, both non-linear LET spectrum- and linear LET d based models should be further evaluated in clinically realistic scenarios. © 2017 American Association of Physicists in Medicine.
Accuracy of analytic energy level formulas applied to hadronic spectroscopy of heavy mesons
NASA Technical Reports Server (NTRS)
Badavi, Forooz F.; Norbury, John W.; Wilson, John W.; Townsend, Lawrence W.
1988-01-01
Linear and harmonic potential models are used in the nonrelativistic Schroedinger equation to obtain article mass spectra for mesons as bound states of quarks. The main emphasis is on the linear potential where exact solutions of the S-state eigenvalues and eigenfunctions and the asymptotic solution for the higher order partial wave are obtained. A study of the accuracy of two analytical energy level formulas as applied to heavy mesons is also included. Cornwall's formula is found to be particularly accurate and useful as a predictor of heavy quarkonium states. Exact solution for all partial waves of eigenvalues and eigenfunctions for a harmonic potential is also obtained and compared with the calculated discrete spectra of the linear potential. Detailed derivations of the eigenvalues and eigenfunctions of the linear and harmonic potentials are presented in appendixes.
Regression of non-linear coupling of noise in LIGO detectors
NASA Astrophysics Data System (ADS)
Da Silva Costa, C. F.; Billman, C.; Effler, A.; Klimenko, S.; Cheng, H.-P.
2018-03-01
In 2015, after their upgrade, the advanced Laser Interferometer Gravitational-Wave Observatory (LIGO) detectors started acquiring data. The effort to improve their sensitivity has never stopped since then. The goal to achieve design sensitivity is challenging. Environmental and instrumental noise couple to the detector output with different, linear and non-linear, coupling mechanisms. The noise regression method we use is based on the Wiener–Kolmogorov filter, which uses witness channels to make noise predictions. We present here how this method helped to determine complex non-linear noise couplings in the output mode cleaner and in the mirror suspension system of the LIGO detector.
Evaluating Feynman integrals by the hypergeometry
NASA Astrophysics Data System (ADS)
Feng, Tai-Fu; Chang, Chao-Hsi; Chen, Jian-Bin; Gu, Zhi-Hua; Zhang, Hai-Bin
2018-02-01
The hypergeometric function method naturally provides the analytic expressions of scalar integrals from concerned Feynman diagrams in some connected regions of independent kinematic variables, also presents the systems of homogeneous linear partial differential equations satisfied by the corresponding scalar integrals. Taking examples of the one-loop B0 and massless C0 functions, as well as the scalar integrals of two-loop vacuum and sunset diagrams, we verify our expressions coinciding with the well-known results of literatures. Based on the multiple hypergeometric functions of independent kinematic variables, the systems of homogeneous linear partial differential equations satisfied by the mentioned scalar integrals are established. Using the calculus of variations, one recognizes the system of linear partial differential equations as stationary conditions of a functional under some given restrictions, which is the cornerstone to perform the continuation of the scalar integrals to whole kinematic domains numerically with the finite element methods. In principle this method can be used to evaluate the scalar integrals of any Feynman diagrams.
Solution of two-body relativistic bound state equations with confining plus Coulomb interactions
NASA Technical Reports Server (NTRS)
Maung, Khin Maung; Kahana, David E.; Norbury, John W.
1992-01-01
Studies of meson spectroscopy have often employed a nonrelativistic Coulomb plus Linear Confining potential in position space. However, because the quarks in mesons move at an appreciable fraction of the speed of light, it is necessary to use a relativistic treatment of the bound state problem. Such a treatment is most easily carried out in momentum space. However, the position space Linear and Coulomb potentials lead to singular kernels in momentum space. Using a subtraction procedure we show how to remove these singularities exactly and thereby solve the Schroedinger equation in momentum space for all partial waves. Furthermore, we generalize the Linear and Coulomb potentials to relativistic kernels in four dimensional momentum space. Again we use a subtraction procedure to remove the relativistic singularities exactly for all partial waves. This enables us to solve three dimensional reductions of the Bethe-Salpeter equation. We solve six such equations for Coulomb plus Confining interactions for all partial waves.
Lu, Tao; Lu, Minggen; Wang, Min; Zhang, Jun; Dong, Guang-Hui; Xu, Yong
2017-12-18
Longitudinal competing risks data frequently arise in clinical studies. Skewness and missingness are commonly observed for these data in practice. However, most joint models do not account for these data features. In this article, we propose partially linear mixed-effects joint models to analyze skew longitudinal competing risks data with missingness. In particular, to account for skewness, we replace the commonly assumed symmetric distributions by asymmetric distribution for model errors. To deal with missingness, we employ an informative missing data model. The joint models that couple the partially linear mixed-effects model for the longitudinal process, the cause-specific proportional hazard model for competing risks process and missing data process are developed. To estimate the parameters in the joint models, we propose a fully Bayesian approach based on the joint likelihood. To illustrate the proposed model and method, we implement them to an AIDS clinical study. Some interesting findings are reported. We also conduct simulation studies to validate the proposed method.
Goodarzi, Mohammad; Jensen, Richard; Vander Heyden, Yvan
2012-12-01
A Quantitative Structure-Retention Relationship (QSRR) is proposed to estimate the chromatographic retention of 83 diverse drugs on a Unisphere poly butadiene (PBD) column, using isocratic elutions at pH 11.7. Previous work has generated QSRR models for them using Classification And Regression Trees (CART). In this work, Ant Colony Optimization is used as a feature selection method to find the best molecular descriptors from a large pool. In addition, several other selection methods have been applied, such as Genetic Algorithms, Stepwise Regression and the Relief method, not only to evaluate Ant Colony Optimization as a feature selection method but also to investigate its ability to find the important descriptors in QSRR. Multiple Linear Regression (MLR) and Support Vector Machines (SVMs) were applied as linear and nonlinear regression methods, respectively, giving excellent correlation between the experimental, i.e. extrapolated to a mobile phase consisting of pure water, and predicted logarithms of the retention factors of the drugs (logk(w)). The overall best model was the SVM one built using descriptors selected by ACO. Copyright © 2012 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Samadi-Maybodi, Abdolraouf; Darzi, S. K. Hassani Nejad
2008-10-01
Resolution of binary mixtures of vitamin B12, methylcobalamin and B12 coenzyme with minimum sample pre-treatment and without analyte separation has been successfully achieved by methods of partial least squares algorithm with one dependent variable (PLS1), orthogonal signal correction/partial least squares (OSC/PLS), principal component regression (PCR) and hybrid linear analysis (HLA). Data of analysis were obtained from UV-vis spectra. The UV-vis spectra of the vitamin B12, methylcobalamin and B12 coenzyme were recorded in the same spectral conditions. The method of central composite design was used in the ranges of 10-80 mg L -1 for vitamin B12 and methylcobalamin and 20-130 mg L -1 for B12 coenzyme. The models refinement procedure and validation were performed by cross-validation. The minimum root mean square error of prediction (RMSEP) was 2.26 mg L -1 for vitamin B12 with PLS1, 1.33 mg L -1 for methylcobalamin with OSC/PLS and 3.24 mg L -1 for B12 coenzyme with HLA techniques. Figures of merit such as selectivity, sensitivity, analytical sensitivity and LOD were determined for three compounds. The procedure was successfully applied to simultaneous determination of three compounds in synthetic mixtures and in a pharmaceutical formulation.
Evaluating Differential Effects Using Regression Interactions and Regression Mixture Models
ERIC Educational Resources Information Center
Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung
2015-01-01
Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their…
SEMIPARAMETRIC QUANTILE REGRESSION WITH HIGH-DIMENSIONAL COVARIATES
Zhu, Liping; Huang, Mian; Li, Runze
2012-01-01
This paper is concerned with quantile regression for a semiparametric regression model, in which both the conditional mean and conditional variance function of the response given the covariates admit a single-index structure. This semiparametric regression model enables us to reduce the dimension of the covariates and simultaneously retains the flexibility of nonparametric regression. Under mild conditions, we show that the simple linear quantile regression offers a consistent estimate of the index parameter vector. This is a surprising and interesting result because the single-index model is possibly misspecified under the linear quantile regression. With a root-n consistent estimate of the index vector, one may employ a local polynomial regression technique to estimate the conditional quantile function. This procedure is computationally efficient, which is very appealing in high-dimensional data analysis. We show that the resulting estimator of the quantile function performs asymptotically as efficiently as if the true value of the index vector were known. The methodologies are demonstrated through comprehensive simulation studies and an application to a real dataset. PMID:24501536
Detection of partial-thickness tears in ligaments and tendons by Stokes-polarimetry imaging
NASA Astrophysics Data System (ADS)
Kim, Jihoon; John, Raheel; Walsh, Joseph T.
2008-02-01
A Stokes polarimetry imaging (SPI) system utilizes an algorithm developed to construct degree of polarization (DoP) image maps from linearly polarized light illumination. Partial-thickness tears of turkey tendons were imaged by the SPI system in order to examine the feasibility of the system to detect partial-thickness rotator cuff tear or general tendon pathology. The rotating incident polarization angle (IPA) for the linearly polarized light provides a way to analyze different tissue types which may be sensitive to IPA variations. Degree of linear polarization (DoLP) images revealed collagen fiber structure, related to partial-thickness tears, better than standard intensity images. DoLP images also revealed structural changes in tears that are related to the tendon load. DoLP images with red-wavelength-filtered incident light may show tears and related organization of collagen fiber structure at a greater depth from the tendon surface. Degree of circular polarization (DoCP) images exhibited well the horizontal fiber orientation that is not parallel to the vertically aligned collagen fibers of the tendon. The SPI system's DOLP images reveal alterations in tendons and ligaments, which have a tissue matrix consisting largely of collagen, better than intensity images. All polarized images showed modulated intensity as the IPA was varied. The optimal detection of the partial-thickness tendon tears at a certain IPA was observed. The SPI system with varying IPA and spectral information can improve the detection of partial-thickness rotator cuff tears by higher visibility of fiber orientations and thereby improve diagnosis and treatment of tendon related injuries.
Southern Indian Ocean SST as a modulator for the progression of Indian summer monsoon
NASA Astrophysics Data System (ADS)
Shahi, Namendra Kumar; Rai, Shailendra; Mishra, Nishant
2018-01-01
This study explores the possibility of southern Indian Ocean (SIO) sea surface temperature (SST) as a modulator for the early phase of Indian summer monsoon and its possible physical mechanism. A dipole-like structure is obtained from the empirical orthogonal function (EOF) analysis which is similar to an Indian Ocean subtropical dipole (IOSD) found earlier. A subtropical dipole index (SDI) is defined based on the SST anomaly over the positive and negative poles. The regression map of rainfall over India in the month of June corresponding to the SDI during 1983-2013 shows negative patterns along the Western Ghats and Central India. However, the regression pattern is insignificant during 1952-1982. The multiple linear regression models and partial correlation analysis also indicate that the SDI acts as a dominant factor to influence the rainfall over India in the month of June during 1983-2013. The similar result is also obtained with the help of composite rainfall over the land points of India in the month of June for positive (negative) SDI events. It is also observed that the positive (negative) SDI delays (early) the onset dates of Indian monsoon over Kerala during the time domain of our study. The study is further extended to identify the physical mechanism of this impact, and it is found that the heating (cooling) in the region covering SDI changes the circulation pattern in the SIO and hence impacts the progression of monsoon in India.
CIEL*a*b* color space predictive models for colorimetry devices--analysis of perfume quality.
Korifi, Rabia; Le Dréau, Yveline; Antinelli, Jean-François; Valls, Robert; Dupuy, Nathalie
2013-01-30
Color perception plays a major role in the consumer evaluation of perfume quality. Consumers need first to be entirely satisfied with the sensory properties of products, before other quality dimensions become relevant. The evaluation of complex mixtures color presents a challenge even for modern analytical techniques. A variety of instruments are available for color measurement. They can be classified as tristimulus colorimeters and spectrophotometers. Obsolescence of the electronics of old tristimulus colorimeter arises from the difficulty in finding repair parts and leads to its replacement by more modern instruments. High quality levels in color measurement, i.e., accuracy and reliability in color control are the major advantages of the new generation of color instrumentation, the integrating sphere spectrophotometer. Two models of spectrophotometer were tested in transmittance mode, employing the d/0° geometry. The CIEL(*)a(*)b(*) color space parameters were measured with each instrument for 380 samples of raw materials and bases used in the perfume compositions. The results were graphically compared between the colorimeter device and the spectrophotometer devices. All color space parameters obtained with the colorimeter were used as dependent variables to generate regression equations with values obtained from the spectrophotometers. The data was statistically analyzed to create predictive model between the reference and the target instruments through two methods. The first method uses linear regression analysis and the second method consists of partial least square regression (PLS) on each component. Copyright © 2012 Elsevier B.V. All rights reserved.
Chen, Tzu-An; Baranowski, Janice; Thompson, Deborah; Baranowski, Tom
2013-01-01
Abstract Background Children's physical activity (PA) is inversely associated with children's weight status. Parents may be an important influence on children's PA by restricting sedentary time or supporting PA. The aim of this study was to investigate the association of PA and screen-media–related [television (TV) and videogame] parenting practices with children's PA. Methods Secondary analyses of baseline data were performed from an intervention with 9- to 12-year-olds who received active or inactive videogames (n=83) to promote PA. Children's PA was assessed with 1 week of accelerometry at baseline. Parents reported their PA, TV, and videogame parenting practices and child's bedroom screen-media availability. Associations were investigated using Spearman's partial correlations and linear regressions. Results Although several TV and videogame parenting practices were significantly intercorrelated, only a few significant correlations existed between screen-media and PA parenting practices. In linear regression models, restrictive TV parenting practices were associated with greater child sedentary time (p=0.03) and less moderate-to-vigorous PA (MVPA; p=0.01). PA logistic support parenting practices were associated with greater child MVPA (p=0.03). Increased availability of screen-media equipment in the child's bedroom was associated with more sedentary time (p=0.02) and less light PA (p=0.01) and MVPA (p=0.05) in all three models. Conclusion In this cross-sectional sample, restrictive screen-media and supportive PA parenting practices had opposite associations with children's PA. Longitudinal and experimental child PA studies should assess PA and screen-media parenting separately to understand how parents influence their child's PA behaviors and whether the child's baseline PA or screen media behaviors affect the parent's use of parenting practices. Recommendations to remove screens from children's bedrooms may also affect their PA. PMID:24028564
Chen, Fangfang; Wang, Wenpeng; Teng, Yue; Hou, Dongqing; Zhao, Xiaoyuan; Yang, Ping; Yan, Yinkun; Mi, Jie
2014-06-01
To explore the relationship between high-sensitivity C-reactive protein (hsCRP) and obesity/metabolic syndrome (MetS) related factors in children. 403 children aged 10-14 and born in Beijing were involved in this study. Height, weight, waist circumference, fat mass percentage (Fat%), blood pressure (BP), hsCRP, triglyceride (TG), total cholesterol (TC), fasting plasma glucose (FPG), high and low density lipoprotein cholesterol (HDL-C, LDL-C) were observed among these children. hsCRP was transformed with base 10 logarithm (lgCRP). MetS was defined according to the International Diabetes Federation 2007 definition. Associations between MetS related components and hsCRP were tested using partial correlation analysis, analysis of covariance and linear regression models. 1) lgCRP was positively correlated with BMI, waist circumference, Fat%,BP, FPG, LDL-C and TC while negatively correlated with HDL-C. With BMI under control, the relationships disappeared, but LDL-C (r = 0.102). 2) The distributions of lgCRP showed obvious differences in all the metabolic indices, in most groups, respectively. With BMI under control, close relationships between lgCRP and high blood pressure/high TG disappeared and the relationship with MetS weakened. 3) Through linear regression models, factors as waist circumference, BMI, Fat% were the strongest factors related to hsCRP, followed by systolic BP, HDL-C, diastolic BP, TG and LDL-C. With BMI under control, the relationships disappeared, but LDL-C(β = 0.045). hsCRP was correlated with child obesity, lipid metabolism and MetS. Waist circumference was the strongest factors related with hsCRP. Obesity was the strongest and the independent influencing factor of hsCRP.
Nickel and blood counts in workers exposed to urban stressors.
Rosati, Maria Valeria; Casale, Teodorico; Ciarrocca, Manuela; Weiderpass, Elisabete; Capozzella, Assunta; Schifano, Maria Pia; Tomei, Francesco; Nieto, Hector Alberto; Marrocco, Mariasilvia; Tomei, Gianfranco; Caciari, Tiziana; Sancini, Angela
2016-06-01
Nickel (Ni) and Ni compounds are widely present in the urban air. The purpose of this study is to estimate exposure of individuals to Ni and the correlation between this exposure and the values of blood counts in outdoor workers. This study focused on a sample of 101 outdoor workers (55 male and 46 female; 65 nonsmokers and 36 smokers), all employed in the municipal police in a large Italian city. The personal levels of exposure to Ni were assessed through (a) environmental monitoring of Ni present in the urban air obtained from individual samples and (b) biological monitoring of urinary and blood Ni. The blood count parameters were obtained from the hemochromocytometric tests. Pearson correlation coefficients (r) were calculated to assess the association between the blood and urinary Ni and the complete blood count. Multiple linear regression models were used to examine the associations between the complete blood count and the independent variables (age, gender, years of work for current tasks, cigarette smoking habit (current and never smoker), values of airborne Ni, and blood and urinary Ni). Multiple linear regression analysis performed on the total group of 101 subjects confirms the association among the red blood cells count, the hematocrit, and the urinary Ni (R(2) = 0.520, p = 0.025 and R(2) = 0.530, p = 0.030). These results should lead to further studies on the effects of Ni in working populations exposed to urban pollutants. The possibility that the associations found in our study may be partially explained by other urban pollutants (such as benzene, toluene, and other heavy metals) not taken into consideration in this study cannot be ruled out. © The Author(s) 2014.
Weiser, Sheri D; Gupta, Reshma; Tsai, Alexander C; Frongillo, Edward A; Grede, Nils; Kumbakumba, Elias; Kawuma, Annet; Hunt, Peter W; Martin, Jeffrey N; Bangsberg, David R
2012-10-01
To investigate whether time on antiretroviral therapy (ART) is associated with improvements in food security and nutritional status, and the extent to which associations are mediated by improved physical health status. The Uganda AIDS Rural Treatment Outcomes study, a prospective cohort of HIV-infected adults newly initiating ART in Mbarara, Uganda. Participants initiating ART underwent quarterly structured interview and blood draws. The primary explanatory variable was time on ART, constructed as a set of binary variables for each 3-month period. Outcomes were food insecurity, nutritional status, and PHS. We fit multiple regression models with cluster-correlated robust estimates of variance to account for within-person dependence of observations over time, and analyses were adjusted for clinical and sociodemographic characteristics. Two hundred twenty-eight ART-naive participants were followed for up to 3 years, and 41% were severely food insecure at baseline. The mean food insecurity score progressively declined (test for linear trend P < 0.0001), beginning with the second quarter (b = -1.6; 95% confidence interval: -2.7 to -0.45) and ending with the final quarter (b = -6.4; 95% confidence interval: -10.3 to -2.5). PHS and nutritional status improved in a linear fashion over study follow-up (P < 0.001). Inclusion of PHS in the regression model attenuated the relationship between ART duration and food security. Among HIV-infected individuals in Uganda, food insecurity decreased and nutritional status and PHS improved over time after initiation of ART. Changes in food insecurity were partially explained by improvements in PHS. These data support early initiation of ART in resource-poor settings before decline in functional status to prevent worsening food insecurity and its detrimental effects on HIV treatment outcomes.
Jilcott, Stephanie B; Wall-Bassett, Elizabeth D; Burke, Sloane C; Moore, Justin B
2011-11-01
Obesity disproportionately affects low-income and minority individuals and has been linked with food insecurity, particularly among women. More research is needed to examine potential mechanisms linking obesity and food insecurity. Therefore, this study's purpose was to examine cross-sectional associations between food insecurity, Supplemental Nutrition Assistance Program (SNAP) benefits per household member, perceived stress, and body mass index (BMI) among female SNAP participants in eastern North Carolina (n=202). Women were recruited from the Pitt County Department of Social Services between October 2009 and April 2010. Household food insecurity was measured using the validated US Department of Agriculture 18-item food security survey module. Perceived stress was measured using the 14-item Cohen's Perceived Stress Scale. SNAP benefits and number of children in the household were self-reported and used to calculate benefits per household member. BMI was calculated from measured height and weight (as kg/m(2)). Multivariate linear regression was used to examine associations between BMI, SNAP benefits, stress, and food insecurity while adjusting for age and physical activity. In adjusted linear regression analyses, perceived stress was positively related to food insecurity (P<0.0001), even when SNAP benefits were included in the model. BMI was positively associated with food insecurity (P=0.04). Mean BMI was significantly greater among women receiving <$150 in SNAP benefits per household member vs those receiving ≥$150 in benefits per household member (35.8 vs 33.1; P=0.04). Results suggest that provision of adequate SNAP benefits per household member might partially ameliorate the negative effects of food insecurity on BMI. Copyright © 2011 American Dietetic Association. Published by Elsevier Inc. All rights reserved.
Gupta, Deepak K; Claggett, Brian; Wells, Quinn; Cheng, Susan; Li, Man; Maruthur, Nisa; Selvin, Elizabeth; Coresh, Josef; Konety, Suma; Butler, Kenneth R; Mosley, Thomas; Boerwinkle, Eric; Hoogeveen, Ron; Ballantyne, Christie M; Solomon, Scott D
2015-01-01
Background Natriuretic peptides promote natriuresis, diuresis, and vasodilation. Experimental deficiency of natriuretic peptides leads to hypertension (HTN) and cardiac hypertrophy, conditions more common among African Americans. Hospital-based studies suggest that African Americans may have reduced circulating natriuretic peptides, as compared to Caucasians, but definitive data from community-based cohorts are lacking. Methods and Results We examined plasma N-terminal pro B-type natriuretic peptide (NTproBNP) levels according to race in 9137 Atherosclerosis Risk in Communities (ARIC) Study participants (22% African American) without prevalent cardiovascular disease at visit 4 (1996–1998). Multivariable linear and logistic regression analyses were performed adjusting for clinical covariates. Among African Americans, percent European ancestry was determined from genetic ancestry informative markers and then examined in relation to NTproBNP levels in multivariable linear regression analysis. NTproBNP levels were significantly lower in African Americans (median, 43 pg/mL; interquartile range [IQR], 18, 88) than Caucasians (median, 68 pg/mL; IQR, 36, 124; P<0.0001). In multivariable models, adjusted log NTproBNP levels were 40% lower (95% confidence interval [CI], −43, −36) in African Americans, compared to Caucasians, which was consistent across subgroups of age, gender, HTN, diabetes, insulin resistance, and obesity. African-American race was also significantly associated with having nondetectable NTproBNP (adjusted OR, 5.74; 95% CI, 4.22, 7.80). In multivariable analyses in African Americans, a 10% increase in genetic European ancestry was associated with a 7% (95% CI, 1, 13) increase in adjusted log NTproBNP. Conclusions African Americans have lower levels of plasma NTproBNP than Caucasians, which may be partially owing to genetic variation. Low natriuretic peptide levels in African Americans may contribute to the greater risk for HTN and its sequalae in this population. PMID:25999400
Liu, Bin; Geng, Huizhen; Yang, Juan; Zhang, Ying; Deng, Langhui; Chen, Weiqing; Wang, Zilian
2016-03-17
Hyperlipidemia and high fasting plasma glucose levels at the first prenatal visit (First Visit FPG) are both related to gestational diabetes mellitus, maternal obesity/overweight and fetal overgrowth. The purpose of the present study is to investigate the correlation between First Visit FPG and lipid concentrations, and their potential association with offspring size at delivery. Pregnant women that received regular prenatal care and delivered in our center in 2013 were recruited for the study. Fasting plasma glucose levels were tested at the first prenatal visit (First Visit FPG) and prior to delivery (Before Delivery FPG). HbA1c and lipid profiles were examined at the time of OGTT test. Maternal and neonatal clinical data were collected for analysis. Data was analyzed by independent sample t test, Pearson correlation, and Chi-square test, followed by partial correlation and multiple linear regression analyses to confirm association. Statistical significance level was α =0.05. Analyses were based on 1546 mother-baby pairs. First Visit FPG was not correlated with any lipid parameters after adjusting for maternal pregravid BMI, maternal age and gestational age at First Visit FPG. HbA1c was positively correlated with triglyceride and Apolipoprotein B in the whole cohort and in the NGT group after adjusting for maternal age and maternal BMI at OGTT test. Multiple linear regression analyses showed neonatal birth weight, head circumference and shoulder circumference were all associated with First Visit FPG and triglyceride levels. Fasting plasma glucose at first prenatal visit is not associated with lipid concentrations in mid-pregnancy, but may influence fetal growth together with triglyceride concentration.
Kawasaki, Y; Tamaura, Y; Akamatsu, R; Sakai, M; Fujiwara, K
2018-01-01
Nursing staff have an important role in patients' nutritional care. The aim of this study was to demonstrate how the practice of sharing a patient's nutritional status with colleagues was affected by the nursing staff's attitude, knowledge and their priority to provide nutritional care. The participants were 492 nursing staff. We obtained participants' demographic data, the practice of sharing patients' nutritional information and information about participants' knowledge, attitude and priority of providing nutritional care by the questionnaire. We performed partial correlation analyses and linear regression analyses to describe the relationship between the total scores of the practice of sharing patients' nutritional information based on their knowledge, attitude and priority to provide nutritional care. Among the 492 participants, 396 nursing staff (80.5%) completed the questionnaire and were included in analyses. Mean±s.d. of total score of the 396 participants was 8.4±3.1. Nursing staff shared information when they had a high nutritional knowledge (r=0.36, P<0.01) and attitude (r=0.13, P<0.05); however, their correlation coefficients were low. In the linear regression analyses, job categories (β=-0.28, P<0.01), knowledge (β=0.33, P<0.01) and attitude (β=0.10, P<0.05) were independently associated with the practice of sharing information. Nursing staff's priority to provide nutritional care practice was not significantly associated with the practice of sharing information. Knowledge and attitude were independently associated with the practice of sharing patients' nutrition information with colleagues, regardless of their priority to provide nutritional care. An effective approach should be taken to improve the practice of providing nutritional care practice.
NASA Astrophysics Data System (ADS)
Filimonov, M. Yu.
2017-12-01
The method of special series with recursively calculated coefficients is used to solve nonlinear partial differential equations. The recurrence of finding the coefficients of the series is achieved due to a special choice of functions, in powers of which the solution is expanded in a series. We obtain a sequence of linear partial differential equations to find the coefficients of the series constructed. In many cases, one can deal with a sequence of linear ordinary differential equations. We construct classes of solutions in the form of convergent series for a certain class of nonlinear evolution equations. A new class of solutions of generalized Boussinesque equation with an arbitrary function in the form of a convergent series is constructed.
Prediction of siRNA potency using sparse logistic regression.
Hu, Wei; Hu, John
2014-06-01
RNA interference (RNAi) can modulate gene expression at post-transcriptional as well as transcriptional levels. Short interfering RNA (siRNA) serves as a trigger for the RNAi gene inhibition mechanism, and therefore is a crucial intermediate step in RNAi. There have been extensive studies to identify the sequence characteristics of potent siRNAs. One such study built a linear model using LASSO (Least Absolute Shrinkage and Selection Operator) to measure the contribution of each siRNA sequence feature. This model is simple and interpretable, but it requires a large number of nonzero weights. We have introduced a novel technique, sparse logistic regression, to build a linear model using single-position specific nucleotide compositions which has the same prediction accuracy of the linear model based on LASSO. The weights in our new model share the same general trend as those in the previous model, but have only 25 nonzero weights out of a total 84 weights, a 54% reduction compared to the previous model. Contrary to the linear model based on LASSO, our model suggests that only a few positions are influential on the efficacy of the siRNA, which are the 5' and 3' ends and the seed region of siRNA sequences. We also employed sparse logistic regression to build a linear model using dual-position specific nucleotide compositions, a task LASSO is not able to accomplish well due to its high dimensional nature. Our results demonstrate the superiority of sparse logistic regression as a technique for both feature selection and regression over LASSO in the context of siRNA design.
Libiger, Ondrej; Schork, Nicholas J.
2015-01-01
It is now feasible to examine the composition and diversity of microbial communities (i.e., “microbiomes”) that populate different human organs and orifices using DNA sequencing and related technologies. To explore the potential links between changes in microbial communities and various diseases in the human body, it is essential to test associations involving different species within and across microbiomes, environmental settings and disease states. Although a number of statistical techniques exist for carrying out relevant analyses, it is unclear which of these techniques exhibit the greatest statistical power to detect associations given the complexity of most microbiome datasets. We compared the statistical power of principal component regression, partial least squares regression, regularized regression, distance-based regression, Hill's diversity measures, and a modified test implemented in the popular and widely used microbiome analysis methodology “Metastats” across a wide range of simulated scenarios involving changes in feature abundance between two sets of metagenomic samples. For this purpose, simulation studies were used to change the abundance of microbial species in a real dataset from a published study examining human hands. Each technique was applied to the same data, and its ability to detect the simulated change in abundance was assessed. We hypothesized that a small subset of methods would outperform the rest in terms of the statistical power. Indeed, we found that the Metastats technique modified to accommodate multivariate analysis and partial least squares regression yielded high power under the models and data sets we studied. The statistical power of diversity measure-based tests, distance-based regression and regularized regression was significantly lower. Our results provide insight into powerful analysis strategies that utilize information on species counts from large microbiome data sets exhibiting skewed frequency distributions obtained on a small to moderate number of samples. PMID:26734061
The Seismic Tool-Kit (STK): an open source software for seismology and signal processing.
NASA Astrophysics Data System (ADS)
Reymond, Dominique
2016-04-01
We present an open source software project (GNU public license), named STK: Seismic ToolKit, that is dedicated mainly for seismology and signal processing. The STK project that started in 2007, is hosted by SourceForge.net, and count more than 19 500 downloads at the date of writing. The STK project is composed of two main branches: First, a graphical interface dedicated to signal processing (in the SAC format (SAC_ASCII and SAC_BIN): where the signal can be plotted, zoomed, filtered, integrated, derivated, ... etc. (a large variety of IFR and FIR filter is proposed). The estimation of spectral density of the signal are performed via the Fourier transform, with visualization of the Power Spectral Density (PSD) in linear or log scale, and also the evolutive time-frequency representation (or sonagram). The 3-components signals can be also processed for estimating their polarization properties, either for a given window, or either for evolutive windows along the time. This polarization analysis is useful for extracting the polarized noises, differentiating P waves, Rayleigh waves, Love waves, ... etc. Secondly, a panel of Utilities-Program are proposed for working in a terminal mode, with basic programs for computing azimuth and distance in spherical geometry, inter/auto-correlation, spectral density, time-frequency for an entire directory of signals, focal planes, and main components axis, radiation pattern of P waves, Polarization analysis of different waves (including noize), under/over-sampling the signals, cubic-spline smoothing, and linear/non linear regression analysis of data set. A MINimum library of Linear AlGebra (MIN-LINAG) is also provided for computing the main matrix process like: QR/QL decomposition, Cholesky solve of linear system, finding eigen value/eigen vectors, QR-solve/Eigen-solve of linear equations systems ... etc. STK is developed in C/C++, mainly under Linux OS, and it has been also partially implemented under MS-Windows. Usefull links: http://sourceforge.net/projects/seismic-toolkit/ http://sourceforge.net/p/seismic-toolkit/wiki/browse_pages/
Feedback linearization for control of air breathing engines
NASA Technical Reports Server (NTRS)
Phillips, Stephen; Mattern, Duane
1991-01-01
The method of feedback linearization for control of the nonlinear nozzle and compressor components of an air breathing engine is presented. This method overcomes the need for a large number of scheduling variables and operating points to accurately model highly nonlinear plants. Feedback linearization also results in linear closed loop system performance simplifying subsequent control design. Feedback linearization is used for the nonlinear partial engine model and performance is verified through simulation.
Predictive and mechanistic multivariate linear regression models for reaction development
Santiago, Celine B.; Guo, Jing-Yao
2018-01-01
Multivariate Linear Regression (MLR) models utilizing computationally-derived and empirically-derived physical organic molecular descriptors are described in this review. Several reports demonstrating the effectiveness of this methodological approach towards reaction optimization and mechanistic interrogation are discussed. A detailed protocol to access quantitative and predictive MLR models is provided as a guide for model development and parameter analysis. PMID:29719711
Adding a Parameter Increases the Variance of an Estimated Regression Function
ERIC Educational Resources Information Center
Withers, Christopher S.; Nadarajah, Saralees
2011-01-01
The linear regression model is one of the most popular models in statistics. It is also one of the simplest models in statistics. It has received applications in almost every area of science, engineering and medicine. In this article, the authors show that adding a predictor to a linear model increases the variance of the estimated regression…
Using nonlinear quantile regression to estimate the self-thinning boundary curve
Quang V. Cao; Thomas J. Dean
2015-01-01
The relationship between tree size (quadratic mean diameter) and tree density (number of trees per unit area) has been a topic of research and discussion for many decades. Starting with Reineke in 1933, the maximum size-density relationship, on a log-log scale, has been assumed to be linear. Several techniques, including linear quantile regression, have been employed...
Laurens, L M L; Wolfrum, E J
2013-12-18
One of the challenges associated with microalgal biomass characterization and the comparison of microalgal strains and conversion processes is the rapid determination of the composition of algae. We have developed and applied a high-throughput screening technology based on near-infrared (NIR) spectroscopy for the rapid and accurate determination of algal biomass composition. We show that NIR spectroscopy can accurately predict the full composition using multivariate linear regression analysis of varying lipid, protein, and carbohydrate content of algal biomass samples from three strains. We also demonstrate a high quality of predictions of an independent validation set. A high-throughput 96-well configuration for spectroscopy gives equally good prediction relative to a ring-cup configuration, and thus, spectra can be obtained from as little as 10-20 mg of material. We found that lipids exhibit a dominant, distinct, and unique fingerprint in the NIR spectrum that allows for the use of single and multiple linear regression of respective wavelengths for the prediction of the biomass lipid content. This is not the case for carbohydrate and protein content, and thus, the use of multivariate statistical modeling approaches remains necessary.
Zhang, Xin; Liu, Pan; Chen, Yuguang; Bai, Lu; Wang, Wei
2014-01-01
The primary objective of this study was to identify whether the frequency of traffic conflicts at signalized intersections can be modeled. The opposing left-turn conflicts were selected for the development of conflict predictive models. Using data collected at 30 approaches at 20 signalized intersections, the underlying distributions of the conflicts under different traffic conditions were examined. Different conflict-predictive models were developed to relate the frequency of opposing left-turn conflicts to various explanatory variables. The models considered include a linear regression model, a negative binomial model, and separate models developed for four traffic scenarios. The prediction performance of different models was compared. The frequency of traffic conflicts follows a negative binominal distribution. The linear regression model is not appropriate for the conflict frequency data. In addition, drivers behaved differently under different traffic conditions. Accordingly, the effects of conflicting traffic volumes on conflict frequency vary across different traffic conditions. The occurrences of traffic conflicts at signalized intersections can be modeled using generalized linear regression models. The use of conflict predictive models has potential to expand the uses of surrogate safety measures in safety estimation and evaluation.