Ridge: a computer program for calculating ridge regression estimates
Donald E. Hilt; Donald W. Seegrist
1977-01-01
Least-squares coefficients for multiple-regression models may be unstable when the independent variables are highly correlated. Ridge regression is a biased estimation procedure that produces stable estimates of the coefficients. Ridge regression is discussed, and a computer program for calculating the ridge coefficients is presented.
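As a rough illustration of the stabilizing effect described above, the sketch below computes ridge estimates for two standardized predictors in closed form. It is not the program presented in the report. The function name, the correlation values, and the ridge constant k are assumptions chosen for illustration.

```python
# Minimal ridge-regression sketch for two standardized predictors.
# Assumption: the normal equations are in correlation form, so the ridge
# estimate is b(k) = (R + k*I)^-1 r, where R holds the predictor correlations
# and r holds the predictor-response correlations. All numbers are illustrative.

def ridge_two_predictors(r12, ry1, ry2, k):
    """Solve the 2x2 system (R + kI) b = r in closed form."""
    a = 1.0 + k                       # diagonal entry of R + kI
    det = a * a - r12 * r12           # determinant of the 2x2 matrix
    b1 = (a * ry1 - r12 * ry2) / det
    b2 = (a * ry2 - r12 * ry1) / det
    return b1, b2

r12, ry1, ry2 = 0.95, 0.60, 0.50      # hypothetical, highly correlated predictors
ols = ridge_two_predictors(r12, ry1, ry2, k=0.0)    # k = 0 gives least squares
ridge = ridge_two_predictors(r12, ry1, ry2, k=0.1)  # small ridge constant
print("least squares:", ols)
print("ridge (k=0.1):", ridge)
```

With these hypothetical inputs the least-squares coefficients are large and of opposite sign, while the ridge coefficients shrink toward zero.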
Bayesian Estimation of Multivariate Latent Regression Models: Gauss versus Laplace
ERIC Educational Resources Information Center
Culpepper, Steven Andrew; Park, Trevor
2017-01-01
A latent multivariate regression model is developed that employs a generalized asymmetric Laplace (GAL) prior distribution for regression coefficients. The model is designed for high-dimensional applications where an approximate sparsity condition is satisfied, such that many regression coefficients are near zero after accounting for all the model…
Yamazaki, Takeshi; Takeda, Hisato; Hagiya, Koichi; Yamaguchi, Satoshi; Sasaki, Osamu
2018-03-13
Because lactation periods in dairy cows lengthen with increasing total milk production, it is important to predict individual productivities after 305 days in milk (DIM) to determine the optimal lactation period. We therefore examined whether the random regression (RR) coefficient from 306 to 450 DIM (M2) can be predicted from those during the first 305 DIM (M1) by using a random regression model. We analyzed test-day milk records from 85690 Holstein cows in their first lactations and 131727 cows in their later (second to fifth) lactations. Data in M1 and M2 were analyzed separately by using different single-trait RR animal models. We then performed a multiple regression analysis of the RR coefficients of M2 on those of M1 during the first and later lactations. The first-order Legendre polynomials were practical covariates of random regression for the milk yields of M2. All RR coefficients for the additive genetic (AG) effect and the intercept for the permanent environmental (PE) effect of M2 had moderate to strong correlations with the intercept for the AG effect of M1. The coefficients of determination for multiple regression of the combined intercepts for the AG and PE effects of M2 on the coefficients for the AG effect of M1 were moderate to high. The daily milk yields of M2 predicted by using the RR coefficients for the AG effect of M1 were highly correlated with those obtained by using the coefficients of M2. Milk production after 305 DIM can be predicted by using the RR coefficient estimates of the AG effect during the first 305 DIM.
Kitagawa, Yasuhisa; Teramoto, Tamio; Daida, Hiroyuki
2012-01-01
We evaluated the impact of adherence to preferable behavior on serum lipid control assessed by a self-reported questionnaire in high-risk patients taking pravastatin for primary prevention of coronary artery disease. High-risk patients taking pravastatin were followed for 2 years. Questionnaire surveys comprising 21 questions, including 18 questions concerning awareness of health, and current status of diet, exercise, and drug therapy, were conducted at baseline and after 1 year. Potential domains were established by factor analysis from the results of questionnaires, and adherence scores were calculated in each domain. The relationship between adherence scores and lipid values during the 1-year treatment period was analyzed by each domain using multiple regression analysis. A total of 5,792 patients taking pravastatin were included in the analysis. Multiple regression analysis showed a significant correlation in terms of "Intake of high fat/cholesterol/sugar foods" (regression coefficient -0.58, p=0.0105) and "Adherence to instructions for drug therapy" (regression coefficient -6.61, p<0.0001). Low-density lipoprotein cholesterol (LDL-C) values were significantly lower in patients who had an increase in the adherence score in the "Awareness of health" domain compared with those with a decreased score. There was a significant correlation between high-density lipoprotein (HDL-C) values and "Awareness of health" (regression coefficient 0.26; p=0.0037), "Preferable dietary behaviors" (regression coefficient 0.75; p<0.0001), and "Exercise" (regression coefficient 0.73; p=0.0002). Similar relations were seen with triglycerides. In patients who have a high awareness of their health, a positive attitude toward lipid-lowering treatment, including diet, exercise, and high adherence to drug therapy, is associated with favorable overall lipid control even in patients under treatment with pravastatin.
2014-01-01
Background: Support vector regression (SVR) and Gaussian process regression (GPR) were used for the analysis of electroanalytical experimental data to estimate diffusion coefficients. Results: For simulated cyclic voltammograms based on the EC, Eqr, and EqrC mechanisms these regression algorithms in combination with nonlinear kernel/covariance functions yielded diffusion coefficients with higher accuracy as compared to the standard approach of calculating diffusion coefficients relying on the Nicholson-Shain equation. The level of accuracy achieved by SVR and GPR is virtually independent of the rate constants governing the respective reaction steps. Further, the reduction of high-dimensional voltammetric signals by manual selection of typical voltammetric peak features decreased the performance of both regression algorithms compared to a reduction by downsampling or principal component analysis. After training on simulated data sets, diffusion coefficients were estimated by the regression algorithms for experimental data comprising voltammetric signals for three organometallic complexes. Conclusions: Estimated diffusion coefficients closely matched the values determined by the parameter fitting method, but reduced the required computational time considerably for one of the reaction mechanisms. The automated processing of voltammograms according to the regression algorithms yields better results than the conventional analysis of peak-related data. PMID:24987463
Ecotoxicology of phenylphosphonothioates.
Francis, B M; Hansen, L G; Fukuto, T R; Lu, P Y; Metcalf, R L
1980-01-01
The phenylphosphonothioate insecticides EPN and leptophos, and several analogs, were evaluated with respect to their delayed neurotoxic effects in hens and their environmental behavior in a terrestrial-aquatic model ecosystem. Acute toxicity to insects was highly correlated with Σσ of the substituted phenyl group (regression coefficient r = -0.91) while acute toxicity to mammals was slightly less well correlated (regression coefficient r = -0.71), and neurotoxicity was poorly correlated with Σσ (regression coefficient r = -0.35). Both EPN and leptophos were markedly more persistent and bioaccumulative in the model ecosystem than parathion. Desbromoleptophos, a contaminant and metabolite of leptophos, was seen to be a highly stable and persistent terminal residue of leptophos. PMID:6159210
Hidden Connections between Regression Models of Strain-Gage Balance Calibration Data
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert
2013-01-01
Hidden connections between regression models of wind tunnel strain-gage balance calibration data are investigated. These connections become visible whenever balance calibration data is supplied in its design format and both the Iterative and Non-Iterative Method are used to process the data. First, it is shown how the regression coefficients of the fitted balance loads of a force balance can be approximated by using the corresponding regression coefficients of the fitted strain-gage outputs. Then, data from the manual calibration of the Ames MK40 six-component force balance is chosen to illustrate how estimates of the regression coefficients of the fitted balance loads can be obtained from the regression coefficients of the fitted strain-gage outputs. The study illustrates that load predictions obtained by applying the Iterative or the Non-Iterative Method originate from two related regression solutions of the balance calibration data as long as balance loads are given in the design format of the balance, gage outputs behave highly linear, strict statistical quality metrics are used to assess regression models of the data, and regression model term combinations of the fitted loads and gage outputs can be obtained by a simple variable exchange.
Testing a single regression coefficient in high dimensional linear models
Zhong, Ping-Shou; Li, Runze; Wang, Hansheng; Tsai, Chih-Ling
2017-01-01
In linear regression models with high dimensional data, the classical z-test (or t-test) for testing the significance of each single regression coefficient is no longer applicable. This is mainly because the number of covariates exceeds the sample size. In this paper, we propose a simple and novel alternative by introducing the Correlated Predictors Screening (CPS) method to control for predictors that are highly correlated with the target covariate. Accordingly, the classical ordinary least squares approach can be employed to estimate the regression coefficient associated with the target covariate. In addition, we demonstrate that the resulting estimator is consistent and asymptotically normal even if the random errors are heteroscedastic. This enables us to apply the z-test to assess the significance of each covariate. Based on the p-value obtained from testing the significance of each covariate, we further conduct multiple hypothesis testing by controlling the false discovery rate at the nominal level. Then, we show that the multiple hypothesis testing achieves consistent model selection. Simulation studies and empirical examples are presented to illustrate the finite sample performance and the usefulness of the proposed method, respectively. PMID:28663668
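The last two steps of the abstract (a z-test per coefficient, then false discovery rate control) can be sketched as follows. This does not implement the CPS screening step. The Benjamini-Hochberg step-up rule is used here as one common FDR procedure, which is an assumption, and the estimates and standard errors are hypothetical.

```python
import math

def two_sided_p(estimate, std_error):
    """Two-sided normal p-value for H0: coefficient = 0."""
    z = estimate / std_error
    return math.erfc(abs(z) / math.sqrt(2.0))

def benjamini_hochberg(p_values, alpha=0.05):
    """Return the indices rejected by the Benjamini-Hochberg step-up rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its threshold.
        if p_values[i] <= alpha * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])

# Hypothetical coefficient estimates and standard errors for four covariates.
estimates = [0.80, 0.05, -0.60, 0.10]
std_errors = [0.20, 0.10, 0.20, 0.12]
p_values = [two_sided_p(b, s) for b, s in zip(estimates, std_errors)]
selected = benjamini_hochberg(p_values, alpha=0.05)
print(p_values, selected)
```

Here the first and third covariates (z = 4 and z = -3) are selected, and the other two are not.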
Testing a single regression coefficient in high dimensional linear models.
Lan, Wei; Zhong, Ping-Shou; Li, Runze; Wang, Hansheng; Tsai, Chih-Ling
2016-11-01
In linear regression models with high dimensional data, the classical z-test (or t-test) for testing the significance of each single regression coefficient is no longer applicable. This is mainly because the number of covariates exceeds the sample size. In this paper, we propose a simple and novel alternative by introducing the Correlated Predictors Screening (CPS) method to control for predictors that are highly correlated with the target covariate. Accordingly, the classical ordinary least squares approach can be employed to estimate the regression coefficient associated with the target covariate. In addition, we demonstrate that the resulting estimator is consistent and asymptotically normal even if the random errors are heteroscedastic. This enables us to apply the z-test to assess the significance of each covariate. Based on the p-value obtained from testing the significance of each covariate, we further conduct multiple hypothesis testing by controlling the false discovery rate at the nominal level. Then, we show that the multiple hypothesis testing achieves consistent model selection. Simulation studies and empirical examples are presented to illustrate the finite sample performance and the usefulness of the proposed method, respectively.
Modified Regression Correlation Coefficient for Poisson Regression Model
NASA Astrophysics Data System (ADS)
Kaengthong, Nattacha; Domthong, Uthumporn
2017-09-01
This study examines widely used indicators of the predictive power of the Generalized Linear Model (GLM), which often have some restrictions. We are interested in the regression correlation coefficient for a Poisson regression model. This is a measure of predictive power, defined by the relationship between the dependent variable (Y) and the expected value of the dependent variable given the independent variables [E(Y|X)] for the Poisson regression model. The dependent variable is distributed as Poisson. The purpose of this research was to modify the regression correlation coefficient for the Poisson regression model. We also compare the proposed modified regression correlation coefficient with the traditional regression correlation coefficient in the case of two or more independent variables with multicollinearity among the independent variables. The results show that the proposed regression correlation coefficient is better than the traditional regression correlation coefficient in terms of bias and root mean square error (RMSE).
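The traditional measure described above, the correlation between Y and E(Y|X), can be sketched in a few lines. The modified coefficient proposed in the study is not reproduced here. The counts, the covariate values, and the Poisson coefficients b0 and b1 are hypothetical and are taken as given rather than estimated.

```python
import math

def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)

x = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]   # hypothetical covariate values
y = [1, 2, 2, 4, 7, 12]              # hypothetical observed counts
b0, b1 = 0.0, 1.0                    # assumed (not fitted) Poisson coefficients

# Poisson regression with a log link: E(Y|X) = exp(b0 + b1 * x).
fitted = [math.exp(b0 + b1 * xi) for xi in x]
r_reg = pearson(y, fitted)
print(r_reg)
```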
Investigating bias in squared regression structure coefficients
Nimon, Kim F.; Zientek, Linda R.; Thompson, Bruce
2015-01-01
The importance of structure coefficients and analogs of regression weights for analysis within the general linear model (GLM) has been well-documented. The purpose of this study was to investigate bias in squared structure coefficients in the context of multiple regression and to determine if a formula that had been shown to correct for bias in squared Pearson correlation coefficients and coefficients of determination could be used to correct for bias in squared regression structure coefficients. Using data from a Monte Carlo simulation, this study found that squared regression structure coefficients corrected with Pratt's formula produced less biased estimates and might be more accurate and stable estimates of population squared regression structure coefficients than estimates with no such corrections. While our findings are in line with prior literature that identified multicollinearity as a predictor of bias in squared regression structure coefficients but not coefficients of determination, the findings from this study are unique in that the level of predictive power, number of predictors, and sample size were also observed to contribute bias in squared regression structure coefficients. PMID:26217273
Standards for Standardized Logistic Regression Coefficients
ERIC Educational Resources Information Center
Menard, Scott
2011-01-01
Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…
Cruz, Antonio M; Barr, Cameron; Puñales-Pozo, Elsa
2008-01-01
This research's main goals were to build a predictor for a turnaround time (TAT) indicator for estimating its values and use a numerical clustering technique for finding possible causes of undesirable TAT values. The following stages were used: domain understanding, data characterisation and sample reduction and insight characterisation. Building the TAT indicator multiple linear regression predictor and clustering techniques were used for improving corrective maintenance task efficiency in a clinical engineering department (CED). The indicator being studied was turnaround time (TAT). Multiple linear regression was used for building a predictive TAT value model. The variables contributing to such model were clinical engineering department response time (CE(rt), 0.415 positive coefficient), stock service response time (Stock(rt), 0.734 positive coefficient), priority level (0.21 positive coefficient) and service time (0.06 positive coefficient). The regression process showed heavy reliance on Stock(rt), CE(rt) and priority, in that order. Clustering techniques revealed the main causes of high TAT values. This examination has provided a means for analysing current technical service quality and effectiveness. In doing so, it has demonstrated a process for identifying areas and methods of improvement and a model against which to analyse these methods' effectiveness.
Analysis of oscillatory motion of a light airplane at high values of lift coefficient
NASA Technical Reports Server (NTRS)
Batterson, J. G.
1983-01-01
A modified stepwise regression is applied to flight data from a light research airplane operating at high angles of attack. The well-known phenomenon referred to as buckling or porpoising is analyzed and modeled using both power series and spline expansions of the aerodynamic force and moment coefficients associated with the longitudinal equations of motion.
ERIC Educational Resources Information Center
Dolan, Conor V.; Wicherts, Jelte M.; Molenaar, Peter C. M.
2004-01-01
We consider the question of how variation in the number and reliability of indicators affects the power to reject the hypothesis that the regression coefficients are zero in latent linear regression analysis. We show that power remains constant as long as the coefficient of determination remains unchanged. Any increase in the number of indicators…
Metrics to Compare Aircraft Operating and Support Costs in the Department of Defense
2015-01-01
a phenomenon in regression analysis called multicollinearity, which makes problematic the interpretation of the coefficient estimates of highly...indicating a very high amount of multicollinearity and suggesting that the magnitude of the coefficients on those variables should be treated with caution... multicollinearity between these independent variables, one must be cautious when interpreting the statistical relationship between flying hours and cost. The
Interpreting Regression Results: beta Weights and Structure Coefficients are Both Important.
ERIC Educational Resources Information Center
Thompson, Bruce
Various realizations have led to less frequent use of the "OVA" methods (analysis of variance--ANOVA--among others) and to more frequent use of general linear model approaches such as regression. However, too few researchers understand all the various coefficients produced in regression. This paper explains these coefficients and their…
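A small worked case may help show why both kinds of coefficient matter. The sketch below assumes two standardized predictors and uses the usual definition of a structure coefficient as the predictor-criterion correlation divided by the multiple R. The correlation values are hypothetical and were chosen so that the second predictor receives a beta weight of zero.

```python
import math

def betas_and_structure(r12, ry1, ry2):
    """Beta weights, structure coefficients and R^2 for two standardized predictors."""
    det = 1.0 - r12 ** 2
    beta1 = (ry1 - r12 * ry2) / det
    beta2 = (ry2 - r12 * ry1) / det
    r_squared = beta1 * ry1 + beta2 * ry2
    multiple_r = math.sqrt(r_squared)
    structure = (ry1 / multiple_r, ry2 / multiple_r)
    return (beta1, beta2), structure, r_squared

# Hypothetical correlations: predictors correlate 0.6 with each other,
# and 0.5 and 0.3 with the criterion.
betas, structure, r_squared = betas_and_structure(r12=0.6, ry1=0.5, ry2=0.3)
print("beta weights:", betas)
print("structure coefficients:", structure)
print("R squared:", r_squared)
```

With these inputs the second predictor has a beta weight of zero but a structure coefficient of about 0.6, so reading beta weights alone would suggest it is unrelated to the predicted scores.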
Biases and Standard Errors of Standardized Regression Coefficients
ERIC Educational Resources Information Center
Yuan, Ke-Hai; Chan, Wai
2011-01-01
The paper obtains consistent standard errors (SE) and biases of order O(1/n) for the sample standardized regression coefficients with both random and given predictors. Analytical results indicate that the formulas for SEs given in popular text books are consistent only when the population value of the regression coefficient is zero. The sample…
NASA Astrophysics Data System (ADS)
Pradhan, Biswajeet
2010-05-01
This paper presents the results of the cross-validation of a multivariate logistic regression model using remote sensing data and GIS for landslide hazard analysis on the Penang, Cameron, and Selangor areas in Malaysia. Landslide locations in the study areas were identified by interpreting aerial photographs and satellite images, supported by field surveys. SPOT 5 and Landsat TM satellite imagery were used to map landcover and vegetation index, respectively. Maps of topography, soil type, lineaments and land cover were constructed from the spatial datasets. Ten factors which influence landslide occurrence, i.e., slope, aspect, curvature, distance from drainage, lithology, distance from lineaments, soil type, landcover, rainfall precipitation, and normalized difference vegetation index (NDVI), were extracted from the spatial database and the logistic regression coefficient of each factor was computed. Then the landslide hazard was analysed using the multivariate logistic regression coefficients derived not only from the data for the respective area but also using the logistic regression coefficients calculated from each of the other two areas (nine hazard maps in all) as a cross-validation of the model. For verification of the model, the results of the analyses were then compared with the field-verified landslide locations. Among the three cases in which the logistic regression coefficients were applied in the same study area, the case of Selangor based on the Selangor logistic regression coefficients showed the highest accuracy (94%), whereas Penang based on the Penang coefficients showed the lowest accuracy (86%). Similarly, among the six cases of cross-application of the logistic regression coefficients to the other two areas, the case of Selangor based on the logistic coefficients of Cameron showed the highest prediction accuracy (90%), whereas the case of Penang based on the Selangor logistic regression coefficients showed the lowest accuracy (79%). Qualitatively, the cross-application model yields reasonable results which can be used for preliminary landslide hazard mapping.
On the Occurrence of Standardized Regression Coefficients Greater than One.
ERIC Educational Resources Information Center
Deegan, John, Jr.
1978-01-01
It is demonstrated here that standardized regression coefficients greater than one can legitimately occur. Furthermore, the relationship between the occurrence of such coefficients and the extent of multicollinearity present among the set of predictor variables in an equation is examined. Comments on the interpretation of these coefficients are…
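The claim can be checked with a small worked case. The sketch below assumes two standardized predictors, for which the standardized coefficients follow directly from the three pairwise correlations. The correlation values are hypothetical but form a valid (positive definite) correlation matrix.

```python
def standardized_betas(r12, ry1, ry2):
    """Standardized regression coefficients for two predictors."""
    det = 1.0 - r12 * r12
    beta1 = (ry1 - r12 * ry2) / det
    beta2 = (ry2 - r12 * ry1) / det
    return beta1, beta2

# Hypothetical correlations with strong collinearity between the predictors.
r12, ry1, ry2 = 0.9, 0.8, 0.5
beta1, beta2 = standardized_betas(r12, ry1, ry2)
r_squared = beta1 * ry1 + beta2 * ry2   # remains a legitimate proportion
print(beta1, beta2, r_squared)
```

Here the first coefficient is roughly 1.84 and the second roughly -1.16, while R squared stays below one (about 0.89), so the large coefficient reflects collinearity rather than an impossible fit.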
NASA Technical Reports Server (NTRS)
Kalton, G.
1983-01-01
A number of surveys were conducted to study the relationship between the level of aircraft or traffic noise exposure experienced by people living in a particular area and their annoyance with it. These surveys generally employ a clustered sample design which affects the precision of the survey estimates. Regression analysis of annoyance on noise measures and other variables is often an important component of the survey analysis. Formulae are presented for estimating the standard errors of regression coefficients and ratio of regression coefficients that are applicable with a two- or three-stage clustered sample design. Using a simple cost function, they also determine the optimum allocation of the sample across the stages of the sample design for the estimation of a regression coefficient.
The Outlier Detection for Ordinal Data Using Scalling Technique of Regression Coefficients
NASA Astrophysics Data System (ADS)
Adnan, Arisman; Sugiarto, Sigit
2017-06-01
The aim of this study is to detect outliers by using the coefficients of Ordinal Logistic Regression (OLR) for the case of k category responses, where the scores range from 1 (the best) to 8 (the worst). We detect them by using the sum of moduli of the ordinal regression coefficients calculated by the jackknife technique. This technique is improved by scaling the regression coefficients to their means. The R language was used on a set of ordinal data from a reference distribution. Furthermore, we compare this approach with studentised residual plots of the jackknife technique for ANOVA (Analysis of Variance) and OLR. This study shows that the jackknifing technique, along with proper scaling, may reveal outliers in ordinal regression reasonably well.
Osinga, Rik; Babst, Doris; Bodmer, Elvira S; Link, Bjoern C; Fritsche, Elmar; Hug, Urs
2017-12-01
This work assessed both subjective and objective postoperative parameters after breast reduction surgery and compared between patients and plastic surgeons. After an average postoperative observation period of 6.7 ± 2.7 (2 - 13) years, 159 out of 259 patients (61 %) were examined. The mean age at the time of surgery was 37 ± 14 (15 - 74) years. The postoperative anatomy of the breast and other anthropometric parameters were measured in cm with the patient in an upright position. The visual analogue scale (VAS) values for symmetry, size, shape, type of scar and overall satisfaction both from the patient's and from four plastic surgeons' perspectives were assessed and compared. Patients rated the postoperative result significantly better than surgeons. Good subjective ratings by patients for shape, symmetry and sensitivity correlated with high scores for overall assessment. Shape had the strongest influence on overall satisfaction (regression coefficient 0.357; p < 0.001), followed by symmetry (regression coefficient 0.239; p < 0.001) and sensitivity (regression coefficient 0.109; p = 0.040) of the breast. The better the subjective rating for symmetry by the patient, the smaller the measured difference of the jugulum-mamillary distance between left and right (regression coefficient -0.773; p = 0.002) and the smaller the difference in height of the lowest part of the breast between left and right (regression coefficient -0.465; p = 0.035). There was no significant correlation between age, weight, height, BMI, resected weight of the breast, postoperative breast size or type of scar with overall satisfaction. After breast reduction surgery, long-term outcome is rated significantly better by patients than by plastic surgeons. Good subjective ratings by patients for shape, symmetry and sensitivity correlated with high scores for overall assessment. Shape had the strongest influence on overall satisfaction, followed by symmetry and sensitivity of the breast. 
Postoperative size of the breast, resection weight, type of scar, age or BMI was not of significant influence. Symmetry was the only assessed subjective parameter of this study that could be objectified by postoperative measurements. Georg Thieme Verlag KG Stuttgart · New York.
Viability estimation of pepper seeds using time-resolved photothermal signal characterization
NASA Astrophysics Data System (ADS)
Kim, Ghiseok; Kim, Geon-Hee; Lohumi, Santosh; Kang, Jum-Soon; Cho, Byoung-Kwan
2014-11-01
We used an infrared thermal signal measurement system and photothermal signal and image reconstruction techniques for viability estimation of pepper seeds. Photothermal signals from healthy and aged seeds were measured for seven periods (24, 48, 72, 96, 120, 144, and 168 h) using an infrared camera and analyzed by a regression method. The photothermal signals were regressed using a two-term exponential decay curve with two amplitudes and two time variables (lifetimes) as regression coefficients. The regression coefficients of the fitted curve showed significant differences for each seed group, depending on the aging times. In addition, the viability of a single seed was estimated by imaging of its regression coefficient, which was reconstructed from the measured photothermal signals. The time-resolved photothermal characteristics, along with the regression coefficient images, can be used to discriminate the aged or dead pepper seeds from the healthy seeds.
Gjerde, Hallvard; Verstraete, Alain
2010-02-25
To study several methods for estimating the prevalence of high blood concentrations of tetrahydrocannabinol and amphetamine in a population of drug users by analysing oral fluid (saliva). Five methods were compared, including simple calculation procedures dividing the drug concentrations in oral fluid by average or median oral fluid/blood (OF/B) drug concentration ratios or linear regression coefficients, and more complex Monte Carlo simulations. Populations of 311 cannabis users and 197 amphetamine users from the Rosita-2 Project were studied. The results of a feasibility study suggested that the Monte Carlo simulations might give better accuracies than simple calculations if good data on OF/B ratios are available. If using only 20 randomly selected OF/B ratios, a Monte Carlo simulation gave the best accuracy but not the best precision. Dividing by the OF/B regression coefficient gave acceptable accuracy and precision, and was therefore the best method. None of the methods gave acceptable accuracy if the prevalence of high blood drug concentrations was less than 15%. Dividing the drug concentration in oral fluid by the OF/B regression coefficient gave an acceptable estimation of high blood drug concentrations in a population, and may therefore give valuable additional information on possible drug impairment, e.g. in roadside surveys of drugs and driving. If good data on the distribution of OF/B ratios are available, a Monte Carlo simulation may give better accuracy.
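The simple calculation favoured above (dividing each oral fluid concentration by the OF/B regression coefficient and counting the estimated blood concentrations above a cut-off) can be sketched as follows. The function name, the concentrations, the coefficient of 15 and the threshold of 5 are all hypothetical values for illustration, not figures from the study.

```python
def estimate_prevalence(oral_fluid_conc, ofb_coefficient, blood_threshold):
    """Share of subjects whose estimated blood concentration exceeds a threshold."""
    # Estimated blood concentration = oral fluid concentration / OF/B coefficient.
    estimated_blood = [c / ofb_coefficient for c in oral_fluid_conc]
    n_high = sum(1 for b in estimated_blood if b > blood_threshold)
    return n_high / len(estimated_blood)

# Hypothetical oral fluid concentrations (ng/mL) for eight subjects.
oral_fluid = [12.0, 45.0, 150.0, 8.0, 300.0, 60.0, 20.0, 95.0]
prevalence = estimate_prevalence(oral_fluid, ofb_coefficient=15.0, blood_threshold=5.0)
print(prevalence)
```

Three of the eight hypothetical subjects exceed the threshold, giving an estimated prevalence of 0.375.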
Predicting Student Engagement in Online High Schools
ERIC Educational Resources Information Center
Vieira, Christopher James
2013-01-01
The purpose of this study was to analyze student engagement in online high schools based on demographic information of high school students using a mixed methods research design. Key findings through a multiple regression analysis and Pearson correlation coefficient suggest that although the majority of participants in the study are highly engaged…
Penalized spline estimation for functional coefficient regression models.
Cao, Yanrong; Lin, Haiqun; Wu, Tracy Z; Yu, Yan
2010-04-01
The functional coefficient regression models assume that the regression coefficients vary with some "threshold" variable, providing appreciable flexibility in capturing the underlying dynamics in data and avoiding the so-called "curse of dimensionality" in multivariate nonparametric estimation. We first investigate the estimation, inference, and forecasting for the functional coefficient regression models with dependent observations via penalized splines. The P-spline approach, as a direct ridge regression shrinkage type global smoothing method, is computationally efficient and stable. With established fixed-knot asymptotics, inference is readily available. Exact inference can be obtained for fixed smoothing parameter λ, which is most appealing for finite samples. Our penalized spline approach gives an explicit model expression, which also enables multi-step-ahead forecasting via simulations. Furthermore, we examine different methods of choosing the important smoothing parameter λ: modified multi-fold cross-validation (MCV), generalized cross-validation (GCV), and an extension of empirical bias bandwidth selection (EBBS) to P-splines. In addition, we implement smoothing parameter selection using mixed model framework through restricted maximum likelihood (REML) for P-spline functional coefficient regression models with independent observations. The P-spline approach also easily allows different smoothness for different functional coefficients, which is enabled by assigning different penalty λ accordingly. We demonstrate the proposed approach by both simulation examples and a real data application.
Srivastava, Nishi; Srivastava, Amit; Srivastava, Sharad; Rawat, Ajay Kumar Singh; Khan, Abdul Rahman
2016-03-01
A rapid, sensitive, selective and robust quantitative densitometric high-performance thin-layer chromatographic method was developed and validated for separation and quantification of syringic acid (SYA) and kaempferol (KML) in the hydrolyzed extracts of Bergenia ciliata and Bergenia stracheyi. The separation was performed on silica gel 60F254 high-performance thin-layer chromatography plates using toluene : ethyl acetate : formic acid (5:4:1, v/v/v) as the mobile phase. The quantification of SYA and KML was carried out using a densitometric reflection/absorption mode at 290 nm. Dense spots of SYA and KML appeared on the developed plate at retention factor values of 0.61 ± 0.02 and 0.70 ± 0.01, respectively. A precise and accurate quantification was performed using linear regression analysis by plotting the peak area vs concentration 100-600 ng/band (correlation coefficient: r = 0.997, regression coefficient: R² = 0.996) for SYA and 100-600 ng/band (correlation coefficient: r = 0.995, regression coefficient: R² = 0.991) for KML. The developed method was validated in terms of accuracy, recovery and inter- and intraday study as per International Conference on Harmonisation guidelines. The limit of detection and limit of quantification of SYA and KML were determined, respectively, as 91.63, 142.26 and 277.67, 431.09 ng. The statistical data analysis showed that the method is reproducible and selective for the estimation of SYA and KML in extracts of B. ciliata and B. stracheyi.
Wrong Signs in Regression Coefficients
NASA Technical Reports Server (NTRS)
McGee, Holly
1999-01-01
When using parametric cost estimation, it is important to note the possibility of the regression coefficients having the wrong sign. A wrong sign is defined as a sign on the regression coefficient opposite to the researcher's intuition and experience. Some possible causes for the wrong sign discussed in this paper are a small range of x's, leverage points, missing variables, multicollinearity, and computational error. Additionally, techniques for determining the cause of the wrong sign are given.
NASA Astrophysics Data System (ADS)
Gholizadeh, H.; Robeson, S. M.
2015-12-01
Empirical models have been widely used to estimate global chlorophyll content from remotely sensed data. Here, we focus on the standard NASA empirical models that use blue-green band ratios. These band ratio ocean color (OC) algorithms are in the form of fourth-order polynomials and the parameters of these polynomials (i.e. coefficients) are estimated from the NASA bio-Optical Marine Algorithm Data set (NOMAD). Most of the points in this data set have been sampled from tropical and temperate regions. However, polynomial coefficients obtained from this data set are used to estimate chlorophyll content in all ocean regions with different properties such as sea-surface temperature, salinity, and downwelling/upwelling patterns. Further, the polynomial terms in these models are highly correlated. In sum, the limitations of these empirical models are as follows: 1) the independent variables within the empirical models, in their current form, are correlated (multicollinear), and 2) current algorithms are global approaches and are based on the spatial stationarity assumption, so they are independent of location. The multicollinearity problem is resolved by using partial least squares (PLS). PLS, which transforms the data into a set of independent components, can be considered as a combined form of principal component regression (PCR) and multiple regression. Geographically weighted regression (GWR) is also used to investigate the validity of the spatial stationarity assumption. GWR solves a regression model over each sample point by using the observations within its neighbourhood. PLS results show that the empirical method underestimates chlorophyll content in high latitudes, including the Southern Ocean region, when compared to PLS (see Figure 1). Cluster analysis of GWR coefficients also shows that the spatial stationarity assumption in empirical models is unlikely to be valid.
Influences on Academic Achievement Across High and Low Income Countries: A Re-Analysis of IEA Data.
ERIC Educational Resources Information Center
Heyneman, S.; Loxley, W.
Previous international studies of science achievement put the data through a process of winnowing to decide which variables to keep in the final regressions. Variables were allowed to enter the final regressions if they met a minimum beta coefficient criterion of 0.05 averaged across rich and poor countries alike. The criterion was an average…
Ke, Tracy; Fan, Jianqing; Wu, Yichao
2014-01-01
This paper explores the homogeneity of coefficients in high-dimensional regression, which extends the sparsity concept and is more general and suitable for many applications. Homogeneity arises when regression coefficients corresponding to neighboring geographical regions or a similar cluster of covariates are expected to be approximately the same. Sparsity corresponds to a special case of homogeneity with a large cluster of known atom zero. In this article, we propose a new method called clustering algorithm in regression via data-driven segmentation (CARDS) to explore homogeneity. New mathematics are provided on the gain that can be achieved by exploring homogeneity. Statistical properties of two versions of CARDS are analyzed. In particular, the asymptotic normality of our proposed CARDS estimator is established, which reveals better estimation accuracy for homogeneous parameters than that without homogeneity exploration. When our methods are combined with sparsity exploration, further efficiency can be achieved beyond the exploration of sparsity alone. This provides additional insights into the power of exploring low-dimensional structures in high-dimensional regression: homogeneity and sparsity. Our results also shed light on the properties of the fused Lasso. The newly developed method is further illustrated by simulation studies and applications to real data. Supplementary materials for this article are available online. PMID:26085701
NASA Astrophysics Data System (ADS)
Nishidate, Izumi; Wiswadarma, Aditya; Hase, Yota; Tanaka, Noriyuki; Maeda, Takaaki; Niizeki, Kyuichi; Aizu, Yoshihisa
2011-08-01
In order to visualize melanin and blood concentrations and oxygen saturation in human skin tissue, a simple imaging technique based on multispectral diffuse reflectance images acquired at six wavelengths (500, 520, 540, 560, 580 and 600nm) was developed. The technique utilizes multiple regression analysis aided by Monte Carlo simulation for diffuse reflectance spectra. Using the absorbance spectrum as a response variable and the extinction coefficients of melanin, oxygenated hemoglobin, and deoxygenated hemoglobin as predictor variables, multiple regression analysis provides regression coefficients. Concentrations of melanin and total blood are then determined from the regression coefficients using conversion vectors that are deduced numerically in advance, while oxygen saturation is obtained directly from the regression coefficients. Experiments with a tissue-like agar gel phantom validated the method. In vivo experiments with human skin of the human hand during upper limb occlusion and of the inner forearm exposed to UV irradiation demonstrated the ability of the method to evaluate physiological reactions of human skin tissue.
Cui, Yang; Wang, Silong; Yan, Shaokui
2016-01-01
Phi coefficient directly depends on the frequencies of occurrence of organisms and has been widely used in vegetation ecology to analyse the associations of organisms with site groups, providing a characterization of ecological preference, but its application in soil ecology remains rare. Based on a single field experiment, this study assessed the applicability of phi coefficient in indicating the habitat preferences of soil fauna, through comparing phi coefficient-induced results with those of ordination methods in characterizing soil fauna-habitat (factors) relationships. Eight different habitats of soil fauna were implemented by reciprocal transfer of defaunated soil cores between two types of subtropical forests. Canonical correlation analysis (CCorA) showed that ecological patterns of fauna-habitat relationships and inter-fauna taxa relationships expressed, respectively, by phi coefficients and predicted abundances calculated from partial redundancy analysis (RDA), were extremely similar, and a highly significant relationship between the two datasets was observed (Pillai's trace statistic = 1.998, P = 0.007). In addition, highly positive correlations between phi coefficients and predicted abundances for Acari, Collembola, Nematode and Hemiptera were observed using linear regression analysis. Quantitative relationships between habitat preferences and soil chemical variables were also obtained by linear regression, which were analogous to the results displayed in a partial RDA biplot. Our results suggest that phi coefficient could be applicable on a local scale in evaluating habitat preferences of soil fauna at coarse taxonomic levels, and that the phi coefficient-induced information, such as ecological preferences and the associated quantitative relationships with habitat factors, will be largely complementary to the results of ordination methods.
The application of phi coefficient in soil ecology may extend our knowledge about habitat preferences and distribution-abundance relationships, which will benefit the understanding of biodistributions and variations in community compositions in the soil. Similar studies in other places and at scales beyond our local site will be needed for further evaluation of phi coefficient. PMID:26930593
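Because the phi coefficient depends only on occurrence frequencies, it can be computed from a 2x2 presence/absence table. The sketch below uses the standard formula. The function name `phi_coefficient` and the counts are hypothetical and do not come from the study.

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi for a 2x2 table.

    a: sites in the target habitat where the taxon is present
    b: sites outside the target habitat where the taxon is present
    c: sites in the target habitat where the taxon is absent
    d: sites outside the target habitat where the taxon is absent
    """
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom

# Hypothetical counts: a taxon found in 8 of 10 target cores and 3 of 30 others.
preference = phi_coefficient(8, 3, 2, 27)
```

Values near +1 indicate a strong preference for the target habitat, values near 0 indicate no association, and negative values indicate avoidance.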
Chaurasia, Ashok; Harel, Ofer
2015-02-10
Tests for regression coefficients such as global, local, and partial F-tests are common in applied research. In the framework of multiple imputation, there are several papers addressing tests for regression coefficients. However, for simultaneous hypothesis testing, the existing methods are computationally intensive because they involve calculation with vectors and (inversion of) matrices. In this paper, we propose a simple method based on the scalar entity, coefficient of determination, to perform (global, local, and partial) F-tests with multiply imputed data. The proposed method is evaluated using simulated data and applied to suicide prevention data. Copyright © 2014 John Wiley & Sons, Ltd.
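For a single complete data set, the link between the coefficient of determination and the F statistic is a scalar formula. The sketch below shows only that complete-data relationship, under the usual linear-model assumptions. It does not reproduce the paper's procedure for combining results across imputed data sets, and the function name `f_from_r2` is hypothetical.

```python
def f_from_r2(r2_full, r2_reduced, n, p_full, q):
    """F statistic for testing q coefficients, computed from two R^2 values.

    r2_full:    R^2 of the full model with p_full predictors (plus intercept)
    r2_reduced: R^2 of the nested model without the q tested predictors
    n:          number of observations
    Degrees of freedom are (q, n - p_full - 1).
    """
    numerator = (r2_full - r2_reduced) / q
    denominator = (1.0 - r2_full) / (n - p_full - 1)
    return numerator / denominator

# Global test: the reduced model is intercept-only, so its R^2 is 0 and q = p_full.
global_f = f_from_r2(0.50, 0.0, n=103, p_full=2, q=2)

# Partial test of one added predictor.
partial_f = f_from_r2(0.50, 0.40, n=103, p_full=2, q=1)
```

Only scalars enter the calculation. This avoids the vector and matrix operations that the abstract describes as computationally intensive.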
ERIC Educational Resources Information Center
Yan, Jun; Aseltine, Robert H., Jr.; Harel, Ofer
2013-01-01
Comparing regression coefficients between models when one model is nested within another is of great practical interest when two explanations of a given phenomenon are specified as linear models. The statistical problem is whether the coefficients associated with a given set of covariates change significantly when other covariates are added into…
Richter, Jörg
2015-04-01
Methods to assess intervention progress and outcome that are suitable for frequent use are needed. The aim was to provide preliminary information about psychometric properties for the Norwegian version of the Brief Problems Monitor (BPM). Cronbach's alpha scores and intra-class correlation coefficients as indicators for internal consistency (reliability), Pearson correlation coefficients between corresponding subscales of the long and short ASEBA form versions, and multiple regression coefficients to explore the predictive power of the reduced item-set related to the corresponding scale-scores of the long version were calculated in large, representative data sets of Norwegian children and adolescents. Cronbach's alpha scores of the Norwegian version of the BPM subscales varied between 0.67 (attention BPM-youth) and 0.88 (attention BPM-teacher) and between 0.90 (BPM-youth) and 0.96 (BPM-teacher) for its total problem score. Corresponding subscales from the long versions and the BPM as well as the total problems scores were closely correlated with coefficients of high effect size (all r > 0.80). The variance of the items of the BPM explained about three-quarters or more of the variance in the corresponding subscales of the long version. The Norwegian BPM has good psychometric properties in terms of 1) acceptable to good internal consistency and 2) regression coefficients of high effect size from the BPM items to the problem-scale scores of the long versions as validity indicators. Its use in clinical practice and research can be recommended.
Satellite remote sensing of fine particulate air pollutants over Indian mega cities
NASA Astrophysics Data System (ADS)
Sreekanth, V.; Mahesh, B.; Niranjan, K.
2017-11-01
Against the backdrop of the need for high spatio-temporal resolution data on PM2.5 mass concentrations for health and epidemiological studies over India, empirical relations between Aerosol Optical Depth (AOD) and PM2.5 mass concentrations are established over five Indian mega cities. These relations are sought to predict the surface PM2.5 mass concentrations from high resolution columnar AOD datasets. The current study utilizes multi-city public domain PM2.5 data (from US Consulate and Embassy's air monitoring program) and MODIS AOD, spanning almost four years. PM2.5 is found to be positively correlated with AOD. Station-wise linear regression analysis has shown spatially varying regression coefficients. Similar analysis has been repeated by eliminating data from the elevated aerosol prone seasons, which has improved the correlation coefficient. The impact of the day to day variability in the local meteorological conditions on the AOD-PM2.5 relationship has been explored by performing a multiple regression analysis. A cross-validation approach for the multiple regression analysis considering three years of data as training dataset and one-year data as validation dataset yielded an R value of ∼0.63. The study concludes by discussing the factors which can improve the relationship.
Tools to Support Interpreting Multiple Regression in the Face of Multicollinearity
Kraha, Amanda; Turner, Heather; Nimon, Kim; Zientek, Linda Reichwein; Henson, Robin K.
2012-01-01
While multicollinearity may increase the difficulty of interpreting multiple regression (MR) results, it should not cause undue problems for the knowledgeable researcher. In the current paper, we argue that rather than using one technique to investigate regression results, researchers should consider multiple indices to understand the contributions that predictors make not only to a regression model, but to each other as well. Some of the techniques to interpret MR effects include, but are not limited to, correlation coefficients, beta weights, structure coefficients, all possible subsets regression, commonality coefficients, dominance weights, and relative importance weights. This article will review a set of techniques to interpret MR effects, identify the elements of the data on which the methods focus, and identify statistical software to support such analyses. PMID:22457655
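Two of the listed indices, beta weights and structure coefficients, can be contrasted with a small worked example. The sketch assumes a model with exactly two standardized predictors, so the closed-form expressions apply. The correlations are invented and the function name `two_predictor_indices` is hypothetical.

```python
import math

def two_predictor_indices(r_y1, r_y2, r_12):
    """Beta weights, R^2 and structure coefficients for a two-predictor model.

    r_y1, r_y2: correlations of the outcome with predictors 1 and 2
    r_12:       correlation between the two predictors
    """
    beta1 = (r_y1 - r_y2 * r_12) / (1.0 - r_12 ** 2)
    beta2 = (r_y2 - r_y1 * r_12) / (1.0 - r_12 ** 2)
    r2 = beta1 * r_y1 + beta2 * r_y2
    multiple_r = math.sqrt(r2)
    # Structure coefficient: correlation of a predictor with the predicted scores.
    structure1 = r_y1 / multiple_r
    structure2 = r_y2 / multiple_r
    return {"beta": (beta1, beta2), "r2": r2, "structure": (structure1, structure2)}

# Hypothetical correlations with moderately collinear predictors.
indices = two_predictor_indices(r_y1=0.6, r_y2=0.5, r_12=0.4)
```

With these inputs the second predictor has a modest beta weight (about 0.31) but a sizeable structure coefficient (about 0.75). This kind of divergence is why consulting several indices can be informative under multicollinearity.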
Adjusting for Confounding in Early Postlaunch Settings: Going Beyond Logistic Regression Models.
Schmidt, Amand F; Klungel, Olaf H; Groenwold, Rolf H H
2016-01-01
Postlaunch data on medical treatments can be analyzed to explore adverse events or relative effectiveness in real-life settings. These analyses are often complicated by the number of potential confounders and the possibility of model misspecification. We conducted a simulation study to compare the performance of logistic regression, propensity score, disease risk score, and stabilized inverse probability weighting methods to adjust for confounding. Model misspecification was induced in the independent derivation dataset. We evaluated performance using relative bias and confidence interval coverage of the true effect, among other metrics. At low events per coefficient (1.0 and 0.5), the logistic regression estimates had a large relative bias (greater than -100%). Bias of the disease risk score estimates was at most 13.48% and 18.83%. For the propensity score model, this was 8.74% and >100%, respectively. At events per coefficient of 1.0 and 0.5, inverse probability weighting frequently failed or reduced to a crude regression, resulting in biases of -8.49% and 24.55%. Coverage of logistic regression estimates became less than the nominal level at events per coefficient ≤5. For the disease risk score, inverse probability weighting, and propensity score, coverage became less than nominal at events per coefficient ≤2.5, ≤1.0, and ≤1.0, respectively. Bias of misspecified disease risk score models was 16.55%. In settings with low events/exposed subjects per coefficient, disease risk score methods can be useful alternatives to logistic regression models, especially when propensity score models cannot be used. Despite better performance of disease risk score methods than logistic regression and propensity score models in small events per coefficient settings, bias and coverage still deviated from nominal.
Yoneoka, Daisuke; Henmi, Masayuki
2017-11-30
Recently, the number of clinical prediction models sharing the same regression task has increased in the medical literature. However, evidence synthesis methodologies that use the results of these regression models have not been sufficiently studied, particularly in meta-analysis settings where only regression coefficients are available. One of the difficulties lies in the differences between the categorization schemes of continuous covariates across different studies. In general, categorization methods using cutoff values are study specific across available models, even if they focus on the same covariates of interest. Differences in the categorization of covariates could lead to serious bias in the estimated regression coefficients and thus in subsequent syntheses. To tackle this issue, we developed synthesis methods for linear regression models with different categorization schemes of covariates. A 2-step approach to aggregate the regression coefficient estimates is proposed. The first step is to estimate the joint distribution of covariates by introducing a latent sampling distribution, which uses one set of individual participant data to estimate the marginal distribution of covariates with categorization. The second step is to use a nonlinear mixed-effects model with correction terms for the bias due to categorization to estimate the overall regression coefficients. Especially in terms of precision, numerical simulations show that our approach outperforms conventional methods, which only use studies with common covariates or ignore the differences between categorization schemes. The method developed in this study is also applied to a series of WHO epidemiologic studies on white blood cell counts. Copyright © 2017 John Wiley & Sons, Ltd.
ERIC Educational Resources Information Center
Quinino, Roberto C.; Reis, Edna A.; Bessegato, Lupercio F.
2013-01-01
This article proposes the use of the coefficient of determination as a statistic for hypothesis testing in multiple linear regression based on distributions acquired by beta sampling. (Contains 3 figures.)
SPSS macros to compare any two fitted values from a regression model.
Weaver, Bruce; Dubois, Sacha
2012-12-01
In regression models with first-order terms only, the coefficient for a given variable is typically interpreted as the change in the fitted value of Y for a one-unit increase in that variable, with all other variables held constant. Therefore, each regression coefficient represents the difference between two fitted values of Y. But the coefficients represent only a fraction of the possible fitted value comparisons that might be of interest to researchers. For many fitted value comparisons that are not captured by any of the regression coefficients, common statistical software packages do not provide the standard errors needed to compute confidence intervals or carry out statistical tests-particularly in more complex models that include interactions, polynomial terms, or regression splines. We describe two SPSS macros that implement a matrix algebra method for comparing any two fitted values from a regression model. The !OLScomp and !MLEcomp macros are for use with models fitted via ordinary least squares and maximum likelihood estimation, respectively. The output from the macros includes the standard error of the difference between the two fitted values, a 95% confidence interval for the difference, and a corresponding statistical test with its p-value.
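The matrix algebra behind such a comparison is compact: the difference between two fitted values is a linear combination c'b of the coefficients, with standard error sqrt(c'Vc), where V is the coefficient covariance matrix. The Python sketch below illustrates that calculation only and is not the SPSS macros. The function name `compare_fitted`, the coefficient values, the covariance matrix, and the default normal critical value of 1.96 are all assumptions made for illustration.

```python
import math

def compare_fitted(b, cov, x1, x2, crit=1.96):
    """Difference between fitted values at design rows x1 and x2, with SE and CI.

    b:    coefficient estimates (intercept first)
    cov:  covariance matrix of the estimates
    crit: critical value for the interval (assumed normal approximation)
    """
    c = [u - v for u, v in zip(x1, x2)]                 # contrast vector
    diff = sum(ci * bi for ci, bi in zip(c, b))         # c'b
    var = sum(c[i] * cov[i][j] * c[j]
              for i in range(len(c)) for j in range(len(c)))  # c'Vc
    se = math.sqrt(var)
    return diff, se, (diff - crit * se, diff + crit * se)

# Hypothetical quadratic model y = b0 + b1*x + b2*x^2; compare x = 3 with x = 1.
b = [2.0, 1.5, -0.2]
cov = [[0.25, -0.10, 0.010],
       [-0.10, 0.06, -0.007],
       [0.010, -0.007, 0.001]]
diff, se, ci = compare_fitted(b, cov, x1=[1, 3, 9], x2=[1, 1, 1])
```

No single coefficient of the quadratic model equals this difference. That is the situation in which a contrast-based standard error is needed.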
NASA Astrophysics Data System (ADS)
Setiyorini, Anis; Suprijadi, Jadi; Handoko, Budhi
2017-03-01
Geographically Weighted Regression (GWR) is a regression model that takes into account the spatial heterogeneity effect. In the application of the GWR, inference on regression coefficients is often of interest, as is estimation and prediction of the response variable. Empirical research and studies have demonstrated that local correlation between explanatory variables can lead to estimated regression coefficients in GWR that are strongly correlated, a condition named multicollinearity. This in turn results in large standard errors on the estimated regression coefficients and is hence problematic for inference on relationships between variables. Geographically Weighted Lasso (GWL) is a method capable of dealing with spatial heterogeneity and local multicollinearity in spatial data sets. GWL is a further development of the GWR method, which adds a LASSO (Least Absolute Shrinkage and Selection Operator) constraint in parameter estimation. In this study, GWL is applied using a fixed exponential kernel weights matrix to model poverty on Java Island, Indonesia. The results of applying the GWL to poverty datasets show that this method stabilizes regression coefficients in the presence of multicollinearity and produces lower prediction and estimation error of the response variable than GWR does.
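The local weighting step that GWR and GWL share can be sketched as weighted least squares centred on each location. This is a minimal sketch under stated assumptions: the exponential kernel is taken as w = exp(-d / bandwidth) with a fixed bandwidth, only one predictor is used, and the LASSO constraint is left out entirely. The function names and the data are hypothetical.

```python
import math

def exp_kernel_weights(target, sites, bandwidth):
    """Fixed exponential kernel: weight decays with Euclidean distance from target."""
    return [math.exp(-math.dist(target, s) / bandwidth) for s in sites]

def local_fit(target, sites, x, y, bandwidth):
    """Weighted least squares for y = a + b*x, with weights centred on target."""
    w = exp_kernel_weights(target, sites, bandwidth)
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical transect of ten sites: the x-y relationship is y = x in the
# west (first five sites) and y = 3x in the east (last five sites).
sites = [(float(k), 0.0) for k in range(10)]
x = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
y = [1, 2, 3, 4, 5, 3, 6, 9, 12, 15]

west_intercept, west_slope = local_fit(sites[0], sites, x, y, bandwidth=1.0)
east_intercept, east_slope = local_fit(sites[9], sites, x, y, bandwidth=1.0)
```

The two local slopes differ because each fit is dominated by nearby observations, which is how the method expresses spatial heterogeneity.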
An improved multiple linear regression and data analysis computer program package
NASA Technical Reports Server (NTRS)
Sidik, S. M.
1972-01-01
NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.
NASA Astrophysics Data System (ADS)
Zhan, Liwei; Li, Chengwei
2017-02-01
A hybrid PSO-SVM-based model is proposed to predict the friction coefficient between aircraft tire and coating. The presented hybrid model combines a support vector machine (SVM) with the particle swarm optimization (PSO) technique. SVM has been adopted to solve regression problems successfully. Its regression accuracy depends strongly on optimizing parameters such as the regularization constant C, the RBF kernel parameter γ, and the epsilon parameter ε in the SVM training procedure. However, SVM-based prediction of the friction coefficient between aircraft tire and coating has yet to be explored. The experiment reveals that drop height and tire rotational speed are the factors affecting the friction coefficient. With this in mind, the friction coefficient can be predicted using the hybrid PSO-SVM-based model from the measured friction coefficient between aircraft tire and coating. To compare regression accuracy, a grid search (GS) method and a genetic algorithm (GA) are also used to optimize the relevant parameters (C, γ and ε). The regression accuracy is reflected by the coefficient of determination (R²). The result shows that the hybrid PSO-RBF-SVM-based model has better accuracy compared with the GS-RBF-SVM- and GA-RBF-SVM-based models. The agreement of this model (PSO-RBF-SVM) with experiment data confirms its good performance.
Zhao, Yu Xi; Xie, Ping; Sang, Yan Fang; Wu, Zi Yi
2018-04-01
Hydrological process evaluation is temporally dependent. Hydrological time series that include dependence components do not meet the data consistency assumption for hydrological computation. Both of those factors cause great difficulty for water research. Given the existence of hydrological dependence variability, we proposed a correlation-coefficient-based method for significance evaluation of hydrological dependence based on the auto-regression model. By calculating the correlation coefficient between the original series and its dependence component and selecting reasonable thresholds of the correlation coefficient, this method divides the significance degree of dependence into no variability, weak variability, mid variability, strong variability, and drastic variability. By deducing the relationship between the correlation coefficient and the auto-correlation coefficient at each order of the series, we found that the correlation coefficient was mainly determined by the magnitude of the auto-correlation coefficients from order 1 to order p, which clarified the theoretical basis of this method. With the first-order and second-order auto-regression models as examples, the reasonability of the deduced formula was verified through Monte-Carlo experiments to classify the relationship between correlation coefficient and auto-correlation coefficient. This method was used to analyze three observed hydrological time series. The results indicated the coexistence of stochastic and dependence characteristics in hydrological process.
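For the first-order case the relationship can be checked with a small Monte Carlo experiment. In an AR(1) series x_t = φ·x_{t-1} + e_t the dependence component is φ·x_{t-1}, and its correlation with x_t equals |φ|, which is also the magnitude of the lag-1 auto-correlation. The sketch below assumes Gaussian noise and a hypothetical φ of 0.6. It does not reproduce the paper's thresholds or its higher-order formula.

```python
import math
import random

def pearson(a, b):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)

def simulate_ar1(phi, n, seed=0, burn_in=200):
    """Simulate n values of x_t = phi * x_{t-1} + e_t with standard normal noise."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n + burn_in):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x[burn_in + 1:]          # discard the start-up transient

phi = 0.6                            # hypothetical AR(1) coefficient
series = simulate_ar1(phi, n=20000)
current = series[1:]
dependence = [phi * v for v in series[:-1]]   # dependence component of each x_t

r = pearson(current, dependence)              # series vs. dependence component
rho1 = pearson(series[1:], series[:-1])       # lag-1 auto-correlation
```

In this setup `r` and `rho1` agree up to rounding and both should lie close to 0.6. Thresholds on `r` therefore amount to thresholds on the strength of first-order dependence.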
Lamm, Steven H.; Ferdosi, Hamid; Dissen, Elisabeth K.; Li, Ji; Ahn, Jaeil
2015-01-01
High levels (> 200 µg/L) of inorganic arsenic in drinking water are known to be a cause of human lung cancer, but the evidence at lower levels is uncertain. We have sought the epidemiological studies that have examined the dose-response relationship between arsenic levels in drinking water and the risk of lung cancer over a range that includes both high and low levels of arsenic. Regression analysis, based on six studies identified from an electronic search, examined the relationship between the log of the relative risk and the log of the arsenic exposure over a range of 1–1000 µg/L. The best-fitting continuous meta-regression model was sought and found to be a no-constant linear-quadratic analysis where both the risk and the exposure had been logarithmically transformed. This yielded both a statistically significant positive coefficient for the quadratic term and a statistically significant negative coefficient for the linear term. Sub-analyses by study design yielded results that were similar for both ecological studies and non-ecological studies. Statistically significant X-intercepts consistently found no increased level of risk at approximately 100–150 µg/L arsenic. PMID:26690190
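The X-intercept of a no-constant linear-quadratic model in log-log form follows directly from its two coefficients. Writing ln(RR) = b1·ln(x) + b2·ln(x)², the log relative risk is zero at x = 1 and at x = exp(-b1/b2). The coefficients below are purely illustrative, chosen with a negative linear and a positive quadratic term as the abstract describes. They are not the study's estimates, and the function names are hypothetical.

```python
import math

def log_relative_risk(x, b1, b2):
    """No-constant linear-quadratic model on the log-log scale."""
    lx = math.log(x)
    return b1 * lx + b2 * lx ** 2

def nontrivial_x_intercept(b1, b2):
    """Exposure above which the modelled relative risk exceeds 1 (when b1 < 0 < b2)."""
    return math.exp(-b1 / b2)

# Illustrative coefficients only: negative linear term, positive quadratic term.
b1, b2 = -0.5, 0.1
threshold = nontrivial_x_intercept(b1, b2)          # exposure in µg/L
below = log_relative_risk(50.0, b1, b2)             # negative: no excess risk
above = log_relative_risk(500.0, b1, b2)            # positive: excess risk
```

With these made-up values the threshold is exp(5), roughly 148 µg/L. The modelled log relative risk is negative below it and positive above it.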
Use of Thematic Mapper for water quality assessment
NASA Technical Reports Server (NTRS)
Horn, E. M.; Morrissey, L. A.
1984-01-01
The evaluation of simulated TM data obtained on an ER-2 aircraft at twenty-five predesignated sample sites for mapping water quality factors such as conductivity, pH, suspended solids, turbidity, temperature, and depth, is discussed. Using a multiple regression for the seven TM bands, an equation is developed for the suspended solids. TM bands 1, 2, 3, 4, and 6 are used with logarithm conductivity in a multiple regression. The assessment of regression equations for a high coefficient of determination (R-squared) and statistical significance is considered. Confidence intervals about the mean regression point are calculated in order to assess the robustness of the regressions used for mapping conductivity, turbidity, and suspended solids, and by regressing random subsamples of sites and comparing the resultant range of R-squared, cross validation is conducted.
Detection of Cutting Tool Wear using Statistical Analysis and Regression Model
NASA Astrophysics Data System (ADS)
Ghani, Jaharah A.; Rizal, Muhammad; Nuawi, Mohd Zaki; Haron, Che Hassan Che; Ramli, Rizauddin
2010-10-01
This study presents a new method for detecting cutting tool wear based on the measured cutting force signals. A statistical method called the Integrated Kurtosis-based Algorithm for Z-Filter technique (I-kaz) was used for developing a regression model and a 3D graphic presentation of the I-kaz 3D coefficient during the machining process. The machining tests were carried out using a CNC turning machine Colchester Master Tornado T4 in dry cutting condition. A Kistler 9255B dynamometer was used to measure the cutting force signals, which were transmitted, analyzed, and displayed in the DasyLab software. Various force signals from the machining operation were analyzed, and each has its own I-kaz 3D coefficient. This coefficient was examined and its relationship with flank wear land (VB) was determined. A regression model was developed from this relationship, and the results of the regression model show that the I-kaz 3D coefficient value decreases as tool wear increases. The result is then used for real-time tool wear monitoring.
AMINI, Payam; AHMADINIA, Hasan; POOROLAJAL, Jalal; MOQADDASI AMIRI, Mohammad
2016-01-01
Background: We aimed to assess the high-risk group for suicide using different classification methods including logistic regression (LR), decision tree (DT), artificial neural network (ANN), and support vector machine (SVM). Methods: We used the dataset of a study conducted to predict risk factors of completed suicide in Hamadan Province, the west of Iran, in 2010. To evaluate the high-risk groups for suicide, LR, SVM, DT and ANN were performed. The applied methods were compared using sensitivity, specificity, positive predictive value, negative predictive value, accuracy and the area under the curve. The Cochran Q test was applied to check differences in proportion among methods. To assess the association between the observed and predicted values, the Ø coefficient, contingency coefficient, and Kendall tau-b were calculated. Results: Gender, age, and job were the most important risk factors for fatal suicide attempts in common for the four methods. The SVM method showed the highest accuracy, 0.68 and 0.67 for the training and testing samples, respectively. However, this method resulted in the highest specificity (0.67 for the training and 0.68 for the testing sample) and the highest sensitivity for the training sample (0.85), but the lowest sensitivity for the testing sample (0.53). The Cochran Q test showed differences between proportions among the methods (P<0.001). For the association between SVM predictions and observed values, the Ø coefficient, contingency coefficient, and Kendall tau-b were 0.239, 0.232 and 0.239, respectively. Conclusion: SVM had the best performance in classifying fatal suicide attempts compared with DT, LR and ANN. PMID:27957463
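The comparison measures named in this abstract can all be derived from a 2x2 confusion matrix. The sketch below shows the standard definitions; the counts are made up for illustration and are not taken from the study, and the function name binary_metrics is an assumption of this example.

```python
import math

def binary_metrics(tp, fp, fn, tn):
    """Standard 2x2 confusion-matrix measures for a binary classifier."""
    n = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / n,
        # Phi coefficient: association between predicted and observed classes.
        "phi": (tp * tn - fp * fn)
               / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }

# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = binary_metrics(tp=40, fp=20, fn=10, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A method can rank first on one measure and last on another, as the sensitivity figures in the abstract show, so it helps to report several of these side by side.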
SCI model structure determination program (OSR) user's guide. [optimal subset regression
NASA Technical Reports Server (NTRS)
1979-01-01
The computer program, OSR (Optimal Subset Regression) which estimates models for rotorcraft body and rotor force and moment coefficients is described. The technique used is based on the subset regression algorithm. Given time histories of aerodynamic coefficients, aerodynamic variables, and control inputs, the program computes correlation between various time histories. The model structure determination is based on these correlations. Inputs and outputs of the program are given.
Pape, B E; Cary, P L; Clay, L C; Godolphin, W
1983-01-01
Pentobarbital serum concentrations associated with a high-dose therapeutic regimen were determined using EMIT immunoassay reagents. Replicate analyses of serum controls resulted in a within-assay coefficient of variation of 5.0% and a between-assay coefficient of variation of 10%. Regression analysis of 44 serum samples analyzed by this technique (y) and a reference procedure (x) were y = 0.98x + 3.6 (r = 0.98; x = ultraviolet spectroscopy) and y = 1.04x + 2.4 (r = 0.96; x = high-performance liquid chromatography). Clinical evaluation of the results indicates the immunoassay is sufficiently sensitive and selective for pentobarbital to allow accurate quantitation within the therapeutic range associated with high-dose therapy.
Zhao, Yang; Zhang, Xue Qing; Bian, Xiao Dong
2018-01-01
To investigate the early supplementary processes of fishery resources in the Bohai Sea, the geographically weighted regression (GWR) was introduced to the habitat suitability index (HSI) model. The Bohai Sea larval Japanese Halfbeak HSI GWR model was established with four environmental variables, including sea surface temperature (SST), sea surface salinity (SSS), water depth (DEP), and chlorophyll a concentration (Chl a). Results of the simulation showed that the four variables had different performances in August 2015. SST and Chl a were global variables, and had little impact on HSI, with regression coefficients of -0.027 and 0.006, respectively. SSS and DEP were local variables, and had larger impacts on HSI, with average absolute values of their regression coefficients of 0.075 and 0.129, respectively. In the central Bohai Sea, SSS showed a negative correlation with HSI, and the most negative correlation coefficient was -0.3. In contrast, SSS was correlated positively but weakly with HSI in the three bays of Bohai Sea, and the largest correlation coefficient was 0.1. In particular, DEP and HSI were negatively correlated in the entire Bohai Sea, while they were more negatively correlated in the three bays of Bohai than in the central Bohai Sea, and the most negative correlation coefficient was -0.16 in the three bays. The Poisson regression coefficient of the HSI GWR model was 0.705, consistent with field measurements. Therefore, it could provide a new method for the research on fish habitats in the future.
Reimus, Paul W; Callahan, Timothy J; Ware, S Doug; Haga, Marc J; Counce, Dale A
2007-08-15
Diffusion cell experiments were conducted to measure nonsorbing solute matrix diffusion coefficients in forty-seven different volcanic rock matrix samples from eight different locations (with multiple depth intervals represented at several locations) at the Nevada Test Site. The solutes used in the experiments included bromide, iodide, pentafluorobenzoate (PFBA), and tritiated water ((3)HHO). The porosity and saturated permeability of most of the diffusion cell samples were measured to evaluate the correlation of these two variables with tracer matrix diffusion coefficients divided by the free-water diffusion coefficient (D(m)/D*). To investigate the influence of fracture coating minerals on matrix diffusion, ten of the diffusion cells represented paired samples from the same depth interval in which one sample contained a fracture surface with mineral coatings and the other sample consisted of only pure matrix. The log of (D(m)/D*) was found to be positively correlated with both the matrix porosity and the log of matrix permeability. A multiple linear regression analysis indicated that both parameters contributed significantly to the regression at the 95% confidence level. However, the log of the matrix diffusion coefficient was more highly-correlated with the log of matrix permeability than with matrix porosity, which suggests that matrix diffusion coefficients, like matrix permeabilities, have a greater dependence on the interconnectedness of matrix porosity than on the matrix porosity itself. The regression equation for the volcanic rocks was found to provide satisfactory predictions of log(D(m)/D*) for other types of rocks with similar ranges of matrix porosity and permeability as the volcanic rocks, but it did a poorer job predicting log(D(m)/D*) for rocks with lower porosities and/or permeabilities. The presence of mineral coatings on fracture walls did not appear to have a significant effect on matrix diffusion in the ten paired diffusion cell experiments.
NASA Astrophysics Data System (ADS)
Reimus, Paul W.; Callahan, Timothy J.; Ware, S. Doug; Haga, Marc J.; Counce, Dale A.
2007-08-01
Diffusion cell experiments were conducted to measure nonsorbing solute matrix diffusion coefficients in forty-seven different volcanic rock matrix samples from eight different locations (with multiple depth intervals represented at several locations) at the Nevada Test Site. The solutes used in the experiments included bromide, iodide, pentafluorobenzoate (PFBA), and tritiated water ( 3HHO). The porosity and saturated permeability of most of the diffusion cell samples were measured to evaluate the correlation of these two variables with tracer matrix diffusion coefficients divided by the free-water diffusion coefficient ( Dm/ D*). To investigate the influence of fracture coating minerals on matrix diffusion, ten of the diffusion cells represented paired samples from the same depth interval in which one sample contained a fracture surface with mineral coatings and the other sample consisted of only pure matrix. The log of ( Dm/ D*) was found to be positively correlated with both the matrix porosity and the log of matrix permeability. A multiple linear regression analysis indicated that both parameters contributed significantly to the regression at the 95% confidence level. However, the log of the matrix diffusion coefficient was more highly-correlated with the log of matrix permeability than with matrix porosity, which suggests that matrix diffusion coefficients, like matrix permeabilities, have a greater dependence on the interconnectedness of matrix porosity than on the matrix porosity itself. The regression equation for the volcanic rocks was found to provide satisfactory predictions of log( Dm/ D*) for other types of rocks with similar ranges of matrix porosity and permeability as the volcanic rocks, but it did a poorer job predicting log( Dm/ D*) for rocks with lower porosities and/or permeabilities. The presence of mineral coatings on fracture walls did not appear to have a significant effect on matrix diffusion in the ten paired diffusion cell experiments.
ORACLE INEQUALITIES FOR THE LASSO IN THE COX MODEL
Huang, Jian; Sun, Tingni; Ying, Zhiliang; Yu, Yi; Zhang, Cun-Hui
2013-01-01
We study the absolute penalized maximum partial likelihood estimator in sparse, high-dimensional Cox proportional hazards regression models where the number of time-dependent covariates can be larger than the sample size. We establish oracle inequalities based on natural extensions of the compatibility and cone invertibility factors of the Hessian matrix at the true regression coefficients. Similar results based on an extension of the restricted eigenvalue can be also proved by our method. However, the presented oracle inequalities are sharper since the compatibility and cone invertibility factors are always greater than the corresponding restricted eigenvalue. In the Cox regression model, the Hessian matrix is based on time-dependent covariates in censored risk sets, so that the compatibility and cone invertibility factors, and the restricted eigenvalue as well, are random variables even when they are evaluated for the Hessian at the true regression coefficients. Under mild conditions, we prove that these quantities are bounded from below by positive constants for time-dependent covariates, including cases where the number of covariates is of greater order than the sample size. Consequently, the compatibility and cone invertibility factors can be treated as positive constants in our oracle inequalities. PMID:24086091
ORACLE INEQUALITIES FOR THE LASSO IN THE COX MODEL.
Huang, Jian; Sun, Tingni; Ying, Zhiliang; Yu, Yi; Zhang, Cun-Hui
2013-06-01
We study the absolute penalized maximum partial likelihood estimator in sparse, high-dimensional Cox proportional hazards regression models where the number of time-dependent covariates can be larger than the sample size. We establish oracle inequalities based on natural extensions of the compatibility and cone invertibility factors of the Hessian matrix at the true regression coefficients. Similar results based on an extension of the restricted eigenvalue can be also proved by our method. However, the presented oracle inequalities are sharper since the compatibility and cone invertibility factors are always greater than the corresponding restricted eigenvalue. In the Cox regression model, the Hessian matrix is based on time-dependent covariates in censored risk sets, so that the compatibility and cone invertibility factors, and the restricted eigenvalue as well, are random variables even when they are evaluated for the Hessian at the true regression coefficients. Under mild conditions, we prove that these quantities are bounded from below by positive constants for time-dependent covariates, including cases where the number of covariates is of greater order than the sample size. Consequently, the compatibility and cone invertibility factors can be treated as positive constants in our oracle inequalities.
Kolasa-Wiecek, Alicja
2015-04-01
The energy sector in Poland is the source of 81% of greenhouse gas (GHG) emissions. Poland, among other European Union countries, occupies a leading position with regard to coal consumption. The Polish energy sector actively participates in efforts to reduce GHG emissions to the atmosphere, through a gradual decrease of the share of coal in the fuel mix and development of renewable energy sources. All evidence which completes the knowledge about issues related to GHG emissions is a valuable source of information. The article presents the results of modeling of GHG emissions which are generated by the energy sector in Poland. For a better understanding of the quantitative relationship between total consumption of primary energy and greenhouse gas emission, a multiple stepwise regression model was applied. The modeling results of CO2 emissions demonstrate a high relationship (0.97) with the hard coal consumption variable. The adjustment coefficient of the model to actual data is high and equal to 95%. The backward step regression model, in the case of CH4 emission, indicated the presence of hard coal (0.66), peat and fuel wood (0.34), solid waste fuels, as well as other sources (-0.64) as the most important variables. The adjusted coefficient is suitable and equals R2=0.90. For N2O emission modeling the obtained coefficient of determination is low and equal to 43%. A significant variable influencing the amount of N2O emission is the peat and wood fuel consumption. Copyright © 2015. Published by Elsevier B.V.
2013-08-01
release; distribution unlimited. PA Number 412-TW-PA-13395 f generic function g acceleration due to gravity h altitude L aerodynamic lift force L Lagrange...cost m vehicle mass M Mach number n number of coefficients in polynomial regression p highest order of polynomial regression Q dynamic pressure R...Method (RPM); the collocation points are defined by the roots of Legendre-Gauss-Radau (LGR) functions.9 GPOPS also automatically refines the “mesh” by
Nikol'skii, A A
2017-11-01
Dependence of the sound-signal frequency on the animal body length was studied in 14 ground squirrel species (genus Spermophilus) of Eurasia. Regression analysis of the total sample yielded a low determination coefficient (R2 = 26%), because the total sample proved to be heterogeneous in terms of signal frequency within the dimension classes of animals. When the total sample was divided into two groups according to signal frequency, two statistically significant models (regression equations) were obtained in which signal frequency depended on the body size at high determination coefficients (R2 = 73 and 94% versus 26% for the total sample). Thus, the problem of correlation between animal body size and the frequency of their vocal signals does not have a unique solution.
[Quantitative determination of glass content in monazite glass-ceramics by IR technique].
He, Yong; Zhang, Bao-min
2003-04-01
Monazite glass-ceramics consist of both monazite and metaphosphate glass phases. The absorption bands of the two phases do not overlap, and the absorption intensities of the bands at 1,275 and 616 cm-1 vary with the glass content. The correlation coefficient between the logarithmic absorbance ratio of the two bands and glass content was r = 0.9975 and its regression equation was y = 48.356 + 25.93x. The absorbance ratio of bands 952 and 616 cm-1 also varied with different ratios of Ce2O3/La2O3 in synthetic monazites, with r = 0.9917 and a regression equation y = 0.2211 exp (0.0221x). High correlation coefficients show that the IR technique could find new application in the quantitative analysis of glass content in phosphate glass-ceramics.
Thompson, Ronald E.; Hoffman, Scott A.
2006-01-01
A suite of 28 streamflow statistics, ranging from extreme low to high flows, was computed for 17 continuous-record streamflow-gaging stations and predicted for 20 partial-record stations in Monroe County and contiguous counties in northeastern Pennsylvania. The predicted statistics for the partial-record stations were based on regression analyses relating intermittent flow measurements made at the partial-record stations indexed to concurrent daily mean flows at continuous-record stations during base-flow conditions. The same statistics also were predicted for 134 ungaged stream locations in Monroe County on the basis of regression analyses relating the statistics to GIS-determined basin characteristics for the continuous-record station drainage areas. The prediction methodology for developing the regression equations used to estimate statistics was developed for estimating low-flow frequencies. This study and a companion study found that the methodology also has application potential for predicting intermediate- and high-flow statistics. The statistics included mean monthly flows, mean annual flow, 7-day low flows for three recurrence intervals, nine flow durations, mean annual base flow, and annual mean base flows for two recurrence intervals. Low standard errors of prediction and high coefficients of determination (R2) indicated good results in using the regression equations to predict the statistics. Regression equations for the larger flow statistics tended to have lower standard errors of prediction and higher coefficients of determination (R2) than equations for the smaller flow statistics. The report discusses the methodologies used in determining the statistics and the limitations of the statistics and the equations used to predict the statistics. Caution is indicated in using the predicted statistics for small drainage area situations. Study results constitute input needed by water-resource managers in Monroe County for planning purposes and evaluation of water-resources availability.
Genetic parameters for stayability to consecutive calvings in Zebu cattle.
Silva, D O; Santana, M L; Ayres, D R; Menezes, G R O; Silva, L O C; Nobre, P R C; Pereira, R J
2017-12-22
Longer-lived cows tend to be more profitable and the stayability trait is a selection criterion correlated to longevity. An alternative to the traditional approach to evaluate stayability is its definition based on consecutive calvings, whose main advantage is the more accurate evaluation of young bulls. However, no study using this alternative approach has been conducted for Zebu breeds. Therefore, the objective of this study was to compare linear random regression models to fit stayability to consecutive calvings of Guzerá, Nelore and Tabapuã cows and to estimate genetic parameters for this trait in the respective breeds. Data up to the eighth calving were used. The models included the fixed effects of age at first calving and year-season of birth of the cow and the random effects of contemporary group, additive genetic, permanent environmental and residual. Random regressions were modeled by orthogonal Legendre polynomials of order 1 to 4 (2 to 5 coefficients) for contemporary group, additive genetic and permanent environmental effects. Using Deviance Information Criterion as the selection criterion, the model with 4 regression coefficients for each effect was the most adequate for the Nelore and Tabapuã breeds and the model with 5 coefficients is recommended for the Guzerá breed. For Guzerá, heritabilities ranged from 0.05 to 0.08, showing a quadratic trend with a peak between the fourth and sixth calving. For the Nelore and Tabapuã breeds, the estimates ranged from 0.03 to 0.07 and from 0.03 to 0.08, respectively, and increased with increasing calving number. The additive genetic correlations exhibited a similar trend among breeds and were higher for stayability between closer calvings. Even between more distant calvings (second v. eighth), stayability showed a moderate to high genetic correlation, which was 0.77, 0.57 and 0.79 for the Guzerá, Nelore and Tabapuã breeds, respectively. 
For Guzerá, when the models with 4 or 5 regression coefficients were compared, the rank correlations between predicted breeding values for the intercept were always higher than 0.99, indicating the possibility of practical application of the least parameterized model. In conclusion, the model with 4 random regression coefficients is recommended for the genetic evaluation of stayability to consecutive calvings in Zebu cattle.
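To make the phrase "Legendre polynomials of order 1 to 4 (2 to 5 coefficients)" concrete, the sketch below builds the covariate matrix for an order-3 fit, which has 4 columns and therefore 4 regression coefficients per random effect. Several things here are assumptions of this example rather than details from the abstract: the calving range of 2 to 8, the function name legendre_covariates, and the use of unnormalized Legendre polynomials (animal-breeding software often applies a normalizing constant to each column).

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_covariates(t, t_min, t_max, order):
    """Legendre covariates for time points t, after rescaling t to [-1, 1]."""
    t = np.asarray(t, dtype=float)
    x = -1.0 + 2.0 * (t - t_min) / (t_max - t_min)
    # Column k holds the degree-k Legendre polynomial evaluated at x.
    return legendre.legvander(x, order)

# Hypothetical calving numbers 2..8, standardized to the interval [-1, 1].
calvings = np.arange(2, 9)
Z = legendre_covariates(calvings, t_min=2, t_max=8, order=3)

print(Z.shape)        # rows = calvings, columns = order + 1 coefficients
print(np.round(Z, 3))
```

Each animal's random regression is then a linear combination of these columns, so a model "with 4 regression coefficients" corresponds to order 3 in this construction.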
Microbial Transformation of Esters of Chlorinated Carboxylic Acids
Paris, D. F.; Wolfe, N. L.; Steen, W. C.
1984-01-01
Two groups of compounds were selected for microbial transformation studies. In the first group were carboxylic acid esters having a fixed aromatic moiety and an increasing length of the alkyl component. Ethyl esters of chlorine-substituted carboxylic acids were in the second group. Microorganisms from environmental waters and a pure culture of Pseudomonas putida U were used. The bacterial populations were monitored by plate counts, and disappearance of the parent compound was followed by gas-liquid chromatography as a function of time. The products of microbial hydrolysis were the respective carboxylic acids. Octanol-water partition coefficients (Kow) for the compounds were measured. These values spanned three orders of magnitude, whereas microbial transformation rate constants (kb) varied only 50-fold. The microbial rate constants of the carboxylic acid esters with a fixed aromatic moiety increased with an increasing length of alkyl substituents. The regression coefficient for the linear relationships between log kb and log Kow was high for group 1 compounds, indicating that these parameters correlated well. The regression coefficient for the linear relationships for group 2 compounds, however, was low, indicating that these parameters correlated poorly. PMID:16346459
Confidence Intervals for Squared Semipartial Correlation Coefficients: The Effect of Nonnormality
ERIC Educational Resources Information Center
Algina, James; Keselman, H. J.; Penfield, Randall D.
2010-01-01
The increase in the squared multiple correlation coefficient ([delta]R[superscript 2]) associated with a variable in a regression equation is a commonly used measure of importance in regression analysis. Algina, Keselman, and Penfield found that intervals based on asymptotic principles were typically very inaccurate, even though the sample size…
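The quantity discussed in this abstract, the increase in R-squared associated with one predictor, is the R-squared of the full model minus the R-squared of the model with that predictor removed. The sketch below computes only this point estimate on synthetic data; it does not reproduce the confidence-interval methods the authors studied, and the function names are assumptions of this example.

```python
import numpy as np

def r_squared(X, y):
    """Squared multiple correlation for a least-squares fit with intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - float(resid @ resid) / float(((y - y.mean()) ** 2).sum())

def delta_r_squared(X, y, j):
    """Increase in R-squared from adding predictor j to the remaining predictors."""
    full = r_squared(X, y)
    reduced = r_squared(np.delete(X, j, axis=1), y)
    return full - reduced

# Synthetic data (illustrative only): the third predictor has no true effect.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = 1.0 + 0.6 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(size=100)

deltas = [delta_r_squared(X, y, j) for j in range(3)]
print([round(d, 4) for d in deltas])
```

Because the reduced model is nested in the full model, each difference is non-negative apart from floating-point error.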
Estimation of octanol/water partition coefficients using LSER parameters
Luehrs, Dean C.; Hickey, James P.; Godbole, Kalpana A.; Rogers, Tony N.
1998-01-01
The logarithms of octanol/water partition coefficients, logKow, were regressed against the linear solvation energy relationship (LSER) parameters for a training set of 981 diverse organic chemicals. The standard deviation for logKow was 0.49. The regression equation was then used to estimate logKow for a test of 146 chemicals which included pesticides and other diverse polyfunctional compounds. Thus the octanol/water partition coefficient may be estimated by LSER parameters without elaborate software but only moderate accuracy should be expected.
Thermal requirements of Dermanyssus gallinae (De Geer, 1778) (Acari: Dermanyssidae).
Tucci, Edna Clara; do Prado, Angelo P; de Araújo, Raquel Pires
2008-01-01
The thermal requirements for development of Dermanyssus gallinae were studied under laboratory conditions at 15, 20, 25, 30 and 35 degrees C, a 12h photoperiod and 60-85% RH. The thermal requirements for D. gallinae were as follows. Preoviposition: base temperature 3.4 degrees C, thermal constant (k) 562.85 degree-hours, determination coefficient (R(2)) 0.59, regression equation: Y= -0.006035 + 0.001777x. Egg: base temperature 10.60 degrees C, thermal constant (k) 689.65 degree-hours, determination coefficient (R(2)) 0.94, regression equation: Y= -0.015367 + 0.001450x. Larva: base temperature 9.82 degrees C, thermal constant (k) 464.91 degree-hours, determination coefficient (R(2)) 0.87, regression equation: Y= -0.021123 + 0.002151x. Protonymph: base temperature 10.17 degrees C, thermal constant (k) 504.49 degree-hours, determination coefficient (R(2)) 0.90, regression equation: Y= -0.020152 + 0.001982x. Deutonymph: base temperature 11.80 degrees C, thermal constant (k) 501.11 degree-hours, determination coefficient (R(2)) 0.99, regression equation: Y= -0.023555 + 0.001996x. The results obtained showed that 15 to 42 generations of Dermanyssus gallinae may occur during the year in the State of São Paulo, as estimated based on isotherm charts. Dermanyssus gallinae may develop continually in the State of São Paulo, with a population decrease in the winter. There were differences between the developmental stages of D. gallinae in relation to thermal requirements.
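The base temperatures and thermal constants reported above appear consistent with the usual linear degree-time model, in which development rate is Y = a + bx, the base temperature is -a/b (the temperature at which Y reaches zero) and the thermal constant is k = 1/b. That reading is an inference from the numbers, not something the abstract states. The sketch below recomputes both quantities from the published regression equations; small differences from the reported values are expected because the printed coefficients are rounded.

```python
def thermal_requirements(intercept, slope):
    """Base temperature (-a/b) and thermal constant (1/b) for Y = a + b*x."""
    base_temperature = -intercept / slope
    thermal_constant = 1.0 / slope
    return base_temperature, thermal_constant

# (intercept a, slope b) copied from the regression equations in the abstract.
stages = {
    "Preoviposition": (-0.006035, 0.001777),
    "Egg": (-0.015367, 0.001450),
    "Larva": (-0.021123, 0.002151),
    "Protonymph": (-0.020152, 0.001982),
    "Deutonymph": (-0.023555, 0.001996),
}

results = {name: thermal_requirements(a, b) for name, (a, b) in stages.items()}
for name, (tb, k) in results.items():
    print(f"{name}: base temperature {tb:.2f} degrees C, k {k:.2f} degree-hours")
```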
Biostatistics Series Module 6: Correlation and Linear Regression.
Hazra, Avijit; Gogtay, Nithya
2016-01-01
Correlation and linear regression are the most commonly used techniques for quantifying the association between two numeric variables. Correlation quantifies the strength of the linear relationship between paired variables, expressing this as a correlation coefficient. If both variables x and y are normally distributed, we calculate Pearson's correlation coefficient (r). If normality assumption is not met for one or both variables in a correlation analysis, a rank correlation coefficient, such as Spearman's rho (ρ) may be calculated. A hypothesis test of correlation tests whether the linear relationship between the two variables holds in the underlying population, in which case it returns a P < 0.05. A 95% confidence interval of the correlation coefficient can also be calculated for an idea of the correlation in the population. The value r2 denotes the proportion of the variability of the dependent variable y that can be attributed to its linear relation with the independent variable x and is called the coefficient of determination. Linear regression is a technique that attempts to link two correlated variables x and y in the form of a mathematical equation (y = a + bx), such that given the value of one variable the other may be predicted. In general, the method of least squares is applied to obtain the equation of the regression line. Correlation and linear regression analysis are based on certain assumptions pertaining to the data sets. If these assumptions are not met, misleading conclusions may be drawn. The first assumption is that of linear relationship between the two variables. A scatter plot is essential before embarking on any correlation-regression analysis to show that this is indeed the case. Outliers or clustering within data sets can distort the correlation coefficient value. Finally, it is vital to remember that though strong correlation can be a pointer toward causation, the two are not synonymous.
Biostatistics Series Module 6: Correlation and Linear Regression
Hazra, Avijit; Gogtay, Nithya
2016-01-01
Correlation and linear regression are the most commonly used techniques for quantifying the association between two numeric variables. Correlation quantifies the strength of the linear relationship between paired variables, expressing this as a correlation coefficient. If both variables x and y are normally distributed, we calculate Pearson's correlation coefficient (r). If normality assumption is not met for one or both variables in a correlation analysis, a rank correlation coefficient, such as Spearman's rho (ρ) may be calculated. A hypothesis test of correlation tests whether the linear relationship between the two variables holds in the underlying population, in which case it returns a P < 0.05. A 95% confidence interval of the correlation coefficient can also be calculated for an idea of the correlation in the population. The value r2 denotes the proportion of the variability of the dependent variable y that can be attributed to its linear relation with the independent variable x and is called the coefficient of determination. Linear regression is a technique that attempts to link two correlated variables x and y in the form of a mathematical equation (y = a + bx), such that given the value of one variable the other may be predicted. In general, the method of least squares is applied to obtain the equation of the regression line. Correlation and linear regression analysis are based on certain assumptions pertaining to the data sets. If these assumptions are not met, misleading conclusions may be drawn. The first assumption is that of linear relationship between the two variables. A scatter plot is essential before embarking on any correlation-regression analysis to show that this is indeed the case. Outliers or clustering within data sets can distort the correlation coefficient value. Finally, it is vital to remember that though strong correlation can be a pointer toward causation, the two are not synonymous. PMID:27904175
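The quantities described in this abstract (Pearson's r, the least-squares line y = a + bx, and the coefficient of determination) can be computed directly from their textbook definitions. The sketch below uses a small made-up data set; the function name simple_linear_regression and the numbers are assumptions for illustration only.

```python
import numpy as np

def simple_linear_regression(x, y):
    """Least-squares intercept a, slope b, and Pearson r for paired data."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    b = (dx @ dy) / (dx @ dx)              # slope of the regression line
    a = y.mean() - b * x.mean()            # intercept
    r = (dx @ dy) / np.sqrt((dx @ dx) * (dy @ dy))  # Pearson correlation
    return a, b, r

# Made-up paired observations that lie close to a straight line.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b, r = simple_linear_regression(x, y)
print(f"y = {a:.2f} + {b:.2f}x, r = {r:.4f}, r squared = {r ** 2:.4f}")
```

For a simple regression with one predictor, the square of r equals the coefficient of determination, which is why the abstract treats the two together. A scatter plot should still come first, since the formulas return a value whether or not the relationship is actually linear.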
The Evaluation on the Cadmium Net Concentration for Soil Ecosystems.
Yao, Yu; Wang, Pei-Fang; Wang, Chao; Hou, Jun; Miao, Ling-Zhan
2017-03-12
Yixing, known as the "City of Ceramics", is facing a new dilemma: a raw material crisis. Cadmium (Cd) exists in extremely high concentrations in soil due to the considerable input of industrial wastewater into the soil ecosystem. The in situ technique of diffusive gradients in thin film (DGT), the ex situ static equilibrium approach (HAc, EDTA and CaCl2), and the dissolved concentration in soil solution, as well as microwave digestion, were applied to predict the Cd bioavailability of soil, aiming to provide a robust and accurate method for Cd bioavailability evaluation in Yixing. Moreover, the typical local cash crops, paddy and zizania aquatica, were selected for Cd accumulation, aiming to select the ideal plants with tolerance to the soil Cd contamination. The results indicated that the biomasses of the two applied plants were sufficiently sensitive to reflect the stark regional differences of different sampling sites. The zizania aquatica could effectively reduce the total Cd concentration, as indicated by the high accumulation coefficients. However, the fact that the zizania aquatica has extremely high transfer coefficients, and its stem, as the edible part, might accumulate large amounts of Cd, led to the conclusion that zizania aquatica was not an ideal cash crop in Yixing. Furthermore, the labile Cd concentrations which were obtained by the DGT technique and dissolved in the soil solution showed a significant correlation with the Cd concentrations of the biota accumulation. However, the ex situ methods and the microwave digestion-obtained Cd concentrations showed a poor correlation with the accumulated Cd concentration in plant tissue. Correspondingly, the multiple linear regression models were built for fundamental analysis of the performance of different methods available for Cd bioavailability evaluation.
The correlation coefficients of DGT obtained by the improved multiple linear regression model have not significantly improved compared to the coefficients obtained by the simple linear regression model. The results revealed that DGT was a robust measurement, which could obtain the labile Cd concentrations independent of the physicochemical features' variation in the soil ecosystem. Consequently, these findings provide stronger evidence that DGT is an effective and ideal tool for labile Cd evaluation in Yixing.
The Evaluation on the Cadmium Net Concentration for Soil Ecosystems
Yao, Yu; Wang, Pei-Fang; Wang, Chao; Hou, Jun; Miao, Ling-Zhan
2017-01-01
Yixing, known as the “City of Ceramics”, is facing a new dilemma: a raw material crisis. Cadmium (Cd) exists in extremely high concentrations in soil due to the considerable input of industrial wastewater into the soil ecosystem. The in situ technique of diffusive gradients in thin film (DGT), the ex situ static equilibrium approach (HAc, EDTA and CaCl2), and the dissolved concentration in soil solution, as well as microwave digestion, were applied to predict the Cd bioavailability of soil, aiming to provide a robust and accurate method for Cd bioavailability evaluation in Yixing. Moreover, the typical local cash crops—paddy and zizania aquatica—were selected for Cd accumulation, aiming to select the ideal plants with tolerance to the soil Cd contamination. The results indicated that the biomasses of the two applied plants were sufficiently sensitive to reflect the stark regional differences of different sampling sites. The zizania aquatica could effectively reduce the total Cd concentration, as indicated by the high accumulation coefficients. However, the fact that the zizania aquatica has extremely high transfer coefficients, and its stem, as the edible part, might accumulate large amounts of Cd, led to the conclusion that zizania aquatica was not an ideal cash crop in Yixing. Furthermore, the labile Cd concentrations which were obtained by the DGT technique and dissolved in the soil solution showed a significant correlation with the Cd concentrations of the biota accumulation. However, the ex situ methods and the microwave digestion-obtained Cd concentrations showed a poor correlation with the accumulated Cd concentration in plant tissue. Correspondingly, the multiple linear regression models were built for fundamental analysis of the performance of different methods available for Cd bioavailability evaluation. 
The correlation coefficients of DGT obtained by the improved multiple linear regression model have not significantly improved compared to the coefficients obtained by the simple linear regression model. The results revealed that DGT was a robust measurement, which could obtain the labile Cd concentrations independent of the physicochemical features’ variation in the soil ecosystem. Consequently, these findings provide stronger evidence that DGT is an effective and ideal tool for labile Cd evaluation in Yixing. PMID:28287500
Informal Peer-Assisted Learning Groups Did Not Lead to Better Performance of Saudi Dental Students.
AbdelSalam, Maha; El Tantawi, Maha; Al-Ansari, Asim; AlAgl, Adel; Al-Harbi, Fahad
2017-01-01
To describe peer-assisted learning (PAL) groups formed by dental undergraduate students in a biomedical course and to investigate the association of individual and group characteristics with academic performance. In 2015, 92 fourth-year students (43 males and 49 females) in the College of Dentistry, University of Dammam, Saudi Arabia, were invited to form PAL groups to study a unit of a biomedical course. An examination was used to assess their knowledge after 2 weeks. In addition, a questionnaire and social network analysis were used to investigate (1) individual student attributes: gender, role, subject matter knowledge, grade in previous year, teaming with friends, previous communication with teammates, and content discussion, and (2) group attributes: group teacher's previous grade, number of colleagues with whom a student connected, teaming with friends, similarity of teammates' previous grades, and teacher having higher previous grades than other teammates. Regression analysis was used to assess the association of examination scores with individual and group attributes. The response rate was 80.4% (74 students: 36 males and 38 females). Students who previously scored grades A and B had higher examination scores than students with grades C/less (regression coefficient = 18.50 and 13.39) within the groups. Higher scores were not associated with working in groups including friends only (regression coefficient = 1.17) or when all students had similar previous grades (regression coefficient = 0.85). Students with previous high grades benefited to a greater extent from working in PAL groups. Similarity of teammates in PAL groups was not associated with better scores. © 2017 S. Karger AG, Basel.
Regression analysis for solving diagnosis problem of children's health
NASA Astrophysics Data System (ADS)
Cherkashina, Yu A.; Gerget, O. M.
2016-04-01
The paper includes results of scientific researches. These researches are devoted to the application of statistical techniques, namely, regression analysis, to assess the health status of children in the neonatal period based on medical data (hemostatic parameters, parameters of blood tests, the gestational age, vascular-endothelial growth factor) measured at 3-5 days of children's life. In this paper a detailed description of the studied medical data is given. A binary logistic regression procedure is discussed in the paper. Basic results of the research are presented. A classification table of predicted values and factual observed values is shown, the overall percentage of correct recognition is determined. Regression equation coefficients are calculated, the general regression equation is written based on them. Based on the results of logistic regression, ROC analysis was performed, sensitivity and specificity of the model are calculated and ROC curves are constructed. These mathematical techniques allow carrying out diagnostics of health of children providing a high quality of recognition. The results make a significant contribution to the development of evidence-based medicine and have a high practical importance in the professional activity of the author.
LANDSAT (MSS): Image demographic estimations
NASA Technical Reports Server (NTRS)
Dejesusparada, N. (Principal Investigator); Foresti, C.
1977-01-01
The author has identified the following significant results. Two sets of urban test sites, one with 35 cities and one with 70 cities, were selected in the State, Sao Paulo. A high degree of colinearity (0.96) was found between urban and areal measurements taken from aerial photographs and LANDSAT MSS imagery. High coefficients were observed when census data were regressed against aerial information (0.95) and LANDSAT data (0.92). The validity of population estimations was tested by regressing three urban variables, against three classes of cities. Results supported the effectiveness of LANDSAT to estimate large city populations with diminishing effectiveness as urban areas decrease in size.
Factor Scores, Structure Coefficients, and Communality Coefficients
ERIC Educational Resources Information Center
Goodwyn, Fara
2012-01-01
This paper presents heuristic explanations of factor scores, structure coefficients, and communality coefficients. Common misconceptions regarding these topics are clarified. In addition, (a) the regression, (b) Bartlett, (c) Anderson-Rubin, and (d) Thompson methods for calculating factor scores are reviewed. Syntax necessary to execute all four…
2010-01-01
Background Whilst patellofemoral pain is one of the most common musculoskeletal disorders presenting to orthopaedic clinics, sports clinics, and general practices, factors contributing to its development in the absence of a defined arthropathy, such as osteoarthritis (OA), are unclear. The aim of this cross-sectional study was to describe the relationships between parameters of patellofemoral geometry (patella inclination, sulcus angle and patella height) and knee pain and patella cartilage volume. Methods 240 community-based adults aged 25-60 years were recruited to take part in a study of obesity and musculoskeletal health. Magnetic resonance imaging (MRI) of the dominant knee was used to determine the lateral condyle-patella angle, sulcus angle, and Insall-Salvati ratio, as well as patella cartilage and bone volumes. Pain was assessed by the Western Ontario and McMaster University Osteoarthritis Index (WOMAC) VA pain subscale. Results Increased lateral condyle-patella angle (increased medial patella inclination) was associated with a reduction in WOMAC pain score (Regression coefficient -1.57, 95% CI -3.05, -0.09) and increased medial patella cartilage volume (Regression coefficient 51.38 mm3, 95% CI 1.68, 101.08 mm3). Higher riding patella as indicated by increased Insall-Salvati ratio was associated with decreased medial patella cartilage volume (Regression coefficient -3187 mm3, 95% CI -5510, -864 mm3). There was a trend for increased lateral patella cartilage volume associated with increased (shallower) sulcus angle (Regression coefficient 43.27 mm3, 95% CI -2.43, 88.98 mm3). Conclusion These results suggest both symptomatic and structural benefits associated with a more medially inclined patella while a high-riding patella may be detrimental to patella cartilage. 
This provides additional theoretical support for the current use of corrective strategies for patella malalignment that are aimed at medial patella translation, although longitudinal studies will be needed to further substantiate this. PMID:20459700
Tanamas, Stephanie K; Teichtahl, Andrew J; Wluka, Anita E; Wang, Yuanyuan; Davies-Tuck, Miranda; Urquhart, Donna M; Jones, Graeme; Cicuttini, Flavia M
2010-05-10
Whilst patellofemoral pain is one of the most common musculoskeletal disorders presenting to orthopaedic clinics, sports clinics, and general practices, factors contributing to its development in the absence of a defined arthropathy, such as osteoarthritis (OA), are unclear. The aim of this cross-sectional study was to describe the relationships between parameters of patellofemoral geometry (patella inclination, sulcus angle and patella height) and knee pain and patella cartilage volume. 240 community-based adults aged 25-60 years were recruited to take part in a study of obesity and musculoskeletal health. Magnetic resonance imaging (MRI) of the dominant knee was used to determine the lateral condyle-patella angle, sulcus angle, and Insall-Salvati ratio, as well as patella cartilage and bone volumes. Pain was assessed by the Western Ontario and McMaster University Osteoarthritis Index (WOMAC) VA pain subscale. Increased lateral condyle-patella angle (increased medial patella inclination) was associated with a reduction in WOMAC pain score (Regression coefficient -1.57, 95% CI -3.05, -0.09) and increased medial patella cartilage volume (Regression coefficient 51.38 mm3, 95% CI 1.68, 101.08 mm3). Higher riding patella as indicated by increased Insall-Salvati ratio was associated with decreased medial patella cartilage volume (Regression coefficient -3187 mm3, 95% CI -5510, -864 mm3). There was a trend for increased lateral patella cartilage volume associated with increased (shallower) sulcus angle (Regression coefficient 43.27 mm3, 95% CI -2.43, 88.98 mm3). These results suggest both symptomatic and structural benefits associated with a more medially inclined patella while a high-riding patella may be detrimental to patella cartilage. This provides additional theoretical support for the current use of corrective strategies for patella malalignment that are aimed at medial patella translation, although longitudinal studies will be needed to further substantiate this.
Multiple linear regression analysis
NASA Technical Reports Server (NTRS)
Edwards, T. R.
1980-01-01
Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.
NASA Astrophysics Data System (ADS)
Zhou, Yu; Chen, Shi
2016-02-01
In this paper, we investigate the high-frequency cross-correlation relationship between Chinese treasury futures contracts and treasury ETF. We analyze the logarithmic return of these two price series, from which we can conclude that both return series are not normally distributed and the futures markets have greater volatility. We find significant cross-correlation between these two series. We further confirm the relationship using the DCCA coefficient and the DMCA coefficient. We quantify the long-range cross-correlation with DCCA method, and we further show that the relationship is multifractal. An arbitrage algorithm based on DFA regression with stable return is proposed in the last part.
Li, Cun-Yu; Wu, Xin; Gu, Jia-Mei; Li, Hong-Yang; Peng, Guo-Ping
2018-04-01
Based on the molecular sieving and solution-diffusion effect in nanofiltration separation, the correlation between initial concentration and mass transfer coefficient of three typical phenolic acids from Salvia miltiorrhiza was fitted to analyze the relationship among mass transfer coefficient, molecular weight and concentration. The experiment showed a linear relationship between operation pressure and membrane flux. Meanwhile, the membrane flux gradually decayed with the increase of solute concentration. On the basis of the molecular sieving and solution-diffusion effect, the mass transfer coefficient and initial concentration of the three phenolic acids showed a power function relationship, and the regression coefficients were all greater than 0.9. The mass transfer coefficient and molecular weight of the three phenolic acids were negatively correlated with each other, and the order from high to low was protocatechualdehyde > rosmarinic acid > salvianolic acid B. The separation mechanism of nanofiltration for phenolic acids was further clarified through the analysis of the correlation of molecular weight and nanofiltration mass transfer coefficient. The findings provide references for nanofiltration separation, especially for traditional Chinese medicine with phenolic acids. Copyright © by the Chinese Pharmaceutical Association.
Interpreting Multiple Logistic Regression Coefficients in Prospective Observational Studies
1982-11-01
TG HDL-C Males T-C 50-80 MRW p<.05 p<.10 1HDL-C = high density lipoprotein cholesterol MRW...consider a more complete analysis, attempting to uncover the relationship between CHD and TG controlling for covariables such a high density ...for T-C can be reduced, when among older individuals, elevated T-C may increase the capacity to carry cholesterol in the high density lipoprotein
Posa, Mihalj; Pilipović, Ana; Lalić, Mladena; Popović, Jovan
2011-02-15
Linear dependence between temperature (t) and retention coefficient (k, reversed phase HPLC) of bile acids is obtained. Parameters (a, intercept and b, slope) of the linear function k=f(t) highly correlate with bile acids' structures. Investigated bile acids form linear congeneric groups on a principal component (calculated from k=f(t)) score plot that are in accordance with conformations of the hydroxyl and oxo groups in a bile acid steroid skeleton. Partition coefficient (K(p)) of nitrazepam in bile acids' micelles is investigated. Nitrazepam molecules incorporated in micelles show modified bioavailability (depo effect, higher permeability, etc.). Using multiple linear regression method QSAR models of nitrazepams' partition coefficient, K(p) are derived on the temperatures of 25°C and 37°C. For deriving linear regression models on both temperatures experimentally obtained lipophilicity parameters are included (PC1 from data k=f(t)) and in silico descriptors of the shape of a molecule while on the higher temperature molecular polarisation is introduced. This indicates the fact that the incorporation mechanism of nitrazepam in BA micelles changes on the higher temperatures. QSAR models are derived using partial least squares method as well. Experimental parameters k=f(t) are shown to be significant predictive variables. Both QSAR models are validated using cross validation and internal validation method. PLS models have slightly higher predictive capability than MLR models. Copyright © 2010 Elsevier B.V. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ozaki, Toshiro, E-mail: ganronbun@amail.plala.or.jp; Seki, Hiroshi; Shiina, Makoto
2009-09-15
The purpose of the present study was to elucidate a method for predicting the intrahepatic arteriovenous shunt rate from computed tomography (CT) images and biochemical data, instead of from arterial perfusion scintigraphy, because adverse exacerbated systemic effects may be induced in cases where a high shunt rate exists. CT and arterial perfusion scintigraphy were performed in patients with liver metastases from gastric or colorectal cancer. Biochemical data and tumor marker levels of 33 enrolled patients were measured. The results were statistically verified by multiple regression analysis. The total metastatic hepatic tumor volume (V_metastasized), residual hepatic parenchyma volume (V_residual; calculated from CT images), and biochemical data were treated as independent variables; the intrahepatic arteriovenous (IHAV) shunt rate (calculated from scintigraphy) was treated as a dependent variable. The IHAV shunt rate was 15.1 ± 11.9%. Based on the correlation matrixes, the best correlation coefficient of 0.84 was established between the IHAV shunt rate and V_metastasized (p < 0.01). In the multiple regression analysis with the IHAV shunt rate as the dependent variable, the coefficient of determination (R²) was 0.75, which was significant at the 0.1% level with two significant independent variables (V_metastasized and V_residual). The standardized regression coefficients (β) of V_metastasized and V_residual were significant at the 0.1 and 5% levels, respectively. Based on this result, we can obtain a predicted value of IHAV shunt rate (p < 0.001) using CT images. When a high shunt rate was predicted, beneficial and consistent clinical monitoring can be initiated in, for example, hepatic arterial infusion chemotherapy.
NASA Technical Reports Server (NTRS)
Trejo, Leonard J.; Shensa, Mark J.; Remington, Roger W. (Technical Monitor)
1998-01-01
This report describes the development and evaluation of mathematical models for predicting human performance from discrete wavelet transforms (DWT) of event-related potentials (ERP) elicited by task-relevant stimuli. The DWT was compared to principal components analysis (PCA) for representation of ERPs in linear regression and neural network models developed to predict a composite measure of human signal detection performance. Linear regression models based on coefficients of the decimated DWT predicted signal detection performance with half as many free parameters as comparable models based on PCA scores. In addition, the DWT-based models were more resistant to model degradation due to over-fitting than PCA-based models. Feed-forward neural networks were trained using the backpropagation algorithm to predict signal detection performance based on raw ERPs, PCA scores, or high-power coefficients of the DWT. Neural networks based on high-power DWT coefficients trained with fewer iterations, generalized to new data better, and were more resistant to overfitting than networks based on raw ERPs. Networks based on PCA scores did not generalize to new data as well as either the DWT network or the raw ERP network. The results show that wavelet expansions represent the ERP efficiently and extract behaviorally important features for use in linear regression or neural network models of human performance. The efficiency of the DWT is discussed in terms of its decorrelation and energy compaction properties. In addition, the DWT models provided evidence that a pattern of low-frequency activity (1 to 3.5 Hz) occurring at specific times and scalp locations is a reliable correlate of human signal detection performance.
NASA Technical Reports Server (NTRS)
Trejo, L. J.; Shensa, M. J.
1999-01-01
This report describes the development and evaluation of mathematical models for predicting human performance from discrete wavelet transforms (DWT) of event-related potentials (ERP) elicited by task-relevant stimuli. The DWT was compared to principal components analysis (PCA) for representation of ERPs in linear regression and neural network models developed to predict a composite measure of human signal detection performance. Linear regression models based on coefficients of the decimated DWT predicted signal detection performance with half as many free parameters as comparable models based on PCA scores. In addition, the DWT-based models were more resistant to model degradation due to over-fitting than PCA-based models. Feed-forward neural networks were trained using the backpropagation algorithm to predict signal detection performance based on raw ERPs, PCA scores, or high-power coefficients of the DWT. Neural networks based on high-power DWT coefficients trained with fewer iterations, generalized to new data better, and were more resistant to overfitting than networks based on raw ERPs. Networks based on PCA scores did not generalize to new data as well as either the DWT network or the raw ERP network. The results show that wavelet expansions represent the ERP efficiently and extract behaviorally important features for use in linear regression or neural network models of human performance. The efficiency of the DWT is discussed in terms of its decorrelation and energy compaction properties. In addition, the DWT models provided evidence that a pattern of low-frequency activity (1 to 3.5 Hz) occurring at specific times and scalp locations is a reliable correlate of human signal detection performance. Copyright 1999 Academic Press.
Testing for gene-environment interaction under exposure misspecification.
Sun, Ryan; Carroll, Raymond J; Christiani, David C; Lin, Xihong
2017-11-09
Complex interplay between genetic and environmental factors characterizes the etiology of many diseases. Modeling gene-environment (GxE) interactions is often challenged by the unknown functional form of the environment term in the true data-generating mechanism. We study the impact of misspecification of the environmental exposure effect on inference for the GxE interaction term in linear and logistic regression models. We first examine the asymptotic bias of the GxE interaction regression coefficient, allowing for confounders as well as arbitrary misspecification of the exposure and confounder effects. For linear regression, we show that under gene-environment independence and some confounder-dependent conditions, when the environment effect is misspecified, the regression coefficient of the GxE interaction can be unbiased. However, inference on the GxE interaction is still often incorrect. In logistic regression, we show that the regression coefficient is generally biased if the genetic factor is associated with the outcome directly or indirectly. Further, we show that the standard robust sandwich variance estimator for the GxE interaction does not perform well in practical GxE studies, and we provide an alternative testing procedure that has better finite sample properties. © 2017, The International Biometric Society.
ERIC Educational Resources Information Center
Mugrage, Beverly; And Others
Three ridge regression solutions are compared with ordinary least squares regression and with principal components regression using all components. Ridge regression, particularly the Lawless-Wang solution, out-performed ordinary least squares regression and the principal components solution on the criteria of stability of coefficient and closeness…
Bärnighausen, Till; Tanser, Frank; Newell, Marie-Louise
2009-04-01
To understand the dynamics of the HIV epidemic and to plan HIV treatment and prevention programs, it is critical to know how HIV incidence in a population evolves over time. We used data from a large population-based longitudinal HIV surveillance in a rural community in South Africa to test whether HIV incidence in this population has changed in the period from 2003 through 2007. We observed 563 seroconversions in 8095 individuals over 16,256 person-years at risk, yielding an overall HIV incidence of 3.4 per 100 person-years (95% confidence interval 3.1-3.7). We included time-dependent period dummy variables (in half-yearly increments) in age-stratified Cox regressions in order to test for trends in HIV incidence. We first did regression analyses separately for women and men. In both regressions, the coefficients of all period dummy variables were individually insignificant (all p ≥ 0.338) and jointly insignificant (p = 0.764 and p = 0.111, respectively). We then did regression analysis using the pooled data on women and men, controlling for sex and interactions between sex and age. Again, the coefficients of the eight period dummy variables were individually insignificant (all p ≥ 0.387) and jointly insignificant (p = 0.701). We show for the first time that high levels of HIV incidence have been maintained without any sign of decline over the past 5 years in both women and men in a rural South African community with high HIV prevalence. It is unlikely that the HIV epidemic in rural South Africa can be reversed without new or intensified efforts to prevent HIV infection.
Space shuttle propulsion parameter estimation using optional estimation techniques
NASA Technical Reports Server (NTRS)
1983-01-01
A regression analysis of tabular aerodynamic data provided a representative aerodynamic model for coefficient estimation. It also reduced the storage requirements for the "normal" model used to check out the estimation algorithms. The results of the regression analyses are presented. The computer routines for the filter portion of the estimation algorithm and the "bringing-up" of the SRB predictive program on the computer were developed. For the filter program, approximately 54 routines were developed. The routines were highly subsegmented to facilitate overlaying program segments within the partitioned storage space on the computer.
Shrinkage regression-based methods for microarray missing value imputation.
Wang, Hsiuying; Chiu, Chia-Chun; Wu, Yi-Ching; Wu, Wei-Sheng
2013-01-01
Missing values commonly occur in the microarray data, which usually contain more than 5% missing values with up to 90% of genes affected. Inaccurate missing value estimation results in reducing the power of downstream microarray data analyses. Many types of methods have been developed to estimate missing values. Among them, the regression-based methods are very popular and have been shown to perform better than the other types of methods in many testing microarray datasets. To further improve the performances of the regression-based methods, we propose shrinkage regression-based methods. Our methods take the advantage of the correlation structure in the microarray data and select similar genes for the target gene by Pearson correlation coefficients. Besides, our methods incorporate the least squares principle, utilize a shrinkage estimation approach to adjust the coefficients of the regression model, and then use the new coefficients to estimate missing values. Simulation results show that the proposed methods provide more accurate missing value estimation in six testing microarray datasets than the existing regression-based methods do. Imputation of missing values is a very important aspect of microarray data analyses because most of the downstream analyses require a complete dataset. Therefore, exploring accurate and efficient methods for estimating missing values has become an essential issue. Since our proposed shrinkage regression-based methods can provide accurate missing value estimation, they are competitive alternatives to the existing regression-based methods.
Davies, Simon J.C.; Mulsant, Benoit H.; Flint, Alastair J.; Rothschild, Anthony J.; Whyte, Ellen M.; Meyers, Barnett S.
2014-01-01
Background There are conflicting results on the impact of anxiety on depression outcomes. The impact of anxiety has not been studied in major depression with psychotic features (“psychotic depression”). Aims We assessed the impact of specific anxiety symptoms and disorders on the outcomes of psychotic depression. Methods We analyzed data from the Study of Pharmacotherapy for Psychotic Depression that randomized 259 younger and older participants to either olanzapine plus placebo or olanzapine plus sertraline. We assessed the impact of specific anxiety symptoms from the Brief Psychiatric Rating Scale (“tension”, “anxiety” and “somatic concerns” and a composite anxiety score) and diagnoses (panic disorder and GAD) on psychotic depression outcomes using linear or logistic regression. Age, gender, education and benzodiazepine use (at baseline and end) were included as covariates. Results Anxiety symptoms at baseline and anxiety disorder diagnoses differentially impacted outcomes. On adjusted linear regression there was an association between improvement in depressive symptoms and both baseline “tension” (coefficient = 0.784; 95% CI: 0.169–1.400; p = 0.013) and the composite anxiety score (regression coefficient = 0.348; 95% CI: 0.064–0.632; p = 0.017). There was an interaction between “tension” and treatment group, with better responses in those randomized to combination treatment if they had high baseline anxiety scores (coefficient = 1.309; 95% CI: 0.105–2.514; p = 0.033). In contrast, panic disorder was associated with worse clinical outcomes (coefficient = −3.858; 95% CI: –7.281 to −0.434; p = 0.027) regardless of treatment. Conclusions Our results suggest that analysis of the impact of anxiety on depression outcome needs to differentiate psychic and somatic symptoms. PMID:24656524
The microcomputer scientific software series 2: general linear model--regression.
Harold M. Rauscher
1983-01-01
The general linear model regression (GLMR) program provides the microcomputer user with a sophisticated regression analysis capability. The output provides a regression ANOVA table, estimators of the regression model coefficients, their confidence intervals, confidence intervals around the predicted Y-values, residuals for plotting, a check for multicollinearity, a...
The observation-based relationships between PM2.5 and AOD over China
NASA Astrophysics Data System (ADS)
Xin, Jinyuan; Gong, Chongshui; Liu, Zirui; Cong, Zhiyuan; Gao, Wenkang; Song, Tao; Pan, Yuepeng; Sun, Yang; Ji, Dongsheng; Wang, Lili; Tang, Guiqian; Wang, Yuesi
2016-09-01
This is the first investigation of the generalized linear regressions of PM2.5 and aerosol optical depth (AOD) with the Campaign on atmospheric Aerosol Research-China network over the large high-concentration aerosol region during the period from 2012 to 2013. The map of the PM2.5 and AOD levels showed large spatial differences in the aerosol concentrations and aerosol optical properties over China. The ranges of the annual mean PM2.5 and AOD were 10-117 µg/m3 and 0.12-1.11 from the clean regions to seriously polluted regions, from the almost "arctic" and the Tibetan Plateau to tropical environments. There were significant spatial agreements and correlations between the PM2.5 and AOD. However, the linear regression functions (PM2.5 = A*AOD + B) exhibited large differences in different regions and seasons. The slopes (A) were from 13 to 90, the intercepts (B) were from 0.8 to 33.3, and the correlation coefficients (R2) ranged from 0.06 to 0.75. The slopes (A) were much higher in the north (41-99) than in the south (13-64) because the extinction efficiency of hygroscopic aerosol was rapidly increasing with the increasing humidity from the dry north to the humid south. Meanwhile, the intercepts (B) were generally lower, and the correlation coefficients (R2) were much higher in the dry north than in the humid south. There was high consistency of AOD versus PM2.5 for all sites in three ranges of the atmospheric column precipitable water vapor (PWV). The segmented linear regression functions were y = 84.66x + 9.85 (PWV < 1.0), y = 69.47x + 11.87 (1.0 < PWV < 2.5), and y = 52.37x + 8.59 (PWV > 2.5). The correlation coefficients (R2) were high from 0.64 to 0.70 across China.
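The segmented fits quoted in the abstract above can be applied directly. The following is a minimal Python sketch, not the authors' code: the function name `predict_pm25` is hypothetical, and the handling of the exact boundary values (PWV equal to 1.0 or 2.5), which the abstract leaves undefined, is an assumption.

```python
def predict_pm25(aod: float, pwv: float) -> float:
    """Estimate PM2.5 from AOD with the PWV-segmented linear fits
    reported in the abstract (y = slope * x + intercept)."""
    if pwv < 1.0:
        slope, intercept = 84.66, 9.85
    elif pwv <= 2.5:
        # Assumption: boundary values 1.0 and 2.5 fall in the middle segment;
        # the abstract only gives strict inequalities.
        slope, intercept = 69.47, 11.87
    else:
        slope, intercept = 52.37, 8.59
    return slope * aod + intercept


# Same AOD, different water vapor ranges give different PM2.5 estimates.
dry = predict_pm25(0.5, 0.5)
humid = predict_pm25(0.5, 3.0)
```

For the same AOD, the estimate is higher in the driest range than in the most humid one, which matches the abstract's observation that the slope decreases as precipitable water vapor increases.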
Innovating patient care delivery: DSRIP's interrupted time series analysis paradigm.
Shenoy, Amrita G; Begley, Charles E; Revere, Lee; Linder, Stephen H; Daiger, Stephen P
2017-12-08
Adoption of Medicaid Section 1115 waiver is one of the many ways of innovating healthcare delivery system. The Delivery System Reform Incentive Payment (DSRIP) pool, one of the two funding pools of the waiver has four categories viz. infrastructure development, program innovation and redesign, quality improvement reporting and lastly, bringing about population health improvement. A metric of the fourth category, preventable hospitalization (PH) rate was analyzed in the context of eight conditions for two time periods, pre-reporting years (2010-2012) and post-reporting years (2013-2015) for two hospital cohorts, DSRIP participating and non-participating hospitals. The study explains how DSRIP impacted Preventable Hospitalization (PH) rates of eight conditions for both hospital cohorts within two time periods. Eight PH rates were regressed as the dependent variable with time, intervention and post-DSRIP Intervention as independent variables. PH rates of eight conditions were then consolidated into one rate for regressing with the above independent variables to evaluate overall impact of DSRIP. An interrupted time series regression was performed after accounting for auto-correlation, stationarity and seasonality in the dataset. In the individual regression model, PH rates showed statistically significant coefficients for seven out of eight conditions in DSRIP participating hospitals. In the combined regression model, the coefficient of the PH rate showed a statistically significant decrease with negative p-values for regression coefficients in DSRIP participating hospitals compared to positive/increased p-values for regression coefficients in DSRIP non-participating hospitals. Several macro- and micro-level factors may have likely contributed DSRIP hospitals outperforming DSRIP non-participating hospitals. 
Healthcare organization/provider collaboration, support from healthcare professionals, DSRIP's design, state reimbursement and coordination in care delivery methods may have led to the likely success of DSRIP. IV, a retrospective cohort study based on longitudinal data. Copyright © 2017 Elsevier Inc. All rights reserved.
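The regression described in this abstract (an outcome regressed on time, an intervention indicator, and a post-intervention term) is commonly written as a segmented regression. The following is a minimal sketch of that design under stated assumptions: the data are synthetic and noise-free, the names `its_design`, `ols`, and `solve` are hypothetical helpers, and none of the adjustments the authors mention (auto-correlation, stationarity, seasonality) are modelled here.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def its_design(n_periods, start):
    """Design rows: intercept, time, intervention indicator, time since intervention."""
    rows = []
    for t in range(n_periods):
        post = 1.0 if t >= start else 0.0
        rows.append([1.0, float(t), post, post * (t - start)])
    return rows


# Synthetic example (assumed values, not from the study): 12 periods, intervention at period 6.
X = its_design(12, 6)
true_coef = [50.0, 0.5, -4.0, -1.0]  # baseline level, pre-trend, level change, slope change
y = [sum(b * x for b, x in zip(true_coef, row)) for row in X]
coef = ols(X, y)
```

In this parameterisation the third coefficient is the immediate level change at the intervention and the fourth is the change in slope afterwards; a real analysis would add error-structure corrections before interpreting either.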
[Developing Perceived Competence Scale (PCS) for Adolescents].
Özer, Arif; Gençtanirim Kurt, Dilek; Kizildağ, Seval; Demırtaş Zorbaz, Selen; Arici Şahın, Fatma; Acar, Tülin; Ergene, Tuncay
2016-01-01
In this study, the Perceived Competence Scale was developed to measure high school students' perceived competence. The scale development process was verified on three different samples. The participants were high school students in Ankara during the 2011-2012 academic year. The numbers of participants in the exploratory factor analysis, confirmatory factor analysis and test-retest reliability samples were 372, 668 and 75, respectively. Internal consistency coefficients (Cronbach's and stratified α) were calculated separately for each group. The Factor 8.02 and LISREL 8.70 packages were used for data analysis. According to the results, internal consistency coefficients (α) were .90 - .93 for academic competence and .82 - .86 for social competence in the samples on which exploratory and confirmatory factor analyses were performed. For the whole scale, the internal consistency coefficient (stratified α) was .91. In the test-retest reliability analysis, adjusted correlation coefficients (r) were .94 for social competence and .90 for academic competence. In addition to the fit indexes and regression weights obtained from factor analysis, findings on convergent and discriminant validity indicate that competence can be addressed in two dimensions: academic (16 items) and social (14 items).
NASA Astrophysics Data System (ADS)
Cambra-López, María; Winkel, Albert; Mosquera, Julio; Ogink, Nico W. M.; Aarnink, André J. A.
2015-06-01
The objective of this study was to compare co-located real-time light scattering devices and equivalent gravimetric samplers in poultry and pig houses for PM10 mass concentration, and to develop animal-specific calibration factors for light scattering samplers. These results will help evaluate the comparability of different sampling instruments for PM10 concentrations. Paired DustTrak light scattering device (DustTrak aerosol monitor, TSI, U.S.) and PM10 gravimetric cyclone sampler were used for measuring PM10 mass concentrations during 24 h periods (from noon to noon) inside animal houses. Sampling was conducted in 32 animal houses in the Netherlands, including broilers, broiler breeders, layers in floor and in aviary system, turkeys, piglets, growing-finishing pigs in traditional and low emission housing with dry and liquid feed, and sows in individual and group housing. A total of 119 pairs of 24 h measurements (55 for poultry and 64 for pigs) were recorded and analyzed using linear regression analysis. Deviations between samplers were calculated and discussed. In poultry, cyclone sampler and DustTrak data fitted well to a linear regression, with a regression coefficient equal to 0.41, an intercept of 0.16 mg m⁻³ and a correlation coefficient of 0.91 (excluding turkeys). Results in turkeys showed a regression coefficient equal to 1.1 (P = 0.49), an intercept of 0.06 mg m⁻³ (P < 0.0001) and a correlation coefficient of 0.98. In pigs, we found a regression coefficient equal to 0.61, an intercept of 0.05 mg m⁻³ and a correlation coefficient of 0.84. Measured PM10 concentrations using DustTraks were clearly underestimated (by a factor of approximately 2) in both poultry and pig housing systems compared with cyclone pre-separators. Absolute, relative, and random deviations increased with concentration. DustTrak light scattering devices should be self-calibrated to investigate PM10 mass concentrations accurately in animal houses.
We recommend linear regression equations as animal-specific calibration factors for DustTraks instead of manufacturer calibration factors, especially in heavily dusty environments such as animal houses.
Poor methodological quality and reporting standards of systematic reviews in burn care management.
Wasiak, Jason; Tyack, Zephanie; Ware, Robert; Goodwin, Nicholas; Faggion, Clovis M
2017-10-01
The methodological and reporting quality of burn-specific systematic reviews has not been established. The aim of this study was to evaluate the methodological quality of systematic reviews in burn care management. Computerised searches were performed in Ovid MEDLINE, Ovid EMBASE and The Cochrane Library through to February 2016 for systematic reviews relevant to burn care using medical subject and free-text terms such as 'burn', 'systematic review' or 'meta-analysis'. Additional studies were identified by hand-searching five discipline-specific journals. Two authors independently screened papers, extracted and evaluated methodological quality using the 11-item A Measurement Tool to Assess Systematic Reviews (AMSTAR) tool and reporting quality using the 27-item Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) checklist. Characteristics of systematic reviews associated with methodological and reporting quality were identified. Descriptive statistics and linear regression identified features associated with improved methodological quality. A total of 60 systematic reviews met the inclusion criteria. Six of the 11 AMSTAR items reporting on 'a priori' design, duplicate study selection, grey literature, included/excluded studies, publication bias and conflict of interest were reported in less than 50% of the systematic reviews. Of the 27 items listed for PRISMA, 13 items reporting on introduction, methods, results and the discussion were addressed in less than 50% of systematic reviews. 
Multivariable analyses showed that systematic reviews associated with higher methodological or reporting quality incorporated a meta-analysis (AMSTAR regression coefficient 2.1; 95% CI: 1.1, 3.1; PRISMA regression coefficient 6.3; 95% CI: 3.8, 8.7), were published in the Cochrane Library (AMSTAR regression coefficient 2.9; 95% CI: 1.6, 4.2; PRISMA regression coefficient 6.1; 95% CI: 3.1, 9.2) and included a randomised control trial (AMSTAR regression coefficient 1.4; 95% CI: 0.4, 2.4; PRISMA regression coefficient 3.4; 95% CI: 0.9, 5.8). The methodological and reporting quality of systematic reviews in burn care requires further improvement with stricter adherence by authors to the PRISMA checklist and AMSTAR tool. © 2016 Medicalhelplines.com Inc and John Wiley & Sons Ltd.
Standardization of domestic frying processes by an engineering approach.
Franke, K; Strijowski, U
2011-05-01
An approach was developed to enable a better standardization of domestic frying of potato products. For this purpose, 5 domestic fryers differing in heating power and oil capacity were used. A very defined frying process using a highly standardized model product and a broad range of frying conditions was carried out in these fryers and the development of browning representing an important quality parameter was measured. Product-to-oil ratio, oil temperature, and frying time were varied. Quite different color changes were measured in the different fryers although the same frying process parameters were applied. The specific energy consumption for water evaporation (spECWE) during frying related to product amount was determined for all frying processes to define an engineering parameter for characterizing the frying process. A quasi-linear regression approach was applied to calculate this parameter from frying process settings and fryer properties. The high significance of the regression coefficients and a coefficient of determination close to unity confirmed the suitability of this approach. Based on this regression equation, curves for standard frying conditions (SFC curves) were calculated which describe the frying conditions required to obtain the same level of spECWE in the different domestic fryers. Comparison of browning results from the different fryers operated at conditions near the SFC curves confirmed the applicability of the approach. © 2011 Institute of Food Technologists®
Ito, Yukiko; Hattori, Reiko; Mase, Hiroki; Watanabe, Masako; Shiotani, Itaru
2008-12-01
Pollen information is indispensable for allergic individuals and clinicians. This study aimed to develop forecasting models for the total annual count of airborne pollen grains based on data monitored over the last 20 years at the Mie Chuo Medical Center, Tsu, Mie, Japan. Airborne pollen grains were collected using a Durham sampler. Total annual pollen count and pollen count from October to December (OD pollen count) of the previous year were transformed to logarithms. Regression analysis of the total pollen count was performed using variables such as the OD pollen count and the maximum temperature for mid-July of the previous year. Time series analysis revealed an alternate rhythm of the series of total pollen count. The alternate rhythm consisted of a cyclic alternation of an "on" year (high pollen count) and an "off" year (low pollen count). This rhythm was used as a dummy variable in regression equations. Of the three models involving the OD pollen count, a multiple regression equation that included the alternate rhythm variable and the interaction of this rhythm with OD pollen count showed a high coefficient of determination (0.844). Of the three models involving the maximum temperature for mid-July, those including the alternate rhythm variable and the interaction of this rhythm with maximum temperature had the highest coefficient of determination (0.925). An alternate pollen dispersal rhythm represented by a dummy variable in the multiple regression analysis plays a key role in improving forecasting models for the total annual sugi pollen count.
Solid harmonic wavelet scattering for predictions of molecule properties
NASA Astrophysics Data System (ADS)
Eickenberg, Michael; Exarchakis, Georgios; Hirn, Matthew; Mallat, Stéphane; Thiry, Louis
2018-06-01
We present a machine learning algorithm for the prediction of molecule properties inspired by ideas from density functional theory (DFT). Using Gaussian-type orbital functions, we create surrogate electronic densities of the molecule from which we compute invariant "solid harmonic scattering coefficients" that account for different types of interactions at different scales. Multilinear regressions of various physical properties of molecules are computed from these invariant coefficients. Numerical experiments show that these regressions have near state-of-the-art performance, even with relatively few training examples. Predictions over small sets of scattering coefficients can reach a DFT precision while being interpretable.
Michienzi, Alissa; Kron, Tomas; Callahan, Jason; Plumridge, Nikki; Ball, David; Everitt, Sarah
2017-04-01
Cone-beam computed tomography (CBCT) is a valuable image-guidance tool in radiation therapy (RT). This study was initiated to assess the accuracy of CBCT for quantifying non-small cell lung cancer (NSCLC) tumour volumes compared to the anatomical 'gold standard', CT. Tumour regression or progression on CBCT was also analysed. Patients with Stage I-III NSCLC, prescribed 60 Gy in 30 fractions RT with concurrent platinum-based chemotherapy, routine CBCT and enrolled in a prospective study of serial PET/CT (baseline, weeks two and four) were eligible. Time-matched CBCT and CT gross tumour volumes (GTVs) were manually delineated by a single observer on MIM software, and were analysed descriptively and using Pearson's correlation coefficient (r) and linear regression (R²). Of 94 CT/CBCT pairs, 30 patients were eligible for inclusion. The mean (±SD) CT GTV vs CBCT GTV on the four time-matched pairs were 95 (±182) vs 98.8 (±160.3), 73.6 (±132.4) vs 70.7 (±96.6), 54.7 (±92.9) vs 61.0 (±98.8) and 61.3 (±53.3) vs 62.1 (±47.9) respectively. Pearson's correlation coefficient (r) was 0.98 (95% CI 0.97-0.99, p < 0.001). The mean (±SD) CT/CBCT Dice's similarity coefficient was 0.66 (±0.16). Of 289 CBCT scans, tumours in 27 (90%) patients regressed by a mean (±SD) rate of 1.5% (±0.75) per fraction. The mean (±SD) GTV regression was 43.1% (±23.1) from the first to final CBCT. Primary lung tumour volumes observed on CBCT and time-matched CT are highly correlated (although not identical), thereby validating observations of GTV regression on CBCT in NSCLC. © 2016 The Royal Australian and New Zealand College of Radiologists.
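The Dice similarity coefficient reported above measures the overlap between two delineated volumes as twice the size of their intersection divided by the sum of their sizes. A minimal sketch follows, assuming each volume is represented as a set of voxel indices; the function name `dice` and the example sets are illustrative and are not taken from the study.

```python
def dice(volume_a, volume_b):
    """Dice similarity coefficient: 2 |A intersect B| / (|A| + |B|) for two sets of voxel indices."""
    total = len(volume_a) + len(volume_b)
    if total == 0:
        return 1.0  # convention chosen here: two empty volumes are treated as identical
    return 2.0 * len(volume_a & volume_b) / total


# Hypothetical delineations sharing two of four voxels each.
ct_gtv = {1, 2, 3, 4}
cbct_gtv = {3, 4, 5, 6}
overlap = dice(ct_gtv, cbct_gtv)  # 0.5
```

A value of 1 means the two delineations coincide exactly and 0 means they do not overlap, which is why two volumes can be highly correlated in size yet have a Dice coefficient well below 1.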
Optimized biogas-fermentation by neural network control.
Holubar, P; Zani, L; Hager, M; Fröschl, W; Radak, Z; Braun, R
2003-01-01
In this work several feed-forward back-propagation neural networks (FFBP) were trained in order to model, and subsequently control, methane production in anaerobic digesters. To produce data for the training of the neural nets, four anaerobic continuous stirred tank reactors (CSTR) were operated in steady-state conditions at organic loading rates (Br) of about 2 kg·m⁻³·d⁻¹ chemical oxygen demand (COD), and disturbed by pulse-like increases of the organic loading rate. For the pulses, additional carbon sources were added to the basic feed (surplus and primary sludge) to simulate cofermentation and to increase the COD. Measured parameters were: gas composition, methane production rate, volatile fatty acid concentration, pH, redox potential, volatile suspended solids and COD of feed and effluent. A hierarchical system of neural nets was developed and embedded in a Decision Support System (DSS). A 3-3-1 FFBP simulated the pH with a regression coefficient of 0.82. A 9-3-3 FFBP simulated the volatile fatty acid concentration in the sludge with a regression coefficient of 0.86. A 9-3-2 FFBP simulated the gas production and gas composition with regression coefficients of 0.90 and 0.80, respectively. A lab-scale anaerobic CSTR controlled by this tool was able to maintain a methane concentration of about 60% at a rather high gas production rate of between 5 and 5.6 m³·m⁻³·d⁻¹.
Psychomotor development index in children younger than 6 years from Argentine provinces.
Lejarraga, Horacio; Kelmansky, Diana M; Masautis, Alicia; Nunes, Fernando
2018-04-01
To obtain a psychomotor development index (PDI) for each Argentine province. Using a national, probabilistic, and stratified sample of 13 323 male and female children younger than 6 years selected for the National Survey on Nutrition and Health (Encuesta Nacional de Nutrición y Salud, ENNyS 2004), we estimated the PDI per province based on compliance with 10 developmental milestones. The median age at attainment (median age) of each milestone was estimated by fitting a logistic regression. The PDI was estimated as 100 * (1 + b), where "b" is the regression coefficient of y = a + b x, where "y" is the median age as per the national reference (x) minus the median age at attainment of a milestone. The theoretical value expected for the PDI was 100. The PDI per province ranged between 72.1 and 106.4. Most provinces showed a negative regression coefficient, which indicated a progressive increase of the delay in the age at attainment of milestones. The correlation coefficient between the PDI per province and infant mortality in 2005 was extremely high: -0.85, suggesting that both indicators share similar biological and social determinants. The correlation was negative because the higher the mortality, the lower the PDI. We now have a positive health indicator available in Argentina: the psychomotor development index, which is a low-cost, easy to collect, and reliable tool that may be used in national health statistics. Sociedad Argentina de Pediatría.
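The index definition above can be worked through numerically. The sketch below assumes the reading given in the abstract: x is the national reference median age for each milestone, y is that reference minus the provincial median age, b is the least-squares slope of y on x, and PDI = 100 * (1 + b). The function names and the ages used are hypothetical and serve only to illustrate the arithmetic.

```python
def slope(x, y):
    """Least-squares slope b of the line y = a + b x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def pdi(reference_ages, province_ages):
    """PDI = 100 * (1 + b), with y = reference median age minus provincial median age."""
    differences = [r - p for r, p in zip(reference_ages, province_ages)]
    return 100.0 * (1.0 + slope(reference_ages, differences))


# Hypothetical median ages (years) for five milestones.
reference = [0.5, 1.0, 2.0, 3.0, 4.5]
on_schedule = list(reference)                # attains milestones at the reference ages
delayed = [1.1 * age for age in reference]   # attains every milestone 10% later

pdi_on_schedule = pdi(reference, on_schedule)  # 100.0: no delay, slope is zero
pdi_delayed = pdi(reference, delayed)          # about 90: delay grows with age, slope is -0.1
```

Under these assumptions a delay that widens with age gives a negative slope and hence a PDI below 100, which matches the interpretation of the negative regression coefficients in the abstract.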
Retrieval Algorithm for Broadband Albedo at the Top of the Atmosphere
NASA Astrophysics Data System (ADS)
Lee, Sang-Ho; Lee, Kyu-Tae; Kim, Bu-Yo; Zo, Il-Sung; Jung, Hyun-Seok; Rim, Se-Hun
2018-05-01
The objective of this study is to develop an algorithm that retrieves the broadband albedo at the top of the atmosphere (TOA albedo) for radiation budget and climate analysis of Earth's atmosphere using Geostationary Korea Multi-Purpose Satellite/Advanced Meteorological Imager (GK-2A/AMI) data. Because the GK-2A satellite will launch in 2018, we used data from the Japanese weather satellite Himawari-8 and its onboard sensor, the Advanced Himawari Imager (AHI), which has similar sensor properties and observation area to those of GK-2A. TOA albedo was retrieved based on reflectance and regression coefficients of shortwave channels 1 to 6 of AHI. The regression coefficients were calculated using the results of the radiative transfer model (SBDART) and ridge regression. The SBDART simulations provided the correlation between TOA albedo and the reflectance of each channel for each atmospheric condition (solar zenith angle, viewing zenith angle, relative azimuth angle, surface type, and absence/presence of clouds). The TOA albedo from Himawari-8/AHI was compared to that from the National Aeronautics and Space Administration (NASA) satellite Terra with its onboard sensor Clouds and the Earth's Radiant Energy System (CERES). The correlation coefficients between the two datasets from the week containing the first day of every month between 1st August 2015 and 1st July 2016 were high, ranging between 0.934 and 0.955, with the root mean square error in the 0.053-0.068 range.
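Ridge regression, which this abstract uses to obtain the channel coefficients, adds a penalty k to the diagonal of X'X before solving for the coefficients, which stabilises estimates when predictors are strongly correlated. The sketch below is a two-predictor illustration only: the function name `ridge_2d`, the penalty values, and the data are assumptions, the intercept is omitted for brevity, and nothing here reproduces the six-channel fit or the SBDART simulations of the study.

```python
def ridge_2d(x1, x2, y, k):
    """Ridge coefficients (no intercept) for two predictors: solve (X'X + k I) b = X'y."""
    s11 = sum(a * a for a in x1) + k
    s22 = sum(b * b for b in x2) + k
    s12 = sum(a * b for a, b in zip(x1, x2))
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    # Cramer's rule for the 2x2 system.
    return ((t1 * s22 - t2 * s12) / det, (t2 * s11 - t1 * s12) / det)


# Hypothetical, nearly collinear predictors (e.g. two correlated channel reflectances).
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.1, 1.9, 3.2, 3.9]
y = [2.0, 4.1, 6.1, 8.0]

b_ols = ridge_2d(x1, x2, y, 0.0)    # k = 0 reduces to ordinary least squares
b_ridge = ridge_2d(x1, x2, y, 1.0)  # k > 0 shrinks the coefficient vector


def squared_norm(b):
    return b[0] ** 2 + b[1] ** 2
```

The shrinkage is the design trade-off: the ridge estimate is biased, but its coefficient vector is never longer than the least-squares one, so it varies less from sample to sample when the predictors are close to collinear.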
ERIC Educational Resources Information Center
Aypay, Ahmet
2010-01-01
The purpose of this study is to examine the ICT usage and academic achievement of Turkish students in PISA 2006 data. The sample of the study included 4942 students from 160 schools. Frequencies, independent samples t-tests, ANOVAs, Pearson correlation coefficients, exploratory factor analysis, and regression analysis were used. A high percentage…
Williams-Sether, Tara; Gross, Tara A.
2016-02-09
Seasonal mean daily flow data from 119 U.S. Geological Survey streamflow-gaging stations in North Dakota; the surrounding states of Montana, Minnesota, and South Dakota; and the Canadian provinces of Manitoba and Saskatchewan with 10 or more years of unregulated flow record were used to develop regression equations for flow duration, n-day high flow and n-day low flow using ordinary least-squares and Tobit regression techniques. Regression equations were developed for seasonal flow durations at the 10th, 25th, 50th, 75th, and 90th percent exceedances; the 1-, 7-, and 30-day seasonal mean high flows for the 10-, 25-, and 50-year recurrence intervals; and the 1-, 7-, and 30-day seasonal mean low flows for the 2-, 5-, and 10-year recurrence intervals. Basin and climatic characteristics determined to be significant explanatory variables in one or more regression equations included drainage area, percentage of basin drainage area that drains to isolated lakes and ponds, ruggedness number, stream length, basin compactness ratio, minimum basin elevation, precipitation, slope ratio, stream slope, and soil permeability. The adjusted coefficient of determination for the n-day high-flow regression equations ranged from 55.87 to 94.53 percent. The Chi2 values for the duration regression equations ranged from 13.49 to 117.94, whereas the Chi2 values for the n-day low-flow regression equations ranged from 4.20 to 49.68.
Galloway, Joel M.
2014-01-01
The Red River of the North (hereafter referred to as “Red River”) Basin is an important hydrologic region where water is a valuable resource for the region’s economy. Continuous water-quality monitors have been operated by the U.S. Geological Survey, in cooperation with the North Dakota Department of Health, Minnesota Pollution Control Agency, City of Fargo, City of Moorhead, City of Grand Forks, and City of East Grand Forks at the Red River at Fargo, North Dakota, from 2003 through 2012 and at Grand Forks, N.Dak., from 2007 through 2012. The purpose of the monitoring was to provide a better understanding of the water-quality dynamics of the Red River and provide a way to track changes in water quality. Regression equations were developed that can be used to estimate concentrations and loads for dissolved solids, sulfate, chloride, nitrate plus nitrite, total phosphorus, and suspended sediment using explanatory variables such as streamflow, specific conductance, and turbidity. Specific conductance was determined to be a significant explanatory variable for estimating dissolved solids concentrations at the Red River at Fargo and Grand Forks. The regression equations provided good relations between dissolved solid concentrations and specific conductance for the Red River at Fargo and at Grand Forks, with adjusted coefficients of determination of 0.99 and 0.98, respectively. Specific conductance, log-transformed streamflow, and a seasonal component were statistically significant explanatory variables for estimating sulfate in the Red River at Fargo and Grand Forks. Regression equations provided good relations between sulfate concentrations and the explanatory variables, with adjusted coefficients of determination of 0.94 and 0.89, respectively. For the Red River at Fargo and Grand Forks, specific conductance, streamflow, and a seasonal component were statistically significant explanatory variables for estimating chloride. 
For the Red River at Grand Forks, a time component also was a statistically significant explanatory variable for estimating chloride. The regression equations for chloride at the Red River at Fargo provided a fair relation between chloride concentrations and the explanatory variables, with an adjusted coefficient of determination of 0.66 and the equation for the Red River at Grand Forks provided a relatively good relation between chloride concentrations and the explanatory variables, with an adjusted coefficient of determination of 0.77. Turbidity and streamflow were statistically significant explanatory variables for estimating nitrate plus nitrite concentrations at the Red River at Fargo and turbidity was the only statistically significant explanatory variable for estimating nitrate plus nitrite concentrations at Grand Forks. The regression equation for the Red River at Fargo provided a relatively poor relation between nitrate plus nitrite concentrations, turbidity, and streamflow, with an adjusted coefficient of determination of 0.46. The regression equation for the Red River at Grand Forks provided a fair relation between nitrate plus nitrite concentrations and turbidity, with an adjusted coefficient of determination of 0.73. Some of the variability that was not explained by the equations might be attributed to different sources contributing nitrates to the stream at different times. Turbidity, streamflow, and a seasonal component were statistically significant explanatory variables for estimating total phosphorus at the Red River at Fargo and Grand Forks. The regression equation for the Red River at Fargo provided a relatively fair relation between total phosphorus concentrations, turbidity, streamflow, and season, with an adjusted coefficient of determination of 0.74. 
The regression equation for the Red River at Grand Forks provided a good relation between total phosphorus concentrations, turbidity, streamflow, and season, with an adjusted coefficient of determination of 0.87. For the Red River at Fargo, turbidity and streamflow were statistically significant explanatory variables for estimating suspended-sediment concentrations. For the Red River at Grand Forks, turbidity was the only statistically significant explanatory variable for estimating suspended-sediment concentration. The regression equation at the Red River at Fargo provided a good relation between suspended-sediment concentration, turbidity, and streamflow, with an adjusted coefficient of determination of 0.95. The regression equation for the Red River at Grand Forks provided a good relation between suspended-sediment concentration and turbidity, with an adjusted coefficient of determination of 0.96.
Morikawa, Go; Suzuka, Chihiro; Shoji, Atsushi; Shibusawa, Yoichi; Yanagida, Akio
2016-01-05
A high-throughput method for determining the octanol/water partition coefficient (P(o/w)) of a large variety of compounds exhibiting a wide range in hydrophobicity was established. The method combines a simple shake-flask method with a novel two-phase solvent system comprising an acetonitrile-phosphate buffer (0.1 M, pH 7.4)-1-octanol (25:25:4, v/v/v; AN system). The AN system partition coefficients (K(AN)) of 51 standard compounds for which log P(o/w) (at pH 7.4; log D) values had been reported were determined by single two-phase partitioning in test tubes, followed by measurement of the solute concentration in both phases using an automatic flow injection-ultraviolet detection system. The log K(AN) values were closely related to reported log D values, and the relationship could be expressed by the following linear regression equation: log D = 2.8630 log K(AN) - 0.1497 (n = 51). The relationship reveals that log D values (+8 to -8) for a large variety of highly hydrophobic and/or hydrophilic compounds can be estimated indirectly from the narrow range of log K(AN) values (+3 to -3) determined using the present method. Furthermore, log K(AN) values for highly polar compounds for which no log D values have been reported, such as amino acids, peptides, proteins, nucleosides, and nucleotides, can be estimated using the present method. The wide-ranging log D values (+5.9 to -7.5) of these molecules were estimated for the first time from their log K(AN) values and the above regression equation. Copyright © 2015 Elsevier B.V. All rights reserved.
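The regression equation quoted in this abstract can be applied directly to convert a measured log K(AN) into an estimated log D. The sketch below simply encodes that equation; the function name and the input values are illustrative, and the constants are the slope and intercept reported above.

```python
SLOPE = 2.8630       # reported regression slope
INTERCEPT = -0.1497  # reported regression intercept


def log_d_from_log_k_an(log_k_an):
    """Estimate log D (pH 7.4) from log K(AN) with the reported linear regression."""
    return SLOPE * log_k_an + INTERCEPT


# Illustrative inputs spanning the stated log K(AN) range of about -3 to +3.
estimates = {value: log_d_from_log_k_an(value) for value in (-3.0, 0.0, 1.0, 3.0)}
```

Because the slope is close to 3, the roughly six-unit span of log K(AN) maps onto a span of about seventeen log D units, consistent with the +8 to -8 range mentioned in the abstract.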
NASA Astrophysics Data System (ADS)
Yoshida, Kenichiro; Nishidate, Izumi; Ojima, Nobutoshi; Iwata, Kayoko
2014-01-01
To quantitatively evaluate skin chromophores over a wide region of curved skin surface, we propose an approach that suppresses the effect of the shading-derived error in the reflectance on the estimation of chromophore concentrations, without sacrificing the accuracy of that estimation. In our method, we use multiple regression analysis, assuming the absorbance spectrum as the response variable and the extinction coefficients of melanin, oxygenated hemoglobin, and deoxygenated hemoglobin as the predictor variables. The concentrations of melanin and total hemoglobin are determined from the multiple regression coefficients using compensation formulae (CF) based on the diffuse reflectance spectra derived from a Monte Carlo simulation. To suppress the shading-derived error, we investigated three different combinations of multiple regression coefficients for the CF. In vivo measurements with the forearm skin demonstrated that the proposed approach can reduce the estimation errors that are due to shading-derived errors in the reflectance. With the best combination of multiple regression coefficients, we estimated that the ratio of the error to the chromophore concentrations is about 10%. The proposed method does not require any measurements or assumptions about the shape of the subjects; this is an advantage over other studies related to the reduction of shading-derived errors.
Correlation and simple linear regression.
Zou, Kelly H; Tuncali, Kemal; Silverman, Stuart G
2003-06-01
In this tutorial article, the concepts of correlation and regression are reviewed and demonstrated. The authors review and compare two correlation coefficients, the Pearson correlation coefficient and the Spearman rho, for measuring linear and nonlinear relationships between two continuous variables. In the case of measuring the linear relationship between a predictor and an outcome variable, simple linear regression analysis is conducted. These statistical concepts are illustrated by using a data set from published literature to assess a computed tomography-guided interventional technique. These statistical methods are important for exploring the relationships between variables and can be applied to many radiologic studies.
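The contrast between the two coefficients reviewed in this tutorial can be illustrated with a monotonic but nonlinear relationship. The sketch below is a plain-Python illustration, not the authors' worked example: the data are made up, and the rank helper assumes there are no tied values (tied data would need average ranks).

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Ranks 1..n in ascending order; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = float(rank)
    return result


def spearman(x, y):
    """Spearman rho computed as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up data: y increases monotonically with x, but not linearly (y = x cubed).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [v ** 3 for v in x]

r = pearson(x, y)
rho = spearman(x, y)
```

Here the Spearman coefficient is 1 because the ordering of the two variables agrees perfectly, while the Pearson coefficient is below 1 because the points do not lie on a straight line.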
Exact Analysis of Squared Cross-Validity Coefficient in Predictive Regression Models
ERIC Educational Resources Information Center
Shieh, Gwowen
2009-01-01
In regression analysis, the notion of population validity is of theoretical interest for describing the usefulness of the underlying regression model, whereas the presumably more important concept of population cross-validity represents the predictive effectiveness for the regression equation in future research. It appears that the inference…
NASA Astrophysics Data System (ADS)
Wheeler, David C.; Waller, Lance A.
2009-03-01
In this paper, we compare and contrast a Bayesian spatially varying coefficient process (SVCP) model with a geographically weighted regression (GWR) model for the estimation of the potentially spatially varying regression effects of alcohol outlets and illegal drug activity on violent crime in Houston, Texas. In addition, we focus on the inherent coefficient shrinkage properties of the Bayesian SVCP model as a way to address increased coefficient variance that follows from collinearity in GWR models. We outline the advantages of the Bayesian model in terms of reducing inflated coefficient variance, enhanced model flexibility, and more formal measuring of model uncertainty for prediction. We find spatially varying effects for alcohol outlets and drug violations, but the amount of variation depends on the type of model used. For the Bayesian model, this variation is controllable through the amount of prior influence placed on the variance of the coefficients. For example, the spatial pattern of coefficients is similar for the GWR and Bayesian models when a relatively large prior variance is used in the Bayesian model.
Sharma, Ashok K.; Srivastava, Gopal N.; Roy, Ankita; Sharma, Vineet K.
2017-01-01
The experimental methods for the prediction of molecular toxicity are tedious and time-consuming tasks. Thus, the computational approaches could be used to develop alternative methods for toxicity prediction. We have developed a tool for the prediction of molecular toxicity along with the aqueous solubility and permeability of any molecule/metabolite. Using a comprehensive and curated set of toxin molecules as a training set, the different chemical and structural based features such as descriptors and fingerprints were exploited for feature selection, optimization and development of machine learning based classification and regression models. The compositional differences in the distribution of atoms were apparent between toxins and non-toxins, and hence, the molecular features were used for the classification and regression. On 10-fold cross-validation, the descriptor-based, fingerprint-based and hybrid-based classification models showed similar accuracy (93%) and Matthews's correlation coefficient (0.84). The performances of all the three models were comparable (Matthews's correlation coefficient = 0.84–0.87) on the blind dataset. In addition, the regression-based models using descriptors as input features were also compared and evaluated on the blind dataset. Random forest based regression model for the prediction of solubility performed better (R2 = 0.84) than the multi-linear regression (MLR) and partial least square regression (PLSR) models, whereas, the partial least squares based regression model for the prediction of permeability (caco-2) performed better (R2 = 0.68) in comparison to the random forest and MLR based regression models. The performance of final classification and regression models was evaluated using the two validation datasets including the known toxins and commonly used constituents of health products, which attests to its accuracy. 
The ToxiM web server would be a highly useful and reliable tool for the prediction of toxicity, solubility, and permeability of small molecules. PMID:29249969
NASA Astrophysics Data System (ADS)
Wilson, Barry T.; Knight, Joseph F.; McRoberts, Ronald E.
2018-03-01
Imagery from the Landsat Program has been used frequently as a source of auxiliary data for modeling land cover, as well as a variety of attributes associated with tree cover. With ready access to all scenes in the archive since 2008 due to the USGS Landsat Data Policy, new approaches to deriving such auxiliary data from dense Landsat time series are required. Several methods have previously been developed for use with finer temporal resolution imagery (e.g. AVHRR and MODIS), including image compositing and harmonic regression using Fourier series. The manuscript presents a study, using Minnesota, USA during the years 2009-2013 as the study area and timeframe. The study examined the relative predictive power of land cover models, in particular those related to tree cover, using predictor variables based solely on composite imagery versus those using estimated harmonic regression coefficients. The study used two common non-parametric modeling approaches (i.e. k-nearest neighbors and random forests) for fitting classification and regression models of multiple attributes measured on USFS Forest Inventory and Analysis plots using all available Landsat imagery for the study area and timeframe. The estimated Fourier coefficients developed by harmonic regression of tasseled cap transformation time series data were shown to be correlated with land cover, including tree cover. Regression models using estimated Fourier coefficients as predictor variables showed a two- to threefold increase in explained variance for a small set of continuous response variables, relative to comparable models using monthly image composites. Similarly, the overall accuracies of classification models using the estimated Fourier coefficients were approximately 10-20 percentage points higher than the models using the image composites, with corresponding individual class accuracies between six and 45 percentage points higher.
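The harmonic regression idea in the abstract above can be illustrated with a minimal sketch. It assumes a single annual harmonic, a 365.25-day period, and ordinary least squares solved through the normal equations; none of these choices is stated in the abstract, and the function names are hypothetical. The fitted coefficients (a0, a1, b1) are the kind of estimated Fourier coefficients that could then serve as predictor variables.

```python
import math

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_harmonic(days, values, period=365.25):
    """Least-squares fit of value = a0 + a1*cos(w*t) + b1*sin(w*t).

    `period` is an assumed annual cycle; observation days may be irregular.
    """
    w = 2.0 * math.pi / period
    rows = [[1.0, math.cos(w * t), math.sin(w * t)] for t in days]
    k = 3
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, values)) for i in range(k)]
    return solve_linear_system(xtx, xty)
```

Because the fit does not require evenly spaced observations, it suits an irregular acquisition schedule; a real time series would likely need more harmonics and cloud screening, which this sketch omits.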
Wang, W; Ma, C Y; Chen, W; Ma, H Y; Zhang, H; Meng, Y Y; Ni, Y; Ma, L B
2016-08-19
Determining correlations between certain traits of economic importance constitutes an essential component of selective activities. In this study, our aim was to provide effective indicators for breeding programs of Lateolabrax maculatus, an important aquaculture species in China. We analyzed correlations between 20 morphometric traits and body weight, using correlation and path analyses. The results indicated that the correlations among all 21 traits were highly significant, with the highest correlation coefficient identified between total length and body weight. The path analysis indicated that total length ($X_{1}$), body width ($X_{5}$), distance from first dorsal fin origin to anal fin origin ($X_{10}$), snout length ($X_{16}$), eye diameter ($X_{17}$), eye cross ($X_{18}$), and slanting distance from snout tip to first dorsal fin origin ($X_{19}$) significantly affected body weight ($Y$) directly. The following multiple-regression equation was obtained using stepwise multiple-regression analysis: $Y = -472.108 + 1.065X_{1} + 7.728X_{5} + 1.973X_{10} - 7.024X_{16} - 4.400X_{17} - 3.338X_{18} + 2.138X_{19}$, with an adjusted multiple-correlation coefficient of 0.947. Body width had the largest determinant coefficient, as well as the highest positive direct correlation with body weight. At the same time, high indirect effects with six other morphometric traits on L. maculatus body weight, through body width, were identified. Hence, body width could be a key factor that efficiently indicates significant effects on body weight in L. maculatus.
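As a small illustration of how the reported stepwise equation would be applied, the sketch below evaluates it for one set of trait measurements. The coefficients are copied from the abstract; the function name, the dictionary keys and any input values are hypothetical, and the abstract does not state the measurement units.

```python
# Coefficients copied from the regression equation reported in the abstract.
INTERCEPT = -472.108
COEFFS = {
    "X1": 1.065,    # total length
    "X5": 7.728,    # body width
    "X10": 1.973,   # first dorsal fin origin to anal fin origin
    "X16": -7.024,  # snout length
    "X17": -4.400,  # eye diameter
    "X18": -3.338,  # eye cross
    "X19": 2.138,   # slanting distance, snout tip to first dorsal fin origin
}

def predict_body_weight(traits):
    """Return Y for a dict mapping 'X1', 'X5', ... to measured values.

    Units are whatever the original study used; they are not given in the
    abstract, so inputs here are purely illustrative.
    """
    return INTERCEPT + sum(COEFFS[name] * traits[name] for name in COEFFS)
```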
SPSS and SAS programs for comparing Pearson correlations and OLS regression coefficients.
Weaver, Bruce; Wuensch, Karl L
2013-09-01
Several procedures that use summary data to test hypotheses about Pearson correlations and ordinary least squares regression coefficients have been described in various books and articles. To our knowledge, however, no single resource describes all of the most common tests. Furthermore, many of these tests have not yet been implemented in popular statistical software packages such as SPSS and SAS. In this article, we describe all of the most common tests and provide SPSS and SAS programs to perform them. When they are applicable, our code also computes 100 × (1 - α)% confidence intervals corresponding to the tests. For testing hypotheses about independent regression coefficients, we demonstrate one method that uses summary data and another that uses raw data (i.e., Potthoff analysis). When the raw data are available, the latter method is preferred, because use of summary data entails some loss of precision due to rounding.
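One standard test of this kind compares two Pearson correlations from independent samples using only summary data (the two correlations and the two sample sizes), via Fisher's r-to-z transformation. The sketch below is in Python rather than SPSS or SAS and is not the authors' code; the function name is hypothetical.

```python
import math

def compare_independent_correlations(r1, n1, r2, n2):
    """Fisher z test of H0: rho1 == rho2 for two independent samples.

    Returns the z statistic and the two-sided p-value.
    """
    z1 = math.atanh(r1)  # Fisher r-to-z transform
    z2 = math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p
```

Working from rounded summary values costs some precision, which is consistent with the abstract's preference for raw-data methods when raw data are available.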
NASA Astrophysics Data System (ADS)
Zhai, Mengting; Chen, Yan; Li, Jing; Zhou, Jun
2017-12-01
The molecular electronegativity distance vector (MEDV-13) was used to describe the molecular structure of benzyl ether diamidine derivatives. Based on MEDV-13, a three-parameter (M3, M15, M47) QSAR model of insecticidal activity (pIC50) for 60 benzyl ether diamidine derivatives was constructed by leaps-and-bounds regression (LBR). The traditional correlation coefficient (R) and the cross-validation correlation coefficient (R_CV) were 0.975 and 0.971, respectively. The robustness of the regression model was validated by the jackknife method; the correlation coefficients R were between 0.971 and 0.983. Meanwhile, the independent variables in the model were tested and showed no autocorrelation. The regression results indicate that the model has good robustness and predictive capability. The research could provide theoretical guidance for the development of a new generation of anti-African-trypanosomiasis drugs with high efficiency and low toxicity.
Zheng, Qi; Peng, Limin
2016-01-01
Quantile regression provides a flexible platform for evaluating covariate effects on different segments of the conditional distribution of response. As the effects of covariates may change with quantile level, contemporaneously examining a spectrum of quantiles is expected to have a better capacity to identify variables with either partial or full effects on the response distribution, as compared to focusing on a single quantile. Under this motivation, we study a general adaptively weighted LASSO penalization strategy in the quantile regression setting, where a continuum of quantile index is considered and coefficients are allowed to vary with quantile index. We establish the oracle properties of the resulting estimator of coefficient function. Furthermore, we formally investigate a BIC-type uniform tuning parameter selector and show that it can ensure consistent model selection. Our numerical studies confirm the theoretical findings and illustrate an application of the new variable selection procedure. PMID:28008212
Overcoming multicollinearity in multiple regression using correlation coefficient
NASA Astrophysics Data System (ADS)
Zainodin, H. J.; Yap, S. J.
2013-09-01
Multicollinearity happens when there are high correlations among independent variables. In this case, it would be difficult to distinguish between the contributions of these independent variables to that of the dependent variable as they may compete to explain much of the similar variance. Besides, the problem of multicollinearity also violates the assumption of multiple regression: that there is no collinearity among the possible independent variables. Thus, an alternative approach is introduced in overcoming the multicollinearity problem in achieving a well represented model eventually. This approach is accomplished by removing the multicollinearity source variables on the basis of the correlation coefficient values based on full correlation matrix. Using the full correlation matrix can facilitate the implementation of Excel function in removing the multicollinearity source variables. It is found that this procedure is easier and time-saving especially when dealing with greater number of independent variables in a model and a large number of all possible models. Hence, in this paper detailed insight of the procedure is shown, compared and implemented.
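A correlation-based removal step of the sort described can be sketched as follows. The abstract does not give the exact rule, so the greedy order (keep a variable only if it is not too strongly correlated with any variable already kept) and the 0.95 threshold are assumptions, and the function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def drop_collinear(columns, threshold=0.95):
    """Return the names of variables kept after a greedy correlation filter.

    `columns` maps variable name -> list of values. A variable is kept only
    if |r| with every previously kept variable is below `threshold`
    (an assumed cut-off, not one taken from the abstract).
    """
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept
```

The result depends on the order in which variables are visited, so in practice the order would need a justification, for example by each variable's correlation with the dependent variable.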
Bootstrap evaluation of a young Douglas-fir height growth model for the Pacific Northwest
Nicholas R. Vaughn; Eric C. Turnblom; Martin W. Ritchie
2010-01-01
We evaluated the stability of a complex regression model developed to predict the annual height growth of young Douglas-fir. This model is highly nonlinear and is fit in an iterative manner for annual growth coefficients from data with multiple periodic remeasurement intervals. The traditional methods for such a sensitivity analysis either involve laborious math or...
Yoneoka, Daisuke; Henmi, Masayuki
2017-06-01
Recently, the number of regression models has dramatically increased in several academic fields. However, within the context of meta-analysis, synthesis methods for such models have not been developed in a commensurate trend. One of the difficulties hindering the development is the disparity in sets of covariates among literature models. If the sets of covariates differ across models, interpretation of coefficients will differ, thereby making it difficult to synthesize them. Moreover, previous synthesis methods for regression models, such as multivariate meta-analysis, often have problems because the covariance matrix of coefficients (i.e. within-study correlations) or individual patient data are not necessarily available. This study, therefore, briefly explains a method to synthesize linear regression models under different covariate sets by using a generalized least squares method involving bias correction terms. In particular, we also propose an approach to recover (at most) three correlations of covariates, which is required for the calculation of the bias term without individual patient data. Copyright © 2016 John Wiley & Sons, Ltd.
Population heterogeneity in the salience of multiple risk factors for adolescent delinquency.
Lanza, Stephanie T; Cooper, Brittany R; Bray, Bethany C
2014-03-01
To present mixture regression analysis as an alternative to more standard regression analysis for predicting adolescent delinquency. We demonstrate how mixture regression analysis allows for the identification of population subgroups defined by the salience of multiple risk factors. We identified population subgroups (i.e., latent classes) of individuals based on their coefficients in a regression model predicting adolescent delinquency from eight previously established risk indices drawn from the community, school, family, peer, and individual levels. The study included N = 37,763 10th-grade adolescents who participated in the Communities That Care Youth Survey. Standard, zero-inflated, and mixture Poisson and negative binomial regression models were considered. Standard and mixture negative binomial regression models were selected as optimal. The five-class regression model was interpreted based on the class-specific regression coefficients, indicating that risk factors had varying salience across classes of adolescents. Standard regression showed that all risk factors were significantly associated with delinquency. Mixture regression provided more nuanced information, suggesting a unique set of risk factors that were salient for different subgroups of adolescents. Implications for the design of subgroup-specific interventions are discussed. Copyright © 2014 Society for Adolescent Health and Medicine. Published by Elsevier Inc. All rights reserved.
Retro-regression--another important multivariate regression improvement.
Randić, M
2001-01-01
We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when in a stepwise regression a descriptor is included or excluded from a regression. The consequence is an unpredictable change of the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. This process typically causes at different steps of the stepwise regression a replacement of several previously used descriptors by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined showing how it resolves the ambiguities associated with both "nightmares" of the first and the second kind of MRA.
[From clinical judgment to linear regression model].
Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O
2013-01-01
When we think about mathematical models, such as the linear regression model, we think that these terms are only used by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and has a normal distribution. Stated in another way, the regression is used to predict a measure based on the knowledge of at least one other variable. Linear regression has as its first objective to determine the slope or inclination of the regression line: Y = a + bx, where "a" is the intercept or regression constant, equivalent to the value of "Y" when "x" equals 0, and "b" (also called the slope) indicates the increase or decrease in "Y" that occurs when the variable "x" increases or decreases by one unit. In the regression line, "b" is called the regression coefficient. The coefficient of determination (R²) indicates the importance of independent variables in the outcome.
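The quantities named in this abstract (intercept a, slope b, and the coefficient of determination) can be computed with the usual least-squares formulas. The sketch below is a minimal illustration for a single predictor; the function name and the example data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for Y = a + b*x; returns (a, b, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx            # slope: change in Y per one-unit change in x
    a = mean_y - b * mean_x  # intercept: value of Y when x equals 0
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return a, b, r_squared
```

For example, fitting x = [0, 1, 2, 3] against y = [2, 5, 8, 11] recovers a = 2 and b = 3, since those points lie exactly on Y = 2 + 3x.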
Interpretation of commonly used statistical regression models.
Kasza, Jessica; Wolfe, Rory
2014-01-01
A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.
A SEMIPARAMETRIC BAYESIAN MODEL FOR CIRCULAR-LINEAR REGRESSION
We present a Bayesian approach to regress a circular variable on a linear predictor. The regression coefficients are assumed to have a nonparametric distribution with a Dirichlet process prior. The semiparametric Bayesian approach gives added flexibility to the model and is usefu...
Yao, Xin; Niu, Yandong; Li, Youzhi; Zou, Dongsheng; Ding, Xiaohui; Bian, Hualin
2018-05-09
Bioaccumulation of five heavy metals (Cd, Cu, Mn, Pb, and Zn) in six plant organs (panicle, leaf, stem, root, rhizome, and bud) of the emergent and perennial plant species, Miscanthus sacchariflorus, was investigated to estimate the plant's potential for accumulating heavy metals in the wetlands of Dongting Lake. We found the highest Cd concentrations in the panicles and leaves; while the highest Cu and Mn were observed in the roots, the highest Pb in the panicles, and the highest Zn in the panicles and buds. In contrast, the lowest Cd concentrations were detected in the stem, roots, and buds; the lowest Cu concentrations in the leaves and stems; the lowest Mn concentrations in the panicles, rhizomes, and buds; the lowest Pb concentrations in the stems; and the lowest Zn concentrations in the leaves, stems, and rhizomes. Mean Cu concentration in the plant showed a positive regression coefficient with plot elevation, soil organic matter content, and soil Cu concentration, whereas it showed a negative regression coefficient with soil moisture and electrolyte leakage. Mean Mn concentration showed positive and negative regression coefficients with soil organic matter and soil moisture, respectively. Mean Pb concentration exhibited a positive regression coefficient with plot elevation and soil total P concentration, and Zn concentration showed a positive regression coefficient with soil available P and total P concentrations. However, there was no significant regression coefficient between mean Cd concentration in the plant and the investigated environmental parameters. Stems and roots were the main organs involved in heavy metal accumulation from the environment. The mean quantities of heavy metals accumulated in the plant tissues were 2.2 mg Cd, 86.7 mg Cu, 290.3 mg Mn, 15.9 mg Pb, and 307 mg Zn per square meter.
In the Dongting Lake wetlands, 0.7 × 10³ kg Cd, 22.9 × 10³ kg Cu, 77.5 × 10³ kg Mn, 3.1 × 10³ kg Pb, and 95.9 × 10³ kg Zn per year were accumulated by aboveground organs and removed from the lake through harvesting for paper manufacture.
NASA Astrophysics Data System (ADS)
Hammud, Hassan H.; Ghannoum, Amer; Masoud, Mamdouh S.
2006-02-01
Sixteen Schiff bases obtained from the condensation of benzaldehyde or salicylaldehyde with various amines (aniline, 4-carboxyaniline, phenylhydrazine, 2,4-dinitrophenylhydrazine, ethylenediamine, hydrazine, o-phenylenediamine and 2,6-pyridinediamine) are studied with UV-vis spectroscopy to observe the effect of solvents, substituents and other structural factors on the spectra. The bands involving different electronic transitions are interpreted. Computerized analysis and multiple regression techniques were applied to calculate the regression and correlation coefficients based on the equation that relates peak position λmax to the solvent parameters that depend on the H-bonding ability, refractive index and dielectric constant of solvents.
ERIC Educational Resources Information Center
Coskuntuncel, Orkun
2013-01-01
The purpose of this study is two-fold; the first aim being to show the effect of outliers on the widely used least squares regression estimator in social sciences. The second aim is to compare the classical method of least squares with the robust M-estimator using the "determination of coefficient" (R²). For this purpose,…
Expression profiles of loneliness-associated genes for survival prediction in cancer patients.
You, Liang-Fu; Yeh, Jia-Rong; Su, Mu-Chun
2014-01-01
Influence of loneliness on human survival has been established epidemiologically, but genomic research remains undeveloped. We identified 34 loneliness-associated genes which were statistically significant for high-lonely and low-lonely individuals. With the univariate Cox proportional hazards regression model, we obtained corresponding regression coefficients for loneliness-associated genes for individual cancer patients. Furthermore, risk scores could be generated with the combination of gene expression level multiplied by corresponding regression coefficients of loneliness-associated genes. We verified that high-risk score cancer patients had shorter mean survival time than their low-risk score counterparts. Then we validated the loneliness-associated gene signature in three independent brain cancer cohorts with Kaplan-Meier survival curves (n=77, 85 and 191), significantly separable by log-rank test with hazard ratios (HR) >1 and p-values <0.0001 (HR=2.94, 3.82, and 1.78). Moreover, we validated the loneliness-associated gene signature in bone cancer (HR=5.10, p-value=4.69e-3), lung cancer (HR=2.86, p-value=4.71e-5), ovarian cancer (HR=1.97, p-value=3.11e-5), and leukemia (HR=2.06, p-value=1.79e-4) cohorts. The last lymphoma cohort proved to have an HR=3.50, p-value=1.15e-7. Loneliness-associated genes had good survival prediction for cancer patients, especially bone cancer patients. Our study provided the first indication that expression of loneliness-associated genes is related to survival time of cancer patients.
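The risk score described above, gene expression multiplied by the corresponding Cox regression coefficient and combined across genes, can be sketched as a weighted sum. Reading "combination" as a plain sum is an assumption here, and the function name, gene names and numbers are hypothetical placeholders rather than values from the study.

```python
def risk_score(expression, coefficients):
    """Sum of expression level times Cox regression coefficient per gene.

    Both arguments map gene name -> value; only genes that have a
    coefficient contribute to the score.
    """
    return sum(coefficients[g] * expression[g] for g in coefficients)

# Hypothetical example: two made-up genes and made-up coefficients.
example_coefficients = {"GENE_A": 0.5, "GENE_B": -1.0}
example_patient = {"GENE_A": 2.0, "GENE_B": 1.0}
example_score = risk_score(example_patient, example_coefficients)
```

Patients would then be split into high-risk and low-risk groups by some cut-off on this score; the abstract does not say which cut-off was used.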
Generic Feature Selection with Short Fat Data
Clarke, B.; Chu, J.-H.
2014-01-01
SUMMARY Consider a regression problem in which there are many more explanatory variables than data points, i.e., p ≫ n. Essentially, without reducing the number of variables inference is impossible. So, we group the p explanatory variables into blocks by clustering, evaluate statistics on the blocks and then regress the response on these statistics under a penalized error criterion to obtain estimates of the regression coefficients. We examine the performance of this approach for a variety of choices of n, p, classes of statistics, clustering algorithms, penalty terms, and data types. When n is not large, the discrimination over number of statistics is weak, but computations suggest regressing on approximately [n/K] statistics where K is the number of blocks formed by a clustering algorithm. Small deviations from this are observed when the blocks of variables are of very different sizes. Larger deviations are observed when the penalty term is an Lq norm with high enough q. PMID:25346546
Quantitative prediction of ionization effect on human skin permeability.
Baba, Hiromi; Ueno, Yusuke; Hashida, Mitsuru; Yamashita, Fumiyoshi
2017-04-30
Although skin permeability of an active ingredient can be severely affected by its ionization in a dose solution, most of the existing prediction models cannot predict such impacts. To provide reliable predictors, we curated a novel large dataset of in vitro human skin permeability coefficients for 322 entries comprising chemically diverse permeants whose ionization fractions can be calculated. Subsequently, we generated thousands of computational descriptors, including LogD (octanol-water distribution coefficient at a specific pH), and analyzed the dataset using nonlinear support vector regression (SVR) and Gaussian process regression (GPR) combined with greedy descriptor selection. The SVR model was slightly superior to the GPR model, with externally validated squared correlation coefficient, root mean square error, and mean absolute error values of 0.94, 0.29, and 0.21, respectively. These models indicate that Log D is effective for a comprehensive prediction of ionization effects on skin permeability. In addition, the proposed models satisfied the statistical criteria endorsed in recent model validation studies. These models can evaluate virtually generated compounds at any pH; therefore, they can be used for high-throughput evaluations of numerous active ingredients and optimization of their skin permeability with respect to permeant ionization. Copyright © 2017 Elsevier B.V. All rights reserved.
Multicollinearity and Regression Analysis
NASA Astrophysics Data System (ADS)
Daoud, Jamal I.
2017-12-01
In regression analysis, correlation between the response and the predictor(s) is expected, but correlation among predictors is undesirable. The number of predictors included in the regression model depends on many factors, among them historical data, experience, etc. In the end, the selection of the most important predictors is left to the researcher's judgment. Multicollinearity is a phenomenon in which two or more predictors are correlated; if this happens, the standard errors of the coefficients will increase [8]. Increased standard errors mean that the coefficients for some or all independent variables may be found not to be significantly different from 0. In other words, by overinflating the standard errors, multicollinearity makes some variables statistically insignificant when they should be significant. In this paper we focus on multicollinearity, its causes, and its consequences for the reliability of the regression model.
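A common way to quantify the standard-error inflation described above is the variance inflation factor (VIF). The abstract does not mention VIF, so it is introduced here as an assumed illustration: with exactly two predictors, VIF = 1 / (1 - r²), where r is their Pearson correlation, and each coefficient's standard error grows by a factor of sqrt(VIF). Function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """Variance inflation factor for a model with exactly two predictors."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r * r)

def standard_error_inflation(x1, x2):
    """Factor by which collinearity multiplies each coefficient's SE."""
    return math.sqrt(vif_two_predictors(x1, x2))
```

With uncorrelated predictors the VIF is 1 and nothing is inflated; as r approaches 1 the VIF grows without bound, which is why a genuinely relevant predictor can appear non-significant.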
QSAR modeling of flotation collectors using principal components extracted from topological indices.
Natarajan, R; Nirdosh, Inderjit; Basak, Subhash C; Mills, Denise R
2002-01-01
Several topological indices were calculated for substituted-cupferrons that were tested as collectors for the froth flotation of uranium. The principal component analysis (PCA) was used for data reduction. Seven principal components (PC) were found to account for 98.6% of the variance among the computed indices. The principal components thus extracted were used in stepwise regression analyses to construct regression models for the prediction of separation efficiencies (Es) of the collectors. A two-parameter model with a correlation coefficient of 0.889 and a three-parameter model with a correlation coefficient of 0.913 were formed. PCs were found to be better than partition coefficient to form regression equations, and inclusion of an electronic parameter such as Hammett sigma or quantum mechanically derived electronic charges on the chelating atoms did not improve the correlation coefficient significantly. The method was extended to model the separation efficiencies of mercaptobenzothiazoles (MBT) and aminothiophenols (ATP) used in the flotation of lead and zinc ores, respectively. Five principal components were found to explain 99% of the data variability in each series. A three-parameter equation with correlation coefficient of 0.985 and a two-parameter equation with correlation coefficient of 0.926 were obtained for MBT and ATP, respectively. The amenability of separation efficiencies of chelating collectors to QSAR modeling using PCs based on topological indices might lead to the selection of collectors for synthesis and testing from a virtual database.
Parametric regression model for survival data: Weibull regression model as an example
2016-01-01
The Weibull regression model is one of the most popular forms of parametric regression model because it provides estimates of the baseline hazard function as well as coefficients for covariates. Because of technical difficulties, the Weibull regression model is seldom used in the medical literature as compared to the semi-parametric proportional hazard model. To make clinical investigators familiar with the Weibull regression model, this article introduces some basic knowledge on the model and then illustrates how to fit it with R software. The SurvRegCensCov package is useful in converting estimated coefficients to clinically relevant statistics such as hazard ratio (HR) and event time ratio (ETR). Model adequacy can be assessed by inspecting Kaplan-Meier curves stratified by categorical variable. The eha package provides an alternative method to fit the Weibull regression model. The check.dist() function helps to assess goodness-of-fit of the model. Variable selection is based on the importance of a covariate, which can be tested using the anova() function. Alternatively, backward elimination starting from a full model is an efficient way for model development. Visualization of the Weibull regression model after model development is worthwhile because it provides another way to report your findings. PMID:28149846
[Effects of carbon components of fine particulate matter (PM2.5) on atherogenic index of plasma].
Fan, Jiao; Qin, Xiaolei; Xue, Xiaodan; Han, Bin; Bai, Zhipeng; Tang, Naijun; Zhang, Liwen
2014-01-01
To evaluate associations between carbon constituents of fine particulate matter (PM2.5) and atherogenic index of plasma (AIP). We selected subjects from two communities by systematic sampling, and 112 people aged over 60 years without cardiovascular disease were recruited. The levels of cholesterol (TC), triglycerides (TG), high-density lipoprotein cholesterol (HDL-C), and low-density lipoprotein cholesterol (LDL-C) of the subjects, and personal exposure to PM2.5, were measured in December 2011. Total carbon (TC), organic carbon (OC) and elemental carbon (EC) of PM2.5 were detected and AIP was calculated according to its definition. The value of AIP among the 112 subjects was 0.05 ± 0.26. Personal exposure concentrations of PM2.5 and its carbon components (TC, OC and EC) were (164.75 ± 110.67), (53.86 ± 29.65), (44.93 ± 26.37) and (9.49 ± 5.75) µg/m³, respectively. The Pearson analysis showed linear relationships between TC, OC, EC and AIP, all significant positive correlations. The correlation coefficients were TC (r = 0.307, P < 0.05), OC (r = 0.287, P < 0.05) and EC (r = 0.252, P < 0.05), respectively. The multiple logistic regression analysis showed that when the AIP risk categories were selected as the dependent variable and the low risk group as the reference group, the regression coefficients of TC, OC and EC were 1.03 (95% CI: 1.01-1.05), 1.03 (95% CI: 1.01-1.05) and 1.12 (95% CI: 1.02-1.22), respectively, in the high risk group, while the regression coefficient and OR in the middle risk group were not statistically significant. There were stable associations between the carbon constituents (TC, OC and EC) of fine particulate matter (PM2.5) and AIP. The findings suggested that carbon components of PM2.5 should be considered as risk factors for atherogenesis.
Bell, Michelle L.; de Sousa Zanotti Stagliorio Coelho, Micheline; Leon Guo, Yue-Liang; Guo, Yuming; Goodman, Patrick; Hashizume, Masahiro; Honda, Yasushi; Kim, Ho; Lavigne, Eric; Michelozzi, Paola; Hilario Nascimento Saldiva, Paulo; Schwartz, Joel; Scortichini, Matteo; Sera, Francesco; Tobias, Aurelio; Tong, Shilu; Wu, Chang-fu; Zanobetti, Antonella; Zeka, Ariana; Gasparrini, Antonio
2017-01-01
Background: In many places, daily mortality has been shown to increase after days with particularly high or low temperatures, but such daily time-series studies cannot identify whether such increases reflect substantial life shortening or short-term displacement of deaths (harvesting). Objectives: To clarify this issue, we estimated the association between annual mortality and annual summaries of heat and cold in 278 locations from 12 countries. Methods: Indices of annual heat and cold were used as predictors in regressions of annual mortality in each location, allowing for trends over time and clustering of annual count anomalies by country and pooling estimates using meta-regression. We used two indices of annual heat and cold based on preliminary standard daily analyses: a) mean annual degrees above/below minimum mortality temperature (MMT), and b) estimated fractions of deaths attributed to heat and cold. The first index was simpler and matched previous related research; the second was added because it allowed the interpretation that coefficients equal to 0 and 1 are consistent with none (0) or all (1) of the deaths attributable in daily analyses being displaced by at least 1 y. Results: On average, regression coefficients of annual mortality on heat and cold mean degrees were 1.7% [95% confidence interval (CI): 0.3, 3.1] and 1.1% (95% CI: 0.6, 1.6) per degree, respectively, and daily attributable fractions were 0.8 (95% CI: 0.2, 1.3) and 1.1 (95% CI: 0.9, 1.4). The proximity of the latter coefficients to 1.0 provides evidence that most deaths found attributable to heat and cold in daily analyses were brought forward by at least 1 y. Estimates were broadly robust to alternative model assumptions. Conclusions: These results provide strong evidence that most deaths associated in daily analyses with heat and cold are displaced by at least 1 y. https://doi.org/10.1289/EHP1756 PMID:29084393
Standardized Regression Coefficients as Indices of Effect Sizes in Meta-Analysis
ERIC Educational Resources Information Center
Kim, Rae Seon
2011-01-01
When conducting a meta-analysis, it is common to find many collected studies that report regression analyses, because multiple regression analysis is widely used in many fields. Meta-analysis uses effect sizes drawn from individual studies as a means of synthesizing a collection of results. However, indices of effect size from regression analyses…
NASA Astrophysics Data System (ADS)
Kiss, I.; Cioată, V. G.; Alexa, V.; Raţiu, S. A.
2017-05-01
The braking system is one of the most important and complex subsystems of railway vehicles, especially when it comes to safety. Therefore, installing efficient and safe brakes on modern railway vehicles is essential. Nowadays, attention is devoted to solving problems connected with the use of high-performance brake materials and their impact on the thermal and mechanical loading of railway wheels. The main factor that influences the selection of a friction material for railway applications is the performance criterion, because the interaction between the brake block and the wheel produces complex thermo-mechanical phenomena. In this work, the investigated subjects are the cast-iron brake shoes, which are still widely used on freight wagons. Therefore, the cast-iron brake shoes - with lamellar graphite and with a high content of phosphorus (0.8-1.1%) - need a special investigation. In order to establish the optimal condition for the cast-iron brake shoes, we proposed a mathematical modelling study using statistical analysis and multiple regression equations. Multivariate research is important in areas of cast-iron brake shoe manufacturing, because many variables interact with each other simultaneously. Multivariate visualization comes to the fore when researchers have difficulties in comprehending many dimensions at one time. Technological data (hardness and chemical composition) obtained from cast-iron brake shoes were used for this purpose. In order to establish the multiple correlation between the hardness of the cast-iron brake shoes and the elements of their chemical composition, several types of regression equation models have been proposed. 
Because a three-dimensional surface with variables on three axes is a common way to illustrate multivariate data, in which the maximum and minimum values are easily highlighted, we plotted graphical representations of the regression equations in order to explain the interaction of the variables and locate the optimal level of each variable for maximal response. Matlab software was used to calculate the regression coefficients, dispersion and correlation coefficients.
Soil salt content estimation in the Yellow River delta with satellite hyperspectral data
Weng, Yongling; Gong, Peng; Zhu, Zhi-Liang
2008-01-01
Soil salinization is one of the most common land degradation processes and is a severe environmental hazard. The primary objective of this study is to investigate the potential of predicting salt content in soils with hyperspectral data acquired with EO-1 Hyperion. Both partial least-squares regression (PLSR) and conventional multiple linear regression (MLR), such as stepwise regression (SWR), were tested as the prediction model. PLSR is commonly used to overcome the problem caused by high-dimensional and correlated predictors. Chemical analysis of 95 samples collected from the top layer of soils in the Yellow River delta area shows that salt content was high on average, and the dominant chemicals in the saline soil were NaCl and MgCl2. Multivariate models were established between soil contents and hyperspectral data. Our results indicate that the PLSR technique with laboratory spectral data has a strong prediction capacity. Spectral bands at 1487-1527, 1971-1991, 2032-2092, and 2163-2355 nm possessed large absolute values of regression coefficients, with the largest coefficient at 2203 nm. We obtained a root mean squared error (RMSE) for calibration (with 61 samples) of RMSEC = 0.753 (R2 = 0.893) and a root mean squared error for validation (with 30 samples) of RMSEV = 0.574. The prediction model was applied on a pixel-by-pixel basis to a Hyperion reflectance image to yield a quantitative surface distribution map of soil salt content. The result was validated successfully from 38 sampling points. We obtained an RMSE estimate of 1.037 (R2 = 0.784) for the soil salt content map derived by the PLSR model. The salinity map derived from the SWR model shows that the predicted value is higher than the true value. These results demonstrate that the PLSR method is a more suitable technique than stepwise regression for quantitative estimation of soil salt content in a large area. © 2008 CASI.
The solar wind effect on cosmic rays and solar activity
NASA Technical Reports Server (NTRS)
Fujimoto, K.; Kojima, H.; Murakami, K.
1985-01-01
The relation of cosmic ray intensity to solar wind velocity is investigated, using neutron monitor data from Kiel and Deep River. The analysis shows that the regression coefficient of the average intensity for a time interval to the corresponding average velocity is negative and that the absolute effect increases monotonically with the interval of averaging, tau, that is, from -0.5% per 100 km/s for tau = 1 day to -1.1% per 100 km/s for tau = 27 days. For tau > 27 days the coefficient becomes almost constant independently of the value of tau. The analysis also shows that this tau-dependence of the regression coefficient varies with the solar activity.
Wang, Shuang; Jiang, Xiaoqian; Wu, Yuan; Cui, Lijuan; Cheng, Samuel; Ohno-Machado, Lucila
2013-01-01
We developed an EXpectation Propagation LOgistic REgRession (EXPLORER) model for distributed privacy-preserving online learning. The proposed framework provides a high level guarantee for protecting sensitive information, since the information exchanged between the server and the client is the encrypted posterior distribution of coefficients. Through experimental results, EXPLORER shows the same performance (e.g., discrimination, calibration, feature selection etc.) as the traditional frequentist Logistic Regression model, but provides more flexibility in model updating. That is, EXPLORER can be updated one point at a time rather than having to retrain the entire data set when new observations are recorded. The proposed EXPLORER supports asynchronized communication, which relieves the participants from coordinating with one another, and prevents service breakdown from the absence of participants or interrupted communications. PMID:23562651
NASA Astrophysics Data System (ADS)
Reddy, Ramakrushna; Nair, Rajesh R.
2013-10-01
This work deals with a methodology applied to seismic early warning systems which are designed to provide real-time estimation of the magnitude of an event. We will reappraise the work of Simons et al. (2006), who on the basis of wavelet approach predicted a magnitude error of ±1. We will verify and improve upon the methodology of Simons et al. (2006) by applying an SVM statistical learning machine on the time-scale wavelet decomposition methods. We used the data of 108 events in central Japan with magnitude ranging from 3 to 7.4 recorded at KiK-net network stations, for a source-receiver distance of up to 150 km during the period 1998-2011. We applied a wavelet transform on the seismogram data and calculated scale-dependent threshold wavelet coefficients. These coefficients were then classified into low magnitude and high magnitude events by constructing a maximum margin hyperplane between the two classes, which forms the essence of SVMs. Further, the classified events from both the classes were picked up and linear regressions were plotted to determine the relationship between wavelet coefficient magnitude and earthquake magnitude, which in turn helped us to estimate the earthquake magnitude of an event given its threshold wavelet coefficient. At wavelet scale number 7, we predicted the earthquake magnitude of an event within 2.7 seconds. This means that a magnitude determination is available within 2.7 s after the initial onset of the P-wave. These results shed light on the application of SVM as a way to choose the optimal regression function to estimate the magnitude from a few seconds of an incoming seismogram. This would improve the approaches from Simons et al. (2006) which use an average of the two regression functions to estimate the magnitude.
Interpretation of the Coefficients in the Fit y = at + bx + c
ERIC Educational Resources Information Center
Farnsworth, David L.
2006-01-01
The goals of this note are to derive formulas for the coefficients a and b in the least-squares regression plane y = at + bx + c for observations (t_i, x_i, y_i), i = 1, 2, ..., n, and to present meanings for the coefficients a and b. In this note, formulas for the coefficients a and b in the least-squares fit are…
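As a rough illustration of the fit described in this abstract, the following sketch estimates a, b and c numerically rather than through the closed-form formulas the note derives. The function name fit_plane and the sample observations are hypothetical and are not taken from the note; the data were chosen to lie exactly on a known plane so the recovered coefficients can be checked.

```python
import numpy as np

def fit_plane(t, x, y):
    """Least-squares fit of y = a*t + b*x + c; returns (a, b, c)."""
    t, x, y = (np.asarray(v, dtype=float) for v in (t, x, y))
    # Design matrix: columns t, x, and a column of ones for the intercept c.
    design = np.column_stack([t, x, np.ones_like(t)])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return tuple(coeffs)

# Made-up observations that lie exactly on y = 2t - 3x + 1.
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
x = np.array([1.0, 0.0, 2.0, 1.0, 3.0])
y = 2.0 * t - 3.0 * x + 1.0
a, b, c = fit_plane(t, x, y)
```

With noisy data the same call returns the plane that minimizes the sum of squared vertical residuals; a is then read as the change in y per unit t with x held fixed, and b likewise for x.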
van Wesenbeeck, Ian; Driver, Jeffrey; Ross, John
2008-04-01
Volatilization of chemicals can be an important form of dissipation in the environment. Rates of evaporative losses from plant and soil surfaces are useful for estimating the potential for food-related dietary residues and operator and bystander exposure, and can be used as source functions for screening models that predict off-site movement of volatile materials. A regression of evaporation on vapor pressure from three datasets containing 82 pesticidal active ingredients and co-formulants, ranging in vapor pressure from 0.0001 to >30,000 Pa was developed for this purpose with a regression correlation coefficient of 0.98.
A semi-nonparametric Poisson regression model for analyzing motor vehicle crash data.
Ye, Xin; Wang, Ke; Zou, Yajie; Lord, Dominique
2018-01-01
This paper develops a semi-nonparametric Poisson regression model to analyze motor vehicle crash frequency data collected from rural multilane highway segments in California, US. Motor vehicle crash frequency on rural highway is a topic of interest in the area of transportation safety due to higher driving speeds and the resultant severity level. Unlike the traditional Negative Binomial (NB) model, the semi-nonparametric Poisson regression model can accommodate an unobserved heterogeneity following a highly flexible semi-nonparametric (SNP) distribution. Simulation experiments are conducted to demonstrate that the SNP distribution can well mimic a large family of distributions, including normal distributions, log-gamma distributions, bimodal and trimodal distributions. Empirical estimation results show that such flexibility offered by the SNP distribution can greatly improve model precision and the overall goodness-of-fit. The semi-nonparametric distribution can provide a better understanding of crash data structure through its ability to capture potential multimodality in the distribution of unobserved heterogeneity. When estimated coefficients in empirical models are compared, SNP and NB models are found to have a substantially different coefficient for the dummy variable indicating the lane width. The SNP model with better statistical performance suggests that the NB model overestimates the effect of lane width on crash frequency reduction by 83.1%.
Oziminski, Wojciech P; Krygowski, Tadeusz M
2011-03-01
Electronic structure of 22 monosubstituted derivatives of benzene and exocyclically substituted fulvene with substituents: B(OH)(2), BH(2), CCH, CF(3), CH(3), CHCH(2), CHO, Cl, CMe(3), CN, COCH(3), CONH(2), COOH, F, NH(2), NMe(2), NO, NO(2), OCH(3), OH, SiH(3), SiMe(3) were studied theoretically by means of Natural Bond Orbital analysis. It is shown that the sums of the π-electron populations of the carbon atoms of the fulvene and benzene rings, pEDA(F) and pEDA(B), respectively, correlate well with Hammett substituent constants [Formula in text] and the aromaticity index NICS. The substituent effect acting on π-electron occupation at carbon atoms of the fulvene ring is significantly stronger than in the case of benzene. Electron occupations of ring carbon atoms (except C1) in fulvene plotted against each other give linear regressions with high correlation coefficients. The same is true for ortho- and para-carbon atoms in benzene. Positive slopes of the regressions indicate a similar kind of substituent effect for fulvene and benzene, mostly resonance in nature. Only the regressions of occupation at the carbon atom in the meta-position of benzene against the ortho- and para-positions give negative slopes and low correlation coefficients.
ERIC Educational Resources Information Center
Waller, Niels; Jones, Jeff
2011-01-01
We describe methods for assessing all possible criteria (i.e., dependent variables) and subsets of criteria for regression models with a fixed set of predictors, x (where x is an n x 1 vector of independent variables). Our methods build upon the geometry of regression coefficients (hereafter called regression weights) in n-dimensional space. For a…
Neither fixed nor random: weighted least squares meta-regression.
Stanley, T D; Doucouliagos, Hristos
2017-03-01
Our study revisits and challenges two core conventional meta-regression estimators: the prevalent use of 'mixed-effects' or random-effects meta-regression analysis and the correction of standard errors that defines fixed-effects meta-regression analysis (FE-MRA). We show how and explain why an unrestricted weighted least squares MRA (WLS-MRA) estimator is superior to conventional random-effects (or mixed-effects) meta-regression when there is publication (or small-sample) bias that is as good as FE-MRA in all cases and better than fixed effects in most practical applications. Simulations and statistical theory show that WLS-MRA provides satisfactory estimates of meta-regression coefficients that are practically equivalent to mixed effects or random effects when there is no publication bias. When there is publication selection bias, WLS-MRA always has smaller bias than mixed effects or random effects. In practical applications, an unrestricted WLS meta-regression is likely to give practically equivalent or superior estimates to fixed-effects, random-effects, and mixed-effects meta-regression approaches. However, random-effects meta-regression remains viable and perhaps somewhat preferable if selection for statistical significance (publication bias) can be ruled out and when random, additive normal heterogeneity is known to directly affect the 'true' regression coefficient. Copyright © 2016 John Wiley & Sons, Ltd.
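The contrast drawn in this abstract between unrestricted WLS and fixed-effects meta-regression can be sketched numerically. The sketch below assumes that "unrestricted" means the multiplicative dispersion term is estimated from the weighted residuals instead of being fixed at 1, and that the weights are inverse squared standard errors; both are our reading rather than something the abstract states. The function name and the five effect sizes are invented for illustration.

```python
import numpy as np

def wls_meta_regression(effects, std_errors, moderators):
    """Inverse-variance WLS of effect sizes on moderators (sketch).

    Returns the coefficients, their standard errors with the dispersion
    estimated from the data (assumed 'unrestricted' variant), the standard
    errors with dispersion fixed at 1 (assumed fixed-effects variant),
    and the estimated dispersion itself.
    """
    y = np.asarray(effects, dtype=float)
    se = np.asarray(std_errors, dtype=float)
    X = np.column_stack([np.ones_like(y), np.asarray(moderators, dtype=float)])
    w = 1.0 / se**2                               # inverse-variance weights
    xtwx = X.T @ (w[:, None] * X)
    beta = np.linalg.solve(xtwx, X.T @ (w * y))   # same point estimate in both variants
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    phi = float(np.sum(w * resid**2) / dof)       # multiplicative dispersion
    xtwx_inv = np.linalg.inv(xtwx)
    se_unrestricted = np.sqrt(phi * np.diag(xtwx_inv))
    se_fixed = np.sqrt(np.diag(xtwx_inv))
    return beta, se_unrestricted, se_fixed, phi

# Hypothetical effect sizes, standard errors, and one moderator.
effects = [0.10, 0.25, 0.32, 0.41, 0.58]
std_errors = [0.05, 0.10, 0.08, 0.12, 0.20]
moderator = [0.0, 1.0, 1.0, 2.0, 3.0]
beta, se_u, se_f, phi = wls_meta_regression(effects, std_errors, moderator)
```

Under these assumptions the two variants share coefficients and differ only in the standard errors, which are scaled by the square root of the estimated dispersion in the unrestricted case.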
NASA Astrophysics Data System (ADS)
Wibowo, Wahyu; Wene, Chatrien; Budiantara, I. Nyoman; Permatasari, Erma Oktania
2017-03-01
Multiresponse semiparametric regression is a simultaneous-equation regression model that fuses parametric and nonparametric models. The regression model comprises several models, each with two components, parametric and nonparametric. The model used here has a linear function as the parametric component and a truncated polynomial spline as the nonparametric component. The model can handle both linear and nonlinear relationships between the responses and the sets of predictor variables. The aim of this paper is to demonstrate the application of the regression model to modeling the effect of regional socio-economic conditions on the use of information technology. More specifically, the response variables are the percentage of households with access to the internet and the percentage of households with a personal computer. The predictor variables are the percentage of literate people, the percentage of electrification, and the percentage of economic growth. Based on identification of the relationship between the response and predictor variables, economic growth is treated as a nonparametric predictor and the others as parametric predictors. The results show that multiresponse semiparametric regression can be applied well, as indicated by the high coefficient of determination of 90 percent.
Interpreting Bivariate Regression Coefficients: Going beyond the Average
ERIC Educational Resources Information Center
Halcoussis, Dennis; Phillips, G. Michael
2010-01-01
Statistics, econometrics, investment analysis, and data analysis classes often review the calculation of several types of averages, including the arithmetic mean, geometric mean, harmonic mean, and various weighted averages. This note shows how each of these can be computed using a basic regression framework. By recognizing when a regression model…
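One way to see the idea in this abstract is to treat each average as the fitted constant of an intercept-only regression. The sketch below is our own formulation, not necessarily the one used in the note: the arithmetic mean comes from equal weights, the harmonic mean from weights 1/y_i, and the geometric mean from an equal-weight fit to log(y) that is then exponentiated. The function name and data are hypothetical, and the data must be positive.

```python
import numpy as np

def weighted_intercept(y, w):
    """Value b minimizing sum(w * (y - b)**2): weighted regression on a constant."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    return float(np.sum(w * y) / np.sum(w))

y = np.array([1.0, 2.0, 4.0])

# Equal weights: ordinary least squares on a constant gives the arithmetic mean.
arithmetic = weighted_intercept(y, np.ones_like(y))
# Weights 1/y_i: the weighted fit reduces to n / sum(1/y_i), the harmonic mean.
harmonic = weighted_intercept(y, 1.0 / y)
# Intercept-only fit to log(y), exponentiated, gives the geometric mean.
geometric = float(np.exp(weighted_intercept(np.log(y), np.ones_like(y))))
```

The harmonic case follows because sum((1/y_i) * y_i) = n, so the weighted intercept is n divided by the sum of reciprocals.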
Beyond Multiple Regression: Using Commonality Analysis to Better Understand R² Results
ERIC Educational Resources Information Center
Warne, Russell T.
2011-01-01
Multiple regression is one of the most common statistical methods used in quantitative educational research. Despite the versatility and easy interpretability of multiple regression, it has some shortcomings in the detection of suppressor variables and for somewhat arbitrarily assigning values to the structure coefficients of correlated…
Precision Efficacy Analysis for Regression.
ERIC Educational Resources Information Center
Brooks, Gordon P.
When multiple linear regression is used to develop a prediction model, sample size must be large enough to ensure stable coefficients. If the derivation sample size is inadequate, the model may not predict well for future subjects. The precision efficacy analysis for regression (PEAR) method uses a cross-validity approach to select sample sizes…
Shui, Wei; DU, Yong; Chen, Yi Ping; Jian, Xiao Mei; Fan, Bing Xiong
2017-04-18
Anxi County, specializing in tea cultivation, was taken as a case in this research. Pearson correlation analysis, ordinary least squares model (OLS) and geographically weighted regression model (GWR) were used to select four primary influence factors of specialization in tea cultivation (i.e., the average elevation, net income per capita, proportion of agricultural population, and the distance from roads) by analyzing the specialization degree of each town of Anxi County. Meanwhile, the spatial patterns of specialization in tea cultivation of Anxi County were evaluated. The results indicated that specialization in tea cultivation of Anxi County showed an obvious spatial auto-correlation, and a spatial pattern with "low-middle-high" circle structure, which was similar to Von Thünen's circle structure model, appeared from the county town to its surrounding region. Meanwhile, GWR (0.624) had a better fitting degree than OLS (0.595), and GWR could reasonably expound the spatial data. Contrary to the agricultural location theory of Von Thünen's model, which indicated that distance from market was a determination factor, the specialization degree of tea cultivation in Anxi was mainly decided by natural conditions of mountain area, instead of the social factors. Specialization degree of tea cultivation was positively correlated with the average elevation, net income per capita and the proportion of agricultural population, while a negative correlation was found between the distance from roads and specialization degree of tea cultivation. Coefficients of regression between the specialization degree of tea cultivation and two factors (i.e., the average elevation and net income per capita) showed a spatial pattern of higher level in the north direction and lower level in the south direction. On the contrary, the regression coefficients for the proportion of agricultural population increased from south to north of Anxi County. 
Furthermore, regression coefficient for the distance from roads showed a spatial pattern of higher level in the northeast direction and lower level in the southwest direction of Anxi County.
40 CFR 53.34 - Test procedure for methods for PM10 and Class I methods for PM2.5.
Code of Federal Regulations, 2011 CFR
2011-07-01
... linear regression parameters (slope, intercept, and correlation coefficient) describing the relationship... correlation coefficient. (2) To pass the test for comparability, the slope, intercept, and correlation...
Su, Liyun; Zhao, Yanyong; Yan, Tianshun; Li, Fenglan
2012-01-01
Multivariate local polynomial fitting is applied to the multivariate linear heteroscedastic regression model. First, local polynomial fitting is applied to estimate the heteroscedastic function; then the coefficients of the regression model are obtained by the generalized least squares method. One noteworthy feature of our approach is that we avoid testing for heteroscedasticity by improving the traditional two-stage method. Because local polynomial estimation is a non-parametric technique, it is unnecessary to know the form of the heteroscedastic function. Therefore, we can improve the estimation precision when the heteroscedastic function is unknown. Furthermore, we verify that the regression coefficients are asymptotically normal based on numerical simulations and normal Q-Q plots of residuals. Finally, the simulation results and the local polynomial estimation of real data indicate that our approach is effective in finite-sample situations.
NASA Technical Reports Server (NTRS)
Stolzer, Alan J.; Halford, Carl
2007-01-01
In a previous study, multiple regression techniques were applied to Flight Operations Quality Assurance-derived data to develop parsimonious model(s) for fuel consumption on the Boeing 757 airplane. The present study examined several data mining algorithms, including neural networks, on the fuel consumption problem and compared them to the multiple regression results obtained earlier. Using regression methods, parsimonious models were obtained that explained approximately 85% of the variation in fuel flow. In general data mining methods were more effective in predicting fuel consumption. Classification and Regression Tree methods reported correlation coefficients of .91 to .92, and General Linear Models and Multilayer Perceptron neural networks reported correlation coefficients of about .99. These data mining models show great promise for use in further examining large FOQA databases for operational and safety improvements.
NASA Astrophysics Data System (ADS)
Mitra, Ashis; Majumdar, Prabal Kumar; Bannerjee, Debamalya
2013-03-01
This paper presents a comparative analysis of two modeling methodologies for the prediction of air permeability of plain woven handloom cotton fabrics. Four basic fabric constructional parameters namely ends per inch, picks per inch, warp count and weft count have been used as inputs for artificial neural network (ANN) and regression models. Out of the four regression models tried, interaction model showed very good prediction performance with a meager mean absolute error of 2.017 %. However, ANN models demonstrated superiority over the regression models both in terms of correlation coefficient and mean absolute error. The ANN model with 10 nodes in the single hidden layer showed very good correlation coefficient of 0.982 and 0.929 and mean absolute error of only 0.923 and 2.043 % for training and testing data respectively.
Improvement of Storm Forecasts Using Gridded Bayesian Linear Regression for Northeast United States
NASA Astrophysics Data System (ADS)
Yang, J.; Astitha, M.; Schwartz, C. S.
2017-12-01
Bayesian linear regression (BLR) is a post-processing technique in which regression coefficients are derived and used to correct raw forecasts based on pairs of observation-model values. This study presents the development and application of a gridded Bayesian linear regression (GBLR) as a new post-processing technique to improve numerical weather prediction (NWP) of rain and wind storm forecasts over northeast United States. Ten controlled variables produced from ten ensemble members of the National Center for Atmospheric Research (NCAR) real-time prediction system are used for a GBLR model. In the GBLR framework, leave-one-storm-out cross-validation is utilized to study the performances of the post-processing technique in a database composed of 92 storms. To estimate the regression coefficients of the GBLR, optimization procedures that minimize the systematic and random error of predicted atmospheric variables (wind speed, precipitation, etc.) are implemented for the modeled-observed pairs of training storms. The regression coefficients calculated for meteorological stations of the National Weather Service are interpolated back to the model domain. An analysis of forecast improvements based on error reductions during the storms will demonstrate the value of GBLR approach. This presentation will also illustrate how the variances are optimized for the training partition in GBLR and discuss the verification strategy for grid points where no observations are available. The new post-processing technique is successful in improving wind speed and precipitation storm forecasts using past event-based data and has the potential to be implemented in real-time.
Correlation and prediction of dynamic human isolated joint strength from lean body mass
NASA Technical Reports Server (NTRS)
Pandya, Abhilash K.; Hasson, Scott M.; Aldridge, Ann M.; Maida, James C.; Woolford, Barbara J.
1992-01-01
A relationship between a person's lean body mass and the amount of maximum torque that can be produced with each isolated joint of the upper extremity was investigated. The maximum dynamic isolated joint torque (upper extremity) on 14 subjects was collected using a dynamometer multi-joint testing unit. These data were reduced to a table of coefficients of second degree polynomials, computed using a least squares regression method. All the coefficients were then organized into look-up tables, a compact and convenient storage/retrieval mechanism for the data set. Data from each joint, direction and velocity, were normalized with respect to that joint's average and merged into files (one for each curve for a particular joint). Regression was performed on each one of these files to derive a table of normalized population curve coefficients for each joint axis, direction, and velocity. In addition, a regression table which included all upper extremity joints was built which related average torque to lean body mass for an individual. These two tables are the basis of the regression model which allows the prediction of dynamic isolated joint torques from an individual's lean body mass.
Sabetghadam, Samaneh; Ahmadi-Givi, Farhang
2014-01-01
Light extinction, which is the extent of attenuation of light signal for every distance traveled by light in the absence of special weather conditions (e.g., fog and rain), can be expressed as the sum of scattering and absorption effects of aerosols. In this paper, diurnal and seasonal variations of the extinction coefficient are investigated for the urban areas of Tehran from 2007 to 2009. Cases of visibility impairment that were concurrent with reports of fog, mist, precipitation, or relative humidity above 90% are filtered. The mean value and standard deviation of daily extinction are 0.49 and 0.39 km⁻¹, respectively. The average is much higher than that in many other large cities in the world, indicating the rather poor air quality over Tehran. The extinction coefficient shows obvious diurnal variations in each season, with a peak in the morning that is more pronounced in the wintertime. Also, there is a very slight increasing trend in the annual variations of atmospheric extinction coefficient, which suggests that air quality has regressed since 2007. The horizontal extinction coefficient decreased from January to July in each year and then increased between July and December, with the maximum value in the winter. Diurnal variation of extinction is often associated with small values for low relative humidity (RH), but increases significantly at higher RH. Annual correlation analysis shows that there is a positive correlation between the extinction coefficient and RH, CO, PM10, SO2, and NO2 concentration, while negative correlation exists between the extinction and T, WS, and O3, implying their unfavorable impact on extinction variation. The extinction budget was derived from multiple regression equations using the regression coefficients. 
On average, 44% of the extinction is from suspended particles, 3% is from air molecules, about 5% is from NO2 absorption, 0.35% is from RH, and approximately 48% is unaccounted for, which may represent errors in the data as well as contribution of other atmospheric constituents omitted from the analysis. A stronger regression equation is obtained in the summer, meaning that the extinction is more predictable in this season using pollutant concentrations.
Li, Zhenghua; Cheng, Fansheng; Xia, Zhining
2011-01-01
The chemical structures of 114 polycyclic aromatic sulfur heterocycles (PASHs) have been studied by molecular electronegativity-distance vector (MEDV). The linear relationships between gas chromatographic retention index and the MEDV have been established by a multiple linear regression (MLR) model. The results of variable selection by stepwise multiple regression (SMR) and the powerful predictive abilities of the optimization model appraised by leave-one-out cross-validation showed that the optimization model with the correlation coefficient (R) of 0.9947 and the cross-validated correlation coefficient (Rcv) of 0.9940 possessed the best statistical quality. Furthermore, when the 114 PASH compounds were divided into calibration and test sets in the ratio of 2:1, the statistical analysis showed that our models possess almost equal statistical quality, very similar regression coefficients, and good robustness. The quantitative structure-retention relationship (QSRR) model established may provide a convenient and powerful method for predicting the gas chromatographic retention of PASHs.
Cerebrospinal fluid norepinephrine and cognition in subjects across the adult age span
Wang, Lucy Y.; Murphy, Richard R.; Hanscom, Brett; Li, Ge; Millard, Steven P.; Petrie, Eric C.; Galasko, Douglas R.; Sikkema, Carl; Raskind, Murray A.; Wilkinson, Charles W.; Peskind, Elaine R.
2013-01-01
Adequate central nervous system noradrenergic activity enhances cognition, but excessive noradrenergic activity may have adverse effects on cognition. Previous studies have also demonstrated that noradrenergic activity is higher in older than younger adults. We aimed to determine relationships between cerebrospinal fluid (CSF) norepinephrine (NE) concentration and cognitive performance by using data from a CSF bank that includes samples from 258 cognitively normal participants aged 21–100 years. After adjusting for age, gender, education, and ethnicity, higher CSF NE levels (units of 100 pg/mL) are associated with poorer performance on tests of attention, processing speed, and executive function (Trail Making A: regression coefficient 1.5, standard error [SE] 0.77, p = 0.046; Trail Making B: regression coefficient 5.0, SE 2.2, p = 0.024; Stroop Word-Color Interference task: regression coefficient 6.1, SE 2.0, p = 0.003). Findings are consistent with the earlier literature relating excess noradrenergic activity with cognitive impairment. PMID:23639207
Cerebrospinal fluid norepinephrine and cognition in subjects across the adult age span.
Wang, Lucy Y; Murphy, Richard R; Hanscom, Brett; Li, Ge; Millard, Steven P; Petrie, Eric C; Galasko, Douglas R; Sikkema, Carl; Raskind, Murray A; Wilkinson, Charles W; Peskind, Elaine R
2013-10-01
Adequate central nervous system noradrenergic activity enhances cognition, but excessive noradrenergic activity may have adverse effects on cognition. Previous studies have also demonstrated that noradrenergic activity is higher in older than younger adults. We aimed to determine relationships between cerebrospinal fluid (CSF) norepinephrine (NE) concentration and cognitive performance by using data from a CSF bank that includes samples from 258 cognitively normal participants aged 21-100 years. After adjusting for age, gender, education, and ethnicity, higher CSF NE levels (units of 100 pg/mL) are associated with poorer performance on tests of attention, processing speed, and executive function (Trail Making A: regression coefficient 1.5, standard error [SE] 0.77, p = 0.046; Trail Making B: regression coefficient 5.0, SE 2.2, p = 0.024; Stroop Word-Color Interference task: regression coefficient 6.1, SE 2.0, p = 0.003). Findings are consistent with the earlier literature relating excess noradrenergic activity with cognitive impairment. Published by Elsevier Inc.
Impact of multicollinearity on small sample hydrologic regression models
NASA Astrophysics Data System (ADS)
Kroll, Charles N.; Song, Peter
2013-06-01
Often hydrologic regression models are developed with ordinary least squares (OLS) procedures. The use of OLS with highly correlated explanatory variables produces multicollinearity, which creates highly sensitive parameter estimators with inflated variances and improper model selection. It is not clear how to best address multicollinearity in hydrologic regression models. Here a Monte Carlo simulation is developed to compare four techniques to address multicollinearity: OLS, OLS with variance inflation factor screening (VIF), principal component regression (PCR), and partial least squares regression (PLS). The performance of these four techniques was observed for varying sample sizes, correlation coefficients between the explanatory variables, and model error variances consistent with hydrologic regional regression models. The negative effects of multicollinearity are magnified at smaller sample sizes, higher correlations between the variables, and larger model error variances (smaller R²). The Monte Carlo simulation indicates that if the true model is known, multicollinearity is present, and the estimation and statistical testing of regression parameters are of interest, then PCR or PLS should be employed. If the model is unknown, or if the interest is solely on model predictions, it is recommended that OLS be employed since using more complicated techniques did not produce any improvement in model performance. A leave-one-out cross-validation case study was also performed using low-streamflow data sets from the eastern United States. Results indicate that OLS with stepwise selection generally produces models across study regions with varying levels of multicollinearity that are as good as biased regression techniques such as PCR and PLS.
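To make the variance inflation factor mentioned in the abstract above concrete, the following is a minimal sketch, not the authors' code. It assumes the special case of exactly two explanatory variables, where the VIF of each predictor reduces to 1 / (1 - r²), with r the Pearson correlation between them. The function names and the toy data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical toy predictors with a correlation of 0.8.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
print(vif_two_predictors(x1, x2))
```

With more than two predictors the VIF of a variable is instead 1 / (1 - R²) from regressing that variable on all the others, which this sketch does not cover.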
A non-linear regression method for CT brain perfusion analysis
NASA Astrophysics Data System (ADS)
Bennink, E.; Oosterbroek, J.; Viergever, M. A.; Velthuis, B. K.; de Jong, H. W. A. M.
2015-03-01
CT perfusion (CTP) imaging allows for rapid diagnosis of ischemic stroke. Generation of perfusion maps from CTP data usually involves deconvolution algorithms providing estimates for the impulse response function in the tissue. We propose the use of a fast non-linear regression (NLR) method that we postulate has similar performance to the current academic state-of-art method (bSVD), but that has some important advantages, including the estimation of vascular permeability, improved robustness to tracer-delay, and very few tuning parameters, that are all important in stroke assessment. The aim of this study is to evaluate the fast NLR method against bSVD and a commercial clinical state-of-art method. The three methods were tested against a published digital perfusion phantom earlier used to illustrate the superiority of bSVD. In addition, the NLR and clinical methods were also tested against bSVD on 20 clinical scans. Pearson correlation coefficients were calculated for each of the tested methods. All three methods showed high correlation coefficients (>0.9) with the ground truth in the phantom. With respect to the clinical scans, the NLR perfusion maps showed higher correlation with bSVD than the perfusion maps from the clinical method. Furthermore, the perfusion maps showed that the fast NLR estimates are robust to tracer-delay. In conclusion, the proposed fast NLR method provides a simple and flexible way of estimating perfusion parameters from CT perfusion scans, with high correlation coefficients. This suggests that it could be a better alternative to the current clinical and academic state-of-art methods.
Zhang, Hualing
2014-03-01
To learn characteristics and their mutual relations of self-esteem, self-harmony and interpersonal-harmony of university students, in order to provide the basis for mental health education. With a stratified cluster random sampling method, a questionnaire survey was conducted in 820 university students from 16 classes of four universities, chosen from 30 universities in Anhui Province. Meanwhile, Rosenberg Self-esteem Scale, Self-harmony Scale and Interpersonal-harmony Diagnostic Scale were used for assessment. Self-esteem of university students has an average score of (30.71 +/- 4.77), higher than the theoretical median of 25, and there existed statistical significance in the dimensions of gender (P = 0.004), origin (P = 0.038) and only-child (P = 0.005). University students' self-harmony has an average score of (98.66 +/- 8.69), among which there were 112 students in the group of low score, accounting for 13.7%, 442 in that of middle score, accounting for 53.95%, 265 in that of high score, accounting for 32.33%. And there existed no statistical significance in the total-score of self-harmony and score differences from most of subscales in the dimension of gender and origin, but statistical significance did exist in the dimension of only-child (P = 0.004). It was statistically significant (P = 0.006) on the "stereotype" subscales, on the differences between university students from urban areas and rural areas. Every dimension of self-esteem and self-harmony and interpersonal harmony was correlated and statistically significant. Multiple regression analysis found that when there was a variable in self-esteem, the amount of the variable of self-harmony for explanation of interpersonal conversation dropped from 22.6% to 12%, and standard regression coefficient changing from 0.087 to 0.035. The trouble of interpersonal dating fell from 27.6% to 13.1%, the standard regression coefficient changing from 0.104 to 0.019. The bother of treating people fell from 30.9% to 15%, and the standard regression coefficient changing from 0.079 to 0.020. The problem of heterosexual contact fell from 23.4% to 17.3%, and the standard regression coefficient changing from 0.095 to 0.024. Self-esteem was a mediator variable between self-harmony and interpersonal-harmony. By cultivating university students' level of self-esteem to achieve their self-harmony and interpersonal-harmony, university students' mental health level can be improved.
Determining Sample Size for Accurate Estimation of the Squared Multiple Correlation Coefficient.
ERIC Educational Resources Information Center
Algina, James; Olejnik, Stephen
2000-01-01
Discusses determining sample size for estimation of the squared multiple correlation coefficient and presents regression equations that permit determination of the sample size for estimating this parameter for up to 20 predictor variables. (SLD)
Refraction data survey: 2nd generation correlation of myopia.
Greene, Peter R; Medina, Antonio
2016-10-01
The objective herein is to provide refraction data, myopia progression rate, prevalence, and 1st and 2nd generation correlations, relevant to whether myopia is random or inherited. First- and second-generation ocular refraction data are assembled from N = 34 families, average of 2.8 children per family. From this group, data are available from N = 165 subjects. Inter-generation regressions are performed on all the data sets, including correlation coefficient r, and myopia prevalence [%]. Prevalence of myopia is [M] = 38.5 %. Prevalence of high myopes with |R| >6 D is [M-] = 20.5 %. Average refraction is = -7.52 D ± 1.31 D (N = 33). Regression parameters are calculated for all the data sets, yielding correlation coefficients in the range r = 0.48-0.72 for some groups of myopes and high myopes, fathers to daughters, and mothers to sons. Also of interest, some categories show essentially no correlation, -0.20 < r < 0.20, indicating that the refractive errors occur randomly. Time series results show myopia diopter rates = -0.50 D/year.
Nakatsuka, Haruo; Chiba, Keiko; Watanabe, Takao; Sawatari, Hideyuki; Seki, Takako
2016-11-01
Iodine intake by adults in farming districts in Northeastern Japan was evaluated by two methods: (1) government-approved food composition tables based calculation and (2) instrumental measurement. The correlation between these two values and a regression model for the calibration of calculated values was presented. Iodine intake was calculated, using the values in the Japan Standard Tables of Food Composition (FCT), through the analysis of duplicate samples of complete 24-h food consumption for 90 adult subjects. In cases where the value for iodine content was not available in the FCT, it was assumed to be zero for that food item (calculated values). Iodine content was also measured by ICP-MS (measured values). Calculated and measured values rendered geometric means (GM) of 336 and 279 μg/day, respectively. There was no statistically significant (p > 0.05) difference between calculated and measured values. The correlation coefficient was 0.646 (p < 0.05). With this high correlation coefficient, a simple regression line can be applied to estimate measured value from calculated value. A survey of the literature suggests that the values in this study were similar to values that have been reported to date for Japan, and higher than those for other countries in Asia. Iodine intake of Japanese adults was 336 μg/day (GM, calculated) and 279 μg/day (GM, measured). Both values correlated so well, with a correlation coefficient of 0.646, that a regression model (Y = 130.8 + 1.9479X, where X and Y are measured and calculated values, respectively) could be used to calibrate calculated values.
NASA Astrophysics Data System (ADS)
Ben Shabat, Yael; Shitzer, Avraham
2012-07-01
Facial heat exchange convection coefficients were estimated from experimental data in cold and windy ambient conditions applicable to wind chill calculations. Measured facial temperature datasets, that were made available to this study, originated from 3 separate studies involving 18 male and 6 female subjects. Most of these data were for a -10°C ambient environment and wind speeds in the range of 0.2 to 6 m s⁻¹. Additional single experiments were for -5°C, 0°C and 10°C environments and wind speeds in the same range. Convection coefficients were estimated for all these conditions by means of a numerical facial heat exchange model, applying properties of biological tissues and a typical facial diameter of 0.18 m. Estimation was performed by adjusting the guessed convection coefficients in the computed facial temperatures, while comparing them to measured data, to obtain a satisfactory fit (r² > 0.98, in most cases). In one of the studies, heat flux meters were additionally used. Convection coefficients derived from these meters closely approached the estimated values for only the male subjects. They differed significantly, by about 50%, when compared to the estimated female subjects' data. Regression analysis was performed for just the -10°C ambient temperature, and the range of experimental wind speeds, due to the limited availability of data for other ambient temperatures. The regressed equation was assumed in the form of the equation underlying the "new" wind chill chart. Regressed convection coefficients, which closely duplicated the measured data, were consistently higher than those calculated by this equation, except for one single case. The estimated and currently used convection coefficients are shown to diverge exponentially from each other, as wind speed increases. This finding casts considerable doubts on the validity of the convection coefficients that are used in the computation of the "new" wind chill chart and their applicability to humans in cold and windy environments.
Theodoratou, Evropi; Zhang, Jian Shayne F.; Kolcic, Ivana; Davis, Andrew M.; Bhopal, Sunil; Nair, Harish; Chan, Kit Yee; Liu, Li; Johnson, Hope; Rudan, Igor; Campbell, Harry
2011-01-01
Background Pneumonia is the leading cause of child deaths globally. The aims of this study were to: a) estimate the number and global distribution of pneumonia deaths for children 1–59 months for 2008 for countries with low (<85%) or no coverage of death certification using single-cause regression models and b) compare these country estimates with recently published ones based on multi-cause regression models. Methods and Findings For 35 low child-mortality countries with <85% coverage of death certification, a regression model based on vital registration data of low child-mortality and >85% coverage of death certification countries was used. For 87 high child-mortality countries pneumonia death estimates were obtained by applying a regression model developed from published and unpublished verbal autopsy data from high child-mortality settings. The total number of 1–59 months pneumonia deaths for the year 2008 for these 122 countries was estimated to be 1.18 M (95% CI 0.77 M–1.80 M), which represented 23.27% (95% CI 17.15%–32.75%) of all 1–59 month child deaths. The country level estimation correlation coefficient between these two methods was 0.40. Interpretation Although the overall number of post-neonatal pneumonia deaths was similar irrespective to the method of estimation used, the country estimate correlation coefficient was low, and therefore country-specific estimates should be interpreted with caution. Pneumonia remains the leading cause of child deaths and is greatest in regions of poverty and high child-mortality. Despite the concerns about gender inequity linked with childhood mortality we could not estimate sex-specific pneumonia mortality rates due to the inadequate data. Life-saving interventions effective in preventing and treating pneumonia mortality exist but few children in high pneumonia disease burden regions are able to access them. 
To achieve the United Nations Millennium Development Goal 4 target to reduce child deaths by two-thirds in year 2015 will require the scale-up of access to these effective pneumonia interventions. PMID:21966425
Campbell, J Elliott; Moen, Jeremie C; Ney, Richard A; Schnoor, Jerald L
2008-03-01
Estimates of forest soil organic carbon (SOC) have applications in carbon science, soil quality studies, carbon sequestration technologies, and carbon trading. Forest SOC has been modeled using a regression coefficient methodology that applies mean SOC densities (mass/area) to broad forest regions. A higher resolution model is based on an approach that employs a geographic information system (GIS) with soil databases and satellite-derived landcover images. Despite this advancement, the regression approach remains the basis of current state and federal level greenhouse gas inventories. Both approaches are analyzed in detail for Wisconsin forest soils from 1983 to 2001, applying rigorous error-fixing algorithms to soil databases. Resulting SOC stock estimates are 20% larger when determined using the GIS method rather than the regression approach. Average annual rates of increase in SOC stocks are 3.6 and 1.0 million metric tons of carbon per year for the GIS and regression approaches respectively.
Radon-222 concentrations in ground water and soil gas on Indian reservations in Wisconsin
DeWild, John F.; Krohelski, James T.
1995-01-01
For sites with wells finished in the sand and gravel aquifer, the coefficient of determination (R²) of the regression of concentration of radon-222 in ground water as a function of well depth is 0.003 and the significance level is 0.32, which indicates that there is not a statistically significant relation between radon-222 concentrations in ground water and well depth. The coefficient of determination of the regression of radon-222 in ground water and soil gas is 0.19 and the root mean square error of the regression line is 271 picocuries per liter. Even though the significance level (0.036) indicates a statistical relation, the root mean square error of the regression is so large that the regression equation would not give reliable predictions. Because of an inadequate number of samples, similar statistical analyses could not be performed for sites with wells finished in the crystalline and sedimentary bedrock aquifers.
Integrative Analysis of High-throughput Cancer Studies with Contrasted Penalization
Shi, Xingjie; Liu, Jin; Huang, Jian; Zhou, Yong; Shia, BenChang; Ma, Shuangge
2015-01-01
In cancer studies with high-throughput genetic and genomic measurements, integrative analysis provides a way to effectively pool and analyze heterogeneous raw data from multiple independent studies and outperforms “classic” meta-analysis and single-dataset analysis. When marker selection is of interest, the genetic basis of multiple datasets can be described using the homogeneity model or the heterogeneity model. In this study, we consider marker selection under the heterogeneity model, which includes the homogeneity model as a special case and can be more flexible. Penalization methods have been developed in the literature for marker selection. This study advances from the published ones by introducing the contrast penalties, which can accommodate the within- and across-dataset structures of covariates/regression coefficients and, by doing so, further improve marker selection performance. Specifically, we develop a penalization method that accommodates the across-dataset structures by smoothing over regression coefficients. An effective iterative algorithm, which calls an inner coordinate descent iteration, is developed. Simulation shows that the proposed method outperforms the benchmark with more accurate marker identification. The analysis of breast cancer and lung cancer prognosis studies with gene expression measurements shows that the proposed method identifies genes different from those using the benchmark and has better prediction performance. PMID:24395534
NASA Astrophysics Data System (ADS)
Adbul-Munaim, Ali Mazin; Reuter, Marco; Koch, Martin; Watson, Dennis G.
2015-07-01
Terahertz-time-domain spectroscopy (THz-TDS) in the range of 0.5-2.0 THz was evaluated for distinguishing among gasoline engine oils of three different grades (SAE 5W-20, 10W-40, and 20W-50) from the same manufacturer. Absorption coefficient showed limited potential and only distinguished (p < 0.05) the 20W-50 grade from the other two grades in the 1.7-2.0-THz range. Refractive index data demonstrated relatively flat and consistently spaced curves for the three oil grades. ANOVA results confirmed a highly significant difference (p < 0.0001) in refractive index among each of the three oils across the 0.5-2.0-THz range. Linear regression was applied to refractive index data at 0.25-THz intervals from 0.5 to 2.0 THz to predict kinematic viscosity. All seven linear regression models, intercepts, and refractive index coefficients were highly significant (p < 0.0001). All models had a similar fit with R² ranging from 0.9773 to 0.9827 and RMSE ranging from 6.33 to 7.75. The refractive indices at 1.25 THz produced the best fit. The refractive indices of these oil samples were promising for identification and distinction of oil grades.
Development and evaluation of an electromagnetic hypersensitivity questionnaire for Japanese people
Tokiya, Mikiko; Mizuki, Masami; Miyata, Mikio; Kanatani, Kumiko T.; Takagi, Airi; Tsurikisawa, Naomi; Kame, Setsuko; Katoh, Takahiko; Tsujiuchi, Takuya; Kumano, Hiroaki
2016-01-01
The purpose of the present study was to evaluate the validity and reliability of a Japanese version of an electromagnetic hypersensitivity (EHS) questionnaire, originally developed by Eltiti et al. in the United Kingdom. Using this Japanese EHS questionnaire, surveys were conducted on 1306 controls and 127 self‐selected EHS subjects in Japan. Principal component analysis of controls revealed eight principal symptom groups, namely, nervous, skin‐related, head‐related, auditory and vestibular, musculoskeletal, allergy‐related, sensory, and heart/chest‐related. The reliability of the Japanese EHS questionnaire was confirmed by high to moderate intraclass correlation coefficients in a test–retest analysis, and high Cronbach's α coefficients (0.853–0.953) from each subscale. A comparison of scores of each subscale between self‐selected EHS subjects and age‐ and sex‐matched controls using bivariate logistic regression analysis, Mann–Whitney U‐ and χ² tests, verified the validity of the questionnaire. This study demonstrated that the Japanese EHS questionnaire is reliable and valid, and can be used for surveillance of EHS individuals in Japan. Furthermore, based on multiple logistic regression and receiver operating characteristic analyses, we propose specific preliminary criteria for screening EHS individuals in Japan. Bioelectromagnetics. 37:353–372, 2016. © 2016 The Authors. Bioelectromagnetics Published by Wiley Periodicals, Inc. PMID:27324106
ERIC Educational Resources Information Center
Kane, Michael T.; Mroch, Andrew A.
2010-01-01
In evaluating the relationship between two measures across different groups (i.e., in evaluating "differential validity") it is necessary to examine differences in correlation coefficients and in regression lines. Ordinary least squares (OLS) regression is the standard method for fitting lines to data, but its criterion for optimal fit…
Incremental Net Effects in Multiple Regression
ERIC Educational Resources Information Center
Lipovetsky, Stan; Conklin, Michael
2005-01-01
A regular problem in regression analysis is estimating the comparative importance of the predictors in the model. This work considers the 'net effects', or shares of the predictors in the coefficient of the multiple determination, which is a widely used characteristic of the quality of a regression model. Estimation of the net effects can be a…
Simple and multiple linear regression: sample size considerations.
Hanley, James A
2016-11-01
The suggested "two subjects per variable" (2SPV) rule of thumb in the Austin and Steyerberg article is a chance to bring out some long-established and quite intuitive sample size considerations for both simple and multiple linear regression. This article distinguishes two of the major uses of regression models that imply very different sample size considerations, neither served well by the 2SPV rule. The first is etiological research, which contrasts mean Y levels at differing "exposure" (X) values and thus tends to focus on a single regression coefficient, possibly adjusted for confounders. The second research genre guides clinical practice. It addresses Y levels for individuals with different covariate patterns or "profiles." It focuses on the profile-specific (mean) Y levels themselves, estimating them via linear compounds of regression coefficients and covariates. By drawing on long-established closed-form variance formulae that lie beneath the standard errors in multiple regression, and by rearranging them for heuristic purposes, one arrives at quite intuitive sample size considerations for both research genres. Copyright © 2016 Elsevier Inc. All rights reserved.
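As a rough illustration of the kind of closed-form variance formula the abstract above refers to, the sketch below uses the textbook result for simple linear regression: the standard error of the slope is σ / (s_x · √(n − 1)), where σ is the residual standard deviation and s_x the sample standard deviation of X. This is a sketch under stated assumptions, not the article's own derivation. The function names and input values are hypothetical.

```python
import math


def slope_standard_error(sigma, sd_x, n):
    """SE of the slope in simple linear regression.

    sigma: residual standard deviation (assumed known for planning purposes)
    sd_x:  sample standard deviation of X (n - 1 denominator)
    n:     number of subjects
    """
    return sigma / (sd_x * math.sqrt(n - 1))


def n_for_target_se(sigma, sd_x, target_se):
    """Smallest n whose slope SE does not exceed target_se (formula rearranged)."""
    return math.ceil((sigma / (sd_x * target_se)) ** 2 + 1)


# Hypothetical planning values: residual SD 2, SD of X 1, desired SE 0.5.
n_needed = n_for_target_se(sigma=2.0, sd_x=1.0, target_se=0.5)
print(n_needed, slope_standard_error(2.0, 1.0, n_needed))
```

Rearranged this way, the required sample size depends on the noise-to-spread ratio σ / s_x and the desired precision, rather than on a fixed number of subjects per variable.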
Panel regressions to estimate low-flow response to rainfall variability in ungaged basins
Bassiouni, Maoya; Vogel, Richard M.; Archfield, Stacey A.
2016-01-01
Multicollinearity and omitted-variable bias are major limitations to developing multiple linear regression models to estimate streamflow characteristics in ungaged areas and varying rainfall conditions. Panel regression is used to overcome limitations of traditional regression methods, and obtain reliable model coefficients, in particular to understand the elasticity of streamflow to rainfall. Using annual rainfall and selected basin characteristics at 86 gaged streams in the Hawaiian Islands, regional regression models for three stream classes were developed to estimate the annual low-flow duration discharges. Three panel-regression structures (random effects, fixed effects, and pooled) were compared to traditional regression methods, in which space is substituted for time. Results indicated that panel regression generally was able to reproduce the temporal behavior of streamflow and reduce the standard errors of model coefficients compared to traditional regression, even for models in which the unobserved heterogeneity between streams is significant and the variance inflation factor for rainfall is much greater than 10. This is because both spatial and temporal variability were better characterized in panel regression. In a case study, regional rainfall elasticities estimated from panel regressions were applied to ungaged basins on Maui, using available rainfall projections to estimate plausible changes in surface-water availability and usable stream habitat for native species. The presented panel-regression framework is shown to offer benefits over existing traditional hydrologic regression methods for developing robust regional relations to investigate streamflow response in a changing climate.
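The fixed-effects structure named in the abstract above can be sketched with the standard "within" estimator: each stream's observations are demeaned by that stream's own averages before a single common slope is fitted, so time-invariant differences between streams drop out. The code is a minimal single-predictor sketch, not the authors' model. The function name, stream labels, and numbers are hypothetical.

```python
def within_slope(panel):
    """Fixed-effects (within) slope for one predictor.

    panel: dict mapping a unit id to a list of (x, y) observations over time.
    Each unit is demeaned by its own means, then one common slope is fitted.
    """
    numerator = 0.0
    denominator = 0.0
    for observations in panel.values():
        mean_x = sum(x for x, _ in observations) / len(observations)
        mean_y = sum(y for _, y in observations) / len(observations)
        for x, y in observations:
            numerator += (x - mean_x) * (y - mean_y)
            denominator += (x - mean_x) ** 2
    return numerator / denominator


# Hypothetical data: two streams with different intercepts but the same slope of 2.
panel = {
    "stream_a": [(1, 12), (2, 14), (3, 16)],  # y = 10 + 2x
    "stream_b": [(4, 8), (5, 10), (6, 12)],   # y = 0 + 2x
}
print(within_slope(panel))
```

A pooled fit over the same six points would mix the intercept difference into the slope, which is the omitted-variable problem the within transformation is meant to avoid.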
Panel regressions to estimate low-flow response to rainfall variability in ungaged basins
NASA Astrophysics Data System (ADS)
Bassiouni, Maoya; Vogel, Richard M.; Archfield, Stacey A.
2016-12-01
Multicollinearity and omitted-variable bias are major limitations to developing multiple linear regression models to estimate streamflow characteristics in ungaged areas and varying rainfall conditions. Panel regression is used to overcome limitations of traditional regression methods, and obtain reliable model coefficients, in particular to understand the elasticity of streamflow to rainfall. Using annual rainfall and selected basin characteristics at 86 gaged streams in the Hawaiian Islands, regional regression models for three stream classes were developed to estimate the annual low-flow duration discharges. Three panel-regression structures (random effects, fixed effects, and pooled) were compared to traditional regression methods, in which space is substituted for time. Results indicated that panel regression generally was able to reproduce the temporal behavior of streamflow and reduce the standard errors of model coefficients compared to traditional regression, even for models in which the unobserved heterogeneity between streams is significant and the variance inflation factor for rainfall is much greater than 10. This is because both spatial and temporal variability were better characterized in panel regression. In a case study, regional rainfall elasticities estimated from panel regressions were applied to ungaged basins on Maui, using available rainfall projections to estimate plausible changes in surface-water availability and usable stream habitat for native species. The presented panel-regression framework is shown to offer benefits over existing traditional hydrologic regression methods for developing robust regional relations to investigate streamflow response in a changing climate.
Wang, Shuang; Jiang, Xiaoqian; Wu, Yuan; Cui, Lijuan; Cheng, Samuel; Ohno-Machado, Lucila
2013-06-01
We developed an EXpectation Propagation LOgistic REgRession (EXPLORER) model for distributed privacy-preserving online learning. The proposed framework provides a high level guarantee for protecting sensitive information, since the information exchanged between the server and the client is the encrypted posterior distribution of coefficients. Through experimental results, EXPLORER shows the same performance (e.g., discrimination, calibration, feature selection, etc.) as the traditional frequentist logistic regression model, but provides more flexibility in model updating. That is, EXPLORER can be updated one point at a time rather than having to retrain the entire data set when new observations are recorded. The proposed EXPLORER supports asynchronized communication, which relieves the participants from coordinating with one another, and prevents service breakdown from the absence of participants or interrupted communications. Copyright © 2013 Elsevier Inc. All rights reserved.
Aziz, Shamsul Akmar Ab; Nuawi, Mohd Zaki; Nor, Mohd Jailani Mohd
2015-01-01
The objective of this study was to present a new method for determination of hand-arm vibration (HAV) in Malaysian Army (MA) three-tonne truck steering wheels based on changes in vehicle speed using regression model and the statistical analysis method known as Integrated Kurtosis-Based Algorithm for Z-Notch Filter Technique Vibro (I-kaz Vibro). The test was conducted for two different road conditions, tarmac and dirt roads. HAV exposure was measured using a Brüel & Kjær Type 3649 vibration analyzer, which is capable of recording HAV exposures from steering wheels. The data was analyzed using I-kaz Vibro to determine the HAV values in relation to varying speeds of a truck and to determine the degree of data scattering for HAV data signals. Based on the results obtained, HAV experienced by drivers can be determined using the daily vibration exposure A(8), I-kaz Vibro coefficient (Ƶ(v)(∞)), and the I-kaz Vibro display. The I-kaz Vibro displays also showed greater scatterings, indicating that the values of Ƶ(v)(∞) and A(8) were increasing. Prediction of HAV exposure was done using the developed regression model and graphical representations of Ƶ(v)(∞). The results of the regression model showed that Ƶ(v)(∞) increased when the vehicle speed and HAV exposure increased. For model validation, predicted and measured noise exposures were compared, and high coefficient of correlation (R²) values were obtained, indicating that good agreement was obtained between them. By using the developed regression model, we can easily predict HAV exposure from steering wheels for HAV exposure monitoring.
Hatta, Takeshi; Kato, Kimiko; Hotta, Chie; Higashikawa, Mari; Iwahara, Akihiko; Hatta, Taketoshi; Hatta, Junko; Fujiwara, Kazumi; Nagahara, Naoko; Ito, Emi; Hamajima, Nobuyuki
2017-01-01
The validity of Bucur and Madden's (2010) proposal that an age-related decline is particularly pronounced in executive function measures rather than in elementary perceptual speed measures was examined via the Yakumo Study longitudinal database. Their proposal suggests that cognitive load differentially affects cognitive abilities in older adults. To address their proposal, linear regression coefficients of 104 participants were calculated individually for the digit cancellation task 1 (D-CAT1), where participants search for a given single digit, and the D-CAT3, where they search for 3 digits simultaneously. Therefore, it can be conjectured that the D-CAT1 represents primarily elementary perceptual speed and low-visual search load task, whereas the D-CAT3 represents primarily executive function and high-visual search load task. Regression coefficients from age 65 to 75 for the D-CAT3 showed a significantly steeper decline than that for the D-CAT1, and a large number of participants showed this tendency. These results support the proposal by Bucur and Madden (2010) and suggest that the degree of cognitive load affects age-related cognitive decline.
Distributed Monitoring of the R(sup 2) Statistic for Linear Regression
NASA Technical Reports Server (NTRS)
Bhaduri, Kanishka; Das, Kamalika; Giannella, Chris R.
2011-01-01
The problem of monitoring a multivariate linear regression model is relevant in studying the evolving relationship between a set of input variables (features) and one or more dependent target variables. This problem becomes challenging for large-scale data in a distributed computing environment when only a subset of instances is available at individual nodes and the local data changes frequently. Data centralization and periodic model recomputation can add high overhead to tasks like anomaly detection in such dynamic settings. Therefore, the goal is to develop techniques for monitoring and updating the model over the union of all nodes' data in a communication-efficient fashion. Correctness guarantees on such techniques are also often highly desirable, especially in safety-critical application scenarios. In this paper we develop DReMo, a distributed algorithm with very low resource overhead, for monitoring the quality of a regression model in terms of its coefficient of determination (R2 statistic). When the nodes collectively determine that R2 has dropped below a fixed threshold, the linear regression model is recomputed via a network-wide convergecast and the updated model is broadcast back to all nodes. We show empirically, using both synthetic and real data, that our proposed method is highly communication-efficient and scalable, and also provide theoretical guarantees on correctness.
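As a rough illustration of why R2 over the union of all nodes' data does not require centralizing the raw records, the sketch below combines four per-node sums into a global R2 for a fixed coefficient vector. This is not the DReMo algorithm (which avoids even this exchange most of the time); the function names `local_stats` and `global_r2` are hypothetical.

```python
import numpy as np

def local_stats(X, y, beta):
    """Per-node summary: count, sum of y, sum of y^2, residual sum of squares."""
    resid = y - X @ beta
    return np.array([len(y), y.sum(), (y ** 2).sum(), (resid ** 2).sum()])

def global_r2(stats):
    """Combine per-node summaries into R^2 over the union of all nodes' data."""
    n, sy, syy, sse = np.sum(stats, axis=0)
    sst = syy - sy ** 2 / n  # total sum of squares about the global mean of y
    return 1.0 - sse / sst
```

Each node would send only its four numbers, so the communication cost of this naive scheme is independent of the local sample size.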
Li, Ji; Gray, B.R.; Bates, D.M.
2008-01-01
Partitioning the variance of a response by design levels is challenging for binomial and other discrete outcomes. Goldstein (2003) proposed four definitions for variance partitioning coefficients (VPC) under a two-level logistic regression model. In this study, we explicitly derived formulae for the multi-level logistic regression model and subsequently studied the distributional properties of the calculated VPCs. Using simulations and a vegetation dataset, we demonstrated associations between different VPC definitions, the importance of methods for estimating VPCs (by comparing VPCs obtained using Laplace and penalized quasilikelihood methods), and bivariate dependence between VPCs calculated at different levels. Such an empirical study lends immediate support to wider applications of VPC in scientific data analysis.
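For orientation, one widely used VPC definition for a two-level random-intercept logistic model is the latent-variable form, written out below. The notation ($u_j$, $\sigma_u^2$) is assumed here for illustration and is not taken from the abstract, which does not state its formulae.

```latex
\operatorname{logit}(p_{ij}) = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_j,
\qquad u_j \sim N(0, \sigma_u^2),
\qquad
\mathrm{VPC}_{\text{latent}} = \frac{\sigma_u^2}{\sigma_u^2 + \pi^2/3}
```

The constant $\pi^2/3 \approx 3.29$ is the variance of the standard logistic distribution, which plays the role of the level-1 residual variance on the latent scale.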
Interquantile Shrinkage in Regression Models
Jiang, Liewen; Wang, Huixia Judy; Bondell, Howard D.
2012-01-01
Conventional analysis using quantile regression typically focuses on fitting the regression model at different quantiles separately. However, in situations where the quantile coefficients share some common feature, joint modeling of multiple quantiles to accommodate the commonality often leads to more efficient estimation. One example of common features is that a predictor may have a constant effect over one region of quantile levels but varying effects in other regions. To automatically perform estimation and detection of the interquantile commonality, we develop two penalization methods. When the quantile slope coefficients indeed do not change across quantile levels, the proposed methods will shrink the slopes towards constant and thus improve the estimation efficiency. We establish the oracle properties of the two proposed penalization methods. Through numerical investigations, we demonstrate that the proposed methods lead to estimations with competitive or higher efficiency than the standard quantile regression estimation in finite samples. Supplemental materials for the article are available online. PMID:24363546
Remote sensing of PM2.5 from ground-based optical measurements
NASA Astrophysics Data System (ADS)
Li, S.; Joseph, E.; Min, Q.
2014-12-01
Remote sensing of particulate matter concentration with aerodynamic diameter smaller than 2.5 µm (PM2.5) by using ground-based optical measurements of aerosols is investigated based on 6 years of hourly average measurements of aerosol optical properties, PM2.5, ceilometer backscatter coefficients and meteorological factors from the Howard University Beltsville Campus facility (HUBC). The accuracy of quantitative retrieval of PM2.5 using aerosol optical depth (AOD) is limited due to changes in aerosol size distribution and vertical distribution. In this study, ceilometer backscatter coefficients are used to provide vertical information of aerosol. It is found that the PM2.5-AOD ratio can vary widely for different aerosol vertical distributions. The ratio is also sensitive to mode parameters of the bimodal lognormal aerosol size distribution when the geometric mean radius for the fine mode is small. Two Angstrom exponents calculated at three wavelengths (415, 500 and 860 nm) are found to represent aerosol size distributions better than a single Angstrom exponent. A regression model is proposed to assess the impacts of different factors on the retrieval of PM2.5. Compared to a simple linear regression model, the new model combining AOD and ceilometer backscatter can markedly improve the fitting of PM2.5. The contribution of further introducing Angstrom coefficients is apparent. Using combined measurements of AOD, ceilometer backscatter, Angstrom coefficients and meteorological parameters in the regression model yields a correlation coefficient of 0.79 between fitted and expected PM2.5.
Bowen, Stephen R; Chappell, Richard J; Bentzen, Søren M; Deveau, Michael A; Forrest, Lisa J; Jeraj, Robert
2012-01-01
Purpose: To quantify associations between pre-radiotherapy and post-radiotherapy PET parameters via spatially resolved regression. Materials and methods: Ten canine sinonasal cancer patients underwent PET/CT scans of [18F]FDG (FDGpre), [18F]FLT (FLTpre), and [61Cu]Cu-ATSM (Cu-ATSMpre). Following radiotherapy regimens of 50 Gy in 10 fractions, veterinary patients underwent FDG PET/CT scans at three months (FDGpost). Regression of standardized uptake values in baseline FDGpre, FLTpre and Cu-ATSMpre tumour voxels to those in FDGpost images was performed for linear, log-linear, generalized-linear and mixed-fit linear models. Goodness-of-fit in regression coefficients was assessed by R2. Hypothesis testing of coefficients over the patient population was performed. Results: Multivariate linear model fits of FDGpre to FDGpost were significantly positive over the population (FDGpost ~ 0.17 FDGpre, p=0.03), and classified slopes of RECIST non-responders and responders to be different (0.37 vs. 0.07, p=0.01). Generalized-linear model fits related FDGpre to FDGpost by a linear power law (FDGpost ~ FDGpre^0.93, p<0.001). Univariate mixture model fits of FDGpre improved R2 from 0.17 to 0.52. Neither baseline FLT PET nor Cu-ATSM PET uptake contributed statistically significant multivariate regression coefficients. Conclusions: Spatially resolved regression analysis indicates that pre-treatment FDG PET uptake is most strongly associated with three-month post-treatment FDG PET uptake in this patient population, though associations are histopathology-dependent. PMID:22682748
Bootstrap Methods: A Very Leisurely Look.
ERIC Educational Resources Information Center
Hinkle, Dennis E.; Winstead, Wayland H.
The Bootstrap method, a computer-intensive statistical method of estimation, is illustrated using a simple and efficient Statistical Analysis System (SAS) routine. The utility of the method for generating unknown parameters, including standard errors for simple statistics, regression coefficients, discriminant function coefficients, and factor…
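The report illustrates the method with a SAS routine, which is not reproduced here. As a stand-in, a minimal Python sketch of a case-resampling bootstrap for the standard error of a simple regression slope might look as follows; the function name and defaults are hypothetical.

```python
import numpy as np

def bootstrap_slope_se(x, y, n_boot=2000, seed=0):
    """Bootstrap standard error of the slope in a simple linear regression.

    Resamples (x, y) pairs with replacement, refits the line each time,
    and returns the standard deviation of the resampled slopes.
    """
    rng = np.random.default_rng(seed)
    n = len(x)
    slopes = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # indices drawn with replacement
        slopes[b] = np.polyfit(x[idx], y[idx], 1)[0]
    return slopes.std(ddof=1)
```

Resampling whole cases, rather than residuals, is one of several bootstrap schemes; it makes no assumption that the error variance is constant.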
NASA Technical Reports Server (NTRS)
Hoepffner, Nicolas; Sathyendranath, Shubha
1993-01-01
The contributions of detrital particles and phytoplankton to total light absorption are retrieved by nonlinear regression on the absorption spectra of total particles from various oceanic regions. The model used explains more than 96% of the variance in the observed particle absorption spectra. The resulting absorption spectra of phytoplankton are then decomposed into several Gaussian bands reflecting absorption by phytoplankton pigments. Such a decomposition, combined with high-performance liquid chromatography data on phytoplankton pigment concentrations, allows the computation of specific absorption coefficients for chlorophylls a, b, and c and carotenoids. The spectral values of these in vivo absorption coefficients are then discussed, considering the effects of secondary pigments which were not measured quantitatively. We show that these coefficients can be used to reconstruct the absorption spectra of phytoplankton at various locations and depths. Discrepancies that do occur at some stations are explained in terms of particle size effect. These coefficients can be used to determine the concentrations of phytoplankton pigments in the water, given the absorption spectrum of total particles.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dierauf, Timothy; Kurtz, Sarah; Riley, Evan
This paper provides a recommended method for evaluating the AC capacity of a photovoltaic (PV) generating station. It also presents companion guidance on setting the facility's capacity guarantee value. This is a principles-based approach that incorporates plant fundamental design parameters such as loss factors, module coefficients, and inverter constraints. This method has been used to prove contract guarantees for over 700 MW of installed projects. The method is transparent, and the results are deterministic. In contrast, current industry practices incorporate statistical regression where the empirical coefficients may only characterize the collected data. Though these methods may work well when extrapolation is not required, there are other situations where the empirical coefficients may not adequately model actual performance. This proposed Fundamentals Approach method provides consistent results even where regression methods start to lose fidelity.
A New Test of Linear Hypotheses in OLS Regression under Heteroscedasticity of Unknown Form
ERIC Educational Resources Information Center
Cai, Li; Hayes, Andrew F.
2008-01-01
When the errors in an ordinary least squares (OLS) regression model are heteroscedastic, hypothesis tests involving the regression coefficients can have Type I error rates that are far from the nominal significance level. Asymptotically, this problem can be rectified with the use of a heteroscedasticity-consistent covariance matrix (HCCM)…
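To make the idea of an HCCM concrete, the sketch below computes White's HC0 sandwich estimator, the simplest member of the family. The abstract is truncated and does not say which variants the paper studies, so HC0 is an assumption chosen for illustration, and `hc0_covariance` is a hypothetical name.

```python
import numpy as np

def hc0_covariance(X, y):
    """OLS coefficients and the HC0 heteroscedasticity-consistent covariance.

    HC0 = (X'X)^-1 X' diag(e_i^2) X (X'X)^-1, where e_i are OLS residuals.
    X must already contain a column of ones if an intercept is wanted.
    """
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    meat = X.T @ (X * resid[:, None] ** 2)  # X' diag(e^2) X without forming diag
    return beta, XtX_inv @ meat @ XtX_inv
```

Robust standard errors are the square roots of the diagonal of the returned matrix.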
NASA Astrophysics Data System (ADS)
Keat, Sim Chong; Chun, Beh Boon; San, Lim Hwee; Jafri, Mohd Zubir Mat
2015-04-01
Climate change due to carbon dioxide (CO2) emissions is one of the most complex challenges threatening our planet. This issue is considered a major international concern and is primarily attributed to the use of different fossil fuels. In this paper, a regression model is used to analyze the causal relationship among CO2 emissions based on energy consumption in Malaysia, using time series data for the period 1980-2010. The equations were developed using a regression model based on the eight major sources that contribute to CO2 emissions, such as non-energy, Liquefied Petroleum Gas (LPG), diesel, kerosene, refinery gas, Aviation Turbine Fuel (ATF) and Aviation Gasoline (AV Gas), fuel oil and motor petrol. Part of the data (1980-2000) was used to fit the regression model and part (2001-2010) was used to validate it. Comparison of the model predictions with the measured data showed a high correlation coefficient (R2=0.9544), indicating the model's accuracy and efficiency. These results are accurate and can be used in early warning of the population to comply with air quality standards.
Nattee, Cholwich; Khamsemanan, Nirattaya; Lawtrakul, Luckhana; Toochinda, Pisanu; Hannongbua, Supa
2017-01-01
Malaria is still one of the most serious diseases in tropical regions. This is due in part to the high resistance against available drugs for the inhibition of parasites, Plasmodium, the cause of the disease. New potent compounds with high clinical utility are urgently needed. In this work, we created a novel model using a regression tree to study structure-activity relationships and predict the inhibition constant, Ki, of three different antimalarial analogues (Trimethoprim, Pyrimethamine, and Cycloguanil) based on their molecular descriptors. To the best of our knowledge, this work is the first attempt to study the structure-activity relationships of all three analogues combined. The most relevant descriptors and appropriate parameters of the regression tree are harvested using extremely randomized trees. These descriptors are water accessible surface area, Log of the aqueous solubility, total hydrophobic van der Waals surface area, and molecular refractivity. Out of all possible combinations of these selected parameters and descriptors, the tree with the strongest coefficient of determination is selected to be our prediction model. Predicted Ki values from the proposed model show a strong coefficient of determination, R2 = 0.996, to experimental Ki values. From the structure of the regression tree, compounds with high accessible surface area of all hydrophobic atoms (ASA_H) and low aqueous solubility of inhibitors (Log S) generally possess low Ki values. Our prediction model can also be utilized as a screening test for new antimalarial drug compounds which may reduce the time and expenses for new drug development. New compounds with high predicted Ki should be excluded from further drug development. It is also our inference that a threshold of ASA_H greater than 575.80 and Log S less than or equal to -4.36 is a sufficient condition for a new compound to possess a low Ki. Copyright © 2016 Elsevier Inc. All rights reserved.
Belief in complementary and alternative medicine is related to age and paranormal beliefs in adults.
Van den Bulck, Jan; Custers, Kathleen
2010-04-01
The use of complementary and alternative medicine (CAM) is widespread, even among people who use conventional medicine. Positive beliefs about CAM are common among physicians and medical students. Little is known about the beliefs regarding CAM among the general public. Among science students, belief in CAM was predicted by belief in the paranormal. In a cross-sectional study, 712 randomly selected adults (>18 years old) responded to the CAM Health Belief Questionnaire (CHBQ) and a paranormal beliefs scale. CAM beliefs were very prevalent in this sample of adult Flemish men and women. Zero-order correlations indicated that belief in CAM was associated with age (r = 0.173, P < 0.001), level of education (r = -0.079, P = 0.039), social desirability (r = -0.119, P = 0.002) and paranormal belief (r = 0.365, P < 0.001). In a multivariate model, two variables predicted CAM beliefs. Support for CAM increased with age (regression coefficient: 0.01; 95% confidence interval (CI): 0.006 to 0.014), but the strongest relationship existed between support for CAM and beliefs in the paranormal. Paranormal beliefs accounted for 14% of the variance of the CAM beliefs (regression coefficient: 0.376; 95% CI: 0.30-0.44). The level of education (regression coefficient: 0.06; 95% CI: -0.014-0.129) and social desirability (regression coefficient: -0.023; 95% CI: -0.048-0.026) did not make a significant contribution to the explained variance (<0.1%, P = 0.867). Support of CAM was very prevalent in this Flemish adult population. CAM beliefs were strongly associated with paranormal beliefs.
Control of interior surface materials for speech privacy in high-speed train cabins.
Jang, H S; Lim, H; Jeon, J Y
2017-05-01
The effect of interior materials with various absorption coefficients on speech privacy was investigated in a 1:10 scale model of one high-speed train cabin geometry. The speech transmission index (STI) and privacy distance (rP) were measured in the train cabin to quantify speech privacy. Measurement cases were selected for the ceiling, sidewall, and front and back walls and were classified as high-, medium- and low-absorption coefficient cases. Interior materials with high absorption coefficients yielded a low rP, and the ceiling had the largest impact on both the STI and rP among the interior elements. Combinations of the three cases were measured, and the maximum reduction in rP by the absorptive surfaces was 2.4 m, which exceeds the space between two rows of chairs in the high-speed train. Additionally, the contribution of the interior elements to speech privacy was analyzed using recorded impulse responses and a multiple regression model for rP using the equivalent absorption area. The analysis confirmed that the ceiling was the most important interior element for improving speech privacy. These results can be used to find the relative decrease in rP in the acoustic design of interior materials to improve speech privacy in train cabins. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Pabari, Ritesh M; Ramtoola, Zebunnissa
2012-07-01
A two-factor, three-level (3^2) face-centred central composite design (CCD) was applied to investigate the main and interaction effects of tablet diameter and compression force (CF) on hardness, disintegration time (DT) and porosity of mannitol-based orodispersible tablets (ODTs). Tablet diameters of 10, 13 and 15 mm, and CF of 10, 15 and 20 kN were studied. Results of multiple linear regression analysis show that both the tablet diameter and CF influence tablet characteristics. A negative value of regression coefficient for tablet diameter showed an inverse relationship with hardness and DT. A positive value of regression coefficient for CF indicated an increase in hardness and DT with increasing CF as a result of the decrease in tablet porosity. Interestingly, at the larger tablet diameter of 15 mm, while hardness increased and porosity decreased with an increase in CF, the DT was resistant to change. The optimised combination was a tablet of 15 mm diameter compressed at 15 kN showing a rapid DT of 37.7 s and high hardness of 71.4 N. Using these parameters, ODTs containing ibuprofen showed no significant change in DT (ANOVA; p>0.05) irrespective of the hydrophobicity of the ibuprofen. Copyright © 2012 Elsevier B.V. All rights reserved.
Effects of integration time on in-water radiometric profiles.
D'Alimonte, Davide; Zibordi, Giuseppe; Kajiyama, Tamito
2018-03-05
This work investigates the effects of integration time on in-water downward irradiance Ed, upward irradiance Eu and upwelling radiance Lu profile data acquired with free-fall hyperspectral systems. Analyzed quantities are the subsurface value and the diffuse attenuation coefficient derived by applying linear and non-linear regression schemes. Case studies include oligotrophic waters (Case-1), as well as waters dominated by Colored Dissolved Organic Matter (CDOM) and Non-Algal Particles (NAP). Assuming a 24-bit digitization, measurements resulting from the accumulation of photons over integration times varying between 8 and 2048 ms are evaluated at depths corresponding to: 1) the beginning of each integration interval (Fst); 2) the end of each integration interval (Lst); 3) the averages of Fst and Lst values (Avg); and finally 4) the values weighted accounting for the diffuse attenuation coefficient of water (Wgt). Statistical figures show that the effects of integration time can bias results well above 5% as a function of the depth definition. Results indicate the validity of the Wgt depth definition and the fair applicability of the Avg one. Instead, both the Fst and Lst depths should not be adopted since they may introduce pronounced biases in Eu and Lu regression products for highly absorbing waters. Finally, the study reconfirms the relevance of combining multiple radiometric casts into a single profile to increase precision of regression products.
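The linear regression scheme mentioned above can be sketched as a straight-line fit to log-transformed profile data, assuming the usual exponential decay model E(z) = E0 * exp(-K * z). The function name `fit_profile` is hypothetical, and the paper's own schemes (including the non-linear one and the depth weighting) are not reproduced here.

```python
import numpy as np

def fit_profile(depth_m, irradiance):
    """Log-linear fit of E(z) = E0 * exp(-K * z).

    Returns (E0, K): the subsurface value and the diffuse attenuation
    coefficient (per metre), from a straight-line fit of ln(E) against depth.
    """
    slope, intercept = np.polyfit(depth_m, np.log(irradiance), 1)
    return float(np.exp(intercept)), float(-slope)
```

Which depth is paired with each reading (start, end, average or weighted value over the integration interval) changes `depth_m` and therefore both fitted quantities, which is the effect the study quantifies.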
Lim, Jongguk; Kim, Giyoung; Mo, Changyeun; Kim, Moon S; Chao, Kuanglin; Qin, Jianwei; Fu, Xiaping; Baek, Insuck; Cho, Byoung-Kwan
2016-05-01
Illegal use of nitrogen-rich melamine (C3H6N6) to boost perceived protein content of food products such as milk, infant formula, frozen yogurt, pet food, biscuits, and coffee drinks has caused serious food safety problems. Conventional methods to detect melamine in foods, such as Enzyme-linked immunosorbent assay (ELISA), High-performance liquid chromatography (HPLC), and Gas chromatography-mass spectrometry (GC-MS), are sensitive but they are time-consuming, expensive, and labor-intensive. In this research, near-infrared (NIR) hyperspectral imaging technique combined with regression coefficient of partial least squares regression (PLSR) model was used to detect melamine particles in milk powders easily and quickly. NIR hyperspectral reflectance imaging data in the spectral range of 990-1700nm were acquired from melamine-milk powder mixture samples prepared at various concentrations ranging from 0.02% to 1%. PLSR models were developed to correlate the spectral data (independent variables) with melamine concentration (dependent variables) in melamine-milk powder mixture samples. PLSR models applying various pretreatment methods were used to reconstruct the two-dimensional PLS images. PLS images were converted to the binary images to detect the suspected melamine pixels in milk powder. As the melamine concentration was increased, the numbers of suspected melamine pixels of binary images were also increased. These results suggested that NIR hyperspectral imaging technique and the PLSR model can be regarded as an effective tool to detect melamine particles in milk powders. Copyright © 2016 Elsevier B.V. All rights reserved.
Characteristics of low-slope streams that affect O2 transfer rates
Parker, Gene W.; Desimone, Leslie A.
1991-01-01
Multiple-regression techniques were used to derive the reaeration-coefficient estimating equation for low-sloped streams: K2 = 3.83 MBAS^(-0.41) SL^(0.20) H^(-0.76), where K2 is the reaeration coefficient in base e units per day; MBAS is the methylene blue active substances concentration in milligrams per liter; SL is the water-surface slope in foot per foot; and H is the mean-flow depth in feet. Fourteen hydraulic, physical, and water-quality characteristics were regressed against 29 measured reaeration coefficients for low-sloped (water-surface slopes less than 0.002 foot per foot) streams in Massachusetts and New York. Reaeration coefficients measured from May 1985 to October 1988 ranged from 0.2 to 11.0 base e units per day for 29 low-sloped tracer studies. Concentration of methylene blue active substances is significant because it is thought to be an indicator of concentration of surfactants which could change the surface tension at the air-water interface.
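The estimating equation can be evaluated directly. The short sketch below encodes it as a function, with the exponents read as -0.41, 0.20 and -0.76 from the equation in the abstract; the function and argument names are hypothetical.

```python
def reaeration_coefficient(mbas_mg_per_l, slope_ft_per_ft, depth_ft):
    """K2 (base e, per day) = 3.83 * MBAS^-0.41 * SL^0.20 * H^-0.76.

    Intended only for low-sloped streams (water-surface slope below
    0.002 ft/ft), the range over which the regression was fitted.
    """
    return (3.83
            * mbas_mg_per_l ** -0.41
            * slope_ft_per_ft ** 0.20
            * depth_ft ** -0.76)
```

The signs of the exponents mean that K2 falls as MBAS concentration or depth increases and rises with slope.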
Moderation analysis using a two-level regression model.
Yuan, Ke-Hai; Cheng, Ying; Maxwell, Scott
2014-10-01
Moderation analysis is widely used in social and behavioral research. The most commonly used model for moderation analysis is moderated multiple regression (MMR) in which the explanatory variables of the regression model include product terms, and the model is typically estimated by least squares (LS). This paper argues for a two-level regression model in which the regression coefficients of a criterion variable on predictors are further regressed on moderator variables. An algorithm for estimating the parameters of the two-level model by normal-distribution-based maximum likelihood (NML) is developed. Formulas for the standard errors (SEs) of the parameter estimates are provided and studied. Results indicate that, when heteroscedasticity exists, NML with the two-level model gives more efficient and more accurate parameter estimates than the LS analysis of the MMR model. When error variances are homoscedastic, NML with the two-level model leads to essentially the same results as LS with the MMR model. Most importantly, the two-level regression model permits estimating the percentage of variance of each regression coefficient that is due to moderator variables. When applied to data from General Social Surveys 1991, NML with the two-level model identified a significant moderation effect of race on the regression of job prestige on years of education while LS with the MMR model did not. An R package is also developed and documented to facilitate the application of the two-level model.
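For reference, the baseline MMR model that the paper compares against can be sketched as an ordinary least-squares fit with a product term. This shows only the LS/MMR approach, not the two-level NML estimator or the authors' R package; `fit_mmr` is a hypothetical name.

```python
import numpy as np

def fit_mmr(x, z, y):
    """Least-squares fit of y = b0 + b1*x + b2*z + b3*x*z.

    x is the predictor, z the moderator; b3 is the moderation
    (interaction) coefficient. Returns [b0, b1, b2, b3].
    """
    design = np.column_stack([np.ones_like(x), x, z, x * z])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef
```

In this formulation the slope of y on x is b1 + b3*z, which is the sense in which z moderates the relationship.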
Fuel Regression Rate Behavior of CAMUI Hybrid Rocket
NASA Astrophysics Data System (ADS)
Kaneko, Yudai; Itoh, Mitsunori; Kakikura, Akihito; Mori, Kazuhiro; Uejima, Kenta; Nakashima, Takuji; Wakita, Masashi; Totani, Tsuyoshi; Oshima, Nobuyuki; Nagata, Harunori
A series of static firing tests was conducted to investigate the fuel regression characteristics of a Cascaded Multistage Impinging-jet (CAMUI) type hybrid rocket motor. A CAMUI type hybrid rocket uses the combination of liquid oxygen and a fuel grain made of polyethylene as a propellant. The collision distance divided by the port diameter, H/D, was varied to investigate the effect of the grain geometry on the fuel regression rate. As a result, the H/D geometry has little effect on the regression rate near the stagnation point, where the heat transfer coefficient is high. On the contrary, the fuel regression rate decreases near the circumference of the forward-end face and the backward-end face of fuel blocks. Besides the experimental approaches, a method of computational fluid dynamics clarified the heat transfer distribution on the grain surface with various H/D geometries. The calculation shows the decrease of the flow velocity due to the increase of H/D on the area where the fuel regression rate decreases with the increase of H/D. To estimate the exact fuel consumption, which is necessary to design a fuel grain, real-time measurement by an ultrasonic pulse-echo method was performed.
Censored quantile regression with recursive partitioning-based weights
Wey, Andrew; Wang, Lan; Rudser, Kyle
2014-01-01
Censored quantile regression provides a useful alternative to the Cox proportional hazards model for analyzing survival data. It directly models the conditional quantile of the survival time and hence is easy to interpret. Moreover, it relaxes the proportionality constraint on the hazard function associated with the popular Cox model and is natural for modeling heterogeneity of the data. Recently, Wang and Wang (2009. Locally weighted censored quantile regression. Journal of the American Statistical Association 103, 1117–1128) proposed a locally weighted censored quantile regression approach that allows for covariate-dependent censoring and is less restrictive than other censored quantile regression methods. However, their kernel smoothing-based weighting scheme requires all covariates to be continuous and encounters practical difficulty with even a moderate number of covariates. We propose a new weighting approach that uses recursive partitioning, e.g. survival trees, that offers greater flexibility in handling covariate-dependent censoring in moderately high dimensions and can incorporate both continuous and discrete covariates. We prove that this new weighting scheme leads to consistent estimation of the quantile regression coefficients and demonstrate its effectiveness via Monte Carlo simulations. We also illustrate the new method using a widely recognized data set from a clinical trial on primary biliary cirrhosis. PMID:23975800
Estimation of PM2.5 and PM10 using ground-based AOD measurements during KORUS-AQ campaign
NASA Astrophysics Data System (ADS)
Koo, J. H.; Kim, J.; Kim, S.; Go, S.; Lee, S.; Lee, H.; Mok, J.; Hong, J.; Lee, J.; Eck, T. F.; Holben, B. N.
2017-12-01
During the KORUS-AQ campaign (2 May - 12 June, 2016), aerosol optical depth (AOD) was obtained at multiple channels using various ground-based instruments at Yonsei University, Seoul: AERONET sunphotometer, SKYNET skyradiometer, Brewer spectrophotometer, and multi-filter rotating shadowband radiometer (MFRSR). At the same location, planetary boundary layer (PBL) height and vertical profiles of backscattering coefficients can also be obtained from ceilometer measurements. Using ceilometer products and various AODs, we estimate the amount of particulate matter (PM2.5 and PM10) and validate the estimates against in-situ surface PM2.5 and PM10 measurements from the AIRKOREA network. Direct comparison between PM2.5 and AOD reveals that the ultraviolet (UV) channel AOD has better correlations, due to the higher sensitivity of short wavelengths to fine-mode particles. In contrast, PM10 shows the highest correlation with the near-infrared (NIR) AOD. Next, we extract the boundary-layer portion of AOD using either PBL height or the vertical profile of backscattering coefficients to compare with PM2.5 and PM10. Both approaches enhance the correlation, but a weighting factor calculated from backscattering coefficients contributes more to the correlation increase. Finally, we performed multiple linear regression to estimate PM2.5 and PM10 using AODs. Including meteorology (temperature, wind speed, and relative humidity) enhances the correlation, and including O3 and NO2 also contributes strongly to the high correlation. This finding implies that it is important to consider the ambient conditions of secondary aerosol formation related to PM2.5 variation. The multiple regression model finally reaches correlations of 0.7-0.8 and diminishes the wavelength-dependent correlation patterns.
Measurement of effective air diffusion coefficients for trichloroethene in undisturbed soil cores.
Bartelt-Hunt, Shannon L; Smith, James A
2002-06-01
In this study, we measure effective diffusion coefficients for trichloroethene in undisturbed soil samples taken from Picatinny Arsenal, New Jersey. The measured effective diffusion coefficients ranged from 0.0053 to 0.0609 cm2/s over a range of air-filled porosity of 0.23-0.49. The experimental data were compared to several previously published relations that predict diffusion coefficients as a function of air-filled porosity and porosity. A multiple linear regression analysis was developed to determine if a modification of the exponents in Millington's [Science 130 (1959) 100] relation would better fit the experimental data. The literature relations appeared to generally underpredict the effective diffusion coefficient for the soil cores studied in this work. Inclusion of a particle-size distribution parameter, d10, did not significantly improve the fit of the linear regression equation. The effective diffusion coefficient and porosity data were used to recalculate estimates of diffusive flux through the subsurface made in a previous study performed at the field site. It was determined that the method of calculation used in the previous study resulted in an underprediction of diffusive flux from the subsurface. We conclude that although Millington's [Science 130 (1959) 100] relation works well to predict effective diffusion coefficients in homogeneous soils with relatively uniform particle-size distributions, it may be inaccurate for many natural soils with heterogeneous structure and/or non-uniform particle-size distributions.
Smooth Scalar-on-Image Regression via Spatial Bayesian Variable Selection
Goldsmith, Jeff; Huang, Lei; Crainiceanu, Ciprian M.
2013-01-01
We develop scalar-on-image regression models when images are registered multidimensional manifolds. We propose a fast and scalable Bayes inferential procedure to estimate the image coefficient. The central idea is the combination of an Ising prior distribution, which controls a latent binary indicator map, and an intrinsic Gaussian Markov random field, which controls the smoothness of the nonzero coefficients. The model is fit using a single-site Gibbs sampler, which allows fitting within minutes for hundreds of subjects with predictor images containing thousands of locations. The code is simple and is provided in less than one page in the Appendix. We apply this method to a neuroimaging study where cognitive outcomes are regressed on measures of white matter microstructure at every voxel of the corpus callosum for hundreds of subjects. PMID:24729670
Changes in the timing of snowmelt and streamflow in Colorado: A response to recent warming
Clow, David W.
2010-01-01
Trends in the timing of snowmelt and associated runoff in Colorado were evaluated for the 1978-2007 water years using the regional Kendall test (RKT) on daily snow-water equivalent (SWE) data from snowpack telemetry (SNOTEL) sites and daily streamflow data from headwater streams. The RKT is a robust, nonparametric test that provides an increased power of trend detection by grouping data from multiple sites within a given geographic region. The RKT analyses indicated strong, pervasive trends in snowmelt and streamflow timing, which have shifted toward earlier in the year by a median of 2-3 weeks over the 29-yr study period. In contrast, relatively few statistically significant trends were detected using simple linear regression. RKT analyses also indicated that November-May air temperatures increased by a median of 0.9 degrees C decade-1, while 1 April SWE and maximum SWE declined by a median of 4.1 and 3.6 cm decade-1, respectively. Multiple linear regression models were created, using monthly air temperatures, snowfall, latitude, and elevation as explanatory variables to identify major controlling factors on snowmelt timing. The models accounted for 45% of the variance in snowmelt onset, and 78% of the variance in the snowmelt center of mass (when half the snowpack had melted). Variations in springtime air temperature and SWE explained most of the interannual variability in snowmelt timing. Regression coefficients for air temperature were negative, indicating that warm temperatures promote early melt. Regression coefficients for SWE, latitude, and elevation were positive, indicating that abundant snowfall tends to delay snowmelt, and snowmelt tends to occur later at northern latitudes and high elevations. Results from this study indicate that even the mountains of Colorado, with their high elevations and cold snowpacks, are experiencing substantial shifts in the timing of snowmelt and snowmelt runoff toward earlier in the year.
NASA Astrophysics Data System (ADS)
Aheyeva, Viktoryia; Gruzdev, Aleksandr; Grishaev, Mikhail
Data from ground-based measurements of NO2 column contents are analyzed to study winter-spring NO2 anomalies associated with negative anomalies in column ozone and stratospheric temperature. Episodes of significant decrease in column NO2 contents in the winter-spring period of 2011 in the northern hemisphere (NH) were detected at European and Siberian stations of Zvenigorod (55.7°N, Moscow Region) and Tomsk (56.5°N, West Siberia) in the middle latitudes, Harestua (60.2°N), Sodankyla (67.4°N, both in North Europe), and Zhigansk (66.8°N, East Siberia) in the high latitudes, and at the Arctic station of Scoresbysund (70.5°N, Greenland). All the stations, except Tomsk, are part of the Network for the Detection of Atmospheric Composition Change (NDACC), and the data are accessible at http://ndacc.org. The decrease in NO2 is generally accompanied by total ozone and stratospheric temperature decrease and is shown to be caused by the transport of stratospheric air from the region of the ozone hole observed that season in the Arctic. Overpass total ozone data from the Giovanni service and radiosonde data were used for the analysis. Although negative NO2 anomalies due to the transport from the Arctic were also observed in some other years, the anomalies in 2011 reached record magnitudes. A significant positive correlation has been found between variations in NO2 and ozone columns as well as NO2 column and stratospheric temperature during the winter-spring period of 2011, whereas the correlation is much weaker in years without Arctic ozone depletion. The correlation becomes even stronger if only episodes with significant NO2 decrease are considered. For example, the correlation coefficients between deviations of the NO2 and ozone columns are about 0.9 for Zvenigorod and Scoresbysund.
Correlation coefficients between variations in column NO2 and total ozone and stratospheric temperature as well as coefficients of regression of NO2 on ozone and temperature in the winter-spring period of 2011 for the Siberian stations are less than those for European stations. For comparison analysis, data of column NO2, total ozone and stratospheric temperature at the southern hemisphere (SH) stations of Dumont D’Urville (66.7°S, the Antarctic), Macquarie Island (54.5°S) and Kerguelen Island (49.3°S) (all stations are NDACC stations) were used. Correlation and regression coefficients between variations in column NO2 and total ozone as well as in column NO2 and stratospheric temperature for the winter-spring periods at the SH stations depend on the phase of the quasi-biennial oscillation (QBO) in the 30 hPa equatorial wind velocity. The correlation coefficients and the coefficients of regression of NO2 on ozone and temperature for the west QBO phase are large compared to those for the east phase. The 2011 Arctic ozone hole was observed during the west phase of the 30 hPa QBO. The calculated correlation coefficients at the NH stations for the winter-spring period of 2011 associated with the Arctic ozone hole are close to similar coefficients at the SH stations in winter-spring periods for the west QBO phase. The regression coefficients at the NH stations are less than those at the SH stations for the west QBO phase but greater than similar coefficients for the east phase. We can conclude that physico-chemical processes specific for ozone hole conditions cause spatial correlation between distribution of stratospheric NO2 and distributions of total ozone and temperature in polar and adjacent regions, which is generally stronger for stronger ozone deficit in a polar region. This results in significant time correlation between NO2, ozone and temperature at observation sites due to transport processes.
New insights into old methods for identifying causal rare variants.
Wang, Haitian; Huang, Chien-Hsun; Lo, Shaw-Hwa; Zheng, Tian; Hu, Inchi
2011-11-29
The advance of high-throughput next-generation sequencing technology makes possible the analysis of rare variants. However, the investigation of rare variants in unrelated-individuals data sets faces the challenge of low power, and most methods circumvent the difficulty by using various collapsing procedures based on genes, pathways, or gene clusters. We suggest a new way to identify causal rare variants using the F-statistic and sliced inverse regression. The procedure is tested on the data set provided by the Genetic Analysis Workshop 17 (GAW17). After preliminary data reduction, we ranked markers according to their F-statistic values. Top-ranked markers were then subjected to sliced inverse regression, and those with higher absolute coefficients in the most significant sliced inverse regression direction were selected. The procedure yields good false discovery rates for the GAW17 data and thus is a promising method for future study on rare variants.
Alam, Sarfaraz; Khan, Feroz
2014-01-01
Due to the high mortality rate in India, the identification of novel molecules is important in the development of novel and potent anticancer drugs. Xanthones are natural constituents of plants in the families Bonnetiaceae and Clusiaceae, and comprise oxygenated heterocycles with a variety of biological activities along with an anticancer effect. To explore the anticancer compounds from xanthone derivatives, a quantitative structure activity relationship (QSAR) model was developed by the multiple linear regression method. The structure–activity relationship represented by the QSAR model yielded a high activity–descriptors relationship accuracy (84%) referred by regression coefficient (r2=0.84) and a high activity prediction accuracy (82%). Five molecular descriptors – dielectric energy, group count (hydroxyl), LogP (the logarithm of the partition coefficient between n-octanol and water), shape index basic (order 3), and the solvent-accessible surface area – were significantly correlated with anticancer activity. Using this QSAR model, a set of virtually designed xanthone derivatives was screened out. A molecular docking study was also carried out to predict the molecular interaction between proposed compounds and deoxyribonucleic acid (DNA) topoisomerase IIα. The pharmacokinetics parameters, such as absorption, distribution, metabolism, excretion, and toxicity, were also calculated, and later an appraisal of synthetic accessibility of organic compounds was carried out. The strategy used in this study may provide understanding in designing novel DNA topoisomerase IIα inhibitors, as well as for other cancer targets. PMID:24516330
The Geometry of Enhancement in Multiple Regression
ERIC Educational Resources Information Center
Waller, Niels G.
2011-01-01
In linear multiple regression, "enhancement" is said to occur when R[superscript 2] = b[prime]r greater than r[prime]r, where b is a p x 1 vector of standardized regression coefficients and r is a p x 1 vector of correlations between a criterion y and a set of standardized regressors, x. When p = 1 then b [is congruent to] r and…
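The inequality in this abstract can be checked numerically for the smallest interesting case, p = 2. The sketch below is only an illustration under stated assumptions: the function name `two_predictor_r2` and the correlation values are hypothetical and do not come from the paper. It uses the standard closed-form standardized coefficients for two regressors and compares b'r with r'r.

```python
def two_predictor_r2(r1, r2, r12):
    """Return (R2, r'r) for two standardized regressors.

    r1, r2: correlations of each regressor with the criterion y.
    r12:    correlation between the two regressors.
    """
    denom = 1.0 - r12 ** 2
    # Standardized coefficients b solve R_xx b = r in the 2 x 2 case.
    b1 = (r1 - r12 * r2) / denom
    b2 = (r2 - r12 * r1) / denom
    r_squared = b1 * r1 + b2 * r2      # b'r
    sum_sq_corr = r1 ** 2 + r2 ** 2    # r'r
    return r_squared, sum_sq_corr


# Hypothetical values: x2 is uncorrelated with y but correlated with x1.
r_squared, sum_sq_corr = two_predictor_r2(r1=0.5, r2=0.0, r12=0.5)
enhancement = r_squared > sum_sq_corr
print(r_squared, sum_sq_corr, enhancement)
```

With uncorrelated regressors (r12 = 0) the two quantities coincide, so no enhancement occurs in that case.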
ERIC Educational Resources Information Center
Tong, Fuhui
2006-01-01
Background: An extensive body of research has favored the use of regression over other parametric analyses that are based on OVA. In the case of noteworthy regression results, researchers tend to explore the magnitude of beta weights for the respective predictors. Purpose: The purpose of this paper is to examine both beta weights and structure…
Evaluation of laser cutting process with auxiliary gas pressure by soft computing approach
NASA Astrophysics Data System (ADS)
Lazov, Lyubomir; Nikolić, Vlastimir; Jovic, Srdjan; Milovančević, Miloš; Deneva, Heristina; Teirumenieka, Erika; Arsic, Nebojsa
2018-06-01
Identifying the optimal laser cutting parameters is essential for high cut quality. Laser cutting is a highly nonlinear process with many interacting parameters, which is the main challenge in optimizing it. Data mining is one of the most versatile methodologies that can be applied to laser cutting process optimization. A support vector regression (SVR) procedure was implemented because it is a versatile and robust technique for regression on highly nonlinear data. The goal of this study was to determine the optimal laser cutting parameters that ensure robust conditions for minimizing average surface roughness. Three cutting parameters were investigated: cutting speed, laser power, and assist gas pressure. A TruLaser 1030 system was used as the laser, with nitrogen as the assist gas. Prediction accuracy was very high according to the coefficient of determination (R2) and the root mean square error (RMSE): R2 = 0.9975 and RMSE = 0.0337. The data mining approach can therefore be used effectively to determine the optimal conditions of the laser cutting process.
NASA Astrophysics Data System (ADS)
Mansouri, Edris; Feizi, Faranak; Jafari Rad, Alireza; Arian, Mehran
2018-03-01
This paper uses multivariate regression, applied as a mineral prospectivity mapping (MPM) method, to build a mathematical model for iron skarn exploration in the Sarvian area, central Iran. The main aim is to apply multivariate regression analysis to map iron outcrops in the northeastern part of the study area in order to discover new iron deposits in other parts of the study area. Two types of multivariate regression models using two linear equations were employed to discover new mineral deposits. This method is one of the reliable methods for processing satellite images. ASTER satellite images (14 bands) were used as unique independent variables (UIVs), and iron outcrops were mapped as dependent variables for MPM. According to the probability value (p value), the coefficient of determination (R2) and the adjusted coefficient of determination (R2adj), the second regression model (which consisted of multiple UIVs) fitted better than the other models. The accuracy of the model was confirmed by the iron outcrops map and geological observation. Based on field observation, iron mineralization occurs at the contact of limestone and intrusive rocks (skarn type).
Estimation Methods for Non-Homogeneous Regression - Minimum CRPS vs Maximum Likelihood
NASA Astrophysics Data System (ADS)
Gebetsberger, Manuel; Messner, Jakob W.; Mayr, Georg J.; Zeileis, Achim
2017-04-01
Non-homogeneous regression models are widely used to statistically post-process numerical weather prediction models. Such regression models correct for errors in mean and variance and are able to forecast a full probability distribution. To estimate the corresponding regression coefficients, CRPS minimization has been used in many meteorological post-processing studies over the last decade. In contrast to maximum likelihood estimation, CRPS minimization is claimed to yield better-calibrated forecasts. Theoretically, both scoring rules used as an optimization score should be able to locate a similar and unknown optimum. Discrepancies might result from a wrong distributional assumption about the observed quantity. To address this theoretical concept, this study compares maximum likelihood and minimum CRPS estimation for different distributional assumptions. First, a synthetic case study shows that, for an appropriate distributional assumption, both estimation methods yield similar regression coefficients. The log-likelihood estimator is slightly more efficient. A real-world case study for surface temperature forecasts at different sites in Europe confirms these results but shows that surface temperature does not always follow the classical assumption of a Gaussian distribution. KEYWORDS: ensemble post-processing, maximum likelihood estimation, CRPS minimization, probabilistic temperature forecasting, distributional regression models
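For readers unfamiliar with the two scores compared in this abstract, the following sketch evaluates both for a Gaussian predictive distribution. It assumes the well-known closed form of the CRPS for a normal distribution; the function names and the forecast/observation values are hypothetical, and the sketch only scores fixed forecasts rather than estimating regression coefficients as the study does.

```python
import math


def _std_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def crps_gaussian(mu, sigma, y):
    """Closed-form CRPS of a N(mu, sigma^2) forecast for observation y."""
    z = (y - mu) / sigma
    return sigma * (z * (2.0 * _std_normal_cdf(z) - 1.0)
                    + 2.0 * _std_normal_pdf(z)
                    - 1.0 / math.sqrt(math.pi))


def neg_log_lik_gaussian(mu, sigma, y):
    """Negative log-likelihood of y under N(mu, sigma^2)."""
    z = (y - mu) / sigma
    return 0.5 * math.log(2.0 * math.pi) + math.log(sigma) + 0.5 * z * z


# Hypothetical (forecast mean, forecast spread, observation) triples.
cases = [(10.0, 2.0, 11.5), (4.0, 1.0, 3.2), (-1.0, 3.0, 2.0)]
mean_crps = sum(crps_gaussian(m, s, y) for m, s, y in cases) / len(cases)
mean_nll = sum(neg_log_lik_gaussian(m, s, y) for m, s, y in cases) / len(cases)
print(mean_crps, mean_nll)
```

Either mean score could serve as the objective that an optimizer minimizes over the regression coefficients defining mu and sigma.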
Guenole, Nigel; Brown, Anna
2014-01-01
We report a Monte Carlo study examining the effects of two strategies for handling measurement non-invariance – modeling and ignoring non-invariant items – on structural regression coefficients between latent variables measured with item response theory models for categorical indicators. These strategies were examined across four levels and three types of non-invariance – non-invariant loadings, non-invariant thresholds, and combined non-invariance on loadings and thresholds – in simple, partial, mediated and moderated regression models where the non-invariant latent variable occupied predictor, mediator, and criterion positions in the structural regression models. When non-invariance is ignored in the latent predictor, the focal group regression parameters are biased in the opposite direction to the difference in loadings and thresholds relative to the referent group (i.e., lower loadings and thresholds for the focal group lead to overestimated regression parameters). With criterion non-invariance, the focal group regression parameters are biased in the same direction as the difference in loadings and thresholds relative to the referent group. While unacceptable levels of parameter bias were confined to the focal group, bias occurred at considerably lower levels of ignored non-invariance than was previously recognized in referent and focal groups. PMID:25278911
Harada, Sei; Hirayama, Akiyoshi; Chan, Queenie; Kurihara, Ayako; Fukai, Kota; Iida, Miho; Kato, Suzuka; Sugiyama, Daisuke; Kuwabara, Kazuyo; Takeuchi, Ayano; Akiyama, Miki; Okamura, Tomonori; Ebbels, Timothy M D; Elliott, Paul; Tomita, Masaru; Sato, Asako; Suzuki, Chizuru; Sugimoto, Masahiro; Soga, Tomoyoshi; Takebayashi, Toru
2018-01-01
Cohort studies with metabolomics data are becoming more widespread, however, large-scale studies involving 10,000s of participants are still limited, especially in Asian populations. Therefore, we started the Tsuruoka Metabolomics Cohort Study enrolling 11,002 community-dwelling adults in Japan, and using capillary electrophoresis-mass spectrometry (CE-MS) and liquid chromatography-mass spectrometry. The CE-MS method is highly amenable to absolute quantification of polar metabolites, however, its reliability for large-scale measurement is unclear. The aim of this study is to examine reproducibility and validity of large-scale CE-MS measurements. In addition, the study presents absolute concentrations of polar metabolites in human plasma, which can be used in future as reference ranges in a Japanese population. Metabolomic profiling of 8,413 fasting plasma samples were completed using CE-MS, and 94 polar metabolites were structurally identified and quantified. Quality control (QC) samples were injected every ten samples and assessed throughout the analysis. Inter- and intra-batch coefficients of variation of QC and participant samples, and technical intraclass correlation coefficients were estimated. Passing-Bablok regression of plasma concentrations by CE-MS on serum concentrations by standard clinical chemistry assays was conducted for creatinine and uric acid. In QC samples, coefficient of variation was less than 20% for 64 metabolites, and less than 30% for 80 metabolites out of the 94 metabolites. Inter-batch coefficient of variation was less than 20% for 81 metabolites. Estimated technical intraclass correlation coefficient was above 0.75 for 67 metabolites. The slope of Passing-Bablok regression was estimated as 0.97 (95% confidence interval: 0.95, 0.98) for creatinine and 0.95 (0.92, 0.96) for uric acid. 
Compared to published data from other large cohort measurement platforms, reproducibility of metabolites common to the platforms was similar to or better than in the other studies. These results show that our CE-MS platform is suitable for conducting large-scale epidemiological studies.
High-Order Model and Dynamic Filtering for Frame Rate Up-Conversion.
Bao, Wenbo; Zhang, Xiaoyun; Chen, Li; Ding, Lianghui; Gao, Zhiyong
2018-08-01
This paper proposes a novel frame rate up-conversion method through high-order model and dynamic filtering (HOMDF) for video pixels. Unlike the constant brightness and linear motion assumptions in traditional methods, the intensity and position of the video pixels are both modeled with high-order polynomials in terms of time. Then, the key problem of our method is to estimate the polynomial coefficients that represent the pixel's intensity variation, velocity, and acceleration. We propose to solve it with two energy objectives: one minimizes the auto-regressive prediction error of intensity variation by its past samples, and the other minimizes video frame's reconstruction error along the motion trajectory. To efficiently address the optimization problem for these coefficients, we propose the dynamic filtering solution inspired by video's temporal coherence. The optimal estimation of these coefficients is reformulated into a dynamic fusion of the prior estimate from pixel's temporal predecessor and the maximum likelihood estimate from current new observation. Finally, frame rate up-conversion is implemented using motion-compensated interpolation by pixel-wise intensity variation and motion trajectory. Benefited from the advanced model and dynamic filtering, the interpolated frame has much better visual quality. Extensive experiments on the natural and synthesized videos demonstrate the superiority of HOMDF over the state-of-the-art methods in both subjective and objective comparisons.
Linear regression metamodeling as a tool to summarize and present simulation model results.
Jalal, Hawre; Dowd, Bryan; Sainfort, François; Kuntz, Karen M
2013-10-01
Modelers lack a tool to systematically and clearly present complex model results, including those from sensitivity analyses. The objective was to propose linear regression metamodeling as a tool to increase transparency of decision analytic models and better communicate their results. We used a simplified cancer cure model to demonstrate our approach. The model computed the lifetime cost and benefit of 3 treatment options for cancer patients. We simulated 10,000 cohorts in a probabilistic sensitivity analysis (PSA) and regressed the model outcomes on the standardized input parameter values in a set of regression analyses. We used the regression coefficients to describe measures of sensitivity analyses, including threshold and parameter sensitivity analyses. We also compared the results of the PSA to deterministic full-factorial and one-factor-at-a-time designs. The regression intercept represented the estimated base-case outcome, and the other coefficients described the relative parameter uncertainty in the model. We defined simple relationships that compute the average and incremental net benefit of each intervention. Metamodeling produced outputs similar to traditional deterministic 1-way or 2-way sensitivity analyses but was more reliable since it used all parameter values. Linear regression metamodeling is a simple, yet powerful, tool that can assist modelers in communicating model characteristics and sensitivity analyses.
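The regression step described above can be sketched in a few lines. This is a toy illustration, not the authors' cancer cure model: the outcome function, the parameter names `p_cure` and `cost`, their ranges, and the 1,000 draws are all hypothetical. It shows why, when inputs are standardized, the fitted intercept equals the mean simulated outcome and each slope is the outcome change per one standard deviation of that input.

```python
import random
import statistics


def standardize(values):
    """Center to mean 0 and scale to (population) standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def ols(columns, y):
    """Least squares with intercept, solved through the normal equations."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented system [X'X | X'y].
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[r][p] / A[r][r] for r in range(p)]


random.seed(0)
n_draws = 1000  # hypothetical PSA size
p_cure = [random.uniform(0.3, 0.7) for _ in range(n_draws)]
cost = [random.uniform(10000.0, 20000.0) for _ in range(n_draws)]
# Hypothetical simulation model: net benefit of one treatment option.
net_benefit = [50000.0 * p - c for p, c in zip(p_cure, cost)]

intercept, b_cure, b_cost = ols(
    [standardize(p_cure), standardize(cost)], net_benefit)
print(intercept, b_cure, b_cost)
```

Because the toy outcome is exactly linear, the metamodel reproduces it without error; a real simulation model would leave residual variation.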
van Mierlo, Trevor; Hyatt, Douglas; Ching, Andrew T
2016-01-01
Digital Health Social Networks (DHSNs) are common; however, there are few metrics that can be used to identify participation inequality. The objective of this study was to investigate whether the Gini coefficient, an economic measure of statistical dispersion traditionally used to measure income inequality, could be employed to measure DHSN inequality. Quarterly Gini coefficients were derived from four long-standing DHSNs. The combined data set included 625,736 posts that were generated from 15,181 actors over 18,671 days. The range of actors (8-2323), posts (29-28,684), and Gini coefficients (0.15-0.37) varied. Pearson correlations indicated statistically significant associations between number of actors and number of posts (0.527-0.835, p < .001), and Gini coefficients and number of posts (0.342-0.725, p < .001). However, the association between Gini coefficient and number of actors was only statistically significant for the addiction networks (0.619 and 0.276, p < .036). Linear regression models had positive but mixed R 2 results (0.333-0.527). In all four regression models, the association between Gini coefficient and posts was statistically significant ( t = 3.346-7.381, p < .002). However, unlike the Pearson correlations, the association between Gini coefficient and number of actors was only statistically significant in the two mental health networks ( t = -4.305 and -5.934, p < .000). The Gini coefficient is helpful in measuring shifts in DHSN inequality. However, as a standalone metric, the Gini coefficient does not indicate optimal numbers or ratios of actors to posts, or effective network engagement. Further, mixed-methods research investigating quantitative performance metrics is required.
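As a concrete illustration of the metric discussed above, a Gini coefficient for posts per actor can be computed as follows. This is a minimal sketch using one common rank-weighted formula; the abstract does not state which estimator the authors used, and the function name and post counts are hypothetical.

```python
def gini(counts):
    """Gini coefficient of non-negative values (0 means perfectly equal)."""
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with ranks i = 1..n
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


# Hypothetical quarter: five actors, one of whom writes most of the posts.
posts_per_actor = [1, 1, 2, 3, 13]
quarterly_gini = gini(posts_per_actor)
print(round(quarterly_gini, 3))
```

With this formula, a network in which a single actor writes every post scores (n - 1) / n rather than exactly 1.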
[New method of mixed gas infrared spectrum analysis based on SVM].
Bai, Peng; Xie, Wen-Jun; Liu, Jun-Hua
2007-07-01
A new method of infrared spectrum analysis for gas mixtures based on the support vector machine (SVM) is proposed. The kernel function of the SVM was used to map the severely overlapping absorption spectra into a high-dimensional space; after this transformation, the high-dimensional data could be processed in the original space. A regression calibration model was thereby established and then applied to determine the concentrations of the component gases. It was also shown that the SVM regression calibration model can be used for component recognition in gas mixtures. The method was applied to the analysis of different data samples. Factors that affect the model, such as scan interval, wavelength range, kernel function and penalty coefficient C, are discussed. Experimental results show that the maximal Mean AE of the component concentration is 0.132%, and the component recognition accuracy is higher than 94%. The method addresses the problems of overlapping absorption spectra, of using a single method for both qualitative and quantitative analysis, and of a limited number of training samples. It could be used in other infrared spectrum analyses of gas mixtures and has both theoretical and practical value.
Tribological behaviour and statistical experimental design of sintered iron-copper based composites
NASA Astrophysics Data System (ADS)
Popescu, Ileana Nicoleta; Ghiţă, Constantin; Bratu, Vasile; Palacios Navarro, Guillermo
2013-11-01
The sintered iron-copper based composites for automotive brake pads have a complex composite composition and should have good physical, mechanical and tribological characteristics. In this paper, we obtained frictional composites by the Powder Metallurgy (P/M) technique and characterized them from microstructural and tribological points of view. The morphology of raw powders was determined by SEM and the surfaces of the obtained sintered friction materials were analyzed by ESEM, EDS elemental and compo-images analyses. One lot of samples was tested on a "pin-on-disc" type wear machine under dry sliding conditions, at applied load between 3.5 and 11.5 × 10⁻¹ MPa and 12.5 and 16.9 m/s relative speed in braking point at constant temperature. The other lot of samples was tested on an inertial test stand according to a methodology simulating the real conditions of dry friction, at a contact pressure of 2.5-3 MPa, at 300-1200 rpm. The most important characteristics required for sintered friction materials are a high and stable friction coefficient during braking and, for high durability in service, low wear, high corrosion resistance, high thermal conductivity, mechanical resistance and thermal stability at elevated temperature. Because of the importance of the tribological characteristics (wear rate and friction coefficient) of sintered iron-copper based composites, we predicted the tribological behaviour through statistical analysis. For the first lot of samples, the response variables Yi (represented by the wear rate and friction coefficient) have been correlated with x1 and x2 (the coded values of applied load and relative speed in braking points, respectively) using a linear factorial design approach. We obtained brake friction materials with improved wear resistance characteristics and high and stable friction coefficients.
It has been shown, through experimental data and obtained linear regression equations, that the sintered composites wear rate increases with increasing applied load and relative speed, but in the same conditions, the frictional coefficients slowly decrease.
The Study of Rain Specific Attenuation for the Prediction of Satellite Propagation in Malaysia
NASA Astrophysics Data System (ADS)
Mandeep, J. S.; Ng, Y. Y.; Abdullah, H.; Abdullah, M.
2010-06-01
Specific attenuation, the rain attenuation per unit distance (dB/km), is the fundamental quantity in the calculation of rain attenuation for terrestrial and slant paths. It is an important element in developing a rain attenuation prediction model. This paper deals with the empirical determination of the power law coefficients that allow the specific attenuation in dB/km to be calculated from the rain rate in mm/h. The main purpose of the paper is to obtain the coefficients k and α of the power law relationship between specific attenuation and rain rate. Three years (from 1st January 2006 until 31st December 2008) of rain gauge and beacon data taken from USM, Nibong Tebal have been used for the empirical analysis of rain specific attenuation. The data presented are semi-empirical in nature. A year-to-year variation of the coefficients has been indicated, and the empirical measured data were compared with the regression coefficients provided by ITU-R. The results indicated that the USM empirical measured data varied significantly from the ITU-R predicted values. Hence, the ITU-R recommendation for regression coefficients of rain specific attenuation is not suitable for predicting rain attenuation in Malaysia.
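The power law relating specific attenuation to rain rate, attenuation = k * R^α, becomes linear after taking logarithms, so k and α can be estimated by ordinary least squares on the log-transformed data. The sketch below shows that step under stated assumptions: the function name is hypothetical, and the synthetic data are generated from made-up coefficients, not from the USM measurements or from ITU-R tables. The abstract does not say whether the authors fitted in log space.

```python
import math


def fit_power_law(rain_rates, attenuations):
    """Fit attenuation = k * rain_rate**alpha by least squares in log-log space."""
    xs = [math.log(r) for r in rain_rates]
    ys = [math.log(g) for g in attenuations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx                        # slope of the log-log line
    k = math.exp(mean_y - alpha * mean_x)    # exp(intercept)
    return k, alpha


# Synthetic, noise-free data from hypothetical coefficients k = 0.02, alpha = 1.1.
rates_mm_per_h = [5.0, 10.0, 25.0, 50.0, 100.0]
atten_db_per_km = [0.02 * r ** 1.1 for r in rates_mm_per_h]
k_hat, alpha_hat = fit_power_law(rates_mm_per_h, atten_db_per_km)
print(k_hat, alpha_hat)
```

Fitting each year of data separately with such a function would expose the year-to-year variation of the coefficients that the abstract mentions.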
NASA Astrophysics Data System (ADS)
Bitew, M. M.; Goodrich, D. C.; Demaria, E.; Heilman, P.; Kautz, M. A.
2017-12-01
Walnut Gulch is a semi-arid experimental watershed and Long Term Agro-ecosystem Research (LTAR) site managed by the USDA-ARS Southwest Watershed Research Center, for which high-resolution long-term hydro-climatic data are available across its 150 km2 drainage area. In this study, we present the analysis of 50 years of continuous hourly rainfall data to evaluate runoff control and generation processes, with the aim of improving the QA-QC plans of Walnut Gulch and creating a high-quality data set that is critical for reducing water balance uncertainties. Multiple linear regression models were developed to relate rainfall properties, runoff characteristics and watershed properties. The rainfall properties were summarized as event-based total depth, maximum intensity, duration, the location of the storm center with respect to the outlet, and storm size normalized to watershed area. We evaluated the interaction between runoff and rainfall in terms of antecedent moisture condition (AMC), antecedent runoff condition (ARC), and runoff depth and duration for each rainfall event. We summarized each of the watershed properties such as contributing area, slope, shape, channel length, stream density, channel flow area, and percent of the area of retention stock ponds for each of the nested catchments in Walnut Gulch. The evaluation of the model using basic and categorical statistics showed good predictive skill throughout the watersheds. The model produced correlation coefficients ranging from 0.4 to 0.94, Nash efficiency coefficients up to 0.77, and Kling-Gupta coefficients ranging from 0.4 to 0.98. The model predicted 92% of all runoff generations and 98% of no-runoff across all sub-watersheds in Walnut Gulch. The regression model also indicated good potential to complement the QA-QC procedures in place for Walnut Gulch dataset publications developed over the years since the 1960s through identification of inconsistencies in rainfall and runoff relations.
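The skill scores quoted in this abstract can be computed as sketched below. The code assumes the usual Nash-Sutcliffe efficiency and the original Kling-Gupta formulation based on correlation, variability ratio and bias ratio; the abstract does not state which variants the authors used, and the function names and runoff values are hypothetical.

```python
import math


def nash_sutcliffe(obs, sim):
    """1 - (sum of squared errors) / (sum of squared deviations of obs)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def kling_gupta(obs, sim):
    """1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2)."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_s = sum(sim) / n
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs) / n)
    sd_s = math.sqrt(sum((s - mean_s) ** 2 for s in sim) / n)
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim)) / n
    r = cov / (sd_o * sd_s)      # linear correlation
    alpha = sd_s / sd_o          # variability ratio
    beta = mean_s / mean_o       # bias ratio
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)


# Hypothetical event runoff depths (mm): observed vs. regression prediction.
observed = [0.5, 2.0, 3.5, 1.0, 6.0]
predicted = [0.8, 1.7, 3.9, 1.4, 5.1]
nse = nash_sutcliffe(observed, predicted)
kge = kling_gupta(observed, predicted)
print(nse, kge)
```

Both scores equal 1 for a perfect prediction; a prediction no better than the observed mean gives a Nash-Sutcliffe efficiency of 0.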
NASA Astrophysics Data System (ADS)
Jing, Ran; Gong, Zhaoning; Zhao, Wenji; Pu, Ruiliang; Deng, Lei
2017-12-01
Above-bottom biomass (ABB) is considered as an important parameter for measuring the growth status of aquatic plants, and is of great significance for assessing health status of wetland ecosystems. In this study, Structure from Motion (SfM) technique was used to rebuild the study area with high overlapped images acquired by an unmanned aerial vehicle (UAV). We generated orthoimages and SfM dense point cloud data, from which vegetation indices (VIs) and SfM point cloud variables including average height (HAVG), standard deviation of height (HSD) and coefficient of variation of height (HCV) were extracted. These VIs and SfM point cloud variables could effectively characterize the growth status of aquatic plants, and thus they could be used to develop a simple linear regression model (SLR) and a stepwise linear regression model (SWL) with field measured ABB samples of aquatic plants. We also utilized a decision tree method to discriminate different types of aquatic plants. The experimental results indicated that (1) the SfM technique could effectively process high overlapped UAV images and thus be suitable for the reconstruction of fine texture feature of aquatic plant canopy structure; and (2) an SWL model based on point cloud variables: HAVG, HSD, HCV and two VIs: NGRDI, ExGR as independent variables has produced the best predictive result of ABB of aquatic plants in the study area, with a coefficient of determination of 0.84 and a relative root mean square error of 7.13%. In this analysis, a novel method for the quantitative inversion of a growth parameter (i.e., ABB) of aquatic plants in wetlands was demonstrated.
McAuley, Paul A; Hsu, Fang-Chi; Loman, Kurt K; Carr, J Jeffrey; Budoff, Matthew J; Szklo, Moyses; Sharrett, A Richey; Ding, Jingzhong
2011-09-01
Insulin resistance is linked to general and abdominal obesity, but its relation to hepatic lipid content and pericardial adipose tissue is less clear. The purpose of this study was to examine cross-sectional associations of liver attenuation, pericardial adipose tissue, BMI, and waist circumference with insulin resistance. We measured liver attenuation and pericardial adipose tissue using the existing cardiac computed tomography scans in 5,291 individuals free of clinical cardiovascular disease and diabetes in the Multi-Ethnic Study of Atherosclerosis (MESA) during the study's baseline visit (2000-2002). Low liver attenuation was defined as the lowest quartile and high pericardial adipose tissue as the upper quartile of volume (cm(3)). We used standard clinical definitions for obesity and abdominal obesity. Insulin resistance was assessed by the homeostasis model assessment of insulin resistance (HOMA(IR)) index. In multivariate linear regression with all adiposity measures in the model simultaneously, all adiposity measures were significantly (P < 0.0001) associated with insulin resistance: regression coefficients (±s.e.) were 0.31 (±0.02) for low liver attenuation, 0.27 (±0.02) for high pericardial adipose tissue, 0.27 (±0.02) for obesity, and 0.32 (±0.02) for abdominal obesity. We found significant differences (P = 0.003) between standardized liver attenuation and insulin resistance by ethnicity: regression coefficients per 1 s.d. increment were 0.10 ± 0.01 for whites, 0.11 ± 0.02 for Chinese, 0.08 ± 0.2 for blacks, and 0.14 ± 0.01 for Hispanics. Liver attenuation and pericardial adipose tissue were associated with insulin resistance, independent of BMI and waist circumference.
Merchantable sawlog and bole-length equations for the Northeastern United States
Daniel A. Yaussy; Martin E. Dale
1991-01-01
A modified Richards growth model is used to develop species-specific coefficients for equations estimating the merchantable sawlog and bole lengths of trees from 25 species groups common to the Northeastern United States. These regression coefficients have been incorporated into the growth-and-yield simulation software, NE-TWIGS.
Mangrove canopy density analysis using Sentinel-2A imagery satellite data
NASA Astrophysics Data System (ADS)
Wachid, M. N.; Hapsara, R. P.; Cahyo, R. D.; Wahyu, G. N.; Syarif, A. M.; Umarhadi, D. A.; Fitriani, A. N.; Ramadhanningrum, D. P.; Widyatmanti, W.
2017-06-01
Teluk Jor has alluvial surface sediment derived from volcanic materials. Relatively calm sea waves and the enclosed shape of the beach support the existence of mangrove forest at Teluk Jor. Sentinel-2A imagery has good spatial and spectral resolution for mangrove density studies. A regression between field samples and the NDVI values of Sentinel-2A was used to analyze the mangrove canopy density. Mangrove canopy density was identified in a field survey using the transect method. The regression analysis shows that the field data and NDVI values have a correlation of R = 0.7739 and a coefficient of determination of R² = 0.5989. The analysis gives areas of 397,900 m² for low density, 336,200 m² for moderate density, 110,300 m² for high density and 500 m² for very high density. This research also found that the mangrove genera in Teluk Jor are Rhizophora, Ceriops, Aegiceras and Sonneratia.
Background stratified Poisson regression analysis of cohort data.
Richardson, David B; Langholz, Bryan
2012-03-01
Background stratified Poisson regression is an approach that has been used in the analysis of data derived from a variety of epidemiologically important studies of radiation-exposed populations, including uranium miners, nuclear industry workers, and atomic bomb survivors. We describe a novel approach to fit Poisson regression models that adjust for a set of covariates through background stratification while directly estimating the radiation-disease association of primary interest. The approach makes use of an expression for the Poisson likelihood that treats the coefficients for stratum-specific indicator variables as 'nuisance' variables and avoids the need to explicitly estimate the coefficients for these stratum-specific parameters. Log-linear models, as well as other general relative rate models, are accommodated. This approach is illustrated using data from the Life Span Study of Japanese atomic bomb survivors and data from a study of underground uranium miners. The point estimate and confidence interval obtained from this 'conditional' regression approach are identical to the values obtained using unconditional Poisson regression with model terms for each background stratum. Moreover, it is shown that the proposed approach allows estimation of background stratified Poisson regression models of non-standard form, such as models that parameterize latency effects, as well as regression models in which the number of strata is large, thereby overcoming the limitations of previously available statistical software for fitting background stratified Poisson regression models.
Advanced statistics: linear regression, part II: multiple linear regression.
Marill, Keith A
2004-01-01
The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.
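Multicollinearity, one of the concepts listed above, is often screened with the variance inflation factor (VIF). The sketch below is an illustration that is not taken from the article: it covers only the two-predictor case, where the VIF reduces to 1 / (1 - r²) with r the Pearson correlation between the predictors. The function names and data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF for a model with exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical predictors: the second nearly duplicates the first.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1]
print(vif_two_predictors(x1, x2))  # a large value flags collinear predictors
```

A VIF near 1 means the predictors carry separate information. Large values mean the coefficient estimates depend strongly on which correlated predictors are included, which is the dependence the abstract stresses.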
Huang, L; Fantke, P; Ernstoff, A; Jolliet, O
2017-11-01
Indoor releases of organic chemicals encapsulated in solid materials are major contributors to human exposures and are directly related to the internal diffusion coefficient in solid materials. Existing correlations to estimate the diffusion coefficient are only valid for a limited number of chemical-material combinations. This paper develops and evaluates a quantitative property-property relationship (QPPR) to predict diffusion coefficients for a wide range of organic chemicals and materials. We first compiled a training dataset of 1103 measured diffusion coefficients for 158 chemicals in 32 consolidated material types. Following a detailed analysis of the temperature influence, we developed a multiple linear regression model to predict diffusion coefficients as a function of chemical molecular weight (MW), temperature, and material type (adjusted R² of 0.93). The internal validations showed the model to be robust, stable and not a result of chance correlation. The external validation against two separate prediction datasets demonstrated the model has good predicting ability within its applicability domain (R²ext > 0.8), namely MW between 30 and 1178 g/mol and temperature between 4 and 180°C. By covering a much wider range of organic chemicals and materials, this QPPR facilitates high-throughput estimates of human exposures for chemicals encapsulated in solid materials. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
A hierarchical estimator development for estimation of tire-road friction coefficient
Zhang, Xudong; Göhlich, Dietmar
2017-01-01
The effect of vehicle active safety systems is subject to the friction force arising from the contact of tires and the road surface. Therefore, an adequate knowledge of the tire-road friction coefficient is of great importance to achieve a good performance of these control systems. This paper presents a tire-road friction coefficient estimation method for an advanced vehicle configuration, four-motorized-wheel electric vehicles, in which the longitudinal tire force is easily obtained. A hierarchical structure is adopted for the proposed estimation design. An upper estimator is developed based on unscented Kalman filter to estimate vehicle state information, while a hybrid estimation method is applied as the lower estimator to identify the tire-road friction coefficient using general regression neural network (GRNN) and Bayes' theorem. GRNN aims at detecting road friction coefficient under small excitations, which are the most common situations in daily driving. GRNN is able to accurately create a mapping from input parameters to the friction coefficient, avoiding storing an entire complex tire model. As for large excitations, the estimation algorithm is based on Bayes' theorem and a simplified “magic formula” tire model. The integrated estimation method is established by the combination of the above-mentioned estimators. Finally, the simulations based on a high-fidelity CarSim vehicle model are carried out on different road surfaces and driving maneuvers to verify the effectiveness of the proposed estimation method. PMID:28178332
Mutlu, Selime; Kahraman, Kevser; Öztürk, Serpil
2017-02-01
The effects of microwave irradiation on resistant starch (RS) formation and functional properties in high-amylose corn starch, Hylon VII, by applying microwave-storing cycles and drying processes were investigated. The Response Surface Methodology (RSM) was used to optimize the reaction conditions, microwave time (2-4 min) and power (20-100%), for RS formation. The starch:water (1:10) mixtures were cooked and autoclaved, and then different microwave-storing cycles and drying (oven or freeze drying) processes were applied. The RS contents of the samples increased with an increasing number of microwave-storing cycles. The highest RS (43.4%) was obtained by oven drying after 3 cycles of microwave treatment at 20% power for 2 min. The F, p (<0.05) and R² values indicated that the selected models were consistent. Linear equations were obtained for oven-dried samples treated with 1 and 3 cycles of microwave, with regression coefficients of 0.65 and 0.62, respectively. A quadratic equation was obtained for freeze-dried samples treated with 3 cycles of microwave, with a regression coefficient of 0.83. The solubility, water binding capacity (WBC) and RVA viscosity values of the microwave-treated samples were higher than those of native Hylon VII. The WBC and viscosity values of the freeze-dried samples were higher than those of the oven-dried ones. Copyright © 2016 Elsevier B.V. All rights reserved.
Machine Learning Estimation of Atom Condensed Fukui Functions.
Zhang, Qingyou; Zheng, Fangfang; Zhao, Tanfeng; Qu, Xiaohui; Aires-de-Sousa, João
2016-02-01
To enable the fast estimation of atom condensed Fukui functions, machine learning algorithms were trained with databases of DFT pre-calculated values for ca. 23,000 atoms in organic molecules. The problem was approached as the ranking of atom types with the Bradley-Terry (BT) model, and as the regression of the Fukui function. Random Forests (RF) were trained to predict the condensed Fukui function, to rank atoms in a molecule, and to classify atoms as high/low Fukui function. Atomic descriptors were based on counts of atom types in spheres around the kernel atom. The BT coefficients assigned to atom types enabled the identification (93-94 % accuracy) of the atom with the highest Fukui function in pairs of atoms in the same molecule with differences ≥0.1. In whole molecules, the atom with the top Fukui function could be recognized in ca. 50 % of the cases and, on the average, about 3 of the top 4 atoms could be recognized in a shortlist of 4. Regression RF yielded predictions for test sets with R(2) =0.68-0.69, improving the ability of BT coefficients to rank atoms in a molecule. Atom classification (as high/low Fukui function) was obtained with RF with sensitivity of 55-61 % and specificity of 94-95 %. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Saad, Karen Ruggeri; Colombo, Alexandra S; João, Silvia M Amado
2009-01-01
The purpose of this study was to investigate the reliability and validity of photogrammetry in measuring the lateral spinal inclination angles. Forty subjects (32 female and 8 males) with a mean age of 23.4 +/- 11.2 years had their scoliosis evaluated by radiographs of their trunk, determined by the Cobb angle method, and by photogrammetry. The statistical methods used included Cronbach alpha, Pearson/Spearman correlation coefficients, and regression analyses. The Cronbach alpha values showed that the photogrammetric measures showed high internal consistency, which indicated that the sample was bias free. The radiograph method showed to be more precise with intrarater reliabilities of 0.936, 0.975, and 0.945 for the thoracic, lumbar, and thoracolumbar curves, respectively, and interrater reliabilities of 0.942 and 0.879 for the angular measures of the thoracic and thoracolumbar segments, respectively. The regression analyses revealed a high determination coefficient although limited to the adjusted linear model between the radiographic and photographic measures. It was found that with more severe scoliosis, the lateral curve measures obtained with the photogrammetry were for the thoracic and lumbar regions (R = 0.619 and 0.551). The photogrammetric measures were found to be reproducible in this study and could be used as supplementary information to decrease the number of radiographs necessary for the monitoring of scoliosis.
Tukiendorf, Andrzej; Mansournia, Mohammad Ali; Wydmański, Jerzy; Wolny-Rokicka, Edyta
2017-04-01
Background: Clinical datasets for epithelial ovarian cancer brain metastatic patients are usually small in size. When adequate case numbers are lacking, resulting estimates of regression coefficients may demonstrate bias. One of the direct approaches to reduce such sparse-data bias is based on penalized estimation. Methods: A re-analysis of formerly reported hazard ratios in diagnosed patients was performed using penalized Cox regression with a popular SAS package providing additional software codes for a statistical computational procedure. Results: It was found that the penalized approach can readily diminish sparse data artefacts and radically reduce the magnitude of estimated regression coefficients. Conclusions: It was confirmed that classical statistical approaches may exaggerate regression estimates or distort study interpretations and conclusions. The results support the thesis that penalization via weak informative priors and data augmentation are the safest approaches to shrink sparse data artefacts frequently occurring in epidemiological research. Creative Commons Attribution License
NASA Astrophysics Data System (ADS)
Rock, N. M. S.; Duffy, T. R.
REGRES allows a range of regression equations to be calculated for paired sets of data values in which both variables are subject to error (i.e. neither is the "independent" variable). Nonparametric regressions, based on medians of all possible pairwise slopes and intercepts, are treated in detail. Estimated slopes and intercepts are output, along with confidence limits and Spearman and Kendall rank correlation coefficients. Outliers can be rejected with user-determined stringency. Parametric regressions can be calculated for any value of λ (the ratio of the variances of the random errors for y and x), including: (1) major axis (λ = 1); (2) reduced major axis (λ = variance of y/variance of x); (3) Y on X (λ = infinity); or (4) X on Y (λ = 0) solutions. Pearson linear correlation coefficients also are output. REGRES provides an alternative to conventional isochron assessment techniques where bivariate normal errors cannot be assumed, or weighting methods are inappropriate.
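The two families of fits described above can be sketched as follows. This is not the REGRES source code. It assumes the nonparametric line takes the median of all pairwise slopes and the median of the corresponding pairwise intercepts, and it uses the standard errors-in-variables (Deming-type) slope formula for a given λ. The function names are hypothetical.

```python
import math
from itertools import combinations
from statistics import median


def pairwise_median_line(x, y):
    """Median of all pairwise slopes and of the matching pairwise intercepts."""
    slopes, intercepts = [], []
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        if xi == xj:
            continue  # a vertical pair has no finite slope
        slope = (yj - yi) / (xj - xi)
        slopes.append(slope)
        intercepts.append(yi - slope * xi)
    return median(slopes), median(intercepts)


def structural_slope(x, y, lam):
    """Slope when lam = var(error in y) / var(error in x).

    lam = 1 gives the major axis, lam = syy/sxx the reduced major axis,
    lam = inf the ordinary Y-on-X slope, and lam = 0 the X-on-Y slope.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if math.isinf(lam):
        return sxy / sxx
    d = syy - lam * sxx
    return (d + math.sqrt(d * d + 4.0 * lam * sxy * sxy)) / (2.0 * sxy)


# Hypothetical paired measurements.
x = [0.0, 1.0, 2.0, 3.0]
y = [0.0, 2.0, 1.0, 3.0]
print(pairwise_median_line(x, y))
print([structural_slope(x, y, lam) for lam in (0.0, 1.0, math.inf)])
```

On noisy data the Y-on-X slope is the shallowest and the X-on-Y slope the steepest, with the major-axis and reduced-major-axis solutions in between.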
[Sincerity of effort: isokinetic evaluation of knee extension].
Colombo, R; Demaiti, G; Sartorio, F; Orlandini, D; Vercelli, S; Ferriero, G
2008-01-01
The aim of this study was to find a reliable method to evaluate the sincerity of the maximal muscular effort performed in a dynamometric isokinetic test of knee flexion-extension. The coefficient of variation of the peak torque (CV) and 3 new indices were analysed: (1) the average coefficient of variation calculated on the complete peak torque curve (CVM); (2) the slope of the regression line in an endurance test (PRR); (3) the correlation coefficient of the peak torques in the same endurance test (CCR). Twenty healthy subjects underwent assessment in two different trials, maximal (MX) and 50% submaximal (SMX), with 20 minutes of rest between trials. Each trial consisted of 4 tests, each of 3 repetitions, at angular speeds of 30, 180, 30, and 180 degrees/s, respectively, and 1 test of 15 repetitions at 240 degrees/s. Our findings confirmed the ability of CV to detect a high percentage of sincere efforts: at 30 degrees/s Sensitivity (Sns)=100% and Specificity (Spc)=70%; at 180 degrees/s Sns=75%, Spc=95%. The 3 new indices proposed here showed high Sns and Spc, generally better than those of CV. CVM showed at 180 degrees/s Sns=90% and Spc=100%, while at 30 degrees/s Sns=90%, Spc=75%. PRR was the best index, identifying all the efforts except one (Sns=100%, Spc=95%). The CCR coefficient showed Sns and Spc values both of 90%.
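The coefficient of variation and the sensitivity/specificity figures quoted above follow from simple formulas, sketched here under stated assumptions. The peak-torque values and the counts are hypothetical, chosen only so that the second call reproduces a 100% / 70% pair with 20 trials per condition. Which condition counts as "positive" is not stated in the abstract and is left to the caller.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    return 100.0 * stdev(values) / mean(values)


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical peak torques (N*m) from three repetitions of one test.
print(coefficient_of_variation([152.0, 148.0, 150.0]))

# Hypothetical classification counts for 20 + 20 trials.
print(sensitivity_specificity(tp=20, fn=0, tn=14, fp=6))
```

A low CV across repetitions indicates consistent peak torque, which is the rationale for using it as a marker of sincere maximal effort.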
Application of Temperature Sensitivities During Iterative Strain-Gage Balance Calibration Analysis
NASA Technical Reports Server (NTRS)
Ulbrich, N.
2011-01-01
A new method is discussed that may be used to correct wind tunnel strain-gage balance load predictions for the influence of residual temperature effects at the location of the strain-gages. The method was designed for the iterative analysis technique that is used in the aerospace testing community to predict balance loads from strain-gage outputs during a wind tunnel test. The new method implicitly applies temperature corrections to the gage outputs during the load iteration process. Therefore, it can use uncorrected gage outputs directly as input for the load calculations. The new method is applied in several steps. First, balance calibration data is analyzed in the usual manner assuming that the balance temperature was kept constant during the calibration. Then, the temperature difference relative to the calibration temperature is introduced as a new independent variable for each strain-gage output. Therefore, sensors must exist near the strain-gages so that the required temperature differences can be measured during the wind tunnel test. In addition, the format of the regression coefficient matrix needs to be extended so that it can support the new independent variables. In the next step, the extended regression coefficient matrix of the original calibration data is modified by using the manufacturer specified temperature sensitivity of each strain-gage as the regression coefficient of the corresponding temperature difference variable. Finally, the modified regression coefficient matrix is converted to a data reduction matrix that the iterative analysis technique needs for the calculation of balance loads. Original calibration data and modified check load data of NASA's MC60D balance are used to illustrate the new method.
Acevedo-Mendoza, Wilmer F; Buitrago Gómez, Diana Paola; Atehortua-Otero, Miguel Ángel; Páez, Miguel Ángel; Jiménez-Rincón, Manuela; Lagos-Grisales, Guillermo J; Rodríguez-Morales, Alfonso J
2017-03-01
Bacterial meningitis is an important cause of infectious neurological morbidity and mortality. Its incidence has decreased with the introduction of vaccination programmes against preventable agents. However, low-income and middle-income countries with poor access to health care still have a significant burden of the disease. Thus, the relationship between the Gini coefficient and H. influenzae and M. tuberculosis meningitis incidence in Colombia, during 2008-2011, was assessed. In this ecological study, the Gini coefficient was obtained from the Colombian Department of Statistics, incidence rates were calculated (cases/1,000,000 pop) and linear regressions were performed using the Gini coefficient, to assess the relationship between the latter and the incidence of meningitis. It was observed that when inequality increases in the Colombian departments, the incidence of meningitis also increases, with a significant association in the models (p<0.01) for both M. tuberculosis (r²=0.2382; p<0.001) and H. influenzae (r²=0.2509; p<0.001). This research suggests that high Gini coefficient values influence the incidence of Mycobacterium tuberculosis and Haemophilus influenzae meningitis, showing that social inequality is critical to disease occurrence. Early detection, supervised treatment, vaccination coverage, access to health care are efficient control strategies.
Rovadoscki, Gregori A; Petrini, Juliana; Ramirez-Diaz, Johanna; Pertile, Simone F N; Pertille, Fábio; Salvian, Mayara; Iung, Laiza H S; Rodriguez, Mary Ana P; Zampar, Aline; Gaya, Leila G; Carvalho, Rachel S B; Coelho, Antonio A D; Savino, Vicente J M; Coutinho, Luiz L; Mourão, Gerson B
2016-09-01
Repeated measures from the same individual have been analyzed by using repeatability and finite dimension models under univariate or multivariate analyses. However, in the last decade, the use of random regression models for genetic studies with longitudinal data have become more common. Thus, the aim of this research was to estimate genetic parameters for body weight of four experimental chicken lines by using univariate random regression models. Body weight data from hatching to 84 days of age (n = 34,730) from four experimental free-range chicken lines (7P, Caipirão da ESALQ, Caipirinha da ESALQ and Carijó Barbado) were used. The analysis model included the fixed effects of contemporary group (gender and rearing system), fixed regression coefficients for age at measurement, and random regression coefficients for permanent environmental effects and additive genetic effects. Heterogeneous variances for residual effects were considered, and one residual variance was assigned for each of six subclasses of age at measurement. Random regression curves were modeled by using Legendre polynomials of the second and third orders, with the best model chosen based on the Akaike Information Criterion, Bayesian Information Criterion, and restricted maximum likelihood. Multivariate analyses under the same animal mixed model were also performed for the validation of the random regression models. The Legendre polynomials of second order were better for describing the growth curves of the lines studied. Moderate to high heritabilities (h(2) = 0.15 to 0.98) were estimated for body weight between one and 84 days of age, suggesting that selection for body weight at all ages can be used as a selection criteria. Genetic correlations among body weight records obtained through multivariate analyses ranged from 0.18 to 0.96, 0.12 to 0.89, 0.06 to 0.96, and 0.28 to 0.96 in 7P, Caipirão da ESALQ, Caipirinha da ESALQ, and Carijó Barbado chicken lines, respectively. 
Results indicate that genetic gain for body weight can be achieved by selection. Also, selection for body weight at 42 days of age can be maintained as a selection criterion. © 2016 Poultry Science Association Inc.
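The Legendre covariates used in random regression models of this kind can be generated as in the sketch below. It assumes that age is rescaled linearly to [-1, 1] over 0 to 84 days and that unnormalized polynomials are used. The abstract does not state the scaling, and a normalized variant (each polynomial of order k multiplied by sqrt((2k + 1) / 2)) is another common choice. The function name is hypothetical.

```python
def legendre_covariates(age, age_min, age_max, order):
    """Return [P0(x), ..., P_order(x)] for age rescaled to x in [-1, 1]."""
    x = 2.0 * (age - age_min) / (age_max - age_min) - 1.0
    values = [1.0, x]  # P0 and P1
    for n in range(1, order):
        # Bonnet recurrence: (n + 1) P_{n+1} = (2n + 1) x P_n - n P_{n-1}
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[: order + 1]


# Second-order covariates at hatching, mid-period and 84 days.
for age in (0, 42, 84):
    print(age, legendre_covariates(age, 0, 84, order=2))
```

Each animal's random regression coefficients multiply these covariates, so a second-order model describes its growth curve with three coefficients per random effect.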
NASA Astrophysics Data System (ADS)
Bloomfield, J. P.; Allen, D. J.; Griffiths, K. J.
2009-06-01
Summary: Linear regression methods can be used to quantify geological controls on baseflow index (BFI). This is illustrated using an example from the Thames Basin, UK. Two approaches have been adopted. The areal extents of geological classes based on lithostratigraphic and hydrogeological classification schemes have been correlated with BFI for 44 'natural' catchments from the Thames Basin. When regression models that include a constant term are built using lithostratigraphic classes, the model is shown to have some physical meaning and the relative influence of the different geological classes on BFI can be quantified. For example, the regression constants for two such models, 0.64 and 0.69, are consistent with the mean observed BFI (0.65) for the Thames Basin, and the signs and relative magnitudes of the regression coefficients for each of the lithostratigraphic classes are consistent with the hydrogeology of the Basin. In addition, regression coefficients for the lithostratigraphic classes scale linearly with estimates of log10 hydraulic conductivity for each lithological class. When a regression is built using a hydrogeological classification scheme with no constant term, the model does not have any physical meaning, but it has a relatively high adjusted R² value and, because of the continuous coverage of the hydrogeological classification scheme, the model can be used for predictive purposes. A model calibrated on the 44 'natural' catchments and using four hydrogeological classes (low-permeability surficial deposits, consolidated aquitards, fractured aquifers and intergranular aquifers) is shown to perform as well as a model based on a hydrology of soil types (BFIHOST) scheme in predicting BFI in the Thames Basin. Validation of this model using 110 other 'variably impacted' catchments in the Basin shows that there is a correlation between modelled and observed BFI. Where the observed BFI is significantly higher than the modelled BFI, the deviations can be explained by an exogenous factor, catchment urban area. It is inferred that this may be due to influences from sewage discharge, mains leakage, and leakage from septic tanks.
Sullivan, Sarah; Lewis, Glyn; Mohr, Christine; Herzig, Daniela; Corcoran, Rhiannon; Drake, Richard; Evans, Jonathan
2014-01-01
There is some cross-sectional evidence that theory of mind ability is associated with social functioning in those with psychosis but the direction of this relationship is unknown. This study investigates the longitudinal association between both theory of mind and psychotic symptoms and social functioning outcome in first-episode psychosis. Fifty-four people with first-episode psychosis were followed up at 6 and 12 months. Random effects regression models were used to estimate the stability of theory of mind over time and the association between baseline theory of mind and psychotic symptoms and social functioning outcome. Neither baseline theory of mind ability (regression coefficients: Hinting test 1.07 95% CI -0.74, 2.88; Visual Cartoon test -2.91 95% CI -7.32, 1.51) nor baseline symptoms (regression coefficients: positive symptoms -0.04 95% CI -1.24, 1.16; selected negative symptoms -0.15 95% CI -2.63, 2.32) were associated with social functioning outcome. There was evidence that theory of mind ability was stable over time, (regression coefficients: Hinting test 5.92 95% CI -6.66, 8.92; Visual Cartoon test score 0.13 95% CI -0.17, 0.44). Neither baseline theory of mind ability nor psychotic symptoms are associated with social functioning outcome. Further longitudinal work is needed to understand the origin of social functioning deficits in psychosis.
Zhong-xiang, Feng; Shi-sheng, Lu; Wei-hua, Zhang; Nan-nan, Zhang
2014-01-01
In order to build a combined model which can meet the variation rule of death toll data for road traffic accidents and can reflect the influence of multiple factors on traffic accidents and improve prediction accuracy for accidents, the Verhulst model was built based on the number of death tolls for road traffic accidents in China from 2002 to 2011; and car ownership, population, GDP, highway freight volume, highway passenger transportation volume, and highway mileage were chosen as the factors to build the death toll multivariate linear regression model. Then the two models were combined to be a combined prediction model which has weight coefficient. Shapley value method was applied to calculate the weight coefficient by assessing contributions. Finally, the combined model was used to recalculate the number of death tolls from 2002 to 2011, and the combined model was compared with the Verhulst and multivariate linear regression models. The results showed that the new model could not only characterize the death toll data characteristics but also quantify the degree of influence to the death toll by each influencing factor and had high accuracy as well as strong practicability. PMID:25610454
Use of Empirical Estimates of Shrinkage in Multiple Regression: A Caution.
ERIC Educational Resources Information Center
Kromrey, Jeffrey D.; Hines, Constance V.
1995-01-01
The accuracy of four empirical techniques to estimate shrinkage in multiple regression was studied through Monte Carlo simulation. None of the techniques provided unbiased estimates of the population squared multiple correlation coefficient, but the normalized jackknife and bootstrap techniques demonstrated marginally acceptable performance with…
Enhance-Synergism and Suppression Effects in Multiple Regression
ERIC Educational Resources Information Center
Lipovetsky, Stan; Conklin, W. Michael
2004-01-01
Relations between pairwise correlations and the coefficient of multiple determination in regression analysis are considered. The conditions for the occurrence of enhance-synergism and suppression effects when multiple determination becomes bigger than the total of squared correlations of the dependent variable with the regressors are discussed. It…
Statistical downscaling modeling with quantile regression using lasso to estimate extreme rainfall
NASA Astrophysics Data System (ADS)
Santri, Dewi; Wigena, Aji Hamim; Djuraidah, Anik
2016-02-01
Rainfall is one of the most variable climatic elements, and extreme rainfall in particular has many negative impacts. Methods are therefore required to minimize the damage that may occur. So far, global circulation models (GCMs) are the best method for forecasting global climate change, including extreme rainfall. Statistical downscaling (SD) is a technique for developing the relationship between GCM output, as global-scale independent variables, and rainfall, as a local-scale response variable. GCM output is difficult to assess against observations because it is high-dimensional and its variables are multicollinear. The methods commonly used to handle this problem are principal components analysis (PCA) and partial least squares regression. A newer method that can be used is the lasso, which has the advantage of simultaneously controlling the variance of the fitted coefficients and performing automatic variable selection. Quantile regression is a method that can be used to detect extreme rainfall at both the dry and wet extremes. The objective of this study is to model SD using quantile regression with the lasso to predict extreme rainfall in Indramayu. The results showed that extreme rainfall (extreme wet in January, February and December) in Indramayu could be predicted properly by the model at the 90th quantile.
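The combination of quantile regression and the lasso described above can be sketched as an objective function: the pinball (check) loss at quantile level tau plus an L1 penalty on the slopes. This is only an illustrative sketch; the function names, the penalty weight `lam`, and the choice not to penalize the intercept are assumptions, not details taken from the study.

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss of one residual at quantile level tau (0 < tau < 1)."""
    if residual >= 0:
        return tau * residual
    return (tau - 1.0) * residual


def lasso_quantile_objective(X, y, beta, intercept, tau, lam):
    """Mean pinball loss plus an L1 penalty on the slope coefficients.

    X is a list of rows, y a list of responses, beta a list of slopes.
    The intercept is left unpenalized (an assumption of this sketch).
    """
    total = 0.0
    for row, target in zip(X, y):
        fitted = intercept + sum(b * x for b, x in zip(beta, row))
        total += pinball_loss(target - fitted, tau)
    return total / len(y) + lam * sum(abs(b) for b in beta)
```

At tau = 0.9 an under-prediction costs nine times as much as an over-prediction of the same size, which is what pulls the fitted surface toward the wet extreme.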
Siebers, Nina; Kruse, Jens; Eckhardt, Kai-Uwe; Hu, Yongfeng; Leinweber, Peter
2012-07-01
Cadmium (Cd) has a high toxicity and resolving its speciation in soil is challenging but essential for estimating the environmental risk. In this study partial least-square (PLS) regression was tested for its capability to deconvolute Cd L₃-edge X-ray absorption near-edge structure (XANES) spectra of multi-compound mixtures. For this, a library of Cd reference compound spectra and a spectrum of a soil sample were acquired. A good coefficient of determination (R²) of Cd compounds in mixtures was obtained for the PLS model using binary and ternary mixtures of various Cd reference compounds proving the validity of this approach. In order to describe complex systems like soil, multi-compound mixtures of a variety of Cd compounds must be included in the PLS model. The obtained PLS regression model was then applied to a highly Cd-contaminated soil revealing Cd₃(PO₄)₂ (36.1%), Cd(NO₃)₂·4H₂O (24.5%), Cd(OH)₂ (21.7%), CdCO₃ (17.1%) and CdCl₂ (0.4%). These preliminary results proved that PLS regression is a promising approach for a direct determination of Cd speciation in the solid phase of a soil sample.
Soares, M P; Gaya, L G; Lorentz, L H; Batistel, F; Rovadoscki, G A; Ticiani, E; Zabot, V; Di Domenico, Q; Madureira, A P; Pértile, S F N
2011-09-06
Artificial insemination has been used to improve production in Brazilian dairy cattle; however, this can lead to problems due to increased inbreeding. To evaluate the effect of the magnitude of inbreeding coefficients on predicted transmitting abilities (PTAs) for milk traits of Holstein and Jersey breeds, data on 392 Holstein and 92 Jersey sires used in Brazil were tabulated. The second-degree polynomial equations and points of maximum or minimal response were estimated to establish the regression equation of the variables as a function of the inbreeding coefficients. The mean inbreeding coefficient of the Holstein bulls was 5.10%; this did not significantly affect the PTA for percent milk fat, protein percentage and protein (P = 0.479, 0.058 and 0.087, respectively). However, the PTAs for milk yield and fat decreased significantly after reaching inbreeding coefficients of 6.43 (P = 0.034) and 5.75 (P = 0.007), respectively. The mean inbreeding coefficient of Jersey bulls was 6.45%; the PTAs for milk yield, fat and protein, in pounds, decreased significantly after reaching inbreeding coefficients of 15.04, 9.83 and 12.82% (P < 0.001, P = 0.002, and P = 0.001, respectively). The linear regression was only significant for fat and protein percentages in the Jersey breed (P = 0.002 and P = 0.005, respectively). The PTAs of Holstein sires were more affected by smaller magnitudes of inbreeding coefficients than those of Jersey sires. It is necessary to monitor the inbreeding coefficients of sires used for artificial insemination in breeding schemes in Brazil, since the low genetic variability of the available sires may lead to reduced production.
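The "points of maximum or minimal response" of a second-degree polynomial follow directly from its coefficients. A minimal sketch, assuming a fitted equation of the form y = b0 + b1·F + b2·F² with F the inbreeding coefficient; the function name and the example coefficients are hypothetical and are not values from the study.

```python
def quadratic_extremum(b0, b1, b2):
    """Return (x*, y*, kind) for y = b0 + b1*x + b2*x**2.

    The extremum lies where the derivative b1 + 2*b2*x is zero.
    """
    if b2 == 0:
        raise ValueError("b2 must be non-zero for a second-degree polynomial")
    x_star = -b1 / (2.0 * b2)
    y_star = b0 + b1 * x_star + b2 * x_star ** 2
    kind = "maximum" if b2 < 0 else "minimum"
    return x_star, y_star, kind


# Hypothetical coefficients: the response rises until x = 2 and declines afterwards.
print(quadratic_extremum(0.0, 4.0, -1.0))
```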
Analyzing degradation data with a random effects spline regression model
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fugate, Michael Lynn; Hamada, Michael Scott; Weaver, Brian Phillip
2017-03-17
This study proposes using a random effects spline regression model to analyze degradation data. Spline regression avoids having to specify a parametric function for the true degradation of an item. A distribution for the spline regression coefficients captures the variation of the true degradation curves from item to item. We illustrate the proposed methodology with a real example using a Bayesian approach. The Bayesian approach allows prediction of degradation of a population over time and estimation of reliability is easy to perform.
Influence of soil pH on the sorption of ionizable chemicals: modeling advances.
Franco, Antonio; Fu, Wenjing; Trapp, Stefan
2009-03-01
The soil-water distribution coefficient of ionizable chemicals (K_d) depends on the soil acidity, mainly because the pH governs speciation. Using pH-specific K_d values normalized to organic carbon (K_OC) from the literature, a method was developed to estimate the K_OC of monovalent organic acids and bases. The regression considers pH-dependent speciation and species-specific partition coefficients, calculated from the dissociation constant (pK_a) and the octanol-water partition coefficient of the neutral molecule (log P_n). Probably because of the lower pH near the organic colloid-water interface, the optimal pH to model dissociation was lower than the bulk soil pH. The knowledge of the soil pH allows calculation of the fractions of neutral and ionic molecules in the system, thus improving the existing regression for acids. The same approach was not successful with bases, for which the impact of pH on the total sorption is contrasting. In fact, the shortcomings of the model assumptions affect the predictive power for acids and for bases differently. We evaluated accuracy and limitations of the regressions for their use in the environmental fate assessment of ionizable chemicals.
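The step from soil pH to the fractions of neutral and ionic molecules is the standard Henderson-Hasselbalch relation for a monovalent acid or base. The sketch below shows only that step; the function name is an assumption, and the regression that turns these fractions into a sorption estimate is not reproduced here.

```python
def neutral_fraction(ph, pka, kind):
    """Fraction of a monovalent acid or base present as the neutral species.

    acid: HA <-> A- + H+   -> neutral fraction = 1 / (1 + 10**(pH - pKa))
    base: BH+ <-> B + H+   -> neutral fraction = 1 / (1 + 10**(pKa - pH))
    """
    if kind == "acid":
        return 1.0 / (1.0 + 10.0 ** (ph - pka))
    if kind == "base":
        return 1.0 / (1.0 + 10.0 ** (pka - ph))
    raise ValueError("kind must be 'acid' or 'base'")


# The ionic fraction is the complement of the neutral fraction.
f_neutral = neutral_fraction(5.0, 4.0, "acid")
f_ionic = 1.0 - f_neutral
```

Because the abstract reports that the optimal pH for modeling dissociation was lower than the bulk soil pH, the `ph` argument would be that adjusted value rather than the measured one.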
The Bayesian group lasso for confounded spatial data
Hefley, Trevor J.; Hooten, Mevin B.; Hanks, Ephraim M.; Russell, Robin E.; Walsh, Daniel P.
2017-01-01
Generalized linear mixed models for spatial processes are widely used in applied statistics. In many applications of the spatial generalized linear mixed model (SGLMM), the goal is to obtain inference about regression coefficients while achieving optimal predictive ability. When implementing the SGLMM, multicollinearity among covariates and the spatial random effects can make computation challenging and influence inference. We present a Bayesian group lasso prior with a single tuning parameter that can be chosen to optimize predictive ability of the SGLMM and jointly regularize the regression coefficients and spatial random effect. We implement the group lasso SGLMM using efficient Markov chain Monte Carlo (MCMC) algorithms and demonstrate how multicollinearity among covariates and the spatial random effect can be monitored as a derived quantity. To test our method, we compared several parameterizations of the SGLMM using simulated data and two examples from plant ecology and disease ecology. In all examples, problematic levels of multicollinearity occurred and influenced sampling efficiency and inference. We found that the group lasso prior resulted in roughly twice the effective sample size for MCMC samples of regression coefficients and can have higher and less variable predictive accuracy based on out-of-sample data when compared to the standard SGLMM.
The Use of Structure Coefficients to Address Multicollinearity in Sport and Exercise Science
ERIC Educational Resources Information Center
Yeatts, Paul E.; Barton, Mitch; Henson, Robin K.; Martin, Scott B.
2017-01-01
A common practice in general linear model (GLM) analyses is to interpret regression coefficients (e.g., standardized β weights) as indicators of variable importance. However, focusing solely on standardized beta weights may provide limited or erroneous information. For example, β weights become increasingly unreliable when predictor variables are…
Delgado, J; Liao, J C
1992-01-01
The methodology previously developed for determining the Flux Control Coefficients [Delgado & Liao (1992) Biochem. J. 282, 919-927] is extended to the calculation of metabolite Concentration Control Coefficients. It is shown that the transient metabolite concentrations are related by a few algebraic equations, attributed to mass balance, stoichiometric constraints, quasi-equilibrium or quasi-steady states, and kinetic regulations. The coefficients in these relations can be estimated using linear regression, and can be used to calculate the Control Coefficients. The theoretical basis and two examples are discussed. Although the methodology is derived based on the linear approximation of enzyme kinetics, it yields reasonably good estimates of the Control Coefficients for systems with non-linear kinetics. PMID:1497632
Revisiting crash spatial heterogeneity: A Bayesian spatially varying coefficients approach.
Xu, Pengpeng; Huang, Helai; Dong, Ni; Wong, S C
2017-01-01
This study was performed to investigate the spatially varying relationships between crash frequency and related risk factors. A Bayesian spatially varying coefficients model was elaborately introduced as a methodological alternative to simultaneously account for the unstructured and spatially structured heterogeneity of the regression coefficients in predicting crash frequencies. The proposed method was appealing in that the parameters were modeled via a conditional autoregressive prior distribution, which involved a single set of random effects and a spatial correlation parameter with extreme values corresponding to pure unstructured or pure spatially correlated random effects. A case study using a three-year crash dataset from the Hillsborough County, Florida, was conducted to illustrate the proposed model. Empirical analysis confirmed the presence of both unstructured and spatially correlated variations in the effects of contributory factors on severe crash occurrences. The findings also suggested that ignoring spatially structured heterogeneity may result in biased parameter estimates and incorrect inferences, while assuming the regression coefficients to be spatially clustered only is probably subject to the issue of over-smoothness. Copyright © 2016 Elsevier Ltd. All rights reserved.
Miller, Justin B; Axelrod, Bradley N; Schutte, Christian
2012-01-01
The recent release of the Wechsler Memory Scale Fourth Edition contains many improvements from a theoretical and administration perspective, including demographic corrections using the Advanced Clinical Solutions. Although the administration time has been reduced from previous versions, a shortened version may be desirable in certain situations given practical time limitations in clinical practice. The current study evaluated two- and three-subtest estimations of demographically corrected Immediate and Delayed Memory index scores using both simple arithmetic prorating and regression models. All estimated values were significantly associated with observed index scores. Use of Lin's Concordance Correlation Coefficient as a measure of agreement showed a high degree of precision and virtually zero bias in the models, although the regression models showed a stronger association than prorated models. Regression-based models proved to be more accurate than prorated estimates with less dispersion around observed values, particularly when using three subtest regression models. Overall, the present research shows strong support for estimating demographically corrected index scores on the WMS-IV in clinical practice with an adequate performance using arithmetically prorated models and a stronger performance using regression models to predict index scores.
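Lin's concordance correlation coefficient, used above as the agreement measure, penalizes both scatter and systematic shift between estimated and observed scores. A minimal sketch using population (1/n) moments as in Lin's original definition; the function name and the example data are assumptions.

```python
def lins_ccc(x, y):
    """Lin's concordance correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # The squared mean difference in the denominator is the bias penalty.
    return 2.0 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Perfectly correlated scores with a constant offset of one point:
# Pearson's r would be 1, but the concordance is lower because of the shift.
print(lins_ccc([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))
```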
Kandala, Sridhar; Nolan, Dan; Laumann, Timothy O.; Power, Jonathan D.; Adeyemo, Babatunde; Harms, Michael P.; Petersen, Steven E.; Barch, Deanna M.
2016-01-01
Like all resting-state functional connectivity data, the data from the Human Connectome Project (HCP) are adversely affected by structured noise artifacts arising from head motion and physiological processes. Functional connectivity estimates (Pearson's correlation coefficients) were inflated for high-motion time points and for high-motion participants. This inflation occurred across the brain, suggesting the presence of globally distributed artifacts. The degree of inflation was further increased for connections between nearby regions compared with distant regions, suggesting the presence of distance-dependent spatially specific artifacts. We evaluated several denoising methods: censoring high-motion time points, motion regression, the FMRIB independent component analysis-based X-noiseifier (FIX), and mean grayordinate time series regression (MGTR; as a proxy for global signal regression). The results suggest that FIX denoising reduced both types of artifacts, but left substantial global artifacts behind. MGTR significantly reduced global artifacts, but left substantial spatially specific artifacts behind. Censoring high-motion time points resulted in a small reduction of distance-dependent and global artifacts, eliminating neither type. All denoising strategies left differences between high- and low-motion participants, but only MGTR substantially reduced those differences. Ultimately, functional connectivity estimates from HCP data showed spatially specific and globally distributed artifacts, and the most effective approach to address both types of motion-correlated artifacts was a combination of FIX and MGTR. PMID:27571276
Temperature-viscosity models reassessed.
Peleg, Micha
2017-05-04
The temperature effect on viscosity of liquid and semi-liquid foods has been traditionally described by the Arrhenius equation, a few other mathematical models, and more recently by the WLF and VTF (or VFT) equations. The essence of the Arrhenius equation is that the viscosity is proportional to the absolute temperature's reciprocal and governed by a single parameter, namely, the energy of activation. However, if the absolute temperature in K in the Arrhenius equation is replaced by T + b where both T and the adjustable b are in °C, the result is a two-parameter model, which has superior fit to experimental viscosity-temperature data. This modified version of the Arrhenius equation is also mathematically equal to the WLF and VTF equations, which are known to be equal to each other. Thus, despite their dissimilar appearances all three equations are essentially the same model, and when used to fit experimental temperature-viscosity data render exactly the same very high regression coefficient. It is shown that three new hybrid two-parameter mathematical models, whose formulation bears little resemblance to any of the conventional models, can also have excellent fit with r² ∼ 1. This is demonstrated by comparing the various models' regression coefficients to published viscosity-temperature relationships of 40% sucrose solution, soybean oil, and 70°Bx pear juice concentrate at different temperature ranges. Also compared are reconstructed temperature-viscosity curves using parameters calculated directly from 2 or 3 data points and fitted curves obtained by nonlinear regression using a larger number of experimental viscosity measurements.
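The claimed equivalence of the three models can be checked in a few lines. The notation below (reference viscosity and temperature, and the constants c, B, T_0, C_1, C_2, with natural logarithms throughout) is assumed for illustration and is not necessarily the notation used in the paper.

```latex
\begin{align*}
\text{Modified Arrhenius } (T \text{ in } ^\circ\mathrm{C}):\quad
  \ln\frac{\eta}{\eta_{\mathrm{ref}}}
    &= c\left[\frac{1}{T+b}-\frac{1}{T_{\mathrm{ref}}+b}\right] \\[4pt]
\text{VTF } (T_K = T + 273.15):\quad
  \ln\frac{\eta}{\eta_{\mathrm{ref}}}
    &= B\left[\frac{1}{T_K-T_0}-\frac{1}{T_{K,\mathrm{ref}}-T_0}\right]
    \quad\Longrightarrow\quad b = 273.15 - T_0,\;\; c = B \\[4pt]
\text{WLF}:\quad
  \ln\frac{\eta}{\eta_{\mathrm{ref}}}
    &= -\frac{C_1\,(T-T_{\mathrm{ref}})}{C_2+(T-T_{\mathrm{ref}})}
    \quad\text{with}\quad C_2 = T_{\mathrm{ref}}+b,\;\; C_1 = \frac{c}{T_{\mathrm{ref}}+b}
\end{align*}
```

The third line follows from the first by putting the two fractions over the common denominator (T + b)(T_ref + b) and writing T + b as (T_ref + b) + (T − T_ref), so the three forms describe the same two-parameter curve once the reference point is fixed.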
The Relationship Between Surface Curvature and Abdominal Aortic Aneurysm Wall Stress.
de Galarreta, Sergio Ruiz; Cazón, Aitor; Antón, Raúl; Finol, Ender A
2017-08-01
The maximum diameter (MD) criterion is the most important factor when predicting risk of rupture of abdominal aortic aneurysms (AAAs). An elevated wall stress has also been linked to a high risk of aneurysm rupture, yet computing AAA wall stress is uncommon in clinical practice. The purpose of this study is to assess whether other characteristics of the AAA geometry are statistically correlated with wall stress. Using in-house segmentation and meshing algorithms, 30 patient-specific AAA models were generated for finite element analysis (FEA). These models were subsequently used to estimate wall stress and maximum diameter and to evaluate the spatial distributions of wall thickness, cross-sectional diameter, mean curvature, and Gaussian curvature. Data analysis consisted of statistical correlations of the aforementioned geometry metrics with wall stress for the 30 AAA inner and outer wall surfaces. In addition, a linear regression analysis was performed with all the AAA wall surfaces to quantify the relationship of the geometric indices with wall stress. These analyses indicated that while all the geometry metrics have statistically significant correlations with wall stress, the local mean curvature (LMC) exhibits the highest average Pearson's correlation coefficient for both inner and outer wall surfaces. The linear regression analysis revealed coefficients of determination for the outer and inner wall surfaces of 0.712 and 0.516, respectively, with LMC having the largest effect on the linear regression equation with wall stress. This work underscores the importance of evaluating AAA mean wall curvature as a potential surrogate for wall stress.
Xue, Dan; Yin, Jingyuan
2014-05-01
In this study, we explored the potential applications of the Ozone Monitoring Instrument (OMI) satellite sensor in air pollution research. The OMI planetary boundary layer sulfur dioxide (SO2_PBL) column density and daily average surface SO2 concentration of Shanghai from 2004 to 2012 were analyzed. After several consecutive years of increase, the surface SO2 concentration finally declined in 2007. It was higher in winter than in other seasons. The correlation coefficient between daily average surface SO2 concentration and SO2_PBL was only 0.316. Nevertheless, SO2_PBL was found to be a highly significant predictor of the surface SO2 concentration in the simple regression model. Five meteorological factors were considered in this study; among them, temperature, dew point, relative humidity, and wind speed were negatively correlated with surface SO2 concentration, while pressure was positively correlated. Furthermore, dew point was found to be a more effective predictor than temperature. When these meteorological factors were used in multiple regression, the determination coefficient reached 0.379. The relationship between the surface SO2 concentration and the meteorological factors was seasonally dependent: the regression model performed better in summer and autumn than in spring and winter. The surface SO2 concentration prediction method proposed in this study can be easily adapted for other regions and is especially useful for those having no operational air pollution forecasting services or having sparse ground monitoring networks.
Qidwai, Tabish; Yadav, Dharmendra K; Khan, Feroz; Dhawan, Sangeeta; Bhakuni, R S
2012-01-01
This work presents the development of a quantitative structure activity relationship (QSAR) model to predict the antimalarial activity of artemisinin derivatives. The structures of the molecules are represented by chemical descriptors that encode topological, geometric, and electronic structure features. Screening through the QSAR model suggested that compounds A24, A24a, A53, A54, A62 and A64 possess significant antimalarial activity. A linear model was developed by the multiple linear regression method to link structures to their reported antimalarial activity. The correlation in terms of regression coefficient (r²) was 0.90 and the prediction accuracy of the model in terms of cross validation regression coefficient (rCV²) was 0.82. This study indicates that chemical properties viz., atom count (all atoms), connectivity index (order 1, standard), ring count (all rings), shape index (basic kappa, order 2), and solvent accessibility surface area are well correlated with antimalarial activity. The docking study showed high binding affinity of the predicted active compounds against the antimalarial target Plasmepsins (Plm-II). Further studies for oral bioavailability, ADMET and toxicity risk assessment suggest that compounds A24, A24a, A53, A54, A62 and A64 exhibit marked antimalarial activity comparable to standard antimalarial drugs. Later, one of the predicted active compounds, A64, was chemically synthesized, its structure was elucidated by NMR, and it was tested in vivo in mice infected with a multidrug-resistant strain of Plasmodium yoelii nigeriensis. The experimental results obtained agreed well with the predicted values.
Teaching Students Not to Dismiss the Outermost Observations in Regressions
ERIC Educational Resources Information Center
Kasprowicz, Tomasz; Musumeci, Jim
2015-01-01
One econometric rule of thumb is that greater dispersion in observations of the independent variable improves estimates of regression coefficients and therefore produces better results, i.e., lower standard errors of the estimates. Nevertheless, students often seem to mistrust precisely the observations that contribute the most to this greater…
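The rule of thumb can be illustrated with the textbook standard error of the slope in simple linear regression, SE(b1) = s / sqrt(Sxx), where Sxx is the sum of squared deviations of the independent variable from its mean. The function name and the example designs below are assumptions for illustration only.

```python
import math


def slope_standard_error(xs, residual_sd):
    """Standard error of the OLS slope for a given design and residual SD."""
    mean_x = sum(xs) / len(xs)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return residual_sd / math.sqrt(sxx)


# Same number of observations and same noise level, but the second design
# places its observations farther from the mean.
narrow = slope_standard_error([-1.0, 1.0], 1.0)
wide = slope_standard_error([-2.0, 2.0], 1.0)
print(narrow, wide)
```

Doubling the spread of the observations halves the standard error here, which is why the outermost observations carry the most information about the slope.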
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mele, L.M.; Prodan, P.F.
1983-04-01
Hydrologic data were collected and analyzed for three coal refuse disposal sites in southern Illinois. The disposal sites were associated with underground mines and consisted of piles of coarse waste (gob) and slurry areas where fine waste rejected from coal washing was deposited. Prereclamation data were available for the Superior washer site in Macoupin County and the New Kathleen site in Perry County. Post-reclamation data were available for the Staunton 1 site in Macoupin County and the New Kathleen site. Data analyzed from each phase (i.e., pre- or post-reclamation) were limited to one year. Storm event runoff coefficients were calculated for each site. Average runoff coefficients were compared for sites within the same reclamation phase to determine the effects of topographical parameters such as gob pile slope and percentage of drainage basin covered by the gob pile. Average runoff coefficients were then compared for pre- and post-reclamation data. Multiple regression analyses were performed on rainfall-runoff data for each site to determine the significance of independent variables other than rainfall in determining runoff. A generalized regression equation corrected data for topographical differences and included only those independent variables that were significant at all sites. Regression coefficients were compared for pre- and post-reclamation sites. The results of rainfall-runoff analysis indicate that the runoff coefficient increases because of reclamation. It is hypothesized that this effect is due to the placement of a soil cover that is less permeable than gob or slurry and occurs despite reduction in slope and the establishment of vegetation.
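A storm event runoff coefficient is conventionally the ratio of runoff depth to rainfall depth for that event, and a site average is the mean over events. The sketch below assumes that conventional definition; the report's exact procedure is not given in the abstract, and the function names and event values are hypothetical.

```python
def runoff_coefficient(runoff_depth, rainfall_depth):
    """Runoff coefficient of one storm event (both depths in the same unit)."""
    if rainfall_depth <= 0:
        raise ValueError("rainfall depth must be positive")
    return runoff_depth / rainfall_depth


def average_runoff_coefficient(events):
    """Mean coefficient over (runoff_depth, rainfall_depth) pairs for a site."""
    coefficients = [runoff_coefficient(q, p) for q, p in events]
    return sum(coefficients) / len(coefficients)


# Hypothetical events, depths in millimetres.
print(average_runoff_coefficient([(5.0, 25.0), (10.0, 25.0)]))
```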
van Mil, Anke C C M; Greyling, Arno; Zock, Peter L; Geleijnse, Johanna M; Hopman, Maria T; Mensink, Ronald P; Reesink, Koen D; Green, Daniel J; Ghiadoni, Lorenzo; Thijssen, Dick H
2016-09-01
Brachial artery flow-mediated dilation (FMD) is a popular technique to examine endothelial function in humans. Identifying volunteer and methodological factors related to variation in FMD is important to improve measurement accuracy and applicability. Volunteer-related and methodology-related parameters were collected in 672 volunteers from eight affiliated centres worldwide who underwent repeated measures of FMD. All centres adopted contemporary expert-consensus guidelines for FMD assessment. After calculating the coefficient of variation (%) of the FMD for each individual, we constructed quartiles (n = 168 per quartile). Based on two regression models (volunteer-related factors and methodology-related factors), statistically significant components of these two models were added to a final regression model (calculated as β-coefficient and R). This allowed us to identify factors that independently contributed to the variation in FMD%. Median coefficient of variation was 17.5%, with healthy volunteers demonstrating a coefficient of variation of 9.3%. Regression models revealed age (β = 0.248, P < 0.001), hypertension (β = 0.104, P < 0.001), dyslipidemia (β = 0.331, P < 0.001), time between measurements (β = 0.318, P < 0.001), lab experience (β = -0.133, P < 0.001) and baseline FMD% (β = 0.082, P < 0.05) as contributors to the coefficient of variation. After including all significant factors in the final model, we found that time between measurements, hypertension, baseline FMD% and lab experience with FMD independently predicted brachial artery variability (total R = 0.202). Although FMD% showed good reproducibility, larger variation was observed in conditions with longer time between measurements, hypertension, less experience and lower baseline FMD%. Accounting for these factors may improve FMD% variability.
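The per-individual coefficient of variation is the standard deviation of a volunteer's repeated FMD% measurements divided by their mean, expressed as a percentage. A minimal sketch, assuming the sample (n − 1) standard deviation; the abstract does not state which estimator was used, and the measurements shown are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Coefficient of variation in percent for repeated measurements."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("mean must be non-zero")
    # statistics.stdev is the sample standard deviation (n - 1 denominator).
    return 100.0 * statistics.stdev(values) / mean


# Two hypothetical FMD% measurements from one volunteer.
print(coefficient_of_variation([5.0, 7.0]))
```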
Regression Simulation Model. Appendix X. Users Manual,
1981-03-01
change as the prediction equations become refined. Whereas no notice will be provided when the changes are made, the programs will be modified such that... Appendix X, Regression Simulation Model, Users Manual, submitted to The Great River... regression analysis and to establish a prediction equation (model). The prediction equation contains the partial regression coefficients (B-weights) which
Separation in Logistic Regression: Causes, Consequences, and Control.
Mansournia, Mohammad Ali; Geroldinger, Angelika; Greenland, Sander; Heinze, Georg
2018-04-01
Separation is encountered in regression models with a discrete outcome (such as logistic regression) where the covariates perfectly predict the outcome. It is most frequent under the same conditions that lead to small-sample and sparse-data bias, such as presence of a rare outcome, rare exposures, highly correlated covariates, or covariates with strong effects. In theory, separation will produce infinite estimates for some coefficients. In practice, however, separation may be unnoticed or mishandled because of software limits in recognizing and handling the problem and in notifying the user. We discuss causes of separation in logistic regression and describe how common software packages deal with it. We then describe methods that remove separation, focusing on the same penalized-likelihood techniques used to address more general sparse-data problems. These methods improve accuracy, avoid software problems, and allow interpretation as Bayesian analyses with weakly informative priors. We discuss likelihood penalties, including some that can be implemented easily with any software package, and their relative advantages and disadvantages. We provide an illustration of ideas and methods using data from a case-control study of contraceptive practices and urinary tract infection.
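Why separation produces infinite estimates can be seen on a toy data set: when a covariate perfectly splits the outcomes, the logistic log-likelihood keeps rising as the coefficient grows, so no finite maximum exists. The sketch below uses a one-covariate model without intercept and, as one simple example of a likelihood penalty, a normal-prior (ridge-type) term. The data, the function names and the prior standard deviation are assumptions and do not come from the article.

```python
import math


def log_likelihood(beta, xs, ys):
    """Logistic log-likelihood for the model logit(p) = beta * x."""
    total = 0.0
    for x, y in zip(xs, ys):
        eta = beta * x
        # Numerically stable log(1 + exp(eta)).
        log1pexp = max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))
        total += y * eta - log1pexp
    return total


def penalized_log_likelihood(beta, xs, ys, prior_sd=2.5):
    """Log-likelihood plus a normal(0, prior_sd) log-prior, up to a constant."""
    return log_likelihood(beta, xs, ys) - beta ** 2 / (2.0 * prior_sd ** 2)


def completely_separated(xs, ys):
    """True if a single threshold on x perfectly splits y = 0 from y = 1."""
    zeros = [x for x, y in zip(xs, ys) if y == 0]
    ones = [x for x, y in zip(xs, ys) if y == 1]
    if not zeros or not ones:
        return False
    return max(zeros) < min(ones) or max(ones) < min(zeros)


xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]
for beta in (1.0, 5.0, 25.0):
    print(beta, log_likelihood(beta, xs, ys), penalized_log_likelihood(beta, xs, ys))
```

The unpenalized values creep toward zero without ever reaching a maximum, while the penalized values fall again for large coefficients, which is what makes a finite estimate possible.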
NASA Astrophysics Data System (ADS)
Sanchez Rivera, Yamil
The purpose of this study is to add to what we know about the affective domain and to create a valid instrument for future studies. The Motivation to Learn Science (MLS) Inventory is based on Krathwohl's Taxonomy of Affective Behaviors (Krathwohl et al., 1964). The results of the Exploratory Factor Analysis (EFA) and Confirmatory Factor Analysis (CFA) demonstrated that the MLS Inventory is a valid and reliable instrument. Therefore, the MLS Inventory is a uni-dimensional instrument composed of 9 items with convergent validity (no divergence). The instrument had a high Cronbach's alpha value of .898 in the EFA and .919 in the CFA. Factor loadings on the 9 items ranged from .617 to .800. Standardized regression weights ranged from .639 to .835 in the CFA. Various indices (RMSEA = .033; NFI = .987; GFI = .985; CFI = 1.000) demonstrated a good fit of the proposed model. Hierarchical linear modeling was used to statistically analyze the data, in which students' motivation to learn science scores (level-1) were nested within teachers (level-2). The analysis was geared toward identifying whether teachers' use of affective behavior (a level-2 classroom variable) was significantly related to students' MLS scores (level-1 criterion variable). Model testing proceeded in three phases: intercept-only model, means-as-outcome model, and a random-regression coefficient model. The intercept-only model revealed an intra-class correlation coefficient of .224 with an estimated reliability of .726. Therefore, the data suggested that only 22.4% of the variance in MLS scores is between classes and the remaining 77.6% is at the student level. Because of the significant variance in MLS scores (χ² = 62.756, p < .0001), teachers' TAB scores were added as a level-2 predictor. The regression coefficient was non-significant (p > .05). Therefore, the teachers' self-reported use of affective behaviors was not a significant predictor of students' motivation to learn science.
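The intra-class correlation from an intercept-only two-level model is the between-group variance divided by the total variance. The sketch below uses hypothetical variance components chosen only so that the ratio matches the reported .224; the actual component estimates are not given in the abstract, and the function name is an assumption.

```python
def intraclass_correlation(between_variance, within_variance):
    """ICC = tau00 / (tau00 + sigma^2) for an intercept-only two-level model."""
    total = between_variance + within_variance
    if total <= 0:
        raise ValueError("total variance must be positive")
    return between_variance / total


# Hypothetical components: 22.4% of the variance lies between classes.
icc = intraclass_correlation(0.224, 0.776)
print(icc, 1.0 - icc)
```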
Smith, S. Jerrod; Lewis, Jason M.; Graves, Grant M.
2015-09-28
Generalized-least-squares multiple-linear regression analysis was used to formulate regression relations between peak-streamflow frequency statistics and basin characteristics. Contributing drainage area was the only basin characteristic determined to be statistically significant for all percentage of annual exceedance probabilities and was the only basin characteristic used in regional regression equations for estimating peak-streamflow frequency statistics on unregulated streams in and near the Oklahoma Panhandle. The regression model pseudo-coefficient of determination, converted to percent, for the Oklahoma Panhandle regional regression equations ranged from about 38 to 63 percent. The standard errors of prediction and the standard model errors for the Oklahoma Panhandle regional regression equations ranged from about 84 to 148 percent and from about 76 to 138 percent, respectively. These errors were comparable to those reported for regional peak-streamflow frequency regression equations for the High Plains areas of Texas and Colorado. The root mean square errors for the Oklahoma Panhandle regional regression equations (ranging from 3,170 to 92,000 cubic feet per second) were less than the root mean square errors for the Oklahoma statewide regression equations (ranging from 18,900 to 412,000 cubic feet per second); therefore, the Oklahoma Panhandle regional regression equations produce more accurate peak-streamflow statistic estimates for the irrigated period of record in the Oklahoma Panhandle than do the Oklahoma statewide regression equations. The regression equations developed in this report are applicable to streams that are not substantially affected by regulation, impoundment, or surface-water withdrawals. These regression equations are intended for use for stream sites with contributing drainage areas less than or equal to about 2,060 square miles, the maximum value for the independent variable used in the regression analysis.
Ono, Tomohiro; Nakamura, Mitsuhiro; Hirose, Yoshinori; Kitsuda, Kenji; Ono, Yuka; Ishigaki, Takashi; Hiraoka, Masahiro
2017-09-01
The aim was to estimate the lung tumor position from multiple anatomical features on four-dimensional computed tomography (4D-CT) data sets using single regression analysis (SRA) and multiple regression analysis (MRA) approaches and to evaluate the impact of each approach on the internal target volume (ITV) for stereotactic body radiotherapy (SBRT) of the lung. Eleven consecutive lung cancer patients (12 cases) underwent 4D-CT scanning. The three-dimensional (3D) lung tumor motion exceeded 5 mm. The 3D tumor position and anatomical features, including lung volume, diaphragm, abdominal wall, and chest wall positions, were measured on 4D-CT images. The tumor position was estimated by SRA using each anatomical feature and by MRA using all anatomical features. The difference between the actual and estimated tumor positions was defined as the root-mean-square error (RMSE). A standard partial regression coefficient for the MRA was evaluated. The 3D lung tumor position showed a high correlation with the lung volume (R = 0.92 ± 0.10). Additionally, ITVs derived from the SRA and MRA approaches were compared with the ITV derived from contouring gross tumor volumes on all 10 phases of the 4D-CT (conventional ITV). The RMSE of the SRA was within 3.7 mm in all directions. Also, the RMSE of the MRA was within 1.6 mm in all directions. The standard partial regression coefficient for the lung volume was the largest and had the most influence on the estimated tumor position. Compared with conventional ITV, the average percentage decreases in ITV were 31.9% and 38.3% using the SRA and MRA approaches, respectively. The estimation accuracy of lung tumor position was improved by the MRA approach, which provided a smaller ITV than the conventional ITV. © 2017 The Authors. Journal of Applied Clinical Medical Physics published by Wiley Periodicals, Inc. on behalf of American Association of Physicists in Medicine.
Multiple imputation for cure rate quantile regression with censored data.
Wu, Yuanshan; Yin, Guosheng
2017-03-01
The main challenge in the context of cure rate analysis is that one never knows whether censored subjects are cured or uncured, or whether they are susceptible or insusceptible to the event of interest. Considering the susceptible indicator as missing data, we propose a multiple imputation approach to cure rate quantile regression for censored data with a survival fraction. We develop an iterative algorithm to estimate the conditionally uncured probability for each subject. By utilizing this estimated probability and Bernoulli sample imputation, we can classify each subject as cured or uncured, and then employ the locally weighted method to estimate the quantile regression coefficients with only the uncured subjects. Repeating the imputation procedure multiple times and taking an average over the resultant estimators, we obtain consistent estimators for the quantile regression coefficients. Our approach relaxes the usual global linearity assumption, so that we can apply quantile regression to any particular quantile of interest. We establish asymptotic properties for the proposed estimators, including both consistency and asymptotic normality. We conduct simulation studies to assess the finite-sample performance of the proposed multiple imputation method and apply it to a lung cancer study as an illustration. © 2016, The International Biometric Society.
Estimation of subsurface thermal structure using sea surface height and sea surface temperature
NASA Technical Reports Server (NTRS)
Kang, Yong Q. (Inventor); Jo, Young-Heon (Inventor); Yan, Xiao-Hai (Inventor)
2012-01-01
A method of determining a subsurface temperature in a body of water is disclosed. The method includes obtaining surface temperature anomaly data and surface height anomaly data of the body of water for a region of interest, and also obtaining subsurface temperature anomaly data for the region of interest at a plurality of depths. The method further includes regressing the obtained surface temperature anomaly data and surface height anomaly data for the region of interest with the obtained subsurface temperature anomaly data for the plurality of depths to generate regression coefficients, estimating a subsurface temperature at one or more other depths for the region of interest based on the generated regression coefficients and outputting the estimated subsurface temperature at the one or more other depths. Using the estimated subsurface temperature, signal propagation times and trajectories of marine life in the body of water are determined.
NASA Technical Reports Server (NTRS)
Rogers, R. H. (Principal Investigator)
1976-01-01
The author has identified the following significant results. Computer techniques were developed for mapping water quality parameters from LANDSAT data, using surface samples collected in an ongoing survey of water quality in Saginaw Bay. Chemical and biological parameters were measured on 31 July 1975 at 16 bay stations in concert with the LANDSAT overflight. Application of stepwise linear regression bands to nine of these parameters and corresponding LANDSAT measurements for bands 4 and 5 only resulted in regression correlation coefficients that varied from 0.94 for temperature to 0.73 for Secchi depth. Regression equations expressed with the pair of bands 4 and 5, rather than the ratio band 4/band 5, provided higher correlation coefficients for all the water quality parameters studied (temperature, Secchi depth, chloride, conductivity, total kjeldahl nitrogen, total phosphorus, chlorophyll a, total solids, and suspended solids).
Prediction of anthropometric foot characteristics in children.
Morrison, Stewart C; Durward, Brian R; Watt, Gordon F; Donaldson, Malcolm D C
2009-01-01
The establishment of growth reference values is needed in pediatric practice where pathologic conditions can have a detrimental effect on the growth and development of the pediatric foot. This study aims to use multiple regression to evaluate the effects of multiple predictor variables (height, age, body mass, and gender) on anthropometric characteristics of the peripubescent foot. Two hundred children aged 9 to 12 years were recruited, and three anthropometric measurements of the pediatric foot were recorded (foot length, forefoot width, and navicular height). Multiple regression analysis was conducted, and coefficients for gender, height, and body mass all had significant relationships for the prediction of forefoot width and foot length (P ≤ .05, r ≥ 0.7). The coefficients for gender and body mass were not significant for the prediction of navicular height (P ≥ .05), whereas height was (P ≤ .05). Normative growth reference values and prognostic regression equations are presented for the peripubescent foot.
Heat transfer and flow friction correlations for perforated plate matrix heat exchangers
NASA Astrophysics Data System (ADS)
Ratna Raju, L.; Kumar, S. Sunil; Chowdhury, K.; Nandi, T. K.
2017-02-01
Perforated plate matrix heat exchangers (MHE) are constructed of high conductivity perforated plates stacked alternately with low conductivity spacers. They are being increasingly used in many cryogenic applications including Claude cycle or Reversed Brayton cycle cryo-refrigerators and liquefiers. Design of high NTU (number of (heat) transfer unit) cryogenic MHEs requires accurate heat transfer coefficient and flow friction factor. Thermo-hydraulic behaviour of perforated plates strongly depends on the geometrical parameters. Existing correlations, however, are mostly expressed as functions of Reynolds number only. This causes, for a given configuration, significant variations in coefficients from one correlation to the other. In this paper we present heat transfer and flow friction correlations as functions of all geometrical and other controlling variables. A Fluent™ based numerical model has been developed for heat transfer and pressure drop studies over a stack of alternately arranged perforated plates and spacers. The model is validated with the data from literature. Generalized correlations are obtained through regression analysis over a large number of computed data.
Extraction of anthocyanins from red cabbage using high pressure CO2.
Xu, Zhenzhen; Wu, Jihong; Zhang, Yan; Hu, Xiaosong; Liao, Xiaojun; Wang, Zhengfu
2010-09-01
The extraction kinetics of anthocyanins from red cabbage using high pressure CO(2) (HPCD) against conventional acidified water (CAW) was investigated. The HPCD time, temperature, pressure and volume ratio of solid-liquid mixture vs. pressurized CO(2) (R((S+L)/G)) played important roles in the extraction kinetics of anthocyanins. The extraction kinetics showed two phases: the yield increased with time in the first phase, and the yield, defined as the steady-state yield (y(*)), was constant in the second phase. The y(*) of anthocyanins using HPCD increased with higher temperature, higher pressure and lower R((S+L)/G). The general mass transfer model, with higher regression coefficients (R(2)>0.97), fitted the kinetic data better than the Fick's second law diffusion model. As compared with CAW, the time (t(*)) to reach the y(*) of anthocyanins using HPCD was reduced by half while its corresponding overall volumetric mass transfer coefficients k(L)xa from the general mass transfer model increased twofold. Copyright 2010 Elsevier Ltd. All rights reserved.
Arora, Simran Kaur; Patel, A A; Kumar, Naveen; Chauhan, O P
2016-04-01
The shear-thinning low, medium and high-viscosity fiber preparations (0.15-1.05 % psyllium husk, 0.07-0.6 % guar gum, 0.15-1.20 % gum tragacanth, 0.1-0.8 % gum karaya, 0.15-1.05 % high-viscosity Carboxy Methyl Cellulose and 0.1-0.7 % xanthan gum) showed that the consistency coefficient (k) was a function of concentration, the relationship being exponential (R(2), 0.87-0.96; P < 0.01). The flow behaviour index (n) (except for gum karaya and CMC) was exponentially related to concentration (R(2), 0.61-0.98). The relationship between k and sensory viscosity rating (SVR) was essentially linear in nearly all cases. The SVR could be predicted from the consistency coefficient using the regression equations developed. Also, the relationship of k with fiber concentration would make it possible to identify the concentration of a particular gum required to have desired consistency in terms of SVR.
Suzuki, Hideaki; Tabata, Takahisa; Koizumi, Hiroki; Hohchi, Nobusuke; Takeuchi, Shoko; Kitamura, Takuro; Fujino, Yoshihisa; Ohbuchi, Toyoaki
2014-12-01
This study aimed to create a multiple regression model for predicting hearing outcomes of idiopathic sudden sensorineural hearing loss (ISSNHL). The participants were 205 consecutive patients (205 ears) with ISSNHL (hearing level ≥ 40 dB, interval between onset and treatment ≤ 30 days). They received systemic steroid administration combined with intratympanic steroid injection. Data were examined by simple and multiple regression analyses. Three hearing indices (percentage hearing improvement, hearing gain, and posttreatment hearing level [HLpost]) and 7 prognostic factors (age, days from onset to treatment, initial hearing level, initial hearing level at low frequencies, initial hearing level at high frequencies, presence of vertigo, and contralateral hearing level) were included in the multiple regression analysis as dependent and explanatory variables, respectively. In the simple regression analysis, the percentage hearing improvement, hearing gain, and HLpost showed significant correlation with 2, 5, and 6 of the 7 prognostic factors, respectively. The multiple correlation coefficients were 0.396, 0.503, and 0.714 for the percentage hearing improvement, hearing gain, and HLpost, respectively. Predicted values of HLpost calculated by the multiple regression equation were reliable with 70% probability with a 40-dB-width prediction interval. Prediction of HLpost by the multiple regression model may be useful to estimate the hearing prognosis of ISSNHL. © The Author(s) 2014.
Effect of Contact Damage on the Strength of Ceramic Materials.
1982-10-01
variables that are important to erosion, and a multivariate, linear regression analysis is used to fit the data to the dimensional analysis. The...of Equations 7 and 8 by a multivariable regression analysis (room temperature data) Exponent Regression Standard error Computed coefficient of...1980) 593. WEAVER, Proc. Brit. Ceram. Soc. 22 (1973) 125. 39. P. W. BRIDGMAN, "Dimensional Analysis", (Yale 18. R. W. RICE, S. W. FREIMAN and P. F
Mean centering, multicollinearity, and moderators in multiple regression: The reconciliation redux.
Iacobucci, Dawn; Schneider, Matthew J; Popovich, Deidre L; Bakamitsos, Georgios A
2017-02-01
In this article, we attempt to clarify our statements regarding the effects of mean centering. In a multiple regression with predictors A, B, and A × B (where A × B serves as an interaction term), mean centering A and B prior to computing the product term can clarify the regression coefficients (which is good) and the overall model fit R² will remain undisturbed (which is also good).
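The invariance of model fit under mean centering can be checked numerically. The following is a minimal sketch on synthetic data, not the authors' analysis; the helper `fit_r2`, the sample size, and all generating values are assumptions chosen for illustration. It also compares the correlation between A and the product term before and after centering, which is a commonly cited reason for centering rather than a result quoted from the abstract.

```python
import numpy as np

def fit_r2(X, y):
    """Ordinary least squares with an intercept; returns (coefficients, R^2)."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, r2

# Synthetic predictors with non-zero means (hypothetical values).
rng = np.random.default_rng(0)
a = rng.normal(5.0, 1.0, 200)
b = rng.normal(3.0, 1.0, 200)
y = 1.0 + 0.5 * a - 0.3 * b + 0.2 * a * b + rng.normal(0.0, 0.5, 200)

# Model 1: raw predictors and their product.
raw = np.column_stack([a, b, a * b])

# Model 2: center A and B first, then form the product.
ac, bc = a - a.mean(), b - b.mean()
centered = np.column_stack([ac, bc, ac * bc])

beta_raw, r2_raw = fit_r2(raw, y)
beta_cen, r2_cen = fit_r2(centered, y)

# Correlation between A and the interaction term, before and after centering.
corr_raw = np.corrcoef(a, a * b)[0, 1]
corr_cen = np.corrcoef(ac, ac * bc)[0, 1]

print(f"R^2 raw={r2_raw:.6f}, centered={r2_cen:.6f}")
print(f"corr(A, AxB) raw={corr_raw:.3f}, centered={corr_cen:.3f}")
```

Both models span the same column space, so R² and the interaction coefficient agree; only the lower-order coefficients change meaning, since after centering they describe the effect of one predictor at the mean of the other.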
To, Minh-Son; Prakash, Shivesh; Poonnoose, Santosh I; Bihari, Shailesh
2018-05-01
The study uses meta-regression analysis to quantify the dose-dependent effects of statin pharmacotherapy on vasospasm, delayed ischemic neurologic deficits (DIND), and mortality in aneurysmal subarachnoid hemorrhage. Prospective, retrospective observational studies, and randomized controlled trials (RCTs) were retrieved by a systematic database search. Summary estimates were expressed as absolute risk (AR) for a given statin dose or control (placebo). Meta-regression using inverse variance weighting and robust variance estimation was performed to assess the effect of statin dose on transformed AR in a random effects model. Dose-dependence of predicted AR with 95% confidence interval (CI) was recovered by using Miller's Freeman-Tukey inverse. The database search and study selection criteria yielded 18 studies (2594 patients) for analysis. These included 12 RCTs, 4 retrospective observational studies, and 2 prospective observational studies. Twelve studies investigated simvastatin, whereas the remaining studies investigated atorvastatin, pravastatin, or pitavastatin, with simvastatin-equivalent doses ranging from 20 to 80 mg. Meta-regression revealed dose-dependent reductions in Freeman-Tukey-transformed AR of vasospasm (slope coefficient -0.00404, 95% CI -0.00720 to -0.00087; P = 0.0321), DIND (slope coefficient -0.00316, 95% CI -0.00586 to -0.00047; P = 0.0392), and mortality (slope coefficient -0.00345, 95% CI -0.00623 to -0.00067; P = 0.0352). The present meta-regression provides weak evidence for dose-dependent reductions in vasospasm, DIND and mortality associated with acute statin use after aneurysmal subarachnoid hemorrhage. However, the analysis was limited by substantial heterogeneity among individual studies. Greater dosing strategies are a potential consideration for future RCTs. Copyright © 2018 Elsevier Inc. All rights reserved.
Suppressor Variables: The Difference between "Is" versus "Acting As"
ERIC Educational Resources Information Center
Ludlow, Larry; Klein, Kelsey
2014-01-01
Correlated predictors in regression models are a fact of life in applied social science research. The extent to which they are correlated will influence the estimates and statistics associated with the other variables they are modeled along with. These effects, for example, may include enhanced regression coefficients for the other variables--a…
Causal Models with Unmeasured Variables: An Introduction to LISREL.
ERIC Educational Resources Information Center
Wolfle, Lee M.
Whenever one uses ordinary least squares regression, one is making an implicit assumption that all of the independent variables have been measured without error. Such an assumption is obviously unrealistic for most social data. One approach for estimating such regression models is to measure implied coefficients between latent variables for which…
Using the Ridge Regression Procedures to Estimate the Multiple Linear Regression Coefficients
NASA Astrophysics Data System (ADS)
Gorgees, Hazim Mansoor; Mahdi, Fatimah Assim
2018-05-01
This article compares the performance of different types of ordinary ridge regression estimators that have already been proposed to estimate the regression parameters when near exact linear relationships among the explanatory variables are present. For this situation we employ the data obtained from tagi gas filling company during the period (2008-2010). The main result we reached is that the method based on the condition number performs better than other methods since it has smaller mean square error (MSE) than the other stated methods.
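For orientation, the ordinary ridge estimator that such comparisons build on can be sketched as follows. This is an illustrative sketch under stated assumptions, not the estimators compared in the article: the function name `ridge_coefficients`, the synthetic data, and the ridge constant `k = 1.0` are hypothetical, and no particular rule for choosing `k` (condition-number based or otherwise) is implemented.

```python
import numpy as np

def ridge_coefficients(X, y, k):
    """Ordinary ridge estimate on standardized predictors: (X'X + kI)^-1 X'y.

    k = 0 reproduces ordinary least squares on the standardized data.
    """
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    p = Xs.shape[1]
    return np.linalg.solve(Xs.T @ Xs + k * np.eye(p), Xs.T @ yc)

# Two nearly collinear predictors (synthetic, for illustration only).
rng = np.random.default_rng(1)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.01, size=100)
X = np.column_stack([x1, x2])
y = 2.0 * x1 + 2.0 * x2 + rng.normal(size=100)

beta_ols = ridge_coefficients(X, y, k=0.0)
beta_ridge = ridge_coefficients(X, y, k=1.0)   # k chosen arbitrarily

# A large condition number of X'X signals near exact linear dependence.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
cond = np.linalg.cond(Xs.T @ Xs)

print("condition number:", cond)
print("OLS coefficients:  ", beta_ols)
print("ridge coefficients:", beta_ridge)
```

With near-collinear predictors the least-squares coefficients can differ wildly from each other, while the ridge coefficients are pulled toward a common, smaller value at the cost of some bias.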
Jović, Ozren
2016-12-15
A novel method for quantitative prediction and variable-selection on spectroscopic data, called Durbin-Watson partial least-squares regression (dwPLS), is proposed in this paper. The idea is to inspect serial correlation in infrared data that is known to consist of highly correlated neighbouring variables. The method selects only those variables whose intervals have a lower Durbin-Watson statistic (dw) than a certain optimal cutoff. For each interval, dw is calculated on a vector of regression coefficients. Adulteration of cold-pressed linseed oil (L), a well-known nutrient beneficial to health, is studied in this work by mixing it with cheaper oils: rapeseed oil (R), sesame oil (Se) and sunflower oil (Su). The samples for each botanical origin of oil vary with respect to producer, content and geographic origin. The results obtained indicate that MIR-ATR combined with dwPLS could be used for quantitative determination of edible-oil adulteration. Copyright © 2016 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Cai, Jun; Wang, Kuaishe; Shi, Jiamin; Wang, Wen; Liu, Yingying
2018-01-01
Constitutive analysis for hot working of BFe10-1-2 alloy was carried out by using experimental stress-strain data from isothermal hot compression tests, in a wide range of temperature of 1,023 to 1,273 K, and strain rate range of 0.001 to 10 s⁻¹. A constitutive equation based on modified double multiple nonlinear regression was proposed considering the independent effects of strain, strain rate, temperature and their interrelation. The predicted flow stress data calculated from the developed equation was compared with the experimental data. Correlation coefficient (R), average absolute relative error (AARE) and relative errors were introduced to verify the validity of the developed constitutive equation. Subsequently, a comparative study was made on the capability of strain-compensated Arrhenius-type constitutive model. The results showed that the developed constitutive equation based on modified double multiple nonlinear regression could predict flow stress of BFe10-1-2 alloy with good correlation and generalization.
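The two validation metrics named in the abstract can be sketched as below. The definitions used here are the ones commonly seen in constitutive-modelling work (Pearson correlation for R, and the mean of |experimental - predicted| / experimental expressed in percent for AARE); the abstract does not give formulas, so treat them as assumptions. The stress values are hypothetical.

```python
import numpy as np

def correlation_and_aare(measured, predicted):
    """Return (Pearson R, AARE in percent) for paired flow-stress values."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    r = np.corrcoef(measured, predicted)[0, 1]
    aare = 100.0 * np.mean(np.abs((measured - predicted) / measured))
    return r, aare

# Hypothetical flow stresses in MPa, not data from the study.
measured = [100.0, 150.0, 200.0, 250.0]
predicted = [102.0, 147.0, 204.0, 245.0]

r, aare = correlation_and_aare(measured, predicted)
print(f"R = {r:.4f}, AARE = {aare:.2f}%")
```

R only measures how linearly the predictions track the measurements, so a biased model can still score close to 1; AARE complements it by averaging the term-by-term relative error.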
NASA Astrophysics Data System (ADS)
Suhandy, D.; Yulia, M.; Ogawa, Y.; Kondo, N.
2018-05-01
In the present research, an evaluation of using near infrared (NIR) spectroscopy in tandem with full spectrum partial least squares (FS-PLS) regression for quantification of degree of adulteration in civet coffee was conducted. A total of 126 ground roasted coffee samples with degree of adulteration 0-51% were prepared. Spectral data were acquired using a NIR spectrometer equipped with an integrating sphere for diffuse reflectance measurement in the range of 1300-2500 nm. The samples were divided into two groups: a calibration sample set (84 samples) and a prediction sample set (42 samples). The calibration model was developed on original spectra using FS-PLS regression with full-cross validation method. The calibration model exhibited the determination coefficient R² = 0.96 for calibration and R² = 0.92 for validation. The prediction resulted in low root mean square error of prediction (RMSEP) (4.67%) and high ratio prediction to deviation (RPD) (3.75). In conclusion, the degree of adulteration in civet coffee has been quantified successfully by using NIR spectroscopy and FS-PLS regression in a non-destructive, economical, precise, and highly sensitive method, which uses very simple sample preparation.
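The two prediction metrics reported above can be computed as in the sketch below. It assumes the usual chemometrics definitions, which the abstract does not spell out: RMSEP as the root mean square of the prediction residuals, and RPD as the standard deviation of the reference values divided by RMSEP. The sample values are hypothetical. Under that definition, the reported RPD of 3.75 and RMSEP of 4.67% would correspond to a reference standard deviation of roughly 17.5%.

```python
import numpy as np

def rmsep_and_rpd(reference, predicted):
    """Return (RMSEP, RPD) for a prediction set.

    RPD is taken as SD(reference) / RMSEP, with the sample standard
    deviation (ddof=1); this convention is an assumption.
    """
    reference = np.asarray(reference, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    rmsep = np.sqrt(np.mean((reference - predicted) ** 2))
    rpd = np.std(reference, ddof=1) / rmsep
    return rmsep, rpd

# Hypothetical degrees of adulteration in percent, not data from the study.
reference = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0]
predicted = [2.0, 8.0, 22.0, 28.0, 42.0, 48.0]

rmsep, rpd = rmsep_and_rpd(reference, predicted)
print(f"RMSEP = {rmsep:.2f}%, RPD = {rpd:.2f}")
```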
Hidaka, Nobuhiro; Murata, Masaharu; Sasahara, Jun; Ishii, Keisuke; Mitsuda, Nobuaki
2015-05-01
Observed/expected lung area to head circumference ratio (o/e LHR) and lung to thorax transverse area ratio (LTR) are the sonographic indicators of postnatal outcome in fetuses with congenital diaphragmatic hernia (CDH), and they are not influenced by gestational age. We aimed to evaluate the relationship between these two parameters in the same subjects with fetal left-sided CDH. Fetuses with left-sided CDH managed between 2005 and 2012 were included. Data of LTR and o/e LHR values measured on the same day prior to 33 weeks' gestation in target fetuses were retrospectively collected. The correlation between the two parameters was estimated using the Spearman's rank-correlation coefficient, and linear regression analysis was used to assess the relationship between them. Data on 61 measurements from 36 CDH fetuses were analyzed to obtain a Spearman's rank-correlation coefficient of 0.74 with the following linear equation: LTR = 0.002 × (o/e LHR) + 0.005. The determination coefficient of this linear equation was sufficiently high at 0.712, and the prediction accuracy obtained with this regression formula was considered satisfactory. A good linear correlation between the LTR and the o/e LHR was obtained, suggesting that the two predictive parameters can be translated into each other. This information is expected to be useful to improve our understanding of different investigations focusing on LTR or o/e LHR as a predictor of postnatal outcome in CDH. © 2014 Japanese Teratology Society.
NASA Astrophysics Data System (ADS)
Makama, Ezekiel Kaura; Lim, Hwee San; Abdullah, Khiruddin
2018-01-01
Precipitable water vapor (PWV) is a highly variable, but important greenhouse gas that regulates the radiation budget of the earth. Its variability in time and space makes it difficult to quantify. Knowledge of its vertical distribution, in particular, is crucial for many reasons. In this study, empirical relationships between isobaric layers of PWV over Peninsular Malaysia are examined. Analysis of variance (ANOVA) technique on Advanced Television and Infrared Observation Satellite Operational Vertical Sounder (ATOVS) observations, from 2005 to 2011, has been used to propose a relationship of the form W = α(WL)^β for the middle (MW) and upper (UW) layers PWV. W is either MW or UW with α and β as regression coefficients, which are functions of latitude. Coefficients of determination (R²) and root mean square error (RMSE) of respective values between 0.75-0.86 and 1.65-2.38 mm, across the zones, were obtained for both the MW and UW predictions, with a mean bias (MB) below ±1 mm. The predicted and observed PWV presented a better agreement northerly. Initial predictability test for each model was done on two independent data sets: ATOVS (2012-2015), and radiosonde (2010-2011) at Penang, Kuantan and Sepang stations, with very good outcomes. The results of the tests revealed remarkable performances, when compared with two previously reported models. The inclusion of variable regression coefficients, and the utilization of satellite-derived data, which provide soundings of data-void regions between radiosonde networks, proved to have optimized the results.
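A power-law relationship of this form becomes linear after taking logarithms, ln W = ln α + β ln WL, so α and β can be estimated with an ordinary straight-line fit. The sketch below illustrates that idea only; it is not the procedure used in the study. Reading WL as the lower-layer PWV is an assumption, and the function name `fit_power_law` and all numeric values are hypothetical.

```python
import numpy as np

def fit_power_law(w_lower, w_layer):
    """Estimate alpha and beta in W = alpha * WL**beta by log-log least squares."""
    slope, intercept = np.polyfit(np.log(w_lower), np.log(w_layer), 1)
    return np.exp(intercept), slope

# Noise-free synthetic PWV values in mm, generated with alpha=0.05, beta=1.5.
w_lower = np.array([20.0, 25.0, 30.0, 35.0, 40.0])
w_layer = 0.05 * w_lower ** 1.5

alpha, beta = fit_power_law(w_lower, w_layer)
print(f"alpha = {alpha:.4f}, beta = {beta:.4f}")
```

Fitting in log space minimizes relative rather than absolute errors, which weights small and large PWV values more evenly; a direct nonlinear fit would be the alternative if absolute errors in mm matter more.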
The importance of regional models in assessing canine cancer incidences in Switzerland
Boo, Gianluca; Leyk, Stefan; Brunsdon, Christopher; Graf, Ramona; Pospischil, Andreas; Fabrikant, Sara Irina
2018-01-01
Fitting canine cancer incidences through a conventional regression model assumes constant statistical relationships across the study area in estimating the model coefficients. However, it is often more realistic to consider that these relationships may vary over space. Such a condition, known as spatial non-stationarity, implies that the model coefficients need to be estimated locally. In these kinds of local models, the geographic scale, or spatial extent, employed for coefficient estimation may also have a pervasive influence. This is because important variations in the local model coefficients across geographic scales may impact the understanding of local relationships. In this study, we fitted canine cancer incidences across Swiss municipal units through multiple regional models. We computed diagnostic summaries across the different regional models, and contrasted them with the diagnostics of the conventional regression model, using value-by-alpha maps and scalograms. The results of this comparative assessment enabled us to identify variations in the goodness-of-fit and coefficient estimates. We detected spatially non-stationary relationships, in particular, for the variables related to biological risk factors. These variations in the model coefficients were more important at small geographic scales, making a case for the need to model canine cancer incidences locally in contrast to more conventional global approaches. However, we contend that prior to undertaking local modeling efforts, a deeper understanding of the effects of geographic scale is needed to better characterize and identify local model relationships. PMID:29652921
Burns, Jonathan K; Tomita, Andrew; Kapadia, Amy S
2014-03-01
Income inequality is associated with numerous negative health outcomes. There is evidence that ecological-level socio-environmental factors may increase risk for schizophrenia. The aim was to investigate whether measures of income inequality are associated with incidence of schizophrenia at the country level. We conducted a systematic review of incidence rates for schizophrenia, reported between 1975 and 2011. For each country, national measures of income inequality (Gini coefficient) along with covariate risk factors for schizophrenia were obtained. Multi-level mixed-effects Poisson regression was performed to investigate the relationship between Gini coefficients and incidence rates of schizophrenia controlling for covariates. One hundred and seven incidence rates (from 26 countries) were included. Mean incidence of schizophrenia was 18.50 per 100,000 (SD = 11.9; range = 1.7-67). There was a significant positive relationship between incidence rate of schizophrenia and Gini coefficient (β = 1.02; Z = 2.28; p = .02; 95% CI = 1.00, 1.03). Countries characterized by a large rich-poor gap may be at increased risk of schizophrenia. We suggest that income inequality impacts negatively on social cohesion, eroding social capital, and that chronic stress associated with living in highly disparate societies places individuals at risk of schizophrenia.
Zer, Matan; Lindner, Arie; Greenstein, Alexander; Leibovici, Dan
2011-07-01
Academic careers of individual doctors are commonly evaluated by examining the number and quality of authored publications. Similarly, the extent and quality of medical research may be assessed nationwide by measuring the number of publications originating from the country of interest over time. This, in turn, may indicate the quality of medicine practiced. To evaluate the extent and quality of Israeli publications we measured the rate and quality of medical publications originating from Israel for two decades in the fields of urology, cardiology and orthopedics, and compared the data to those of other countries. Leading journals in urology, cardiology, and orthopedics were selected. A Medline search (http://www.ncbi.ntm.nih.gov/sites/entrez) was conducted for all the publications originating in Israel between the years 1990-2009 in the selected journals. Data from Israel were compared to those from Italy, France, Germany, Egypt and Turkey. The change in rate of publications was tested using linear regression. The quality of publications was calculated by multiplying the number of publications by the relevant impact factor. While the urology publication rate in Israel increased by 32.7% in the second study decade as compared with the first, the urology publication rates during the same time period from Italy, France, Germany, Egypt and Turkey increased by 199%, 115%, 184%, 180% and 227% respectively. The regression coefficient for the urology publication rate was 0.51 for Israel, and 0.78, 0.95, 0.78, 0.87 and 0.97 for the other countries, respectively. The regression coefficient for the change in the quality of publications from Israel was 0.31 and 0.81, 0.75, 0.92, 0.73, and 0.92 for the other countries, respectively. In cardiology, the Israeli publication rate increased by 26% during the second study decade, whereas in the other countries the increments were 46%, 35%, 76%, 80% and 309% respectively.
The regression coefficient for the Israeli publication rate was 0.45, and 0.78, 0.54, 0.62, 0.13 and 0.75 for the other countries, respectively. The regression coefficient of the quality of publications in Israel was 0.3 as opposed to 0.47, 0.36, 0.48, 0.01, and 0.78 respectively. The Israeli publications in orthopedics increased by 9.3% during the second decade compared with the first. At the same time, other countries increased the publication rate in orthopedics by 69%, 121%, 173%, 140% and 296% respectively. The regression coefficient for the publication rate in orthopedics was 0.02 for Israel, and 0.62, 0.64, 0.78, 0.34 and 0.71 for the other countries, respectively. The regression coefficient of the quality of publications in Israel was 0.05 as opposed to 0.67, 0.62, 0.75, 0.31, and 0.66 in the other countries, respectively. Israel lags behind Italy, France, Germany, Egypt and Turkey with regard to the increase of both the number and the quality of medical publications in urology and orthopedics. While the rate and quality of Israeli publications in cardiology surpass those from Egypt, they lag in the number of publications in this medical field behind those of all the rest of the countries examined. In a world of rapid progress and expansion of medical research, Israel has been stagnant in publications in 3 medical specialties, rendering it inferior to other nations.
Large signal-to-noise ratio quantification in MLE for ARARMAX models
NASA Astrophysics Data System (ADS)
Zou, Yiqun; Tang, Xiafei
2014-06-01
It has been shown that closed-loop linear system identification by indirect method can be generally transferred to open-loop ARARMAX (AutoRegressive AutoRegressive Moving Average with eXogenous input) estimation. For such models, the gradient-related optimisation with large enough signal-to-noise ratio (SNR) can avoid the potential local convergence in maximum likelihood estimation. To ease the application of this condition, the threshold SNR needs to be quantified. In this paper, we build the amplitude coefficient which is an equivalence to the SNR and prove the finiteness of the threshold amplitude coefficient within the stability region. The quantification of threshold is achieved by the minimisation of an elaborately designed multi-variable cost function which unifies all the restrictions on the amplitude coefficient. The corresponding algorithm based on two sets of physically realisable system input-output data details the minimisation and also points out how to use the gradient-related method to estimate ARARMAX parameters when local minimum is present as the SNR is small. Then, the algorithm is tested on a theoretical AutoRegressive Moving Average with eXogenous input model for the derivation of the threshold and a gas turbine engine real system for model identification, respectively. Finally, the graphical validation of threshold on a two-dimensional plot is discussed.
Effects of Medical Insurance on the Health Status and Life Satisfaction of the Elderly
GU, Liubao; FENG, Huihui; JIN, Jian
2017-01-01
Background: Population aging has become increasingly serious in China. The demand for medical insurance of the elderly is increasing, and their health status and life satisfaction are becoming significant issues. This study investigates the effects of medical insurance on the health status and life satisfaction of the elderly. Methods: The national baseline survey data of the China Health and Retirement Longitudinal Survey in 2013 were adopted. The Ordered Probit Model was established. The effects of the medical insurance for urban employees, medical insurance for urban residents, and new rural cooperative medical insurance on the health status and life satisfaction of the elderly were investigated. Results: Medical insurance could facilitate the improvement of the health status and life satisfaction of the elderly. Accordingly, the health status and life satisfaction of the elderly who have medical insurance for urban residents improved significantly. The regression coefficients were 0.348 and 0.307. The corresponding regression coefficients of the medical insurance for urban employees were 0.189 and 0.236. The regression coefficients of the new rural cooperative medical insurance were 0.170 and 0.188. Conclusion: Medical insurance can significantly improve the health status and life satisfaction of the elderly. This development is of immense significance for the formulation of equal medical security. PMID:29026784
Periodontal disease in children and adolescents with type 1 diabetes in Serbia.
Dakovic, Dragana; Pavlovic, Milos D
2008-06-01
The purpose of this study was to evaluate periodontal health in young patients with type 1 diabetes mellitus in Serbia. Periodontal disease was clinically assessed and compared in 187 children and adolescents (6 to 18 years of age) with type 1 diabetes mellitus and 178 control subjects without diabetes. Children and adolescents with type 1 diabetes mellitus had significantly more plaque, gingival inflammation, and periodontal destruction than control subjects. The main risk factors for periodontitis were diabetes (odds ratio [OR] = 2.78; 95% confidence interval [CI]: 1.42 to 5.44), bleeding/plaque ratio (OR = 1.25; 95% CI: 1.06 to 1.48), and age (OR = 1.10; 95% CI: 1.01 to 1.21). In case subjects, the number of teeth affected by periodontal destruction was associated with mean hemoglobin A1c (regression coefficient 0.17; P = 0.026), duration of diabetes (regression coefficient 0.19; P = 0.021), and bleeding/plaque ratio (regression coefficient 0.17; P = 0.021). Compared to children and adolescents without diabetes, periodontal disease is more prevalent and widespread in children and adolescents with type 1 diabetes mellitus and depends on the duration of disease, metabolic control, and the severity of gingival inflammation. Gingival inflammation in young patients with diabetes is more evident and more often results in periodontal destruction.
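For readers who want to relate the reported odds ratios to logistic regression coefficients, the following Python sketch converts between the two. It assumes the intervals are Wald intervals computed on the log-odds scale with z = 1.96; the abstract does not state how its intervals were obtained, so the recovered standard error is only an approximation. The function names are illustrative.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald confidence limits from a logit coefficient and its SE."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def beta_se_from_or_ci(odds_ratio, lower, upper, z=1.96):
    """Back out the logit coefficient and an approximate SE from an OR and its CI."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# Diabetes as a risk factor, as reported in the abstract: OR = 2.78 (1.42 to 5.44).
beta, se = beta_se_from_or_ci(2.78, 1.42, 5.44)
or_back, lo_back, hi_back = odds_ratio_ci(beta, se)
print(round(beta, 3), round(se, 3))
```

The round trip reproduces the published interval closely, which is consistent with (but does not prove) a symmetric interval on the log scale.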
Tu, Yu-Kang; Krämer, Nicole; Lee, Wen-Chung
2012-07-01
In the analysis of trends in health outcomes, an ongoing issue is how to separate and estimate the effects of age, period, and cohort. As these 3 variables are perfectly collinear by definition, regression coefficients in a general linear model are not unique. In this tutorial, we review why identification is a problem, and how this problem may be tackled using partial least squares and principal components regression analyses. Both methods produce regression coefficients that fulfill the same collinearity constraint as the variables age, period, and cohort. We show that, because the constraint imposed by partial least squares and principal components regression is inherent in the mathematical relation among the 3 variables, this leads to more interpretable results. We use one dataset from a Taiwanese health-screening program to illustrate how to use partial least squares regression to analyze the trends in body heights with 3 continuous variables for age, period, and cohort. We then use another dataset of hepatocellular carcinoma mortality rates for Taiwanese men to illustrate how to use partial least squares regression to analyze tables with aggregated data. We use the second dataset to show the relation between the intrinsic estimator, a recently proposed method for the age-period-cohort analysis, and partial least squares regression. We also show that the inclusion of all indicator variables provides a more consistent approach. R code for our analyses is provided in the eAppendix.
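The identification problem mentioned in this abstract can be seen directly in a small Python sketch: because cohort = period − age, different coefficient vectors give exactly the same fitted values. The coefficient values and function names are hypothetical and chosen only for illustration; this is not the authors' R code.

```python
def predict(coefs, age, period):
    """Linear predictor with age, period and cohort, where cohort = period - age."""
    b0, b_age, b_period, b_cohort = coefs
    cohort = period - age
    return b0 + b_age * age + b_period * period + b_cohort * cohort


def shift(coefs, t):
    """Move along the unidentified direction: age - period + cohort is always 0."""
    b0, b_age, b_period, b_cohort = coefs
    return (b0, b_age + t, b_period - t, b_cohort + t)


coefs_a = (1.0, 0.5, 0.25, -0.75)   # hypothetical coefficients
coefs_b = shift(coefs_a, 2.0)       # a different vector with identical fit

observations = [(40, 2000), (55, 2010), (30, 1995)]  # (age, period) pairs
fits_a = [predict(coefs_a, a, p) for a, p in observations]
fits_b = [predict(coefs_b, a, p) for a, p in observations]
print(fits_a, fits_b)
```

Any estimator for this model therefore has to impose an extra constraint to pick one vector; the abstract discusses the constraints implied by partial least squares and principal components regression.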
Gerrard, Paul
2012-10-01
To determine whether there is a relationship between the level of education and the accuracy of self-reported physical activity as a proxy measure of aerobic fitness. Data from the National Health and Nutrition Examination from the years 1999 to 2004 were used. Linear regression was performed for measured maximum oxygen consumption (Vo(2)max) versus self-reported physical activity for 5 different levels of education. This was a national survey in the United States. Participants included adults from the general U.S. population (N=3290). None. Coefficients of determination obtained from models for each education level were used to compare how well self-reported physical activity represents cardiovascular fitness. These coefficients were the main outcome measure. Coefficients of determination for Vo(2)max versus reported physical activity increased as the level of education increased. In this preliminary study, self-reported physical activity is a better proxy measure for aerobic fitness in highly educated individuals than in poorly educated individuals. Copyright © 2012 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
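The outcome measure in this abstract, the coefficient of determination, can be sketched in Python as follows. The data are made up to show a tight and a noisy relationship; they are not NHANES values, and the function name is illustrative.

```python
from statistics import mean


def r_squared(xs, ys):
    """Coefficient of determination for the simple least-squares line of ys on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical self-reported activity scores and measured fitness values.
activity = [1, 2, 3, 4, 5]
fitness_tight = [30, 32, 34, 36, 38]   # activity tracks fitness closely
fitness_noisy = [30, 36, 31, 39, 34]   # activity is a poor proxy

r2_tight = r_squared(activity, fitness_tight)
r2_noisy = r_squared(activity, fitness_noisy)
print(r2_tight, r2_noisy)
```

A higher value in one subgroup than another is the kind of comparison the study reports across education levels.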
No rationale for 1 variable per 10 events criterion for binary logistic regression analysis.
van Smeden, Maarten; de Groot, Joris A H; Moons, Karel G M; Collins, Gary S; Altman, Douglas G; Eijkemans, Marinus J C; Reitsma, Johannes B
2016-11-24
Ten events per variable (EPV) is a widely advocated minimal criterion for sample size considerations in logistic regression analysis. Of three previous simulation studies that examined this minimal EPV criterion only one supports the use of a minimum of 10 EPV. In this paper, we examine the reasons for substantial differences between these extensive simulation studies. The current study uses Monte Carlo simulations to evaluate small sample bias, coverage of confidence intervals and mean square error of logit coefficients. Logistic regression models fitted by maximum likelihood and a modified estimation procedure, known as Firth's correction, are compared. The results show that besides EPV, the problems associated with low EPV depend on other factors such as the total sample size. It is also demonstrated that simulation results can be dominated by even a few simulated data sets for which the prediction of the outcome by the covariates is perfect ('separation'). We reveal that different approaches for identifying and handling separation lead to substantially different simulation results. We further show that Firth's correction can be used to improve the accuracy of regression coefficients and alleviate the problems associated with separation. The current evidence supporting EPV rules for binary logistic regression is weak. Given our findings, there is an urgent need for new research to provide guidance for supporting sample size considerations for binary logistic regression analysis.
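As a rough illustration of the criterion under discussion, the Python sketch below computes EPV under the commonly used convention that "events" means the size of the smaller outcome group and "variables" means the number of estimated predictor parameters excluding the intercept. That convention and the example counts are assumptions made here, not details given in the abstract.

```python
def events_per_variable(n_events, n_nonevents, n_parameters):
    """EPV: size of the smaller outcome group divided by the number of
    predictor parameters (intercept not counted)."""
    if n_parameters <= 0:
        raise ValueError("n_parameters must be positive")
    return min(n_events, n_nonevents) / n_parameters


# Hypothetical study: 60 events, 440 non-events, 8 candidate predictor parameters.
epv = events_per_variable(60, 440, 8)
meets_rule_of_ten = epv >= 10
print(epv, meets_rule_of_ten)
```

The abstract's point is that passing or failing such a threshold is weak evidence on its own, since total sample size and separation also matter.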
ERIC Educational Resources Information Center
Longford, Nicholas T.
Operational procedures for the Graduate Record Examinations Validity Study Service are reviewed, with emphasis on the problem of frequent occurrence of negative coefficients in the fitted within-department regressions obtained by the empirical Bayes method of H. I. Braun and D. Jones (1985). Several alterations of the operational procedures are…
Middleton, Michael S; Haufe, William; Hooker, Jonathan; Borga, Magnus; Dahlqvist Leinhard, Olof; Romu, Thobias; Tunón, Patrik; Hamilton, Gavin; Wolfson, Tanya; Gamst, Anthony; Loomba, Rohit; Sirlin, Claude B
2017-05-01
Purpose To determine the repeatability and accuracy of a commercially available magnetic resonance (MR) imaging-based, semiautomated method to quantify abdominal adipose tissue and thigh muscle volume and hepatic proton density fat fraction (PDFF). Materials and Methods This prospective study was institutional review board-approved and HIPAA compliant. All subjects provided written informed consent. Inclusion criteria were age of 18 years or older and willingness to participate. The exclusion criterion was contraindication to MR imaging. Three-dimensional T1-weighted dual-echo body-coil images were acquired three times. Source images were reconstructed to generate water and calibrated fat images. Abdominal adipose tissue and thigh muscle were segmented, and their volumes were estimated by using a semiautomated method and, as a reference standard, a manual method. Hepatic PDFF was estimated by using a confounder-corrected chemical shift-encoded MR imaging method with hybrid complex-magnitude reconstruction and, as a reference standard, MR spectroscopy. Tissue volume and hepatic PDFF intra- and interexamination repeatability were assessed by using intraclass correlation and coefficient of variation analysis. Tissue volume and hepatic PDFF accuracy were assessed by means of linear regression with the respective reference standards. Results Adipose and thigh muscle tissue volumes of 20 subjects (18 women; age range, 25-76 years; body mass index range, 19.3-43.9 kg/m²) were estimated by using the semiautomated method. Intra- and interexamination intraclass correlation coefficients were 0.996-0.998 and coefficients of variation were 1.5%-3.6%. For hepatic MR imaging PDFF, intra- and interexamination intraclass correlation coefficients were greater than or equal to 0.994 and coefficients of variation were less than or equal to 7.3%. 
In the regression analyses of manual versus semiautomated volume and spectroscopy versus MR imaging, PDFF slopes and intercepts were close to the identity line, and correlations of determination at multivariate analysis (R²) ranged from 0.744 to 0.994. Conclusion This MR imaging-based, semiautomated method provides high repeatability and accuracy for estimating abdominal adipose tissue and thigh muscle volumes and hepatic PDFF. © RSNA, 2017.
Advanced colorectal neoplasia risk stratification by penalized logistic regression.
Lin, Yunzhi; Yu, Menggang; Wang, Sijian; Chappell, Richard; Imperiale, Thomas F
2016-08-01
Colorectal cancer is the second leading cause of death from cancer in the United States. To facilitate the efficiency of colorectal cancer screening, there is a need to stratify risk for colorectal cancer among the 90% of US residents who are considered "average risk." In this article, we investigate such risk stratification rules for advanced colorectal neoplasia (colorectal cancer and advanced, precancerous polyps). We use a recently completed large cohort study of subjects who underwent a first screening colonoscopy. Logistic regression models have been used in the literature to estimate the risk of advanced colorectal neoplasia based on quantifiable risk factors. However, logistic regression may be prone to overfitting and instability in variable selection. Since most of the risk factors in our study have several categories, it was tempting to collapse these categories into fewer risk groups. We propose a penalized logistic regression method that automatically and simultaneously selects variables, groups categories, and estimates their coefficients by penalizing the [Formula: see text]-norm of both the coefficients and their differences. Hence, it encourages sparsity in the categories, i.e. grouping of the categories, and sparsity in the variables, i.e. variable selection. We apply the penalized logistic regression method to our data. The important variables are selected, with close categories simultaneously grouped, by penalized regression models with and without the interactions terms. The models are validated with 10-fold cross-validation. The receiver operating characteristic curves of the penalized regression models dominate the receiver operating characteristic curve of naive logistic regressions, indicating a superior discriminative performance. © The Author(s) 2013.
Singh, Preet Mohinder; Borle, Anuradha; Shah, Dipal; Sinha, Ashish; Makkar, Jeetinder Kaur; Trikha, Anjan; Goudra, Basavana Gouda
2016-04-01
Prophylactic continuous positive airway pressure (CPAP) can prevent pulmonary adverse events following upper abdominal surgeries. The present meta-regression evaluates and quantifies the effect of degree/duration of CPAP on the incidence of postoperative pulmonary events. Medical databases were searched for randomized controlled trials involving adult patients, comparing the outcome in those receiving prophylactic postoperative CPAP versus no CPAP, undergoing high-risk abdominal surgeries. Our meta-analysis evaluated the relationship between the postoperative pulmonary complications and the use of CPAP. Furthermore, meta-regression was used to quantify the effect of cumulative duration and degree of CPAP on the measured outcomes. Seventy-three potentially relevant studies were identified, of which 11 had appropriate data, allowing us to compare a total of 362 and 363 patients in CPAP and control groups, respectively. Qualitatively, the odds ratio for CPAP showed a protective effect for pneumonia [0.39 (0.19-0.78)], atelectasis [0.51 (0.32-0.80)] and pulmonary complications [0.37 (0.24-0.56)] with zero heterogeneity. For prevention of pulmonary complications, the odds ratio was better for continuous than intermittent CPAP. Meta-regression demonstrated a positive correlation between the degree of CPAP and the incidence of pneumonia with a regression coefficient of +0.61 (95% CI 0.02-1.21, P = 0.048, τ² = 0.078, r² = 7.87%). Overall, adverse effects were similar with or without the use of CPAP. Prophylactic postoperative use of continuous CPAP significantly reduces the incidence of postoperative pneumonia, atelectasis and pulmonary complications in patients undergoing high-risk abdominal surgeries. Quantitatively, increasing the CPAP levels does not necessarily enhance the protective effect against pneumonia. Instead, the protective effect diminishes with increasing degree of CPAP.
Su, Peng-Hao; Tomy, Gregg T; Hou, Chun-Yan; Yin, Fang; Feng, Dao-Lun; Ding, Yong-Sheng; Li, Yi-Fan
2018-04-01
A size-segregated gas/particle partitioning coefficient K_Pi was proposed and evaluated in predictive models on the basis of atmospheric polybrominated diphenyl ether (PBDE) field data, in comparison with the bulk coefficient K_P. Results revealed that the characteristics of atmospheric PBDEs in a rural area of southeast Shanghai were generally consistent with previous investigations, suggesting that this investigation was representative of the present pollution status of atmospheric PBDEs. K_Pi was generally greater than bulk K_P, indicating an overestimate of TSP (the mass concentration of total suspended particles) in the expression of bulk K_P. In predictive models, K_Pi led to a significant shift in regression lines as compared to K_P, so caution is needed when investigating sorption mechanisms using the regression lines. The differences between the performances of K_Pi and K_P were helpful in explaining some phenomena in predictive investigations, such as why P_L^0 and K_OA models overestimate the particle fractions of PBDEs and why the models work better at high temperature than at low temperature. Our findings are important because they provide insight into the influence of particle size on predictive models. Copyright © 2018 Elsevier Ltd. All rights reserved.
Towards molecular design using 2D-molecular contour maps obtained from PLS regression coefficients
NASA Astrophysics Data System (ADS)
Borges, Cleber N.; Barigye, Stephen J.; Freitas, Matheus P.
2017-12-01
The multivariate image analysis descriptors used in quantitative structure-activity relationships are direct representations of chemical structures as they are simply numerical decodifications of pixels forming the 2D chemical images. These MDs have found great utility in the modeling of diverse properties of organic molecules. Given the multicollinearity and high dimensionality of the data matrices generated with the MIA-QSAR approach, modeling techniques that involve the projection of the data space onto orthogonal components e.g. Partial Least Squares (PLS) have been generally used. However, the chemical interpretation of the PLS-based MIA-QSAR models, in terms of the structural moieties affecting the modeled bioactivity has not been straightforward. This work describes the 2D-contour maps based on the PLS regression coefficients, as a means of assessing the relevance of single MIA predictors to the response variable, and thus allowing for the structural, electronic and physicochemical interpretation of the MIA-QSAR models. A sample study to demonstrate the utility of the 2D-contour maps to design novel drug-like molecules is performed using a dataset of some anti-HIV-1 2-amino-6-arylsulfonylbenzonitriles and derivatives, and the inferences obtained are consistent with other reports in the literature. In addition, the different schemes for encoding atomic properties in molecules are discussed and evaluated.
LIU, RUIXIN; ZHANG, XIAODONG; ZHANG, LU; GAO, XIAOJIE; LI, HUILING; SHI, JUNHAN; LI, XUELIN
2014-01-01
The aim of this study was to predict the bitterness intensity of a drug using an electronic tongue (e-tongue). The model drug of berberine hydrochloride was used to establish a bitterness prediction model (BPM), based on the taste evaluation of bitterness intensity by a taste panel, the data provided by the e-tongue and a genetic algorithm-back-propagation neural network (GA-BP) modeling method. The modeling characteristics of the GA-BP were compared with those of multiple linear regression, partial least square regression and BP methods. The determination coefficient of the BPM was 0.99965±0.00004, the root mean square error of cross-validation was 0.1398±0.0488 and the correlation coefficient of the cross-validation between the true and predicted values was 0.9959±0.0027. The model is superior to the other three models based on these indicators. In conclusion, the model established in this study has a high fitting degree and may be used for the bitterness prediction modeling of berberine hydrochloride of different concentrations. The model also provides a reference for the generation of BPMs of other drugs. Additionally, the algorithm of the study is able to conduct a rapid and accurate quantitative analysis of the data provided by the e-tongue. PMID:24926369
Verification studies of Seasat-A satellite scatterometer /SASS/ measurements
NASA Technical Reports Server (NTRS)
Halberstam, I.
1981-01-01
Two comparisons between Seasat-A satellite scatterometer (SASS) data and surface truth, obtained from the Gulf of Alaska Seasat Experiment and the Joint Air-Sea Interaction program, have been made to determine the behavior of SASS and its algorithms. The performance of SASS was first evaluated irrespective of the algorithms employed to convert the SASS data to geophysical parameters, which was done by separating the backscatter measurements into small bins of incidence and azimuth angles and polarity and regression against wind speed measurements. The algorithms were then tested by comparing their predicted slopes and y intercepts with those derived from the regressions, and by comparing each SASS backscatter measurement with the backscatter derived from the algorithms, and the given wind velocity from the observations. It was shown that SASS was insensitive to winds at high incidence angles for horizontal polarizations. Fairly high correlations were found between backscatter and wind speeds. The algorithms functioned well at mid-ranges of incidence angle and backscattering coefficient.
ERIC Educational Resources Information Center
Vasu, Ellen Storey
1978-01-01
The effects of the violation of the assumption of normality in the conditional distributions of the dependent variable, coupled with the condition of multicollinearity upon the outcome of testing the hypothesis that the regression coefficient equals zero, are investigated via a Monte Carlo study. (Author/JKS)
ERIC Educational Resources Information Center
Marland, Eric; Bossé, Michael J.; Rhoads, Gregory
2018-01-01
Rounding is a necessary step in many mathematical processes. We are taught early in our education about significant figures and how to properly round a number. So when we are given a data set and asked to find a regression line, we are inclined to offer the line with rounded coefficients to reflect our model. However, the effects are not as…
Modeling maximum daily temperature using a varying coefficient regression model
Han Li; Xinwei Deng; Dong-Yum Kim; Eric P. Smith
2014-01-01
Relationships between stream water and air temperatures are often modeled using linear or nonlinear regression methods. Despite a strong relationship between water and air temperatures and a variety of models that are effective for data summarized on a weekly basis, such models did not yield consistently good predictions for summaries such as daily maximum temperature...
Cho, Yeoungjee; Büchel, Janine; Steppan, Sonja; Passlick-Deetjen, Jutta; Hawley, Carmel M.; Dimeski, Goce; Clarke, Margaret; Johnson, David W.
2016-01-01
♦ Background: The longitudinal trends of lipid parameters and the impact of biocompatible peritoneal dialysis (PD) solutions on these levels remain to be fully defined. The present study aimed to a) evaluate the influence of neutral pH, low glucose degradation product (GDP) PD solutions on serum lipid parameters, and b) explore the capacity of lipid parameters (total cholesterol [TC], triglyceride [TG], high density lipoprotein [HDL], TC/HDL, low density lipoprotein [LDL], very low density lipoprotein [VLDL]) to predict cardiovascular events (CVE) and mortality in PD patients. ♦ Methods: The study included 175 incident participants from the balANZ trial with at least 1 stored serum sample. A composite CVE score was used as a primary clinical outcome measure. Multilevel linear regression and Poisson regression models were fitted to describe the trend of lipid parameters over time and its ability to predict composite CVE, respectively. ♦ Results: Small but statistically significant increases in serum TG (coefficient 0.006, p < 0.001), TC/HDL (coefficient 0.004, p = 0.001), and VLDL cholesterol (coefficient 0.005, p = 0.001) levels and a decrease in the serum HDL cholesterol levels (coefficient −0.004, p = 0.009) were observed with longer time on PD, whilst the type of PD solution (biocompatible vs standard) received had no significant effect on these levels. Peritoneal dialysis glucose exposure was significantly associated with trends in TG, TC/HDL, HDL and VLDL levels. Baseline lipid parameter levels were not predictive of composite CVEs or all-cause mortality. ♦ Conclusion: Serum TG, TC/HDL, and VLDL levels increased and the serum HDL levels decreased with increasing PD duration. None of the lipid parameters were significantly modified by biocompatible PD solution use over the time period studied or predictive of composite CVE or mortality. PMID:26429421
Novel risk score of contrast-induced nephropathy after percutaneous coronary intervention.
Ji, Ling; Su, XiaoFeng; Qin, Wei; Mi, XuHua; Liu, Fei; Tang, XiaoHong; Li, Zi; Yang, LiChuan
2015-08-01
Contrast-induced nephropathy (CIN) post-percutaneous coronary intervention (PCI) is a major cause of acute kidney injury. In this study, we established a comprehensive risk score model to assess risk of CIN after PCI procedure, which could be easily used in a clinical environment. A total of 805 PCI patients, divided into analysis cohort (70%) and validation cohort (30%), were enrolled retrospectively in this study. Risk factors for CIN were identified using univariate analysis and multivariate logistic regression in the analysis cohort. Risk score model was developed based on multiple regression coefficients. Sensitivity and specificity of the new risk score system was validated in the validation cohort. Comparisons between the new risk score model and previous reported models were applied. The incidence of post-PCI CIN in the analysis cohort (n = 565) was 12%. Considerably high CIN incidence (50%) was observed in patients with chronic kidney disease (CKD). Age >75, body mass index (BMI) >25, myoglobin level, cardiac function level, hypoalbuminaemia, history of chronic kidney disease (CKD), Intra-aortic balloon pump (IABP) and peripheral vascular disease (PVD) were identified as independent risk factors of post-PCI CIN. A novel risk score model was established using multivariate regression coefficients, which showed highest sensitivity and specificity (0.917, 95%CI 0.877-0.957) compared with previous models. A new post-PCI CIN risk score model was developed based on a retrospective study of 805 patients. Application of this model might be helpful to predict CIN in patients undergoing PCI procedure. © 2015 Asian Pacific Society of Nephrology.
Zhu, Hongxiao; Morris, Jeffrey S; Wei, Fengrong; Cox, Dennis D
2017-07-01
Many scientific studies measure different types of high-dimensional signals or images from the same subject, producing multivariate functional data. These functional measurements carry different types of information about the scientific process, and a joint analysis that integrates information across them may provide new insights into the underlying mechanism for the phenomenon under study. Motivated by fluorescence spectroscopy data in a cervical pre-cancer study, a multivariate functional response regression model is proposed, which treats multivariate functional observations as responses and a common set of covariates as predictors. This novel modeling framework simultaneously accounts for correlations between functional variables and potential multi-level structures in data that are induced by experimental design. The model is fitted by performing a two-stage linear transformation-a basis expansion to each functional variable followed by principal component analysis for the concatenated basis coefficients. This transformation effectively reduces the intra-and inter-function correlations and facilitates fast and convenient calculation. A fully Bayesian approach is adopted to sample the model parameters in the transformed space, and posterior inference is performed after inverse-transforming the regression coefficients back to the original data domain. The proposed approach produces functional tests that flag local regions on the functional effects, while controlling the overall experiment-wise error rate or false discovery rate. It also enables functional discriminant analysis through posterior predictive calculation. Analysis of the fluorescence spectroscopy data reveals local regions with differential expressions across the pre-cancer and normal samples. These regions may serve as biomarkers for prognosis and disease assessment.
Understanding multidecadal variability in ENSO amplitude
NASA Astrophysics Data System (ADS)
Russell, A.; Gnanadesikan, A.
2013-12-01
Sea surface temperatures (SSTs) in the tropical Pacific vary as a result of the coupling between the ocean and atmosphere driven largely by the El Niño - Southern Oscillation (ENSO). ENSO has a large impact on the local climate and hydrology of the tropical Pacific, as well as broad-reaching effects on global climate. ENSO amplitude is known to vary on long timescales, which makes it very difficult to quantify its response to climate change and constrain the physical processes that drive it. In order to assess the extent of unforced multidecadal changes in ENSO variability, a linear regression of local SST changes is applied to the GFDL CM2.1 model 4000-yr pre-industrial control run. The resulting regression coefficient strengths, which represent the sensitivity of SST changes to thermocline depth and zonal wind stress, vary by up to a factor of 2 on multi-decadal time scales. This long-term modulation in ocean-atmosphere coupling is highly correlated with ENSO variability, but does not explain the reasons for such variability. Variation in the relationship between SST changes and wind stress points to a role for changing stratification in the central equatorial Pacific in modulating ENSO amplitudes, with stronger stratification reducing the response to winds. The main driving mechanism we have identified for higher ENSO variance is a change in the response of zonal winds to SST anomalies. The shifting convection and precipitation patterns associated with the changing state of the atmosphere also contribute to the variability of the regression coefficients. These mechanisms drive much of the variability in ENSO amplitude and hence ocean-atmosphere coupling in the tropical Pacific.
Quality of life, depression, and sexual dysfunction in spouses of female patients with fibromyalgia.
Tutoglu, Ahmet; Boyaci, Ahmet; Koca, Irfan; Celen, Esra; Korkmaz, Nurdan
2014-08-01
The aim of this study was to investigate the effects of the quality of life and psychological condition of female patients with fibromyalgia and their spouses on sexual function. A total of 32 female patients diagnosed with fibromyalgia and their spouses were analyzed. Thirty married couples were included in the study as the control group. The demographic data of the fibromyalgia patients were recorded, a visual analog scale was used to evaluate the level of pain, and the Fibromyalgia Impact Questionnaire was used to evaluate the impact of the symptoms on the quality of life of the patients. The quality of life of both the patients and the control group was evaluated using the Short Form 36 (SF-36), and psychological variables were evaluated using the Beck Depression Inventory (BDI) and Beck Anxiety Inventory. Sexual function was assessed using the Female Sexual Function Index for female participants and the International Index of Erectile Function (IIEF) for male participants. The IIEF erectile dysfunction scores were significantly lower in the spouses of female patients with fibromyalgia than in the control group (p < 0.05), and the BDI scores were significantly higher in the spouses of the female patients with fibromyalgia (p < 0.05). Among the SF-36 scores, the emotional and physical roles were significantly lower in the spouses of the female patients with fibromyalgia (p = 0.003 and p = 0.004, respectively). Across all spouses of FMS patients and controls, erectile function was significantly negatively correlated with the BDI score and with being married to an FMS patient, and positively correlated with emotional role, social function, mental health, SF-36 pain score, and general health (p < 0.05 for all). In a linear regression model, BDI, being married to an FMS patient, and general health were found to affect erectile function (beta regression coefficient = -0.572, SE = 0.082, p = 0.001; beta regression coefficient = -0.332, SE = 1.619, p = 0.007; beta regression coefficient = 0.445, SE = 0.065, p = 0.005, respectively). Being a spouse of a patient with fibromyalgia might significantly interfere with quality of life and lead to a high rate of sexual dysfunction. Spouses of patients with fibromyalgia might also be investigated for sexual dysfunction and quality of life. Treatment programs for this group should be considered.
Hsu, David
2015-09-27
Clustering methods are often used to model energy consumption for two reasons. First, clustering is often used to process data and to improve the predictive accuracy of subsequent energy models. Second, stable clusters that are reproducible with respect to non-essential changes can be used to group, target, and interpret observed subjects. However, it is well known that clustering methods are highly sensitive to the choice of algorithms and variables. This can lead to misleading assessments of predictive accuracy and misinterpretation of clusters in policymaking. This paper therefore introduces two methods to the modeling of energy consumption in buildings: clusterwise regression, also known as latent class regression, which integrates clustering and regression simultaneously; and cluster validation methods to measure stability. Using a large dataset of multifamily buildings in New York City, clusterwise regression is compared to common two-stage algorithms that use K-means and model-based clustering with linear regression. Predictive accuracy is evaluated using 20-fold cross validation, and the stability of the perturbed clusters is measured using the Jaccard coefficient. These results show that there seems to be an inherent tradeoff between prediction accuracy and cluster stability. This paper concludes by discussing which clustering methods may be appropriate for different analytical purposes.
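As an illustration of the stability measure named in the abstract above, the Jaccard coefficient between an original cluster and its counterpart after perturbation can be sketched in Python as follows. This is a minimal sketch, not the paper's procedure: the function name, the toy building ids, and the convention for two empty clusters are assumptions made here.

```python
def jaccard(cluster_a, cluster_b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| between two collections of member ids."""
    a, b = set(cluster_a), set(cluster_b)
    if not a and not b:
        # Convention assumed for this sketch: two empty clusters count as identical.
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical building ids assigned to one cluster before and after perturbing the data.
original = {1, 2, 3, 4}
perturbed = {2, 3, 4, 5}
print(jaccard(original, perturbed))  # 3 shared members out of 5 distinct ones: 0.6
```

Values near 1 indicate that the cluster survives the perturbation largely unchanged; values near 0 indicate an unstable cluster.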
Boosting structured additive quantile regression for longitudinal childhood obesity data.
Fenske, Nora; Fahrmeir, Ludwig; Hothorn, Torsten; Rzehak, Peter; Höhle, Michael
2013-07-25
Childhood obesity and the investigation of its risk factors has become an important public health issue. Our work is based on and motivated by a German longitudinal study including 2,226 children with up to ten measurements on their body mass index (BMI) and risk factors from birth to the age of 10 years. We introduce boosting of structured additive quantile regression as a novel distribution-free approach for longitudinal quantile regression. The quantile-specific predictors of our model include conventional linear population effects, smooth nonlinear functional effects, varying-coefficient terms, and individual-specific effects, such as intercepts and slopes. Estimation is based on boosting, a computer intensive inference method for highly complex models. We propose a component-wise functional gradient descent boosting algorithm that allows for penalized estimation of the large variety of different effects, particularly leading to individual-specific effects shrunken toward zero. This concept allows us to flexibly estimate the nonlinear age curves of upper quantiles of the BMI distribution, both on population and on individual-specific level, adjusted for further risk factors and to detect age-varying effects of categorical risk factors. Our model approach can be regarded as the quantile regression analog of Gaussian additive mixed models (or structured additive mean regression models), and we compare both model classes with respect to our obesity data.
NASA Astrophysics Data System (ADS)
Gowda, Shivalinge; Krishnaveni, S.; Yashoda, T.; Umesh, T. K.; Gowda, Ramakrishna
2004-09-01
Photon mass attenuation coefficients of some thermoluminescent dosimetric (TLD) compounds, such as LiF, CaCO3, CaSO4, CaSO4·2H2O, SrSO4, CdSO4, BaSO4, C4H6BaO4 and 3CdSO4·8H2O were determined at 279.2, 320.07, 514.0, 661.6, 1115.5, 1173.2 and 1332.5 keV in a well-collimated narrow beam good geometry set-up using a high resolution, hyper pure germanium detector. The attenuation coefficient data were then used to compute the effective atomic number and the electron density of TLD compounds. The interpolation of total attenuation cross-sections of photons of energy E in elements of atomic number Z was performed using the logarithmic regression analysis of the data measured by the authors and reported earlier. The best-fit coefficients so obtained in the photon energy range of 279.2 to 320.07 keV, 514.0 to 661.6 keV and 1115.5 to 1332.5 keV by a piece-wise interpolation method were then used to find the effective atomic number and electron density of the compounds. These values are found to be in agreement with other available published values.
Liu, Cong; Kolarik, Barbara; Gunnarsen, Lars; Zhang, Yinping
2015-10-20
Polychlorinated biphenyls (PCBs) have been found to be persistent in the environment and possibly harmful. Many buildings are characterized with high PCB concentrations. Knowledge about partitioning between primary sources and building materials is critical for exposure assessment and practical remediation of PCB contamination. This study develops a C-depth method to determine diffusion coefficient (D) and partition coefficient (K), two key parameters governing the partitioning process. For concrete, a primary material studied here, relative standard deviations of results among five data sets are 5%-22% for K and 42-66% for D. Compared with existing methods, C-depth method overcomes the inability to obtain unique estimation for nonlinear regression and does not require assumed correlations for D and K among congeners. Comparison with a more sophisticated two-term approach implies significant uncertainty for D, and smaller uncertainty for K. However, considering uncertainties associated with sampling and chemical analysis, and impact of environmental factors, the results are acceptable for engineering applications. This was supported by good agreement between model prediction and measurement. Sensitivity analysis indicated that effective diffusion distance, contacting time of materials with primary sources, and depth of measured concentrations are critical for determining D, and PCB concentration in primary sources is critical for K.
Improved model of the retardance in citric acid coated ferrofluids using stepwise regression
NASA Astrophysics Data System (ADS)
Lin, J. F.; Qiu, X. R.
2017-06-01
Citric acid (CA) coated Fe3O4 ferrofluids (FFs) have been studied for biomedical applications. The magneto-optical retardance of CA coated FFs was measured by a Stokes polarimeter. Optimization and multiple regression of retardance in FFs were previously carried out with the Taguchi method and Microsoft Excel, and the F value of the regression model was large enough. However, the model produced with Excel was not systematic. Instead, we adopted stepwise regression to model the retardance of CA coated FFs. From the results of stepwise regression in MATLAB, the developed model had high predictive ability, with an F value of 2.55897e+7 and a correlation coefficient of one. The average absolute error of predicted retardances relative to measured retardances was just 0.0044%. Using the genetic algorithm (GA) in MATLAB, the optimized parametric combination was determined as [4.709 0.12 39.998 70.006], corresponding to the pH of the suspension, the molar ratio of CA to Fe3O4, the CA volume, and the coating temperature. The maximum retardance was found to be 31.712°, close to that obtained by the evolutionary solver in Excel, with a relative error of -0.013%. Above all, the stepwise regression method was successfully used to model the retardance of CA coated FFs, and the maximum global retardance was determined by the use of GA.
Tokunaga, Makoto; Watanabe, Susumu; Sonoda, Shigeru
2017-09-01
Multiple linear regression analysis is often used to predict the outcome of stroke rehabilitation. However, the predictive accuracy may not be satisfactory. The objective of this study was to elucidate the predictive accuracy of a method of calculating motor Functional Independence Measure (mFIM) at discharge from mFIM effectiveness predicted by multiple regression analysis. The subjects were 505 patients with stroke who were hospitalized in a convalescent rehabilitation hospital. The formula "mFIM at discharge = mFIM effectiveness × (91 points - mFIM at admission) + mFIM at admission" was used. By including the predicted mFIM effectiveness obtained through multiple regression analysis in this formula, we obtained the predicted mFIM at discharge (A). We also used multiple regression analysis to directly predict mFIM at discharge (B). The correlation between the predicted and the measured values of mFIM at discharge was compared between A and B. The correlation coefficients were .916 for A and .878 for B. Calculating mFIM at discharge from mFIM effectiveness predicted by multiple regression analysis had a higher degree of predictive accuracy of mFIM at discharge than that directly predicted. Copyright © 2017 National Stroke Association. Published by Elsevier Inc. All rights reserved.
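The formula quoted in the abstract above can be written as a short Python sketch. The function names and the example scores are illustrative assumptions; only the formula itself and the 91-point maximum come from the abstract. The second function is simply the same formula solved for mFIM effectiveness.

```python
MFIM_MAX = 91  # maximum motor FIM score used in the abstract's formula


def mfim_at_discharge(effectiveness, mfim_admission):
    """mFIM at discharge = effectiveness x (91 - mFIM at admission) + mFIM at admission."""
    return effectiveness * (MFIM_MAX - mfim_admission) + mfim_admission


def mfim_effectiveness(mfim_admission, mfim_discharge):
    """Same formula solved for effectiveness; undefined when admission is already 91."""
    if mfim_admission >= MFIM_MAX:
        raise ValueError("effectiveness is undefined when mFIM at admission is 91")
    return (mfim_discharge - mfim_admission) / (MFIM_MAX - mfim_admission)


# Hypothetical patient: admitted with 41 points, predicted effectiveness 0.5.
print(mfim_at_discharge(0.5, 41))  # 0.5 * 50 + 41 = 66.0
```

In approach A of the abstract, the `effectiveness` argument would be the value predicted by multiple regression; approach B predicts the discharge score directly without this conversion.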
Wang, Xiaojing; Chen, Ming-Hui; Yan, Jun
2013-07-01
Cox models with time-varying coefficients offer great flexibility in capturing the temporal dynamics of covariate effects on event times, which could be hidden from a Cox proportional hazards model. Methodology development for varying coefficient Cox models, however, has been largely limited to right censored data; only limited work on interval censored data has been done. In most existing methods for varying coefficient models, analysts need to specify which covariate coefficients are time-varying and which are not at the time of fitting. We propose a dynamic Cox regression model for interval censored data in a Bayesian framework, where the coefficient curves are piecewise constant but the number of pieces and the jump points are covariate specific and estimated from the data. The model automatically determines the extent to which the temporal dynamics is needed for each covariate, resulting in smoother and more stable curve estimates. The posterior computation is carried out via an efficient reversible jump Markov chain Monte Carlo algorithm. Inference of each coefficient is based on an average of models with different number of pieces and jump points. A simulation study with three covariates, each with a coefficient of different degree in temporal dynamics, confirmed that the dynamic model is preferred to the existing time-varying model in terms of model comparison criteria through conditional predictive ordinate. When applied to a dental health data of children with age between 7 and 12 years, the dynamic model reveals that the relative risk of emergence of permanent tooth 24 between children with and without an infected primary predecessor is the highest at around age 7.5, and that it gradually reduces to one after age 11. These findings were not seen from the existing studies with Cox proportional hazards models.
Near infrared spectral linearisation in quantifying soluble solids content of intact carambola.
Omar, Ahmad Fairuz; MatJafri, Mohd Zubir
2013-04-12
This study presents a novel application of near infrared (NIR) spectral linearisation for measuring the soluble solids content (SSC) of carambola fruits. NIR spectra were measured using reflectance and interactance methods. In this study, only the interactance measurement technique successfully generated a reliable measurement result, with a coefficient of determination (R2) of 0.724 and a root mean square error of prediction (RMSEP) of 0.461 °Brix. The results from this technique produced a highly accurate and stable prediction model compared with multiple linear regression techniques.
Near Infrared Spectral Linearisation in Quantifying Soluble Solids Content of Intact Carambola
Omar, Ahmad Fairuz; MatJafri, Mohd Zubir
2013-01-01
This study presents a novel application of near infrared (NIR) spectral linearisation for measuring the soluble solids content (SSC) of carambola fruits. NIR spectra were measured using reflectance and interactance methods. In this study, only the interactance measurement technique successfully generated a reliable measurement result, with a coefficient of determination (R2) of 0.724 and a root mean square error of prediction (RMSEP) of 0.461 °Brix. The results from this technique produced a highly accurate and stable prediction model compared with multiple linear regression techniques. PMID:23584118
DFT study on oxidation of HS(CH2)mSH (m = 1-8) in oxidative desulfurization
NASA Astrophysics Data System (ADS)
Song, Y. Z.; Song, J. J.; Zhao, T. T.; Chen, C. Y.; He, M.; Du, J.
2016-06-01
Density functional theory was employed for calculation of HS(CH2)mSH (m = 1-8) and its derivatives using the B3LYP method at the 6-31++G(d,p) level. Using eigenvalues of LUMO and HOMO for HS(CH2)mSH, the standard electrode potentials were estimated by a stepwise multiple linear regression (MLR) technique, and obtained as E° = 1.500 + 7.167 × 10^-3 HOMO - 0.229 LUMO, with a high correlation coefficient of 0.973 and an F value of 43.973.
Structured sparse linear graph embedding.
Wang, Haixian
2012-03-01
Subspace learning is a core issue in pattern recognition and machine learning. Linear graph embedding (LGE) is a general framework for subspace learning. In this paper, we propose a structured sparse extension to LGE (SSLGE) by introducing a structured sparsity-inducing norm into LGE. Specifically, SSLGE casts the projection bases learning into a regression-type optimization problem, and then the structured sparsity regularization is applied to the regression coefficients. The regularization selects a subset of features and meanwhile encodes high-order information reflecting a priori structure information of the data. The SSLGE technique provides a unified framework for discovering structured sparse subspace. Computationally, by using a variational equality and the Procrustes transformation, SSLGE is efficiently solved with closed-form updates. Experimental results on face image show the effectiveness of the proposed method. Copyright © 2011 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Nishidate, Izumi; Abdul, Wares MD.; Ohtsu, Mizuki; Nakano, Kazuya; Haneishi, Hideaki
2018-02-01
We propose a method to estimate transcutaneous bilirubin, hemoglobin, and melanin based on diffuse reflectance spectroscopy. In the proposed method, Monte Carlo simulation-based multiple regression analysis of an absorbance spectrum in the visible wavelength region (460-590 nm) is used to specify the concentrations of bilirubin (Cbil), oxygenated hemoglobin (Coh), deoxygenated hemoglobin (Cdh), and melanin (Cm). Using the absorbance spectrum calculated from the measured diffuse reflectance spectrum as the response variable and the extinction coefficients of bilirubin, oxygenated hemoglobin, deoxygenated hemoglobin, and melanin as predictor variables, multiple regression analysis provides regression coefficients. Concentrations of bilirubin, oxygenated hemoglobin, deoxygenated hemoglobin, and melanin are then determined from the regression coefficients using conversion vectors that are numerically deduced in advance by Monte Carlo simulations of light transport in skin. Total hemoglobin concentration (Cth) and tissue oxygen saturation (StO2) are simply calculated from the oxygenated and deoxygenated hemoglobin concentrations. In vivo animal experiments with bile duct ligation in rats demonstrated that the estimated Cbil increases after ligation of the bile duct and reaches around 20 mg/dl at 72 h after the onset of the ligation, which corresponds to the reference value of Cbil measured by a commercially available transcutaneous bilirubin meter. We also performed in vivo experiments with rats while varying the fraction of inspired oxygen (FiO2). Coh and Cdh decreased and increased, respectively, as FiO2 decreased. Consequently, StO2 decreased dramatically. The results of this study indicate the potential of the method for simultaneous evaluation of multiple chromophores in skin tissue.
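The abstract above says that Cth and StO2 are "simply calculated" from Coh and Cdh but does not print the expressions. A minimal Python sketch, assuming the conventional definitions (total hemoglobin as the sum of the two forms, and saturation as the oxygenated fraction in percent), is given below; the function names and example concentrations are hypothetical.

```python
def total_hemoglobin(c_oh, c_dh):
    """Assumed definition: Cth = Coh + Cdh (same concentration units for both)."""
    return c_oh + c_dh


def tissue_oxygen_saturation(c_oh, c_dh):
    """Assumed definition: StO2 = 100 * Coh / (Coh + Cdh), in percent."""
    c_th = total_hemoglobin(c_oh, c_dh)
    if c_th == 0:
        raise ValueError("StO2 is undefined when total hemoglobin is zero")
    return 100.0 * c_oh / c_th


# Hypothetical concentrations in arbitrary but matching units.
print(tissue_oxygen_saturation(3.0, 1.0))  # 75.0
```

Under these definitions, a fall in Coh together with a rise in Cdh, as reported for decreasing FiO2, necessarily lowers StO2.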
Daily magnesium intake and serum magnesium concentration among Japanese people.
Akizawa, Yoriko; Koizumi, Sadayuki; Itokawa, Yoshinori; Ojima, Toshiyuki; Nakamura, Yosikazu; Tamura, Tarou; Kusaka, Yukinori
2008-01-01
The vitamins and minerals that are deficient in the daily diet of a normal adult remain unknown. To answer this question, we conducted a population survey focusing on the relationship between dietary magnesium intake and serum magnesium level. The subjects were 62 individuals from Fukui Prefecture who participated in the 1998 National Nutrition Survey. The survey investigated the physical status, nutritional status, and dietary data of the subjects. Holidays and special occasions were avoided, and a day when people are most likely to be on an ordinary diet was selected as the survey date. The mean (+/-standard deviation) daily magnesium intake was 322 (+/-132), 323 (+/-163), and 322 (+/-147) mg/day for men, women, and the entire group, respectively. The mean (+/-standard deviation) serum magnesium concentration was 20.69 (+/-2.83), 20.69 (+/-2.88), and 20.69 (+/-2.83) ppm for men, women, and the entire group, respectively. The distribution of serum magnesium concentration was normal. Dietary magnesium intake showed a log-normal distribution, which was then transformed by logarithmic conversion for examining the regression coefficients. The slope of the regression line between the serum magnesium concentration (Y ppm) and daily magnesium intake (X mg) was determined using the formula Y = 4.93 (log(10)X) + 8.49. The coefficient of correlation (r) was 0.29. A regression line (Y = 14.65X + 19.31) was observed between the daily intake of magnesium (Y mg) and serum magnesium concentration (X ppm). The coefficient of correlation was 0.28. The daily magnesium intake correlated with serum magnesium concentration, and a linear regression model between them was proposed.
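The first regression line reported in the abstract above, Y = 4.93 log10(X) + 8.49, can be evaluated with a few lines of Python. This is only a sketch of the published equation; the function name is an assumption, and with r = 0.29 the line describes a weak population trend rather than a usable individual prediction.

```python
import math


def predicted_serum_mg_ppm(intake_mg_per_day):
    """Serum magnesium (ppm) from daily intake (mg), using Y = 4.93*log10(X) + 8.49."""
    if intake_mg_per_day <= 0:
        raise ValueError("intake must be positive to take its logarithm")
    return 4.93 * math.log10(intake_mg_per_day) + 8.49


# Evaluate the line near the reported mean intake of 322 mg/day.
print(round(predicted_serum_mg_ppm(322), 2))
```

Because the predictor enters through log10, a tenfold change in intake shifts the predicted serum concentration by only 4.93 ppm.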
Estimation of variance in Cox's regression model with shared gamma frailties.
Andersen, P K; Klein, J P; Knudsen, K M; Tabanera y Palacios, R
1997-12-01
The Cox regression model with a shared frailty factor allows for unobserved heterogeneity or for statistical dependence between the observed survival times. Estimation in this model when the frailties are assumed to follow a gamma distribution is reviewed, and we address the problem of obtaining variance estimates for regression coefficients, frailty parameter, and cumulative baseline hazards using the observed nonparametric information matrix. A number of examples are given comparing this approach with fully parametric inference in models with piecewise constant baseline hazards.
Practical Session: Simple Linear Regression
NASA Astrophysics Data System (ADS)
Clausel, M.; Grégoire, G.
2014-12-01
Two exercises are proposed to illustrate simple linear regression. The first one is based on Galton's famous data set on heredity. We use the lm R command and get coefficient estimates, standard error of the error, R2, residuals, and so on. In the second example, devoted to data related to the vapor tension of mercury, we fit a simple linear regression, predict values, and look ahead to multiple linear regression. This practical session is an excerpt from practical exercises proposed by A. Dalalyan at EPNC (see Exercises 1 and 2 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_4.pdf).
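The session above uses R's lm command. For readers who want to see what such a fit computes, the closed-form least-squares estimates for a single predictor can be sketched in plain Python as below. This is not the session's code: the function name and the toy data are assumptions, and the data are not Galton's.

```python
def simple_linear_regression(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (intercept, slope, r_squared, residual_standard_error, residuals).
    """
    n = len(x)
    if n < 3 or n != len(y):
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    # Residual standard error uses n - 2 degrees of freedom (two fitted parameters).
    sigma = (ss_res / (n - 2)) ** 0.5
    return intercept, slope, r_squared, sigma, residuals


# Toy data, made up for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
intercept, slope, r2, sigma, _ = simple_linear_regression(x, y)
print(intercept, slope, r2, sigma)
```

The returned quantities correspond to the items listed in the abstract: coefficient estimates, the estimated standard deviation of the error, R2, and the residuals.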
Determination of suitable drying curve model for bread moisture loss during baking
NASA Astrophysics Data System (ADS)
Soleimani Pour-Damanab, A. R.; Jafary, A.; Rafiee, S.
2013-03-01
This study presents mathematical modelling of bread moisture loss or drying during baking in a conventional bread baking process. In order to estimate and select the appropriate moisture loss curve equation, 11 different models, semi-theoretical and empirical, were applied to the experimental data and compared according to their correlation coefficients, chi-squared test and root mean square error, which were predicted by nonlinear regression analysis. Consequently, of all the drying models, a Page model was selected as the best one, according to the correlation coefficients, chi-squared test, and root mean square error values and its simplicity. The mean absolute estimation errors of the proposed model by linear regression analysis for natural and forced convection modes were 2.43% and 4.74%, respectively.
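The abstract above selects a Page model but does not print its equation. In the drying literature the Page model is commonly written as MR(t) = exp(-k t^n), where MR is the dimensionless moisture ratio; the Python sketch below evaluates that form. The parameter values are hypothetical and are not the fitted values from the study.

```python
import math


def page_moisture_ratio(t, k, n):
    """Page thin-layer drying model: MR(t) = exp(-k * t**n), with t >= 0."""
    if t < 0:
        raise ValueError("time must be non-negative")
    return math.exp(-k * t ** n)


# Hypothetical parameters (k, n) and baking times in minutes, for illustration only.
k, n = 0.05, 1.2
for minutes in (0, 5, 10, 20):
    print(minutes, round(page_moisture_ratio(minutes, k, n), 3))
```

With n = 1 the expression reduces to a simple exponential decay, which is why the Page model is often described as a two-parameter empirical extension of it.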
NASA Astrophysics Data System (ADS)
Nazeer, Majid; Bilal, Muhammad
2018-04-01
Landsat-5 Thematic Mapper (TM) data have been used to estimate salinity in the coastal area of Hong Kong. Four adjacent Landsat TM images were used in this study, which were atmospherically corrected using the Second Simulation of the Satellite Signal in the Solar Spectrum (6S) radiative transfer code. The atmospherically corrected images were further used to develop models for salinity using Ordinary Least Squares (OLS) regression and Geographically Weighted Regression (GWR) based on in situ data of October 2009. Results show that the coefficient of determination (R2) of 0.42 between the OLS estimated and in situ measured salinity is much lower than that of the GWR model, which is two times higher (R2 = 0.86). This indicates that the GWR model predicts salinity better than the OLS regression model and captures its spatial heterogeneity better. It was observed that the salinity was high in Deep Bay (north-western part of Hong Kong), which might be due to industrial waste disposal, whereas the salinity was estimated to be constant (32 practical salinity units) towards the open sea.
Hong, K; Muntner, P; Kronish, I; Shilane, D; Chang, T I
2016-01-01
Lower adherence to antihypertensive medications may increase visit-to-visit variability of blood pressure (VVV of BP), a risk factor for cardiovascular events and death. We used data from the African American Study of Kidney Disease and Hypertension (AASK) trial to examine whether lower medication adherence is associated with higher systolic VVV of BP in African Americans with hypertensive chronic kidney disease (CKD). Determinants of VVV of BP were also explored. AASK participants (n=988) were categorized by self-report or pill count as having perfect (100%), moderately high (75-99%), moderately low (50-74%) or low (<50%) proportion of study visits with high medication adherence over a 1-year follow-up period. We used multinomial logistic regression to examine determinants of medication adherence, and multivariable-adjusted linear regression to examine the association between medication adherence and systolic VVV of BP, defined as the coefficient of variation or the average real variability (ARV). Participants with lower self-reported adherence were generally younger and had a higher prevalence of comorbid conditions. Compared with perfect adherence, moderately high, moderately low and low adherence was associated with 0.65% (±0.31%), 0.99% (±0.31%) and 1.29% (±0.32%) higher systolic VVV of BP (defined as the coefficient of variation) in fully adjusted models. Results were qualitatively similar when using ARV or when using pill counts as the measure of adherence. Lower medication adherence is associated with higher systolic VVV of BP in African Americans with hypertensive CKD; efforts to improve medication adherence in this population may reduce systolic VVV of BP.
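The two variability measures named in the abstract above can be sketched in Python. The coefficient of variation is the standard deviation divided by the mean; average real variability is assumed here to follow its usual definition, the mean absolute difference between consecutive visits. The function names and blood pressure readings are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(readings):
    """Coefficient of variation in percent: 100 * sample SD / mean."""
    return 100.0 * stdev(readings) / mean(readings)


def average_real_variability(readings):
    """Mean absolute difference between consecutive readings (assumed ARV definition)."""
    if len(readings) < 2:
        raise ValueError("need at least two visits")
    diffs = [abs(later - earlier) for earlier, later in zip(readings, readings[1:])]
    return mean(diffs)


# Hypothetical systolic readings (mm Hg) from four consecutive visits.
sbp = [130, 140, 130, 140]
print(average_real_variability(sbp))  # every visit-to-visit change is 10 mm Hg: 10
print(round(coefficient_of_variation(sbp), 2))
```

Unlike the coefficient of variation, ARV depends on the order of the visits, so the two measures can rank the same patients differently.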
The Reliability of Individualized Load-Velocity Profiles.
Banyard, Harry G; Nosaka, K; Vernon, Alex D; Haff, G Gregory
2017-11-15
This study examined the reliability of peak velocity (PV), mean propulsive velocity (MPV), and mean velocity (MV) in the development of load-velocity profiles (LVP) in the full depth free-weight back squat performed with maximal concentric effort. Eighteen resistance-trained men performed a baseline one-repetition maximum (1RM) back squat trial and three subsequent 1RM trials used for reliability analyses, with a 48-hour interval between trials. 1RM trials comprised lifts from six relative loads including 20, 40, 60, 80, 90, and 100% 1RM. Individualized LVPs for PV, MPV, or MV were derived from loads that were highly reliable based on the following criteria: intra-class correlation coefficient (ICC) >0.70, coefficient of variation (CV) ≤10%, and Cohen's d effect size (ES) <0.60. PV was highly reliable at all six loads. Importantly, MPV and MV were highly reliable at 20, 40, 60, 80 and 90% but not 100% 1RM (MPV: ICC=0.66, CV=18.0%, ES=0.10, standard error of the estimate [SEM]=0.04 m·s^-1; MV: ICC=0.55, CV=19.4%, ES=0.08, SEM=0.04 m·s^-1). When considering the reliable ranges, almost perfect correlations were observed for LVPs derived from PV 20-100% (r=0.91-0.93), MPV 20-90% (r=0.92-0.94) and MV 20-90% (r=0.94-0.95). Furthermore, the LVPs were not significantly different (p>0.05) between trials, movement velocities, or between linear regression versus second order polynomial fits. PV 20-100%, MPV 20-90%, and MV 20-90% are reliable and can be utilized to develop LVPs using linear regression. Conceptually, LVPs can be used to monitor changes in movement velocity and employed as a method for adjusting sessional training loads according to daily readiness.
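The three reliability criteria stated in the abstract above (ICC > 0.70, CV ≤ 10%, ES < 0.60) can be expressed as a single check. The Python sketch below applies them to the values the abstract reports for the 100% 1RM load; the function name and the passing example are assumptions.

```python
def is_highly_reliable(icc, cv_percent, effect_size):
    """Apply the abstract's criteria: ICC > 0.70, CV <= 10%, and Cohen's d < 0.60."""
    return icc > 0.70 and cv_percent <= 10.0 and effect_size < 0.60


# Values reported in the abstract for the 100% 1RM load.
print(is_highly_reliable(0.66, 18.0, 0.10))  # MPV: fails on ICC and CV, so False
print(is_highly_reliable(0.55, 19.4, 0.08))  # MV: fails on ICC and CV, so False

# Hypothetical values that would satisfy all three criteria.
print(is_highly_reliable(0.90, 5.0, 0.20))  # True
```

All three conditions must hold at once, which is why a small effect size alone (0.10 or 0.08) does not rescue MPV or MV at the heaviest load.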
Kanjanahattakij, Napatt; Sirinvaravong, Natee; Aguilar, Francisco; Agrawal, Akanksha; Krishnamoorthy, Parasuram; Gupta, Shuchita
2018-01-01
In patients with heart failure with preserved ejection fraction (HFpEF), worse kidney function is associated with worse overall cardiac mechanics. Right ventricular stroke work index (RVSWI) is a parameter of right ventricular function. The aim of our study was to determine the relationship between RVSWI and glomerular filtration rate (GFR) in patients with HFpEF. This was a single-center cross-sectional study. HFpEF is defined as patients with documented heart failure with ejection fraction > 50% and pulmonary wedge pressure > 15 mm Hg from right heart catheterization. RVSWI (normal value 8-12 g/m/beat/m2) was calculated using the formula: RVSWI = 0.0136 × stroke volume index × (mean pulmonary artery pressure - mean right atrial pressure). Univariate and multivariate linear regression analysis was performed to study the correlation between RVSWI and GFR. Ninety-one patients were included in the study. The patients were predominantly female (n = 64, 70%) and African American (n = 61, 67%). Mean age was 66 ± 12 years. Mean GFR was 59 ± 35 mL/min/1.73 m2. Mean RVSWI was 11 ± 6 g/m/beat/m2. Linear regression analysis showed that there was a significant independent inverse relationship between RVSWI and GFR (unstandardized coefficient = -1.3, p = 0.029). In the subgroup with combined post and precapillary pulmonary hypertension (Cpc-PH) the association remained significant (unstandardized coefficient = -1.74, 95% CI -3.37 to -0.11, p = 0.04). High right ventricular workload indicated by high RVSWI is associated with worse renal function in patients with Cpc-PH. Further prospective studies are needed to better understand this association. © 2018 S. Karger AG, Basel.
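The RVSWI formula given in the abstract above translates directly into code. In the Python sketch below the function name and the example haemodynamic values are hypothetical; the constant 0.0136 and the structure of the formula are taken from the abstract.

```python
def rvswi(stroke_volume_index, mean_pa_pressure, mean_ra_pressure):
    """RVSWI = 0.0136 * SVI * (mean pulmonary artery pressure - mean right atrial pressure).

    SVI in mL/m2 per beat, pressures in mm Hg.
    """
    return 0.0136 * stroke_volume_index * (mean_pa_pressure - mean_ra_pressure)


# Hypothetical patient: SVI 40 mL/m2, mean PA pressure 30 mm Hg, mean RA pressure 10 mm Hg.
print(round(rvswi(40, 30, 10), 2))  # 0.0136 * 40 * 20 = 10.88
```

The example value falls inside the normal range of 8-12 quoted in the abstract.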
Badgett, Majors J; Boyes, Barry; Orlando, Ron
2018-02-16
A model that predicts retention for peptides using a HALO® penta-HILIC column and gradient elution was created. Coefficients for each amino acid were derived using linear regression analysis and these coefficients can be summed to predict the retention of peptides. This model has a high correlation between experimental and predicted retention times (0.946), which is on par with previous RP and HILIC models. External validation of the model was performed using a set of H. pylori samples on the same LC-MS system used to create the model, and the deviation from actual to predicted times was low. Apart from amino acid composition, length and location of amino acid residues on a peptide were examined and two site-specific corrections for hydrophobic residues at the N-terminus as well as hydrophobic residues one spot over from the N-terminus were created. Copyright © 2017 Elsevier B.V. All rights reserved.
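The additive scheme described in the abstract above, in which per-residue coefficients are summed to predict retention, can be sketched in Python as follows. The coefficient values, the intercept argument, and the function name are all hypothetical placeholders: the abstract does not list the fitted coefficients, and the site-specific N-terminal corrections it mentions are not modelled here.

```python
def predict_retention(peptide, coefficients, intercept=0.0):
    """Sum per-residue coefficients over the peptide sequence (additive model)."""
    missing = sorted(set(peptide) - set(coefficients))
    if missing:
        raise KeyError(f"no coefficient for residues: {missing}")
    return intercept + sum(coefficients[residue] for residue in peptide)


# Hypothetical coefficients in minutes; NOT the values fitted in the study.
HYPOTHETICAL_COEFFICIENTS = {"G": 1.0, "S": 2.5, "K": 4.0}

print(predict_retention("GSK", HYPOTHETICAL_COEFFICIENTS))  # 1.0 + 2.5 + 4.0 = 7.5
```

Because the model is purely compositional, peptides that are permutations of each other receive the same prediction, which is one motivation for the position-dependent corrections the authors add.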
Fast function-on-scalar regression with penalized basis expansions.
Reiss, Philip T; Huang, Lei; Mennes, Maarten
2010-01-01
Regression models for functional responses and scalar predictors are often fitted by means of basis functions, with quadratic roughness penalties applied to avoid overfitting. The fitting approach described by Ramsay and Silverman in the 1990s amounts to a penalized ordinary least squares (P-OLS) estimator of the coefficient functions. We recast this estimator as a generalized ridge regression estimator, and present a penalized generalized least squares (P-GLS) alternative. We describe algorithms by which both estimators can be implemented, with automatic selection of optimal smoothing parameters, in a more computationally efficient manner than has heretofore been available. We discuss pointwise confidence intervals for the coefficient functions, simultaneous inference by permutation tests, and model selection, including a novel notion of pointwise model selection. P-OLS and P-GLS are compared in a simulation study. Our methods are illustrated with an analysis of age effects in a functional magnetic resonance imaging data set, as well as a reanalysis of a now-classic Canadian weather data set. An R package implementing the methods is publicly available.
Aisbett, B; Le Rossignol, P
2003-09-01
The VO2-power regression and estimated total energy demand for a 6-minute supra-maximal exercise test was predicted from a continuous incremental exercise test. Sub-maximal VO2-power co-ordinates were established from the last 40 seconds (s) of 150-second exercise stages. The precision of the estimated total energy demand was determined using the 95% confidence interval (95% CI) of the estimated total energy demand. The linearity of the individual VO2-power regression equations was determined using Pearson's correlation coefficient. The mean 95% CI of the estimated total energy demand was 5.9 +/- 2.5 mL O2 Eq x kg(-1) x min(-1), and the mean correlation coefficient was 0.9942 +/- 0.0042. The current study contends that the sub-maximal VO2-power co-ordinates from a continuous incremental exercise test can be used to estimate supra-maximal energy demand without compromising the precision of the accumulated oxygen deficit (AOD) method.
Source apportionment of PM2.5 light extinction in an urban atmosphere in China.
Lan, Zijuan; Zhang, Bin; Huang, Xiaofeng; Zhu, Qiao; Yuan, Jinfeng; Zeng, Liwu; Hu, Min; He, Lingyan
2018-01-01
Haze in China is primarily caused by high pollution of atmospheric fine particulates (PM2.5). However, the detailed source structures of PM2.5 light extinction have not been well established, especially for the roles of various organic aerosols, which leaves haze management without specific targets. This study obtained the mass concentrations of the chemical compositions and the light extinction coefficients of fine particles in the winter in Dongguan, Guangdong Province, using high time resolution aerosol observation instruments. We combined the positive matrix factorization (PMF) analysis model of organic aerosols and the multiple linear regression method to establish a quantitative relationship model between the main chemical components, in particular the different sources of organic aerosols, and the extinction coefficients of fine particles with a high goodness of fit (R2 = 0.953). The results show that the contribution rates of ammonium sulphate, ammonium nitrate, biomass burning organic aerosol (BBOA), secondary organic aerosol (SOA) and black carbon (BC) were 48.1%, 20.7%, 15.0%, 10.6%, and 5.6%, respectively. It can be seen that the contribution of the secondary aerosols is much higher than that of the primary aerosols (79.4% versus 20.6%) and that secondary aerosols are a major factor in the visibility decline. BBOA is found to have a high visibility-destroying potential, with a high mass extinction coefficient, and was the largest contributor during some high pollution periods. A more detailed analysis indicates that the contribution of the enhanced absorption caused by BC mixing state was approximately 37.7% of the total particle absorption and should not be neglected. Copyright © 2017. Published by Elsevier B.V.
Income inequality, disinvestment in health care and use of dental services.
Bhandari, Bishal; Newton, Jonathan T; Bernabé, Eduardo
2015-01-01
To explore the interrelationships between income inequality, disinvestment in health care, and use of dental services at country level. This study pooled national estimates for use of dental services among adults aged 18 years or older from the 70 countries that participated in the World Health Survey from 2002 to 2004, together with aggregate data on national income (GDP per capita), income inequality (Gini coefficient), and disinvestment in health care (total health expenditure and dentist-to-population ratio) from various international sources. Use of dental services was defined as having had dental problems in the last 12 months and having received any treatment to address those needs. Associations between variables were explored using Pearson correlation coefficients and linear regression. Data from 63 countries representing the six WHO regions were analyzed. Use of dental services was negatively correlated with Gini coefficient (Pearson correlation coefficient -0.48, P < 0.001) and positively correlated with GDP per capita (0.40, P < 0.05), total health expenditure (0.45, P < 0.001), and dentist-to-population ratio (0.67, P < 0.001). The association between Gini coefficient and use of dental services was attenuated but remained significant after adjustments for GDP per capita, total health expenditure, and dentist-to-population ratio (regression coefficient -0.36; 95% CI -0.57, -0.15). This study shows an inverse relationship between income inequality and use of dental services. Of the two indicators of disinvestment in health care assessed, only dentist-to-population ratio was associated with income inequality and use of dental services. © 2014 American Association of Public Health Dentistry.
A Weighted Least Squares Approach To Robustify Least Squares Estimates.
ERIC Educational Resources Information Center
Lin, Chowhong; Davenport, Ernest C., Jr.
This study developed a robust linear regression technique based on the idea of weighted least squares. In this technique, a subsample of the full data of interest is drawn, based on a measure of distance, and an initial set of regression coefficients is calculated. The rest of the data points are then taken into the subsample, one after another,…
Measuring Productivity of Depot-Level Aircraft Maintenance in the Air Force Logistics Command.
1985-09-01
[Table-of-contents fragment] List of Figures; List of Tables; Abstract; ... 6. DEA Efficiency Values (Third DEA Model); 7. DMU 5 Input Efficiencies; List of Tables: I. DEA ... Regression Results for 20 Months; V. Regression Results for 7 Quarters; VI. Coefficients of Correlation (Using Quarterly Data ...
Erosion and soil displacement related to timber harvesting in northwestern California, U.S.A.
R.M. Rice; D.J. Furbish
1984-01-01
The relationship between measures of site disturbance and erosion resulting from timber harvest was studied by regression analyses. None of the 12 regression models developed and tested yielded a coefficient of determination (R2) greater than 0.60. The results indicated that the poor fits to the data were due, in part, to unexplained qualitative...
"Erosion and soil displacement related to timber harvesting in northwestern California, U.S.A."
R. M. Rice; D. J. Furbish
1984-01-01
The relationship between measures of site disturbance and erosion resulting from timber harvest was studied by regression analyses. None of the 12 regression models developed and tested yielded a coefficient of determination (R2) greater than 0.60. The results indicated that the poor fits to the data were due, in part, to unexplained qualitative differences in...
Solar energy distribution over Egypt using cloudiness from Meteosat photos
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mosalam Shaltout, M.A.; Hassen, A.H.
1990-01-01
In Egypt, there are 10 ground stations for measuring the global solar radiation, and five stations for measuring the diffuse solar radiation. Every day at noon, the Meteorological Authority in Cairo receives three photographs of cloudiness over Egypt from the Meteosat satellite, one in the visible, and two in the infra-red bands (10.5-12.5 μm) and (5.7-7.1 μm). The monthly average cloudiness for 24 sites over Egypt is measured and calculated from Meteosat observations during the period 1985-1986. Correlation analysis between the cloudiness observed by Meteosat and global solar radiation measured from the ground stations is carried out. It is found that the correlation coefficients are about 0.90 for the simple linear regression, and increase for the second and third degree regressions. Also, the correlation coefficients for the cloudiness with the diffuse solar radiation are about 0.80 for the simple linear regression, and increase for the second and third degree regressions. Models and empirical relations for estimating the global and diffuse solar radiation from Meteosat cloudiness data over Egypt are deduced and tested. Seasonal maps for the global and diffuse radiation over Egypt are produced.
NASA Astrophysics Data System (ADS)
Shafizadeh-Moghadam, Hossein; Helbich, Marco
2015-03-01
The rapid growth of megacities requires special attention among urban planners worldwide, and particularly in Mumbai, India, where growth is very pronounced. To cope with the planning challenges this will bring, developing a retrospective understanding of urban land-use dynamics and the underlying driving-forces behind urban growth is a key prerequisite. This research uses regression-based land-use change models - and in particular non-spatial logistic regression models (LR) and auto-logistic regression models (ALR) - for the Mumbai region over the period 1973-2010, in order to determine the drivers behind spatiotemporal urban expansion. Both global models are complemented by a local, spatial model, the so-called geographically weighted logistic regression (GWLR) model, one that explicitly permits variations in driving-forces across space. The study comes to two main conclusions. First, both global models suggest similar driving-forces behind urban growth over time, revealing that LRs and ALRs result in estimated coefficients with comparable magnitudes. Second, all the local coefficients show distinctive temporal and spatial variations. It is therefore concluded that GWLR aids our understanding of urban growth processes, and so can assist context-related planning and policymaking activities when seeking to secure a sustainable urban future.
Li, Min; Zhang, Lu; Yao, Xiaolong; Jiang, Xingyu
2017-01-01
The emerging membrane introduction mass spectrometry technique has been successfully used to detect benzene, toluene, ethyl benzene and xylene (BTEX), while overlapped spectra have unfortunately hindered its further application to the analysis of mixtures. Multivariate calibration, an efficient method to analyze mixtures, has been widely applied. In this paper, we compared univariate and multivariate analyses for quantification of the individual components of mixture samples. The results showed that the univariate analysis creates poor models with regression coefficients of 0.912, 0.867, 0.440 and 0.351 for BTEX, respectively. For multivariate analysis, a comparison to the partial-least squares (PLS) model shows that the orthogonal partial-least squares (OPLS) regression exhibits an optimal performance with regression coefficients of 0.995, 0.999, 0.980 and 0.976, favorable calibration parameters (RMSEC and RMSECV) and a favorable validation parameter (RMSEP). Furthermore, the OPLS exhibits a good recovery of 73.86 - 122.20% and relative standard deviation (RSD) of the repeatability of 1.14 - 4.87%. Thus, MIMS coupled with the OPLS regression provides an optimal approach for a quantitative BTEX mixture analysis in monitoring and predicting water pollution.
Semenova, Vera A.; Steward-Clark, Evelene; Maniatis, Panagiotis; Epperson, Monica; Sabnis, Amit; Schiffer, Jarad
2017-01-01
To improve surge testing capability for a response to a release of Bacillus anthracis, the CDC anti-Protective Antigen (PA) IgG Enzyme-Linked Immunosorbent Assay (ELISA) was re-designed into a high throughput screening format. The following assay performance parameters were evaluated: goodness of fit (measured as the mean reference standard r2), accuracy (measured as percent error), precision (measured as coefficient of variance (CV)), lower limit of detection (LLOD), lower limit of quantification (LLOQ), dilutional linearity, diagnostic sensitivity (DSN) and diagnostic specificity (DSP). The paired sets of data for each sample were evaluated by Concordance Correlation Coefficient (CCC) analysis. The goodness of fit was 0.999; percent error between the expected and observed concentration for each sample ranged from −4.6% to 14.4%. The coefficient of variance ranged from 9.0% to 21.2%. The assay LLOQ was 2.6 μg/mL. The regression analysis results for dilutional linearity data were r2 = 0.952, slope = 1.02 and intercept = −0.03. CCC between assays was 0.974 for the median concentration of serum samples. The accuracy and precision components of CCC were 0.997 and 0.977, respectively. This high throughput screening assay is precise, accurate, sensitive and specific. Anti-PA IgG concentrations determined using two different assays proved high levels of agreement. The method will improve surge testing capability 18-fold from 4 to 72 sera per assay plate. PMID:27814939
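The reported agreement statistics can be read through Lin's concordance correlation coefficient, which factors into a precision term (Pearson's r) and an accuracy term (the bias correction factor C_b). The sketch below assumes that definition, which the abstract does not state explicitly; the function name and the sample data are hypothetical. Under that assumption the reported components are mutually consistent, since 0.997 x 0.977 is approximately 0.974.

```python
import numpy as np

def concordance_cc(x, y):
    """Return (CCC, Pearson r, C_b) for paired measurements, with CCC = r * C_b."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()              # population (1/n) variances
    sxy = ((x - mx) * (y - my)).mean()     # population covariance
    ccc = 2.0 * sxy / (vx + vy + (mx - my) ** 2)
    r = sxy / np.sqrt(vx * vy)             # precision component
    cb = ccc / r                           # accuracy (bias correction) component
    return ccc, r, cb

# Hypothetical paired results from two assays: a constant offset of 1 unit
# leaves r at 1 but lowers the concordance.
ccc, r, cb = concordance_cc([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
```

A systematic offset between assays therefore shows up in C_b rather than in r, which is why the two components are reported separately.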
Semenova, Vera A; Steward-Clark, Evelene; Maniatis, Panagiotis; Epperson, Monica; Sabnis, Amit; Schiffer, Jarad
2017-01-01
To improve surge testing capability for a response to a release of Bacillus anthracis, the CDC anti-Protective Antigen (PA) IgG Enzyme-Linked Immunosorbent Assay (ELISA) was re-designed into a high throughput screening format. The following assay performance parameters were evaluated: goodness of fit (measured as the mean reference standard r2), accuracy (measured as percent error), precision (measured as coefficient of variance (CV)), lower limit of detection (LLOD), lower limit of quantification (LLOQ), dilutional linearity, diagnostic sensitivity (DSN) and diagnostic specificity (DSP). The paired sets of data for each sample were evaluated by Concordance Correlation Coefficient (CCC) analysis. The goodness of fit was 0.999; percent error between the expected and observed concentration for each sample ranged from -4.6% to 14.4%. The coefficient of variance ranged from 9.0% to 21.2%. The assay LLOQ was 2.6 μg/mL. The regression analysis results for dilutional linearity data were r2 = 0.952, slope = 1.02 and intercept = -0.03. CCC between assays was 0.974 for the median concentration of serum samples. The accuracy and precision components of CCC were 0.997 and 0.977, respectively. This high throughput screening assay is precise, accurate, sensitive and specific. Anti-PA IgG concentrations determined using two different assays proved high levels of agreement. The method will improve surge testing capability 18-fold from 4 to 72 sera per assay plate. Published by Elsevier Ltd.
Morfeld, Peter; Spallek, Michael
2015-01-01
Vermeulen et al. 2014 published a meta-regression analysis of three relevant epidemiological US studies (Steenland et al. 1998, Garshick et al. 2012, Silverman et al. 2012) that estimated the association between occupational diesel engine exhaust (DEE) exposure and lung cancer mortality. The DEE exposure was measured as cumulative exposure to estimated respirable elemental carbon in μg/m(3)-years. Vermeulen et al. 2014 found a statistically significant dose-response association and described elevated lung cancer risks even at very low exposures. We performed an extended re-analysis using different modelling approaches (fixed and random effects regression analyses, Greenland/Longnecker method) and explored the impact of varying input data (modified coefficients of Garshick et al. 2012, results from Crump et al. 2015 replacing Silverman et al. 2012, modified analysis of Moehner et al. 2013). We reproduced the individual and main meta-analytical results of Vermeulen et al. 2014. However, our analysis demonstrated a heterogeneity of the baseline relative risk levels between the three studies. This heterogeneity was reduced after the coefficients of Garshick et al. 2012 were modified while the dose coefficient dropped by an order of magnitude for this study and was far from being significant (P = 0.6). A (non-significant) threshold estimate for the cumulative DEE exposure was found at 150 μg/m(3)-years when extending the meta-analyses of the three studies by hockey-stick regression modelling (including the modified coefficients for Garshick et al. 2012). The data used by Vermeulen and colleagues led to the highest relative risk estimate across all sensitivity analyses performed. The lowest relative risk estimate was found after exclusion of the explorative study by Steenland et al. 1998 in a meta-regression analysis of Garshick et al. 2012 (modified), Silverman et al. 2012 (modified according to Crump et al. 2015) and Möhner et al. 2013. 
The meta-coefficient was estimated to be about 10-20 % of the main effect estimate in Vermeulen et al. 2014 in this analysis. The findings of Vermeulen et al. 2014 should not be used without reservations in any risk assessments. This is particularly true for the low end of the exposure scale.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brink, Carsten, E-mail: carsten.brink@rsyd.dk; Laboratory of Radiation Physics, Odense University Hospital; Bernchou, Uffe
2014-07-15
Purpose: Large interindividual variations in volume regression of non-small cell lung cancer (NSCLC) are observable on standard cone beam computed tomography (CBCT) during fractionated radiation therapy. Here, a method for automated assessment of tumor volume regression is presented and its potential use in response adapted personalized radiation therapy is evaluated empirically. Methods and Materials: Automated deformable registration with calculation of the Jacobian determinant was applied to serial CBCT scans in a series of 99 patients with NSCLC. Tumor volume at the end of treatment was estimated on the basis of the first one third and two thirds of the scans. The concordance between estimated and actual relative volume at the end of radiation therapy was quantified by Pearson's correlation coefficient. On the basis of the estimated relative volume, the patients were stratified into 2 groups having volume regressions below or above the population median value. Kaplan-Meier plots of locoregional disease-free rate and overall survival in the 2 groups were used to evaluate the predictive value of tumor regression during treatment. Cox proportional hazards model was used to adjust for other clinical characteristics. Results: Automatic measurement of the tumor regression from standard CBCT images was feasible. Pearson's correlation coefficient between manual and automatic measurement was 0.86 in a sample of 9 patients. Most patients experienced tumor volume regression, and this could be quantified early into the treatment course. Interestingly, patients with pronounced volume regression had worse locoregional tumor control and overall survival. This was significant in patients with non-adenocarcinoma histology. Conclusions: Evaluation of routinely acquired CBCT images during radiation therapy provides biological information on the specific tumor. This could potentially form the basis for personalized response adaptive therapy.
Determination of teicoplanin concentrations in serum by high-pressure liquid chromatography.
Joos, B; Lüthy, R
1987-01-01
An isocratic reversed-phase high-pressure liquid chromatographic method for the determination of six components of the teicoplanin complex in biological fluid was developed. By using fluorescence detection after precolumn derivatization with fluorescamine, the assay is specific and highly sensitive, with reproducibility studies yielding coefficients of variation ranging from 1.5 to 8.5% (at 5 to 80 micrograms/ml). Response was linear from 2.5 to 80 micrograms/ml (r = 0.999); the recovery from spiked human serum was 76%. An external quality control was performed to compare this high-pressure liquid chromatographic method (H) with a standard microbiological assay (M); no significant deviation from slope = 1 and intercept = 0 was found by regression analysis (H = 1.03M - 0.45; n = 15). PMID:2957953
Haiduc, Adrian Marius; van Duynhoven, John
2005-02-01
The porous properties of food materials are known to determine important macroscopic parameters such as water-holding capacity and texture. In conventional approaches, understanding is built from a long process of establishing macrostructure-property relations in a rational manner. Only recently, multivariate approaches were introduced for the same purpose. The model systems used here are oil-in-water emulsions, stabilised by protein, and form complex structures, consisting of fat droplets dispersed in a porous protein phase. NMR time-domain decay curves were recorded for emulsions with varied levels of fat, protein and water. Hardness, dry matter content and water drainage were determined by classical means and analysed for correlation with the NMR data with multivariate techniques. Partial least squares can calibrate and predict these properties directly from the continuous NMR exponential decays and yields regression coefficients higher than 82%. However, the calibration coefficients themselves belong to the continuous exponential domain and do little to explain the connection between NMR data and emulsion properties. Transformation of the NMR decays into a discrete domain with non-negative least squares permits the use of multilinear regression (MLR) on the resulting amplitudes as predictors and hardness or water drainage as responses. The MLR coefficients show that hardness is highly correlated with the components that have T2 distributions of about 20 and 200 ms whereas water drainage is correlated with components that have T2 distributions around 400 and 1800 ms. These T2 distributions very likely correlate with water populations present in pores with different sizes and/or wall mobility. The results for the emulsions studied demonstrate that NMR time-domain decays can be employed to predict properties and to provide insight into the underlying microstructural features.
Qing, Si-han; Chang, Yun-feng; Dong, Xiao-ai; Li, Yuan; Chen, Xiao-gang; Shu, Yong-kang; Deng, Zhen-hua
2013-10-01
To establish mathematical models of stature estimation for Sichuan Han females from X-ray measurements of the lumbar vertebrae, and to provide essential data for forensic anthropology research. The samples, 206 Sichuan Han females, were divided by age into three groups, A, B and C. Group A (206 samples) consisted of all ages, group B (116 samples) was aged 20-45 years, and group C comprised the 90 samples over 45 years old. The lumbar vertebrae of all samples were examined through CR technology, recording the anterior border, posterior border and central heights of the five centrums (L1-L5) (x1-x15), the total central height of the lumbar spine (x16), and the real height of every sample. Linear regression analysis was performed on these parameters to establish the mathematical models of stature estimation. Sixty-two trained subjects were tested to verify the accuracy of the mathematical models. The established models were statistically significant by hypothesis test of the linear regression equations (P<0.05). The standard errors of the equations were 2.982-5.004 cm, while correlation coefficients were 0.370-0.779 and multiple correlation coefficients were 0.533-0.834. The return tests of the equations with the highest correlation coefficient and multiple correlation coefficient in each group showed that the highest accuracy, obtained with the multiple regression equation y = 100.33 + 1.489 x3 - 0.548 x6 + 0.772 x9 + 0.058 x12 + 0.645 x15 in group A, was 80.6% (+/- 1SE) and 100% (+/- 2SE). The established mathematical models in this study could be applied to stature estimation for Sichuan Han females.
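The group A equation quoted above can be evaluated directly. The sketch below only transcribes the published coefficients; the function name is hypothetical, the abstract does not state the units of the vertebral measurements, and the input values are placeholders rather than data from the study.

```python
def estimate_stature_group_a(x3, x6, x9, x12, x15):
    """Evaluate the group A multiple regression equation quoted in the abstract.

    The inputs are the vertebral parameters x3, x6, x9, x12 and x15 in whatever
    units the study used (not stated in the abstract).
    """
    return (100.33 + 1.489 * x3 - 0.548 * x6 + 0.772 * x9
            + 0.058 * x12 + 0.645 * x15)

# Placeholder inputs for illustration only (not measurements from the study).
example = estimate_stature_group_a(x3=10.0, x6=10.0, x9=10.0, x12=10.0, x15=10.0)
```

Note that x6 enters with a negative coefficient, so the estimate is not simply increasing in every vertebral height.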
NASA Astrophysics Data System (ADS)
He, Anhua; Singh, Ramesh P.; Sun, Zhaohua; Ye, Qing; Zhao, Gang
2016-07-01
Earth tides, atmospheric pressure, precipitation and earthquakes all affect water well levels, earthquakes especially strongly; thus anomalous co-seismic changes in ground water levels have been observed. In this paper, we have used four different models, simple linear regression (SLR), multiple linear regression (MLR), principal component analysis (PCA) and partial least squares (PLS), to compute the atmospheric pressure and earth tidal effects on water level. Furthermore, we have used the Akaike information criterion (AIC) to study the performance of the various models. Based on the lowest AIC and sum of squares for error values, the best estimate of the effects of atmospheric pressure and earth tide on water level is found using the MLR model. However, the MLR model does not account for multicollinearity between inputs; as a result, the atmospheric pressure and earth tidal response coefficients fail to reflect the mechanisms associated with the groundwater level fluctuations. Among the models that resolve the serious multicollinearity of the inputs, the PLS model shows the minimum AIC value. The atmospheric pressure and earth tidal response coefficients obtained with the PLS model agree closely with the observations. The atmospheric pressure and the earth tidal response coefficients are found to be sensitive to the stress-strain state using the observed data for the period 1 April-8 June 2008 of Chuan 03# well. The transient enhancement of porosity of the rock mass around Chuan 03# well associated with the Wenchuan earthquake (Mw = 7.9 of 12 May 2008), which returned to its pre-seismic level after 13 days, indicates that the co-seismic sharp rise of the water level could be induced by static stress change, rather than development of new fractures.
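Model comparison by AIC, as described above, can be sketched for the SLR and MLR cases. The abstract does not give the exact AIC form used, so this example assumes the common Gaussian least-squares form n ln(SSE/n) + 2k (defined up to an additive constant); the helper names and the synthetic series are hypothetical stand-ins, not the Chuan 03# well data.

```python
import numpy as np

def aic_least_squares(sse, n, k):
    """AIC for a Gaussian least-squares fit, up to an additive constant."""
    return n * np.log(sse / n) + 2 * k

def fit_sse(X, y):
    """Fit y on X (plus intercept) by OLS; return (SSE, number of coefficients)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid), A.shape[1]

# Synthetic stand-ins for the inputs and the water level.
rng = np.random.default_rng(1)
n = 200
pressure = rng.normal(size=n)
tide = rng.normal(size=n)
level = 0.8 * pressure + 0.5 * tide + 0.1 * rng.normal(size=n)

sse_slr, k_slr = fit_sse(pressure[:, None], level)                  # pressure only
sse_mlr, k_mlr = fit_sse(np.column_stack([pressure, tide]), level)  # pressure and tide
aic_slr = aic_least_squares(sse_slr, n, k_slr)
aic_mlr = aic_least_squares(sse_mlr, n, k_mlr)
```

The model with the lower AIC is preferred. In this synthetic case the inputs are independent, so the multicollinearity problem that motivates PLS in the abstract does not arise.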
Feizi, Awat; Aliyari, Roqayeh; Roohafza, Hamidreza
2012-01-01
Objective. The present paper aimed at investigating the association between perceived stress and major life events stressors in Iranian general population. Methods. In a cross-sectional large-scale community-based study, 4583 people aged 19 and older, living in Isfahan, Iran, were investigated. Logistic quantile regression was used for modeling perceived stress, measured by GHQ questionnaire, as the bounded outcome (dependent) variable, and as a function of most important stressful life events, as the predictor variables, controlling for major lifestyle and sociodemographic factors. This model provides empirical evidence of the predictors' effects heterogeneity depending on individual location on the distribution of perceived stress. Results. The results showed that among four stressful life events, family conflicts and social problems were more correlated with level of perceived stress. Higher levels of education were negatively associated with perceived stress and its coefficients monotonically decrease beyond the 30th percentile. Also, higher levels of physical activity were associated with perception of low levels of stress. The pattern of gender's coefficient over the majority of quantiles implied that females are more affected by stressors. Also high perceived stress was associated with low or middle levels of income. Conclusions. The results of current research suggested that in a developing society with high prevalence of stress, interventions targeted toward promoting financial and social equalities, social skills training, and healthy lifestyle may have the potential benefits for large parts of the population, most notably female and lower educated people. PMID:23091560
Robust, Adaptive Functional Regression in Functional Mixed Model Framework.
Zhu, Hongxiao; Brown, Philip J; Morris, Jeffrey S
2011-09-01
Functional data are increasingly encountered in scientific studies, and their high dimensionality and complexity lead to many analytical challenges. Various methods for functional data analysis have been developed, including functional response regression methods that involve regression of a functional response on univariate/multivariate predictors with nonparametrically represented functional coefficients. In existing methods, however, the functional regression can be sensitive to outlying curves and outlying regions of curves, so is not robust. In this paper, we introduce a new Bayesian method, robust functional mixed models (R-FMM), for performing robust functional regression within the general functional mixed model framework, which includes multiple continuous or categorical predictors and random effect functions accommodating potential between-function correlation induced by the experimental design. The underlying model involves a hierarchical scale mixture model for the fixed effects, random effect and residual error functions. These modeling assumptions across curves result in robust nonparametric estimators of the fixed and random effect functions which down-weight outlying curves and regions of curves, and produce statistics that can be used to flag global and local outliers. These assumptions also lead to distributions across wavelet coefficients that have outstanding sparsity and adaptive shrinkage properties, with great flexibility for the data to determine the sparsity and the heaviness of the tails. Together with the down-weighting of outliers, these within-curve properties lead to fixed and random effect function estimates that appear in our simulations to be remarkably adaptive in their ability to remove spurious features yet retain true features of the functions. We have developed general code to implement this fully Bayesian method that is automatic, requiring the user to only provide the functional data and design matrices. 
It is efficient enough to handle large data sets, and yields posterior samples of all model parameters that can be used to perform desired Bayesian estimation and inference. Although we present details for a specific implementation of the R-FMM using specific distributional choices in the hierarchical model, 1D functions, and wavelet transforms, the method can be applied more generally using other heavy-tailed distributions, higher dimensional functions (e.g. images), and using other invertible transformations as alternatives to wavelets.
Robust, Adaptive Functional Regression in Functional Mixed Model Framework
Zhu, Hongxiao; Brown, Philip J.; Morris, Jeffrey S.
2012-01-01
Functional data are increasingly encountered in scientific studies, and their high dimensionality and complexity lead to many analytical challenges. Various methods for functional data analysis have been developed, including functional response regression methods that involve regression of a functional response on univariate/multivariate predictors with nonparametrically represented functional coefficients. In existing methods, however, the functional regression can be sensitive to outlying curves and outlying regions of curves, so is not robust. In this paper, we introduce a new Bayesian method, robust functional mixed models (R-FMM), for performing robust functional regression within the general functional mixed model framework, which includes multiple continuous or categorical predictors and random effect functions accommodating potential between-function correlation induced by the experimental design. The underlying model involves a hierarchical scale mixture model for the fixed effects, random effect and residual error functions. These modeling assumptions across curves result in robust nonparametric estimators of the fixed and random effect functions which down-weight outlying curves and regions of curves, and produce statistics that can be used to flag global and local outliers. These assumptions also lead to distributions across wavelet coefficients that have outstanding sparsity and adaptive shrinkage properties, with great flexibility for the data to determine the sparsity and the heaviness of the tails. Together with the down-weighting of outliers, these within-curve properties lead to fixed and random effect function estimates that appear in our simulations to be remarkably adaptive in their ability to remove spurious features yet retain true features of the functions. We have developed general code to implement this fully Bayesian method that is automatic, requiring the user to only provide the functional data and design matrices. 
It is efficient enough to handle large data sets, and yields posterior samples of all model parameters that can be used to perform desired Bayesian estimation and inference. Although we present details for a specific implementation of the R-FMM using specific distributional choices in the hierarchical model, 1D functions, and wavelet transforms, the method can be applied more generally using other heavy-tailed distributions, higher dimensional functions (e.g. images), and using other invertible transformations as alternatives to wavelets. PMID:22308015
Iorgulescu, E; Voicu, V A; Sârbu, C; Tache, F; Albu, F; Medvedovici, A
2016-08-01
The influence of the experimental variability (instrumental repeatability, instrumental intermediate precision and sample preparation variability) and data pre-processing (normalization, peak alignment, background subtraction) on the discrimination power of multivariate data analysis methods (Principal Component Analysis -PCA- and Cluster Analysis -CA-) as well as a new algorithm based on linear regression was studied. Data used in the study were obtained through positive or negative ion monitoring electrospray mass spectrometry (+/-ESI/MS) and reversed phase liquid chromatography/UV spectrometric detection (RPLC/UV) applied to green tea extracts. Extractions in ethanol and heated water infusion were used as sample preparation procedures. The multivariate methods were directly applied to mass spectra and chromatograms, involving strictly a holistic comparison of shapes, without assignment of any structural identity to compounds. An alternative data interpretation based on linear regression analysis mutually applied to data series is also discussed. Slopes, intercepts and correlation coefficients produced by the linear regression analysis applied on pairs of very large experimental data series successfully retain information resulting from high frequency instrumental acquisition rates, obviously better defining the profiles being compared. Consequently, each type of sample or comparison between samples produces in the Cartesian space an ellipsoidal volume defined by the normal variation intervals of the slope, intercept and correlation coefficient. Distances between volumes graphically illustrate (dis)similarities between compared data. The instrumental intermediate precision had the major effect on the discrimination power of the multivariate data analysis methods. 
Mass spectra produced through ionization from liquid state in atmospheric pressure conditions of bulk complex mixtures resulting from extracted materials of natural origins provided an excellent data basis for multivariate analysis methods, equivalent to data resulting from chromatographic separations. The alternative evaluation of very large data series based on linear regression analysis produced information equivalent to results obtained through application of PCA and CA. Copyright © 2016 Elsevier B.V. All rights reserved.
[Study of blending method for the extracts of herbal plants].
Liu, Yongsuo; Cao, Min; Chen, Yuying; Hu, Yuzhu; Wang, Yiming; Luo, Guoan
2006-03-01
The irregularity in herbal plant composition is influenced by multiple factors. In quality control of traditional Chinese medicine, the most critical challenge is to ensure dosage content uniformity. This content uniformity can be improved by blending different batches of the extracts of herbal plants. Nonlinear least-squares regression was used to calculate the blending coefficient, which means that no large absolute differences are allowed for any ingredient. For traditional Chinese medicines, even relatively small differences can be very important for all the ingredients. The auto-scaling pretreatment was used prior to the calculation of the blending coefficients. The pretreatment buffered the characteristics of individual data for the ingredients in different batches, so an improved auto-scaling pretreatment method was proposed. With the improved auto-scaling pretreatment, the relative differences decreased after blending different batches of extracts of herbal plants according to the reference samples. The content uniformity of specific ingredients could be controlled by the error control coefficient. In the studies of extracts of fructus gardeniae, the relative differences of all the ingredients were less than 3% after blending different batches of the extracts. The results showed that nonlinear least-squares regression can be used to calculate the blending coefficient of the herbal plant extracts.
Influence of droplet spacing on drag coefficient in nonevaporating, monodisperse streams
NASA Astrophysics Data System (ADS)
Mulholland, J. A.; Srivastava, R. K.; Wendt, J. O. L.
1988-10-01
Trajectory measurements on single, monodisperse, nonevaporating droplet streams whose droplet size, velocity, and spacing were varied to yield initial Re numbers in the 90-290 range are presently used to ascertain the influence of droplet spacing on the drag coefficient of individual drops injected into a quiescent environment. A trajectory model containing the local drag coefficient was fitted to the experimental data by a nonlinear regression; over 40 additional trajectories were predicted with acceptable accuracy. This formulation will aid the computation of waste-droplet drag in flames for improved combustion-generated pollutant predictions.
NASA Technical Reports Server (NTRS)
Lee, C. M.; Addy, H. E.; Bond, T. H.; Chun, K. S.; Lu, C. Y.
1987-01-01
The main objective of this report was to derive equations to estimate heat transfer coefficients in both the combustion chamber and coolant passage of a rotary engine. This was accomplished by making detailed temperature and pressure measurements in a direct-injection stratified-charge rotary engine under a range of conditions. For each specific measurement point, the local physical properties of the fluids were calculated. Then an empirical correlation of the coefficients was derived by using a multiple regression program. This correlation expresses the Nusselt number as a function of the Prandtl number and Reynolds number.
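The abstract above does not give the functional form of the correlation. A minimal sketch, assuming the common power-law form Nu = C * Re^a * Pr^b, shows how a multiple regression can recover the constants after a log transform. The function names, the synthetic data, and the constants 0.023, 0.8 and 0.4 are illustrative assumptions, not values from the report.

```python
import math


def fit_least_squares(rows, targets):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    # Augmented matrix [X'X | X'y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * t for r, t in zip(rows, targets))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for k in range(p):
            if k != col:
                factor = aug[k][col] / aug[col][col]
                aug[k] = [x - factor * y for x, y in zip(aug[k], aug[col])]
    return [aug[i][p] / aug[i][i] for i in range(p)]


def fit_nusselt_correlation(reynolds, prandtl, nusselt):
    """Fit ln Nu = ln C + a ln Re + b ln Pr and return (C, a, b)."""
    rows = [[1.0, math.log(re), math.log(pr)] for re, pr in zip(reynolds, prandtl)]
    targets = [math.log(nu) for nu in nusselt]
    ln_c, a, b = fit_least_squares(rows, targets)
    return math.exp(ln_c), a, b


# Synthetic, noise-free data generated from assumed constants (illustration only).
reynolds = [1e4, 2e4, 5e4, 1e5, 2e5, 5e5]
prandtl = [0.7, 1.0, 2.0, 0.9, 3.0, 5.0]
nusselt = [0.023 * re ** 0.8 * pr ** 0.4 for re, pr in zip(reynolds, prandtl)]

c, a, b = fit_nusselt_correlation(reynolds, prandtl, nusselt)
print(f"Nu = {c:.4f} * Re^{a:.3f} * Pr^{b:.3f}")
```

With measured data the fit would not be exact, and the residuals in log space would indicate how well a single power law describes the measurements.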
Motlagh, Mohadeseh Ghanbari; Kafaky, Sasan Babaie; Mataji, Asadollah; Akhavan, Reza
2018-05-21
Hyrcanian forests of North of Iran are of great importance in terms of various economic and environmental aspects. In this study, Spot-6 satellite images and regression models were applied to estimate above-ground biomass in these forests. This research was carried out in six compartments in three climatic (semi-arid to humid) types and two altitude classes. In the first step, ground sampling methods at the compartment level were used to estimate aboveground biomass (Mg/ha). Then, by reviewing the results of other studies, the most appropriate vegetation indices were selected. In this study, three indices of NDVI, RVI, and TVI were calculated. We investigated the relationship between the vegetation indices and aboveground biomass measured at sample-plot level. Based on the results, the relationship between aboveground biomass values and vegetation indices was a linear regression with the highest level of significance for NDVI in all compartments. Since at the compartment level the correlation coefficient between NDVI and aboveground biomass was the highest, NDVI was used for mapping aboveground biomass. According to the results of this study, biomass values were highly different in various climatic and altitudinal classes with the highest biomass value observed in humid climate and high-altitude class.
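As a rough illustration of the workflow described above, the sketch below computes three vegetation indices from near-infrared and red reflectance and regresses biomass on NDVI. The abstract does not define TVI, so the transformed vegetation index sqrt(NDVI + 0.5) is assumed here. The reflectance and biomass values are hypothetical.

```python
import math


def ndvi(nir, red):
    """Normalized difference vegetation index."""
    return (nir - red) / (nir + red)


def rvi(nir, red):
    """Ratio vegetation index."""
    return nir / red


def tvi(nir, red):
    """Transformed vegetation index (assumed definition: sqrt(NDVI + 0.5))."""
    return math.sqrt(ndvi(nir, red) + 0.5)


def linear_fit(x, y):
    """Simple linear regression; returns slope, intercept and Pearson r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical plot-level data: (NIR reflectance, red reflectance, biomass in Mg/ha).
plots = [
    (0.45, 0.08, 310.0),
    (0.40, 0.10, 255.0),
    (0.35, 0.12, 190.0),
    (0.30, 0.15, 140.0),
    (0.28, 0.18, 95.0),
]
ndvi_values = [ndvi(nir, red) for nir, red, _ in plots]
biomass = [b for _, _, b in plots]

slope, intercept, r = linear_fit(ndvi_values, biomass)
print(f"biomass = {slope:.1f} * NDVI + {intercept:.1f}, r = {r:.3f}")
```

The fitted line could then be applied pixel by pixel to an NDVI raster to map biomass, which is the use the abstract describes for NDVI.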
Gierlinger, Notburga; Luss, Saskia; König, Christian; Konnerth, Johannes; Eder, Michaela; Fratzl, Peter
2010-01-01
The functional characteristics of plant cell walls depend on the composition of the cell wall polymers, as well as on their highly ordered architecture at scales from a few nanometres to several microns. Raman spectra of wood acquired with linear polarized laser light include information about polymer composition as well as the alignment of cellulose microfibrils with respect to the fibre axis (microfibril angle). By changing the laser polarization direction in 3 degrees steps, the dependency between cellulose and laser orientation direction was investigated. Orientation-dependent changes of band height ratios and spectra were described by quadratic linear regression and partial least square regressions, respectively. Using the models and regressions with high coefficients of determination (R(2) > 0.99) microfibril orientation was predicted in the S1 and S2 layers distinguished by the Raman imaging approach in cross-sections of spruce normal, opposite, and compression wood. The determined microfibril angle (MFA) in the different S2 layers ranged from 0 degrees to 49.9 degrees and was in coincidence with X-ray diffraction determination. With the prerequisite of geometric sample and laser alignment, exact MFA prediction can complete the picture of the chemical cell wall design gained by the Raman imaging approach at the micron level in all plant tissues.
Zhao, Zeng-hui; Wang, Wei-ming; Gao, Xin; Yan, Ji-xing
2013-01-01
According to the geological characteristics of Xinjiang Ili mine in western area of China, a physical model of interstratified strata composed of soft rock and hard coal seam was established. Selecting the tunnel position, deformation modulus, and strength parameters of each layer as influencing factors, the sensitivity coefficient of roadway deformation to each parameter was first analyzed based on a Mohr-Coulomb strain softening model and nonlinear elastic-plastic finite element analysis. Then the effect laws of influencing factors which showed high sensitivity were further discussed. Finally, a regression model for the relationship between roadway displacements and multifactors was obtained by equivalent linear regression under multiple factors. The results show that the roadway deformation is highly sensitive to the depth of coal seam under the floor which should be considered in the layout of coal roadway; deformation modulus and strength of coal seam and floor have a great influence on the global stability of tunnel; on the contrary, roadway deformation is not sensitive to the mechanical parameters of soft roof; roadway deformation under random combinations of multi-factors can be deduced by the regression model. These conclusions provide theoretical significance to the arrangement and stability maintenance of coal roadway. PMID:24459447
Feature Grouping and Selection Over an Undirected Graph.
Yang, Sen; Yuan, Lei; Lai, Ying-Cheng; Shen, Xiaotong; Wonka, Peter; Ye, Jieping
2012-01-01
High-dimensional regression/classification continues to be an important and challenging problem, especially when features are highly correlated. Feature selection, combined with additional structure information on the features has been considered to be promising in promoting regression/classification performance. Graph-guided fused lasso (GFlasso) has recently been proposed to facilitate feature selection and graph structure exploitation, when features exhibit certain graph structures. However, the formulation in GFlasso relies on pairwise sample correlations to perform feature grouping, which could introduce additional estimation bias. In this paper, we propose three new feature grouping and selection methods to resolve this issue. The first method employs a convex function to penalize the pairwise ℓ∞ norm of connected regression/classification coefficients, achieving simultaneous feature grouping and selection. The second method improves the first one by utilizing a non-convex function to reduce the estimation bias. The third one is the extension of the second method using a truncated ℓ1 regularization to further reduce the estimation bias. The proposed methods combine feature grouping and feature selection to enhance estimation accuracy. We employ the alternating direction method of multipliers (ADMM) and difference of convex functions (DC) programming to solve the proposed formulations. Our experimental results on synthetic data and two real datasets demonstrate the effectiveness of the proposed methods.
Characteristics of youth soccer players aged 13–15 years classified by skill level
Malina, Robert M; Ribeiro, Basil; Aroso, João; Cumming, Sean P
2007-01-01
Objective To evaluate the growth, maturity status and functional capacity of youth soccer players grouped by level of skill. Subjects The sample included 69 male players aged 13.2–15.1 years from clubs that competed in the highest division for their age group. Methods Height and body mass of players were measured and stage of pubic hair (PH) was assessed at clinical examination. Years of experience in football were obtained at interview. Three tests of functional capacity were administered: dash, vertical jump and endurance shuttle run. Performances on six soccer‐specific tests were converted to a composite score which was used to classify players into quintiles of skill. Multiple analysis of covariance, controlling for age, was used to test differences among skill groups in experience, growth status and functional capacity, whereas multiple linear regression analysis was used to estimate the relative contributions of age, years of training in soccer, stage of PH, height, body mass, the height×weight interaction and functional capacities to the composite skill score. Results The skill groups differed significantly in the intermittent endurance run (p<0.05) but not in the other variables. Only the difference between the highest and lowest skill groups in the endurance shuttle run was significant. Most players in the highest (12 of 14) and high (11 of 14) skill groups were in stages PH 4 and PH 5. Pubertal status and height accounted for 21% of the variance in the skill score; adding aerobic resistance to the regression increased the variance in skill accounted for to 29%. In both regressions, the coefficient for height was negative. Conclusion Adolescent soccer players aged 13–15 years classified by skill do not differ in age, experience, body size, speed and power, but differ in aerobic endurance, specifically at the extremes of skill. 
Stage of puberty and aerobic resistance (positive coefficients) and height (negative coefficient) are significant predictors of soccer skill (29% of the total explained variance), highlighting the inter‐relationship of growth, maturity and functional characteristics of youth soccer players. PMID:17224444
Determining the response of sea level to atmospheric pressure forcing using TOPEX/POSEIDON data
NASA Technical Reports Server (NTRS)
Fu, Lee-Lueng; Pihos, Greg
1994-01-01
The static response of sea level to the forcing of atmospheric pressure, the so-called inverted barometer (IB) effect, is investigated using TOPEX/POSEIDON data. This response, characterized by the rise and fall of sea level to compensate for the change of atmospheric pressure at a rate of -1 cm/mbar, is not associated with any ocean currents and hence is normally treated as an error to be removed from sea level observation. Linear regression and spectral transfer function analyses are applied to sea level and pressure to examine the validity of the IB effect. In regions outside the tropics, the regression coefficient is found to be consistently close to the theoretical value except for the regions of western boundary currents, where the mesoscale variability interferes with the IB effect. The spectral transfer function shows near IB response at periods of 30 degrees is -0.84 +/- 0.29 cm/mbar (1 standard deviation). The deviation from -1 cm/mbar is shown to be caused primarily by the effect of wind forcing on sea level, based on a multivariate linear regression model involving both pressure and wind forcing. The regression coefficient for pressure resulting from the multivariate analysis is -0.96 +/- 0.32 cm/mbar. In the tropics the multivariate analysis fails because sea level in the tropics is primarily responding to remote wind forcing. However, after removing from the data the wind-forced sea level estimated by a dynamic model of the tropical Pacific, the pressure regression coefficient improves from -1.22 +/- 0.69 cm/mbar to -0.99 +/- 0.46 cm/mbar, clearly revealing an IB response. The result of the study suggests that with a proper removal of the effect of wind forcing the IB effect is valid in most of the open ocean at periods longer than 20 days and spatial scales larger than 500 km.
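The shift in the pressure coefficient when wind forcing is added to the regression can be illustrated with a small synthetic example. In this sketch, sea level is constructed as an exact IB response of -1 cm/mbar plus a wind term that is correlated with pressure. The wind index, its coefficient of 0.5, and all data values are hypothetical and are not taken from the TOPEX/POSEIDON analysis.

```python
def centered(values):
    """Remove the mean so the regressions need no intercept term."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pressure_only_slope(sea_level, pressure):
    """Simple regression of sea level (cm) on pressure (mbar)."""
    h, p = centered(sea_level), centered(pressure)
    return dot(h, p) / dot(p, p)


def pressure_and_wind_slopes(sea_level, pressure, wind):
    """Two-predictor regression solved with Cramer's rule on the normal equations."""
    h, p, w = centered(sea_level), centered(pressure), centered(wind)
    spp, sww, spw = dot(p, p), dot(w, w), dot(p, w)
    sph, swh = dot(p, h), dot(w, h)
    det = spp * sww - spw ** 2
    b_pressure = (sph * sww - swh * spw) / det
    b_wind = (swh * spp - sph * spw) / det
    return b_pressure, b_wind


# Hypothetical anomalies: pressure in mbar, wind as an arbitrary forcing index.
pressure = [-8.0, -4.0, -2.0, 0.0, 2.0, 4.0, 8.0]
wind = [-3.0, 1.0, -2.0, 0.0, 2.0, -1.0, 3.0]
# Sea level built from an exact IB response plus an assumed wind response.
sea_level = [-1.0 * p + 0.5 * w for p, w in zip(pressure, wind)]

simple = pressure_only_slope(sea_level, pressure)
b_pressure, b_wind = pressure_and_wind_slopes(sea_level, pressure, wind)
print(f"pressure only: {simple:.3f} cm/mbar")
print(f"pressure + wind: {b_pressure:.3f} cm/mbar, wind coefficient {b_wind:.3f}")
```

Because wind and pressure are correlated in this toy data, the pressure-only slope is pulled away from -1 cm/mbar, while the two-predictor fit recovers it. This is the mechanism the abstract invokes to explain the deviation outside the tropics.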
Analysis of a Split-Plot Experimental Design Applied to a Low-Speed Wind Tunnel Investigation
NASA Technical Reports Server (NTRS)
Erickson, Gary E.
2013-01-01
A procedure to analyze a split-plot experimental design featuring two input factors, two levels of randomization, and two error structures in a low-speed wind tunnel investigation of a small-scale model of a fighter airplane configuration is described in this report. Standard commercially-available statistical software was used to analyze the test results obtained in a randomization-restricted environment often encountered in wind tunnel testing. The input factors were differential horizontal stabilizer incidence and the angle of attack. The response variables were the aerodynamic coefficients of lift, drag, and pitching moment. Using split-plot terminology, the whole plot, or difficult-to-change, factor was the differential horizontal stabilizer incidence, and the subplot, or easy-to-change, factor was the angle of attack. The whole plot and subplot factors were both tested at three levels. Degrees of freedom for the whole plot error were provided by replication in the form of three blocks, or replicates, which were intended to simulate three consecutive days of wind tunnel facility operation. The analysis was conducted in three stages, which yielded the estimated mean squares, multiple regression function coefficients, and corresponding tests of significance for all individual terms at the whole plot and subplot levels for the three aerodynamic response variables. The estimated regression functions included main effects and two-factor interaction for the lift coefficient, main effects, two-factor interaction, and quadratic effects for the drag coefficient, and only main effects for the pitching moment coefficient.
Hossain, Md Golam; Saw, Aik; Alam, Rashidul; Ohtsuki, Fumio; Kamarul, Tunku
2013-09-01
Cephalic index (CI), the ratio of head breadth to head length, is widely used to categorise human populations. The aim of this study was to assess the impact of anthropometric measurements on the CI of male Japanese university students. This study included 1,215 male university students from Tokyo and Kyoto, selected using convenience sampling. Multiple regression analysis was used to determine the effect of anthropometric measurements on CI. The variance inflation factor (VIF) showed no evidence of a multicollinearity problem among independent variables. The coefficients of the regression line demonstrated a significant positive relationship between CI and minimum frontal breadth (p < 0.01), bizygomatic breadth (p < 0.01) and head height (p < 0.05), and a negative relationship between CI and morphological facial height (p < 0.01) and head circumference (p < 0.01). Moreover, the coefficient and odds ratio of logistic regression analysis showed a greater likelihood for minimum frontal breadth (p < 0.01) and bizygomatic breadth (p < 0.01) to predict round-headedness, and morphological facial height (p < 0.05) and head circumference (p < 0.01) to predict long-headedness. Stepwise regression analysis revealed bizygomatic breadth, head circumference, minimum frontal breadth, head height and morphological facial height to be the best predictor craniofacial measurements with respect to CI. The results suggest that most of the variables considered in this study appear to influence the CI of adult male Japanese students.
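The multicollinearity check mentioned above can be sketched as follows. The VIF of predictor j equals 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the others, and this is the j-th diagonal element of the inverse of the predictors' correlation matrix. The helper names and the small data sets are hypothetical and are not the study's measurements.

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long columns."""
    n = len(columns[0])
    centered = [[v - sum(col) / n for v in col] for col in columns]
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
         for cj, nj in zip(centered, norms)]
        for ci, ni in zip(centered, norms)
    ]


def invert(matrix):
    """Matrix inverse by Gauss-Jordan elimination with partial pivoting."""
    p = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(p)]
           for i, row in enumerate(matrix)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        diag = aug[col][col]
        aug[col] = [v / diag for v in aug[col]]
        for k in range(p):
            if k != col:
                factor = aug[k][col]
                aug[k] = [x - factor * y for x, y in zip(aug[k], aug[col])]
    return [row[p:] for row in aug]


def variance_inflation_factors(columns):
    """VIF of each predictor: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Two uncorrelated predictors: VIF should be 1 for both.
independent = variance_inflation_factors([[-1.0, -1.0, 1.0, 1.0],
                                          [-1.0, 1.0, -1.0, 1.0]])
# Two nearly collinear predictors: VIF should be large.
collinear = variance_inflation_factors([[1.0, 2.0, 3.0, 4.0, 5.0],
                                        [1.1, 1.9, 3.2, 3.9, 5.1]])
print(independent, collinear)
```

A common rule of thumb treats VIF values above roughly 5 to 10 as a sign of problematic multicollinearity. The abstract does not state which threshold the authors applied.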
Kaneko, Hiromasa; Funatsu, Kimito
2013-09-23
We propose predictive performance criteria for nonlinear regression models without cross-validation. The proposed criteria are the determination coefficient and the root-mean-square error for the midpoints between k-nearest-neighbor data points. These criteria can be used to evaluate predictive ability after the regression models are updated, whereas cross-validation cannot be performed in such a situation. The proposed method is effective and helpful in handling big data when cross-validation cannot be applied. By analyzing data from numerical simulations and quantitative structural relationships, we confirm that the proposed criteria enable the predictive ability of the nonlinear regression models to be appropriately quantified.
NASA Astrophysics Data System (ADS)
Sharudin, R. W.; AbdulBari Ali, S.; Zulkarnain, M.; Shukri, M. A.
2018-05-01
This study reports on the integration of Artificial Neural Networks (ANNs) with experimental data to predict the solubility of the carbon dioxide (CO2) blowing agent in SEBS by generating the highest possible value of the regression coefficient (R2). Foaming of thermoplastic elastomers with CO2 is strongly affected by the CO2 solubility. The ability of the ANN to predict interpolated CO2 solubility data was investigated by comparing training results obtained with different network training methods. The prediction trend (generated output) of the ANN for CO2 solubility was corroborated by the experimental results. Among the training methods, Gradient Descent with Momentum & Adaptive LR (traingdx) required a longer training time and more accurate input to produce better output, with a final regression value of 0.88. In contrast, the Levenberg-Marquardt (trainlm) technique produced better output in a shorter time, with a final regression value of 0.91.
Wang, Shuang; Zhang, Yuchen; Dai, Wenrui; Lauter, Kristin; Kim, Miran; Tang, Yuzhe; Xiong, Hongkai; Jiang, Xiaoqian
2016-01-01
Motivation: Genome-wide association studies (GWAS) have been widely used in discovering the association between genotypes and phenotypes. Human genome data contain valuable but highly sensitive information. Unprotected disclosure of such information might put individual’s privacy at risk. It is important to protect human genome data. Exact logistic regression is a bias-reduction method based on a penalized likelihood to discover rare variants that are associated with disease susceptibility. We propose the HEALER framework to facilitate secure rare variants analysis with a small sample size. Results: We target at the algorithm design aiming at reducing the computational and storage costs to learn a homomorphic exact logistic regression model (i.e. evaluate P-values of coefficients), where the circuit depth is proportional to the logarithmic scale of data size. We evaluate the algorithm performance using rare Kawasaki Disease datasets. Availability and implementation: Download HEALER at http://research.ucsd-dbmi.org/HEALER/ Contact: shw070@ucsd.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26446135
USDA-ARS's Scientific Manuscript database
Illegal use of nitrogen-rich melamine (C3H6N6) to boost perceived protein content of food products such as milk, infant formula, frozen yogurt, pet food, biscuits, and coffee drinks has caused serious food safety problems. Conventional methods to detect melamine in foods, such as Enzyme-linked immun...
ERIC Educational Resources Information Center
Pecorella, Patricia A.; Bowers, David G.
Multiple regression in a double cross-validated design was used to predict two performance measures (total variable expense and absence rate) by multi-month period in five industrial firms. The regressions do cross-validate, and produce multiple coefficients which display both concurrent and predictive effects, peaking 18 months to two years…
The intermediate endpoint effect in logistic and probit regression
MacKinnon, DP; Lockwood, CM; Brown, CH; Wang, W; Hoffman, JM
2010-01-01
Background An intermediate endpoint is hypothesized to be in the middle of the causal sequence relating an independent variable to a dependent variable. The intermediate variable is also called a surrogate or mediating variable and the corresponding effect is called the mediated, surrogate endpoint, or intermediate endpoint effect. Clinical studies are often designed to change an intermediate or surrogate endpoint and through this intermediate change influence the ultimate endpoint. In many intermediate endpoint clinical studies the dependent variable is binary, and logistic or probit regression is used. Purpose The purpose of this study is to describe a limitation of a widely used approach to assessing intermediate endpoint effects and to propose an alternative method, based on products of coefficients, that yields more accurate results. Methods The intermediate endpoint model for a binary outcome is described for a true binary outcome and for a dichotomization of a latent continuous outcome. Plots of true values and a simulation study are used to evaluate the different methods. Results Distorted estimates of the intermediate endpoint effect and incorrect conclusions can result from the application of widely used methods to assess the intermediate endpoint effect. The same problem occurs for the proportion of an effect explained by an intermediate endpoint, which has been suggested as a useful measure for identifying intermediate endpoints. A solution to this problem is given based on the relationship between latent variable modeling and logistic or probit regression. Limitations More complicated intermediate variable models are not addressed in the study, although the methods described in the article can be extended to these more complicated models. Conclusions Researchers are encouraged to use an intermediate endpoint method based on the product of regression coefficients. 
A common method based on difference in coefficient methods can lead to distorted conclusions regarding the intermediate effect. PMID:17942466
NASA Astrophysics Data System (ADS)
Mekanik, F.; Imteaz, M. A.; Gato-Trinidad, S.; Elmahdi, A.
2013-10-01
In this study, the application of Artificial Neural Networks (ANN) and Multiple regression analysis (MR) to forecast long-term seasonal spring rainfall in Victoria, Australia was investigated using lagged El Nino Southern Oscillation (ENSO) and Indian Ocean Dipole (IOD) as potential predictors. The use of dual (combined lagged ENSO-IOD) input sets for calibrating and validating ANN and MR Models is proposed to investigate the simultaneous effect of past values of these two major climate modes on long-term spring rainfall prediction. The MR models that did not violate the limits of statistical significance and multicollinearity were selected for future spring rainfall forecast. The ANN was developed in the form of a multilayer perceptron using the Levenberg-Marquardt algorithm. Both MR and ANN modelling were assessed statistically using mean square error (MSE), mean absolute error (MAE), Pearson correlation (r) and Willmott index of agreement (d). The developed MR and ANN models were tested on out-of-sample test sets; the MR models showed very poor generalisation ability for east Victoria with correlation coefficients of -0.99 to -0.90 compared to ANN with correlation coefficients of 0.42-0.93; ANN models also showed better generalisation ability for central and west Victoria with correlation coefficients of 0.68-0.85 and 0.58-0.97 respectively. The ability of multiple regression models to forecast out-of-sample sets is comparable to that of ANN for Daylesford in central Victoria and Kaniva in west Victoria (r = 0.92 and 0.67 respectively). The errors of the testing sets for ANN models are generally lower compared to multiple regression models. The statistical analysis suggests the potential of ANN over MR models for rainfall forecasting using large scale climate modes.
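The four skill measures named above can be computed as in the sketch below. It uses the original form of the Willmott index of agreement, d = 1 - sum((P - O)^2) / sum((|P - mean(O)| + |O - mean(O)|)^2). The abstract does not say which variant of the index the authors used, so that choice is an assumption, and the rainfall values are made up.

```python
import math


def forecast_skill(observed, predicted):
    """Return MSE, MAE, Pearson r and Willmott's index of agreement d."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n

    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_o * var_p)

    # Potential error: distances of predictions and observations from the observed mean.
    potential = sum((abs(p - mean_o) + abs(o - mean_o)) ** 2
                    for o, p in zip(observed, predicted))
    d = 1.0 - sum(e * e for e in errors) / potential
    return {"mse": mse, "mae": mae, "r": r, "d": d}


# Made-up seasonal rainfall totals; the forecast has a constant bias of +1.
observed = [1.0, 2.0, 3.0, 4.0]
biased = [2.0, 3.0, 4.0, 5.0]
skill = forecast_skill(observed, biased)
print(skill)
```

In this example the forecast is perfectly correlated with the observations (r = 1) but has a constant bias, which the index of agreement penalises (d = 0.84). This is why r alone can overstate forecast quality.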
Chen, Gang; Wu, Yulian; Wang, Tao; Liang, Jixing; Lin, Wei; Li, Liantao; Wen, Junping; Lin, Lixiang; Huang, Huibin
2012-10-01
The role of the endogenous secretory receptor for advanced glycation end products (esRAGE) in depression of diabetes patients and its clinical significance are unclear. This study investigated the role of serum esRAGE in patients with type 2 diabetes mellitus with depression in the Chinese population. One hundred nineteen hospitalized patients with type 2 diabetes were recruited at Fujian Provincial Hospital (Fuzhou, China) from February 2010 to January 2011. All selected subjects were assessed with the Hamilton Rating Scale for Depression (HAMD). Among them, 71 patients with both type 2 diabetes and depression were included. All selected subjects were examined for the following: esRAGE concentration, glycosylated hemoglobin (HbA1c), blood lipids, C-reactive protein, trace of albumin in urine, and carotid artery intima-media thickness (IMT). Association between serum esRAGE levels and risk of type 2 diabetes mellitus with depression was also analyzed. There were statistically significant differences in gender, age, body mass index, waist circumference, and treatment methods between the group with depression and the group without depression (P<0.05). Multiple linear regression analysis showed that HAMD scores were negatively correlated with esRAGE levels (standard regression coefficient -0.270, P<0.01). HAMD-17 scores were positively correlated with IMT (standard regression coefficient 0.183, P<0.05) and with HbA1c (standard regression coefficient 0.314, P<0.01). Female gender, younger age, obesity, poor glycemic control, complications, and insulin therapy are all risk factors of type 2 diabetes mellitus with combined depression in the Chinese population. Inflammation and atherosclerosis play an important role in the pathogenesis of depression. esRAGE is a protective factor of depression among patients who have type 2 diabetes.
Daily Magnesium Intake and Serum Magnesium Concentration among Japanese People
Akizawa, Yoriko; Koizumi, Sadayuki; Itokawa, Yoshinori; Ojima, Toshiyuki; Nakamura, Yosikazu; Tamura, Tarou; Kusaka, Yukinori
2008-01-01
Background The vitamins and minerals that are deficient in the daily diet of a normal adult remain unknown. To answer this question, we conducted a population survey focusing on the relationship between dietary magnesium intake and serum magnesium level. Methods The subjects were 62 individuals from Fukui Prefecture who participated in the 1998 National Nutrition Survey. The survey investigated the physical status, nutritional status, and dietary data of the subjects. Holidays and special occasions were avoided, and a day when people are most likely to be on an ordinary diet was selected as the survey date. Results The mean (±standard deviation) daily magnesium intake was 322 (±132), 323 (±163), and 322 (±147) mg/day for men, women, and the entire group, respectively. The mean (±standard deviation) serum magnesium concentration was 20.69 (±2.83), 20.69 (±2.88), and 20.69 (±2.83) ppm for men, women, and the entire group, respectively. The distribution of serum magnesium concentration was normal. Dietary magnesium intake showed a log-normal distribution, which was then transformed by logarithmic conversion for examining the regression coefficients. The slope of the regression line between the serum magnesium concentration (Y ppm) and daily magnesium intake (X mg) was determined using the formula Y = 4.93 (log10X) + 8.49. The coefficient of correlation (r) was 0.29. A regression line (Y = 14.65X + 19.31) was observed between the daily intake of magnesium (Y mg) and serum magnesium concentration (X ppm). The coefficient of correlation was 0.28. Conclusion The daily magnesium intake correlated with serum magnesium concentration, and a linear regression model between them was proposed. PMID:18635902
McCurdy, M; Bellows, A; Deng, D; Leppert, M; Mahone, E; Pritchard, A
2015-01-01
Reliable and valid screening and assessment tools are necessary to identify children at risk for neurodevelopmental disabilities who may require additional services. This study evaluated the test-retest reliability of the Capute Scales in a high-risk sample, hypothesizing adequate reliability across 6- and 12-month intervals. Capute Scales scores (N = 66) were collected via retrospective chart review from a NICU follow-up clinic within a large urban medical center spanning three age-ranges: 12-18, 19-24, and 25-36 months. On average, participants were classified as very low birth weight and premature. Reliability of the Capute Scales was evaluated with intraclass correlation coefficients across length of test-retest interval, age at testing, and degree of neonatal complications. The Capute Scales demonstrated high reliability, regardless of length of test-retest interval (ranging from 6 to 14 months) or age of participant, for all index scores, including overall Developmental Quotient (DQ), language-based skill index (CLAMS) and nonverbal reasoning index (CAT). Linear regressions revealed that greater neonatal risk was related to poorer test-retest reliability; however, reliability coefficients remained strong. The Capute Scales afford clinicians a reliable and valid means of screening and assessing for neurodevelopmental delay within high-risk infant populations.
1990-05-01
THUMBBR THUMB BREADTH NO. VARIABLE CONSTANT REGRESS. COEF. ST.ERROR ADJUSTED (.ERR OF ESTIMATE E4 58 HANDBRTH 7.623 0.183 (0.006) 1.124 .319 59 HANDCIRC...945, 956, 965 TRAGION TO TOP OF HEAD (TRAGT, 255) 39,51, 718, 851, 899, 945, 956, 965 Trapezius Post 23 Trochanter 23 TROCHANTERION HEIGHT (TROQINT...THGHCLR, 105) 33, 40, 673, M08 881,927, 955, 964 Thigh Poia 23 THUMB BREADTH (THUMBBR, 106) 34, 49, 674,88 882,94955, 964 Thumbtip 23 THUMBTIP REACH
NASA Astrophysics Data System (ADS)
Kozioł, Michał
2017-10-01
The article presents a parametric model describing the registered spectral distributions of optical radiation emitted by electrical discharges generated in three systems: needle-needle, needle-plate, and a system for surface discharges. Generation of electrical discharges and registration of the emitted radiation were carried out in three different electrical insulating oils: factory-new, operated (used), and operated with air bubbles. For registration of optical spectra in the ultraviolet, visible and near-infrared range, a high-resolution spectrophotometer was used. The proposed mathematical model was developed in a regression procedure using a Gauss-sigmoid type function. The dependent variable was the intensity of the recorded optical signals. An evolutionary algorithm was used to estimate the optimal parameters of the model. The optimization procedure was performed in the Matlab environment. The coefficient of determination R2 was applied to assess how well the regression function with the estimated parameters fits the empirical data.
New 1,6-heptadienes with pyrimidine bases attached: Syntheses and spectroscopic analyses
NASA Astrophysics Data System (ADS)
Hammud, Hassan H.; Ghannoum, Amer M.; Fares, Fares A.; Abramian, Lara K.; Bouhadir, Kamal H.
2008-06-01
A simple, high-yielding synthesis leading to the functionalization of some pyrimidine bases with a 1,6-heptadienyl moiety spaced from the N-1 position by a methylene group is described. A key step in this synthesis involves a Mitsunobu reaction by coupling 3N-benzoyluracil and 3N-benzoylthymine to 2-allyl-pent-4-en-1-ol followed by alkaline hydrolysis of the 3N-benzoyl protecting groups. This protocol should eventually lend itself to the synthesis of a host of N-alkylated nucleoside analogs. The absorption and emission properties of these pyrimidine derivatives (3-6) were studied in solvents of different physical properties. Computerized analysis and multiple regression techniques were applied to calculate the regression and correlation coefficients based on the equation that relates peak position λmax to the solvent parameters that depend on the H-bonding ability, refractive index, and dielectric constant of solvents.
Predicting Active Users' Personality Based on Micro-Blogging Behaviors
Hao, Bibo; Guan, Zengda; Zhu, Tingshao
2014-01-01
Because of its richness and availability, micro-blogging has become an ideal platform for conducting psychological research. In this paper, we proposed to predict active users' personality traits through micro-blogging behaviors. 547 Chinese active users of micro-blogging participated in this study. Their personality traits were measured by the Big Five Inventory, and digital records of micro-blogging behaviors were collected via web crawlers. After extracting 845 micro-blogging behavioral features, we first trained classification models utilizing Support Vector Machine (SVM), differentiating participants with high and low scores on each dimension of the Big Five Inventory. The classification accuracy ranged from 84% to 92%. We also built regression models utilizing PaceRegression methods, predicting participants' scores on each dimension of the Big Five Inventory. The Pearson correlation coefficients between predicted scores and actual scores ranged from 0.48 to 0.54. Results indicated that active users' personality traits could be predicted by micro-blogging behaviors. PMID:24465462
Design of experiments enhanced statistical process control for wind tunnel check standard testing
NASA Astrophysics Data System (ADS)
Phillips, Ben D.
The current wind tunnel check standard testing program at NASA Langley Research Center is focused on increasing data quality, uncertainty quantification and overall control and improvement of wind tunnel measurement processes. The statistical process control (SPC) methodology employed in the check standard testing program allows for the tracking of variations in measurements over time as well as an overall assessment of facility health. While the SPC approach can and does provide researchers with valuable information, it has certain limitations in the areas of process improvement and uncertainty quantification. It is thought by utilizing design of experiments methodology in conjunction with the current SPC practices that one can efficiently and more robustly characterize uncertainties and develop enhanced process improvement procedures. In this research, methodologies were developed to generate regression models for wind tunnel calibration coefficients, balance force coefficients and wind tunnel flow angularities. The coefficients of these regression models were then tracked in statistical process control charts, giving a higher level of understanding of the processes. The methodology outlined is sufficiently generic such that this research can be applicable to any wind tunnel check standard testing program.
Golmohammadi, Hassan
2009-11-30
A quantitative structure-property relationship (QSPR) study was performed to develop models that relate the structure of 141 organic compounds to their octanol-water partition coefficients (log P(o/w)). A genetic algorithm was applied as a variable selection tool. Modeling of log P(o/w) of these compounds as a function of theoretically derived descriptors was established by multiple linear regression (MLR), partial least squares (PLS), and artificial neural network (ANN). The best selected descriptors that appear in the models are: atomic charge weighted partial positively charged surface area (PPSA-3), fractional atomic charge weighted partial positive surface area (FPSA-3), minimum atomic partial charge (Qmin), molecular volume (MV), total dipole moment of molecule (mu), maximum antibonding contribution of a molecule orbital in the molecule (MAC), and maximum free valency of a C atom in the molecule (MFV). The results showed the ability of the developed artificial neural network to predict the partition coefficients of organic compounds. The results also revealed the superiority of ANN over the MLR and PLS models. Copyright 2009 Wiley Periodicals, Inc.
Tashiro, Atsushi; Aida, Jun; Shobugawa, Yugo; Fujiyama, Yuki; Yamamoto, Tatsuo; Saito, Reiko; Kondo, Katsunori
2017-01-01
Objectives: Personal income affects dental status in older people. However, the impact of income inequality on dental status at the community level (junior high school district) is unclear. The purpose of this study was to examine the association between dental status and community-level income inequality after adjusting for individual socio-economic status in Japanese older adults, and to verify the relative income hypothesis, also known as the Wilkinson hypothesis. Methods: We used data from the Japan Gerontological Evaluation Study (JAGES) conducted in Niigata city. JAGES is a postal survey of functionally independent adults aged 65 years or older. We enrolled 4,983 respondents (response rate 62.3%) and used data on 3,980 of them after excluding incomplete data. We evaluated health condition and socio-economic status using questionnaires. The Gini coefficient, as an indicator of income inequality, was calculated by junior high school district (57 districts) based on the data from the questionnaire. Additionally, Pearson's correlation coefficient was calculated to evaluate the association between the mean number of remaining teeth and the community-level Gini coefficient. Then we evaluated the mean number of remaining teeth among the groups stratified by the Gini coefficient conditions. Next, we conducted a multilevel analysis using an ordinal logistic regression model. The number of remaining teeth was set as the dependent variable, while sex, age, household size, education, smoking status, diabetes treatment, current living conditions, and equivalent income were used as independent variables at the individual level. The Gini coefficient and average equivalent income in the junior high school district were used as independent variables at the community level. Results: Pearson's correlation coefficient for the relationship between the Gini coefficient and the mean number of remaining teeth in the junior high school district was -0.44 (P<0.01).
Areas with wider income disparity (Gini coefficient ≥ 0.35) had a significantly smaller number of remaining teeth (P<0.001). The multilevel analysis showed that a higher Gini coefficient and a lower average equivalent income at the community level were significantly associated with a lower number of remaining teeth, and with educational attainment, smoking status, current living conditions, and equivalent income at the individual level, after adjusting for sex and age. On the other hand, educational attainment at the individual level and average equivalent income at the community level were not significant factors after adjusting for all individual-level variables. Conclusion: This study showed that, in addition to individual socio-economic status, income inequality at the community level was significantly associated with the number of remaining teeth in Japanese older adults. Although the precise mechanism of this association is still unclear, our result supports the relative income hypothesis.
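The abstract above does not say how the district-level Gini coefficient was computed. As a minimal sketch, assuming a plain list of non-negative incomes per district and the common rank-weighted formula (the function name `gini` is illustrative, not taken from the study), it could look like this:

```python
def gini(incomes):
    """Gini coefficient of non-negative incomes (0 = perfect equality)."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one positive income")
    # Rank-weighted form: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    # where i = 1..n are the ranks of the sorted incomes.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


if __name__ == "__main__":
    # Hypothetical incomes for one district, for illustration only.
    print(gini([1, 1, 1, 1]))   # identical incomes
    print(gini([0, 0, 0, 1]))   # one person holds all income
```

With identical incomes the function returns 0; when one of n people holds everything it returns (n - 1)/n, which approaches 1 as n grows. A survey-based study may use sampling weights or equivalised income, which this sketch does not model.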
Meili, Marc; Kutz, Alexander; Briel, Matthias; Christ-Crain, Mirjam; Bucher, Heiner C; Mueller, Beat; Schuetz, Philipp
2016-03-24
There is a lack of studies comparing the utility of C-reactive protein (CRP) with procalcitonin (PCT) for the management of patients with acute respiratory tract infections (ARI) in primary care. Our aim was to study the correlation between these markers and to compare their predictive accuracy with regard to clinical outcome prediction. This is a secondary analysis using clinical and biomarker data of 458 primary care patients with pneumonic and non-pneumonic ARI. We used correlation statistics (Spearman's rank test) and multivariable regression models to assess the association of markers with adverse outcome, namely days with restricted activities and persistence of discomfort from infection at day 14. At baseline, CRP and PCT did not correlate well in the overall population (r² = 0.16) and particularly in the subgroup of patients with non-pneumonic ARI (r² = 0.08). Low correlation of biomarkers was also found when comparing cut-off ranges, day seven levels or changes from baseline to day seven. High baseline levels of CRP (>100 mg/dL, regression coefficient 1.6, 95 % CI 0.5 to 2.6, sociodemographic-adjusted model) as well as PCT (>0.5 µg/L, regression coefficient 2.0, 95 % CI 0.0 to 4.0, sociodemographic-adjusted model) were significantly associated with a larger number of days with restricted activities. There were no associations of either biomarker with persistence of discomfort at day 14. CRP and PCT levels do not correlate well, but both have moderate prognostic accuracy in primary care patients with ARI to predict clinical outcomes. The low correlation between the two biomarkers calls for interventional research comparing these markers head to head with regard to their ability to guide antibiotic decisions. Current Controlled Trials, ISRCTN73182671.
Hewitt, Angela L.; Popa, Laurentiu S.; Pasalar, Siavash; Hendrix, Claudia M.
2011-01-01
Encoding of movement kinematics in Purkinje cell simple spike discharge has important implications for hypotheses of cerebellar cortical function. Several outstanding questions remain regarding representation of these kinematic signals. It is uncertain whether kinematic encoding occurs in unpredictable, feedback-dependent tasks or kinematic signals are conserved across tasks. Additionally, there is a need to understand the signals encoded in the instantaneous discharge of single cells without averaging across trials or time. To address these questions, this study recorded Purkinje cell firing in monkeys trained to perform a manual random tracking task in addition to circular tracking and center-out reach. Random tracking provides for extensive coverage of kinematic workspaces. Direction and speed errors are significantly greater during random than circular tracking. Cross-correlation analyses comparing hand and target velocity profiles show that hand velocity lags target velocity during random tracking. Correlations between simple spike firing from 120 Purkinje cells and hand position, velocity, and speed were evaluated with linear regression models including a time constant, τ, as a measure of the firing lead/lag relative to the kinematic parameters. Across the population, velocity accounts for the majority of simple spike firing variability (63 ± 30% of adjusted R²), followed by position (28 ± 24% of adjusted R²) and speed (11 ± 19% of adjusted R²). Simple spike firing often leads hand kinematics. Comparison of regression models based on averaged vs. nonaveraged firing and kinematics reveals lower adjusted R² values for nonaveraged data; however, regression coefficients and τ values are highly similar. Finally, for most cells, model coefficients generated from random tracking accurately estimate simple spike firing in either circular tracking or center-out reach.
These findings imply that the cerebellum controls movement kinematics, consistent with a forward internal model that predicts upcoming limb kinematics. PMID:21795616
Zhou, Hua-ying; Luo, Yue; Chen, Wen-dong; Gong, Guo-zhong
2015-06-01
A number of studies have confirmed that antiviral therapy with nucleotide analogs (NAs) can improve the prognosis of hepatitis B virus (HBV)-related hepatocellular carcinoma (HCC) after curative therapy. However, what factors affect the prognosis of HBV-HCC after removal of the primary tumor and inhibition of HBV replication? A meta-regression analysis was conducted to explore the prognostic factors for this subgroup of patients. MEDLINE, EMBASE, Web of Science, and Cochrane library were searched from January 1995 to February 2014 for clinical trials evaluating the effect of NAs on the prognosis of HBV-HCC after curative therapy. Data were extracted for host, viral, and intervention information. Single-arm meta-analysis was performed to assess overall survival (OS) rates and HCC recurrence. Meta-regression analysis was carried out to explore risk factors for 1-year OS rate and HCC recurrence for HBV-HCC patients after curative therapy and antiviral therapy. Fourteen observational studies with 1284 patients met the inclusion criteria. Influential factors for prognosis of HCC were mainly baseline HBeAg positivity, cirrhotic stage, advanced Tumor-Node-Metastasis (TNM) stage, macrovascular invasion, and antiviral agent type. The 1-year OS rate decreased by more than four times (coefficient -4.45, P<0.001) and the 1-year HCC recurrence increased by more than one time (coefficient 1.20, P=0.003) when lamivudine was chosen for HCC after curative therapy, relative to entecavir for HCC. HBV mutation may play a role in HCC recurrence. Entecavir or tenofovir, which have a high genetic barrier to resistance, should be recommended for HBV-HCC patients. © 2015 The Authors. Journal of Gastroenterology and Hepatology published by Journal of Gastroenterology and Hepatology Foundation and Wiley Publishing Asia Pty Ltd.
Surányi, A; Kozinszky, Z; Molnár, A; Nyári, T; Bitó, T; Pál, A
2013-10-01
The aim of our study was to evaluate placental three-dimensional power Doppler indices in diabetic pregnancies in the second and third trimesters and to compare them with those of the normal controls. Placental vascularization of pregnant women was determined by three-dimensional power Doppler ultrasound technique. The calculated indices included vascularization index (VI), flow index (FI), and vascularization flow index (VFI). Uncomplicated pregnancies (n = 113) were compared with pregnancies complicated by gestational diabetes mellitus (n = 56) and diabetes mellitus (n = 43). The three-dimensional power Doppler indices were not significantly different between the two diabetic subgroups. All the indices in diabetic patients were significantly reduced compared with those in non-diabetic individuals (p < 0.001). Placental three-dimensional power Doppler indices are slightly diminished throughout diabetic pregnancy [regression coefficients: -0.23 (FI), -0.06 (VI), and -0.04 (VFI)] and normal pregnancy [regression coefficients: -0.13 (FI), -0.20 (VI), and -0.11 (VFI)]. The uteroplacental circulation (umbilical and uterine artery) was not correlated significantly to the three-dimensional power Doppler indices. If all placental indices are low during late pregnancy, then the odds of the diabetes are significantly high (adjusted odds ratio: 1.10). A decreased placental vascularization could be an adjunct sonographic marker in the diagnosis of diabetic pregnancy in mid-gestation and late gestation. © 2013 John Wiley & Sons, Ltd.
Inequality in Maternal Mortality in Iran: An Ecologic Study
Tajik, Parvin; Nedjat, Saharnaz; Afshar, Nozhat Emami; Changizi, Nasrin; Yazdizadeh, Bahareh; Azemikhah, Arash; Aamrolalaei, Sima; Majdzadeh, Reza
2012-01-01
Background: Maternal mortality (MM) is an avoidable death and there is national, international and political commitment to reduce it. The objective of this study is to examine the relation of MM to socioeconomic factors and its inequality in Iran's provinces at an ecologic level. Methods: The overall MM from each province was considered for 3 years from 2004 to 2006. The five independent variables whose relations were studied included the literacy rate among men and women in each province, mean annual household income per capita, Gini coefficients in each province, and Human Development Index (HDI). The correlation of Maternal Mortality Ratio (MMR) to the above five variables was evaluated through Pearson's correlation coefficient (simple and weighted for each province's population) and linear regression – by considering MMR as the dependent variable and the Gini coefficient, HDI, and difference in literacy rate among men and women as the independent variables. Results: The mean MMR in the years 2004–2006 was 24.7 in 100,000 live births. The correlation coefficients between MMR and literacy rate among women, literacy rate among men, the mean annual household income per capita, Gini coefficient and HDI were 0.82, 0.90, –0.61, 0.52 and –0.77, respectively. Based on multivariate regression, MMR was significantly associated with HDI (standardized B=–0.93) and difference in literacy rate among men and women (standardized B=–0.47). However, MMR was not significantly associated with the Gini coefficient. Conclusion: This study shows the association between socioeconomic variables and their inequalities with MMR in Iran's provinces at an ecologic level. In addition to the other direct interventions performed to reduce MM, it seems essential to especially focus on more distal factors influencing MMR. PMID:22347608
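The abstract above reports Pearson correlations that are both simple and weighted by each province's population, without giving the formula. One common definition of a weighted Pearson correlation is sketched below; the function name and the example numbers are hypothetical, and the study may have used a different weighting scheme.

```python
import math


def weighted_pearson(x, y, w):
    """Pearson correlation of x and y with non-negative weights w."""
    if not (len(x) == len(y) == len(w)):
        raise ValueError("x, y and w must have the same length")
    sw = sum(w)
    # Weighted means.
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    # Weighted covariance and variances (the 1/sw factors cancel in the ratio).
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(vx * vy)


if __name__ == "__main__":
    # Hypothetical province-level values, for illustration only.
    hdi = [0.60, 0.70, 0.80]
    mmr = [40.0, 25.0, 15.0]
    population = [1.0, 3.0, 2.0]
    print(weighted_pearson(hdi, mmr, population))
```

With equal weights this reduces to the ordinary (simple) Pearson coefficient, so the same function covers both variants mentioned in the abstract.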
NASA Astrophysics Data System (ADS)
Kelly, B.; Chelsky, A.; Bulygina, E.; Roberts, B. J.
2017-12-01
Remote sensing techniques have become valuable tools to researchers, providing the capability to measure and visualize important parameters without the need for time- or resource-intensive sampling trips. Relationships between dissolved organic carbon (DOC), colored dissolved organic matter (CDOM) and spectral data have been used to remotely sense DOC concentrations in riverine systems; however, this approach has not been applied to the northern Gulf of Mexico (GoM) and needs to be tested to determine how accurate these relationships are in riverine-dominated shelf systems. In April, July, and October 2017 we sampled surface water from 80+ sites over an area of 100,000 km² along the Louisiana-Texas shelf in the northern GoM. DOC concentrations were measured on filtered water samples using a Shimadzu TOC-VCSH analyzer using standard techniques. Additionally, DOC concentrations were estimated from CDOM absorption coefficients of filtered water samples on a UV-Vis spectrophotometer using a modification of the methods of Fichot and Benner (2011). These values were regressed against Landsat visible band spectral data for those same locations to establish a relationship between the spectral data and CDOM absorption coefficients. This allowed us to spatially map CDOM absorption coefficients in the Gulf of Mexico using the Landsat spectral data in GIS. We then used a multiple linear regression model to derive DOC concentrations from the CDOM absorption coefficients and applied those to our map. This study provides an evaluation of the viability of scaling up CDOM absorption coefficient and remote-sensing derived estimates of DOC concentrations to the scale of the LA-TX shelf ecosystem.
Mechanisms behind the estimation of photosynthesis traits from leaf reflectance observations
NASA Astrophysics Data System (ADS)
Dechant, Benjamin; Cuntz, Matthias; Doktor, Daniel; Vohland, Michael
2016-04-01
Many studies have investigated the reflectance-based estimation of leaf chlorophyll, water and dry matter contents of plants. Only a few studies have focused on photosynthesis traits, however. The maximum potential uptake of carbon dioxide under given environmental conditions is determined mainly by RuBisCO activity, limiting carboxylation, or the speed of photosynthetic electron transport. These two main limitations are represented by the maximum carboxylation capacity, Vcmax,25, and the maximum electron transport rate, Jmax,25. These traits have been estimated from leaf reflectance before, but the mechanisms underlying the estimation remain rather speculative. The aim of this study was therefore to reveal the mechanisms behind reflectance-based estimation of Vcmax,25 and Jmax,25. Leaf reflectance, photosynthetic response curves as well as nitrogen content per area, Narea, and leaf mass per area, LMA, were measured on 37 deciduous tree species. Vcmax,25 and Jmax,25 were determined from the response curves. Partial Least Squares (PLS) regression models for the two photosynthesis traits Vcmax,25 and Jmax,25 as well as Narea and LMA were studied using a cross-validation approach. Analyses of linear regression models based on Narea and other leaf traits estimated via PROSPECT inversion, PLS regression coefficients and model residuals were conducted in order to reveal the mechanisms behind the reflectance-based estimation. We found that Vcmax,25 and Jmax,25 can be estimated from leaf reflectance with good to moderate accuracy for a large number of species and different light conditions. The dominant mechanism behind the estimations was the strong relationship between photosynthesis traits and leaf nitrogen content. This was concluded from very strong relationships between PLS regression coefficients, the model residuals as well as the prediction performance of Narea-based linear regression models compared to PLS regression models.
While the PLS regression model for Vcmax,25 was fully based on the correlation to Narea, the PLS regression model for Jmax,25 was not entirely based on it. Analyses of the contributions of different parts of the reflectance spectrum revealed that the information contributing to the Jmax,25 PLS regression model in addition to the main source of information, Narea, was mainly located in the visible part of the spectrum (500-900 nm). Estimated chlorophyll content could be excluded as a potential source of this extra information. The PLS regression coefficients of the Jmax,25 model indicated possible contributions from chlorophyll fluorescence and cytochrome f content. In summary, we found that the main mechanism behind the estimation of Vcmax,25 and Jmax,25 from leaf reflectance observations is the correlation to Narea but that there is additional information related to Jmax,25 mainly in the visible part of the spectrum.
Prediction of soil organic carbon partition coefficients by soil column liquid chromatography.
Guo, Rongbo; Liang, Xinmiao; Chen, Jiping; Wu, Wenzhong; Zhang, Qing; Martens, Dieter; Kettrup, Antonius
2004-04-30
To avoid the limitations of the widely used methods for predicting soil organic carbon partition coefficients (KOC) from hydrophobic parameters, e.g., the n-octanol/water partition coefficients (KOW) and the reversed phase high performance liquid chromatographic (RP-HPLC) retention factors, the soil column liquid chromatographic (SCLC) method was developed for KOC prediction. Real soils were used as the packing materials of RP-HPLC columns, and the correlations between the retention factors of organic compounds on soil columns (ksoil) and KOC measured by the batch equilibrium method were studied. Good correlations were achieved between ksoil and KOC for three types of soils with different properties. All squared correlation coefficients (R²) of the linear regressions between log ksoil and log KOC were higher than 0.89, with standard deviations of less than 0.21. In addition, the prediction of KOC from KOW and the RP-HPLC retention factors on cyanopropyl (CN) stationary phase (kCN) was comparatively evaluated for the three types of soils. The results show that the prediction of KOC from kCN and KOW is only applicable to some specific types of soils. The results obtained in the present study showed that the SCLC method is appropriate for KOC prediction for different types of soils; however, the applicability of using hydrophobic parameters to predict KOC depends largely on the properties of the soil concerned.
Ramírez-Vélez, Robinson; Correa-Bautista, Jorge Enrique; González-Ruíz, Katherine; Vivas, Andrés; Triana-Reina, Héctor Reynaldo; Martínez-Torres, Javier; Prieto-Benavides, Daniel Humberto; Carrillo, Hugo Alejandro; Ramos-Sepúlveda, Jeison Alexander; Villa-González, Emilio; García-Hermoso, Antonio
2017-01-17
Recently, a body adiposity index (BAI = (hip circumference)/(height)^1.5 - 18) was developed and validated in adult populations. The aim of this study was to evaluate the performance of BAI in estimating percentage body fat (BF%) in a sample of Colombian collegiate young adults. The participants were 903 volunteers (52% females, mean age = 21.4 years ± 3.3). We used Lin's concordance correlation coefficient (ρc), linear regression, Bland-Altman agreement analysis, and the coefficient of determination (R²) to compare BAI with BF% measured by bioelectrical impedance analysis (BIA). The correlation between the two methods of estimating BF% was R² = 0.384, p < 0.001. A paired-sample t-test showed a difference between the methods (BIA BF% = 16.2 ± 3.1, BAI BF% = 30.0 ± 5.4%; p < 0.001). For BIA, the bias value was 6.0 ± 6.2 BF% (95% confidence interval (CI) = -6.0 to 18.2), indicating that the BAI method overestimated BF% relative to the reference method. Lin's concordance correlation coefficient was poor (ρc = 0.014, 95% CI = -0.124 to 0.135; p = 0.414). In Colombian college students, there was poor agreement between BAI- and BIA-based estimates of BF%, and so BAI is not accurate in people with low or high body fat percentage levels.
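The BAI formula and Lin's concordance correlation coefficient named in the abstract above can be illustrated with a short sketch. It assumes hip circumference in centimetres and height in metres (the usual convention for BAI) and uses the population (1/n) form of the variances for the concordance coefficient; function names and the sample measurements are hypothetical.

```python
import statistics


def bai(hip_cm, height_m):
    """Body adiposity index: hip circumference / height**1.5 - 18."""
    return hip_cm / height_m ** 1.5 - 18.0


def lin_ccc(x, y):
    """Lin's concordance correlation coefficient between two methods."""
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # Penalises both poor correlation and a shift between the two means.
    return 2.0 * sxy / (sxx + syy + (mx - my) ** 2)


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    hips = [92.0, 98.0, 104.0]
    heights = [1.62, 1.70, 1.78]
    bia_bf = [18.0, 22.0, 25.0]
    bai_bf = [bai(h, ht) for h, ht in zip(hips, heights)]
    print(bai_bf, lin_ccc(bia_bf, bai_bf))
```

Unlike Pearson's r, the concordance coefficient drops when one method is systematically higher than the other, which is why a moderate R² can coexist with a near-zero ρc as reported in the abstract.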
Hasmi, Laila; Drukker, Marjan; Guloksuz, Sinan; Menne-Lothmann, Claudia; Decoster, Jeroen; van Winkel, Ruud; Collip, Dina; Delespaul, Philippe; De Hert, Marc; Derom, Catherine; Thiery, Evert; Jacobs, Nele; Rutten, Bart P. F.; Wichers, Marieke; van Os, Jim
2017-01-01
Background: The network analysis of intensive time series data collected using the Experience Sampling Method (ESM) may provide vital information in gaining insight into the link between emotion regulation and vulnerability to psychopathology. The aim of this study was to apply the network approach to investigate whether genetic liability (GL) to psychopathology and childhood trauma (CT) are associated with the network structure of the emotions “cheerful,” “insecure,” “relaxed,” “anxious,” “irritated,” and “down,” collected using the ESM. Methods: Using data from a population-based sample of twin pairs and siblings (704 individuals), we examined whether momentary emotion network structures differed across strata of CT and GL. GL was determined empirically using the level of psychopathology in monozygotic and dizygotic co-twins. Network models were generated using multilevel time-lagged regression analysis and were compared across three strata (low, medium, and high) of CT and GL, respectively. Permutations were utilized to calculate p values and compare regression coefficients, density, and centrality indices. Regression coefficients were presented as connections, while variables represented the nodes in the network. Results: In comparison to the low GL stratum, the high GL stratum had significantly denser overall (p = 0.018) and negative affect network density (p < 0.001). The medium GL stratum also showed a directionally similar (in-between high and low GL strata) but statistically inconclusive association with network density. In contrast to GL, the results of the CT analysis were less conclusive, with increased positive affect density (p = 0.021) and overall density (p = 0.042) in the high CT stratum compared to the medium CT stratum but not to the low CT stratum. The individual node comparisons across strata of GL and CT yielded only very few significant results after adjusting for multiple testing.
Conclusions: The present findings demonstrate that the network approach may have some value in understanding the relation between established risk factors for mental disorders (particularly GL) and the dynamic interplay between emotions. The present finding partially replicates an earlier analysis, suggesting it may be instructive to model negative emotional dynamics as a function of genetic influence. PMID:29163289
ERIC Educational Resources Information Center
Choi, Kilchan; Seltzer, Michael
2010-01-01
In studies of change in education and numerous other fields, interest often centers on how differences in the status of individuals at the start of a period of substantive interest relate to differences in subsequent change. In this article, the authors present a fully Bayesian approach to estimating three-level Hierarchical Models in which latent…
Pezzei, Cornelia K; Schönbichler, Stefan A; Hussain, Shah; Kirchler, Christian G; Huck-Pezzei, Verena A; Popp, Michael; Krolitzek, Justine; Bonn, Günther K; Huck, Christian W
2018-04-01
In this study, novel near-infrared and attenuated total reflectance mid-infrared spectroscopic methods coupled with multivariate data analysis were established enabling the determination of thymol, rosmarinic acid, and the antioxidant capacity of Thymi herba. A new high-performance liquid chromatography method and UV-Vis spectroscopy were applied as reference methods. Partial least squares regressions were carried out as cross and test set validations. To reduce systematic errors, different data pretreatments, such as multiplicative scatter correction, 1st derivative, or 2nd derivative, were applied on the spectra. The performances of the two infrared spectroscopic techniques were evaluated and compared. In general, attenuated total reflectance mid-infrared spectroscopy demonstrated a slightly better predictive power (thymol: coefficient of determination = 0.93, factors = 3, ratio of performance to deviation = 3.94; rosmarinic acid: coefficient of determination = 0.91, factors = 3, ratio of performance to deviation = 3.35, antioxidant capacity: coefficient of determination = 0.87, factors = 2, ratio of performance to deviation = 2.80; test set validation) than near-infrared spectroscopy (thymol: coefficient of determination = 0.90, factors = 6, ratio of performance to deviation = 3.10; rosmarinic acid: coefficient of determination = 0.92, factors = 6, ratio of performance to deviation = 3.61, antioxidant capacity: coefficient of determination = 0.91, factors = 6, ratio of performance to deviation = 3.42; test set validation). The capability of infrared vibrational spectroscopy as a quick and simple analytical tool to replace conventional time and chemical consuming analyses for the quality control of T. herba could be demonstrated. Georg Thieme Verlag KG Stuttgart · New York.
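The abstract above summarises each calibration by its coefficient of determination and ratio of performance to deviation (RPD) without defining them. A minimal sketch follows, assuming RPD is the sample standard deviation of the reference values divided by the root-mean-square error of prediction (definitions vary between authors, some use the bias-corrected standard error of prediction instead); function names and data are hypothetical.

```python
import math
import statistics


def rmse(reference, predicted):
    """Root-mean-square error of prediction."""
    n = len(reference)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)


def r_squared(reference, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_ref = statistics.fmean(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


def rpd(reference, predicted):
    """Ratio of performance to deviation: SD of reference values / RMSE."""
    return statistics.stdev(reference) / rmse(reference, predicted)


if __name__ == "__main__":
    # Hypothetical reference (e.g. HPLC) and spectroscopic predictions.
    ref = [1.0, 2.0, 3.0, 4.0, 5.0]
    pred = [1.1, 1.9, 3.1, 3.9, 5.1]
    print(r_squared(ref, pred), rpd(ref, pred))
```

A larger RPD means the prediction error is small relative to the natural spread of the reference values, which is the sense in which the abstract compares the two infrared techniques.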
Villar Balboa, Iván; Carrillo Muñoz, Ricard; Regí Bosque, Meritxell; Marzo Castillejo, Mercè; Arcusa Villacampa, Núria; Segundo Yagüe, Marta
2014-04-01
To describe the relationship between individual or combined prognostic factors in the multidimensional classifications (BODE and ADO), and health-related quality of life (HRQOL) in patients with chronic obstructive pulmonary disease (COPD). Cross-sectional descriptive study. Primary care. Systematic random sample of 102 patients diagnosed with COPD, excluding those patients with acute exacerbation, dementia, terminal illness or those who receive home care. Demographic variables, smoking habits, body mass index and number of exacerbations. Comorbidity. Degree of dyspnea. Respiratory function tests. Exercise capacity. The BODE index and the ADO index. The EuroQol-5D questionnaire (EQ-5D), and visual analogue scale (VAS). EQ-5D: mobility: 43.9%; personal care: 13.3%; daily-life activities: 29.6%; pain/discomfort: 55.1%; anxiety/depression: 37.8%, and 34.7% VAS ≤ 60%. Exacerbations: mobility, OR: 1.85 (95%CI: 1.08-3.20); personal care, OR: 2.12 (95%CI: 1.3-4.76); daily-life activities, OR: 2.35 (95%CI: 1.17-4.71); VAS, regression coefficient: -3.50 (95%CI: -6.31 to -0.70). Dyspnea: mobility, OR: 4.47 (95%CI: 1.39-14.42); daily-life activities, OR: 7.71 (95%CI: 2.03-12.34); VAS, regression coefficient: -7.15 (95%CI: -11.71 to -2.59). BODE: mobility, OR: 1.53 (95%CI: 1.15-2.02); personal care, OR: 2.08 (95%CI: 1.40-3.11); daily-life activities, OR: 1.97 (95%CI: 1.38-2.80); VAS, regression coefficient: -3.96 (95%CI: -5.51 to -2.42). ADO: mobility, OR: 2.42 (95%CI: 1.39-4.20); personal care, OR: 3.21 (95%CI: 1.67-6.18); daily-life activities, OR: 3.17 (95%CI: 1.69-5.93); VAS, regression coefficient: -3.53 (95%CI: -5.57 to -1.49). The BODE index and the ADO index showed a significant association with HRQOL. Exacerbations and dyspnea were the best individual factors related to HRQOL. Copyright © 2013 Elsevier España, S.L. All rights reserved.
Minute ventilation of cyclists, car and bus passengers: an experimental study.
Zuurbier, Moniek; Hoek, Gerard; van den Hazel, Peter; Brunekreef, Bert
2009-10-27
Differences in minute ventilation between cyclists, pedestrians and other commuters influence inhaled doses of air pollution. This study estimates minute ventilation of cyclists, car and bus passengers, as part of a study on health effects of commuters' exposure to air pollutants. Thirty-four participants performed a submaximal test on a bicycle ergometer, during which heart rate and minute ventilation were measured simultaneously at increasing cycling intensity. Individual regression equations were calculated between heart rate and the natural log of minute ventilation. Heart rates were recorded during 280 two-hour trips by bicycle, bus and car and were converted into minute ventilation levels using the individual regression coefficients. Minute ventilation during bicycle rides was on average 2.1 times higher than in the car (individual range from 1.3 to 5.3) and 2.0 times higher than in the bus (individual range from 1.3 to 5.1). The ratio of minute ventilation of cycling compared to travelling by bus or car was higher in women than in men. Substantial differences in regression equations were found between individuals. The use of individual regression equations instead of average regression equations resulted in substantially better predictions of individual minute ventilations. The comparability of the gender-specific overall regression equations linking heart rate and minute ventilation with one previous American study supports the use of overall equations for studies at the group level. For estimating individual doses, the use of individual regression coefficients provides more precise data. Minute ventilation levels of cyclists are on average two times higher than those of bus and car passengers, consistent with the ratio found in one small previous study of young adults. The study illustrates the importance of including minute ventilation data when comparing air pollution doses between different modes of transport.
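The per-participant calibration described in this abstract can be sketched as follows. This is a minimal illustration, assuming a log-linear form ln(VE) = a + b * HR fitted by ordinary least squares; the function names and all numbers are hypothetical and are not taken from the study.

```python
import numpy as np

def fit_log_ve(heart_rate, minute_ventilation):
    """Fit ln(VE) = a + b * HR for one participant and return (a, b)."""
    hr = np.asarray(heart_rate, dtype=float)
    ln_ve = np.log(np.asarray(minute_ventilation, dtype=float))
    b, a = np.polyfit(hr, ln_ve, 1)  # polyfit returns the slope first
    return a, b

def predict_ve(a, b, heart_rate):
    """Convert heart rate (beats/min) into minute ventilation (L/min)."""
    return np.exp(a + b * np.asarray(heart_rate, dtype=float))

# Hypothetical ergometer readings for a single participant
hr_test = [80, 100, 120, 140, 160]          # beats/min
ve_test = [12.0, 18.0, 27.0, 40.5, 60.75]   # L/min

a, b = fit_log_ve(hr_test, ve_test)

# Ratio of predicted ventilation at a cycling heart rate vs. a resting one
ratio = predict_ve(a, b, 130) / predict_ve(a, b, 90)
```

Fitting one (a, b) pair per participant, rather than one pooled pair, mirrors the abstract's finding that individual equations predict individual ventilation better than average equations.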
NASA Astrophysics Data System (ADS)
Li, Wang; Niu, Zheng; Gao, Shuai; Wang, Cheng
2014-11-01
Light Detection and Ranging (LiDAR) and Synthetic Aperture Radar (SAR) are two competitive active remote sensing techniques for estimating forest above ground biomass, which is important for forest management and global climate change studies. This study aims to further explore their capabilities in temperate forest above ground biomass (AGB) estimation by emphasizing the spatial auto-correlation of variables obtained from these two remote sensing tools, an aspect that is usually overlooked in remote sensing applications to vegetation studies. Remote sensing variables including airborne LiDAR metrics, backscattering coefficients for different SAR polarizations and their ratio variables for Radarsat-2 imagery were calculated. First, simple linear regression (SLR) models were established between the field-estimated above ground biomass and the remote sensing variables. Pearson's correlation coefficient (R2) was used to find which LiDAR metric showed the most significant correlation with the regression residuals and could be selected as the co-variable in regression co-kriging (RCoKrig). Second, regression co-kriging was conducted by choosing the regression residuals as the dependent variable and the LiDAR metric (Hmean) with the highest R2 as the co-variable. Third, above ground biomass over the study area was estimated using the SLR model and the RCoKrig model, respectively. The results for these two models were validated using the same ground points. Results showed that both methods achieved satisfactory prediction accuracy, while regression co-kriging showed the lower estimation error. It is shown that the regression co-kriging model is feasible and effective in mapping the spatial pattern of AGB in the temperate forest using Radarsat-2 data calibrated by airborne LiDAR metrics.
The Correlation Between Metacognition Level with Self-Efficacy of Biology Education College Students
NASA Astrophysics Data System (ADS)
Ridlo, S.; Lutfiya, F.
2017-04-01
Self-efficacy is a strong predictor of academic achievement. Self-efficacy refers to the ability of college students to achieve the desired results. The metacognition level can influence college students' self-efficacy. This study aims to identify college students' metacognition level and self-efficacy, and to determine the relationship between self-efficacy and metacognition level for the 2013 cohort of Biology Education students at Semarang State University. The ex-post facto quantitative research was conducted on 99 students in Academic Year 2015/2016. The sample was selected by a saturation sampling technique. Self-efficacy data were collected with the E-D scale. Metacognition level data were collected with the Metacognitive Awareness Inventory. Data were analysed quantitatively by Pearson correlation and linear regression. Most college students had a high level of metacognition and average self-efficacy. The Pearson correlation coefficient was 0.367, which indicates that metacognition level and self-efficacy have a weak relationship. Based on the linear regression test, metacognition level accounted for up to 13.5% of self-efficacy. The results of the study showed that a positive and significant relationship exists between metacognition level and self-efficacy. Therefore, if the metacognition level is high, then self-efficacy will also be high (appropriate).
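The two figures reported in this abstract are consistent with each other: in a simple linear regression with one predictor, the coefficient of determination equals the square of the Pearson correlation. A short check, using only the value of r quoted above:

```python
r = 0.367                      # Pearson correlation reported in the abstract
r_squared = r ** 2             # coefficient of determination for one predictor
share_explained = round(100 * r_squared, 1)  # percentage of variance explained
print(share_explained)
```

The result matches the 13.5% attributed to the linear regression test.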
Aerodynamic parameters of High-Angle-of-Attack Research Vehicle (HARV) estimated from flight data
NASA Technical Reports Server (NTRS)
Klein, Vladislav; Ratvasky, Thomas R.; Cobleigh, Brent R.
1990-01-01
Aerodynamic parameters of the High-Angle-of-Attack Research Aircraft (HARV) were estimated from flight data at different values of the angle of attack between 10 degrees and 50 degrees. The main part of the data was obtained from small amplitude longitudinal and lateral maneuvers. A small number of large amplitude maneuvers was also used in the estimation. The measured data were first checked for their compatibility. It was found that the accuracy of air data was degraded by unexplained bias errors. Then, the data were analyzed by a stepwise regression method for obtaining a structure of aerodynamic model equations and least squares parameter estimates. Because of high data collinearity in several maneuvers, some of the longitudinal and all lateral maneuvers were reanalyzed by using two biased estimation techniques, the principal components regression and mixed estimation. The estimated parameters in the form of stability and control derivatives, and aerodynamic coefficients were plotted against the angle of attack and compared with the wind tunnel measurements. The influential parameters are, in general, estimated with acceptable accuracy and most of them are in agreement with wind tunnel results. The simulated responses of the aircraft showed good prediction capabilities of the resulting model.
Bradshaw, Elizabeth J; Keogh, Justin W L; Hume, Patria A; Maulder, Peter S; Nortje, Jacques; Marnewick, Michel
2009-06-01
The purpose of this study was to examine the role of neuromotor noise on golf swing performance in high- and low-handicap players. Selected two-dimensional kinematic measures of 20 male golfers (n=10 per high- or low-handicap group) performing 10 golf swings with a 5-iron club were obtained through video analysis. Neuromotor noise was calculated by deducting the standard error of the measurement from the coefficient of variation obtained from intra-individual analysis. Statistical methods included linear regression analysis and one-way analysis of variance using SPSS. Absolute invariance in the key technical positions (e.g., at the top of the backswing) of the golf swing appears to be a more favorable technique for skilled performance.
Sample size determination for logistic regression on a logit-normal distribution.
Kim, Seongho; Heath, Elisabeth; Heilbrun, Lance
2017-06-01
Although the sample size for simple logistic regression can be readily determined using currently available methods, the sample size calculation for multiple logistic regression requires some additional information, such as the coefficient of determination ([Formula: see text]) of a covariate of interest with other covariates, which is often unavailable in practice. The response variable of logistic regression follows a logit-normal distribution which can be generated from a logistic transformation of a normal distribution. Using this property of logistic regression, we propose new methods of determining the sample size for simple and multiple logistic regressions using a normal transformation of outcome measures. Simulation studies and a motivating example show several advantages of the proposed methods over the existing methods: (i) no need for [Formula: see text] for multiple logistic regression, (ii) available interim or group-sequential designs, and (iii) much smaller required sample size.
Current research efforts with Bacillus thuringiensis
Normand R. Dubois
1991-01-01
The bioassay of 260 strains of Bacillus thuringiensis (Bt) and 70 commercial preparations shows that regression coefficient estimates may be as critical as LC50 estimates when evaluating them for future consideration.
Societal Value of Surgery for Facial Reanimation.
Su, Peiyi; Ishii, Lisa E; Joseph, Andrew; Nellis, Jason; Dey, Jacob; Bater, Kristin; Byrne, Patrick J; Boahene, Kofi D O; Ishii, Masaru
2017-03-01
Patients with facial paralysis are perceived negatively by society in a number of domains. Society's perception of the health utility of varying degrees of facial paralysis and the value society places on reconstructive surgery for facial reanimation need to be quantified. To measure health state utility of varying degrees of facial paralysis, willingness to pay (WTP) for a repair, and the subsequent value of facial reanimation surgery as perceived by society. This prospective observational study conducted in an academic tertiary referral center evaluated a group of 348 casual observers who viewed images of faces with unilateral facial paralysis of 3 severity levels (low, medium, and high) categorized by House-Brackmann grade. Structural equation modeling was performed to understand associations among health utility metrics, WTP, and facial perception domains. Data were collected from July 16 to September 26, 2015. Observer-rated (1) quality of life (QOL) using established health utility metrics (standard gamble, time trade-off, and a visual analog scale) and (2) their WTP for surgical repair. Among the 348 observers (248 women [71.3%]; 100 men [28.7%]; mean [SD] age, 29.3 [11.6] years), mixed-effects linear regression showed that WTP increased nonlinearly with increasing severity of paralysis. Participants were willing to pay $3487 (95% CI, $2362-$4961) to repair low-grade paralysis, $8571 (95% CI, $6401-$11 234) for medium-grade paralysis, and $20 431 (95% CI, $16 273-$25 317) for high-grade paralysis. The dominant factor affecting the participants' WTP was perceived QOL. Modeling showed that perceived QOL decreased with paralysis severity (regression coefficient, -0.004; 95% CI, -0.005 to -0.004; P < .001) and increased with attractiveness (regression coefficient, 0.002; 95% CI, 0.002 to 0.003; P < .001). Mean (SD) health utility scores calculated by the standard gamble metric for low- and high-grade paralysis were 0.98 (0.09) and 0.77 (0.25), respectively. 
Time trade-off and visual analog scale measures were highly correlated. We calculated mean (SD) WTP per quality-adjusted life-year, which ranged from $10 167 ($14 565) to $17 008 ($38 288) for low- to high-grade paralysis, respectively. Society perceives the repair of facial paralysis to be a high-value intervention. Societal WTP increases and perceived health state utility decreases with increasing House-Brackmann grade. This study demonstrates the usefulness of WTP as an objective measure to inform dimensions of disease severity and signal the value society places on proper facial function.
Direct Breakthrough Curve Prediction From Statistics of Heterogeneous Conductivity Fields
NASA Astrophysics Data System (ADS)
Hansen, Scott K.; Haslauer, Claus P.; Cirpka, Olaf A.; Vesselinov, Velimir V.
2018-01-01
This paper presents a methodology to predict the shape of solute breakthrough curves in heterogeneous aquifers at early times and/or under high degrees of heterogeneity, both cases in which the classical macrodispersion theory may not be applicable. The methodology relies on the observation that breakthrough curves in heterogeneous media are generally well described by lognormal distributions, and mean breakthrough times can be predicted analytically. The log-variance of solute arrival is thus sufficient to completely specify the breakthrough curves, and this is calibrated as a function of aquifer heterogeneity and dimensionless distance from a source plane by means of Monte Carlo analysis and statistical regression. Using the ensemble of simulated groundwater flow and solute transport realizations employed to calibrate the predictive regression, reliability estimates for the prediction are also developed. Additional theoretical contributions include heuristics for the time until an effective macrodispersion coefficient becomes applicable, and also an expression for its magnitude that applies in highly heterogeneous systems. It is seen that the results here represent a way to derive continuous time random walk transition distributions from physical considerations rather than from empirical field calibration.
Uncertainty Analysis on Heat Transfer Correlations for RP-1 Fuel in Copper Tubing
NASA Technical Reports Server (NTRS)
Driscoll, E. A.; Landrum, D. B.
2004-01-01
NASA is studying kerosene (RP-1) for application in Next Generation Launch Technology (NGLT). Accurate heat transfer correlations in narrow passages at high temperatures and pressures are needed. Hydrocarbon fuels, such as RP-1, produce carbon deposition (coke) along the inside of tube walls when heated to high temperatures. A series of tests to measure the heat transfer using RP-1 fuel and examine the coking were performed in NASA Glenn Research Center's Heated Tube Facility. The facility models regenerative cooling by flowing room temperature RP-1 through resistively heated copper tubing. A regression analysis is performed on the data to determine the heat transfer correlation for Nusselt number as a function of Reynolds and Prandtl numbers. Each measurement and calculation is analyzed to identify sources of uncertainty, including RP-1 property variations. Monte Carlo simulation is used to determine how each uncertainty source propagates through the regression and to estimate the overall uncertainty in the predicted heat transfer coefficient. The implications of these uncertainties on engine design and ways to minimize existing uncertainties are discussed.
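A regression of this kind, with Monte Carlo propagation of measurement uncertainty, might look like the sketch below. It assumes a power-law correlation Nu = C * Re^a * Pr^b fitted in log space; the functional form, the 5% uncertainty level, and all data are assumptions made for illustration and do not come from the report.

```python
import numpy as np

def fit_power_law(re, pr, nu):
    """Least-squares fit of ln Nu = ln C + a ln Re + b ln Pr; returns (C, a, b)."""
    X = np.column_stack([np.ones_like(re), np.log(re), np.log(pr)])
    coef, *_ = np.linalg.lstsq(X, np.log(nu), rcond=None)
    return np.exp(coef[0]), coef[1], coef[2]

rng = np.random.default_rng(0)

# Synthetic "measurements" generated from an assumed correlation (hypothetical)
re = rng.uniform(1e4, 1e5, size=40)
pr = rng.uniform(5.0, 20.0, size=40)
nu_true = 0.023 * re**0.8 * pr**0.4

C, a, b = fit_power_law(re, pr, nu_true)

# Monte Carlo: perturb Nu with an assumed 5% relative standard uncertainty,
# refit each time, and record the predicted Nu at a fixed operating point.
n_draws = 500
pred = np.empty(n_draws)
for i in range(n_draws):
    nu_noisy = nu_true * (1.0 + 0.05 * rng.standard_normal(nu_true.size))
    Ci, ai, bi = fit_power_law(re, pr, nu_noisy)
    pred[i] = Ci * 5e4**ai * 10.0**bi  # Re = 5e4, Pr = 10

# Relative spread of the prediction caused by the input uncertainty
rel_uncertainty = pred.std(ddof=1) / pred.mean()
```

Only one uncertainty source is perturbed here; a fuller treatment would also draw Re and Pr (and the fluid properties behind them) from their own distributions in each pass.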
Relationship Between Earthquake b-Values and Crustal Stresses in a Young Orogenic Belt
NASA Astrophysics Data System (ADS)
Wu, Yih-Min; Chen, Sean Kuanhsiang; Huang, Ting-Chung; Huang, Hsin-Hua; Chao, Wei-An; Koulakov, Ivan
2018-02-01
It has been reported that earthquake b-values decrease linearly with the differential stresses in the continental crust and subduction zones. Here we report a regression-derived relation between earthquake b-values and crustal stresses using the Anderson fault parameter (Aϕ) in a young orogenic belt of Taiwan. This regression relation is well established by using a large and complete earthquake catalog for Taiwan. The data set consists of b-values and Aϕ values derived from relocated earthquakes and focal mechanisms, respectively. Our results show that b-values decrease linearly with the Aϕ values at crustal depths with a high correlation coefficient of -0.9. Thus, b-values could be used as stress indicators for orogenic belts. However, the state of stress is relatively well correlated with the surface geological setting with respect to earthquake b-values in Taiwan. Temporal variations in the b-value could constitute one of the main reasons for the spatial heterogeneity of b-values. We therefore suggest that b-values could be highly sensitive to temporal stress variations.
Balabin, Roman M; Lomakina, Ekaterina I
2011-04-21
In this study, we make a general comparison of the accuracy and robustness of five multivariate calibration models: partial least squares (PLS) regression or projection to latent structures, polynomial partial least squares (Poly-PLS) regression, artificial neural networks (ANNs), and two novel techniques based on support vector machines (SVMs) for multivariate data analysis: support vector regression (SVR) and least-squares support vector machines (LS-SVMs). The comparison is based on fourteen (14) different datasets: seven sets of gasoline data (density, benzene content, and fractional composition/boiling points), two sets of ethanol gasoline fuel data (density and ethanol content), one set of diesel fuel data (total sulfur content), three sets of petroleum (crude oil) macromolecules data (weight percentages of asphaltenes, resins, and paraffins), and one set of petroleum resins data (resins content). Vibrational (near-infrared, NIR) spectroscopic data are used to predict the properties and quality coefficients of gasoline, biofuel/biodiesel, diesel fuel, and other samples of interest. The four systems presented here range greatly in composition, properties, strength of intermolecular interactions (e.g., van der Waals forces, H-bonds), colloid structure, and phase behavior. Due to the high diversity of chemical systems studied, general conclusions about SVM regression methods can be made. We try to answer the following question: to what extent can SVM-based techniques replace ANN-based approaches in real-world (industrial/scientific) applications? The results show that both SVR and LS-SVM methods are comparable to ANNs in accuracy. Due to the much higher robustness of the former, the SVM-based approaches are recommended for practical (industrial) application. This has been shown to be especially true for complicated, highly nonlinear objects.
Continuous monitoring of sediment and nutrients in the Illinois River at Florence, Illinois, 2012-13
Terrio, Paul J.; Straub, Timothy D.; Domanski, Marian M.; Siudyla, Nicholas A.
2015-01-01
The Illinois River is the largest river in Illinois and is the primary contributing watershed for nitrogen, phosphorus, and suspended-sediment loading to the upper Mississippi River from Illinois. In addition to streamflow, the following water-quality constituents were monitored at the Illinois River at Florence, Illinois (U.S. Geological Survey station number 05586300), during May 2012–October 2013: phosphate, nitrate, turbidity, temperature, specific conductance, pH, and dissolved oxygen. The objectives of this monitoring were to (1) determine performance capabilities of the in-situ instruments; (2) collect continuous data that would provide an improved understanding of constituent characteristics during normal, low-, and high-flow periods and during different climatic and land-use seasons; (3) evaluate the ability to use continuous turbidity as a surrogate constituent to determine suspended-sediment concentrations; and (4) evaluate the ability to develop a regression model for total phosphorus using phosphate, turbidity, and other measured parameters. Reliable data collection was achieved, following some initial periods of instrument and data-communication difficulties. The resulting regression models for suspended sediment had coefficient of determination (R2) values of about 0.9. Nitrate plus nitrite loads computed using continuous data were found to be approximately 8 percent larger than loads computed using traditional discrete-sampling based models. A regression model for total phosphorus was developed by using historic orthophosphate data (important during periods of low flow and low concentrations) and historic suspended-sediment data (important during periods of high flow and higher concentrations). The R2 of the total phosphorus regression model using orthophosphorus and suspended sediment was 0.8. Data collection and refinement of the regression models are ongoing.
Abnormal dynamics of language in schizophrenia.
Stephane, Massoud; Kuskowski, Michael; Gundel, Jeanette
2014-05-30
Language could be conceptualized as a dynamic system that includes multiple interactive levels (sub-lexical, lexical, sentence, and discourse) and components (phonology, semantics, and syntax). In schizophrenia, abnormalities are observed at all language elements (levels and components) but the dynamic between these elements remains unclear. We hypothesize that the dynamics between language elements in schizophrenia is abnormal and explore how this dynamic is altered. We, first, investigated language elements with comparable procedures in patients and healthy controls. Second, using measures of reaction time, we performed multiple linear regression analyses to evaluate the inter-relationships among language elements and the effect of group on these relationships. Patients significantly differed from controls with respect to sub-lexical/lexical, lexical/sentence, and sentence/discourse regression coefficients. The intercepts of the regression slopes increased in the same order above (from lower to higher levels) in patients but not in controls. Regression coefficients between syntax and both sentence level and discourse level semantics did not differentiate patients from controls. This study indicates that the dynamics between language elements is abnormal in schizophrenia. In patients, top-down flow of linguistic information might be reduced, and the relationship between phonology and semantics but not between syntax and semantics appears to be altered. Published by Elsevier Ireland Ltd.
Raman spectroscopy-based screening of hepatitis C and associated molecular changes
NASA Astrophysics Data System (ADS)
Bilal, Maria; Bilal, M.; Saleem, M.; Khan, Saranjam; Ullah, Rahat; Fatima, Kiran; Ahmed, M.; Hayat, Abbas; Shahzada, Shaista; Ullah Khan, Ehsan
2017-09-01
This study presents the optical screening of hepatitis C and its associated molecular changes in human blood sera using a partial least-squares regression model based on their Raman spectra. In total, 152 samples were tested through enzyme-linked immunosorbent assay for confirmation. This model utilizes minor spectral variations in the Raman spectra of the positive and control groups. Regression coefficients of this model were analyzed with reference to the variations in concentration of associated molecules in these two groups. It was found that trehalose, chitin, ammonia, and cytokines are positively correlated while lipids, beta structures of proteins, and carbohydrate-binding proteins are negatively correlated with hepatitis C. The regression vector yielded by this model is utilized to predict hepatitis C in unknown samples. This model has been evaluated by a cross-validation method, which yielded a correlation coefficient of 0.91. Moreover, 30 unknown samples were screened for hepatitis C infection using this model to test its performance. Sensitivity, specificity, accuracy, and area under the receiver operating characteristic curve from these predictions were found to be 93.3%, 100%, 96.7%, and 1, respectively.
Regression equations for disinfection by-products for the Mississippi, Ohio and Missouri rivers
Rathbun, R.E.
1996-01-01
Trihalomethane and nonpurgeable total organic-halide formation potentials were determined for the chlorination of water samples from the Mississippi, Ohio and Missouri Rivers. Samples were collected during the summer and fall of 1991 and the spring of 1992 at twelve locations on the Mississippi from New Orleans to Minneapolis, and on the Ohio and Missouri 1.6 km upstream from their confluences with the Mississippi. Formation potentials were determined as a function of pH, initial free-chlorine concentration, and reaction time. Multiple linear regression analysis of the data indicated that pH, reaction time, and the dissolved organic carbon concentration and/or the ultraviolet absorbance of the water were the most significant variables. The initial free-chlorine concentration had less significance and bromide concentration had little or no significance. Analysis of combinations of the dissolved organic carbon concentration and the ultraviolet absorbance indicated that use of the ultraviolet absorbance alone provided the best prediction of the experimental data. Regression coefficients for the variables were generally comparable to coefficients previously presented in the literature for waters from other parts of the United States.
Latin hypercube approach to estimate uncertainty in ground water vulnerability
Gurdak, J.J.; McCray, J.E.; Thyne, G.; Qi, S.L.
2007-01-01
A methodology is proposed to quantify prediction uncertainty associated with ground water vulnerability models that were developed through an approach that coupled multivariate logistic regression with a geographic information system (GIS). This method uses Latin hypercube sampling (LHS) to illustrate the propagation of input error and estimate uncertainty associated with the logistic regression predictions of ground water vulnerability. Central to the proposed method is the assumption that prediction uncertainty in ground water vulnerability models is a function of input error propagation from uncertainty in the estimated logistic regression model coefficients (model error) and the values of explanatory variables represented in the GIS (data error). Input probability distributions that represent both model and data error sources of uncertainty were simultaneously sampled using a Latin hypercube approach with logistic regression calculations of probability of elevated nonpoint source contaminants in ground water. The resulting probability distribution represents the prediction intervals and associated uncertainty of the ground water vulnerability predictions. The method is illustrated through a ground water vulnerability assessment of the High Plains regional aquifer. Results of the LHS simulations reveal significant prediction uncertainties that vary spatially across the regional aquifer. Additionally, the proposed method enables a spatial deconstruction of the prediction uncertainty that can lead to improved prediction of ground water vulnerability. © 2007 National Ground Water Association.
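The sampling scheme described in this abstract can be illustrated with a small sketch: draw a Latin hypercube over the uncertain inputs, push each draw through a logistic regression, and read prediction intervals off the resulting distribution. The single-predictor model, the uniform input ranges, and the `latin_hypercube` helper are all hypothetical choices made for illustration; the paper's actual distributions and variables are not given here.

```python
import numpy as np

def latin_hypercube(n_samples, n_dims, rng):
    """Return an (n_samples, n_dims) array in [0, 1) with exactly one
    point in each of the n_samples equal-width strata of every dimension."""
    u = rng.random((n_samples, n_dims))
    strata = np.column_stack(
        [rng.permutation(n_samples) for _ in range(n_dims)]
    )
    return (strata + u) / n_samples

rng = np.random.default_rng(42)
n = 1000
u = latin_hypercube(n, 3, rng)

# Hypothetical uncertain inputs, scaled from the unit hypercube:
b0 = -2.0 + 0.5 * (2 * u[:, 0] - 1)  # intercept (model error)
b1 = 0.8 + 0.2 * (2 * u[:, 1] - 1)   # slope (model error)
x = 3.0 + 1.0 * (2 * u[:, 2] - 1)    # explanatory variable (data error)

# Logistic regression prediction for every sampled input combination
p = 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))

# Empirical 95% prediction interval for the probability
lower, upper = np.percentile(p, [2.5, 97.5])
```

Sampling model error and data error in the same hypercube, as done here, corresponds to the abstract's statement that both sources were sampled simultaneously. Repeating the calculation per grid cell would give the spatially varying intervals it mentions.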
NASA Astrophysics Data System (ADS)
Mahrooghy, Majid; Ashraf, Ahmed B.; Daye, Dania; Mies, Carolyn; Rosen, Mark; Feldman, Michael; Kontos, Despina
2014-03-01
We evaluate the prognostic value of sparse representation-based features by applying the K-SVD algorithm on multiparametric kinetic, textural, and morphologic features in breast dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI). K-SVD is an iterative dimensionality reduction method that optimally reduces the initial feature space by updating the dictionary columns jointly with the sparse representation coefficients. Therefore, by using K-SVD, we not only provide sparse representation of the features and condense the information in a few coefficients but also reduce the dimensionality. The extracted K-SVD features are evaluated by a machine learning algorithm including a logistic regression classifier for the task of classifying high versus low breast cancer recurrence risk as determined by a validated gene expression assay. The features are evaluated using ROC curve analysis and leave-one-out cross validation for different sparse representation and dimensionality reduction numbers. Optimal sparse representation is obtained when the number of dictionary elements is 4 (K=4) and maximum non-zero coefficients is 2 (L=2). We compare K-SVD with ANOVA based feature selection for the same prognostic features. The ROC results show that the AUC of the K-SVD based (K=4, L=2), the ANOVA based, and the original features (i.e., no dimensionality reduction) are 0.78, 0.71, and 0.68, respectively. From the results, it can be inferred that by using sparse representation of the originally extracted multi-parametric, high-dimensional data, we can condense the information on a few coefficients with the highest predictive value. In addition, the dimensionality reduction introduced by K-SVD can prevent models from over-fitting.
Waller, Niels G
2016-01-01
For a fixed set of standardized regression coefficients and a fixed coefficient of determination (R-squared), an infinite number of predictor correlation matrices will satisfy the implied quadratic form. I call such matrices fungible correlation matrices. In this article, I describe an algorithm for generating positive definite (PD), positive semidefinite (PSD), or indefinite (ID) fungible correlation matrices that have a random or fixed smallest eigenvalue. The underlying equations of this algorithm are reviewed from both algebraic and geometric perspectives. Two simulation studies illustrate that fungible correlation matrices can be profitably used in Monte Carlo research. The first study uses PD fungible correlation matrices to compare penalized regression algorithms. The second study uses ID fungible correlation matrices to compare matrix-smoothing algorithms. R code for generating fungible correlation matrices is presented in the supplemental materials.
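The "implied quadratic form" in this abstract can be written out explicitly. The notation below is an assumption made for illustration (the abstract does not define symbols): b is the vector of standardized regression coefficients, R_xx the predictor correlation matrix, and r_xy the vector of predictor-criterion correlations. The identity itself is a standard result for standardized least-squares regression.

```latex
% Normal equations in standardized form:
%   \mathbf{r}_{xy} = \mathbf{R}_{xx}\,\mathbf{b}
% Coefficient of determination:
%   R^2 = \mathbf{b}^{\top}\mathbf{r}_{xy}
% Substituting the first into the second gives the quadratic form:
\[
  R^2 \;=\; \mathbf{b}^{\top}\,\mathbf{R}_{xx}\,\mathbf{b},
  \qquad \operatorname{diag}(\mathbf{R}_{xx}) = \mathbf{1}.
\]
```

With b and R^2 held fixed, this is a single scalar constraint on the off-diagonal elements of R_xx, which is why many different correlation matrices can satisfy it once there are enough predictors.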
MANCOVA for one way classification with homogeneity of regression coefficient vectors
NASA Astrophysics Data System (ADS)
Mokesh Rayalu, G.; Ravisankar, J.; Mythili, G. Y.
2017-11-01
MANOVA and MANCOVA are extensions of the univariate ANOVA and ANCOVA techniques to multidimensional or vector-valued observations. In the statistical models of these techniques, the assumption of a Gaussian distribution is replaced with a multivariate Gaussian distribution for the data vectors and residual terms. The objective of MANCOVA is to determine whether statistically reliable mean differences can be demonstrated between groups after adjusting the newly created variable. When random assignment of samples or subjects to groups is not possible, multivariate analysis of covariance (MANCOVA) provides statistical matching of groups by adjusting dependent variables as if all subjects scored the same on the covariates. In this research article, the MANCOVA technique is extended to a larger number of covariates, and the homogeneity of regression coefficient vectors is also tested.
The impact of professional identity on role stress in nursing students: A cross-sectional study.
Sun, Li; Gao, Ying; Yang, Juan; Zang, Xiao-Ying; Wang, Yao-Gang
2016-11-01
As newcomers to the clinical workplace, nursing students will encounter a high degree of role stress, which is an important predictor of burnout and engagement. Professional identity is theorised to be a key factor in providing high-quality care to improve patient outcomes and is thought to mediate the negative effects of a high-stress workplace and improve clinical performance and job retention. To investigate the level of nursing students' professional identity and role stress at the end of the first sub-internship, and to explore the impact of the nursing students' professional identity and other characteristics on role stress. A cross-sectional study. Three nursing schools in China. Nursing students after a 6-month sub-internship in a general hospital (n=474). The Role Stress Scale (score range: 12-60) and the Professional Identity Questionnaire for Nursing students (score range: 17-85) were used to investigate the levels of nursing students' role stress and professional identity. Higher scores indicated higher levels of role stress and professional identity. Basic demographic information about the nursing students was collected. The Pearson correlation, point-biserial correlation and multiple linear regression analysis were used to analyse the data. The mean total scores of the Role Stress Scale and Professional Identity Questionnaire for Nursing Students were 34.04 (SD=6.57) and 57.63 (SD=9.63), respectively. In the bivariate analyses, the following independent variables were found to be significantly associated with the total score of the Role Stress Scale: the total score of the Professional Identity Questionnaire for Nursing Students (r=-0.295, p<0.01), age (r=0.145, p<0.01), whether student was an only child or not (r=-0.114, p<0.05), education level (r=0.295, p<0.01) and whether student had experience in community organisations or not (r=0.151, p<0.01). 
In the multiple linear regression analysis, the total score of the Professional Identity Questionnaire for Nursing Students (standardised coefficient Beta: -0.260, p<0.001), education level (standardised coefficient Beta: 0.212, p<0.001) and whether or not the student had experience in community organisations (standardised coefficient Beta: 0.107, p<0.016) were the factors significantly associated with the total score of the Role Stress Scale. The multiple linear regression model explained 18.2% (adjusted R²: 16.5%) of the variance in Role Stress Scale scores. The nursing students' level of role stress at the end of the first sub-internship was high. The students with higher professional identity values had lower role stress levels. Compared with other personal characteristics, professional identity and education level had the strongest impact on the nursing students' level of role stress. This is a new perspective that shows that developing and improving professional identity may prove helpful for nursing students in managing role stress. Copyright © 2016 Elsevier Ltd. All rights reserved.
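The gap between the reported R² and adjusted R² follows from the usual degrees-of-freedom correction, which can be sketched as below. The abstract does not state how many predictors entered the final model, so the value of 10 used here is purely an assumption for illustration; only R² = 0.182 and n = 474 come from the abstract.

```python
def adjusted_r2(r2: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# R^2 and n are taken from the abstract; the predictor count is assumed.
example = adjusted_r2(0.182, 474, 10)
print(f"adjusted R^2 = {example:.3f}")
```

With a predictor count of this order the adjusted value lands close to the reported 16.5%; the small penalty reflects the large sample relative to the number of predictors.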
Tascon, Marcos; Romero, Lílian M; Acquaviva, Agustín; Keunchkarian, Sonia; Castells, Cecilia
2013-06-14
This study focused on an investigation into the experimental quantities inherent in the determination of partition coefficients from gas-liquid chromatographic measurements through the use of capillary columns. We prepared several squalane - (2,6,10,15,19,23-hexamethyltetracosane) - containing columns with very precisely known phase ratios and determined solute retention and hold-up times at 30, 40, 50 and 60°C. We calculated infinite dilution partition coefficients from the slopes of the linear regression of retention factors as a function of the reciprocal of the phase ratio by means of fundamental chromatographic equations. In order to minimize gas-solid and liquid-solid interface contributions to retention, the surface of the capillary inner wall was pretreated to guarantee a uniform coat of stationary phase. The validity of the proposed approach was first tested by estimating the partition coefficients of n-alkanes between n-pentane and n-nonane, for which compounds data from the literature were available. Then partition coefficients of sixteen aliphatic alcohols in squalane were determined at those four temperatures. We deliberately chose these highly challenging systems: alcohols in the reference paraffinic stationary phase. These solutes exhibited adsorption in the gas-liquid interface that contributed to retention. The corresponding adsorption constant values were estimated. We fully discuss here the uncertainties associated with each experimental measurement and how these fundamental determinations can be performed precisely by circumventing the main drawbacks. The proposed strategy is reliable and much simpler than the classical chromatographic method employing packed columns. Copyright © 2013 Elsevier B.V. All rights reserved.
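The slope-based estimate described above can be sketched as follows, assuming the relation k = K/β + c between the retention factor k and the phase ratio β, so that regressing k on 1/β gives the partition coefficient K as the slope. All numbers are synthetic and the helper `fit_line` is a hypothetical name; reading a non-zero intercept c as a sign of interfacial adsorption is an assumption of this sketch, not a statement of the authors' model.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical phase ratios (beta) of several columns for one solute.
phase_ratios = [100.0, 150.0, 200.0, 300.0]
true_K, offset = 250.0, 0.05  # made-up values used to generate k
retention_factors = [true_K / b + offset for b in phase_ratios]

# Regress k on 1/beta: the slope estimates the partition coefficient K.
inv_beta = [1.0 / b for b in phase_ratios]
K_est, intercept = fit_line(inv_beta, retention_factors)
print(f"K ~ {K_est:.1f}, intercept ~ {intercept:.3f}")
```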
Statistical methods for astronomical data with upper limits. II - Correlation and regression
NASA Technical Reports Server (NTRS)
Isobe, T.; Feigelson, E. D.; Nelson, P. I.
1986-01-01
Statistical methods for calculating correlations and regressions in bivariate censored data where the dependent variable can have upper or lower limits are presented. Cox's regression and the generalization of Kendall's rank correlation coefficient provide significant levels of correlations, and the EM algorithm, under the assumption of normally distributed errors, and its nonparametric analog using the Kaplan-Meier estimator, give estimates for the slope of a regression line. Monte Carlo simulations demonstrate that survival analysis is reliable in determining correlations between luminosities at different bands. Survival analysis is applied to CO emission in infrared galaxies, X-ray emission in radio galaxies, H-alpha emission in cooling cluster cores, and radio emission in Seyfert galaxies.
Furmanchuk, Al'ona; Saal, James E; Doak, Jeff W; Olson, Gregory B; Choudhary, Alok; Agrawal, Ankit
2018-02-05
A regression model-based tool is developed for predicting the Seebeck coefficient of crystalline materials in the temperature range from 300 K to 1000 K. The tool accounts for the single crystal versus polycrystalline nature of the compound, the production method, and properties of the constituent elements in the chemical formula. We introduce new descriptive features of crystalline materials relevant for the prediction of the Seebeck coefficient. To address off-stoichiometry in materials, the predictive tool is trained on a mix of stoichiometric and nonstoichiometric materials. The tool is implemented into a web application (http://info.eecs.northwestern.edu/SeebeckCoefficientPredictor) to assist field scientists in the discovery of novel thermoelectric materials. © 2017 Wiley Periodicals, Inc.
Drag coefficients for modeling flow through emergent vegetation in the Florida Everglades
Lee, J.K.; Roig, L.C.; Jenter, H.L.; Visser, H.M.
2004-01-01
Hydraulic data collected in a flume fitted with pans of sawgrass were analyzed to determine the vertically averaged drag coefficient as a function of vegetation characteristics. The drag coefficient is required for modeling flow through emergent vegetation at low Reynolds numbers in the Florida Everglades. Parameters of the vegetation, such as the stem population per unit bed area and the average stem/leaf width, were measured for five fixed vegetation layers. The vertically averaged vegetation parameters for each experiment were then computed by weighted average over the submerged portion of the vegetation. Only laminar flow through emergent vegetation was considered, because this is the dominant flow regime of the inland Everglades. A functional form for the vegetation drag coefficient was determined by linear regression of the logarithmic transforms of measured resistance force and Reynolds number. The coefficients of the drag coefficient function were then determined for the Everglades, using extensive flow and vegetation measurements taken in the field. The Everglades data show that the stem spacing and the Reynolds number are important parameters for the determination of vegetation drag coefficient. © 2004 Elsevier B.V. All rights reserved.
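Regressing the logarithms of two quantities, as described above, amounts to fitting a power law. A minimal sketch follows, assuming a form F = a·Re^b; the function name `loglog_fit` and all data values are hypothetical and do not reproduce the flume measurements or the authors' fitted function.

```python
import math


def loglog_fit(x, y):
    """Fit y = a * x**b by least squares on (log x, log y); return (a, b)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    b = sxy / sxx                 # exponent = slope in log-log space
    a = math.exp(my - b * mx)     # prefactor = exp(intercept)
    return a, b


# Synthetic, noise-free data generated from F = 2.0 * Re**1.5 (made up).
reynolds = [10.0, 20.0, 50.0, 100.0, 200.0]
force = [2.0 * re ** 1.5 for re in reynolds]
a_hat, b_hat = loglog_fit(reynolds, force)
print(f"a ~ {a_hat:.3f}, b ~ {b_hat:.3f}")
```

With real, noisy measurements the fit minimizes error in log space, which weights relative rather than absolute deviations.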
NASA Astrophysics Data System (ADS)
Darmayasa, J. B.; Wahyudin; Mulyana, T.
2018-01-01
Ethnomathematics may be the connecting bridge between culture and technology and arts. Therefore, the exploration of mathematical values that intersect with cultural anthropology should be conducted. One case involving this issue is the construction of the Traditional House of Saka Roras in Bali. Thus, this research aimed to explore the mathematical concepts adopted in the construction of this traditional Bale (house) located in Songan Village, Kintamani, Bali. Specifically, this research also aimed to investigate the selection of the linear regression coefficient for the saka (pillar) in the Bale. This research applied an Embedded Mixed-Method Design. Data were collected by interview, observation and measurement of the pillars of 32 Bale Saka Roras. The results revealed that the relationship between the width and height of the pillars was given by the formula Y = 26.3 + 18.2X, where X acted as the stimulus variable. The coefficient value of 18.2 showed that most preceding architects in Songan Village were more likely to use 19 as the coefficient for the pillar width than other coefficients such as 17, 20 and 21, as mentioned in the book/palm-leaf manuscript entitled Kosala-Kosali. Last but not least, the researchers also found that the pillar width depended on the length of the prospective house owner's index finger.
On marker-based parentage verification via non-linear optimization.
Boerner, Vinzent
2017-06-15
Parentage verification by molecular markers is mainly based on short tandem repeat markers. Single nucleotide polymorphisms (SNPs) as bi-allelic markers have become the markers of choice for genotyping projects. Thus, the subsequent step is to use SNP genotypes for parentage verification as well. Recent developments of algorithms such as evaluating opposing homozygous SNP genotypes have drawbacks, for example the inability of rejecting all animals of a sample of potential parents. This paper describes an algorithm for parentage verification by constrained regression which overcomes the latter limitation and proves to be very fast and accurate even when the number of SNPs is as low as 50. The algorithm was tested on a sample of 14,816 animals with 50, 100 and 500 SNP genotypes randomly selected from 40k genotypes. The samples of putative parents of these animals contained either five random animals, or four random animals and the true sire. Parentage assignment was performed by ranking of regression coefficients, or by setting a minimum threshold for regression coefficients. The assignment quality was evaluated by the power of assignment (P[Formula: see text]) and the power of exclusion (P[Formula: see text]). If the sample of putative parents contained the true sire and parentage was assigned by coefficient ranking, P[Formula: see text] and P[Formula: see text] were both higher than 0.99 for the 500 and 100 SNP genotypes, and higher than 0.98 for the 50 SNP genotypes. When parentage was assigned by a coefficient threshold, P[Formula: see text] was higher than 0.99 regardless of the number of SNPs, but P[Formula: see text] decreased from 0.99 (500 SNPs) to 0.97 (100 SNPs) and 0.92 (50 SNPs). If the sample of putative parents did not contain the true sire and parentage was rejected using a coefficient threshold, the algorithm achieved a P[Formula: see text] of 1 (500 SNPs), 0.99 (100 SNPs) and 0.97 (50 SNPs). 
The algorithm described here is easy to implement, fast and accurate, and is able to assign parentage using genomic marker data with a size as low as 50 SNPs.
Validation of MODIS Aerosol Optical Depth Retrievals over a Tropical Urban Site, Pune, India
NASA Technical Reports Server (NTRS)
More, Sanjay; Kuman, P. Pradeep; Gupta, Pawan; Devara, P. C. S.; Aher, G. R.
2011-01-01
In the present paper, MODIS (Terra and Aqua; level 2, collection 5) derived aerosol optical depths (AODs) are compared with the ground-based measurements obtained from AERONET (level 2.0) and Microtops - II sun-photometer over a tropical urban station, Pune (18 deg 32'N; 73 deg 49'E, 559 m amsl). This is the first ever systematic validation of the MODIS aerosol products over Pune. Analysis of the data indicates that the Terra and Aqua MODIS AOD retrievals at 550 nm have good correlations with the AERONET and Microtops - II sun-photometer AOD measurements. During winter the linear regression correlation coefficients for MODIS products against AERONET measurements are 0.79 for Terra and 0.62 for Aqua; however, for pre-monsoon, the corresponding coefficients are 0.78 and 0.74. Similarly, the linear regression correlation coefficients for Microtops measurements against MODIS products are 0.72 and 0.93 for Terra and Aqua data respectively during winter and are 0.78 and 0.75 during pre-monsoon. On a yearly basis in 2008-2009, correlation coefficients for MODIS products against AERONET measurements are 0.80 and 0.78 for Terra and Aqua respectively, while the corresponding coefficients are 0.70 and 0.73 during 2009-2010. The regressed intercepts with MODIS vs. AERONET are 0.09 for Terra and 0.05 for Aqua during winter, whereas their values are 0.04 and 0.07 during pre-monsoon. However, MODIS AODs are found to underestimate during winter and overestimate during pre-monsoon with respect to AERONET and Microtops measurements, having slopes 0.63 (Terra) and 0.74 (Aqua) during winter and 0.97 (Terra) and 0.94 (Aqua) during pre-monsoon. Wavelength dependency of Single Scattering Albedo (SSA) shows presence of absorbing and scattering aerosol particles. For winter, SSA decreases with wavelength with the values 0.86 +/- 0.03 at 440 nm and 0.82 +/- 0.04 at 1020 nm. In pre-monsoon, it increases with wavelength (SSA is 0.87 +/- 0.02 at 440 nm; and 0.88 +/- 0.04 at 1020 nm).
Li, Dapeng; Yue, Jiawei; Jiang, Lu; Huang, Yonghui; Sun, Jifu; Wu, Yan
2017-04-22
BACKGROUND Degrading enzymes play an important role in the process of disc degeneration. The objective of this study was to investigate the correlation between the expression of high temperature requirement serine protease A1 (HtrA1) in the nucleus pulposus and the T2 value of the nucleus pulposus region in magnetic resonance imaging (MRI). MATERIAL AND METHODS Thirty-six patients who had undergone surgical excision of the nucleus pulposus were examined by MRI before surgery. Pfirrmann grading of the target intervertebral disc was performed according to the sagittal T2-weighted imaging, and the T2 value of the target nucleus pulposus was measured according to the median sagittal T2 mapping. The correlation between the Pfirrmann grade and the T2 value was analyzed. The expression of HtrA1 in the nucleus pulposus was analyzed by RT-PCR and Western blot. The correlation between the expression of HtrA1 and the T2 value was analyzed. RESULTS The T2 value of the nucleus pulposus region was 33.11-167.91 ms, with an average of 86.64±38.73 ms. According to Spearman correlation analysis, there was a rank correlation between T2 value and Pfirrmann grade (P<0.0001), and the correlation coefficient (rs)=-0.93617. There was a linear correlation between the mRNA level of HtrA1 and T2 value in nucleus pulposus tissues (a=3.88, b=-0.019, F=112.63, P<0.0001), normalized regression coefficient=-0.88. There was a linear correlation between the expression level of HtrA1 protein and the T2 value in the nucleus pulposus tissues (a=3.30, b=-0.016, F=93.15, P<0.0001) and normalized regression coefficient=-0.86. CONCLUSIONS The expression of HtrA1 was strongly related to the T2 value, suggesting that HtrA1 plays an important role in the pathological process of intervertebral disc degeneration.
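For a regression with a single predictor, the standardized coefficient equals the Pearson correlation r, and r² can be recovered from the F statistic as F / (F + n - 2). The sketch below checks the reported values on that basis; it assumes all 36 patients entered each regression, which the abstract does not state explicitly, and `r_from_f` is a hypothetical helper name.

```python
import math


def r_from_f(f_stat: float, n_obs: int, slope: float) -> float:
    """Pearson r implied by the F statistic of a one-predictor regression."""
    r_squared = f_stat / (f_stat + (n_obs - 2))
    # r takes the sign of the fitted slope b.
    return math.copysign(math.sqrt(r_squared), slope)


# F and b are the values reported in the abstract; n = 36 is assumed.
r_mrna = r_from_f(112.63, 36, -0.019)
r_protein = r_from_f(93.15, 36, -0.016)
print(round(r_mrna, 2), round(r_protein, 2))  # -0.88 -0.86
```

Both values match the normalized regression coefficients given in the abstract, so the reported F statistics and coefficients are mutually consistent under this assumption.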
Crane, Paul K; Gibbons, Laura E; Jolley, Lance; van Belle, Gerald
2006-11-01
We present an ordinal logistic regression model for identification of items with differential item functioning (DIF) and apply this model to a Mini-Mental State Examination (MMSE) dataset. We employ item response theory ability estimation in our models. Three nested ordinal logistic regression models are applied to each item. Model testing begins with examination of the statistical significance of the interaction term between ability and the group indicator, consistent with nonuniform DIF. Then we turn our attention to the coefficient of the ability term in models with and without the group term. If including the group term has a marked effect on that coefficient, we declare that it has uniform DIF. We examined DIF related to language of test administration in addition to self-reported race, Hispanic ethnicity, age, years of education, and sex. We used PARSCALE for IRT analyses and STATA for ordinal logistic regression approaches. We used an iterative technique for adjusting IRT ability estimates on the basis of DIF findings. Five items were found to have DIF related to language. These same items also had DIF related to other covariates. The ordinal logistic regression approach to DIF detection, when combined with IRT ability estimates, provides a reasonable alternative for DIF detection. There appear to be several items with significant DIF related to language of test administration in the MMSE. More attention needs to be paid to the specific criteria used to determine whether an item has DIF, not just the technique used to identify DIF.
Saloheimo, T; González, S A; Erkkola, M; Milauskas, D M; Meisel, J D; Champagne, C M; Tudor-Locke, C; Sarmiento, O; Katzmarzyk, P T; Fogelholm, M
2015-01-01
Objective: The main aim of this study was to assess the reliability and validity of a food frequency questionnaire with 23 food groups (I-FFQ) among a sample of 9–11-year-old children from three different countries that differ in economic development and income distribution, and to assess differences between country sites. Furthermore, we assessed factors associated with I-FFQ's performance. Methods: This was an ancillary study of the International Study of Childhood Obesity, Lifestyle and the Environment. Reliability (n=321) and validity (n=282) components of this study had the same participants. Participation rates were 95% and 70%, respectively. Participants completed two I-FFQs with a mean interval of 4.9 weeks to assess reliability. A 3-day pre-coded food diary (PFD) was used as the reference method in the validity analyses. Wilcoxon signed-rank tests, intraclass correlation coefficients and cross-classifications were used to assess the reliability of I-FFQ. Spearman correlation coefficients, percentage difference and cross-classifications were used to assess the validity of I-FFQ. A logistic regression model was used to assess the relation of selected variables with the estimate of validity. Analyses based on information in the PFDs were performed to assess how participants interpreted food groups. Results: Reliability correlation coefficients ranged from 0.37 to 0.78 and gross misclassification for all food groups was <5%. Validity correlation coefficients were below 0.5 for 22/23 food groups, and they differed among country sites. For validity, gross misclassification was <5% for 22/23 food groups. Over- or underestimation did not appear for 19/23 food groups. Logistic regression showed that country of participation and parental education were associated (P⩽0.05) with the validity of I-FFQ. Analyses of children's interpretation of food groups suggested that the meaning of most food groups was understood by the children.
Conclusion: I-FFQ is a moderately reliable method and its validity ranged from low to moderate, depending on food group and country site. PMID:27152180
Leptin but not adiponectin is related to type 2 diabetes mellitus in obese adolescents.
Reinehr, Thomas; Woelfle, Joachim; Wiegand, Susanna; Karges, Beate; Meissner, Thomas; Nagl, Katrin; Holl, Reinhard W
2016-06-01
Adipokines have been suggested to be involved in the development of type 2 diabetes mellitus (T2DM). However, studies in humans are controversial and analyses at the onset of disease are scarce. We compared adiponectin and leptin levels between 74 predominately Caucasian adolescents with T2DM and 74 body mass index (BMI)-, age-, and gender-matched controls without T2DM. Adiponectin and leptin were correlated to age, BMI, hemoglobin A1c (HbA1c), blood pressure, and lipids. Adolescents with T2DM showed significantly lower leptin levels as compared with controls (18 ± 12 vs. 37 ± 23 ng/mL, p < 0.001), whereas the adiponectin levels did not differ between the adolescents with and without T2DM (5.0 ± 2.5 vs. 4.9 ± 2.5 µg/mL, p = 0.833). The associations between adiponectin and high-density lipoprotein (HDL) cholesterol (r = 0.42), systolic (r = -0.15), and diastolic blood pressure (r = -0.20) were stronger than the associations of leptin with these parameters (all r < 0.07). In multiple linear regression analysis, leptin was significantly and positively associated with BMI [β-coefficient: 1.3 (95% confidence interval (95% CI): ±0.5), p < 0.001] and female sex [β-coefficient: 9.7 (95% CI: ±6.7), p = 0.005], and negatively with age [β-coefficient: -2.3 (95% CI: ±2.1), p < 0.001] and HbA1c [β-coefficient -3.1 (95% CI: ±2.1), p = 0.011]. Adiponectin was not significantly associated with BMI, HbA1c, age, or gender in multiple linear regression analysis. Because adiponectin levels did not differ between obese adolescents with and without T2DM, hypoadiponectinemia as observed in obesity seems not to be involved in the genesis of T2DM. The relative hypoleptinemia in obese adolescents with T2DM as compared with obese adolescents without T2DM may contribute to the development of T2DM. Future longitudinal studies in humans are necessary to prove this hypothesis. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
NASA Astrophysics Data System (ADS)
Gusriani, N.; Firdaniza
2018-03-01
The existence of outliers in multiple linear regression analysis causes the Gaussian assumption to be violated. If the least squares method is nevertheless applied to such data, it will produce a model that does not represent most of the data. A regression method that is robust against outliers is therefore needed. This paper compares the Minimum Covariance Determinant (MCD) method and the TELBS method on secondary data on the productivity of phytoplankton, which contain outliers. Based on the robust coefficient of determination, the MCD method produces a better model than the TELBS method.
NASA Astrophysics Data System (ADS)
Melesse, Assefa; Hajigholizadeh, Mohammad; Blakey, Tara
2017-04-01
In this study, Landsat 8 and Sea-Viewing Wide Field-of-View Sensor (SeaWIFS) sensors were used to model the spatiotemporal changes of four water quality parameters: Landsat 8 (turbidity, chlorophyll-a (chl-a), total phosphate, and total nitrogen) and Sea-Viewing Wide Field-of-View Sensor (SeaWIFS) (algal blooms). The study was conducted in Florida Bay, south Florida, and model outputs were compared with in-situ observed data. The Landsat 8 based study found that the predictive models to estimate chl-a and turbidity concentrations, developed through the use of stepwise multiple linear regression (MLR), gave high coefficients of determination in dry season (wet season) (R2 = 0.86(0.66) for chl-a and R2 = 0.84(0.63) for turbidity). Total phosphate and TN were estimated using best-fit multiple linear regression models as a function of Landsat TM and OLI,127 and ground data and showed a high coefficient of determination in dry season (wet season) (R2 = 0.74(0.69) for total phosphate and R2 = 0.82(0.82) for TN). Similarly, the ability of SeaWIFS for chl-a retrieval from optically shallow coastal waters by applying algorithms specific to the pixels' benthic class was evaluated. Benthic class was determined through satellite image-based classification methods. It was found that the benthic class based chl-a modeling algorithm was better than the existing regionally-tuned approach. Evaluation of the residuals indicated the potential for further improvement to chl-a estimation through finer characterization of benthic environments. Key words: Landsat, SeaWIFS, water quality, Florida Bay, Chl-a, turbidity
NASA Technical Reports Server (NTRS)
Khaiyer, Mandana M.; Doelling, David R.; Chan, Pui K.; Nordeen, Michele L.; Palikonda, Rabindra; Yi, Yuhong; Minnis, Patrick
2006-01-01
Satellites can provide global coverage of a number of climatically important radiative parameters, including broadband (BB) shortwave (SW) and longwave (LW) fluxes at the top of the atmosphere (TOA) and surface. These parameters can be estimated from narrowband (NB) Geostationary Operational Environmental Satellite (GOES) data, but their accuracy is highly dependent on the validity of the narrowband-to-broadband (NB-BB) conversion formulas that are used to convert the NB fluxes to broadband values. The formula coefficients have historically been derived by regressing matched polar-orbiting satellite BB fluxes or radiances with their NB counterparts from GOES (e.g., Minnis et al., 1984). More recently, the coefficients have been based on matched Earth Radiation Budget Experiment (ERBE) and GOES-6 data (Minnis and Smith, 1998). The Clouds and the Earth's Radiant Energy System (CERES; see Wielicki et al., 1998) project has recently developed much improved Angular Distribution Models (ADM; Loeb et al., 2003) and has higher resolution data compared to ERBE. A limited set of coefficients was also derived from matched GOES-8 and CERES data taken on the Tropical Rainfall Measuring Mission (TRMM) satellite (Chakrapani et al., 2003; Doelling et al., 2003). The NB-BB coefficients derived from CERES and the GOES suite should yield more accurate BB fluxes than from ERBE, but are limited spatially and seasonally. With CERES data taken from Terra and Aqua, it is now possible to derive more reliable NB-BB coefficients for any given area. Better TOA fluxes should translate to improved surface radiation fluxes derived using various algorithms. As part of an ongoing effort to provide accurate BB flux estimates for the Atmospheric Radiation Measurement (ARM) Program, this paper documents the derivation of new NB-BB coefficients for the ARM Southern Great Plains (SGP) domain and for the Darwin region of the Tropical Western Pacific (DTWP) domain.
Using a Grocery List Is Associated With a Healthier Diet and Lower BMI Among Very High-Risk Adults.
Dubowitz, Tamara; Cohen, Deborah A; Huang, Christina Y; Beckman, Robin A; Collins, Rebecca L
2015-01-01
Examine whether use of a grocery list is associated with healthier diet and weight among food desert residents. Cross-sectional analysis of in-person interview data from randomly selected household food shoppers in 2 low-income, primarily African American urban neighborhoods in Pittsburgh, PA with limited access to healthy foods. Multivariate ordinary least-squares regressions conducted among 1,372 participants and controlling for sociodemographic factors and other potential confounding variables indicated that although most of the sample (78%) was overweight or obese, consistently using a list was associated with lower body mass index (based on measured height and weight) (adjusted multivariate coefficient = 0.095) and higher dietary quality (based on the Healthy Eating Index-2005) (adjusted multivariate coefficient = 0.103) (P < .05). Shopping with a list may be a useful tool for low-income individuals to improve diet or decrease body mass index. Copyright © 2015 Society for Nutrition Education and Behavior. Published by Elsevier Inc. All rights reserved.
The Extent and Prediction of Heavy Metal Pollution in Soils of Shahrood and Damghan, Iran.
Sakizadeh, Mohamad; Mirzaei, Rouhollah; Ghorbani, Hadi
2015-12-01
The levels of 12 heavy metals (Ag, Ba, Be, Cd, Co, Cr, Cu, Ni, Pb, Tl, V, Zn) were considered in 229 soil samples in Semnan Province, Iran. To discriminate between natural and anthropogenic inputs of heavy metals, factor analysis was used. Seven factors accounting for 90.5 % of the total variance were extracted. The mining and agricultural activities along with geogenic sources have been attributed as the main causes of the levels of heavy metals in the study area. The partial least squares regression was utilized to predict the level of soil pollution index (SPI) considering the concentrations of 12 heavy metals. The eigenvectors from the first three PLS represented more than 98 % of the overall variance. The correlation coefficient between the observed and predicted SPI was 0.99 indicating the high efficiency of this method. The resultant coefficient of determination for three PLS components was 0.984 confirming the predictive ability of this method.
Standiford, H C; Bernstein, D; Nipper, H C; Caplan, E; Tatem, B; Hall, J S; Reynolds, J
1981-01-01
Gentamicin levels were determined in 100 serum specimens by a new latex agglutination inhibition card test, a radioimmunoassay (RIA), and a bioassay. Correlation coefficients determined by linear regression analysis demonstrated that the levels obtained by the latex agglutination inhibition card test had a high degree of correlation with the RIA and could be performed much faster and more economically when processing small numbers of specimens. The bioassay had a slightly lower degree of correlation with both the RIA and the latex test and was adversely influenced by concurrently administered antibiotics which could not be eliminated by beta-lactamase. When measuring gentamicin concentrations above 2 micrograms/ml, the coefficient of variation was less than 14% for the latex agglutination assay compared with 15% for the bioassay and 12% for RIA. The latex agglutination inhibition card test is a rapid, accurate, specific, and reproducible method for monitoring gentamicin levels in patients and is particularly applicable for laboratories processing small numbers of specimens. PMID:7247384
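The coefficient of variation quoted above is the standard deviation expressed as a percentage of the mean. A minimal sketch follows; the replicate readings are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the abstract does not say which form was used.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, in percent."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical replicate gentamicin readings (micrograms/ml), one specimen.
replicates = [4.0, 4.4, 3.8, 4.2, 4.1]
cv = coefficient_of_variation(replicates)
print(f"CV = {cv:.1f}%")
```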
Senior, Samir A; Madbouly, Magdy D; El massry, Abdel-Moneim
2011-09-01
Quantum chemical and topological descriptors of some organophosphorus compounds (OP) were correlated with their dermal toxicity, LD(50). The quantum chemical parameters were obtained using B3LYP/LANL2DZdp-ECP optimization. Using linear regression analysis, equations were derived to calculate the theoretical LD(50) of the studied compounds. The inclusion of quantum parameters, having both charge indices and topological indices, affects the toxicity of the studied compounds resulting in high correlation coefficient factors for the obtained equations. Two of the four newly proposed descriptors give higher correlation coefficients, namely the Heteroatom Corrected Extended Connectivity Randic index ((1)X(HCEC)) and the Density Randic index ((1)X(Den)). The obtained linear equations were applied to predict the toxicity of some related structures. It was found that the sulfur atoms in these compounds must be replaced by oxygen atoms to achieve improved toxicity. Copyright © 2011 Elsevier Ltd. All rights reserved.
Estimation of old field ecosystem biomass using low altitude imagery
NASA Technical Reports Server (NTRS)
Nor, S. M.; Safir, G.; Burton, T. M.; Hook, J. E.; Schultink, G.
1977-01-01
Color-infrared photography was used to evaluate the biomass of experimental plots in an old-field ecosystem that was treated with different levels of waste water from a sewage treatment facility. Cibachrome prints at a scale of approximately 1:1,600 produced from 35 mm color infrared slides were used to analyze density patterns using prepared tonal density scales and multicell grids registered to ground panels shown on the photograph. Correlations between mean tonal density and harvest biomass data gave consistently high coefficients ranging from 0.530 to 0.896 at the 0.001 significance level. Corresponding multiple regression analysis resulted in higher correlation coefficients. The results indicate that aerial infrared photography can be used to estimate standing crop biomass on waste water irrigated old field ecosystems. Combined with minimal ground truth data, this technique could enable managers of waste water irrigation projects to precisely time harvest of such systems for maximal removal of nutrients in harvested biomass.
NASA Astrophysics Data System (ADS)
Han, Hao; Zhang, Hao; Wei, Xinzhou; Moore, William; Liang, Zhengrong
2016-03-01
In this paper, we proposed a low-dose computed tomography (LdCT) image reconstruction method with the help of prior knowledge learning from previous high-quality or normal-dose CT (NdCT) scans. The well-established statistical penalized weighted least squares (PWLS) algorithm was adopted for image reconstruction, where the penalty term was formulated by a texture-based Gaussian Markov random field (gMRF) model. The NdCT scan was firstly segmented into different tissue types by a feature vector quantization (FVQ) approach. Then for each tissue type, a set of tissue-specific coefficients for the gMRF penalty was statistically learnt from the NdCT image via multiple-linear regression analysis. We also proposed a scheme to adaptively select the order of gMRF model for coefficients prediction. The tissue-specific gMRF patterns learnt from the NdCT image were finally used to form an adaptive MRF penalty for the PWLS reconstruction of LdCT image. The proposed texture-adaptive PWLS image reconstruction algorithm was shown to be more effective to preserve image textures than the conventional PWLS image reconstruction algorithm, and we further demonstrated the gain of high-order MRF modeling for texture-preserved LdCT PWLS image reconstruction.
Gierlinger, Notburga; Luss, Saskia; König, Christian; Konnerth, Johannes; Eder, Michaela; Fratzl, Peter
2010-01-01
The functional characteristics of plant cell walls depend on the composition of the cell wall polymers, as well as on their highly ordered architecture at scales from a few nanometres to several microns. Raman spectra of wood acquired with linear polarized laser light include information about polymer composition as well as the alignment of cellulose microfibrils with respect to the fibre axis (microfibril angle). By changing the laser polarization direction in 3° steps, the dependency between cellulose and laser orientation direction was investigated. Orientation-dependent changes of band height ratios and spectra were described by quadratic linear regression and partial least square regressions, respectively. Using the models and regressions with high coefficients of determination (R2 > 0.99) microfibril orientation was predicted in the S1 and S2 layers distinguished by the Raman imaging approach in cross-sections of spruce normal, opposite, and compression wood. The determined microfibril angle (MFA) in the different S2 layers ranged from 0° to 49.9° and was in coincidence with X-ray diffraction determination. With the prerequisite of geometric sample and laser alignment, exact MFA prediction can complete the picture of the chemical cell wall design gained by the Raman imaging approach at the micron level in all plant tissues. PMID:20007198
Lee, Soo Yee; Mediani, Ahmed; Maulidiani, Maulidiani; Khatib, Alfi; Ismail, Intan Safinar; Zawawi, Norhasnida; Abas, Faridah
2018-01-01
Neptunia oleracea is a plant consumed as a vegetable and which has been used as a folk remedy for several diseases. Herein, two regression models (partial least squares, PLS; and random forest, RF) in a metabolomics approach were compared and applied to the evaluation of the relationship between phenolics and bioactivities of N. oleracea. In addition, the effects of different extraction conditions on the phenolic constituents were assessed by pattern recognition analysis. Comparison of the PLS and RF showed that RF exhibited poorer generalization and hence poorer predictive performance. Both the regression coefficient of PLS and the variable importance of RF revealed that quercetin and kaempferol derivatives, caffeic acid and vitexin-2-O-rhamnoside were significant towards the tested bioactivities. Furthermore, principal component analysis (PCA) and partial least squares-discriminant analysis (PLS-DA) results showed that sonication and absolute ethanol are the preferable extraction method and ethanol ratio, respectively, to produce N. oleracea extracts with high phenolic levels and therefore high DPPH scavenging and α-glucosidase inhibitory activities. Both PLS and RF are useful regression models in metabolomics studies. This work provides insight into the performance of different multivariate data analysis tools and the effects of different extraction conditions on the extraction of desired phenolics from plants. © 2017 Society of Chemical Industry.
NASA Astrophysics Data System (ADS)
Baricci, Andrea; Casalegno, Andrea
2016-09-01
The limiting current density of the oxygen reduction reaction in polymer electrolyte fuel cells is determined by several mass transport resistances that lower the concentration of oxygen at the catalyst active site. Among them, diffusion across porous media plays a significant role. Despite the extensive experimental activity documented in the PEMFC literature, only a few efforts have been dedicated to the measurement of the effective transport properties in porous layers. In the present work, a methodology for ex situ measurement of the effective diffusion coefficient and Knudsen radius of porous layers for polymer electrolyte fuel cells (gas diffusion layer, micro porous layer and catalyst layer) is described and applied to state-of-the-art materials for high temperature polymer fuel cells. Regression of the measured quantities by means of a quasi 2D physical model is performed to quantify the Knudsen effect, which is reported to account, respectively, for 30% and 50% of the mass transport resistance in the micro porous layer and catalyst layer. On the other hand, the model reveals that the pressure gradient resulting from permeation in porous layers of high temperature polymer fuel cells has a negligible effect on oxygen concentration in relevant operating conditions.
Bhargava, Dinesh; Karthikeyan, C; Moorthy, N S H N; Trivedi, Piyush
2009-09-01
A QSAR study was carried out for a series of piperazinyl phenylalanine derivatives exhibiting VLA-4/VCAM-1 inhibitory activity to find out the structural features responsible for the biological activity. The QSAR study was carried out on V-life Molecular Design Suite software, and the best QSAR model, derived by the partial least square (forward) regression method, explained 85.67% of the variation in biological activity. The statistically significant model with a high correlation coefficient (r2=0.85) was selected for further study, and the resulting validation parameters of the model, the cross-validated squared correlation coefficients (q2=0.76 and pred_r2=0.42), show the model has good predictive ability. The model showed that the parameters SaaNEindex, SsClcount, slogP, and 4PathCount are highly correlated with VLA-4/VCAM-1 inhibitory activity of piperazinyl phenylalanine derivatives. The result of the study suggests that the chlorine atoms in the molecule and fourth order fragmentation patterns in the molecular skeleton favour VLA-4/VCAM-1 inhibition shown by the title compounds, whereas lipophilicity and nitrogen bonded to aromatic bond are not conducive for VLA-4/VCAM-1 inhibitory activity.
NASA Astrophysics Data System (ADS)
Ari, I. R. D.; Hasyim, A. W.; Pratama, B. A.; Helmy, M.; Sheilla, M. N.
2017-06-01
Poverty is a problem that requires attention from the government, especially in developing countries such as Indonesia. This research takes place in Kasembon District because 53.19% of families in the region live below the poverty line. The purpose of this research is to measure poverty based on 3 poverty indicators published by the World Bank and 1 multidimensional poverty index. Furthermore, this research investigates the relationship between poverty and social and infrastructure factors in Kasembon District. This study uses social network analysis, hot spot analysis, and regression analysis with ordinary least squares. The poverty indicators show that Pondokagung Village has the highest poverty rate compared to the other regions. Results from the regression model indicate that social and infrastructure factors affect poverty in Kasembon District. The social parameter that affects poverty is density. The infrastructure parameter that affects poverty is the length of paved road. The coefficient of density is the largest in the model. Therefore it can be concluded that social factors offer more opportunity to reduce poverty rates in Kasembon District. In the local model, the paved road coefficient for each village does not differ much from that of the global model.
Guo, Changning; Doub, William H; Kauffman, John F
2010-08-01
Monte Carlo simulations were applied to investigate the propagation of uncertainty in both input variables and response measurements on model prediction for nasal spray product performance design of experiment (DOE) models in the first part of this study, with an initial assumption that the models perfectly represent the relationship between input variables and the measured responses. In this article, we discard the initial assumption and extend the Monte Carlo simulation study to examine the influence of both input variable variation and product performance measurement variation on the uncertainty in DOE model coefficients. The Monte Carlo simulations presented in this article illustrate the importance of careful error propagation during product performance modeling. Our results show that the error estimates based on Monte Carlo simulation result in smaller model coefficient standard deviations than those from regression methods. This suggests that the estimated standard deviations from regression may overestimate the uncertainties in the model coefficients. Monte Carlo simulations provide a simple software solution to understand the propagation of uncertainty in complex DOE models so that design space can be specified with statistically meaningful confidence levels. (c) 2010 Wiley-Liss, Inc. and the American Pharmacists Association
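The idea of propagating input and response uncertainty into regression coefficients can be illustrated with a minimal sketch. It is not the authors' nasal spray DOE model: the straight-line model, the nominal data, the noise levels `sd_x` and `sd_y`, and the function names are all hypothetical and chosen only to show the mechanics of refitting the model on repeatedly perturbed data.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def monte_carlo_coefficients(x_nominal, y_nominal, sd_x, sd_y,
                             n_trials=2000, seed=0):
    """Perturb inputs and responses with Gaussian noise, refit each time,
    and summarize the spread of the fitted coefficients."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    intercepts, slopes = [], []
    for _ in range(n_trials):
        xs = [x + rng.gauss(0.0, sd_x) for x in x_nominal]
        ys = [y + rng.gauss(0.0, sd_y) for y in y_nominal]
        b0, b1 = fit_line(xs, ys)
        intercepts.append(b0)
        slopes.append(b1)
    return {
        "intercept": (statistics.fmean(intercepts), statistics.stdev(intercepts)),
        "slope": (statistics.fmean(slopes), statistics.stdev(slopes)),
    }


# Hypothetical nominal design: five input levels, true model y = 2 + 3x.
x_nominal = [1.0, 2.0, 3.0, 4.0, 5.0]
y_nominal = [2.0 + 3.0 * x for x in x_nominal]
result = monte_carlo_coefficients(x_nominal, y_nominal, sd_x=0.05, sd_y=0.1)
print(result)
```

The standard deviations returned for each coefficient are the simulation-based uncertainty estimates; in a real study they would be compared with the standard errors reported by the regression software.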
Wind tunnel test of Teledyne Geotech model 1564B cup anemometer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Parker, M.J.; Addis, R.P.
1991-04-04
The Department of Energy (DOE) Environment, Safety and Health Compliance Assessment (Tiger Team) of the Savannah River Site (SRS) questioned the method by which wind speed sensors (cup anemometers) are calibrated by the Environmental Technology Section (ETS). The Tiger Team member was concerned that calibration data was generated by running the wind tunnel to only 26 miles per hour (mph) when speeds exceeding 50 mph are readily obtainable. A wind tunnel experiment was conducted and confirmed the validity of the practice. Wind speeds common to SRS (6 mph) were predicted more accurately by 0-25 mph regression equations than 0-50 mph regression equations. Higher wind speeds were slightly overpredicted by the 0-25 mph regression equations when compared to 0-50 mph regression equations. However, the benefit of more accurate lower wind speed predictions outweighs the benefit of slightly better high (extreme) wind speed predictions. Therefore, it is concluded that 0-25 mph regression equations should continue to be utilized by ETS at SRS. During the Department of Energy Tiger Team audit, concerns were raised about the calibration of SRS cup anemometers. Wind speed is measured by ETS with Teledyne Geotech model 1564B cup anemometers, which are calibrated in the ETS wind tunnel. Linear regression lines are fitted to data points of tunnel speed versus anemometer output voltages up to 25 mph. The regression coefficients are then implemented into the data acquisition computer software when an instrument is installed in the field. The concern raised was that since the wind tunnel at SRS is able to generate a maximum wind speed higher than 25 mph, errors may be introduced in not using the full range of the wind tunnel.
Wind tunnel test of Teledyne Geotech model 1564B cup anemometer
NASA Astrophysics Data System (ADS)
Parker, M. J.; Addis, R. P.
1991-04-01
The Department of Energy (DOE) Environment, Safety, and Health Compliance Assessment (Tiger Team) of the Savannah River Site (SRS) questioned the method by which wind speed sensors (cup anemometers) are calibrated by the Environmental Technology Section (ETS). The Tiger Team member was concerned that calibration data was generated by running the wind tunnel to only 26 miles per hour (mph) when speeds exceeding 50 mph are readily obtainable. A wind tunnel experiment was conducted and confirmed the validity of the practice. Wind speeds common to SRS (6 mph) were predicted more accurately by 0-25 mph regression equations than 0-50 mph regression equations. Higher wind speeds were slightly overpredicted by the 0-25 mph regression equations when compared to 0-50 mph regression equations. However, the benefit of more accurate lower wind speed predictions outweighs the benefit of slightly better high (extreme) wind speed predictions. Therefore, it is concluded that 0-25 mph regression equations should continue to be utilized by ETS at SRS. During the Department of Energy Tiger Team audit, concerns were raised about the calibration of SRS cup anemometers. Wind speed is measured by ETS with Teledyne Geotech model 1564B cup anemometers, which are calibrated in the ETS wind tunnel. Linear regression lines are fitted to data points of tunnel speed versus anemometer output voltages up to 25 mph. The regression coefficients are then implemented into the data acquisition computer software when an instrument is installed in the field. The concern raised was that since the wind tunnel at SRS is able to generate a maximum wind speed higher than 25 mph, errors may be introduced in not using the full range of the wind tunnel.
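Why a calibration line fitted over a restricted range can predict low wind speeds better than one fitted over the full range can be sketched with synthetic data. Everything below is an assumption made for illustration: the slightly nonlinear response in `anemometer_voltage` is invented and is not Teledyne Geotech or SRS data, and the function names are hypothetical. The sketch only shows that a straight line fitted to a mildly curved response is most accurate inside the range it was fitted on.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def anemometer_voltage(speed_mph):
    """Hypothetical, slightly nonlinear sensor response (not measured data)."""
    return 0.1 * speed_mph - 0.0002 * speed_mph ** 2


# Synthetic wind tunnel run: 0 to 50 mph in 5 mph steps.
tunnel_speeds = [float(s) for s in range(0, 55, 5)]


def calibrate(max_speed_mph):
    """Regress tunnel speed on output voltage using points up to max_speed_mph."""
    speeds = [s for s in tunnel_speeds if s <= max_speed_mph]
    volts = [anemometer_voltage(s) for s in speeds]
    return fit_line(volts, speeds)


def abs_error(calibration, true_speed_mph):
    """Absolute error (mph) of the calibrated prediction at a given true speed."""
    b0, b1 = calibration
    predicted = b0 + b1 * anemometer_voltage(true_speed_mph)
    return abs(predicted - true_speed_mph)


cal_25 = calibrate(25.0)  # restricted-range calibration
cal_50 = calibrate(50.0)  # full-range calibration

for speed in (6.0, 45.0):
    print(speed, round(abs_error(cal_25, speed), 3), round(abs_error(cal_50, speed), 3))
```

With this invented response, the restricted-range line has the smaller error at 6 mph and the full-range line has the smaller error at 45 mph, which mirrors the trade-off described in the abstract without reproducing its numbers.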
An Excellent Pilot Model for the Korean Air Force.
1988-12-01
...undergraduate pilots in the Korean Air Force. ...Squares (OLS) method. Table 22. RESULTS OF THE REGRESSION MODEL. Variables; Coefficient; Prob>|t|; Beta Coefficient. Intercept 516. SS7 (56.2S6) APT
Asano, Elio Fernando; Rasera, Irineu; Shiraga, Elisabete Cristina
2012-12-01
This is an exploratory analysis of potential variables associated with open Roux-en-Y gastric bypass (RYGB) surgery hospitalization resource use pattern. Cross-sectional study based on an administrative database (DATASUS) records. Inclusion criteria were adult patients undergoing RYGB between Jan/2008 and Jun/2011. Dependent variables were length of stay (LoS) and ICU need. Independent variables were: gender, age, region, hospital volume, surgery at certified center of excellence (CoE) by the Surgical Review Corporation (SRC), teaching hospital, and year of hospitalization. Univariate and multivariate analysis (logistic regression for ICU need and linear regression for length of stay) were performed. Data from 13,069 surgeries were analyzed. In crude analysis, hospital volume was the most impactful variable associated with log-transformed LoS (1.312 ± 0.302 high volume vs. 1.670 ± 0.581 low volume, p < 0.001), whereas for ICU need it was certified CoE (odds ratio (OR), 0.016; 95% confidence interval (CI), 0.010-0.026). After adjustment by logistic regression, certified CoE remained as the strongest predictor of ICU need (OR, 0.011; 95% CI, 0.007-0.018), followed by hospital volume (OR, 3.096; 95% CI, 2.861-3.350). Age group, male gender, and teaching hospital were also significantly associated (p < 0.001). For log-transformed LoS, final model includes hospital volume (coefficient, -0.223; 95% CI, -0.250 to -0.196) and teaching hospital (coefficient, 0.375; 95% CI, 0.351-0.398). Region of Brazil was not associated with any of the outcomes. High-volume hospital was the strongest predictor for shorter LoS, whereas SRC certification was the strongest predictor of lower ICU need. Public health policies targeting an increase of efficiency and patient access to the procedure should take into account these results.
Monitoring Energy Balance in Breast Cancer Survivors Using a Mobile App: Reliability Study
Lozano-Lozano, Mario; Galiano-Castillo, Noelia; Martín-Martín, Lydia; Pace-Bedetti, Nicolás; Fernández-Lao, Carolina; Cantarero-Villanueva, Irene
2018-01-01
Background The majority of breast cancer survivors do not meet recommendations in terms of diet and physical activity. To address this problem, we developed a mobile health (mHealth) app for assessing and monitoring healthy lifestyles in breast cancer survivors, called the Energy Balance on Cancer (BENECA) mHealth system. The BENECA mHealth system is a novel and interactive mHealth app, which allows breast cancer survivors to engage themselves in their energy balance monitoring. BENECA was designed to facilitate adherence to healthy lifestyles in an easy and intuitive way. Objective The objective of the study was to assess the concurrent validity and test-retest reliability between the BENECA mHealth system and the gold standard assessment methods for diet and physical activity. Methods A reliability study was conducted with 20 breast cancer survivors. In the study, tri-axial accelerometers (ActiGraphGT3X+) were used as gold standard for 8 consecutive days, in addition to 2, 24-hour dietary recalls, 4 dietary records, and sociodemographic questionnaires. Two-way random effect intraclass correlation coefficients, a linear regression-analysis, and a Passing-Bablok regression were calculated. Results The reliability estimates were very high for all variables (alpha≥.90). The lowest reliability was found in fruit and vegetable intakes (alpha=.94). The reliability between the accelerometer and the dietary assessment instruments against the BENECA system was very high (intraclass correlation coefficient=.90). We found a mean match rate of 93.51% between instruments and a mean phantom rate of 3.35%. The Passing-Bablok regression analysis did not show considerable bias in fat percentage, portions of fruits and vegetables, or minutes of moderate to vigorous physical activity. Conclusions The BENECA mHealth app could be a new tool to measure energy balance in breast cancer survivors in a reliable and simple way. 
Our results support the use of this technology not only to encourage changes in breast cancer survivors' lifestyles, but also to remotely monitor energy balance. Trial Registration ClinicalTrials.gov NCT02817724; https://clinicaltrials.gov/ct2/show/NCT02817724 (Archived by WebCite at http://www.webcitation.org/6xVY1buCc) PMID:29588273
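For readers unfamiliar with Passing-Bablok regression, the method-comparison technique named in the abstract, the point estimates can be sketched as follows. This is a simplified illustration of the standard shifted-median procedure, without the confidence intervals or linearity test a real analysis would report; the function name and the toy data are hypothetical and unrelated to the BENECA measurements.

```python
import statistics


def passing_bablok(xs, ys):
    """Point estimates (intercept, slope) of Passing-Bablok regression.

    Assumes the two methods are positively related, as in method-comparison
    studies; confidence intervals are not computed in this sketch.
    """
    slopes = []
    for i in range(len(xs) - 1):
        for j in range(i + 1, len(xs)):
            dx = xs[j] - xs[i]
            dy = ys[j] - ys[i]
            if dx == 0 and dy == 0:
                continue  # identical points carry no slope information
            if dx == 0:
                slope = float("inf") if dy > 0 else float("-inf")
            else:
                slope = dy / dx
            if slope == -1:
                continue  # slopes of exactly -1 are excluded by the method
            slopes.append(slope)
    slopes.sort()
    n = len(slopes)
    k = sum(1 for s in slopes if s < -1)  # offset that shifts the median
    if n % 2 == 1:
        slope = slopes[(n - 1) // 2 + k]
    else:
        slope = 0.5 * (slopes[n // 2 - 1 + k] + slopes[n // 2 + k])
    intercept = statistics.median(y - slope * x for x, y in zip(xs, ys))
    return intercept, slope


# Toy comparison: two methods agree except for one outlying pair.
reference = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
candidate = [1.0, 2.0, 3.0, 4.0, 5.0, 20.0]
print(passing_bablok(reference, candidate))
```

Because the slope is a median of pairwise slopes, the single outlying pair does not pull the fitted line away from the identity line, which is why the method is popular for comparing measurement instruments.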
Inter-annual variability and long term predictability of exchanges through the Strait of Gibraltar
NASA Astrophysics Data System (ADS)
Boutov, Dmitri; Peliz, Álvaro; Miranda, Pedro M. A.; Soares, Pedro M. M.; Cardoso, Rita M.; Prieto, Laura; Ruiz, Javier; García-Lafuente, Jesus
2014-03-01
Inter-annual variability of calculated barotropic (netflow) and simulated baroclinic (inflow and outflow) exchanges through the Strait of Gibraltar is analyzed and their response to the main modes of atmospheric variability is investigated. Time series of the outflow obtained by high resolution simulations and estimated from in-situ Acoustic Doppler Current Profiler (ADCP) current measurements are compared. The time coefficients (TC) of the leading empirical orthogonal function (EOF) modes that describe zonal atmospheric circulation in the vicinity of the Strait (1st and 3rd of Sea-Level Pressure (SLP) and 1st of the wind) show significant covariance with the inflow and outflow. Based on these analyses, a regression model between these SLP TCs and outflow of the Mediterranean Water was developed. This regression outflow time series was compared with estimates based on current meter observations and the predictability and reconstruction of past exchange variability based on atmospheric pressure fields are discussed. The simple regression model seems to reproduce the outflow evolution fairly reasonably, with the exception of the year 2008, which is apparently anomalous without available physical explanation yet. The exchange time series show a reduced inter-annual variability (less than 1%, 2.6% and 3.1% of total 2-day variability, for netflow, inflow and outflow, respectively). From a statistical point of view no clear long-term tendencies were revealed. Anomalously high baroclinic fluxes are reported for the years of 2000-2001 that are coincident with strong impact on the Alboran Sea ecosystem. The origin of the anomalous flow is associated with a strong negative anomaly (~ - 9 hPa) in atmospheric pressure fields settled north of Iberian Peninsula and extending over the central Atlantic, favoring an increased zonal circulation in winter 2000/2001. These low pressure fields forced intense and durable westerly winds in the Gulf of Cadiz-Alboran system. 
The signal of this anomaly is also seen in time coefficients of the most significant EOF modes. The predictability of the exchanges for future climate is discussed.
Meteorological adjustment of yearly mean values for air pollutant concentration comparison
NASA Technical Reports Server (NTRS)
Sidik, S. M.; Neustadter, H. E.
1976-01-01
Using multiple linear regression analysis, models which estimate mean concentrations of Total Suspended Particulate (TSP), sulfur dioxide, and nitrogen dioxide as a function of several meteorologic variables, two rough economic indicators, and a simple trend in time are studied. Meteorologic data were obtained and do not include inversion heights. The goodness of fit of the estimated models is partially reflected by the squared coefficient of multiple correlation which indicates that, at the various sampling stations, the models accounted for about 23 to 47 percent of the total variance of the observed TSP concentrations. If the resulting model equations are used in place of simple overall means of the observed concentrations, there is about a 20 percent improvement in either: (1) predicting mean concentrations for specified meteorological conditions; or (2) adjusting successive yearly averages to allow for comparisons devoid of meteorological effects. An application to source identification is presented using regression coefficients of wind velocity predictor variables.
Homogenization Issues in the Combustion of Heterogeneous Solid Propellants
NASA Technical Reports Server (NTRS)
Chen, M.; Buckmaster, J.; Jackson, T. L.; Massa, L.
2002-01-01
We examine random packs of discs or spheres, models for ammonium-perchlorate-in-binder propellants, and discuss their average properties. An analytical strategy is described for calculating the mean or effective heat conduction coefficient in terms of the heat conduction coefficients of the individual components, and the results are verified by comparison with those of direct numerical simulations (dns) for both 2-D (disc) and 3-D (sphere) packs across which a temperature difference is applied. Similarly, when the surface regression speed of each component is related to the surface temperature via a simple Arrhenius law, an analytical strategy is developed for calculating an effective Arrhenius law for the combination, and these results are verified using dns in which a uniform heat flux is applied to the pack surface, causing it to regress. These results are needed for homogenization strategies necessary for fully integrated 2-D or 3-D simulations of heterogeneous propellant combustion.
Prediction of kinase-inhibitor binding affinity using energetic parameters
Usha, Singaravelu; Selvaraj, Samuel
2016-01-01
The combination of physicochemical properties and energetic parameters derived from protein-ligand complexes plays a vital role in determining the biological activity of a molecule. In the present work, protein-ligand interaction energy along with logP values was used to predict the experimental log (IC50) values of 25 different kinase-inhibitors using multiple regression, which gave a correlation coefficient of 0.93. The regression equation obtained was tested on 93 kinase-inhibitor complexes and an average deviation of 0.92 from the experimental log IC50 values was shown. The same set of descriptors was used to predict binding affinities for a test set of five individual kinase families, with correlation values > 0.9. We show that the protein-ligand interaction energies and partition coefficient values form the major deterministic factors for binding affinity of the ligand for its receptor. PMID:28149052
Three-parameter modeling of the soil sorption of acetanilide and triazine herbicide derivatives.
Freitas, Mirlaine R; Matias, Stella V B G; Macedo, Renato L G; Freitas, Matheus P; Venturin, Nelson
2014-02-01
Herbicides have widely variable toxicity and many of them are persistent soil contaminants. The acetanilide and triazine families of herbicides are widely used, but interest in developing new herbicides is rising, both to increase effectiveness and to diminish environmental hazard. The environmental risk of new herbicides can be assessed by estimating their soil sorption (logKoc), which is usually correlated to the octanol/water partition coefficient (logKow). However, earlier findings have shown that this correlation is not valid for some acetanilide and triazine herbicides. Thus, easily accessible quantitative structure-property relationship models are required to predict logKoc of analogues of these compounds. Octanol/water partition coefficient, molecular weight and volume were calculated and then regressed against logKoc for two series of acetanilide and triazine herbicides using multiple linear regression, resulting in predictive and validated models.
Regression analysis of sparse asynchronous longitudinal data.
Cao, Hongyuan; Zeng, Donglin; Fine, Jason P
2015-09-01
We consider estimation of regression models for sparse asynchronous longitudinal observations, where time-dependent responses and covariates are observed intermittently within subjects. Unlike with synchronous data, where the response and covariates are observed at the same time point, with asynchronous data, the observation times are mismatched. Simple kernel-weighted estimating equations are proposed for generalized linear models with either time invariant or time-dependent coefficients under smoothness assumptions for the covariate processes which are similar to those for synchronous data. For models with either time invariant or time-dependent coefficients, the estimators are consistent and asymptotically normal but converge at slower rates than those achieved with synchronous data. Simulation studies evidence that the methods perform well with realistic sample sizes and may be superior to a naive application of methods for synchronous data based on an ad hoc last value carried forward approach. The practical utility of the methods is illustrated on data from a study on human immunodeficiency virus.
Advanced Statistical Analyses to Reduce Inconsistency of Bond Strength Data.
Minamino, T; Mine, A; Shintani, A; Higashi, M; Kawaguchi-Uemura, A; Kabetani, T; Hagino, R; Imai, D; Tajiri, Y; Matsumoto, M; Yatani, H
2017-11-01
This study was designed to clarify the interrelationship of factors that affect the value of microtensile bond strength (µTBS), focusing on nondestructive testing by which information of the specimens can be stored and quantified. µTBS test specimens were prepared from 10 noncarious human molars. Six factors of µTBS test specimens were evaluated: presence of voids at the interface, X-ray absorption coefficient of resin, X-ray absorption coefficient of dentin, length of dentin part, size of adhesion area, and individual differences of teeth. All specimens were observed nondestructively by optical coherence tomography and micro-computed tomography before µTBS testing. After µTBS testing, the effect of these factors on µTBS data was analyzed by the general linear model, linear mixed effects regression model, and nonlinear regression model with 95% confidence intervals. By the general linear model, a significant difference in individual differences of teeth was observed ( P < 0.001). A significantly positive correlation was shown between µTBS and length of dentin part ( P < 0.001); however, there was no significant nonlinearity ( P = 0.157). Moreover, a significantly negative correlation was observed between µTBS and size of adhesion area ( P = 0.001), with significant nonlinearity ( P = 0.014). No correlation was observed between µTBS and X-ray absorption coefficient of resin ( P = 0.147), and there was no significant nonlinearity ( P = 0.089). Additionally, a significantly positive correlation was observed between µTBS and X-ray absorption coefficient of dentin ( P = 0.022), with significant nonlinearity ( P = 0.036). A significant difference was also observed between the presence and absence of voids by linear mixed effects regression analysis. Our results showed correlations between various parameters of tooth specimens and µTBS data. 
To evaluate the performance of the adhesive more precisely, the effect of tooth variability and a method to reduce variation in bond strength values should also be considered.
NASA Astrophysics Data System (ADS)
Dyar, M. D.; Carmosino, M. L.; Breves, E. A.; Ozanne, M. V.; Clegg, S. M.; Wiens, R. C.
2012-04-01
A remote laser-induced breakdown spectrometer (LIBS) designed to simulate the ChemCam instrument on the Mars Science Laboratory Rover Curiosity was used to probe 100 geologic samples at a 9-m standoff distance. ChemCam consists of an integrated remote LIBS instrument that will probe samples up to 7 m from the mast of the rover and a remote micro-imager (RMI) that will record context images. The elemental compositions of 100 igneous and highly-metamorphosed rocks are determined with LIBS using three variations of multivariate analysis, with a goal of improving the analytical accuracy. Two forms of partial least squares (PLS) regression are employed with finely-tuned parameters: PLS-1 regresses a single response variable (elemental concentration) against the observation variables (spectra, or intensity at each of 6144 spectrometer channels), while PLS-2 simultaneously regresses multiple response variables (concentrations of the ten major elements in rocks) against the observation predictor variables, taking advantage of natural correlations between elements. Those results are contrasted with those from the multivariate regression technique of the least absolute shrinkage and selection operator (lasso), which is a penalized shrunken regression method that selects the specific channels for each element that explain the most variance in the concentration of that element. To make this comparison, we use results of cross-validation and of held-out testing, and employ unscaled and uncentered spectral intensity data because all of the input variables are already in the same units. Results demonstrate that the lasso, PLS-1, and PLS-2 all yield comparable results in terms of accuracy for this dataset. However, the interpretability of these methods differs greatly in terms of fundamental understanding of LIBS emissions. 
PLS techniques generate principal components, linear combinations of intensities at any number of spectrometer channels, which explain as much variance in the response variables as possible while avoiding multicollinearity between principal components. When the selected number of principal components is projected back into the original feature space of the spectra, 6144 correlation coefficients are generated, a small fraction of which are mathematically significant to the regression. In contrast, the lasso models require only a small number (< 24) of non-zero correlation coefficients (β values) to determine the concentration of each of the ten major elements. Causality between the positively-correlated emission lines chosen by the lasso and the elemental concentration was examined. In general, the higher the lasso coefficient (β), the greater the likelihood that the selected line results from an emission of that element. Emission lines with negative β values should arise from elements that are anti-correlated with the element being predicted. For elements except Fe, Al, Ti, and P, the lasso-selected wavelength with the highest β value corresponds to the element being predicted, e.g. 559.8 nm for neutral Ca. However, the specific lines chosen by the lasso with positive β values are not always those from the element being predicted. Other wavelengths and the elements that most strongly correlate with them to predict concentration are obviously related to known geochemical correlations or close overlap of emission lines, while others must result from matrix effects. Use of the lasso technique thus directly informs our understanding of the underlying physical processes that give rise to LIBS emissions by determining which lines can best represent concentration, and which lines from other elements are causing matrix effects.
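The way the lasso drives most coefficients to exactly zero can be illustrated with a minimal coordinate-descent sketch. This is not the authors' implementation or their spectral data: the three-column toy design, the penalty `lam`, and the function names are assumptions for illustration, and the objective used is (1/2n)·||y − Xβ||² + lam·||β||₁ with no intercept.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cycling over
    coefficients and applying the soft-threshold update to each in turn."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of column j with the partial residual
            col_sq = 0.0
            for i in range(n):
                others = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - others)
                col_sq += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (col_sq / n)
    return beta


# Toy "channels": three orthogonal columns; the response depends strongly on
# the first, weakly on the second, and not at all on the third.
X = [
    [1.0, 1.0, 1.0],
    [1.0, -1.0, -1.0],
    [-1.0, 1.0, -1.0],
    [-1.0, -1.0, 1.0],
]
y = [3.0 * row[0] + 0.1 * row[1] for row in X]
beta = lasso_coordinate_descent(X, y, lam=0.5)
print(beta)
```

In this toy case only the first coefficient survives the penalty, shrunk from 3 toward zero, while the weak and irrelevant columns receive coefficients of exactly zero; that sparsity is what makes lasso-selected channels easier to interpret than dense PLS coefficient vectors.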
Jacob, Benjamin G; Novak, Robert J; Toe, Laurent; Sanfo, Moussa S; Afriyie, Abena N; Ibrahim, Mohammed A; Griffith, Daniel A; Unnasch, Thomas R
2012-01-01
The standard methods for regression analyses of clustered riverine larval habitat data of Simulium damnosum s.l., a major black-fly vector of onchocerciasis, postulate models relating observational ecological-sampled parameter estimators to prolific habitats without accounting for residual intra-cluster error correlation effects. Generally, this correlation comes from two sources: (1) the design of the random effects and their assumed covariance from the multiple levels within the regression model; and, (2) the correlation structure of the residuals. Unfortunately, inconspicuous errors in residual intra-cluster correlation estimates can overstate precision in forecasted S. damnosum s.l. riverine larval habitat explanatory attributes regardless of how they are treated (e.g., independent, autoregressive, Toeplitz, etc). In this research, the geographical locations for multiple riverine-based S. damnosum s.l. larval ecosystem habitats sampled from 2 pre-established epidemiological sites in Togo were identified and recorded from July 2009 to June 2010. Initially the data was aggregated into PROC GENMOD. An agglomerative hierarchical residual cluster-based analysis was then performed. The sampled clustered study site data was then analyzed for statistical correlations using Monthly Biting Rates (MBR). Euclidean distance measurements and terrain-related geomorphological statistics were then generated in ArcGIS. A digital overlay was then performed also in ArcGIS using the georeferenced ground coordinates of high and low density clusters stratified by Annual Biting Rates (ABR). This data was overlain onto multitemporal sub-meter pixel resolution satellite data (i.e., QuickBird 0.61 m wavebands). Orthogonal spatial filter eigenvectors were then generated in SAS/GIS. 
Univariate and non-linear regression-based models (i.e., Logistic, Poisson and Negative Binomial) were also employed to determine probability distributions and to identify statistically significant parameter estimators from the sampled data. Thereafter, Durbin-Watson test statistics were used to test the null hypothesis that the regression residuals were not autocorrelated against the alternative that the residuals followed an autoregressive process in AUTOREG. Bayesian uncertainty matrices were also constructed employing normal priors for each of the sampled estimators in PROC MCMC. The residuals revealed both spatially structured and unstructured error effects in the high and low ABR-stratified clusters. The analyses also revealed that the estimators, levels of turbidity and presence of rocks were statistically significant for the high-ABR-stratified clusters, while the estimators distance between habitats and floating vegetation were important for the low-ABR-stratified cluster. Varying and constant coefficient regression models, ABR-stratified GIS-generated clusters, sub-meter resolution satellite imagery, a robust residual intra-cluster diagnostic test, MBR-based histograms, eigendecomposition spatial filter algorithms and Bayesian matrices can enable accurate autoregressive estimation of latent uncertainty effects and other residual error probabilities (i.e., heteroskedasticity) for testing correlations between georeferenced S. damnosum s.l. riverine larval habitat estimators. The asymptotic distribution of the resulting residual adjusted intra-cluster predictor error autocovariate coefficients can thereafter be established while estimates of the asymptotic variance can lead to the construction of approximate confidence intervals for accurately targeting productive S. damnosum s.l. habitats based on spatiotemporal field-sampled count data.
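The Durbin-Watson statistic mentioned above can be computed directly from a residual series. This is a small illustrative sketch in Python rather than the SAS procedure the authors used; the function name and the sample residuals are hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences of the
    residuals divided by the sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Strictly alternating residuals (strong negative autocorrelation).
dw_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])
# Identical residuals (strong positive autocorrelation).
dw_constant = durbin_watson([1.0, 1.0, 1.0, 1.0])
```

The statistic lies between 0 and 4. Values near 2 are consistent with no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation.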
Anderson, Chauncey W.; Rounds, Stewart A.
2010-01-01
Management of water quality in streams of the United States is becoming increasingly complex as regulators seek to control aquatic pollution and ecological problems through Total Maximum Daily Load programs that target reductions in the concentrations of certain constituents. Sediment, nutrients, and bacteria, for example, are constituents that regulators target for reduction nationally and in the Tualatin River basin, Oregon. These constituents require laboratory analysis of discrete samples for definitive determinations of concentrations in streams. Recent technological advances in the nearly continuous, in situ monitoring of related water-quality parameters has fostered the use of these parameters as surrogates for the labor intensive, laboratory-analyzed constituents. Although these correlative techniques have been successful in large rivers, it was unclear whether they could be applied successfully in tributaries of the Tualatin River, primarily because these streams tend to be small, have rapid hydrologic response to rainfall and high streamflow variability, and may contain unique sources of sediment, nutrients, and bacteria. This report evaluates the feasibility of developing correlative regression models for predicting dependent variables (concentrations of total suspended solids, total phosphorus, and Escherichia coli bacteria) in two Tualatin River basin streams: one draining highly urbanized land (Fanno Creek near Durham, Oregon) and one draining rural agricultural land (Dairy Creek at Highway 8 near Hillsboro, Oregon), during 2002-04. An important difference between these two streams is their response to storm runoff; Fanno Creek has a relatively rapid response due to extensive upstream impervious areas and Dairy Creek has a relatively slow response because of the large amount of undeveloped upstream land. Four other stream sites also were evaluated, but in less detail. 
Potential explanatory variables included continuously monitored streamflow (discharge), stream stage, specific conductance, turbidity, and time (to account for seasonal processes). Preliminary multiple-regression models were identified using stepwise regression and Mallows' Cp, which maximizes regression correlation coefficients and accounts for the loss of additional degrees of freedom when extra explanatory variables are used. Several data scenarios were created and evaluated for each site to assess the representativeness of existing monitoring data and autosampler-derived data, and to assess the utility of the available data to develop robust predictive models. The goodness-of-fit of candidate predictive models was assessed with diagnostic statistics from validation exercises that compared predictions against a subset of the available data. The regression modeling met with mixed success. Functional model forms that have a high likelihood of success were identified for most (but not all) dependent variables at each site, but there were limitations in the available datasets, notably the lack of samples from high flows. These limitations increase the uncertainty in the predictions of the models and suggest that the models are not yet ready for use in assessing these streams, particularly under high-flow conditions, without additional data collection and recalibration of model coefficients. Nonetheless, the results reveal opportunities to use existing resources more efficiently. Baseline conditions are well represented in the available data, and, for the most part, the models reproduced these conditions well. Future sampling might therefore focus on high flow conditions, without much loss of ability to characterize the baseline.
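For readers unfamiliar with Mallows' Cp, the usual textbook form is Cp = SSE_p / s² − n + 2p, where SSE_p is the error sum of squares of a candidate model with p fitted parameters (intercept included), s² is the mean squared error of the full model, and n is the number of observations. The sketch below only illustrates that formula; the function name and the numbers are hypothetical and are not taken from the report.

```python
def mallows_cp(sse_subset, mse_full, n_obs, n_params):
    """Mallows' Cp for a candidate subset model.

    sse_subset: error sum of squares of the candidate model
    mse_full:   mean squared error of the model with all candidate variables
    n_obs:      number of observations
    n_params:   fitted parameters in the candidate model, intercept included
    """
    return sse_subset / mse_full - n_obs + 2 * n_params


# Hypothetical candidate: 3 parameters fitted to 30 observations.
cp = mallows_cp(sse_subset=54.0, mse_full=2.0, n_obs=30, n_params=3)
```

A candidate whose Cp is close to its parameter count, as in this example, shows little evidence of bias from the omitted variables, which is why Cp is commonly used to trade fit against the number of explanatory variables.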
Seasonal cycles, as represented by trigonometric functions of time, were not significant in the evaluated models, perhaps because the baseline conditions are well characterized in the datasets or because the other explanatory variables indirectly incorporate seasonal aspects. Multicollinearity among independent variabl
Zhou, Hua; Li, Lexin
2014-01-01
Summary Modern technologies are producing a wealth of data with complex structures. For instance, in two-dimensional digital imaging, flow cytometry and electroencephalography, matrix-type covariates frequently arise when measurements are obtained for each combination of two underlying variables. To address scientific questions arising from those data, new regression methods that take matrices as covariates are needed, and sparsity or other forms of regularization are crucial owing to the ultrahigh dimensionality and complex structure of the matrix data. The popular lasso and related regularization methods hinge on the sparsity of the true signal in terms of the number of its non-zero coefficients. However, for the matrix data, the true signal is often of, or can be well approximated by, a low rank structure. As such, the sparsity is frequently in the form of low rank of the matrix parameters, which may seriously violate the assumption of the classical lasso. We propose a class of regularized matrix regression methods based on spectral regularization. A highly efficient and scalable estimation algorithm is developed, and a degrees-of-freedom formula is derived to facilitate model selection along the regularization path. Superior performance of the method proposed is demonstrated on both synthetic and real examples. PMID:24648830
Estimating maize production in Kenya using NDVI: Some statistical considerations
Lewis, J.E.; Rowland, James; Nadeau, A.
1998-01-01
A regression model approach using a normalized difference vegetation index (NDVI) has the potential for estimating crop production in East Africa. However, before production estimation can become a reality, the underlying model assumptions and statistical nature of the sample data (NDVI and crop production) must be examined rigorously. Annual maize production statistics from 1982-90 for 36 agricultural districts within Kenya were used as the dependent variable; median area NDVI (independent variable) values from each agricultural district and year were extracted from the annual maximum NDVI data set. The input data and the statistical association of NDVI with maize production for Kenya were tested systematically for the following items: (1) homogeneity of the data when pooling the sample, (2) gross data errors and influence points, (3) serial (time) correlation, (4) spatial autocorrelation and (5) stability of the regression coefficients. The results of using a simple regression model with NDVI as the only independent variable are encouraging (r 0.75, p 0.05) and illustrate that NDVI can be a responsive indicator of maize production, especially in areas of high NDVI spatial variability, which coincide with areas of production variability in Kenya.
NASA Astrophysics Data System (ADS)
Xin, Pei; Wang, Shen S. J.; Shen, Chengji; Zhang, Zeyu; Lu, Chunhui; Li, Ling
2018-03-01
Shallow groundwater interacts strongly with surface water across a quarter of global land area, affecting significantly the terrestrial eco-hydrology and biogeochemistry. We examined groundwater behavior subjected to unimodal impulse and irregular surface water fluctuations, combining physical experiments, numerical simulations, and functional data analysis. Both the experiments and numerical simulations demonstrated a damped and delayed response of groundwater table to surface water fluctuations. To quantify this hysteretic shallow groundwater behavior, we developed a regression model with the Gamma distribution functions adopted to account for the dependence of groundwater behavior on antecedent surface water conditions. The regression model fits and predicts well the groundwater table oscillations resulting from propagation of irregular surface water fluctuations in both laboratory and large-scale aquifers. The coefficients of the Gamma distribution function vary spatially, reflecting the hysteresis effect associated with increased amplitude damping and delay as the fluctuation propagates. The regression model, in a relatively simple functional form, has demonstrated its capacity of reproducing high-order nonlinear effects that underpin the surface water and groundwater interactions. The finding has important implications for understanding and predicting shallow groundwater behavior and associated biogeochemical processes, and will contribute broadly to studies of groundwater-dependent ecology and biogeochemistry.
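One way to picture a regression on antecedent surface water levels is as a weighted sum of lagged inputs, with lag weights shaped like a Gamma density. The sketch below is an assumption about the general form only, not the authors' model: the function names, the discrete one-step lags, and the parameter values (`shape`, `scale`, `n_lags`) are all hypothetical.

```python
import math


def gamma_pdf(x, shape, scale):
    """Gamma probability density at x > 0."""
    if x <= 0:
        return 0.0
    return x ** (shape - 1) * math.exp(-x / scale) / (math.gamma(shape) * scale ** shape)


def gamma_weights(n_lags, shape, scale, dt=1.0):
    """Lag weights proportional to the Gamma density, normalised to sum to 1."""
    raw = [gamma_pdf((k + 1) * dt, shape, scale) for k in range(n_lags)]
    total = sum(raw)
    return [w / total for w in raw]


def lagged_response(surface, weights):
    """Weighted sum of the previous len(weights) surface-water values.

    Output starts at time index len(weights), where a full history exists.
    """
    n_lags = len(weights)
    return [
        sum(w * surface[t - (k + 1)] for k, w in enumerate(weights))
        for t in range(n_lags, len(surface))
    ]


weights = gamma_weights(n_lags=5, shape=3.0, scale=1.0)
# Unit impulse in surface water at time index 5.
impulse = [0.0] * 5 + [1.0] + [0.0] * 10
response = lagged_response(impulse, weights)
```

With these illustrative parameters the response to a unit impulse peaks two time steps after the impulse and never reaches the input amplitude, which mirrors the damped and delayed behaviour the abstract describes.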
Antonarakis, Gregory S; Tompson, Bryan D; Fisher, David M
2016-11-01
Maxillary growth in patients with cleft lip and palate is highly variable. The authors' aim was to investigate associations between preoperative cleft lip measurements and maxillary growth determined cephalometrically in patients with complete unilateral cleft lip and palate (cUCLP). Retrospective cross-sectional study. Children with cUCLP. Preoperative cleft lip measurements were made at the time of primary cheiloplasty and available for each patient. Maxillary growth was evaluated on lateral cephalometric radiographs taken prior to any orthodontic treatment and alveolar bone grafting (8.5 ± 0.7 years). The presence of associations between preoperative cleft lip measurements and cephalometric measures of maxillary growth was determined using regression analyses. In the 58 patients included in the study, the cleft lateral lip element was deficient in height in 90% and in transverse width in 81% of patients. There was an inverse correlation between cleft lateral lip height and transverse width with a β coefficient of -0.382 (P = .003). Patients with a more deficient cleft lateral lip height displayed a shorter maxillary length (β coefficient = 0.336; P = .010), a less protruded maxilla (β coefficient = .334; P = .008), and a shorter anterior maxillary height (β coefficient = 0.306; P = .020) than those with a less deficient cleft lateral lip height. Patients with cUCLP present with varying degrees of lateral lip hypoplasia. Preoperative measures of lateral lip deficiency are related to later observed deficiencies of maxillary length, protrusion, and height.
Ramírez-Vélez, Robinson; Correa-Bautista, Jorge Enrique; González-Ruíz, Katherine; Vivas, Andrés; Triana-Reina, Héctor Reynaldo; Martínez-Torres, Javier; Prieto-Benavides, Daniel Humberto; Carrillo, Hugo Alejandro; Ramos-Sepúlveda, Jeison Alexander; Villa-González, Emilio; García-Hermoso, Antonio
2017-01-01
Recently, a body adiposity index (BAI = (hip circumference)/(height^1.5) − 18) was developed and validated in adult populations. The aim of this study was to evaluate the performance of BAI in estimating percentage body fat (BF%) in a sample of Colombian collegiate young adults. The participants were 903 volunteers (52% females, mean age = 21.4 years ± 3.3). We used linear regression, Bland–Altman agreement analysis, Lin’s concordance correlation coefficient (ρc) and the coefficient of determination (R²) to compare BAI with BF% measured by bioelectrical impedance analysis (BIA). The correlation between the two methods of estimating BF% was R² = 0.384, p < 0.001. A paired-sample t-test showed a difference between the methods (BIA BF% = 16.2 ± 3.1, BAI BF% = 30.0 ± 5.4%; p < 0.001). For BIA, bias value was 6.0 ± 6.2 BF% (95% confidence interval (CI) = −6.0 to 18.2), indicating that the BAI method overestimated BF% relative to the reference method. Lin’s concordance correlation coefficient was poor (ρc = 0.014, 95% CI = −0.124 to 0.135; p = 0.414). In Colombian college students, there was poor agreement between BAI- and BIA-based estimates of BF%, and so BAI is not accurate in people with low or high body fat percentage levels. PMID:28106719
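The two quantities at the centre of this abstract are easy to compute. The sketch below assumes the usual units for BAI (hip circumference in centimetres, height in metres) and the standard population-moment form of Lin's concordance correlation coefficient; the function names and sample values are hypothetical and are not study data.

```python
def body_adiposity_index(hip_cm, height_m):
    """BAI = hip circumference (cm) / height (m)^1.5 - 18."""
    return hip_cm / height_m ** 1.5 - 18.0


def lins_ccc(x, y):
    """Lin's concordance correlation coefficient between two measurement series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


bai = body_adiposity_index(hip_cm=109.85, height_m=1.69)

# Two methods that are perfectly correlated but offset by a constant.
method_a = [1.0, 2.0, 3.0]
method_b = [2.0, 3.0, 4.0]
ccc_shifted = lins_ccc(method_a, method_b)
```

The shifted example has a Pearson correlation of 1 yet a concordance of only 4/7, because the coefficient penalises systematic offsets between methods. That is why a method that consistently overestimates can correlate with a reference and still show poor agreement.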
Total energy expenditure in adults with cerebral palsy as assessed by doubly labeled water.
Johnson, R K; Hildreth, H G; Contompasis, S H; Goran, M I
1997-09-01
To characterize total energy expenditure (TEE) in free-living adults with cerebral palsy (CP) using the doubly labeled water technique, and to determine those physiologic variables and characteristics of CP that were markers of TEE in adults with CP. TEE was measured using the doubly labeled water technique in 30 free-living adults with CP (12 women, 18 men). To determine the best markers of TEE, the following factors were examined: CP status, resting metabolic rate (RMR), anthropometric characteristics and body composition by means of dual-energy x-ray absorptiometry (DXA) and skinfold thickness measurements, energy cost of leisure-time activities, and oral-motor impairment. Means ± standard deviations, t tests, Pearson product-moment correlation coefficients, Spearman rank correlation coefficients, chi-square, stepwise multiple-correlation regression analysis, and analysis of covariance were used to examine the relationships among variables of interest. TEE was highly variable in the sample (mean = 2,455 ± 622 kcal/day for men and 1,986 ± 363 kcal/day for women). Stepwise regression analysis showed that TEE was best predicted in the sample by RMR, percentage body fat determined by DXA, ambulation status, and sex (multiple R = .68, P = .003). When practical, easily measured variables were used, TEE was best predicted by height, ambulation status, percentage body fat by skinfold thickness measurements, and sex (multiple R = .61, P = .018). The contribution of energy expended in physical activity to TEE was significantly higher in the ambulatory subjects than the nonambulatory subjects (25% vs 16%, respectively; P = .009). The high degree of variability in TEE, largely attributable to high interindividual variation in energy expended in physical activity, makes it difficult to provide general guidelines for energy requirements for adults with CP.
Because ambulation status was an important predictor of TEE, it must be accounted for in estimating energy requirements in this population.
Boligon, A A; Baldi, F; Mercadante, M E Z; Lobo, R B; Pereira, R J; Albuquerque, L G
2011-06-28
We quantified the potential increase in accuracy of expected breeding value for weights of Nelore cattle, from birth to mature age, using multi-trait and random regression models on Legendre polynomials and B-spline functions. A total of 87,712 weight records from 8144 females were used, recorded every three months from birth to mature age from the Nelore Brazil Program. For random regression analyses, all female weight records from birth to eight years of age (data set I) were considered. From this general data set, a subset was created (data set II), which included only nine weight records: at birth, weaning, 365 and 550 days of age, and 2, 3, 4, 5, and 6 years of age. Data set II was analyzed using random regression and multi-trait models. The model of analysis included the contemporary group as fixed effects and age of dam as a linear and quadratic covariable. In the random regression analyses, average growth trends were modeled using a cubic regression on orthogonal polynomials of age. Residual variances were modeled by a step function with five classes. Legendre polynomials of fourth and sixth order were utilized to model the direct genetic and animal permanent environmental effects, respectively, while third-order Legendre polynomials were considered for maternal genetic and maternal permanent environmental effects. Quadratic polynomials were applied to model all random effects in random regression models on B-spline functions. Direct genetic and animal permanent environmental effects were modeled using three segments or five coefficients, and genetic maternal and maternal permanent environmental effects were modeled with one segment or three coefficients in the random regression models on B-spline functions. For both data sets (I and II), animals ranked differently according to expected breeding value obtained by random regression or multi-trait models. 
With random regression models, the highest gains in accuracy were obtained at ages with a low number of weight records. The results indicate that random regression models provide more accurate expected breeding values than the traditionally finite multi-trait models. Thus, higher genetic responses are expected for beef cattle growth traits by replacing a multi-trait model with random regression models for genetic evaluation. B-spline functions could be applied as an alternative to Legendre polynomials to model covariance functions for weights from birth to mature age.
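Random regression models on Legendre polynomials use polynomial values of standardised age as covariates. The sketch below shows only that covariate construction, assuming ages are mapped linearly onto [-1, 1] and the unnormalised polynomials are generated by Bonnet's recurrence; the function names and the age range are hypothetical, and software used in practice may apply a normalising constant to each polynomial.

```python
def standardize_age(age, age_min, age_max):
    """Map an age linearly onto the interval [-1, 1]."""
    return -1.0 + 2.0 * (age - age_min) / (age_max - age_min)


def legendre_basis(x, order):
    """Unnormalised Legendre polynomials P_0(x)..P_order(x) via Bonnet's
    recurrence: (k + 1) P_{k+1} = (2k + 1) x P_k - k P_{k-1}."""
    values = [1.0]
    if order >= 1:
        values.append(x)
    for k in range(1, order):
        values.append(((2 * k + 1) * x * values[k] - k * values[k - 1]) / (k + 1))
    return values


# Covariates for a record taken halfway through a hypothetical 0-730 day range.
x_mid = standardize_age(365, 0, 730)
covariates = legendre_basis(0.5, order=3)
```

Each animal's random regression coefficients multiply these covariates, so a fourth-order fit for the direct genetic effect would use the first five values of such a basis at every recorded age.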
Role of Aedes aegypti (Linnaeus) and Aedes albopictus (Skuse) in local dengue epidemics in Taiwan.
Tsai, Pui-Jen; Teng, Hwa-Jen
2016-11-09
Aedes mosquitoes in Taiwan mainly comprise Aedes albopictus and Ae. aegypti. However, the species contributing to autochthonous dengue spread and the extent at which it occurs remain unclear. Thus, in this study, we spatially analyzed real data to determine spatial features related to local dengue incidence and mosquito density, particularly that of Ae. albopictus and Ae. aegypti. We used bivariate Moran's I statistic and geographically weighted regression (GWR) spatial methods to analyze the globally spatial dependence and locally regressed relationship between (1) imported dengue incidences and Breteau indices (BIs) of Ae. albopictus, (2) imported dengue incidences and BI of Ae. aegypti, (3) autochthonous dengue incidences and BI of Ae. albopictus, (4) autochthonous dengue incidences and BI of Ae. aegypti, (5) all dengue incidences and BI of Ae. albopictus, (6) all dengue incidences and BI of Ae. aegypti, (7) BI of Ae. albopictus and human population density, and (8) BI of Ae. aegypti and human population density in 348 townships in Taiwan. In the GWR models, regression coefficients of spatially regressed relationships between the incidence of autochthonous dengue and vector density of Ae. aegypti were significant and positive in most townships in Taiwan. However, Ae. albopictus had significant but negative regression coefficients in clusters of dengue epidemics. In the global bivariate Moran's index, spatial dependence between the incidence of autochthonous dengue and vector density of Ae. aegypti was significant and exhibited positive correlation in Taiwan (bivariate Moran's index = 0.51). However, Ae. albopictus exhibited positively significant but low correlation (bivariate Moran's index = 0.06). Similar results were observed in the two spatial methods between all dengue incidences and Aedes mosquitoes (Ae. aegypti and Ae. albopictus). The regression coefficients of spatially regressed relationships between imported dengue cases and Aedes mosquitoes (Ae. 
aegypti and Ae. albopictus) were significant in 348 townships in Taiwan. The results indicated that local Aedes mosquitoes do not contribute to the dengue incidence of imported cases. The density of Ae. aegypti positively correlated with the density of human population. By contrast, the density of Ae. albopictus negatively correlated with the density of human population in the areas of southern Taiwan. The results indicated that Ae. aegypti has more opportunities for human-mosquito contact in dengue endemic areas in southern Taiwan. Ae. aegypti, but not Ae. albopictus, and human population density in southern Taiwan are closely associated with an increased risk of autochthonous dengue incidence.
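A global bivariate Moran's index of the kind reported above measures how values of one variable at a location relate to values of a second variable at neighbouring locations. The sketch below uses one common form, the weighted cross-product of z-scores divided by the sum of the weights; the function names, the four-site chain, and the binary weights are hypothetical and are not the Taiwan township data.

```python
import math


def zscores(values):
    """Standardise values using the population standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def bivariate_morans_i(x, y, weights):
    """Global bivariate Moran's I: sum_ij w_ij * zx_i * zy_j / sum_ij w_ij."""
    zx, zy = zscores(x), zscores(y)
    n = len(x)
    s0 = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * zx[i] * zy[j] for i in range(n) for j in range(n))
    return cross / s0


# Four sites in a chain; neighbours share a border (binary adjacency weights).
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
incidence = [1.0, 2.0, 3.0, 4.0]
vector_density = [1.0, 2.0, 3.0, 4.0]
i_same = bivariate_morans_i(incidence, vector_density, w)
i_reversed = bivariate_morans_i(incidence, vector_density[::-1], w)
```

A positive value means high values of the first variable tend to sit next to high values of the second, and a negative value means the opposite. Significance is usually assessed separately, for example by permutation.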
Regression Analysis of Stage Variability for West-Central Florida Lakes
Sacks, Laura A.; Ellison, Donald L.; Swancar, Amy
2008-01-01
The variability in a lake's stage depends upon many factors, including surface-water flows, meteorological conditions, and hydrogeologic characteristics near the lake. An understanding of the factors controlling lake-stage variability for a population of lakes may be helpful to water managers who set regulatory levels for lakes. The goal of this study is to determine whether lake-stage variability can be predicted using multiple linear regression and readily available lake and basin characteristics defined for each lake. Regressions were evaluated for a recent 10-year period (1996-2005) and for a historical 10-year period (1954-63). Ground-water pumping is considered to have affected stage at many of the 98 lakes included in the recent period analysis, and not to have affected stage at the 20 lakes included in the historical period analysis. For the recent period, regression models had coefficients of determination (R2) values ranging from 0.60 to 0.74, and up to five explanatory variables. Standard errors ranged from 21 to 37 percent of the average stage variability. Net leakage was the most important explanatory variable in regressions describing the full range and low range in stage variability for the recent period. The most important explanatory variable in the model predicting the high range in stage variability was the height over median lake stage at which surface-water outflow would occur. Other explanatory variables in final regression models for the recent period included the range in annual rainfall for the period and several variables related to local and regional hydrogeology: (1) ground-water pumping within 1 mile of each lake, (2) the amount of ground-water inflow (by category), (3) the head gradient between the lake and the Upper Floridan aquifer, and (4) the thickness of the intermediate confining unit. 
Many of the variables in final regression models are related to hydrogeologic characteristics, underscoring the importance of ground-water exchange in controlling the stage of karst lakes in Florida. Regression equations were used to predict lake-stage variability for the recent period for 12 additional lakes, and the median difference between predicted and observed values ranged from 11 to 23 percent. Coefficients of determination for the historical period were considerably lower (maximum R2 of 0.28) than for the recent period. Reasons for these low R2 values are probably related to the small number of lakes (20) with stage data for an equivalent time period that were unaffected by ground-water pumping, the similarity of many of the lake types (large surface-water drainage lakes), and the greater uncertainty in defining historical basin characteristics. The lack of lake-stage data unaffected by ground-water pumping and the poor regression results obtained for that group of lakes limit the ability to predict natural lake-stage variability using this method in west-central Florida.
NASA Astrophysics Data System (ADS)
Deshmukh, A. A.; Kuthe, S. A.; Palikundwar, U. A.
2018-05-01
In the present paper, the consequences of variation in compositions on the electronegativity (ΔX), atomic radius difference (δ) and the thermal stability (ΔTx) of Mg-Ni-Y bulk metallic glasses (BMGs) are evaluated. In order to understand the effect of variation in compositions on ΔX, δ and ΔTx, regression analysis is performed on the experimentally available data. A linear correlation between both δ and ΔX with regression coefficient 0.93 is observed. Further, compositional variation is performed with δ and then it is correlated to the ΔTx by deriving subsequent equations. It is observed that concentration of Mg, Ni and Y are directly proportional to the δ with regression coefficients 0.93, 0.93 and 0.50 respectively. The positive slope of Ni and Y stated that ΔTx will increase if it has more contribution from both Ni and Y. On the other hand negative slope stated that composition of Mg should be selected in such a way that it will have more stability with Ni and Y. The results obtained from mathematical calculations are also tested by regression analysis of ΔTx with the compositions of individual elements in the alloy. These results conclude that there is a strong dependence of ΔTx of the alloy on the compositions of the constituting elements in the alloy.
Prediction of dimethyl disulfide levels from biosolids using statistical modeling.
Gabriel, Steven A; Vilalai, Sirapong; Arispe, Susanna; Kim, Hyunook; McConnell, Laura L; Torrents, Alba; Peot, Christopher; Ramirez, Mark
2005-01-01
Two statistical models were used to predict the concentration of dimethyl disulfide (DMDS) released from biosolids produced by an advanced wastewater treatment plant (WWTP) located in Washington, DC, USA. The plant concentrates sludge from primary sedimentation basins in gravity thickeners (GT) and sludge from secondary sedimentation basins in dissolved air flotation (DAF) thickeners. The thickened sludge is pumped into blending tanks and then fed into centrifuges for dewatering. The dewatered sludge is then conditioned with lime before trucking out from the plant. DMDS, along with other volatile sulfur and nitrogen-containing chemicals, is known to contribute to biosolids odors. These models identified oxidation/reduction potential (ORP) values of a GT and DAF, the amount of sludge dewatered by centrifuges, and the blend ratio between GT thickened sludge and DAF thickened sludge in blending tanks as control variables. The accuracy of the developed regression models was evaluated by checking the adjusted R2 of the regression as well as the signs of coefficients associated with each variable. In general, both models explained observed DMDS levels in sludge headspace samples. The adjusted R2 value of the regression models 1 and 2 were 0.79 and 0.77, respectively. Coefficients for each regression model also had the correct sign. Using the developed models, plant operators can adjust the controllable variables to proactively decrease this odorant. Therefore, these models are a useful tool in biosolids management at WWTPs.
Semisupervised Clustering by Iterative Partition and Regression with Neuroscience Applications
Qian, Guoqi; Wu, Yuehua; Ferrari, Davide; Qiao, Puxue; Hollande, Frédéric
2016-01-01
Regression clustering is a mixture of unsupervised and supervised statistical learning and data mining method which is found in a wide range of applications including artificial intelligence and neuroscience. It performs unsupervised learning when it clusters the data according to their respective unobserved regression hyperplanes. The method also performs supervised learning when it fits regression hyperplanes to the corresponding data clusters. Applying regression clustering in practice requires means of determining the underlying number of clusters in the data, finding the cluster label of each data point, and estimating the regression coefficients of the model. In this paper, we review the estimation and selection issues in regression clustering with regard to the least squares and robust statistical methods. We also provide a model selection based technique to determine the number of regression clusters underlying the data. We further develop a computing procedure for regression clustering estimation and selection. Finally, simulation studies are presented for assessing the procedure, together with analyzing a real data set on RGB cell marking in neuroscience to illustrate and interpret the method. PMID:27212939
Li, Xuehua; Zhao, Wenxing; Li, Jing; Jiang, Jingqiu; Chen, Jianji; Chen, Jingwen
2013-08-01
To assess the persistence and fate of volatile organic compounds in the troposphere, the rate constants for the reaction with ozone (kO3) are needed. As kO3 values are only available for hundreds of compounds, and experimental determination of kO3 is costly and time-consuming, it is of importance to develop predictive models on kO3. In this study, a total of 379 logkO3 values at different temperatures were used to develop and validate a model for the prediction of kO3, based on quantum chemical descriptors, Dragon descriptors and structural fragments. Molecular descriptors were screened by stepwise multiple linear regression, and the model was constructed by partial least-squares regression. The cross validation coefficient Q²cum of the model is 0.836, and the external validation coefficient Q²ext is 0.811, indicating that the model has high robustness and good predictive performance. The most significant descriptor explaining logkO3 is the BELm2 descriptor with connectivity information weighted atomic masses. kO3 increases with increasing BELm2, and decreases with increasing ionization potential. The applicability domain of the proposed model was visualized by the Williams plot. The developed model can be used to predict kO3 at different temperatures for a wide range of organic chemicals, including alkenes, cycloalkenes, haloalkenes, alkynes, oxygen-containing compounds, nitrogen-containing compounds (except primary amines) and aromatic compounds. Copyright © 2013 Elsevier Ltd. All rights reserved.
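Validation coefficients of the Q² type compare prediction errors on held-out compounds with the spread of the observed values. The sketch below uses the generic form Q² = 1 − PRESS / SS; several variants exist that differ in which mean is used as the reference, so the `reference_mean` argument, the function name, and the numbers are assumptions for illustration, not the paper's exact definition or data.

```python
def q_squared(observed, predicted, reference_mean):
    """Q^2 = 1 - PRESS / SS.

    PRESS: sum of squared differences between observed values and predictions
           made without using those observations for fitting.
    SS:    sum of squared deviations of the observed values from reference_mean.
    """
    press = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss = sum((o - reference_mean) ** 2 for o in observed)
    return 1.0 - press / ss


# Hypothetical held-out values and their predictions.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
q2 = q_squared(observed, predicted, reference_mean=2.5)
```

Unlike a fitted R², this quantity can be negative when the held-out predictions are worse than simply predicting the reference mean.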
An update on the constitutive relation of ligament tissues with the effects of collagen types.
Wan, Chao; Hao, Zhixiu; Tong, Lingying; Lin, Jianhao; Li, Zhichang; Wen, Shizhu
2015-10-01
The musculoskeletal ligament is a kind of multiscale composite material with collagen fibers embedded in a ground matrix. As the major constituent in ligaments to bear external loads, collagens are composed mainly of two collagen contents with different mechanical properties, i.e., types I and III collagen. The constitutive relation of ligaments plays a critical role in the stability and normal function of human joints. However, collagen types have not been distinguished in the previous constitutive relations. In this paper a constitutive relation for ligament tissues was modified based on the previous constitutive relation by considering the effects of collagen types. Both the collagen contents and the mechanical properties of sixteen ligament specimens from four cadaveric human knee joints were measured for determining their material coefficients in the constitutive relation. The mechanical behaviors of ligaments were obtained from both the uniaxial tensile and simple shear tests. A linear regression between joint kinematic results from in vitro and in silico experiments was made to validate the accuracy of this constitutive relation. The high correlation coefficient (R² = 0.93) and significance (P < 0.0001) of the regression equation revealed that this modified constitutive relation of ligaments was accurate to be used in studying joint biomechanics. Another finite element analysis with collagen contents changing demonstrated that the effect of variations in collagen ratios on both joint kinematics and ligament biomechanics could be simulated by this constitutive relation. Copyright © 2015 Elsevier Ltd. All rights reserved.
Determination of organic compounds in water using ultraviolet LED
NASA Astrophysics Data System (ADS)
Kim, Chihoon; Ji, Taeksoo; Eom, Joo Beom
2018-04-01
This paper describes a method of detecting organic compounds in water using an ultraviolet LED (280 nm) spectroscopy system and a photodetector. The LED spectroscopy system showed a high correlation between the concentration of the prepared potassium hydrogen phthalate and that calculated by multiple linear regression, with an adjusted coefficient of determination ranging from 0.953 to 0.993. In addition, a comparison between the performance of the spectroscopy system and the total organic carbon analyzer indicated that the difference in concentration was small. Based on the close correlation between the spectroscopy and photodetector absorbance values, organic measurement with a photodetector could be configured for monitoring.
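For readers unfamiliar with the adjusted coefficient of determination quoted above, the standard adjustment for a multiple linear regression with an intercept can be sketched as follows. The sample size and number of predictors in the example are hypothetical; the abstract does not report them.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1) for a model with an intercept."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more samples than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof


# Hypothetical case: R^2 = 0.9 from 11 samples and 2 predictors.
print(adjusted_r2(0.9, n_samples=11, n_predictors=2))
```

The adjustment penalizes added predictors, so the adjusted value never exceeds the unadjusted one.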
DOE Office of Scientific and Technical Information (OSTI.GOV)
Keser, Saniye; Duzgun, Sebnem; Department of Geodetic and Geographic Information Technologies, Middle East Technical University, 06800 Ankara
Highlights: ► Spatial autocorrelation exists in municipal solid waste generation rates for different provinces in Turkey. ► Traditional non-spatial regression models may not provide sufficient information for better solid waste management. ► Unemployment rate is a global variable that significantly impacts the waste generation rates in Turkey. ► Significances of global parameters may diminish at local scale for some provinces. ► GWR model can be used to create clusters of cities for solid waste management. Abstract: In studies focusing on the factors that impact solid waste generation habits and rates, the potential spatial dependency in solid waste generation data is not considered in relating the waste generation rates to its determinants. In this study, spatial dependency is taken into account in determination of the significant socio-economic and climatic factors that may be of importance for the municipal solid waste (MSW) generation rates in different provinces of Turkey. Simultaneous spatial autoregression (SAR) and geographically weighted regression (GWR) models are used for the spatial data analyses. Similar to ordinary least squares regression (OLSR), regression coefficients are global in SAR model. In other words, the effect of a given independent variable on a dependent variable is valid for the whole country. Unlike OLSR or SAR, GWR reveals the local impact of a given factor (or independent variable) on the waste generation rates of different provinces. Results show that provinces within closer neighborhoods have similar MSW generation rates. On the other hand, this spatial autocorrelation is not very high for the exploratory variables considered in the study. OLSR and SAR models have similar regression coefficients. GWR is useful to indicate the local determinants of MSW generation rates.
GWR model can be utilized to plan waste management activities at local scale including waste minimization, collection, treatment, and disposal. At global scale, the MSW generation rates in Turkey are significantly related to unemployment rate and asphalt-paved roads ratio. Yet, significances of these variables may diminish at local scale for some provinces. At local scale, different factors may be important in affecting MSW generation rates.
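The contrast between global coefficients (OLSR, SAR) and local coefficients (GWR) can be illustrated with a small sketch. It assumes a single predictor and a Gaussian distance kernel with a fixed bandwidth; the function name `gwr_fit`, the kernel choice and the data are illustrative assumptions, not the authors' implementation.

```python
import math


def gwr_fit(coords, x, y, bandwidth):
    """Fit one weighted simple regression per location.

    Returns a list of (intercept, slope) pairs, one for each focal location,
    where observations are down-weighted by distance with a Gaussian kernel.
    """
    results = []
    for ux, uy in coords:
        # Kernel weight of every observation relative to the focal location.
        w = [math.exp(-0.5 * (math.hypot(ux - vx, uy - vy) / bandwidth) ** 2)
             for vx, vy in coords]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx
        results.append((my - slope * mx, slope))
    return results


# Hypothetical province centroids, predictor values and responses.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 2.0)]
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0 * xi + 1.0 for xi in x]  # same relationship everywhere
local = gwr_fit(coords, x, y, bandwidth=1.0)
```

When the underlying relationship is the same everywhere, as in this toy data, every location recovers the same intercept and slope; spatially varying coefficients appear only when the local relationships differ.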
Cheng, Dengmiao; Feng, Yao; Liu, Yuanwang; Li, Jinpeng; Xue, Jianming; Li, Zhaojun
2018-09-01
Understanding antibiotic adsorption in livestock manures is crucial to assess the fate and risk of antibiotics in the environment. In this study, three quantitative models were developed with swine manure-water distribution coefficients (LgKd) for oxytetracycline (OTC), ciprofloxacin (CIP) and sulfamerazine (SM1) in swine manures. Physicochemical parameters (n=12) of the swine manure were used as independent variables in partial least-squares (PLS) analysis. The cumulative cross-validated regression coefficients (Q²cum), standard deviations (SDs) and external validation coefficients (Q²ext) ranged from 0.761 to 0.868, 0.027 to 0.064, and 0.743 to 0.827 for the three models; as such, internal and external predictability of the models were strong. The pH, soluble organic carbon (SOC) and nitrogen (SON), and Ca were important explanatory variables for the OTC-model, pH, SOC, and SON for the CIP-model, and pH, total organic nitrogen (TON), and SOC for the SM1-model. The high VIPs (variable importance in the projections) of pH (1.178-1.396), SOC (0.968-1.034), and SON (0.822 and 0.865) established these physicochemical parameters as likely being dominant (associatively) in affecting transport of antibiotics in swine manures. Copyright © 2018 Elsevier B.V. All rights reserved.
Jacob, Benjamin J; Krapp, Fiorella; Ponce, Mario; Gottuzzo, Eduardo; Griffith, Daniel A; Novak, Robert J
2010-05-01
Spatial autocorrelation is problematic for classical hierarchical cluster detection tests commonly used in multi-drug resistant tuberculosis (MDR-TB) analyses as considerable random error can occur. Therefore, when MDR-TB clusters are spatially autocorrelated the assumption that the clusters are independently random is invalid. In this research, a product moment correlation coefficient (i.e., the Moran's coefficient) was used to quantify local spatial variation in multiple clinical and environmental predictor variables sampled in San Juan de Lurigancho, Lima, Peru. Initially, QuickBird 0.61 m data, encompassing visible bands and the near infra-red bands, were selected to synthesize images of land cover attributes of the study site. Data of residential addresses of individual patients with smear-positive MDR-TB were geocoded, prevalence rates calculated and then digitally overlaid onto the satellite data within a 2 km buffer of 31 georeferenced health centers, using a 10 m² grid-based algorithm. Geographical information system (GIS)-gridded measurements of each health center were generated based on preliminary base maps of the georeferenced data aggregated to block groups and census tracts within each buffered area. A three-dimensional model of the study site was constructed based on a digital elevation model (DEM) to determine terrain covariates associated with the sampled MDR-TB covariates. Pearson's correlation was used to evaluate the linear relationship between the DEM and the sampled MDR-TB data. A SAS/GIS® module was then used to calculate univariate statistics and to perform linear and non-linear regression analyses using the sampled predictor variables. The estimates generated from a global autocorrelation analysis were then spatially decomposed into empirical orthogonal bases using a negative binomial regression with a non-homogeneous mean.
Results of the DEM analyses indicated a statistically non-significant, linear relationship between georeferenced health centers and the sampled covariate elevation. The data exhibited positive spatial autocorrelation and the decomposition of Moran's coefficient into uncorrelated, orthogonal map pattern components revealed global spatial heterogeneities necessary to capture latent autocorrelation in the MDR-TB model. It was thus shown that Poisson regression analyses and spatial eigenvector mapping can elucidate the mechanics of MDR-TB transmission by prioritizing clinical and environmental-sampled predictor variables for identifying high risk populations.
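As a reminder of what Moran's coefficient measures, the global statistic can be sketched in a few lines. The binary neighbour matrix and the values below are hypothetical, and the eigenvector decomposition described in the abstract is not reproduced here.

```python
def morans_i(values, weights):
    """Global Moran's I: (n / S0) * sum_ij w_ij * d_i * d_j / sum_i d_i^2,
    where d_i is the deviation of value i from the mean and S0 is the sum of all weights."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * num / den


# Four hypothetical sites along a line; neighbours share a border.
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
print(morans_i([1.0, 2.0, 3.0, 4.0], w))    # smooth gradient: positive
print(morans_i([1.0, -1.0, 1.0, -1.0], w))  # alternating pattern: negative
```

Positive values indicate that neighbouring sites tend to carry similar values, which is the situation the abstract reports for the MDR-TB data.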
NASA Technical Reports Server (NTRS)
Tomberlin, T. J.
1985-01-01
Research studies of residents' responses to noise consist of interviews with samples of individuals who are drawn from a number of different compact study areas. The statistical techniques developed provide a basis for those sample design decisions. These techniques are suitable for a wide range of sample survey applications. A sample may consist of a random sample of residents selected from a sample of compact study areas, or in a more complex design, of a sample of residents selected from a sample of larger areas (e.g., cities). The techniques may be applied to estimates of the effects on annoyance of noise level, numbers of noise events, the time-of-day of the events, ambient noise levels, or other factors. Methods are provided for determining, in advance, how accurately these effects can be estimated for different sample sizes and study designs. Using a simple cost function, they also provide for optimum allocation of the sample across the stages of the design for estimating these effects. These techniques are developed via a regression model in which the regression coefficients are assumed to be random, with components of variance associated with the various stages of a multi-stage sample design.
Regression-based adaptive sparse polynomial dimensional decomposition for sensitivity analysis
NASA Astrophysics Data System (ADS)
Tang, Kunkun; Congedo, Pietro; Abgrall, Remi
2014-11-01
Polynomial dimensional decomposition (PDD) is employed in this work for global sensitivity analysis and uncertainty quantification of stochastic systems subject to a large number of random input variables. Due to the close structural connection between PDD and Analysis-of-Variance, PDD is able to provide simpler and more direct evaluation of the Sobol' sensitivity indices, when compared to polynomial chaos (PC). Unfortunately, the number of PDD terms grows exponentially with respect to the size of the input random vector, which makes the computational cost of the standard method unaffordable for real engineering applications. In order to address this curse of dimensionality, this work proposes a variance-based adaptive strategy aiming to build a cheap meta-model by sparse-PDD with PDD coefficients computed by regression. During this adaptive procedure, the model representation by PDD contains only a few terms, so that the cost of repeatedly solving the linear system of the least-squares regression problem is negligible. The size of the final sparse-PDD representation is much smaller than the full PDD, since only significant terms are eventually retained. Consequently, far fewer calls to the deterministic model are required to compute the final PDD coefficients.
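The direct link between expansion coefficients and Sobol' indices can be sketched as follows, assuming an orthonormal polynomial basis so that each component function's variance is the sum of its squared coefficients. The dictionary layout and the coefficient values are hypothetical.

```python
def sobol_from_coefficients(coeffs):
    """Sobol' indices from expansion coefficients in an orthonormal basis.

    `coeffs` maps a tuple of input-variable indices (the component function)
    to the list of its expansion coefficients; the constant term is excluded.
    """
    partial = {u: sum(c * c for c in cs) for u, cs in coeffs.items()}
    total = sum(partial.values())  # total variance of the surrogate
    return {u: v / total for u, v in partial.items()}


# Hypothetical two-input model with one interaction term.
coeffs = {
    (0,): [2.0, 1.0],   # univariate terms in input 0
    (1,): [1.0],        # univariate term in input 1
    (0, 1): [0.5],      # bivariate interaction term
}
indices = sobol_from_coefficients(coeffs)
print(indices)
```

Because the indices follow from the coefficients alone, dropping terms with negligible squared coefficients changes the indices very little, which is the motivation for a variance-based sparse selection.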
Aleksandrova, Krasimira; Bamia, Christina; Drogan, Dagmar; Lagiou, Pagona; Trichopoulou, Antonia; Jenab, Mazda; Fedirko, Veronika; Romieu, Isabelle; Bueno-de-Mesquita, H Bas; Pischon, Tobias; Tsilidis, Kostas; Overvad, Kim; Tjønneland, Anne; Bouton-Ruault, Marie-Christine; Dossus, Laure; Racine, Antoine; Kaaks, Rudolf; Kühn, Tilman; Tsironis, Christos; Papatesta, Eleni-Maria; Saitakis, George; Palli, Domenico; Panico, Salvatore; Grioni, Sara; Tumino, Rosario; Vineis, Paolo; Peeters, Petra H; Weiderpass, Elisabete; Lukic, Marko; Braaten, Tonje; Quirós, J Ramón; Luján-Barroso, Leila; Sánchez, María-José; Chilarque, Maria-Dolores; Ardanas, Eva; Dorronsoro, Miren; Nilsson, Lena Maria; Sund, Malin; Wallström, Peter; Ohlsson, Bodil; Bradbury, Kathryn E; Khaw, Kay-Tee; Wareham, Nick; Stepien, Magdalena; Duarte-Salles, Talita; Assi, Nada; Murphy, Neil; Gunter, Marc J; Riboli, Elio; Boeing, Heiner; Trichopoulos, Dimitrios
2015-12-01
Higher coffee intake has been purportedly related to a lower risk of liver cancer. However, it remains unclear whether this association may be accounted for by specific biological mechanisms. We aimed to evaluate the potential mediating roles of inflammatory, metabolic, liver injury, and iron metabolism biomarkers on the association between coffee intake and the primary form of liver cancer, hepatocellular carcinoma (HCC). We conducted a prospective nested case-control study within the European Prospective Investigation into Cancer and Nutrition among 125 incident HCC cases matched to 250 controls using an incidence-density sampling procedure. The association of coffee intake with HCC risk was evaluated by using multivariable-adjusted conditional logistic regression that accounted for smoking, alcohol consumption, hepatitis infection, and other established liver cancer risk factors. The mediating effects of 21 biomarkers were evaluated on the basis of percentage changes and associated 95% CIs in the estimated regression coefficients of models with and without adjustment for biomarkers individually and in combination. The multivariable-adjusted RR of having ≥4 cups (600 mL) coffee/d compared with <2 cups (300 mL)/d was 0.25 (95% CI: 0.11, 0.62; P-trend = 0.006). A statistically significant attenuation of the association between coffee intake and HCC risk and thereby suspected mediation was confirmed for the inflammatory biomarker IL-6 and for the biomarkers of hepatocellular injury glutamate dehydrogenase, alanine aminotransferase, aspartate aminotransferase (AST), γ-glutamyltransferase (GGT), and total bilirubin, which, in combination, attenuated the regression coefficients by 72% (95% CI: 7%, 239%). Of the investigated biomarkers, IL-6, AST, and GGT produced the highest change in the regression coefficients: 40%, 56%, and 60%, respectively.
These data suggest that the inverse association of coffee intake with HCC risk was partly accounted for by biomarkers of inflammation and hepatocellular injury.
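The mediation measure described above, a percentage change in the regression coefficient after adjusting for a biomarker, can be sketched as follows. The sketch assumes the coefficient is the natural log of the relative risk and that attenuation is expressed relative to the unadjusted coefficient; the adjusted RR of 0.5 is a hypothetical value, not a result from the study.

```python
import math


def percent_attenuation(rr_base, rr_adjusted):
    """Percentage change in the log-RR coefficient after adjustment.

    100 * (b_base - b_adjusted) / b_base, with b = ln(RR).
    """
    b_base = math.log(rr_base)
    b_adj = math.log(rr_adjusted)
    return 100.0 * (b_base - b_adj) / b_base


# Reported base RR of 0.25 and a hypothetical biomarker-adjusted RR of 0.5.
print(percent_attenuation(0.25, 0.5))
```

Under this definition an adjusted RR of exactly 1 corresponds to 100% attenuation, and values above 100% arise when the adjusted coefficient changes sign.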
DOE Office of Scientific and Technical Information (OSTI.GOV)
Andrew G. Peterson; J. Timothy Ball; Yiqi Luo
1998-09-25
Estimation of leaf photosynthetic rate (A) from leaf nitrogen content (N) is both conceptually and numerically important in models of plant, ecosystem and biosphere responses to global change. The relationship between A and N has been studied extensively at ambient CO2 but much less at elevated CO2. This study was designed to (1) assess whether the A-N relationship was more similar for species within than between community and vegetation types, and (2) examine how growth at elevated CO2 affects the A-N relationship. Data were obtained for 39 C3 species grown at ambient CO2 and 10 C3 species grown at ambient and elevated CO2. A regression model was applied to each species as well as to species pooled within different community and vegetation types. Cluster analysis of the regression coefficients indicated that species measured at ambient CO2 did not separate into distinct groups matching community or vegetation type. Instead, most community and vegetation types shared the same general parameter space for regression coefficients. Growth at elevated CO2 increased photosynthetic nitrogen use efficiency for pines and deciduous trees. When species were pooled by vegetation type, the A-N relationship for deciduous trees expressed on a leaf-mass basis was not altered by elevated CO2, while the intercept increased for pines. When regression coefficients were averaged to give mean responses for different vegetation types, elevated CO2 increased the intercept and the slope for deciduous trees but increased only the intercept for pines. There were no statistical differences between the pines and deciduous trees for the effect of CO2. Generalizations about the effect of elevated CO2 on the A-N relationship, and differences between pines and deciduous trees will be enhanced as more data become available.
Freitas, Alex A; Limbu, Kriti; Ghafourian, Taravat
2015-01-01
Volume of distribution is an important pharmacokinetic property that indicates the extent of a drug's distribution in the body tissues. This paper addresses the problem of how to estimate the apparent volume of distribution at steady state (Vss) of chemical compounds in the human body using decision tree-based regression methods from the area of data mining (or machine learning). Hence, the pros and cons of several different types of decision tree-based regression methods have been discussed. The regression methods predict Vss using, as predictive features, both the compounds' molecular descriptors and the compounds' tissue:plasma partition coefficients (Kt:p), often used in physiologically-based pharmacokinetics. Therefore, this work has assessed whether the data mining-based prediction of Vss can be made more accurate by using as input not only the compounds' molecular descriptors but also (a subset of) their predicted Kt:p values. Comparison of the models that used only molecular descriptors, in particular, the Bagging decision tree (mean fold error of 2.33), with those employing predicted Kt:p values in addition to the molecular descriptors, such as the Bagging decision tree using adipose Kt:p (mean fold error of 2.29), indicated that the use of predicted Kt:p values as descriptors may be beneficial for accurate prediction of Vss using decision trees if prior feature selection is applied. Decision tree based models presented in this work have an accuracy that is reasonable and similar to the accuracy of reported Vss inter-species extrapolations in the literature. The estimation of Vss for new compounds in drug discovery will benefit from methods that are able to integrate large and varied sources of data and flexible non-linear data mining methods such as decision trees, which can produce interpretable models. Graphical abstract: Decision trees for the prediction of tissue partition coefficient and volume of distribution of drugs.
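The abstract compares models by mean fold error without defining it. A minimal sketch follows, assuming the geometric mean fold error that is commonly used in pharmacokinetic prediction, i.e. 10 raised to the mean absolute log10 ratio of predicted to observed values; the paper may use a different variant, and the Vss values below are hypothetical.

```python
import math


def mean_fold_error(observed, predicted):
    """Geometric mean fold error: 10 ** mean(|log10(predicted / observed)|)."""
    logs = [abs(math.log10(p / o)) for o, p in zip(observed, predicted)]
    return 10 ** (sum(logs) / len(logs))


# Hypothetical Vss values (L/kg): one two-fold over- and one two-fold under-prediction.
print(mean_fold_error([1.0, 1.0], [2.0, 0.5]))
```

Under this definition a perfect model scores 1, and over- and under-predictions of the same ratio are penalized equally.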
Treatment adherence and health outcomes in patients with bronchiectasis.
McCullough, Amanda R; Tunney, Michael M; Quittner, Alexandra L; Elborn, J Stuart; Bradley, Judy M; Hughes, Carmel M
2014-07-01
We aimed to determine adherence to inhaled antibiotics, other respiratory medicines and airway clearance and to determine the association between adherence to these treatments and health outcomes (pulmonary exacerbations, lung function and Quality of Life Questionnaire-Bronchiectasis [QOL-B]) in bronchiectasis after 12 months. Patients with bronchiectasis prescribed inhaled antibiotics for Pseudomonas aeruginosa infection were recruited into a one-year study. Participants were categorised as "adherent" to medication (medication possession ratio ≥80% using prescription data) or airway clearance (score ≥80% in the Modified Self-Reported Medication-Taking Scale). Pulmonary exacerbations were defined as treatment with a new course of oral or intravenous antibiotics over the one-year study. Spirometry and QOL-B were completed at baseline and 12 months. Associations between adherence to treatment and pulmonary exacerbations, lung function and QOL-B were determined by regression analyses. Seventy-five participants were recruited. Thirty-five (53%), 39 (53%) and 31 (41%) participants were adherent to inhaled antibiotics, other respiratory medicines, and airway clearance, respectively. Twelve (16%) participants were adherent to all treatments. Participants who were adherent to inhaled antibiotics had significantly fewer exacerbations compared to non-adherent participants (2.6 vs 4, p = 0.00) and adherence to inhaled antibiotics was independently associated with having fewer pulmonary exacerbations (regression coefficient = -0.51, 95% CI [-0.81, -0.21], p < 0.001). Adherence to airway clearance was associated with lower QOL-B Treatment Burden (regression coefficient = -15.46, 95% CI [-26.54, -4.37], p < 0.01) and Respiratory Symptoms domain scores (regression coefficient = -10.77, 95% CI [-21.45, -0.09], p < 0.05). There were no associations between adherence to other respiratory medicines and any of the outcomes tested.
Adherence to treatment was not associated with FEV1 % predicted. Treatment adherence is low in bronchiectasis and affects important health outcomes including pulmonary exacerbations. Adherence should be measured as part of bronchiectasis management and future research should evaluate bronchiectasis-specific adherence strategies.
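The adherence classification by medication possession ratio can be sketched as follows. The sketch assumes the usual definition, days of medication supplied divided by days in the observation period, together with the 80% threshold stated in the abstract; the function names and the day counts are hypothetical.

```python
def medication_possession_ratio(days_supplied, days_in_period):
    """Medication possession ratio as a percentage of the observation period."""
    return 100.0 * days_supplied / days_in_period


def is_adherent(days_supplied, days_in_period, threshold=80.0):
    """Classify a participant as adherent when the ratio meets the threshold."""
    return medication_possession_ratio(days_supplied, days_in_period) >= threshold


# Hypothetical participants over a one-year (365-day) study.
print(is_adherent(292, 365))  # 80% of days covered
print(is_adherent(200, 365))  # roughly 55% of days covered
```

The resulting binary indicator is the kind of covariate that could then enter the regression analyses of exacerbations and QOL-B scores.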
Aleksandrova, Krasimira; Bamia, Christina; Drogan, Dagmar; Lagiou, Pagona; Trichopoulou, Antonia; Jenab, Mazda; Fedirko, Veronika; Romieu, Isabelle; Bueno-de-Mesquita, H Bas; Pischon, Tobias; Tsilidis, Kostas; Overvad, Kim; Tjønneland, Anne; Bouton-Ruault, Marie-Christine; Dossus, Laure; Racine, Antoine; Kaaks, Rudolf; Kühn, Tilman; Tsironis, Christos; Papatesta, Eleni-Maria; Saitakis, George; Palli, Domenico; Panico, Salvatore; Grioni, Sara; Tumino, Rosario; Vineis, Paolo; Peeters, Petra H; Weiderpass, Elisabete; Lukic, Marko; Braaten, Tonje; Quirós, J Ramón; Luján-Barroso, Leila; Sánchez, María-José; Chilarque, Maria-Dolores; Ardanas, Eva; Dorronsoro, Miren; Nilsson, Lena Maria; Sund, Malin; Wallström, Peter; Ohlsson, Bodil; Bradbury, Kathryn E; Khaw, Kay-Tee; Wareham, Nick; Stepien, Magdalena; Duarte-Salles, Talita; Assi, Nada; Murphy, Neil; Gunter, Marc J; Riboli, Elio; Boeing, Heiner; Trichopoulos, Dimitrios
2015-01-01
Background: Higher coffee intake has been purportedly related to a lower risk of liver cancer. However, it remains unclear whether this association may be accounted for by specific biological mechanisms. Objective: We aimed to evaluate the potential mediating roles of inflammatory, metabolic, liver injury, and iron metabolism biomarkers on the association between coffee intake and the primary form of liver cancer—hepatocellular carcinoma (HCC). Design: We conducted a prospective nested case-control study within the European Prospective Investigation into Cancer and Nutrition among 125 incident HCC cases matched to 250 controls using an incidence-density sampling procedure. The association of coffee intake with HCC risk was evaluated by using multivariable-adjusted conditional logistic regression that accounted for smoking, alcohol consumption, hepatitis infection, and other established liver cancer risk factors. The mediating effects of 21 biomarkers were evaluated on the basis of percentage changes and associated 95% CIs in the estimated regression coefficients of models with and without adjustment for biomarkers individually and in combination. Results: The multivariable-adjusted RR of having ≥4 cups (600 mL) coffee/d compared with <2 cups (300 mL)/d was 0.25 (95% CI: 0.11, 0.62; P-trend = 0.006). A statistically significant attenuation of the association between coffee intake and HCC risk and thereby suspected mediation was confirmed for the inflammatory biomarker IL-6 and for the biomarkers of hepatocellular injury glutamate dehydrogenase, alanine aminotransferase, aspartate aminotransferase (AST), γ-glutamyltransferase (GGT), and total bilirubin, which—in combination—attenuated the regression coefficients by 72% (95% CI: 7%, 239%). Of the investigated biomarkers, IL-6, AST, and GGT produced the highest change in the regression coefficients: 40%, 56%, and 60%, respectively. 
Conclusion: These data suggest that the inverse association of coffee intake with HCC risk was partly accounted for by biomarkers of inflammation and hepatocellular injury. PMID:26561631
Gas Chromatography Data Classification Based on Complex Coefficients of an Autoregressive Model
Zhao, Weixiang; Morgan, Joshua T.; Davis, Cristina E.
2008-01-01
This paper introduces autoregressive (AR) modeling as a novel method to classify outputs from gas chromatography (GC). The inverse Fourier transformation was applied to the original sensor data, and then an AR model was applied to transform data to generate AR model complex coefficients. This series of coefficients effectively contains a compressed version of all of the information in the original GC signal output. We applied this method to chromatograms resulting from proliferating bacteria species grown in culture. Three types of neural networks were used to classify the AR coefficients: backward propagating neural network (BPNN), radial basis function-principal component analysis (RBF-PCA) approach, and radial basis function-partial least squares regression (RBF-PLSR) approach. This exploratory study demonstrates the feasibility of using complex root coefficient patterns to distinguish various classes of experimental data, such as those from the different bacteria species. This cognition approach also proved to be robust and potentially useful for freeing us from time alignment of GC signals.
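To illustrate how a short vector of AR coefficients can summarize a longer signal, the sketch below fits a real-valued AR(2) model by least squares and computes the roots of its characteristic polynomial. This is a simplified stand-in: the paper fits AR models to inverse-Fourier-transformed (complex) data, and the model order, function names and synthetic signal here are assumptions.

```python
import cmath


def fit_ar2(x):
    """Least-squares estimate of (a1, a2) in x[t] = a1*x[t-1] + a2*x[t-2] + e[t]."""
    s11 = s12 = s22 = s1y = s2y = 0.0
    for t in range(2, len(x)):
        u, v, y = x[t - 1], x[t - 2], x[t]
        s11 += u * u
        s12 += u * v
        s22 += v * v
        s1y += u * y
        s2y += v * y
    det = s11 * s22 - s12 * s12  # determinant of the 2x2 normal equations
    a1 = (s1y * s22 - s2y * s12) / det
    a2 = (s2y * s11 - s1y * s12) / det
    return a1, a2


def ar2_roots(a1, a2):
    """Roots of the characteristic polynomial z^2 - a1*z - a2 (may be complex)."""
    disc = cmath.sqrt(a1 * a1 + 4.0 * a2)
    return (a1 + disc) / 2.0, (a1 - disc) / 2.0


# Synthetic noise-free signal generated from known coefficients.
signal = [1.0, 0.5]
for _ in range(30):
    signal.append(0.6 * signal[-1] - 0.3 * signal[-2])

a1, a2 = fit_ar2(signal)
roots = ar2_roots(a1, a2)
```

The two fitted coefficients, or equivalently the pair of complex roots, compress the 32-sample signal into a small feature vector of the kind that could be passed to a classifier.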
A Functional Varying-Coefficient Single-Index Model for Functional Response Data
Li, Jialiang; Huang, Chao; Zhu, Hongtu
2016-01-01
Motivated by the analysis of imaging data, we propose a novel functional varying-coefficient single index model (FVCSIM) to carry out the regression analysis of functional response data on a set of covariates of interest. FVCSIM represents a new extension of varying-coefficient single index models for scalar responses collected from cross-sectional and longitudinal studies. An efficient estimation procedure is developed to iteratively estimate varying coefficient functions, link functions, index parameter vectors, and the covariance function of individual functions. We systematically examine the asymptotic properties of all estimators including the weak convergence of the estimated varying coefficient functions, the asymptotic distribution of the estimated index parameter vectors, and the uniform convergence rate of the estimated covariance function and its spectrum. Simulation studies are carried out to assess the finite-sample performance of the proposed procedure. We apply FVCSIM to investigate the development of white matter diffusivities along the corpus callosum skeleton obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) study. PMID:29200540