Who Will Win?: Predicting the Presidential Election Using Linear Regression
ERIC Educational Resources Information Center
Lamb, John H.
2007-01-01
This article outlines a linear regression activity that engages learners, uses technology, and fosters cooperation. Students generated least-squares linear regression equations using TI-83 Plus graphing calculators, Microsoft Excel, and paper-and-pencil calculations using derived normal equations to predict the 2004 presidential election.…
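A minimal sketch of the paper-and-pencil route the activity describes: for a fit y = a + bx, the normal equations (XᵀX)β = Xᵀy can be assembled and solved directly. The poll/vote numbers below are hypothetical placeholders, not the article's data.

```python
import numpy as np

# Hypothetical data: x = candidate's polling average (%), y = state vote share (%).
x = np.array([44.0, 47.5, 49.0, 51.2, 53.8])
y = np.array([45.1, 48.0, 49.5, 50.8, 54.0])

# Normal equations for y = a + b*x: (X^T X) [a, b]^T = X^T y
X = np.column_stack([np.ones_like(x), x])
a, b = np.linalg.solve(X.T @ X, X.T @ y)
print(f"y = {a:.3f} + {b:.3f} x")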
A Simple and Convenient Method of Multiple Linear Regression to Calculate Iodine Molecular Constants
ERIC Educational Resources Information Center
Cooper, Paul D.
2010-01-01
A new procedure using a student-friendly least-squares multiple linear-regression technique utilizing a function within Microsoft Excel is described that enables students to calculate molecular constants from the vibronic spectrum of iodine. This method is advantageous pedagogically as it calculates molecular constants for ground and excited…
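The article works inside Excel; the sketch below shows an equivalent multiple least-squares fit in Python, assuming the standard two-term vibronic expansion in (v' + 1/2). Band positions are synthetic placeholders, not the paper's spectrum.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic vibronic band energies E(v') = T0 + we*(v'+0.5) - wexe*(v'+0.5)^2 (cm^-1)
v = np.arange(10, 20)
E = 15000.0 + 125.0 * (v + 0.5) - 0.75 * (v + 0.5) ** 2 + rng.normal(0, 0.3, v.size)

# Design matrix for the multiple linear regression (what LINEST does in Excel)
X = np.column_stack([np.ones_like(v, float), v + 0.5, (v + 0.5) ** 2])
coef, *_ = np.linalg.lstsq(X, E, rcond=None)
T0, we, neg_wexe = coef
print(f"T0={T0:.1f} cm^-1, we={we:.2f} cm^-1, wexe={-neg_wexe:.3f} cm^-1")
```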
Element enrichment factor calculation using grain-size distribution and functional data regression.
Sierra, C; Ordóñez, C; Saavedra, A; Gallego, J R
2015-01-01
In environmental geochemistry studies it is common practice to normalize element concentrations in order to remove the effect of grain size. Linear regression with respect to a particular grain size or conservative element is a widely used method of normalization. In this paper, the utility of functional linear regression, in which the grain-size curve is the independent variable and the concentration of pollutant the dependent variable, is analyzed and applied to detrital sediment. After implementing functional linear regression and classical linear regression models to normalize and calculate enrichment factors, we concluded that the former regression technique has some advantages over the latter. First, functional linear regression directly considers the grain-size distribution of the samples as the explanatory variable. Second, as the regression coefficients are not constant values but functions depending on the grain size, it is easier to comprehend the relationship between grain size and pollutant concentration. Third, regularization can be introduced into the model in order to establish equilibrium between reliability of the data and smoothness of the solutions.
Comparing The Effectiveness of a90/95 Calculations (Preprint)
2006-09-01
References cited include Kutner, Nachtsheim, Neter and Li, Applied Linear Statistical Models, 5th ed., McGraw-Hill/Irwin, 2005, and Mood, Graybill and Boes, Introduction to the Theory of Statistics. …The calculation of a90/95 curves is based on methods that are only valid for ordinary linear regression. Requirements for a valid ordinary least-squares regression model include: (1) the model must be linear in its parameters (a model in which the parameters enter linearly qualifies, even if it is polynomial in x; one in which they enter non-linearly does not); (2) uniform variance (homoscedasticity)…
Biostatistics Series Module 6: Correlation and Linear Regression.
Hazra, Avijit; Gogtay, Nithya
2016-01-01
Correlation and linear regression are the most commonly used techniques for quantifying the association between two numeric variables. Correlation quantifies the strength of the linear relationship between paired variables, expressing this as a correlation coefficient. If both variables x and y are normally distributed, we calculate Pearson's correlation coefficient (r). If the normality assumption is not met for one or both variables, a rank correlation coefficient, such as Spearman's rho (ρ), may be calculated. A hypothesis test of correlation assesses whether the linear relationship between the two variables holds in the underlying population; a significant result (P < 0.05) supports this. A 95% confidence interval of the correlation coefficient can also be calculated to indicate the plausible range of the correlation in the population. The value r² denotes the proportion of the variability of the dependent variable y that can be attributed to its linear relation with the independent variable x and is called the coefficient of determination. Linear regression is a technique that attempts to link two correlated variables x and y in the form of a mathematical equation (y = a + bx), such that given the value of one variable the other may be predicted. In general, the method of least squares is applied to obtain the equation of the regression line. Correlation and linear regression analysis are based on certain assumptions pertaining to the data sets. If these assumptions are not met, misleading conclusions may be drawn. The first assumption is that of a linear relationship between the two variables; a scatter plot is essential before embarking on any correlation-regression analysis to confirm that this is indeed the case. Outliers or clustering within data sets can distort the correlation coefficient value. Finally, it is vital to remember that although strong correlation can be a pointer toward causation, the two are not synonymous.
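A minimal sketch of the workflow the module describes: compute Pearson's r (or Spearman's rho when normality fails), then fit y = a + bx by least squares. The paired measurements are placeholders.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(120, 15, 50)            # e.g. one numeric variable per subject
y = 0.4 * x + rng.normal(0, 5, 50)     # a correlated second variable

r, p_r = stats.pearsonr(x, y)          # assumes bivariate normality
rho, p_rho = stats.spearmanr(x, y)     # rank-based alternative
fit = stats.linregress(x, y)           # least-squares line y = a + bx
print(f"r={r:.2f} (P={p_r:.3g}), rho={rho:.2f}, "
      f"y = {fit.intercept:.2f} + {fit.slope:.2f}x, r^2={fit.rvalue**2:.2f}")
```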
A simplified competition data analysis for radioligand specific activity determination.
Venturino, A; Rivera, E S; Bergoc, R M; Caro, R A
1990-01-01
Non-linear regression and two-step linear fit methods were developed to determine the actual specific activity of 125I-ovine prolactin by radioreceptor self-displacement analysis. The experimental results obtained by the different methods are superposable. The non-linear regression method is considered to be the most adequate procedure to calculate the specific activity, but if its software is not available, the other described methods are also suitable.
An improved multiple linear regression and data analysis computer program package
NASA Technical Reports Server (NTRS)
Sidik, S. M.
1972-01-01
NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.
A primer for biomedical scientists on how to execute model II linear regression analysis.
Ludbrook, John
2012-04-01
1. There are two very different ways of executing linear regression analysis. One is Model I, when the x-values are fixed by the experimenter. The other is Model II, in which the x-values are free to vary and are subject to error. 2. I have received numerous complaints from biomedical scientists that they have great difficulty in executing Model II linear regression analysis. This may explain the results of a Google Scholar search, which showed that the authors of articles in journals of physiology, pharmacology and biochemistry rarely use Model II regression analysis. 3. I repeat my previous arguments in favour of using least products linear regression analysis for Model II regressions. I review three methods for executing ordinary least products (OLP) and weighted least products (WLP) regression analysis: (i) scientific calculator and/or computer spreadsheet; (ii) specific purpose computer programs; and (iii) general purpose computer programs. 4. Using a scientific calculator and/or computer spreadsheet, it is easy to obtain correct values for OLP slope and intercept, but the corresponding 95% confidence intervals (CI) are inaccurate. 5. Using specific purpose computer programs, the freeware computer program smatr gives the correct OLP regression coefficients and obtains 95% CI by bootstrapping. In addition, smatr can be used to compare the slopes of OLP lines. 6. When using general purpose computer programs, I recommend the commercial programs systat and Statistica for those who regularly undertake linear regression analysis and I give step-by-step instructions in the Supplementary Information as to how to use loss functions.
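A minimal sketch of the spreadsheet/calculator route to OLP (also called reduced major axis) regression: the slope is sign(r)·s_y/s_x and the line passes through the means. Bootstrap CIs (as smatr computes) are omitted; data are placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(50, 10, 40)             # placeholder paired measurements,
y = 1.2 * x + rng.normal(0, 8, 40)     # both subject to error (Model II)

r = np.corrcoef(x, y)[0, 1]
b = np.sign(r) * y.std(ddof=1) / x.std(ddof=1)   # OLP slope
a = y.mean() - b * x.mean()                      # line through the means
print(f"OLP fit: y = {a:.2f} + {b:.3f} x (r = {r:.3f})")
```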
Application of linear regression analysis in accuracy assessment of rolling force calculations
NASA Astrophysics Data System (ADS)
Poliak, E. I.; Shim, M. K.; Kim, G. S.; Choo, W. Y.
1998-10-01
Efficient operation of the computational models employed in process control systems requires periodic assessment of the accuracy of their predictions. Linear regression is proposed as a tool that allows systematic and random prediction errors to be separated from those related to measurement. A quantitative characteristic of the model's predictive ability is introduced in addition to standard statistical tests for model adequacy. Rolling force calculations are considered as an example application; however, the outlined approach can be used to assess the performance of any computational model.
How is the weather? Forecasting inpatient glycemic control
Saulnier, George E; Castro, Janna C; Cook, Curtiss B; Thompson, Bithika M
2017-01-01
Aim: Apply methods of damped trend analysis to forecast inpatient glycemic control. Method: Observed and calculated point-of-care blood glucose data trends were determined over 62 weeks. Mean absolute percent error was used to calculate differences between observed and forecasted values. Comparisons were drawn between model results and linear regression forecasting. Results: The forecasted mean glucose trends observed during the first 24 and 48 weeks of projections compared favorably to the results provided by linear regression forecasting. However, in some scenarios, the damped trend method changed inferences compared with linear regression. In all scenarios, mean absolute percent error values remained below the 10% accepted by demand industries. Conclusion: Results indicate that forecasting methods historically applied within demand industries can project future inpatient glycemic control. Additional study is needed to determine if forecasting is useful in the analyses of other glucometric parameters and, if so, how to apply the techniques to quality improvement. PMID:29134125
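A hedged sketch of the two ingredients the study names: damped-trend (Holt) exponential smoothing and mean absolute percent error. Smoothing parameters and the weekly glucose values are placeholders, not the study's data or fitted constants.

```python
import numpy as np

def damped_holt(y, alpha=0.4, beta=0.1, phi=0.9, horizon=4):
    """Damped-trend exponential smoothing; returns h-step-ahead forecasts."""
    level, trend = y[0], y[1] - y[0]
    for obs in y[1:]:
        prev_level = level
        level = alpha * obs + (1 - alpha) * (level + phi * trend)
        trend = beta * (level - prev_level) + (1 - beta) * phi * trend
    damp = np.cumsum(phi ** np.arange(1, horizon + 1))  # phi + phi^2 + ...
    return level + damp * trend

def mape(actual, forecast):
    return 100 * np.mean(np.abs((actual - forecast) / actual))

# Placeholder weekly mean point-of-care glucose values (mg/dL)
glucose = np.array([172, 168, 171, 165, 163, 166, 160, 158, 161, 157], float)
fc = damped_holt(glucose[:8], horizon=2)
print(fc, mape(glucose[8:], fc))
```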
Javed, Faizan; Chan, Gregory S H; Savkin, Andrey V; Middleton, Paul M; Malouf, Philip; Steel, Elizabeth; Mackie, James; Lovell, Nigel H
2009-01-01
This paper uses non-linear support vector regression (SVR) to model the blood volume and heart rate (HR) responses in 9 hemodynamically stable kidney failure patients during hemodialysis. Using radial basis function (RBF) kernels, non-parametric models of relative blood volume (RBV) change with time, as well as percentage change in HR with respect to RBV, were obtained. The ε-insensitive loss function was used for SVR modelling. Selection of the design parameters, which include the capacity (C), the insensitivity region (ε) and the RBF kernel parameter (sigma), was made with a grid search approach, and the selected models were cross-validated using the average mean square error (AMSE) calculated from testing data based on a k-fold cross-validation technique. Linear regression was also applied to fit the curves and the AMSE was calculated for comparison with SVR. For the model of RBV with time, SVR gave a lower AMSE for both training (AMSE=1.5) and testing data (AMSE=1.4) compared to linear regression (AMSE=1.8 and 1.5). SVR also provided a better fit for HR with RBV for both training and testing data (AMSE=15.8 and 16.4) compared to linear regression (AMSE=25.2 and 20.1).
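A minimal sketch of the described parameter selection: RBF-kernel SVR with a grid search over C, ε and the kernel width, scored by k-fold cross-validated mean square error. The dialysis curve below is synthetic, not patient data.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.svm import SVR

rng = np.random.default_rng(2)
t = np.linspace(0, 240, 120).reshape(-1, 1)   # minutes into hemodialysis
rbv = -4 * (1 - np.exp(-t.ravel() / 90)) + rng.normal(0, 0.2, 120)  # % RBV change

grid = {"C": [1, 10, 100], "epsilon": [0.05, 0.1, 0.2], "gamma": [1e-4, 1e-3, 1e-2]}
search = GridSearchCV(SVR(kernel="rbf"), grid,
                      scoring="neg_mean_squared_error",
                      cv=KFold(n_splits=5, shuffle=True, random_state=0))
search.fit(t, rbv)
print(search.best_params_, -search.best_score_)   # AMSE analogue
```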
Thieler, E. Robert; Himmelstoss, Emily A.; Zichichi, Jessica L.; Ergul, Ayhan
2009-01-01
The Digital Shoreline Analysis System (DSAS) version 4.0 is a software extension to ESRI ArcGIS v.9.2 and above that enables a user to calculate shoreline rate-of-change statistics from multiple historic shoreline positions. A user-friendly interface of simple buttons and menus guides the user through the major steps of shoreline change analysis. Components of the extension and user guide include (1) instruction on the proper way to define a reference baseline for measurements, (2) automated and manual generation of measurement transects and metadata based on user-specified parameters, and (3) output of calculated rates of shoreline change and other statistical information. DSAS computes shoreline rates of change using four different methods: (1) endpoint rate, (2) simple linear regression, (3) weighted linear regression, and (4) least median of squares. The standard error, correlation coefficient, and confidence interval are also computed for the simple and weighted linear-regression methods. The results of all rate calculations are output to a table that can be linked to the transect file by a common attribute field. DSAS is intended to facilitate the shoreline change-calculation process and to provide rate-of-change information and the statistical data necessary to establish the reliability of the calculated results. The software is also suitable for any generic application that calculates positional change over time, such as assessing rates of change of glacier limits in sequential aerial photos, river edge boundaries, land-cover changes, and so on.
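A sketch of three of the four rate methods DSAS names (endpoint rate, simple and weighted linear regression), written directly in Python rather than through the ArcGIS extension. Survey dates, positions and uncertainties are placeholders.

```python
import numpy as np

# Shoreline position (m) along one transect vs. survey date (decimal years),
# with positional uncertainty per historic shoreline (older surveys less certain).
year = np.array([1950.0, 1972.5, 1994.0, 2005.5, 2008.0])
pos = np.array([12.0, 18.5, 26.0, 30.5, 31.0])
sigma = np.array([8.0, 5.0, 3.0, 1.0, 1.0])

epr = (pos[-1] - pos[0]) / (year[-1] - year[0])        # endpoint rate
b_lr, a_lr = np.polyfit(year, pos, 1)                  # simple linear regression
b_wlr, a_wlr = np.polyfit(year, pos, 1, w=1 / sigma)   # weighted linear regression
print(f"EPR={epr:.2f} m/yr, LRR={b_lr:.2f} m/yr, WLRR={b_wlr:.2f} m/yr")
```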
BIODEGRADATION PROBABILITY PROGRAM (BIODEG)
The Biodegradation Probability Program (BIODEG) calculates the probability that a chemical under aerobic conditions with mixed cultures of microorganisms will biodegrade rapidly or slowly. It uses fragment constants developed using multiple linear and non-linear regressions and d...
Korany, Mohamed A; Gazy, Azza A; Khamis, Essam F; Ragab, Marwa A A; Kamal, Miranda F
2018-06-01
This study outlines two robust regression approaches, namely least median of squares (LMS) and iteratively re-weighted least squares (IRLS) to investigate their application in instrument analysis of nutraceuticals (that is, fluorescence quenching of merbromin reagent upon lipoic acid addition). These robust regression methods were used to calculate calibration data from the fluorescence quenching reaction (∆F and F-ratio) under ideal or non-ideal linearity conditions. For each condition, data were treated using three regression fittings: Ordinary Least Squares (OLS), LMS and IRLS. Assessment of linearity, limits of detection (LOD) and quantitation (LOQ), accuracy and precision were carefully studied for each condition. LMS and IRLS regression line fittings showed significant improvement in correlation coefficients and all regression parameters for both methods and both conditions. In the ideal linearity condition, the intercept and slope changed insignificantly, but a dramatic change was observed for the non-ideal condition and linearity intercept. Under both linearity conditions, LOD and LOQ values after the robust regression line fitting of data were lower than those obtained before data treatment. The results obtained after statistical treatment indicated that the linearity ranges for drug determination could be expanded to lower limits of quantitation by enhancing the regression equation parameters after data treatment. Analysis results for lipoic acid in capsules, using both fluorimetric methods, treated by parametric OLS and after treatment by robust LMS and IRLS were compared for both linearity conditions.
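A hedged sketch of an IRLS calibration fit next to OLS, using statsmodels' RLM (which implements IRLS with a robust weight function; LMS is not in statsmodels and is omitted here). The calibration points are placeholders with a deliberate outlier.

```python
import numpy as np
import statsmodels.api as sm

conc = np.array([2, 4, 6, 8, 10, 12], dtype=float)   # analyte concentration
dF = np.array([0.11, 0.21, 0.33, 0.40, 0.80, 0.61])  # quenching signal; 5th point outlies

X = sm.add_constant(conc)
ols = sm.OLS(dF, X).fit()
irls = sm.RLM(dF, X, M=sm.robust.norms.HuberT()).fit()  # IRLS, Huber weights
print("OLS:", ols.params, " IRLS:", irls.params)         # [intercept, slope]
```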
Nie, Z Q; Ou, Y Q; Zhuang, J; Qu, Y J; Mai, J Z; Chen, J M; Liu, X Q
2016-05-01
Conditional logistic regression analysis and unconditional logistic regression analysis are commonly used in case-control studies, while the Cox proportional hazards model is often used in survival data analysis. Most of the literature refers only to main-effect models; however, generalized linear models differ from general linear models, and interaction comprises both multiplicative and additive interaction. The former is of statistical significance only, whereas the latter has biological significance. In this paper, macros were written in SAS 9.4 to calculate the contrast ratio, the attributable proportion due to interaction and the synergy index alongside the logistic and Cox regression interaction terms, and Wald, delta and profile-likelihood confidence intervals were used to evaluate additive interaction, for reference in big-data analyses in clinical epidemiology and in analyses of genetic multiplicative and additive interactions.
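A minimal sketch of the additive-interaction measures from fitted logistic or Cox coefficients: the relative excess risk due to interaction (RERI, which appears to be what the abstract calls the contrast ratio), the attributable proportion (AP) and the synergy index (S). Confidence intervals (Wald/delta/profile) are omitted; the coefficients are hypothetical.

```python
import numpy as np

def additive_interaction(b1, b2, b12):
    """b1, b2 = main-effect log-ORs (or log-HRs); b12 = interaction term."""
    or10, or01 = np.exp(b1), np.exp(b2)
    or11 = np.exp(b1 + b2 + b12)
    reri = or11 - or10 - or01 + 1                 # relative excess risk due to interaction
    ap = reri / or11                              # attributable proportion
    s = (or11 - 1) / ((or10 - 1) + (or01 - 1))    # synergy index
    return reri, ap, s

print(additive_interaction(0.41, 0.62, 0.30))     # hypothetical coefficients
```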
Classical Testing in Functional Linear Models.
Kong, Dehan; Staicu, Ana-Maria; Maity, Arnab
2016-01-01
We extend four tests common in classical regression - Wald, score, likelihood ratio and F tests - to functional linear regression, for testing the null hypothesis, that there is no association between a scalar response and a functional covariate. Using functional principal component analysis, we re-express the functional linear model as a standard linear model, where the effect of the functional covariate can be approximated by a finite linear combination of the functional principal component scores. In this setting, we consider application of the four traditional tests. The proposed testing procedures are investigated theoretically for densely observed functional covariates when the number of principal components diverges. Using the theoretical distribution of the tests under the alternative hypothesis, we develop a procedure for sample size calculation in the context of functional linear regression. The four tests are further compared numerically for both densely and sparsely observed noisy functional data in simulation experiments and using two real data applications.
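A hedged sketch of the re-expression step: discretized curves are reduced to principal component scores (a surrogate for FPCA here), after which a classical overall F test for no association applies. Data are synthetic.

```python
import numpy as np
import statsmodels.api as sm
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
curves = rng.normal(size=(80, 100)).cumsum(axis=1)   # 80 functional covariates on a grid
y = 0.02 * curves[:, 60] + rng.normal(0, 1, 80)      # scalar response

scores = PCA(n_components=4).fit_transform(curves)   # truncated PC scores
fit = sm.OLS(y, sm.add_constant(scores)).fit()
print(fit.fvalue, fit.f_pvalue)                      # F test of no association
```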
Yang, Ruiqi; Wang, Fei; Zhang, Jialing; Zhu, Chonglei; Fan, Limei
2015-05-19
To establish reference values for the thalamus, caudate nucleus and lenticular nucleus diameters on the fetal thalamic transverse section. A total of 265 fetuses at our hospital were randomly selected from November 2012 to August 2014, and the transverse and length diameters of the thalamus, caudate nucleus and lenticular nucleus were measured. SPSS 19.0 statistical software was used to calculate the regression of each fetal diameter on gestational week; P < 0.05 was considered statistically significant. The linear regression equations on gestational week (X) were: thalamic length diameter, Y = 0.051X + 0.201 (R = 0.876); thalamic transverse diameter, Y = 0.031X + 0.229 (R = 0.817); length diameter of the head of the caudate nucleus, Y = 0.033X + 0.101 (R = 0.722); transverse diameter of the head of the caudate nucleus, Y = 0.025X - 0.046 (R = 0.711); lentiform nucleus length diameter, Y = 0.046X + 0.229 (R = 0.765); lentiform nucleus transverse diameter, Y = 0.025X - 0.05 (R = 0.772). Ultrasonic measurement of the diameters of the fetal thalamus, caudate nucleus and lenticular nucleus through the thalamic transverse section is simple and convenient; the measurements increase with gestational week and follow a linear regression relationship.
On vertical profile of ozone at Syowa
NASA Technical Reports Server (NTRS)
Chubachi, Shigeru
1994-01-01
The difference in the vertical ozone profile at Syowa between 1966-1981 and 1982-1988 is shown. The month-height cross section of the slope of the linear regressions between ozone partial pressure and 100-mb temperature is also shown. The vertically integrated values of the slopes are in close agreement with the slopes calculated by linear regression of Dobson total ozone on 100-mb temperature in the period of 1982-1988.
NASA Astrophysics Data System (ADS)
Wang, Hongliang; Liu, Baohua; Ding, Zhongjun; Wang, Xiangxin
2017-02-01
Absorption-based optical sensors have been developed for the determination of water pH. In this paper, based on the preparation of a transparent sol-gel thin film with a phenol red (PR) indicator, several calculation methods, including simple linear regression analysis, quadratic regression analysis and dual-wavelength absorbance ratio analysis, were used to calculate water pH. Results of MSSRR show that dual-wavelength absorbance ratio analysis can improve the calculation accuracy of water pH in long-term measurement.
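A hedged sketch of the dual-wavelength absorbance-ratio idea: the ratio of absorbances near the indicator's acid and base peaks is calibrated against known pH. The wavelengths (roughly the phenol red acid/base bands) and all readings are placeholders, and the log-ratio linearization is one common choice, not necessarily the paper's.

```python
import numpy as np

pH = np.array([5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0])
A558 = np.array([0.12, 0.18, 0.27, 0.40, 0.55, 0.68, 0.78])  # base-form band
A430 = np.array([0.80, 0.72, 0.60, 0.46, 0.33, 0.22, 0.15])  # acid-form band

ratio = A558 / A430
b, a = np.polyfit(np.log10(ratio), pH, 1)    # pH ~ a + b*log10(ratio)
print(f"pH = {a:.2f} + {b:.2f} * log10(A558/A430)")
```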
Simple linear and multivariate regression models.
Rodríguez del Águila, M M; Benítez-Parejo, N
2011-01-01
In biomedical research it is common to find problems in which we wish to relate a response variable to one or more variables capable of describing the behaviour of the former variable by means of mathematical models. Regression techniques are used to this effect, in which an equation is determined relating the two variables. While such equations can have different forms, linear equations are the most widely used form and are easy to interpret. The present article describes simple and multiple linear regression models, how they are calculated, and how their applicability assumptions are checked. Illustrative examples are provided, based on the use of the freely accessible R program.
Experimental and computational prediction of glass transition temperature of drugs.
Alzghoul, Ahmad; Alhalaweh, Amjad; Mahlin, Denny; Bergström, Christel A S
2014-12-22
Glass transition temperature (Tg) is an important inherent property of an amorphous solid material which is usually determined experimentally. In this study, the relation between Tg and melting temperature (Tm) was evaluated using a data set of 71 structurally diverse druglike compounds. Further, in silico models for prediction of Tg were developed based on calculated molecular descriptors and linear (multilinear regression, partial least-squares, principal component regression) and nonlinear (neural network, support vector regression) modeling techniques. The models based on Tm predicted Tg with an RMSE of 19.5 K for the test set. Among the five computational models developed herein the support vector regression gave the best result with RMSE of 18.7 K for the test set using only four chemical descriptors. Hence, two different models that predict Tg of drug-like molecules with high accuracy were developed. If Tm is available, a simple linear regression can be used to predict Tg. However, the results also suggest that support vector regression and calculated molecular descriptors can predict Tg with equal accuracy, already before compound synthesis.
Linear regression analysis: part 14 of a series on evaluation of scientific publications.
Schneider, Astrid; Hommel, Gerhard; Blettner, Maria
2010-11-01
Regression analysis is an important statistical method for the analysis of medical data. It enables the identification and characterization of relationships among multiple factors. It also enables the identification of prognostically relevant risk factors and the calculation of risk scores for individual prognostication. This article is based on selected textbooks of statistics, a selective review of the literature, and our own experience. After a brief introduction of the uni- and multivariable regression models, illustrative examples are given to explain what the important considerations are before a regression analysis is performed, and how the results should be interpreted. The reader should then be able to judge whether the method has been used correctly and interpret the results appropriately. The performance and interpretation of linear regression analysis are subject to a variety of pitfalls, which are discussed here in detail. The reader is made aware of common errors of interpretation through practical examples. Both the opportunities for applying linear regression analysis and its limitations are presented.
González-Aparicio, I; Hidalgo, J; Baklanov, A; Padró, A; Santa-Coloma, O
2013-07-01
There is extensive evidence of the negative health impacts linked to the rise of regional background particulate matter (PM10) levels. These levels are often elevated over urban areas, making PM10 one of the main air pollution concerns; this is the case in the Bilbao metropolitan area, Spain. This study describes a data-driven model to diagnose PM10 levels in Bilbao at hourly intervals. The model is built with a training period of 7 years of historical data covering different urban environments (inland, city centre and coastal sites). The explanatory variables are quantitative (log[NO2], temperature, short-wave incoming radiation, wind speed and direction, specific humidity, hour and vehicle intensity) and qualitative (working day/weekend, season (winter/summer), the hour (from 00 to 23 UTC) and precipitation/no precipitation). Three different linear regression models are compared: simple linear regression; linear regression with interaction terms (INT); and linear regression with interaction terms selected by Sawa's Bayesian Information Criterion (INT-BIC). Each type of model is calculated on two different periods: the training data set (6 years) and the testing data set (1 year). The results show that the INT-BIC-based model (R² = 0.42) is the best, with R of 0.65, 0.63 and 0.60 for the city centre, inland and coastal sites, respectively, a level of confidence similar to state-of-the-art methodology. The related error calculated for longer time intervals diminished significantly (R of 0.75-0.80 for monthly means and 0.80-0.98 for seasonal means) with respect to shorter periods.
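A minimal sketch of comparing a main-effects linear model against one with an interaction term by BIC, in the spirit of the INT-BIC selection. Variable names and data are illustrative placeholders, not the Bilbao dataset.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n = 500
df = pd.DataFrame({
    "log_no2": rng.normal(3, 0.5, n),
    "wind": rng.gamma(2, 2, n),
    "temp": rng.normal(14, 6, n),
})
df["pm10"] = 10 + 8 * df.log_no2 - 0.6 * df.wind - 0.3 * df.log_no2 * df.wind \
             + rng.normal(0, 3, n)

simple = smf.ols("pm10 ~ log_no2 + wind + temp", df).fit()
inter = smf.ols("pm10 ~ log_no2 * wind + temp", df).fit()   # adds the interaction
print(f"BIC simple={simple.bic:.1f}, with interaction={inter.bic:.1f}")
```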
Evaluating and Improving the SAMA (Segmentation Analysis and Market Assessment) Recruiting Model
2015-06-01
…the relationship between the calculated SAMA potential and the actual 2014 performance. The scatterplot in Figure 8 shows a strong linear relationship between the SAMA calculated potential and the contracting achievement for 2014, with an R-squared value of 0.871. Simple linear regression of…
A phenomenological biological dose model for proton therapy based on linear energy transfer spectra.
Rørvik, Eivind; Thörnqvist, Sara; Stokkevåg, Camilla H; Dahle, Tordis J; Fjaera, Lars Fredrik; Ytre-Hauge, Kristian S
2017-06-01
The relative biological effectiveness (RBE) of protons varies with the radiation quality, quantified by the linear energy transfer (LET). Most phenomenological models employ a linear dependency on the dose-averaged LET (LETd) to calculate the biological dose. However, several experiments have indicated a possible non-linear trend. Our aim was to investigate whether biological dose models including non-linear LET dependencies should be considered, by introducing a LET-spectrum-based dose model. The RBE-LET relationship was investigated by fitting polynomials from 1st to 5th degree to a database of 85 data points from aerobic in vitro experiments. We included both unweighted and weighted regression, the latter taking into account experimental uncertainties. Statistical testing was performed to decide whether higher-degree polynomials provided better fits to the data as compared to lower degrees. The newly developed models were compared to three published LETd-based models for a simulated spread-out Bragg peak (SOBP) scenario. The statistical analysis of the weighted regression favored a non-linear RBE-LET relationship, with the quartic polynomial found to best represent the experimental data (P = 0.010). The results of the unweighted regression analysis were on the borderline of statistical significance for non-linear functions (P = 0.053), and with the current database a linear dependency could not be rejected. For the SOBP scenario, the weighted non-linear model estimated a similar mean RBE value (1.14) compared to the three established models (1.13-1.17). The unweighted model calculated a considerably higher RBE value (1.22). The analysis indicated that non-linear models could give a better representation of the RBE-LET relationship. However, this is not decisive, as inclusion of the experimental uncertainties in the regression analysis had a significant impact on the determination and ranking of the models. As differences between the models were observed for the SOBP scenario, both non-linear LET-spectrum-based and linear LETd-based models should be further evaluated in clinically realistic scenarios.
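A hedged sketch of the weighted polynomial comparison: degrees 1-5 fitted with weights 1/sigma and ranked by weighted residual sum of squares (a nested-model F test would formalize the choice). The 85-point RBE-LET database is replaced by synthetic data with stated uncertainties.

```python
import numpy as np

rng = np.random.default_rng(5)
let = np.linspace(1, 15, 85)                          # keV/um, placeholder grid
rbe_true = 1.0 + 0.04 * let + 0.002 * let**2 - 0.0001 * let**3
sigma = 0.05 + 0.01 * let                             # pretend experimental errors
rbe = rbe_true + rng.normal(0, sigma)

for deg in range(1, 6):
    coef = np.polyfit(let, rbe, deg, w=1 / sigma)     # weights = 1/sigma
    resid = (rbe - np.polyval(coef, let)) / sigma
    print(deg, round(float(resid @ resid), 1))        # weighted SSE per degree
```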
Combined analysis of magnetic and gravity anomalies using normalized source strength (NSS)
NASA Astrophysics Data System (ADS)
Li, L.; Wu, Y.
2017-12-01
Gravity and magnetic fields are potential fields, and their interpretation suffers from inherent non-uniqueness. Combined analysis of magnetic and gravity anomalies based on Poisson's relation is used to identify homologous gravity and magnetic anomalies and decrease this ambiguity. The traditional combined analysis uses linear regression of the reduction-to-pole (RTP) magnetic anomaly against the first-order vertical derivative of the gravity anomaly, and provides a quantitative or semi-quantitative interpretation by calculating the correlation coefficient, slope and intercept. In this calculation, owing to the effect of remanent magnetization, the RTP anomaly still contains the effect of oblique magnetization; in that case, truly homologous gravity and magnetic anomalies can appear uncorrelated in the linear regression. The normalized source strength (NSS), which can be computed from the magnetic tensor matrix, is insensitive to remanence. Here we present a new combined analysis using NSS. Based on Poisson's relation, the gravity tensor matrix can be transformed into the pseudomagnetic tensor matrix for the direction of geomagnetic-field magnetization under the homologous-source condition. The NSS of the pseudomagnetic tensor matrix and of the original magnetic tensor matrix are calculated and linear regression analysis is carried out. The resulting correlation coefficient, slope and intercept indicate the degree of homology, the Poisson's ratio and the distribution of remanence, respectively. We test the approach using a synthetic model under complex magnetization; the results show that it can still identify a common source under strong remanence and establish the Poisson's ratio. Finally, the approach is applied to data from China. The results demonstrate that our approach is feasible.
Marrero-Ponce, Yovani; Medina-Marrero, Ricardo; Castillo-Garit, Juan A; Romero-Zaldivar, Vicente; Torrens, Francisco; Castro, Eduardo A
2005-04-15
A novel approach to bio-macromolecular design from a linear algebra point of view is introduced. A protein's total (whole-protein) and local (one or more amino acids) linear indices are a new set of bio-macromolecular descriptors of relevance to protein QSAR/QSPR studies. These amino-acid-level biochemical descriptors are based on the calculation of linear maps on R^n [f_k(x_mi): R^n -> R^n] in the canonical basis. These bio-macromolecular indices are calculated from the kth power of the macromolecular pseudograph alpha-carbon atom adjacency matrix. Total linear indices are linear functionals on R^n; that is, the kth total linear indices are linear maps from R^n to the scalar R [f_k(x_m): R^n -> R]. Thus, the kth total linear indices are calculated by summing the amino-acid linear indices of all amino acids in the protein molecule. A study of the protein stability effects for a complete set of alanine substitutions in the Arc repressor illustrates this approach. A quantitative model that discriminates near wild-type stability alanine mutants from reduced-stability ones in a training series was obtained. This model permitted the correct classification of 97.56% (40/41) and 91.67% (11/12) of proteins in the training and test sets, respectively. It shows a high Matthews correlation coefficient (MCC = 0.952) for the training set and an MCC = 0.837 for the external prediction set. Additionally, canonical regression analysis corroborated the statistical quality of the classification model (Rcanc = 0.824). This analysis was also used to compute biological stability canonical scores for each Arc alanine mutant. On the other hand, the linear piecewise regression model compared favorably with the linear regression one in predicting the melting temperature (tm) of the Arc alanine mutants. The linear model explains almost 81% of the variance of the experimental tm (R = 0.90 and s = 4.29), and the LOO PRESS statistics evidenced its predictive ability (q² = 0.72 and s_cv = 4.79). Moreover, the TOMOCOMD-CAMPS method produced a linear piecewise regression (R = 0.97) between protein backbone descriptors and tm values for alanine mutants of the Arc repressor. A break-point value of 51.87 °C characterized two mutant clusters and coincided perfectly with the experimental scale. For this reason, we can use the linear discriminant analysis and piecewise models in combination to classify and predict the stability of the mutant Arc homodimers. These models also permitted interpretation of the driving forces of the folding process, indicating that topologic/topographic protein backbone interactions control the stability profile of wild-type Arc and its alanine mutants.
HT-FRTC: a fast radiative transfer code using kernel regression
NASA Astrophysics Data System (ADS)
Thelen, Jean-Claude; Havemann, Stephan; Lewis, Warren
2016-09-01
The HT-FRTC is a principal component based fast radiative transfer code that can be used across the electromagnetic spectrum from the microwave through to the ultraviolet to calculate transmittance, radiance and flux spectra. The principal components cover the spectrum at a very high spectral resolution, which allows very fast line-by-line, hyperspectral and broadband simulations for satellite-based, airborne and ground-based sensors. The principal components are derived during a code training phase from line-by-line simulations for a diverse set of atmosphere and surface conditions. The derived principal components are sensor independent, i.e. no extra training is required to include additional sensors. During the training phase we also derive the predictors which are required by the fast radiative transfer code to determine the principal component scores from the monochromatic radiances (or fluxes, transmittances). These predictors are calculated for each training profile at a small number of frequencies, which are selected by a k-means cluster algorithm during the training phase. Until recently the predictors were calculated using a linear regression. However, during a recent rewrite of the code the linear regression was replaced by a Gaussian Process (GP) regression which resulted in a significant increase in accuracy when compared to the linear regression. The HT-FRTC has been trained with a large variety of gases, surface properties and scatterers. Rayleigh scattering as well as scattering by frozen/liquid clouds, hydrometeors and aerosols have all been included. The scattering phase function can be fully accounted for by an integrated line-by-line version of the Edwards-Slingo spherical harmonics radiation code or approximately by a modification to the extinction (Chou scaling).
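A hedged sketch of the replacement the abstract describes: predicting principal-component scores from radiances at a few selected frequencies, with GP regression compared against linear regression. Shapes, kernels and data are synthetic placeholders, not the HT-FRTC training set.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(6)
X = rng.normal(size=(200, 5))     # radiances at 5 cluster-selected frequencies
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(0, 0.05, 200)  # a PC score

Xtr, Xte, ytr, yte = X[:150], X[150:], y[:150], y[150:]
lin = LinearRegression().fit(Xtr, ytr)
gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel()).fit(Xtr, ytr)
print("linear R^2:", round(lin.score(Xte, yte), 3),
      "GP R^2:", round(gp.score(Xte, yte), 3))
```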
A regression-adjusted approach can estimate competing biomass
James H. Miller
1983-01-01
A method is presented for estimating above-ground herbaceous and woody biomass on competition research plots. On a set of destructively-sampled plots, an ocular estimate of biomass by vegetative component is first made, after which vegetation is clipped, dried, and weighed. Linear regressions are then calculated for each component between estimated and actual weights...
Deriving the Regression Line with Algebra
ERIC Educational Resources Information Center
Quintanilla, John A.
2017-01-01
Exploration with spreadsheets and reliance on previous skills can lead students to determine the line of best fit. To perform linear regression on a set of data, students in Algebra 2 (or, in principle, Algebra 1) do not have to settle for using the mysterious "black box" of their graphing calculators (or other classroom technologies).…
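In the spirit of the article, here is the closed form students can derive with algebra alone, written out in plain Python with no fitting library (data are placeholders): b = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)², a = ȳ − b·x̄.

```python
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

xbar = sum(xs) / len(xs)
ybar = sum(ys) / len(ys)
b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) \
    / sum((x - xbar) ** 2 for x in xs)
a = ybar - b * xbar
print(f"y = {a:.3f} + {b:.3f}x")
```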
Reconstructed and projected U.S. residential natural gas consumption during 1896-2043
USDA-ARS?s Scientific Manuscript database
Using state-level monthly heating degree day (HDD) data, per-capita natural gas (NG) consumption records for each state of the continental U.S. were calculated during 1895-2014 using linear regressions. The regressed monthly NG values estimate the effects of 20th and early 21st century climate varia...
New method for calculating a mathematical expression for streamflow recession
Rutledge, Albert T.
1991-01-01
An empirical method has been devised to calculate the master recession curve, which is a mathematical expression for streamflow recession during times of negligible direct runoff. The method is based on the assumption that the storage-delay factor, which is the time per log cycle of streamflow recession, varies linearly with the logarithm of streamflow. The resulting master recession curve can be nonlinear. The method can be executed by a computer program that reads a data file of daily mean streamflow, then allows the user to select several near-linear segments of streamflow recession. The storage-delay factor for each segment is one of the coefficients of the equation that results from linear least-squares regression. Using results for each recession segment, a mathematical expression of the storage-delay factor as a function of the log of streamflow is determined by linear least-squares regression. The master recession curve, which is a second-order polynomial expression for time as a function of log of streamflow, is then derived using the coefficients of this function.
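A hedged sketch of the construction: the storage-delay factor K (days per log cycle of recession) from several near-linear segments is regressed on log10(Q); integrating dt/d(logQ) = -K(logQ) then gives the second-order polynomial for time versus log streamflow. Segment values are placeholders.

```python
import numpy as np

logq_mid = np.array([2.8, 2.4, 2.0, 1.6])   # log10 of mean flow per recession segment
K = np.array([35.0, 45.0, 58.0, 66.0])      # storage-delay factor per segment (days)

b, a = np.polyfit(logq_mid, K, 1)           # K(logQ) = a + b*logQ
logq0 = 3.0                                 # starting flow for the master curve
# time (days) to recede from logq0 down to logq:
t = lambda logq: -(a * (logq - logq0) + 0.5 * b * (logq**2 - logq0**2))
print(t(2.0))
```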
Principal component regression analysis with SPSS.
Liu, R X; Kuang, J; Gong, Q; Hou, X L
2003-06-01
The paper introduces the indices used in multicollinearity diagnosis, the basic principle of principal component regression, and a method for determining the 'best' equation. An example is used to describe how to perform principal component regression analysis with SPSS 10.0, including the full calculation process of the principal component regression and the operation of the linear regression, factor analysis, descriptives, compute variable and bivariate correlations procedures in SPSS 10.0. Principal component regression analysis can be used to overcome the disturbance of multicollinearity, and it yields a simplified, faster and accurate statistical analysis.
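The article works in SPSS; an equivalent principal component regression in Python replaces collinear predictors with their leading components before ordinary regression. Data are synthetic, with two deliberately collinear columns.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(7)
x1 = rng.normal(size=300)
X = np.column_stack([x1, x1 + rng.normal(0, 0.05, 300), rng.normal(size=300)])
y = 2 * x1 + X[:, 2] + rng.normal(0, 0.5, 300)

pcr = make_pipeline(StandardScaler(), PCA(n_components=2), LinearRegression())
pcr.fit(X, y)
print(round(pcr.score(X, y), 3))   # R^2 of the PCR fit
```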
Fernández-Fernández, Mario; Rodríguez-González, Pablo; García Alonso, J Ignacio
2016-10-01
We have developed a novel, rapid and easy calculation procedure for Mass Isotopomer Distribution Analysis based on multiple linear regression which allows the simultaneous calculation of the precursor pool enrichment and the fraction of newly synthesized labelled proteins (fractional synthesis) using linear algebra. To test this approach, we used the peptide RGGGLK as a model tryptic peptide containing three subunits of glycine. We selected glycine labelled in two 13C atoms (13C2-glycine) as the labelled amino acid to demonstrate that spectral overlap is not a problem in the proposed methodology. The developed methodology was tested first in vitro by changing the precursor pool enrichment from 10 to 40% of 13C2-glycine. Secondly, a simulated in vivo synthesis of proteins was designed by combining the natural abundance RGGGLK peptide and 10 or 20% 13C2-glycine at 1:1, 1:3 and 3:1 ratios. Precursor pool enrichments and fractional synthesis values were calculated with satisfactory precision and accuracy using a simple spreadsheet. This novel approach can provide a relatively rapid and easy means to measure protein turnover based on stable isotope tracers.
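A hedged sketch of one way to set up the problem for a three-glycine peptide: the observed mass-isotopomer pattern is modelled as a mixture of old (unlabelled) and newly synthesized peptide, with the labelled pattern binomial in the precursor enrichment p. The paper solves both unknowns simultaneously by multiple linear regression; here a grid over p with least squares for the mixture fractions stands in, and natural 13C abundance is ignored for brevity.

```python
import numpy as np
from scipy.stats import binom

def labelled_pattern(p):
    """P(k of 3 glycines carry the 13C2 label) -> mass shifts 0, 2, 4, 6 Da."""
    return binom.pmf(np.arange(4), 3, p)

# Synthetic "observed" spectrum: 60% newly synthesized at 20% enrichment.
old = np.array([1.0, 0.0, 0.0, 0.0])
obs = 0.4 * old + 0.6 * labelled_pattern(0.2)

best_p, best_f, best_sse = None, None, np.inf
for p in np.linspace(0.05, 0.40, 71):
    A = np.column_stack([old, labelled_pattern(p)])
    coef, *_ = np.linalg.lstsq(A, obs, rcond=None)
    sse = float(np.sum((obs - A @ coef) ** 2))
    if sse < best_sse:
        best_p, best_f, best_sse = p, coef[1], sse
print(f"precursor enrichment ~ {best_p:.2f}, fractional synthesis ~ {best_f:.2f}")
```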
Liquid electrolyte informatics using an exhaustive search with linear regression.
Sodeyama, Keitaro; Igarashi, Yasuhiko; Nakayama, Tomofumi; Tateyama, Yoshitaka; Okada, Masato
2018-06-14
Exploring new liquid electrolyte materials is a fundamental target for developing new high-performance lithium-ion batteries. In contrast to solid materials, disordered liquid solution properties have been less studied by data-driven information techniques. Here, we examined the estimation accuracy and efficiency of three information techniques, multiple linear regression (MLR), least absolute shrinkage and selection operator (LASSO), and exhaustive search with linear regression (ES-LiR), by using coordination energy and melting point as test liquid properties. We then confirmed that ES-LiR gives the most accurate estimation among the techniques. We also found that ES-LiR can provide the relationship between the "prediction accuracy" and "calculation cost" of the properties via a weight diagram of descriptors. This technique makes it possible to choose the balance of the "accuracy" and "cost" when the search of a huge amount of new materials was carried out.
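A minimal sketch of the exhaustive-search idea: every descriptor subset is fitted by linear regression and ranked by cross-validated error, which is also the information a weight diagram summarizes. The six candidate descriptors and the target are synthetic placeholders.

```python
import numpy as np
from itertools import combinations
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(8)
X = rng.normal(size=(60, 6))                   # 6 candidate descriptors
y = 3 * X[:, 0] - 2 * X[:, 3] + rng.normal(0, 0.3, 60)

results = []
for k in range(1, X.shape[1] + 1):
    for subset in combinations(range(X.shape[1]), k):
        cv = cross_val_score(LinearRegression(), X[:, list(subset)], y,
                             scoring="neg_mean_squared_error", cv=5)
        results.append((-cv.mean(), subset))
print(min(results))   # best CV error and the descriptor subset achieving it
```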
Granato, Gregory E.
2006-01-01
The Kendall-Theil Robust Line software (KTRLine-version 1.0) is a Visual Basic program that may be used with the Microsoft Windows operating system to calculate parameters for robust, nonparametric estimates of linear-regression coefficients between two continuous variables. The KTRLine software was developed by the U.S. Geological Survey, in cooperation with the Federal Highway Administration, for use in stochastic data modeling with local, regional, and national hydrologic data sets to develop planning-level estimates of potential effects of highway runoff on the quality of receiving waters. The Kendall-Theil robust line was selected because this robust nonparametric method is resistant to the effects of outliers and nonnormality in residuals that commonly characterize hydrologic data sets. The slope of the line is calculated as the median of all possible pairwise slopes between points. The intercept is calculated so that the line will run through the median of input data. A single-line model or a multisegment model may be specified. The program was developed to provide regression equations with an error component for stochastic data generation because nonparametric multisegment regression tools are not available with the software that is commonly used to develop regression models. The Kendall-Theil robust line is a median line and, therefore, may underestimate total mass, volume, or loads unless the error component or a bias correction factor is incorporated into the estimate. Regression statistics such as the median error, the median absolute deviation, the prediction error sum of squares, the root mean square error, the confidence interval for the slope, and the bias correction factor for median estimates are calculated by use of nonparametric methods. These statistics, however, may be used to formulate estimates of mass, volume, or total loads. The program is used to read a two- or three-column tab-delimited input file with variable names in the first row and data in subsequent rows. The user may choose the columns that contain the independent (X) and dependent (Y) variable. A third column, if present, may contain metadata such as the sample-collection location and date. The program screens the input files and plots the data. The KTRLine software is a graphical tool that facilitates development of regression models by use of graphs of the regression line with data, the regression residuals (with X or Y), and percentile plots of the cumulative frequency of the X variable, Y variable, and the regression residuals. The user may individually transform the independent and dependent variables to reduce heteroscedasticity and to linearize data. The program plots the data and the regression line. The program also prints model specifications and regression statistics to the screen. The user may save and print the regression results. The program can accept data sets that contain up to about 15,000 XY data points, but because the program must sort the array of all pairwise slopes, the program may be perceptibly slow with data sets that contain more than about 1,000 points.
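The same estimator is available outside the KTRLine GUI: scipy's Theil-Sen implementation takes the median of all pairwise slopes and runs the line through the medians of the data, matching the description above. Placeholder data include one outlier to show the resistance.

```python
import numpy as np
from scipy.stats import theilslopes

rng = np.random.default_rng(9)
x = np.linspace(0, 10, 25)
y = 2.0 * x + 1.0 + rng.normal(0, 0.5, 25)
y[5] = 40.0                                    # gross outlier

slope, intercept, lo, hi = theilslopes(y, x)   # 95% CI on the slope by default
print(f"y = {intercept:.2f} + {slope:.2f}x, slope CI [{lo:.2f}, {hi:.2f}]")
```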
Vaeth, Michael; Skovlund, Eva
2004-06-15
For a given regression problem it is possible to identify a suitably defined equivalent two-sample problem such that the power or sample size obtained for the two-sample problem also applies to the regression problem. For a standard linear regression model the equivalent two-sample problem is easily identified, but for generalized linear models and for Cox regression models the situation is more complicated. An approximately equivalent two-sample problem may, however, also be identified here. In particular, we show that for logistic regression and Cox regression models the equivalent two-sample problem is obtained by selecting two equally sized samples for which the parameters differ by a value equal to the slope times twice the standard deviation of the independent variable and further requiring that the overall expected number of events is unchanged. In a simulation study we examine the validity of this approach to power calculations in logistic regression and Cox regression models. Several different covariate distributions are considered for selected values of the overall response probability and a range of alternatives. For the Cox regression model we consider both constant and non-constant hazard rates. The results show that in general the approach is remarkably accurate even in relatively small samples. Some discrepancies are, however, found in small samples with few events and a highly skewed covariate distribution. Comparison with results based on alternative methods for logistic regression models with a single continuous covariate indicates that the proposed method is at least as good as its competitors. The method is easy to implement and therefore provides a simple way to extend the range of problems that can be covered by the usual formulas for power and sample size determination.
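A rough sketch of the recipe for the logistic case: two equal groups whose log-odds differ by beta times twice the SD of the covariate, centered to keep the overall event probability approximately fixed, with power from the usual two-proportion normal approximation. The numbers are illustrative and this is only an approximation to the paper's method.

```python
import numpy as np
from scipy.stats import norm

def power_logistic(beta, sd_x, p_overall, n_total, alpha=0.05):
    delta = beta * 2 * sd_x                      # log-odds difference between groups
    logit = np.log(p_overall / (1 - p_overall))
    p1 = 1 / (1 + np.exp(-(logit - delta / 2)))
    p2 = 1 / (1 + np.exp(-(logit + delta / 2)))
    n = n_total / 2                              # two equally sized samples
    se = np.sqrt(p1 * (1 - p1) / n + p2 * (1 - p2) / n)
    z = abs(p2 - p1) / se
    return norm.cdf(z - norm.ppf(1 - alpha / 2))

print(round(power_logistic(beta=0.5, sd_x=1.0, p_overall=0.3, n_total=200), 3))
```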
A Weighted Least Squares Approach To Robustify Least Squares Estimates.
ERIC Educational Resources Information Center
Lin, Chowhong; Davenport, Ernest C., Jr.
This study developed a robust linear regression technique based on the idea of weighted least squares. In this technique, a subsample of the full data of interest is drawn, based on a measure of distance, and an initial set of regression coefficients is calculated. The rest of the data points are then taken into the subsample, one after another,…
Ruuska, Salla; Hämäläinen, Wilhelmiina; Kajava, Sari; Mughal, Mikaela; Matilainen, Pekka; Mononen, Jaakko
2018-03-01
The aim of the present study was to evaluate empirically confusion matrices in device validation. We compared the confusion matrix method to linear regression and error indices in the validation of a device measuring feeding behaviour of dairy cattle. In addition, we studied how to extract additional information on classification errors with confusion probabilities. The data consisted of 12 h behaviour measurements from five dairy cows; feeding and other behaviour were detected simultaneously with a device and from video recordings. The resulting 216 000 pairs of classifications were used to construct confusion matrices and calculate performance measures. In addition, hourly durations of each behaviour were calculated and the accuracy of measurements was evaluated with linear regression and error indices. All three validation methods agreed when the behaviour was detected very accurately or inaccurately. Otherwise, in the intermediate cases, the confusion matrix method and error indices produced relatively concordant results, but the linear regression method often disagreed with them. Our study supports the use of confusion matrix analysis in validation since it is robust to any data distribution and type of relationship, it makes a stringent evaluation of validity, and it offers extra information on the type and sources of errors.
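A minimal sketch of the confusion-matrix validation: device and video classifications paired second by second (1 = feeding, 0 = other), summarized into sensitivity, specificity and precision. The labels are synthetic with a 10% error rate.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(10)
video = rng.integers(0, 2, 1000)                              # reference labels
device = np.where(rng.random(1000) < 0.9, video, 1 - video)   # 10% misclassified

tn, fp, fn, tp = confusion_matrix(video, device).ravel()
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
precision = tp / (tp + fp)
print(f"sens={sensitivity:.2f}, spec={specificity:.2f}, prec={precision:.2f}")
```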
Estimation of standard liver volume in Chinese adult living donors.
Fu-Gui, L; Lu-Nan, Y; Bo, L; Yong, Z; Tian-Fu, W; Ming-Qing, X; Wen-Tao, W; Zhe-Yu, C
2009-12-01
To determine a formula predicting the standard liver volume based on body surface area (BSA) or body weight in Chinese adults. A total of 115 consecutive right-lobe living donors not including the middle hepatic vein underwent right hemi-hepatectomy. No organs were used from prisoners, and no subjects were prisoners. Donor anthropometric data including age, gender, body weight, and body height were recorded prospectively. The weights and volumes of the right lobe liver grafts were measured at the back table. Liver weights and volumes were calculated from the right lobe graft weight and volume obtained at the back table, divided by the proportion of the right lobe on computed tomography. By simple linear regression analysis and stepwise multiple linear regression analysis, we correlated calculated liver volume and body height, body weight, or body surface area. The subjects had a mean age of 35.97 +/- 9.6 years, and a female-to-male ratio of 60:55. The mean volume of the right lobe was 727.47 +/- 136.17 mL, occupying 55.59% +/- 6.70% of the whole liver by computed tomography. The volume of the right lobe was 581.73 +/- 96.137 mL, and the estimated liver volume was 1053.08 +/- 167.56 mL. Females of the same body weight showed a slightly lower liver weight. By simple linear regression analysis and stepwise multiple linear regression analysis, a formula was derived based on body weight. All formulae except the Hong Kong formula overestimated liver volume compared to this formula. The formula of standard liver volume, SLV (mL) = 11.508 x body weight (kg) + 334.024, may be applied to estimate liver volumes in Chinese adults.
Non-Linear Approach in Kinesiology Should Be Preferred to the Linear--A Case of Basketball.
Trninić, Marko; Jeličić, Mario; Papić, Vladan
2015-07-01
In kinesiology, medicine, biology and psychology, in which research focuses on dynamical self-organized systems, complex connections exist between variables. The non-linear nature of complex systems is discussed and illustrated through the example of non-linear anthropometric predictors of performance in basketball. Previous studies interpreted relations between anthropometric features and measures of effectiveness in basketball by (a) using linear correlation models and (b) including all basketball athletes in the same sample of participants regardless of playing position. In this paper, the significance and character of linear and non-linear relations between simple anthropometric predictors (AP) and performance criteria consisting of situation-related measures of effectiveness (SE) in basketball were determined and evaluated. The sample of participants consisted of top-level junior basketball players who played 8 minutes or more per game, divided into three groups by playing position: guards (N = 42), forwards (N = 26) and centers (N = 40). Linear (general model) and non-linear (general model) regression models were calculated simultaneously and separately for each group. The conclusion: non-linear regressions are frequently superior to linear correlations when interpreting the actual association logic among research variables.
NASA Astrophysics Data System (ADS)
Rock, N. M. S.; Duffy, T. R.
REGRES allows a range of regression equations to be calculated for paired sets of data values in which both variables are subject to error (i.e. neither is the "independent" variable). Nonparametric regressions, based on medians of all possible pairwise slopes and intercepts, are treated in detail. Estimated slopes and intercepts are output, along with confidence limits, Spearman and Kendall rank correlation coefficients. Outliers can be rejected with user-determined stringency. Parametric regressions can be calculated for any value of λ (the ratio of the variances of the random errors for y and x), including: (1) major axis (λ = 1); (2) reduced major axis (λ = variance of y/variance of x); (3) Y on X (λ = infinity); or (4) X on Y (λ = 0) solutions. Pearson linear correlation coefficients also are output. REGRES provides an alternative to conventional isochron assessment techniques where bivariate normal errors cannot be assumed, or weighting methods are inappropriate.
Mapping diffuse photosynthetically active radiation from satellite data in Thailand
NASA Astrophysics Data System (ADS)
Choosri, P.; Janjai, S.; Nunez, M.; Buntoung, S.; Charuchittipan, D.
2017-12-01
In this paper, a method for calculating monthly average hourly diffuse photosynthetically active radiation (PAR) from satellite data is proposed. Diffuse PAR was analyzed at four stations in Thailand. A radiative transfer model was used to calculate the diffuse PAR for cloudless-sky conditions; differences between the all-sky diffuse PAR obtained from ground-based measurements and the model values represent cloud effects. Two models are developed: one describing diffuse PAR only as a function of solar zenith angle, and a second as a multiple linear regression in which solar zenith angle and satellite reflectivity act linearly and aerosol optical depth acts logarithmically. When tested with an independent data set, the multiple regression model performed best, with a higher coefficient of determination R² (0.78 vs. 0.70), a lower root mean square difference (RMSD) (12.92% vs. 13.05%) and the same mean bias difference (MBD) of -2.20%. Results from the multiple regression model are used to map diffuse PAR throughout the country as monthly averages of hourly data.
Scarneciu, Camelia C; Sangeorzan, Livia; Rus, Horatiu; Scarneciu, Vlad D; Varciu, Mihai S; Andreescu, Oana; Scarneciu, Ioan
2017-01-01
This study aimed at assessing the incidence of pulmonary hypertension (PH) in newly diagnosed hyperthyroid patients and at finding a simple model showing the complex functional relation between pulmonary hypertension in hyperthyroidism and the factors causing it. The 53 hyperthyroid patients (H-group) were evaluated mainly by echocardiography and compared with 35 euthyroid (E-group) and 25 healthy people (C-group). In order to identify the factors causing pulmonary hypertension, the statistical method of comparing the values of arithmetical means was used. The functional relation between the two random variables (PAPs and each of the factors determining it within our research study) can be expressed by a linear or non-linear function. By applying the linear regression method described by a first-degree equation, the line of regression (linear model) was determined; by applying the non-linear regression method described by a second-degree equation, a parabola-type curve of regression (non-linear or polynomial model) was determined. We compared and validated these two models by calculating the determination coefficient (criterion 1), comparing residuals (criterion 2), applying the AIC criterion (criterion 3) and using the F-test (criterion 4). In the H-group, 47% had pulmonary hypertension that was completely reversible on reaching euthyroidism. The factors causing pulmonary hypertension were identified: previously known factors were the level of free thyroxin, pulmonary vascular resistance and cardiac output; new factors identified in this study were pretreatment period, age and systolic blood pressure. According to the four criteria and to clinical judgment, we consider the polynomial model (graphically, parabola-type) better than the linear one. The better model showing the functional relation between pulmonary hypertension in hyperthyroidism and the factors identified in this study is given by a second-degree polynomial equation whose graphical representation is a parabola.
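To illustrate the kind of model comparison described (criteria 1 and 3), here is a hedged Python sketch comparing a first-degree and a second-degree polynomial fit by R² and a Gaussian-likelihood AIC; the helper name and AIC form are assumptions, not the authors' exact procedure:

```python
import numpy as np

def fit_poly(x, y, degree):
    # Least-squares polynomial fit; returns coefficients, R^2 and AIC.
    x, y = np.asarray(x, float), np.asarray(y, float)
    coeffs = np.polyfit(x, y, degree)
    resid = y - np.polyval(coeffs, x)
    n, k = len(y), degree + 1
    rss = float(resid @ resid)
    r2 = 1.0 - rss / float(((y - y.mean()) ** 2).sum())
    aic = n * np.log(rss / n) + 2 * k  # Gaussian log-likelihood form
    return coeffs, r2, aic

# Compare fit_poly(x, y, 1) with fit_poly(x, y, 2); the lower AIC wins
# only if the extra quadratic term earns its keep.
```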
Equilibrium, kinetics and process design of acid yellow 132 adsorption onto red pine sawdust.
Can, Mustafa
2015-01-01
Linear and non-linear regression procedures were applied to the Langmuir, Freundlich, Temkin, Dubinin-Radushkevich, and Redlich-Peterson isotherms for adsorption of acid yellow 132 (AY132) dye onto red pine (Pinus resinosa) sawdust. The effects of parameters such as particle size, stirring rate, contact time, dye concentration, adsorbent dose, pH, and temperature were investigated, and the interaction was characterized by Fourier transform infrared spectroscopy and field emission scanning electron microscopy. The non-linear form of the Langmuir isotherm equation was found to be the best-fitting model for the equilibrium data. The maximum monolayer adsorption capacity was found to be 79.5 mg/g. The calculated thermodynamic results suggested that AY132 adsorption onto red pine sawdust was an exothermic, physisorption-driven, and spontaneous process. Kinetics were analyzed by four different kinetic equations using non-linear regression analysis. The pseudo-second-order equation provided the best fit to the experimental data.
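For context, a non-linear Langmuir fit of the kind reported can be done directly with scipy's curve_fit; the data values below are hypothetical placeholders, not the study's measurements:

```python
import numpy as np
from scipy.optimize import curve_fit

def langmuir(Ce, qmax, KL):
    # Langmuir isotherm: qe = qmax * KL * Ce / (1 + KL * Ce)
    return qmax * KL * Ce / (1.0 + KL * Ce)

# Hypothetical equilibrium data (Ce in mg/L, qe in mg/g)
Ce = np.array([5.0, 10.0, 25.0, 50.0, 100.0, 200.0])
qe = np.array([15.2, 26.8, 45.1, 58.9, 69.5, 75.0])

(qmax, KL), _ = curve_fit(langmuir, Ce, qe, p0=(80.0, 0.02))
print(f"qmax = {qmax:.1f} mg/g, KL = {KL:.3f} L/mg")
```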
NASA Astrophysics Data System (ADS)
Meric de Bellefon, G.; van Duysen, J. C.; Sridharan, K.
2017-08-01
The stacking fault energy (SFE) plays an important role in deformation behavior and radiation damage of FCC metals and alloys such as austenitic stainless steels. In the present communication, existing expressions to calculate SFE in those steels from chemical composition are reviewed and an improved multivariate linear regression with random intercepts is used to analyze a new database of 144 SFE measurements collected from 30 literature references. It is shown that the use of random intercepts can account for experimental biases in these literature references. A new expression to predict SFE from austenitic stainless steel compositions is proposed.
Pease, J M; Morselli, M F
1987-01-01
This paper deals with a computer program adapted to a statistical method for analyzing an unlimited quantity of binary-recorded data of an independent circular variable (e.g. wind direction) and a linear variable (e.g. maple sap flow volume). Circular variables cannot be statistically analyzed with linear methods unless they have been transformed. The program calculates a critical quantity, the acrophase angle (PHI, φ0). The technique is adapted from original mathematics [1] and is written in Fortran 77 for easier conversion between computer networks. Correlation analysis can be performed following the program, as can regression, which, because of the circular nature of the independent variable, becomes periodic regression. The technique was tested on a file of approximately 4050 data pairs.
Delgado, J; Liao, J C
1992-01-01
The methodology previously developed for determining the Flux Control Coefficients [Delgado & Liao (1992) Biochem. J. 282, 919-927] is extended to the calculation of metabolite Concentration Control Coefficients. It is shown that the transient metabolite concentrations are related by a few algebraic equations, attributed to mass balance, stoichiometric constraints, quasi-equilibrium or quasi-steady states, and kinetic regulations. The coefficients in these relations can be estimated using linear regression, and can be used to calculate the Control Coefficients. The theoretical basis and two examples are discussed. Although the methodology is derived based on the linear approximation of enzyme kinetics, it yields reasonably good estimates of the Control Coefficients for systems with non-linear kinetics. PMID:1497632
Double-time correlation functions of two quantum operations in open systems
NASA Astrophysics Data System (ADS)
Ban, Masashi
2017-10-01
A double-time correlation function of two arbitrary quantum operations is studied for a nonstationary open quantum system in contact with a thermal reservoir. It includes a usual correlation function, a linear response function, and a weak value of an observable. Time evolution of the correlation function can be derived by means of the time-convolution and time-convolutionless projection operator techniques. For this purpose, a quasidensity operator accompanied by a fictitious field is introduced, which makes it possible to derive explicit formulas for calculating a double-time correlation function in the second-order approximation with respect to the system-reservoir interaction. The derived formula explicitly shows that the quantum regression theorem for calculating the double-time correlation function cannot be used if the thermal reservoir has a finite correlation time. Furthermore, the formula is applied to a pure dephasing process and a linear dissipative process. The quantum regression theorem and the Leggett-Garg inequality are investigated for an open two-level system. The results are compared with those obtained by exact calculation to examine whether the formula is a good approximation.
Linear Modeling and Evaluation of Controls on Flow Response in Western Post-Fire Watersheds
NASA Astrophysics Data System (ADS)
Saxe, S.; Hogue, T. S.; Hay, L.
2015-12-01
This research investigates the impact of wildfires on watershed flow regimes throughout the western United States, specifically focusing on the evaluation of fire events within specified subregions and the determination of the impact of climate and geophysical variables on post-fire flow response. Fire events were collected through federal and state-level databases, and streamflow data were collected from U.S. Geological Survey stream gages. 263 watersheds were identified with at least 10 years of continuous pre-fire daily streamflow records and 5 years of continuous post-fire daily flow records. For each watershed, percent changes in runoff ratio (RO), annual seven-day low flows (7Q2) and annual seven-day high flows (7Q10) were calculated from pre- to post-fire. Numerous independent variables were identified for each watershed and fire event, including topographic, land cover, climate, burn severity, and soils data. The national watersheds were divided into five regions through K-clustering, and a lasso linear regression model, applying the leave-one-out calibration method, was calculated for each region. Nash-Sutcliffe Efficiency (NSE) was used to determine the accuracy of the resulting models. The regions encompassing the United States along and west of the Rocky Mountains, excluding the coastal watersheds, produced the most accurate linear models. The Pacific coast region models produced poor and inconsistent results, indicating that the regions need to be further subdivided. Presently, the runoff-ratio and high-flow response variables appear to be more easily modeled than the low-flow variable. Results of linear regression modeling showed varying importance of watershed and fire event variables, with conflicting correlation between land cover types and soil types by region. The addition of further independent variables and constriction of current variables based on correlation indicators is ongoing and should allow for more accurate linear regression modeling.
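The Nash-Sutcliffe Efficiency used to score the regional models has a simple closed form; a minimal sketch (the function name is assumed):

```python
import numpy as np

def nash_sutcliffe(obs, sim):
    # NSE = 1 - SSE / total variance of the observations; 1 is a perfect fit.
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - ((obs - sim) ** 2).sum() / ((obs - obs.mean()) ** 2).sum()
```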
Classification of sodium MRI data of cartilage using machine learning.
Madelin, Guillaume; Poidevin, Frederick; Makrymallis, Antonios; Regatte, Ravinder R
2015-11-01
To assess the possible utility of machine learning for classifying subjects with and without osteoarthritis using sodium magnetic resonance imaging data. Theory: Support vector machine, k-nearest neighbors, naïve Bayes, discriminant analysis, linear regression, logistic regression, neural networks, decision tree, and tree bagging were tested. Sodium magnetic resonance imaging with and without fluid suppression by inversion recovery was acquired on the knee cartilage of 19 controls and 28 osteoarthritis patients. Sodium concentrations were measured in regions of interest in the knee for both acquisitions. The mean (MEAN) and standard deviation (STD) of these concentrations were measured in each region of interest, and the minimum, maximum, and mean of these two measurements were calculated over all regions of interest for each subject. The resulting 12 variables per subject were used as predictors for classification. Either Min [STD] alone, or in combination with Mean [MEAN] or Min [MEAN], all from fluid-suppressed data, were the best predictors with an accuracy >74%, mainly with linear logistic regression and linear support vector machine. Other good classifiers included discriminant analysis, linear regression, and naïve Bayes. Machine learning is a promising technique for classifying osteoarthritis patients and controls from sodium magnetic resonance imaging data. © 2014 Wiley Periodicals, Inc.
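A hedged sketch of the classifier-comparison workflow, using scikit-learn and random placeholder features in place of the sodium measurements (the sample sizes follow the abstract; everything else is an assumption):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(47, 12))            # 12 concentration summaries per subject
y = np.r_[np.zeros(19), np.ones(28)]     # 19 controls, 28 osteoarthritis patients

for clf in (LogisticRegression(max_iter=1000), SVC(kernel="linear")):
    acc = cross_val_score(clf, X, y, cv=5, scoring="accuracy").mean()
    print(type(clf).__name__, round(acc, 2))
```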
NASA Astrophysics Data System (ADS)
Wu, Cheng; Zhen Yu, Jian
2018-03-01
Linear regression techniques are widely used in atmospheric science, but they are often improperly applied due to a lack of consideration or inappropriate handling of measurement uncertainty. In this work, numerical experiments are performed to evaluate the performance of five linear regression techniques, significantly extending previous work by Chu and Saylor. The five techniques are ordinary least squares (OLS), Deming regression (DR), orthogonal distance regression (ODR), weighted ODR (WODR), and York regression (YR). We first introduce a new data generation scheme that employs the Mersenne twister (MT) pseudorandom number generator. The numerical simulations are also improved by (a) refining the parameterization of nonlinear measurement uncertainties, (b) inclusion of a linear measurement uncertainty, and (c) inclusion of WODR for comparison. Results show that DR, WODR and YR produce an accurate slope, but the intercept by WODR and YR is overestimated, and the degree of bias is more pronounced with a low-R² XY dataset. The importance of a proper weighting parameter λ in DR is investigated by sensitivity tests, and it is found that an improper λ in DR can lead to bias in both the slope and intercept estimates. Because the λ calculation depends on the actual form of the measurement error, it is essential to determine the exact form of measurement error in the XY data during the measurement stage. If the a priori error in one of the variables is unknown, or the assumed form of the measurement error cannot be trusted, DR, WODR and YR provide the least bias in slope and intercept among all tested regression techniques. For these reasons, DR, WODR and YR are recommended for atmospheric studies when both X and Y data have measurement errors. An Igor Pro-based program (Scatter Plot) was developed to facilitate the implementation of error-in-variables regressions.
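Deming regression, one of the recommended error-in-variables techniques, has a closed-form slope once λ (here, the ratio of the y- to x-error variances) is fixed. A minimal sketch under that definition of λ:

```python
import numpy as np

def deming(x, y, lam=1.0):
    # lam = ratio of error variances (y over x); lam = 1 gives orthogonal regression.
    x, y = np.asarray(x, float), np.asarray(y, float)
    sxx, syy = np.var(x, ddof=1), np.var(y, ddof=1)
    sxy = np.cov(x, y, ddof=1)[0, 1]
    slope = (syy - lam * sxx
             + np.sqrt((syy - lam * sxx) ** 2 + 4.0 * lam * sxy ** 2)) / (2.0 * sxy)
    intercept = y.mean() - slope * x.mean()
    return slope, intercept
```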
Evaluation of confidence intervals for a steady-state leaky aquifer model
Christensen, S.; Cooley, R.L.
1999-01-01
The fact that dependent variables of groundwater models are generally nonlinear functions of model parameters is shown to be a potentially significant factor in calculating accurate confidence intervals for both model parameters and functions of the parameters, such as the values of dependent variables calculated by the model. The Lagrangian method of Vecchia and Cooley [Vecchia, A.V. and Cooley, R.L., Water Resources Research, 1987, 23(7), 1237-1250] was used to calculate nonlinear Scheffe-type confidence intervals for the parameters and the simulated heads of a steady-state groundwater flow model covering 450 km2 of a leaky aquifer. The nonlinear confidence intervals are compared to corresponding linear intervals. As suggested by the significant nonlinearity of the regression model, linear confidence intervals are often not accurate. The commonly made assumption that widths of linear confidence intervals always underestimate the actual (nonlinear) widths was not correct. Results show that nonlinear effects can cause the nonlinear intervals to be asymmetric and either larger or smaller than the linear approximations. Prior information on transmissivities helps reduce the size of the confidence intervals, with the most notable effects occurring for the parameters on which there is prior information and for head values in parameter zones for which there is prior information on the parameters.
Muradian, Kh K; Utko, N O; Mozzhukhina, T H; Pishel', I M; Litoshenko, O Ia; Bezrukov, V V; Fraĭfel'd, V E
2002-01-01
Correlation and regression relations between gaseous exchange, thermoregulation and mitochondrial protein content were analyzed by two- and three-dimensional statistics in mice. It was shown that pairwise linear methods of analysis did not reveal any significant correlation between the parameters under study. However, correlations became evident with three-dimensional and non-linear plotting, for which the coefficients of multivariable correlation reached and even exceeded 0.7-0.8. Calculations based on partial differentiation of the multivariable regression equations lead to the conclusion that, at certain values of VO2, VCO2 and body temperature, negative relations between the systems of gaseous exchange and thermoregulation become dominant.
TI-59 Programs for Multiple Regression.
1980-05-01
The general linear hypothesis model of full rank [Graybill, 1961] can be written as Y = Xβ + ε, ε ~ N(0, σ²I), where Y is the n×1 vector of observations, X is the n×k design matrix, β is the k×1 vector of coefficients, and ε is the n×1 error vector ... a "reduced model" solution, and confidence intervals for linear functions of the coefficients can be obtained using (X′X) and σ̂², based on the t ... For the general linear hypothesis model Y = Xβ + ε, the program calculates ...
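A minimal numpy sketch of the least-squares computation for this model; the variable names are illustrative:

```python
import numpy as np

def ols_fit(X, Y):
    # beta_hat = (X'X)^{-1} X'Y for Y = X beta + eps, eps ~ N(0, sigma^2 I)
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ Y
    resid = Y - X @ beta
    n, k = X.shape
    sigma2 = float(resid @ resid) / (n - k)   # unbiased residual variance
    return beta, sigma2, XtX_inv              # basis for coefficient CIs
```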
NASA Astrophysics Data System (ADS)
Kutzbach, L.; Schneider, J.; Sachs, T.; Giebels, M.; Nykänen, H.; Shurpali, N. J.; Martikainen, P. J.; Alm, J.; Wilmking, M.
2007-07-01
Closed (non-steady state) chambers are widely used for quantifying carbon dioxide (CO2) fluxes between soils or low-stature canopies and the atmosphere. It is well recognised that covering a soil or vegetation by a closed chamber inherently disturbs the natural CO2 fluxes by altering the concentration gradients between the soil, the vegetation and the overlying air. Thus, the driving factors of CO2 fluxes are not constant during the closed chamber experiment, and no linear increase or decrease of CO2 concentration over time within the chamber headspace can be expected. Nevertheless, linear regression has been applied for calculating CO2 fluxes in many recent, partly influential, studies. This approach was justified by keeping the closure time short and assuming the concentration change over time to be in the linear range. Here, we test if the application of linear regression is really appropriate for estimating CO2 fluxes using closed chambers over short closure times and if the application of nonlinear regression is necessary. We developed a nonlinear exponential regression model from diffusion and photosynthesis theory. This exponential model was tested with four different datasets of CO2 flux measurements (total number: 1764) conducted at three peatland sites in Finland and a tundra site in Siberia. The flux measurements were performed using transparent chambers on vegetated surfaces and opaque chambers on bare peat surfaces. Thorough analyses of residuals demonstrated that linear regression was frequently not appropriate for the determination of CO2 fluxes by closed-chamber methods, even if closure times were kept short. The developed exponential model was well suited for nonlinear regression of the concentration over time c(t) evolution in the chamber headspace and estimation of the initial CO2 fluxes at closure time for the majority of experiments. CO2 flux estimates by linear regression can be as low as 40% of the flux estimates of exponential regression for closure times of only two minutes and even lower for longer closure times. The degree of underestimation increased with increasing CO2 flux strength and is dependent on soil and vegetation conditions which can disturb not only the quantitative but also the qualitative evaluation of CO2 flux dynamics. The underestimation effect by linear regression was observed to be different for CO2 uptake and release situations which can lead to stronger bias in the daily, seasonal and annual CO2 balances than in the individual fluxes. To avoid serious bias of CO2 flux estimates based on closed chamber experiments, we suggest further tests using published datasets and recommend the use of nonlinear regression models for future closed chamber studies.
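A sketch of the contrast between the two fitting approaches, using a generic exponential-saturation headspace model (the paper derives its own functional form from diffusion and photosynthesis theory, so treat this shape and all values as assumptions):

```python
import numpy as np
from scipy.optimize import curve_fit

def chamber_exp(t, c0, cs, k):
    # c(t) = cs + (c0 - cs) * exp(-k t); initial flux ~ dc/dt(0) = k (cs - c0)
    return cs + (c0 - cs) * np.exp(-k * t)

t = np.linspace(0, 120, 25)                    # 2-minute closure, seconds
c = 380 + 40 * (1 - np.exp(-0.01 * t))         # synthetic CO2 rise, ppm

lin_flux = np.polyfit(t, c, 1)[0]              # linear estimate, ppm/s
(c0, cs, k), _ = curve_fit(chamber_exp, t, c, p0=(c[0], c[-1], 0.01))
exp_flux = k * (cs - c0)                       # exponential estimate at t = 0
print(lin_flux / exp_flux)                     # the linear fit underestimates
```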
Aptel, Florent; Sayous, Romain; Fortoul, Vincent; Beccat, Sylvain; Denis, Philippe
2010-12-01
To evaluate and compare the regional relationships between visual field sensitivity and retinal nerve fiber layer (RNFL) thickness as measured by spectral-domain optical coherence tomography (OCT) and scanning laser polarimetry. Prospective cross-sectional study. One hundred and twenty eyes of 120 patients (40 with healthy eyes, 40 with suspected glaucoma, and 40 with glaucoma) were tested on Cirrus-OCT, GDx VCC, and standard automated perimetry. Raw data on RNFL thickness were extracted for 256 peripapillary sectors of 1.40625 degrees each for the OCT measurement ellipse and 64 peripapillary sectors of 5.625 degrees each for the GDx VCC measurement ellipse. Correlations between peripapillary RNFL thickness in 6 sectors and visual field sensitivity in the 6 corresponding areas were evaluated using linear and logarithmic regression analysis. Areas under the receiver operating characteristic curve were calculated for each instrument. With spectral-domain OCT, the correlations (r²) between RNFL thickness and visual field sensitivity ranged from 0.082 (nasal RNFL and corresponding visual field area, linear regression) to 0.726 (supratemporal RNFL and corresponding visual field area, logarithmic regression). By comparison, with GDx VCC, the correlations ranged from 0.062 (temporal RNFL and corresponding visual field area, linear regression) to 0.362 (supratemporal RNFL and corresponding visual field area, logarithmic regression). In pairwise comparisons, these structure-function correlations were generally stronger with spectral-domain OCT than with GDx VCC, and with logarithmic regression than with linear regression. The largest areas under the receiver operating characteristic curve were seen for OCT superior thickness (0.963 ± 0.022; P < .001) in eyes with glaucoma and for OCT average thickness (0.888 ± 0.072; P < .001) in eyes with suspected glaucoma. The structure-function relationship was significantly stronger with spectral-domain OCT than with scanning laser polarimetry, and was better expressed logarithmically than linearly. Measurements with these 2 instruments should not be considered to be interchangeable. Copyright © 2010 Elsevier Inc. All rights reserved.
Predicting major element mineral/melt equilibria - A statistical approach
NASA Technical Reports Server (NTRS)
Hostetler, C. J.; Drake, M. J.
1980-01-01
Empirical equations have been developed for calculating the mole fractions of NaO0.5, MgO, AlO1.5, SiO2, KO0.5, CaO, TiO2, and FeO in a solid phase of initially unknown identity given only the composition of the coexisting silicate melt. The approach involves a linear multivariate regression analysis in which solid composition is expressed as a Taylor series expansion of the liquid compositions. An internally consistent precision of approximately 0.94 is obtained, that is, the nature of the liquidus phase in the input data set can be correctly predicted for approximately 94% of the entries. The composition of the liquidus phase may be calculated to better than 5 mol % absolute. An important feature of this 'generalized solid' model is its reversibility; that is, the dependent and independent variables in the linear multivariate regression may be inverted to permit prediction of the composition of a silicate liquid produced by equilibrium partial melting of a polymineralic source assemblage.
Shillcutt, Samuel D; LeFevre, Amnesty E; Fischer-Walker, Christa L; Taneja, Sunita; Black, Robert E; Mazumder, Sarmila
2017-01-01
This study evaluates the cost-effectiveness of the DAZT program for scaling up treatment of acute child diarrhea in Gujarat, India, using a net-benefit regression framework. Costs were calculated from societal and caregivers' perspectives, and effectiveness was assessed in terms of coverage of zinc and of both zinc and Oral Rehydration Salts (ORS). Regression models were tested in simple linear regression, with a specified set of covariates, and with a specified set of covariates and interaction terms; linear regression with endogenous treatment effects was used as the reference case. The DAZT program was cost-effective with over 95% certainty above $5.50 and $7.50 per appropriately treated child in the unadjusted and adjusted models respectively, with specifications including interaction terms being cost-effective with 85-97% certainty. Findings from this study should be combined with other evidence when considering decisions to scale up programs such as the DAZT program to promote the use of ORS and zinc to treat child diarrhea.
O'Leary, Neil; Chauhan, Balwantray C; Artes, Paul H
2012-10-01
To establish a method for estimating the overall statistical significance of visual field deterioration from an individual patient's data, and to compare its performance to pointwise linear regression. The Truncated Product Method was used to calculate a statistic S that combines evidence of deterioration from individual test locations in the visual field. The overall statistical significance (P value) of visual field deterioration was inferred by comparing S with its permutation distribution, derived from repeated reordering of the visual field series. Permutation of pointwise linear regression (PoPLR) and pointwise linear regression were evaluated in data from patients with glaucoma (944 eyes, median mean deviation -2.9 dB, interquartile range: -6.3, -1.2 dB) followed for more than 4 years (median 10 examinations over 8 years). False-positive rates were estimated from randomly reordered series of this dataset, and hit rates (proportion of eyes with significant deterioration) were estimated from the original series. The false-positive rates of PoPLR were indistinguishable from the corresponding nominal significance levels and were independent of baseline visual field damage and length of follow-up. At P < 0.05, the hit rates of PoPLR were 12, 29, and 42%, at the fifth, eighth, and final examinations, respectively, and at matching specificities they were consistently higher than those of pointwise linear regression. In contrast to population-based progression analyses, PoPLR provides a continuous estimate of statistical significance for visual field deterioration individualized to a particular patient's data. This allows close control over specificity, essential for monitoring patients in clinical practice and in clinical trials.
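A hedged sketch of the core PoPLR computation: combine pointwise p-values with the Truncated Product statistic and calibrate it against permutations of the series order. The helper `pointwise_pvalues` is a hypothetical stand-in for the per-location linear-regression p-values:

```python
import numpy as np

def truncated_product_stat(pvals, tau=0.05):
    # Sum of logs of p-values <= tau; more negative = stronger deterioration.
    p = np.asarray(pvals, float)
    return float(np.sum(np.log(p[p <= tau])))

def poplr_pvalue(series, pointwise_pvalues, n_perm=1000, seed=0):
    # series: visits x locations array; permute the visit order to get the null.
    rng = np.random.default_rng(seed)
    s_obs = truncated_product_stat(pointwise_pvalues(series))
    s_perm = [truncated_product_stat(pointwise_pvalues(rng.permutation(series)))
              for _ in range(n_perm)]
    return float(np.mean([s <= s_obs for s in s_perm]))
```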
Independent contrasts and PGLS regression estimators are equivalent.
Blomberg, Simon P; Lefevre, James G; Wells, Jessie A; Waterhouse, Mary
2012-05-01
We prove that the slope parameter of the ordinary least squares regression of phylogenetically independent contrasts (PICs) conducted through the origin is identical to the slope parameter of the method of generalized least squares (GLSs) regression under a Brownian motion model of evolution. This equivalence has several implications: 1. Understanding the structure of the linear model for GLS regression provides insight into when and why phylogeny is important in comparative studies. 2. The limitations of the PIC regression analysis are the same as the limitations of the GLS model. In particular, phylogenetic covariance applies only to the response variable in the regression and the explanatory variable should be regarded as fixed. Calculation of PICs for explanatory variables should be treated as a mathematical idiosyncrasy of the PIC regression algorithm. 3. Since the GLS estimator is the best linear unbiased estimator (BLUE), the slope parameter estimated using PICs is also BLUE. 4. If the slope is estimated using different branch lengths for the explanatory and response variables in the PIC algorithm, the estimator is no longer the BLUE, so this is not recommended. Finally, we discuss whether or not and how to accommodate phylogenetic covariance in regression analyses, particularly in relation to the problem of phylogenetic uncertainty. This discussion is from both frequentist and Bayesian perspectives.
NASA Astrophysics Data System (ADS)
Kutzbach, L.; Schneider, J.; Sachs, T.; Giebels, M.; Nykänen, H.; Shurpali, N. J.; Martikainen, P. J.; Alm, J.; Wilmking, M.
2007-11-01
Closed (non-steady state) chambers are widely used for quantifying carbon dioxide (CO2) fluxes between soils or low-stature canopies and the atmosphere. It is well recognised that covering a soil or vegetation by a closed chamber inherently disturbs the natural CO2 fluxes by altering the concentration gradients between the soil, the vegetation and the overlying air. Thus, the driving factors of CO2 fluxes are not constant during the closed chamber experiment, and no linear increase or decrease of CO2 concentration over time within the chamber headspace can be expected. Nevertheless, linear regression has been applied for calculating CO2 fluxes in many recent, partly influential, studies. This approach has been justified by keeping the closure time short and assuming the concentration change over time to be in the linear range. Here, we test if the application of linear regression is really appropriate for estimating CO2 fluxes using closed chambers over short closure times and if the application of nonlinear regression is necessary. We developed a nonlinear exponential regression model from diffusion and photosynthesis theory. This exponential model was tested with four different datasets of CO2 flux measurements (total number: 1764) conducted at three peatland sites in Finland and a tundra site in Siberia. Thorough analyses of residuals demonstrated that linear regression was frequently not appropriate for the determination of CO2 fluxes by closed-chamber methods, even if closure times were kept short. The developed exponential model was well suited for nonlinear regression of the concentration over time c(t) evolution in the chamber headspace and estimation of the initial CO2 fluxes at closure time for the majority of experiments. However, a rather large percentage of the exponential regression functions showed curvatures not consistent with the theoretical model, which is considered to be caused by violations of the underlying model assumptions. In particular, the effects of turbulence and pressure disturbances caused by chamber deployment are suspected to have produced unexplainable curvatures. CO2 flux estimates by linear regression can be as low as 40% of the flux estimates of exponential regression for closure times of only two minutes. The degree of underestimation increased with increasing CO2 flux strength and was dependent on soil and vegetation conditions, which can disturb not only the quantitative but also the qualitative evaluation of CO2 flux dynamics. The underestimation effect by linear regression was observed to be different for CO2 uptake and release situations, which can lead to stronger bias in the daily, seasonal and annual CO2 balances than in the individual fluxes. To avoid serious bias of CO2 flux estimates based on closed chamber experiments, we suggest further tests using published datasets and recommend the use of nonlinear regression models for future closed chamber studies.
Linear and nonlinear models for predicting fish bioconcentration factors for pesticides.
Yuan, Jintao; Xie, Chun; Zhang, Ting; Sun, Jinfang; Yuan, Xuejie; Yu, Shuling; Zhang, Yingbiao; Cao, Yunyuan; Yu, Xingchen; Yang, Xuan; Yao, Wu
2016-08-01
This work is devoted to the applications of the multiple linear regression (MLR), multilayer perceptron neural network (MLP NN) and projection pursuit regression (PPR) to quantitative structure-property relationship analysis of bioconcentration factors (BCFs) of pesticides tested on Bluegill (Lepomis macrochirus). Molecular descriptors of a total of 107 pesticides were calculated with the DRAGON Software and selected by inverse enhanced replacement method. Based on the selected DRAGON descriptors, a linear model was built by MLR, nonlinear models were developed using MLP NN and PPR. The robustness of the obtained models was assessed by cross-validation and external validation using test set. Outliers were also examined and deleted to improve predictive power. Comparative results revealed that PPR achieved the most accurate predictions. This study offers useful models and information for BCF prediction, risk assessment, and pesticide formulation. Copyright © 2016 Elsevier Ltd. All rights reserved.
Mohd Yusof, Mohd Yusmiaidil Putera; Cauwels, Rita; Deschepper, Ellen; Martens, Luc
2015-08-01
Third molar development (TMD) has been widely utilized as one of the radiographic methods for dental age estimation. Using the same radiograph of the same individual, third molar eruption (TME) information can be incorporated into the TMD regression model. This study aims to evaluate the performance of dental age estimation in the individual method models and the combined model (TMD and TME) based on the classic regressions of multiple linear and principal component analysis. A sample of 705 digital panoramic radiographs of Malay sub-adults aged between 14.1 and 23.8 years was collected. The techniques described by Gleiser and Hunt (modified by Kohler) and Olze were employed to stage the TMD and TME, respectively. The data were divided to develop three respective models based on the two regressions of multiple linear and principal component analysis. The trained models were then validated on the test sample and the accuracy of age prediction was compared between models. The coefficient of determination (R²) and root mean square error (RMSE) were calculated. In both sexes, the adjusted R² increased in the linear regressions of the combined model compared with the individual models. An overall decrease in RMSE was detected in the combined model compared with TMD (0.03-0.06) and TME (0.2-0.8). In principal component regression, the combined model exhibited low adjusted R² and high RMSE, except in males. Dental age is better predicted using the combined model in multiple linear regression. Copyright © 2015 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
Improvement of Storm Forecasts Using Gridded Bayesian Linear Regression for Northeast United States
NASA Astrophysics Data System (ADS)
Yang, J.; Astitha, M.; Schwartz, C. S.
2017-12-01
Bayesian linear regression (BLR) is a post-processing technique in which regression coefficients are derived and used to correct raw forecasts based on pairs of observation-model values. This study presents the development and application of a gridded Bayesian linear regression (GBLR) as a new post-processing technique to improve numerical weather prediction (NWP) of rain and wind storm forecasts over northeast United States. Ten controlled variables produced from ten ensemble members of the National Center for Atmospheric Research (NCAR) real-time prediction system are used for a GBLR model. In the GBLR framework, leave-one-storm-out cross-validation is utilized to study the performances of the post-processing technique in a database composed of 92 storms. To estimate the regression coefficients of the GBLR, optimization procedures that minimize the systematic and random error of predicted atmospheric variables (wind speed, precipitation, etc.) are implemented for the modeled-observed pairs of training storms. The regression coefficients calculated for meteorological stations of the National Weather Service are interpolated back to the model domain. An analysis of forecast improvements based on error reductions during the storms will demonstrate the value of GBLR approach. This presentation will also illustrate how the variances are optimized for the training partition in GBLR and discuss the verification strategy for grid points where no observations are available. The new post-processing technique is successful in improving wind speed and precipitation storm forecasts using past event-based data and has the potential to be implemented in real-time.
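For reference, the conjugate Gaussian form of Bayesian linear regression that underlies this kind of coefficient estimation can be written in a few lines; the prior and noise settings below are generic assumptions, not the GBLR configuration:

```python
import numpy as np

def blr_posterior(X, y, sigma2=1.0, tau2=10.0):
    # Prior beta ~ N(0, tau2 I), likelihood y ~ N(X beta, sigma2 I).
    k = X.shape[1]
    precision = X.T @ X / sigma2 + np.eye(k) / tau2
    cov = np.linalg.inv(precision)
    mean = cov @ X.T @ y / sigma2
    return mean, cov   # posterior mean and covariance of the coefficients
```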
Introductory Linear Regression Programs in Undergraduate Chemistry.
ERIC Educational Resources Information Center
Gale, Robert J.
1982-01-01
Presented are simple programs in BASIC and FORTRAN to apply the method of least squares. They calculate gradients and intercepts and express errors as standard deviations. An introduction of undergraduate students to such programs in a chemistry class is reviewed, and issues instructors should be aware of are noted. (MP)
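A Python analogue of what such a teaching program computes (gradient, intercept, and their standard deviations from the residual scatter); a sketch under the usual equal-weight assumptions:

```python
import numpy as np

def least_squares_with_errors(x, y):
    # Fit y = a + b x and report standard deviations of a and b.
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    delta = n * (x ** 2).sum() - x.sum() ** 2
    b = (n * (x * y).sum() - x.sum() * y.sum()) / delta
    a = ((x ** 2).sum() * y.sum() - x.sum() * (x * y).sum()) / delta
    s2 = ((y - a - b * x) ** 2).sum() / (n - 2)   # residual variance
    sa = np.sqrt(s2 * (x ** 2).sum() / delta)     # std. dev. of intercept
    sb = np.sqrt(n * s2 / delta)                  # std. dev. of gradient
    return a, b, sa, sb
```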
40 CFR 86.1341-90 - Test cycle validation criteria.
Code of Federal Regulations, 2011 CFR
2011-07-01
... 40 Protection of Environment 19 2011-07-01 2011-07-01 false Test cycle validation criteria. 86... Procedures § 86.1341-90 Test cycle validation criteria. (a) To minimize the biasing effect of the time lag... brake horsepower-hour. (c) Regression line analysis to calculate validation statistics. (1) Linear...
40 CFR 86.1341-90 - Test cycle validation criteria.
Code of Federal Regulations, 2013 CFR
2013-07-01
... 40 Protection of Environment 20 2013-07-01 2013-07-01 false Test cycle validation criteria. 86... Procedures § 86.1341-90 Test cycle validation criteria. (a) To minimize the biasing effect of the time lag... brake horsepower-hour. (c) Regression line analysis to calculate validation statistics. (1) Linear...
40 CFR 86.1341-90 - Test cycle validation criteria.
Code of Federal Regulations, 2012 CFR
2012-07-01
... 40 Protection of Environment 20 2012-07-01 2012-07-01 false Test cycle validation criteria. 86... Procedures § 86.1341-90 Test cycle validation criteria. (a) To minimize the biasing effect of the time lag... brake horsepower-hour. (c) Regression line analysis to calculate validation statistics. (1) Linear...
Year-round measurements of CH4 exchange in a forested drained peatland using automated chambers
NASA Astrophysics Data System (ADS)
Korkiakoski, Mika; Koskinen, Markku; Penttilä, Timo; Arffman, Pentti; Ojanen, Paavo; Minkkinen, Kari; Laurila, Tuomas; Lohila, Annalea
2016-04-01
Pristine peatlands are usually carbon-accumulating ecosystems and sources of methane (CH4). Draining peatlands for forestry increases the thickness of the oxic layer, thus enhancing CH4 oxidation, which leads to decreased CH4 emissions. Closed chambers are commonly used in estimating the greenhouse gas exchange between the soil and the atmosphere. However, the closed chamber technique alters the gas concentration gradient, making the concentration development against time non-linear. Selecting the correct fitting method is important, as it can be the largest source of uncertainty in flux calculation. We measured CH4 exchange rates and their diurnal and seasonal variations in a nutrient-rich drained peatland located in southern Finland. The original fen was drained for forestry in the 1970s and the tree stand is now a mixture of Scots pine, Norway spruce and Downy birch. Our system consisted of six transparent polycarbonate chambers and stainless steel frames, positioned on different types of field and moss layer. During winter, the frame was raised above the snowpack with extension collars and the height of the snowpack inside the chamber was measured regularly. The chambers were closed hourly and the sample gas was drawn into a cavity ring-down spectrometer and analysed for CH4, CO2 and H2O concentration at 5-second time resolution. The concentration change over time at the beginning of a closure was determined with linear and exponential fits. The results show that linear regression systematically underestimated the CH4 flux by 20-50% when compared to exponential regression. On the other hand, exponential regression seemed not to work reliably with small fluxes (< 3.5 μg CH4 m-2 h-1): using exponential regression in such cases typically resulted in anomalously large fluxes and high deviation. For these reasons, we recommend first calculating the flux with linear regression and, if the flux is high enough, recalculating it with exponential regression and using that value in later analysis. The forest floor at the site (including the ground vegetation) acted as a CH4 sink most of the time. CH4 emission peaks were occasionally observed, particularly in spring during snow melt and during rainfall events in summer. Diurnal variation was observed mainly in summer. The net CH4 exchange for the two-year measurement period in the six chambers varied from -31 to -155 mg CH4 m-2 yr-1, with an average of -67 mg CH4 m-2 yr-1. However, this does not include the ditches, which typically act as a significant source of CH4.
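The two-step rule recommended above (linear fit first, exponential refit only for sufficiently large fluxes) can be sketched as follows; units, conversion factors, and the exponential fitting routine are site-specific and left as assumptions:

```python
import numpy as np

def ch4_flux(t, c, threshold=3.5, exp_flux_fn=None):
    # Step 1: linear estimate from the concentration slope over time.
    flux = np.polyfit(t, c, 1)[0]        # placeholder: slope as flux proxy
    # Step 2: refit exponentially only when the flux exceeds the threshold.
    if exp_flux_fn is not None and abs(flux) >= threshold:
        flux = exp_flux_fn(t, c)
    return flux
```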
A web-based normative calculator for the uniform data set (UDS) neuropsychological test battery.
Shirk, Steven D; Mitchell, Meghan B; Shaughnessy, Lynn W; Sherman, Janet C; Locascio, Joseph J; Weintraub, Sandra; Atri, Alireza
2011-11-11
With the recent publication of new criteria for the diagnosis of preclinical Alzheimer's disease (AD), there is a need for neuropsychological tools that take premorbid functioning into account in order to detect subtle cognitive decline. Using demographic adjustments is one method for increasing the sensitivity of commonly used measures. We sought to provide a useful online z-score calculator that yields estimates of percentile ranges and adjusts individual performance based on sex, age and/or education for each of the neuropsychological tests of the National Alzheimer's Coordinating Center Uniform Data Set (NACC, UDS). In addition, we aimed to provide an easily accessible method of creating norms for other clinical researchers for their own, unique data sets. Data from 3,268 clinically cognitively-normal older UDS subjects from a cohort reported by Weintraub and colleagues (2009) were included. For all neuropsychological tests, z-scores were estimated by subtracting the raw score from the predicted mean and then dividing this difference score by the root mean squared error term (RMSE) for a given linear regression model. For each neuropsychological test, an estimated z-score was calculated for any raw score based on five different models that adjust for the demographic predictors of SEX, AGE and EDUCATION, either concurrently, individually or without covariates. The interactive online calculator allows the entry of a raw score and provides five corresponding estimated z-scores based on predictions from each corresponding linear regression model. The calculator produces percentile ranks and graphical output. An interactive, regression-based, normative score online calculator was created to serve as an additional resource for UDS clinical researchers, especially in guiding interpretation of individual performances that appear to fall in borderline realms and may be of particular utility for operationalizing subtle cognitive impairment present according to the newly proposed criteria for Stage 3 preclinical Alzheimer's disease.
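The norming scheme described reduces to a linear model per test plus its RMSE; a minimal sketch (the sign convention follows the abstract, which divides the predicted-minus-raw difference by the RMSE; function names are assumptions):

```python
import numpy as np

def fit_norms(X, raw):
    # X: demographic predictors (e.g. sex, age, education); raw: test scores.
    X1 = np.column_stack([np.ones(len(raw)), X])       # add intercept
    beta, *_ = np.linalg.lstsq(X1, raw, rcond=None)
    rmse = np.sqrt(((raw - X1 @ beta) ** 2).mean())
    return beta, rmse

def z_score(beta, rmse, covariates, raw_score):
    pred = beta[0] + float(np.dot(beta[1:], covariates))
    return (pred - raw_score) / rmse
```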
Tokunaga, Makoto; Watanabe, Susumu; Sonoda, Shigeru
2017-09-01
Multiple linear regression analysis is often used to predict the outcome of stroke rehabilitation. However, the predictive accuracy may not be satisfactory. The objective of this study was to elucidate the predictive accuracy of calculating motor Functional Independence Measure (mFIM) at discharge from the mFIM effectiveness predicted by multiple regression analysis. The subjects were 505 patients with stroke who were hospitalized in a convalescent rehabilitation hospital. The formula "mFIM at discharge = mFIM effectiveness × (91 points - mFIM at admission) + mFIM at admission" was used. By including the predicted mFIM effectiveness obtained through multiple regression analysis in this formula, we obtained the predicted mFIM at discharge (A). We also used multiple regression analysis to directly predict mFIM at discharge (B). The correlation between the predicted and measured values of mFIM at discharge was compared between A and B. The correlation coefficients were .916 for A and .878 for B. Calculating mFIM at discharge from the predicted mFIM effectiveness thus gave higher predictive accuracy than predicting mFIM at discharge directly. Copyright © 2017 National Stroke Association. Published by Elsevier Inc. All rights reserved.
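The quoted formula makes the two-step prediction easy to reproduce; a one-function sketch with an illustrative (not study-derived) example:

```python
def mfim_discharge(mfim_admission, predicted_effectiveness):
    # mFIM at discharge = effectiveness * (91 - mFIM at admission) + mFIM at admission
    return predicted_effectiveness * (91 - mfim_admission) + mfim_admission

# e.g. admission mFIM of 50 and predicted effectiveness of 0.6 -> 74.6
print(mfim_discharge(50, 0.6))
```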
A Regression Framework for Effect Size Assessments in Longitudinal Modeling of Group Differences
Feingold, Alan
2013-01-01
The use of growth modeling analysis (GMA)--particularly multilevel analysis and latent growth modeling--to test the significance of intervention effects has increased exponentially in prevention science, clinical psychology, and psychiatry over the past 15 years. Model-based effect sizes for differences in means between two independent groups in GMA can be expressed in the same metric (Cohen’s d) commonly used in classical analysis and meta-analysis. This article first reviews conceptual issues regarding calculation of d for findings from GMA and then introduces an integrative framework for effect size assessments that subsumes GMA. The new approach uses the structure of the linear regression model, from which effect sizes for findings from diverse cross-sectional and longitudinal analyses can be calculated with familiar statistics, such as the regression coefficient, the standard deviation of the dependent measure, and study duration. PMID:23956615
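Consistent with the statistics named in the abstract (regression coefficient, standard deviation of the dependent measure, study duration), the growth-model effect size takes roughly the following form; the exact notation here is an assumption, not a quotation from the article:

$$ d \;=\; \frac{\hat{\beta}_{11} \times \text{duration}}{SD_{\text{raw}}} $$

where $\hat{\beta}_{11}$ is the estimated group-by-time coefficient and $SD_{\text{raw}}$ is the standard deviation of the raw (unstandardized) dependent measure.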
Healthy life expectancy in Hong Kong Special Administrative Region of China.
Law, C. K.; Yip, P. S. F.
2003-01-01
Sullivan's method and a regression model were used to calculate healthy life expectancy (HALE) for men and women in Hong Kong Special Administrative Region (Hong Kong SAR) of China. These methods need estimates of the prevalence and information on disability distributions of 109 diseases and HALE for 191 countries by age, sex and region of the world from the WHO's health assessment of 2000. The population of Hong Kong SAR has one of the highest healthy life expectancies in the world. Sullivan's method gives higher estimates than the classic linear regression method. Although Sullivan's method accurately calculates the influence of disease prevalence within small areas and regions, the regression method can approximate HALE for all economies for which information on life expectancy is available. This paper identifies some problems of the two methods and discusses the accuracy of estimates of HALE that rely on data from the WHO assessment. PMID:12640475
Linear models for calculating digestible energy for sheep diets.
Fonnesbeck, P V; Christiansen, M L; Harris, L E
1981-05-01
Equations for estimating the digestible energy (DE) content of sheep diets were generated from the chemical contents and a factorial description of diets fed to lambs in digestion trials. The diet factors were two forages (alfalfa and grass hay), harvested at three stages of maturity (late vegetative, early bloom and full bloom), fed in two ingredient combinations (all hay or a 50:50 hay and corn grain mixture) and prepared by two forage texture processes (coarsely chopped or finely chopped and pelleted). The 2 x 3 x 2 x 2 factorial arrangement produced 24 diet treatments. These were replicated twice, for a total of 48 lamb digestion trials. In model 1 regression equations, DE was calculated directly from the chemical composition of the diet. In model 2, regression equations predicted the percentage of digested nutrient from the chemical contents of the diet, and then DE of the diet was calculated as the sum of the gross energy of the digested organic components. Expanded forms of model 1 and model 2 were also developed that included diet factors as qualitative indicator variables to adjust the regression constant and regression coefficients for the diet description. The expanded forms of the equations accounted for significantly more variation in DE than did the simple models and more accurately estimated DE of the diet. Information provided by the diet description proved as useful as chemical analyses for the prediction of digestibility of nutrients. The statistics indicate that, with model 1, neutral detergent fiber and plant cell wall analyses provided as much information for the estimation of DE as did model 2 with the combined information from crude protein, available carbohydrate, total lipid, cellulose and hemicellulose. Regression equations are presented for estimating DE with the most commonly analyzed organic components, including linear and curvilinear variables and diet factors that significantly reduce the standard error of the estimate. To estimate DE of a diet, the user utilizes the equation that uses the chemical analysis information and diet description most effectively.
Yoneoka, Daisuke; Henmi, Masayuki
2017-06-01
Recently, the number of regression models has dramatically increased in several academic fields. However, within the context of meta-analysis, synthesis methods for such models have not developed at a commensurate pace. One of the difficulties hindering this development is the disparity in the sets of covariates among literature models. If the sets of covariates differ across models, interpretation of coefficients will differ, thereby making it difficult to synthesize them. Moreover, previous synthesis methods for regression models, such as multivariate meta-analysis, often have problems because the covariance matrix of coefficients (i.e. within-study correlations) or individual patient data are not necessarily available. This study therefore proposes a method to synthesize linear regression models under different covariate sets by using a generalized least squares method involving bias correction terms. In particular, we also propose an approach to recover (at most) three correlations of covariates, which is required for the calculation of the bias term without individual patient data. Copyright © 2016 John Wiley & Sons, Ltd.
Chaurasia, Ashok; Harel, Ofer
2015-02-10
Tests for regression coefficients such as global, local, and partial F-tests are common in applied research. In the framework of multiple imputation, there are several papers addressing tests for regression coefficients. However, for simultaneous hypothesis testing, the existing methods are computationally intensive because they involve calculation with vectors and (inversion of) matrices. In this paper, we propose a simple method based on the scalar entity, coefficient of determination, to perform (global, local, and partial) F-tests with multiply imputed data. The proposed method is evaluated using simulated data and applied to suicide prevention data. Copyright © 2014 John Wiley & Sons, Ltd.
Bioassay of the Nucleopolyhedrosis Virus of Neodiprion sertifer (Hymenoptera: Diprionidae)
M.A. Mohamed; J.D. Podgwaite
1982-01-01
Linear regression analysis of probit mortality versus several concentrations of nucleopolyhedrosis virus of Neodiprion sertifer resulted in the equation Y = 2.170 + 0.872X. An LC50 was calculated at 1758 PIB/ml. Also, the incubation time of the virus was dependent on its concentration. Most insect viruses possess the potential...
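As a check on the reported value: taking X as the log10 concentration and probit Y = 5 as 50% mortality (both standard conventions, assumed here), the regression equation reproduces the stated LC50,

$$ 5 = 2.170 + 0.872\,\log_{10}(\mathrm{LC}_{50}) \;\Rightarrow\; \log_{10}(\mathrm{LC}_{50}) = \frac{5 - 2.170}{0.872} \approx 3.245 \;\Rightarrow\; \mathrm{LC}_{50} \approx 1758\ \mathrm{PIB/ml}. $$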
Predicting flight delay based on multiple linear regression
NASA Astrophysics Data System (ADS)
Ding, Yi
2017-08-01
Delay of flights has been regarded as one of the toughest difficulties in aviation control. How to establish an effective model to handle the delay prediction problem is significant work. To address the difficulty of predicting flight delay, this study proposes a method to model arriving flights and a multiple linear regression algorithm to predict delay, compared with the Naive-Bayes and C4.5 approaches. Experiments based on a realistic dataset of domestic airports show that the accuracy of the proposed model approximates 80%, an improvement over the Naive-Bayes and C4.5 approaches. Testing shows that this method is convenient for calculation and can also predict flight delays effectively. It can provide a decision basis for airport authorities.
Bertrand-Krajewski, J L
2004-01-01
In order to replace traditional sampling and analysis techniques, turbidimeters can be used to estimate TSS concentration in sewers, by means of sensor- and site-specific empirical equations established by linear regression of on-site turbidity values T with TSS concentrations C measured in corresponding samples. As the ordinary least-squares method is not able to account for measurement uncertainties in both T and C variables, an appropriate regression method is used to solve this difficulty and to evaluate correctly the uncertainty in TSS concentrations estimated from measured turbidity. The regression method is described, including detailed calculations of variances and covariance in the regression parameters. An example of application is given for a calibrated turbidimeter used in a combined sewer system, with data collected during three dry weather days. In order to show how the established regression could be used, an independent 24-hour dry weather turbidity data series recorded at 2 min time intervals is used, transformed into estimated TSS concentrations, and compared to TSS concentrations measured in samples. The comparison appears satisfactory and suggests that turbidity measurements could replace traditional samples. Further developments, including wet weather periods and other types of sensors, are suggested.
Computerized dynamic posturography: the influence of platform stability on postural control.
Palm, Hans-Georg; Lang, Patricia; Strobel, Johannes; Riesner, Hans-Joachim; Friemert, Benedikt
2014-01-01
Postural stability can be quantified using posturography systems, which allow different foot platform stability settings to be selected. It is unclear, however, how platform stability and postural control are mathematically correlated. Twenty subjects performed tests on the Biodex Stability System at all 13 stability levels. Overall stability index, medial-lateral stability index, and anterior-posterior stability index scores were calculated, and data were analyzed using analysis of variance and linear regression analysis. A decrease in platform stability from the static level to the second least stable level was associated with a linear decrease in postural control. The overall stability index scores were 1.5 ± 0.8 degrees (static), 2.2 ± 0.9 degrees (level 8), and 3.6 ± 1.7 degrees (level 2). The slope of the regression lines was 0.17 for the men and 0.10 for the women. A linear correlation was demonstrated between platform stability and postural control. The influence of stability levels seems to be almost twice as high in men as in women.
Regression-based model of skin diffuse reflectance for skin color analysis
NASA Astrophysics Data System (ADS)
Tsumura, Norimichi; Kawazoe, Daisuke; Nakaguchi, Toshiya; Ojima, Nobutoshi; Miyake, Yoichi
2008-11-01
A simple regression-based model of skin diffuse reflectance is developed based on reflectance samples calculated by Monte Carlo simulation of light transport in a two-layered skin model. This reflectance model includes the values of spectral reflectance in the visible spectrum for Japanese women. The modified Lambert-Beer law holds in the proposed model with a modified mean free path length in non-linear density space. The average RMS and maximum errors of the proposed model were 1.1% and 3.1%, respectively, in the above range.
Advanced statistics: linear regression, part I: simple linear regression.
Marill, Keith A
2004-01-01
Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.
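For reference, the method of least squares mentioned above yields the closed-form estimates

$$ b = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}, \qquad a = \bar{y} - b\,\bar{x}, $$

for the fitted line $y = a + bx$.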
NASA Technical Reports Server (NTRS)
Stephenson, J. D.
1983-01-01
Flight experiments with an augmented jet flap STOL aircraft provided data from which the lateral directional stability and control derivatives were calculated by applying a linear regression parameter estimation procedure. The tests, which were conducted with the jet flaps set at a 65 deg deflection, covered a large range of angles of attack and engine power settings. The effect of changing the angle of the jet thrust vector was also investigated. Test results are compared with stability derivatives that had been predicted. The roll damping derived from the tests was significantly larger than had been predicted, whereas the other derivatives were generally in agreement with the predictions. Results obtained using a maximum likelihood estimation procedure are compared with those from the linear regression solutions.
Fananapazir, Ghaneh; Benzl, Robert; Corwin, Michael T; Chen, Ling-Xin; Sageshima, Junichiro; Stewart, Susan L; Troppmann, Christoph
2018-07-01
Purpose: To determine whether the predonation computed tomography (CT)-based volume of the future remnant kidney is predictive of postdonation renal function in living kidney donors. Materials and Methods: This institutional review board-approved, retrospective, HIPAA-compliant study included 126 live kidney donors who had undergone predonation renal CT between January 2007 and December 2014 as well as 2-year postdonation measurement of estimated glomerular filtration rate (eGFR). The whole kidney volume and cortical volume of the future remnant kidney were measured and standardized for body surface area (BSA). Bivariate linear associations between postdonation eGFR and the ratios of whole kidney volume to BSA and cortical volume to BSA were obtained. A linear regression model for 2-year postdonation eGFR that incorporated donor age, sex, and either whole kidney volume-to-BSA ratio or cortical volume-to-BSA ratio was created, and the coefficient of determination (R²) for the model was calculated. Factors not statistically additive in assessing 2-year eGFR were removed by using backward elimination, and the coefficient of determination for this parsimonious model was calculated. Results: Correlation was slightly better for cortical volume-to-BSA ratio than for whole kidney volume-to-BSA ratio (r = 0.48 vs r = 0.44, respectively). The linear regression model incorporating all donor factors had an R² of 0.66. The only factors that were significantly additive to the equation were cortical volume-to-BSA ratio and predonation eGFR (P = .01 and P < .01, respectively), and the final parsimonious linear regression model incorporating these two variables explained almost the same amount of variance (R² = 0.65) as did the full model. Conclusion: The cortical volume of the future remnant kidney helped predict postdonation eGFR at 2 years. The cortical volume-to-BSA ratio should thus be considered for addition as an important variable to living kidney donor evaluation and selection guidelines. © RSNA, 2018.
Stratospheric Ozone Trends and Variability as Seen by SCIAMACHY from 2002 to 2012
NASA Technical Reports Server (NTRS)
Gebhardt, C.; Rozanov, A.; Hommel, R.; Weber, M.; Bovensmann, H.; Burrows, J. P.; Degenstein, D.; Froidevaux, L.; Thompson, A. M.
2014-01-01
Vertical profiles of the rate of linear change (trend) in the altitude range 15-50 km are determined from decadal O3 time series obtained from SCIAMACHY/ENVISAT measurements in limb-viewing geometry. The trends are calculated using multivariate linear regression. Seasonal variations, the quasi-biennial oscillation, signatures of the solar cycle and the El Nino-Southern Oscillation are accounted for in the regression. The time range of trend calculation is August 2002-April 2012. The analysis focuses on the zonal bands 20 deg N-20 deg S (tropics), 50-60 deg N, and 50-60 deg S (midlatitudes). In the tropics, positive trends of up to 5% per decade between 20 and 30 km and negative trends of up to 10% per decade between 30 and 38 km are identified. Positive O3 trends of around 5% per decade are found in the upper stratosphere in the tropics and at midlatitudes. Comparisons between SCIAMACHY and EOS MLS show reasonable agreement both in the tropics and at midlatitudes for most altitudes. In the tropics, measurements from OSIRIS/Odin and SHADOZ are also analysed. These yield rates of linear change of O3 similar to those from SCIAMACHY. However, the trends from SCIAMACHY near 34 km in the tropics are larger than those from MLS and OSIRIS by a factor of around two.
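A minimal sketch of this style of multivariate trend regression, using synthetic monthly data and only an annual harmonic; in the study itself, proxies such as the QBO, solar-flux, and ENSO indices would enter as additional design-matrix columns.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(117) / 12.0               # ~117 months (Aug 2002-Apr 2012), in years
o3 = 2.0 - 0.05 * t + 0.3 * np.sin(2 * np.pi * t) + rng.normal(0, 0.1, t.size)

# design matrix: intercept, linear trend, annual harmonic
X = np.column_stack([np.ones_like(t), t,
                     np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])
coef, *_ = np.linalg.lstsq(X, o3, rcond=None)
trend_pct_per_decade = 10 * coef[1] / o3.mean() * 100
print(f"trend: {trend_pct_per_decade:.1f}% per decade")
```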
Giacomino, Agnese; Abollino, Ornella; Malandrino, Mery; Mentasti, Edoardo
2011-03-04
Single and sequential extraction procedures are used for studying element mobility and availability in solid matrices, like soils, sediments, sludge, and airborne particulate matter. In the first part of this review we reported an overview on these procedures and described the applications of chemometric uni- and bivariate techniques and of multivariate pattern recognition techniques based on variable reduction to the experimental results obtained. The second part of the review deals with the use of chemometrics not only for the visualization and interpretation of data, but also for the investigation of the effects of experimental conditions on the response, the optimization of their values and the calculation of element fractionation. We will describe the principles of the multivariate chemometric techniques considered, the aims for which they were applied and the key findings obtained. The following topics will be critically addressed: pattern recognition by cluster analysis (CA), linear discriminant analysis (LDA) and other less common techniques; modelling by multiple linear regression (MLR); investigation of spatial distribution of variables by geostatistics; calculation of fractionation patterns by a mixture resolution method (Chemometric Identification of Substrates and Element Distributions, CISED); optimization and characterization of extraction procedures by experimental design; other multivariate techniques less commonly applied. Copyright © 2010 Elsevier B.V. All rights reserved.
Solar energy distribution over Egypt using cloudiness from Meteosat photos
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mosalam Shaltout, M.A.; Hassen, A.H.
1990-01-01
In Egypt, there are 10 ground stations for measuring the global solar radiation, and five stations for measuring the diffuse solar radiation. Every day at noon, the Meteorological Authority in Cairo receives three photographs of cloudiness over Egypt from the Meteosat satellite, one in the visible band and two in the infra-red bands (10.5-12.5 μm and 5.7-7.1 μm). The monthly average cloudiness for 24 sites over Egypt was measured and calculated from Meteosat observations during the period 1985-1986. Correlation analysis between the cloudiness observed by Meteosat and the global solar radiation measured at the ground stations was carried out. It is found that the correlation coefficients are about 0.90 for the simple linear regression, and increase for the second- and third-degree regressions. Also, the correlation coefficients for the cloudiness with the diffuse solar radiation are about 0.80 for the simple linear regression, and increase for the second- and third-degree regressions. Models and empirical relations for estimating the global and diffuse solar radiation from Meteosat cloudiness data over Egypt are deduced and tested. Seasonal maps for the global and diffuse radiation over Egypt are produced.
Regression dilution bias: tools for correction methods and sample size calculation.
Berglund, Lars
2012-08-01
Random errors in measurement of a risk factor will introduce downward bias of an estimated association to a disease or a disease marker. This phenomenon is called regression dilution bias. A bias correction may be made with data from a validity study or a reliability study. In this article we give a non-technical description of designs of reliability studies with emphasis on selection of individuals for a repeated measurement, assumptions of measurement error models, and correction methods for the slope in a simple linear regression model where the dependent variable is a continuous variable. Also, we describe situations where correction for regression dilution bias is not appropriate. The methods are illustrated with the association between insulin sensitivity measured with the euglycaemic insulin clamp technique and fasting insulin, where measurement of the latter variable carries noticeable random error. We provide software tools for estimation of a corrected slope in a simple linear regression model assuming data for a continuous dependent variable and a continuous risk factor from a main study and an additional measurement of the risk factor in a reliability study. Also, we supply programs for estimation of the number of individuals needed in the reliability study and for choice of its design. Our conclusion is that correction for regression dilution bias is seldom applied in epidemiological studies. This may cause important effects of risk factors with large measurement errors to be neglected.
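The core slope correction can be sketched as follows, assuming duplicate measurements of the risk factor from a reliability study; the reliability ratio here is a simple method-of-moments estimate for illustration, not the article's supplied software.

```python
import numpy as np

def reliability_ratio(x1, x2):
    """Estimate lambda = var(true) / var(observed) from duplicate
    measurements, with error variance taken from paired differences."""
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    var_err = np.var(x1 - x2, ddof=1) / 2.0       # within-person error variance
    var_obs = np.var(np.concatenate([x1, x2]), ddof=1)
    return (var_obs - var_err) / var_obs

def corrected_slope(b_observed, lam):
    """Deattenuate the observed regression slope: b_true = b_obs / lambda."""
    return b_observed / lam

# e.g. corrected = corrected_slope(0.45, reliability_ratio(x_first, x_repeat))
```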
NASA Astrophysics Data System (ADS)
Kamaruddin, Ainur Amira; Ali, Zalila; Noor, Norlida Mohd.; Baharum, Adam; Ahmad, Wan Muhamad Amir W.
2014-07-01
Logistic regression analysis examines the influence of various factors on a dichotomous outcome by estimating the probability of the event's occurrence. Logistic regression, also called a logit model, is a statistical procedure used to model dichotomous outcomes. In the logit model, the log odds of the dichotomous outcome is modeled as a linear combination of the predictor variables. The log odds ratio in logistic regression provides a description of the probabilistic relationship of the variables and the outcome. In conducting logistic regression, selection procedures are used to select important predictor variables; diagnostics are used to check that model assumptions are valid, including independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers; and a test statistic is calculated to determine the aptness of the model. This study used the binary logistic regression model to investigate overweight and obesity among rural secondary school students on the basis of their demographic profile, medical history, diet and lifestyle. The results indicate that overweight and obesity of students are influenced by obesity in the family and the interaction between a student's ethnicity and routine meals intake. The odds of a student being overweight and obese are higher for a student having a family history of obesity and for a non-Malay student who frequently takes routine meals as compared to a Malay student.
Knol, Mirjam J; van der Tweel, Ingeborg; Grobbee, Diederick E; Numans, Mattijs E; Geerlings, Mirjam I
2007-10-01
To determine the presence of interaction in epidemiologic research, typically a product term is added to the regression model. In linear regression, the regression coefficient of the product term reflects interaction as departure from additivity. However, in logistic regression it refers to interaction as departure from multiplicativity. Rothman has argued that interaction estimated as departure from additivity better reflects biologic interaction. So far, literature on estimating interaction on an additive scale using logistic regression only focused on dichotomous determinants. The objective of the present study was to provide the methods to estimate interaction between continuous determinants and to illustrate these methods with a clinical example. Methods and Results: From the existing literature we derived the formulas to quantify interaction as departure from additivity between one continuous and one dichotomous determinant and between two continuous determinants using logistic regression. Bootstrapping was used to calculate the corresponding confidence intervals. To illustrate the theory with an empirical example, data from the Utrecht Health Project were used, with age and body mass index as risk factors for elevated diastolic blood pressure. The methods and formulas presented in this article are intended to assist epidemiologists to calculate interaction on an additive scale between two variables on a certain outcome. The proposed methods are included in a spreadsheet which is freely available at: http://www.juliuscenter.nl/additive-interaction.xls.
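A sketch of the standard additive-scale interaction measure (the relative excess risk due to interaction, RERI) computed from fitted logistic regression coefficients. For continuous determinants, b1 and b2 would first be multiplied by the chosen increments of each variable, and confidence intervals would come from bootstrapping, as the authors describe; the coefficient values below are placeholders.

```python
import numpy as np

def reri_from_logistic(b1, b2, b12):
    """RERI = OR11 - OR10 - OR01 + 1, using odds ratios as risk-ratio
    estimates; b12 is the fitted product-term (interaction) coefficient."""
    or11 = np.exp(b1 + b2 + b12)
    or10 = np.exp(b1)
    or01 = np.exp(b2)
    return or11 - or10 - or01 + 1.0

print(reri_from_logistic(0.40, 0.25, 0.10))  # placeholder coefficients
```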
Estimating linear effects in ANOVA designs: the easy way.
Pinhas, Michal; Tzelgov, Joseph; Ganor-Stern, Dana
2012-09-01
Research in cognitive science has documented numerous phenomena that are approximated by linear relationships. In the domain of numerical cognition, the use of linear regression for estimating linear effects (e.g., distance and SNARC effects) became common following Fias, Brysbaert, Geypens, and d'Ydewalle's (1996) study on the SNARC effect. While their work has become the model for analyzing linear effects in the field, it requires statistical analysis of individual participants and does not provide measures of the proportions of variability accounted for (cf. Lorch & Myers, 1990). In the present methodological note, using both the distance and SNARC effects as examples, we demonstrate how linear effects can be estimated in a simple way within the framework of repeated measures analysis of variance. This method allows for estimating effect sizes in terms of both slope and proportions of variability accounted for. Finally, we show that our method can easily be extended to estimate linear interaction effects, not just linear effects calculated as main effects.
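A minimal sketch of the contrast-based idea under stated assumptions: each participant's linear-trend score is formed with centered contrast weights across the ordered factor levels and tested against zero, with toy reaction-time data standing in for a real distance-effect dataset.

```python
import numpy as np
from scipy import stats

# rows = participants, columns = ordered factor levels (e.g., numerical distance)
rt = np.array([[520, 510, 500, 490],
               [560, 548, 541, 530],
               [600, 585, 575, 566.0]])
levels = np.array([1, 2, 3, 4.0])
w = levels - levels.mean()            # centered linear contrast weights

scores = rt @ w                       # one linear-trend score per participant
t, p = stats.ttest_1samp(scores, 0.0) # test the linear effect against zero
slopes = scores / np.sum(w ** 2)      # per-participant slope equivalents
print(t, p, slopes.mean())
```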
Aortic dimensions in Turner syndrome.
Quezada, Emilio; Lapidus, Jodi; Shaughnessy, Robin; Chen, Zunqiu; Silberbach, Michael
2015-11-01
In Turner syndrome, linear growth is less than in the general population. Consequently, to assess stature in Turner syndrome, condition-specific comparators have been employed. Similar reference curves for cardiac structures in Turner syndrome are currently unavailable. Accurate assessment of the aorta is particularly critical in Turner syndrome because aortic dissection and rupture occur more frequently than in the general population. Furthermore, comparing the shorter Turner syndrome population to references calculated from the taller general population can lead to over-estimation of aortic size, causing stigmatization, medicalization, and potentially over-treatment. We used echocardiography to measure aortic diameters at eight levels of the thoracic aorta in 481 healthy girls and women with Turner syndrome who ranged in age from two to seventy years. Univariate and multivariate linear regression analyses were performed to assess the influence of karyotype, age, body mass index, bicuspid aortic valve, blood pressure, history of renal disease, thyroid disease, and growth hormone therapy. Because only bicuspid aortic valve was found to independently affect aortic size, subjects with bicuspid aortic valve were excluded from the analysis. Regression equations for aortic diameters were calculated and Z-scores corresponding to 1, 2, and 3 standard deviations from the mean were plotted against body surface area. The information presented here will allow clinicians and other caregivers to calculate aortic Z-scores using a Turner-based reference population. © 2015 Wiley Periodicals, Inc.
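In code, such regression-based references reduce to a calculation of the form below; the coefficient names are placeholders, since the study's fitted values appear in its tables rather than in the abstract.

```python
def aortic_z(observed_mm, bsa, intercept, slope, sd_resid):
    """Z-score of an aortic diameter against a regression reference:
    predicted = intercept + slope * BSA; Z = (observed - predicted) / SD."""
    predicted = intercept + slope * bsa
    return (observed_mm - predicted) / sd_resid

# e.g. z = aortic_z(observed_mm=24.0, bsa=1.4,
#                   intercept=10.0, slope=9.0, sd_resid=2.0)  # placeholders
```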
Oki, Ryo; Ito, Kazuto; Suzuki, Rie; Fujizuka, Yuji; Arai, Seiji; Miyazawa, Yoshiyuki; Sekine, Yoshitaka; Koike, Hidekazu; Matsui, Hiroshi; Shibata, Yasuhiro; Suzuki, Kazuhiro
2018-04-26
Japan has experienced a drastic increase in the incidence of prostate cancer (PC). To assess changes in the risk for PC, we investigated baseline prostate specific antigen (PSA) levels in first-time screened men, across a 25-year period. In total, 72,654 men, aged 50-79, underwent first-time PSA screening in Gunma prefecture between 1992 and 2016. Changes in the distribution of PSA levels were investigated, including the percentage of men with a PSA above cut-off values and linear regression analyses comparing log10 PSA with age. The 'ultimate incidence' of PC and clinically significant PC (CSPC) were estimated using the PC risk calculator. Changes in the age-standardized incidence rate (AIR) during this period were analyzed. The calculated coefficients of linear regression for age versus log10 PSA fluctuated during the 25-year period, but no trend was observed. In addition, the percentage of men with a PSA above cut-off values varied in each 5-year period, with no specific trend. The 'risk calculator (RC)-based AIR' of PC and CSPC were stable between 1992 and 2016. Therefore, the baseline risk for developing PC has remained unchanged in the past 25 years, in Japan. The drastic increase in the incidence of PC, beginning around 2000, may be primarily due to increased PSA screening in the country. © 2018 UICC.
Estimating cost of large-fire suppression for three Forest Service regions
Eric L. Smith; Armando González-Cabán
1987-01-01
The annual costs attributable to large fire suppression in three Forest Service Regions (1970-1981) were estimated as a function of fire perimeters using linear regression. Costs calculated on a per-chain-of-perimeter basis were highest for the Pacific Northwest Region, next highest for the Northern Region, and lowest for the Intermountain Region. Recent costs in real...
Have the temperature time series a structural change after 1998?
NASA Astrophysics Data System (ADS)
Werner, Rolf; Valev, Dimitare; Danov, Dimitar
2012-07-01
The global and hemispheric temperature GISS and HadCRUT3 time series were analysed for structural changes. We postulate continuity of the underlying temperature function over time. The slopes are calculated for a sequence of segments limited by time thresholds. We used a standard method, restricted linear regression with dummy variables. We performed the calculations and tests for different numbers of thresholds. The thresholds are searched for continuously within specified time intervals. The F-statistic is used to obtain the time points of the structural changes.
Correlation and simple linear regression.
Eberly, Lynn E
2007-01-01
This chapter highlights important steps in using correlation and simple linear regression to address scientific questions about the association of two continuous variables with each other. These steps include estimation and inference, assessing model fit, the connection between regression and ANOVA, and study design. Examples in microbiology are used throughout. This chapter provides a framework that is helpful in understanding more complex statistical techniques, such as multiple linear regression, linear mixed effects models, logistic regression, and proportional hazards regression.
Can change in high-density lipoprotein cholesterol levels reduce cardiovascular risk?
Dean, Bonnie B; Borenstein, Jeff E; Henning, James M; Knight, Kevin; Merz, C Noel Bairey
2004-06-01
The cardiovascular risk reduction observed in many trials of lipid-lowering agents is greater than expected on the basis of observed low-density lipoprotein cholesterol (LDL-C) level reductions. Our objective was to explore the degree to which high-density lipoprotein cholesterol (HDL-C) level changes explain cardiovascular risk reduction. A systematic review identified trials of lipid-lowering agents reporting changes in HDL-C and LDL-C levels and the incidence of coronary heart disease (CHD). The observed relative risk reduction (RRR) in CHD morbidity and mortality rates was calculated. The expected RRR, given the treatment effect on total cholesterol level, was calculated for each trial with logistic regression coefficients from observational studies. The difference between observed and expected RRR was plotted against the change in HDL-C level, and a least-squares regression line was calculated. Fifty-one trials were identified. Nineteen statin trials addressed the association of HDL-C with CHD. Limited numbers of trials of other therapies precluded additional analyses. Among statin trials, therapy reduced total cholesterol levels as much as 32% and LDL-C levels as much as 45%. HDL-C level increases were <10%. Treatment effect on HDL-C levels was not a significant linear predictor of the difference in observed and expected CHD mortality rates, although we observed a trend in this direction (P =.08). Similarly, HDL-C effect was not a significant linear predictor of the difference between observed and expected RRRs for CHD morbidity (P =.20). Although a linear trend toward greater risk reduction was observed with greater effects on HDL-C, differences were not statistically significant. The narrow range of HDL-C level increases in the statin trials likely reduced our ability to detect a beneficial HDL-C effect, if present.
Hartzell, S.; Leeds, A.; Frankel, A.; Williams, R.A.; Odum, J.; Stephenson, W.; Silva, W.
2002-01-01
The Seattle fault poses a significant seismic hazard to the city of Seattle, Washington. A hybrid low-frequency/high-frequency method is used to calculate broadband (0-20 Hz) ground-motion time histories for a M 6.5 earthquake on the Seattle fault. High frequencies (above 1 Hz) are calculated by a stochastic method that uses a fractal subevent size distribution to give an ω⁻² displacement spectrum. Time histories are calculated for a grid of stations and then corrected for the local site response using a classification scheme based on the surficial geology. Average shear-wave velocity profiles are developed for six surficial geologic units: artificial fill, modified land, Esperance sand, Lawton clay, till, and Tertiary sandstone. These profiles together with other soil parameters are used to compare linear, equivalent-linear, and nonlinear predictions of ground motion in the frequency band 0-15 Hz. Linear site-response corrections are found to yield unreasonably large ground motions. Equivalent-linear and nonlinear calculations give peak values similar to the 1994 Northridge, California, earthquake and those predicted by regression relationships. Ground-motion variance is estimated for (1) randomization of the velocity profiles, (2) variation in source parameters, and (3) choice of nonlinear model. Within the limits of the models tested, the results are found to be most sensitive to the nonlinear model and soil parameters, notably the overconsolidation ratio.
FPGA implementation of predictive degradation model for engine oil lifetime
NASA Astrophysics Data System (ADS)
Idros, M. F. M.; Razak, A. H. A.; Junid, S. A. M. Al; Suliman, S. I.; Halim, A. K.
2018-03-01
This paper presents the implementation of a linear regression model for degradation prediction on Register Transfer Logic (RTL) using Quartus II. A stationary model was identified in the degradation trend of the engine oil in a vehicle using a time-series method. For the RTL implementation, the degradation model is written in Verilog HDL and input data are sampled at fixed times. A clock divider was designed to support the timing sequence of the input data. For every five data points, a regression analysis is performed to determine the slope variation and calculate the prediction. Only negative slope values are considered for prediction purposes, to reduce the number of logic gates. The least-squares method is applied to obtain the best linear model based on the mean values of the time-series data. The coded algorithm has been implemented on an FPGA for validation purposes. The result shows the predicted time to change the engine oil.
NASA Technical Reports Server (NTRS)
Roth, D. J.; Swickard, S. M.; Stang, D. B.; Deguire, M. R.
1991-01-01
A review and statistical analysis of the ultrasonic velocity method for estimating the porosity fraction in polycrystalline materials is presented. Initially, a semiempirical model is developed showing the origin of the linear relationship between ultrasonic velocity and porosity fraction. Then, from a compilation of data produced by many researchers, scatter plots of velocity versus percent porosity data are shown for Al2O3, MgO, porcelain-based ceramics, PZT, SiC, Si3N4, steel, tungsten, UO2,(U0.30Pu0.70)C, and YBa2Cu3O(7-x). Linear regression analysis produces predicted slope, intercept, correlation coefficient, level of significance, and confidence interval statistics for the data. Velocity values predicted from regression analysis of fully-dense materials are in good agreement with those calculated from elastic properties.
Chen, Wen-Yuan; Wang, Mei; Fu, Zhou-Xing
2014-06-16
Most railway accidents happen at railway crossings. Therefore, how to detect humans or objects present in the risk area of a railway crossing and thus prevent accidents are important tasks. In this paper, three strategies are used to detect the risk area of a railway crossing: (1) we use a terrain drop compensation (TDC) technique to solve the problem of the concavity of railway crossings; (2) we use a linear regression technique to predict the position and length of an object from image processing; (3) we have developed a novel strategy called calculating local maximum Y-coordinate object points (CLMYOP) to obtain the ground points of the object. In addition, image preprocessing is also applied to filter out the noise and successfully improve the object detection. From the experimental results, it is demonstrated that our scheme is an effective and corrective method for the detection of railway crossing risk areas.
NASA Technical Reports Server (NTRS)
Roth, D. J.; Swickard, S. M.; Stang, D. B.; Deguire, M. R.
1990-01-01
A review and statistical analysis of the ultrasonic velocity method for estimating the porosity fraction in polycrystalline materials is presented. Initially, a semi-empirical model is developed showing the origin of the linear relationship between ultrasonic velocity and porosity fraction. Then, from a compilation of data produced by many researchers, scatter plots of velocity versus percent porosity data are shown for Al2O3, MgO, porcelain-based ceramics, PZT, SiC, Si3N4, steel, tungsten, UO2,(U0.30Pu0.70)C, and YBa2Cu3O(7-x). Linear regression analysis produced predicted slope, intercept, correlation coefficient, level of significance, and confidence interval statistics for the data. Velocity values predicted from regression analysis for fully-dense materials are in good agreement with those calculated from elastic properties.
Tools for Basic Statistical Analysis
NASA Technical Reports Server (NTRS)
Luz, Paul L.
2005-01-01
Statistical Analysis Toolset is a collection of eight Microsoft Excel spreadsheet programs, each of which performs calculations pertaining to an aspect of statistical analysis. These programs present input and output data in user-friendly, menu-driven formats, with automatic execution. The following types of calculations are performed: Descriptive statistics are computed for a set of data x(i) (i = 1, 2, 3 . . . ) entered by the user. Normal Distribution Estimates will calculate the statistical value that corresponds to cumulative probability values, given a sample mean and standard deviation of the normal distribution. Normal Distribution from two Data Points will extend and generate a cumulative normal distribution for the user, given two data points and their associated probability values. Two programs perform two-way analysis of variance (ANOVA) with no replication or generalized ANOVA for two factors with four levels and three repetitions. Linear Regression-ANOVA will fit data to the linear equation y = f(x) and will perform an ANOVA to check its significance.
Ludbrook, John
2010-07-01
1. There are two reasons for wanting to compare measurers or methods of measurement. One is to calibrate one method or measurer against another; the other is to detect bias. Fixed bias is present when one method gives higher (or lower) values across the whole range of measurement. Proportional bias is present when one method gives values that diverge progressively from those of the other. 2. Linear regression analysis is a popular method for comparing methods of measurement, but the familiar ordinary least squares (OLS) method is rarely acceptable. The OLS method requires that the x values are fixed by the design of the study, whereas it is usual that both y and x values are free to vary and are subject to error. In this case, special regression techniques must be used. 3. Clinical chemists favour techniques such as major axis regression ('Deming's method'), the Passing-Bablok method or the bivariate least median squares method. Other disciplines, such as allometry, astronomy, biology, econometrics, fisheries research, genetics, geology, physics and sports science, have their own preferences. 4. Many Monte Carlo simulations have been performed to try to decide which technique is best, but the results are almost uninterpretable. 5. I suggest that pharmacologists and physiologists should use ordinary least products regression analysis (geometric mean regression, reduced major axis regression): it is versatile, can be used for calibration or to detect bias and can be executed by hand-held calculator or by using the loss function in popular, general-purpose, statistical software.
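A minimal sketch of ordinary least products (reduced major axis) regression as recommended in point 5; under this fit, an intercept different from zero suggests fixed bias and a slope different from one suggests proportional bias.

```python
import numpy as np

def least_products_regression(x, y):
    """Ordinary least products (reduced major axis) fit y = a + b*x:
    b = sign(r) * SD(y)/SD(x), which allows error in both variables."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    r = np.corrcoef(x, y)[0, 1]
    b = np.sign(r) * np.std(y, ddof=1) / np.std(x, ddof=1)
    a = np.mean(y) - b * np.mean(x)
    return a, b

# toy method-comparison data (invented for illustration)
method_a = [1.0, 2.1, 3.0, 4.2, 5.1]
method_b = [1.2, 2.3, 3.3, 4.5, 5.6]
print(least_products_regression(method_a, method_b))
```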
Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne
2012-01-01
In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models. PMID:23275882
A Comparison Between Two OLS-Based Approaches to Estimating Urban Multifractal Parameters
NASA Astrophysics Data System (ADS)
Huang, Lin-Shan; Chen, Yan-Guang
Multifractal theory provides a new spatial analytical tool for urban studies, but many basic problems remain to be solved. Among various pending issues, the most significant one is how to obtain proper multifractal dimension spectra. If an algorithm is improperly used, the parameter spectra will be abnormal. This paper investigates two ordinary least squares (OLS)-based approaches for estimating urban multifractal parameters. Using empirical study and comparative analysis, we demonstrate how to utilize the adequate linear regression to calculate multifractal parameters. The OLS regression analysis has two different approaches: in one, the intercept is fixed to zero, and in the other, the intercept is not constrained. The results of the comparative study show that the zero-intercept regression yields proper multifractal parameter spectra within a certain scale range of moment order, while the common regression method often leads to abnormal multifractal parameter values. A conclusion can be reached that fixing the intercept to zero is a more advisable regression method for multifractal parameter estimation, and the shapes of spectral curves and value ranges of fractal parameters can be employed to diagnose urban problems. This research is helpful for scientists to understand multifractal models and apply a more reasonable technique to multifractal parameter calculations.
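The two OLS approaches compared in the paper differ only in the design matrix; a small numpy sketch with synthetic data standing in for the log-transformed scaling quantities used in multifractal estimation.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])                      # e.g., log scale
y = 1.9 * x + np.random.default_rng(0).normal(0, 0.05, 5)    # e.g., log measure

# free intercept: ordinary least squares on [x, 1]
b_free, a_free = np.polyfit(x, y, 1)

# intercept fixed at zero: b = sum(x*y) / sum(x^2)
b_zero = np.dot(x, y) / np.dot(x, x)
print(b_free, a_free, b_zero)
```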
Fischer, A; Friggens, N C; Berry, D P; Faverdin, P
2018-07-01
The ability to properly assess and accurately phenotype true differences in feed efficiency among dairy cows is key to the development of breeding programs for improving feed efficiency. The variability among individuals in feed efficiency is commonly characterised by the residual intake approach. Residual feed intake is represented by the residuals of a linear regression of intake on the corresponding quantities of the biological functions that consume (or release) energy. However, the residuals include model fitting and measurement errors as well as any variability in cow efficiency. The objective of this study was to isolate the individual animal variability in feed efficiency from the residual component. Two separate models were fitted. In one, the standard residual energy intake (REI) was calculated as the residual of a multiple linear regression of lactation average net energy intake (NEI) on lactation average milk energy output, average metabolic BW, as well as lactation loss and gain of body condition score. In the other, a linear mixed model was used to simultaneously fit fixed linear regressions and random cow levels on the biological traits and intercept, using fortnightly repeated measures of the variables. This method split the predicted NEI into two parts: one quantifying the population mean intercept and coefficients, and one quantifying cow-specific deviations in the intercept and coefficients. The cow-specific part of predicted NEI was assumed to isolate true differences in feed efficiency among cows. NEI and associated energy expenditure phenotypes were available for the first 17 fortnights of lactation from 119 Holstein cows, all fed a constant energy-rich diet. Mixed models fitting cow-specific intercept and coefficients to different combinations of the aforementioned energy expenditure traits, calculated on a fortnightly basis, were compared. The variance of REI estimated with the lactation average model represented only 8% of the variance of measured NEI. Among all compared mixed models, the variance of the cow-specific part of predicted NEI represented between 53% and 59% of the variance of REI estimated from the lactation average model, or between 4% and 5% of the variance of measured NEI. The remaining 41% to 47% of the variance of REI estimated with the lactation average model may therefore reflect model fitting errors or measurement errors. In conclusion, the use of a mixed model framework with cow-specific random regressions seems to be a promising method to isolate the cow-specific component of REI in dairy cows.
Jürgens, Julian H W; Schulz, Nadine; Wybranski, Christian; Seidensticker, Max; Streit, Sebastian; Brauner, Jan; Wohlgemuth, Walter A; Deuerling-Zheng, Yu; Ricke, Jens; Dudeck, Oliver
2015-02-01
The objective of this study was to compare the parameter maps of a new flat-panel detector application for time-resolved perfusion imaging in the angiography room (FD-CTP) with computed tomography perfusion (CTP) in an experimental tumor model. Twenty-four VX2 tumors were implanted into the hind legs of 12 rabbits. Three weeks later, FD-CTP (Artis zeego; Siemens) and CTP (SOMATOM Definition AS +; Siemens) were performed. The parameter maps for the FD-CTP were calculated using a prototype software, and those for the CTP were calculated with VPCT-body software on a dedicated syngo MultiModality Workplace. The parameters were compared using Pearson product-moment correlation coefficient and linear regression analysis. The Pearson product-moment correlation coefficient showed good correlation values for both the intratumoral blood volume of 0.848 (P < 0.01) and the blood flow of 0.698 (P < 0.01). The linear regression analysis of the perfusion between FD-CTP and CTP showed for the blood volume a regression equation y = 4.44x + 36.72 (P < 0.01) and for the blood flow y = 0.75x + 14.61 (P < 0.01). This preclinical study provides evidence that FD-CTP allows a time-resolved (dynamic) perfusion imaging of tumors similar to CTP, which provides the basis for clinical applications such as the assessment of tumor response to locoregional therapies directly in the angiography suite.
Mirmohseni, A; Abdollahi, H; Rostamizadeh, K
2007-02-28
A net analyte signal (NAS)-based method called HLA/GO was applied for the selective determination of a binary mixture of ethanol and water by a quartz crystal nanobalance (QCN) sensor. A full factorial design was applied for the formation of calibration and prediction sets in the concentration ranges 5.5-22.2 microg mL(-1) for ethanol and 7.01-28.07 microg mL(-1) for water. An optimal time range was selected by a procedure based on the calculation of the net analyte signal regression plot in any considered time window for each test sample. A moving window strategy was used for searching the region with maximum linearity of the NAS regression plot (minimum error indicator) and minimum PRESS value. On the basis of the obtained results, the differences in the adsorption profiles in the time range between 1 and 600 s were used to determine mixtures of both compounds by the HLA/GO method. Calculation of the net analyte signal using the HLA/GO method allows determination of several figures of merit, such as selectivity, sensitivity, analytical sensitivity and limit of detection, for each component. To check the ability of the proposed method in the selection of linear regions of the adsorption profile, a test for detecting non-linear regions of adsorption profile data in the presence of methanol was also described. The results showed that the method was successfully applied for the determination of ethanol and water.
Ito, Hiroshi; Yokoi, Takashi; Ikoma, Yoko; Shidahara, Miho; Seki, Chie; Naganawa, Mika; Takahashi, Hidehiko; Takano, Harumasa; Kimura, Yuichi; Ichise, Masanori; Suhara, Tetsuya
2010-01-01
In positron emission tomography (PET) studies with radioligands for neuroreceptors, tracer kinetics have been described by the standard two-tissue compartment model that includes the compartments of nondisplaceable binding and specific binding to receptors. In the present study, we have developed a new graphic plot analysis to determine the total distribution volume (V(T)) and nondisplaceable distribution volume (V(ND)) independently, and therefore the binding potential (BP(ND)). In this plot, Y(t) is the ratio of brain tissue activity to time-integrated arterial input function, and X(t) is the ratio of time-integrated brain tissue activity to time-integrated arterial input function. The x-intercept of linear regression of the plots for early phase represents V(ND), and the x-intercept of linear regression of the plots for delayed phase after the equilibrium time represents V(T). BP(ND) can be calculated by BP(ND)=V(T)/V(ND)-1. Dynamic PET scanning with measurement of arterial input function was performed on six healthy men after intravenous rapid bolus injection of [(11)C]FLB457. The plot yielded a curve in regions with specific binding while it yielded a straight line through all plot data in regions with no specific binding. V(ND), V(T), and BP(ND) values calculated by the present method were in good agreement with those by conventional non-linear least-squares fitting procedure. This method can be used to distinguish graphically whether the radioligand binding includes specific binding or not.
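A sketch of the two-segment fit implied by the plot: with X(t) and Y(t) computed as defined above, V(ND) and V(T) are the x-intercepts of the early- and delayed-phase regression lines. The arrays x_early/y_early and x_late/y_late are hypothetical stand-ins for measured plot points.

```python
import numpy as np

def x_intercept(x, y):
    """x-intercept of the least-squares line through the points (x, y)."""
    slope, intercept = np.polyfit(x, y, 1)
    return -intercept / slope

def binding_potential(x_early, y_early, x_late, y_late):
    v_nd = x_intercept(x_early, y_early)  # early-phase plot -> V(ND)
    v_t = x_intercept(x_late, y_late)     # delayed-phase plot -> V(T)
    return v_t / v_nd - 1.0               # BP(ND) = V(T)/V(ND) - 1
```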
Lee, Donggil; Lee, Kyounghoon; Kim, Seonghun; Yang, Yongsu
2015-04-01
An automatic abalone grading algorithm that estimates abalone weights on the basis of computer vision using 2D images is developed and tested. The algorithm overcomes the problems experienced by conventional abalone grading methods that utilize manual sorting and mechanical automatic grading. To design an optimal algorithm, a regression formula and R² value were investigated by performing a regression analysis for each of total length, body width, thickness, view area, and actual volume against abalone weight. The R² value between the actual volume and abalone weight was 0.999, showing a relatively high correlation. As a result, to easily estimate the actual volumes of abalones based on computer vision, the volumes were calculated under the assumption that abalone shapes are half-oblate ellipsoids, and a regression formula was derived to estimate the volumes of abalones through linear regression analysis between the calculated and actual volumes. The final automatic abalone grading algorithm is designed using the abalone volume estimation regression formula derived from test results, and the actual volume versus abalone weight regression formula. For abalones weighing from 16.51 to 128.01 g, evaluation of the algorithm's performance via cross-validation indicates root mean square and worst-case prediction errors of 2.8 g and ±8 g, respectively. © 2015 Institute of Food Technologists®
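The half-oblate-ellipsoid volume step reduces to the following sketch; the regression coefficients linking volume to weight are placeholders, since the fitted values are not reported in the abstract.

```python
import numpy as np

def abalone_volume(total_length, body_width, thickness):
    """Half-ellipsoid approximation: V = (1/2)*(4/3)*pi*a*b*c,
    with semi-axes taken as half of each measured dimension."""
    a, b, c = total_length / 2, body_width / 2, thickness / 2
    return 0.5 * (4.0 / 3.0) * np.pi * a * b * c

def weight_from_volume(v, slope=1.05, intercept=0.0):
    """Weight predicted from estimated volume via the fitted linear
    regression (illustrative coefficients, not those of the study)."""
    return slope * v + intercept

print(weight_from_volume(abalone_volume(9.0, 6.0, 2.5)))
```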
Regression modeling of ground-water flow
Cooley, R.L.; Naff, R.L.
1985-01-01
Nonlinear multiple regression methods are developed to model and analyze groundwater flow systems. Complete descriptions of regression methodology as applied to groundwater flow models allow scientists and engineers engaged in flow modeling to apply the methods to a wide range of problems. Organization of the text proceeds from an introduction that discusses the general topic of groundwater flow modeling, to a review of basic statistics necessary to properly apply regression techniques, and then to the main topic: exposition and use of linear and nonlinear regression to model groundwater flow. Statistical procedures are given to analyze and use the regression models. A number of exercises and answers are included to exercise the student on nearly all the methods that are presented for modeling and statistical analysis. Three computer programs implement the more complex methods. These three are a general two-dimensional, steady-state regression model for flow in an anisotropic, heterogeneous porous medium, a program to calculate a measure of model nonlinearity with respect to the regression parameters, and a program to analyze model errors in computed dependent variables such as hydraulic head. (USGS)
Christensen, Jeppe Schultz; Raaschou-Nielsen, Ole; Tjønneland, Anne; Overvad, Kim; Nordsborg, Rikke B; Ketzel, Matthias; Sørensen, Thorkild Ia; Sørensen, Mette
2016-03-01
Traffic noise has been associated with cardiovascular and metabolic disorders. Potential modes of action are through stress and sleep disturbance, which may lead to endocrine dysregulation and overweight. We aimed to investigate the relationship between residential traffic and railway noise and adiposity. In this cross-sectional study of 57,053 middle-aged people, height, weight, waist circumference, and bioelectrical impedance were measured at enrollment (1993-1997). Body mass index (BMI), body fat mass index (BFMI), and lean body mass index (LBMI) were calculated. Residential exposure to road and railway traffic noise exposure was calculated using the Nordic prediction method. Associations between traffic noise and anthropometric measures at enrollment were analyzed using general linear models and logistic regression adjusted for demographic and lifestyle factors. Linear regression models adjusted for age, sex, and socioeconomic factors showed that 5-year mean road traffic noise exposure preceding enrollment was associated with a 0.35-cm wider waist circumference (95% CI: 0.21, 0.50) and a 0.18-point higher BMI (95% CI: 0.12, 0.23) per 10 dB. Small, significant increases were also found for BFMI and LBMI. All associations followed linear exposure-response relationships. Exposure to railway noise was not linearly associated with adiposity measures. However, exposure > 60 dB was associated with a 0.71-cm wider waist circumference (95% CI: 0.23, 1.19) and a 0.19-point higher BMI (95% CI: 0.0072, 0.37) compared with unexposed participants (0-20 dB). The present study finds positive associations between residential exposure to road traffic and railway noise and adiposity.
Kumar, K Vasanth
2007-04-02
Kinetic experiments were carried out for the sorption of safranin onto activated carbon particles. The kinetic data were fitted to pseudo-second order model of Ho, Sobkowsk and Czerwinski, Blanchard et al. and Ritchie by linear and non-linear regression methods. Non-linear method was found to be a better way of obtaining the parameters involved in the second order rate kinetic expressions. Both linear and non-linear regression showed that the Sobkowsk and Czerwinski and Ritchie's pseudo-second order models were the same. Non-linear regression analysis showed that both Blanchard et al. and Ho have similar ideas on the pseudo-second order model but with different assumptions. The best fit of experimental data in Ho's pseudo-second order expression by linear and non-linear regression method showed that Ho pseudo-second order model was a better kinetic expression when compared to other pseudo-second order kinetic expressions.
Geographical variation of cerebrovascular disease in New York State: the correlation with income
Han, Daikwon; Carrow, Shannon S; Rogerson, Peter A; Munschauer, Frederick E
2005-01-01
Background: Income is known to be associated with cerebrovascular disease; however, little is known about the more detailed relationship between cerebrovascular disease and income. We examined the hypothesis that the geographical distribution of cerebrovascular disease in New York State may be predicted by a nonlinear model using income as a surrogate socioeconomic risk factor. Results: We used spatial clustering methods to identify areas with high and low prevalence of cerebrovascular disease at the ZIP code level after smoothing rates and correcting for edge effects; geographic locations of high and low clusters of cerebrovascular disease in New York State were identified with and without income adjustment. To examine effects of income, we calculated the excess number of cases using a non-linear regression with cerebrovascular disease rates taken as the dependent variable and income and income squared taken as independent variables. The resulting regression equation was: excess rate = 32.075 - 1.22 × 10⁻⁴(income) + 8.068 × 10⁻¹⁰(income²), and both the income and income-squared variables were significant at the 0.01 level. When income was included as a covariate in the non-linear regression, the number and size of clusters of high cerebrovascular disease prevalence decreased. Some 87 ZIP codes exceeded the critical value of the local statistic, yielding a relative risk of 1.2. The majority of low cerebrovascular disease prevalence geographic clusters disappeared when the non-linear income effect was included. For linear regression, the excess rate of cerebrovascular disease falls with income; each $10,000 increase in the median income of a ZIP code resulted in an average reduction of 3.83 observed cases. Conclusion: Income is a non-linear predictor of excess cerebrovascular disease rates, with both low and high observed cerebrovascular disease rate areas associated with higher income. Income alone explains a significant amount of the geographical variance in cerebrovascular disease across New York State since both high and low clusters of cerebrovascular disease dissipate or disappear with income adjustment. Geographical modeling, including non-linear effects of income, may allow for better identification of other non-traditional risk factors. PMID:16242043
[A study on city motor vehicle emission factors by tunnel test].
Wang, B; Zhang, Y; Zhu, C; Yu, K; Chan, L; Chan, Z
2001-03-01
Applying the principle of the tunnel test, a typical cross-river tunnel experiment was run in Guangzhou city, and 48 h of online monitoring data, including pollutant concentrations, traffic activity, and meteorological data, were obtained. The average motor vehicle emission factors of NOx, CO, SO2, PM10 and HC were calculated using mass balance; they are 1.379, 15.404, 0.142, 0.637 and 1.857 g/(km·vehicle), respectively. Based on that, combined emission factors of 8 types of city vehicles were calculated using linear regression. The result basically shows the character and level of motor vehicle emissions in a Chinese city.
Echocardiographic measurements of left ventricular mass by a non-geometric method
NASA Technical Reports Server (NTRS)
Parra, Beatriz; Buckey, Jay; Degraff, David; Gaffney, F. Andrew; Blomqvist, C. Gunnar
1987-01-01
The accuracy of a new nongeometric method for calculating left ventricular myocardial volumes from two-dimensional echocardiographic images was assessed in vitro using 20 formalin-fixed normal human hearts. Serial oblique short-axis images were acquired from one point at 5-deg intervals, for a total of 10-12 cross sections. Echocardiographic myocardial volumes were calculated as the difference between the volumes defined by the epi- and endocardial surfaces. Actual myocardial volumes were determined by water displacement. Volumes ranged from 80 to 174 ml (mean 130.8 ml). Linear regression analysis demonstrated excellent agreement between the echocardiographic and direct measurements.
Data mining for materials design: A computational study of single molecule magnet
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dam, Hieu Chi; Pham, Tien Lam
2014-01-28
We develop a method that combines data mining and first principles calculation to guide the design of distorted cubane Mn(4+)Mn(3+)3 single molecule magnets. The essential idea of the method is a process consisting of sparse regressions and cross-validation for analyzing calculated data of the materials. The method allows us to demonstrate that the exchange coupling between Mn(4+) and Mn(3+) ions can be predicted from the electronegativities of the constituent ligands and the structural features of the molecule by a linear regression model with high accuracy. The relations between the structural features and magnetic properties of the materials are quantitatively and consistently evaluated and presented by a graph. We also discuss the properties of the materials and guide the material design based on the obtained results.
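A sketch of the sparse-regression-plus-cross-validation step using the LASSO, one common choice of sparse regression (the abstract does not name the exact estimator), on synthetic descriptor data standing in for ligand electronegativities and structural features.

```python
import numpy as np
from sklearn.linear_model import LassoCV

# rows = candidate molecules; columns = descriptors (synthetic stand-ins)
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 6))
J = 2.0 * X[:, 0] - 1.2 * X[:, 3] + rng.normal(0, 0.1, 40)  # exchange coupling

model = LassoCV(cv=5).fit(X, J)   # sparse regression with cross-validation
print(model.coef_)                # zeroed coefficients drop irrelevant descriptors
```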
Mapping of the DLQI scores to EQ-5D utility values using ordinal logistic regression.
Ali, Faraz Mahmood; Kay, Richard; Finlay, Andrew Y; Piguet, Vincent; Kupfer, Joerg; Dalgard, Florence; Salek, M Sam
2017-11-01
The Dermatology Life Quality Index (DLQI) and the European Quality of Life-5 Dimension (EQ-5D) are separate measures that may be used to gather health-related quality of life (HRQoL) information from patients. The EQ-5D is a generic measure from which health utility estimates can be derived, whereas the DLQI is a specialty-specific measure to assess HRQoL. To reduce the burden of multiple measures being administered and to enable a more disease-specific calculation of health utility estimates, we explored an established mathematical technique known as ordinal logistic regression (OLR) to develop an appropriate model to map DLQI data to EQ-5D-based health utility estimates. Retrospective data from 4010 patients were randomly divided five times into two groups for the derivation and testing of the mapping model. Split-half cross-validation was utilized resulting in a total of ten ordinal logistic regression models for each of the five EQ-5D dimensions against age, sex, and all ten items of the DLQI. Using Monte Carlo simulation, predicted health utility estimates were derived and compared against those observed. This method was repeated for both OLR and a previously tested mapping methodology based on linear regression. The model was shown to be highly predictive and its repeated fitting demonstrated a stable model using OLR as well as linear regression. The mean differences between OLR-predicted health utility estimates and observed health utility estimates ranged from 0.0024 to 0.0239 across the ten modeling exercises, with an average overall difference of 0.0120 (a 1.6% underestimate, not of clinical importance). This modeling framework developed in this study will enable researchers to calculate EQ-5D health utility estimates from a specialty-specific study population, reducing patient and economic burden.
NASA Astrophysics Data System (ADS)
Raev, M. D.; Sharkov, E. A.; Tikhonov, V. V.; Repina, I. A.; Komarova, N. Yu.
2015-12-01
The GLOBAL-RT database (DB) is composed of long-term multichannel radiothermal observation data received from the DMSP F08-F17 satellites; it is continuously supplemented with new data by the Earth exploration from space department of the Space Research Institute, Russian Academy of Sciences. Arctic ice-cover areas for regions north of 60° N latitude were calculated using the DB polar version and the NASA Team 2 algorithm, which is widely used in the scientific literature. Based on an analysis of the variability of Arctic ice cover during 1987-2014, two months were selected in which the Arctic ice cover was maximal (February) and minimal (September), and the average ice-cover area was calculated for these months. Confidence intervals of the average values are within the 95-98% limits. Several approximations are derived for the time dependences of the ice-cover maximum and minimum over the period under study. Regression dependences were calculated for polynomials from the first degree (linear) to the sextic. It was found that the minimal root-mean-square deviation from the approximating curve decreased sharply for the biquadratic (fourth-degree) polynomial and then varied insignificantly: from 0.5593 for the third-degree polynomial to 0.4560 for the biquadratic polynomial. Hence, the commonly used strictly linear regression with a negative time gradient for the September Arctic ice-cover minimum over 30 years should be considered incorrect.
Kayser, W; Glaze, J B; Welch, C M; Kerley, M; Hill, R A
2015-07-01
The objective of this study was to determine the effects of alternative measurements of body weight and DMI used to evaluate residual feed intake (RFI). Weaning weight (WW), ADG, and DMI were recorded on 970 growing purebred Charolais bulls (n = 519) and heifers (n = 451) and 153 Red Angus growing steers (n = 69) and heifers (n = 84) using a GrowSafe (GrowSafe, Airdrie, Alberta, Canada) system. Averages of individual DMI were calculated in 10-d increments and compared to the overall DMI to identify the magnitude of the errors associated with measuring DMI. These incremental measurements were also used in the calculation of RFI, computed from the linear regression of DMI on ADG and midtest body weight^0.75 (MMWT). RFI_Regress was calculated using ADG_Regress (ADG calculated from the regression of BW gain on DOF) and MMWT_PWG (metabolic midweight calculated throughout the postweaning gain test), and was considered the control in Red Angus. A similar calculation served as the control for Charolais; RFI was calculated using 2-d consecutive start and finish weights (RFI_Calc). The RFI weaning weight (RFI_WW) was calculated using ADG_WW (ADG from weaning to the final out-weight of the postweaning gain test) and MMWT_WW, calculated similarly. Overall average estimated DMI was highly correlated with the measurements derived over shorter periods, with 10 d being the least correlated and 60 d the most correlated. ADG_Calc (calculated using 2-d consecutive start and finish weights/DOF) and ADG_WW were highly correlated in Charolais. ADG_Regress and ADG_Calc were highly correlated, and ADG_Regress and ADG_WW were moderately correlated in Red Angus. The control measures of RFI were highly correlated with RFI_WW in Charolais and Red Angus. The outcomes of including abbreviated-period DMI in the model with the weaning weight gain measurements showed that the model using 10 d of intake (RFI_WW_10) was the least correlated with the control measures. The model with 60 d of intake had the largest correlation with the control measures. The model with the fewest measured intake days that, coupled with the weaning weight values, still provided acceptable predictive value was RFI_WW_40, which was highly correlated with the control measures. As established in the literature, at least 70 d is required to accurately measure ADG. However, we conclude that a shorter period, possibly as few as 40 d, is needed to accurately estimate DMI for a reliable calculation of RFI.
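The RFI computation described here is simply the residual from a multiple regression; a minimal numpy sketch, assuming the inputs are numpy arrays of per-animal values.

```python
import numpy as np

def residual_feed_intake(dmi, adg, midtest_bw):
    """RFI = residuals of the regression of DMI on ADG and midtest BW^0.75."""
    mmwt = midtest_bw ** 0.75
    X = np.column_stack([np.ones_like(dmi), adg, mmwt])
    beta, *_ = np.linalg.lstsq(X, dmi, rcond=None)
    return dmi - X @ beta   # negative RFI indicates a more efficient animal
```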
Optimizing Support Vector Machine Parameters with Genetic Algorithm for Credit Risk Assessment
NASA Astrophysics Data System (ADS)
Manurung, Jonson; Mawengkang, Herman; Zamzami, Elviawaty
2017-12-01
Support vector machine (SVM) is a popular classification method known to have strong generalization capabilities. SVM can address both classification and regression problems, with linear or nonlinear kernels. However, SVM also has a weakness: it is difficult to determine the optimal parameter values. SVM calculates the best linear separator in the input feature space according to the training data. To classify data that are not linearly separable, SVM uses kernel tricks to transform the data into a linearly separable form in a higher-dimensional feature space. The kernel trick uses various kernel functions, such as linear, polynomial, radial basis function (RBF), and sigmoid. Each function has parameters that affect the accuracy of SVM classification. To solve this problem, a genetic algorithm is proposed as the search algorithm for the optimal parameter values, thus increasing the best classification accuracy of SVM. Data were taken from the UCI machine learning repository: Australian Credit Approval. The results show that the combination of SVM and genetic algorithms is effective in improving classification accuracy: genetic algorithms systematically find optimal kernel parameters for SVM, instead of relying on randomly selected ones. The best accuracy improved over the baseline kernels (linear: 85.12%, polynomial: 81.76%, RBF: 77.22%, sigmoid: 78.70%). However, for larger data sets this method is not practical because it is very time consuming.
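A compact sketch of the approach, assuming scikit-learn's SVC and a hand-rolled truncation-selection genetic algorithm; a synthetic classification set stands in for the Australian Credit Approval data, and the population size, mutation scale, and generation count are arbitrary choices.

```python
# Minimal sketch of a genetic algorithm searching RBF-SVM parameters
# (C, gamma), in the spirit of the approach above.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(2)
X, y = make_classification(n_samples=300, n_features=14, random_state=0)

def fitness(ind):
    c, g = 2.0 ** ind            # genes are log2(C) and log2(gamma)
    return cross_val_score(SVC(C=c, gamma=g), X, y, cv=3).mean()

pop = rng.uniform(-5, 5, size=(10, 2))      # initial population
for gen in range(10):
    scores = np.array([fitness(ind) for ind in pop])
    parents = pop[np.argsort(scores)[-4:]]  # truncation selection
    children = parents[rng.integers(0, 4, size=(6,))] + rng.normal(0, 0.5, (6, 2))
    pop = np.vstack([parents, children])    # elitism + mutated offspring

best = pop[np.argmax([fitness(ind) for ind in pop])]
print("best log2(C), log2(gamma):", np.round(best, 2))
```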
A Technique of Fuzzy C-Mean in Multiple Linear Regression Model toward Paddy Yield
NASA Astrophysics Data System (ADS)
Syazwan Wahab, Nur; Saifullah Rusiman, Mohd; Mohamad, Mahathir; Amira Azmi, Nur; Che Him, Norziha; Ghazali Kamardan, M.; Ali, Maselan
2018-04-01
In this paper, we propose a hybrid model that combines a multiple linear regression model with the fuzzy c-means method. This research involved the relationship between 20 variates of the topsoil, analyzed prior to planting, and paddy yields at standard fertilizer rates. The data were from the multi-location rice trials carried out by MARDI at major paddy granaries in Peninsular Malaysia during the period from 2009 to 2012. Missing observations were estimated using mean estimation techniques. The data were analyzed using a multiple linear regression model and a combination of multiple linear regression and fuzzy c-means. Analyses of normality and multicollinearity indicate that the data are normally scattered without multicollinearity among the independent variables. Fuzzy c-means analysis clusters the paddy yield into two clusters before the multiple linear regression model is applied. The comparison between the two methods indicates that the hybrid of multiple linear regression and fuzzy c-means outperforms the multiple linear regression model alone, with a lower mean square error.
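A sketch of the hybrid scheme under stated assumptions: the fuzzy c-means updates are implemented directly from the textbook equations (rather than via the authors' software), memberships are hardened by maximum, and a separate least-squares regression is fitted per cluster; the soil covariates and yields are synthetic.

```python
# Sketch: fuzzy c-means clustering followed by a separate multiple linear
# regression per cluster, mirroring the hybrid model above.
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(150, 4))                      # soil covariates
y = np.where(X[:, 0] > 0, 2 + X @ [1, 2, 0, 0], -1 + X @ [0, 0, 3, 1]) \
    + rng.normal(0, 0.3, 150)

def fuzzy_cmeans(Z, c=2, m=2.0, iters=100):
    u = rng.dirichlet(np.ones(c), size=len(Z)).T   # memberships, shape (c, n)
    for _ in range(iters):
        w = u ** m
        centers = (w @ Z) / w.sum(axis=1, keepdims=True)
        d = np.linalg.norm(Z[None, :, :] - centers[:, None, :], axis=2) + 1e-12
        # u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1))
        u = 1.0 / (d ** (2 / (m - 1)) * (1.0 / d ** (2 / (m - 1))).sum(axis=0))
    return u

u = fuzzy_cmeans(np.column_stack([X, y[:, None]]))
labels = u.argmax(axis=0)                          # harden memberships
for k in range(2):
    idx = labels == k
    A = np.column_stack([np.ones(idx.sum()), X[idx]])
    beta, *_ = np.linalg.lstsq(A, y[idx], rcond=None)
    resid = y[idx] - A @ beta
    print(f"cluster {k}: n={idx.sum()}, MSE={np.mean(resid**2):.4f}")
```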
Anderson, Carl A; McRae, Allan F; Visscher, Peter M
2006-07-01
Standard quantitative trait loci (QTL) mapping techniques commonly assume that the trait is both fully observed and normally distributed. When considering survival or age-at-onset traits these assumptions are often incorrect. Methods have been developed to map QTL for survival traits; however, they are both computationally intensive and not available in standard genome analysis software packages. We propose a grouped linear regression method for the analysis of continuous survival data. Using simulation we compare this method to both the Cox and Weibull proportional hazards models and a standard linear regression method that ignores censoring. The grouped linear regression method is of equivalent power to both the Cox and Weibull proportional hazards methods and is significantly better than the standard linear regression method when censored observations are present. The method is also robust to the proportion of censored individuals and the underlying distribution of the trait. On the basis of linear regression methodology, the grouped linear regression model is computationally simple and fast and can be implemented readily in freely available statistical software.
Mosing, Martina; Waldmann, Andreas D.; MacFarlane, Paul; Iff, Samuel; Auer, Ulrike; Bohm, Stephan H.; Bettschart-Wolfensberger, Regula; Bardell, David
2016-01-01
This study evaluated the breathing pattern and distribution of ventilation in horses prior to and following recovery from general anaesthesia using electrical impedance tomography (EIT). Six horses were anaesthetised for 6 hours in dorsal recumbency. Arterial blood gas and EIT measurements were performed 24 hours before (baseline) and 1, 2, 3, 4, 5 and 6 hours after the horses stood following anaesthesia. At each time point 4 representative spontaneous breaths were analysed. The percentage of the total breath length during which impedance remained greater than 50% of the maximum inspiratory impedance change (breath holding), the fraction of total tidal ventilation within each of four stacked regions of interest (ROI) (distribution of ventilation), and the filling time and inflation period of seven ROI evenly distributed over the dorso-ventral height of the lungs were calculated. Mixed effects multi-linear regression and linear regression were used, and significance was set at p < 0.05. All horses demonstrated inspiratory breath holding until 5 hours after standing. No change from baseline was seen in the distribution of ventilation during inspiration. Compared to baseline, filling time and inflation period were more rapid and shorter in ventral ROI, and slower and longer in the most dorsal ROI, respectively. Breath holding was significantly correlated with PaCO2 in both the univariate and the mixed effects multivariate regression. Following recovery from anaesthesia, horses showed inspiratory breath holding during which gas redistributed from ventral into dorsal regions of the lungs. This suggests auto-recruitment of lung tissue that would have been dependent and likely atelectatic during anaesthesia. PMID:27331910
Catalog of Air Force Weather Technical Documents, 1941-2006
2006-05-19
radiosondes in current use in the USA. An elementary discussion of statistical terms and concepts used for expressing accuracy or error is included. AWS TR 105...Techniques, Appendix B: Vorticity—An Elementary Discussion of the Concept, August 1956, 27pp. Formerly AWSM 105-50/1A. Provides the necessary back...steps involved in ordinary multiple linear regression. Conditional probability is calculated using transnormalized variables in the multivariate normal
Thomas, Colleen; Swayne, David E
2009-09-01
High-pathogenicity avian influenza viruses (HPAIV) cause severe systemic disease with high mortality in chickens. Isolation of HPAIV from the internal contents of chicken eggs has been reported, and this is cause for concern because HPAIV can be spread by movement of poultry products during marketing and trade activity. This study presents thermal inactivation data for the HPAIV strain A/chicken/PA/1370/83 (H5N2) (PA/83) in dried egg white with a moisture content (7.5%) similar to that found in commercially available spray-dried egg white products. The 95% upper confidence limits for D-values calculated from linear regression of the survival curves at 54.4, 60.0, 65.5, and 71.1 degrees C were 475.4, 192.2, 141.0, and 50.1 min, respectively. The thermal death time line log10(D) = -0.05494 × T(degrees C) + 5.5693 (root mean square error = 0.0711) was obtained by linear regression of the experimental log10 D-values versus temperature. Conservative predictions based on the thermal inactivation data suggest that standard industry pasteurization protocols would be very effective for HPAIV inactivation in dried egg white. For example, these calculations predict that a 7-log reduction would take only 2.6 days at 54.4 degrees C.
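The D-value arithmetic above can be reproduced in a few lines: regress log10 survivors on time, take D = -1/slope, and multiply by the target log reduction. The survivor counts below are illustrative, not the study's raw data.

```python
# Sketch: computing a D-value (time for a 1-log10 reduction) from a linear
# regression of log10 survivors versus time, then predicting a 7-log
# reduction time.
import numpy as np
from scipy import stats

t = np.array([0, 60, 120, 180, 240])             # minutes at 54.4 degrees C
log_n = np.array([6.0, 5.87, 5.75, 5.61, 5.50])  # log10 virus titer
slope, intercept, r, p, se = stats.linregress(t, log_n)
D = -1.0 / slope                                 # minutes per 1-log reduction
print(f"D-value ~ {D:.0f} min; 7-log reduction ~ {7 * D / 1440:.1f} days")
# Using the paper's 95% UCL D-value of 475.4 min at 54.4 degrees C gives
# 7 * 475.4 min ~ 2.3 days, the same order as the quoted conservative
# 2.6-day estimate.
```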
Linear regression crash prediction models : issues and proposed solutions.
DOT National Transportation Integrated Search
2010-05-01
The paper develops a linear regression model approach that can be applied to crash data to predict vehicle crashes. The proposed approach involves novel data aggregation to satisfy linear regression assumptions, namely error structure normality ...
Comparison between Linear and Nonlinear Regression in a Laboratory Heat Transfer Experiment
ERIC Educational Resources Information Center
Gonçalves, Carine Messias; Schwaab, Marcio; Pinto, José Carlos
2013-01-01
In order to interpret laboratory experimental data, undergraduate students are accustomed to performing linear regression through linearized versions of nonlinear models. However, the use of linearized models can lead to statistically biased parameter estimates. Even so, it is not an easy task to introduce nonlinear regression and show the students…
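A short demonstration of the bias issue raised in this abstract, assuming an exponential model with additive noise; the linearized log-scale fit and scipy's curve_fit generally return different parameter estimates.

```python
# Sketch: fitting y = a*exp(b*x) by linear regression on log(y) versus by
# direct nonlinear regression. With additive noise, the linearized fit
# implicitly reweights the errors, so its estimates generally differ.
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import linregress

rng = np.random.default_rng(4)
x = np.linspace(0, 5, 40)
y = 3.0 * np.exp(-0.8 * x) + rng.normal(0, 0.05, x.size)  # additive noise
y = np.clip(y, 1e-6, None)          # keep log() defined for the comparison

lin = linregress(x, np.log(y))       # linearized: log y = log a + b x
a_lin, b_lin = np.exp(lin.intercept), lin.slope
(a_nl, b_nl), _ = curve_fit(lambda x, a, b: a * np.exp(b * x), x, y,
                            p0=(1.0, -1.0))
print(f"linearized: a={a_lin:.3f}, b={b_lin:.3f}")
print(f"nonlinear : a={a_nl:.3f}, b={b_nl:.3f}  (true: a=3, b=-0.8)")
```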
NASA Astrophysics Data System (ADS)
Korkiakoski, Mika; Tuovinen, Juha-Pekka; Aurela, Mika; Koskinen, Markku; Minkkinen, Kari; Ojanen, Paavo; Penttilä, Timo; Rainne, Juuso; Laurila, Tuomas; Lohila, Annalea
2017-04-01
We measured methane (CH4) exchange rates with automatic chambers at the forest floor of a nutrient-rich drained peatland in 2011-2013. The fen, located in southern Finland, was drained for forestry in 1969, and the tree stand is now a mixture of Scots pine, Norway spruce, and pubescent birch. Our measurement system consisted of six transparent chambers and stainless steel frames, positioned on a number of different field and moss layer compositions. Gas concentrations were measured with an online cavity ring-down spectroscopy gas analyzer. Fluxes were calculated with both linear and exponential regression. The use of linear regression resulted in systematically smaller CH4 fluxes, by 10-45%, compared with exponential regression. However, the use of exponential regression with small fluxes (< 2.5 µg CH4 m⁻² h⁻¹) typically resulted in anomalously large absolute fluxes and high hour-to-hour deviations. Therefore, we recommend that fluxes are initially calculated with linear regression to determine the threshold for low fluxes and that higher fluxes are then recalculated using exponential regression. The exponential flux was clearly affected by the length of the fitting period when this period was < 190 s, but stabilized with longer periods. Thus, we also recommend the use of a fitting period of several minutes to stabilize the results and decrease the flux detection limit. There were clear seasonal dynamics in the CH4 flux: the forest floor acted as a CH4 sink particularly from early summer until the end of the year, while in late winter the flux was very small and fluctuated around zero. However, the magnitude of fluxes was relatively small throughout the year, ranging mainly from -130 to +100 µg CH4 m⁻² h⁻¹. CH4 emission peaks were observed occasionally, mostly in summer during heavy rainfall events. Diurnal variation, showing a lower CH4 uptake rate during the daytime, was observed in all of the chambers, mainly in the summer and late spring, particularly in dry conditions. It was attributed more to changes in wind speed than to air or soil temperature, which suggests that physical rather than biological phenomena are responsible for the observed variation. The annual net CH4 exchange varied from -104 ± 30 to -505 ± 39 mg CH4 m⁻² yr⁻¹ among the six chambers, with an average of -219 mg CH4 m⁻² yr⁻¹ over the 2-year measurement period.
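A sketch of the linear-versus-exponential flux comparison above on a synthetic chamber closure curve; the saturation model, time constants, and the omission of the headspace volume/area conversion are simplifying assumptions.

```python
# Sketch: chamber flux from a concentration time series by linear versus
# exponential regression. The initial slope dC/dt (concentration units per
# second; the volume/area conversion to a flux is omitted) is the quantity
# of interest.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(5)
t = np.arange(0, 300, 5.0)                       # seconds after closure
c = 400 + 25 * (1 - np.exp(-t / 250)) + rng.normal(0, 0.2, t.size)  # ppb CH4

lin_slope = np.polyfit(t, c, 1)[0]               # linear regression slope

def exp_model(t, c0, ceq, k):
    return ceq + (c0 - ceq) * np.exp(-k * t)

(c0, ceq, k), _ = curve_fit(exp_model, t, c, p0=(400.0, 430.0, 0.005))
exp_slope = k * (ceq - c0)                       # initial slope at t = 0
print(f"linear slope: {lin_slope:.4f}, exponential initial slope: {exp_slope:.4f}")
# The exponential initial slope is larger, consistent with linear fits
# underestimating fluxes once the chamber headspace starts to saturate.
```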
Can Functional Cardiac Age be Predicted from ECG in a Normal Healthy Population
NASA Technical Reports Server (NTRS)
Schlegel, Todd; Starc, Vito; Leban, Manja; Sinigoj, Petra; Vrhovec, Milos
2011-01-01
In a normal healthy population, we sought to determine the most age-dependent conventional and advanced ECG parameters. We hypothesized that changes in several ECG parameters might correlate with age and together reliably characterize the functional age of the heart. Methods: An initial study population of 313 apparently healthy subjects was ultimately reduced to 148 subjects (74 men, 84 women, in the range from 10 to 75 years of age) after exclusion criteria. In all subjects, ECG recordings (resting 5-minute 12-lead high frequency ECG) were evaluated via custom software programs to calculate up to 85 different conventional and advanced ECG parameters including beat-to-beat QT and RR variability, waveform complexity, and signal-averaged, high-frequency and spatial/spatiotemporal ECG parameters. The prediction of functional age was evaluated by multiple linear regression analysis using the best 5 univariate predictors. Results: Ignoring what were ultimately small differences between males and females, the functional age was found to be predicted (R2 = 0.69, P < 0.001) from a linear combination of 5 independent variables: QRS elevation in the frontal plane (p < 0.001), a new repolarization parameter QTcorr (p < 0.001), mean high frequency QRS amplitude (p = 0.009), the variability parameter %VLF of RRV (p = 0.021) and the P-wave width (p = 0.10). Here, QTcorr represents the correlation between the calculated QT and the measured QT signal. Conclusions: In apparently healthy subjects with normal conventional ECGs, functional cardiac age can be estimated by multiple linear regression analysis of mostly advanced ECG results. Because some parameters in the regression formula, such as QTcorr, high frequency QRS amplitude and P-wave width, also change with disease in the same direction as with increased age, increased functional age of the heart may reflect subtle age-related pathologies in cardiac electrical function that are usually hidden on the conventional ECG.
Reflexion on linear regression trip production modelling method for ensuring good model quality
NASA Astrophysics Data System (ADS)
Suprayitno, Hitapriya; Ratnasari, Vita
2017-11-01
Transport modelling is important. For certain cases the conventional model still has to be used, for which having a good trip production model is crucial. A good model can only be obtained from a good sample. Two of the basic principles of good sampling are having a sample capable of representing the population characteristics and capable of producing an acceptable error at a certain confidence level. It seems that these principles are not yet well understood and used in trip production modelling. Therefore, it is necessary to investigate trip production modelling practice in Indonesia and to formulate a better modelling method for ensuring model quality. The results are presented as follows. Statistics provides a method to calculate the span of predicted values at a certain confidence level for linear regression, called the confidence interval of the predicted value. Common modelling practice uses R2 as the principal quality measure, while sampling practice varies and does not always conform to sampling principles. An experiment indicates that a small sample can already give an excellent R2 value and that sample composition can significantly change the model. Hence, a good R2 value does not, in fact, always mean good model quality. This leads to three basic ideas for ensuring good model quality: reformulating the quality measure, the calculation procedure, and the sampling method. The quality measure is defined as having a good R2 value and a good confidence interval of the predicted value. The calculation procedure must incorporate statistical calculation methods and the appropriate statistical tests. A good sampling method must incorporate random, well-distributed, stratified sampling with a certain minimum number of samples. These three ideas need to be further developed and tested.
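The confidence interval of a predicted value mentioned above follows the standard simple-regression formula; the sketch below computes it on synthetic trip-production-like data.

```python
# Sketch: the prediction interval for a new observation at x0 in simple
# linear regression, from the standard textbook formula.
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
x = rng.uniform(0, 10, 30)                      # e.g. household-size proxy
y = 2.0 + 0.8 * x + rng.normal(0, 1.0, 30)      # e.g. trips produced

n = x.size
b, a = np.polyfit(x, y, 1)                      # slope, intercept
resid = y - (a + b * x)
s = np.sqrt(resid @ resid / (n - 2))            # residual standard error
sxx = np.sum((x - x.mean()) ** 2)

x0 = 7.0
y0 = a + b * x0
half = stats.t.ppf(0.975, n - 2) * s * np.sqrt(1 + 1/n + (x0 - x.mean())**2 / sxx)
print(f"prediction at x0={x0}: {y0:.2f} +/- {half:.2f} (95% interval)")
# A high R2 alone says nothing about this interval's width, which also
# depends on n and on how far x0 lies from the sample mean.
```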
The Application of the Cumulative Logistic Regression Model to Automated Essay Scoring
ERIC Educational Resources Information Center
Haberman, Shelby J.; Sinharay, Sandip
2010-01-01
Most automated essay scoring programs use a linear regression model to predict an essay score from several essay features. This article applied a cumulative logit model instead of the linear regression model to automated essay scoring. Comparison of the performances of the linear regression model and the cumulative logit model was performed on a…
SigrafW: An easy-to-use program for fitting enzyme kinetic data.
Leone, Francisco Assis; Baranauskas, José Augusto; Furriel, Rosa Prazeres Melo; Borin, Ivana Aparecida
2005-11-01
SigrafW is Windows-compatible software developed using the Microsoft® Visual Basic Studio program that uses the simplified Hill equation for fitting kinetic data from allosteric and Michaelian enzymes. SigrafW uses a modified Fibonacci search to calculate maximal velocity (V), the Hill coefficient (n), and the enzyme-substrate apparent dissociation constant (K). The estimation of V, K, and the sum of the squares of residuals is performed using a Wilkinson nonlinear regression at any Hill coefficient (n). In contrast to many currently available kinetic analysis programs, SigrafW shows several advantages for the determination of kinetic parameters of both hyperbolic and nonhyperbolic saturation curves. No initial estimates of the kinetic parameters are required, a measure of the goodness-of-the-fit for each calculation performed is provided, the nonlinear regression used for calculations eliminates the statistical bias inherent in linear transformations, and the software can be used for enzyme kinetic simulations either for educational or research purposes. Persons interested in receiving a free copy of the software should contact Dr. F. A. Leone. Copyright © 2005 International Union of Biochemistry and Molecular Biology, Inc.
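A sketch of the same fitting task using scipy's curve_fit in place of SigrafW's Fibonacci-search/Wilkinson procedure; the substrate-velocity data and starting values are assumptions.

```python
# Sketch: fitting the Hill equation v = V*S^n / (K^n + S^n) by nonlinear
# least squares, the kind of fit SigrafW performs. scipy's curve_fit is
# substituted for the program's own search; data are synthetic.
import numpy as np
from scipy.optimize import curve_fit

def hill(S, V, K, n):
    return V * S ** n / (K ** n + S ** n)

rng = np.random.default_rng(7)
S = np.logspace(-2, 1, 12)                      # substrate concentrations
v = hill(S, V=10.0, K=0.5, n=2.0) + rng.normal(0, 0.15, S.size)

(V, K, n), cov = curve_fit(hill, S, v, p0=(max(v), np.median(S), 1.0))
sse = np.sum((v - hill(S, V, K, n)) ** 2)       # goodness-of-fit measure
print(f"V={V:.2f}, K={K:.3f}, n={n:.2f}, SSE={sse:.4f}")
```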
Calculating stage duration statistics in multistage diseases.
Komarova, Natalia L; Thalhauser, Craig J
2011-01-01
Many human diseases are characterized by multiple stages of progression. While the typical sequence of disease progression can be identified, there may be large individual variations among patients. Identifying mean stage durations and their variations is critical for statistical hypothesis testing needed to determine if treatment is having a significant effect on the progression, or if a new therapy is showing a delay of progression through a multistage disease. In this paper we focus on two methods for extracting stage duration statistics from longitudinal datasets: an extension of the linear regression technique, and a counting algorithm. Both are non-iterative, non-parametric and computationally cheap methods, which makes them invaluable tools for studying the epidemiology of diseases, with a goal of identifying different patterns of progression by using bioinformatics methodologies. Here we show that the regression method performs well for calculating the mean stage durations under a wide variety of assumptions, however, its generalization to variance calculations fails under realistic assumptions about the data collection procedure. On the other hand, the counting method yields reliable estimations for both means and variances of stage durations. Applications to Alzheimer disease progression are discussed.
Factors associated with parasite dominance in fishes from Brazil.
Amarante, Cristina Fernandes do; Tassinari, Wagner de Souza; Luque, Jose Luis; Pereira, Maria Julia Salim
2016-06-14
The present study used regression models to evaluate the existence of factors that may influence numerical parasite dominance, with an epidemiological approach. A database including 3,746 fish specimens and their respective parasites was used to evaluate the relationship between parasite dominance and biotic characteristics inherent to the studied hosts and the parasite taxa. Multivariate classical and mixed effects linear regression models were fitted. The calculations were performed using R software (95% CI). In the fitting of the classical multiple linear regression model, freshwater and planktivorous fish species and body length, as well as the species of the taxa Trematoda, Monogenea, and Hirudinea, were associated with parasite dominance. However, the fitting of the mixed effects model showed that the body length of the host and the species of the taxa Nematoda, Trematoda, Monogenea, Hirudinea, and Crustacea were significantly associated with parasite dominance. Studies that consider specific biological aspects of the hosts and parasites should expand the knowledge regarding factors that influence the numerical dominance of parasites in fishes from Brazil. The use of a mixed model shows, once again, the importance of using a model suited to the characteristics of the data to obtain consistent results.
NASA Astrophysics Data System (ADS)
Gao, Xiangyun; An, Haizhong; Fang, Wei; Huang, Xuan; Li, Huajiao; Zhong, Weiqiong; Ding, Yinghui
2014-07-01
The linear regression parameters between two time series can differ under different lengths of the observation period. If we study the whole period through a sliding window of a short period, the change of the linear regression parameters is a process of dynamic transmission over time. We present a simple and efficient computational scheme, a linear regression pattern transmission algorithm, which transforms linear regression patterns into directed and weighted networks. The linear regression patterns (nodes) are defined by the combination of intervals of the linear regression parameters and the results of significance testing under different sizes of the sliding window. The transmissions between adjacent patterns are defined as edges, and the weights of the edges are the frequencies of the transmissions. The major patterns, the distance, and the medium in the process of transmission can be captured. The statistical results for weighted out-degree and betweenness centrality are mapped onto timelines, which shows the features of the distribution of the results. Many measurements in different areas that involve two related time series variables could take advantage of this algorithm to characterize the dynamic relationship between the time series from a new perspective.
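A minimal sketch of the pattern-transmission construction, assuming slope sign plus significance as the pattern definition (the paper also bins parameter intervals and varies the window size); transition counts serve as edge weights.

```python
# Sketch: label each sliding window by its regression slope sign and
# significance, then count transitions between consecutive labels to get a
# directed, weighted network.
import numpy as np
from collections import Counter
from scipy.stats import linregress

rng = np.random.default_rng(8)
n = 500
x = np.cumsum(rng.normal(size=n))
y = 0.5 * x + np.cumsum(rng.normal(size=n))      # two related series

window, step = 50, 10
patterns = []
for start in range(0, n - window, step):
    r = linregress(x[start:start + window], y[start:start + window])
    sign = "+" if r.slope > 0 else "-"
    sig = "sig" if r.pvalue < 0.05 else "ns"
    patterns.append(f"{sign}/{sig}")             # node label

edges = Counter(zip(patterns, patterns[1:]))     # weighted directed edges
for (u, v), w in edges.most_common(5):
    print(f"{u} -> {v}: weight {w}")
```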
Jayabharathi, Jayaraman; Thanikachalam, Venugopal; Venkatesh Perumal, Marimuthu
2012-09-01
The synthesized imidazole derivative 2-(2,4-difluorophenyl)-1-(4-methoxyphenyl)-1H-imidazo[4,5-f][1,10]phenanthroline (dfpmpip) has been characterized using IR, mass, (1)H and (13)C NMR, and elemental analysis. The photophysical properties of dfpmpip have been studied using UV-visible and fluorescence spectroscopy in different solvents. The solvent effect on the absorption and fluorescence bands has been analyzed by a multi-component linear regression. Theoretically calculated bond lengths, bond angles and dihedral angles are found to be slightly larger than the X-ray diffraction (XRD) values of its parent compound. The charge distribution has been calculated from the atomic charges by non-linear optical (NLO) and natural bond orbital (NBO) analysis. Since the synthesized imidazole derivative has the largest μ(g)β(0) value, the reported imidazole can be used as a potential NLO material. The energies of the highest occupied molecular orbital (HOMO) and lowest unoccupied molecular orbital (LUMO) levels and the molecular electrostatic potential (MEP) energy surface studies evidence the existence of intramolecular charge transfer (ICT) within the molecule. The chemical potential (μ), hardness (η) and electrophilicity index (ω) have also been calculated. Copyright © 2012 Elsevier B.V. All rights reserved.
Relationships between age and dental attrition in Australian aboriginals.
Richards, L C; Miller, S L
1991-02-01
Tooth wear scores (ratios of exposed dentin to total crown area) were calculated from dental casts of Australian Aboriginal subjects of known age from three populations. Linear regression equations relating attrition scores to age were derived. The slope of the regression line reflects the rate of tooth wear, and the intercept is related to the timing of first exposure of dentin. Differences in morphology between anterior and posterior teeth are reflected in a linear relationship between attrition scores and age for anterior teeth but a logarithmic relationship for posterior teeth. Correlations between age and attrition range from less than 0.40 for third molars (where differences in the eruption and occlusion of the teeth resulted in different patterns of wear) to greater than 0.80 for the premolars and first molars. Because of the generally high correlations between age and attrition, it is possible to estimate age from the extent of tooth wear with confidence limits of the order of +/- 10 years.
Acquisition Challenge: The Importance of Incompressibility in Comparing Learning Curve Models
2015-10-01
parameters for all four learning models used in the study. The learning rate factor, b, is the slope of the linear regression line, which in this case is...incorporated within the DoD acquisition environment. This study tested three alternative learning models (the Stanford-B model, DeJong's learning formula...appropriate tools to calculate accurate and reliable predictions. However, conventional learning curve methodology has been in practice since the pre
The Hydrothermal Chemistry of Gold, Arsenic, Antimony, Mercury and Silver
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bessinger, Brad; Apps, John A.
2003-03-23
A comprehensive thermodynamic database based on the Helgeson-Kirkham-Flowers (HKF) equation of state was developed for metal complexes in hydrothermal systems. Because this equation of state has been shown to accurately predict standard partial molal thermodynamic properties of aqueous species at elevated temperatures and pressures, this study provides the necessary foundation for future exploration into transport and depositional processes in polymetallic ore deposits. The HKF equation of state parameters for gold, arsenic, antimony, mercury, and silver sulfide and hydroxide complexes were derived from experimental equilibrium constants using nonlinear regression calculations. In order to ensure that the resulting parameters were internally consistent, those experiments utilizing incompatible thermodynamic data were re-speciated prior to regression. Because new experimental studies were used to revise the HKF parameters for H2S0 and HS-1, those metal complexes for which HKF parameters had been previously derived were also updated. It was found that predicted thermodynamic properties of metal complexes are consistent with linear correlations between standard partial molal thermodynamic properties. This result allowed assessment of several complexes for which the experimental data necessary to perform regression calculations were limited. Oxygen fugacity-temperature diagrams were calculated to illustrate how thermodynamic data improve our understanding of depositional processes. Predicted thermodynamic properties were used to investigate metal transport in Carlin-type gold deposits. Assuming a linear relationship between temperature and pressure, metals are predicted to be transported predominantly as sulfide complexes at a total aqueous sulfur concentration of 0.05 m. Also, the presence of arsenic and antimony mineral phases in the deposits is shown to restrict mineralization to a limited range of chemical conditions. Finally, at a lesser aqueous sulfur concentration of 0.01 m, host rock sulfidation can explain the origin of arsenic and antimony minerals within the paragenetic sequence.
Katkov, Igor I
2008-10-01
Some aspects of proper linearization of the Boyle-van't Hoff (BVH) relationship for calculation of the osmotically inactive volume v(b), and of the Arrhenius plot (AP) for the activation energy E(a), are discussed. It is shown that the commonly used determination of the slope and the intercept (v(b)), which are presumed independent of each other, is invalid if the initial intracellular molality m(0) is known. Instead, linear regression with only one independent parameter (v(b)), or the least square method (LSM) with v(b) as the only fitting parameter, must be applied. The slope can then be calculated from the BVH relationship as a function of v(b). In the case of unknown m(0) (for example, if cells are preloaded with trehalose, or electroporation caused ion leakage, etc.), m(0) is considered the second independent statistical parameter to be found. In this (and only this) scenario, all three methods give the same results for v(b) and m(0). The AP can be linearized only for water hydraulic conductivity (L(p)) and solute mobility (omega(s)), while water and solute permeabilities P(w) ≡ L(p)RT and P(s) ≡ omega(s)RT cannot be linearized because they have a pre-exponential factor (RT) that depends on the temperature T.
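A sketch of the one-parameter fit the author argues for, assuming the normalized BVH form v(m) = v_b + (1 - v_b)(m0/m) with known m0; the volume data are invented.

```python
# Sketch: fitting the osmotically inactive volume v_b as the single free
# parameter of the Boyle-van't Hoff relation, so that the slope and the
# intercept are tied together through v_b rather than fitted independently.
import numpy as np
from scipy.optimize import least_squares

m0 = 0.3                                          # known initial molality
m = np.array([0.15, 0.3, 0.6, 0.9, 1.2])          # external molalities
v_obs = np.array([1.58, 1.00, 0.71, 0.62, 0.55])  # normalized cell volumes

def resid(vb):
    # v(m) = v_b + (1 - v_b) * (m0 / m)
    return vb + (1 - vb) * (m0 / m) - v_obs

fit = least_squares(resid, x0=np.array([0.3]))
vb = fit.x[0]
print(f"v_b = {vb:.3f}; implied BVH slope (1 - v_b) = {1 - vb:.3f}")
```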
Energy-free machine learning force field for aluminum.
Kruglov, Ivan; Sergeev, Oleg; Yanilkin, Alexey; Oganov, Artem R
2017-08-17
We used the machine learning technique of Li et al. (PRL 114, 2015) for molecular dynamics simulations. Atomic configurations were described by a feature matrix based on internal vectors, and linear regression was used as the learning technique. We implemented this approach in the LAMMPS code. The method was applied to crystalline and liquid aluminum and uranium at different temperatures and densities, and showed the highest accuracy among different published potentials. The phonon density of states, entropy and melting temperature of aluminum were calculated using this machine learning potential. The results are in excellent agreement with experimental data and with the results of full ab initio calculations.
Posa, Mihalj; Pilipović, Ana; Lalić, Mladena; Popović, Jovan
2011-02-15
A linear dependence between temperature (t) and the retention coefficient (k, reversed-phase HPLC) of bile acids is obtained. The parameters (a, intercept; b, slope) of the linear function k = f(t) correlate highly with the bile acids' structures. The investigated bile acids form linear congeneric groups on a principal component score plot (calculated from k = f(t)) that are in accordance with the conformations of the hydroxyl and oxo groups on the bile acid steroid skeleton. The partition coefficient (K(p)) of nitrazepam in bile acid micelles is investigated. Nitrazepam molecules incorporated in micelles show modified bioavailability (depot effect, higher permeability, etc.). Using the multiple linear regression method, QSAR models of the nitrazepam partition coefficient K(p) are derived at temperatures of 25°C and 37°C. To derive the linear regression models at both temperatures, experimentally obtained lipophilicity parameters (PC1 from the k = f(t) data) and in silico descriptors of molecular shape are included, while at the higher temperature molecular polarization is introduced as well. This points to the fact that the mechanism of incorporation of nitrazepam into BA micelles changes at higher temperatures. QSAR models are also derived using the partial least squares method. The experimental parameters k = f(t) are shown to be significant predictive variables. Both QSAR models are validated using cross-validation and internal validation; the PLS models have slightly higher predictive capability than the MLR models. Copyright © 2010 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Thompson, A. P.; Swiler, L. P.; Trott, C. R.; Foiles, S. M.; Tucker, G. J.
2015-03-01
We present a new interatomic potential for solids and liquids called Spectral Neighbor Analysis Potential (SNAP). The SNAP potential has a very general form and uses machine-learning techniques to reproduce the energies, forces, and stress tensors of a large set of small configurations of atoms, which are obtained using high-accuracy quantum electronic structure (QM) calculations. The local environment of each atom is characterized by a set of bispectrum components of the local neighbor density projected onto a basis of hyperspherical harmonics in four dimensions. The bispectrum components are the same bond-orientational order parameters employed by the GAP potential [1]. The SNAP potential, unlike GAP, assumes a linear relationship between atom energy and bispectrum components. The linear SNAP coefficients are determined using weighted least-squares linear regression against the full QM training set. This allows the SNAP potential to be fit in a robust, automated manner to large QM data sets using many bispectrum components. The calculation of the bispectrum components and the SNAP potential are implemented in the LAMMPS parallel molecular dynamics code. We demonstrate that a previously unnoticed symmetry property can be exploited to reduce the computational cost of the force calculations by more than one order of magnitude. We present results for a SNAP potential for tantalum, showing that it accurately reproduces a range of commonly calculated properties of both the crystalline solid and the liquid phases. In addition, unlike simpler existing potentials, SNAP correctly predicts the energy barrier for screw dislocation migration in BCC tantalum.
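A toy version of the weighted least-squares fitting step, with random features standing in for per-configuration sums of bispectrum components and arbitrary weights; the real SNAP fit also includes forces and stresses in the design matrix.

```python
# Sketch: weighted least-squares linear regression of reference energies on
# per-configuration descriptor sums, the fitting step described above.
import numpy as np

rng = np.random.default_rng(9)
n_cfg, n_desc = 200, 30
B = rng.normal(size=(n_cfg, n_desc))            # bispectrum stand-ins
beta_true = rng.normal(size=n_desc)
E = B @ beta_true + rng.normal(0, 0.01, n_cfg)  # "QM" reference energies
w = rng.uniform(0.5, 2.0, n_cfg)                # per-configuration weights

# weighted LSQ via sqrt-weight row scaling
sw = np.sqrt(w)[:, None]
coef, *_ = np.linalg.lstsq(sw * B, np.sqrt(w) * E, rcond=None)
rmse = np.sqrt(np.mean((B @ coef - E) ** 2))
print(f"energy RMSE of weighted linear fit: {rmse:.5f}")
```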
1974-01-01
REGRESSION MODEL - THE UNCONSTRAINED, LINEAR EQUALITY AND INEQUALITY CONSTRAINED APPROACHES, January 1974, Nelson Delfino d'Avila Mascarenhas. Image...Report 520, DIGITAL IMAGE RESTORATION UNDER A REGRESSION MODEL - THE UNCONSTRAINED, LINEAR EQUALITY AND INEQUALITY CONSTRAINED APPROACHES, January 1974...a two-dimensional form adequately describes the linear model. A discretization is performed by using quadrature methods. By trans
The allometry of coarse root biomass: log-transformed linear regression or nonlinear regression?
Lai, Jiangshan; Yang, Bo; Lin, Dunmei; Kerkhoff, Andrew J; Ma, Keping
2013-01-01
Precise estimation of root biomass is important for understanding carbon stocks and dynamics in forests. Traditionally, biomass estimates are based on allometric scaling relationships between stem diameter and coarse root biomass calculated using linear regression (LR) on log-transformed data. Recently, it has been suggested that nonlinear regression (NLR) is a preferable fitting method for scaling relationships. But while this claim has been contested on both theoretical and empirical grounds, and statistical methods have been developed to aid in choosing between the two methods in particular cases, few studies have examined the ramifications of erroneously applying NLR. Here, we use direct measurements of 159 trees belonging to three locally dominant species in east China to compare the LR and NLR models of diameter-root biomass allometry. We then contrast model predictions by estimating stand coarse root biomass based on census data from the nearby 24-ha Gutianshan forest plot and by testing the ability of the models to predict known root biomass values measured on multiple tropical species at the Pasoh Forest Reserve in Malaysia. Based on likelihood estimates for model error distributions, as well as the accuracy of extrapolative predictions, we find that LR on log-transformed data is superior to NLR for fitting diameter-root biomass scaling models. More importantly, inappropriately using NLR leads to grossly inaccurate stand biomass estimates, especially for stands dominated by smaller trees.
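The LR-versus-NLR comparison above, sketched for a power-law allometry with multiplicative (lognormal) errors, the error structure under which the log-scale linear fit is the appropriate one; data are synthetic.

```python
# Sketch: log-transformed linear regression versus nonlinear regression for
# the allometry y = a * D^b. The two fits assume different error structures
# (multiplicative vs. additive), which is why they can disagree.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(10)
D = rng.uniform(5, 60, 100)                          # stem diameters, cm
y = 0.02 * D ** 2.4 * np.exp(rng.normal(0, 0.3, 100))  # multiplicative error

b_lr, log_a = np.polyfit(np.log(D), np.log(y), 1)    # LR on log-log scale
(a_nl, b_nl), _ = curve_fit(lambda D, a, b: a * D ** b, D, y, p0=(0.1, 2.0))
print(f"LR : a={np.exp(log_a):.4f}, b={b_lr:.3f}")
print(f"NLR: a={a_nl:.4f}, b={b_nl:.3f}  (true: a=0.02, b=2.4)")
# With lognormal (multiplicative) errors, as root biomass data typically
# show, the log-scale LR is the statistically appropriate choice.
```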
Ochi, H; Ikuma, I; Toda, H; Shimada, T; Morioka, S; Moriyama, K
1989-12-01
In order to determine whether the isovolumic relaxation period (IRP) reflects left ventricular relaxation under different afterload conditions, 17 anesthetized, open-chest dogs were studied, and the left ventricular pressure decay time constant (T) was calculated. In 12 dogs, angiotensin II and nitroprusside were administered, with the heart rate held constant at 90 beats/min. Multiple linear regression analysis showed that the aortic dicrotic notch pressure (AoDNP) and T were major determinants of IRP, while left ventricular end-diastolic pressure was a minor determinant. Multiple linear regression correlating T with IRP and AoDNP did not further improve the correlation coefficient compared with that between T and IRP alone. We concluded that correction of the IRP by AoDNP is not necessary to predict T. The effects of ascending aortic constriction or angiotensin II on IRP were examined in five dogs after pretreatment with propranolol. Aortic constriction caused a significant decrease in IRP and T, while angiotensin II produced a significant increase in IRP and T. IRP was affected by changes in afterload; however, the IRP and T values were always altered in the same direction. These results demonstrate that IRP can be substituted for T and reflects left ventricular relaxation even under different afterload conditions. We conclude that IRP is a simple parameter that can easily be used to evaluate left ventricular relaxation in clinical situations.
Adachi, Daiki; Nishiguchi, Shu; Fukutani, Naoto; Hotta, Takayuki; Tashiro, Yuto; Morino, Saori; Shirooka, Hidehiko; Nozaki, Yuma; Hirata, Hinako; Yamaguchi, Moe; Yorozu, Ayanori; Takahashi, Masaki; Aoyama, Tomoki
2017-05-01
The purpose of this study was to investigate which spatial and temporal parameters of the Timed Up and Go (TUG) test are associated with motor function in elderly individuals. This study included 99 community-dwelling women aged 72.9 ± 6.3 years. Step length, step width, single support time, variability of the aforementioned parameters, gait velocity, cadence, reaction time from the starting signal to the first step, and the minimum distance between the foot and a marker placed 3 m in front of the chair were measured using our analysis system. The 10-m walk test, five times sit-to-stand (FTSTS) test, and one-leg standing (OLS) test were used to assess motor function. Stepwise multivariate linear regression analysis was used to determine which TUG test parameters were associated with each motor function test. Finally, we calculated a predictive model for each motor function test using each regression coefficient. In the stepwise linear regression analysis, step length and cadence were significantly associated with the 10-m walk test, the FTSTS test, and the OLS test. Reaction time was associated with the FTSTS test, and step width was associated with the OLS test. Each predictive model showed a strong correlation with the 10-m walk test and the OLS test (P < 0.01), although this correlation was not significantly higher than that of the TUG test time alone. We showed which TUG test parameters were associated with each motor function test. Moreover, the TUG test time, regarded as a measure of lower extremity function and mobility, has strong predictive ability for each motor function test. Copyright © 2017 The Japanese Orthopaedic Association. Published by Elsevier B.V. All rights reserved.
Optimal Estimation of Clock Values and Trends from Finite Data
NASA Technical Reports Server (NTRS)
Greenhall, Charles
2005-01-01
We show how to solve two problems of optimal linear estimation from a finite set of phase data. Clock noise is modeled as a stochastic process with stationary dth increments. The covariance properties of such a process are contained in the generalized autocovariance function (GACV). We set up two principles for optimal estimation: with the help of the GACV, these principles lead to a set of linear equations for the regression coefficients and some auxiliary parameters. The mean square errors of the estimators are easily calculated. The method can be used to check the results of other methods and to find good suboptimal estimators based on a small subset of the available data.
The microcomputer scientific software series 2: general linear model--regression.
Harold M. Rauscher
1983-01-01
The general linear model regression (GLMR) program provides the microcomputer user with a sophisticated regression analysis capability. The output provides a regression ANOVA table, estimators of the regression model coefficients, their confidence intervals, confidence intervals around the predicted Y-values, residuals for plotting, a check for multicollinearity, a...
Van Looy, Stijn; Verplancke, Thierry; Benoit, Dominique; Hoste, Eric; Van Maele, Georges; De Turck, Filip; Decruyenaere, Johan
2007-01-01
Tacrolimus is an important immunosuppressive drug for organ transplantation patients. It has a narrow therapeutic range, toxic side effects, and a blood concentration with wide intra- and interindividual variability. Hence, it is of the utmost importance to monitor tacrolimus blood concentration, thereby ensuring clinical effect and avoiding toxic side effects. Prediction models for tacrolimus blood concentration can improve clinical care by optimizing monitoring of these concentrations, especially in the initial phase after transplantation during intensive care unit (ICU) stay. This is the first study in the ICU in which support vector machines, as a new data modeling technique, are investigated and tested for their ability to predict tacrolimus blood concentration. Linear support vector regression (SVR) and nonlinear radial basis function (RBF) SVR are compared with multiple linear regression (MLR). Tacrolimus blood concentrations, together with 35 other relevant variables from 50 liver transplantation patients, were extracted from our ICU database. This resulted in a dataset of 457 blood samples, on average between 9 and 10 samples per patient, finally resulting in a database of more than 16,000 data values. Nonlinear RBF SVR, linear SVR, and MLR were performed after selection of clinically relevant input variables and model parameters. Differences between observed and predicted tacrolimus blood concentrations were calculated. Prediction accuracy of the three methods was compared after fivefold cross-validation (Friedman test and Wilcoxon signed rank analysis). Linear SVR and nonlinear RBF SVR had mean absolute differences between observed and predicted tacrolimus blood concentrations of 2.31 ng/ml (standard deviation [SD] 2.47) and 2.38 ng/ml (SD 2.49), respectively. MLR had a mean absolute difference of 2.73 ng/ml (SD 3.79). The difference between linear SVR and MLR was statistically significant (p < 0.001). RBF SVR had the advantage of requiring only 2 input variables to perform this prediction, in comparison with the 15 and 16 variables needed by linear SVR and MLR, respectively. This is an indication of the superior prediction capability of nonlinear SVR. Prediction of tacrolimus blood concentration with linear and nonlinear SVR was excellent, and accuracy was superior to that of an MLR model.
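A sketch of the three-way comparison using scikit-learn, with a synthetic regression set standing in for the ICU database and default hyperparameters throughout; standardization is added for the SVR models, a common practical assumption.

```python
# Sketch: comparing linear SVR, RBF SVR, and multiple linear regression by
# cross-validated mean absolute error.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

X, y = make_regression(n_samples=450, n_features=16, noise=10.0,
                       random_state=0)

models = {
    "MLR": LinearRegression(),
    "linear SVR": make_pipeline(StandardScaler(), SVR(kernel="linear")),
    "RBF SVR": make_pipeline(StandardScaler(), SVR(kernel="rbf")),
}
for name, model in models.items():
    mae = -cross_val_score(model, X, y, cv=5,
                           scoring="neg_mean_absolute_error").mean()
    print(f"{name}: MAE = {mae:.2f}")
```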
DOE Office of Scientific and Technical Information (OSTI.GOV)
2015-09-14
This package contains statistical routines for extracting features from multivariate time-series data, which can then be used for subsequent multivariate statistical analysis to identify patterns and anomalous behavior. It calculates local linear or quadratic regression model fits to moving windows for each series and then summarizes the model coefficients across user-defined time intervals for each series. These methods are domain agnostic, but they have been successfully applied to a variety of domains, including commercial aviation and electric power grid data.
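A minimal sketch of the windowed-regression feature extraction described here: quadratic fits on moving windows, then coefficient summaries over coarser intervals; the window width, interval length, and test series are arbitrary.

```python
# Sketch: moving-window quadratic regression features, then per-interval
# summaries of the fitted coefficients.
import numpy as np

rng = np.random.default_rng(11)
t = np.arange(1000, dtype=float)
series = np.sin(t / 50) + rng.normal(0, 0.1, t.size)

window = 50
coeffs = np.array([np.polyfit(np.arange(window), series[i:i + window], 2)
                   for i in range(0, t.size - window)])
# summarize (quadratic, linear, constant) coefficients per 200-point interval
interval = 200
for start in range(0, len(coeffs), interval):
    block = coeffs[start:start + interval]
    print(f"t={start}-{start + len(block)}: mean coeffs = {block.mean(axis=0).round(4)}")
```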
Multi-objective Optimization of Solar Irradiance and Variance at Pertinent Inclination Angles
NASA Astrophysics Data System (ADS)
Jain, Dhanesh; Lalwani, Mahendra
2018-05-01
The performance of a photovoltaic panel is strongly affected by changes in atmospheric conditions and by the angle of inclination. This article evaluates the optimum tilt angle and orientation angle (surface azimuth angle) for a solar photovoltaic array in order to obtain maximum solar irradiance and to reduce the variance of radiation at different sets or subsets of time periods. Non-linear regression and adaptive neuro-fuzzy inference system (ANFIS) methods are used for predicting the solar radiation. The ANFIS results are more accurate than those of non-linear regression. These results are further used for evaluating the correlation and for estimating the optimum combination of tilt angle and orientation angle with the help of the general algebraic modelling system and a multi-objective genetic algorithm. The hourly average solar irradiation is calculated at different combinations of tilt angle and orientation angle using horizontal-surface radiation data for Jodhpur (Rajasthan, India). The hourly average solar irradiance is calculated for three cases: zero variance, actual variance and double variance, at different time scenarios. It is concluded that monthly adjusted angles produce better results than bimonthly, seasonal, half-yearly or yearly adjustments. The profit obtained with a monthly varying angle is 4.6% higher with zero variance and 3.8% higher with actual variance than with an annually fixed angle.
Influence of landscape-scale factors in limiting brook trout populations in Pennsylvania streams
Kocovsky, P.M.; Carline, R.F.
2006-01-01
Landscapes influence the capacity of streams to produce trout through their effect on water chemistry and other factors at the reach scale. Trout abundance also fluctuates over time; thus, to thoroughly understand how spatial factors at landscape scales affect trout populations, one must assess the changes in populations over time to provide a context for interpreting the importance of spatial factors. We used data from the Pennsylvania Fish and Boat Commission's fisheries management database to investigate spatial factors that affect the capacity of streams to support brook trout Salvelinus fontinalis and to provide models useful for their management. We assessed the relative importance of spatial and temporal variation by calculating variance components and comparing relative standard errors for spatial and temporal variation. We used binary logistic regression to predict the presence of harvestable-length brook trout and multiple linear regression to assess the mechanistic links between landscapes and trout populations and to predict population density. The variance in trout density among streams was equal to or greater than the temporal variation for several streams, indicating that differences among sites affect population density. Logistic regression models correctly predicted the absence of harvestable-length brook trout in 60% of validation samples. The r2-value for the linear regression model predicting density was 0.3, indicating low predictive ability. Both logistic and linear regression models supported buffering capacity against acid episodes as an important mechanistic link between landscapes and trout populations. Although our models fail to predict trout densities precisely, their success at elucidating the mechanistic links between landscapes and trout populations, in concert with the importance of spatial variation, increases our understanding of factors affecting brook trout abundance and will help managers and private groups to protect and enhance populations of wild brook trout. © Copyright by the American Fisheries Society 2006.
Inferring gene regression networks with model trees
2010-01-01
Background: Novel strategies are required in order to handle the huge amount of data produced by microarray technologies. To infer gene regulatory networks, the first step is to find direct regulatory relationships between genes by building the so-called gene co-expression networks. They are typically generated using correlation statistics as pairwise similarity measures. Correlation-based methods are very useful for determining whether two genes have a strong global similarity, but they do not detect local similarities. Results: We propose model trees as a method to identify gene interaction networks. While correlation-based methods analyze each pair of genes, in our approach we generate a single regression tree for each gene from the remaining genes. Finally, a graph of all the relationships among output and input genes is built, taking into account whether the pair of genes is statistically significant. For this reason we apply a statistical procedure to control the false discovery rate. The performance of our approach, named REGNET, is experimentally tested on two well-known data sets: the Saccharomyces cerevisiae and E. coli data sets. First, the biological coherence of the results is tested. Second, the E. coli transcriptional network (in the RegulonDB database) is used as a control to compare the results to those of a correlation-based method. This experiment shows that REGNET performs more accurately at detecting true gene associations than the Pearson and Spearman zeroth- and first-order correlation-based methods. Conclusions: REGNET generates gene association networks from gene expression data, and differs from correlation-based methods in that the relationship between one gene and others is calculated simultaneously. Model trees are very useful techniques for estimating the numerical values of the target genes by linear regression functions. They are often more precise than linear regression models because they can fit different linear regressions to separate areas of the search space, favoring the inference of localized similarities over a more global similarity. Furthermore, experimental results show the good performance of REGNET. PMID:20950452
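A rough sketch of the per-target-gene idea, substituting scikit-learn's regression trees for the paper's model trees (trees with linear models in the leaves, which scikit-learn does not provide) and an ad hoc importance threshold for the paper's significance procedure; the expression matrix is synthetic.

```python
# Sketch: predict each gene from all others with a regression tree, and turn
# strong predictors into directed network edges.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(12)
n_samples, n_genes = 80, 10
expr = rng.normal(size=(n_samples, n_genes))
expr[:, 3] = 0.9 * expr[:, 0] - 0.5 * expr[:, 7] + rng.normal(0, 0.2, n_samples)

edges = []
for target in range(n_genes):
    X = np.delete(expr, target, axis=1)
    tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, expr[:, target])
    for j, imp in enumerate(tree.feature_importances_):
        source = j if j < target else j + 1     # undo the column deletion
        if imp > 0.2:                           # ad hoc importance threshold
            edges.append((source, target, round(imp, 2)))
print("inferred edges (source, target, importance):", edges)
```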
Wang, D Z; Wang, C; Shen, C F; Zhang, Y; Zhang, H; Song, G D; Xue, X D; Xu, Z L; Zhang, S; Jiang, G H
2017-05-10
We described the time trend of acute myocardial infarction (AMI) incidence in Tianjin from 1999 to 2013 using the Cochran-Armitage trend (CAT) test and linear regression analysis, and the results were compared. Based on the actual population, the CAT test had much stronger statistical power than linear regression analysis for both the overall incidence trend and the age-specific incidence trends (Cochran-Armitage trend P value
Local Linear Regression for Data with AR Errors.
Li, Runze; Li, Yan
2009-07-01
In many statistical applications, data are collected over time, and they are likely correlated. In this paper, we investigate how to incorporate the correlation information into the local linear regression. Under the assumption that the error process is an auto-regressive process, a new estimation procedure is proposed for the nonparametric regression by using local linear regression method and the profile least squares techniques. We further propose the SCAD penalized profile least squares method to determine the order of auto-regressive process. Extensive Monte Carlo simulation studies are conducted to examine the finite sample performance of the proposed procedure, and to compare the performance of the proposed procedures with the existing one. From our empirical studies, the newly proposed procedures can dramatically improve the accuracy of naive local linear regression with working-independent error structure. We illustrate the proposed methodology by an analysis of real data set.
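For reference, the working-independence baseline the paper improves on is the plain kernel-weighted local linear smoother sketched below; the bandwidth and data are arbitrary, and the AR-error profile least squares machinery is not reproduced.

```python
# Sketch: local linear regression as kernel-weighted least squares. At each
# evaluation point, a straight line is fitted with Gaussian weights and the
# local intercept is the fitted value.
import numpy as np

rng = np.random.default_rng(13)
x = np.sort(rng.uniform(0, 10, 200))
y = np.sin(x) + rng.normal(0, 0.3, 200)

def local_linear(x0, x, y, h=0.8):
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)      # Gaussian kernel weights
    sw = np.sqrt(w)                             # sqrt-weight row scaling
    X = np.column_stack([np.ones_like(x), x - x0])
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta[0]                              # local intercept = fit at x0

grid = np.linspace(0, 10, 5)
print([round(local_linear(g, x, y), 3) for g in grid])
```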
Orthogonal Regression: A Teaching Perspective
ERIC Educational Resources Information Center
Carr, James R.
2012-01-01
A well-known approach to linear least squares regression is that which involves minimizing the sum of squared orthogonal projections of data points onto the best fit line. This form of regression is known as orthogonal regression, and the linear model that it yields is known as the major axis. A similar method, reduced major axis regression, is…
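A sketch of orthogonal (major-axis) regression via the first principal direction of the centered data cloud, contrasted with OLS; the error variances in the simulation are chosen equal so that the major axis recovers the true slope.

```python
# Sketch: orthogonal regression by total least squares, minimizing squared
# orthogonal distances to the line, versus ordinary least squares.
import numpy as np

rng = np.random.default_rng(14)
x_true = rng.normal(0, 2, 100)
y = 1.5 * x_true + rng.normal(0, 1, 100)        # noise in y ...
x = x_true + rng.normal(0, 1, 100)              # ... and in x

# OLS slope (errors assumed only in y)
ols_slope = np.polyfit(x, y, 1)[0]

# Major axis: first principal direction of the centered (x, y) cloud
M = np.column_stack([x - x.mean(), y - y.mean()])
_, _, Vt = np.linalg.svd(M, full_matrices=False)
ma_slope = Vt[0, 1] / Vt[0, 0]
print(f"OLS slope: {ols_slope:.3f}, major-axis slope: {ma_slope:.3f}")
# With errors in both variables, OLS is attenuated toward zero while the
# major axis treats x and y symmetrically.
```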
Practical Session: Simple Linear Regression
NASA Astrophysics Data System (ADS)
Clausel, M.; Grégoire, G.
2014-12-01
Two exercises are proposed to illustrate simple linear regression. The first one is based on the famous Galton data set on heredity. We use the lm R command and get coefficient estimates, the standard error of the error, R2, residuals… In the second example, devoted to data related to the vapor tension of mercury, we fit a simple linear regression, predict values, and anticipate multiple linear regression. This practical session is an excerpt from practical exercises proposed by A. Dalalyan at ENPC (see Exercises 1 and 2 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_4.pdf).
Sano, Yuko; Kandori, Akihiko; Shima, Keisuke; Yamaguchi, Yuki; Tsuji, Toshio; Noda, Masafumi; Higashikawa, Fumiko; Yokoe, Masaru; Sakoda, Saburo
2016-06-01
We propose a novel index of Parkinson's disease (PD) finger-tapping severity, called "PDFTsi," for quantifying the severity of symptoms related to the finger tapping of PD patients with high accuracy. To validate the efficacy of PDFTsi, the finger-tapping movements of normal controls and PD patients were measured using magnetic sensors, and 21 characteristics were extracted from the finger-tapping waveforms. To distinguish motor deterioration due to PD from that due to aging, the aging effect on finger tapping was removed from these characteristics. Principal component analysis (PCA) was applied to the age-normalized characteristics, and principal components that represented the motion properties of finger tapping were calculated. Multiple linear regression (MLR) with stepwise variable selection was applied to the principal components, and PDFTsi was calculated. The results indicate that PDFTsi has high estimation ability, namely a mean square error of 0.45. The estimation ability of PDFTsi is higher than that of the alternative method, MLR with stepwise variable selection without PCA, which has a mean square error of 1.30. This result suggests that PDFTsi can quantify PD finger-tapping severity accurately. Furthermore, interpretation of the model for calculating PDFTsi indicated that motion wideness and rhythm disorder are important for estimating PD finger-tapping severity.
Malignant testicular tumour incidence and mortality trends
Wojtyła-Buciora, Paulina; Więckowska, Barbara; Krzywinska-Wiewiorowska, Małgorzata; Gromadecka-Sutkiewicz, Małgorzata
2016-01-01
Aim of the study: In Poland testicular tumours are the most frequent cancer among men aged 20–44 years. Testicular tumour incidence since the 1980s and 1990s has been geographically diversified, with an increased risk of mortality in Wielkopolska Province, which was highlighted at the turn of the 1980s and 1990s. The aim of the study was a comparative analysis of the tendencies in incidence and death rates due to malignant testicular tumours observed among men in Poland and in Wielkopolska Province. Material and methods: Data from the National Cancer Registry were used for the calculations. The incidence/mortality rates among men due to malignant testicular cancer, as well as the tendencies in the incidence/death ratio observed in Poland and Wielkopolska, were established based on regression equations. The analysis was deepened by adopting a multiple linear regression model. A p-value < 0.05 was arbitrarily adopted as the criterion of statistical significance, and for multiple comparisons it was modified according to the Bonferroni adjustment to a value of p < 0.0028. Calculations were performed with the use of the PQStat v1.4.8 package. Results: The incidence of malignant testicular neoplasms observed among men in Poland and in Wielkopolska Province showed a significant rising tendency. The multiple linear regression model confirmed that the year variable is a strong incidence forecast factor only within the territory of Poland. A corresponding analysis of mortality rates among men in Poland and in Wielkopolska Province did not show any statistically significant correlations. Conclusions: Late diagnosis of Polish patients calls for appropriate educational activities that would facilitate earlier reporting by patients, thus increasing their chances of recovery. Introducing preventive examinations in regions with an increased risk of testicular tumour may allow earlier diagnosis. PMID:27095941
Morse Code, Scrabble, and the Alphabet
ERIC Educational Resources Information Center
Richardson, Mary; Gabrosek, John; Reischman, Diann; Curtiss, Phyliss
2004-01-01
In this paper we describe an interactive activity that illustrates simple linear regression. Students collect data and analyze it using simple linear regression techniques taught in an introductory applied statistics course. The activity is extended to illustrate checks for regression assumptions and regression diagnostics taught in an…
NASA Astrophysics Data System (ADS)
Arcenegui, V.; Morugán, A.; García-Orenes, F.; Zornoza, R.; Mataix-Solera, J.; Navarro, M. A.; Guerrero, C.; Mataix-Beneyto, J.
2009-04-01
The use of treated wastewater for the irrigation of agricultural soils is an alternative to utilizing better-quality water, especially in semiarid regions where water shortage is a very serious problem. However, this practice can modify the soil equilibrium and affect its quality. In this work two soil quality indices (models) are used to evaluate the effects of long-term irrigation with treated wastewater on soil. The models were developed by studying different soil properties in undisturbed forest soils in SE Spain, and the relationships between soil parameters were established using multiple linear regressions. Model 1, which explained 92% of the variance in soil organic carbon (SOC), showed that SOC can be calculated as a linear combination of 6 physical, chemical and biochemical properties (acid phosphatase, water holding capacity (WHC), electrical conductivity (EC), available phosphorus (P), cation exchange capacity (CEC) and aggregate stability (AS)). Model 2 explained 89% of the SOC variance, which can be calculated by means of 7 chemical and biochemical properties (urease, phosphatase, and
Validation and application of single breath cardiac output determinations in man
NASA Technical Reports Server (NTRS)
Loeppky, J. A.; Fletcher, E. R.; Myhre, L. G.; Luft, U. C.
1986-01-01
The results of a procedure for estimating cardiac output by a single-breath technique (Qsb), obtained in healthy males during supine rest and during exercise on a bicycle ergometer, were compared with the results on cardiac output obtained by the direct Fick method (QF). The single breath maneuver consisted of a slow exhalation to near residual volume following an inspiration somewhat deeper than normal. The Qsb calculations incorporated an equation of the CO2 dissociation curve and a 'moving spline' sequential curve-fitting technique to calculate the instantaneous R from points on the original expirogram. The resulting linear regression equation indicated a 24-percent underestimation of QF by the Qsb technique. After applying a correction, the Qsb-QF relationship was improved. A subsequent study during upright rest and exercise to 80 percent of VO2(max) in 6 subjects indicated a close linear relationship between Qsb and VO2 for all 95 values obtained, with slope and intercept close to those in published studies in which invasive cardiac output measurements were used.
Advanced statistics: linear regression, part II: multiple linear regression.
Marill, Keith A
2004-01-01
The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.
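To make one of the concepts above concrete, here is a small, hypothetical illustration of multicollinearity diagnosed with variance inflation factors (VIF); the variable names and data are invented, not from the article.

```python
# Multicollinearity demo: x2 is nearly a copy of x1, inflating its VIF and
# destabilizing the multiple-regression coefficients.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(1)
x1 = rng.normal(size=200)
x2 = 0.95 * x1 + rng.normal(scale=0.3, size=200)   # strongly collinear with x1
x3 = rng.normal(size=200)
y = 1.0 + 2.0 * x1 - 1.0 * x3 + rng.normal(size=200)

X = sm.add_constant(np.column_stack([x1, x2, x3]))
for i, name in enumerate(["x1", "x2", "x3"], start=1):
    print(name, variance_inflation_factor(X, i))   # VIF >> 10 flags collinearity
print(sm.OLS(y, X).fit().params)                   # x1/x2 estimates are unstable
```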
Analysis and selection of magnitude relations for the Working Group on Utah Earthquake Probabilities
Duross, Christopher; Olig, Susan; Schwartz, David
2015-01-01
Prior to calculating time-independent and -dependent earthquake probabilities for faults in the Wasatch Front region, the Working Group on Utah Earthquake Probabilities (WGUEP) updated a seismic-source model for the region (Wong and others, 2014) and evaluated 19 historical regressions on earthquake magnitude (M). These regressions relate M to fault parameters for historical surface-faulting earthquakes, including linear fault length (e.g., surface-rupture length [SRL] or segment length), average displacement, maximum displacement, rupture area, seismic moment (Mo), and slip rate. These regressions show that significant epistemic uncertainties complicate the determination of characteristic magnitude for fault sources in the Basin and Range Province (BRP). For example, we found that M estimates (as a function of SRL) span about 0.3–0.4 units (figure 1) owing to differences in the fault parameter used; the age, quality, and size of historical earthquake databases; and the fault type and region considered.
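Such magnitude relations typically take the form M = a + b·log10(SRL). The sketch below shows how one is fit; the (SRL, M) pairs and resulting coefficients are illustrative placeholders, not the WGUEP regressions.

```python
# Fit a magnitude regression of the form M = a + b*log10(SRL) to toy data.
import numpy as np

srl = np.array([10., 25., 40., 75., 120., 200.])   # surface-rupture length, km
mag = np.array([6.0, 6.5, 6.7, 7.0, 7.2, 7.5])     # moment magnitude

b, a = np.polyfit(np.log10(srl), mag, 1)           # slope b, intercept a
print(f"M = {a:.2f} + {b:.2f} * log10(SRL)")
print("M for SRL = 50 km:", a + b * np.log10(50.0))
```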
NASA Astrophysics Data System (ADS)
Kang, Pilsang; Koo, Changhoi; Roh, Hokyu
2017-11-01
Since simple linear regression theory was established at the beginning of the 1900s, it has been used in a variety of fields. Unfortunately, it cannot be used directly for calibration. In practical calibrations, the observed measurements (the inputs) are subject to errors, and hence they vary, thus violating the assumption that the inputs are fixed. Therefore, in the case of calibration, the regression line fitted using the method of least squares is not consistent with the statistical properties of simple linear regression as already established based on this assumption. To resolve this problem, "classical regression" and "inverse regression" have been proposed. However, they do not completely resolve the problem. As a fundamental solution, we introduce "reversed inverse regression" along with a new methodology for deriving its statistical properties. In this study, the statistical properties of this regression are derived using the "error propagation rule" and the "method of simultaneous error equations" and are compared with those of the existing regression approaches. The accuracy of the statistical properties thus derived is investigated in a simulation study. We conclude that the newly proposed regression and methodology constitute the complete regression approach for univariate linear calibrations.
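For orientation, the sketch below contrasts the two existing approaches the paper argues against, "classical" and "inverse" calibration, on simulated data; it does not implement the proposed reversed inverse regression.

```python
# Classical vs inverse calibration for a univariate linear calibration line.
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(1, 10, 20)                               # reference standards
y = 0.5 + 2.0 * x + rng.normal(scale=0.2, size=x.size)   # noisy instrument readings

# Classical: regress y on x, then invert to estimate x from a new reading y0.
b1, a1 = np.polyfit(x, y, 1)
# Inverse: regress x on y directly.
b2, a2 = np.polyfit(y, x, 1)

y0 = 12.0                                                # new instrument reading
print("classical estimate:", (y0 - a1) / b1)
print("inverse estimate:  ", a2 + b2 * y0)
```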
A comparison of methods for the analysis of binomial clustered outcomes in behavioral research.
Ferrari, Alberto; Comelli, Mario
2016-12-01
In behavioral research, data consisting of a per-subject proportion of "successes" and "failures" over a finite number of trials often arise. These clustered binary data are usually non-normally distributed, which can distort inference if the usual general linear model is applied and sample size is small. A number of more advanced methods are available, but they are often technically challenging, and a comparative assessment of their performance in behavioral setups has not been performed. We studied the performance of some methods applicable to the analysis of proportions; namely linear regression, Poisson regression, beta-binomial regression and Generalized Linear Mixed Models (GLMMs). We report on a simulation study evaluating power and Type I error rate of these models in hypothetical scenarios met by behavioral researchers, and we describe results from the application of these methods to data from real experiments. Our results show that, while GLMMs are powerful instruments for the analysis of clustered binary outcomes, beta-binomial regression can outperform them in a range of scenarios. Linear regression gave results consistent with the nominal level of significance, but was overall less powerful. Poisson regression, instead, mostly led to anticonservative inference. GLMMs and beta-binomial regression are generally more powerful than linear regression; yet linear regression is robust to model misspecification in some conditions, whereas Poisson regression suffers heavily from violations of the assumptions when used to model proportion data. We conclude by providing directions for behavioral scientists dealing with clustered binary data and small sample sizes. Copyright © 2016 Elsevier B.V. All rights reserved.
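A minimal simulation in the spirit of this comparison follows: overdispersed per-subject counts analyzed with a binomial GLM and with ordinary linear regression on the proportions. The scenario values are arbitrary and do not reproduce the paper's simulation design.

```python
# Clustered binomial outcomes: beta-binomial-style data, two analyses.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n_subj, n_trials = 40, 20
group = np.repeat([0, 1], n_subj // 2)
p = 0.4 + 0.1 * group
p_subj = rng.beta(p * 10, (1 - p) * 10)        # subject-level clustering
successes = rng.binomial(n_trials, p_subj)

X = sm.add_constant(group.astype(float))
glm = sm.GLM(np.column_stack([successes, n_trials - successes]),
             X, family=sm.families.Binomial()).fit()
ols = sm.OLS(successes / n_trials, X).fit()    # linear regression on proportions
print("GLM group p-value:", glm.pvalues[1])
print("OLS group p-value:", ols.pvalues[1])
```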
Jacob, Benjamin J; Krapp, Fiorella; Ponce, Mario; Gottuzzo, Eduardo; Griffith, Daniel A; Novak, Robert J
2010-05-01
Spatial autocorrelation is problematic for classical hierarchical cluster detection tests commonly used in multi-drug resistant tuberculosis (MDR-TB) analyses, as considerable random error can occur. Therefore, when MDR-TB clusters are spatially autocorrelated, the assumption that the clusters are independently random is invalid. In this research, a product moment correlation coefficient (i.e., the Moran's coefficient) was used to quantify local spatial variation in multiple clinical and environmental predictor variables sampled in San Juan de Lurigancho, Lima, Peru. Initially, QuickBird 0.61 m data, encompassing the visible bands and the near infra-red bands, were selected to synthesize images of land cover attributes of the study site. Data on residential addresses of individual patients with smear-positive MDR-TB were geocoded, prevalence rates calculated and then digitally overlaid onto the satellite data within a 2 km buffer of 31 georeferenced health centers, using a 10 m² grid-based algorithm. Geographical information system (GIS)-gridded measurements of each health center were generated based on preliminary base maps of the georeferenced data aggregated to block groups and census tracts within each buffered area. A three-dimensional model of the study site was constructed based on a digital elevation model (DEM) to determine terrain covariates associated with the sampled MDR-TB covariates. Pearson's correlation was used to evaluate the linear relationship between the DEM and the sampled MDR-TB data. A SAS/GIS® module was then used to calculate univariate statistics and to perform linear and non-linear regression analyses using the sampled predictor variables. The estimates generated from a global autocorrelation analysis were then spatially decomposed into empirical orthogonal bases using a negative binomial regression with a non-homogeneous mean. Results of the DEM analyses indicated a statistically non-significant linear relationship between georeferenced health centers and the sampled covariate elevation. The data exhibited positive spatial autocorrelation, and the decomposition of the Moran's coefficient into uncorrelated, orthogonal map pattern components revealed global spatial heterogeneities necessary to capture latent autocorrelation in the MDR-TB model. It was thus shown that Poisson regression analyses and spatial eigenvector mapping can elucidate the mechanics of MDR-TB transmission by prioritizing clinical and environmental sampled predictor variables for identifying high-risk populations.
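The Moran's coefficient at the heart of this analysis is compact enough to compute directly. Below is a sketch with a toy binary-adjacency weight matrix; the values are invented.

```python
# Moran's I: spatial autocorrelation of values x under weight matrix w.
import numpy as np

def morans_i(x, w):
    """I = (n / sum(w)) * (z' W z) / (z' z) with z the centered values."""
    z = x - x.mean()
    return (len(x) / w.sum()) * (z @ w @ z) / (z @ z)

x = np.array([3.0, 2.5, 4.0, 1.0, 1.2, 0.8])        # e.g. prevalence per cell
w = np.array([[0, 1, 1, 0, 0, 0],
              [1, 0, 1, 0, 0, 0],
              [1, 1, 0, 0, 0, 0],
              [0, 0, 0, 0, 1, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 1, 1, 0]], dtype=float)     # toy adjacency
print("Moran's I:", morans_i(x, w))  # values above -1/(n-1) suggest positive autocorrelation
```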
Fisher, Charles K; Mehta, Pankaj
2015-06-01
Feature selection, identifying a subset of variables that are relevant for predicting a response, is an important and challenging component of many methods in statistics and machine learning. Feature selection is especially difficult and computationally intensive when the number of variables approaches or exceeds the number of samples, as is often the case for many genomic datasets. Here, we introduce a new approach--the Bayesian Ising Approximation (BIA)--to rapidly calculate posterior probabilities for feature relevance in L2 penalized linear regression. In the regime where the regression problem is strongly regularized by the prior, we show that computing the marginal posterior probabilities for features is equivalent to computing the magnetizations of an Ising model with weak couplings. Using a mean field approximation, we show it is possible to rapidly compute the feature selection path described by the posterior probabilities as a function of the L2 penalty. We present simulations and analytical results illustrating the accuracy of the BIA on some simple regression problems. Finally, we demonstrate the applicability of the BIA to high-dimensional regression by analyzing a gene expression dataset with nearly 30 000 features. These results also highlight the impact of correlations between features on Bayesian feature selection. An implementation of the BIA in C++, along with data for reproducing our gene expression analyses, is freely available at http://physics.bu.edu/∼pankajm/BIACode. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Albumin, a marker for post-operative myocardial damage in cardiac surgery.
van Beek, Dianne E C; van der Horst, Iwan C C; de Geus, A Fred; Mariani, Massimo A; Scheeren, Thomas W L
2018-06-06
Low serum albumin (SA) is a prognostic factor for poor outcome after cardiac surgery. The aim of this study was to estimate the association between pre-operative SA, early post-operative SA and post-operative myocardial injury. This single-center cohort study included adult patients undergoing cardiac surgery during 4 consecutive years. Post-operative myocardial damage was quantified by calculating the area under the curve (AUC) of troponin (Tn) values during the first 72 h after surgery; its association with SA was analyzed using univariable linear regression and multivariable linear regression to account for patient-related and procedural confounders. The association between SA and the secondary outcomes (peri-operative myocardial infarction [PMI], requiring ventilation >24 h, rhythm disturbances, 30-day mortality) was studied using (multivariable) log binomial regression analysis. In total 2757 patients were included. The mean pre-operative SA was 29 ± 13 g/l and the mean post-operative SA was 26 ± 6 g/l. Post-operative SA levels (measured on average 26 min after surgery) were inversely associated with post-operative myocardial damage in both the univariable analysis (regression coefficient -0.019, 95% CI -0.022 to -0.015, p < 0.005) and after adjustment for patient-related and surgical confounders (regression coefficient -0.014 [95% CI -0.020 to -0.008], p < 0.0005). Post-operative albumin levels were significantly correlated with the amount of post-operative myocardial damage in patients undergoing cardiac surgery, independent of typical confounders. Copyright © 2018. Published by Elsevier Inc.
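The outcome construction here, a trapezoidal AUC of serial troponin values followed by a regression on albumin, can be sketched in a few lines. Times, troponin values and albumin levels below are invented for illustration only.

```python
# Trapezoidal troponin AUC over 72 h, then a univariable regression on albumin.
import numpy as np
import statsmodels.api as sm

t = np.array([0., 6., 12., 24., 48., 72.])            # hours after surgery
tn = np.array([[0.2, 1.8, 2.6, 2.1, 1.0, 0.5],        # troponin per patient
               [0.1, 0.9, 1.4, 1.1, 0.6, 0.3],
               [0.3, 2.5, 3.4, 2.8, 1.5, 0.7]])
albumin = np.array([24.0, 31.0, 21.0])                # post-operative SA, g/l

# Trapezoidal rule: interval width times the mean of the endpoint values.
auc = (np.diff(t) * (tn[:, 1:] + tn[:, :-1]) / 2).sum(axis=1)
fit = sm.OLS(auc, sm.add_constant(albumin)).fit()
print("AUC per patient:", auc)
print("albumin coefficient:", fit.params[1])          # expected negative
```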
Zhang, J; Feng, J-Y; Ni, Y-L; Wen, Y-J; Niu, Y; Tamba, C L; Yue, C; Song, Q; Zhang, Y-M
2017-06-01
Multilocus genome-wide association studies (GWAS) have become the state-of-the-art procedure to identify quantitative trait nucleotides (QTNs) associated with complex traits. However, implementation of multilocus models in GWAS is still difficult. In this study, we integrated least angle regression with empirical Bayes to perform multilocus GWAS under polygenic background control. We used an algorithm of model transformation that whitened the covariance matrix of the polygenic matrix K and environmental noise. Markers on one chromosome were included simultaneously in a multilocus model and least angle regression was used to select the most potentially associated single-nucleotide polymorphisms (SNPs), whereas the markers on the other chromosomes were used to calculate the kinship matrix as polygenic background control. The selected SNPs in the multilocus model were further tested for their association with the trait by empirical Bayes and a likelihood ratio test. We herein refer to this method as pLARmEB (polygenic-background-control-based least angle regression plus empirical Bayes). Results from simulation studies showed that pLARmEB was more powerful in QTN detection and more accurate in QTN effect estimation, had a lower false positive rate and required less computing time than the Bayesian hierarchical generalized linear model, efficient mixed model association (EMMA) and least angle regression plus empirical Bayes. pLARmEB, multilocus random-SNP-effect mixed linear model and fast multilocus random-SNP-effect EMMA methods had almost equal power of QTN detection in simulation experiments. However, only pLARmEB identified 48 previously reported genes for 7 flowering time-related traits in Arabidopsis thaliana.
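The least angle regression selection step at the core of this pipeline can be sketched with scikit-learn; the genotypes are simulated, and the empirical Bayes refinement and polygenic background control are not shown.

```python
# LARS selection of candidate SNPs on one chromosome (toy data).
import numpy as np
from sklearn.linear_model import Lars

rng = np.random.default_rng(4)
n, m = 200, 500                                     # individuals, SNPs
X = rng.binomial(2, 0.3, size=(n, m)).astype(float) # 0/1/2 genotype codes
beta = np.zeros(m)
beta[[10, 42]] = [0.7, -0.5]                        # two true QTNs
y = X @ beta + rng.normal(size=n)

lars = Lars(n_nonzero_coefs=10).fit(X, y)
print("selected SNP indices:", np.flatnonzero(lars.coef_))  # should include 10, 42
```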
Bias due to two-stage residual-outcome regression analysis in genetic association studies.
Demissie, Serkalem; Cupples, L Adrienne
2011-11-01
Association studies of risk factors and complex diseases require careful assessment of potential confounding factors. Two-stage regression analysis, sometimes referred to as residual- or adjusted-outcome analysis, has been increasingly used in association studies of single nucleotide polymorphisms (SNPs) and quantitative traits. In this analysis, first, a residual-outcome is calculated from a regression of the outcome variable on covariates and then the relationship between the adjusted-outcome and the SNP is evaluated by a simple linear regression of the adjusted-outcome on the SNP. In this article, we examine the performance of this two-stage analysis as compared with multiple linear regression (MLR) analysis. Our findings show that when a SNP and a covariate are correlated, the two-stage approach results in a biased genotypic effect and loss of power. Bias is always toward the null and increases with the squared correlation (r²) between the SNP and the covariate. For example, for r² = 0, 0.1, and 0.5, two-stage analysis results in, respectively, 0, 10, and 50% attenuation in the SNP effect. As expected, MLR was always unbiased. Since individual SNPs often show little or no correlation with covariates, a two-stage analysis is expected to perform as well as MLR in many genetic studies; however, it produces considerably different results from MLR and may lead to incorrect conclusions when independent variables are highly correlated. While a useful alternative to MLR when the SNP and covariates are essentially uncorrelated, the two-stage approach has serious limitations. Its use as a simple substitute for MLR should be avoided. © 2011 Wiley Periodicals, Inc.
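The attenuation factor is easy to reproduce by simulation: the two-stage estimate recovers roughly (1 - r²) of the true SNP effect. The sketch below uses invented effect sizes.

```python
# Two-stage residual-outcome analysis vs the truth, for several r² values.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n, beta_snp, beta_cov = 100_000, 0.5, 1.0
for r in (0.0, np.sqrt(0.1), np.sqrt(0.5)):
    snp = rng.normal(size=n)
    cov = r * snp + np.sqrt(1 - r**2) * rng.normal(size=n)
    y = beta_snp * snp + beta_cov * cov + rng.normal(size=n)
    resid = sm.OLS(y, sm.add_constant(cov)).fit().resid        # stage 1
    b2 = sm.OLS(resid, sm.add_constant(snp)).fit().params[1]   # stage 2
    print(f"r²={r**2:.1f}: two-stage estimate {b2:.3f} (truth {beta_snp})")
```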
Mathur, Praveen; Sharma, Sarita; Soni, Bhupendra
2010-01-01
In the present work, an attempt is made to formulate multiple regression equations using the all-possible-regressions method for groundwater quality assessment of the Ajmer-Pushkar railway line region in pre- and post-monsoon seasons. Correlation studies revealed the existence of linear relationships (r > 0.7) for electrical conductivity (EC), total hardness (TH) and total dissolved solids (TDS) with other water quality parameters. The highest correlation was found between EC and TDS (r = 0.973). EC showed highly significant positive correlation with Na, K, Cl, TDS and total solids (TS). TH showed the highest correlation with Ca and Mg. TDS showed significant correlation with Na, K, SO4, PO4 and Cl. The study indicated that most of the contamination present was water soluble or ionic in nature. Mg was present as MgCl2; K mainly as KCl and K2SO4; and Na was present as the salts of Cl, SO4 and PO4. On the other hand, F and NO3 showed no significant correlations. The r² values and F values (at the 95% confidence limit, alpha = 0.05) for the modelled equations indicated a high degree of linearity among independent and dependent variables. The error percentage between calculated and experimental values was also contained within a ±15% limit.
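The all-possible-regressions strategy named above simply fits every subset of candidate predictors and ranks the models. A hypothetical sketch, with a made-up data frame whose column names follow the abstract:

```python
# All possible regressions: fit every predictor subset, rank by adjusted R².
import itertools
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(6)
df = pd.DataFrame(rng.normal(size=(60, 4)), columns=["Na", "K", "Cl", "Ca"])
df["EC"] = 2 * df["Na"] + 1.5 * df["Cl"] + rng.normal(scale=0.5, size=60)

results = []
predictors = ["Na", "K", "Cl", "Ca"]
for k in range(1, len(predictors) + 1):
    for subset in itertools.combinations(predictors, k):
        fit = sm.OLS(df["EC"], sm.add_constant(df[list(subset)])).fit()
        results.append((fit.rsquared_adj, subset))

for adj_r2, subset in sorted(results, reverse=True)[:3]:
    print(f"{subset}: adjusted R² = {adj_r2:.3f}")
```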
Vajargah, Kianoush Fathi; Sadeghi-Bazargani, Homayoun; Mehdizadeh-Esfanjani, Robab; Savadi-Oskouei, Daryoush; Farhoudi, Mehdi
2012-01-01
The objective of the present study was to assess the comparable applicability of the orthogonal projections to latent structures (OPLS) statistical model vs traditional linear regression in order to investigate the role of transcranial Doppler (TCD) sonography in predicting ischemic stroke prognosis. The study was conducted on 116 ischemic stroke patients admitted to a specialty neurology ward. The Unified Neurological Stroke Scale was used once for clinical evaluation in the first week of admission and again six months later. All data were first analyzed using simple linear regression and later considered for multivariate analysis using PLS/OPLS models through the SIMCA P+12 statistical software package. The linear regression analysis results used for the identification of TCD predictors of stroke prognosis were confirmed through the OPLS modeling technique. Moreover, in comparison to linear regression, the OPLS model appeared to have higher sensitivity in detecting the predictors of ischemic stroke prognosis and detected several more predictors. Applying the OPLS model made it possible to use both single TCD measures/indicators and arbitrarily dichotomized measures of TCD single vessel involvement, as well as the overall TCD result. In conclusion, the authors recommend PLS/OPLS methods as complementary rather than alternative to the available classical regression models such as linear regression.
Quality of life in breast cancer patients--a quantile regression analysis.
Pourhoseingholi, Mohamad Amin; Safaee, Azadeh; Moghimi-Dehkordi, Bijan; Zeighami, Bahram; Faghihzadeh, Soghrat; Tabatabaee, Hamid Reza; Pourhoseingholi, Asma
2008-01-01
Quality of life study has an important role in health care, especially in chronic diseases, in clinical judgment and in medical resource supply. Statistical tools like linear regression are widely used to assess the predictors of quality of life, but when the response is not normal the results are misleading. The aim of this study was to determine the predictors of quality of life in breast cancer patients using a quantile regression model and to compare it to linear regression. A cross-sectional study was conducted on 119 breast cancer patients admitted and treated in the chemotherapy ward of Namazi hospital in Shiraz. We used the QLQ-C30 questionnaire to assess quality of life in these patients. A quantile regression was employed to assess the associated factors and the results were compared to linear regression. All analyses were carried out using SAS. The mean score for the global health status for breast cancer patients was 64.92 ± 11.42. Linear regression showed that only grade of tumor, occupational status, menopausal status, financial difficulties and dyspnea were statistically significant. In contrast to linear regression, financial difficulties were not significant in the quantile regression analysis, and dyspnea was only significant for the first quartile. Emotional functioning and duration of disease also statistically predicted the QOL score in the third quartile. The results demonstrated that using quantile regression leads to better interpretation and richer inference about predictors of quality of life in breast cancer patients.
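Quantile regression of the kind contrasted with OLS above is available in statsmodels. The sketch below uses a simulated stand-in for the QOL outcome and a single predictor.

```python
# OLS vs quantile regression at the quartiles, on skewed simulated data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
df = pd.DataFrame({"dyspnea": rng.integers(0, 4, 200)})
df["qol"] = 70 - 3 * df["dyspnea"] + rng.standard_t(3, 200) * 8  # heavy-tailed noise

ols = smf.ols("qol ~ dyspnea", df).fit()
print("OLS slope:", ols.params["dyspnea"])
for q in (0.25, 0.5, 0.75):            # first quartile, median, third quartile
    res = smf.quantreg("qol ~ dyspnea", df).fit(q=q)
    print(f"q={q}: slope {res.params['dyspnea']:.2f}")
```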
Interpretation of commonly used statistical regression models.
Kasza, Jessica; Wolfe, Rory
2014-01-01
A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.
Sunkara, Vasu; Hébert, James R.
2015-01-01
BACKGROUND: Disparities in cancer screening, incidence, treatment, and survival are worsening globally. The mortality-to-incidence ratio (MIR) has been used previously to evaluate such disparities. METHODS: The MIR for colorectal cancer is calculated for all Organisation for Economic Cooperation and Development (OECD) countries using the 2012 GLOBOCAN incidence and mortality statistics. Health system rankings were obtained from the World Health Organization. Two linear regression models were fit with the MIR as the dependent variable and health system ranking as the independent variable; one included all countries and one model had the “divergents” removed. RESULTS: The regression model for all countries explained 24% of the total variance in the MIR. Nine countries were found to have regression-calculated MIRs that differed from the actual MIR by >20%. Countries with lower-than-expected MIRs were found to have strong national health systems characterized by formal colorectal cancer screening programs. Conversely, countries with higher-than-expected MIRs lack screening programs. When these divergent points were removed from the data set, the recalculated regression model explained 60% of the total variance in the MIR. CONCLUSIONS: The MIR proved useful for identifying disparities in cancer screening and treatment internationally. It has potential as an indicator of the long-term success of cancer surveillance programs and may be extended to other cancer types for these purposes. PMID:25572676
Lin, Zhaozhou; Zhang, Qiao; Liu, Ruixin; Gao, Xiaojie; Zhang, Lu; Kang, Bingya; Shi, Junhan; Wu, Zidan; Gui, Xinjing; Li, Xuelin
2016-01-25
To accurately, safely, and efficiently evaluate the bitterness of Traditional Chinese Medicines (TCMs), a robust predictor was developed using the robust partial least squares (RPLS) regression method based on data obtained from an electronic tongue (e-tongue) system. The data quality was verified by the Grubbs' test. Moreover, potential outliers were detected based on both the standardized residual and the score distance calculated for each sample. The performance of RPLS on the dataset before and after outlier detection was compared to other state-of-the-art methods including multivariate linear regression, least squares support vector machine, and plain partial least squares regression. Both R² and the root-mean-square error of cross-validation (RMSECV) were recorded for each model. With four latent variables, a robust RMSECV value of 0.3916, with bitterness values ranging from 0.63 to 4.78, was obtained for the RPLS model that was constructed based on the dataset including outliers. Meanwhile, the RMSECV calculated using the models constructed by the other methods was larger than that of the RPLS model. After six outliers were excluded, the performance of all benchmark methods markedly improved, but the difference between the RPLS models constructed before and after outlier exclusion was negligible. In conclusion, the bitterness of TCM decoctions can be accurately evaluated with the RPLS model constructed using e-tongue data.
Use of probabilistic weights to enhance linear regression myoelectric control
NASA Astrophysics Data System (ADS)
Smith, Lauren H.; Kuiken, Todd A.; Hargrove, Levi J.
2015-12-01
Objective. Clinically available prostheses for transradial amputees do not allow simultaneous myoelectric control of degrees of freedom (DOFs). Linear regression methods can provide simultaneous myoelectric control, but frequently also result in difficulty with isolating individual DOFs when desired. This study evaluated the potential of using probabilistic estimates of categories of gross prosthesis movement, which are commonly used in classification-based myoelectric control, to enhance linear regression myoelectric control. Approach. Gaussian models were fit to electromyogram (EMG) feature distributions for three movement classes at each DOF (no movement, or movement in either direction) and used to weight the output of linear regression models by the probability that the user intended the movement. Eight able-bodied and two transradial amputee subjects worked in a virtual Fitts’ law task to evaluate differences in controllability between linear regression and probability-weighted regression for an intramuscular EMG-based three-DOF wrist and hand system. Main results. Real-time and offline analyses in able-bodied subjects demonstrated that probability weighting improved performance during single-DOF tasks (p < 0.05) by preventing extraneous movement at additional DOFs. Similar results were seen in experiments with two transradial amputees. Though goodness-of-fit evaluations suggested that the EMG feature distributions showed some deviations from the Gaussian, equal-covariance assumptions used in this experiment, the assumptions were sufficiently met to provide improved performance compared to linear regression control. Significance. Use of probability weights can improve the ability to isolate individual DOFs during linear regression myoelectric control, while maintaining the ability to simultaneously control multiple DOFs.
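A toy version of this weighting scheme follows: Gaussian class models over EMG features give a probability that movement is intended at a DOF, which then scales the linear-regression output. Feature values, means and weights are invented for illustration.

```python
# Probability-weighted regression output for one DOF with three classes.
import numpy as np
from scipy.stats import multivariate_normal

means = {"none": np.array([0.1, 0.1]),    # Gaussian class means (toy values)
         "flex": np.array([1.0, 0.2]),
         "ext":  np.array([0.2, 1.0])}
cov = 0.1 * np.eye(2)                     # shared (equal) covariance

def weighted_output(feat, w_regress):
    """Scale the regression output by P(movement intended at this DOF)."""
    like = {k: multivariate_normal.pdf(feat, m, cov) for k, m in means.items()}
    p_move = (like["flex"] + like["ext"]) / sum(like.values())
    return p_move * (w_regress @ feat)

feat = np.array([0.9, 0.15])              # EMG features suggesting flexion
print(weighted_output(feat, np.array([0.8, -0.6])))
```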
NASA Astrophysics Data System (ADS)
Cambra-López, María; Winkel, Albert; Mosquera, Julio; Ogink, Nico W. M.; Aarnink, André J. A.
2015-06-01
The objective of this study was to compare co-located real-time light scattering devices and equivalent gravimetric samplers in poultry and pig houses for PM10 mass concentration, and to develop animal-specific calibration factors for light scattering samplers. These results will contribute to evaluating the comparability of different sampling instruments for PM10 concentrations. Paired DustTrak light scattering devices (DustTrak aerosol monitor, TSI, U.S.) and PM10 gravimetric cyclone samplers were used for measuring PM10 mass concentrations during 24 h periods (from noon to noon) inside animal houses. Sampling was conducted in 32 animal houses in the Netherlands, including broilers, broiler breeders, layers in floor and in aviary systems, turkeys, piglets, growing-finishing pigs in traditional and low-emission housing with dry and liquid feed, and sows in individual and group housing. A total of 119 pairs of 24 h measurements (55 for poultry and 64 for pigs) were recorded and analyzed using linear regression analysis. Deviations between samplers were calculated and discussed. In poultry, cyclone sampler and DustTrak data fitted well to a linear regression, with a regression coefficient equal to 0.41, an intercept of 0.16 mg m-3 and a correlation coefficient of 0.91 (excluding turkeys). Results in turkeys showed a regression coefficient equal to 1.1 (P = 0.49), an intercept of 0.06 mg m-3 (P < 0.0001) and a correlation coefficient of 0.98. In pigs, we found a regression coefficient equal to 0.61, an intercept of 0.05 mg m-3 and a correlation coefficient of 0.84. PM10 concentrations measured using DustTraks were clearly underestimated (by approximately a factor of 2) in both poultry and pig housing systems compared with cyclone pre-separators. Absolute, relative, and random deviations increased with concentration. DustTrak light scattering devices should be self-calibrated to measure PM10 mass concentrations accurately in animal houses. We recommend linear regression equations as animal-specific calibration factors for DustTraks instead of manufacturer calibration factors, especially in heavily dusty environments such as animal houses.
Simplified large African carnivore density estimators from track indices.
Winterbach, Christiaan W; Ferreira, Sam M; Funston, Paul J; Somers, Michael J
2016-01-01
The range, population size and trend of large carnivores are important parameters to assess their status globally and to plan conservation strategies. One can use linear models to assess population size and trends of large carnivores from track-based surveys on suitable substrates. The conventional approach of a linear model with intercept may not intercept at zero, but may fit the data better than a linear model through the origin. We assess whether a linear regression through the origin is more appropriate than a linear regression with intercept to model large African carnivore densities and track indices. We did simple linear regression with intercept analysis and simple linear regression through the origin, and used the confidence interval for β in the linear model y = αx + β, the Standard Error of Estimate, Mean Squares Residual and Akaike Information Criteria to evaluate the models. The Lion on Clay and Low Density on Sand models with intercept were not significant (P > 0.05). The other four models with intercept and the six models through the origin were all significant (P < 0.05). The models using linear regression with intercept all included zero in the confidence interval for β, and the null hypothesis that β = 0 could not be rejected. All models showed that the linear model through the origin provided a better fit than the linear model with intercept, as indicated by the Standard Error of Estimate and Mean Square Residuals. Akaike Information Criteria showed that linear models through the origin were better and that none of the linear models with intercept had substantial support. Our results showed that linear regression through the origin is justified over the more typical linear regression with intercept for all models we tested. A general model can be used to estimate large carnivore densities from track densities across species and study areas. The formula observed track density = 3.26 × carnivore density can be used to estimate densities of large African carnivores using track counts on sandy substrates in areas where carnivore densities are 0.27 carnivores/100 km² or higher. To improve the current models, we need independent data to validate the models and data to test for a non-linear relationship between track indices and true density at low densities.
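The with-intercept versus through-the-origin comparison can be sketched directly in statsmodels, where omitting the constant column forces the fit through zero. The track and density numbers below are made up; the paper's slope of 3.26 is not reproduced.

```python
# Linear model with intercept vs regression through the origin.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
density = rng.uniform(0.3, 3.0, 30)                   # carnivores / 100 km²
tracks = 3.0 * density + rng.normal(scale=0.4, size=30)

with_int = sm.OLS(tracks, sm.add_constant(density)).fit()
through0 = sm.OLS(tracks, density).fit()              # no constant column

print("intercept model AIC:", with_int.aic,
      "| CI for intercept:", with_int.conf_int()[0])  # does it include zero?
print("through-origin AIC:", through0.aic, "slope:", through0.params[0])
```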
DOE Office of Scientific and Technical Information (OSTI.GOV)
Thompson, Aidan P.; Swiler, Laura P.; Trott, Christian R.
2015-03-15
Here, we present a new interatomic potential for solids and liquids called Spectral Neighbor Analysis Potential (SNAP). The SNAP potential has a very general form and uses machine-learning techniques to reproduce the energies, forces, and stress tensors of a large set of small configurations of atoms, which are obtained using high-accuracy quantum electronic structure (QM) calculations. The local environment of each atom is characterized by a set of bispectrum components of the local neighbor density projected onto a basis of hyperspherical harmonics in four dimensions. The bispectrum components are the same bond-orientational order parameters employed by the GAP potential [1]. The SNAP potential, unlike GAP, assumes a linear relationship between atom energy and bispectrum components. The linear SNAP coefficients are determined using weighted least-squares linear regression against the full QM training set. This allows the SNAP potential to be fit in a robust, automated manner to large QM data sets using many bispectrum components. The calculation of the bispectrum components and the SNAP potential are implemented in the LAMMPS parallel molecular dynamics code. We demonstrate that a previously unnoticed symmetry property can be exploited to reduce the computational cost of the force calculations by more than one order of magnitude. We present results for a SNAP potential for tantalum, showing that it accurately reproduces a range of commonly calculated properties of both the crystalline solid and the liquid phases. In addition, unlike simpler existing potentials, SNAP correctly predicts the energy barrier for screw dislocation migration in BCC tantalum.
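At its core, the SNAP fit is a weighted least-squares problem: energies (and forces) are linear in the bispectrum components. A minimal numerical sketch follows, with random stand-ins for the bispectrum matrix and per-configuration weights.

```python
# Weighted least squares for linear coefficients: solve (B' W B) c = B' W e.
import numpy as np

rng = np.random.default_rng(9)
n_cfg, n_bispec = 400, 30
B = rng.normal(size=(n_cfg, n_bispec))        # bispectrum components per config
coef_true = rng.normal(size=n_bispec)
energy = B @ coef_true + rng.normal(scale=0.01, size=n_cfg)  # "QM" energies
w = rng.uniform(0.5, 2.0, n_cfg)              # per-configuration weights

sw = np.sqrt(w)[:, None]                      # sqrt-weighting trick
coef, *_ = np.linalg.lstsq(B * sw, energy * sw.ravel(), rcond=None)
print("max coefficient error:", np.abs(coef - coef_true).max())
```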
[From clinical judgment to linear regression model].
Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O
2013-01-01
When we think about mathematical models, such as the linear regression model, we think that these terms are only used by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and has a normal distribution. Stated another way, regression is used to predict a measure based on knowledge of at least one other variable. Linear regression has as its first objective to determine the slope or inclination of the regression line: Y = a + bx, where "a" is the intercept or regression constant and is equivalent to the value of "Y" when "X" equals 0, and "b" (also called the slope) indicates the increase or decrease that occurs when the variable "x" increases or decreases by one unit. In the regression line, "b" is called the regression coefficient. The coefficient of determination (R²) indicates the importance of the independent variables in the outcome.
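The regression line and R² described above can be computed in a few lines; the data points below are invented.

```python
# Fit Y = a + b*x and compute the coefficient of determination R².
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

b, a = np.polyfit(x, y, 1)            # slope b, intercept a
y_hat = a + b * x
r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
print(f"Y = {a:.2f} + {b:.2f}x, R² = {r2:.3f}")
```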
Marrero-Ponce, Yovani; Martínez-Albelo, Eugenio R; Casañola-Martín, Gerardo M; Castillo-Garit, Juan A; Echevería-Díaz, Yunaimy; Zaldivar, Vicente Romero; Tygat, Jan; Borges, José E Rodriguez; García-Domenech, Ramón; Torrens, Francisco; Pérez-Giménez, Facundo
2010-11-01
Novel bond-level molecular descriptors are proposed, based on linear maps similar to the ones defined in algebra theory. The kth edge-adjacency matrix (E(k)) denotes the matrix of bond linear indices (non-stochastic) with regard to the canonical basis set. The kth stochastic edge-adjacency matrix, ES(k), is here proposed as a new molecular representation easily calculated from E(k). Then, the kth stochastic bond linear indices are calculated using ES(k) as operators of linear transformations. In both cases, the bond-type formalism is developed. The kth non-stochastic and stochastic total linear indices are calculated by adding the kth non-stochastic and stochastic bond linear indices, respectively, of all bonds in the molecule. First, the new bond-based molecular descriptors (MDs) are tested for suitability for QSPR studies by analyzing regressions of the novel indices for selected physicochemical properties of octane isomers (first round). The general performance of the new descriptors in these QSPR studies is evaluated with regard to well-known sets of 2D/3D MDs. From the analysis, we can conclude that the non-stochastic and stochastic bond-based linear indices have an overall good modeling capability, proving their usefulness in QSPR studies. Later, the novel bond-level MDs are also used for the description and prediction of the boiling point of 28 alkyl-alcohols (second round), and for the modeling of the specific rate constant (log k), the partition coefficient (log P), as well as the antibacterial activity of 34 derivatives of 2-furylethylenes (third round). The comparison with other approaches (edge- and vertex-based connectivity indices, total and local spectral moments, and quantum chemical descriptors, as well as E-state/biomolecular encounter parameters) shows the good behavior of our method in these QSPR studies. Finally, the approach described in this study appears to be a very promising structural invariant, useful not only for QSPR studies but also for similarity/diversity analysis and drug discovery protocols.
Hemmila, April; McGill, Jim; Ritter, David
2008-03-01
To determine whether changes in fingerprint infrared spectra that are linear with age can be found, a partial least squares (PLS1) regression of 155 fingerprint infrared spectra against each person's age was constructed. The regression produced a linear model of age as a function of spectrum with a root mean square error of calibration of less than 4 years, showing an inflection at about 25 years of age. The spectral ranges emphasized by the regression do not correspond to the highest-concentration constituents of the fingerprints. Separate linear regression models for old and young people can be constructed with even more statistical rigor. The success of the regression demonstrates that a combination of constituents can be found that changes linearly with age, with a significant shift around puberty.
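A PLS regression of spectra against age, in the spirit of the study above, can be sketched with scikit-learn. The spectra here are synthetic, and the wavenumber count and component number are arbitrary.

```python
# PLS1 regression of (synthetic) spectra against age.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(10)
n, n_wavenumbers = 155, 600
age = rng.uniform(5, 70, n)
spectra = rng.normal(size=(n, n_wavenumbers))
spectra[:, 100] += 0.02 * age                 # one band drifts up with age
spectra[:, 250] -= 0.01 * age                 # another drifts down

pls = PLSRegression(n_components=5).fit(spectra, age)
pred = pls.predict(spectra).ravel()
rmsec = np.sqrt(np.mean((pred - age) ** 2))   # RMSE of calibration
print("RMSEC (years):", rmsec)
```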
Gimelfarb, A.; Willis, J. H.
1994-01-01
An experiment was conducted to investigate the offspring-parent regression for three quantitative traits (weight, abdominal bristles and wing length) in Drosophila melanogaster. Linear and polynomial models were fitted for the regressions of a character in offspring on both parents. It is demonstrated that responses by the characters to selection predicted by the nonlinear regressions may differ substantially from those predicted by the linear regressions. This is true even, and especially, if selection is weak. The realized heritability for a character under selection is shown to be determined not only by the offspring-parent regression but also by the distribution of the character and by the form and strength of selection. PMID:7828818
Vyskocil, Erich; Gruther, Wolfgang; Steiner, Irene; Schuhfried, Othmar
2014-07-01
Disease-specific categories of the International Classification of Functioning, Disability and Health have not yet been described for patients with chronic peripheral arterial obstructive disease (PAD). The authors examined the relationship of the categories of the Brief Core Sets for ischemic heart diseases with the Peripheral Artery Questionnaire and the ankle-brachial index to determine which International Classification of Functioning, Disability and Health categories are most relevant for patients with PAD. This is a retrospective cohort study including 77 patients with verified PAD. Statistical analyses of the relationship between International Classification of Functioning, Disability and Health categories as independent variables and the endpoints Peripheral Artery Questionnaire or ankle-brachial index were carried out by simple and stepwise linear regression models adjusting for age, sex, and leg (left vs. right). The stepwise linear regression model with the ankle-brachial index as the dependent variable revealed a significant effect of the variables blood vessel functions and muscle endurance functions. Calculating a stepwise linear regression model with the Peripheral Artery Questionnaire as the dependent variable, a significant effect of age, emotional functions, energy and drive functions, carrying out daily routine, as well as walking could be observed. This study identifies International Classification of Functioning, Disability and Health categories in the Brief Core Sets for ischemic heart diseases that show a significant effect on the ankle-brachial index and the Peripheral Artery Questionnaire score in patients with PAD. These categories provide fundamental information on the functioning of patients with PAD and patient-centered outcomes for rehabilitation interventions.
Rha, Koon H; Abdel Raheem, Ali; Park, Sung Y; Kim, Kwang H; Kim, Hyung J; Koo, Kyo C; Choi, Young D; Jung, Byung H; Lee, Sang K; Lee, Won K; Krishnan, Jayram; Shin, Tae Y; Cho, Jin-Seon
2017-11-01
To assess the correlation of the resected and ischaemic volume (RAIV), a preoperatively calculated volume of nephron loss, with the amount of postoperative renal function (PRF) decline after minimally invasive partial nephrectomy (PN) in a multi-institutional dataset. We identified 348 patients from March 2005 to December 2013 at six institutions. Data on all cases of laparoscopic (n = 85) and robot-assisted PN (n = 263) performed were retrospectively gathered. Univariable and multivariable linear regression analyses were used to identify the associations between various time points of PRF and the RAIV, as a continuous variable. The mean (sd) RAIV was 24.2 (29.2) cm³. The mean preoperative estimated glomerular filtration rate (eGFR) and the eGFRs at postoperative day 1, and at 6 and 36 months after PN, were 91.0, 76.8, 80.2 and 87.7 mL/min/1.73 m², respectively. In multivariable linear regression analysis, the amount of decline in PRF at follow-up was significantly correlated with the RAIV (β = 0.261, 0.165 and 0.260 at postoperative day 1, and 6 and 36 months after PN, respectively). This study has the limitation of its retrospective nature. The preoperatively calculated RAIV significantly correlates with the amount of decline in PRF during long-term follow-up. The RAIV could take our research to the level of predicting the amount of PRF decline after PN and thus would be appropriate for assessing the technical advantages of emerging techniques. © 2017 The Authors BJU International © 2017 BJU International Published by John Wiley & Sons Ltd.
Schüle, Steffen Andreas; Gabriel, Katharina M A; Bolte, Gabriele
2017-06-01
The environmental justice framework states that besides environmental burdens, resources too may be socially unequally distributed, both on the individual and on the neighbourhood level. This ecological study investigated whether neighbourhood socioeconomic position (SEP) was associated with neighbourhood public green space availability in a large German city with more than 1 million inhabitants. Two different measures were defined for green space availability. Firstly, the percentage of green space within neighbourhoods was calculated with the additional consideration of various buffers around the boundaries. Secondly, the percentage of green space was calculated based on various radii around the neighbourhood centroid. An index of neighbourhood SEP was calculated with principal component analysis. Log-gamma regression from the group of generalized linear models was applied in order to account for the non-normal distribution of the response variable. All models were adjusted for population density. Low neighbourhood SEP was associated with decreasing neighbourhood green space availability including 200 m up to 1000 m buffers around the neighbourhood boundaries. Low neighbourhood SEP was also associated with decreasing green space availability based on catchment areas measured from neighbourhood centroids with different radii (1000 m up to 3000 m). With an increasing radius the strength of the associations decreased. Socially unequally distributed green space may amplify environmental health inequalities in an urban context. Thus, the identification of vulnerable neighbourhoods and population groups plays an important role for epidemiological research and healthy city planning. As a methodical aspect, log-gamma regression offers an adequate parametric modelling strategy for positively distributed environmental variables. Copyright © 2017 Elsevier GmbH. All rights reserved.
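Log-gamma regression of this kind is a GLM with a Gamma family and log link. The sketch below uses simulated data; the SEP index and green-space percentages are stand-ins.

```python
# Gamma GLM with log link for a positive, skewed outcome (green space %).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(11)
n = 300
sep = rng.normal(size=n)                       # neighbourhood SEP index
pop_density = rng.normal(size=n)
mu = np.exp(1.0 + 0.3 * sep - 0.1 * pop_density)
green = rng.gamma(shape=2.0, scale=mu / 2.0)   # positive, right-skewed

X = sm.add_constant(np.column_stack([sep, pop_density]))
fit = sm.GLM(green, X, family=sm.families.Gamma(sm.families.links.Log())).fit()
print(fit.params)   # log-scale coefficients; exp() gives multiplicative effects
```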
Bouti, Khalid; Benamor, Jouda; Bourkadi, Jamal Eddine
2017-08-01
Peak Expiratory Flow (PEF) has never been characterised among healthy Moroccan school children. The aims were to study the relationship between PEF and anthropometric parameters (sex, age, height and weight) in healthy Moroccan school children, to establish predictive equations for PEF, and to compare flowmetric and spirometric PEF with Forced Expiratory Volume in 1 second (FEV1). This cross-sectional study was conducted between April 2016 and May 2016. It involved 222 (122 boys and 100 girls) healthy school children living in Ksar el-Kebir, Morocco. We used mobile equipment to perform spirometry and peak expiratory flow measurements. SPSS (Version 22.0) was used to calculate Student's t-test, Pearson's correlation coefficient and linear regression. A significant linear correlation was seen between PEF, age and height in boys and girls. The equation for prediction of flowmetric PEF in boys was calculated as 'F-PEF = -187 + 24.4Age + 1.61Height' (p-value < 0.001, r = 0.86), and for girls as 'F-PEF = -151 + 17Age + 1.59Height' (p-value < 0.001, r = 0.86). The equation for prediction of spirometric PEF in boys was calculated as 'S-PEF = -199 + 9.8Age + 2.67Height' (p-value < 0.05, r = 0.77), and for girls as 'S-PEF = -181 + 8.5Age + 2.5Height' (p-value < 0.001, r = 0.83). The boys had higher values than the girls. The performance of the Mini Wright Peak Flow Meter was lower than that of a spirometer. Our study established PEF predictive equations for Moroccan children. Our results appear to be reliable, as evidenced by the high correlation coefficients in this sample. PEF can be an alternative to FEV1 in centers without spirometry.
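The reported flowmetric equations can be wrapped in a small helper. Note that the abstract does not state the units explicitly; age in years and height in cm (with PEF in l/min) are assumed here, as is typical for such equations.

```python
# The study's flowmetric PEF equations, as quoted in the abstract.
def predicted_pef_flowmetric(age_years, height_cm, sex):
    """Predicted flowmetric PEF from the reported regression equations."""
    if sex == "boy":
        return -187 + 24.4 * age_years + 1.61 * height_cm
    return -151 + 17 * age_years + 1.59 * height_cm

print(predicted_pef_flowmetric(12, 150, "boy"))   # 347.3
print(predicted_pef_flowmetric(12, 150, "girl"))  # 291.5
```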
Daily commuting to work is not associated with variables of health.
Mauss, Daniel; Jarczok, Marc N; Fischer, Joachim E
2016-01-01
Commuting to work is thought to have a negative impact on employee health. We tested the association of work commute and different variables of health in German industrial employees. Self-rated variables of an industrial cohort (n = 3805; 78.9% male) including absenteeism, presenteeism and indices reflecting stress and well-being were assessed by a questionnaire. Fasting blood samples, heart-rate variability and anthropometric data were collected. Commuting was grouped into one of four categories: 0-19.9, 20-44.9, 45-59.9, or ≥60 min travelling one way to work. Bivariate associations between commuting and all variables under study were calculated. Linear regression models tested this association further, controlling for potential confounders. Commuting was positively correlated with waist circumference and inversely with triglycerides. These associations did not remain statistically significant in linear regression models controlling for age, gender, marital status, and shiftwork. No other association with variables of physical, psychological, or mental health and well-being could be found. The results indicate that commuting to work has no significant impact on the well-being and health of German industrial employees.
NASA Astrophysics Data System (ADS)
Rose, R.; Aizenman, H.; Mei, E.; Choudhury, N.
2013-12-01
High School students interested in the STEM fields benefit most when actively participating, so I created a series of learning modules on how to analyze complex systems using machine learning that give automated feedback to students. The automated feedback gives timely responses that encourage the students to continue testing and enhancing their programs. I have designed my modules to take a tactical learning approach in conveying the concepts behind correlation, linear regression, and vector-distance-based classification and clustering. On successful completion of these modules, students will know how to calculate linear regression and Pearson's correlation, and how to apply classification and clustering techniques to a dataset. Working on these modules will allow the students to take back to the classroom what they've learned and then apply it to the Earth Science curriculum. During my research this summer, we applied these lessons to analyzing river deltas; we looked at trends in the different variables over time, looked for similarities in NDVI, precipitation, inundation, runoff and discharge, and attempted to predict floods based on the precipitation, waves mean, area of discharge, NDVI, and inundation.
Linear and nonlinear regression techniques for simultaneous and proportional myoelectric control.
Hahne, J M; Biessmann, F; Jiang, N; Rehbaum, H; Farina, D; Meinecke, F C; Muller, K-R; Parra, L C
2014-03-01
In recent years the number of active controllable joints in electrically powered hand prostheses has increased significantly. However, the control strategies for these devices in current clinical use are inadequate, as they require separate and sequential control of each degree-of-freedom (DoF). In this study we systematically compare linear and nonlinear regression techniques for independent, simultaneous and proportional myoelectric control of wrist movements with two DoFs. These techniques include linear regression, mixture of linear experts (ME), the multilayer perceptron, and kernel ridge regression (KRR). They are investigated offline with electromyographic signals acquired from ten able-bodied subjects and one person with congenital upper limb deficiency. The control accuracy is reported as a function of the number of electrodes and the amount and diversity of training data, providing guidance for the requirements in clinical practice. The results showed that KRR, a nonparametric statistical learning method, outperformed the other methods. However, simple transformations in the feature space could linearize the problem, so that linear models could achieve similar performance to KRR at much lower computational cost. ME in particular, a physiologically inspired extension of linear regression, represents a promising candidate for the next generation of prosthetic devices.
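The linear-versus-KRR contrast above is easy to reproduce on simulated data with scikit-learn; the "EMG features" and nonlinear target below are invented.

```python
# Linear regression vs kernel ridge regression on a nonlinear mapping.
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(12)
X = rng.normal(size=(500, 8))                                   # "EMG features"
y = np.tanh(X[:, 0] + 0.5 * X[:, 1]) + 0.05 * rng.normal(size=500)

Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
lin = LinearRegression().fit(Xtr, ytr)
krr = KernelRidge(kernel="rbf", alpha=0.1, gamma=0.1).fit(Xtr, ytr)
print("linear R²:", lin.score(Xte, yte))
print("KRR R²:   ", krr.score(Xte, yte))   # typically higher here
```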
Unitary Response Regression Models
ERIC Educational Resources Information Center
Lipovetsky, S.
2007-01-01
The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…
An Expert System for the Evaluation of Cost Models
1990-09-01
...in contrast to the condition of equal error variance, called homoscedasticity (Reference: Applied Linear Regression Models by John Neter, page 423). ...normal (Reference: Applied Linear Regression Models by John Neter, page 125). Autocorrelation: ...over time. Error terms correlated over time are said to be autocorrelated or serially correlated (Reference: Applied Linear Regression Models by John Neter, page
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kim, Yangho; Lee, Byung-Kook, E-mail: bklee@sch.ac.kr
Introduction: The objective of this study was to evaluate associations between blood lead, cadmium, and mercury levels with estimated glomerular filtration rate in a general population of South Korean adults. Methods: This was a cross-sectional study based on data obtained in the Korean National Health and Nutrition Examination Survey (KNHANES) (2008-2010). The final analytical sample consisted of 5924 participants. Estimated glomerular filtration rate (eGFR) was calculated using the MDRD Study equation as an indicator of glomerular function. Results: In multiple linear regression analysis of log2-transformed blood lead as a continuous variable on eGFR, after adjusting for covariates including cadmium and mercury, the difference in eGFR levels associated with a doubling of blood lead was -2.624 mL/min per 1.73 m² (95% CI: -3.803 to -1.445). In multiple linear regression analysis using quartiles of blood lead as the independent variable, the difference in eGFR levels comparing participants in the highest versus the lowest quartiles of blood lead was -3.835 mL/min per 1.73 m² (95% CI: -5.730 to -1.939). In a multiple linear regression analysis using blood cadmium and mercury, as continuous or categorical variables, as independent variables, neither metal was a significant predictor of eGFR. Odds ratios (ORs) and 95% CI values for reduced eGFR calculated for log2-transformed blood metals and quartiles of the three metals showed similar trends after adjustment for covariates. Discussion: In this large, representative sample of South Korean adults, elevated blood lead level was consistently associated with lower eGFR levels and with the prevalence of reduced eGFR, even at blood lead levels below 10 μg/dL. In conclusion, elevated blood lead level was associated with lower eGFR in a Korean general population, supporting the role of lead as a risk factor for chronic kidney disease.
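The abstract cites the MDRD Study equation for eGFR without reproducing it; the sketch below uses the commonly cited four-variable re-expressed form, whose coefficients should be treated as an assumption here rather than as the study's exact implementation:

```python
def egfr_mdrd(scr_mg_dl: float, age_years: float, female: bool, black: bool) -> float:
    """Four-variable MDRD Study eGFR in mL/min per 1.73 m^2.

    Coefficients are from the re-expressed (IDMS-traceable) MDRD equation;
    treat them as an assumption, since the paper does not reproduce them.
    """
    egfr = 175.0 * scr_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    if black:
        egfr *= 1.212
    return egfr

print(round(egfr_mdrd(0.9, 45, female=True, black=False), 1))
```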
Thomas, Colleen; Swayne, David E
2007-03-01
Thermal inactivation of the H5N1 high pathogenicity avian influenza (HPAI) virus strain A/chicken/Korea/ES/2003 (Korea/03) was quantitatively measured in thigh and breast meat harvested from infected chickens. The Korea/03 titers were recorded as the mean embryo infectious dose (EID50) and were 10^8.0 EID50/g in uncooked thigh samples and 10^7.5 EID50/g in uncooked breast samples. Survival curves were constructed for Korea/03 in chicken thigh and breast meat at 1 degree C intervals for temperatures of 57 to 61 degrees C. Although some curves had a slightly biphasic shape, a linear model provided a fair-to-good fit at all temperatures, with R² values of 0.85 to 0.93. Stepwise linear regression revealed that meat type did not contribute significantly to the regression model and generated a single linear regression equation for z-value calculations and D-value predictions for Korea/03 in both meat types. The z-value and the upper limit of the 95% confidence interval for the z-value were 4.64 and 5.32 degrees C, respectively. From the lowest temperature to the highest, the predicted D-values and the upper limits of their 95% prediction intervals (conservative D-values) for 57 to 61 degrees C were 241.2 and 321.1 s, 146.8 and 195.4 s, 89.3 and 118.9 s, 54.4 and 72.4 s, and 33.1 and 44.0 s. D-values and conservative D-values predicted for higher temperatures were 0.28 and 0.50 s for 70 degrees C and 0.041 and 0.073 s for 73.9 degrees C. Calculations with the conservative D-values predicted that cooking chicken meat according to current U.S. Department of Agriculture Food Safety and Inspection Service time-temperature guidelines will inactivate Korea/03 in a heavily contaminated meat sample, such as those tested in this study, with a large margin of safety.
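The D- and z-value calculations follow directly from the log-linear survival model: D is the negative reciprocal of the slope of log10 titer versus time, and z is the negative reciprocal of the slope of log10 D versus temperature. A minimal sketch (the survival data are invented; the D-values are the abstract's own):

```python
import numpy as np

def d_value(times_s, log10_titers):
    """D-value (s): time for a 1-log10 reduction, from the survival-curve slope."""
    slope = np.polyfit(times_s, log10_titers, 1)[0]
    return -1.0 / slope

def z_value(temps_c, d_values_s):
    """z-value (deg C): temperature rise giving a 10-fold drop in D-value."""
    slope = np.polyfit(temps_c, np.log10(d_values_s), 1)[0]
    return -1.0 / slope

# Invented survival data at 57 deg C: log10 titer vs heating time
t = np.array([0.0, 60.0, 120.0, 180.0, 240.0])
logN = np.array([8.0, 7.7, 7.5, 7.3, 7.0])
print(f"D57 ~ {d_value(t, logN):.0f} s")

# z-value from the abstract's predicted D-values across 57-61 deg C
temps = np.array([57.0, 58.0, 59.0, 60.0, 61.0])
Ds = np.array([241.2, 146.8, 89.3, 54.4, 33.1])
print(f"z ~ {z_value(temps, Ds):.2f} deg C")   # reproduces the reported 4.64
```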
Validity of Treadmill-Derived Critical Speed on Predicting 5000-Meter Track-Running Performance.
Nimmerichter, Alfred; Novak, Nina; Triska, Christoph; Prinz, Bernhard; Breese, Brynmor C
2017-03-01
Nimmerichter, A, Novak, N, Triska, C, Prinz, B, and Breese, BC. Validity of treadmill-derived critical speed on predicting 5,000-meter track-running performance. J Strength Cond Res 31(3): 706-714, 2017-To evaluate 3 models of critical speed (CS) for the prediction of 5,000-m running performance, 16 trained athletes completed an incremental test on a treadmill to determine maximal aerobic speed (MAS) and 3 randomly ordered runs to exhaustion at the Δ70% intensity and at 110% and 98% of MAS. Critical speed and the distance covered above CS (D') were calculated using the hyperbolic speed-time (HYP), the linear distance-time (LIN), and the linear speed inverse-time (INV) models. Five-thousand-meter performance was determined on a 400-m running track. Individual predictions of 5,000-m running time (t = [5,000 - D']/CS) and speed (s = D'/t + CS) were calculated across the 3 models, in addition to multiple regression analyses. Prediction accuracy was assessed with the standard error of estimate (SEE) from linear regression analysis and the mean difference expressed in units of measurement and as a coefficient of variation (%). Five-thousand-meter running performance (speed: 4.29 ± 0.39 m·s⁻¹; time: 1,176 ± 117 seconds) was significantly better than the predictions from all 3 models (p < 0.0001). The mean difference was 65-105 seconds (5.7-9.4%) for time and -0.22 to -0.34 m·s⁻¹ (-5.0 to -7.5%) for speed. Predictions from multiple regression analyses with CS and D' as predictor variables were not significantly different from actual running performance (-1.0 to 1.1%). The SEE across all models and predictions was approximately 65 seconds or 0.20 m·s⁻¹ and is therefore considered moderate. The results of this study show the importance of both aerobic and anaerobic energy system contributions in predicting 5,000-m running performance. Using estimates of CS and D' is valuable for predicting performance over race distances of 5,000 m.
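For the linear distance-time (LIN) model, CS and D' are simply the slope and intercept of distance regressed on time-to-exhaustion, after which the abstract's prediction formulas apply. A sketch with invented trial data:

```python
import numpy as np

# Invented time-to-exhaustion trials: duration (s) and distance covered (m)
t_lim = np.array([180.0, 420.0, 780.0])
dist = np.array([1020.0, 2150.0, 3750.0])

# Linear distance-time (LIN) model: d = D' + CS * t
CS, Dprime = np.polyfit(t_lim, dist, 1)

# 5,000-m predictions, as in the abstract: t = (5,000 - D')/CS, s = D'/t + CS
t_5k = (5000.0 - Dprime) / CS
s_5k = Dprime / t_5k + CS
print(f"CS = {CS:.2f} m/s, D' = {Dprime:.0f} m, t = {t_5k:.0f} s, s = {s_5k:.2f} m/s")
```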
Wang, Chao-Qun; Jia, Xiu-Hong; Zhu, Shu; Komatsu, Katsuko; Wang, Xuan; Cai, Shao-Qing
2015-03-01
A new quantitative analysis of multi-component with single marker (QAMS) method for 11 saponins (ginsenosides Rg1, Rb1, Rg2, Rh1, Rf, Re and Rd; notoginsenosides R1, R4, Fa and K) in notoginseng was established, in which 6 of these saponins were individually used as internal reference substances to investigate the influences of chemical structure, concentrations of quantitative components, and purities of the standard substances on the accuracy of the QAMS method. The results showed that the concentration of the analyte in the sample solution was the major influencing parameter, whereas the other parameters had minimal influence on the accuracy of the QAMS method. A new method for calculating the relative correction factors by linear regression was established (the linear regression method), which was shown to decrease the standard-method differences of the QAMS method from 1.20%±0.02% - 23.29%±3.23% to 0.10%±0.09% - 8.84%±2.85% in comparison with the previous method. The differences between the external standard method and the QAMS method using relative correction factors calculated by the linear regression method were below 5% in the quantitative determination of Rg1, Re, R1, Rd and Fa in 24 notoginseng samples and of Rb1 in 21 notoginseng samples, and mostly below 10% in the quantitative determination of Rf, Rg2, R4 and N-K (the differences for these 4 constituents were bigger because their contents were lower) in all 24 notoginseng samples. The results indicated that the contents assayed by the new QAMS method could be considered as accurate as those assayed by the external standard method. In addition, a method for determining applicable concentration ranges of the quantitative components assayed by the QAMS method was established for the first time, which could ensure its high accuracy and could be applied to QAMS methods of other TCMs. The present study demonstrated the practicability of the QAMS method for the quantitative analysis of multiple components and the quality control of TCMs and TCM prescriptions. Copyright © 2014 Elsevier B.V. All rights reserved.
Baqué, Michèle; Amendt, Jens
2013-01-01
Developmental data of juvenile blow flies (Diptera: Calliphoridae) are typically used to calculate the age of immature stages found on or around a corpse and thus to estimate a minimum post-mortem interval (PMI(min)). However, many of those data sets do not take into account that immature blow flies grow in a non-linear fashion. Linear models do not provide sufficiently reliable age estimates and may even lead to an erroneous determination of the PMI(min). In line with the Daubert standard and the need for improvements in forensic science, newer statistical tools such as smoothing methods and mixed models allow the modelling of non-linear relationships and expand the field of statistical analyses. The present study introduces the background and application of these statistical techniques by analysing a model which describes the development of the forensically important blow fly Calliphora vicina at different temperatures. The comparison of three statistical methods (linear regression, generalised additive modelling and generalised additive mixed modelling) clearly demonstrates that only the latter provided regression parameters that reflect the data adequately. We focus explicitly both on the exploration of the data, to assure their quality and to show the importance of checking them carefully prior to conducting statistical tests, and on the validation of the resulting models. Hence, we present a common method for evaluating and testing forensic entomological data sets by using, for the first time, generalised additive mixed models.
Automated Algorithms for Quantum-Level Accuracy in Atomistic Simulations: LDRD Final Report.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Thompson, Aidan Patrick; Schultz, Peter Andrew; Crozier, Paul
2014-09-01
This report summarizes the results of LDRD project 12-0395, titled "Automated Algorithms for Quantum-level Accuracy in Atomistic Simulations." During the course of this LDRD, we have developed an interatomic potential for solids and liquids called the Spectral Neighbor Analysis Potential (SNAP). The SNAP potential has a very general form and uses machine-learning techniques to reproduce the energies, forces, and stress tensors of a large set of small configurations of atoms, which are obtained using high-accuracy quantum electronic structure (QM) calculations. The local environment of each atom is characterized by a set of bispectrum components of the local neighbor density projected onto a basis of hyperspherical harmonics in four dimensions. The SNAP coefficients are determined using weighted least-squares linear regression against the full QM training set. This allows the SNAP potential to be fit in a robust, automated manner to large QM data sets using many bispectrum components. The calculation of the bispectrum components and the SNAP potential are implemented in the LAMMPS parallel molecular dynamics code. Global optimization methods in the DAKOTA software package are used to seek out good choices of hyperparameters that define the overall structure of the SNAP potential. FitSnap.py, a Python-based software package interfacing to both LAMMPS and DAKOTA, is used to formulate the linear regression problem, solve it, and analyze the accuracy of the resultant SNAP potential. We describe a SNAP potential for tantalum that accurately reproduces a variety of solid and liquid properties. Most significantly, in contrast to existing tantalum potentials, SNAP correctly predicts the Peierls barrier for screw dislocation motion. We also present results from SNAP potentials generated for indium phosphide (InP) and silica (SiO2). We describe efficient algorithms for calculating SNAP forces and energies in molecular dynamics simulations using massively parallel computers and advanced processor architectures. Finally, we briefly describe the MSM method for efficient calculation of electrostatic interactions on massively parallel computers.
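The coefficient fit described here is ordinary weighted least squares; a minimal numpy sketch with random stand-ins for the bispectrum design matrix and QM targets:

```python
import numpy as np

rng = np.random.default_rng(1)
# Stand-ins: rows are QM training configurations, columns bispectrum components
A = rng.normal(size=(200, 10))
y = A @ np.arange(1.0, 11.0) + 0.01 * rng.normal(size=200)
w = rng.uniform(0.5, 2.0, size=200)   # per-row weights (e.g. energy vs force rows)

# Weighted least squares: scale rows by sqrt(w), then solve the ordinary problem
sw = np.sqrt(w)
coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
print(coef.round(2))
```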
Compound Identification Using Penalized Linear Regression on Metabolomics
Liu, Ruiqi; Wu, Dongfeng; Zhang, Xiang; Kim, Seongho
2014-01-01
Compound identification is often achieved by matching the experimental mass spectra to the mass spectra stored in a reference library based on mass spectral similarity. Because the number of compounds in the reference library is much larger than the range of mass-to-charge ratio (m/z) values, the data become high-dimensional and suffer from singularity. For this reason, penalized linear regressions such as ridge regression and the lasso are used instead of ordinary least squares regression. Furthermore, two-step approaches using the dot product and Pearson's correlation along with the penalized linear regression are proposed in this study. PMID:27212894
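A toy version of the idea, regressing a query spectrum on a library whose size exceeds the number of m/z bins, can be written with scikit-learn (all data synthetic):

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(0)
# Synthetic library: 1000 compounds x 400 m/z bins (more compounds than bins,
# so the unpenalized least-squares problem is singular)
library = rng.random((1000, 400))
query = library[3] + 0.05 * rng.normal(size=400)   # noisy spectrum of compound 3

# Regress the query spectrum on the library spectra; large coefficients
# flag candidate matches, and the penalty keeps the problem solvable
ridge = Ridge(alpha=1.0).fit(library.T, query)
lasso = Lasso(alpha=0.01).fit(library.T, query)
print("top ridge candidate:", np.argmax(ridge.coef_))
print("top lasso candidate:", np.argmax(lasso.coef_))
```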
Uveal Melanoma Regression after Brachytherapy: Relationship with Chromosome 3 Monosomy Status.
Salvi, Sachin M; Aziz, Hassan A; Dar, Suhail; Singh, Nakul; Hayden-Loreck, Brandy; Singh, Arun D
2017-07-01
The objective was to evaluate the relationship between the regression rate of ciliary body melanoma and choroidal melanoma after brachytherapy and chromosome 3 monosomy status. We conducted a prospective and consecutive case series of patients who underwent biopsy and brachytherapy for ciliary/choroidal melanoma. Tumor biopsy performed at the time of radiation plaque placement was analyzed with fluorescence in situ hybridization to determine the percentage of tumor cells with chromosome 3 monosomy. The regression rate was calculated as the percent change in tumor height at months 3, 6, and 12. The relationship between regression rate and tumor location, initial tumor height, and chromosome 3 monosomy (percentage) was assessed by univariate linear regression (R version 3.1.0). Of the 75 patients included in the study, 8 had ciliary body melanoma, and 67 were choroidal melanomas. The mean tumor height at the time of diagnosis was 5.2 mm (range: 1.90-13.00). The percentage composition of chromosome 3 monosomy ranged from 0-20% (n = 35) to 81-100% (n = 40). The regression of tumor height at months 3, 6, and 12 did not statistically correlate with tumor location (ciliary or choroidal), initial tumor height, or chromosome 3 monosomy (percentage). The regression rate of choroidal melanoma following brachytherapy did not correlate with chromosome 3 monosomy status.
Krishan, Kewal; Kanchan, Tanuj; Sharma, Abhilasha
2012-05-01
Estimation of stature is an important parameter in the identification of human remains in forensic examinations. The present study aimed to compare the reliability and accuracy of stature estimation, and to demonstrate the variability between estimated stature and actual stature, using the multiplication factor and regression analysis methods. The study is based on a sample of 246 subjects (123 males and 123 females) from North India aged between 17 and 20 years. Four anthropometric measurements (hand length, hand breadth, foot length and foot breadth), taken on the left side in each subject, were included in the study. Stature was measured using standard anthropometric techniques. Multiplication factors were calculated and linear regression models were derived for the estimation of stature from hand and foot dimensions. The derived multiplication factors and regression formulae were applied to the hand and foot measurements in the study sample, and the estimated stature from each method was compared with the actual stature to find the error in estimated stature. The results indicate that the range of error in estimation of stature from the regression analysis method is less than that of the multiplication factor method, thus confirming that regression analysis is better than multiplication factor analysis for stature estimation. Copyright © 2012 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
Control Variate Selection for Multiresponse Simulation.
1987-05-01
Neter, J., W. Wasserman, and M. H. Kutner, Applied Linear Regression Models, Richard D. Irwin, Inc., Homewood, Illinois, 1983. Neuts, Marcel F., Probability, Allyn and Bacon, 1982. Aspects of Multivariate Statistical Theory, John Wiley and Sons, New York, New York, 1982.
ERIC Educational Resources Information Center
Kobrin, Jennifer L.; Sinharay, Sandip; Haberman, Shelby J.; Chajewski, Michael
2011-01-01
This study examined the adequacy of a multiple linear regression model for predicting first-year college grade point average (FYGPA) using SAT[R] scores and high school grade point average (HSGPA). A variety of techniques, both graphical and statistical, were used to examine if it is possible to improve on the linear regression model. The results…
High correlations between MRI brain volume measurements based on NeuroQuant® and FreeSurfer.
Ross, David E; Ochs, Alfred L; Tate, David F; Tokac, Umit; Seabaugh, John; Abildskov, Tracy J; Bigler, Erin D
2018-05-30
NeuroQuant ® (NQ) and FreeSurfer (FS) are commonly used computer-automated programs for measuring MRI brain volume. Previously they were reported to have high intermethod reliabilities but often large intermethod effect size differences. We hypothesized that linear transformations could be used to reduce the large effect sizes. This study was an extension of our previously reported study. We performed NQ and FS brain volume measurements on 60 subjects (including normal controls, patients with traumatic brain injury, and patients with Alzheimer's disease). We used two statistical approaches in parallel to develop methods for transforming FS volumes into NQ volumes: traditional linear regression, and Bayesian linear regression. For both methods, we used regression analyses to develop linear transformations of the FS volumes to make them more similar to the NQ volumes. The FS-to-NQ transformations based on traditional linear regression resulted in effect sizes which were small to moderate. The transformations based on Bayesian linear regression resulted in all effect sizes being trivially small. To our knowledge, this is the first report describing a method for transforming FS to NQ data so as to achieve high reliability and low effect size differences. Machine learning methods like Bayesian regression may be more useful than traditional methods. Copyright © 2018 Elsevier B.V. All rights reserved.
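A minimal sketch of the FS-to-NQ transformation step, using scikit-learn's BayesianRidge on invented volumes (the assumed linear relation and noise levels are illustrative only, not the study's data):

```python
import numpy as np
from sklearn.linear_model import BayesianRidge

rng = np.random.default_rng(0)
# Invented hippocampal volumes (cm^3): FreeSurfer (FS) vs NeuroQuant (NQ)
fs = rng.normal(7.5, 0.8, size=60)
nq = 0.92 * fs + 0.4 + rng.normal(0, 0.15, size=60)  # assumed linear relation

# Bayesian linear regression mapping FS volumes onto the NQ scale
model = BayesianRidge().fit(fs.reshape(-1, 1), nq)
fs_transformed = model.predict(fs.reshape(-1, 1))

# Effect size (Cohen's d) between transformed FS and NQ should now be trivial
d = (fs_transformed.mean() - nq.mean()) / np.sqrt(
    (fs_transformed.std(ddof=1) ** 2 + nq.std(ddof=1) ** 2) / 2)
print(f"post-transformation Cohen's d = {d:.3f}")
```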
A pocket-sized metabolic analyzer for assessment of resting energy expenditure.
Zhao, Di; Xian, Xiaojun; Terrera, Mirna; Krishnan, Ranganath; Miller, Dylan; Bridgeman, Devon; Tao, Kevin; Zhang, Lihua; Tsow, Francis; Forzani, Erica S; Tao, Nongjian
2014-04-01
The assessment of metabolic parameters related to energy expenditure has a proven value for weight management; however, these measurements remain too difficult and costly for monitoring individuals at home. The objective of this study is to evaluate the accuracy of a new pocket-sized metabolic analyzer device for assessing energy expenditure at rest (REE) and during sedentary activities (EE). The new device performs indirect calorimetry by measuring an individual's oxygen consumption (VO2) and carbon dioxide production (VCO2) rates, which allows the determination of resting- and sedentary activity-related energy expenditure. VO2 and VCO2 values of 17 volunteer adult subjects were measured during resting and sedentary activities in order to compare the metabolic analyzer with the Douglas bag method, which is considered the gold standard for indirect calorimetry. Metabolic parameters of VO2, VCO2, and energy expenditure were compared using linear regression analysis, paired t-tests, and Bland-Altman plots. Linear regression analysis of measured VO2 and VCO2 values, as well as calculated energy expenditure assessed with the new analyzer and the Douglas bag method, had the following linear regression parameters (linear regression slope, LRS0, and R-squared coefficient, r²), with p = 0: LRS0 (SD) = 1.00 (0.01), r² = 0.9933 for VO2; LRS0 (SD) = 1.00 (0.01), r² = 0.9929 for VCO2; and LRS0 (SD) = 1.00 (0.01), r² = 0.9942 for energy expenditure. In addition, results from paired t-tests did not show a statistically significant difference between the methods at a significance level of α = 0.05 for VO2, VCO2, REE, and EE. Furthermore, the Bland-Altman plot for REE showed good agreement between methods, with 100% of the results within ±2 SD, which was equivalent to ≤10% error. The findings demonstrate that the new pocket-sized metabolic analyzer device is accurate for determining VO2, VCO2, and energy expenditure. Copyright © 2013 Elsevier Ltd and European Society for Clinical Nutrition and Metabolism. All rights reserved.
Quantile Regression in the Study of Developmental Sciences
Petscher, Yaacov; Logan, Jessica A. R.
2014-01-01
Linear regression analysis is one of the most common techniques applied in developmental research, but it only allows for an estimate of the average relation between the predictor(s) and the outcome. This study describes quantile regression, which provides estimates of the relations between the predictor(s) and the outcome across multiple points of the outcome's distribution. Using data from the High School and Beyond and U.S. Sustained Effects Study databases, quantile regression is demonstrated and contrasted with linear regression when considering models with: (a) one continuous predictor, (b) one dichotomous predictor, (c) a continuous and a dichotomous predictor, and (d) a longitudinal application. Results from each example exhibit the differential inferences which may be drawn using linear or quantile regression. PMID:24329596
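A brief statsmodels sketch of the contrast, on synthetic heteroscedastic data, shows how conditional-quantile slopes can differ from the OLS (average) slope:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
# Synthetic heteroscedastic data: the spread of y grows with x
df = pd.DataFrame({"x": rng.uniform(0, 10, 500)})
df["y"] = 2.0 + 0.5 * df.x + rng.normal(0, 0.3 + 0.2 * df.x)

ols = smf.ols("y ~ x", df).fit()             # average relation only
q10 = smf.quantreg("y ~ x", df).fit(q=0.10)  # relation at the 10th percentile
q90 = smf.quantreg("y ~ x", df).fit(q=0.90)  # relation at the 90th percentile
print(f"slopes: OLS {ols.params['x']:.2f}, "
      f"q10 {q10.params['x']:.2f}, q90 {q90.params['x']:.2f}")
```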
Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L
2018-01-01
Aims: A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R²), using R² as the primary metric of assay agreement. However, the use of R² alone does not adequately quantify the constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. Methods: We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing (NGS) assays. NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Results: Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. The Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. Conclusions: The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of the performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. PMID:28747393
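A minimal sketch of the Bland-Altman computation on synthetic paired assay values (a small constant error is injected for illustration):

```python
import numpy as np

def bland_altman(a: np.ndarray, b: np.ndarray):
    """Bland-Altman statistics for two paired quantitative assays."""
    diff = a - b
    bias = diff.mean()                          # constant (systematic) error
    sd = diff.std(ddof=1)
    loa = (bias - 1.96 * sd, bias + 1.96 * sd)  # 95% limits of agreement
    return bias, loa

# Synthetic variant allele fractions: validated assay vs NGS assay
rng = np.random.default_rng(0)
ref = rng.uniform(0.05, 0.5, 40)
ngs = ref + 0.01 + rng.normal(0, 0.02, 40)      # assumed small constant error
bias, (lo, hi) = bland_altman(ngs, ref)
print(f"bias = {bias:.3f}, limits of agreement = ({lo:.3f}, {hi:.3f})")
```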
Methodology for the development of normative data for Spanish-speaking pediatric populations.
Rivera, D; Arango-Lasprilla, J C
2017-01-01
To describe the methodology utilized to calculate reliability and to generate norms for 10 neuropsychological tests for children in Spanish-speaking countries. The study sample consisted of 4,373 healthy children from nine countries in Latin America (Chile, Cuba, Ecuador, Guatemala, Honduras, Mexico, Paraguay, Peru, and Puerto Rico) and Spain. Inclusion criteria for all countries were an age of 6 to 17 years, an Intelligence Quotient of ≥80 on the Test of Non-Verbal Intelligence (TONI-2), and a score of <19 on the Children's Depression Inventory. Participants completed 10 neuropsychological tests, and reliability and norms were calculated for all tests. Test-retest analysis showed excellent or good reliability on all tests (r's > 0.55; p's < 0.001) except M-WCST perseverative errors, whose coefficient magnitude was fair. All scores were normed using multiple linear regressions and standard deviations of residual values. Age, age², sex, and mean level of parental education (MLPE) were included as predictors in the models by country. Non-significant variables (p > 0.05) were removed and the analyses were run again. This is the largest normative study of Spanish-speaking children and adolescents in the world. For the generation of normative data, a method based on linear regression models and the standard deviation of residual values was used. This method allows determination of the specific variables that predict test scores, helps identify and control for collinearity of predictive variables, and generates continuous and more reliable norms than those of traditional methods.
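The norming procedure described, regression on age, age², sex, and MLPE followed by standardizing against the residual SD, can be sketched as follows (all data and the example raw score are invented):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
# Invented normative sample: raw score modeled from age, age^2, sex and MLPE
df = pd.DataFrame({"age": rng.integers(6, 18, 800),
                   "sex": rng.integers(0, 2, 800),
                   "mlpe": rng.integers(1, 5, 800)})
df["score"] = (10 + 2.2 * df.age - 0.05 * df.age**2 + 1.5 * df.mlpe
               + rng.normal(0, 3, 800))

fit = smf.ols("score ~ age + I(age**2) + sex + mlpe", df).fit()
sd_resid = np.std(fit.resid, ddof=int(fit.df_model) + 1)

# Norm for a new child: standardized residual against the model prediction
new = pd.DataFrame({"age": [8], "sex": [1], "mlpe": [3]})
z = (25.0 - fit.predict(new)[0]) / sd_resid
print(f"z = {z:.2f}")
```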
Climate patterns as predictors of amphibians species richness and indicators of potential stress
Battaglin, W.; Hay, L.; McCabe, G.; Nanjappa, P.; Gallant, Alisa L.
2005-01-01
Amphibians occupy a range of habitats throughout the world, but species richness is greatest in regions with moist, warm climates. We modeled the statistical relations of anuran and urodele species richness with mean annual climate for the conterminous United States, and compared the strength of these relations at national and regional levels. Model variables were calculated for county and subcounty mapping units, and included 40-year (1960-1999) annual mean and mean annual climate statistics, mapping unit average elevation, mapping unit land area, and estimates of anuran and urodele species richness. Climate data were derived from more than 7,500 first-order and cooperative meteorological stations and were interpolated to the mapping units using multiple linear regression models. Anuran and urodele species richness were calculated from the United States Geological Survey's Amphibian Research and Monitoring Initiative (ARMI) National Atlas for Amphibian Distributions. The national multivariate linear regression (MLR) model of anuran species richness had an adjusted coefficient of determination (R²) value of 0.64, and the national MLR model for urodele species richness had an R² value of 0.45. Stratifying the United States by coarse-resolution ecological regions provided models for anurans with R² values ranging from 0.15 to 0.78. Regional models for urodeles had R² values ranging from 0.27 to 0.74. In general, regional models for anurans were more strongly influenced by temperature variables, whereas precipitation variables had a larger influence on urodele models.
SU-F-R-20: Image Texture Features Correlate with Time to Local Failure in Lung SBRT Patients
DOE Office of Scientific and Technical Information (OSTI.GOV)
Andrews, M; Abazeed, M; Woody, N
Purpose: To explore possible correlations between CT image-based texture and histogram features and time-to-local-failure in early stage non-small cell lung cancer (NSCLC) patients treated with stereotactic body radiotherapy (SBRT). Methods and Materials: From an IRB-approved lung SBRT registry for patients treated between 2009-2013 we selected 48 (20 male, 28 female) patients with local failure. Median patient age was 72.3±10.3 years. Mean time to local failure was 15 ± 7.1 months. Physician-contoured gross tumor volumes (GTV) on the planning CT images were processed, and 3D gray-level co-occurrence matrix (GLCM) based texture and histogram features were calculated in Matlab. Data were exported to R, and a multiple linear regression model was used to examine the relationship between texture features and time-to-local-failure. Results: Multiple linear regression revealed that entropy (p=0.0233, multiple R²=0.60) from the GLCM-based texture analysis and the standard deviation (p=0.0194, multiple R²=0.60) from the histogram-based features were statistically significantly correlated with time-to-local-failure. Conclusion: Image-based texture analysis can be used to predict certain aspects of treatment outcomes of NSCLC patients treated with SBRT. We found that entropy and standard deviation calculated for the GTV on the CT images displayed a statistically significant correlation with time-to-local-failure in lung SBRT patients.
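GLCM entropy of a tumor patch can be computed with scikit-image (graycomatrix; spelled greycomatrix in older releases); this 2D sketch on a random patch stands in for the 3D analysis described in the abstract:

```python
import numpy as np
from skimage.feature import graycomatrix

rng = np.random.default_rng(0)
# Invented 8-bit CT patch standing in for a contoured GTV (2D for simplicity)
patch = rng.integers(0, 64, size=(32, 32), dtype=np.uint8)

# Gray-level co-occurrence matrix at offset 1, angle 0, normalized to p(i, j)
glcm = graycomatrix(patch, distances=[1], angles=[0], levels=64, normed=True)
p = glcm[:, :, 0, 0]

# GLCM entropy: -sum p*log2(p) over nonzero entries
entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))
print(f"GLCM entropy = {entropy:.2f}")
```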
Experimental paleotemperature equation for planktonic foraminifera
NASA Astrophysics Data System (ADS)
Erez, Jonathan; Luz, Boaz
1983-06-01
Small live individuals of Globigerinoides sacculifer which were cultured in the laboratory reached maturity and produced gametes. Fifty to ninety percent of their skeletal weight was deposited under controlled water temperature (14° to 30°C) and water isotopic composition, and a correction was made to account for the isotopic composition of the original skeleton using control groups. Comparison of the actual growth temperatures with the calculated temperatures based on paleotemperature equations for inorganic CaCO₃ indicates that the foraminifera precipitate their CaCO₃ in isotopic equilibrium. Comparison with equations developed for biogenic calcite gives a similarly good fit. Linear regression with Craig's (1965) equation yields: t = -0.07 + 1.01 t̂ (r = 0.95), where t is the actual growth temperature and t̂ is the calculated paleotemperature. The intercept and the slope of this linear equation show that the familiar paleotemperature equation, developed originally for mollusc carbonate, is equally applicable to the planktonic foraminifer G. sacculifer. Second-order regression of the culture temperature on the delta difference (δ¹⁸Oc − δ¹⁸Ow) yields a correlation coefficient of r = 0.95: t̂ = 17.0 − 4.52(δ¹⁸Oc − δ¹⁸Ow) + 0.03(δ¹⁸Oc − δ¹⁸Ow)², where t̂, δ¹⁸Oc and δ¹⁸Ow are the estimated temperature, the isotopic composition of the shell carbonate, and that of the sea water, respectively. A possible cause of the non-equilibrium isotopic compositions reported earlier for living planktonic foraminifera is improper combustion of the organic matter.
A SEMIPARAMETRIC BAYESIAN MODEL FOR CIRCULAR-LINEAR REGRESSION
We present a Bayesian approach to regress a circular variable on a linear predictor. The regression coefficients are assumed to have a nonparametric distribution with a Dirichlet process prior. The semiparametric Bayesian approach gives added flexibility to the model and is usefu...
Applicability of Cameriere's and Drusini's age estimation methods to a sample of Turkish adults.
Hatice, Boyacioglu Dogru; Nihal, Avcu; Nursel, Akkaya; Humeyra Ozge, Yilanci; Goksuluk, Dincer
2017-10-01
The aim of this study was to investigate the applicability of Drusini's and Cameriere's methods to a sample of Turkish people. Panoramic images of 200 individuals were allocated to two groups, a study group and a test group, and examined by two observers. The tooth coronal index (TCI), the ratio between coronal pulp cavity height and crown height, was calculated for the mandibular first and second premolars and molars. Pulp/tooth area ratios (ARs) were calculated for the maxillary and mandibular canine teeth. Study group measurements were used to derive a regression model, and test group measurements were used to evaluate its accuracy. Pearson's correlation coefficients and regression analysis were used. The correlations between TCIs and age were -0.230, -0.301, -0.344 and -0.257 for the mandibular first premolar, second premolar, first molar and second molar, respectively. Those for the maxillary canine (MX) and mandibular canine (MN) ARs were -0.716 and -0.514, respectively. The MX ARs were used to build the linear regression model, which explained 51.2% of the total variation with a standard error of 9.23 years. The mean error of the estimates in the test group was 8 years, and the ages of 64% of the individuals were estimated with an error of <±10 years, which is acceptable in forensic age prediction. The low correlation coefficients between age and TCI indicate that Drusini's method was not applicable to the estimation of age in a Turkish population. Using Cameriere's method, we derived a regression model.
A kinetic energy model of two-vehicle crash injury severity.
Sobhani, Amir; Young, William; Logan, David; Bahrololoom, Sareh
2011-05-01
An important part of any model of vehicle crashes is the development of a procedure to estimate crash injury severity. After reviewing existing models of crash severity, this paper outlines the development of a modelling approach aimed at measuring the injury severity of people in two-vehicle road crashes. This model can be incorporated into a discrete event traffic simulation model, using simulation model outputs as its input. The model can then serve as an integral part of a simulation model estimating the crash potential of components of the traffic system. The model is developed using Newtonian Mechanics and Generalised Linear Regression. The factors contributing to the speed change (ΔV(s)) of a subject vehicle are identified using the law of conservation of momentum. A Log-Gamma regression model is fitted to measure speed change (ΔV(s)) of the subject vehicle based on the identified crash characteristics. The kinetic energy applied to the subject vehicle is calculated by the model, which in turn uses a Log-Gamma Regression Model to estimate the Injury Severity Score of the crash from the calculated kinetic energy, crash impact type, presence of airbag and/or seat belt and occupant age. Copyright © 2010 Elsevier Ltd. All rights reserved.
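The Log-Gamma regression step corresponds to a generalized linear model with a Gamma family and log link; a statsmodels sketch on invented crash records (covariates and effect sizes are illustrative assumptions, not the paper's estimates):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 300
# Invented crash records: speed change delta-V (km/h) and airbag indicator
delta_v = rng.uniform(5, 80, n)
airbag = rng.integers(0, 2, n)
# Positive-valued injury score with mean rising exponentially in delta-V
iss = rng.gamma(2.0, np.exp(1.0 + 0.03 * delta_v - 0.3 * airbag) / 2.0)

# Log-Gamma regression: Gamma family with log link
X = sm.add_constant(np.column_stack([delta_v, airbag]))
fit = sm.GLM(iss, X, family=sm.families.Gamma(link=sm.families.links.Log())).fit()
print(fit.params.round(3))
```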
Kumar, K Vasanth; Sivanesan, S
2006-08-25
Pseudo-second-order kinetic expressions of Ho, Sobkowski and Czerwinski, Blanchard et al., and Ritchie were fitted to the experimental kinetic data of malachite green onto activated carbon by non-linear and linear methods. The non-linear method was found to be a better way of obtaining the parameters involved in the second-order rate kinetic expressions. Both linear and non-linear regression showed that the Sobkowski and Czerwinski and the Ritchie pseudo-second-order models were the same. Non-linear regression analysis showed that both Blanchard et al. and Ho had similar ideas on the pseudo-second-order model but with different assumptions. The best fit of experimental data in Ho's pseudo-second-order expression by linear and non-linear regression showed that Ho's pseudo-second-order model was a better kinetic expression than the other pseudo-second-order kinetic expressions. The amount of dye adsorbed at equilibrium, qe, was predicted from Ho's pseudo-second-order expression and fitted to the Langmuir, Freundlich and Redlich-Peterson expressions by both linear and non-linear methods to obtain the pseudo-isotherms. The best-fitting pseudo-isotherms were found to be the Langmuir and Redlich-Peterson isotherms. Redlich-Peterson is a special case of Langmuir when the constant g equals unity.
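Ho's pseudo-second-order model can be fitted both ways in a few lines; the kinetic data below are invented for illustration:

```python
import numpy as np
from scipy.optimize import curve_fit

def ho_pso(t, qe, k):
    """Ho's pseudo-second-order model: qt = k*qe^2*t / (1 + k*qe*t)."""
    return k * qe**2 * t / (1.0 + k * qe * t)

# Invented adsorption kinetics: time (min) and uptake qt (mg/g)
t = np.array([5.0, 10.0, 20.0, 30.0, 60.0, 120.0, 180.0])
qt = np.array([12.1, 19.8, 28.9, 33.5, 40.2, 44.0, 45.1])

# Non-linear regression on the untransformed model
(qe_nl, k_nl), _ = curve_fit(ho_pso, t, qt, p0=[qt.max(), 0.01])

# Linearized form: t/qt = 1/(k*qe^2) + t/qe, fitted by ordinary least squares
slope, intercept = np.polyfit(t, t / qt, 1)
qe_lin, k_lin = 1.0 / slope, slope**2 / intercept
print(f"non-linear: qe={qe_nl:.1f}, k={k_nl:.4f}; "
      f"linear: qe={qe_lin:.1f}, k={k_lin:.4f}")
```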
NASA Astrophysics Data System (ADS)
Walawender, Jakub; Kothe, Steffen; Trentmann, Jörg; Pfeifroth, Uwe; Cremer, Roswitha
2017-04-01
The purpose of this study is to create a 1 km² gridded daily sunshine duration data record for Germany covering the period from 1983 to 2015 (33 years), based on satellite estimates of direct normalised surface solar radiation and in situ sunshine duration observations, using a geostatistical approach. The CM SAF SARAH direct normalized irradiance (DNI) satellite climate data record and in situ observations of sunshine duration from 121 weather stations operated by DWD are used as input datasets. The selected period of 33 years is determined by the availability of satellite data. The number of ground stations is limited to 121 because only time series with less than 10% missing observations over the selected period were included, to keep the long-term consistency of the output sunshine duration data record. In the first step, the DNI data record is used to derive sunshine hours by applying the WMO threshold of 120 W/m² (SDU = DNI ≥ 120 W/m²), with weighting of sunny slots to correct the sunshine length between two instantaneous images for cloud movement. In the second step, a linear regression between SDU and in situ sunshine duration is calculated to adjust the satellite product to the ground observations, and the output regression coefficients are applied to create a regression grid. In the last step, the regression residuals are interpolated with ordinary kriging and added to the regression grid. A comprehensive accuracy assessment of the gridded sunshine duration data record is performed by calculating prediction errors (cross-validation routine). R is used for data processing. A short analysis of the spatial distribution and temporal variability of sunshine duration over Germany based on the created dataset will be presented. The gridded sunshine duration data are useful for applications in various climate-related studies, agriculture, and solar energy potential calculations.
Depuydt, Christophe E; Thys, Sofie; Beert, Johan; Jonckheere, Jef; Salembier, Geert; Bogers, Johannes J
2016-11-01
Persistent high-risk human papillomavirus (HPV) infection is strongly associated with the development of high-grade cervical intraepithelial neoplasia or cancer (CIN3+). In single-type infections, serial type-specific viral-load measurements predict the natural history of the infection. In infections with multiple HPV types, the individual type-specific viral-load profile could distinguish progressing HPV infections from regressing infections. A case-cohort natural history study was established using samples from untreated women with multiple HPV infections who developed CIN3+ (n = 57) or cleared infections (n = 88). Enriched cell pellets from liquid-based cytology samples were subjected to a clinically validated real-time qPCR assay (18 HPV types). Using serial type-specific viral-load measurements (≥3) we calculated HPV-specific slopes and coefficients of determination (R²) by linear regression. For each woman, slopes and R² were used to calculate which HPV-induced processes were ongoing (progression, regression, serial transient, transient). In transient infections with multiple HPV types, each single HPV type generated similar increasing (0.27 copies/cell/day) and decreasing (-0.27 copies/cell/day) viral-load slopes. In CIN3+, at least one of the HPV types had a clonal progressive course (R² ≥ 0.85; 0.0025 copies/cell/day). In selected CIN3+ cases (n = 6), immunostaining detecting type-specific HPV 16, 31, 33, 58 and 67 RNA showed an even staining in clonal populations (CIN3+), whereas in transient virion-producing infections the RNA staining was weaker in the basal layer compared with the upper layer, where cells were ready to desquamate and release newly formed virions. RNA-hybridization patterns matched the ongoing processes measured by R² and slope in serial type-specific viral-load measurements preceding the biopsy. In women with multiple HPV types, serial type-specific viral-load measurements predict the natural history of the different HPV types and elucidate HPV-genotype attribution. © 2016 UICC.
Caruso, Rosario; Scordino, Monica; Traulo, Pasqualino; Gagliano, Giacomo
2012-01-01
A capillary GC-flame ionization detection (FID) method to determine volatile compounds (ethyl acetate, 1,1-diethoxyethane, methyl alcohol, 1-propanol, 2-methyl-1-propanol, 2-methyl-1-butanol, 3-methyl-1-butanol, 1-butanol, and 2-butanol) in wine was investigated in terms of calculation of detection limits and calibration method. The main objectives were: (1) calculation of regression coefficient parameters by ordinary least-squares (OLS) and bivariate least-squares (BLS) regression models, taking into account errors in both axes; (2) estimation of the linear dynamic range (LDR) according to International Conference on Harmonization recommendations; (3) performance evaluation of the method using three different internal standards (ISs), namely acetonitrile, acetone, and 1-pentanol; (4) evaluation of LODs according to the U.S. Environmental Protection Agency (EPA) 3σ approach and the Hubaux-Vos (H-V) method; (5) application of H-V theory to a gas chromatographic analytical method and to a food matrix; and (6) accuracy assessment of the method relative to methyl alcohol content through a Unione Italiana Vini (UIV) interlaboratory proficiency test. Calibration curves calculated via BLS and OLS show similar slopes, while intercepts are closer to zero in the first case, independent of the chosen IS. The studied ISs show substantially equivalent behavior, even though the IS closer to the analyte retention time seems to be more appropriate in terms of LDR and LOD. Results indicate an underestimation of LODs using the EPA 3σ approach instead of the more realistic H-V method, both with OLS and BLS regression models. Methanol contents compared with UIV average values indicate recoveries between 90 and 110%.
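scipy has no BLS routine as such, but its orthogonal distance regression similarly accounts for errors in both axes and can stand in for the OLS-versus-BLS comparison (calibration data invented):

```python
import numpy as np
from scipy import odr

rng = np.random.default_rng(0)
# Invented calibration: concentration x and GC-FID response y, errors in both axes
x_true = np.linspace(1, 10, 8)
x = x_true + rng.normal(0, 0.05, 8)
y = 2.0 * x_true + 0.1 + rng.normal(0, 0.1, 8)

# Ordinary least squares (errors in y only)
b_ols, a_ols = np.polyfit(x, y, 1)

# Errors-in-both-axes fit via orthogonal distance regression
linear = odr.Model(lambda beta, x: beta[0] * x + beta[1])
data = odr.RealData(x, y, sx=0.05, sy=0.1)
out = odr.ODR(data, linear, beta0=[b_ols, a_ols]).run()
print(f"OLS: slope={b_ols:.3f}, intercept={a_ols:.3f}")
print(f"ODR: slope={out.beta[0]:.3f}, intercept={out.beta[1]:.3f}")
```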
Geochemistry of some rare earth elements in groundwater, Vierlingsbeek, The Netherlands.
Janssen, René P T; Verweij, Wilko
2003-03-01
Groundwater samples were taken from seven bore holes at depths ranging from 2 to 41 m near the drinking water pumping station Vierlingsbeek, The Netherlands, and analysed for Y, La, Ce, Pr, Nd, Sm and Eu. Shale-normalized patterns were generally flat and showed that the observed rare earth elements (REE) were probably of natural origin. In the shallow groundwaters the REEs were light-REE (LREE) enriched, probably caused by binding of LREEs to colloids. To improve understanding of the behaviour of the REEs, two approaches were used: calculation of the speciation and a statistical approach. For the speciation calculations, complexation and precipitation reactions, including inorganic and dissolved organic carbon (DOC) compounds, were taken into account. The REE speciation showed REE³⁺, REE(SO₄)⁺, REE(CO₃)⁺ and REE(DOC) to be the major species. Dissolution of pure REE precipitates and REE-enriched solid phases did not account for the observed REEs in groundwater. Regulation of REE concentrations by adsorption-desorption processes on Fe(III)(OH)₃ and Al(OH)₃ minerals, which were calculated to be present in nearly all groundwaters, is a probable explanation. The statistical approach (multiple linear regression) showed that pH is by far the most significant groundwater characteristic contributing to the variation in REE concentrations. DOC, SO₄, Fe and Al also contributed significantly, although to a much lesser extent. This is in line with the calculated REE species in solution and REE adsorption on iron and aluminium (hydr)oxides. Regression equations including only pH were derived to predict REE concentrations in groundwater. External validation showed that these regression equations were reasonably successful in predicting REE concentrations of groundwater at another drinking water pumping station in a quite different region of The Netherlands.
Potential pitfalls when denoising resting state fMRI data using nuisance regression.
Bright, Molly G; Tench, Christopher R; Murphy, Kevin
2017-07-01
In resting state fMRI, it is necessary to remove signal variance associated with noise sources, leaving cleaned fMRI time-series that more accurately reflect the underlying intrinsic brain fluctuations of interest. This is commonly achieved through nuisance regression, in which the fit is calculated of a noise model of head motion and physiological processes to the fMRI data in a General Linear Model, and the "cleaned" residuals of this fit are used in further analysis. We examine the statistical assumptions and requirements of the General Linear Model, and whether these are met during nuisance regression of resting state fMRI data. Using toy examples and real data we show how pre-whitening, temporal filtering and temporal shifting of regressors impact model fit. Based on our own observations, existing literature, and statistical theory, we make the following recommendations when employing nuisance regression: pre-whitening should be applied to achieve valid statistical inference of the noise model fit parameters; temporal filtering should be incorporated into the noise model to best account for changes in degrees of freedom; temporal shifting of regressors, although merited, should be achieved via optimisation and validation of a single temporal shift. We encourage all readers to make simple, practical changes to their fMRI denoising pipeline, and to regularly assess the appropriateness of the noise model used. By negotiating the potential pitfalls described in this paper, and by clearly reporting the details of nuisance regression in future manuscripts, we hope that the field will achieve more accurate and precise noise models for cleaning the resting state fMRI time-series. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
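The core nuisance-regression step is an ordinary GLM fit followed by taking residuals; a bare-bones sketch (regressors invented, and note that the pre-whitening the authors recommend is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200
y = rng.normal(size=T)                    # stand-in voxel time-series

# Noise model: intercept, six motion parameters, one physiological regressor
motion = rng.normal(size=(T, 6))
physio = np.sin(np.linspace(0, 20, T))[:, None]
X = np.column_stack([np.ones(T), motion, physio])

# GLM fit of the noise model; the residuals are the "cleaned" time-series.
# Note: valid inference on beta would additionally require the pre-whitening
# recommended in the paper, which this sketch omits.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
cleaned = y - X @ beta
print(cleaned.std())
```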
Fisz, Jacek J
2006-12-07
The optimization approach based on the genetic algorithm (GA) combined with the multiple linear regression (MLR) method is discussed. The GA-MLR optimizer is designed for nonlinear least-squares problems in which the model functions are linear combinations of nonlinear functions. GA optimizes the nonlinear parameters, and the linear parameters are calculated from MLR. GA-MLR is an intuitive optimization approach and it exploits all advantages of the genetic algorithm technique. This optimization method results from an appropriate combination of two well-known optimization methods. The MLR method is embedded in the GA optimizer, and linear and nonlinear model parameters are optimized in parallel. The MLR method is the only strictly mathematical "tool" involved in GA-MLR. The GA-MLR approach simplifies and accelerates the optimization process considerably because the linear parameters are not among the fitted ones. Its properties are exemplified by the analysis of a kinetic biexponential fluorescence decay surface corresponding to a two-excited-state interconversion process. A short discussion of the variable projection (VP) algorithm, designed for the same class of optimization problems, is presented. VP is a very advanced mathematical formalism that involves the methods of nonlinear functionals, the algebra of linear projectors, and the formalism of Fréchet derivatives and pseudo-inverses. Additional explanatory comments are added on the application of the recently introduced GA-NR optimizer to the simultaneous recovery of linear and weakly nonlinear parameters occurring in the same optimization problem together with nonlinear parameters. The GA-NR optimizer combines the GA method with the NR method, in which the minimum-value condition for the quadratic approximation to χ², obtained from the Taylor series expansion of χ², is recovered by means of the Newton-Raphson algorithm. The application of the GA-NR optimizer to model functions which are multi-linear combinations of nonlinear functions is indicated. The VP algorithm does not distinguish the weakly nonlinear parameters from the nonlinear ones, and it does not apply to model functions which are multi-linear combinations of nonlinear functions.
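The GA-MLR idea, an outer global search over the nonlinear parameters with the linear amplitudes solved exactly at each step, can be sketched with scipy's differential evolution standing in for the genetic algorithm (a biexponential decay with synthetic data, as in the paper's example):

```python
import numpy as np
from scipy.optimize import differential_evolution

rng = np.random.default_rng(0)
t = np.linspace(0, 10, 200)
y = 3.0 * np.exp(-0.7 * t) + 1.5 * np.exp(-0.15 * t) + 0.02 * rng.normal(size=200)

def chi2(rates):
    """Inner MLR step: for fixed nonlinear rates, amplitudes are solved linearly."""
    basis = np.exp(-np.outer(t, rates))          # columns: exp(-k1 t), exp(-k2 t)
    amps, *_ = np.linalg.lstsq(basis, y, rcond=None)
    return np.sum((y - basis @ amps) ** 2)

# Outer global search over the nonlinear parameters only (differential
# evolution standing in for the paper's genetic algorithm)
res = differential_evolution(chi2, bounds=[(0.01, 5.0), (0.01, 5.0)], seed=1)
print("recovered rates:", np.sort(res.x))
```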
2015-07-15
Long-term effects on cancer survivors' quality of life of physical training versus physical training combined with cognitive-behavioral therapy… Comparison of Neural Network and Linear Regression Models in Statistically Predicting Mental and Physical Health Status of Breast Cancer Survivors.
Prediction of the Main Engine Power of a New Container Ship at the Preliminary Design Stage
NASA Astrophysics Data System (ADS)
Cepowski, Tomasz
2017-06-01
The paper presents mathematical relationships for forecasting the main engine power of new container ships, based on data concerning vessels built in 2005-2015. The presented approximations allow the engine power to be estimated from the length between perpendiculars and the number of containers the ship will carry. The approximations were developed using simple linear regression and multivariate linear regression analysis. The presented relations have practical application for the estimation of container ship engine power in preliminary parametric ship design. It follows from the above that using multiple linear regression to predict the main engine power of a container ship gives more accurate results than simple linear regression.
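A sketch of the multivariate variant, P = b0 + b1·Lpp + b2·TEU, on an invented mini-fleet (the data and fitted coefficients are illustrative, not the paper's approximations):

```python
import numpy as np

# Invented fleet data: length between perpendiculars Lpp (m),
# container capacity (TEU), and installed main engine power (kW)
lpp = np.array([210.0, 245.0, 260.0, 290.0, 320.0, 350.0])
teu = np.array([2800.0, 4200.0, 5100.0, 6800.0, 9000.0, 12500.0])
power = np.array([21000.0, 28000.0, 33000.0, 42000.0, 55000.0, 68000.0])

# Multivariate linear regression: P = b0 + b1*Lpp + b2*TEU
X = np.column_stack([np.ones_like(lpp), lpp, teu])
b, *_ = np.linalg.lstsq(X, power, rcond=None)

# Preliminary-design estimate for a hypothetical new 280 m / 6000 TEU design
print(f"estimated power: {b @ np.array([1.0, 280.0, 6000.0]):.0f} kW")
```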
ERIC Educational Resources Information Center
Li, Deping; Oranje, Andreas
2007-01-01
Two versions of a general method for approximating standard error of regression effect estimates within an IRT-based latent regression model are compared. The general method is based on Binder's (1983) approach, accounting for complex samples and finite populations by Taylor series linearization. In contrast, the current National Assessment of…
Ernst, Anja F; Albers, Casper J
2017-01-01
Misconceptions about the assumptions behind the standard linear regression model are widespread and dangerous. These lead to using linear regression when inappropriate, and to employing alternative procedures with less statistical power when unnecessary. Our systematic literature review investigated employment and reporting of assumption checks in twelve clinical psychology journals. Findings indicate that normality of the variables themselves, rather than of the errors, was wrongfully held for a necessary assumption in 4% of papers that use regression. Furthermore, 92% of all papers using linear regression were unclear about their assumption checks, violating APA-recommendations. This paper appeals for a heightened awareness for and increased transparency in the reporting of statistical assumption checking.
Gardner, Lytt I.; Marks, Gary; Wilson, Tracey E.; Giordano, Thomas P.; Sullivan, Meg; Raper, James L.; Rodriguez, Allan E.; Keruly, Jeanne; Malitz, Faye
2016-01-01
We calculated the financial impact in 6 HIV clinics of a low-effort retention in care intervention involving brief motivational messages from providers, patient brochures, and posters. We used a linear regression model to calculate absolute changes in kept primary care visits from the preintervention year (2008–2009) to the intervention year (2009–2010). Revenue from patients’ insurance was also assessed by clinic. Kept visits improved significantly in the intervention year versus the preintervention year (P < 0.0001). We found a net-positive effect on clinic revenue of +$24,000/year for an average-size clinic (7400 scheduled visits/year). We encourage HIV clinic administrators to consider implementing this low-effort intervention. PMID:25559605
Kawalilak, C E; Lanovaz, J L; Johnston, J D; Kontulainen, S A
2014-09-01
To assess the linearity and sex-specificity of damping coefficients used in a single-damper model (SDM) when predicting impact forces during the worst-case falling scenario from fall heights up to 25 cm. Using 3-dimensional motion tracking and an integrated force plate, impact forces and impact velocities were assessed from 10 young adults (5 males; 5 females), falling from planted knees onto outstretched arms, from a random order of drop heights: 3, 5, 7, 10, 15, 20, and 25 cm. We assessed the linearity and sex-specificity between impact forces and impact velocities across all fall heights using an analysis of variance linearity test and linear regression, respectively. Significance was accepted at P<0.05. The association between impact forces and impact velocities up to 25 cm was linear (P=0.02). Damping coefficients appeared sex-specific (males: 627 Ns/m, R²=0.70; females: 421 Ns/m, R²=0.81; sexes combined: 532 Ns/m, R²=0.61). A linear damping coefficient used in the SDM proved valid for predicting impact forces from fall heights up to 25 cm. Results suggested the use of sex-specific damping coefficients when estimating impact force using the SDM and calculating the factor-of-risk for wrist fractures.
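Because the SDM relates force to velocity through a single damping coefficient, c is obtained by linear regression through the origin; a sketch with invented impact data:

```python
import numpy as np

# Invented impact data: impact velocity (m/s) and peak impact force (N)
v = np.array([0.55, 0.72, 0.88, 1.05, 1.31, 1.52, 1.71])
F = np.array([300.0, 390.0, 460.0, 570.0, 700.0, 820.0, 910.0])

# Single-damper model: F = c * v, so c comes from regression through the origin
c = np.sum(v * F) / np.sum(v ** 2)

# R^2 of the no-intercept fit against the observed forces
ss_res = np.sum((F - c * v) ** 2)
r2 = 1.0 - ss_res / np.sum((F - F.mean()) ** 2)
print(f"damping coefficient c = {c:.0f} Ns/m, R^2 = {r2:.2f}")
```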
Estimating linear temporal trends from aggregated environmental monitoring data
Erickson, Richard A.; Gray, Brian R.; Eager, Eric A.
2017-01-01
Trend estimates are often used as part of environmental monitoring programs. These trends inform managers (e.g., are desired species increasing or undesired species decreasing?). Data collected from environmental monitoring programs are often aggregated (i.e., averaged), which confounds sampling and process variation. State-space models allow sampling variation and process variation to be separated. We used simulated time-series to compare linear trend estimates from three state-space models, a simple linear regression model, and an autoregressive model. We also compared the performance of these five models in estimating trends from a long-term monitoring program, specifically for two species of fish and four species of aquatic vegetation from the Upper Mississippi River system. We found that the simple linear regression had the best performance of all the given models because it was best able to recover parameters and had consistent numerical convergence. Conversely, the simple linear regression did the worst job of estimating populations in a given year. The state-space models did not estimate trends well, but estimated population sizes best when the models converged. We found that a simple linear regression performed better than the more complex autoregressive and state-space models when used to analyze aggregated environmental monitoring data.
Linear and nonlinear spectroscopy from quantum master equations.
Fetherolf, Jonathan H; Berkelbach, Timothy C
2017-12-28
We investigate the accuracy of the second-order time-convolutionless (TCL2) quantum master equation for the calculation of linear and nonlinear spectroscopies of multichromophore systems. We show that even for systems with non-adiabatic coupling, the TCL2 master equation predicts linear absorption spectra that are accurate over an extremely broad range of parameters and well beyond what would be expected based on the perturbative nature of the approach; non-equilibrium population dynamics calculated with TCL2 for identical parameters are significantly less accurate. For third-order (two-dimensional) spectroscopy, the importance of population dynamics and the violation of the so-called quantum regression theorem degrade the accuracy of TCL2 dynamics. To correct these failures, we combine the TCL2 approach with a classical ensemble sampling of slow microscopic bath degrees of freedom, leading to an efficient hybrid quantum-classical scheme that displays excellent accuracy over a wide range of parameters. In the spectroscopic setting, the success of such a hybrid scheme can be understood through its separate treatment of homogeneous and inhomogeneous broadening. Importantly, the presented approach has the computational scaling of TCL2, with the modest addition of an embarrassingly parallel prefactor associated with ensemble sampling. The presented approach can be understood as a generalized inhomogeneous cumulant expansion technique, capable of treating multilevel systems with non-adiabatic dynamics.
Szekér, Szabolcs; Vathy-Fogarassy, Ágnes
2018-01-01
Logistic-regression-based propensity score matching is a widely used method in case-control studies to select the individuals of the control group. This method creates a suitable control group if all factors affecting the output variable are known. However, if relevant latent variables exist that are not taken into account during the calculations, the quality of the control group is uncertain. In this paper, we present a statistics-based study in which we try to determine the relationship between the accuracy of the logistic regression model and the uncertainty of the dependent variable of the control group defined by propensity score matching. Our analyses show that there is a linear correlation between the fit of the logistic regression model and the uncertainty of the output variable. In certain cases, a latent binary explanatory variable can result in a relative error of up to 70% in the prediction of the outcome variable. The observed phenomenon calls the attention of analysts to an important point, which must be taken into account when drawing conclusions.
NASA Astrophysics Data System (ADS)
Plegnière, Sabrina; Casper, Markus; Hecker, Benjamin; Müller-Fürstenberger, Georg
2014-05-01
Many models that calculate and assess climate change and its consequences are based on annual means of temperature and precipitation. This approach leads to considerable uncertainty, especially at the regional or local level, where results can be unrealistic or too coarse. Particularly in agriculture, single events and the distribution of precipitation and temperature during the growing season have an enormous influence on plant growth. Therefore, the temporal distribution of climate variables should not be ignored. To this end, a high-resolution ecological-economic model was developed that combines a complex plant growth model (STICS) with an economic model. In this setup, the input data of the plant growth model are daily climate values for a specific climate station, calculated by the statistical climate model WETTREG. The economic model is deduced from the results of the plant growth model STICS. The chosen crop is corn, because corn is widely cultivated and used in many different ways. First, a sensitivity analysis showed that the plant growth model STICS is suitable for realistically calculating the influences of different cultivation methods and climate on plant growth and yield, as well as on soil fertility (e.g., nitrate leaching). Additional simulations helped to assess a production function, which is the key element of the economic model. This work highlights the problems of using mean values of temperature and precipitation to compute a production function by linear regression. Several examples show why a linear regression to assess a production function based on mean climate values or a smoothed natural distribution leads to imperfect results, and why it is not possible to deduce a unique climate factor in the production function. One solution to this problem is the additional consideration of stress indices that reflect the impairment of plants by water or nitrate shortage. The resulting model thus takes into account not only ecological factors (e.g., plant growth) and economic factors as a simple monetary calculation, but also their mutual influences. Finally, the ecological-economic model enables risk assessment and the evaluation of adaptation strategies.
Modeling Laterality of the Globus Pallidus Internus in Patients With Parkinson's Disease.
Sharim, Justin; Yazdi, Daniel; Baohan, Amy; Behnke, Eric; Pouratian, Nader
2017-04-01
Neurosurgical interventions such as deep brain stimulation (DBS) surgery of the globus pallidus internus (GPi) play an important role in the treatment of medically refractory Parkinson's disease (PD) and require high targeting accuracy. Variability in the laterality of the GPi across patients with PD has not been well characterized. The aim of this report is to identify factors that may contribute to differences in position of the motor region of the GPi. The charts and operative reports of 101 PD patients following DBS surgery (70 males, aged 11-78 years), representing 201 GPi targets, were retrospectively reviewed. Data extracted for each subject included age, gender, anterior commissure-posterior commissure (AC-PC) distance, and third ventricular width. Multiple linear regression, stepwise regression, and relative importance of regressors analysis were performed to assess the ability of these variables to predict GPi laterality. Multiple linear regression of target laterality on third ventricular width, gender, AC-PC distance, and age was significant, with normalized regression coefficients of 0.333 (p < 0.0001), 0.206 (p = 0.00219), 0.168 (p = 0.0119), and 0.159 (p = 0.0136), respectively. Third ventricular width, gender, AC-PC distance, and age accounted for 44.06% (21.38-65.69%, 95% CI), 20.82% (10.51-35.88%), 21.46% (8.28-37.05%), and 13.66% (2.62-28.64%) of the R² value, respectively. Effect size calculations were significant for a change in GPi laterality of 0.19 mm per mm of ventricular width, 0.11 mm per mm of AC-PC distance, 0.017 mm per year of age, and a 0.54-mm increase for male gender. This variability highlights the limitations of indirect targeting alone and argues for the continued use of MRI as well as intraoperative physiological testing to account for factors that contribute to patient-specific variability in GPi localization. © 2016 International Neuromodulation Society.
Lin, Zhaozhou; Zhang, Qiao; Liu, Ruixin; Gao, Xiaojie; Zhang, Lu; Kang, Bingya; Shi, Junhan; Wu, Zidan; Gui, Xinjing; Li, Xuelin
2016-01-01
To accurately, safely, and efficiently evaluate the bitterness of Traditional Chinese Medicines (TCMs), a robust predictor was developed using the robust partial least squares (RPLS) regression method based on data obtained from an electronic tongue (e-tongue) system. Data quality was verified by Grubbs' test. Moreover, potential outliers were detected based on both the standardized residual and the score distance calculated for each sample. The performance of RPLS on the dataset before and after outlier detection was compared with other state-of-the-art methods, including multivariate linear regression, least squares support vector machine, and plain partial least squares regression. Both R² and the root-mean-square error of cross-validation (RMSECV) were recorded for each model. With four latent variables, a robust RMSECV value of 0.3916, with bitterness values ranging from 0.63 to 4.78, was obtained for the RPLS model constructed on the dataset including outliers. Meanwhile, the RMSECV calculated for the models constructed by the other methods was larger than that of the RPLS model. After six outliers were excluded, the performance of all benchmark methods improved markedly, but the difference between the RPLS models constructed before and after outlier exclusion was negligible. In conclusion, the bitterness of TCM decoctions can be accurately evaluated with an RPLS model constructed using e-tongue data.
Zhang, Fang; Wagner, Anita K; Soumerai, Stephen B; Ross-Degnan, Dennis
2009-02-01
Interrupted time series (ITS) is a strong quasi-experimental research design, which is increasingly applied to estimate the effects of health services and policy interventions. We describe and illustrate two methods for estimating confidence intervals (CIs) around absolute and relative changes in outcomes calculated from segmented regression parameter estimates. We used the multivariate delta method (MDM) and the bootstrapping method (BM) to construct CIs around relative changes in level and trend, and around absolute changes in outcome, based on segmented linear regression analyses of time series data corrected for autocorrelated errors. Using previously published time series data, we estimated CIs around the effect of prescription alerts for interacting medications with warfarin on the rate of prescriptions per 10,000 warfarin users per month. The MDM and the BM produced similar results. The BM is preferred for calculating CIs of relative changes in outcomes of time series studies because it does not require large sample sizes when parameter estimates are obtained correctly from the model. Caution is needed when the sample size is small.
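A minimal sketch of the bootstrapping approach for a segmented (ITS) regression follows, computing a percentile CI around the relative change in level. It uses simulated monthly rates and ordinary least squares; the published analysis additionally corrected for autocorrelated errors, which this sketch omits.

```python
import numpy as np

rng = np.random.default_rng(7)
n, t0 = 48, 24                                   # 48 months, intervention at month 24
t = np.arange(n)
post = (t >= t0).astype(float)
X = np.column_stack([np.ones(n), t, post, (t - t0) * post])
y = X @ np.array([50.0, 0.2, -8.0, -0.3]) + rng.normal(0, 2, n)   # simulated rates

def rel_level_change(y):
    """Fit the segmented model and return (beta, relative change in level at t0)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    counterfactual = beta[0] + beta[1] * t0      # level predicted without intervention
    return beta, beta[2] / counterfactual

beta, rel = rel_level_change(y)
fitted = X @ beta
resid = y - fitted
# Residual bootstrap for a percentile CI on the relative level change
boot = [rel_level_change(fitted + rng.choice(resid, n, replace=True))[1]
        for _ in range(2000)]
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"relative level change {rel:.1%} (95% CI {lo:.1%} to {hi:.1%})")
```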
The effect of clouds on the earth's radiation budget
NASA Technical Reports Server (NTRS)
Ziskin, Daniel; Strobel, Darrell F.
1991-01-01
The radiative fluxes from the Earth Radiation Budget Experiment (ERBE) and the cloud properties from the International Satellite Cloud Climatology Project (ISCCP) over Indonesia for the months of June and July of 1985 and 1986 were analyzed to determine cloud sensitivity coefficients. The method involved a linear least-squares regression between coincident flux and cloud coverage measurements; the calculated slope is identified as the cloud sensitivity. It was found that the correlations between total cloud fraction and radiation parameters were modest. However, correlations between cloud fraction and IR flux were improved by separating clouds by height. Likewise, correlations between visible flux and cloud fraction were improved by distinguishing clouds based on optical depth. Correlations between the net fluxes and cloud fractions segregated by either height or optical depth were somewhat improved. When clouds were classified in terms of both their height and optical depth, correlations among all the radiation components improved. Mean cloud sensitivities based on the regression of radiative fluxes against height- and optical-depth-separated cloud types are presented. Results are compared to a one-dimensional radiation model with a simple cloud parameterization scheme.
Correlation and simple linear regression.
Zou, Kelly H; Tuncali, Kemal; Silverman, Stuart G
2003-06-01
In this tutorial article, the concepts of correlation and regression are reviewed and demonstrated. The authors review and compare two correlation coefficients, the Pearson correlation coefficient and the Spearman rho, for measuring linear and nonlinear relationships between two continuous variables. In the case of measuring the linear relationship between a predictor and an outcome variable, simple linear regression analysis is conducted. These statistical concepts are illustrated by using a data set from published literature to assess a computed tomography-guided interventional technique. These statistical methods are important for exploring the relationships between variables and can be applied to many radiologic studies.
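The workflow the tutorial describes maps directly onto standard library calls; here is a minimal sketch with illustrative data.

```python
import numpy as np
from scipy import stats

x = np.array([1.2, 2.0, 2.8, 3.5, 4.1, 5.0, 6.3, 7.1])   # predictor, illustrative
y = np.array([2.3, 3.1, 4.0, 4.2, 5.5, 5.9, 7.4, 8.0])   # outcome, illustrative

r, p_r = stats.pearsonr(x, y)         # linear association (assumes normality)
rho, p_rho = stats.spearmanr(x, y)    # rank (monotonic) association
res = stats.linregress(x, y)          # least-squares line y = a + b*x

print(f"Pearson r = {r:.3f} (P = {p_r:.4f}), Spearman rho = {rho:.3f}")
print(f"y = {res.intercept:.2f} + {res.slope:.2f} x, R^2 = {res.rvalue**2:.3f}")
```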
Georgopoulos, Michael; Zehetmayer, Martin; Ruhswurm, Irene; Toma-Bstaendig, Sabine; Ségur-Eltz, Nikolaus; Sacu, Stefan; Menapace, Rupert
2003-01-01
This study assesses differences in relative tumour regression and internal acoustic reflectivity after 3 methods of radiotherapy for uveal melanoma: (1) brachytherapy with ruthenium-106 radioactive plaques (RU), (2) fractionated high-dose gamma knife stereotactic irradiation in 2-3 fractions (GK) or (3) fractionated linear-accelerator-based stereotactic teletherapy in 5 fractions (Linac). Ultrasound measurements of tumour thickness and internal reflectivity were performed with standardised A scan pre-operatively and 3, 6, 9, 12, 18, 24 and 36 months postoperatively. Of 211 patients included in the study, 111 had a complete 3-year follow-up (RU: 41, GK: 37, Linac: 33). Differences in tumour thickness and internal reflectivity were assessed with analysis of variance, and post hoc multiple comparisons were calculated with Tukey's honestly significant difference test. Local tumour control was excellent with all 3 methods (>93%). At 36 months, relative tumour height reduction was 69, 50 and 30% after RU, GK and Linac, respectively. In all 3 treatment groups, internal reflectivity increased from about 30% initially to 60-70% 3 years after treatment. Brachytherapy with ruthenium-106 plaques results in a faster tumour regression as compared to teletherapy with gamma knife or Linac. Internal reflectivity increases comparably in all 3 groups. Besides tumour growth arrest, increasing internal reflectivity is considered as an important factor indicating successful treatment. Copyright 2003 S. Karger AG, Basel
NASA Astrophysics Data System (ADS)
Arantes Camargo, Livia; Marques, José, Jr.
2015-04-01
The prediction of erodibility using indirect methods such as diffuse reflectance spectroscopy could facilitate the characterization of spatial variability over large areas and optimize the implementation of conservation practices. The aim of this study was to evaluate the prediction of interrill erodibility (Ki) and rill erodibility (Kr) from iron oxide content and soil color using multiple linear regression, and from diffuse reflectance spectroscopy (DRS) using partial least-squares regression (PLSR). The soils were collected from three geomorphic surfaces, analyzed for chemical, physical, and mineralogical properties, and scanned in the visible and infrared spectral ranges. Maps of the spatial distribution of Ki and Kr were built by geostatistics from values calculated with the most accurate calibrated models. Interrill and rill erodibility showed negative correlations with iron extracted by dithionite-citrate-bicarbonate, hematite, and chroma, confirming the influence of iron oxides on soil structural stability. Hematite and hue were the attributes that contributed most to the multiple linear regression calibration models for predicting Ki (R² = 0.55) and Kr (R² = 0.53). Diffuse reflectance spectroscopy via PLSR predicted interrill and rill erodibility with high accuracy (R²adj = 0.76 and 0.81, respectively; RPD > 2.0) in the visible range of the spectrum (380-800 nm) and allowed the spatial variability of these attributes to be characterized by geostatistics.
Bebbington, Emily; Furniss, Dominic
2015-02-01
We integrated two factors, demographic population shifts and changes in the prevalence of disease, to predict future trends in demand for hand surgery in England, to facilitate workforce planning. We analysed Hospital Episode Statistics data for Dupuytren's disease, carpal tunnel syndrome, cubital tunnel syndrome, and trigger finger from 1998 to 2011. Using linear regression, we estimated trends in both diagnosis and surgery until 2030. We integrated this regression with age-specific population data from the Office for National Statistics in order to estimate how this will contribute to a change in workload over time. There has been a significant increase in both the absolute number of diagnoses and surgery for all four conditions. Combined with future population data, we calculate that the total operative burden for these four conditions will increase from 87,582 procedures in 2011 to 170,166 (95% confidence interval 144,517-195,353) in 2030. The prevalence of these diseases in the ageing population, and the increasing prevalence of predisposing factors such as obesity and diabetes, may account for the predicted increase in workload. The most cost-effective treatments must be sought, which requires high-quality clinical trials. Our methodology can be applied to other subspecialties to help anticipate the need for future service provision. Copyright © 2014 British Association of Plastic, Reconstructive and Aesthetic Surgeons. Published by Elsevier Ltd. All rights reserved.
Nilsson, Lars B; Skansen, Patrik
2012-06-30
The investigations in this article were triggered by two observations in the laboratory; for some liquid chromatography/tandem mass spectrometry (LC/MS/MS) systems it was possible to obtain linear calibration curves for extreme concentration ranges and for some systems seemingly linear calibration curves gave good accuracy at low concentrations only when using a quadratic regression function. The absolute and relative responses were tested for three different LC/MS/MS systems by injecting solutions of a model compound and a stable isotope labeled internal standard. The analyte concentration range for the solutions was 0.00391 to 500 μM (128,000×), giving overload of the chromatographic column at the highest concentrations. The stable isotope labeled internal standard concentration was 0.667 μM in all samples. The absolute response per concentration unit decreased rapidly as higher concentrations were injected. The relative response, the ratio for the analyte peak area to the internal standard peak area, per concentration unit was calculated. For system 1, the ionization process was found to limit the response and the relative response per concentration unit was constant. For systems 2 and 3, the ion detection process was the limiting factor resulting in decreasing relative response at increasing concentrations. For systems behaving like system 1, simple linear regression can be used for any concentration range while, for systems behaving like systems 2 and 3, non-linear regression is recommended for all concentration ranges. Another consequence is that the ionization capacity limited systems will be insensitive to matrix ion suppression when an ideal internal standard is used while the detection capacity limited systems are at risk of giving erroneous results at high concentrations if the matrix ion suppression varies for different samples in a run. Copyright © 2012 John Wiley & Sons, Ltd.
Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L
2018-02-01
A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R²), using R² as the primary metric of assay agreement. However, the use of R² alone does not adequately quantify the constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing (NGS) assays. NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. The Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of the performance characteristics of quantitative molecular assays prior to implementation in the clinical molecular laboratory. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
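A minimal sketch of the two complementary analyses, Bland-Altman agreement limits and Deming regression, is below. The paired values are illustrative, and the Deming fit assumes an error-variance ratio of 1 (orthogonal regression), which is an assumption here, not the paper's setting.

```python
import numpy as np

x = np.array([5.1, 10.3, 20.2, 39.8, 80.5, 160.1])   # reference assay, illustrative
y = np.array([5.6, 10.9, 21.5, 41.0, 83.9, 166.0])   # assay in validation, illustrative

# Bland-Altman: bias and 95% limits of agreement
diff = y - x
bias, sd = diff.mean(), diff.std(ddof=1)
print(f"bias {bias:.2f}, limits of agreement {bias - 1.96*sd:.2f} to {bias + 1.96*sd:.2f}")

# Deming regression with lambda = 1 (equal error variances in x and y)
lam = 1.0
sxx, syy = np.var(x, ddof=1), np.var(y, ddof=1)
sxy = np.cov(x, y, ddof=1)[0, 1]
slope = (syy - lam*sxx + np.sqrt((syy - lam*sxx)**2 + 4*lam*sxy**2)) / (2*sxy)
intercept = y.mean() - slope * x.mean()
print(f"Deming fit: y = {intercept:.2f} + {slope:.3f} x   (proportional/constant error)")
```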
2017-10-01
Grain Evaluation Software to Numerically Predict Linear Burn Regression for Solid Propellant Grain Geometries (technical report, U.S. Army Armament Research, Development and Engineering Center).
Association of dentine hypersensitivity with different risk factors - a cross sectional study.
Vijaya, V; Sanjay, Venkataraam; Varghese, Rana K; Ravuri, Rajyalakshmi; Agarwal, Anil
2013-12-01
This study was done to assess the prevalence of dentine hypersensitivity (DH) and its associated risk factors. This epidemiological study was carried out among patients attending a dental college. A self-structured questionnaire along with clinical examination was used for assessment. Descriptive statistics were obtained and frequency distributions were compared using the Chi-square test at p < 0.05. Stepwise multiple linear regression was also done to assess the frequency of DH against different factors. The study population comprised 655 participants of different age groups. The prevalence of DH was 55%, and it was more common among males. Similarly, smokers and those who used a hard toothbrush had more cases of DH. Stepwise multiple linear regression showed that the best predictor of DH was age, followed by smoking habit and type of toothbrush. The most common aggravating factors were cold water (15.4%) and sweet foods (14.7%), whereas only 5% of the patients experienced DH while brushing. A high level of dentine hypersensitivity was found in this study, more common among males, with a linear association with age, smoking, and type of toothbrush.
Geszke-Moritz, Małgorzata; Moritz, Michał
2016-12-01
The present study deals with the adsorption of boldine onto pure and propyl-sulfonic acid-functionalized SBA-15, SBA-16 and mesocellular foam (MCF) materials. Siliceous adsorbents were characterized by nitrogen sorption analysis, transmission electron microscopy (TEM), scanning electron microscopy (SEM), Fourier-transform infrared (FT-IR) spectroscopy and thermogravimetric analysis. The equilibrium adsorption data were analyzed using the Langmuir, Freundlich, Redlich-Peterson, and Temkin isotherms. Moreover, the Dubinin-Radushkevich and Dubinin-Astakhov isotherm models based on the Polanyi adsorption potential were employed; the latter was calculated using two alternative formulas, a solubility-normalized S-model and an empirical C-model. In order to find the best-fit isotherm, both linear regression and nonlinear fitting analyses were carried out. The Dubinin-Astakhov (S-model) isotherm showed the best fit to the experimental points for adsorption of boldine onto the pure mesoporous materials using both linear and nonlinear fitting. Meanwhile, the sorption of boldine onto the modified silicas was best described by the Langmuir and Temkin isotherms using linear regression and nonlinear fitting, respectively. The values of adsorption energy (below 8 kJ/mol) indicate the physical nature of boldine adsorption onto the unmodified silicas, whereas ionic interactions appear to be the main force of alkaloid adsorption onto the functionalized sorbents (adsorption energy above 8 kJ/mol). Copyright © 2016 Elsevier B.V. All rights reserved.
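To illustrate the linear-versus-nonlinear fitting comparison the study performs, here is a minimal sketch using the Langmuir isotherm as the example; the concentrations and uptakes are illustrative, not the boldine data.

```python
import numpy as np
from scipy import stats
from scipy.optimize import curve_fit

Ce = np.array([2, 5, 10, 20, 40, 80.0])     # equilibrium concentration (mg/L), illustrative
qe = np.array([11, 22, 35, 48, 58, 64.0])   # adsorbed amount (mg/g), illustrative

# Linearized Langmuir: Ce/qe = 1/(qm*KL) + Ce/qm, fitted by ordinary linear regression
res = stats.linregress(Ce, Ce / qe)
qm_lin, KL_lin = 1 / res.slope, res.slope / res.intercept

# Nonlinear fit of qe = qm*KL*Ce / (1 + KL*Ce)
langmuir = lambda C, qm, KL: qm * KL * C / (1 + KL * C)
(qm_nl, KL_nl), _ = curve_fit(langmuir, Ce, qe, p0=(70, 0.05))

print(f"linearized: qm = {qm_lin:.1f}, KL = {KL_lin:.3f}")
print(f"nonlinear:  qm = {qm_nl:.1f}, KL = {KL_nl:.3f}")
```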
Sandhu, Rupninder; Chollet-Hinton, Lynn; Kirk, Erin L; Midkiff, Bentley; Troester, Melissa A
2016-02-01
Complete age-related regression of mammary epithelium, often termed postmenopausal involution, is associated with decreased breast cancer risk. However, most studies have qualitatively assessed involution. We quantitatively analyzed epithelium, stroma, and adipose tissue from histologically normal breast tissue of 454 patients in the Normal Breast Study. High-resolution digital images of normal breast hematoxylin and eosin-stained slides were partitioned into epithelium, adipose tissue, and nonfatty stroma. Percentage area and nuclei per unit area (nuclear density) were calculated for each component. Quantitative data were evaluated in association with age using linear regression and cubic spline models. Stromal area decreased (P = 0.0002), and adipose tissue area increased (P < 0.0001), with an approximate 0.7% change in area for each component, until age 55 years when these area measures reached a steady state. Although epithelial area did not show linear changes with age, epithelial nuclear density decreased linearly beginning in the third decade of life. No significant age-related trends were observed for stromal or adipose nuclear density. Digital image analysis offers a high-throughput method for quantitatively measuring tissue morphometry and for objectively assessing age-related changes in adipose tissue, stroma, and epithelium. Epithelial nuclear density is a quantitative measure of age-related breast involution that begins to decline in the early premenopausal period. Copyright © 2015 Elsevier Inc. All rights reserved.
Linear regression in astronomy. II
NASA Technical Reports Server (NTRS)
Feigelson, Eric D.; Babu, Gutti J.
1992-01-01
A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.
Granato, Gregory E.
2012-01-01
A nationwide study to better define triangular-hydrograph statistics for use with runoff-quality and flood-flow studies was done by the U.S. Geological Survey (USGS) in cooperation with the Federal Highway Administration. Although the triangular hydrograph is a simple linear approximation, the cumulative distribution of stormflow with a triangular hydrograph is a curvilinear S-curve that closely approximates the cumulative distribution of stormflows from measured data. The temporal distribution of flow within a runoff event can be estimated using the basin lagtime (which is the time from the centroid of rainfall excess to the centroid of the corresponding runoff hydrograph) and the hydrograph recession ratio (which is the ratio of the duration of the falling limb to the rising limb of the hydrograph). This report documents results of the study, methods used to estimate the variables, and electronic files that facilitate calculation of variables. Ten viable multiple linear regression equations were developed to estimate basin lagtimes from readily determined drainage basin properties using data published in 37 stormflow studies. Regression equations using the basin lag factor (BLF, which is a variable calculated as the main-channel length, in miles, divided by the square root of the main-channel slope in feet per mile) and two variables describing development in the drainage basin were selected as the best candidates, because each equation explains about 70 percent of the variability in the data. The variables describing development are the USGS basin development factor (BDF, which is a function of the amount of channel modifications, storm sewers, and curb-and-gutter streets in a basin) and the total impervious area variable (IMPERV) in the basin. Two datasets were used to develop regression equations. The primary dataset included data from 493 sites that have values for the BLF, BDF, and IMPERV variables. This dataset was used to develop the best-fit regression equation using the BLF and BDF variables. The secondary dataset included data from 896 sites that have values for the BLF and IMPERV variables. This dataset was used to develop the best-fit regression equation using the BLF and IMPERV variables. Analysis of hydrograph recession ratios and basin characteristics for 41 sites indicated that recession ratios are random variables. Thus, recession ratios cannot be estimated quantitatively using multiple linear regression equations developed using the data available for these sites. The minimum recession ratios for different streamgages are well characterized by a value of one. The most probable values and maximum values of recession ratios for different streamgages are, however, more variable than the minimums. The most probable values of recession ratios for the 41 streamgages analyzed ranged from 1.0 to 3.52 and had a median of 1.85. The maximum values ranged from 2.66 to 11.3 and had a median of 4.36.
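A minimal sketch of the triangular-hydrograph geometry and its curvilinear cumulative S-curve follows. The time-to-peak and recession ratio are assumed inputs (the ratio of 1.85 echoes the study's median most-probable value), and flows are normalized to unit runoff volume.

```python
import numpy as np

def triangular_hydrograph(Tr, r, n=200):
    """Triangular hydrograph: rising limb Tr, falling limb r*Tr, unit volume."""
    Tb = Tr * (1 + r)                 # time base
    qp = 2.0 / Tb                     # peak flow for unit runoff volume
    t = np.linspace(0, Tb, n)
    q = np.where(t <= Tr, qp * t / Tr, qp * (Tb - t) / (r * Tr))
    # Cumulative fraction of runoff volume: piecewise quadratic, hence the S-curve
    cum = np.where(t <= Tr, t**2 / (Tr * Tb), 1 - (Tb - t)**2 / (r * Tr * Tb))
    return t, q, cum

t, q, cum = triangular_hydrograph(Tr=2.0, r=1.85)    # hours; illustrative inputs
half = t[np.searchsorted(cum, 0.5)]
print(f"half the runoff volume has passed by t = {half:.2f} h")
```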
Simplified solution for point contact deformation between two elastic solids
NASA Technical Reports Server (NTRS)
Brewe, D. E.; Hamrock, B. J.
1976-01-01
A linear regression by the method of least squares is made on the geometric variables that occur in the equation for point contact deformation. The ellipticity and the complete elliptic integrals of the first and second kind are expressed as functions of the x,y-plane principal radii. The ellipticity was varied from 1 (circular contact) to 10 (a configuration approaching line contact). These simplified equations enable one to easily calculate the point-contact deformation to within 3 percent without resorting to charts or numerical methods.
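For illustration, the following sketch implements curve fits of the kind the paper describes; the constants are those commonly attributed to Brewe and Hamrock for the ellipticity and the complete elliptic integrals as functions of the plane-radius ratio, and should be treated as assumptions rather than the paper's authoritative values.

```python
import numpy as np

def elliptic_contact_approx(Rx, Ry):
    """Approximate ellipticity k and elliptic integrals for Ry/Rx >= 1.

    Constants follow the curve fits commonly attributed to Brewe and Hamrock;
    treat them as assumptions and consult the paper for the exact forms.
    """
    a = Ry / Rx                              # radius ratio
    k = 1.0339 * a**0.636                    # ellipticity parameter
    E = 1.0003 + 0.5968 / a                  # ~ complete elliptic integral, 2nd kind
    F = 1.5277 + 0.6023 * np.log(a)          # ~ complete elliptic integral, 1st kind
    return k, E, F

k, E, F = elliptic_contact_approx(Rx=0.01, Ry=0.05)  # radii in meters, illustrative
print(f"k = {k:.3f}, E = {E:.4f}, F = {F:.4f}")
```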
Calculating the Solubilities of Drugs and Drug-Like Compounds in Octanol.
Alantary, Doaa; Yalkowsky, Samuel
2016-09-01
A modification of the Van't Hoff equation is used to predict the solubility of organic compounds in dry octanol. The new equation describes a linear relationship between the logarithm of the solubility of a solute in octanol and its melting temperature. More than 620 experimentally measured octanol solubilities, collected from the literature, are used to validate the equation without using any regression or fitting. The average absolute error of the prediction is 0.66 log units. Copyright © 2016 American Pharmacists Association®. Published by Elsevier Inc. All rights reserved.
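A minimal sketch of a linear log-solubility/melting-point relation of the form described is below. The coefficients 0.5 and -0.01 follow the general-solubility-equation convention associated with this author's earlier work and are assumptions here, not necessarily the paper's fitted values.

```python
def log_s_octanol(mp_celsius):
    """Estimate log10 molar solubility in dry octanol from melting point.

    Coefficients (0.5, -0.01 per degree above 25 C) are GSE-style assumptions,
    not necessarily the paper's values. Room-temperature liquids use MP = 25 C.
    """
    mp = max(mp_celsius, 25.0)
    return 0.5 - 0.01 * (mp - 25.0)

for name, mp in [("naphthalene", 80.2), ("ibuprofen", 76.0), ("liquid solute", 20.0)]:
    print(f"{name}: predicted log S_oct = {log_s_octanol(mp):+.2f}")
```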
A Constrained Linear Estimator for Multiple Regression
ERIC Educational Resources Information Center
Davis-Stober, Clintin P.; Dana, Jason; Budescu, David V.
2010-01-01
"Improper linear models" (see Dawes, Am. Psychol. 34:571-582, "1979"), such as equal weighting, have garnered interest as alternatives to standard regression models. We analyze the general circumstances under which these models perform well by recasting a class of "improper" linear models as "proper" statistical models with a single predictor. We…
Carsin-Vu, Aline; Corouge, Isabelle; Commowick, Olivier; Bouzillé, Guillaume; Barillot, Christian; Ferré, Jean-Christophe; Proisy, Maia
2018-04-01
To investigate changes in cerebral blood flow (CBF) in gray matter (GM) between 6 months and 15 years of age, and to provide CBF values for the brain, GM, white matter (WM), hemispheres, and lobes. Between 2013 and 2016, we retrospectively included all clinical MRI examinations with arterial spin labeling (ASL), excluding subjects with a condition potentially affecting brain perfusion. For each subject, mean CBF values in the brain, GM, WM, hemispheres, and lobes were calculated. GM CBF was fitted against age using linear, quadratic, and cubic polynomial regression. Regression models were compared with Akaike's information criterion (AIC) and likelihood ratio tests. 84 children were included (44 females/40 males). Mean CBF values were 64.2 ± 13.8 mL/100 g/min in GM and 29.3 ± 10.0 mL/100 g/min in WM. The best-fit model of brain perfusion was the cubic polynomial function (AIC = 672.7, versus AIC = 673.9 for the negative linear function and AIC = 674.1 for the quadratic polynomial function). However, likelihood ratio tests did not demonstrate statistically significant superiority of the quadratic (p = 0.18) or cubic polynomial model (p = 0.06) over the negative linear regression model. No effect of general anesthesia (p = 0.34) or of gender (p = 0.16) was found. We provide values for ASL CBF in the brain, GM, WM, hemispheres, and lobes over a wide pediatric age range, showing approximately inverted-U-shaped changes in GM perfusion over the course of childhood. Copyright © 2018 Elsevier B.V. All rights reserved.
Zhang, Xu; Wang, Dongqing; Yu, Zaiyang; Chen, Xiang; Li, Sheng; Zhou, Ping
2017-11-01
This study examines the electromyogram (EMG)-torque relation in chronic stroke survivors using a novel EMG complexity representation. Ten stroke subjects performed a series of submaximal isometric elbow flexion tasks with their affected and contralateral arms, respectively, while a 20-channel linear electrode array was used to record surface EMG from the biceps brachii muscles. The sample entropy (SampEn) of the surface EMG signals was calculated with both global and local tolerance schemes. A regression analysis was performed between the SampEn of each channel's surface EMG and elbow flexion torque. It was found that a linear regression describes the relation between surface EMG SampEn and torque well. Each channel's root mean square (RMS) surface EMG amplitude at the different torque levels was computed to determine the channel with the highest EMG amplitude. The slope of the regression (observed from the channel with the highest EMG amplitude) was smaller on the impaired side than on the nonimpaired side in 8 of the 10 subjects, regardless of the tolerance scheme (global or local) and the range of torques (full or matched range) used for comparison. The surface EMG signals from channels above the estimated muscle innervation zones demonstrated significantly lower complexity than channels between the innervation zones and the muscle tendons. The study provides a novel view of the EMG-torque relation in the complexity domain and reveals its alterations post stroke, which are associated with complex neural and muscular changes after stroke. The slope difference between channels with regard to innervation zones also confirms the relevance of electrode position in surface EMG analysis.
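A minimal sketch of the pipeline, sample entropy with a global tolerance (r scaled by the whole signal's SD) regressed linearly against torque, is below. The "EMG" signals are surrogate filtered noise whose smoothness varies with torque, purely for illustration.

```python
import numpy as np

def sampen(x, m=2, r=0.2):
    """Sample entropy with global tolerance r * std(x) (Chebyshev distance)."""
    x = np.asarray(x, float)
    tol = r * x.std()
    def matches(mm):
        tpl = np.lib.stride_tricks.sliding_window_view(x, mm)
        return sum(int(np.sum(np.max(np.abs(tpl[i + 1:] - tpl[i]), axis=1) <= tol))
                   for i in range(len(tpl) - 1))
    return -np.log(matches(m + 1) / matches(m))

rng = np.random.default_rng(0)
torques = np.array([10, 20, 30, 40, 50.0])   # % of maximum voluntary torque, illustrative
# Surrogate "EMG": noise smoothed less as torque grows, so complexity rises with torque
signals = [np.convolve(rng.normal(size=2048), np.ones(k) / k, mode="valid")
           for k in (10, 8, 6, 4, 2)]
entropies = [sampen(s) for s in signals]

slope, intercept = np.polyfit(torques, entropies, 1)   # linear SampEn-torque fit
print(f"SampEn ~ {intercept:.3f} + {slope:.5f} * torque")
```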
Murata, Hiroshi; Araie, Makoto; Asaoka, Ryo
2014-11-20
We generated a variational Bayes model to predict visual field (VF) progression in glaucoma patients. This retrospective study included VF series from 911 eyes of 547 glaucoma patients as test data, and VF series from 5049 eyes of 2858 glaucoma patients as training data. Using the training data, variational Bayes linear regression (VBLR) was created to predict VF progression. The performance of VBLR was compared against ordinary least-squares linear regression (OLSLR) by predicting VFs in the test dataset. The total deviation (TD) values of test patients' 11th VFs were predicted using TD values from their second to 10th VFs (VF2-10); the root mean squared error (RMSE) associated with each approach was then calculated. Similarly, the mean TD (mTD) of test patients' 11th VFs was predicted using VBLR and OLSLR, and the absolute prediction errors were compared. The RMSE resulting from VBLR averaged 3.9 ± 2.1 (SD) and 4.9 ± 2.6 dB for prediction based on the second to 10th VFs (VF2-10) and the second to fourth VFs (VF2-4), respectively. The RMSE resulting from OLSLR was 4.1 ± 2.0 (VF2-10) and 19.9 ± 12.0 (VF2-4) dB. The absolute prediction error (SD) for mTD using VBLR was 1.2 ± 1.3 (VF2-10) and 1.9 ± 2.0 (VF2-4) dB, while the prediction error resulting from OLSLR was 1.2 ± 1.3 (VF2-10) and 6.2 ± 6.6 (VF2-4) dB. VBLR more accurately predicts future VF progression in glaucoma patients compared to conventional OLSLR, especially in short VF series. © ARVO.
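To illustrate the core idea (Bayesian shrinkage helps most when the series is short), the sketch below contrasts OLS with a Bayesian linear regression, using scikit-learn's BayesianRidge as a convenient stand-in for the paper's variational Bayes model; this substitution and the simulated values are assumptions.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge, LinearRegression

rng = np.random.default_rng(3)
t = np.arange(9).reshape(-1, 1)                  # tests 0..8 (a VF2-10 analogue)
td = -0.5 * t.ravel() + rng.normal(0, 1.5, 9)    # mean total deviation (dB), simulated

short = slice(0, 3)                              # only the first three tests (VF2-4 analogue)
for name, model in [("OLS", LinearRegression()), ("BayesianRidge", BayesianRidge())]:
    model.fit(t[short], td[short])
    pred = model.predict(np.array([[10]]))[0]    # extrapolate to the 11th test
    print(f"{name}: predicted mTD at test 11 = {pred:.1f} dB")
```

With only three noisy points, the OLS slope can swing wildly, while the Bayesian fit shrinks toward a flatter, better-behaved extrapolation, mirroring the large VF2-4 error gap the abstract reports.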
NASA Astrophysics Data System (ADS)
Dias, L. G.; Shimizu, K.; Farah, J. P. S.; Chaimovich, H.
2002-09-01
We propose and demonstrate the usefulness of a method, defined as the generalized Born electronegativity equalization method (GBEEM), to estimate solvent-induced charge redistribution. The charges obtained by GBEEM, in a representative series of small organic molecules, were compared to PM3-CM1 charges in vacuum and in water. Linear regressions with appropriate correlation coefficients and standard deviations between the GBEEM and PM3-CM1 methods were obtained (R = 0.94, SD = 0.15, F = 234, N = 32 in vacuum; R = 0.94, SD = 0.16, F = 218, N = 29 in water). In order to test the GBEEM response when intermolecular interactions are involved, we calculated a water dimer in dielectric water using both GBEEM and PM3-CM1, and the results were similar. Hence, the method developed here is comparable to established calculation methods.
On the design of classifiers for crop inventories
NASA Technical Reports Server (NTRS)
Heydorn, R. P.; Takacs, H. C.
1986-01-01
Crop proportion estimators that use classifications of satellite data to correct, in an additive way, a given estimate acquired from ground observations are discussed. A linear version of these estimators is optimal, in terms of minimum variance, when the regression of the ground observations onto the satellite observations is linear. When this regression is not linear, but the reverse regression (satellite observations onto ground observations) is linear, the estimator is suboptimal but still has certain appealing variance properties. In this paper, expressions are derived for those regressions that relate the intercepts and slopes to conditional classification probabilities. These expressions are then used to discuss the question of classifier designs that can lead to low-variance crop proportion estimates. Variance expressions for these estimates in terms of classifier omission and commission errors are also derived.
Fought, Ellie L; Sundriyal, Vaibhav; Sosonkina, Masha; Windus, Theresa L
2017-04-30
In this work, the effect of oversubscription (calling 2n, 3n, or 4n processes for n physical cores) is evaluated on semi-direct MP2 energy and gradient calculations and RI-MP2 energy calculations with the cc-pVTZ basis using NWChem. Results indicate that on both Intel and AMD platforms, oversubscription reduces the total time to solution of semi-direct MP2 energy calculations on average by 25-45% and reduces the total energy consumed by the CPU and DRAM on average by 10-15% on the Intel platform. Semi-direct gradient time to solution is shortened on average by 8-15% and energy consumption is decreased by 5-10%. Linear regression analysis shows a strong correlation between time to solution and total energy consumed. Oversubscribing during RI-MP2 calculations results in performance degradations of 30-50% at the 4n level. © 2017 Wiley Periodicals, Inc.
Bowen, Stephen R; Chappell, Richard J; Bentzen, Søren M; Deveau, Michael A; Forrest, Lisa J; Jeraj, Robert
2012-01-01
Purpose To quantify associations between pre-radiotherapy and post-radiotherapy PET parameters via spatially resolved regression. Materials and methods Ten canine sinonasal cancer patients underwent PET/CT scans of [18F]FDG (FDGpre), [18F]FLT (FLTpre), and [61Cu]Cu-ATSM (Cu-ATSMpre). Following radiotherapy regimens of 50 Gy in 10 fractions, veterinary patients underwent FDG PET/CT scans at three months (FDGpost). Regression of standardized uptake values in baseline FDGpre, FLTpre, and Cu-ATSMpre tumour voxels to those in FDGpost images was performed for linear, log-linear, generalized-linear, and mixed-fit linear models. Goodness of fit of the regression coefficients was assessed by R². Hypothesis testing of coefficients over the patient population was performed. Results Multivariate linear model fits of FDGpre to FDGpost were significantly positive over the population (FDGpost ~ 0.17 FDGpre, p=0.03), and classified slopes of RECIST non-responders and responders as different (0.37 vs. 0.07, p=0.01). Generalized-linear model fits related FDGpre to FDGpost by a linear power law (FDGpost ~ FDGpre^0.93, p<0.001). Univariate mixture model fits of FDGpre improved R² from 0.17 to 0.52. Neither baseline FLT PET nor Cu-ATSM PET uptake contributed statistically significant multivariate regression coefficients. Conclusions Spatially resolved regression analysis indicates that pre-treatment FDG PET uptake is most strongly associated with three-month post-treatment FDG PET uptake in this patient population, though associations are histopathology-dependent.
Marrero-Ponce, Yovani
2004-01-01
This report describes a new set of molecular descriptors of relevance to QSAR/QSPR studies and drug design: atom linear indices f_k(x_i). These atomic-level chemical descriptors are based on the calculation of linear maps on R^n [f_k(x_i): R^n → R^n] in the canonical basis. In this context, the kth power of the molecular pseudograph's atom adjacency matrix [M^k(G)] denotes the matrix of f_k(x_i) with respect to the canonical basis. In addition, a local-fragment (atom-type) formalism was developed. The kth atom-type linear indices are calculated by summing the kth atom linear indices of all atoms of the same atom type in the molecule. Moreover, total (whole-molecule) linear indices are also proposed. This descriptor is a linear functional (linear form) on R^n; that is, the kth total linear index is a linear map from R^n to the scalars R [f_k(x): R^n → R], calculated by summing the atom linear indices of all atoms in the molecule. The features of the kth total and local linear indices are illustrated with examples of various types of molecular structures, including chain lengthening, branching, heteroatom content, and multiple bonds. Additionally, the linear independence of the local linear indices from other 0D, 1D, 2D, and 3D molecular descriptors is demonstrated using principal component analysis for 42 very heterogeneous molecules. Much redundancy and overlap were found among the total linear indices and most of the other structural indices presently in use in QSPR/QSAR practice. On the contrary, the information carried by the atom-type linear indices was strikingly different from that codified in most of the 229 0D-3D molecular descriptors used in this study. It is concluded that the local linear indices are independent indices containing important structural information for use in QSPR/QSAR and drug design studies. In this sense, atom, atom-type, and total linear indices were used for the prediction of pIC50 values for the cleavage process of a set of flavone derivative inhibitors of HIV-1 integrase. The quantitative models found are statistically significant (R of 0.965, 0.902, and 0.927, respectively) and permit a clear interpretation of the studied properties in terms of the structural features of the molecules. A leave-one-out (LOO) cross-validation procedure revealed that the regression models had fairly good predictability (q² of 0.679, 0.543, and 0.721, respectively). Comparison with other approaches reveals good behavior of the proposed method. The approach described in this paper appears to be an excellent alternative or guide for the discovery and optimization of new lead compounds.
Linear regression analysis of survival data with missing censoring indicators.
Wang, Qihua; Dinse, Gregg E
2011-04-01
Linear regression analysis has been studied extensively in a random censorship setting, but typically all of the censoring indicators are assumed to be observed. In this paper, we develop synthetic data methods for estimating regression parameters in a linear model when some censoring indicators are missing. We define estimators based on regression calibration, imputation, and inverse probability weighting techniques, and we prove all three estimators are asymptotically normal. The finite-sample performance of each estimator is evaluated via simulation. We illustrate our methods by assessing the effects of sex and age on the time to non-ambulatory progression for patients in a brain cancer clinical trial.
An Analysis of COLA (Cost of Living Adjustment) Allocation within the United States Coast Guard.
1983-09-01
Testing hypotheses for differences between linear regression lines
Stanley J. Zarnoch
2009-01-01
Five hypotheses are identified for testing differences between simple linear regression lines. The distinctions between these hypotheses are based on a priori assumptions and illustrated with full and reduced models. The contrast approach is presented as an easy and complete method for testing for overall differences between the regressions and for making pairwise...
Graphical Description of Johnson-Neyman Outcomes for Linear and Quadratic Regression Surfaces.
ERIC Educational Resources Information Center
Schafer, William D.; Wang, Yuh-Yin
A modification of the usual graphical representation of heterogeneous regressions is described that can aid in interpreting significant regions for linear or quadratic surfaces. The standard Johnson-Neyman graph is a bivariate plot with the criterion variable on the ordinate and the predictor variable on the abscissa. Regression surfaces are drawn…
Teaching the Concept of Breakdown Point in Simple Linear Regression.
ERIC Educational Resources Information Center
Chan, Wai-Sum
2001-01-01
Most introductory textbooks on simple linear regression analysis mention the fact that extreme data points have a great influence on ordinary least-squares regression estimation; however, not many textbooks provide a rigorous mathematical explanation of this phenomenon. Suggests a way to fill this gap by teaching students the concept of breakdown…
Estimating monotonic rates from biological data using local linear regression.
Olito, Colin; White, Craig R; Marshall, Dustin J; Barneche, Diego R
2017-03-01
Accessing many fundamental questions in biology begins with empirical estimation of simple monotonic rates of underlying biological processes. Across a variety of disciplines, ranging from physiology to biogeochemistry, these rates are routinely estimated from non-linear and noisy time series data using linear regression and ad hoc manual truncation of non-linearities. Here, we introduce the R package LoLinR, a flexible toolkit to implement local linear regression techniques to objectively and reproducibly estimate monotonic biological rates from non-linear time series data, and demonstrate possible applications using metabolic rate data. LoLinR provides methods to easily and reliably estimate monotonic rates from time series data in a way that is statistically robust, facilitates reproducible research and is applicable to a wide variety of research disciplines in the biological sciences. © 2017. Published by The Company of Biologists Ltd.
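LoLinR is an R package; the sketch below reimplements the underlying idea in Python, fitting OLS to every sufficiently wide contiguous window of the series and ranking the candidate windows objectively. The ranking criterion here (highest R², preferring wider windows) is a simplification for illustration, not the package's actual metrics.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
t = np.linspace(0, 30, 120)                       # time (min), illustrative
# e.g. an oxygen trace: nonlinear early phase, then a steady monotonic decline
y = 8 - 0.08 * t - 0.8 * np.exp(-t / 3) + rng.normal(0, 0.03, t.size)

best = None
for i in range(t.size):
    for j in range(i + 30, t.size + 1):           # candidate windows of >= 30 points
        res = stats.linregress(t[i:j], y[i:j])
        key = (res.rvalue**2, j - i)              # rank by fit, then by width
        if best is None or key > best[0]:
            best = (key, i, j, res.slope)

_, i, j, rate = best
print(f"most linear window t = [{t[i]:.1f}, {t[j-1]:.1f}] min, rate = {rate:.4f} units/min")
```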
Christensen, Jeppe Schultz; Raaschou-Nielsen, Ole; Tjønneland, Anne; Overvad, Kim; Nordsborg, Rikke B.; Ketzel, Matthias; Sørensen, Thorkild IA; Sørensen, Mette
2015-01-01
Background Traffic noise has been associated with cardiovascular and metabolic disorders. Potential modes of action are through stress and sleep disturbance, which may lead to endocrine dysregulation and overweight. Objectives We aimed to investigate the relationship between residential traffic and railway noise and adiposity. Methods In this cross-sectional study of 57,053 middle-aged people, height, weight, waist circumference, and bioelectrical impedance were measured at enrollment (1993–1997). Body mass index (BMI), body fat mass index (BFMI), and lean body mass index (LBMI) were calculated. Residential exposure to road traffic and railway noise was calculated using the Nordic prediction method. Associations between traffic noise and anthropometric measures at enrollment were analyzed using general linear models and logistic regression adjusted for demographic and lifestyle factors. Results Linear regression models adjusted for age, sex, and socioeconomic factors showed that 5-year mean road traffic noise exposure preceding enrollment was associated with a 0.35-cm wider waist circumference (95% CI: 0.21, 0.50) and a 0.18-point higher BMI (95% CI: 0.12, 0.23) per 10 dB. Small, significant increases were also found for BFMI and LBMI. All associations followed linear exposure–response relationships. Exposure to railway noise was not linearly associated with adiposity measures. However, exposure > 60 dB was associated with a 0.71-cm wider waist circumference (95% CI: 0.23, 1.19) and a 0.19-point higher BMI (95% CI: 0.0072, 0.37) compared with unexposed participants (0–20 dB). Conclusions The present study finds positive associations between residential exposure to road traffic and railway noise and adiposity.
Locally linear regression for pose-invariant face recognition.
Chai, Xiujuan; Shan, Shiguang; Chen, Xilin; Gao, Wen
2007-07-01
The variation of facial appearance due to the viewpoint (/pose) degrades face recognition systems considerably, which is one of the bottlenecks in face recognition. One of the possible solutions is generating virtual frontal view from any given nonfrontal view to obtain a virtual gallery/probe face. Following this idea, this paper proposes a simple, but efficient, novel locally linear regression (LLR) method, which generates the virtual frontal view from a given nonfrontal face image. We first justify the basic assumption of the paper that there exists an approximate linear mapping between a nonfrontal face image and its frontal counterpart. Then, by formulating the estimation of the linear mapping as a prediction problem, we present the regression-based solution, i.e., globally linear regression. To improve the prediction accuracy in the case of coarse alignment, LLR is further proposed. In LLR, we first perform dense sampling in the nonfrontal face image to obtain many overlapped local patches. Then, the linear regression technique is applied to each small patch for the prediction of its virtual frontal patch. Through the combination of all these patches, the virtual frontal view is generated. The experimental results on the CMU PIE database show distinct advantage of the proposed method over Eigen light-field method.
Du, Hongying; Wang, Jie; Yao, Xiaojun; Hu, Zhide
2009-01-01
The heuristic method (HM) and support vector machine (SVM) were used to construct quantitative structure-retention relationship models for a series of compounds to predict gradient retention times in three different reversed-phase high-performance liquid chromatography (HPLC) columns. The aims of this investigation were to predict the retention times of multifarious compounds, to find the main properties of the three columns, and to illuminate the theory of the separation procedures. In our method, we correlated the retention times of many structurally diverse analytes in three columns (Symmetry C18, Chromolith, and SG-MIX) with representative molecular descriptors calculated from the molecular structures alone. HM was used to select the most important molecular descriptors and build linear regression models. Furthermore, nonlinear regression models were built using the SVM method; the performance of the SVM models was better than that of the HM models, and the prediction results were in good agreement with the experimental values. This paper offers some insight into the factors likely to govern the gradient retention process in the three investigated HPLC columns, which could theoretically guide practical experiments.
Yang, Chieh-Hou; Lee, Wei-Feng
2002-01-01
Ground water reservoirs in the Choshuichi alluvial fan, central western Taiwan, were investigated using direct-current (DC) resistivity soundings at 190 locations, combined with hydrogeological measurements from 37 wells. In addition, attempts were made to calculate aquifer transmissivity from both surface DC resistivity measurements and geostatistically derived predictions of aquifer properties. DC resistivity sounding data are highly correlated to the hydraulic parameters in the Choshuichi alluvial fan. By estimating the spatial distribution of hydraulic conductivity from the kriged well data and the cokriged thickness of the correlative aquifer from both resistivity sounding data and well information, the transmissivity of the aquifer at each location can be obtained from the product of kriged hydraulic conductivity and computed thickness of the geoelectric layer. Thus, the spatial variation of the transmissivities in the study area is obtained. Our work is more comparable to Ahmed et al. (1988) than to the work of Niwas and Singhal (1981). The first "constraint" from Niwas and Singhal's work is a result of their use of linear regression. The geostatistical approach taken here (and by Ahmed et al. [1988]) is a natural improvement on the linear regression approach.
Feng, Zhong-xiang; Lu, Shi-sheng; Zhang, Wei-hua; Zhang, Nan-nan
2014-01-01
To build a combined model that captures the variation in road-traffic-accident death-toll data, reflects the influence of multiple factors on traffic accidents, and improves prediction accuracy, a Verhulst model was built from the number of road traffic accident deaths in China from 2002 to 2011; car ownership, population, GDP, highway freight volume, highway passenger transportation volume, and highway mileage were chosen as factors for a multivariate linear regression model of the death toll. The two models were then combined into a weighted prediction model, with the weight coefficients calculated by the Shapley value method according to each model's contribution. Finally, the combined model was used to recalculate the number of deaths from 2002 to 2011 and was compared with the Verhulst and multivariate linear regression models. The results showed that the new model not only characterized the death toll data well but also quantified the degree of influence of each factor on the death toll, with high accuracy and strong practicability.
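A minimal sketch of Shapley-value weighting for two forecasting models follows. The characteristic function is taken as the mean absolute error of the (averaged) forecast from each model subset, and weights are assigned inversely to each model's exact two-player Shapley share of the combined error; both conventions are illustrative assumptions, not necessarily the paper's formulation.

```python
import numpy as np

actual = np.array([104, 107, 109, 110, 112.0])   # observed series, illustrative
pred_a = np.array([101, 108, 107, 113, 111.0])   # e.g. a Verhulst-type model
pred_b = np.array([106, 105, 111, 109, 114.0])   # e.g. a multivariate regression model

def mae(pred):
    return np.mean(np.abs(actual - pred))

v1, v2 = mae(pred_a), mae(pred_b)
v12 = mae((pred_a + pred_b) / 2)                 # both models, simple average

# Exact two-player Shapley decomposition of the combined error (phi1 + phi2 = v12)
phi1 = 0.5 * (v1 + (v12 - v2))
phi2 = 0.5 * (v2 + (v12 - v1))

# Smaller error contribution -> larger weight (weights sum to 1)
w1 = 1 - phi1 / (phi1 + phi2)
w2 = 1 - phi2 / (phi1 + phi2)
combined = w1 * pred_a + w2 * pred_b
print(f"weights: {w1:.2f}, {w2:.2f}; combined MAE = {mae(combined):.2f}")
```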
Barros, L M; Martins, R T; Ferreira-Keppler, R L; Gutjahr, A L N
2017-08-04
Information on biomass is fundamental for calculating growth rates and may be employed in assessing the medicolegal and economic importance of Hermetia illucens (Linnaeus, 1758). Although biomass is essential to understanding many ecological processes, it is not easily measured. Biomass may be determined by direct weighing or indirectly through regression models of fresh/dry mass versus body dimensions. In this study, we evaluated the association between morphometry and fresh/dry mass of immature H. illucens using linear, exponential, and power regression models. We measured width and length of the cephalic capsule, overall body length, and width of the largest abdominal segment of 280 larvae. Overall body length and width of the largest abdominal segment were the best predictors of biomass. Exponential models best fitted body dimensions and biomass (both fresh and dry), followed by power and linear models. In all models, fresh and dry biomass were strongly correlated (>75%). Values estimated by the models did not differ from observed ones, and prediction power varied from 27 to 79%. Accordingly, the correspondence between biomass and body dimensions should facilitate and motivate the development of applied studies involving H. illucens in the Amazon region.
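The model comparison described here can be reproduced along these lines with scipy, fitting linear, exponential, and power forms to hypothetical length-mass data and comparing R^2:

    import numpy as np
    from scipy.optimize import curve_fit

    rng = np.random.default_rng(0)
    # Hypothetical larval measurements: body length (mm) vs dry mass (mg).
    length = np.linspace(5, 20, 40)
    mass = 0.01 * length**2.8 * np.exp(0.05 * rng.standard_normal(40))

    linear = lambda x, a, b: a + b * x
    expo   = lambda x, a, b: a * np.exp(b * x)
    power  = lambda x, a, b: a * x**b

    for name, f, p0 in [("linear", linear, (0.0, 1.0)),
                        ("exponential", expo, (0.1, 0.2)),
                        ("power", power, (0.01, 2.5))]:
        params, _ = curve_fit(f, length, mass, p0=p0, maxfev=10000)
        resid = mass - f(length, *params)
        r2 = 1 - np.sum(resid**2) / np.sum((mass - mass.mean())**2)
        print(f"{name}: params={params}, R^2={r2:.3f}")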
Live Donor Renal Anatomic Asymmetry and Posttransplant Renal Function.
Tanriover, Bekir; Fernandez, Sonalis; Campenot, Eric S; Newhouse, Jeffrey H; Oyfe, Irina; Mohan, Prince; Sandikci, Burhaneddin; Radhakrishnan, Jai; Wexler, Jennifer J; Carroll, Maureen A; Sharif, Sairah; Cohen, David J; Ratner, Lloyd E; Hardy, Mark A
2015-08-01
Relationship between live donor renal anatomic asymmetry and posttransplant recipient function has not been studied extensively. We analyzed 96 live kidney donors, who had anatomical asymmetry (>10% renal length and/or volume difference calculated from computerized tomography angiograms) and their matching recipients. Split function differences (SFD) were quantified with technetium-dimercaptosuccinic acid renography. Implantation biopsies at time 0 were semiquantitatively scored. A comprehensive model using donor renal volume adjusted to recipient weight (Vol/Wgt), SFD, and biopsy score was used to predict recipient estimated glomerular filtration rate (eGFR) at 1 year. Primary analysis consisted of a logistic regression model of outcome (odds of developing eGFR>60 mL/min/1.73 m(2) at 1 year), a linear regression model of outcome (predicting recipient eGFR at 1 year, using the chronic kidney disease-epidemiology collaboration formula), and a Monte Carlo simulation based on the linear regression model (N=10,000 iterations). In the study cohort, the mean Vol/Wgt and eGFR at 1 year were 2.04 mL/kg and 60.4 mL/min/1.73 m(2), respectively. Volume and split ratios between the 2 donor kidneys were strongly correlated (r = 0.79, P < 0.001). The biopsy scores among SFD categories (<5%, 5%-10%, >10%) were not different (P = 0.190). On multivariate models, only Vol/Wgt was significantly associated with higher odds of having eGFR > 60 mL/min/1.73 m(2) (odds ratio, 8.94, 95% CI 2.47-32.25, P = 0.001) and had a strong discriminatory power in predicting the risk of eGFR less than 60 mL/min/1.73 m(2) at 1 year [receiver operating characteristic (ROC) curve, 0.78, 95% CI, 0.68-0.89]. In the presence of donor renal anatomic asymmetry, Vol/Wgt appears to be a major determinant of recipient renal function at 1 year after transplantation. Renography can be replaced with CT volume calculation in estimating split renal function.
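A sketch of the linear-regression-plus-Monte-Carlo step: hypothetical donor covariates and outcomes are fitted by OLS, and coefficients are resampled from their estimated sampling distribution as a rough analogue of the paper's 10,000-iteration simulation.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    # Hypothetical donor data: Vol/Wgt (mL/kg), split function difference (%),
    # biopsy score, and recipient eGFR at one year.
    n = 96
    vol_wgt = rng.normal(2.0, 0.5, n)
    sfd = rng.uniform(0, 15, n)
    biopsy = rng.integers(0, 4, n)
    egfr = 25 + 18 * vol_wgt - 0.2 * sfd - 1.0 * biopsy + rng.normal(0, 8, n)

    X = sm.add_constant(np.column_stack([vol_wgt, sfd, biopsy]))
    ols = sm.OLS(egfr, X).fit()

    # Monte Carlo on the fitted linear model: draw coefficients from their
    # estimated sampling distribution plus residual noise.
    betas = rng.multivariate_normal(ols.params, ols.cov_params(), size=10000)
    sim = betas @ X.T + rng.normal(0, np.sqrt(ols.scale), (10000, n))
    print("simulated P(eGFR > 60):", (sim > 60).mean())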
Saqr, Mohammed; Fors, Uno; Tedre, Matti
2018-02-06
Collaborative learning facilitates reflection, diversifies understanding and stimulates skills of critical and higher-order thinking. Although the benefits of collaborative learning have long been recognized, it is still rarely studied by social network analysis (SNA) in medical education, and the relationship of parameters that can be obtained via SNA with students' performance remains largely unknown. The aim of this work was to assess the potential of SNA for studying online collaborative clinical case discussions in a medical course and to find out which activities correlate with better performance and help predict final grade or explain variance in performance. Interaction data were extracted from the learning management system (LMS) forum module of the Surgery course in Qassim University, College of Medicine. The data were analyzed using social network analysis. The analysis included visual as well as statistical analysis. Correlation with students' performance was calculated, and automatic linear regression was used to predict students' performance. By using social network analysis, we were able to analyze a large number of interactions in online collaborative discussions and gain an overall insight of the course social structure, track the knowledge flow and the interaction patterns, as well as identify the active participants and the prominent discussion moderators. When augmented with calculated network parameters, SNA offered an accurate view of the course network, each user's position, and level of connectedness. Results from correlation coefficients, linear regression, and logistic regression indicated that a student's position and role in information relay in online case discussions, combined with the strength of that student's network (social capital), can be used as predictors of performance in relevant settings. By using social network analysis, researchers can analyze the social structure of an online course and reveal important information about students' and teachers' interactions that can be valuable in guiding teachers, improving students' engagement, and contributing to learning analytics insights.
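A minimal sketch of this kind of SNA-plus-regression pipeline, using networkx centralities (degree and betweenness as proxies for connectedness and information-relay role) as predictors of hypothetical final grades:

    import networkx as nx
    import numpy as np

    # Hypothetical forum interactions: (student, student) reply edges.
    edges = [("s1", "s2"), ("s2", "s3"), ("s1", "s3"), ("s4", "s1"), ("s3", "s4")]
    G = nx.DiGraph(edges)

    degree = nx.degree_centrality(G)            # level of connectedness
    betweenness = nx.betweenness_centrality(G)  # information-relay role

    students = sorted(G.nodes)
    X = np.column_stack([np.ones(len(students)),
                         [degree[s] for s in students],
                         [betweenness[s] for s in students]])
    grades = np.array([78.0, 85.0, 90.0, 70.0])  # hypothetical final grades

    # Ordinary least squares: grade ~ intercept + centralities.
    coef, *_ = np.linalg.lstsq(X, grades, rcond=None)
    print(coef)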
Effect of Malmquist bias on correlation studies with IRAS data base
NASA Technical Reports Server (NTRS)
Verter, Frances
1993-01-01
The relationships between galaxy properties in the sample of Trinchieri et al. (1989) are reexamined with corrections for Malmquist bias. The linear correlations are tested and linear regressions are fit for log-log plots of L(FIR), L(H-alpha), and L(B) as well as ratios of these quantities. The linear correlations are corrected for Malmquist bias using the method of Verter (1988), in which each galaxy observation is weighted by the inverse of its sampling volume. The linear regressions are corrected for Malmquist bias by a new method introduced here, in which each galaxy observation is weighted by its sampling volume. The results of correlations and regressions among the sample are significantly changed in the anticipated sense that the corrected correlation confidences are lower and the corrected slopes of the linear regressions are lower. The elimination of Malmquist bias eliminates the nonlinear rise in luminosity that has caused some authors to hypothesize additional components of FIR emission.
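Schematically, the volume weighting can be applied with weighted least squares. The luminosities and sampling volumes below are synthetic; note that np.polyfit's w argument multiplies the residuals (w ~ 1/sigma), so the square root of the desired weight is passed.

    import numpy as np

    rng = np.random.default_rng(1)
    # Synthetic flux-limited sample: log luminosities plus each galaxy's
    # maximum sampling volume V (larger for intrinsically brighter objects).
    logL_fir = rng.normal(9.5, 0.6, 200)
    logL_b = 0.8 * logL_fir + rng.normal(0, 0.3, 200)
    V = 10 ** (1.5 * (logL_fir - logL_fir.mean()))  # schematic V ~ L^1.5

    # Per the abstract: regressions weighted by the sampling volume V.
    slope_wtd = np.polyfit(logL_fir, logL_b, 1, w=np.sqrt(V))[0]
    slope_unw = np.polyfit(logL_fir, logL_b, 1)[0]
    print(slope_unw, slope_wtd)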
Ren, Jianqiang; Chen, Zhongxin; Tang, Huajun
2006-12-01
Taking Jining City of Shandong Province, one of the most important winter wheat production regions in the Huanghuaihai Plain, as an example, winter wheat yield was estimated by using 250 m MODIS-NDVI data smoothed with a Savitzky-Golay filter. NDVI values between 0.20 and 0.80 were selected, and the sum of NDVI values for each county was calculated to relate it to winter wheat yield. Using the stepwise regression method, a linear regression model between NDVI and winter wheat yield was established, with its precision validated against ground survey data. The results showed that the relative error of the predicted yield was between -3.6% and 3.9%, suggesting that the method is relatively accurate and feasible.
Forsberg, Flemming; Ro, Raymond J.; Fox, Traci B; Liu, Ji-Bin; Chiou, See-Ying; Potoczek, Magdalena; Goldberg, Barry B
2010-01-01
The purpose of this study was to prospectively compare noninvasive, quantitative measures of vascularity obtained from 4 contrast-enhanced ultrasound (US) techniques to 4 invasive immunohistochemical markers of tumor angiogenesis in a large group of murine xenografts. Glioma (C6) or breast cancer (NMU) cells were implanted in 144 rats. The contrast agent Optison (GE Healthcare, Princeton, NJ) was injected in a tail vein (dose: 0.4 ml/kg). Power Doppler imaging (PDI), pulse-subtraction harmonic imaging (PSHI), flash-echo imaging (FEI), and microflow imaging (MFI; a technique creating maximum intensity projection images over time) were performed with an Aplio scanner (Toshiba America Medical Systems, Tustin, CA) and a 7.5 MHz linear array. Fractional tumor neovascularity was calculated from digital clips of contrast US, while the relative area stained was calculated from specimens. Results were compared using a factorial, repeated measures ANOVA, linear regression, and z-tests. The tortuous morphology of tumor neovessels was visualized better with MFI than with the other US modes. Cell line, implantation method, and contrast US imaging technique were significant parameters in the ANOVA model (p<0.05). The strongest correlation determined by linear regression in the C6 model was between PSHI and percent area stained with CD31 (r=0.37, p<0.0001). In the NMU model the strongest correlation was between FEI and COX-2 (r=0.46, p<0.0001). There were no statistically significant differences between correlations obtained with the various US methods (p>0.05). In conclusion, the largest study of contrast US of murine xenografts to date has been conducted, and quantitative contrast-enhanced US measures of tumor neovascularity in glioma and breast cancer xenograft models appear to provide a noninvasive marker for angiogenesis, although the best method for monitoring angiogenesis was not conclusively established. PMID:21144542
Relationship between masticatory performance using a gummy jelly and masticatory movement.
Uesugi, Hanako; Shiga, Hiroshi
2017-10-01
The purpose of this study was to clarify the relationship between masticatory performance using a gummy jelly and masticatory movement. Thirty healthy males were asked to chew a gummy jelly on their habitual chewing side for 20 s, and the parameters of masticatory performance and masticatory movement were calculated as follows. For evaluating the masticatory performance, the amount of glucose extraction during chewing of a gummy jelly was measured. For evaluating the masticatory movement, the movement of the mandibular incisal point was recorded using the MKG K6-I, and ten parameters of the movement path (opening distance and masticatory width), movement rhythm (opening time, closing time, occluding time, and cycle time), stability of movement (stability of path and stability of rhythm), and movement velocity (opening maximum velocity and closing maximum velocity) were calculated from 10 cycles of chewing beginning with the fifth cycle. The relationship between the amount of glucose extraction and parameters representing masticatory movement was investigated, and then stepwise multiple linear regression analysis was performed. The amount of glucose extraction was associated with 7 parameters representing the masticatory movement. Stepwise multiple linear regression analysis showed that the opening distance, closing time, stability of rhythm, and closing maximum velocity were the most important factors affecting the glucose extraction. From these results it was suggested that there is a close relationship between masticatory performance and masticatory movement, and that masticatory performance could be increased by rhythmic, rapid, and stable mastication with a large opening distance. Copyright © 2017 Japan Prosthodontic Society. Published by Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Banse, Karl; Yong, Marina
1990-05-01
As a proxy for satellite (coastal zone color scanner) observations and concurrent measurements of primary production rates, data from 138 stations occupied seasonally during 1967-1968 in the offshore, eastern tropical Pacific were analyzed in terms of six temporal groups and four current regimes. In multiple linear regressions on column production Pt, we found that simulated satellite pigment is generally weakly correlated with Pt, and sometimes not correlated at all, and that incident irradiance, sea surface temperature, nitrate, transparency, and depths of mixed layer or nitracline assume little or no importance. After a proxy for the light-saturated chlorophyll-specific photosynthetic rate pmax is added, the coefficient of determination (r2) ranges from 0.55 to 0.91 (median of 0.85) for the 10 cases. In stepwise multiple linear regressions the pmax proxy is the best predictor for Pt. Pt can be calculated fairly accurately (on the average, within 10-20%) from satellite pigment, the 10% light depth, and station values (but not from regional or seasonal means) of the pmax proxy; for individual stations the precision is 35-84% (median of 57% for the 10 groupings; p = 0.05) of the means of observed values. At present, pmax cannot be estimated from space; in the data set it is not even highly correlated with irradiance, temperature, and nitrate at depth of occurrence. Therefore extant models for calculating Pt in this tropical ocean have inherent limits of accuracy as well as of precision owing to ignorance about a physiological parameter.
Pimentel, Alan Santos; Alves, Eduardo da Silva; Alvim, Rafael de Oliveira; Nunes, Rogério Tasca; Costa, Carlos Magno Amaral; Lovisi, Júlio Cesar Moraes; Perrout de Lima, Jorge Roberto
2010-05-01
The 4-second exercise test (T4s) evaluates the cardiac vagal tone during the initial heart rate (HR) transient at sudden dynamic exercise, through the identification of the cardiac vagal index (CVI) obtained from the electrocardiogram (ECG). To evaluate the use of the Polar S810 heart rate monitor (HRM) as an alternative to the electrocardiogram in the 4-second exercise test, 49 male individuals (25 +/- 20 years, 176 +/- 12 cm, 74 +/- 6 kg) underwent the 4-second exercise test. The RR intervals were recorded simultaneously by ECG and HRM. We calculated the mean and the standard deviation of the last RR interval of the pre-exercise period, or of the first RR interval of the exercise period, whichever was longer (RRB), of the shortest RR interval of the exercise period (RRC), and of the CVI obtained by ECG and HRM. We used the Student t-test for dependent samples (p < or = 0.05) to test the significance of the differences between means. To identify the correlation between the ECG and the HRM, we used linear regression to calculate Pearson's correlation coefficient and the strategy proposed by Bland and Altman. Linear regression showed r(2) of 0.9999 for RRB, 0.9997 for RRC, and 0.9996 for CVI. The Bland and Altman strategy presented standard deviations of 0.92 ms for RRB, 0.86 ms for RRC, and 0.002 for CVI. The Polar S810 HRM was more efficient in the application of T4s compared to the ECG.
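The two agreement analyses used above (linear regression/Pearson correlation and the Bland-Altman limits of agreement) reduce to a few lines; the paired RR intervals below are hypothetical:

    import numpy as np

    rng = np.random.default_rng(2)
    # Hypothetical paired RR intervals (ms) from ECG and heart rate monitor.
    rr_ecg = np.array([812, 799, 745, 901, 866, 923, 858], dtype=float)
    rr_hrm = rr_ecg + rng.normal(0, 1.0, rr_ecg.size)

    # Linear regression / Pearson agreement.
    r = np.corrcoef(rr_ecg, rr_hrm)[0, 1]

    # Bland-Altman statistics: bias and limits of agreement.
    diff = rr_hrm - rr_ecg
    bias = diff.mean()
    loa = (bias - 1.96 * diff.std(ddof=1), bias + 1.96 * diff.std(ddof=1))
    print(f"r^2={r**2:.4f}, bias={bias:.2f} ms, LoA={loa}")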
ERIC Educational Resources Information Center
Rocconi, Louis M.
2013-01-01
This study examined the differing conclusions one may come to depending upon the type of analysis chosen, hierarchical linear modeling or ordinary least squares (OLS) regression. To illustrate this point, this study examined the influences of seniors' self-reported critical thinking abilities three ways: (1) an OLS regression with the student…
Rothenberg, Stephen J; Rothenberg, Jesse C
2005-09-01
Statistical evaluation of the dose-response function in lead epidemiology is rarely attempted. Economic evaluation of health benefits of lead reduction usually assumes a linear dose-response function, regardless of the outcome measure used. We reanalyzed a previously published study, an international pooled data set combining data from seven prospective lead studies examining contemporaneous blood lead effect on IQ (intelligence quotient) of 7-year-old children (n = 1,333). We constructed alternative linear multiple regression models with linear blood lead terms (linear-linear dose response) and natural-log-transformed blood lead terms (log-linear dose response). We tested the two lead specifications for nonlinearity in the models, compared the two lead specifications for significantly better fit to the data, and examined the effects of possible residual confounding on the functional form of the dose-response relationship. We found that a log-linear lead-IQ relationship was a significantly better fit than was a linear-linear relationship for IQ (p = 0.009), with little evidence of residual confounding of included model variables. We substituted the log-linear lead-IQ effect in a previously published health benefits model and found that the economic savings due to U.S. population lead decrease between 1976 and 1999 (from 17.1 microg/dL to 2.0 microg/dL) was 2.2 times (319 billion dollars) that calculated using a linear-linear dose-response function (149 billion dollars). The Centers for Disease Control and Prevention action limit of 10 microg/dL for children fails to protect against most damage and economic cost attributable to lead exposure.
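The core model comparison can be sketched as follows. Both specifications have the same number of parameters, so R^2 compares them directly (the paper also applied a formal test of nonlinearity); the data here are simulated with a log-linear true effect.

    import numpy as np

    rng = np.random.default_rng(3)
    # Hypothetical blood lead (ug/dL) and IQ with a log-linear true effect.
    pb = rng.uniform(1, 30, 500)
    iq = 105 - 4.0 * np.log(pb) + rng.normal(0, 8, 500)

    def r2(x, y):
        slope, intercept = np.polyfit(x, y, 1)
        resid = y - (slope * x + intercept)
        return 1 - resid.var() / y.var()

    print("linear-linear R^2:", r2(pb, iq))
    print("log-linear   R^2:", r2(np.log(pb), iq))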
ERIC Educational Resources Information Center
Rocconi, Louis M.
2011-01-01
Hierarchical linear models (HLM) solve the problems associated with the unit of analysis problem such as misestimated standard errors, heterogeneity of regression and aggregation bias by modeling all levels of interest simultaneously. Hierarchical linear modeling resolves the problem of misestimated standard errors by incorporating a unique random…
ERIC Educational Resources Information Center
Preacher, Kristopher J.; Curran, Patrick J.; Bauer, Daniel J.
2006-01-01
Simple slopes, regions of significance, and confidence bands are commonly used to evaluate interactions in multiple linear regression (MLR) models, and the use of these techniques has recently been extended to multilevel or hierarchical linear modeling (HLM) and latent curve analysis (LCA). However, conducting these tests and plotting the…
Standardization of domestic frying processes by an engineering approach.
Franke, K; Strijowski, U
2011-05-01
An approach was developed to enable better standardization of domestic frying of potato products. For this purpose, 5 domestic fryers differing in heating power and oil capacity were used. A well-defined frying process using a highly standardized model product and a broad range of frying conditions was carried out in these fryers, and the development of browning, an important quality parameter, was measured. Product-to-oil ratio, oil temperature, and frying time were varied. Quite different color changes were measured in the different fryers although the same frying process parameters were applied. The specific energy consumption for water evaporation (spECWE) during frying, related to product amount, was determined for all frying processes to define an engineering parameter for characterizing the frying process. A quasi-linear regression approach was applied to calculate this parameter from frying process settings and fryer properties. The high significance of the regression coefficients and a coefficient of determination close to unity confirmed the suitability of this approach. Based on this regression equation, curves for standard frying conditions (SFC curves) were calculated which describe the frying conditions required to obtain the same level of spECWE in the different domestic fryers. Comparison of browning results from the different fryers operated at conditions near the SFC curves confirmed the applicability of the approach. © 2011 Institute of Food Technologists®
Musuku, Adrien; Tan, Aimin; Awaiye, Kayode; Trabelsi, Fethi
2013-09-01
Linear calibration is usually performed using eight to ten calibration concentration levels in regulated LC-MS bioanalysis because a minimum of six are specified in regulatory guidelines. However, we have previously reported that two-concentration linear calibration is as reliable as or even better than using multiple concentrations. The purpose of this research is to compare two-concentration with multiple-concentration linear calibration through retrospective data analysis of multiple bioanalytical projects that were conducted in an independent regulated bioanalytical laboratory. A total of 12 bioanalytical projects were randomly selected: two validations and two studies for each of the three most commonly used types of sample extraction methods (protein precipitation, liquid-liquid extraction, solid-phase extraction). When the existing data were retrospectively linearly regressed using only the lowest and the highest concentration levels, no extra batch failure/QC rejection was observed, and the differences in accuracy and precision between the original multi-concentration regression and the new two-concentration linear regression are negligible. Specifically, the differences in overall mean apparent bias (square root of mean individual bias squares) are within the ranges of -0.3% to 0.7% and 0.1-0.7% for the validations and studies, respectively. The differences in mean QC concentrations are within the ranges of -0.6% to 1.8% and -0.8% to 2.5% for the validations and studies, respectively. The differences in %CV are within the ranges of -0.7% to 0.9% and -0.3% to 0.6% for the validations and studies, respectively. The average differences in study sample concentrations are within the range of -0.8% to 2.3%. With two-concentration linear regression, an average of 13% of time and cost could have been saved for each batch, together with 53% savings in the lead-in for each project (the preparation of working standard solutions, spiking, and aliquoting). Furthermore, examples are given of how to evaluate the linearity over the entire concentration range when only two concentration levels are used for linear regression. To conclude, two-concentration linear regression is accurate and robust enough for routine use in regulated LC-MS bioanalysis, and it significantly saves time and cost as well. Copyright © 2013 Elsevier B.V. All rights reserved.
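Two-concentration linear calibration itself is elementary: fit the line through the lowest and highest standards and back-calculate unknowns from the inverse. The responses below are hypothetical peak-area ratios.

    # Two-concentration linear calibration and back-calculation.
    def two_point_calibration(c_low, r_low, c_high, r_high):
        """Return slope and intercept of response = slope * conc + intercept."""
        slope = (r_high - r_low) / (c_high - c_low)
        intercept = r_low - slope * c_low
        return slope, intercept

    # Hypothetical standards: 1 and 1000 ng/mL with their responses.
    slope, intercept = two_point_calibration(1.0, 0.051, 1000.0, 49.8)

    unknown_response = 24.6
    print((unknown_response - intercept) / slope)  # back-calculated concentration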
NASA Astrophysics Data System (ADS)
Manzoori, Jamshid L.; Amjadi, Mohammad
2003-03-01
The characteristics of host-guest complexation between β-cyclodextrin (β-CD) and two forms of ibuprofen (protonated and deprotonated) were investigated by fluorescence spectrometry. 1:1 stoichiometries for both complexes were established, and their association constants at different temperatures were calculated by applying a non-linear regression method to the change in the fluorescence of ibuprofen brought about by the presence of β-CD. The thermodynamic parameters (ΔH, ΔS and ΔG) associated with the inclusion process were also determined. Based on the obtained results, a sensitive spectrofluorimetric method for the determination of ibuprofen was developed with a linear range of 0.1-2 μg ml -1 and a detection limit of 0.03 μg ml -1. The method was applied satisfactorily to the determination of ibuprofen in pharmaceutical preparations.
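A sketch of the association-constant fit, using a standard 1:1 host-guest binding isotherm (the paper's exact fitting equation is not reproduced here) and hypothetical fluorescence data; ΔG then follows from -RT ln K.

    import numpy as np
    from scipy.optimize import curve_fit

    # Standard 1:1 binding isotherm: observed fluorescence vs host concentration.
    def isotherm(cd, f0, f_inf, K):
        return (f0 + f_inf * K * cd) / (1.0 + K * cd)

    cd = np.array([0, 1e-4, 3e-4, 1e-3, 3e-3, 1e-2])        # mol/L, hypothetical
    f_obs = np.array([10.0, 12.8, 16.9, 25.0, 32.1, 36.5])  # hypothetical intensities

    (f0, f_inf, K), _ = curve_fit(isotherm, cd, f_obs, p0=(10, 40, 500))
    dG = -8.314 * 298.15 * np.log(K)   # J/mol, assuming K in M^-1 at 25 C
    print(K, dG)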
A Linear Regression and Markov Chain Model for the Arabian Horse Registry
1993-04-01
A forecasting model was developed for the Arabian Horse Registry, which needed to forecast its future registration of purebred Arabian horses. A linear regression model was utilized to…
NASA Astrophysics Data System (ADS)
Kawase, H.; Nakano, K.
2015-12-01
We investigated the characteristics of strong ground motions separated from acceleration Fourier spectra and 5%-damped acceleration response spectra calculated from weak and moderate ground motions observed by K-NET, KiK-net, and the JMA Shindokei Network in Japan, using the generalized spectral inversion method. The separation used the outcrop motions at YMGH01 as the reference, where we extracted the site responses due to shallow weathered layers. We included events with JMA magnitude equal to or larger than 4.5 observed from 1996 to 2011. We find that our frequency-dependent Q values are comparable to those of previous studies. From the corner frequencies of the Fourier source spectra, we calculated Brune's stress parameters and found a clear magnitude dependence, in which smaller events tend to spread over a wider range while maintaining the same maximum value. We confirmed that this is exactly the case for several mainshock-aftershock sequences. The average stress parameters for crustal earthquakes are much smaller than those of subduction-zone earthquakes, which can be explained by their depth dependence. We then compared the strong-motion characteristics based on the acceleration response spectra and found that the separated characteristics of strong ground motions are different, especially in the frequency range below 1 Hz. These differences come from the difference between Fourier spectra and response spectra found in the observed data; that is, predominant components in the high-frequency range of the Fourier spectra increase the response in the lower-frequency range with small Fourier amplitude, because a strong high-frequency component acts as an impulse on a single-degree-of-freedom system. After separating the source terms for the 5%-damped response spectra, we obtained regression coefficients with respect to magnitude, which lead to a new GMPE as shown in Fig. 1 on the left. Although stress drops for inland earthquakes are 1/7 of those of subduction-zone earthquakes, the linear regression works quite well. After this linear regression we correlated the residuals with Brune's stress parameters of the corresponding events, as shown in Fig. 1 on the right for the 1 Hz case. We found quite good linear correlation, which makes the aleatory uncertainty 40 to 60% smaller than the original.
Graphical Calculation of Estimated Energy Expenditure in Burn Patients.
Egro, Francesco M; Manders, Ernest C; Manders, Ernest K
2018-03-01
Historically, estimated energy expenditure (EEE) has been related to the percent of body surface area burned. Subsequent evaluations of these estimates have indicated that the earlier formulas may overestimate the amount of caloric support necessary for burn-injured patients. Ireton-Jones et al derived 2 equations for determining the EEE required to support burn patients, 1 for ventilator-dependent patients and 1 for spontaneously breathing patients. Evidence has proved their reliability, but they remain challenging to apply in a clinical setting given the difficult and cumbersome mathematics involved. This study aims to introduce a graphical calculation of EEE in burn patients that can be easily used in the clinical setting. The multivariate linear regression analysis from Ireton-Jones et al yielded equations that were rearranged into the form of a simple linear equation of the type y = mx + b. By choosing an energy expenditure and the age of the subject, the weight was calculated. The endpoints were then calculated, and a graph was mapped by means of Adobe FrameMaker. A graphical representation of Ireton-Jones et al's equations was obtained by plotting the weight (kg) on the y axis, the age (years) on the x axis, and a series of parallel lines representing the EEE in burn patients. The EEE has been displayed graphically on a grid to allow rapid determination of the EEE needed for a given patient of a designated weight and age. Two graphs were plotted: 1 for ventilator-dependent patients and 1 for spontaneously breathing patients. Correction factors for sex, the presence of additional trauma, and obesity are indicated on the graphical calculators. We propose a graphical tool to calculate caloric requirements in a fast, easy, and portable manner.
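The rearrangement is straightforward: solving an Ireton-Jones-type equation EEE = b0 + bA*Age + bW*Weight for Weight gives, for each fixed EEE, a straight line in the (age, weight) plane, which is exactly the family of parallel iso-EEE lines on the graphical calculator. The coefficients below are placeholders, not the published Ireton-Jones values.

    # Iso-EEE lines: Weight = (EEE - b0 - bA * Age) / bW.
    def weight_for(eee, age, b0=1800.0, b_age=-11.0, b_wgt=5.0):
        return (eee - b0 - b_age * age) / b_wgt

    for eee in (1600, 1800, 2000):  # kcal/day iso-lines (hypothetical)
        pts = [(age, round(weight_for(eee, age), 1)) for age in (20, 40, 60, 80)]
        print(eee, pts)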
Gómez Navarro, Rafael
2009-01-01
To study the renal function (FR) of hypertensive patients by means of estimating equations and serum creatinine (Crp). To calculate the percentage of patients with chronic kidney disease (ERC) who present normal Crp values. To analyze which factors contribute to the deterioration of FR. Descriptive cross-sectional study of patients with HTA. Crp and arterial blood pressure (TA) were determined. The glomerular filtration rate was calculated by means of the Cockcroft-Gault and MDRD formulas. The years of evolution of the HTA were registered. A descriptive study of the variables and the possible dependence among them was completed, using multiple linear regression several times. 52 patients were studied (57.7% women); average age 72.4 +/- 10.8. 32.6% (Cockcroft-Gault) or 21.5% (MDRD) fulfilled the ERC criterion. ERC was mainly diagnosed in females. 21.4% (Cockcroft-Gault) and 9.5% (MDRD) of patients with ERC had normal Crp values. We found no linear dependence between TA values and FR. Meeting the TA control targets did not imply less development of ERC. In males we found linear dependence between FR (MDRD) and the years of evolution of the HTA. ERC is a frequent pathology in hypertensive patients. The systematic use of estimating equations facilitates the detection of hidden ERC in patients with normal Crp.
Heun, Manfred; Abbo, Shahal; Lev-Yadun, Simcha; Gopher, Avi
2012-07-01
The recent review by Fuller et al. (2012a) in this journal is part of a series of papers maintaining that plant domestication in the Near East was a slow process lasting circa 4000 years and occurring independently in different locations across the Fertile Crescent. Their protracted domestication scenario is based entirely on linear regression derived from the percentage of domesticated plant remains at specific archaeological sites and the age of these sites themselves. This paper discusses why estimates like haldanes and darwins cannot be applied to the seven founder crops in the Near East (einkorn and emmer wheat, barley, peas, chickpeas, lentils, and bitter vetch). All of these crops are self-fertilizing plants and for this reason they do not fulfil the requirements for performing calculations of this kind. In addition, the percentage of domesticates at any site may be the result of factors other than those that affect the selection for domesticates growing in the surrounding area. These factors are unlikely to have been similar across prehistoric sites of habitation, societies, and millennia. The conclusion here is that single crop analyses are necessary rather than general reviews drawing on regression analyses based on erroneous assumptions. The fact that all seven of these founder crops are self-fertilizers should be incorporated into a comprehensive domestication scenario for the Near East, as self-fertilization naturally isolates domesticates from their wild progenitors.
Inflammation, homocysteine and carotid intima-media thickness.
Baptista, Alexandre P; Cacdocar, Sanjiva; Palmeiro, Hugo; Faísca, Marília; Carrasqueira, Herménio; Morgado, Elsa; Sampaio, Sandra; Cabrita, Ana; Silva, Ana Paula; Bernardo, Idalécio; Gome, Veloso; Neves, Pedro L
2008-01-01
Cardiovascular disease is the main cause of morbidity and mortality in chronic renal patients. Carotid intima-media thickness (CIMT) is one of the most accurate markers of atherosclerosis risk. In this study, the authors set out to evaluate a population of chronic renal patients to determine which factors are associated with an increase in intima-media thickness. We included 56 patients (F=22, M=34), with a mean age of 68.6 years and an estimated glomerular filtration rate of 15.8 ml/min (calculated by the MDRD equation). Various laboratory and inflammatory parameters (hsCRP, IL-6 and TNF-alpha) were evaluated. All subjects underwent measurement of internal carotid artery intima-media thickness by high-resolution real-time B-mode ultrasonography using a 10 MHz linear transducer. Intima-media thickness was used as a dependent variable in a simple linear regression model, with the various laboratory parameters as independent variables. Only parameters showing a significant correlation with CIMT were evaluated in a multiple regression model: age (p=0.001), hemoglobin (p=0.03), logCRP (p=0.042), logIL-6 (p=0.004) and homocysteine (p=0.002). In the multiple regression model we found that age (p=0.001) and homocysteine (p=0.027) were independently correlated with CIMT. LogIL-6 did not reach statistical significance (p=0.057), probably due to the small population size. The authors conclude that age and homocysteine correlate with carotid intima-media thickness, and thus can be considered as markers/risk factors in chronic renal patients.
Massachusetts Shoreline Change Mapping and Analysis Project, 2013 Update
Thieler, E. Robert; Smith, Theresa L.; Knisel, Julia M.; Sampson, Daniel W.
2013-01-01
Information on rates and trends of shoreline change can be used to improve the understanding of the underlying causes and potential effects of coastal erosion on coastal populations and infrastructure and can support informed coastal management decisions. In this report, we summarize the changes in the historical positions of the shoreline of the Massachusetts coast for the 165 years from 1844 through 2009. The study area includes the Massachusetts coastal region from Salisbury to Westport, including Cape Cod, as well as Martha’s Vineyard, Nantucket, and the Elizabeth Islands. New statewide shoreline data were developed for approximately 1,804 kilometers (1,121 miles) of shoreline using color aerial orthoimagery from 2008 and 2009 and topographic lidar from 2007. The shoreline data were integrated with existing historical shoreline data from the U.S. Geological Survey (USGS) and Massachusetts Office of Coastal Zone Management (CZM) to compute long- (about 150 years) and short-term (about 30 years) rates of shoreline change. A linear regression method was used to calculate long- and short-term rates of shoreline change at 26,510 transects along the Massachusetts coast. In locations where shoreline data were insufficient to use the linear regression method, short-term rates were calculated using an end-point method. Long-term rates of shoreline change are calculated with (LTw) and without (LTwo) shorelines from the 1970s and 1994 to examine the effect of removing these data on measured rates of change. Regionally averaged rates are used to assess the general characteristics of the two-rate computations, and we find that (1) the rates of change for both LTw and LTwo are essentially the same; (2) including more data slightly reduces the uncertainty of the rate, which is expected as the number of shorelines increases; and (3) the data for the shorelines from the 1970s and 1994 are not outliers with respect to the long-term trend. These findings are true for regional averages, but may not hold at specific transects.
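Both rate computations reduce to a few lines per transect; the shoreline positions below are hypothetical:

    import numpy as np

    # Hypothetical shoreline positions (metres along a transect) by survey year.
    years = np.array([1844, 1890, 1950, 1978, 1994, 2008], dtype=float)
    position = np.array([0.0, -12.0, -31.0, -38.0, -44.0, -50.0])  # negative = landward

    # Long-term rate: slope of a linear regression of position on time (m/yr).
    rate_lr = np.polyfit(years, position, 1)[0]

    # End-point rate, used where too few shorelines support a regression.
    rate_ep = (position[-1] - position[0]) / (years[-1] - years[0])

    print(f"linear-regression rate: {rate_lr:.3f} m/yr, end-point: {rate_ep:.3f} m/yr")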
ERIC Educational Resources Information Center
Quinino, Roberto C.; Reis, Edna A.; Bessegato, Lupercio F.
2013-01-01
This article proposes the use of the coefficient of determination as a statistic for hypothesis testing in multiple linear regression based on distributions acquired by beta sampling. (Contains 3 figures.)
Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses.
Faul, Franz; Erdfelder, Edgar; Buchner, Axel; Lang, Albert-Georg
2009-11-01
G*Power is a free power analysis program for a variety of statistical tests. We present extensions and improvements of the version introduced by Faul, Erdfelder, Lang, and Buchner (2007) in the domain of correlation and regression analyses. In the new version, we have added procedures to analyze the power of tests based on (1) single-sample tetrachoric correlations, (2) comparisons of dependent correlations, (3) bivariate linear regression, (4) multiple linear regression based on the random predictor model, (5) logistic regression, and (6) Poisson regression. We describe these new features and provide a brief introduction to their scope and handling.
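For intuition, the power of one of these tests (the t-test on a bivariate regression slope) can be approximated by simulation, as a Monte Carlo analogue of the analytic computation G*Power performs:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(4)

    def power_bivariate_slope(n, slope, sigma=1.0, alpha=0.05, reps=2000):
        # Monte Carlo power of the slope t-test in bivariate linear regression.
        hits = 0
        for _ in range(reps):
            x = rng.normal(size=n)
            y = slope * x + rng.normal(scale=sigma, size=n)
            hits += stats.linregress(x, y).pvalue < alpha
        return hits / reps

    print(power_bivariate_slope(n=30, slope=0.5))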
Rasmussen, Patrick P.; Gray, John R.; Glysson, G. Douglas; Ziegler, Andrew C.
2009-01-01
In-stream continuous turbidity and streamflow data, calibrated with measured suspended-sediment concentration data, can be used to compute a time series of suspended-sediment concentration and load at a stream site. Development of a simple linear (ordinary least squares) regression model for computing suspended-sediment concentrations from instantaneous turbidity data is the first step in the computation process. If the model standard percentage error (MSPE) of the simple linear regression model meets a minimum criterion, this model should be used to compute a time series of suspended-sediment concentrations. Otherwise, a multiple linear regression model using paired instantaneous turbidity and streamflow data is developed and compared to the simple regression model. If the inclusion of the streamflow variable proves to be statistically significant and the uncertainty associated with the multiple regression model results in an improvement over that for the simple linear model, the turbidity-streamflow multiple linear regression model should be used to compute a suspended-sediment concentration time series. The computed concentration time series is subsequently used with its paired streamflow time series to compute suspended-sediment loads by standard U.S. Geological Survey techniques. Once an acceptable regression model is developed, it can be used to compute suspended-sediment concentration beyond the period of record used in model development with proper ongoing collection and analysis of calibration samples. Regression models to compute suspended-sediment concentrations are generally site specific and should never be considered static, but they represent a set period in a continually dynamic system in which additional data will help verify any change in sediment load, type, and source.
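A sketch of the model-selection logic with synthetic data, using root-mean-square error as a percentage of the mean as a crude stand-in for the report's model standard percentage error (MSPE):

    import numpy as np

    rng = np.random.default_rng(5)
    # Hypothetical paired samples: turbidity (FNU), streamflow (cfs), SSC (mg/L).
    turb = rng.uniform(5, 500, 60)
    flow = rng.uniform(50, 5000, 60)
    ssc = 1.2 * turb + 0.002 * flow + rng.normal(0, 15, 60)

    def fit(X, y):
        X1 = np.column_stack([np.ones(len(y)), X])
        beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
        pred = X1 @ beta
        mspe = 100 * np.sqrt(np.mean((y - pred) ** 2)) / y.mean()  # crude stand-in
        return beta, mspe

    beta_s, mspe_s = fit(turb, ssc)                           # simple: SSC ~ turbidity
    beta_m, mspe_m = fit(np.column_stack([turb, flow]), ssc)  # multiple: + streamflow

    # Use the simple model if its error meets the criterion; otherwise adopt the
    # turbidity-streamflow model when streamflow meaningfully improves the fit.
    print(mspe_s, mspe_m)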
Verhelst, Stefanie; Poppe, Willy A J; Bogers, Johannes J; Depuydt, Christophe E
2017-03-01
This retrospective study examined whether human papillomavirus (HPV) type-specific viral load changes measured in two or three serial cervical smears are predictive for the natural evolution of HPV infections and correlate with histological grades of cervical intraepithelial neoplasia (CIN), allowing triage of HPV-positive women. A cervical histology database was used to select consecutive women with biopsy-proven CIN in 2012 who had at least two liquid-based cytology samples before the diagnosis of CIN. Before performing cytology, 18 different quantitative PCRs allowed HPV type-specific viral load measurement. Changes in HPV-specific load between measurements were assessed by linear regression, with calculation of coefficient of determination (R) and slope. All infections could be classified into one of five categories: (i) clonal progressing process (R≥0.85; positive slope), (ii) simultaneously occurring clonal progressive and transient infection, (iii) clonal regressing process (R≥0.85; negative slope), (iv) serial transient infection with latency [R<0.85; slopes (two points) between 0.0010 and -0.0010 HPV copies/cell/day], and (v) transient productive infection (R<0.85; slope: ±0.0099 HPV copies/cell/day). Three hundred and seven women with CIN were included; 124 had single-type infections and 183 had multiple HPV types. Only with three consecutive measurements could a clonal process be identified in all CIN3 cases. We could clearly demonstrate clonal regressing lesions with a persistent linear decrease in viral load (R≥0.85; -0.003 HPV copies/cell/day) in all CIN categories. Type-specific viral load increase/decrease in three consecutive measurements enabled classification of CIN lesions in clonal HPV-driven transformation (progression/regression) and nonclonal virion-productive (serial transient/transient) processes.
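The classification rule reduces to thresholds on the regression R and slope; a simplified single-type version (ignoring the two- versus three-measurement distinction) might look like this:

    from scipy.stats import linregress

    def classify(days, copies_per_cell):
        # Classify a serial viral-load trajectory with the abstract's cut-offs.
        fit = linregress(days, copies_per_cell)
        r2, slope = fit.rvalue ** 2, fit.slope
        if r2 >= 0.85:
            return "clonal progressing" if slope > 0 else "clonal regressing"
        if abs(slope) <= 0.0010:
            return "serial transient with latency"
        return "transient productive"

    print(classify([0, 180, 360], [0.2, 1.5, 3.1]))  # hypothetical copies/cell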
Gene set analysis using variance component tests.
Huang, Yen-Tsung; Lin, Xihong
2013-06-28
Gene set analyses have become increasingly important in genomic research, as many complex diseases are contributed jointly by alterations of numerous genes. Genes often coordinate together as a functional repertoire, e.g., a biological pathway/network and are highly correlated. However, most of the existing gene set analysis methods do not fully account for the correlation among the genes. Here we propose to tackle this important feature of a gene set to improve statistical power in gene set analyses. We propose to model the effects of an independent variable, e.g., exposure/biological status (yes/no), on multiple gene expression values in a gene set using a multivariate linear regression model, where the correlation among the genes is explicitly modeled using a working covariance matrix. We develop TEGS (Test for the Effect of a Gene Set), a variance component test for the gene set effects by assuming a common distribution for regression coefficients in multivariate linear regression models, and calculate the p-values using permutation and a scaled chi-square approximation. We show using simulations that type I error is protected under different choices of working covariance matrices and power is improved as the working covariance approaches the true covariance. The global test is a special case of TEGS when correlation among genes in a gene set is ignored. Using both simulation data and a published diabetes dataset, we show that our test outperforms the commonly used approaches, the global test and gene set enrichment analysis (GSEA). We develop a gene set analyses method (TEGS) under the multivariate regression framework, which directly models the interdependence of the expression values in a gene set using a working covariance. TEGS outperforms two widely used methods, GSEA and global test in both simulation and a diabetes microarray data.
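A simplified stand-in for the TEGS statistic with an identity working covariance (the sum of squared per-gene score statistics), with a permutation p-value; the expression matrix and exposure labels are simulated:

    import numpy as np

    rng = np.random.default_rng(6)

    def gene_set_stat(X, y):
        # Sum of squared per-gene score statistics: a simplified variance-
        # component-style statistic (working covariance = identity).
        Xc = X - X.mean(axis=0)
        yc = y - y.mean()
        return float(np.sum((Xc.T @ yc) ** 2))

    # Simulated expression matrix (40 samples x 15 genes) and binary exposure.
    X = rng.normal(size=(40, 15))
    y = rng.integers(0, 2, 40).astype(float)
    X[y == 1, :5] += 0.8  # signal in a subset of genes

    obs = gene_set_stat(X, y)
    perm = [gene_set_stat(X, rng.permutation(y)) for _ in range(1999)]
    pval = (1 + sum(p >= obs for p in perm)) / (1 + len(perm))
    print(pval)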
NASA Astrophysics Data System (ADS)
Alp, E.; Yücel, Ö.; Özcan, Z.
2014-12-01
Turkey has been making many legal arrangements for sustainable water management during the harmonization process with the European Union. In order to make cost-effective and efficient decisions, the monitoring network in Turkey has been expanding. However, due to time and budget constraints, the desired number of monitoring campaigns cannot be carried out. Hence, in this study, independent parameters that can be measured easily and quickly are used to estimate water quality parameters in Lakes Mogan and Eymir using linear regression. Nonpoint sources are among the major pollutant components in Eymir and Mogan lakes. In this paper, the correlation between easily measurable parameters (DO, temperature, electrical conductivity, pH, precipitation) and dependent variables (TN, TP, COD, Chl-a, TSS, total coliform) is investigated. Simple regression analysis is performed for each season in Eymir and Mogan lakes with the SPSS statistical program, using the water quality data collected between 2006 and 2012. Regression analysis demonstrated a significant linear relationship between measured and simulated concentrations for TN (R2=0.86), TP (R2=0.85), TSS (R2=0.91), Chl-a (R2=0.94), COD (R2=0.99), and total coliform (R2=0.97), which are the best results for each season in the Eymir and Mogan lakes. The overall results of this study show that by using easily measurable parameters, even in ungauged situations, the water quality of lakes can be predicted. Moreover, the outputs obtained from the regression equations can be used as input for water quality models, such as the phosphorus budget model used to calculate the reduction in the external phosphorus load to Lake Mogan required to meet the water quality standards.
NASA Astrophysics Data System (ADS)
Tiberi, Lara; Costa, Giovanni
2017-04-01
The possibility of directly associating damage with ground motion parameters is always a great challenge, in particular for civil protection. Indeed, a ground motion parameter, estimated in near real time, that can express the damage occurring after an earthquake is fundamental for arranging first assistance after an event. The aim of this work is to contribute to the estimation of the ground motion parameter that best describes the observed intensity, immediately after an event. This can be done by calculating, for each ground motion parameter estimated in near real time, a regression law which correlates that parameter to the observed macroseismic intensity. This estimation is done by collecting high-quality accelerometric data in the near field and filtering them at different frequency steps. The regression laws are calculated using two different techniques: the non-linear least-squares (NLLS) Marquardt-Levenberg algorithm and the orthogonal distance regression (ODR) methodology. The limits of the first methodology are the need for initial values of the parameters a and b (set to 1.0 in this study), and the constraint that the independent variable must be known with greater accuracy than the dependent variable. The second algorithm, by contrast, is based on estimating the errors perpendicular to the line, rather than just vertically. Vertical errors are errors in the 'y' direction only, i.e., for the dependent variable, whereas perpendicular errors take into account errors in both variables, the dependent and the independent. This also makes it possible to invert the relation directly, so the a and b values can be used to express the ground motion parameters as a function of I. For each law, the standard deviation and R2 value are estimated in order to test the quality and reliability of the relation found. The Amatrice earthquake of 24th August 2016 is used as a case study to test the goodness of the calculated regression laws.
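The contrast between the two fitting techniques can be sketched with scipy: ordinary least squares penalizes vertical residuals only, while scipy.odr accounts for stated errors in both variables; all data below are synthetic.

    import numpy as np
    from scipy import odr
    from scipy.stats import linregress

    rng = np.random.default_rng(7)
    # Synthetic ground-motion parameter (e.g., log PGA) vs macroseismic
    # intensity, with noise in both variables.
    gmp = rng.normal(0, 0.5, 80)
    intensity = 2.0 + 1.5 * gmp + rng.normal(0, 0.3, 80)
    gmp_obs = gmp + rng.normal(0, 0.2, 80)  # independent variable also noisy

    # Ordinary least squares (vertical residuals only).
    ols = linregress(gmp_obs, intensity)

    # Orthogonal distance regression (residuals in both variables).
    model = odr.Model(lambda beta, x: beta[0] + beta[1] * x)
    data = odr.RealData(gmp_obs, intensity, sx=0.2, sy=0.3)
    out = odr.ODR(data, model, beta0=[1.0, 1.0]).run()

    print(ols.slope, out.beta[1])  # the ODR fit I = a + b*gmp inverts directly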
Ihl, R; Grass-Kapanke, B; Jänner, M; Weyer, G
1999-11-01
In clinical and drug studies, different neuropsychometric tests are used. So far, no empirical data have been published to compare studies using different tests. The purpose of this study was to calculate a regression formula allowing a comparison of cross-sectional and longitudinal data from three neuropsychometric tests that are frequently used in drug studies (Alzheimer's Disease Assessment Scale, ADAS-cog; Syndrom Kurz Test, SKT; Mini Mental State Examination, MMSE). 177 patients with dementia according to ICD-10 criteria were studied for the cross-sectional and 61 for the longitudinal analysis. Correlations and linear regressions were calculated between tests. Significance was tested with ANOVA and t-tests using the SPSS statistical package. Significant Spearman correlations and regression slopes occurred in the cross-sectional analysis (ADAS-cog-SKT r(s) = 0.77, slope = 0.45, SKT-ADAS-cog slope = 1.3, r2 = 0.59; ADAS-cog-MMSE r2 = 0.76, slope = -0.42, MMSE-ADAS-cog slope = -1.5, r2 = 0.64; MMSE-SKT r(s) = -0.79, slope = -0.87, SKT-MMSE slope = -0.71, r2 = 0.62; p<0.001 after Bonferroni correction; N = 177) and in the longitudinal analysis (SKT-ADAS-cog, r(s) = 0.48, slope = 0.69, ADAS-cog-SKT slope = 0.69, p<0.001, r2 = 0.32, MMSE-SKT, r(s) = 0.44, slope = -0.41, SKT-MMSE, slope = -0.55, p<0.001, r2 = 0.21). The results allow calculation of ADAS scores when SKT scores are given, and vice versa. In longitudinal studies or in the course of the disease, scores assessed with the ADAS-cog and the SKT may now be statistically compared. In all comparisons, bottom and ceiling effects of the tests have to be taken into account.
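Operationally, such a conversion is a pair of linear regressions on paired scores, applied in each direction; the paired values below are hypothetical, not the study's data.

    from scipy.stats import linregress

    # Hypothetical paired scores from the two scales.
    skt = [8, 11, 14, 17, 20, 23]
    adas = [12, 17, 21, 26, 31, 35]

    fwd = linregress(skt, adas)   # ADAS-cog predicted from SKT
    back = linregress(adas, skt)  # SKT predicted from ADAS-cog

    print(fwd.slope * 15 + fwd.intercept)    # ADAS-cog estimate for SKT = 15
    print(back.slope * 25 + back.intercept)  # SKT estimate for ADAS-cog = 25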
Desai, Rishi J; Solomon, Daniel H; Weinblatt, Michael E; Shadick, Nancy; Kim, Seoyoung C
2015-04-13
We conducted an external validation study to examine the correlation of a previously published claims-based index for rheumatoid arthritis severity (CIRAS) with disease activity score in 28 joints calculated by using C-reactive protein (DAS28-CRP) and the multi-dimensional health assessment questionnaire (MD-HAQ) physical function score. Patients enrolled in the Brigham and Women's Hospital Rheumatoid Arthritis Sequential Study (BRASS) and Medicare were identified and their data from these two sources were linked. For each patient, DAS28-CRP measurement and MD-HAQ physical function scores were extracted from BRASS, and CIRAS was calculated from Medicare claims for the period of 365 days prior to the DAS28-CRP measurement. Pearson correlation coefficient between CIRAS and DAS28-CRP as well as MD-HAQ physical function scores were calculated. Furthermore, we considered several additional pharmacy and medical claims-derived variables as predictors for DAS28-CRP in a multivariable linear regression model in order to assess improvement in the performance of the original CIRAS algorithm. In total, 315 patients with enrollment in both BRASS and Medicare were included in this study. The majority (81%) of the cohort was female, and the mean age was 70 years. The correlation between CIRAS and DAS28-CRP was low (Pearson correlation coefficient = 0.07, P = 0.24). The correlation between the calculated CIRAS and MD-HAQ physical function scores was also found to be low (Pearson correlation coefficient = 0.08, P = 0.17). The linear regression model containing additional claims-derived variables yielded model R(2) of 0.23, suggesting limited ability of this model to explain variation in DAS28-CRP. In a cohort of Medicare-enrolled patients with established RA, CIRAS showed low correlation with DAS28-CRP as well as MD-HAQ physical function scores. Claims-based algorithms for disease activity should be rigorously tested in distinct populations in order to establish their generalizability before widespread adoption.
Association of Dentine Hypersensitivity with Different Risk Factors – A Cross Sectional Study
Vijaya, V; Sanjay, Venkataraam; Varghese, Rana K; Ravuri, Rajyalakshmi; Agarwal, Anil
2013-01-01
Background: This study was done to assess the prevalence of dentine hypersensitivity (DH) and its associated risk factors. Materials & Methods: This epidemiological study was done among patients coming to a dental college regarding the prevalence of DH. A self-structured questionnaire along with clinical examination was used for assessment. Descriptive statistics were obtained and frequency distribution was calculated using the Chi-square test at p value <0.05. Stepwise multiple linear regression was also done to assess the frequency of DH with different factors. Results: The study population comprised 655 participants in different age groups. Our study showed a prevalence of 55%, and DH was more common among males. Similarly, smokers and those who used a hard toothbrush had more cases of DH. Stepwise multiple linear regression showed that the best predictor for DH was age, followed by smoking habit and type of toothbrush. The most aggravating factors were cold water (15.4%) and sweet foods (14.7%), whereas only 5% of the patients had it while brushing. Conclusion: A high level of dentine hypersensitivity was found in this study, more common among males. A linear relationship was shown with age, smoking, and type of toothbrush. How to cite this article: Vijaya V, Sanjay V, Varghese RK, Ravuri R, Agarwal A. Association of Dentine Hypersensitivity with Different Risk Factors – A Cross Sectional Study. J Int Oral Health 2013;5(6):88-92. PMID:24453451
Dry-heat Resistance of Bacillus Subtilis Var. Niger Spores on Mated Surfaces
NASA Technical Reports Server (NTRS)
Simko, G. J.; Devlin, J. D.; Wardle, M. D.
1971-01-01
Bacillus subtilis var. niger spores were placed on the surfaces of test coupons manufactured from typical spacecraft materials including stainless steel, magnesium, titanium, and aluminum. These coupons were then juxtaposed at the inoculated surfaces and subjected to test pressures of 0, 1000, 5000, and 10,000 psi. Tests were conducted in ambient, nitrogen, and helium atmospheres. While under the test pressure condition, the spores were exposed to 125 C for intervals of 5, 10, 20, 50, or 80 min. Survivor data were subjected to a linear regression analysis that calculated decimal reduction times.
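The decimal reduction time (D-value) follows from the slope of the survivor curve on a log scale: D = -1/slope of log10(survivors) regressed on exposure time. The counts below are hypothetical.

    import numpy as np

    # Hypothetical survivor counts at 125 C for the stated exposure intervals.
    time_min = np.array([5, 10, 20, 50, 80], dtype=float)
    survivors = np.array([8.0e5, 5.1e5, 2.0e5, 1.3e4, 6.0e2])

    slope, intercept = np.polyfit(time_min, np.log10(survivors), 1)
    D_value = -1.0 / slope
    print(f"D(125 C) ~= {D_value:.1f} min")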
NASA Astrophysics Data System (ADS)
Mahaboob, B.; Venkateswarlu, B.; Sankar, J. Ravi; Balasiddamuni, P.
2017-11-01
This paper uses matrix calculus techniques to obtain the Nonlinear Least Squares Estimator (NLSE), the Maximum Likelihood Estimator (MLE), and a linear pseudo model for the nonlinear regression model. David Pollard and Peter Radchenko [1] explained analytic techniques to compute the NLSE. The present research paper, however, introduces an innovative method to compute the NLSE using principles of multivariate calculus. This study is concerned with very new optimization techniques used to compute the MLE and NLSE. Anh [2] derived the NLSE and MLE of a heteroscedastic regression model. Lemcoff [3] discussed a procedure to obtain a linear pseudo model for a nonlinear regression model. In this research article a new technique is developed to obtain the linear pseudo model for the nonlinear regression model using multivariate calculus. The linear pseudo model of Edmond Malinvaud [4] is explained in a very different way in this paper. David Pollard et al. used empirical process techniques to study the asymptotics of the least-squares estimator (LSE) for fitting nonlinear regression functions in 2006. In Jae Myung [13] provided a conceptual introduction to maximum likelihood estimation in his work "Tutorial on maximum likelihood estimation".
A method for fitting regression splines with varying polynomial order in the linear mixed model.
Edwards, Lloyd J; Stewart, Paul W; MacDougall, James E; Helms, Ronald W
2006-02-15
The linear mixed model has become a widely used tool for longitudinal analysis of continuous variables. The use of regression splines in these models offers the analyst additional flexibility in the formulation of descriptive analyses, exploratory analyses and hypothesis-driven confirmatory analyses. We propose a method for fitting piecewise polynomial regression splines with varying polynomial order in the fixed effects and/or random effects of the linear mixed model. The polynomial segments are explicitly constrained by side conditions for continuity and some smoothness at the points where they join. By using a reparameterization of this explicitly constrained linear mixed model, an implicitly constrained linear mixed model is constructed that simplifies implementation of fixed-knot regression splines. The proposed approach is relatively simple, handles splines in one variable or multiple variables, and can be easily programmed using existing commercial software such as SAS or S-plus. The method is illustrated using two examples: an analysis of longitudinal viral load data from a study of subjects with acute HIV-1 infection and an analysis of 24-hour ambulatory blood pressure profiles.
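For a fixed polynomial order (the varying-order and random-effects machinery of the paper is omitted here), the fixed-knot truncated power basis that the reparameterization simplifies can be built directly:

    import numpy as np

    def truncated_power_basis(x, knots, order=3):
        # Design matrix for a fixed-knot regression spline: global polynomial
        # terms 1, x, ..., x^p plus one truncated power term (x - k)_+^p per
        # knot, which enforces continuity and smoothness at the join points.
        cols = [x ** d for d in range(order + 1)]
        cols += [np.clip(x - k, 0, None) ** order for k in knots]
        return np.column_stack(cols)

    rng = np.random.default_rng(8)
    x = np.linspace(0, 10, 200)
    y = np.sin(x) + rng.normal(0, 0.2, x.size)

    X = truncated_power_basis(x, knots=[2.5, 5.0, 7.5])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta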
Astronaut mass measurement using linear acceleration method and the effect of body non-rigidity
NASA Astrophysics Data System (ADS)
Yan, Hui; Li, LuMing; Hu, ChunHua; Chen, Hao; Hao, HongWei
2011-04-01
Astronaut body mass is an essential element of health monitoring in space. The latest mass measurement device for the International Space Station (ISS) employs a linear acceleration method. The principle of this method is that the device generates a constant pulling force, and the astronaut is accelerated on a parallelogram motion guide that rotates at a large radius to achieve a nearly linear trajectory. The acceleration is calculated by regression analysis of the displacement-versus-time trajectory, and the body mass is calculated using the formula m = F/a. In actual flight, however, the device is unstable: the deviation between runs could be 6-7 kg. This paper considers body non-rigidity as the major cause of error and instability and analyzes its effects from different aspects. Body non-rigidity makes the acceleration of the center of mass (C.M.) oscillate and lag behind the point where the force is applied. Actual acceleration curves showed that the overall effect of body non-rigidity is an oscillation at about 7 Hz and a deviation of about 25%. To enhance body rigidity, better body restraints were introduced and a prototype based on the linear acceleration method was built. A measurement experiment was carried out on the ground on an air table. Three human subjects weighing 60-70 kg were measured. The average variance was 0.04 kg and the average measurement error was 0.4%. This study provides a reference for the future development of China's own mass measurement device.
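A small sketch of the linear acceleration principle summarized above: fit the displacement-time trajectory with a quadratic, read the acceleration off the leading coefficient, and compute m = F/a. The force, noise level, and trajectory below are assumed values for illustration, not parameters of the actual device.

```python
import numpy as np

rng = np.random.default_rng(0)
F = 30.0        # constant pulling force, N (assumed)
true_m = 65.0   # kg, used only to synthesize the trajectory

t = np.linspace(0.0, 1.0, 50)                                   # s
d = 0.5 * (F / true_m) * t**2 + rng.normal(0.0, 1e-4, t.size)   # displacement, m

# d(t) = 0.5*a*t^2 + v0*t + d0, so a quadratic regression recovers a.
c2, c1, c0 = np.polyfit(t, d, 2)
a = 2.0 * c2
print(f"estimated mass: {F / a:.2f} kg")
```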
GIS Tools to Estimate Average Annual Daily Traffic
DOT National Transportation Integrated Search
2012-06-01
This project presents five tools that were created for a geographical information system to estimate Annual Average Daily Traffic using linear regression. Three of the tools can be used to prepare spatial data for linear regression. One tool can be...
SU-F-T-130: [18F]-FDG Uptake Dose Response in Lung Correlates Linearly with Proton Therapy Dose
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kim, D; Titt, U; Mirkovic, D
2016-06-15
Purpose: Analysis of clinical outcomes in lung cancer patients treated with protons, using 18F-FDG uptake in lung as a measure of dose response. Methods: A test-case lung cancer patient was selected in an unbiased way. The patient's treatment-planning and post-treatment positron emission tomography (PET) studies were collected from the picture archiving and communication system at the UT M.D. Anderson Cancer Center. The average computerized tomography scan was registered with the post-treatment PET/CT through both rigid and deformable registrations for a selected region of interest (ROI) via VelocityAI imaging informatics software. For the voxels in the ROI, a system that extracts the Standard Uptake Value (SUV) from PET was developed, and the corresponding relative biological effectiveness (RBE)-weighted (both variable and constant) dose was computed using Monte Carlo (MC) methods. The treatment planning system (TPS) dose was also obtained. Using histogram analysis, the voxel-averaged normalized SUV versus the 3 different doses was obtained and a linear regression fit was performed. Results: The registration process produced significant artifacts in some regions near the diaphragm and heart, which yielded poor r-squared values when the linear regression fit was performed on normalized SUV vs. dose. Excluding these values, the TPS fit yielded a mean r-squared value of 0.79 (range 0.61-0.95), the constant-RBE fit yielded 0.79 (range 0.52-0.94), and the variable-RBE fit yielded 0.80 (range 0.52-0.94). Conclusion: A system that extracts SUV from PET to correlate normalized SUV with various dose calculations was developed. A linear relation between normalized SUV and all three doses was found.
SOCR Analyses - an Instructional Java Web-based Statistical Analysis Toolkit.
Chu, Annie; Cui, Jenny; Dinov, Ivo D
2009-03-01
The Statistical Online Computational Resource (SOCR) designs web-based tools for educational use in a variety of undergraduate courses (Dinov 2006). Several studies have demonstrated that these resources significantly improve students' motivation and learning experiences (Dinov et al. 2008). SOCR Analyses is a new component that concentrates on data modeling and analysis using parametric and non-parametric techniques supported with graphical model diagnostics. Currently implemented analyses include commonly used models in undergraduate statistics courses like linear models (Simple Linear Regression, Multiple Linear Regression, One-Way and Two-Way ANOVA). In addition, we implemented tests for sample comparisons, such as the t-test in the parametric category, and the Wilcoxon rank sum test, Kruskal-Wallis test, and Friedman's test in the non-parametric category. SOCR Analyses also includes several hypothesis test models, such as contingency tables, Friedman's test and Fisher's exact test. The code itself is open source (http://socr.googlecode.com/), hoping to contribute to the efforts of the statistical computing community. The code includes functionality for each specific analysis model and it has general utilities that can be applied in various statistical computing tasks. For example, concrete methods with API (Application Programming Interface) have been implemented in statistical summary, least-squares solutions of general linear models, rank calculations, etc. HTML interfaces, tutorials, source code, activities, and data are freely available via the web (www.SOCR.ucla.edu). Code examples for developers and demos for educators are provided on the SOCR Wiki website. In this article, the pedagogical utilization of the SOCR Analyses is discussed, as well as the underlying design framework. As the SOCR project is on-going and more functions and tools are being added to it, these resources are constantly improved. The reader is strongly encouraged to check the SOCR site for the most updated information and newly added models.
Jose F. Negron; Willis C. Schaupp; Kenneth E. Gibson; John Anhold; Dawn Hansen; Ralph Thier; Phil Mocettini
1999-01-01
Data collected from Douglas-fir stands infected by the Douglas-fir beetle in Wyoming, Montana, Idaho, and Utah, were used to develop models to estimate amount of mortality in terms of basal area killed. Models were built using stepwise linear regression and regression tree approaches. Linear regression models using initial Douglas-fir basal area were built for all...
Ling, Ru; Liu, Jiawang
2011-12-01
To construct prediction models for the health workforce and hospital beds in county hospitals of Hunan by multiple linear regression, we surveyed 16 counties in Hunan with stratified random sampling using uniform questionnaires, and performed multiple linear regression analysis with 20 indicators selected by literature review. Independent variables in the multiple linear regression model on medical personnel in county hospitals included the counties' urban residents' income, crude death rate, medical beds, business occupancy, professional equipment value, the number of devices valued above 10,000 yuan, fixed assets, long-term debt, medical income, medical expenses, outpatient and emergency visits, hospital visits, actual available bed-days, and utilization rate of hospital beds. Independent variables in the multiple linear regression model on county hospital beds included the population aged 65 and above in the counties, disposable income of urban residents, medical personnel of medical institutions in the county area, business occupancy, the total value of professional equipment, fixed assets, long-term debt, medical income, medical expenses, outpatient and emergency visits, hospital visits, actual available bed-days, utilization rate of hospital beds, and length of hospitalization. The prediction models show good explanatory power and fit, and may be used for short- and mid-term forecasting.
Motulsky, Harvey J; Brown, Ronald E
2006-01-01
Background Nonlinear regression, like linear regression, assumes that the scatter of data around the ideal curve follows a Gaussian or normal distribution. This assumption leads to the familiar goal of regression: to minimize the sum of the squares of the vertical or Y-value distances between the points and the curve. Outliers can dominate the sum-of-the-squares calculation, and lead to misleading results. However, we know of no practical method for routinely identifying outliers when fitting curves with nonlinear regression. Results We describe a new method for identifying outliers when fitting data with nonlinear regression. We first fit the data using a robust form of nonlinear regression, based on the assumption that scatter follows a Lorentzian distribution. We devised a new adaptive method that gradually becomes more robust as the method proceeds. To define outliers, we adapted the false discovery rate approach to handling multiple comparisons. We then remove the outliers, and analyze the data using ordinary least-squares regression. Because the method combines robust regression and outlier removal, we call it the ROUT method. When analyzing simulated data, where all scatter is Gaussian, our method detects (falsely) one or more outliers in only about 1–3% of experiments. When analyzing data contaminated with one or several outliers, the ROUT method performs well at outlier identification, with an average False Discovery Rate less than 1%. Conclusion Our method, which combines a new method of robust nonlinear regression with a new method of outlier identification, identifies outliers from nonlinear curve fits with reasonable power and few false positives. PMID:16526949
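The following sketch captures the spirit of the ROUT workflow on synthetic one-site binding data: a robust nonlinear fit with a Cauchy loss (the least-squares analogue of Lorentzian scatter), a crude 3-sigma residual threshold standing in for the paper's false-discovery-rate test, and an ordinary least-squares refit on the retained points. It is not the authors' exact adaptive algorithm.

```python
import numpy as np
from scipy.optimize import least_squares, curve_fit

def model(x, bmax, kd):
    return bmax * x / (kd + x)

rng = np.random.default_rng(1)
x = np.linspace(0.1, 10.0, 40)
y = model(x, 100.0, 2.0) + rng.normal(0.0, 3.0, x.size)
y[[5, 30]] += 40.0  # two gross outliers

# 1) Robust fit: Cauchy loss, the least-squares analogue of Lorentzian scatter.
fit = least_squares(lambda p: model(x, *p) - y, x0=[50.0, 1.0], loss="cauchy")
resid = y - model(x, *fit.x)
scale = 1.4826 * np.median(np.abs(resid))  # robust sigma via the MAD

# 2) Flag outliers (crude threshold in place of the paper's FDR test).
keep = np.abs(resid) < 3.0 * scale

# 3) Ordinary least-squares refit on the retained points.
popt, _ = curve_fit(model, x[keep], y[keep], p0=fit.x)
print("outliers removed:", int((~keep).sum()), "| Bmax, Kd =", np.round(popt, 2))
```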
NASA Astrophysics Data System (ADS)
Ahamed, A.; Snyder, N. P.; David, G. C.
2014-12-01
The Reservoir Sedimentation Database (ResSed), a catalogue of reservoirs and depositional data that has recently become publicly available, allows for rapid calculation of sedimentation rates and rates of capacity loss over short (annual to decadal) timescales. This study is a statistical investigation of factors controlling watershed average erosion rates (E) in eastern United States watersheds. We develop an ArcGIS-based model that delineates watersheds upstream of ResSed dams and calculate drainage areas to determine E for 191 eastern US watersheds. Geomorphic, geologic, regional, climatic, and land use variables are quantified within study watersheds using GIS. Erosion rates exhibit a large amount of scatter, ranging from 0.001 to 1.25 mm/yr. A weak inverse power law relationship between drainage area (A) and E (R2 = 0.09) is evident, similar to other studies (e.g. Milliman and Syvitski, 1992; Koppes and Montgomery, 2009). Linear regressions reveal no relationship between mean watershed slope (S) and E, possibly due to the relatively low relief of the region (mean S for all watersheds is 6°). Analysis of Variance shows that watersheds in formerly glaciated regions exhibit a statistically significant lower mean E (0.06 mm/year) than watersheds in unglaciated regions (0.12 mm/year), but that watersheds with different dam purposes show no significant differences in mean E. Linear regressions reveal no relationships between E and land use parameters like percent agricultural land and percent impervious surfaces (I), but classification and regression trees indicate that watersheds in highly developed regions (I > 34%) exhibit mean E (0.36 mm/year) that is four times higher than watersheds in less developed (I < 34%) regions (0.09 mm/year). Further, interactions between land use variables emerge in formerly glaciated regions, where increased agricultural land results in higher rates of annual capacity loss in reservoirs (R2 = 0.56). Plots of E versus timescale of measurement (e.g., Sadler and Jerolmack, 2014) show that nearly the full range of observed E, including the highest values, is seen over short survey intervals (< 20 years), suggesting that whether or not large sedimentation events (such as floods) occur between two surveys may explain the high degree of variability in measured rates.
Watanabe, Hiroyuki; Miyazaki, Hiroyasu
2006-01-01
Over- and/or under-correction of QT intervals for changes in heart rate may lead to misleading conclusions and/or masking the potential of a drug to prolong the QT interval. This study examines a nonparametric regression model (Loess Smoother) to adjust the QT interval for differences in heart rate, with an improved fit over a wide range of heart rates. 240 sets of (QT, RR) observations collected from each of 8 conscious and non-treated beagle dogs were used as the materials for investigation. The fit of the nonparametric regression model to the QT-RR relationship was compared with four models (individual linear regression, common linear regression, and Bazett's and Fridericia's correction models) with reference to Akaike's Information Criterion (AIC). Residuals were visually assessed. The bias-corrected AIC of the nonparametric regression model was the best of the models examined in this study. Although the parametric models did not fit, the nonparametric regression model improved the fitting at both fast and slow heart rates. The nonparametric regression model is the more flexible method compared with the parametric methods. The mathematical fit for the linear regression models was unsatisfactory at both fast and slow heart rates, while the nonparametric regression model showed significant improvement at all heart rates in beagle dogs.
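A brief sketch of the comparison above, assuming synthetic QT-RR pairs: a lowess (loess) smoother fitted to the QT-RR relationship alongside Bazett's parametric correction. The smoothing fraction is an arbitrary choice for illustration, not the study's setting.

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

rng = np.random.default_rng(2)
rr = rng.uniform(0.3, 1.2, 240)                        # RR intervals, s
qt = 0.25 * rr**0.4 + rng.normal(0.0, 0.01, rr.size)   # QT, s (synthetic curve)

# Nonparametric fit: returns (RR, fitted QT) pairs sorted by RR.
smoothed = lowess(qt, rr, frac=0.5)

# Bazett's parametric correction for comparison: QTc = QT / sqrt(RR).
qtc_bazett = qt / np.sqrt(rr)

print(smoothed[:3])
print(f"mean Bazett QTc: {qtc_bazett.mean():.3f} s")
```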
Yuan, XiaoDong; Tang, Wei; Shi, WenWei; Yu, Libao; Zhang, Jing; Yuan, Qing; You, Shan; Wu, Ning; Ao, Guokun; Ma, Tingting
2018-07-01
To develop a convenient and rapid single-kidney CT-GFR technique. One hundred and twelve patients referred for multiphasic renal CT and 99mTc-DTPA renal dynamic imaging Gates-GFR measurement were prospectively included and randomly divided into two groups of 56 patients each: the training group and the validation group. On the basis of the nephrographic phase images, the fractional renal accumulation (FRA) was calculated and correlated with the Gates-GFR in the training group. From this correlation a formula was derived for single-kidney CT-GFR calculation, which was validated by a paired t test and linear regression analysis with the single-kidney Gates-GFR in the validation group. In the training group, the FRA (x-axis) correlated well (r = 0.95, p < 0.001) with single-kidney Gates-GFR (y-axis), producing a regression equation of y = 1665x + 1.5 for single-kidney CT-GFR calculation. In the validation group, the difference between the methods of single-kidney GFR measurements was 0.38 ± 5.57 mL/min (p = 0.471); the regression line is identical to the diagonal (intercept = 0 and slope = 1) (p = 0.727 and p = 0.473, respectively), with a standard deviation of residuals of 5.56 mL/min. A convenient and rapid single-kidney CT-GFR technique was presented and validated in this investigation. • The new CT-GFR method takes about 2.5 min of patient time. • The CT-GFR method demonstrated identical results to the Gates-GFR method. • The CT-GFR method is based on the fractional renal accumulation of iodinated CM. • The CT-GFR method is achieved without additional radiation dose to the patient.
Grajeda, Laura M; Ivanescu, Andrada; Saito, Mayuko; Crainiceanu, Ciprian; Jaganath, Devan; Gilman, Robert H; Crabtree, Jean E; Kelleher, Dermott; Cabrera, Lilia; Cama, Vitaliano; Checkley, William
2016-01-01
Childhood growth is a cornerstone of pediatric research. Statistical models need to consider individual trajectories to adequately describe growth outcomes. Specifically, well-defined longitudinal models are essential to characterize both population and subject-specific growth. Linear mixed-effects models with cubic regression splines can account for the nonlinearity of growth curves and provide reasonable estimators of population and subject-specific growth, velocity and acceleration. We provide a stepwise approach that builds from simple to complex models and accounts for the intrinsic complexity of the data. We start with standard cubic splines regression models and build up to a model that includes subject-specific random intercepts and slopes and residual autocorrelation. We then compared cubic regression splines with linear piecewise splines, and with varying numbers and positions of knots. Statistical code is provided to ensure reproducibility and improve dissemination of methods. Models are applied to longitudinal height measurements in a cohort of 215 Peruvian children followed from birth until their fourth year of life. Unexplained variability, as measured by the variance of the regression model, was reduced from 7.34 when using ordinary least squares to 0.81 (p < 0.001) when using a linear mixed-effects model with random slopes and a first-order continuous autoregressive error term. There was substantial heterogeneity in both the intercepts (p < 0.001) and slopes (p < 0.001) of the individual growth trajectories. We also identified important serial correlation within the structure of the data (ρ = 0.66; 95% CI 0.64 to 0.68; p < 0.001), which we modeled with a first-order continuous autoregressive error term as evidenced by the variogram of the residuals and by a lack of association among residuals. The final model provides a parametric linear regression equation for both estimation and prediction of population- and individual-level growth in height. We show that cubic regression splines are superior to linear regression splines for the case of a small number of knots in both estimation and prediction with the full linear mixed-effects model (AIC 19,352 vs. 19,598, respectively). While the regression parameters are more complex to interpret in the former, we argue that inference for any problem depends more on the estimated curve or differences in curves than on the coefficients. Moreover, use of cubic regression splines provides biologically meaningful growth velocity and acceleration curves despite increased complexity in coefficient interpretation. Through this stepwise approach, we provide a set of tools to model longitudinal childhood data for non-statisticians using linear mixed-effects models.
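A compact sketch of the modeling approach, assuming synthetic longitudinal heights: a linear mixed-effects model with a cubic B-spline basis for age and subject-specific random intercepts and slopes, here via statsmodels. The paper's continuous first-order autoregressive residual term is beyond this sketch.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic heights for 30 children measured 8 times from birth to age 4.
rng = np.random.default_rng(3)
ids = np.repeat(np.arange(30), 8)
age = np.tile(np.linspace(0.1, 4.0, 8), 30)
child_effect = rng.normal(0.0, 2.0, 30)[ids]
height = 50.0 + 18.0 * np.log1p(age) + child_effect + rng.normal(0.0, 0.8, ids.size)
df = pd.DataFrame({"id": ids, "age": age, "height": height})

# Cubic B-spline basis for age (fixed effects) with random intercepts and
# slopes per child; no continuous AR(1) residual term in this sketch.
model = smf.mixedlm("height ~ bs(age, df=4)", df, groups="id", re_formula="~age")
result = model.fit()
print(result.summary())
```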
Prediction of monthly rainfall in Victoria, Australia: Clusterwise linear regression approach
NASA Astrophysics Data System (ADS)
Bagirov, Adil M.; Mahmood, Arshad; Barton, Andrew
2017-05-01
This paper develops the Clusterwise Linear Regression (CLR) technique for prediction of monthly rainfall. CLR is a combination of clustering and regression techniques. It is formulated as an optimization problem, and an incremental algorithm is designed to solve it. The algorithm is applied to predict monthly rainfall in Victoria, Australia, using rainfall data with five input meteorological variables over the period 1889-2014 from eight geographically diverse weather stations. The prediction performance of the CLR method is evaluated by comparing observed and predicted rainfall values using four measures of forecast accuracy. Based on computational results, the proposed method is also compared with CLR under the maximum likelihood framework via the expectation-maximization algorithm, multiple linear regression, artificial neural networks, and support vector machines for regression. The results demonstrate that the proposed algorithm outperforms the other methods in most locations.
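To make the CLR idea concrete, here is a simple alternating heuristic on synthetic two-regime data: assign each observation to the regression line that fits it best, refit each line by least squares, and repeat. This illustrates the clustering-plus-regression combination only; it is not the paper's incremental optimization algorithm.

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.uniform(0.0, 10.0, 200)
# Two hidden regimes with different linear responses (synthetic data).
y = np.where(x < 5.0, 2.0 * x + 1.0, -1.0 * x + 20.0) + rng.normal(0.0, 0.5, x.size)

k = 2
coefs = [rng.normal(size=2) for _ in range(k)]  # (slope, intercept) per cluster

for _ in range(20):
    # Assignment step: each point joins the line with the smallest residual.
    sq_resid = np.stack([(y - (c[0] * x + c[1]))**2 for c in coefs])
    labels = sq_resid.argmin(axis=0)
    # Refit step: ordinary least squares within each cluster.
    for j in range(k):
        if np.any(labels == j):
            coefs[j] = np.polyfit(x[labels == j], y[labels == j], 1)

for j, c in enumerate(coefs):
    print(f"cluster {j}: y = {c[0]:.2f} x + {c[1]:.2f}")
```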
Regression Model Term Selection for the Analysis of Strain-Gage Balance Calibration Data
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert Manfred; Volden, Thomas R.
2010-01-01
The paper discusses the selection of regression model terms for the analysis of wind tunnel strain-gage balance calibration data. Different function class combinations are presented that may be used to analyze calibration data using either a non-iterative or an iterative method. The role of the intercept term in a regression model of calibration data is reviewed. In addition, useful algorithms and metrics originating from linear algebra and statistics are recommended that will help an analyst (i) to identify and avoid both linear and near-linear dependencies between regression model terms and (ii) to make sure that the selected regression model of the calibration data uses only statistically significant terms. Three different tests are suggested that may be used to objectively assess the predictive capability of the final regression model of the calibration data. These tests use both the original data points and regression model independent confirmation points. Finally, data from a simplified manual calibration of the Ames MK40 balance is used to illustrate the application of some of the metrics and tests to a realistic calibration data set.
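Two of the recommended linear-algebra checks can be sketched as follows, using an illustrative model matrix rather than balance calibration data: the condition number of the column-scaled matrix and a variance inflation factor for each term, both of which flag linear and near-linear dependencies between regression model terms.

```python
import numpy as np

rng = np.random.default_rng(5)
f1 = rng.normal(size=100)
f2 = rng.normal(size=100)
# Last column is nearly a linear combination of the first two.
X = np.column_stack([f1, f2, f1 * f2, f1 + 0.999 * f2])

# Condition number of the column-scaled model matrix.
Xs = X / np.linalg.norm(X, axis=0)
print(f"condition number: {np.linalg.cond(Xs):.1f}")

# Variance inflation factor for each term: VIF_j = 1 / (1 - R_j^2), where
# R_j^2 comes from regressing term j on all remaining terms.
for j in range(X.shape[1]):
    others = np.delete(X, j, axis=1)
    beta, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
    resid = X[:, j] - others @ beta
    r2 = 1.0 - (resid**2).sum() / ((X[:, j] - X[:, j].mean())**2).sum()
    print(f"term {j}: VIF = {1.0 / (1.0 - r2):.1f}")
```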
Three-parameter modeling of the soil sorption of acetanilide and triazine herbicide derivatives.
Freitas, Mirlaine R; Matias, Stella V B G; Macedo, Renato L G; Freitas, Matheus P; Venturin, Nelson
2014-02-01
Herbicides vary widely in toxicity and many of them are persistent soil contaminants. The acetanilide and triazine families of herbicides are in widespread use, but interest in developing new herbicides has been rising, to increase their effectiveness and to diminish environmental hazards. The environmental risk of new herbicides can be assessed by estimating their soil sorption (logKoc), which is usually correlated with the octanol/water partition coefficient (logKow). However, earlier findings have shown that this correlation is not valid for some acetanilide and triazine herbicides. Thus, easily accessible quantitative structure-property relationship models are required to predict the logKoc of analogues of these compounds. The octanol/water partition coefficient, molecular weight, and molecular volume were calculated and then regressed against logKoc for two series of acetanilide and triazine herbicides using multiple linear regression, resulting in predictive and validated models.
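A sketch of the three-parameter multiple linear regression described above, with a synthetic descriptor table standing in for the herbicide series:

```python
import numpy as np

# Columns: logKow, molecular weight (g/mol), molecular volume (A^3);
# rows: hypothetical herbicide analogues (synthetic values).
X = np.array([[2.3, 270.0, 240.0],
              [3.1, 284.0, 255.0],
              [1.9, 226.0, 198.0],
              [2.8, 301.0, 270.0],
              [3.4, 315.0, 286.0]])
log_koc = np.array([2.1, 2.6, 1.8, 2.4, 2.9])

A = np.column_stack([np.ones(len(X)), X])  # design matrix with intercept
coef, *_ = np.linalg.lstsq(A, log_koc, rcond=None)
pred = A @ coef
r2 = 1.0 - ((log_koc - pred)**2).sum() / ((log_koc - log_koc.mean())**2).sum()
print("intercept and coefficients:", np.round(coef, 3), "| R^2 =", round(r2, 3))
```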
Max dD/dt: A Novel Parameter to Assess Fetal Cardiac Contractility and a Substitute for Max dP/dt.
Fujita, Yasuyuki; Kiyokoba, Ryo; Yumoto, Yasuo; Kato, Kiyoko
2018-07-01
Aortic pulse waveforms are composed of a forward wave from the heart and a reflection wave from the periphery. We focused on this forward wave and suggested a new parameter, the maximum slope of aortic pulse waveforms (max dD/dt), for fetal cardiac contractility. Max dD/dt was calculated from fetal aortic pulse waveforms recorded with an echo-tracking system. A normal range of max dD/dt was constructed in 105 healthy fetuses using linear regression analysis. Twenty-two fetuses with suspected fetal cardiac dysfunction were divided into normal and decreased max dD/dt groups, and their clinical parameters were compared. Max dD/dt of aortic pulse waveforms increased linearly with advancing gestational age (r = 0.93). The decreased max dD/dt was associated with abnormal cardiotocography findings and short- and long-term prognosis. In conclusion, max dD/dt calculated from the aortic pulse waveforms in fetuses can substitute for max dP/dt, an index of cardiac contractility in adults. Copyright © 2018 World Federation for Ultrasound in Medicine and Biology. Published by Elsevier Inc. All rights reserved.
Image quality and absorbed dose comparison of single- and dual-source cone-beam computed tomography.
Miura, Hideharu; Ozawa, Shuichi; Okazue, Toshiya; Kawakubo, Atsushi; Yamada, Kiyoshi; Nagata, Yasushi
2018-05-01
Dual-source cone-beam computed tomography (DCBCT) is currently available in the Vero4DRT image-guided radiotherapy system. We evaluated the image quality and absorbed dose for DCBCT and compared the values with those for single-source CBCT (SCBCT). Image uniformity, Hounsfield unit (HU) linearity, image contrast, and spatial resolution were evaluated using a Catphan phantom. The rotation angle for acquiring SCBCT and DCBCT images is 215° and 115°, respectively. The image uniformity was calculated using measurements obtained at the center and four peripheral positions. The HUs of seven materials inserted into the phantom were measured to evaluate HU linearity and image contrast. The Catphan phantom was scanned with a conventional CT scanner to measure the reference HU for each material. The spatial resolution was calculated using high-resolution pattern modules. Image quality was analyzed using ImageJ software ver. 1.49. The absorbed dose was measured using a 0.6-cm³ ionization chamber with a 16-cm-diameter cylindrical phantom, at the center and four peripheral positions of the phantom, and calculated as the weighted cone-beam CT dose index (CBCTDIw). Compared with that of SCBCT, the image uniformity of DCBCT was slightly reduced. A strong linear correlation existed between the measured HU for DCBCT and the reference HU, although the linear regression slope was different from that of the reference HU. DCBCT had poorer image contrast than did SCBCT, particularly with a high-contrast material. There was no significant difference between the spatial resolutions of SCBCT and DCBCT. The absorbed dose for DCBCT was higher than that for SCBCT, because in DCBCT, the two x-ray projections overlap between 45° and 70°. We found that the image quality was poorer and the absorbed dose was higher for DCBCT than for SCBCT in the Vero4DRT. © 2018 The Authors. Journal of Applied Clinical Medical Physics published by Wiley Periodicals, Inc. on behalf of American Association of Physicists in Medicine.
Jaworski, N W; Liu, D W; Li, D F; Stein, H H
2016-07-01
An experiment was conducted to determine the effects on DE, ME, and NE for growing pigs of adding 15 or 30% wheat bran to a corn-soybean meal diet and to compare values for DE, ME, and NE calculated using the difference procedure with values obtained using linear regression. Eighteen barrows (54.4 ± 4.3 kg initial BW) were individually housed in metabolism crates. The experiment had 3 diets and 6 replicate pigs per diet. The control diet contained corn, soybean meal, and no wheat bran. Two additional diets were formulated by mixing 15 or 30% wheat bran with 85 or 70% of the control diet, respectively. The experimental period lasted 15 d. During the initial 7 d, pigs were adapted to their experimental diets, housed in metabolism crates, and fed 573 kcal ME/kg BW per day. On d 8, the metabolism crates with the pigs were moved into open-circuit respiration chambers for measurement of O2 consumption and CO2 and CH4 production. The feeding level was the same as in the adaptation period, and feces and urine were collected during this period. On d 13 and 14, pigs were fed 225 kcal ME/kg BW per day, and pigs were then fasted for 24 h to obtain fasting heat production. Results of the experiment indicated that the apparent total tract digestibility of DM, GE, crude fiber, ADF, and NDF linearly decreased (P ≤ 0.05) as wheat bran inclusion increased in the diets. The daily O2 consumption and CO2 and CH4 production by pigs fed increasing concentrations of wheat bran linearly decreased (P ≤ 0.05), resulting in a linear decrease (P ≤ 0.05) in heat production. The DE (3,454, 3,257, and 3,161 kcal/kg), ME (3,400, 3,209, and 3,091 kcal/kg), and NE (1,808, 1,575, and 1,458 kcal/kg) of diets containing 0, 15, and 30% wheat bran, respectively, decreased (linear, P ≤ 0.05) as wheat bran inclusion increased. The DE, ME, and NE of wheat bran determined using the difference procedure were 2,168, 2,117, and 896 kcal/kg, respectively, and these values were within the 95% confidence interval of the DE (2,285 kcal/kg), ME (2,217 kcal/kg), and NE (961 kcal/kg) estimated by linear regression. In conclusion, increasing the inclusion of wheat bran in a corn-soybean meal-based diet reduced energy and nutrient digestibility and heat production as well as the DE, ME, and NE of the diets, but values for DE, ME, and NE of wheat bran determined using the difference procedure were not different from values determined using linear regression.
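The regression method can be sketched on a simplified diet-concentration basis: DE_diet(x) = (1 - x)·DE_basal + x·DE_bran is linear in the inclusion fraction x, so extrapolating the fitted line to x = 1 estimates the DE of wheat bran. The published estimates were regressed on intakes rather than concentrations, so this simplified sketch will not reproduce the 2,285 kcal/kg value exactly.

```python
import numpy as np

x = np.array([0.00, 0.15, 0.30])               # wheat bran inclusion fraction
de_diet = np.array([3454.0, 3257.0, 3161.0])   # diet DE, kcal/kg (from the abstract)

slope, intercept = np.polyfit(x, de_diet, 1)
de_bran = intercept + slope * 1.0              # extrapolate to 100% wheat bran
print(f"concentration-basis estimate of wheat bran DE ~ {de_bran:.0f} kcal/kg")
```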
A prediction model for lift-fan simulator performance. M.S. Thesis - Cleveland State Univ.
NASA Technical Reports Server (NTRS)
Yuska, J. A.
1972-01-01
The performance characteristics of a model VTOL lift-fan simulator installed in a two-dimensional wing are presented. The lift-fan simulator consisted of a 15-inch diameter fan driven by a turbine contained in the fan hub. The performance of the lift-fan simulator was measured in two ways: (1) the calculated momentum thrust of the fan and turbine (total thrust loading), and (2) the axial-force measured on a load cell force balance (axial-force loading). Tests were conducted over a wide range of crossflow velocities, corrected tip speeds, and wing angle of attack. A prediction modeling technique was developed to help in analyzing the performance characteristics of lift-fan simulators. A multiple linear regression analysis technique is presented which calculates prediction model equations for the dependent variables.
Acidity in DMSO from the embedded cluster integral equation quantum solvation model.
Heil, Jochen; Tomazic, Daniel; Egbers, Simon; Kast, Stefan M
2014-04-01
The embedded cluster reference interaction site model (EC-RISM) is applied to the prediction of acidity constants of organic molecules in dimethyl sulfoxide (DMSO) solution. EC-RISM is based on a self-consistent treatment of the solute's electronic structure and the solvent's structure by coupling quantum-chemical calculations with three-dimensional (3D) RISM integral equation theory. We compare available DMSO force fields with reference calculations obtained using the polarizable continuum model (PCM). The results are evaluated statistically using two different approaches to eliminating the proton contribution: a linear regression model and an analysis of pKa shifts for compound pairs. Suitable levels of theory for the integral equation methodology are benchmarked. The results are further analyzed and illustrated by visualizing solvent site distribution functions and comparing them with an aqueous environment.
Guo, Zhi-Jun; Lin, Qiang; Liu, Hai-Tao; Lu, Jun-Ying; Zeng, Yan-Hong; Meng, Fan-Jie; Cao, Bin; Zi, Xue-Rong; Han, Shu-Ming; Zhang, Yu-Huan
2013-09-01
Using computed tomography (CT) to rapidly and accurately quantify pleural effusion volume benefits medical and scientific research. However, precise measurement of pleural effusion volume still involves many challenges, and there is currently no recognized accurate measuring method. To explore the feasibility of using 64-slice CT volume-rendering technology to accurately measure pleural fluid volume, and to analyze the correlation between the volume of a free pleural effusion and its different diameters. The 64-slice CT volume-rendering technique was used for measurement and analysis in three parts. First, the fluid volume of a self-made thoracic model was measured and compared with the actual injected volume. Second, the pleural effusion volume was measured before and after pleural fluid drainage in 25 patients, and the volume reduction was compared with the actual volume of the liquid extracted. Finally, the free pleural effusion volume was measured in 26 patients to analyze the correlation between it and the diameters of the effusion, which was then used to calculate the regression equation. When the fluid volume of the self-made thoracic model measured by the 64-slice CT volume-rendering technique was compared with the actual injected volume, no significant differences were found (P = 0.836). For the 25 patients with drained pleural effusions, the comparison of the volume reduction with the actual volume of the liquid extracted revealed no significant differences (P = 0.989). The following linear regression equation relates the pleural effusion volume (V), measured by the CT volume-rendering technique, to the greatest depth of the effusion (d): V = 158.16 × d − 116.01 (r = 0.91, P = 0.000). The following linear regression relates the volume to the product of the pleural effusion diameters (l × h × d): V = 0.56 × (l × h × d) + 39.44 (r = 0.92, P = 0.000). The 64-slice CT volume-rendering technique can accurately measure the volume in pleural effusion patients, and a linear regression equation can be used to estimate the volume of a free pleural effusion.
Scoring and staging systems using cox linear regression modeling and recursive partitioning.
Lee, J W; Um, S H; Lee, J B; Mun, J; Cho, H
2006-01-01
Scoring and staging systems are used to determine the order and class of data according to predictors. Systems used for medical data, such as the Child-Turcotte-Pugh scoring and staging systems for ordering and classifying patients with liver disease, are often derived strictly from physicians' experience and intuition. We construct objective and data-based scoring/staging systems using statistical methods. We consider Cox linear regression modeling and recursive partitioning techniques for censored survival data. In particular, to obtain a target number of stages we propose cross-validation and amalgamation algorithms. We also propose an algorithm for constructing scoring and staging systems by integrating local Cox linear regression models into recursive partitioning, so that we can retain the merits of both methods such as superior predictive accuracy, ease of use, and detection of interactions between predictors. The staging system construction algorithms are compared by cross-validation evaluation of real data. The data-based cross-validation comparison shows that Cox linear regression modeling is somewhat better than recursive partitioning when there are only continuous predictors, while recursive partitioning is better when there are significant categorical predictors. The proposed local Cox linear recursive partitioning has better predictive accuracy than Cox linear modeling and simple recursive partitioning. This study indicates that integrating local linear modeling into recursive partitioning can significantly improve prediction accuracy in constructing scoring and staging systems.
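A minimal sketch of the Cox-score staging idea, assuming the lifelines package and its bundled recidivism dataset (neither is from the paper, which additionally integrates recursive partitioning): fit a Cox model, use the partial hazard as the score, and cut it into three stages at tertiles.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter
from lifelines.datasets import load_rossi

df = load_rossi()  # recidivism data: duration 'week', event 'arrest'

cph = CoxPHFitter()
cph.fit(df, duration_col="week", event_col="arrest")

# Score each subject by the Cox linear predictor (partial hazard), then
# stage by tertiles of the score.
score = cph.predict_partial_hazard(df)
stage = pd.qcut(np.log(score), q=3, labels=["I", "II", "III"])
print(pd.crosstab(stage, df["arrest"]))
```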
Anodic microbial community diversity as a predictor of the power output of microbial fuel cells.
Stratford, James P; Beecroft, Nelli J; Slade, Robert C T; Grüning, André; Avignone-Rossa, Claudio
2014-03-01
The relationship between the diversity of mixed-species microbial consortia and their electrogenic potential in the anodes of microbial fuel cells was examined using different diversity measures as predictors. Identical microbial fuel cells were sampled at multiple time-points. Biofilm and suspension communities were analysed by denaturing gradient gel electrophoresis to calculate the number and relative abundance of species. Shannon and Simpson indices and richness were examined for association with power using bivariate and multiple linear regression, with biofilm DNA as an additional variable. In simple bivariate regressions, the correlation of Shannon diversity of the biofilm and power is stronger (r=0.65, p=0.001) than between power and richness (r=0.39, p=0.076), or between power and the Simpson index (r=0.5, p=0.018). Using Shannon diversity and biofilm DNA as predictors of power, a regression model can be constructed (r=0.73, p<0.001). Ecological parameters such as the Shannon index are predictive of the electrogenic potential of microbial communities. Copyright © 2014 Elsevier Ltd. All rights reserved.
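A short sketch of the bivariate analysis, assuming synthetic relative-abundance profiles in place of the DGGE band data: compute Shannon and Simpson indices and regress power output on Shannon diversity.

```python
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(6)
# 20 anode communities, 12 "species"; rows are relative abundances summing to 1.
communities = rng.dirichlet(np.ones(12), size=20)

shannon = np.array([-(p[p > 0] * np.log(p[p > 0])).sum() for p in communities])
simpson = np.array([1.0 - (p**2).sum() for p in communities])

# Synthetic power outputs loosely tied to diversity, plus noise (mW).
power = 40.0 * shannon + rng.normal(0.0, 5.0, shannon.size)

fit = linregress(shannon, power)
print(f"Shannon vs power: r = {fit.rvalue:.2f}, p = {fit.pvalue:.3g}")
print(f"Simpson index range: {simpson.min():.2f}-{simpson.max():.2f}")
```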
[In vitro testing of yeast resistance to antimycotic substances].
Potel, J; Arndt, K
1982-01-01
Investigations were carried out to clarify the determination of the antibiotic susceptibility of yeasts. 291 yeast strains of different species were tested for sensitivity to 7 antimycotics: amphotericin B, flucytosine, nystatin, pimaricin, clotrimazole, econazole and miconazole. In addition to the evaluation of inhibition zone diameters and MIC values, the influence of pH was examined. 1. The dependence of inhibition zone diameters upon pH values varies with the antimycotic tested. For standardization purposes pH 6.0 is proposed; moreover, further experimental parameters, such as nutrient composition, agar depth, cell density, and incubation time and temperature, have to be standardized. 2. The relation between inhibition zone size and logarithmic MIC does not fit a linear regression analysis when all species are considered together. Therefore regression functions have to be calculated for the individual species. In the case of the antimycotics amphotericin B, nystatin and pimaricin, the low scatter of the MIC values does not allow regression analysis. 3. The results of the MIC tests argue for quantitative susceptibility determination of yeasts, particularly for the fungistatic substances with systemic applicability, flucytosine and miconazole.
Modeling and forecasting US presidential election using learning algorithms
NASA Astrophysics Data System (ADS)
Zolghadr, Mohammad; Niaki, Seyed Armin Akhavan; Niaki, S. T. A.
2017-09-01
The primary objective of this research is to obtain an accurate forecasting model for the US presidential election. To identify a reliable model, artificial neural networks (ANN) and support vector regression (SVR) models are compared based on some specified performance measures. Moreover, six independent variables such as GDP, unemployment rate, the president's approval rate, and others are considered in a stepwise regression to identify significant variables. The president's approval rate is identified as the most significant variable, based on which eight other variables are identified and considered in the model development. Preprocessing methods are applied to prepare the data for the learning algorithms. The proposed procedure significantly increases the accuracy of the model by 50%. The learning algorithms (ANN and SVR) proved to be superior to linear regression based on each method's calculated performance measures. The SVR model is identified as the most accurate model among the other models as this model successfully predicted the outcome of the election in the last three elections (2004, 2008, and 2012). The proposed approach significantly increases the accuracy of the forecast.
Manzoori, Jamshid L; Amjadi, Mohammad
2003-03-15
The characteristics of host-guest complexation between beta-cyclodextrin (beta-CD) and two forms of ibuprofen (protonated and deprotonated) were investigated by fluorescence spectrometry. 1:1 stoichiometries for both complexes were established, and their association constants at different temperatures were calculated by applying a non-linear regression method to the change in the fluorescence of ibuprofen brought about by the presence of beta-CD. The thermodynamic parameters (ΔH, ΔS and ΔG) associated with the inclusion process were also determined. Based on the obtained results, a sensitive spectrofluorimetric method for the determination of ibuprofen was developed with a linear range of 0.1-2 µg ml⁻¹ and a detection limit of 0.03 µg ml⁻¹. The method was applied satisfactorily to the determination of ibuprofen in pharmaceutical preparations. Copyright 2002 Elsevier Science B.V.
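A sketch of the non-linear regression step, assuming a 1:1 isotherm under excess-host conditions ([beta-CD] >> [ibuprofen]) and synthetic fluorescence data; the functional form and parameter values are illustrative, not the paper's exact treatment.

```python
import numpy as np
from scipy.optimize import curve_fit

def isotherm(cd, f0, f_inf, k):
    # Observed fluorescence for a 1:1 complex with the host in large excess.
    return (f0 + f_inf * k * cd) / (1.0 + k * cd)

cd = np.array([0.0, 0.5, 1.0, 2.0, 4.0, 8.0]) * 1e-3  # beta-CD, mol/L
rng = np.random.default_rng(7)
f_obs = isotherm(cd, 10.0, 85.0, 900.0) + rng.normal(0.0, 0.5, cd.size)

popt, _ = curve_fit(isotherm, cd, f_obs, p0=[10.0, 80.0, 500.0])
print(f"association constant K ~ {popt[2]:.0f} L/mol")
```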
NASA Technical Reports Server (NTRS)
Stankiewicz, N.
1982-01-01
The multiple-channel input signal to a soft limiter amplifier, such as a traveling wave tube, is represented as a finite, linear sum of Gaussian functions in the frequency domain. Linear regression is used to fit the channel shapes to a least-squares residual error. Distortions in the output signal, namely intermodulation products, are produced by the nonlinear gain characteristic of the amplifier and constitute the principal noise analyzed in this study. The signal-to-noise ratios are calculated for various input powers from saturation to 10 dB below saturation for two specific distributions of channels. A criterion for the truncation of the series expansion of the nonlinear transfer characteristic is given. It is found that the signal-to-noise ratios are very sensitive to the coefficients used in this expansion. Improper or incorrect truncation of the series leads to ambiguous results in the signal-to-noise ratios.
Stefanello, C; Vieira, S L; Xue, P; Ajuwon, K M; Adeola, O
2016-07-01
A study was conducted to determine the ileal digestible energy (IDE), ME, and MEn contents of bakery meal using the regression method and to evaluate whether the energy values are age-dependent in broiler chickens from zero to 21 d post hatching. Seven hundred and eighty male Ross 708 chicks were fed 3 experimental diets in which bakery meal was incorporated into a corn-soybean meal-based reference diet at zero, 100, or 200 g/kg by replacing the energy-yielding ingredients. A 3 × 3 factorial arrangement of 3 ages (1, 2, or 3 wk) and 3 dietary bakery meal levels was used. Birds were fed the same experimental diets at these 3 ages. Birds were grouped by weight into 10 replicates per treatment in a randomized complete block design. Apparent ileal digestibility and total tract retention of DM, N, and energy were calculated. Expression of the mucin (MUC2), sodium-dependent phosphate transporter (NaPi-IIb), solute carrier family 7 (cationic amino acid transporter, Y(+) system, SLC7A2), glucose (GLUT2), and sodium-glucose linked transporter (SGLT1) genes was measured at each age in the jejunum by real-time PCR. Addition of bakery meal to the reference diet resulted in a linear decrease in retention of DM, N, and energy, and a quadratic reduction (P < 0.05) in N retention and ME. There was a linear increase in utilization of DM, N, and energy as birds' age increased from 1 to 3 wk. Dietary bakery meal did not affect jejunal gene expression. Expression of genes encoding MUC2, NaPi-IIb, and SLC7A2 linearly increased (P < 0.05) with age. Regression-derived MEn of bakery meal linearly increased (P < 0.05) as the age of birds increased, with values of 2,710, 2,820, and 2,923 kcal/kg DM for 1, 2, and 3 wk, respectively. Based on these results, utilization of energy and nitrogen in the basal diet decreased when bakery meal was included and increased with age of broiler chickens. © 2016 Poultry Science Association Inc.
As a fast and effective technique, the multiple linear regression (MLR) method has been widely used in modeling and prediction of beach bacteria concentrations. Among previous works on this subject, however, several issues were insufficiently or inconsistently addressed. Those is...
Height and Weight Estimation From Anthropometric Measurements Using Machine Learning Regressions
Fernandes, Bruno J. T.; Roque, Alexandre
2018-01-01
Height and weight are measurements used to track nutritional diseases, energy expenditure, clinical conditions, drug dosages, and infusion rates. Many patients are not ambulant or may be unable to communicate, and a combination of these factors may prevent accurate measurement; in those cases, height and weight can be estimated approximately by anthropometric means. Different groups have proposed different linear or non-linear equations whose coefficients are obtained by using single or multiple linear regressions. In this paper, we present a complete study of the application of different learning models to estimate height and weight from anthropometric measurements: support vector regression, Gaussian processes, and artificial neural networks. The predicted values are significantly more accurate than those obtained with conventional linear regressions. In all cases, the predictions are non-sensitive to ethnicity, and to gender if more than two anthropometric parameters are analyzed. The learning model analysis creates new opportunities for anthropometric applications in industry, textile technology, security, and health care. PMID:29651366
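A compact sketch of the comparison described above, with a synthetic anthropometric table and scikit-learn models standing in for the study's implementations:

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(8)
n = 300
arm_span = rng.normal(170.0, 10.0, n)
knee_h = 0.3 * arm_span + rng.normal(0.0, 2.0, n)
waist = rng.normal(90.0, 12.0, n)
height = 0.95 * arm_span + 0.1 * knee_h + rng.normal(0.0, 3.0, n)  # cm
X = np.column_stack([arm_span, knee_h, waist])

models = [("SVR", SVR(C=10.0)),
          ("GP", GaussianProcessRegressor(normalize_y=True)),
          ("ANN", MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000))]

for name, model in models:
    pipe = make_pipeline(StandardScaler(), model)
    pipe.fit(X[:250], height[:250])
    rmse = np.sqrt(np.mean((pipe.predict(X[250:]) - height[250:])**2))
    print(f"{name}: held-out RMSE = {rmse:.2f} cm")
```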
NASA Astrophysics Data System (ADS)
Samhouri, M.; Al-Ghandoor, A.; Fouad, R. H.
2009-08-01
In this study two techniques for modeling electricity consumption of the Jordanian industrial sector are presented: (i) multivariate linear regression and (ii) neuro-fuzzy models. Electricity consumption is modeled as a function of different variables such as number of establishments, number of employees, electricity tariff, prevailing fuel prices, production outputs, capacity utilizations, and structural effects. It was found that industrial production and capacity utilization are the most important variables with a significant effect on future electrical power demand. The results showed that the multivariate linear regression and neuro-fuzzy models are generally comparable and can be used adequately to simulate industrial electricity consumption. However, a comparison based on the root-mean-square error of the data suggests that the neuro-fuzzy model performs slightly better for future prediction of electricity consumption than the multivariate linear regression model. Such results are in full agreement with similar work, using different methods, for other countries.
Carvalho, Carlos; Gomes, Danielo G.; Agoulmine, Nazim; de Souza, José Neuman
2011-01-01
This paper proposes a method based on multivariate spatial and temporal correlation to improve prediction accuracy in data reduction for Wireless Sensor Networks (WSN). Prediction of data not sent to the sink node is a technique used to save energy in WSNs by reducing the amount of data traffic. However, it may not be very accurate. Simulations were made involving simple linear regression and multiple linear regression functions to assess the performance of the proposed method. The results show a higher correlation between gathered inputs when compared to time, which is an independent variable widely used for prediction and forecasting. Prediction accuracy is lower when simple linear regression is used, whereas multiple linear regression is the most accurate one. In addition to that, our proposal outperforms some current solutions by about 50% in humidity prediction and 21% in light prediction. To the best of our knowledge, we believe that we are probably the first to address prediction based on multivariate correlation for WSN data reduction. PMID:22346626
Using the Ridge Regression Procedures to Estimate the Multiple Linear Regression Coefficients
NASA Astrophysics Data System (ADS)
Gorgees, HazimMansoor; Mahdi, FatimahAssim
2018-05-01
This article compares the performance of different types of ordinary ridge regression estimators that have been proposed to estimate the regression parameters when near-exact linear relationships among the explanatory variables are present. For this situation we employ data obtained from the tagi gas filling company during the period (2008-2010). The main result is that the method based on the condition number performs better than the other methods, since it has a smaller mean square error (MSE) than the other stated methods.
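A sketch of a condition-number-based choice of the ridge parameter k: increase k until the condition number of X'X + kI falls below a target. The threshold of 100 and the synthetic collinear data are assumptions for illustration, not the article's exact rule.

```python
import numpy as np

rng = np.random.default_rng(9)
n = 60
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.01, size=n)  # near-exact collinearity
X = np.column_stack([x1, x2])
y = 3.0 * x1 - 1.0 * x2 + rng.normal(scale=0.5, size=n)

XtX = X.T @ X
k = 0.0
while np.linalg.cond(XtX + k * np.eye(2)) > 100.0:  # target condition number
    k += 0.01

beta_ridge = np.linalg.solve(XtX + k * np.eye(2), X.T @ y)
print(f"k = {k:.2f}, ridge coefficients = {np.round(beta_ridge, 2)}")
```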
Lamm, Ryan; Mathews, Steven N; Yang, Jie; Park, Jihye; Talamini, Mark; Pryor, Aurora D; Telem, Dana
2017-05-01
This study sought to characterize in-hospital post-colectomy mortality in New York State. One hundred sixty thousand seven hundred ninety-two patients who underwent colectomy from 1995 to 2014 were analyzed from the all-payer New York Statewide Planning and Research Cooperative System (SPARCS) database. Linear trends of in-hospital mortality rate over 20 years were calculated using log-linear regression models. Chi-square tests were used to compare categorical variables between patients. Multivariable regression models were further used to calculate risk of in-hospital mortality associated with specific demographics, co-morbidities, and perioperative complications. From 1995 to 2014, 7308 (4.5%) in-hospital mortalities occurred within 30 days of surgery. Over this time period, the rate of overall in-hospital post-colectomy mortality decreased by 3.3% (6.3 to 3%, p < 0.0001). The risk of in-hospital mortality for patients receiving emergent and elective surgery decreased by 1% (RR 0.99 [0.98-1.00], p = 0.0005) and 5% (RR 0.95 [0.94-0.96], p < 0.0001) each year, respectively. Patients who underwent open surgeries were more likely to experience in-hospital mortality (adjusted OR 3.65 [3.16-4.21], p < 0.0001), with an increased risk of in-hospital mortality each year (RR 1.01 [1.00-1.03], p = 0.0387). Numerous other risk factors were identified. In-hospital post-colectomy mortality decreased at a slower rate in emergent versus elective surgeries. The risk of in-hospital mortality has increased in open colectomies.
Seasonal Effect on Ocular Sun Exposure and Conjunctival UV Autofluorescence.
Haworth, Kristina M; Chandler, Heather L
2017-02-01
To evaluate feasibility and repeatability of measures for ocular sun exposure and conjunctival ultraviolet autofluorescence (UVAF), and to test for relationships between the outcomes. Fifty volunteers were seen for two visits 14 ± 2 days apart. Ocular sun exposure was estimated over a 2-week time period using questionnaires that quantified time outdoors and ocular protection habits. Conjunctival UVAF was imaged using a Nikon D7000 camera system equipped with appropriate flash and filter system; image analysis was done using ImageJ software. Repeatability estimates were made using Bland-Altman plots with mean differences and 95% limits of agreement calculated. Non-normally distributed data were transformed by either log10 or square root methods. Linear regression was conducted to evaluate relationships between measures. Mean (±SD) values for ocular sun exposure and conjunctival UVAF were 8.86 (±11.97) hours and 9.15 (±9.47) mm², respectively. Repeatability was found to be acceptable for both ocular sun exposure and conjunctival UVAF. Univariate linear regression showed outdoor occupation to be a predictor of higher ocular sun exposure; outdoor occupation and winter season of collection both predicted higher total UVAF. Furthermore, increased portion of day spent outdoors while working was associated with increased total conjunctival UVAF. We demonstrate feasibility and repeatability of estimating ocular sun exposure using a previously unreported method and for conjunctival UVAF in a group of subjects residing in Ohio. Seasonal temperature variation may have influenced time outdoors and ultimately calculation of ocular sun exposure. As winter season of collection and outdoor occupation both predicted higher total UVAF, our data suggest that ocular sun exposure is associated with conjunctival UVAF and, possibly, that UVAF remains for at least several months after sun exposure. PMID:27820717
A generic sun-tracking algorithm for on-axis solar collector in mobile platforms
NASA Astrophysics Data System (ADS)
Lai, An-Chow; Chong, Kok-Keong; Lim, Boon-Han; Ho, Ming-Cheng; Yap, See-Hao; Heng, Chun-Kit; Lee, Jer-Vui; King, Yeong-Jin
2015-04-01
This paper proposes a novel dynamic sun-tracking algorithm which allows accurate tracking of the sun for both non-concentrated and concentrated photovoltaic systems located on mobile platforms to maximize solar energy extraction. The proposed algorithm takes not only the date, time, and geographical information, but also the dynamic changes of coordinates of the mobile platforms into account to calculate the sun position angle relative to ideal azimuth-elevation axes in real time using general sun-tracking formulas derived by Chong and Wong. The algorithm acquires data from open-loop sensors, i.e. global position system (GPS) and digital compass, which are readily available in many off-the-shelf portable gadgets, such as smart phone, to instantly capture the dynamic changes of coordinates of mobile platforms. Our experiments found that a highly accurate GPS is not necessary as the coordinate changes of practical mobile platforms are not fast enough to produce significant differences in the calculation of the incident angle. On the contrary, it is critical to accurately identify the quadrant and angle where the mobile platforms are moving toward in real time, which can be resolved by using digital compass. In our implementation, a noise filtering mechanism is found necessary to remove unexpected spikes in the readings of the digital compass to ensure stability in motor actuations and effectiveness in continuous tracking. Filtering mechanisms being studied include simple moving average and linear regression; the results showed that a compound function of simple moving average and linear regression produces a better outcome. Meanwhile, we found that a sampling interval is useful to avoid excessive motor actuations and power consumption while not sacrificing the accuracy of sun-tracking.
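A rough sketch of the compound filter described above, on a synthetic heading trace: a simple moving average knocks down compass spikes, then a linear regression over the most recent smoothed window estimates the current heading. Window sizes are assumptions, and 0°/360° wraparound is ignored for simplicity.

```python
import numpy as np

rng = np.random.default_rng(10)
t = np.arange(200.0)  # sample index
# Slow turn plus sensor noise; wraparound at 0/360 deg is ignored here.
heading = 90.0 + 0.05 * t + rng.normal(0.0, 0.5, t.size)
heading[[50, 120]] += 40.0  # unexpected spikes

window = 15
# 1) Simple moving average suppresses the spikes.
sma = np.convolve(heading, np.ones(window) / window, mode="valid")
t_sma = t[window - 1:]  # time of the last sample in each averaging window

# 2) Linear regression over the most recent smoothed window gives the trend
#    and an estimate of the current heading.
slope, intercept = np.polyfit(t_sma[-window:], sma[-window:], 1)
current = slope * t_sma[-1] + intercept
print(f"filtered current heading ~ {current:.1f} deg")
```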
Guenette, Jeffrey P; Smith, Stacy E
2018-06-01
We aimed to identify job resources and job demands associated with measures of personal accomplishment (PA) in radiology residents in the United States. A 34-item online survey was administered between May and June 2017 to U.S. radiology residents and included the 8 Likert-type PA questions from the Maslach Burnout Inventory-Human Services Survey, 19 visual analog scale job demands-resources questions, and 7 demographic questions. Multiple linear regression was calculated to predict PA based on job demands-resources. Effects of binomial demographic factors on PA scores were compared with independent-samples t tests. Effects of categorical demographic factors on PA scores were compared with one-way between-subjects analysis of variance tests. A linear regression was calculated to evaluate the relationship of age on PA scores. "The skills and knowledge that I am building are important and helpful to society" (P = 2 × 10 -16 ), "I have good social support from my co-residents" (P = 4 × 10 -5 ), and "I regularly receive adequate constructive feedback" (P = 4 × 10 -6 ) all positively correlated with PA. PA scores were significantly lower for individuals who were single vs those married or partnered (P = .01). Radiology residents score higher in the PA domain of burnout when they receive adequate constructive feedback, have good co-resident social support, and feel that the skills and knowledge they are building are important to society. Improving constructive feedback mechanisms, enabling resident-only social time, and supporting opportunities that reinforce the importance of their contributions may therefore improve radiology residents' sense of PA. Copyright © 2018. Published by Elsevier Inc.
Assessing the role of pavement macrotexture in preventing crashes on highways.
Pulugurtha, Srinivas S; Kusam, Prasanna R; Patel, Kuvleshay J
2010-02-01
The objective of this article is to assess the role of pavement macrotexture in preventing crashes on highways in the State of North Carolina. Laser profilometer data obtained from the North Carolina Department of Transportation (NCDOT) for highways comprising four corridors are processed to calculate pavement macrotexture at 100-m (approximately 330-ft) sections according to the American Society for Testing and Materials (ASTM) standards. Crash data collected over the same lengths of the corridors were integrated with the calculated pavement macrotexture for each section. Scatterplots were generated to assess the role of pavement macrotexture on crashes and logarithm of crashes. Regression analyses were conducted by considering predictor variables such as million vehicle miles of travel (as a function of traffic volume and length), the number of interchanges, the number of at-grade intersections, the number of grade-separated interchanges, and the number of bridges, culverts, and overhead signs along with pavement macrotexture to study the statistical significance of relationship between pavement macrotexture and crashes (both linear and log-linear) when compared to other predictor variables. Scatterplots and regression analysis conducted indicate a more statistically significant relationship between pavement macrotexture and logarithm of crashes than between pavement macrotexture and crashes. The coefficient for pavement macrotexture, in general, is negative, indicating that the number of crashes or logarithm of crashes decreases as it increases. The relation between pavement macrotexture and logarithm of crashes is generally stronger than between most other predictor variables and crashes or logarithm of crashes. Based on results obtained, it can be concluded that maintaining pavement macrotexture greater than or equal to 1.524 mm (0.06 in.) as a threshold limit would possibly reduce crashes and provide safe transportation to road users on highways.
Seasonal Effect on Ocular Sun Exposure and Conjunctival UV Autofluorescence
Haworth, Kristina M.; Chandler, Heather L.
2016-01-01
Purpose To evaluate the feasibility and repeatability of measures of ocular sun exposure and conjunctival ultraviolet autofluorescence (UVAF), and to test for relationships between the outcomes. Methods Fifty volunteers were seen for 2 visits 14±2 days apart. Ocular sun exposure was estimated over a two-week period using questionnaires that quantified time outdoors and ocular protection habits. Conjunctival UVAF was imaged using a Nikon D7000 camera system equipped with an appropriate flash and filter system; image analysis was done using ImageJ software. Repeatability estimates were made using Bland-Altman plots, with mean differences and 95% limits of agreement calculated. Non-normally distributed data were transformed by either log10 or square-root methods. Linear regression was conducted to evaluate relationships between measures. Results Mean (±SD) values for ocular sun exposure and conjunctival UVAF were 8.86 (±11.97) hours and 9.15 (±9.47) mm², respectively. Repeatability was found to be acceptable for both ocular sun exposure and conjunctival UVAF. Univariate linear regression showed outdoor occupation to be a predictor of higher ocular sun exposure; outdoor occupation and winter season of collection both predicted higher total UVAF. Furthermore, an increased portion of the day spent outdoors while working was associated with increased total conjunctival UVAF. Conclusions We demonstrate the feasibility and repeatability of estimating ocular sun exposure using a previously unreported method, and of measuring conjunctival UVAF, in a group of subjects residing in Ohio. Seasonal temperature variation may have influenced time outdoors and ultimately the calculation of ocular sun exposure. As winter season of collection and outdoor occupation both predicted higher total UVAF, our data suggest that ocular sun exposure is associated with conjunctival UVAF and, possibly, that UVAF persists for at least several months following sun exposure. PMID:27820717
Jilcott, Stephanie B; Wall-Bassett, Elizabeth D; Burke, Sloane C; Moore, Justin B
2011-11-01
Obesity disproportionately affects low-income and minority individuals and has been linked with food insecurity, particularly among women. More research is needed to examine potential mechanisms linking obesity and food insecurity. Therefore, this study's purpose was to examine cross-sectional associations between food insecurity, Supplemental Nutrition Assistance Program (SNAP) benefits per household member, perceived stress, and body mass index (BMI) among female SNAP participants in eastern North Carolina (n=202). Women were recruited from the Pitt County Department of Social Services between October 2009 and April 2010. Household food insecurity was measured using the validated US Department of Agriculture 18-item food security survey module. Perceived stress was measured using the 14-item Cohen's Perceived Stress Scale. SNAP benefits and number of children in the household were self-reported and used to calculate benefits per household member. BMI was calculated from measured height and weight (as kg/m²). Multivariate linear regression was used to examine associations between BMI, SNAP benefits, stress, and food insecurity while adjusting for age and physical activity. In adjusted linear regression analyses, perceived stress was positively related to food insecurity (P<0.0001), even when SNAP benefits were included in the model. BMI was positively associated with food insecurity (P=0.04). Mean BMI was significantly greater among women receiving <$150 in SNAP benefits per household member vs those receiving ≥$150 in benefits per household member (35.8 vs 33.1; P=0.04). Results suggest that provision of adequate SNAP benefits per household member might partially ameliorate the negative effects of food insecurity on BMI. Copyright © 2011 American Dietetic Association. Published by Elsevier Inc. All rights reserved.
Zelle, Sten G; Baltussen, Rob; Otten, Johannes D M; Heijnsdijk, Eveline A M; van Schoor, Guido; Broeders, Mireille J M
2015-03-01
To provide proof of concept for a simple model to estimate the stage shift resulting from breast cancer screening in low- and middle-income countries (LMICs). Stage shift is an essential early detection indicator and an important proxy for the performance and possible further impact of screening programmes. Our model could help LMICs to choose appropriate control strategies. We assessed our model concept in three steps. First, we calculated the proportional performance rates (i.e. the index number Z) based on 16 screening rounds of the Nijmegen Screening Program (384,884 screened women). Second, we used linear regression to assess the association between Z and the amount of stage shift observed in the programme. Third, we hypothesized how Z could be used to estimate the stage shift resulting from breast cancer screening in LMICs. Stage shifts can be estimated from the proportional performance rates (Zs) using linear regression. Zs calculated for each screening round are highly associated with the observed stage shifts in the Nijmegen Screening Program (Pearson's r = 0.798, R² = 0.637). Our model can predict the stage shifts in the Nijmegen Screening Program and could be applied to settings with different characteristics, although it should not be used straightforwardly to estimate the impact on mortality. Further research should investigate the extrapolation of our model to other settings. As stage shift is an essential screening performance indicator, our model could provide important information on the performance of breast cancer screening programmes that LMICs consider implementing. © The Author(s) 2014 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav.
Eng, H; Mercer, J B
2000-10-01
Seasonal variations in mortality due to cardiovascular disease have been demonstrated in many countries, with the highest levels occurring during the coldest months of the year. It has been suggested that this can be explained by cold climate. In this study, we examined the relationship between mortality and two different climatic factors in two densely populated areas (Dublin, Ireland and Oslo/Akershus, Norway). Meteorological data (mean daily air temperature and wind speed) and registered daily mortality data for three groups of cardiovascular disease for the period 1985-1994 were obtained for the two respective areas. The daily mortality ratio for both men and women aged 60 years and older was calculated from the mortality data. The wind chill equivalent temperature was calculated from the Siple and Passel formula. The seasonal variations in mortality were greater in Dublin than in Oslo/Akershus, with mortality being highest in winter. This pattern was similar to that previously shown for the two respective countries as a whole. There was a negative correlation between mortality and both air temperature and wind chill equivalent temperature for all three groups of diseases. The slopes of the linear regression lines describing the relationship between mortality and air temperature were considerably steeper for the Irish data than for the Norwegian data. However, the difference in the steepness of the linear regression lines for the relationship between mortality and wind chill equivalent temperature was considerably less between the two areas. This can be explained by the fact that Dublin is a much windier area than Oslo/Akershus. The results of this study demonstrate that the inclusion of two climatic factors rather than just one changes the impression of the relationship between climate and cardiovascular disease mortality.
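For readers who want to reproduce the exposure variable above: the Siple and Passel formulation has a standard closed form. A minimal sketch, assuming wind speed in m/s, temperature in °C, and a common (but not necessarily the study's) reference wind speed for the equivalent temperature:

```python
import math

def wind_chill_index(air_temp_c: float, wind_speed_ms: float) -> float:
    """Siple-Passel wind chill index (kcal/m^2/h), classic 1945 form:
    WCI = (10.45 + 10*sqrt(v) - v) * (33 - T), v in m/s, T in deg C."""
    return (10.45 + 10.0 * math.sqrt(wind_speed_ms) - wind_speed_ms) * (33.0 - air_temp_c)

def wind_chill_equivalent_temp(air_temp_c: float, wind_speed_ms: float,
                               reference_wind_ms: float = 1.79) -> float:
    """Equivalent still-air temperature: the temperature producing the same
    WCI at a low reference wind speed (1.79 m/s is one common convention)."""
    wci = wind_chill_index(air_temp_c, wind_speed_ms)
    factor = 10.45 + 10.0 * math.sqrt(reference_wind_ms) - reference_wind_ms
    return 33.0 - wci / factor

print(wind_chill_equivalent_temp(0.0, 10.0))  # roughly -15 deg C at 10 m/s wind
```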
Possible association between obesity and periodontitis in patients with Down syndrome.
Culebras-Atienza, E; Silvestre, F-J; Silvestre-Rangil, J
2018-05-01
The present study was carried out to evaluate the possible association between obesity and periodontitis in patients with DS, and to explore which measure of obesity is most closely correlated with periodontitis. A prospective observational study was made to determine whether obesity is related to periodontal disease in patients with DS. The anthropometric variables were body height and weight, which were used to calculate BMI and stratify the patients into three categories: <25 (normal weight), 25-29.9 (overweight), and ≥30.0 kg/m² (obese). Waist circumference was measured, hip circumference was recorded as the greatest circumference at the level of the buttocks, and the waist/hip ratio (WHR) was calculated. Periodontal evaluation was made of all teeth, recording the plaque index (PI), pocket depth (PD), clinical attachment level (CAL), and the gingival index (GI). We generated a multivariate linear regression model to examine the relationship between PD and the frequency of tooth brushing, gender, BMI, WHI, WHR, age, and PI. Significant positive correlations were observed among the anthropometric parameters BMI, WHR, and WHI, and among the periodontal parameters PI, PD, CAL, and GI. The only positive correlation between the anthropometric and periodontal parameters corresponded to WHR. Upon closer examination, the distribution of WHR was seen to differ according to gender. Among the women, the correlation between WHR and the periodontal variables decreased to nonsignificant levels. In contrast, among the males the correlation remained significant and even increased. In a multivariate linear regression model, the coefficients relating PD to PI, WHR, and age were positive and significant in all cases. Our results suggest that there may indeed be an association between obesity and periodontitis in male patients with DS. Also, we found a clear correlation with WHR, which was considered to be the ideal adiposity indicator in this context.
Weichenthal, Scott; Ryswyk, Keith Van; Goldstein, Alon; Bagg, Scott; Shekkarizfard, Maryam; Hatzopoulou, Marianne
2016-04-01
Existing evidence suggests that ambient ultrafine particles (UFPs) (<0.1 µm) may contribute to acute cardiorespiratory morbidity. However, few studies have examined the long-term health effects of these pollutants owing in part to a need for exposure surfaces that can be applied in large population-based studies. To address this need, we developed a land use regression model for UFPs in Montreal, Canada using mobile monitoring data collected from 414 road segments during the summer and winter months between 2011 and 2012. Two different approaches were examined for model development including standard multivariable linear regression and a machine learning approach (kernel-based regularized least squares (KRLS)) that learns the functional form of covariate impacts on ambient UFP concentrations from the data. The final models included parameters for population density, ambient temperature and wind speed, land use parameters (park space and open space), length of local roads and rail, and estimated annual average NOx emissions from traffic. The final multivariable linear regression model explained 62% of the spatial variation in ambient UFP concentrations whereas the KRLS model explained 79% of the variance. The KRLS model performed slightly better than the linear regression model when evaluated using an external dataset (R² = 0.58 vs. 0.55) or a cross-validation procedure (R² = 0.67 vs. 0.60). In general, our findings suggest that the KRLS approach may offer modest improvements in predictive performance compared to standard multivariable linear regression models used to estimate spatial variations in ambient UFPs. However, differences in predictive performance were not statistically significant when evaluated using the cross-validation procedure. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.
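To illustrate the modeling comparison described above, the sketch below contrasts ordinary least squares with scikit-learn's KernelRidge as a stand-in for KRLS (kernel ridge regression is closely related to, but not identical with, the KRLS estimator used in the study); the covariates and data are synthetic placeholders:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 414  # road segments, matching the study's sampling design
X = rng.normal(size=(n, 4))   # stand-ins: population density, temperature, wind, road length
y = 20 + X @ [3.0, -2.0, 1.0, 0.5] + np.sin(X[:, 0]) + rng.normal(scale=2.0, size=n)

models = [("OLS", LinearRegression()),
          ("kernel ridge (KRLS-like)", KernelRidge(kernel="rbf", alpha=1.0, gamma=0.5))]

for name, model in models:
    # 5-fold cross-validated R^2, analogous to the paper's evaluation
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{name}: cross-validated R^2 = {r2:.2f}")
```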
Prediction system of hydroponic plant growth and development using algorithm Fuzzy Mamdani method
NASA Astrophysics Data System (ADS)
Sudana, I. Made; Purnawirawan, Okta; Arief, Ulfa Mediaty
2017-03-01
Hydroponics is a method of farming without soil. One hydroponic plant is watercress (Nasturtium officinale). The development and growth of hydroponic watercress are influenced by nutrient levels, acidity, and temperature. These independent variables can be used as system inputs to predict plant growth and development. The prediction system uses the Mamdani fuzzy algorithm and was built to implement a fuzzy inference system (FIS) as part of the Fuzzy Logic Toolbox (FLT) in MATLAB R2007b. An FIS is a computing system that works on the principle of fuzzy reasoning, which is similar to human reasoning; it basically consists of four units: a fuzzification unit, a fuzzy-logic reasoning unit, a knowledge-base unit, and a defuzzification unit. The effect of the independent variables on plant growth and development is visualized with the three-dimensional FIS output-surface diagram, and the prediction system's output is tested statistically with multiple linear regression (multiple linear regression analysis, t tests, F tests, the coefficient of determination, and predictor contributions), calculated using the SPSS (Statistical Product and Service Solutions) software.
NASA Astrophysics Data System (ADS)
Alvarez, César I.; Teodoro, Ana; Tierra, Alfonso
2017-10-01
Thin clouds in optical remote sensing data are frequent and in most cases do not provide pure surface data for calculating indexes such as the Normalized Difference Vegetation Index (NDVI). This paper aims to evaluate the Automatic Cloud Removal Method (ACRM) algorithm over a high-elevation city, Quito (Ecuador), at an altitude of 2800 meters above sea level, where clouds are present all year. The ACRM algorithm fits a linear regression between each Landsat 8 OLI band and the cirrus band and uses the resulting slope to remove clouds without any reference image or mask. The ACRM algorithm did not perform well over Quito. We therefore improved the algorithm by using different slope values (ACRM Improved). The NDVI computed from the corrected imagery was then compared with reference NDVI MODIS data (MOD13Q1). The ACRM Improved algorithm was successful compared with the original ACRM algorithm. In the future, the improved ACRM algorithm should be tested in different regions of the world under different conditions to evaluate whether it works successfully in all settings.
Association between the Type of Workplace and Lung Function in Copper Miners
Gruszczyński, Leszek; Wojakowska, Anna; Ścieszka, Marek; Turczyn, Barbara; Schmidt, Edward
2016-01-01
The aim of the analysis was to retrospectively assess changes in lung function in copper miners depending on the type of workplace. In groups of 225 operators, 188 welders, and 475 representatives of other jobs, spirometry was performed at the start of employment and subsequently after 10, 20, and 25 years of work. Spirometry Longitudinal Data Analysis software was used to estimate changes in group means for FEV1 and FVC. Multiple linear regression analysis was used to assess the association between workplace and lung function. Lung function assessed on the basis of the calculated longitudinal FEV1 (FVC) decline was similar in all studied groups. However, a multiple linear regression model used in the cross-sectional analysis revealed an association between workplace and lung function. In the group of welders, FEF75 was lower in comparison to operators and other miners as early as after 10 years of work. Simultaneously, in smoking welders, the FEV1/FVC ratio was lower than in nonsmokers (p < 0.05). Interactions between type of workplace and smoking (p < 0.05) in their effect on FVC, FEV1, PEF, and FEF50 were shown. Among underground-working copper miners, the group of smoking welders is especially threatened by impairment of lung ventilatory function. PMID:27274987
Rapp, Jennifer L.; Reilly, Pamela A.
2017-11-14
Background: The U.S. Geological Survey (USGS), in cooperation with the Virginia Department of Environmental Quality (DEQ), reviewed a previously compiled set of linear regression models to assess their utility in defining the response of the aquatic biological community to streamflow depletion. As part of the 2012 Virginia Healthy Watersheds Initiative (HWI) study conducted by Tetra Tech, Inc., for the U.S. Environmental Protection Agency (EPA) and Virginia DEQ, a database with computed values of 72 hydrologic metrics, or indicators of hydrologic alteration (IHA), 37 fish metrics, and 64 benthic invertebrate metrics was compiled and quality assured. Hydrologic alteration was represented by comparing a simulated streamflow record for a pre-water-withdrawal condition (baseline, without dams or developed land) with the simulated recent-flow condition (2008 withdrawal simulation, including dams and an altered landscape) to calculate a percent alteration of flow. Biological samples of the existing populations reflect a range of alteration in the biological community today. For this study, all 72 IHA metrics, encompassing more than 7,272 linear regression models, were considered. This extensive dataset provided the opportunity for hypothesis testing and prioritization of flow-ecology relations that have the potential to explain the effect(s) of hydrologic alteration on biological metrics in Virginia streams.
NASA Astrophysics Data System (ADS)
Song, Seok-Jeong; Kim, Tae-Il; Kim, Youngmi; Nam, Hyoungsik
2018-05-01
Recently, a simple, sensitive, and low-cost fluorescent indicator has been proposed to determine water contents in organic solvents, drugs, and foodstuffs. A change in water content leads to a change in the indicator's fluorescence color under ultraviolet (UV) light. Whereas the water content values could be estimated from the spectrum obtained by a bulky and expensive spectrometer in the previous research, this paper demonstrates a simple and low-cost camera-based water content measurement scheme with the same fluorescent water indicator. Water content is calculated over the range of 0-30% by quadratic polynomial regression models with color information extracted from the captured images of samples. In particular, several color spaces such as RGB, xyY, L*a*b*, u′v′, HSV, and YCbCr have been investigated to establish the optimal color information features over both linear and nonlinear RGB data given by a camera before and after gamma correction. In the end, a 2nd-order polynomial regression model along with HSV in the linear domain achieves the minimum mean square error of 1.06% for a 3-fold cross-validation method. Additionally, the resulting water content estimation model is implemented and evaluated on an off-the-shelf Android-based smartphone.
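The paper's best configuration, a 2nd-order polynomial regression on an HSV-derived color feature, reduces to a very short fit; a minimal sketch with synthetic hue/water-content pairs (not the paper's data):

```python
import numpy as np

# Synthetic stand-in: a hue feature extracted from sample images vs. known water content (%)
hue = np.array([0.05, 0.10, 0.18, 0.25, 0.33, 0.41, 0.50])
water_content = np.array([0.0, 4.8, 10.1, 15.2, 19.8, 25.1, 30.0])

# 2nd-order polynomial regression, as in the paper's best model
coeffs = np.polyfit(hue, water_content, deg=2)
predict = np.poly1d(coeffs)

residuals = water_content - predict(hue)
rmse = np.sqrt(np.mean(residuals ** 2))
print(f"fitted coefficients: {coeffs}, RMSE = {rmse:.2f}%")
```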
Akbar, Jamshed; Iqbal, Shahid; Batool, Fozia; Karim, Abdul; Chan, Kim Wei
2012-01-01
Quantitative structure-retention relationships (QSRRs) have successfully been developed for naturally occurring phenolic compounds in a reversed-phase liquid chromatographic (RPLC) system. A total of 1519 descriptors were calculated from the optimized structures of the molecules using the MOPAC2009 and DRAGON software. The data set of 39 molecules was divided into training and external validation sets. For feature selection and mapping we used step-wise multiple linear regression (SMLR), unsupervised forward selection followed by step-wise multiple linear regression (UFS-SMLR), and artificial neural networks (ANN). Stable and robust models with significant predictive abilities in terms of validation statistics were obtained, with no evidence of chance correlation. ANN models were found to perform better than the other two approaches. HNar, IDM, Mp, GATS2v, DISP and 3D-MoRSE (signals 22, 28 and 32) descriptors, based on van der Waals volume, electronegativity, mass, and polarizability at the atomic level, were found to have significant effects on the retention times. The possible implications of these descriptors in RPLC are discussed. All the models proved quite able to predict the retention times of phenolic compounds and showed remarkable validation, robustness, stability, and predictive performance. PMID:23203132
Confounder Detection in High-Dimensional Linear Models Using First Moments of Spectral Measures.
Liu, Furui; Chan, Laiwan
2018-06-12
In this letter, we study the confounder detection problem in the linear model, where the target variable Y is predicted using its d potential causes X. Based on an assumption of a rotation-invariant generating process of the model, a recent study shows that the spectral measure induced by the regression coefficient vector with respect to the covariance matrix of X is close to a uniform measure in purely causal cases, but it differs from a uniform measure characteristically in the presence of a scalar confounder. Analyzing spectral measure patterns could help to detect confounding. In this letter, we propose to use the first moment of the spectral measure for confounder detection. We calculate the first moment of the regression-vector-induced spectral measure and compare it with the first moment of a uniform spectral measure, both defined with respect to the covariance matrix of X. The two moments coincide in nonconfounding cases and differ from each other in the presence of confounding. This statistical causal-confounding asymmetry can be used for confounder detection. Without the need to analyze the spectral measure pattern, our method avoids the difficulty of metric choice and multiple parameter optimization. Experiments on synthetic and real data show the performance of this method.
NASA Astrophysics Data System (ADS)
Aulenbach, B. T.; Burns, D. A.; Shanley, J. B.; Yanai, R. D.; Bae, K.; Wild, A.; Yang, Y.; Dong, Y.
2013-12-01
There are many sources of uncertainty in estimates of streamwater solute flux. Flux is the product of discharge and concentration (summed over time), each of which has measurement uncertainty of its own. Discharge can be measured almost continuously, but concentrations are usually determined from discrete samples, which increases uncertainty dependent on sampling frequency and on how concentrations are assigned for the periods between samples. Gaps between samples can be estimated by linear interpolation or by models that use the relations between concentration and continuously measured or known variables such as discharge, season, temperature, and time. For this project, developed in cooperation with QUEST (Quantifying Uncertainty in Ecosystem Studies), we evaluated uncertainty for three flux estimation methods and three different sampling frequencies (monthly, weekly, and weekly plus event). The constituents investigated were dissolved NO3, Si, SO4, and dissolved organic carbon (DOC), solutes whose concentration dynamics exhibit strongly contrasting behavior. The evaluation was completed for a 10-year period at five small, forested watersheds in Georgia, New Hampshire, New York, Puerto Rico, and Vermont. Concentration regression models were developed for each solute at each of the three sampling frequencies for all five watersheds. Fluxes were then calculated using (1) a linear interpolation approach, (2) a regression-model method, and (3) the composite method, which combines the regression-model method for estimating concentrations with the linear interpolation method for correcting model residuals to the observed sample concentrations. We considered the best estimates of flux to be those derived using the composite method at the highest sampling frequency. We also evaluated the importance of sampling frequency and estimation method on flux estimate uncertainty; flux uncertainty was dependent on the variability characteristics of each solute and varied for different reporting periods (e.g. the 10-year study period vs. annual vs. monthly). The usefulness of the two regression-model-based flux estimation approaches was dependent upon the amount of variance in concentrations the regression models could explain. Our results can guide the development of optimal sampling strategies by weighing sampling frequency against improvements in the uncertainty of stream flux estimates for solutes with particular characteristics of variability. The appropriate flux estimation method depends on a combination of sampling frequency and the strength of the concentration regression models. Sites: Biscuit Brook (Frost Valley, NY), Hubbard Brook Experimental Forest and LTER (West Thornton, NH), Luquillo Experimental Forest and LTER (Luquillo, Puerto Rico), Panola Mountain (Stockbridge, GA), Sleepers River Research Watershed (Danville, VT)
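The three flux estimation methods compared above differ only in how between-sample concentrations are filled in; a schematic sketch with synthetic daily data and a deliberately simple concentration-discharge regression:

```python
import numpy as np

rng = np.random.default_rng(1)
days = 365
flow = np.exp(rng.normal(2.0, 0.5, size=days))              # daily discharge
true_conc = 5.0 - 1.2 * np.log(flow) + rng.normal(0, 0.2, days)

sample_idx = np.arange(0, days, 7)                          # weekly sampling
obs_conc = true_conc[sample_idx]

# (1) linear interpolation between samples
conc_interp = np.interp(np.arange(days), sample_idx, obs_conc)

# (2) regression model: concentration as a function of log discharge
b1, b0 = np.polyfit(np.log(flow[sample_idx]), obs_conc, 1)
conc_model = b0 + b1 * np.log(flow)

# (3) composite: model estimate plus interpolated model residuals at the samples
resid = obs_conc - conc_model[sample_idx]
conc_composite = conc_model + np.interp(np.arange(days), sample_idx, resid)

for name, conc in [("interpolation", conc_interp),
                   ("regression model", conc_model),
                   ("composite", conc_composite)]:
    flux = np.sum(flow * conc)                              # flux = sum(Q * C)
    print(f"{name:17s}: flux = {flux:,.0f}")
```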
NASA Technical Reports Server (NTRS)
Wu, Man Li C.; Schubert, Siegfried; Lin, Ching I.; Stajner, Ivanka; Einaudi, Franco (Technical Monitor)
2000-01-01
A method is developed for validating model-based estimates of atmospheric moisture and ground temperature using satellite data. The approach relates errors in estimates of clear-sky longwave fluxes at the top of the Earth-atmosphere system to errors in geophysical parameters. The fluxes include clear-sky outgoing longwave radiation (CLR) and radiative flux in the window region between 8 and 12 microns (RadWn). The approach capitalizes on the availability of satellite estimates of CLR and RadWn and other auxiliary satellite data, and multiple global four-dimensional data assimilation (4-DDA) products. The basic methodology employs off-line forward radiative transfer calculations to generate synthetic clear-sky longwave fluxes from two different 4-DDA data sets. Simple linear regression is used to relate the clear-sky longwave flux discrepancies to discrepancies in ground temperature (ΔTg) and broad-layer integrated atmospheric precipitable water (Δpw). The slopes of the regression lines define sensitivity parameters which can be exploited to help interpret mismatches between satellite observations and model-based estimates of clear-sky longwave fluxes. For illustration we analyze the discrepancies in the clear-sky longwave fluxes between an early implementation of the Goddard Earth Observing System Data Assimilation System (GEOS2) and a recent operational version of the European Centre for Medium-Range Weather Forecasts data assimilation system. The analysis of the synthetic clear-sky flux data shows that simple linear regression employing ΔTg and broad-layer Δpw provides a good approximation to the full radiative transfer calculations, typically explaining more than 90% of the 6-hourly variance in the flux differences. These simple regression relations can be inverted to "retrieve" the errors in the geophysical parameters. Uncertainties (normalized by standard deviation) in the monthly mean retrieved parameters range from 7% for ΔTg to approximately 20% for the lower tropospheric moisture between 500 hPa and the surface. The regression relationships developed from the synthetic flux data, together with CLR and RadWn observed with the Clouds and the Earth's Radiant Energy System instrument, are used to assess the quality of the GEOS2 Tg and pw. Results showed that the GEOS2 Tg is too cold over land, and pw in upper layers is too high over the tropical oceans and too low in the lower atmosphere.
Alzheimer's Disease Detection by Pseudo Zernike Moment and Linear Regression Classification.
Wang, Shui-Hua; Du, Sidan; Zhang, Yin; Phillips, Preetha; Wu, Le-Nan; Chen, Xian-Qing; Zhang, Yu-Dong
2017-01-01
This study presents an improved method based on "Gorji et al. Neuroscience. 2015" by introducing a relatively new classifier, linear regression classification. Our method selects one axial slice from the 3D brain image and employs pseudo Zernike moments with a maximum order of 15 to extract 256 features from each image. Finally, linear regression classification is harnessed as the classifier. The proposed approach obtains an accuracy of 97.51%, a sensitivity of 96.71%, and a specificity of 97.73%. Our method performs better than Gorji's approach and five other state-of-the-art approaches. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
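Linear regression classification, the classifier named above, assigns a test sample to the class whose training features reconstruct it best by least squares; a minimal sketch with synthetic 256-dimensional vectors standing in for the pseudo Zernike features:

```python
import numpy as np

def lrc_predict(test_vec, class_matrices):
    """Linear regression classification: regress the test vector onto each
    class's training feature matrix and pick the smallest residual."""
    best_class, best_resid = None, np.inf
    for label, X in class_matrices.items():       # X: (n_features, n_train_samples)
        beta, *_ = np.linalg.lstsq(X, test_vec, rcond=None)
        resid = np.linalg.norm(test_vec - X @ beta)
        if resid < best_resid:
            best_class, best_resid = label, resid
    return best_class

rng = np.random.default_rng(0)
n_feat = 256                                       # e.g. pseudo Zernike moments
classes = {
    "AD": rng.normal(1.0, 1.0, size=(n_feat, 20)),
    "control": rng.normal(-1.0, 1.0, size=(n_feat, 20)),
}
test = rng.normal(1.0, 1.0, size=n_feat)           # drawn near the "AD" cluster
print(lrc_predict(test, classes))                  # expected: "AD"
```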
Kwan, Johnny S H; Kung, Annie W C; Sham, Pak C
2011-09-01
Selective genotyping can increase power in quantitative trait association studies. One example of selective genotyping is two-tail extreme selection, but simple linear regression analysis gives a biased estimate of the genetic effect. Here, we present a simple correction for the bias.
Gjerde, Hallvard; Verstraete, Alain
2010-02-25
To study several methods for estimating the prevalence of high blood concentrations of tetrahydrocannabinol and amphetamine in a population of drug users by analysing oral fluid (saliva). Five methods were compared, including simple calculation procedures dividing the drug concentrations in oral fluid by average or median oral fluid/blood (OF/B) drug concentration ratios or by linear regression coefficients, and more complex Monte Carlo simulations. Populations of 311 cannabis users and 197 amphetamine users from the Rosita-2 Project were studied. The results of a feasibility study suggested that Monte Carlo simulations might give better accuracy than simple calculations if good data on OF/B ratios are available. When using only 20 randomly selected OF/B ratios, a Monte Carlo simulation gave the best accuracy but not the best precision. Dividing by the OF/B regression coefficient gave acceptable accuracy and precision, and was therefore the best method. None of the methods gave acceptable accuracy if the prevalence of high blood drug concentrations was less than 15%. Dividing the drug concentration in oral fluid by the OF/B regression coefficient gave an acceptable estimate of the prevalence of high blood drug concentrations in a population, and may therefore give valuable additional information on possible drug impairment, e.g. in roadside surveys of drugs and driving. If good data on the distribution of OF/B ratios are available, a Monte Carlo simulation may give better accuracy. Copyright © 2009 Elsevier Ireland Ltd. All rights reserved.
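The Monte Carlo approach described above can be sketched as follows, assuming (as one plausible choice, not necessarily the authors') a lognormal model for OF/B ratios and an illustrative blood-concentration threshold:

```python
import numpy as np

rng = np.random.default_rng(42)

observed_ofb = rng.lognormal(mean=1.0, sigma=0.6, size=20)      # 20 measured OF/B ratios
oral_fluid_conc = rng.lognormal(mean=2.0, sigma=0.8, size=311)  # survey samples
thr = 3.0  # blood concentration defining "high" (illustrative units)

# Fit a lognormal distribution to the observed OF/B ratios
mu, sigma = np.log(observed_ofb).mean(), np.log(observed_ofb).std(ddof=1)

# Monte Carlo: repeatedly divide each OF concentration by a sampled OF/B ratio
n_sims = 10_000
prevalences = np.empty(n_sims)
for i in range(n_sims):
    ratios = rng.lognormal(mu, sigma, size=oral_fluid_conc.size)
    blood_est = oral_fluid_conc / ratios
    prevalences[i] = np.mean(blood_est > thr)

print(f"estimated prevalence of high blood conc.: {prevalences.mean():.1%} "
      f"(95% band {np.percentile(prevalences, 2.5):.1%}-"
      f"{np.percentile(prevalences, 97.5):.1%})")
```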
Predicting the demand of physician workforce: an international model based on "crowd behaviors".
Tsai, Tsuen-Chiuan; Eliasziw, Misha; Chen, Der-Fang
2012-03-26
Appropriateness of the physician workforce greatly influences the quality of healthcare. When facing a crisis of physician shortages, the correction of manpower always takes an extended time period, and both the public and health personnel suffer. To calculate an appropriate Physician Density (PD) for a specific country, this study was designed to create a PD prediction model based on health-related data from many countries. Twelve factors that could possibly impact physician demand were chosen, and data on these factors from 130 countries (out of 195 reviewed) were extracted. Multiple stepwise linear regression was used to derive the PD prediction model, and a split-sample cross-validation procedure was performed to evaluate the generalizability of the results. Using data from 130 countries, with consideration of the correlation between variables and prevention of multi-collinearity, seven of the 12 predictor variables were selected for entry into the stepwise regression procedure. The final model was: PD = (5.014 - 0.128 × proportion under age 15 years + 0.034 × life expectancy)², with an R² of 80.4%. Using the prediction equation, 70 countries had PDs with "negative discrepancy", while 58 had PDs with "positive discrepancy". This study provided a regression-based PD model to calculate a "norm" PD for a specific country. A large PD discrepancy in a country indicates the need to examine physicians' workloads and their well-being, the effectiveness/efficiency of medical care, the promotion of population health, and team resource management.
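The final model is directly computable from the published coefficients; a worked example with illustrative inputs (the abstract does not state the units of PD, so they are left as defined in the study):

```python
def predicted_physician_density(pct_under_15: float, life_expectancy: float) -> float:
    """Final regression model from the abstract (R^2 = 80.4%):
    PD = (5.014 - 0.128 * [% of population under age 15] + 0.034 * life expectancy)^2.
    Units of PD are as defined in the study (not stated in the abstract)."""
    root = 5.014 - 0.128 * pct_under_15 + 0.034 * life_expectancy
    return root ** 2

# Illustrative inputs, not data from the paper:
print(predicted_physician_density(pct_under_15=20.0, life_expectancy=78.0))
```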
[Radiological anatomical examinations in skulls from anthropological collections (author's transl)].
Wicke, L
1976-01-01
A total of 114 skulls dating from the Neolithic Age, the Bronze Age and the Iron Age, of Incas and Red Indians, of Asians from North and South China, as well as Negro skulls found in Turkey were radiologically analysed and compared with control skulls of recent origin. The 3 standard X-ray views were taken (postero-anterior, axial and lateral) and appropriate linear and angle measurements were carried out. The resultant 4120 values were compared by variance analysis and the differences between the groups are presented. The differences in linear values may be attributable merely to racial variation; the constancy of the obtained angle measurements is striking. The results were also compared by means of linear regression with measured volume values of the brain skull; it was thereby possible to develop a new formula by means of which the volume of the brain skull can be calculated from the parameter BPH (introduced by the author) and from the distance B with the help of a constant factor. The importance of Radiology in Anthropology is pointed out.
Linear Regression Links Transcriptomic Data and Cellular Raman Spectra.
Kobayashi-Kirschvink, Koseki J; Nakaoka, Hidenori; Oda, Arisa; Kamei, Ken-Ichiro F; Nosho, Kazuki; Fukushima, Hiroko; Kanesaki, Yu; Yajima, Shunsuke; Masaki, Haruhiko; Ohta, Kunihiro; Wakamoto, Yuichi
2018-06-08
Raman microscopy is an imaging technique that has been applied to assess molecular compositions of living cells to characterize cell types and states. However, owing to the diverse molecular species in cells and challenges of assigning peaks to specific molecules, it has not been clear how to interpret cellular Raman spectra. Here, we provide firm evidence that cellular Raman spectra and transcriptomic profiles of Schizosaccharomyces pombe and Escherichia coli can be computationally connected and thus interpreted. We find that the dimensions of high-dimensional Raman spectra and transcriptomes measured by RNA sequencing can be reduced and connected linearly through a shared low-dimensional subspace. Accordingly, we were able to predict global gene expression profiles by applying the calculated transformation matrix to Raman spectra, and vice versa. Highly expressed non-coding RNAs contributed to the Raman-transcriptome linear correspondence more significantly than mRNAs in S. pombe. This demonstration of correspondence between cellular Raman spectra and transcriptomes is a promising step toward establishing spectroscopic live-cell omics studies. Copyright © 2018 Elsevier Inc. All rights reserved.
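A toy version of the shared low-dimensional linear correspondence described above: reduce both data blocks with PCA and fit a linear map between the reduced coordinates. This is a schematic of the idea with synthetic paired data, not the authors' exact pipeline:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_cells, n_wavenumbers, n_genes, k = 60, 500, 2000, 5

# Synthetic paired data sharing a k-dimensional latent state
latent = rng.normal(size=(n_cells, k))
raman = latent @ rng.normal(size=(k, n_wavenumbers)) \
    + 0.1 * rng.normal(size=(n_cells, n_wavenumbers))
transcriptome = latent @ rng.normal(size=(k, n_genes)) \
    + 0.1 * rng.normal(size=(n_cells, n_genes))

# Reduce each block, then learn a linear map between the subspaces
pca_r, pca_t = PCA(n_components=k), PCA(n_components=k)
zr, zt = pca_r.fit_transform(raman), pca_t.fit_transform(transcriptome)
mapping = LinearRegression().fit(zr, zt)

# Predict gene expression profiles from Raman spectra via the shared subspace
pred = pca_t.inverse_transform(mapping.predict(zr))
corr = np.corrcoef(pred.ravel(), transcriptome.ravel())[0, 1]
print(f"correlation between predicted and observed expression: {corr:.2f}")
```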
2013-01-01
Structural constants of oximes and organophosphorus (OP) compounds were related to the ability of oximes to reactivate OP-inhibited AChE. Multiple linear regression equations were analyzed using oxime/phosphonate pairs, 21 oxime/phosphoramidate pairs, and 12 oxime/phosphate pairs; the best linear regression equation was obtained from the multiple regression analysis. (Cites the application of the Hammett equation with σph constants in the chemistry of organophosphorus compounds, Russ. Chem. Rev. 38 (1969) 795-811.)
Gifford, Katherine A; Phillips, Jeffrey S; Samuels, Lauren R; Lane, Elizabeth M; Bell, Susan P; Liu, Dandan; Hohman, Timothy J; Romano, Raymond R; Fritzsche, Laura R; Lu, Zengqi; Jefferson, Angela L
2015-07-01
A symptom of mild cognitive impairment (MCI) and Alzheimer's disease (AD) is a flat learning profile. Learning slope calculation methods vary, and the optimal method for capturing neuroanatomical changes associated with MCI and early AD pathology is unclear. This study cross-sectionally compared four different learning slope measures from the Rey Auditory Verbal Learning Test (simple slope, regression-based slope, two-slope method, peak slope) to structural neuroimaging markers of early AD neurodegeneration (hippocampal volume, cortical thickness in parahippocampal gyrus, precuneus, and lateral prefrontal cortex) across the cognitive aging spectrum [normal control (NC); (n=198; age=76±5), MCI (n=370; age=75±7), and AD (n=171; age=76±7)] in ADNI. Within diagnostic group, general linear models related slope methods individually to neuroimaging variables, adjusting for age, sex, education, and APOE4 status. Among MCI, better learning performance on simple slope, regression-based slope, and late slope (Trial 2-5) from the two-slope method related to larger parahippocampal thickness (all p-values<.01) and hippocampal volume (p<.01). Better regression-based slope (p<.01) and late slope (p<.01) were related to larger ventrolateral prefrontal cortex in MCI. No significant associations emerged between any slope and neuroimaging variables for NC (p-values ≥.05) or AD (p-values ≥.02). Better learning performances related to larger medial temporal lobe (i.e., hippocampal volume, parahippocampal gyrus thickness) and ventrolateral prefrontal cortex in MCI only. Regression-based and late slope were most highly correlated with neuroimaging markers and explained more variance above and beyond other common memory indices, such as total learning. Simple slope may offer an acceptable alternative given its ease of calculation.
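The four slope measures compared above can be made concrete for five RAVLT learning trials; a sketch whose formulas follow common conventions and may differ in detail from the paper's:

```python
import numpy as np

trials = np.array([5, 8, 10, 11, 12])          # words recalled on Trials 1-5

# Simple slope: total gain divided by the number of trial intervals
simple_slope = (trials[-1] - trials[0]) / (len(trials) - 1)

# Regression-based slope: OLS slope of recall on trial number
trial_num = np.arange(1, len(trials) + 1)
regression_slope = np.polyfit(trial_num, trials, 1)[0]

# Two-slope method: early slope (Trials 1-2) and late slope (Trials 2-5)
early_slope = trials[1] - trials[0]
late_slope = (trials[-1] - trials[1]) / 3

# Peak slope: largest single trial-to-trial gain
peak_slope = np.diff(trials).max()

print(simple_slope, regression_slope, early_slope, late_slope, peak_slope)
```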
Conceptual model of consumer’s willingness to eat functional foods
Babicz-Zielinska, Ewa; Jezewska-Zychowicz, Maria
The functional foods constitute an important segment of the food market. Among the factors that determine intentions to eat functional foods, psychological factors play very important roles; motives, attitudes, and personality are key factors. The relationships between socio-demographic characteristics, attitudes, and willingness to purchase functional foods have not been fully confirmed. Consumers' beliefs about the health benefits of eaten foods seem to be a strong determinant of the choice of functional foods. The objective of this study was to determine the relations between familiarity, attitudes, and beliefs about the benefits and risks of functional foods, and to develop conceptual models of willingness to eat. The sample comprised 1002 Polish consumers aged 15 and older. Foods enriched with vitamins or minerals, and cholesterol-lowering margarine or drinks, were considered. A questionnaire was constructed focusing on familiarity with the foods and on attitudes and beliefs about the benefits and risks of their consumption. Pearson correlations and linear regression equations were calculated. The strongest relations appeared between attitudes, high health value, and high benefits (r = 0.722 and 0.712 for enriched foods, and 0.664 and 0.693 for cholesterol-lowering foods), and between high health value and high benefits (0.814 for enriched foods and 0.758 for cholesterol-lowering foods). Conceptual models based on linear regression of the relations between attitudes and all other variables, with and without familiarity with the foods, were developed. Positive attitudes and declared consumption are more important for enriched foods. Beliefs in high health value and high benefits play the most important role in purchase decisions. The interrelations between the variables may be described by new linear regression models, with beliefs in high benefits, positive attitudes, and familiarity being the most significant predictors. Health expectations and trust in functional foods are the key factors in their choice.
Maintenance energy requirements in miniature colony dogs.
Serisier, S; Weber, M; Feugier, A; Fardet, M-O; Garnier, F; Biourge, V; German, A J
2013-05-01
There are numerous reports of maintenance energy requirements (MER) in dogs, but little information is available about energy requirements of miniature dog breeds. In this prospective, observational, cohort study, we aimed to determine MER in dogs from a number of miniature breeds and to determine which factors were associated with it. Forty-two dogs participated in the study. MER was calculated by determining daily energy intake (EI) during a period of 196 days (28-359 days) when body weight did not change significantly (e.g. ±2% in 12 weeks). Estimated median MER was 473 kJ/kg^0.75/day (285-766 kJ/kg^0.75/day), that is, a median of 113 kcal/kg^0.75/day (68-183 kcal/kg^0.75/day). In the obese dogs that lost weight, median MER after weight loss was completed was 360 kJ/kg^0.75/day (285-515 kJ/kg^0.75/day), that is, 86 kcal/kg^0.75/day (68-123 kcal/kg^0.75/day). Simple linear regression analysis suggested that three breeds (e.g. Chihuahua, p = 0.002; Yorkshire terrier, p = 0.039; dachshund, p = 0.035) had an effect on MER. In addition to breed, simple linear regression revealed that neuter status (p = 0.079) and having previously been overweight (p = 0.002) were also of significance. However, with multiple linear regression analysis, only previous overweight status (MER less in dogs previously overweight, p = 0.008) and breed (MER greater in Yorkshire terriers [p = 0.029] and less in Chihuahuas [p = 0.089]) remained in the final model. This study is the first to estimate MER in dogs of miniature breeds. Although further information from pet dogs is now needed, the current work will be useful for setting energy and nutrient requirements in such dogs in the future. Journal of Animal Physiology and Animal Nutrition © 2013 Blackwell Verlag GmbH.
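The metabolic-body-weight scaling in the abstract (kJ/kg^0.75/day) makes the MER computation a one-liner; a sketch with illustrative numbers chosen to land near the study median:

```python
def mer_kj_per_metabolic_kg(daily_energy_intake_kj: float, body_weight_kg: float) -> float:
    """Maintenance energy requirement per unit metabolic body weight (kg^0.75),
    computed from energy intake during a weight-stable period."""
    return daily_energy_intake_kj / body_weight_kg ** 0.75

# Illustrative: a 4 kg miniature-breed dog eating 1340 kJ/day while weight-stable
print(f"{mer_kj_per_metabolic_kg(1340, 4.0):.0f} kJ/kg^0.75/day")  # ~474, near the study median
```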
Extreme wind-wave modeling and analysis in the south Atlantic ocean
NASA Astrophysics Data System (ADS)
Campos, R. M.; Alves, J. H. G. M.; Guedes Soares, C.; Guimaraes, L. G.; Parente, C. E.
2018-04-01
A set of wave hindcasts is constructed using two different types of wind calibration, followed by an additional test retuning the input source term Sin in the wave model. The goal is to improve the simulation of extreme wave events in the South Atlantic Ocean without compromising average conditions. Wind fields are based on the Climate Forecast System Reanalysis (CFSR/NCEP). The first wind calibration applies a simple linear regression model, with coefficients obtained from the comparison of CFSR against buoy data. The second is a method in which deficiencies of the CFSR associated with severe sea-state events are remedied, whereby "defective" winds are replaced with satellite data within cyclones. A total of six wind datasets forced WAVEWATCH III, and three additional tests with modified Sin in WAVEWATCH III led to a total of nine wave hindcasts that were evaluated against satellite and buoy data for ambient and extreme conditions. The target variable considered is the significant wave height (Hs). Increasing sea-state severity shows a progressive increase in hindcast underestimation, which could be calculated as a function of percentiles. The wind calibration using a linear regression function shows results similar to the adjustment of the Sin term (increase of the βmax parameter) in WAVEWATCH III: it effectively reduces the average bias of Hs but cannot avoid the increase of errors with percentiles. The use of blended scatterometer winds within cyclones could reduce the increasing wave hindcast errors, mainly above the 93rd percentile, and leads to a better representation of Hs at the peak of the storms. The combination of linear regression calibration of non-cyclonic winds with scatterometer winds within the cyclones generated a wave hindcast with small errors from calm to extreme conditions. This approach led to a reduction of the percentage error of Hs from 14% to less than 8% for extreme waves, while also improving the RMSE.
New functions for estimating AOT40 from ozone passive sampling
NASA Astrophysics Data System (ADS)
De Marco, Alessandra; Vitale, Marcello; Kilic, Umit; Serengil, Yusuf; Paoletti, Elena
2014-10-01
AOT40 is the present European standard to assess whether ozone (O3) pollution is a risk for vegetation, and is calculated from hourly O3 concentrations measured by automatic devices, i.e. by active monitoring. Passive O3 monitoring is widespread in remote environments. The Loibl function estimates the mean daily O3 profile and thus hourly O3 concentrations, and has been proposed for calculating AOT40 from passive samplers. We investigated whether this function performs well in inhomogeneous terrain such as the Italian territory. Data from 75 active monitoring stations (28 rural and 47 suburban) were analysed over two years. AOT40 was calculated from hourly O3 data either measured by active monitoring or estimated by the Loibl function applied to biweekly averages of active-measurement hourly data. The latter approach simulated the data obtained from passive monitoring, as two weeks is the usual exposure window of passive samplers. Residuals between AOT40 estimated by applying the Loibl function and AOT40 calculated from active monitoring ranged from +241% to -107%, suggesting that the Loibl function is inadequate to accurately predict AOT40 in Italy. New statistical models were built for both rural and suburban areas by using non-linear models and including predictors that can be easily measured at forest sites. The modelled AOT40 values strongly depended on physical predictors (latitude and longitude), alone or in combination with other predictors such as seasonal cumulated ozone and elevation. These results suggest that multivariate non-linear regression models work better than the Loibl-based approach in estimating AOT40.
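AOT40 itself has a simple arithmetic definition: the sum of hourly exceedances above 40 ppb during daylight hours. A minimal sketch, assuming hourly concentrations in ppb and an 08:00-20:00 daylight window (one common convention; the regulatory definition also restricts to the growing season):

```python
import numpy as np

def aot40(hourly_o3_ppb: np.ndarray, daylight_mask: np.ndarray) -> float:
    """AOT40 in ppb*h: sum of (concentration - 40) over daylight hours above 40 ppb."""
    exceedance = np.clip(hourly_o3_ppb - 40.0, 0.0, None)
    return float(np.sum(exceedance[daylight_mask]))

rng = np.random.default_rng(0)
o3 = rng.gamma(shape=9.0, scale=5.0, size=24 * 30)   # one month of hourly O3, ppb
hour_of_day = np.tile(np.arange(24), 30)
daylight = (hour_of_day >= 8) & (hour_of_day < 20)
print(f"AOT40 = {aot40(o3, daylight):.0f} ppb*h")
```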
Lee, Dong Ho; Lee, Jae Young; Lee, Kyung Bun; Han, Joon Koo
2017-11-01
Purpose To determine factors that significantly affect the focal disturbance (FD) ratio calculated with an acoustic structure quantification (ASQ) technique in a dietary-induced fatty liver disease rat model and to assess the diagnostic performance of the FD ratio in the assessment of hepatic steatosis by using histopathologic examination as a standard of reference. Materials and Methods Twenty-eight male F344 rats were fed a methionine-choline-deficient diet with a variable duration (3.5 days [half week] or 1, 2, 3, 4, 5, or 6 weeks; four rats in each group). A control group of four rats was maintained on a standard diet. At the end of each diet period, ASQ ultrasonography (US) and magnetic resonance (MR) spectroscopy were performed. Then, the rat was sacrificed and histopathologic examination of the liver was performed. Receiver operating characteristic curve analysis was performed to assess the diagnostic performance of the FD ratio in the evaluation of the degree of hepatic steatosis. The Spearman correlation coefficient was calculated to assess the correlation between the ordinal values, and multivariate linear regression analysis was used to identify significant determinant factors for the FD ratio. Results The diagnostic performance of the FD ratio in the assessment of the degree of hepatic steatosis (area under the receiver operating characteristic curve: 1.000 for 5%-33% steatosis, 0.981 for >33% to 66% steatosis, and 0.965 for >66% steatosis) was excellent and was comparable to that of MR spectroscopy. There was a strong negative linear correlation between the FD ratio and the estimated fat fraction at MR spectroscopy (Spearman ρ, -0.903; P < .001). Multivariate linear regression analysis showed that the degree of hepatic steatosis (P < .001) and fibrosis stage (P = .022) were significant factors affecting the FD ratio. Conclusion The FD ratio may potentially provide good diagnostic performance in the assessment of the degree of hepatic steatosis, with a strong negative linear correlation with the estimated fat fraction at MR spectroscopy. The degree of steatosis and stage of fibrosis at histopathologic examination were significant factors that affected the FD ratio. © RSNA, 2017 Online supplemental material is available for this article.
Specialization Agreements in the Council for Mutual Economic Assistance
1988-02-01
Proportions were transformed to stabilize variance (S. Weisberg, Applied Linear Regression, 2nd ed., John Wiley & Sons, New York, 1985, p. 134).
Radio Propagation Prediction Software for Complex Mixed Path Physical Channels
2006-08-14
4.4.6. Applied Linear Regression Analysis in the Frequency Range 1-50 MHz. In order to construct a comprehensive numerical algorithm capable of
Due to the complexity of the processes contributing to beach bacteria concentrations, many researchers rely on statistical modeling, among which multiple linear regression (MLR) modeling is most widely used. Despite its ease of use and interpretation, there may be time dependence...
Data Transformations for Inference with Linear Regression: Clarifications and Recommendations
ERIC Educational Resources Information Center
Pek, Jolynn; Wong, Octavia; Wong, C. M.
2017-01-01
Data transformations have been promoted as a popular and easy-to-implement remedy to address the assumption of normally distributed errors (in the population) in linear regression. However, the application of data transformations introduces non-ignorable complexities which should be fully appreciated before their implementation. This paper adds to…
USING LINEAR AND POLYNOMIAL MODELS TO EXAMINE THE ENVIRONMENTAL STABILITY OF VIRUSES
The article presents the development of model equations for describing the fate of viral infectivity in environmental samples. Most of the models were based upon the use of a two-step linear regression approach. The first step employs regression of log base 10 transformed viral t...
Identifying the Factors That Influence Change in SEBD Using Logistic Regression Analysis
ERIC Educational Resources Information Center
Camilleri, Liberato; Cefai, Carmel
2013-01-01
Multiple linear regression and ANOVA models are widely used in applications since they provide effective statistical tools for assessing the relationship between a continuous dependent variable and several predictors. However these models rely heavily on linearity and normality assumptions and they do not accommodate categorical dependent…
Simple and multiple linear regression: sample size considerations.
Hanley, James A
2016-11-01
The suggested "two subjects per variable" (2SPV) rule of thumb in the Austin and Steyerberg article is a chance to bring out some long-established and quite intuitive sample size considerations for both simple and multiple linear regression. This article distinguishes two of the major uses of regression models that imply very different sample size considerations, neither served well by the 2SPV rule. The first is etiological research, which contrasts mean Y levels at differing "exposure" (X) values and thus tends to focus on a single regression coefficient, possibly adjusted for confounders. The second research genre guides clinical practice. It addresses Y levels for individuals with different covariate patterns or "profiles." It focuses on the profile-specific (mean) Y levels themselves, estimating them via linear compounds of regression coefficients and covariates. By drawing on long-established closed-form variance formulae that lie beneath the standard errors in multiple regression, and by rearranging them for heuristic purposes, one arrives at quite intuitive sample size considerations for both research genres. Copyright © 2016 Elsevier Inc. All rights reserved.
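The closed-form variance the article draws on can be rearranged into a direct sample-size calculation for the single-coefficient (etiological) case; a sketch assuming simple linear regression, where Var(beta_hat) = sigma^2 / (n * Var(x)):

```python
import math

def n_for_target_se(residual_sd: float, sd_x: float, target_se: float) -> int:
    """Smallest n such that SE(beta_hat) <= target, using
    Var(beta_hat) = sigma^2 / (n * Var(x)) for simple linear regression."""
    n = (residual_sd / (target_se * sd_x)) ** 2
    return math.ceil(n)

# e.g. residual SD 10, exposure SD 2, and a target SE for the slope of 0.5
print(n_for_target_se(residual_sd=10.0, sd_x=2.0, target_se=0.5))  # 100
```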
Yu, Xu; Lin, Jun-Yu; Jiang, Feng; Du, Jun-Wei; Han, Ji-Zhong
2018-01-01
Cross-domain collaborative filtering (CDCF) solves the sparsity problem by transferring rating knowledge from auxiliary domains. Obviously, different auxiliary domains have different importance to the target domain. However, previous works cannot evaluate effectively the significance of different auxiliary domains. To overcome this drawback, we propose a cross-domain collaborative filtering algorithm based on Feature Construction and Locally Weighted Linear Regression (FCLWLR). We first construct features in different domains and use these features to represent different auxiliary domains. Thus the weight computation across different domains can be converted as the weight computation across different features. Then we combine the features in the target domain and in the auxiliary domains together and convert the cross-domain recommendation problem into a regression problem. Finally, we employ a Locally Weighted Linear Regression (LWLR) model to solve the regression problem. As LWLR is a nonparametric regression method, it can effectively avoid underfitting or overfitting problem occurring in parametric regression methods. We conduct extensive experiments to show that the proposed FCLWLR algorithm is effective in addressing the data sparsity problem by transferring the useful knowledge from the auxiliary domains, as compared to many state-of-the-art single-domain or cross-domain CF methods.
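Locally weighted linear regression, the final component of FCLWLR, fits a separate weighted least-squares problem per query point; a minimal one-dimensional sketch (the algorithm in the paper operates on constructed cross-domain features, not a scalar input):

```python
import numpy as np

def lwlr_predict(x_query, X, y, tau=0.5):
    """Locally weighted linear regression with a Gaussian kernel of bandwidth
    tau: solve a weighted least-squares problem centered on each query point."""
    Xb = np.column_stack([np.ones_like(X), X])            # add intercept column
    preds = []
    for xq in np.atleast_1d(x_query):
        w = np.exp(-((X - xq) ** 2) / (2 * tau ** 2))     # local weights
        W = np.diag(w)
        beta = np.linalg.solve(Xb.T @ W @ Xb, Xb.T @ W @ y)
        preds.append(beta[0] + beta[1] * xq)
    return np.array(preds)

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 6, 80))
y = np.sin(X) + rng.normal(scale=0.1, size=X.size)        # nonlinear target
print(lwlr_predict([1.5, 3.0], X, y, tau=0.4))            # tracks sin(x) locally
```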
Liang, Yuzhen; Xiong, Ruichang; Sandler, Stanley I; Di Toro, Dominic M
2017-09-05
Polyparameter Linear Free Energy Relationships (pp-LFERs), also called Linear Solvation Energy Relationships (LSERs), are used to predict many environmentally significant properties of chemicals. A method is presented for computing the necessary chemical parameters, the Abraham parameters (AP), used by many pp-LFERs. It employs quantum chemical calculations and uses only the chemical's molecular structure. The method computes the Abraham E parameter using the density-functional-theory-computed molecular polarizability and the Clausius-Mossotti equation relating the index of refraction to the molecular polarizability, estimates the Abraham V as the COSMO-calculated molecular volume, and computes the remaining APs S, A, and B jointly with a multiple linear regression using sixty-five solvent-water partition coefficients computed using the quantum mechanical COSMO-SAC solvation model. These solute parameters, referred to as Quantum Chemically estimated Abraham Parameters (QCAP), are further adjusted by fitting to experimentally based APs using QCAP parameters as the independent variables so that they are compatible with existing Abraham pp-LFERs. QCAP and adjusted QCAP for 1827 neutral chemicals are included. For 24 solvent-water systems including octanol-water, predicted log solvent-water partition coefficients using adjusted QCAP have the smallest root-mean-square errors (RMSEs, 0.314-0.602) compared to predictions made using APs estimated with the molecular-fragment-based method ABSOLV (0.45-0.716). For munition and munition-like compounds, adjusted QCAP has a much lower RMSE (0.860) than does ABSOLV (4.45), which essentially fails for these compounds.
Ghasemi, Jahan B; Safavi-Sohi, Reihaneh; Barbosa, Euzébio G
2012-02-01
A quasi-4D-QSAR study has been carried out on a series of potent Gram-negative LpxC inhibitors. This approach makes use of molecular dynamics (MD) trajectories and topology information retrieved from the GROMACS package. The methodology is based on the generation of a conformational ensemble profile (CEP) for each compound instead of a single conformation, followed by the calculation of intermolecular interaction energies at each grid point considering probes and all aligned conformations resulting from the MD simulations. These interaction energies are the independent variables employed in the QSAR analysis. The proposed methodology was compared to the comparative molecular field analysis (CoMFA) formalism; it jointly explores the main features of CoMFA and 4D-QSAR models. Step-wise multiple linear regression was used for the selection of the most informative variables. After variable selection, multiple linear regression (MLR) and partial least squares (PLS) methods were used to build the regression models. Leave-N-out cross-validation (LNO) and Y-randomization were performed to confirm the robustness of the model, in addition to analysis of an independent test set. The best models provided the following statistics: [formula in text] (PLS) and [formula in text] (MLR). A docking study with the CDOCKER algorithm was applied to investigate the major interactions in the protein-ligand complex. Visualization of the descriptors of the best model helps to interpret the model from a chemical point of view, supporting the applicability of this new approach in rational drug design.
Age estimation standards for a Western Australian population using the coronal pulp cavity index.
Karkhanis, Shalmira; Mack, Peter; Franklin, Daniel
2013-09-10
Age estimation is a vital aspect of creating a biological profile and aids investigators by narrowing down potentially matching identities from the available pool. In addition to routine casework, in the present global political scenario, age estimation in living individuals is required in cases involving refugees, asylum seekers and human trafficking, and to ascertain the age of criminal responsibility. Robust methods that are simple, non-invasive and ethically viable are therefore required. The aim of the present study is to test the reliability and applicability of the coronal pulp cavity index method for the purpose of developing age estimation standards for an adult Western Australian population. A total of 450 orthopantomograms (220 females and 230 males) of Australian individuals were analyzed. Crown and coronal pulp chamber heights were measured in the mandibular left and right premolars and the first and second molars. These measurements were then used to calculate the tooth coronal index. Data were analyzed using paired-sample t-tests to assess bilateral asymmetry, followed by simple linear and multiple regressions to develop age estimation models. The most accurate age estimation based on a simple linear regression model was with the mandibular right first molar (SEE ±8.271 years). Multiple regression models improved age prediction accuracy considerably, and the most accurate model was with the bilateral first and second molars (SEE ±6.692 years). This study represents the first investigation of this method in a Western Australian population, and our results indicate that the method is suitable for forensic application. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
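A minimal sketch of the kind of simple linear regression age model the study reports, including how the standard error of the estimate (SEE) is computed; the data below are synthetic and the coefficients are not the paper's.

```python
# Hedged sketch: regress age on a tooth coronal index (TCI) and report the
# SEE = sqrt(SSE / (n - 2)). All values here are synthetic illustrations.
import numpy as np

rng = np.random.default_rng(1)
tci = rng.uniform(20, 45, 120)                   # hypothetical TCI values
age = 70 - 1.1 * tci + rng.normal(0, 8, 120)     # synthetic ages

b, a = np.polyfit(tci, age, 1)                   # slope, intercept
resid = age - (a + b * tci)
see = np.sqrt(np.sum(resid ** 2) / (len(age) - 2))
print(f"age = {a:.2f} + {b:.2f} * TCI, SEE = ±{see:.3f} years")
```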
Esserman, Denise A.; Moore, Charity G.; Roth, Mary T.
2009-01-01
Older community-dwelling adults often take multiple medications for numerous chronic diseases. Non-adherence to these medications can have a large public health impact. Therefore, the measurement and modeling of medication adherence in the setting of polypharmacy is an important area of research. We apply a variety of modeling techniques (standard linear regression; weighted linear regression; adjusted linear regression; naïve logistic regression; beta-binomial (BB) regression; generalized estimating equations (GEE)) to binary medication adherence data from a study in a North Carolina-based population of older adults, where each medication an individual was taking was classified as adherent or non-adherent. In addition, through simulation we compare these methods on Type I error rates, bias, power, empirical 95% coverage, and goodness of fit. We find that estimation and inference using GEE are robust to a wide variety of scenarios, and we recommend this approach in the setting of polypharmacy when adherence is dichotomously measured for multiple medications per person. PMID:20414358
Genetic Programming Transforms in Linear Regression Situations
NASA Astrophysics Data System (ADS)
Castillo, Flor; Kordon, Arthur; Villa, Carlos
The chapter summarizes the use of Genetic Programming (GP) in Multiple Linear Regression (MLR) to address multicollinearity and Lack of Fit (LOF). The basis of the proposed method is applying appropriate input transforms (model respecification) that deal with these issues while preserving the information content of the original variables. The transforms are selected from symbolic regression models with an optimal trade-off between prediction accuracy and expressional complexity, generated by multiobjective Pareto-front GP. The chapter includes a comparative study of the GP-generated transforms with Ridge Regression, a variant of ordinary Multiple Linear Regression that has been a useful and commonly employed approach for reducing multicollinearity. The advantages of GP-generated model respecification are clearly defined and demonstrated. Some recommendations for transform selection are given as well. The application benefits of the proposed approach are illustrated with a real industrial application in one of the broadest empirical modeling areas in manufacturing - robust inferential sensors. The chapter contributes to increasing the awareness of the potential of GP in statistical model building by MLR.
Naval Research Logistics Quarterly. Volume 28. Number 3,
1981-09-01
denotes component-wise maximum. f has antitone (isotone) differences on C x D if for c1 < c2 and d1 < d2, ...or negative correlations and linear or nonlinear regressions. Given are the moments to order two and, for special cases, the regression function and...data sets. We designate this bnb distribution as G - B - N(a, 0, v). The distribution admits only positive correlation and linear regressions
Automating approximate Bayesian computation by local linear regression.
Thornton, Kevin R
2009-07-07
In several biological contexts, parameter inference often relies on computationally intensive techniques. "Approximate Bayesian Computation", or ABC, methods based on summary statistics have become increasingly popular. A particular flavor of ABC, based on using a linear regression to approximate the posterior distribution of the parameters conditional on the summary statistics, is computationally appealing, yet no standalone tool exists to automate the procedure. Here, I describe a program to implement the method. The software package ABCreg implements the local linear-regression approach to ABC. The advantages are: 1. The code is standalone and fully documented. 2. The program will automatically process multiple data sets and create unique output files for each (which may be processed immediately in R), facilitating the testing of inference procedures on simulated data or the analysis of multiple data sets. 3. The program implements two different transformation methods for the regression step. 4. Analysis options are controlled on the command line by the user, and the program is designed to output warnings for cases where the regression fails. 5. The program does not depend on any particular simulation machinery (coalescent, forward-time, etc.) and is therefore a general tool for processing the results of any simulation. 6. The code is open-source and modular. Examples of applying the software to empirical data from Drosophila melanogaster, and of testing the procedure on simulated data, are shown. In practice, ABCreg simplifies implementing ABC based on local linear regression.
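The regression-adjustment step that such software automates can be sketched as follows; this is a generic Beaumont-style local linear regression adjustment, with function names and the Epanechnikov weighting as illustrative assumptions, not the actual ABCreg interface.

```python
# Sketch of local linear regression ABC: accept the closest simulations,
# regress parameters on summaries, and subtract the fitted linear trend.
import numpy as np

def abc_reg_adjust(theta, S, s_obs, accept_frac=0.1):
    """theta: (n,) simulated parameters; S: (n,k) summary statistics;
    s_obs: (k,) observed summaries. Returns adjusted accepted draws."""
    d = np.sqrt(np.sum((S - s_obs) ** 2, axis=1))   # distance to the data
    tol = np.quantile(d, accept_frac)
    keep = d <= tol
    Sk, tk, dk = S[keep], theta[keep], d[keep]
    w = 1 - (dk / tol) ** 2                         # Epanechnikov weights
    X = np.hstack([np.ones((Sk.shape[0], 1)), Sk - s_obs])
    W = np.diag(w)
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ tk)
    # Adjusted draws: theta* = theta - (s - s_obs) @ b, approximating
    # samples from the posterior at the observed summaries.
    return tk - (Sk - s_obs) @ beta[1:]
```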
NASA Astrophysics Data System (ADS)
Jakubowski, J.; Stypulkowski, J. B.; Bernardeau, F. G.
2017-12-01
The first phase of the Abu Hamour drainage and storm tunnel was completed in early 2017. The 9.5 km long, 3.7 m diameter tunnel was excavated with two Earth Pressure Balance (EPB) Tunnel Boring Machines from Herrenknecht. TBM operation processes were monitored and recorded by a Data Acquisition and Evaluation System. The authors coupled the collected TBM drive data with available information on rock mass properties; the data were cleansed, completed with secondary variables, and aggregated by weeks and shifts. Correlations and descriptive statistics charts were examined. Multivariate Linear Regression and CART regression tree models linking TBM penetration rate (PR), penetration per revolution (PPR) and field penetration index (FPI) with TBM operational and geotechnical characteristics were built for the conditions of the weak/soft rock of Doha. Both regression methods are interpretable, and the data were screened with different computational approaches allowing enriched insight. The primary goal of the analysis was to investigate empirical relations between multiple explanatory and response variables, to search for the best subsets of explanatory variables, and to evaluate the strength of linear and non-linear relations. For each of the penetration indices, a predictive model coupling both regression methods was built and validated. The resultant models appeared to be stronger than the constituent ones and indicated an opportunity for more accurate and robust TBM performance predictions.
Ranade, A K; Pandey, M; Datta, D
2013-01-01
A study was conducted to evaluate the absorbed rate coefficient of (238)U, (232)Th, (40)K and (137)Cs present in soil. A total of 31 soil samples and the corresponding terrestrial dose rates at 1 m from different locations were taken around the Anushaktinagar region, where the litho-logy is dominated by red soil. A linear regression model was developed for the estimation of these factors. The estimated coefficients (nGy h(-1) Bq(-1) kg(-1)) were 0.454, 0.586, 0.035 and 0.392, respectively. The factors calculated were in good agreement with the literature values.
Determination of organic compounds in water using ultraviolet LED
NASA Astrophysics Data System (ADS)
Kim, Chihoon; Ji, Taeksoo; Eom, Joo Beom
2018-04-01
This paper describes a method of detecting organic compounds in water using an ultraviolet LED (280 nm) spectroscopy system and a photodetector. The LED spectroscopy system showed a high correlation between the concentration of the prepared potassium hydrogen phthalate and that calculated by multiple linear regression, with an adjusted coefficient of determination ranging from 0.953 to 0.993. In addition, a comparison between the performance of the spectroscopy system and a total organic carbon analyzer indicated that the difference in measured concentration was small. Based on the close correlation between the spectroscopy and photodetector absorbance values, organic-compound measurement with a photodetector could be configured for monitoring.
Hager, S.W.; Harmon, D.D.; Alpine, A.E.
1984-01-01
Particulate nitrogen (PN) and chlorophyll a (Chla) were measured in the northern reach of San Francisco Bay throughout 1980. The PN values were calculated as the differences between unfiltered and filtered (0.4 μm) samples analyzed using the UV-catalyzed peroxide digestion method. The Chla values were measured spectrophotometrically, with corrections made for phaeopigments. The plot of all PN-Chla data was found to be non-linear, and the concentration of suspended particulate matter (SPM) was found to be the best selector for linear subsets of the data. The best-fit slopes of PN-Chla plots, as determined by linear regression (model II), were interpreted to be the N:Chla ratios of phytoplankton. The Y-intercepts of the regression lines were considered to represent easily-oxidizable detrital nitrogen (EDN). In clear water (< 10 mg l-1 SPM), the N:Chla ratio was 1.07 μg-at N per μg Chla. It decreased to 0.60 in the 10-18 mg l-1 range and averaged 0.31 in the remaining four ranges (18-35, 35-65, 65-155, and 155-470 mg l-1). The EDN values were less than 1 μg-at N l-1 in the clear water and increased monotonically to almost 12 μg-at N l-1 in the highest SPM range. The N:Chla ratios for the four highest SPM ranges agree well with data for phytoplankton in light-limited cultures. In these ranges, phytoplankton-N averaged only 20% of the PN, while EDN averaged 39% and refractory-N 41%. © 1984.
Wolf, Dominik; Fischer, Florian U; Scheurich, Armin; Fellgiebel, Andreas
2015-01-01
Cerebral amyloid-β accumulation and changes in white matter (WM) microstructure are imaging characteristics of clinical Alzheimer's disease and have also been reported in cognitively healthy older adults. However, the relationship between amyloid deposition and WM microstructure is not well understood. Here, we investigated the impact of quantitative cerebral amyloid load on WM microstructure in a group of cognitively healthy older adults. AV45-positron emission tomography and diffusion tensor imaging (DTI) scans of forty-four participants (age range: 60 to 89 years) from the Alzheimer's Disease Neuroimaging Initiative were analyzed. Fractional anisotropy (FA), mean diffusivity (MD), radial diffusivity (DR), and axial diffusivity (DA) were calculated to characterize WM microstructure. Regression analyses demonstrated non-linear (quadratic) relationships between amyloid deposition and FA, MD, as well as DR in widespread WM regions. At low amyloid burden, higher deposition was associated with increased FA as well as decreased MD and DR. At higher amyloid burden, higher deposition was associated with decreased FA as well as increased MD and DR. Additional regression analyses demonstrated an interaction effect between amyloid load and global WM FA, MD, DR, and DA on cognition, suggesting that cognition is only affected when amyloid is increasing and WM integrity is decreasing. Thus, increases in FA and decreases in MD and DR with increasing amyloid load at low levels of amyloid burden may indicate compensatory processes that preserve cognitive functioning. Potential mechanisms underlying the observed non-linear association between amyloid deposition and DTI metrics of WM microstructure are discussed.
Hernández Alava, Mónica; Wailoo, Allan; Wolfe, Fred; Michaud, Kaleb
2014-10-01
Analysts frequently estimate health state utility values from other outcomes. Utility values like EQ-5D have characteristics that make standard statistical methods inappropriate. We have developed a bespoke, mixture model approach to directly estimate EQ-5D. An indirect method, "response mapping," first estimates the level on each of the 5 dimensions of the EQ-5D and then calculates the expected tariff score. These methods have never previously been compared. We use a large observational database from patients with rheumatoid arthritis (N = 100,398). Direct estimation of UK EQ-5D scores as a function of the Health Assessment Questionnaire (HAQ), pain, and age was performed with a limited dependent variable mixture model. Indirect modeling was undertaken with a set of generalized ordered probit models with expected tariff scores calculated mathematically. Linear regression was reported for comparison purposes. Impact on cost-effectiveness was demonstrated with an existing model. The linear model fits poorly, particularly at the extremes of the distribution. The bespoke mixture model and the indirect approaches improve fit over the entire range of EQ-5D. Mean average error is 10% and 5% lower compared with the linear model, respectively. Root mean squared error is 3% and 2% lower. The mixture model demonstrates superior performance to the indirect method across almost the entire range of pain and HAQ. These lead to differences in cost-effectiveness of up to 20%. There are limited data from patients in the most severe HAQ health states. Modeling of EQ-5D from clinical measures is best performed directly using the bespoke mixture model. This substantially outperforms the indirect method in this example. Linear models are inappropriate, suffer from systematic bias, and generate values outside the feasible range. © The Author(s) 2013.
Spectral-Spatial Shared Linear Regression for Hyperspectral Image Classification.
Haoliang Yuan; Yuan Yan Tang
2017-04-01
Classification of the pixels in a hyperspectral image (HSI) is an important task and has been widely applied in many practical applications. Its major challenge is the high-dimensional, small-sample-size problem. To deal with this problem, many subspace learning (SL) methods have been developed to reduce the dimension of the pixels while preserving important discriminant information. Motivated by the ridge linear regression (RLR) framework for SL, we propose a spectral-spatial shared linear regression method (SSSLR) for extracting the feature representation. Compared with RLR, our proposed SSSLR has two advantages. First, we utilize a convex set to explore the spatial structure for computing the linear projection matrix. Second, we utilize a shared structure learning model, formed by the original data space and a hidden feature space, to learn a more discriminant linear projection matrix for classification. To optimize our proposed method, an efficient iterative algorithm is proposed. Experimental results on two popular HSI data sets, i.e., Indian Pines and Salinas, demonstrate that our proposed methods outperform many SL methods.
Narayanan, Neethu; Gupta, Suman; Gajbhiye, V T; Manjaiah, K M
2017-04-01
A carboxymethyl cellulose-nano organoclay (nano montmorillonite modified with 35-45 wt% dimethyl dialkyl (C14-C18) amine (DMDA)) composite was prepared by the solution intercalation method. The prepared composite was characterized by infrared spectroscopy (FTIR), X-ray diffraction (XRD) and scanning electron microscopy (SEM). The composite was evaluated for its sorption efficiency for the pesticides atrazine, imidacloprid and thiamethoxam. The sorption data were fitted to the Langmuir and Freundlich isotherms using linear and non-linear methods. The linear regression method suggested that the sorption data fitted best to the Type II Langmuir and Freundlich isotherms. In order to avoid the bias resulting from linearization, seven different error parameters were also analyzed by the non-linear regression method. The non-linear error analysis suggested that the sorption data fitted well to the Langmuir model rather than the Freundlich model. The highest maximum sorption capacity, Q0 (μg/g), was given by imidacloprid (2000), followed by thiamethoxam (1667) and atrazine (1429). The study suggests that the coefficient of determination of linear regression alone cannot be used for comparing the fit of the Langmuir and Freundlich models, and non-linear error analysis needs to be done to avoid inaccurate results. Copyright © 2017 Elsevier Ltd. All rights reserved.
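The contrast the authors draw between linearized and direct non-linear fitting can be illustrated with a Langmuir isotherm; the data below are synthetic and the sketch is not the study's analysis.

```python
# Sketch: Type II linearized Langmuir fit (1/q vs 1/C) versus a direct
# non-linear least-squares fit of q = Q0*KL*C / (1 + KL*C).
import numpy as np
from scipy.optimize import curve_fit

def langmuir(C, Q0, KL):
    return Q0 * KL * C / (1 + KL * C)

rng = np.random.default_rng(2)
C = np.linspace(0.5, 50, 12)                       # equilibrium conc. (mg/L)
q = langmuir(C, 2000, 0.05) * (1 + 0.05 * rng.standard_normal(12))

# Type II linearization: 1/q = (1/(Q0*KL)) * (1/C) + 1/Q0
slope, intercept = np.polyfit(1 / C, 1 / q, 1)
Q0_lin, KL_lin = 1 / intercept, intercept / slope

# Direct non-linear fit avoids the error distortion introduced by
# transforming the data before fitting.
(Q0_nl, KL_nl), _ = curve_fit(langmuir, C, q, p0=[1000, 0.01])
print(Q0_lin, KL_lin, Q0_nl, KL_nl)
```

Comparing the two parameter sets (and residual-based error measures, as the paper does) shows why a high R2 on the linearized plot alone can be misleading.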
London Measure of Unplanned Pregnancy: guidance for its use as an outcome measure
Hall, Jennifer A; Barrett, Geraldine; Copas, Andrew; Stephenson, Judith
2017-01-01
Background The London Measure of Unplanned Pregnancy (LMUP) is a psychometrically validated measure of the degree of intention of a current or recent pregnancy. The LMUP is increasingly being used worldwide and can be used to evaluate family planning or preconception care programs. However, beyond recommending the use of the full LMUP scale, there is no published guidance on how to use the LMUP as an outcome measure. Ordinal logistic regression has been recommended informally, but studies published to date have all used binary logistic regression and dichotomized the scale at different cut points. There is thus a need for evidence-based guidance to provide a standardized methodology for multivariate analysis and to enable comparison of results. This paper makes recommendations for the regression method for analysis of the LMUP as an outcome measure. Materials and methods Data collected from 4,244 pregnant women in Malawi were used to compare five regression methods: linear, logistic with two cut points, and ordinal logistic with either the full or grouped LMUP score. The recommendations were then tested on the original UK LMUP data. Results There were small but unimportant differences in the findings across the regression models. Logistic regression resulted in the largest loss of information, and assumptions were violated for the linear and ordinal logistic regressions. Consequently, robust standard errors were used for linear regression, and a partial proportional odds ordinal logistic regression model was attempted; the latter could be fitted only for the grouped LMUP score. Conclusion We recommend the linear regression model with robust standard errors to make full use of the LMUP score when analyzed as an outcome measure. Ordinal logistic regression could be considered, but a partial proportional odds model with the grouped LMUP score may be required. Logistic regression is the least-favored option, due to the loss of information. For logistic regression, the cut point for un/planned pregnancy should be between nine and ten. These recommendations will standardize the analysis of LMUP data and enhance comparability of results across studies. PMID:28435343
Finite Element Vibration Modeling and Experimental Validation for an Aircraft Engine Casing
NASA Astrophysics Data System (ADS)
Rabbitt, Christopher
This thesis presents a procedure for the development and validation of a theoretical vibration model, applies this procedure to a pair of aircraft engine casings, and compares select parameters from experimental testing of those casings to those from a theoretical model using the Modal Assurance Criterion (MAC) and linear regression coefficients. A novel method of determining the optimal MAC between axisymmetric results is developed and employed. It is concluded that the dynamic finite element models developed as part of this research are fully capable of modelling the modal parameters within the frequency range of interest. Confidence intervals calculated in this research for correlation coefficients provide important information regarding the reliability of predictions, and it is recommended that these intervals be calculated for all comparable coefficients. The procedure outlined for aligning mode shapes around an axis of symmetry proved useful, and the results are promising for the development of further optimization techniques.
Stevanović, Nikola R; Perušković, Danica S; Gašić, Uroš M; Antunović, Vesna R; Lolić, Aleksandar Đ; Baošić, Rada M
2017-03-01
The objectives of this study were to gain insight into structure-retention relationships and to propose a model for estimating retention. A chromatographic investigation of a series of 36 Schiff bases and their copper(II) and nickel(II) complexes was performed under both normal- and reverse-phase conditions. The chemical structures of the compounds were characterized by molecular descriptors calculated from the structure and related to the chromatographic retention parameters by multiple linear regression analysis. Effects of chelation on the retention parameters of the investigated compounds, under normal- and reverse-phase chromatographic conditions, were analyzed by principal component analysis. Quantitative structure-retention relationship and quantitative structure-activity relationship models were developed on the basis of theoretical molecular descriptors, calculated exclusively from molecular structure, and parameters of retention and lipophilicity. Copyright © 2016 John Wiley & Sons, Ltd.
Genc, D Deniz; Yesilyurt, Canan; Tuncel, Gurdal
2010-07-01
Spatial and temporal variations in concentrations of CO, NO, NO(2), SO(2), and PM(10), measured between 1999 and 2000 at traffic-impacted and residential stations in Ankara, were investigated. Air quality in residential areas was found to be influenced by traffic activities in the city. Pollutant ratios proved to be reliable tracers to differentiate between different sources. An air pollution index (API) for the whole city was calculated to evaluate the level of air quality in Ankara, and a multiple linear regression model was developed for forecasting the API. The correlation coefficients were found to be 0.79 and 0.63 for different time periods. The assimilative capacity of the Ankara atmosphere was calculated in terms of a ventilation coefficient (VC). The relation between API and VC was investigated, and it was found that air quality in Ankara was determined by meteorology rather than emissions.
Primordial helium abundance determination using sulphur as metallicity tracer
NASA Astrophysics Data System (ADS)
Fernández, Vital; Terlevich, Elena; Díaz, Angeles I.; Terlevich, Roberto; Rosales-Ortega, F. F.
2018-05-01
The primordial helium abundance YP is calculated using sulphur as the metallicity tracer in the classical methodology (with YP as an extrapolation of Y to zero metals). The calculated value, YP, S = 0.244 ± 0.006, is in good agreement with the estimate from the Planck experiment, as well as with determinations in the literature using oxygen as the metallicity tracer. The chemical analysis includes the subtraction of the nebular continuum and of the stellar continuum computed from simple stellar population synthesis grids. The S2+ content is measured from the near-infrared [SIII]λλ9069Å, 9532Å lines, while an ICF(S3+) is proposed based on the Ar3+/Ar2+ fraction. Finally, we apply a multivariable linear regression using simultaneously the oxygen, nitrogen and sulphur abundances for the same sample to determine the primordial helium abundance, resulting in YP, O, N, S = 0.245 ± 0.007.
Measurement of left ventricular mass in vivo using gated nuclear magnetic resonance imaging.
Florentine, M S; Grosskreutz, C L; Chang, W; Hartnett, J A; Dunn, V D; Ehrhardt, J C; Fleagle, S R; Collins, S M; Marcus, M L; Skorton, D J
1986-07-01
Alterations of left ventricular mass occur in a variety of congenital and acquired heart diseases. In vivo determination of left ventricular mass, using several different techniques, has been previously reported. Problems inherent in some previous methods include the use of ionizing radiation, complicated geometric assumptions and invasive techniques. We tested the ability of gated nuclear magnetic resonance imaging to determine in vivo left ventricular mass in animals. By studying both dogs (n = 9) and cats (n = 2) of various sizes, a broad range of left ventricular mass (7 to 133 g) was examined. With a 0.5 tesla superconducting nuclear magnetic resonance imaging system, the left ventricle was imaged in the transaxial plane and multiple adjacent 10 mm thick slices were obtained. Endocardial and epicardial edges were manually traced in each computer-displayed image. The wall area of each image was determined by computer, and the areas were summed and multiplied by the slice thickness and the specific gravity of muscle, providing the calculated left ventricular mass. Calculated left ventricular mass was compared with actual postmortem left ventricular mass using linear regression analysis. An excellent relation between calculated and actual mass was found (r = 0.95; SEE = 13.1 g; regression equation: magnetic resonance mass = 0.95 × actual mass + 14.8 g). Intraobserver and interobserver reproducibility were also excellent (r = 0.99). Thus, gated nuclear magnetic resonance imaging can accurately determine in vivo left ventricular mass in anesthetized animals.
1994-09-01
Institute of Technology, Wright-Patterson AFB OH, January 1994. 4. Neter, John and others. Applied Linear Regression Models. Boston: Irwin, 1989. 5...Technology, Wright-Patterson AFB OH 5 April 1994. 29. Neter, John and others. Applied Linear Regression Models. Boston: Irwin, 1989. 30. Office of
An Evaluation of the Automated Cost Estimating Integrated Tools (ACEIT) System
1989-09-01
residual and it is described as the residual divided by its standard deviation (13:App A,17). Neter, Wasserman, and Kutner, in Applied Linear Regression Models...others. Applied Linear Regression Models. Homewood IL: Irwin, 1983. 19. Raduchel, William J. "A Professional’s Perspective on User-Friendliness," Byte
Conjoint Analysis: A Study of the Effects of Using Person Variables.
ERIC Educational Resources Information Center
Fraas, John W.; Newman, Isadore
Three statistical techniques--conjoint analysis, a multiple linear regression model, and a multiple linear regression model with a surrogate person variable--were used to estimate the relative importance of five university attributes for students in the process of selecting a college. The five attributes include: availability and variety of…
Fitting program for linear regressions according to Mahon (1996)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Trappitsch, Reto G.
2018-01-09
This program takes the user's input data and fits a linear regression to it using the prescription presented by Mahon (1996). Compared to the commonly used York fit, this method has the correct prescription for measurement error propagation. This software should facilitate the proper fitting of measurements with a simple interface.
How Robust Is Linear Regression with Dummy Variables?
ERIC Educational Resources Information Center
Blankmeyer, Eric
2006-01-01
Researchers in education and the social sciences make extensive use of linear regression models in which the dependent variable is continuous-valued while the explanatory variables are a combination of continuous-valued regressors and dummy variables. The dummies partition the sample into groups, some of which may contain only a few observations.…
Revisiting the Scale-Invariant, Two-Dimensional Linear Regression Method
ERIC Educational Resources Information Center
Patzer, A. Beate C.; Bauer, Hans; Chang, Christian; Bolte, Jan; Sülzle, Detlev
2018-01-01
The scale-invariant way to analyze two-dimensional experimental and theoretical data with statistical errors in both the independent and dependent variables is revisited by using what we call the triangular linear regression method. This is compared to the standard least-squares fit approach by applying it to typical simple sets of example data…
ERIC Educational Resources Information Center
Thompson, Russel L.
Homoscedasticity is an important assumption of linear regression. This paper explains what it is and why it is important to the researcher. Graphical and mathematical methods for testing the homoscedasticity assumption are demonstrated. Sources of heteroscedasticity and types of heteroscedasticity are discussed, and methods for correction are…
On the null distribution of Bayes factors in linear regression
USDA-ARS?s Scientific Manuscript database
We show that under the null, 2 log(Bayes factor) is asymptotically distributed as a weighted sum of chi-squared random variables with a shifted mean. This claim holds for Bayesian multi-linear regression with a family of conjugate priors, namely, the normal-inverse-gamma prior, the g-prior, and...
Common pitfalls in statistical analysis: Linear regression analysis
Aggarwal, Rakesh; Ranganathan, Priya
2017-01-01
In a previous article in this series, we explained correlation analysis which describes the strength of relationship between two continuous variables. In this article, we deal with linear regression analysis which predicts the value of one continuous variable from another. We also discuss the assumptions and pitfalls associated with this analysis. PMID:28447022
Ridge: a computer program for calculating ridge regression estimates
Donald E. Hilt; Donald W. Seegrist
1977-01-01
Least-squares coefficients for multiple-regression models may be unstable when the independent variables are highly correlated. Ridge regression is a biased estimation procedure that produces stable estimates of the coefficients. Ridge regression is discussed, and a computer program for calculating the ridge coefficients is presented.
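A minimal sketch of the ridge estimator such a program computes, beta(k) = (X'X + kI)^(-1) X'y over a grid of ridge constants k; standardizing the predictors first is the usual convention, and the toy data are illustrative.

```python
# Ridge trace: coefficients as a function of the ridge constant k.
import numpy as np

def ridge_trace(X, y, ks):
    Xs = (X - X.mean(0)) / X.std(0)          # standardize predictors
    yc = y - y.mean()
    p = Xs.shape[1]
    return {k: np.linalg.solve(Xs.T @ Xs + k * np.eye(p), Xs.T @ yc)
            for k in ks}

# Toy usage with two nearly collinear predictors
rng = np.random.default_rng(3)
x1 = rng.standard_normal(50)
X = np.column_stack([x1, x1 + 0.01 * rng.standard_normal(50)])
y = X @ np.array([1.0, 1.0]) + 0.1 * rng.standard_normal(50)
for k, b in ridge_trace(X, y, [0.0, 0.1, 1.0]).items():
    print(k, b)   # coefficients stabilize as k grows
```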
Comparison of l₁-Norm SVR and Sparse Coding Algorithms for Linear Regression.
Zhang, Qingtian; Hu, Xiaolin; Zhang, Bo
2015-08-01
Support vector regression (SVR) is a popular function estimation technique based on Vapnik's concept of the support vector machine. Among many variants, the l1-norm SVR is known to be good at selecting useful features when the features are redundant. Sparse coding (SC) is a technique widely used in many areas, and a number of efficient algorithms are available. Both l1-norm SVR and SC can be used for linear regression. In this brief, the close connection between the l1-norm SVR and SC is revealed and some typical algorithms are compared for linear regression. The results show that the SC algorithms outperform the Newton linear programming algorithm, an efficient l1-norm SVR algorithm, in efficiency. The algorithms are then used to design radial basis function (RBF) neural networks. Experiments on some benchmark data sets demonstrate the high efficiency of the SC algorithms. In particular, one of the SC algorithms, orthogonal matching pursuit, is two orders of magnitude faster than a well-known RBF network design algorithm, the orthogonal least squares algorithm.
NASA Technical Reports Server (NTRS)
Sumnall, Matthew; Peduzzi, Alicia; Fox, Thomas R.; Wynne, Randolph H.; Thomas, Valerie A.; Cook, Bruce
2016-01-01
Leaf area is an important forest structural variable which serves as the primary means of mass and energy exchange within vegetated ecosystems. The objective of the current study was to determine if leaf area index (LAI) could be estimated accurately and consistently in five intensively managed pine plantation forests using two multiple-return airborne LiDAR datasets. Field measurements of LAI were made using the LiCOR LAI2000 and LAI2200 instruments within 116 plots of varying size, established across a variety of stand conditions (i.e., stand age, nutrient regime and stem density) in North Carolina and Virginia in 2008 and 2013. A number of common LiDAR return height and intensity distribution metrics (e.g., average return height) were calculated for each plot extent, in addition to ten indices, with two additional variants, drawn from the surrounding literature that have been used to estimate LAI and fractional cover, calculated from return heights and intensities. Each of the indices was assessed for correlation with the others and was used as an independent variable in linear regression analysis with field LAI as the dependent variable. All LiDAR-derived metrics were also entered into a forward stepwise linear regression. The results for the indices varied from an R2 of 0.33 (S.E. 0.87) to 0.89 (S.E. 0.36). Those indices calculated using ratios of all returns produced the strongest correlations, such as the Above and Below Ratio Index (ABRI) and Laser Penetration Index 1 (LPI1). The regression model produced from a combination of three metrics did not improve correlations greatly (R2 0.90; S.E. 0.35). The results indicate that LAI can be predicted accurately over a range of intensively managed pine plantation forest environments when using different LiDAR sensor designs. Those indices which incorporated counts of specific return numbers (e.g., first returns) or return intensity correlated poorly with field measurements. There were disparities between the numbers of different types of returns and intensity values when comparing the results from the two LiDAR sensors, indicating that predictive models developed using such metrics are not transferable between datasets with different acquisition parameters. Each of the indices was significantly correlated with the others, with one exception (the LAI proxy), in particular those indices calculated from all returns, which indicates similarities in information content for those indices. It can then be argued that LiDAR indices have reached a similar stage in development to those calculated from optical-spectral sensors, but offer a number of advantages, such as the reduction or removal of saturation issues in areas of high biomass.
Afantitis, Antreas; Melagraki, Georgia; Sarimveis, Haralambos; Koutentis, Panayiotis A; Markopoulos, John; Igglessi-Markopoulou, Olga
2006-08-01
A quantitative structure-activity relationship was obtained by applying multiple linear regression analysis to a series of 80 1-[2-hydroxyethoxy-methyl]-6-(phenylthio)thymine (HEPT) derivatives with significant anti-HIV activity. For the selection of the best among 37 different descriptors, the Elimination Selection Stepwise Regression Method (ES-SWR) was utilized. The resulting QSAR model (R(2)(CV) = 0.8160; S(PRESS) = 0.5680) proved to be very accurate in both the training and prediction stages.
Wavelet regression model in forecasting crude oil price
NASA Astrophysics Data System (ADS)
Hamid, Mohd Helmie; Shabri, Ani
2017-05-01
This study presents the performance of the wavelet multiple linear regression (WMLR) technique in daily crude oil price forecasting. The WMLR model was developed by integrating the discrete wavelet transform (DWT) and the multiple linear regression (MLR) model. The original time series was decomposed into sub-time series of different scales by wavelet theory. Correlation analysis was conducted to assist in the selection of optimal decomposed components as inputs for the WMLR model. The daily WTI crude oil price series was used to test the prediction capability of the proposed model. The forecasting performance of the WMLR model was also compared with regular multiple linear regression (MLR), autoregressive integrated moving average (ARIMA) and generalized autoregressive conditional heteroscedasticity (GARCH) models using root mean square error (RMSE) and mean absolute error (MAE). Based on the experimental results, the WMLR model performs better than the other forecasting techniques tested in this study.
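A rough sketch of the WMLR idea under stated assumptions: it uses the PyWavelets library, an illustrative wavelet ('db4'), decomposition level and a single lag, and is not the authors' exact model.

```python
# Decompose the series, reconstruct each wavelet sub-series, and use their
# lagged values as MLR inputs for a one-step-ahead forecast.
import numpy as np
import pywt

def wavelet_subseries(x, wavelet="db4", level=3):
    coeffs = pywt.wavedec(x, wavelet, level=level)
    subs = []
    for i in range(len(coeffs)):
        # keep one coefficient band, zero the rest, and reconstruct
        keep = [c if j == i else np.zeros_like(c) for j, c in enumerate(coeffs)]
        subs.append(pywt.waverec(keep, wavelet)[: len(x)])
    return np.column_stack(subs)             # one column per sub-series

price = np.cumsum(np.random.default_rng(4).standard_normal(512)) + 50
S = wavelet_subseries(price)
X = np.hstack([np.ones((len(price) - 1, 1)), S[:-1]])   # lag-1 sub-series
beta, *_ = np.linalg.lstsq(X, price[1:], rcond=None)
forecast = np.hstack([1.0, S[-1]]) @ beta               # one-step forecast
print(forecast)
```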
Partitioning sources of variation in vertebrate species richness
Boone, R.B.; Krohn, W.B.
2000-01-01
Aim: To explore biogeographic patterns of terrestrial vertebrates in Maine, USA using techniques that would describe local and spatial correlations with the environment. Location: Maine, USA. Methods: We delineated the ranges within Maine (86,156 km2) of 275 species using literature and expert review. Ranges were combined into species richness maps, and compared to geomorphology, climate, and woody plant distributions. Methods were adapted that compared richness of all vertebrate classes to each environmental correlate, rather than assessing a single explanatory theory. We partitioned variation in species richness into components using tree and multiple linear regression. Methods were used that allowed for useful comparisons between tree and linear regression results. For both methods we partitioned variation into broad-scale (spatially autocorrelated) and fine-scale (spatially uncorrelated) explained and unexplained components. By partitioning variance, and using both tree and linear regression in analyses, we explored the degree of variation in species richness for each vertebrate group that could be explained by the relative contribution of each environmental variable. Results: In tree regression, climate variation explained richness better (92% of mean deviance explained for all species) than woody plant variation (87%) and geomorphology (86%). Reptiles were highly correlated with environmental variation (93%), followed by mammals, amphibians, and birds (each with 84-82% deviance explained). In multiple linear regression, climate was most closely associated with total vertebrate richness (78%), followed by woody plants (67%) and geomorphology (56%). Again, reptiles were closely correlated with the environment (95%), followed by mammals (73%), amphibians (63%) and birds (57%). Main conclusions: Comparing variation explained using tree and multiple linear regression quantified the importance of nonlinear relationships and local interactions between species richness and environmental variation, identifying the importance of linear relationships between reptiles and the environment, and nonlinear relationships between birds and woody plants, for example. Conservation planners should capture climatic variation in broad-scale designs; temperatures may shift during climate change, but the underlying correlations between the environment and species richness will presumably remain.
An operational definition of a statistically meaningful trend.
Bryhn, Andreas C; Dimberg, Peter H
2011-04-28
Linear trend analysis of time series is standard procedure in many scientific disciplines. If the number of data points is large, a trend may be statistically significant even if the data are scattered far from the trend line. This study introduces and tests a quality criterion for time trends, referred to as statistical meaningfulness, which is a stricter quality criterion than high statistical significance. The time series is divided into intervals and interval mean values are calculated. Thereafter, r(2) and p values are calculated from regressions of interval mean values on time. If r(2) ≥ 0.65 at p ≤ 0.05 in any of these regressions, then the trend is regarded as statistically meaningful. Out of ten investigated time series from different scientific disciplines, five displayed statistically meaningful trends. A Microsoft Excel application (add-in) was developed which can perform statistical meaningfulness tests and which may increase the operationality of the test. The presented method for distinguishing statistically meaningful trends should be reasonably uncomplicated for researchers with basic statistics skills and may thus be useful for determining which trends are worth analysing further, for instance with respect to causal factors. The method can also be used for determining which segments of a time trend may be particularly worthwhile to focus on.
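The criterion translates directly into code. The sketch below follows the abstract's rule (r(2) ≥ 0.65 at p ≤ 0.05 on interval means); the set of interval counts to try is an assumption left to the user.

```python
# Statistical-meaningfulness test: regress interval means on time and
# check r^2 >= 0.65 with p <= 0.05. t and x are numpy arrays.
import numpy as np
from scipy import stats

def statistically_meaningful(t, x, n_intervals=(3, 4, 5, 6)):
    for n in n_intervals:
        edges = np.linspace(t.min(), t.max(), n + 1)
        mids, means = [], []
        for lo, hi in zip(edges[:-1], edges[1:]):
            mask = (t >= lo) & (t <= hi)
            if mask.any():
                mids.append(t[mask].mean())     # interval mean time
                means.append(x[mask].mean())    # interval mean value
        res = stats.linregress(mids, means)
        if res.rvalue ** 2 >= 0.65 and res.pvalue <= 0.05:
            return True
    return False
```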
NASA Astrophysics Data System (ADS)
Diamond, D. H.; Heyns, P. S.; Oberholster, A. J.
2016-12-01
The measurement of instantaneous angular speed is being increasingly investigated for its use in a wide range of condition monitoring and prognostic applications. Central to many measurement techniques are incremental shaft encoders recording the arrival times of shaft angular increments. The conventional approach to processing these signals assumes that the angular increments are equidistant. This assumption is generally incorrect when working with toothed wheels and especially zebra tape encoders and has been shown to introduce errors in the estimated shaft speed. There are some proposed methods in the literature that aim to compensate for this geometric irregularity. Some of the methods require the shaft speed to be perfectly constant for calibration, something rarely achieved in practice. Other methods assume the shaft speed to be nearly constant with minor deviations. Therefore existing methods cannot calibrate the entire shaft encoder geometry for arbitrary shaft speeds. The present article presents a method to calculate the shaft encoder geometry for arbitrary shaft speed profiles. The method uses Bayesian linear regression to calculate the encoder increment distances. The method is derived and then tested against simulated and laboratory experiments. The results indicate that the proposed method is capable of accurately determining the shaft encoder geometry for any shaft speed profile.
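The Bayesian linear regression machinery underlying such a method can be sketched generically: a conjugate Gaussian prior on the coefficients yields a closed-form posterior. The noise and prior variances below are assumptions, and this is not the paper's encoder-specific model.

```python
# Conjugate Bayesian linear regression: posterior mean and covariance of
# beta under y ~ N(X beta, sigma2 I) and prior beta ~ N(0, tau2 I).
import numpy as np

def bayes_linreg(X, y, sigma2=1e-4, tau2=1.0):
    p = X.shape[1]
    A = X.T @ X / sigma2 + np.eye(p) / tau2     # posterior precision
    cov = np.linalg.inv(A)
    mean = cov @ (X.T @ y) / sigma2             # posterior mean estimate
    return mean, cov
```

In an encoder-geometry setting, the unknown increment distances would play the role of beta, with the design matrix built from the recorded arrival times; the posterior covariance then quantifies how well each increment is determined.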
Rothenberg, Stephen J.; Rothenberg, Jesse C.
2005-01-01
Statistical evaluation of the dose–response function in lead epidemiology is rarely attempted. Economic evaluation of the health benefits of lead reduction usually assumes a linear dose–response function, regardless of the outcome measure used. We reanalyzed a previously published study, an international pooled data set combining data from seven prospective lead studies examining the effect of contemporaneous blood lead on the IQ (intelligence quotient) of 7-year-old children (n = 1,333). We constructed alternative linear multiple regression models with linear blood lead terms (linear–linear dose response) and natural-log–transformed blood lead terms (log-linear dose response). We tested the two lead specifications for nonlinearity in the models, compared them for significantly better fit to the data, and examined the effects of possible residual confounding on the functional form of the dose–response relationship. We found that a log-linear lead–IQ relationship was a significantly better fit than a linear–linear relationship for IQ (p = 0.009), with little evidence of residual confounding of included model variables. We substituted the log-linear lead–IQ effect into a previously published health benefits model and found that the economic savings due to the U.S. population lead decrease between 1976 and 1999 (from 17.1 μg/dL to 2.0 μg/dL) was 2.2 times ($319 billion) that calculated using a linear–linear dose–response function ($149 billion). The Centers for Disease Control and Prevention action limit of 10 μg/dL for children fails to protect against most damage and economic cost attributable to lead exposure. PMID:16140626
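The model comparison at the heart of the reanalysis, IQ on linear versus log-transformed blood lead, can be sketched on synthetic data; the covariates of the pooled analysis are omitted for brevity, and the numbers below are not the study's.

```python
# Compare linear-linear and log-linear lead-IQ specifications on toy data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
pb = rng.uniform(1, 30, 1333)                   # blood lead, ug/dL
iq = 105 - 2.7 * np.log(pb) + rng.normal(0, 12, 1333)   # synthetic IQ

lin = stats.linregress(pb, iq)                  # linear-linear fit
log = stats.linregress(np.log(pb), iq)          # log-linear fit
print("linear r2:", lin.rvalue ** 2, "log-linear r2:", log.rvalue ** 2)
```

The policy implication follows from the shape: a log-linear curve is steepest at low exposures, so most of the benefit of reduction accrues below any fixed action limit.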
Schörgendorfer, Angela; Branscum, Adam J; Hanson, Timothy E
2013-06-01
Logistic regression is a popular tool for risk analysis in medical and population health science. With continuous response data, it is common to create a dichotomous outcome for logistic regression analysis by specifying a threshold for positivity. Fitting a linear regression to the nondichotomized response variable assuming a logistic sampling model for the data has been empirically shown to yield more efficient estimates of odds ratios than ordinary logistic regression of the dichotomized endpoint. We illustrate that risk inference is not robust to departures from the parametric logistic distribution. Moreover, the model assumption of proportional odds is generally not satisfied when the condition of a logistic distribution for the data is violated, leading to biased inference from a parametric logistic analysis. We develop novel Bayesian semiparametric methodology for testing goodness of fit of parametric logistic regression with continuous measurement data. The testing procedures hold for any cutoff threshold and our approach simultaneously provides the ability to perform semiparametric risk estimation. Bayes factors are calculated using the Savage-Dickey ratio for testing the null hypothesis of logistic regression versus a semiparametric generalization. We propose a fully Bayesian and a computationally efficient empirical Bayesian approach to testing, and we present methods for semiparametric estimation of risks, relative risks, and odds ratios when parametric logistic regression fails. Theoretical results establish the consistency of the empirical Bayes test. Results from simulated data show that the proposed approach provides accurate inference irrespective of whether parametric assumptions hold or not. Evaluation of risk factors for obesity shows that different inferences are derived from an analysis of a real data set when deviations from a logistic distribution are permissible in a flexible semiparametric framework. © 2013, The International Biometric Society.
Ruan, Xiaofang; Zhang, Ruisheng; Yao, Xiaojun; Liu, Mancang; Fan, Botao
2007-03-01
Alkylphenols are a group of persistent pollutants in the environment and can adversely disturb the human endocrine system. It is therefore important to effectively separate and measure alkylphenols. To guide the chromatographic analysis of these compounds in practice, the development of a quantitative relationship between molecular structure and the retention time of alkylphenols becomes necessary. In this study, topological, constitutional, geometrical, electrostatic and quantum-chemical descriptors of 44 alkylphenols were calculated using the CODESSA software, and these descriptors were pre-selected using the heuristic method. As a result, a three-descriptor linear model (LM) was developed to describe the relationship between molecular structure and the retention time of alkylphenols. Meanwhile, a non-linear regression model was also developed based on a support vector machine (SVM) using the same three descriptors. The correlation coefficient (R(2)) for the LM and SVM was 0.98 and 0.92, and the corresponding root-mean-square error was 0.99 and 2.77, respectively. By comparing the stability and prediction ability of the two models, it was found that the linear model was the better method for describing the quantitative relationship between the retention time of alkylphenols and molecular structure. The results suggest that the linear model could be applied to the chromatographic analysis of alkylphenols with known molecular structural parameters.
Post-processing through linear regression
NASA Astrophysics Data System (ADS)
van Schaeybroeck, B.; Vannitsem, S.
2011-03-01
Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast, and multicollinearity. The regression schemes under consideration include the ordinary least-squares (OLS) method, a new time-dependent Tikhonov regularization (TDTR) method, the total least-squares method, a new geometric-mean regression (GM), a recently introduced error-in-variables (EVMOS) method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified. These techniques are applied in the context of the Lorenz 63 system, whose model version is affected by both initial-condition and model errors. For short forecast lead times, the number and choice of predictors play an important role. Contrary to the other techniques, GM degrades as the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best member OLS with noise). At long lead times the regression schemes (EVMOS, TDTR) which yield the correct variability and the largest correlation between ensemble error and spread should be preferred.
Linear regression metamodeling as a tool to summarize and present simulation model results.
Jalal, Hawre; Dowd, Bryan; Sainfort, François; Kuntz, Karen M
2013-10-01
Modelers lack a tool to systematically and clearly present complex model results, including those from sensitivity analyses. The objective was to propose linear regression metamodeling as a tool to increase transparency of decision analytic models and better communicate their results. We used a simplified cancer cure model to demonstrate our approach. The model computed the lifetime cost and benefit of 3 treatment options for cancer patients. We simulated 10,000 cohorts in a probabilistic sensitivity analysis (PSA) and regressed the model outcomes on the standardized input parameter values in a set of regression analyses. We used the regression coefficients to describe measures of sensitivity analyses, including threshold and parameter sensitivity analyses. We also compared the results of the PSA to deterministic full-factorial and one-factor-at-a-time designs. The regression intercept represented the estimated base-case outcome, and the other coefficients described the relative parameter uncertainty in the model. We defined simple relationships that compute the average and incremental net benefit of each intervention. Metamodeling produced outputs similar to traditional deterministic 1-way or 2-way sensitivity analyses but was more reliable since it used all parameter values. Linear regression metamodeling is a simple, yet powerful, tool that can assist modelers in communicating model characteristics and sensitivity analyses.
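A toy version of the metamodeling step the abstract describes: regress a simulated PSA outcome on standardized inputs, so the intercept approximates the base-case outcome and each coefficient summarizes that parameter's contribution to uncertainty. The two-input model below is illustrative only.

```python
# Linear regression metamodel over probabilistic sensitivity analysis draws.
import numpy as np

rng = np.random.default_rng(6)
n = 10_000
p_cure = rng.beta(20, 80, n)                    # sampled input 1
cost_tx = rng.normal(50_000, 5_000, n)          # sampled input 2
net_benefit = 300_000 * p_cure - cost_tx        # simulated model outcome

# Standardize inputs so coefficients are directly comparable
Z = np.column_stack([(p_cure - p_cure.mean()) / p_cure.std(),
                     (cost_tx - cost_tx.mean()) / cost_tx.std()])
X = np.hstack([np.ones((n, 1)), Z])
coef, *_ = np.linalg.lstsq(X, net_benefit, rcond=None)
print("base case:", coef[0], "standardized effects:", coef[1:])
```

Because every PSA draw contributes to the fit, the coefficients summarize sensitivity over the whole parameter space rather than around a single base case.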
ERIC Educational Resources Information Center
Rule, David L.
Several regression methods were examined within the framework of weighted structural regression (WSR), comparing their regression weight stability and score estimation accuracy in the presence of outlier contamination. The methods compared are: (1) ordinary least squares; (2) WSR ridge regression; (3) minimum risk regression; (4) minimum risk 2;…
[Comparison of red edge parameters of winter wheat canopy under late frost stress].
Wu, Yong-feng; Hu, Xin; Lü, Guo-hua; Ren, De-chao; Jiang, Wei-guo; Song, Ji-qing
2014-08-01
In the present study, late frost experiments were implemented under a range of subfreezing temperatures (-1 to -9 degrees C) using a field movable climate chamber (FMCC) and a cold climate chamber, respectively. Based on the spectra of the winter wheat canopy measured at noon on the first day after the frost experiments, the red edge parameters REP, Dr, SDr, Dr(min), Dr/Dr(min) and Dr/SDr were extracted using the maximum first derivative spectrum method (FD), linear four-point interpolation method (FPI), polynomial fitting method (POLY), inverted Gaussian fitting method (IG) and linear extrapolation technique (LE), respectively. The capacity of the red edge parameters to detect late frost stress was evaluated in terms of earliness, sensitivity and stability through correlation analysis, linear regression modeling and fluctuation analysis. The results indicate that, except for REP calculated by the FPI and IG methods in Experiment 1, REP from all methods was correlated with frost temperatures (P < 0.05); the significance levels (P) of the POLY and LE methods all reached 0.01. Except for the POLY method in Experiment 2, Dr/SDr from all methods was significantly correlated with frost temperatures (P < 0.01). REP showed a trend to shift toward shorter wavelengths with decreasing temperatures; the lower the temperature, the more obvious the trend. Of all the methods, REP calculated by the LE method had the highest correlation with frost temperatures, indicating that the LE method is the best for REP extraction. In Experiments 1 and 2, only Dr(min) and Dr/Dr(min) calculated by the FD method simultaneously met the requirements for earliness (their correlations with frost temperatures reached a significance level of P < 0.01), sensitivity (the absolute value of the slope of the fluctuation coefficient was greater than 2.0) and stability (their correlations with frost temperatures always kept a consistent direction). Dr/SDr calculated by the FD and IG methods always had low sensitivity in Experiment 2; in Experiment 1, the sensitivity of Dr/SDr from FD was moderate and that from IG was high. REP calculated by the LE method had the lowest sensitivity in the two experiments. Overall, Dr(min) and Dr/Dr(min) calculated by the FD method have the strongest detection capacity for frost temperature, which will be helpful for research on early diagnosis of late frost injury to winter wheat.
Unit Cohesion and the Surface Navy: Does Cohesion Affect Performance
1989-12-01
v. 68, 1968. Neter, J., Wasserman, W., and Kutner, M. H., Applied Linear Regression Models, 2d ed., Boston, MA: Irwin, 1989. Rand Corporation R-2607...Neter, J., Wasserman, W., and Kutner, M. H., Applied Linear Regression Models, 2d ed., Boston, MA: Irwin, 1989. SAS User’s Guide: Basics, Version 5 ed
1990-03-01
and M.H. Kutner. Applied Linear Regression Models. Homewood IL: Richard D. Irwin Inc., 1983. Pritsker, A. Alan B. Introduction to Simulation and SLAM...Control Variates in Simulation," European Journal of Operational Research, 42: (1989). Neter, J., W. Wasserman, and M.H. Kutner. Applied Linear Regression Models
ERIC Educational Resources Information Center
Yan, Jun; Aseltine, Robert H., Jr.; Harel, Ofer
2013-01-01
Comparing regression coefficients between models when one model is nested within another is of great practical interest when two explanations of a given phenomenon are specified as linear models. The statistical problem is whether the coefficients associated with a given set of covariates change significantly when other covariates are added into…
Calibrated Peer Review for Interpreting Linear Regression Parameters: Results from a Graduate Course
ERIC Educational Resources Information Center
Enders, Felicity B.; Jenkins, Sarah; Hoverman, Verna
2010-01-01
Biostatistics is traditionally a difficult subject for students to learn. While the mathematical aspects are challenging, it can also be demanding for students to learn the exact language to use to correctly interpret statistical results. In particular, correctly interpreting the parameters from linear regression is both a vital tool and a…
ERIC Educational Resources Information Center
Richter, Tobias
2006-01-01
Most reading time studies using naturalistic texts yield data sets characterized by a multilevel structure: Sentences (sentence level) are nested within persons (person level). In contrast to analysis of variance and multiple regression techniques, hierarchical linear models take the multilevel structure of reading time data into account. They…
Some Applied Research Concerns Using Multiple Linear Regression Analysis.
ERIC Educational Resources Information Center
Newman, Isadore; Fraas, John W.
The intention of this paper is to provide an overall reference on how a researcher can apply multiple linear regression in order to utilize the advantages that it has to offer. The advantages and some concerns expressed about the technique are examined. A number of practical ways by which researchers can deal with such concerns as…
ERIC Educational Resources Information Center
Nelson, Dean
2009-01-01
Following the Guidelines for Assessment and Instruction in Statistics Education (GAISE) recommendation to use real data, an example is presented in which simple linear regression is used to evaluate the effect of the Montreal Protocol on atmospheric concentration of chlorofluorocarbons. This simple set of data, obtained from a public archive, can…
Quantum State Tomography via Linear Regression Estimation
Qi, Bo; Hou, Zhibo; Li, Li; Dong, Daoyi; Xiang, Guoyong; Guo, Guangcan
2013-01-01
A simple yet efficient state reconstruction algorithm of linear regression estimation (LRE) is presented for quantum state tomography. In this method, quantum state reconstruction is converted into a parameter estimation problem of a linear regression model and the least-squares method is employed to estimate the unknown parameters. An asymptotic mean squared error (MSE) upper bound for all possible states to be estimated is given analytically, which depends explicitly upon the involved measurement bases. This analytical MSE upper bound can guide one to choose optimal measurement sets. The computational complexity of LRE is O(d⁴), where d is the dimension of the quantum state. Numerical examples show that LRE is much faster than maximum-likelihood estimation for quantum state tomography. PMID:24336519
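A minimal single-qubit illustration of the LRE idea, with simulated measurement counts rather than experimental data: the Bloch vector enters the outcome probabilities linearly, so ordinary least squares recovers the state.

```python
# Sketch of linear regression estimation (LRE) for single-qubit tomography:
# rho = (I + r.sigma)/2, and measured frequencies are linear in the Bloch
# vector r. Counts are simulated; this is not the paper's full d-dimensional method.
import numpy as np

rng = np.random.default_rng(0)
r_true = np.array([0.4, -0.3, 0.7])               # true Bloch vector, |r| <= 1

# Probability of outcome +1 when measuring Pauli axis u: p = (1 + u.r)/2
axes = np.eye(3)                                   # measure sigma_x, sigma_y, sigma_z
shots = 10_000
p_hat = np.array([rng.binomial(shots, (1 + u @ r_true) / 2) / shots for u in axes])

# Linear model: 2*p_hat - 1 = A r, with A the matrix of measurement axes
y = 2 * p_hat - 1
r_est, *_ = np.linalg.lstsq(axes, y, rcond=None)
print("estimated Bloch vector:", np.round(r_est, 3))
```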
Applications of statistics to medical science, III. Correlation and regression.
Watanabe, Hiroshi
2012-01-01
In this third part of a series surveying medical statistics, the concepts of correlation and regression are reviewed. In particular, methods of linear regression and logistic regression are discussed. Arguments related to survival analysis will be made in a subsequent paper.
Weak acid-concentration Atot and dissociation constant Ka of plasma proteins in racehorses.
Stampfli, H R; Misiaszek, S; Lumsden, J H; Carlson, G P; Heigenhauser, G J
1999-07-01
The plasma proteins are a significant contributor to the total weak acid concentration as a net anionic charge. Due to potential species differences, species-specific values must be confirmed for the weak acid anionic concentration of proteins (Atot) and the effective dissociation constant for plasma weak acids (Ka). We studied the net anion load Atot of equine plasma protein in 10 clinically healthy mature Standardbred horses. A multi-step titration procedure, using a tonometer covering a titration range of PCO2 from 25 to 145 mmHg at 37 degrees C, was applied to the plasma of these 10 horses. Blood gases (pH, PCO2) and the electrolytes required to calculate the strong ion difference ([SID] = [(Na⁺ + K⁺ + Ca²⁺ + Mg²⁺) - (Cl⁻ + Lac⁻ + PO₄²⁻)]) were measured simultaneously over a physiological pH range of 6.90 to 7.55. A nonlinear regression iteration to determine Atot and Ka was performed using polygonal regression curve fitting applied to the electrical neutrality equation of the physico-chemical system. The average anion load Atot for plasma protein of the 10 Standardbred horses was 14.89 ± 0.8 mEq/l plasma and Ka was 2.11 ± 0.50 × 10⁻⁷ Eq/l (pKa = 6.67). The derived conversion factor (iterated Atot concentration/average plasma protein concentration) for calculation of Atot in plasma is 0.21 mEq/g protein (protein unit: g/l). This value compares closely with the 0.24 mEq/g protein determined by titration by Van Slyke et al. (1928) and the 0.22 mEq/g protein recently published by Constable (1997) for horse plasma. The Ka value compares closely with the value experimentally determined by Constable in 1997 (2.22 × 10⁻⁷ Eq/l). Linear regression of a set of experimental data from 5 Thoroughbred horses in a treadmill exercise test showed excellent correlation, with the regression lines not different from identity for the calculated and measured variables pH, HCO3 and SID. Knowledge of Atot and Ka for the horse is useful, especially in exercise studies and in clinical conditions, to quantify the mechanisms of acid-base disturbances.
Banzato, Tommaso; Fiore, Enrico; Morgante, Massimo; Manuali, Elisabetta; Zotti, Alessandro
2016-10-01
Hepatic lipidosis is the most widespread hepatic disease in the lactating cow. A new methodology to estimate the degree of fatty infiltration of the liver in lactating cows by means of texture analysis of B-mode ultrasound images is proposed. B-mode ultrasonography of the liver was performed in 48 Holstein Friesian cows using standardized ultrasound parameters. Liver biopsies to determine the triacylglycerol content of the liver (TAGqa) were obtained from each animal. A large number of texture parameters were calculated on the ultrasound images by means of free software. Based on the TAGqa content of the liver, 29 samples were classified as mild (TAGqa < 50 mg/g), 6 as moderate (50 mg/g …
Regression of non-linear coupling of noise in LIGO detectors
NASA Astrophysics Data System (ADS)
Da Silva Costa, C. F.; Billman, C.; Effler, A.; Klimenko, S.; Cheng, H.-P.
2018-03-01
In 2015, after their upgrade, the advanced Laser Interferometer Gravitational-Wave Observatory (LIGO) detectors started acquiring data, and the effort to improve their sensitivity has not stopped since. The goal of achieving design sensitivity is challenging. Environmental and instrumental noise couples to the detector output through different coupling mechanisms, both linear and non-linear. The noise regression method we use is based on the Wiener–Kolmogorov filter, which uses witness channels to make noise predictions. We present here how this method helped to determine complex non-linear noise couplings in the output mode cleaner and in the mirror suspension system of the LIGO detector.
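The witness-channel regression idea can be sketched in a few lines: fit a finite-impulse-response model of a witness channel to the detector output by least squares and subtract the prediction. The signals below are synthetic, and only a linear coupling is modeled; the paper's non-linear couplings and full Wiener–Kolmogorov machinery are not reproduced.

```python
# Minimal sketch of witness-channel noise regression: an FIR coupling filter
# is estimated by least squares and the predicted noise is subtracted.
import numpy as np

rng = np.random.default_rng(1)
n, taps = 5000, 16
witness = rng.standard_normal(n)                        # auxiliary sensor channel
h_true = np.exp(-np.arange(taps) / 4.0)                 # unknown coupling filter
coupled = np.convolve(witness, h_true, mode="full")[:n] # noise that couples in
target = rng.standard_normal(n) * 0.3 + coupled         # detector output

# Build lagged design matrix X[t, k] = witness[t - k]
X = np.column_stack([np.roll(witness, k) for k in range(taps)])
X[:taps] = 0.0                                          # zero invalid early lags
h_est, *_ = np.linalg.lstsq(X, target, rcond=None)

cleaned = target - X @ h_est                            # regress out the coupling
print(f"residual std: {target.std():.3f} -> {cleaned.std():.3f}")
```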
Activity-based differentiation of pathologists' workload in surgical pathology.
Meijer, G A; Oudejans, J J; Koevoets, J J M; Meijer, C J L M
2009-06-01
Adequate budget control in pathology practice requires accurate allocation of resources. Any changes in types and numbers of specimens handled or protocols used will directly affect the pathologists' workload and consequently the allocation of resources. The aim of the present study was to develop a model for measuring the pathologists' workload that can take into account the changes mentioned above. The diagnostic process was analyzed and broken up into separate activities. The time needed to perform these activities was measured. Based on linear regression analysis, for each activity, the time needed was calculated as a function of the number of slides or blocks involved. The total pathologists' time required for a range of specimens was calculated based on standard protocols and validated by comparing to actually measured workload. Cutting up, microscopic procedures and dictating turned out to be highly correlated to number of blocks and/or slides per specimen. Calculated workload per type of specimen was significantly correlated to the actually measured workload. Modeling pathologists' workload based on formulas that calculate workload per type of specimen as a function of the number of blocks and slides provides a basis for a comprehensive, yet flexible, activity-based costing system for pathology.
Two innovative pore pressure calculation methods for shallow deep-water formations
NASA Astrophysics Data System (ADS)
Deng, Song; Fan, Honghai; Liu, Yuhan; He, Yanfeng; Zhang, Shifeng; Yang, Jing; Fu, Lipei
2017-11-01
There are many geological hazards in shallow formations associated with oil and gas exploration and development in deep-water settings. Abnormal pore pressure can lead to water flow and to gas and gas-hydrate accumulations, which may affect drilling safety. Therefore, it is of great importance to accurately predict pore pressure in shallow deep-water formations. Experience over previous decades has shown, however, that appropriate pressure calculation methods for these shallow formations are lacking. Pore pressure change is reflected closely in log data, particularly for mudstone formations. In this paper, pore pressure calculations for shallow formations are highlighted, and two concrete methods using log data are presented. The first method is modified from an E. Philips test, in which a linear-exponential overburden pressure model is used. The second method is a new pore pressure method based on P-wave velocity that accounts for the effect of shallow gas and shallow water flow. The two methods are then validated using case studies from two wells in the Yingqiong basin. Calculated results are compared with those obtained by the Eaton method, which demonstrates that the multi-regression method is more suitable for quick prediction of geological hazards in shallow layers.
Use of bedside ultrasound to assess degree of dehydration in children with gastroenteritis.
Chen, Lei; Hsiao, Allen; Langhan, Melissa; Riera, Antonio; Santucci, Karen A
2010-10-01
Prospectively identifying children with significant dehydration from gastroenteritis is difficult in acute care settings. Previous work by our group has shown that bedside ultrasound (US) measurement of the inferior vena cava (IVC) and the aorta (Ao) diameter ratio is correlated with intravascular volume. This study was designed to validate the use of this method in the prospective identification of children with dehydration by investigating whether the IVC/Ao ratio correlated with dehydration in children with acute gastroenteritis. Another objective was to investigate the interrater reliability of the IVC/Ao measurements. A prospective observational study was carried out in a pediatric emergency department (PED) between November 2007 and June 2009. Children with acute gastroenteritis were enrolled as subjects. A pair of investigators obtained transverse images of the IVC and Ao using bedside US. The ratio of IVC and Ao diameters (IVC/Ao) was calculated. Subjects were asked to return after resolution of symptoms. The difference between the convalescent weight and ill weight was used to calculate the degree of dehydration. Greater than or equal to 5% difference was judged to be significant. Linear regression was performed with dehydration as the dependent variable and the IVC/Ao as the independent variable. Pearson's correlation coefficient was calculated to assess the degree of agreement between observers. A total of 112 subjects were enrolled. Seventy-one subjects (63%) completed follow-up. Twenty-eight subjects (39%) had significant dehydration. The linear regression model resulted in an R² value of 0.21 (p < 0.001) and a slope (B) of 0.11 (95% confidence interval [CI] = 0.08 to 0.14). An IVC/Ao cutoff of 0.8 produced a sensitivity of 86% and a specificity of 56% for the diagnosis of significant dehydration. Forty-eight paired measurements of IVC/Ao ratios were made. The Pearson correlation coefficient was 0.76. In this pilot study the ratio of IVC to Ao diameters, as measured by bedside US, was a marginally accurate measurement of acute weight loss in children with dehydration from gastroenteritis. The technique demonstrated good interrater reliability. © 2010 by the Society for Academic Emergency Medicine.
Goodarzi, Mohammad; Jensen, Richard; Vander Heyden, Yvan
2012-12-01
A Quantitative Structure-Retention Relationship (QSRR) is proposed to estimate the chromatographic retention of 83 diverse drugs on a Unisphere poly butadiene (PBD) column, using isocratic elutions at pH 11.7. Previous work has generated QSRR models for them using Classification And Regression Trees (CART). In this work, Ant Colony Optimization is used as a feature selection method to find the best molecular descriptors from a large pool. In addition, several other selection methods have been applied, such as Genetic Algorithms, Stepwise Regression and the Relief method, not only to evaluate Ant Colony Optimization as a feature selection method but also to investigate its ability to find the important descriptors in QSRR. Multiple Linear Regression (MLR) and Support Vector Machines (SVMs) were applied as linear and nonlinear regression methods, respectively, giving excellent correlation between the experimental logarithms of the retention factors of the drugs (log k(w), i.e., extrapolated to a mobile phase consisting of pure water) and the predicted values. The overall best model was the SVM model built using descriptors selected by ACO. Copyright © 2012 Elsevier B.V. All rights reserved.
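A rough stand-in for the descriptor-selection-plus-MLR workflow: greedy forward selection by cross-validated R² takes the place of Ant Colony Optimization (which is considerably more involved), and the descriptor matrix and retention values are random placeholders.

```python
# Sketch of feature selection followed by multiple linear regression (MLR).
# Greedy forward selection stands in for ACO; data are simulated.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
X = rng.standard_normal((83, 40))                  # 83 drugs x 40 descriptors
y = X[:, 3] - 0.5 * X[:, 17] + 0.1 * rng.standard_normal(83)  # synthetic log k(w)

selected, remaining = [], list(range(X.shape[1]))
for _ in range(5):                                 # pick up to 5 descriptors
    scores = {j: cross_val_score(LinearRegression(), X[:, selected + [j]], y,
                                 cv=5, scoring="r2").mean() for j in remaining}
    best = max(scores, key=scores.get)             # descriptor with best CV R^2
    selected.append(best)
    remaining.remove(best)

model = LinearRegression().fit(X[:, selected], y)
print("selected descriptors:", selected, "R^2:", model.score(X[:, selected], y))
```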
Ren, Y Y; Zhou, L C; Yang, L; Liu, P Y; Zhao, B W; Liu, H X
2016-09-01
The paper highlights the use of the logistic regression (LR) method in the construction of acceptable statistically significant, robust and predictive models for the classification of chemicals according to their aquatic toxic modes of action. All essentials for a reliable model were considered carefully. The model predictors were selected by stepwise forward linear discriminant analysis (LDA) from a combined pool of experimental data and chemical structure-based descriptors calculated by the CODESSA and DRAGON software packages. Model predictive ability was validated both internally and externally. The applicability domain was checked by the leverage approach to verify prediction reliability. The obtained models are simple and easy to interpret. In general, LR performs much better than LDA and seems more attractive for the prediction of the more toxic compounds, i.e. compounds that exhibit excess toxicity versus non-polar narcotic compounds and more reactive compounds versus less reactive compounds. In addition, model fit and regression diagnostics were assessed through the influence plot, which reflects the hat values, studentized residuals, and Cook's distance of each sample. Overdispersion was also checked for the LR model. The relationships between the descriptors and the aquatic toxic behaviour of the compounds are also discussed.
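The influence diagnostics named above (hat values, studentized Pearson residuals, Cook's distance) can be computed directly from a fitted logistic regression. A minimal sketch on simulated data, assuming the standard IRLS weighting:

```python
# Sketch of logistic regression influence diagnostics; data are simulated.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
X = sm.add_constant(rng.standard_normal((200, 3)))
p_true = 1 / (1 + np.exp(-(X @ np.array([0.2, 1.0, -0.7, 0.4]))))
y = rng.binomial(1, p_true)

res = sm.Logit(y, X).fit(disp=0)
p = res.predict(X)
w = p * (1 - p)                                    # IRLS weights
# Hat (leverage) values: diagonal of W^1/2 X (X'WX)^-1 X' W^1/2
WX = X * w[:, None]
h = np.einsum("ij,ij->i", X @ np.linalg.inv(X.T @ WX), WX)
r_pearson = (y - p) / np.sqrt(w)                   # Pearson residuals
r_stud = r_pearson / np.sqrt(1 - h)                # studentized residuals
cooks = r_stud**2 * h / (X.shape[1] * (1 - h))     # Cook's distance analogue

print(f"max leverage: {h.max():.3f}, max Cook's D: {cooks.max():.3f}")
```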
Zhu, Hongxiao; Morris, Jeffrey S; Wei, Fengrong; Cox, Dennis D
2017-07-01
Many scientific studies measure different types of high-dimensional signals or images from the same subject, producing multivariate functional data. These functional measurements carry different types of information about the scientific process, and a joint analysis that integrates information across them may provide new insights into the underlying mechanism for the phenomenon under study. Motivated by fluorescence spectroscopy data in a cervical pre-cancer study, a multivariate functional response regression model is proposed, which treats multivariate functional observations as responses and a common set of covariates as predictors. This novel modeling framework simultaneously accounts for correlations between functional variables and potential multi-level structures in data that are induced by experimental design. The model is fitted by performing a two-stage linear transformation: a basis expansion applied to each functional variable, followed by principal component analysis of the concatenated basis coefficients. This transformation effectively reduces the intra- and inter-function correlations and facilitates fast and convenient calculation. A fully Bayesian approach is adopted to sample the model parameters in the transformed space, and posterior inference is performed after inverse-transforming the regression coefficients back to the original data domain. The proposed approach produces functional tests that flag local regions on the functional effects, while controlling the overall experiment-wise error rate or false discovery rate. It also enables functional discriminant analysis through posterior predictive calculation. Analysis of the fluorescence spectroscopy data reveals local regions with differential expressions across the pre-cancer and normal samples. These regions may serve as biomarkers for prognosis and disease assessment.
Khanal, Laxman; Shah, Sandip; Koirala, Sarun
2017-03-01
Length of long bones is taken as an important contributor for estimating one of the four elements of forensic anthropology, i.e., the stature of the individual. Since physical characteristics differ among population groups, population-specific studies are needed for estimating the total length of the femur from measurements of its segments. Since the femur is not always recovered intact in forensic cases, the aim of this study was to derive regression equations from measurements of proximal and distal fragments in a Nepalese population. A cross-sectional study was done among 60 dry femora (30 from each side), without sex determination, in an anthropometry laboratory. Along with maximum femoral length, four proximal and four distal segmental measurements were taken following the standard method with the help of an osteometric board, measuring tape and digital Vernier caliper. Bones with gross defects were excluded from the study. Measured values were recorded separately for the right and left sides. The Statistical Package for the Social Sciences (SPSS version 11.5) was used for statistical analysis. The values of the segmental measurements differed between the right and left sides, but the difference was not statistically significant except for the depth of the medial condyle (p=0.02). All the measurements were positively correlated and found to have a linear relationship with femoral length. With the help of the regression equations, femoral length can be calculated from the segmental measurements; the femoral length can then be used to calculate the stature of the individual. The data collected may contribute to the analysis of forensic bone remains in the study population.
Evaluating Differential Effects Using Regression Interactions and Regression Mixture Models
ERIC Educational Resources Information Center
Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung
2015-01-01
Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their…
SEMIPARAMETRIC QUANTILE REGRESSION WITH HIGH-DIMENSIONAL COVARIATES
Zhu, Liping; Huang, Mian; Li, Runze
2012-01-01
This paper is concerned with quantile regression for a semiparametric regression model, in which both the conditional mean and conditional variance function of the response given the covariates admit a single-index structure. This semiparametric regression model enables us to reduce the dimension of the covariates and simultaneously retains the flexibility of nonparametric regression. Under mild conditions, we show that the simple linear quantile regression offers a consistent estimate of the index parameter vector. This is a surprising and interesting result because the single-index model is possibly misspecified under the linear quantile regression. With a root-n consistent estimate of the index vector, one may employ a local polynomial regression technique to estimate the conditional quantile function. This procedure is computationally efficient, which is very appealing in high-dimensional data analysis. We show that the resulting estimator of the quantile function performs asymptotically as efficiently as if the true value of the index vector were known. The methodologies are demonstrated through comprehensive simulation studies and an application to a real dataset. PMID:24501536
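As an aside, the linear quantile regression step at the core of the procedure is readily available in standard software. A minimal sketch with statsmodels' QuantReg on simulated heteroscedastic data:

```python
# Sketch of linear quantile regression at several quantile levels tau;
# the data are simulated with variance growing in x.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
x = rng.uniform(0, 10, 500)
y = 1.0 + 0.5 * x + rng.standard_normal(500) * (0.2 + 0.1 * x)
df = pd.DataFrame({"x": x, "y": y})

for q in (0.25, 0.5, 0.75):
    fit = smf.quantreg("y ~ x", df).fit(q=q)
    print(f"tau={q}: intercept={fit.params['Intercept']:.3f}, "
          f"slope={fit.params['x']:.3f}")
```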
Lee, I-Te; Chen, Chen-Huan; Wang, Jun-Sing; Fu, Chia-Po; Lee, Wen-Jane; Liang, Kae-Woei; Lin, Shih-Yi; Sheu, Wayne Huey-Herng
2018-01-01
Arterial stiffening blunts postprandial vasodilatation. We hypothesized that brain-derived neurotrophic factor (BDNF) may modulate postprandial central pulse pressure, a surrogate marker for arterial stiffening. A total of 82 non-diabetic subjects received a 75-g oral glucose tolerance test (OGTT) after overnight fasting. Serum BDNF concentrations were determined at 0, 30, and 120 min to calculate the area under the curve (AUC). Brachial and central blood pressures were measured using a noninvasive central blood pressure monitor before blood withdrawals at 0 and 120 min. With the median AUC of BDNF of 45 (ng/ml)·h as the cutoff value, the central pulse pressure after glucose intake was significantly higher in the subjects with a low BDNF than in those with a high BDNF (63 ± 16 vs. 53 ± 11 mmHg, P = 0.003), while the brachial pulse pressure was not significantly different between the 2 groups (P = 0.099). In a multivariate linear regression model, a lower AUC of BDNF was an independent predictor of a higher central pulse pressure after oral glucose intake (linear regression coefficient -0.202, 95% confidence interval -0.340 to -0.065, P = 0.004). After oral glucose challenge, a lower serum BDNF response is significantly associated with a higher central pulse pressure. Copyright © 2017 Elsevier B.V. All rights reserved.
Measurement of effective air diffusion coefficients for trichloroethene in undisturbed soil cores.
Bartelt-Hunt, Shannon L; Smith, James A
2002-06-01
In this study, we measure effective diffusion coefficients for trichloroethene in undisturbed soil samples taken from Picatinny Arsenal, New Jersey. The measured effective diffusion coefficients ranged from 0.0053 to 0.0609 cm²/s over a range of air-filled porosity of 0.23-0.49. The experimental data were compared to several previously published relations that predict diffusion coefficients as a function of air-filled porosity and porosity. A multiple linear regression analysis was developed to determine if a modification of the exponents in Millington's [Science 130 (1959) 100] relation would better fit the experimental data. The literature relations appeared to generally underpredict the effective diffusion coefficient for the soil cores studied in this work. Inclusion of a particle-size distribution parameter, d10, did not significantly improve the fit of the linear regression equation. The effective diffusion coefficient and porosity data were used to recalculate estimates of diffusive flux through the subsurface made in a previous study performed at the field site. It was determined that the method of calculation used in the previous study resulted in an underprediction of diffusive flux from the subsurface. We conclude that although Millington's [Science 130 (1959) 100] relation works well to predict effective diffusion coefficients in homogeneous soils with relatively uniform particle-size distributions, it may be inaccurate for many natural soils with heterogeneous structure and/or non-uniform particle-size distributions.
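The exponent-modification idea lends itself to a log-linear fit: taking logs of a Millington-type relation D_eff/D0 = a_phi^m / phi^n turns m and n into ordinary regression coefficients. A minimal sketch on synthetic porosity data (not the Picatinny measurements):

```python
# Sketch of fitting the exponents of a Millington-type diffusion relation
# by multiple linear regression on logs; porosity data are synthetic.
import numpy as np

rng = np.random.default_rng(5)
phi = rng.uniform(0.35, 0.55, 40)                   # total porosity
a_phi = phi * rng.uniform(0.5, 0.95, 40)            # air-filled porosity
# Synthetic "measured" ratios generated near Millington's exponents (10/3, 2)
ratio = a_phi ** (10 / 3) / phi**2 * np.exp(0.1 * rng.standard_normal(40))

# log(ratio) = m*log(a_phi) - n*log(phi): ordinary least squares on logs
A = np.column_stack([np.log(a_phi), -np.log(phi)])
(m, n), *_ = np.linalg.lstsq(A, np.log(ratio), rcond=None)
print(f"fitted exponents: m = {m:.2f} (10/3 = 3.33), n = {n:.2f} (2)")
```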
Predictability of depression severity based on posterior alpha oscillations.
Jiang, H; Popov, T; Jylänki, P; Bi, K; Yao, Z; Lu, Q; Jensen, O; van Gerven, M A J
2016-04-01
We aimed to integrate neural data and an advanced machine learning technique to predict individual major depressive disorder (MDD) patient severity. MEG data were acquired from 22 MDD patients and 22 healthy controls (HC) resting awake with eyes closed. Individual power spectra were calculated by a Fourier transform. Sources were reconstructed via a beamforming technique. Bayesian linear regression was applied to predict depression severity based on the spatial distribution of oscillatory power. In MDD patients, decreased theta (4-8 Hz) and alpha (8-14 Hz) power was observed in fronto-central and posterior areas respectively, whereas increased beta (14-30 Hz) power was observed in fronto-central regions. In particular, posterior alpha power was negatively related to depression severity. The Bayesian linear regression model showed significant depression severity prediction performance based on the spatial distribution of both alpha (r=0.68, p=0.0005) and beta (r=0.56, p=0.007) power. Our findings point to a specific alteration of oscillatory brain activity in MDD patients during rest as characterized from MEG data in terms of spectral and spatial distribution. The proposed model yielded a quantitative and objective estimation of depression severity, which in turn has potential for diagnosis and monitoring of the recovery process. Copyright © 2016 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
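A minimal sketch of the prediction step, with scikit-learn's BayesianRidge standing in for the study's Bayesian linear regression and simulated power features in place of MEG source data:

```python
# Sketch of severity prediction from spatial power features with Bayesian
# linear regression; features and severity scores are simulated placeholders.
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import BayesianRidge
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(6)
X = rng.standard_normal((22, 50))                  # 22 patients x 50 source powers
beta = np.zeros(50)
beta[:5] = [-2, -1.5, 1, 0.8, -0.5]                # a few informative regions
severity = X @ beta + rng.standard_normal(22)      # synthetic severity scores

pred = cross_val_predict(BayesianRidge(), X, severity, cv=5)
r, p = pearsonr(pred, severity)                    # out-of-fold prediction quality
print(f"cross-validated r = {r:.2f} (p = {p:.3g})")
```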
Vasilkova, Olga; Mokhort, Tatiana; Sanec, Igor; Sharshakova, Tamara; Hayashida, Naomi; Takamura, Noboru
2011-01-01
Although many reports have elucidated pathophysiological characteristics of abnormal bone metabolism in patients with type 2 diabetes mellitus (DT2), determinants of bone mineral density (BMD) in patients with DT2 are still controversial. We examined 168 Belarussian men 45-60 years of age. Plasma total cholesterol (TC), high-density lipoprotein cholesterol, low-density lipoprotein cholesterol, very low-density lipoprotein cholesterol, triglycerides, hemoglobin A(1c) (HbA(1c)), immunoreactive insulin, and C-reactive protein concentrations were assessed. BMD was measured using dual energy X-ray densitometry of the lumbar spine (L(1)-L(4)). Total testosterone (TT) and sex hormone-binding globulin were measured, and free testosterone (FT) was calculated. Using univariate linear regression analysis, BMD of the lumbar spine was significantly correlated with FT (r=0.32, p<0.01) and TT (r=0.36, p<0.01). Using multiple linear regression analysis adjusted for confounding factors, BMD was significantly correlated with TT (β=0.23, p<0.001) and TC (β=-0.029, p=0.005). Age (β=0.005, p=0.071), body mass index (β=0.005, p=0.053), HbA(1c) (β=-0.002, p=0.72) and duration of diabetes (β=0.001, p=0.62) were not significantly correlated with BMD. Our data indicate that androgens are independent determinants of BMD in male patients with DT2.
Durand, Casey P
2013-01-01
Statistical interactions are a common component of data analysis across a broad range of scientific disciplines. However, the statistical power to detect interactions is often undesirably low. One solution is to elevate the Type 1 error rate so that important interactions are not missed in a low power situation. To date, no study has quantified the effects of this practice on power in a linear regression model. A Monte Carlo simulation study was performed. A continuous dependent variable was specified, along with three types of interactions: continuous variable by continuous variable; continuous by dichotomous; and dichotomous by dichotomous. For each of the three scenarios, the interaction effect sizes, sample sizes, and Type 1 error rate were varied, resulting in a total of 240 unique simulations. In general, power to detect the interaction effect was either so low or so high at α = 0.05 that raising the Type 1 error rate only served to increase the probability of including a spurious interaction in the model. A small number of scenarios were identified in which an elevated Type 1 error rate may be justified. Routinely elevating Type 1 error rate when testing interaction effects is not an advisable practice. Researchers are best served by positing interaction effects a priori and accounting for them when conducting sample size calculations.
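The simulation design is easy to reproduce in outline. The sketch below runs one of the scenarios (continuous-by-continuous interaction) with illustrative effect and sample sizes, counting rejections of the interaction term at several alpha levels:

```python
# Sketch of a Monte Carlo power study for an interaction term in OLS;
# effect size, sample size, and replication count are illustrative only.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)

def interaction_power(n=100, b_int=0.2, alpha=0.05, reps=2000):
    hits = 0
    for _ in range(reps):
        x1, x2 = rng.standard_normal(n), rng.standard_normal(n)
        y = 0.3 * x1 + 0.3 * x2 + b_int * x1 * x2 + rng.standard_normal(n)
        X = sm.add_constant(np.column_stack([x1, x2, x1 * x2]))
        if sm.OLS(y, X).fit().pvalues[3] < alpha:   # test the interaction term
            hits += 1
    return hits / reps

for a in (0.05, 0.10, 0.20):
    print(f"alpha={a}: power ~ {interaction_power(alpha=a):.2f}")
```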
Cummings, Kristin J.; Cox-Ganser, Jean; Riggs, Margaret A.; Edwards, Nicole; Hobbs, Gerald R.; Kreiss, Kathleen
2008-01-01
Objectives. We investigated the relation between respiratory symptoms and exposure to water-damaged homes and the effect of respirator use in posthurricane New Orleans, Louisiana. Methods. We randomly selected 600 residential sites and then interviewed 1 adult per site. We created an exposure variable, calculated upper respiratory symptom (URS) and lower respiratory symptom (LRS) scores, and defined exacerbation categories by the effect on symptoms of being inside water-damaged homes. We used multiple linear regression to model symptom scores (for all participants) and polytomous logistic regression to model exacerbation of symptoms when inside (for those participating in clean-up). Results. Of 553 participants (response rate = 92%), 372 (68%) had participated in clean-up; 233 (63%) of these used a respirator. Respiratory symptom scores increased linearly with exposure (P < .05 for trend). Disposable-respirator use was associated with lower odds of exacerbation of moderate or severe symptoms inside water-damaged homes for URS (odds ratio [OR] = 0.51; 95% confidence interval [CI] = 0.24, 1.09) and LRS (OR = 0.33; 95% CI = 0.13, 0.83). Conclusions. Respiratory symptoms were positively associated with exposure to water-damaged homes, including exposure limited to being inside without participating in clean-up. Respirator use had a protective effect and should be considered when inside water-damaged homes regardless of activities undertaken. PMID:18381997
Zhao, Guangju; Mu, Xingmin; Jiao, Juying; Gao, Peng; Sun, Wenyi; Li, Erhui; Wei, Yanhong; Huang, Jiacong
2018-05-23
Understanding the relative contributions of climate change and human activities to variations in sediment load is of great importance for regional soil and river basin management. Numerous studies have investigated the spatial-temporal variation of sediment load within the Loess Plateau; however, contradictory findings exist among the methods used. This study systematically reviewed six quantitative methods: simple linear regression, double mass curve, sediment identity factor analysis, the dam-sedimentation based method, the Sediment Delivery Distributed (SEDD) model, and the Soil Water Assessment Tool (SWAT) model. The calculation procedures and merits of each method are systematically explained. A case study in the Huangfuchuan watershed on the northern Loess Plateau was undertaken. The results showed that sediment load was reduced by 70.5% during the changing period from 1990 to 2012 compared to the baseline period from 1955 to 1989. Human activities accounted for an average of 93.6 ± 4.1% of the total decline in sediment load, whereas climate change contributed 6.4 ± 4.1%. Five methods produced similar estimates, but linear regression yielded relatively different results. The results of this study provide a good reference for assessing the effects of climate change and human activities on sediment load variation using different methods. Copyright © 2018. Published by Elsevier B.V.
NASA Astrophysics Data System (ADS)
Xiao, Fan; Chen, Zhijun; Chen, Jianguo; Zhou, Yongzhang
2016-05-01
In this study, a novel batch sliding window (BSW) based singularity mapping approach is proposed. Compared to the traditional sliding window (SW) technique, which suffers from the empirical predetermination of a fixed maximum window size and from the outlier sensitivity of the least-squares (LS) linear regression method, the BSW based singularity mapping approach can automatically determine the optimal size of the largest window for each estimated position and utilizes robust linear regression (RLR), which is insensitive to outlier values. In the case study, tin geochemical data from Gejiu, Yunnan, were processed by the BSW based singularity mapping approach. The results show that the BSW approach can improve the accuracy of the calculated singularity exponent values owing to the determination of the optimal maximum window size. The use of the RLR method in the BSW approach smooths the distribution of singularity index values, leaving few of the noise-like high-fluctuation values that usually make a singularity map rough and discontinuous. Furthermore, the Student's t-statistic diagram indicates a strong spatial correlation between high geochemical anomaly and known tin polymetallic deposits. The target areas within the high tin geochemical anomaly probably have much higher potential for the exploration of new tin polymetallic deposits than other areas, particularly areas that show strong tin geochemical anomalies but in which no tin polymetallic deposits have yet been found.
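The LS-versus-RLR contrast that motivates the BSW approach can be seen on a toy example; scikit-learn's HuberRegressor stands in for the paper's robust linear regression:

```python
# Sketch contrasting ordinary least squares with robust (Huber) regression
# on data contaminated by a few gross outliers; data are simulated.
import numpy as np
from sklearn.linear_model import HuberRegressor, LinearRegression

rng = np.random.default_rng(8)
x = rng.uniform(0, 1, 60)
y = 2.0 * x + 0.05 * rng.standard_normal(60)
y[:6] += 3.0                                       # a few gross outliers

X = x.reshape(-1, 1)
ols = LinearRegression().fit(X, y)                 # pulled toward the outliers
rob = HuberRegressor().fit(X, y)                   # downweights the outliers
print(f"true slope 2.0 | OLS: {ols.coef_[0]:.2f} | Huber: {rob.coef_[0]:.2f}")
```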
Restoring method for missing data of spatial structural stress monitoring based on correlation
NASA Astrophysics Data System (ADS)
Zhang, Zeyu; Luo, Yaozhi
2017-07-01
Long-term monitoring of spatial structures is of great importance for a full understanding of their performance and safety. Missing segments in the monitoring data record affect data analysis and safety assessment of the structure. Based on long-term monitoring data from the steel structure of the Hangzhou Olympic Center Stadium, the correlation between the stress changes of the measuring points is studied, and an interpolation method for the missing stress data is proposed. To fit the correlation, stress data from correlated measuring points are selected over the 3 months of the season in which data are missing. Daytime and nighttime data are fitted separately for interpolation. For simple linear regression, when a single point's correlation coefficient is 0.9 or more, the average interpolation error is about 5%. For multiple linear regression, the interpolation accuracy does not increase significantly once the number of correlated points exceeds 6. The stress baseline value of each construction step should be calculated before interpolating missing data in the construction stage, and the average error is then within 10%. The interpolation error for continuous missing data is slightly larger than that for discrete missing data. The missing-data rate for this method should not exceed 30%. Finally, a measuring point's missing monitoring data is restored to verify the validity of the method.
Knowledge, Attitude, and Practices Regarding Vector-borne Diseases in Western Jamaica.
Alobuia, Wilson M; Missikpode, Celestin; Aung, Maung; Jolly, Pauline E
2015-01-01
Outbreaks of vector-borne diseases (VBDs) such as dengue and malaria can overwhelm health systems in resource-poor countries. Environmental management strategies that reduce or eliminate vector breeding sites combined with improved personal prevention strategies can help to significantly reduce transmission of these infections. The aim of this study was to assess the knowledge, attitudes, and practices (KAPs) of residents in western Jamaica regarding control of mosquito vectors and protection from mosquito bites. A cross-sectional study was conducted between May and August 2010 among patients or family members of patients waiting to be seen at hospitals in western Jamaica. Participants completed an interviewer-administered questionnaire on sociodemographic factors and KAPs regarding VBDs. KAP scores were calculated and categorized as high or low based on the number of correct or positive responses. Logistic regression analyses were conducted to identify predictors of KAP, and linear regression analysis was conducted to determine whether knowledge and attitude scores predicted practice scores. In all, 361 people (85 men and 276 women) participated in the study. Most participants scored low on knowledge (87%) and practice (78%) items. Conversely, 78% scored high on attitude items. By multivariate logistic regression, housewives were 82% less likely than laborers to have high attitude scores; homeowners were 65% less likely than renters to have high attitude scores. Participants from households with 1 to 2 children were 3.4 times more likely to have high attitude scores compared with those from households with no children. Participants from households with at least 5 people were 65% less likely than those from households with fewer than 5 people to have high practice scores. By multivariable linear regression, knowledge and attitude scores were significant predictors of practice score. The study revealed poor knowledge of VBDs and poor prevention practices among participants. It identified specific groups that can be targeted with vector control and personal protection interventions to decrease transmission of the infections. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Statistical analysis of radioimmunoassay. In comparison with bioassay (in Japanese)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nakano, R.
1973-01-01
Using data from RIA (radioimmunoassay), statistical procedures for dealing with two problems, the linearization of the dose-response curve and the calculation of relative potency, were described. There are three methods for linearization of the dose-response curve of RIA. In each method, the following parameters are shown on the horizontal and vertical axes: dose x, (B/T)^-1; c/(x + c), B/T (c: the dose which makes B/T 50%); log x, logit B/T. Among them, the last method seems to be the most practical. The statistical procedures for bioassay were employed for calculating the relative potency of unknown samples compared to the standard samples from the dose-response curves of standard and unknown samples using the regression coefficient. It is desirable that relative potency be calculated by plotting more than 5 points on the standard curve and more than 2 points for unknown samples. To examine the statistical limit of precision of measurement, the LH activity of gonadotropin in urine was measured, and the relative potency, precision coefficient, and the upper and lower limits of relative potency at the 95% confidence limit were calculated. In addition, bioassay (by the ovarian ascorbic acid reduction method and the anterior lobe of prostate weighing method) was performed on the same samples, and the precision was compared with that of RIA. In these examinations the upper and lower limits of the relative potency at the 95% confidence limit were near each other, while in bioassay a considerable difference was observed between the upper and lower limits. The necessity of standardization and systematization of the statistical procedures for increasing the precision of RIA was pointed out.
Cantekin, Kenan; Sekerci, Ahmet Ercan; Buyuk, Suleyman Kutalmis
2013-12-01
Computed tomography (CT) is capable of providing accurate and measurable 3-dimensional images of the third molar. The aims of this study were to analyze the development of the mandibular third molar and its relation to chronological age and to create new reference data for a group of Turkish participants aged 9 to 25 years on the basis of cone-beam CT images. All data were obtained from the patients' records, including medical, social, and dental anamnesis, and cone-beam CT images of 752 patients. Linear regression analysis was performed to obtain regression formulas for dental age calculation from chronological age and to determine the coefficient of determination (r²) for each sex. Statistical analysis showed a strong correlation between age and third-molar development for males (r² = 0.80) and females (r² = 0.78). Computed tomographic images are clinically useful for accurate and reliable estimation of the dental ages of children and youth.
QSAR Analysis of 2-Amino or 2-Methyl-1-Substituted Benzimidazoles Against Pseudomonas aeruginosa
Podunavac-Kuzmanović, Sanja O.; Cvetković, Dragoljub D.; Barna, Dijana J.
2009-01-01
A set of benzimidazole derivatives were tested for their inhibitory activities against the Gram-negative bacterium Pseudomonas aeruginosa, and minimum inhibitory concentrations were determined for all the compounds. Quantitative structure-activity relationship (QSAR) analysis was applied to fourteen of the abovementioned derivatives using a combination of various physicochemical, steric, electronic, and structural molecular descriptors. A multiple linear regression (MLR) procedure was used to model the relationships between molecular descriptors and the antibacterial activity of the benzimidazole derivatives. The stepwise regression method was used to derive the most significant models as calibration models for predicting the inhibitory activity of this class of molecules. The best QSAR models were further validated by a leave-one-out technique as well as by the calculation of statistical parameters for the established theoretical models. To confirm the predictive power of the models, an external set of molecules was used. High agreement between experimental and predicted inhibitory values, obtained in the validation procedure, indicated the good quality of the derived QSAR models. PMID:19468332
Climate change, weather and road deaths.
Robertson, Leon
2018-06-01
In 2015, a 7% increase in road deaths per population in the USA reversed the 35-year downward trend. Here I test the hypothesis that weather influenced the change in trend. I used linear regression to estimate the effect of temperature and precipitation on miles driven per capita in urbanized areas of the USA during 2010. I matched date and county of death with the temperature on that date and the number of people exposed to that temperature to calculate the risk per person exposed to specific temperatures. I employed logistic regression analysis of temperature, precipitation and other risk factors prevalent in 2014 to project expected deaths in 2015 among the 100 most populous counties in the USA. Comparison of actual and projected deaths provided an estimate of deaths expected without the temperature increase. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Global correlation of topographic heights and gravity anomalies
NASA Technical Reports Server (NTRS)
Roufosse, M. C.
1977-01-01
The short wavelength features were obtained by subtracting a calculated 24th-degree-and-order field from observed data expressed in 1 deg x 1 deg squares. The correlation between the two residual fields was examined by a program of linear regression. When run on a worldwide scale over oceans and continents separately, the program did not exhibit any correlation; this can be explained by the fact that the worldwide autocorrelation function for residual gravity anomalies falls off much faster as a function of distance than does that for residual topographic heights. The situation was different when the program was used in restricted areas, of the order of a 5 deg x 5 deg square. For 30% of the world, fair-to-good correlations were observed, mostly over continents. The slopes of the regression lines are proportional to apparent densities, which offer a large spectrum of values that are being interpreted in terms of features in the upper mantle consistent with available heat-flow, gravity, and seismic data.
Radar modulation classification using time-frequency representation and nonlinear regression
NASA Astrophysics Data System (ADS)
De Luigi, Christophe; Arques, Pierre-Yves; Lopez, Jean-Marc; Moreau, Eric
1999-09-01
In the naval electronic environment, pulses emitted by radars are collected by ESM receivers. For most of them, the intrapulse signal is modulated by a particular law. To aid the classical identification process, classification and estimation of this modulation law is applied to the intrapulse signal measurements. To estimate the time-varying frequency of a signal corrupted by additive noise with good accuracy, one method has been chosen: the Wigner distribution is calculated, and the instantaneous frequency is then estimated from the peak location of the distribution. The bias and variance of the estimator are evaluated by computer simulations. In an estimated sequence of frequencies, we assume the presence of both falsely and correctly estimated values, and the errors are assumed to be Gaussian. A robust nonlinear regression method, based on the Levenberg-Marquardt algorithm, is then applied to these estimated frequencies using a maximum likelihood estimator. The performance of the method is tested using various modulation laws and different signal-to-noise ratios.
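A minimal sketch of the final regression step, assuming a sinusoidal modulation law and simulated instantaneous-frequency estimates; scipy's curve_fit applies Levenberg-Marquardt to unconstrained problems.

```python
# Sketch of nonlinear regression of a frequency modulation law onto noisy
# instantaneous-frequency estimates; the FM law and noise level are illustrative.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(9)
t = np.linspace(0, 1, 200)                          # normalized time in pulse
f_true = 10.0 + 2.0 * np.sin(2 * np.pi * 3.0 * t + 0.5)   # sinusoidal FM law
f_meas = f_true + 0.3 * rng.standard_normal(200)    # noisy frequency estimates

def fm_law(t, f0, a, b, phi):
    """Candidate modulation law: sinusoidal frequency modulation."""
    return f0 + a * np.sin(2 * np.pi * b * t + phi)

p0 = [9.0, 1.5, 2.8, 0.0]                           # rough initial guess
params, _ = curve_fit(fm_law, t, f_meas, p0=p0)     # unconstrained -> LM method
print("estimated (f0, a, b, phi):", np.round(params, 2))
```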
Price, James
2015-01-01
Propoxyphene was withdrawn from the US market in November 2010. This drug is still tested for in the workplace as part of expanded-panel nonregulated testing. A convenience sample of urine specimens (n = 7838) was provided by workers from various industries. The percentage of positive specimens, with 95% confidence intervals, was calculated for each year of the study. Logistic regression was used to assess the impact of the year on the propoxyphene result. The prevalence of positive propoxyphene tests was much higher before the product's withdrawal from the market. Logistic regression provided evidence of a decreasing linear trend (P < 0.001; β = -0.71). The odds ratio signifies that for every additional year, the odds of a urine specimen being positive for propoxyphene were multiplied by 0.49. This supports the conclusion that the change in propoxyphene-positive drug tests over the years is not due to chance. The conclusion supports no longer performing nonregulated workplace propoxyphene urine drug testing for this population.
Prediction of siRNA potency using sparse logistic regression.
Hu, Wei; Hu, John
2014-06-01
RNA interference (RNAi) can modulate gene expression at post-transcriptional as well as transcriptional levels. Short interfering RNA (siRNA) serves as a trigger for the RNAi gene inhibition mechanism, and therefore is a crucial intermediate step in RNAi. There have been extensive studies to identify the sequence characteristics of potent siRNAs. One such study built a linear model using LASSO (Least Absolute Shrinkage and Selection Operator) to measure the contribution of each siRNA sequence feature. This model is simple and interpretable, but it requires a large number of nonzero weights. We have introduced a novel technique, sparse logistic regression, to build a linear model using single-position specific nucleotide compositions that has the same prediction accuracy as the linear model based on LASSO. The weights in our new model share the same general trend as those in the previous model, but have only 25 nonzero weights out of a total of 84 weights, a 54% reduction compared to the previous model. Contrary to the linear model based on LASSO, our model suggests that only a few positions are influential on the efficacy of the siRNA: the 5' and 3' ends and the seed region of siRNA sequences. We also employed sparse logistic regression to build a linear model using dual-position specific nucleotide compositions, a task LASSO is not able to accomplish well due to its high dimensional nature. Our results demonstrate the superiority of sparse logistic regression over LASSO as a technique for both feature selection and regression in the context of siRNA design.
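A minimal sketch of sparse (L1-penalized) logistic regression on one-hot position-specific sequence features; the siRNA sequences and potency labels are simulated, and the penalty strength C is illustrative:

```python
# Sketch of sparse logistic regression: an L1 penalty drives most
# position-specific weights to zero; data are simulated, not from the study.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(10)
n, positions = 500, 21
seqs = rng.integers(0, 4, (n, positions))          # 0..3 = A,C,G,U per position
X = np.zeros((n, positions * 4))
X[np.arange(n)[:, None], np.arange(positions) * 4 + seqs] = 1  # one-hot encode

logit = X[:, 0] - X[:, 80] + 0.5 * X[:, 40]        # only a few positions matter
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))      # potent / not potent labels

clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.3).fit(X, y)
print("nonzero weights:", int(np.sum(clf.coef_ != 0)), "of", clf.coef_.size)
```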
Travel behavior of low income older adults and implementation of an accessibility calculator
Moniruzzaman, Md; Chudyk, Anna; Páez, Antonio; Winters, Meghan; Sims-Gould, Joanie; McKay, Heather
2016-01-01
Given the aging demographic landscape, the concept of walkable neighborhoods has emerged as a topic of interest, especially during the last decade. However, we know very little about whether walkable neighborhoods promote walking among older adults, particularly those with lower incomes. Therefore in this paper we: (i) examine the relation between trip distance and sociodemographic attributes and accessibility features of lower income older adults in Metro Vancouver; and, (ii) implement a web-based application to calculate the accessibility of lower income older adults in Metro Vancouver based on their travel behavior. We use multilevel linear regression to estimate the determinants of trip length. We find that in this population distance traveled is associated with gender, living arrangements, and dog ownership. Furthermore, significant geographical variations (measured using a trend surface) were also found. To better visualize the impact of travel behavior on accessibility by personal profile and location, we also implemented a web-based calculator that generates an Accessibility (A)-score using Google Maps API v3 that can be used to evaluate the accessibility of neighborhoods from the perspective of older adults. PMID:27104148
Detection of hypertensive retinopathy using vessel measurements and textural features.
Agurto, Carla; Joshi, Vinayak; Nemeth, Sheila; Soliz, Peter; Barriga, Simon
2014-01-01
Features that indicate hypertensive retinopathy have been well described in the medical literature. This paper presents a new system to automatically classify subjects with hypertensive retinopathy (HR) using digital color fundus images. Our method consists of the following steps: 1) normalization and enhancement of the image; 2) determination of regions of interest based on automatic location of the optic disc; 3) segmentation of the retinal vasculature and measurement of vessel width and tortuosity; 4) extraction of color features; 5) classification of vessel segments as arteries or veins; 6) calculation of artery-vein ratios using the six widest (major) vessels for each category; 7) calculation of mean red intensity and saturation values for all arteries; 8) calculation of amplitude-modulation frequency-modulation (AM-FM) features for entire image; and 9) classification of features into HR and non-HR using linear regression. This approach was tested on 74 digital color fundus photographs taken with TOPCON and CANON retinal cameras using leave-one out cross validation. An area under the ROC curve (AUC) of 0.84 was achieved with sensitivity and specificity of 90% and 67%, respectively.
Ding, Kai; Cao, Kunlin; Fuld, Matthew K.; Du, Kaifang; Christensen, Gary E.; Hoffman, Eric A.; Reinhardt, Joseph M.
2012-01-01
Purpose: Regional lung volume change as a function of lung inflation serves as an index of parenchymal and airway status as well as an index of regional ventilation and can be used to detect pathologic changes over time. In this paper, the authors propose a new regional measure of lung mechanics: the specific air volume change by corrected Jacobian. The authors compare this new measure, along with two existing registration based measures of lung ventilation, to a regional ventilation measurement derived from xenon-CT (Xe-CT) imaging. Methods: 4DCT and Xe-CT datasets from four adult sheep are used in this study. Nonlinear, 3D image registration is applied to register an image acquired near end inspiration to an image acquired near end expiration. Approximately 200 annotated anatomical points are used as landmarks to evaluate registration accuracy. Three different registration based measures of regional lung mechanics are derived and compared: the specific air volume change calculated from the Jacobian (SAJ); the specific air volume change calculated by the corrected Jacobian (SACJ); and the specific air volume change by intensity change (SAI). The authors show that the commonly used SAI measure can be derived from the direct SAJ measure by using the air-tissue mixture model and assuming there is no tissue volume change between the end inspiration and end expiration datasets. All three ventilation measures are evaluated by comparing to Xe-CT estimates of regional ventilation. Results: After registration, the mean registration error is on the order of 1 mm. For cubical regions of interest (ROIs) in cubes with size 20 mm × 20 mm × 20 mm, the SAJ and SACJ measures show significantly higher correlation (linear regression, average r² = 0.75 and r² = 0.82) with the Xe-CT based measure of specific ventilation (sV) than the SAI measure. For ROIs in slabs along the ventral-dorsal vertical direction with size of 150 mm × 8 mm × 40 mm, the SAJ, SACJ, and SAI all show high correlation (linear regression, average r² = 0.88, r² = 0.92, and r² = 0.87) with the Xe-CT based sV without significant differences when comparing between the three methods. The authors demonstrate a linear relationship between the difference of specific air volume change and difference of tissue volume in all four animals (linear regression, average r² = 0.86). Conclusions: Given a deformation field by an image registration algorithm, significant differences between the SAJ, SACJ, and SAI measures were found at a regional level compared to the Xe-CT sV in four sheep that were studied. The SACJ introduced here provides better correlations with Xe-CT based sV than the SAJ and SAI measures, thus providing an improved surrogate for regional ventilation. PMID:22894434
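The Jacobian-based measures start from the determinant det(I + grad u) of the registration displacement field u, which gives the local volume change factor. A minimal sketch on a synthetic displacement field; the corrected-Jacobian (SACJ) tissue-volume correction is not reproduced here:

```python
# Sketch of computing the per-voxel volume change factor det(I + grad u)
# from a 3D displacement field; the field here is a synthetic placeholder.
import numpy as np

shape = (40, 40, 40)
coords = [np.linspace(0, 1, s) for s in shape]
z, y, x = np.meshgrid(*coords, indexing="ij")
u = [0.05 * z**2, 0.02 * y, np.zeros_like(x)]       # displacement components

# J[..., c, a] = d u_c / d x_a; the volume change factor is det(I + J)
J = np.empty(shape + (3, 3))
for c in range(3):
    for a, g in enumerate(np.gradient(u[c], *coords)):
        J[..., c, a] = g
detJ = np.linalg.det(np.eye(3) + J)
print(f"volume change factor range: {detJ.min():.3f} to {detJ.max():.3f}")
```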
Predictive and mechanistic multivariate linear regression models for reaction development
Santiago, Celine B.; Guo, Jing-Yao
2018-01-01
Multivariate Linear Regression (MLR) models utilizing computationally-derived and empirically-derived physical organic molecular descriptors are described in this review. Several reports demonstrating the effectiveness of this methodological approach towards reaction optimization and mechanistic interrogation are discussed. A detailed protocol to access quantitative and predictive MLR models is provided as a guide for model development and parameter analysis. PMID:29719711
Adding a Parameter Increases the Variance of an Estimated Regression Function
ERIC Educational Resources Information Center
Withers, Christopher S.; Nadarajah, Saralees
2011-01-01
The linear regression model is one of the most popular models in statistics. It is also one of the simplest models in statistics. It has received applications in almost every area of science, engineering and medicine. In this article, the authors show that adding a predictor to a linear model increases the variance of the estimated regression…
Using nonlinear quantile regression to estimate the self-thinning boundary curve
Quang V. Cao; Thomas J. Dean
2015-01-01
The relationship between tree size (quadratic mean diameter) and tree density (number of trees per unit area) has been a topic of research and discussion for many decades. Starting with Reineke in 1933, the maximum size-density relationship, on a log-log scale, has been assumed to be linear. Several techniques, including linear quantile regression, have been employed...
Simultaneous spectrophotometric determination of salbutamol and bromhexine in tablets.
Habib, I H I; Hassouna, M E M; Zaki, G A
2005-03-01
The drugs salbutamol hydrochloride and bromhexine sulfate, typically encountered together in anti-mucolytic tablets, were determined simultaneously either by linear regression at the zero-crossing wavelengths of the first derivative of the UV spectra or by application of a multiple linear partial least-squares regression method. The results obtained by the two proposed mathematical methods were compared with those obtained by the HPLC technique.
Laurens, L M L; Wolfrum, E J
2013-12-18
One of the challenges associated with microalgal biomass characterization and the comparison of microalgal strains and conversion processes is the rapid determination of the composition of algae. We have developed and applied a high-throughput screening technology based on near-infrared (NIR) spectroscopy for the rapid and accurate determination of algal biomass composition. We show that NIR spectroscopy can accurately predict the full composition using multivariate linear regression analysis of varying lipid, protein, and carbohydrate content of algal biomass samples from three strains. We also demonstrate a high quality of predictions of an independent validation set. A high-throughput 96-well configuration for spectroscopy gives equally good prediction relative to a ring-cup configuration, and thus, spectra can be obtained from as little as 10-20 mg of material. We found that lipids exhibit a dominant, distinct, and unique fingerprint in the NIR spectrum that allows for the use of single and multiple linear regression of respective wavelengths for the prediction of the biomass lipid content. This is not the case for carbohydrate and protein content, and thus, the use of multivariate statistical modeling approaches remains necessary.
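A minimal sketch of multivariate calibration of composition from NIR-like spectra, here with partial least squares as a generic stand-in for the paper's multivariate linear regression; spectra and lipid contents are simulated:

```python
# Sketch of multivariate calibration: predict lipid content from spectra.
# Both the spectra and the lipid reference values are synthetic placeholders.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(11)
n_samples, n_bands = 90, 300
lipid = rng.uniform(5, 45, n_samples)              # % dry weight
bands = np.linspace(0, 1, n_bands)
signature = np.exp(-((bands - 0.6) ** 2) / 0.002)  # a lipid absorption feature
spectra = (lipid[:, None] * signature
           + rng.standard_normal((n_samples, n_bands)) * 0.5)

pls = PLSRegression(n_components=5)
r2 = cross_val_score(pls, spectra, lipid, cv=5, scoring="r2").mean()
print(f"cross-validated R^2 for lipid content: {r2:.2f}")
```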
Zhang, Xin; Liu, Pan; Chen, Yuguang; Bai, Lu; Wang, Wei
2014-01-01
The primary objective of this study was to identify whether the frequency of traffic conflicts at signalized intersections can be modeled. The opposing left-turn conflicts were selected for the development of conflict predictive models. Using data collected at 30 approaches at 20 signalized intersections, the underlying distributions of the conflicts under different traffic conditions were examined. Different conflict-predictive models were developed to relate the frequency of opposing left-turn conflicts to various explanatory variables. The models considered include a linear regression model, a negative binomial model, and separate models developed for four traffic scenarios. The prediction performance of different models was compared. The frequency of traffic conflicts follows a negative binomial distribution. The linear regression model is not appropriate for the conflict frequency data. In addition, drivers behaved differently under different traffic conditions. Accordingly, the effects of conflicting traffic volumes on conflict frequency vary across different traffic conditions. The occurrences of traffic conflicts at signalized intersections can be modeled using generalized linear regression models. The use of conflict predictive models has potential to expand the uses of surrogate safety measures in safety estimation and evaluation.
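A minimal sketch of the negative binomial alternative to linear regression for conflict counts, with simulated volumes and an assumed dispersion parameter in statsmodels:

```python
# Sketch of a negative binomial GLM for overdispersed conflict counts;
# traffic volumes, coefficients, and dispersion are illustrative only.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(12)
left_turn = rng.uniform(50, 400, 120)              # left-turn volume (veh/h)
opposing = rng.uniform(200, 1200, 120)             # opposing through volume
mu = np.exp(-4.0 + 0.004 * left_turn + 0.002 * opposing)
conflicts = rng.negative_binomial(2, 2 / (2 + mu)) # overdispersed counts, mean mu

X = sm.add_constant(np.column_stack([left_turn, opposing]))
res = sm.GLM(conflicts, X, family=sm.families.NegativeBinomial(alpha=0.5)).fit()
print(res.params)                                  # intercept and volume effects
```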
Standards for Standardized Logistic Regression Coefficients
ERIC Educational Resources Information Center
Menard, Scott
2011-01-01
Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…
Caraviello, D Z; Weigel, K A; Gianola, D
2004-05-01
Predicted transmitting abilities (PTA) of US Jersey sires for daughter longevity were calculated using a Weibull proportional hazards sire model and compared with predictions from a conventional linear animal model. Culling data from 268,008 Jersey cows with first calving from 1981 to 2000 were used. The proportional hazards model included time-dependent effects of herd-year-season contemporary group and parity by stage of lactation interaction, as well as time-independent effects of sire and age at first calving. Sire variances and parameters of the Weibull distribution were estimated, providing heritability estimates of 4.7% on the log scale and 18.0% on the original scale. The PTA of each sire was expressed as the expected risk of culling relative to daughters of an average sire. Risk ratios (RR) ranged from 0.7 to 1.3, indicating that the risk of culling for daughters of the best sires was 30% lower than for daughters of average sires and nearly 50% lower than for daughters of the poorest sires. Sire PTA from the proportional hazards model were compared with PTA from a linear model similar to that used for routine national genetic evaluation of length of productive life (PL) using cross-validation in independent samples of herds. Models were compared using logistic regression of daughters' stayability to second, third, fourth, or fifth lactation on their sires' PTA values, with alternative approaches for weighting the contribution of each sire. Models were also compared using logistic regression of daughters' stayability to 36, 48, 60, 72, and 84 mo of life. The proportional hazards model generally yielded more accurate predictions according to these criteria, but differences in predictive ability between methods were smaller when using a Kullback-Leibler distance than with other approaches. Results of this study suggest that survival analysis methodology may provide more accurate predictions of genetic merit for longevity than conventional linear models.
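The cross-validation comparison can be sketched roughly as follows: regress daughters' stayability on sire PTA from each evaluation and compare fit. The PTA values and stayability outcomes below are simulated, and the comparison criterion (log-likelihood) simplifies the weighting schemes described in the abstract.

```python
# Rough sketch of the model-comparison idea: logistic regression of
# daughters' stayability (0/1) on sires' PTA from two evaluation models,
# compared by log-likelihood. All quantities are simulated placeholders.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n_daughters = 2000
pta_hazard = rng.normal(1.0, 0.15, n_daughters)          # risk-ratio-scale PTA
pta_linear = pta_hazard + rng.normal(0, 0.10, n_daughters)  # noisier predictor
p_stay = 1.0 / (1.0 + np.exp(-(2.0 - 2.5 * (pta_hazard - 1.0))))
stay2 = (rng.random(n_daughters) < p_stay).astype(float)  # stayability to 2nd lactation

for name, pta in [("hazard-model PTA", pta_hazard), ("linear-model PTA", pta_linear)]:
    fit = sm.Logit(stay2, sm.add_constant(pta)).fit(disp=False)
    print(f"{name}: log-likelihood = {fit.llf:.1f}")
```

In this toy setup the less noisy predictor attains the higher log-likelihood, mirroring how a better genetic evaluation shows up in the stayability regressions.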
Reppas-Chrysovitsinos, Efstathios; Sobek, Anna; MacLeod, Matthew
2016-06-15
Polymeric materials flowing through the technosphere are repositories of organic chemicals throughout their life cycle. Equilibrium partition ratios of organic chemicals between these materials and air (KMA) or water (KMW) are required for models of fate and transport, high-throughput exposure assessment and passive sampling. KMA and KMW have been measured for a growing number of chemical/material combinations, but significant data gaps still exist. We assembled a database of 363 KMA and 910 KMW measurements for 446 individual compounds and nearly 40 individual polymers and biopolymers, collected from 29 studies. We used the EPI Suite and ABSOLV software packages to estimate physicochemical properties of the compounds and we employed an empirical correlation based on Trouton's rule to adjust the measured KMA and KMW values to a standard reference temperature of 298 K. Then, we used a thermodynamic triangle with Henry's law constant to calculate a complete set of 1273 KMA and KMW values. Using simple linear regression, we developed a suite of single-parameter linear free energy relationship (spLFER) models to estimate KMA from the EPI Suite-estimated octanol-air partition ratio (KOA) and KMW from the EPI Suite-estimated octanol-water partition ratio (KOW). Similarly, using multiple linear regression, we developed a set of polyparameter linear free energy relationship (ppLFER) models to estimate KMA and KMW from ABSOLV-estimated Abraham solvation parameters. We explored the two LFER approaches to investigate (1) their performance in estimating partition ratios, and (2) uncertainties associated with treating all different polymers as a single "bulk" polymeric material compartment. The models we have developed are suitable for screening assessments of the tendency for organic chemicals to be emitted from materials, and for use in multimedia models of the fate of organic chemicals in the indoor environment. In screening applications we recommend that KMA and KMW be modeled as 0.06 × KOA and 0.06 × KOW, respectively, with an uncertainty range of a factor of 15.
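The recommended screening rule lends itself to a one-line calculation. The helper below simply applies KMA ≈ 0.06 × KOA with the stated factor-of-15 uncertainty band, for a hypothetical log KOA value.

```python
# Minimal sketch of the screening rule stated in the abstract:
# K_MA ~ 0.06 * K_OA (likewise K_MW ~ 0.06 * K_OW), uncertain to a factor
# of ~15. The log K_OA input is an illustrative value, not a study datum.

def screening_kma(log_koa: float, factor: float = 0.06, unc: float = 15.0):
    """Return (low, central, high) screening estimates of K_MA."""
    central = factor * 10.0 ** log_koa
    return central / unc, central, central * unc

low, mid, high = screening_kma(log_koa=8.5)   # hypothetical chemical
print(f"K_MA estimate: {mid:.3g} (range {low:.3g} - {high:.3g})")
```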
SOCR Analyses – an Instructional Java Web-based Statistical Analysis Toolkit
Chu, Annie; Cui, Jenny; Dinov, Ivo D.
2011-01-01
The Statistical Online Computational Resource (SOCR) designs web-based tools for educational use in a variety of undergraduate courses (Dinov 2006). Several studies have demonstrated that these resources significantly improve students' motivation and learning experiences (Dinov et al. 2008). SOCR Analyses is a new component that concentrates on data modeling and analysis using parametric and non-parametric techniques supported with graphical model diagnostics. Currently implemented analyses include models commonly used in undergraduate statistics courses, such as linear models (Simple Linear Regression, Multiple Linear Regression, One-Way and Two-Way ANOVA). In addition, we implemented tests for sample comparisons, such as the t-test in the parametric category, and the Wilcoxon rank sum test, Kruskal-Wallis test, and Friedman's test in the non-parametric category. SOCR Analyses also includes several hypothesis test models, such as contingency tables, Friedman's test, and Fisher's exact test. The code is open source (http://socr.googlecode.com/), in the hope of contributing to the efforts of the statistical computing community. The code includes functionality for each specific analysis model as well as general utilities that can be applied to various statistical computing tasks. For example, concrete methods with an API (Application Programming Interface) have been implemented for statistical summaries, least-squares solutions of general linear models, rank calculations, etc. HTML interfaces, tutorials, source code, activities, and data are freely available via the web (www.SOCR.ucla.edu). Code examples for developers and demos for educators are provided on the SOCR Wiki website. In this article, the pedagogical use of SOCR Analyses is discussed, as well as the underlying design framework. As the SOCR project is ongoing, with more functions and tools being added, these resources are constantly improved. The reader is strongly encouraged to check the SOCR site for the most up-to-date information and newly added models. PMID:21546994
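SOCR itself is implemented in Java; as a language-neutral sketch of the "least-squares solutions of general linear models" utility mentioned above, the following Python snippet solves the same problem with a numerically stable least-squares routine on made-up data.

```python
# Sketch of a general-linear-model least-squares utility of the kind the
# SOCR API exposes; data are made up and the helper name is illustrative.
import numpy as np

def ols_solve(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Least-squares coefficients for y ~ X @ beta (X includes intercept)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

rng = np.random.default_rng(4)
X = np.column_stack([np.ones(50), rng.standard_normal((50, 2))])
y = X @ np.array([1.0, 2.0, -0.5]) + 0.1 * rng.standard_normal(50)
print("estimated coefficients:", ols_solve(X, y))
```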
Martínez-Moyá, María; Navarrete-Muñoz, Eva M; García de la Hera, Manuela; Giménez-Monzo, Daniel; González-Palacios, Sandra; Valera-Gran, Desirée; Sempere-Orts, María; Vioque, Jesús
2014-01-01
To explore the association between excess weight or body mass index (BMI) and time spent watching television, self-reported physical activity, and sleep duration in a young adult population. We analyzed cross-sectional baseline data from 1,135 participants (17-35 years old) in the project Dieta, salud y antropometría en población universitaria (Diet, Health and Anthropometric Variables in University Students). Information about time spent watching television, sleep duration, self-reported physical activity, and self-reported height and weight was collected with a baseline questionnaire. BMI was calculated as kg/m², and excess weight was defined as BMI ≥25. We used multiple logistic regression to explore the association between excess weight (no/yes) and the independent variables, and multiple linear regression for BMI. The prevalence of excess weight was 13.7% (11.2% overweight and 2.5% obese). A significant positive association was found between excess weight and a greater amount of time spent watching television. Participants who reported watching television >2 h a day had a higher risk of excess weight than those who watched television ≤1 h a day (OR=2.13; 95%CI: 1.37-3.36; p-trend: 0.002). A lower level of physical activity was associated with an increased risk of excess weight, although the association was statistically significant only in the multiple linear regression (p=0.037). No association was observed with sleep duration. A greater number of hours spent watching television and lower physical activity were significantly associated with a higher BMI in young adults. Both factors are potentially modifiable with preventive strategies. Copyright © 2013 SESPAS. Published by Elsevier España. All rights reserved.
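To make the reported effect measure concrete, the sketch below shows how an adjusted odds ratio with its 95% confidence interval falls out of a multiple logistic regression; the data and covariates are simulated stand-ins, not the study's dataset.

```python
# Sketch of deriving an adjusted odds ratio (>2 h TV vs <=1 h) from a
# multiple logistic regression. Sample size matches the abstract, but the
# data and effect sizes are simulated placeholders.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 1135
tv_gt2h = (rng.random(n) < 0.3).astype(float)           # >2 h TV per day
activity = rng.standard_normal(n)                       # activity score, assumed
logits = -2.0 + 0.75 * tv_gt2h - 0.3 * activity
excess = (rng.random(n) < 1.0 / (1.0 + np.exp(-logits))).astype(float)

X = sm.add_constant(np.column_stack([tv_gt2h, activity]))
fit = sm.Logit(excess, X).fit(disp=False)
or_tv = np.exp(fit.params[1])                           # exponentiate the coefficient
ci_low, ci_high = np.exp(fit.conf_int()[1])
print(f"OR for >2 h TV: {or_tv:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```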
Image interpolation via regularized local linear regression.
Liu, Xianming; Zhao, Debin; Xiong, Ruiqin; Ma, Siwei; Gao, Wen; Sun, Huifang
2011-12-01
The linear regression model is a very attractive tool for designing effective image interpolation schemes. Some regression-based image interpolation algorithms have been proposed in the literature, in which the objective functions are optimized by ordinary least squares (OLS). However, interpolation with OLS can have undesirable properties from a robustness point of view: even small numbers of outliers can dramatically affect the estimates. To address these issues, in this paper we propose a novel image interpolation algorithm based on regularized local linear regression (RLLR). Starting from the linear regression model, we replace the OLS error norm with the moving least squares (MLS) error norm, which leads to a robust estimator of local image structure. To keep the solution stable and avoid overfitting, we incorporate the l2-norm as an estimator complexity penalty. Moreover, motivated by recent progress on manifold-based semi-supervised learning, we explicitly consider the intrinsic manifold structure by making use of both measured and unmeasured data points. Specifically, our framework incorporates the geometric structure of the marginal probability distribution induced by unmeasured samples as an additional local smoothness-preserving constraint. The optimal model parameters can be obtained in closed form by solving a convex optimization problem. Experimental results on benchmark test images demonstrate that the proposed method achieves very competitive performance compared with state-of-the-art interpolation algorithms, especially in preserving image edge structure. © 2011 IEEE
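The core estimator behind such regularized, weighted least-squares schemes has a closed form, beta = (X^T W X + lambda*I)^(-1) X^T W y. The sketch below applies it to a toy local window of pixels; the weights, window, and penalty are illustrative choices, not the paper's exact RLLR formulation (which additionally uses the manifold constraint).

```python
# Minimal sketch of ridge-regularized weighted (moving) least squares, the
# building block of RLLR-style interpolation. Window, weights, and lambda
# are illustrative assumptions.
import numpy as np

def rllr_fit(X, y, weights, lam=0.1):
    """Solve beta = (X^T W X + lam*I)^(-1) X^T W y."""
    W = np.diag(weights)
    A = X.T @ W @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ W @ y)

# Local window of known pixels around the pixel to interpolate at (0, 0):
# coordinates (with intercept column) as X, intensities as y.
coords = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1], [0, 2]], dtype=float)
X = np.column_stack([np.ones(len(coords)), coords])
y = np.array([100.0, 110.0, 104.0, 115.0, 118.0])       # pixel intensities
w = np.exp(-np.sum(coords ** 2, axis=1) / 4.0)          # closer pixels weigh more

beta = rllr_fit(X, y, w, lam=0.5)
print("interpolated value at window center:", beta[0])  # intercept = value at (0,0)
```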
An efficient approach to ARMA modeling of biological systems with multiple inputs and delays
NASA Technical Reports Server (NTRS)
Perrott, M. H.; Cohen, R. J.
1996-01-01
This paper presents a new approach to AutoRegressive Moving Average (ARMA or ARX) modeling that automatically seeks the best model order to represent investigated linear, time-invariant systems from their input/output data. The algorithm seeks the ARMA parameterization that accounts for variability in the output of the system due to input activity and contains the fewest parameters required to do so. The unique characteristics of the proposed system identification algorithm are its simplicity and efficiency in handling systems with delays and multiple inputs. We present results of applying the algorithm to simulated data and experimental biological data. In addition, a technique for assessing the error associated with the impulse responses calculated from estimated ARMA parameterizations is presented. The mapping from ARMA coefficients to impulse response estimates is nonlinear, which complicates any effort to construct confidence bounds for the obtained impulse responses. Here a method for obtaining a linearization of this mapping is derived, which leads to a simple procedure for approximating the confidence bounds.
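For the ARX special case, least-squares estimation with an input delay can be sketched as follows; the system orders, delay, and simulated data are assumed for illustration and do not reproduce the paper's order-selection procedure.

```python
# Sketch of ARX estimation with an input delay: stack lagged outputs and
# delayed inputs into a regressor matrix and solve by least squares.
# Orders (na, nb) and the delay are assumed, not selected automatically.
import numpy as np

def fit_arx(y, u, na=2, nb=2, delay=3):
    """Estimate y[t] = sum_i a_i*y[t-i] + sum_j b_j*u[t-delay-j] + e[t]."""
    start = max(na, nb + delay)
    rows = []
    for t in range(start, len(y)):
        past_y = [y[t - i] for i in range(1, na + 1)]
        past_u = [u[t - delay - j] for j in range(nb)]
        rows.append(past_y + past_u)
    theta, *_ = np.linalg.lstsq(np.asarray(rows), y[start:], rcond=None)
    return theta[:na], theta[na:]

rng = np.random.default_rng(6)
u = rng.standard_normal(500)
y = np.zeros(500)
for t in range(5, 500):        # simulate a known ARX system with delay 3
    y[t] = (0.6 * y[t - 1] - 0.2 * y[t - 2]
            + 0.5 * u[t - 3] + 0.1 * u[t - 4]
            + 0.01 * rng.standard_normal())

a, b = fit_arx(y, u, na=2, nb=2, delay=3)
print("AR coefficients:", a, "input coefficients:", b)
```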
Senior, Samir A; Madbouly, Magdy D; El massry, Abdel-Moneim
2011-09-01
Quantum chemical and topological descriptors of some organophosphorus (OP) compounds were correlated with their dermal toxicity (LD50). The quantum chemical parameters were obtained using B3LYP/LANL2DZdp-ECP optimization. Using linear regression analysis, equations were derived to calculate the theoretical LD50 of the studied compounds. The inclusion of quantum parameters, comprising both charge indices and topological indices, captures the factors affecting the toxicity of the studied compounds and results in high correlation coefficients for the obtained equations. Two of the four newly proposed descriptors give higher correlation coefficients, namely the Heteroatom-Corrected Extended Connectivity Randic index ((1)X(HCEC)) and the Density Randic index ((1)X(Den)). The obtained linear equations were applied to predict the toxicity of some related structures. It was found that the sulfur atoms in these compounds must be replaced by oxygen atoms to achieve higher toxicity. Copyright © 2011 Elsevier Ltd. All rights reserved.
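The regression step of such a QSAR workflow can be illustrated with a toy example: regress LD50 on a couple of descriptors and report the fitted equation and correlation coefficient. All descriptor values and toxicities below are fabricated placeholders, not OP data.

```python
# Toy illustration of a QSAR regression: fit LD50 against two molecular
# descriptors and report the equation and correlation coefficient.
# All numbers are fabricated placeholders.
import numpy as np

descriptors = np.array([     # columns: charge index q, Randic-type index
    [0.21, 3.1], [0.35, 2.4], [0.18, 3.8],
    [0.40, 2.1], [0.27, 2.9], [0.31, 2.6]])
ld50 = np.array([120.0, 65.0, 180.0, 40.0, 95.0, 80.0])  # mg/kg, assumed

X = np.column_stack([np.ones(len(ld50)), descriptors])
coef, *_ = np.linalg.lstsq(X, ld50, rcond=None)
pred = X @ coef
r = np.corrcoef(ld50, pred)[0, 1]
print(f"LD50 = {coef[0]:.1f} + {coef[1]:.1f}*q + {coef[2]:.1f}*X_randic  (r = {r:.3f})")
```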
Use of magnetic resonance imaging to predict the body composition of pigs in vivo.
Kremer, P V; Förster, M; Scholz, A M
2013-06-01
The objective of the study was to evaluate whether magnetic resonance imaging (MRI) offers the opportunity to reliably analyze the body composition of pigs in vivo. To this end, the areas of the loin eye muscle and its overlying back fat, measured on MRI images, were used to predict body composition values measured by dual-energy X-ray absorptiometry (DXA). A total of 77 pigs, with BW ranging between 42 and 102 kg, were studied by MRI and DXA. The pigs originated from different extensive or conventional breeds or crossbreds, such as Cerdo Iberico, Duroc, German Landrace, German Large White, Hampshire, and Pietrain. A Siemens Magnetom Open was used for MRI in the thoracic region between the 13th and 14th vertebrae in order to measure the loin eye area (MRI-LA) and the overlying back fat area (MRI-FA) of both body sides, whereas a whole-body scan was performed by DXA with a GE Lunar DPX-IQ in order to measure the amount and percentage of fat tissue (DXA-FM; DXA-%FM) and lean tissue mass (DXA-LM; DXA-%LM). Simple linear regression analysis was performed to quantify the linear relationships between MRI- and DXA-derived traits. In addition, a stepwise regression procedure was carried out to calculate (multiple) regression equations between MRI and DXA variables (including BW). Simple regression analyses showed strong relationships between DXA-%FM and MRI-FA (R² = 0.89, √MSE = 2.39%), DXA-FM and MRI-FA (R² = 0.82, √MSE = 2757 g), and DXA-LM and MRI-LA (R² = 0.82, √MSE = 4018 g). Only DXA-%LM and MRI-LA did not show any relationship (R² = 0). In the multiple regression analysis, DXA-LM and DXA-FM were both highly related to MRI-LA, MRI-FA, and BW (R² = 0.96, √MSE = 1784 g, and R² = 0.95, √MSE = 1496 g, respectively). It can therefore be concluded that MRI-derived images provide exact information about important 'carcass traits' in pigs and may be used to reliably predict body composition in vivo.
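The multiple-regression step reported for the pig data can be sketched as follows, predicting DXA lean mass from MRI loin-eye area, MRI fat area, and body weight; all values are simulated under assumed relationships, not the study's measurements.

```python
# Sketch of the multiple-regression step: predict DXA lean mass from
# MRI loin-eye area, MRI fat area, and body weight, then report R^2 and
# root-MSE. All data and relationships are simulated assumptions.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 77
bw = rng.uniform(42, 102, n)                    # kg, range as in the study
mri_la = 20 + 0.5 * bw + rng.normal(0, 3, n)    # cm^2, assumed relation
mri_fa = 5 + 0.3 * bw + rng.normal(0, 2, n)     # cm^2, assumed relation
dxa_lm = 300 * bw + 150 * mri_la - 100 * mri_fa + rng.normal(0, 1800, n)  # g

X = sm.add_constant(np.column_stack([mri_la, mri_fa, bw]))
fit = sm.OLS(dxa_lm, X).fit()
print(f"R^2 = {fit.rsquared:.2f}, sqrt(MSE) = {np.sqrt(fit.mse_resid):.0f} g")
```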
NASA Astrophysics Data System (ADS)
Ortiz, M.; Graber, H. C.; Wilkinson, J.; Nyman, L. M.; Lund, B.
2017-12-01
Much work has been done on determining changes in summer ice albedo and morphological properties of melt ponds, such as depth, shape, and distribution, using in-situ measurements and satellite-based sensors. Although these studies represent much pioneering work in this area, they still lack sufficient spatial and temporal coverage. We present a prototype algorithm using Linear Support Vector Machines (LSVMs) designed to quantify the evolution of melt pond fraction from a recently declassified high-resolution panchromatic optical dataset. The study area of interest lies within the Beaufort marginal ice zone (MIZ), where several in-situ instruments were deployed by the British Antarctic Survey, jointly with the MIZ Program, from April to September 2014. The LSVM uses four-dimensional feature data drawn from the intensity image itself and from textures calculated with a modified first-order histogram technique using the probability density of occurrences. We explore both the temporal evolution of melt ponds and spatial statistics such as pond fraction, pond area, and pond number density, among others. We also introduce a linear regression model that can potentially be used to estimate average pond area by ingesting several melt pond statistics and shape parameters.
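A hedged sketch of the classification step might look like the following: a linear SVM over a four-dimensional feature vector (intensity plus three assumed first-order texture statistics), trained on simulated samples, with the predicted pond fraction read off from the labels.

```python
# Sketch of linear-SVM melt pond classification over four-dimensional
# features. The feature choices, class statistics, and labels are simulated
# assumptions, not the study's dataset.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(8)
n = 400
# Columns: mean intensity, variance, skewness, entropy (assumed features).
ponds = rng.normal([0.7, 0.02, -0.5, 2.0], 0.1, size=(n // 2, 4))
ice = rng.normal([0.9, 0.05, 0.3, 2.8], 0.1, size=(n // 2, 4))
X = np.vstack([ponds, ice])
y = np.array([1] * (n // 2) + [0] * (n // 2))           # 1 = melt pond

clf = make_pipeline(StandardScaler(), LinearSVC()).fit(X, y)
pond_fraction = clf.predict(X).mean()                   # fraction labeled pond
print(f"estimated melt pond fraction: {pond_fraction:.2f}")
```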
[Evaluation of pendulum testing of spasticity].
Le Cavorzin, P; Hernot, X; Bartier, O; Carrault, G; Chagneau, F; Gallien, P; Allain, H; Rochcongar, P
2002-11-01
To identify valid measurements of spasticity derived from the pendulum test of the leg in a representative population of spastic patients. Pendulum testing was performed in 15 spastic and 10 matched healthy subjects. The reflex-mediated torque evoked in the quadriceps femoris, as well as muscle mechanical parameters (viscosity and elasticity), were calculated using mathematical modelling. Correlations with the two main measures derived from the pendulum test reported in the literature (the Relaxation Index and the area under the curve) were calculated in order to select the most valid. Among the mechanical parameters, only viscosity was found to be significantly higher in the spastic group. As expected, the computed integral of the reflex-mediated torque was larger in spastic patients than in healthy subjects. A significant non-linear (logarithmic) correlation was found between clinically assessed muscle spasticity (Ashworth grading) and the computed reflex-mediated torque, emphasising the non-linear behaviour of this scale. Among the measurements derived from the pendulum test proposed in the literature for routine estimation of spasticity, the Relaxation Index exhibited an unsuitable U-shaped pattern of variation with increasing reflex-mediated torque. In contrast, the area under the curve varied linearly, which is more convenient for routine estimation of spasticity. The pendulum test of the leg is a simple technique for the assessment of spastic hypertonia. However, the measurement generally used in the literature (the Relaxation Index) has serious limitations and would benefit from being replaced by more valid measures, such as the area under the goniometric curve, especially for the assessment of therapeutics.
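The suggested measure is straightforward to compute from a recorded trace; the sketch below integrates the absolute knee angle over time with the trapezoidal rule, using a simulated damped swing in place of real goniometric data.

```python
# Minimal sketch of the suggested measure: area under the goniometric
# (knee angle vs time) curve from a pendulum test. The trace below is a
# simulated damped swing, not a patient recording.
import numpy as np

t = np.linspace(0.0, 5.0, 500)                          # s
angle = 70.0 * np.exp(-0.8 * t) * np.cos(2.5 * t)       # deg, assumed trace

# Trapezoidal integration of |angle| over time.
abs_angle = np.abs(angle)
area = np.sum((abs_angle[1:] + abs_angle[:-1]) / 2.0 * np.diff(t))
print(f"area under goniometric curve: {area:.1f} deg*s")
```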