NASA Astrophysics Data System (ADS)
Gao, Xiangyun; An, Haizhong; Fang, Wei; Huang, Xuan; Li, Huajiao; Zhong, Weiqiong; Ding, Yinghui
2014-07-01
The linear regression parameters between two time series can be different under different lengths of observation period. If we study the whole period by the sliding window of a short period, the change of the linear regression parameters is a process of dynamic transmission over time. We tackle fundamental research that presents a simple and efficient computational scheme: a linear regression patterns transmission algorithm, which transforms linear regression patterns into directed and weighted networks. The linear regression patterns (nodes) are defined by the combination of intervals of the linear regression parameters and the results of the significance testing under different sizes of the sliding window. The transmissions between adjacent patterns are defined as edges, and the weights of the edges are the frequency of the transmissions. The major patterns, the distance, and the medium in the process of the transmission can be captured. The statistical results of weighted out-degree and betweenness centrality are mapped on timelines, which shows the features of the distribution of the results. Many measurements in different areas that involve two related time series variables could take advantage of this algorithm to characterize the dynamic relationships between the time series from a new perspective.
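A minimal sketch of the idea described in this abstract, under assumed data and binning choices: fit a regression in each sliding window, map the (slope interval, significance) pair to a pattern node, and count transitions between adjacent windows as weighted directed edges. The synthetic series, window size, slope bin edges and the networkx/scipy calls are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
import networkx as nx
from scipy.stats import linregress

rng = np.random.default_rng(0)
x = np.cumsum(rng.normal(size=500))            # two related synthetic series
y = 0.5 * x + np.cumsum(rng.normal(size=500))

window = 30
patterns = []
for start in range(len(x) - window + 1):
    res = linregress(x[start:start + window], y[start:start + window])
    slope_bin = int(np.searchsorted([-0.5, 0.0, 0.5, 1.0], res.slope))  # slope interval
    significant = bool(res.pvalue < 0.05)                               # significance test result
    patterns.append((slope_bin, significant))                           # pattern = node

G = nx.DiGraph()
for a, b in zip(patterns[:-1], patterns[1:]):   # transitions between adjacent windows
    if G.has_edge(a, b):
        G[a][b]["weight"] += 1                  # edge weight = transition frequency
    else:
        G.add_edge(a, b, weight=1)

out_strength = dict(G.out_degree(weight="weight"))   # weighted out-degree per pattern
betweenness = nx.betweenness_centrality(G)           # "medium" patterns in the transmission
print(sorted(out_strength.items(), key=lambda kv: -kv[1])[:3])
print(sorted(betweenness.items(), key=lambda kv: -kv[1])[:3])
```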
Comparison between Linear and Nonlinear Regression in a Laboratory Heat Transfer Experiment
ERIC Educational Resources Information Center
Gonçalves, Carine Messias; Schwaab, Marcio; Pinto, José Carlos
2013-01-01
In order to interpret laboratory experimental data, undergraduate students are used to perform linear regression through linearized versions of nonlinear models. However, the use of linearized models can lead to statistically biased parameter estimates. Even so, it is not an easy task to introduce nonlinear regression and show for the students…
Calibrated Peer Review for Interpreting Linear Regression Parameters: Results from a Graduate Course
ERIC Educational Resources Information Center
Enders, Felicity B.; Jenkins, Sarah; Hoverman, Verna
2010-01-01
Biostatistics is traditionally a difficult subject for students to learn. While the mathematical aspects are challenging, it can also be demanding for students to learn the exact language to use to correctly interpret statistical results. In particular, correctly interpreting the parameters from linear regression is both a vital tool and a…
Khalil, Mohamed H.; Shebl, Mostafa K.; Kosba, Mohamed A.; El-Sabrout, Karim; Zaki, Nesma
2016-01-01
Aim: This research was conducted to determine the parameters most affecting the hatchability of indigenous and improved local chickens’ eggs. Materials and Methods: Five parameters were studied (fertility, early and late embryonic mortalities, shape index, egg weight, and egg weight loss) on four strains, namely Fayoumi, Alexandria, Matrouh, and Montazah. Multiple linear regression was performed on the studied parameters to determine the one most influencing hatchability. Results: The results showed significant differences in commercial and scientific hatchability among strains. The Alexandria strain had the highest significant commercial hatchability (80.70%). Regarding the studied strains, highly significant differences in hatching chick weight among strains were observed. Using multiple linear regression analysis, fertility made the greatest percent contribution (71.31%) to hatchability, and the lowest percent contributions were made by shape index and egg weight loss. Conclusion: A prediction of hatchability using multiple regression analysis could be a good tool to improve hatchability percentage in chickens. PMID:27651666
Linear regression metamodeling as a tool to summarize and present simulation model results.
Jalal, Hawre; Dowd, Bryan; Sainfort, François; Kuntz, Karen M
2013-10-01
Modelers lack a tool to systematically and clearly present complex model results, including those from sensitivity analyses. The objective was to propose linear regression metamodeling as a tool to increase transparency of decision analytic models and better communicate their results. We used a simplified cancer cure model to demonstrate our approach. The model computed the lifetime cost and benefit of 3 treatment options for cancer patients. We simulated 10,000 cohorts in a probabilistic sensitivity analysis (PSA) and regressed the model outcomes on the standardized input parameter values in a set of regression analyses. We used the regression coefficients to describe measures of sensitivity analyses, including threshold and parameter sensitivity analyses. We also compared the results of the PSA to deterministic full-factorial and one-factor-at-a-time designs. The regression intercept represented the estimated base-case outcome, and the other coefficients described the relative parameter uncertainty in the model. We defined simple relationships that compute the average and incremental net benefit of each intervention. Metamodeling produced outputs similar to traditional deterministic 1-way or 2-way sensitivity analyses but was more reliable since it used all parameter values. Linear regression metamodeling is a simple, yet powerful, tool that can assist modelers in communicating model characteristics and sensitivity analyses.
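As a hedged illustration of the metamodeling step described above, the sketch below regresses a simulated PSA outcome on standardized input draws so that the intercept approximates the base-case result and the remaining coefficients rank parameter influence. The toy outcome function and parameter distributions are assumptions, not the paper's cancer cure model.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 10_000
p_cure = rng.beta(20, 80, n)                      # sampled input parameters (toy PSA)
cost = rng.gamma(50, 200, n)
utility = rng.beta(60, 20, n)
net_benefit = 50_000 * p_cure * utility - cost    # simulated model outcome per cohort

X = np.column_stack([p_cure, cost, utility])
X_std = (X - X.mean(axis=0)) / X.std(axis=0)      # standardize the inputs
fit = sm.OLS(net_benefit, sm.add_constant(X_std)).fit()

print(fit.params[0])     # intercept ~ estimated base-case outcome
print(fit.params[1:])    # coefficients ~ relative influence of each parameter's uncertainty
```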
Korany, Mohamed A; Gazy, Azza A; Khamis, Essam F; Ragab, Marwa A A; Kamal, Miranda F
2018-06-01
This study outlines two robust regression approaches, namely least median of squares (LMS) and iteratively re-weighted least squares (IRLS), to investigate their application in the instrumental analysis of nutraceuticals (that is, fluorescence quenching of merbromin reagent upon lipoic acid addition). These robust regression methods were used to calculate calibration data from the fluorescence quenching reaction (∆F and F-ratio) under ideal or non-ideal linearity conditions. For each condition, data were treated using three regression fittings: Ordinary Least Squares (OLS), LMS and IRLS. Assessment of linearity, limits of detection (LOD) and quantitation (LOQ), accuracy and precision were carefully studied for each condition. LMS and IRLS regression line fittings showed significant improvement in correlation coefficients and all regression parameters for both methods and both conditions. In the ideal linearity condition, the intercept and slope changed insignificantly, but a dramatic change was observed in the intercept under the non-ideal linearity condition. Under both linearity conditions, LOD and LOQ values after the robust regression line fitting of data were lower than those obtained before data treatment. The results obtained after statistical treatment indicated that the linearity ranges for drug determination could be expanded to lower limits of quantitation by enhancing the regression equation parameters after data treatment. Analysis results for lipoic acid in capsules, using both fluorimetric methods, treated by parametric OLS and after treatment by robust LMS and IRLS were compared for both linearity conditions. Copyright © 2018 John Wiley & Sons, Ltd.
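A small sketch of the contrast between ordinary least squares and an iteratively re-weighted fit on a calibration line with one planted outlier; statsmodels' RLM with a Huber norm stands in for the IRLS procedure, the LMS fit is omitted for brevity, and the concentrations and responses are invented rather than taken from the merbromin data.

```python
import numpy as np
import statsmodels.api as sm

conc = np.array([0.5, 1, 2, 4, 6, 8, 10.0])        # analyte concentration (toy units)
noise = np.array([0.2, -0.1, 0.3, -0.2, 0.1, 9.0, -0.3])   # one gross outlier
delta_f = 3.0 + 12.0 * conc + noise                 # calibration response

X = sm.add_constant(conc)
ols = sm.OLS(delta_f, X).fit()
irls = sm.RLM(delta_f, X, M=sm.robust.norms.HuberT()).fit()  # iteratively re-weighted least squares

print(ols.params)    # intercept/slope pulled by the outlier
print(irls.params)   # robust intercept/slope, much closer to (3, 12)
```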
Kumar, K Vasanth; Porkodi, K; Rocha, F
2008-01-15
A comparison of linear and non-linear regression methods in selecting the optimum isotherm was made for the experimental equilibrium data of basic red 9 sorption by activated carbon. The r² was used to select the best-fit linear theoretical isotherm. In the case of the non-linear regression method, six error functions, namely the coefficient of determination (r²), hybrid fractional error function (HYBRID), Marquardt's percent standard deviation (MPSD), average relative error (ARE), sum of the errors squared (ERRSQ) and sum of the absolute errors (EABS), were used to predict the parameters involved in the two- and three-parameter isotherms and also to predict the optimum isotherm. Non-linear regression was found to be a better way to obtain the parameters involved in the isotherms and also the optimum isotherm. For the two-parameter isotherms, MPSD was found to be the best error function in minimizing the error distribution between the experimental equilibrium data and predicted isotherms. In the case of the three-parameter isotherms, r² was found to be the best error function to minimize the error distribution structure between experimental equilibrium data and theoretical isotherms. The present study showed that the size of the error function alone is not a deciding factor in choosing the optimum isotherm. In addition to the size of the error function, the theory behind the predicted isotherm should be verified with the help of experimental data while selecting the optimum isotherm. A coefficient of non-determination, K², was explained and was found to be very useful in identifying the best error function while selecting the optimum isotherm.
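The sketch below illustrates the linear-versus-non-linear comparison on a synthetic Langmuir isotherm: a linearized fit via ordinary least squares on the transformed data against a direct non-linear fit with scipy's curve_fit. The simulated qm and KL values and the noise level are assumptions, not the basic red 9 data.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import linregress

def langmuir(Ce, qm, KL):
    return qm * KL * Ce / (1.0 + KL * Ce)

rng = np.random.default_rng(2)
Ce = np.linspace(5, 200, 12)                                   # equilibrium concentration
qe = langmuir(Ce, qm=150.0, KL=0.05) * (1 + rng.normal(0, 0.03, Ce.size))

# Linearized form: Ce/qe = Ce/qm + 1/(qm*KL), fitted by ordinary least squares
lin = linregress(Ce, Ce / qe)
qm_lin = 1.0 / lin.slope
KL_lin = lin.slope / lin.intercept

# Non-linear regression on the untransformed isotherm
(qm_nl, KL_nl), _ = curve_fit(langmuir, Ce, qe, p0=[100.0, 0.01])

print(qm_lin, KL_lin)   # parameters from the linearized fit
print(qm_nl, KL_nl)     # parameters from the non-linear fit
```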
NASA Astrophysics Data System (ADS)
Sirenko, M. A.; Tarasenko, P. F.; Pushkarev, M. I.
2017-01-01
One of the most noticeable features of sign-based statistical procedures is the opportunity to build an exact test for simple hypothesis testing of parameters in a regression model. In this article, we extend the sign-based approach to the nonlinear case with dependent noise. The examined model is a multi-quantile regression, which makes it possible to test hypotheses not only about regression parameters, but about noise parameters as well.
ERIC Educational Resources Information Center
Rocconi, Louis M.
2011-01-01
Hierarchical linear models (HLM) solve the problems associated with the unit of analysis problem such as misestimated standard errors, heterogeneity of regression and aggregation bias by modeling all levels of interest simultaneously. Hierarchical linear modeling resolves the problem of misestimated standard errors by incorporating a unique random…
Kumar, K Vasanth
2007-04-02
Kinetic experiments were carried out for the sorption of safranin onto activated carbon particles. The kinetic data were fitted to pseudo-second order model of Ho, Sobkowsk and Czerwinski, Blanchard et al. and Ritchie by linear and non-linear regression methods. Non-linear method was found to be a better way of obtaining the parameters involved in the second order rate kinetic expressions. Both linear and non-linear regression showed that the Sobkowsk and Czerwinski and Ritchie's pseudo-second order models were the same. Non-linear regression analysis showed that both Blanchard et al. and Ho have similar ideas on the pseudo-second order model but with different assumptions. The best fit of experimental data in Ho's pseudo-second order expression by linear and non-linear regression method showed that Ho pseudo-second order model was a better kinetic expression when compared to other pseudo-second order kinetic expressions.
Quantum State Tomography via Linear Regression Estimation
Qi, Bo; Hou, Zhibo; Li, Li; Dong, Daoyi; Xiang, Guoyong; Guo, Guangcan
2013-01-01
A simple yet efficient state reconstruction algorithm of linear regression estimation (LRE) is presented for quantum state tomography. In this method, quantum state reconstruction is converted into a parameter estimation problem of a linear regression model and the least-squares method is employed to estimate the unknown parameters. An asymptotic mean squared error (MSE) upper bound for all possible states to be estimated is given analytically, which depends explicitly upon the involved measurement bases. This analytical MSE upper bound can guide one to choose optimal measurement sets. The computational complexity of LRE is O(d⁴) where d is the dimension of the quantum state. Numerical examples show that LRE is much faster than maximum-likelihood estimation for quantum state tomography. PMID:24336519
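A toy single-qubit illustration of the linear regression estimation idea, under assumed true-state and shot-count values: the state is expanded in the Pauli basis, the measured expectation values are treated as noisy linear observations of the Bloch-vector parameters, and those parameters are recovered by least squares. With one observable per axis the design matrix is trivially the identity, so this sketches only the setup, not the paper's general d-dimensional scheme or its MSE bound.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
paulis = [X, Y, Z]

true_r = np.array([0.3, -0.2, 0.5])                 # assumed true Bloch vector
rng = np.random.default_rng(3)
shots = 2000

meas = []                                           # empirical <X>, <Y>, <Z>
for r in true_r:
    p_plus = (1 + r) / 2                            # probability of the +1 outcome
    outcomes = rng.choice([1, -1], size=shots, p=[p_plus, 1 - p_plus])
    meas.append(outcomes.mean())
meas = np.array(meas)

# Linear regression model: meas ≈ A @ r_hat; here A is the identity (one observable per axis)
A = np.eye(3)
r_hat, *_ = np.linalg.lstsq(A, meas, rcond=None)

rho_hat = 0.5 * (np.eye(2) + sum(r * P for r, P in zip(r_hat, paulis)))
rho_true = 0.5 * (np.eye(2) + sum(r * P for r, P in zip(true_r, paulis)))
print(np.round(rho_hat, 3))
print(np.linalg.norm(rho_hat - rho_true))           # reconstruction error
```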
NASA Technical Reports Server (NTRS)
Wilson, Edward (Inventor)
2006-01-01
The present invention is a method for identifying unknown parameters in a system having a set of governing equations describing its behavior that cannot be put into regression form with the unknown parameters linearly represented. In this method, the vector of unknown parameters is segmented into a plurality of groups where each individual group of unknown parameters may be isolated linearly by manipulation of said equations. Multiple concurrent and independent recursive least squares identifications, one for each said group, are run, treating the other unknown parameters appearing in each group's regression equation as if they were known perfectly, with those values provided by the recursive least squares estimates from the other groups, thereby enabling the use of fast, compact, efficient linear algorithms to solve problems that would otherwise require nonlinear solution approaches. This invention is presented with application to identification of mass and thruster properties for a thruster-controlled spacecraft.
Linear regression analysis of survival data with missing censoring indicators.
Wang, Qihua; Dinse, Gregg E
2011-04-01
Linear regression analysis has been studied extensively in a random censorship setting, but typically all of the censoring indicators are assumed to be observed. In this paper, we develop synthetic data methods for estimating regression parameters in a linear model when some censoring indicators are missing. We define estimators based on regression calibration, imputation, and inverse probability weighting techniques, and we prove all three estimators are asymptotically normal. The finite-sample performance of each estimator is evaluated via simulation. We illustrate our methods by assessing the effects of sex and age on the time to non-ambulatory progression for patients in a brain cancer clinical trial.
NASA Astrophysics Data System (ADS)
See, J. J.; Jamaian, S. S.; Salleh, R. M.; Nor, M. E.; Aman, F.
2018-04-01
This research aims to estimate the parameters of the Monod model for the growth of the microalga Botryococcus braunii sp. by the Least-Squares method. The Monod equation is a non-linear equation which can be transformed into a linear form and then solved by the Least-Squares linear regression method. Meanwhile, the Gauss-Newton method is an alternative method to solve the non-linear Least-Squares problem, with the aim of obtaining the parameter values of the Monod model by minimizing the sum of squared errors (SSE). As the result, the parameters of the Monod model for the microalga Botryococcus braunii sp. can be estimated by the Least-Squares method. However, the estimated parameter values obtained by the non-linear Least-Squares method are more accurate compared to the linear Least-Squares method, since the SSE of the non-linear Least-Squares method is less than that of the linear Least-Squares method.
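The sketch below reproduces the two fitting routes for the Monod model mu = mu_max*S/(Ks + S) on assumed synthetic data: a linearized least-squares fit of the Lineweaver-Burk form 1/mu = (Ks/mu_max)(1/S) + 1/mu_max, and a non-linear least-squares fit (scipy's curve_fit standing in for a Gauss-Newton-type solver), compared by their SSE.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import linregress

def monod(S, mu_max, Ks):
    return mu_max * S / (Ks + S)

rng = np.random.default_rng(4)
S = np.array([0.5, 1, 2, 5, 10, 20, 40.0])                    # substrate levels (toy)
mu = monod(S, mu_max=1.2, Ks=3.0) + rng.normal(0, 0.02, S.size)

# Linearized route: 1/mu = (Ks/mu_max) * (1/S) + 1/mu_max
lin = linregress(1 / S, 1 / mu)
mu_max_lin = 1.0 / lin.intercept
Ks_lin = lin.slope * mu_max_lin

# Non-linear route on the original equation
(mu_max_nl, Ks_nl), _ = curve_fit(monod, S, mu, p0=[1.0, 1.0])

def sse(mu_max, Ks):
    return float(np.sum((mu - monod(S, mu_max, Ks)) ** 2))

print(mu_max_lin, Ks_lin, sse(mu_max_lin, Ks_lin))
print(mu_max_nl, Ks_nl, sse(mu_max_nl, Ks_nl))   # typically the smaller SSE
```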
Detecting influential observations in nonlinear regression modeling of groundwater flow
Yager, Richard M.
1998-01-01
Nonlinear regression is used to estimate optimal parameter values in models of groundwater flow to ensure that differences between predicted and observed heads and flows do not result from nonoptimal parameter values. Parameter estimates can be affected, however, by observations that disproportionately influence the regression, such as outliers that exert undue leverage on the objective function. Certain statistics developed for linear regression can be used to detect influential observations in nonlinear regression if the models are approximately linear. This paper discusses the application of Cook's D, which measures the effect of omitting a single observation on a set of estimated parameter values, and the statistical parameter DFBETAS, which quantifies the influence of an observation on each parameter. The influence statistics were used to (1) identify the influential observations in the calibration of a three-dimensional, groundwater flow model of a fractured-rock aquifer through nonlinear regression, and (2) quantify the effect of omitting influential observations on the set of estimated parameter values. Comparison of the spatial distribution of Cook's D with plots of model sensitivity shows that influential observations correspond to areas where the model heads are most sensitive to certain parameters, and where predicted groundwater flow rates are largest. Five of the six discharge observations were identified as influential, indicating that reliable measurements of groundwater flow rates are valuable data in model calibration. DFBETAS are computed and examined for an alternative model of the aquifer system to identify a parameterization error in the model design that resulted in overestimation of the effect of anisotropy on horizontal hydraulic conductivity.
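As a hedged companion to the influence statistics named above, the sketch below computes Cook's D and DFBETAS for an ordinary linear regression with one planted high-leverage point; the groundwater application uses the nonlinear-regression analogues of these statistics, which this snippet does not attempt.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
x = rng.uniform(0, 10, 40)
y = 2.0 + 0.8 * x + rng.normal(0, 1.0, 40)
x[0], y[0] = 25.0, 5.0                      # planted influential observation

X = sm.add_constant(x)
fit = sm.OLS(y, X).fit()
infl = fit.get_influence()

cooks_d, _ = infl.cooks_distance            # effect of omitting each observation on all coefficients
dfbetas = infl.dfbetas                      # per-coefficient influence of each observation

worst = int(np.argmax(cooks_d))
print(worst, cooks_d[worst])                # the planted point dominates
print(dfbetas[worst])                       # which parameters it pulls hardest
```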
A method for nonlinear exponential regression analysis
NASA Technical Reports Server (NTRS)
Junkin, B. G.
1971-01-01
A computer-oriented technique is presented for performing a nonlinear exponential regression analysis on decay-type experimental data. The technique involves the least squares procedure wherein the nonlinear problem is linearized by expansion in a Taylor series. A linear curve fitting procedure for determining the initial nominal estimates for the unknown exponential model parameters is included as an integral part of the technique. A correction matrix was derived and then applied to the nominal estimate to produce an improved set of model parameters. The solution cycle is repeated until some predetermined criterion is satisfied.
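A compact sketch of the described procedure under assumed decay data: initial estimates come from a linear fit of log(y) on t, and each iteration solves the linearized (Taylor-expanded) least-squares problem for a correction to the parameters of y ≈ a·exp(-b·t), i.e. a plain Gauss-Newton loop.

```python
import numpy as np

rng = np.random.default_rng(6)
t = np.linspace(0, 5, 30)
y = 4.0 * np.exp(-0.9 * t) + rng.normal(0, 0.05, t.size)   # decay-type data (toy)

# Initial nominal estimates from the linearized model log(y) = log(a) - b*t
coef = np.polyfit(t, np.log(np.clip(y, 1e-6, None)), 1)
a, b = np.exp(coef[1]), -coef[0]

for _ in range(20):
    pred = a * np.exp(-b * t)
    resid = y - pred
    # Jacobian of the model with respect to (a, b)
    J = np.column_stack([np.exp(-b * t), -a * t * np.exp(-b * t)])
    delta, *_ = np.linalg.lstsq(J, resid, rcond=None)       # correction from the linearized problem
    a, b = a + delta[0], b + delta[1]
    if np.linalg.norm(delta) < 1e-10:                       # predetermined convergence criterion
        break

print(a, b)
```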
Cooley, Richard L.
1983-01-01
This paper investigates factors influencing the degree of improvement in estimates of parameters of a nonlinear regression groundwater flow model by incorporating prior information of unknown reliability. Consideration of expected behavior of the regression solutions and results of a hypothetical modeling problem lead to several general conclusions. First, if the parameters are properly scaled, linearized expressions for the mean square error (MSE) in parameter estimates of a nonlinear model will often behave very nearly as if the model were linear. Second, by using prior information, the MSE in properly scaled parameters can be reduced greatly over the MSE of ordinary least squares estimates of parameters. Third, plots of estimated MSE and the estimated standard deviation of MSE versus an auxiliary parameter (the ridge parameter) specifying the degree of influence of the prior information on regression results can help determine the potential for improvement of parameter estimates. Fourth, proposed criteria can be used to make appropriate choices for the ridge parameter and another parameter expressing degree of overall bias in the prior information. Results of a case study of Truckee Meadows, Reno-Sparks area, Washoe County, Nevada, conform closely to the results of the hypothetical problem. In the Truckee Meadows case, incorporation of prior information did not greatly change the parameter estimates from those obtained by ordinary least squares. However, the analysis showed that both sets of estimates are more reliable than suggested by the standard errors from ordinary least squares.
Javed, Faizan; Chan, Gregory S H; Savkin, Andrey V; Middleton, Paul M; Malouf, Philip; Steel, Elizabeth; Mackie, James; Lovell, Nigel H
2009-01-01
This paper uses non-linear support vector regression (SVR) to model the blood volume and heart rate (HR) responses in 9 hemodynamically stable kidney failure patients during hemodialysis. Using radial basis function (RBF) kernels, the non-parametric models of relative blood volume (RBV) change with time as well as percentage change in HR with respect to RBV were obtained. The ε-insensitive loss function was used for SVR modeling. Selection of the design parameters, which include capacity (C), insensitivity region (ε) and the RBF kernel parameter (sigma), was made based on a grid search approach and the selected models were cross-validated using the average mean square error (AMSE) calculated from testing data based on a k-fold cross-validation technique. Linear regression was also applied to fit the curves and the AMSE was calculated for comparison with SVR. For the model based on RBV with time, SVR gave a lower AMSE for both training (AMSE=1.5) as well as testing data (AMSE=1.4) compared to linear regression (AMSE=1.8 and 1.5). SVR also provided a better fit for HR with RBV for both training as well as testing data (AMSE=15.8 and 16.4) compared to linear regression (AMSE=25.2 and 20.1).
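An illustrative sketch of the model-selection scheme on synthetic data (not the dialysis recordings): ε-insensitive SVR with an RBF kernel, its C, ε and kernel width chosen by grid search under k-fold cross-validation, compared with plain linear regression by cross-validated mean squared error. The grid values and toy blood-volume curve are assumptions.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV, cross_val_score

rng = np.random.default_rng(7)
t = np.linspace(0, 4, 120).reshape(-1, 1)                    # time into the session (h)
rbv = -6 * (1 - np.exp(-t.ravel() / 1.5)) + rng.normal(0, 0.3, t.shape[0])  # toy % RBV change

grid = {"C": [1, 10, 100], "epsilon": [0.05, 0.1, 0.5], "gamma": [0.1, 1.0, 10.0]}
search = GridSearchCV(SVR(kernel="rbf"), grid, cv=5, scoring="neg_mean_squared_error")
search.fit(t, rbv)

svr_mse = -search.best_score_
lin_mse = -cross_val_score(LinearRegression(), t, rbv, cv=5,
                           scoring="neg_mean_squared_error").mean()
print(search.best_params_)
print(svr_mse, lin_mse)     # the non-linear fit usually has the lower cross-validated error here
```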
Using the Ridge Regression Procedures to Estimate the Multiple Linear Regression Coefficients
NASA Astrophysics Data System (ADS)
Gorgees, Hazim Mansoor; Mahdi, Fatimah Assim
2018-05-01
This article is concerned with comparing the performance of different types of ordinary ridge regression estimators that have already been proposed to estimate the regression parameters when near-exact linear relationships among the explanatory variables are present. For this situation, we employ data obtained from the tagi gas filling company during the period (2008-2010). The main result we reached is that the method based on the condition number performs better than the other methods, since it has a smaller mean square error (MSE) than the other stated methods.
Automating approximate Bayesian computation by local linear regression.
Thornton, Kevin R
2009-07-07
In several biological contexts, parameter inference often relies on computationally-intensive techniques. "Approximate Bayesian Computation", or ABC, methods based on summary statistics have become increasingly popular. A particular flavor of ABC based on using a linear regression to approximate the posterior distribution of the parameters, conditional on the summary statistics, is computationally appealing, yet no standalone tool exists to automate the procedure. Here, I describe a program to implement the method. The software package ABCreg implements the local linear-regression approach to ABC. The advantages are: 1. The code is standalone, and fully-documented. 2. The program will automatically process multiple data sets, and create unique output files for each (which may be processed immediately in R), facilitating the testing of inference procedures on simulated data, or the analysis of multiple data sets. 3. The program implements two different transformation methods for the regression step. 4. Analysis options are controlled on the command line by the user, and the program is designed to output warnings for cases where the regression fails. 5. The program does not depend on any particular simulation machinery (coalescent, forward-time, etc.), and therefore is a general tool for processing the results from any simulation. 6. The code is open-source, and modular. Examples of applying the software to empirical data from Drosophila melanogaster, and testing the procedure on simulated data, are shown. In practice, ABCreg simplifies implementing ABC based on local-linear regression.
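A compact sketch of the ABC-with-local-linear-regression adjustment that ABCreg automates, on a deliberately simple toy problem (estimating a normal mean from the sample mean). The tolerance, prior and simulation counts are assumptions, and the regression step is unweighted for brevity, whereas the full method of Beaumont et al. weights the accepted draws with an Epanechnikov kernel.

```python
import numpy as np

rng = np.random.default_rng(8)
obs = rng.normal(2.0, 1.0, 50)
s_obs = obs.mean()                                   # observed summary statistic

n_sims = 20_000
theta = rng.uniform(-5, 5, n_sims)                   # draws from a flat prior
s_sim = rng.normal(theta, 1.0 / np.sqrt(50))         # simulated summary statistic per draw

# Rejection step: keep the draws whose summaries are closest to the observed one
dist = np.abs(s_sim - s_obs)
keep = dist < np.quantile(dist, 0.01)
theta_acc, s_acc = theta[keep], s_sim[keep]

# Local linear regression of accepted parameters on (summary - observed summary);
# the adjusted draws are shifted along the fitted line to the observed summary.
Xreg = np.column_stack([np.ones(s_acc.size), s_acc - s_obs])
beta, *_ = np.linalg.lstsq(Xreg, theta_acc, rcond=None)
theta_adj = theta_acc - beta[1] * (s_acc - s_obs)

print(theta_acc.mean(), theta_acc.std())             # plain rejection posterior
print(theta_adj.mean(), theta_adj.std())             # regression-adjusted posterior
```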
Optimizing Support Vector Machine Parameters with Genetic Algorithm for Credit Risk Assessment
NASA Astrophysics Data System (ADS)
Manurung, Jonson; Mawengkang, Herman; Zamzami, Elviawaty
2017-12-01
Support vector machine (SVM) is a popular classification method known to have strong generalization capabilities. SVM can solve classification and regression problems with linear or nonlinear kernels, serving as a learning algorithm for both tasks. However, SVM also has a weakness: it is difficult to determine the optimal parameter values. SVM calculates the best linear separator on the input feature space according to the training data. To classify data which are non-linearly separable, SVM uses the kernel trick to transform the data into linearly separable data in a higher-dimensional feature space. The kernel trick uses various kinds of kernel functions, such as the linear, polynomial, radial basis function (RBF) and sigmoid kernels. Each function has parameters which affect the accuracy of SVM classification. To solve this problem, genetic algorithms are proposed as the search algorithm for optimal parameter values, thus increasing the best classification accuracy of SVM. Data were taken from the UCI repository of machine learning databases: Australian Credit Approval. The results show that the combination of SVM and genetic algorithms is effective in improving classification accuracy. Genetic algorithms have been shown to be effective in systematically finding optimal kernel parameters for SVM, instead of randomly selected kernel parameters. The best accuracy for the data was improved from that of the linear kernel (85.12%), polynomial (81.76%), RBF (77.22%) and sigmoid (78.70%) kernels. However, for bigger data sizes, this method is not practical because it takes a lot of time.
Independent contrasts and PGLS regression estimators are equivalent.
Blomberg, Simon P; Lefevre, James G; Wells, Jessie A; Waterhouse, Mary
2012-05-01
We prove that the slope parameter of the ordinary least squares regression of phylogenetically independent contrasts (PICs) conducted through the origin is identical to the slope parameter of the method of generalized least squares (GLSs) regression under a Brownian motion model of evolution. This equivalence has several implications: 1. Understanding the structure of the linear model for GLS regression provides insight into when and why phylogeny is important in comparative studies. 2. The limitations of the PIC regression analysis are the same as the limitations of the GLS model. In particular, phylogenetic covariance applies only to the response variable in the regression and the explanatory variable should be regarded as fixed. Calculation of PICs for explanatory variables should be treated as a mathematical idiosyncrasy of the PIC regression algorithm. 3. Since the GLS estimator is the best linear unbiased estimator (BLUE), the slope parameter estimated using PICs is also BLUE. 4. If the slope is estimated using different branch lengths for the explanatory and response variables in the PIC algorithm, the estimator is no longer the BLUE, so this is not recommended. Finally, we discuss whether or not and how to accommodate phylogenetic covariance in regression analyses, particularly in relation to the problem of phylogenetic uncertainty. This discussion is from both frequentist and Bayesian perspectives.
Predictive and mechanistic multivariate linear regression models for reaction development
Santiago, Celine B.; Guo, Jing-Yao
2018-01-01
Multivariate Linear Regression (MLR) models utilizing computationally-derived and empirically-derived physical organic molecular descriptors are described in this review. Several reports demonstrating the effectiveness of this methodological approach towards reaction optimization and mechanistic interrogation are discussed. A detailed protocol to access quantitative and predictive MLR models is provided as a guide for model development and parameter analysis. PMID:29719711
Adding a Parameter Increases the Variance of an Estimated Regression Function
ERIC Educational Resources Information Center
Withers, Christopher S.; Nadarajah, Saralees
2011-01-01
The linear regression model is one of the most popular models in statistics. It is also one of the simplest models in statistics. It has received applications in almost every area of science, engineering and medicine. In this article, the authors show that adding a predictor to a linear model increases the variance of the estimated regression…
Wang, Zheng-Xin; Hao, Peng; Yao, Pei-Yi
2017-01-01
The non-linear relationship between provincial economic growth and carbon emissions is investigated by using panel smooth transition regression (PSTR) models. The research indicates that, on the condition of separately taking Gross Domestic Product per capita (GDPpc), energy structure (Es), and urbanisation level (Ul) as transition variables, three models all reject the null hypothesis of a linear relationship, i.e., a non-linear relationship exists. The results show that the three models all contain only one transition function but different numbers of location parameters. The model taking GDPpc as the transition variable has two location parameters, while the other two models separately considering Es and Ul as the transition variables both contain one location parameter. The three models applied in the study all favourably describe the non-linear relationship between economic growth and CO2 emissions in China. It also can be seen that the conversion rate of the influence of Ul on per capita CO2 emissions is significantly higher than those of GDPpc and Es on per capita CO2 emissions. PMID:29236083
Carbon dioxide stripping in aquaculture -- part III: model verification
Colt, John; Watten, Barnaby; Pfeiffer, Tim
2012-01-01
Based on conventional mass transfer models developed for oxygen, the non-linear ASCE method, the 2-point method, and a one-parameter linear regression method were evaluated for carbon dioxide stripping data. For values of KLa(CO2) < approximately 1.5/h, the 2-point and ASCE methods fit the experimental data well, but the fit breaks down at higher values of KLa(CO2). How to correct KLa(CO2) for gas phase enrichment remains to be determined. The one-parameter linear regression model was used to vary C*(CO2) over the test, but it did not result in a better fit to the experimental data when compared to the ASCE or fixed C*(CO2) assumptions.
Weichenthal, Scott; Ryswyk, Keith Van; Goldstein, Alon; Bagg, Scott; Shekkarizfard, Maryam; Hatzopoulou, Marianne
2016-04-01
Existing evidence suggests that ambient ultrafine particles (UFPs) (<0.1µm) may contribute to acute cardiorespiratory morbidity. However, few studies have examined the long-term health effects of these pollutants owing in part to a need for exposure surfaces that can be applied in large population-based studies. To address this need, we developed a land use regression model for UFPs in Montreal, Canada using mobile monitoring data collected from 414 road segments during the summer and winter months between 2011 and 2012. Two different approaches were examined for model development including standard multivariable linear regression and a machine learning approach (kernel-based regularized least squares (KRLS)) that learns the functional form of covariate impacts on ambient UFP concentrations from the data. The final models included parameters for population density, ambient temperature and wind speed, land use parameters (park space and open space), length of local roads and rail, and estimated annual average NOx emissions from traffic. The final multivariable linear regression model explained 62% of the spatial variation in ambient UFP concentrations whereas the KRLS model explained 79% of the variance. The KRLS model performed slightly better than the linear regression model when evaluated using an external dataset (R²=0.58 vs. 0.55) or a cross-validation procedure (R²=0.67 vs. 0.60). In general, our findings suggest that the KRLS approach may offer modest improvements in predictive performance compared to standard multivariable linear regression models used to estimate spatial variations in ambient UFPs. However, differences in predictive performance were not statistically significant when evaluated using the cross-validation procedure. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.
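As a hedged illustration of the comparison above, the sketch below cross-validates a kernel ridge regression (scikit-learn's KernelRidge, standing in for kernel-based regularized least squares) against ordinary linear regression on synthetic covariates with a mild interaction; the covariates, their distributions and the hyperparameter values are assumptions, not the Montreal land use data.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(9)
n = 414
X = np.column_stack([
    rng.gamma(2, 2000, n),      # population density (toy)
    rng.normal(10, 8, n),       # ambient temperature (toy)
    rng.gamma(2, 50, n),        # NOx emissions proxy (toy)
])
ufp = (5000 + 0.4 * X[:, 0] + 80 * X[:, 2]
       + 0.002 * X[:, 0] * X[:, 2] + rng.normal(0, 3000, n))   # toy UFP concentration

krls = make_pipeline(StandardScaler(), KernelRidge(kernel="rbf", alpha=1.0, gamma=0.1))
ols = make_pipeline(StandardScaler(), LinearRegression())

print(cross_val_score(krls, X, ufp, cv=5, scoring="r2").mean())   # kernel model
print(cross_val_score(ols, X, ufp, cv=5, scoring="r2").mean())    # linear model
```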
Logistic regression for circular data
NASA Astrophysics Data System (ADS)
Al-Daffaie, Kadhem; Khan, Shahjahan
2017-05-01
This paper considers the relationship between a binary response and a circular predictor. It develops the logistic regression model by employing the linear-circular regression approach. The maximum likelihood method is used to estimate the parameters. The Newton-Raphson numerical method is used to find the estimated values of the parameters. A data set from weather records of Toowoomba city is analysed by the proposed methods. Moreover, a simulation study is considered. The R software is used for all computations and simulations.
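A minimal sketch of the linear-circular embedding described above, with assumed toy data in place of the Toowoomba weather records: the circular predictor enters through its cosine and sine, and an ordinary logistic regression is then fitted by maximum likelihood (statsmodels' Newton-type optimizer standing in for the Newton-Raphson step).

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(10)
theta = rng.uniform(0, 2 * np.pi, 300)                    # circular predictor (e.g. wind direction)
eta = -0.5 + 1.2 * np.cos(theta - np.pi / 3)              # true linear predictor on the circle
y = rng.binomial(1, 1 / (1 + np.exp(-eta)))               # binary response (e.g. rain / no rain)

X = sm.add_constant(np.column_stack([np.cos(theta), np.sin(theta)]))
fit = sm.Logit(y, X).fit(disp=False)                      # maximum likelihood estimation
print(fit.params)                                         # intercept, cos and sin coefficients
```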
Multiple regression for physiological data analysis: the problem of multicollinearity.
Slinker, B K; Glantz, S A
1985-07-01
Multiple linear regression, in which several predictor variables are related to a response variable, is a powerful statistical tool for gaining quantitative insight into complex in vivo physiological systems. For these insights to be correct, all predictor variables must be uncorrelated. However, in many physiological experiments the predictor variables cannot be precisely controlled and thus change in parallel (i.e., they are highly correlated). There is a redundancy of information about the response, a situation called multicollinearity, that leads to numerical problems in estimating the parameters in regression equations; the parameters are often of incorrect magnitude or sign or have large standard errors. Although multicollinearity can be avoided with good experimental design, not all interesting physiological questions can be studied without encountering multicollinearity. In these cases various ad hoc procedures have been proposed to mitigate multicollinearity. Although many of these procedures are controversial, they can be helpful in applying multiple linear regression to some physiological problems.
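A short sketch of a standard multicollinearity check for the situation described above: two deliberately near-collinear predictors, their variance inflation factors, and the inflated coefficient standard errors that result. The data and any threshold convention (e.g. VIF > 10) are assumptions, not taken from the paper.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(11)
n = 200
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + 0.05 * rng.normal(size=n)     # changes almost in parallel with x1
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)

X = sm.add_constant(np.column_stack([x1, x2]))
vifs = [variance_inflation_factor(X, i) for i in range(1, X.shape[1])]
print(vifs)                      # large VIFs flag the redundant predictors
print(sm.OLS(y, X).fit().bse)    # standard errors inflated by the multicollinearity
```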
Kovačević, Strahinja; Karadžić, Milica; Podunavac-Kuzmanović, Sanja; Jevrić, Lidija
2018-01-01
The present study is based on the quantitative structure-activity relationship (QSAR) analysis of binding affinity toward human prion protein (huPrPC) of quinacrine, pyridine dicarbonitrile, diphenylthiazole and diphenyloxazole analogs applying different linear and non-linear chemometric regression techniques, including univariate linear regression, multiple linear regression, partial least squares regression and artificial neural networks. The QSAR analysis distinguished molecular lipophilicity as an important factor that contributes to the binding affinity. Principal component analysis was used in order to reveal similarities or dissimilarities among the studied compounds. The analysis of in silico absorption, distribution, metabolism, excretion and toxicity (ADMET) parameters was conducted. The ranking of the studied analogs on the basis of their ADMET parameters was done applying the sum of ranking differences, as a relatively new chemometric method. The main aim of the study was to reveal the most important molecular features whose changes lead to the changes in the binding affinities of the studied compounds. Another point of view on the binding affinity of the most promising analogs was established by application of molecular docking analysis. The results of the molecular docking were proven to be in agreement with the experimental outcome. Copyright © 2017 Elsevier B.V. All rights reserved.
TG study of the Li0.4Fe2.4Zn0.2O4 ferrite synthesis
NASA Astrophysics Data System (ADS)
Lysenko, E. N.; Nikolaev, E. V.; Surzhikov, A. P.
2016-02-01
In this paper, the kinetic analysis of Li-Zn ferrite synthesis was studied using the thermogravimetry (TG) method through the simultaneous application of non-linear regression to several measurements run at different heating rates (multivariate non-linear regression). Using the TG curves obtained for the four heating rates and the Netzsch Thermokinetics software package, kinetic models with minimal adjustable parameters were selected to quantitatively describe the reaction of Li-Zn ferrite synthesis. It was shown that the experimental TG curves clearly suggest a two-step process for the ferrite synthesis, and therefore a model-fitting kinetic analysis based on multivariate non-linear regression was conducted. The complex reaction was described by a two-step reaction scheme consisting of sequential reaction steps. It is established that the best results were obtained using the Jander three-dimensional diffusion model at the first step and the Ginstling-Bronstein model at the second step. The kinetic parameters for the lithium-zinc ferrite synthesis reaction were found and discussed.
NASA Astrophysics Data System (ADS)
Choudhury, Anustup; Farrell, Suzanne; Atkins, Robin; Daly, Scott
2017-09-01
We present an approach to predict overall HDR display quality as a function of key HDR display parameters. We first performed subjective experiments on a high quality HDR display that explored five key HDR display parameters: maximum luminance, minimum luminance, color gamut, bit-depth and local contrast. Subjects rated overall quality for different combinations of these display parameters. We explored two models: a physical model solely based on physically measured display characteristics, and a perceptual model that transforms physical parameters using human vision system models. For the perceptual model, we use a family of metrics based on a recently published color volume model (ICtCp), which consists of the PQ luminance non-linearity (ST2084) and LMS-based opponent color, as well as an estimate of the display point spread function. To predict overall visual quality, we apply linear regression and machine learning techniques such as Multilayer Perceptron, RBF and SVM networks. We use RMSE and Pearson/Spearman correlation coefficients to quantify performance. We found that the perceptual model is better at predicting subjective quality than the physical model and that SVM is better at prediction than linear regression. The significance and contribution of each display parameter was investigated. In addition, we found that combined parameters such as contrast do not improve prediction. Traditional perceptual models were also evaluated and we found that models based on the PQ non-linearity performed better.
Quantum algorithm for linear regression
NASA Astrophysics Data System (ADS)
Wang, Guoming
2017-07-01
We present a quantum algorithm for fitting a linear regression model to a given data set using the least-squares approach. Differently from previous algorithms which yield a quantum state encoding the optimal parameters, our algorithm outputs these numbers in the classical form. So by running it once, one completely determines the fitted model and then can use it to make predictions on new data at little cost. Moreover, our algorithm works in the standard oracle model, and can handle data sets with nonsparse design matrices. It runs in time poly(log2(N), d, κ, 1/ɛ), where N is the size of the data set, d is the number of adjustable parameters, κ is the condition number of the design matrix, and ɛ is the desired precision in the output. We also show that the polynomial dependence on d and κ is necessary. Thus, our algorithm cannot be significantly improved. Furthermore, we also give a quantum algorithm that estimates the quality of the least-squares fit (without computing its parameters explicitly). This algorithm runs faster than the one for finding this fit, and can be used to check whether the given data set qualifies for linear regression in the first place.
Linear Regression between CIE-Lab Color Parameters and Organic Matter in Soils of Tea Plantations
NASA Astrophysics Data System (ADS)
Chen, Yonggen; Zhang, Min; Fan, Dongmei; Fan, Kai; Wang, Xiaochang
2018-02-01
To quantify the relationship between the soil organic matter and color parameters using the CIE-Lab system, 62 soil samples (0-10 cm, Ferralic Acrisols) from tea plantations were collected from southern China. After air-drying and sieving, numerical color information and reflectance spectra of soil samples were measured under laboratory conditions using an UltraScan VIS (HunterLab) spectrophotometer equipped with CIE-Lab color models. We found that soil total organic carbon (TOC) and nitrogen (TN) contents were negatively correlated with the L* value (lightness) (r = -0.84 and -0.80, respectively), a* value (correlation coefficient r = -0.51 and -0.46, respectively) and b* value (r = -0.76 and -0.70, respectively). There were also linear regressions between TOC and TN contents with the L* value and b* value. Results showed that color parameters from a spectrophotometer equipped with CIE-Lab color models can predict TOC contents well for soils in tea plantations. The linear regression model between color values and soil organic carbon contents showed it can be used as a rapid, cost-effective method to evaluate content of soil organic matter in Chinese tea plantations.
Estimating linear temporal trends from aggregated environmental monitoring data
Erickson, Richard A.; Gray, Brian R.; Eager, Eric A.
2017-01-01
Trend estimates are often used as part of environmental monitoring programs. These trends inform managers (e.g., are desired species increasing or undesired species decreasing?). Data collected from environmental monitoring programs are often aggregated (i.e., averaged), which confounds sampling and process variation. State-space models allow sampling variation and process variation to be separated. We used simulated time series to compare linear trend estimates from three state-space models, a simple linear regression model, and an auto-regressive model. We also compared the performance of these five models in estimating trends from a long-term monitoring program. We specifically estimated trends for two species of fish and four species of aquatic vegetation from the Upper Mississippi River system. We found that the simple linear regression had the best performance of all the given models because it was best able to recover parameters and had consistent numerical convergence. Conversely, the simple linear regression did the worst job of estimating populations in a given year. The state-space models did not estimate trends well, but estimated population sizes best when the models converged. We found that a simple linear regression performed better than more complex autoregressive and state-space models when used to analyze aggregated environmental monitoring data.
Fisz, Jacek J
2006-12-07
The optimization approach based on the genetic algorithm (GA) combined with the multiple linear regression (MLR) method is discussed. The GA-MLR optimizer is designed for nonlinear least-squares problems in which the model functions are linear combinations of nonlinear functions. GA optimizes the nonlinear parameters, and the linear parameters are calculated from MLR. GA-MLR is an intuitive optimization approach and it exploits all advantages of the genetic algorithm technique. This optimization method results from an appropriate combination of two well-known optimization methods. The MLR method is embedded in the GA optimizer, and linear and nonlinear model parameters are optimized in parallel. The MLR method is the only strictly mathematical "tool" involved in GA-MLR. The GA-MLR approach simplifies and accelerates the optimization process considerably because the linear parameters are not among the fitted ones. Its properties are exemplified by the analysis of the kinetic biexponential fluorescence decay surface corresponding to a two-excited-state interconversion process. A short discussion of the variable projection (VP) algorithm, designed for the same class of optimization problems, is presented. VP is a very advanced mathematical formalism that involves the methods of nonlinear functionals, the algebra of linear projectors, and the formalism of Fréchet derivatives and pseudo-inverses. Additional explanatory comments are added on the application of the recently introduced GA-NR optimizer to simultaneous recovery of linear and weakly nonlinear parameters occurring in the same optimization problem together with nonlinear parameters. The GA-NR optimizer combines the GA method with the NR method, in which the minimum-value condition for the quadratic approximation to χ², obtained from the Taylor series expansion of χ², is recovered by means of the Newton-Raphson algorithm. The application of the GA-NR optimizer to model functions which are multi-linear combinations of nonlinear functions is indicated. The VP algorithm does not distinguish the weakly nonlinear parameters from the nonlinear ones, and it does not apply to model functions which are multi-linear combinations of nonlinear functions.
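A hedged sketch of the separable structure GA-MLR exploits, for a biexponential decay: the outer optimizer searches only the nonlinear decay rates, and for every candidate the linear amplitudes are recovered by multiple linear regression (a least-squares solve). SciPy's differential evolution stands in for the genetic algorithm, and the decay constants, amplitudes and noise level are assumptions.

```python
import numpy as np
from scipy.optimize import differential_evolution

rng = np.random.default_rng(12)
t = np.linspace(0, 10, 200)
y = 3.0 * np.exp(-0.4 * t) + 1.0 * np.exp(-2.5 * t) + rng.normal(0, 0.02, t.size)

def chi2(rates):
    k1, k2 = rates
    basis = np.column_stack([np.exp(-k1 * t), np.exp(-k2 * t)])   # nonlinear basis functions
    amps, *_ = np.linalg.lstsq(basis, y, rcond=None)              # linear parameters via MLR
    return float(np.sum((y - basis @ amps) ** 2))

# Global search over the nonlinear decay rates only
result = differential_evolution(chi2, bounds=[(0.01, 5.0), (0.01, 5.0)], seed=0)
k1, k2 = sorted(result.x)

basis = np.column_stack([np.exp(-k1 * t), np.exp(-k2 * t)])
amps, *_ = np.linalg.lstsq(basis, y, rcond=None)                  # recover amplitudes at the optimum
print(k1, k2, amps)
```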
Estimation of octanol/water partition coefficients using LSER parameters
Luehrs, Dean C.; Hickey, James P.; Godbole, Kalpana A.; Rogers, Tony N.
1998-01-01
The logarithms of octanol/water partition coefficients, logKow, were regressed against the linear solvation energy relationship (LSER) parameters for a training set of 981 diverse organic chemicals. The standard deviation for logKow was 0.49. The regression equation was then used to estimate logKow for a test set of 146 chemicals which included pesticides and other diverse polyfunctional compounds. Thus the octanol/water partition coefficient may be estimated from LSER parameters without elaborate software, but only moderate accuracy should be expected.
Some comparisons of complexity in dictionary-based and linear computational models.
Gnecco, Giorgio; Kůrková, Věra; Sanguineti, Marcello
2011-03-01
Neural networks provide a more flexible approximation of functions than traditional linear regression. In the latter, one can only adjust the coefficients in linear combinations of fixed sets of functions, such as orthogonal polynomials or Hermite functions, while for neural networks, one may also adjust the parameters of the functions which are being combined. However, some useful properties of linear approximators (such as uniqueness, homogeneity, and continuity of best approximation operators) are not satisfied by neural networks. Moreover, optimization of parameters in neural networks becomes more difficult than in linear regression. Experimental results suggest that these drawbacks of neural networks are offset by substantially lower model complexity, allowing accuracy of approximation even in high-dimensional cases. We give some theoretical results comparing requirements on model complexity for two types of approximators, the traditional linear ones and so called variable-basis types, which include neural networks, radial, and kernel models. We compare upper bounds on worst-case errors in variable-basis approximation with lower bounds on such errors for any linear approximator. Using methods from nonlinear approximation and integral representations tailored to computational units, we describe some cases where neural networks outperform any linear approximator. Copyright © 2010 Elsevier Ltd. All rights reserved.
Shah, A A; Xing, W W; Triantafyllidis, V
2017-04-01
In this paper, we develop reduced-order models for dynamic, parameter-dependent, linear and nonlinear partial differential equations using proper orthogonal decomposition (POD). The main challenges are to accurately and efficiently approximate the POD bases for new parameter values and, in the case of nonlinear problems, to efficiently handle the nonlinear terms. We use a Bayesian nonlinear regression approach to learn the snapshots of the solutions and the nonlinearities for new parameter values. Computational efficiency is ensured by using manifold learning to perform the emulation in a low-dimensional space. The accuracy of the method is demonstrated on a linear and a nonlinear example, with comparisons with a global basis approach.
Multivariate meta-analysis for non-linear and other multi-parameter associations
Gasparrini, A; Armstrong, B; Kenward, M G
2012-01-01
In this paper, we formalize the application of multivariate meta-analysis and meta-regression to synthesize estimates of multi-parameter associations obtained from different studies. This modelling approach extends the standard two-stage analysis used to combine results across different sub-groups or populations. The most straightforward application is for the meta-analysis of non-linear relationships, described for example by regression coefficients of splines or other functions, but the methodology easily generalizes to any setting where complex associations are described by multiple correlated parameters. The modelling framework of multivariate meta-analysis is implemented in the package mvmeta within the statistical environment R. As an illustrative example, we propose a two-stage analysis for investigating the non-linear exposure–response relationship between temperature and non-accidental mortality using time-series data from multiple cities. Multivariate meta-analysis represents a useful analytical tool for studying complex associations through a two-stage procedure. Copyright © 2012 John Wiley & Sons, Ltd. PMID:22807043
Kumar, K Vasanth; Sivanesan, S
2006-08-25
Pseudo second order kinetic expressions of Ho, Sobkowsk and Czerwinski, Blanchard et al. and Ritchie were fitted to the experimental kinetic data of malachite green onto activated carbon by non-linear and linear methods. The non-linear method was found to be a better way of obtaining the parameters involved in the second order rate kinetic expressions. Both linear and non-linear regression showed that the Sobkowsk and Czerwinski and Ritchie's pseudo second order models were the same. Non-linear regression analysis showed that both Blanchard et al. and Ho have similar ideas on the pseudo second order model but with different assumptions. The best fit of experimental data in Ho's pseudo second order expression by linear and non-linear regression methods showed that Ho's pseudo second order model was a better kinetic expression when compared to other pseudo second order kinetic expressions. The amount of dye adsorbed at equilibrium, qe, was predicted from Ho's pseudo second order expression and fitted to the Langmuir, Freundlich and Redlich-Peterson expressions by both linear and non-linear methods to obtain the pseudo isotherms. The best fitting pseudo isotherms were found to be the Langmuir and Redlich-Peterson isotherms. Redlich-Peterson is a special case of Langmuir when the constant g equals unity.
Quantile regression models of animal habitat relationships
Cade, Brian S.
2003-01-01
Typically, all factors that limit an organism are not measured and included in statistical models used to investigate relationships with their environment. If important unmeasured variables interact multiplicatively with the measured variables, the statistical models often will have heterogeneous response distributions with unequal variances. Quantile regression is an approach for estimating the conditional quantiles of a response variable distribution in the linear model, providing a more complete view of possible causal relationships between variables in ecological processes. Chapter 1 introduces quantile regression and discusses the ordering characteristics, interval nature, sampling variation, weighting, and interpretation of estimates for homogeneous and heterogeneous regression models. Chapter 2 evaluates performance of quantile rankscore tests used for hypothesis testing and constructing confidence intervals for linear quantile regression estimates (0 ≤ τ ≤ 1). A permutation F test maintained better Type I errors than the Chi-square T test for models with smaller n, greater number of parameters p, and more extreme quantiles τ. Both versions of the test required weighting to maintain correct Type I errors when there was heterogeneity under the alternative model. An example application related trout densities to stream channel width:depth. Chapter 3 evaluates a drop in dispersion, F-ratio like permutation test for hypothesis testing and constructing confidence intervals for linear quantile regression estimates (0 ≤ τ ≤ 1). Chapter 4 simulates from a large (N = 10,000) finite population representing grid areas on a landscape to demonstrate various forms of hidden bias that might occur when the effect of a measured habitat variable on some animal was confounded with the effect of another unmeasured variable (spatially and not spatially structured). Depending on whether interactions of the measured habitat and unmeasured variable were negative (interference interactions) or positive (facilitation interactions), either upper (τ > 0.5) or lower (τ < 0.5) quantile regression parameters were less biased than mean rate parameters. Sampling (n = 20 - 300) simulations demonstrated that confidence intervals constructed by inverting rankscore tests provided valid coverage of these biased parameters. Quantile regression was used to estimate effects of physical habitat resources on a bivalve mussel (Macomona liliana) in a New Zealand harbor by modeling the spatial trend surface as a cubic polynomial of location coordinates.
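A brief sketch of the quantile regression machinery discussed in this abstract, fitting lower, median and upper conditional quantiles of a heteroscedastic response with statsmodels' QuantReg; the width:depth-style data are synthetic assumptions, not the trout or mussel datasets.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(13)
width_depth = rng.uniform(5, 60, 300)                 # stream channel width:depth ratio (toy)
# Heterogeneous response: the spread grows with the predictor, as when an
# unmeasured limiting factor interacts multiplicatively with the measured one.
density = 2 + 0.3 * width_depth + rng.normal(0, 0.08 * width_depth)

X = sm.add_constant(width_depth)
model = sm.QuantReg(density, X)
for tau in (0.1, 0.5, 0.9):
    print(tau, model.fit(q=tau).params)               # slopes differ across quantiles
```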
Application of stepwise multiple regression techniques to inversion of Nimbus 'IRIS' observations.
NASA Technical Reports Server (NTRS)
Ohring, G.
1972-01-01
Exploratory studies with Nimbus-3 infrared interferometer-spectrometer (IRIS) data indicate that, in addition to temperature, such meteorological parameters as geopotential heights of pressure surfaces, tropopause pressure, and tropopause temperature can be inferred from the observed spectra with the use of simple regression equations. The technique of screening the IRIS spectral data by means of stepwise regression to obtain the best radiation predictors of meteorological parameters is validated. The simplicity of application of the technique and the simplicity of the derived linear regression equations - which contain only a few terms - suggest usefulness for this approach. Based upon the results obtained, suggestions are made for further development and exploitation of the stepwise regression analysis technique.
Kumar, K Vasanth
2006-10-11
Batch kinetic experiments were carried out for the sorption of methylene blue onto activated carbon. The experimental kinetics were fitted to the pseudo first-order and pseudo second-order kinetics by linear and non-linear methods. Five different types of Ho's pseudo second-order expression have been discussed. A comparison of the linear least-squares method and a trial-and-error non-linear method of estimating the pseudo second-order rate kinetic parameters was examined. The sorption process was found to follow both pseudo first-order and pseudo second-order kinetic models. The present investigation showed that it is inappropriate to use the type 1 and type pseudo second-order expressions as proposed by Ho and Blanchard et al., respectively, for predicting the kinetic rate constants and the initial sorption rate for the studied system. Three correct possible alternate linear expressions (type 2 to type 4) to better predict the initial sorption rate and kinetic rate constants for the studied system (methylene blue/activated carbon) were proposed. The linear method was found to check only the hypothesis instead of verifying the kinetic model. The non-linear regression method was found to be the more appropriate method to determine the rate kinetic parameters.
Fungible weights in logistic regression.
Jones, Jeff A; Waller, Niels G
2016-06-01
In this article we develop methods for assessing parameter sensitivity in logistic regression models. To set the stage for this work, we first review Waller's (2008) equations for computing fungible weights in linear regression. Next, we describe 2 methods for computing fungible weights in logistic regression. To demonstrate the utility of these methods, we compute fungible logistic regression weights using data from the Centers for Disease Control and Prevention's (2010) Youth Risk Behavior Surveillance Survey, and we illustrate how these alternate weights can be used to evaluate parameter sensitivity. To make our work accessible to the research community, we provide R code (R Core Team, 2015) that will generate both kinds of fungible logistic regression weights. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Height and Weight Estimation From Anthropometric Measurements Using Machine Learning Regressions
Fernandes, Bruno J. T.; Roque, Alexandre
2018-01-01
Height and weight are measurements used to track nutritional disease, energy expenditure, clinical conditions, drug dosages, and infusion rates. Many patients are not ambulant or may be unable to communicate, and a combination of these factors may prevent accurate measurement; in those cases, height and weight can be estimated approximately by anthropometric means. Different groups have proposed different linear or non-linear equations whose coefficients are obtained by using single or multiple linear regressions. In this paper, we present a complete study of the application of different learning models to estimate height and weight from anthropometric measurements: support vector regression, Gaussian processes, and artificial neural networks. The predicted values are significantly more accurate than those obtained with conventional linear regressions. In all cases, the predictions are insensitive to ethnicity and to gender if more than two anthropometric parameters are analyzed. The learning model analysis creates new opportunities for anthropometric applications in industry, textile technology, security, and health care. PMID:29651366
Evaluation of confidence intervals for a steady-state leaky aquifer model
Christensen, S.; Cooley, R.L.
1999-01-01
The fact that dependent variables of groundwater models are generally nonlinear functions of model parameters is shown to be a potentially significant factor in calculating accurate confidence intervals for both model parameters and functions of the parameters, such as the values of dependent variables calculated by the model. The Lagrangian method of Vecchia and Cooley [Vecchia, A.V. and Cooley, R.L., Water Resources Research, 1987, 23(7), 1237-1250] was used to calculate nonlinear Scheffe-type confidence intervals for the parameters and the simulated heads of a steady-state groundwater flow model covering 450 km2 of a leaky aquifer. The nonlinear confidence intervals are compared to corresponding linear intervals. As suggested by the significant nonlinearity of the regression model, linear confidence intervals are often not accurate. The commonly made assumption that widths of linear confidence intervals always underestimate the actual (nonlinear) widths was not correct. Results show that nonlinear effects can cause the nonlinear intervals to be asymmetric and either larger or smaller than the linear approximations. Prior information on transmissivities helps reduce the size of the confidence intervals, with the most notable effects occurring for the parameters on which there is prior information and for head values in parameter zones for which there is prior information on the parameters.
Dong, J Q; Zhang, X Y; Wang, S Z; Jiang, X F; Zhang, K; Ma, G W; Wu, M Q; Li, H; Zhang, H
2018-01-01
Plasma very low-density lipoprotein (VLDL) can be used to select for low body fat or abdominal fat (AF) in broilers, but its correlation with AF is limited. We investigated whether any other biochemical indicator can be used in combination with VLDL for a better selective effect. Nineteen plasma biochemical indicators were measured in male chickens from the Northeast Agricultural University broiler lines divergently selected for AF content (NEAUHLF) in the fed state at 46 and 48 d of age. The average concentration of each parameter over the 2 d was used for statistical analysis. Levels of these 19 plasma biochemical parameters were compared between the lean and fat lines. The phenotypic correlations between these plasma biochemical indicators and AF traits were analyzed. Then, multiple linear regression models were constructed to select the best model for selecting against AF content, and the heritabilities of the plasma indicators contained in the best models were estimated. The results showed that 11 plasma biochemical indicators (triglycerides, total bile acid, total protein, globulin, albumin/globulin, aspartate transaminase, alanine transaminase, gamma-glutamyl transpeptidase, uric acid, creatinine, and VLDL) differed significantly between the lean and fat lines (P < 0.01) and correlated significantly with AF traits (P < 0.05). The best multiple linear regression models, based on albumin/globulin, VLDL, triglycerides, globulin, total bile acid, and uric acid, had a higher R2 (0.73) than the model based only on VLDL (0.21). The plasma parameters included in the best models had moderate heritability estimates (0.21 ≤ h2 ≤ 0.43). These results indicate that these multiple linear regression models can be used to select for lean broiler chickens. © 2017 Poultry Science Association Inc.
Adachi, Daiki; Nishiguchi, Shu; Fukutani, Naoto; Hotta, Takayuki; Tashiro, Yuto; Morino, Saori; Shirooka, Hidehiko; Nozaki, Yuma; Hirata, Hinako; Yamaguchi, Moe; Yorozu, Ayanori; Takahashi, Masaki; Aoyama, Tomoki
2017-05-01
The purpose of this study was to investigate which spatial and temporal parameters of the Timed Up and Go (TUG) test are associated with motor function in elderly individuals. This study included 99 community-dwelling women aged 72.9 ± 6.3 years. Step length, step width, single support time, variability of the aforementioned parameters, gait velocity, cadence, reaction time from the starting signal to the first step, and the minimum distance between the foot and a marker placed 3 m in front of the chair were measured using our analysis system. The 10-m walk test, five times sit-to-stand (FTSTS) test, and one-leg standing (OLS) test were used to assess motor function. Stepwise multivariate linear regression analysis was used to determine which TUG test parameters were associated with each motor function test. Finally, we calculated a predictive model for each motor function test using each regression coefficient. In the stepwise linear regression analysis, step length and cadence were significantly associated with the 10-m walk test, FTSTS test and OLS test. Reaction time was associated with the FTSTS test, and step width was associated with the OLS test. Each predictive model showed a strong correlation with the 10-m walk test and OLS test (P < 0.01), but the correlation was not significantly higher than that of the TUG test time. We showed which TUG test parameters were associated with each motor function test. Moreover, the TUG test time, regarded as a measure of lower extremity function and mobility, has strong predictive ability for each motor function test. Copyright © 2017 The Japanese Orthopaedic Association. Published by Elsevier B.V. All rights reserved.
a Comparison Between Two Ols-Based Approaches to Estimating Urban Multifractal Parameters
NASA Astrophysics Data System (ADS)
Huang, Lin-Shan; Chen, Yan-Guang
Multifractal theory provides a new spatial analytical tool for urban studies, but many basic problems remain to be solved. Among the various pending issues, the most significant one is how to obtain proper multifractal dimension spectra. If an algorithm is improperly used, the parameter spectra will be abnormal. This paper is devoted to investigating two ordinary least squares (OLS)-based approaches for estimating urban multifractal parameters. Using empirical study and comparative analysis, we demonstrate how to utilize an adequate linear regression to calculate multifractal parameters. The OLS regression analysis has two different approaches: one in which the intercept is fixed to zero, and one in which the intercept is not constrained. The results of the comparative study show that the zero-intercept regression yields proper multifractal parameter spectra within a certain scale range of moment orders, while the common regression method often leads to abnormal multifractal parameter values. A conclusion can be reached that fixing the intercept to zero is the more advisable regression method for multifractal parameter estimation, and that the shapes of the spectral curves and the value ranges of the fractal parameters can be employed to diagnose urban problems. This research helps scientists to understand multifractal models and to apply a more reasonable technique to multifractal parameter calculations.
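A minimal sketch of the two OLS approaches compared above, estimating a scaling exponent from synthetic log-log data with the intercept either free or fixed to zero.

    # Sketch: OLS with a free intercept versus OLS forced through the origin,
    # as used when estimating scaling exponents from log-log data. Synthetic data.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    x = np.log(np.logspace(0, 3, 30))               # log of scale
    y = 1.7 * x + rng.normal(0, 0.05, x.size)       # true zero intercept, slope plays the role of a dimension

    free = sm.OLS(y, sm.add_constant(x)).fit()      # intercept estimated
    zero = sm.OLS(y, x).fit()                       # intercept fixed to zero

    print("free intercept: slope =", free.params[1], " intercept =", free.params[0])
    print("through origin: slope =", zero.params[0])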
NASA Technical Reports Server (NTRS)
Wilson, R. M.; Reichmann, E. J.; Teuber, D. L.
1984-01-01
An empirical method is developed to predict certain parameters of future solar activity cycles. Sunspot cycle statistics are examined, and curve fitting and linear regression analysis techniques are utilized.
A Permutation Approach for Selecting the Penalty Parameter in Penalized Model Selection
Sabourin, Jeremy A; Valdar, William; Nobel, Andrew B
2015-01-01
We describe a simple, computationally efficient, permutation-based procedure for selecting the penalty parameter in LASSO-penalized regression. The procedure, permutation selection, is intended for applications where variable selection is the primary focus, and can be applied in a variety of structural settings, including that of generalized linear models. We briefly discuss connections between permutation selection and existing theory for the LASSO. In addition, we present a simulation study and an analysis of real biomedical data sets in which permutation selection is compared with selection based on the following: cross-validation (CV), the Bayesian information criterion (BIC), Scaled Sparse Linear Regression, and a selection method based on recently developed testing procedures for the LASSO. PMID:26243050
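A hedged, generic sketch of a permutation-based choice of the LASSO penalty: for each permutation of the response, compute the smallest penalty that zeroes every coefficient, then take a summary of those values; the published permutation-selection procedure may differ in its details.

    # Sketch of a permutation-based penalty choice for the LASSO (generic illustration;
    # not necessarily the exact published procedure).
    import numpy as np
    from sklearn.linear_model import Lasso

    def lambda_null(X, y):
        # Smallest sklearn alpha at which all LASSO coefficients are exactly zero:
        # alpha_max = max |X^T (y - mean(y))| / n for centered/standardized X.
        n = X.shape[0]
        return np.max(np.abs(X.T @ (y - y.mean()))) / n

    def permutation_alpha(X, y, n_perm=100, seed=0):
        rng = np.random.default_rng(seed)
        alphas = [lambda_null(X, rng.permutation(y)) for _ in range(n_perm)]
        return float(np.median(alphas))             # penalty that kills signal-free fits

    rng = np.random.default_rng(3)
    X = rng.normal(size=(100, 20))
    X = (X - X.mean(0)) / X.std(0)
    y = X[:, 0] * 2.0 + rng.normal(size=100)

    alpha = permutation_alpha(X, y)
    fit = Lasso(alpha=alpha).fit(X, y)
    print("selected alpha:", alpha, " nonzero coefficients:", int(np.sum(fit.coef_ != 0)))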
Scale of association: hierarchical linear models and the measurement of ecological systems
Sean M. McMahon; Jeffrey M. Diez
2007-01-01
A fundamental challenge to understanding patterns in ecological systems lies in employing methods that can analyse, test and draw inference from measured associations between variables across scales. Hierarchical linear models (HLM) use advanced estimation algorithms to measure regression relationships and variance-covariance parameters in hierarchically structured...
Bokhari, Syed Akhtar H; Khan, Ayyaz A; Butt, Arshad K; Hanif, Mohammad; Izhar, Mateen; Tatakis, Dimitris N; Ashfaq, Mohammad
2014-11-01
Few studies have examined the relationship of individual periodontal parameters with individual systemic biomarkers. This study assessed the possible association between specific clinical parameters of periodontitis and systemic biomarkers of coronary heart disease risk in coronary heart disease patients with periodontitis. Angiographically proven coronary heart disease patients with periodontitis (n = 317), aged >30 years and without other systemic illness were examined. Periodontal clinical parameters of bleeding on probing (BOP), probing depth (PD), and clinical attachment level (CAL) and systemic levels of high-sensitivity C-reactive protein (CRP), fibrinogen (FIB) and white blood cells (WBC) were noted and analyzed to identify associations through linear and stepwise multiple regression analyses. Unadjusted linear regression showed significant associations between periodontal and systemic parameters; the strongest association (r = 0.629; p < 0.001) was found between BOP and CRP levels, the periodontal and systemic inflammation marker, respectively. Stepwise regression analysis models revealed that BOP was a predictor of systemic CRP levels (p < 0.0001). BOP was the only periodontal parameter significantly associated with each systemic parameter (CRP, FIB, and WBC). In coronary heart disease patients with periodontitis, BOP is strongly associated with systemic CRP levels; this association possibly reflects the potential significance of the local periodontal inflammatory burden for systemic inflammation. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Least-Squares Data Adjustment with Rank-Deficient Data Covariance Matrices
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, J.G.
2011-07-01
A derivation of the linear least-squares adjustment formulae is required that avoids the assumption that the covariance matrix of prior parameters can be inverted. Possible proofs are of several kinds, including: (i) extension of standard results for the linear regression formulae, and (ii) minimization by differentiation of a quadratic form of the deviations in parameters and responses. In this paper, the least-squares adjustment equations are derived in both these ways, while explicitly assuming that the covariance matrix of prior parameters is singular. It will be proved that the solutions are unique and that, contrary to statements that have appeared in the literature, the least-squares adjustment problem is not ill-posed. No modification is required to the adjustment formulae that have been used in the past in the case of a singular covariance matrix for the priors. In conclusion: The linear least-squares adjustment formula that has been used in the past is valid in the case of a singular covariance matrix for the covariance matrix of prior parameters. Furthermore, it provides a unique solution. Statements in the literature, to the effect that the problem is ill-posed, are wrong. No regularization of the problem is required. This has been proved in the present paper by two methods, while explicitly assuming that the covariance matrix of prior parameters is singular: (i) extension of standard results for the linear regression formulae, and (ii) minimization by differentiation of a quadratic form of the deviations in parameters and responses. No modification is needed to the adjustment formulae that have been used in the past. (author)
Bowen, Stephen R; Chappell, Richard J; Bentzen, Søren M; Deveau, Michael A; Forrest, Lisa J; Jeraj, Robert
2012-01-01
Purpose To quantify associations between pre-radiotherapy and post-radiotherapy PET parameters via spatially resolved regression. Materials and methods Ten canine sinonasal cancer patients underwent PET/CT scans of [18F]FDG (FDGpre), [18F]FLT (FLTpre), and [61Cu]Cu-ATSM (Cu-ATSMpre). Following radiotherapy regimens of 50 Gy in 10 fractions, veterinary patients underwent FDG PET/CT scans at three months (FDGpost). Regression of standardized uptake values in baseline FDGpre, FLTpre and Cu-ATSMpre tumour voxels to those in FDGpost images was performed for linear, log-linear, generalized-linear and mixed-fit linear models. Goodness-of-fit in regression coefficients was assessed by R2. Hypothesis testing of coefficients over the patient population was performed. Results Multivariate linear model fits of FDGpre to FDGpost were significantly positive over the population (FDGpost ~ 0.17 FDGpre, p=0.03), and classified slopes of RECIST non-responders and responders to be different (0.37 vs. 0.07, p=0.01). Generalized-linear model fits related FDGpre to FDGpost by a linear power law (FDGpost ~ FDGpre^0.93, p<0.001). Univariate mixture model fits of FDGpre improved R2 from 0.17 to 0.52. Neither baseline FLT PET nor Cu-ATSM PET uptake contributed statistically significant multivariate regression coefficients. Conclusions Spatially resolved regression analysis indicates that pre-treatment FDG PET uptake is most strongly associated with three-month post-treatment FDG PET uptake in this patient population, though associations are histopathology-dependent. PMID:22682748
Linear regression techniques for use in the EC tracer method of secondary organic aerosol estimation
NASA Astrophysics Data System (ADS)
Saylor, Rick D.; Edgerton, Eric S.; Hartsell, Benjamin E.
A variety of linear regression techniques and simple slope estimators are evaluated for use in the elemental carbon (EC) tracer method of secondary organic carbon (OC) estimation. Linear regression techniques based on ordinary least squares are not suitable for situations where measurement uncertainties exist in both regressed variables. In the past, regression based on the method of Deming [1943. Statistical Adjustment of Data. Wiley, London] has been the preferred choice for EC tracer method parameter estimation. In agreement with Chu [2005. Stable estimate of primary OC/EC ratios in the EC tracer method. Atmospheric Environment 39, 1383-1392], we find that in the limited case where primary non-combustion OC (OCnon-comb) is assumed to be zero, the ratio of averages (ROA) approach provides a stable and reliable estimate of the primary OC-EC ratio, (OC/EC)pri. In contrast with Chu [2005. Stable estimate of primary OC/EC ratios in the EC tracer method. Atmospheric Environment 39, 1383-1392], however, we find that the optimal use of Deming regression (and the more general York et al. [2004. Unified equations for the slope, intercept, and standard errors of the best straight line. American Journal of Physics 72, 367-375] regression) provides excellent results as well. For the more typical case where OCnon-comb is allowed to obtain a non-zero value, we find that regression based on the method of York is the preferred choice for EC tracer method parameter estimation. In the York regression technique, detailed information on uncertainties in the measurement of OC and EC is used to improve the linear best fit to the given data. If only limited information is available on the relative uncertainties of OC and EC, then Deming regression should be used. On the other hand, use of ROA in the estimation of secondary OC, and thus the assumption of a zero OCnon-comb value, generally leads to an overestimation of the contribution of secondary OC to total measured OC.
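A minimal sketch of Deming regression with an assumed error-variance ratio delta; the OC/EC data are synthetic, and the closed-form slope shown is the standard Deming estimator rather than the exact implementation evaluated in the study.

    # Sketch: Deming regression slope/intercept for the primary (OC/EC) ratio when both
    # variables carry measurement error. delta is the assumed ratio of error variances
    # var(err_y)/var(err_x); data are synthetic.
    import numpy as np

    def deming(x, y, delta=1.0):
        mx, my = x.mean(), y.mean()
        sxx = np.sum((x - mx) ** 2)
        syy = np.sum((y - my) ** 2)
        sxy = np.sum((x - mx) * (y - my))
        slope = (syy - delta * sxx + np.sqrt((syy - delta * sxx) ** 2
                 + 4.0 * delta * sxy ** 2)) / (2.0 * sxy)
        return slope, my - slope * mx               # (slope, intercept)

    rng = np.random.default_rng(4)
    ec_true = rng.uniform(0.5, 3.0, 60)
    ec = ec_true + rng.normal(0, 0.1, 60)               # EC measured with error
    oc = 2.2 * ec_true + 0.8 + rng.normal(0, 0.1, 60)   # OC measured with error

    slope, intercept = deming(ec, oc, delta=1.0)
    print("(OC/EC)pri estimate:", slope, " OCnon-comb estimate:", intercept)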
Parameterizing sorption isotherms using a hybrid global-local fitting procedure.
Matott, L Shawn; Singh, Anshuman; Rabideau, Alan J
2017-05-01
Predictive modeling of the transport and remediation of groundwater contaminants requires an accurate description of the sorption process, which is usually provided by fitting an isotherm model to site-specific laboratory data. Commonly used calibration procedures, listed in order of increasing sophistication, include: trial-and-error, linearization, non-linear regression, global search, and hybrid global-local search. Given the considerable variability in fitting procedures applied in published isotherm studies, we investigated the importance of algorithm selection through a series of numerical experiments involving 13 previously published sorption datasets. These datasets, considered representative of state-of-the-art for isotherm experiments, had been previously analyzed using trial-and-error, linearization, or non-linear regression methods. The isotherm expressions were re-fit using a 3-stage hybrid global-local search procedure (i.e. global search using particle swarm optimization followed by Powell's derivative free local search method and Gauss-Marquardt-Levenberg non-linear regression). The re-fitted expressions were then compared to previously published fits in terms of the optimized weighted sum of squared residuals (WSSR) fitness function, the final estimated parameters, and the influence on contaminant transport predictions - where easily computed concentration-dependent contaminant retardation factors served as a surrogate measure of likely transport behavior. Results suggest that many of the previously published calibrated isotherm parameter sets were local minima. In some cases, the updated hybrid global-local search yielded order-of-magnitude reductions in the fitness function. In particular, of the candidate isotherms, the Polanyi-type models were most likely to benefit from the use of the hybrid fitting procedure. In some cases, improvements in fitness function were associated with slight (<10%) changes in parameter values, but in other cases significant (>50%) changes in parameter values were noted. Despite these differences, the influence of isotherm misspecification on contaminant transport predictions was quite variable and difficult to predict from inspection of the isotherms. Copyright © 2017 Elsevier B.V. All rights reserved.
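A minimal sketch of a hybrid global-local fit, assuming a Freundlich isotherm and using SciPy's differential evolution followed by local least squares as stand-ins for the particle swarm, Powell, and Gauss-Marquardt-Levenberg stages described above; data and bounds are illustrative.

    # Sketch of a hybrid global-local isotherm fit: a global search seeds a local
    # least-squares refinement. Freundlich model and data are illustrative.
    import numpy as np
    from scipy.optimize import differential_evolution, least_squares

    def freundlich(params, c):
        kf, n = params
        return kf * c ** (1.0 / n)

    def wssr(params, c, q):
        # Weighted sum of squared residuals (unit weights for simplicity).
        return np.sum((q - freundlich(params, c)) ** 2)

    rng = np.random.default_rng(5)
    c = np.linspace(0.1, 50, 15)                       # equilibrium concentration
    q = freundlich((4.0, 2.5), c) * (1 + 0.05 * rng.normal(size=c.size))

    # Stage 1: global search over wide bounds.
    glob = differential_evolution(wssr, bounds=[(0.01, 100.0), (0.5, 10.0)], args=(c, q), seed=1)

    # Stage 2: local refinement starting from the global optimum.
    loc = least_squares(lambda p: q - freundlich(p, c), x0=glob.x)
    print("global:", glob.x, " refined:", loc.x, " WSSR:", wssr(loc.x, c, q))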
SEMIPARAMETRIC QUANTILE REGRESSION WITH HIGH-DIMENSIONAL COVARIATES
Zhu, Liping; Huang, Mian; Li, Runze
2012-01-01
This paper is concerned with quantile regression for a semiparametric regression model, in which both the conditional mean and conditional variance function of the response given the covariates admit a single-index structure. This semiparametric regression model enables us to reduce the dimension of the covariates and simultaneously retains the flexibility of nonparametric regression. Under mild conditions, we show that the simple linear quantile regression offers a consistent estimate of the index parameter vector. This is a surprising and interesting result because the single-index model is possibly misspecified under the linear quantile regression. With a root-n consistent estimate of the index vector, one may employ a local polynomial regression technique to estimate the conditional quantile function. This procedure is computationally efficient, which is very appealing in high-dimensional data analysis. We show that the resulting estimator of the quantile function performs asymptotically as efficiently as if the true value of the index vector were known. The methodologies are demonstrated through comprehensive simulation studies and an application to a real dataset. PMID:24501536
Confidence Intervals for Assessing Heterogeneity in Generalized Linear Mixed Models
ERIC Educational Resources Information Center
Wagler, Amy E.
2014-01-01
Generalized linear mixed models are frequently applied to data with clustered categorical outcomes. The effect of clustering on the response is often difficult to practically assess partly because it is reported on a scale on which comparisons with regression parameters are difficult to make. This article proposes confidence intervals for…
Simplified large African carnivore density estimators from track indices.
Winterbach, Christiaan W; Ferreira, Sam M; Funston, Paul J; Somers, Michael J
2016-01-01
The range, population size and trend of large carnivores are important parameters to assess their status globally and to plan conservation strategies. One can use linear models to assess the population size and trends of large carnivores from track-based surveys on suitable substrates. The conventional approach of a linear model with an intercept may not pass through zero, yet it may fit the data better than a linear model through the origin. We assess whether a linear regression through the origin is more appropriate than a linear regression with an intercept to model large African carnivore densities and track indices. We performed simple linear regression with intercept and simple linear regression through the origin, and used the confidence interval for β in the linear model y = αx + β, the Standard Error of Estimate, the Mean Squares Residual and the Akaike Information Criterion to evaluate the models. The Lion on Clay and Low Density on Sand models with intercept were not significant (P > 0.05). The other four models with intercept and the six models through the origin were all significant (P < 0.05). The models using linear regression with intercept all included zero in the confidence interval for β, and the null hypothesis that β = 0 could not be rejected. All models showed that the linear model through the origin provided a better fit than the linear model with intercept, as indicated by the Standard Error of Estimate and Mean Square Residuals. The Akaike Information Criterion showed that the linear models through the origin were better and that none of the linear models with intercept had substantial support. Our results showed that linear regression through the origin is justified over the more typical linear regression with intercept for all models we tested. A general model can be used to estimate large carnivore densities from track densities across species and study areas. The formula observed track density = 3.26 × carnivore density can be used to estimate densities of large African carnivores using track counts on sandy substrates in areas where carnivore densities are 0.27 carnivores/100 km2 or higher. To improve the current models, we need independent data to validate them and data to test for a non-linear relationship between track indices and true density at low densities.
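A minimal sketch, on synthetic track-count data, of the comparison described above between a linear regression with intercept and one through the origin, using AIC and the confidence interval for the intercept; the coefficient 3.26 is reused only to simulate data.

    # Sketch: comparing a linear model with intercept against one through the origin
    # for track density vs. carnivore density, using AIC. Data are synthetic.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(6)
    density = rng.uniform(0.3, 15, 40)                 # carnivores / 100 km2
    tracks = 3.26 * density + rng.normal(0, 2.0, 40)   # track density, zero true intercept

    with_int = sm.OLS(tracks, sm.add_constant(density)).fit()
    through0 = sm.OLS(tracks, density).fit()

    print("with intercept : AIC =", with_int.aic, " intercept CI =", with_int.conf_int()[0])
    print("through origin : AIC =", through0.aic, " slope =", through0.params[0])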
Estimating effects of limiting factors with regression quantiles
Cade, B.S.; Terrell, J.W.; Schroeder, R.L.
1999-01-01
In a recent Concepts paper in Ecology, Thomson et al. emphasized that assumptions of conventional correlation and regression analyses fundamentally conflict with the ecological concept of limiting factors, and they called for new statistical procedures to address this problem. The analytical issue is that unmeasured factors may be the active limiting constraint and may induce a pattern of unequal variation in the biological response variable through an interaction with the measured factors. Consequently, changes near the maxima, rather than at the center of response distributions, are better estimates of the effects expected when the observed factor is the active limiting constraint. Regression quantiles provide estimates for linear models fit to any part of a response distribution, including near the upper bounds, and require minimal assumptions about the form of the error distribution. Regression quantiles extend the concept of one-sample quantiles to the linear model by solving an optimization problem of minimizing an asymmetric function of absolute errors. Rank-score tests for regression quantiles provide tests of hypotheses and confidence intervals for parameters in linear models with heteroscedastic errors, conditions likely to occur in models of limiting ecological relations. We used selected regression quantiles (e.g., 5th, 10th, ..., 95th) and confidence intervals to test hypotheses that parameters equal zero for estimated changes in average annual acorn biomass due to forest canopy cover of oak (Quercus spp.) and oak species diversity. Regression quantiles also were used to estimate changes in glacier lily (Erythronium grandiflorum) seedling numbers as a function of lily flower numbers, rockiness, and pocket gopher (Thomomys talpoides fossor) activity, data that motivated the query by Thomson et al. for new statistical procedures. Both example applications showed that effects of limiting factors estimated by changes in some upper regression quantile (e.g., 90-95th) were greater than if effects were estimated by changes in the means from standard linear model procedures. Estimating a range of regression quantiles (e.g., 5-95th) provides a comprehensive description of biological response patterns for exploratory and inferential analyses in observational studies of limiting factors, especially when sampling large spatial and temporal scales.
NASA Technical Reports Server (NTRS)
Mcgwire, K.; Friedl, M.; Estes, J. E.
1993-01-01
This article describes research related to sampling techniques for establishing linear relations between land surface parameters and remotely-sensed data. Predictive relations are estimated between percentage tree cover in a savanna environment and a normalized difference vegetation index (NDVI) derived from the Thematic Mapper sensor. Spatial autocorrelation in original measurements and regression residuals is examined using semi-variogram analysis at several spatial resolutions. Sampling schemes are then tested to examine the effects of autocorrelation on predictive linear models in cases of small sample sizes. Regression models between image and ground data are affected by the spatial resolution of analysis. Reducing the influence of spatial autocorrelation by enforcing minimum distances between samples may also improve empirical models which relate ground parameters to satellite data.
2014-01-01
This paper examined the efficiency of multivariate linear regression (MLR) and artificial neural network (ANN) models in predicting two major water quality parameters in a wastewater treatment plant. Biochemical oxygen demand (BOD) and chemical oxygen demand (COD), as indirect indicators of organic matter, are representative parameters for sewer water quality. Performance of the ANN models was evaluated using the coefficient of correlation (r), root mean square error (RMSE) and bias values. The values of BOD and COD computed by the ANN method and by regression analysis were in close agreement with their respective measured values. Results showed that the ANN model performed better than the MLR model. Comparative indices of the optimized ANN with input values of temperature (T), pH, total suspended solids (TSS) and total solids (TS) for prediction of BOD were RMSE = 25.1 mg/L and r = 0.83, and for prediction of COD were RMSE = 49.4 mg/L and r = 0.81. It was found that the ANN model could be employed successfully in estimating the BOD and COD in the inlet of wastewater biochemical treatment plants. Moreover, sensitivity analysis showed that the pH parameter has more effect on BOD and COD prediction than the other parameters. Also, both implemented models predicted BOD better than COD. PMID:24456676
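A hedged sketch comparing multivariate linear regression with a small neural network for BOD prediction; the input variables follow the abstract (T, pH, TSS, TS), but the data, network size, and preprocessing are illustrative assumptions.

    # Sketch: multivariate linear regression vs. a small neural network for predicting
    # BOD from T, pH, TSS and TS. All values are simulated stand-ins, not plant data.
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.neural_network import MLPRegressor
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import mean_squared_error

    rng = np.random.default_rng(13)
    n = 300
    X = np.column_stack([rng.uniform(10, 30, n),      # temperature, deg C
                         rng.uniform(6.5, 8.5, n),    # pH
                         rng.uniform(100, 600, n),    # TSS, mg/L
                         rng.uniform(300, 1200, n)])  # TS, mg/L
    bod = 0.4 * X[:, 2] + 0.05 * X[:, 3] + 5.0 * X[:, 0] + rng.normal(0, 20, n)

    X_tr, X_te, y_tr, y_te = train_test_split(X, bod, random_state=0)
    mlr = LinearRegression().fit(X_tr, y_tr)
    ann = make_pipeline(StandardScaler(),
                        MLPRegressor(hidden_layer_sizes=(10,), max_iter=5000,
                                     random_state=0)).fit(X_tr, y_tr)

    for name, model in [("MLR", mlr), ("ANN", ann)]:
        pred = model.predict(X_te)
        rmse = mean_squared_error(y_te, pred) ** 0.5
        print(name, "RMSE:", round(rmse, 1), " r:", round(np.corrcoef(y_te, pred)[0, 1], 3))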
Bao, Jie; Hou, Zhangshuan; Huang, Maoyi; ...
2015-12-04
Here, effective sensitivity analysis approaches are needed to identify important parameters or factors and their uncertainties in complex Earth system models composed of multi-phase multi-component phenomena and multiple biogeophysical-biogeochemical processes. In this study, the impacts of 10 hydrologic parameters in the Community Land Model on simulations of runoff and latent heat flux are evaluated using data from a watershed. Different metrics, including residual statistics, the Nash-Sutcliffe coefficient, and log mean square error, are used as alternative measures of the deviations between the simulated and field observed values. Four sensitivity analysis (SA) approaches, including analysis of variance based on the generalized linear model, generalized cross validation based on the multivariate adaptive regression splines model, standardized regression coefficients based on a linear regression model, and analysis of variance based on support vector machine, are investigated. Results suggest that these approaches show consistent measurement of the impacts of major hydrologic parameters on response variables, but with differences in the relative contributions, particularly for the secondary parameters. The convergence behaviors of the SA with respect to the number of sampling points are also examined with different combinations of input parameter sets and output response variables and their alternative metrics. This study helps identify the optimal SA approach, provides guidance for the calibration of the Community Land Model parameters to improve the model simulations of land surface fluxes, and approximates the magnitudes to be adjusted in the parameter values during parametric model optimization.
NASA Technical Reports Server (NTRS)
Whitlock, C. H.; Kuo, C. Y.
1979-01-01
The objective of this paper is to define optical physics and/or environmental conditions under which the linear multiple-regression should be applicable. An investigation of the signal-response equations is conducted and the concept is tested by application to actual remote sensing data from a laboratory experiment performed under controlled conditions. Investigation of the signal-response equations shows that the exact solution for a number of optical physics conditions is of the same form as a linearized multiple-regression equation, even if nonlinear contributions from surface reflections, atmospheric constituents, or other water pollutants are included. Limitations on achieving this type of solution are defined.
NASA Astrophysics Data System (ADS)
Polat, Esra; Gunay, Suleyman
2013-10-01
One of the problems encountered in Multiple Linear Regression (MLR) is multicollinearity, which causes overestimation of the regression parameters and an increase in the variance of these parameters. Hence, when multicollinearity is present, biased estimation procedures such as classical Principal Component Regression (CPCR) and Partial Least Squares Regression (PLSR) are performed. The SIMPLS algorithm is the leading PLSR algorithm because of its speed and efficiency, and because its results are easier to interpret. However, both CPCR and SIMPLS yield very unreliable results when the data set contains outlying observations. Therefore, Hubert and Vanden Branden (2003) presented a robust PCR (RPCR) method and a robust PLSR (RPLSR) method called RSIMPLS. In RPCR, a robust Principal Component Analysis (PCA) method for high-dimensional data is first applied to the independent variables; then, the dependent variables are regressed on the scores using a robust regression method. RSIMPLS is constructed from a robust covariance matrix for high-dimensional data and robust linear regression. The purpose of this study is to show the use of the RPCR and RSIMPLS methods on an econometric data set, and hence to compare the two methods on an inflation model of Turkey. The considered methods are compared in terms of predictive ability and goodness of fit by using a robust Root Mean Squared Error of Cross-validation (R-RMSECV), a robust R2 value and the Robust Component Selection (RCS) statistic.
Can Functional Cardiac Age be Predicted from ECG in a Normal Healthy Population
NASA Technical Reports Server (NTRS)
Schlegel, Todd; Starc, Vito; Leban, Manja; Sinigoj, Petra; Vrhovec, Milos
2011-01-01
In a normal healthy population, we desired to determine the most age-dependent conventional and advanced ECG parameters. We hypothesized that changes in several ECG parameters might correlate with age and together reliably characterize the functional age of the heart. Methods: An initial study population of 313 apparently healthy subjects was ultimately reduced to 148 subjects (74 men, 84 women, in the range from 10 to 75 years of age) after exclusion criteria. In all subjects, ECG recordings (resting 5-minute 12-lead high frequency ECG) were evaluated via custom software programs to calculate up to 85 different conventional and advanced ECG parameters including beat-to-beat QT and RR variability, waveform complexity, and signal-averaged, high-frequency and spatial/spatiotemporal ECG parameters. The prediction of functional age was evaluated by multiple linear regression analysis using the best 5 univariate predictors. Results: Ignoring what were ultimately small differences between males and females, the functional age was found to be predicted (R2= 0.69, P < 0.001) from a linear combination of 5 independent variables: QRS elevation in the frontal plane (p<0.001), a new repolarization parameter QTcorr (p<0.001), mean high frequency QRS amplitude (p=0.009), the variability parameter % VLF of RRV (p=0.021) and the P-wave width (p=0.10). Here, QTcorr represents the correlation between the calculated QT and the measured QT signal. Conclusions: In apparently healthy subjects with normal conventional ECGs, functional cardiac age can be estimated by multiple linear regression analysis of mostly advanced ECG results. Because some parameters in the regression formula, such as QTcorr, high frequency QRS amplitude and P-wave width also change with disease in the same direction as with increased age, increased functional age of the heart may reflect subtle age-related pathologies in cardiac electrical function that are usually hidden on conventional ECG.
Ramasubramanian, Viswanathan; Glasser, Adrian
2015-01-01
PURPOSE To determine whether relatively low-resolution ultrasound biomicroscopy (UBM) can predict the accommodative optical response in prepresbyopic eyes as well as in a previous study of young phakic subjects, despite lower accommodative amplitudes. SETTING College of Optometry, University of Houston, Houston, USA. DESIGN Observational cross-sectional study. METHODS Static accommodative optical response was measured with infrared photorefraction and an autorefractor (WR-5100K) in subjects aged 36 to 46 years. A 35 MHz UBM device (Vumax, Sonomed Escalon) was used to image the left eye, while the right eye viewed accommodative stimuli. Custom-developed Matlab image-analysis software was used to perform automated analysis of UBM images to measure the ocular biometry parameters. The accommodative optical response was predicted from biometry parameters using linear regression, 95% confidence intervals (CIs), and 95% prediction intervals. RESULTS The study evaluated 25 subjects. Per-diopter (D) accommodative changes in anterior chamber depth (ACD), lens thickness, anterior and posterior lens radii of curvature, and anterior segment length were similar to previous values from young subjects. The standard deviations (SDs) of accommodative optical response predicted from linear regressions for UBM-measured biometry parameters were ACD, 0.15 D; lens thickness, 0.25 D; anterior lens radii of curvature, 0.09 D; posterior lens radii of curvature, 0.37 D; and anterior segment length, 0.42 D. CONCLUSIONS Ultrasound biomicroscopy parameters can, on average, predict accommodative optical response with SDs of less than 0.55 D using linear regressions and 95% CIs. Ultrasound biomicroscopy can be used to visualize and quantify accommodative biometric changes and predict accommodative optical response in prepresbyopic eyes. PMID:26049831
Kim, J; Nagano, Y; Furumai, H
2012-01-01
Easy-to-measure surrogate parameters for water quality indicators are needed for real-time monitoring as well as for generating data for model calibration and validation. In this study, a novel linear regression model for estimating total nitrogen (TN) based on two surrogate parameters is proposed, based on evaluation of pollutant loads flowing into a eutrophic lake. Based on their runoff characteristics during wet weather, turbidity and electric conductivity (EC) were selected as surrogates for particulate nitrogen (PN) and dissolved nitrogen (DN), respectively. Strong linear relationships were established between PN and turbidity and between DN and EC, and both models were subsequently combined for estimation of TN. This model was evaluated by comparison of estimated and observed TN runoff loads during rainfall events. The analysis showed that turbidity and EC are viable surrogates for PN and DN, respectively, and that the linear regression model for TN concentration was successful in estimating TN runoff loads during rainfall events and also under dry weather conditions.
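A minimal sketch of the two-surrogate idea, assuming PN ~ turbidity and DN ~ EC are each fit by simple linear regression and summed to estimate TN; all coefficients and data are invented.

    # Sketch: estimating TN as the sum of two surrogate regressions,
    # PN ~ turbidity and DN ~ EC. Coefficients and data are illustrative.
    import numpy as np

    rng = np.random.default_rng(7)
    turbidity = rng.uniform(5, 200, 80)                # NTU
    ec = rng.uniform(50, 400, 80)                      # uS/cm
    pn = 0.012 * turbidity + rng.normal(0, 0.1, 80)    # particulate nitrogen, mg/L
    dn = 0.004 * ec + 0.2 + rng.normal(0, 0.1, 80)     # dissolved nitrogen, mg/L

    # Fit the two surrogate regressions separately.
    a_pn, b_pn = np.polyfit(turbidity, pn, 1)
    a_dn, b_dn = np.polyfit(ec, dn, 1)

    def tn_estimate(turb, cond):
        # TN estimate = estimated PN + estimated DN
        return (a_pn * turb + b_pn) + (a_dn * cond + b_dn)

    print("estimated TN at turbidity=100 NTU, EC=250 uS/cm:", tn_estimate(100.0, 250.0))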
A new approach to assess COPD by identifying lung function break-points
Eriksson, Göran; Jarenbäck, Linnea; Peterson, Stefan; Ankerst, Jaro; Bjermer, Leif; Tufvesson, Ellen
2015-01-01
Purpose COPD is a progressive disease, which can take different routes, leading to great heterogeneity. The aim of the post-hoc analysis reported here was to perform continuous analyses of advanced lung function measurements, using linear and nonlinear regressions. Patients and methods Fifty-one COPD patients with mild to very severe disease (Global Initiative for Chronic Obstructive Lung Disease [GOLD] Stages I–IV) and 41 healthy smokers were investigated post-bronchodilation by flow-volume spirometry, body plethysmography, diffusion capacity testing, and impulse oscillometry. The relationship between COPD severity, based on forced expiratory volume in 1 second (FEV1), and different lung function parameters was analyzed by flexible nonparametric method, linear regression, and segmented linear regression with break-points. Results Most lung function parameters were nonlinear in relation to spirometric severity. Parameters related to volume (residual volume, functional residual capacity, total lung capacity, diffusion capacity [diffusion capacity of the lung for carbon monoxide], diffusion capacity of the lung for carbon monoxide/alveolar volume) and reactance (reactance area and reactance at 5Hz) were segmented with break-points at 60%–70% of FEV1. FEV1/forced vital capacity (FVC) and resonance frequency had break-points around 80% of FEV1, while many resistance parameters had break-points below 40%. The slopes in percent predicted differed; resistance at 5 Hz minus resistance at 20 Hz had a linear slope change of −5.3 per unit FEV1, while residual volume had no slope change above and −3.3 change per unit FEV1 below its break-point of 61%. Conclusion Continuous analyses of different lung function parameters over the spirometric COPD severity range gave valuable information additional to categorical analyses. Parameters related to volume, diffusion capacity, and reactance showed break-points around 65% of FEV1, indicating that air trapping starts to dominate in moderate COPD (FEV1 =50%–80%). This may have an impact on the patient’s management plan and selection of patients and/or outcomes in clinical research. PMID:26508849
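A minimal sketch of segmented (break-point) linear regression via a grid search over candidate break-points; the FEV1/residual-volume data and the 61% break-point used to simulate them are illustrative, not the study's measurements.

    # Sketch: segmented (break-point) linear regression by grid search over candidate
    # break-points, as one simple way to locate a change in slope. Data are illustrative.
    import numpy as np

    def fit_segmented(x, y, candidates):
        best = None
        for bp in candidates:
            # Basis: intercept, x, and the hinge max(0, x - bp) that lets the slope change.
            X = np.column_stack([np.ones_like(x), x, np.clip(x - bp, 0, None)])
            beta, res, *_ = np.linalg.lstsq(X, y, rcond=None)
            sse = np.sum((y - X @ beta) ** 2)
            if best is None or sse < best[0]:
                best = (sse, bp, beta)
        return best                                    # (SSE, break-point, coefficients)

    rng = np.random.default_rng(8)
    fev1 = rng.uniform(20, 110, 120)                   # FEV1, % predicted
    rv = np.where(fev1 < 61, 100 - 3.3 * (fev1 - 61), 100) + rng.normal(0, 8, 120)

    sse, bp, beta = fit_segmented(fev1, rv, candidates=np.arange(40, 90, 1.0))
    print("estimated break-point near FEV1 =", bp, "% predicted; coefficients:", beta)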
Narayanan, Neethu; Gupta, Suman; Gajbhiye, V T; Manjaiah, K M
2017-04-01
A carboxy methyl cellulose-nano organoclay (nano montmorillonite modified with 35-45 wt% dimethyl dialkyl (C14-C18) amine (DMDA)) composite was prepared by the solution intercalation method. The prepared composite was characterized by infrared spectroscopy (FTIR), X-ray diffraction spectroscopy (XRD) and scanning electron microscopy (SEM). The composite was evaluated for its pesticide sorption efficiency for atrazine, imidacloprid and thiamethoxam. The sorption data were fitted to Langmuir and Freundlich isotherms using linear and non-linear methods. The linear regression method suggested the best fit of the sorption data to the Type II Langmuir and Freundlich isotherms. In order to avoid the bias resulting from linearization, seven different error parameters were also analyzed by the non-linear regression method. The non-linear error analysis suggested that the sorption data fitted the Langmuir model better than the Freundlich model. The maximum sorption capacity, Q0 (μg/g), was highest for imidacloprid (2000), followed by thiamethoxam (1667) and atrazine (1429). The study suggests that the coefficient of determination of the linear regression alone cannot be used for comparing the goodness of fit of the Langmuir and Freundlich models, and that non-linear error analysis is needed to avoid inaccurate results. Copyright © 2017 Elsevier Ltd. All rights reserved.
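A hedged sketch of non-linear isotherm fitting with a few of the error functions commonly used in this kind of analysis (SSE, chi-square, average relative error); the Langmuir parameters and data are illustrative, and the seven error parameters used in the study may differ.

    # Sketch: non-linear Langmuir fit with several error functions, illustrating why
    # the linearized R^2 alone can be a poor basis for model comparison. Synthetic data.
    import numpy as np
    from scipy.optimize import curve_fit

    def langmuir(c, q0, kl):
        return q0 * kl * c / (1.0 + kl * c)

    rng = np.random.default_rng(9)
    c = np.linspace(0.5, 40, 12)
    q = langmuir(c, 2000.0, 0.15) * (1 + 0.04 * rng.normal(size=c.size))

    (q0, kl), _ = curve_fit(langmuir, c, q, p0=(1500.0, 0.1))
    pred = langmuir(c, q0, kl)

    sse = np.sum((q - pred) ** 2)                      # sum of squared errors
    chi2 = np.sum((q - pred) ** 2 / pred)              # non-linear chi-square
    are = 100.0 * np.mean(np.abs((q - pred) / q))      # average relative error, %
    print(f"Q0={q0:.1f}  KL={kl:.3f}  SSE={sse:.1f}  chi2={chi2:.2f}  ARE={are:.2f}%")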
Identification of internal properties of fibers and micro-swimmers
NASA Astrophysics Data System (ADS)
Plouraboue, Franck; Thiam, Ibrahima; Delmotte, Blaise; Climent, Eric; PSC Collaboration
2016-11-01
In this presentation we discuss the identifiability of constitutive parameters of passive or active micro-swimmers. We first present a general framework for describing fibers or micro-swimmers using a bead-model description. Using a kinematic constraint formulation to describe fibers, flagella or cilia, we find an explicit linear relationship between elastic constitutive parameters and generalised velocities obtained from computing contact forces. This linear formulation then permits us to address identifiability conditions explicitly and to solve for parameter identification. We show that active forcing and passive parameters are each identifiable independently but not simultaneously. We also provide unbiased estimators for elastic parameters as well as active ones in the presence of Langevin-like forcing with Gaussian noise, using normal linear regression models and the maximum likelihood method. These theoretical results are illustrated in various configurations of relaxed or actuated passive fibers, and of active filaments of known passive properties, showing the efficiency of the proposed approach for direct parameter identification. The convergence of the proposed estimators is successfully tested numerically.
Linear regression models for solvent accessibility prediction in proteins.
Wagner, Michael; Adamczak, Rafał; Porollo, Aleksey; Meller, Jarosław
2005-04-01
The relative solvent accessibility (RSA) of an amino acid residue in a protein structure is a real number that represents the solvent exposed surface area of this residue in relative terms. The problem of predicting the RSA from the primary amino acid sequence can therefore be cast as a regression problem. Nevertheless, RSA prediction has so far typically been cast as a classification problem. Consequently, various machine learning techniques have been used within the classification framework to predict whether a given amino acid exceeds some (arbitrary) RSA threshold and would thus be predicted to be "exposed," as opposed to "buried." We have recently developed novel methods for RSA prediction using nonlinear regression techniques which provide accurate estimates of the real-valued RSA and outperform classification-based approaches with respect to commonly used two-class projections. However, while their performance seems to provide a significant improvement over previously published approaches, these Neural Network (NN) based methods are computationally expensive to train and involve several thousand parameters. In this work, we develop alternative regression models for RSA prediction which are computationally much less expensive, involve orders-of-magnitude fewer parameters, and are still competitive in terms of prediction quality. In particular, we investigate several regression models for RSA prediction using linear L1-support vector regression (SVR) approaches as well as standard linear least squares (LS) regression. Using rigorously derived validation sets of protein structures and extensive cross-validation analysis, we compare the performance of the SVR with that of LS regression and NN-based methods. In particular, we show that the flexibility of the SVR (as encoded by metaparameters such as the error insensitivity and the error penalization terms) can be very beneficial to optimize the prediction accuracy for buried residues. We conclude that the simple and computationally much more efficient linear SVR performs comparably to nonlinear models and thus can be used in order to facilitate further attempts to design more accurate RSA prediction methods, with applications to fold recognition and de novo protein structure prediction methods.
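A minimal sketch contrasting linear support vector regression with ordinary least squares on synthetic features, in the spirit of the comparison above; the feature construction and hyperparameters (epsilon, C) are assumptions, not the published setup.

    # Sketch: linear support vector regression vs. ordinary least squares for predicting a
    # real-valued target such as relative solvent accessibility. Features are synthetic.
    import numpy as np
    from sklearn.svm import LinearSVR
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(14)
    X = rng.normal(size=(500, 40))                     # e.g. window of sequence-derived features
    rsa = (X @ rng.normal(size=40)) * 0.1 + rng.normal(0, 0.1, 500)
    rsa = np.clip(rsa + 0.5, 0, 1)                     # RSA is bounded in [0, 1]

    svr = LinearSVR(epsilon=0.1, C=1.0, max_iter=10000).fit(X, rsa)
    ls = LinearRegression().fit(X, rsa)
    print("SVR mean abs error:", np.mean(np.abs(rsa - svr.predict(X))))
    print("LS  mean abs error:", np.mean(np.abs(rsa - ls.predict(X))))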
NASA Technical Reports Server (NTRS)
Rogers, R. H. (Principal Investigator)
1976-01-01
The author has identified the following significant results. Computer techniques were developed for mapping water quality parameters from LANDSAT data, using surface samples collected in an ongoing survey of water quality in Saginaw Bay. Chemical and biological parameters were measured on 31 July 1975 at 16 bay stations in concert with the LANDSAT overflight. Application of stepwise linear regression to nine of these parameters and the corresponding LANDSAT measurements for bands 4 and 5 only resulted in regression correlation coefficients that varied from 0.94 for temperature to 0.73 for Secchi depth. Regression equations expressed with the pair of bands 4 and 5, rather than the ratio band 4/band 5, provided higher correlation coefficients for all the water quality parameters studied (temperature, Secchi depth, chloride, conductivity, total kjeldahl nitrogen, total phosphorus, chlorophyll a, total solids, and suspended solids).
How is the weather? Forecasting inpatient glycemic control
Saulnier, George E; Castro, Janna C; Cook, Curtiss B; Thompson, Bithika M
2017-01-01
Aim: Apply methods of damped trend analysis to forecast inpatient glycemic control. Method: Observed and calculated point-of-care blood glucose data trends were determined over 62 weeks. Mean absolute percent error was used to calculate differences between observed and forecasted values. Comparisons were drawn between model results and linear regression forecasting. Results: The forecasted mean glucose trends observed during the first 24 and 48 weeks of projections compared favorably to the results provided by linear regression forecasting. However, in some scenarios, the damped trend method changed inferences compared with linear regression. In all scenarios, mean absolute percent error values remained below the 10% accepted by demand industries. Conclusion: Results indicate that forecasting methods historically applied within demand industries can project future inpatient glycemic control. Additional study is needed to determine if forecasting is useful in the analyses of other glucometric parameters and, if so, how to apply the techniques to quality improvement. PMID:29134125
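A hedged sketch of damped-trend exponential smoothing versus linear-regression extrapolation, with MAPE as the accuracy measure; the weekly glucose series is simulated, and recent statsmodels releases use the keyword damped_trend (older ones used damped).

    # Sketch: damped-trend exponential smoothing vs. linear-regression extrapolation for
    # a weekly mean-glucose series. Data are synthetic.
    import numpy as np
    from statsmodels.tsa.holtwinters import ExponentialSmoothing

    rng = np.random.default_rng(10)
    weeks = np.arange(62)
    glucose = 180 - 0.4 * weeks + rng.normal(0, 4, weeks.size)   # weekly mean POC glucose

    train, test = glucose[:48], glucose[48:]

    damped = ExponentialSmoothing(train, trend="add", damped_trend=True).fit()
    fc_damped = damped.forecast(len(test))

    slope, intercept = np.polyfit(np.arange(len(train)), train, 1)
    fc_linear = intercept + slope * np.arange(len(train), len(train) + len(test))

    def mape(actual, forecast):
        # Mean absolute percent error between observed and forecasted values.
        return 100.0 * np.mean(np.abs((actual - forecast) / actual))

    print("MAPE damped trend     :", mape(test, fc_damped))
    print("MAPE linear regression:", mape(test, fc_linear))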
Posa, Mihalj; Pilipović, Ana; Lalić, Mladena; Popović, Jovan
2011-02-15
A linear dependence between temperature (t) and the retention coefficient (k, reversed-phase HPLC) of bile acids is obtained. The parameters (a, intercept and b, slope) of the linear function k=f(t) correlate highly with the bile acids' structures. The investigated bile acids form linear congeneric groups on a principal component score plot (calculated from k=f(t)) that are in accordance with the conformations of the hydroxyl and oxo groups in the bile acid steroid skeleton. The partition coefficient (K(p)) of nitrazepam in bile acid micelles is investigated. Nitrazepam molecules incorporated in micelles show modified bioavailability (depot effect, higher permeability, etc.). Using the multiple linear regression method, QSAR models of nitrazepam's partition coefficient K(p) are derived at temperatures of 25°C and 37°C. For deriving the linear regression models at both temperatures, experimentally obtained lipophilicity parameters (PC1 from the k=f(t) data) and in silico descriptors of molecular shape are included, while at the higher temperature molecular polarisation is introduced. This indicates that the mechanism of incorporation of nitrazepam in BA micelles changes at higher temperatures. QSAR models are also derived using the partial least squares method. The experimental parameters k=f(t) are shown to be significant predictive variables. Both QSAR models were validated using cross-validation and the internal validation method. The PLS models have slightly higher predictive capability than the MLR models. Copyright © 2010 Elsevier B.V. All rights reserved.
Meng, Yilin; Roux, Benoît
2015-08-11
The weighted histogram analysis method (WHAM) is a standard protocol for postprocessing the information from biased umbrella sampling simulations to construct the potential of mean force with respect to a set of order parameters. By virtue of the WHAM equations, the unbiased density of state is determined by satisfying a self-consistent condition through an iterative procedure. While the method works very effectively when the number of order parameters is small, its computational cost grows rapidly in higher dimension. Here, we present a simple and efficient alternative strategy, which avoids solving the self-consistent WHAM equations iteratively. An efficient multivariate linear regression framework is utilized to link the biased probability densities of individual umbrella windows and yield an unbiased global free energy landscape in the space of order parameters. It is demonstrated with practical examples that free energy landscapes that are comparable in accuracy to WHAM can be generated at a small fraction of the cost.
NASA Astrophysics Data System (ADS)
Mansoor Gorgees, Hazim; Hilal, Mariam Mohammed
2018-05-01
Fatigue cracking is one of the common types of pavement distress and is an indicator of structural failure; cracks allow moisture infiltration, cause roughness, and may further deteriorate into potholes. Causes of pavement deterioration include traffic loading, environmental influences, drainage deficiencies, materials quality problems, construction deficiencies and external contributors. Many researchers have built models containing variables such as asphalt content, asphalt viscosity, fatigue life, stiffness of the asphalt mixture, temperature and other parameters that affect fatigue life. For this situation, a fuzzy linear regression model was employed and analyzed using the traditional methods and our proposed method in order to overcome the multicollinearity problem. The total spread error was used as a criterion to compare the performance of the studied methods. A simulation program was used to obtain the required results.
NASA Technical Reports Server (NTRS)
Stephenson, J. D.
1983-01-01
Flight experiments with an augmented jet flap STOL aircraft provided data from which the lateral directional stability and control derivatives were calculated by applying a linear regression parameter estimation procedure. The tests, which were conducted with the jet flaps set at a 65 deg deflection, covered a large range of angles of attack and engine power settings. The effect of changing the angle of the jet thrust vector was also investigated. Test results are compared with stability derivatives that had been predicted. The roll damping derived from the tests was significantly larger than had been predicted, whereas the other derivatives were generally in agreement with the predictions. Results obtained using a maximum likelihood estimation procedure are compared with those from the linear regression solutions.
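A hedged sketch of one common linear regression (equation-error) formulation for estimating stability and control derivatives: the measured roll acceleration is regressed on sideslip, roll rate, yaw rate, and control deflections; all signals and derivative values are synthetic and the report's exact procedure may differ.

    # Sketch: equation-error (linear regression) estimation of lateral-directional
    # derivatives from state and control time histories. All signals are synthetic.
    import numpy as np

    rng = np.random.default_rng(11)
    n = 600
    beta = 0.05 * np.sin(np.linspace(0, 20, n)) + 0.005 * rng.normal(size=n)  # sideslip, rad
    p = 0.20 * np.sin(np.linspace(0, 15, n))                                  # roll rate, rad/s
    r = 0.05 * np.cos(np.linspace(0, 15, n))                                  # yaw rate, rad/s
    da = 0.10 * np.sign(np.sin(np.linspace(0, 5, n)))                         # aileron, rad
    dr = 0.05 * np.sign(np.cos(np.linspace(0, 5, n)))                         # rudder, rad

    true = np.array([-2.0, -8.0, 1.5, 6.0, 0.8])        # Lbeta, Lp, Lr, Lda, Ldr (illustrative)
    X = np.column_stack([beta, p, r, da, dr])
    pdot = X @ true + 0.05 * rng.normal(size=n)          # "measured" roll acceleration

    # Least-squares (regression) estimate of the derivatives.
    est, *_ = np.linalg.lstsq(X, pdot, rcond=None)
    print("estimated [Lbeta, Lp, Lr, Lda, Ldr]:", np.round(est, 2))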
NASA Technical Reports Server (NTRS)
Whitlock, C. H., III
1977-01-01
Constituents with linear radiance gradients with concentration may be quantified from signals which contain nonlinear atmospheric and surface reflection effects for both homogeneous and non-homogeneous water bodies provided accurate data can be obtained and nonlinearities are constant with wavelength. Statistical parameters must be used which give an indication of bias as well as total squared error to insure that an equation with an optimum combination of bands is selected. It is concluded that the effect of error in upwelled radiance measurements is to reduce the accuracy of the least square fitting process and to increase the number of points required to obtain a satisfactory fit. The problem of obtaining a multiple regression equation that is extremely sensitive to error is discussed.
Refractive Status at Birth: Its Relation to Newborn Physical Parameters at Birth and Gestational Age
Varghese, Raji Mathew; Sreenivas, Vishnubhatla; Puliyel, Jacob Mammen; Varughese, Sara
2009-01-01
Background Refractive status at birth is related to gestational age. Preterm babies have myopia, which decreases as gestational age increases, and term babies are known to be hypermetropic. This study looked at the correlation of refractive status with birth weight in term and preterm babies, and with physical indicators of intra-uterine growth such as the head circumference and length of the baby at birth. Methods All babies delivered at St. Stephens Hospital and admitted in the nursery were eligible for the study. Refraction was performed within the first week of life. 0.8% tropicamide with 0.5% phenylephrine was used to achieve cycloplegia and paralysis of accommodation. 599 newborn babies participated in the study. Data pertaining to the right eye are used for all analyses except that for anisometropia, where the two eyes were compared. Growth parameters were measured soon after birth. Simple linear regression analysis was performed to assess the association of refractive status (mean spherical equivalent (MSE), astigmatism and anisometropia) with each of the study variables, namely gestation, length, weight and head circumference. Subsequently, multiple linear regression was carried out to identify the independent predictors for each of the outcome parameters. Results Simple linear regression showed a significant relation between all 4 study variables and refractive error, but in multiple regression only gestational age and weight were related to refractive error. The partial correlation of weight with MSE adjusted for gestation was 0.28 and that of gestation with MSE adjusted for weight was 0.10. Birth weight had a higher correlation to MSE than gestational age. Conclusion This is the first study to look at refractive error against all these growth parameters, in preterm and term babies at birth. It would appear from this study that birth weight rather than gestation should be used as the criterion for screening for refractive error, especially in developing countries where the incidence of intrauterine malnutrition is higher. PMID:19214228
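A hedged numerical illustration of the partial-correlation statistic quoted above: correlate the residuals of MSE and of weight after each has been regressed on gestation. All numbers below are synthetic, not the study's data.

```python
import numpy as np

rng = np.random.default_rng(9)
gest = rng.normal(38, 2, 300)                    # gestational age (weeks), synthetic
weight = 0.15 * gest + rng.normal(0, 0.4, 300)   # birth weight (kg), synthetic
mse = -2.0 + 0.05 * gest + 0.8 * weight + rng.normal(0, 0.5, 300)  # MSE (D), synthetic

def residuals(y, x):
    # residuals of a simple linear regression of y on x
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

# partial correlation of MSE and weight, adjusted for gestation
r_partial = np.corrcoef(residuals(mse, gest), residuals(weight, gest))[0, 1]
print(round(r_partial, 2))
```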
DOE Office of Scientific and Technical Information (OSTI.GOV)
Seong W. Lee
2004-10-01
The systematic tests of the gasifier simulator on the clean thermocouple were completed in this reporting period. Within these tests, five (5) factors were considered as experimental parameters: air flow rate, water flow rate, fine dust particle amount, ammonia addition, and a high/low frequency device (electric motor). The fractional factorial design method was used in the experiment design, with sixteen (16) data sets of readings. Analysis of Variance (ANOVA) was applied to the results of the systematic tests. The ANOVA results show that the un-balanced motor vibration frequency did not have a significant impact on the temperature changes in the gasifier simulator. For the fine dust particle testing, the amount of fine dust particles had a significant impact on the temperature measurements in the gasifier simulator. The effects of the air and water on the temperature measurements show the same results as reported in the previous report. The ammonia concentration was included as an experimental parameter for the reducing environment in this reporting period; it does not appear to be a significant factor in the temperature changes. Linear regression analysis was applied to the temperature readings with the five (5) factors. The accuracy of the linear regression is relatively low, less than 10%. Nonlinear regression was also applied to the temperature readings with the same factors. Since the experiments were designed at two (2) levels, the nonlinear regression is not very effective with this dataset (16 readings). An extra central-point test was conducted. With the data from the center-point test, the accuracy of the nonlinear regression is much better than that of the linear regression.
A comparative look at sunspot cycles
NASA Technical Reports Server (NTRS)
Wilson, R. M.
1984-01-01
On the basis of cycles 8 through 20, spanning about 143 years, observations of sunspot number, smoothed sunspot number, and their temporal properties were used to compute means, standard deviations, ranges, and frequency of occurrence histograms for a number of sunspot cycle parameters. The resultant schematic sunspot cycle was contrasted with the mean sunspot cycle, obtained by averaging smoothed sunspot number as a function of time, tying all cycles (8 through 20) to their minimum occurrence date. A relatively good approximation of the time variation of smoothed sunspot number for a given cycle is possible if sunspot cycles are regarded in terms of being either HIGH- or LOW-R(MAX) cycles or LONG- or SHORT-PERIOD cycles, especially the latter. Linear regression analyses were performed comparing late cycle parameters with early cycle parameters and solar cycle number. The early occurring cycle parameters can be used to estimate later occurring cycle parameters with relatively good success, based on cycle 21 as an example. The sunspot cycle record clearly shows that the trend for both R(MIN) and R(MAX) was toward decreasing value between cycles 8 through 14 and toward increasing value between cycles 14 through 20. Linear regression equations were also obtained for several measures of solar activity.
Efficient least angle regression for identification of linear-in-the-parameters models
Beach, Thomas H.; Rezgui, Yacine
2017-01-01
Least angle regression, as a promising model selection method, differentiates itself from conventional stepwise and stagewise methods, in that it is neither too greedy nor too slow. It is closely related to L1 norm optimization, which has the advantage of low prediction variance through sacrificing part of model bias property in order to enhance model generalization capability. In this paper, we propose an efficient least angle regression algorithm for model selection for a large class of linear-in-the-parameters models with the purpose of accelerating the model selection process. The entire algorithm works completely in a recursive manner, where the correlations between model terms and residuals, the evolving directions and other pertinent variables are derived explicitly and updated successively at every subset selection step. The model coefficients are only computed when the algorithm finishes. The direct involvement of matrix inversions is thereby relieved. A detailed computational complexity analysis indicates that the proposed algorithm possesses significant computational efficiency, compared with the original approach where the well-known efficient Cholesky decomposition is involved in solving least angle regression. Three artificial and real-world examples are employed to demonstrate the effectiveness, efficiency and numerical stability of the proposed algorithm. PMID:28293140
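For reference, the standard least angle regression path is available in scikit-learn; the sketch below shows the term-selection order on synthetic candidate terms and is not the recursive algorithm proposed in the paper.

```python
import numpy as np
from sklearn.linear_model import lars_path

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 20))       # 20 synthetic candidate model terms
y = X[:, 3] - 2.0 * X[:, 7] + 0.1 * rng.standard_normal(200)

# LAR path: `active` lists the terms in the order they enter the model
alphas, active, coefs = lars_path(X, y, method="lar")
print("selection order:", active)
print("final coefficients:", np.round(coefs[:, -1], 2))
```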
A Study of the Effect of the Front-End Styling of Sport Utility Vehicles on Pedestrian Head Injuries
Qin, Qin; Chen, Zheng; Bai, Zhonghao; Cao, Libo
2018-01-01
Background The number of sport utility vehicles (SUVs) on the Chinese market is continuously increasing. It is necessary to investigate the relationships between the front-end styling features of SUVs and head injuries at the styling design stage in order to improve pedestrian protection performance and product development efficiency. Methods Styling feature parameters were extracted from the SUV side contour line, and simplified finite element models were established based on the 78 SUV side contour lines. Pedestrian headform impact simulations were performed and validated. The head injury criterion of 15 ms (HIC15) at four wrap-around distances was obtained. A multiple linear regression analysis method was employed to describe the relationships between the styling feature parameters and the HIC15 at each impact point. Results The relationships between the selected styling features and the HIC15 showed reasonable correlations, and the regression models and the selected independent variables were statistically significant. Conclusions The regression equations obtained by multiple linear regression can be used to assess the performance of SUV styling in protecting pedestrians' heads and provide styling designers with technical guidance regarding their artistic creations.
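A hedged illustration of the multiple-linear-regression step, fitting HIC15 on a few hypothetical styling-feature columns with statsmodels; the feature names, sample size and coefficients are invented for the example.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 78
df = pd.DataFrame({
    "hood_angle": rng.uniform(8, 18, n),      # hypothetical styling features
    "bumper_lead": rng.uniform(50, 150, n),
    "hood_length": rng.uniform(0.8, 1.4, n),
})
df["HIC15"] = 600 + 25 * df["hood_angle"] - 2 * df["bumper_lead"] + rng.normal(0, 50, n)

X = sm.add_constant(df[["hood_angle", "bumper_lead", "hood_length"]])
fit = sm.OLS(df["HIC15"], X).fit()
print(fit.summary())   # coefficients, p-values and overall model significance
```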
NASA Astrophysics Data System (ADS)
Alp, E.; Yücel, Ö.; Özcan, Z.
2014-12-01
Turkey has been making many legal arrangements for sustainable water management during the harmonization process with the European Union. In order to make cost-effective and efficient decisions, the monitoring network in Turkey has been expanding. However, due to time and budget constraints, the desired number of monitoring campaigns cannot be carried out. Hence, in this study, independent parameters that can be measured easily and quickly are used to estimate water quality parameters in Lakes Mogan and Eymir using linear regression. Nonpoint sources are one of the major pollutant components in the Eymir and Mogan lakes. In this paper, the correlation between easily measurable parameters (DO, temperature, electrical conductivity, pH, precipitation) and the dependent variables (TN, TP, COD, Chl-a, TSS, Total Coliform) is investigated. Simple regression analysis is performed for each season in the Eymir and Mogan lakes with the SPSS statistical program, using the water quality data collected between 2006 and 2012. The regression analysis demonstrated significant linear relationships between measured and simulated concentrations for TN (R2=0.86), TP (R2=0.85), TSS (R2=0.91), Chl-a (R2=0.94), COD (R2=0.99), and T. Coliform (R2=0.97), which are the best results in each season for the Eymir and Mogan lakes. The overall results of this study show that, by using easily measurable parameters, the water quality of lakes can be predicted even in ungauged situations. Moreover, the outputs obtained from the regression equations can be used as input for water quality models, such as the phosphorus budget model used to calculate the required reduction in the external phosphorus load to Lake Mogan to meet the water quality standards.
NASA Astrophysics Data System (ADS)
Ibanez, C. A. G.; Carcellar, B. G., III; Paringit, E. C.; Argamosa, R. J. L.; Faelga, R. A. G.; Posilero, M. A. V.; Zaragosa, G. P.; Dimayacyac, N. A.
2016-06-01
Diameter-at-breast-height (DBH) estimation is a prerequisite in various allometric equations estimating important forestry indices like stem volume, basal area, biomass and carbon stock. LiDAR technology provides a means of directly obtaining different forest parameters, except DBH, from the behavior and characteristics of the point cloud, which are unique to different forest classes. An extensive tree inventory was done on a two-hectare established sample plot in Mt. Makiling, Laguna for a natural growth forest. Coordinates, height, and canopy cover were measured, and species were identified for comparison with LiDAR derivatives. Multiple linear regression was used to obtain LiDAR-derived DBH by integrating field-derived DBH and 27 LiDAR-derived parameters at 20 m, 10 m, and 5 m grid resolutions. To find the best combination of parameters for DBH estimation, all possible combinations of parameters were generated and the process was automated using Python scripts with regression-related libraries such as NumPy, SciPy, and scikit-learn. The combination that yields the highest r-squared (coefficient of determination) and the lowest AIC (Akaike's Information Criterion) and BIC (Bayesian Information Criterion) was determined to be the best equation. The best equation uses 11 parameters at the 10 m grid size, with an r-squared of 0.604, an AIC of 154.04 and a BIC of 175.08. The combination of parameters may differ among forest classes in further studies. Additional statistical tests, such as the Kaiser-Meyer-Olkin (KMO) coefficient and Bartlett's test of sphericity, can be supplemented to help determine the correlation among parameters.
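A hedged sketch of the exhaustive combination search described above, using statsmodels to score every predictor subset by AIC/BIC; the candidate LiDAR metrics are synthetic, and the single-criterion ranking below is a simplification of the joint r-squared/AIC/BIC criterion.

```python
import itertools
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
X = rng.standard_normal((120, 6))          # 6 synthetic candidate LiDAR metrics
y = 1.5 * X[:, 0] - 0.8 * X[:, 2] + rng.standard_normal(120)   # field DBH stand-in

best = None
for k in range(1, X.shape[1] + 1):
    for cols in itertools.combinations(range(X.shape[1]), k):
        fit = sm.OLS(y, sm.add_constant(X[:, cols])).fit()
        # rank primarily by AIC (then BIC, then R-squared), a simplification
        # of the joint highest-R2 / lowest-AIC / lowest-BIC criterion
        score = (fit.aic, fit.bic, -fit.rsquared)
        if best is None or score < best[0]:
            best = (score, cols)
print("best subset:", best[1], " (AIC, BIC, -R2):", tuple(round(s, 2) for s in best[0]))
```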
Parameter Identification of Static Friction Based on An Optimal Exciting Trajectory
NASA Astrophysics Data System (ADS)
Tu, X.; Zhao, P.; Zhou, Y. F.
2017-12-01
In this paper, we focus on how to improve the identification efficiency of friction parameters in a robot joint. First, a static friction model that depends only linearly on its parameters is adopted so that the servomotor dynamics can be linearized. In this case, the traditional exciting trajectory based on Fourier series is modified by replacing the constant term with a quintic polynomial to ensure the boundary continuity of speed and acceleration. Then, the Fourier-related parameters are optimized by a genetic algorithm (GA) in which the condition number of the regression matrix is set as the fitness function. Finally, compared with the constant-velocity tracking experiment, the friction parameters identified from the exciting-trajectory experiment give similar results with the advantage of reduced experiment time.
Estimating standard errors in feature network models.
Frank, Laurence E; Heiser, Willem J
2007-05-01
Feature network models are graphical structures that represent proximity data in a discrete space while using the same formalism that is the basis of least squares methods employed in multidimensional scaling. Existing methods to derive a network model from empirical data only give the best-fitting network and yield no standard errors for the parameter estimates. The additivity properties of networks make it possible to consider the model as a univariate (multiple) linear regression problem with positivity restrictions on the parameters. In the present study, both theoretical and empirical standard errors are obtained for the constrained regression parameters of a network model with known features. The performance of both types of standard error is evaluated using Monte Carlo techniques.
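Because the network model reduces to a linear regression with positivity restrictions, non-negative least squares is the natural fitting tool. A hedged sketch with a synthetic 0/1 feature-indicator design matrix (not the authors' estimation code, which additionally derives standard errors):

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(3)
A = rng.integers(0, 2, size=(40, 6)).astype(float)   # 0/1 feature-indicator design matrix
true_w = np.array([0.8, 0.0, 1.5, 0.3, 0.0, 0.6])    # non-negative feature weights
b = A @ true_w + 0.05 * rng.standard_normal(40)      # synthetic proximities

weights, residual_norm = nnls(A, b)                  # least squares with weights >= 0
print(np.round(weights, 2))
```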
Wagner, Brian J.; Gorelick, Steven M.
1986-01-01
A simulation nonlinear multiple-regression methodology for estimating parameters that characterize the transport of contaminants is developed and demonstrated. Finite difference contaminant transport simulation is combined with a nonlinear weighted least squares multiple-regression procedure. The technique provides optimal parameter estimates and gives statistics for assessing the reliability of these estimates under certain general assumptions about the distributions of the random measurement errors. Monte Carlo analysis is used to estimate parameter reliability for a hypothetical homogeneous soil column for which concentration data contain large random measurement errors. The value of data collected spatially versus data collected temporally was investigated for estimation of velocity, dispersion coefficient, effective porosity, first-order decay rate, and zero-order production. The use of spatial data gave estimates that were 2–3 times more reliable than estimates based on temporal data for all parameters except velocity. Comparison of estimated linear and nonlinear confidence intervals based upon Monte Carlo analysis showed that the linear approximation is poor for dispersion coefficient and zero-order production coefficient when data are collected over time. In addition, examples demonstrate transport parameter estimation for two real one-dimensional systems. First, the longitudinal dispersivity and effective porosity of an unsaturated soil are estimated using laboratory column data. We compare the reliability of estimates based upon data from individual laboratory experiments versus estimates based upon pooled data from several experiments. Second, the simulation nonlinear regression procedure is extended to include an additional governing equation that describes delayed storage during contaminant transport. The model is applied to analyze the trends, variability, and interrelationship of parameters in a mountain stream in northern California.
40 CFR 53.34 - Test procedure for methods for PM10 and Class I methods for PM2.5.
Code of Federal Regulations, 2011 CFR
2011-07-01
... linear regression parameters (slope, intercept, and correlation coefficient) describing the relationship... correlation coefficient. (2) To pass the test for comparability, the slope, intercept, and correlation...
Demidenko, Eugene
2017-09-01
The exact density distribution of the nonlinear least squares estimator in the one-parameter regression model is derived in closed form and expressed through the cumulative distribution function of the standard normal variable. Several proposals to generalize this result are discussed. The exact density is extended to the estimating equation (EE) approach and the nonlinear regression with an arbitrary number of linear parameters and one intrinsically nonlinear parameter. For a very special nonlinear regression model, the derived density coincides with the distribution of the ratio of two normally distributed random variables previously obtained by Fieller (1932), unlike other approximations previously suggested by other authors. Approximations to the density of the EE estimators are discussed in the multivariate case. Numerical complications associated with the nonlinear least squares are illustrated, such as nonexistence and/or multiple solutions, as major factors contributing to poor density approximation. The nonlinear Markov-Gauss theorem is formulated based on the near exact EE density approximation.
Image interpolation via regularized local linear regression.
Liu, Xianming; Zhao, Debin; Xiong, Ruiqin; Ma, Siwei; Gao, Wen; Sun, Huifang
2011-12-01
The linear regression model is a very attractive tool for designing effective image interpolation schemes. Some regression-based image interpolation algorithms have been proposed in the literature, in which the objective functions are optimized by ordinary least squares (OLS). However, it is shown that interpolation with OLS may have some undesirable properties from a robustness point of view: even small amounts of outliers can dramatically affect the estimates. To address these issues, in this paper we propose a novel image interpolation algorithm based on regularized local linear regression (RLLR). Starting from the linear regression model, we replace the OLS error norm with the moving least squares (MLS) error norm, which leads to a robust estimator of local image structure. To keep the solution stable and avoid overfitting, we incorporate the l2-norm as the estimator complexity penalty. Moreover, motivated by recent progress on manifold-based semi-supervised learning, we explicitly consider the intrinsic manifold structure by making use of both measured and unmeasured data points. Specifically, our framework incorporates the geometric structure of the marginal probability distribution induced by unmeasured samples as an additional local smoothness preserving constraint. The optimal model parameters can be obtained with a closed-form solution by solving a convex optimization problem. Experimental results on benchmark test images demonstrate that the proposed method achieves very competitive performance with the state-of-the-art interpolation algorithms, especially in image edge structure preservation. © 2011 IEEE
Properties of added variable plots in Cox's regression model.
Lindkvist, M
2000-03-01
The added variable plot is useful for examining the effect of a covariate in regression models. The plot provides information regarding the inclusion of a covariate, and is useful in identifying influential observations on the parameter estimates. Hall et al. (1996) proposed a plot for Cox's proportional hazards model derived by regarding the Cox model as a generalized linear model. This paper proves and discusses properties of this plot. These properties make the plot a valuable tool in model evaluation. Quantities considered include parameter estimates, residuals, leverage, case influence measures and correspondence to previously proposed residuals and diagnostics.
Forcing Regression through a Given Point Using Any Familiar Computational Routine.
1983-03-01
a linear model, Y = α + βX + ε (Model I); then adopt the principle of least squares and use sample data to estimate the unknown parameters, α and β ... has an expected value of zero indicates that the "average" response is considered linear. If ε varies widely, Model I, though conceptually correct, may ... relationship is linear from the maximum observed x to x - a, then Model II should be used. To proceed with the customary evaluation of Model I would be
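The standard computational trick behind the title, forcing a fitted line through a specified point with any ordinary least-squares routine, is to translate the data so that the point becomes the origin and fit without an intercept. A minimal sketch; the point (x0, y0) and the data are illustrative.

```python
import numpy as np

def fit_through_point(x, y, x0, y0):
    # translate so (x0, y0) is the origin, then fit a no-intercept line
    slope = np.sum((x - x0) * (y - y0)) / np.sum((x - x0) ** 2)
    return slope   # fitted line: y = y0 + slope * (x - x0)

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])
print(fit_through_point(x, y, 0.0, 0.0))   # regression forced through the origin
```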
Equilibrium, kinetics and process design of acid yellow 132 adsorption onto red pine sawdust.
Can, Mustafa
2015-01-01
Linear and non-linear regression procedures have been applied to the Langmuir, Freundlich, Tempkin, Dubinin-Radushkevich, and Redlich-Peterson isotherms for adsorption of acid yellow 132 (AY132) dye onto red pine (Pinus resinosa) sawdust. The effects of parameters such as particle size, stirring rate, contact time, dye concentration, adsorption dose, pH, and temperature were investigated, and interaction was characterized by Fourier transform infrared spectroscopy and field emission scanning electron microscope. The non-linear method of the Langmuir isotherm equation was found to be the best fitting model to the equilibrium data. The maximum monolayer adsorption capacity was found as 79.5 mg/g. The calculated thermodynamic results suggested that AY132 adsorption onto red pine sawdust was an exothermic, physisorption, and spontaneous process. Kinetics was analyzed by four different kinetic equations using non-linear regression analysis. The pseudo-second-order equation provides the best fit with experimental data.
Simple method for quick estimation of aquifer hydrogeological parameters
NASA Astrophysics Data System (ADS)
Ma, C.; Li, Y. Y.
2017-08-01
Development of simple and accurate methods to determine aquifer hydrogeological parameters is important for groundwater resources assessment and management. To address the problem of estimating aquifer parameters from limited unsteady pumping-test data, a fitting function for the Theis well function was proposed using a fitting optimization method, and a univariate linear regression equation was then established. The aquifer parameters could be obtained by solving for the coefficients of the regression equation. The application of the proposed method was illustrated using two published data sets. Error statistics and analysis of the pumping drawdown showed that the method proposed in this paper yielded quick and accurate estimates of the aquifer parameters, and could reliably identify them from long-distance observed drawdowns and early drawdowns. The proposed method is expected to be helpful for practicing hydrogeologists and hydrologists.
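For context, the classical Cooper-Jacob straight-line simplification of the Theis solution shows how a linear regression on pumping-test drawdown yields the aquifer parameters. This is a hedged illustration of that well-known variant, not the fitting function proposed in the paper; the pumping rate, well distance and drawdown values are hypothetical.

```python
import numpy as np

Q, r = 0.01, 50.0                                   # pumping rate (m^3/s), distance (m)
t = np.array([600., 1200., 2400., 4800., 9600.])    # time since pumping began (s)
s = np.array([0.20, 0.27, 0.34, 0.41, 0.48])        # observed drawdown (m)

# Cooper-Jacob: s is linear in log10(t); slope and intercept give T and S
slope, intercept = np.polyfit(np.log10(t), s, 1)
T = 2.303 * Q / (4.0 * np.pi * slope)               # transmissivity (m^2/s)
t0 = 10.0 ** (-intercept / slope)                   # time of zero drawdown (s)
S = 2.25 * T * t0 / r ** 2                          # storativity (dimensionless)
print(T, S)
```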
NASA Astrophysics Data System (ADS)
Wang, Xuntao; Feng, Jianhu; Wang, Hu; Hong, Shidi; Zheng, Supei
2018-03-01
A three-dimensional finite element model of a box girder bridge and its asphalt concrete deck pavement was established with ANSYS software, and the interlayer bonding of the asphalt concrete deck pavement was assumed to be a contact bonding condition. An orthogonal experimental design was used to arrange the test plans for the material parameters, and the effect of different material parameters on the mechanical response of the asphalt concrete surface layer was evaluated with a multiple linear regression model using the results of the finite element analysis. Results indicated that the stress regression equations predict the stress of the asphalt concrete surface layer well, and that the elastic modulus of the waterproof layer has a significant influence on the stress values of the asphalt concrete surface layer.
NASA Astrophysics Data System (ADS)
Shafigulin, R. V.; Safonova, I. A.; Bulanova, A. V.
2015-09-01
The effect of the structure of benzimidazoles on their chromatographic retention on octadecyl silica gel from an aqueous acetonitrile eluent was studied. One- and multi-parameter correlation equations were obtained by linear regression analysis, and their predictive potential for determining the retention factors of the studied benzimidazoles was analyzed.
Discrimination of serum Raman spectroscopy between normal and colorectal cancer
NASA Astrophysics Data System (ADS)
Li, Xiaozhou; Yang, Tianyue; Yu, Ting; Li, Siqi
2011-07-01
Raman spectroscopy of tissues has been widely studied for the diagnosis of various cancers, but biofluids have seldom been used as the analyte because of their low analyte concentrations. Here, Raman spectra of serum from 30 normal subjects, 46 colon cancer patients, and 44 rectal cancer patients were measured and analyzed. Information from the Raman peaks (intensity and width) and from the fluorescence background (baseline function coefficients) was selected as parameters for statistical analysis. Principal component regression (PCR) and partial least squares regression (PLSR) were applied separately to the selected parameters to assess their performance; PCR performed better than PLSR on our spectral data. Linear discriminant analysis (LDA) was then applied to the principal components (PCs) of the two regression methods on the selected parameters, and diagnostic accuracies of 88% and 83% were obtained. The conclusion is that the selected features retain the information of the original spectra well and that Raman spectroscopy of serum has potential for the diagnosis of colorectal cancer.
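A hedged sketch of a PCR-style classification pipeline (PCA scores followed by LDA), with synthetic stand-ins for the selected peak and baseline parameters; it mirrors the analysis flow described above without reproducing the study's data.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
X = rng.standard_normal((120, 15))          # selected peak/baseline parameters (synthetic)
y = (X[:, 0] + X[:, 1] > 0).astype(int)     # 0 = normal, 1 = colorectal cancer (synthetic)

clf = make_pipeline(StandardScaler(), PCA(n_components=5), LinearDiscriminantAnalysis())
print(cross_val_score(clf, X, y, cv=5).mean())   # cross-validated diagnostic accuracy
```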
Methaneethorn, Janthima; Panomvana, Duangchit; Vachirayonstien, Thaveechai
2017-09-26
Therapeutic drug monitoring is essential for both phenytoin and phenobarbital therapy given their narrow therapeutic indexes. Nevertheless, the measurement of either phenytoin or phenobarbital concentrations might not be available in some rural hospitals. Information assisting individualized phenytoin and phenobarbital combination therapy is important. This study's objective was to determine the relationship between the maximum rate of metabolism of phenytoin (Vmax) and phenobarbital clearance (CLPB), which can serve as a guide to individualized drug therapy. Data on phenytoin and phenobarbital concentrations of 19 epileptic patients concurrently receiving both drugs were obtained from medical records. Phenytoin and phenobarbital pharmacokinetic parameters were studied at steady-state conditions. The relationship between the elimination parameters of both drugs was determined using simple linear regression. A high correlation coefficient between Vmax and CLPB was found [r=0.744; p<0.001 for Vmax (mg/kg/day) vs. CLPB (L/kg/day)]. Such a relatively strong linear relationship between the elimination parameters of both drugs indicates that Vmax might be predicted from CLPB and vice versa. Regression equations were established for estimating Vmax from CLPB, and vice versa in patients treated with combination of phenytoin and phenobarbital. These proposed equations can be of use in aiding individualized drug therapy.
Abu Bakar, S N; Aspalilah, A; AbdelNasser, I; Nurliza, A; Hairuliza, M J; Swarhib, M; Das, S; Mohd Nor, F
2017-01-01
Stature is one of the characteristics that can be used to identify a human, besides age, sex and racial affiliation. This is useful when the body found is dismembered, mutilated or decomposed, and helps in narrowing down the missing person's identity. The main aim of the present study was to construct regression functions for stature estimation by using lower limb bones in the Malaysian population. The sample comprised 87 adult individuals (81 males, 6 females) aged between 20 and 79 years. The parameters thigh length, lower leg length, leg length, foot length, foot height and foot breadth were measured with a ruler and measuring tape. Statistical analysis involved an independent t-test to analyse the differences in lower limb measurements between males and females. Pearson's correlation test was used to analyse correlations between lower limb parameters and stature, and linear regressions were used to form equations. A paired t-test was used to compare the actual stature with the stature estimated using the equations formed. The independent t-test showed a significant difference (p< 0.05) between males and females in leg length, thigh length, lower leg length, foot length and foot breadth. Thigh length, leg length and foot length were observed to have strong correlations with stature (r= 0.75, r= 0.81 and r= 0.69, respectively). Linear regressions were formulated for stature estimation. The paired t-test showed no significant difference between actual stature and estimated stature. It is concluded that regression functions can be used to estimate stature to identify skeletal remains in the Malaysian population.
ESTER HYDROLYSIS RATE CONSTANT PREDICTION FROM INFRARED INTERFEROGRAMS
A method for predicting reactivity parameters of organic chemicals from spectroscopic data is being developed to assist in assessing the environmental fate of pollutants. The prototype system, which employs multiple linear regression analysis using selected points from the Fourier...
NASA Astrophysics Data System (ADS)
Wu, Cheng; Zhen Yu, Jian
2018-03-01
Linear regression techniques are widely used in atmospheric science, but they are often improperly applied due to lack of consideration or inappropriate handling of measurement uncertainty. In this work, numerical experiments are performed to evaluate the performance of five linear regression techniques, significantly extending previous works by Chu and Saylor. The five techniques are ordinary least squares (OLS), Deming regression (DR), orthogonal distance regression (ODR), weighted ODR (WODR), and York regression (YR). We first introduce a new data generation scheme that employs the Mersenne twister (MT) pseudorandom number generator. The numerical simulations are also improved by (a) refining the parameterization of nonlinear measurement uncertainties, (b) inclusion of a linear measurement uncertainty, and (c) inclusion of WODR for comparison. Results show that DR, WODR and YR produce an accurate slope, but the intercept by WODR and YR is overestimated and the degree of bias is more pronounced with a low-R² XY dataset. The importance of a proper weighting parameter λ in DR is investigated by sensitivity tests, and it is found that an improper λ in DR can lead to a bias in both the slope and intercept estimation. Because the λ calculation depends on the actual form of the measurement error, it is essential to determine the exact form of measurement error in the XY data during the measurement stage. If the a priori error in one of the variables is unknown, or the measurement error described cannot be trusted, DR, WODR and YR can provide the least biases in slope and intercept among all tested regression techniques. For these reasons, DR, WODR and YR are recommended for atmospheric studies when both X and Y data have measurement errors. An Igor Pro-based program (Scatter Plot) was developed to facilitate the implementation of error-in-variables regressions.
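A minimal Deming regression sketch of the closed-form slope discussed above, where delta is the ratio of the y-error variance to the x-error variance (delta = 1 corresponds to orthogonal regression); the data points are illustrative.

```python
import numpy as np

def deming(x, y, delta=1.0):
    # delta: ratio of the y-error variance to the x-error variance
    xbar, ybar = x.mean(), y.mean()
    sxx = np.sum((x - xbar) ** 2)
    syy = np.sum((y - ybar) ** 2)
    sxy = np.sum((x - xbar) * (y - ybar))
    slope = (syy - delta * sxx +
             np.sqrt((syy - delta * sxx) ** 2 + 4.0 * delta * sxy ** 2)) / (2.0 * sxy)
    return slope, ybar - slope * xbar

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])
print(deming(x, y, delta=1.0))   # delta = 1 is orthogonal regression
```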
Using ridge regression in systematic pointing error corrections
NASA Technical Reports Server (NTRS)
Guiar, C. N.
1988-01-01
A pointing error model is used in the antenna calibration process. Data from spacecraft or radio star observations are used to determine the parameters in the model. However, the regression variables are not truly independent, displaying a condition known as multicollinearity. Ridge regression, a biased estimation technique, is used to combat the multicollinearity problem. Two data sets pertaining to Voyager 1 spacecraft tracking (days 105 and 106 of 1987) were analyzed using both linear least squares and ridge regression methods. The advantages and limitations of employing the technique are presented. The problem is not yet fully resolved.
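A hedged numerical sketch of the ridge (biased) estimator used to combat multicollinearity: beta_ridge = (X'X + k*I)^(-1) X'y for a chosen ridge parameter k. The design matrix and the nearly collinear columns below are synthetic stand-ins for the pointing-model regressors.

```python
import numpy as np

def ridge_fit(X, y, k):
    # biased estimator: (X'X + k*I)^{-1} X'y
    return np.linalg.solve(X.T @ X + k * np.eye(X.shape[1]), X.T @ y)

rng = np.random.default_rng(5)
X = rng.standard_normal((100, 4))
X[:, 3] = X[:, 2] + 0.01 * rng.standard_normal(100)   # nearly collinear regressors
y = X @ np.array([1.0, -0.5, 2.0, 0.0]) + 0.1 * rng.standard_normal(100)

print(np.round(ridge_fit(X, y, k=0.0), 2))   # ordinary least squares (unstable)
print(np.round(ridge_fit(X, y, k=1.0), 2))   # ridge shrinks the unstable estimates
```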
Tøndel, Kristin; Indahl, Ulf G; Gjuvsland, Arne B; Vik, Jon Olav; Hunter, Peter; Omholt, Stig W; Martens, Harald
2011-06-01
Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback loops. HC-PLSR is a promising approach for metamodelling in systems biology, especially for highly nonlinear or non-monotone parameter to phenotype maps. The algorithm can be flexibly adjusted to suit the complexity of the dynamic model behaviour, inviting automation in the metamodelling of complex systems.
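The hierarchical idea can be sketched with a simplified stand-in: hard K-means clustering in place of fuzzy C-means and one local PLS model per cluster. This is not the HC-PLSR implementation itself; the inputs, cluster count and component numbers below are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(6)
X = rng.uniform(-1, 1, size=(500, 5))       # model parameters / initial conditions (inputs)
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2 + 0.05 * rng.standard_normal(500)   # output feature

# cluster the input space, then fit one local PLS model per cluster
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
local_models = {c: PLSRegression(n_components=3).fit(X[km.labels_ == c], y[km.labels_ == c])
                for c in np.unique(km.labels_)}

# prediction: route a new sample to the local model of its cluster
x_new = np.array([[0.2, -0.1, 0.0, 0.3, -0.5]])
c_new = int(km.predict(x_new)[0])
print(local_models[c_new].predict(x_new))
```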
Wheat flour dough Alveograph characteristics predicted by Mixolab regression models.
Codină, Georgiana Gabriela; Mironeasa, Silvia; Mironeasa, Costel; Popa, Ciprian N; Tamba-Berehoiu, Radiana
2012-02-01
In Romania, the Alveograph is the most used device to evaluate the rheological properties of wheat flour dough, but lately the Mixolab device has begun to play an important role in the breadmaking industry. These two instruments are based on different principles but there are some correlations that can be found between the parameters determined by the Mixolab and the rheological properties of wheat dough measured with the Alveograph. Statistical analysis on 80 wheat flour samples using the backward stepwise multiple regression method showed that Mixolab values using the ‘Chopin S’ protocol (40 samples) and ‘Chopin + ’ protocol (40 samples) can be used to elaborate predictive models for estimating the value of the rheological properties of wheat dough: baking strength (W), dough tenacity (P) and extensibility (L). The correlation analysis confirmed significant findings (P < 0.05 and P < 0.01) between the parameters of wheat dough studied by the Mixolab and its rheological properties measured with the Alveograph. A number of six predictive linear equations were obtained. Linear regression models gave multiple regression coefficients with R²(adjusted) > 0.70 for P, R²(adjusted) > 0.70 for W and R²(adjusted) > 0.38 for L, at a 95% confidence interval. Copyright © 2011 Society of Chemical Industry.
A pocket-sized metabolic analyzer for assessment of resting energy expenditure.
Zhao, Di; Xian, Xiaojun; Terrera, Mirna; Krishnan, Ranganath; Miller, Dylan; Bridgeman, Devon; Tao, Kevin; Zhang, Lihua; Tsow, Francis; Forzani, Erica S; Tao, Nongjian
2014-04-01
The assessment of metabolic parameters related to energy expenditure has a proven value for weight management; however these measurements remain too difficult and costly for monitoring individuals at home. The objective of this study is to evaluate the accuracy of a new pocket-sized metabolic analyzer device for assessing energy expenditure at rest (REE) and during sedentary activities (EE). The new device performs indirect calorimetry by measuring an individual's oxygen consumption (VO2) and carbon dioxide production (VCO2) rates, which allows the determination of resting- and sedentary activity-related energy expenditure. VO2 and VCO2 values of 17 volunteer adult subjects were measured during resting and sedentary activities in order to compare the metabolic analyzer with the Douglas bag method. The Douglas bag method is considered the Gold Standard method for indirect calorimetry. Metabolic parameters of VO2, VCO2, and energy expenditure were compared using linear regression analysis, paired t-tests, and Bland-Altman plots. Linear regression analysis of measured VO2 and VCO2 values, as well as calculated energy expenditure assessed with the new analyzer and Douglas bag method, had the following linear regression parameters (linear regression slope LRS0, and R-squared coefficient, r(2)) with p = 0: LRS0 (SD) = 1.00 (0.01), r(2) = 0.9933 for VO2; LRS0 (SD) = 1.00 (0.01), r(2) = 0.9929 for VCO2; and LRS0 (SD) = 1.00 (0.01), r(2) = 0.9942 for energy expenditure. In addition, results from paired t-tests did not show statistical significant difference between the methods with a significance level of α = 0.05 for VO2, VCO2, REE, and EE. Furthermore, the Bland-Altman plot for REE showed good agreement between methods with 100% of the results within ±2SD, which was equivalent to ≤10% error. The findings demonstrate that the new pocket-sized metabolic analyzer device is accurate for determining VO2, VCO2, and energy expenditure. Copyright © 2013 Elsevier Ltd and European Society for Clinical Nutrition and Metabolism. All rights reserved.
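As a side note on the agreement analysis mentioned above, the Bland-Altman limits of agreement are simply the mean difference plus or minus 1.96 standard deviations of the paired differences. A minimal sketch with synthetic paired REE values, not the study's data:

```python
import numpy as np

rng = np.random.default_rng(7)
ree_ref = rng.normal(1600, 200, 17)            # Douglas bag REE (kcal/day), synthetic
ree_new = ree_ref + rng.normal(0, 60, 17)      # pocket analyzer REE, synthetic

diff = ree_new - ree_ref
bias = diff.mean()
loa = 1.96 * diff.std(ddof=1)                  # half-width of the limits of agreement
print(f"bias = {bias:.1f} kcal/day, limits of agreement = "
      f"[{bias - loa:.1f}, {bias + loa:.1f}]")
```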
NASA Technical Reports Server (NTRS)
Ratnayake, Nalin A.; Koshimoto, Ed T.; Taylor, Brian R.
2011-01-01
The problem of parameter estimation on hybrid-wing-body type aircraft is complicated by the fact that many design candidates for such aircraft involve a large number of aerodynamic control effectors that act in coplanar motion. This fact adds to the complexity already present in the parameter estimation problem for any aircraft with a closed-loop control system. Decorrelation of system inputs must be performed in order to ascertain individual surface derivatives with any sort of mathematical confidence. Non-standard control surface configurations, such as clamshell surfaces and drag-rudder modes, further complicate the modeling task. In this paper, asymmetric, single-surface maneuvers are used to excite multiple axes of aircraft motion simultaneously. Time history reconstructions of the moment coefficients computed by the solved regression models are then compared to each other in order to assess relative model accuracy. The reduced flight-test time required for inner surface parameter estimation using multi-axis methods was found to come at the cost of slightly reduced accuracy and statistical confidence for linear regression methods. Since the multi-axis maneuvers captured parameter estimates similar to both longitudinal and lateral-directional maneuvers combined, the number of test points required for the inner, aileron-like surfaces could in theory have been reduced by 50%. While trends were similar, however, individual parameters estimated by a multi-axis model typically differed from those estimated by a single-axis model by an average absolute difference of roughly 15-20%, with decreased statistical significance. The multi-axis model exhibited an increase in overall fit error of roughly 1-5% for the linear regression estimates with respect to the single-axis model, when applied to flight data designed for each, respectively.
Stature estimation from the lengths of the growing foot-a study on North Indian adolescents.
Krishan, Kewal; Kanchan, Tanuj; Passi, Neelam; DiMaggio, John A
2012-12-01
Stature estimation is considered one of the basic parameters of the investigation process for unknown and commingled human remains in medico-legal casework. Race, age and sex are the other parameters which help in this process. Stature estimation is of the utmost importance as it completes the biological profile of a person along with the other three parameters of identification. The present research is intended to formulate standards for stature estimation from foot dimensions in adolescent males from North India and to study the pattern of foot growth during the growing years. 154 male adolescents from the northern part of India were included in the study. Besides stature, five anthropometric measurements were taken on each foot: the length of the foot from each toe (T1, T2, T3, T4, and T5, respectively) to the pternion. The data were analyzed statistically using Student's t-test, Pearson's correlation, and linear and multiple regression analysis for estimation of stature and growth of the foot during ages 13-18 years. Correlation coefficients between stature and all the foot measurements were highly significant and positive. Linear regression models and multiple regression models (with age as a co-variable) were derived for estimation of stature from the different measurements of the foot. The multiple regression models (with age as a co-variable) estimate stature with greater accuracy than the linear regression models for the 13-18 years age group. The study shows the growth pattern of feet in North Indian adolescents and indicates that anthropometric measurements of the foot and its segments are valuable for estimation of stature in growing individuals of that population. Copyright © 2012 Elsevier Ltd. All rights reserved.
NASA Technical Reports Server (NTRS)
Ratnayake, Nalin A.; Waggoner, Erin R.; Taylor, Brian R.
2011-01-01
The problem of parameter estimation on hybrid-wing-body aircraft is complicated by the fact that many design candidates for such aircraft involve a large number of aerodynamic control effectors that act in coplanar motion. This adds to the complexity already present in the parameter estimation problem for any aircraft with a closed-loop control system. Decorrelation of flight and simulation data must be performed in order to ascertain individual surface derivatives with any sort of mathematical confidence. Non-standard control surface configurations, such as clamshell surfaces and drag-rudder modes, further complicate the modeling task. In this paper, time-decorrelation techniques are applied to a model structure selected through stepwise regression for simulated and flight-generated lateral-directional parameter estimation data. A virtual effector model that uses mathematical abstractions to describe the multi-axis effects of clamshell surfaces is developed and applied. Comparisons are made between time history reconstructions and observed data in order to assess the accuracy of the regression model. The Cramér-Rao lower bounds of the estimated parameters are used to assess the uncertainty of the regression model relative to alternative models. Stepwise regression was found to be a useful technique for lateral-directional model design for hybrid-wing-body aircraft, as suggested by available flight data. Based on the results of this study, linear regression parameter estimation methods using abstracted effectors are expected to perform well for hybrid-wing-body aircraft properly equipped for the task.
Modeling absolute differences in life expectancy with a censored skew-normal regression approach
Clough-Gorr, Kerri; Zwahlen, Marcel
2015-01-01
Parameter estimates from commonly used multivariable parametric survival regression models do not directly quantify differences in years of life expectancy. Gaussian linear regression models give results in terms of absolute mean differences, but are not appropriate in modeling life expectancy, because in many situations time to death has a negatively skewed distribution. A regression approach using a skew-normal distribution would be an alternative to parametric survival models in the modeling of life expectancy, because parameter estimates can be interpreted in terms of survival time differences while allowing for skewness of the distribution. In this paper we show how to use the skew-normal regression so that censored and left-truncated observations are accounted for. With this we model differences in life expectancy using data from the Swiss National Cohort Study and from official life expectancy estimates and compare the results with those derived from commonly used survival regression models. We conclude that a censored skew-normal survival regression approach for left-truncated observations can be used to model differences in life expectancy across covariates of interest. PMID:26339544
Troutman, Brent M.
1982-01-01
Errors in runoff prediction caused by input data errors are analyzed by treating precipitation-runoff models as regression (conditional expectation) models. Independent variables of the regression consist of precipitation and other input measurements; the dependent variable is runoff. In models using erroneous input data, prediction errors are inflated and estimates of expected storm runoff for given observed input variables are biased. This bias in expected runoff estimation results in biased parameter estimates if these parameter estimates are obtained by a least squares fit of predicted to observed runoff values. The problems of error inflation and bias are examined in detail for a simple linear regression of runoff on rainfall and for a nonlinear U.S. Geological Survey precipitation-runoff model. Some implications for flood frequency analysis are considered. A case study using a set of data from Turtle Creek near Dallas, Texas illustrates the problems of model input errors.
Byun, Bo-Ram; Kim, Yong-Il; Yamaguchi, Tetsutaro; Maki, Koutaro; Son, Woo-Sung
2015-01-01
This study aimed to examine the correlation between skeletal maturation status and parameters from the odontoid process/body of the second cervical vertebra and the bodies of the third and fourth cervical vertebrae, and simultaneously to build multiple regression models for estimating skeletal maturation status in Korean girls. Hand-wrist radiographs and cone beam computed tomography (CBCT) images were obtained from 74 Korean girls (6-18 years of age). CBCT-generated cervical vertebral maturation (CVM) was used to demarcate the odontoid process and the body of the second cervical vertebra, based on the dentocentral synchondrosis. Correlation coefficient analysis and multiple linear regression analysis were used for each parameter of the cervical vertebrae (P < 0.05). Forty-seven of 64 parameters from CBCT-generated CVM (independent variables) exhibited statistically significant correlations (P < 0.05). The multiple regression model with the greatest R² had six parameters (PH2/W2, UW2/W2, (OH+AH2)/LW2, UW3/LW3, D3, and H4/W4) as independent variables, with a variance inflation factor (VIF) of <2. CBCT-generated CVM was able to include parameters from the second cervical vertebral body and odontoid process, respectively, in the multiple regression models. This suggests that quantitative analysis might be used to estimate skeletal maturation status.
Criteria for the use of regression analysis for remote sensing of sediment and pollutants
NASA Technical Reports Server (NTRS)
Whitlock, C. H.; Kuo, C. Y.; Lecroy, S. R.
1982-01-01
An examination of limitations, requirements, and precision of the linear multiple-regression technique for quantification of marine environmental parameters is conducted. Both environmental and optical physics conditions have been defined for which an exact solution to the signal response equations is of the same form as the multiple regression equation. Various statistical parameters are examined to define a criterion for selection of an unbiased fit when upwelled radiance values contain error and are correlated with each other. Field experimental data are examined to define data smoothing requirements in order to satisfy the criteria of Daniel and Wood (1971). Recommendations are made concerning improved selection of ground-truth locations to maximize variance and to minimize physical errors associated with the remote sensing experiment.
Muradian, Kh K; Utko, N O; Mozzhukhina, T H; Pishel', I M; Litoshenko, O Ia; Bezrukov, V V; Fraĭfel'd, V E
2002-01-01
Correlation and regression relationships between gaseous exchange, thermoregulation and mitochondrial protein content were analyzed in mice using two- and three-dimensional statistics. It was shown that pairwise linear methods of analysis did not reveal any significant correlation between the parameters under study. However, relationships became evident with three-dimensional and non-linear plotting, for which the coefficients of multivariable correlation reached and even exceeded 0.7-0.8. Calculations based on partial differentiation of the multivariable regression equations lead to the conclusion that, at certain values of VO2, VCO2 and body temperature, negative relations between the systems of gaseous exchange and thermoregulation become dominant.
Lopes, Marta B; Calado, Cecília R C; Figueiredo, Mário A T; Bioucas-Dias, José M
2017-06-01
The monitoring of biopharmaceutical products using Fourier transform infrared (FT-IR) spectroscopy relies on calibration techniques involving the acquisition of spectra of bioprocess samples along the process. The most commonly used method for that purpose is partial least squares (PLS) regression, under the assumption that a linear model is valid. Despite being successful in the presence of small nonlinearities, linear methods may fail in the presence of strong nonlinearities. This paper studies the potential usefulness of nonlinear regression methods for predicting, from in situ near-infrared (NIR) and mid-infrared (MIR) spectra acquired in high-throughput mode, biomass and plasmid concentrations in Escherichia coli DH5-α cultures producing the plasmid model pVAX-LacZ. The linear methods PLS and ridge regression (RR) are compared with their kernel (nonlinear) versions, kPLS and kRR, as well as with the (also nonlinear) relevance vector machine (RVM) and Gaussian process regression (GPR). For the systems studied, RR provided better predictive performances compared to the remaining methods. Moreover, the results point to further investigation based on larger data sets whenever differences in predictive accuracy between a linear method and its kernelized version could not be found. The use of nonlinear methods, however, shall be judged regarding the additional computational cost required to tune their additional parameters, especially when the less computationally demanding linear methods herein studied are able to successfully monitor the variables under study.
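For readers who want to reproduce this kind of linear-versus-kernel comparison, a hedged scikit-learn sketch is shown below; the spectra and response are synthetic placeholders, and the hyperparameter grids are illustrative only.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import GridSearchCV, cross_val_score

rng = np.random.default_rng(8)
X = rng.standard_normal((150, 50))                            # preprocessed spectra (synthetic)
y = X[:, 0] ** 2 + X[:, 1] + 0.1 * rng.standard_normal(150)   # nonlinear target (synthetic)

rr = GridSearchCV(Ridge(), {"alpha": [0.1, 1.0, 10.0]}, cv=5)
krr = GridSearchCV(KernelRidge(kernel="rbf"),
                   {"alpha": [0.1, 1.0], "gamma": [0.01, 0.1]}, cv=5)
for name, model in [("RR", rr), ("kRR", krr)]:
    print(name, cross_val_score(model, X, y, cv=5, scoring="r2").mean())
```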
NASA Astrophysics Data System (ADS)
Weisz, Elisabeth; Smith, William L.; Smith, Nadia
2013-06-01
The dual-regression (DR) method retrieves information about the Earth surface and vertical atmospheric conditions from measurements made by any high-spectral resolution infrared sounder in space. The retrieved information includes temperature and atmospheric gases (such as water vapor, ozone, and carbon species) as well as surface and cloud top parameters. The algorithm was designed to produce a high-quality product with low latency and has been demonstrated to yield accurate results in real-time environments. The speed of the retrieval is achieved through linear regression, while accuracy is achieved through a series of classification schemes and decision-making steps. These steps are necessary to account for the nonlinearity of hyperspectral retrievals. In this work, we detail the key steps that have been developed in the DR method to advance accuracy in the retrieval of nonlinear parameters, specifically cloud top pressure. The steps and their impact on retrieval results are discussed in-depth and illustrated through relevant case studies. In addition to discussing and demonstrating advances made in addressing nonlinearity in a linear geophysical retrieval method, advances toward multi-instrument geophysical analysis by applying the DR to three different operational sounders in polar orbit are also noted. For any area on the globe, the DR method achieves consistent accuracy and precision, making it potentially very valuable to both the meteorological and environmental user communities.
Wilke, Marko
2018-02-01
This dataset contains the regression parameters derived by analyzing segmented brain MRI images (gray matter and white matter) from a large population of healthy subjects, using a multivariate adaptive regression splines approach. A total of 1919 MRI datasets ranging in age from 1-75 years from four publicly available datasets (NIH, C-MIND, fCONN, and IXI) were segmented using the CAT12 segmentation framework, writing out gray matter and white matter images normalized using an affine-only spatial normalization approach. These images were then subjected to a six-step DARTEL procedure, employing an iterative non-linear registration approach and yielding increasingly crisp intermediate images. The resulting six datasets per tissue class were then analyzed using multivariate adaptive regression splines, using the CerebroMatic toolbox. This approach allows for flexibly modelling smoothly varying trajectories while taking into account demographic (age, gender) as well as technical (field strength, data quality) predictors. The resulting regression parameters described here can be used to generate matched DARTEL or SHOOT templates for a given population under study, from infancy to old age. The dataset and the algorithm used to generate it are publicly available at https://irc.cchmc.org/software/cerebromatic.php.
Lifespan development of pro- and anti-saccades: multiple regression models for point estimates.
Klein, Christoph; Foerster, Friedrich; Hartnegg, Klaus; Fischer, Burkhart
2005-12-07
The comparative study of anti- and pro-saccade task performance contributes to our functional understanding of the frontal lobes, their alterations in psychiatric or neurological populations, and their changes during the life span. In the present study, we apply regression analysis to model life span developmental effects on various pro- and anti-saccade task parameters, using data of a non-representative sample of 327 participants aged 9 to 88 years. Development up to the age of about 27 years was dominated by curvilinear rather than linear effects of age. Furthermore, the largest developmental differences were found for intra-subject variability measures and the anti-saccade task parameters. Ageing, by contrast, had the shape of a global linear decline of the investigated saccade functions, lacking the differential effects of age observed during development. While these results do support the assumption that frontal lobe functions can be distinguished from other functions by their strong and protracted development, they do not confirm the assumption of disproportionate deterioration of frontal lobe functions with ageing. We finally show that the regression models applied here to quantify life span developmental effects can also be used for individual predictions in applied research contexts or clinical practice.
Luque-Fernandez, Miguel Angel; Belot, Aurélien; Quaresma, Manuela; Maringe, Camille; Coleman, Michel P; Rachet, Bernard
2016-10-01
In population-based cancer research, piecewise exponential regression models are used to derive adjusted estimates of excess mortality due to cancer using the Poisson generalized linear modelling framework. However, the assumption that the conditional mean and variance of the rate parameter given the set of covariates x_i are equal is strong and may fail to account for overdispersion given the variability of the rate parameter (the variance exceeds the mean). Using an empirical example, we aimed to describe simple methods to test and correct for overdispersion. We used a regression-based score test for overdispersion under the relative survival framework and proposed different approaches to correct for overdispersion, including quasi-likelihood, robust standard errors estimation, negative binomial regression and flexible piecewise modelling. All piecewise exponential regression models showed significant inherent overdispersion (p-value <0.001). However, the flexible piecewise exponential model showed the smallest overdispersion parameter (3.2, versus 21.3 for the non-flexible piecewise exponential models). We showed that there were no major differences between methods. However, flexible piecewise regression modelling, with either quasi-likelihood or robust standard errors, was the best approach as it addresses both overdispersion due to model misspecification and true (inherent) overdispersion.
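The following is a minimal sketch of the general workflow described above, not the authors' code: fit a Poisson rate model on simulated counts, check the Pearson dispersion statistic, and refit with a quasi-likelihood scale or a negative binomial family using statsmodels.

```python
# Hedged sketch with simulated data (not the study's cancer registry data).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=(n, 2))
X = sm.add_constant(x)
mu = np.exp(0.5 + 0.8 * x[:, 0] - 0.3 * x[:, 1])
# Gamma-mixed Poisson counts => true (inherent) overdispersion
y = rng.poisson(mu * rng.gamma(shape=2.0, scale=0.5, size=n))

pois = sm.GLM(y, X, family=sm.families.Poisson()).fit()
dispersion = pois.pearson_chi2 / pois.df_resid      # ~1 if equidispersed
print("Pearson dispersion:", round(dispersion, 2))

# Quasi-likelihood correction: same coefficients, inflated standard errors
quasi = sm.GLM(y, X, family=sm.families.Poisson()).fit(scale="X2")

# Negative binomial alternative
nb = sm.GLM(y, X, family=sm.families.NegativeBinomial(alpha=0.5)).fit()
print(quasi.bse, nb.bse)
```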
Ng, Kar Yong; Awang, Norhashidah
2018-01-06
Frequent haze occurrences in Malaysia have made the management of PM10 (particulate matter with aerodynamic diameter less than 10 μm) pollution a critical task. This requires knowledge of the factors associated with PM10 variation and good forecasts of PM10 concentrations. Hence, this paper demonstrates the prediction of 1-day-ahead daily average PM10 concentrations based on predictor variables including meteorological parameters and gaseous pollutants. Three different models were built: a multiple linear regression (MLR) model with lagged predictor variables (MLR1), an MLR model with lagged predictor variables and PM10 concentrations (MLR2), and a regression with time series error (RTSE) model. The findings revealed that humidity, temperature, wind speed, wind direction, carbon monoxide and ozone were the main factors explaining the PM10 variation in Peninsular Malaysia. Comparison among the three models showed that the MLR2 model was on a par with the RTSE model in terms of forecasting accuracy, while the MLR1 model performed worst.
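For illustration only, a next-day forecast in the spirit of MLR1/MLR2 could be set up as below; the file name and column names are hypothetical, not those of the Malaysian monitoring data.

```python
# Illustrative sketch: lagged-predictor multiple linear regression for PM10.
import pandas as pd
from sklearn.linear_model import LinearRegression

def make_lagged(df: pd.DataFrame, target: str = "pm10",
                include_lagged_target: bool = True) -> pd.DataFrame:
    """Shift all predictors by one day so today's values predict tomorrow."""
    lagged = df.shift(1).add_suffix("_lag1")
    if not include_lagged_target:
        lagged = lagged.drop(columns=[f"{target}_lag1"])  # MLR1-style design
    return pd.concat([df[[target]], lagged], axis=1).dropna()

# Assumed daily means: pm10, humidity, temp, wind_speed, wind_dir, co, o3
df = pd.read_csv("daily_air_quality.csv", parse_dates=["date"], index_col="date")

mlr2 = make_lagged(df)                              # MLR2: includes lagged PM10
X, y = mlr2.drop(columns="pm10"), mlr2["pm10"]
model = LinearRegression().fit(X[:-365], y[:-365])  # hold out the last year
print("R^2 on hold-out year:", model.score(X[-365:], y[-365:]))
```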
Hidaka, Nobuhiro; Murata, Masaharu; Sasahara, Jun; Ishii, Keisuke; Mitsuda, Nobuaki
2015-05-01
Observed/expected lung area to head circumference ratio (o/e LHR) and lung to thorax transverse area ratio (LTR) are sonographic indicators of postnatal outcome in fetuses with congenital diaphragmatic hernia (CDH), and they are not influenced by gestational age. We aimed to evaluate the relationship between these two parameters in the same subjects with fetal left-sided CDH. Fetuses with left-sided CDH managed between 2005 and 2012 were included. Data on LTR and o/e LHR values measured on the same day prior to 33 weeks' gestation in the target fetuses were retrospectively collected. The correlation between the two parameters was estimated using Spearman's rank-correlation coefficient, and linear regression analysis was used to assess the relationship between them. Data on 61 measurements from 36 CDH fetuses were analyzed, yielding a Spearman's rank-correlation coefficient of 0.74 and the following linear equation: LTR = 0.002 × (o/e LHR) + 0.005. The determination coefficient of this linear equation was sufficiently high at 0.712, and the prediction accuracy obtained with this regression formula was considered satisfactory. A good linear correlation between the LTR and the o/e LHR was obtained, suggesting that the two predictive parameters can be translated into each other. This information is expected to be useful for improving our understanding of different investigations focusing on LTR or o/e LHR as a predictor of postnatal outcome in CDH. © 2014 Japanese Teratology Society.
NASA Astrophysics Data System (ADS)
Prahutama, Alan; Suparti; Wahyu Utami, Tiani
2018-03-01
Regression analysis models the relationship between response variables and predictor variables. The parametric approach to regression imposes strict assumptions on the model form, whereas the nonparametric regression model does not require such assumptions. Time series data are observations of a variable recorded over time, so if time series data are to be modeled by regression, the response and predictor variables must be determined first. In a time series, the response variable is the value at time t (y_t), while the predictor variables are its significant lags. In nonparametric regression modeling, one developing approach is the Fourier series approach. One of the advantages of nonparametric regression using a Fourier series is its ability to handle data with trigonometric (periodic) patterns. Modeling with a Fourier series requires the parameter K, the number of basis terms, which can be determined by the generalized cross-validation (GCV) method. In modeling inflation for the transportation, communication and financial services sectors, the Fourier series approach yields an optimal K of 120 parameters with an R-square of 99%, whereas modeling with multiple linear regression yields an R-square of 90%.
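A minimal sketch of the basic idea (not the authors' implementation) is given below: build a truncated Fourier design matrix, fit it by least squares, and choose K by GCV; the predictor is assumed to be rescaled to [0, 2π].

```python
# Fourier-series nonparametric regression with GCV selection of K.
import numpy as np

def fourier_design(x: np.ndarray, K: int) -> np.ndarray:
    cols = [np.ones_like(x), x]                    # constant + linear trend
    for k in range(1, K + 1):
        cols += [np.cos(k * x), np.sin(k * x)]
    return np.column_stack(cols)

def fit_fourier_gcv(x: np.ndarray, y: np.ndarray, K_max: int = 20):
    best, n = None, len(y)
    for K in range(1, K_max + 1):
        X = fourier_design(x, K)
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = np.sum((y - X @ beta) ** 2)
        p = X.shape[1]                             # number of basis functions
        gcv = (rss / n) / (1.0 - p / n) ** 2       # generalized cross-validation
        if best is None or gcv < best[0]:
            best = (gcv, K, beta)
    return best                                    # (gcv, K, coefficients)

x = np.linspace(0, 2 * np.pi, 200)
y = np.sin(3 * x) + 0.5 * np.cos(7 * x) + np.random.default_rng(1).normal(0, 0.2, 200)
gcv, K, beta = fit_fourier_gcv(x, y)
print("optimal K:", K)
```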
A parameter estimation subroutine package
NASA Technical Reports Server (NTRS)
Bierman, G. J.; Nead, W. M.
1977-01-01
Linear least squares estimation and regression analyses continue to play a major role in orbit determination and related areas. FORTRAN subroutines have been developed to facilitate analyses of a variety of parameter estimation problems. Easy to use multipurpose sets of algorithms are reported that are reasonably efficient and which use a minimal amount of computer storage. Subroutine inputs, outputs, usage and listings are given, along with examples of how these routines can be used.
Grajeda, Laura M; Ivanescu, Andrada; Saito, Mayuko; Crainiceanu, Ciprian; Jaganath, Devan; Gilman, Robert H; Crabtree, Jean E; Kelleher, Dermott; Cabrera, Lilia; Cama, Vitaliano; Checkley, William
2016-01-01
Childhood growth is a cornerstone of pediatric research. Statistical models need to consider individual trajectories to adequately describe growth outcomes. Specifically, well-defined longitudinal models are essential to characterize both population and subject-specific growth. Linear mixed-effect models with cubic regression splines can account for the nonlinearity of growth curves and provide reasonable estimators of population and subject-specific growth, velocity and acceleration. We provide a stepwise approach that builds from simple to complex models and accounts for the intrinsic complexity of the data. We start with standard cubic splines regression models and build up to a model that includes subject-specific random intercepts and slopes and residual autocorrelation. We then compared cubic regression splines with linear piecewise splines, varying the number and position of knots. Statistical code is provided to ensure reproducibility and improve dissemination of methods. Models are applied to longitudinal height measurements in a cohort of 215 Peruvian children followed from birth until their fourth year of life. Unexplained variability, as measured by the variance of the regression model, was reduced from 7.34 when using ordinary least squares to 0.81 (p < 0.001) when using a linear mixed-effect model with random slopes and a first order continuous autoregressive error term. There was substantial heterogeneity in both the intercept (p < 0.001) and slopes (p < 0.001) of the individual growth trajectories. We also identified important serial correlation within the structure of the data (ρ = 0.66; 95 % CI 0.64 to 0.68; p < 0.001), which we modeled with a first order continuous autoregressive error term, as evidenced by the variogram of the residuals and by a lack of association among residuals. The final model provides a parametric linear regression equation for both estimation and prediction of population- and individual-level growth in height. We show that cubic regression splines are superior to linear regression splines for the case of a small number of knots in both estimation and prediction with the full linear mixed effect model (AIC 19,352 vs. 19,598, respectively). While the regression parameters are more complex to interpret in the former, we argue that inference for any problem depends more on the estimated curve or differences in curves rather than the coefficients. Moreover, use of cubic regression splines provides biologically meaningful growth velocity and acceleration curves despite increased complexity in coefficient interpretation. Through this stepwise approach, we provide a set of tools to model longitudinal childhood data for non-statisticians using linear mixed-effect models.
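A hedged sketch of the core modeling step (the authors provide their own code; file and column names here are hypothetical): a linear mixed-effects model with a cubic spline basis for age and random intercepts and slopes per child, fitted with statsmodels. The continuous AR(1) residual term used in the paper is not reproduced here.

```python
# Population cubic spline + subject-specific random intercept and slope.
import pandas as pd
import statsmodels.formula.api as smf

growth = pd.read_csv("growth_long.csv")   # assumed columns: child_id, age, height

model = smf.mixedlm(
    "height ~ bs(age, df=5, degree=3)",   # population-level cubic regression spline
    data=growth,
    groups=growth["child_id"],            # subject-specific effects
    re_formula="~age",                    # random intercept + random slope
)
result = model.fit(reml=True)
print(result.summary())
```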
A comparison of several methods of solving nonlinear regression groundwater flow problems
Cooley, Richard L.
1985-01-01
Computational efficiency and computer memory requirements for four methods of minimizing functions were compared for four test nonlinear-regression steady state groundwater flow problems. The fastest methods were the Marquardt and quasi-linearization methods, which required almost identical computer times and numbers of iterations; the next fastest was the quasi-Newton method, and last was the Fletcher-Reeves method, which did not converge in 100 iterations for two of the problems. The fastest method per iteration was the Fletcher-Reeves method, and this was followed closely by the quasi-Newton method. The Marquardt and quasi-linearization methods were slower. For all four methods the speed per iteration was directly related to the number of parameters in the model. However, this effect was much more pronounced for the Marquardt and quasi-linearization methods than for the other two. Hence the quasi-Newton (and perhaps Fletcher-Reeves) method might be more efficient than either the Marquardt or quasi-linearization methods if the number of parameters in a particular model were large, although this remains to be proven. The Marquardt method required somewhat less central memory than the quasi-linearization method for three of the four problems. For all four problems the quasi-Newton method required roughly two thirds to three quarters of the memory required by the Marquardt method, and the Fletcher-Reeves method required slightly less memory than the quasi-Newton method. Memory requirements were not excessive for any of the four methods.
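For orientation only (a toy exponential-decay regression, not the groundwater flow models), the contrast between a Marquardt-type least-squares solver and a quasi-Newton minimizer can be reproduced with SciPy as follows.

```python
# Marquardt (Levenberg-Marquardt) vs quasi-Newton (BFGS) on a toy nonlinear fit.
import numpy as np
from scipy.optimize import least_squares, minimize

rng = np.random.default_rng(0)
x = np.linspace(0, 5, 60)
y = 2.0 * np.exp(-1.3 * x) + 0.5 + rng.normal(0, 0.05, x.size)

def residuals(theta):
    a, b, c = theta
    return a * np.exp(-b * x) + c - y

def sse(theta):
    return 0.5 * np.sum(residuals(theta) ** 2)

theta0 = np.array([1.0, 1.0, 0.0])
lm = least_squares(residuals, theta0, method="lm")   # Levenberg-Marquardt
qn = minimize(sse, theta0, method="BFGS")            # quasi-Newton
print("LM estimates:", lm.x, "in", lm.nfev, "function evaluations")
print("BFGS estimates:", qn.x, "in", qn.nfev, "function evaluations")
```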
Evaluation of force-velocity and power-velocity relationship of arm muscles.
Sreckovic, Sreten; Cuk, Ivan; Djuric, Sasa; Nedeljkovic, Aleksandar; Mirkov, Dragan; Jaric, Slobodan
2015-08-01
A number of recent studies have revealed an approximately linear force-velocity (F-V) and, consequently, a parabolic power-velocity (P-V) relationship of multi-joint tasks. However, the measurement characteristics of their parameters have been neglected, particularly those regarding arm muscles, which could be a problem for using the linear F-V model in both research and routine testing. Therefore, the aims of the present study were to evaluate the strength, shape, reliability, and concurrent validity of the F-V relationship of arm muscles. Twelve healthy participants performed maximum bench press throws against loads ranging from 20 to 70 % of their maximum strength, and a linear regression model was applied to the obtained range of F and V data. One-repetition maximum bench press and medicine ball throw tests were also conducted. The observed individual F-V relationships were exceptionally strong (r = 0.96-0.99; all P < 0.05) and fairly linear, although it remains unresolved whether a polynomial fit could provide even stronger relationships. The reliability of parameters obtained from the linear F-V regressions proved to be mainly high (ICC > 0.80), while their concurrent validity regarding directly measured F, P, and V ranged from high (for maximum F) to medium-to-low (for maximum P and V). The findings add to the evidence that the linear F-V and, consequently, parabolic P-V models could be used to study the mechanical properties of muscular systems, as well as to design a relatively simple, reliable, and ecologically valid routine test of the muscle ability of force, power, and velocity production.
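As a minimal sketch with made-up numbers (not the study data), the linear F-V model F = F0 − aV and its derived parabolic P-V curve can be fitted in a few lines.

```python
# Linear F-V fit and derived power-velocity parameters.
import numpy as np

velocity = np.array([0.6, 0.9, 1.2, 1.5, 1.8])          # m/s, hypothetical
force = np.array([620.0, 540.0, 460.0, 390.0, 300.0])   # N, hypothetical

slope, intercept = np.polyfit(velocity, force, 1)        # F = intercept + slope*V
F0, a = intercept, -slope                                # force intercept, slope magnitude
V0 = F0 / a                                              # velocity intercept
P_max = F0 * V0 / 4.0                                    # peak of the parabola P = F*V
print(f"F0 = {F0:.0f} N, V0 = {V0:.2f} m/s, Pmax = {P_max:.0f} W")
```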
da Silva, Claudia Pereira; Emídio, Elissandro Soares; de Marchi, Mary Rosa Rodrigues
2015-01-01
This paper describes the validation of a method consisting of solid-phase extraction followed by gas chromatography-tandem mass spectrometry for the analysis of the ultraviolet (UV) filters benzophenone-3, ethylhexyl salicylate, ethylhexyl methoxycinnamate and octocrylene. The method validation criteria included evaluation of selectivity, analytical curve, trueness, precision, limits of detection and limits of quantification. The non-weighted linear regression model has traditionally been used for calibration, but it is not necessarily the optimal model in all cases. Because the assumption of homoscedasticity was not met for the analytical data in this work, a weighted least squares linear regression was used for the calibration method. The evaluated analytical parameters were satisfactory for the analytes and showed recoveries at four fortification levels between 62% and 107%, with relative standard deviations less than 14%. The detection limits ranged from 7.6 to 24.1 ng L(-1). The proposed method was used to determine the amount of UV filters in water samples from water treatment plants in Araraquara and Jau in São Paulo, Brazil. Copyright © 2014 Elsevier B.V. All rights reserved.
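A hedged sketch of the calibration idea only (concentrations and responses are invented, not the paper's data): when the response variance grows with concentration, a weighted least-squares fit with weights such as 1/x² is a common choice.

```python
# Weighted vs unweighted calibration curve with statsmodels.
import numpy as np
import statsmodels.api as sm

conc = np.array([10, 25, 50, 100, 250, 500, 1000], dtype=float)   # ng/L (hypothetical)
resp = np.array([0.9, 2.3, 4.8, 10.5, 24.0, 51.0, 98.0])          # peak area ratio

X = sm.add_constant(conc)
wls = sm.WLS(resp, X, weights=1.0 / conc**2).fit()   # down-weights high concentrations
ols = sm.OLS(resp, X).fit()
print("WLS intercept/slope:", wls.params)
print("OLS intercept/slope:", ols.params)            # compare with the unweighted fit
```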
NASA Astrophysics Data System (ADS)
Kargoll, Boris; Omidalizarandi, Mohammad; Loth, Ina; Paffenholz, Jens-André; Alkhatib, Hamza
2018-03-01
In this paper, we investigate a linear regression time series model of possibly outlier-afflicted observations and autocorrelated random deviations. This colored noise is represented by a covariance-stationary autoregressive (AR) process, in which the independent error components follow a scaled (Student's) t-distribution. This error model allows for the stochastic modeling of multiple outliers and for an adaptive robust maximum likelihood (ML) estimation of the unknown regression and AR coefficients, the scale parameter, and the degree of freedom of the t-distribution. This approach is meant to be an extension of known estimators, which tend to focus only on the regression model, or on the AR error model, or on normally distributed errors. For the purpose of ML estimation, we derive an expectation conditional maximization either (ECME) algorithm, which leads to an easy-to-implement version of iteratively reweighted least squares. The estimation performance of the algorithm is evaluated via Monte Carlo simulations for a Fourier as well as a spline model in connection with AR colored noise models of different orders and with three different sampling distributions generating the white noise components. We apply the algorithm to a vibration dataset recorded by a high-accuracy, single-axis accelerometer, focusing on the evaluation of the estimated AR colored noise model.
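A much-simplified sketch of the reweighting idea only (fixed degree of freedom, no AR coloring, synthetic data) is shown below; it is not the authors' ECME algorithm.

```python
# Iteratively reweighted least squares for a regression with t-distributed errors.
import numpy as np

def irls_t_regression(X, y, nu=4.0, n_iter=50):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)    # OLS start
    sigma2 = np.mean((y - X @ beta) ** 2)
    for _ in range(n_iter):
        r = y - X @ beta
        w = (nu + 1) / (nu + r**2 / sigma2)         # E-step: t-based robust weights
        sigma2 = np.mean(w * r**2)                  # M-step: scale update
        sw = np.sqrt(w)
        beta, *_ = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)
    return beta, sigma2

rng = np.random.default_rng(2)
x = np.linspace(0, 10, 300)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 0.5 * x + 0.3 * rng.standard_t(df=4, size=x.size)
y[::40] += 5.0                                      # inject outliers
beta, sigma2 = irls_t_regression(X, y)
print("robust estimates:", beta)
```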
Serum Iron Level Is Associated with Time to Antibiotics in Cystic Fibrosis.
Gifford, Alex H; Dorman, Dana B; Moulton, Lisa A; Helm, Jennifer E; Griffin, Mary M; MacKenzie, Todd A
2015-12-01
Serum levels of hepcidin-25, a peptide hormone that reduces blood iron content, are elevated when patients with cystic fibrosis (CF) develop pulmonary exacerbation (PEx). Because hepcidin-25 is unavailable as a clinical laboratory test, we questioned whether a one-time serum iron level was associated with the subsequent number of days until PEx, as defined by the need to receive systemic antibiotics (ABX) for health deterioration. Clinical, biochemical, and microbiological parameters were simultaneously checked in 54 adults with CF. Charts were reviewed to determine when they first experienced a PEx after these parameters were assessed. Time to ABX was compared in subgroups with and without specific attributes. Multivariate linear regression was used to identify parameters that significantly explained variation in time to ABX. In univariate analyses, time to ABX was significantly shorter in subjects with Aspergillus-positive sputum cultures and CF-related diabetes. Multivariate linear regression models demonstrated that shorter time to ABX was associated with younger age, lower serum iron level, and Aspergillus sputum culture positivity. Serum iron, age, and Aspergillus sputum culture positivity are factors associated with shorter time to subsequent PEx in CF adults. © 2015 Wiley Periodicals, Inc.
System Identification Applied to Dynamic CFD Simulation and Wind Tunnel Data
NASA Technical Reports Server (NTRS)
Murphy, Patrick C.; Klein, Vladislav; Frink, Neal T.; Vicroy, Dan D.
2011-01-01
Demanding aerodynamic modeling requirements for military and civilian aircraft have provided impetus for researchers to improve computational and experimental techniques. Model validation is a key component for these research endeavors, so this study is an initial effort to extend conventional time history comparisons by comparing model parameter estimates and their standard errors using system identification methods. An aerodynamic model of an aircraft performing one-degree-of-freedom roll oscillatory motion about its body axes is developed. The model includes linear aerodynamics and deficiency function parameters characterizing an unsteady effect. For estimation of unknown parameters, two techniques, harmonic analysis and two-step linear regression, were applied to roll-oscillatory wind tunnel data and to computational fluid dynamics (CFD) simulated data. The model used for this study is a highly swept wing unmanned aerial combat vehicle. Differences in response prediction, parameter estimates, and standard errors are compared and discussed.
Wu, Lingtao; Lord, Dominique
2017-05-01
This study further examined the use of regression models for developing crash modification factors (CMFs), specifically focusing on the misspecification in the link function. The primary objectives were to validate the accuracy of CMFs derived from the commonly used regression models (i.e., generalized linear models or GLMs with additive linear link functions) when some of the variables have nonlinear relationships and quantify the amount of bias as a function of the nonlinearity. Using the concept of artificial realistic data, various linear and nonlinear crash modification functions (CM-Functions) were assumed for three variables. Crash counts were randomly generated based on these CM-Functions. CMFs were then derived from regression models for three different scenarios. The results were compared with the assumed true values. The main findings are summarized as follows: (1) when some variables have nonlinear relationships with crash risk, the CMFs for these variables derived from the commonly used GLMs are all biased, especially around areas away from the baseline conditions (e.g., boundary areas); (2) with the increase in nonlinearity (i.e., nonlinear relationship becomes stronger), the bias becomes more significant; (3) the quality of CMFs for other variables having linear relationships can be influenced when mixed with those having nonlinear relationships, but the accuracy may still be acceptable; and (4) the misuse of the link function for one or more variables can also lead to biased estimates for other parameters. This study raised the importance of the link function when using regression models for developing CMFs. Copyright © 2017 Elsevier Ltd. All rights reserved.
Chen, Baojiang; Qin, Jing
2014-05-10
In statistical analysis, a regression model is needed if one is interested in finding the relationship between a response variable and covariates. When the response depends on a covariate, it may do so through some unknown function of that covariate. If one has no knowledge of this functional form but expects it to be monotonically increasing or decreasing, then the isotonic regression model is preferable. Estimation of parameters for isotonic regression models is based on the pool-adjacent-violators algorithm (PAVA), in which the monotonicity constraints are built in. With missing data, people often employ the augmented estimating method to improve estimation efficiency by incorporating auxiliary information through a working regression model. However, under the framework of the isotonic regression model, the PAVA does not work as the monotonicity constraints are violated. In this paper, we develop an empirical likelihood-based method for the isotonic regression model to incorporate the auxiliary information. Because the monotonicity constraints still hold, the PAVA can be used for parameter estimation. Simulation studies demonstrate that the proposed method can yield more efficient estimates, and in some situations, the efficiency improvement is substantial. We apply this method to a dementia study. Copyright © 2013 John Wiley & Sons, Ltd.
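For reference, a minimal sketch of isotonic regression via PAVA as wrapped by scikit-learn is shown below; the empirical-likelihood and auxiliary-information machinery of the paper is not reproduced.

```python
# Isotonic regression (PAVA) on a monotone signal with noise.
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(3)
x = np.sort(rng.uniform(0, 10, 100))
y = np.log1p(x) + rng.normal(0, 0.2, 100)     # monotone signal + noise

iso = IsotonicRegression(increasing=True)
y_fit = iso.fit_transform(x, y)               # PAVA enforces monotonicity
print("fitted values are non-decreasing:", bool(np.all(np.diff(y_fit) >= -1e-12)))
```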
Byun, Bo-Ram; Kim, Yong-Il; Maki, Koutaro; Son, Woo-Sung
2015-01-01
This study aimed to examine the correlation between skeletal maturation status and parameters from the odontoid process/body of the second cervical vertebra and the bodies of the third and fourth cervical vertebrae, and to build multiple regression models for estimating skeletal maturation status in Korean girls. Hand-wrist radiographs and cone beam computed tomography (CBCT) images were obtained from 74 Korean girls (6–18 years of age). CBCT-generated cervical vertebral maturation (CVM) was used to demarcate the odontoid process and the body of the second cervical vertebra, based on the dentocentral synchondrosis. Correlation coefficient analysis and multiple linear regression analysis were used for each parameter of the cervical vertebrae (P < 0.05). Forty-seven of 64 parameters from CBCT-generated CVM (independent variables) exhibited statistically significant correlations (P < 0.05). The multiple regression model with the greatest R² had six parameters (PH2/W2, UW2/W2, (OH+AH2)/LW2, UW3/LW3, D3, and H4/W4) as independent variables, with a variance inflation factor (VIF) of <2. CBCT-generated CVM made it possible to include parameters from both the body and the odontoid process of the second cervical vertebra in the multiple regression models. This suggests that quantitative analysis might be used to estimate skeletal maturation status. PMID:25878721
Hyper-Spectral Image Analysis With Partially Latent Regression and Spatial Markov Dependencies
NASA Astrophysics Data System (ADS)
Deleforge, Antoine; Forbes, Florence; Ba, Sileye; Horaud, Radu
2015-09-01
Hyper-spectral data can be analyzed to recover physical properties at large planetary scales. This involves resolving inverse problems which can be addressed within machine learning, with the advantage that, once a relationship between physical parameters and spectra has been established in a data-driven fashion, the learned relationship can be used to estimate physical parameters for new hyper-spectral observations. Within this framework, we propose a spatially-constrained and partially-latent regression method which maps high-dimensional inputs (hyper-spectral images) onto low-dimensional responses (physical parameters such as the local chemical composition of the soil). The proposed regression model comprises two key features. Firstly, it combines a Gaussian mixture of locally-linear mappings (GLLiM) with a partially-latent response model. While the former makes high-dimensional regression tractable, the latter makes it possible to deal with physical parameters that cannot be observed or, more generally, with data contaminated by experimental artifacts that cannot be explained with noise models. Secondly, spatial constraints are introduced in the model through a Markov random field (MRF) prior which provides a spatial structure to the Gaussian-mixture hidden variables. Experiments conducted on a database composed of remotely sensed observations collected from the Mars planet by the Mars Express orbiter demonstrate the effectiveness of the proposed model.
Analysis of carbon dioxide bands near 2.2 micrometers
NASA Technical Reports Server (NTRS)
Abubaker, M. S.; Shaw, J. H.
1984-01-01
Carbon dioxide is one of the more important atmospheric infrared-absorbing gases due to its relatively high, and increasing, concentration. The spectral parameters of its bands are required for understanding radiative heat transfer in the atmosphere. The line intensities, positions, line half-widths, rotational constants, and band centers of three overlapping bands of CO2 near 2.2 microns are presented. Non-linear least squares (NLLS) regression procedures were employed to determine these parameters.
NASA Astrophysics Data System (ADS)
Yadav, Manish; Singh, Nitin Kumar
2017-12-01
A comparison of the linear and non-linear regression methods for selecting the optimum isotherm among three of the most commonly used adsorption isotherms (Langmuir, Freundlich, and Redlich-Peterson) was made on experimental data of fluoride (F) sorption onto Bio-F at a solution temperature of 30 ± 1 °C. The coefficient of correlation (r²) was used to select the best theoretical isotherm among the investigated ones. Four linear forms of the Langmuir equation were discussed, of which the most popular, Langmuir-1 and Langmuir-2, showed higher coefficients of determination (0.976 and 0.989) than the other Langmuir linear equations. The Freundlich and Redlich-Peterson isotherms showed a better fit to the experimental data in the linear least-squares method, while in the non-linear method the Redlich-Peterson isotherm showed the best fit to the tested data set. The present study showed that the non-linear method could be a better way to obtain the isotherm parameters and represent the most suitable isotherm. The Redlich-Peterson isotherm was found to be the best representative (r² = 0.999) for this sorption system. It is also observed that the values of β are not close to unity, which means the isotherm approaches the Freundlich rather than the Langmuir isotherm.
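The contrast between linearized and direct non-linear isotherm fitting can be illustrated with the sketch below (synthetic equilibrium data, not the Bio-F measurements; the Langmuir-1 linearization Ce/qe = Ce/qm + 1/(qm·KL) is used).

```python
# Langmuir parameters from a linearized fit vs non-linear least squares.
import numpy as np
from scipy.optimize import curve_fit

def langmuir(Ce, qm, KL):
    return qm * KL * Ce / (1.0 + KL * Ce)

rng = np.random.default_rng(4)
Ce = np.array([1, 2, 5, 10, 20, 40, 80], dtype=float)        # mg/L, hypothetical
qe = langmuir(Ce, qm=8.0, KL=0.15) * (1 + rng.normal(0, 0.03, Ce.size))

# Linearized Langmuir-1: Ce/qe = Ce/qm + 1/(qm*KL)
slope, intercept = np.polyfit(Ce, Ce / qe, 1)
qm_lin, KL_lin = 1.0 / slope, slope / intercept

# Non-linear least squares on the original form
(qm_nl, KL_nl), _ = curve_fit(langmuir, Ce, qe, p0=[5.0, 0.1])
print(f"linearized: qm={qm_lin:.2f}, KL={KL_lin:.3f}")
print(f"non-linear: qm={qm_nl:.2f}, KL={KL_nl:.3f}")
```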
Iterative integral parameter identification of a respiratory mechanics model.
Schranz, Christoph; Docherty, Paul D; Chiew, Yeong Shiong; Möller, Knut; Chase, J Geoffrey
2012-07-18
Patient-specific respiratory mechanics models can support the evaluation of optimal lung protective ventilator settings during ventilation therapy. Clinical application requires that the individual's model parameter values must be identified with information available at the bedside. Multiple linear regression or gradient-based parameter identification methods are highly sensitive to noise and initial parameter estimates. Thus, they are difficult to apply at the bedside to support therapeutic decisions. An iterative integral parameter identification method is applied to a second order respiratory mechanics model. The method is compared to the commonly used regression methods and error-mapping approaches using simulated and clinical data. The clinical potential of the method was evaluated on data from 13 Acute Respiratory Distress Syndrome (ARDS) patients. The iterative integral method converged to error minima 350 times faster than the Simplex Search Method using simulation data sets and 50 times faster using clinical data sets. Established regression methods reported erroneous results due to sensitivity to noise. In contrast, the iterative integral method was effective independent of initial parameter estimations, and converged successfully in each case tested. These investigations reveal that the iterative integral method is beneficial with respect to computing time, operator independence and robustness, and thus applicable at the bedside for this clinical application.
Sahin, Rubina; Tapadia, Kavita
2015-01-01
Three widely used isotherms (Langmuir, Freundlich and Temkin) were examined in an experiment on fluoride (F⁻) ion adsorption onto a geo-material (limonite) at four different temperatures, using linear and non-linear models. Linear and non-linear regression models were compared for selecting the optimum isotherm for the experimental results. The coefficient of determination, r², was used to select the best theoretical isotherm. The four Langmuir linear equations (1, 2, 3, and 4) are discussed. Langmuir isotherm parameters obtained from the four Langmuir linear equations using the linear model differed, but they were the same when using the non-linear model. Langmuir-2 is one of the linear forms, and it had the highest coefficient of determination (r² = 0.99) compared to the other Langmuir linear equations (1, 3 and 4) in linear form, whereas, for the non-linear approach, Langmuir-4 fitted best among all the isotherms because it had the highest coefficient of determination (r² = 0.99). The results showed that the non-linear model may be a better way to obtain the parameters. In the present work, the thermodynamic parameters show that the adsorption of fluoride onto limonite is both spontaneous (ΔG < 0) and endothermic (ΔH > 0). Scanning electron microscope and X-ray diffraction images also confirm the adsorption of the F⁻ ion onto limonite. The isotherm and kinetic study reveals that limonite can be used as an adsorbent for fluoride removal. In the future, limonite could support large-scale fluoride-removal technology, as it is cost-effective, eco-friendly and easily available in the study area.
Survival Data and Regression Models
NASA Astrophysics Data System (ADS)
Grégoire, G.
2014-12-01
We start this chapter by introducing some basic elements for the analysis of censored survival data. Then we focus on right-censored data and develop two types of regression models. The first concerns the so-called accelerated failure time (AFT) models, which are parametric models where a function of a parameter depends linearly on the covariables. The second is a semiparametric model, where the covariables enter in a multiplicative form in the expression of the hazard rate function. The main statistical tool for analysing these regression models is the maximum likelihood methodology; although we recall some essential results of ML theory, we refer the reader to the chapter "Logistic Regression" for a more detailed presentation.
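A hedged illustration of both model families, using the lifelines package and its bundled Rossi recidivism dataset rather than the chapter's own examples, is sketched below.

```python
# Parametric AFT model and semiparametric proportional-hazards model
# on right-censored data.
from lifelines import CoxPHFitter, WeibullAFTFitter
from lifelines.datasets import load_rossi

df = load_rossi()                       # duration column 'week', event flag 'arrest'

aft = WeibullAFTFitter().fit(df, duration_col="week", event_col="arrest")
cox = CoxPHFitter().fit(df, duration_col="week", event_col="arrest")

aft.print_summary()                     # covariates act linearly on log failure time
cox.print_summary()                     # covariates act multiplicatively on the hazard
```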
Jürgens, Julian H W; Schulz, Nadine; Wybranski, Christian; Seidensticker, Max; Streit, Sebastian; Brauner, Jan; Wohlgemuth, Walter A; Deuerling-Zheng, Yu; Ricke, Jens; Dudeck, Oliver
2015-02-01
The objective of this study was to compare the parameter maps of a new flat-panel detector application for time-resolved perfusion imaging in the angiography room (FD-CTP) with computed tomography perfusion (CTP) in an experimental tumor model. Twenty-four VX2 tumors were implanted into the hind legs of 12 rabbits. Three weeks later, FD-CTP (Artis zeego; Siemens) and CTP (SOMATOM Definition AS +; Siemens) were performed. The parameter maps for the FD-CTP were calculated using a prototype software, and those for the CTP were calculated with VPCT-body software on a dedicated syngo MultiModality Workplace. The parameters were compared using Pearson product-moment correlation coefficient and linear regression analysis. The Pearson product-moment correlation coefficient showed good correlation values for both the intratumoral blood volume of 0.848 (P < 0.01) and the blood flow of 0.698 (P < 0.01). The linear regression analysis of the perfusion between FD-CTP and CTP showed for the blood volume a regression equation y = 4.44x + 36.72 (P < 0.01) and for the blood flow y = 0.75x + 14.61 (P < 0.01). This preclinical study provides evidence that FD-CTP allows a time-resolved (dynamic) perfusion imaging of tumors similar to CTP, which provides the basis for clinical applications such as the assessment of tumor response to locoregional therapies directly in the angiography suite.
Inflammation, homocysteine and carotid intima-media thickness.
Baptista, Alexandre P; Cacdocar, Sanjiva; Palmeiro, Hugo; Faísca, Marília; Carrasqueira, Herménio; Morgado, Elsa; Sampaio, Sandra; Cabrita, Ana; Silva, Ana Paula; Bernardo, Idalécio; Gome, Veloso; Neves, Pedro L
2008-01-01
Cardiovascular disease is the main cause of morbidity and mortality in chronic renal patients. Carotid intima-media thickness (CIMT) is one of the most accurate markers of atherosclerosis risk. In this study, the authors set out to evaluate a population of chronic renal patients to determine which factors are associated with an increase in intima-media thickness. We included 56 patients (F=22, M=34), with a mean age of 68.6 years, and an estimated glomerular filtration rate of 15.8 ml/min (calculated by the MDRD equation). Various laboratory and inflammatory parameters (hsCRP, IL-6 and TNF-alpha) were evaluated. All subjects underwent measurement of internal carotid artery intima-media thickness by high-resolution real-time B-mode ultrasonography using a 10 MHz linear transducer. Intima-media thickness was used as a dependent variable in a simple linear regression model, with the various laboratory parameters as independent variables. Only parameters showing a significant correlation with CIMT were evaluated in a multiple regression model: age (p=0.001), hemoglobin (p=00.3), logCRP (p=0.042), logIL-6 (p=0.004) and homocysteine (p=0.002). In the multiple regression model we found that age (p=0.001) and homocysteine (p=0.027) were independently correlated with CIMT. LogIL-6 did not reach statistical significance (p=0.057), probably due to the small population size. The authors conclude that age and homocysteine correlate with carotid intima-media thickness, and thus can be considered as markers/risk factors in chronic renal patients.
Speech prosody impairment predicts cognitive decline in Parkinson's disease.
Rektorova, Irena; Mekyska, Jiri; Janousova, Eva; Kostalova, Milena; Eliasova, Ilona; Mrackova, Martina; Berankova, Dagmar; Necasova, Tereza; Smekal, Zdenek; Marecek, Radek
2016-08-01
Impairment of speech prosody is characteristic for Parkinson's disease (PD) and does not respond well to dopaminergic treatment. We assessed whether baseline acoustic parameters, alone or in combination with other predominantly non-dopaminergic symptoms may predict global cognitive decline as measured by the Addenbrooke's cognitive examination (ACE-R) and/or worsening of cognitive status as assessed by a detailed neuropsychological examination. Forty-four consecutive non-depressed PD patients underwent clinical and cognitive testing, and acoustic voice analysis at baseline and at the two-year follow-up. Influence of speech and other clinical parameters on worsening of the ACE-R and of the cognitive status was analyzed using linear and logistic regression. The cognitive status (classified as normal cognition, mild cognitive impairment and dementia) deteriorated in 25% of patients during the follow-up. The multivariate linear regression model consisted of the variation in range of the fundamental voice frequency (F0VR) and the REM Sleep Behavioral Disorder Screening Questionnaire (RBDSQ). These parameters explained 37.2% of the variability of the change in ACE-R. The most significant predictors in the univariate logistic regression were the speech index of rhythmicity (SPIR; p = 0.012), disease duration (p = 0.019), and the RBDSQ (p = 0.032). The multivariate regression analysis revealed that SPIR alone led to 73.2% accuracy in predicting a change in cognitive status. Combining SPIR with RBDSQ improved the prediction accuracy of SPIR alone by 7.3%. Impairment of speech prosody together with symptoms of RBD predicted rapid cognitive decline and worsening of PD cognitive status during a two-year period. Copyright © 2016 Elsevier Ltd. All rights reserved.
Regression to fuzziness method for estimation of remaining useful life in power plant components
NASA Astrophysics Data System (ADS)
Alamaniotis, Miltiadis; Grelle, Austin; Tsoukalas, Lefteri H.
2014-10-01
Mitigation of severe accidents in power plants requires the reliable operation of all systems and the on-time replacement of mechanical components. Therefore, the continuous surveillance of power systems is a crucial concern for the overall safety, cost control, and on-time maintenance of a power plant. In this paper a methodology called regression to fuzziness is presented that estimates the remaining useful life (RUL) of power plant components. The RUL is defined as the difference between the time that a measurement was taken and the estimated failure time of that component. The methodology aims to compensate for a potential lack of historical data by modeling an expert's operational experience and expertise applied to the system. It initially identifies critical degradation parameters and their associated value range. Once completed, the operator's experience is modeled through fuzzy sets which span the entire parameter range. This model is then synergistically used with linear regression and a component's failure point to estimate the RUL. The proposed methodology is tested on estimating the RUL of a turbine (the basic electrical generating component of a power plant) in three different cases. Results demonstrate the benefits of the methodology for components for which operational data are not readily available and emphasize the significance of the selection of fuzzy sets and the effect of knowledge representation on the predicted output. To verify the effectiveness of the methodology, it was benchmarked against a data-based simple linear regression model used for predictions, which was shown to perform equally well or worse than the presented methodology. Furthermore, the methodology comparison highlighted the improvement in estimation offered by the adoption of appropriate fuzzy sets for parameter representation.
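A simplified sketch of the general idea under stated assumptions (one degradation parameter, hypothetical membership function and failure threshold, synthetic data; not the authors' formulation): expert-defined fuzzy sets weight the observations before a linear trend is extrapolated to the failure point.

```python
# Fuzzy-weighted linear regression to estimate remaining useful life (RUL).
import numpy as np

def triangular(x, a, b, c):
    """Triangular membership function on [a, c] peaking at b."""
    return np.clip(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0, 1.0)

t = np.arange(0.0, 60.0)                         # operating hours (toy data)
vib = 1.0 + 0.04 * t + np.random.default_rng(5).normal(0, 0.05, t.size)
FAILURE_LEVEL = 4.0                              # assumed failure threshold

# Expert knowledge: the "degraded" regime of the vibration parameter
weight = triangular(vib, a=1.5, b=3.0, c=4.5)

# Weighted linear regression of vibration on time, then solve for the
# time at which the trend reaches the failure level.
W = np.diag(weight + 1e-6)
X = np.column_stack([np.ones_like(t), t])
beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ vib)
t_fail = (FAILURE_LEVEL - beta[0]) / beta[1]
print("estimated RUL at t=59 h:", round(t_fail - t[-1], 1), "hours")
```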
NASA Technical Reports Server (NTRS)
Suit, W. T.; Cannaday, R. L.
1979-01-01
The longitudinal and lateral stability and control parameters for a high-wing, general aviation airplane are examined. Estimates obtained using flight data at various flight conditions within the normal operating range of the aircraft are presented. The estimation techniques, an output error technique (maximum likelihood) and an equation error technique (linear regression), are presented. The longitudinal static parameters are estimated from climbing, descending, and quasi-steady-state flight data. The lateral excitations involve a combination of rudder and ailerons. The sensitivity of the aircraft modes of motion to variations in the parameter estimates is discussed.
A method for operative quantitative interpretation of multispectral images of biological tissues
NASA Astrophysics Data System (ADS)
Lisenko, S. A.; Kugeiko, M. M.
2013-10-01
A method has been developed for the operative retrieval of spatial distributions of biophysical parameters of a biological tissue from a multispectral image of the tissue. The method is based on multiple regressions between linearly independent components of the diffuse reflection spectrum of the tissue and the unknown parameters. The possibilities of the method are illustrated by an example of determining biophysical parameters of the skin (concentrations of melanin, hemoglobin and bilirubin, blood oxygenation, and the scattering coefficient of the tissue). Examples of quantitative interpretation of experimental data are presented.
Edmunds, Kyle; Gíslason, Magnús; Sigurðsson, Sigurður; Guðnason, Vilmundur; Harris, Tamara; Carraro, Ugo; Gargiulo, Paolo
2018-01-01
Sarcopenic muscular degeneration has been consistently identified as an independent risk factor for mortality in aging populations. Recent investigations have realized the quantitative potential of computed tomography (CT) image analysis to describe skeletal muscle volume and composition; however, the optimum approach to assessing these data remains debated. Current literature reports average Hounsfield unit (HU) values and/or segmented soft tissue cross-sectional areas to investigate muscle quality. However, standardized methods for CT analyses and their utility as a comorbidity index remain undefined, and no existing studies compare these methods to the assessment of entire radiodensitometric distributions. The primary aim of this study was to present a comparison of nonlinear trimodal regression analysis (NTRA) parameters of entire radiodensitometric muscle distributions against extant CT metrics and their correlation with lower extremity function (LEF) biometrics (normal/fast gait speed, timed up-and-go, and isometric leg strength) and biochemical and nutritional parameters, such as total solubilized cholesterol (SCHOL) and body mass index (BMI). Data were obtained from 3,162 subjects, aged 66-96 years, from the population-based AGES-Reykjavik Study. 1-D k-means clustering was employed to discretize each biometric and comorbidity dataset into twelve subpopulations, in accordance with Sturges' Formula for Class Selection. Dataset linear regressions were performed against eleven NTRA distribution parameters and standard CT analyses (fat/muscle cross-sectional area and average HU value). Parameters from NTRA and CT standards were analogously assembled by age and sex. Analysis of specific NTRA parameters with standard CT results showed linear correlation coefficients greater than 0.85, but multiple regression analysis of correlative NTRA parameters yielded a correlation coefficient of 0.99 (P<0.005). These results highlight the specificities of each muscle quality metric to LEF biometrics, SCHOL, and BMI, and particularly highlight the value of the connective tissue regime in this regard.
Predicting birth weight with conditionally linear transformation models.
Möst, Lisa; Schmid, Matthias; Faschingbauer, Florian; Hothorn, Torsten
2016-12-01
Low and high birth weight (BW) are important risk factors for neonatal morbidity and mortality. Gynecologists must therefore accurately predict BW before delivery. Most prediction formulas for BW are based on prenatal ultrasound measurements carried out within one week prior to birth. Although successfully used in clinical practice, these formulas focus on point predictions of BW but do not systematically quantify uncertainty of the predictions, i.e. they result in estimates of the conditional mean of BW but do not deliver prediction intervals. To overcome this problem, we introduce conditionally linear transformation models (CLTMs) to predict BW. Instead of focusing only on the conditional mean, CLTMs model the whole conditional distribution function of BW given prenatal ultrasound parameters. Consequently, the CLTM approach delivers both point predictions of BW and fetus-specific prediction intervals. Prediction intervals constitute an easy-to-interpret measure of prediction accuracy and allow identification of fetuses subject to high prediction uncertainty. Using a data set of 8712 deliveries at the Perinatal Centre at the University Clinic Erlangen (Germany), we analyzed variants of CLTMs and compared them to standard linear regression estimation techniques used in the past and to quantile regression approaches. The best-performing CLTM variant was competitive with quantile regression and linear regression approaches in terms of conditional coverage and average length of the prediction intervals. We propose that CLTMs be used because they are able to account for possible heteroscedasticity, kurtosis, and skewness of the distribution of BWs. © The Author(s) 2014.
Alwee, Razana; Hj Shamsuddin, Siti Mariyam; Sallehuddin, Roselina
2013-01-01
Crimes forecasting is an important area in the field of criminology. Linear models, such as regression and econometric models, are commonly applied in crime forecasting. However, in real crime data, it is common that the data consist of both linear and nonlinear components. A single model may not be sufficient to identify all the characteristics of the data. The purpose of this study is to introduce a hybrid model that combines support vector regression (SVR) and autoregressive integrated moving average (ARIMA) to be applied in crime rate forecasting. SVR is very robust with small training data and high-dimensional problems. Meanwhile, ARIMA has the ability to model several types of time series. However, the accuracy of the SVR model depends on the values of its parameters, while ARIMA is not robust when applied to small data sets. Therefore, to overcome this problem, particle swarm optimization is used to estimate the parameters of the SVR and ARIMA models. The proposed hybrid model is used to forecast the property crime rates of the United States based on economic indicators. The experimental results show that the proposed hybrid model is able to produce more accurate forecasting results as compared to the individual models. PMID:23766729
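A hedged sketch of a hybrid forecaster in the spirit of the paper is given below; the particle swarm optimization tuning step is omitted, and the series and model orders are hypothetical. ARIMA captures the linear structure while an SVR models the remaining nonlinear residuals.

```python
# Hybrid ARIMA + SVR one-step-ahead forecast on a synthetic series.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from sklearn.svm import SVR

rng = np.random.default_rng(6)
t = np.arange(300)
series = 0.02 * t + np.sin(t / 8.0) ** 3 + rng.normal(0, 0.1, t.size)

arima = ARIMA(series, order=(2, 1, 1)).fit()
residuals = series - arima.fittedvalues           # nonlinear component left over

# SVR on lagged residuals (one-step-ahead)
lags = 5
X = np.column_stack([residuals[i:len(residuals) - lags + i] for i in range(lags)])
y = residuals[lags:]
svr = SVR(kernel="rbf", C=10.0, epsilon=0.01).fit(X, y)

linear_part = arima.forecast(steps=1)[0]
nonlinear_part = svr.predict(residuals[-lags:].reshape(1, -1))[0]
print("hybrid one-step forecast:", linear_part + nonlinear_part)
```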
Analysis and selection of magnitude relations for the Working Group on Utah Earthquake Probabilities
Duross, Christopher; Olig, Susan; Schwartz, David
2015-01-01
Prior to calculating time-independent and -dependent earthquake probabilities for faults in the Wasatch Front region, the Working Group on Utah Earthquake Probabilities (WGUEP) updated a seismic-source model for the region (Wong and others, 2014) and evaluated 19 historical regressions on earthquake magnitude (M). These regressions relate M to fault parameters for historical surface-faulting earthquakes, including linear fault length (e.g., surface-rupture length [SRL] or segment length), average displacement, maximum displacement, rupture area, seismic moment (Mo ), and slip rate. These regressions show that significant epistemic uncertainties complicate the determination of characteristic magnitude for fault sources in the Basin and Range Province (BRP). For example, we found that M estimates (as a function of SRL) span about 0.3–0.4 units (figure 1) owing to differences in the fault parameter used; age, quality, and size of historical earthquake databases; and fault type and region considered.
Statistical approach to Higgs boson couplings in the standard model effective field theory
NASA Astrophysics Data System (ADS)
Murphy, Christopher W.
2018-01-01
We perform a parameter fit in the standard model effective field theory (SMEFT) with an emphasis on using regularized linear regression to tackle the issue of the large number of parameters in the SMEFT. In regularized linear regression, a positive definite function of the parameters of interest is added to the usual cost function. A cross-validation is performed to try to determine the optimal value of the regularization parameter to use, but it selects the standard model (SM) as the best model to explain the measurements. Nevertheless, as a proof of principle of this technique, we apply it to fitting Higgs boson signal strengths in the SMEFT, including the latest Run-2 results. Results are presented in terms of the eigensystem of the covariance matrix of the least squares estimators, as it has a degree of model independence to it. We find several results in this initial work: the SMEFT predicts the total width of the Higgs boson to be consistent with the SM prediction; the ATLAS and CMS experiments at the LHC are currently sensitive to non-resonant double Higgs boson production. Constraints are derived on the viable parameter space for electroweak baryogenesis in the SMEFT, reinforcing the notion that a first-order phase transition requires fairly low-scale beyond-the-SM physics. Finally, we study which future experimental measurements would give the most improvement on the global constraints on the Higgs sector of the SMEFT.
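As a toy illustration only (random design, not the SMEFT likelihood), ridge-regularized linear regression with cross-validated penalty selection looks as follows; a very large selected penalty shrinks all coefficients toward zero, analogous to the fit preferring the SM point.

```python
# Ridge regression with cross-validated regularization strength.
import numpy as np
from sklearn.linear_model import RidgeCV

rng = np.random.default_rng(7)
n_obs, n_par = 50, 30                       # few measurements, many parameters
X = rng.normal(size=(n_obs, n_par))
y = X[:, :3] @ np.array([0.5, -0.2, 0.1]) + rng.normal(0, 0.5, n_obs)

ridge = RidgeCV(alphas=np.logspace(-3, 3, 25), cv=5).fit(X, y)
print("selected regularization strength:", ridge.alpha_)
print("largest |coefficient|:", np.abs(ridge.coef_).max())
```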
[Aboveground biomass of three conifers in Qianyanzhou plantation].
Li, Xuanran; Liu, Qijing; Chen, Yongrui; Hu, Lile; Yang, Fengting
2006-08-01
In this paper, regression models for the aboveground biomass of Pinus elliottii, P. massoniana and Cunninghamia lanceolata in Qianyanzhou of subtropical China were established, and regression analysis of the dry weight of leaf biomass and total biomass against branch diameter (d), branch length (L), d³ and d²L was conducted with linear, power and exponential functions. A power equation with a single parameter (d) proved better than the rest for P. massoniana and C. lanceolata, and a linear equation with the parameter d³ was better for P. elliottii. The canopy biomass was derived by applying the regression equations to all branches. These equations were also used to fit the relationships of total tree biomass, branch biomass and foliage biomass with tree diameter at breast height (D), tree height (H), D³ and D²H, respectively. D²H was found to be the best parameter for estimating total biomass. For foliage and branch biomass, both the parameters and equation forms showed some differences among species. Correlations were highly significant (P < 0.001) for foliage, branch and total biomass, with the highest for total biomass. By these equations, the aboveground biomass and its allocation were estimated: the aboveground biomass of the P. massoniana, P. elliottii, and C. lanceolata forests was 83.6, 72.1 and 59 t·hm⁻², respectively, with more stem biomass than foliage and branch biomass. According to previous studies, the underground biomass of these three forests was estimated to be 10.44, 9.42 and 11.48 t·hm⁻², and the amount of fixed carbon was 47.94, 45.14 and 37.52 t·hm⁻², respectively.
Penalized nonparametric scalar-on-function regression via principal coordinates
Reiss, Philip T.; Miller, David L.; Wu, Pei-Shien; Hua, Wen-Yu
2016-01-01
A number of classical approaches to nonparametric regression have recently been extended to the case of functional predictors. This paper introduces a new method of this type, which extends intermediate-rank penalized smoothing to scalar-on-function regression. In the proposed method, which we call principal coordinate ridge regression, one regresses the response on leading principal coordinates defined by a relevant distance among the functional predictors, while applying a ridge penalty. Our publicly available implementation, based on generalized additive modeling software, allows for fast optimal tuning parameter selection and for extensions to multiple functional predictors, exponential family-valued responses, and mixed-effects models. In an application to signature verification data, principal coordinate ridge regression, with dynamic time warping distance used to define the principal coordinates, is shown to outperform a functional generalized linear model. PMID:29217963
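A minimal sketch of the two building blocks, under stated assumptions (Euclidean distance between toy curves and a plain ridge penalty, rather than the paper's distance choices and GAM-based tuning): classical multidimensional scaling yields principal coordinates of the functional predictors, which then enter a ridge-penalized scalar-on-function regression.

```python
# Principal coordinates (classical MDS) + ridge regression on toy curves.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(8)
n, grid = 80, np.linspace(0, 1, 50)
curves = np.array([np.sin(2 * np.pi * (grid + rng.uniform())) * rng.uniform(0.5, 1.5)
                   for _ in range(n)])
y = curves.max(axis=1) + rng.normal(0, 0.05, n)   # scalar response

# Classical MDS: double-center the squared distance matrix and take
# the leading eigenvectors as principal coordinates.
D2 = np.square(np.linalg.norm(curves[:, None, :] - curves[None, :, :], axis=-1))
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ D2 @ J
vals, vecs = np.linalg.eigh(B)
order = np.argsort(vals)[::-1][:5]                # keep 5 leading coordinates
coords = vecs[:, order] * np.sqrt(np.maximum(vals[order], 0.0))

model = Ridge(alpha=1.0).fit(coords, y)
print("in-sample R^2:", round(model.score(coords, y), 3))
```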
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sun Wei; Huang, Guo H., E-mail: huang@iseis.org; Institute for Energy, Environment and Sustainable Communities, University of Regina, Regina, Saskatchewan, S4S 0A2
2012-06-15
Highlights: Inexact piecewise-linearization-based fuzzy flexible programming is proposed. It is the first application to waste management under multiple complexities. It tackles nonlinear economies-of-scale effects in interval-parameter constraints. It estimates costs more accurately than the linear-regression-based model. Uncertainties are decreased and more satisfactory interval solutions are obtained. - Abstract: To tackle nonlinear economies-of-scale (EOS) effects in interval-parameter constraints for a representative waste management problem, an inexact piecewise-linearization-based fuzzy flexible programming (IPFP) model is developed. In IPFP, interval parameters for waste amounts and transportation/operation costs can be quantified; aspiration levels for net system costs, as well as tolerance intervals for both capacities of waste treatment facilities and waste generation rates, can be reflected; and the nonlinear EOS effects transformed from the objective function to the constraints can be approximated. An interactive algorithm is proposed for solving the IPFP model, which in nature is an interval-parameter mixed-integer quadratically constrained programming model. To demonstrate the IPFP's advantages, two alternative models are developed to compare their performances. One is a conventional linear-regression-based inexact fuzzy programming model (IPFP2) and the other is an IPFP model with all right-hand sides of the fuzzy constraints being the corresponding interval numbers (IPFP3). The comparison results between IPFP and IPFP2 indicate that the optimized waste amounts would have similar patterns in both models. However, when dealing with EOS effects in constraints, IPFP2 may underestimate the net system costs, while IPFP can estimate the costs more accurately. The comparison results between IPFP and IPFP3 indicate that their solutions would be significantly different. The decreased system uncertainties in IPFP's solutions demonstrate its effectiveness for providing more satisfactory interval solutions than IPFP3. Following its first application to waste management, the IPFP can be potentially applied to other environmental problems under multiple complexities.
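A generic, hedged sketch of the piecewise-linearization step only (toy concave cost curve and hypothetical breakpoints, not the IPFP model itself): an economies-of-scale cost c(q) = k·q^0.8 is replaced by chord segments so it can enter a linear or interval-parameter program.

```python
# Piecewise linearization of a concave (economies-of-scale) cost curve.
import numpy as np

def piecewise_segments(cost, breakpoints):
    """Return (slope, intercept) of the chord on each [b_i, b_{i+1}] interval."""
    segments = []
    for lo, hi in zip(breakpoints[:-1], breakpoints[1:]):
        slope = (cost(hi) - cost(lo)) / (hi - lo)
        intercept = cost(lo) - slope * lo
        segments.append((slope, intercept))
    return segments

cost = lambda q: 120.0 * q**0.8                   # cost per period, EOS exponent < 1
breaks = np.array([10.0, 50.0, 150.0, 400.0])     # tonnes/day, assumed breakpoints
for (slope, intercept), lo, hi in zip(piecewise_segments(cost, breaks),
                                      breaks[:-1], breaks[1:]):
    print(f"{lo:5.0f}-{hi:5.0f} t/d: cost ~ {slope:6.1f}*q + {intercept:8.1f}")
```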
NASA Astrophysics Data System (ADS)
Astuti, H. N.; Saputro, D. R. S.; Susanti, Y.
2017-06-01
The MGWR model combines a linear regression model with a geographically weighted regression (GWR) model, so it produces both global parameter estimates and local parameter estimates that vary with the observation location. The linkage between observation locations is expressed through a specific weighting function, here the adaptive bi-square kernel. In this research, we applied the MGWR model with adaptive bi-square weighting to the case of DHF in Surakarta, based on 10 factors (variables) presumed to influence the number of people with DHF. The observation units are 51 urban villages, and the variables are number of inhabitants, number of houses, house index, number of public places, number of healthy homes, number of Posyandu, area width, population density level, family welfare, and high-region. Based on this research, we obtained 51 MGWR models, which fall into 4 groups; the house index is a significant global variable, area width a significant local variable, and the remaining significant variables vary by location. Global variables are variables that significantly affect all locations, while local variables significantly affect only specific locations.
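The entry hinges on adaptive bi-square spatial weighting. The sketch below is a minimal illustration, not the authors' code: it builds adaptive bi-square weights from the distance to each location's q-th nearest neighbour and uses them in a local weighted least-squares fit; the coordinates, the neighbour count q, and the helper names are invented.

```python
import numpy as np

def adaptive_bisquare_weights(coords, q):
    """Adaptive bi-square kernel: the bandwidth at each location is the distance
    to its q-th nearest neighbour, so dense areas get narrower kernels."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    bw = np.sort(d, axis=1)[:, q]                  # q-th nearest neighbour (index 0 is self)
    ratio = d / bw[:, None]
    return np.where(ratio < 1.0, (1.0 - ratio ** 2) ** 2, 0.0)

def local_wls(X, y, w):
    """Weighted least squares at one calibration location (a single GWR fit)."""
    Xd = np.column_stack([np.ones(len(y)), X])
    W = np.diag(w)
    return np.linalg.solve(Xd.T @ W @ Xd, Xd.T @ W @ y)

# toy example: 51 locations, 2 covariates, spatially constant true coefficients
rng = np.random.default_rng(1)
coords = rng.uniform(0.0, 10.0, size=(51, 2))
X = rng.standard_normal((51, 2))
y = 1.0 + X @ np.array([0.5, -1.2]) + 0.1 * rng.standard_normal(51)

W = adaptive_bisquare_weights(coords, q=15)
betas = np.array([local_wls(X, y, W[i]) for i in range(len(y))])
print(betas[:3])   # local intercept and slopes for the first three locations
```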
A reliable and cost effective approach for radiographic monitoring in nutritional rickets.
Chatterjee, D; Gupta, V; Sharma, V; Sinha, B; Samanta, S
2014-04-01
Radiological scoring is particularly useful in rickets, where pre-treatment radiographical findings can reflect the disease severity and can be used to monitor the improvement. However, there is only a single radiographic scoring system for rickets developed by Thacher and, to the best of our knowledge, no study has evaluated radiographic changes in rickets based on this scoring system apart from the one done by Thacher himself. The main objective of this study is to compare and analyse the pre-treatment and post-treatment radiographic parameters in nutritional rickets with the help of Thacher's scoring technique. 176 patients with nutritional rickets were given a single intramuscular injection of vitamin D (600 000 IU) along with oral calcium (50 mg kg⁻¹) and vitamin D (400 IU per day) until radiological resolution and followed for 1 year. Pre- and post-treatment radiological parameters were compared and analysed statistically based on Thacher's scoring system. Radiological resolution was complete by 6 months. Time for radiological resolution and initial radiological score were linearly associated on regression analysis. The distal ulna was the last to heal in most cases except when the initial score was 10, when distal femur was the last to heal. Thacher's scoring system can effectively monitor nutritional rickets. The formula derived through linear regression has prognostic significance. The distal femur is a better indicator in radiologically severe rickets and when resolution is delayed. Thacher's scoring is very useful for monitoring of rickets. The formula derived through linear regression can predict the expected time for radiological resolution.
Conditional parametric models for storm sewer runoff
NASA Astrophysics Data System (ADS)
Jonsdottir, H.; Nielsen, H. Aa; Madsen, H.; Eliasson, J.; Palsson, O. P.; Nielsen, M. K.
2007-05-01
The method of conditional parametric modeling is introduced for flow prediction in a sewage system. It is a well-known fact that in hydrological modeling the response (runoff) to input (precipitation) varies depending on soil moisture and several other factors. Consequently, nonlinear input-output models are needed. The model formulation described in this paper is similar to the traditional linear models like finite impulse response (FIR) and autoregressive exogenous (ARX) except that the parameters vary as a function of some external variables. The parameter variation is modeled by local lines, using kernels for local linear regression. As such, the method might be referred to as a nearest neighbor method. The results achieved in this study were compared to results from the conventional linear methods, FIR and ARX. The increase in the coefficient of determination is substantial. Furthermore, the new approach conserves the mass balance better. Hence this new approach looks promising for various hydrological models and analyses.
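A minimal sketch of the conditional parametric idea follows: the coefficients of a FIR-type input-output model are estimated locally in an external variable z using kernel weights. The paper models the parameter variation with local lines; for brevity the sketch uses a locally constant (kernel-weighted) fit, and all signals, names, and bandwidths are synthetic.

```python
import numpy as np

def conditional_fir(u, y, z, z0, lags=3, bandwidth=0.2):
    """Estimate FIR coefficients y_t ~ b0 + b1*u_{t-1} + ... + b_lags*u_{t-lags}
    locally around z = z0, weighting observations by a Gaussian kernel in the
    external variable z (a locally constant variant of conditional parametric
    modeling)."""
    rows = [u[t - lags:t][::-1] for t in range(lags, len(y))]
    X = np.column_stack([np.ones(len(rows)), np.array(rows)])
    yv = y[lags:]
    w = np.exp(-0.5 * ((z[lags:] - z0) / bandwidth) ** 2)    # kernel weights
    W = np.diag(w)
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ yv)

# toy rainfall-runoff: the response to input u depends on a "wetness" variable z
rng = np.random.default_rng(2)
T = 500
u = rng.exponential(1.0, T)                             # precipitation-like input
z = 0.5 + 0.5 * np.sin(np.linspace(0, 6 * np.pi, T))    # slowly varying wetness proxy
y = np.zeros(T)
for t in range(3, T):
    gain = 0.2 + 0.8 * z[t]                             # wetter -> stronger response
    y[t] = gain * (0.5 * u[t - 1] + 0.3 * u[t - 2] + 0.2 * u[t - 3]) \
           + 0.05 * rng.standard_normal()

print(conditional_fir(u, y, z, z0=0.2))     # coefficients under dry conditions
print(conditional_fir(u, y, z, z0=0.8))     # coefficients under wet conditions
```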
NASA Astrophysics Data System (ADS)
Lisenko, S. A.; Kugeiko, M. M.
2013-01-01
The ability to determine noninvasively microphysical parameters (MPPs) of skin characteristic of malignant melanoma was demonstrated. The MPPs were the melanin content in dermis, saturation of tissue with blood vessels, and concentration and effective size of tissue scatterers. The proposed method was based on spatially resolved spectral measurements of skin diffuse reflectance and multiple regressions between linearly independent measurement components and skin MPPs. The regressions were established by modeling radiation transfer in skin with a wide variation of its MPPs. Errors in the determination of skin MPPs were estimated using fiber-optic measurements of its diffuse reflectance at wavelengths of commercially available semiconductor diode lasers (578, 625, 660, 760, and 806 nm) at source-detector separations of 0.23-1.38 mm.
Probabilistic accounting of uncertainty in forecasts of species distributions under climate change
Seth J. Wenger; Nicholas A. Som; Daniel C. Dauwalter; Daniel J. Isaak; Helen M. Neville; Charles H. Luce; Jason B. Dunham; Michael K. Young; Kurt D. Fausch; Bruce E. Rieman
2013-01-01
Forecasts of species distributions under future climates are inherently uncertain, but there have been few attempts to describe this uncertainty comprehensively in a probabilistic manner. We developed a Monte Carlo approach that accounts for uncertainty within generalized linear regression models (parameter uncertainty and residual error), uncertainty among competing...
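A hedged sketch of the general Monte Carlo recipe described: sample coefficient vectors from the fitted model's estimated sampling distribution, push a future-climate scenario through each draw, and add residual (here Bernoulli) error. The logistic model, covariate, and warming scenario below are invented stand-ins, not the authors' species models.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)

# toy presence/absence data driven by a single temperature covariate
n = 300
temp = rng.normal(10.0, 3.0, n)
p_true = 1.0 / (1.0 + np.exp(-(2.0 - 0.4 * temp)))
occ = rng.binomial(1, p_true)

X = sm.add_constant(temp)
fit = sm.GLM(occ, X, family=sm.families.Binomial()).fit()

# forecast scenario: +2 degrees of warming at the same sites
Xf = sm.add_constant(temp + 2.0)

# Monte Carlo: draw coefficient vectors from their estimated sampling
# distribution, then add residual (Bernoulli) error on top of each forecast
n_draws = 1000
beta_draws = rng.multivariate_normal(fit.params, fit.cov_params(), size=n_draws)
probs = 1.0 / (1.0 + np.exp(-(Xf @ beta_draws.T)))   # (n_sites, n_draws)
occupancy_draws = rng.binomial(1, probs)             # residual error

prevalence = occupancy_draws.mean(axis=0)            # one prevalence value per draw
print("mean forecast prevalence:", prevalence.mean())
print("90% interval:", np.percentile(prevalence, [5, 95]))
```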
Developing an Adequately Specified Model of State Level Student Achievement with Multilevel Data.
ERIC Educational Resources Information Center
Bernstein, Lawrence
Limitations of using linear, unilevel regression procedures in modeling student achievement are discussed. This study is a part of a broader study that is developing an empirically-based predictive model of variables associated with academic achievement from a multilevel perspective and examining the differences by which parameters are estimated…
Quantifying and Reducing Curve-Fitting Uncertainty in Isc
DOE Office of Scientific and Technical Information (OSTI.GOV)
Campanelli, Mark; Duck, Benjamin; Emery, Keith
2015-06-14
Current-voltage (I-V) curve measurements of photovoltaic (PV) devices are used to determine performance parameters and to establish traceable calibration chains. Measurement standards specify localized curve fitting methods, e.g., straight-line interpolation/extrapolation of the I-V curve points near short-circuit current, Isc. By considering such fits as statistical linear regressions, uncertainties in the performance parameters are readily quantified. However, the legitimacy of such a computed uncertainty requires that the model be a valid (local) representation of the I-V curve and that the noise be sufficiently well characterized. Using more data points often has the advantage of lowering the uncertainty. However, more data points can make the uncertainty in the fit arbitrarily small, and this fit uncertainty misses the dominant residual uncertainty due to so-called model discrepancy. Using objective Bayesian linear regression for straight-line fits for Isc, we investigate an evidence-based method to automatically choose data windows of I-V points with reduced model discrepancy. We also investigate noise effects. Uncertainties, aligned with the Guide to the Expression of Uncertainty in Measurement (GUM), are quantified throughout.
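The paper's contribution is objective Bayesian regression with evidence-based window selection; the sketch below shows only the underlying step of fitting a straight line to I-V points near short circuit and reading off the Isc intercept and its fit uncertainty for different window widths. The I-V curve is synthetic and the window limits are arbitrary.

```python
import numpy as np

def isc_from_window(v, i, v_max):
    """Straight-line fit of I-V points with V <= v_max; returns the current-axis
    intercept (the Isc estimate) and its standard uncertainty from the fit."""
    mask = v <= v_max
    X = np.column_stack([np.ones(mask.sum()), v[mask]])
    beta, res, *_ = np.linalg.lstsq(X, i[mask], rcond=None)
    sigma2 = res[0] / (mask.sum() - 2) if res.size else 0.0
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[0], np.sqrt(cov[0, 0])

# synthetic I-V curve: nearly linear near V = 0, curving away at higher voltage
rng = np.random.default_rng(4)
v = np.linspace(0.0, 0.6, 120)
i = 5.0 - 0.3 * v - 2.0 * v ** 4 + 0.01 * rng.standard_normal(v.size)

for v_max in (0.05, 0.15, 0.40):            # wider windows shrink the fit error but
    isc, u = isc_from_window(v, i, v_max)   # grow the model-discrepancy bias
    print(f"window <= {v_max:.2f} V: Isc = {isc:.4f} +/- {u:.4f} A")
```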
Microbial Transformation of Esters of Chlorinated Carboxylic Acids
Paris, D. F.; Wolfe, N. L.; Steen, W. C.
1984-01-01
Two groups of compounds were selected for microbial transformation studies. In the first group were carboxylic acid esters having a fixed aromatic moiety and an increasing length of the alkyl component. Ethyl esters of chlorine-substituted carboxylic acids were in the second group. Microorganisms from environmental waters and a pure culture of Pseudomonas putida U were used. The bacterial populations were monitored by plate counts, and disappearance of the parent compound was followed by gas-liquid chromatography as a function of time. The products of microbial hydrolysis were the respective carboxylic acids. Octanol-water partition coefficients (Kow) for the compounds were measured. These values spanned three orders of magnitude, whereas microbial transformation rate constants (kb) varied only 50-fold. The microbial rate constants of the carboxylic acid esters with a fixed aromatic moiety increased with an increasing length of alkyl substituents. The regression coefficient for the linear relationships between log kb and log Kow was high for group 1 compounds, indicating that these parameters correlated well. The regression coefficient for the linear relationships for group 2 compounds, however, was low, indicating that these parameters correlated poorly. PMID:16346459
Groen, Harald C.; Niessen, Wiro J.; Bernsen, Monique R.; de Jong, Marion; Veenland, Jifke F.
2013-01-01
Although efficient delivery and distribution of treatment agents over the whole tumor is essential for successful tumor treatment, the distribution of most of these agents cannot be visualized. However, with single-photon emission computed tomography (SPECT), both delivery and uptake of radiolabeled peptides can be visualized in a neuroendocrine tumor model overexpressing somatostatin receptors. A heterogeneous peptide uptake is often observed in these tumors. We hypothesized that peptide distribution in the tumor is spatially related to tumor perfusion, vessel density and permeability, as imaged and quantified by DCE-MRI in a neuroendocrine tumor model. Four subcutaneous CA20948 tumor-bearing Lewis rats were injected with the somatostatin-analog 111In-DTPA-Octreotide (50 MBq). SPECT-CT and MRI scans were acquired and MRI was spatially registered to SPECT-CT. DCE-MRI was analyzed using semi-quantitative and quantitative methods. Correlation between SPECT and DCE-MRI was investigated with 1) Spearman’s rank correlation coefficient; 2) SPECT uptake values grouped into deciles with corresponding median DCE-MRI parametric values and vice versa; and 3) linear regression analysis for median parameter values in combined datasets. In all tumors, areas with low peptide uptake correlated with low perfusion/density/permeability for all DCE-MRI-derived parameters. Combining all datasets, the strongest linear regression was found between peptide uptake and semi-quantitative parameters (R² > 0.7). The average correlation coefficient between SPECT and DCE-MRI-derived parameters ranged from 0.52 to 0.56 (p < 0.05) for parameters primarily associated with exchange between blood and extracellular extravascular space. For these parameters a linear relation with peptide uptake was observed. In conclusion, the ‘exchange-related’ DCE-MRI-derived parameters seemed to predict peptide uptake better than the ‘contrast amount-related’ parameters. Consequently, fast and efficient diffusion through the vessel wall into tissue is an important factor for peptide delivery. DCE-MRI helps to elucidate the relation between vascular characteristics, peptide delivery and treatment efficacy, and may form a basis to predict targeting efficiency. PMID:24116203
Modeling thermal sensation in a Mediterranean climate—a comparison of linear and ordinal models
NASA Astrophysics Data System (ADS)
Pantavou, Katerina; Lykoudis, Spyridon
2014-08-01
A simple thermo-physiological model of outdoor thermal sensation adjusted with psychological factors is developed aiming to predict thermal sensation in Mediterranean climates. Microclimatic measurements simultaneously with interviews on personal and psychological conditions were carried out in a square, a street canyon and a coastal location of the greater urban area of Athens, Greece. Multiple linear and ordinal regression were applied in order to estimate thermal sensation making allowance for all the recorded parameters or specific, empirically selected, subsets producing so-called extensive and empirical models, respectively. Meteorological, thermo-physiological and overall models - considering psychological factors as well - were developed. Predictions were improved when personal and psychological factors were taken into account as compared to meteorological models. The model based on ordinal regression reproduced extreme values of thermal sensation vote more adequately than the linear regression one, while the empirical model produced satisfactory results in relation to the extensive model. The effects of adaptation and expectation on thermal sensation vote were introduced in the models by means of the exposure time, season and preference related to air temperature and irradiation. The assessment of thermal sensation could be a useful criterion in decision making regarding public health, outdoor spaces planning and tourism.
Mansilha, C; Melo, A; Rebelo, H; Ferreira, I M P L V O; Pinho, O; Domingues, V; Pinho, C; Gameiro, P
2010-10-22
A multi-residue methodology based on a solid phase extraction followed by gas chromatography-tandem mass spectrometry was developed for trace analysis of 32 compounds in water matrices, including estrogens and several pesticides from different chemical families, some of them with endocrine disrupting properties. Matrix standard calibration solutions were prepared by adding known amounts of the analytes to a residue-free sample to compensate matrix-induced chromatographic response enhancement observed for certain pesticides. Validation was done mainly according to the International Conference on Harmonisation recommendations, as well as some European and American validation guidelines with specifications for pesticides analysis and/or GC-MS methodology. As the assumption of homoscedasticity was not met for analytical data, weighted least squares linear regression procedure was applied as a simple and effective way to counteract the greater influence of the greater concentrations on the fitted regression line, improving accuracy at the lower end of the calibration curve. The method was considered validated for 31 compounds after consistent evaluation of the key analytical parameters: specificity, linearity, limit of detection and quantification, range, precision, accuracy, extraction efficiency, stability and robustness. Copyright © 2010 Elsevier B.V. All rights reserved.
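As an illustration of why weighted least squares helps with heteroscedastic calibration data, the sketch below compares OLS and WLS fits with 1/x² weights, a common weighting choice; the concentrations, responses, and units are invented, and the actual weighting scheme in the study may differ.

```python
import numpy as np
import statsmodels.api as sm

# heteroscedastic calibration data: the noise grows with concentration
rng = np.random.default_rng(5)
conc = np.array([1, 2, 5, 10, 20, 50, 100, 200], dtype=float)   # ng/L, illustrative
resp = 35.0 * conc * (1.0 + 0.05 * rng.standard_normal(conc.size))

X = sm.add_constant(conc)
ols = sm.OLS(resp, X).fit()
wls = sm.WLS(resp, X, weights=1.0 / conc ** 2).fit()   # 1/x^2 down-weights high levels

for name, fit in (("OLS", ols), ("WLS", wls)):
    a, b = fit.params
    c_back = (resp[0] - a) / b        # back-calculated lowest standard
    print(f"{name}: intercept = {a:.2f}, slope = {b:.3f}, "
          f"low-level recovery = {100 * c_back / conc[0]:.1f}%")
```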
Bertrand-Krajewski, J L
2004-01-01
In order to replace traditional sampling and analysis techniques, turbidimeters can be used to estimate TSS concentration in sewers, by means of sensor- and site-specific empirical equations established by linear regression of on-site turbidity values T with TSS concentrations C measured in corresponding samples. As the ordinary least-squares method is not able to account for measurement uncertainties in both T and C variables, an appropriate regression method is used to solve this difficulty and to evaluate correctly the uncertainty in TSS concentrations estimated from measured turbidity. The regression method is described, including detailed calculations of variances and covariance in the regression parameters. An example application is given for a calibrated turbidimeter used in a combined sewer system, with data collected during three dry weather days. In order to show how the established regression could be used, an independent 24-hour dry weather turbidity data series recorded at a 2 min time interval is used, transformed into estimated TSS concentrations, and compared to TSS concentrations measured in samples. The comparison appears satisfactory and suggests that turbidity measurements could replace traditional samples. Further developments, including wet weather periods and other types of sensors, are suggested.
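The paper derives its own regression with uncertainty in both turbidity and TSS; as a generic stand-in, the sketch below uses Deming regression (errors in both variables with an assumed error-variance ratio), which is one standard way to handle the same difficulty. Data and error levels are invented, not the sewer measurements of the study.

```python
import numpy as np

def deming_fit(x, y, delta=1.0):
    """Deming regression: a straight line when both variables carry measurement
    error; delta is the assumed ratio of y-error variance to x-error variance."""
    xb, yb = x.mean(), y.mean()
    sxx = np.sum((x - xb) ** 2)
    syy = np.sum((y - yb) ** 2)
    sxy = np.sum((x - xb) * (y - yb))
    slope = (syy - delta * sxx +
             np.sqrt((syy - delta * sxx) ** 2 + 4 * delta * sxy ** 2)) / (2 * sxy)
    return yb - slope * xb, slope          # intercept, slope

# illustrative turbidity (NTU) vs TSS (mg/L) pairs with error in both variables
rng = np.random.default_rng(6)
true_turb = np.linspace(20, 300, 30)
T = true_turb + rng.normal(0, 8, true_turb.size)             # turbidity sensor error
C = 1.1 * true_turb + 5 + rng.normal(0, 12, true_turb.size)  # sampling/analysis error
a, b = deming_fit(T, C, delta=(12.0 / 8.0) ** 2)
print(f"C = {a:.1f} + {b:.3f} * T")

# an independent turbidity record can then be converted to TSS estimates
print(a + b * np.array([50.0, 120.0, 250.0]))
```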
Wu, Baolin
2006-02-15
Differential gene expression detection and sample classification using microarray data have received much research interest recently. Owing to the large number of genes p and small number of samples n (p > n), microarray data analysis poses big challenges for statistical analysis. An obvious problem owing to the 'large p small n' is over-fitting. Just by chance, we are likely to find some non-differentially expressed genes that can classify the samples very well. The idea of shrinkage is to regularize the model parameters to reduce the effects of noise and produce reliable inferences. Shrinkage has been successfully applied in the microarray data analysis. The SAM statistics proposed by Tusher et al. and the 'nearest shrunken centroid' proposed by Tibshirani et al. are ad hoc shrinkage methods. Both methods are simple, intuitive and prove to be useful in empirical studies. Recently Wu proposed the penalized t/F-statistics with shrinkage by formally using the L1-penalized linear regression models for two-class microarray data, showing good performance. In this paper we systematically discussed the use of penalized regression models for analyzing microarray data. We generalize the two-class penalized t/F-statistics proposed by Wu to multi-class microarray data. We formally derive the ad hoc shrunken centroid used by Tibshirani et al. using the L1-penalized regression models. And we show that the penalized linear regression models provide a rigorous and unified statistical framework for sample classification and differential gene expression detection.
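To make the shrinkage idea concrete, here is a generic L1-penalized (lasso-type) logistic regression on a toy "large p, small n" expression matrix, showing how the penalty zeroes out most genes; this is not the penalized t/F-statistic of the paper, and the dimensions and penalty strength are arbitrary.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# toy "large p, small n" expression matrix: 1000 genes, 40 samples,
# with only the first 10 genes truly differing between the two classes
rng = np.random.default_rng(7)
n, p, k = 40, 1000, 10
labels = np.repeat([0, 1], n // 2)
X = rng.standard_normal((n, p))
X[labels == 1, :k] += 1.5

# the L1 penalty shrinks most coefficients exactly to zero
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
lasso.fit(X, labels)
selected = np.flatnonzero(lasso.coef_[0])
print("genes kept by the L1 penalty:", selected)
print("true differential genes recovered:", np.intersect1d(selected, np.arange(k)))
```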
Gap-filling methods to impute eddy covariance flux data by preserving variance.
NASA Astrophysics Data System (ADS)
Kunwor, S.; Staudhammer, C. L.; Starr, G.; Loescher, H. W.
2015-12-01
To represent carbon dynamics, in terms of exchange of CO2 between the terrestrial ecosystem and the atmosphere, eddy covariance (EC) data has been collected using eddy flux towers from various sites across the globe for more than two decades. However, measurements from EC data are missing for various reasons: precipitation, routine maintenance, or lack of vertical turbulence. In order to have estimates of net ecosystem exchange of carbon dioxide (NEE) with high precision and accuracy, robust gap-filling methods to impute missing data are required. While the methods used so far have provided robust estimates of the mean value of NEE, little attention has been paid to preserving the variance structures embodied by the flux data. Preserving the variance of these data will provide unbiased and precise estimates of NEE over time, which mimic natural fluctuations. We used a non-linear regression approach with moving windows of different lengths (15, 30, and 60 days) to estimate non-linear regression parameters for one year of flux data from a long-leaf pine site at the Joseph Jones Ecological Research Center, using the Michaelis-Menten and Van't Hoff functions as our base. We assessed the potential physiological drivers of these parameters with linear models using micrometeorological predictors, and then used a parameter prediction approach to refine the non-linear gap-filling equations based on micrometeorological conditions. This provides an opportunity to incorporate additional variables, such as vapor pressure deficit (VPD) and volumetric water content (VWC), into the equations. Our preliminary results indicate that improvements in gap-filling can be gained with a 30-day moving window with additional micrometeorological predictors (as indicated by a lower root mean square error (RMSE) of the predicted values of NEE). Our next steps are to use these parameter predictions from moving windows to gap-fill the data with and without incorporation of potential driver variables of the parameters traditionally used. Comparisons of the predicted values from these methods and 'traditional' gap-filling methods (using 12 fixed monthly windows) will then be assessed to show the extent to which variance is preserved. Further, this method will be applied to impute artificially created gaps to analyze whether variance is preserved.
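A minimal sketch of the moving-window, parameter-based gap-filling idea: fit a Michaelis-Menten light-response curve to the available NEE in each window and impute the gaps from the fitted parameters. The Van't Hoff respiration term and the micrometeorological parameter-prediction step are omitted, and all data, window lengths, and parameter values are synthetic.

```python
import numpy as np
from scipy.optimize import curve_fit

def mm_light_response(par, alpha, pmax, reco):
    """Michaelis-Menten (rectangular hyperbola) light response for daytime NEE."""
    return -(alpha * par * pmax) / (alpha * par + pmax) + reco

# synthetic hourly daytime observations for 90 days, with 20% artificial gaps
rng = np.random.default_rng(8)
par = rng.uniform(50, 2000, 90 * 24)
nee = mm_light_response(par, 0.03, 25.0, 5.0) + rng.normal(0, 1.5, par.size)
gaps = rng.random(nee.size) < 0.2
nee_gappy = np.where(gaps, np.nan, nee)

window = 30 * 24                               # 30-day moving window
filled = nee_gappy.copy()
for start in range(0, nee.size, window):
    sl = slice(start, start + window)
    ok = ~np.isnan(nee_gappy[sl])
    popt, _ = curve_fit(mm_light_response, par[sl][ok], nee_gappy[sl][ok],
                        p0=(0.02, 20.0, 3.0), maxfev=10000)
    miss = np.isnan(nee_gappy[sl])
    filled[sl][miss] = mm_light_response(par[sl][miss], *popt)   # impute the gaps

print("gap-fill RMSE:", np.sqrt(np.mean((filled[gaps] - nee[gaps]) ** 2)))
```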
Coronado-Zarco, Roberto; Diez-García, María del Pilar; Chávez-Arias, Daniel; León-Hernández, Saúl Renán; Cruz-Medina, Eva; Arellano-Hernández, Aurelia
2005-01-01
Bone and skeletal muscle mass loss is related to age. Mechanisms by which they interact have not been well established. To establish a relationship of age with serum levels of IGF-1, skeletal muscle and appendicular muscle mass index, and their influence in isokinetic parameters in osteoporotic female patients. Pearson correlation coefficient and linear regression analyses were used. There were 38 patients with a mean age of 65.16 years (range: 50-84 years), mean appendicular skeletal mass index (ASMI) of 6.3 kg/m2 (range: 4.3-8.3) and mean skeletal mass index (SMI) of 12.4 kg/m2 (range: 9.6-15.7), mean serum IGF-1 levels of 82.97 ng/ml (range: 22-177). Linear regression predicted hip mineral bone density by SMI (p = 0.19) and age (p = 0.017, r = 0.50). Some isokinetic parameters had a positive correlation for work with age. Knee acceleration time had a positive correlation with age. Osteoporosis and sarcopenia may have related pathophysiologic mechanisms. Growth factor study must include the influence of sex hormones. Some isokinetic parameters are determined by the predominant muscle fiber, skeletal mass index and age.
Advanced statistics: linear regression, part I: simple linear regression.
Marill, Keith A
2004-01-01
Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.
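A compact numerical companion to the abstract: the least-squares slope and intercept in closed form, plus R², on an invented small dataset.

```python
import numpy as np

# invented example: dose (mg) vs. measured response
x = np.array([2, 4, 6, 8, 10, 12], dtype=float)
y = np.array([5.1, 9.2, 12.8, 17.1, 20.6, 25.3])

# method of least squares: slope and intercept in closed form
slope = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
intercept = y.mean() - slope * x.mean()

# R^2 = fraction of the outcome variance explained by the fitted line
resid = y - (intercept + slope * x)
r2 = 1 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)
print(f"y = {intercept:.2f} + {slope:.2f} x,  R^2 = {r2:.3f}")
```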
Temperature profile retrievals with extended Kalman-Bucy filters
NASA Technical Reports Server (NTRS)
Ledsham, W. H.; Staelin, D. H.
1979-01-01
The Extended Kalman-Bucy Filter is a powerful technique for estimating non-stationary random parameters in situations where the received signal is a noisy non-linear function of those parameters. A practical causal filter for retrieving atmospheric temperature profiles from radiances observed at a single scan angle by the Scanning Microwave Spectrometer (SCAMS) carried on the Nimbus 6 satellite typically shows approximately a 10-30% reduction in rms error about the mean at almost all levels below 70 mb when compared with a regression inversion.
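The SCAMS retrieval machinery is far more elaborate; the sketch below only shows the generic extended Kalman filter predict/update cycle for a scalar state observed through a nonlinear measurement, to indicate the kind of causal filtering involved. The measurement function, noise levels, and data are invented.

```python
import numpy as np

def ekf_step(x_est, P, y_obs, h, h_jac, Q, R):
    """One predict/update cycle of a scalar extended Kalman filter with a
    random-walk state and a nonlinear measurement y = h(x) + noise."""
    x_pred, P_pred = x_est, P + Q            # predict: carry state, inflate variance
    H = h_jac(x_pred)                        # linearize the measurement
    S = H * P_pred * H + R
    K = P_pred * H / S                       # Kalman gain
    x_new = x_pred + K * (y_obs - h(x_pred))
    P_new = (1.0 - K * H) * P_pred
    return x_new, P_new

# toy retrieval: a "radiance" that is a nonlinear function of "temperature"
h = lambda x: 0.002 * x ** 2
h_jac = lambda x: 0.004 * x
rng = np.random.default_rng(9)
true_T = 250.0 + np.cumsum(rng.normal(0, 0.5, 200))    # slowly drifting truth
obs = h(true_T) + rng.normal(0, 0.5, true_T.size)

x_est, P = 240.0, 25.0
for y in obs:
    x_est, P = ekf_step(x_est, P, y, h, h_jac, Q=0.25, R=0.25)
print("final estimate:", x_est, " truth:", true_T[-1])
```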
Modeling Pan Evaporation for Kuwait by Multiple Linear Regression
Almedeij, Jaber
2012-01-01
Evaporation is an important parameter for many projects related to hydrology and water resources systems. This paper constitutes the first study conducted in Kuwait to obtain empirical relations for the estimation of daily and monthly pan evaporation as functions of available meteorological data of temperature, relative humidity, and wind speed. The data used here for the modeling are daily measurements of substantial continuity coverage, within a period of 17 years between January 1993 and December 2009, which can be considered representative of the desert climate of the urban zone of the country. Multiple linear regression technique is used with a procedure of variable selection for fitting the best model forms. The correlations of evaporation with temperature and relative humidity are also transformed in order to linearize the existing curvilinear patterns of the data by using power and exponential functions, respectively. The evaporation models suggested with the best variable combinations were shown to produce results that are in a reasonable agreement with observation values. PMID:23226984
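An illustrative sketch of the modeling approach described: transform temperature with a power function and relative humidity with an exponential before multiple linear regression. The synthetic data and coefficients are placeholders, not the Kuwait model.

```python
import numpy as np
import statsmodels.api as sm

# synthetic daily meteorology with illustrative desert-climate ranges
rng = np.random.default_rng(10)
n = 365
temp = rng.uniform(15, 48, n)            # deg C
rh = rng.uniform(5, 70, n)               # %
wind = rng.uniform(1, 9, n)              # m/s
evap = (0.5 + 0.004 * temp ** 1.8 + 3.0 * np.exp(-0.02 * rh)
        + 0.15 * wind + rng.normal(0, 0.3, n))   # pan evaporation, mm/day

# linearize the curvilinear relations before multiple linear regression:
# a power function of temperature and an exponential function of humidity
X = np.column_stack([temp ** 1.8, np.exp(-0.02 * rh), wind])
fit = sm.OLS(evap, sm.add_constant(X)).fit()
print(fit.params)          # intercept and coefficients of the transformed predictors
print("R^2:", fit.rsquared)
```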
Missing-value estimation using linear and non-linear regression with Bayesian gene selection.
Zhou, Xiaobo; Wang, Xiaodong; Dougherty, Edward R
2003-11-22
Data from microarray experiments are usually in the form of large matrices of expression levels of genes under different experimental conditions. Owing to various reasons, there are frequently missing values. Estimating these missing values is important because they affect downstream analysis, such as clustering, classification and network design. Several methods of missing-value estimation are in use. The problem has two parts: (1) selection of genes for estimation and (2) design of an estimation rule. We propose Bayesian variable selection to obtain genes to be used for estimation, and employ both linear and nonlinear regression for the estimation rule itself. Fast implementation issues for these methods are discussed, including the use of QR decomposition for parameter estimation. The proposed methods are tested on data sets arising from hereditary breast cancer and small round blue-cell tumors. The results compare very favorably with currently used methods based on the normalized root-mean-square error. The appendix is available from http://gspsnap.tamu.edu/gspweb/zxb/missing_zxb/ (user: gspweb; passwd: gsplab).
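A minimal illustration of the QR-decomposition route to the least-squares coefficients mentioned for fast parameter estimation, applied to imputing one missing expression value from a few predictor genes; the Bayesian gene-selection step is not shown and all data are synthetic.

```python
import numpy as np

def qr_least_squares(X, y):
    """Least-squares coefficients via QR: X = QR, then solve R beta = Q^T y.
    More numerically stable than forming the normal equations X^T X directly."""
    Q, R = np.linalg.qr(X)                 # reduced QR factorization
    return np.linalg.solve(R, Q.T @ y)

# impute one missing expression value of a target gene from 5 predictor genes
rng = np.random.default_rng(11)
n_arrays, n_pred = 30, 5
G = rng.standard_normal((n_arrays, n_pred))       # predictor genes (complete)
beta_true = np.array([1.2, -0.7, 0.4, 0.0, 0.9])
target = G @ beta_true + 0.1 * rng.standard_normal(n_arrays)

train = np.arange(n_arrays) != 17                 # pretend array 17 is missing
X = np.column_stack([np.ones(train.sum()), G[train]])
beta = qr_least_squares(X, target[train])
estimate = beta[0] + G[17] @ beta[1:]
print(f"imputed {estimate:.3f} vs. actual {target[17]:.3f}")
```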
Dron, Julien; Dodi, Alain
2011-06-15
The removal of chloride, nitrate and sulfate ions from aqueous solutions by a macroporous resin is studied through the ion exchange systems OH⁻/Cl⁻, OH⁻/NO₃⁻, OH⁻/SO₄²⁻, and HCO₃⁻/Cl⁻, Cl⁻/NO₃⁻, Cl⁻/SO₄²⁻. They are investigated by means of Langmuir, Freundlich, Dubinin-Radushkevitch (D-R) and Dubinin-Astakhov (D-A) single-component adsorption isotherms. The sorption parameters and the fitting of the models are determined by nonlinear regression and discussed. The Langmuir model provides a fair estimation of the sorption capacity whatever the system under study, contrary to the Freundlich and D-R models. The adsorption energies deduced from the Dubinin and Langmuir isotherms are in good agreement, and the surface parameter of the D-A isotherm appears consistent. All models agree on the order of affinity OH⁻
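For concreteness, a nonlinear least-squares fit of the Langmuir isotherm is sketched below with scipy; the equilibrium data, units, and starting values are invented, not the resin measurements of the study.

```python
import numpy as np
from scipy.optimize import curve_fit

def langmuir(c_eq, q_max, k_l):
    """Langmuir isotherm: sorbed amount as a function of equilibrium concentration."""
    return q_max * k_l * c_eq / (1.0 + k_l * c_eq)

# invented equilibrium data (mmol/L vs. mmol/g), roughly Langmuir-shaped
c_eq = np.array([0.05, 0.1, 0.25, 0.5, 1.0, 2.0, 5.0, 10.0])
q_obs = np.array([0.19, 0.32, 0.68, 0.98, 1.35, 1.58, 1.84, 1.91])

popt, pcov = curve_fit(langmuir, c_eq, q_obs, p0=(2.0, 1.0))
q_max, k_l = popt
err = np.sqrt(np.diag(pcov))
print(f"q_max = {q_max:.2f} +/- {err[0]:.2f} mmol/g, "
      f"K_L = {k_l:.2f} +/- {err[1]:.2f} L/mmol")
```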
NASA Astrophysics Data System (ADS)
Musa, Rosliza; Ali, Zalila; Baharum, Adam; Nor, Norlida Mohd
2017-08-01
The linear regression model assumes that all random error components are identically and independently distributed with constant variance. Hence, each data point provides equally precise information about the deterministic part of the total variation. In other words, the standard deviations of the error terms are constant over all values of the predictor variables. When the assumption of constant variance is violated, the ordinary least squares estimator of the regression coefficients loses its property of minimum variance in the class of linear and unbiased estimators. Weighted least squares estimation is often used to maximize the efficiency of parameter estimation. A procedure that treats all of the data equally would give less precisely measured points more influence than they should have and would give highly precise points too little influence. Optimizing the weighted fitting criterion to find the parameter estimates allows the weights to determine the contribution of each observation to the final parameter estimates. This study used a polynomial model with weighted least squares estimation to investigate paddy production of different paddy lots based on paddy cultivation characteristics and environmental characteristics in the areas of Kedah and Perlis. The results indicated that the factors affecting paddy production are mixture fertilizer application cycle, average temperature, the squared effect of average rainfall, the squared effect of pest and disease, the interaction between acreage and amount of mixture fertilizer, the interaction between paddy variety and NPK fertilizer application cycle, and the interaction between pest and disease and NPK fertilizer application cycle.
2014-10-01
[Fragmentary figure captions from the source report: best-fit linear regressions of the parameters a) Hseg, b) QL, c) γ0, and d) Γb0 versus the Zr concentration x0 for the nanocrystalline Fe–Zr system; the scatter of the data points reflects variation in the other parameters at 1 h, and a white square marks the experimental data point used for fitting.]
Liao, Zhipeng; Yoda, Nobuhiro; Chen, Junning; Zheng, Keke; Sasaki, Keiichi; Swain, Michael V; Li, Qing
2017-04-01
This paper aimed to develop a clinically validated bone remodeling algorithm by integrating bone's dynamic properties in a multi-stage fashion based on a four-year clinical follow-up of implant treatment. The configurational effects of fixed partial dentures (FPDs) were explored using a multi-stage remodeling rule. Three-dimensional real-time occlusal loads during maximum voluntary clenching were measured with a piezoelectric force transducer and were incorporated into a computerized tomography-based finite element mandibular model. Virtual X-ray images were generated based on simulation and statistically correlated with clinical data using linear regressions. The strain energy density-driven remodeling parameters were regulated over the time frame considered. A linear single-stage bone remodeling algorithm, with a single set of constant remodeling parameters, was found to poorly fit with clinical data through linear regression (low R² and R), whereas a time-dependent multi-stage algorithm better simulated the remodeling process (high R² and R) against the clinical results. The three-implant-supported and distally cantilevered FPDs presented noticeable and continuous bone apposition, mainly adjacent to the cervical and apical regions. The bridged and mesially cantilevered FPDs showed bone resorption or no visible bone formation in some areas. Time-dependent variation of bone remodeling parameters is recommended to better correlate remodeling simulation with clinical follow-up. The position of FPD pontics plays a critical role in mechanobiological functionality and bone remodeling. Caution should be exercised when selecting the cantilever FPD due to the risk of overloading bone resorption.
Bhamidipati, Ravi Kanth; Syed, Muzeeb; Mullangi, Ramesh; Srinivas, Nuggehally
2018-02-01
1. Dalbavancin, a lipoglycopeptide, is approved for treating gram-positive bacterial infections. The area under the plasma concentration versus time curve (AUCinf) of dalbavancin is a key parameter, and the AUCinf/MIC ratio is a critical pharmacodynamic marker. 2. Using the end-of-intravenous-infusion concentration (i.e. Cmax), the Cmax versus AUCinf relationship for dalbavancin was established by regression analyses (linear, log-log, log-linear and power models) using 21 pairs of subject data. 3. Predictions of AUCinf were performed by applying the regression equations to published Cmax data. The quotient of observed/predicted values rendered the fold difference. The mean absolute error (MAE)/root mean square error (RMSE) and correlation coefficient (r) were used in the assessment. 4. MAE and RMSE values for the various models were comparable. The Cmax versus AUCinf relationship exhibited excellent correlation (r > 0.9488). The internal data evaluation showed narrow confinement (0.84-1.14-fold difference) with an RMSE < 10.3%. The external data evaluation showed that the models predicted AUCinf with an RMSE of 3.02-27.46%, with the fold difference largely contained within 0.64-1.48. 5. Regardless of the regression model, a single time point strategy of using Cmax (i.e. end of 30-min infusion) is amenable as a prospective tool for predicting AUCinf of dalbavancin in patients.
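A sketch of fitting the four regression forms named (linear, log-log, log-linear, power) to Cmax/AUCinf pairs and using each to predict AUCinf from a new Cmax; the 21 invented pairs below are placeholders for the subject data, and the units are illustrative.

```python
import numpy as np
from scipy.optimize import curve_fit

# invented Cmax (mg/L) and AUCinf (mg*h/L) pairs standing in for the subject data
rng = np.random.default_rng(12)
cmax = rng.uniform(150, 450, 21)
auc = 60.0 * cmax ** 0.95 * np.exp(rng.normal(0, 0.05, cmax.size))

# the four regression forms relating Cmax to AUCinf
b1, a1 = np.polyfit(cmax, auc, 1)                       # linear: AUC = a + b*Cmax
b2, a2 = np.polyfit(np.log(cmax), np.log(auc), 1)       # log-log
b3, a3 = np.polyfit(cmax, np.log(auc), 1)               # log-linear
(pa, pb), _ = curve_fit(lambda c, a, b: a * c ** b, cmax, auc, p0=(60.0, 1.0))  # power

new_cmax = 300.0
predictions = {
    "linear":     a1 + b1 * new_cmax,
    "log-log":    np.exp(a2) * new_cmax ** b2,
    "log-linear": np.exp(a3 + b3 * new_cmax),
    "power":      pa * new_cmax ** pb,
}
for name, pred in predictions.items():
    print(f"{name:10s} AUCinf prediction: {pred:.0f}")
```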
Using a Linear Regression Method to Detect Outliers in IRT Common Item Equating
ERIC Educational Resources Information Center
He, Yong; Cui, Zhongmin; Fang, Yu; Chen, Hanwei
2013-01-01
Common test items play an important role in equating alternate test forms under the common item nonequivalent groups design. When the item response theory (IRT) method is applied in equating, inconsistent item parameter estimates among common items can lead to large bias in equated scores. It is prudent to evaluate inconsistency in parameter…
Effect Size Measure and Analysis of Single Subject Designs
ERIC Educational Resources Information Center
Society for Research on Educational Effectiveness, 2013
2013-01-01
One of the vexing problems in the analysis of SSD is in the assessment of the effect of intervention. Serial dependence notwithstanding, the linear model approach that has been advanced involves, in general, the fitting of regression lines (or curves) to the set of observations within each phase of the design and comparing the parameters of these…
An algebraic method for constructing stable and consistent autoregressive filters
DOE Office of Scientific and Technical Information (OSTI.GOV)
Harlim, John, E-mail: jharlim@psu.edu; Department of Meteorology, the Pennsylvania State University, University Park, PA 16802; Hong, Hoon, E-mail: hong@ncsu.edu
2015-02-15
In this paper, we introduce an algebraic method to construct stable and consistent univariate autoregressive (AR) models of low order for filtering and predicting nonlinear turbulent signals with memory depth. By stable, we refer to the classical stability condition for the AR model. By consistent, we refer to the classical consistency constraints of Adams–Bashforth methods of order-two. One attractive feature of this algebraic method is that the model parameters can be obtained without directly knowing any training data set as opposed to many standard, regression-based parameterization methods. It takes only long-time average statistics as inputs. The proposed method provides a discretization time step interval which guarantees the existence of a stable and consistent AR model and simultaneously produces the parameters for the AR models. In our numerical examples with two chaotic time series with different characteristics of decaying time scales, we find that the proposed AR models produce significantly more accurate short-term predictive skill and comparable filtering skill relative to the linear regression-based AR models. These encouraging results are robust across wide ranges of discretization times, observation times, and observation noise variances. Finally, we also find that the proposed model produces an improved short-time prediction relative to the linear regression-based AR models in forecasting a data set that characterizes the variability of the Madden–Julian Oscillation, a dominant tropical atmospheric wave pattern.
Vandenhove, H; Van Hees, M; Wouters, K; Wannijn, J
2007-01-01
The present study aims to quantify the influence of soil parameters on soil solution uranium concentration for ²³⁸U-spiked soils. Eighteen soils collected under pasture were selected such that they covered a wide range for those parameters hypothesised as being potentially important in determining U sorption. Maximum soil solution uranium concentrations were observed at alkaline pH, high inorganic carbon content and low cation exchange capacity, organic matter content, clay content, amorphous Fe and phosphate levels. Except for the significant correlation between the solid-liquid distribution coefficients (Kd, L kg⁻¹) and the organic matter content (R² = 0.70) and amorphous Fe content (R² = 0.63), there was no single soil parameter significantly explaining the soil solution uranium concentration (which varied 100-fold). Above pH = 6, log(Kd) was linearly related with pH [log(Kd) = -1.18 pH + 10.8, R² = 0.65]. Multiple linear regression analysis did result in improved predictions of the soil solution uranium concentration but the model was complex.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Blais, AR; Dekaban, M; Lee, T-Y
2014-08-15
Quantitative analysis of dynamic positron emission tomography (PET) data usually involves minimizing a cost function with nonlinear regression, wherein the choice of starting parameter values and the presence of local minima affect the bias and variability of the estimated kinetic parameters. These nonlinear methods can also require lengthy computation time, making them unsuitable for use in clinical settings. Kinetic modeling of PET aims to estimate the rate parameter k₃, which is the binding affinity of the tracer to a biological process of interest and is highly susceptible to noise inherent in PET image acquisition. We have developed linearized kinetic models for kinetic analysis of dynamic contrast enhanced computed tomography (DCE-CT)/PET imaging, including a 2-compartment model for DCE-CT and a 3-compartment model for PET. Use of kinetic parameters estimated from DCE-CT can stabilize the kinetic analysis of dynamic PET data, allowing for more robust estimation of k₃. Furthermore, these linearized models are solved with a non-negative least squares algorithm and together they provide other advantages, including: 1) there is only one possible solution and they do not require a choice of starting parameter values, 2) parameter estimates are comparable in accuracy to those from nonlinear models, 3) computational time is significantly reduced. Our simulated data show that when blood volume and permeability are estimated with DCE-CT, the bias of k₃ estimation with our linearized model is 1.97 ± 38.5% for 1,000 runs with a signal-to-noise ratio of 10. In summary, we have developed a computationally efficient technique for accurate estimation of k₃ from noisy dynamic PET data.
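The authors' linearized DCE-CT/PET models are not reproduced here; the sketch below only illustrates the general pattern of casting a linearized (integral-form) compartment model as a linear system and solving it with a non-negative least-squares algorithm. A one-tissue model with an invented input function stands in for the paper's two- and three-compartment formulations.

```python
import numpy as np
from scipy.integrate import cumulative_trapezoid
from scipy.optimize import nnls

# synthetic one-tissue-compartment kinetics (a stand-in for the paper's models):
# dCt/dt = K1*Cp - k2*Ct, simulated by simple Euler integration
t = np.linspace(0.0, 60.0, 241)                  # minutes
cp = 10.0 * t * np.exp(-t / 3.0)                 # arterial input function
K1_true, k2_true = 0.25, 0.12
ct = np.zeros_like(t)
dt = t[1] - t[0]
for j in range(1, t.size):
    ct[j] = ct[j - 1] + dt * (K1_true * cp[j - 1] - k2_true * ct[j - 1])
rng = np.random.default_rng(13)
ct_noisy = ct + rng.normal(0, 0.3, ct.size)

# linearized (integral) form: Ct(t) = K1 * int(Cp) - k2 * int(Ct)
A = np.column_stack([cumulative_trapezoid(cp, t, initial=0),
                     -cumulative_trapezoid(ct_noisy, t, initial=0)])
params, _ = nnls(A, ct_noisy)                    # non-negative least squares
print("estimated K1, k2:", params, " true:", (K1_true, k2_true))
```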
Katkov, Igor I
2008-10-01
Some aspects of proper linearization of the Boyle-van't Hoff (BVH) relationship for calculation of the osmotically inactive volume v(b), and of the Arrhenius plot (AP) for the activation energy E(a), are discussed. It is shown that the commonly used determination of the slope and the intercept (v(b)), which are presumed to be independent from each other, is invalid if the initial intracellular molality m(0) is known. Instead, the linear regression with only one independent parameter (v(b)) or the Least Square Method (LSM) with v(b) as the only fitting LSM parameter must be applied. The slope can then be calculated from the BVH relationship as a function of v(b). In the case of unknown m(0) (for example, if cells are preloaded with trehalose, or electroporation caused ion leakage, etc.), it is considered as the second independent statistical parameter to be found. In this (and only this) scenario, all three methods give the same results for v(b) and m(0). The AP can be linearized only for water hydraulic conductivity (L(p)) and solute mobility (ω(s)), while the water and solute permeabilities P(w) ≡ L(p)RT and P(s) ≡ ω(s)RT cannot be linearized because they have a pre-exponential factor (RT) that depends on the temperature T.
Le Huec, Jean Charles; Hasegawa, Kazuhiro
2016-11-01
Sagittal balance analysis has gained importance and the measure of the radiographic spinopelvic parameters is now a routine part of many interventions of spine surgery. Indeed, surgical correction of lumbar lordosis must be proportional to the pelvic incidence (PI). The compensatory mechanisms [pelvic retroversion with increased pelvic tilt (PT) and decreased thoracic kyphosis] spontaneously reverse after successful surgery. This study is the first to provide 3D standing spinopelvic reference values from a large database of Caucasian (n = 137) and Japanese (n = 131) asymptomatic subjects. The key spinopelvic parameters [e.g., PI, PT, sacral slope (SS)] were comparable in Japanese and Caucasian populations. Three equations, namely lumbar lordosis based on PI, PT based on PI and SS based on PI, were calculated after linear regression modeling and were comparable in both populations: lumbar lordosis (L1-S1) = 0.54*PI + 27.6, PT = 0.44*PI - 11.4 and SS = 0.54*PI + 11.90. We showed that the key spinopelvic parameters obtained from a large database of healthy subjects were comparable for Caucasian and Japanese populations. The normative values provided in this study and the equations obtained after linear regression modeling could help to estimate the required lumbar lordosis restoration pre-operatively and could also be used as guidelines for spinopelvic sagittal balance.
A reliable and cost effective approach for radiographic monitoring in nutritional rickets
Gupta, V; Sharma, V; Sinha, B; Samanta, S
2014-01-01
Objective: Radiological scoring is particularly useful in rickets, where pre-treatment radiographical findings can reflect the disease severity and can be used to monitor the improvement. However, there is only a single radiographic scoring system for rickets developed by Thacher and, to the best of our knowledge, no study has evaluated radiographic changes in rickets based on this scoring system apart from the one done by Thacher himself. The main objective of this study is to compare and analyse the pre-treatment and post-treatment radiographic parameters in nutritional rickets with the help of Thacher's scoring technique. Methods: 176 patients with nutritional rickets were given a single intramuscular injection of vitamin D (600 000 IU) along with oral calcium (50 mg kg−1) and vitamin D (400 IU per day) until radiological resolution and followed for 1 year. Pre- and post-treatment radiological parameters were compared and analysed statistically based on Thacher's scoring system. Results: Radiological resolution was complete by 6 months. Time for radiological resolution and initial radiological score were linearly associated on regression analysis. The distal ulna was the last to heal in most cases except when the initial score was 10, when distal femur was the last to heal. Conclusion: Thacher's scoring system can effectively monitor nutritional rickets. The formula derived through linear regression has prognostic significance. Advances in knowledge: The distal femur is a better indicator in radiologically severe rickets and when resolution is delayed. Thacher's scoring is very useful for monitoring of rickets. The formula derived through linear regression can predict the expected time for radiological resolution. PMID:24593231
Ghoreishi, Mohammad; Abdi-Shahshahani, Mehdi; Peyman, Alireza; Pourazizi, Mohsen
2018-02-21
The aim of this study was to determine the correlation between ocular biometric parameters and sulcus-to-sulcus (STS) diameter. This was a cross-sectional study of preoperative ocular biometry data of patients who were candidates for phakic intraocular lens (IOL) surgery. Subjects underwent ocular biometry analysis, including refraction error evaluation using an autorefractor and Orbscan topography for white-to-white (WTW) corneal diameter measurement. Pentacam was used to measure WTW corneal diameter and minimum and maximum keratometry (K). Measurements of STS and angle-to-angle (ATA) were obtained using a 50-MHz B-mode ultrasound device. Anterior optical coherence tomography was performed for anterior chamber depth measurement. Pearson's correlation test and stepwise linear regression analysis were used to find a model to predict STS. Fifty-eight eyes of 58 patients were enrolled. The mean age ± standard deviation of the sample was 28.95 ± 6.04 years. The Pearson's correlation coefficient between STS and WTW, ATA, and mean K was 0.383, 0.492, and -0.353, respectively, all statistically significant (P < 0.001). Using stepwise linear regression analysis, there is a statistically significant association between STS and WTW (P = 0.011) and mean K (P = 0.025). The standardized coefficient was 0.323 and -0.284 for WTW and mean K, respectively. The stepwise linear regression equation was: STS = 9.549 + 0.518 WTW - 0.083 mean K. Based on our results, given the correlation of STS with WTW and mean K and the potential for direct and easy measurement of WTW and mean K, it seems that STS for current IOL sizing protocols could be estimated from WTW and mean K.
Chomchai, Chulathida; Na Manorom, Natawadee; Watanarungsan, Pornchai; Yossuck, Panitan; Chomchai, Summon
2004-03-01
To ascertain the impact of intrauterine methamphetamine exposure on the overall health of newborn infants at Siriraj Hospital, Bangkok, Thailand, birth records of somatic growth parameters and neonatal withdrawal symptoms of 47 infants born to methamphetamine-abusing women during January 2001 to December 2001 were compared to those of 49 newborns whose mothers did not use methamphetamines during pregnancy. The data on somatic growth were analyzed using linear regression and multiple linear regression. The association between methamphetamine use and withdrawal symptoms was analyzed using the chi-square test. Home visitation and maternal interview records were reviewed in order to assess child-rearing attitude and psychosocial parameters. Infants of methamphetamine-abusing mothers were found to have significantly smaller gestational age-adjusted head circumference (regression coefficient = -1.458, p < 0.001) and birth weight (regression coefficient = -217.9, p ≤ 0.001) measurements. Methamphetamine exposure was also associated with symptoms of agitation (5/47), vomiting (11/47) and tachypnea (12/47) when compared to the non-exposed group (p ≤ 0.001). Maternal interviews were conducted in 23 cases and showed that 96% of the cases had inadequate prenatal care (<5 visits), 48% had at least one parent involved in prostitution, 39% of the mothers were unwilling to take their children home, and government or non-government support was provided in only 30% of the cases. In-utero methamphetamine exposure has been shown to adversely affect somatic growth of newborns and cause a variety of withdrawal-like symptoms. These infants are also psychosocially disadvantaged and are at greater risk for abuse and neglect.
Liang, Yuzhen; Xiong, Ruichang; Sandler, Stanley I; Di Toro, Dominic M
2017-09-05
Polyparameter Linear Free Energy Relationships (pp-LFERs), also called Linear Solvation Energy Relationships (LSERs), are used to predict many environmentally significant properties of chemicals. A method is presented for computing the necessary chemical parameters, the Abraham parameters (AP), used by many pp-LFERs. It employs quantum chemical calculations and uses only the chemical's molecular structure. The method computes the Abraham E parameter using density functional theory computed molecular polarizability and the Clausius-Mossotti equation relating the index refraction to the molecular polarizability, estimates the Abraham V as the COSMO calculated molecular volume, and computes the remaining AP S, A, and B jointly with a multiple linear regression using sixty-five solvent-water partition coefficients computed using the quantum mechanical COSMO-SAC solvation model. These solute parameters, referred to as Quantum Chemically estimated Abraham Parameters (QCAP), are further adjusted by fitting to experimentally based APs using QCAP parameters as the independent variables so that they are compatible with existing Abraham pp-LFERs. QCAP and adjusted QCAP for 1827 neutral chemicals are included. For 24 solvent-water systems including octanol-water, predicted log solvent-water partition coefficients using adjusted QCAP have the smallest root-mean-square errors (RMSEs, 0.314-0.602) compared to predictions made using APs estimated using the molecular fragment based method ABSOLV (0.45-0.716). For munition and munition-like compounds, adjusted QCAP has much lower RMSE (0.860) than does ABSOLV (4.45) which essentially fails for these compounds.
Vaeth, Michael; Skovlund, Eva
2004-06-15
For a given regression problem it is possible to identify a suitably defined equivalent two-sample problem such that the power or sample size obtained for the two-sample problem also applies to the regression problem. For a standard linear regression model the equivalent two-sample problem is easily identified, but for generalized linear models and for Cox regression models the situation is more complicated. An approximately equivalent two-sample problem may, however, also be identified here. In particular, we show that for logistic regression and Cox regression models the equivalent two-sample problem is obtained by selecting two equally sized samples for which the parameters differ by a value equal to the slope times twice the standard deviation of the independent variable and further requiring that the overall expected number of events is unchanged. In a simulation study we examine the validity of this approach to power calculations in logistic regression and Cox regression models. Several different covariate distributions are considered for selected values of the overall response probability and a range of alternatives. For the Cox regression model we consider both constant and non-constant hazard rates. The results show that in general the approach is remarkably accurate even in relatively small samples. Some discrepancies are, however, found in small samples with few events and a highly skewed covariate distribution. Comparison with results based on alternative methods for logistic regression models with a single continuous covariate indicates that the proposed method is at least as good as its competitors. The method is easy to implement and therefore provides a simple way to extend the range of problems that can be covered by the usual formulas for power and sample size determination. Copyright 2004 John Wiley & Sons, Ltd.
Kinetics of hydrogen peroxide decomposition by catalase: hydroxylic solvent effects.
Raducan, Adina; Cantemir, Anca Ruxandra; Puiu, Mihaela; Oancea, Dumitru
2012-11-01
The effect of water-alcohol (methanol, ethanol, propan-1-ol, propan-2-ol, ethane-1,2-diol and propane-1,2,3-triol) binary mixtures on the kinetics of hydrogen peroxide decomposition in the presence of bovine liver catalase is investigated. In all solvents, the activity of catalase is smaller than in water. The results are discussed on the basis of a simple kinetic model. The kinetic constants for product formation through enzyme-substrate complex decomposition and for inactivation of catalase are estimated. The organic solvents are characterized by several physical properties: dielectric constant (D), hydrophobicity (log P), concentration of hydroxyl groups ([OH]), polarizability (α), Kamlet-Taft parameter (β) and Kosower parameter (Z). The relationships between the initial rate, kinetic constants and medium properties are analyzed by linear and multiple linear regression.
NASA Astrophysics Data System (ADS)
Eyarkai Nambi, Vijayaram; Thangavel, Kuladaisamy; Manickavasagan, Annamalai; Shahir, Sultan
2017-01-01
Prediction of ripeness level in climacteric fruits is essential for post-harvest handling. An index capable of predicting ripening level with minimum inputs would be highly beneficial to handlers, processors and researchers in the fruit industry. A study was conducted with Indian mango cultivars to develop a ripeness index and an associated model. Changes in physicochemical, colour and textural properties were measured throughout the ripening period, and the period was classified into five stages (unripe, early ripe, partially ripe, ripe and over ripe). Multivariate regression techniques, namely partial least squares regression, principal component regression and multiple linear regression, were compared and evaluated for ripeness prediction. A multiple linear regression model with 12 parameters was found most suitable for ripening prediction. A scientific variable reduction method was adopted to simplify the developed model. Better prediction was achieved with either 2 or 3 variables (total soluble solids, colour and acidity). Cross-validation was done to increase robustness, and it was found that the proposed ripening index was effective in predicting ripening stages. The three-variable model would be suitable for commercial applications where reasonable accuracy is sufficient, whereas the 12-variable model can be used to obtain more precise results in research and development applications.
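To illustrate the comparison of multivariate regression techniques mentioned, the sketch below cross-validates multiple linear regression, principal component regression, and PLS regression on a synthetic matrix of correlated quality attributes; the real variables (TSS, colour, acidity, etc.) and mango data are not reproduced.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# synthetic data: 12 correlated quality attributes, ripening stage 1-5 as target
rng = np.random.default_rng(14)
n = 120
stage = rng.integers(1, 6, n).astype(float)
X = np.column_stack([stage * w + rng.normal(0, 0.8, n)
                     for w in rng.uniform(0.3, 1.5, 12)])

models = {
    "MLR (12 variables)":  LinearRegression(),
    "PCR (3 components)":  make_pipeline(PCA(n_components=3), LinearRegression()),
    "PLSR (3 components)": PLSRegression(n_components=3),
}
for name, model in models.items():
    r2 = cross_val_score(model, X, stage, cv=5, scoring="r2").mean()
    print(f"{name}: cross-validated R^2 = {r2:.3f}")
```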
Zhao, Zeng-hui; Wang, Wei-ming; Gao, Xin; Yan, Ji-xing
2013-01-01
According to the geological characteristics of the Xinjiang Ili mine in the western area of China, a physical model of interstratified strata composed of soft rock and a hard coal seam was established. Selecting the tunnel position, deformation modulus, and strength parameters of each layer as influencing factors, the sensitivity coefficient of roadway deformation to each parameter was first analyzed based on a Mohr-Coulomb strain softening model and nonlinear elastic-plastic finite element analysis. Then the effects of the influencing factors that showed high sensitivity were further discussed. Finally, a regression model for the relationship between roadway displacements and multiple factors was obtained by equivalent linear regression. The results show that the roadway deformation is highly sensitive to the depth of the coal seam under the floor, which should be considered in the layout of the coal roadway; deformation modulus and strength of the coal seam and floor have a great influence on the global stability of the tunnel; on the contrary, roadway deformation is not sensitive to the mechanical parameters of the soft roof; and roadway deformation under random combinations of multiple factors can be deduced by the regression model. These conclusions provide theoretical guidance for the layout and stability maintenance of coal roadways. PMID:24459447
Unsteady hovering wake parameters identified from dynamic model tests, part 1
NASA Technical Reports Server (NTRS)
Hohenemser, K. H.; Crews, S. T.
1977-01-01
The development of a 4-bladed model rotor is reported that can be excited with a simple eccentric mechanism in progressing and regressing modes with either harmonic or transient inputs. Parameter identification methods were applied to the problem of extracting parameters for linear perturbation models, including rotor dynamic inflow effects, from the measured blade flapping responses to transient pitch stirring excitations. These perturbation models were then used to predict blade flapping response to other pitch stirring transient inputs, and rotor wake and blade flapping responses to harmonic inputs. The viability and utility of using parameter identification methods for extracting the perturbation models from transients are demonstrated through these combined analytical and experimental studies.
NASA Astrophysics Data System (ADS)
Fernandes, Virgínia C.; Vera, Jose L.; Domingues, Valentina F.; Silva, Luís M. S.; Mateus, Nuno; Delerue-Matos, Cristina
2012-12-01
Multiclass analysis method was optimized in order to analyze pesticides traces by gas chromatography with ion-trap and tandem mass spectrometry (GC-MS/MS). The influence of some analytical parameters on pesticide signal response was explored. Five ion trap mass spectrometry (IT-MS) operating parameters, including isolation time (IT), excitation voltage (EV), excitation time (ET), maximum excitation energy or "q" value (q), and isolation mass window (IMW) were numerically tested in order to maximize the instrument analytical signal response. For this, multiple linear regression was used in data analysis to evaluate the influence of the five parameters on the analytical response in the ion trap mass spectrometer and to predict its response. The assessment of the five parameters based on the regression equations substantially increased the sensitivity of IT-MS/MS in the MS/MS mode. The results obtained show that for most of the pesticides, these parameters have a strong influence on both signal response and detection limit. Using the optimized method, a multiclass pesticide analysis was performed for 46 pesticides in a strawberry matrix. Levels higher than the limit established for strawberries by the European Union were found in some samples.
Correlation and simple linear regression.
Eberly, Lynn E
2007-01-01
This chapter highlights important steps in using correlation and simple linear regression to address scientific questions about the association of two continuous variables with each other. These steps include estimation and inference, assessing model fit, the connection between regression and ANOVA, and study design. Examples in microbiology are used throughout. This chapter provides a framework that is helpful in understanding more complex statistical techniques, such as multiple linear regression, linear mixed effects models, logistic regression, and proportional hazards regression.
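A minimal example of the estimation-and-inference steps described above, using simulated data for two continuous variables; the microbiological interpretation of the variables is only an assumed illustration.

```python
import numpy as np
from scipy import stats
import statsmodels.api as sm

rng = np.random.default_rng(1)

# Simulated pair of continuous variables, e.g. incubation time (h)
# and log optical density of a bacterial culture.
time_h = rng.uniform(0, 10, 30)
log_od = 0.15 * time_h - 1.0 + rng.normal(0, 0.1, 30)

# Correlation: strength of the linear association.
r, p_corr = stats.pearsonr(time_h, log_od)
print(f"Pearson r = {r:.3f} (p = {p_corr:.2g})")

# Simple linear regression: slope/intercept estimates with inference.
fit = sm.OLS(log_od, sm.add_constant(time_h)).fit()
print(fit.params)        # intercept and slope
print(fit.conf_int())    # 95% confidence intervals
print(fit.rsquared)      # model fit; equals r**2 for a single predictor
```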
NASA Astrophysics Data System (ADS)
Giammanco, S.; Ferrera, E.; Cannata, A.; Montalto, P.; Neri, M.
2013-12-01
From November 2009 to April 2011, soil radon activity was continuously monitored using a Barasol probe located on the upper NE flank of Mt. Etna volcano (Italy), close both to the Piano Provenzana fault and to the NE-Rift. Seismic, volcanological and radon data were analysed together with data on environmental parameters, such as air and soil temperature, barometric pressure, snow and rain fall. In order to find possible correlations among the above parameters, and hence to reveal possible anomalous trends in the radon time-series, we used different statistical methods: i) multivariate linear regression; ii) cross-correlation; iii) coherence analysis through the wavelet transform. Multivariate regression indicated a modest influence on soil radon from environmental parameters (R2 = 0.31). When using 100-day time windows, the R2 values showed wide variations in time, reaching their maxima (~0.63-0.66) during summer. Cross-correlation analysis over 100-day moving averages showed that, as with multivariate linear regression analysis, the summer period was characterised by the best correlation between radon data and environmental parameters. Lastly, the wavelet coherence analysis allowed a multi-resolution coherence analysis of the time series acquired. This approach made it possible to study the relations among different signals in either the time or the frequency domain. It confirmed the results of the previous methods, but also revealed correlations between radon and environmental parameters at different observation scales (e.g., radon activity changed during strong precipitation, but also during anomalous variations of soil temperature uncorrelated with seasonal fluctuations). Using the above analysis, two periods were recognized when radon variations were significantly correlated with marked soil temperature changes and also with local seismic or volcanic activity. This allowed us to produce two different physical models of soil gas transport that explain the observed anomalies. Our work suggests that, in order to make an accurate analysis of the relations among different signals, it is necessary to use different techniques that give complementary analytical information. In particular, the wavelet analysis proved to be the most effective in discriminating radon changes due to environmental influences from those correlated with impending seismic or volcanic events.
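The moving-window regression idea can be sketched as below: refit a multivariate linear model of radon on the environmental series in each 100-day window and track R2 through time. The column names and the synthetic data are assumptions; only the window length follows the abstract.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)

# Assumed daily means of the monitored series: radon plus environmental
# parameters (air/soil temperature, barometric pressure, rainfall).
dates = pd.date_range("2009-11-01", "2011-04-30", freq="D")
df = pd.DataFrame({
    "radon": rng.gamma(5, 100, len(dates)),
    "air_T": rng.normal(15, 8, len(dates)),
    "soil_T": rng.normal(12, 5, len(dates)),
    "pressure": rng.normal(900, 6, len(dates)),
    "rain": rng.exponential(2, len(dates)),
}, index=dates)

predictors = ["air_T", "soil_T", "pressure", "rain"]
window = 100  # days, as in the study

r2 = {}
for end in range(window, len(df)):
    chunk = df.iloc[end - window:end]
    model = LinearRegression().fit(chunk[predictors], chunk["radon"])
    r2[df.index[end]] = model.score(chunk[predictors], chunk["radon"])

r2 = pd.Series(r2)
print(r2.describe())  # how much variance the environment explains per window
print(r2.idxmax())    # date of the strongest in-window fit (the study found summer maxima)
```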
Robust estimation for partially linear models with large-dimensional covariates
Zhu, LiPing; Li, RunZe; Cui, HengJian
2014-01-01
We are concerned with robust estimation procedures to estimate the parameters in partially linear models with large-dimensional covariates. To enhance the interpretability, we suggest implementing a nonconcave regularization method in the robust estimation procedure to select important covariates from the linear component. We establish the consistency for both the linear and the nonlinear components when the covariate dimension diverges at the rate of o(n), where n is the sample size. We show that the robust estimate of the linear component performs asymptotically as well as its oracle counterpart, which assumes the baseline function and the unimportant covariates were known a priori. With a consistent estimator of the linear component, we estimate the nonparametric component by a robust local linear regression. It is proved that the robust estimate of the nonlinear component performs asymptotically as well as if the linear component were known in advance. Comprehensive simulation studies are carried out and an application is presented to examine the finite-sample performance of the proposed procedures. PMID:24955087
Robust estimation for partially linear models with large-dimensional covariates.
Zhu, LiPing; Li, RunZe; Cui, HengJian
2013-10-01
We are concerned with robust estimation procedures to estimate the parameters in partially linear models with large-dimensional covariates. To enhance the interpretability, we suggest implementing a nonconcave regularization method in the robust estimation procedure to select important covariates from the linear component. We establish the consistency for both the linear and the nonlinear components when the covariate dimension diverges at the rate of o(n), where n is the sample size. We show that the robust estimate of the linear component performs asymptotically as well as its oracle counterpart, which assumes the baseline function and the unimportant covariates were known a priori. With a consistent estimator of the linear component, we estimate the nonparametric component by a robust local linear regression. It is proved that the robust estimate of the nonlinear component performs asymptotically as well as if the linear component were known in advance. Comprehensive simulation studies are carried out and an application is presented to examine the finite-sample performance of the proposed procedures.
Female Literacy Rate is a Better Predictor of Birth Rate and Infant Mortality Rate in India.
Saurabh, Suman; Sarkar, Sonali; Pandey, Dhruv K
2013-01-01
Educated women are known to take informed reproductive and healthcare decisions. These result in population stabilization and better infant care, reflected by lower birth rates and infant mortality rates (IMRs), respectively. Our objective was to study the relationship of male and female literacy rates with the crude birth rates (CBRs) and IMRs of the states and union territories (UTs) of India. The data were analyzed using linear regression. CBR and IMR were taken as the dependent variables, while the overall, male, and female literacy rates were the independent variables. CBRs were inversely related to literacy rates (slope parameter = -0.402, P < 0.001). On multiple linear regression with male and female literacy rates, a significant inverse relationship emerged between female literacy rate and CBR (slope = -0.363, P < 0.001), while male literacy rate was not significantly related to CBR (P = 0.674). IMRs of the states were also inversely related to their literacy rates (slope = -1.254, P < 0.001). Multiple linear regression revealed a significant inverse relationship between IMR and female literacy (slope = -0.816, P = 0.031), whereas male literacy rate was not significantly related (P = 0.630). Female literacy is thus relatively more important for both population stabilization and better infant health.
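The core model is an ordinary multiple linear regression of CBR on male and female literacy rates; a schematic version with synthetic state-level values (not the census data used in the study) might look like this.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)

# Synthetic state-level data: literacy rates (%) and crude birth rate.
n_states = 35
female_lit = rng.uniform(55, 95, n_states)
male_lit = np.clip(female_lit + rng.normal(10, 5, n_states), 60, 99)
cbr = 40 - 0.36 * female_lit + rng.normal(0, 1.5, n_states)

data = pd.DataFrame({"cbr": cbr, "female_lit": female_lit,
                     "male_lit": male_lit})

# CBR regressed on both literacy rates; the female-literacy coefficient
# (slope per percentage point) is the quantity of interest.
fit = smf.ols("cbr ~ female_lit + male_lit", data=data).fit()
print(fit.summary().tables[1])
```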
Kim, Hyun-Jin; Bagheri, Rafat; Kim, Young Kyung; Son, Jun Sik; Kwon, Tae-Yub
2017-01-01
This study investigated the influence of curing mode (dual- or self-cure) on the surface energy and sorption/solubility of four self-adhesive resin cements (SARCs) and one conventional resin cement. The degree of conversion (DC) and surface energy parameters, including degree of hydrophilicity (DH), were determined using Fourier transform infrared spectroscopy and contact angle measurements, respectively (n = 5). Sorption and solubility were assessed by mass gain or loss after storage in distilled water or lactic acid for 60 days (n = 5). A linear regression model was used to correlate the results (%DC vs. DH and %DC/DH vs. sorption/solubility). For all materials, dual-curing consistently produced significantly higher %DC values than self-curing (p < 0.05). Significant negative linear regressions were established between %DC and DH in both curing modes (p < 0.05). Overall, the SARCs showed higher sorption/solubility values, in particular when immersed in lactic acid, than the conventional resin cement. Linear regression revealed that %DC and DH were negatively and positively correlated with the sorption/solubility values, respectively. Dual-curing of SARCs seems to lower the sorption and/or solubility in comparison with self-curing through increased %DC and occasionally decreased hydrophilicity. PMID:28772489
The extinction law from photometric data: linear regression methods
NASA Astrophysics Data System (ADS)
Ascenso, J.; Lombardi, M.; Lada, C. J.; Alves, J.
2012-04-01
Context. The properties of dust grains, in particular their size distribution, are expected to differ from the interstellar medium to the high-density regions within molecular clouds. Since the extinction at near-infrared wavelengths is caused by dust, the extinction law in cores should depart from that found in low-density environments if the dust grains have different properties. Aims: We explore methods to measure the near-infrared extinction law produced by dense material in molecular cloud cores from photometric data. Methods: Using controlled sets of synthetic and semi-synthetic data, we test several methods for linear regression applied to the specific problem of deriving the extinction law from photometric data. We cover the parameter space appropriate to this type of observations. Results: We find that many of the common linear-regression methods produce biased results when applied to the extinction law from photometric colors. We propose and validate a new method, LinES, as the most reliable for this effect. We explore the use of this method to detect whether or not the extinction law of a given reddened population has a break at some value of extinction. Based on observations collected at the European Organisation for Astronomical Research in the Southern Hemisphere, Chile (ESO programmes 069.C-0426 and 074.C-0728).
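One concrete illustration of why common linear-regression methods can bias the slope when both photometric colors carry noise, and how an errors-in-both-variables fit behaves differently; this uses orthogonal distance regression as a generic example and is not the LinES method proposed in the paper.

```python
import numpy as np
from scipy import odr, stats

rng = np.random.default_rng(4)

# Synthetic reddened population: the true color-color relation has
# slope beta (tied to the extinction law), but both colors carry
# comparable photometric noise.
n, beta, sigma = 500, 1.7, 0.1
true_x = rng.exponential(0.8, n)
x = true_x + rng.normal(0, sigma, n)         # e.g. H - K color
y = beta * true_x + rng.normal(0, sigma, n)  # e.g. J - H color

# Ordinary least squares: biased toward shallower slopes when the
# x-axis color is noisy (regression dilution).
ols_slope = stats.linregress(x, y).slope

# Orthogonal distance regression accounts for errors on both axes.
model = odr.Model(lambda p, x: p[0] * x + p[1])
data = odr.RealData(x, y, sx=np.full(n, sigma), sy=np.full(n, sigma))
fit = odr.ODR(data, model, beta0=[1.0, 0.0]).run()

print(f"true slope {beta}, OLS {ols_slope:.3f}, ODR {fit.beta[0]:.3f}")
```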
NASA Technical Reports Server (NTRS)
Smith, Timothy D.; Steffen, Christopher J., Jr.; Yungster, Shaye; Keller, Dennis J.
1998-01-01
The all rocket mode of operation is shown to be a critical factor in the overall performance of a rocket based combined cycle (RBCC) vehicle. An axisymmetric RBCC engine was used to determine specific impulse efficiency values based upon both full flow and gas generator configurations. Design of experiments methodology was used to construct a test matrix and multiple linear regression analysis was used to build parametric models. The main parameters investigated in this study were: rocket chamber pressure, rocket exit area ratio, injected secondary flow, mixer-ejector inlet area, mixer-ejector area ratio, and mixer-ejector length-to-inlet diameter ratio. A perfect gas computational fluid dynamics analysis, using both the Spalart-Allmaras and k-omega turbulence models, was performed with the NPARC code to obtain values of vacuum specific impulse. Results from the multiple linear regression analysis showed that for both the full flow and gas generator configurations increasing mixer-ejector area ratio and rocket area ratio increase performance, while increasing mixer-ejector inlet area ratio and mixer-ejector length-to-diameter ratio decrease performance. Increasing injected secondary flow increased performance for the gas generator analysis, but was not statistically significant for the full flow analysis. Chamber pressure was found to be not statistically significant.
Variable Selection with Prior Information for Generalized Linear Models via the Prior LASSO Method.
Jiang, Yuan; He, Yunxiao; Zhang, Heping
LASSO is a popular statistical tool often used in conjunction with generalized linear models that can simultaneously select variables and estimate parameters. When there are many variables of interest, as in current biological and biomedical studies, the power of LASSO can be limited. Fortunately, a great deal of biological and biomedical data has been collected that may contain useful information about the importance of certain variables. This paper proposes an extension of LASSO, namely, prior LASSO (pLASSO), to incorporate that prior information into penalized generalized linear models. The goal is achieved by adding to the LASSO criterion function an additional measure of the discrepancy between the prior information and the model. For linear regression, the whole solution path of the pLASSO estimator can be found with a procedure similar to the Least Angle Regression (LARS). Asymptotic theories and simulation results show that pLASSO provides significant improvement over LASSO when the prior information is relatively accurate. When the prior information is less reliable, pLASSO shows great robustness to the misspecification. We illustrate the application of pLASSO using a real data set from a genome-wide association study.
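A rough sketch of the prior-information idea: with a quadratic measure of discrepancy from a prior coefficient vector, the penalized problem can be folded into an ordinary Lasso on augmented data. This is an illustrative variant under that assumption, not the pLASSO criterion or the LARS-type path algorithm of the paper.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(5)

# Simulated linear regression problem with a sparse true coefficient.
n, p = 100, 50
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:5] = [2.0, -1.5, 1.0, 0.0, 0.8]
y = X @ beta_true + rng.normal(0, 1.0, n)

# Hypothetical prior guess of the coefficients (e.g. from earlier studies).
beta_prior = np.zeros(p)
beta_prior[:5] = [1.8, -1.2, 0.9, 0.0, 1.0]

def prior_lasso(X, y, beta_prior, lam_l1=30.0, lam_prior=1.0):
    """Minimize ||y - Xb||^2 + lam_prior*||b - b_prior||^2 + lam_l1*||b||_1
    by rewriting the quadratic prior term as extra rows of a Lasso fit."""
    p = X.shape[1]
    X_aug = np.vstack([X, np.sqrt(lam_prior) * np.eye(p)])
    y_aug = np.concatenate([y, np.sqrt(lam_prior) * beta_prior])
    # sklearn's Lasso minimizes (1/2n)*||y - Xb||^2 + alpha*||b||_1,
    # so rescale the l1 weight accordingly.
    alpha = lam_l1 / (2 * X_aug.shape[0])
    return Lasso(alpha=alpha, fit_intercept=False,
                 max_iter=50000).fit(X_aug, y_aug).coef_

coef = prior_lasso(X, y, beta_prior)
print("nonzero coefficients:", np.flatnonzero(coef))
```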
Locomotive syndrome is associated not only with physical capacity but also degree of depression.
Ikemoto, Tatsunori; Inoue, Masayuki; Nakata, Masatoshi; Miyagawa, Hirofumi; Shimo, Kazuhiro; Wakabayashi, Toshiko; Arai, Young-Chang P; Ushida, Takahiro
2016-05-01
Reports of locomotive syndrome (LS) have recently been increasing. Although physical performance measures for LS have been well investigated to date, studies including psychiatric assessment are still scarce. Hence, the aim of this study was to investigate both physical and mental parameters in relation to the presence and severity of LS using the 25-question geriatric locomotive function scale (GLFS-25) questionnaire. 150 elderly people aged over 60 years, who were members of our physical-fitness center and displayed well-being, were enrolled in this study. First, using the previously determined GLFS-25 cutoff value (16 points), subjects were divided into an LS group and a non-LS group, and each parameter (age, grip strength, timed-up-and-go test (TUG), one-leg standing with eyes open, back muscle and leg muscle strength, degree of depression and cognitive impairment) was compared between the groups using the Mann-Whitney U-test, followed by multiple logistic regression analysis. Second, a multiple linear regression was conducted to determine which variables showed the strongest correlation with the severity of LS. Using the GLFS-25 cutoff value, 110 people (73%) were classified as non-LS and 40 as LS. Comparative analysis between the LS and non-LS groups revealed significant differences in age, grip strength, TUG, one-leg standing, back muscle strength and degree of depression (p < 0.006, after Bonferroni correction). Multiple logistic regression revealed that functional decline in grip strength, TUG and one-leg standing and the degree of depression were significantly associated with LS. On the other hand, in the multiple linear regression analysis the significant contributors to the GLFS-25 score were TUG and the degree of depression. The results indicate that LS is associated not only with physical performance capacity but also with the degree of depression, even though most participants did not meet the criteria for LS. Copyright © 2016 The Japanese Orthopaedic Association. Published by Elsevier B.V. All rights reserved.
Sanagi, M Marsin; Nasir, Zalilah; Ling, Susie Lu; Hermawan, Dadan; Ibrahim, Wan Aini Wan; Naim, Ahmedy Abu
2010-01-01
Linearity assessment as required in method validation has always been subject to different interpretations and definitions by various guidelines and protocols. However, there are very limited applicable implementation procedures that can be followed by a laboratory chemist in assessing linearity. Thus, this work proposes a simple method for linearity assessment in method validation by a regression analysis that covers experimental design, estimation of the parameters, outlier treatment, and evaluation of the assumptions according to the International Union of Pure and Applied Chemistry guidelines. The suitability of this procedure was demonstrated by its application to an in-house validation for the determination of plasticizers in plastic food packaging by GC.
Analysis of the Effects of the Commander’s Battle Positioning on Unit Combat Performance
1991-03-01
Contents excerpt: Analysis … 58; Logistic Regression Analysis … 61; Canonical Correlation Analysis … 62; Discriminant Analysis. Text excerpts: Discriminant analysis entails classifying objects into two or more distinct groups, or responses. Dillon defines discriminant analysis as "deriving linear combinations of the … object given its predictor variables." The second objective is, through analysis of the parameters of the discriminant functions, to determine those …
Nearest-neighbor thermodynamics of deoxyinosine pairs in DNA duplexes
Watkins, Norman E.; SantaLucia, John
2005-01-01
Nearest-neighbor thermodynamic parameters of the ‘universal pairing base’ deoxyinosine were determined for the pairs I·C, I·A, I·T, I·G and I·I adjacent to G·C and A·T pairs. Ultraviolet absorbance melting curves were measured and non-linear regression performed on 84 oligonucleotide duplexes with 9 or 12 bp lengths. These data were combined with data for 13 inosine containing duplexes from the literature. Multiple linear regression was used to solve for the 32 nearest-neighbor unknowns. The parameters predict the Tm for all sequences within 1.2°C on average. The general trend in decreasing stability is I·C > I·A > I·T ≈ I· G > I·I. The stability trend for the base pair 5′ of the I·X pair is G·C > C·G > A·T > T·A. The stability trend for the base pair 3′ of I·X is the same. These trends indicate a complex interplay between H-bonding, nearest-neighbor stacking, and mismatch geometry. A survey of 14 tandem inosine pairs and 8 tandem self-complementary inosine pairs is also provided. These results may be used in the design of degenerate PCR primers and for degenerate microarray probes. PMID:16264087
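The multiple-linear-regression step, in which each duplex's measured thermodynamic value is expressed as a sum of nearest-neighbor contributions counted from its sequence, can be sketched as follows; the nearest-neighbor types, counts, and free-energy values below are synthetic placeholders, not the paper's data.

```python
import numpy as np

# Each row: counts of a few nearest-neighbor (NN) steps in a duplex
# (columns are NN types; only four are shown for illustration); the
# target is the measured duplex free energy.  All numbers are synthetic
# stand-ins for the 97 duplexes of the study.
nn_types = ["GI/CC", "AI/TC", "IC/IG", "TI/AC"]
counts = np.array([
    [2, 1, 0, 1],
    [1, 2, 1, 0],
    [0, 1, 2, 1],
    [1, 0, 1, 2],
    [2, 2, 1, 1],
    [0, 2, 0, 2],
], dtype=float)
dG_measured = np.array([-7.9, -6.4, -4.8, -5.5, -9.7, -5.2])

# Multiple linear regression: solve counts @ params ~= dG in the
# least-squares sense to obtain per-NN-step contributions.
params, residuals, rank, _ = np.linalg.lstsq(counts, dG_measured, rcond=None)
for name, value in zip(nn_types, params):
    print(f"{name}: {value:+.2f} kcal/mol")

# Predicted dG for a new duplex described by its NN counts.
new_counts = np.array([1.0, 1.0, 1.0, 1.0])
print("predicted dG:", new_counts @ params)
```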
A non-linear data mining parameter selection algorithm for continuous variables
Razavi, Marianne; Brady, Sean
2017-01-01
In this article, we propose a new data mining algorithm by which one can both capture the non-linearity in data and find the best subset model. To produce an enhanced subset of the original variables, a preferred selection method should have the potential of adding a supplementary level of regression analysis that would capture complex relationships in the data via mathematical transformation of the predictors and exploration of synergistic effects of combined variables. The method that we present here has the potential to produce an optimal subset of variables, rendering the overall process of model selection more efficient. This algorithm introduces interpretable parameters by transforming the original inputs and also provides a faithful fit to the data. The core objective of this paper is to introduce a new estimation technique for the classical least squares regression framework. This new automatic variable transformation and model selection method could offer an optimal and stable model that minimizes the mean square error and variability, while combining all possible subset selection methodology with the inclusion of variable transformations and interactions. Moreover, this method controls multicollinearity, leading to an optimal set of explanatory variables. PMID:29131829
Estimation of Compaction Parameters Based on Soil Classification
NASA Astrophysics Data System (ADS)
Lubis, A. S.; Muis, Z. A.; Hastuty, I. P.; Siregar, I. M.
2018-02-01
Factors that must be considered in soil compaction works are the type of soil material, field control, maintenance, and the availability of funds. These problems raised the idea of estimating the density of the soil with an implementation system that is proper, fast, and economical. This study aims to estimate the compaction parameters, i.e., the maximum dry unit weight (γdmax) and the optimum water content (wopt), based on soil classification. Each of the 30 samples was tested for its index properties and compaction behaviour. All of the laboratory test results were used to estimate the compaction parameter values by linear regression and by the Goswami model. From the results, the soil types were A-4, A-6, and A-7 according to AASHTO and SC, SC-SM, and CL based on USCS. By linear regression, the estimation equations are γdmax* = 1.862 − 0.005·FINES − 0.003·LL for the maximum dry unit weight and wopt* = −0.607 + 0.362·FINES + 0.161·LL for the optimum water content. By the Goswami model (with equation Y = m·log G + k), the maximum dry unit weight γdmax* is estimated with m = −0.376 and k = 2.482, and the optimum water content wopt* with m = 21.265 and k = −32.421. For both of these equations a 95% confidence interval was obtained.
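A sketch of how both regression approaches could be fitted from laboratory data; the FINES and LL values are simulated, the fitted coefficients will differ from the equations above, and the index G used in the Goswami-type fit is an assumed stand-in for the soil group index.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(6)

# Assumed laboratory results for 30 samples: fines content (%) and
# liquid limit (%), plus measured compaction parameters.
fines = rng.uniform(35, 75, 30)
ll = rng.uniform(25, 55, 30)
gamma_dmax = 1.86 - 0.005 * fines - 0.003 * ll + rng.normal(0, 0.01, 30)
w_opt = -0.6 + 0.36 * fines + 0.16 * ll + rng.normal(0, 0.8, 30)

X = np.column_stack([fines, ll])

# Multiple linear regression, one model per compaction parameter.
for name, y in [("gamma_dmax", gamma_dmax), ("w_opt", w_opt)]:
    fit = LinearRegression().fit(X, y)
    print(name, round(fit.intercept_, 3), np.round(fit.coef_, 4))

# Goswami-type model Y = m*log10(G) + k, with G a soil index
# (assumed here to be a group-index-like quantity built from FINES and LL).
G = fines * ll / 100.0
m, k = np.polyfit(np.log10(G), gamma_dmax, 1)
print("Goswami-type fit for gamma_dmax: m =", round(m, 3), "k =", round(k, 3))
```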
Li, Liang; Wang, Yiying; Xu, Jiting; Flora, Joseph R V; Hoque, Shamia; Berge, Nicole D
2018-08-01
Hydrothermal carbonization (HTC) is a wet, low temperature thermal conversion process that continues to gain attention for the generation of hydrochar. The importance of specific process conditions and feedstock properties on hydrochar characteristics is not well understood. To evaluate this, linear and non-linear models were developed to describe hydrochar characteristics based on data collected from HTC-related literature. A Sobol analysis was subsequently conducted to identify parameters that most influence hydrochar characteristics. Results from this analysis indicate that for each investigated hydrochar property, the model fit and predictive capability associated with the random forest models is superior to both the linear and regression tree models. Based on results from the Sobol analysis, the feedstock properties and process conditions most influential on hydrochar yield, carbon content, and energy content were identified. In addition, a variational process parameter sensitivity analysis was conducted to determine how feedstock property importance changes with process conditions. Copyright © 2018 Elsevier Ltd. All rights reserved.
Talpur, M Younis; Kara, Huseyin; Sherazi, S T H; Ayyildiz, H Filiz; Topkafa, Mustafa; Arslan, Fatma Nur; Naz, Saba; Durmaz, Fatih; Sirajuddin
2014-11-01
Single bounce attenuated total reflectance (SB-ATR) Fourier transform infrared (FTIR) spectroscopy in conjunction with chemometrics was used for accurate determination of free fatty acid (FFA), peroxide value (PV), iodine value (IV), conjugated diene (CD) and conjugated triene (CT) of cottonseed oil (CSO) during potato chips frying. Partial least squares (PLS), stepwise multiple linear regression (SMLR), principal component regression (PCR) and simple Beer's law (SBL) were applied to develop the calibrations for simultaneous evaluation of the five stated parameters of cottonseed oil (CSO) during frying of frozen French potato chips at 170°C. Good regression coefficients (R(2)) were achieved for FFA, PV, IV, CD and CT, with values of >0.992 by PLS, SMLR, PCR, and SBL. The root mean square error of prediction (RMSEP) was found to be less than 1.95% for all determinations. The results of the study indicated that SB-ATR FTIR in combination with multivariate chemometrics could be used for accurate and simultaneous determination of different parameters during the frying process without using any toxic organic solvent. Copyright © 2014 Elsevier B.V. All rights reserved.
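In the same spirit, a minimal PLS calibration on simulated spectra is shown below; the spectra, the reference FFA values, and the number of latent variables are all assumptions rather than the study's data or settings.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(7)

# Simulated SB-ATR FTIR spectra (absorbance at 500 wavenumbers) and a
# reference quality parameter (e.g. free fatty acid content, %).
n_samples, n_wavenumbers = 60, 500
spectra = rng.normal(0, 0.01, (n_samples, n_wavenumbers))
ffa_ref = rng.uniform(0.05, 1.5, n_samples)
spectra[:, 200:210] += ffa_ref[:, None] * 0.05  # an assumed FFA-sensitive band

X_cal, X_val, y_cal, y_val = train_test_split(
    spectra, ffa_ref, test_size=0.3, random_state=0)

# PLS calibration with a handful of latent variables.
pls = PLSRegression(n_components=5).fit(X_cal, y_cal)
y_pred = pls.predict(X_val).ravel()

rmsep = np.sqrt(mean_squared_error(y_val, y_pred))
r2 = np.corrcoef(y_val, y_pred)[0, 1] ** 2
print(f"RMSEP = {rmsep:.3f} %FFA, R^2 = {r2:.3f}")
```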
Herrero, A M; de la Hoz, L; Ordóñez, J A; Herranz, B; Romero de Ávila, M D; Cambero, M I
2008-11-01
The possibilities of using breaking strength (BS) and energy to fracture (EF) for monitoring textural properties of some cooked meat sausages (chopped, mortadella and galantines) were studied. Texture profile analysis (TPA), folding test and physico-chemical measurements were also performed. Principal component analysis enabled these meat products to be grouped into three textural profiles which showed significant (p<0.05) differences mainly for BS, hardness, adhesiveness and cohesiveness. Multivariate analysis indicated that BS, EF and TPA parameters were correlated (p<0.05) for every individual meat product (chopped, mortadella and galantines) and for all products together. On the basis of these results, TPA parameters could be used for constructing regression models to predict BS. The resulting regression model for all cooked meat products was BS = -0.160 + 6.600*cohesiveness - 1.255*adhesiveness + 0.048*hardness - 506.31*springiness (R(2) = 0.745, p < 0.00005). Simple linear regression analysis showed significant coefficients of determination between BS and folding test grade (FG) (R(2) = 0.586, p < 0.0001) and between EF and FG (R(2) = 0.564, p < 0.0001).
Karthikeyan, G; Sundarraj, A Shunmuga; Elango, K P
2003-10-01
193 drinking water samples from water sources of 27 panchayats of Veppanapalli block of Dharmapuri district of Tamil Nadu were analysed for chemical quality parameters. Based on the fluoride content of the water sources, fluoride maps differentiating regions with high/low fluoride levels were prepared using the isopleth mapping technique. The interdependence among the important chemical quality parameters was assessed using correlation studies. The experimental results of the application of linear and multiple regression equations on the influence of hardness, alkalinity, total dissolved solids and pH on fluoride are discussed.
do Prado, Mara Rúbia Maciel Cardoso; Oliveira, Fabiana de Cássia Carvalho; Assis, Karine Franklin; Ribeiro, Sarah Aparecida Vieira; do Prado, Pedro Paulo; Sant'Ana, Luciana Ferreira da Rocha; Priore, Silvia Eloiza; Franceschini, Sylvia do Carmo Castro
2015-01-01
Objective: To assess the prevalence of vitamin D deficiency and its associated factors in women and their newborns in the postpartum period. Methods: This cross-sectional study evaluated vitamin D deficiency/insufficiency in 226 women and their newborns in Viçosa (Minas Gerais, BR) between December 2011 and November 2012. Cord blood and venous maternal blood were collected to evaluate the following biochemical parameters: vitamin D, alkaline phosphatase, calcium, phosphorus and parathyroid hormone. Poisson regression analysis, with a confidence interval of 95%, was applied to assess vitamin D deficiency and its associated factors. Multiple linear regression analysis was performed to identify factors associated with 25(OH)D deficiency in the newborns and women from the study. The criterion for variable inclusion in the multiple linear regression model was association with the dependent variable in the simple linear regression analysis, considering p < 0.20. The significance level was α < 5%. Results: Of the 226 women included, 200 (88.5%) were 20-44 years old; the median age was 28 years. Deficient/insufficient levels of vitamin D were found in 192 (85%) women and in 182 (80.5%) neonates. The maternal 25(OH)D and alkaline phosphatase levels were independently associated with vitamin D deficiency in infants. Conclusions: This study identified a high prevalence of vitamin D deficiency and insufficiency in women and newborns, and an association between the maternal nutritional status of vitamin D and their infants' vitamin D status. PMID:26100593
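A hedged sketch of a Poisson regression with robust standard errors for a binary deficiency outcome, a common way to estimate prevalence ratios; the covariates and data are assumed for illustration and the model specification may differ from the study's.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(8)

# Assumed postpartum data set: binary vitamin D deficiency indicator,
# with maternal age and season of blood draw as example covariates.
n = 226
df = pd.DataFrame({
    "deficient": rng.binomial(1, 0.8, n),
    "age": rng.uniform(18, 44, n),
    "winter": rng.binomial(1, 0.5, n),
})

# Poisson regression with robust (HC0) variance yields prevalence
# ratios for a common binary outcome.
fit = smf.glm("deficient ~ age + winter", data=df,
              family=sm.families.Poisson()).fit(cov_type="HC0")
print(np.exp(fit.params))      # prevalence ratios
print(np.exp(fit.conf_int()))  # 95% confidence intervals
```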
Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne
2012-01-01
In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models. PMID:23275882
Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne
2012-12-01
In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models.
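For a rough feel of the linear-versus-nonlinear comparison, the sketch below contrasts a cross-validated Lasso (a frequentist stand-in for the Bayesian LASSO) with kernel ridge regression using a Gaussian kernel (an RKHS-type regression) on simulated marker data; the Bayesian models and neural networks of the study are not reproduced.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import cross_val_score, KFold

rng = np.random.default_rng(9)

# Simulated biallelic markers (coded 0/1) and a phenotype with both
# additive and mildly non-additive genetic signal.
n_lines, n_markers = 300, 1000
M = rng.binomial(1, 0.5, size=(n_lines, n_markers)).astype(float)
effects = rng.normal(0, 0.2, n_markers) * (rng.random(n_markers) < 0.05)
y = M @ effects + 0.3 * (M[:, 0] * M[:, 1]) + rng.normal(0, 0.5, n_lines)

cv = KFold(n_splits=5, shuffle=True, random_state=0)

lasso = LassoCV(cv=5, max_iter=50000)                      # linear on markers
rkhs = KernelRidge(kernel="rbf", alpha=1.0, gamma=1.0 / n_markers)  # kernel model

for name, model in [("Lasso (linear)", lasso), ("Kernel ridge (RKHS)", rkhs)]:
    score = cross_val_score(model, M, y, cv=cv, scoring="r2").mean()
    print(f"{name}: mean CV R^2 = {score:.3f}")
```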
NASA Astrophysics Data System (ADS)
Singh, S.; Jaishi, H. P.; Tiwari, R. P.; Tiwari, R. C.
2017-07-01
This paper reports the analysis of soil radon data recorded in seismic zone-V, located in the northeastern part of India (latitude 23.73N, longitude 92.73E). Continuous measurements of soil-gas emission along the Chite fault in Mizoram (India) were carried out with the replacement of solid-state nuclear track detectors at weekly intervals. The present study was done for the period from March 2013 to May 2015 using LR-115 Type II detectors, manufactured by Kodak Pathe, France. In order to reduce the influence of meteorological parameters, statistical analysis tools such as multiple linear regression and artificial neural networks have been used. A decrease in radon concentration was recorded prior to some earthquakes that occurred during the observation period. Some false anomalies were also recorded, which may be attributed to ongoing crustal deformation that was not major enough to produce an earthquake.
Prediction of human gait parameters from temporal measures of foot-ground contact
NASA Technical Reports Server (NTRS)
Breit, G. A.; Whalen, R. T.
1997-01-01
Investigation of the influence of human physical activity on bone functional adaptation requires long-term histories of gait-related ground reaction force (GRF). Towards a simpler portable GRF measurement, we hypothesized that: 1) the reciprocal of foot-ground contact time (1/tc); or 2) the reciprocal of stride-period-normalized contact time (T/tc) predict peak vertical and horizontal GRF, loading rates, and horizontal speed during gait. GRF data were collected from 24 subjects while they walked and ran at a variety of speeds. Linear regression and ANCOVA determined the dependence of gait parameters on 1/tc and T/tc, and prediction SE. All parameters were significantly correlated to 1/tc and T/tc. The closest pooled relationship existed between peak running vertical GRF and T/tc (r2 = 0.896; SE = 3.6%) and improved with subject-specific regression (r2 = 0.970; SE = 2.2%). We conclude that temporal measures can predict force parameters of gait and may represent an alternative to direct GRF measurements for determining daily histories of habitual lower limb loading quantities necessary to quantify a bone remodeling stimulus.
Changes in Clavicle Length and Maturation in Americans: 1840-1980.
Langley, Natalie R; Cridlin, Sandra
2016-01-01
Secular changes refer to short-term biological changes ostensibly due to environmental factors. Two well-documented secular trends in many populations are earlier age of menarche and increasing stature. This study synthesizes data on maximum clavicle length and fusion of the medial epiphysis in 1840-1980 American birth cohorts to provide a comprehensive assessment of developmental and morphological change in the clavicle. Clavicles from the Hamann-Todd Human Osteological Collection (n = 354), McKern and Stewart Korean War males (n = 341), Forensic Anthropology Data Bank (n = 1,239), and the McCormick Clavicle Collection (n = 1,137) were used in the analysis. Transition analysis was used to evaluate fusion of the medial epiphysis (scored as unfused, fusing, or fused). Several statistical treatments were used to assess fluctuations in maximum clavicle length. First, Durbin-Watson tests were used to evaluate autocorrelation, and a local regression (LOESS) was used to identify visual shifts in the regression slope. Next, piecewise regression was used to fit linear regression models before and after the estimated breakpoints. Multiple starting parameters were tested in the range determined to contain the breakpoint, and the model with the smallest mean squared error was chosen as the best fit. The parameters from the best-fit models were then used to derive the piecewise models, which were compared with the initial simple linear regression models to determine which model provided the best fit for the secular change data. The epiphyseal union data indicate a decline in the age at onset of fusion since the early twentieth century. Fusion commences approximately four years earlier in mid- to late twentieth-century birth cohorts than in late nineteenth- and early twentieth-century birth cohorts. However, fusion is completed at roughly the same age across cohorts. The most significant decline in age at onset of epiphyseal union appears to have occurred since the mid-twentieth century. LOESS plots show a breakpoint in the clavicle length data around the mid-twentieth century in both sexes, and piecewise regression models indicate a significant decrease in clavicle length in the American population after 1940. The piecewise model provides a slightly better fit than the simple linear model. Since the model standard error is not substantially different from the piecewise model, an argument could be made to select the less complex linear model. However, we chose the piecewise model to detect changes in clavicle length that are overfitted with a linear model. The decrease in maximum clavicle length is in line with a documented narrowing of the American skeletal form, as shown by analyses of cranial and facial breadth and bi-iliac breadth of the pelvis. Environmental influences on skeletal form include increases in body mass index, health improvements, improved socioeconomic status, and elimination of infectious diseases. Secular changes in bony dimensions and skeletal maturation stipulate that medical and forensic standards used to deduce information about growth, health, and biological traits must be derived from modern populations.
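The breakpoint-search strategy described above can be sketched compactly: fit a continuous two-segment (hinge) regression at each candidate breakpoint, keep the fit with the smallest mean squared error, and compare it with a simple linear model. The birth years and clavicle lengths below are simulated, and this is not the authors' exact statistical treatment.

```python
import numpy as np

rng = np.random.default_rng(10)

# Simulated secular-trend data: birth year vs. maximum clavicle length,
# with a downward shift in slope after an (unknown) breakpoint.
year = rng.integers(1840, 1981, 600)
length = (150 + 0.01 * (year - 1840)
          - 0.06 * np.clip(year - 1940, 0, None)
          + rng.normal(0, 4, year.size))

def piecewise_fit(x, y, bp):
    """Continuous two-segment linear fit with a hinge at bp; returns (MSE, coef)."""
    X = np.column_stack([np.ones(x.size), x, np.clip(x - bp, 0, None)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.mean((y - X @ coef) ** 2), coef

# Scan candidate breakpoints and keep the best-fitting one.
best = None
for bp in range(1880, 1971):
    mse, coef = piecewise_fit(year, length, bp)
    if best is None or mse < best[0]:
        best = (mse, bp, coef)
mse_pw, best_bp, best_coef = best

# Compare against a simple linear model.
X_lin = np.column_stack([np.ones(year.size), year])
coef_lin, *_ = np.linalg.lstsq(X_lin, length, rcond=None)
mse_lin = np.mean((length - X_lin @ coef_lin) ** 2)

print(f"best breakpoint ~ {best_bp}, piecewise MSE {mse_pw:.2f}, linear MSE {mse_lin:.2f}")
```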
Monthly monsoon rainfall forecasting using artificial neural networks
NASA Astrophysics Data System (ADS)
Ganti, Ravikumar
2014-10-01
The Indian agriculture sector heavily depends on monsoon rainfall for successful harvesting. In the past, prediction of rainfall was mainly performed using regression models, which provide reasonable accuracy in the modelling and forecasting of complex physical systems. Recently, Artificial Neural Networks (ANNs) have been proposed as efficient tools for modelling and forecasting. A feed-forward multi-layer perceptron type of ANN architecture trained using the popular back-propagation algorithm was employed in this study. Other techniques investigated for modelling monthly monsoon rainfall include linear and non-linear regression models for comparison purposes. The data employed in this study include monthly rainfall and the monthly average of the daily maximum temperature in the North Central region of India. Specifically, four regression models and two ANN models were developed. The performance of the various models was evaluated using a wide variety of standard statistical parameters and scatter plots. The results obtained in this study for forecasting monsoon rainfall using ANNs are encouraging. India's economy and agricultural activities can be effectively managed with the help of accurate monsoon rainfall forecasts.
Upper extremity disorders in heavy industry workers in Greece.
Tsouvaltzidou, Thomaella; Alexopoulos, Evangelos; Fragkakis, Ioannis; Jelastopulu, Eleni
2017-06-18
To investigate the disability due to musculoskeletal disorders of the upper extremities in heavy industry workers. The population under study consisted of 802 employees, both white- and blue-collar, working in a shipyard industry in Athens, Greece. Data were collected through the distribution of questionnaires and the recording of individual and job-related characteristics during the period 2006-2009. The questionnaires used were the Quick Disabilities of the Arm, Shoulder and Hand (QD) Outcome Measure, the Work Ability Index (WAI) and the Short-Form-36 (SF-36) Health Survey. The QD was divided into three parameters - movement restrictions in everyday activities, work and sports/music activities - and the SF-36 into two items, physical and emotional. Multiple linear regression analysis was performed by means of the SPSS v.22 for Windows statistical package. The answers given by the participants for the QD did not reveal great discomfort regarding the execution of manual tasks, with the majority of the participants scoring under 5%, meaning no disability. After conducting multiple linear regression, age revealed a positive association with the parameter of restrictions in everyday activities (b = 0.64, P = 0.000). Basic education showed a statistically significant association regarding restrictions during leisure activities, with b = 2.140 (P = 0.029) for compulsory education graduates. The WAI final score displayed a negative association in the regression analysis of all three parameters, with b = -0.142 (P = 0.000), b = -0.099 (P = 0.055) and b = -0.376 (P = 0.001), respectively, while the physical and emotional components of the SF-36 were associated with movement restrictions only in daily activities and work. The participants' specialty showed no statistically significant association with any of the three parameters of the QD. Increased musculoskeletal disorders of the upper extremity are associated with older age, lower basic education, poorer physical and mental/emotional health and reduced working ability.
The Hydrothermal Chemistry of Gold, Arsenic, Antimony, Mercury and Silver
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bessinger, Brad; Apps, John A.
2003-03-23
A comprehensive thermodynamic database based on the Helgeson-Kirkham-Flowers (HKF) equation of state was developed for metal complexes in hydrothermal systems. Because this equation of state has been shown to accurately predict standard partial molal thermodynamic properties of aqueous species at elevated temperatures and pressures, this study provides the necessary foundation for future exploration into transport and depositional processes in polymetallic ore deposits. The HKF equation of state parameters for gold, arsenic, antimony, mercury, and silver sulfide and hydroxide complexes were derived from experimental equilibrium constants using nonlinear regression calculations. In order to ensure that the resulting parameters were internally consistent, those experiments utilizing incompatible thermodynamic data were re-speciated prior to regression. Because new experimental studies were used to revise the HKF parameters for H2S0 and HS-1, those metal complexes for which HKF parameters had been previously derived were also updated. It was found that predicted thermodynamic properties of metal complexes are consistent with linear correlations between standard partial molal thermodynamic properties. This result allowed assessment of several complexes for which experimental data necessary to perform regression calculations was limited. Oxygen fugacity-temperature diagrams were calculated to illustrate how thermodynamic data improves our understanding of depositional processes. Predicted thermodynamic properties were used to investigate metal transport in Carlin-type gold deposits. Assuming a linear relationship between temperature and pressure, metals are predicted to predominantly be transported as sulfide complexes at a total aqueous sulfur concentration of 0.05 m. Also, the presence of arsenic and antimony mineral phases in the deposits are shown to restrict mineralization within a limited range of chemical conditions. Finally, at a lesser aqueous sulfur concentration of 0.01 m, host rock sulfidation can explain the origin of arsenic and antimony minerals within the paragenetic sequence.
Wang, Huifang; Xiao, Bo; Wang, Mingyu; Shao, Ming'an
2013-01-01
Soil water retention parameters are critical to quantify flow and solute transport in vadose zone, while the presence of rock fragments remarkably increases their variability. Therefore a novel method for determining water retention parameters of soil-gravel mixtures is required. The procedure to generate such a model is based firstly on the determination of the quantitative relationship between the content of rock fragments and the effective saturation of soil-gravel mixtures, and then on the integration of this relationship with former analytical equations of water retention curves (WRCs). In order to find such relationships, laboratory experiments were conducted to determine WRCs of soil-gravel mixtures obtained with a clay loam soil mixed with shale clasts or pebbles in three size groups with various gravel contents. Data showed that the effective saturation of the soil-gravel mixtures with the same kind of gravels within one size group had a linear relation with gravel contents, and had a power relation with the bulk density of samples at any pressure head. Revised formulas for water retention properties of the soil-gravel mixtures are proposed to establish the water retention curved surface models of the power-linear functions and power functions. The analysis of the parameters obtained by regression and validation of the empirical models showed that they were acceptable by using either the measured data of separate gravel size group or those of all the three gravel size groups having a large size range. Furthermore, the regression parameters of the curved surfaces for the soil-gravel mixtures with a large range of gravel content could be determined from the water retention data of the soil-gravel mixtures with two representative gravel contents or bulk densities. Such revised water retention models are potentially applicable in regional or large scale field investigations of significantly heterogeneous media, where various gravel sizes and different gravel contents are present.
Wang, Huifang; Xiao, Bo; Wang, Mingyu; Shao, Ming'an
2013-01-01
Soil water retention parameters are critical to quantify flow and solute transport in vadose zone, while the presence of rock fragments remarkably increases their variability. Therefore a novel method for determining water retention parameters of soil-gravel mixtures is required. The procedure to generate such a model is based firstly on the determination of the quantitative relationship between the content of rock fragments and the effective saturation of soil-gravel mixtures, and then on the integration of this relationship with former analytical equations of water retention curves (WRCs). In order to find such relationships, laboratory experiments were conducted to determine WRCs of soil-gravel mixtures obtained with a clay loam soil mixed with shale clasts or pebbles in three size groups with various gravel contents. Data showed that the effective saturation of the soil-gravel mixtures with the same kind of gravels within one size group had a linear relation with gravel contents, and had a power relation with the bulk density of samples at any pressure head. Revised formulas for water retention properties of the soil-gravel mixtures are proposed to establish the water retention curved surface models of the power-linear functions and power functions. The analysis of the parameters obtained by regression and validation of the empirical models showed that they were acceptable by using either the measured data of separate gravel size group or those of all the three gravel size groups having a large size range. Furthermore, the regression parameters of the curved surfaces for the soil-gravel mixtures with a large range of gravel content could be determined from the water retention data of the soil-gravel mixtures with two representative gravel contents or bulk densities. Such revised water retention models are potentially applicable in regional or large scale field investigations of significantly heterogeneous media, where various gravel sizes and different gravel contents are present. PMID:23555040
NASA Astrophysics Data System (ADS)
Ferrera, Elisabetta; Giammanco, Salvatore; Cannata, Andrea; Montalto, Placido
2013-04-01
From November 2009 to April 2011, soil radon activity was continuously monitored using a Barasol® probe located on the upper NE flank of Mt. Etna volcano, close both to the Piano Provenzana fault and to the NE-Rift. Seismic and volcanological data have been analyzed together with the radon data. We also analyzed air and soil temperature, barometric pressure, snow and rain fall data. In order to find possible correlations among the above parameters, and hence to reveal possible anomalies in the radon time-series, we used different statistical methods: i) multivariate linear regression; ii) cross-correlation; iii) coherence analysis through the wavelet transform. Multivariate regression indicated a modest influence on soil radon from environmental parameters (R2 = 0.31). When using 100-day time windows, the R2 values showed wide variations in time, reaching their maxima (~0.63-0.66) during summer. Cross-correlation analysis over 100-day moving averages showed that, as with multivariate linear regression analysis, the summer period is characterised by the best correlation between radon data and environmental parameters. Lastly, the wavelet coherence analysis allowed a multi-resolution coherence analysis of the time series acquired. This approach makes it possible to study the relations among different signals in either the time or the frequency domain. It confirmed the results of the previous methods, but also made it possible to recognize correlations between radon and environmental parameters at different observation scales (e.g., radon activity changed during strong precipitation, but also during anomalous variations of soil temperature uncorrelated with seasonal fluctuations). Our work suggests that, in order to make an accurate analysis of the relations among distinct signals, it is necessary to use different techniques that give complementary analytical information. In particular, the wavelet analysis proved to be very effective in discriminating radon changes due to environmental influences from those correlated with impending seismic or volcanic events.
Understanding Coupling of Global and Diffuse Solar Radiation with Climatic Variability
NASA Astrophysics Data System (ADS)
Hamdan, Lubna
Global solar radiation data are very important for a wide variety of applications and scientific studies. However, these data are not readily available because of the cost of measuring equipment and the tedious maintenance and calibration requirements. A wide variety of models has been introduced by researchers to estimate and/or predict global solar radiation and its components (direct and diffuse radiation) using other readily obtainable atmospheric parameters. The goal of this research is to understand the coupling of global and diffuse solar radiation with climatic variability by investigating the relationships between these radiations and atmospheric parameters. For this purpose, we applied multilinear regression analysis to the data of the National Solar Radiation Database 1991-2010 Update. The analysis showed that the main atmospheric parameters that affect the amount of global radiation received on the earth's surface are cloud cover and relative humidity. Global radiation correlates negatively with both variables. Linear models are excellent approximations for the relationship between atmospheric parameters and global radiation. A linear model with the predictors total cloud cover, relative humidity, and extraterrestrial radiation is able to explain around 98% of the variability in global radiation. For diffuse radiation, the analysis showed that the main atmospheric parameters that affect the amount received on the earth's surface are cloud cover and aerosol optical depth. Diffuse radiation correlates positively with both variables. Linear models are very good approximations for the relationship between atmospheric parameters and diffuse radiation. A linear model with the predictors total cloud cover, aerosol optical depth, and extraterrestrial radiation is able to explain around 91% of the variability in diffuse radiation. Prediction analysis showed that the fitted linear models were able to predict diffuse radiation with a test adjusted R2 value of 0.93, using data on total cloud cover, aerosol optical depth, relative humidity and extraterrestrial radiation. However, for prediction purposes, using nonlinear terms or nonlinear models might enhance the prediction of diffuse radiation.
Linearly Supporting Feature Extraction for Automated Estimation of Stellar Atmospheric Parameters
NASA Astrophysics Data System (ADS)
Li, Xiangru; Lu, Yu; Comte, Georges; Luo, Ali; Zhao, Yongheng; Wang, Yongjun
2015-05-01
We describe a scheme to extract linearly supporting (LSU) features from stellar spectra to automatically estimate the atmospheric parameters T_eff, log g, and [Fe/H]. "Linearly supporting" means that the atmospheric parameters can be accurately estimated from the extracted features through a linear model. The successive steps of the process are as follows: first, decompose the spectrum using a wavelet packet (WP) and represent it by the derived decomposition coefficients; second, detect representative spectral features from the decomposition coefficients using the proposed method Least Absolute Shrinkage and Selection Operator (LARS)bs; third, estimate the atmospheric parameters T_eff, log g, and [Fe/H] from the detected features using a linear regression method. One prominent characteristic of this scheme is its ability to evaluate quantitatively the contribution of each detected feature to the atmospheric parameter estimate and also to trace back the physical significance of that feature. This work also shows that the usefulness of a component depends on both the wavelength and frequency. The proposed scheme has been evaluated on both real spectra from the Sloan Digital Sky Survey (SDSS)/SEGUE and synthetic spectra calculated from Kurucz's NEWODF models. On real spectra, we extracted 23 features to estimate T_eff, 62 features for log g, and 68 features for [Fe/H]. Test consistencies between our estimates and those provided by the Spectroscopic Parameter Pipeline of SDSS show that the mean absolute errors (MAEs) are 0.0062 dex for log T_eff (83 K for T_eff), 0.2345 dex for log g, and 0.1564 dex for [Fe/H]. For the synthetic spectra, the MAE test accuracies are 0.0022 dex for log T_eff (32 K for T_eff), 0.0337 dex for log g, and 0.0268 dex for [Fe/H].
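A rough sketch of the three-step pipeline on toy spectra: wavelet-packet decomposition, a sparsity-based pick of informative coefficients (plain Lasso here, as a stand-in for the paper's LARSbs detector), and a linear regression on the selected coefficients. The wavelet, decomposition level, and synthetic spectra are assumptions.

```python
import numpy as np
import pywt
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(11)

def wp_coeffs(spectrum, wavelet="db4", level=4):
    """Flatten the level-`level` wavelet packet coefficients of a spectrum."""
    wp = pywt.WaveletPacket(data=spectrum, wavelet=wavelet,
                            mode="symmetric", maxlevel=level)
    return np.concatenate([node.data for node in wp.get_level(level, "natural")])

# Synthetic "spectra" whose shape depends on a hidden parameter t
# (a stand-in for an atmospheric parameter such as T_eff).
n, npix = 300, 512
x = np.linspace(0, 1, npix)
t = rng.uniform(4000, 8000, n)
spectra = np.exp(-((x[None, :] - 0.5) ** 2) / (t[:, None] / 4e5)) \
          + rng.normal(0, 0.01, (n, npix))

features = np.vstack([wp_coeffs(s) for s in spectra])
X_tr, X_te, t_tr, t_te = train_test_split(features, t, random_state=0)

# Step 2: pick a small set of "linearly supporting" coefficients
# (top-k by absolute Lasso coefficient, a crude sparsity proxy).
selector = Lasso(alpha=1.0, max_iter=100000).fit(X_tr, t_tr)
picked = np.argsort(np.abs(selector.coef_))[-25:]

# Step 3: plain linear regression on the selected features only.
reg = LinearRegression().fit(X_tr[:, picked], t_tr)
mae = np.mean(np.abs(reg.predict(X_te[:, picked]) - t_te))
print(f"MAE on held-out spectra: {mae:.1f} K")
```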
Controls on the variability of net infiltration to desert sandstone
Heilweil, Victor M.; McKinney, Tim S.; Zhdanov, Michael S.; Watt, Dennis E.
2007-01-01
As populations grow in arid climates and desert bedrock aquifers are increasingly targeted for future development, understanding and quantifying the spatial variability of net infiltration becomes critically important for accurately inventorying water resources and mapping contamination vulnerability. This paper presents a conceptual model of net infiltration to desert sandstone and then develops an empirical equation for its spatial quantification at the watershed scale using linear least squares inversion methods for evaluating controlling parameters (independent variables) based on estimated net infiltration rates (dependent variables). Net infiltration rates used for this regression analysis were calculated from environmental tracers in boreholes and more than 3000 linear meters of vadose zone excavations in an upland basin in southwestern Utah underlain by Navajo sandstone. Soil coarseness, distance to upgradient outcrop, and topographic slope were shown to be the primary physical parameters controlling the spatial variability of net infiltration. Although the method should be transferable to other desert sandstone settings for determining the relative spatial distribution of net infiltration, further study is needed to evaluate the effects of other potential parameters such as slope aspect, outcrop parameters, and climate on absolute net infiltration rates.
Limits of detection and decision. Part 3
NASA Astrophysics Data System (ADS)
Voigtman, E.
2008-02-01
It has been shown that the MARLAP (Multi-Agency Radiological Laboratory Analytical Protocols) for estimating the Currie detection limit, which is based on 'critical values of the non-centrality parameter of the non-central t distribution', is intrinsically biased, even if no calibration curve or regression is used. This completed the refutation of the method, begun in Part 2. With the field cleared of obstructions, the true theory underlying Currie's limits of decision, detection and quantification, as they apply in a simple linear chemical measurement system (CMS) having heteroscedastic, Gaussian measurement noise and using weighted least squares (WLS) processing, was then derived. Extensive Monte Carlo simulations were performed, on 900 million independent calibration curves, for linear, "hockey stick" and quadratic noise precision models (NPMs). With errorless NPM parameters, all the simulation results were found to be in excellent agreement with the derived theoretical expressions. Even with as much as 30% noise on all of the relevant NPM parameters, the worst absolute errors in rates of false positives and false negatives, was only 0.3%.
High dimensional linear regression models under long memory dependence and measurement error
NASA Astrophysics Data System (ADS)
Kaul, Abhishek
This dissertation consists of three chapters. The first chapter introduces the models under consideration and motivates the problems of interest. A brief literature review is also provided in this chapter. The second chapter investigates the properties of the Lasso under long range dependent model errors. The Lasso is a computationally efficient approach to model selection and estimation, and its properties are well studied when the regression errors are independent and identically distributed. We study the case where the regression errors form a long memory moving average process. We establish a finite sample oracle inequality for the Lasso solution. We then show asymptotic sign consistency in this setup. These results are established in the high dimensional setup (p > n), where p can increase exponentially with n. Finally, we show the consistency, in particular the n^(1/2 - d)-consistency, of the Lasso, along with the oracle property of the adaptive Lasso, in the case where p is fixed. Here d is the memory parameter of the stationary error sequence. The performance of the Lasso is also analysed in the present setup with a simulation study. The third chapter proposes and investigates the properties of a penalized quantile based estimator for measurement error models. Standard formulations of prediction problems in high dimension regression models assume the availability of fully observed covariates and sub-Gaussian and homogeneous model errors. This makes these methods inapplicable to measurement error models where covariates are unobservable and observations are possibly non sub-Gaussian and heterogeneous. We propose weighted penalized corrected quantile estimators for the regression parameter vector in linear regression models with additive measurement errors, where the unobservable covariates are nonrandom. The proposed estimators forgo the need for the above mentioned model assumptions. We study these estimators in both the fixed dimensional and high dimensional sparse setups; in the latter setup, the dimensionality can grow exponentially with the sample size. In the fixed dimensional setting we provide the oracle properties associated with the proposed estimators. In the high dimensional setting, we provide bounds for the statistical error associated with the estimation that hold with asymptotic probability 1, thereby providing the ℓ1-consistency of the proposed estimator. We also establish model selection consistency in terms of the correctly estimated zero components of the parameter vector. A simulation study that investigates the finite sample accuracy of the proposed estimator is also included in this chapter.
Punzo, Antonio; Ingrassia, Salvatore; Maruotti, Antonello
2018-04-22
A time-varying latent variable model is proposed to jointly analyze multivariate mixed-support longitudinal data. The proposal can be viewed as an extension of hidden Markov regression models with fixed covariates (HMRMFCs), which are the state of the art for modelling longitudinal data, with a special focus on the underlying clustering structure. HMRMFCs are inadequate for applications in which a clustering structure can be identified in the distribution of the covariates, because the clustering they recover is independent of the covariate distribution. Here, hidden Markov regression models with random covariates are introduced by explicitly specifying state-specific distributions for the covariates, with the aim of improving the recovery of the clusters in the data with respect to a fixed covariates paradigm. The class of hidden Markov regression models with random covariates is defined with a focus on the exponential family, in a generalized linear model framework. Model identifiability conditions are sketched, an expectation-maximization algorithm is outlined for parameter estimation, and various implementation and operational issues are discussed. Properties of the estimators of the regression coefficients, as well as of the hidden path parameters, are evaluated through simulation experiments and compared with those of HMRMFCs. The method is applied to physical activity data. Copyright © 2018 John Wiley & Sons, Ltd.
Yang, Jian; Gong, Wei; Shi, Shuo; Du, Lin; Sun, Jia; Song, Shalei; Chen, Biwu; Zhang, Zhenbing
2016-01-01
Leaf nitrogen content (LNC) is a significant factor for monitoring the status of paddy rice, and it requires a reliable approach for fast and precise quantification. This investigation aims to quantitatively analyze the correlation between fluorescence parameters and LNC based on laser-induced fluorescence (LIF) technology. The fluorescence parameters exhibited a consistent positive linear correlation with LNC in different growing years (2014 and 2015) and different rice cultivars. The R2 of the models varied from 0.6978 to 0.9045. Support vector machine (SVM) was then utilized to verify the feasibility of the fluorescence parameters for monitoring LNC. Comparison of the fluorescence parameters indicated that F740 is the most sensitive to changes in LNC among all fluorescence parameters (the R2 of the linear regression between predicted and measured values ranged from 0.8475 to 0.9226, and REs ranged from 3.52% to 4.83%). Experimental results demonstrated that fluorescence parameters based on LIF technology combined with SVM constitute a potential method for realizing real-time, non-destructive monitoring of paddy rice LNC, which can provide guidance for the decision-making of farmers in their N fertilization strategies. PMID:27350029
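The following sketch mirrors the validation idea described above, an SVM regression of LNC on a fluorescence feature scored by the R2 between predicted and measured values; the feature values, the LNC relation and the SVR settings are invented for illustration, and scoring is in-sample for brevity.

```python
# Sketch with invented data: SVM regression of LNC on a single fluorescence
# feature (a stand-in for F740), scored in-sample for brevity.
import numpy as np
from sklearn.svm import SVR
from sklearn.metrics import r2_score

rng = np.random.default_rng(2)
f740 = rng.uniform(0.2, 1.0, 60)                       # hypothetical fluorescence values
lnc = 1.5 + 2.0 * f740 + rng.normal(0.0, 0.1, 60)      # hypothetical leaf nitrogen content

svr = SVR(kernel="rbf", C=10.0, epsilon=0.05).fit(f740.reshape(-1, 1), lnc)
pred = svr.predict(f740.reshape(-1, 1))
print(r2_score(lnc, pred))                             # R2 between predicted and measured
```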
Quantile regression for the statistical analysis of immunological data with many non-detects.
Eilers, Paul H C; Röder, Esther; Savelkoul, Huub F J; van Wijk, Roy Gerth
2012-07-07
Immunological parameters are hard to measure. A well-known problem is the occurrence of values below the detection limit, the non-detects. Non-detects are a nuisance, because classical statistical analyses, like ANOVA and regression, cannot be applied. The more advanced statistical techniques currently available for the analysis of datasets with non-detects can only be used if a small percentage of the data are non-detects. Quantile regression, a generalization of percentiles to regression models, models the median or higher percentiles and tolerates very high numbers of non-detects. We present a non-technical introduction and illustrate it with an application to real data from a clinical trial. We show that by using quantile regression, groups can be compared and that meaningful linear trends can be computed, even if more than half of the data consists of non-detects. Quantile regression is a valuable addition to the statistical methods that can be used for the analysis of immunological datasets with non-detects.
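A minimal sketch of median regression with non-detects, in the spirit of the approach above; the immunological response, group coding and detection limit are hypothetical, and non-detects are simply set to the detection limit before fitting.

```python
# Sketch: median regression with statsmodels on data containing non-detects
# (values below the detection limit set to the limit); all values are invented.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
group = rng.integers(0, 2, 200)                        # two hypothetical patient groups
response = np.exp(1.0 + 0.5 * group + rng.normal(0.0, 1.0, 200))
detection_limit = 2.0
y = np.maximum(response, detection_limit)              # non-detects pushed to the limit

fit = sm.QuantReg(y, sm.add_constant(group)).fit(q=0.5)   # median regression
print(fit.params)                                      # group contrast at the median
```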
Farsa, Oldřich
2013-01-01
The log BB parameter is the logarithm of the ratio of a compound's equilibrium concentrations in the brain tissue versus the blood plasma. This parameter is a useful descriptor in assessing the ability of a compound to permeate the blood-brain barrier. The aim of this study was to develop a Hansch-type linear regression QSAR model that correlates the parameter log BB and the retention time of drugs and other organic compounds on a reversed-phase HPLC containing an embedded amide moiety. The retention time was expressed by the capacity factor log k'. The second aim was to estimate the brain's absorption of 2-(azacycloalkyl)acetamidophenoxyacetic acids, which are analogues of piracetam, nefiracetam, and meclofenoxate. Notably, these acids may be novel nootropics. Two simple regression models that relate log BB and log k' were developed from an assay performed using a reversed-phase HPLC that contained an embedded amide moiety. Both the quadratic and linear models yielded statistical parameters comparable to previously published models of log BB dependence on various structural characteristics. The models predict that four members of the substituted phenoxyacetic acid series have a strong chance of permeating the barrier and being absorbed in the brain. The results of this study show that a reversed-phase HPLC system containing an embedded amide moiety is a functional in vitro surrogate of the blood-brain barrier. These results suggest that racetam-type nootropic drugs containing a carboxylic moiety could be more poorly absorbed than analogues devoid of the carboxyl group, especially if the compounds penetrate the barrier by a simple diffusion mechanism.
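A hedged sketch of the two model forms mentioned above, fitting log BB against the capacity factor log k' with both a linear and a quadratic polynomial; the (log k', log BB) pairs are invented and are not the study's data.

```python
# Sketch with invented (log k', log BB) pairs: compare linear and quadratic fits.
import numpy as np

logk = np.array([-0.5, -0.2, 0.0, 0.3, 0.6, 0.9, 1.2])
logbb = np.array([-1.1, -0.7, -0.5, -0.1, 0.2, 0.4, 0.5])

for degree in (1, 2):
    coef = np.polyfit(logk, logbb, degree)
    resid = logbb - np.polyval(coef, logk)
    print(degree, coef, "RSS =", float(resid @ resid))
```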
Relationship between masticatory performance using a gummy jelly and masticatory movement.
Uesugi, Hanako; Shiga, Hiroshi
2017-10-01
The purpose of this study was to clarify the relationship between masticatory performance using a gummy jelly and masticatory movement. Thirty healthy males were asked to chew a gummy jelly on their habitual chewing side for 20s, and the parameters of masticatory performance and masticatory movement were calculated as follows. For evaluating the masticatory performance, the amount of glucose extraction during chewing of a gummy jelly was measured. For evaluating the masticatory movement, the movement of the mandibular incisal point was recorded using the MKG K6-I, and ten parameters of the movement path (opening distance and masticatory width), movement rhythm (opening time, closing time, occluding time, and cycle time), stability of movement (stability of path and stability of rhythm), and movement velocity (opening maximum velocity and closing maximum velocity) were calculated from 10 cycles of chewing beginning with the fifth cycle. The relationship between the amount of glucose extraction and parameters representing masticatory movement was investigated and then stepwise multiple linear regression analysis was performed. The amount of glucose extraction was associated with 7 parameters representing the masticatory movement. Stepwise multiple linear regression analysis showed that the opening distance, closing time, stability of rhythm, and closing maximum velocity were the most important factors affecting the glucose extraction. From these results it was suggested that there was a close relation between masticatory performance and masticatory movement, and that the masticatory performance could be increased by rhythmic, rapid and stable mastication with a large opening distance. Copyright © 2017 Japan Prosthodontic Society. Published by Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Zhao, Wei; Fan, Shaojia; Guo, Hai; Gao, Bo; Sun, Jiaren; Chen, Laiguo
2016-11-01
The quantile regression (QR) method has been increasingly introduced into atmospheric environmental studies to explore the non-linear relationship between local meteorological conditions and ozone mixing ratios. In this study, we applied QR for the first time, together with multiple linear regression (MLR), to analyze the dominant meteorological parameters influencing the mean, 10th percentile, 90th percentile and 99th percentile of maximum daily 8-h average (MDA8) ozone concentrations in 2000-2015 in Hong Kong. Dominance analysis (DA) was used to assess the relative importance of meteorological variables in the regression models. Results showed that the MLR models worked better at suburban and rural sites than at urban sites, and worked better in winter than in summer. QR models performed better in summer for the 99th and 90th percentiles and better in autumn and winter for the 10th percentile; for the 10th percentile they also performed better in suburban and rural areas. The top 3 dominant variables associated with MDA8 ozone concentrations, which changed with season and region, were most frequently drawn from six meteorological parameters: boundary layer height, humidity, wind direction, surface solar radiation, total cloud cover and sea level pressure. Temperature rarely emerged as a significant variable in any season, which could partly explain the peak of monthly average ozone concentrations in October in Hong Kong. We also found that the effect of solar radiation was enhanced during extreme ozone pollution episodes (i.e., the 99th percentile). Finally, meteorological effects on MDA8 ozone showed no significant changes before and after the 2010 Asian Games.
Correlation among extinction efficiency and other parameters in an aggregate dust model
NASA Astrophysics Data System (ADS)
Dhar, Tanuj Kumar; Sekhar Das, Himadri
2017-10-01
We study the extinction properties of highly porous Ballistic Cluster-Cluster Aggregate dust aggregates in a wide range of complex refractive indices (1.4 ≤ n ≤ 2.0, 0.001 ≤ k ≤ 1.0) and wavelengths (0.11 μm ≤ λ ≤ 3.4 μm). An attempt has been made for the first time to investigate the correlation among extinction efficiency (Q_ext), composition of dust aggregates (n, k), wavelength of radiation (λ) and size parameter of the monomers (x). If k is fixed at any value between 0.001 and 1.0, Q_ext increases with increase of n from 1.4 to 2.0. Q_ext and n are correlated via linear regression when the cluster size is small, whereas the correlation is quadratic at moderate and higher sizes of the cluster. This feature is observed at all wavelengths (ultraviolet to optical to infrared). We also find that the variation of Q_ext with n is very small when λ is high. When n is fixed at any value between 1.4 and 2.0, it is observed that Q_ext and k are correlated via a polynomial regression equation (of degree 1, 2, 3 or 4), where the degree of the equation depends on the cluster size, n and λ. The correlation is linear for small size and quadratic/cubic/quartic for moderate and higher sizes. We have also found that Q_ext and x are correlated via a polynomial regression (of degree 3, 4 or 5) for all values of n. The degree of regression is found to be n- and k-dependent. The set of relations obtained from our work can be used to model interstellar extinction for dust aggregates in a wide range of wavelengths and complex refractive indices.
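The following sketch illustrates the kind of polynomial-regression comparison reported above, fitting Q_ext against n with increasing degree and comparing residual sums of squares; the assumed quadratic trend and noise level are placeholders, not the simulation outputs.

```python
# Sketch: polynomial fits of Q_ext versus n of increasing degree, compared by
# residual sum of squares; the underlying quadratic trend is an assumption.
import numpy as np

rng = np.random.default_rng(4)
n_refr = np.linspace(1.4, 2.0, 13)                     # real part of the refractive index
q_ext = 1.0 + 2.5 * (n_refr - 1.4) + 3.0 * (n_refr - 1.4) ** 2
q_ext = q_ext + rng.normal(0.0, 0.02, n_refr.size)     # small scatter

for degree in (1, 2, 3):
    coef = np.polyfit(n_refr, q_ext, degree)
    rss = float(np.sum((q_ext - np.polyval(coef, n_refr)) ** 2))
    print(f"degree {degree}: RSS = {rss:.5f}")
```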
Wang, Chao-Qun; Jia, Xiu-Hong; Zhu, Shu; Komatsu, Katsuko; Wang, Xuan; Cai, Shao-Qing
2015-03-01
A new quantitative analysis of multi-component with single marker (QAMS) method for 11 saponins (ginsenosides Rg1, Rb1, Rg2, Rh1, Rf, Re and Rd; notoginsenosides R1, R4, Fa and K) in notoginseng was established, in which 6 of these saponins were individually used as internal reference substances to investigate the influences of chemical structure, concentrations of the quantitative components, and purities of the standard substances on the accuracy of the QAMS method. The results showed that the concentration of the analyte in the sample solution was the major influencing parameter, whereas the other parameters had minimal influence on the accuracy of the QAMS method. A new method for calculating the relative correction factors by linear regression was established (the linear regression method), which was shown to decrease the differences between the QAMS method and the external standard method from 1.20%±0.02% - 23.29%±3.23% to 0.10%±0.09% - 8.84%±2.85% in comparison with the previous method. The differences between the external standard method and the QAMS method using relative correction factors calculated by the linear regression method were below 5% in the quantitative determination of Rg1, Re, R1, Rd and Fa in 24 notoginseng samples and of Rb1 in 21 notoginseng samples, and were mostly below 10% in the quantitative determination of Rf, Rg2, R4 and N-K (the differences for these 4 constituents were bigger because their contents were lower) in all 24 notoginseng samples. The results indicated that the contents assayed by the new QAMS method could be considered as accurate as those assayed by the external standard method. In addition, a method for determining the applicable concentration ranges of the quantitative components assayed by the QAMS method was established for the first time, which could ensure its high accuracy and could be applied to QAMS methods of other TCMs. The present study demonstrated the practicability of the QAMS method for multi-component quantitative analysis and the quality control of TCMs and TCM prescriptions. Copyright © 2014 Elsevier B.V. All rights reserved.
Regression modeling of ground-water flow
Cooley, R.L.; Naff, R.L.
1985-01-01
Nonlinear multiple regression methods are developed to model and analyze groundwater flow systems. Complete descriptions of regression methodology as applied to groundwater flow models allow scientists and engineers engaged in flow modeling to apply the methods to a wide range of problems. Organization of the text proceeds from an introduction that discusses the general topic of groundwater flow modeling, to a review of basic statistics necessary to properly apply regression techniques, and then to the main topic: exposition and use of linear and nonlinear regression to model groundwater flow. Statistical procedures are given to analyze and use the regression models. A number of exercises and answers are included to exercise the student on nearly all the methods that are presented for modeling and statistical analysis. Three computer programs implement the more complex methods. These three are a general two-dimensional, steady-state regression model for flow in an anisotropic, heterogeneous porous medium, a program to calculate a measure of model nonlinearity with respect to the regression parameters, and a program to analyze model errors in computed dependent variables such as hydraulic head. (USGS)
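As a generic illustration of nonlinear regression applied to a flow-like observation model (not the USGS programs described above), the sketch below estimates two parameters of an assumed exponential head-decline curve with scipy; the model form, data and noise level are assumptions.

```python
# Generic nonlinear regression sketch (not the USGS programs): estimate the
# parameters of an assumed exponential head-decline model from noisy heads.
import numpy as np
from scipy.optimize import curve_fit

def head_model(t, h0, k):
    """Hypothetical head decline h(t) = h0 * exp(-k t)."""
    return h0 * np.exp(-k * t)

rng = np.random.default_rng(5)
t_obs = np.linspace(0.0, 10.0, 20)
h_obs = head_model(t_obs, 5.0, 0.3) + rng.normal(0.0, 0.05, t_obs.size)

params, cov = curve_fit(head_model, t_obs, h_obs, p0=(1.0, 0.1))
print(params)                                          # estimated h0 and k
print(np.sqrt(np.diag(cov)))                           # approximate standard errors
```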
Abusam, A; Keesman, K J; van Straten, G; Spanjers, H; Meinema, K
2001-01-01
When applied to large simulation models, the process of parameter estimation is also called calibration. Calibration of complex non-linear systems, such as activated sludge plants, is often not an easy task. On the one hand, manual calibration of such complex systems is usually time-consuming, and its results are often not reproducible. On the other hand, conventional automatic calibration methods are not always straightforward and often hampered by local minima problems. In this paper a new straightforward and automatic procedure, which is based on the response surface method (RSM) for selecting the best identifiable parameters, is proposed. In RSM, the process response (output) is related to the levels of the input variables in terms of a first- or second-order regression model. Usually, RSM is used to relate measured process output quantities to process conditions. However, in this paper RSM is used for selecting the dominant parameters, by evaluating parameters sensitivity in a predefined region. Good results obtained in calibration of ASM No. 1 for N-removal in a full-scale oxidation ditch proved that the proposed procedure is successful and reliable.
NASA Technical Reports Server (NTRS)
Ouyang, X.; Selby, K.; Lang, P.; Engelke, K.; Klifa, C.; Fan, B.; Zucconi, F.; Hottya, G.; Chen, M.; Majumdar, S.;
1997-01-01
A high-resolution magnetic resonance imaging (MRI) protocol, together with specialized image processing techniques, was applied to the quantitative measurement of age-related changes in calcaneal trabecular structure. The reproducibility of the technique was assessed and the annual rates of change for several trabecular structure parameters were measured. The MR-derived trabecular parameters were compared with calcaneal bone mineral density (BMD), measured by dual X-ray absorptiometry (DXA) in the same subjects. Sagittal MR images were acquired at 1.5 T in 23 healthy women (mean age: 49.3 +/- 16.6 [SD]), using a three-dimensional gradient echo sequence. Image analysis procedures included internal gray-scale calibration, bone and marrow segmentation, and run-length methods. Three trabecular structure parameters, apparent bone volume (ABV/TV), intercept thickness (I.Th), and intercept separation (I.Sp) were calculated from the MR images. The short- and long-term precision errors (mean %CV) of these measured parameters were in the ranges 1-2% and 3-6%, respectively. Linear regression of the trabecular structure parameters vs. age showed significant correlation: ABV/TV (r2 = 33.7%, P < 0.0037), I.Th (r2 = 26.6%, P < 0.0118), I.Sp (r2 = 28.9%, P < 0.0081). These trends with age were also expressed as annual rates of change: ABV/TV (-0.52%/year), I.Th (-0.33%/year), and I.Sp (0.59%/year). Linear regression analysis also showed significant correlation between the MR-derived trabecular structure parameters and calcaneal BMD values. Although a larger group of subjects is needed to better define the age-related changes in trabecular structure parameters and their relation to BMD, these preliminary results demonstrate that high-resolution MRI may potentially be useful for the quantitative assessment of trabecular structure.
NASA Astrophysics Data System (ADS)
Tiberi, Lara; Costa, Giovanni
2017-04-01
The possibility of directly associating damage with ground motion parameters is always a great challenge, in particular for civil protection agencies. Indeed, a ground motion parameter estimated in near real time that can express the damage occurring after an earthquake is fundamental for organizing first assistance after an event. The aim of this work is to contribute to the estimation of the ground motion parameter that best describes the observed intensity immediately after an event. This can be done by calculating, for each ground motion parameter estimated in near real time, a regression law that correlates the parameter with the observed macroseismic intensity. This estimation is carried out by collecting high quality accelerometric data in the near field and filtering them at different frequency steps. The regression laws are calculated using two different techniques: the non-linear least-squares (NLLS) Marquardt-Levenberg algorithm and the orthogonal distance regression (ODR) methodology. The limits of the first methodology are the need for initial values of the parameters a and b (set to 1.0 in this study) and the constraint that the independent variable must be known with greater accuracy than the dependent variable. The second algorithm, in contrast, is based on the estimation of the errors perpendicular to the line rather than just vertically: the vertical errors are the errors in the 'y' direction only, i.e. for the dependent variable, whereas the perpendicular errors take into account errors in both the dependent and the independent variables. This also makes it possible to invert the relation directly, so the a and b values can be used to express the ground motion parameters as a function of I. For each law the standard deviation and R2 value are estimated in order to test the quality and reliability of the relation found. The Amatrice earthquake of 24th August 2016 is used as a case study to test the goodness of the calculated regression laws.
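A hedged sketch of the two fitting routes named above, ordinary (Levenberg-Marquardt) nonlinear least squares versus orthogonal distance regression, applied to an assumed linear intensity-log(PGA) relation with synthetic errors on both variables.

```python
# Sketch: the same assumed line I = a + b*log10(PGA) fitted two ways, with
# synthetic noise on both variables; values and noise levels are assumptions.
import numpy as np
from scipy.optimize import curve_fit
from scipy import odr

rng = np.random.default_rng(6)
log_pga_true = rng.uniform(-1.0, 1.0, 50)
intensity = 4.0 + 2.0 * log_pga_true + rng.normal(0.0, 0.3, 50)
log_pga = log_pga_true + rng.normal(0.0, 0.1, 50)      # error on the independent variable too

def line(x, a, b):
    return a + b * x

nlls, _ = curve_fit(line, log_pga, intensity, p0=(1.0, 1.0))   # Levenberg-Marquardt NLLS

odr_fit = odr.ODR(odr.RealData(log_pga, intensity, sx=0.1, sy=0.3),
                  odr.Model(lambda beta, x: beta[0] + beta[1] * x),
                  beta0=[1.0, 1.0]).run()

print("NLLS:", nlls)
print("ODR :", odr_fit.beta)
```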
On the calibration process of film dosimetry: OLS inverse regression versus WLS inverse prediction.
Crop, F; Van Rompaye, B; Paelinck, L; Vakaet, L; Thierens, H; De Wagter, C
2008-07-21
The purpose of this study was both to put forward a statistically correct model for film calibration and to optimize this process. A reliable calibration is needed in order to perform accurate reference dosimetry with radiographic (Gafchromic) film. Sometimes, an ordinary least squares simple linear (in the parameters) regression is applied to the dose-optical-density (OD) curve, with the dose as a function of OD (inverse regression) or with OD as a function of dose (inverse prediction). The application of a simple linear regression fit is an invalid method because the heteroscedasticity of the data is not taken into account. This could lead to erroneous results originating from the calibration process itself and thus to a lower accuracy. In this work, we compare the ordinary least squares (OLS) inverse regression method with the correct weighted least squares (WLS) inverse prediction method to create calibration curves. We found that the OLS inverse regression method could lead to a prediction bias of up to 7.3 cGy at 300 cGy and total prediction errors of 3% or more for Gafchromic EBT film. Application of the WLS inverse prediction method resulted in a maximum prediction bias of 1.4 cGy and total prediction errors below 2% in a 0-400 cGy range. We developed a Monte-Carlo-based process to optimize calibrations, depending on the needs of the experiment. This type of thorough analysis can lead to a higher accuracy for film dosimetry.
Standardization of domestic frying processes by an engineering approach.
Franke, K; Strijowski, U
2011-05-01
An approach was developed to enable a better standardization of domestic frying of potato products. For this purpose, 5 domestic fryers differing in heating power and oil capacity were used. A well-defined frying process using a highly standardized model product and a broad range of frying conditions was carried out in these fryers, and the development of browning, an important quality parameter, was measured. Product-to-oil ratio, oil temperature, and frying time were varied. Quite different color changes were measured in the different fryers although the same frying process parameters were applied. The specific energy consumption for water evaporation (spECWE) during frying, related to the product amount, was determined for all frying processes to define an engineering parameter characterizing the frying process. A quasi-linear regression approach was applied to calculate this parameter from the frying process settings and fryer properties. The high significance of the regression coefficients and a coefficient of determination close to unity confirmed the suitability of this approach. Based on this regression equation, curves for standard frying conditions (SFC curves) were calculated which describe the frying conditions required to obtain the same level of spECWE in the different domestic fryers. Comparison of browning results from the different fryers operated at conditions near the SFC curves confirmed the applicability of the approach. © 2011 Institute of Food Technologists®
A Technique of Fuzzy C-Mean in Multiple Linear Regression Model toward Paddy Yield
NASA Astrophysics Data System (ADS)
Syazwan Wahab, Nur; Saifullah Rusiman, Mohd; Mohamad, Mahathir; Amira Azmi, Nur; Che Him, Norziha; Ghazali Kamardan, M.; Ali, Maselan
2018-04-01
In this paper, we propose a hybrid model which is a combination of a multiple linear regression model and the fuzzy c-means method. This research involves the relationship between 20 topsoil variates, analyzed prior to planting, and paddy yields at standard fertilizer rates. The data used were from the multi-location trials for rice carried out by MARDI at major paddy granaries in Peninsular Malaysia during the period from 2009 to 2012. Missing observations were estimated using mean estimation techniques. The data were analyzed using a multiple linear regression model and a combination of the multiple linear regression model and the fuzzy c-means method. Analysis of normality and multicollinearity indicates that the data are normally scattered without multicollinearity among the independent variables. Fuzzy c-means analysis clusters the paddy yield into two clusters before the multiple linear regression model is applied. The comparison between the two methods indicates that the hybrid of the multiple linear regression model and the fuzzy c-means method outperforms the multiple linear regression model, with a lower mean square error.
Anderson, Carl A; McRae, Allan F; Visscher, Peter M
2006-07-01
Standard quantitative trait loci (QTL) mapping techniques commonly assume that the trait is both fully observed and normally distributed. When considering survival or age-at-onset traits these assumptions are often incorrect. Methods have been developed to map QTL for survival traits; however, they are both computationally intensive and not available in standard genome analysis software packages. We propose a grouped linear regression method for the analysis of continuous survival data. Using simulation we compare this method to both the Cox and Weibull proportional hazards models and a standard linear regression method that ignores censoring. The grouped linear regression method is of equivalent power to both the Cox and Weibull proportional hazards methods and is significantly better than the standard linear regression method when censored observations are present. The method is also robust to the proportion of censored individuals and the underlying distribution of the trait. On the basis of linear regression methodology, the grouped linear regression model is computationally simple and fast and can be implemented readily in freely available statistical software.
Verification of spectrophotometric method for nitrate analysis in water samples
NASA Astrophysics Data System (ADS)
Kurniawati, Puji; Gusrianti, Reny; Dwisiwi, Bledug Bernanti; Purbaningtias, Tri Esti; Wiyantoko, Bayu
2017-12-01
The aim of this research was to verify the spectrophotometric method for analyzing nitrate in water samples using the APHA 2012 Section 4500 NO3-B method. The verification parameters used were linearity, method detection limit, limit of quantitation, level of linearity, accuracy and precision. Linearity was assessed using 0 to 50 mg/L nitrate standard solutions, and the correlation coefficient of the standard calibration linear regression equation was 0.9981. The method detection limit (MDL) was determined to be 0.1294 mg/L and the limit of quantitation (LOQ) was 0.4117 mg/L. The level of linearity (LOL) was 50 mg/L, and nitrate concentrations from 10 to 50 mg/L were linear at a confidence level of 99%. The accuracy, determined through the recovery value, was 109.1907%. Precision was assessed as the percent relative standard deviation (%RSD) of repeatability, which was 1.0886%. The tested performance criteria showed that the methodology was verified under the laboratory conditions.
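A minimal sketch of the linearity, recovery and repeatability calculations described above; the absorbance readings, spike recovery and replicate results are invented placeholders, and the MDL/LOQ computation prescribed by the APHA method is not reproduced here.

```python
# Sketch with invented readings: calibration line and correlation coefficient,
# a recovery check, and %RSD of repeatability.
import numpy as np
from scipy.stats import linregress

conc = np.array([0.0, 10.0, 20.0, 30.0, 40.0, 50.0])               # mg/L standards
absorbance = np.array([0.002, 0.110, 0.221, 0.330, 0.442, 0.551])  # hypothetical readings

cal = linregress(conc, absorbance)
print("r =", cal.rvalue, "slope =", cal.slope, "intercept =", cal.intercept)

spiked_true, spiked_found = 20.0, 21.8                 # hypothetical spike and result
print("recovery % =", 100.0 * spiked_found / spiked_true)

replicates = np.array([24.9, 25.2, 25.1, 24.8, 25.3, 25.0, 25.1])  # repeatability series, mg/L
print("%RSD =", 100.0 * replicates.std(ddof=1) / replicates.mean())
```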
Female Literacy Rate is a Better Predictor of Birth Rate and Infant Mortality Rate in India
Saurabh, Suman; Sarkar, Sonali; Pandey, Dhruv K.
2013-01-01
Background: Educated women are known to take informed reproductive and healthcare decisions. These result in population stabilization and better infant care reflected by lower birth rates and infant mortality rates (IMRs), respectively. Materials and Methods: Our objective was to study the relationship of male and female literacy rates with crude birth rates (CBRs) and IMRs of the states and union territories (UTs) of India. The data were analyzed using linear regression. CBR and IMR were taken as the dependent variables; while the overall literacy rates, male, and female literacy rates were the independent variables. Results: CBRs were inversely related to literacy rates (slope parameter = −0.402, P < 0.001). On multiple linear regression with male and female literacy rates, a significant inverse relationship emerged between female literacy rate and CBR (slope = −0.363, P < 0.001), while male literacy rate was not significantly related to CBR (P = 0.674). IMR of the states were also inversely related to their literacy rates (slope = −1.254, P < 0.001). Multiple linear regression revealed a significant inverse relationship between IMR and female literacy (slope = −0.816, P = 0.031), whereas male literacy rate was not significantly related (P = 0.630). Conclusion: Female literacy is relatively highly important for both population stabilization and better infant health. PMID:26664840
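A hedged sketch of the multiple linear regression reported above (CBR regressed on male and female literacy rates); the state-level data are fabricated placeholders chosen so that the female-literacy slope dominates, as in the study's findings.

```python
# Sketch with fabricated state-level data: CBR regressed on male and female
# literacy rates; coefficients chosen so the female-literacy effect dominates.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
states = pd.DataFrame({
    "female_lit": rng.uniform(50.0, 95.0, 35),
    "male_lit": rng.uniform(70.0, 98.0, 35),
})
states["cbr"] = (40.0 - 0.36 * states["female_lit"]
                 - 0.02 * states["male_lit"] + rng.normal(0.0, 1.0, 35))

fit = smf.ols("cbr ~ female_lit + male_lit", data=states).fit()
print(fit.params)
print(fit.pvalues)
```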
Linear regression crash prediction models : issues and proposed solutions.
DOT National Transportation Integrated Search
2010-05-01
The paper develops a linear regression model approach that can be applied to crash data to predict vehicle crashes. The proposed approach involves novel data aggregation to satisfy linear regression assumptions, namely error structure normality ...
Determination of water depth with high-resolution satellite imagery over variable bottom types
Stumpf, Richard P.; Holderied, Kristine; Sinclair, Mark
2003-01-01
A standard algorithm for determining depth in clear water from passive sensors exists; but it requires tuning of five parameters and does not retrieve depths where the bottom has an extremely low albedo. To address these issues, we developed an empirical solution using a ratio of reflectances that has only two tunable parameters and can be applied to low-albedo features. The two algorithms--the standard linear transform and the new ratio transform--were compared through analysis of IKONOS satellite imagery against lidar bathymetry. The coefficients for the ratio algorithm were tuned manually to a few depths from a nautical chart, yet performed as well as the linear algorithm tuned using multiple linear regression against the lidar. Both algorithms compensate for variable bottom type and albedo (sand, pavement, algae, coral) and retrieve bathymetry in water depths of less than 10-15 m. However, the linear transform does not distinguish depths >15 m and is more subject to variability across the studied atolls. The ratio transform can, in clear water, retrieve depths in >25 m of water and shows greater stability between different areas. It also performs slightly better in scattering turbidity than the linear transform. The ratio algorithm is somewhat noisier and cannot always adequately resolve fine morphology (structures smaller than 4-5 pixels) in water depths >15-20 m. In general, the ratio transform is more robust than the linear transform.
NASA Astrophysics Data System (ADS)
Anyalebechi, P. N.
Reported experimentally determined values of hydrogen solubility in liquid and solid Al-H and Al-H-X (where X = Cu, Si, Zn, Mg, Li, Fe or Ti) systems have been critically reviewed and analyzed in terms of Wagner's interaction parameter. An attempt has been made to use Wagner's interaction parameter and statistical linear regression models derived from reported hydrogen solubility limits for binary aluminum alloys to predict the hydrogen solubility limits in liquid and solid (commercial) multicomponent aluminum alloys. Reasons for the observed poor agreement between the predicted and experimentally determined hydrogen solubility limits are discussed.
do Prado, Mara Rúbia Maciel Cardoso; Oliveira, Fabiana de Cássia Carvalho; Assis, Karine Franklin; Ribeiro, Sarah Aparecida Vieira; do Prado Junior, Pedro Paulo; Sant'Ana, Luciana Ferreira da Rocha; Priore, Silvia Eloiza; Franceschini, Sylvia do Carmo Castro
2015-01-01
To assess the prevalence of vitamin D deficiency and its associated factors in women and their newborns in the postpartum period. This cross-sectional study evaluated vitamin D deficiency/insufficiency in 226 women and their newborns in Viçosa (Minas Gerais, BR) between December 2011 and November 2012. Cord blood and venous maternal blood were collected to evaluate the following biochemical parameters: vitamin D, alkaline phosphatase, calcium, phosphorus and parathyroid hormone. Poisson regression analysis, with a confidence interval of 95%, was applied to assess vitamin D deficiency and its associated factors. Multiple linear regression analysis was performed to identify factors associated with 25(OH)D deficiency in the newborns and women from the study. The criterion for variable inclusion in the multiple linear regression model was an association with the dependent variable in the simple linear regression analysis, considering p<0.20. The significance level was α<5%. Of the 226 women included, 200 (88.5%) were 20 to 44 years old; the median age was 28 years. Deficient/insufficient levels of vitamin D were found in 192 (85%) women and in 182 (80.5%) neonates. The maternal 25(OH)D and alkaline phosphatase levels were independently associated with vitamin D deficiency in infants. This study identified a high prevalence of vitamin D deficiency and insufficiency in women and newborns, and an association between the maternal nutritional status of vitamin D and their infants' vitamin D status. Copyright © 2015 Sociedade de Pediatria de São Paulo. Publicado por Elsevier Editora Ltda. All rights reserved.
Qiu, Shanshan; Wang, Jun; Gao, Liping
2014-07-09
An electronic nose (E-nose) and an electronic tongue (E-tongue) were used to characterize five types of strawberry juices based on processing approaches (i.e., microwave pasteurization, steam blanching, high temperature short time pasteurization, frozen-thawed, and freshly squeezed). Juice quality parameters (vitamin C, pH, total soluble solids, total acid, and sugar/acid ratio) were detected by traditional measuring methods. Multivariate statistical methods (linear discriminant analysis (LDA) and partial least squares regression (PLSR)) and machine learning methods (Random Forest (RF) and Support Vector Machines) were employed for qualitative classification and quantitative regression. The E-tongue system reached higher accuracy rates than the E-nose did, and their simultaneous utilization did have an advantage in LDA classification and PLSR regression. According to cross-validation, RF showed outstanding and indisputable performance in both the qualitative and quantitative analysis. This work indicates that the simultaneous utilization of an E-nose and an E-tongue can discriminate processed fruit juices and predict quality parameters successfully for the beverage industry.
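The sketch below shows the PLSR step in isolation with scikit-learn, regressing a juice quality parameter on fused sensor responses; the sensor matrix, the quality target and the number of latent components are assumptions.

```python
# Sketch: PLSR with scikit-learn on synthetic fused sensor data predicting a
# quality parameter (e.g., total soluble solids); everything here is assumed.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(8)
X = rng.standard_normal((60, 15))                      # 15 hypothetical sensor channels
tss = 8.0 + X[:, :3] @ np.array([0.8, -0.5, 0.3]) + rng.normal(0.0, 0.2, 60)

pls = PLSRegression(n_components=3).fit(X, tss)
print(r2_score(tss, pls.predict(X).ravel()))           # in-sample R2, for brevity
```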
Ding, Aidong Adam; Hsieh, Jin-Jian; Wang, Weijing
2015-01-01
Bivariate survival analysis has wide applications. In the presence of covariates, most literature focuses on studying their effects on the marginal distributions. However covariates can also affect the association between the two variables. In this article we consider the latter issue by proposing a nonstandard local linear estimator for the concordance probability as a function of covariates. Under the Clayton copula, the conditional concordance probability has a simple one-to-one correspondence with the copula parameter for different data structures including those subject to independent or dependent censoring and dependent truncation. The proposed method can be used to study how covariates affect the Clayton association parameter without specifying marginal regression models. Asymptotic properties of the proposed estimators are derived and their finite-sample performances are examined via simulations. Finally, for illustration, we apply the proposed method to analyze a bone marrow transplant data set.
Non-linear auto-regressive models for cross-frequency coupling in neural time series
Tallot, Lucille; Grabot, Laetitia; Doyère, Valérie; Grenier, Yves; Gramfort, Alexandre
2017-01-01
We address the issue of reliably detecting and quantifying cross-frequency coupling (CFC) in neural time series. Based on non-linear auto-regressive models, the proposed method provides a generative and parametric model of the time-varying spectral content of the signals. As this method models the entire spectrum simultaneously, it avoids the pitfalls related to incorrect filtering or the use of the Hilbert transform on wide-band signals. As the model is probabilistic, it also provides a score of the model “goodness of fit” via the likelihood, enabling easy and legitimate model selection and parameter comparison; this data-driven feature is unique to our model-based approach. Using three datasets obtained with invasive neurophysiological recordings in humans and rodents, we demonstrate that these models are able to replicate previous results obtained with other metrics, but also reveal new insights such as the influence of the amplitude of the slow oscillation. Using simulations, we demonstrate that our parametric method can reveal neural couplings with shorter signals than non-parametric methods. We also show how the likelihood can be used to find optimal filtering parameters, suggesting new properties on the spectrum of the driving signal, but also to estimate the optimal delay between the coupled signals, enabling a directionality estimation in the coupling. PMID:29227989
Zhao, Rui; Catalano, Paul; DeGruttola, Victor G.; Michor, Franziska
2017-01-01
The dynamics of tumor burden, secreted proteins or other biomarkers over time, is often used to evaluate the effectiveness of therapy and to predict outcomes for patients. Many methods have been proposed to investigate longitudinal trends to better characterize patients and to understand disease progression. However, most approaches assume a homogeneous patient population and a uniform response trajectory over time and across patients. Here, we present a mixture piecewise linear Bayesian hierarchical model, which takes into account both population heterogeneity and nonlinear relationships between biomarkers and time. Simulation results show that our method was able to classify subjects according to their patterns of treatment response with greater than 80% accuracy in the three scenarios tested. We then applied our model to a large randomized controlled phase III clinical trial of multiple myeloma patients. Analysis results suggest that the longitudinal tumor burden trajectories in multiple myeloma patients are heterogeneous and nonlinear, even among patients assigned to the same treatment cohort. In addition, between cohorts, there are distinct differences in terms of the regression parameters and the distributions among categories in the mixture. Those results imply that longitudinal data from clinical trials may harbor unobserved subgroups and nonlinear relationships; accounting for both may be important for analyzing longitudinal data. PMID:28723910
The Application of the Cumulative Logistic Regression Model to Automated Essay Scoring
ERIC Educational Resources Information Center
Haberman, Shelby J.; Sinharay, Sandip
2010-01-01
Most automated essay scoring programs use a linear regression model to predict an essay score from several essay features. This article applied a cumulative logit model instead of the linear regression model to automated essay scoring. Comparison of the performances of the linear regression model and the cumulative logit model was performed on a…
Frequency distributions and correlations of solar X-ray flare parameters
NASA Technical Reports Server (NTRS)
Crosby, Norma B.; Aschwanden, Markus J.; Dennis, Brian R.
1993-01-01
Frequency distributions of flare parameters are determined from over 12,000 solar flares. The flare duration, the peak counting rate, the peak hard X-ray flux, the total energy in electrons, and the peak energy flux in electrons are among the parameters studied. Linear regression fits, as well as the slopes of the frequency distributions, are used to determine the correlations between these parameters. The relationship between the variations of the frequency distributions and the solar activity cycle is also investigated. Theoretical models for the frequency distribution of flare parameters are dependent on the probability of flaring and the temporal evolution of the flare energy build-up. The results of this study are consistent with stochastic flaring and exponential energy build-up. The average build-up time constant is found to be 0.5 times the mean time between flares.
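As an illustration of the standard procedure behind such frequency distributions (not the instrument-specific analysis above), the sketch below fits a power-law slope to a simulated peak-flux distribution by linear regression in log-log space; the index and sample are assumptions.

```python
# Sketch: simulate peak fluxes from a power law dN/dS ~ S**(-1.8), bin them,
# and recover the slope by linear regression of log(dN/dS) on log(S).
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(9)
alpha_true = 1.8
flux = (1.0 - rng.uniform(size=5000)) ** (-1.0 / (alpha_true - 1.0))  # Pareto draws, S >= 1

edges = np.logspace(0.0, 2.0, 25)
counts, _ = np.histogram(flux, bins=edges)
centers = np.sqrt(edges[:-1] * edges[1:])
density = counts / np.diff(edges)                      # differential distribution dN/dS
keep = counts > 0                                      # skip empty bins before taking logs

fit = linregress(np.log10(centers[keep]), np.log10(density[keep]))
print("fitted slope:", fit.slope)                      # should be close to -1.8
```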
NASA Technical Reports Server (NTRS)
Wu, Man Li C.; Schubert, Siegfried; Lin, Ching I.; Stajner, Ivanka; Einaudi, Franco (Technical Monitor)
2000-01-01
A method is developed for validating model-based estimates of atmospheric moisture and ground temperature using satellite data. The approach relates errors in estimates of clear-sky longwave fluxes at the top of the Earth-atmosphere system to errors in geophysical parameters. The fluxes include clear-sky outgoing longwave radiation (CLR) and radiative flux in the window region between 8 and 12 microns (RadWn). The approach capitalizes on the availability of satellite estimates of CLR and RadWn and other auxiliary satellite data, and multiple global four-dimensional data assimilation (4-DDA) products. The basic methodology employs off-line forward radiative transfer calculations to generate synthetic clear-sky longwave fluxes from two different 4-DDA data sets. Simple linear regression is used to relate the clear-sky longwave flux discrepancies to discrepancies in ground temperature ((delta)T(sub g)) and broad-layer integrated atmospheric precipitable water ((delta)pw). The slopes of the regression lines define sensitivity parameters which can be exploited to help interpret mismatches between satellite observations and model-based estimates of clear-sky longwave fluxes. For illustration we analyze the discrepancies in the clear-sky longwave fluxes between an early implementation of the Goddard Earth Observing System Data Assimilation System (GEOS2) and a recent operational version of the European Centre for Medium-Range Weather Forecasts data assimilation system. The analysis of the synthetic clear-sky flux data shows that simple linear regression employing (delta)T(sub g) and broad-layer (delta)pw provides a good approximation to the full radiative transfer calculations, typically explaining more than 90% of the 6-hourly variance in the flux differences. These simple regression relations can be inverted to "retrieve" the errors in the geophysical parameters. Uncertainties (normalized by standard deviation) in the monthly mean retrieved parameters range from 7% for (delta)T(sub g) to approx. 20% for the lower tropospheric moisture between 500 hPa and the surface. The regression relationships developed from the synthetic flux data, together with CLR and RadWn observed with the Clouds and Earth Radiant Energy System instrument, are used to assess the quality of the GEOS2 T(sub g) and pw. Results showed that the GEOS2 T(sub g) is too cold over land, and pw in the upper layers is too high over the tropical oceans and too low in the lower atmosphere.
NASA Astrophysics Data System (ADS)
Zhang, Ying; Bi, Peng; Hiller, Janet
2008-01-01
This is the first study to identify appropriate regression models for the association between climate variation and salmonellosis transmission. A comparison between different regression models was conducted using surveillance data in Adelaide, South Australia. Using notified salmonellosis cases and climatic variables from the Adelaide metropolitan area over the period 1990-2003, four regression methods were examined: standard Poisson regression, autoregressive adjusted Poisson regression, multiple linear regression, and a seasonal autoregressive integrated moving average (SARIMA) model. Notified salmonellosis cases in 2004 were used to test the forecasting ability of the four models. Parameter estimation, goodness-of-fit and forecasting ability of the four regression models were compared. Temperatures occurring 2 weeks prior to cases were positively associated with cases of salmonellosis. Rainfall was also inversely related to the number of cases. The comparison of goodness-of-fit and forecasting ability suggests that the SARIMA model is better than the other three regression models. Temperature and rainfall may be used as climatic predictors of salmonellosis cases in regions with climatic characteristics similar to those of Adelaide. The SARIMA model could, thus, be adopted to quantify the relationship between climate variations and salmonellosis transmission.
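A hedged sketch of fitting a seasonal ARIMA with an exogenous climate predictor, in the spirit of the SARIMA comparison above; the weekly structure, model orders and simulated series are assumptions, not the Adelaide data.

```python
# Sketch: weekly case counts regressed on a lagged temperature proxy with a
# seasonal ARIMA; orders, the 52-week seasonality and the series are assumed.
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(10)
n = 208                                                # four hypothetical years of weeks
temp = 20.0 + 8.0 * np.sin(2.0 * np.pi * np.arange(n) / 52.0) + rng.normal(0.0, 1.0, n)
lagged_temp = np.roll(temp, 2)                         # crude 2-week lag (wraps at the ends)
cases = 30.0 + 0.8 * lagged_temp + rng.normal(0.0, 3.0, n)

fit = SARIMAX(cases, exog=lagged_temp, order=(1, 0, 1),
              seasonal_order=(1, 0, 0, 52)).fit(disp=False)
print(fit.params)
print(fit.aic)                                         # for comparing candidate models
```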
A Solution to Separation and Multicollinearity in Multiple Logistic Regression
Shen, Jianzhao; Gao, Sujuan
2010-01-01
In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27–38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. The ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither solves the problems for each other. In this paper, we propose a double penalized maximum likelihood estimator combining Firth’s penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using a current screening data from a community-based dementia study. PMID:20376286
X-31 aerodynamic characteristics determined from flight data
NASA Technical Reports Server (NTRS)
Kokolios, Alex
1993-01-01
The lateral aerodynamic characteristics of the X-31 were determined at angles of attack ranging from 20 to 45 deg. Estimates of the lateral stability and control parameters were obtained by applying two parameter estimation techniques, linear regression, and the extended Kalman filter to flight test data. An attempt to apply maximum likelihood to extract parameters from the flight data was also made but failed for the reasons presented. An overview of the System Identification process is given. The overview includes a listing of the more important properties of all three estimation techniques that were applied to the data. A comparison is given of results obtained from flight test data and wind tunnel data for four important lateral parameters. Finally, future research to be conducted in this area is discussed.
Kuhlmann, Levin; Manton, Jonathan H; Heyse, Bjorn; Vereecke, Hugo E M; Lipping, Tarmo; Struys, Michel M R F; Liley, David T J
2017-04-01
Tracking brain states with electrophysiological measurements often relies on short-term averages of extracted features, and this may not adequately capture the variability of brain dynamics. The objective is to assess the hypotheses that this can be overcome by tracking distributions of linear models using anesthesia data, and that the anesthetic brain state tracking performance of linear models is comparable to that of a high-performing depth-of-anesthesia monitoring feature. Individuals' brain states are classified by comparing the distribution of linear (autoregressive moving average, ARMA) model parameters estimated from electroencephalographic (EEG) data obtained with a sliding window to distributions of linear model parameters for each brain state. The method is applied to frontal EEG data from 15 subjects undergoing propofol anesthesia and classified by the observer's assessment of alertness/sedation (OAA/S) scale. Classification of the OAA/S score was performed using distributions of either ARMA parameters or the benchmark feature, Higuchi fractal dimension. The highest average testing sensitivity of 59% (chance sensitivity: 17%) was found for ARMA(2,1) models, and Higuchi fractal dimension achieved 52%; however, no statistical difference was observed. For the same ARMA case, there was no statistical difference if medians are used instead of distributions (sensitivity: 56%). The model-based distribution approach is not necessarily more effective than a median/short-term average approach; however, it performs well compared with a distribution approach based on a high-performing anesthesia monitoring measure. These techniques hold potential for anesthesia monitoring and may be generally applicable for tracking brain states.
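A minimal sketch of the feature-extraction step described above: ARMA(2,1) parameters estimated on sliding windows, whose per-state distributions would then be compared; the signal below is simulated noise rather than EEG, and the window sizes are assumptions.

```python
# Sketch: ARMA(2,1) parameters estimated on overlapping windows of a simulated
# signal; in the application each window would come from frontal EEG.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(11)
signal = rng.standard_normal(5000)                     # stand-in for an EEG trace
window, step = 500, 250

rows = []
for start in range(0, signal.size - window + 1, step):
    segment = signal[start:start + window]
    res = ARIMA(segment, order=(2, 0, 1)).fit()        # ARMA(2,1) as ARIMA(2,0,1)
    rows.append(np.asarray(res.params))                # const, ar.L1, ar.L2, ma.L1, sigma2

params = np.vstack(rows)
print(params.shape)                                    # one parameter vector per window
```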
Linear and nonlinear spectroscopy from quantum master equations.
Fetherolf, Jonathan H; Berkelbach, Timothy C
2017-12-28
We investigate the accuracy of the second-order time-convolutionless (TCL2) quantum master equation for the calculation of linear and nonlinear spectroscopies of multichromophore systems. We show that even for systems with non-adiabatic coupling, the TCL2 master equation predicts linear absorption spectra that are accurate over an extremely broad range of parameters and well beyond what would be expected based on the perturbative nature of the approach; non-equilibrium population dynamics calculated with TCL2 for identical parameters are significantly less accurate. For third-order (two-dimensional) spectroscopy, the importance of population dynamics and the violation of the so-called quantum regression theorem degrade the accuracy of TCL2 dynamics. To correct these failures, we combine the TCL2 approach with a classical ensemble sampling of slow microscopic bath degrees of freedom, leading to an efficient hybrid quantum-classical scheme that displays excellent accuracy over a wide range of parameters. In the spectroscopic setting, the success of such a hybrid scheme can be understood through its separate treatment of homogeneous and inhomogeneous broadening. Importantly, the presented approach has the computational scaling of TCL2, with the modest addition of an embarrassingly parallel prefactor associated with ensemble sampling. The presented approach can be understood as a generalized inhomogeneous cumulant expansion technique, capable of treating multilevel systems with non-adiabatic dynamics.
Arnould, V M-R; Hammami, H; Soyeurt, H; Gengler, N
2010-09-01
Random regression test-day models using Legendre polynomials are commonly used for the estimation of genetic parameters and genetic evaluation for test-day milk production traits. However, some researchers have reported that these models present some undesirable properties, such as the overestimation of variances at the edges of lactation. Describing the genetic variation of saturated fatty acids expressed in milk fat might require the testing of different models. Therefore, 3 different functions were used and compared to take into account the lactation curve: (1) Legendre polynomials with the same order as currently applied in the genetic model for production traits; (2) linear splines with 10 knots; and (3) linear splines with the same 10 knots reduced to 3 parameters. The criteria used were Akaike's information and Bayesian information criteria, percentage square biases, and the log-likelihood function. These criteria identified the Legendre polynomials model and the linear splines with 10 knots reduced to 3 parameters model as the most useful. Reducing more complex models using eigenvalues seemed appealing because the resulting models are less time demanding and can reduce convergence difficulties, as convergence properties also seemed to be improved. Finally, the results showed that the reduced spline model was very similar to the Legendre polynomials model. Copyright (c) 2010 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
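The sketch below illustrates only the fixed lactation-curve part of such models, regressing a test-day trait on a Legendre polynomial basis of standardized days in milk; the polynomial order, the data and the trait shape are assumptions, and the random regression (genetic) components are not shown.

```python
# Sketch: fixed-effect lactation curve as a Legendre polynomial regression on
# standardized days in milk; order 3 and the simulated trait are assumptions.
import numpy as np
from numpy.polynomial import legendre

rng = np.random.default_rng(12)
dim = rng.uniform(5.0, 305.0, 300)                     # days in milk
x = 2.0 * (dim - 5.0) / (305.0 - 5.0) - 1.0            # map to [-1, 1]
trait = 30.0 + 5.0 * x - 8.0 * x**2 + rng.normal(0.0, 2.0, 300)

basis = legendre.legvander(x, 3)                       # Legendre basis, order 0..3
coef, *_ = np.linalg.lstsq(basis, trait, rcond=None)
print(coef)                                            # lactation-curve coefficients
```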
NASA Technical Reports Server (NTRS)
Jones, Harrison P.; Branston, Detrick D.; Jones, Patricia B.; Popescu, Miruna D.
2002-01-01
An earlier study compared NASA/NSO Spectromagnetograph (SPM) data with spacecraft measurements of total solar irradiance (TSI) variations over a 1.5 year period in the declining phase of solar cycle 22. This paper extends the analysis to an eight-year period which also spans the rising and early maximum phases of cycle 23. The conclusions of the earlier work appear to be robust: three factors (sunspots, strong unipolar regions, and strong mixed polarity regions) describe most of the variation in the SPM record, but only the first two are associated with TSI. Additionally, the residuals of a linear multiple regression of TSI against SPM observations over the entire eight-year period show an unexplained, increasing, linear time variation with a rate of about 0.05 W m(exp -2) per year. Separate regressions for the periods before and after 1996 January 01 show no unexplained trends but differ substantially in regression parameters. This behavior may reflect a solar source of TSI variations beyond sunspots and faculae but more plausibly results from uncompensated non-solar effects in one or both of the TSI and SPM data sets.
The Bayesian group lasso for confounded spatial data
Hefley, Trevor J.; Hooten, Mevin B.; Hanks, Ephraim M.; Russell, Robin E.; Walsh, Daniel P.
2017-01-01
Generalized linear mixed models for spatial processes are widely used in applied statistics. In many applications of the spatial generalized linear mixed model (SGLMM), the goal is to obtain inference about regression coefficients while achieving optimal predictive ability. When implementing the SGLMM, multicollinearity among covariates and the spatial random effects can make computation challenging and influence inference. We present a Bayesian group lasso prior with a single tuning parameter that can be chosen to optimize predictive ability of the SGLMM and jointly regularize the regression coefficients and spatial random effect. We implement the group lasso SGLMM using efficient Markov chain Monte Carlo (MCMC) algorithms and demonstrate how multicollinearity among covariates and the spatial random effect can be monitored as a derived quantity. To test our method, we compared several parameterizations of the SGLMM using simulated data and two examples from plant ecology and disease ecology. In all examples, problematic levels of multicollinearity occurred and influenced sampling efficiency and inference. We found that the group lasso prior resulted in roughly twice the effective sample size for MCMC samples of regression coefficients and can have higher and less variable predictive accuracy based on out-of-sample data when compared to the standard SGLMM.
Farsa, Oldřich
2013-01-01
The log BB parameter is the logarithm of the ratio of a compound's equilibrium concentrations in the brain tissue versus the blood plasma. This parameter is a useful descriptor in assessing the ability of a compound to permeate the blood-brain barrier. The aim of this study was to develop a Hansch-type linear regression QSAR model that correlates the parameter log BB and the retention time of drugs and other organic compounds on a reversed-phase HPLC column containing an embedded amide moiety. The retention time was expressed by the capacity factor log k′. The second aim was to estimate the brain's absorption of 2-(azacycloalkyl)acetamidophenoxyacetic acids, which are analogues of piracetam, nefiracetam, and meclofenoxate. Notably, these acids may be novel nootropics. Two simple regression models that relate log BB and log k′ were developed from an assay performed using a reversed-phase HPLC column containing an embedded amide moiety. Both the quadratic and linear models yielded statistical parameters comparable to previously published models of log BB dependence on various structural characteristics. The models predict that four members of the substituted phenoxyacetic acid series have a strong chance of permeating the barrier and being absorbed in the brain. The results of this study show that a reversed-phase HPLC system containing an embedded amide moiety is a functional in vitro surrogate of the blood-brain barrier. These results suggest that racetam-type nootropic drugs containing a carboxylic moiety could be more poorly absorbed than analogues devoid of the carboxyl group, especially if the compounds penetrate the barrier by a simple diffusion mechanism. PMID:23641330
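To illustrate the kind of log BB versus log k′ correlation described above, the following sketch fits both a linear and a quadratic model to hypothetical (log k′, log BB) pairs. The data values and variable names are placeholders, not the published calibration.

```python
import numpy as np

# Hypothetical (log k', log BB) pairs standing in for the HPLC assay data.
log_k = np.array([-0.6, -0.3, 0.0, 0.2, 0.5, 0.8, 1.1])
log_bb = np.array([-1.2, -0.8, -0.4, -0.2, 0.1, 0.3, 0.4])

# Linear model: log BB = a0 + a1*log k'
X_lin = np.column_stack([np.ones_like(log_k), log_k])
coef_lin, *_ = np.linalg.lstsq(X_lin, log_bb, rcond=None)

# Quadratic model: log BB = b0 + b1*log k' + b2*(log k')^2
X_quad = np.column_stack([np.ones_like(log_k), log_k, log_k**2])
coef_quad, *_ = np.linalg.lstsq(X_quad, log_bb, rcond=None)

def r_squared(y, X, coef):
    """Coefficient of determination for an ordinary least-squares fit."""
    resid = y - X @ coef
    return 1.0 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

print("linear    R^2:", r_squared(log_bb, X_lin, coef_lin))
print("quadratic R^2:", r_squared(log_bb, X_quad, coef_quad))
```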
Yamamoto, Saori; Shiga, Hiroshi
2018-03-13
To clarify the relationship between masticatory performance and oral health-related quality of life (OHRQoL) before and after complete denture treatment. Thirty patients wearing complete dentures were asked to chew a gummy jelly on their habitual chewing side, and the amount of glucose extraction during chewing was measured as the parameter of masticatory performance. Subjects were asked to answer the Oral Health Impact Profile (OHIP-J49) questionnaire, which consists of 49 questions related to oral problems. The total score of the 49 question items along with individual domain scores within the seven domains (functional limitation, pain, psychological discomfort, physical disability, psychological disability, social disability and handicap) were calculated and used as the parameters of OHRQoL. These records were obtained before treatment and 3 months after treatment. Each parameter of masticatory performance and OHRQoL was compared before treatment and after treatment. The relationship between masticatory performance and OHRQoL was investigated, and a stepwise multiple linear regression analysis was performed. Both masticatory performance and OHRQoL were significantly improved after treatment. Furthermore, masticatory performance was significantly correlated with some parameters of OHRQoL. The stepwise multiple linear regression analysis showed functional limitation and pain as important factors affecting masticatory performance before treatment and functional limitation as an important factor affecting masticatory performance after treatment. These results suggested that masticatory performance and OHRQoL are significantly improved after treatment and that there is a close relationship between the two. Moreover, functional limitation was found to be the most important factor affecting masticatory performance. Copyright © 2018 Japan Prosthodontic Society. Published by Elsevier Ltd. All rights reserved.
Samsuddin, Niza; Rampal, Krishna Gopal; Ismail, Noor Hassim; Abdullah, Nor Zamzila; Nasreen, Hashima E
2016-02-01
Research findings have linked exposure to pesticides to an increased risk of cardiovascular (CVS) diseases. Therefore, this study aimed to assess the impact of chronic mixed-pesticide exposure on CVS hemodynamic parameters. A total of 198 male Malay pesticide-exposed and 195 male Malay nonexposed workers were examined. Data were collected through exposure-matrix assessment, questionnaire, blood analyses, and CVS assessment. Explanatory variables comprised lipid profiles, paraoxonase 1 (PON1), and oxidized low-density lipoprotein (ox-LDL). Outcome measures comprised brachial and aortic diastolic blood pressure (DBP) and systolic BP (SBP), heart rate, and pulse wave velocity (PWV). Linear regressions yielded B coefficients showing by how many units each CVS parameter changed per unit change in the covariates. Diazoxonase was significantly lower and ox-LDL was higher among pesticide-exposed workers than the comparison group. The final multivariate linear regression model revealed that age, body mass index (BMI), smoking, and pesticide exposure were independent predictors of brachial and aortic DBP and SBP. Pesticide exposure was also associated with heart rate, but not with PWV. Lipid profiles, PON1 enzymes, and ox-LDL showed no association with any of the CVS parameters. Chronic mixed-pesticide exposure among workers involved in mosquito control has a possible association with depression of diazoxonase and the increase in ox-LDL, brachial and aortic DBP and SBP, and heart rate. This study raises concerns that those using pesticides may be exposed to hitherto unrecognized CVS risks among others. If this is confirmed by further studies, greater efforts will be needed to protect these workers. © American Journal of Hypertension, Ltd 2015. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Holsclaw, Tracy; Hallgren, Kevin A; Steyvers, Mark; Smyth, Padhraic; Atkins, David C
2015-12-01
Behavioral coding is increasingly used for studying mechanisms of change in psychosocial treatments for substance use disorders (SUDs). However, behavioral coding data typically include features that can be problematic in regression analyses, including measurement error in independent variables, non-normal distributions of count outcome variables, and conflation of predictor and outcome variables with third variables, such as session length. Methodological research in econometrics has shown that these issues can lead to biased parameter estimates, inaccurate standard errors, and increased Type I and Type II error rates, yet these statistical issues are not widely known within SUD treatment research, or more generally, within psychotherapy coding research. Using minimally technical language intended for a broad audience of SUD treatment researchers, the present paper illustrates the nature in which these data issues are problematic. We draw on real-world data and simulation-based examples to illustrate how these data features can bias estimation of parameters and interpretation of models. A weighted negative binomial regression is introduced as an alternative to ordinary linear regression that appropriately addresses the data characteristics common to SUD treatment behavioral coding data. We conclude by demonstrating how to use and interpret these models with data from a study of motivational interviewing. SPSS and R syntax for weighted negative binomial regression models is included in online supplemental materials. (c) 2016 APA, all rights reserved.
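The weighted negative binomial regression recommended above can also be sketched outside SPSS and R. The Python example below (statsmodels) fits a negative binomial GLM with session length as an exposure offset and illustrative frequency weights; the variable names, the fixed dispersion, and the simulated data are assumptions, not the study's model.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200

# Hypothetical session-level data: a therapist-skill score, session length in
# minutes, and a count of client change-talk utterances that grows with both.
skill = rng.normal(size=n)
length_min = rng.uniform(20, 60, size=n)
counts = rng.poisson(np.exp(0.5 + 0.4 * skill) * length_min / 40.0)

X = sm.add_constant(skill)                 # intercept + predictor
weights = rng.integers(1, 4, size=n)       # illustrative frequency weights

# Negative binomial GLM with session length as exposure (a log-scale offset)
# and frequency weights; the dispersion alpha is fixed here for illustration.
model = sm.GLM(counts, X,
               family=sm.families.NegativeBinomial(alpha=1.0),
               exposure=length_min,
               freq_weights=weights)
result = model.fit()
print(result.summary())
```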
Holsclaw, Tracy; Hallgren, Kevin A.; Steyvers, Mark; Smyth, Padhraic; Atkins, David C.
2015-01-01
Behavioral coding is increasingly used for studying mechanisms of change in psychosocial treatments for substance use disorders (SUDs). However, behavioral coding data typically include features that can be problematic in regression analyses, including measurement error in independent variables, non-normal distributions of count outcome variables, and conflation of predictor and outcome variables with third variables, such as session length. Methodological research in econometrics has shown that these issues can lead to biased parameter estimates, inaccurate standard errors, and increased type-I and type-II error rates, yet these statistical issues are not widely known within SUD treatment research, or more generally, within psychotherapy coding research. Using minimally-technical language intended for a broad audience of SUD treatment researchers, the present paper illustrates the nature in which these data issues are problematic. We draw on real-world data and simulation-based examples to illustrate how these data features can bias estimation of parameters and interpretation of models. A weighted negative binomial regression is introduced as an alternative to ordinary linear regression that appropriately addresses the data characteristics common to SUD treatment behavioral coding data. We conclude by demonstrating how to use and interpret these models with data from a study of motivational interviewing. SPSS and R syntax for weighted negative binomial regression models is included in supplementary materials. PMID:26098126
Lambert, Ronald J W; Mytilinaios, Ioannis; Maitland, Luke; Brown, Angus M
2012-08-01
This study describes a method to obtain parameter confidence intervals from the fitting of non-linear functions to experimental data, using the SOLVER and Analysis ToolPak Add-In of the Microsoft Excel spreadsheet. Previously we have shown that Excel can fit complex multiple functions to biological data, obtaining values equivalent to those returned by more specialized statistical or mathematical software. However, a disadvantage of using the Excel method was the inability to return confidence intervals for the computed parameters or the correlations between them. Using a simple Monte-Carlo procedure within the Excel spreadsheet (without recourse to programming), SOLVER can provide parameter estimates (up to 200 at a time) for multiple 'virtual' data sets, from which the required confidence intervals and correlation coefficients can be obtained. The general utility of the method is exemplified by applying it to the analysis of the growth of Listeria monocytogenes, the growth inhibition of Pseudomonas aeruginosa by chlorhexidine and the further analysis of the electrophysiological data from the compound action potential of the rodent optic nerve. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
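A Python analogue of the spreadsheet Monte-Carlo procedure might look like the sketch below: fit a non-linear curve, then refit many "virtual" data sets generated from the fitted curve plus resampled noise and read off percentile confidence intervals. The logistic growth function and noise model are placeholders, not the paper's equations.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, ymax, rate, lag):
    """Illustrative logistic growth curve (a stand-in for the fitted models)."""
    return ymax / (1.0 + np.exp(-rate * (t - lag)))

rng = np.random.default_rng(1)
t = np.linspace(0, 24, 25)
y = logistic(t, 9.0, 0.6, 8.0) + rng.normal(scale=0.2, size=t.size)  # one "real" data set

p_hat, _ = curve_fit(logistic, t, y, p0=(8.0, 0.5, 7.0))
sigma = np.std(y - logistic(t, *p_hat), ddof=3)     # residual SD of the fit

# Monte Carlo: refit many synthetic ("virtual") data sets built from the fitted
# curve plus resampled noise, then read percentile confidence intervals.
draws = []
for _ in range(200):
    y_sim = logistic(t, *p_hat) + rng.normal(scale=sigma, size=t.size)
    p_sim, _ = curve_fit(logistic, t, y_sim, p0=p_hat)
    draws.append(p_sim)
draws = np.array(draws)

lo, hi = np.percentile(draws, [2.5, 97.5], axis=0)
for name, est, a, b in zip(("ymax", "rate", "lag"), p_hat, lo, hi):
    print(f"{name}: {est:.3f}  95% CI [{a:.3f}, {b:.3f}]")
```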
Fast estimation of diffusion tensors under Rician noise by the EM algorithm.
Liu, Jia; Gasbarra, Dario; Railavo, Juha
2016-01-15
Diffusion tensor imaging (DTI) is widely used to characterize, in vivo, the white matter of the central nervous system (CNS). This biological tissue contains a wealth of anatomical, structural, and orientational information about fibers in the human brain. Spectral data from the displacement distribution of water molecules located in the brain tissue are collected by a magnetic resonance scanner and acquired in the Fourier domain. After the Fourier inversion, the noise distribution is Gaussian in both real and imaginary parts and, as a consequence, the recorded magnitude data are corrupted by Rician noise. Statistical estimation of diffusion leads to a non-linear regression problem. In this paper, we present a fast computational method for maximum likelihood estimation (MLE) of diffusivities under the Rician noise model based on the expectation maximization (EM) algorithm. By using data augmentation, we are able to transform a non-linear regression problem into the generalized linear modeling framework, reducing dramatically the computational cost. The Fisher-scoring method is used for achieving fast convergence of the tensor parameter. The new method is implemented and applied using both synthetic and real data in a wide range of b-amplitudes up to 14,000 s/mm². Higher accuracy and precision of the Rician estimates are achieved compared with other log-normal based methods. In addition, we extend the maximum likelihood (ML) framework to the maximum a posteriori (MAP) estimation in DTI under the aforementioned scheme by specifying the priors. We describe how numerically close the estimators of the model parameters obtained through MLE and MAP estimation are. Copyright © 2015 Elsevier B.V. All rights reserved.
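As a much-reduced illustration of likelihood estimation under Rician noise (not the paper's EM and Fisher-scoring scheme for the full tensor), the sketch below recovers a single signal amplitude from simulated magnitude data by direct numerical maximum likelihood and contrasts it with the upward-biased naive mean. All numbers are invented.

```python
import numpy as np
from scipy.stats import rice
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(2)
true_amp, sigma, n = 50.0, 20.0, 500

# Magnitude MR data: |A + n1 + i*n2| with Gaussian noise in both channels.
real = true_amp + sigma * rng.normal(size=n)
imag = sigma * rng.normal(size=n)
m = np.hypot(real, imag)                 # Rician-distributed magnitudes

def neg_loglik(amp):
    """Negative Rician log-likelihood for one unknown amplitude (sigma known)."""
    return -np.sum(rice.logpdf(m, amp / sigma, scale=sigma))

fit = minimize_scalar(neg_loglik, bounds=(1e-3, 200.0), method="bounded")

print("naive mean of magnitudes:", m.mean())   # biased upward by the Rician floor
print("Rician MLE of amplitude :", fit.x)      # closer to the true value of 50
```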
Noninvasive and fast measurement of blood glucose in vivo by near infrared (NIR) spectroscopy
NASA Astrophysics Data System (ADS)
Jintao, Xue; Liming, Ye; Yufei, Liu; Chunyan, Li; Han, Chen
2017-05-01
The aim of this research was to develop a method for noninvasive and fast blood glucose assay in vivo. Near-infrared (NIR) spectroscopy, a more promising technique compared to other methods, was investigated in rats with diabetes and normal rats. Calibration models were generated by two different multivariate strategies: partial least squares (PLS) as a linear regression method and artificial neural networks (ANN) as a non-linear regression method. The PLS model was optimized individually by considering spectral range, spectral pretreatment methods and number of model factors, while the ANN model was studied individually by selecting spectral pretreatment methods, parameters of network topology, number of hidden neurons, and times of epoch. The results of the validation showed the two models were robust, accurate and repeatable. Compared to the ANN model, the performance of the PLS model was much better, with a lower root mean square error of prediction (RMSEP) of 0.419 and a higher correlation coefficient (R) of 96.22%.
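A minimal sketch of the PLS calibration strategy, using scikit-learn on synthetic spectra, is shown below; the number of latent factors, the preprocessing, and the data are assumptions rather than the authors' optimized model.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score

rng = np.random.default_rng(3)
n_samples, n_wavelengths = 120, 300

# Synthetic NIR-like spectra: a few latent components plus noise, with the
# glucose concentration driven by the first component (purely illustrative).
scores = rng.normal(size=(n_samples, 4))
loadings = rng.normal(size=(4, n_wavelengths))
spectra = scores @ loadings + 0.05 * rng.normal(size=(n_samples, n_wavelengths))
glucose = 5.0 + 2.0 * scores[:, 0] + 0.1 * rng.normal(size=n_samples)

X_cal, X_val, y_cal, y_val = train_test_split(spectra, glucose, test_size=0.3,
                                              random_state=0)

pls = PLSRegression(n_components=4)   # number of factors chosen here by assumption
pls.fit(X_cal, y_cal)
y_pred = pls.predict(X_val).ravel()

print("RMSEP:", mean_squared_error(y_val, y_pred) ** 0.5)
print("R^2  :", r2_score(y_val, y_pred))
```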
NASA Astrophysics Data System (ADS)
Hirst, Jonathan D.; King, Ross D.; Sternberg, Michael J. E.
1994-08-01
Neural networks and inductive logic programming (ILP) have been compared to linear regression for modelling the QSAR of the inhibition of E. coli dihydrofolate reductase (DHFR) by 2,4-diamino-5-(substituted benzyl)pyrimidines, and, in the subsequent paper [Hirst, J.D., King, R.D. and Sternberg, M.J.E., J. Comput.-Aided Mol. Design, 8 (1994) 421], the inhibition of rodent DHFR by 2,4-diamino-6,6-dimethyl-5-phenyl-dihydrotriazines. Cross-validation trials provide a statistically rigorous assessment of the predictive capabilities of the methods, with training and testing data selected randomly and all the methods developed using identical training data. For the ILP analysis, molecules are represented by attributes other than Hansch parameters. Neural networks and ILP perform better than linear regression using the attribute representation, but the difference is not statistically significant. The major benefit from the ILP analysis is the formulation of understandable rules relating the activity of the inhibitors to their chemical structure.
Analysis and generation of groundwater concentration time series
NASA Astrophysics Data System (ADS)
Crăciun, Maria; Vamoş, Călin; Suciu, Nicolae
2018-01-01
Concentration time series are provided by simulated concentrations of a nonreactive solute transported in groundwater, integrated over the transverse direction of a two-dimensional computational domain and recorded at the plume center of mass. The analysis of a statistical ensemble of time series reveals subtle features that are not captured by the first two moments which characterize the approximate Gaussian distribution of the two-dimensional concentration fields. The concentration time series exhibit a complex preasymptotic behavior driven by a nonstationary trend and correlated fluctuations with time-variable amplitude. Time series with almost the same statistics are generated by successively adding to a time-dependent trend a sum of linear regression terms, accounting for correlations between fluctuations around the trend and their increments in time, and terms of an amplitude modulated autoregressive noise of order one with time-varying parameter. The algorithm generalizes mixing models used in probability density function approaches. The well-known interaction by exchange with the mean mixing model is a special case consisting of a linear regression with constant coefficients.
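A minimal sketch of such a generator, assuming a smooth nonstationary trend, two regression coefficients linking the next fluctuation to the current fluctuation and its increment, and an amplitude-modulated AR(1) noise with a time-varying parameter, is given below; all numerical values are placeholders, not fitted from the simulated concentration data.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1000
t = np.arange(n)

# Illustrative ingredients: a nonstationary trend, regression coefficients for the
# fluctuation and its increment, and an AR(1) noise whose parameter and amplitude
# both vary with time.
trend = 1.0 + 0.5 * np.exp(-t / 300.0) * np.sin(t / 50.0)
a1, a2 = 0.6, 0.2                         # regression on fluctuation and its increment
phi = 0.8 + 0.15 * np.sin(t / 200.0)      # time-varying AR(1) parameter
amp = 0.02 * (1.0 + np.exp(-t / 400.0))   # time-varying noise amplitude

series = np.empty(n)
fluct = np.zeros(n)    # fluctuation around the trend
noise = np.zeros(n)    # amplitude-modulated AR(1) noise
series[0] = trend[0]
for k in range(1, n):
    noise[k] = phi[k] * noise[k - 1] + amp[k] * rng.normal()
    increment = fluct[k - 1] - (fluct[k - 2] if k > 1 else 0.0)
    fluct[k] = a1 * fluct[k - 1] + a2 * increment + noise[k]
    series[k] = trend[k] + fluct[k]
```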
RRegrs: an R package for computer-aided model selection with multiple regression models.
Tsiliki, Georgia; Munteanu, Cristian R; Seoane, Jose A; Fernandez-Lozano, Carlos; Sarimveis, Haralambos; Willighagen, Egon L
2015-01-01
Predictive regression models can be created with many different modelling approaches. Choices need to be made for data set splitting, cross-validation methods, specific regression parameters and best model criteria, as they all affect the accuracy and efficiency of the produced predictive models, thereby raising model reproducibility and comparison issues. Cheminformatics and bioinformatics are extensively using predictive modelling and exhibit a need for standardization of these methodologies in order to assist model selection and speed up the process of predictive model development. A tool accessible to all users, irrespective of their statistical knowledge, would be valuable if it tested several simple and complex regression models and validation schemes, produced unified reports, and offered the option to be integrated into more extensive studies. Additionally, such methodology should be implemented as a free programming package, in order to be continuously adapted and redistributed by others. We propose an integrated framework for creating multiple regression models, called RRegrs. The tool offers the option of ten simple and complex regression methods combined with repeated 10-fold and leave-one-out cross-validation. Methods include Multiple Linear regression, Generalized Linear Model with Stepwise Feature Selection, Partial Least Squares regression, Lasso regression, and Support Vector Machines Recursive Feature Elimination. The new framework is an automated, fully validated procedure which produces standardized reports to quickly oversee the impact of choices in modelling algorithms and assess the model and cross-validation results. The methodology was implemented as an open source R package, available at https://www.github.com/enanomapper/RRegrs, by reusing and extending the caret package. The universality of the new methodology is demonstrated using five standard data sets from different scientific fields. Its efficiency in cheminformatics and QSAR modelling is shown with three use cases: proteomics data for surface-modified gold nanoparticles, nano-metal oxides descriptor data, and molecular descriptors for acute aquatic toxicity data. The results show that for all data sets RRegrs reports models with equal or better performance for both training and test sets than those reported in the original publications. Its good performance as well as its adaptability in terms of parameter optimization could make RRegrs a popular framework to assist the initial exploration of predictive models, and with that, the design of more comprehensive in silico screening applications. Graphical abstract: RRegrs is a computer-aided model selection framework for multiple regression models in R; it is a fully validated procedure with application to QSAR modelling.
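The core workflow, several regression methods compared under repeated 10-fold cross-validation with a unified report, can be sketched in Python with scikit-learn as below; this is only an illustration of the idea, not the RRegrs API or its full set of ten methods.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import RepeatedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression, Lasso
from sklearn.cross_decomposition import PLSRegression
from sklearn.svm import SVR

# Synthetic QSAR-like data standing in for a descriptor matrix and an endpoint.
X, y = make_regression(n_samples=150, n_features=20, noise=10.0, random_state=0)

models = {
    "MLR":   LinearRegression(),
    "Lasso": Lasso(alpha=0.1),
    "PLS":   PLSRegression(n_components=5),
    "SVR":   SVR(kernel="rbf", C=10.0),
}

cv = RepeatedKFold(n_splits=10, n_repeats=5, random_state=0)
for name, model in models.items():
    pipe = make_pipeline(StandardScaler(), model)
    scores = cross_val_score(pipe, X, y, cv=cv, scoring="r2")
    print(f"{name:5s}  mean R2 = {scores.mean():.3f} +/- {scores.std():.3f}")
```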
Improving power and robustness for detecting genetic association with extreme-value sampling design.
Chen, Hua Yun; Li, Mingyao
2011-12-01
Extreme-value sampling design that samples subjects with extremely large or small quantitative trait values is commonly used in genetic association studies. Samples in such designs are often treated as "cases" and "controls" and analyzed using logistic regression. Such a case-control analysis ignores the potential dose-response relationship between the quantitative trait and the underlying trait locus and thus may lead to loss of power in detecting genetic association. An alternative approach to analyzing such data is to model the dose-response relationship by a linear regression model. However, parameter estimation from this model can be biased, which may lead to inflated type I errors. We propose a robust and efficient approach that takes into consideration both the biased sampling design and the potential dose-response relationship. Extensive simulations demonstrate that the proposed method is more powerful than the traditional logistic regression analysis and is more robust than the linear regression analysis. We applied our method to the analysis of a candidate gene association study on high-density lipoprotein cholesterol (HDL-C) which included study subjects with extremely high or low HDL-C levels. Using our method, we identified several SNPs showing stronger evidence of association with HDL-C than the traditional case-control logistic regression analysis. Our results suggest that it is important to appropriately model the quantitative traits and to adjust for the biased sampling when a dose-response relationship exists in extreme-value sampling designs. © 2011 Wiley Periodicals, Inc.
Hartzell, S.; Leeds, A.; Frankel, A.; Williams, R.A.; Odum, J.; Stephenson, W.; Silva, W.
2002-01-01
The Seattle fault poses a significant seismic hazard to the city of Seattle, Washington. A hybrid, low-frequency, high-frequency method is used to calculate broadband (0-20 Hz) ground-motion time histories for a M 6.5 earthquake on the Seattle fault. Low frequencies (1 Hz) are calculated by a stochastic method that uses a fractal subevent size distribution to give an ω⁻² displacement spectrum. Time histories are calculated for a grid of stations and then corrected for the local site response using a classification scheme based on the surficial geology. Average shear-wave velocity profiles are developed for six surficial geologic units: artificial fill, modified land, Esperance sand, Lawton clay, till, and Tertiary sandstone. These profiles together with other soil parameters are used to compare linear, equivalent-linear, and nonlinear predictions of ground motion in the frequency band 0-15 Hz. Linear site-response corrections are found to yield unreasonably large ground motions. Equivalent-linear and nonlinear calculations give peak values similar to the 1994 Northridge, California, earthquake and those predicted by regression relationships. Ground-motion variance is estimated for (1) randomization of the velocity profiles, (2) variation in source parameters, and (3) choice of nonlinear model. Within the limits of the models tested, the results are found to be most sensitive to the nonlinear model and soil parameters, notably the overconsolidation ratio.
1974-01-01
DIGITAL IMAGE RESTORATION UNDER A REGRESSION MODEL - THE UNCONSTRAINED, LINEAR EQUALITY AND INEQUALITY CONSTRAINED APPROACHES. Report 520, January 1974. Nelson Delfino d'Avila Mascarenhas. ... a two-dimensional form adequately describes the linear model. A discretization is performed by using quadrature methods. By trans...
Statistical evaluation of stability data: criteria for change-over-time and data variability.
Bar, Raphael
2003-01-01
In a recently issued ICH Q1E guidance on evaluation of stability data of drug substances and products, the need to perform a statistical extrapolation of a shelf-life of a drug product or a retest period for a drug substance is based heavily on whether data exhibit a change-over-time and/or variability. However, this document suggests neither measures nor acceptance criteria for these two parameters. This paper demonstrates a useful application of simple statistical parameters for determining whether sets of stability data from either accelerated or long-term storage programs exhibit a change-over-time and/or variability. These parameters are all derived from a simple linear regression analysis first performed on the stability data. The p-value of the slope of the regression line is taken as a measure of change-over-time, and a value of 0.25 is suggested as a limit for insignificant change in the quantitative stability attributes monitored. The minimal process capability index, Cpk, calculated from the standard deviation of the regression line, is suggested as a measure of variability, with a value of 2.5 as a limit for insignificant variability. The usefulness of the above two parameters, p-value and Cpk, was demonstrated on stability data of a refrigerated drug product and on pooled data of three batches of a drug substance. In both cases, the determined parameters allowed characterization of the data in terms of change-over-time and variability. Consequently, complete evaluation of the stability data could be pursued according to the ICH guidance. It is believed that the application of the above two parameters with their acceptance criteria will allow a more unified evaluation of stability data.
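A hedged sketch of the two screening statistics on hypothetical assay-versus-time stability data follows; the specification limits and the exact Cpk formula used here are assumptions about the intended calculation, not values from the guidance or the paper.

```python
import numpy as np
from scipy import stats

# Hypothetical stability data: assay (% of label claim) at pull points (months).
months = np.array([0, 3, 6, 9, 12, 18, 24], dtype=float)
assay = np.array([100.1, 99.8, 99.9, 99.5, 99.7, 99.4, 99.3])

fit = stats.linregress(months, assay)
print("slope p-value:", fit.pvalue)      # change-over-time screen (limit 0.25 in the paper)

# Variability screen: a capability index computed from the regression residual SD.
resid_sd = np.std(assay - (fit.intercept + fit.slope * months), ddof=2)
lsl, usl = 95.0, 105.0                   # assumed specification limits
mean = assay.mean()
cpk = min(usl - mean, mean - lsl) / (3.0 * resid_sd)
print("Cpk:", cpk)                       # limit of 2.5 suggested in the paper
```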
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wei, J; Chao, M
2016-06-15
Purpose: To develop a novel strategy to extract the respiratory motion of the thoracic diaphragm from kilovoltage cone beam computed tomography (CBCT) projections by a constrained linear regression optimization technique. Methods: A parabolic function was identified as the geometric model and was employed to fit the shape of the diaphragm on the CBCT projections. The search was initialized by five manually placed seeds on a pre-selected projection image. Temporal redundancies inherent in the dynamic properties of the diaphragm motion (the enabling phenomenology in video compression and encoding techniques) were integrated with the geometric shape of the diaphragm boundary and an associated algebraic constraint that significantly reduced the search space of viable parabolic parameters; the resulting problem can be effectively optimized by a constrained linear regression approach on the subsequent projections. The algebraic constraints stipulating the kinetic range of the motion and the spatial constraint preventing any unphysical deviations made it possible to obtain the optimal contour of the diaphragm with minimal initialization. The algorithm was assessed by a fluoroscopic movie acquired in a fixed anterior-posterior direction and by kilovoltage CBCT projection image sets from four lung and two liver patients. The automatic tracing by the proposed algorithm and manual tracking by a human operator were compared in both the space and frequency domains. Results: The error between the estimated and manual detections for the fluoroscopic movie was 0.54 mm with a standard deviation (SD) of 0.45 mm, while the average error for the CBCT projections was 0.79 mm with an SD of 0.64 mm for all enrolled patients. The submillimeter accuracy exhibits the promise of the proposed constrained linear regression approach to track the diaphragm motion on rotational projection images. Conclusion: The new algorithm will provide a potential solution to rendering diaphragm motion and ultimately improving tumor motion management for radiation therapy of cancer patients.
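A hedged sketch of the central fitting step, a parabola fitted to edge points by bounded linear least squares, is shown below; the constraint ranges and the synthetic edge points are placeholders, not the authors' kinetic or spatial constraints.

```python
import numpy as np
from scipy.optimize import lsq_linear

rng = np.random.default_rng(5)

# Hypothetical edge points (u, v) extracted from one projection image, in pixels.
u = np.linspace(-80, 80, 40)
v = 0.004 * u**2 + 0.1 * u + 250.0 + rng.normal(scale=1.5, size=u.size)

# Design matrix for v = a*u^2 + b*u + c, solved as a linear least-squares problem
# with box constraints that keep the parabola within a plausible range
# (placeholder bounds standing in for the paper's kinetic/spatial constraints).
A = np.column_stack([u**2, u, np.ones_like(u)])
lower = [0.001, -0.5, 200.0]
upper = [0.010,  0.5, 300.0]
fit = lsq_linear(A, v, bounds=(lower, upper))

a, b, c = fit.x
print(f"fitted parabola: v = {a:.4f} u^2 + {b:.3f} u + {c:.1f}")
```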
Acquisition Challenge: The Importance of Incompressibility in Comparing Learning Curve Models
2015-10-01
parameters for all four learning models used in the study. The learning rate factor, b, is the slope of the linear regression line, which in this case is ... incorporated within the DoD acquisition environment. This study tested three alternative learning models (the Stanford-B model, DeJong's learning formula, ...) ... appropriate tools to calculate accurate and reliable predictions. However, conventional learning curve methodology has been in practice since the pre...
Structured chaos in a devil's staircase of the Josephson junction.
Shukrinov, Yu M; Botha, A E; Medvedeva, S Yu; Kolahchi, M R; Irie, A
2014-09-01
The phase dynamics of Josephson junctions (JJs) under external electromagnetic radiation is studied through numerical simulations. Current-voltage characteristics, Lyapunov exponents, and Poincaré sections are analyzed in detail. It is found that the subharmonic Shapiro steps at certain parameters are separated by structured chaotic windows. By performing a linear regression on the linear part of the data, a fractal dimension of D = 0.868 is obtained, with an uncertainty of ±0.012. The chaotic regions exhibit scaling similarity, and it is shown that the devil's staircase of the system can form a backbone that unifies and explains the highly correlated and structured chaotic behavior. These features suggest a system possessing multiple complete devil's staircases. The onset of chaos for subharmonic steps occurs through the Feigenbaum period doubling scenario. Universality in the sequence of periodic windows is also demonstrated. Finally, the influence of the radiation and JJ parameters on the structured chaos is investigated, and it is concluded that the structured chaos is a stable formation over a wide range of parameter values.
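The fractal dimension quoted above is obtained as the slope of a linear regression on the linear part of log-log data; the sketch below reproduces that generic step on synthetic box-counting-style data, reporting the slope and its standard error. The data are invented, not the Josephson-junction results.

```python
import numpy as np
from scipy import stats

# Synthetic box-counting data: counts N(eps) ~ eps^(-D) with D = 0.87 plus scatter.
rng = np.random.default_rng(6)
eps = np.logspace(-3, -1, 12)
counts = eps ** (-0.87) * np.exp(rng.normal(scale=0.02, size=eps.size))

# Linear regression on the linear (log-log) part of the data.
fit = stats.linregress(np.log(eps), np.log(counts))
D = -fit.slope
print(f"estimated fractal dimension D = {D:.3f} +/- {fit.stderr:.3f}")
```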
Bivariate categorical data analysis using normal linear conditional multinomial probability model.
Sun, Bingrui; Sutradhar, Brajendra
2015-02-10
Bivariate multinomial data such as the left and right eyes retinopathy status data are analyzed either by using a joint bivariate probability model or by exploiting certain odds ratio-based association models. However, the joint bivariate probability model yields marginal probabilities, which are complicated functions of marginal and association parameters for both variables, and the odds ratio-based association model treats the odds ratios involved in the joint probabilities as 'working' parameters, which are consequently estimated through certain arbitrary 'working' regression models. Also, this latter odds ratio-based model does not provide any easy interpretations of the correlations between two categorical variables. On the basis of pre-specified marginal probabilities, in this paper, we develop a bivariate normal type linear conditional multinomial probability model to understand the correlations between two categorical variables. The parameters involved in the model are consistently estimated using the optimal likelihood and generalized quasi-likelihood approaches. The proposed model and the inferences are illustrated through an intensive simulation study as well as an analysis of the well-known Wisconsin Diabetic Retinopathy status data. Copyright © 2014 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Martínez-Fernández, J.; Chuvieco, E.; Koutsias, N.
2013-02-01
Humans are responsible for most forest fires in Europe, but anthropogenic factors behind these events are still poorly understood. We tried to identify the driving factors of human-caused fire occurrence in Spain by applying two different statistical approaches. Firstly, assuming stationary processes for the whole country, we created models based on multiple linear regression and binary logistic regression to find factors associated with fire density and fire presence, respectively. Secondly, we used geographically weighted regression (GWR) to better understand and explore the local and regional variations of those factors behind human-caused fire occurrence. The number of human-caused fires occurring within a 25-yr period (1983-2007) was computed for each of the 7638 Spanish mainland municipalities, creating a binary variable (fire/no fire) to develop logistic models, and a continuous variable (fire density) to build standard linear regression models. A total of 383 657 fires were registered in the study dataset. The binary logistic model, which estimates the probability of having/not having a fire, successfully classified 76.4% of the total observations, while the ordinary least squares (OLS) regression model explained 53% of the variation of the fire density patterns (adjusted R2 = 0.53). Both approaches confirmed, in addition to forest and climatic variables, the importance of variables related with agrarian activities, land abandonment, rural population exodus and developmental processes as underlying factors of fire occurrence. For the GWR approach, the explanatory power of the GW linear model for fire density using an adaptive bandwidth increased from 53% to 67%, while for the GW logistic model the correctly classified observations improved only slightly, from 76.4% to 78.4%, but significantly according to the corrected Akaike Information Criterion (AICc), from 3451.19 to 3321.19. The results from GWR indicated a significant spatial variation in the local parameter estimates for all the variables and an important reduction of the autocorrelation in the residuals of the GW linear model. Despite the fitting improvement of local models, GW regression, more than an alternative to "global" or traditional regression modelling, seems to be a valuable complement to explore the non-stationary relationships between the response variable and the explanatory variables. The synergy of global and local modelling provides insights into fire management and policy and helps further our understanding of the fire problem over large areas while at the same time recognizing its local character.
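The GWR machinery used above can be sketched as weighted least squares at each location with kernel-based weights, as below; the fixed Gaussian bandwidth and the synthetic data are assumptions, in contrast to the adaptive bandwidth and AICc-based selection used in the study.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 400

# Synthetic municipalities: coordinates, one covariate, and a fire-density response
# whose coefficient drifts across space (purely illustrative).
coords = rng.uniform(0, 100, size=(n, 2))
x = rng.normal(size=n)
beta_true = 1.0 + 0.03 * coords[:, 0]            # spatially varying effect
y = 2.0 + beta_true * x + rng.normal(scale=0.5, size=n)

X = np.column_stack([np.ones(n), x])
bandwidth = 15.0                                  # fixed Gaussian kernel bandwidth (assumed)

def gwr_at(target):
    """Weighted least squares at one target location with Gaussian kernel weights."""
    d = np.linalg.norm(coords - coords[target], axis=1)
    w = np.exp(-0.5 * (d / bandwidth) ** 2)
    W = np.diag(w)
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ y)

local_betas = np.array([gwr_at(i) for i in range(n)])
print("range of local slope estimates:",
      local_betas[:, 1].min(), local_betas[:, 1].max())
```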
Element enrichment factor calculation using grain-size distribution and functional data regression.
Sierra, C; Ordóñez, C; Saavedra, A; Gallego, J R
2015-01-01
In environmental geochemistry studies it is common practice to normalize element concentrations in order to remove the effect of grain size. Linear regression with respect to a particular grain size or conservative element is a widely used method of normalization. In this paper, the utility of functional linear regression, in which the grain-size curve is the independent variable and the concentration of pollutant the dependent variable, is analyzed and applied to detrital sediment. After implementing functional linear regression and classical linear regression models to normalize and calculate enrichment factors, we concluded that the former regression technique has some advantages over the latter. First, functional linear regression directly considers the grain-size distribution of the samples as the explanatory variable. Second, as the regression coefficients are not constant values but functions depending on the grain size, it is easier to comprehend the relationship between grain size and pollutant concentration. Third, regularization can be introduced into the model in order to establish equilibrium between reliability of the data and smoothness of the solutions. Copyright © 2014 Elsevier Ltd. All rights reserved.
Who Will Win?: Predicting the Presidential Election Using Linear Regression
ERIC Educational Resources Information Center
Lamb, John H.
2007-01-01
This article outlines a linear regression activity that engages learners, uses technology, and fosters cooperation. Students generated least-squares linear regression equations using TI-83 Plus[TM] graphing calculators, Microsoft[C] Excel, and paper-and-pencil calculations using derived normal equations to predict the 2004 presidential election.…
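A sketch of the normal-equations calculation the students carried out might look as follows, with invented poll numbers standing in for the election data described in the article.

```python
import numpy as np

# Hypothetical data: week of the campaign vs. a candidate's poll percentage.
week = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
poll = np.array([46.0, 46.5, 47.1, 47.0, 47.8, 48.2, 48.1, 48.9])

# Normal equations: solve (X^T X) beta = X^T y for the model poll = b0 + b1*week.
X = np.column_stack([np.ones_like(week), week])
b0, b1 = np.linalg.solve(X.T @ X, X.T @ poll)

print(f"least-squares line: poll = {b0:.2f} + {b1:.3f} * week")
print("predicted at week 12:", b0 + b1 * 12)
```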
Linear theory for filtering nonlinear multiscale systems with model error
Berry, Tyrus; Harlim, John
2014-01-01
In this paper, we study filtering of multiscale dynamical systems with model error arising from limitations in resolving the smaller scale processes. In particular, the analysis assumes the availability of continuous-time noisy observations of all components of the slow variables. Mathematically, this paper presents new results on higher order asymptotic expansion of the first two moments of a conditional measure. In particular, we are interested in the application of filtering multiscale problems in which the conditional distribution is defined over the slow variables, given noisy observation of the slow variables alone. From the mathematical analysis, we learn that for a continuous time linear model with Gaussian noise, there exists a unique choice of parameters in a linear reduced model for the slow variables which gives the optimal filtering when only the slow variables are observed. Moreover, these parameters simultaneously give the optimal equilibrium statistical estimates of the underlying system, and as a consequence they can be estimated offline from the equilibrium statistics of the true signal. By examining a nonlinear test model, we show that the linear theory extends in this non-Gaussian, nonlinear configuration as long as we know the optimal stochastic parametrization and the correct observation model. However, when the stochastic parametrization model is inappropriate, parameters chosen for good filter performance may give poor equilibrium statistical estimates and vice versa; this finding is based on analytical and numerical results on our nonlinear test model and the two-layer Lorenz-96 model. Finally, even when the correct stochastic ansatz is given, it is imperative to estimate the parameters simultaneously and to account for the nonlinear feedback of the stochastic parameters into the reduced filter estimates. In numerical experiments on the two-layer Lorenz-96 model, we find that the parameters estimated online, as part of a filtering procedure, simultaneously produce accurate filtering and equilibrium statistical prediction. In contrast, an offline estimation technique based on a linear regression, which fits the parameters to a training dataset without using the filter, yields filter estimates which are worse than the observations or even divergent when the slow variables are not fully observed. This finding does not imply that all offline methods are inherently inferior to the online method for nonlinear estimation problems, it only suggests that an ideal estimation technique should estimate all parameters simultaneously whether it is online or offline. PMID:25002829
New robust statistical procedures for the polytomous logistic regression models.
Castilla, Elena; Ghosh, Abhik; Martin, Nirian; Pardo, Leandro
2018-05-17
This article derives a new family of estimators, namely the minimum density power divergence estimators, as a robust generalization of the maximum likelihood estimator for the polytomous logistic regression model. Based on these estimators, a family of Wald-type test statistics for linear hypotheses is introduced. Robustness properties of both the proposed estimators and the test statistics are theoretically studied through the classical influence function analysis. Appropriate real life examples are presented to justify the requirement of suitable robust statistical procedures in place of the likelihood based inference for the polytomous logistic regression model. The validity of the theoretical results established in the article is further confirmed empirically through suitable simulation studies. Finally, an approach for the data-driven selection of the robustness tuning parameter is proposed with empirical justifications. © 2018, The International Biometric Society.
Marginal regression analysis of recurrent events with coarsened censoring times.
Hu, X Joan; Rosychuk, Rhonda J
2016-12-01
Motivated by an ongoing pediatric mental health care (PMHC) study, this article presents weakly structured methods for analyzing doubly censored recurrent event data where only coarsened information on censoring is available. The study extracted administrative records of emergency department visits from provincial health administrative databases. The available information of each individual subject is limited to a subject-specific time window determined up to concealed data. To evaluate the time-dependent effects of exposures, we adapt the local linear estimation with right censored survival times under the Cox regression model with time-varying coefficients (cf. Cai and Sun, Scandinavian Journal of Statistics 2003, 30, 93-111). We establish the pointwise consistency and asymptotic normality of the regression parameter estimator, and examine its performance by simulation. The PMHC study illustrates the proposed approach throughout the article. © 2016, The International Biometric Society.
The microcomputer scientific software series 2: general linear model--regression.
Harold M. Rauscher
1983-01-01
The general linear model regression (GLMR) program provides the microcomputer user with a sophisticated regression analysis capability. The output provides a regression ANOVA table, estimators of the regression model coefficients, their confidence intervals, confidence intervals around the predicted Y-values, residuals for plotting, a check for multicollinearity, a...
Proton-pump inhibitor use does not affect semen quality in subfertile men.
Keihani, Sorena; Craig, James R; Zhang, Chong; Presson, Angela P; Myers, Jeremy B; Brant, William O; Aston, Kenneth I; Emery, Benjamin R; Jenkins, Timothy G; Carrell, Douglas T; Hotaling, James M
2018-01-01
Proton-pump inhibitors (PPIs) are among the most widely used drugs worldwide. PPI use has recently been linked to adverse changes in semen quality in healthy men; however, the effects of PPI use on semen parameters remain largely unknown specifically in cases with male factor infertility. We examined whether PPI use was associated with detrimental effects on semen parameters in a large population of subfertile men. We retrospectively reviewed data from 12 257 subfertile men who had visited our fertility clinic from 2003 to 2013. Patients who reported using any PPIs for >3 months before semen sample collection were included; 7698 subfertile men taking no medication served as controls. Data were gathered on patient age, medication use, and conventional semen parameters; patients taking any known spermatotoxic medication were excluded. Linear mixed-effect regression models were used to test the effect of PPI use on semen parameters adjusting for age. A total of 248 patients (258 samples) used PPIs for at least 3 months before semen collection. In regression models, PPI use (either as the only medication or when used in combination with other nonspermatotoxic medications) was not associated with statistically significant changes in semen parameters. To our knowledge, this is the largest study to compare PPI use with semen parameters in subfertile men. Using PPIs was not associated with detrimental effects on semen quality in this retrospective study.
Zhao, Lei; Li, Weizheng; Su, Zhihong; Liu, Yong; Zhu, Liyong; Zhu, Shaihong
2018-05-29
This study investigated the role of preoperative fasting C-peptide (FCP) levels in predicting diabetic outcomes in low-BMI Chinese patients following Roux-en-Y gastric bypass (RYGB) by comparing the metabolic outcomes of patients with FCP > 1 ng/ml versus FCP ≤ 1 ng/ml. The study sample included 78 type 2 diabetes mellitus patients with an average BMI < 30 kg/m² at baseline. Patients' parameters were analyzed before and after surgery, with a 2-year follow-up. A univariate logistic regression analysis and a multivariate analysis of variance between the remission and improvement groups were performed to determine factors that were associated with type 2 diabetes remission after RYGB. Linear correlation analyses between FCP and metabolic parameters were performed. Patients were divided into two groups, FCP > 1 ng/ml and FCP ≤ 1 ng/ml, with measured parameters compared between the groups. Patients' fasting plasma glucose, 2-h postprandial plasma glucose, FCP, and HbA1c improved significantly after surgery (p < 0.05). Factors associated with type 2 diabetes remission in the univariate logistic regression analysis were BMI, 2hINS, and FCP (p < 0.05). Multivariate logistic regression analysis was then performed and showed that remission was most strongly related to FCP (OR = 2.39). FCP showed a significant linear correlation with fasting insulin and BMI (p < 0.05). There was a significant difference in remission rate between the FCP > 1 ng/ml and FCP ≤ 1 ng/ml groups (p = 0.01). The parameters of patients with FCP > 1 ng/ml, including BMI, plasma glucose, HbA1c, and plasma insulin, decreased markedly after surgery (p < 0.05). FCP level is a significant predictor of diabetes outcomes after RYGB in low-BMI Chinese patients. An FCP level of 1 ng/ml may be a useful threshold for predicting surgical prognosis, with FCP > 1 ng/ml predicting better clinical outcomes following RYGB.
Resistance of nickel-chromium-aluminum alloys to cyclic oxidation at 1100 C and 1200 C
NASA Technical Reports Server (NTRS)
Barrett, C. A.; Lowell, C. E.
1976-01-01
Nickel-rich alloys in the Ni-Cr-Al system were evaluated for cyclic oxidation resistance in still air at 1,100 and 1,200 C. A first approximation oxidation attack parameter Ka was derived from specific weight change data involving both a scaling growth constant and a spalling constant. An estimating equation was derived with Ka as a function of the Cr and Al content by multiple linear regression and translated into contour ternary diagrams showing regions of minimum attack. An additional factor inferred from the regression analysis was that alloys melted in zirconia crucibles had significantly greater oxidation resistance than comparable alloys melted otherwise.
Two biased estimation techniques in linear regression: Application to aircraft
NASA Technical Reports Server (NTRS)
Klein, Vladislav
1988-01-01
Several ways for detection and assessment of collinearity in measured data are discussed. Because data collinearity usually results in poor least squares estimates, two estimation techniques which can limit a damaging effect of collinearity are presented. These two techniques, the principal components regression and mixed estimation, belong to a class of biased estimation techniques. Detection and assessment of data collinearity and the two biased estimation techniques are demonstrated in two examples using flight test data from longitudinal maneuvers of an experimental aircraft. The eigensystem analysis and parameter variance decomposition appeared to be a promising tool for collinearity evaluation. The biased estimators had far better accuracy than the results from the ordinary least squares technique.
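Principal components regression, the first of the two biased techniques named above, can be sketched as below on deliberately collinear data; the number of retained components and the simulated measurements are assumptions, not the flight-test analysis itself.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(8)
n = 200

# Collinear regressors (e.g., two nearly redundant flight-state measurements).
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)        # almost a copy of x1
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 1.0 * x1 + 1.0 * x2 + 0.5 * x3 + 0.2 * rng.normal(size=n)

# Ordinary least squares: the x1 and x2 coefficients are poorly determined.
ols = LinearRegression().fit(X, y)
print("OLS coefficients:", ols.coef_)

# Principal components regression: regress on the leading principal components
# (2 of 3 kept here by assumption), trading a little bias for much lower variance.
pcr = make_pipeline(StandardScaler(), PCA(n_components=2), LinearRegression())
pcr.fit(X, y)
print("PCR coefficients in PC space:", pcr.named_steps["linearregression"].coef_)
```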
Research on On-Line Modeling of Fed-Batch Fermentation Process Based on v-SVR
NASA Astrophysics Data System (ADS)
Ma, Yongjun
The fermentation process is very complex and non-linear, and many parameters are not easy to measure directly on line, so soft sensor modeling is a good solution. This paper introduces v-support vector regression (v-SVR) for soft sensor modeling of the fed-batch fermentation process. v-SVR is a novel type of learning machine. It can control the accuracy of fitting and the prediction error by adjusting the parameter v. An on-line training algorithm is discussed in detail to reduce the training complexity of v-SVR. The experimental results show that v-SVR has a low error rate and better generalization with an appropriate v.
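A hedged sketch of a v-SVR soft sensor using scikit-learn's NuSVR is shown below; the on-line measurements, the target quantity, and the value of v (nu) are placeholders rather than the paper's setup, and the incremental on-line training algorithm is not reproduced.

```python
import numpy as np
from sklearn.svm import NuSVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(9)
n = 300

# Hypothetical on-line measurements (temperature, pH, dissolved oxygen, feed rate)
# and an off-line quantity to infer (e.g., biomass concentration).
X = rng.normal(size=(n, 4))
biomass = 3.0 + 1.5 * np.tanh(X[:, 0]) + 0.8 * X[:, 1] * X[:, 2] \
          + 0.1 * rng.normal(size=n)

X_tr, X_te, y_tr, y_te = train_test_split(X, biomass, test_size=0.3, random_state=0)

# nu controls the fraction of support vectors / margin errors, which is how
# v-SVR trades off fitting accuracy against model complexity.
model = make_pipeline(StandardScaler(), NuSVR(nu=0.5, C=10.0, kernel="rbf"))
model.fit(X_tr, y_tr)
rmse = mean_squared_error(y_te, model.predict(X_te)) ** 0.5
print("test RMSE:", rmse)
```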
Mckay, Garrett; Huang, Wenxi; Romera-Castillo, Cristina; Crouch, Jenna E; Rosario-Ortiz, Fernando L; Jaffé, Rudolf
2017-05-16
The antioxidant capacity and formation of photochemically produced reactive intermediates (RI) was studied for water samples collected from the Florida Everglades with different spatial (marsh versus estuarine) and temporal (wet versus dry season) characteristics. Measured RI included triplet excited states of dissolved organic matter (³DOM*), singlet oxygen (¹O₂), and the hydroxyl radical (•OH). Single and multiple linear regression modeling were performed using a broad range of extrinsic (to predict RI formation rates, R_RI) and intrinsic (to predict RI quantum yields, Φ_RI) parameters. Multiple linear regression models consistently led to better predictions of R_RI and Φ_RI for our data set but poor prediction of Φ_RI for a previously published data set [1], probably because the predictors are intercorrelated (Pearson's r > 0.5). Single linear regression models were built with data compiled from previously published studies (n ≈ 120) in which E2:E3, S, and Φ_RI values were measured, which revealed a high degree of similarity between RI-optical property relationships across DOM samples of diverse sources. This study reveals that •OH formation is, in general, decoupled from ³DOM* and ¹O₂ formation, providing supporting evidence that ³DOM* is not a •OH precursor. Finally, Φ_RI for ¹O₂ and ³DOM* correlated negatively with antioxidant activity (a surrogate for electron donating capacity) for the collected samples, which is consistent with intramolecular oxidation of DOM moieties by ³DOM*.
Thieler, E. Robert; Himmelstoss, Emily A.; Zichichi, Jessica L.; Ergul, Ayhan
2009-01-01
The Digital Shoreline Analysis System (DSAS) version 4.0 is a software extension to ESRI ArcGIS v.9.2 and above that enables a user to calculate shoreline rate-of-change statistics from multiple historic shoreline positions. A user-friendly interface of simple buttons and menus guides the user through the major steps of shoreline change analysis. Components of the extension and user guide include (1) instruction on the proper way to define a reference baseline for measurements, (2) automated and manual generation of measurement transects and metadata based on user-specified parameters, and (3) output of calculated rates of shoreline change and other statistical information. DSAS computes shoreline rates of change using four different methods: (1) endpoint rate, (2) simple linear regression, (3) weighted linear regression, and (4) least median of squares. The standard error, correlation coefficient, and confidence interval are also computed for the simple and weighted linear-regression methods. The results of all rate calculations are output to a table that can be linked to the transect file by a common attribute field. DSAS is intended to facilitate the shoreline change-calculation process and to provide rate-of-change information and the statistical data necessary to establish the reliability of the calculated results. The software is also suitable for any generic application that calculates positional change over time, such as assessing rates of change of glacier limits in sequential aerial photos, river edge boundaries, land-cover changes, and so on.
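Two of the DSAS rate methods can be sketched directly: an endpoint rate and a weighted linear regression in which each survey is weighted by the inverse square of its positional uncertainty. The shoreline positions and uncertainties below are invented, not DSAS output.

```python
import numpy as np

# Hypothetical shoreline positions (m) along one transect, with survey uncertainties (m).
year = np.array([1950, 1970, 1985, 1998, 2007], dtype=float)
position = np.array([12.0, 9.5, 7.8, 5.2, 4.1])
uncertainty = np.array([5.0, 3.0, 2.0, 1.0, 0.5])

# Endpoint rate: change between the oldest and most recent shorelines only.
epr = (position[-1] - position[0]) / (year[-1] - year[0])

# Weighted linear regression: weight each survey by 1/uncertainty^2.
w = 1.0 / uncertainty**2
W = np.diag(w)
X = np.column_stack([np.ones_like(year), year])
beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ position)

print("endpoint rate (m/yr):           ", epr)
print("weighted regression rate (m/yr):", beta[1])
```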
Ochi, H; Ikuma, I; Toda, H; Shimada, T; Morioka, S; Moriyama, K
1989-12-01
In order to determine whether isovolumic relaxation period (IRP) reflects left ventricular relaxation under different afterload conditions, 17 anesthetized, open chest dogs were studied, and the left ventricular pressure decay time constant (T) was calculated. In 12 dogs, angiotensin II and nitroprusside were administered, with the heart rate constant at 90 beats/min. Multiple linear regression analysis showed that the aortic dicrotic notch pressure (AoDNP) and T were major determinants of IRP, while left ventricular end-diastolic pressure was a minor determinant. Multiple linear regression analysis, correlating T with IRP and AoDNP, did not further improve the correlation coefficient compared with that between T and IRP. We concluded that correction of the IRP by AoDNP is not necessary to predict T by additional multiple linear regression. The effects of ascending aortic constriction or angiotensin II on IRP were examined in five dogs, after pretreatment with propranolol. Aortic constriction caused a significant decrease in IRP and T, while angiotensin II produced a significant increase in IRP and T. IRP was affected by the change of afterload. However, the IRP and T values were always altered in the same direction. These results demonstrate that IRP can be substituted for T and that it reflects left ventricular relaxation even under different afterload conditions. We conclude that IRP is a simple parameter that can easily be used to evaluate left ventricular relaxation in clinical situations.
Ruan, Xiaofang; Zhang, Ruisheng; Yao, Xiaojun; Liu, Mancang; Fan, Botao
2007-03-01
Alkylphenols are a group of persistent pollutants in the environment and could adversely disturb the human endocrine system. It is therefore important to effectively separate and measure the alkylphenols. To guide the chromatographic analysis of these compounds in practice, the development of a quantitative relationship between the molecular structure and the retention time of alkylphenols becomes necessary. In this study, topological, constitutional, geometrical, electrostatic and quantum-chemical descriptors of 44 alkylphenols were calculated using the CODESSA software, and these descriptors were pre-selected using the heuristic method. As a result, a three-descriptor linear model (LM) was developed to describe the relationship between the molecular structure and the retention time of alkylphenols. Meanwhile, a non-linear regression model was also developed based on support vector machine (SVM) using the same three descriptors. The correlation coefficient (R²) for the LM and SVM was 0.98 and 0.92, and the corresponding root-mean-square error was 0.99 and 2.77, respectively. By comparing the stability and prediction ability of the two models, it was found that the linear model was the better method for describing the quantitative relationship between the retention time of alkylphenols and the molecular structure. The results obtained suggested that the linear model could be applied to the chromatographic analysis of alkylphenols with known molecular structural parameters.
NASA Astrophysics Data System (ADS)
Villas Boas, M. D.; Olivera, F.; Azevedo, J. S.
2013-12-01
The evaluation of water quality through 'indexes' is widely used in environmental sciences. There are a number of methods available for calculating water quality indexes (WQI), usually based on site-specific parameters. In Brazil, WQI were initially used in the 1970s and were adapted from the methodology developed in association with the National Science Foundation (Brown et al., 1970). Specifically, the WQI 'IQA/SCQA', developed by the Institute of Water Management of Minas Gerais (IGAM), is estimated based on nine parameters: Temperature Range, Biochemical Oxygen Demand, Fecal Coliforms, Nitrate, Phosphate, Turbidity, Dissolved Oxygen, pH and Electrical Conductivity. The goal of this study was to develop a model for calculating the IQA/SCQA, for the Piabanha River basin in the State of Rio de Janeiro (Brazil), using only the parameters measurable by a Multiparameter Water Quality Sonde (MWQS) available in the study area. These parameters are: Dissolved Oxygen, pH and Electrical Conductivity. The use of this model will make it possible to extend the water quality monitoring network in the basin without requiring significant increases in resources, because water quality measurement with a MWQS is less expensive than the laboratory analyses required for the other parameters. The water quality data used in the study were obtained by the Geological Survey of Brazil in partnership with other public institutions (i.e. universities and environmental institutes) as part of the project "Integrated Studies in Experimental and Representative Watersheds". Two models were developed to correlate the values of the three measured parameters and the IQA/SCQA values calculated based on all nine parameters. The results were evaluated according to the following validation statistics: coefficient of determination (R2), Root Mean Square Error (RMSE), Akaike information criterion (AIC) and Final Prediction Error (FPE). The first model was a linear stepwise regression between three independent variables (input) and one dependent variable (output) to establish an equation relating input to output. This model produced the following statistics: R2 = 0.85, RMSE = 6.19, AIC = 0.65 and FPE = 1.93. The second model was a feedforward neural network with one tan-sigmoid hidden layer (4 neurons) and one linear output layer. The neural network was trained with a backpropagation algorithm using the input as predictors and the output as target. The following statistics were found: R2 = 0.95, RMSE = 4.86, AIC = 0.33 and FPE = 1.39. The second model produced a better fit than the first one, having a greater R2 and smaller RMSE, AIC and FPE. The better performance of the second model can be attributed to the fact that water quality parameters often exhibit nonlinear behaviors and neural networks are capable of representing nonlinear relationships efficiently, while the regression is limited to linear relationships. References: Brown, R.M., McLelland, N.I., Deininger, R.A., Tozer, R.G. 1970. A Water Quality Index - Do we dare? Water & Sewage Works, October: 339-343.
Effect of Stress Corrosion and Cyclic Fatigue on Fluorapatite Glass-Ceramic
NASA Astrophysics Data System (ADS)
Joshi, Gaurav V.
2011-12-01
Objective: The objective of this study was to test the following hypotheses: 1. Both cyclic degradation and stress corrosion mechanisms result in subcritical crack growth in a fluorapatite glass-ceramic. 2. There is an interactive effect of stress corrosion and cyclic fatigue to cause subcritical crack growth (SCG) for this material. 3. The material that exhibits rising toughness curve (R-curve) behavior also exhibits a cyclic degradation mechanism. Materials and Methods: The material tested was a fluorapatite glass-ceramic (IPS e.max ZirPress, Ivoclar-Vivadent). Rectangular beam specimens with dimensions of 25 mm x 4 mm x 1.2 mm were fabricated using the press-on technique. Two groups of specimens (N=30) with polished (15 μm) or air abraded surface were tested under rapid monotonic loading. Additional polished specimens were subjected to cyclic loading at two frequencies, 2 Hz (N=44) and 10 Hz (N=36), and at different stress amplitudes. All tests were performed using a fully articulating four-point flexure fixture in deionized water at 37°C. The SCG parameters were determined by using a statistical approach by Munz and Fett (1999). The fatigue lifetime data were fit to a general log-linear model in ALTA PRO software (Reliasoft). Fractographic techniques were used to determine the critical flaw sizes to estimate fracture toughness. To determine the presence of R-curve behavior, non-linear regression was used. Results: Increasing the frequency of cycling did not cause a significant decrease in lifetime. The parameters of the general log-linear model showed that only stress corrosion has a significant effect on lifetime. The parameters are presented in the following table.* SCG parameters (n=19--21) were similar for both frequencies. The regression model showed that the fracture toughness was significantly dependent (p<0.05) on critical flaw size. Conclusions: 1. Cyclic fatigue does not have a significant effect on the SCG in the fluorapatite glass-ceramic IPS e.max ZirPress. 2. There was no interactive effect between cyclic degradation and stress corrosion for this material. 3. The material exhibited a low level of R-curve behavior. It did not exhibit cyclic degradation. *Please refer to dissertation for table.
Lewan, M.D.; Ruble, T.E.
2002-01-01
This study compares kinetic parameters determined by open-system pyrolysis and hydrous pyrolysis using aliquots of source rocks containing different kerogen types. Kinetic parameters derived from these two pyrolysis methods not only differ in the conditions employed and products generated, but also in the derivation of the kinetic parameters (i.e., isothermal linear regression and non-isothermal nonlinear regression). Results of this comparative study show that there is no correlation between kinetic parameters derived from hydrous pyrolysis and open-system pyrolysis. Hydrous-pyrolysis kinetic parameters determine narrow oil windows that occur over a wide range of temperatures and depths depending in part on the organic-sulfur content of the original kerogen. Conversely, open-system kinetic parameters determine broad oil windows that show no significant differences with kerogen types or their organic-sulfur contents. Comparisons of the kinetic parameters in a hypothetical thermal-burial history (2.5 °C/my) show open-system kinetic parameters significantly underestimate the extent and timing of oil generation for Type-IIS kerogen and significantly overestimate the extent and timing of petroleum formation for Type-I kerogen compared to hydrous pyrolysis kinetic parameters. These hypothetical differences determined by the kinetic parameters are supported by natural thermal-burial histories for the Naokelekan source rock (Type-IIS kerogen) in the Zagros basin of Iraq and for the Green River Formation (Type-I kerogen) in the Uinta basin of Utah. Differences in extent and timing of oil generation determined by open-system pyrolysis and hydrous pyrolysis can be attributed to the former not adequately simulating natural oil generation conditions, products, and mechanisms.
Van Looy, Stijn; Verplancke, Thierry; Benoit, Dominique; Hoste, Eric; Van Maele, Georges; De Turck, Filip; Decruyenaere, Johan
2007-01-01
Tacrolimus is an important immunosuppressive drug for organ transplantation patients. It has a narrow therapeutic range, toxic side effects, and a blood concentration with wide intra- and interindividual variability. Hence, it is of the utmost importance to monitor tacrolimus blood concentration, thereby ensuring clinical effect and avoiding toxic side effects. Prediction models for tacrolimus blood concentration can improve clinical care by optimizing monitoring of these concentrations, especially in the initial phase after transplantation during intensive care unit (ICU) stay. This is the first study in the ICU in which support vector machines, as a new data modeling technique, are investigated and tested in their prediction capabilities of tacrolimus blood concentration. Linear support vector regression (SVR) and nonlinear radial basis function (RBF) SVR are compared with multiple linear regression (MLR). Tacrolimus blood concentrations, together with 35 other relevant variables from 50 liver transplantation patients, were extracted from our ICU database. This resulted in a dataset of 457 blood samples, on average between 9 and 10 samples per patient, finally resulting in a database of more than 16,000 data values. Nonlinear RBF SVR, linear SVR, and MLR were performed after selection of clinically relevant input variables and model parameters. Differences between observed and predicted tacrolimus blood concentrations were calculated. Prediction accuracy of the three methods was compared after fivefold cross-validation (Friedman test and Wilcoxon signed rank analysis). Linear SVR and nonlinear RBF SVR had mean absolute differences between observed and predicted tacrolimus blood concentrations of 2.31 ng/ml (standard deviation [SD] 2.47) and 2.38 ng/ml (SD 2.49), respectively. MLR had a mean absolute difference of 2.73 ng/ml (SD 3.79). The difference between linear SVR and MLR was statistically significant (p < 0.001). RBF SVR had the advantage of requiring only 2 input variables to perform this prediction in comparison to 15 and 16 variables needed by linear SVR and MLR, respectively. This is an indication of the superior prediction capability of nonlinear SVR. Prediction of tacrolimus blood concentration with linear and nonlinear SVR was excellent, and accuracy was superior in comparison with an MLR model.
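A hedged sketch of the model comparison described above follows: linear SVR, RBF SVR and multiple linear regression are scored by mean absolute error under five-fold cross-validation. The simulated predictors and concentrations, and the SVR settings (C=1.0), are assumptions for illustration only, not the study's selected clinical variables or tuned parameters.

```python
# Sketch: linear SVR vs. RBF SVR vs. multiple linear regression, compared by
# 5-fold cross-validated mean absolute error on simulated placeholder data.
import numpy as np
from sklearn.svm import SVR
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(457, 15))               # 457 samples, 15 candidate inputs
y = 5 + X[:, 0] - 0.5 * X[:, 1] ** 2 + rng.normal(scale=2.0, size=457)

models = {
    "linear SVR": make_pipeline(StandardScaler(), SVR(kernel="linear", C=1.0)),
    "RBF SVR":    make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0)),
    "MLR":        LinearRegression(),
}
for name, model in models.items():
    mae = -cross_val_score(model, X, y, cv=5,
                           scoring="neg_mean_absolute_error").mean()
    print(f"{name}: mean absolute error = {mae:.2f} ng/ml")
```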
Optimal Estimation of Clock Values and Trends from Finite Data
NASA Technical Reports Server (NTRS)
Greenhall, Charles
2005-01-01
We show how to solve two problems of optimal linear estimation from a finite set of phase data. Clock noise is modeled as a stochastic process with stationary dth increments. The covariance properties of such a process are contained in the generalized autocovariance function (GACV). We set up two principles for optimal estimation: with the help of the GACV, these principles lead to a set of linear equations for the regression coefficients and some auxiliary parameters. The mean square errors of the estimators are easily calculated. The method can be used to check the results of other methods and to find good suboptimal estimators based on a small subset of the available data.
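The "set of linear equations for the regression coefficients" mentioned above takes, in the simplest setting, the familiar generalized-least-squares form. The following is an illustrative sketch only, assuming the phase data are stacked in a vector y with design matrix X and that the GACV supplies a covariance matrix C of the noise; the paper's own equations involve auxiliary parameters and differ in detail.

```latex
\hat{\beta} = \left(X^{\mathsf T} C^{-1} X\right)^{-1} X^{\mathsf T} C^{-1} y,
\qquad
\operatorname{Cov}(\hat{\beta}) = \left(X^{\mathsf T} C^{-1} X\right)^{-1}.
```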
Fang, Lu; Ellims, Andris H; Beale, Anna L; Taylor, Andrew J; Murphy, Andrew; Dart, Anthony M
2017-01-01
Background: Regional or diffuse fibrosis is an early feature of hypertrophic cardiomyopathy (HCM) and is related to poor prognosis. Previous studies have documented low-grade inflammation in HCM. The aim of this study was to examine the relationships between circulating inflammatory markers and myocardial fibrosis, systolic and diastolic dysfunction, and the degree of cardiac hypertrophy in HCM patients. Methods and results: Fifty HCM patients were recruited, while 20 healthy subjects served as the control group. Seventeen inflammatory cytokines/chemokines were measured in plasma. Cardiac magnetic resonance imaging and echocardiography were used to assess cardiac phenotypes. Tumour necrosis factor (TNF)-α, interleukin (IL)-6 and serum amyloid P (SAP) were significantly increased in HCM patients compared to controls. IL-6, IL-4, and monocyte chemotactic protein (MCP)-1 were correlated with regional fibrosis, while stromal cell-derived factor-1 and MCP-1 were correlated with diffuse fibrosis. Fractalkine and interferon-γ were associated with left ventricular wall thickness. The above associations remained significant in a linear regression model including age, gender, body mass index and family history. TNF-α, IL-6, SAP, MCP-1 and IL-10 were associated with parameters of diastolic dysfunction. White blood cells were also increased in HCM patients and correlated with diffuse fibrosis and diastolic dysfunction. However, the associations between parameters of systemic inflammation and diastolic dysfunction were weakened in the linear regression analysis. Conclusions: Systemic inflammation is associated with parameters of disease severity in HCM patients, particularly regional and diffuse fibrosis. Modifying inflammation may reduce myocardial fibrosis in HCM patients. PMID:29218105
Nakamura, Kengo; Yasutaka, Tetsuo; Kuwatani, Tatsu; Komai, Takeshi
2017-11-01
In this study, we applied sparse multiple linear regression (SMLR) analysis to clarify the relationships between soil properties and adsorption characteristics for a range of soils across Japan, and to identify easily obtained physical and chemical soil properties that could be used to predict the K and n values of cadmium, lead and fluorine. A model was first constructed that can easily predict the K and n values from nine soil parameters (pH, cation exchange capacity, specific surface area, total carbon, soil organic matter from loss on ignition, water holding capacity, and the proportions of sand, silt and clay). The K and n values of cadmium, lead and fluorine of 17 soil samples were used to verify the SMLR models by the root mean square error values obtained from 512 combinations of soil parameters. The SMLR analysis indicated that fluorine adsorption to soil may be associated with organic matter, whereas cadmium or lead adsorption to soil is more likely to be influenced by soil pH and loss on ignition (IL). We found that an accurate K value can be predicted from more than three soil parameters for most soils. Approximately 65% of the predicted values were between 33 and 300% of their measured values for the K value; 76% of the predicted values were within ±30% of their measured values for the n value. Our findings suggest that the adsorption properties of lead, cadmium and fluorine to soil can be predicted from soil physical and chemical properties using the presented models. Copyright © 2017 Elsevier Ltd. All rights reserved.
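A minimal sketch of a sparse multiple linear regression follows, using an L1-penalized (Lasso) fit as a stand-in for the SMLR of the study; the nine descriptor names mirror the list above, but all data, the choice of log K as response, and the simulated coefficients are hypothetical.

```python
# Sketch of a sparse multiple linear regression (Lasso as a stand-in for SMLR)
# predicting log K from nine soil descriptors. All data are simulated.
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)
names = ["pH", "CEC", "SSA", "total_C", "LOI", "WHC", "sand", "silt", "clay"]
X = rng.normal(size=(100, len(names)))
log_K = 1.5 - 0.8 * X[:, 0] + 0.5 * X[:, 4] + rng.normal(0, 0.3, 100)  # pH, LOI drive K here

model = make_pipeline(StandardScaler(), LassoCV(cv=5))
model.fit(X, log_K)

for name, coef in zip(names, model.named_steps["lassocv"].coef_):
    if abs(coef) > 1e-8:                    # the sparse penalty zeroes the rest
        print(f"{name}: {coef:+.3f}")
```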
Application of LANDSAT to the Surveillance and Control of Eutrophication in Saginaw Bay
NASA Technical Reports Server (NTRS)
Rogers, R. H. (Principal Investigator)
1975-01-01
The author has identified the following significant results. LANDSAT digital data and ground truth measurements for Saginaw Bay (Lake Huron), Michigan, for 3 June 1974 can be correlated by stepwise linear regression technique and the resulting equations used to estimate invisible water quality parameters in nonsampled areas. Correlation of these parameters with each other indicates that the transport of Saginaw River water can now be traced by a number of water quality features, one or more of which are directly detected by LANDSAT. Five of the 12 water quality parameters are best correlated with LANDSAT band 6 measurements alone. One parameter (temperature) relates to band 5 alone and the remaining six may be predicted with varying degrees of accuracy from a combination of two bands (first band 6 and generally band 4 second).
Wang, D Z; Wang, C; Shen, C F; Zhang, Y; Zhang, H; Song, G D; Xue, X D; Xu, Z L; Zhang, S; Jiang, G H
2017-05-10
We described the time trend of the acute myocardial infarction (AMI) incidence rate in Tianjin from 1999 to 2013 using the Cochran-Armitage trend (CAT) test and linear regression analysis, and compared the results. Based on the actual population, the CAT test had much stronger statistical power than linear regression analysis for both the overall incidence trend and the age-specific incidence trends (Cochran-Armitage trend P value
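A minimal sketch of the two approaches being compared is given below: the standard Cochran-Armitage trend statistic computed directly from yearly case counts, against a linear regression of the yearly rates on calendar year. The case counts and population sizes are hypothetical placeholders, not the Tianjin registry data.

```python
# Sketch: Cochran-Armitage trend test vs. linear regression of yearly rates.
# Case counts and person-years below are hypothetical illustrations.
import numpy as np
from scipy import stats

years = np.arange(1999, 2014)                      # ordered scores t_i
cases = np.array([310, 325, 340, 352, 371, 390, 402, 425,
                  441, 460, 478, 495, 512, 530, 551])
pop = np.full(years.size, 1_000_000)               # person-years at risk

# Cochran-Armitage trend test for proportions across ordered groups
p_bar = cases.sum() / pop.sum()
t = years - years.mean()                           # centred scores
T = np.sum(t * (cases - pop * p_bar))
var_T = p_bar * (1 - p_bar) * (np.sum(pop * t**2) - np.sum(pop * t)**2 / pop.sum())
z = T / np.sqrt(var_T)
p_cat = 2 * stats.norm.sf(abs(z))

# Linear regression of the incidence rate (per 100,000) on year
rate = cases / pop * 100_000
lin = stats.linregress(years, rate)

print(f"Cochran-Armitage: z = {z:.2f}, p = {p_cat:.2e}")
print(f"Linear regression: slope = {lin.slope:.2f} per 100,000 per year, p = {lin.pvalue:.2e}")
```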
NASA Astrophysics Data System (ADS)
Kawase, H.; Nakano, K.
2015-12-01
We investigated the characteristics of strong ground motions separated from acceleration Fourier spectra and 5%-damped acceleration response spectra calculated from weak and moderate ground motions observed by K-NET, KiK-net, and the JMA Shindokei Network in Japan, using the generalized spectral inversion method. The separation used the outcrop motions at YMGH01 as the reference site, where we extracted site responses due to shallow weathered layers. We included events with JMA magnitude equal to or larger than 4.5 observed from 1996 to 2011. We find that our frequency-dependent Q values are comparable to those of previous studies. From the corner frequencies of the Fourier source spectra, we calculated Brune's stress parameters and found a clear magnitude dependence, in which smaller events tend to spread over a wider range while maintaining the same maximum value. We confirm that this is exactly the case for several mainshock-aftershock sequences. The average stress parameters for crustal earthquakes are much smaller than those of subduction-zone earthquakes, which can be explained by their depth dependence. We then compared the strong-motion characteristics based on the acceleration response spectra and found that the separated characteristics of strong ground motions are different, especially in the frequency range below 1 Hz. These differences come from the difference between Fourier spectra and response spectra found in the observed data; that is, predominant components in the high-frequency range of the Fourier spectra increase the response in the lower-frequency range, where the Fourier amplitude is small, because a strong high-frequency component acts as an impulse on a single-degree-of-freedom system. After separating the source terms of the 5%-damped response spectra, we obtain regression coefficients with respect to magnitude, which lead to a new GMPE as shown in Fig. 1 on the left. Although stress drops for inland earthquakes are 1/7 of those for subduction-zone earthquakes, the linear regression works quite well. After this linear regression, we correlate the residuals with Brune's stress parameters of the corresponding events, as shown in Fig. 1 on the right for the 1 Hz case. We found quite good linear correlation, which makes the aleatoric uncertainty 40% to 60% smaller than the original.
Yang, Ruiqi; Wang, Fei; Zhang, Jialing; Zhu, Chonglei; Fan, Limei
2015-05-19
To establish reference values for the thalamus, caudate nucleus and lenticular nucleus diameters through the fetal thalamic transverse section. A total of 265 fetuses at our hospital were randomly selected from November 2012 to August 2014, and the transverse and length diameters of the thalamus, caudate nucleus and lenticular nucleus were measured. SPSS 19.0 statistical software was used to calculate the regression curves relating the fetal diameters to gestational week. P < 0.05 was considered statistically significant. The linear regression equation of fetal thalamic length diameter and gestational week was Y = 0.051X + 0.201, R = 0.876; of thalamic transverse diameter and gestational week, Y = 0.031X + 0.229, R = 0.817; of the length diameter of the head of the caudate nucleus and gestational age, Y = 0.033X + 0.101, R = 0.722; of the transverse diameter of the head of the caudate nucleus and gestational week, Y = 0.025X - 0.046, R = 0.711; of lentiform nucleus length diameter and gestational week, Y = 0.046X + 0.229, R = 0.765; and of lentiform nucleus transverse diameter and gestational week, Y = 0.025X - 0.05, R = 0.772. Ultrasonic measurement of the diameters of the fetal thalamus, caudate nucleus and lenticular nucleus through the thalamic transverse section is simple and convenient. The measurements increase with gestational week, and there is a linear regression relationship between them.
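The reported equations all follow the simple form Y = aX + b with X the gestational week; as a sketch of how one such equation and its correlation coefficient would be obtained, the snippet below fits a line to hypothetical measurements (not the study's data).

```python
# Sketch: fit Y = aX + b for one fetal diameter against gestational week.
# The measurements below are hypothetical illustrations, not study data.
import numpy as np
from scipy import stats

gest_week = np.array([20, 22, 24, 26, 28, 30, 32, 34, 36, 38])
thalamus_length_cm = np.array([1.22, 1.33, 1.41, 1.52, 1.63,
                               1.74, 1.85, 1.92, 2.05, 2.14])

res = stats.linregress(gest_week, thalamus_length_cm)
print(f"Y = {res.slope:.3f}X + {res.intercept:.3f}, R = {res.rvalue:.3f}")
```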
Local Linear Regression for Data with AR Errors.
Li, Runze; Li, Yan
2009-07-01
In many statistical applications, data are collected over time, and they are likely correlated. In this paper, we investigate how to incorporate the correlation information into local linear regression. Under the assumption that the error process is an autoregressive process, a new estimation procedure is proposed for the nonparametric regression by using the local linear regression method and profile least squares techniques. We further propose the SCAD-penalized profile least squares method to determine the order of the autoregressive process. Extensive Monte Carlo simulation studies are conducted to examine the finite sample performance of the proposed procedure, and to compare the performance of the proposed procedures with the existing one. From our empirical studies, the newly proposed procedures can dramatically improve the accuracy of naive local linear regression with a working-independence error structure. We illustrate the proposed methodology with an analysis of a real data set.
Orthogonal Regression: A Teaching Perspective
ERIC Educational Resources Information Center
Carr, James R.
2012-01-01
A well-known approach to linear least squares regression is that which involves minimizing the sum of squared orthogonal projections of data points onto the best fit line. This form of regression is known as orthogonal regression, and the linear model that it yields is known as the major axis. A similar method, reduced major axis regression, is…
Deciphering factors controlling groundwater arsenic spatial variability in Bangladesh
NASA Astrophysics Data System (ADS)
Tan, Z.; Yang, Q.; Zheng, C.; Zheng, Y.
2017-12-01
Elevated concentrations of geogenic arsenic in groundwater have been found in many countries to exceed 10 μg/L, the WHO's guideline value for drinking water. A common yet unexplained characteristic of groundwater arsenic spatial distribution is the extensive variability at various spatial scales. This study investigates factors influencing the spatial variability of groundwater arsenic in Bangladesh to improve the accuracy of models predicting arsenic exceedance rate spatially. A novel boosted regression tree method is used to establish a weak-learning ensemble model, which is compared to a linear model using a conventional stepwise logistic regression method. Compared with logistic regression, the boosted regression tree models offer the advantage of capturing parameter interactions when big datasets are analyzed. The point data set (n=3,538) of groundwater hydrochemistry with 19 parameters was obtained by the British Geological Survey in 2001. The spatial data sets of geological parameters (n=13) were from the Consortium for Spatial Information, Technical University of Denmark, University of East Anglia and the FAO, while the soil parameters (n=42) were from the Harmonized World Soil Database. The aforementioned parameters were regressed against categorical groundwater arsenic concentrations below or above three thresholds: 5 μg/L, 10 μg/L and 50 μg/L, to identify the respective controlling factors. The boosted regression tree method outperformed the logistic regression method at all three threshold levels in terms of accuracy, specificity and sensitivity, resulting in improved spatial distribution maps of the probability of groundwater arsenic exceeding all three thresholds when compared to a disjunctive-kriging interpolated spatial arsenic map using the same groundwater arsenic dataset. The boosted regression tree models also show that the most important controlling factors of the groundwater arsenic distribution include groundwater iron content and well depth for all three thresholds. The probability that a well with iron content higher than 5 mg/L contains more than 5 μg/L, 10 μg/L and 50 μg/L As is estimated to be more than 91%, 85% and 51%, respectively, while the probability that a well deeper than 160 m contains more than 5 μg/L, 10 μg/L and 50 μg/L As is estimated to be less than 38%, 25% and 14%, respectively.
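The comparison described above can be sketched as follows: a gradient-boosted tree classifier versus logistic regression, both scored by cross-validated AUC for exceedance of the 10 μg/L threshold. The simulated wells, the chosen predictors (iron, depth, pH) and the boosting settings are hypothetical stand-ins, not the study's datasets or tuned models.

```python
# Sketch: boosted regression trees vs. logistic regression for classifying
# wells above/below the 10 ug/L arsenic threshold. Data are simulated.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
n = 1000
iron = rng.lognormal(0.5, 1.0, n)            # mg/L
depth = rng.uniform(10, 300, n)              # m
ph = rng.normal(7.0, 0.5, n)
lin_pred = -1.0 + 0.6 * np.log(iron) - 0.015 * depth + 0.3 * (ph - 7)
exceeds_10 = rng.binomial(1, 1 / (1 + np.exp(-lin_pred)))

X = np.column_stack([iron, depth, ph])
brt = GradientBoostingClassifier(n_estimators=500, learning_rate=0.01,
                                 max_depth=3, random_state=0)
lr = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

for name, model in [("boosted regression trees", brt), ("logistic regression", lr)]:
    auc = cross_val_score(model, X, exceeds_10, cv=5, scoring="roc_auc").mean()
    print(f"{name}: cross-validated AUC = {auc:.3f}")
```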
2013-01-01
Background Peripheral artery disease (PAD) represents atherosclerotic disease and is a risk factor for death in peritoneal dialysis (PD) patients, who tend to show an atherogenic lipid profile. In this study, we investigated the relationship between lipid profile and ankle-brachial index (ABI) as an index of atherosclerosis in PD patients with controlled serum low-density lipoprotein (LDL) cholesterol level. Methods Thirty-five PD patients, whose serum LDL cholesterol level was controlled at less than 120 mg/dl, were enrolled in this cross-sectional study in Japan. The proportions of cholesterol level to total cholesterol level (cholesterol proportion) in 20 lipoprotein fractions and the mean size of lipoprotein particles were measured using an improved method, namely, high-performance gel permeation chromatography. Multivariate linear regression analysis was adjusted for diabetes mellitus and cardiovascular and/or cerebrovascular diseases. Results The mean (standard deviation) age was 61.6 (10.5) years; PD vintage, 38.5 (28.1) months; ABI, 1.07 (0.22). A low ABI (0.9 or lower) was observed in 7 patients (low-ABI group). The low-ABI group showed significantly higher cholesterol proportions in the chylomicron fraction and large very-low-density lipoproteins (VLDLs) (Fractions 3–5) than the high-ABI group (ABI>0.9). Adjusted multivariate linear regression analysis showed that ABI was negatively associated with serum VLDL cholesterol level (parameter estimate=-0.00566, p=0.0074); the cholesterol proportions in large VLDLs (Fraction 4, parameter estimate=-3.82, p=0.038; Fraction 5, parameter estimate=-3.62, p=0.0039) and medium VLDL (Fraction 6, parameter estimate=-3.25, p=0.014); and the size of VLDL particles (parameter estimate=-0.0352, p=0.032). Conclusions This study showed that the characteristics of VLDL particles were associated with ABI among PD patients. Lowering serum VLDL level may be an effective therapy against atherosclerosis in PD patients after the control of serum LDL cholesterol level. PMID:24093487
NASA Astrophysics Data System (ADS)
Lucifredi, A.; Mazzieri, C.; Rossi, M.
2000-05-01
Since the operational conditions of a hydroelectric unit can vary within a wide range, the monitoring system must be able to distinguish between variations of the monitored variable caused by changes in the operating conditions and those due to the onset and progression of failures and malfunctions. The paper aims to identify the best technique to be adopted for the monitoring system. Three different methods have been implemented and compared. Two of them use statistical techniques: the first, linear multiple regression, expresses the monitored variable as a linear function of the process parameters (independent variables), while the second, the dynamic kriging technique, is a modified multiple linear regression that represents the monitored variable as a linear combination of the process variables in such a way as to minimize the variance of the estimation error. The third is based on neural networks. Tests have shown that the monitoring system based on the kriging technique is not affected by some problems common to the other two models, for example the requirement of a large amount of data for tuning (both for training the neural network and for defining the optimum plane for the multiple regression), not only in the system start-up phase but also after a trivial maintenance operation involving the substitution of machinery components that directly affect the observed variable, or the need for different models to describe satisfactorily the different operating ranges of the plant. The monitoring system based on the kriging statistical technique overcomes these difficulties: it does not require a large amount of data to be tuned and is immediately operational (given two points, the third can be estimated immediately); in addition, the model follows the system without adapting itself to it. The results of the experiments performed seem to indicate that a model based on a neural network or on linear multiple regression is not optimal, and that a different approach is necessary to reduce the amount of work during the learning phase by using, when available, all the information stored during the initial phase of the plant to build the reference baseline, elaborating the raw information where appropriate. A mixed approach using the kriging statistical technique and neural network techniques could optimise the result.
Practical Session: Simple Linear Regression
NASA Astrophysics Data System (ADS)
Clausel, M.; Grégoire, G.
2014-12-01
Two exercises are proposed to illustrate simple linear regression. The first one is based on Galton's famous data set on heredity. We use the lm R command and get coefficient estimates, the standard error of the error, R2, residuals, etc. In the second example, devoted to data on the vapor tension of mercury, we fit a simple linear regression, predict values, and anticipate multiple linear regression. This practical session is an excerpt from practical exercises proposed by A. Dalalyan at ENPC (see Exercises 1 and 2 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_4.pdf).
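For readers working in Python rather than R, the first exercise can be mirrored with statsmodels; the height pairs below are hypothetical values in the spirit of Galton's data, not the original data set, and the snippet is only an analogue of the lm call used in the session.

```python
# Python analogue of the R `lm` fit used in the practical session.
# The height pairs below are hypothetical, in the spirit of Galton's data.
import numpy as np
import statsmodels.api as sm

parent = np.array([64.0, 65.5, 66.0, 67.0, 68.0, 69.0, 70.5, 71.0, 72.0, 73.0])
child  = np.array([65.0, 65.8, 66.4, 66.9, 67.8, 68.2, 69.0, 69.3, 70.1, 70.4])

X = sm.add_constant(parent)              # intercept + slope
fit = sm.OLS(child, X).fit()
print(fit.summary())                     # coefficients, standard errors, R^2, residuals
```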
NASA Astrophysics Data System (ADS)
Salmon, B. P.; Kleynhans, W.; Olivier, J. C.; van den Bergh, F.; Wessels, K. J.
2018-05-01
Humans are transforming land cover at an ever-increasing rate. Accurate geographical maps of land cover, especially of rural and urban settlements, are essential to planning sustainable development. Time series extracted from MODerate resolution Imaging Spectroradiometer (MODIS) land surface reflectance products have been used to differentiate land cover classes by analyzing the seasonal patterns in reflectance values. Properly fitting a parametric model to these time series usually requires several adjustments to the regression method, so to reduce the workload the regression method is typically given a single global parameter setting for a geographical area. In this work we have modified a meta-optimization approach so that the regression method is configured and the parameters are extracted on a per-time-series basis. The standard deviation of the model parameters and the magnitude of the residuals are used as the scoring function. We successfully fitted a triply modulated model to the seasonal patterns of our study area using a non-linear extended Kalman filter (EKF). The approach uses temporal information, which significantly reduces the processing time and storage requirements for each time series, and it derives reliability metrics for each time series individually. The features extracted using the proposed method are classified with a support vector machine, and the performance of the method is compared to the original approach on our ground truth data.
Priors in Whole-Genome Regression: The Bayesian Alphabet Returns
Gianola, Daniel
2013-01-01
Whole-genome enabled prediction of complex traits has received enormous attention in animal and plant breeding and is making inroads into human and even Drosophila genetics. The term “Bayesian alphabet” denotes a growing number of letters of the alphabet used to denote various Bayesian linear regressions that differ in the priors adopted, while sharing the same sampling model. We explore the role of the prior distribution in whole-genome regression models for dissecting complex traits in what is now a standard situation with genomic data where the number of unknown parameters (p) typically exceeds sample size (n). Members of the alphabet aim to confront this overparameterization in various manners, but it is shown here that the prior is always influential, unless n ≫ p. This happens because parameters are not likelihood identified, so Bayesian learning is imperfect. Since inferences are not devoid of the influence of the prior, claims about genetic architecture from these methods should be taken with caution. However, all such procedures may deliver reasonable predictions of complex traits, provided that some parameters (“tuning knobs”) are assessed via a properly conducted cross-validation. It is concluded that members of the alphabet have a role in whole-genome prediction of phenotypes, but have somewhat doubtful inferential value, at least when sample size is such that n ≪ p. PMID:23636739
Morse Code, Scrabble, and the Alphabet
ERIC Educational Resources Information Center
Richardson, Mary; Gabrosek, John; Reischman, Diann; Curtiss, Phyliss
2004-01-01
In this paper we describe an interactive activity that illustrates simple linear regression. Students collect data and analyze it using simple linear regression techniques taught in an introductory applied statistics course. The activity is extended to illustrate checks for regression assumptions and regression diagnostics taught in an…
Syrengelas, Dimitrios; Kalampoki, Vassiliki; Kleisiouni, Paraskevi; Konstantinou, Dimitrios; Siahanidou, Tania
2014-07-01
The aims of this study were to investigate gross motor development in Greek infants using the Alberta Infant Motor Scale (AIMS), to establish AIMS percentile curves, and to examine possible associations of AIMS scores with socioeconomic parameters. Mean AIMS scores of 1068 healthy Greek full-term infants were compared at each monthly age level with the respective mean scores of the Canadian normative sample. In a subgroup of 345 study participants, parents provided, via interview, information about family socioeconomic status. Multiple linear regression analysis was performed to evaluate the relationship of infant motor development with socioeconomic parameters. Mean AIMS scores did not differ significantly between Greek and Canadian infants at any of the 19 monthly age levels. In the multiple linear regression analysis, the educational level of the mother and whether the infant was being raised by grandparents/babysitter were significantly associated with gross motor development (p=0.02 and p<0.001, respectively), whereas there was no significant correlation of mean AIMS scores with gender, birth order, maternal age, paternal educational level or family monthly income. Gross motor development of healthy Greek full-term infants, assessed by the AIMS during the first 19 months of age, follows a similar course to that of the original Canadian sample. Specific socioeconomic factors are associated with the infants' motor development. Copyright © 2014 Elsevier Ltd. All rights reserved.
Zang, Qing-Ce; Wang, Jia-Bo; Kong, Wei-Jun; Jin, Cheng; Ma, Zhi-Jie; Chen, Jing; Gong, Qian-Feng; Xiao, Xiao-He
2011-12-01
The fingerprints of artificial Calculus bovis extracts from different solvents were established by ultra-performance liquid chromatography (UPLC), and the anti-bacterial activities of artificial C. bovis extracts on Staphylococcus aureus (S. aureus) growth were studied by microcalorimetry. The UPLC fingerprints were evaluated using hierarchical clustering analysis. Quantitative parameters obtained from the thermogenic curves of S. aureus growth affected by artificial C. bovis extracts were analyzed using principal component analysis. The spectrum-effect relationships between the UPLC fingerprints and the anti-bacterial activities were investigated using multi-linear regression analysis. The results showed that peak 1 (sodium taurocholate), peak 3 (unknown compound), peak 4 (cholic acid), and peak 6 (chenodeoxycholic acid) are more significant than the other peaks, with standardized parameter estimates of 0.453, -0.166, 0.749, and 0.025, respectively. Thus, cholic acid, sodium taurocholate, and chenodeoxycholic acid might be the major anti-bacterial components in artificial C. bovis. Altogether, this work provides a general model combining UPLC chromatography and anti-bacterial effect to study the spectrum-effect relationships of artificial C. bovis extracts, which can be used to discover the main anti-bacterial components in artificial C. bovis or other Chinese herbal medicines with anti-bacterial effects. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
NASA Astrophysics Data System (ADS)
Anick, David J.
2003-12-01
A method is described for a rapid prediction of B3LYP-optimized geometries for polyhedral water clusters (PWCs). Starting with a database of 121 B3LYP-optimized PWCs containing 2277 H-bonds, linear regressions yield formulas correlating O-O distances, O-O-O angles, and H-O-H orientation parameters, with local and global cluster descriptors. The formulas predict O-O distances with a rms error of 0.85 pm to 1.29 pm and predict O-O-O angles with a rms error of 0.6° to 2.2°. An algorithm is given which uses the O-O and O-O-O formulas to determine coordinates for the oxygen nuclei of a PWC. The H-O-H formulas then determine positions for two H's at each O. For 15 test clusters, the gap between the electronic energy of the predicted geometry and the true B3LYP optimum ranges from 0.11 to 0.54 kcal/mol or 4 to 18 cal/mol per H-bond. Linear regression also identifies 14 parameters that strongly correlate with PWC electronic energy. These descriptors include the number of H-bonds in which both oxygens carry a non-H-bonding H, the number of quadrilateral faces, the number of symmetric angles in 5- and in 6-sided faces, and the square of the cluster's estimated dipole moment.
A regression analysis of filler particle content to predict composite wear.
Jaarda, M J; Wang, R F; Lang, B R
1997-01-01
It has been hypothesized that composite wear is correlated with filler particle content, but there is a paucity of research to substantiate this theory despite numerous projects evaluating the correlation. The purpose of this study was to determine whether a linear relationship existed between composite wear and the filler particle content of 12 composites. In vivo wear data had been previously collected for the 12 composites and served as the basis for this study. Scanning electron microscopy and backscatter electron imaging were combined with digital imaging analysis to develop "profile maps" of the filler particle composition of the composites. These profile maps included eight parameters: (1) total number of filler particles per 28,742.6 μm²; (2) percent of area occupied by all filler particles; (3) mean filler particle size; (4) percent of area occupied by the matrix; (5) percent of area occupied by filler particles with radius r ≤ 1.0 μm; (6) percent of area occupied by filler particles with 1.0 < r ≤ 4.5 μm; (7) percent of area occupied by filler particles with 4.5 < r ≤ 10 μm; and (8) percent of area occupied by filler particles with r > 10 μm. Forward stepwise regression analyses were used with composite wear as the dependent variable and the eight parameters as independent variables. The results revealed a linear relationship between composite wear and the filler particle content. A mathematical formula was developed to predict composite wear.
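A minimal sketch of a forward stepwise regression over eight particle parameters follows; the stopping rule (improvement in adjusted R²), the simulated predictors p1..p8, and the wear values are illustrative assumptions, not the study's selection criterion or data.

```python
# Sketch of forward stepwise regression: at each step add the filler-particle
# parameter that most improves adjusted R^2 of the wear model. Data simulated.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 60
df = pd.DataFrame(rng.normal(size=(n, 8)), columns=[f"p{i}" for i in range(1, 9)])
df["wear"] = 50 - 6 * df["p2"] + 4 * df["p5"] + rng.normal(0, 3, n)

selected, remaining, best_adj_r2 = [], list(df.columns[:-1]), -np.inf
while remaining:
    scores = {c: sm.OLS(df["wear"], sm.add_constant(df[selected + [c]])).fit().rsquared_adj
              for c in remaining}
    best = max(scores, key=scores.get)
    if scores[best] <= best_adj_r2:          # stop when no candidate improves the fit
        break
    best_adj_r2, selected = scores[best], selected + [best]
    remaining.remove(best)

print("selected predictors:", selected, "| adjusted R^2:", round(best_adj_r2, 3))
```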
Advanced statistics: linear regression, part II: multiple linear regression.
Marill, Keith A
2004-01-01
The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.
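Two of the concepts named above, interaction effects and multicollinearity, can be illustrated with a short hedged sketch; the simulated variables (age, bmi, outcome) and the use of statsmodels are illustrative assumptions, not the article's own example.

```python
# Sketch: a multiple linear regression with an interaction term, plus a
# multicollinearity check via variance inflation factors (VIF). Data simulated.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(1)
n = 200
df = pd.DataFrame({"age": rng.normal(50, 10, n), "bmi": rng.normal(27, 4, n)})
df["outcome"] = (2 + 0.3 * df["age"] + 0.5 * df["bmi"]
                 + 0.02 * df["age"] * df["bmi"] + rng.normal(0, 5, n))

fit = smf.ols("outcome ~ age * bmi", data=df).fit()   # main effects + interaction
print(fit.params)

X = fit.model.exog                                    # design matrix incl. intercept
for i, name in enumerate(fit.model.exog_names):
    # large VIFs for the interaction terms show the multicollinearity issue
    print(name, "VIF =", round(variance_inflation_factor(X, i), 1))
```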
NASA Astrophysics Data System (ADS)
Kang, Pilsang; Koo, Changhoi; Roh, Hokyu
2017-11-01
Since simple linear regression theory was established at the beginning of the 1900s, it has been used in a variety of fields. Unfortunately, it cannot be used directly for calibration. In practical calibrations, the observed measurements (the inputs) are subject to errors, and hence they vary, thus violating the assumption that the inputs are fixed. Therefore, in the case of calibration, the regression line fitted using the method of least squares is not consistent with the statistical properties of simple linear regression as already established based on this assumption. To resolve this problem, "classical regression" and "inverse regression" have been proposed. However, they do not completely resolve the problem. As a fundamental solution, we introduce "reversed inverse regression" along with a new methodology for deriving its statistical properties. In this study, the statistical properties of this regression are derived using the "error propagation rule" and the "method of simultaneous error equations" and are compared with those of the existing regression approaches. The accuracy of the statistical properties thus derived is investigated in a simulation study. We conclude that the newly proposed regression and methodology constitute the complete regression approach for univariate linear calibrations.
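To make the distinction between the existing approaches concrete, the sketch below contrasts "classical" calibration (fit the response on the reference values, then invert) with "inverse" regression (regress the reference values directly on the response). The calibration data are hypothetical, and the paper's own "reversed inverse regression" is not reproduced here.

```python
# Sketch contrasting classical and inverse regression for a univariate
# calibration; the paper's "reversed inverse regression" is not reproduced.
import numpy as np
from scipy import stats

reference = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])       # known standards x
response  = np.array([0.05, 1.02, 2.10, 2.95, 4.08, 5.01])  # instrument readings y
y_new = 2.50                                                # a new observed reading

# Classical calibration: fit y = a + b*x, then invert for x0
cls = stats.linregress(reference, response)
x0_classical = (y_new - cls.intercept) / cls.slope

# Inverse regression: fit x = c + d*y and predict directly
inv = stats.linregress(response, reference)
x0_inverse = inv.intercept + inv.slope * y_new

print(f"classical estimate: {x0_classical:.3f}, inverse estimate: {x0_inverse:.3f}")
```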
Scalable Regression Tree Learning on Hadoop using OpenPlanet
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yin, Wei; Simmhan, Yogesh; Prasanna, Viktor
As scientific and engineering domains attempt to effectively analyze the deluge of data arriving from sensors and instruments, machine learning is becoming a key data mining tool for building prediction models. The regression tree is a popular learning model that combines decision trees and linear regression to forecast numerical target variables based on a set of input features. MapReduce is well suited for addressing such data-intensive learning applications, and a proprietary regression tree algorithm using MapReduce, PLANET, has been proposed earlier. In this paper, we describe an open-source implementation of this algorithm, OpenPlanet, on the Hadoop framework using a hybrid approach. Further, we evaluate the performance of OpenPlanet using real-world datasets from the Smart Power Grid domain to perform energy use forecasting, and propose tuning strategies for Hadoop parameters that improve the performance of the default configuration by 75% for a training dataset of 17 million tuples on a 64-core Hadoop cluster on FutureGrid.
A comparison of methods for the analysis of binomial clustered outcomes in behavioral research.
Ferrari, Alberto; Comelli, Mario
2016-12-01
In behavioral research, data consisting of a per-subject proportion of "successes" and "failures" over a finite number of trials often arise. Such clustered binary data are usually non-normally distributed, which can distort inference if the usual general linear model is applied and the sample size is small. A number of more advanced methods are available, but they are often technically challenging, and a comparative assessment of their performance in behavioral setups has not been performed. We studied the performance of some methods applicable to the analysis of proportions, namely linear regression, Poisson regression, beta-binomial regression and generalized linear mixed models (GLMMs). We report on a simulation study evaluating the power and Type I error rate of these models in hypothetical scenarios met by behavioral researchers, and we describe results from the application of these methods to data from real experiments. Our results show that, while GLMMs are powerful instruments for the analysis of clustered binary outcomes, beta-binomial regression can outperform them in a range of scenarios. Linear regression gave results consistent with the nominal level of significance, but was overall less powerful. Poisson regression, instead, mostly led to anticonservative inference. GLMMs and beta-binomial regression are generally more powerful than linear regression; yet linear regression is robust to model misspecification in some conditions, whereas Poisson regression suffers heavily from violations of the assumptions when used to model proportion data. We conclude by providing directions to behavioral scientists dealing with clustered binary data and small sample sizes. Copyright © 2016 Elsevier B.V. All rights reserved.
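A minimal sketch of two of the compared approaches follows: ordinary linear regression on the per-subject proportions versus a binomial GLM on the success/failure counts (beta-binomial regression and GLMMs are not shown). The simulated overdispersed data and group sizes are hypothetical assumptions.

```python
# Sketch on simulated clustered binary data: OLS on proportions vs. a binomial
# GLM on (successes, failures). Beta-binomial and GLMM fits are omitted.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n_subjects, n_trials = 40, 20
group = np.repeat([0, 1], n_subjects // 2)                 # two conditions
p_true = np.where(group == 0, 0.45, 0.60)
# extra between-subject variability makes the data overdispersed
p_subj = np.clip(p_true + rng.normal(0, 0.10, n_subjects), 0.01, 0.99)
successes = rng.binomial(n_trials, p_subj)

X = sm.add_constant(group.astype(float))

# (1) linear regression on the observed proportions
ols_fit = sm.OLS(successes / n_trials, X).fit()

# (2) binomial GLM on the raw counts
endog = np.column_stack([successes, n_trials - successes])
glm_fit = sm.GLM(endog, X, family=sm.families.Binomial()).fit()

print("OLS group effect:", round(ols_fit.params[1], 3), "p =", round(ols_fit.pvalues[1], 4))
print("GLM group effect (log-odds):", round(glm_fit.params[1], 3), "p =", round(glm_fit.pvalues[1], 4))
```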
Yao, Lei; Gohel, Mayur D I; Li, Yi; Chung, Waiyee J
2011-07-01
Clothing is considered the second skin of the human body. The aim of this study was to determine clothing-wearer interaction effects on skin physiology under mild cold conditions. Skin physiological parameters, subjective sensory responses, stress level, and the physical properties of the clothing fabric from two longitudinal, parallel-designed wear trials were studied. The wear trials involved four kinds of pajamas made from cotton or polyester material that had hydrophilic or hydrophobic treatment, conducted for three weeks under mild cold conditions. Statistical tools, factor analysis, hierarchical linear regression, and logistic regression were applied to identify the strong predictors of skin physiological parameters, stress level, and sensory response. A framework was established to illustrate clothing-wearer interactions linking clothing fabric properties, skin physiology, stress level, and sensory response under mild cold conditions. Fabric has various effects on the human body under mild cold conditions: a fabric's properties influence skin physiology, sensation, and psychological response. © 2011 The International Society of Dermatology.
Jin, Yonghong; Zhang, Qi; Shan, Lifei; Li, Sai-Ping
2015-01-01
Financial networks have been extensively studied as examples of real world complex networks. In this paper, we establish and study the network of venture capital (VC) firms in China. We compute and analyze the statistical properties of the network, including parameters such as degrees, mean lengths of the shortest paths, clustering coefficient and robustness. We further study the topology of the network and find that it has small-world behavior. A multiple linear regression model is introduced to study the relation between network parameters and major regional economic indices in China. From the result of regression, we find that, economic aggregate (including the total GDP, investment, consumption and net export), upgrade of industrial structure, employment and remuneration of a region are all positively correlated with the degree and the clustering coefficient of the VC sub-network of the region, which suggests that the development of the VC industry has substantial effects on regional economy in China. PMID:26340555
Community psychiatry: results of a public opinion survey.
Lauber, Christoph; Nordt, Carlos; Haker, Helene; Falcato, Luis; Rössler, Wulf
2006-05-01
Mental health authorities must know the public's attitude to community psychiatry when planning community mental health services. However, previous studies have only investigated the impact of demographic variables on the attitude to community psychiatry. The aim of this study was to assess the influence of psychological and sociological parameters on the public opinion of community psychiatry in Switzerland, using linear regression analyses of the results of a public opinion survey of a representative population sample in Switzerland (n = 1737). Most respondents have positive attitudes to community psychiatry. In the regression analysis (adjusted R2 = 21.2%), negative emotions towards mentally ill people as depicted in the vignette, great social distance, a positive attitude to restrictions, negative stereotypes, high rigidity and no participation in community activities significantly influenced negative attitudes to community psychiatry. Additionally, other parameters, e.g. contact with mentally ill people and the nationality of the interviewee, have a significant influence. In planning psychiatric community services, general individual traits and emotive issues should be considered because they influence the response towards community psychiatry facilities in the host community.
Vajargah, Kianoush Fathi; Sadeghi-Bazargani, Homayoun; Mehdizadeh-Esfanjani, Robab; Savadi-Oskouei, Daryoush; Farhoudi, Mehdi
2012-01-01
The objective of the present study was to assess the comparative applicability of the orthogonal projections to latent structures (OPLS) statistical model vs traditional linear regression in investigating the role of transcranial Doppler (TCD) sonography in predicting ischemic stroke prognosis. The study was conducted on 116 ischemic stroke patients admitted to a specialty neurology ward. The Unified Neurological Stroke Scale was used once for clinical evaluation during the first week of admission and again six months later. All data were first analyzed using simple linear regression and then considered for multivariate analysis using PLS/OPLS models through the SIMCA P+12 statistical software package. The linear regression results used for the identification of TCD predictors of stroke prognosis were confirmed through the OPLS modeling technique. Moreover, in comparison to linear regression, the OPLS model appeared to have higher sensitivity in detecting the predictors of ischemic stroke prognosis and detected several more predictors. Applying the OPLS model made it possible to use both single TCD measures/indicators and arbitrarily dichotomized measures of TCD single-vessel involvement as well as the overall TCD result. In conclusion, the authors recommend PLS/OPLS methods as complementary rather than alternative to the available classical regression models such as linear regression.
Quality of life in breast cancer patients--a quantile regression analysis.
Pourhoseingholi, Mohamad Amin; Safaee, Azadeh; Moghimi-Dehkordi, Bijan; Zeighami, Bahram; Faghihzadeh, Soghrat; Tabatabaee, Hamid Reza; Pourhoseingholi, Asma
2008-01-01
Quality of life studies have an important role in health care, especially in chronic diseases, in clinical judgment and in the allocation of medical resources. Statistical tools like linear regression are widely used to assess the predictors of quality of life, but when the response is not normally distributed the results can be misleading. The aim of this study was to determine the predictors of quality of life in breast cancer patients using a quantile regression model and to compare the results with linear regression. A cross-sectional study was conducted on 119 breast cancer patients admitted and treated in the chemotherapy ward of Namazi hospital in Shiraz. We used the QLQ-C30 questionnaire to assess quality of life in these patients. Quantile regression was employed to assess the associated factors and the results were compared to linear regression. All analyses were carried out using SAS. The mean score for global health status in breast cancer patients was 64.92+/-11.42. Linear regression showed that only grade of tumor, occupational status, menopausal status, financial difficulties and dyspnea were statistically significant. In contrast to linear regression, financial difficulties were not significant in the quantile regression analysis, and dyspnea was significant only for the first quartile. Emotional functioning and duration of disease also predicted the QOL score in the third quartile. The results demonstrate that using quantile regression leads to better interpretation and richer inference about the predictors of quality of life in breast cancer patients.
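The study's analysis was done in SAS; as a hedged Python analogue, the sketch below compares OLS with quantile regression at the first and third quartiles using statsmodels. The simulated QOL scores and the two predictors (dyspnea, duration_months) are hypothetical placeholders.

```python
# Sketch: OLS vs. quantile regression at the 25th and 75th percentiles on
# simulated, skewed quality-of-life scores (illustrative only).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 119
df = pd.DataFrame({"dyspnea": rng.integers(0, 4, n),
                   "duration_months": rng.uniform(1, 60, n)})
df["qol"] = (65 - 3 * df["dyspnea"] + 0.1 * df["duration_months"]
             + rng.gumbel(0, 8, n))                  # skewed, non-normal errors

ols = smf.ols("qol ~ dyspnea + duration_months", data=df).fit()
print("OLS:", ols.params.round(2).to_dict())

for q in (0.25, 0.75):
    qr = smf.quantreg("qol ~ dyspnea + duration_months", data=df).fit(q=q)
    print(f"quantile {q}:", qr.params.round(2).to_dict())
```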
Interpretation of commonly used statistical regression models.
Kasza, Jessica; Wolfe, Rory
2014-01-01
A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.
Malegori, Cristina; Nascimento Marques, Emanuel José; de Freitas, Sergio Tonetto; Pimentel, Maria Fernanda; Pasquini, Celio; Casiraghi, Ernestina
2017-04-01
The main goal of this study was to investigate the analytical performance of a state-of-the-art device, one of the smallest dispersion NIR spectrometers on the market (MicroNIR 1700), making a critical comparison with a benchtop FT-NIR spectrometer in terms of prediction accuracy. In particular, the aim was to estimate, in a non-destructive manner, the titratable acidity and ascorbic acid content of acerola fruit during ripening, with a view to the direct in-field applicability of this new miniaturised handheld device. Acerola (Malpighia emarginata DC.) is a super-fruit characterised by a considerable amount of ascorbic acid, ranging from 1.0% to 4.5%. However, during ripening, acerola colour changes and the fruit may lose as much as half of its ascorbic acid content. Because the variability of the chemical parameters did not follow a strictly linear profile, two different regression algorithms were compared: PLS and SVM. Regression models obtained with MicroNIR spectra gave better results using the SVM algorithm, for both ascorbic acid and titratable acidity estimation. FT-NIR data gave comparable results with both SVM and PLS algorithms, with lower errors for SVM regression. The prediction ability of the two instruments was statistically compared using the Passing-Bablok regression algorithm; the outcomes are critically discussed together with the regression models, showing the suitability of the portable MicroNIR for in-field monitoring of chemical parameters of interest in acerola fruits. Copyright © 2016 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Jing, Ran; Gong, Zhaoning; Zhao, Wenji; Pu, Ruiliang; Deng, Lei
2017-12-01
Above-bottom biomass (ABB) is considered an important parameter for measuring the growth status of aquatic plants, and is of great significance for assessing the health status of wetland ecosystems. In this study, the Structure from Motion (SfM) technique was used to reconstruct the study area from highly overlapped images acquired by an unmanned aerial vehicle (UAV). We generated orthoimages and SfM dense point cloud data, from which vegetation indices (VIs) and SfM point cloud variables, including average height (HAVG), standard deviation of height (HSD) and coefficient of variation of height (HCV), were extracted. These VIs and SfM point cloud variables could effectively characterize the growth status of aquatic plants, and thus they could be used to develop a simple linear regression model (SLR) and a stepwise linear regression model (SWL) with field-measured ABB samples of aquatic plants. We also utilized a decision tree method to discriminate different types of aquatic plants. The experimental results indicated that (1) the SfM technique could effectively process highly overlapped UAV images and thus is suitable for reconstructing the fine texture of the aquatic plant canopy structure; and (2) an SWL model with the point cloud variables HAVG, HSD and HCV and the two VIs NGRDI and ExGR as independent variables produced the best prediction of the ABB of aquatic plants in the study area, with a coefficient of determination of 0.84 and a relative root mean square error of 7.13%. This analysis demonstrates a novel method for the quantitative inversion of a growth parameter (i.e., ABB) of aquatic plants in wetlands.
Qing, Si-han; Chang, Yun-feng; Dong, Xiao-ai; Li, Yuan; Chen, Xiao-gang; Shu, Yong-kang; Deng, Zhen-hua
2013-10-01
To establish mathematical models of stature estimation for Sichuan Han females from X-ray measurements of the lumbar vertebrae, and to provide essential data for forensic anthropology research. The samples, 206 Sichuan Han females, were divided into three groups (A, B and C) according to age. Group A (206 samples) included all ages, group B (116 samples) comprised women aged 20-45 years, and group C comprised the 90 samples over 45 years old. The lumbar vertebrae of all samples were examined by CR technology, recording for each of the five centrums (L1-L5) the anterior border, posterior border and central heights (x1-x15), the total central height of the lumbar spine (x16), and the real height of every sample. Linear regression analysis of these parameters was performed to establish the mathematical models of stature estimation. Sixty-two trained subjects were tested to verify the accuracy of the mathematical models. The established mathematical models were statistically significant by the hypothesis tests of the linear regression equations (P<0.05). The standard errors of the equations were 2.982-5.004 cm, the correlation coefficients were 0.370-0.779 and the multiple correlation coefficients were 0.533-0.834. The back-substitution tests for the highest correlation coefficient and multiple correlation coefficient of each group showed that the highest accuracy was achieved by the multiple regression equation y = 100.33 + 1.489 x3 - 0.548 x6 + 0.772 x9 + 0.058 x12 + 0.645 x15 in group A, at 80.6% (±1SE) and 100% (±2SE). The mathematical models established in this study could be applied to stature estimation for Sichuan Han females.
Reppas-Chrysovitsinos, Efstathios; Sobek, Anna; MacLeod, Matthew
2016-06-15
Polymeric materials flowing through the technosphere are repositories of organic chemicals throughout their life cycle. Equilibrium partition ratios of organic chemicals between these materials and air (KMA) or water (KMW) are required for models of fate and transport, high-throughput exposure assessment and passive sampling. KMA and KMW have been measured for a growing number of chemical/material combinations, but significant data gaps still exist. We assembled a database of 363 KMA and 910 KMW measurements for 446 individual compounds and nearly 40 individual polymers and biopolymers, collected from 29 studies. We used the EPI Suite and ABSOLV software packages to estimate physicochemical properties of the compounds and we employed an empirical correlation based on Trouton's rule to adjust the measured KMA and KMW values to a standard reference temperature of 298 K. Then, we used a thermodynamic triangle with Henry's law constant to calculate a complete set of 1273 KMA and KMW values. Using simple linear regression, we developed a suite of single parameter linear free energy relationship (spLFER) models to estimate KMA from the EPI Suite-estimated octanol-air partition ratio (KOA) and KMW from the EPI Suite-estimated octanol-water (KOW) partition ratio. Similarly, using multiple linear regression, we developed a set of polyparameter linear free energy relationship (ppLFER) models to estimate KMA and KMW from ABSOLV-estimated Abraham solvation parameters. We explored the two LFER approaches to investigate (1) their performance in estimating partition ratios, and (2) uncertainties associated with treating all different polymers as a single "bulk" polymeric material compartment. The models we have developed are suitable for screening assessments of the tendency for organic chemicals to be emitted from materials, and for use in multimedia models of the fate of organic chemicals in the indoor environment. In screening applications we recommend that KMA and KMW be modeled as 0.06 × KOA and 0.06 × KOW respectively, with an uncertainty range of a factor of 15.
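The recommended screening relationships translate directly into a small helper; the relationship (0.06 times the octanol-based ratio, with a factor-of-15 band) is taken from the abstract above, while the function name and example K value are hypothetical.

```python
# Screening-level sketch of the recommended spLFERs: K_MA ~ 0.06 * K_OA and
# K_MW ~ 0.06 * K_OW, each with a factor-of-15 uncertainty band.
def screening_partition_ratio(k_octanol: float, factor: float = 0.06,
                              uncertainty: float = 15.0):
    """Return (lower, central, upper) estimates of the material partition ratio."""
    central = factor * k_octanol
    return central / uncertainty, central, central * uncertainty

# Example: a hypothetical chemical with K_OA = 1e8
low, mid, high = screening_partition_ratio(1e8)
print(f"K_MA estimate: {mid:.2e} (range {low:.2e} - {high:.2e})")
```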
Lutchen, K R
1990-08-01
A sensitivity analysis based on weighted least-squares regression is presented to evaluate alternative methods for fitting lumped-parameter models to respiratory impedance data. The goal is to maintain parameter accuracy simultaneously with practical experiment design. The analysis focuses on predicting parameter uncertainties using a linearized approximation for joint confidence regions. Applications are with four-element parallel and viscoelastic models for 0.125- to 4-Hz data and a six-element model with separate tissue and airway properties for input and transfer impedance data from 2 to 64 Hz. The form of the criterion function was evaluated by comparing parameter uncertainties when data are fit as magnitude and phase, dynamic resistance and compliance, or real and imaginary parts of input impedance. The proper choice of weighting can make all three criterion variables comparable. For the six-element model, parameter uncertainties were predicted when both input impedance and transfer impedance are acquired and fit simultaneously. A fit to both data sets from 4 to 64 Hz could reduce parameter estimate uncertainties considerably from those achievable by fitting either alone. For the four-element models, use of an independent, but noisy, measure of static compliance was assessed as a constraint on model parameters. This may allow acceptable parameter uncertainties for a minimum frequency of 0.275-0.375 Hz rather than 0.125 Hz, reducing the data acquisition requirement from a 16-s breath-holding period to one of 5.33-8 s. These results are approximations, and the impact of using the linearized approximation for the confidence regions is discussed.
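The linearized treatment of joint confidence regions can be illustrated generically: the parameter covariance is approximated as s^2 (J^T W J)^-1, where J is the model Jacobian at the fitted parameters and W the weighting. The single resistance-compliance impedance model below is a stand-in for illustration only, not the paper's four- or six-element models.

    # Sketch of a linearized (first-order) estimate of parameter uncertainty in
    # weighted least squares:  cov(theta) ~ s^2 * (J^T W J)^(-1), J = model Jacobian.
    # The one-resistance/one-compliance impedance model is only a stand-in here.
    import numpy as np

    def model(freq_hz, theta):
        R, C = theta
        w = 2 * np.pi * freq_hz
        Z = R + 1.0 / (1j * w * C)
        return np.concatenate([Z.real, Z.imag])        # fit real and imaginary parts

    def jacobian(freq_hz, theta, eps=1e-6):
        J = np.empty((2 * freq_hz.size, theta.size))
        for k in range(theta.size):
            d = np.zeros_like(theta)
            d[k] = eps * max(1.0, abs(theta[k]))
            J[:, k] = (model(freq_hz, theta + d) - model(freq_hz, theta - d)) / (2 * d[k])
        return J

    freq = np.linspace(0.125, 4.0, 20)
    theta_hat = np.array([2.0, 0.05])                  # pretend these came from a WLS fit
    weights = np.ones(2 * freq.size)                   # e.g. 1/variance of each data point
    resid_var = 0.01                                   # pretend residual variance s^2
    J = jacobian(freq, theta_hat)
    cov = resid_var * np.linalg.inv(J.T @ (weights[:, None] * J))
    print("parameter standard errors:", np.sqrt(np.diag(cov)))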
Influence of anthropometric parameters on ultrasound measurements of Os calcis.
Hans, D; Schott, A M; Arlot, M E; Sornay, E; Delmas, P D; Meunier, P J
1995-01-01
Few data have been published concerning the influence of height, weight and body mass index (BMI) on broadband ultrasound attenuation (BUA), speed of sound (SOS) and the Lunar "stiffness" index, and always in small population samples. The first aim of the present cross-sectional study was to determine whether anthropometric factors have a significant influence on ultrasound measurements. The second objective was to establish whether these parameters have a real effect or whether their influence is due only to measurement errors. We measured, in 271 healthy French women (mean age 77 +/- 11 years; range 31-97 years), the following parameters: age, height, weight, lean and fat body mass, heel width, foot length, knee height and external malleolus (HEM). Simple linear regression analyses between ultrasound and anthropometric parameters were performed. Age, height, and heel width were significant predictors of SOS; age, height, weight, foot length, heel width, HEM, fat mass and lean mass were significant predictors of BUA; age, height, weight, heel width, HEM, fat mass and lean mass were significant predictors of stiffness. In the multiple regression analysis, once the analysis had been adjusted for age, only heel width was a significant predictor for SOS (p = 0.0007), weight for BUA (p = 0.0001), and weight (p = 0.0001) and heel width (p = 0.004) for the stiffness index. Besides their statistical meaning, the regression coefficients have a more clinically relevant interpretation, which is developed in the text. These results confirm the influence of anthropometric factors on the ultrasonic parameter values, because BUA and SOS were in part dependent on heel width and weight. The influence of the position of the transducer on the calcaneus should be taken into account to optimize the methods of measurement using ultrasound.
Prediction of pork quality parameters by applying fractals and data mining on MRI.
Caballero, Daniel; Pérez-Palacios, Trinidad; Caro, Andrés; Amigo, José Manuel; Dahl, Anders B; ErsbØll, Bjarne K; Antequera, Teresa
2017-09-01
This work investigates, for the first time, the use of MRI, fractal algorithms and data mining techniques to determine pork quality parameters non-destructively. The main objective was to evaluate the capability of fractal algorithms (Classical Fractal algorithm, CFA; Fractal Texture Algorithm, FTA; and One Point Fractal Texture Algorithm, OPFTA) to analyse MRI in order to predict quality parameters of loin. In addition, the effects of the acquisition sequence of MRI (Gradient echo, GE; Spin echo, SE; and Turbo 3D, T3D) and the predictive technique of data mining (Isotonic regression, IR; and Multiple linear regression, MLR) were analysed. Both fractal algorithms FTA and OPFTA are appropriate for analysing MRI of loins. The acquisition sequence, the fractal algorithm and the data mining technique all seem to influence the prediction results. For most physico-chemical parameters, prediction equations with moderate to excellent correlation coefficients were achieved by using the following combinations of acquisition sequences of MRI, fractal algorithms and data mining techniques: SE-FTA-MLR, SE-OPFTA-IR, GE-OPFTA-MLR and SE-OPFTA-MLR, with the last one offering the best prediction results. Thus, SE-OPFTA-MLR could be proposed as an alternative technique to determine physico-chemical traits of fresh and dry-cured loins in a non-destructive way with high accuracy. Copyright © 2017. Published by Elsevier Ltd.
Use of probabilistic weights to enhance linear regression myoelectric control
NASA Astrophysics Data System (ADS)
Smith, Lauren H.; Kuiken, Todd A.; Hargrove, Levi J.
2015-12-01
Objective. Clinically available prostheses for transradial amputees do not allow simultaneous myoelectric control of degrees of freedom (DOFs). Linear regression methods can provide simultaneous myoelectric control, but frequently also result in difficulty with isolating individual DOFs when desired. This study evaluated the potential of using probabilistic estimates of categories of gross prosthesis movement, which are commonly used in classification-based myoelectric control, to enhance linear regression myoelectric control. Approach. Gaussian models were fit to electromyogram (EMG) feature distributions for three movement classes at each DOF (no movement, or movement in either direction) and used to weight the output of linear regression models by the probability that the user intended the movement. Eight able-bodied and two transradial amputee subjects worked in a virtual Fitts’ law task to evaluate differences in controllability between linear regression and probability-weighted regression for an intramuscular EMG-based three-DOF wrist and hand system. Main results. Real-time and offline analyses in able-bodied subjects demonstrated that probability weighting improved performance during single-DOF tasks (p < 0.05) by preventing extraneous movement at additional DOFs. Similar results were seen in experiments with two transradial amputees. Though goodness-of-fit evaluations suggested that the EMG feature distributions showed some deviations from the Gaussian, equal-covariance assumptions used in this experiment, the assumptions were sufficiently met to provide improved performance compared to linear regression control. Significance. Use of probability weights can improve the ability to isolate individual DOFs during linear regression myoelectric control, while maintaining the ability to simultaneously control multiple DOFs.
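A minimal sketch of the probability-weighting idea for a single DOF follows. The regression weights, class means and shared covariance are synthetic stand-ins; in the study these quantities were fit to intramuscular EMG features.

    # Sketch of probability-weighted regression control for one DOF:
    # a linear regression maps EMG features to a velocity command, and Gaussian
    # class models (no movement / flexion / extension) scale that command by the
    # probability that the user intends any movement.  All numbers are synthetic.
    import numpy as np
    from scipy.stats import multivariate_normal

    rng = np.random.default_rng(1)
    d = 4                                             # number of EMG features
    w = rng.normal(size=d)                            # pretend regression weights
    class_means = {"rest": np.zeros(d), "flex": np.full(d, 1.0), "ext": np.full(d, -1.0)}
    shared_cov = np.eye(d)                            # equal-covariance assumption, as in the study

    def weighted_output(x):
        likes = {c: multivariate_normal.pdf(x, mean=m, cov=shared_cov)
                 for c, m in class_means.items()}
        total = sum(likes.values())
        p_movement = (likes["flex"] + likes["ext"]) / total    # P(user intends movement)
        return p_movement * float(w @ x)

    print(weighted_output(rng.normal(size=d)))        # near rest -> output shrunk toward zero
    print(weighted_output(np.full(d, 1.2)))           # clear flexion -> output mostly preserved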
NASA Astrophysics Data System (ADS)
Li, T.; Griffiths, W. D.; Chen, J.
2017-11-01
The Maximum Likelihood method and the Linear Least Squares (LLS) method have been widely used to estimate Weibull parameters for reliability of brittle and metal materials. In the last 30 years, many researchers focused on the bias of Weibull modulus estimation, and some improvements have been achieved, especially in the case of the LLS method. However, there is a shortcoming in these methods for a specific type of data, where the lower tail deviates dramatically from the well-known linear fit in a classic LLS Weibull analysis. This deviation can be commonly found from the measured properties of materials, and previous applications of the LLS method on this kind of dataset present an unreliable linear regression. This deviation was previously thought to be due to physical flaws ( i.e., defects) contained in materials. However, this paper demonstrates that this deviation can also be caused by the linear transformation of the Weibull function, occurring in the traditional LLS method. Accordingly, it may not be appropriate to carry out a Weibull analysis according to the linearized Weibull function, and the Non-linear Least Squares method (Non-LS) is instead recommended for the Weibull modulus estimation of casting properties.
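The contrast between the linearized and non-linear fits can be reproduced on synthetic data, as sketched below. The probability estimator, sample size and true parameters are arbitrary choices for illustration, not the paper's procedure.

    # Sketch: estimating the Weibull modulus m by (a) the linearized least-squares fit
    # ln(-ln(1-F)) = m*ln(x) - m*ln(x0) and (b) nonlinear least squares on the CDF itself.
    # Synthetic strengths are drawn from a known Weibull distribution for comparison.
    import numpy as np
    from scipy.optimize import curve_fit

    rng = np.random.default_rng(2)
    true_m, true_x0, n = 8.0, 300.0, 30
    x = np.sort(true_x0 * rng.weibull(true_m, n))          # synthetic strength data
    F = (np.arange(1, n + 1) - 0.5) / n                    # median-rank style probability estimator

    # (a) linearized (classic LLS) fit
    yy = np.log(-np.log(1.0 - F))
    slope, intercept = np.polyfit(np.log(x), yy, 1)
    m_lls = slope

    # (b) nonlinear least squares on the un-transformed CDF
    def weibull_cdf(x, m, x0):
        return 1.0 - np.exp(-(x / x0) ** m)

    (m_nls, x0_nls), _ = curve_fit(weibull_cdf, x, F, p0=[m_lls, x.mean()])
    print(f"true m = {true_m}, linearized m = {m_lls:.2f}, nonlinear m = {m_nls:.2f}")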
Flow-covariate prediction of stream pesticide concentrations.
Mosquin, Paul L; Aldworth, Jeremy; Chen, Wenlin
2018-01-01
Potential peak functions (e.g., maximum rolling averages over a given duration) of annual pesticide concentrations in the aquatic environment are important exposure parameters (or target quantities) for ecological risk assessments. These target quantities require accurate concentration estimates on nonsampled days in a monitoring program. We examined stream flow as a covariate via universal kriging to improve predictions of maximum m-day (m = 1, 7, 14, 30, 60) rolling averages and the 95th percentiles of atrazine concentration in streams where data were collected every 7 or 14 d. The universal kriging predictions were evaluated against the target quantities calculated directly from the daily (or near daily) measured atrazine concentration at 32 sites (89 site-yr) as part of the Atrazine Ecological Monitoring Program in the US corn belt region (2008-2013) and 4 sites (62 site-yr) in Ohio by the National Center for Water Quality Research (1993-2008). Because stream flow data are strongly skewed to the right, 3 transformations of the flow covariate were considered: log transformation, short-term flow anomaly, and normalized Box-Cox transformation. The normalized Box-Cox transformation resulted in predictions of the target quantities that were comparable to those obtained from log-linear interpolation (i.e., linear interpolation on the log scale) for 7-d sampling. However, the predictions appeared to be negatively affected by variability in regression coefficient estimates across different sample realizations of the concentration time series. Therefore, revised models incorporating seasonal covariates and partially or fully constrained regression parameters were investigated, and they were found to provide much improved predictions in comparison with those from log-linear interpolation for all rolling average measures. Environ Toxicol Chem 2018;37:260-273. © 2017 SETAC.
Grantz, Erin; Haggard, Brian; Scott, J Thad
2018-06-12
We calculated four median datasets (chlorophyll a, Chl a; total phosphorus, TP; and transparency) using multiple approaches to handling censored observations, including substituting fractions of the quantification limit (QL; dataset 1 = 1QL, dataset 2 = 0.5QL) and statistical methods for censored datasets (datasets 3-4) for approximately 100 Texas, USA reservoirs. Trend analyses of differences between dataset 1 and 3 medians indicated percent difference increased linearly above thresholds in percent censored data (%Cen). This relationship was extrapolated to estimate medians for site-parameter combinations with %Cen > 80%, which were combined with dataset 3 as dataset 4. Changepoint analysis of Chl a- and transparency-TP relationships indicated threshold differences up to 50% between datasets. Recursive analysis identified secondary thresholds in dataset 4. Threshold differences show that information introduced via substitution or missing due to limitations of statistical methods biased values, underestimated error, and inflated the strength of TP thresholds identified in datasets 1-3. Analysis of covariance identified differences in linear regression models relating transparency-TP between datasets 1, 2, and the more statistically robust datasets 3-4. Study findings identify high-risk scenarios for biased analytical outcomes when using substitution. These include high probability of median overestimation when %Cen > 50-60% for a single QL, or when %Cen is as low as 16% for multiple QLs. Changepoint analysis was uniquely vulnerable to substitution effects when using medians from sites with %Cen > 50%. Linear regression analysis was less sensitive to substitution and missing data effects, but differences in model parameters for transparency cannot be discounted and could be magnified by log-transformation of the variables.
NASA Astrophysics Data System (ADS)
Xie, Yanan; Zhou, Mingliang; Pan, Dengke
2017-10-01
The forward-scattering model is introduced to describe the response of the normalized radar cross section (NRCS) of precipitation with synthetic aperture radar (SAR). Since the distribution of near-surface rainfall is related to the near-surface rainfall rate and the horizontal distribution factor, a retrieval algorithm called modified regression empirical and model-oriented statistical (M-M), based on Volterra integration theory, is proposed. Compared with the model-oriented statistical and Volterra integration (MOSVI) algorithm, the biggest difference is that the M-M algorithm is based on the modified regression empirical algorithm rather than a linear regression formula to retrieve the value of the near-surface rainfall rate. The number of empirical parameters in the weighted integration is halved, and a smaller average relative error is obtained when the rainfall rate is less than 100 mm/h. Therefore, the algorithm proposed in this paper can obtain high-precision rainfall information.
Senior, Samir A; Madbouly, Magdy D; El massry, Abdel-Moneim
2011-09-01
Quantum chemical and topological descriptors of some organophosphorus (OP) compounds were correlated with their dermal toxicity (LD(50)). The quantum chemical parameters were obtained using B3LYP/LANL2DZdp-ECP optimization. Using linear regression analysis, equations were derived to calculate the theoretical LD(50) of the studied compounds. The inclusion of quantum parameters, having both charge indices and topological indices, affects the toxicity of the studied compounds, resulting in high correlation coefficient factors for the obtained equations. Two of the four newly proposed descriptors give higher correlation coefficients, namely the Heteroatom Corrected Extended Connectivity Randic index ((1)X(HCEC)) and the Density Randic index ((1)X(Den)). The obtained linear equations were applied to predict the toxicity of some related structures. It was found that the sulfur atoms in these compounds must be replaced by oxygen atoms to achieve improved toxicity. Copyright © 2011 Elsevier Ltd. All rights reserved.
Finding Bayesian Optimal Designs for Nonlinear Models: A Semidefinite Programming-Based Approach.
Duarte, Belmiro P M; Wong, Weng Kee
2015-08-01
This paper uses semidefinite programming (SDP) to construct Bayesian optimal design for nonlinear regression models. The setup here extends the formulation of the optimal designs problem as an SDP problem from linear to nonlinear models. Gaussian quadrature formulas (GQF) are used to compute the expectation in the Bayesian design criterion, such as D-, A- or E-optimality. As an illustrative example, we demonstrate the approach using the power-logistic model and compare results in the literature. Additionally, we investigate how the optimal design is impacted by different discretising schemes for the design space, different amounts of uncertainty in the parameter values, different choices of GQF and different prior distributions for the vector of model parameters, including normal priors with and without correlated components. Further applications to find Bayesian D-optimal designs with two regressors for a logistic model and a two-variable generalised linear model with a gamma distributed response are discussed, and some limitations of our approach are noted.
Finding Bayesian Optimal Designs for Nonlinear Models: A Semidefinite Programming-Based Approach
Duarte, Belmiro P. M.; Wong, Weng Kee
2014-01-01
Summary This paper uses semidefinite programming (SDP) to construct Bayesian optimal design for nonlinear regression models. The setup here extends the formulation of the optimal designs problem as an SDP problem from linear to nonlinear models. Gaussian quadrature formulas (GQF) are used to compute the expectation in the Bayesian design criterion, such as D-, A- or E-optimality. As an illustrative example, we demonstrate the approach using the power-logistic model and compare results in the literature. Additionally, we investigate how the optimal design is impacted by different discretising schemes for the design space, different amounts of uncertainty in the parameter values, different choices of GQF and different prior distributions for the vector of model parameters, including normal priors with and without correlated components. Further applications to find Bayesian D-optimal designs with two regressors for a logistic model and a two-variable generalised linear model with a gamma distributed response are discussed, and some limitations of our approach are noted. PMID:26512159
Pütter, Carolin; Pechlivanis, Sonali; Nöthen, Markus M; Jöckel, Karl-Heinz; Wichmann, Heinz-Erich; Scherag, André
2011-01-01
Genome-wide association studies have identified robust associations between single nucleotide polymorphisms and complex traits. As the proportion of phenotypic variance explained is still limited for most of the traits, larger and larger meta-analyses are being conducted to detect additional associations. Here we investigate the impact of the study design and the underlying assumption about the true genetic effect in a bimodal mixture situation on the power to detect associations. We performed simulations of quantitative phenotypes analysed by standard linear regression and dichotomized case-control data sets from the extremes of the quantitative trait analysed by standard logistic regression. Using linear regression, markers with an effect in the extremes of the traits were almost undetectable, whereas analysing extremes by case-control design had superior power even for much smaller sample sizes. Two real data examples are provided to support our theoretical findings and to explore our mixture and parameter assumption. Our findings support the idea to re-analyse the available meta-analysis data sets to detect new loci in the extremes. Moreover, our investigation offers an explanation for discrepant findings when analysing quantitative traits in the general population and in the extremes. Copyright © 2011 S. Karger AG, Basel.
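The design comparison can be illustrated with a toy simulation: a marker that shifts only the extremes of a quantitative trait is tested by linear regression on the full sample and by logistic regression on an extremes-based case-control set. The effect size, sample size and decile cut-offs below are arbitrary, not the paper's settings.

    # Sketch: a marker that only shifts the extremes of a quantitative trait is tested
    # (a) with linear regression on the full sample and (b) with logistic regression on
    # a case-control set built from the upper and lower deciles.  Purely illustrative.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(3)
    n = 5000
    geno = rng.binomial(2, 0.3, n)                         # additive SNP coding 0/1/2
    trait = rng.normal(size=n)
    tails = (trait > np.quantile(trait, 0.9)) | (trait < np.quantile(trait, 0.1))
    trait = trait + 0.15 * geno * tails                    # genetic effect only in the extremes

    # (a) linear regression on the whole sample
    ols = sm.OLS(trait, sm.add_constant(geno)).fit()

    # (b) logistic regression, upper decile = cases, lower decile = controls
    case = trait > np.quantile(trait, 0.9)
    ctrl = trait < np.quantile(trait, 0.1)
    keep = case | ctrl
    logit = sm.Logit(case[keep].astype(int), sm.add_constant(geno[keep])).fit(disp=0)

    print("linear p =", ols.pvalues[1], " logistic p =", logit.pvalues[1])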
Mathur, Praveen; Sharma, Sarita; Soni, Bhupendra
2010-01-01
In the present work, an attempt is made to formulate multiple regression equations using the all-possible-regressions method for groundwater quality assessment of the Ajmer-Pushkar railway line region in pre- and post-monsoon seasons. Correlation studies revealed the existence of linear relationships (r > 0.7) for electrical conductivity (EC), total hardness (TH) and total dissolved solids (TDS) with other water quality parameters. The highest correlation was found between EC and TDS (r = 0.973). EC showed highly significant positive correlation with Na, K, Cl, TDS and total solids (TS). TH showed the highest correlation with Ca and Mg. TDS showed significant correlation with Na, K, SO4, PO4 and Cl. The study indicated that most of the contamination present was water soluble or ionic in nature. Mg was present as MgCl2; K mainly as KCl and K2SO4; and Na was present as the salts of Cl, SO4 and PO4. On the other hand, F and NO3 showed no significant correlations. The r2 values and F values (at the 95% confidence limit, alpha = 0.05) for the modelled equations indicated a high degree of linearity between independent and dependent variables. Also, the error (%) between calculated and experimental values was contained within a +/- 15% limit.
[From clinical judgment to linear regression model].
Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O
2013-01-01
When we think about mathematical models, such as the linear regression model, we think that these terms are only used by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and has a normal distribution. Stated another way, regression is used to predict a measure based on the knowledge of at least one other variable. The first objective of linear regression is to determine the slope or inclination of the regression line: Y = a + bx, where "a" is the intercept or regression constant, equivalent to the value of "Y" when "X" equals 0, and "b" (also called the slope) indicates the increase or decrease in "Y" that occurs when the variable "x" increases or decreases by one unit. In the regression line, "b" is called the regression coefficient. The coefficient of determination (R2) indicates the importance of the independent variables in the outcome.
Improved parameter inference in catchment models: 1. Evaluating parameter uncertainty
NASA Astrophysics Data System (ADS)
Kuczera, George
1983-10-01
A Bayesian methodology is developed to evaluate parameter uncertainty in catchment models fitted to a hydrologic response such as runoff, the goal being to improve the chance of successful regionalization. The catchment model is posed as a nonlinear regression model with stochastic errors possibly being both autocorrelated and heteroscedastic. The end result of this methodology, which may use Box-Cox power transformations and ARMA error models, is the posterior distribution, which summarizes what is known about the catchment model parameters. This can be simplified to a multivariate normal provided a linearization in parameter space is acceptable; means of checking and improving this assumption are discussed. The posterior standard deviations give a direct measure of parameter uncertainty, and study of the posterior correlation matrix can indicate what kinds of data are required to improve the precision of poorly determined parameters. Finally, a case study involving a nine-parameter catchment model fitted to monthly runoff and soil moisture data is presented. It is shown that use of ordinary least squares when its underlying error assumptions are violated gives an erroneous description of parameter uncertainty.
Real-time soil sensing based on fiber optics and spectroscopy
NASA Astrophysics Data System (ADS)
Li, Minzan
2005-08-01
Using NIR spectroscopic techniques, correlation and regression analyses for soil parameter estimation were conducted with raw soil samples collected in a cornfield and a forage field. The soil parameters analyzed were soil moisture, soil organic matter, nitrate nitrogen, soil electrical conductivity and pH. Results showed that all soil parameters could be evaluated by NIR spectral reflectance. For soil moisture, a linear regression model was applicable at low moisture contents below 30 % db, while an exponential model could be used over a wide range of moisture content up to 100 % db. Nitrate nitrogen estimation required a multi-spectral exponential model, and electrical conductivity could be evaluated by a single spectral regression. Based on these results, a real-time soil sensor system using fiber optics and spectroscopy was developed. The sensor system was composed of a soil subsoiler with four optical fiber probes, a spectrometer, and a control unit. Two optical fiber probes were used for illumination and the other two for collecting soil reflectance from visible to NIR wavebands at depths of around 30 cm. The spectrometer was used to obtain the spectra of the reflected light. The control unit consisted of a data logging device, a personal computer, and a pulse generator. The experiment showed that clear photo-spectral reflectance was obtained from the underground soil. The soil reflectance was equal to that obtained by the desktop spectrophotometer in laboratory tests. Using the spectral reflectance, the soil parameters, such as soil moisture, pH, EC and SOM, were evaluated.
NASA Astrophysics Data System (ADS)
Herath, Imali Kaushalya; Ye, Xuchun; Wang, Jianli; Bouraima, Abdel-Kabirou
2018-02-01
Reference evapotranspiration (ETr) is one of the important parameters in the hydrological cycle. The spatio-temporal variation of ETr and the other meteorological parameters that influence ETr were investigated in the Jialing River Basin (JRB), China. ETr was estimated using the CROPWAT 8.0 computer model based on the Penman-Monteith equation for the period 1964-2014. Mean temperature (MT), relative humidity (RH), sunshine duration (SD), and wind speed (WS) were the main input parameters of CROPWAT, and data from 12 meteorological stations were evaluated. Linear regression and Mann-Kendall methods were applied to study the spatio-temporal trends, while the inverse distance weighted (IDW) method was used to identify the spatial distribution of ETr. Stepwise regression and partial correlation methods were used to identify the meteorological variables that most significantly influenced the changes in ETr. The highest annual ETr was found in the northern part of the basin, whereas the lowest rate was recorded in the western part. In autumn, the highest ETr was recorded in the southeast part of the JRB. The annual ETr showed neither a significantly increasing nor a significantly decreasing trend. Except in summer, ETr slightly increased in the other seasons. MT increased significantly, whereas SD and RH decreased significantly during the 50-year period. Partial correlation and stepwise regression showed that the impact of meteorological parameters on ETr varies on an annual and seasonal basis, with SD, MT, and RH contributing to the changes of annual and seasonal ETr in the JRB.
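Both trend tests used above are easy to sketch. The example below applies ordinary linear regression and a simple Mann-Kendall test (no tie correction) to a synthetic annual ETr series; the numbers are invented for illustration.

    # Sketch: testing an annual ETr series for trend with (a) ordinary linear regression
    # and (b) the Mann-Kendall test (simple form, no tie correction).  Data are synthetic.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(4)
    years = np.arange(1964, 2015)
    etr = 1000 + 0.3 * (years - years[0]) + rng.normal(0, 15, years.size)   # fake ETr, mm/yr

    # (a) linear regression trend
    lr = stats.linregress(years, etr)

    # (b) Mann-Kendall test
    def mann_kendall(x):
        n = len(x)
        s = sum(np.sign(x[j] - x[i]) for i in range(n - 1) for j in range(i + 1, n))
        var_s = n * (n - 1) * (2 * n + 5) / 18.0
        z = (s - np.sign(s)) / np.sqrt(var_s) if s != 0 else 0.0
        p = 2 * (1 - stats.norm.cdf(abs(z)))
        return z, p

    print("regression slope =", lr.slope, "p =", lr.pvalue)
    print("Mann-Kendall z, p =", mann_kendall(etr))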
Hemmila, April; McGill, Jim; Ritter, David
2008-03-01
To determine whether changes in fingerprint infrared spectra that are linear with age can be found, a partial least squares (PLS1) regression of 155 fingerprint infrared spectra against the person's age was constructed. The regression produced a linear model of age as a function of spectrum with a root mean square error of calibration of less than 4 years, showing an inflection at about 25 years of age. The spectral ranges emphasized by the regression do not correspond to the highest-concentration constituents of the fingerprints. Separate linear regression models for old and young people can be constructed with even more statistical rigor. The success of the regression demonstrates that a combination of constituents can be found that changes linearly with age, with a significant shift around puberty.
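A PLS1 regression of spectra on age can be sketched with scikit-learn as below. The spectra, the age-dependent channels and the number of latent variables are all synthetic assumptions, not the study's data or settings.

    # Sketch: PLS1 regression of (synthetic) infrared spectra against age, reporting the
    # root mean square error of calibration, in the spirit of the study described above.
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression

    rng = np.random.default_rng(5)
    n_samples, n_wavenumbers = 155, 600
    age = rng.uniform(5, 70, n_samples)
    spectra = rng.normal(size=(n_samples, n_wavenumbers))
    spectra[:, 100:110] += 0.02 * age[:, None]            # a few channels carry an age signal

    pls = PLSRegression(n_components=8).fit(spectra, age)
    age_hat = pls.predict(spectra).ravel()
    rmsec = np.sqrt(np.mean((age - age_hat) ** 2))
    print("RMSE of calibration (years):", rmsec)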
Regression and multivariate models for predicting particulate matter concentration level.
Nazif, Amina; Mohammed, Nurul Izma; Malakahmad, Amirhossein; Abualqumboz, Motasem S
2018-01-01
The devastating health effects of particulate matter (PM10) exposure in susceptible populations have made it necessary to evaluate PM10 pollution. Meteorological parameters and seasonal variation increase PM10 concentration levels, especially in areas that have multiple anthropogenic activities. Hence, stepwise regression (SR), multiple linear regression (MLR) and principal component regression (PCR) analyses were used to analyse daily average PM10 concentration levels. The analyses were carried out using daily average PM10 concentration, temperature, humidity, wind speed and wind direction data from 2006 to 2010. The data were from an industrial air quality monitoring station in Malaysia. The SR analysis established that meteorological parameters had limited influence on PM10 concentration levels, with coefficient of determination (R2) values of 23-29% for the seasonal and non-seasonal analyses. The prediction analysis showed that the PCR models had better R2 values than the MLR models: for both the seasonal and non-seasonal data, the MLR models had R2 values of 0.50-0.60, while the PCR models had R2 values of 0.66-0.89. In addition, validation against 2016 data confirmed that the PCR model outperformed the MLR model, with the PCR model for the seasonal analysis giving the best result. These analyses will aid in achieving sustainable air quality management strategies.
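The MLR-versus-PCR comparison can be sketched as follows; the meteorological predictors and PM10 values are simulated, and the number of retained components is an arbitrary choice.

    # Sketch: multiple linear regression (MLR) versus principal component regression (PCR)
    # for daily PM10 from meteorological predictors.  Data are synthetic, not the study's.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression
    from sklearn.pipeline import make_pipeline
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(6)
    n = 1500
    temp = rng.normal(30, 3, n)
    humid = 100 - 1.5 * temp + rng.normal(0, 5, n)        # predictors are correlated
    wind = rng.gamma(2.0, 1.5, n)
    X = np.column_stack([temp, humid, wind])
    pm10 = 40 + 1.2 * temp - 0.3 * humid - 2.0 * wind + rng.normal(0, 10, n)

    X_tr, X_te, y_tr, y_te = train_test_split(X, pm10, random_state=0)
    mlr = LinearRegression().fit(X_tr, y_tr)
    pcr = make_pipeline(PCA(n_components=2), LinearRegression()).fit(X_tr, y_tr)
    print("MLR R2:", mlr.score(X_te, y_te), " PCR R2:", pcr.score(X_te, y_te))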
Gimelfarb, A.; Willis, J. H.
1994-01-01
An experiment was conducted to investigate the offspring-parent regression for three quantitative traits (weight, abdominal bristles and wing length) in Drosophila melanogaster. Linear and polynomial models were fitted for the regressions of a character in offspring on both parents. It is demonstrated that responses by the characters to selection predicted by the nonlinear regressions may differ substantially from those predicted by the linear regressions. This is true even, and especially, if selection is weak. The realized heritability for a character under selection is shown to be determined not only by the offspring-parent regression but also by the distribution of the character and by the form and strength of selection. PMID:7828818
Divergent estimation error in portfolio optimization and in linear regression
NASA Astrophysics Data System (ADS)
Kondor, I.; Varga-Haszonits, I.
2008-08-01
The problem of estimation error in portfolio optimization is discussed, in the limit where the portfolio size N and the sample size T go to infinity such that their ratio is fixed. The estimation error strongly depends on the ratio N/T and diverges for a critical value of this parameter. This divergence is the manifestation of an algorithmic phase transition, it is accompanied by a number of critical phenomena, and displays universality. As the structure of a large number of multidimensional regression and modelling problems is very similar to portfolio optimization, the scope of the above observations extends far beyond finance, and covers a large number of problems in operations research, machine learning, bioinformatics, medical science, economics, and technology.
Inverse sequential procedures for the monitoring of time series
NASA Technical Reports Server (NTRS)
Radok, Uwe; Brown, Timothy J.
1995-01-01
When one or more new values are added to a developing time series, they change its descriptive parameters (mean, variance, trend, coherence). A 'change index' (CI) is developed as a quantitative indicator of whether the changed parameters remain compatible with the existing 'base' data. CI formulae are derived, in terms of normalized likelihood ratios, for small samples from Poisson, Gaussian, and Chi-Square distributions, and for regression coefficients measuring linear or exponential trends. A substantial parameter change creates a rapid or abrupt CI decrease which persists when the length of the bases is changed. Except for a special Gaussian case, the CI has no simple explicit regions for tests of hypotheses. However, its design ensures that the series sampled need not conform strictly to the distribution form assumed for the parameter estimates. The use of the CI is illustrated with both constructed and observed data samples, processed with a Fortran code 'Sequitor'.
NASA Astrophysics Data System (ADS)
Gogoi, Pallavi; Mohan, Uttam; Borpuzari, Manash Protim; Boruah, Abhijit; Baruah, Surjya Kumar
2017-03-01
UV-Vis spectroscopy has established that substituted pyridines form n→σ* charge transfer (CT) complexes with molecular iodine. This study combines experimental UV-Vis spectroscopy, multiple linear regression and computational chemistry to analyze the effect of the solvent upon the charge transfer band of the 2-Methylpyridine-I2 and 2-Chloropyridine-I2 complexes. Regression analysis verifies the dependence of the CT band upon different solvent parameters. The dielectric constant and refractive index are considered among the bulk solvent parameters, and the Hansen, Kamlet and Catalan parameters are taken into consideration at the molecular level. Density functional theory results explain well the blue shift of the CT bands in polar media as an outcome of stronger donor-acceptor interaction. A logarithmic relation between the bond length of the bridging atoms of the donor and acceptor and the dielectric constant of the medium is established. Tauc plots and a TDDFT study indicate a non-vertical electronic transition in the complexes. The Buckingham and Lippert-Mataga equations are applied to check the polarizability effect on the CT band.
Linear and nonlinear regression techniques for simultaneous and proportional myoelectric control.
Hahne, J M; Biessmann, F; Jiang, N; Rehbaum, H; Farina, D; Meinecke, F C; Muller, K-R; Parra, L C
2014-03-01
In recent years the number of active controllable joints in electrically powered hand-prostheses has increased significantly. However, the control strategies for these devices in current clinical use are inadequate as they require separate and sequential control of each degree-of-freedom (DoF). In this study we systematically compare linear and nonlinear regression techniques for an independent, simultaneous and proportional myoelectric control of wrist movements with two DoF. These techniques include linear regression, mixture of linear experts (ME), multilayer-perceptron, and kernel ridge regression (KRR). They are investigated offline with electro-myographic signals acquired from ten able-bodied subjects and one person with congenital upper limb deficiency. The control accuracy is reported as a function of the number of electrodes and the amount and diversity of training data providing guidance for the requirements in clinical practice. The results showed that KRR, a nonparametric statistical learning method, outperformed the other methods. However, simple transformations in the feature space could linearize the problem, so that linear models could achieve similar performance as KRR at much lower computational costs. Especially ME, a physiologically inspired extension of linear regression represents a promising candidate for the next generation of prosthetic devices.
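The linear-versus-nonlinear comparison can be sketched with scikit-learn: a plain linear regression and a kernel ridge regression are trained to map synthetic EMG-like features to two proportional DoF commands. The feature dimensions, kernel settings and the chosen nonlinearity are assumptions for illustration only.

    # Sketch: linear regression versus kernel ridge regression (KRR) for mapping EMG-like
    # features to simultaneous, proportional commands for two DoFs.  Features are synthetic.
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.kernel_ridge import KernelRidge
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(7)
    n, d = 2000, 8
    X = rng.normal(size=(n, d))                           # stand-in EMG features
    W = rng.normal(size=(d, 2))
    Y = np.tanh(X @ W) + 0.1 * rng.normal(size=(n, 2))    # mildly nonlinear 2-DoF targets

    X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, random_state=0)
    lin = LinearRegression().fit(X_tr, Y_tr)
    krr = KernelRidge(kernel="rbf", alpha=1.0, gamma=0.1).fit(X_tr, Y_tr)
    print("linear R2:", lin.score(X_te, Y_te), " KRR R2:", krr.score(X_te, Y_te))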
Reliable two-dimensional phase unwrapping method using region growing and local linear estimation.
Zhou, Kun; Zaitsev, Maxim; Bao, Shanglian
2009-10-01
In MRI, phase maps can provide useful information about parameters such as field inhomogeneity, velocity of blood flow, and the chemical shift between water and fat. As phase is defined in the (-pi,pi] range, however, phase wraps often occur, which complicates image analysis and interpretation. This work presents a two-dimensional phase unwrapping algorithm that uses quality-guided region growing and local linear estimation. The quality map employs the variance of the second-order partial derivatives of the phase as the quality criterion. Phase information from unwrapped neighboring pixels is used to predict the correct phase of the current pixel using a linear regression method. The algorithm was tested on both simulated and real data, and is shown to successfully unwrap phase images that are corrupted by noise and have rapidly changing phase. (c) 2009 Wiley-Liss, Inc.
Voit, E O; Knapp, R G
1997-08-15
The linear-logistic regression model and Cox's proportional hazard model are widely used in epidemiology. Their successful application leaves no doubt that they are accurate reflections of observed disease processes and their associated risks or incidence rates. In spite of their prominence, it is not a priori evident why these models work. This article presents a derivation of the two models from the framework of canonical modeling. It begins with a general description of the dynamics between risk sources and disease development, formulates this description in the canonical representation of an S-system, and shows how the linear-logistic model and Cox's proportional hazard model follow naturally from this representation. The article interprets the model parameters in terms of epidemiological concepts as well as in terms of general systems theory and explains the assumptions and limitations generally accepted in the application of these epidemiological models.
Unitary Response Regression Models
ERIC Educational Resources Information Center
Lipovetsky, S.
2007-01-01
The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…
An Expert System for the Evaluation of Cost Models
1990-09-01
The system's help screens draw on Applied Linear Regression Models by John Neter to define key regression diagnostics: unequal error variance, in contrast to the condition of equal error variance called homoscedasticity; the normality of the error terms; and autocorrelation, where error terms correlated over time are said to be autocorrelated or serially correlated.
NASA Astrophysics Data System (ADS)
Shih, C. Y.; Tsuei, Y. G.; Allemang, R. J.; Brown, D. L.
1988-10-01
A method of using the matrix Auto-Regressive Moving Average (ARMA) model in the Laplace domain for multiple-reference global parameter identification is presented. This method is particularly applicable to the area of modal analysis where high modal density exists. The method is also applicable when multiple reference frequency response functions are used to characterise linear systems. In order to facilitate the mathematical solution, the Forsythe orthogonal polynomial is used to reduce the ill-conditioning of the formulated equations and to decouple the normal matrix into two reduced matrix blocks. A Complex Mode Indicator Function (CMIF) is introduced, which can be used to determine the proper order of the rational polynomials.
A new linear least squares method for T1 estimation from SPGR signals with multiple TRs
NASA Astrophysics Data System (ADS)
Chang, Lin-Ching; Koay, Cheng Guan; Basser, Peter J.; Pierpaoli, Carlo
2009-02-01
The longitudinal relaxation time, T1, can be estimated from two or more spoiled gradient recalled echo (SPGR) images with two or more flip angles and one or more repetition times (TRs). The function relating signal intensity to the parameters is nonlinear; T1 maps can be computed from SPGR signals using nonlinear least squares regression. A widely used linear method transforms the nonlinear model by assuming a fixed TR in the SPGR images. This constraint is not desirable, since multiple TRs are a clinically practical way to reduce the total acquisition time, to satisfy the required resolution, and/or to combine SPGR data acquired at different times. A new linear least squares method is proposed using the first-order Taylor expansion. Monte Carlo simulations of SPGR experiments are used to evaluate the accuracy and precision of the T1 estimated by the proposed linear and the nonlinear methods. We show that the new linear least squares method provides T1 estimates comparable in both precision and accuracy to those from the nonlinear method, allowing multiple TRs and reducing computation time significantly.
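For context, the widely used fixed-TR linearization mentioned above (often referred to as DESPOT1) can be sketched as follows: plotting S/sin(α) against S/tan(α) gives a line whose slope is E1 = exp(-TR/T1). This is the conventional approach the paper improves on, not the paper's new multi-TR method, and the acquisition parameters below are hypothetical.

    # Sketch of the classic fixed-TR linearization: for SPGR signals
    # S(a) = M0*sin(a)*(1-E1)/(1-E1*cos(a)) with E1 = exp(-TR/T1),
    # S/sin(a) versus S/tan(a) is a straight line with slope E1.
    import numpy as np

    TR, T1_true, M0 = 0.015, 1.2, 1000.0                 # seconds, seconds, arbitrary units
    flip = np.deg2rad([3.0, 8.0, 15.0, 25.0])
    E1 = np.exp(-TR / T1_true)
    S = M0 * np.sin(flip) * (1 - E1) / (1 - E1 * np.cos(flip))
    S += np.random.default_rng(8).normal(0, 1.0, S.size)  # add a little noise

    y = S / np.sin(flip)
    x = S / np.tan(flip)
    slope, intercept = np.polyfit(x, y, 1)
    T1_est = -TR / np.log(slope)
    print(f"estimated T1 = {T1_est:.3f} s (true {T1_true} s)")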
Computer Mapping of Water Quality in Saginaw Bay with LANDSAT Digital Data
NASA Technical Reports Server (NTRS)
Rogers, R. H. (Principal Investigator); Shah, N. J.; Smith, V. E.; Mckeon, J. B.
1976-01-01
The author has identified the following significant results. LANDSAT digital data and ground truth measurements for Saginaw Bay (Lake Huron), Michigan, for 31 July 1975 were correlated by stepwise linear regression and the resulting equations used to estimate invisible water quality parameters in nonsampled areas. Chloride, conductivity, total Kjeldahl nitrogen, total phosphorus, and chlorophyll a were best correlated with the ratio of LANDSAT Band 4 to Band 5. Temperature and Secchi depth correlate best with Band 5.
Ngu, Roland Cheofor; Kadia, Benjamin Momo; Tianyi, Frank-Leonel; Choukem, Simeon Pierre
2018-01-01
Background Waist circumference (WC), waist-to-hip ratio (WHR) and waist-to-height ratio (WHtR) are all independent predictors of cardio-metabolic risk and therefore important in HIV/AIDS patients on antiretroviral therapy at risk of increased visceral adiposity. This study aimed to assess the extent of agreement between these parameters and the body mass index (BMI), as anthropometric parameters and in classifying cardio-metabolic risk in HIV/AIDS patients. Methods A secondary analysis of data from a cross-sectional study involving 200 HIV/AIDS patients was done. Anthropometric parameters were measured from participants using standard guidelines and central obesity defined according to recommended criteria. Increased cardio-metabolic risk was defined according to the standard cut-off values for all four parameters. Data were analyzed using STATA version 14.1. Results The prevalence of WC-defined central obesity, WHR-defined central obesity and WHtR > 0.50 were 33.5%, 44.5% and 36.5%, respectively. The prevalence of BMI-defined overweight and obesity was 40.5%. After adjusting for gender and HAART status, there was a significant linear association and correlation between WC and BMI (regression equation: WC (cm) = 37.184 + 1.756 BMI (Kg/m2) + 0.825 Male + 1.002 HAART, (p < 0.001, r = 0.65)), and between WHtR and BMI (regression equation: WHtR = 0.223 + 0.011 BMI (Kg/m2)– 0.0153 Male + 0.003 HAART, (p < 0.001, r = 0.65)), but not between WHR and BMI (p = 0.097, r = 0.13). There was no agreement between the WC, WHtR and BMI, and minimal agreement between the WHR and BMI, in identifying patients with an increased cardio-metabolic risk. Conclusion Despite the observed linear association and correlation between these anthropometric parameters, the routine use of WC, WHR and WHtR as better predictors of cardio-metabolic risk should be encouraged in these patients, due to their minimal agreement with BMI in identifying HIV/AIDS patients with increased cardio-metabolic risk. HAART status does not appear to significantly affect the association between these anthropometric parameters. PMID:29566089
Shabri, Ani; Samsudin, Ruhaidah
2014-01-01
Crude oil prices play a significant role in the global economy and are a key input into option pricing formulas, portfolio allocation, and risk measurement. In this paper, a hybrid model integrating wavelets and multiple linear regression (MLR) is proposed for crude oil price forecasting. In this model, the Mallat wavelet transform is first selected to decompose an original time series into several subseries with different scales. Then, principal component analysis (PCA) is used in processing the subseries data in the MLR for crude oil price forecasting. Particle swarm optimization (PSO) is used to select the optimal parameters of the MLR model. To assess the effectiveness of this model, the daily crude oil market, West Texas Intermediate (WTI), has been used as the case study. The time series prediction performance of the WMLR model is compared with the MLR, ARIMA, and GARCH models using various statistical measures. The experimental results show that the proposed model outperforms the individual models in forecasting the crude oil price series.
Relationship Between Earthquake b-Values and Crustal Stresses in a Young Orogenic Belt
NASA Astrophysics Data System (ADS)
Wu, Yih-Min; Chen, Sean Kuanhsiang; Huang, Ting-Chung; Huang, Hsin-Hua; Chao, Wei-An; Koulakov, Ivan
2018-02-01
It has been reported that earthquake b-values decrease linearly with the differential stresses in the continental crust and subduction zones. Here we report a regression-derived relation between earthquake b-values and crustal stresses using the Anderson fault parameter (Aϕ) in a young orogenic belt of Taiwan. This regression relation is well established by using a large and complete earthquake catalog for Taiwan. The data set consists of b-values and Aϕ values derived from relocated earthquakes and focal mechanisms, respectively. Our results show that b-values decrease linearly with the Aϕ values at crustal depths with a high correlation coefficient of -0.9. Thus, b-values could be used as stress indicators for orogenic belts. However, the state of stress is relatively well correlated with the surface geological setting with respect to earthquake b-values in Taiwan. Temporal variations in the b-value could constitute one of the main reasons for the spatial heterogeneity of b-values. We therefore suggest that b-values could be highly sensitive to temporal stress variations.
Zheng, Xueying; Qin, Guoyou; Tu, Dongsheng
2017-05-30
Motivated by the analysis of quality of life data from a clinical trial on early breast cancer, we propose in this paper a generalized partially linear mean-covariance regression model for longitudinal proportional data, which are bounded in a closed interval. Cholesky decomposition of the covariance matrix for within-subject responses and generalized estimation equations are used to estimate unknown parameters and the nonlinear function in the model. Simulation studies are performed to evaluate the performance of the proposed estimation procedures. Our new model is also applied to analyze the data from the cancer clinical trial that motivated this research. In comparison with available models in the literature, the proposed model does not require specific parametric assumptions on the density function of the longitudinal responses and the probability function of the boundary values and can capture dynamic changes of time or other interested variables on both mean and covariance of the correlated proportional responses. Copyright © 2017 John Wiley & Sons, Ltd.
Shabri, Ani; Samsudin, Ruhaidah
2014-01-01
Crude oil prices play a significant role in the global economy and are a key input into option pricing formulas, portfolio allocation, and risk measurement. In this paper, a hybrid model integrating wavelets and multiple linear regression (MLR) is proposed for crude oil price forecasting. In this model, the Mallat wavelet transform is first selected to decompose an original time series into several subseries with different scales. Then, principal component analysis (PCA) is used in processing the subseries data in the MLR for crude oil price forecasting. Particle swarm optimization (PSO) is used to select the optimal parameters of the MLR model. To assess the effectiveness of this model, the daily crude oil market, West Texas Intermediate (WTI), has been used as the case study. The time series prediction performance of the WMLR model is compared with the MLR, ARIMA, and GARCH models using various statistical measures. The experimental results show that the proposed model outperforms the individual models in forecasting the crude oil price series. PMID:24895666
Max dD/dt: A Novel Parameter to Assess Fetal Cardiac Contractility and a Substitute for Max dP/dt.
Fujita, Yasuyuki; Kiyokoba, Ryo; Yumoto, Yasuo; Kato, Kiyoko
2018-07-01
Aortic pulse waveforms are composed of a forward wave from the heart and a reflection wave from the periphery. We focused on this forward wave and suggested a new parameter, the maximum slope of aortic pulse waveforms (max dD/dt), for fetal cardiac contractility. Max dD/dt was calculated from fetal aortic pulse waveforms recorded with an echo-tracking system. A normal range of max dD/dt was constructed in 105 healthy fetuses using linear regression analysis. Twenty-two fetuses with suspected fetal cardiac dysfunction were divided into normal and decreased max dD/dt groups, and their clinical parameters were compared. Max dD/dt of aortic pulse waveforms increased linearly with advancing gestational age (r = 0.93). The decreased max dD/dt was associated with abnormal cardiotocography findings and short- and long-term prognosis. In conclusion, max dD/dt calculated from the aortic pulse waveforms in fetuses can substitute for max dP/dt, an index of cardiac contractility in adults. Copyright © 2018 World Federation for Ultrasound in Medicine and Biology. Published by Elsevier Inc. All rights reserved.
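Computing max dD/dt itself is straightforward once a diameter waveform D(t) is available, as sketched below with a synthetic pulse; the sampling rate and waveform shape are placeholders for what the echo-tracking system would provide.

    # Sketch: max dD/dt as the steepest upslope of an aortic diameter waveform D(t).
    # The waveform below is synthetic; in practice D(t) comes from the echo-tracking system.
    import numpy as np

    fs = 1000.0                                           # sampling rate, Hz
    t = np.arange(0, 0.4, 1 / fs)                         # one cardiac cycle, s
    D = 3.0 + 0.5 * np.exp(-((t - 0.08) / 0.03) ** 2)     # hypothetical diameter pulse, mm

    dDdt = np.gradient(D, t)                              # numerical derivative, mm/s
    print("max dD/dt =", dDdt.max(), "mm/s at t =", t[np.argmax(dDdt)], "s")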
HYDRORECESSION: A toolbox for streamflow recession analysis
NASA Astrophysics Data System (ADS)
Arciniega, S.
2015-12-01
Streamflow recession curves are hydrological signatures that allow studying the relationship between groundwater storage and baseflow and/or low flows at the catchment scale. Recent studies have shown that streamflow recession analysis can be quite sensitive to the combination of different models, extraction techniques and parameter estimation methods. In order to better characterize streamflow recession curves, new methodologies combining multiple approaches have been recommended. The HYDRORECESSION toolbox, presented here, is a Matlab graphical user interface developed to analyse streamflow recession time series with the support of different tools for parameterizing linear and nonlinear storage-outflow relationships through four of the most useful recession models (Maillet, Boussinesq, Coutagne and Wittenberg). The toolbox includes four parameter-fitting techniques (linear regression, lower envelope, data binning and mean squared error) and three different methods to extract hydrograph recession segments (Vogel, Brutsaert and Aksoy). In addition, the toolbox has a module that separates the baseflow component from the observed hydrograph using the inverse reservoir algorithm. Potential applications provided by HYDRORECESSION include model parameter analysis, hydrological regionalization and classification, baseflow index estimates, catchment-scale recharge and low-flow modelling, among others. HYDRORECESSION is freely available for non-commercial and academic purposes.
NASA Technical Reports Server (NTRS)
Quek, Kok How Francis
1990-01-01
A method of computing reliable Gaussian and mean curvature sign-map descriptors from the polynomial approximation of surfaces was demonstrated. Such descriptors which are invariant under perspective variation are suitable for hypothesis generation. A means for determining the pose of constructed geometric forms whose algebraic surface descriptors are nonlinear in terms of their orienting parameters was developed. This was done by means of linear functions which are capable of approximating nonlinear forms and determining their parameters. It was shown that biquadratic surfaces are suitable companion linear forms for cylindrical approximation and parameter estimation. The estimates provided the initial parametric approximations necessary for a nonlinear regression stage to fine tune the estimates by fitting the actual nonlinear form to the data. A hypothesis-based split-merge algorithm for extraction and pose determination of cylinders and planes which merge smoothly into other surfaces was developed. It was shown that all split-merge algorithms are hypothesis-based. A finite-state algorithm for the extraction of the boundaries of run-length regions was developed. The computation takes advantage of the run list topology and boundary direction constraints implicit in the run-length encoding.
Asquith, William H.; Roussel, Meghan C.
2009-01-01
Annual peak-streamflow frequency estimates are needed for flood-plain management; for objective assessment of flood risk; for cost-effective design of dams, levees, and other flood-control structures; and for design of roads, bridges, and culverts. Annual peak-streamflow frequency represents the peak streamflow for nine recurrence intervals of 2, 5, 10, 25, 50, 100, 200, 250, and 500 years. Common methods for estimation of peak-streamflow frequency for ungaged or unmonitored watersheds are regression equations for each recurrence interval developed for one or more regions; such regional equations are the subject of this report. The method is based on analysis of annual peak-streamflow data from U.S. Geological Survey streamflow-gaging stations (stations). Beginning in 2007, the U.S. Geological Survey, in cooperation with the Texas Department of Transportation and in partnership with Texas Tech University, began a 3-year investigation concerning the development of regional equations to estimate annual peak-streamflow frequency for undeveloped watersheds in Texas. The investigation focuses primarily on 638 stations with 8 or more years of data from undeveloped watersheds and other criteria. The general approach is explicitly limited to the use of L-moment statistics, which are used in conjunction with a technique of multi-linear regression referred to as PRESS minimization. The approach used to develop the regional equations, which was refined during the investigation, is referred to as the 'L-moment-based, PRESS-minimized, residual-adjusted approach'. For the approach, seven unique distributions are fit to the sample L-moments of the data for each of 638 stations and trimmed means of the seven results of the distributions for each recurrence interval are used to define the station specific, peak-streamflow frequency. As a first iteration of regression, nine weighted-least-squares, PRESS-minimized, multi-linear regression equations are computed using the watershed characteristics of drainage area, dimensionless main-channel slope, and mean annual precipitation. The residuals of the nine equations are spatially mapped, and residuals for the 10-year recurrence interval are selected for generalization to 1-degree latitude and longitude quadrangles. The generalized residual is referred to as the OmegaEM parameter and represents a generalized terrain and climate index that expresses peak-streamflow potential not otherwise represented in the three watershed characteristics. The OmegaEM parameter was assigned to each station, and using OmegaEM, nine additional regression equations are computed. Because of favorable diagnostics, the OmegaEM equations are expected to be generally reliable estimators of peak-streamflow frequency for undeveloped and ungaged stream locations in Texas. The mean residual standard error, adjusted R-squared, and percentage reduction of PRESS by use of OmegaEM are 0.30log10, 0.86, and -21 percent, respectively. Inclusion of the OmegaEM parameter provides a substantial reduction in the PRESS statistic of the regression equations and removes considerable spatial dependency in regression residuals. Although the OmegaEM parameter requires interpretation on the part of analysts and the potential exists that different analysts could estimate different values for a given watershed, the authors suggest that typical uncertainty in the OmegaEM estimate might be about +or-0.1010. 
Finally, given the two ensembles of equations reported herein and those in previous reports, hydrologic design engineers and other analysts have several different methods, which represent different analytical tracks, to make comparisons of peak-streamflow frequency estimates for ungaged watersheds in the study area.
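The PRESS statistic central to the "PRESS-minimized" regressions can be computed without explicit leave-one-out refitting, using the hat matrix. The sketch below uses ordinary least squares and invented watershed characteristics; the report itself uses weighted least squares on L-moment-based frequency estimates.

    # Sketch: the PRESS statistic (prediction error sum of squares) for a linear regression,
    # computed from the hat matrix so no explicit leave-one-out refitting is needed.
    # Watershed characteristics below are synthetic placeholders.
    import numpy as np

    rng = np.random.default_rng(9)
    n = 200
    log_area = rng.normal(2.0, 0.8, n)
    log_slope = rng.normal(-2.0, 0.4, n)
    precip = rng.normal(30.0, 8.0, n)
    log_q100 = 1.0 + 0.7 * log_area + 0.3 * log_slope + 0.02 * precip + rng.normal(0, 0.3, n)

    X = np.column_stack([np.ones(n), log_area, log_slope, precip])
    beta, *_ = np.linalg.lstsq(X, log_q100, rcond=None)
    resid = log_q100 - X @ beta
    H = X @ np.linalg.inv(X.T @ X) @ X.T                  # hat matrix
    press = np.sum((resid / (1.0 - np.diag(H))) ** 2)
    print("PRESS =", press, " residual SS =", np.sum(resid ** 2))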
Compound Identification Using Penalized Linear Regression on Metabolomics
Liu, Ruiqi; Wu, Dongfeng; Zhang, Xiang; Kim, Seongho
2014-01-01
Compound identification is often achieved by matching the experimental mass spectra to the mass spectra stored in a reference library based on mass spectral similarity. Because the number of compounds in the reference library is much larger than the range of mass-to-charge ratio (m/z) values, the data become high dimensional and suffer from singularity. For this reason, penalized linear regressions such as ridge regression and the lasso are used instead of ordinary least squares regression. Furthermore, two-step approaches using the dot product and Pearson’s correlation along with the penalized linear regression are proposed in this study. PMID:27212894
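A compact sketch of the penalized-regression idea: the experimental spectrum is regressed on a library matrix whose columns are candidate compounds, and ridge or lasso penalties handle the fact that there are more compounds than m/z values. The library, spectrum and penalty strengths here are synthetic assumptions, not the paper's data or tuning.

    # Sketch: matching an experimental spectrum against a reference library by penalized
    # linear regression (ridge and lasso), where columns of the library matrix are compounds
    # and rows are m/z bins.  The library and spectrum are synthetic.
    import numpy as np
    from sklearn.linear_model import Ridge, Lasso

    rng = np.random.default_rng(10)
    n_mz, n_compounds = 300, 1200                          # more compounds than m/z bins
    library = rng.random((n_mz, n_compounds))
    true_idx = [17, 430]                                   # pretend the sample contains two compounds
    spectrum = library[:, true_idx].sum(axis=1) + 0.01 * rng.normal(size=n_mz)

    ridge = Ridge(alpha=1.0).fit(library, spectrum)
    lasso = Lasso(alpha=0.01, max_iter=20000).fit(library, spectrum)
    print("top ridge matches:", np.argsort(ridge.coef_)[-3:][::-1])
    print("nonzero lasso matches:", np.flatnonzero(lasso.coef_)[:10])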
Galluzzi, Paolo; de Jong, Marcus C; Sirin, Selma; Maeder, Philippe; Piu, Pietro; Cerase, Alfonso; Monti, Lucia; Brisse, Hervé J; Castelijns, Jonas A; de Graaf, Pim; Goericke, Sophia L
2016-07-01
Differentiation between normal solid (non-cystic) pineal glands and pineal pathologies on brain MRI is difficult. The aim of this study was to assess the size of the solid pineal gland in children (0-5 years) and compare the findings with published pineoblastoma cases. We retrospectively analyzed the size (width, height, planimetric area) of solid pineal glands in 184 non-retinoblastoma patients (73 female, 111 male) aged 0-5 years on MRI. The effect of age and gender on gland size was evaluated. Linear regression analysis was performed to analyze the relation between size and age. Ninety-nine percent prediction intervals around the mean were added to construct a normal size range per age, with the upper bound of the predictive interval as the parameter of interest as a cutoff for normalcy. There was no significant interaction of gender and age for all the three pineal gland parameters (width, height, and area). Linear regression analysis gave 99 % upper prediction bounds of 7.9, 4.8, and 25.4 mm(2), respectively, for width, height, and area. The slopes (size increase per month) of each parameter were 0.046, 0.023, and 0.202, respectively. Ninety-three percent (95 % CI 66-100 %) of asymptomatic solid pineoblastomas were larger in size than the 99 % upper bound. This study establishes norms for solid pineal gland size in non-retinoblastoma children aged 0-5 years. Knowledge of the size of the normal pineal gland is helpful for detection of pineal gland abnormalities, particularly pineoblastoma.
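The normal-range construction can be sketched as a simple linear regression with a 99% prediction interval, whose upper bound at a given age serves as the cutoff for normalcy. The data below are simulated (only the published slope for width is borrowed as a rough guide); this is not the study's dataset.

    # Sketch: simple linear regression of pineal width on age with a 99% prediction
    # interval; the upper bound at a given age would serve as the cutoff for normalcy.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(11)
    n = 184
    age = rng.uniform(0, 60, n)                            # months
    width = 4.0 + 0.046 * age + rng.normal(0, 0.8, n)      # mm, slope borrowed as a rough guide

    slope, intercept, *_ = stats.linregress(age, width)
    resid = width - (intercept + slope * age)
    s = np.sqrt(np.sum(resid ** 2) / (n - 2))
    t99 = stats.t.ppf(0.995, df=n - 2)                     # two-sided 99% interval

    def upper_bound(age0):
        se_pred = s * np.sqrt(1 + 1 / n + (age0 - age.mean()) ** 2 / np.sum((age - age.mean()) ** 2))
        return intercept + slope * age0 + t99 * se_pred

    print("99% upper bound for width at 24 months:", upper_bound(24.0), "mm")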
Sperm quality variables as indicators of bull fertility may be breed dependent.
Morrell, Jane M; Nongbua, Thanapol; Valeanu, Sabina; Lima Verde, Isabel; Lundstedt-Enkel, Katrin; Edman, Anders; Johannisson, Anders
2017-10-01
A means of discriminating among bulls of high fertility based on sperm quality is needed by breeding centers. The objective of the study was to examine parameters of sperm quality in bulls of known fertility to identify useful indicators of fertility. Frozen semen was available from bulls of known fertility (Viking Genetics, Skara, Sweden): Swedish Red (n=31), Holstein (n=25) and Others (one each of Charolais, Limousin, Blonde, SKB). After thawing, the sperm samples were analyzed for motility (computer assisted sperm analysis), plasma membrane integrity, chromatin integrity, acrosome status, mitochondrial activity and reactive oxygen species. A fertility index score based on the adjusted 56-day non-return rate for >1000 inseminations was available for each bull. Multivariate data analysis (Partial Least Squares Regression and Orthogonal Partial Least Squares Regression) was performed to identify variables related to fertility; Pearson univariate correlations were made on the parameters of interest. Breed of bull affected the relationship of sperm quality variables and fertility index score, as follows: Swedish Red: %DNA Fragmentation Index, r=-0.56, P<0.01; intact plasma membrane, r=0.40, P<0.05; membrane damaged, not acrosome reacted, r=-0.6, P<0.01; Linearity, r=0.37, P<0.05; there was a trend towards significance for Wobble, r=0.34, P=0.08. Holstein: Linearity was significant r=0.46, P<0.05; there was a trend towards significance for Wobble, r=0.45, P=0.08. In conclusion, breed has a greater effect on sperm quality than previously realized; different parameters of sperm quality are needed to indicate potential fertility in different breeds. Copyright © 2017 Elsevier B.V. All rights reserved.
Garcia-Hermoso, A; Agostinis-Sobrinho, C; Mota, J; Santos, R M; Correa-Bautista, J E; Ramírez-Vélez, R
2017-06-01
Studies in the paediatric population have shown inconsistent associations between cardiorespiratory fitness and inflammation independently of adiposity. The purpose of this study was (i) to analyse the combined association of cardiorespiratory fitness and adiposity with high-sensitivity C-reactive protein (hs-CRP), and (ii) to determine whether adiposity acts as a mediator of the association between cardiorespiratory fitness and hs-CRP in children and adolescents. This cross-sectional study included 935 (54.7% girls) healthy children and adolescents from Bogotá, Colombia. The 20 m shuttle run test was used to estimate cardiorespiratory fitness. We assessed the following adiposity parameters: body mass index, waist circumference, fat mass index, and the sum of subscapular and triceps skinfold thicknesses. High sensitivity assays were used to obtain hs-CRP. Linear regression models were fitted for mediation analyses to examine whether the association between cardiorespiratory fitness and hs-CRP was mediated by each of the adiposity parameters, following the Baron and Kenny procedure. Lower levels of hs-CRP were associated with the best schoolchildren profiles (high cardiorespiratory fitness + low adiposity) (p for trend <0.001 for the four adiposity parameters), compared with unfit and overweight (low cardiorespiratory fitness + high adiposity) counterparts. Linear regression models suggest a full mediation of adiposity on the association between cardiorespiratory fitness and hs-CRP levels. Our findings seem to emphasize the importance of obesity prevention in childhood, suggesting that having high levels of cardiorespiratory fitness may not counteract the negative consequences ascribed to adiposity on hs-CRP. Copyright © 2017 The Italian Society of Diabetology, the Italian Society for the Study of Atherosclerosis, the Italian Society of Human Nutrition, and the Department of Clinical Medicine and Surgery, Federico II University. Published by Elsevier B.V. All rights reserved.
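The Baron and Kenny mediation steps referred to above amount to three nested linear regressions; a toy sketch with statsmodels follows, where the variable names (crf, fmi, crp) and the simulated full-mediation structure are assumptions for illustration only.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: crf = cardiorespiratory fitness, fmi = fat mass index, crp = hs-CRP
rng = np.random.default_rng(3)
n = 935
crf = rng.normal(0, 1, n)
fmi = -0.6 * crf + rng.normal(0, 1, n)                 # adiposity partly determined by fitness
crp = 0.5 * fmi + rng.normal(0, 1, n)                  # full mediation built into this toy setup
df = pd.DataFrame({"crf": crf, "fmi": fmi, "crp": crp})

# Baron & Kenny steps:
step1 = smf.ols("crp ~ crf", df).fit()                 # X -> Y (total effect)
step2 = smf.ols("fmi ~ crf", df).fit()                 # X -> M
step3 = smf.ols("crp ~ crf + fmi", df).fit()           # X + M -> Y (direct effect of X)

print("total effect of fitness:", round(step1.params["crf"], 3))
print("effect of fitness on adiposity:", round(step2.params["crf"], 3))
print("direct effect after adjusting for adiposity:", round(step3.params["crf"], 3),
      "(near zero suggests full mediation)")
```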
Correa-Rodríguez, María; Schmidt-RioValle, Jacqueline; Rueda-Medina, Blanca
2017-11-01
The aim of the present study was to investigate the possible influence of low-density lipoprotein receptor-related protein 5 (LRP5) and sclerostin (SOST) genes as genetic factors contributing to calcaneal quantitative ultrasound (QUS) and body composition variables in a population of young Caucasian adults. The study population comprised a total of 575 individuals (mean age 20.41 years; SD 2.36) whose bone mass was assessed through QUS to determine broadband ultrasound attenuation (BUA, dB/MHz). Body composition measurements were performed using a body composition analyser. Seven single-nucleotide polymorphisms (SNPs) of LRP5 (rs2306862, rs599083, rs556442 and rs3736228) and SOST (rs4792909, rs851054 and rs2023794) were selected as genetic markers and genotyped using TaqMan OpenArray® technology. Linear regression analysis was used to test the possible association of the tested SNPs with QUS and body composition parameters. Linear regression analysis revealed that the rs3736228 SNP of LRP5 was significantly associated with BUA after adjustment for age, sex, weight, height, physical activity and calcium intake (P = 0.028, β (95% CI) = 0.089 (0.099-1.691)). For the remaining SNPs, no significant association with the QUS measurement was observed. Regarding body composition, no significant association was found between LRP5 and SOST polymorphisms and body mass index, total fat mass and total lean mass after adjustment for age and sex as covariates. We concluded that the rs3736228 LRP5 genetic polymorphism influences the calcaneal QUS parameter in a population of young Caucasian adults. This finding suggests that LRP5 might be an important genetic marker contributing to bone mass accrual early in life.
López, Carlos; Jaén Martinez, Joaquín; Lejeune, Marylène; Escrivà, Patricia; Salvadó, Maria T; Pons, Lluis E; Alvaro, Tomás; Baucells, Jordi; García-Rojo, Marcial; Cugat, Xavier; Bosch, Ramón
2009-10-01
The volume of digital image (DI) storage continues to be an important problem in computer-assisted pathology. DI compression enables the size of files to be reduced but with the disadvantage of loss of quality. Previous results indicated that the efficiency of computer-assisted quantification of immunohistochemically stained cell nuclei may be significantly reduced when compressed DIs are used. This study attempts to show, with respect to immunohistochemically stained nuclei, which morphometric parameters may be altered by the different levels of JPEG compression, and the implications of these alterations for automated nuclear counts, and further, develops a method for correcting this discrepancy in the nuclear count. For this purpose, 47 DIs from different tissues were captured in uncompressed TIFF format and converted to 1:3, 1:23 and 1:46 compression JPEG images. Sixty-five positive objects were selected from these images, and six morphological parameters were measured and compared for each object in TIFF images and those of the different compression levels using a set of previously developed and tested macros. Roundness proved to be the only morphological parameter that was significantly affected by image compression. Factors to correct the discrepancy in the roundness estimate were derived from linear regression models for each compression level, thereby eliminating the statistically significant differences between measurements in the equivalent images. These correction factors were incorporated in the automated macros, where they reduced the nuclear quantification differences arising from image compression. Our results demonstrate that it is possible to carry out unbiased automated immunohistochemical nuclear quantification in compressed DIs with a methodology that could be easily incorporated in different systems of digital image analysis.
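The correction described here, deriving a factor from a linear regression of the reference (TIFF) measurement on the compressed (JPEG) measurement and applying it inside the counting macro, can be sketched as follows; the paired roundness values are simulated and the bias magnitude is hypothetical.

```python
import numpy as np

# Hypothetical paired measurements of nuclear roundness for the same 65 objects
rng = np.random.default_rng(4)
round_tiff = rng.uniform(0.6, 1.0, 65)                            # reference (uncompressed TIFF)
round_jpeg = 0.92 * round_tiff + 0.03 + rng.normal(0, 0.01, 65)   # biased by 1:46 JPEG compression

# Linear regression of the reference value on the compressed measurement
slope, intercept = np.polyfit(round_jpeg, round_tiff, 1)
corrected = slope * round_jpeg + intercept                        # correction applied in the macro

print("mean bias before correction:", round(np.mean(round_jpeg - round_tiff), 4))
print("mean bias after correction: ", round(np.mean(corrected - round_tiff), 4))
```

A separate slope/intercept pair would be stored per compression level, mirroring the per-level correction factors reported above.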
NASA Astrophysics Data System (ADS)
Reyer, D.; Philipp, S. L.
2014-09-01
Information about geomechanical and physical rock properties, particularly uniaxial compressive strength (UCS), are needed for geomechanical model development and updating with logging-while-drilling methods to minimise costs and risks of the drilling process. The following parameters with importance at different stages of geothermal exploitation and drilling are presented for typical sedimentary and volcanic rocks of the Northwest German Basin (NWGB): physical (P wave velocities, porosity, and bulk and grain density) and geomechanical parameters (UCS, static Young's modulus, destruction work and indirect tensile strength both perpendicular and parallel to bedding) for 35 rock samples from quarries and 14 core samples of sandstones and carbonate rocks. With regression analyses (linear- and non-linear) empirical relations are developed to predict UCS values from all other parameters. Analyses focus on sedimentary rocks and were repeated separately for clastic rock samples or carbonate rock samples as well as for outcrop samples or core samples. Empirical relations have high statistical significance for Young's modulus, tensile strength and destruction work; for physical properties, there is a wider scatter of data and prediction of UCS is less precise. For most relations, properties of core samples plot within the scatter of outcrop samples and lie within the 90% prediction bands of developed regression functions. The results indicate the applicability of empirical relations that are based on outcrop data on questions related to drilling operations when the database contains a sufficient number of samples with varying rock properties. The presented equations may help to predict UCS values for sedimentary rocks at depth, and thus develop suitable geomechanical models for the adaptation of the drilling strategy on rock mechanical conditions in the NWGB.
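A hedged sketch of how such empirical UCS relations are typically fitted, one linear and one power-law (non-linear in the original scale, linear after log transformation); the P-wave velocity data and coefficients below are invented for illustration and are not the NWGB regressions.

```python
import numpy as np

# Hypothetical outcrop samples: P-wave velocity Vp (km/s) and UCS (MPa)
rng = np.random.default_rng(5)
vp = rng.uniform(2.0, 5.5, 35)
ucs = 8.0 * vp ** 1.8 * rng.lognormal(0, 0.25, 35)   # scattered power-law relation

# Linear empirical relation: UCS = a + b * Vp
b_lin, a_lin = np.polyfit(vp, ucs, 1)

# Non-linear (power-law) relation fitted in log-log space: UCS = c * Vp**d
d_pow, logc = np.polyfit(np.log(vp), np.log(ucs), 1)
c_pow = np.exp(logc)

print(f"linear:    UCS ~ {a_lin:.1f} + {b_lin:.1f} * Vp")
print(f"power law: UCS ~ {c_pow:.1f} * Vp^{d_pow:.2f}")
```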
Simulation of uphill/downhill running on a level treadmill using additional horizontal force.
Gimenez, Philippe; Arnal, Pierrick J; Samozino, Pierre; Millet, Guillaume Y; Morin, Jean-Benoit
2014-07-18
Tilting treadmills allow a convenient study of biomechanics during uphill/downhill running, but they are not commonly available, and tilting force-measuring treadmills are even rarer. The aim of the present study was to compare uphill/downhill running on a treadmill (inclination of ± 8%) with running on a level treadmill using additional backward or forward pulling forces to simulate the effect of gravity. This comparison specifically focused on the energy cost of running, stride frequency (SF), electromyographic activity (EMG), leg and foot angles at foot strike, and ground impact shock. The main results are that SF, impact shock, and leg and foot angle parameters were very similar and significantly correlated between the two methods, the intercept and slope of the linear regression not differing significantly from zero and unity, respectively. The correlation of oxygen uptake (V̇O2) data between both methods was not significant during uphill running (r=0.42; P>0.05). V̇O2 data were correlated during downhill running (r=0.74; P<0.01) but there was a significant difference between the methods (bias=-2.51 ± 1.94 ml min(-1) kg(-1)). Linear regressions for EMG of vastus lateralis, biceps femoris, gastrocnemius lateralis, soleus and tibialis anterior were not different from the identity line but the systematic bias was elevated for this parameter. In conclusion, this method seems appropriate for the study of SF, leg and foot angle, and impact shock parameters, but is less applicable for physiological variables (EMG and energy cost) during uphill/downhill running when using a tilting force-measuring treadmill is not possible. Copyright © 2014 Elsevier Ltd. All rights reserved.
Effect of body mass index on hemiparetic gait.
Sheffler, Lynne R; Bailey, Stephanie Nogan; Gunzler, Douglas; Chae, John
2014-10-01
To evaluate the relationship between body mass index (BMI) and spatiotemporal, kinematic, and kinetic gait parameters in chronic hemiparetic stroke survivors. Secondary analysis of data collected in a randomized controlled trial comparing two 12-week ambulation training treatments. Academic medical center. Chronic hemiparetic stroke survivors (N = 108, >3 months poststroke). Linear regression analyses were performed of BMI and selected pretreatment gait parameters recorded using quantitative gait analysis. Spatiotemporal, kinematic, and kinetic gait parameters. A series of linear regression models that controlled for age, gender, stroke type (ischemic versus hemorrhagic), interval poststroke, level of motor impairment (Fugl-Meyer score), and walking speed found BMI to be positively associated with step width (m) (β = 0.364, P < .001), positively associated with peak hip abduction angle of the nonparetic limb during stance (deg) (β = 0.177, P = .040), negatively associated with ankle dorsiflexion angle at initial contact of the paretic limb (deg) (β = -0.222, P = .023), and negatively associated with peak ankle power at push-off of the paretic limb (W/kg) (β = -0.142, P = .026). When walking at a similar speed, chronic hemiparetic stroke subjects with a higher BMI demonstrated greater step width, greater hip hiking of the paretic lower limb, less paretic limb dorsiflexion at initial contact, and less paretic ankle power at push-off as compared to stroke subjects with a lower BMI and similar level of motor impairment. Further studies are necessary to determine the clinical relevance of these findings with respect to rehabilitation strategies for gait dysfunction in hemiparetic patients with higher BMIs. Copyright © 2014 American Academy of Physical Medicine and Rehabilitation. Published by Elsevier Inc. All rights reserved.
Possible association between obesity and periodontitis in patients with Down syndrome.
Culebras-Atienza, E; Silvestre, F-J; Silvestre-Rangil, J
2018-05-01
The present study was carried out to evaluate the possible association between obesity and periodontitis in patients with DS, and to explore which measure of obesity is most closely correlated to periodontitis. A prospective observational study was made to determine whether obesity is related to periodontal disease in patients with DS. The anthropometric variables were body height and weight, which were used to calculate BMI and stratify the patients into three categories: < 25 (normal weight), 25-29.9 (overweight) and ≥ 30.0 kg/m2 (obese). Waist circumference was recorded, and hip circumference in turn was recorded as the greatest circumference at the level of the buttocks; the waist/hip ratio (WHR) was then calculated. Periodontal evaluation was made of all teeth, recording the plaque index (PI), pocket depth (PD), clinical attachment level (CAL) and the gingival index (GI). We generated a multivariate linear regression model to examine the relationship between PD and the frequency of tooth brushing, gender, BMI, WHI, WHR, age and PI. Significant positive correlations were observed among the anthropometric parameters BMI, WHR, WHI and among the periodontal parameters PI, PD, CAL and GI. The only positive correlation between the anthropometric and periodontal parameters corresponded to WHR. Upon closer examination, the distribution of WHR was seen to differ according to gender. Among the women, the correlation between WHR and the periodontal variables decreased to nonsignificant levels. In contrast, among the males the correlation remained significant and even increased. In a multivariate linear regression model, the coefficients relating PD to PI, WHR and age were positive and significant in all cases. Our results suggest that there may indeed be an association between obesity and periodontitis in male patients with DS. Also, we found a clear correlation with WHR, which was considered to be the ideal adiposity indicator in this context.
Genetic analyses of stillbirth in relation to litter size using random regression models.
Chen, C Y; Misztal, I; Tsuruta, S; Herring, W O; Holl, J; Culbertson, M
2010-12-01
Estimates of genetic parameters for number of stillborns (NSB) in relation to litter size (LS) were obtained with random regression models (RRM). Data were collected from 4 purebred Duroc nucleus farms between 2004 and 2008. Two data sets with 6,575 litters for the first parity (P1) and 6,259 litters for the second to fifth parity (P2-5) with a total of 8,217 and 5,066 animals in the pedigree were analyzed separately. Number of stillborns was studied as a trait on sow level. Fixed effects were contemporary groups (farm-year-season) and fixed cubic regression coefficients on LS with Legendre polynomials. Models for P2-5 included the fixed effect of parity. Random effects were additive genetic effects for both data sets with permanent environmental effects included for P2-5. Random effects modeled with Legendre polynomials (RRM-L), linear splines (RRM-S), and degree 0 B-splines (RRM-BS) with regressions on LS were used. For P1, the order of polynomial, the number of knots, and the number of intervals used for respective models were quadratic, 3, and 3, respectively. For P2-5, the same parameters were linear, 2, and 2, respectively. Heterogeneous residual variances were considered in the models. For P1, estimates of heritability were 12 to 15%, 5 to 6%, and 6 to 7% in LS 5, 9, and 13, respectively. For P2-5, estimates were 15 to 17%, 4 to 5%, and 4 to 6% in LS 6, 9, and 12, respectively. For P1, average estimates of genetic correlations between LS 5 to 9, 5 to 13, and 9 to 13 were 0.53, -0.29, and 0.65, respectively. For P2-5, same estimates averaged for RRM-L and RRM-S were 0.75, -0.21, and 0.50, respectively. For RRM-BS with 2 intervals, the correlation was 0.66 between LS 5 to 7 and 8 to 13. Parameters obtained by 3 RRM revealed the nonlinear relationship between additive genetic effect of NSB and the environmental deviation of LS. The negative correlations between the 2 extreme LS might possibly indicate different genetic bases on incidence of stillbirth.
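The fixed cubic regression on litter size with Legendre polynomials can be illustrated with numpy's Legendre Vandermonde basis; the sketch below covers only that fixed part (the random additive-genetic and permanent-environment regressions of the full RRM are omitted) and uses simulated litter records.

```python
import numpy as np
from numpy.polynomial import legendre

# Hypothetical litter records: litter size (LS) and number of stillborns (NSB)
rng = np.random.default_rng(6)
ls = rng.integers(5, 14, 600)                             # litter sizes 5..13
nsb = 0.3 + 0.02 * (ls - 9) ** 2 + rng.normal(0, 0.4, 600)  # nonlinear trend in LS

# Rescale LS to [-1, 1] (the domain of Legendre polynomials), then build a cubic basis
z = 2 * (ls - ls.min()) / (ls.max() - ls.min()) - 1
Phi = legendre.legvander(z, 3)                            # columns: P0(z) .. P3(z)

coef, *_ = np.linalg.lstsq(Phi, nsb, rcond=None)          # fixed cubic regression on LS
fitted = Phi @ coef
for s in (5, 9, 13):
    print(f"LS={s}: fitted NSB = {fitted[ls == s].mean():.2f}")
```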
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jahandideh, Sepideh; Jahandideh, Samad; Asadabadi, Ebrahim Barzegari
2009-11-15
Prediction of the amount of hospital waste production will be helpful in the storage, transportation and disposal of hospital waste management. Based on this fact, two predictor models including artificial neural networks (ANNs) and multiple linear regression (MLR) were applied to predict the rate of medical waste generation in total and by type (sharp, infectious and general). In this study, a 5-fold cross-validation procedure on a database containing a total of 50 hospitals of Fars province (Iran) was used to verify the performance of the models. Three performance measures including MAR, RMSE and R² were used to evaluate the performance of the models. The MLR as a conventional model obtained poor prediction performance measure values. However, MLR distinguished hospital capacity and bed occupancy as more significant parameters. On the other hand, ANNs as a more powerful model, which has not previously been introduced for predicting the rate of medical waste generation, showed high performance measure values, especially the R² value of 0.99 confirming the good fit of the data. Such satisfactory results could be attributed to the non-linear nature of ANNs in problem solving, which provides the opportunity for relating independent variables to dependent ones non-linearly. In conclusion, the obtained results showed that our ANN-based model approach is very promising and may play a useful role in developing a better cost-effective strategy for waste management in future.
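A toy version of the MLR-versus-ANN comparison under 5-fold cross-validation, using scikit-learn; the hospital variables, their relationship to waste generation, and the network size are hypothetical, and neither model is tuned, so this only sketches the workflow rather than reproducing the reported performance values.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

# Hypothetical hospital data: beds, bed occupancy (%), hospital type -> waste (kg/day)
rng = np.random.default_rng(7)
n = 50
beds = rng.integers(50, 600, n).astype(float)
occupancy = rng.uniform(40, 95, n)
htype = rng.integers(0, 2, n).astype(float)                      # 0 = general, 1 = specialist
waste = 2.5 * beds * occupancy / 100 + 30 * htype + rng.normal(0, 40, n)

X = np.column_stack([beds, occupancy, htype])
models = {
    "MLR": LinearRegression(),
    "ANN": make_pipeline(StandardScaler(),
                         MLPRegressor(hidden_layer_sizes=(8,), solver="lbfgs",
                                      max_iter=5000, random_state=0)),
}
for name, model in models.items():
    r2 = cross_val_score(model, X, waste, cv=5, scoring="r2")    # 5-fold cross-validation
    print(f"{name}: mean 5-fold R^2 = {r2.mean():.2f}")
```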
Baqué, Michèle; Amendt, Jens
2013-01-01
Developmental data of juvenile blow flies (Diptera: Calliphoridae) are typically used to calculate the age of immature stages found on or around a corpse and thus to estimate a minimum post-mortem interval (PMI(min)). However, many of those data sets do not take into account that immature blow flies grow in a non-linear fashion. Linear models do not supply a sufficient reliability on age estimates and may even lead to an erroneous determination of the PMI(min). According to the Daubert standard and the need for improvements in forensic science, new statistical tools like smoothing methods and mixed models allow the modelling of non-linear relationships and expand the field of statistical analyses. The present study introduces the background and application of these statistical techniques by analysing a model which describes the development of the forensically important blow fly Calliphora vicina at different temperatures. The comparison of three statistical methods (linear regression, generalised additive modelling and generalised additive mixed modelling) clearly demonstrates that only the latter provided regression parameters that reflect the data adequately. We focus explicitly on both the exploration of the data, to assure their quality and to show the importance of checking it carefully prior to conducting the statistical tests, and the validation of the resulting models. Hence, we present a common method for evaluating and testing forensic entomological data sets by using for the first time generalised additive mixed models.
Control Variate Selection for Multiresponse Simulation.
1987-05-01
Citations in this record include: Neter, J., W. Wasserman, and M. H. Kutner, Applied Linear Regression Models, Richard D. Irwin, Inc., Homewood, Illinois, 1983; Neuts, Marcel F., Probability, Allyn and Bacon, 1982; and Aspects of Multivariate Statistical Theory, John Wiley and Sons, New York, New York, 1982.
ERIC Educational Resources Information Center
Kobrin, Jennifer L.; Sinharay, Sandip; Haberman, Shelby J.; Chajewski, Michael
2011-01-01
This study examined the adequacy of a multiple linear regression model for predicting first-year college grade point average (FYGPA) using SAT[R] scores and high school grade point average (HSGPA). A variety of techniques, both graphical and statistical, were used to examine if it is possible to improve on the linear regression model. The results…
High correlations between MRI brain volume measurements based on NeuroQuant® and FreeSurfer.
Ross, David E; Ochs, Alfred L; Tate, David F; Tokac, Umit; Seabaugh, John; Abildskov, Tracy J; Bigler, Erin D
2018-05-30
NeuroQuant ® (NQ) and FreeSurfer (FS) are commonly used computer-automated programs for measuring MRI brain volume. Previously they were reported to have high intermethod reliabilities but often large intermethod effect size differences. We hypothesized that linear transformations could be used to reduce the large effect sizes. This study was an extension of our previously reported study. We performed NQ and FS brain volume measurements on 60 subjects (including normal controls, patients with traumatic brain injury, and patients with Alzheimer's disease). We used two statistical approaches in parallel to develop methods for transforming FS volumes into NQ volumes: traditional linear regression, and Bayesian linear regression. For both methods, we used regression analyses to develop linear transformations of the FS volumes to make them more similar to the NQ volumes. The FS-to-NQ transformations based on traditional linear regression resulted in effect sizes which were small to moderate. The transformations based on Bayesian linear regression resulted in all effect sizes being trivially small. To our knowledge, this is the first report describing a method for transforming FS to NQ data so as to achieve high reliability and low effect size differences. Machine learning methods like Bayesian regression may be more useful than traditional methods. Copyright © 2018 Elsevier B.V. All rights reserved.
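A minimal sketch of the idea of transforming FS volumes onto the NQ scale with a traditional and a Bayesian linear regression; the paired volumes are simulated, and sklearn's BayesianRidge stands in for the Bayesian regression actually used, which is an assumption of this example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, BayesianRidge

# Hypothetical paired hippocampal volumes (cm^3) from FreeSurfer (FS) and NeuroQuant (NQ)
rng = np.random.default_rng(8)
fs = rng.normal(3.5, 0.5, 60)
nq = 0.88 * fs + 0.55 + rng.normal(0, 0.08, 60)     # highly reliable but offset/scaled

X = fs.reshape(-1, 1)
for name, model in [("traditional", LinearRegression()), ("Bayesian", BayesianRidge())]:
    model.fit(X, nq)
    fs_transformed = model.predict(X)               # FS volumes mapped onto the NQ scale
    d = fs_transformed - nq
    cohens_d = d.mean() / d.std(ddof=1)             # effect size of the remaining difference
    print(f"{name} regression: residual effect size d = {cohens_d:.3f}")
```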
Quantile Regression in the Study of Developmental Sciences
Petscher, Yaacov; Logan, Jessica A. R.
2014-01-01
Linear regression analysis is one of the most common techniques applied in developmental research, but only allows for an estimate of the average relations between the predictor(s) and the outcome. This study describes quantile regression, which provides estimates of the relations between the predictor(s) and outcome, but across multiple points of the outcome’s distribution. Using data from the High School and Beyond and U.S. Sustained Effects Study databases, quantile regression is demonstrated and contrasted with linear regression when considering models with: (a) one continuous predictor, (b) one dichotomous predictor, (c) a continuous and a dichotomous predictor, and (d) a longitudinal application. Results from each example exhibited the differential inferences which may be drawn using linear or quantile regression. PMID:24329596
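Quantile regression alongside OLS can be run in a few lines with statsmodels; the predictor and outcome below (hours, score) are hypothetical and merely show how slopes can differ across quantiles when the spread of the outcome grows with the predictor.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: hours of reading practice versus achievement score
rng = np.random.default_rng(9)
hours = rng.uniform(0, 10, 500)
score = 20 + 3 * hours + rng.normal(0, 2 + hours, 500)   # spread grows with the predictor
df = pd.DataFrame({"hours": hours, "score": score})

ols_fit = smf.ols("score ~ hours", df).fit()             # average relation only
for q in (0.1, 0.5, 0.9):
    qfit = smf.quantreg("score ~ hours", df).fit(q=q)    # relation at the q-th quantile
    print(f"quantile {q}: slope = {qfit.params['hours']:.2f}")
print(f"OLS (mean):   slope = {ols_fit.params['hours']:.2f}")
```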
Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L
2018-01-01
Aims: A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R2), using R2 as the primary metric of assay agreement. However, the use of R2 alone does not adequately quantify constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. Methods: We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing assays (NGS). NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Results: Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. The Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. Conclusions: The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. PMID:28747393
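The two statistical approaches discussed, Bland-Altman agreement and a regression that separates constant from proportional error, can be sketched as follows; the paired assay values are simulated, and the Deming slope formula assumes a user-supplied error-variance ratio.

```python
import numpy as np

def bland_altman(a, b):
    """Mean bias and 95% limits of agreement between two quantitative assays."""
    diff = a - b
    bias, sd = diff.mean(), diff.std(ddof=1)
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)

def deming_slope_intercept(x, y, delta=1.0):
    """Deming regression (errors in both assays); delta = ratio of y- to x-error variance."""
    sxx = np.var(x, ddof=1)
    syy = np.var(y, ddof=1)
    sxy = np.cov(x, y, ddof=1)[0, 1]
    slope = (syy - delta * sxx + np.sqrt((syy - delta * sxx) ** 2 + 4 * delta * sxy ** 2)) / (2 * sxy)
    return slope, y.mean() - slope * x.mean()

# Hypothetical variant allele fractions measured by a reference assay and an NGS assay
rng = np.random.default_rng(10)
ref = rng.uniform(0.05, 0.6, 40)
ngs = 1.05 * ref + 0.02 + rng.normal(0, 0.02, 40)     # proportional + constant error built in

bias, loa = bland_altman(ngs, ref)
slope, intercept = deming_slope_intercept(ref, ngs)
print(f"Bland-Altman bias = {bias:.3f}, limits of agreement = ({loa[0]:.3f}, {loa[1]:.3f})")
print(f"Deming regression: slope = {slope:.3f} (proportional error), "
      f"intercept = {intercept:.3f} (constant error)")
```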
A SEMIPARAMETRIC BAYESIAN MODEL FOR CIRCULAR-LINEAR REGRESSION
We present a Bayesian approach to regress a circular variable on a linear predictor. The regression coefficients are assumed to have a nonparametric distribution with a Dirichlet process prior. The semiparametric Bayesian approach gives added flexibility to the model and is usefu...
Model-based Bayesian inference for ROC data analysis
NASA Astrophysics Data System (ADS)
Lei, Tianhu; Bae, K. Ty
2013-03-01
This paper presents a study of model-based Bayesian inference applied to Receiver Operating Characteristic (ROC) data. The model is a simple version of a general non-linear regression model. Different from the Dorfman model, it uses a probit link function with a covariate variable taking the two values zero and one to express binormal distributions in a single formula. The model also includes a scale parameter. Bayesian inference is implemented by the Markov chain Monte Carlo (MCMC) method carried out with Bayesian inference Using Gibbs Sampling (BUGS). In contrast to classical statistical theory, the Bayesian approach considers model parameters as random variables characterized by prior distributions. With a substantial amount of simulated samples generated by the sampling algorithm, posterior distributions of parameters as well as the parameters themselves can be accurately estimated. MCMC-based BUGS adopts the Adaptive Rejection Sampling (ARS) protocol, which requires that the probability density function (pdf) from which samples are drawn be log concave with respect to the targeted parameters. Our study corrects a common misconception and proves that the pdf of this regression model is log concave with respect to its scale parameter. Therefore, ARS's requirement is satisfied and a Gaussian prior, which is conjugate and possesses many analytic and computational advantages, is assigned to the scale parameter. A cohort of 20 simulated data sets and 20 simulations from each data set are used in our study. Output analysis and convergence diagnostics for the MCMC method are assessed by the CODA package. Models and methods using a continuous Gaussian prior and a discrete categorical prior are compared. Intensive simulations and performance measures are given to illustrate our practice in the framework of model-based Bayesian inference using the MCMC method.
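The binormal ROC model expressed through a probit link can be written compactly; the sketch below evaluates the ROC curve and its closed-form AUC for assumed binormal parameters a and b, without the MCMC machinery described in the abstract.

```python
import numpy as np
from scipy.stats import norm

def binormal_roc(a, b, fpf):
    """Binormal ROC through a probit link: TPF = Phi(a + b * Phi^{-1}(FPF))."""
    return norm.cdf(a + b * norm.ppf(fpf))

def binormal_auc(a, b):
    """Closed-form AUC of the binormal model."""
    return norm.cdf(a / np.sqrt(1.0 + b ** 2))

# Hypothetical binormal parameters: a = separation, b = ratio of the two SDs (scale parameter)
a, b = 1.5, 0.8
fpf = np.linspace(1e-4, 1 - 1e-4, 200)
tpf = binormal_roc(a, b, fpf)
print("AUC =", round(binormal_auc(a, b), 3))
print("TPF at FPF=0.10:", round(binormal_roc(a, b, 0.10), 3))
```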
Krishan, Kewal; Kanchan, Tanuj; Sharma, Abhilasha
2012-05-01
Estimation of stature is an important parameter in identification of human remains in forensic examinations. The present study aims to compare the reliability and accuracy of stature estimation and to demonstrate the variability in estimated stature and actual stature using multiplication factor and regression analysis methods. The study is based on a sample of 246 subjects (123 males and 123 females) from North India aged between 17 and 20 years. Four anthropometric measurements (hand length, hand breadth, foot length and foot breadth), taken on the left side in each subject, were included in the study. Stature was measured using standard anthropometric techniques. Multiplication factors were calculated and linear regression models were derived for estimation of stature from hand and foot dimensions. Derived multiplication factors and regression formulae were applied to the hand and foot measurements in the study sample. The estimated stature from the multiplication factors and regression analysis was compared with the actual stature to find the error in estimated stature. The results indicate that the range of error in estimation of stature from the regression analysis method is less than that of the multiplication factor method, thus confirming that regression analysis is better than multiplication factor analysis for stature estimation. Copyright © 2012 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
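The two estimation strategies being compared can be mimicked on simulated data: a multiplication factor (the mean stature-to-hand-length ratio) versus a fitted regression line; the anthropometric values below are hypothetical and the error comparison is illustrative only.

```python
import numpy as np

# Hypothetical sample: hand length (cm) and stature (cm) for 123 subjects
rng = np.random.default_rng(11)
hand = rng.normal(18.5, 1.0, 123)
stature = 4.2 * hand + 90 + rng.normal(0, 4, 123)

# Multiplication factor: stature estimated as mean(stature/hand) * hand
mf = np.mean(stature / hand)
est_mf = mf * hand

# Linear regression: stature = b0 + b1 * hand
b1, b0 = np.polyfit(hand, stature, 1)
est_reg = b0 + b1 * hand

for name, est in [("multiplication factor", est_mf), ("regression", est_reg)]:
    err = est - stature
    print(f"{name}: mean abs error = {np.abs(err).mean():.2f} cm, "
          f"error range = ({err.min():.1f}, {err.max():.1f}) cm")
```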
Hajibandeh, Shahab; Hajibandeh, Shahin; Antoniou, George A; Green, Patrick A; Maden, Michelle; Torella, Francesco
2017-04-01
Purpose: We aimed to investigate the association between bibliometric parameters and the reporting and methodological quality of vascular and endovascular surgery randomised controlled trials. Methods: The most recent 75 and oldest 75 randomised controlled trials published in leading journals over a 10-year period were identified. The reporting quality was analysed using the CONSORT statement, and methodological quality with the Intercollegiate Guidelines Network checklist. We used exploratory univariate and multivariable linear regression analysis to investigate associations. Findings: Bibliometric parameters such as type of journal, study design reported in title, number of pages, external funding, industry sponsoring and number of citations are associated with reporting quality. Moreover, parameters such as type of journal, subject area and study design reported in title are associated with methodological quality. Conclusions: The bibliometric parameters of randomised controlled trials may be independent predictors of their reporting and methodological quality. Moreover, the reporting quality of randomised controlled trials is associated with their methodological quality and vice versa.
Kim, Young-Sun; Lee, Jeong-Won; Choi, Chel Hun; Kim, Byoung-Gie; Bae, Duk-Soo; Rhim, Hyunchul; Lim, Hyo Keun
2016-03-01
To evaluate the relationships between T2 signal intensity and semiquantitative perfusion magnetic resonance (MR) parameters of uterine fibroids in patients who were screened for MR-guided high-intensity focused ultrasound (HIFU) ablation. Institutional review board approval was granted, and informed consents were waived. One hundred seventy most symptom-relevant, nondegenerated uterine fibroids (mean diameter, 7.3 cm; range, 3.0-17.2 cm) in 170 women (mean age, 43.5 years; range, 24-56 years) undergoing screening MR examinations for MR-guided HIFU ablation from October 2009 to April 2014 were retrospectively analyzed. Fibroid signal intensity was assessed as the ratio of the fibroid T2 signal intensity to that of skeletal muscle. Parameters of semiquantitative perfusion MR imaging obtained during screening MR examination (peak enhancement, percentage of relative peak enhancement, time to peak [in seconds], wash-in rate [per seconds], and washout rate [per seconds]) were investigated to assess their relationships with T2 signal ratio by using multiple linear regression analysis. Correlations between T2 signal intensity and independently significant perfusion parameters were then evaluated according to fibroid type by using Spearman correlation test. Multiple linear regression analysis revealed that relative peak enhancement showed an independently significant correlation with T2 signal ratio (Β = 0.004, P < .001). Submucosal intracavitary (n = 20, ρ = 0.275, P = .240) and type III (n = 18, ρ = 0.082, P = .748) fibroids failed to show significant correlations between perfusion and T2 signal intensity, while significant correlations were found for all other fibroid types (ρ = 0.411-0.629, P < .05). In possible candidates for MR-guided HIFU ablation, the T2 signal intensity of nondegenerated uterine fibroids showed an independently significant positive correlation with relative peak enhancement in most cases, except those of submucosal intracavitary or type III fibroids.
García-Hermoso, Antonio; Carrillo, Hugo Alejandro; González-Ruíz, Katherine; Vivas, Andrés; Triana-Reina, Héctor Reynaldo; Martínez-Torres, Javier; Prieto-Benavidez, Daniel Humberto; Correa-Bautista, Jorge Enrique; Ramos-Sepúlveda, Jeison Alexander; Villa-González, Emilio; Peterson, Mark D; Ramírez-Vélez, Robinson
2017-01-01
The purpose of this study was two-fold: to analyze the association between muscular fitness (MF) and clustering of metabolic syndrome (MetS) components, and to determine if fatness parameters mediate the association between MF and MetS clustering in Colombian collegiate students. This cross-sectional study included a total of 886 (51.9% women) healthy collegiate students (21.4 ± 3.3 years old). Standing broad jump and isometric handgrip dynamometry were used as indicators of lower and upper body MF, respectively. Also, a MF score was computed by summing the standardized values of both tests, and used to classify adults as fit or unfit. We also assessed fat mass, body mass index, waist-to-height ratio, and abdominal visceral fat, and categorized individuals as low and high fat using international cut-offs. A MetS cluster score was derived by calculating the sum of the sample-specific z-scores from the triglycerides, HDL cholesterol, fasting glucose, waist circumference, and arterial blood pressure. Linear regression models were used to examine whether the association between MF and MetS cluster was mediated by the fatness parameters. Data were collected from 2013 to 2016 and the analysis was done in 2016. Findings revealed that the best profiles (fit + low fat) were associated with lower levels of the MetS clustering (p <0.001 in the four fatness parameters), compared with unfit and fat (unfit + high fat) counterparts. Linear regression models indicated a partial mediating effect for fatness parameters in the association of MF with MetS clustering. Our findings indicate that efforts to improve MF in young adults may decrease MetS risk partially through an indirect effect on improvements to adiposity levels. Thus, weight reduction should be taken into account as a complementary goal to improvements in MF within exercise programs.
NASA Astrophysics Data System (ADS)
Park, Kyungjeen
This study aims to develop an objective hurricane initialization scheme which incorporates not only forecast model constraints but also observed features such as the initial intensity and size. It is based on the four-dimensional variational (4D-Var) bogus data assimilation (BDA) scheme originally proposed by Zou and Xiao (1999). The 4D-Var BDA consists of two steps: (i) specifying a bogus sea level pressure (SLP) field based on parameters observed by the Tropical Prediction Center (TPC) and (ii) assimilating the bogus SLP field under a forecast model constraint to adjust all model variables. This research focuses on improving the specification of the bogus SLP indicated in the first step. Numerical experiments are carried out for Hurricane Bonnie (1998) and Hurricane Gordon (2000) to test the sensitivity of hurricane track and intensity forecasts to specification of initial vortex. Major results are listed below: (1) A linear regression model is developed for determining the size of initial vortex based on the TPC observed radius of 34kt. (2) A method is proposed to derive a radial profile of SLP from QuikSCAT surface winds. This profile is shown to be more realistic than ideal profiles derived from Fujita's and Holland's formulae. (3) It is found that it takes about 1 h for hurricane prediction model to develop a conceptually correct hurricane structure, featuring a dominant role of hydrostatic balance at the initial time and a dynamic adjustment in less than 30 minutes. (4) Numerical experiments suggest that track prediction is less sensitive to the specification of initial vortex structure than intensity forecast. (5) Hurricane initialization using QuikSCAT-derived initial vortex produced a reasonably good forecast for hurricane landfall, with a position error of 25 km and a 4-h delay at landfalling. (6) Numerical experiments using the linear regression model for the size specification considerably outperforms all the other formulations tested in terms of the intensity prediction for both Hurricanes. For examples, the maximum track error is less than 110 km during the entire three-day forecasts for both hurricanes. The simulated Hurricane Gordon using the linear regression model made a nearly perfect landfall, with no position error and only 1-h error in landfalling time. (7) Diagnosis of model output indicates that the initial vortex specified by the linear regression model produces larger surface fluxes of sensible heat, latent heat and moisture, as well as stronger downward angular momentum transport than all the other schemes do. These enhanced energy supplies offset the energy lost caused by friction and gravity wave propagation, allowing for the model to maintain a strong and realistic hurricane during the entire forward model integration.
Ronot, Maxime; Lambert, Simon A.; Wagner, Mathilde; Garteiser, Philippe; Doblas, Sabrina; Albuquerque, Miguel; Paradis, Valérie; Vilgrain, Valérie; Sinkus, Ralph; Van Beers, Bernard E.
2014-01-01
Objective To assess in a high-resolution model of thin liver rat slices which viscoelastic parameter at three-dimensional multifrequency MR elastography has the best diagnostic performance for quantifying liver fibrosis. Materials and Methods The study was approved by the ethics committee for animal care of our institution. Eight normal rats and 42 rats with carbon tetrachloride induced liver fibrosis were used in the study. The rats were sacrificed, their livers were resected and three-dimensional MR elastography of 5±2 mm liver slices was performed at 7T with mechanical frequencies of 500, 600 and 700 Hz. The complex shear, storage and loss moduli, and the coefficient of the frequency power law were calculated. At histopathology, fibrosis and inflammation were assessed with METAVIR score, fibrosis was further quantified with morphometry. The diagnostic value of the viscoelastic parameters for assessing fibrosis severity was evaluated with simple and multiple linear regressions, receiver operating characteristic analysis and Obuchowski measures. Results At simple regression, the shear, storage and loss moduli were associated with the severity of fibrosis. At multiple regression, the storage modulus at 600 Hz was the only parameter associated with fibrosis severity (r = 0.86, p<0.0001). This parameter had an Obuchowski measure of 0.89+/−0.03. This measure was significantly larger than that of the loss modulus (0.78+/−0.04, p = 0.028), but not than that of the complex shear modulus (0.88+/−0.03, p = 0.84). Conclusion Our high resolution, three-dimensional multifrequency MR elastography study of thin liver slices shows that the storage modulus is the viscoelastic parameter that has the best association with the severity of liver fibrosis. However, its diagnostic performance does not differ significantly from that of the complex shear modulus. PMID:24722733
2015-07-15
Comparison of Neural Network and Linear Regression Models in Statistically Predicting Mental and Physical Health Status of Breast Cancer Survivors; the record also references long-term effects on cancer survivors' quality of life of physical training versus physical training combined with cognitive-behavioral therapy.
Prediction of the Main Engine Power of a New Container Ship at the Preliminary Design Stage
NASA Astrophysics Data System (ADS)
Cepowski, Tomasz
2017-06-01
The paper presents mathematical relationships that allow us to forecast the estimated main engine power of new container ships, based on data concerning vessels built in 2005-2015. The presented approximations allow us to estimate the engine power based on the length between perpendiculars and the number of containers the ship will carry. The approximations were developed using simple linear regression and multivariate linear regression analysis. The presented relations have practical application for estimation of container ship engine power needed in preliminary parametric design of the ship. It follows from the above that the use of multiple linear regression to predict the main engine power of a container ship brings more accurate solutions than simple linear regression.
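A sketch of the simple-versus-multiple regression comparison on invented container-ship data (length between perpendiculars and TEU capacity); the coefficients are not the paper's approximations, only an illustration of why the second predictor improves the fit.

```python
import numpy as np

# Hypothetical container-ship data: length between perpendiculars Lpp (m),
# container capacity (TEU), and main engine power P (kW)
rng = np.random.default_rng(12)
lpp = rng.uniform(150, 400, 80)
teu = (lpp / 14) ** 3 * rng.uniform(0.8, 1.2, 80)
power = 40 * lpp + 2.2 * teu + rng.normal(0, 2000, 80)

def r2(y, yhat):
    return 1 - np.sum((y - yhat) ** 2) / np.sum((y - y.mean()) ** 2)

# Simple linear regression on Lpp only
b1, b0 = np.polyfit(lpp, power, 1)
print("simple regression R^2:  ", round(r2(power, b0 + b1 * lpp), 3))

# Multiple linear regression on Lpp and TEU
X = np.column_stack([np.ones_like(lpp), lpp, teu])
beta, *_ = np.linalg.lstsq(X, power, rcond=None)
print("multiple regression R^2:", round(r2(power, X @ beta), 3))
```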
NASA Astrophysics Data System (ADS)
Çadırlı, Emin
2013-05-01
Al(100-x)-Cux alloys (x=3 wt%, 6 wt%, 15 wt%, 24 wt% and 33 wt%) were prepared using metals of 99.99% purity in a vacuum atmosphere. These alloys were directionally solidified under steady-state conditions by using a Bridgman-type directional solidification furnace. Solidification parameters (G, V and ), microstructure parameters (λ1, λ2 and λE) and mechanical properties (HV, σ) of the Al-Cu alloys were measured. Microstructure parameters were expressed as functions of solidification parameters by using a linear regression analysis. The dependence of HV and σ on the cooling rate, microstructure parameters and composition was determined. According to the experimental results, the microhardness and ultimate tensile strength of the solidified samples increased with increasing cooling rate and Cu content, but decreased with increasing microstructure parameters. The microscopic fracture surfaces of the different samples were observed using scanning electron microscopy. Fractographic analysis of the tensile fracture surfaces showed that the type of fracture changed significantly from ductile to brittle depending on the composition.
ERIC Educational Resources Information Center
Li, Deping; Oranje, Andreas
2007-01-01
Two versions of a general method for approximating standard error of regression effect estimates within an IRT-based latent regression model are compared. The general method is based on Binder's (1983) approach, accounting for complex samples and finite populations by Taylor series linearization. In contrast, the current National Assessment of…
Bataille, Stanislas; Pelletier, Marion; Sallée, Marion; Berland, Yvon; McKay, Nathalie; Duval, Ariane; Gentile, Stéphanie; Mouelhi, Yosra; Brunet, Philippe; Burtey, Stéphane
2017-07-26
The main reason for anemia in renal failure patients is the insufficient erythropoietin production by the kidneys. Beside erythropoietin deficiency, in vitro studies have incriminated uremic toxins in the pathophysiology of anemia but clinical data are sparse. In order to assess if indole 3-acetic acid (IAA), indoxyl sulfate (IS), and paracresyl sulfate (PCS) -three protein bound uremic toxins- are clinically implicated in end-stage renal disease anemia we studied the correlation between IAA, IS and PCS plasmatic concentrations with hemoglobin and Erythropoietin Stimulating Agents (ESA) use in hemodialysis patients. Between June and July 2014, we conducted an observational cross sectional study in two hemodialysis center. Three statistical approaches were conducted. First, we compared patients treated with ESA and those not treated. Second, we performed linear regression models between IAA, IS, and PCS plasma concentrations and hemoglobin, the ESA dose over hemoglobin ratio (ESA/Hemoglobin) or the ESA resistance index (ERI). Third, we used a polytomous logistic regression model to compare groups of patients with no/low/high ESA dose and low/high hemoglobin statuses. Overall, 240 patients were included in the study. Mean age ± SD was 67.6 ± 16.0 years, 55.4% were men and 42.5% had diabetes mellitus. When compared with ESA treated patients, patients with no ESA had higher hemoglobin (mean 11.4 ± 1.1 versus 10.6 ± 1.2 g/dL; p <0.001), higher transferrin saturation (TSAT, 31.1 ± 16.3% versus 23.1 ± 11.5%; p < 0.001), less frequently an IV iron prescription (52.1 versus 65.7%, p = 0.04) and were more frequently treated with hemodiafiltration (53.5 versus 36.7%). In univariate analysis, IAA, IS or PCS plasma concentrations did not differ between the two groups. In the linear model, IAA plasma concentration was not associated with hemoglobin, but was negatively associated with ESA/Hb (p = 0.02; R = 0.18) and with the ERI (p = 0.03; R = 0.17). IS was associated with none of the three anemia parameters. PCS was positively associated with hemoglobin (p = 0.03; R = 0.14), but negatively with ESA/Hb (p = 0.03; R = 0.17) and the ERI (p = 0.02; R = 0.19). In multivariate analysis, the association of IAA concentration with ESA/Hb or ERI was not statistically significant, neither was the association of PCS with ESA/Hb or ERI. Identically, in the subgroup of 76 patients with no inflammation (CRP <5 mg/L) and no iron deficiency (TSAT >20%) linear regression between IAA, IS or PCS and any anemia parameter did not reach significance. In the third model, univariate analysis showed no intergroup significant differences for IAA and IS. Regarding PCS, the Low Hb/High ESA group had lower concentrations. However, when we compared PCS with the other significant characteristics of the five groups to the Low Hb/high ESA (our reference group), the polytomous logistic regression model didn't show any significant difference for PCS. In our study, using three different statistical models, we were unable to show any correlation between IAA, IS and PCS plasmatic concentrations and any anemia parameter in hemodialysis patients. Indolic uremic toxins and PCS have no or a very low effect on anemia parameters.
Mumtaz, Ubaidullah; Ali, Yousaf; Petrillo, Antonella
2018-05-15
The increase in environmental pollution is one of the most important topics in today's world. In this context, industrial activities can pose a significant threat to the environment. To manage problems associated with industrial activities, several methods, techniques and approaches have been developed. Green supply chain management (GSCM) is considered one of the most important environmental management approaches. In developing countries such as Pakistan, the implementation of GSCM practices is still in its initial stages. Lack of knowledge about its effects on economic performance is the reason why industries fear to implement these practices. The aim of this research is to assess the effects of GSCM practices on organizational performance in Pakistan. In this research the GSCM practices considered are: internal practices, external practices, investment recovery and eco-design. The performance parameters considered are: environmental pollution, operational cost and organizational flexibility. A set of hypotheses proposes the effect of each GSCM practice on the performance parameters. Factor analysis and linear regression are used to analyze the survey data of Pakistani industries in order to test these hypotheses. The findings of this research indicate a decrease in environmental pollution and operational cost with the implementation of GSCM practices, whereas organizational flexibility has not improved for Pakistani industries. These results aim to help managers decide on implementing GSCM practices in the industrial sector of Pakistan. Copyright © 2017 Elsevier B.V. All rights reserved.
Islam, Md Rafiqul; Arslan, Iqbal; Attia, John; McEvoy, Mark; McElduff, Patrick; Basher, Ariful; Rahman, Waliur; Peel, Roseanne; Akhter, Ayesha; Akter, Shahnaz; Vashum, Khanrin P; Milton, Abul Hasnat
2013-01-01
To determine serum zinc level and other relevant biological markers in normal, prediabetic and diabetic individuals and their association with Homeostasis Model Assessment (HOMA) parameters. This cross-sectional study was conducted between March and December 2009. Any patient aged ≥ 30 years attending the medicine outpatient department of a medical university hospital in Dhaka, Bangladesh and who had a blood glucose level ordered by a physician was eligible to participate. A total of 280 participants were analysed. On fasting blood sugar results, 51% were normal, 13% had prediabetes and 36% had diabetes. Mean serum zinc level was lowest in prediabetic compared to normal and diabetic participants (mean differences were approximately 65 ppb/L and 33 ppb/L, respectively). In multiple linear regression, serum zinc level was found to be significantly lower in prediabetes than in those with normoglycemia. Beta cell function was significantly lower in prediabetes than normal participants. Adjusted linear regression for HOMA parameters did not show a statistically significant association between serum zinc level, beta cell function (P = 0.07) and insulin resistance (P = 0.08). Low serum zinc accentuated the increase in insulin resistance seen with increasing BMI. Participants with prediabetes have lower zinc levels than controls and zinc is significantly associated with beta cell function and insulin resistance. Further longitudinal population based studies are warranted and controlled trials would be valuable for establishing whether zinc supplementation in prediabetes could be a useful strategy in preventing progression to Type 2 diabetes.
Characterizing the performance of the Conway-Maxwell Poisson generalized linear model.
Francis, Royce A; Geedipally, Srinivas Reddy; Guikema, Seth D; Dhavala, Soma Sekhar; Lord, Dominique; LaRocca, Sarah
2012-01-01
Count data are pervasive in many areas of risk analysis; deaths, adverse health outcomes, infrastructure system failures, and traffic accidents are all recorded as count events, for example. Risk analysts often wish to estimate the probability distribution for the number of discrete events as part of doing a risk assessment. Traditional count data regression models of the type often used in risk assessment for this problem suffer from limitations due to the assumed variance structure. A more flexible model based on the Conway-Maxwell Poisson (COM-Poisson) distribution was recently proposed, a model that has the potential to overcome the limitations of the traditional model. However, the statistical performance of this new model has not yet been fully characterized. This article assesses the performance of a maximum likelihood estimation method for fitting the COM-Poisson generalized linear model (GLM). The objectives of this article are to (1) characterize the parameter estimation accuracy of the MLE implementation of the COM-Poisson GLM, and (2) estimate the prediction accuracy of the COM-Poisson GLM using simulated data sets. The results of the study indicate that the COM-Poisson GLM is flexible enough to model under-, equi-, and overdispersed data sets with different sample mean values. The results also show that the COM-Poisson GLM yields accurate parameter estimates. The COM-Poisson GLM provides a promising and flexible approach for performing count data regression. © 2011 Society for Risk Analysis.
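A rough sketch of fitting a COM-Poisson GLM by maximum likelihood with a log link and a truncated normalizing constant; the simulated under-dispersed counts, the truncation length and the optimizer choice are assumptions, and production implementations use more careful numerics.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

def com_poisson_logpmf(y, lam, nu, jmax=100):
    """log pmf of the Conway-Maxwell Poisson; normalizer truncated at jmax terms."""
    j = np.arange(jmax)
    logZ = np.logaddexp.reduce(j * np.log(lam)[:, None] - nu * gammaln(j + 1), axis=1)
    return y * np.log(lam) - nu * gammaln(y + 1) - logZ

def negloglik(theta, X, y):
    beta, lognu = theta[:-1], theta[-1]
    lam = np.exp(X @ beta)                              # log link for the rate parameter
    return -np.sum(com_poisson_logpmf(y, lam, np.exp(lognu)))

# Hypothetical under-dispersed count data with one covariate
rng = np.random.default_rng(13)
x = rng.normal(0, 1, 300)
y = rng.binomial(8, 1 / (1 + np.exp(-0.5 * x)))         # bounded counts -> variance < mean
X = np.column_stack([np.ones_like(x), x])

fit = minimize(negloglik, x0=np.zeros(3), args=(X, y), method="Nelder-Mead")
b0, b1, lognu = fit.x
print("coefficients:", round(b0, 2), round(b1, 2), "  dispersion nu =", round(np.exp(lognu), 2))
```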
Yamanari, Masahiro; Nagase, Satoko; Fukuda, Shinichi; Ishii, Kotaro; Tanaka, Ryosuke; Yasui, Takeshi; Oshika, Tetsuro; Miura, Masahiro; Yasuno, Yoshiaki
2014-05-01
The relationship between scleral birefringence and biometric parameters of human eyes in vivo is investigated. Scleral birefringence near the limbus of 21 healthy human eyes was measured using polarization-sensitive optical coherence tomography. Spherical equivalent refractive error, axial eye length, and intraocular pressure (IOP) were measured in all subjects. IOP and scleral birefringence of human eyes in vivo was found to have statistically significant correlations (r = -0.63, P = 0.002). The slope of linear regression was -2.4 × 10(-2) deg/μm/mmHg. Neither spherical equivalent refractive error nor axial eye length had significant correlations with scleral birefringence. To evaluate the direct influence of IOP to scleral birefringence, scleral birefringence of 16 ex vivo porcine eyes was measured under controlled IOP of 5-60 mmHg. In these ex vivo porcine eyes, the mean linear regression slope between controlled IOP and scleral birefringence was -9.9 × 10(-4) deg/μm/mmHg. In addition, porcine scleral collagen fibers were observed with second-harmonic-generation (SHG) microscopy. SHG images of porcine sclera, measured on the external surface at the superior side to the cornea, showed highly aligned collagen fibers parallel to the limbus. In conclusion, scleral birefringence of healthy human eyes was correlated with IOP, indicating that the ultrastructure of scleral collagen was correlated with IOP. It remains to show whether scleral collagen ultrastructure of human eyes is affected by IOP as a long-term effect.
Ernst, Anja F; Albers, Casper J
2017-01-01
Misconceptions about the assumptions behind the standard linear regression model are widespread and dangerous. These lead to using linear regression when inappropriate, and to employing alternative procedures with less statistical power when unnecessary. Our systematic literature review investigated employment and reporting of assumption checks in twelve clinical psychology journals. Findings indicate that normality of the variables themselves, rather than of the errors, was wrongfully held for a necessary assumption in 4% of papers that use regression. Furthermore, 92% of all papers using linear regression were unclear about their assumption checks, violating APA-recommendations. This paper appeals for a heightened awareness for and increased transparency in the reporting of statistical assumption checking.
Ernst, Anja F.
2017-01-01
Misconceptions about the assumptions behind the standard linear regression model are widespread and dangerous. These misconceptions lead to using linear regression when it is inappropriate, and to employing alternative procedures with less statistical power when they are unnecessary. Our systematic literature review investigated the employment and reporting of assumption checks in twelve clinical psychology journals. Findings indicate that normality of the variables themselves, rather than of the errors, was wrongly held to be a necessary assumption in 4% of papers that use regression. Furthermore, 92% of all papers using linear regression were unclear about their assumption checks, violating APA recommendations. This paper appeals for heightened awareness of, and increased transparency in, the reporting of statistical assumption checking. PMID:28533971
Comparing The Effectiveness of a90/95 Calculations (Preprint)
2006-09-01
Nachtsheim, John Neter, William Li, Applied Linear Statistical Models, 5th ed., McGraw-Hill/Irwin, 2005. 5. Mood, Graybill and Boes, Introduction... curves is based on methods that are only valid for ordinary linear regression. Requirements for a valid Ordinary Least-Squares Regression Model: There... linear. For example, ... is a linear model; ... is not. 2. Uniform variance (homoscedasticity)...
Yu, Wenbao; Park, Taesung
2014-01-01
It is common to seek an optimal combination of markers for disease classification and prediction when multiple markers are available. Many approaches based on the area under the receiver operating characteristic curve (AUC) have been proposed. Existing works based on AUC in a high-dimensional context depend mainly on a non-parametric, smooth approximation of AUC, with no work using a parametric AUC-based approach for high-dimensional data. We propose an AUC-based approach using penalized regression (AucPR), which is a parametric method for obtaining a linear combination that maximizes the AUC. To obtain the AUC maximizer in a high-dimensional context, we transform a classical parametric AUC maximizer, which is used in a low-dimensional context, into a regression framework and thus apply the penalized regression approach directly. Two kinds of penalization, lasso and elastic net, are considered. The parametric approach can avoid some of the difficulties of a conventional non-parametric AUC-based approach, such as the lack of an appropriate concave objective function and the need for a prudent choice of the smoothing parameter. We apply the proposed AucPR to gene selection and classification using four real microarray data sets and synthetic data. Through numerical studies, AucPR is shown to perform better than penalized logistic regression and the non-parametric AUC-based method, in the sense of AUC and sensitivity for a given specificity, particularly when there are many correlated genes. We propose AucPR, a powerful parametric and easily implementable linear classifier, for gene selection and disease prediction with high-dimensional data. AucPR is recommended for its good prediction performance. Besides gene expression microarray data, AucPR can be applied to other types of high-dimensional omics data, such as miRNA and protein data.
Potential pitfalls when denoising resting state fMRI data using nuisance regression.
Bright, Molly G; Tench, Christopher R; Murphy, Kevin
2017-07-01
In resting state fMRI, it is necessary to remove signal variance associated with noise sources, leaving cleaned fMRI time-series that more accurately reflect the underlying intrinsic brain fluctuations of interest. This is commonly achieved through nuisance regression, in which a noise model of head motion and physiological processes is fit to the fMRI data in a General Linear Model, and the "cleaned" residuals of this fit are used in further analysis. We examine the statistical assumptions and requirements of the General Linear Model, and whether these are met during nuisance regression of resting state fMRI data. Using toy examples and real data, we show how pre-whitening, temporal filtering and temporal shifting of regressors impact model fit. Based on our own observations, existing literature, and statistical theory, we make the following recommendations when employing nuisance regression: pre-whitening should be applied to achieve valid statistical inference of the noise model fit parameters; temporal filtering should be incorporated into the noise model to best account for changes in degrees of freedom; temporal shifting of regressors, although merited, should be achieved via optimisation and validation of a single temporal shift. We encourage all readers to make simple, practical changes to their fMRI denoising pipeline, and to regularly assess the appropriateness of the noise model used. By negotiating the potential pitfalls described in this paper, and by clearly reporting the details of nuisance regression in future manuscripts, we hope that the field will achieve more accurate and precise noise models for cleaning the resting state fMRI time-series. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
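A minimal sketch of the core nuisance-regression step described above, using ordinary least squares on a single toy time series: it deliberately omits the pre-whitening, temporal filtering, and regressor shifting that the authors recommend, and the nuisance regressors and signal here are invented.

```python
import numpy as np

def nuisance_regress(fmri_ts, nuisance):
    """Project out nuisance regressors (plus an intercept) from one fMRI time series
    via ordinary least squares; return the 'cleaned' residuals."""
    X = np.column_stack([np.ones(len(fmri_ts)), nuisance])
    beta, *_ = np.linalg.lstsq(X, fmri_ts, rcond=None)
    return fmri_ts - X @ beta

# toy example: intrinsic fluctuations plus a drift and a respiratory-like oscillation
t = np.arange(200)
nuisance = np.column_stack([t / 200.0, np.sin(2 * np.pi * t / 50)])
signal = np.random.default_rng(1).normal(size=200) + 3 * nuisance[:, 0]
cleaned = nuisance_regress(signal, nuisance)
```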
Correlation and simple linear regression.
Zou, Kelly H; Tuncali, Kemal; Silverman, Stuart G
2003-06-01
In this tutorial article, the concepts of correlation and regression are reviewed and demonstrated. The authors review and compare two correlation coefficients, the Pearson correlation coefficient and the Spearman rho, for measuring linear and nonlinear relationships between two continuous variables. In the case of measuring the linear relationship between a predictor and an outcome variable, simple linear regression analysis is conducted. These statistical concepts are illustrated by using a data set from published literature to assess a computed tomography-guided interventional technique. These statistical methods are important for exploring the relationships between variables and can be applied to many radiologic studies.
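For readers who want to reproduce the basic calculations this tutorial covers, here is a minimal SciPy sketch on made-up data: Pearson and Spearman coefficients plus a simple least-squares line. The data values are assumptions chosen only for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
x = rng.uniform(0, 10, 60)                  # hypothetical predictor
y = 2.0 + 0.8 * x + rng.normal(0, 1.5, 60)  # hypothetical outcome

pearson_r, pearson_p = stats.pearsonr(x, y)       # linear association
spearman_rho, spearman_p = stats.spearmanr(x, y)  # monotonic (rank) association
slope, intercept, r, p, se = stats.linregress(x, y)
print(f"Pearson r={pearson_r:.2f}, Spearman rho={spearman_rho:.2f}")
print(f"OLS fit: y = {intercept:.2f} + {slope:.2f} x (p={p:.3g})")
```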
Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L
2018-02-01
A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R²), with R² used as the primary metric of assay agreement. However, the use of R² alone does not adequately quantify the constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing (NGS) assays. NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. The Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of the performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
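A small sketch of the Bland-Altman portion of this kind of comparison (mean bias and 95% limits of agreement) on invented paired assay values; Deming regression, which the authors use for assay comparison, is not shown here.

```python
import numpy as np

def bland_altman(a, b):
    """Bland-Altman statistics for two paired measurement sets:
    mean bias (constant error) and 95% limits of agreement."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    diff = a - b
    bias = diff.mean()
    sd = diff.std(ddof=1)
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)

# hypothetical variant allele fractions from a reference assay and a new assay
ref = np.array([0.05, 0.12, 0.25, 0.40, 0.48])
new = np.array([0.06, 0.11, 0.27, 0.43, 0.50])
bias, loa = bland_altman(ref, new)
print(f"bias={bias:.3f}, limits of agreement={loa}")
```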
Maxillary arch dimensions associated with acoustic parameters in prepubertal children.
Hamdan, Abdul-Latif; Khandakji, Mohannad; Macari, Anthony Tannous
2018-04-18
To evaluate the association between maxillary arch dimensions and fundamental frequency and formants of voice in prepubertal subjects. Thirty-five consecutive prepubertal patients seeking orthodontic treatment were recruited (mean age = 11.41 ± 1.46 years; range, 8 to 13.7 years). Participants with a history of respiratory infection, laryngeal manipulation, dysphonia, congenital facial malformations, or history of orthodontic treatment were excluded. Dental measurements included maxillary arch length, perimeter, depth, and width. Voice parameters comprising fundamental frequency (f0_sustained), Habitual pitch (f0_count), Jitter, Shimmer, and different formant frequencies (F1, F2, F3, and F4) were measured using acoustic analysis prior to initiation of any orthodontic treatment. Pearson's correlation coefficients were used to measure the strength of associations between different dental and voice parameters. Multiple linear regressions were computed for the predictions of different dental measurements. Arch width and arch depth had moderate significant negative correlations with f0 (r = -0.52; P = .001 and r = -0.39; P = .022, respectively) and with habitual frequency (r = -0.51; P = .0014 and r = -0.34; P = .04, respectively). Arch depth and arch length were significantly correlated with formant F3 and formant F4, respectively. Predictors of arch depth included frequencies of F3 vowels, with a significant regression equation (P-value < .001; R² = 0.49). Similarly, fundamental frequency f0 and frequencies of formant F3 vowels were predictors of arch width, with a significant regression equation (P-value < .001; R² = 0.37). There is a significant association between arch dimensions, particularly arch length and depth, and voice parameters. The formant most predictive of arch depth and width is the third formant, along with fundamental frequency of voice.
Climbing fibers predict movement kinematics and performance errors.
Streng, Martha L; Popa, Laurentiu S; Ebner, Timothy J
2017-09-01
Requisite for understanding cerebellar function is a complete characterization of the signals provided by complex spike (CS) discharge of Purkinje cells, the output neurons of the cerebellar cortex. Numerous studies have provided insights into CS function, with the most predominant view being that they are evoked by error events. However, several reports suggest that CSs encode other aspects of movements and do not always respond to errors or unexpected perturbations. Here, we evaluated CS firing during a pseudo-random manual tracking task in the monkey (Macaca mulatta). This task provides extensive coverage of the work space and relative independence of movement parameters, delivering a robust data set to assess the signals that activate climbing fibers. Using reverse correlation, we determined feedforward and feedback CSs firing probability maps with position, velocity, and acceleration, as well as position error, a measure of tracking performance. The direction and magnitude of the CS modulation were quantified using linear regression analysis. The major findings are that CSs significantly encode all three kinematic parameters and position error, with acceleration modulation particularly common. The modulation is not related to "events," either for position error or kinematics. Instead, CSs are spatially tuned and provide a linear representation of each parameter evaluated. The CS modulation is largely predictive. Similar analyses show that the simple spike firing is modulated by the same parameters as the CSs. Therefore, CSs carry a broader array of signals than previously described and argue for climbing fiber input having a prominent role in online motor control. NEW & NOTEWORTHY This article demonstrates that complex spike (CS) discharge of cerebellar Purkinje cells encodes multiple parameters of movement, including motor errors and kinematics. The CS firing is not driven by error or kinematic events; instead it provides a linear representation of each parameter. In contrast with the view that CSs carry feedback signals, the CSs are predominantly predictive of upcoming position errors and kinematics. Therefore, climbing fibers carry multiple and predictive signals for online motor control. Copyright © 2017 the American Physiological Society.
2017-10-01
U.S. ARMY ARMAMENT RESEARCH, DEVELOPMENT AND ENGINEERING CENTER: GRAIN EVALUATION SOFTWARE TO NUMERICALLY PREDICT LINEAR BURN REGRESSION FOR SOLID PROPELLANT GRAIN GEOMETRIES. Brian... The views are those of the author(s) and should not be construed as an official Department of the Army position, policy, or decision, unless so designated by other documentation.
Ogihara, Takeshi; Mita, Tomoya; Osonoi, Yusuke; Osonoi, Takeshi; Saito, Miyoko; Tamasawa, Atsuko; Nakayama, Shiho; Someya, Yuki; Ishida, Hidenori; Gosho, Masahiko; Kanazawa, Akio; Watada, Hirotaka
2017-01-01
While individuals tend to show accumulation of certain lifestyle patterns, the effect of such patterns in real daily life on cardio-renal-metabolic parameters remains largely unknown. This study aimed to assess clustering of lifestyle patterns and investigate the relationships between such patterns and cardio-renal-metabolic parameters. The study participants were 726 Japanese type 2 diabetes mellitus (T2DM) outpatients free of history of cardiovascular diseases. The relationship between lifestyle patterns and cardio-renal-metabolic parameters was investigated by linear and logistic regression analyses. Factor analysis identified three lifestyle patterns. Subjects characterized by evening type, poor sleep quality and depressive status (type 1 pattern) had high levels of HbA1c, alanine aminotransferase and albuminuria. Subjects characterized by high consumption of food, alcohol and cigarettes (type 2 pattern) had high levels of γ-glutamyl transpeptidase, triglycerides, HDL-cholesterol, blood pressure, and brachial-ankle pulse wave velocity. Subjects characterized by high physical activity (type 3 pattern) had low uric acid and mild elevation of alanine aminotransferase and aspartate aminotransferase. In multivariate regression analysis adjusted for age, gender and BMI, type 1 pattern was associated with higher HbA1c levels, systolic BP and brachial-ankle pulse wave velocity. Type 2 pattern was associated with higher HDL-cholesterol levels, triglycerides, aspartate aminotransferase, γ-glutamyl transpeptidase levels, and diastolic BP. The study identified three lifestyle patterns that were associated with distinct cardio-renal-metabolic parameters in T2DM patients. UMIN000010932.
Estimation of actual evapotranspiration in the Nagqu river basin of the Tibetan Plateau
NASA Astrophysics Data System (ADS)
Zou, Mijun; Zhong, Lei; Ma, Yaoming; Hu, Yuanyuan; Feng, Lu
2018-05-01
As a critical component of the energy and water cycle, terrestrial actual evapotranspiration (ET) can be influenced by many factors. This study was mainly devoted to providing accurate and continuous estimations of actual ET for the Tibetan Plateau (TP) and analyzing the effects of its impact factors. In this study, summer observational data from the Coordinated Enhanced Observing Period (CEOP) Asia-Australia Monsoon Project (CAMP) on the Tibetan Plateau (CAMP/Tibet) for 2003 to 2004 were selected to determine actual ET and investigate its relationship with energy, hydrological, and dynamical parameters. Multiple-layer air temperature, relative humidity, net radiation flux, wind speed, precipitation, and soil moisture were used to estimate actual ET. The regression model simulation results were validated with independent data retrieved using the combinatory method. The results suggested that significant correlations exist between actual ET and hydro-meteorological parameters in the surface layer of the Nagqu river basin, among which the most important factors are energy-related elements (net radiation flux and air temperature). The results also suggested that the eventual effect of precipitation and the two-layer wind speed difference on ET depends on whether their positive or negative feedback processes play the more important role. The multivariate linear regression method provided reliable estimations of actual ET; thus, 6-parameter simplified schemes and 14-parameter regular schemes were established.
Linear regression in astronomy. II
NASA Technical Reports Server (NTRS)
Feigelson, Eric D.; Babu, Gutti J.
1992-01-01
A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.
Background stratified Poisson regression analysis of cohort data.
Richardson, David B; Langholz, Bryan
2012-03-01
Background stratified Poisson regression is an approach that has been used in the analysis of data derived from a variety of epidemiologically important studies of radiation-exposed populations, including uranium miners, nuclear industry workers, and atomic bomb survivors. We describe a novel approach to fit Poisson regression models that adjust for a set of covariates through background stratification while directly estimating the radiation-disease association of primary interest. The approach makes use of an expression for the Poisson likelihood that treats the coefficients for stratum-specific indicator variables as 'nuisance' variables and avoids the need to explicitly estimate the coefficients for these stratum-specific parameters. Log-linear models, as well as other general relative rate models, are accommodated. This approach is illustrated using data from the Life Span Study of Japanese atomic bomb survivors and data from a study of underground uranium miners. The point estimate and confidence interval obtained from this 'conditional' regression approach are identical to the values obtained using unconditional Poisson regression with model terms for each background stratum. Moreover, it is shown that the proposed approach allows estimation of background stratified Poisson regression models of non-standard form, such as models that parameterize latency effects, as well as regression models in which the number of strata is large, thereby overcoming the limitations of previously available statistical software for fitting background stratified Poisson regression models.
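The abstract notes that the conditional approach reproduces the estimates from unconditional Poisson regression with explicit stratum indicator terms. The sketch below shows that unconditional formulation with statsmodels on simulated cohort data; every variable name and value is invented, and this is not the authors' conditional likelihood implementation.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# hypothetical cohort table: case counts, person-years, dose, and a background stratum
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "stratum": rng.integers(0, 10, n),
    "dose": rng.gamma(2.0, 0.5, n),
    "pyr": rng.uniform(100, 1000, n),
})
rate = 0.01 * np.exp(0.3 * df["dose"]) * (1 + 0.1 * df["stratum"])
df["cases"] = rng.poisson(rate * df["pyr"])

# unconditional fit: an indicator term for each background stratum plus the dose term
X = pd.get_dummies(df["stratum"], prefix="s", drop_first=True).astype(float)
X["dose"] = df["dose"]
X = sm.add_constant(X)
fit = sm.GLM(df["cases"], X, family=sm.families.Poisson(),
             offset=np.log(df["pyr"])).fit()
print(fit.params["dose"], fit.bse["dose"])   # radiation-disease association of interest
```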
A Constrained Linear Estimator for Multiple Regression
ERIC Educational Resources Information Center
Davis-Stober, Clintin P.; Dana, Jason; Budescu, David V.
2010-01-01
"Improper linear models" (see Dawes, Am. Psychol. 34:571-582, "1979"), such as equal weighting, have garnered interest as alternatives to standard regression models. We analyze the general circumstances under which these models perform well by recasting a class of "improper" linear models as "proper" statistical models with a single predictor. We…
Isotherm, kinetic, and thermodynamic study of ciprofloxacin sorption on sediments.
Mutavdžić Pavlović, Dragana; Ćurković, Lidija; Grčić, Ivana; Šimić, Iva; Župan, Josip
2017-04-01
In this study, equilibrium isotherms, kinetics and thermodynamics of ciprofloxacin sorption on seven sediments in a batch sorption process were examined. The effects of contact time, initial ciprofloxacin concentration, temperature and ionic strength on the sorption process were studied. The Kd parameter from the linear sorption model was determined by linear regression analysis, while the Freundlich and Dubinin-Radushkevich (D-R) sorption models were applied to describe the equilibrium isotherms by linear and nonlinear methods. The estimated Kd values varied from 171 to 37,347 mL/g. The obtained values of E (free energy estimated from the D-R isotherm model) were between 3.51 and 8.64 kJ/mol, which indicated a physical nature of ciprofloxacin sorption on the studied sediments. According to the obtained n values, a measure of sorption intensity estimated from the Freundlich isotherm model (0.69 to 1.442), ciprofloxacin sorption on sediments can be characterized as poor to moderately difficult. Kinetics data were best fitted by the pseudo-second-order model (R² > 0.999). Thermodynamic parameters including the Gibbs free energy (ΔG°), enthalpy (ΔH°) and entropy (ΔS°) were calculated to estimate the nature of ciprofloxacin sorption. Results suggested that sorption on sediments was a spontaneous exothermic process.
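To make the linear-versus-nonlinear fitting point concrete, here is a hedged sketch that fits a Freundlich isotherm both ways with SciPy. The equilibrium data are invented and the parameter values mean nothing beyond the demonstration.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import linregress

def freundlich(ce, kf, n):
    # Freundlich isotherm: qe = Kf * Ce**(1/n)
    return kf * ce ** (1.0 / n)

# hypothetical equilibrium data (Ce in mg/L, qe in mg/g)
ce = np.array([0.5, 1.0, 2.0, 4.0, 8.0, 16.0])
qe = np.array([1.8, 3.1, 5.0, 8.2, 13.5, 21.0])

# nonlinear least-squares fit
(kf_nl, n_nl), _ = curve_fit(freundlich, ce, qe, p0=[1.0, 1.0])

# linearized fit: log qe = log Kf + (1/n) log Ce
res = linregress(np.log10(ce), np.log10(qe))
kf_lin, n_lin = 10 ** res.intercept, 1.0 / res.slope
print(kf_nl, n_nl, kf_lin, n_lin)
```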
Advanced Statistical Analyses to Reduce Inconsistency of Bond Strength Data.
Minamino, T; Mine, A; Shintani, A; Higashi, M; Kawaguchi-Uemura, A; Kabetani, T; Hagino, R; Imai, D; Tajiri, Y; Matsumoto, M; Yatani, H
2017-11-01
This study was designed to clarify the interrelationship of factors that affect the value of microtensile bond strength (µTBS), focusing on nondestructive testing by which information of the specimens can be stored and quantified. µTBS test specimens were prepared from 10 noncarious human molars. Six factors of µTBS test specimens were evaluated: presence of voids at the interface, X-ray absorption coefficient of resin, X-ray absorption coefficient of dentin, length of dentin part, size of adhesion area, and individual differences of teeth. All specimens were observed nondestructively by optical coherence tomography and micro-computed tomography before µTBS testing. After µTBS testing, the effect of these factors on µTBS data was analyzed by the general linear model, linear mixed effects regression model, and nonlinear regression model with 95% confidence intervals. By the general linear model, a significant difference in individual differences of teeth was observed (P < 0.001). A significantly positive correlation was shown between µTBS and length of dentin part (P < 0.001); however, there was no significant nonlinearity (P = 0.157). Moreover, a significantly negative correlation was observed between µTBS and size of adhesion area (P = 0.001), with significant nonlinearity (P = 0.014). No correlation was observed between µTBS and X-ray absorption coefficient of resin (P = 0.147), and there was no significant nonlinearity (P = 0.089). Additionally, a significantly positive correlation was observed between µTBS and X-ray absorption coefficient of dentin (P = 0.022), with significant nonlinearity (P = 0.036). A significant difference was also observed between the presence and absence of voids by linear mixed effects regression analysis. Our results showed correlations between various parameters of tooth specimens and µTBS data. To evaluate the performance of the adhesive more precisely, the effect of tooth variability and a method to reduce variation in bond strength values should also be considered.
NASA Technical Reports Server (NTRS)
Dejesusparada, N. (Principal Investigator); Verdesio, J. J.
1981-01-01
The relationship existing between Guanabara Bay water quality ground truth parameters and LANDSAT MSS video data was investigated. The parameters considered were: chlorophyll content, water transparency using the Secchi disk, salinity, and dissolved ammonia. Data from two overflights were used, and methods of processing digital data were compared. Linear and nonlinear regression analyses were utilized, comparing original data with processed data by using the correlation coefficient and the estimation mean error. It was determined that better quality data are obtained by using radiometric correction programs with a physical basis, contrast ratio, and normalization. Incidental identifications of floating vegetation, changes in bottom depth, oil slicks, and ships at anchor were also made.
Bone mineral density and correlation factor analysis in normal Taiwanese children.
Shu, San-Ging
2007-01-01
Our aim was to establish reference data and linear regression equations for lumbar bone mineral density (BMD) in normal Taiwanese children. Several influencing factors of lumbar BMD were investigated. Two hundred fifty-seven healthy children (136 boys and 121 girls, aged 4-18 years) were recruited from schools and enrolled on a voluntary basis with written consent. Their height, weight, blood pressure, puberty stage, bone age and lumbar BMD (L2-4) by dual energy x-ray absorptiometry (DEXA) were measured. Data were analyzed using Pearson correlation and stepwise regression tests. All measurements increased with age. Prior to age 8, there was no gender difference. Parameters such as height, weight, and bone age (BA) in girls surpassed those in boys between ages 8-13, without statistical significance (p ≥ 0.05). This was subsequently reversed after age 14 for height (p < 0.05). The BMD difference showed the same trend but was also not statistically significant. The influencing power of puberty stage and bone age over BMD was almost equal to or higher than that of height and weight. All the other factors correlated with BMD to varying degrees. Multiple linear regression equations for boys and girls were formulated. BMD reference data are provided and can be used to monitor childhood pathological conditions. However, the reference may need modification to ensure accuracy for children with abnormal bone age or pubertal development.
On the design of classifiers for crop inventories
NASA Technical Reports Server (NTRS)
Heydorn, R. P.; Takacs, H. C.
1986-01-01
Crop proportion estimators that use classifications of satellite data to correct, in an additive way, a given estimate acquired from ground observations are discussed. A linear version of these estimators is optimal, in terms of minimum variance, when the regression of the ground observations onto the satellite observations is linear. When this regression is not linear, but the reverse regression (satellite observations onto ground observations) is linear, the estimator is suboptimal but still has certain appealing variance properties. In this paper expressions are derived for those regressions which relate the intercepts and slopes to conditional classification probabilities. These expressions are then used to discuss the question of classifier designs that can lead to low-variance crop proportion estimates. Variance expressions for these estimates in terms of classifier omission and commission errors are also derived.
Recuerda, Maximilien; Périé, Delphine; Gilbert, Guillaume; Beaudoin, Gilles
2012-10-12
The treatment planning of spine pathologies requires information on the rigidity and permeability of the intervertebral discs (IVDs). Magnetic resonance imaging (MRI) offers great potential as a sensitive and non-invasive technique for describing the mechanical properties of IVDs. However, the literature reported small correlation coefficients between mechanical properties and MRI parameters. Our hypothesis is that the compressive modulus and the permeability of the IVD can be predicted by a linear combination of MRI parameters. Sixty IVDs were harvested from bovine tails, and randomly separated into four groups (in-situ, digested-6h, digested-18h, digested-24h). Multi-parametric MRI acquisitions were used to quantify the relaxation times T1 and T2, the magnetization transfer ratio MTR, the apparent diffusion coefficient ADC and the fractional anisotropy FA. Unconfined compression, confined compression and direct permeability measurements were performed to quantify the compressive moduli and the hydraulic permeabilities. Differences between groups were evaluated from a one way ANOVA. Multiple linear regressions were performed between dependent mechanical properties and independent MRI parameters to verify our hypothesis. A principal component analysis was used to convert the set of possibly correlated variables into a set of linearly uncorrelated variables. Agglomerative Hierarchical Clustering was performed on the 3 principal components. The multiple linear regressions showed that 45 to 80% of the Young's modulus E, the aggregate modulus in absence of deformation HA0, the radial permeability kr and the axial permeability in absence of deformation k0 can be explained by the MRI parameters within both the nucleus pulposus and the annulus fibrosus. The principal component analysis reduced our variables to two principal components with a cumulative variability of 52-65%, which increased to 70-82% when considering the third principal component. The dendrograms showed a natural division into four clusters for the nucleus pulposus and into three or four clusters for the annulus fibrosus. The compressive moduli and the permeabilities of isolated IVDs can be assessed mostly by MT and diffusion sequences. However, the relationships have to be improved with the inclusion of MRI parameters more sensitive to IVD degeneration. Before the use of this technique to quantify the mechanical properties of IVDs in vivo in patients suffering from various diseases, the relationships have to be defined for each degeneration state of the tissue that mimics the pathology. Our MRI protocol, combined with principal component analysis and agglomerative hierarchical clustering, is a promising tool to classify degenerated intervertebral discs and further find biomarkers and predictive factors of the evolution of the pathologies.
Tri-axial tactile sensing element
NASA Astrophysics Data System (ADS)
Castellanos-Ramos, Julián.; Navas-González, Rafael; Vidal-Verdú, F.
2013-05-01
A 13 x 13 square millimetre tri-axial taxel is presented that is suitable for some medical applications, for instance in assistive robotics involving contact with humans or in prosthetics. Finite Element Analysis is carried out to determine which structure is best for obtaining a uniform distribution of pressure on the sensing areas underneath the structure. This structure has been fabricated in plastic with a 3D printer, and a commercial tactile sensor has been used to implement the sensing areas. A three-axis linear motorized translation stage with a tri-axial precision force sensor is used to find the parameters of the linear regression model and characterize the proposed taxel. The results are analysed to see to what extent the goal has been reached in this specific implementation.
Zhang, Fang; Wagner, Anita K; Soumerai, Stephen B; Ross-Degnan, Dennis
2009-02-01
Interrupted time series (ITS) is a strong quasi-experimental research design, which is increasingly applied to estimate the effects of health services and policy interventions. We describe and illustrate two methods for estimating confidence intervals (CIs) around absolute and relative changes in outcomes calculated from segmented regression parameter estimates. We used multivariate delta and bootstrapping methods (BMs) to construct CIs around relative changes in level and trend, and around absolute changes in outcome based on segmented linear regression analyses of time series data corrected for autocorrelated errors. Using previously published time series data, we estimated CIs around the effect of prescription alerts for interacting medications with warfarin on the rate of prescriptions per 10,000 warfarin users per month. Both the multivariate delta method (MDM) and the BM produced similar results. BM is preferred for calculating CIs of relative changes in outcomes of time series studies, because it does not require large sample sizes when parameter estimates are obtained correctly from the model. Caution is needed when sample size is small.
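A simplified sketch of the bootstrap idea for a segmented (interrupted time series) regression: refit the model on resampled observations and take percentile CIs for the absolute and relative level change. It uses plain case resampling, which ignores the autocorrelation correction the authors apply, and the series, intervention time, and coefficients are all made up.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
t = np.arange(48)                        # months
post = (t >= 24).astype(float)           # hypothetical intervention at month 24
y = 50 + 0.2 * t - 8 * post - 0.3 * post * (t - 24) + rng.normal(0, 2, t.size)

def level_change(y_b, t_b, post_b):
    # segmented regression: intercept, baseline trend, level change, trend change
    X = sm.add_constant(np.column_stack([t_b, post_b, post_b * (t_b - 24)]))
    b = sm.OLS(y_b, X).fit().params
    absolute = b[2]                              # change in level at the intervention
    counterfactual = b[0] + b[1] * 24            # expected level without intervention
    return absolute, 100 * absolute / counterfactual

boot = []
for _ in range(2000):
    idx = rng.integers(0, t.size, t.size)        # simple case resampling
    boot.append(level_change(y[idx], t[idx], post[idx]))
boot = np.asarray(boot)
print("absolute change CI:", np.percentile(boot[:, 0], [2.5, 97.5]))
print("relative change CI (%):", np.percentile(boot[:, 1], [2.5, 97.5]))
```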
SigrafW: An easy-to-use program for fitting enzyme kinetic data.
Leone, Francisco Assis; Baranauskas, José Augusto; Furriel, Rosa Prazeres Melo; Borin, Ivana Aparecida
2005-11-01
SigrafW is Windows-compatible software developed using the Microsoft® Visual Basic Studio program that uses the simplified Hill equation for fitting kinetic data from allosteric and Michaelian enzymes. SigrafW uses a modified Fibonacci search to calculate maximal velocity (V), the Hill coefficient (n), and the enzyme-substrate apparent dissociation constant (K). The estimation of V, K, and the sum of the squares of residuals is performed using a Wilkinson nonlinear regression at any Hill coefficient (n). In contrast to many currently available kinetic analysis programs, SigrafW shows several advantages for the determination of kinetic parameters of both hyperbolic and nonhyperbolic saturation curves. No initial estimates of the kinetic parameters are required, a measure of the goodness-of-the-fit for each calculation performed is provided, the nonlinear regression used for calculations eliminates the statistical bias inherent in linear transformations, and the software can be used for enzyme kinetic simulations either for educational or research purposes. Persons interested in receiving a free copy of the software should contact Dr. F. A. Leone. Copyright © 2005 International Union of Biochemistry and Molecular Biology, Inc.
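For comparison with SigrafW's approach, a minimal nonlinear least-squares fit of a simplified Hill equation with SciPy is sketched below. The substrate/velocity data and the exact parameterization are assumptions, and curve_fit stands in for the Fibonacci search and Wilkinson regression the software itself uses.

```python
import numpy as np
from scipy.optimize import curve_fit

def hill(s, v_max, k, n):
    # simplified Hill equation: v = V * S**n / (K + S**n); n = 1 gives Michaelian kinetics
    return v_max * s ** n / (k + s ** n)

# hypothetical substrate concentrations and initial velocities
s = np.array([0.05, 0.1, 0.2, 0.5, 1.0, 2.0, 5.0, 10.0])
v = np.array([0.9, 1.7, 3.1, 5.8, 7.6, 8.8, 9.5, 9.8])

popt, pcov = curve_fit(hill, s, v, p0=[10.0, 1.0, 1.0])
v_max, k, n_hill = popt
perr = np.sqrt(np.diag(pcov))   # standard errors as a rough goodness-of-fit cue
print(v_max, k, n_hill, perr)
```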
An empirical-statistical model for laser cladding of Ti-6Al-4V powder on Ti-6Al-4V substrate
NASA Astrophysics Data System (ADS)
Nabhani, Mohammad; Razavi, Reza Shoja; Barekat, Masoud
2018-03-01
In this article, Ti-6Al-4V powder alloy was directly deposited on a Ti-6Al-4V substrate using the laser cladding process. In this process, some key parameters such as laser power (P), laser scanning rate (V) and powder feeding rate (F) play important roles. Using linear regression analysis, this paper develops empirical-statistical relations between these key parameters and the geometrical characteristics of single clad tracks (i.e. clad height, clad width, penetration depth, wetting angle, and dilution) as a combined parameter (P^α V^β F^γ). The results indicated that the clad width depended linearly on PV^(-1/3) and that the powder feeding rate had no effect on it. The dilution was controlled by a combined parameter VF^(-1/2), and laser power was a dispensable factor. However, laser power was the dominant factor for the clad height, penetration depth, and wetting angle, so that they were proportional to PV^(-1)F^(1/4), PVF^(-1/8), and P^(3/4)V^(-1)F^(-1/4), respectively. Based on the results of the correlation coefficient (R > 0.9) and analysis of residuals, it was confirmed that these empirical-statistical relations were in good agreement with the measured values of the single clad tracks. Finally, these relations led to the design of a processing map that can predict the geometrical characteristics of the single clad tracks based on the key parameters.
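The combined-parameter form P^α V^β F^γ can be estimated by ordinary least squares after taking logarithms. The sketch below illustrates that step on invented process data and is not the authors' fitted model; units and values are placeholders.

```python
import numpy as np

# hypothetical process settings and measured clad widths
P = np.array([300, 350, 400, 450, 300, 400], dtype=float)   # laser power, W
V = np.array([4, 4, 6, 6, 8, 8], dtype=float)               # scanning rate, mm/s
F = np.array([5, 7, 5, 7, 5, 7], dtype=float)               # powder feed, g/min
W = np.array([1.9, 2.0, 1.8, 1.9, 1.4, 1.7])                # clad width, mm (made up)

# fit log W = c + alpha*log P + beta*log V + gamma*log F
X = np.column_stack([np.ones_like(P), np.log(P), np.log(V), np.log(F)])
coef, *_ = np.linalg.lstsq(X, np.log(W), rcond=None)
c, alpha, beta, gamma = coef
print(alpha, beta, gamma)   # exponents of the combined parameter P^a V^b F^g
```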
Oracle estimation of parametric models under boundary constraints.
Wong, Kin Yau; Goldberg, Yair; Fine, Jason P
2016-12-01
In many classical estimation problems, the parameter space has a boundary. In most cases, the standard asymptotic properties of the estimator do not hold when some of the underlying true parameters lie on the boundary. However, without knowledge of the true parameter values, confidence intervals constructed assuming that the parameters lie in the interior are generally over-conservative. A penalized estimation method is proposed in this article to address this issue. An adaptive lasso procedure is employed to shrink the parameters to the boundary, yielding oracle inference that adapts to whether or not the true parameters are on the boundary. When the true parameters are on the boundary, the inference is equivalent to that which would be achieved with a priori knowledge of the boundary, while if the converse is true, the inference is equivalent to that which is obtained in the interior of the parameter space. The method is demonstrated under two practical scenarios, namely the frailty survival model and linear regression with order-restricted parameters. Simulation studies and real data analyses show that the method performs well with realistic sample sizes and exhibits certain advantages over standard methods. © 2016, The International Biometric Society.
Pre-natal exposures to cocaine and alcohol and physical growth patterns to age 8 years
Lumeng, Julie C.; Cabral, Howard J.; Gannon, Katherine; Heeren, Timothy; Frank, Deborah A.
2007-01-01
Two hundred and two primarily African American/Caribbean children (classified by maternal report and infant meconium as 38 heavier, 74 lighter and 89 not cocaine-exposed) were measured repeatedly from birth to age 8 years to assess whether there is an independent effect of prenatal cocaine exposure on physical growth patterns. Children with fetal alcohol syndrome identifiable at birth were excluded. At birth, cocaine and alcohol exposures were significantly and independently associated with lower weight, length and head circumference in cross-sectional multiple regression analyses. The relationship over time of pre-natal exposures to weight, height, and head circumference was then examined by multiple linear regression using mixed linear models including covariates: child’s gestational age, gender, ethnicity, age at assessment, current caregiver, birth mother’s use of alcohol, marijuana and tobacco during the pregnancy and pre-pregnancy weight (for child’s weight) and height (for child’s height and head circumference). The cocaine effects did not persist beyond infancy in piecewise linear mixed models, but a significant and independent negative effect of pre-natal alcohol exposure persisted for weight, height, and head circumference. Catch-up growth in cocaine-exposed infants occurred primarily by 6 months of age for all growth parameters, with some small fluctuations in growth rates in the preschool age range but no detectable differences between heavier versus unexposed nor lighter versus unexposed thereafter. PMID:17412558
NASA Technical Reports Server (NTRS)
Trejo, Leonard J.; Shensa, Mark J.; Remington, Roger W. (Technical Monitor)
1998-01-01
This report describes the development and evaluation of mathematical models for predicting human performance from discrete wavelet transforms (DWT) of event-related potentials (ERP) elicited by task-relevant stimuli. The DWT was compared to principal components analysis (PCA) for representation of ERPs in linear regression and neural network models developed to predict a composite measure of human signal detection performance. Linear regression models based on coefficients of the decimated DWT predicted signal detection performance with half as many free parameters as comparable models based on PCA scores. In addition, the DWT-based models were more resistant to model degradation due to over-fitting than PCA-based models. Feed-forward neural networks were trained using the backpropagation algorithm to predict signal detection performance based on raw ERPs, PCA scores, or high-power coefficients of the DWT. Neural networks based on high-power DWT coefficients trained with fewer iterations, generalized to new data better, and were more resistant to overfitting than networks based on raw ERPs. Networks based on PCA scores did not generalize to new data as well as either the DWT network or the raw ERP network. The results show that wavelet expansions represent the ERP efficiently and extract behaviorally important features for use in linear regression or neural network models of human performance. The efficiency of the DWT is discussed in terms of its decorrelation and energy compaction properties. In addition, the DWT models provided evidence that a pattern of low-frequency activity (1 to 3.5 Hz) occurring at specific times and scalp locations is a reliable correlate of human signal detection performance.
NASA Technical Reports Server (NTRS)
Trejo, L. J.; Shensa, M. J.
1999-01-01
This report describes the development and evaluation of mathematical models for predicting human performance from discrete wavelet transforms (DWT) of event-related potentials (ERP) elicited by task-relevant stimuli. The DWT was compared to principal components analysis (PCA) for representation of ERPs in linear regression and neural network models developed to predict a composite measure of human signal detection performance. Linear regression models based on coefficients of the decimated DWT predicted signal detection performance with half as many free parameters as comparable models based on PCA scores. In addition, the DWT-based models were more resistant to model degradation due to over-fitting than PCA-based models. Feed-forward neural networks were trained using the backpropagation algorithm to predict signal detection performance based on raw ERPs, PCA scores, or high-power coefficients of the DWT. Neural networks based on high-power DWT coefficients trained with fewer iterations, generalized to new data better, and were more resistant to overfitting than networks based on raw ERPs. Networks based on PCA scores did not generalize to new data as well as either the DWT network or the raw ERP network. The results show that wavelet expansions represent the ERP efficiently and extract behaviorally important features for use in linear regression or neural network models of human performance. The efficiency of the DWT is discussed in terms of its decorrelation and energy compaction properties. In addition, the DWT models provided evidence that a pattern of low-frequency activity (1 to 3.5 Hz) occurring at specific times and scalp locations is a reliable correlate of human signal detection performance. Copyright 1999 Academic Press.
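A rough sketch of the feature-extraction comparison described above, assuming the PyWavelets and scikit-learn packages are available: decimated DWT coefficients (kept by energy) versus PCA scores, each feeding a linear regression. The ERP-like data and the performance measure are synthetic, so the resulting R² values are only illustrative, and this is not the report's exact modeling pipeline.

```python
import numpy as np
import pywt
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(7)
n_trials, n_samples = 200, 256
erps = rng.normal(size=(n_trials, n_samples))                 # stand-in for single-trial ERPs
performance = erps[:, 100:140].mean(axis=1) + rng.normal(0, 0.1, n_trials)

# DWT features: keep the highest-energy coefficients across trials
coeffs = np.array([np.concatenate(pywt.wavedec(x, "db4", level=4)) for x in erps])
energy = (coeffs ** 2).mean(axis=0)
top = np.argsort(energy)[-20:]
dwt_features = coeffs[:, top]

# PCA features with the same dimensionality, for comparison
pca_features = PCA(n_components=20).fit_transform(erps)

r2_dwt = LinearRegression().fit(dwt_features, performance).score(dwt_features, performance)
r2_pca = LinearRegression().fit(pca_features, performance).score(pca_features, performance)
print(r2_dwt, r2_pca)
```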
The H,G_1,G_2 photometric system with scarce observational data
NASA Astrophysics Data System (ADS)
Penttilä, A.; Granvik, M.; Muinonen, K.; Wilkman, O.
2014-07-01
The H,G_1,G_2 photometric system was officially adopted at the IAU General Assembly in Beijing, 2012. The system replaced the H,G system from 1985. The 'photometric system' is a parametrized model V(α; params) for the magnitude-phase relation of small Solar System bodies, and the main purpose is to predict the magnitude at backscattering, H := V(0°), i.e., the (absolute) magnitude of the object. The original H,G system was designed using the best available data in 1985, but since then new observations have been made showing certain features, especially near backscattering, to which the H,G function has trouble adjusting. The H,G_1,G_2 system was developed especially to address these issues [1]. With a sufficient number of high-accuracy observations and with a wide phase-angle coverage, the H,G_1,G_2 system performs well. However, with scarce low-accuracy data the system has trouble producing a reliable fit, as would any other three-parameter nonlinear function. Therefore, simultaneously with the H,G_1,G_2 system, a two-parameter version of the model, the H,G_{12} system, was introduced [1]. The two-parameter version ties the parameters G_1,G_2 into a single parameter G_{12} by a linear relation, and still uses the H,G_1,G_2 system in the background. This version dramatically improves the chances of obtaining a reliable phase-curve fit to scarce data. The number of observed small bodies is increasing all the time, and so is the need to produce estimates for the absolute magnitude/diameter/albedo and other size/composition related parameters. The lack of small-phase-angle observations is especially topical for near-Earth objects (NEOs). With these, even the two-parameter version faces problems. The previous procedure with the H,G system in such circumstances has been to fix the G-parameter to some constant value, thus fitting only a single-parameter function. In conclusion, there is a definitive need for a reliable procedure to produce photometric fits to very scarce and low-accuracy data. There are a few details that should be considered with the H,G_1,G_2 or H,G_{12} systems with scarce data. The first point is the distribution of errors in the fit. The original H,G system allowed linear regression in the flux space, thus making the estimation computationally easier. The same principle was repeated with the H,G_1,G_2 system. There is, however, a major hidden assumption in the transformation. With regression modeling, the residuals should be distributed symmetrically around zero. If they are normally distributed, even better. We have noticed that, at least with some NEO observations, the residuals in the flux space are far from symmetric, and seem to be much more symmetric in the magnitude space. The result is that the nonlinear fit in magnitude space is far more reliable than the linear fit in the flux space. Since computers and nonlinear regression algorithms are now efficient enough, we conclude that, in many cases with low-accuracy data, the nonlinear fit should be favored. In fact, there are statistical procedures that should be employed with the photometric fit. At the moment, the choice between the three-parameter and two-parameter versions is simply based on subjective decision-making. By checking parameter error and model comparison statistics, the choice could be made objectively. Similarly, the choice between the linear fit in flux space and the nonlinear fit in magnitude space should be based on a statistical test of unbiased residuals.
Furthermore, the so-called Box-Cox transform could be employed to find an optimal transformation somewhere between the magnitude and flux spaces. The H,G_1,G_2 system is based on cubic splines, and is therefore a bit more complicated to implement than a system with simpler basis functions. The same applies to a complete program that would automatically choose the best transformation for the data, test whether the two- or three-parameter version of the model should be fitted, and produce the fitted parameters with their error estimates. Our group has already made implementations of the H,G_1,G_2 system publicly available [2]. We plan to implement the abovementioned improvements to the system and also make these tools public.
The Impact of Uncertain Physical Parameters on HVAC Demand Response
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sun, Yannan; Elizondo, Marcelo A.; Lu, Shuai
HVAC units are currently one of the major resources providing demand response (DR) in residential buildings. Models of HVAC with DR function can improve understanding of its impact on power system operations and facilitate the deployment of DR technologies. This paper investigates the importance of various physical parameters and their distributions to the HVAC response to DR signals, which is a key step to the construction of HVAC models for a population of units with insufficient data. These parameters include the size of floors, insulation efficiency, the amount of solid mass in the house, and efficiency of the HVAC units. These parameters are usually assumed to follow Gaussian or Uniform distributions. We study the effect of uncertainty in the chosen parameter distributions on the aggregate HVAC response to DR signals, during transient phase and in steady state. We use a quasi-Monte Carlo sampling method with linear regression and Prony analysis to evaluate sensitivity of DR output to the uncertainty in the distribution parameters. The significance ranking on the uncertainty sources is given for future guidance in the modeling of HVAC demand response.
Improving RNA nearest neighbor parameters for helices by going beyond the two-state model.
Spasic, Aleksandar; Berger, Kyle D; Chen, Jonathan L; Seetin, Matthew G; Turner, Douglas H; Mathews, David H
2018-06-01
RNA folding free energy change nearest neighbor parameters are widely used to predict folding stabilities of secondary structures. They were determined by linear regression to datasets of optical melting experiments on small model systems. Traditionally, the optical melting experiments are analyzed assuming a two-state model, i.e. a structure is either complete or denatured. Experimental evidence, however, shows that structures exist in an ensemble of conformations. Partition functions calculated with existing nearest neighbor parameters predict that secondary structures can be partially denatured, which also directly conflicts with the two-state model. Here, a new approach for determining RNA nearest neighbor parameters is presented. Available optical melting data for 34 Watson-Crick helices were fit directly to a partition function model that allows an ensemble of conformations. Fitting parameters were the enthalpy and entropy changes for helix initiation, terminal AU pairs, stacks of Watson-Crick pairs and disordered internal loops. The resulting set of nearest neighbor parameters shows a 38.5% improvement in the sum of residuals in fitting the experimental melting curves compared to the current literature set.
Bomfim, Rafael Aiello; Crosato, Edgard; Mazzilli, Luiz Eugênio Nigro; Frias, Antonio Carlos
2015-01-01
This study evaluates the prevalence and risk factors of non-carious cervical lesions (NCCLs) in a Brazilian population of workers exposed and non-exposed to acid mists and chemical products. One hundred workers (46 exposed and 54 non-exposed) were evaluated in a Centro de Referência em Saúde do Trabalhador - CEREST (Worker's Health Reference Center). The workers responded to questionnaires regarding their personal information, alcohol consumption and tobacco use. A clinical examination was conducted to evaluate the presence of NCCLs, according to WHO parameters. Statistical analyses were performed by unconditional logistic regression and multiple linear regression, with a critical level of p < 0.05. NCCLs were significantly associated with age groups (18-34, 35-44, 45-68 years). The unconditional logistic regression showed that the presence of NCCLs was better explained by age group (OR = 4.04; CI 95% 1.77-9.22) and occupational exposure to acid mists and chemical products (OR = 3.84; CI 95% 1.10-13.49), whereas the multiple linear regression revealed that NCCLs were better explained by years of smoking (p = 0.01) and age group (p = 0.04). The prevalence of NCCLs in the study population was particularly high (76.84%), and the risk factors for NCCLs were age, exposure to acid mists and smoking habit. Controlling risk factors through preventive and educative measures, allied with the use of personal protective equipment to prevent occupational exposure to acid mists, may contribute to minimizing the prevalence of NCCLs.
NASA Astrophysics Data System (ADS)
Hasan, Haliza; Ahmad, Sanizah; Osman, Balkish Mohd; Sapri, Shamsiah; Othman, Nadirah
2017-08-01
In regression analysis, missing covariate data are a common problem. Many researchers use ad hoc methods to overcome this problem because of their ease of implementation. However, these methods require assumptions about the data that rarely hold in practice. Model-based methods such as Maximum Likelihood (ML) using the expectation maximization (EM) algorithm and Multiple Imputation (MI) are more promising when dealing with the difficulties caused by missing data. At the same time, inappropriate methods of missing value imputation can lead to serious bias that severely affects the parameter estimates. The main objective of this study is to provide a better understanding of missing data concepts that can assist researchers in selecting the appropriate missing data imputation methods. A simulation study was performed to assess the effects of different missing data techniques on the performance of a regression model. The covariate data were generated using an underlying multivariate normal distribution and the dependent variable was generated as a combination of explanatory variables. Missing values in a covariate were simulated using a mechanism called missing at random (MAR). Four levels of missingness (10%, 20%, 30% and 40%) were imposed. The ML and MI techniques available within SAS software were investigated. A linear regression model was fitted, and the model performance measures MSE and R-squared were obtained. Results of the analysis showed that MI is superior in handling missing data, with the highest R-squared and lowest MSE, when the percentage of missingness is less than 30%. Both methods are unable to handle levels of missingness greater than 30%.
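A minimal sketch of the simulation idea on invented data: impose missingness that depends on an observed covariate (MAR-like), impute, and compare regression coefficients with the complete-data fit. It uses scikit-learn's IterativeImputer as a single stochastic imputation rather than the full ML and MI procedures in SAS that the study evaluates.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables IterativeImputer)
from sklearn.impute import IterativeImputer
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 500
X = rng.multivariate_normal([0, 0, 0], [[1, .5, .3], [.5, 1, .4], [.3, .4, 1]], n)
y = X @ np.array([1.0, 0.5, -0.7]) + rng.normal(0, 1, n)

# impose roughly 30% missingness on one covariate, driven by another observed covariate (MAR-like)
miss = rng.uniform(size=n) < 0.2 * (1 + (X[:, 1] > 0))
X_obs = X.copy()
X_obs[miss, 0] = np.nan

X_imp = IterativeImputer(sample_posterior=True, random_state=0).fit_transform(X_obs)
beta_full = LinearRegression().fit(X, y).coef_
beta_imp = LinearRegression().fit(X_imp, y).coef_
print(beta_full, beta_imp)
```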
An Analysis of COLA (Cost of Living Adjustment) Allocation within the United States Coast Guard.
1983-09-01
books Applied Linear Regression [Ref. 39] and Statistical Methods in Research and Production [Ref. 40], or any other book on regression. In the event... Indexes, Master's Thesis, Air Force Institute of Technology, Wright-Patterson AFB, 1976. 39. Weisberg, Sanford, Applied Linear Regression, Wiley, 1980. 40
Testing hypotheses for differences between linear regression lines
Stanley J. Zarnoch
2009-01-01
Five hypotheses are identified for testing differences between simple linear regression lines. The distinctions between these hypotheses are based on a priori assumptions and illustrated with full and reduced models. The contrast approach is presented as an easy and complete method for testing for overall differences between the regressions and for making pairwise...
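One common way to test such hypotheses is with full and reduced models containing a group-by-predictor interaction. The statsmodels sketch below, on made-up data, tests coincident lines overall and equal slopes specifically; it is in the spirit of, but not identical to, the contrast approach this paper presents.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 80
df = pd.DataFrame({
    "x": rng.uniform(0, 10, n),
    "group": np.repeat(["A", "B"], n // 2),
})
df["y"] = np.where(df["group"] == "A", 1.0 + 0.5 * df["x"], 2.0 + 0.9 * df["x"])
df["y"] += rng.normal(0, 1, n)

full = smf.ols("y ~ x * C(group)", data=df).fit()   # separate intercepts and slopes
reduced = smf.ols("y ~ x", data=df).fit()           # single common line
print(full.compare_f_test(reduced))                 # overall test of coincident lines
print(full.t_test("x:C(group)[T.B] = 0"))           # test of equal slopes
```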
Graphical Description of Johnson-Neyman Outcomes for Linear and Quadratic Regression Surfaces.
ERIC Educational Resources Information Center
Schafer, William D.; Wang, Yuh-Yin
A modification of the usual graphical representation of heterogeneous regressions is described that can aid in interpreting significant regions for linear or quadratic surfaces. The standard Johnson-Neyman graph is a bivariate plot with the criterion variable on the ordinate and the predictor variable on the abscissa. Regression surfaces are drawn…
Teaching the Concept of Breakdown Point in Simple Linear Regression.
ERIC Educational Resources Information Center
Chan, Wai-Sum
2001-01-01
Most introductory textbooks on simple linear regression analysis mention the fact that extreme data points have a great influence on ordinary least-squares regression estimation; however, not many textbooks provide a rigorous mathematical explanation of this phenomenon. Suggests a way to fill this gap by teaching students the concept of breakdown…
Estimating monotonic rates from biological data using local linear regression.
Olito, Colin; White, Craig R; Marshall, Dustin J; Barneche, Diego R
2017-03-01
Accessing many fundamental questions in biology begins with empirical estimation of simple monotonic rates of underlying biological processes. Across a variety of disciplines, ranging from physiology to biogeochemistry, these rates are routinely estimated from non-linear and noisy time series data using linear regression and ad hoc manual truncation of non-linearities. Here, we introduce the R package LoLinR, a flexible toolkit to implement local linear regression techniques to objectively and reproducibly estimate monotonic biological rates from non-linear time series data, and demonstrate possible applications using metabolic rate data. LoLinR provides methods to easily and reliably estimate monotonic rates from time series data in a way that is statistically robust, facilitates reproducible research and is applicable to a wide variety of research disciplines in the biological sciences. © 2017. Published by The Company of Biologists Ltd.
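A much-simplified Python stand-in for the idea of estimating a monotonic rate by local linear regression (the LoLinR package itself is in R and uses more elaborate window-ranking criteria): fit OLS to every contiguous window above a minimum size and keep the fit with the smallest slope standard error. Everything here, including the selection criterion and the toy trace, is an assumption made for illustration.

```python
import numpy as np
from scipy import stats

def best_local_slope(t, y, min_window=10):
    """Fit OLS to every contiguous window of at least `min_window` points and
    return the fit whose slope has the smallest standard error."""
    best = None
    n = len(t)
    for i in range(n - min_window + 1):
        for j in range(i + min_window, n + 1):
            res = stats.linregress(t[i:j], y[i:j])
            if best is None or res.stderr < best[0]:
                best = (res.stderr, res.slope, (i, j))
    return best

# toy oxygen-consumption-like trace: nonlinear early, roughly linear later
t = np.linspace(0, 60, 120)
y = 100 - 0.5 * t + 5 * np.exp(-t / 5) + np.random.default_rng(2).normal(0, 0.3, t.size)
stderr, slope, window = best_local_slope(t, y)
print(slope, window)
```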
UV Spectrophotometric Determination and Validation of Hydroquinone in Liposome.
Khoshneviszadeh, Rabea; Fazly Bazzaz, Bibi Sedigheh; Housaindokht, Mohammad Reza; Ebrahim-Habibi, Azadeh; Rajabi, Omid
2015-01-01
A method has been developed and validated for the determination of hydroquinone in a liposomal formulation. The samples were dissolved in methanol and evaluated at 293 nm. The validation parameters such as linearity, accuracy, precision, specificity, limit of detection (LOD) and limit of quantitation (LOQ) were determined. The calibration curve was linear in the 1-50 µg/mL range of the hydroquinone analyte, with a regression coefficient of 0.9998. This study showed that the liposomal hydroquinone, composed of phospholipid (7.8%), cholesterol (1.5%), alpha-tocopherol (0.17%) and hydroquinone (0.5%), did not absorb at 293 nm when diluted 500 times with methanol. The concentration of hydroquinone reached 10 µg/mL after the 500-fold dilution. Furthermore, various validation parameters were tested as per the ICH Q2B guideline and found to be acceptable. The recovery percentages of liposomal hydroquinone were found to be 102 ± 0.8, 99 ± 0.2 and 98 ± 0.4 for the 80%, 100% and 120% levels, respectively. The relative standard deviation values of inter- and intra-day precision were < 2%. LOD and LOQ were 0.24 and 0.72 µg/mL, respectively.
Locally linear regression for pose-invariant face recognition.
Chai, Xiujuan; Shan, Shiguang; Chen, Xilin; Gao, Wen
2007-07-01
The variation of facial appearance due to viewpoint (pose) degrades face recognition systems considerably and is one of the bottlenecks in face recognition. One possible solution is to generate a virtual frontal view from any given nonfrontal view to obtain a virtual gallery/probe face. Following this idea, this paper proposes a simple but efficient novel locally linear regression (LLR) method, which generates the virtual frontal view from a given nonfrontal face image. We first justify the basic assumption of the paper that there exists an approximately linear mapping between a nonfrontal face image and its frontal counterpart. Then, by formulating the estimation of the linear mapping as a prediction problem, we present the regression-based solution, i.e., globally linear regression. To improve the prediction accuracy in the case of coarse alignment, LLR is further proposed. In LLR, we first perform dense sampling in the nonfrontal face image to obtain many overlapping local patches. Then, the linear regression technique is applied to each small patch to predict its virtual frontal patch. Through the combination of all these patches, the virtual frontal view is generated. Experimental results on the CMU PIE database show a distinct advantage of the proposed method over the eigen light-field method.
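As a hedged sketch of the patch-wise idea (learn one linear map per local patch from non-frontal to frontal appearance, then assemble the predicted patches), the Python snippet below uses randomly generated patch vectors as placeholders; the patch geometry, training pairs and plain least-squares solver are assumptions rather than the paper's exact formulation.

```python
import numpy as np

# synthetic placeholders: per-patch training pairs of non-frontal / frontal vectors
rng = np.random.default_rng(11)
n_train, patch_dim, n_patches = 200, 64, 30
nonfrontal = rng.normal(size=(n_patches, n_train, patch_dim))
true_maps = rng.normal(size=(n_patches, patch_dim, patch_dim)) / np.sqrt(patch_dim)
frontal = (np.einsum("pij,pnj->pni", true_maps, nonfrontal)
           + rng.normal(0, 0.05, (n_patches, n_train, patch_dim)))

# least-squares linear map per patch (a small ridge term is often added in practice)
learned = np.stack([np.linalg.lstsq(nonfrontal[p], frontal[p], rcond=None)[0].T
                    for p in range(n_patches)])

# predict the frontal patches of one new non-frontal face (also synthetic)
probe = rng.normal(size=(n_patches, patch_dim))
virtual_frontal_patches = np.einsum("pij,pj->pi", learned, probe)
print(virtual_frontal_patches.shape)   # (n_patches, patch_dim)
```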
New generation of hydraulic pedotransfer functions for Europe
Tóth, B; Weynants, M; Nemes, A; Makó, A; Bilas, G; Tóth, G
2015-01-01
A range of continental-scale soil datasets exists in Europe with different spatial representation and based on different principles. We developed comprehensive pedotransfer functions (PTFs) for applications principally on spatial datasets with continental coverage. The PTF development included the prediction of soil water retention at various matric potentials and prediction of parameters to characterize soil moisture retention and the hydraulic conductivity curve (MRC and HCC) of European soils. We developed PTFs with a hierarchical approach, determined by the input requirements. The PTFs were derived by using three statistical methods: (i) linear regression where there were quantitative input variables, (ii) a regression tree for qualitative, quantitative and mixed types of information and (iii) mean statistics of developer-defined soil groups (class PTF) when only qualitative input parameters were available. Data of the recently established European Hydropedological Data Inventory (EU-HYDI), which holds the most comprehensive geographical and thematic coverage of hydro-pedological data in Europe, were used to train and test the PTFs. The applied modelling techniques and the EU-HYDI allowed the development of hydraulic PTFs that are more reliable and applicable for a greater variety of input parameters than those previously available for Europe. Therefore the new set of PTFs offers tailored advanced tools for a wide range of applications in the continent. PMID:25866465
NASA Astrophysics Data System (ADS)
Juranek, L. W.; Feely, R. A.; Peterson, W. T.; Alin, S. R.; Hales, B.; Lee, K.; Sabine, C. L.; Peterson, J.
2009-12-01
We developed a multiple linear regression model to robustly determine aragonite saturation state (Ωarag) from observations of temperature and oxygen (R2 = 0.987, RMS error 0.053), using data collected in the Pacific Northwest region in late May 2007. The seasonal evolution of Ωarag near central Oregon was evaluated by applying the regression model to a monthly (winter)/bi-weekly (summer) water-column hydrographic time-series collected over the shelf and slope in 2007. The Ωarag predicted by the regression model was less than 1, the thermodynamic calcification/dissolution threshold, over shelf/slope bottom waters throughout the entire 2007 upwelling season (May-November), with the Ωarag = 1 horizon shoaling to 30 m by late summer. The persistence of water with Ωarag < 1 on the continental shelf has not been previously noted and could have notable ecological consequences for benthic and pelagic calcifying organisms such as mussels, oysters, abalone, echinoderms, and pteropods.
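A minimal sketch of the kind of multiple linear regression described above, fitting Ω_arag as a linear function of temperature and dissolved oxygen; the synthetic data and fitted coefficients are placeholders and do not reproduce the study's regression.

```python
import numpy as np

# synthetic stand-ins for temperature (deg C) and dissolved oxygen (µmol/kg);
# the study's actual regression coefficients are not reproduced here
rng = np.random.default_rng(1)
temp = rng.uniform(7, 16, 200)
oxy = rng.uniform(60, 300, 200)
omega = 0.5 + 0.08 * temp + 0.004 * oxy + rng.normal(0, 0.05, 200)

# multiple linear regression: Omega_arag ~ 1 + T + O2
X = np.column_stack([np.ones_like(temp), temp, oxy])
coef, *_ = np.linalg.lstsq(X, omega, rcond=None)
pred = X @ coef
rmse = np.sqrt(np.mean((omega - pred) ** 2))
r2 = 1 - np.sum((omega - pred) ** 2) / np.sum((omega - omega.mean()) ** 2)
print("intercept, b_T, b_O2 =", np.round(coef, 4),
      " R^2 =", round(r2, 3), " RMSE =", round(rmse, 3))
```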
Quantile Regression Models for Current Status Data
Ou, Fang-Shu; Zeng, Donglin; Cai, Jianwen
2016-01-01
Current status data arise frequently in demography, epidemiology, and econometrics where the exact failure time cannot be determined but is only known to have occurred before or after a known observation time. We propose a quantile regression model to analyze current status data, because it does not require distributional assumptions and the coefficients can be interpreted as direct regression effects on the distribution of failure time in the original time scale. Our model assumes that the conditional quantile of failure time is a linear function of covariates. We assume conditional independence between the failure time and observation time. An M-estimator is developed for parameter estimation which is computed using the concave-convex procedure and its confidence intervals are constructed using a subsampling method. Asymptotic properties for the estimator are derived and proven using modern empirical process theory. The small sample performance of the proposed method is demonstrated via simulation studies. Finally, we apply the proposed method to analyze data from the Mayo Clinic Study of Aging. PMID:27994307
NASA Astrophysics Data System (ADS)
ul-Haq, Zia; Rana, Asim Daud; Tariq, Salman; Mahmood, Khalid; Ali, Muhammad; Bashir, Iqra
2018-03-01
We have applied regression analyses to model tropospheric NO2 (tropo-NO2) as a function of anthropogenic nitrogen oxides (NOx) emissions, aerosol optical depth (AOD), and important meteorological parameters such as temperature (Temp), precipitation (Preci), relative humidity (RH), wind speed (WS), cloud fraction (CLF) and outgoing long-wave radiation (OLR) over different climatic zones and land use/land cover types in South Asia during October 2004-December 2015. Simple linear regression shows that, over South Asia, tropo-NO2 variability is significantly linked to AOD, WS, NOx, Preci and CLF. Zone-5, consisting of the tropical monsoon areas of eastern India and Myanmar, is the only study zone over which all the selected parameters influence tropo-NO2 at statistically significant levels. In stepwise multiple linear modeling, the tropo-NO2 column over the South Asian landmass is significantly predicted by the combination of RH (standardized regression coefficient β = -0.49), AOD (β = 0.42) and NOx (β = 0.25). The leading predictors of the tropo-NO2 columns over zones 1-5 are OLR, AOD, Temp, OLR, and RH, respectively. Overall, as revealed by the higher correlation coefficients (r), the multiple regressions provide reasonable models for tropo-NO2 over South Asia (r = 0.82), zone-4 (r = 0.90) and zone-5 (r = 0.93). The lowest r (0.66) is found for the hot semi-arid region in the northwestern Indus-Ganges Basin (zone-2). The highest urban-area β for AOD (0.42) is observed for the megacity Lahore, located in warm semi-arid zone-2 with large-scale crop-residue burning, indicating a strong influence of aerosols on the modeled tropo-NO2 column. A statistically significant correlation (r = 0.22, at the 0.05 level) is found between tropo-NO2 and AOD over Lahore. NOx emissions appear as the highest contributor (β = 0.59) to the modeled tropo-NO2 column over the megacity Dhaka.
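The standardized regression coefficients (β) quoted above are the OLS coefficients obtained after z-scoring the response and every predictor. A minimal sketch, assuming invented placeholder data for RH, AOD and NOx rather than the study's satellite records:

```python
import numpy as np

def standardized_betas(X, y):
    """OLS coefficients after z-scoring predictors and response,
    i.e. standardized regression coefficients (beta weights)."""
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yz = (y - y.mean()) / y.std(ddof=1)
    beta, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    return beta

# toy stand-ins for RH, AOD and NOx as predictors of a tropospheric NO2 column
rng = np.random.default_rng(2)
rh, aod, nox = rng.normal(size=(3, 300))
no2 = -0.5 * rh + 0.4 * aod + 0.25 * nox + rng.normal(0, 0.5, 300)
X = np.column_stack([rh, aod, nox])
print(dict(zip(["RH", "AOD", "NOx"], np.round(standardized_betas(X, no2), 2))))
```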
Online determination of biophysical parameters of mucous membranes of a human body
NASA Astrophysics Data System (ADS)
Lisenko, S. A.; Kugeiko, M. M.
2013-07-01
We have developed a method for online determination of biophysical parameters of mucous membranes (MMs) of a human body (transport scattering coefficient, scattering anisotropy factor, haemoglobin concentration, degrees of blood oxygenation, average diameter of capillaries with blood) from measurements of spectral and spatial characteristics of diffuse reflection. The method is based on regression relationships between linearly independent components of the measured light signals and the unknown parameters of MMs, obtained by simulation of the radiation transfer in the MM under conditions of its general variability. We have proposed and justified the calibration-free fibre-optic method for determining the concentration of haemoglobin in MMs by measuring the light signals diffusely reflected by the tissue in four spectral regions at two different distances from the illumination spot. We have selected the optimal wavelengths of optical probing for the implementation of the method.
A Driving Behaviour Model of Electrical Wheelchair Users
Hamam, Y.; Djouani, K.; Daachi, B.; Steyn, N.
2016-01-01
In spite of the availability of powered wheelchairs, some users still experience steering challenges and manoeuvring difficulties that limit their capacity to navigate effectively. For such users, steering support and assistive systems may be necessary. For the assistance to be appreciated, the assistive control needs to be adaptable to the user's steering behaviour. This paper contributes to wheelchair steering improvement by modelling the steering behaviour of powered wheelchair users for integration into the control system. More precisely, the modelling is based on the improved Directed Potential Field (DPF) method for trajectory planning. The method has facilitated the formulation of a simple behaviour model that is also linear in parameters. To obtain steering data for parameter identification, seven individuals drove the wheelchair in different virtual worlds on the augmented platform. The obtained data facilitated the estimation of user parameters using the ordinary least squares method, with satisfactory regression analysis results. PMID:27148362
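For a model that is linear in its parameters, identification by ordinary least squares reduces to solving a single linear system. The sketch below is a generic illustration with an invented regressor structure; it is not the paper's DPF-based behaviour model.

```python
import numpy as np

# generic linear-in-parameters identification: y = Phi(x) @ theta + noise
# Phi here is a placeholder regressor matrix, not the paper's DPF-based model
rng = np.random.default_rng(3)
n = 500
heading_err = rng.normal(0, 0.4, n)       # hypothetical steering inputs
obstacle_dist = rng.uniform(0.5, 4.0, n)
true_theta = np.array([1.2, -0.8, 0.3])

Phi = np.column_stack([heading_err, 1.0 / obstacle_dist, np.ones(n)])
y = Phi @ true_theta + rng.normal(0, 0.1, n)   # simulated steering command

theta_hat, res, rank, _ = np.linalg.lstsq(Phi, y, rcond=None)
r2 = 1 - np.sum((y - Phi @ theta_hat) ** 2) / np.sum((y - y.mean()) ** 2)
print("identified parameters:", np.round(theta_hat, 3), " R^2 =", round(r2, 3))
```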
Stevanović, Nikola R; Perušković, Danica S; Gašić, Uroš M; Antunović, Vesna R; Lolić, Aleksandar Đ; Baošić, Rada M
2017-03-01
The objectives of this study were to gain insight into structure-retention relationships and to propose a model for estimating retention. A chromatographic investigation of a series of 36 Schiff bases and their copper(II) and nickel(II) complexes was performed under both normal- and reverse-phase conditions. The chemical structures of the compounds were characterized by molecular descriptors, which are calculated from the structure and related to the chromatographic retention parameters by multiple linear regression analysis. The effects of chelation on the retention parameters of the investigated compounds, under normal- and reverse-phase chromatographic conditions, were analyzed by principal component analysis. Quantitative structure-retention relationship and quantitative structure-activity relationship models were developed on the basis of theoretical molecular descriptors, calculated exclusively from molecular structure, and on parameters of retention and lipophilicity. Copyright © 2016 John Wiley & Sons, Ltd.
Effect of Malmquist bias on correlation studies with IRAS data base
NASA Technical Reports Server (NTRS)
Verter, Frances
1993-01-01
The relationships between galaxy properties in the sample of Trinchieri et al. (1989) are reexamined with corrections for Malmquist bias. The linear correlations are tested and linear regressions are fit for log-log plots of L(FIR), L(H-alpha), and L(B), as well as ratios of these quantities. The linear correlations are corrected for Malmquist bias using the method of Verter (1988), in which each galaxy observation is weighted by the inverse of its sampling volume. The linear regressions are corrected for Malmquist bias by a new method introduced here, in which each galaxy observation is weighted by its sampling volume. The results of the correlations and regressions are significantly changed in the anticipated sense: the corrected correlation confidences are lower, and the corrected slopes of the linear regressions are lower. The elimination of Malmquist bias eliminates the nonlinear rise in luminosity that has caused some authors to hypothesize additional components of FIR emission.
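The correction strategy described above amounts to replacing the usual equal-weight estimators with versions in which each galaxy is weighted by its sampling volume (or its inverse). A minimal sketch with synthetic luminosities and hypothetical V_max values, to show how the choice of weights moves the fitted slope:

```python
import numpy as np

def weighted_linear_fit(x, y, w):
    """Weighted least-squares intercept and slope for y = a + b*x."""
    W = np.sum(w)
    xm, ym = np.sum(w * x) / W, np.sum(w * y) / W
    b = np.sum(w * (x - xm) * (y - ym)) / np.sum(w * (x - xm) ** 2)
    return ym - b * xm, b

# toy flux-limited sample: sampling volume grows with luminosity, so
# weighting by V (or 1/V) changes the fitted slope relative to equal weights
rng = np.random.default_rng(4)
logL_fir = rng.uniform(8, 11, 300)
logL_b = 0.8 * logL_fir + 1.5 + rng.normal(0, 0.3, 300)
v_max = 10 ** (1.5 * (logL_fir - 8))          # hypothetical sampling volumes

print("unweighted  :", np.round(weighted_linear_fit(logL_fir, logL_b, np.ones(300)), 3))
print("weight ~ V  :", np.round(weighted_linear_fit(logL_fir, logL_b, v_max), 3))
print("weight ~ 1/V:", np.round(weighted_linear_fit(logL_fir, logL_b, 1 / v_max), 3))
```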
Farr, Joshua N.; Chen, Zhao; Lisse, Jeffrey R.; Lohman, Timothy G.; Going, Scott B.
2010-01-01
Understanding the influence of total body fat mass (TBFM) on bone during the peri-pubertal years is critical for the development of future interventions aimed at improving bone strength and reducing fracture risk. Thus, we evaluated the relationship of TBFM to volumetric bone mineral density (vBMD), geometry, and strength at metaphyseal and diaphyseal sites of the femur and tibia of young girls. Data from 396 girls aged 8–13 years from the “Jump-In: Building Better Bones” study were analyzed. Bone parameters were assessed using peripheral quantitative computed tomography (pQCT) at the 4% and 20% distal femur and 4% and 66% distal tibia of the non-dominant leg. Bone parameters at the 4% sites included trabecular vBMD, periosteal circumference, and bone strength index (BSI), while at the 20% femur and 66% tibia, parameters included cortical vBMD, periosteal circumference, and strength-strain index (SSI). Multiple linear regression analyses were used to assess associations between bone parameters and TBFM, controlling for muscle cross-sectional area (MCSA). Regression analyses were then repeated with maturity, bone length, physical activity, and ethnicity as additional covariates. Analysis of covariance (ANCOVA) was used to compare bone parameters among tertiles of TBFM. In regression models with TBFM and MCSA, associations between TBFM and bone parameters at all sites were not significant. TBFM explained very little variance in all bone parameters (0.2–2.3%). In contrast, MCSA was strongly related (p < 0.001) to all bone parameters, except cortical vBMD. The addition of maturity, bone length, physical activity, and ethnicity did not alter the relationship between TBFM and bone parameters. With bone parameters expressed relative to total body mass, ANCOVA showed that all outcomes were significantly (p < 0.001) greater in the lowest compared to the middle and highest tertiles of TBFM. Although TBFM is correlated with femur and tibia vBMD, periosteal circumference, and strength in young girls, this relationship is significantly attenuated after adjustment for MCSA. Nevertheless, girls with higher TBFM relative to body mass have markedly diminished vBMD, geometry, and bone strength at metaphyseal and diaphyseal sites of the femur and tibia. PMID:20060079
NASA Astrophysics Data System (ADS)
Lockwood, M.; Owens, M. J.; Barnard, L.; Usoskin, I. G.
2016-11-01
We use sunspot-group observations from the Royal Greenwich Observatory (RGO) to investigate the effects of intercalibrating data from observers with different visual acuities. The tests are made by counting the number of groups [RB] above a variable cut-off threshold of observed total whole spot area (uncorrected for foreshortening) to simulate what a lower-acuity observer would have seen. The synthesised annual means of RB are then re-scaled to the full observed RGO group number [RA] using a variety of regression techniques. It is found that a very high correlation between RA and RB (r_{AB} > 0.98) does not prevent large errors in the intercalibration (for example sunspot-maximum values can be over 30 % too large even for such levels of r_{AB}). In generating the backbone sunspot number [R_{BB}], Svalgaard and Schatten ( Solar Phys., 2016) force regression fits to pass through the scatter-plot origin, which generates unreliable fits (the residuals do not form a normal distribution) and causes sunspot-cycle amplitudes to be exaggerated in the intercalibrated data. It is demonstrated that the use of Quantile-Quantile ("Q-Q") plots to test for a normal distribution is a useful indicator of erroneous and misleading regression fits. Ordinary least-squares linear fits, not forced to pass through the origin, are sometimes reliable (although the optimum method used is shown to be different when matching peak and average sunspot-group numbers). However, other fits are only reliable if non-linear regression is used. From these results it is entirely possible that the inflation of solar-cycle amplitudes in the backbone group sunspot number as one goes back in time, relative to related solar-terrestrial parameters, is entirely caused by the use of inappropriate and non-robust regression techniques to calibrate the sunspot data.
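The two methodological points made above, that forcing a regression through the origin can distort the fit and that a Q-Q style check of the residuals exposes this, can be illustrated with a small sketch. The observer data here are synthetic, and the probplot correlation is used only as a crude numerical proxy for inspecting a Q-Q plot.

```python
import numpy as np
from scipy import stats

# synthetic observer pair with a genuine non-zero offset between their counts
rng = np.random.default_rng(5)
r_a = rng.uniform(2, 15, 80)
r_b = 0.7 * r_a - 1.0 + rng.normal(0, 0.5, 80)

# ordinary least squares with a free intercept
slope_free, icpt_free, r, _, _ = stats.linregress(r_b, r_a)

# fit forced through the origin: slope = sum(xy) / sum(x^2)
slope_origin = np.sum(r_b * r_a) / np.sum(r_b ** 2)

for name, pred in [("free intercept", slope_free * r_b + icpt_free),
                   ("through origin", slope_origin * r_b)]:
    resid = r_a - pred
    (_, _), (_, _, qq_r) = stats.probplot(resid, dist="norm")
    print(f"{name:15s}  Q-Q correlation of residuals = {qq_r:.3f}")
```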
A statistical investigation into the relationship between meteorological parameters and suicide
NASA Astrophysics Data System (ADS)
Dixon, Keith W.; Shulman, Mark D.
1983-06-01
Many previous studies of relationships between weather and suicides have been inconclusive and contradictory. This study investigated the relationship between suicide frequency and meteorological conditions in people who are psychologically predisposed to commit suicide. Linear regressions of diurnal temperature change, departure of temperature from the climatic norm, mean daytime sky cover, and the number of hours of precipitation for each day were performed on daily suicide totals using standard computer methods. Statistical analyses of suicide data for days with and without frontal passages were also performed. Days with five or more suicides (clusterdays) were isolated, and their weather parameters were compared with those of nonclusterdays. The results show that neither suicide totals nor clusterday occurrence can be predicted from these meteorological parameters, since no statistically significant relationships were found. Although the data hinted that frontal passages and large daily temperature changes may occur on days with above-average suicide totals, it was concluded that the influence of the weather parameters used on the suicide rate is a minor one, if indeed one exists.
A primer for biomedical scientists on how to execute model II linear regression analysis.
Ludbrook, John
2012-04-01
1. There are two very different ways of executing linear regression analysis. One is Model I, when the x-values are fixed by the experimenter. The other is Model II, in which the x-values are free to vary and are subject to error. 2. I have received numerous complaints from biomedical scientists that they have great difficulty in executing Model II linear regression analysis. This may explain the results of a Google Scholar search, which showed that the authors of articles in journals of physiology, pharmacology and biochemistry rarely use Model II regression analysis. 3. I repeat my previous arguments in favour of using least products linear regression analysis for Model II regressions. I review three methods for executing ordinary least products (OLP) and weighted least products (WLP) regression analysis: (i) scientific calculator and/or computer spreadsheet; (ii) specific purpose computer programs; and (iii) general purpose computer programs. 4. Using a scientific calculator and/or computer spreadsheet, it is easy to obtain correct values for OLP slope and intercept, but the corresponding 95% confidence intervals (CI) are inaccurate. 5. Using specific purpose computer programs, the freeware computer program smatr gives the correct OLP regression coefficients and obtains 95% CI by bootstrapping. In addition, smatr can be used to compare the slopes of OLP lines. 6. When using general purpose computer programs, I recommend the commercial programs systat and Statistica for those who regularly undertake linear regression analysis and I give step-by-step instructions in the Supplementary Information as to how to use loss functions. © 2011 The Author. Clinical and Experimental Pharmacology and Physiology. © 2011 Blackwell Publishing Asia Pty Ltd.
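Ordinary least products (geometric-mean) regression has a closed-form slope, sign(r)·sd(y)/sd(x), which is what makes the calculator or spreadsheet route workable; bootstrapping then supplies the confidence interval, which is the approach smatr takes in R. The Python sketch below is an independent illustration of that recipe, not smatr itself, and the data are synthetic.

```python
import numpy as np

def olp_fit(x, y):
    """Ordinary least products (geometric-mean) regression:
    slope = sign(r) * sd(y)/sd(x), intercept from the means."""
    r = np.corrcoef(x, y)[0, 1]
    slope = np.sign(r) * np.std(y, ddof=1) / np.std(x, ddof=1)
    return y.mean() - slope * x.mean(), slope

def bootstrap_ci(x, y, n_boot=2000, seed=0):
    """Percentile bootstrap 95% CI for the OLP slope."""
    rng = np.random.default_rng(seed)
    slopes = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(x), len(x))
        slopes.append(olp_fit(x[idx], y[idx])[1])
    return np.percentile(slopes, [2.5, 97.5])

# synthetic Model II data: x is measured with error, as is y
rng = np.random.default_rng(6)
x_true = rng.uniform(0, 10, 60)
x = x_true + rng.normal(0, 0.5, 60)
y = 2.0 * x_true + 1.0 + rng.normal(0, 0.5, 60)

intercept, slope = olp_fit(x, y)
lo, hi = bootstrap_ci(x, y)
print(f"OLP slope {slope:.2f} (95% CI {lo:.2f}-{hi:.2f}), intercept {intercept:.2f}")
```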
Baldi, F; Alencar, M M; Albuquerque, L G
2010-12-01
The objective of this work was to estimate covariance functions using random regression models on B-spline functions of animal age, for weights from birth to adult age in Canchim cattle. Data comprised 49,011 records on 2435 females. The model of analysis included fixed effects of contemporary groups, age of dam as a quadratic covariable, and the population mean trend taken into account by a cubic regression on orthogonal polynomials of animal age. Residual variances were modelled through a step function with four classes. The direct and maternal additive genetic effects, and animal and maternal permanent environmental effects, were included as random effects in the model. A total of seventeen analyses, considering linear, quadratic and cubic B-spline functions and up to seven knots, were carried out. B-spline functions of the same order were considered for all random effects. Random regression models on B-spline functions were compared with a random regression model on Legendre polynomials and with a multitrait model. Results from the different models of analysis were compared using the REML form of the Akaike Information Criterion and Schwarz's Bayesian Information Criterion. In addition, the variance components and genetic parameters estimated for each random regression model were also used as criteria to choose the most adequate model to describe the covariance structure of the data. A model fitting quadratic B-splines, with four knots or three segments for the direct additive genetic effect and the animal permanent environmental effect and two knots for the maternal additive genetic effect and the maternal permanent environmental effect, was the most adequate to describe the covariance structure of the data. Random regression models using B-spline functions as base functions fitted the data better than Legendre polynomials, especially at mature ages, but a larger number of parameters needs to be estimated with B-spline functions. © 2010 Blackwell Verlag GmbH.
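A full random regression animal model requires REML estimation of genetic and permanent environmental (co)variances, which is beyond a short sketch; the snippet below only illustrates the building block, constructing a quadratic B-spline basis over age with a small number of knots and fitting a fixed growth curve by least squares on invented weight-age data.

```python
import numpy as np
from scipy.interpolate import BSpline

def bspline_basis(x, knots, degree=2):
    """Evaluate a B-spline basis (one column per basis function) at x,
    padding the boundary knots so the basis spans the whole age range."""
    t = np.r_[[knots[0]] * degree, knots, [knots[-1]] * degree]
    n_basis = len(t) - degree - 1
    return np.column_stack([
        BSpline(t, np.eye(n_basis)[i], degree)(x) for i in range(n_basis)
    ])

# toy weight-age data; a real random regression model would add genetic and
# permanent environmental random effects and estimate (co)variances by REML
rng = np.random.default_rng(7)
age = rng.uniform(1, 2400, 800)                     # days from birth to maturity
weight = 40 + 420 * (1 - np.exp(-age / 600)) + rng.normal(0, 20, 800)

knots = np.linspace(1, 2400, 4)                     # four knots = three segments
B = bspline_basis(age, knots, degree=2)
coef, *_ = np.linalg.lstsq(B, weight, rcond=None)
print("fitted spline coefficients:", np.round(coef, 1))
```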
ERIC Educational Resources Information Center
Rocconi, Louis M.
2013-01-01
This study examined the differing conclusions one may come to depending upon the type of analysis chosen, hierarchical linear modeling or ordinary least squares (OLS) regression. To illustrate this point, this study examined the influences of seniors' self-reported critical thinking abilities three ways: (1) an OLS regression with the student…
NASA Astrophysics Data System (ADS)
Srimani, P. K.; Parimala, Y. G.
2011-12-01
A unique approach based on artificial neural networks has been developed to study patterns in the ragas of Carnatic classical music. Ragas in Carnatic music, which have their roots in the Vedic period, have grown on a scientific foundation over thousands of years. However, owing to their vastness and complexity, it has always been a challenge for scientists and musicologists to give an all-encompassing perspective, both qualitatively and quantitatively. Cognition, comprehension and perception of ragas in Indian classical music have always been subjects of intensive research and remain highly intriguing; many of their facets have hitherto not been unravelled. This paper attempts to view the melakartha ragas from a cognitive perspective using an artificial neural network based approach, which has given rise to very interesting results. The 72 ragas of the melakartha system were defined through the combination of frequencies occurring in each of them. Several neural networks were trained on the data sets. 100% accurate pattern recognition and classification was obtained using linear regression, TLRN, MLP and RBF networks. The performance of the different network topologies was compared by varying various network parameters. Linear regression was found to be the best performing network.
Yang, Chieh-Hou; Lee, Wei-Feng
2002-01-01
Ground water reservoirs in the Choshuichi alluvial fan, central western Taiwan, were investigated using direct-current (DC) resistivity soundings at 190 locations, combined with hydrogeological measurements from 37 wells. In addition, attempts were made to calculate aquifer transmissivity from both surface DC resistivity measurements and geostatistically derived predictions of aquifer properties. DC resistivity sounding data are highly correlated to the hydraulic parameters in the Choshuichi alluvial fan. By estimating the spatial distribution of hydraulic conductivity from the kriged well data and the cokriged thickness of the correlative aquifer from both resistivity sounding data and well information, the transmissivity of the aquifer at each location can be obtained from the product of kriged hydraulic conductivity and computed thickness of the geoelectric layer. Thus, the spatial variation of the transmissivities in the study area is obtained. Our work is more comparable to Ahmed et al. (1988) than to the work of Niwas and Singhal (1981). The first "constraint" from Niwas and Singhal's work is a result of their use of linear regression. The geostatistical approach taken here (and by Ahmed et al. [1988]) is a natural improvement on the linear regression approach.
Modeling of bromate formation by ozonation of surface waters in drinking water treatment.
Legube, Bernard; Parinet, Bernard; Gelinet, Karine; Berne, Florence; Croue, Jean-Philippe
2004-04-01
The main objective of this paper is to develop statistically and chemically rational models for bromate formation during ozonation of clarified surface waters. The results presented here show that bromate formation by ozonation of natural waters in drinking water treatment is directly proportional to the "Ct" value ("Ctau" in this study). Moreover, this proportionality strongly depends on several parameters: increases in pH, temperature and bromide level lead to an increase in bromate formation, whereas ammonia and dissolved organic carbon concentrations cause the reverse effect. Taking into account the limitations of theoretical modeling, we propose to predict bromate formation by stochastic simulations (multi-linear regression and artificial neural network methods) from 40 experiments (BrO(3)(-) vs. "Ctau") carried out with three sand-filtered waters sampled at three different waterworks. With seven selected variables we used a simple neural network architecture, optimized with "Neural Connection" from SPSS Inc./Recognition Inc. Bromate modeling by artificial neural networks gives better results than multi-linear regression. The artificial neural network model allowed us to classify the variables in decreasing order of influence (for the studied cases, within our variable scale): "Ctau", [N-NH(4)(+)], [Br(-)], pH, temperature, DOC, alkalinity.
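A hedged illustration of the model comparison described above, using scikit-learn's linear regression and a small multilayer perceptron as generic stand-ins (the study used the commercial Neural Connection package); the seven predictors and the response are synthetic placeholders.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# synthetic stand-ins for the seven predictors (Ctau, NH4+, Br-, pH, T, DOC, alkalinity)
rng = np.random.default_rng(8)
X = rng.normal(size=(120, 7))
# mildly non-linear "bromate" response so a flexible model has something to gain
y = 1.5 * X[:, 0] + 0.8 * X[:, 2] * X[:, 3] - 0.6 * X[:, 1] + rng.normal(0, 0.3, 120)

linear = LinearRegression()
ann = make_pipeline(StandardScaler(),
                    MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=0))

for name, model in [("multi-linear regression", linear), ("small neural network", ann)]:
    score = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{name:25s} mean cross-validated R^2 = {score:.2f}")
```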
Use of a tracing task to assess visuomotor performance for evidence of concussion and recuperation.
Kelty-Stephen, Damian G; Qureshi Ahmad, Mona; Stirling, Leia
2015-12-01
The likelihood of suffering a concussion while playing a contact sport ranges from 15% to 45% per year of play. These rates are highly variable, as athletes seldom report concussive symptoms or do not recognize them. We performed a prospective cohort study (n = 206, aged 10-17) to examine visuomotor tracing and determine its sensitivity for detecting neuromotor components of concussion. Tracing variability measures were investigated for a mean shift at presentation of concussion-related symptoms and a linear return toward baseline over subsequent return visits. Furthermore, previous research relating brain injury to the dissociation of smooth movements into "submovements" led to the expectation that cumulative micropause duration, a measure of motion continuity, might detect the likelihood of injury. Separate linear mixed-effects regressions of the tracing measures indicated that 4 of the 5 measures captured both short-term effects of injury and longer-term effects of recovery over subsequent visits. Cumulative micropause duration has a positive relationship with the likelihood of participants having had a concussion. The present results suggest that future research should evaluate how well the coefficients for the tracing parameter in the logistic regression help to detect concussion in novel cases. (c) 2015 APA, all rights reserved.
Analysis of Sequence Data Under Multivariate Trait-Dependent Sampling.
Tao, Ran; Zeng, Donglin; Franceschini, Nora; North, Kari E; Boerwinkle, Eric; Lin, Dan-Yu
2015-06-01
High-throughput DNA sequencing allows for the genotyping of common and rare variants for genetic association studies. At the present time and for the foreseeable future, it is not economically feasible to sequence all individuals in a large cohort. A cost-effective strategy is to sequence those individuals with extreme values of a quantitative trait. We consider the design under which the sampling depends on multiple quantitative traits. Under such trait-dependent sampling, standard linear regression analysis can result in bias of parameter estimation, inflation of type I error, and loss of power. We construct a likelihood function that properly reflects the sampling mechanism and utilizes all available data. We implement a computationally efficient EM algorithm and establish the theoretical properties of the resulting maximum likelihood estimators. Our methods can be used to perform separate inference on each trait or simultaneous inference on multiple traits. We pay special attention to gene-level association tests for rare variants. We demonstrate the superiority of the proposed methods over standard linear regression through extensive simulation studies. We provide applications to the Cohorts for Heart and Aging Research in Genomic Epidemiology Targeted Sequencing Study and the National Heart, Lung, and Blood Institute Exome Sequencing Project.
Competing regression models for longitudinal data.
Alencar, Airlane P; Singer, Julio M; Rocha, Francisco Marcelo M
2012-03-01
The choice of an appropriate family of linear models for the analysis of longitudinal data is often a matter of concern for practitioners. To attenuate such difficulties, we discuss some issues that emerge when analyzing this type of data via a practical example involving pretest-posttest longitudinal data. In particular, we consider log-normal linear mixed models (LNLMM), generalized linear mixed models (GLMM), and models based on generalized estimating equations (GEE). We show how some special features of the data, like a nonconstant coefficient of variation, may be handled in the three approaches and evaluate their performance with respect to the magnitude of standard errors of interpretable and comparable parameters. We also show how different diagnostic tools may be employed to identify outliers and comment on available software. We conclude by noting that the results are similar, but that GEE-based models may be preferable when the goal is to compare the marginal expected responses. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
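In Python, statsmodels offers both of these families, so the practical contrast the abstract draws (subject-specific mixed models versus marginal GEE models) can be sketched on synthetic pretest-posttest-style data; the formula, random-intercept structure and exchangeable working correlation below are illustrative choices, not the paper's exact specifications.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# synthetic pretest-posttest data: 60 subjects, 2 occasions, subject-level noise
rng = np.random.default_rng(9)
n_subj = 60
subj = np.repeat(np.arange(n_subj), 2)
time = np.tile([0, 1], n_subj)
subj_eff = np.repeat(rng.normal(0, 1.0, n_subj), 2)
y = 10 + 2.0 * time + subj_eff + rng.normal(0, 0.8, 2 * n_subj)
df = pd.DataFrame({"y": y, "time": time, "subject": subj})

# linear mixed model with a random intercept per subject (subject-specific)
lmm = smf.mixedlm("y ~ time", df, groups=df["subject"]).fit()

# GEE with an exchangeable working correlation (marginal model)
gee = smf.gee("y ~ time", groups="subject", data=df,
              cov_struct=sm.cov_struct.Exchangeable(),
              family=sm.families.Gaussian()).fit()

print("LMM time effect:", round(lmm.params["time"], 3))
print("GEE time effect:", round(gee.params["time"], 3))
```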
[Application of ordinary Kriging method in entomologic ecology].
Zhang, Runjie; Zhou, Qiang; Chen, Cuixian; Wang, Shousong
2003-01-01
Geostatistics is a statistical method based on regionalized variables that uses the variogram as a tool to analyze the spatial structure and patterns of organisms. When the variogram is modelled over a large range, an optimal fit cannot always be obtained automatically, but an interactive (human-computer dialogue) procedure can be used to optimize the parameters of spherical models. In this paper, that procedure and weighted polynomial regression were used to fit a one-step spherical model, a two-step spherical model and a linear function model, and the available nearby samples were used in the ordinary kriging procedure, which provides a best linear unbiased estimate under the constraint of unbiasedness. The sums of squared deviations between the estimated and measured values for the different theoretical models were computed, and the corresponding graphs are shown. The results showed that the fit based on the two-step spherical model was the best, and that the one-step spherical model was better than the linear function model.
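The one-structure spherical variogram mentioned above has a simple closed form, gamma(h) = nugget + sill*(1.5*(h/a) - 0.5*(h/a)^3) for h < a, and nugget + sill beyond the range a. A minimal sketch that fits it to hypothetical empirical semivariances with a generic least-squares optimizer, as a stand-in for the interactive and weighted-regression fitting described in the abstract:

```python
import numpy as np
from scipy.optimize import curve_fit

def spherical_variogram(h, nugget, sill, a):
    """One-structure spherical variogram: rises as 1.5(h/a) - 0.5(h/a)^3
    up to the range a, then flattens at nugget + sill; gamma(0) = 0."""
    h = np.asarray(h, float)
    gamma = np.where(h < a,
                     nugget + sill * (1.5 * h / a - 0.5 * (h / a) ** 3),
                     nugget + sill)
    return np.where(h == 0, 0.0, gamma)

# hypothetical empirical semivariances for insect counts at increasing lag distances
lags = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=float)
gamma_hat = np.array([0.8, 1.4, 1.9, 2.3, 2.6, 2.8, 2.9, 2.9, 3.0, 3.0])

params, _ = curve_fit(spherical_variogram, lags, gamma_hat,
                      p0=[0.5, 2.5, 6.0], bounds=(0, np.inf))
print("nugget, partial sill, range =", np.round(params, 2))
```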
Burgansky-Eliash, Zvia; Wollstein, Gadi; Chu, Tianjiao; Ramsey, Joseph D.; Glymour, Clark; Noecker, Robert J.; Ishikawa, Hiroshi; Schuman, Joel S.
2007-01-01
Purpose Machine-learning classifiers are trained computerized systems with the ability to detect the relationship between multiple input parameters and a diagnosis. The present study investigated whether the use of machine-learning classifiers improves optical coherence tomography (OCT) glaucoma detection. Methods Forty-seven patients with glaucoma (47 eyes) and 42 healthy subjects (42 eyes) were included in this cross-sectional study. Of the glaucoma patients, 27 had early disease (visual field mean deviation [MD] ≥ −6 dB) and 20 had advanced glaucoma (MD < −6 dB). Machine-learning classifiers were trained to discriminate between glaucomatous and healthy eyes using parameters derived from OCT output. The classifiers were trained with all 38 parameters as well as with only 8 parameters that correlated best with the visual field MD. Five classifiers were tested: linear discriminant analysis, support vector machine, recursive partitioning and regression tree, generalized linear model, and generalized additive model. For the last two classifiers, a backward feature selection was used to find the minimal number of parameters that resulted in the best and most simple prediction. The cross-validated receiver operating characteristic (ROC) curve and accuracies were calculated. Results The largest area under the ROC curve (AROC) for glaucoma detection was achieved with the support vector machine using eight parameters (0.981). The sensitivity at 80% and 95% specificity was 97.9% and 92.5%, respectively. This classifier also performed best when judged by cross-validated accuracy (0.966). The best classification between early glaucoma and advanced glaucoma was obtained with the generalized additive model using only three parameters (AROC = 0.854). Conclusions Automated machine classifiers of OCT data might be useful for enhancing the utility of this technology for detecting glaucomatous abnormality. PMID:16249492
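A hedged sketch of the kind of classifier evaluation described above, training a support vector machine on eight features and reporting a cross-validated area under the ROC curve; the feature values are synthetic placeholders rather than OCT measurements, and the pipeline is a generic scikit-learn one rather than the study's exact procedure.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# synthetic stand-in for 8 OCT-derived parameters in 89 eyes (47 glaucoma, 42 healthy)
rng = np.random.default_rng(10)
n_glaucoma, n_healthy = 47, 42
X = np.vstack([rng.normal(-0.6, 1.0, size=(n_glaucoma, 8)),
               rng.normal(0.6, 1.0, size=(n_healthy, 8))])
y = np.r_[np.ones(n_glaucoma), np.zeros(n_healthy)]

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean()
print(f"cross-validated area under the ROC curve: {auc:.3f}")
```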
Coupé, Christophe
2018-01-01
As statistical approaches are getting increasingly used in linguistics, attention must be paid to the choice of methods and algorithms used. This is especially true since they require assumptions to be satisfied to provide valid results, and because scientific articles still often fall short of reporting whether such assumptions are met. Progress is being, however, made in various directions, one of them being the introduction of techniques able to model data that cannot be properly analyzed with simpler linear regression models. We report recent advances in statistical modeling in linguistics. We first describe linear mixed-effects regression models (LMM), which address grouping of observations, and generalized linear mixed-effects models (GLMM), which offer a family of distributions for the dependent variable. Generalized additive models (GAM) are then introduced, which allow modeling non-linear parametric or non-parametric relationships between the dependent variable and the predictors. We then highlight the possibilities offered by generalized additive models for location, scale, and shape (GAMLSS). We explain how they make it possible to go beyond common distributions, such as Gaussian or Poisson, and offer the appropriate inferential framework to account for ‘difficult’ variables such as count data with strong overdispersion. We also demonstrate how they offer interesting perspectives on data when not only the mean of the dependent variable is modeled, but also its variance, skewness, and kurtosis. As an illustration, the case of phonemic inventory size is analyzed throughout the article. For over 1,500 languages, we consider as predictors the number of speakers, the distance from Africa, an estimation of the intensity of language contact, and linguistic relationships. We discuss the use of random effects to account for genealogical relationships, the choice of appropriate distributions to model count data, and non-linear relationships. Relying on GAMLSS, we assess a range of candidate distributions, including the Sichel, Delaporte, Box-Cox Green and Cole, and Box-Cox t distributions. We find that the Box-Cox t distribution, with appropriate modeling of its parameters, best fits the conditional distribution of phonemic inventory size. We finally discuss the specificities of phoneme counts, weak effects, and how GAMLSS should be considered for other linguistic variables. PMID:29713298
ERIC Educational Resources Information Center
Preacher, Kristopher J.; Curran, Patrick J.; Bauer, Daniel J.
2006-01-01
Simple slopes, regions of significance, and confidence bands are commonly used to evaluate interactions in multiple linear regression (MLR) models, and the use of these techniques has recently been extended to multilevel or hierarchical linear modeling (HLM) and latent curve analysis (LCA). However, conducting these tests and plotting the…
Classical Testing in Functional Linear Models
Kong, Dehan; Staicu, Ana-Maria; Maity, Arnab
2016-01-01
We extend four tests common in classical regression - Wald, score, likelihood ratio and F tests - to functional linear regression, for testing the null hypothesis, that there is no association between a scalar response and a functional covariate. Using functional principal component analysis, we re-express the functional linear model as a standard linear model, where the effect of the functional covariate can be approximated by a finite linear combination of the functional principal component scores. In this setting, we consider application of the four traditional tests. The proposed testing procedures are investigated theoretically for densely observed functional covariates when the number of principal components diverges. Using the theoretical distribution of the tests under the alternative hypothesis, we develop a procedure for sample size calculation in the context of functional linear regression. The four tests are further compared numerically for both densely and sparsely observed noisy functional data in simulation experiments and using two real data applications. PMID:28955155
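The testing recipe described above (re-express the functional covariate through its principal component scores, then test the scalar-on-function association in the resulting finite-dimensional linear model) can be sketched compactly; the snippet below uses an ordinary F test on synthetic curves as a simplified illustration, not the paper's full treatment of diverging numbers of components or sparsely observed data.

```python
import numpy as np
from scipy import stats

# synthetic densely observed functional covariates: 150 curves on a 100-point grid
rng = np.random.default_rng(12)
n, m = 150, 100
tgrid = np.linspace(0, 1, m)
scores_true = rng.normal(size=(n, 2)) * [2.0, 1.0]
curves = (scores_true[:, [0]] * np.sin(2 * np.pi * tgrid)
          + scores_true[:, [1]] * np.cos(2 * np.pi * tgrid)
          + rng.normal(0, 0.2, (n, m)))
y = 0.8 * scores_true[:, 0] + rng.normal(0, 1.0, n)   # scalar response

# functional PCA via SVD of the centred curves, keeping K components
K = 4
Xc = curves - curves.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
pc_scores = U[:, :K] * s[:K]

# F test of H0: no association between the response and the K score covariates
Xd = np.column_stack([np.ones(n), pc_scores])
beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
rss1 = np.sum((y - Xd @ beta) ** 2)
rss0 = np.sum((y - y.mean()) ** 2)
F = ((rss0 - rss1) / K) / (rss1 / (n - K - 1))
p = stats.f.sf(F, K, n - K - 1)
print(f"F({K}, {n - K - 1}) = {F:.2f}, p = {p:.2e}")
```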