Partial covariate adjusted regression
Şentürk, Damla; Nguyen, Danh V.
2008-01-01
Covariate adjusted regression (CAR) is a recently proposed adjustment method for regression analysis where neither the response nor the predictors are directly observed (Şentürk and Müller, 2005). The available data have been distorted by unknown functions of an observable confounding covariate. CAR provides consistent estimators for the coefficients of the regression between the variables of interest, adjusted for the confounder. We develop a broader class of partial covariate adjusted regression (PCAR) models to accommodate both distorted and undistorted (adjusted/unadjusted) predictors. The PCAR model allows for unadjusted predictors, such as age, gender and demographic variables, which are common in the analysis of biomedical and epidemiological data. The available estimation and inference procedures for CAR are shown to be invalid for the proposed PCAR model. We propose new estimators and develop new inference tools for the more general PCAR setting. In particular, we establish the asymptotic normality of the proposed estimators and propose consistent estimators of their asymptotic variances. Finite sample properties of the proposed estimators are investigated using simulation studies and the method is also illustrated with a Pima Indians diabetes data set. PMID:20126296
On variance estimate for covariate adjustment by propensity score analysis.
Zou, Baiming; Zou, Fei; Shuster, Jonathan J; Tighe, Patrick J; Koch, Gary G; Zhou, Haibo
2016-09-10
Propensity score (PS) methods have been used extensively to adjust for confounding factors in the statistical analysis of observational data in comparative effectiveness research. There are four major PS-based adjustment approaches: PS matching, PS stratification, covariate adjustment by PS, and PS-based inverse probability weighting. Though covariate adjustment by PS is one of the most frequently used PS-based methods in clinical research, the conventional variance estimation of the treatment effects estimate under covariate adjustment by PS is biased. As Stampf et al. have shown, this bias in variance estimation is likely to lead to invalid statistical inference and could result in erroneous public health conclusions (e.g., food and drug safety and adverse events surveillance). To address this issue, we propose a two-stage analytic procedure to develop a valid variance estimator for the covariate adjustment by PS analysis strategy. We also carry out a simple empirical bootstrap resampling scheme. Both proposed procedures are implemented in an R function for public use. Extensive simulation results demonstrate the bias in the conventional variance estimator and show that both proposed variance estimators offer valid estimates for the true variance, and they are robust to complex confounding structures. The proposed methods are illustrated for a post-surgery pain study. Copyright © 2016 John Wiley & Sons, Ltd. PMID:26999553
Dehesh, Tania; Zare, Najaf; Ayatollahi, Seyyed Mohammad Taghi
2015-01-01
Background. The univariate meta-analysis (UM) procedure, as a technique that provides a single overall result, has become increasingly popular. Neglecting the existence of other concomitant covariates in the models leads to loss of treatment efficiency. Our aim was to propose four new approximation approaches for the covariance matrix of the coefficients, which is not readily available for the multivariate generalized least square (MGLS) method as a multivariate meta-analysis approach. Methods. We evaluated the efficiency of four new approaches including zero correlation (ZC), common correlation (CC), estimated correlation (EC), and multivariate multilevel correlation (MMC) on the estimation bias, mean square error (MSE), and 95% probability coverage of the confidence interval (CI) in the synthesis of Cox proportional hazard model coefficients in a simulation study. Results. Comparing the results of the simulation study on the MSE, bias, and CI of the estimated coefficients indicated that the MMC approach was the most accurate procedure compared to the EC, CC, and ZC procedures. The precision ranking of the four approaches according to all above settings was MMC ≥ EC ≥ CC ≥ ZC. Conclusion. This study highlights advantages of MGLS meta-analysis over the UM approach. The results suggested the use of the MMC procedure to overcome the lack of information for having a complete covariance matrix of the coefficients. PMID:26413142
A method for nonlinear exponential regression analysis
NASA Technical Reports Server (NTRS)
Junkin, B. G.
1971-01-01
A computer-oriented technique is presented for performing a nonlinear exponential regression analysis on decay-type experimental data. The technique involves the least squares procedure wherein the nonlinear problem is linearized by expansion in a Taylor series. A linear curve fitting procedure for determining the initial nominal estimates for the unknown exponential model parameters is included as an integral part of the technique. A correction matrix was derived and then applied to the nominal estimate to produce an improved set of model parameters. The solution cycle is repeated until some predetermined criterion is satisfied.
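The procedure summarized above (log-linear starting values, linearization by a first-order Taylor expansion, and a repeated correction step) can be sketched as follows. This is a minimal illustration and not the report's program: the two-parameter decay model y = a*exp(-b*t), the function names, and the stopping rule are assumptions made for the example.

```python
import math

def linear_initial_estimate(t, y):
    """Starting values from an ordinary least squares fit of ln(y) = ln(a) - b*t.

    Assumes every y value is strictly positive.
    """
    n = len(t)
    z = [math.log(v) for v in y]
    t_mean = sum(t) / n
    z_mean = sum(z) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxz = sum((ti - t_mean) * (zi - z_mean) for ti, zi in zip(t, z))
    slope = sxz / sxx
    return math.exp(z_mean - slope * t_mean), -slope

def gauss_newton_exponential(t, y, tol=1e-10, max_iter=50):
    """Fit y = a*exp(-b*t) by repeatedly solving the linearized least squares problem."""
    a, b = linear_initial_estimate(t, y)
    for _ in range(max_iter):
        e = [math.exp(-b * ti) for ti in t]
        r = [yi - a * ei for yi, ei in zip(y, e)]      # residuals at current estimate
        ja = e                                         # d(model)/da
        jb = [-a * ti * ei for ti, ei in zip(t, e)]    # d(model)/db
        # Normal equations (J^T J) delta = J^T r for the 2x2 case.
        saa = sum(v * v for v in ja)
        sab = sum(u * v for u, v in zip(ja, jb))
        sbb = sum(v * v for v in jb)
        ga = sum(u * v for u, v in zip(ja, r))
        gb = sum(u * v for u, v in zip(jb, r))
        det = saa * sbb - sab * sab
        da = (sbb * ga - sab * gb) / det
        db = (saa * gb - sab * ga) / det
        a += da                                        # apply the correction
        b += db
        if abs(da) + abs(db) < tol:                    # hypothetical stopping criterion
            break
    return a, b
```

Calling `gauss_newton_exponential(t, y)` with lists of times and positive measurements returns the pair `(a, b)`; the log-linear fit only supplies the nominal estimate that the correction loop then refines.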
Differential correction schemes in nonlinear regression
NASA Technical Reports Server (NTRS)
Decell, H. P., Jr.; Speed, F. M.
1972-01-01
Classical iterative methods in nonlinear regression are reviewed and improved upon. This is accomplished by discussion of the geometrical and theoretical motivation for introducing modifications using generalized matrix inversion. Examples having inherent pitfalls are presented and compared in terms of results obtained using classical and modified techniques. The modification is shown to be useful alone or in conjunction with other modifications appearing in the literature.
Chaussé, Pierre; Liu, Jin; Luta, George
2016-01-01
Covariate adjustment methods are frequently used when baseline covariate information is available for randomized controlled trials. Using a simulation study, we compared the analysis of covariance (ANCOVA) with three nonparametric covariate adjustment methods with respect to point and interval estimation for the difference between means. The three alternative methods were based on important members of the generalized empirical likelihood (GEL) family, specifically on the empirical likelihood (EL) method, the exponential tilting (ET) method, and the continuous updated estimator (CUE) method. Two criteria were considered for the comparison of the four statistical methods: the root mean squared error and the empirical coverage of the nominal 95% confidence intervals for the difference between means. Based on the results of the simulation study, for sensitivity analysis purposes, we recommend the use of ANCOVA (with robust standard errors when heteroscedasticity is present) together with the CUE-based covariate adjustment method. PMID:27077870
ERIC Educational Resources Information Center
Nimon, Kim; Henson, Robin K.
2015-01-01
The authors empirically examined whether the validity of a residualized dependent variable after covariance adjustment is comparable to that of the original variable of interest. When variance of a dependent variable is removed as a result of one or more covariates, the residual variance may not reflect the same meaning. Using the pretest-posttest…
Sample Size for Confidence Interval of Covariate-Adjusted Mean Difference
ERIC Educational Resources Information Center
Liu, Xiaofeng Steven
2010-01-01
This article provides a way to determine adequate sample size for the confidence interval of covariate-adjusted mean difference in randomized experiments. The standard error of adjusted mean difference depends on covariate variance and balance, which are two unknown quantities at the stage of planning sample size. If covariate observations are…
ERIC Educational Resources Information Center
Safarkhani, Maryam; Moerbeek, Mirjam
2013-01-01
In a randomized controlled trial, a decision needs to be made about the total number of subjects for adequate statistical power. One way to increase the power of a trial is by including a predictive covariate in the model. In this article, the effects of various covariate adjustment strategies on increasing the power is studied for discrete-time…
Cardiovascular Response Identification Based on Nonlinear Support Vector Regression
NASA Astrophysics Data System (ADS)
Wang, Lu; Su, Steven W.; Chan, Gregory S. H.; Celler, Branko G.; Cheng, Teddy M.; Savkin, Andrey V.
This study experimentally investigates the relationships between central cardiovascular variables and oxygen uptake based on nonlinear analysis and modeling. Ten healthy subjects were studied using cycle-ergometry exercise tests with constant workloads ranging from 25 Watt to 125 Watt. Breath by breath gas exchange, heart rate, cardiac output, stroke volume and blood pressure were measured at each stage. The modeling results proved that the nonlinear modeling method (Support Vector Regression) outperforms the traditional regression method (reducing Estimation Error between 59% and 80%, reducing Testing Error between 53% and 72%) and is the ideal approach in the modeling of physiological data, especially with small training data sets.
Kernel Partial Least Squares for Nonlinear Regression and Discrimination
NASA Technical Reports Server (NTRS)
Rosipal, Roman; Clancy, Daniel (Technical Monitor)
2002-01-01
This paper summarizes recent results on applying the method of partial least squares (PLS) in a reproducing kernel Hilbert space (RKHS). A previously proposed kernel PLS regression model was proven to be competitive with other regularized regression methods in RKHS. The family of nonlinear kernel-based PLS models is extended by considering the kernel PLS method for discrimination. Theoretical and experimental results on a two-class discrimination problem indicate usefulness of the method.
An Excel Solver Exercise to Introduce Nonlinear Regression
ERIC Educational Resources Information Center
Pinder, Jonathan P.
2013-01-01
Business students taking business analytics courses that have significant predictive modeling components, such as marketing research, data mining, forecasting, and advanced financial modeling, are introduced to nonlinear regression using application software that is a "black box" to the students. Thus, although correct models are…
Covariate-Adjusted Linear Mixed Effects Model with an Application to Longitudinal Data
Nguyen, Danh V.; Şentürk, Damla; Carroll, Raymond J.
2009-01-01
Linear mixed effects (LME) models are useful for longitudinal data/repeated measurements. We propose a new class of covariate-adjusted LME models for longitudinal data that nonparametrically adjusts for a normalizing covariate. The proposed approach involves fitting a parametric LME model to the data after adjusting for the nonparametric effects of a baseline confounding covariate. In particular, the effect of the observable covariate on the response and predictors of the LME model is modeled nonparametrically via smooth unknown functions. In addition to covariate-adjusted estimation of fixed/population parameters and random effects, an estimation procedure for the variance components is also developed. Numerical properties of the proposed estimators are investigated with simulation studies. The consistency and convergence rates of the proposed estimators are also established. An application to a longitudinal data set on calcium absorption, accounting for baseline distortion from body mass index, illustrates the proposed methodology. PMID:19266053
Detecting influential observations in nonlinear regression modeling of groundwater flow
Yager, R.M.
1998-01-01
Nonlinear regression is used to estimate optimal parameter values in models of groundwater flow to ensure that differences between predicted and observed heads and flows do not result from nonoptimal parameter values. Parameter estimates can be affected, however, by observations that disproportionately influence the regression, such as outliers that exert undue leverage on the objective function. Certain statistics developed for linear regression can be used to detect influential observations in nonlinear regression if the models are approximately linear. This paper discusses the application of Cook's D, which measures the effect of omitting a single observation on a set of estimated parameter values, and the statistical parameter DFBETAS, which quantifies the influence of an observation on each parameter. The influence statistics were used to (1) identify the influential observations in the calibration of a three-dimensional, groundwater flow model of a fractured-rock aquifer through nonlinear regression, and (2) quantify the effect of omitting influential observations on the set of estimated parameter values. Comparison of the spatial distribution of Cook's D with plots of model sensitivity shows that influential observations correspond to areas where the model heads are most sensitive to certain parameters, and where predicted groundwater flow rates are largest. Five of the six discharge observations were identified as influential, indicating that reliable measurements of groundwater flow rates are valuable data in model calibration. DFBETAS are computed and examined for an alternative model of the aquifer system to identify a parameterization error in the model design that resulted in overestimation of the effect of anisotropy on horizontal hydraulic conductivity.
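The two influence statistics named above can be illustrated on a toy problem. The sketch below is not the paper's groundwater calibration: it assumes a straight-line model with an intercept, computes both statistics by literally refitting with each observation omitted, and uses hypothetical function names.

```python
import math

def fit_line(x, y):
    """Ordinary least squares intercept and slope."""
    n = len(x)
    xm = sum(x) / n
    ym = sum(y) / n
    sxx = sum((xi - xm) ** 2 for xi in x)
    slope = sum((xi - xm) * (yi - ym) for xi, yi in zip(x, y)) / sxx
    return ym - slope * xm, slope

def residual_variance(x, y, b0, b1):
    """Error variance estimate SSE / (n - 2) for the straight-line model."""
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    return sse / (len(x) - 2)

def influence(x, y):
    """Return (Cook's D per observation, DFBETAS per observation and parameter)."""
    n, p = len(x), 2
    b0, b1 = fit_line(x, y)
    s2 = residual_variance(x, y, b0, b1)
    # Diagonal of (X^T X)^{-1} for the design matrix with columns [1, x].
    det = n * sum(xi * xi for xi in x) - sum(x) ** 2
    c00 = sum(xi * xi for xi in x) / det
    c11 = n / det
    cooks, dfbetas = [], []
    for i in range(n):
        xd, yd = x[:i] + x[i + 1:], y[:i] + y[i + 1:]   # omit observation i
        d0, d1 = fit_line(xd, yd)
        s2_i = residual_variance(xd, yd, d0, d1)
        # Cook's D: squared shift in all fitted values, scaled by p * s^2.
        shift = sum(((b0 - d0) + (b1 - d1) * xj) ** 2 for xj in x)
        cooks.append(shift / (p * s2))
        # DFBETAS: change in each coefficient, scaled by its deleted standard error.
        dfbetas.append(((b0 - d0) / math.sqrt(s2_i * c00),
                        (b1 - d1) / math.sqrt(s2_i * c11)))
    return cooks, dfbetas
```

Refitting once per omitted observation is the most direct reading of "the effect of omitting a single observation"; closed-form expressions based on leverage give the same values for linear models and would be preferable when each fit is expensive.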
Effects of model sensitivity and nonlinearity on nonlinear regression of ground water flow
Yager, R.M.
2004-01-01
Nonlinear regression is increasingly applied to the calibration of hydrologic models through the use of perturbation methods to compute the Jacobian or sensitivity matrix required by the Gauss-Newton optimization method. Sensitivities obtained by perturbation methods can be less accurate than those obtained by direct differentiation, however, and concern has arisen that the optimal parameter values and the associated parameter covariance matrix computed by perturbation could also be less accurate. Sensitivities computed by both perturbation and direct differentiation were applied in nonlinear regression calibration of seven ground water flow models. The two methods gave virtually identical optimum parameter values and covariances for the three models that were relatively linear and two of the models that were relatively nonlinear, but gave widely differing results for two other nonlinear models. The perturbation method performed better than direct differentiation in some regressions with the nonlinear models, apparently because approximate sensitivities computed for an interval yielded better search directions than did more accurately computed sensitivities for a point. The method selected to avoid overshooting minima on the error surface when updating parameter values with the Gauss-Newton procedure appears for nonlinear models to be more important than the method of sensitivity calculation in controlling regression convergence.
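The distinction between sensitivities obtained by perturbation and by direct differentiation can be made concrete with a small sketch. The exponential toy model, the forward-difference scheme, and the relative step sizes below are assumptions for illustration; they are not the ground water models or the code used in the study.

```python
import math

def model(params, t):
    """Toy model a*exp(-b*t), standing in for a simulated head or flow."""
    a, b = params
    return [a * math.exp(-b * ti) for ti in t]

def jacobian_direct(params, t):
    """Sensitivity matrix from analytic derivatives (direct differentiation)."""
    a, b = params
    return [[math.exp(-b * ti), -a * ti * math.exp(-b * ti)] for ti in t]

def jacobian_perturbation(params, t, rel_step=0.01):
    """Sensitivity matrix from forward differences over a finite interval."""
    base = model(params, t)
    columns = []
    for k, pk in enumerate(params):
        h = rel_step * abs(pk) if pk != 0 else rel_step
        shifted = list(params)
        shifted[k] = pk + h                      # perturb one parameter at a time
        perturbed = model(shifted, t)
        columns.append([(pv - bv) / h for pv, bv in zip(perturbed, base)])
    return [list(row) for row in zip(*columns)]  # rows = observations

def max_abs_difference(j1, j2):
    """Largest elementwise discrepancy between two sensitivity matrices."""
    return max(abs(u - v) for r1, r2 in zip(j1, j2) for u, v in zip(r1, r2))
```

Comparing `jacobian_direct` with `jacobian_perturbation` at several values of `rel_step` shows how the approximation error grows with the perturbation size for the parameter that enters the model nonlinearly, while the column for the linear parameter stays essentially exact.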
Covariate Adjusted Correlation Analysis with Application to FMR1 Premutation Female Carrier Data
Şentürk, Damla; Nguyen, Danh V.; Tassone, Flora; Hagerman, Randi J.; Carroll, Raymond J.; Hagerman, Paul J.
2009-01-01
Summary Motivated by molecular data on female premutation carriers of the fragile X mental retardation 1 (FMR1) gene, we present a new method of covariate adjusted correlation analysis to examine the association of messenger RNA (mRNA) and number of CGG repeat expansion in the FMR1 gene. The association between the molecular variables in female carriers needs to adjust for activation ratio (ActRatio), a measure which accounts for the protective effects of one normal X chromosome in female carriers. However, there are inherent uncertainties in the exact effects of ActRatio on the molecular measures of interest. To account for these uncertainties, we develop a flexible adjustment that accommodates both additive and multiplicative effects of ActRatio nonparametrically. The proposed adjusted correlation uses local conditional correlations, which are local method of moments estimators, to estimate the Pearson correlation between two variables adjusted for a third observable covariate. The local method of moments estimators are averaged to arrive at the final covariate adjusted correlation estimator, which is shown to be consistent. We also develop a test to check the nonparametric joint additive and multiplicative adjustment form. Simulation studies illustrate the efficacy of the proposed method. Application to FMR1 premutation data on 165 female carriers indicates that the association between mRNA and CGG repeat after adjusting for ActRatio is stronger. Finally, the results provide independent support for a specific jointly additive and multiplicative adjustment form for ActRatio previously proposed in the literature. PMID:19173699
Development and Application of Nonlinear Land-Use Regression Models
NASA Astrophysics Data System (ADS)
Champendal, Alexandre; Kanevski, Mikhail; Huguenot, Pierre-Emmanuel
2014-05-01
The problem of air pollution modelling in urban zones is of great importance from both scientific and applied points of view. At present there are several fundamental approaches, based either on science-based modelling (air pollution dispersion) or on the application of space-time geostatistical methods (e.g. the family of kriging models or conditional stochastic simulations). Recently, there were important developments in so-called Land Use Regression (LUR) models. These models take into account geospatial information (e.g. traffic network, sources of pollution, average traffic, population census, land use, etc.) at different scales, for example, using buffering operations. Usually the dimension of the input space (number of independent variables) is within the range of (10-100). It was shown that LUR models have some potential to model complex and highly variable patterns of air pollution in urban zones. Most LUR models currently used are linear models. In the present research, nonlinear LUR models are developed and applied for Geneva city. Mainly two nonlinear data-driven models were elaborated: multilayer perceptron and random forest. An important part of the research also deals with a comprehensive exploratory data analysis using statistical, geostatistical and time series tools. Unsupervised self-organizing maps were applied to better understand space-time patterns of the pollution. The real data case study deals with spatial-temporal air pollution data of Geneva (2002-2011). Nitrogen dioxide (NO2) has caught our attention. It has effects on human health and on plants; NO2 contributes to the phenomenon of acid rain. The negative effects of nitrogen dioxide on plants are reduced growth, production and pesticide resistance. And finally, the effects on materials: nitrogen dioxide increases corrosion. The data used for this study consist of a set of 106 NO2 passive sensors. 80 were used to build the models and the remaining 36 have constituted
The Allometry of Coarse Root Biomass: Log-Transformed Linear Regression or Nonlinear Regression?
Lai, Jiangshan; Yang, Bo; Lin, Dunmei; Kerkhoff, Andrew J.; Ma, Keping
2013-01-01
Precise estimation of root biomass is important for understanding carbon stocks and dynamics in forests. Traditionally, biomass estimates are based on allometric scaling relationships between stem diameter and coarse root biomass calculated using linear regression (LR) on log-transformed data. Recently, it has been suggested that nonlinear regression (NLR) is a preferable fitting method for scaling relationships. But while this claim has been contested on both theoretical and empirical grounds, and statistical methods have been developed to aid in choosing between the two methods in particular cases, few studies have examined the ramifications of erroneously applying NLR. Here, we use direct measurements of 159 trees belonging to three locally dominant species in east China to compare the LR and NLR models of diameter-root biomass allometry. We then contrast model predictions by estimating stand coarse root biomass based on census data from the nearby 24-ha Gutianshan forest plot and by testing the ability of the models to predict known root biomass values measured on multiple tropical species at the Pasoh Forest Reserve in Malaysia. Based on likelihood estimates for model error distributions, as well as the accuracy of extrapolative predictions, we find that LR on log-transformed data is superior to NLR for fitting diameter-root biomass scaling models. More importantly, inappropriately using NLR leads to grossly inaccurate stand biomass estimates, especially for stands dominated by smaller trees. PMID:24116197
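The choice between the two fitting methods comes down to the assumed error structure, which can be written out explicitly. The notation below (M for coarse root biomass, D for stem diameter, a and b for the scaling parameters) is introduced here for illustration and is not taken from the paper.

```latex
\begin{align}
% LR on log-transformed data: multiplicative (lognormal) error
\log M_i &= \log a + b \log D_i + \varepsilon_i,
  \quad \varepsilon_i \sim N(0, \sigma^2)
  \;\Longleftrightarrow\; M_i = a D_i^{\,b} e^{\varepsilon_i} \\
% NLR on the original scale: additive (normal) error
M_i &= a D_i^{\,b} + \varepsilon_i,
  \quad \varepsilon_i \sim N(0, \sigma^2)
\end{align}
```

Under the first form the spread of the errors grows with tree size, whereas the second assumes a constant spread, so the largest trees dominate an NLR fit. This is one way to read the abstract's finding that misapplied NLR is least reliable for stands dominated by smaller trees.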
ERIC Educational Resources Information Center
Petscher, Yaacov; Schatschneider, Christopher
2011-01-01
Research by Huck and McLean (1975) demonstrated that the covariance-adjusted score is more powerful than the simple difference score, yet recent reviews indicate researchers are equally likely to use either score type in two-wave randomized experimental designs. A Monte Carlo simulation was conducted to examine the conditions under which the…
Biswas, Atanu; Park, Eunsik; Bhattacharya, Rahul
2012-08-01
Response-adaptive designs have become popular for allocation of the entering patients among two or more competing treatments in a phase III clinical trial. Although there are many designs for binary treatment responses, the number of designs involving covariates is very small. Sometimes the patients give repeated responses. The only available response-adaptive allocation design for repeated binary responses is the urn design by Biswas and Dewanji [Biswas A and Dewanji AA. Randomized longitudinal play-the-winner design for repeated binary data. ANZJS 2004; 46: 675-684; Biswas A and Dewanji A. Inference for a RPW-type clinical trial with repeated monitoring for the treatment of rheumatoid arthritis. Biometr J 2004; 46: 769-779.], although it does not take the covariates of the patients into account in the allocation design. In this article, a covariate-adjusted response-adaptive randomisation procedure is developed using the log-odds ratio within the Bayesian framework for longitudinal binary responses. The small sample performance of the proposed allocation procedure is assessed through a simulation study. The proposed procedure is illustrated using a real data set. PMID:20974667
Comparison between Linear and Nonlinear Regression in a Laboratory Heat Transfer Experiment
ERIC Educational Resources Information Center
Gonçalves, Carine Messias; Schwaab, Marcio; Pinto, José Carlos
2013-01-01
In order to interpret laboratory experimental data, undergraduate students are used to performing linear regression through linearized versions of nonlinear models. However, the use of linearized models can lead to statistically biased parameter estimates. Even so, it is not an easy task to introduce nonlinear regression and show for the students…
2013-01-01
Background Abattoir condemnation data show promise as a rich source of data for syndromic surveillance of both animal and zoonotic diseases. However, inherent characteristics of abattoir condemnation data can bias results from space-time cluster detection methods for disease surveillance, and may need to be accounted for using various adjustment methods. The objective of this study was to compare the space-time scan statistics with different abilities to control for covariates and to assess their suitability for food animal syndromic surveillance. Four space-time scan statistic models were used including: animal class adjusted Poisson, space-time permutation, multi-level model adjusted Poisson, and a weighted normal scan statistic using model residuals. The scan statistics were applied to monthly bovine pneumonic lung and “parasitic liver” condemnation data from Ontario provincial abattoirs from 2001–2007. Results The number and space-time characteristics of identified clusters often varied between space-time scan tests for both “parasitic liver” and pneumonic lung condemnation data. While there were some similarities between isolated clusters in space, time and/or space-time, overall the results from space-time scan statistics differed substantially depending on the covariate adjustment approach used. Conclusions Variability in results among methods suggests that caution should be used in selecting space-time scan methods for abattoir surveillance. Furthermore, validation of different approaches with simulated or real outbreaks is required before conclusive decisions can be made concerning the best approach for conducting surveillance with these data. PMID:24246040
Confidence region estimation techniques for nonlinear regression: three case studies.
Swiler, Laura Painton (Sandia National Laboratories, Albuquerque, NM); Sullivan, Sean P. (University of Texas, Austin, TX); Stucky-Mack, Nicholas J. (Harvard University, Cambridge, MA); Roberts, Randall Mark; Vugrin, Kay White
2005-10-01
This work focuses on different methods to generate confidence regions for nonlinear parameter identification problems. Three methods for confidence region estimation are considered: a linear approximation method, an F-test method, and a Log-Likelihood method. Each of these methods is applied to three case studies. One case study is a problem with synthetic data, and the other two case studies identify hydraulic parameters in groundwater flow problems based on experimental well-test results. The confidence regions for each case study are analyzed and compared. Although the F-test and Log-Likelihood methods result in similar regions, there are differences between these regions and the regions generated by the linear approximation method for nonlinear problems. The differing results, capabilities, and drawbacks of all three methods are discussed.
Cooley, R.L.
1983-01-01
Investigates factors influencing the degree of improvement in estimates of parameters of a nonlinear regression groundwater flow model by incorporating prior information of unknown reliability. Consideration of expected behavior of the regression solutions and results of a hypothetical modeling problem lead to several general conclusions. -from Author
Nonlinear regression on Riemannian manifolds and its applications to Neuro-image analysis
Banerjee, Monami; Chakraborty, Rudrasis; Ofori, Edward; Vaillancourt, David
2016-01-01
Regression in its most common form, where independent and dependent variables are in ℝn, is a ubiquitous tool in Sciences and Engineering. Recent advances in Medical Imaging have led to the widespread availability of manifold-valued data, leading to problems where the independent variables are manifold-valued and the dependent variables are real-valued, or vice versa. The most common method of regression on a manifold is geodesic regression, which is the counterpart of linear regression in Euclidean space. Often, the relation between the variables is highly complex, and the commonly used geodesic regression can prove to be inaccurate. Thus, it is necessary to resort to a non-linear model for regression. In this work we present a novel Kernel based non-linear regression method for when the mapping to be estimated is either from M → ℝn or ℝn → M, where M is a Riemannian manifold. A key advantage of this approach is that there is no requirement for the manifold-valued data to necessarily inherit an ordering from the data in ℝn. We present several synthetic and real data experiments along with comparisons to the state-of-the-art geodesic regression method in the literature, thus validating the effectiveness of the proposed algorithm. PMID:27110601
Cooley, R.L.
1982-01-01
Prior information on the parameters of a groundwater flow model can be used to improve parameter estimates obtained from nonlinear regression solution of a modeling problem. Two scales of prior information can be available: 1) prior information having known reliability (that is, bias and random error structure), and 2) prior information consisting of best available estimates of unknown reliability. It is shown that if both scales of prior information are available, then a combined regression analysis may be made. -from Author
Huang, C.; Townshend, J.R.G.
2003-01-01
A stepwise regression tree (SRT) algorithm was developed for approximating complex nonlinear relationships. Based on the regression tree of Breiman et al. (BRT) and a stepwise linear regression (SLR) method, this algorithm represents an improvement over SLR in that it can approximate nonlinear relationships and over BRT in that it gives more realistic predictions. The applicability of this method to estimating subpixel forest was demonstrated using three test data sets, on all of which it gave more accurate predictions than SLR and BRT. SRT also generated more compact trees and performed better than or at least as well as BRT in all 10 equal forest-proportion intervals ranging from 0 to 100%. This method is appealing for estimating subpixel land cover over large areas.
A regularization corrected score method for nonlinear regression models with covariate error.
Zucker, David M; Gorfine, Malka; Li, Yi; Tadesse, Mahlet G; Spiegelman, Donna
2013-03-01
Many regression analyses involve explanatory variables that are measured with error, and failing to account for this error is well known to lead to biased point and interval estimates of the regression coefficients. We present here a new general method for adjusting for covariate error. Our method consists of an approximate version of the Stefanski-Nakamura corrected score approach, using the method of regularization to obtain an approximate solution of the relevant integral equation. We develop the theory in the setting of classical likelihood models; this setting covers, for example, linear regression, nonlinear regression, logistic regression, and Poisson regression. The method is extremely general in terms of the types of measurement error models covered, and is a functional method in the sense of not involving assumptions on the distribution of the true covariate. We discuss the theoretical properties of the method and present simulation results in the logistic regression setting (univariate and multivariate). For illustration, we apply the method to data from the Harvard Nurses' Health Study concerning the relationship between physical activity and breast cancer mortality in the period following a diagnosis of breast cancer. PMID:23379851
Lim, Changwon
2015-03-30
Nonlinear regression is often used to evaluate the toxicity of a chemical or a drug by fitting data from a dose-response study. Toxicologists and pharmacologists may draw a conclusion about whether a chemical is toxic by testing the significance of the estimated parameters. However, sometimes the null hypothesis cannot be rejected even though the fit is quite good. One possible reason for such cases is that the estimated standard errors of the parameter estimates are extremely large. In this paper, we propose robust ridge regression estimation procedures for nonlinear models to solve this problem. The asymptotic properties of the proposed estimators are investigated; in particular, their mean squared errors are derived. The performances of the proposed estimators are compared with several standard estimators using simulation studies. The proposed methodology is also illustrated using high throughput screening assay data obtained from the National Toxicology Program. PMID:25490981
Christophersen, A; McKinley-McKee, J S
1984-01-01
An interactive program for analysing enzyme activity-time data using non-linear regression analysis is described. Protection studies can also be dealt with. The program computes inactivation rates, dissociation constants and promotion or inhibition parameters with their standard errors. It can also be used to distinguish different inactivation models. The program is written in SIMULA and is menu-oriented for refining or correcting data at the different levels of computing. PMID:6546558
A comparison of several methods of solving nonlinear regression groundwater flow problems.
Cooley, R.L.
1985-01-01
Computational efficiency and computer memory requirements for four methods of minimizing functions were compared for four test nonlinear-regression steady-state groundwater flow problems. The fastest methods were the Marquardt and quasi-linearization methods, which required almost identical computer times and numbers of iterations; the next fastest was the quasi-Newton method, and last was the Fletcher-Reeves method, which did not converge in 100 iterations for two of the problems. (From author)
A Nonlinear Causality Estimator Based on Non-Parametric Multiplicative Regression
Nicolaou, Nicoletta; Constandinou, Timothy G.
2016-01-01
Causal prediction has become a popular tool for neuroscience applications, as it allows the study of relationships between different brain areas during rest, cognitive tasks or brain disorders. We propose a nonparametric approach for the estimation of nonlinear causal prediction for multivariate time series. In the proposed estimator, CNPMR, Autoregressive modeling is replaced by Nonparametric Multiplicative Regression (NPMR). NPMR quantifies interactions between a response variable (effect) and a set of predictor variables (cause); here, we modified NPMR for model prediction. We also demonstrate how a particular measure, the sensitivity Q, could be used to reveal the structure of the underlying causal relationships. We apply CNPMR on artificial data with known ground truth (5 datasets), as well as physiological data (2 datasets). CNPMR correctly identifies both linear and nonlinear causal connections that are present in the artificial data, as well as physiologically relevant connectivity in the real data, and does not seem to be affected by filtering. The Sensitivity measure also provides useful information about the latent connectivity. The proposed estimator addresses many of the limitations of linear Granger causality and other nonlinear causality estimators. CNPMR is compared with pairwise and conditional Granger causality (linear) and Kernel-Granger causality (nonlinear). The proposed estimator can be applied to pairwise or multivariate estimations without any modifications to the main method. Its nonparametric nature, its ability to capture nonlinear relationships and its robustness to filtering make it appealing for a number of applications. PMID:27378901
Harlim, John; Mahdi, Adam; Majda, Andrew J.
2014-01-15
A central issue in contemporary science is the development of nonlinear data driven statistical–dynamical models for time series of noisy partial observations from nature or a complex model. It has been established recently that ad-hoc quadratic multi-level regression models can have finite-time blow-up of statistical solutions and/or pathological behavior of their invariant measure. Recently, a new class of physics constrained nonlinear regression models were developed to ameliorate this pathological behavior. Here a new finite ensemble Kalman filtering algorithm is developed for estimating the state, the linear and nonlinear model coefficients, the model and the observation noise covariances from available partial noisy observations of the state. Several stringent tests and applications of the method are developed here. In the most complex application, the perfect model has 57 degrees of freedom involving a zonal (east–west) jet, two topographic Rossby waves, and 54 nonlinearly interacting Rossby waves; the perfect model has significant non-Gaussian statistics in the zonal jet with blocked and unblocked regimes and a non-Gaussian skewed distribution due to interaction with the other 56 modes. We only observe the zonal jet contaminated by noise and apply the ensemble filter algorithm for estimation. Numerically, we find that a three dimensional nonlinear stochastic model with one level of memory mimics the statistical effect of the other 56 modes on the zonal jet in an accurate fashion, including the skew non-Gaussian distribution and autocorrelation decay. On the other hand, a similar stochastic model with zero memory levels fails to capture the crucial non-Gaussian behavior of the zonal jet from the perfect 57-mode model.
CANFIS: A non-linear regression procedure to produce statistical air-quality forecast models
Burrows, W.R.; Montpetit, J.; Pudykiewicz, J.
1997-12-31
Statistical models for forecasts of environmental variables can provide a good trade-off between significance and precision in return for substantial saving of computer execution time. Recent non-linear regression techniques give significantly increased accuracy compared to traditional linear regression methods. Two are Classification and Regression Trees (CART) and the Neuro-Fuzzy Inference System (NFIS). Both can model predictand distributions, including the tails, with much better accuracy than linear regression. Given a learning data set of matched predictand and predictors, CART regression produces a non-linear, tree-based, piecewise-continuous model of the predictand data. Its variance-minimizing procedure optimizes the task of predictor selection, often greatly reducing initial data dimensionality. NFIS reduces dimensionality by a procedure known as subtractive clustering but it does not of itself eliminate predictors. Over-lapping coverage in predictor space is enhanced by NFIS with a Gaussian membership function for each cluster component. Coefficients for a continuous response model based on the fuzzified cluster centers are obtained by a least-squares estimation procedure. CANFIS is a two-stage data-modeling technique that combines the strength of CART to optimize the process of selecting predictors from a large pool of potential predictors with the modeling strength of NFIS. A CANFIS model requires negligible computer time to run. CANFIS models for ground-level O3, particulates, and other pollutants will be produced for each of about 100 Canadian sites. The air-quality models will run twice daily using a small number of predictors isolated from a large pool of upstream and local Lagrangian potential predictors.
Aboveground biomass and carbon stocks modelling using non-linear regression model
NASA Astrophysics Data System (ADS)
Ain Mohd Zaki, Nurul; Abd Latif, Zulkiflee; Nazip Suratman, Mohd; Zainee Zainal, Mohd
2016-06-01
Aboveground biomass (AGB) is an important source of uncertainty in carbon estimation for tropical forests because of the variation in species biodiversity and the complex structure of the tropical rain forest. Nevertheless, the tropical rainforest holds the most extensive forest in the world, with a vast diversity of trees and layered canopies. Combining optical sensors with empirical models is a common way to assess AGB. Using regression, a link can be made between remotely sensed data and a biophysical parameter of the forest. This paper therefore assesses the accuracy of a non-linear regression equation, a quadratic function, for estimating the AGB and carbon stocks of the tropical lowland Dipterocarp forest of Ayer Hitam forest reserve, Selangor. The main aim of this investigation is to obtain the relationship between biophysical parameters from field plots and the remotely sensed data using a nonlinear regression model. The results showed a good relationship between crown projection area (CPA) and carbon stocks (CS), with a Pearson correlation coefficient (r) of 0.671 (p < 0.01). The study concluded that integrating Worldview-3 imagery with the LiDAR-based canopy height model (CHM) raster was useful for quantifying the AGB and carbon stocks over a larger sample area of the lowland Dipterocarp forest.
Linear and nonlinear regression techniques for simultaneous and proportional myoelectric control.
Hahne, J M; Biessmann, F; Jiang, N; Rehbaum, H; Farina, D; Meinecke, F C; Muller, K-R; Parra, L C
2014-03-01
In recent years the number of active controllable joints in electrically powered hand-prostheses has increased significantly. However, the control strategies for these devices in current clinical use are inadequate as they require separate and sequential control of each degree-of-freedom (DoF). In this study we systematically compare linear and nonlinear regression techniques for an independent, simultaneous and proportional myoelectric control of wrist movements with two DoF. These techniques include linear regression, mixture of linear experts (ME), multilayer-perceptron, and kernel ridge regression (KRR). They are investigated offline with electro-myographic signals acquired from ten able-bodied subjects and one person with congenital upper limb deficiency. The control accuracy is reported as a function of the number of electrodes and the amount and diversity of training data, providing guidance for the requirements in clinical practice. The results showed that KRR, a nonparametric statistical learning method, outperformed the other methods. However, simple transformations in the feature space could linearize the problem, so that linear models could achieve performance similar to KRR at much lower computational cost. In particular ME, a physiologically inspired extension of linear regression, represents a promising candidate for the next generation of prosthetic devices. PMID:24608685
NASA Astrophysics Data System (ADS)
Deglint, Jason; Kazemzadeh, Farnoud; Wong, Alexander; Clausi, David A.
2015-09-01
One method to acquire multispectral images is to sequentially capture a series of images where each image contains information from a different bandwidth of light. Another method is to use a series of beamsplitters and dichroic filters to guide different bandwidths of light onto different cameras. However, these methods are very time consuming and expensive and perform poorly in dynamic scenes or when observing transient phenomena. An alternative strategy to capturing multispectral data is to infer this data using sparse spectral reflectance measurements captured using an imaging device with overlapping bandpass filters, such as a consumer digital camera using a Bayer filter pattern. Currently the only method of inferring dense reflectance spectra is the Wiener adaptive filter, which makes Gaussian assumptions about the data. However, these assumptions may not always hold true for all data. We propose a new technique to infer dense reflectance spectra from sparse spectral measurements through the use of a non-linear regression model. The non-linear regression model used in this technique is the random forest model, which is an ensemble of decision trees and trained via the spectral characterization of the optical imaging system and spectral data pair generation. This model is then evaluated by spectrally characterizing different patches on the Macbeth color chart, as well as by reconstructing inferred multispectral images. Results show that the proposed technique can produce inferred dense reflectance spectra that correlate well with the true dense reflectance spectra, which illustrates the merits of the technique.
Coons, D M; Boulton, R B; Bisson, L F
1995-01-01
The kinetics of glucose uptake in Saccharomyces cerevisiae are complex. An Eadie-Hofstee (rate of uptake versus rate of uptake over substrate concentration) plot of glucose uptake shows a nonlinear form typical of a multicomponent system. The nature of the constituent components is a subject of debate. It has recently been suggested that this nonlinearity is due to either a single saturable component together with free diffusion of glucose or a single constitutive component with a variable Km, rather than the action of multiple hexose transporters. Genetic data support the existence of a family of differentially regulated glucose transporters, encoded by the HXT genes. In this work, kinetic expressions and nonlinear regression analysis, based on an improved zero trans-influx assay, were used to address the nature of the components of the transport system. The results indicate that neither one component with free diffusion nor a single permease with a variable Km can explain the observed uptake rates. Results of uptake experiments, including the use of putative alternative substrates as inhibitory compounds, support the model derived from genetic analyses of a multicomponent system with at least two components, one a high-affinity carrier and the other a low-affinity carrier. This approach was extended to characterize the activity of the SNF3 protein and identify its role in the depression of high-affinity uptake. The kinetic data support a role of SNF3 as a regulatory protein that may not itself be a transporter. PMID:7768825
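As a rough illustration of why an Eadie-Hofstee plot can separate a one-carrier from a two-carrier hypothesis, the sketch below computes the plot coordinates (v/S against v) for a single Michaelis-Menten component and for the sum of two. The function names and every kinetic constant are hypothetical values chosen for illustration, not figures from the study, and the two-term Michaelis-Menten sum is only an assumed textbook form of a high-affinity plus low-affinity system.

```python
def uptake(s, components):
    """Total uptake rate at substrate concentration s.

    components is a list of (vmax, km) pairs, one per carrier.
    """
    return sum(vmax * s / (km + s) for vmax, km in components)


def eadie_hofstee(concs, components):
    """Return (v / s, v) points, the two axes of an Eadie-Hofstee plot."""
    return [(uptake(s, components) / s, uptake(s, components)) for s in concs]


def segment_slopes(points):
    """Slopes between consecutive points of the plot."""
    return [(v2 - v1) / (x2 - x1)
            for (x1, v1), (x2, v2) in zip(points, points[1:])]


concs = [0.5, 1.0, 2.0, 5.0, 10.0, 50.0, 100.0]   # hypothetical concentrations
one_carrier = [(10.0, 1.5)]                        # hypothetical (Vmax, Km)
two_carriers = [(10.0, 1.5), (25.0, 20.0)]         # high- and low-affinity

# One carrier: v = Vmax - Km * (v / s), so every slope equals -Km.
print(segment_slopes(eadie_hofstee(concs, one_carrier)))
# Two carriers: the slope changes along the curve, so the plot bends.
print(segment_slopes(eadie_hofstee(concs, two_carriers)))
```

A straight line therefore points to a single saturable component, while curvature is what nonlinear regression on a multi-term rate expression is meant to resolve.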
NONLINEAR-REGRESSION GROUNDWATER FLOW MODELING OF A DEEP REGIONAL AQUIFER SYSTEM.
Cooley, Richard L.; Konikow, Leonard F.; Naff, Richard L.
1986-01-01
A nonlinear regression groundwater flow model, based on a Galerkin finite-element discretization, was used to analyze steady state two-dimensional groundwater flow in the areally extensive Madison aquifer in a 75,000 mi² area of the Northern Great Plains. Regression parameters estimated include intrinsic permeabilities of the main aquifer and separate lineament zones, discharges from eight major springs surrounding the Black Hills, and specified heads on the model boundaries. Aquifer thickness and temperature variations were included as specified functions. The regression model was applied using sequential F testing so that the fewest number and simplest zonation of intrinsic permeabilities, combined with the simplest overall model, were evaluated initially; additional complexities (such as subdivisions of zones and variations in temperature and thickness) were added in stages to evaluate the subsequent degree of improvement in the model results. It was found that only the eight major springs, a single main aquifer intrinsic permeability, two separate lineament intrinsic permeabilities of much smaller values, and temperature variations are warranted by the observed data (hydraulic heads and prior information on some parameters) for inclusion in a model that attempts to explain significant controls on groundwater flow.
Nonlinear regression modeling of nutrient loads in streams: A Bayesian approach
Qian, S.S.; Reckhow, K.H.; Zhai, J.; McMahon, G.
2005-01-01
A Bayesian nonlinear regression modeling method is introduced and compared with the least squares method for modeling nutrient loads in stream networks. The objective of the study is to better model spatial correlation in river basin hydrology and land use for improving the model as a forecasting tool. The Bayesian modeling approach is introduced in three steps, each with a more complicated model and data error structure. The approach is illustrated using a data set from three large river basins in eastern North Carolina. Results indicate that the Bayesian model better accounts for model and data uncertainties than does the conventional least squares approach. Applications of the Bayesian models for ambient water quality standards compliance and TMDL assessment are discussed. Copyright 2005 by the American Geophysical Union.
A nonlinear regression approach to test for size-dependence of competitive ability.
Lamb, Eric G; Cahill, James F; Dale, Mark R T
2006-06-01
An individual's competitive ability is often dependent on its size, but the methods commonly used to analyze plant competition experiments generally assume that the outcome of interactions are size independent. A method for the analysis of experiments with paired competition treatments based on nonlinear regression with a power function is presented. This method allows straightforward tests of whether a competitive interaction is size dependent, and for the significance of experimental treatments. The method is applied to three example data sets: (1) an experiment where pairs of plants were grown with and without competition at five fertilization levels, (2) an experiment where the fecundity of two snail species were compared between environments at two densities, and (3) an addition series experiment where two plant species were grown in proportional mixtures at several densities. Competitive ability was size-dependent in two of these examples, which demonstrates that a wide range of ecologically important information can be lost when the assumption of size-dependence is ignored. Regression with a power curve should always be used to test whether competitive interactions are size independent, and for the further analysis of size-dependent interactions. PMID:16869420
On the use of log-transformation vs. nonlinear regression for analyzing biological power laws
Xiao, X.; White, E.P.; Hooten, M.B.; Durham, S.L.
2011-01-01
Power-law relationships are among the most well-studied functional relationships in biology. Recently the common practice of fitting power laws using linear regression (LR) on log-transformed data has been criticized, calling into question the conclusions of hundreds of studies. It has been suggested that nonlinear regression (NLR) is preferable, but no rigorous comparison of these two methods has been conducted. Using Monte Carlo simulations, we demonstrate that the error distribution determines which method performs better, with NLR better characterizing data with additive, homoscedastic, normal error and LR better characterizing data with multiplicative, heteroscedastic, lognormal error. Analysis of 471 biological power laws shows that both forms of error occur in nature. While previous analyses based on log-transformation appear to be generally valid, future analyses should choose methods based on a combination of biological plausibility and analysis of the error distribution. We provide detailed guidelines and associated computer code for doing so, including a model averaging approach for cases where the error structure is uncertain. © 2011 by the Ecological Society of America.
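The two fitting routes contrasted in this abstract can be sketched side by side. The code below is not the authors' published code: it is a minimal illustration that fits y = a * x**b once by least squares on log-transformed data (LR) and once on the original scale (NLR), using a plain Gauss-Newton iteration as an assumed stand-in solver. The function names, starting values, and toy data are all hypothetical.

```python
import math


def fit_loglog(xs, ys):
    """LR: ordinary least squares on (log x, log y) for y = a * x**b."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(xs)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    b = sxy / sxx
    return math.exp(my - b * mx), b


def fit_nls(xs, ys, a, b, iters=100, tol=1e-12):
    """NLR: least squares on the original scale via Gauss-Newton steps."""
    for _ in range(iters):
        resid = [y - a * x ** b for x, y in zip(xs, ys)]
        ja = [x ** b for x in xs]                      # d(model)/da
        jb = [a * x ** b * math.log(x) for x in xs]    # d(model)/db
        # Solve the 2x2 normal equations (J^T J) delta = J^T resid.
        saa = sum(u * u for u in ja)
        sab = sum(u * v for u, v in zip(ja, jb))
        sbb = sum(v * v for v in jb)
        ga = sum(u * r for u, r in zip(ja, resid))
        gb = sum(v * r for v, r in zip(jb, resid))
        det = saa * sbb - sab * sab
        if det == 0.0:
            break
        da = (sbb * ga - sab * gb) / det
        db = (saa * gb - sab * ga) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a, b


xs = [1.0, 2.0, 4.0, 8.0, 16.0]
ys = [2.0 * x ** 0.75 for x in xs]         # noise-free toy data
a_lr, b_lr = fit_loglog(xs, ys)
a_nl, b_nl = fit_nls(xs, ys, a_lr, b_lr)   # start NLR from the LR estimate
print(a_lr, b_lr, a_nl, b_nl)
```

On noise-free data the two routes agree; they differ once noise is added, because LR implicitly assumes multiplicative error and NLR additive error, which is the distinction the abstract draws. Undamped Gauss-Newton can diverge from a poor starting point, so seeding it with the LR estimate is a pragmatic choice here rather than a recommendation from the paper.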
A Nonlinear Adaptive Beamforming Algorithm Based on Least Squares Support Vector Regression
Wang, Lutao; Jin, Gang; Li, Zhengzhou; Xu, Hongbin
2012-01-01
To overcome the performance degradation in the presence of steering vector mismatches, strict restrictions on the number of available snapshots, and numerous interferences, a novel beamforming approach based on nonlinear least-square support vector regression machine (LS-SVR) is derived in this paper. In this approach, the conventional linearly constrained minimum variance cost function used by the minimum variance distortionless response (MVDR) beamformer is replaced by a squared-loss function to increase robustness in complex scenarios and provide additional control over the sidelobe level. Gaussian kernels are also used to obtain better generalization capacity. This novel approach has two highlights: a recursive regression procedure to estimate the weight vectors in real time, and a sparse model with a novelty criterion to reduce the final size of the beamformer. The analysis and simulation tests show that the proposed approach offers better noise suppression capability and achieves a near-optimal signal-to-interference-and-noise ratio (SINR) with a low computational burden, as compared to other recently proposed robust beamforming techniques.
Burger, Divan Aristo; Schall, Robert
2015-01-01
Trials of the early bactericidal activity (EBA) of tuberculosis (TB) treatments assess the decline, during the first few days to weeks of treatment, in colony forming unit (CFU) count of Mycobacterium tuberculosis in the sputum of patients with smear-microscopy-positive pulmonary TB. Profiles over time of CFU data have conventionally been modeled using linear, bilinear, or bi-exponential regression. We propose a new biphasic nonlinear regression model for CFU data that comprises linear and bilinear regression models as special cases and is more flexible than bi-exponential regression models. A Bayesian nonlinear mixed-effects (NLME) regression model is fitted jointly to the data of all patients from a trial, and statistical inference about the mean EBA of TB treatments is based on the Bayesian NLME regression model. The posterior predictive distribution of relevant slope parameters of the Bayesian NLME regression model provides insight into the nature of the EBA of TB treatments; specifically, the posterior predictive distribution allows one to judge whether treatments are associated with monolinear or bilinear decline of log(CFU) count, and whether CFU count initially decreases fast, followed by a slower rate of decrease, or vice versa. PMID:25322214
Technology Transfer Automated Retrieval System (TEKTRAN)
Non-linear regression techniques are used widely to fit weed field emergence patterns to soil microclimatic indices using S-type functions. Artificial neural networks present interesting and alternative features for such modeling purposes. In this work, a univariate hydrothermal-time based Weibull m...
Technology Transfer Automated Retrieval System (TEKTRAN)
Parametric non-linear regression (PNR) techniques commonly are used to develop weed seedling emergence models. Such techniques, however, require statistical assumptions that are difficult to meet. To examine and overcome these limitations, we compared PNR with a nonparametric estimation technique. F...
ERIC Educational Resources Information Center
Strang, Kenneth David
2009-01-01
This paper discusses how a seldom-used statistical procedure, recursive regression (RR), can numerically and graphically illustrate data-driven nonlinear relationships and interaction of variables. This routine falls into the family of exploratory techniques, yet a few interesting features make it a valuable complement to factor analysis and…
Hirohashi, Yoshihiro; Tanaka, Akira; Yoshizawa, Makoto; Sugita, Norihiro; Abe, Makoto; Kato, Tsuyoshi; Shiraishi, Yasuyuki; Miura, Hidekazu; Yambe, Tomoyuki
2016-06-01
Recently, driving methods for synchronizing ventricular assist devices (VADs) with heart rhythm of patients suffering from severe heart failure have been receiving attention. Most of the conventional methods require implanting a sensor for measurement of a signal, such as electrocardiogram, to achieve synchronization. In general, implanting sensors into the cardiovascular system of the patients is undesirable in clinical situations. The objective of this study was to extract the heartbeat component without any additional sensors, and to synchronize the rotational speed of the VAD with this component. Although signals from the VAD such as the consumption current and the rotational speed are affected by heartbeat, these raw signals cannot be utilized directly in the heartbeat synchronization control methods because they are changed by not only the effect of heartbeat but also the change in the rotational speed itself. In this study, a nonlinear kernel regression model was adopted to estimate the instantaneous rotational speed from the raw signals. The heartbeat component was extracted by computing the estimation error of the model with parameters determined by using the signals when there was no effect of heartbeat. Validations were conducted on a mock circulatory system, and the heartbeat component was extracted well by the proposed method. Also, heartbeat synchronization control was achieved without any additional sensors in the test environment. PMID:26758256
A fast nonlinear regression method for estimating permeability in CT perfusion imaging
Bennink, Edwin; Riordan, Alan J; Horsch, Alexander D; Dankbaar, Jan Willem; Velthuis, Birgitta K; de Jong, Hugo W
2013-01-01
Blood–brain barrier damage, which can be quantified by measuring vascular permeability, is a potential predictor for hemorrhagic transformation in acute ischemic stroke. Permeability is commonly estimated by applying Patlak analysis to computed tomography (CT) perfusion data, but this method lacks precision. Applying more elaborate kinetic models by means of nonlinear regression (NLR) may improve precision, but is more time consuming and therefore less appropriate in an acute stroke setting. We propose a simplified NLR method that may be faster and still precise enough for clinical use. The aim of this study is to evaluate the reliability of in total 12 variations of Patlak analysis and NLR methods, including the simplified NLR method. Confidence intervals for the permeability estimates were evaluated using simulated CT attenuation–time curves with realistic noise, and clinical data from 20 patients. Although fixating the blood volume improved Patlak analysis, the NLR methods yielded significantly more reliable estimates, but took up to 12 × longer to calculate. The simplified NLR method was ∼4 × faster than other NLR methods, while maintaining the same confidence intervals (CIs). In conclusion, the simplified NLR method is a new, reliable way to estimate permeability in stroke, fast enough for clinical application in an acute stroke setting. PMID:23881247
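For orientation, Patlak analysis in its usual textbook form is a straight-line fit: dividing the tissue curve and the running integral of the arterial curve by the arterial curve gives points whose slope is taken as the permeability (transfer) constant K and whose intercept is taken as the blood volume V. The sketch below assumes that standard model, tissue(t) = K * integral of arterial + V * arterial(t). It is not the authors' implementation, it does not reproduce their simplified NLR method, and the function names, time grid, and curve values are hypothetical.

```python
def cumulative_trapezoid(ts, ys):
    """Running integral of ys over ts by the trapezoidal rule."""
    total, out = 0.0, [0.0]
    for i in range(1, len(ts)):
        total += 0.5 * (ys[i] + ys[i - 1]) * (ts[i] - ts[i - 1])
        out.append(total)
    return out


def patlak_fit(ts, arterial, tissue):
    """Return (K, V): slope and intercept of the Patlak plot.

    Assumes every arterial sample is nonzero.
    """
    integral = cumulative_trapezoid(ts, arterial)
    xs = [integral[i] / arterial[i] for i in range(len(ts))]
    ys = [tissue[i] / arterial[i] for i in range(len(ts))]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


# Synthetic, noise-free curves (hypothetical units and values).
ts = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
arterial = [10.0, 20.0, 30.0, 25.0, 20.0, 15.0, 12.0, 10.0]
true_k, true_v = 0.02, 0.05
area = cumulative_trapezoid(ts, arterial)
tissue = [true_k * a + true_v * c for a, c in zip(area, arterial)]
k_hat, v_hat = patlak_fit(ts, arterial, tissue)
print(k_hat, v_hat)
```

With noisy clinical curves the division by the arterial signal amplifies noise at low enhancement, which is one plausible reason a linearized fit of this kind can lose precision relative to fitting a kinetic model directly.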
De Mello, Fernanda; Oliveira, Carlos A L; Ribeiro, Ricardo P; Resende, Emiko K; Povh, Jayme A; Fornari, Darci C; Barreto, Rogério V; McManus, Concepta; Streit, Danilo
2015-01-01
The growth pattern of female and male tambaqui was evaluated using the Gompertz nonlinear regression model. Five traits of economic importance were measured on 145 animals over three years, giving a total of 981 morphometric records analyzed. Separate curves were fitted for males and females for body weight, height and head length, and a single curve was fitted for width and body length. The asymptotic weight (a) and relative growth rate to maturity (k) differed between sexes in animals of ± 5 kg, the slaughter weight used by a specific, very profitable niche market. However, there was no difference between males and females up to ± 2 kg, the slaughter weight established to supply the larger consumer market. Females were heavier than males (± 280 g) and are therefore more suitable for fish farming aimed at the niche market for larger animals. In general, males had a lower maximum growth rate (8.66 g / day) than females (9.34 g / day) but reached it sooner, at 476 and 486 days, respectively. Height and body length were the traits that contributed most to weight at 516 days (P < 0.001). PMID:26628036
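The quantities reported here (asymptotic weight, maximum growth rate, and the age at which it is reached) follow directly from the Gompertz curve once its parameters are known. The sketch below assumes the common parameterization W(t) = a * exp(-b * exp(-k * t)); the abstract does not state which form the authors used, and the parameter values are hypothetical rather than estimates from the study.

```python
import math


def gompertz(t, a, b, k):
    """Weight at age t: a * exp(-b * exp(-k * t))."""
    return a * math.exp(-b * math.exp(-k * t))


def inflection_age(b, k):
    """Age at which the growth rate peaks (second derivative is zero)."""
    return math.log(b) / k


def max_growth_rate(a, k):
    """Growth rate at the inflection point, where weight equals a / e."""
    return a * k / math.e


# Hypothetical parameters, not estimates from the study.
a, b, k = 6000.0, 8.0, 0.004      # grams, dimensionless, per day
t_star = inflection_age(b, k)
print(t_star, gompertz(t_star, a, b, k), max_growth_rate(a, k))
```

Under this form, a larger asymptotic weight raises the maximum growth rate proportionally, while the age at peak growth depends only on b and k, so one sex can have the lower peak rate and still reach it earlier.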
Xue, Hongqi; Wu, Yichao; Wu, Hulin
2013-01-01
In many regression problems, the relations between the covariates and the response may be nonlinear. Motivated by the application of reconstructing a gene regulatory network, we consider a sparse high-dimensional additive model with the additive components being some known nonlinear functions with unknown parameters. To identify the subset of important covariates, we propose a new method for simultaneous variable selection and parameter estimation by iteratively combining a large-scale variable screening (the nonlinear independence screening, NLIS) and a moderate-scale model selection (the nonnegative garrote, NNG) for the nonlinear additive regressions. We have shown that the NLIS procedure possesses the sure screening property and it is able to handle problems with non-polynomial dimensionality; and for finite dimension problems, the NNG for the nonlinear additive regressions has selection consistency for the unimportant covariates and also estimation consistency for the parameter estimates of the important covariates. The proposed method is applied to simulated data and a real data example for identifying gene regulations to illustrate its numerical performance. PMID:25170239
Yobbi, D.K.
2000-01-01
A nonlinear least-squares regression technique for estimation of ground-water flow model parameters was applied to an existing model of the regional aquifer system underlying west-central Florida. The regression technique minimizes the differences between measured and simulated water levels. Regression statistics, including parameter sensitivities and correlations, were calculated for reported parameter values in the existing model. Optimal parameter values for selected hydrologic variables of interest are estimated by nonlinear regression. Optimal estimates of parameter values are about 140 times greater than and about 0.01 times less than reported values. Independently estimating all parameters by nonlinear regression was impossible, given the existing zonation structure and number of observations, because of parameter insensitivity and correlation. Although the model yields parameter values similar to those estimated by other methods and reproduces the measured water levels reasonably accurately, a simpler parameter structure should be considered. Some possible ways of improving model calibration are to: (1) modify the defined parameter-zonation structure by omitting and/or combining parameters to be estimated; (2) carefully eliminate observation data based on evidence that they are likely to be biased; (3) collect additional water-level data; (4) assign values to insensitive parameters, and (5) estimate the most sensitive parameters first, then, using the optimized values for these parameters, estimate the entire data set.
Kleinman, Lawrence C; Norton, Edward C
2009-01-01
Objective: To develop and validate a general method (called regression risk analysis) to estimate adjusted risk measures from logistic and other nonlinear multiple regression models. We show how to estimate standard errors for these estimates. These measures could supplant various approximations (e.g., adjusted odds ratio [AOR]) that may diverge, especially when outcomes are common. Study Design: Regression risk analysis estimates were compared with internal standards as well as with Mantel–Haenszel estimates, Poisson and log-binomial regressions, and a widely used (but flawed) equation to calculate adjusted risk ratios (ARR) from AOR. Data Collection: Data sets produced using Monte Carlo simulations. Principal Findings: Regression risk analysis accurately estimates ARR and differences directly from multiple regression models, even when confounders are continuous, distributions are skewed, outcomes are common, and effect size is large. It is statistically sound and intuitive, and has properties favoring it over other methods in many cases. Conclusions: Regression risk analysis should be the new standard for presenting findings from multiple regression analysis of dichotomous outcomes for cross-sectional, cohort, and population-based case–control studies, particularly when outcomes are common or effect size is large. PMID:18793213
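One common way to obtain adjusted risk ratios and risk differences from a fitted logistic model is to average predicted risks over the sample with the exposure set to 1 for everyone, then to 0 for everyone. The sketch below illustrates that idea only; whether it matches the authors' procedure in detail is an assumption, it omits the standard-error calculation the abstract mentions, and the coefficients, covariate values, and function names are hypothetical.

```python
import math


def expit(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def standardized_risks(intercept, b_exposure, b_covariate, covariates):
    """Average predicted risk with everyone exposed, then everyone unexposed."""
    n = len(covariates)
    exposed = sum(expit(intercept + b_exposure + b_covariate * c)
                  for c in covariates) / n
    unexposed = sum(expit(intercept + b_covariate * c)
                    for c in covariates) / n
    return exposed, unexposed


# Hypothetical fitted coefficients and covariate values.
intercept, b_exposure, b_covariate = 0.0, math.log(2.0), 0.5
covariates = [-1.0, -0.5, 0.0, 0.5, 1.0]
r1, r0 = standardized_risks(intercept, b_exposure, b_covariate, covariates)
print("adjusted risk ratio:", r1 / r0)
print("adjusted risk difference:", r1 - r0)
print("adjusted odds ratio:", math.exp(b_exposure))
```

Because the baseline risk in this toy example is high, the risk ratio comes out well below the odds ratio, which is the divergence for common outcomes that the abstract warns about.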
A non-linear regression method for CT brain perfusion analysis
NASA Astrophysics Data System (ADS)
Bennink, E.; Oosterbroek, J.; Viergever, M. A.; Velthuis, B. K.; de Jong, H. W. A. M.
2015-03-01
CT perfusion (CTP) imaging allows for rapid diagnosis of ischemic stroke. Generation of perfusion maps from CTP data usually involves deconvolution algorithms providing estimates for the impulse response function in the tissue. We propose the use of a fast non-linear regression (NLR) method that we postulate has similar performance to the current academic state-of-art method (bSVD), but that has some important advantages, including the estimation of vascular permeability, improved robustness to tracer-delay, and very few tuning parameters, that are all important in stroke assessment. The aim of this study is to evaluate the fast NLR method against bSVD and a commercial clinical state-of-art method. The three methods were tested against a published digital perfusion phantom earlier used to illustrate the superiority of bSVD. In addition, the NLR and clinical methods were also tested against bSVD on 20 clinical scans. Pearson correlation coefficients were calculated for each of the tested methods. All three methods showed high correlation coefficients (>0.9) with the ground truth in the phantom. With respect to the clinical scans, the NLR perfusion maps showed higher correlation with bSVD than the perfusion maps from the clinical method. Furthermore, the perfusion maps showed that the fast NLR estimates are robust to tracer-delay. In conclusion, the proposed fast NLR method provides a simple and flexible way of estimating perfusion parameters from CT perfusion scans, with high correlation coefficients. This suggests that it could be a better alternative to the current clinical and academic state-of-art methods.
NASA Astrophysics Data System (ADS)
Lu, Lin; Chang, Yunlong; Li, Yingmin; He, Youyou
2013-05-01
A transverse magnetic field was introduced to the arc plasma in the process of welding stainless steel tubes by high-speed Tungsten Inert Gas Arc Welding (TIG for short) without filler wire. The influence of the external magnetic field on welding quality was investigated. Nine sets of parameters were designed by means of an orthogonal experiment. The tensile strength of the welded joint and the form factor of the weld were regarded as the main standards of welding quality. A binary quadratic nonlinear regression equation was established with magnetic induction and Ar gas flow rate as the variables. The residual standard deviation was calculated to assess the accuracy of the regression model. The results showed that the regression model was correct and effective in calculating the tensile strength and aspect ratio of the weld. Two 3D regression models were designed, and the effect of magnetic induction on welding quality was then studied.
Comparison of Linear and Non-Linear Regression Models to Estimate Leaf Area Index of Dryland Shrubs.
NASA Astrophysics Data System (ADS)
Dashti, H.; Glenn, N. F.; Ilangakoon, N. T.; Mitchell, J.; Dhakal, S.; Spaete, L.
2015-12-01
Leaf area index (LAI) is a key parameter in global ecosystem studies. LAI is considered a forcing variable in land surface processing models since ecosystem dynamics are highly correlated to LAI. In response to environmental limitations, plants in semiarid ecosystems have smaller leaf area, making accurate estimation of LAI by remote sensing a challenging issue. Optical remote sensing (400-2500 nm) techniques to estimate LAI are based either on radiative transfer models (RTMs) or statistical approaches. Considering the complex radiation field of dry ecosystems, simple 1-D RTMs lead to poor results, and on the other hand, inversion of more complex 3-D RTMs is a demanding task which requires the specification of many variables. A good alternative to physical approaches is using methods based on statistics. Similar to many natural phenomena, there is a non-linear relationship between LAI and top of canopy electromagnetic waves reflected to optical sensors. Non-linear regression models can better capture this relationship. However, considering the problem of a few numbers of observations in comparison to the feature space (n
non-linear regression techniques were investigated to estimate LAI. Our study area is located in southwestern Idaho, Great Basin. Sagebrush (Artemisia tridentata spp) serves a critical role in maintaining the structure of this ecosystem. Using a leaf area meter (Accupar LP-80), LAI values were measured in the field. Linear Partial Least Squares regression and non-linear, tree-based Random Forest regression have been implemented to estimate the LAI of sagebrush from hyperspectral data (AVIRIS-ng) collected in late summer 2014. Cross validation of results indicates that PLS can provide comparable results to Random Forest.
CONFIDENCE INTERVALS FOR A CROP YIELD LOSS FUNCTION IN NONLINEAR REGRESSION
Quantifying the relationship between chronic pollutant exposure and the ensuing biological response requires consideration of nonlinear functions that are flexible enough to generate a wide range of response curves. The linear approximation (i.e., Wald's) interval estimates for oz...
Nonlinear regression-based method for pseudoenhancement correction in CT colonography.
Tsagaan, Baigalmaa; Näppi, Janne; Yoshida, Hiroyuki
2009-08-01
In CT colonography (CTC), orally administered positive-contrast tagging agents are often used for differentiating residual bowel contents from native colonic structures. However, tagged materials can sometimes hyperattenuate observed CT numbers of their adjacent untagged materials. Such pseudoenhancement complicates the differentiation of colonic soft-tissue structures from tagged materials, because pseudoenhanced colonic structures may have CT numbers that are similar to those of tagged materials. The authors developed a nonlinear regression-based (NLRB) method for performing a local image-based pseudoenhancement correction of CTC data. To calibrate the correction parameters, the CT data of an anthropomorphic reference phantom were correlated with those of partially tagged phantoms. The CTC data were registered spatially by use of an adaptive multiresolution method, and untagged and tagged partial-volume soft-tissue surfaces were correlated by use of a virtual tagging scheme. The NLRB method was then optimized to minimize the difference in the CT numbers of soft-tissue regions between the untagged and tagged phantom CTC data by use of the Nelder-Mead downhill simplex method. To validate the method, the CT numbers of untagged regions were compared with those of registered pseudoenhanced phantom regions before and after the correction. The CT numbers were significantly different before performing the correction (p<0.01), whereas, after the correction, the difference between the CT numbers was not significant. The effect of the correction was also tested on the size measurement of polyps that were covered by tagging in phantoms and in clinical cases. In phantom cases, before the correction, the diameters of 12 simulated polyps submerged in tagged fluids that were measured in a soft-tissue CT display were significantly different from those measured in an untagged phantom (p<0.01), whereas after the correction the difference was not significant. In clinical cases
Ding, Yongsheng; Cheng, Lijun; Pedrycz, Witold; Hao, Kuangrong
2015-10-01
A new global nonlinear predictor with a particle swarm-optimized interval support vector regression (PSO-ISVR) is proposed to address three issues (viz., kernel selection, model optimization, kernel method speed) encountered when applying SVR in the presence of large data sets. The novel prediction model can reduce the SVR computing overhead by dividing input space and adaptively selecting the optimized kernel functions to obtain optimal SVR parameter by PSO. To quantify the quality of the predictor, its generalization performance and execution speed are investigated based on statistical learning theory. In addition, experiments using synthetic data as well as the stock volume weighted average price are reported to demonstrate the effectiveness of the developed models. The experimental results show that the proposed PSO-ISVR predictor can improve the computational efficiency and the overall prediction accuracy compared with the results produced by the SVR and other regression methods. The proposed PSO-ISVR provides an important tool for nonlinear regression analysis of big data. PMID:25974954
NASA Astrophysics Data System (ADS)
Lima, Aranildo R.; Cannon, Alex J.; Hsieh, William W.
2013-01-01
A hybrid algorithm combining support vector regression with evolutionary strategy (SVR-ES) is proposed for predictive models in the environmental sciences. SVR-ES uses uncorrelated mutation with p step sizes to find the optimal SVR hyper-parameters. Three environmental forecast datasets used in the WCCI-2006 contest - surface air temperature, precipitation and sulphur dioxide concentration - were tested. We used multiple linear regression (MLR) as benchmark and a variety of machine learning techniques including bootstrap-aggregated ensemble artificial neural network (ANN), SVR-ES, SVR with hyper-parameters given by the Cherkassky-Ma estimate, the M5 regression tree, and random forest (RF). We also tested all techniques using stepwise linear regression (SLR) first to screen out irrelevant predictors. We concluded that SVR-ES is an attractive approach because it tends to outperform the other techniques and can also be implemented in an almost automatic way. The Cherkassky-Ma estimate is a useful approach for minimizing the mean absolute error and saving computational time related to the hyper-parameter search. The ANN and RF are also good options to outperform multiple linear regression (MLR). Finally, the use of SLR for predictor selection can dramatically reduce computational time and often help to enhance accuracy.
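The "uncorrelated mutation with p step sizes" named above can be sketched in its standard textbook form, in which every search coordinate carries its own self-adapted step size. This is an assumption about the general technique, not the authors' code; the three log-scale hyper-parameters and the learning-rate constants are illustrative choices.

```python
import math
import random


def mutate(x, sigma, rng, eps=1e-8):
    """One uncorrelated mutation with one step size per coordinate.

    x:     current point (e.g. log-scaled SVR hyper-parameters, hypothetical)
    sigma: per-coordinate step sizes, same length as x
    Returns a new (point, step sizes) pair; the inputs are left unchanged.
    """
    p = len(x)
    tau_global = 1.0 / math.sqrt(2.0 * p)             # shared learning rate
    tau_local = 1.0 / math.sqrt(2.0 * math.sqrt(p))   # per-coordinate learning rate
    shared = rng.gauss(0.0, 1.0)                      # one draw shared by all coordinates
    # Step sizes are mutated first (log-normal update), floored at eps.
    new_sigma = [
        max(eps, s * math.exp(tau_global * shared + tau_local * rng.gauss(0.0, 1.0)))
        for s in sigma
    ]
    # The point is then perturbed with the freshly mutated step sizes.
    new_x = [xi + si * rng.gauss(0.0, 1.0) for xi, si in zip(x, new_sigma)]
    return new_x, new_sigma


rng = random.Random(0)
log_params = [0.0, -1.0, -2.0]   # hypothetical log10 hyper-parameter values
step_sizes = [0.5, 0.5, 0.5]
child, child_steps = mutate(log_params, step_sizes, rng)
print(child, child_steps)
```

In a full evolutionary strategy, each child would be scored (for example by a validation error of the SVR trained with those hyper-parameters) and the best candidates kept for the next generation.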
Tiedeman, C.R.; Kernodle, J.M.; McAda, D.P.
1998-01-01
This report documents the application of nonlinear-regression methods to a numerical model of ground-water flow in the Albuquerque Basin, New Mexico. In the Albuquerque Basin, ground water is the primary source for most water uses. Ground-water withdrawal has steadily increased since the 1940's, resulting in large declines in water levels in the Albuquerque area. A ground-water flow model was developed in 1994 and revised and updated in 1995 for the purpose of managing basin ground-water resources. In the work presented here, nonlinear-regression methods were applied to a modified version of the previous flow model. Goals of this work were to use regression methods to calibrate the model with each of six different configurations of the basin subsurface and to assess and compare optimal parameter estimates, model fit, and model error among the resulting calibrations. The Albuquerque Basin is one in a series of north trending structural basins within the Rio Grande Rift, a region of Cenozoic crustal extension. Mountains, uplifts, and fault zones bound the basin, and rock units within the basin include pre-Santa Fe Group deposits, Tertiary Santa Fe Group basin fill, and post-Santa Fe Group volcanics and sediments. The Santa Fe Group is greater than 14,000 feet (ft) thick in the central part of the basin. During deposition of the Santa Fe Group, crustal extension resulted in development of north trending normal faults with vertical displacements of as much as 30,000 ft. Ground-water flow in the Albuquerque Basin occurs primarily in the Santa Fe Group and post-Santa Fe Group deposits. Water flows between the ground-water system and surface-water bodies in the inner valley of the basin, where the Rio Grande, a network of interconnected canals and drains, and Cochiti Reservoir are located. Recharge to the ground-water flow system occurs as infiltration of precipitation along mountain fronts and infiltration of stream water along tributaries to the Rio Grande; subsurface
Creating a non-linear total sediment load formula using polynomial best subset regression model
NASA Astrophysics Data System (ADS)
Okcu, Davut; Pektas, Ali Osman; Uyumaz, Ali
2016-08-01
The aim of this study is to derive a new total sediment load formula that is more accurate and has fewer application constraints than the well-known formulae in the literature. The five best-known stream-power-concept sediment formulae approved by ASCE are used for benchmarking on a wide range of datasets that includes both field and flume (lab) observations. The dimensionless parameters of these widely used formulae are used as inputs in a new regression approach, called polynomial best subset regression (PBSR) analysis. The aim of the PBSR analysis is to fit and test all possible combinations of the input variables and select the best subset. All input variables, together with their second and third powers, are included in the regression to test the possible relation between the explanatory variables and the dependent variable. The best subset is selected with a multistep approach that depends on significance values and on the degree of multicollinearity of the inputs. The new formula is compared to the others on a holdout dataset, and detailed performance investigations are conducted for the field and lab datasets within this holdout data. Different goodness-of-fit statistics are used, as they represent different perspectives on model accuracy. After detailed comparisons, we identified the most accurate equation that is also applicable to both flume and river data. On the field dataset in particular, the proposed formula outperformed the benchmark formulations.
De la Cruz, Rolando; Meza, Cristian; Arribas-Gil, Ana; Carroll, Raymond J.
2016-01-01
Joint models for a wide class of response variables and longitudinal measurements consist of a mixed-effects model to fit longitudinal trajectories whose random effects enter as covariates in a generalized linear model for the primary response. They provide a useful way to assess association between these two kinds of data, which in clinical studies are often collected jointly on a series of individuals and may help in understanding, for instance, the mechanisms of recovery from a certain disease or the efficacy of a given therapy. When a nonlinear mixed-effects model is used to fit the longitudinal trajectories, the existing estimation strategies based on likelihood approximations have been shown to exhibit some computational efficiency problems (De la Cruz et al., 2011). In this article we consider a Bayesian estimation procedure for the joint model with a nonlinear mixed-effects model for the longitudinal data and a generalized linear model for the primary response. The proposed prior structure allows for the implementation of an MCMC sampler. Moreover, we consider that the errors in the longitudinal model may be correlated. We apply our method to the analysis of hormone levels measured at the early stages of pregnancy that can be used to predict normal versus abnormal pregnancy outcomes. We also conduct a simulation study to assess the importance of modelling correlated errors and quantify the consequences of model misspecification. PMID:27274601
Lee, Seunghak; Lozano, Aurélie; Kambadur, Prabhanjan; Xing, Eric P
2016-05-01
Genome-wide association studies have revealed individual genetic variants associated with phenotypic traits such as disease risk and gene expressions. However, detecting pairwise interaction effects of genetic variants on traits still remains a challenge due to a large number of combinations of variants (∼10^11 SNP pairs in the human genome), and relatively small sample sizes (typically <10^4). Despite recent breakthroughs in detecting interaction effects, there are still several open problems, including: (1) how to quickly process a large number of SNP pairs, (2) how to distinguish between true signals and SNPs/SNP pairs merely correlated with true signals, (3) how to detect nonlinear associations between SNP pairs and traits given small sample sizes, and (4) how to control false positives. In this article, we present a unified framework, called SPHINX, which addresses the aforementioned challenges. We first propose a piecewise linear model for interaction detection, because it is simple enough to estimate model parameters given small sample sizes but complex enough to capture nonlinear interaction effects. Then, based on the piecewise linear model, we introduce randomized group lasso under stability selection, and a screening algorithm to address the statistical and computational challenges mentioned above. In our experiments, we first demonstrate that SPHINX achieves better power than existing methods for interaction detection under false positive control. We further applied SPHINX to a late-onset Alzheimer's disease dataset, and report 16 SNPs and 17 SNP pairs associated with gene traits. We also present a highly scalable implementation of our screening algorithm, which can screen ∼118 billion candidates of associations on a 60-node cluster in <5.5 hours. PMID:27159633
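The scale quoted in the abstract above can be checked with a little arithmetic. The sketch below counts unordered pairs as n(n-1)/2 and converts the reported screening figures into a rough throughput; the function names are hypothetical, and because the abstract gives only approximate bounds, the rates are lower bounds rather than measured values.

```python
import math


def n_pairs(n_variants):
    """Number of unordered pairs among n_variants items."""
    return n_variants * (n_variants - 1) // 2


def variants_for_pairs(target_pairs):
    """Smallest n whose pair count reaches target_pairs."""
    # Start at or below the root of n(n-1)/2 = target, then step up if needed.
    n = (1 + math.isqrt(1 + 8 * target_pairs)) // 2
    while n_pairs(n) < target_pairs:
        n += 1
    return n


# Roughly how many variants give about 10^11 pairs.
n_needed = variants_for_pairs(10**11)

# Figures taken from the abstract: ~118 billion candidates, 60 nodes, < 5.5 hours.
candidates = 118e9
seconds = 5.5 * 3600
cluster_rate = candidates / seconds   # candidates per second, whole cluster
node_rate = cluster_rate / 60         # candidates per second, per node

print(n_needed, round(cluster_rate), round(node_rate))
```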
O'Reilly, S; Riveros, M C
1994-01-01
A second degree equation fitted by nonlinear regression for the analysis of the pH effect on enzyme activity is proposed for diprotic enzyme systems. This method allows the calculation of two molecular dissociation constants (KE1 and KE2 for the free enzyme, KES1 and KES2 for the ES complex) and the pH independent parameters (Vmax and Vmax/Km). The method is validated by bibliographic (alpha-chymotrypsin) and experimental data (almond beta-D-glucosidase). No significant differences were found between present data and those previously reported in the literature using similar experimental conditions. This method works using comparatively few [H+] concentration values within a narrow pH range, preferentially around the optimum, being adequate for diprotic systems with close pKa values. PMID:8728828
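As a rough illustration of the kind of pH dependence such a fit describes, the sketch below uses the textbook bell-shaped rate law for a diprotic system, v = V / (1 + [H+]/K1 + K2/[H+]), whose denominator is second degree in [H+] once multiplied through. This form, the function names and the pK values are assumptions for illustration and may differ from the equation actually fitted by the authors.

```python
def rate(pH, v_lim, pK1, pK2):
    """Bell-shaped rate for a diprotic system (textbook form, an assumption).

    v_lim: pH-independent limiting rate
    pK1:   acidic-limb dissociation constant (pK1 < pK2)
    pK2:   basic-limb dissociation constant
    """
    h = 10.0 ** (-pH)
    k1 = 10.0 ** (-pK1)
    k2 = 10.0 ** (-pK2)
    return v_lim / (1.0 + h / k1 + k2 / h)


def optimum_pH(pK1, pK2):
    """pH at which the bell curve peaks: the midpoint of the two pK values."""
    return 0.5 * (pK1 + pK2)


# Hypothetical close pK values, as in the narrow-range case the abstract mentions.
pK1, pK2 = 6.5, 7.5
best = optimum_pH(pK1, pK2)
for pH in (best - 0.5, best, best + 0.5):
    print(pH, round(rate(pH, 1.0, pK1, pK2), 4))
```

With close pK values the observed maximum stays well below the limiting rate, which is one reason a regression over the whole curve is more informative than reading the peak height alone.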
NASA Astrophysics Data System (ADS)
Hussain, Mirza Zahid; Li, Fuguo; Wang, Jing; Yuan, Zhanwei; Li, Pan; Wu, Tao
2015-07-01
The present study comprises the determination of constitutive relationship for thermo-mechanical processing of INCONEL 718 through double multivariate nonlinear regression, a newly developed approach which not only considers the effect of strain, strain rate, and temperature on flow stress but also explains the interaction effect of these thermo-mechanical parameters on flow behavior of the alloy. Hot isothermal compression experiments were performed on Gleeble-3500 thermo-mechanical testing machine in the temperature range of 1153 to 1333 K within the strain rate range of 0.001 to 10 s⁻¹. The deformation behavior of INCONEL 718 is analyzed and summarized by establishing the high temperature deformation constitutive equation. The calculated correlation coefficient (R) and average absolute relative error (AARE) underline the precision of proposed constitutive model.
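The average absolute relative error mentioned above is commonly defined as the mean of |measured - predicted| / |measured|, expressed as a percentage. The sketch below assumes that common definition; the flow-stress values are invented and the function name is a hypothetical helper.

```python
def aare_percent(measured, predicted):
    """Average absolute relative error, in percent (common definition, assumed)."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(abs((m - p) / m) for m, p in zip(measured, predicted))
    return 100.0 * total / len(measured)


# Invented flow-stress values in MPa (not data from the study).
measured = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
print(aare_percent(measured, predicted))
```

Unlike a correlation coefficient, this measure is computed term by term, so a systematic over- or under-prediction shows up even when the two series are strongly correlated.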
Stevens, F. J.; Bobrovnik, S. A.; Biosciences Division; Palladin Inst. Biochemistry
2007-12-01
Physiological responses of the adaptive immune system are polyclonal in nature whether induced by a naturally occurring infection, by vaccination to prevent infection or, in the case of animals, by challenge with antigen to generate reagents of research or commercial significance. The composition of the polyclonal responses is distinct to each individual or animal and changes over time. Differences exist in the affinities of the constituents and their relative proportion of the responsive population. In addition, some of the antibodies bind to different sites on the antigen, whereas other pairs of antibodies are sterically restricted from concurrent interaction with the antigen. Even if generation of a monoclonal antibody is the ultimate goal of a project, the quality of the resulting reagent is ultimately related to the characteristics of the initial immune response. It is probably impossible to quantitatively parse the composition of a polyclonal response to antigen. However, molecular regression allows further parameterization of a polyclonal antiserum in the context of certain simplifying assumptions. The antiserum is described as consisting of two competing populations of high- and low-affinity and unknown relative proportions. This simple model allows the quantitative determination of representative affinities and proportions. These parameters may be of use in evaluating responses to vaccines, to evaluating continuity of antibody production whether in vaccine recipients or animals used for the production of antisera, or in optimizing selection of donors for the production of monoclonal antibodies.
Akutekwe, Arinze; Seker, Huseyin
2015-08-01
Comprehensive understanding of gene regulatory networks (GRNs) is a major challenge in systems biology. Most methods for modeling and inferring the dynamics of GRNs, such as those based on state space models, vector autoregressive models and G1DBN algorithm, assume linear dependencies among genes. However, this strong assumption does not make for true representation of time-course relationships across the genes, which are inherently nonlinear. Nonlinear modeling methods such as the S-systems and causal structure identification (CSI) have been proposed, but are known to be statistically inefficient and analytically intractable in high dimensions. To overcome these limitations, we propose an optimized ensemble approach based on support vector regression (SVR) and dynamic Bayesian networks (DBNs). The method, called SVR-DBN, uses nonlinear kernels of the SVR to infer the temporal relationships among genes within the DBN framework. The two-stage ensemble is further improved by SVR parameter optimization using Particle Swarm Optimization. Results on eight in silico-generated datasets, and two real-world datasets of Drosophila melanogaster and Escherichia coli, show that our method outperformed the G1DBN algorithm by a total average accuracy of 12%. We further applied our method to model the time-course relationships of ovarian carcinoma. From our results, four hub genes were discovered. Stratified analysis further showed that the expression levels of the Prostate differentiation factor and BTG family member 2 genes were significantly increased by the cisplatin and oxaliplatin platinum drugs, while expression levels of the Polo-like kinase and Cyclin B1 genes were both decreased by the platinum drugs. These hub genes might be potential biomarkers for ovarian carcinoma. PMID:26738192
Inverse Tasks In The Tsunami Problem: Nonlinear Regression With Inaccurate Input Data
NASA Astrophysics Data System (ADS)
Lavrentiev, M.; Shchemel, A.; Simonov, K.
A variant of a modified training functional that allows inaccurate input data to be taken into account is suggested. A limiting case, in which part of the input data is completely undefined and a problem of reconstructing hidden parameters must therefore be solved, is also considered. Some numerical experiments are presented. The classic problem definition, widely used in the majority of neural net algorithms, assumes that a dependence of known output variables on known input ones should be found. The quality of approximation is evaluated by a performance function. Often the error of the task is evaluated as the squared distance between known input data and predicted data, multiplied by weighting coefficients. These coefficients may be named "precision coefficients". When inputs are not known exactly, a natural generalization of the performance function is to add a term responsible for the distance between the known inputs and shifted inputs, which lessens the model's error. It is desirable that the set of variable parameters be compact for training to converge. In the above problem it is possible to choose variants of a priori compactness requirements that allow a meaningful interpretation in terms of the smoothness of the model dependence. Two kinds of regularization were used: the first limited the squares of the coefficients responsible for nonlinearity, and the second limited the product of those coefficients and the linear coefficients. The asymptotic universality of a neural net's ability to approximate various smooth functions with any accuracy as the number of tunable parameters increases is often the basis for selecting a type of neural net approximation. It is possible to show that the neural net used approaches the Fourier integral transform, whose approximation abilities are known, as the number of tunable parameters increases. In the limiting case, when input data is set with zero precision, the problem of reconstructing hidden parameters from observed output data appears. The
NASA Astrophysics Data System (ADS)
Zhang, Xiaoyu; Li, Qingbo; Zhang, Guangjun
2013-11-01
In this paper, a modified single-index signal regression (mSISR) method is proposed to construct a nonlinear and practical model with high accuracy. The mSISR method defines the optimal penalty tuning parameter in P-spline signal regression (PSR) as the initial tuning parameter and chooses the number of cycles by minimizing the root mean squared error of cross-validation (RMSECV). mSISR is superior to single-index signal regression (SISR) in terms of accuracy, computation time and convergence, and it can characterize the non-linearity between spectra and responses more precisely than SISR. Two spectra data sets from basic research experiments, including plant chlorophyll nondestructive measurement and human blood glucose noninvasive measurement, are employed to illustrate the advantages of mSISR. The results indicate that the mSISR method (i) obtains a smooth and helpful regression coefficient vector, (ii) explicitly exhibits the type and amount of the non-linearity, (iii) can take advantage of nonlinear features of the signals to improve prediction performance and (iv) adapts better to complex spectra models than other calibration methods. It is validated that mSISR is a promising nonlinear modeling strategy for multivariate calibration.
NASA Astrophysics Data System (ADS)
Frecon, Jordan; Didier, Gustavo; Pustelnik, Nelly; Abry, Patrice
2016-08-01
Self-similarity is widely considered the reference framework for modeling the scaling properties of real-world data. However, most theoretical studies and their practical use have remained univariate. Operator Fractional Brownian Motion (OfBm) was recently proposed as a multivariate model for self-similarity. Yet it has remained seldom used in applications because of serious issues that appear in the joint estimation of its numerous parameters. While the univariate fractional Brownian motion requires the estimation of two parameters only, its mere bivariate extension already involves 7 parameters which are very different in nature. The present contribution proposes a method for the full identification of bivariate OfBm (i.e., the joint estimation of all parameters) through an original formulation as a non-linear wavelet regression coupled with a custom-made Branch & Bound numerical scheme. The estimation performance (consistency and asymptotic normality) is mathematically established and numerically assessed by means of Monte Carlo experiments. The impact of the parameters defining OfBm on the estimation performance as well as the associated computational costs are also thoroughly investigated.
NASA Astrophysics Data System (ADS)
Alves, Larissa A.; de Castro, Arthur H.; de Mendonça, Fernanda G.; de Mesquita, João P.
2016-05-01
The oxygenated functional groups present on the surface of carbon dots with an average size of 2.7 ± 0.5 nm were characterized by a variety of techniques. In particular, we discuss the fitting of potentiometric titration curves using a nonlinear regression method based on the Levenberg-Marquardt algorithm. The results obtained by statistical treatment of the titration curve data showed that the best fit was obtained considering the presence of five Brønsted-Lowry acids on the surface of the carbon dots, with ionization constants characteristic of carboxylic acids, cyclic ester, phenolic and pyrone-like groups. The total number of oxygenated acid groups obtained was 5 mmol g⁻¹, with approximately 65% (∼2.9 mmol g⁻¹) originating from groups with pKa < 6. The methodology showed good reproducibility and stability with standard deviations below 5%. The nature of the groups was independent of small variations in experimental conditions, i.e. the mass of carbon dots titrated and initial concentration of HCl solution. Finally, we believe that the methodology used here, together with other characterization techniques, is a simple, fast and powerful tool to characterize the complex acid-base properties of these interesting and intriguing nanoparticles.
ERIC Educational Resources Information Center
Barringer, Mary S.
Researchers are becoming increasingly aware of the advantages of using multiple regression as opposed to analysis of variance (ANOVA) or analysis of covariance (ANCOVA). Multiple regression is more versatile and does not force the researcher to throw away variance by categorizing intervally scaled data. Polynomial regression analysis offers the…
Meloun, Milan; Syrový, Tomás; Bordovská, Sylva; Vrána, Ales
2007-02-01
When drugs are poorly soluble then, instead of the potentiometric determination of dissociation constants, pH-spectrophotometric titration can be used along with nonlinear regression of the absorbance response surface data. Generally, regression models are extremely useful for extracting the essential features from a multiwavelength set of data. Regression diagnostics represent procedures for examining the regression triplet (data, model, method) in order to check (a) the data quality for a proposed model; (b) the model quality for a given set of data; and (c) that all of the assumptions used for least squares hold. In the interactive, PC-assisted diagnosis of data, models and estimation methods, the examination of data quality involves the detection of influential points, outliers and high leverages, that cause many problems when regression fitting the absorbance response hyperplane. All graphically oriented techniques are suitable for the rapid estimation of influential points. The reliability of the dissociation constants for the acid drug silybin may be proven with goodness-of-fit tests of the multiwavelength spectrophotometric pH-titration data. The uncertainty in the measurement of the pKa of a weak acid obtained by the least squares nonlinear regression analysis of absorption spectra is calculated. The procedure takes into account the drift in pH measurement, the drift in spectral measurement, and all of the drifts in analytical operations, as well as the relative importance of each source of uncertainty. The most important source of uncertainty in the experimental set-up for the example is the uncertainty in the pH measurement. The influences of various sources of uncertainty on the accuracy and precision are discussed using the example of the mixed dissociation constants of silybin, obtained using the SQUAD(84) and SPECFIT/32 regression programs. PMID:17216158
NASA Astrophysics Data System (ADS)
Lin, Yiqiu
2007-12-01
Ozone forecast models using nonlinear regression (NLR) have been successfully applied to daily ozone forecast for seven metro areas in Kentucky, including Ashland, Bowling Green, Covington, Lexington, Louisville, Owensboro, and Paducah. In this study, the updated 2005 NLR ozone forecast models for these metro areas were evaluated on both the calibration data sets and independent data sets. These NLR ozone forecast models explained at least 72% of the variance of the daily peak ozone. Using the models to predict the ozone concentrations during the 2005 ozone season, the metro area mean absolute errors (MAEs) of the model hindcasts ranged from 5.90 ppb to 7.20 ppb. For the model raw forecasts, the metro area MAEs ranged from 7.90 ppb to 9.80 ppb. Based on previously developed NLR ozone forecast models for those areas, Takagi-Sugeno fuzzy system models were developed for the seven metro areas. The fuzzy "c-means" clustering technique coupled with an optimal output predefuzzification approach (least square method) was used to train the Takagi-Sugeno fuzzy system. Two types of fuzzy models, basic fuzzy and NLR-fuzzy system models, were developed. The basic fuzzy and NLR-fuzzy models exhibited essentially equivalent performance to the existing NLR models on 2004 ozone season hindcasts and forecasts. Both types of fuzzy models had, on average, slightly lower metro area averaged MAEs than the NLR models. Among the seven Kentucky metro areas Ashland, Covington, and Louisville are currently designated nonattainment areas for both ground level O3 and PM2.5. In this study, summer PM2.5 forecast models were developed for providing daily average PM2.5 forecasts for the seven metro areas. The performance of the PM2.5 forecast models was generally not as good as that of the ozone forecast models. For the summer 2004 model hindcasts, the metro-area average MAE was 5.33 μg/m³. Exploratory research was conducted to find the relationship between the winter PM2.5 concentrations and
Interval estimates for nonlinear parameters using the linear approximation are sensitive to parameter curvature effects. The adequacy of the linear approximation (Wald) interval is determined using the nonlinearity measures of Bates and Watts (1980), and Clarke (1987b), and the pr...
Lee, Myung Hee; Liu, Yufeng
2013-12-01
The continuum regression technique provides an appealing regression framework connecting ordinary least squares, partial least squares and principal component regression in one family. It offers some insight on the underlying regression model for a given application. Moreover, it helps to provide deep understanding of various regression techniques. Despite the useful framework, however, the current development on continuum regression is only for linear regression. In many applications, nonlinear regression is necessary. The extension of continuum regression from linear models to nonlinear models using kernel learning is considered. The proposed kernel continuum regression technique is quite general and can handle very flexible regression model estimation. An efficient algorithm is developed for fast implementation. Numerical examples have demonstrated the usefulness of the proposed technique. PMID:24058224
Meloun, Milan; Bordovská, Sylva; Galla, Lubomír
2007-11-30
The mixed dissociation constants of four non-steroidal anti-inflammatory drugs (NSAIDs) ibuprofen, diclofenac sodium, flurbiprofen and ketoprofen at various ionic strengths I of range 0.003-0.155, and at temperatures of 25 degrees C and 37 degrees C, were determined with the use of two different multiwavelength and multivariate treatments of spectral data, SPECFIT/32 and SQUAD(84) nonlinear regression analyses and INDICES factor analysis. The factor analysis in the INDICES program predicts the correct number of components, and even the presence of minor ones, when the data quality is high and the instrumental error is known. The thermodynamic dissociation constant pK(a)(T) was estimated by nonlinear regression of (pK(a), I) data at 25 degrees C and 37 degrees C. Goodness-of-fit tests for various regression diagnostics enabled the reliability of the parameter estimates found to be proven. PALLAS, MARVIN, SPARC, ACD/pK(a) and Pharma Algorithms predict pK(a) being based on the structural formulae of drug compounds in agreement with the experimental value. The best agreement seems to be between the ACD/pK(a) program and experimentally found values and with SPARC. PALLAS and MARVIN predicted pK(a,pred) values with larger bias errors in comparison with the experimental value for all four drugs. PMID:17825517
Nonlinear-regression flow model of the Gulf Coast aquifer systems in the south-central United States
Kuiper, L.K.
1994-01-01
A multiple-regression methodology was used to help answer questions concerning model reliability, and to calibrate a time-dependent variable-density ground-water flow model of the Gulf Coast aquifer systems in the south-central United States. More than 40 regression models with 2 to 31 regression parameters are used and detailed results are presented for 12 of the models. More than 3,000 values for grid-element volume-averaged head and hydraulic conductivity are used for the regression model observations. Calculated prediction interval half widths, though perhaps inaccurate due to a lack of normality of the residuals, are the smallest for models with only four regression parameters. In addition, the root-mean weighted residual decreases very little with an increase in the number of regression parameters. The various models showed considerable overlap between the prediction intervals for shallow head and hydraulic conductivity. Approximate 95-percent prediction interval half widths for volume-averaged freshwater head exceed 108 feet; for volume-averaged base 10 logarithm hydraulic conductivity, they exceed 0.89. All of the models are unreliable for the prediction of head and ground-water flow in the deeper parts of the aquifer systems, including the amount of flow coming from the underlying geopressured zone. Truncating the domain of solution of one model to exclude that part of the system having a ground-water density greater than 1.005 grams per cubic centimeter or to exclude that part of the systems below a depth of 3,000 feet, and setting the density to that of freshwater does not appreciably change the results for head and ground-water flow, except for locations close to the truncation surface.
Lloyd, J W; Rook, J S; Braselton, E; Shea, M E
2000-02-01
A study was designed to model the fluctuations of nine specific element concentrations in mammary secretions from periparturient mares over time. During the 1992 foaling season, serial samples of mammary secretions were collected from all 18 pregnant Arabian mares at the Michigan State University equine teaching and research center. Non-linear regression techniques were used to model the relationship between element concentration in mammary secretions and days from foaling (which connected two separate sigmoid curves with a spline function); indicator variables were included for mare and mare parity. Element concentrations in mammary secretions varied significantly during the periparturient period in mares. Both time trends and individual variability explained a significant portion of the variation in these element concentrations. Multiparous mares had lower concentrations of K and Zn, but higher concentrations of Na. Substantial serial and spatial correlation were detected in spite of modeling efforts to avoid the problem. As a result, p-values obtained for parameter estimates were likely biased toward zero. Nonetheless, results of this analysis indicate that monitoring changes in mammary-secretion element concentrations might reasonably be used as a predictor of impending parturition in the mare. In addition, these results suggest that element concentrations warrant attention in the development of neonatal milk-replacement therapies. This study demonstrates that non-linear regression can be used successfully to model time-series data in animal-health management. This approach should be considered by investigators facing similar analytical challenges. PMID:10782599
Balabin, Roman M; Lomakina, Ekaterina I
2011-04-21
In this study, we make a general comparison of the accuracy and robustness of five multivariate calibration models: partial least squares (PLS) regression or projection to latent structures, polynomial partial least squares (Poly-PLS) regression, artificial neural networks (ANNs), and two novel techniques based on support vector machines (SVMs) for multivariate data analysis: support vector regression (SVR) and least-squares support vector machines (LS-SVMs). The comparison is based on fourteen (14) different datasets: seven sets of gasoline data (density, benzene content, and fractional composition/boiling points), two sets of ethanol gasoline fuel data (density and ethanol content), one set of diesel fuel data (total sulfur content), three sets of petroleum (crude oil) macromolecules data (weight percentages of asphaltenes, resins, and paraffins), and one set of petroleum resins data (resins content). Vibrational (near-infrared, NIR) spectroscopic data are used to predict the properties and quality coefficients of gasoline, biofuel/biodiesel, diesel fuel, and other samples of interest. The four systems presented here range greatly in composition, properties, strength of intermolecular interactions (e.g., van der Waals forces, H-bonds), colloid structure, and phase behavior. Due to the high diversity of chemical systems studied, general conclusions about SVM regression methods can be made. We try to answer the following question: to what extent can SVM-based techniques replace ANN-based approaches in real-world (industrial/scientific) applications? The results show that both SVR and LS-SVM methods are comparable to ANNs in accuracy. Due to the much higher robustness of the former, the SVM-based approaches are recommended for practical (industrial) application. This has been shown to be especially true for complicated, highly nonlinear objects. PMID:21350755
Lambert, Ronald J W; Mytilinaios, Ioannis; Maitland, Luke; Brown, Angus M
2012-08-01
This study describes a method to obtain parameter confidence intervals from the fitting of non-linear functions to experimental data, using the SOLVER and Analysis ToolPaK Add-In of the Microsoft Excel spreadsheet. Previously we have shown that Excel can fit complex multiple functions to biological data, obtaining values equivalent to those returned by more specialized statistical or mathematical software. However, a disadvantage of using the Excel method was the inability to return confidence intervals for the computed parameters or the correlations between them. Using a simple Monte-Carlo procedure within the Excel spreadsheet (without recourse to programming), SOLVER can provide parameter estimates (up to 200 at a time) for multiple 'virtual' data sets, from which the required confidence intervals and correlation coefficients can be obtained. The general utility of the method is exemplified by applying it to the analysis of the growth of Listeria monocytogenes, the growth inhibition of Pseudomonas aeruginosa by chlorhexidine and the further analysis of the electrophysiological data from the compound action potential of the rodent optic nerve. PMID:21764476
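The Monte Carlo idea in this abstract (refit many 'virtual' data sets, then read confidence limits off the spread of the refitted parameters) can be sketched outside a spreadsheet. The following is a minimal illustration only, not the authors' Excel/SOLVER procedure: the one-parameter decay model, the golden-section fitter, the synthetic data and all function names are assumptions made for the example.

```python
import math
import random


def model(x, k):
    """Hypothetical one-parameter model y = exp(-k * x) (an assumption)."""
    return math.exp(-k * x)


def sse(k, xs, ys):
    """Sum of squared residuals for a given k."""
    return sum((y - model(x, k)) ** 2 for x, y in zip(xs, ys))


def fit_k(xs, ys, lo=0.01, hi=5.0, tol=1e-8):
    """Least-squares estimate of k by golden-section search.

    Assumes the sum of squares has a single minimum inside [lo, hi].
    """
    g = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - g * (b - a)
        d = a + g * (b - a)
        if sse(c, xs, ys) < sse(d, xs, ys):
            b = d
        else:
            a = c
    return (a + b) / 2


def monte_carlo_ci(xs, ys, n_virtual=200, level=0.95, seed=0):
    """Percentile confidence interval for k from refitted virtual data sets."""
    rng = random.Random(seed)
    k_hat = fit_k(xs, ys)
    fitted = [model(x, k_hat) for x in xs]
    # Residual standard deviation with one fitted parameter.
    sigma = math.sqrt(sse(k_hat, xs, ys) / (len(xs) - 1))
    estimates = []
    for _ in range(n_virtual):
        # A virtual data set: fitted curve plus simulated Gaussian noise.
        virtual = [f + rng.gauss(0.0, sigma) for f in fitted]
        estimates.append(fit_k(xs, virtual))
    estimates.sort()
    tail = (1.0 - level) / 2.0
    lower = estimates[int(tail * (n_virtual - 1))]
    upper = estimates[int(round((1.0 - tail) * (n_virtual - 1)))]
    return k_hat, lower, upper


# Synthetic illustration data (not from the study): true k = 0.7, noise sd 0.02.
data_rng = random.Random(1)
xs = [0.5 * i for i in range(1, 11)]
ys = [math.exp(-0.7 * x) + data_rng.gauss(0.0, 0.02) for x in xs]

k_hat, lower, upper = monte_carlo_ci(xs, ys)
print(f"k = {k_hat:.4f}, 95% interval = ({lower:.4f}, {upper:.4f})")
```

The 200 virtual data sets mirror the batch size mentioned in the abstract; with several parameters, the same list of refitted estimates would also give the correlations between them.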
NASA Astrophysics Data System (ADS)
Huttunen, Jani; Kokkola, Harri; Mielonen, Tero; Esa Juhani Mononen, Mika; Lipponen, Antti; Reunanen, Juha; Vilhelm Lindfors, Anders; Mikkonen, Santtu; Erkki Juhani Lehtinen, Kari; Kouremeti, Natalia; Bais, Alkiviadis; Niska, Harri; Arola, Antti
2016-07-01
In order to have a good estimate of the current forcing by anthropogenic aerosols, knowledge on past aerosol levels is needed. Aerosol optical depth (AOD) is a good measure for aerosol loading. However, dedicated measurements of AOD are only available from the 1990s onward. One option to lengthen the AOD time series beyond the 1990s is to retrieve AOD from surface solar radiation (SSR) measurements taken with pyranometers. In this work, we have evaluated several inversion methods designed for this task. We compared a look-up table method based on radiative transfer modelling, a non-linear regression method and four machine learning methods (Gaussian process, neural network, random forest and support vector machine) with AOD observations carried out with a sun photometer at an Aerosol Robotic Network (AERONET) site in Thessaloniki, Greece. Our results show that most of the machine learning methods produce AOD estimates comparable to the look-up table and non-linear regression methods. All of the applied methods produced AOD values that corresponded well to the AERONET observations with the lowest correlation coefficient value being 0.87 for the random forest method. While many of the methods tended to slightly overestimate low AODs and underestimate high AODs, neural network and support vector machine showed overall better correspondence for the whole AOD range. The differences in producing both ends of the AOD range seem to be caused by differences in the aerosol composition. High AODs were in most cases those with high water vapour content which might affect the aerosol single scattering albedo (SSA) through uptake of water into aerosols. Our study indicates that machine learning methods benefit from the fact that they do not constrain the aerosol SSA in the retrieval, whereas the LUT method assumes a constant value for it. This would also mean that machine learning methods could have potential in reproducing AOD from SSR even though SSA would have changed during
Jalali-Heravi, M; Mani-Varnosfaderani, A; Taherinia, D; Mahmoodi, M M
2012-07-01
The main aim of this work was to assess the ability of Bayesian multivariate adaptive regression splines (BMARS) and Bayesian radial basis function (BRBF) techniques for modelling the gas chromatographic retention indices of volatile components of Artemisia species. A diverse set of molecular descriptors was calculated and used as descriptor pool for modelling the retention indices. The ability of BMARS and BRBF techniques was explored for the selection of the most relevant descriptors and proper basis functions for modelling. The results revealed that BRBF technique is more reproducible than BMARS for modelling the retention indices and can be used as a method for variable selection and modelling in quantitative structure-property relationship (QSPR) studies. It is also concluded that the Markov chain Monte Carlo (MCMC) search engine, implemented in BRBF algorithm, is a suitable method for selecting the most important features from a vast number of them. The values of correlation between the calculated retention indices and the experimental ones for the training and prediction sets (0.935 and 0.902, respectively) revealed the prediction power of the BRBF model in estimating the retention index of volatile components of Artemisia species. PMID:22452344
The covariate-adjusted frequency plot.
Holling, Heinz; Böhning, Walailuck; Böhning, Dankmar; Formann, Anton K
2016-04-01
Count data arise in numerous fields of interest. Analysis of these data frequently requires distributional assumptions. Although the graphical display of a fitted model is straightforward in the univariate scenario, this becomes more complex if covariate information needs to be included in the model. Stratification is one way to proceed, but has its limitations if the covariate has many levels or the number of covariates is large. The article suggests a marginal method which works even in the case that all possible covariate combinations are different (i.e. no covariate combination occurs more than once). For each covariate combination the fitted model value is computed and then summed over the entire data set. The technique is quite general and works with all count distributional models as well as with all forms of covariate modelling. The article provides illustrations of the method for various situations and also shows that the proposed estimator as well as the empirical count frequency are consistent with respect to the same parameter. PMID:23376964
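The marginal computation described here (evaluate the fitted model for each covariate combination, then sum over the data set) can be sketched as follows. This is an illustration under stated assumptions, not the article's code: a Poisson model with a log link is assumed, the coefficients `beta0` and `beta1` are hypothetical values standing in for an already fitted model, and the small data set is made up.

```python
import math


def poisson_pmf(y, mu):
    """Poisson probability of observing count y given mean mu."""
    return math.exp(-mu) * mu ** y / math.factorial(y)


def covariate_adjusted_frequencies(xs, beta0, beta1, max_count):
    """Expected number of observations with count y, for y = 0..max_count.

    For each y, the fitted probability P(Y = y | x_i) is summed over every
    observation, so no stratification of the covariate is needed.
    """
    freqs = []
    for y in range(max_count + 1):
        total = 0.0
        for x in xs:
            mu = math.exp(beta0 + beta1 * x)  # log-link mean (assumed model)
            total += poisson_pmf(y, mu)
        freqs.append(total)
    return freqs


def empirical_frequencies(counts, max_count):
    """Observed number of observations with count y, for y = 0..max_count."""
    return [sum(1 for c in counts if c == y) for y in range(max_count + 1)]


# Made-up illustration data: every covariate value occurs exactly once.
xs = [0.1, 0.4, 0.7, 1.0, 1.3, 1.6, 1.9, 2.0]
counts = [1, 0, 2, 3, 2, 5, 4, 7]
beta0, beta1 = 0.2, 0.8  # hypothetical fitted coefficients

fitted = covariate_adjusted_frequencies(xs, beta0, beta1, max_count=10)
observed = empirical_frequencies(counts, max_count=10)
for y, (f, o) in enumerate(zip(fitted, observed)):
    print(y, round(f, 2), o)
```

Plotting `fitted` against `observed` for each count value gives the kind of display the abstract refers to; summed over all possible counts, the fitted frequencies add up to the sample size.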
NASA Astrophysics Data System (ADS)
Biyanto, Totok R.
2016-06-01
Fouling in a heat exchanger in Crude Preheat Train (CPT) refinery is an unsolved problem that reduces the plant efficiency, increases fuel consumption and CO2 emission. The fouling resistance behavior is very complex. It is difficult to develop a model using first principle equation to predict the fouling resistance due to different operating conditions and different crude blends. In this paper, Artificial Neural Networks (ANN) MultiLayer Perceptron (MLP) with input structure using Nonlinear Auto-Regressive with eXogenous (NARX) is utilized to build the fouling resistance model in shell and tube heat exchanger (STHX). The input data of the model are flow rates and temperatures of the streams of the heat exchanger, physical properties of product and crude blend data. This model serves as a predicting tool to optimize operating conditions and preventive maintenance of STHX. The results show that the model can capture the complexity of fouling characteristics in heat exchanger due to thermodynamic conditions and variations in crude oil properties (blends). It was found that the Root Mean Square Error (RMSE) are suitable to capture the nonlinearity and complexity of the STHX fouling resistance during phases of training and validation.
Caballero, Julio; Fernández, Michael
2006-01-01
Antifungal activity was modeled for a set of 96 heterocyclic ring derivatives (2,5,6-trisubstituted benzoxazoles, 2,5-disubstituted benzimidazoles, 2-substituted benzothiazoles and 2-substituted oxazolo(4,5-b)pyridines) using multiple linear regression (MLR) and Bayesian-regularized artificial neural network (BRANN) techniques. Inhibitory activity against Candida albicans (log(1/C)) was correlated with 3D descriptors encoding the chemical structures of the heterocyclic compounds. Training and test sets were chosen by means of k-Means Clustering. The most appropriate variables for linear and nonlinear modeling were selected using a genetic algorithm (GA) approach. In addition to the MLR equation (MLR-GA), two nonlinear models were built, model BRANN employing the linear variable subset and an optimum model BRANN-GA obtained by a hybrid method that combined BRANN and GA approaches (BRANN-GA). The linear model fit the training set (n = 80) with r2 = 0.746, while BRANN and BRANN-GA gave higher values of r2 = 0.889 and r2 = 0.937, respectively. Beyond the improvement of training set fitting, the BRANN-GA model was superior to the others by being able to describe 87% of test set (n = 16) variance in comparison with 78 and 81% for the MLR-GA and BRANN models, respectively. Our quantitative structure-activity relationship study suggests that the distributions of atomic mass, volume and polarizability have relevant relationships with the antifungal potency of the compounds studied. Furthermore, the ability of the six variables selected nonlinearly to differentiate the data was demonstrated when the total data set was well distributed in a Kohonen self-organizing neural network (KNN). PMID:16205958
Letcher, J H
1989-01-01
For a number of reasons, it is desirable to fabricate coils which, for a known current, shall produce predetermined values of the magnetic field intensity at a number of points within a nuclear magnetic resonance imager. The calculation of the magnetic field intensity at a set of points involves the integration of the Biot-Savart equation for all components of the segments of conductor which make up the coil. This process in itself is a rather formidable task. When this process is parameterized in terms of coil diameter, coil spacing, etc. the problem is to determine the values of these parameters to match values of magnetic field intensities which are desired. The problem thereby increases in complexity to the point where, by ordinary methods, the problem becomes intractable. A generalized solution technique has been developed on a digital computer to implement the rotational discrimination nonlinear regression techniques of Faris, Law and Letcher to find the best solution to this problem. The problem is posed by integrating the Biot-Savart equation. This produces algebraic expressions for incorporation into the optimization program which is executed on a computer in a conversational mode. This technique was employed to specify the dimensions of a rectangular surface coil for the investigation of the whole human spine. PMID:2630841
Schmid, Matthias; Wickler, Florian; Maloney, Kelly O.; Mitchell, Richard; Fenske, Nora; Mayr, Andreas
2013-01-01
Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1). Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fit a beta regression model is to use maximum likelihood estimation with subsequent AIC-based variable selection. As an alternative to this established - yet unstable - approach, we propose a new estimation technique called boosted beta regression. With boosted beta regression estimation and variable selection can be carried out simultaneously in a highly efficient way. Additionally, both the mean and the variance of a percentage response can be modeled using flexible nonlinear covariate effects. As a consequence, the new method accounts for common problems such as overdispersion and non-binomial variance structures. PMID:23626706
NASA Astrophysics Data System (ADS)
Grégoire, G.
2014-12-01
Logistic regression was originally intended to explain the relationship between the probability of an event and a set of covariables. The model's coefficients can be interpreted via the odds and odds ratios, which are presented in the introduction of the chapter. When the observations are obtained individually, we speak of binary logistic regression; when they are grouped, the logistic regression is said to be binomial. In our presentation we mainly focus on the binary case. For statistical inference the main tool is the maximum likelihood methodology: we present the Wald, Rao and likelihood ratio results and their use to compare nested models. The problems we intend to deal with are essentially the same as in multiple linear regression: testing global effects, testing individual effects, selecting variables to build a model, measuring the fit of the model, and predicting new values. The methods are demonstrated on data sets using R. Finally we briefly consider the binomial case and the situation where we are interested in several events, that is, polytomous (multinomial) logistic regression and the particular case of ordinal logistic regression.
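The interpretation of coefficients through odds and odds ratios can be checked numerically. The sketch below is only an illustration: the intercept and slope are hypothetical values, not estimates from the chapter, and the chapter itself works in R rather than Python.

```python
import math


def probability(intercept, slope, x):
    """Event probability under a one-covariate logistic model."""
    z = intercept + slope * x
    return 1.0 / (1.0 + math.exp(-z))


def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)


# Hypothetical coefficients chosen for illustration.
intercept, slope = -2.0, 0.5

# Odds ratio for a one-unit increase in the covariate (from x = 2 to x = 3).
odds_ratio = (odds(probability(intercept, slope, 3.0))
              / odds(probability(intercept, slope, 2.0)))

print(odds_ratio, math.exp(slope))
```

Whatever starting value of x is used, the ratio of the two odds equals the exponential of the slope, which is why the exponentiated coefficient is read as an odds ratio.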
Tomlinson, Sean
2016-04-01
The calculation and comparison of physiological characteristics of thermoregulation has provided insight into patterns of ecology and evolution for over half a century. Thermoregulation has typically been explored using linear techniques; I explore the application of non-linear scaling to more accurately calculate and compare characteristics and thresholds of thermoregulation, including the basal metabolic rate (BMR), peak metabolic rate (PMR) and the lower (Tlc) and upper (Tuc) critical limits to the thermo-neutral zone (TNZ) for Australian rodents. An exponentially-modified logistic function accurately characterised the response of metabolic rate to ambient temperature, while evaporative water loss was accurately characterised by a Michaelis-Menten function. When these functions were used to resolve unique parameters for the nine species studied here, the estimates of BMR and TNZ were consistent with the previously published estimates. The approach resolved differences in rates of metabolism and water loss between subfamilies of Australian rodents that haven't been quantified before. I suggest that non-linear scaling is not only more effective than the established segmented linear techniques, but also is more objective. This approach may allow broader and more flexible comparison of characteristics of thermoregulation, but it needs testing with a broader array of taxa than those used here. PMID:27033039
Huang, Dong; Cabral, Ricardo; De la Torre, Fernando
2016-02-01
Discriminative methods (e.g., kernel regression, SVM) have been extensively used to solve problems such as object recognition, image alignment and pose estimation from images. These methods typically map image features ( X) to continuous (e.g., pose) or discrete (e.g., object category) values. A major drawback of existing discriminative methods is that samples are directly projected onto a subspace and hence fail to account for outliers common in realistic training sets due to occlusion, specular reflections or noise. It is important to notice that existing discriminative approaches assume the input variables X to be noise free. Thus, discriminative methods experience significant performance degradation when gross outliers are present. Despite its obvious importance, the problem of robust discriminative learning has been relatively unexplored in computer vision. This paper develops the theory of robust regression (RR) and presents an effective convex approach that uses recent advances on rank minimization. The framework applies to a variety of problems in computer vision including robust linear discriminant analysis, regression with missing data, and multi-label classification. Several synthetic and real examples with applications to head pose estimation from images, image and video classification and facial attribute classification with missing data are used to illustrate the benefits of RR. PMID:26761740
Lee, Paul H.
2016-01-01
Healthy adults are advised to perform at least 150 min of moderate-intensity physical activity weekly, but this advice is based on studies using self-reports of questionable validity. This study examined the dose-response relationship of accelerometer-measured physical activity and sedentary behaviors on all-cause mortality using segmented Cox regression to empirically determine the break-points of the dose-response relationship. Data from 7006 adult participants aged 18 or above in the National Health and Nutrition Examination Survey waves 2003–2004 and 2005–2006 were included in the analysis and linked with death certificate data using a probabilistic matching approach in the National Death Index through December 31, 2011. Physical activity and sedentary behavior were measured using ActiGraph model 7164 accelerometer over the right hip for 7 consecutive days. Each minute with accelerometer count <100; 1952–5724; and ≥5725 were classified as sedentary, moderate-intensity physical activity, and vigorous-intensity physical activity, respectively. Segmented Cox regression was used to estimate the hazard ratio (HR) of time spent in sedentary behaviors, moderate-intensity physical activity, and vigorous-intensity physical activity and all-cause mortality, adjusted for demographic characteristics, health behaviors, and health conditions. Data were analyzed in 2016. During 47,119 person-years of follow-up, 608 deaths occurred. Each additional hour per day of sedentary behaviors was associated with a HR of 1.15 (95% CI 1.01, 1.31) among participants who spent at least 10.9 h per day on sedentary behaviors, and each additional minute per day spent on moderate-intensity physical activity was associated with a HR of 0.94 (95% CI 0.91, 0.96) among participants with daily moderate-intensity physical activity ≤14.1 min. Associations of moderate physical activity and sedentary behaviors on all-cause mortality were independent of each other. To conclude, evidence from
Roenigk, K.F.; Jensen, K.F.; Carr, R.W.
1987-10-22
Arrhenius parameters are estimated for silane and disilane thermal decomposition reactions by direct regression of RRKM predictions on published static and shock-tube data. For silane decomposition, they find E∞ = 57.4-61.1 kcal/mol and log A∞ = 14.9-16.3, while for disilane they find E∞ = 51.1-52.5 kcal/mol and log A∞ = 15.2-16.2. The lower limiting values correspond to inclusion of negative temperature dependence in the collision efficiency, while the higher values correspond to inclusion of weak or negligible temperature dependence. The Arrhenius parameters for both silane and disilane decomposition differ substantially from previously published values. For silane, they predict preexponentials approximately an order of magnitude greater than the previous values for the same activation energy. For disilane, they find A∞ is roughly an order of magnitude higher than the literature values and E∞ is greater by more than 2 kcal/mol. Falloff curves for both silane and disilane decomposition are given. Implications of these results for the activation energy of SiH2 insertion into H2 and SiH4 and for ΔHf°(SiH2) are discussed.
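To see what the reported parameter ranges imply, the high-pressure-limit rate constant can be evaluated from the Arrhenius form k = A exp(-E/RT). The sketch below is an illustration with several assumptions that are not stated in the abstract: the preexponential is taken to be in s^-1 with log meaning log base 10, the temperature of 1000 K is arbitrary, and the lower (or upper) limits of E and log A are paired with each other.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def arrhenius_rate(log10_a, e_kcal, temperature_k):
    """Rate constant k = A * exp(-E / (R * T)), with A given as log10(A)."""
    return 10.0 ** log10_a * math.exp(-e_kcal / (R_KCAL * temperature_k))


T = 1000.0  # hypothetical temperature in kelvin

# Silane decomposition, lower and upper limiting parameter pairs from the abstract.
k_low = arrhenius_rate(14.9, 57.4, T)
k_high = arrhenius_rate(16.3, 61.1, T)

print(f"silane, lower-limit pair: log10 k = {math.log10(k_low):.2f}")
print(f"silane, upper-limit pair: log10 k = {math.log10(k_high):.2f}")
```

Because E and log A move together, the two parameter pairs give rate constants that differ by less than the change in log A alone would suggest.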
Liu, S.; Anderson, P.; Zhou, G.; Kauffman, B.; Hughes, F.; Schimel, D.; Watson, Vicente; Tosi, Joseph
2008-01-01
Objectively assessing the performance of a model and deriving model parameter values from observations are critical and challenging in landscape to regional modeling. In this paper, we applied a nonlinear inversion technique to calibrate the ecosystem model CENTURY against carbon (C) and nitrogen (N) stock measurements collected from 39 mature tropical forest sites in seven life zones in Costa Rica. Net primary productivity from the Moderate-Resolution Imaging Spectroradiometer (MODIS), C and N stocks in aboveground live biomass, litter, coarse woody debris (CWD), and in soils were used to calibrate the model. To investigate the resolution of available observations on the number of adjustable parameters, inversion was performed using nine setups of adjustable parameters. Statistics including observation sensitivity, parameter correlation coefficient, parameter sensitivity, and parameter confidence limits were used to evaluate the information content of observations, resolution of model parameters, and overall model performance. Results indicated that soil organic carbon content, soil nitrogen content, and total aboveground biomass carbon had the highest information contents, while measurements of carbon in litter and nitrogen in CWD contributed little to the parameter estimation processes. The available information could resolve the values of 2-4 parameters. Adjusting just one parameter resulted in under-fitting and unacceptable model performance, while adjusting five parameters simultaneously led to over-fitting. Results further indicated that the MODIS NPP values were compressed as compared with the spatial variability of net primary production (NPP) values inferred from inverse modeling. Using inverse modeling to infer NPP and other sensitive model parameters from C and N stock observations provides an opportunity to utilize data collected by national to regional forest inventory systems to reduce the uncertainties in the carbon cycle and generate valuable
Steganalysis using logistic regression
NASA Astrophysics Data System (ADS)
Lubenko, Ivans; Ker, Andrew D.
2011-02-01
We advocate Logistic Regression (LR) as an alternative to the Support Vector Machine (SVM) classifiers commonly used in steganalysis. LR offers more information than traditional SVM methods - it estimates class probabilities as well as providing a simple classification - and can be adapted more easily and efficiently for multiclass problems. Like SVM, LR can be kernelised for nonlinear classification, and it shows comparable classification accuracy to SVM methods. This work is a case study, comparing accuracy and speed of SVM and LR classifiers in detection of LSB Matching and other related spatial-domain image steganography, through the state-of-the-art 686-dimensional SPAM feature set, in three image sets.
Covariate-adjusted confidence interval for the intraclass correlation coefficient.
Shoukri, Mohamed M; Donner, Allan; El-Dali, Abdelmoneim
2013-09-01
A crucial step in designing a new study is to estimate the required sample size. For a design involving cluster sampling, the appropriate sample size depends on the so-called design effect, which is a function of the average cluster size and the intracluster correlation coefficient (ICC). It is well-known that under the framework of hierarchical and generalized linear models, a reduction in residual error may be achieved by including risk factors as covariates. In this paper we show that the covariate design, indicating whether the covariates are measured at the cluster level or at the within-cluster subject level affects the estimation of the ICC, and hence the design effect. Therefore, the distinction between these two types of covariates should be made at the design stage. In this paper we use the nested-bootstrap method to assess the accuracy of the estimated ICC for continuous and binary response variables under different covariate structures. The codes of two SAS macros are made available by the authors for interested readers to facilitate the construction of confidence intervals for the ICC. Moreover, using Monte Carlo simulations we evaluate the relative efficiency of the estimators and evaluate the accuracy of the coverage probabilities of a 95% confidence interval on the population ICC. The methodology is illustrated using a published data set of blood pressure measurements taken on family members. PMID:23871746
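The dependence of sample size on the design effect can be made concrete with the usual textbook expression, design effect = 1 + (m - 1) × ICC for average cluster size m. The sketch below assumes that standard form (the abstract does not spell out the formula), and the numbers are hypothetical.

```python
def design_effect(avg_cluster_size, icc):
    """Textbook design effect for cluster sampling: 1 + (m - 1) * ICC."""
    return 1.0 + (avg_cluster_size - 1.0) * icc


def clustered_sample_size(n_simple_random, avg_cluster_size, icc):
    """Subjects needed under cluster sampling to match a simple random sample.

    Returns an unrounded value; round up to whole subjects or clusters as needed.
    """
    return n_simple_random * design_effect(avg_cluster_size, icc)


# Hypothetical planning values: 20 subjects per cluster, ICC of 0.05,
# and 400 subjects required under simple random sampling.
deff = design_effect(20, 0.05)
required = clustered_sample_size(400, 20, 0.05)
print(deff, required)
```

A covariate that lowers the estimated ICC therefore lowers the design effect and the required sample size, which is why the level at which covariates are measured matters at the design stage.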
Generalized REGression Package for Nonlinear Parameter Estimation
1995-05-15
GREG computes modal (maximum-posterior-density) and interval estimates of the parameters in a user-provided Fortran subroutine MODEL, using a user-provided vector OBS of single-response observations or matrix OBS of multiresponse observations. GREG can also select the optimal next experiment from a menu of simulated candidates, so as to minimize the volume of the parametric inference region based on the resulting augmented data set.
Deriving the Regression Equation without Using Calculus
ERIC Educational Resources Information Center
Gordon, Sheldon P.; Gordon, Florence S.
2004-01-01
Probably the one "new" mathematical topic that is most responsible for modernizing courses in college algebra and precalculus over the last few years is the idea of fitting a function to a set of data in the sense of a least squares fit. Whether it be simple linear regression or nonlinear regression, this topic opens the door to applying the…
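For the simple linear case, the least-squares line has a closed form that can be computed directly from the data. The sketch below uses the standard slope and intercept formulas; it illustrates the fitted result only and is not the calculus-free derivation the article presents. The data points are made up.

```python
def least_squares_line(xs, ys):
    """Slope and intercept of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up points lying exactly on y = 2x + 1.
slope, intercept = least_squares_line([0, 1, 2, 3], [1, 3, 5, 7])
print(slope, intercept)
```

When the points lie exactly on a line, the formulas recover that line; with noisy data they return the line that minimises the sum of squared vertical deviations.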
ERIC Educational Resources Information Center
Pedrini, D. T.; Pedrini, Bonnie C.
Regression, another mechanism studied by Sigmund Freud, has had much research, e.g., hypnotic regression, frustration regression, schizophrenic regression, and infra-human-animal regression (often directly related to fixation). Many investigators worked with hypnotic age regression, which has a long history, going back to Russian reflexologists.…
Image segmentation via piecewise constant regression
NASA Astrophysics Data System (ADS)
Acton, Scott T.; Bovik, Alan C.
1994-09-01
We introduce a novel unsupervised image segmentation technique that is based on piecewise constant (PICO) regression. Given an input image, a PICO output image for a specified feature size (scale) is computed via nonlinear regression. The regression effectively provides the constant region segmentation of the input image that has a minimum deviation from the input image. PICO regression-based segmentation avoids the problems of region merging, poor localization, region boundary ambiguity, and region fragmentation. Additionally, our segmentation method is particularly well-suited for corrupted (noisy) input data. An application to segmentation and classification of remotely sensed imagery is provided.
Regression modeling of ground-water flow
Cooley, R.L.; Naff, R.L.
1985-01-01
Nonlinear multiple regression methods are developed to model and analyze groundwater flow systems. Complete descriptions of regression methodology as applied to groundwater flow models allow scientists and engineers engaged in flow modeling to apply the methods to a wide range of problems. Organization of the text proceeds from an introduction that discusses the general topic of groundwater flow modeling, to a review of basic statistics necessary to properly apply regression techniques, and then to the main topic: exposition and use of linear and nonlinear regression to model groundwater flow. Statistical procedures are given to analyze and use the regression models. A number of exercises and answers are included to exercise the student on nearly all the methods that are presented for modeling and statistical analysis. Three computer programs implement the more complex methods. These three are a general two-dimensional, steady-state regression model for flow in an anisotropic, heterogeneous porous medium, a program to calculate a measure of model nonlinearity with respect to the regression parameters, and a program to analyze model errors in computed dependent variables such as hydraulic head. (USGS)
Eberly, Lynn E
2007-01-01
This chapter describes multiple linear regression, a statistical approach used to describe the simultaneous associations of several variables with one continuous outcome. Important steps in using this approach include estimation and inference, variable selection in model building, and assessing model fit. The special cases of regression with interactions among the variables, polynomial regression, regressions with categorical (grouping) variables, and separate slopes models are also covered. Examples in microbiology are used throughout. PMID:18450050
2015-09-09
The NCCS Regression Test Harness is a software package that provides a framework to perform regression and acceptance testing on NCCS High Performance Computers. The package is written in Python and has only the dependency of a Subversion repository to store the regression tests.
Orthogonal Regression and Equivariance.
ERIC Educational Resources Information Center
Blankmeyer, Eric
Ordinary least-squares regression treats the variables asymmetrically, designating a dependent variable and one or more independent variables. When it is not obvious how to make this distinction, a researcher may prefer to use orthogonal regression, which treats the variables symmetrically. However, the usual procedure for orthogonal regression is…
Unitary Response Regression Models
ERIC Educational Resources Information Center
Lipovetsky, S.
2007-01-01
The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…
Prediction in Multiple Regression.
ERIC Educational Resources Information Center
Osborne, Jason W.
2000-01-01
Presents the concept of prediction via multiple regression (MR) and discusses the assumptions underlying multiple regression analyses. Also discusses shrinkage, cross-validation, and double cross-validation of prediction equations and describes how to calculate confidence intervals around individual predictions. (SLD)
Improved Regression Calibration
ERIC Educational Resources Information Center
Skrondal, Anders; Kuha, Jouni
2012-01-01
The likelihood for generalized linear models with covariate measurement error cannot in general be expressed in closed form, which makes maximum likelihood estimation taxing. A popular alternative is regression calibration which is computationally efficient at the cost of inconsistent estimation. We propose an improved regression calibration…
Gerber, Samuel; Rübel, Oliver; Bremer, Peer-Timo; Pascucci, Valerio; Whitaker, Ross T.
2012-01-01
This paper introduces a novel partition-based regression approach that incorporates topological information. Partition-based regression typically introduces a quality-of-fit-driven decomposition of the domain. The emphasis in this work is on a topologically meaningful segmentation. Thus, the proposed regression approach is based on a segmentation induced by a discrete approximation of the Morse-Smale complex. This yields a segmentation with partitions corresponding to regions of the function with a single minimum and maximum that are often well approximated by a linear model. This approach yields regression models that are amenable to interpretation and have good predictive capacity. Typically, regression estimates are quantified by their geometrical accuracy. For the proposed regression, an important aspect is the quality of the segmentation itself. Thus, this paper introduces a new criterion that measures the topological accuracy of the estimate. The topological accuracy provides a complementary measure to the classical geometrical error measures and is very sensitive to over-fitting. The Morse-Smale regression is compared to state-of-the-art approaches in terms of geometry and topology and yields comparable or improved fits in many cases. Finally, a detailed study on climate-simulation data demonstrates the application of the Morse-Smale regression. Supplementary materials are available online and contain an implementation of the proposed approach in the R package msr, an analysis and simulations on the stability of the Morse-Smale complex approximation and additional tables for the climate-simulation study. PMID:23687424
Gerber, Samuel; Rubel, Oliver; Bremer, Peer-Timo; Pascucci, Valerio; Whitaker, Ross T.
2012-01-19
This paper introduces a novel partition-based regression approach that incorporates topological information. Partition-based regression typically introduces a quality-of-fit-driven decomposition of the domain. The emphasis in this work is on a topologically meaningful segmentation. Thus, the proposed regression approach is based on a segmentation induced by a discrete approximation of the Morse–Smale complex. This yields a segmentation with partitions corresponding to regions of the function with a single minimum and maximum that are often well approximated by a linear model. This approach yields regression models that are amenable to interpretation and have good predictive capacity. Typically, regression estimates are quantified by their geometrical accuracy. For the proposed regression, an important aspect is the quality of the segmentation itself. Thus, this article introduces a new criterion that measures the topological accuracy of the estimate. The topological accuracy provides a complementary measure to the classical geometrical error measures and is very sensitive to overfitting. The Morse–Smale regression is compared to state-of-the-art approaches in terms of geometry and topology and yields comparable or improved fits in many cases. Finally, a detailed study on climate-simulation data demonstrates the application of the Morse–Smale regression. Supplementary Materials are available online and contain an implementation of the proposed approach in the R package msr, an analysis and simulations on the stability of the Morse–Smale complex approximation, and additional tables for the climate-simulation study.
George: Gaussian Process regression
NASA Astrophysics Data System (ADS)
Foreman-Mackey, Daniel
2015-11-01
George is a fast and flexible library, implemented in C++ with Python bindings, for Gaussian Process regression useful for accounting for correlated noise in astronomical datasets, including those for transiting exoplanet discovery and characterization and stellar population modeling.
Multivariate Regression with Calibration*
Liu, Han; Wang, Lie; Zhao, Tuo
2014-01-01
We propose a new method named calibrated multivariate regression (CMR) for fitting high dimensional multivariate regression models. Compared to existing methods, CMR calibrates the regularization for each regression task with respect to its noise level so that it is simultaneously tuning insensitive and achieves an improved finite-sample performance. Computationally, we develop an efficient smoothed proximal gradient algorithm which has a worst-case iteration complexity O(1/ε), where ε is a pre-specified numerical accuracy. Theoretically, we prove that CMR achieves the optimal rate of convergence in parameter estimation. We illustrate the usefulness of CMR by thorough numerical simulations and show that CMR consistently outperforms other high dimensional multivariate regression methods. We also apply CMR on a brain activity prediction problem and find that CMR is as competitive as the handcrafted model created by human experts. PMID:25620861
Regression versus No Regression in the Autistic Disorder: Developmental Trajectories
ERIC Educational Resources Information Center
Bernabei, P.; Cerquiglini, A.; Cortesi, F.; D'Ardia, C.
2007-01-01
Developmental regression is a complex phenomenon which occurs in 20-49% of the autistic population. The aim of the study was to assess possible differences in the development of regressed and non-regressed autistic preschoolers. We longitudinally studied 40 autistic children (18 regressed, 22 non-regressed) aged 2-6 years. The following developmental…
Cortazar, E; Usobiaga, A; Fernández, L A; de Diego, A; Madariaga, J M
2002-02-01
A MATHEMATICA package, 'CONDU.M', has been developed to find the polynomial in concentration and temperature which best fits conductimetric data of the type (kappa, c, T) or (kappa, c1, c2, T) of electrolyte solutions (kappa: specific conductivity; ci: concentration of component i; T: temperature). In addition, an interface, 'TKONDU', has been written in the TCL/Tk language to facilitate the use of CONDU.M by an operator not familiar with MATHEMATICA. All this software is available online (UPV/EHU, 2001). 'CONDU.M' has been programmed to: (i) select the optimum polynomial degree in c1 and/or c2; (ii) compare models with linear or quadratic terms in temperature; (iii) calculate the set of adjustable parameters which best fits the data; (iv) simplify the model by eliminating adjustable parameters that were included a priori but turn out to have low statistical significance after the regression analysis; (v) facilitate the location of outlier data by graphical analysis of the residuals; and (vi) provide quantitative statistical information on the quality of the fit, allowing a critical comparison among different models. Owing to the multiple options offered, the software allows different conductivity models to be tested in a short time, even if a large set of conductivity data is being considered simultaneously. The user can then choose the best model using the graphical and statistical information provided in the output file. Although the program was initially designed to treat conductimetric data, it can also be applied to data with a similar structure, e.g. (P, c, T) or (P, c1, c2, T), where P is any appropriate transport, physical or thermodynamic property. PMID:11868914
Practical Session: Logistic Regression
NASA Astrophysics Data System (ADS)
Clausel, M.; Grégoire, G.
2014-12-01
An exercise is proposed to illustrate logistic regression. One investigates the different risk factors in the occurrence of coronary heart disease. It has been proposed in Chapter 5 of the book by D.G. Kleinbaum and M. Klein, "Logistic Regression", Statistics for Biology and Health, Springer Science Business Media, LLC (2010) and also by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr341.pdf). This example is based on data given in the file evans.txt coming from http://www.sph.emory.edu/dkleinb/logreg3.htm#data.
Explorations in Statistics: Regression
ERIC Educational Resources Information Center
Curran-Everett, Douglas
2011-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This seventh installment of "Explorations in Statistics" explores regression, a technique that estimates the nature of the relationship between two things for which we may only surmise a mechanistic or predictive connection.…
Modern Regression Discontinuity Analysis
ERIC Educational Resources Information Center
Bloom, Howard S.
2012-01-01
This article provides a detailed discussion of the theory and practice of modern regression discontinuity (RD) analysis for estimating the effects of interventions or treatments. Part 1 briefly chronicles the history of RD analysis and summarizes its past applications. Part 2 explains how in theory an RD analysis can identify an average effect of…
Webcast entitled Statistical Tools for Making Sense of Data, by the National Nutrient Criteria Support Center, N-STEPS (Nutrients-Scientific Technical Exchange Partnership). The section "Correlation and Regression" provides an overview of these two techniques in the context of nut...
Multiple linear regression analysis
NASA Technical Reports Server (NTRS)
Edwards, T. R.
1980-01-01
Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.
Mechanisms of neuroblastoma regression
Brodeur, Garrett M.; Bagatell, Rochelle
2014-01-01
Recent genomic and biological studies of neuroblastoma have shed light on the dramatic heterogeneity in the clinical behaviour of this disease, which spans from spontaneous regression or differentiation in some patients, to relentless disease progression in others, despite intensive multimodality therapy. This evidence also suggests several possible mechanisms to explain the phenomena of spontaneous regression in neuroblastomas, including neurotrophin deprivation, humoral or cellular immunity, loss of telomerase activity and alterations in epigenetic regulation. A better understanding of the mechanisms of spontaneous regression might help to identify optimal therapeutic approaches for patients with these tumours. Currently, the most druggable mechanism is the delayed activation of developmentally programmed cell death regulated by the tropomyosin receptor kinase A pathway. Indeed, targeted therapy aimed at inhibiting neurotrophin receptors might be used in lieu of conventional chemotherapy or radiation in infants with biologically favourable tumours that require treatment. Alternative approaches consist of breaking immune tolerance to tumour antigens or activating neurotrophin receptor pathways to induce neuronal differentiation. These approaches are likely to be most effective against biologically favourable tumours, but they might also provide insights into treatment of biologically unfavourable tumours. We describe the different mechanisms of spontaneous neuroblastoma regression and the consequent therapeutic approaches. PMID:25331179
Bayesian ARTMAP for regression.
Sasu, L M; Andonie, R
2013-10-01
Bayesian ARTMAP (BA) is a recently introduced neural architecture which uses a combination of Fuzzy ARTMAP competitive learning and Bayesian learning. Training is generally performed online, in a single-epoch. During training, BA creates input data clusters as Gaussian categories, and also infers the conditional probabilities between input patterns and categories, and between categories and classes. During prediction, BA uses Bayesian posterior probability estimation. So far, BA was used only for classification. The goal of this paper is to analyze the efficiency of BA for regression problems. Our contributions are: (i) we generalize the BA algorithm using the clustering functionality of both ART modules, and name it BA for Regression (BAR); (ii) we prove that BAR is a universal approximator with the best approximation property. In other words, BAR approximates arbitrarily well any continuous function (universal approximation) and, for every given continuous function, there is one in the set of BAR approximators situated at minimum distance (best approximation); (iii) we experimentally compare the online trained BAR with several neural models, on the following standard regression benchmarks: CPU Computer Hardware, Boston Housing, Wisconsin Breast Cancer, and Communities and Crime. Our results show that BAR is an appropriate tool for regression tasks, both for theoretical and practical reasons. PMID:23665468
Residuals and regression diagnostics: focusing on logistic regression.
Zhang, Zhongheng
2016-05-01
Up to now I have introduced most steps in regression model building and validation. The last step is to check whether there are observations that have a significant impact on the model coefficients and specification. The article first describes plotting Pearson residuals against predictors. Such plots are helpful in identifying non-linearity and provide hints on how to transform predictors. Next, I focus on outlying, high-leverage and influential observations that may have a significant impact on model building. An outlier is an observation whose response value is unusual conditional on its covariate pattern. A high-leverage observation is one whose covariate pattern is far away from the rest of the regressor space. Influence is the product of outlier and leverage. That is, when an influential observation is dropped from the model, there will be a significant shift of the coefficients. Summary statistics for outlier, leverage and influence are studentized residuals, hat values and Cook's distance. They can be easily visualized with graphs and formally tested using the car package. PMID:27294091
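The three summary statistics named in this abstract can be sketched for the simplest case. The example below is for ordinary simple linear regression in plain Python, not for logistic regression and not using the R car package that the article relies on; the function name and data are hypothetical. It uses the textbook formulas: hat value h_i = 1/n + (x_i - mean(x))^2 / Sxx, internally studentized residual r_i = e_i / sqrt(s^2 (1 - h_i)), and Cook's distance D_i = r_i^2 h_i / (p (1 - h_i)) with p = 2 coefficients.

```python
import math

def simple_ols_diagnostics(xs, ys):
    """Hat values, studentized residuals and Cook's distances for y = a + b*x."""
    n, p = len(xs), 2  # p counts the intercept and the slope
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    s2 = sum(e * e for e in resid) / (n - p)  # residual variance estimate
    hat = [1.0 / n + (x - x_bar) ** 2 / sxx for x in xs]
    student = [e / math.sqrt(s2 * (1.0 - h)) for e, h in zip(resid, hat)]
    cooks = [r * r * h / (p * (1.0 - h)) for r, h in zip(student, hat)]
    return hat, student, cooks

# Hypothetical data: the last point sits far from the others in x.
xs = [1, 2, 3, 4, 5, 20]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 9.0]
hat, student, cooks = simple_ols_diagnostics(xs, ys)
for i, (h, r, d) in enumerate(zip(hat, student, cooks)):
    print(i, round(h, 3), round(r, 3), round(d, 3))
```

The hat values sum to the number of fitted coefficients, which is a quick sanity check on any leverage calculation.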
"Once Nonlinear, Always Nonlinear"
NASA Astrophysics Data System (ADS)
Blackstock, David T.
2006-05-01
The phrase "Once nonlinear, always nonlinear" is attributed to David F. Pernet. In the 1970s he noticed that nonlinearly generated higher harmonic components (both tones and noise) don't decay as small signals, no matter how far the wave propagates. Despite being out of step with the then widespread notion that small-signal behavior is restored in "old age," Pernet's view is supported by the Burgers-equation solutions of the early 1960s. For a plane wave from a sinusoidally vibrating source in a thermoviscous fluid, the old-age decay of the nth harmonic is $e^{-n\alpha x}$, not $e^{-n^2\alpha x}$ (small-signal expectation), where α is the absorption coefficient at the fundamental frequency f and x is propagation distance. Moreover, for spherical waves (r the distance) the harmonic diminishes as $e^{-n\alpha x}/r^n$, not $e^{-n^2\alpha x}/r$. While not new, these results have special application to aircraft noise propagation, since the large propagation distances of interest imply old age. The virtual source model may be used to explain the "anomalous" decay rates. In old age most of the nth harmonic sound comes from virtual sources close to the receiver. Their strength is proportional to the nth power of the local fundamental amplitude, and that sets the decay law for the nth harmonic.
Ridge Regression: A Regression Procedure for Analyzing Correlated Independent Variables.
ERIC Educational Resources Information Center
Rakow, Ernest A.
Ridge regression is presented as an analytic technique to be used when predictor variables in a multiple linear regression situation are highly correlated, a situation which may result in unstable regression coefficients and difficulties in interpretation. Ridge regression avoids the problem of selection of variables that may occur in stepwise…
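To make the idea concrete, the following minimal sketch solves the ridge normal equations (X'X + lambda*I) b = X'y for two centered, highly correlated predictors. It is an illustration under stated assumptions, not the procedure from this report: the function name, the data and the penalty value lambda = 1 are all hypothetical, and no intercept is fitted because the variables are centered.

```python
def ridge_two_predictors(x1, x2, y, lam):
    """Solve (X'X + lam*I) b = X'y for two centered predictors (no intercept)."""
    a11 = sum(u * u for u in x1) + lam
    a22 = sum(v * v for v in x2) + lam
    a12 = sum(u * v for u, v in zip(x1, x2))
    c1 = sum(u * t for u, t in zip(x1, y))
    c2 = sum(v * t for v, t in zip(x2, y))
    det = a11 * a22 - a12 * a12
    # Cramer's rule for the 2x2 system.
    return (a22 * c1 - a12 * c2) / det, (a11 * c2 - a12 * c1) / det

# Hypothetical centered data; x2 is almost a copy of x1.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-2.1, -0.9, 0.1, 0.9, 2.0]
y = [-4.0, -2.1, 0.1, 1.9, 4.1]

b_ols = ridge_two_predictors(x1, x2, y, lam=0.0)    # lam = 0 is ordinary least squares
b_ridge = ridge_two_predictors(x1, x2, y, lam=1.0)  # penalized fit
print(b_ols, b_ridge)
```

With lam = 0 the determinant of X'X is close to zero for near-collinear predictors, which is the source of the unstable coefficients the abstract describes; adding lam to the diagonal moves the system away from singularity and shrinks the coefficient vector.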
Ridge Regression Signal Processing
NASA Technical Reports Server (NTRS)
Kuhl, Mark R.
1990-01-01
The introduction of the Global Positioning System (GPS) into the National Airspace System (NAS) necessitates the development of Receiver Autonomous Integrity Monitoring (RAIM) techniques. In order to guarantee a certain level of integrity, a thorough understanding of modern estimation techniques applied to navigational problems is required. The extended Kalman filter (EKF) is derived and analyzed under poor geometry conditions. It was found that the performance of the EKF is difficult to predict, since the EKF is designed for a Gaussian environment. A novel approach is implemented which incorporates ridge regression to explain the behavior of an EKF in the presence of dynamics under poor geometry conditions. The basic principles of ridge regression theory are presented, followed by the derivation of a linearized recursive ridge estimator. Computer simulations are performed to confirm the underlying theory and to provide a comparative analysis of the EKF and the recursive ridge estimator.
Fast Censored Linear Regression
HUANG, YIJIAN
2013-01-01
Weighted log-rank estimating function has become a standard estimation method for the censored linear regression model, or the accelerated failure time model. Well established statistically, the estimator defined as a consistent root has, however, rather poor computational properties because the estimating function is neither continuous nor, in general, monotone. We propose a computationally efficient estimator through an asymptotics-guided Newton algorithm, in which censored quantile regression methods are tailored to yield an initial consistent estimate and a consistent derivative estimate of the limiting estimating function. We also develop fast interval estimation with a new proposal for sandwich variance estimation. The proposed estimator is asymptotically equivalent to the consistent root estimator and barely distinguishable in samples of practical size. However, computation time is typically reduced by two to three orders of magnitude for point estimation alone. Illustrations with clinical applications are provided. PMID:24347802
Orthogonal Regression: A Teaching Perspective
ERIC Educational Resources Information Center
Carr, James R.
2012-01-01
A well-known approach to linear least squares regression is that which involves minimizing the sum of squared orthogonal projections of data points onto the best fit line. This form of regression is known as orthogonal regression, and the linear model that it yields is known as the major axis. A similar method, reduced major axis regression, is…
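A short sketch of the major axis for two variables follows, using the standard closed-form slope (Syy - Sxx + sqrt((Syy - Sxx)^2 + 4 Sxy^2)) / (2 Sxy). The abstract is truncated and does not give a formula, so this is a textbook version under that assumption; the function name and data are hypothetical.

```python
import math

def major_axis(xs, ys):
    """Slope and intercept of the orthogonal-regression (major axis) line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if sxy == 0:
        raise ValueError("major axis slope is undefined when Sxy is zero")
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    return slope, y_bar - slope * x_bar

# Hypothetical noisy data.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.2, 0.9, 2.3, 2.8, 4.1]
slope_xy, intercept_xy = major_axis(xs, ys)
slope_yx, _ = major_axis(ys, xs)  # swap the roles of the two variables
print(slope_xy, slope_yx)
```

Unlike ordinary least squares, swapping the two variables gives the reciprocal slope, so both calls describe the same line; this is the symmetric treatment of the variables that distinguishes orthogonal regression.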
Correlation and simple linear regression.
Eberly, Lynn E
2007-01-01
This chapter highlights important steps in using correlation and simple linear regression to address scientific questions about the association of two continuous variables with each other. These steps include estimation and inference, assessing model fit, the connection between regression and ANOVA, and study design. Examples in microbiology are used throughout. This chapter provides a framework that is helpful in understanding more complex statistical techniques, such as multiple linear regression, linear mixed effects models, logistic regression, and proportional hazards regression. PMID:18450049
Incremental hierarchical discriminant regression.
Weng, Juyang; Hwang, Wey-Shiuan
2007-03-01
This paper presents incremental hierarchical discriminant regression (IHDR) which incrementally builds a decision tree or regression tree for very high-dimensional regression or decision spaces by an online, real-time learning system. Biologically motivated, it is an approximate computational model for automatic development of associative cortex, with both bottom-up sensory inputs and top-down motor projections. At each internal node of the IHDR tree, information in the output space is used to automatically derive the local subspace spanned by the most discriminating features. Embedded in the tree is a hierarchical probability distribution model used to prune very unlikely cases during the search. The number of parameters in the coarse-to-fine approximation is dynamic and data-driven, enabling the IHDR tree to automatically fit data with unknown distribution shapes (thus, it is difficult to select the number of parameters up front). The IHDR tree dynamically assigns long-term memory to avoid the loss-of-memory problem typical with a global-fitting learning algorithm for neural networks. A major challenge for an incrementally built tree is that the number of samples varies arbitrarily during the construction process. An incrementally updated probability model, called sample-size-dependent negative-log-likelihood (SDNLL) metric is used to deal with large sample-size cases, small sample-size cases, and unbalanced sample-size cases, measured among different internal nodes of the IHDR tree. We report experimental results for four types of data: synthetic data to visualize the behavior of the algorithms, large face image data, continuous video stream from robot navigation, and publicly available data sets that use human defined features. PMID:17385628
Regression Segmentation for M³ Spinal Images.
Wang, Zhijie; Zhen, Xiantong; Tay, KengYeow; Osman, Said; Romano, Walter; Li, Shuo
2015-08-01
Clinical routine often requires analyzing spinal images of multiple anatomic structures in multiple anatomic planes from multiple imaging modalities (M³). Unfortunately, existing methods for segmenting spinal images are still limited to one specific structure, in one specific plane or from one specific modality (S³). In this paper, we propose a novel approach, Regression Segmentation, that is for the first time able to segment M³ spinal images in one single unified framework. This approach formulates the segmentation task innovatively as a boundary regression problem: modeling a highly nonlinear mapping function from substantially diverse M³ images directly to desired object boundaries. Leveraging the advancement of sparse kernel machines, regression segmentation is fulfilled by a multi-dimensional support vector regressor (MSVR) which operates in an implicit, high dimensional feature space where M³ diversity and specificity can be systematically categorized, extracted, and handled. The proposed regression segmentation approach was thoroughly tested on images from 113 clinical subjects including both disc and vertebral structures, in both sagittal and axial planes, and from both MRI and CT modalities. The overall result reaches a high Dice similarity index (DSI) of 0.912 and a low boundary distance (BD) of 0.928 mm. With our unified and expandable framework, an efficient clinical tool for M³ spinal image segmentation can be easily achieved, and will substantially benefit the diagnosis and treatment of spinal diseases. PMID:25361503
NASA Technical Reports Server (NTRS)
Kuhl, Mark R.
1990-01-01
Current navigation requirements depend on a geometric dilution of precision (GDOP) criterion. As long as the GDOP stays below a specific value, navigation requirements are met. The GDOP will exceed the specified value when the measurement geometry becomes too collinear. A new signal processing technique, called Ridge Regression Processing, can reduce the effects of nearly collinear measurement geometry; thereby reducing the inflation of the measurement errors. It is shown that the Ridge signal processor gives a consistently better mean squared error (MSE) in position than the Ordinary Least Mean Squares (OLS) estimator. The applicability of this technique is currently being investigated to improve the following areas: receiver autonomous integrity monitoring (RAIM), coverage requirements, availability requirements, and precision approaches.
Zhao, Ni; Chen, Jun; Carroll, Ian M.; Ringel-Kulka, Tamar; Epstein, Michael P.; Zhou, Hua; Zhou, Jin J.; Ringel, Yehuda; Li, Hongzhe; Wu, Michael C.
2015-01-01
High-throughput sequencing technology has enabled population-based studies of the role of the human microbiome in disease etiology and exposure response. Distance-based analysis is a popular strategy for evaluating the overall association between microbiome diversity and outcome, wherein the phylogenetic distance between individuals’ microbiome profiles is computed and tested for association via permutation. Despite their practical popularity, distance-based approaches suffer from important challenges, especially in selecting the best distance and extending the methods to alternative outcomes, such as survival outcomes. We propose the microbiome regression-based kernel association test (MiRKAT), which directly regresses the outcome on the microbiome profiles via the semi-parametric kernel machine regression framework. MiRKAT allows for easy covariate adjustment and extension to alternative outcomes while non-parametrically modeling the microbiome through a kernel that incorporates phylogenetic distance. It uses a variance-component score statistic to test for the association with analytical p value calculation. The model also allows simultaneous examination of multiple distances, alleviating the problem of choosing the best distance. Our simulations demonstrated that MiRKAT provides correctly controlled type I error and adequate power in detecting overall association. “Optimal” MiRKAT, which considers multiple candidate distances, is robust in that it suffers from little power loss in comparison to when the best distance is used and can achieve tremendous power gain in comparison to when a poor distance is chosen. Finally, we applied MiRKAT to real microbiome datasets to show that microbial communities are associated with smoking and with fecal protease levels after confounders are controlled for. PMID:25957468
Zhao, Ni; Chen, Jun; Carroll, Ian M; Ringel-Kulka, Tamar; Epstein, Michael P; Zhou, Hua; Zhou, Jin J; Ringel, Yehuda; Li, Hongzhe; Wu, Michael C
2015-05-01
High-throughput sequencing technology has enabled population-based studies of the role of the human microbiome in disease etiology and exposure response. Distance-based analysis is a popular strategy for evaluating the overall association between microbiome diversity and outcome, wherein the phylogenetic distance between individuals' microbiome profiles is computed and tested for association via permutation. Despite their practical popularity, distance-based approaches suffer from important challenges, especially in selecting the best distance and extending the methods to alternative outcomes, such as survival outcomes. We propose the microbiome regression-based kernel association test (MiRKAT), which directly regresses the outcome on the microbiome profiles via the semi-parametric kernel machine regression framework. MiRKAT allows for easy covariate adjustment and extension to alternative outcomes while non-parametrically modeling the microbiome through a kernel that incorporates phylogenetic distance. It uses a variance-component score statistic to test for the association with analytical p value calculation. The model also allows simultaneous examination of multiple distances, alleviating the problem of choosing the best distance. Our simulations demonstrated that MiRKAT provides correctly controlled type I error and adequate power in detecting overall association. "Optimal" MiRKAT, which considers multiple candidate distances, is robust in that it suffers from little power loss in comparison to when the best distance is used and can achieve tremendous power gain in comparison to when a poor distance is chosen. Finally, we applied MiRKAT to real microbiome datasets to show that microbial communities are associated with smoking and with fecal protease levels after confounders are controlled for. PMID:25957468
Recursive Algorithm For Linear Regression
NASA Technical Reports Server (NTRS)
Varanasi, S. V.
1988-01-01
Order of model determined easily. Linear-regression algorithm includes recursive equations for coefficients of model of increased order. Algorithm eliminates duplicative calculations, facilitates search for minimum order of linear-regression model fitting set of data satisfactorily.
Lees, Mackenzie C.; Merani, Shaheed; Tauh, Keerit; Khadaroo, Rachel G.
2015-01-01
Background Older adults (≥ 65 yr) are the fastest growing population and are presenting in increasing numbers for acute surgical care. Emergency surgery is frequently life threatening for older patients. Our objective was to identify predictors of mortality and poor outcome among elderly patients undergoing emergency general surgery. Methods We conducted a retrospective cohort study of patients aged 65–80 years undergoing emergency general surgery between 2009 and 2010 at a tertiary care centre. Demographics, comorbidities, in-hospital complications, mortality and disposition characteristics of patients were collected. Logistic regression analysis was used to identify covariate-adjusted predictors of in-hospital mortality and discharge of patients home. Results Our analysis included 257 patients with a mean age of 72 years; 52% were men. In-hospital mortality was 12%. Mortality was associated with patients who had higher American Society of Anesthesiologists (ASA) class (odds ratio [OR] 3.85, 95% confidence interval [CI] 1.43–10.33, p = 0.008) and in-hospital complications (OR 1.93, 95% CI 1.32–2.83, p = 0.001). Nearly two-thirds of patients discharged home were younger (OR 0.92, 95% CI 0.85–0.99, p = 0.036), had lower ASA class (OR 0.45, 95% CI 0.27–0.74, p = 0.002) and fewer in-hospital complications (OR 0.69, 95% CI 0.53–0.90, p = 0.007). Conclusion American Society of Anesthesiologists class and in-hospital complications are perioperative predictors of mortality and disposition in the older surgical population. Understanding the predictors of poor outcome and the importance of preventing in-hospital complications in older patients will have important clinical utility in terms of preoperative counselling, improving health care and discharging patients home. PMID:26204143
A Version of Quadratic Regression with Interpretable Parameters.
ERIC Educational Resources Information Center
Cudeck, Robert; du Toit, Stephen H. C.
2002-01-01
Suggests an alternative form of the quadratic model that has the same expectation function of the original model but has the useful feature that its parameters are interpretable. Provides examples of a simple regression problem and a nonlinear mixed-effects model. (SLD)
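A minimal sketch of one such reparameterization, assuming the alternative form replaces the usual coefficients with the intercept, the location of the turning point, and the value of the curve at the turning point. The abstract does not state the exact form, so the function names and the parameterization below are illustrative assumptions, not necessarily the authors' published model.

```python
def to_interpretable(b0, b1, b2):
    """Map y = b0 + b1*x + b2*x**2 to (a0, ax, ay).

    a0: intercept, ax: x at the turning point, ay: y at the turning point.
    Assumes b2 != 0 and b1 != 0 (so that ax is finite and nonzero).
    """
    ax = -b1 / (2.0 * b2)
    ay = b0 - b1 ** 2 / (4.0 * b2)
    return b0, ax, ay


def quad_standard(x, b0, b1, b2):
    """Usual quadratic expectation function."""
    return b0 + b1 * x + b2 * x ** 2


def quad_interpretable(x, a0, ax, ay):
    """Same curve written in the (hypothetical) interpretable parameters."""
    return ay - (ay - a0) * (x / ax - 1.0) ** 2


if __name__ == "__main__":
    params = to_interpretable(1.0, 4.0, -1.0)
    print(params)
    print(quad_standard(3.0, 1.0, 4.0, -1.0), quad_interpretable(3.0, *params))
```

Both functions describe the same expectation function; only the meaning of the fitted parameters changes, which is the property the abstract highlights.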
Logarithmic Transformations in Regression: Do You Transform Back Correctly?
ERIC Educational Resources Information Center
Dambolena, Ismael G.; Eriksen, Steven E.; Kopcso, David P.
2009-01-01
The logarithmic transformation is often used in regression analysis for a variety of purposes such as the linearization of a nonlinear relationship between two or more variables. We have noticed that when this transformation is applied to the response variable, the computation of the point estimate of the conditional mean of the original response…
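The abstract is truncated, so the following is only an illustrative sketch of the well-known issue it points toward: exponentiating a fitted value on the log scale estimates the conditional median of the original response, not its conditional mean. Assuming normally distributed errors on the log scale, the mean is exp(mu + sigma^2 / 2). The helper names and the simulation settings are hypothetical.

```python
import math
import random


def naive_backtransform(log_pred):
    """exp of the log-scale prediction (estimates the median under normal errors)."""
    return math.exp(log_pred)


def lognormal_mean_backtransform(log_pred, resid_var):
    """Mean on the original scale, assuming normal errors on the log scale."""
    return math.exp(log_pred + resid_var / 2.0)


def simulate(mu=1.0, sigma=0.8, n=100_000, seed=0):
    """Compare both back-transforms with the sample mean of simulated data."""
    rng = random.Random(seed)
    logs = [rng.gauss(mu, sigma) for _ in range(n)]
    mean_log = sum(logs) / n
    var_log = sum((v - mean_log) ** 2 for v in logs) / (n - 1)
    sample_mean = sum(math.exp(v) for v in logs) / n
    return (sample_mean,
            naive_backtransform(mean_log),
            lognormal_mean_backtransform(mean_log, var_log))


if __name__ == "__main__":
    sample_mean, naive, corrected = simulate()
    print(f"sample mean={sample_mean:.3f} naive={naive:.3f} corrected={corrected:.3f}")
```

In this intercept-only setting the naive value falls well below the sample mean, while the corrected value tracks it closely; the same correction applies to fitted values from a regression on the log scale when the normal-error assumption is reasonable.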
Multinomial logistic regression ensembles.
Lee, Kyewon; Ahn, Hongshik; Moon, Hojin; Kodell, Ralph L; Chen, James J
2013-05-01
This article proposes a method for multiclass classification problems using ensembles of multinomial logistic regression models. A multinomial logit model is used as a base classifier in ensembles from random partitions of predictors. The multinomial logit model can be applied to each mutually exclusive subset of the feature space without variable selection. By combining multiple models the proposed method can handle a huge database without a constraint needed for analyzing high-dimensional data, and the random partition can improve the prediction accuracy by reducing the correlation among base classifiers. The proposed method is implemented using R, and the performance including overall prediction accuracy, sensitivity, and specificity for each category is evaluated on two real data sets and simulation data sets. To investigate the quality of prediction in terms of sensitivity and specificity, the area under the receiver operating characteristic (ROC) curve (AUC) is also examined. The performance of the proposed model is compared to a single multinomial logit model and it shows a substantial improvement in overall prediction accuracy. The proposed method is also compared with other classification methods such as the random forest, support vector machines, and random multinomial logit model. PMID:23611203
Bayesian Spatial Quantile Regression
Reich, Brian J.; Fuentes, Montserrat; Dunson, David B.
2013-01-01
Tropospheric ozone is one of the six criteria pollutants regulated by the United States Environmental Protection Agency under the Clean Air Act and has been linked with several adverse health effects, including mortality. Due to the strong dependence on weather conditions, ozone may be sensitive to climate change and there is great interest in studying the potential effect of climate change on ozone, and how this change may affect public health. In this paper we develop a Bayesian spatial model to predict ozone under different meteorological conditions, and use this model to study spatial and temporal trends and to forecast ozone concentrations under different climate scenarios. We develop a spatial quantile regression model that does not assume normality and allows the covariates to affect the entire conditional distribution, rather than just the mean. The conditional distribution is allowed to vary from site-to-site and is smoothed with a spatial prior. For extremely large datasets our model is computationally infeasible, and we develop an approximate method. We apply the approximate version of our model to summer ozone from 1997–2005 in the Eastern U.S., and use deterministic climate models to project ozone under future climate conditions. Our analysis suggests that holding all other factors fixed, an increase in daily average temperature will lead to the largest increase in ozone in the Industrial Midwest and Northeast. PMID:23459794
Bayesian Spatial Quantile Regression.
Reich, Brian J; Fuentes, Montserrat; Dunson, David B
2011-03-01
Tropospheric ozone is one of the six criteria pollutants regulated by the United States Environmental Protection Agency under the Clean Air Act and has been linked with several adverse health effects, including mortality. Due to the strong dependence on weather conditions, ozone may be sensitive to climate change and there is great interest in studying the potential effect of climate change on ozone, and how this change may affect public health. In this paper we develop a Bayesian spatial model to predict ozone under different meteorological conditions, and use this model to study spatial and temporal trends and to forecast ozone concentrations under different climate scenarios. We develop a spatial quantile regression model that does not assume normality and allows the covariates to affect the entire conditional distribution, rather than just the mean. The conditional distribution is allowed to vary from site-to-site and is smoothed with a spatial prior. For extremely large datasets our model is computationally infeasible, and we develop an approximate method. We apply the approximate version of our model to summer ozone from 1997-2005 in the Eastern U.S., and use deterministic climate models to project ozone under future climate conditions. Our analysis suggests that holding all other factors fixed, an increase in daily average temperature will lead to the largest increase in ozone in the Industrial Midwest and Northeast. PMID:23459794
Luo, Chongliang; Liu, Jin; Dey, Dipak K; Chen, Kun
2016-07-01
In many fields, multi-view datasets, measuring multiple distinct but interrelated sets of characteristics on the same set of subjects, together with data on certain outcomes or phenotypes, are routinely collected. The objective in such a problem is often two-fold: both to explore the association structures of multiple sets of measurements and to develop a parsimonious model for predicting the future outcomes. We study a unified canonical variate regression framework to tackle the two problems simultaneously. The proposed criterion integrates multiple canonical correlation analysis with predictive modeling, balancing between the association strength of the canonical variates and their joint predictive power on the outcomes. Moreover, the proposed criterion seeks multiple sets of canonical variates simultaneously to enable the examination of their joint effects on the outcomes, and is able to handle multivariate and non-Gaussian outcomes. An efficient algorithm based on variable splitting and Lagrangian multipliers is proposed. Simulation studies show the superior performance of the proposed approach. We demonstrate the effectiveness of the proposed approach in an [Formula: see text] intercross mice study and an alcohol dependence study. PMID:26861909
Linear regression in astronomy. I
NASA Technical Reports Server (NTRS)
Isobe, Takashi; Feigelson, Eric D.; Akritas, Michael G.; Babu, Gutti Jogesh
1990-01-01
Five methods for obtaining linear regression fits to bivariate data with unknown or insignificant measurement errors are discussed: ordinary least-squares (OLS) regression of Y on X, OLS regression of X on Y, the bisector of the two OLS lines, orthogonal regression, and 'reduced major-axis' regression. These methods have been used by various researchers in observational astronomy, most importantly in cosmic distance scale applications. Formulas for calculating the slope and intercept coefficients and their uncertainties are given for all the methods, including a new general form of the OLS variance estimates. The accuracy of the formulas was confirmed using numerical simulations. The applicability of the procedures is discussed with respect to their mathematical properties, the nature of the astronomical data under consideration, and the scientific purpose of the regression. It is found that, for problems needing symmetrical treatment of the variables, the OLS bisector performs significantly better than orthogonal or reduced major-axis regression.
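The five slope estimates listed above can be sketched as follows. This is a minimal illustration using the standard textbook expressions in terms of the sample sums of squares and cross-products; the function name is hypothetical, the uncertainty formulas from the paper are not reproduced, and the code assumes a nonzero sample covariance.

```python
import math


def five_slopes(x, y):
    """Return {method: (slope, intercept)} for five bivariate line fits."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sign = math.copysign(1.0, sxy)

    b1 = sxy / sxx                      # OLS of Y on X
    b2 = syy / sxy                      # OLS of X on Y, expressed as dY/dX
    bisector = (b1 * b2 - 1.0
                + math.sqrt((1.0 + b1 ** 2) * (1.0 + b2 ** 2))) / (b1 + b2)
    d = b2 - 1.0 / b1
    orthogonal = 0.5 * (d + sign * math.sqrt(4.0 + d ** 2))
    rma = sign * math.sqrt(b1 * b2)     # reduced major axis

    slopes = {"ols_y_on_x": b1, "ols_x_on_y": b2, "bisector": bisector,
              "orthogonal": orthogonal, "reduced_major_axis": rma}
    # Every line passes through the centroid (mx, my).
    return {name: (b, my - b * mx) for name, b in slopes.items()}


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [2.1, 3.9, 6.2, 7.8, 10.1]
    for name, (slope, intercept) in five_slopes(xs, ys).items():
        print(f"{name:>20s}: slope={slope:.4f} intercept={intercept:.4f}")
```

For perfectly collinear data all five slopes coincide; with scatter, the three symmetric estimates fall between the two OLS slopes.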
Evaluating differential effects using regression interactions and regression mixture models
Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung
2015-01-01
Research increasingly emphasizes understanding differential effects. This paper focuses on understanding regression mixture models, a relatively new statistical method for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their formulation, and their assumptions are compared using Monte Carlo simulations and real data analysis. The capabilities of regression mixture models are described and specific issues to be addressed when conducting regression mixtures are proposed. The paper aims to clarify the role that regression mixtures can take in the estimation of differential effects and increase awareness of the benefits and potential pitfalls of this approach. Regression mixture models are shown to be a potentially effective exploratory method for finding differential effects when these effects can be defined by a small number of classes of respondents who share a typical relationship between a predictor and an outcome. It is also shown that the comparison between regression mixture models and interactions becomes substantially more complex as the number of classes increases. It is argued that regression interactions are well suited for direct tests of specific hypotheses about differential effects and regression mixtures provide a useful approach for exploring effect heterogeneity given adequate samples and study design. PMID:26556903
Technology Transfer Automated Retrieval System (TEKTRAN)
Advanced mathematical models have the potential to capture the complex metabolic and physiological processes that result in heat production, or energy expenditure (EE). Multivariate adaptive regression splines (MARS) is a nonparametric method that estimates complex nonlinear relationships by a seri...
Linear regression in astronomy. II
NASA Technical Reports Server (NTRS)
Feigelson, Eric D.; Babu, Gutti J.
1992-01-01
A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.
Quantile regression for climate data
NASA Astrophysics Data System (ADS)
Marasinghe, Dilhani Shalika
Quantile regression is a developing statistical tool which is used to explain the relationship between response and predictor variables. This thesis describes two examples from climatology using quantile regression. Our main goal is to estimate derivatives of a conditional mean and/or conditional quantile function. We introduce a method to handle autocorrelation in the framework of quantile regression and use it with the temperature data. We also explain some properties of the tornado data, which are non-normally distributed. Even though quantile regression provides a more comprehensive view, when the residuals satisfy the normality and constant variance assumptions, we would prefer least squares regression for our temperature analysis. When dealing with non-normality and non-constant variance, quantile regression is a better candidate for the estimation of the derivative.
Evaluating Differential Effects Using Regression Interactions and Regression Mixture Models
ERIC Educational Resources Information Center
Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung
2015-01-01
Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their…
Retro-regression--another important multivariate regression improvement.
Randić, M
2001-01-01
We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when in a stepwise regression a descriptor is included or excluded from a regression. The consequence is an unpredictable change of the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. This process typically causes at different steps of the stepwise regression a replacement of several previously used descriptors by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined showing how it resolves the ambiguities associated with both "nightmares" of the first and the second kind of MRA. PMID:11410035
Nonparametric Covariate-Adjusted Association Tests Based on the Generalized Kendall’s Tau
Zhu, Wensheng; Jiang, Yuan; Zhang, Heping
2012-01-01
Identifying the risk factors for comorbidity is important in psychiatric research. Empirically, studies have shown that testing multiple, correlated traits simultaneously is more powerful than testing a single trait at a time in association analysis. Furthermore, for complex diseases, especially mental illnesses and behavioral disorders, the traits are often recorded in different scales such as dichotomous, ordinal and quantitative. In the absence of covariates, nonparametric association tests have been developed for multiple complex traits to study comorbidity. However, genetic studies generally contain measurements of some covariates that may affect the relationship between the risk factors of major interest (such as genes) and the outcomes. While it is relatively easy to adjust these covariates in a parametric model for quantitative traits, it is challenging for multiple complex traits with possibly different scales. In this article, we propose a nonparametric test for multiple complex traits that can adjust for covariate effects. The test aims to achieve an optimal scheme of adjustment by using a maximum statistic calculated from multiple adjusted test statistics. We derive the asymptotic null distribution of the maximum test statistic, and also propose a resampling approach, both of which can be used to assess the significance of our test. Simulations are conducted to compare the type I error and power of the nonparametric adjusted test to the unadjusted test and other existing adjusted tests. The empirical results suggest that our proposed test increases the power through adjustment for covariates when there exist environmental effects, and is more robust to model misspecifications than some existing parametric adjusted tests. We further demonstrate the advantage of our test by analyzing a data set on genetics of alcoholism. PMID:22745516
Neumann, Anke; Billionnet, Cécile
2016-06-01
In observational studies without random assignment of the treatment, the unadjusted comparison between treatment groups may be misleading due to confounding. One method to adjust for measured confounders is inverse probability of treatment weighting. This method can also be used in the analysis of time to event data with competing risks. Competing risks arise if for some individuals the event of interest is precluded by a different type of event occurring before, or if only the earliest of several times to event, corresponding to different event types, is observed or is of interest. In the presence of competing risks, time to event data are often characterized by cumulative incidence functions, one for each event type of interest. We describe the use of inverse probability of treatment weighting to create adjusted cumulative incidence functions. This method is equivalent to direct standardization when the weight model is saturated. No assumptions about the form of the cumulative incidence functions are required. The method allows studying associations between treatment and the different types of event under study, while focusing on the earliest event only. We present a SAS macro implementing this method and we provide a worked example. PMID:27084321
ERIC Educational Resources Information Center
Furtwengler, Scott R.
2015-01-01
The present study sought to determine the extent to which participation in a post-secondary honors program affected academic achievement. Archival data were collected on three cohorts of high-achieving students at a large public university. Propensity scores were calculated on factors predicting participation in honors and used as the covariate.…
A consistent local linear estimator of the covariate adjusted correlation coefficient
Nguyen, Danh V.; Şentürk, Damla
2009-01-01
Consider the correlation between two random variables (X, Y), both not directly observed. One only observes X̃ = φ1(U)X + φ2(U) and Ỹ = ψ1(U)Y + ψ2(U), where all four functions {φl(·),ψl(·), l = 1, 2} are unknown/unspecified smooth functions of an observable covariate U. We consider consistent estimation of the correlation between the unobserved variables X and Y, adjusted for the above general dual additive and multiplicative effects of U, based on the observed data (X̃, Ỹ, U). PMID:21720454
Precision Efficacy Analysis for Regression.
ERIC Educational Resources Information Center
Brooks, Gordon P.
When multiple linear regression is used to develop a prediction model, sample size must be large enough to ensure stable coefficients. If the derivation sample size is inadequate, the model may not predict well for future subjects. The precision efficacy analysis for regression (PEAR) method uses a cross-validity approach to select sample sizes…
Ecological Regression and Voting Rights.
ERIC Educational Resources Information Center
Freedman, David A.; And Others
1991-01-01
The use of ecological regression in voting rights cases is discussed in the context of a lawsuit against Los Angeles County (California) in 1990. Ecological regression assumes that systematic voting differences between precincts are explained by ethnic differences. An alternative neighborhood model is shown to lead to different conclusions. (SLD)
Logistic Regression: Concept and Application
ERIC Educational Resources Information Center
Cokluk, Omay
2010-01-01
The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…
Fungible weights in logistic regression.
Jones, Jeff A; Waller, Niels G
2016-06-01
In this article we develop methods for assessing parameter sensitivity in logistic regression models. To set the stage for this work, we first review Waller's (2008) equations for computing fungible weights in linear regression. Next, we describe 2 methods for computing fungible weights in logistic regression. To demonstrate the utility of these methods, we compute fungible logistic regression weights using data from the Centers for Disease Control and Prevention's (2010) Youth Risk Behavior Surveillance Survey, and we illustrate how these alternate weights can be used to evaluate parameter sensitivity. To make our work accessible to the research community, we provide R code (R Core Team, 2015) that will generate both kinds of fungible logistic regression weights. PMID:26651981
[Regression grading in gastrointestinal tumors].
Tischoff, I; Tannapfel, A
2012-02-01
Preoperative neoadjuvant chemoradiation therapy is a well-established and essential part of the interdisciplinary treatment of gastrointestinal tumors. Neoadjuvant treatment leads to regressive changes in tumors. To evaluate the histological tumor response, different scoring systems describing regressive changes are used and known as tumor regression grading. Tumor regression grading is usually based on the presence of residual vital tumor cells in proportion to the total tumor size. Currently, no nationally or internationally accepted grading systems exist. In general, common guidelines should be used in the pathohistological diagnostics of tumors after neoadjuvant therapy. In particular, the standard tumor grading will be replaced by tumor regression grading. Furthermore, tumors after neoadjuvant treatment are marked with the prefix "y" in the TNM classification. PMID:22293790
Practical Session: Simple Linear Regression
NASA Astrophysics Data System (ADS)
Clausel, M.; Grégoire, G.
2014-12-01
Two exercises are proposed to illustrate simple linear regression. The first one is based on Galton's famous data set on heredity. We use the lm R command and get coefficient estimates, the standard error of the error, R2, residuals, … In the second example, devoted to data related to the vapor tension of mercury, we fit a simple linear regression, predict values, and look ahead to multiple linear regression. This practical session is an excerpt from practical exercises proposed by A. Dalalyan at EPNC (see Exercises 1 and 2 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_4.pdf).
Splines for Diffeomorphic Image Regression
Singh, Nikhil; Niethammer, Marc
2016-01-01
This paper develops a method for splines on diffeomorphisms for image regression. In contrast to previously proposed methods to capture image changes over time, such as geodesic regression, the method can capture more complex spatio-temporal deformations. In particular, it is a first step towards capturing periodic motions for example of the heart or the lung. Starting from a variational formulation of splines the proposed approach allows for the use of temporal control points to control spline behavior. This necessitates the development of a shooting formulation for splines. Experimental results are shown for synthetic and real data. The performance of the method is compared to geodesic regression. PMID:25485370
Abstract Expression Grammar Symbolic Regression
NASA Astrophysics Data System (ADS)
Korns, Michael F.
This chapter examines the use of Abstract Expression Grammars to perform the entire Symbolic Regression process without the use of Genetic Programming per se. The techniques explored produce a symbolic regression engine which has absolutely no bloat, which allows total user control of the search space and output formulas, and which is faster and more accurate than the engines produced in our previous papers using Genetic Programming. The genome is an all vector structure with four chromosomes plus additional epigenetic and constraint vectors, allowing total user control of the search space and the final output formulas. A combination of specialized compiler techniques, genetic algorithms, particle swarm, aged layered populations, plus discrete and continuous differential evolution are used to produce an improved symbolic regression system. Nine base test cases, from the literature, are used to test the improvement in speed and accuracy. The improved results indicate that these techniques move us a big step closer toward future industrial strength symbolic regression systems.
Multiple Regression and Its Discontents
ERIC Educational Resources Information Center
Snell, Joel C.; Marsh, Mitchell
2012-01-01
Multiple regression is part of a larger statistical strategy originated by Gauss. The authors raise questions about the theory and suggest some changes that would make room for Mandelbrot and Serendipity.
Time-Warped Geodesic Regression
Hong, Yi; Singh, Nikhil; Kwitt, Roland; Niethammer, Marc
2016-01-01
We consider geodesic regression with parametric time-warps. This allows, for example, to capture saturation effects as typically observed during brain development or degeneration. While highly-flexible models to analyze time-varying image and shape data based on generalizations of splines and polynomials have been proposed recently, they come at the cost of substantially more complex inference. Our focus in this paper is therefore to keep the model and its inference as simple as possible while allowing to capture expected biological variation. We demonstrate that by augmenting geodesic regression with parametric time-warp functions, we can achieve comparable flexibility to more complex models while retaining model simplicity. In addition, the time-warp parameters provide useful information of underlying anatomical changes as demonstrated for the analysis of corpora callosa and rat calvariae. We exemplify our strategy for shape regression on the Grassmann manifold, but note that the method is generally applicable for time-warped geodesic regression. PMID:25485368
Basis Selection for Wavelet Regression
NASA Technical Reports Server (NTRS)
Wheeler, Kevin R.; Lau, Sonie (Technical Monitor)
1998-01-01
A wavelet basis selection procedure is presented for wavelet regression. Both the basis and the threshold are selected using cross-validation. The method includes the capability of incorporating prior knowledge on the smoothness (or shape of the basis functions) into the basis selection procedure. The results of the method are demonstrated on sampled functions widely used in the wavelet regression literature. The results of the method are contrasted with other published methods.
Regression methods for spatial data
NASA Technical Reports Server (NTRS)
Yakowitz, S. J.; Szidarovszky, F.
1982-01-01
The kriging approach, a parametric regression method used by hydrologists and mining engineers, among others, also provides an error estimate for the integral of the regression function. The kriging method is explored and some of its statistical characteristics are described. The Watson method and theory are extended so that the kriging features are displayed. Theoretical and computational comparisons of the kriging and Watson approaches are offered.
Wrong Signs in Regression Coefficients
NASA Technical Reports Server (NTRS)
McGee, Holly
1999-01-01
When using parametric cost estimation, it is important to note the possibility of the regression coefficients having the wrong sign. A wrong sign is defined as a sign on the regression coefficient opposite to the researcher's intuition and experience. Some possible causes for the wrong sign discussed in this paper are a small range of x's, leverage points, missing variables, multicollinearity, and computational error. Additionally, techniques for determining the cause of the wrong sign are given.
Forward model nonlinearity versus inverse model nonlinearity
Mehl, S.
2007-01-01
The issue of concern is the impact of forward model nonlinearity on the nonlinearity of the inverse model. The question posed is, "Does increased nonlinearity in the head solution (forward model) always result in increased nonlinearity in the inverse solution (estimation of hydraulic conductivity)?" It is shown that the two nonlinearities are separate, and it is not universally true that increased forward model nonlinearity increases inverse model nonlinearity. © 2007 National Ground Water Association.
NASA Astrophysics Data System (ADS)
Sjöberg, Daniel
2003-04-01
We investigate the propagation of electromagnetic waves in a cylindrical waveguide with an arbitrary cross section filled with a nonlinear material. The electromagnetic field is expanded in the usual eigenmodes of the waveguide, and the coupling between the modes is quantified. We derive the wave equations governing each mode with special emphasis on the situation with a dominant TE mode. The result is a strictly hyperbolic system of nonlinear partial differential equations for the dominating mode, whereas the minor modes satisfy hyperbolic systems of linear, nonstationary, and partial differential equations. A growth estimate is given for the minor modes.
Interpretation of Standardized Regression Coefficients in Multiple Regression.
ERIC Educational Resources Information Center
Thayer, Jerome D.
The extent to which standardized regression coefficients (beta values) can be used to determine the importance of a variable in an equation was explored. The beta value and the part correlation coefficient--also called the semi-partial correlation coefficient and reported in squared form as the incremental "r squared"--were compared for variables…
Demosaicing Based on Directional Difference Regression and Efficient Regression Priors.
Wu, Jiqing; Timofte, Radu; Van Gool, Luc
2016-08-01
Color demosaicing is a key image processing step aiming to reconstruct the missing pixels from a recorded raw image. On the one hand, numerous interpolation methods focusing on spatial-spectral correlations have been proved very efficient, whereas they yield a poor image quality and strong visible artifacts. On the other hand, optimization strategies, such as learned simultaneous sparse coding and sparsity and adaptive principal component analysis-based algorithms, were shown to greatly improve image quality compared with that delivered by interpolation methods, but unfortunately are computationally heavy. In this paper, we propose efficient regression priors as a novel, fast post-processing algorithm that learns the regression priors offline from training data. We also propose an independent efficient demosaicing algorithm based on directional difference regression, and introduce its enhanced version based on fused regression. We achieve an image quality comparable to that of the state-of-the-art methods for three benchmarks, while being order(s) of magnitude faster. PMID:27254866
Interquantile Shrinkage in Regression Models
Jiang, Liewen; Wang, Huixia Judy; Bondell, Howard D.
2012-01-01
Conventional analysis using quantile regression typically focuses on fitting the regression model at different quantiles separately. However, in situations where the quantile coefficients share some common feature, joint modeling of multiple quantiles to accommodate the commonality often leads to more efficient estimation. One example of common features is that a predictor may have a constant effect over one region of quantile levels but varying effects in other regions. To automatically perform estimation and detection of the interquantile commonality, we develop two penalization methods. When the quantile slope coefficients indeed do not change across quantile levels, the proposed methods will shrink the slopes towards constant and thus improve the estimation efficiency. We establish the oracle properties of the two proposed penalization methods. Through numerical investigations, we demonstrate that the proposed methods lead to estimations with competitive or higher efficiency than the standard quantile regression estimation in finite samples. Supplemental materials for the article are available online. PMID:24363546
Survival Data and Regression Models
NASA Astrophysics Data System (ADS)
Grégoire, G.
2014-12-01
We start this chapter by introducing some basic elements for the analysis of censored survival data. Then we focus on right censored data and develop two types of regression models. The first one concerns the so-called accelerated failure time models (AFT), which are parametric models where a function of a parameter depends linearly on the covariables. The second one is a semiparametric model, where the covariables enter in a multiplicative form in the expression of the hazard rate function. The main statistical tool for analysing these regression models is the maximum likelihood methodology and, although we recall some essential results of ML theory, we refer to the chapter "Logistic Regression" for a more detailed presentation.
Cactus: An Introduction to Regression
ERIC Educational Resources Information Center
Hyde, Hartley
2008-01-01
When the author first used "VisiCalc," the author thought it a very useful tool when he had the formulas. But how could he design a spreadsheet if there was no known formula for the quantities he was trying to predict? A few months later, the author relates he learned to use multiple linear regression software and suddenly it all clicked into…
Regression modelling of Dst index
NASA Astrophysics Data System (ADS)
Parnowski, Aleksei
We developed a new approach to the problem of real-time space weather indices forecasting using readily available data from ACE and a number of ground stations. It is based on the regression modelling method [1-3], which combines the benefits of empirical and statistical approaches. Mathematically it is based upon partial regression analysis and Monte Carlo simulations to deduce the empirical relationships in the system. The typical elapsed time per forecast is a few seconds on an average PC. This technique can be easily extended to other indices like AE and Kp. The proposed system can also be useful for investigating physical phenomena related to interactions between the solar wind and the magnetosphere; it has already helped uncover two new geoeffective parameters. 1. Parnowski A.S. Regression modeling method of space weather prediction // Astrophysics Space Science. 2009. V. 323, 2. P. 169-180. doi:10.1007/s10509-009-0060-4 [arXiv:0906.3271] 2. Parnovskiy A.S. Regression Modeling and its Application to the Problem of Prediction of Space Weather // Journal of Automation and Information Sciences. 2009. V. 41, 5. P. 61-69. doi:10.1615/JAutomatInfScien.v41.i5.70 3. Parnowski A.S. Statistically predicting Dst without satellite data // Earth, Planets and Space. 2009. V. 61, 5. P. 621-624.
Fungible Weights in Multiple Regression
ERIC Educational Resources Information Center
Waller, Niels G.
2008-01-01
Every set of alternate weights (i.e., nonleast squares weights) in a multiple regression analysis with three or more predictors is associated with an infinite class of weights. All members of a given class can be deemed "fungible" because they yield identical "SSE" (sum of squared errors) and R[superscript 2] values. Equations for generating…
Spontaneous regression of breast cancer.
Lewison, E F
1976-11-01
The dramatic but rare regression of a verified case of breast cancer in the absence of adequate, accepted, or conventional treatment has been observed and documented by clinicians over the course of many years. In my practice limited to diseases of the breast, over the past 25 years I have observed 12 patients with a unique and unusual clinical course valid enough to be regarded as spontaneous regression of breast cancer. These 12 patients, with clinically confirmed breast cancer, had temporary arrest or partial remission of their disease in the absence of complete or adequate treatment. In most of these cases, spontaneous regression could not be equated ultimately with permanent cure. Three of these case histories are summarized, and patient characteristics of pertinent clinical interest in the remaining case histories are presented and discussed. Despite widespread doubt and skepticism, there is ample clinical evidence to confirm the fact that spontaneous regression of breast cancer is a rare phenomenon but is real and does occur. PMID:799758
Regression Models of Atlas Appearance
Rohlfing, Torsten; Sullivan, Edith V.; Pfefferbaum, Adolf
2010-01-01
Models of object appearance based on principal components analysis provide powerful and versatile tools in computer vision and medical image analysis. A major shortcoming is that they rely entirely on the training data to extract principal modes of appearance variation and ignore underlying variables (e.g., subject age, gender). This paper introduces an appearance modeling framework based instead on generalized multi-linear regression. The training of regression appearance models is controlled by independent variables. This makes it straightforward to create model instances for specific values of these variables, which is akin to model interpolation. We demonstrate the new framework by creating an appearance model of the human brain from MR images of 36 subjects. Instances of the model created for different ages are compared with average shape atlases created from age-matched sub-populations. Relative tissue volumes vs. age in models are also compared with tissue volumes vs. subject age in the original images. In both experiments, we found excellent agreement between the regression models and the comparison data. We conclude that regression appearance models are a promising new technique for image analysis, with one potential application being the representation of a continuum of mutually consistent, age-specific atlases of the human brain. PMID:19694260
Correlation Weights in Multiple Regression
ERIC Educational Resources Information Center
Waller, Niels G.; Jones, Jeff A.
2010-01-01
A general theory on the use of correlation weights in linear prediction has yet to be proposed. In this paper we take initial steps in developing such a theory by describing the conditions under which correlation weights perform well in population regression models. Using OLS weights as a comparison, we define cases in which the two weighting…
Quantile Regression with Censored Data
ERIC Educational Resources Information Center
Lin, Guixian
2009-01-01
The Cox proportional hazards model and the accelerated failure time model are frequently used in survival data analysis. They are powerful, yet have limitations due to their model assumptions. Quantile regression offers a semiparametric approach to model data with possible heterogeneity. It is particularly powerful for censored responses, where the…
Ridge Regression for Interactive Models.
ERIC Educational Resources Information Center
Tate, Richard L.
1988-01-01
An exploratory study of the value of ridge regression for interactive models is reported. Assuming that the linear terms in a simple interactive model are centered to eliminate non-essential multicollinearity, a variety of common models, representing both ordinal and disordinal interactions, are shown to have "orientations" that are favorable to…
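The two ingredients mentioned above, centering the linear terms and applying ridge regression, can be sketched as follows. The simulated data, the penalty value 50.0, and the function names are assumptions made for illustration; the study's own models and settings are not reproduced here. The sketch shows that centering sharply reduces the correlation between a predictor and its product term, and it fits a closed-form ridge estimate on the centered design.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
# Hypothetical predictors with non-zero means.
x = rng.uniform(5.0, 10.0, size=n)
z = rng.uniform(5.0, 10.0, size=n)
y = 2.0 + 0.5 * x - 0.3 * z + 0.2 * x * z + rng.normal(size=n)

def corr(a, b):
    return np.corrcoef(a, b)[0, 1]

# The raw product term is strongly correlated with its components.
corr_raw = corr(x, x * z)
# Centering the linear terms first removes most of that correlation.
xc, zc = x - x.mean(), z - z.mean()
corr_centered = corr(xc, xc * zc)

def ridge(X, y, lam):
    """Closed-form ridge estimate on centered columns (no intercept)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    p = Xc.shape[1]
    return np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ yc)

X = np.column_stack([xc, zc, xc * zc])
b_ols = ridge(X, y, 0.0)      # lam = 0 reduces to ordinary least squares
b_ridge = ridge(X, y, 50.0)   # illustrative penalty, not a recommended value

print("corr raw:", round(corr_raw, 3), "corr centered:", round(corr_centered, 3))
```

In practice the penalty would be chosen by cross-validation or a ridge trace rather than fixed in advance.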
Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors
Woodard, Dawn B.; Crainiceanu, Ciprian; Ruppert, David
2013-01-01
We propose a new method for regression using a parsimonious and scientifically interpretable representation of functional predictors. Our approach is designed for data that exhibit features such as spikes, dips, and plateaus whose frequency, location, size, and shape varies stochastically across subjects. We propose Bayesian inference of the joint functional and exposure models, and give a method for efficient computation. We contrast our approach with existing state-of-the-art methods for regression with functional predictors, and show that our method is more effective and efficient for data that include features occurring at varying locations. We apply our methodology to a large and complex dataset from the Sleep Heart Health Study, to quantify the association between sleep characteristics and health outcomes. Software and technical appendices are provided in online supplemental materials. PMID:24293988
Direct regression models for longitudinal rates of change
Bryan, Matthew; Heagerty, Patrick J.
2014-01-01
Comparing rates of growth, or rates of change, across covariate-defined subgroups is a primary objective for many longitudinal studies. In the special case of a linear trend over time, the interaction between a covariate and time will characterize differences in longitudinal rates of change. However, in the presence of a non-linear longitudinal trajectory, the standard mean regression approach does not permit parsimonious description or inference regarding differences in rates of change. Therefore, we propose regression methodology for longitudinal data that allows a direct, structured comparison of rates across subgroups even in the presence of a non-linear trend over time. Our basic longitudinal rate regression method assumes a proportional difference across covariate groups in the rate of change across time, but this assumption can be relaxed. Rates are compared relative to a generally specified time trend for which we discuss both parametric and non-parametric estimating approaches. We develop mixed model longitudinal methodology that explicitly characterizes subject-to-subject variation in rates, as well as a marginal estimating equation-based method. In addition, we detail a score test to detect violations of the proportionality assumption, and we allow time-varying rate effects as a natural generalization. Simulation results demonstrate potential gains in power for the longitudinal rate regression model relative to a linear mixed effects model in the presence of a non-linear trend in time. We apply our method to a study of growth among infants born to HIV infected mothers, and conclude with a discussion of possible extensions for our methods. PMID:24497427
Regression Verification Using Impact Summaries
NASA Technical Reports Server (NTRS)
Backes, John; Person, Suzette J.; Rungta, Neha; Thachuk, Oksana
2013-01-01
Regression verification techniques are used to prove equivalence of syntactically similar programs. Checking equivalence of large programs, however, can be computationally expensive. Existing regression verification techniques rely on abstraction and decomposition techniques to reduce the computational effort of checking equivalence of the entire program. These techniques are sound but not complete. In this work, we propose a novel approach to improve scalability of regression verification by classifying the program behaviors generated during symbolic execution as either impacted or unimpacted. Our technique uses a combination of static analysis and symbolic execution to generate summaries of impacted program behaviors. The impact summaries are then checked for equivalence using an off-the-shelf decision procedure. We prove that our approach is both sound and complete for sequential programs, with respect to the depth bound of symbolic execution. Our evaluation on a set of sequential C artifacts shows that reducing the size of the summaries can help reduce the cost of software equivalence checking. Various reduction, abstraction, and compositional techniques have been developed to help scale software verification techniques to industrial-sized systems. Although such techniques have greatly increased the size and complexity of systems that can be checked, analysis of large software systems remains costly. Regression analysis techniques, e.g., regression testing [16], regression model checking [22], and regression verification [19], restrict the scope of the analysis by leveraging the differences between program versions. These techniques are based on the idea that if code is checked early in development, then subsequent versions can be checked against a prior (checked) version, leveraging the results of the previous analysis to reduce analysis cost of the current version. Regression verification addresses the problem of proving equivalence of closely related program
Nonlinear analysis of pupillary dynamics.
Onorati, Francesco; Mainardi, Luca Tommaso; Sirca, Fabiola; Russo, Vincenzo; Barbieri, Riccardo
2016-02-01
Pupil size reflects autonomic response to different environmental and behavioral stimuli, and its dynamics have been linked to other autonomic correlates such as cardiac and respiratory rhythms. The aim of this study is to assess the nonlinear characteristics of pupil size of 25 normal subjects who participated in a psychophysiological experimental protocol with four experimental conditions, namely “baseline”, “anger”, “joy”, and “sadness”. Nonlinear measures, such as sample entropy, correlation dimension, and largest Lyapunov exponent, were computed on reconstructed signals of spontaneous fluctuations of pupil dilation. Nonparametric statistical tests were performed on surrogate data to verify that the nonlinear measures are an intrinsic characteristic of the signals. We then developed and applied a piecewise linear regression model to detrended fluctuation analysis (DFA). Two joinpoints and three scaling intervals were identified: slope α0, at slow time scales, represents a persistent nonstationary long-range correlation, whereas α1 and α2, at middle and fast time scales, respectively, represent long-range power-law correlations, similarly to DFA applied to heart rate variability signals. Of the computed complexity measures, α0 showed statistically significant differences among experimental conditions (p<0.001). Our results suggest that (a) pupil size at constant light condition is characterized by nonlinear dynamics, (b) three well-defined and distinct long-memory processes exist at different time scales, and (c) autonomic stimulation is partially reflected in nonlinear dynamics. PMID:26351899
Regression analysis of networked data
Zhou, Yan; Song, Peter X.-K.
2016-01-01
This paper concerns regression methodology for assessing relationships between multi-dimensional response variables and covariates that are correlated within a network. To address analytical challenges associated with the integration of network topology into the regression analysis, we propose a hybrid quadratic inference method that uses both prior and data-driven correlations among network nodes. A Godambe information-based tuning strategy is developed to allocate weights between the prior and data-driven network structures, so the estimator is efficient. The proposed method is conceptually simple and computationally fast, and has appealing large-sample properties. It is evaluated by simulation, and its application is illustrated using neuroimaging data from an association study of the effects of iron deficiency on auditory recognition memory in infants. PMID:27279658
Observational Studies: Matching or Regression?
Brazauskas, Ruta; Logan, Brent R
2016-03-01
In observational studies with an aim of assessing treatment effect or comparing groups of patients, several approaches could be used. Often, baseline characteristics of patients may be imbalanced between groups, and adjustments are needed to account for this. It can be accomplished either via appropriate regression modeling or, alternatively, by conducting a matched pairs study. The latter is often chosen because it makes groups appear to be comparable. In this article we considered these 2 options in terms of their ability to detect a treatment effect in time-to-event studies. Our investigation shows that a Cox regression model applied to the entire cohort is often a more powerful tool in detecting treatment effect as compared with a matched study. Real data from a hematopoietic cell transplantation study is used as an example. PMID:26712591
PM10 forecasting using clusterwise regression
NASA Astrophysics Data System (ADS)
Poggi, Jean-Michel; Portier, Bruno
2011-12-01
In this paper, we are interested in the statistical forecasting of the daily mean PM10 concentration. Hourly concentrations of PM10 have been measured in the city of Rouen, in Haute-Normandie, France. Rouen is located northwest of Paris, near the south side of the Manche sea, and is heavily industrialised. We consider three monitoring stations reflecting the diversity of situations: an urban background station, a traffic station and an industrial station near the cereal harbour of Rouen. We have focused our attention on data for the months that register higher values, from December to March, in the years 2004-2009. The models are obtained from the winter days of the four seasons 2004/2005 to 2007/2008 (training data) and then the forecasting performance is evaluated on the winter days of the season 2008/2009 (test data). We show that it is possible to accurately forecast the daily mean concentration by fitting a function of meteorological predictors and the average concentration measured on the previous day. The values of observed meteorological variables are used for fitting the models and are also considered for the test data. We have compared the forecasts produced by three different methods: persistence, generalized additive nonlinear models and clusterwise linear regression models. This last method gives very impressive results, and the end of the paper analyzes the reasons for such good behavior.
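Of the three methods compared, persistence is the simplest baseline: the forecast for a given day is the value observed on the previous day. A minimal sketch follows, with made-up concentration values and hypothetical function names (nothing here is taken from the Rouen data).

```python
import numpy as np

def persistence_forecast(series):
    """Forecast for day t is the observed value on day t-1."""
    series = np.asarray(series, dtype=float)
    return series[:-1]          # aligned with the targets series[1:]

def rmse(pred, target):
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.sqrt(np.mean((pred - target) ** 2)))

# Hypothetical daily mean PM10 values, not data from the study.
pm10 = [22.0, 25.0, 31.0, 28.0, 40.0, 35.0]
pred = persistence_forecast(pm10)
score = rmse(pred, pm10[1:])
print("persistence RMSE:", round(score, 3))
```

Any regression-based forecaster would need to beat this score on held-out days to justify its extra complexity.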
Sliced Inverse Regression for Time Series Analysis
NASA Astrophysics Data System (ADS)
Chen, Li-Sue
1995-11-01
In this thesis, general nonlinear models for time series data are considered. A basic form is $x_t = f(\beta_1^T X_{t-1}, \beta_2^T X_{t-1}, \ldots, \beta_k^T X_{t-1}, \varepsilon_t)$, where $x_t$ is an observed time series, $X_t$ is the first $d$ time lag vector, $(x_t, x_{t-1}, \ldots, x_{t-d-1})$, $f$ is an unknown function, the $\beta_i$'s are unknown vectors, and the $\varepsilon_t$'s are independently distributed. Special cases include AR and TAR models. We investigate the feasibility of applying SIR/PHD (Li 1990, 1991) (the sliced inverse regression and principal Hessian methods) in estimating the $\beta_i$'s. PCA (principal component analysis) is brought in to check one critical condition for SIR/PHD. Through simulation and a study on 3 well-known data sets of Canadian lynx, U.S. unemployment rate and sunspot numbers, we demonstrate how SIR/PHD can effectively retrieve the interesting low-dimensional structures for time series data.
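The basic SIR step can be sketched as below. This is a generic textbook-style version under stated assumptions, not the thesis's procedure: the function names, the slice count, and the helper `lag_matrix` (which builds lag vectors from a series so that SIR can be applied to them) are all hypothetical.

```python
import numpy as np

def sir_directions(X, y, n_slices=10, n_dirs=1):
    """Basic sliced inverse regression; returns unit directions as columns."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    # Inverse square root of the covariance via its eigendecomposition.
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - mu) @ inv_sqrt
    # Slice the sorted response and average Z within each slice.
    M = np.zeros((p, p))
    for idx in np.array_split(np.argsort(y), n_slices):
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    # Leading eigenvectors of M, mapped back to the original scale.
    w, v = np.linalg.eigh(M)
    top = v[:, np.argsort(w)[::-1][:n_dirs]]
    B = inv_sqrt @ top
    return B / np.linalg.norm(B, axis=0)

def lag_matrix(x, d):
    """Rows are (x[t-1], ..., x[t-d]); the paired target is x[t]."""
    x = np.asarray(x, dtype=float)
    X = np.column_stack([x[d - k : len(x) - k] for k in range(1, d + 1)])
    return X, x[d:]
```

For a series `x`, calling `X, y = lag_matrix(x, d)` followed by `sir_directions(X, y)` gives estimated directions up to sign. SIR relies on a linearity condition on the predictor distribution, which is presumably the kind of condition the abstract refers to when it brings in PCA as a check.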
Counting people with low-level features and Bayesian regression.
Chan, Antoni B; Vasconcelos, Nuno
2012-04-01
An approach to the problem of estimating the size of inhomogeneous crowds, which are composed of pedestrians that travel in different directions, without using explicit object segmentation or tracking is proposed. Instead, the crowd is segmented into components of homogeneous motion, using the mixture of dynamic-texture motion model. A set of holistic low-level features is extracted from each segmented region, and a function that maps features into estimates of the number of people per segment is learned with Bayesian regression. Two Bayesian regression models are examined. The first is a combination of Gaussian process regression with a compound kernel, which accounts for both the global and local trends of the count mapping but is limited by the real-valued outputs that do not match the discrete counts. We address this limitation with a second model, which is based on a Bayesian treatment of Poisson regression that introduces a prior distribution on the linear weights of the model. Since exact inference is analytically intractable, a closed-form approximation is derived that is computationally efficient and kernelizable, enabling the representation of nonlinear functions. An approximate marginal likelihood is also derived for kernel hyperparameter learning. The two regression-based crowd counting methods are evaluated on a large pedestrian data set, containing very distinct camera views, pedestrian traffic, and outliers, such as bikes or skateboarders. Experimental results show that regression-based counts are accurate regardless of the crowd size, outperforming the count estimates produced by state-of-the-art pedestrian detectors. Results on 2 h of video demonstrate the efficiency and robustness of the regression-based crowd size estimation over long periods of time. PMID:22020684
Deep Wavelet Scattering for Quantum Energy Regression
NASA Astrophysics Data System (ADS)
Hirn, Matthew
Physical functionals are usually computed as solutions of variational problems or from solutions of partial differential equations, which may require huge computations for complex systems. Quantum chemistry calculations of ground state molecular energies are one such example. Indeed, if $x$ is a quantum molecular state, then the ground state energy $E_0(x)$ is the minimum eigenvalue solution of the time independent Schrödinger Equation, which is computationally intensive for large systems. Machine learning algorithms do not simulate the physical system but estimate solutions by interpolating values provided by a training set of known examples $\{(x_i, E_0(x_i))\}_{i \le n}$. However, precise interpolations may require a number of examples that is exponential in the system dimension, and are thus intractable. This curse of dimensionality may be circumvented by computing interpolations in smaller approximation spaces, which take advantage of physical invariants. Linear regressions of $E_0$ over a dictionary $\Phi = \{\phi_k\}_k$ compute an approximation $\tilde{E}_0$ as $\tilde{E}_0(x) = \sum_k w_k \phi_k(x)$, where the weights $\{w_k\}_k$ are selected to minimize the error between $E_0$ and $\tilde{E}_0$ on the training set. The key to such a regression approach then lies in the design of the dictionary $\Phi$. It must be intricate enough to capture the essential variability of $E_0(x)$ over the molecular states $x$ of interest, while simple enough so that evaluation of $\Phi(x)$ is significantly less intensive than a direct quantum mechanical computation (or approximation) of $E_0(x)$. In this talk we present a novel dictionary $\Phi$ for the regression of quantum mechanical energies based on the scattering transform of an intermediate, approximate electron density representation $\rho_x$ of the state $x$. The scattering transform has the architecture of a deep convolutional network, composed of an alternating sequence of linear filters and nonlinear maps. Whereas in many deep learning tasks the linear filters are learned from the training data, here
Regression analysis of non-contact acousto-thermal signature data
NASA Astrophysics Data System (ADS)
Criner, Amanda; Schehl, Norman
2016-05-01
The non-contact acousto-thermal signature (NCATS) is a nondestructive evaluation technique with potential to detect fatigue in materials such as noisy titanium and polymer matrix composites. The underlying physical mechanisms and properties may be determined by parameter estimation via nonlinear regression. The nonlinear regression analysis formulation, including the underlying models, is discussed. Several models and associated data analyses are given along with the assumptions implicit in the underlying model. The results are anomalous. These anomalous results are evaluated with respect to the accuracy of the implicit assumptions.
Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne
2012-01-01
In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models. PMID:23275882
Heteroscedastic transformation cure regression models.
Chen, Chyong-Mei; Chen, Chen-Hsin
2016-06-30
Cure models have been applied to analyze clinical trials with cures and age-at-onset studies with nonsusceptibility. Lu and Ying (On semiparametric transformation cure model. Biometrika 2004; 91:331-343. DOI: 10.1093/biomet/91.2.331) developed a general class of semiparametric transformation cure models, which assumes that the failure times of uncured subjects, after an unknown monotone transformation, follow a regression model with homoscedastic residuals. However, it cannot deal with frequently encountered heteroscedasticity, which may result from dispersed ranges of failure time span among uncured subjects' strata. To tackle the phenomenon, this article presents semiparametric heteroscedastic transformation cure models. The cure status and the failure time of an uncured subject are fitted by a logistic regression model and a heteroscedastic transformation model, respectively. Unlike the approach of Lu and Ying, we derive score equations from the full likelihood for estimating the regression parameters in the proposed model. A martingale difference function similar to the one in their proposal is used to estimate the infinite-dimensional transformation function. Our proposed estimating approach is intuitively applicable and can be conveniently extended to other complicated models when the maximization of the likelihood may be too tedious to be implemented. We conduct simulation studies to validate large-sample properties of the proposed estimators and to compare with the approach of Lu and Ying via the relative efficiency. The estimating method and the two relevant goodness-of-fit graphical procedures are illustrated by using breast cancer data and melanoma data. Copyright © 2016 John Wiley & Sons, Ltd. PMID:26887342
Regression analysis of cytopathological data
Whittemore, A.S.; McLarty, J.W.; Fortson, N.; Anderson, K.
1982-12-01
Epithelial cells from the human body are frequently labelled according to one of several ordered levels of abnormality, ranging from normal to malignant. The label of the most abnormal cell in a specimen determines the score for the specimen. This paper presents a model for the regression of specimen scores against continuous and discrete variables, as in host exposure to carcinogens. Application to data and tests for adequacy of model fit are illustrated using sputum specimens obtained from a cohort of former asbestos workers.
Multiatlas segmentation as nonparametric regression.
Awate, Suyash P; Whitaker, Ross T
2014-09-01
This paper proposes a novel theoretical framework to model and analyze the statistical characteristics of a wide range of segmentation methods that incorporate a database of label maps or atlases; such methods are termed as label fusion or multiatlas segmentation. We model these multiatlas segmentation problems as nonparametric regression problems in the high-dimensional space of image patches. We analyze the nonparametric estimator's convergence behavior that characterizes expected segmentation error as a function of the size of the multiatlas database. We show that this error has an analytic form involving several parameters that are fundamental to the specific segmentation problem (determined by the chosen anatomical structure, imaging modality, registration algorithm, and label-fusion algorithm). We describe how to estimate these parameters and show that several human anatomical structures exhibit the trends modeled analytically. We use these parameter estimates to optimize the regression estimator. We show that the expected error for large database sizes is well predicted by models learned on small databases. Thus, a few expert segmentations can help predict the database sizes required to keep the expected error below a specified tolerance level. Such cost-benefit analysis is crucial for deploying clinical multiatlas segmentation systems. PMID:24802528
Adaptive support vector regression for UAV flight control.
Shin, Jongho; Jin Kim, H; Kim, Youdan
2011-01-01
This paper explores an application of support vector regression for adaptive control of an unmanned aerial vehicle (UAV). Unlike neural networks, support vector regression (SVR) generates global solutions, because SVR basically solves quadratic programming (QP) problems. With this advantage, the input-output feedback-linearized inverse dynamic model and the compensation term for the inversion error are identified off-line, which we call I-SVR (inversion SVR) and C-SVR (compensation SVR), respectively. In order to compensate for the inversion error and the unexpected uncertainty, an online adaptation algorithm for the C-SVR is proposed. Then, the stability of the overall error dynamics is analyzed by the uniformly ultimately bounded property in the nonlinear system theory. In order to validate the effectiveness of the proposed adaptive controller, numerical simulations are performed on the UAV model. PMID:20970303
Robust and efficient estimation with weighted composite quantile regression
NASA Astrophysics Data System (ADS)
Jiang, Xuejun; Li, Jingzhi; Xia, Tian; Yan, Wanfeng
2016-09-01
In this paper we introduce a weighted composite quantile regression (CQR) estimation approach and study its application in nonlinear models such as exponential models and ARCH-type models. The weighted CQR is augmented by using a data-driven weighting scheme. With the error distribution unspecified, the proposed estimators share robustness from quantile regression and achieve nearly the same efficiency as the oracle maximum likelihood estimator (MLE) for a variety of error distributions including the normal, mixed-normal, Student's t, Cauchy distributions, etc. We also suggest an algorithm for the fast implementation of the proposed methodology. Simulations are carried out to compare the performance of different estimators, and the proposed approach is used to analyze the daily S&P 500 Composite index, which verifies the effectiveness and efficiency of our theoretical results.
Sridhar, Upasana Manimegalai; Govindarajan, Anand; Rhinehart, R Russell
2016-01-01
This work reveals the applicability of a relatively new optimization technique, Leapfrogging, for both nonlinear regression modeling and a methodology for nonlinear model-predictive control. Both are relatively simple, yet effective. The application on a nonlinear, pilot-scale, shell-and-tube heat exchanger reveals practicability of the techniques. PMID:26606850
New Nonlinear Multigrid Analysis
NASA Technical Reports Server (NTRS)
Xie, Dexuan
1996-01-01
The nonlinear multigrid is an efficient algorithm for solving the system of nonlinear equations arising from the numerical discretization of nonlinear elliptic boundary problems. In this paper, we present a new nonlinear multigrid analysis as an extension of the linear multigrid theory presented by Bramble. In particular, we prove the convergence of the nonlinear V-cycle method for a class of mildly nonlinear second order elliptic boundary value problems which do not have full elliptic regularity.
Practical Session: Multiple Linear Regression
NASA Astrophysics Data System (ADS)
Clausel, M.; Grégoire, G.
2014-12-01
Three exercises are proposed to illustrate simple linear regression. The first one investigates the influence of several factors on atmospheric pollution. It was proposed by D. Chessel and A.B. Dufour at Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr33.pdf) and is based on data from 20 U.S. cities. Exercise 2 is an introduction to model selection, whereas Exercise 3 provides a first example of analysis of variance. Exercises 2 and 3 were proposed by A. Dalalyan at ENPC (see Exercises 2 and 3 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_5.pdf).
Nonlinear magnetohydrodynamics
Not Available
1994-01-01
Resistive MHD equilibrium, even for small resistivity, differs greatly from ideal equilibrium, as do the dynamical consequences of its instabilities. The requirement, imposed by Faraday's law, that time independent magnetic fields imply curl-free electric fields, greatly restricts the electric fields allowed inside a finite-resistivity plasma. If there is no flow and the implications of Ohm's law are taken into account (and they need not be, for ideal equilibria), the electric field must equal the resistivity times the current density. The vanishing of the divergence of the current density then provides a partial differential equation which, together with boundary conditions, uniquely determines the scalar potential, the electric field, and the current density, for any given resistivity profile. The situation parallels closely that of driven shear flows in hydrodynamics, in that while dissipative steady states are somewhat more complex than ideal ones, there are vastly fewer of them to consider. Seen in this light, the vast majority of ideal MHD equilibria are just irrelevant, incapable of being set up in the first place. The steady state whose stability thresholds and nonlinear behavior need to be investigated ceases to be an arbitrary ad hoc exercise dependent upon the whim of the investigator, but is determined by boundary conditions and choice of resistivity profile.
Determination of airplane model structure from flight data by using modified stepwise regression
NASA Technical Reports Server (NTRS)
Klein, V.; Batterson, J. G.; Murphy, P. C.
1981-01-01
Linear and stepwise regressions are briefly introduced, then the problem of determining airplane model structure is addressed. The modified stepwise regression (MSR) was constructed to force a linear model for the aerodynamic coefficient first, then add significant nonlinear terms and delete nonsignificant terms from the model. In addition to the statistical criteria in the stepwise regression, the prediction sum of squares (PRESS) criterion and the analysis of residuals were examined for the selection of an adequate model. The procedure is used in examples with simulated and real flight data. It is shown that the MSR performs better than the ordinary stepwise regression and that the technique can also be applied to large amplitude maneuvers.
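The PRESS criterion mentioned in this abstract can be illustrated with a small sketch. It is not the authors' MSR procedure: it only shows, on synthetic data invented here, how the prediction sum of squares can be computed for an ordinary least-squares fit from the leverages h_ii, using the identity that the leave-one-out residual equals e_i / (1 - h_ii). The function names and the data are assumptions made for illustration.

```python
import numpy as np

def press_statistic(X, y):
    """Prediction sum of squares for an ordinary least-squares fit.

    Uses the leave-one-out identity e_(i) = e_i / (1 - h_ii),
    so the model does not have to be refitted n times.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Hat matrix H = X (X'X)^{-1} X', obtained through the pseudoinverse.
    H = X @ np.linalg.pinv(X)
    residuals = y - H @ y
    loo_residuals = residuals / (1.0 - np.diag(H))
    return float(np.sum(loo_residuals ** 2))

def press_by_refitting(X, y):
    """Same quantity computed the slow way, leaving each point out in turn."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    total = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        total += (y[i] - X[i] @ beta) ** 2
    return float(total)

# Synthetic data (an assumption): a response with a genuine quadratic term.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 30)
y = 1.0 + 2.0 * x + 1.5 * x ** 2 + rng.normal(0.0, 0.1, 30)

X_linear = np.column_stack([np.ones_like(x), x])
X_quadratic = np.column_stack([np.ones_like(x), x, x ** 2])

press_linear = press_statistic(X_linear, y)
press_quadratic = press_statistic(X_quadratic, y)
print(press_linear, press_quadratic)
```

A candidate nonlinear term that lowers PRESS improves out-of-sample prediction on these data, which is the sense in which such a criterion can complement significance tests during term selection.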
Semiparametric regression during 2003–2007
Ruppert, David; Wand, M.P.; Carroll, Raymond J.
2010-01-01
Semiparametric regression is a fusion between parametric regression and nonparametric regression that integrates low-rank penalized splines, mixed model and hierarchical Bayesian methodology – thus allowing more streamlined handling of longitudinal and spatial correlation. We review progress in the field over the five-year period between 2003 and 2007. We find semiparametric regression to be a vibrant field with substantial involvement and activity, continual enhancement and widespread application. PMID:20305800
Building Regression Models: The Importance of Graphics.
ERIC Educational Resources Information Center
Dunn, Richard
1989-01-01
Points out reasons for using graphical methods to teach simple and multiple regression analysis. Argues that a graphically oriented approach has considerable pedagogic advantages in the exposition of simple and multiple regression. Shows that graphical methods may play a central role in the process of building regression models. (Author/LS)
Regression Analysis by Example. 5th Edition
ERIC Educational Resources Information Center
Chatterjee, Samprit; Hadi, Ali S.
2012-01-01
Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. "Regression Analysis by Example, Fifth Edition" has been expanded and thoroughly…
Bayesian Unimodal Density Regression for Causal Inference
ERIC Educational Resources Information Center
Karabatsos, George; Walker, Stephen G.
2011-01-01
Karabatsos and Walker (2011) introduced a new Bayesian nonparametric (BNP) regression model. Through analyses of real and simulated data, they showed that the BNP regression model outperforms other parametric and nonparametric regression models of common use, in terms of predictive accuracy of the outcome (dependent) variable. The other,…
Standards for Standardized Logistic Regression Coefficients
ERIC Educational Resources Information Center
Menard, Scott
2011-01-01
Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…
Developmental Regression in Autism Spectrum Disorders
ERIC Educational Resources Information Center
Rogers, Sally J.
2004-01-01
The occurrence of developmental regression in autism is one of the more puzzling features of this disorder. Although several studies have documented the validity of parental reports of regression using home videos, accumulating data suggest that most children who demonstrate regression also demonstrated previous, subtle, developmental differences.…
Modeling maximum daily temperature using a varying coefficient regression model
NASA Astrophysics Data System (ADS)
Li, Han; Deng, Xinwei; Kim, Dong-Yun; Smith, Eric P.
2014-04-01
Relationships between stream water and air temperatures are often modeled using linear or nonlinear regression methods. Despite a strong relationship between water and air temperatures and a variety of models that are effective for data summarized on a weekly basis, such models did not yield consistently good predictions for summaries such as daily maximum temperature. A good predictive model for daily maximum temperature is required because daily maximum temperature is an important measure for predicting survival of temperature sensitive fish. To appropriately model the strong relationship between water and air temperatures at a daily time step, it is important to incorporate information related to the time of the year into the modeling. In this work, a time-varying coefficient model is used to study the relationship between air temperature and water temperature. The time-varying coefficient model enables dynamic modeling of the relationship, and can be used to understand how the air-water temperature relationship varies over time. The proposed model is applied to 10 streams in Maryland, West Virginia, Virginia, North Carolina, and Georgia using daily maximum temperatures. It provides a better fit and better predictions than those produced by a simple linear regression model or a nonlinear logistic model.
Estimating equivalence with quantile regression.
Cade, Brian S
2011-01-01
Equivalence testing and corresponding confidence interval estimates are used to provide more enlightened statistical statements about parameter estimates by relating them to intervals of effect sizes deemed to be of scientific or practical importance rather than just to an effect size of zero. Equivalence tests and confidence interval estimates are based on a null hypothesis that a parameter estimate is either outside (inequivalence hypothesis) or inside (equivalence hypothesis) an equivalence region, depending on the question of interest and assignment of risk. The former approach, often referred to as bioequivalence testing, is often used in regulatory settings because it reverses the burden of proof compared to a standard test of significance, following a precautionary principle for environmental protection. Unfortunately, many applications of equivalence testing focus on establishing average equivalence by estimating differences in means of distributions that do not have homogeneous variances. I discuss how to compare equivalence across quantiles of distributions using confidence intervals on quantile regression estimates that detect differences in heterogeneous distributions missed by focusing on means. I used one-tailed confidence intervals based on inequivalence hypotheses in a two-group treatment-control design for estimating bioequivalence of arsenic concentrations in soils at an old ammunition testing site and bioequivalence of vegetation biomass at a reclaimed mining site. Two-tailed confidence intervals based both on inequivalence and equivalence hypotheses were used to examine quantile equivalence for negligible trends over time for a continuous exponential model of amphibian abundance. PMID:21516905
Streamflow forecasting using functional regression
NASA Astrophysics Data System (ADS)
Masselot, Pierre; Dabo-Niang, Sophie; Chebana, Fateh; Ouarda, Taha B. M. J.
2016-07-01
Streamflow, as a natural phenomenon, is continuous in time and so are the meteorological variables which influence its variability. In practice, it can be of interest to forecast the whole flow curve instead of points (daily or hourly). To this end, this paper introduces functional linear models and adapts them to hydrological forecasting. More precisely, functional linear models are regression models based on curves instead of single values. They allow the whole process to be considered instead of a limited number of time points or features. We apply these models to analyse the flow volume and the whole streamflow curve during a given period by using precipitation curves. The functional model is shown to lead to encouraging results. The potential of functional linear models to detect special features that would have been hard to see otherwise is pointed out. The functional model is also compared to the artificial neural network approach and the advantages and disadvantages of both models are discussed. Finally, future research directions involving the functional model in hydrology are presented.
Insulin resistance: regression and clustering.
Yoon, Sangho; Assimes, Themistocles L; Quertermous, Thomas; Hsiao, Chin-Fu; Chuang, Lee-Ming; Hwu, Chii-Min; Rajaratnam, Bala; Olshen, Richard A
2014-01-01
In this paper we try to define insulin resistance (IR) precisely for a group of Chinese women. Our definition deliberately does not depend upon body mass index (BMI) or age, although in other studies, with particular random effects models quite different from models used here, BMI accounts for a large part of the variability in IR. We accomplish our goal through application of Gauss mixture vector quantization (GMVQ), a technique for clustering that was developed for application to lossy data compression. Defining data come from measurements that play major roles in medical practice. A precise statement of what the data are is in Section 1. Their family structures are described in detail. They concern levels of lipids and the results of an oral glucose tolerance test (OGTT). We apply GMVQ to residuals obtained from regressions of outcomes of an OGTT and lipids on functions of age and BMI that are inferred from the data. A bootstrap procedure developed for our family data supplemented by insights from other approaches leads us to believe that two clusters are appropriate for defining IR precisely. One cluster consists of women who are IR, and the other of women who seem not to be. Genes and other features are used to predict cluster membership. We argue that prediction with "main effects" is not satisfactory, but prediction that includes interactions may be. PMID:24887437
Harmonic regression and scale stability.
Lee, Yi-Hsuan; Haberman, Shelby J
2013-10-01
Monitoring a very frequently administered educational test with a relatively short history of stable operation imposes a number of challenges. Test scores usually vary by season, and the frequency of administration of such educational tests is also seasonal. Although it is important to react to unreasonable changes in the distributions of test scores in a timely fashion, it is not a simple matter to ascertain what sort of distribution is really unusual. Many commonly used approaches for seasonal adjustment are designed for time series with evenly spaced observations that span many years and, therefore, are inappropriate for data from such educational tests. Harmonic regression, a seasonal-adjustment method, can be useful in monitoring scale stability when the number of years available is limited and when the observations are unevenly spaced. Additional forms of adjustments can be included to account for variability in test scores due to different sources of population variations. To illustrate, real data are considered from an international language assessment. PMID:24092490
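As a rough illustration of harmonic regression on unevenly spaced observations, the sketch below regresses synthetic scores on sine and cosine terms with an annual period and then removes the fitted seasonal component. The period, number of harmonics, score scale and administration dates are all assumptions made here, not values from the paper.

```python
import numpy as np

def harmonic_design(t, period, n_harmonics):
    """Design matrix with an intercept and cos/sin pairs for each harmonic."""
    t = np.asarray(t, dtype=float)
    columns = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        angle = 2.0 * np.pi * k * t / period
        columns.append(np.cos(angle))
        columns.append(np.sin(angle))
    return np.column_stack(columns)

# Synthetic example (an assumption): 120 administrations at irregular
# days over two years, with an annual seasonal pattern in mean scores.
rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0.0, 730.0, 120))
seasonal = 3.0 * np.cos(2 * np.pi * t / 365.25) - 2.0 * np.sin(2 * np.pi * t / 365.25)
scores = 500.0 + seasonal + rng.normal(0.0, 0.5, t.size)

# Least squares does not require evenly spaced time points.
X = harmonic_design(t, period=365.25, n_harmonics=1)
coef, *_ = np.linalg.lstsq(X, scores, rcond=None)

# Seasonally adjusted scores: subtract the fitted harmonic terms only.
adjusted = scores - X[:, 1:] @ coef[1:]
print(coef)
```

Because the fit is ordinary least squares on functions of time, additional covariates (for example, population indicators) could be appended as further columns of the design matrix.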
Time series regression model for infectious disease and weather.
Imai, Chisato; Armstrong, Ben; Chalabi, Zaid; Mangtani, Punam; Hashizume, Masahiro
2015-10-01
Time series regression has been developed and long used to evaluate the short-term associations of air pollution and weather with mortality or morbidity of non-infectious diseases. The application of the regression approaches from this tradition to infectious diseases, however, is less well explored and raises some new issues. We discuss and present potential solutions for five issues often arising in such analyses: changes in immune population, strong autocorrelations, a wide range of plausible lag structures and association patterns, seasonality adjustments, and large overdispersion. The potential approaches are illustrated with datasets of cholera cases and rainfall from Bangladesh and influenza and temperature in Tokyo. Though this article focuses on the application of the traditional time series regression to infectious diseases and weather factors, we also briefly introduce alternative approaches, including mathematical modeling, wavelet analysis, and autoregressive integrated moving average (ARIMA) models. Modifications proposed to standard time series regression practice include using sums of past cases as proxies for the immune population, and using the logarithm of lagged disease counts to control autocorrelation due to true contagion, both of which are motivated from "susceptible-infectious-recovered" (SIR) models. The complexity of lag structures and association patterns can often be informed by biological mechanisms and explored by using distributed lag non-linear models. For overdispersed models, alternative distribution models such as quasi-Poisson and negative binomial should be considered. Time series regression can be used to investigate dependence of infectious diseases on weather, but may need modifying to allow for features specific to this context. PMID:26188633
Developmental regression in autism spectrum disorder
Al Backer, Nouf Backer
2015-01-01
The occurrence of developmental regression in autism spectrum disorder (ASD) is one of the most puzzling phenomena of this disorder. Little is known about the nature and mechanism of developmental regression in ASD. About one-third of young children with ASD lose some skills during the preschool period, usually speech, but sometimes nonverbal communication, social or play skills are also affected. There is a lot of evidence suggesting that most children who demonstrate regression also had previous, subtle, developmental differences. It is difficult to predict the prognosis of autistic children with developmental regression. It seems that the earlier development of social, language, and attachment behaviors followed by regression does not predict the later recovery of skills or better developmental outcomes. The underlying mechanisms that lead to regression in autism are unknown. The role of subclinical epilepsy in the developmental regression of children with autism remains unclear. PMID:27493417
A Survey of UML Based Regression Testing
NASA Astrophysics Data System (ADS)
Fahad, Muhammad; Nadeem, Aamer
Regression testing is the process of ensuring software quality by analyzing whether changed parts behave as intended and unchanged parts are not affected by the modifications. Since it is a costly process, many techniques have been proposed in the research literature that advise testers on how to build a regression test suite from an existing test suite at minimum cost. In this paper, we discuss the advantages and drawbacks of using UML diagrams for regression testing and show that UML models help in identifying changes for regression test selection effectively. We survey the existing UML based regression testing techniques and provide an analysis matrix to give a quick insight into prominent features of the literature work. We discuss the open research issues that remain to be addressed for UML based regression testing, such as managing and reducing the size of the regression test suite and prioritizing test cases when schedules and resources are tight.
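The idea of model-based regression test selection can be sketched in a few lines. This is not any specific technique from the survey: it assumes a hypothetical traceability map from test cases to the model elements they exercise, and selects the tests that touch at least one changed element. All test and element names are invented for illustration.

```python
def select_regression_tests(test_coverage, changed_elements):
    """Return the tests that exercise at least one changed model element."""
    changed = set(changed_elements)
    return sorted(
        name for name, covered in test_coverage.items()
        if changed & set(covered)
    )

# Hypothetical traceability data: which model elements each test exercises.
test_coverage = {
    "test_login": {"LoginController", "UserSession"},
    "test_checkout": {"Cart", "PaymentGateway"},
    "test_profile": {"UserSession", "ProfileView"},
}

# Suppose a comparison of two model versions flags one changed class.
selected = select_regression_tests(test_coverage, ["UserSession"])
print(selected)
```

Tests that touch no changed element are skipped, which is where the cost saving comes from; prioritization would then order the selected tests.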
Multiobjective optimization for model selection in kernel methods in regression.
You, Di; Benitez-Quiroz, Carlos Fabian; Martinez, Aleix M
2014-10-01
Regression plays a major role in many scientific and engineering problems. The goal of regression is to learn the unknown underlying function from a set of sample vectors with known outcomes. In recent years, kernel methods in regression have facilitated the estimation of nonlinear functions. However, two major (interconnected) problems remain open. The first problem is given by the bias-versus-variance tradeoff. If the model used to estimate the underlying function is too flexible (i.e., high model complexity), the variance will be very large. If the model is fixed (i.e., low complexity), the bias will be large. The second problem is to define an approach for selecting the appropriate parameters of the kernel function. To address these two problems, this paper derives a new smoothing kernel criterion, which measures the roughness of the estimated function as a measure of model complexity. Then, we use multiobjective optimization to derive a criterion for selecting the parameters of that kernel. The goal of this criterion is to find a tradeoff between the bias and the variance of the learned function. That is, the goal is to increase the model fit while keeping the model complexity in check. We provide extensive experimental evaluations using a variety of problems in machine learning, pattern recognition, and computer vision. The results demonstrate that the proposed approach yields smaller estimation errors as compared with methods in the state of the art. PMID:25291740
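The bias-versus-variance tradeoff described in this abstract can be seen in a minimal kernel ridge regression sketch. It does not implement the paper's smoothing criterion or its multiobjective optimization; it only fits a Gaussian-kernel regressor at three hand-picked bandwidths on synthetic data. The bandwidths, the ridge value and the data are assumptions made here.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth):
    """Gaussian (RBF) kernel matrix between two 1-D sample vectors."""
    squared_distance = (a[:, None] - b[None, :]) ** 2
    return np.exp(-squared_distance / (2.0 * bandwidth ** 2))

def kernel_ridge_predict(x_train, y_train, x_new, bandwidth, ridge):
    """Fit kernel ridge regression and predict at x_new."""
    K = gaussian_kernel(x_train, x_train, bandwidth)
    alpha = np.linalg.solve(K + ridge * np.eye(len(x_train)), y_train)
    return gaussian_kernel(x_new, x_train, bandwidth) @ alpha

# Synthetic data (an assumption): noisy samples of a sine curve.
rng = np.random.default_rng(2)
x_train = np.sort(rng.uniform(0.0, 1.0, 40))
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0.0, 0.1, 40)
x_test = np.linspace(0.0, 1.0, 200)
y_test = np.sin(2 * np.pi * x_test)

errors = {}
for bandwidth in (0.005, 0.15, 100.0):
    train_pred = kernel_ridge_predict(x_train, y_train, x_train, bandwidth, ridge=1e-2)
    test_pred = kernel_ridge_predict(x_train, y_train, x_test, bandwidth, ridge=1e-2)
    errors[bandwidth] = (
        float(np.mean((train_pred - y_train) ** 2)),  # training error
        float(np.mean((test_pred - y_test) ** 2)),    # error against the true curve
    )
print(errors)
```

A very narrow kernel reproduces the training points but generalizes poorly (high variance), while a very wide kernel yields a nearly constant fit (high bias); selecting the kernel parameter automatically is the problem the paper addresses.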
Multiobjective Optimization for Model Selection in Kernel Methods in Regression
You, Di; Benitez-Quiroz, C. Fabian; Martinez, Aleix M.
2016-01-01
Regression plays a major role in many scientific and engineering problems. The goal of regression is to learn the unknown underlying function from a set of sample vectors with known outcomes. In recent years, kernel methods in regression have facilitated the estimation of nonlinear functions. However, two major (interconnected) problems remain open. The first problem is given by the bias-vs-variance trade-off. If the model used to estimate the underlying function is too flexible (i.e., high model complexity), the variance will be very large. If the model is fixed (i.e., low complexity), the bias will be large. The second problem is to define an approach for selecting the appropriate parameters of the kernel function. To address these two problems, this paper derives a new smoothing kernel criterion, which measures the roughness of the estimated function as a measure of model complexity. Then, we use multiobjective optimization to derive a criterion for selecting the parameters of that kernel. The goal of this criterion is to find a trade-off between the bias and the variance of the learned function. That is, the goal is to increase the model fit while keeping the model complexity in check. We provide extensive experimental evaluations using a variety of problems in machine learning, pattern recognition and computer vision. The results demonstrate that the proposed approach yields smaller estimation errors as compared to methods in the state of the art. PMID:25291740
NASA Astrophysics Data System (ADS)
Lauterborn, Werner; Kurz, Thomas; Akhatov, Iskander
At high sound intensities or long propagation distances at
Regression in schizophrenia and its therapeutic value.
Yazaki, N
1992-03-01
Using the regression evaluation scale, 25 schizophrenic patients were classified into three groups of Dissolution/autism (DAUG), Dissolution/attachment (DATG) and Non-regression (NRG). The regression of DAUG was of the type in which autism occurred when destructiveness emerged, while the regression of DATG was of the type in which attachment occurred when destructiveness emerged. This suggests that the regressive phenomena are an actualized form of the approach complex. In order to determine the factors distinguishing these two groups, I investigated psychiatric symptoms, mother-child relationships, premorbid personalities and therapeutic interventions. I believe that these factors form a continuity in which they interrelatedly determine the regressive state. Foremost among them, I stressed the importance of the mother-child relationship. PMID:1353128
Data Mining within a Regression Framework
NASA Astrophysics Data System (ADS)
Berk, Richard A.
Regression analysis can imply a far wider range of statistical procedures than often appreciated. In this chapter, a number of common Data Mining procedures are discussed within a regression framework. These include non-parametric smoothers, classification and regression trees, bagging, and random forests. In each case, the goal is to characterize one or more of the distributional features of a response conditional on a set of predictors.
LRGS: Linear Regression by Gibbs Sampling
NASA Astrophysics Data System (ADS)
Mantz, Adam B.
2016-02-01
LRGS (Linear Regression by Gibbs Sampling) implements a Gibbs sampler to solve the problem of multivariate linear regression with uncertainties in all measured quantities and intrinsic scatter. LRGS extends an algorithm by Kelly (2007) that used Gibbs sampling for performing linear regression in fairly general cases in two ways: generalizing the procedure for multiple response variables, and modeling the prior distribution of covariates using a Dirichlet process.
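To convey the flavor of Gibbs sampling for regression, here is a deliberately simplified sketch. It is not LRGS and does not use its interface: it handles a single response with no measurement uncertainties, no intrinsic-scatter model beyond a residual variance, and no Dirichlet process. It assumes the standard noninformative prior p(beta, sigma^2) proportional to 1/sigma^2, under which both conditional distributions have closed forms. The function name and the data are invented for illustration.

```python
import numpy as np

def gibbs_linear_regression(X, y, n_draws=2000, burn_in=200, seed=0):
    """Gibbs sampler for y = X beta + noise under the prior 1/sigma^2."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    chol = np.linalg.cholesky(XtX_inv)
    sigma2 = 1.0
    draws = np.empty((n_draws, p + 1))
    for it in range(burn_in + n_draws):
        # beta | sigma2, y ~ Normal(beta_hat, sigma2 * (X'X)^{-1})
        beta = beta_hat + np.sqrt(sigma2) * (chol @ rng.standard_normal(p))
        # sigma2 | beta, y ~ Inverse-Gamma(n / 2, SSR / 2)
        ssr = np.sum((y - X @ beta) ** 2)
        sigma2 = (ssr / 2.0) / rng.gamma(n / 2.0)
        if it >= burn_in:
            draws[it - burn_in] = np.append(beta, sigma2)
    return draws

# Synthetic data (an assumption): a straight line with Gaussian scatter.
data_rng = np.random.default_rng(3)
x = data_rng.uniform(0.0, 10.0, 100)
y = 2.0 + 0.7 * x + data_rng.normal(0.0, 0.5, 100)
X = np.column_stack([np.ones_like(x), x])

draws = gibbs_linear_regression(X, y)
posterior_mean = draws.mean(axis=0)  # [intercept, slope, sigma^2]
print(posterior_mean)
```

Each sweep alternates between drawing the coefficients given the variance and the variance given the coefficients; richer models of the kind the abstract describes add further conditional draws to the same loop.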
Geodesic least squares regression on information manifolds
Verdoolaege, Geert
2014-12-05
We present a novel regression method targeted at situations with significant uncertainty on both the dependent and independent variables or with non-Gaussian distribution models. Unlike the classic regression model, the conditional distribution of the response variable suggested by the data need not be the same as the modeled distribution. Instead they are matched by minimizing the Rao geodesic distance between them. This yields a more flexible regression method that is less constrained by the assumptions imposed through the regression model. As an example, we demonstrate the improved resistance of our method against some flawed model assumptions and we apply this to scaling laws in magnetic confinement fusion.
Quantile regression applied to spectral distance decay
Rocchini, D.; Cade, B.S.
2008-01-01
Remotely sensed imagery has long been recognized as a powerful support for characterizing and estimating biodiversity. Spectral distance among sites has proven to be a powerful approach for detecting species composition variability. Regression analysis of species similarity versus spectral distance allows us to quantitatively estimate the amount of turnover in species composition with respect to spectral and ecological variability. In classical regression analysis, the residual sum of squares is minimized for the mean of the dependent variable distribution. However, many ecological data sets are characterized by a high number of zeroes that add noise to the regression model. Quantile regressions can be used to evaluate trend in the upper quantiles rather than a mean trend across the whole distribution of the dependent variable. In this letter, we used ordinary least squares (OLS) and quantile regressions to estimate the decay of species similarity versus spectral distance. The achieved decay rates were statistically nonzero (p < 0.01), considering both OLS and quantile regressions. Nonetheless, the OLS regression estimate of the mean decay rate was only half the decay rate indicated by the upper quantiles. Moreover, the intercept value, representing the similarity reached when the spectral distance approaches zero, was very low compared with the intercepts of the upper quantiles, which detected high species similarity when habitats are more similar. In this letter, we demonstrated the power of using quantile regressions applied to spectral distance decay to reveal species diversity patterns otherwise lost or underestimated by OLS regression. © 2008 IEEE.
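The contrast between an OLS slope and an upper-quantile slope can be reproduced on synthetic data. The sketch below is not the authors' analysis: it fits a linear quantile regression by minimizing the check (pinball) loss with a simple majorize-minimize iteration, a swapped-in technique chosen here because it needs only NumPy, and compares the result with OLS on heteroscedastic data. The data-generating model, iteration count and smoothing constant are assumptions.

```python
import numpy as np

def pinball_loss(residuals, tau):
    """Check loss: tau * r for r >= 0 and (tau - 1) * r for r < 0."""
    r = np.asarray(residuals, dtype=float)
    return float(np.sum(np.where(r >= 0, tau * r, (tau - 1.0) * r)))

def quantile_regression_mm(X, y, tau, n_iter=300, eps=1e-6):
    """Approximate linear quantile regression via majorize-minimize steps.

    Each step solves X'WX beta = X'Wy + (2 tau - 1) X'1 with weights
    w_i = 1 / (eps + |r_i|) computed from the current residuals.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # start from the OLS fit
    shift = (2.0 * tau - 1.0) * X.sum(axis=0)
    for _ in range(n_iter):
        weights = 1.0 / (eps + np.abs(y - X @ beta))
        XtW = X.T * weights
        beta = np.linalg.solve(XtW @ X, XtW @ y + shift)
    return beta

# Synthetic heteroscedastic data (an assumption): spread grows with x.
rng = np.random.default_rng(4)
x = rng.uniform(0.0, 10.0, 500)
y = 1.0 + 0.5 * x + (0.2 + 0.3 * x) * rng.standard_normal(500)
X = np.column_stack([np.ones_like(x), x])

beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
beta_q90 = quantile_regression_mm(X, y, tau=0.9)
fraction_below = float(np.mean(y <= X @ beta_q90))
print(beta_ols, beta_q90, fraction_below)
```

When the spread of the response changes with the predictor, the mean trend and the upper-quantile trend have different slopes, which is the kind of pattern a mean-only analysis cannot show.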
Hybrid fuzzy regression with trapezoidal fuzzy data
NASA Astrophysics Data System (ADS)
Razzaghnia, T.; Danesh, S.; Maleki, A.
2011-12-01
This research deals with a method for hybrid fuzzy least-squares regression. The extension of symmetric triangular fuzzy coefficients to asymmetric trapezoidal fuzzy coefficients is considered as an effective measure for removing unnecessary fuzziness of the linear fuzzy model. First, a trapezoidal fuzzy variable is applied to derive a bivariate regression model. Next, normal equations are formulated to solve for the four parts of the hybrid regression coefficients. The model is also extended to multiple regression analysis. Finally, the method is compared with Y.-H. O. Chang's model.
Radman, Andreja; Gredičak, Matija; Kopriva, Ivica; Jerić, Ivanka
2011-01-01
Predicting antitumor activity of compounds using regression models trained on a small number of compounds with measured biological activity is an ill-posed inverse problem. Yet, it occurs very often within the academic community. To counteract, to some extent, overfitting problems caused by a small training set, we propose to use a consensus of six regression models for prediction of biological activity of a virtual library of compounds. The QSAR descriptors of 22 compounds related to the opioid growth factor (OGF, Tyr-Gly-Gly-Phe-Met) with known antitumor activity were used to train regression models: the feed-forward artificial neural network, the k-nearest neighbor, sparseness constrained linear regression, the linear and nonlinear (with polynomial and Gaussian kernel) support vector machine. Regression models were applied to a virtual library of 429 compounds, which resulted in six lists with candidate compounds ranked by predicted antitumor activity. The highly ranked candidate compounds were synthesized, characterized and tested for antiproliferative activity. Some of the prepared peptides showed more pronounced activity compared with the native OGF; however, they were less active than highly ranked compounds selected previously by the radial basis function support vector machine (RBF SVM) regression model. The ill-posedness of the related inverse problem causes unstable behavior of trained regression models on test data. These results point to high complexity of prediction based on the regression models trained on a small data sample. PMID:22272081
Nonlinear Hysteretic Torsional Waves.
Cabaret, J; Béquin, P; Theocharis, G; Andreev, V; Gusev, V E; Tournat, V
2015-07-31
We theoretically study and experimentally report the propagation of nonlinear hysteretic torsional pulses in a vertical granular chain made of cm-scale, self-hanged magnetic beads. As predicted by contact mechanics, the torsional coupling between two beads is found to be nonlinear hysteretic. This results in a nonlinear pulse distortion essentially different from the distortion predicted by classical nonlinearities and in a complex dynamic response depending on the history of the wave particle angular velocity. Both are consistent with the predictions of purely hysteretic nonlinear elasticity and the Preisach-Mayergoyz hysteresis model, providing the opportunity to study the phenomenon of nonlinear dynamic hysteresis in the absence of other types of material nonlinearities. The proposed configuration reveals a plethora of interesting phenomena including giant amplitude-dependent attenuation, short-term memory, as well as dispersive properties. Thus, it could find interesting applications in nonlinear wave control devices such as strong amplitude-dependent filters. PMID:26274421
Nonlinear Hysteretic Torsional Waves
NASA Astrophysics Data System (ADS)
Cabaret, J.; Béquin, P.; Theocharis, G.; Andreev, V.; Gusev, V. E.; Tournat, V.
2015-07-01
We theoretically study and experimentally report the propagation of nonlinear hysteretic torsional pulses in a vertical granular chain made of cm-scale, self-hanged magnetic beads. As predicted by contact mechanics, the torsional coupling between two beads is found to be nonlinear hysteretic. This results in a nonlinear pulse distortion essentially different from the distortion predicted by classical nonlinearities and in a complex dynamic response depending on the history of the wave particle angular velocity. Both are consistent with the predictions of purely hysteretic nonlinear elasticity and the Preisach-Mayergoyz hysteresis model, providing the opportunity to study the phenomenon of nonlinear dynamic hysteresis in the absence of other types of material nonlinearities. The proposed configuration reveals a plethora of interesting phenomena including giant amplitude-dependent attenuation, short-term memory, as well as dispersive properties. Thus, it could find interesting applications in nonlinear wave control devices such as strong amplitude-dependent filters.
Ahn, Jae Joon; Kim, Young Min; Yoo, Keunje; Park, Joonhong; Oh, Kyong Joo
2012-11-01
For groundwater conservation and management, it is important to accurately assess groundwater pollution vulnerability. This study proposed an integrated model using ridge regression and a genetic algorithm (GA) to effectively select the major hydro-geological parameters influencing groundwater pollution vulnerability in an aquifer. The GA-Ridge regression method determined that depth to water, net recharge, topography, and the impact of vadose zone media were the hydro-geological parameters that influenced trichloroethene pollution vulnerability in a Korean aquifer. When using these selected hydro-geological parameters, the accuracy was improved for various statistical nonlinear and artificial intelligence (AI) techniques, such as multinomial logistic regression, decision trees, artificial neural networks, and case-based reasoning. These results provide a proof of concept that the GA-Ridge regression is effective at determining influential hydro-geological parameters for the pollution vulnerability of an aquifer, and in turn, improves the AI performance in assessing groundwater pollution vulnerability. PMID:22124584
Regression Analysis and the Sociological Imagination
ERIC Educational Resources Information Center
De Maio, Fernando
2014-01-01
Regression analysis is an important aspect of most introductory statistics courses in sociology but is often presented in contexts divorced from the central concerns that bring students into the discipline. Consequently, we present five lesson ideas that emerge from a regression analysis of income inequality and mortality in the USA and Canada.
Illustration of Regression towards the Means
ERIC Educational Resources Information Center
Govindaraju, K.; Haslett, S. J.
2008-01-01
This article presents a procedure for generating a sequence of data sets which will yield exactly the same fitted simple linear regression equation y = a + bx. Unless rescaled, the generated data sets will have progressively smaller variability for the two variables, and the associated response and covariate will "regress" towards their…
Stepwise versus Hierarchical Regression: Pros and Cons
ERIC Educational Resources Information Center
Lewis, Mitzi
2007-01-01
Multiple regression is commonly used in social and behavioral data analysis. In multiple regression contexts, researchers are very often interested in determining the "best" predictors in the analysis. This focus may stem from a need to identify those predictors that are supportive of theory. Alternatively, the researcher may simply be interested…
Cross-Validation, Shrinkage, and Multiple Regression.
ERIC Educational Resources Information Center
Hynes, Kevin
One aspect of multiple regression--the shrinkage of the multiple correlation coefficient on cross-validation--is reviewed. The paper consists of four sections. In section one, the distinction between a fixed and a random multiple regression model is made explicit. In section two, the cross-validation paradigm and an explanation for the occurrence…
Principles of Quantile Regression and an Application
ERIC Educational Resources Information Center
Chen, Fang; Chalhoub-Deville, Micheline
2014-01-01
Newer statistical procedures are typically introduced to help address the limitations of those already in practice or to deal with emerging research needs. Quantile regression (QR) is introduced in this paper as a relatively new methodology, which is intended to overcome some of the limitations of least squares mean regression (LMR). QR is more…
Regression Analysis: Legal Applications in Institutional Research
ERIC Educational Resources Information Center
Frizell, Julie A.; Shippen, Benjamin S., Jr.; Luna, Andrew L.
2008-01-01
This article reviews multiple regression analysis, describes how its results should be interpreted, and instructs institutional researchers on how to conduct such analyses using an example focused on faculty pay equity between men and women. The use of multiple regression analysis will be presented as a method with which to compare salaries of…
Dealing with Outliers: Robust, Resistant Regression
ERIC Educational Resources Information Center
Glasser, Leslie
2007-01-01
Least-squares linear regression is the best of statistics and it is the worst of statistics. The reasons for this paradoxical claim, arising from possible inapplicability of the method and the excessive influence of "outliers", are discussed and substitute regression methods based on median selection, which is both robust and resistant, are…
A Practical Guide to Regression Discontinuity
ERIC Educational Resources Information Center
Jacob, Robin; Zhu, Pei; Somers, Marie-Andrée; Bloom, Howard
2012-01-01
Regression discontinuity (RD) analysis is a rigorous nonexperimental approach that can be used to estimate program impacts in situations in which candidates are selected for treatment based on whether their value for a numeric rating exceeds a designated threshold or cut-point. Over the last two decades, the regression discontinuity approach has…
Sulphasalazine and regression of rheumatoid nodules.
Englert, H J; Hughes, G R; Walport, M J
1987-03-01
The regression of small rheumatoid nodules was noted in four patients after starting sulphasalazine therapy. This coincided with an improvement in synovitis and also falls in erythrocyte sedimentation rate (ESR) and C reactive protein (CRP). The relation between the nodule regression and the sulphasalazine therapy is discussed. PMID:2883940
A Simulation Investigation of Principal Component Regression.
ERIC Educational Resources Information Center
Allen, David E.
Regression analysis is one of the more common analytic tools used by researchers. However, multicollinearity between the predictor variables can cause problems in using the results of regression analyses. Problems associated with multicollinearity include entanglement of relative influences of variables due to reduced precision of estimation,…
Three-Dimensional Modeling in Linear Regression.
ERIC Educational Resources Information Center
Herman, James D.
Linear regression examines the relationship between one or more independent (predictor) variables and a dependent variable. By using a particular formula, regression determines the weights needed to minimize the error term for a given set of predictors. With one predictor variable, the relationship between the predictor and the dependent variable…
Symplectic geometry spectrum regression for prediction of noisy time series
NASA Astrophysics Data System (ADS)
Xie, Hong-Bo; Dokos, Socrates; Sivakumar, Bellie; Mengersen, Kerrie
2016-05-01
We present the symplectic geometry spectrum regression (SGSR) technique as well as a regularized method based on SGSR for prediction of nonlinear time series. The main tool of analysis is the symplectic geometry spectrum analysis, which decomposes a time series into the sum of a small number of independent and interpretable components. The key to successful regularization is to damp higher order symplectic geometry spectrum components. The effectiveness of SGSR and its superiority over local approximation using ordinary least squares are demonstrated through prediction of two noisy synthetic chaotic time series (Lorenz and Rössler series), and then tested for prediction of three real-world data sets (Mississippi River flow data and electromyographic and mechanomyographic signal recorded from human body).
Symplectic geometry spectrum regression for prediction of noisy time series.
Xie, Hong-Bo; Dokos, Socrates; Sivakumar, Bellie; Mengersen, Kerrie
2016-05-01
We present the symplectic geometry spectrum regression (SGSR) technique as well as a regularized method based on SGSR for prediction of nonlinear time series. The main tool of analysis is the symplectic geometry spectrum analysis, which decomposes a time series into the sum of a small number of independent and interpretable components. The key to successful regularization is to damp higher order symplectic geometry spectrum components. The effectiveness of SGSR and its superiority over local approximation using ordinary least squares are demonstrated through prediction of two noisy synthetic chaotic time series (Lorenz and Rössler series), and then tested for prediction of three real-world data sets (Mississippi River flow data and electromyographic and mechanomyographic signal recorded from human body). PMID:27300890
Improved speech inversion using general regression neural network.
Najnin, Shamima; Banerjee, Bonny
2015-09-01
The problem of nonlinear acoustic to articulatory inversion mapping is investigated in the feature space using two models, the deep belief network (DBN) which is the state-of-the-art, and the general regression neural network (GRNN). The task is to estimate a set of articulatory features for improved speech recognition. Experiments with MOCHA-TIMIT and MNGU0 databases reveal that, for speech inversion, GRNN yields a lower root-mean-square error and a higher correlation than DBN. It is also shown that conjunction of acoustic and GRNN-estimated articulatory features yields state-of-the-art accuracy in broad class phonetic classification and phoneme recognition using less computational power. PMID:26428818
On robust regression with high-dimensional predictors
El Karoui, Noureddine; Bean, Derek; Bickel, Peter J.; Lim, Chinghway; Yu, Bin
2013-01-01
We study regression M-estimates in the setting where p, the number of covariates, and n, the number of observations, are both large, but $p \le n$. We find an exact stochastic representation for the distribution of $\hat{\beta} = \operatorname{argmin}_{\beta \in \mathbb{R}^p} \sum_{i=1}^{n} \rho(Y_i - X_i'\beta)$ at fixed p and n under various assumptions on the objective function ρ and our statistical model. A scalar random variable whose deterministic limit $r_\rho(\kappa)$ can be studied when $p/n \to \kappa > 0$ plays a central role in this representation. We discover a nonlinear system of two deterministic equations that characterizes $r_\rho(\kappa)$. Interestingly, the system shows that $r_\rho(\kappa)$ depends on ρ through proximal mappings of ρ as well as various aspects of the statistical model underlying our study. Several surprising results emerge. In particular, we show that, when $p/n$ is large enough, least squares becomes preferable to least absolute deviations for double-exponential errors. PMID:23954908
Nonlinear Pricing in Energy and Environmental Markets
NASA Astrophysics Data System (ADS)
Ito, Koichiro
This dissertation consists of three empirical studies on nonlinear pricing in energy and environmental markets. The first investigates how consumers respond to multi-tier nonlinear price schedules for residential electricity. Chapter 2 asks a similar research question for residential water pricing. Finally, I examine the effect of nonlinear financial rewards for energy conservation by applying a regression discontinuity design to a large-scale electricity rebate program that was implemented in California. Economic theory generally assumes that consumers respond to marginal prices when making economic decisions, but this assumption may not hold for complex price schedules. The chapter "Do Consumers Respond to Marginal or Average Price? Evidence from Nonlinear Electricity Pricing" provides empirical evidence that consumers respond to average price rather than marginal price when faced with nonlinear electricity price schedules. Nonlinear price schedules, such as progressive income tax rates and multi-tier electricity prices, complicate economic decisions by creating multiple marginal prices for the same good. Evidence from laboratory experiments suggests that consumers facing such price schedules may respond to average price as a heuristic. I empirically test this prediction using field data by exploiting price variation across a spatial discontinuity in electric utility service areas. The territory border of two electric utilities lies within several city boundaries in southern California. As a result, nearly identical households experience substantially different nonlinear electricity price schedules. Using monthly household-level panel data from 1999 to 2008, I find strong evidence that consumers respond to average price rather than marginal or expected marginal price. I show that even though this sub-optimizing behavior has a minimal impact on individual welfare, it can critically alter the policy implications of nonlinear pricing. The second chapter " How Do
Technology Transfer Automated Retrieval System (TEKTRAN)
In precision agriculture, regression has been used widely to quantify the relationship between soil attributes and other environmental variables. However, spatial correlation existing in soil samples usually makes the regression model suboptimal. In this study, a regression-kriging method was attemp...
NASA Astrophysics Data System (ADS)
Darnah
2016-04-01
Poisson regression is used when the response variable is count data that follow the Poisson distribution. The Poisson distribution assumes equidispersion, that is, a variance equal to the mean. In practice, count data are often overdispersed or underdispersed, so Poisson regression is inappropriate: it may underestimate the standard errors and overstate the significance of the regression parameters, and consequently give misleading inference about them. This paper suggests the generalized Poisson regression model for handling overdispersion and underdispersion in the Poisson regression model. The Poisson regression model and the generalized Poisson regression model are applied to the number of filariasis cases in East Java. Under the Poisson regression model, the factors that influence filariasis are the percentage of families who do not practice clean and healthy living and the percentage of families who do not have a healthy house. Because the Poisson regression model exhibits overdispersion, generalized Poisson regression is used. The best generalized Poisson regression model shows that the factor influencing filariasis is the percentage of families who do not have a healthy house. The model is interpreted as follows: each additional 1 percent of families who do not have a healthy house adds one filariasis patient.
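The dispersion assumption discussed in this abstract can be screened with a simple variance-to-mean ratio before choosing between Poisson and generalized Poisson regression. The sketch below is only an illustration: the function name `dispersion_ratio` and the counts are hypothetical (they are not the East Java data), and the ratio is a marginal check that ignores covariates.

```python
import numpy as np

def dispersion_ratio(counts):
    """Sample variance divided by sample mean of a vector of counts.

    A ratio near 1 is consistent with the Poisson equidispersion assumption;
    a ratio well above 1 suggests overdispersion, well below 1 underdispersion.
    """
    counts = np.asarray(counts, dtype=float)
    return counts.var(ddof=1) / counts.mean()

# Hypothetical counts, invented purely for illustration.
example_counts = [0, 1, 0, 3, 12, 0, 2, 7, 0, 1]
ratio = dispersion_ratio(example_counts)
print(f"variance/mean = {ratio:.2f}")
```

A ratio far from 1, as here, would be one informal reason to look beyond the plain Poisson model; a formal test would work with the residuals of a fitted regression instead.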
Investigating bias in squared regression structure coefficients
Nimon, Kim F.; Zientek, Linda R.; Thompson, Bruce
2015-01-01
The importance of structure coefficients and analogs of regression weights for analysis within the general linear model (GLM) has been well-documented. The purpose of this study was to investigate bias in squared structure coefficients in the context of multiple regression and to determine if a formula that had been shown to correct for bias in squared Pearson correlation coefficients and coefficients of determination could be used to correct for bias in squared regression structure coefficients. Using data from a Monte Carlo simulation, this study found that squared regression structure coefficients corrected with Pratt's formula produced less biased estimates and might be more accurate and stable estimates of population squared regression structure coefficients than estimates with no such corrections. While our findings are in line with prior literature that identified multicollinearity as a predictor of bias in squared regression structure coefficients but not coefficients of determination, the findings from this study are unique in that the level of predictive power, number of predictors, and sample size were also observed to contribute bias in squared regression structure coefficients. PMID:26217273
NASA Technical Reports Server (NTRS)
Whitlock, C. H.; Kuo, C. Y.
1979-01-01
The objective of this paper is to define optical physics and/or environmental conditions under which the linear multiple-regression should be applicable. An investigation of the signal-response equations is conducted and the concept is tested by application to actual remote sensing data from a laboratory experiment performed under controlled conditions. Investigation of the signal-response equations shows that the exact solution for a number of optical physics conditions is of the same form as a linearized multiple-regression equation, even if nonlinear contributions from surface reflections, atmospheric constituents, or other water pollutants are included. Limitations on achieving this type of solution are defined.
Tian, Pan; Hu, Jie; Qi, Jin; Xia, Peng; Peng, Ying-Hong
2015-01-01
Applications of neural machine interfaces have received increased attention during the last decades. It is crucial to realize the continuous control of prosthetic devices based on biological signals. In order to deal with the highly nonlinear relationship between the Electromyography (EMG) signals and motion, this study presents a novel decoding approach which employs multi-output support vector regression (M-SVR). The proposed M-SVR is compared with other popular regression techniques and the experimental results demonstrate the effectiveness of M-SVR in hand continuous movement trajectory reconstruction. PMID:26406051
Alley, William M.
1986-01-01
Problems involving the combined use of contaminant transport models and nonlinear optimization schemes can be very expensive to solve. This paper explores the use of transport models with ordinary regression and regression on ranks to develop approximate response functions of concentrations at critical locations as a function of pumping and recharge at decision wells. These response functions combined with other constraints can often be solved very easily and may suggest reasonable starting points for combined simulation-management modeling or even relatively efficient operating schemes in themselves.
A Regression Algorithm for Model Reduction of Large-Scale Multi-Dimensional Problems
NASA Astrophysics Data System (ADS)
Rasekh, Ehsan
2011-11-01
Model reduction is an approach for fast and cost-efficient modelling of large-scale systems governed by Ordinary Differential Equations (ODEs). Multi-dimensional model reduction has been suggested for reduction of the linear systems simultaneously with respect to frequency and any other parameter of interest. Multi-dimensional model reduction is also used to reduce the weakly nonlinear systems based on Volterra theory. Multiple dimensions degrade the efficiency of reduction by increasing the size of the projection matrix. In this paper a new methodology is proposed to efficiently build the reduced model based on regression analysis. A numerical example confirms the validity of the proposed regression algorithm for model reduction.
The Current and Future Use of Ridge Regression for Prediction in Quantitative Genetics
de Vlaming, Ronald; Groenen, Patrick J. F.
2015-01-01
In recent years, there has been a considerable amount of research on the use of regularization methods for inference and prediction in quantitative genetics. Such research mostly focuses on selection of markers and shrinkage of their effects. In this review paper, the use of ridge regression for prediction in quantitative genetics using single-nucleotide polymorphism data is discussed. In particular, we consider (i) the theoretical foundations of ridge regression, (ii) its link to commonly used methods in animal breeding, (iii) the computational feasibility, and (iv) the scope for constructing prediction models with nonlinear effects (e.g., dominance and epistasis). Based on a simulation study we gauge the current and future potential of ridge regression for prediction of human traits using genome-wide SNP data. We conclude that, for outcomes with a relatively simple genetic architecture, given current sample sizes in most cohorts (i.e., N < 10,000) the predictive accuracy of ridge regression is slightly higher than the classical genome-wide association study approach of repeated simple regression (i.e., one regression per SNP). However, both capture only a small proportion of the heritability. Nevertheless, we find evidence that for large-scale initiatives, such as biobanks, sample sizes can be achieved where ridge regression compared to the classical approach improves predictive accuracy substantially. PMID:26273586
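As a rough illustration of the ridge estimator this review discusses, the closed-form solution of the penalized least-squares problem can be sketched in a few lines. The function name `ridge_fit`, the penalty values, and the tiny design matrix are assumptions made for this example; they do not come from the paper, and genome-wide SNP data would call for the more careful computational approaches the review covers.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge coefficients from the normal equations (X'X + lam * I) beta = X'y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

# Tiny hypothetical design: three observations, two predictors.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 2.0, 3.0]

beta_ols = ridge_fit(X, y, lam=0.0)    # no penalty: ordinary least squares
beta_ridge = ridge_fit(X, y, lam=1.0)  # penalty shrinks coefficients toward zero
print(beta_ols, beta_ridge)
```

With `lam=0` the function reduces to ordinary least squares on this well-conditioned toy problem; a positive penalty trades some bias for reduced variance, which is the shrinkage effect referred to above.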
Regression of altitude-produced cardiac hypertrophy.
NASA Technical Reports Server (NTRS)
Sizemore, D. A.; Mcintyre, T. W.; Van Liere, E. J.; Wilson, M. F.
1973-01-01
The rate of regression of cardiac hypertrophy with time has been determined in adult male albino rats. The hypertrophy was induced by intermittent exposure to simulated high altitude. The percentage hypertrophy was much greater (46%) in the right ventricle than in the left (16%). The regression could be adequately fitted to a single exponential function with a half-time of 6.73 ± 0.71 days (90% CI). There was no significant difference in the rates of regression for the two ventricles.
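For readers who want the single-exponential fit in explicit form, the relation between the reported half-time and the decay rate can be written out as follows. The symbols $H(t)$, $H_0$ and $k$ are notation introduced here for illustration, not taken from the report; only the half-time of 6.73 days comes from the abstract.

```latex
% Single-exponential regression of hypertrophy, with H_0 the excess at t = 0
H(t) = H_0 \, e^{-k t}, \qquad
t_{1/2} = \frac{\ln 2}{k}
\quad\Longrightarrow\quad
k = \frac{\ln 2}{6.73\ \text{days}} \approx 0.103\ \text{day}^{-1}

% Equivalent form in terms of the half-time
H(t) = H_0 \, 2^{-t / t_{1/2}}, \qquad
H(3\, t_{1/2}) = \tfrac{1}{8} H_0 \quad (t \approx 20.2\ \text{days})
```

Under this model, roughly one eighth of the initial excess would remain after three half-times, or about 20 days.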
L-moments under nuisance regression
NASA Astrophysics Data System (ADS)
Picek, Jan; Schindler, Martin
2016-06-01
The L-moments are analogues of the conventional moments and have similar interpretations. They are calculated using linear combinations of the expectation of ordered data. In practice, L-moments must usually be estimated from a random sample drawn from an unknown distribution as a linear combination of ordered statistics. Jureckova and Picek (2014) showed that averaged regression quantile is asymptotically equivalent to the location quantile. We therefore propose a generalization of L-moments in the model with nuisance regression using the averaged regression quantiles.
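The statement that L-moments are estimated as linear combinations of order statistics can be made concrete with the first two sample L-moments. This sketch covers only the ordinary location-model case, not the nuisance-regression generalization proposed in the abstract; the function name `sample_l_moments` is hypothetical, and it uses the standard unbiased probability-weighted-moment estimators.

```python
import numpy as np

def sample_l_moments(data):
    """First two sample L-moments (l1, l2) from probability-weighted moments.

    b0 is the sample mean; b1 weights the i-th order statistic by (i - 1)/(n - 1).
    Then l1 = b0 (location) and l2 = 2*b1 - b0 (scale). Requires n >= 2.
    """
    x = np.sort(np.asarray(data, dtype=float))  # order statistics
    n = x.size
    ranks = np.arange(n)                        # i - 1 for i = 1..n
    b0 = x.mean()
    b1 = (ranks / (n - 1) * x).sum() / n
    return b0, 2.0 * b1 - b0

l1, l2 = sample_l_moments([5, 1, 4, 2, 3])
print(l1, l2)
```

The second L-moment equals half the mean absolute difference between two randomly chosen observations, which is why it serves as a scale measure analogous to the standard deviation.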
Sparse Multivariate Regression With Covariance Estimation
Rothman, Adam J.; Levina, Elizaveta; Zhu, Ji
2014-01-01
We propose a procedure for constructing a sparse estimator of a multivariate regression coefficient matrix that accounts for correlation of the response variables. This method, which we call multivariate regression with covariance estimation (MRCE), involves penalized likelihood with simultaneous estimation of the regression coefficients and the covariance structure. An efficient optimization algorithm and a fast approximation are developed for computing MRCE. Using simulation studies, we show that the proposed method outperforms relevant competitors when the responses are highly correlated. We also apply the new method to a finance example on predicting asset returns. An R-package containing this dataset and code for computing MRCE and its approximation are available online. PMID:24963268
Spontaneous Regression of Primitive Merkel Cell Carcinoma
2015-01-01
Merkel cell carcinoma (MCC) is a rare, aggressive skin tumor that mainly occurs in the elderly with a generally poor prognosis. Like all skin cancers, its incidence is rising. Despite the poor prognosis, a few reports of spontaneous regression have been published. We describe the case of an 89-year-old male patient who presented with two MCC lesions of the scalp. Following biopsy, the lesions underwent complete regression with no clinical evidence of residual tumor up to 24 months. The current knowledge of MCC and the other cases of spontaneous regression described in the literature are reviewed. PMID:26788270
Testing the product of slopes in related regressions.
Morrell, Christopher H; Shetty, Veena; Phillips, Terry; Arumugam, Thiruma V; Mattson, Mark P; Wan, Ruiqian
2013-09-01
A study was conducted of the relationships among neuroprotective factors and cytokines in brain tissue of mice at different ages that were examined on the effect of dietary restriction on protection after experimentally induced brain stroke. It was of interest to assess whether the cross-product of the slopes of pairs of variables vs. age was positive or negative. To accomplish this, the product of the slopes was estimated and tested to determine if it is significantly different from zero. Since the measurements are taken on the same animals, the models used must account for the non-independence of the measurements within animals. A number of approaches are illustrated. First a multivariate multiple regression model is employed. Since we are interested in a nonlinear function of the parameters (the product) the delta method is used to obtain the standard error of the estimate of the product. Second, a linear mixed-effects model is fit that allows for the specification of an appropriate correlation structure among repeated measurements. The delta method is again used to obtain the standard error. Finally, a non-linear mixed-effects approach is taken to fit the linear-mixed-effects model and conduct the test. A simulation study investigates the properties of the procedure. PMID:25346580
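The delta-method step mentioned in this abstract has a simple closed form for a product of two slopes: the gradient of b1*b2 is (b2, b1), so the approximate variance is b2^2 Var(b1) + b1^2 Var(b2) + 2 b1 b2 Cov(b1, b2). The sketch below assumes the slope estimates and their covariance matrix entries have already been obtained from a fitted model; the function name and the numbers are hypothetical.

```python
import math

def product_delta_se(b1, b2, var1, var2, cov12):
    """Delta-method standard error of the product b1 * b2.

    var1 and var2 are the estimated variances of the two slopes and cov12 is
    their estimated covariance (nonzero when both come from the same animals).
    """
    variance = b2 ** 2 * var1 + b1 ** 2 * var2 + 2.0 * b1 * b2 * cov12
    return math.sqrt(variance)

# Hypothetical slope estimates and covariance entries, for illustration only.
b1, b2 = 2.0, 3.0
se = product_delta_se(b1, b2, var1=0.04, var2=0.09, cov12=0.01)
z = (b1 * b2) / se  # Wald-type statistic for H0: product of slopes = 0
print(f"product = {b1 * b2:.2f}, SE = {se:.4f}, z = {z:.2f}")
```

The covariance term is what makes the within-animal dependence matter: dropping it would treat the two slopes as if they were estimated from independent samples.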
Fuzzy regression modeling for tool performance prediction and degradation detection.
Li, X; Er, M J; Lim, B S; Zhou, J H; Gan, O P; Rutkowski, L
2010-10-01
In this paper, the viability of using Fuzzy-Rule-Based Regression Modeling (FRM) algorithm for tool performance and degradation detection is investigated. The FRM is developed based on a multi-layered fuzzy-rule-based hybrid system with Multiple Regression Models (MRM) embedded into a fuzzy logic inference engine that employs Self Organizing Maps (SOM) for clustering. The FRM converts a complex nonlinear problem to a simplified linear format in order to further increase the accuracy in prediction and rate of convergence. The efficacy of the proposed FRM is tested through a case study - namely to predict the remaining useful life of a ball nose milling cutter during a dry machining process of hardened tool steel with a hardness of 52-54 HRc. A comparative study is further made between four predictive models using the same set of experimental data. It is shown that the FRM is superior as compared with conventional MRM, Back Propagation Neural Networks (BPNN) and Radial Basis Function Networks (RBFN) in terms of prediction accuracy and learning speed. PMID:20945519
Kernel Regression Techniques for Enhancing Spitzer Photometric Precision
NASA Astrophysics Data System (ADS)
Ingalls, James G.; Krick, Jessica; Carey, Sean; Grillmair, Carl J.; Lowrance, Patrick; Glaccum, William; Laine, Seppo; Surace, Jason Anthony
2015-08-01
The Infrared Array Camera (IRAC) on the Spitzer Space Telescope has been used to measure < 0.01% temporal variations in the fluxes of exoplanet systems. The IRAC PSF at both 3.6 and 4.5 μm is undersampled and thus the detector arrays show variations of as much as 8% in sensitivity as the center of the PSF moves across a pixel due to normal spacecraft motions. This is the largest source of correlated noise in IRAC photometry. We describe the latest progress towards an independent calibration of the intra-pixel gain that does not rely on the measurements to be calibrated. The technique begins with: (1) localizing the sub-pixel position of a point source using Spitzer’s Pointing Calibration and Reference Sensor (PCRS); and (2) harnessing a “training set” of many thousands of densely spaced photometric measurements of a non-variable star. Kernel regression, where the training data are nonlinearly combined based on a distance metric for each data point, leads to significant improvements in photometric precision over our previous gridded method. The distance metric we use was derived from a supervised learning algorithm to minimize regression error. We conclude that these results rival the precision obtained with self-calibration techniques, but do not risk the removal of astrophysical signals.
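A minimal sketch of kernel regression in the sense described above, where a prediction is a distance-weighted combination of training measurements, is given below. It is not the IRAC pipeline: the Gaussian kernel, the Euclidean distance in sub-pixel position, the fixed bandwidth, and all names are assumptions made for illustration, whereas the abstract states that the actual distance metric was learned by a supervised algorithm.

```python
import numpy as np

def kernel_regression(train_pos, train_val, query_pos, bandwidth):
    """Nadaraya-Watson estimate with a Gaussian kernel.

    train_pos: (n_train, 2) sub-pixel positions of the training measurements
    train_val: (n_train,) measured relative sensitivities at those positions
    query_pos: (n_query, 2) positions at which to predict the sensitivity
    """
    train_pos = np.asarray(train_pos, dtype=float)
    train_val = np.asarray(train_val, dtype=float)
    query_pos = np.asarray(query_pos, dtype=float)
    diff = query_pos[:, None, :] - train_pos[None, :, :]
    sq_dist = (diff ** 2).sum(axis=2)                # squared Euclidean distances
    weights = np.exp(-0.5 * sq_dist / bandwidth ** 2)
    return (weights @ train_val) / weights.sum(axis=1)

# Two hypothetical training points; the query lies midway between them.
train_pos = [[0.0, 0.0], [1.0, 0.0]]
train_val = [1.0, 3.0]
pred = kernel_regression(train_pos, train_val, [[0.5, 0.0]], bandwidth=0.5)
print(pred)
```

Because the query is equidistant from both training points, the two weights are equal and the prediction is their average; with real data the bandwidth controls how local the averaging is.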
Nonparametric survival analysis using Bayesian Additive Regression Trees (BART).
Sparapani, Rodney A; Logan, Brent R; McCulloch, Robert E; Laud, Purushottam W
2016-07-20
Bayesian additive regression trees (BART) provide a framework for flexible nonparametric modeling of relationships of covariates to outcomes. Recently, BART models have been shown to provide excellent predictive performance, for both continuous and binary outcomes, and exceeding that of its competitors. Software is also readily available for such outcomes. In this article, we introduce modeling that extends the usefulness of BART in medical applications by addressing needs arising in survival analysis. Simulation studies of one-sample and two-sample scenarios, in comparison with long-standing traditional methods, establish face validity of the new approach. We then demonstrate the model's ability to accommodate data from complex regression models with a simulation study of a nonproportional hazards scenario with crossing survival functions and survival function estimation in a scenario where hazards are multiplicatively modified by a highly nonlinear function of the covariates. Using data from a recently published study of patients undergoing hematopoietic stem cell transplantation, we illustrate the use and some advantages of the proposed method in medical investigations. Copyright © 2016 John Wiley & Sons, Ltd. PMID:26854022
Different approaches to multivariate calibration of nonlinear sensor data.
Dieterle, Frank; Busche, Stefan; Gauglitz, Günter
2004-10-01
In this study, different approaches to the multivariate calibration of the vapors of two refrigerants are reported. As the relationships between the time-resolved sensor signals and the concentrations of the analytes are nonlinear, the widely used partial least-squares regression (PLS) fails. Therefore, different methods are used, which are known to be able to deal with nonlinearities present in data. First, the Box-Cox transformation, which transforms the dependent variables nonlinearly, was applied. The second approach, the implicit nonlinear PLS regression, tries to account for nonlinearities by introducing squared terms of the independent variables to the original independent variables. The third approach, quadratic PLS (QPLS), uses a nonlinear quadratic inner relationship for the model instead of a linear relationship such as PLS. Tree algorithms are also used, which split a nonlinear problem into smaller subproblems, which are modeled using linear methods or discrete values. Finally, neural networks are applied, which are able to model any relationship. Different special implementations, like genetic algorithms with neural networks and growing neural networks, are also used to prevent an overfitting. Among the fast and simpler algorithms, QPLS shows good results. Different implementations of neural networks show excellent results. Among the different implementations, the most sophisticated and computing-intensive algorithms (growing neural networks) show the best results. Thus, the optimal method for the data set presented is a compromise between quality of calibration and complexity of the algorithm. PMID:15156303
Some Simple Computational Formulas for Multiple Regression
ERIC Educational Resources Information Center
Aiken, Lewis R., Jr.
1974-01-01
Short-cut formulas are presented for direct computation of the beta weights, the standard errors of the beta weights, and the multiple correlation coefficient for multiple regression problems involving three independent variables and one dependent variable. (Author)
TWSVR: Regression via Twin Support Vector Machine.
Khemchandani, Reshma; Goyal, Keshav; Chandra, Suresh
2016-02-01
Taking motivation from the Twin Support Vector Machine (TWSVM) formulation, Peng (2010) attempted to propose Twin Support Vector Regression (TSVR), where the regressor is obtained by solving a pair of quadratic programming problems (QPPs). In this paper we argue that the TSVR formulation is not in the true spirit of TWSVM. Further, taking motivation from Bi and Bennett (2003), we propose an alternative approach to find a formulation for Twin Support Vector Regression (TWSVR) which is in the true spirit of TWSVM. We show that our proposed TWSVR can be derived from TWSVM for an appropriately constructed classification problem. To check the efficacy of our proposed TWSVR we compare its performance with TSVR and classical Support Vector Regression (SVR) on various regression datasets. PMID:26624223
Spontaneous Regression of an Incidental Spinal Meningioma
Yilmaz, Ali; Kizilay, Zahir; Sair, Ahmet; Avcil, Mucahit; Ozkul, Ayca
2016-01-01
AIM: The regression of meningioma has been reported in the literature before. Although regression may be associated with hemorrhage, calcification or the withdrawal of certain drugs, it is rarely observed spontaneously. CASE REPORT: We report a 17-year-old man with a cervical meningioma which was incidentally detected. His cervical MRI showed an extradural, cranio-caudal contrast-enhanced lesion at the C2-C3 levels of the cervical spinal cord. Despite the slight compression of the spinal cord, he had no symptoms and refused any kind of surgical approach. The meningioma was followed by control MRI and it spontaneously regressed within six months. There were no signs of hemorrhage or calcification. CONCLUSION: Although it is a rare condition, clinicians should consider that meningiomas, especially incidentally diagnosed ones, may regress spontaneously. PMID:27275345
A new bivariate negative binomial regression model
NASA Astrophysics Data System (ADS)
Faroughi, Pouya; Ismail, Noriszura
2014-12-01
This paper introduces a new form of bivariate negative binomial (BNB-1) regression which can be fitted to bivariate and correlated count data with covariates. The BNB regression discussed in this study can be fitted to bivariate and overdispersed count data with positive, zero or negative correlations. The joint p.m.f. of the BNB-1 distribution is derived from the product of two negative binomial marginals with a multiplicative factor parameter. Several testing methods were used to check overdispersion and goodness-of-fit of the model. Application of BNB-1 regression is illustrated on a Malaysian motor insurance dataset. The results indicated that BNB-1 regression has a better fit than bivariate Poisson and BNB-2 models with regard to the Akaike information criterion.
NASA Technical Reports Server (NTRS)
Whitlock, C. H.; Kuo, C. Y.
1979-01-01
The paper attempts to define optical physics and/or environmental conditions under which the linear multiple-regression should be applicable. It is reported that investigation of the signal response shows that the exact solution for a number of optical physics conditions is of the same form as a linearized multiple-regression equation, even if nonlinear contributions from surface reflections, atmospheric constituents, or other water pollutants are included. Limitations on achieving this type of solution are defined. Laboratory data are used to demonstrate that the technique is applicable to water mixtures which contain constituents with both linear and nonlinear radiance gradients. Finally, it is concluded that instrument noise, ground-truth placement, and time lapse between remote sensor overpass and water sample operations are serious barriers to successful use of the technique.
The Geometry of Enhancement in Multiple Regression
ERIC Educational Resources Information Center
Waller, Niels G.
2011-01-01
In linear multiple regression, "enhancement" is said to occur when $R^2 = b'r > r'r$, where $b$ is a $p \times 1$ vector of standardized regression coefficients and $r$ is a $p \times 1$ vector of correlations between a criterion $y$ and a set of standardized regressors, $x$. When $p = 1$ then $b \cong r$ and enhancement cannot…
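The enhancement condition (R^2 = b'r greater than r'r) can be checked numerically for a small case. The sketch below is illustrative only: it assumes p = 2 and uses made-up correlations (`r`, `r12`) that do not come from the article, with the second regressor acting as a classical suppressor. The helper name `standardized_betas` is hypothetical.

```python
# Hypothetical correlations, chosen only for illustration (not from the article).
r = [0.5, 0.0]   # correlations of the criterion y with regressors x1 and x2
r12 = 0.5        # correlation between x1 and x2


def standardized_betas(r, r12):
    """Solve Rxx b = r for p = 2, where Rxx = [[1, r12], [r12, 1]]."""
    det = 1.0 - r12 ** 2
    b1 = (r[0] - r12 * r[1]) / det
    b2 = (r[1] - r12 * r[0]) / det
    return [b1, b2]


b = standardized_betas(r, r12)
r_squared = sum(bi * ri for bi, ri in zip(b, r))  # R^2 = b'r
sum_sq_r = sum(ri * ri for ri in r)               # r'r
enhancement = r_squared > sum_sq_r                # True when R^2 exceeds r'r

print("b =", b)
print("R^2 =", r_squared, " r'r =", sum_sq_r, " enhancement:", enhancement)
```

With these assumed values the second regressor is uncorrelated with the criterion yet still raises R^2 above the sum of squared correlations.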
Fuzzy multiple linear regression: A computational approach
NASA Technical Reports Server (NTRS)
Juang, C. H.; Huang, X. H.; Fleming, J. W.
1992-01-01
This paper presents a new computational approach for performing fuzzy regression. In contrast to Bardossy's approach, the new approach, while dealing with fuzzy variables, closely follows the conventional regression technique. In this approach, treatment of fuzzy input is more 'computational' than 'symbolic.' The following sections first outline the formulation of the new approach, then deal with the implementation and computational scheme, and this is followed by examples to illustrate the new procedure.
Multiple-Instance Regression with Structured Data
NASA Technical Reports Server (NTRS)
Wagstaff, Kiri L.; Lane, Terran; Roper, Alex
2008-01-01
We present a multiple-instance regression algorithm that models internal bag structure to identify the items most relevant to the bag labels. Multiple-instance regression (MIR) operates on a set of bags with real-valued labels, each containing a set of unlabeled items, in which the relevance of each item to its bag label is unknown. The goal is to predict the labels of new bags from their contents. Unlike previous MIR methods, MI-ClusterRegress can operate on bags that are structured in that they contain items drawn from a number of distinct (but unknown) distributions. MI-ClusterRegress simultaneously learns a model of the bag's internal structure, the relevance of each item, and a regression model that accurately predicts labels for new bags. We evaluated this approach on the challenging MIR problem of crop yield prediction from remote sensing data. MI-ClusterRegress provided predictions that were more accurate than those obtained with non-multiple-instance approaches or MIR methods that do not model the bag structure.
Kondric, Miran; Trajkovski, Biljana; Strbad, Maja; Foretić, Nikola; Zenić, Natasa
2013-12-01
There is an evident lack of studies investigating morphological influence on physical fitness (PF) among preschool children. The aim of this study was to (1) calculate and interpret linear and nonlinear relationships between simple anthropometric predictors and PF criteria among preschoolers of both genders, and (2) to find critical values of the anthropometric predictors which should be recognized as the breakpoint of the negative influence on the PF. The sample of subjects consisted of 413 preschoolers aged 4 to 6 (mean age, 5.08 years; 176 girls and 237 boys), from Rijeka, Croatia. The anthropometric variables included body height (BH), body weight (BW), sum of triceps and subscapular skinfold (SUMSF), and calculated BMI (BMI = BW (kg)/BH (m)^2). The PF was screened through testing of flexibility, repetitive strength, explosive strength, and agility. Linear and nonlinear (general quadratic model y = a + bx + cx^2) regressions were calculated and interpreted simultaneously. BH and BW are far better predictors of the physical fitness status than BMI and SUMSF. In all calculated regressions excluding the flexibility criterion, linear and nonlinear prediction of the PF through BH and BW reached statistical significance, indicating influence of the advancement in maturity status on PF variables. Differences between linear and nonlinear regressions are smaller in males than in females. There are some indications that the age of 4 to 6 years is a critical period in the prevention of obesity, mostly because the extensively studied and proven negative influence of overweight and adiposity on PF tests is not yet evident. In some cases we have found evident regression breakpoints (approximately 25 kg in boys), which should be interpreted as critical values of the anthropometric measures for the studied sample of subjects. PMID:24611341
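The BMI formula and the general quadratic model y = a + bx + cx^2 named in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the data are synthetic, the function names (`bmi`, `fit_quadratic`, `solve`) are hypothetical, and reading a breakpoint as the turning point -b/(2c) of the fitted parabola is only one possible interpretation.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight in kg divided by squared height in m."""
    return weight_kg / height_m ** 2


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x^2 via the normal equations."""
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    normal = [[s[i + j] for j in range(3)] for i in range(3)]
    return solve(normal, t)


# Synthetic, noise-free data generated from y = 2 + 3x - 0.5x^2 (illustration only).
xs = [0, 1, 2, 3, 4, 5]
ys = [2.0 + 3.0 * x - 0.5 * x ** 2 for x in xs]

a, b, c = fit_quadratic(xs, ys)
turning_point = -b / (2.0 * c)  # x value where the fitted parabola peaks

print("a, b, c =", a, b, c)
print("turning point at x =", turning_point)
print("BMI for 20 kg, 1.10 m:", bmi(20.0, 1.10))
```

Comparing the residuals of this fit with those of a straight-line fit is one way to judge whether the quadratic term adds anything for a given predictor.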
Nonlinear rotordynamics analysis
NASA Technical Reports Server (NTRS)
Day, W. B.
1985-01-01
The special nonlinearities of the Jeffcott equations in rotordynamics are examined. The immediate application of this analysis is directed toward understanding the excessive vibrations recorded in the LOX pump of the SSME during hot firing ground testing. Deadband, side force and rubbing are three possible sources of inducing nonlinearity in the Jeffcott equations. The present analysis initially reduces these problems to the same mathematical description. A special frequency, named the nonlinear natural frequency, is defined and used to develop the solutions of the nonlinear Jeffcott equations as asymptotic expansions. This nonlinear natural frequency, which is the ratio of the cross-stiffness and the damping, plays a major role in determining response frequencies. Numerical solutions are included for comparison with the analysis. Also, nonlinear frequency-response tables are made for a typical range of values.
Stationary nonlinear Airy beams
Lotti, A.; Faccio, D.; Couairon, A.; Papazoglou, D. G.; Panagiotopoulos, P.; Tzortzakis, S.; Abdollahpour, D.
2011-08-15
We demonstrate the existence of an additional class of stationary accelerating Airy wave forms that exist in the presence of third-order (Kerr) nonlinearity and nonlinear losses. Numerical simulations and experiments, in agreement with the analytical model, highlight how these stationary solutions sustain the nonlinear evolution of Airy beams. The generic nature of the Airy solution allows extension of these results to other settings, and a variety of applications are suggested.
Nonlinear Simulation of the Tooth Enamel Spectrum for EPR Dosimetry
NASA Astrophysics Data System (ADS)
Kirillov, V. A.; Dubovsky, S. V.
2016-07-01
Software was developed where initial EPR spectra of tooth enamel were deconvoluted based on nonlinear simulation, line shapes and signal amplitudes in the model initial spectrum were calculated, the regression coefficient was evaluated, and individual spectra were summed. Software validation demonstrated that doses calculated using it agreed excellently with the applied radiation doses and the doses reconstructed by the method of additive doses.
Organic nonlinear optical materials
NASA Technical Reports Server (NTRS)
Umegaki, S.
1987-01-01
Recently, it became clear that organic compounds with delocalized pi electrons show a great nonlinear optical response. Especially, secondary nonlinear optical constants of more than 2 digits were often seen in the molecular level compared to the existing inorganic crystals such as LiNbO3. The crystallization was continuously tried. Organic nonlinear optical crystals have a new future as materials for use in applied physics, such as photomodulation, optical frequency transformation, opto-bistabilization, and phase conjugation optics. Organic nonlinear optical materials, e.g., urea, O2NC6H4NH2, I, II, are reviewed with 50 references.
Nonlinear optics at interfaces
Chen, C.K.
1980-12-01
Two aspects of surface nonlinear optics are explored in this thesis. The first part is a theoretical and experimental study of nonlinear interaction of surface plasmons and bulk photons at metal-dielectric interfaces. The second part is a demonstration and study of surface enhanced second harmonic generation at rough metal surfaces. A general formulation for nonlinear interaction of surface plasmons at metal-dielectric interfaces is presented and applied to both second and third order nonlinear processes. Experimental results for coherent second and third harmonic generation by surface plasmons and surface coherent anti-Stokes Raman spectroscopy (CARS) are shown to be in good agreement with the theory.
A Convenient Spreadsheet Method for Fitting the Nonlinear Langmuir Equation to Sorption Data
Technology Transfer Automated Retrieval System (TEKTRAN)
The Langmuir model is a commonly used model for describing solute and metal sorption to soils. This model can be fit to data using nonlinear regression or, alternatively, a linearized version of the model can be fit to the data using linear regression. Although linearized versions of the Langmuir equa...
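The linearized route mentioned in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the article's spreadsheet method: it assumes the usual Langmuir form q = Qmax*K*C/(1 + K*C), uses one common linearization (C/q regressed on C), and works on synthetic, noise-free data. The names `langmuir` and `linear_fit` and all numeric values are hypothetical.

```python
def langmuir(c, q_max, k):
    """Sorbed amount q at solution concentration c (assumed Langmuir form)."""
    return q_max * k * c / (1.0 + k * c)


def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic data generated with Qmax = 100 and K = 0.2 (illustration only).
conc = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
sorbed = [langmuir(c, 100.0, 0.2) for c in conc]

# Linearization: C/q = 1/(Qmax*K) + C/Qmax, so slope = 1/Qmax and
# intercept = 1/(Qmax*K).
slope, intercept = linear_fit(conc, [c / q for c, q in zip(conc, sorbed)])
q_max_hat = 1.0 / slope
k_hat = slope / intercept

print("Qmax estimate:", q_max_hat)
print("K estimate:", k_hat)
```

On noise-free data the linearized fit recovers the generating parameters. With measurement error the transformation changes the error structure, so linearized and nonlinear fits will generally give different estimates.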
ERIC Educational Resources Information Center
Scheidt, Douglas M.
1995-01-01
Reviews three functions of the "Scientist" software package useful for the social sciences: nonlinear curve fitting, parameter estimation, and data/regression plotting. Social scientists are likely to find limitations and unfamiliar procedures in "Scientist". Its value lies in its visual presentation of data and regression curves and the…
Daniels, Bryan C.; Nemenman, Ilya
2015-01-01
The nonlinearity of dynamics in systems biology makes it hard to infer them from experimental data. Simple linear models are computationally efficient, but cannot incorporate these important nonlinearities. An adaptive method based on the S-system formalism, which is a sensible representation of nonlinear mass-action kinetics typically found in cellular dynamics, maintains the efficiency of linear regression. We combine this approach with adaptive model selection to obtain efficient and parsimonious representations of cellular dynamics. The approach is tested by inferring the dynamics of yeast glycolysis from simulated data. With little computing time, it produces dynamical models with high predictive power and with structural complexity adapted to the difficulty of the inference problem. PMID:25806510
ERIC Educational Resources Information Center
Bloom, Allan M.; And Others
In response to the increasing importance of student performance in required classes, research was conducted to compare two prediction procedures, linear modeling using multiple regression and nonlinear modeling using AID3. Performance in the first college math course (College Mathematics, Calculus, or Business Calculus Matrices) was the dependent…
2014-01-01
Background: Support vector regression (SVR) and Gaussian process regression (GPR) were used for the analysis of electroanalytical experimental data to estimate diffusion coefficients. Results: For simulated cyclic voltammograms based on the EC, Eqr, and EqrC mechanisms these regression algorithms in combination with nonlinear kernel/covariance functions yielded diffusion coefficients with higher accuracy as compared to the standard approach of calculating diffusion coefficients relying on the Nicholson-Shain equation. The level of accuracy achieved by SVR and GPR is virtually independent of the rate constants governing the respective reaction steps. Further, the reduction of high-dimensional voltammetric signals by manual selection of typical voltammetric peak features decreased the performance of both regression algorithms compared to a reduction by downsampling or principal component analysis. After training on simulated data sets, diffusion coefficients were estimated by the regression algorithms for experimental data comprising voltammetric signals for three organometallic complexes. Conclusions: Estimated diffusion coefficients closely matched the values determined by the parameter fitting method, but reduced the required computational time considerably for one of the reaction mechanisms. The automated processing of voltammograms according to the regression algorithms yields better results than the conventional analysis of peak-related data. PMID:24987463
Wang, X-D; Qi, Y-X; Jiang, Z-L
2011-03-01
Many methods have been developed for inferring transcriptional networks from gene expression. However, it is still necessary to design new methods that disclose more detailed and exact network information. Using network-assisted regression, the authors combined the averaged three-way mutual information (AMI3) and a non-linear ordinary differential equation (ODE) model to infer the transcriptional network, and to obtain both the topological structure and the regulatory dynamics. Synthetic and experimental data were used to evaluate the performance of the above approach. In comparison with the previous methods based on mutual information, AMI3 obtained higher precision with the same sensitivity. To describe the regulatory dynamics between transcription factors and target genes, network-assisted regression and regression without network, respectively, were applied to the steady-state and time series microarray data. The results revealed that, compared with regression without network, network-assisted regression increased the precision but decreased the goodness of fit. Then, the authors reconstructed the transcriptional network of Escherichia coli and simulated the regulatory dynamics of genes. Furthermore, the authors' approach identified potential transcription factors regulating the yeast cell cycle. In conclusion, network-assisted regression, combining AMI3 and the ODE model, was a more precise way to infer the topological structure and the regulatory dynamics of a transcriptional network from microarray data. [Includes supplementary material]. PMID:21405197
Hierarchical regression for analyses of multiple outcomes.
Richardson, David B; Hamra, Ghassan B; MacLehose, Richard F; Cole, Stephen R; Chu, Haitao
2015-09-01
In cohort mortality studies, there often is interest in associations between an exposure of primary interest and mortality due to a range of different causes. A standard approach to such analyses involves fitting a separate regression model for each type of outcome. However, the statistical precision of some estimated associations may be poor because of sparse data. In this paper, we describe a hierarchical regression model for estimation of parameters describing outcome-specific relative rate functions and associated credible intervals. The proposed model uses background stratification to provide flexible control for the outcome-specific associations of potential confounders, and it employs a hierarchical "shrinkage" approach to stabilize estimates of an exposure's associations with mortality due to different causes of death. The approach is illustrated in analyses of cancer mortality in 2 cohorts: a cohort of dioxin-exposed US chemical workers and a cohort of radiation-exposed Japanese atomic bomb survivors. Compared with standard regression estimates of associations, hierarchical regression yielded estimates with improved precision that tended to have less extreme values. The hierarchical regression approach also allowed the fitting of models with effect-measure modification. The proposed hierarchical approach can yield estimates of association that are more precise than conventional estimates when one wishes to estimate associations with multiple outcomes. PMID:26232395
Regression models for estimating coseismic landslide displacement
Jibson, R.W.
2007-01-01
Newmark's sliding-block model is widely used to estimate coseismic slope performance. Early efforts to develop simple regression models to estimate Newmark displacement were based on analysis of the small number of strong-motion records then available. The current availability of a much larger set of strong-motion records dictates that these regression equations be updated. Regression equations were generated using data derived from a collection of 2270 strong-motion records from 30 worldwide earthquakes. The regression equations predict Newmark displacement in terms of (1) critical acceleration ratio, (2) critical acceleration ratio and earthquake magnitude, (3) Arias intensity and critical acceleration, and (4) Arias intensity and critical acceleration ratio. These equations are well constrained and fit the data well (71% < R^2 < 88%), but they have standard deviations of about 0.5 log units, such that the range defined by the mean ± one standard deviation spans about an order of magnitude. These regression models, therefore, are not recommended for use in site-specific design, but rather for regional-scale seismic landslide hazard mapping or for rapid preliminary screening of sites. © 2007 Elsevier B.V. All rights reserved.
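The "order of magnitude" statement follows from simple arithmetic, assuming the 0.5 log units refer to base-10 logarithms of displacement (the abstract does not state the base, so this is an assumption). Writing $\mu$ for the predicted mean of $\log_{10} D$ and $\sigma = 0.5$ for the standard deviation:

```latex
\frac{D_{\text{upper}}}{D_{\text{lower}}}
  = \frac{10^{\mu + \sigma}}{10^{\mu - \sigma}}
  = 10^{2\sigma}
  = 10^{2 \times 0.5}
  = 10
```

Under that assumption, a predicted displacement of 10 cm would correspond to a one-standard-deviation range of roughly 3 cm to 32 cm.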
Mental chronometry with simple linear regression.
Chen, J Y
1997-10-01
Typically, mental chronometry is performed by means of introducing an independent variable postulated to affect selectively some stage of a presumed multistage process. However, the effect could be a global one that spreads proportionally over all stages of the process. Currently, there is no method to test this possibility although simple linear regression might serve the purpose. In the present study, the regression approach was tested with tasks (memory scanning and mental rotation) that involved a selective effect and with a task (word superiority effect) that involved a global effect, by the dominant theories. The results indicate (1) the manipulation of the size of a memory set or of angular disparity affects the intercept of the regression function that relates the times for memory scanning with different set sizes or for mental rotation with different angular disparities and (2) the manipulation of context affects the slope of the regression function that relates the times for detecting a target character under word and nonword conditions. These ratify the regression approach as a useful method for doing mental chronometry. PMID:9347535
Regression analysis in interlaboratory surveys: a case study with cholesterol and triglycerides.
Munster, D J; Lever, M; Walmsley, T A
1978-10-01
1. A new interlaboratory survey design, that uses regression analysis to compare results from each laboratory with target values, was tested using cholesterol and triglyceride analyses. The fifty New Zealand laboratories involved showed considerable interlaboratory variation (CV = 8% to 27% for cholesterol, 13% to 113% for triglycerides), 30% and 40% of which was associated with systematic differences between laboratories. 2. End-of-period summaries using regression analysis confirmed the presence of systematic errors. These were either simple types caused apparently by incorrect standardisation (regression slope, B not equal to 1.0) or inappropriate blank correction (intercept, A not equal to zero) or complex types presumably due to nonlinearity or nonspecificity. Graphical display of results from each laboratory aided fault diagnosis and allowed the detection of between-run standardisation differences. 3. Method comparison studies were made: the only highly significant result being lower precision achieved by enzymatic cholesterol methods compared with other colorimetric methods. PMID:729161
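The two simple error types described above (slope B not equal to 1.0, intercept A not equal to zero) can be illustrated by regressing a laboratory's results on target values. The sketch below is a hedged illustration with made-up numbers, not the survey's data or software; `linear_fit` and the two hypothetical laboratories are assumptions introduced here.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = A + B*x; returns (B, A)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical target values for four survey samples (arbitrary units).
target = [2.0, 4.0, 6.0, 8.0]

# Lab 1: proportional error, as might follow from incorrect standardisation.
lab_1 = [1.1 * t for t in target]
# Lab 2: constant offset, as might follow from inappropriate blank correction.
lab_2 = [t + 0.5 for t in target]

slope_1, intercept_1 = linear_fit(target, lab_1)
slope_2, intercept_2 = linear_fit(target, lab_2)

print("Lab 1: B =", slope_1, " A =", intercept_1)  # B differs from 1, A near 0
print("Lab 2: B =", slope_2, " A =", intercept_2)  # B near 1, A differs from 0
```

Real survey results would include random scatter, so B and A would need to be compared with 1.0 and zero using their standard errors rather than by inspection.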
Fatigue design of a cellular phone folder using regression model-based multi-objective optimization
NASA Astrophysics Data System (ADS)
Kim, Young Gyun; Lee, Jongsoo
2016-08-01
In a folding cellular phone, the folding device is repeatedly opened and closed by the user, which eventually results in fatigue damage, particularly to the front of the folder. Hence, it is important to improve the safety and endurance of the folder while also reducing its weight. This article presents an optimal design for the folder front that maximizes its fatigue endurance while minimizing its thickness. Design data for analysis and optimization were obtained experimentally using a test jig. Multi-objective optimization was carried out using a nonlinear regression model. Three regression methods were employed: back-propagation neural networks, logistic regression and support vector machines. The AdaBoost ensemble technique was also used to improve the approximation. Two-objective Pareto-optimal solutions were identified using the non-dominated sorting genetic algorithm (NSGA-II). Finally, a numerically optimized solution was validated against experimental product data, in terms of both fatigue endurance and thickness index.
NASA Astrophysics Data System (ADS)
Lu, Dan; Ye, Ming; Hill, Mary C.
2012-09-01
Confidence intervals based on classical regression theories augmented to include prior information and credible intervals based on Bayesian theories are conceptually different ways to quantify parametric and predictive uncertainties. Because both confidence and credible intervals are used in environmental modeling, we seek to understand their differences and similarities. This is of interest in part because calculating confidence intervals typically requires tens to thousands of model runs, while Bayesian credible intervals typically require tens of thousands to millions of model runs. Given multi-Gaussian distributed observation errors, our theoretical analysis shows that, for linear or linearized-nonlinear models, confidence and credible intervals are always numerically identical when consistent prior information is used. For nonlinear models, nonlinear confidence and credible intervals can be numerically identical if parameter confidence regions defined using the approximate likelihood method and parameter credible regions estimated using Markov chain Monte Carlo realizations are numerically identical and predictions are a smooth, monotonic function of the parameters. Both occur if intrinsic model nonlinearity is small. While the conditions of Gaussian errors and small intrinsic model nonlinearity are violated by many environmental models, heuristic tests using analytical and numerical models suggest that linear and nonlinear confidence intervals can be useful approximations of uncertainty even under significantly nonideal conditions. In the context of epistemic model error for a complex synthetic nonlinear groundwater problem, the linear and nonlinear confidence and credible intervals for individual models performed similarly enough to indicate that the computationally frugal confidence intervals can be useful in many circumstances. Experiences with these groundwater models are expected to be broadly applicable to many environmental models. We suggest that for
Uncertainty quantification in DIC with Kriging regression
NASA Astrophysics Data System (ADS)
Wang, Dezhi; DiazDelaO, F. A.; Wang, Weizhuo; Lin, Xiaoshan; Patterson, Eann A.; Mottershead, John E.
2016-03-01
A Kriging regression model is developed as a post-processing technique for the treatment of measurement uncertainty in classical subset-based Digital Image Correlation (DIC). Regression is achieved by regularising the sample-point correlation matrix using a local, subset-based, assessment of the measurement error with assumed statistical normality and based on the Sum of Squared Differences (SSD) criterion. This leads to a Kriging-regression model in the form of a Gaussian process representing uncertainty on the Kriging estimate of the measured displacement field. The method is demonstrated using numerical and experimental examples. Kriging estimates of displacement fields are shown to be in excellent agreement with 'true' values for the numerical cases and in the experimental example uncertainty quantification is carried out using the Gaussian random process that forms part of the Kriging model. The root mean square error (RMSE) on the estimated displacements is produced and standard deviations on local strain estimates are determined.
Efficient Regressions via Optimally Combining Quantile Information
Zhao, Zhibiao; Xiao, Zhijie
2014-01-01
We develop a generally applicable framework for constructing efficient estimators of regression models via quantile regressions. The proposed method is based on optimally combining information over multiple quantiles and can be applied to a broad range of parametric and nonparametric settings. When combining information over a fixed number of quantiles, we derive an upper bound on the distance between the efficiency of the proposed estimator and the Fisher information. As the number of quantiles increases, this upper bound decreases and the asymptotic variance of the proposed estimator approaches the Cramér-Rao lower bound under appropriate conditions. In the case of non-regular statistical estimation, the proposed estimator leads to super-efficient estimation. We illustrate the proposed method for several widely used regression models. Both asymptotic theory and Monte Carlo experiments show the superior performance over existing methods. PMID:25484481
A tutorial on Bayesian Normal linear regression
NASA Astrophysics Data System (ADS)
Klauenberg, Katy; Wübbeler, Gerd; Mickan, Bodo; Harris, Peter; Elster, Clemens
2015-12-01
Regression is a common task in metrology and often applied to calibrate instruments, evaluate inter-laboratory comparisons or determine fundamental constants, for example. Yet, a regression model cannot be uniquely formulated as a measurement function, and consequently the Guide to the Expression of Uncertainty in Measurement (GUM) and its supplements are not applicable directly. Bayesian inference, however, is well suited to regression tasks, and has the advantage of accounting for additional a priori information, which typically robustifies analyses. Furthermore, it is anticipated that future revisions of the GUM shall also embrace the Bayesian view. Guidance on Bayesian inference for regression tasks is largely lacking in metrology. For linear regression models with Gaussian measurement errors this tutorial gives explicit guidance. Divided into three steps, the tutorial first illustrates how a priori knowledge, which is available from previous experiments, can be translated into prior distributions from a specific class. These prior distributions have the advantage of yielding analytical, closed form results, thus avoiding the need to apply numerical methods such as Markov Chain Monte Carlo. Secondly, formulas for the posterior results are given, explained and illustrated, and software implementations are provided. In the third step, Bayesian tools are used to assess the assumptions behind the suggested approach. These three steps (prior elicitation, posterior calculation, and robustness to prior uncertainty and model adequacy) are critical to Bayesian inference. The general guidance given here for Normal linear regression tasks is accompanied by a simple, but real-world, metrological example. The calibration of a flow device serves as a running example and illustrates the three steps. It is shown that prior knowledge from previous calibrations of the same sonic nozzle enables robust predictions even for extrapolations.
Friction and nonlinear dynamics
NASA Astrophysics Data System (ADS)
Manini, N.; Braun, O. M.; Tosatti, E.; Guerra, R.; Vanossi, A.
2016-07-01
The nonlinear dynamics associated with sliding friction forms a broad interdisciplinary research field that involves complex dynamical processes and patterns covering a broad range of time and length scales. Progress in experimental techniques and computational resources has stimulated the development of more refined and accurate mathematical and numerical models, capable of capturing many of the essentially nonlinear phenomena involved in friction.
Friction and nonlinear dynamics.
Manini, N; Braun, O M; Tosatti, E; Guerra, R; Vanossi, A
2016-07-27
The nonlinear dynamics associated with sliding friction forms a broad interdisciplinary research field that involves complex dynamical processes and patterns covering a broad range of time and length scales. Progress in experimental techniques and computational resources has stimulated the development of more refined and accurate mathematical and numerical models, capable of capturing many of the essentially nonlinear phenomena involved in friction. PMID:27249652
Nonlinear Optics and Applications
NASA Technical Reports Server (NTRS)
Abdeldayem, Hossin A. (Editor); Frazier, Donald O. (Editor)
2007-01-01
Nonlinear optics is the result of laser beam interaction with materials and started with the advent of lasers in the early 1960s. The field is growing daily and plays a major role in emerging photonic technology. Nonlinear optics plays a major role in many optical applications, such as optical signal processing, optical computers, ultrafast switches, ultra-short pulsed lasers, sensors, laser amplifiers, and many others. This special review volume on Nonlinear Optics and Applications is intended for those who want to be aware of the most recent technology. This book presents a survey of the recent advances of nonlinear optical applications. Emphasis will be on novel devices and materials, switching technology, optical computing, and important experimental results. Recent developments in topics which are of historical interest to researchers, and at the same time of potential use in the fields of all-optical communication and computing technologies, are also included. Additionally, a few new related topics which might provoke discussion are presented. The book includes chapters on nonlinear optics and applications; the nonlinear Schrödinger and associated equations that model spatio-temporal propagation; the supercontinuum light source; wideband ultrashort pulse fiber laser sources; lattice fabrication as well as their linear and nonlinear light guiding properties; the second-order EO effect (Pockels), the third-order (Kerr) and thermo-optical effects in optical waveguides and their applications in optical communication; and the effect of magnetic field and its role in nonlinear optics, among other chapters.
Spontaneous hypnotic age regression: case report.
Spiegel, D; Rosenfeld, A
1984-12-01
Age regression--reliving the past as though it were occurring in the present, with age appropriate vocabulary, mental content, and affect--can occur with instruction in highly hypnotizable individuals, but has rarely been reported to occur spontaneously, especially as a primary symptom. The psychiatric presentation and treatment of a 16-year-old girl with spontaneous age regressions accessible and controllable with hypnosis and psychotherapy are described. Areas of overlap and divergence between this patient's symptoms and those found in patients with hysterical fugue and multiple personality syndrome are also discussed. PMID:6501240
Spontaneous regression of a conjunctival naevus.
Haldar, Shreya; Leyland, Martin
2016-01-01
Conjunctival naevi are one of the most common lesions affecting the conjunctiva. While benign in the vast majority of cases, the risk of malignant transformation necessitates regular follow-up. They are well known to increase in size; however, we present the first photo-documented case of spontaneous regression of conjunctival naevus. In most cases, surgical excision is performed due to the clinician's concerns over malignancy. However, a substantial proportion of patients request excision. Highlighting the potential for regression of the lesion is important to ensure patients make an informed decision when contemplating such surgery. PMID:27581234
Heritability Estimation using Regression Models for Correlation
Lee, Hye-Seung; Paik, Myunghee Cho; Rundek, Tatjana; Sacco, Ralph L; Dong, Chuanhui; Krischer, Jeffrey P
2012-01-01
Heritability estimates a polygenic effect on a trait for a population. Reliable interpretation of heritability is critical in planning further genetic studies to locate a gene responsible for the trait. This study accommodates both single and multiple trait cases by employing regression models for the correlation parameter to infer the heritability. Sharing the properties of the regression approach, the proposed methods are flexible enough to incorporate non-genetic and/or non-additive genetic information in the analysis. The performance of the proposed model is compared with that of the likelihood approach through simulations and an analysis of carotid intima-media thickness from the Northern Manhattan Family Study. PMID:22457844
Salience Assignment for Multiple-Instance Regression
NASA Technical Reports Server (NTRS)
Wagstaff, Kiri L.; Lane, Terran
2007-01-01
We present a Multiple-Instance Learning (MIL) algorithm for determining the salience of each item in each bag with respect to the bag's real-valued label. We use an alternating-projections constrained optimization approach to simultaneously learn a regression model and estimate all salience values. We evaluate this algorithm on a significant real-world problem, crop yield modeling, and demonstrate that it provides more extensive, intuitive, and stable salience models than Primary-Instance Regression, which selects a single relevant item from each bag.
Topics in route-regression analysis
Geissler, P.H.; Sauer, J.R.
1990-01-01
The route-regression method has been used in recent years to analyze data from roadside surveys. With this method, a population trend is estimated for each route in a region, then regional trends are estimated as a weighted mean of the individual route trends. This method can accurately incorporate data that is unbalanced by changes in years surveyed and observer differences. We suggest that route-regression methodology is most efficient in the estimation of long-term (>5 year) trends, and tends to provide conservative results for low-density species.
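The two stages named in this abstract (a trend per route, then a weighted mean across routes) can be sketched as below. This is an illustration under stated assumptions, not the authors' estimator: the log-linear trend model, the 0.5 offset added to counts, and the choice of weights are all hypothetical.

```python
import numpy as np

def route_trend(years, counts):
    """Slope of log(count + 0.5) regressed on year for a single route.

    The log-linear form and the 0.5 offset are assumptions of this sketch.
    """
    years = np.asarray(years, dtype=float)
    log_counts = np.log(np.asarray(counts, dtype=float) + 0.5)
    slope, _intercept = np.polyfit(years, log_counts, 1)
    return slope

def regional_trend(route_slopes, weights):
    """Weighted mean of the individual route trends."""
    return np.average(route_slopes, weights=weights)

# Hypothetical survey data: two routes, one increasing, one decreasing
years = np.arange(1980, 1990)
route_a = 10.0 * np.exp(0.10 * (years - 1980)) - 0.5
route_b = 20.0 * np.exp(-0.10 * (years - 1980)) - 0.5
slopes = [route_trend(years, route_a), route_trend(years, route_b)]
print(regional_trend(slopes, weights=[3.0, 1.0]))
```

Routes surveyed in different sets of years can each be fitted on their own data, which is one way unbalanced surveys can be accommodated.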
Liu, Zhan-yu; Huang, Jing-feng; Shi, Jing-jing; Tao, Rong-xiang; Zhou, Wan; Zhang, Li-Li
2007-10-01
Detecting plant health conditions plays a key role in farm pest management and crop protection. In this study, measurement of hyperspectral leaf reflectance in rice crop (Oryza sativa L.) was conducted on groups of healthy leaves and leaves infected by the fungus Bipolaris oryzae (Helminthosporium oryzae Breda de Haan) through the wavelength range from 350 to 2,500 nm. The percentage of leaf surface lesions was estimated and defined as the disease severity. Statistical methods like multiple stepwise regression, principal component analysis and partial least-square regression were utilized to calculate and estimate the disease severity of rice brown spot at the leaf level. Our results revealed that multiple stepwise linear regressions could efficiently estimate disease severity with three wavebands in seven steps. The root mean square errors (RMSEs) for the training (n=210) and testing (n=53) datasets were 6.5% and 5.8%, respectively. Principal component analysis showed that the first principal component could explain approximately 80% of the variance of the original hyperspectral reflectance. The regression model with the first two principal components predicted disease severity with RMSEs of 16.3% and 13.9% for the training and testing datasets, respectively. Partial least-square regression with seven extracted factors predicted disease severity most effectively among the statistical methods compared, with RMSEs of 4.1% and 2.0% for the training and testing datasets, respectively. Our research demonstrates that it is feasible to estimate the disease severity of rice brown spot using hyperspectral reflectance data at the leaf level. PMID:17910117
Demonstration of a Fiber Optic Regression Probe
NASA Technical Reports Server (NTRS)
Korman, Valentin; Polzin, Kurt A.
2010-01-01
The capability to provide localized, real-time monitoring of material regression rates in various applications has the potential to provide a new stream of data for development testing of various components and systems, as well as serving as a monitoring tool in flight applications. These applications include, but are not limited to, the regression of a combusting solid fuel surface, the ablation of the throat in a chemical rocket or the heat shield of an aeroshell, and the monitoring of erosion in long-life plasma thrusters. The rate of regression in the first application is very fast, while the second and third are increasingly slower. A recent fundamental sensor development effort has led to a novel regression, erosion, and ablation sensor technology (REAST). The REAST sensor allows for measurement of real-time surface erosion rates at a discrete surface location. The sensor is optical, using two different, co-located fiber-optics to perform the regression measurement. The disparate optical transmission properties of the two fiber-optics make it possible to measure the regression rate by monitoring the relative light attenuation through the fibers. As the fibers regress along with the parent material in which they are embedded, the relative light intensities through the two fibers change, providing a measure of the regression rate. The optical nature of the system makes it relatively easy to use in a variety of harsh, high temperature environments, and it is also unaffected by the presence of electric and magnetic fields. In addition, the sensor could be used to perform optical spectroscopy on the light emitted by a process and collected by fibers, giving localized measurements of various properties. The capability to perform an in-situ measurement of material regression rates is useful in addressing a variety of physical issues in various applications. An in-situ measurement allows for real-time data regarding the erosion rates, providing a quick method for
Influence of storm magnitude and watershed size on runoff nonlinearity
NASA Astrophysics Data System (ADS)
Lee, Kwan Tun; Huang, Jen-Kuo
2016-06-01
The inherent nonlinear characteristics of the watershed runoff process related to storm magnitude and watershed size are discussed in detail in this study. The first type of nonlinearity refers to the rainfall-runoff dynamic process, and the second type concerns a power-law relation between peak discharge and upstream drainage area. The dynamic nonlinearity induced by storm magnitude was first demonstrated by inspecting rainfall-runoff records at three watersheds in Taiwan. Then the derivation of the watershed unit hydrograph (UH) using two linear hydrological models shows that the peak discharge and time to peak discharge that characterize the shape of the UH vary from event to event. Hence, the intention of deriving a unique and universal UH for all rainfall-runoff simulation cases is questionable. In contrast, the UHs from the other two adopted nonlinear hydrological models were responsive to rainfall intensity without relying on the linear proportion principle, and are excellent in representing dynamic nonlinearity. Based on two-segment regression, the scaling nonlinearity between peak discharge and drainage area was investigated by analyzing the variation of the power-law exponent. The results demonstrate that the scaling nonlinearity is particularly significant for a watershed with a larger area subjected to a small storm. For the three study watersheds, a large tributary that contributes a relatively large drainage area or inflow is found to cause a transition break in the scaling relationship and to convert the scaling relationship from linearity to nonlinearity.
Nonlinear GARCH model and 1 / f noise
NASA Astrophysics Data System (ADS)
Kononovicius, A.; Ruseckas, J.
2015-06-01
Auto-regressive conditionally heteroskedastic (ARCH) family models are still used by practitioners in business and economic policy making as conditional volatility forecasting models. Furthermore, ARCH models still attract the interest of researchers. In this contribution we consider the well-known GARCH(1,1) process and its nonlinear modifications, reminiscent of the NGARCH model. We investigate the possibility of reproducing power law statistics, namely the probability density function and power spectral density, using ARCH family models. For this purpose we derive stochastic differential equations from the GARCH processes under consideration. We find the obtained equations to be similar to a general class of stochastic differential equations known to reproduce power law statistics. We show that the linear GARCH(1,1) process has a power law distribution, but its power spectral density is Brownian noise-like. However, the nonlinear modifications exhibit both a power law distribution and a power spectral density of the 1/f^β form, including 1/f noise.
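For readers unfamiliar with the linear GARCH(1,1) process mentioned here, a minimal simulation sketch follows. It uses the textbook recursion for the conditional variance with Gaussian innovations; the parameter names `omega`, `alpha`, `beta` and their values are illustrative assumptions, and the nonlinear modifications studied in the paper are not reproduced.

```python
import numpy as np

def simulate_garch11(n, omega, alpha, beta, seed=0):
    """Simulate a linear GARCH(1,1) series.

    sigma2[t] = omega + alpha * x[t-1]**2 + beta * sigma2[t-1]
    x[t]      = sqrt(sigma2[t]) * z[t],  z[t] ~ N(0, 1)
    Requires alpha + beta < 1 so that the stationary variance exists.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    sigma2 = np.empty(n)
    x = np.empty(n)
    sigma2[0] = omega / (1.0 - alpha - beta)   # start at the stationary variance
    x[0] = np.sqrt(sigma2[0]) * z[0]
    for t in range(1, n):
        sigma2[t] = omega + alpha * x[t - 1] ** 2 + beta * sigma2[t - 1]
        x[t] = np.sqrt(sigma2[t]) * z[t]
    return x, sigma2

# Illustrative parameter values only
x, sigma2 = simulate_garch11(1000, omega=0.1, alpha=0.1, beta=0.8)
print(x[:5], sigma2[:5])
```

The simulated `x` could then be passed to a histogram or a periodogram to inspect the distribution and spectral density that the abstract discusses.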
NASA Astrophysics Data System (ADS)
Lin, Yingzhi; Deng, Xiangzheng; Li, Xing; Ma, Enjun
2014-12-01
Spatially explicit simulation of land use change is the basis for estimating the effects of land use and cover change on energy fluxes, ecology and the environment. At the pixel level, logistic regression is one of the most common approaches used in spatially explicit land use allocation models to determine the relationship between land use and its causal factors in driving land use change, and thereby to evaluate land use suitability. However, these models have a drawback in that they do not determine/allocate land use based on the direct relationship between land use change and its driving factors. Consequently, a multinomial logistic regression method was introduced to address this flaw, and thereby, judge the suitability of a type of land use in any given pixel in a case study area of the Jiangxi Province, China. A comparison of the two regression methods indicated that the proportion of correctly allocated pixels using multinomial logistic regression was 92.98%, which was 8.47% higher than that obtained using logistic regression. Paired t-test results also showed that pixels were more clearly distinguished by multinomial logistic regression than by logistic regression. In conclusion, multinomial logistic regression is a more efficient and accurate method for the spatial allocation of land use changes. The application of this method in future land use change studies may improve the accuracy of predicting the effects of land use and cover change on energy fluxes, ecology, and environment.
ERIC Educational Resources Information Center
Williams, John D.; Lindem, Alfred C.
Four computer programs using the general purpose multiple linear regression program have been developed. Setwise regression analysis is a stepwise procedure for sets of variables; there will be as many steps as there are sets. Covarmlt allows a solution to the analysis of covariance design with multiple covariates. A third program has three…
Bayesian nonparametric regression with varying residual density.
Pati, Debdeep; Dunson, David B
2014-02-01
We consider the problem of robust Bayesian inference on the mean regression function allowing the residual density to change flexibly with predictors. The proposed class of models is based on a Gaussian process prior for the mean regression function and mixtures of Gaussians for the collection of residual densities indexed by predictors. Initially considering the homoscedastic case, we propose priors for the residual density based on probit stick-breaking (PSB) scale mixtures and symmetrized PSB (sPSB) location-scale mixtures. Both priors restrict the residual density to be symmetric about zero, with the sPSB prior more flexible in allowing multimodal densities. We provide sufficient conditions to ensure strong posterior consistency in estimating the regression function under the sPSB prior, generalizing existing theory focused on parametric residual distributions. The PSB and sPSB priors are generalized to allow residual densities to change nonparametrically with predictors through incorporating Gaussian processes in the stick-breaking components. This leads to a robust Bayesian regression procedure that automatically down-weights outliers and influential observations in a locally-adaptive manner. Posterior computation relies on an efficient data augmentation exact block Gibbs sampler. The methods are illustrated using simulated and real data applications. PMID:24465053
Using Regression Analysis: A Guided Tour.
ERIC Educational Resources Information Center
Shelton, Fred Ames
1987-01-01
Discusses the use and interpretation of multiple regression analysis with computer programs and presents a flow chart of the process. A general explanation of the flow chart is provided, followed by an example showing the development of a linear equation which could be used in estimating manufacturing overhead cost. (Author/LRW)
Assumptions of Multiple Regression: Correcting Two Misconceptions
ERIC Educational Resources Information Center
Williams, Matt N.; Gomez Grajales, Carlos Alberto; Kurkiewicz, Dason
2013-01-01
In 2002, an article entitled "Four assumptions of multiple regression that researchers should always test" by Osborne and Waters was published in "PARE." This article has gone on to be viewed more than 275,000 times (as of August 2013), and it is one of the first results displayed in a Google search for "regression…
Revisiting Regression in Autism: Heller's "Dementia Infantilis"
ERIC Educational Resources Information Center
Westphal, Alexander; Schelinski, Stefanie; Volkmar, Fred; Pelphrey, Kevin
2013-01-01
Theodor Heller first described a severe regression of adaptive function in normally developing children, something he termed dementia infantilis, over 100 years ago. Dementia infantilis is most closely related to the modern diagnosis, childhood disintegrative disorder. We translate Heller's paper, Über Dementia Infantilis, and discuss…
Assessing risk factors for periodontitis using regression
NASA Astrophysics Data System (ADS)
Lobo Pereira, J. A.; Ferreira, Maria Cristina; Oliveira, Teresa
2013-10-01
Multivariate statistical analysis is indispensable to assess the associations and interactions between different factors and the risk of periodontitis. Among others, regression analysis is a statistical technique widely used in healthcare to investigate and model the relationship between variables. In our work we study the impact of socio-demographic, medical and behavioral factors on periodontal health. Using linear and logistic regression models, we can assess the relevance, as risk factors for periodontitis disease, of the following independent variables (IVs): Age, Gender, Diabetic Status, Education, Smoking status and Plaque Index. The multiple linear regression analysis model was built to evaluate the influence of IVs on mean Attachment Loss (AL). Thus, the regression coefficients will be obtained along with the respective p-values from the significance tests. The classification of a case (individual) adopted in the logistic model was the extent of the destruction of periodontal tissues defined by an Attachment Loss greater than or equal to 4 mm in 25% (AL≥4mm/≥25%) of sites surveyed. The association measures include the Odds Ratios together with the corresponding 95% confidence intervals.
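The odds ratios and 95% confidence intervals mentioned at the end of this abstract are commonly derived from a fitted logistic coefficient and its standard error. The sketch below assumes a Wald-type interval on the log-odds scale; the function name and the example numbers are hypothetical and do not come from the study.

```python
import math

def odds_ratio_ci(coef, se, z=1.96):
    """Odds ratio and Wald confidence interval from a logistic coefficient.

    coef : estimated log-odds coefficient
    se   : its standard error
    z    : normal quantile (1.96 gives an approximate 95% interval)
    """
    return (math.exp(coef),
            math.exp(coef - z * se),
            math.exp(coef + z * se))

# Hypothetical coefficient for a binary risk factor
odds_ratio, lower, upper = odds_ratio_ci(coef=0.8, se=0.3)
print(f"OR = {odds_ratio:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

An interval that excludes 1 corresponds to a coefficient that differs from zero at roughly the 5% level under the same Wald approximation.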
Nodular fasciitis with degeneration and regression.
Yanagisawa, Akihiro; Okada, Hideki
2008-07-01
Nodular fasciitis is a benign reactive proliferation that is frequently misdiagnosed as a sarcoma. This article describes a case of nodular fasciitis of 6-month duration located in the cheek, which degenerated and spontaneously regressed after biopsy. The nodule was fixed to the zygoma but was free from the overlying skin. The mass was 3.0 cm in diameter and demonstrated high signal intensity on T2-weighted magnetic resonance imaging. A small part of the lesion was biopsied. Pathological and immunohistochemical examinations identified the nodule as nodular fasciitis with myxoid histology. One month after the biopsy, the mass showed decreased signal intensity on T2-weighted images and measured 2.2 cm in size. The signal on T2-weighted images showed time-dependent decreases, and the mass continued to reduce in size throughout the follow-up period. The lesion presented as hypointense to the surrounding muscles on T2-weighted images and was 0.4 cm in size at 2 years of follow-up. This case demonstrates that nodular fasciitis with myxoid histology can change to that with fibrous appearance gradually with time, thus bringing about spontaneous regression. Degeneration may be involved in the spontaneous regression of nodular fasciitis with myxoid appearance. The mechanism of regression, unclarified at present, should be further studied. PMID:18650753
Bootstrap inference longitudinal semiparametric regression model
NASA Astrophysics Data System (ADS)
Pane, Rahmawati; Otok, Bambang Widjanarko; Zain, Ismaini; Budiantara, I. Nyoman
2016-02-01
Semiparametric regression contains two components, i.e. a parametric and a nonparametric component. The semiparametric regression model is represented by $y_{ti} = \mu(\tilde{x}'_{ti}, z_{ti}) + \varepsilon_{ti}$, where $\mu(\tilde{x}'_{ti}, z_{ti}) = \tilde{x}'_{ti}\tilde{\beta} + g(z_{ti})$ and $y_{ti}$ is the response variable. It is assumed to have a linear relationship with the predictor variables $\tilde{x}'_{ti} = (x_{1i1}, x_{2i2}, \ldots, x_{Tir})$. The random error $\varepsilon_{ti}$, $i = 1, \ldots, n$, $t = 1, \ldots, T$, is normally distributed with zero mean and variance $\sigma^2$, and $g(z_{ti})$ is a nonparametric component. The results of this study showed that the PLS approach on longitudinal semiparametric regression models obtains the estimators $\hat{\tilde{\beta}}_t = [X'H(\lambda)X]^{-1}X'H(\lambda)\tilde{y}$ and $\hat{\tilde{g}}_\lambda(z) = M(\lambda)\tilde{y}$. The results also show that the bootstrap was valid on the longitudinal semiparametric regression model with $\hat{g}_\lambda^{(b)}(z)$ as the nonparametric component estimator.
Prediction of dynamical systems by symbolic regression
NASA Astrophysics Data System (ADS)
Quade, Markus; Abel, Markus; Shafi, Kamran; Niven, Robert K.; Noack, Bernd R.
2016-07-01
We study the modeling and prediction of dynamical systems based on conventional models derived from measurements. Such algorithms are highly desirable in situations where the underlying dynamics are hard to model from physical principles or simplified models need to be found. We focus on symbolic regression methods as a part of machine learning. These algorithms are capable of learning an analytically tractable model from data, a highly valuable property. Symbolic regression methods can be considered as generalized regression methods. We investigate two particular algorithms, the so-called fast function extraction which is a generalized linear regression algorithm, and genetic programming which is a very general method. Both are able to combine functions in a certain way such that a good model for the prediction of the temporal evolution of a dynamical system can be identified. We illustrate the algorithms by finding a prediction for the evolution of a harmonic oscillator based on measurements, by detecting an arriving front in an excitable system, and as a real-world application, the prediction of solar power production based on energy production observations at a given site together with the weather forecast.
A Constrained Linear Estimator for Multiple Regression
ERIC Educational Resources Information Center
Davis-Stober, Clintin P.; Dana, Jason; Budescu, David V.
2010-01-01
"Improper linear models" (see Dawes, Am. Psychol. 34:571-582, "1979"), such as equal weighting, have garnered interest as alternatives to standard regression models. We analyze the general circumstances under which these models perform well by recasting a class of "improper" linear models as "proper" statistical models with a single predictor. We…
Prediction of dynamical systems by symbolic regression.
Quade, Markus; Abel, Markus; Shafi, Kamran; Niven, Robert K; Noack, Bernd R
2016-07-01
We study the modeling and prediction of dynamical systems based on conventional models derived from measurements. Such algorithms are highly desirable in situations where the underlying dynamics are hard to model from physical principles or simplified models need to be found. We focus on symbolic regression methods as a part of machine learning. These algorithms are capable of learning an analytically tractable model from data, a highly valuable property. Symbolic regression methods can be considered as generalized regression methods. We investigate two particular algorithms, the so-called fast function extraction which is a generalized linear regression algorithm, and genetic programming which is a very general method. Both are able to combine functions in a certain way such that a good model for the prediction of the temporal evolution of a dynamical system can be identified. We illustrate the algorithms by finding a prediction for the evolution of a harmonic oscillator based on measurements, by detecting an arriving front in an excitable system, and as a real-world application, the prediction of solar power production based on energy production observations at a given site together with the weather forecast. PMID:27575130
Multiple Linear Regression: A Realistic Reflector.
ERIC Educational Resources Information Center
Nutt, A. T.; Batsell, R. R.
Examples of the use of Multiple Linear Regression (MLR) techniques are presented. This is done to show how MLR aids data processing and decision-making by providing the decision-maker with freedom in phrasing questions and by accurately reflecting the data on hand. A brief overview of the rationale underlying MLR is given, some basic definitions…
Commonality Analysis for the Regression Case.
ERIC Educational Resources Information Center
Murthy, Kavita
Commonality analysis is a procedure for decomposing the coefficient of determination (R superscript 2) in multiple regression analyses into the percent of variance in the dependent variable associated with each independent variable uniquely, and the proportion of explained variance associated with the common effects of predictors in various…
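For the simplest case of two predictors, the decomposition described here can be sketched as follows: the unique part of each predictor is the gain in R² when it is added last, and the common part is what remains. The helper names and the simulated data are assumptions made for illustration.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least squares fit of y on X (intercept added)."""
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def commonality_two_predictors(x1, x2, y):
    """Split the full-model R^2 into unique and common components."""
    r2_1 = r_squared(x1, y)
    r2_2 = r_squared(x2, y)
    r2_12 = r_squared(np.column_stack([x1, x2]), y)
    return {
        "unique_x1": r2_12 - r2_2,        # gain from adding x1 after x2
        "unique_x2": r2_12 - r2_1,        # gain from adding x2 after x1
        "common": r2_1 + r2_2 - r2_12,    # variance shared by both predictors
        "total": r2_12,
    }

# Hypothetical correlated predictors
rng = np.random.default_rng(1)
x1 = rng.normal(size=200)
x2 = 0.6 * x1 + rng.normal(size=200)
y = x1 + x2 + rng.normal(size=200)
parts = commonality_two_predictors(x1, x2, y)
print(parts)
```

The three components sum to the full-model R² by construction; with more predictors the number of common terms grows quickly, which this two-predictor sketch does not cover.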
A New Sample Size Formula for Regression.
ERIC Educational Resources Information Center
Brooks, Gordon P.; Barcikowski, Robert S.
The focus of this research was to determine the efficacy of a new method of selecting sample sizes for multiple linear regression. A Monte Carlo simulation was used to study both empirical predictive power rates and empirical statistical power rates of the new method and seven other methods: those of C. N. Park and A. L. Dudycha (1974); J. Cohen…
A Logistic Regression Model for Personnel Selection.
ERIC Educational Resources Information Center
Raju, Nambury S.; And Others
1991-01-01
A two-parameter logistic regression model for personnel selection is proposed. The model was tested with a database of 84,808 military enlistees. The probability of job success was related directly to trait levels, addressing such topics as selection, validity generalization, employee classification, selection bias, and utility-based fair…
Predicting Social Trust with Binary Logistic Regression
ERIC Educational Resources Information Center
Adwere-Boamah, Joseph; Hufstedler, Shirley
2015-01-01
This study used binary logistic regression to predict social trust with five demographic variables from a national sample of adult individuals who participated in The General Social Survey (GSS) in 2012. The five predictor variables were respondents' highest degree earned, race, sex, general happiness and the importance of personally assisting…
Climate Change Projections Using Regional Regression Models
NASA Astrophysics Data System (ADS)
Griffis, V. W.; Gyawali, R.; Watkins, D. W.
2012-12-01
A typical approach to project climate change impacts on water resources systems is to downscale general circulation model (GCM) or regional climate model (RCM) outputs as forcing data for a watershed model. With downscaled climate model outputs becoming readily available, multi-model ensemble approaches incorporating multiple GCMs, multiple emissions scenarios and multiple initializations are increasingly being used. While these multi-model climate ensembles represent a range of plausible futures, different hydrologic models and methods may complicate impact assessment. In particular, associated loss, flow routing, snowmelt and evapotranspiration computation methods can markedly increase hydrological modeling uncertainty. Other challenges include properly calibrating and verifying the watershed model and maintaining a consistent energy budget between climate and hydrologic models. An alternative approach, particularly appealing for ungauged basins or locations where record lengths are short, is to directly predict selected streamflow quantiles from regional regression equations that include physical basin characteristics as well as meteorological variables output by climate models (Fennessey 2011). Two sets of regional regression models are developed for the Great Lakes states using ordinary least squares and weighted least squares regression. The regional regression modeling approach is compared with physically based hydrologic modeling approaches for selected Great Lakes watersheds using downscaled outputs from the Coupled Model Intercomparison Project (CMIP3) as inputs to the Large Basin Runoff Model (LBRM) and the U.S. Army Corps Hydrologic Modeling System (HEC-HMS).
Evaluating Aptness of a Regression Model
ERIC Educational Resources Information Center
Matson, Jack E.; Huguenard, Brian R.
2007-01-01
The data for 104 software projects is used to develop a linear regression model that uses function points (a measure of software project size) to predict development effort. The data set is particularly interesting in that it violates several of the assumptions required of a linear model; but when the data are transformed, the data set satisfies…
A Skew-Normal Mixture Regression Model
ERIC Educational Resources Information Center
Liu, Min; Lin, Tsung-I
2014-01-01
A challenge associated with traditional mixture regression models (MRMs), which rest on the assumption of normally distributed errors, is determining the number of unobserved groups. Specifically, even slight deviations from normality can lead to the detection of spurious classes. The current work aims to (a) examine how sensitive the commonly…
A Spline Regression Model for Latent Variables
ERIC Educational Resources Information Center
Harring, Jeffrey R.
2014-01-01
Spline (or piecewise) regression models have been used in the past to account for patterns in observed data that exhibit distinct phases. The changepoint or knot marking the shift from one phase to the other, in many applications, is an unknown parameter to be estimated. As an extension of this framework, this research considers modeling the…
Moving the Bar: Transformations in Linear Regression.
ERIC Educational Resources Information Center
Miranda, Janet
The assumption that is most important to the hypothesis testing procedure of multiple linear regression is the assumption that the residuals are normally distributed, but this assumption is not always tenable given the realities of some data sets. When normal distribution of the residuals is not met, an alternative method can be initiated. As an…
REGRESSION METHODS FOR DATA WITH INCOMPLETE COVARIATES
Modern statistical methods in chronic disease epidemiology allow simultaneous regression of disease status on several covariates. These methods permit examination of the effects of one covariate while controlling for those of others that may be causally related to the disease. Howe...
Student Selection and the Special Regression Model.
ERIC Educational Resources Information Center
Deck, Dennis D.
The feasibility of constructing composite scores which will yield pretest measures having all the properties required by the special regression model is explored as an alternative to the single pretest score usually used in student selection for Elementary Secondary Education Act Title I compensatory education programs. Reading data, including…
Code System to Calculate Correlation & Regression Coefficients.
1999-11-23
Version 00 PCC/SRC is designed for use in conjunction with sensitivity analyses of complex computer models. PCC/SRC calculates the partial correlation coefficients (PCC) and the standardized regression coefficients (SRC) from the multivariate input to, and output from, a computer model.
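The two quantities named in this record can be illustrated with a short sketch. It uses the usual textbook definitions (a standardized regression coefficient is the least squares slope scaled by the ratio of standard deviations; a partial correlation coefficient is the correlation of residuals after regressing out the other inputs). It is not the PCC/SRC code system itself, and all function names and data are hypothetical.

```python
import numpy as np

def _residuals(A, b):
    """Residuals of b after least squares regression on A plus an intercept."""
    design = np.column_stack([np.ones(len(b)), A])
    coef, *_ = np.linalg.lstsq(design, b, rcond=None)
    return b - design @ coef

def standardized_regression_coefficients(X, y):
    """SRC_j = b_j * sd(x_j) / sd(y) from a multiple linear regression."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1:] * X.std(axis=0, ddof=1) / y.std(ddof=1)

def partial_correlation_coefficients(X, y):
    """PCC_j = corr(x_j, y) after removing linear effects of the other inputs."""
    k = X.shape[1]
    out = np.empty(k)
    for j in range(k):
        others = np.delete(X, j, axis=1)
        rx = _residuals(others, X[:, j])
        ry = _residuals(others, y)
        out[j] = np.corrcoef(rx, ry)[0, 1]
    return out

# Hypothetical model: the output depends strongly on input 0 and not on input 1
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 2))
y = 3.0 * X[:, 0] + rng.normal(0.0, 0.1, size=200)
src = standardized_regression_coefficients(X, y)
pcc = partial_correlation_coefficients(X, y)
print(src, pcc)
```

In a sensitivity analysis, `X` would hold the sampled model inputs and `y` a model output, so that inputs can be ranked by the magnitude of either measure.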
Gaussian conditional random fields for regression in remote sensing
NASA Astrophysics Data System (ADS)
Radosavljevic, Vladan
In recent years many remote sensing instruments of various properties have been employed in an attempt to better characterize important geophysical phenomena. Satellite instruments provide an exceptional opportunity for global long-term observations of the land, the biosphere, the atmosphere, and the oceans. The collected data are used for estimation and better understanding of geophysical parameters such as land cover type, atmospheric properties, or ocean temperature. Achieving accurate estimations of such parameters is an important requirement for the development of models able to predict global climate changes. One of the most challenging climate research problems is estimation of the global composition, load, and variability of aerosols, small airborne particles that reflect and absorb incoming solar radiation. The existing algorithm for aerosol prediction from satellite observations is deterministic and manually tuned by domain scientists. In contrast to the domain-driven method, we show that aerosol prediction is achievable by completely data-driven approaches. These statistical methods consist of learning nonlinear regression models to predict aerosol load using the satellite observations as inputs. Measurements from unevenly distributed ground-based sites around the world are used as a proxy for ground-truth outputs. Although statistical methods achieve better accuracy than the deterministic method, this setup is appropriate when data are independently and identically distributed (IID). The IID assumption is often violated in remote sensing, where data exhibit temporal, spatial, or spatio-temporal dependencies. In such cases, the traditional supervised learning approaches could result in a model with degraded accuracy. Conditional random fields (CRF) are widely used for predicting output variables that have some internal structure. Most of the CRF research has been done on structured classification where the outputs are discrete. We propose a CRF model for continuous outputs
Various approaches and tools exist to estimate local and regional PM_{2.5} impacts from a single emissions source, ranging from simple screening techniques to Gaussian based dispersion models and complex grid-based Eulerian photochemical transport models. These approache...
Harding, Brian J; Gehrels, Thomas W; Makela, Jonathan J
2014-02-01
The Earth's thermosphere plays a critical role in driving electrodynamic processes in the ionosphere and in transferring solar energy to the atmosphere, yet measurements of thermospheric state parameters, such as wind and temperature, are sparse. One of the most popular techniques for measuring these parameters is to use a Fabry-Perot interferometer to monitor the Doppler width and breadth of naturally occurring airglow emissions in the thermosphere. In this work, we present a technique for estimating upper-atmospheric winds and temperatures from images of Fabry-Perot fringes captured by a CCD detector. We estimate instrument parameters from fringe patterns of a frequency-stabilized laser, and we use these parameters to estimate winds and temperatures from airglow fringe patterns. A unique feature of this technique is the model used for the laser and airglow fringe patterns, which fits all fringes simultaneously and attempts to model the effects of optical defects. This technique yields accurate estimates for winds, temperatures, and the associated uncertainties in these parameters, as we show with a Monte Carlo simulation. PMID:24514183
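The physical relations that link a fringe measurement to wind and temperature can be sketched as follows. This is only an illustration of the standard Doppler formulas, not the fringe-fitting model described in the abstract. The 630.0 nm atomic-oxygen emission line, the constant names, and all function names are assumptions chosen for the example.

```python
import math

C = 299_792_458.0        # speed of light, m/s
K_B = 1.380649e-23       # Boltzmann constant, J/K
AMU = 1.66053906660e-27  # atomic mass unit, kg

# Assumption: 630.0 nm red-line emission from atomic oxygen (mass ~16 u).
LAMBDA_0 = 630.0e-9
MASS_O = 15.999 * AMU


def wind_from_shift(lambda_obs, lambda_0=LAMBDA_0):
    """Line-of-sight speed (m/s) from a Doppler shift; positive means receding."""
    return C * (lambda_obs - lambda_0) / lambda_0


def width_from_temperature(temp, lambda_0=LAMBDA_0, mass=MASS_O):
    """1-sigma Gaussian width (m) of a thermally broadened line at temperature temp (K)."""
    return lambda_0 * math.sqrt(K_B * temp / (mass * C**2))


def temperature_from_width(sigma_lambda, lambda_0=LAMBDA_0, mass=MASS_O):
    """Temperature (K) from the 1-sigma Gaussian width of a thermally broadened line."""
    return mass * C**2 * (sigma_lambda / lambda_0) ** 2 / K_B


# A 100 m/s receding wind shifts the line by only a fraction of a picometre.
shifted = LAMBDA_0 * (1.0 + 100.0 / C)
print(wind_from_shift(shifted))
print(temperature_from_width(width_from_temperature(800.0)))
```

The small size of the shift relative to the line wavelength is one reason the instrument parameters have to be calibrated carefully, here with a frequency-stabilized laser as the abstract describes.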
ERIC Educational Resources Information Center
Culpepper, Steven Andrew
2010-01-01
Statistical prediction remains an important tool for decisions in a variety of disciplines. An equally important issue is identifying factors that contribute to more or less accurate predictions. The time series literature includes well developed methods for studying predictability and volatility over time. This article develops…
Zweig, George
2016-05-01
An earlier paper characterizing the linear mechanical response of the organ of Corti [J. Acoust. Soc. Am. 138, 1102-1121 (2015)] is extended to the nonlinear domain. Assuming the existence of nonlinear oscillators nonlocally coupled through the pressure they help create, the oscillator equations are derived and examined when the stimuli are modulated tones and clicks. The nonlinearities are constrained by the requirements of oscillator stability and the invariance of zero crossings in the click response to changes in click amplitude. The nonlinear oscillator equations for tones are solved in terms of the fluid pressure that drives them, and its time derivative, presumably a proxy for forces created by outer hair cells. The pressure equation is reduced to quadrature, the integrand depending on the oscillators' responses. The resulting nonlocally coupled nonlinear equations for the pressure, and oscillator amplitudes and phases, are solved numerically in terms of the fluid pressure at the stapes. Methods for determining the nonlinear damping directly from measurements are described. Once the oscillators have been characterized from their tone and click responses, the mechanical response of the cochlea to natural sounds may be computed numerically. Signal processing inspired by cochlear mechanics opens up a new area of nonlocal nonlinear time-frequency analysis. PMID:27250151
Nonlinear ordinary difference equations
NASA Technical Reports Server (NTRS)
Caughey, T. K.
1979-01-01
Future space vehicles will be relatively large and flexible, and active control will be necessary to maintain geometrical configuration. While the stresses and strains in these space vehicles are not expected to be excessively large, their cumulative effects will cause significant geometrical nonlinearities to appear in the equations of motion, in addition to the nonlinearities caused by material properties. Since the only effective tool for the analysis of such large complex structures is the digital computer, it will be necessary to gain a better understanding of the nonlinear ordinary difference equations which result from the time discretization of the semidiscrete equations of motion for such structures.
Metamaterials with conformational nonlinearity
NASA Astrophysics Data System (ADS)
Lapine, Mikhail; Shadrivov, Ilya V.; Powell, David A.; Kivshar, Yuri S.
2011-11-01
Within a decade of fruitful development, metamaterials became a prominent area of research, bridging theoretical and applied electrodynamics, electrical engineering and material science. Being man-made structures, metamaterials offer a particularly useful playground to develop interdisciplinary concepts. Here we demonstrate a novel principle in metamaterial assembly which integrates electromagnetic, mechanical, and thermal responses within their elements. Through these mechanisms, the conformation of the meta-molecules changes, providing a dual mechanism for nonlinearity and offering nonlinear chirality. Our proposal opens a wide road towards further developments of nonlinear metamaterials and photonic structures, adding extra flexibility to their design and control.
Embedded Sensors for Measuring Surface Regression
NASA Technical Reports Server (NTRS)
Gramer, Daniel J.; Taagen, Thomas J.; Vermaak, Anton G.
2006-01-01
The development and evaluation of new hybrid and solid rocket motors requires accurate characterization of the propellant surface regression as a function of key operational parameters. These characteristics establish the propellant flow rate and are prime design drivers affecting the propulsion system geometry, size, and overall performance. There is a similar need for the development of advanced ablative materials, and the use of conventional ablatives exposed to new operational environments. The Miniature Surface Regression Sensor (MSRS) was developed to serve these applications. It is designed to be cast or embedded in the material of interest and regresses along with it. During this process, the resistance of the sensor is related to its instantaneous length, allowing the real-time thickness of the host material to be established. The time derivative of this data reveals the instantaneous surface regression rate. The MSRS could also be adapted to perform similar measurements for a variety of other host materials when it is desired to monitor thicknesses and/or regression rate for purposes of safety, operational control, or research. For example, the sensor could be used to monitor the thicknesses of brake linings or racecar tires and indicate when they need to be replaced. At the time of this reporting, over 200 of these sensors have been installed into a variety of host materials. An MSRS can be made in either of two configurations, denoted ladder and continuous (see Figure 1). A ladder MSRS includes two highly electrically conductive legs, across which narrow strips of electrically resistive material are placed at small increments of length. These strips resemble the rungs of a ladder and are electrically equivalent to many tiny resistors connected in parallel. A substrate material provides structural support for the legs and rungs. The instantaneous sensor resistance is read by an external signal conditioner via wires attached to the conductive legs on the
Photonic nonlinear transient computing with multiple-delay wavelength dynamics.
Martinenghi, Romain; Rybalko, Sergei; Jacquot, Maxime; Chembo, Yanne K; Larger, Laurent
2012-06-15
We report on the experimental demonstration of a hybrid optoelectronic neuromorphic computer based on a complex nonlinear wavelength dynamics including multiple delayed feedbacks with randomly defined weights. This neuromorphic approach is based on a new paradigm of a brain-inspired computational unit, intrinsically differing from Turing machines. This recent paradigm consists in expanding the input information to be processed into a higher dimensional phase space, through the nonlinear transient response of a complex dynamics excited by the input information. The computed output is then extracted via a linear separation of the transient trajectory in the complex phase space. The hyperplane separation is derived from a learning phase consisting of the resolution of a regression problem. The processing capability originates from the nonlinear transient, resulting in nonlinear transient computing. The computational performance is successfully evaluated on a standard benchmark test, namely, a spoken digit recognition task. PMID:23004274
Photonic Nonlinear Transient Computing with Multiple-Delay Wavelength Dynamics
NASA Astrophysics Data System (ADS)
Martinenghi, Romain; Rybalko, Sergei; Jacquot, Maxime; Chembo, Yanne K.; Larger, Laurent
2012-06-01
We report on the experimental demonstration of a hybrid optoelectronic neuromorphic computer based on a complex nonlinear wavelength dynamics including multiple delayed feedbacks with randomly defined weights. This neuromorphic approach is based on a new paradigm of a brain-inspired computational unit, intrinsically differing from Turing machines. This recent paradigm consists in expanding the input information to be processed into a higher dimensional phase space, through the nonlinear transient response of a complex dynamics excited by the input information. The computed output is then extracted via a linear separation of the transient trajectory in the complex phase space. The hyperplane separation is derived from a learning phase consisting of the resolution of a regression problem. The processing capability originates from the nonlinear transient, resulting in nonlinear transient computing. The computational performance is successfully evaluated on a standard benchmark test, namely, a spoken digit recognition task.
NASA Astrophysics Data System (ADS)
Polat, Esra; Gunay, Suleyman
2013-10-01
One of the problems encountered in Multiple Linear Regression (MLR) is multicollinearity, which causes the overestimation of the regression parameters and an increase in the variance of these parameters. Hence, when multicollinearity is present, biased estimation procedures such as classical Principal Component Regression (CPCR) and Partial Least Squares Regression (PLSR) are performed. The SIMPLS algorithm is the leading PLSR algorithm because of its speed and efficiency and because its results are easier to interpret. However, both CPCR and SIMPLS yield very unreliable results when the data set contains outlying observations. Therefore, Hubert and Vanden Branden (2003) presented a robust PCR (RPCR) method and a robust PLSR (RPLSR) method called RSIMPLS. In RPCR, a robust Principal Component Analysis (PCA) method for high-dimensional data is first applied to the independent variables; then the dependent variables are regressed on the scores using a robust regression method. RSIMPLS is constructed from a robust covariance matrix for high-dimensional data and robust linear regression. The purpose of this study is to show the usage of the RPCR and RSIMPLS methods on an econometric data set by comparing the two methods on an inflation model of Turkey. The methods are compared in terms of predictive ability and goodness of fit using a robust Root Mean Squared Error of Cross-validation (R-RMSECV), a robust R2 value and the Robust Component Selection (RCS) statistic.
Mesch, Martin; Metzger, Bernd; Hentschel, Mario; Giessen, Harald
2016-05-11
We introduce the concept of nonlinear plasmonic sensing, relying on third harmonic generation from simple plasmonic nanoantennas. Because of the nonlinear conversion process we observe a larger sensitivity to a local change in the refractive index as compared to the commonly used linear localized surface plasmon resonance sensing. Refractive index changes as small as 10^{-3} can be detected. In order to determine the spectral position of highest sensitivity, we perform linear and third harmonic spectroscopy on plasmonic nanoantenna arrays, which are the fundamental building blocks of our sensor. Furthermore, simultaneous detection of linear and nonlinear signals allows quantitative comparison of both methods, providing further insight into the working principle of our sensor. While the signal-to-noise ratio is comparable, nonlinear sensing gives about seven times higher relative signal changes. PMID:27050296
NASA Technical Reports Server (NTRS)
1984-01-01
Nonlinear structural analysis techniques for engine structures and components are addressed. The finite element method and boundary element method are discussed in terms of stress and structural analyses of shells, plates, and laminates.
Nonlinear Dynamics in Cardiology
Krogh-Madsen, Trine; Christini, David J.
2013-01-01
The dynamics of many cardiac arrhythmias, as well as the nature of transitions between different heart rhythms, have long been considered evidence of nonlinear phenomena playing a direct role in cardiac arrhythmogenesis. In most types of cardiac disease, the pathology develops slowly and gradually, often over many years. In contrast, arrhythmias often occur suddenly. In nonlinear systems, sudden changes in qualitative dynamics can, counter-intuitively, result from a gradual change in a system parameter; this is known as a bifurcation. Here, we review how nonlinearities in cardiac electrophysiology influence normal and abnormal rhythms and how bifurcations change the dynamics. In particular, we focus on the many recent developments in computational modeling at the cellular level focused on intracellular calcium dynamics. We discuss two areas where recent experimental and modeling work have suggested the importance of nonlinearities in calcium dynamics: repolarization alternans and pacemaker cell automaticity. PMID:22524390
Library for Nonlinear Optimization
2001-10-09
OPT++ is a C++ object-oriented library for nonlinear optimization. This incorporates an improved implementation of an existing capability and two new algorithmic capabilities based on existing journal articles and freely available software.
NASA Astrophysics Data System (ADS)
Frank, T. D.
2008-06-01
Some elementary properties and examples of Markov processes are reviewed. It is shown that the definition of the Markov property naturally leads to a classification of Markov processes into linear and nonlinear ones.
Nonlinear Refractive Properties
NASA Technical Reports Server (NTRS)
Vikram, Chandra S.; Witherow, William K.
2001-01-01
Using nonlinear refractive properties of a salt-water solution at two wavelengths, numerical analysis has been performed to extract temperature and concentration from interferometric fringe data. The theoretical study, using a commercially available equation solving software, starts with critical fringe counting needs and the role of nonlinear refractive properties in such measurements. Finally, methodology of the analysis, codes, fringe counting accuracy needs, etc. is described in detail.
[Caudal regression sequence: clinical-radiological case].
Zepeda T, Juan; García M, Mirna; Morales S, Jorge; Pantoja H, Miguel A; Espinoza G, Aníbal
2015-01-01
Caudal regression syndrome is an uncommon congenital malformation that includes a wide spectrum of clinical presentations. Characterised by caudal musculoskeletal compromise, it can be associated with neurological, gastrointestinal, renal and genitourinary defects. Although the specific aetiology has not been clarified, it has been associated with the presence of maternal diabetes and mutations in the homeobox gene HLXB9. Its diagnosis is based on good prenatal ultrasound detection, detailed physical examination, and post-natal imaging study using radiography and magnetic resonance. Caudal regression syndrome requires multidisciplinary management, and it seems that good metabolic control of gestational diabetes constitutes the best preventive measure available. We present the clinical case and images of a male term newborn, born to a pregestational diabetic mother with poor metabolic control and a prenatal ultrasound diagnosis of lumbar spine, iliac bones and lower limbs malformation. Born in good conditions, the diagnosis was confirmed using X-rays and magnetic resonance. PMID:26455704
Joint regression analysis for discrete longitudinal data.
Madsen, L; Fang, Y
2011-09-01
We introduce an approximation to the Gaussian copula likelihood of Song, Li, and Yuan (2009, Biometrics 65, 60-68) used to estimate regression parameters from correlated discrete or mixed bivariate or trivariate outcomes. Our approximation allows estimation of parameters from response vectors of length much larger than three, and is asymptotically equivalent to the Gaussian copula likelihood. We estimate regression parameters from the toenail infection data of De Backer et al. (1996, British Journal of Dermatology 134, 16-17), which consist of binary response vectors of length seven or less from 294 subjects. Although maximizing the Gaussian copula likelihood yields estimators that are asymptotically more efficient than generalized estimating equation (GEE) estimators, our simulation study illustrates that for finite samples, GEE estimators can actually be as much as 20% more efficient. PMID:21039391
Self-Adaptive Induction of Regression Trees.
Fidalgo-Merino, Raúl; Núñez, Marlon
2011-08-01
A new algorithm for incremental construction of binary regression trees is presented. This algorithm, called SAIRT, adapts the induced model when facing data streams involving unknown dynamics, like gradual and abrupt function drift, changes in certain regions of the function, noise, and virtual drift. It also handles both symbolic and numeric attributes. The proposed algorithm can automatically adapt its internal parameters and model structure to obtain new patterns, depending on the current dynamics of the data stream. SAIRT can monitor the usefulness of nodes and can forget examples from selected regions, storing the remaining ones in local windows associated to the leaves of the tree. On these conditions, current regression methods need a careful configuration depending on the dynamics of the problem. Experimentation suggests that the proposed algorithm obtains better results than current algorithms when dealing with data streams that involve changes with different speeds, noise levels, sampling distribution of examples, and partial or complete changes of the underlying function. PMID:21263164
Quantile regression modeling for Malaysian automobile insurance premium data
NASA Astrophysics Data System (ADS)
Fuzi, Mohd Fadzli Mohd; Ismail, Noriszura; Jemain, Abd Aziz
2015-09-01
Quantile regression is more robust to outliers than mean regression models. Traditional mean regression models such as the Generalized Linear Model (GLM) are not able to capture the entire distribution of premium data. In this paper we demonstrate how a quantile regression approach can be used to model net premium data to study the effects of change in the estimates of regression parameters (rating classes) on the magnitude of the response variable (pure premium). We then compare the results of the quantile regression model with those of the Gamma regression model. The results from quantile regression show that some rating classes increase as quantile increases and some decrease with decreasing quantile. Further, we found that the confidence interval of median regression (τ = 0.5) is always smaller than that of Gamma regression for all risk factors.
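The robustness claim can be illustrated with the check (pinball) loss that quantile regression minimizes. The sketch below fits only a constant, not a regression on rating classes, and the premium values are made up for the example; the function names are hypothetical.

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss: tau * r for r >= 0, (tau - 1) * r for r < 0."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def best_constant(values, tau):
    """Return the sample value that minimizes the total pinball loss."""
    return min(values, key=lambda q: sum(pinball_loss(v - q, tau) for v in values))


# Made-up premiums with a single large outlier.
premiums = [100.0, 120.0, 130.0, 150.0, 2000.0]

median_fit = best_constant(premiums, 0.5)    # tau = 0.5 recovers the median
upper_fit = best_constant(premiums, 0.75)    # a higher quantile of the same data
mean_fit = sum(premiums) / len(premiums)     # the mean is pulled up by the outlier

print(median_fit, upper_fit, mean_fit)
```

With these values the median fit stays at 130 while the mean moves to 500, which is the sense in which a quantile fit is less sensitive to a single extreme premium.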
Model selection for logistic regression models
NASA Astrophysics Data System (ADS)
Duller, Christine
2012-09-01
Model selection for logistic regression models decides which of some given potential regressors have an effect and hence should be included in the final model. The second interesting question is whether a certain factor is heterogeneous among some subsets, i.e. whether the model should include a random intercept or not. In this paper these questions will be answered with classical as well as with Bayesian methods. The applications show some results of recent research projects in medicine and business administration.
Modeling confounding by half-sibling regression
Schölkopf, Bernhard; Hogg, David W.; Wang, Dun; Foreman-Mackey, Daniel; Janzing, Dominik; Simon-Gabriel, Carl-Johann; Peters, Jonas
2016-01-01
We describe a method for removing the effect of confounders to reconstruct a latent quantity of interest. The method, referred to as “half-sibling regression,” is inspired by recent work in causal inference using additive noise models. We provide a theoretical justification, discussing both independent and identically distributed as well as time series data, respectively, and illustrate the potential of the method in a challenging astronomy application. PMID:27382154
Modeling confounding by half-sibling regression.
Schölkopf, Bernhard; Hogg, David W; Wang, Dun; Foreman-Mackey, Daniel; Janzing, Dominik; Simon-Gabriel, Carl-Johann; Peters, Jonas
2016-07-01
We describe a method for removing the effect of confounders to reconstruct a latent quantity of interest. The method, referred to as "half-sibling regression," is inspired by recent work in causal inference using additive noise models. We provide a theoretical justification, discussing both independent and identically distributed as well as time series data, respectively, and illustrate the potential of the method in a challenging astronomy application. PMID:27382154
NASA Astrophysics Data System (ADS)
Yang, Qianli; Pitkow, Xaq
2015-03-01
Most interesting natural sensory stimuli are encoded in the brain in a form that can only be decoded nonlinearly. But despite being a core function of the brain, nonlinear population codes are rarely studied and poorly understood. Interestingly, the few existing models of nonlinear codes are inconsistent with known architectural features of the brain. In particular, these codes have information content that scales with the size of the cortical population, even if that violates the data processing inequality by exceeding the amount of information entering the sensory system. Here we provide a valid theory of nonlinear population codes by generalizing recent work on information-limiting correlations in linear population codes. Although these generalized, nonlinear information-limiting correlations bound the performance of any decoder, they also make decoding more robust to suboptimal computation, allowing many suboptimal decoders to achieve nearly the same efficiency as an optimal decoder. Although these correlations are extremely difficult to measure directly, particularly for nonlinear codes, we provide a simple, practical test by which one can use choice-related activity in small populations of neurons to determine whether decoding is suboptimal or optimal and limited by correlated noise. We conclude by describing an example computation in the vestibular system where this theory applies. QY and XP were supported by a grant from the McNair foundation.
Nonlinear systems in medicine.
Higgins, John P.
2002-01-01
Many achievements in medicine have come from applying linear theory to problems. Most current methods of data analysis use linear models, which are based on proportionality between two variables and/or relationships described by linear differential equations. However, nonlinear behavior commonly occurs within human systems due to their complex dynamic nature; this cannot be described adequately by linear models. Nonlinear thinking has grown among physiologists and physicians over the past century, and non-linear system theories are beginning to be applied to assist in interpreting, explaining, and predicting biological phenomena. Chaos theory describes elements manifesting behavior that is extremely sensitive to initial conditions, does not repeat itself and yet is deterministic. Complexity theory goes one step beyond chaos and is attempting to explain complex behavior that emerges within dynamic nonlinear systems. Nonlinear modeling still has not been able to explain all of the complexity present in human systems, and further models still need to be refined and developed. However, nonlinear modeling is helping to explain some system behaviors that linear systems cannot and thus will augment our understanding of the nature of complex dynamic systems within the human body in health and in disease states. PMID:14580107
Time series regression studies in environmental epidemiology
Bhaskaran, Krishnan; Gasparrini, Antonio; Hajat, Shakoor; Smeeth, Liam; Armstrong, Ben
2013-01-01
Time series regression studies have been widely used in environmental epidemiology, notably in investigating the short-term associations between exposures such as air pollution, weather variables or pollen, and health outcomes such as mortality, myocardial infarction or disease-specific hospital admissions. Typically, for both exposure and outcome, data are available at regular time intervals (e.g. daily pollution levels and daily mortality counts) and the aim is to explore short-term associations between them. In this article, we describe the general features of time series data, and we outline the analysis process, beginning with descriptive analysis, then focusing on issues in time series regression that differ from other regression methods: modelling short-term fluctuations in the presence of seasonal and long-term patterns, dealing with time varying confounding factors and modelling delayed (‘lagged’) associations between exposure and outcome. We finish with advice on model checking and sensitivity analysis, and some common extensions to the basic model. PMID:23760528
Time series regression studies in environmental epidemiology.
Bhaskaran, Krishnan; Gasparrini, Antonio; Hajat, Shakoor; Smeeth, Liam; Armstrong, Ben
2013-08-01
Time series regression studies have been widely used in environmental epidemiology, notably in investigating the short-term associations between exposures such as air pollution, weather variables or pollen, and health outcomes such as mortality, myocardial infarction or disease-specific hospital admissions. Typically, for both exposure and outcome, data are available at regular time intervals (e.g. daily pollution levels and daily mortality counts) and the aim is to explore short-term associations between them. In this article, we describe the general features of time series data, and we outline the analysis process, beginning with descriptive analysis, then focusing on issues in time series regression that differ from other regression methods: modelling short-term fluctuations in the presence of seasonal and long-term patterns, dealing with time varying confounding factors and modelling delayed ('lagged') associations between exposure and outcome. We finish with advice on model checking and sensitivity analysis, and some common extensions to the basic model. PMID:23760528
Transfer Learning Based on Logistic Regression
NASA Astrophysics Data System (ADS)
Paul, A.; Rottensteiner, F.; Heipke, C.
2015-08-01
In this paper we address the problem of classification of remote sensing images in the framework of transfer learning with a focus on domain adaptation. The main novel contribution is a method for transductive transfer learning in remote sensing on the basis of logistic regression. Logistic regression is a discriminative probabilistic classifier of low computational complexity, which can deal with multiclass problems. This research area deals with methods that solve problems in which labelled training data sets are assumed to be available only for a source domain, while classification is needed in the target domain with different, yet related characteristics. Classification takes place with a model of weight coefficients for hyperplanes which separate features in the transformed feature space. In terms of logistic regression, our domain adaptation method adjusts the model parameters by iterative labelling of the target test data set. These labelled data features are iteratively added to the current training set which, at the beginning, only contains source features and, simultaneously, a number of source features are deleted from the current training set. Experimental results based on a test series with synthetic and real data constitute a first proof-of-concept of the proposed method.
Satellite rainfall retrieval by logistic regression
NASA Technical Reports Server (NTRS)
Chiu, Long S.
1986-01-01
The potential use of logistic regression in rainfall estimation from satellite measurements is investigated. Satellite measurements provide covariate information in terms of radiances from different remote sensors. The logistic regression technique can effectively accommodate many covariates and test their significance in the estimation. The outcome from the logistical model is the probability that the rainrate of a satellite pixel is above a certain threshold. By varying the thresholds, a rainrate histogram can be obtained, from which the mean and the variance can be estimated. A logistical model is developed and applied to rainfall data collected during GATE, using as covariates the fractional rain area and a radiance measurement which is deduced from a microwave temperature-rainrate relation. It is demonstrated that the fractional rain area is an important covariate in the model, consistent with the use of the so-called Area Time Integral in estimating total rain volume in other studies. To calibrate the logistical model, simulated rain fields generated by rainfield models with prescribed parameters are needed. A stringent test of the logistical model is its ability to recover the prescribed parameters of simulated rain fields. A rain field simulation model which preserves the fractional rain area and lognormality of rainrates as found in GATE is developed. A stochastic regression model of branching and immigration whose solutions are lognormally distributed in some asymptotic limits has also been developed.
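The step from threshold probabilities to a rainrate histogram and its moments can be sketched as below. The thresholds and exceedance probabilities are hypothetical numbers standing in for model outputs, the function names are invented for the example, and placing each bin's mass at its midpoint is a simplifying assumption rather than the procedure used in the study.

```python
def histogram_from_exceedance(thresholds, exceed_probs):
    """Turn P(R > t_k) at increasing thresholds into (midpoint, probability) bins.

    Assumes the last exceedance probability is 0, i.e. the top threshold
    bounds the rain rate (an illustrative simplification).
    """
    bins = []
    for k in range(len(thresholds) - 1):
        prob = exceed_probs[k] - exceed_probs[k + 1]
        midpoint = 0.5 * (thresholds[k] + thresholds[k + 1])
        bins.append((midpoint, prob))
    return bins


def mean_and_variance(bins):
    """Moments of the rain rate, with each bin's mass placed at its midpoint.

    Any probability not covered by the bins (1 - P(R > 0)) is treated as
    zero rain and therefore contributes nothing to either sum.
    """
    mean = sum(mid * prob for mid, prob in bins)
    second = sum(mid * mid * prob for mid, prob in bins)
    return mean, second - mean * mean


thresholds = [0.0, 2.0, 4.0]      # mm/h, hypothetical
exceed_probs = [0.5, 0.25, 0.0]   # hypothetical model outputs P(R > t)

bins = histogram_from_exceedance(thresholds, exceed_probs)
mean, variance = mean_and_variance(bins)
print(bins, mean, variance)
```

Here half of the pixels are dry, so the mean rain rate of 1.0 mm/h reflects both the fractional rain area and the conditional distribution of rain rates.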
General Regression and Representation Model for Classification
Qian, Jianjun; Yang, Jian; Xu, Yong
2014-01-01
Recently, the regularized coding-based classification methods (e.g. SRC and CRC) have shown great potential for pattern classification. However, most existing coding methods assume that the representation residuals are uncorrelated. In real-world applications, this assumption does not hold. In this paper, we take account of the correlations of the representation residuals and develop a general regression and representation model (GRR) for classification. GRR not only has the advantages of CRC, but also makes full use of the prior information (e.g. the correlations between representation residuals and representation coefficients) and the specific information (weight matrix of image pixels) to enhance the classification performance. GRR uses the generalized Tikhonov regularization and K Nearest Neighbors to learn the prior information from the training data. Meanwhile, the specific information is obtained by using an iterative algorithm to update the feature (or image pixel) weights of the test sample. With the proposed model as a platform, we design two classifiers: basic general regression and representation classifier (B-GRR) and robust general regression and representation classifier (R-GRR). The experimental results demonstrate the performance advantages of the proposed methods over state-of-the-art algorithms. PMID:25531882
Multiple regression analyses in the prediction of aerospace instrument costs
NASA Astrophysics Data System (ADS)
Tran, Linh
The aerospace industry has been investing for decades in ways to improve its efficiency in estimating the project life cycle cost (LCC). One of the major focuses in the LCC is the cost/prediction of aerospace instruments done during the early conceptual design phase of the project. The accuracy of early cost predictions affects the project scheduling and funding, and it is often the major cause for project cost overruns. The prediction of instruments' cost is based on the statistical analysis of these independent variables: Mass (kg), Power (watts), Instrument Type, Technology Readiness Level (TRL), Destination: earth orbiting or planetary, Data rates (kbps), Number of bands, Number of channels, Design life (months), and Development duration (months). This author is proposing a cost prediction approach of aerospace instruments based on these statistical analyses: Clustering Analysis, Principal Components Analysis (PCA), Bootstrap, and multiple regressions (both linear and non-linear). In the proposed approach, the Cost Estimating Relationship (CER) will be developed for the dependent variable Instrument Cost by using a combination of multiple independent variables. "The Full Model" will be developed and executed to estimate the full set of nine variables. The SAS program, Excel, Automatic Cost Estimating Integrate Tool (ACEIT) and Minitab are the tools to aid the analysis. Through the analysis, the cost drivers will be identified which will help develop an ultimate cost estimating software tool for the Instrument Cost prediction and optimization of future missions.
End-user display calibration via support vector regression
NASA Astrophysics Data System (ADS)
Bastani, Behnam; Funt, Brian; Xiong, Weihua
2006-01-01
The technique of support vector regression (SVR) is applied to the color display calibration problem. Given a set of training data, SVR estimates a continuous-valued function encoding the fundamental interrelation between a given input and its corresponding output. This mapping can then be used to find an output value for a given input value not in the training data set. Here, SVR is applied directly to the display's non-linearized RGB digital input values to predict output CIELAB values. There are several different linear methods for calibrating different display technologies (GOG, Masking and Wyble). An advantage of using SVR for color calibration is that the end-user does not need to apply a different calibration model for each different display technology. We show that the same model can be used to calibrate CRT, LCD and DLP displays accurately. We also show that the accuracy of the model is comparable to that of the optimal linear transformation introduced by Funt et al.
On robust regression with high-dimensional predictors.
El Karoui, Noureddine; Bean, Derek; Bickel, Peter J; Lim, Chinghway; Yu, Bin
2013-09-01
We study regression M-estimates in the setting where p, the number of covariates, and n, the number of observations, are both large, but p ≤ n. We find an exact stochastic representation for the distribution of $\hat{\beta} = \operatorname{argmin}_{\beta \in \mathbb{R}^p} \sum_{i=1}^{n} \rho(Y_i - X_i'\beta)$ at fixed p and n under various assumptions on the objective function ρ and our statistical model. A scalar random variable whose deterministic limit $r_\rho(\kappa)$ can be studied when p/n → κ > 0 plays a central role in this representation. We discover a nonlinear system of two deterministic equations that characterizes $r_\rho(\kappa)$. Interestingly, the system shows that $r_\rho(\kappa)$ depends on ρ through proximal mappings of ρ as well as various aspects of the statistical model underlying our study. Several surprising results emerge. In particular, we show that, when p/n is large enough, least squares becomes preferable to least absolute deviations for double-exponential errors. PMID:23954908
Hu, Qinghua; Zhang, Shiguang; Xie, Zongxia; Mi, Jusheng; Wan, Jie
2014-09-01
Support vector regression (SVR) techniques are aimed at discovering a linear or nonlinear structure hidden in sample data. Most existing regression techniques assume that the error distribution is Gaussian. However, it has been observed that the noise in some real-world applications, such as wind power forecasting and direction-of-arrival estimation, does not follow a Gaussian distribution but rather a beta distribution, Laplacian distribution, or other models. In these cases the current regression techniques are not optimal. Following the Bayesian approach, we derive a general loss function and develop a uniform model of ν-support vector regression for the general noise model (N-SVR). The Augmented Lagrange Multiplier method is introduced to solve N-SVR. Numerical experiments on artificial data sets, UCI data and short-term wind speed prediction are conducted. The results show the effectiveness of the proposed technique. PMID:24874183
Higher-order Multivariable Polynomial Regression to Estimate Human Affective States
NASA Astrophysics Data System (ADS)
Wei, Jie; Chen, Tong; Liu, Guangyuan; Yang, Jiemin
2016-03-01
Estimating human affective states from direct observations and facial, vocal, gestural, physiological, and central nervous signals through computational models such as multivariate linear-regression analysis, support vector regression, and artificial neural networks has been proposed in the past decade. Among these models, linear models generally lack precision because they ignore the intrinsic nonlinearities of complex psychophysiological processes, and nonlinear models commonly adopt complicated algorithms. To improve accuracy and simplify the model, we introduce a new computational modeling method, higher-order multivariable polynomial regression, to estimate human affective states. The study employs standardized pictures in the International Affective Picture System to induce thirty subjects’ affective states, and obtains pure affective patterns of skin conductance as input variables to the higher-order multivariable polynomial model for predicting affective valence and arousal. Experimental results show that our method achieves correlation coefficients of 0.98 and 0.96 for estimation of affective valence and arousal, respectively. Moreover, the method may provide indirect evidence that valence and arousal originate in the brain’s motivational circuits. Thus, the proposed method can serve as a novel approach for efficiently estimating human affective states.
Higher-order Multivariable Polynomial Regression to Estimate Human Affective States.
Wei, Jie; Chen, Tong; Liu, Guangyuan; Yang, Jiemin
2016-01-01
Estimating human affective states from direct observations and facial, vocal, gestural, physiological, and central nervous signals through computational models such as multivariate linear-regression analysis, support vector regression, and artificial neural networks has been proposed in the past decade. Among these models, linear models generally lack precision because they ignore the intrinsic nonlinearities of complex psychophysiological processes, and nonlinear models commonly adopt complicated algorithms. To improve accuracy and simplify the model, we introduce a new computational modeling method, higher-order multivariable polynomial regression, to estimate human affective states. The study employs standardized pictures in the International Affective Picture System to induce thirty subjects' affective states, and obtains pure affective patterns of skin conductance as input variables to the higher-order multivariable polynomial model for predicting affective valence and arousal. Experimental results show that our method achieves correlation coefficients of 0.98 and 0.96 for estimation of affective valence and arousal, respectively. Moreover, the method may provide indirect evidence that valence and arousal originate in the brain's motivational circuits. Thus, the proposed method can serve as a novel approach for efficiently estimating human affective states. PMID:26996254
Flexible link functions in nonparametric binary regression with Gaussian process priors.
Li, Dan; Wang, Xia; Lin, Lizhen; Dey, Dipak K
2016-09-01
In many scientific fields, it is a common practice to collect a sequence of 0-1 binary responses from a subject across time, space, or a collection of covariates. Researchers are interested in finding out how the expected binary outcome is related to covariates, and aim at better prediction in the future 0-1 outcomes. Gaussian processes have been widely used to model nonlinear systems; in particular to model the latent structure in a binary regression model allowing nonlinear functional relationship between covariates and the expectation of binary outcomes. A critical issue in modeling binary response data is the appropriate choice of link functions. Commonly adopted link functions such as probit or logit links have fixed skewness and lack the flexibility to allow the data to determine the degree of the skewness. To address this limitation, we propose a flexible binary regression model which combines a generalized extreme value link function with a Gaussian process prior on the latent structure. Bayesian computation is employed in model estimation. Posterior consistency of the resulting posterior distribution is demonstrated. The flexibility and gains of the proposed model are illustrated through detailed simulation studies and two real data examples. Empirical results show that the proposed model outperforms a set of alternative models, which only have either a Gaussian process prior on the latent regression function or a Dirichlet prior on the link function. PMID:26686333
The Regression Trunk Approach to Discover Treatment Covariate Interaction
ERIC Educational Resources Information Center
Dusseldorp, Elise; Meulman, Jacqueline J.
2004-01-01
The regression trunk approach (RTA) is an integration of regression trees and multiple linear regression analysis. In this paper RTA is used to discover treatment covariate interactions, in the regression of one continuous variable on a treatment variable with "multiple" covariates. The performance of RTA is compared to the classical method of…
Analyzing Historical Count Data: Poisson and Negative Binomial Regression Models.
ERIC Educational Resources Information Center
Beck, E. M.; Tolnay, Stewart E.
1995-01-01
Asserts that traditional approaches to multivariate analysis, including standard linear regression techniques, ignore the special character of count data. Explicates three suitable alternatives to standard regression techniques, a simple Poisson regression, a modified Poisson regression, and a negative binomial model. (MJP)
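As a rough illustration of the simple Poisson regression mentioned in this abstract, the following minimal sketch fits a log-linear model with one covariate by Newton-Raphson. The function name `fit_poisson`, the single-covariate form, and the toy data are assumptions made for illustration only; they do not come from the article.

```python
import math


def fit_poisson(x, y, iterations=25):
    """Fit log E[y] = b0 + b1 * x by Newton-Raphson on the Poisson log-likelihood.

    Hypothetical helper for illustration; x and y are equal-length lists.
    """
    b0 = math.log(sum(y) / len(y))  # start from the intercept-only fit
    b1 = 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Entries of the symmetric 2 x 2 Fisher information matrix.
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + H^{-1} g, using the closed-form 2 x 2 inverse.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Toy counts that double with each unit of x, so the fitted slope is close to log(2).
intercept, slope = fit_poisson([0, 1, 2, 3, 4], [1, 2, 4, 8, 16])
```

Unlike ordinary least squares, the fitted mean `exp(b0 + b1 * x)` is always positive and the variance is tied to the mean, which is the "special character of count data" the abstract refers to. A negative binomial model would add a dispersion parameter to relax that variance assumption.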
Nanodispersion, nonlinear image filtering, and materials classification
NASA Astrophysics Data System (ADS)
Crosta, Giovanni F.; Lee, Jun S.
2011-06-01
Polyethylene terephthalate-alumina nano-composites from two production processes gave rise to materials H and T, further divided into four and three classes, respectively. Electron microscope images of the materials had been visually scored by an expert in terms of an index, β, aimed at assessing filler dispersion and distribution. These properties characterize the nano-composite. Herewith a classification algorithm which includes image spatial differentiation and non-linear filtering interlaced with multivariate statistics is applied to the same images of materials H and T. The classification algorithm depends on a few parameters, which are automatically determined by maximizing a figure of merit in the supervised training stage. The classifier output is a display on the plane of the first two principal components. By regressing the 1st principal component affinely against β a remarkable agreement is found between automated classification and visual scoring of material H. The regression result for material T is not significant, because the assigned classes reduce from 3 to 2, both by visual and automated scoring. The output from the non-linear image filter can be related to filler dispersion and distribution.
Transform-both-sides nonlinear models for in vitro pharmacokinetic experiments.
Latif, A H M Mahbub; Gilmour, Steven G
2015-06-01
Transform-both-sides nonlinear models have proved useful in many experimental applications including those in pharmaceutical sciences and biochemistry. The maximum likelihood method is commonly used to fit transform-both-sides nonlinear models, where the regression and transformation parameters are estimated simultaneously. In this paper, an analysis of variance-based method is described in detail for estimating transform-both-sides nonlinear models from randomized experiments. It estimates the transformation parameter from the full treatment model and then the regression parameters are estimated conditionally on this estimate of the transformation parameter. The analysis of variance method is computationally simpler compared with the maximum likelihood method of estimation and allows a more natural separation of different sources of lack of fit. Simulation studies show that the analysis of variance method can provide unbiased estimators of complex transform-both-sides nonlinear models, such as transform-both-sides random coefficient nonlinear regression models and transform-both-sides fixed coefficient nonlinear regression models with random block effects. PMID:25038072
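For orientation, a transform-both-sides model is commonly written as below. The symbols $f$, $\theta$, $\lambda$ and $\varepsilon_i$, and the choice of the Box-Cox family for $h$, are assumptions made here for illustration; the abstract does not state which transformation family is used.

```latex
h(y_i; \lambda) = h\bigl(f(x_i; \theta); \lambda\bigr) + \varepsilon_i,
\qquad
h(u; \lambda) =
\begin{cases}
  \dfrac{u^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\[6pt]
  \log u, & \lambda = 0.
\end{cases}
```

In this notation, maximum likelihood estimates the regression parameters $\theta$ and the transformation parameter $\lambda$ jointly, whereas the analysis of variance-based method described in the abstract first estimates $\lambda$ from the full treatment model and then estimates $\theta$ conditionally on that value.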
Mission assurance increased with regression testing
NASA Astrophysics Data System (ADS)
Lang, R.; Spezio, M.
Knowing what to test is an important attribute in any testing campaign, especially when it has to be right or the mission could be in jeopardy. The New Horizons mission, developed and operated by the Johns Hopkins University Applied Physics Laboratory, received a planned major upgrade to its Mission Operations and Control (MOC) ground system architecture. Early in the mission planning it was recognized that the ground system platform would require an upgrade to assure continued support of technology used for spacecraft operations. With the planned update to the six year operational ground architecture from Solaris 8 to Solaris 10, it was critical that the new architecture maintain critical operations and control functions. The New Horizons spacecraft is heading to its historic rendezvous with Pluto in July 2015 and then proceeding into the Kuiper Belt. This paper discusses the Independent Software Acceptance Testing (ISAT) Regression test campaign that played a critical role to assure the continued success of the New Horizons mission. The New Horizons ISAT process was designed to assure all the requirements were being met for the ground software functions developed to support the mission objectives. The ISAT team developed a test plan with a series of test case designs. The test objectives were to verify that the software developed from the requirements functioned as expected in the operational environment. As the test cases were developed and executed, a regression test suite was identified at the functional level. This regression test suite would serve as a crucial resource in assuring the operational system continued to function as required with such a large scale change being introduced. Some of the New Horizons ground software changes required modifications to the most critical functions of the operational software. Of particular concern was that the new MOC architecture (Solaris 10) is Intel based and little endian, and the legacy architecture (Solaris 8) was SPA
Monthly streamflow forecasting using Gaussian Process Regression
NASA Astrophysics Data System (ADS)
Sun, Alexander Y.; Wang, Dingbao; Xu, Xianli
2014-04-01
Streamflow forecasting plays a critical role in nearly all aspects of water resources planning and management. In this work, Gaussian Process Regression (GPR), an effective kernel-based machine learning algorithm, is applied to probabilistic streamflow forecasting. GPR is built on Gaussian process, which is a stochastic process that generalizes multivariate Gaussian distribution to infinite-dimensional space such that distributions over function values can be defined. The GPR algorithm provides a tractable and flexible hierarchical Bayesian framework for inferring the posterior distribution of streamflows. The prediction skill of the algorithm is tested for one-month-ahead prediction using the MOPEX database, which includes long-term hydrometeorological time series collected from 438 basins across the U.S. from 1948 to 2003. Comparisons with linear regression and artificial neural network models indicate that GPR outperforms both regression methods in most cases. The GPR prediction of MOPEX basins is further examined using the Budyko framework, which helps to reveal the close relationships among water-energy partitions, hydrologic similarity, and predictability. Flow regime modification and the resulting loss of predictability have been a major concern in recent years because of climate change and anthropogenic activities. The persistence of streamflow predictability is thus examined by extending the original MOPEX data records to 2012. Results indicate relatively strong persistence of streamflow predictability in the extended period, although the low-predictability basins tend to show more variations. Because many low-predictability basins are located in regions experiencing fast growth of human activities, the significance of sustainable development and water resources management can be even greater for those regions.
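To make the idea of a posterior distribution over function values concrete, here is a minimal sketch of Gaussian process regression for one-dimensional inputs. It assumes a zero-mean prior with a squared-exponential kernel and fixed hyperparameters; the names `rbf`, `solve` and `gp_predict` are hypothetical, and nothing here reproduces the authors' streamflow model or the MOPEX data.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def gp_predict(x_train, y_train, x_new, noise_var=1e-6, length_scale=1.0):
    """Posterior mean and variance of the latent function at x_new."""
    n = len(x_train)
    K = [[rbf(x_train[i], x_train[j], length_scale) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_star = [rbf(x_new, xi, length_scale) for xi in x_train]
    alpha = solve(K, y_train)  # K^{-1} y
    v = solve(K, k_star)       # K^{-1} k_*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(x_new, x_new, length_scale) - sum(k * vi for k, vi in zip(k_star, v))
    return mean, var


# Toy usage: predict halfway between two observed points.
mean, var = gp_predict([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 0.0, -1.0], 1.5)
```

The variance returned alongside the mean is what makes the forecast probabilistic: it shrinks near observed inputs and returns to the prior variance far from them. In practice the kernel hyperparameters would be learned from the data rather than fixed as they are in this sketch.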
Regression models for expected length of stay.
Grand, Mia Klinten; Putter, Hein
2016-03-30
In multi-state models, the expected length of stay (ELOS) in a state is not a straightforward object to relate to covariates, and the traditional approach has instead been to construct regression models for the transition intensities and calculate ELOS from these. The disadvantage of this approach is that the effect of covariates on the intensities is not easily translated into the effect on ELOS, and it typically relies on the Markov assumption. We propose to use pseudo-observations to construct regression models for ELOS, thereby allowing a direct interpretation of covariate effects while at the same time avoiding the Markov assumption. For this approach, all we need is a non-parametric consistent estimator for ELOS. For every subject (and for every state of interest), a pseudo-observation is constructed, and they are then used as outcome variables in the regression model. We furthermore show how to construct longitudinal (pseudo-) data when combining the concept of pseudo-observations with landmarking. In doing so, covariates are allowed to be time-varying, and we can investigate potential time-varying effects of the covariates. The models can be fitted using generalized estimating equations, and dependence between observations on the same subject is handled by applying the sandwich estimator. The method is illustrated using data from the US Health and Retirement Study where the impact of socio-economic factors on ELOS in health and disability is explored. Finally, we investigate the performance of our approach under different degrees of left-truncation, non-Markovianity, and right-censoring by means of simulation. PMID:26497637
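The pseudo-observation construction described here can be sketched with the usual jackknife formula, n·θ̂ − (n − 1)·θ̂₍₋ᵢ₎. In the sketch below, the sample mean stands in for the non-parametric ELOS estimator, which is an assumption made purely for illustration; in the paper's setting the estimator would have to handle censoring and truncation. The names `pseudo_observations` and `mean` are hypothetical.

```python
def pseudo_observations(sample, estimator):
    """Jackknife pseudo-observations: n * theta_hat - (n - 1) * theta_hat_without_i."""
    n = len(sample)
    full_estimate = estimator(sample)
    pseudo = []
    for i in range(n):
        leave_one_out = sample[:i] + sample[i + 1:]
        pseudo.append(n * full_estimate - (n - 1) * estimator(leave_one_out))
    return pseudo


def mean(values):
    """Stand-in estimator; a real analysis would use a censoring-aware ELOS estimator."""
    return sum(values) / len(values)


# Toy lengths of stay with no censoring.
stays = [2.0, 5.0, 3.0, 10.0]
pseudo = pseudo_observations(stays, mean)
```

With complete data and the sample mean, each pseudo-observation reduces to the subject's own value, which shows why pseudo-observations can act as outcome variables in a regression model. With censored data they differ from the raw observations but play the same role.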
Mapping geogenic radon potential by regression kriging.
Pásztor, László; Szabó, Katalin Zsuzsanna; Szatmári, Gábor; Laborczi, Annamária; Horváth, Ákos
2016-02-15
Radon ($^{222}$Rn) gas is produced in the radioactive decay chain of uranium ($^{238}$U), an element that is naturally present in soils. Radon is transported mainly by diffusion and convection mechanisms through the soil depending mainly on the physical and meteorological parameters of the soil and can enter and accumulate in buildings. Health risks originating from indoor radon concentration can be attributed to natural factors and are characterized by geogenic radon potential (GRP). Identification of areas with high health risks requires spatial modeling, that is, mapping of radon risk. In addition to geology and meteorology, physical soil properties play a significant role in the determination of GRP. In order to compile a reliable GRP map for a model area in Central Hungary, spatial auxiliary information representing GRP forming environmental factors was taken into account to support the spatial inference of the locally measured GRP values. Since the number of measured sites was limited, efficient spatial prediction methodologies were searched for to construct a reliable map for a larger area. Regression kriging (RK) was applied for the interpolation using spatially exhaustive auxiliary data on soil, geology, topography, land use and climate. RK divides the spatial inference into two parts. Firstly, the deterministic component of the target variable is determined by a regression model. The residuals of the multiple linear regression analysis represent the spatially varying but dependent stochastic component, which is interpolated by kriging. The final map is the sum of the two component predictions. Overall accuracy of the map was tested by Leave-One-Out Cross-Validation. Furthermore, the spatial reliability of the resultant map is estimated by calculating the 90% prediction interval of the local prediction values. The applicability of the applied method as well as that of the map is discussed briefly. PMID:26706761
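The two-part decomposition used by regression kriging can be written compactly as follows. The notation ($s_0$ for the prediction location, $q_k$ for the auxiliary covariates, $\lambda_i$ for the kriging weights) is assumed here for illustration and is not taken from the article.

```latex
\hat{z}(s_0) \;=\; \underbrace{\sum_{k=0}^{p} \hat{\beta}_k \, q_k(s_0)}_{\text{regression (deterministic) part}}
\;+\; \underbrace{\sum_{i=1}^{n} \lambda_i(s_0) \, e(s_i)}_{\text{kriged residual part}},
\qquad
e(s_i) = z(s_i) - \sum_{k=0}^{p} \hat{\beta}_k \, q_k(s_i),
```

where $q_0 \equiv 1$ gives the intercept, $\hat{\beta}_k$ are the fitted multiple linear regression coefficients, $e(s_i)$ are the regression residuals at the $n$ measured sites, and the weights $\lambda_i(s_0)$ are obtained from the variogram of those residuals.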
Penalized solutions to functional regression problems
Harezlak, Jaroslaw; Coull, Brent A.; Laird, Nan M.; Magari, Shannon R.; Christiani, David C.
2007-01-01
Recent technological advances in continuous biological monitoring and personal exposure assessment have led to the collection of subject-specific functional data. A primary goal in such studies is to assess the relationship between the functional predictors and the functional responses. The historical functional linear model (HFLM) can be used to model such dependencies of the response on the history of the predictor values. An estimation procedure for the regression coefficients that uses a variety of regularization techniques is proposed. An approximation of the regression surface relating the predictor to the outcome by a finite-dimensional basis expansion is used, followed by penalization of the coefficients of the neighboring basis functions by restricting the size of the coefficient differences to be small. Penalties based on the absolute values of the basis function coefficient differences (corresponding to the LASSO) and the squares of these differences (corresponding to the penalized spline methodology) are studied. The fits are compared using an extension of the Akaike Information Criterion that combines the error variance estimate, degrees of freedom of the fit and the norm of the basis function coefficients. The performance of the proposed methods is evaluated via simulations. The LASSO penalty applied to the linearly transformed coefficients yields sparser representations of the estimated regression surface, while the quadratic penalty provides solutions with the smallest $L_2$-norm of the basis function coefficients. Finally, the new estimation procedure is applied to the analysis of the effects of occupational particulate matter (PM) exposure on the heart rate variability (HRV) in a cohort of boilermaker workers. Results suggest that the strongest association between PM exposure and HRV in these workers occurs as a result of point exposures to the increased levels of particulate matter corresponding to smoking breaks. PMID:18552972
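The two difference penalties contrasted in this abstract can be sketched as the criteria below. For simplicity the basis coefficients $\theta_j$ are indexed along a single dimension; the regression surface in the paper is two-dimensional, so its neighboring differences would run over both indices. The design matrix $B$, the tuning parameter $\lambda$ and the indexing are assumptions made for illustration.

```latex
\hat{\theta}_{\mathrm{abs}} = \operatorname*{argmin}_{\theta}
  \; \lVert y - B\theta \rVert_2^2 + \lambda \sum_{j} \lvert \theta_{j+1} - \theta_j \rvert ,
\qquad
\hat{\theta}_{\mathrm{sq}} = \operatorname*{argmin}_{\theta}
  \; \lVert y - B\theta \rVert_2^2 + \lambda \sum_{j} (\theta_{j+1} - \theta_j)^2 .
```

The absolute-value penalty can set some neighboring differences exactly to zero, which matches the sparser representations reported for the LASSO-type fit. The squared penalty only shrinks the differences smoothly, as in penalized splines.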
Multiple linear regression for isotopic measurements
NASA Astrophysics Data System (ADS)
Garcia Alonso, J. I.
2012-04-01
There are two typical applications of isotopic measurements: the detection of natural variations in isotopic systems and the detection of man-made variations using enriched isotopes as indicators. For both types of measurement, accurate and precise isotope ratio measurements are required. For the so-called non-traditional stable isotopes, multicollector ICP-MS instruments are usually applied. In many cases, chemical separation procedures are required before accurate isotope measurements can be performed. The off-line separation of Rb and Sr or Nd and Sm is the classical procedure employed to eliminate isobaric interferences before multicollector ICP-MS measurement of Sr and Nd isotope ratios. Also, this procedure allows matrix separation for precise and accurate Sr and Nd isotope ratios to be obtained. In our laboratory we have evaluated the separation of Rb-Sr and Nd-Sm isobars by liquid chromatography and on-line multicollector ICP-MS detection. The combination of this chromatographic procedure with multiple linear regression of the raw chromatographic data resulted in Sr and Nd isotope ratios with precisions and accuracies typical of off-line sample preparation procedures. On the other hand, methods for the labelling of individual organisms (such as a given plant, fish or animal) are required for population studies. We have developed a dual isotope labelling procedure which can be unique for a given individual, can be inherited in living organisms and is stable. The detection of the isotopic signature is also based on multiple linear regression. The labelling of fish and its detection in otoliths by Laser Ablation ICP-MS will be discussed using trout and salmon as examples. In conclusion, isotope measurement procedures based on multiple linear regression can be a viable alternative in multicollector ICP-MS measurements.
Validation of a heteroscedastic hazards regression model.
Wu, Hong-Dar Isaac; Hsieh, Fushing; Chen, Chen-Hsin
2002-03-01
A Cox-type regression model accommodating heteroscedasticity, with a power factor of the baseline cumulative hazard, is investigated for analyzing data with crossing hazards behavior. Since the approach of partial likelihood cannot eliminate the baseline hazard, an overidentified estimating equation (OEE) approach is introduced in the estimation procedure. Its by-product, a model checking statistic, is presented to test for the overall adequacy of the heteroscedastic model. Further, under the heteroscedastic model setting, we propose two statistics to test the proportional hazards assumption. Implementation of this model is illustrated in a data analysis of a cancer clinical trial. PMID:11878222
Convex Regression with Interpretable Sharp Partitions
Petersen, Ashley; Simon, Noah; Witten, Daniela
2016-01-01
We consider the problem of predicting an outcome variable on the basis of a small number of covariates, using an interpretable yet non-additive model. We propose convex regression with interpretable sharp partitions (CRISP) for this task. CRISP partitions the covariate space into blocks in a data-adaptive way, and fits a mean model within each block. Unlike other partitioning methods, CRISP is fit using a non-greedy approach by solving a convex optimization problem, resulting in low-variance fits. We explore the properties of CRISP, and evaluate its performance in a simulation study and on a housing price data set.
Learning regulatory programs by threshold SVD regression
Ma, Xin; Xiao, Luo; Wong, Wing Hung
2014-01-01
We formulate a statistical model for the regulation of global gene expression by multiple regulatory programs and propose a thresholding singular value decomposition (T-SVD) regression method for learning such a model from data. Extensive simulations demonstrate that this method offers improved computational speed and higher sensitivity and specificity over competing approaches. The method is used to analyze microRNA (miRNA) and long noncoding RNA (lncRNA) data from The Cancer Genome Atlas (TCGA) consortium. The analysis yields previously unidentified insights into the combinatorial regulation of gene expression by noncoding RNAs, as well as findings that are supported by evidence from the literature. PMID:25331876
Weather adjustment using seemingly unrelated regression
Noll, T.A.
1995-05-01
Seemingly unrelated regression (SUR) is a system estimation technique that accounts for time-contemporaneous correlation between individual equations within a system of equations. SUR is suited to weather adjustment estimations when the estimation is: (1) composed of a system of equations and (2) the system of equations represents either different weather stations, different sales sectors or a combination of different weather stations and different sales sectors. SUR utilizes the cross-equation error values to develop more accurate estimates of the system coefficients than are obtained using ordinary least-squares (OLS) estimation. SUR estimates can be generated using a variety of statistical software packages including MicroTSP and SAS.
An operational GLS model for hydrologic regression
Tasker, Gary D.; Stedinger, J.R.
1989-01-01
Recent Monte Carlo studies have documented the value of generalized least squares (GLS) procedures to estimate empirical relationships between streamflow statistics and physiographic basin characteristics. This paper presents a number of extensions of the GLS method that deal with realities and complexities of regional hydrologic data sets that were not addressed in the simulation studies. These extensions include: (1) a more realistic model of the underlying model errors; (2) smoothed estimates of cross correlation of flows; (3) procedures for including historical flow data; (4) diagnostic statistics describing leverage and influence for GLS regression; and (5) the formulation of a mathematical program for evaluating future gaging activities.
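For reference, the generic GLS estimator that these extensions build on has the standard form below. The symbols $X$, $y$ and $\Lambda$ are assumed notation, and reading $\Lambda$ as an error covariance that combines model error with the sampling error of the streamflow statistics is an interpretation suggested by the abstract rather than a formula quoted from it.

```latex
\hat{\beta}_{\mathrm{GLS}} = \bigl( X^{\top} \Lambda^{-1} X \bigr)^{-1} X^{\top} \Lambda^{-1} y ,
\qquad
\operatorname{Cov}\bigl( \hat{\beta}_{\mathrm{GLS}} \bigr) = \bigl( X^{\top} \Lambda^{-1} X \bigr)^{-1} .
```

When $\Lambda$ is proportional to the identity matrix this reduces to ordinary least squares. Cross-correlated flows and unequal record lengths make $\Lambda$ non-diagonal and heteroscedastic, which is where GLS is expected to differ from OLS.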
ERIC Educational Resources Information Center
Giannotti, Flavia; Cortesi, Flavia; Cerquiglini, Antonella; Miraglia, Daniela; Vagnoni, Cristina; Sebastiani, Teresa; Bernabei, Paola
2008-01-01
This study investigated sleep of children with autism and developmental regression and the possible relationship with epilepsy and epileptiform abnormalities. Participants were 104 children with autism (70 non-regressed, 34 regressed) and 162 typically developing children (TD). Results suggested that the regressed group had higher incidence of…
Regression analysis of growth responses to water depth in three wetland plant species
Sorrell, Brian K.; Tanner, Chris C.; Brix, Hans
2012-01-01
Background and aims Plant species composition in wetlands and on lakeshores often shows dramatic zonation, which is frequently ascribed to differences in flooding tolerance. This study compared the growth responses to water depth of three species (Phormium tenax, Carex secta and Typha orientalis) differing in depth preferences in wetlands, using non-linear and quantile regression analyses to establish how flooding tolerance can explain field zonation. Methodology Plants were established for 8 months in outdoor cultures in waterlogged soil without standing water, and then randomly allocated to water depths from 0 to 0.5 m. Morphological and growth responses to depth were followed for 54 days before harvest, and then analysed by repeated-measures analysis of covariance, and non-linear and quantile regression analysis (QRA), to compare flooding tolerances. Principal results Growth responses to depth differed between the three species, and were non-linear. Phormium tenax growth decreased rapidly in standing water >0.25 m depth, C. secta growth increased initially with depth but then decreased at depths >0.30 m, accompanied by increased shoot height and decreased shoot density, and T. orientalis was unaffected by the 0- to 0.50-m depth range. In P. tenax the decrease in growth was associated with a decrease in the number of leaves produced per ramet and in C. secta the effect of water depth was greatest for the tallest shoots. Allocation patterns were unaffected by depth. Conclusions The responses are consistent with the principle that zonation in the field is primarily structured by competition in shallow water and by physiological flooding tolerance in deep water. Regression analyses, especially QRA, proved to be powerful tools in distinguishing genuine phenotypic responses to water depth from non-phenotypic variation due to size and developmental differences. PMID:23259044
NASA Astrophysics Data System (ADS)
Wang, Hu; Li, Enying; Li, G. Y.
2011-03-01
This paper presents a crashworthiness design optimization method based on a metamodeling technique. Crashworthiness optimization is a highly nonlinear, large-scale problem that combines various nonlinearities, such as geometry, material, and contact, and requires a large number of expensive evaluations. In order to obtain a robust approximation efficiently, a probability-based least square support vector regression is suggested to construct metamodels by considering structural risk minimization. Further, to save computational cost, an intelligent sampling strategy is applied to generate sample points at the design of experiment (DOE) stage. In this paper, a cylinder and a full vehicle frontal collision are considered. The results demonstrate that the proposed metamodel-based optimization is efficient and effective in solving crashworthiness design optimization problems.
Nonlinear optomechanics with graphene
NASA Astrophysics Data System (ADS)
Shaffer, Airlia; Patil, Yogesh Sharad; Cheung, Hil F. H.; Wang, Ke; Vengalattore, Mukund
2016-05-01
To date, studies of cavity optomechanics have been limited to exploiting the linear interactions between the light and mechanics. However, investigations of quantum signal transduction, quantum enhanced metrology and manybody physics with optomechanics each require strong, nonlinear interactions. Graphene nanomembranes are an exciting prospect for realizing such studies due to their inherently nonlinear nature and low mass. We fabricate large graphene nanomembranes and study their mechanical and optical properties. By using dark ground imaging techniques, we correlate their eigenmode shapes with the measured dissipation. We study their hysteretic response present even at low driving amplitudes, and their nonlinear dissipation. Finally, we discuss ongoing efforts to use these resonators for studies of quantum optomechanics and force sensing. This work is supported by the DARPA QuASAR program through a Grant from the ARO.
Nonlinear Ehrenfest's urn model.
Casas, G A; Nobre, F D; Curado, E M F
2015-04-01
Ehrenfest's urn model is modified by introducing nonlinear terms in the associated transition probabilities. It is shown that these modifications lead, in the continuous limit, to a Fokker-Planck equation characterized by two competing diffusion terms, namely, the usual linear one and a nonlinear diffusion term typical of anomalous diffusion. By considering a generalized H theorem, the associated entropy is calculated, resulting in a sum of Boltzmann-Gibbs and Tsallis entropic forms. It is shown that the stationary state of the associated Fokker-Planck equation satisfies precisely the same equation obtained by extremization of the entropy. Moreover, the effects of the nonlinear contributions on the entropy production phenomenon are also analyzed. PMID:25974470
Nonlinear optical Galton board
Navarrete-Benlloch, C.; Perez, A.; Roldan, Eugenio
2007-06-15
We generalize the concept of optical Galton board (OGB), first proposed by Bouwmeester et al. [Phys. Rev. A 61, 013410 (2000)], by introducing the possibility of nonlinear self-phase modulation on the wave function during the walker evolution. If the original Galton board illustrates classical diffusion, the OGB, which can be understood as a grid of Landau-Zener crossings, illustrates the influence of interference on diffusion, and is closely connected with the quantum walk. Our nonlinear generalization of the OGB shows new phenomena, the most striking of which is the formation of nondispersive pulses in the field distribution (solitonlike structures). These exhibit a variety of dynamical behaviors, including ballistic motion, dynamical localization, nonelastic collisions, and chaotic behavior, in the sense that the dynamics is very sensitive to the nonlinearity strength.
Regression models for convex ROC curves.
Lloyd, C J
2000-09-01
The performance of a diagnostic test is summarized by its receiver operating characteristic (ROC) curve. Under quite natural assumptions about the latent variable underlying the test, the ROC curve is convex. Empirical data on a test's performance often comes in the form of observed true positive and false positive relative frequencies under varying conditions. This paper describes a family of regression models for analyzing such data. The underlying ROC curves are specified by a quality parameter delta and a shape parameter mu and are guaranteed to be convex provided delta > 1. Both the position along the ROC curve and the quality parameter delta are modeled linearly with covariates at the level of the individual. The shape parameter mu enters the model through the link functions log(p mu) - log(1 - p mu) of a binomial regression and is estimated either by search or from an appropriate constructed variate. One simple application is to the meta-analysis of independent studies of the same diagnostic test, illustrated on some data of Moses, Shapiro, and Littenberg (1993). A second application, to so-called vigilance data, is given, where ROC curves differ across subjects and modeling of the position along the ROC curve is of primary interest. PMID:10985227
Double linear regression classification for face recognition
NASA Astrophysics Data System (ADS)
Feng, Qingxiang; Zhu, Qi; Tang, Lin-Lin; Pan, Jeng-Shyang
2015-02-01
A new classifier based on the linear regression classification (LRC) classifier and the simple-fast representation-based classifier (SFR), named the double linear regression classification (DLRC) classifier, is proposed for image recognition in this paper. The traditional LRC classifier uses only the distance between test image vectors and predicted image vectors of the class subspace for classification, and the SFR classifier uses the test image vectors and the nearest image vectors of the class subspace to classify the test sample. The DLRC classifier, in contrast, computes the predicted image vectors of each class subspace and uses all the predicted vectors to construct a novel robust global space. The DLRC then uses this global space to obtain new predicted vectors of each class for classification. A large number of experiments on the AR face database, JAFFE face database, Yale face database, Extended YaleB face database, and PIE face database are used to evaluate the performance of the proposed classifier. The experimental results show that the proposed classifier achieves a better recognition rate than the LRC classifier, the SFR classifier, and several other classifiers.
Shape regression for vertebra fracture quantification
NASA Astrophysics Data System (ADS)
Lund, Michael Tillge; de Bruijne, Marleen; Tanko, Laszlo B.; Nielsen, Mads
2005-04-01
Accurate and reliable identification and quantification of vertebral fractures constitute a challenge both in clinical trials and in diagnosis of osteoporosis. Various efforts have been made to develop reliable, objective, and reproducible methods for assessing vertebral fractures, but at present there is no consensus concerning a universally accepted diagnostic definition of vertebral fractures. In this project we want to investigate whether or not it is possible to accurately reconstruct the shape of a normal vertebra, using a neighbouring vertebra as prior information. The reconstructed shape can then be used to develop a novel vertebra fracture measure, by comparing the segmented vertebra shape with its reconstructed normal shape. The vertebrae in lateral x-rays of the lumbar spine were manually annotated by a medical expert. With this dataset we built a shape model, with equidistant point distribution between the four corner points. Based on the shape model, a multiple linear regression model of a normal vertebra shape was developed for each dataset using leave-one-out cross-validation. The reconstructed shape was calculated for each dataset using these regression models. The prediction error for the annotated shape was on average 3%.
Supporting Regularized Logistic Regression Privately and Efficiently.
Li, Wenfa; Liu, Hongzhe; Yang, Peng; Xie, Wei
2016-01-01
As one of the most popular statistical and machine learning models, logistic regression with regularization has found wide adoption in biomedicine, social sciences, information technology, and so on. These domains often involve data of human subjects that are contingent upon strict privacy regulations. Concerns over data privacy make it increasingly difficult to coordinate and conduct large-scale collaborative studies, which typically rely on cross-institution data sharing and joint analysis. Our work here focuses on safeguarding regularized logistic regression, a widely used statistical model that has nonetheless not been investigated from a data security and privacy perspective. We consider a common use scenario of multi-institution collaborative studies, such as in the form of research consortia or networks as widely seen in genetics, epidemiology, social sciences, etc. To make our privacy-enhancing solution practical, we demonstrate a non-conventional and computationally efficient method leveraging distributed computing and strong cryptography to provide comprehensive protection over individual-level and summary data. Extensive empirical evaluations on several studies validate the privacy guarantee, efficiency and scalability of our proposal. We also discuss the practical implications of our solution for large-scale studies and applications from various disciplines, including genetic and biomedical studies, smart grid, network analysis, etc. PMID:27271738
Regression Models For Saffron Yields in Iran
NASA Astrophysics Data System (ADS)
S. H, Sanaeinejad; S. N, Hosseini
Saffron is an important crop in social and economic terms in Khorassan Province (northeast Iran). In this research we evaluated trends in saffron yield in recent years and studied the relationship between saffron yield and climate change. A regression analysis was used to predict saffron yield based on 20 years of yield data in the cities of Birjand, Ghaen and Ferdows. Climatological data for the same periods were provided by the database of the Khorassan Climatology Center. The climatological data included temperature, rainfall, relative humidity and sunshine hours for Model I, and temperature and rainfall for Model II. The results showed that the coefficients of determination for Birjand, Ferdows and Ghaen for Model I were 0.69, 0.50 and 0.81, respectively. The coefficients of determination for the same cities for Model II were 0.53, 0.50 and 0.72, respectively. Multiple regression analysis indicated that, among weather variables, temperature was the key parameter for variation of saffron yield. It was concluded that increasing temperature in spring was the main cause of declining saffron yield during recent years across the province. Finally, the yield trend was predicted for the last 5 years using time series analysis.
Regression testing in the TOTEM DCS
NASA Astrophysics Data System (ADS)
Rodríguez, F. Lucas; Atanassov, I.; Burkimsher, P.; Frost, O.; Taskinen, J.; Tulimaki, V.
2012-12-01
The Detector Control System of the TOTEM experiment at the LHC is built with the industrial product WinCC OA (PVSS). The TOTEM system is generated automatically through scripts using as input the detector Product Breakdown Structure (PBS) and its pinout connectivity, archiving and alarm metainformation, and some other heuristics based on the naming conventions. When those initial parameters and automation code are modified to include new features, the resulting PVSS system can also introduce side-effects. On a daily basis, a custom developed regression testing tool takes the most recent code from a Subversion (SVN) repository and builds a new control system from scratch. This system is exported in plain text format using the PVSS export tool, and compared with a system previously validated by a human. A report is sent to the developers with any differences highlighted, in readiness for validation and acceptance as a new stable version. This regression approach is not dependent on any development framework or methodology. This process has run satisfactorily for several months, proving to be a very valuable tool before deploying new versions in the production systems.
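The step of comparing a fresh plain-text export with a previously validated one can be sketched with Python's standard `difflib` module. This is only an illustration of the idea: the actual TOTEM tool and the PVSS export format are not described in the record above, so the function name `regression_report` and the sample export lines are hypothetical.

```python
import difflib


def regression_report(validated_text, candidate_text):
    """Return unified-diff lines between a validated export and a fresh one.

    An empty list means the new build matches the validated baseline.
    """
    return list(
        difflib.unified_diff(
            validated_text.splitlines(),
            candidate_text.splitlines(),
            fromfile="validated",
            tofile="candidate",
            lineterm="",
        )
    )


# Hypothetical export snippets; real exports would be read from files.
validated = "channel_01 alarm_high=50\nchannel_02 alarm_high=60\n"
candidate = "channel_01 alarm_high=50\nchannel_02 alarm_high=65\n"

report = regression_report(validated, candidate)
print("\n".join(report) if report else "no differences")
```

A text diff of this kind is independent of how the system was generated, which matches the record's remark that the approach does not depend on any development framework.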
A Gibbs sampler for multivariate linear regression
NASA Astrophysics Data System (ADS)
Mantz, Adam B.
2016-04-01
Kelly described an efficient algorithm, using Gibbs sampling, for performing linear regression in the fairly general case where non-zero measurement errors exist for both the covariates and response variables, where these measurements may be correlated (for the same data point), where the response variable is affected by intrinsic scatter in addition to measurement error, and where the prior distribution of covariates is modelled by a flexible mixture of Gaussians rather than assumed to be uniform. Here, I extend the Kelly algorithm in two ways. First, the procedure is generalized to the case of multiple response variables. Secondly, I describe how to model the prior distribution of covariates using a Dirichlet process, which can be thought of as a Gaussian mixture where the number of mixture components is learned from the data. I present an example of multivariate regression using the extended algorithm, namely fitting scaling relations of the gas mass, temperature, and luminosity of dynamically relaxed galaxy clusters as a function of their mass and redshift. An implementation of the Gibbs sampler in the R language, called LRGS, is provided.
Multivariate Regression with Block-structured Predictors
NASA Astrophysics Data System (ADS)
Ye, Saier
We study the problem of predicting multiple responses with a common set of predicting variables. Applying a generalized Ordinary Least Squares (OLS) criterion to all the responses together is practically equivalent to OLS estimation on the responses separately, so possible correlations between the response variables are overlooked. In order to take advantage of these interrelationships, Reduced-Rank Regression (RRR) imposes a rank constraint on the coefficient matrix. RRR constructs latent factors from the original predicting variables, and the latent factors are the effective predictors. RRR reduces the number of parameters to be estimated and improves estimation efficiency. In the present work, we explore a novel regression model to incorporate "block-structured" predicting variables, where the predictors can be naturally partitioned into several groups or blocks. Variables in the same block share similar characteristics. It is reasonable to assume that in addition to an overall impact, predictors also have block-specific effects on the responses. Furthermore, we impose rank constraints on the coefficient matrices. In our framework, we construct two types of latent factors that drive the variation in the responses. We have joint factors, which are formed by all predictors across all blocks; and individual factors, which are formed by variables within individual blocks. The proposed method exceeds RRR in terms of prediction accuracy and ease of interpretation in the presence of block structure in the predicting variables.
A reconsideration of the concept of regression.
Dowling, A Scott
2004-01-01
Regression has been a useful psychoanalytic concept, linking present mental functioning with past experiences and levels of functioning. The concept originated as an extension of the evolutionary zeitgeist of the day as enunciated by H. Spencer and H. Jackson and applied by Freud to psychological phenomena. The value system implicit in the contrast of evolution/progression vs dissolution/regression has given rise to unfortunate and powerful assumptions of social, cultural, developmental and individual value as embodied in notions of "higher," "lower," "primitive," "mature," "archaic," and "advanced." The unhelpful results of these assumptions are evident, for example, in attitudes concerning cultural, sexual, and social "correctness," same-sex object choice, and goals of treatment. An alternative, a continuously constructed, continuously emerging mental life, in analogy to the ever changing, continuous physical body, is suggested. This view retains the fundamentals of psychoanalysis, for example, unconscious mental life, drive, defense, and psychic structure, but stresses a functional, ever changing, present oriented understanding of mental life as contrasted with a static, onion-layered view. PMID:16240612
HOS network-based classification of power quality events via regression algorithms
NASA Astrophysics Data System (ADS)
Palomares Salas, José Carlos; González de la Rosa, Juan José; Sierra Fernández, José María; Pérez, Agustín Agüera
2015-12-01
This work compares seven regression algorithms implemented in artificial neural networks (ANNs) supported by 14 power-quality features, which are based on higher-order statistics. Combining time and frequency domain estimators to deal with non-stationary measurement sequences, the final goal of the system is implementation in the future smart grid to guarantee compatibility between all connected equipment. The principal results are based on spectral kurtosis measurements, which easily adapt to the impulsive nature of the power quality events. These results verify that the proposed technique is capable of offering interesting results for power quality (PQ) disturbance classification. The best results are obtained using radial basis networks, generalized regression, and multilayer perceptron, mainly due to the non-linear nature of the data.
Maimistov, Andrei I
2010-11-13
The classic examples of optical phenomena resulting in the appearance of solitons are self-focusing, self-induced transparency, and parametric three-wave interaction. To date, the list of the fields of nonlinear optics and models where solitons play an important role has significantly expanded. Now long-lived or stable solitary waves are called solitons, including, for example, dissipative, gap, parametric, and topological solitons. This review considers nonlinear optics models giving rise to the appearance of solitons in a narrow sense: solitary waves corresponding to the solutions of completely integrable systems of equations basic for the models being discussed. (review)
Nonlinear magnetohydrodynamic stability
NASA Technical Reports Server (NTRS)
Bauer, F.; Betancourt, O.; Garabedian, P.
1981-01-01
The computer code developed by Bauer et al. (1978) for the study of the magnetohydrodynamic equilibrium and stability of a plasma in toroidal geometry is extended so that the growth rates of instabilities may be estimated more accurately. The original code, which is based on the variational principle of ideal magnetohydrodynamics, is upgraded by the introduction of a nonlinear formula for the growth rate of an unstable mode which acts as a quantitative measure of instability that is important in estimating numerical errors. The revised code has been applied to the determination of the nonlinear saturation, ballooning modes and beta limits for tokamaks, stellarators and torsatrons.
Bias and uncertainty in regression-calibrated models of groundwater flow in heterogeneous media
Cooley, R.L.; Christensen, S.
2006-01-01
Groundwater models need to account for detailed but generally unknown spatial variability (heterogeneity) of the hydrogeologic model inputs. To address this problem we replace the large, m-dimensional stochastic vector β that reflects both small and large scales of heterogeneity in the inputs by a lumped or smoothed m-dimensional approximation γθ*, where γ is an interpolation matrix and θ* is a stochastic vector of parameters. Vector θ* has small enough dimension to allow its estimation with the available data. The consequence of the replacement is that model function f(γθ*) written in terms of the approximate inputs is in error with respect to the same model function written in terms of β, f(β), which is assumed to be nearly exact. The difference f(β) - f(γθ*), termed model error, is spatially correlated, generates prediction biases, and causes standard confidence and prediction intervals to be too small. Model error is accounted for in the weighted nonlinear regression methodology developed to estimate θ* and assess model uncertainties by incorporating the second-moment matrix of the model errors into the weight matrix. Techniques developed by statisticians to analyze classical nonlinear regression methods are extended to analyze the revised method. The analysis develops analytical expressions for bias terms reflecting the interaction of model nonlinearity and model error, for correction factors needed to adjust the sizes of confidence and prediction intervals for this interaction, and for correction factors needed to adjust the sizes of confidence and prediction intervals for possible use of a diagonal weight matrix in place of the correct one. If terms expressing the degree of intrinsic nonlinearity for f(β) and f(γθ*) are small, then most of the biases are small and the correction factors are reduced in magnitude. Biases, correction factors, and confidence and prediction intervals were obtained for a test problem for which model error is
Larsson, A
1997-08-01
The objective of this study was to investigate the conditions for regression analysis of data from equilibrium experiments. One important issue was to recognize that Kd and the binding site concentration (A) are not of equal nature, although both are parameters in the regression analysis. Whereas Kd approximates to a true constant, A is subject to experimental variation due to pipetting errors and, in solid-phase experiments, also to uneven coating properties. While recognizing that the ideal assumptions for ordinary regression analysis are poorly satisfied, different regression models were evaluated by extensive simulations. It was first established by a 'worst case' investigation that a limited error (8%) in the dependent variable is not critical for the results obtained in curve-fitting to Langmuir's equation. Seven different equations were compared for the calculation of data representing a solid-phase equilibrium experiment with statistical but no systematic errors. All the equations are rearrangements of the law of mass action. In this setting the Scatchard plot gave the best result, but the double reciprocal and the Woolf plots also worked well in weighted analysis. Langmuir's equation gave the best result of the 4 nonlinear regression models tested. The influence of one type of systematic error was also investigated. This assumed that 10% of the label was positioned on particles other than the functional ligand molecules. This systematic error was amplified, which resulted in a substantial bias. The calculated Kd-values varied slightly with the regression method used and were almost 24% too high in the best methods. PMID:9328576
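One of the rearrangements of the law of mass action named above, the Scatchard plot, can be sketched as follows. The sketch assumes the usual single-site form B = A*F / (Kd + F), with B the bound and F the free ligand concentration, which rearranges to B/F = A/Kd - B/Kd. The function names, the unweighted fit, and the error-free data are illustrative assumptions, not the simulation design of the study.

```python
def linear_fit(x, y):
    """Ordinary least-squares slope and intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def scatchard_estimates(free, bound):
    """Estimate (Kd, A) from the Scatchard form B/F = A/Kd - B/Kd."""
    ratio = [b / f for b, f in zip(bound, free)]
    slope, intercept = linear_fit(bound, ratio)  # slope = -1/Kd, intercept = A/Kd
    kd = -1.0 / slope
    return kd, intercept * kd


# Error-free illustrative data generated from B = A*F / (Kd + F).
true_kd, true_a = 2.0, 10.0
free = [0.25, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
bound = [true_a * f / (true_kd + f) for f in free]

kd_hat, a_hat = scatchard_estimates(free, bound)
print(kd_hat, a_hat)
```

With error-free data every rearrangement recovers the same parameters; the methods differ only once statistical or systematic errors enter, because B then appears on both axes of the Scatchard plot.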
NASA Astrophysics Data System (ADS)
Wheeler, David; Tiefelsdorf, Michael
2005-06-01
Present methodological research on geographically weighted regression (GWR) focuses primarily on extensions of the basic GWR model, while ignoring well-established diagnostics tests commonly used in standard global regression analysis. This paper investigates multicollinearity issues surrounding the local GWR coefficients at a single location and the overall correlation between GWR coefficients associated with two different exogenous variables. Results indicate that the local regression coefficients are potentially collinear even if the underlying exogenous variables in the data generating process are uncorrelated. Based on these findings, applied GWR research should practice caution in substantively interpreting the spatial patterns of local GWR coefficients. An empirical disease-mapping example is used to motivate the GWR multicollinearity problem. Controlled experiments are performed to systematically explore coefficient dependency issues in GWR. These experiments specify global models that use eigenvectors from a spatial link matrix as exogenous variables.
A novel strategy for forensic age prediction by DNA methylation and support vector regression model
Xu, Cheng; Qu, Hongzhu; Wang, Guangyu; Xie, Bingbing; Shi, Yi; Yang, Yaran; Zhao, Zhao; Hu, Lan; Fang, Xiangdong; Yan, Jiangwei; Feng, Lei
2015-01-01
High deviations resulting from prediction model, gender and population difference have limited age estimation application of DNA methylation markers. Here we identified 2,957 novel age-associated DNA methylation sites (P < 0.01 and R² > 0.5) in blood of eight pairs of Chinese Han female monozygotic twins. Among them, nine novel sites (false discovery rate < 0.01), along with three other reported sites, were further validated in 49 unrelated female volunteers with ages of 20–80 years by Sequenom Massarray. A total of 95 CpGs were covered in the PCR products and 11 of them were used to build the age prediction models. After comparing four different models, including multivariate linear regression, multivariate nonlinear regression, back propagation neural network and support vector regression (SVR), SVR was identified as the most robust model, with the least mean absolute deviation from real chronological age (2.8 years) and an average accuracy of 4.7 years predicted by only six loci from the 11 loci, as well as a lower cross-validated error than the linear regression model. Our novel strategy provides an accurate measurement that is highly useful in estimating the individual age in forensic practice as well as in tracking the aging process in other related applications. PMID:26635134
NASA Astrophysics Data System (ADS)
Na'imi, S. R.; Shadizadeh, S. R.; Riahi, M. A.; Mirzakhanian, M.
2014-08-01
Porosity and fluid saturation distributions are crucial properties of hydrocarbon reservoirs and are involved in almost all calculations related to reservoir and production. True measurements of these parameters, derived from laboratory measurements, are available only at isolated locations in a reservoir and are also expensive and time-consuming. Therefore, other methodologies that are robust, simple, and inexpensive are needed. The Support Vector Regression approach is a relatively novel method for functional estimation in regression problems. In contrast to conventional neural networks, which minimize the error on the training data using the usual Empirical Risk Minimization principle, Support Vector Regression minimizes an upper bound on the expected risk by means of the Structural Risk Minimization principle. This difference, which reflects the goal of statistical learning, gives the approach a greater ability to generalize. In this study, first, appropriate seismic attributes that have an underlying dependency on reservoir porosity and water saturation are extracted. Subsequently, a non-linear support vector regression algorithm is utilized to obtain a quantitative formulation relating the porosity and water saturation parameters to the selected seismic attributes. For an undrilled reservoir, for which sufficient core and log data are not available, it is to some extent possible to characterize the hydrocarbon-bearing formation by means of this method.
Nonlinear growing neutrino cosmology
NASA Astrophysics Data System (ADS)
Ayaita, Youness; Baldi, Marco; Führer, Florian; Puchwein, Ewald; Wetterich, Christof
2016-03-01
The energy scale of dark energy, ~2 × 10⁻³ eV, is a long way off compared to all known fundamental scales, except for the neutrino masses. If dark energy is dynamical and couples to neutrinos, this is no longer a coincidence. The time at which dark energy starts to behave as an effective cosmological constant can be linked to the time at which the cosmic neutrinos become nonrelativistic. This naturally places the onset of the Universe's accelerated expansion in recent cosmic history, addressing the why-now problem of dark energy. We show that these mechanisms indeed work in the growing neutrino quintessence model, even if the fully nonlinear structure formation and backreaction are taken into account, which were previously suspected of spoiling the cosmological evolution. The attractive force between neutrinos arising from their coupling to dark energy grows as large as 10⁶ times the gravitational strength. This induces very rapid dynamics of neutrino fluctuations which are nonlinear at redshift z ≈ 2. Nevertheless, a nonlinear stabilization phenomenon ensures only mildly nonlinear oscillating neutrino overdensities with a large-scale gravitational potential substantially smaller than that of cold dark matter perturbations. Depending on model parameters, the signals of large-scale neutrino lumps may render the cosmic neutrino background observable.
Nonlinear phased array imaging
NASA Astrophysics Data System (ADS)
Croxford, Anthony J.; Cheng, Jingwei; Potter, Jack N.
2016-04-01
A technique is presented for imaging acoustic nonlinearity within a specimen using ultrasonic phased arrays. Acoustic nonlinearity is measured by evaluating the difference in energy of the transmission bandwidth within the diffuse field produced through different focusing modes. The two modes are classical beam forming, where delays are applied to different elements of a phased array to physically focus the energy at a single location (parallel firing), and focusing in post-processing, whereby one element at a time is fired and a focused image is produced in post-processing (sequential firing). Although these two approaches are linearly equivalent, the difference in physical displacement within the specimen leads to differences in nonlinear effects. These differences are localized to the areas where the amplitude is different, essentially confining the differences to the focal point. Direct measurements at the focal point are, however, difficult to make. To measure this difference, the diffuse field is used. It is a statistical property of the diffuse field that it represents the total energy in the system. If the energy in the diffuse field is measured for both the sequential and parallel firing cases, then the difference between these, within the input signal bandwidth, is largely due to differences at the focal spot. This difference therefore gives a localized measurement of where energy is moving out of the transmission bandwidth due to nonlinear effects. This technique is used to image fatigue cracks and other damage types undetectable with conventional linear ultrasonic measurements.
Intramolecular and nonlinear dynamics
Davis, M.J.
1993-12-01
Research in this program focuses on three interconnected areas. The first involves the study of intramolecular dynamics, particularly of highly excited systems. The second area involves the use of nonlinear dynamics as a tool for the study of molecular dynamics and complex kinetics. The third area is the study of the classical/quantum correspondence for highly excited systems, particularly systems exhibiting classical chaos.
Callen, J. D.
2002-11-04
The primary efforts this year have focused on exploring the nonlinear evolution of localized interchange instabilities, some extensions of neoclassical tearing mode theory, and developing a model for the dynamic electrical conductivity in a bumpy cylinder magnetic field. In addition, we have vigorously participated in the computationally-focused NIMROD and CEMM projects.
Nonlinear Theory and Breakdown
NASA Technical Reports Server (NTRS)
Smith, Frank
2007-01-01
The main points of recent theoretical and computational studies on boundary-layer transition and turbulence are to be highlighted. The work is based on high Reynolds numbers and attention is drawn to nonlinear interactions, breakdowns and scales. The research focuses in particular on truly nonlinear theories, i.e. those for which the mean-flow profile is completely altered from its original state. There appear to be three such theories dealing with unsteady nonlinear pressure-displacement interactions (I), with vortex/wave interactions (II), and with Euler-scale flows (III). Specific recent findings noted for these three, and in quantitative agreement with experiments, are the following. Nonlinear finite-time break-ups occur in I, leading to sublayer eruption and vortex formation; here the theory agrees with experiments (Nishioka) regarding the first spike. II gives rise to finite-distance blowup of displacement thickness, then interaction and break-up as above; this theory agrees with experiments (Klebanoff, Nishioka) on the formation of three-dimensional streets. III leads to the prediction of the turbulent boundary-layer micro-scale and of the displacement- and stress-sublayer thicknesses.
Nonlinear plasmonic nanorulers.
Butet, Jérémy; Martin, Olivier J F
2014-05-27
The evaluation of distances as small as a few nanometers using optical waves is a very challenging task that can pave the way for the development of new applications in biotechnology and nanotechnology. In this article, we propose a new measurement method based on the control of the nonlinear optical response of plasmonic nanostructures by means of Fano resonances. It is shown that Fano resonances resulting from the coupling between a bright mode and a dark mode at the fundamental wavelength enable unprecedented and direct manipulation of the nonlinear electromagnetic sources at the nanoscale. In the case of second harmonic generation from gold nanodolmens, the different nonlinear source distributions induced by the different coupling regimes are clearly revealed in the far-field distribution. Hence, the configuration of the nanostructure can be accurately determined in three dimensions by recording the wave scattered at the second harmonic wavelength. Indeed, the conformation of the different elements building the system is encoded in the nonlinear far-field distribution, making second harmonic generation a promising tool for reading three-dimensional plasmonic nanorulers. Furthermore, it is shown that three-dimensional plasmonic nanorulers can be implemented with simpler geometries than in the linear regime while providing complete information on the structure conformation, including the top nanobar position and orientation. PMID:24697565
Universal nonlinear entanglement witnesses
Kotowski, Marcin; Kotowski, Michal
2010-06-15
We give a universal recipe for constructing nonlinear entanglement witnesses able to detect nonclassical correlations in arbitrary systems of distinguishable and/or identical particles for an arbitrary number of constituents. The constructed witnesses are expressed in terms of expectation values of observables. As such, they are, at least in principle, measurable in experiments.
Sparse brain network using penalized linear regression
NASA Astrophysics Data System (ADS)
Lee, Hyekyoung; Lee, Dong Soo; Kang, Hyejin; Kim, Boong-Nyun; Chung, Moo K.
2011-03-01
Sparse partial correlation is a useful connectivity measure for brain networks when it is difficult to compute the exact partial correlation in the small-n large-p setting. In this paper, we formulate the problem of estimating partial correlation as a sparse linear regression with an l1-norm penalty. The method is applied to a brain network consisting of parcellated regions of interest (ROIs), which are obtained from FDG-PET images of children with autism spectrum disorder (ASD) and pediatric control (PedCon) subjects. To validate the results, we check the reproducibility of the obtained brain networks by leave-one-out cross-validation and compare the clustered structures derived from the brain networks of ASD and PedCon.
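The l1-penalized regression mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the common lasso objective (1/2n)·||y − Xb||² + λ·||b||₁ solved by coordinate descent with soft-thresholding, and the function names and toy data are hypothetical. In a network setting, y would be the signal of one ROI and the columns of X the signals of the other ROIs; coefficients shrunk exactly to zero would then correspond to absent edges.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; return 0.0 inside [-lam, lam]."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/2n)*||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent.

    X is a list of rows, y a list of responses. Illustrative sketch only.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of column j with the partial residual
            z = 0.0    # squared norm of column j
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (z / n)
    return beta


# Toy data (assumed): two orthogonal predictors, one strong and one weak effect.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.05, -1.95, 1.95, -2.05]  # y = 2.0 * column 0 + 0.05 * column 1
beta = lasso_coordinate_descent(X, y, lam=0.1)
print(beta)
```

With this penalty the weak coefficient is set exactly to zero while the strong one is shrunk by λ, which is the sparsity mechanism the abstract relies on.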
Sibling dilution hypothesis: a regression surface analysis.
Marjoribanks, K
2001-08-01
This study examined relationships between sibship size (the number of children in a family), birth order, and measures of academic performance, academic self-concept, and educational aspirations at different levels of family educational resources. As part of a national longitudinal study of Australian secondary school students data were collected from 2,530 boys and 2,450 girls in Years 9 and 10. Regression surfaces were constructed from models that included terms to account for linear, interaction, and curvilinear associations among the variables. Analysis suggests the general propositions (a) family educational resources have significant associations with children's school-related outcomes at different levels of sibling variables, the relationships for girls being curvilinear, and (b) sibling variables continue to have small significant associations with affective and cognitive outcomes, after taking into account variations in family educational resources. That is, the investigation provides only partial support for the sibling dilution hypothesis. PMID:11729548
Tolerance bounds for log gamma regression models
NASA Technical Reports Server (NTRS)
Jones, R. A.; Scholz, F. W.; Ossiander, M.; Shorack, G. R.
1985-01-01
The present procedure for finding lower confidence bounds for the quantiles of Weibull populations, on the basis of the solution of a quadratic equation, is more accurate than current Monte Carlo tables and extends to any location-scale family. It is shown that this method is accurate for all members of the log gamma(K) family, where K = 1/2 to infinity, and works well for censored data, while also extending to regression data. An even more accurate procedure involving an approximation to the Lawless (1982) conditional procedure, with numerical integrations whose tables are independent of the data, is also presented. These methods are applied to the case of failure strengths of ceramic specimens from each of three billets of Si3N4, which have undergone flexural strength testing.
Robust Mediation Analysis Based on Median Regression
Yuan, Ying; MacKinnon, David P.
2014-01-01
Mediation analysis has many applications in psychology and the social sciences. The most prevalent methods typically assume that the error distribution is normal and homoscedastic. However, this assumption may rarely be met in practice, which can affect the validity of the mediation analysis. To address this problem, we propose robust mediation analysis based on median regression. Our approach is robust to various departures from the assumption of homoscedasticity and normality, including heavy-tailed, skewed, contaminated, and heteroscedastic distributions. Simulation studies show that under these circumstances, the proposed method is more efficient and powerful than standard mediation analysis. We further extend the proposed robust method to multilevel mediation analysis, and demonstrate through simulation studies that the new approach outperforms the standard multilevel mediation analysis. We illustrate the proposed method using data from a program designed to increase reemployment and enhance mental health of job seekers. PMID:24079925
Improving phylogenetic regression under complex evolutionary models.
Mazel, Florent; Davies, T Jonathan; Georges, Damien; Lavergne, Sébastien; Thuiller, Wilfried; Peres-Neto, Pedro R
2016-02-01
Phylogenetic Generalized Least Square (PGLS) is the tool of choice among phylogenetic comparative methods to measure the correlation between species features such as morphological and life-history traits or niche characteristics. In its usual form, it assumes that the residual variation follows a homogenous model of evolution across the branches of the phylogenetic tree. Since a homogenous model of evolution is unlikely to be realistic in nature, we explored the robustness of the phylogenetic regression when this assumption is violated. We did so by simulating a set of traits under various heterogeneous models of evolution, and evaluating the statistical performance (type I error [the percentage of tests based on samples that incorrectly rejected a true null hypothesis] and power [the percentage of tests that correctly rejected a false null hypothesis]) of classical phylogenetic regression. We found that PGLS has good power but unacceptable type I error rates. This finding is important since this method has been increasingly used in comparative analyses over the last decade. To address this issue, we propose a simple solution based on transforming the underlying variance-covariance matrix to adjust for model heterogeneity within PGLS. We suggest that heterogeneous rates of evolution might be particularly prevalent in large phylogenetic trees, while most current approaches assume a homogenous rate of evolution. Our analysis demonstrates that overlooking rate heterogeneity can result in inflated type I errors, thus misleading comparative analyses. We show that it is possible to correct for this bias even when the underlying model of evolution is not known a priori. PMID:27145604
Collaborative regression-based anatomical landmark detection
NASA Astrophysics Data System (ADS)
Gao, Yaozong; Shen, Dinggang
2015-12-01
Anatomical landmark detection plays an important role in medical image analysis, e.g. for registration, segmentation and quantitative analysis. Among the various existing methods for landmark detection, regression-based methods have recently attracted much attention due to their robustness and efficiency. In these methods, landmarks are localised through voting from all image voxels, which is completely different from the classification-based methods that use voxel-wise classification to detect landmarks. Despite their robustness, the accuracy of regression-based landmark detection methods is often limited due to (1) the inclusion of uninformative image voxels in the voting procedure, and (2) the lack of effective ways to incorporate inter-landmark spatial dependency into the detection step. In this paper, we propose a collaborative landmark detection framework to address these limitations. The concept of collaboration is reflected in two aspects. (1) Multi-resolution collaboration. A multi-resolution strategy is proposed to hierarchically localise landmarks by gradually excluding uninformative votes from faraway voxels. Moreover, for informative voxels near the landmark, a spherical sampling strategy is also designed at the training stage to improve their prediction accuracy. (2) Inter-landmark collaboration. A confidence-based landmark detection strategy is proposed to improve the detection accuracy of ‘difficult-to-detect’ landmarks by using spatial guidance from ‘easy-to-detect’ landmarks. To evaluate our method, we conducted experiments extensively on three datasets for detecting prostate landmarks and head & neck landmarks in computed tomography images, and also dental landmarks in cone beam computed tomography images. The results show the effectiveness of our collaborative landmark detection framework in improving landmark detection accuracy, compared to other state-of-the-art methods.
Magnetotelluric Data, Stable Distributions and Stable Regression
NASA Astrophysics Data System (ADS)
Chave, A. D.
2013-12-01
The author has noted for many years that the residuals from robust or bounded influence estimates of the magnetotelluric response function are systematically long tailed compared to a Gaussian or Rayleigh distribution. Consequently, the standard statistical model of a Gaussian core contaminated by a fraction of outlying data is not really valid. However, the typical result is an improvement on ordinary least squares, and has become standard in the electromagnetic induction community. A recent re-evaluation of the statistics of magnetotelluric response function estimation has shown that, in almost all cases, the residuals are alpha stable rather than Gaussian. Alpha stable distributions are characterized by four parameters: a shape parameter lying on (0, 2], a skewness parameter, a scale parameter and a location parameter, and cannot be expressed in closed form except for a few special cases. When the shape parameter is 2, the result is Gaussian, but when it is smaller the resulting distribution has infinite variance. Typical magnetotelluric residuals are alpha stable with a shape parameter lying between 1 and 2. This suggests that robust methods improve response function estimates by eliminating data corresponding to the largest stable residuals while leaving the bulk of the population alone. A better statistical approach is based on stable regression that directly accommodates the actual residual distribution without eliminating the most extreme ones. This paper will introduce such an algorithm, and illustrate its functionality with a variety of magnetotelluric data. Further work remains to produce a robust stable regression algorithm that will eliminate real outliers such as lightning strikes or instrument problems without affecting the bulk stable population. Stable distributions are intimately associated with fractional derivative physical processes. Since the Maxwell equations and the constitutive relations pertaining to the earth do not contain any fractional
Tortajada-Genaro, L A; Campíns-Falcó, P
2007-05-15
Multivariate standardisation is proposed for the successful chemiluminescence determination of chromium based on the luminol-hydrogen peroxide reaction. In an extended concentration range, a non-linear calibration model is needed. The instrumental situations studied were different detection cells, instruments, assemblies, times and their possible combinations. Chemiluminescence kinetic registers were transferred using the piecewise direct standardisation (PDS) method. The transfer parameters were optimised based on the prediction residual error criterion. Non-linear principal component regression (NL-PCR) and non-linear partial least squares regression (NL-PLS) were chosen for modelling the signal-concentration relationship of the transferred registers. Good accuracy and precision were obtained for water samples. The chromium concentrations were statistically in agreement with reference method values and with recovery studies. Therefore, it is possible to transfer chemiluminescence curves without losing predictive ability, even in the presence of non-linear behaviour. PMID:19071716
Generalized nonlinear models for rear-end crash risk analysis.
Lao, Yunteng; Zhang, Guohui; Wang, Yinhai; Milton, John
2014-01-01
A generalized nonlinear model (GNM)-based approach for modeling highway rear-end crash risk is formulated using Washington State traffic safety data. Previous studies have mainly focused on causal factor identification and crash risk modeling using generalized linear models (GLMs), such as Poisson regression and logistic regression. However, their basic assumption of a generalized linear relationship between the dependent variable (for example, crash rate) and the independent variables (for example, contributing factors to crashes), established via a link function, is often violated in reality. Consequently, GLM-based modeling results could provide biased findings and conclusions. In this research, a GNM-based approach is developed that uses a nonlinear regression function to better describe non-monotonic relationships between the independent and dependent variables, using rear-end accident data collected from 10 highway routes from 2002 through 2006. The results show, for example, that truck percentage and grade have a parabolic impact: they increase crash risks initially, but decrease them after certain thresholds. Such non-monotonic relationships cannot be captured by regular GLMs, which further demonstrates the flexibility of GNM-based approaches in modeling nonlinear relationships among data and providing more reasonable explanations. The superior GNM-based model interpretations help better understand the parabolic impacts of some specific contributing factors for selecting and evaluating rear-end crash safety improvement plans. PMID:24125803
Nonlinear EEG Decoding Based on a Particle Filter Model
Hong, Jun
2014-01-01
As the world's population ages, rehabilitation robots play an increasingly important role in both the rehabilitation treatment and the nursing of patients with neurological diseases. Because it carries abundant movement information, electroencephalography (EEG) has become a promising information source for the control of rehabilitation robots. Although the multiple linear regression model has been used as the decoding model of EEG signals in some studies, it is considered unable to reflect the nonlinear components of EEG signals. To overcome this shortcoming, we propose a nonlinear decoding model, the particle filter model. Two- and three-dimensional decoding experiments were performed to test the validity of this model. In decoding accuracy, the results are comparable to those of the multiple linear regression model and previous EEG studies. In addition, the particle filter model uses less training data and more frequency information than the multiple linear regression model, which shows the potential of nonlinear decoding models. Overall, the findings hold promise for the furtherance of EEG-based rehabilitation robots. PMID:24949420
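To make the particle filter idea concrete, the following is a minimal bootstrap particle filter sketch. It is not the decoding model of the paper: the state model (a one-dimensional random walk), the Gaussian observation model, the noise levels, and all function names are assumptions chosen for brevity. The same predict, weight, resample loop applies when the state or observation model is nonlinear.

```python
import math
import random


def particle_filter(observations, n_particles=500, process_std=0.1,
                    obs_std=0.5, seed=0):
    """Bootstrap particle filter for a 1-D random-walk state (toy model)."""
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    estimates = []
    for z in observations:
        # Predict: propagate each particle through the assumed state model.
        particles = [p + rng.gauss(0.0, process_std) for p in particles]
        # Weight: Gaussian likelihood of the observation given each particle.
        weights = [math.exp(-0.5 * ((z - p) / obs_std) ** 2) for p in particles]
        total = sum(weights)
        if total == 0.0:
            # All likelihoods underflowed; fall back to uniform weights.
            weights = [1.0 / n_particles] * n_particles
        else:
            weights = [w / total for w in weights]
        # Estimate: posterior mean under the weighted particle cloud.
        estimates.append(sum(w * p for w, p in zip(weights, particles)))
        # Resample: draw particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


def simulate(n_steps=100, process_std=0.1, obs_std=0.5, seed=1):
    """Generate a hidden random-walk state and its noisy observations."""
    rng = random.Random(seed)
    state, states, observations = 0.0, [], []
    for _ in range(n_steps):
        state += rng.gauss(0.0, process_std)
        states.append(state)
        observations.append(state + rng.gauss(0.0, obs_std))
    return states, observations


states, observations = simulate()
estimates = particle_filter(observations)
mae = sum(abs(e - s) for e, s in zip(estimates, states)) / len(states)
print(f"mean absolute tracking error: {mae:.3f}")
```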
ERIC Educational Resources Information Center
Hecht, Jeffrey B.
The analysis of regression residuals and detection of outliers are discussed, with emphasis on determining how deviant an individual data point must be to be considered an outlier and the impact that multiple suspected outlier data points have on the process of outlier determination and treatment. Only bivariate (one dependent and one independent)…
Photonic single nonlinear-delay dynamical node for information processing
NASA Astrophysics Data System (ADS)
Ortín, Silvia; San-Martín, Daniel; Pesquera, Luis; Gutiérrez, José Manuel
2012-06-01
An electro-optical system with a delay loop based on semiconductor lasers is investigated for information processing by performing numerical simulations. This system can replace a complex network of many nonlinear elements for the implementation of Reservoir Computing. We show that a single nonlinear-delay dynamical system has the basic properties needed to perform as a reservoir: short-term memory and the separation property. The computing performance of this system is evaluated for two prediction tasks: the Lorenz chaotic time series and the nonlinear auto-regressive moving average (NARMA) model. We sweep the parameters of the system to find the best performance. The results achieved for the Lorenz and the NARMA-10 tasks are comparable to those obtained by other machine learning methods.
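The abstract does not state the NARMA-10 recurrence. As an illustration only, the sketch below generates a series using the form commonly used as a benchmark in the reservoir computing literature, y(t+1) = 0.3·y(t) + 0.05·y(t)·Σ_{i=0..9} y(t−i) + 1.5·u(t−9)·u(t) + 0.1, with the input u drawn uniformly from [0, 0.5]. Whether the paper uses exactly these coefficients is an assumption, and the function name is hypothetical.

```python
import random


def narma10(n_steps, seed=0):
    """Generate input u and target y of a NARMA-10 series (assumed standard form)."""
    rng = random.Random(seed)
    u = [rng.uniform(0.0, 0.5) for _ in range(n_steps)]
    y = [0.0] * n_steps  # the first ten outputs are left at zero
    for t in range(9, n_steps - 1):
        window_sum = sum(y[t - i] for i in range(10))  # y(t) ... y(t-9)
        y[t + 1] = (0.3 * y[t]
                    + 0.05 * y[t] * window_sum
                    + 1.5 * u[t - 9] * u[t]
                    + 0.1)
    return u, y


u, y = narma10(200)
print(y[10:15])
```

The task for a reservoir is then to predict y(t+1) from the input history, which requires both memory of the last ten inputs and a nonlinear combination of them.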
Probabilistic seismic demand analysis of nonlinear structures
NASA Astrophysics Data System (ADS)
Shome, Nilesh
Recent earthquakes in California have initiated improvement in current design philosophy, and at present the civil engineering community is working towards the development of performance-based earthquake engineering of structures. The objective of this study is to develop efficient but accurate procedures for probabilistic analysis of the nonlinear seismic behavior of structures. The proposed procedures help the near-term development of seismic-building assessments, which require an estimation of seismic demand at a given intensity level. We also develop procedures to estimate the probability of exceedance of any specified nonlinear response level due to future ground motions at a specific site. This is referred to as Probabilistic Seismic Demand Analysis (PSDA). The latter procedure prepares the way for the next-stage development of seismic assessment that considers the uncertainties in nonlinear response and capacity. The proposed procedures require structure-specific nonlinear analyses for a relatively small set of recorded accelerograms and (site-specific or USGS-map-like) seismic hazard analyses. We have addressed some of the important issues of nonlinear seismic demand analysis, namely the selection of records for structural analysis, the number of records to be used, the scaling of records, etc. Initially these issues are studied through nonlinear analysis of structures for a number of magnitude-distance bins of records. Subsequently we introduce regression analysis of response results against spectral acceleration, magnitude, duration, etc., which helps to resolve these issues more systematically. We illustrate the demand-hazard calculations through two major example problems: a 5-story and a 20-story SMRF building. Several simple but quite accurate closed-form solutions have also been proposed to expedite the demand-hazard calculations. We find that vector-valued (e.g., 2-D) PSDA estimates demand hazard more accurately. This procedure, however, requires information about 2
Technological Forecasting with a Multiple Regression Analysis Approach.
ERIC Educational Resources Information Center
Luftig, Jeffrey T.; Norton, Willis P.
1981-01-01
This article examines simple and multiple regression analysis as forecasting tools, and details the process by which multiple regression analysis may be used to increase the accuracy of the technology forecast. (CT)
Cubication of Conservative Nonlinear Oscillators
ERIC Educational Resources Information Center
Belendez, Augusto; Alvarez, Mariela L.; Fernandez, Elena; Pascual, Immaculada
2009-01-01
A cubication procedure of the nonlinear differential equation for conservative nonlinear oscillators is analysed and discussed. This scheme is based on the Chebyshev series expansion of the restoring force, and this allows us to approximate the original nonlinear differential equation by a Duffing equation in which the coefficients for the linear…
Logistic Regression: Going beyond Point-and-Click.
ERIC Educational Resources Information Center
King, Jason E.
A review of the literature reveals that important statistical algorithms and indices pertaining to logistic regression are being underused. This paper describes logistic regression in comparison with discriminant analysis and linear regression, and suggests that some techniques only accessible through computer syntax should be consulted in…
Heterogeneous Treatment Effects: What Does a Regression Estimate?
ERIC Educational Resources Information Center
Rhodes, William
2010-01-01
Regressions that control for confounding factors are the workhorse of evaluation research. When treatment effects are heterogeneous, however, the workhorse regression leads to estimated treatment effects that lack behavioral interpretations even when the selection on observables assumption holds. Regressions that use propensity scores as weights…
Orthogonal Projection in Teaching Regression and Financial Mathematics
ERIC Educational Resources Information Center
Kachapova, Farida; Kachapov, Ilias
2010-01-01
Two improvements in teaching linear regression are suggested. The first is to include the population regression model at the beginning of the topic. The second is to use a geometric approach: to interpret the regression estimate as an orthogonal projection and the estimation error as the distance (which is minimized by the projection). Linear…
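The geometric view described in this abstract can be checked numerically with a small sketch. It assumes simple linear regression with an intercept and uses made-up data; the function names are hypothetical. The least-squares fitted vector is the orthogonal projection of y onto the span of the constant vector and x, so the residual vector should be orthogonal to both, and any other line should give a larger residual length.

```python
def ols_fit(x, y):
    """Least-squares intercept and slope for simple linear regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def residual_vector(x, y, intercept, slope):
    """Residuals y - (intercept + slope * x) for a given line."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Made-up illustrative data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = ols_fit(x, y)
residuals = residual_vector(x, y, intercept, slope)
ones = [1.0] * len(x)

# Orthogonality of the residual to both spanning vectors (zero up to rounding).
print(dot(residuals, ones), dot(residuals, x))

# The projection minimises the squared distance: a perturbed line does worse.
best = dot(residuals, residuals)
perturbed = residual_vector(x, y, intercept, slope + 0.1)
print(best, dot(perturbed, perturbed))
```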
Using Dominance Analysis to Determine Predictor Importance in Logistic Regression
ERIC Educational Resources Information Center
Azen, Razia; Traxel, Nicole
2009-01-01
This article proposes an extension of dominance analysis that allows researchers to determine the relative importance of predictors in logistic regression models. Criteria for choosing logistic regression R[superscript 2] analogues were determined and measures were selected that can be used to perform dominance analysis in logistic regression. A…
Using Time-Series Regression to Predict Academic Library Circulations.
ERIC Educational Resources Information Center
Brooks, Terrence A.
1984-01-01
Four methods were used to forecast monthly circulation totals in 15 midwestern academic libraries: dummy time-series regression, lagged time-series regression, simple average (straight-line forecasting), monthly average (naive forecasting). In tests of forecasting accuracy, dummy regression method and monthly mean method exhibited smallest average…
Spatial vulnerability assessments by regression kriging
NASA Astrophysics Data System (ADS)
Pásztor, László; Laborczi, Annamária; Takács, Katalin; Szatmári, Gábor
2016-04-01
information representing IEW or GRP forming environmental factors were taken into account to support the spatial inference of the locally experienced IEW frequency and measured GRP values respectively. An efficient spatial prediction methodology was applied to construct reliable maps, namely regression kriging (RK) using spatially exhaustive auxiliary data on soil, geology, topography, land use and climate. RK divides the spatial inference into two parts. First, the deterministic component of the target variable is determined by a regression model. The residuals of the multiple linear regression analysis represent the spatially varying but dependent stochastic component, and they are interpolated by kriging. The final map is the sum of the two component predictions. Application of RK also provides the possibility of inherent accuracy assessment. The resulting maps are characterized by global and local measures of their accuracy. Additionally, the method enables interval estimation for the spatial extent of the areas of predefined risk categories. All of these outputs provide a useful contribution to spatial planning, action planning and decision making. Acknowledgement: Our work was partly supported by the Hungarian National Scientific Research Foundation (OTKA, Grant No. K105167).
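The two-part structure of regression kriging described above (regression trend plus kriged residuals, summed) can be sketched as follows. This is a simplified illustration, not the authors' workflow: it assumes one-dimensional site coordinates, a single auxiliary covariate, simple kriging of the zero-mean residuals, and an exponential covariance with assumed sill and range rather than a fitted variogram. All names and data are hypothetical.

```python
import math


def fit_line(x, y):
    """Least-squares intercept and slope of y on a single covariate x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        w[r] = (M[r][n] - sum(M[r][c] * w[c] for c in range(r + 1, n))) / M[r][r]
    return w


def exp_cov(h, sill=1.0, corr_range=2.0):
    """Exponential covariance model; sill and range are assumed, not fitted."""
    return sill * math.exp(-h / corr_range)


def regression_kriging(sites, covariate, values, new_site, new_covariate):
    """Predict at new_site as regression trend plus simple-kriged residual."""
    # Part 1: deterministic component from the regression model.
    intercept, slope = fit_line(covariate, values)
    residuals = [v - (intercept + slope * c) for v, c in zip(values, covariate)]
    # Part 2: stochastic component, kriging the regression residuals.
    C = [[exp_cov(abs(a - b)) for b in sites] for a in sites]
    c0 = [exp_cov(abs(a - new_site)) for a in sites]
    weights = solve(C, c0)
    kriged_residual = sum(w * r for w, r in zip(weights, residuals))
    # Final prediction: sum of the two components.
    return intercept + slope * new_covariate + kriged_residual


# Made-up illustrative data: site positions, auxiliary covariate, target values.
sites = [0.0, 1.0, 2.5, 4.0, 6.0]
covariate = [1.0, 2.0, 2.5, 4.0, 5.0]
values = [2.2, 3.9, 5.4, 8.3, 9.8]

print(regression_kriging(sites, covariate, values, new_site=3.0, new_covariate=3.0))
```

Because no nugget effect is assumed, the sketch interpolates exactly: predicting at a sampled site with that site's covariate returns the observed value.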
NASA Technical Reports Server (NTRS)
Van Hoven, G.; Steinolfson, R. S.
1984-01-01
A series of nonlinear computations of tearing-mode development have been performed which achieve higher values of the magnetic Reynolds number and larger wavelengths than previously considered. A prime candidate for the realization of dynamic reconnection is the resistive magnetic tearing mode, a spontaneous instability of a stressed magnetic field. Typical simulations are described for a magnetic Lundquist number S of 10 to the 4th and wavelength parameters alpha from 0.05 to 0.5. In all cases, the nonlinear mode initially evolves at the linear growth rate, followed by a period of reduced growth. Another common feature is the formation of secondary flow vortices, near the tearing surface, which are opposite in direction to the initial linear vortices.
Nonlinear differential equations
Dresner, L.
1988-01-01
This report is the text of a graduate course on nonlinear differential equations given by the author at the University of Wisconsin-Madison during the summer of 1987. The topics covered are: direction fields of first-order differential equations; the Lie (group) theory of ordinary differential equations; similarity solutions of second-order partial differential equations; maximum principles and differential inequalities; monotone operators and iteration; complementary variational principles; and stability of numerical methods. The report should be of interest to graduate students, faculty, and practicing scientists and engineers. No prior knowledge is required beyond a good working knowledge of the calculus. The emphasis is on practical results. Most of the illustrative examples are taken from the fields of nonlinear diffusion, heat and mass transfer, applied superconductivity, and helium cryogenics.
Nonlinear chiral transport phenomena
NASA Astrophysics Data System (ADS)
Chen, Jiunn-Wei; Ishii, Takeaki; Pu, Shi; Yamamoto, Naoki
2016-06-01
We study the nonlinear responses of relativistic chiral matter to external fields such as the electric field E and the gradients of temperature and chemical potential, ∇T and ∇μ. Using the kinetic theory with Berry curvature corrections under the relaxation time approximation, we compute the transport coefficients of possible new electric currents that are forbidden in usual chirally symmetric matter but are allowed in chirally asymmetric matter by parity. In particular, we find a new type of electric current proportional to ∇μ × E due to the interplay between the effects of the Berry curvature and collisions. We also derive an analog of the "Wiedemann-Franz" law specific for anomalous nonlinear transport in relativistic chiral matter.
Nonlinearity of Helmholtz resonators
NASA Technical Reports Server (NTRS)
Sirignano, W. A.
1972-01-01
Consideration of the nonlinear damping of pressure oscillations by means of acoustic liners consisting of a perforated plate communicating with a volume or of individual Helmholtz resonators. A nonlinear analysis leads to a modified first-order theory; in particular, some second-order damping effects (due to the formation of jets through the orifices) are considered, while other less important damping effects (of second order) are neglected. The effect of the vena contracta in the orifice flow is also taken into account, and the conditions of maximum damping are discussed. A determination is made of the orifice velocity, the cavity pressure, the admittance coefficient, the resistance, and the reactance, and good agreement is found between the theoretically determined resistance and orifice velocity and the pertinent experimental data.
Nonlinear metamaterials for holography.
Almeida, Euclides; Bitton, Ora; Prior, Yehiam
2016-01-01
A hologram is an optical element storing phase and possibly amplitude information enabling the reconstruction of a three-dimensional image of an object by illumination and scattering of a coherent beam of light, and the image is generated at the same wavelength as the input laser beam. In recent years, it was shown that information can be stored in nanometric antennas giving rise to ultrathin components. Here we demonstrate nonlinear multilayer metamaterial holograms. A background free image is formed at a new frequency-the third harmonic of the illuminating beam. Using e-beam lithography of multilayer plasmonic nanoantennas, we fabricate polarization-sensitive nonlinear elements such as blazed gratings, lenses and other computer-generated holograms. These holograms are analysed and prospects for future device applications are discussed. PMID:27545581
Information geometric nonlinear filtering
NASA Astrophysics Data System (ADS)
Newton, Nigel J.
2015-06-01
This paper develops information geometric representations for nonlinear filters in continuous time. The posterior distribution associated with an abstract nonlinear filtering problem is shown to satisfy a stochastic differential equation on a Hilbert information manifold. This supports the Fisher metric as a pseudo-Riemannian metric. Flows of Shannon information are shown to be connected with the quadratic variation of the process of posterior distributions in this metric. Apart from providing a suitable setting in which to study such information-theoretic properties, the Hilbert manifold has an appropriate topology from the point of view of multi-objective filter approximations. A general class of finite-dimensional exponential filters is shown to fit within this framework, and an intrinsic evolution equation, involving Amari's -1-covariant derivative, is developed for such filters. Three example systems, one of infinite dimension, are developed in detail.
Optothermal nonlinearity of silica aerogel
NASA Astrophysics Data System (ADS)
Braidotti, Maria Chiara; Gentilini, Silvia; Fleming, Adam; Samuels, Michiel C.; Di Falco, Andrea; Conti, Claudio
2016-07-01
We report on the characterization of silica aerogel thermal optical nonlinearity, obtained by z-scan technique. The results show that typical silica aerogels have nonlinear optical coefficient similar to that of glass (≃10^-12 m^2/W), with negligible optical nonlinear absorption. The nonlinear coefficient can be increased to values in the range of 10^-10 m^2/W by embedding an absorbing dye in the aerogel. This value is one order of magnitude higher than that observed in the pure dye and in typical highly nonlinear materials like liquid crystals.
Shen, Y.R.; Chen, C.K.; de Castro, A.R.B.
1980-01-01
Surface electromagnetic waves are waves propagating along the interface of two media. Their existence was predicted by Sommerfeld in 1909. In recent years, interesting applications have been found in the study of overlayers and molecular adsorption on surfaces, in probing of phase transitions, and in measurements of refractive indices. In the laboratory, the nonlinear interaction of surface electromagnetic waves was studied. The preliminary results of this recent venture in this area are presented.
Noack, Kristina; Eskofier, Björn; Kiefer, Johannes; Dilk, Christina; Bilow, Georg; Schirmer, Matthias; Buchholz, Rainer; Leipertz, Alfred
2013-10-01
The applicability of shifted-excitation Raman difference spectroscopy (SERDS) in combination with signal regression analysis as an alternative and non-invasive approach for monitoring the cultivation of phototrophic microorganisms producing complex molecules of pharmaceutical relevance in a bioreactor is demonstrated. As a model system, the cultivation of the red unicellular algae Porphyridium purpureum is used for focusing on the segregation of sulfated exopolysaccharides (EPS) which exhibit antiviral activity. The spectroscopic results obtained by partial linear least squares regression (PLSR) and by nonlinear support vector regression (SVR) are discussed against the corresponding results from the reference analytics based on the phenol-sulfuric acid assay. The SERDS-approach turns out to have strong potential as a non-invasive tool for online-monitoring of biotechnological processes. PMID:23905163
Data selection using support vector regression
NASA Astrophysics Data System (ADS)
Richman, Michael B.; Leslie, Lance M.; Trafalis, Theodore B.; Mansouri, Hicham
2015-03-01
Geophysical data sets are growing at an ever-increasing rate, requiring computationally efficient data selection (thinning) methods to preserve essential information. Satellites, such as WindSat, provide large data sets for assessing the accuracy and computational efficiency of data selection techniques. A new data thinning technique, based on support vector regression (SVR), is developed and tested. To manage large on-line satellite data streams, observations from WindSat are formed into subsets by Voronoi tessellation and then each is thinned by SVR (TSVR). Three experiments are performed. The first confirms the viability of TSVR for a relatively small sample, comparing it to several commonly used data thinning methods (random selection, averaging and Barnes filtering), producing a 10% thinning rate (90% data reduction), low mean absolute errors (MAE) and large correlations with the original data. A second experiment, using a larger dataset, shows TSVR retrievals with MAE < 1 m s^-1 and correlations ⩽ 0.98. TSVR was an order of magnitude faster than the commonly used thinning methods. A third experiment applies a two-stage pipeline to TSVR, to accommodate online data. The pipeline subsets reconstruct the wind field with the same accuracy as in the second experiment, and the pipeline is an order of magnitude faster than the nonpipeline TSVR. Therefore, pipeline TSVR is two orders of magnitude faster than commonly used thinning methods that ingest the entire data set. This study demonstrates that TSVR pipeline thinning is an accurate and computationally efficient alternative to commonly used data selection techniques.
A rotor optimization using regression analysis
NASA Technical Reports Server (NTRS)
Giansante, N.
1984-01-01
The design and development of helicopter rotors is subject to the many design variables and their interactions that affect rotor operation. Until recently, selection of rotor design variables to achieve specified rotor operational qualities has been a costly, time-consuming, repetitive task. For the past several years, Kaman Aerospace Corporation has successfully applied multiple linear regression analysis, coupled with optimization and sensitivity procedures, in the analytical design of rotor systems. It is concluded that approximating equations can be developed rapidly for a multiplicity of objective and constraint functions and optimizations can be performed in a rapid and cost-effective manner; the number and/or range of design variables can be increased by expanding the data base and developing approximating functions to reflect the expanded design space; the order of the approximating equations can be expanded easily to improve correlation between analyzer results and the approximating equations; gradients of the approximating equations can be calculated easily and these gradients are smooth functions reducing the risk of numerical problems in the optimization; the use of approximating functions allows the problem to be started easily and rapidly from various initial designs to enhance the probability of finding a global optimum; and the approximating equations are independent of the analysis or optimization codes used.
Sparse Regression as a Sparse Eigenvalue Problem
NASA Technical Reports Server (NTRS)
Moghaddam, Baback; Gruber, Amit; Weiss, Yair; Avidan, Shai
2008-01-01
We extend the l0-norm "subspectral" algorithms for sparse-LDA [5] and sparse-PCA [6] to general quadratic costs such as MSE in linear (kernel) regression. The resulting "Sparse Least Squares" (SLS) problem is also NP-hard, by way of its equivalence to a rank-1 sparse eigenvalue problem (e.g., binary sparse-LDA [7]). Specifically, for a general quadratic cost we use a highly-efficient technique for direct eigenvalue computation using partitioned matrix inverses which leads to dramatic ×10^3 speed-ups over standard eigenvalue decomposition. This increased efficiency mitigates the O(n^4) scaling behaviour that up to now has limited the previous algorithms' utility for high-dimensional learning problems. Moreover, the new computation prioritizes the role of the less-myopic backward elimination stage which becomes more efficient than forward selection. Similarly, branch-and-bound search for Exact Sparse Least Squares (ESLS) also benefits from partitioned matrix inverse techniques. Our Greedy Sparse Least Squares (GSLS) generalizes Natarajan's algorithm [9] also known as Order-Recursive Matching Pursuit (ORMP). Specifically, the forward half of GSLS is exactly equivalent to ORMP but more efficient. By including the backward pass, which only doubles the computation, we can achieve lower MSE than ORMP. Experimental comparisons to the state-of-the-art LARS algorithm [3] show forward-GSLS is faster, more accurate and more flexible in terms of choice of regularization.
Competing Risk Regression Models for Epidemiologic Data
Cole, Stephen R.; Gange, Stephen J.
2009-01-01
Competing events can preclude the event of interest from occurring in epidemiologic data and can be analyzed by using extensions of survival analysis methods. In this paper, the authors outline 3 regression approaches for estimating 2 key quantities in competing risks analysis: the cause-specific relative hazard (csRH) and the subdistribution relative hazard (sdRH). They compare and contrast the structure of the risk sets and the interpretation of parameters obtained with these methods. They also demonstrate the use of these methods with data from the Women's Interagency HIV Study established in 1993, treating time to initiation of highly active antiretroviral therapy or to clinical disease progression as competing events. In our example, women with an injection drug use history were less likely than those without a history of injection drug use to initiate therapy prior to progression to acquired immunodeficiency syndrome or death by both measures of association (csRH = 0.67, 95% confidence interval: 0.57, 0.80 and sdRH = 0.60, 95% confidence interval: 0.50, 0.71). Moreover, the relative hazards for disease progression prior to treatment were elevated (csRH = 1.71, 95% confidence interval: 1.37, 2.13 and sdRH = 2.01, 95% confidence interval: 1.62, 2.51). Methods for competing risks should be used by epidemiologists, with the choice of method guided by the scientific question. PMID:19494242
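The contrast in risk-set structure mentioned in the abstract above can be illustrated with a minimal sketch. It follows the standard textbook description of the two approaches rather than anything taken from the paper itself: for the cause-specific hazard, a subject leaves the risk set after any event, whereas for the subdistribution hazard a subject who had a competing event stays in the risk set. The function name `risk_set`, the event coding, and the toy records are hypothetical, and censoring (which the subdistribution approach normally handles by weighting) is assumed absent.

```python
def risk_set(records, t, kind):
    """Return ids of subjects counted as at risk just before time t.

    records: list of (id, time, event); event 1 = event of interest,
             event 2 = competing event. No censoring is assumed.
    kind:    "cause-specific" or "subdistribution".
    """
    at_risk = []
    for sid, time, event in records:
        if time >= t:
            # Still event-free at t: at risk under both definitions.
            at_risk.append(sid)
        elif kind == "subdistribution" and event == 2:
            # Earlier competing event: kept in the subdistribution risk set.
            at_risk.append(sid)
    return at_risk


# Hypothetical toy data: subject "a" has a competing event at time 2.
records = [("a", 2.0, 2), ("b", 3.0, 1), ("c", 5.0, 1), ("d", 6.0, 2)]

cs = risk_set(records, 4.0, "cause-specific")
sd = risk_set(records, 4.0, "subdistribution")
print(cs)  # ['c', 'd']
print(sd)  # ['a', 'c', 'd']
```

Because the subdistribution risk set keeps subjects who can no longer experience the event of interest, the two relative hazards (csRH and sdRH) answer different questions, which is consistent with the abstract's advice that the scientific question should guide the choice.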
Cyclodextrin promotes atherosclerosis regression via macrophage reprogramming.
Zimmer, Sebastian; Grebe, Alena; Bakke, Siril S; Bode, Niklas; Halvorsen, Bente; Ulas, Thomas; Skjelland, Mona; De Nardo, Dominic; Labzin, Larisa I; Kerksiek, Anja; Hempel, Chris; Heneka, Michael T; Hawxhurst, Victoria; Fitzgerald, Michael L; Trebicka, Jonel; Björkhem, Ingemar; Gustafsson, Jan-Åke; Westerterp, Marit; Tall, Alan R; Wright, Samuel D; Espevik, Terje; Schultze, Joachim L; Nickenig, Georg; Lütjohann, Dieter; Latz, Eicke
2016-04-01
Atherosclerosis is an inflammatory disease linked to elevated blood cholesterol concentrations. Despite ongoing advances in the prevention and treatment of atherosclerosis, cardiovascular disease remains the leading cause of death worldwide. Continuous retention of apolipoprotein B-containing lipoproteins in the subendothelial space causes a local overabundance of free cholesterol. Because cholesterol accumulation and deposition of cholesterol crystals (CCs) trigger a complex inflammatory response, we tested the efficacy of the cyclic oligosaccharide 2-hydroxypropyl-β-cyclodextrin (CD), a compound that increases cholesterol solubility, in preventing and reversing atherosclerosis. We showed that CD treatment of murine atherosclerosis reduced atherosclerotic plaque size and CC load and promoted plaque regression even with a continued cholesterol-rich diet. Mechanistically, CD increased oxysterol production in both macrophages and human atherosclerotic plaques and promoted liver X receptor (LXR)-mediated transcriptional reprogramming to improve cholesterol efflux and exert anti-inflammatory effects. In vivo, this CD-mediated LXR agonism was required for the antiatherosclerotic and anti-inflammatory effects of CD as well as for augmented reverse cholesterol transport. Because CD treatment in humans is safe and CD beneficially affects key mechanisms of atherogenesis, it may therefore be used clinically to prevent or treat human atherosclerosis. PMID:27053774
Flexible regression models over river networks
O’Donnell, David; Rushworth, Alastair; Bowman, Adrian W; Marian Scott, E; Hallard, Mark
2014-01-01
Many statistical models are available for spatial data but the vast majority of these assume that spatial separation can be measured by Euclidean distance. Data which are collected over river networks constitute a notable and commonly occurring exception, where distance must be measured along complex paths and, in addition, account must be taken of the relative flows of water into and out of confluences. Suitable models for this type of data have been constructed based on covariance functions. The aim of the paper is to place the focus on underlying spatial trends by adopting a regression formulation and using methods which allow smooth but flexible patterns. Specifically, kernel methods and penalized splines are investigated, with the latter proving more suitable from both computational and modelling perspectives. In addition to their use in a purely spatial setting, penalized splines also offer a convenient route to the construction of spatiotemporal models, where data are available over time as well as over space. Models which include main effects and spatiotemporal interactions, as well as seasonal terms and interactions, are constructed for data on nitrate pollution in the River Tweed. The results give valuable insight into the changes in water quality in both space and time. PMID:25653460
Cyclodextrin promotes atherosclerosis regression via macrophage reprogramming
Zimmer, Sebastian; Grebe, Alena; Bakke, Siril S.; Bode, Niklas; Halvorsen, Bente; Ulas, Thomas; Skjelland, Mona; De Nardo, Dominic; Labzin, Larisa I.; Kerksiek, Anja; Hempel, Chris; Heneka, Michael T.; Hawxhurst, Victoria; Fitzgerald, Michael L; Trebicka, Jonel; Gustafsson, Jan-Åke; Westerterp, Marit; Tall, Alan R.; Wright, Samuel D.; Espevik, Terje; Schultze, Joachim L.; Nickenig, Georg; Lütjohann, Dieter; Latz, Eicke
2016-01-01
Atherosclerosis is an inflammatory disease linked to elevated blood cholesterol levels. Despite ongoing advances in the prevention and treatment of atherosclerosis, cardiovascular disease remains the leading cause of death worldwide. Continuous retention of apolipoprotein B-containing lipoproteins in the subendothelial space causes a local overabundance of free cholesterol. Since cholesterol accumulation and deposition of cholesterol crystals (CCs) trigger a complex inflammatory response, we tested the efficacy of the cyclic oligosaccharide 2-hydroxypropyl-β-cyclodextrin (CD), a compound that increases cholesterol solubility, in preventing and reversing atherosclerosis. Here we show that CD treatment of murine atherosclerosis reduced atherosclerotic plaque size and CC load, and promoted plaque regression even with a continued cholesterol-rich diet. Mechanistically, CD increased oxysterol production in both macrophages and human atherosclerotic plaques, and promoted liver X receptor (LXR)-mediated transcriptional reprogramming to improve cholesterol efflux and exert anti-inflammatory effects. In vivo, this CD-mediated LXR agonism was required for the anti-atherosclerotic and anti-inflammatory effects of CD as well as for augmented reverse cholesterol transport. Since CD treatment in humans is safe and CD beneficially affects key mechanisms of atherogenesis, it may therefore be used clinically to prevent or treat human atherosclerosis. PMID:27053774
Father regression. Clinical narratives and theoretical reflections.
Stein, Ruth
2006-08-01
The author deals with love-hate enthrallment and submission to a primitive paternal object. This is a father-son relationship that extends through increasing degrees of 'primitiveness' or extremeness, and is illustrated through three different constellations that constitute a continuum. One pole of the continuum encompasses certain male patients who show a loving, de-individuated connection to a father experienced as trustworthy, soft, and in need of protection. Further along the continuum is the case of a transsexual patient whose analysis revealed an intense 'God-transference', a bondage to an idealized, feared, and ostensibly protective father-God introject. A great part of this patient's analysis consisted in a fierce struggle to liberate himself from this figure. The other end of the continuum is occupied by religious terrorists, who exemplify the most radical thralldom to a persecutory, godly object, a regressive submission that banishes woman and enthrones a cruel superego, and that ends in destruction and self-destruction. Psychoanalytic thinking has traditionally dealt with the oedipal father and recently with the nurturing father, but there is a gap in thinking about the phallic, archaic father, and his relations with his son(s). The author aims at filling this gap, at the same time as she also raises the very question of 'What is a father?' linking it with literary and religious themes. PMID:16877249
Ultrafast Thermal Nonlinearity
Khurgin, Jacob B.; Sun, Greg; Chen, Wei Ting; Tsai, Wei-Yi; Tsai, Din Ping
2015-01-01
Third-order nonlinear optical phenomena explored in the last half century have been predicted to find a wide range of applications in many walks of life, such as all-optical switching, routing, and others, yet this promise has not been fulfilled, primarily because the strength of nonlinear effects is too low when they are to occur on the picosecond scale required in today’s signal processing applications. The strongest of the third-order nonlinearities, engendered by thermal effects, is considered to be too slow for the above applications. In this work we show that when optical fields are concentrated into volumes on the scale of a few tens of nanometers, the speed of the thermo-optical effects approaches the picosecond scale. Such a sub-diffraction-limit concentration of field can be accomplished with the use of plasmonic effects in metal nanoparticles impregnating the thermo-optic dielectric (e.g. amorphous Si) and leads to phase shifts sufficient for all-optical switching on an ultrafast scale. PMID:26644322
NASA Astrophysics Data System (ADS)
Milgrom, Mordehai
2002-02-01
I investigate the properties of forces on bodies in theories governed by the generalized Poisson equation ∇·[μ(|∇ϕ|/a0)∇ϕ] ∝ Gρ, for the potential ϕ produced by a distribution of sources ρ. This equation describes, inter alia, media with a response coefficient, μ, that depends on the field strength, such as in nonlinear, dielectric or diamagnetic, media; nonlinear transport problems with field-strength-dependent conductivity or diffusion coefficient; nonlinear electrostatics, as in the Born-Infeld theory; certain stationary potential flows in compressible fluids, in which case the forces act on sources or obstacles in the flow. The expressions for the force on a point charge are derived exactly for the limits of very low and very high charge. The force on an arbitrary body in an external field of asymptotically constant gradient, -g0, is shown to be F = Qg0, where Q is the total effective charge of the body. The corollary Q = 0 → F = 0 is a generalization of d'Alembert's paradox. I show that for G > 0 (as in Newtonian gravity) two point charges of the same (opposite) sign still attract (repel). The opposite is true for G < 0. I discuss its generalization to extended bodies and derive virial relations.
Ultrafast Thermal Nonlinearity.
Khurgin, Jacob B; Sun, Greg; Chen, Wei Ting; Tsai, Wei-Yi; Tsai, Din Ping
2015-01-01
Third-order nonlinear optical phenomena explored in the last half century have been predicted to find a wide range of applications in many walks of life, such as all-optical switching, routing, and others, yet this promise has not been fulfilled, primarily because the strength of nonlinear effects is too low when they are to occur on the picosecond scale required in today's signal processing applications. The strongest of the third-order nonlinearities, engendered by thermal effects, is considered to be too slow for the above applications. In this work we show that when optical fields are concentrated into volumes on the scale of a few tens of nanometers, the speed of the thermo-optical effects approaches the picosecond scale. Such a sub-diffraction-limit concentration of field can be accomplished with the use of plasmonic effects in metal nanoparticles impregnating the thermo-optic dielectric (e.g. amorphous Si) and leads to phase shifts sufficient for all-optical switching on an ultrafast scale. PMID:26644322
Leitão, J C; Miotto, J M; Gerlach, M; Altmann, E G
2016-07-01
One of the most celebrated findings in complex systems in the last decade is that different indexes y (e.g. patents) scale nonlinearly with the population x of the cities in which they appear, i.e. y ∼ x^β, β ≠ 1. More recently, the generality of this finding has been questioned in studies that used new databases and different definitions of city boundaries. In this paper, we investigate the existence of nonlinear scaling, using a probabilistic framework in which fluctuations are accounted for explicitly. In particular, we show that this allows not only to (i) estimate β and confidence intervals, but also to (ii) quantify the evidence in favour of β ≠ 1 and (iii) test the hypothesis that the observations are compatible with the nonlinear scaling. We employ this framework to compare five different models to 15 different datasets and we find that the answers to points (i)-(iii) crucially depend on the fluctuations contained in the data, on how they are modelled, and on the fact that the city sizes are heavy-tailed distributed. PMID:27493764
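As a point of reference for the scaling relation in the abstract above, the simplest way to estimate an exponent β in y ∼ x^β is ordinary least squares on log-transformed data. The sketch below shows only that baseline; it is not the probabilistic framework of the paper, which models fluctuations explicitly. The function name `fit_scaling_exponent`, the synthetic populations, and the noise level are all assumptions made for illustration.

```python
import math
import random


def fit_scaling_exponent(x, y):
    """Least-squares fit of log y = log a + beta * log x.

    Returns (beta, log_a). This is a plain log-log regression baseline,
    not a model of the fluctuations around the scaling law.
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    beta = sxy / sxx
    return beta, mean_y - beta * mean_x


# Hypothetical synthetic "cities": heavy-tailed populations, true beta = 1.15,
# with multiplicative log-normal noise on the index y.
rng = random.Random(0)
populations = [1e4 * rng.paretovariate(1.0) for _ in range(500)]
index_values = [0.01 * p ** 1.15 * math.exp(rng.gauss(0.0, 0.3))
                for p in populations]

beta_hat, log_a_hat = fit_scaling_exponent(populations, index_values)
print(f"estimated beta = {beta_hat:.3f}")
```

A fit like this returns a point estimate only. Deciding whether β ≠ 1 is actually supported requires a model of the fluctuations, which is the issue the abstract raises.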
Nonlinear gyrokinetic equations
Dubin, D.H.E.; Krommes, J.A.; Oberman, C.; Lee, W.W.
1983-03-01
Nonlinear gyrokinetic equations are derived from a systematic Hamiltonian theory. The derivation employs Lie transforms and a noncanonical perturbation theory first used by Littlejohn for the simpler problem of asymptotically small gyroradius. For definiteness, we emphasize the limit of electrostatic fluctuations in slab geometry; however, there is a straightforward generalization to arbitrary field geometry and electromagnetic perturbations. An energy invariant for the nonlinear system is derived, and various of its limits are considered. The weak turbulence theory of the equations is examined. In particular, the wave kinetic equation of Galeev and Sagdeev is derived from an asystematic truncation of the equations, implying that this equation fails to consider all gyrokinetic effects. The equations are simplified for the case of small but finite gyroradius and put in a form suitable for efficient computer simulation. Although it is possible to derive the Terry-Horton and Hasegawa-Mima equations as limiting cases of our theory, several new nonlinear terms absent from conventional theories appear and are discussed.
2016-01-01
One of the most celebrated findings in complex systems in the last decade is that different indexes y (e.g. patents) scale nonlinearly with the population x of the cities in which they appear, i.e. y ∼ x^β, β ≠ 1. More recently, the generality of this finding has been questioned in studies that used new databases and different definitions of city boundaries. In this paper, we investigate the existence of nonlinear scaling, using a probabilistic framework in which fluctuations are accounted for explicitly. In particular, we show that this allows not only to (i) estimate β and confidence intervals, but also to (ii) quantify the evidence in favour of β ≠ 1 and (iii) test the hypothesis that the observations are compatible with the nonlinear scaling. We employ this framework to compare five different models to 15 different datasets and we find that the answers to points (i)–(iii) crucially depend on the fluctuations contained in the data, on how they are modelled, and on the fact that the city sizes are heavy-tailed distributed. PMID:27493764
Filamentation with nonlinear Bessel vortices.
Jukna, V; Milián, C; Xie, C; Itina, T; Dudley, J; Courvoisier, F; Couairon, A
2014-10-20
We present a new type of ring-shaped filaments characterized by stationary nonlinear high-order Bessel solutions to the laser beam propagation equation. Two different regimes are identified by direct numerical simulations of the nonlinear propagation of axicon focused Gaussian beams carrying helicity in a Kerr medium with multiphoton absorption: the stable nonlinear propagation regime corresponds to a slow beam reshaping into one of the stationary nonlinear high-order Bessel solutions, called nonlinear Bessel vortices. The region of existence of nonlinear Bessel vortices is found semi-analytically. The influence of the Kerr nonlinearity and nonlinear losses on the beam shape is presented. Direct numerical simulations highlight the role of attractors played by nonlinear Bessel vortices in the stable propagation regime. Large input powers or small cone angles lead to the unstable propagation regime where nonlinear Bessel vortices break up into a helical multiple filament pattern or a more irregular structure. Nonlinear Bessel vortices are shown to be sufficiently intense to generate a ring-shaped filamentary ionized channel in the medium which is foreseen as opening the way to novel applications in laser material processing of transparent dielectrics. PMID:25401574
Rank-preserving regression: a more robust rank regression model against outliers.
Chen, Tian; Kowalski, Jeanne; Chen, Rui; Wu, Pan; Zhang, Hui; Feng, Changyong; Tu, Xin M
2016-08-30
Mean-based semi-parametric regression models such as the popular generalized estimating equations are widely used to improve robustness of inference over parametric models. Unfortunately, such models are quite sensitive to outlying observations. The Wilcoxon-score-based rank regression (RR) provides more robust estimates over generalized estimating equations against outliers. However, the RR and its extensions do not sufficiently address missing data arising in longitudinal studies. In this paper, we propose a new approach to address outliers under a different framework based on the functional response models. This functional-response-model-based alternative not only addresses limitations of the RR and its extensions for longitudinal data, but, with its rank-preserving property, even provides more robust estimates than these alternatives. The proposed approach is illustrated with both real and simulated data. Copyright © 2016 John Wiley & Sons, Ltd. PMID:26934999
Research in nonlinear structural and solid mechanics
NASA Technical Reports Server (NTRS)
Mccomb, H. G., Jr. (Compiler); Noor, A. K. (Compiler)
1980-01-01
Nonlinear analysis of building structures and numerical solution of nonlinear algebraic equations and Newton's method are discussed. Other topics include: nonlinear interaction problems; solution procedures for nonlinear problems; crash dynamics and advanced nonlinear applications; material characterization, contact problems, and inelastic response; and formulation aspects and special software for nonlinear analysis.
Nonlinear models for estimating GSFC travel requirements
NASA Technical Reports Server (NTRS)
Buffalano, C.; Hagan, F. J.
1974-01-01
A methodology is presented for estimating travel requirements for a particular period of time. Travel models were generated using nonlinear regression analysis techniques on a data base of FY-72 and FY-73 information from 79 GSFC projects. Although the subject matter relates to GSFC activities, the type of analysis used and the manner of selecting the relevant variables would be of interest to other NASA centers, government agencies, private corporations and, in general, any organization with a significant travel budget. Models were developed for each of six types of activity: flight projects (in-house and out-of-house), experiments on non-GSFC projects, international projects, ART/SRT, data analysis, advanced studies, tracking and data, and indirects.
NASA Astrophysics Data System (ADS)
Ahn, Kuk-Hyun; Palmer, Richard
2016-09-01
Despite wide use of regression-based regional flood frequency analysis (RFFA) methods, the majority are based on either ordinary least squares (OLS) or generalized least squares (GLS). This paper proposes 'spatial proximity' based RFFA methods using the spatial lagged model (SLM) and spatial error model (SEM). The proposed methods are represented by two frameworks: the quantile regression technique (QRT) and parameter regression technique (PRT). The QRT develops prediction equations for flooding quantiles in average recurrence intervals (ARIs) of 2, 5, 10, 20, and 100 years whereas the PRT provides prediction of three parameters for the selected distribution. The proposed methods are tested using data incorporating 30 basin characteristics from 237 basins in Northeastern United States. Results show that generalized extreme value (GEV) distribution properly represents flood frequencies in the study gages. Also, basin area, stream network, and precipitation seasonality are found to be the most effective explanatory variables in prediction modeling by the QRT and PRT. 'Spatial proximity' based RFFA methods provide reliable flood quantile estimates compared to simpler methods. Compared to the QRT, the PRT may be recommended due to its accuracy and computational simplicity. The results presented in this paper may serve as one possible guidepost for hydrologists interested in flood analysis at ungaged sites.
Comparison of Logistic Regression and Linear Regression in Modeling Percentage Data
Zhao, Lihui; Chen, Yuhuan; Schaffner, Donald W.
2001-01-01
Percentage is widely used to describe different results in food microbiology, e.g., probability of microbial growth, percent inactivated, and percent of positive samples. Four sets of percentage data, percent-growth-positive, germination extent, probability for one cell to grow, and maximum fraction of positive tubes, were obtained from our own experiments and the literature. These data were modeled using linear and logistic regression. Five methods were used to compare the goodness of fit of the two models: percentage of predictions closer to observations, range of the differences (predicted value minus observed value), deviation of the model, linear regression between the observed and predicted values, and bias and accuracy factors. Logistic regression was a better predictor of at least 78% of the observations in all four data sets. In all cases, the deviation of logistic models was much smaller. The linear correlation between observations and logistic predictions was always stronger. Validation (accomplished using part of one data set) also demonstrated that the logistic model was more accurate in predicting new data points. Bias and accuracy factors were found to be less informative when evaluating models developed for percentage data, since neither of these indices can compare predictions at zero. Model simplification for the logistic model was demonstrated with one data set. The simplified model was as powerful in making predictions as the full linear model, and it also gave clearer insight in determining the key experimental factors. PMID:11319091
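One reason a logistic form suits percentage data, as the abstract above reports, is that its predictions stay between 0 and 1 while a straight line can leave that range. The sketch below illustrates this with a deliberately simplified approach: ordinary least squares on the raw fractions versus ordinary least squares on logit-transformed fractions. This is an assumption made for brevity and is not the fitting procedure of the paper; the temperatures and fractions are hypothetical, and values of exactly 0 or 1 are avoided because their logit is undefined.

```python
import math


def ols(x, y):
    """Simple least-squares line y = intercept + slope * x; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def logit(p):
    return math.log(p / (1.0 - p))


def inv_logit(z):
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical data: fraction of growth-positive samples at each temperature.
temps = [5, 10, 15, 20, 25, 30]
fractions = [0.02, 0.08, 0.30, 0.70, 0.93, 0.98]

slope_lin, icpt_lin = ols(temps, fractions)
slope_log, icpt_log = ols(temps, [logit(p) for p in fractions])


def predict_linear(t):
    return icpt_lin + slope_lin * t


def predict_logistic(t):
    return inv_logit(icpt_log + slope_log * t)


# Extrapolate slightly beyond the observed range.
for t in (0, 35):
    print(t, round(predict_linear(t), 3), round(predict_logistic(t), 3))
```

With these toy numbers the straight line predicts a negative fraction at the low end and a fraction above 1 at the high end, whereas the logistic curve remains a valid proportion at both.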
Vacuum Rabi splitting effect in nanomechanical QED system with nonlinear resonator
NASA Astrophysics Data System (ADS)
Zhao, MingYue; Gao, YiBo
2016-08-01
Considering the intrinsic nonlinearity in a nanomechanical resonator coupled to a charge qubit, vacuum Rabi splitting effect is studied in a nanomechanical QED (qubit-resonator) system. A driven nonlinear Jaynes-Cummings model describes the dynamics of this qubit-resonator system. Using quantum regression theorem and master equation approach, we have calculated the two-time correlation spectrum analytically. In the weak driving limit, these analytical results clarify the influence of the driving strength and nonlinearity parameter on the correlation spectrum. Also, numerical calculations confirm these analytical results.
Deep Human Parsing with Active Template Regression.
Liang, Xiaodan; Liu, Si; Shen, Xiaohui; Yang, Jianchao; Liu, Luoqi; Dong, Jian; Lin, Liang; Yan, Shuicheng
2015-12-01
In this work, the human parsing task, namely decomposing a human image into semantic fashion/body regions, is formulated as an active template regression (ATR) problem, where the normalized mask of each fashion/body item is expressed as the linear combination of the learned mask templates, and then morphed to a more precise mask with the active shape parameters, including position, scale and visibility of each semantic region. The mask template coefficients and the active shape parameters together can generate the human parsing results, and are thus called the structure outputs for human parsing. The deep Convolutional Neural Network (CNN) is utilized to build the end-to-end relation between the input human image and the structure outputs for human parsing. More specifically, the structure outputs are predicted by two separate networks. The first CNN uses max-pooling and is designed to predict the template coefficients for each label mask, while the second CNN omits max-pooling to preserve sensitivity to label mask position and accurately predict the active shape parameters. For a new image, the structure outputs of the two networks are fused to generate the probability of each label for each pixel, and super-pixel smoothing is finally used to refine the human parsing result. Comprehensive evaluations on a large dataset demonstrate the significant superiority of the ATR framework over other state-of-the-art methods for human parsing. In particular, the F1-score reaches 64.38 percent with our ATR framework, significantly higher than the 44.76 percent of the state-of-the-art algorithm [28]. PMID:26539846
Non-Stationary Hydrologic Frequency Analysis using B-Splines Quantile Regression
NASA Astrophysics Data System (ADS)
Nasri, B.; St-Hilaire, A.; Bouezmarni, T.; Ouarda, T.
2015-12-01
Hydrologic frequency analysis is commonly used by engineers and hydrologists to provide the basic information on planning, design and management of hydraulic structures and water resources systems under the assumption of stationarity. However, with increasing evidence of changing climate, it is possible that the assumption of stationarity would no longer be valid and the results of conventional analysis would become questionable. In this study, we consider a framework for frequency analysis of extreme flows based on B-Splines quantile regression, which allows modelling of non-stationary data that depend on covariates. Such covariates may have linear or nonlinear dependence. A Markov Chain Monte Carlo (MCMC) algorithm is used to estimate quantiles and their posterior distributions. A coefficient of determination for quantile regression is proposed to evaluate the estimation of the proposed model for each quantile level. The method is applied to annual maximum and minimum streamflow records in Ontario, Canada. Climate indices are considered to describe the non-stationarity in these variables and to estimate the quantiles in this case. The results show large differences between the non-stationary quantiles and their stationary equivalents for annual maximum and minimum discharge with high annual non-exceedance probabilities. Keywords: Quantile regression, B-Splines functions, MCMC, Streamflow, Climate indices, non-stationarity.
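As background for the quantile regression named above, the sketch below shows the check (pinball) loss whose minimizer is a quantile. It is only the simplest constant-quantile case on synthetic data: the B-spline covariate dependence and the MCMC estimation of the abstract are not reproduced, and the exponential sample and the level `tau` are assumptions made for illustration.

```python
import numpy as np

def pinball_loss(y, q, tau):
    """Mean check loss when the constant q is used as the tau-th quantile."""
    u = y - q
    # Under-prediction is weighted by tau, over-prediction by (1 - tau).
    return np.mean(np.where(u >= 0, tau * u, (tau - 1.0) * u))

rng = np.random.default_rng(1)
y = rng.exponential(scale=1.0, size=1001)   # synthetic stand-in for annual maxima
tau = 0.9                                   # non-exceedance probability

# Search over the observed values; the minimizer of the check loss
# is an order statistic of the sample.
candidates = np.sort(y)
losses = np.array([pinball_loss(y, q, tau) for q in candidates])
q_hat = candidates[np.argmin(losses)]

print(q_hat, np.quantile(y, tau))
```

In a regression setting the constant `q` would be replaced by a function of the covariates (for example a B-spline expansion of a climate index), with the same loss summed over observations.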
NASA Astrophysics Data System (ADS)
Schlechtingen, Meik; Ferreira Santos, Ilmar
2011-07-01
This paper presents the research results of a comparison of three different model based approaches for wind turbine fault detection in online SCADA data, by applying developed models to five real measured faults and anomalies. The regression based model as the simplest approach to build a normal behavior model is compared to two artificial neural network based approaches, which are a full signal reconstruction and an autoregressive normal behavior model. Based on a real time series containing two generator bearing damages the capabilities of identifying the incipient fault prior to the actual failure are investigated. The period after the first bearing damage is used to develop the three normal behavior models. The developed or trained models are used to investigate how the second damage manifests in the prediction error. Furthermore the full signal reconstruction and the autoregressive approach are applied to further real time series containing gearbox bearing damages and stator temperature anomalies. The comparison revealed all three models being capable of detecting incipient faults. However, they differ in the effort required for model development and the remaining operational time after first indication of damage. The general nonlinear neural network approaches outperform the regression model. The remaining seasonality in the regression model prediction error makes it difficult to detect abnormality and leads to increased alarm levels and thus a shorter remaining operational period. For the bearing damages and the stator anomalies under investigation the full signal reconstruction neural network gave the best fault visibility and thus led to the highest confidence level.
Liu, Dawei; Lin, Xihong; Ghosh, Debashis
2007-12-01
We consider a semiparametric regression model that relates a normal outcome to covariates and a genetic pathway, where the covariate effects are modeled parametrically and the pathway effect of multiple gene expressions is modeled parametrically or nonparametrically using least-squares kernel machines (LSKMs). This unified framework allows a flexible function for the joint effect of multiple genes within a pathway by specifying a kernel function and allows for the possibility that each gene expression effect might be nonlinear and the genes within the same pathway are likely to interact with each other in a complicated way. This semiparametric model also makes it possible to test for the overall genetic pathway effect. We show that the LSKM semiparametric regression can be formulated using a linear mixed model. Estimation and inference hence can proceed within the linear mixed model framework using standard mixed model software. Both the regression coefficients of the covariate effects and the LSKM estimator of the genetic pathway effect can be obtained using the best linear unbiased predictor in the corresponding linear mixed model formulation. The smoothing parameter and the kernel parameter can be estimated as variance components using restricted maximum likelihood. A score test is developed to test for the genetic pathway effect. Model/variable selection within the LSKM framework is discussed. The methods are illustrated using a prostate cancer data set and evaluated using simulations. PMID:18078480
NASA Technical Reports Server (NTRS)
Patniak, Surya N.; Guptill, James D.; Hopkins, Dale A.; Lavelle, Thomas M.
1998-01-01
Nonlinear mathematical-programming-based design optimization can be an elegant method. However, the calculations required to generate the merit function, constraints, and their gradients, which are frequently required, can make the process computationally intensive. The computational burden can be greatly reduced by using approximating analyzers derived from an original analyzer utilizing neural networks and linear regression methods. The experience gained from using both of these approximation methods in the design optimization of a high speed civil transport aircraft is the subject of this paper. The Langley Research Center's Flight Optimization System was selected for the aircraft analysis. This software was exercised to generate a set of training data with which a neural network and a regression method were trained, thereby producing the two approximating analyzers. The derived analyzers were coupled to the Lewis Research Center's CometBoards test bed to provide the optimization capability. With the combined software, both approximation methods were examined for use in aircraft design optimization, and both performed satisfactorily. The CPU time for solution of the problem, which had been measured in hours, was reduced to minutes with the neural network approximation and to seconds with the regression method. Instability encountered in the aircraft analysis software at certain design points was also eliminated. On the other hand, there were costs and difficulties associated with training the approximating analyzers. The CPU time required to generate the input-output pairs and to train the approximating analyzers was seven times that required for solution of the problem.
Multiple regression technique for Pth degree polynomials with and without linear cross products
NASA Technical Reports Server (NTRS)
Davis, J. W.
1973-01-01
A multiple regression technique was developed by which the nonlinear behavior of specified independent variables can be related to a given dependent variable. The polynomial expression can be of Pth degree and can incorporate N independent variables. Two cases are treated such that mathematical models can be studied both with and without linear cross products. The resulting surface fits can be used to summarize trends for a given phenomenon and provide a mathematical relationship for subsequent analysis. To implement this technique, separate computer programs, which evaluate the various constants in the model regression equation, were developed for the case without linear cross products and for the case incorporating such cross products. In addition, the significance of the estimated regression equation is considered, and the standard deviation, the F statistic, the maximum absolute percent error, and the average of the absolute values of the percent error are evaluated. The computer programs and their manner of utilization are described. Sample problems that show the output formats and typical plots comparing computer results to each set of input data are included to illustrate the use and capability of the technique.
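A minimal sketch of such a fit is given below. It is not the 1973 programs: the function names, the synthetic data, and the reading of "linear cross products" as pairwise terms x_i * x_j are all assumptions. The fit is done by ordinary least squares, and the four summary statistics listed in the abstract are computed from the residuals.

```python
import numpy as np
from itertools import combinations

def design_matrix(X, degree, cross_products=False):
    """Columns: 1, x_j**k for k = 1..degree, and optionally x_i * x_j (i < j)."""
    X = np.asarray(X, dtype=float)
    n, m = X.shape
    cols = [np.ones(n)]
    for k in range(1, degree + 1):
        for j in range(m):
            cols.append(X[:, j] ** k)
    if cross_products:
        for i, j in combinations(range(m), 2):
            cols.append(X[:, i] * X[:, j])
    return np.column_stack(cols)

def fit_polynomial(X, y, degree, cross_products=False):
    """Least-squares fit plus the summary statistics named in the abstract."""
    A = design_matrix(X, degree, cross_products)
    coef = np.linalg.lstsq(A, y, rcond=None)[0]
    resid = y - A @ coef
    n, p = A.shape
    sse = float(resid @ resid)                  # residual sum of squares
    sst = float(np.sum((y - y.mean()) ** 2))    # total sum of squares
    pct = 100.0 * np.abs(resid / y)             # requires y != 0
    return {
        "coef": coef,
        "std": np.sqrt(sse / (n - p)),
        "F": ((sst - sse) / (p - 1)) / (sse / (n - p)),
        "max_abs_pct_err": pct.max(),
        "mean_abs_pct_err": pct.mean(),
    }

# Synthetic data with a genuine x1*x2 interaction (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 2.0, size=(60, 2))
y = (10 + 2 * X[:, 0] - X[:, 1] + 0.5 * X[:, 0] ** 2
     + 3 * X[:, 0] * X[:, 1] + rng.normal(0.0, 0.01, 60))

without_cp = fit_polynomial(X, y, degree=2)
with_cp = fit_polynomial(X, y, degree=2, cross_products=True)
print(without_cp["std"], with_cp["std"])
```

When the data contain an interaction, the model without cross products leaves it in the residuals, which shows up as a larger standard deviation and percent error.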
Predictive Regression Models of Monthly Seismic Energy Emissions Induced by Longwall Mining
NASA Astrophysics Data System (ADS)
Jakubowski, Jacek; Tajduś, Antoni
2014-10-01
This article presents the development and validation of predictive regression models of longwall mining-induced seismicity, based on observations in 63 longwalls, in 12 seams, in the Bielszowice colliery in the Upper Silesian Coal Basin, which took place between 1992 and 2012. A predicted variable is the logarithm of the monthly sum of seismic energy induced in a longwall area. The set of predictors include seven quantitative and qualitative variables describing some mining and geological conditions and earlier seismicity in longwalls. Two machine learning methods have been used to develop the models: boosted regression trees and neural networks. Two types of model validation have been applied: on a random validation sample and on a time-based validation sample. The set of a few selected variables enabled nonlinear regression models to be built which gave relatively small prediction errors, taking the complex and strongly stochastic nature of the phenomenon into account. The article presents both the models of periodic forecasting for the following month as well as long-term forecasting.
Oil and gas pipeline construction cost analysis and developing regression models for cost estimation
NASA Astrophysics Data System (ADS)
Thaduri, Ravi Kiran
In this study, cost data for 180 pipelines and 136 compressor stations have been analyzed. On the basis of the distribution analysis, regression models have been developed. Material, labor, ROW and miscellaneous costs make up the total cost of pipeline construction. The pipelines are analyzed based on different pipeline lengths, diameter, location, pipeline volume and year of completion. In pipeline construction, labor costs dominate the total costs with a share of about 40%. Multiple non-linear regression models are developed to estimate the component costs of pipelines for various cross-sectional areas, lengths and locations. The compressor stations are analyzed based on the capacity, year of completion and location. Unlike the pipeline costs, material costs dominate the total costs in the construction of compressor stations, with an average share of about 50.6%. Land costs have very little influence on the total costs. Similar regression models are developed to estimate the component costs of compressor stations for various capacities and locations.
2014-01-01
Background In biomedical research, response variables are often encountered which have bounded support on the open unit interval - (0,1). Traditionally, researchers have attempted to estimate covariate effects on these types of response data using linear regression. Alternative modelling strategies may include: beta regression, variable-dispersion beta regression, and fractional logit regression models. This study employs a Monte Carlo simulation design to compare the statistical properties of the linear regression model to that of the more novel beta regression, variable-dispersion beta regression, and fractional logit regression models. Methods In the Monte Carlo experiment we assume a simple two sample design. We assume observations are realizations of independent draws from their respective probability models. The randomly simulated draws from the various probability models are chosen to emulate average proportion/percentage/rate differences of pre-specified magnitudes. Following simulation of the experimental data we estimate average proportion/percentage/rate differences. We compare the estimators in terms of bias, variance, type-1 error and power. Estimates of Monte Carlo error associated with these quantities are provided. Results If response data are beta distributed with constant dispersion parameters across the two samples, then all models are unbiased and have reasonable type-1 error rates and power profiles. If the response data in the two samples have different dispersion parameters, then the simple beta regression model is biased. When the sample size is small (N0 = N1 = 25) linear regression has superior type-1 error rates compared to the other models. Small sample type-1 error rates can be improved in beta regression models using bias correction/reduction methods. In the power experiments, variable-dispersion beta regression and fractional logit regression models have slightly elevated power compared to linear regression models. Similar
Frequency domain nonlinear optics
NASA Astrophysics Data System (ADS)
Legare, Francois
2016-05-01
The universal dilemma of gain narrowing occurring in fs amplifiers prevents ultra-high power lasers from delivering few-cycle pulses. This problem is overcome by a new amplification concept: Frequency domain Optical Parametric Amplification (FOPA). It enables simultaneous up-scaling of peak power and amplified spectral bandwidth and can be performed at any wavelength range of conventional amplification schemes, but with the capability to amplify single cycles of light. The key idea for amplification of octave-spanning spectra without loss of spectral bandwidth is to amplify the broad spectrum "slice by slice" in the frequency domain, i.e. in the Fourier plane of a 4f-setup. The striking advantage of this scheme is its capability to amplify (more than) one octave of bandwidth without shorting the corresponding pulse duration. This is because ultrabroadband phase matching is not defined by the properties of the nonlinear crystal employed but by the number of crystals employed. In the same manner, to increase the output energy one simply has to increase the spectral extension in the Fourier plane and to add one more crystal. Thus, increasing pulse energy and shortening its duration accompany each other. A proof of principle experiment was carried out at ALLS on the sub-two cycle IR beam line and yielded record breaking performance in the field of few-cycle IR lasers. 100 μJ two-cycle pulses from a hollow core fibre compression setup were amplified to 1.43 mJ without distorting spatial or temporal properties. The pulse duration at the input of the FOPA and after the FOPA remains the same. Recently, we have started upgrading this system to be pumped by 250 mJ to reach 40 mJ two-cycle IR pulses, and the latest results will be presented at the conference. Furthermore, the extension of the concept of FOPA to other nonlinear optical processes will be discussed.
Kepler AutoRegressive Planet Search
NASA Astrophysics Data System (ADS)
Caceres, Gabriel Antonio; Feigelson, Eric
2016-01-01
The Kepler AutoRegressive Planet Search (KARPS) project uses statistical methodology associated with autoregressive (AR) processes to model Kepler lightcurves in order to improve exoplanet transit detection in systems with high stellar variability. We also introduce a planet-search algorithm to detect transits in time-series residuals after application of the AR models. One of the main obstacles in detecting faint planetary transits is the intrinsic stellar variability of the host star. The variability displayed by many stars may have autoregressive properties, wherein later flux values are correlated with previous ones in some manner. Our analysis procedure consists of three steps: pre-processing of the data to remove discontinuities, gaps and outliers; AR-type model selection and fitting; and transit signal search of the residuals using a new Transit Comb Filter (TCF) that replaces traditional box-finding algorithms. The analysis procedures of the project are applied to a portion of the publicly available Kepler light curve data for the full 4-year mission duration. Tests of the methods have been made on a subset of Kepler Objects of Interest (KOI) systems, classified both as planetary 'candidates' and 'false positives' by the Kepler Team, as well as a random sample of unclassified systems. We find that the ARMA-type modeling successfully reduces the stellar variability, by a factor of 10 or more in active stars and by smaller factors in more quiescent stars. A typical quiescent Kepler star has an interquartile range (IQR) of ~10 e-/sec, which may improve slightly after modeling, while those with IQR ranging from 20 to 50 e-/sec have improvements from 20% up to 70%. High activity stars (IQR exceeding 100) markedly improve. A periodogram based on the TCF is constructed to concentrate the signal of these periodic spikes. When a periodic transit is found, the model is displayed on a standard period-folded averaged light curve. Our findings to date on real
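The second step of the procedure above (AR-type fitting, then working with the residuals) can be sketched as follows. This is an assumed, simplified illustration: a pure AR(p) model fitted by least squares to a synthetic series, rather than the ARMA-type models and real Kepler data of the project, and the Transit Comb Filter is not implemented. The function name `fit_ar` and all parameter values are hypothetical.

```python
import numpy as np

def fit_ar(y, p):
    """Least-squares AR(p) fit; returns coefficients and one-step residuals."""
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    n = len(y)
    # Column k-1 holds the series lagged by k, aligned with targets y[p:].
    lags = np.column_stack([y[p - k:n - k] for k in range(1, p + 1)])
    target = y[p:]
    phi = np.linalg.lstsq(lags, target, rcond=None)[0]
    return phi, target - lags @ phi

def iqr(v):
    """Interquartile range, the variability measure quoted in the abstract."""
    q75, q25 = np.percentile(v, [75, 25])
    return q75 - q25

# Synthetic "stellar variability": a stationary AR(2) process (assumed values).
rng = np.random.default_rng(42)
n = 2000
true_phi = np.array([0.6, 0.3])
noise = rng.normal(0.0, 1.0, n)
flux = np.zeros(n)
for t in range(2, n):
    flux[t] = true_phi[0] * flux[t - 1] + true_phi[1] * flux[t - 2] + noise[t]

phi_hat, resid = fit_ar(flux, p=2)
print("phi_hat:", phi_hat)
print("IQR raw:", iqr(flux), "IQR residuals:", iqr(resid))
```

The residual series has a smaller IQR than the raw series because the correlated component has been removed; a transit search would then operate on `resid`.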
Nonlinear waveform generation.
Goldstein, L J; Rypins, E B
1990-01-01
We developed three analog logic SPICE (Simulation Program with Integrated Circuit Emphasis, developed at the University of California, Berkeley, CA) subcircuits, a voltage comparator and a nonlinear waveform generator to complement the previously derived functions (Goldstein and Rypins, Comput. Methods Programs Biomed. 29 (1989) 161-172) that simplify modeling of physiologic systems. The logic elements are the 'AND', 'OR' and 'NOT' Boolean functions. In addition, we derived a voltage comparator for use in our composite waveform generator. All the circuits are analog so they can be incorporated into existing analog circuits while performing digital functions. PMID:2364683
Chaos without nonlinear dynamics.
Corron, Ned J; Hayes, Scott T; Pethel, Shawn D; Blakely, Jonathan N
2006-07-14
A linear, second-order filter driven by randomly polarized pulses is shown to generate a waveform that is chaotic under time reversal. That is, the filter output exhibits determinism and a positive Lyapunov exponent when viewed backward in time. The filter is demonstrated experimentally using a passive electronic circuit, and the resulting waveform exhibits a Lorenz-like butterfly structure. This phenomenon suggests that chaos may be connected to physical theories whose underlying framework is not that of a traditional deterministic nonlinear dynamical system. PMID:16907450
Nonlinear electrodynamics at Cinvestav
NASA Astrophysics Data System (ADS)
Bretón, Nora
2012-02-01
After a brief introduction to the original aims of Nonlinear electrodynamics (NLED), a review on NLED research that has been developed in the Physics Department at Cinvestav-IPN is addressed: from the seminal work by Jerzy Plebañski, which was followed by S. Hacyan and S. Alarcón, afterwards by A. García and H. Salazar; and more recently by E. Ayón-Beato and N. Bretón. We conclude by pointing to the current streams of research.
Nonlinear methods for communications
NASA Astrophysics Data System (ADS)
1992-08-01
An innovative communication system has been developed. This system has the potential for improved secure communication for covert operations. By modulating data on the chaotic signal used to synchronize two nonlinear systems, the researchers have created a Low Probability of Intercept (LPI) communications system. They derived the equations which govern the system, made models of the system, and performed numerical simulations to test these models. The theoretical and numerical studies of this system have been validated by experiment. A recent design improvement has led to a system that synchronizes at a signal-to-noise ratio of 0 dB. This development holds the promise of a Low Probability of Detection (LPD) system.
Dr. Katja Lindenberg
2005-11-20
During the one-year period 2004-2005 our work continued to focus on nonlinear noisy systems, with special attention to spatially extended systems. There is a history of many decades of research in the sciences and engineering on the behavior of nonlinear noisy systems, but only in the past ten years or so has a theoretical understanding of spatially extended systems begun to emerge. This has been the outcome of a symbiosis of numerical simulations not possible until recently, laboratory experiments, and new analytic methods.
Limits on nonlinear electrodynamics
NASA Astrophysics Data System (ADS)
Fouché, M.; Battesti, R.; Rizzo, C.
2016-05-01
In this paper we set a framework in which experiments whose goal is to test QED predictions can be used in a more general way to test nonlinear electrodynamics (NLED) which contains low-energy QED as a special case. We review some of these experiments and we establish limits on the different free parameters by generalizing QED predictions in the framework of NLED. We finally discuss the implications of these limits on bound systems and isolated charged particles for which QED has been widely and successfully tested.
Fenske, Nora; Burns, Jacob; Hothorn, Torsten; Rehfuess, Eva A.
2013-01-01
Background Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. Objective We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. Design Using cross-sectional data for children aged 0–24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. Results At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. Conclusions Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role. PMID:24223839
Nonlinear refraction in vitreous humor.
Rockwell, B A; Roach, W P; Rogers, M E; Mayo, M W; Toth, C A; Cain, C P; Noojin, G D
1993-11-01
We extend the application of the z-scan technique to determine the nonlinear refractive index (n(2)) for human and rabbit vitreous humor, water, and physiological saline. In these measurements there were nonlinear contributions to the measured signal from the aqueous samples and the quartz cell that held the sample. Measurements were made with 60-ps pulses at 532 nm. To our knowledge, this is the first measurement of the nonlinear refractive properties of biological material. PMID:19829406
Nonlinear heat conduction with combustion
Galaktionov, V.A.; Kurclyumov, S.P.; Samarskiv, A.A.
1991-01-01
This paper deals with a study of the properties of high-intensity combustion of a solid nonlinear heat conducting medium which is described by the quasilinear parabolic-type equation for nonlinear heat conduction with a source. The paper summarizes a significant range of investigations dealing with the study of high-intensity thermal processes in solid nonlinear media carried out by the authors in the past decade.
Improved nonlinear prediction method
NASA Astrophysics Data System (ADS)
Adenan, Nur Hamiza; Md Noorani, Mohd Salmi
2014-06-01
The analysis and prediction of time series data have been addressed by researchers. Many techniques have been developed to be applied in various areas, such as weather forecasting, financial markets and hydrological phenomena involving data that are contaminated by noise. Therefore, various techniques to improve the method have been introduced to analyze and predict time series data. In respect of the importance of analysis and the accuracy of the prediction result, a study was undertaken to test the effectiveness of the improved nonlinear prediction method for data that contain noise. The improved nonlinear prediction method involves the formation of composite serial data based on the successive differences of the time series. Then, the phase space reconstruction was performed on the composite data (one-dimensional) to reconstruct a number of space dimensions. Finally the local linear approximation method was employed to make a prediction based on the phase space. This improved method was tested with data series Logistics that contain 0%, 5%, 10%, 20% and 30% of noise. The results show that by using the improved method, the predictions were found to be in close agreement with the observed ones. The correlation coefficient was close to one when the improved method was applied on data with up to 10% noise. Thus, an improvement to analyze data with noise without involving any noise reduction method was introduced to predict the time series data.
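The last two steps described above (phase space reconstruction followed by local linear approximation) can be sketched as below. This is an assumed baseline on a noise-free logistic map, not the authors' improved method: the composite-series step built from successive differences is omitted because the abstract does not specify it, and the embedding dimension, delay, and neighbour count are illustrative choices.

```python
import numpy as np

def logistic_map(n, r=4.0, x0=0.3):
    """Generate n points of the logistic map x[t+1] = r * x[t] * (1 - x[t])."""
    x = np.empty(n)
    x[0] = x0
    for t in range(1, n):
        x[t] = r * x[t - 1] * (1.0 - x[t - 1])
    return x

def embed(x, dim, tau=1):
    """Delay vectors (x[t], x[t - tau], ..., x[t - (dim - 1) * tau]) as rows."""
    start = (dim - 1) * tau
    return np.column_stack([x[start - k * tau:len(x) - k * tau] for k in range(dim)])

def predict_next(train, query, dim=2, tau=1, k=10):
    """One-step prediction from a linear fit to the k nearest delay vectors."""
    V = embed(train, dim, tau)
    states = V[:-1]                          # delay vectors with a known successor
    targets = train[(dim - 1) * tau + 1:]    # the value that followed each vector
    dist = np.linalg.norm(states - query, axis=1)
    idx = np.argsort(dist)[:k]
    A = np.column_stack([np.ones(k), states[idx]])
    coef = np.linalg.lstsq(A, targets[idx], rcond=None)[0]
    return coef[0] + query @ coef[1:]

series = logistic_map(1101)
train = series[:1000]

# Predict x[i + 1] from the delay vector (x[i], x[i - 1]) for 100 test points.
preds, actual = [], []
for i in range(1000, 1100):
    query = np.array([series[i], series[i - 1]])
    preds.append(predict_next(train, query))
    actual.append(series[i + 1])
preds, actual = np.array(preds), np.array(actual)

corr = np.corrcoef(preds, actual)[0, 1]
mae = np.mean(np.abs(preds - actual))
print(f"correlation={corr:.4f}  mean abs error={mae:.4f}")
```

To mimic the noise experiments in the abstract, noise could be added to `series` before splitting; how the predictions degrade would then depend on the noise level and on the neighbour count `k`.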
Nonlinear Attitude Filtering Methods
NASA Technical Reports Server (NTRS)
Markley, F. Landis; Crassidis, John L.; Cheng, Yang
2005-01-01
This paper provides a survey of modern nonlinear filtering methods for attitude estimation. Early applications relied mostly on the extended Kalman filter for attitude estimation. Since these applications, several new approaches have been developed that have proven to be superior to the extended Kalman filter. Several of these approaches maintain the basic structure of the extended Kalman filter, but employ various modifications in order to provide better convergence or improve other performance characteristics. Examples of such approaches include: filter QUEST, extended QUEST, the super-iterated extended Kalman filter, the interlaced extended Kalman filter, and the second-order Kalman filter. Filters that propagate and update a discrete set of sigma points rather than using linearized equations for the mean and covariance are also reviewed. A two-step approach is discussed with a first-step state that linearizes the measurement model and an iterative second step to recover the desired attitude states. These approaches are all based on the Gaussian assumption that the probability density function is adequately specified by its mean and covariance. Other approaches that do not require this assumption are reviewed, including particle filters and a Bayesian filter based on a non-Gaussian, finite-parameter probability density function on SO(3). Finally, the predictive filter, nonlinear observers and adaptive approaches are shown. The strengths and weaknesses of the various approaches are discussed.
Midorikawa, Katsumi
2010-10-08
We report nonlinear multiphoton processes in atoms and molecules induced by intense high harmonics and their applications to attosecond pulse characterization. Phase-matched high harmonics generated in a loosely focusing geometry produce highly focusable intensity with full spatiotemporal coherence, which is sufficient to induce nonlinear optical phenomena in the extreme ultraviolet and soft x-ray (XUV) region. With this XUV coherent light source, two-photon double ionization in He is demonstrated with 42-eV high harmonic photons. On the other hand, when N₂ molecules are subjected to intense high harmonics around 20 eV, Coulomb explosion following two-photon double ionization is observed with attosecond temporal precision. Taking advantage of the larger cross section of two-photon ionization in molecules, we successfully perform the interferometric autocorrelation of an attosecond pulse train with the ion signals produced by Coulomb explosion of nitrogen molecules. The result reveals the phase relation between attosecond pulses in the train.
Bayesian Analysis of Nonlinear Structural Equation Models with Nonignorable Missing Data
ERIC Educational Resources Information Center
Lee, Sik-Yum
2006-01-01
A Bayesian approach is developed for analyzing nonlinear structural equation models with nonignorable missing data. The nonignorable missingness mechanism is specified by a logistic regression model. A hybrid algorithm that combines the Gibbs sampler and the Metropolis-Hastings algorithm is used to produce the joint Bayesian estimates of…
ERIC Educational Resources Information Center
Mooijaart, Ab; Satorra, Albert
2009-01-01
In this paper, we show that for some structural equation models (SEM), the classical chi-square goodness-of-fit test is unable to detect the presence of nonlinear terms in the model. As an example, we consider a regression model with latent variables and interactions terms. Not only the model test has zero power against that type of…
Can Flexible Non-Linear Modeling Tell Us Anything New about Educational Productivity?
ERIC Educational Resources Information Center
Baker, Bruce D.
2001-01-01
Explores whether flexible nonlinear models (including neural networks and genetic algorithms) can reveal otherwise unexpected patterns of relationship in typical school-productivity data. Applying three types of algorithms alongside regression modeling to school-level data in 183 elementary schools proves the hypothesis and reveals new directions…
ERIC Educational Resources Information Center
Pek, Jolynn; Chalmers, R. Philip; Kok, Bethany E.; Losardo, Diane
2015-01-01
Structural equation mixture models (SEMMs), when applied as a semiparametric model (SPM), can adequately recover potentially nonlinear latent relationships without their specification. This SPM is useful for exploratory analysis when the form of the latent regression is unknown. The purpose of this article is to help users familiar with structural…
Nonlocal homogenization for nonlinear metamaterials
NASA Astrophysics Data System (ADS)
Gorlach, Maxim A.; Voytova, Tatiana A.; Lapine, Mikhail; Kivshar, Yuri S.; Belov, Pavel A.
2016-04-01
We present a consistent theoretical approach for calculating effective nonlinear susceptibilities of metamaterials taking into account both frequency and spatial dispersion. Employing the discrete dipole model, we demonstrate that effects of spatial dispersion become especially pronounced in the vicinity of effective permittivity resonance where nonlinear susceptibilities reach their maxima. In that case spatial dispersion may enable simultaneous generation of two harmonic signals with the same frequency and polarization but different wave vectors. We also prove that the derived expressions for nonlinear susceptibilities transform into the known form when spatial dispersion effects are negligible. In addition to revealing new physical phenomena, our results provide useful theoretical tools for analyzing resonant nonlinear metamaterials.
Nonlinear ptychographic coherent diffractive imaging.
Odstrcil, M; Baksh, P; Gawith, C; Vrcelj, R; Frey, J G; Brocklesby, W S
2016-09-01
Ptychographic coherent diffractive imaging (PCDI) is a significant advance in imaging, allowing the measurement of the full electric field at a sample without the use of any imaging optics. So far it has been confined solely to imaging of linear optical responses. In this paper we show that, because of the coherence-preserving nature of nonlinear optical interactions, PCDI can be generalised to nonlinear optical imaging. We demonstrate second harmonic generation PCDI, directly revealing phase information about the nonlinear coefficients and showing the general applicability of PCDI to nonlinear interactions. PMID:27607631
Problems in nonlinear resistive MHD
Turnbull, A.D.; Strait, E.J.; La Haye, R.J.; Chu, M.S.; Miller, R.L.
1998-12-31
Two experimentally relevant problems can relatively easily be tackled by nonlinear MHD codes. Both problems require plasma rotation in addition to the nonlinear mode coupling and full geometry already incorporated into the codes, but no additional physics seems to be crucial. The problems discussed here are: (1) nonlinear coupling and interaction of multiple MHD modes near the β limit and (2) nonlinear coupling of the m/n = 1/1 sawtooth mode with higher n gongs and development of seed islands outside q = 1.
Spontaneous Regression of a Carcinoid Tumor following Pregnancy
Sewpaul, A.; Bargiela, D.; James, A.; Johnson, S. J.; French, J. J.
2014-01-01
We present a case of spontaneous regression of a neuroendocrine tumor following pregnancy in the absence of chemotherapy, radiotherapy, or alternative medicine (including herbal medicine). The diagnosis of a nonsecretory carcinoid tumor was confirmed using CT imaging, octreotide scan, and histology. Furthermore, serial imaging has demonstrated spontaneous regression of the carcinoid suggesting that pregnancy did not worsen the course of the disease but instead may have contributed to tumour regression. We discuss mechanisms underlying tumour regression and the possible effect of pregnancy on these processes. PMID:25587468
Compound Identification Using Penalized Linear Regression on Metabolomics
Liu, Ruiqi; Wu, Dongfeng; Zhang, Xiang; Kim, Seongho
2014-01-01
Compound identification is often achieved by matching experimental mass spectra to the mass spectra stored in a reference library based on mass spectral similarity. Because the number of compounds in the reference library is much larger than the range of mass-to-charge ratio (m/z) values, the data become high dimensional and suffer from singularity. For this reason, penalized linear regressions such as ridge regression and the lasso are used instead of ordinary least squares regression. Furthermore, two-step approaches using the dot product and Pearson’s correlation along with penalized linear regression are proposed in this study. PMID:27212894
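The singularity problem described above can be made concrete with a small sketch. It is not the authors' procedure: the simulated library, the matrix sizes, the penalty value, and the helper name `ridge_fit` are all assumptions for illustration. With more library compounds (columns) than m/z values (rows), the Gram matrix is rank deficient, so ordinary least squares has no unique solution, whereas adding a ridge penalty makes the linear system solvable.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate: solve (X^T X + lam * I) b = X^T y."""
    n_features = X.shape[1]
    gram = X.T @ X
    return np.linalg.solve(gram + lam * np.eye(n_features), X.T @ y)

rng = np.random.default_rng(0)

# Hypothetical sizes: fewer m/z bins (rows) than library compounds (columns).
n_mz, n_compounds = 20, 50
library = rng.random((n_mz, n_compounds))   # columns: simulated reference spectra
query = library[:, 7] + 0.01 * rng.standard_normal(n_mz)  # noisy copy of one spectrum

# The Gram matrix is 50 x 50 but its rank cannot exceed 20, so it is singular
# and the ordinary least squares normal equations have no unique solution.
gram_rank = np.linalg.matrix_rank(library.T @ library)

# The ridge penalty makes the system positive definite, hence uniquely solvable.
coef = ridge_fit(library, query, lam=0.1)
```

The lasso mentioned in the abstract replaces the squared penalty with an absolute-value penalty and has no closed form, so it would need an iterative solver instead of `np.linalg.solve`.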
SPReM: Sparse Projection Regression Model For High-dimensional Linear Regression
Sun, Qiang; Zhu, Hongtu; Liu, Yufeng; Ibrahim, Joseph G.
2014-01-01
The aim of this paper is to develop a sparse projection regression modeling (SPReM) framework to perform multivariate regression modeling with a large number of responses and a multivariate covariate of interest. We propose two novel heritability ratios to simultaneously perform dimension reduction, response selection, estimation, and testing, while explicitly accounting for correlations among multivariate responses. SPReM is devised specifically to address the low statistical power of many standard statistical approaches, such as Hotelling’s T² test statistic or a mass univariate analysis, for high-dimensional data. We formulate the estimation problem of SPReM as a novel sparse unit rank projection (SURP) problem and propose a fast optimization algorithm for SURP. Furthermore, we extend SURP to the sparse multi-rank projection (SMURP) by adopting a sequential SURP approximation. Theoretically, we have systematically investigated the convergence properties of SURP and the convergence rate of SURP estimates. Our simulation results and real data analysis show that SPReM outperforms other state-of-the-art methods. PMID:26527844