Sample records for high-dimensional linear regression

  1. Non-Asymptotic Oracle Inequalities for the High-Dimensional Cox Regression via Lasso.

    PubMed

    Kong, Shengchun; Nan, Bin

    2014-01-01

    We consider finite sample properties of the regularized high-dimensional Cox regression via lasso. Existing literature focuses on linear models or generalized linear models with Lipschitz loss functions, where the empirical risk functions are the summations of independent and identically distributed (iid) losses. The summands in the negative log partial likelihood function for censored survival data, however, are neither iid nor Lipschitz. We first approximate the negative log partial likelihood function by a sum of iid non-Lipschitz terms, then derive the non-asymptotic oracle inequalities for the lasso penalized Cox regression using pointwise arguments to tackle the difficulties caused by the lack of iid Lipschitz losses.
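
    The estimator this record analyzes is easy to sketch even though the record's contribution is theoretical. Below is a minimal numpy implementation of lasso-penalized Cox regression fitted by proximal gradient descent (ISTA) on synthetic censored data; the solver, step size, penalty level, and data-generating choices are our own illustrative assumptions, not the authors' construction.

        import numpy as np

        def nll_grad(beta, X, time, event):
            """Gradient of the negative log partial likelihood (Breslow, no ties)."""
            order = np.argsort(-time)                    # sort by descending time
            Xs, es = X[order], event[order].astype(bool)
            eta = Xs @ beta
            w = np.exp(eta - eta.max())                  # stabilized exponentials
            cum_w = np.cumsum(w)                         # risk-set sums
            cum_xw = np.cumsum(Xs * w[:, None], axis=0)
            return -(Xs[es] - cum_xw[es] / cum_w[es][:, None]).sum(axis=0)

        def lasso_cox(X, time, event, lam=0.05, step=0.1, n_iter=5000):
            """Proximal gradient (ISTA) for the lasso-penalized Cox model."""
            n, p = X.shape
            beta = np.zeros(p)
            for _ in range(n_iter):
                b = beta - step * nll_grad(beta, X, time, event) / n
                beta = np.sign(b) * np.maximum(np.abs(b) - step * lam, 0.0)
            return beta

        rng = np.random.default_rng(0)
        n, p = 300, 100
        X = rng.standard_normal((n, p))
        true = np.zeros(p); true[:3] = [1.0, -1.0, 0.5]
        t_event = rng.exponential(1.0 / np.exp(X @ true))  # hazard = exp(x'beta)
        t_cens = rng.exponential(2.0, size=n)              # independent censoring
        time = np.minimum(t_event, t_cens)
        event = (t_event <= t_cens).astype(int)
        print("selected:", np.nonzero(lasso_cox(X, time, event))[0])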

  2. Non-Asymptotic Oracle Inequalities for the High-Dimensional Cox Regression via Lasso

    PubMed Central

    Kong, Shengchun; Nan, Bin

    2013-01-01

    We consider finite sample properties of the regularized high-dimensional Cox regression via lasso. Existing literature focuses on linear models or generalized linear models with Lipschitz loss functions, where the empirical risk functions are the summations of independent and identically distributed (iid) losses. The summands in the negative log partial likelihood function for censored survival data, however, are neither iid nor Lipschitz. We first approximate the negative log partial likelihood function by a sum of iid non-Lipschitz terms, then derive the non-asymptotic oracle inequalities for the lasso penalized Cox regression using pointwise arguments to tackle the difficulties caused by the lack of iid Lipschitz losses. PMID:24516328

  3. Incremental online learning in high dimensions.

    PubMed

    Vijayakumar, Sethu; D'Souza, Aaron; Schaal, Stefan

    2005-12-01

    Locally weighted projection regression (LWPR) is a new algorithm for incremental nonlinear function approximation in high-dimensional spaces with redundant and irrelevant input dimensions. At its core, it employs nonparametric regression with locally linear models. In order to stay computationally efficient and numerically robust, each local model performs the regression analysis with a small number of univariate regressions in selected directions in input space, in the spirit of partial least squares regression. We discuss when and how local learning techniques can successfully work in high-dimensional spaces and review the various techniques for local dimensionality reduction before finally deriving the LWPR algorithm. The properties of LWPR are that it (1) learns rapidly with second-order learning methods based on incremental training, (2) uses statistically sound stochastic leave-one-out cross-validation for learning without the need to memorize training data, (3) adjusts its weighting kernels based on only local information in order to minimize the danger of negative interference of incremental learning, (4) has a computational complexity that is linear in the number of inputs, and (5) can deal with a large number of (possibly redundant) inputs, as shown in various empirical evaluations with up to 90-dimensional data sets. For a probabilistic interpretation, predictive variance and confidence intervals are derived. To our knowledge, LWPR is the first truly incremental spatially localized learning method that can successfully and efficiently operate in very high-dimensional spaces.
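
    LWPR has no implementation in the common scientific Python stack, so the sketch below shows only the building block the abstract starts from: a locally weighted linear fit around each query point. The Gaussian kernel, bandwidth, and synthetic data are illustrative assumptions; full LWPR adds incremental updates and partial-least-squares-style local dimensionality reduction.

        import numpy as np

        def locally_weighted_predict(Xq, X, y, h=0.5):
            """Fit a separate weighted linear model around each query point."""
            X1 = np.hstack([np.ones((len(X), 1)), X])          # add intercept
            preds = []
            for xq in Xq:
                w = np.exp(-((X - xq) ** 2).sum(axis=1) / (2 * h ** 2))
                sw = np.sqrt(w)                                # weighted least squares
                beta, *_ = np.linalg.lstsq(sw[:, None] * X1, sw * y, rcond=None)
                preds.append(np.r_[1.0, xq] @ beta)
            return np.array(preds)

        rng = np.random.default_rng(1)
        X = rng.uniform(-3, 3, size=(300, 2))                  # dimension 1 is irrelevant
        y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(300)
        print(locally_weighted_predict(np.array([[0.5, 0.0], [2.0, -1.0]]), X, y))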

  4. AucPR: an AUC-based approach using penalized regression for disease prediction with high-dimensional omics data.

    PubMed

    Yu, Wenbao; Park, Taesung

    2014-01-01

    It is common to seek an optimal combination of markers for disease classification and prediction when multiple markers are available. Many approaches based on the area under the receiver operating characteristic curve (AUC) have been proposed. Existing work based on AUC in a high-dimensional context depends mainly on a non-parametric, smooth approximation of AUC; no work has used a parametric AUC-based approach for high-dimensional data. We propose an AUC-based approach using penalized regression (AucPR), a parametric method for obtaining a linear combination that maximizes the AUC. To obtain the AUC maximizer in a high-dimensional context, we transform a classical parametric AUC maximizer, used in low-dimensional contexts, into a regression framework and thus apply the penalized regression approach directly. Two kinds of penalization, lasso and elastic net, are considered. The parametric approach avoids some of the difficulties of a conventional non-parametric AUC-based approach, such as the lack of an appropriate concave objective function and the need for a prudent choice of the smoothing parameter. We apply the proposed AucPR to gene selection and classification using four real microarray data sets and synthetic data. Through numerical studies, AucPR is shown to perform better than penalized logistic regression and the non-parametric AUC-based method, in the sense of AUC and sensitivity for a given specificity, particularly when there are many correlated genes. We propose a powerful, parametric, and easily implementable linear classifier, AucPR, for gene selection and disease prediction with high-dimensional data. AucPR is recommended for its good prediction performance. Besides gene expression microarray data, AucPR can be applied to other types of high-dimensional omics data, such as miRNA and protein data.
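
    AucPR itself has no public implementation we can assume, so as a runnable point of reference this sketch fits the penalized logistic regression comparator mentioned in the record (elastic-net penalty via scikit-learn) on p >> n synthetic "omics" data and scores it by AUC.

        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.metrics import roc_auc_score
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(2)
        n, p = 200, 500                                 # p >> n, as in omics data
        X = rng.standard_normal((n, p))
        beta = np.zeros(p); beta[:10] = 1.0             # 10 informative "genes"
        y = (X @ beta + rng.standard_normal(n) > 0).astype(int)

        Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
        clf = LogisticRegression(penalty="elasticnet", solver="saga",
                                 l1_ratio=0.5, C=0.1, max_iter=5000)
        clf.fit(Xtr, ytr)
        print("test AUC:", roc_auc_score(yte, clf.decision_function(Xte)))
        print("selected features:", np.sum(clf.coef_ != 0))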

  5. SEMIPARAMETRIC QUANTILE REGRESSION WITH HIGH-DIMENSIONAL COVARIATES

    PubMed Central

    Zhu, Liping; Huang, Mian; Li, Runze

    2012-01-01

    This paper is concerned with quantile regression for a semiparametric regression model, in which both the conditional mean and conditional variance function of the response given the covariates admit a single-index structure. This semiparametric regression model enables us to reduce the dimension of the covariates and simultaneously retains the flexibility of nonparametric regression. Under mild conditions, we show that the simple linear quantile regression offers a consistent estimate of the index parameter vector. This is a surprising and interesting result because the single-index model is possibly misspecified under the linear quantile regression. With a root-n consistent estimate of the index vector, one may employ a local polynomial regression technique to estimate the conditional quantile function. This procedure is computationally efficient, which is very appealing in high-dimensional data analysis. We show that the resulting estimator of the quantile function performs asymptotically as efficiently as if the true value of the index vector were known. The methodologies are demonstrated through comprehensive simulation studies and an application to a real dataset. PMID:24501536

  6. Compound Identification Using Penalized Linear Regression on Metabolomics

    PubMed Central

    Liu, Ruiqi; Wu, Dongfeng; Zhang, Xiang; Kim, Seongho

    2014-01-01

    Compound identification is often achieved by matching the experimental mass spectra to the mass spectra stored in a reference library based on mass spectral similarity. Because the number of compounds in the reference library is much larger than the range of mass-to-charge ratio (m/z) values, the data are high-dimensional and suffer from singularity. For this reason, penalized linear regressions such as ridge regression and the lasso are used instead of the ordinary least squares regression. Furthermore, two-step approaches using the dot product and Pearson's correlation along with the penalized linear regression are proposed in this study. PMID:27212894
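
    A minimal sketch of the regression formulation described above: each library spectrum is a column of the design matrix, the query spectrum is the response, and ridge or lasso coefficients rank candidate compounds. The simulated spectra and penalty values are assumptions for illustration.

        import numpy as np
        from sklearn.linear_model import Ridge, Lasso

        rng = np.random.default_rng(3)
        n_mz, n_compounds = 300, 2000          # fewer m/z channels than compounds
        library = rng.exponential(1.0, size=(n_mz, n_compounds))
        truth = [5, 42]                        # the query mixes two library compounds
        query = library[:, truth].sum(axis=1) + 0.05 * rng.standard_normal(n_mz)

        # OLS is singular here (n_mz < n_compounds); penalized fits are not.
        ridge = Ridge(alpha=1.0).fit(library, query)
        lasso = Lasso(alpha=0.05, positive=True).fit(library, query)  # nonneg mixing

        print("top ridge matches:", np.argsort(-ridge.coef_)[:5])
        print("lasso support:", np.nonzero(lasso.coef_)[0])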

  7. Estimation and Selection via Absolute Penalized Convex Minimization And Its Multistage Adaptive Applications

    PubMed Central

    Huang, Jian; Zhang, Cun-Hui

    2013-01-01

    The ℓ1-penalized method, or the Lasso, has emerged as an important tool for the analysis of large data sets. Many important results have been obtained for the Lasso in linear regression which have led to a deeper understanding of high-dimensional statistical problems. In this article, we consider a class of weighted ℓ1-penalized estimators for convex loss functions of a general form, including the generalized linear models. We study the estimation, prediction, selection and sparsity properties of the weighted ℓ1-penalized estimator in sparse, high-dimensional settings where the number of predictors p can be much larger than the sample size n. Adaptive Lasso is considered as a special case. A multistage method is developed to approximate concave regularized estimation by applying an adaptive Lasso recursively. We provide prediction and estimation oracle inequalities for single- and multi-stage estimators, a general selection consistency theorem, and an upper bound for the dimension of the Lasso estimator. Important models including the linear regression, logistic regression and log-linear models are used throughout to illustrate the applications of the general results. PMID:24348100
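
    The adaptive Lasso special case mentioned in the record can be sketched with the standard column-rescaling trick: weight the penalty by the inverse of an initial lasso fit, absorb the weights into the design, and refit. All tuning constants and data below are illustrative.

        import numpy as np
        from sklearn.linear_model import Lasso

        rng = np.random.default_rng(4)
        n, p = 100, 300
        X = rng.standard_normal((n, p))
        beta = np.zeros(p); beta[:5] = [3, -2, 1.5, 1, -1]
        y = X @ beta + rng.standard_normal(n)

        init = Lasso(alpha=0.1).fit(X, y).coef_           # stage 1: plain lasso
        w = 1.0 / np.maximum(np.abs(init), 1e-8)          # adaptive weights
        fit = Lasso(alpha=0.1).fit(X / w, y)              # weighted lasso via rescaling
        beta_adaptive = fit.coef_ / w                     # undo the rescaling
        print("support:", np.nonzero(beta_adaptive)[0])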

  8. Testing a single regression coefficient in high dimensional linear models

    PubMed Central

    Zhong, Ping-Shou; Li, Runze; Wang, Hansheng; Tsai, Chih-Ling

    2017-01-01

    In linear regression models with high dimensional data, the classical z-test (or t-test) for testing the significance of each single regression coefficient is no longer applicable. This is mainly because the number of covariates exceeds the sample size. In this paper, we propose a simple and novel alternative by introducing the Correlated Predictors Screening (CPS) method to control for predictors that are highly correlated with the target covariate. Accordingly, the classical ordinary least squares approach can be employed to estimate the regression coefficient associated with the target covariate. In addition, we demonstrate that the resulting estimator is consistent and asymptotically normal even if the random errors are heteroscedastic. This enables us to apply the z-test to assess the significance of each covariate. Based on the p-value obtained from testing the significance of each covariate, we further conduct multiple hypothesis testing by controlling the false discovery rate at the nominal level. Then, we show that the multiple hypothesis testing achieves consistent model selection. Simulation studies and empirical examples are presented to illustrate the finite sample performance and the usefulness of the proposed method, respectively. PMID:28663668

  9. Testing a single regression coefficient in high dimensional linear models.

    PubMed

    Lan, Wei; Zhong, Ping-Shou; Li, Runze; Wang, Hansheng; Tsai, Chih-Ling

    2016-11-01

    In linear regression models with high dimensional data, the classical z-test (or t-test) for testing the significance of each single regression coefficient is no longer applicable. This is mainly because the number of covariates exceeds the sample size. In this paper, we propose a simple and novel alternative by introducing the Correlated Predictors Screening (CPS) method to control for predictors that are highly correlated with the target covariate. Accordingly, the classical ordinary least squares approach can be employed to estimate the regression coefficient associated with the target covariate. In addition, we demonstrate that the resulting estimator is consistent and asymptotically normal even if the random errors are heteroscedastic. This enables us to apply the z-test to assess the significance of each covariate. Based on the p-value obtained from testing the significance of each covariate, we further conduct multiple hypothesis testing by controlling the false discovery rate at the nominal level. Then, we show that the multiple hypothesis testing achieves consistent model selection. Simulation studies and empirical examples are presented to illustrate the finite sample performance and the usefulness of the proposed method, respectively.
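
    A rough sketch of the CPS idea described in the two records above, under our own illustrative choices: screen the k predictors most correlated with the target covariate as controls, then run OLS with heteroscedasticity-robust standard errors to obtain a z-test for the target coefficient.

        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(5)
        n, p = 150, 400                                   # p > n
        X = rng.standard_normal((n, p))
        y = 0.8 * X[:, 0] + X[:, 1] - X[:, 2] + rng.standard_normal(n)

        target = 0
        others = np.delete(np.arange(p), target)
        corr = np.abs([np.corrcoef(X[:, target], X[:, j])[0, 1] for j in others])
        controls = others[np.argsort(-corr)[:10]]         # screen k = 10 predictors

        design = sm.add_constant(X[:, np.r_[target, controls]])
        fit = sm.OLS(y, design).fit(cov_type="HC3")       # heteroscedasticity-robust
        print("z =", fit.tvalues[1], " p =", fit.pvalues[1])  # target coefficient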

  10. Revisiting the Scale-Invariant, Two-Dimensional Linear Regression Method

    ERIC Educational Resources Information Center

    Patzer, A. Beate C.; Bauer, Hans; Chang, Christian; Bolte, Jan; Sülzle, Detlev

    2018-01-01

    The scale-invariant way to analyze two-dimensional experimental and theoretical data with statistical errors in both the independent and dependent variables is revisited by using what we call the triangular linear regression method. This is compared to the standard least-squares fit approach by applying it to typical simple sets of example data…

  11. High-Dimensional Intrinsic Interpolation Using Gaussian Process Regression and Diffusion Maps

    DOE PAGES

    Thimmisetty, Charanraj A.; Ghanem, Roger G.; White, Joshua A.; ...

    2017-10-10

    This article considers the challenging task of estimating geologic properties of interest using a suite of proxy measurements. The current work recasts this task as a manifold learning problem. In this process, this article introduces a novel regression procedure for intrinsic variables constrained onto a manifold embedded in an ambient space. The procedure is meant to sharpen high-dimensional interpolation by inferring non-linear correlations from the data being interpolated. The proposed approach augments manifold learning procedures with a Gaussian process regression. It first identifies, using diffusion maps, a low-dimensional manifold embedded in an ambient high-dimensional space associated with the data. It relies on the diffusion distance associated with this construction to define a distance function with which the data model is equipped. This distance metric function is then used to compute the correlation structure of a Gaussian process that describes the statistical dependence of quantities of interest in the high-dimensional ambient space. The proposed method is applicable to arbitrarily high-dimensional data sets. Here, it is applied to subsurface characterization using a suite of well log measurements. The predictions obtained in original, principal component, and diffusion space are compared using both qualitative and quantitative metrics. Considerable improvement in the prediction of the geological structural properties is observed with the proposed method.
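
    A hedged sketch of the two-stage idea (not the authors' pipeline): compute a textbook diffusion-map embedding of high-dimensional points lying near a low-dimensional manifold, then fit a Gaussian process regression on the embedded coordinates. The kernel bandwidth, manifold, and train/test split are invented for illustration.

        import numpy as np
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.metrics import pairwise_distances

        def diffusion_map(X, eps, n_coords=2):
            """Textbook diffusion-map embedding (a sketch, not the authors' code)."""
            K = np.exp(-pairwise_distances(X) ** 2 / eps)
            P = K / K.sum(axis=1, keepdims=True)          # row-stochastic transitions
            vals, vecs = np.linalg.eig(P)                 # may be complex; keep real parts
            idx = np.argsort(-vals.real)[1:n_coords + 1]  # drop the trivial eigenpair
            return vecs[:, idx].real * vals[idx].real

        rng = np.random.default_rng(6)
        t = rng.uniform(0, 2 * np.pi, 400)
        X = np.c_[np.cos(t), np.sin(t), 0.05 * rng.standard_normal((400, 8))]  # circle in 10-d
        y = np.sin(t) + 0.1 * rng.standard_normal(400)    # varies along the manifold

        Z = diffusion_map(X, eps=0.5)
        gpr = GaussianProcessRegressor().fit(Z[:300], y[:300])
        print("held-out R^2:", round(gpr.score(Z[300:], y[300:]), 3))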

  12. High-Dimensional Intrinsic Interpolation Using Gaussian Process Regression and Diffusion Maps

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Thimmisetty, Charanraj A.; Ghanem, Roger G.; White, Joshua A.

    This article considers the challenging task of estimating geologic properties of interest using a suite of proxy measurements. The current work recasts this task as a manifold learning problem. In this process, this article introduces a novel regression procedure for intrinsic variables constrained onto a manifold embedded in an ambient space. The procedure is meant to sharpen high-dimensional interpolation by inferring non-linear correlations from the data being interpolated. The proposed approach augments manifold learning procedures with a Gaussian process regression. It first identifies, using diffusion maps, a low-dimensional manifold embedded in an ambient high-dimensional space associated with the data. It relies on the diffusion distance associated with this construction to define a distance function with which the data model is equipped. This distance metric function is then used to compute the correlation structure of a Gaussian process that describes the statistical dependence of quantities of interest in the high-dimensional ambient space. The proposed method is applicable to arbitrarily high-dimensional data sets. Here, it is applied to subsurface characterization using a suite of well log measurements. The predictions obtained in original, principal component, and diffusion space are compared using both qualitative and quantitative metrics. Considerable improvement in the prediction of the geological structural properties is observed with the proposed method.

  13. Stratification for the propensity score compared with linear regression techniques to assess the effect of treatment or exposure.

    PubMed

    Senn, Stephen; Graf, Erika; Caputo, Angelika

    2007-12-30

    Stratifying and matching by the propensity score are increasingly popular approaches to deal with confounding in medical studies investigating effects of a treatment or exposure. A more traditional alternative technique is the direct adjustment for confounding in regression models. This paper discusses fundamental differences between the two approaches, with a focus on linear regression and propensity score stratification, and identifies points to be considered for an adequate comparison. The treatment estimators are examined for unbiasedness and efficiency. This is illustrated in an application to real data and supplemented by an investigation on properties of the estimators for a range of underlying linear models. We demonstrate that in specific circumstances the propensity score estimator is identical to the effect estimated from a full linear model, even if it is built on coarser covariate strata than the linear model. As a consequence, the coarsening property of the propensity score (adjustment for a one-dimensional confounder instead of a high-dimensional covariate) may be viewed as a way to implement a pre-specified, richly parametrized linear model. We conclude that the propensity score estimator inherits the potential for overfitting and that care should be taken to restrict covariates to those relevant for outcome. Copyright (c) 2007 John Wiley & Sons, Ltd.
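
    A compact sketch contrasting the two estimators compared in this record: direct linear-regression adjustment versus stratification on an estimated propensity score. The quintile strata, confounder model, and true effect of 2 are illustrative assumptions.

        import numpy as np
        from sklearn.linear_model import LogisticRegression, LinearRegression

        rng = np.random.default_rng(7)
        n = 2000
        x = rng.standard_normal((n, 3))                      # confounders
        p_treat = 1 / (1 + np.exp(-(x @ np.array([0.8, -0.5, 0.3]))))
        t = (rng.uniform(size=n) < p_treat).astype(int)
        y = 2.0 * t + x @ np.array([1.0, 1.0, -1.0]) + rng.standard_normal(n)

        # (a) direct linear-regression adjustment
        ols = LinearRegression().fit(np.c_[t, x], y)
        print("OLS estimate:", ols.coef_[0])

        # (b) stratification on the estimated propensity score (quintiles)
        ps = LogisticRegression().fit(x, t).predict_proba(x)[:, 1]
        strata = np.digitize(ps, np.quantile(ps, [0.2, 0.4, 0.6, 0.8]))
        effects, sizes = [], []
        for s in range(5):
            m = strata == s
            effects.append(y[m][t[m] == 1].mean() - y[m][t[m] == 0].mean())
            sizes.append(m.sum())
        print("PS-stratified estimate:", np.average(effects, weights=sizes))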

  14. High dimensional linear regression models under long memory dependence and measurement error

    NASA Astrophysics Data System (ADS)

    Kaul, Abhishek

    This dissertation consists of three chapters. The first chapter introduces the models under consideration and motivates the problems of interest; a brief literature review is also provided. The second chapter investigates the properties of Lasso under long range dependent model errors. Lasso is a computationally efficient approach to model selection and estimation, and its properties are well studied when the regression errors are independent and identically distributed. We study the case where the regression errors form a long memory moving average process. We establish a finite sample oracle inequality for the Lasso solution. We then show the asymptotic sign consistency in this setup. These results are established in the high dimensional setup (p > n) where p can increase exponentially with n. Finally, we show the n^(1/2-d)-consistency of Lasso, along with the oracle property of adaptive Lasso, in the case where p is fixed; here d is the memory parameter of the stationary error sequence. The performance of Lasso in the present setup is also analysed with a simulation study. The third chapter proposes and investigates the properties of a penalized quantile based estimator for measurement error models. Standard formulations of prediction problems in high dimension regression models assume the availability of fully observed covariates and sub-Gaussian and homogeneous model errors. This makes these methods inapplicable to measurement error models where covariates are unobservable and observations are possibly non sub-Gaussian and heterogeneous. We propose weighted penalized corrected quantile estimators for the regression parameter vector in linear regression models with additive measurement errors, where the unobservable covariates are nonrandom. The proposed estimators forgo the need for the above mentioned model assumptions. We study these estimators in both the fixed dimensional and high dimensional sparse setups; in the latter setup, the dimensionality can grow exponentially with the sample size. In the fixed dimensional setting we provide the oracle properties associated with the proposed estimators. In the high dimensional setting, we provide bounds for the statistical error associated with the estimation, holding with asymptotic probability 1, thereby providing the ℓ1-consistency of the proposed estimator. We also establish model selection consistency in terms of the correctly estimated zero components of the parameter vector. A simulation study investigating the finite sample accuracy of the proposed estimator is also included in this chapter.

  15. Prediction of clinical depression scores and detection of changes in whole-brain using resting-state functional MRI data with partial least squares regression

    PubMed Central

    Shimizu, Yu; Yoshimoto, Junichiro; Takamura, Masahiro; Okada, Go; Okamoto, Yasumasa; Yamawaki, Shigeto; Doya, Kenji

    2017-01-01

    In diagnostic applications of statistical machine learning methods to brain imaging data, common problems include data high-dimensionality and co-linearity, which often cause over-fitting and instability. To overcome these problems, we applied partial least squares (PLS) regression to resting-state functional magnetic resonance imaging (rs-fMRI) data, creating a low-dimensional representation that relates symptoms to brain activity and that predicts clinical measures. Our experimental results, based upon data from clinically depressed patients and healthy controls, demonstrated that PLS and its kernel variants provided significantly better prediction of clinical measures than ordinary linear regression. Subsequent classification using predicted clinical scores distinguished depressed patients from healthy controls with 80% accuracy. Moreover, loading vectors for latent variables enabled us to identify brain regions relevant to depression, including the default mode network, the right superior frontal gyrus, and the superior motor area. PMID:28700672
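
    A minimal sketch of the prediction step (not the authors' fMRI pipeline): PLS regression compresses many correlated features into a few latent components and predicts a clinical score, evaluated here by cross-validation on synthetic data with invented dimensions.

        import numpy as np
        from sklearn.cross_decomposition import PLSRegression
        from sklearn.model_selection import cross_val_predict

        rng = np.random.default_rng(8)
        n, p = 60, 5000                          # few subjects, many connectivity features
        X = rng.standard_normal((n, p))
        w = np.zeros(p); w[:50] = rng.standard_normal(50)
        score = X @ w / 10 + rng.standard_normal(n)           # "clinical score"

        pls = PLSRegression(n_components=3)                   # low-dim latent model
        pred = cross_val_predict(pls, X, score, cv=5).ravel()
        print("corr(pred, score):", np.corrcoef(pred, score)[0, 1])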

  16. Digital Image Restoration Under a Regression Model - The Unconstrained, Linear Equality and Inequality Constrained Approaches

    DTIC Science & Technology

    1974-01-01

    Report 520, January 1974, by Nelson Delfino d'Avila Mascarenhas. Digital image restoration is treated under a regression model in its unconstrained, linear equality constrained, and inequality constrained formulations. A two-dimensional form adequately describes the linear model, and a discretization is performed by using quadrature methods.

  17. The comparison of robust partial least squares regression with robust principal component regression on a real

    NASA Astrophysics Data System (ADS)

    Polat, Esra; Gunay, Suleyman

    2013-10-01

    One of the problems encountered in Multiple Linear Regression (MLR) is multicollinearity, which causes the overestimation of the regression parameters and increases the variance of these parameters. Hence, when multicollinearity is present, biased estimation procedures such as classical Principal Component Regression (CPCR) and Partial Least Squares Regression (PLSR) are performed. SIMPLS is the leading PLSR algorithm because of its speed and efficiency, and its results are easier to interpret. However, both CPCR and SIMPLS yield very unreliable results when the data set contains outlying observations. Therefore, Hubert and Vanden Branden (2003) presented a robust PCR (RPCR) method and a robust PLSR (RPLSR) method called RSIMPLS. In RPCR, a robust Principal Component Analysis (PCA) method for high-dimensional data is first applied to the independent variables; then the dependent variables are regressed on the scores using a robust regression method. RSIMPLS is constructed from a robust covariance matrix for high-dimensional data and robust linear regression. The purpose of this study is to show the usage of the RPCR and RSIMPLS methods on an econometric data set, making a comparison of the two methods on an inflation model of Turkey. The considered methods have been compared in terms of predictive ability and goodness of fit by using a robust Root Mean Squared Error of Cross-validation (R-RMSECV), a robust R2 value and the Robust Component Selection (RCS) statistic.

  18. Spectral-Spatial Shared Linear Regression for Hyperspectral Image Classification.

    PubMed

    Haoliang Yuan; Yuan Yan Tang

    2017-04-01

    Classification of the pixels in a hyperspectral image (HSI) is an important task and has been popularly applied in many practical applications. Its major challenge is the high-dimension, small-sample-size problem. To deal with this problem, many subspace learning (SL) methods have been developed to reduce the dimension of the pixels while preserving the important discriminant information. Motivated by the ridge linear regression (RLR) framework for SL, we propose a spectral-spatial shared linear regression method (SSSLR) for extracting the feature representation. Compared with RLR, our proposed SSSLR has the following two advantages. First, we utilize a convex set to explore the spatial structure for computing the linear projection matrix. Second, we utilize a shared structure learning model, formed by the original data space and a hidden feature space, to learn a more discriminant linear projection matrix for classification. To optimize our proposed method, an efficient iterative algorithm is proposed. Experimental results on two popular HSI data sets, i.e., Indian Pines and Salinas, demonstrate that our proposed methods outperform many SL methods.

  19. Majorization Minimization by Coordinate Descent for Concave Penalized Generalized Linear Models

    PubMed Central

    Jiang, Dingfeng; Huang, Jian

    2013-01-01

    Recent studies have demonstrated theoretical attractiveness of a class of concave penalties in variable selection, including the smoothly clipped absolute deviation and minimax concave penalties. The computation of the concave penalized solutions in high-dimensional models, however, is a difficult task. We propose a majorization minimization by coordinate descent (MMCD) algorithm for computing the concave penalized solutions in generalized linear models. In contrast to the existing algorithms that use local quadratic or local linear approximation to the penalty function, the MMCD seeks to majorize the negative log-likelihood by a quadratic loss, but does not use any approximation to the penalty. This strategy makes it possible to avoid the computation of a scaling factor in each update of the solutions, which improves the efficiency of coordinate descent. Under certain regularity conditions, we establish theoretical convergence property of the MMCD. We implement this algorithm for a penalized logistic regression model using the SCAD and MCP penalties. Simulation studies and a data example demonstrate that the MMCD works sufficiently fast for the penalized logistic regression in high-dimensional settings where the number of covariates is much larger than the sample size. PMID:25309048

  20. Sufficient Forecasting Using Factor Models

    PubMed Central

    Fan, Jianqing; Xue, Lingzhou; Yao, Jiawei

    2017-01-01

    We consider forecasting a single time series when there is a large number of predictors and a possible nonlinear effect. The dimensionality is first reduced via a high-dimensional (approximate) factor model implemented by the principal component analysis. Using the extracted factors, we develop a novel forecasting method called the sufficient forecasting, which provides a set of sufficient predictive indices, inferred from high-dimensional predictors, to deliver additional predictive power. The projected principal component analysis is employed to enhance the accuracy of inferred factors when a semi-parametric (approximate) factor model is assumed. Our method is also applicable to cross-sectional sufficient regression using extracted factors. The connection between the sufficient forecasting and the deep learning architecture is explicitly stated. The sufficient forecasting correctly estimates projection indices of the underlying factors even in the presence of a nonparametric forecasting function. The proposed method extends the sufficient dimension reduction to high-dimensional regimes by condensing the cross-sectional information through factor models. We derive asymptotic properties for the estimate of the central subspace spanned by these projection directions as well as the estimates of the sufficient predictive indices. We further show that the natural method of running multiple regression of target on estimated factors yields a linear estimate that actually falls into this central subspace. Our method and theory allow the number of predictors to be larger than the number of observations. We finally demonstrate that the sufficient forecasting improves upon the linear forecasting in both simulation studies and an empirical study of forecasting macroeconomic variables. PMID:29731537
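
    A sketch of the simplest estimator discussed in the record, regression of the target on factors extracted by principal component analysis from a large predictor panel; the factor model and all dimensions below are illustrative.

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.linear_model import LinearRegression

        rng = np.random.default_rng(9)
        T, p, k = 200, 500, 3                          # observations, predictors, factors
        F = rng.standard_normal((T, k))                # latent factors
        L = rng.standard_normal((p, k))                # loadings
        X = F @ L.T + rng.standard_normal((T, p))      # approximate factor model
        y = F @ np.array([1.0, -0.5, 0.0]) + 0.2 * rng.standard_normal(T)

        F_hat = PCA(n_components=k).fit_transform(X)   # factors via principal components
        fit = LinearRegression().fit(F_hat, y)         # regression of target on factors
        print("R^2 of target on estimated factors:", round(fit.score(F_hat, y), 3))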

  1. Effect of Contact Damage on the Strength of Ceramic Materials.

    DTIC Science & Technology

    1982-10-01

    Variables that are important to erosion are identified, and a multivariate linear regression analysis is used to fit the data to the dimensional analysis. The exponents of Equations 7 and 8 are computed by a multivariable regression analysis of the room-temperature data, reporting the regression estimates together with their standard errors and computed coefficients of determination.

  2. Hyper-Spectral Image Analysis With Partially Latent Regression and Spatial Markov Dependencies

    NASA Astrophysics Data System (ADS)

    Deleforge, Antoine; Forbes, Florence; Ba, Sileye; Horaud, Radu

    2015-09-01

    Hyper-spectral data can be analyzed to recover physical properties at large planetary scales. This involves resolving inverse problems which can be addressed within machine learning, with the advantage that, once a relationship between physical parameters and spectra has been established in a data-driven fashion, the learned relationship can be used to estimate physical parameters for new hyper-spectral observations. Within this framework, we propose a spatially-constrained and partially-latent regression method which maps high-dimensional inputs (hyper-spectral images) onto low-dimensional responses (physical parameters such as the local chemical composition of the soil). The proposed regression model comprises two key features. First, it combines a Gaussian mixture of locally-linear mappings (GLLiM) with a partially-latent response model. While the former makes high-dimensional regression tractable, the latter makes it possible to deal with physical parameters that cannot be observed or, more generally, with data contaminated by experimental artifacts that cannot be explained with noise models. Second, spatial constraints are introduced in the model through a Markov random field (MRF) prior which provides a spatial structure to the Gaussian-mixture hidden variables. Experiments conducted on a database of remotely sensed observations collected from Mars by the Mars Express orbiter demonstrate the effectiveness of the proposed model.

  3. SPReM: Sparse Projection Regression Model For High-dimensional Linear Regression *

    PubMed Central

    Sun, Qiang; Zhu, Hongtu; Liu, Yufeng; Ibrahim, Joseph G.

    2014-01-01

    The aim of this paper is to develop a sparse projection regression modeling (SPReM) framework to perform multivariate regression modeling with a large number of responses and a multivariate covariate of interest. We propose two novel heritability ratios to simultaneously perform dimension reduction, response selection, estimation, and testing, while explicitly accounting for correlations among multivariate responses. Our SPReM is devised to specifically address the low statistical power issue of many standard statistical approaches, such as the Hotelling's T^2 test statistic or a mass univariate analysis, for high-dimensional data. We formulate the estimation problem of SPReM as a novel sparse unit rank projection (SURP) problem and propose a fast optimization algorithm for SURP. Furthermore, we extend SURP to the sparse multi-rank projection (SMURP) by adopting a sequential SURP approximation. Theoretically, we have systematically investigated the convergence properties of SURP and the convergence rate of SURP estimates. Our simulation results and real data analysis have shown that SPReM outperforms other state-of-the-art methods. PMID:26527844

  4. STRONG ORACLE OPTIMALITY OF FOLDED CONCAVE PENALIZED ESTIMATION.

    PubMed

    Fan, Jianqing; Xue, Lingzhou; Zou, Hui

    2014-06-01

    Folded concave penalization methods have been shown to enjoy the strong oracle property for high-dimensional sparse estimation. However, a folded concave penalization problem usually has multiple local solutions and the oracle property is established only for one of the unknown local solutions. A challenging fundamental issue still remains that it is not clear whether the local optimum computed by a given optimization algorithm possesses those nice theoretical properties. To close this important theoretical gap in over a decade, we provide a unified theory to show explicitly how to obtain the oracle solution via the local linear approximation algorithm. For a folded concave penalized estimation problem, we show that as long as the problem is localizable and the oracle estimator is well behaved, we can obtain the oracle estimator by using the one-step local linear approximation. In addition, once the oracle estimator is obtained, the local linear approximation algorithm converges, namely it produces the same estimator in the next iteration. The general theory is demonstrated by using four classical sparse estimation problems, i.e., sparse linear regression, sparse logistic regression, sparse precision matrix estimation and sparse quantile regression.

  5. STRONG ORACLE OPTIMALITY OF FOLDED CONCAVE PENALIZED ESTIMATION

    PubMed Central

    Fan, Jianqing; Xue, Lingzhou; Zou, Hui

    2014-01-01

    Folded concave penalization methods have been shown to enjoy the strong oracle property for high-dimensional sparse estimation. However, a folded concave penalization problem usually has multiple local solutions and the oracle property is established only for one of the unknown local solutions. A challenging fundamental issue still remains that it is not clear whether the local optimum computed by a given optimization algorithm possesses those nice theoretical properties. To close this important theoretical gap in over a decade, we provide a unified theory to show explicitly how to obtain the oracle solution via the local linear approximation algorithm. For a folded concave penalized estimation problem, we show that as long as the problem is localizable and the oracle estimator is well behaved, we can obtain the oracle estimator by using the one-step local linear approximation. In addition, once the oracle estimator is obtained, the local linear approximation algorithm converges, namely it produces the same estimator in the next iteration. The general theory is demonstrated by using four classical sparse estimation problems, i.e., sparse linear regression, sparse logistic regression, sparse precision matrix estimation and sparse quantile regression. PMID:25598560
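
    The local linear approximation (LLA) step described in the two records above reduces folded concave (here SCAD) penalization to a sequence of weighted lasso problems, which a column-rescaling trick maps onto a standard lasso solver. Everything numeric below (tuning constants, the small weight floor added for numerical stability) is an illustrative assumption.

        import numpy as np
        from sklearn.linear_model import Lasso

        def scad_deriv(t, lam, a=3.7):
            """Derivative of the SCAD penalty (Fan and Li) at t = |beta_j|."""
            return np.where(t <= lam, lam, np.maximum(a * lam - t, 0.0) / (a - 1))

        rng = np.random.default_rng(10)
        n, p = 120, 400
        X = rng.standard_normal((n, p))
        beta = np.zeros(p); beta[:4] = [2.0, -2.0, 1.0, -1.0]
        y = X @ beta + rng.standard_normal(n)

        lam = 0.15
        b = Lasso(alpha=lam).fit(X, y).coef_          # lasso initializer
        for _ in range(2):                            # one-step LLA, once more to check
            w = scad_deriv(np.abs(b), lam) / lam      # weights; ~0 for large coefficients
            w = np.maximum(w, 1e-3)                   # floor keeps rescaling stable
            fit = Lasso(alpha=lam, max_iter=50000).fit(X / w, y)
            b = fit.coef_ / w                         # undo the column rescaling
        print("support:", np.nonzero(b)[0])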

  6. Bayesian feature selection for high-dimensional linear regression via the Ising approximation with applications to genomics.

    PubMed

    Fisher, Charles K; Mehta, Pankaj

    2015-06-01

    Feature selection, identifying a subset of variables that are relevant for predicting a response, is an important and challenging component of many methods in statistics and machine learning. Feature selection is especially difficult and computationally intensive when the number of variables approaches or exceeds the number of samples, as is often the case for many genomic datasets. Here, we introduce a new approach, the Bayesian Ising Approximation (BIA), to rapidly calculate posterior probabilities for feature relevance in L2 penalized linear regression. In the regime where the regression problem is strongly regularized by the prior, we show that computing the marginal posterior probabilities for features is equivalent to computing the magnetizations of an Ising model with weak couplings. Using a mean field approximation, we show it is possible to rapidly compute the feature selection path described by the posterior probabilities as a function of the L2 penalty. We present simulations and analytical results illustrating the accuracy of the BIA on some simple regression problems. Finally, we demonstrate the applicability of the BIA to high-dimensional regression by analyzing a gene expression dataset with nearly 30 000 features. These results also highlight the impact of correlations between features on Bayesian feature selection. An implementation of the BIA in C++, along with data for reproducing our gene expression analyses, are freely available at http://physics.bu.edu/∼pankajm/BIACode. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  7. Big Data Toolsets to Pharmacometrics: Application of Machine Learning for Time-to-Event Analysis.

    PubMed

    Gong, Xiajing; Hu, Meng; Zhao, Liang

    2018-05-01

    Additional value can be potentially created by applying big data tools to address pharmacometric problems. The performances of machine learning (ML) methods and the Cox regression model were evaluated based on simulated time-to-event data synthesized under various preset scenarios, i.e., with linear vs. nonlinear and dependent vs. independent predictors in the proportional hazard function, or with high-dimensional data featured by a large number of predictor variables. Our results showed that ML-based methods outperformed the Cox model in prediction performance as assessed by concordance index and in identifying the preset influential variables for high-dimensional data. The prediction performances of ML-based methods are also less sensitive to data size and censoring rates than the Cox regression model. In conclusion, ML-based methods provide a powerful tool for time-to-event analysis, with a built-in capacity for high-dimensional data and better performance when the predictor variables assume nonlinear relationships in the hazard function. © 2018 The Authors. Clinical and Translational Science published by Wiley Periodicals, Inc. on behalf of American Society for Clinical Pharmacology and Therapeutics.

  8. A High-Dimensional, Multivariate Copula Approach to Modeling Multivariate Agricultural Price Relationships and Tail Dependencies

    Treesearch

    Xuan Chi; Barry Goodwin

    2012-01-01

    Spatial and temporal relationships among agricultural prices have been an important topic of applied research for many years. Such research is used to investigate the performance of markets and to examine linkages up and down the marketing chain. This research has empirically evaluated price linkages by using correlation and regression models and, later, linear and...

  9. Robust Head-Pose Estimation Based on Partially-Latent Mixture of Linear Regressions.

    PubMed

    Drouard, Vincent; Horaud, Radu; Deleforge, Antoine; Ba, Sileye; Evangelidis, Georgios

    2017-03-01

    Head-pose estimation has many applications, such as social event analysis, human-robot and human-computer interaction, driving assistance, and so forth. Head-pose estimation is challenging, because it must cope with changing illumination conditions, variabilities in face orientation and in appearance, partial occlusions of facial landmarks, as well as bounding-box-to-face alignment errors. We propose to use a mixture of linear regressions with partially-latent output. This regression method learns to map high-dimensional feature vectors (extracted from bounding boxes of faces) onto the joint space of head-pose angles and bounding-box shifts, such that they are robustly predicted in the presence of unobservable phenomena. We describe in detail the mapping method that combines the merits of unsupervised manifold learning techniques and of mixtures of regressions. We validate our method with three publicly available data sets and we thoroughly benchmark four variants of the proposed algorithm with several state-of-the-art head-pose estimation methods.

  10. Some comparisons of complexity in dictionary-based and linear computational models.

    PubMed

    Gnecco, Giorgio; Kůrková, Věra; Sanguineti, Marcello

    2011-03-01

    Neural networks provide a more flexible approximation of functions than traditional linear regression. In the latter, one can only adjust the coefficients in linear combinations of fixed sets of functions, such as orthogonal polynomials or Hermite functions, while for neural networks, one may also adjust the parameters of the functions which are being combined. However, some useful properties of linear approximators (such as uniqueness, homogeneity, and continuity of best approximation operators) are not satisfied by neural networks. Moreover, optimization of parameters in neural networks becomes more difficult than in linear regression. Experimental results suggest that these drawbacks of neural networks are offset by substantially lower model complexity, allowing accuracy of approximation even in high-dimensional cases. We give some theoretical results comparing requirements on model complexity for two types of approximators, the traditional linear ones and so-called variable-basis types, which include neural networks, radial, and kernel models. We compare upper bounds on worst-case errors in variable-basis approximation with lower bounds on such errors for any linear approximator. Using methods from nonlinear approximation and integral representations tailored to computational units, we describe some cases where neural networks outperform any linear approximator. Copyright © 2010 Elsevier Ltd. All rights reserved.

  11. Decomposition and model selection for large contingency tables.

    PubMed

    Dahinden, Corinne; Kalisch, Markus; Bühlmann, Peter

    2010-04-01

    Large contingency tables summarizing categorical variables arise in many areas. One example is in biology, where large numbers of biomarkers are cross-tabulated according to their discrete expression level. Interactions of the variables are of great interest and are generally studied with log-linear models. The structure of a log-linear model can be visually represented by a graph from which the conditional independence structure can then be easily read off. However, since the number of parameters in a saturated model grows exponentially in the number of variables, this generally comes with a heavy computational burden. Even if we restrict ourselves to models of lower-order interactions or other sparse structures, we are faced with the problem of a large number of cells which play the role of sample size. This is in sharp contrast to high-dimensional regression or classification procedures because, in addition to a high-dimensional parameter, we also have to deal with the analogue of a huge sample size. Furthermore, high-dimensional tables naturally feature a large number of sampling zeros which often leads to the nonexistence of the maximum likelihood estimate. We therefore present a decomposition approach, where we first divide the problem into several lower-dimensional problems and then combine these to form a global solution. Our methodology is computationally feasible for log-linear interaction models with many categorical variables each or some of them having many levels. We demonstrate the proposed method on simulated data and apply it to a bio-medical problem in cancer research.

  12. A Review on Dimension Reduction

    PubMed Central

    Ma, Yanyuan; Zhu, Liping

    2013-01-01

    Summarizing the effect of many covariates through a few linear combinations is an effective way of reducing covariate dimension and is the backbone of (sufficient) dimension reduction. Because the replacement of high-dimensional covariates by low-dimensional linear combinations is performed with a minimum assumption on the specific regression form, it enjoys attractive advantages as well as encounters unique challenges in comparison with the variable selection approach. We review the current literature of dimension reduction with an emphasis on the two most popular models, where the dimension reduction affects the conditional distribution and the conditional mean, respectively. We discuss various estimation and inference procedures in different levels of detail, with the intention of focusing on their underlying ideas instead of technicalities. We also discuss some unsolved problems in this area for potential future research. PMID:23794782
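
    Sliced inverse regression (SIR) is the classic example of the covariate-combination reduction such a review covers; below is a minimal numpy version under the standard linearity condition, with invented single-index data.

        import numpy as np

        def sir_directions(X, y, n_slices=10, n_dirs=1):
            """Sliced inverse regression (Li, 1991): top eigenvectors of the
            between-slice covariance of whitened covariates."""
            n, p = X.shape
            L = np.linalg.cholesky(np.cov(X.T))            # Sigma = L L^T
            Z = np.linalg.solve(L, (X - X.mean(0)).T).T    # whitened covariates
            M = np.zeros((p, p))
            for s in np.array_split(np.argsort(y), n_slices):
                m = Z[s].mean(axis=0)
                M += (len(s) / n) * np.outer(m, m)
            _, vecs = np.linalg.eigh(M)                    # ascending eigenvalues
            return np.linalg.solve(L.T, vecs[:, -n_dirs:]) # back to original scale

        rng = np.random.default_rng(11)
        X = rng.standard_normal((500, 10))
        beta = np.array([1.0, 1.0] + [0.0] * 8)
        y = (X @ beta) ** 3 + 0.1 * rng.standard_normal(500)  # nonlinear single index
        d = sir_directions(X, y).ravel()
        print(np.round(d / np.linalg.norm(d), 2))          # ~ proportional to beta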

  13. Nonparametric regression applied to quantitative structure-activity relationships

    PubMed

    Constans; Hirst

    2000-03-01

    Several nonparametric regressors have been applied to modeling quantitative structure-activity relationship (QSAR) data. The simplest regressor, the Nadaraya-Watson, was assessed in a genuine multivariate setting. Other regressors, the local linear and the shifted Nadaraya-Watson, were implemented within additive models, a computationally more expedient approach, better suited for low-density designs. Performances were benchmarked against the nonlinear method of smoothing splines. A linear reference point was provided by multilinear regression (MLR). Variable selection was explored using systematic combinations of different variables and combinations of principal components. For the data set examined, 47 inhibitors of dopamine beta-hydroxylase, the additive nonparametric regressors have greater predictive accuracy (as measured by the mean absolute error of the predictions or the Pearson correlation in cross-validation trials) than MLR. The use of principal components did not improve the performance of the nonparametric regressors over use of the original descriptors, since the original descriptors are not strongly correlated. It remains to be seen if the nonparametric regressors can be successfully coupled with better variable selection and dimensionality reduction in the context of high-dimensional QSARs.
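
    The Nadaraya-Watson regressor assessed in the record is nearly a one-liner: predictions are kernel-weighted averages of observed responses. The Gaussian kernel, bandwidth, and mock descriptor/activity data below are illustrative.

        import numpy as np

        def nadaraya_watson(Xq, X, y, h=1.0):
            """Kernel-weighted average of responses (Gaussian kernel, bandwidth h)."""
            d2 = ((Xq[:, None, :] - X[None, :, :]) ** 2).sum(-1)
            K = np.exp(-d2 / (2 * h ** 2))
            return (K @ y) / K.sum(axis=1)

        rng = np.random.default_rng(12)
        X = rng.standard_normal((200, 5))          # e.g., five molecular descriptors
        y = X[:, 0] ** 2 + X[:, 1] + 0.1 * rng.standard_normal(200)  # "activity"
        Xq = rng.standard_normal((5, 5))           # query compounds
        print(nadaraya_watson(Xq, X, y, h=0.8))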

  14. [Correlation between gaseous exchange rate, body temperature, and mitochondrial protein content in the liver of mice].

    PubMed

    Muradian, Kh K; Utko, N O; Mozzhukhina, T H; Pishel', I M; Litoshenko, O Ia; Bezrukov, V V; Fraĭfel'd, V E

    2002-01-01

    Correlation and regression relations between gaseous exchange, thermoregulation, and mitochondrial protein content were analyzed in mice by two- and three-dimensional statistics. The pairwise linear methods of analysis did not reveal any significant correlation between the parameters under study. However, relations became evident with three-dimensional and non-linear plotting, for which the coefficients of multivariable correlation reached and even exceeded 0.7-0.8. Calculations based on partial differentiation of the multivariable regression equations allow us to conclude that, at certain values of VO2, VCO2, and body temperature, negative relations between the systems of gaseous exchange and thermoregulation become dominant.

  15. Linear Regression Links Transcriptomic Data and Cellular Raman Spectra.

    PubMed

    Kobayashi-Kirschvink, Koseki J; Nakaoka, Hidenori; Oda, Arisa; Kamei, Ken-Ichiro F; Nosho, Kazuki; Fukushima, Hiroko; Kanesaki, Yu; Yajima, Shunsuke; Masaki, Haruhiko; Ohta, Kunihiro; Wakamoto, Yuichi

    2018-06-08

    Raman microscopy is an imaging technique that has been applied to assess molecular compositions of living cells to characterize cell types and states. However, owing to the diverse molecular species in cells and challenges of assigning peaks to specific molecules, it has not been clear how to interpret cellular Raman spectra. Here, we provide firm evidence that cellular Raman spectra and transcriptomic profiles of Schizosaccharomyces pombe and Escherichia coli can be computationally connected and thus interpreted. We find that the dimensions of high-dimensional Raman spectra and transcriptomes measured by RNA sequencing can be reduced and connected linearly through a shared low-dimensional subspace. Accordingly, we were able to predict global gene expression profiles by applying the calculated transformation matrix to Raman spectra, and vice versa. Highly expressed non-coding RNAs contributed to the Raman-transcriptome linear correspondence more significantly than mRNAs in S. pombe. This demonstration of correspondence between cellular Raman spectra and transcriptomes is a promising step toward establishing spectroscopic live-cell omics studies. Copyright © 2018 Elsevier Inc. All rights reserved.
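
    A toy sketch of the linear-correspondence idea (not the authors' code): reduce both modalities with PCA and learn a linear transformation between the score spaces, then predict one modality from the other. The shared latent structure and all dimensions are invented for illustration.

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.linear_model import LinearRegression

        rng = np.random.default_rng(13)
        n, k = 100, 4                                   # samples, shared latent dimension
        Z = rng.standard_normal((n, k))                 # shared "cell state"
        raman = Z @ rng.standard_normal((k, 800)) + 0.1 * rng.standard_normal((n, 800))
        expr = Z @ rng.standard_normal((k, 3000)) + 0.1 * rng.standard_normal((n, 3000))

        pca_r, pca_e = PCA(n_components=k), PCA(n_components=k)
        R = pca_r.fit_transform(raman[:80])             # train split
        E = pca_e.fit_transform(expr[:80])
        T = LinearRegression().fit(R, E)                # transformation matrix

        E_pred = T.predict(pca_r.transform(raman[80:])) # predict expression scores
        expr_pred = pca_e.inverse_transform(E_pred)     # back to gene space
        print("held-out correlation:",
              np.corrcoef(expr_pred.ravel(), expr[80:].ravel())[0, 1])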

  16. Prediction of siRNA potency using sparse logistic regression.

    PubMed

    Hu, Wei; Hu, John

    2014-06-01

    RNA interference (RNAi) can modulate gene expression at post-transcriptional as well as transcriptional levels. Short interfering RNA (siRNA) serves as a trigger for the RNAi gene inhibition mechanism, and therefore is a crucial intermediate step in RNAi. There have been extensive studies to identify the sequence characteristics of potent siRNAs. One such study built a linear model using LASSO (Least Absolute Shrinkage and Selection Operator) to measure the contribution of each siRNA sequence feature. This model is simple and interpretable, but it requires a large number of nonzero weights. We have introduced a novel technique, sparse logistic regression, to build a linear model using single-position specific nucleotide compositions which has the same prediction accuracy as the linear model based on LASSO. The weights in our new model share the same general trend as those in the previous model, but have only 25 nonzero weights out of a total 84 weights, a 54% reduction compared to the previous model. Contrary to the linear model based on LASSO, our model suggests that only a few positions are influential on the efficacy of the siRNA, which are the 5' and 3' ends and the seed region of siRNA sequences. We also employed sparse logistic regression to build a linear model using dual-position specific nucleotide compositions, a task LASSO is not able to accomplish well due to its high-dimensional nature. Our results demonstrate the superiority of sparse logistic regression as a technique for both feature selection and regression over LASSO in the context of siRNA design.
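
    A hedged sketch of L1-regularized (sparse) logistic regression on position-specific nucleotide compositions, mirroring the 84-feature setup the record describes; the simulated sequences and potency labels are assumptions.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(14)
        n = 500
        # one-hot encoding of 4 nucleotides at each of 21 positions -> 84 features
        X = rng.multinomial(1, [0.25] * 4, size=(n, 21)).reshape(n, 84)
        w = np.zeros(84); w[:4] = [1.5, -1.5, 1.0, -1.0]  # only the 5' end matters here
        y = ((X @ w + rng.standard_normal(n)) > 0).astype(int)  # potent vs. not

        clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.3)
        clf.fit(X, y)
        print("nonzero weights:", np.sum(clf.coef_ != 0), "of", clf.coef_.size)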

  17. Hypothesis testing in functional linear regression models with Neyman's truncation and wavelet thresholding for longitudinal data.

    PubMed

    Yang, Xiaowei; Nie, Kun

    2008-03-15

    Longitudinal data sets in biomedical research often consist of large numbers of repeated measures. In many cases, the trajectories do not look globally linear or polynomial, making it difficult to summarize the data or test hypotheses using standard longitudinal data analysis based on various linear models. An alternative approach is to apply the approaches of functional data analysis, which directly target the continuous nonlinear curves underlying discretely sampled repeated measures. For the purposes of data exploration, many functional data analysis strategies have been developed based on various schemes of smoothing, but fewer options are available for making causal inferences regarding predictor-outcome relationships, a common task seen in hypothesis-driven medical studies. To compare groups of curves, two testing strategies with good power have been proposed for high-dimensional analysis of variance: the Fourier-based adaptive Neyman test and the wavelet-based thresholding test. Using a smoking cessation clinical trial data set, this paper demonstrates how to extend the strategies for hypothesis testing into the framework of functional linear regression models (FLRMs) with continuous functional responses and categorical or continuous scalar predictors. The analysis procedure consists of three steps: first, apply the Fourier or wavelet transform to the original repeated measures; then fit a multivariate linear model in the transformed domain; and finally, test the regression coefficients using either adaptive Neyman or thresholding statistics. Since a FLRM can be viewed as a natural extension of the traditional multiple linear regression model, the development of this model and computational tools should enhance the capacity of medical statistics for longitudinal data.

  18. HYPOTHESIS TESTING FOR HIGH-DIMENSIONAL SPARSE BINARY REGRESSION

    PubMed Central

    Mukherjee, Rajarshi; Pillai, Natesh S.; Lin, Xihong

    2015-01-01

    In this paper, we study the detection boundary for minimax hypothesis testing in the context of high-dimensional, sparse binary regression models. Motivated by genetic sequencing association studies for rare variant effects, we investigate the complexity of the hypothesis testing problem when the design matrix is sparse. We observe a new phenomenon in the behavior of detection boundary which does not occur in the case of Gaussian linear regression. We derive the detection boundary as a function of two components: a design matrix sparsity index and signal strength, each of which is a function of the sparsity of the alternative. For any alternative, if the design matrix sparsity index is too high, any test is asymptotically powerless irrespective of the magnitude of signal strength. For binary design matrices with the sparsity index that is not too high, our results are parallel to those in the Gaussian case. In this context, we derive detection boundaries for both dense and sparse regimes. For the dense regime, we show that the generalized likelihood ratio is rate optimal; for the sparse regime, we propose an extended Higher Criticism Test and show it is rate optimal and sharp. We illustrate the finite sample properties of the theoretical results using simulation studies. PMID:26246645

  19. Low-rank separated representation surrogates of high-dimensional stochastic functions: Application in Bayesian inference

    NASA Astrophysics Data System (ADS)

    Validi, AbdoulAhad

    2014-03-01

    This study introduces a non-intrusive approach in the context of low-rank separated representation to construct a surrogate of high-dimensional stochastic functions, e.g., PDEs/ODEs, in order to decrease the computational cost of Markov Chain Monte Carlo simulations in Bayesian inference. The surrogate model is constructed via a regularized alternating least-squares regression with Tikhonov regularization using a roughening matrix computing the gradient of the solution, in conjunction with a perturbation-based error indicator to detect optimal model complexities. The model approximates a vector of a continuous solution at discrete values of a physical variable. The required number of random realizations to achieve a successful approximation linearly depends on the function dimensionality. The computational cost of the model construction is quadratic in the number of random inputs, which potentially tackles the curse of dimensionality in high-dimensional stochastic functions. Furthermore, this vector-valued separated representation-based model, in comparison to the available scalar-valued case, leads to a significant reduction in the cost of approximation by an order of magnitude equal to the vector size. The performance of the method is studied through its application to three numerical examples including a 41-dimensional elliptic PDE and a 21-dimensional cavity flow.

  20. A study of machine learning regression methods for major elemental analysis of rocks using laser-induced breakdown spectroscopy

    NASA Astrophysics Data System (ADS)

    Boucher, Thomas F.; Ozanne, Marie V.; Carmosino, Marco L.; Dyar, M. Darby; Mahadevan, Sridhar; Breves, Elly A.; Lepore, Kate H.; Clegg, Samuel M.

    2015-05-01

    The ChemCam instrument on the Mars Curiosity rover is generating thousands of LIBS spectra and bringing interest in this technique to public attention. The key to interpreting Mars or any other types of LIBS data is calibrations that relate laboratory standards to unknowns examined in other settings and enable predictions of chemical composition. Here, LIBS spectral data are analyzed using linear regression methods including partial least squares (PLS-1 and PLS-2), principal component regression (PCR), least absolute shrinkage and selection operator (lasso), elastic net, and linear support vector regression (SVR-Lin). These were compared against results from nonlinear regression methods including kernel principal component regression (K-PCR), polynomial kernel support vector regression (SVR-Py) and k-nearest neighbor (kNN) regression to discern the most effective models for interpreting chemical abundances from LIBS spectra of geological samples. The results were evaluated for 100 samples analyzed with 50 laser pulses at each of five locations averaged together. Wilcoxon signed-rank tests were employed to evaluate the statistical significance of differences among the nine models using their predicted residual sum of squares (PRESS) to make comparisons. For MgO, SiO2, Fe2O3, CaO, and MnO, the sparse models outperform all the others except for linear SVR, while for Na2O, K2O, TiO2, and P2O5, the sparse methods produce inferior results, likely because their emission lines in this energy range have lower transition probabilities. The strong performance of the sparse methods in this study suggests that use of dimensionality-reduction techniques as a preprocessing step may improve the performance of the linear models. Nonlinear methods tend to overfit the data and predict less accurately, while the linear methods proved to be more generalizable with better predictive performance. These results are attributed to the high dimensionality of the data (6144 channels) relative to the small number of samples studied. The best-performing models were SVR-Lin for SiO2, MgO, Fe2O3, and Na2O, lasso for Al2O3, elastic net for MnO, and PLS-1 for CaO, TiO2, and K2O. Although these differences in model performance between methods were identified, most of the models produce comparable results when p ≤ 0.05 and all techniques except kNN produced statistically indistinguishable results. It is likely that a combination of models could be used together to yield a lower total error of prediction, depending on the requirements of the user.
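
    Several of the models compared in this record are available off the shelf; the sketch below cross-validates a few of them (PLS, lasso, elastic net, linear SVR, kNN) on synthetic spectra with many channels and few samples. Dimensions and hyperparameters are illustrative, not the paper's settings.

        import numpy as np
        from sklearn.cross_decomposition import PLSRegression
        from sklearn.linear_model import Lasso, ElasticNet
        from sklearn.svm import SVR
        from sklearn.neighbors import KNeighborsRegressor
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(15)
        n, p = 100, 2000                       # few spectra, many channels (cf. 6144)
        X = rng.standard_normal((n, p))
        w = np.zeros(p); w[:20] = rng.standard_normal(20)  # a few emission lines matter
        y = X @ w + 0.5 * rng.standard_normal(n)           # "oxide abundance"

        models = {
            "PLS": PLSRegression(n_components=5),
            "lasso": Lasso(alpha=0.1),
            "elastic net": ElasticNet(alpha=0.1, l1_ratio=0.5),
            "SVR-Lin": SVR(kernel="linear"),
            "kNN": KNeighborsRegressor(n_neighbors=5),
        }
        for name, model in models.items():
            rmse = -cross_val_score(model, X, y, cv=5,
                                    scoring="neg_root_mean_squared_error").mean()
            print(f"{name:12s} CV RMSE: {rmse:.3f}")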

  1. Locally Linear Embedding of Local Orthogonal Least Squares Images for Face Recognition

    NASA Astrophysics Data System (ADS)

    Hafizhelmi Kamaru Zaman, Fadhlan

    2018-03-01

    Dimensionality reduction is very important in face recognition, since it ensures that high-dimensional data can be mapped to a lower-dimensional space without losing salient and integral facial information. Locally Linear Embedding (LLE) has previously been used to serve this purpose; however, the process of acquiring LLE features requires high computation and resources. To overcome this limitation, we propose that a locally applied Local Orthogonal Least Squares (LOLS) model be used for initial feature extraction before the application of LLE. By constructing least-squares regression under orthogonality constraints, we can preserve more discriminant information in the local subspace of facial features while reducing the overall features to a more compact form that we call LOLS images. LLE can then be applied to the LOLS images to map their representation into a global coordinate system of much lower dimensionality. Several experiments carried out using publicly available face datasets such as AR, ORL, YaleB, and FERET under the Single Sample Per Person (SSPP) constraint demonstrate that our proposed method reduces the time required to compute LLE features while delivering better accuracy than either LLE or OLS alone. Comparisons against several other feature extraction methods and a more recent feature-learning method, state-of-the-art Convolutional Neural Networks (CNN), also reveal the superiority of the proposed method under the SSPP constraint.
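
    The second stage of the proposed pipeline, applying LLE to pre-extracted features, can be sketched with scikit-learn as below. The LOLS extraction itself is not a library routine, so a random matrix stands in for the LOLS images; all shapes and parameters are hypothetical.

    ```python
    import numpy as np
    from sklearn.manifold import LocallyLinearEmbedding
    from sklearn.neighbors import KNeighborsClassifier

    rng = np.random.default_rng(2)
    lols_images = rng.standard_normal((400, 1024))  # stand-in for LOLS feature images
    labels = rng.integers(0, 40, size=400)          # toy identities

    # Map the compact LOLS features into a global low-dimensional coordinate system.
    lle = LocallyLinearEmbedding(n_neighbors=12, n_components=8)
    embedded = lle.fit_transform(lols_images)

    # A nearest-neighbor classifier is a common final stage in such pipelines.
    clf = KNeighborsClassifier(n_neighbors=1).fit(embedded, labels)
    ```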

  2. Spectral Regression Discriminant Analysis for Hyperspectral Image Classification

    NASA Astrophysics Data System (ADS)

    Pan, Y.; Wu, J.; Huang, H.; Liu, J.

    2012-08-01

    Dimensionality reduction algorithms, which aim to select a small set of efficient and discriminant features, have attracted great attention for hyperspectral image classification. Manifold learning methods, such as Locally Linear Embedding, Isomap, and Laplacian Eigenmap, are popular for dimensionality reduction. However, a disadvantage of many manifold learning methods is that their computations usually involve eigen-decomposition of dense matrices, which is expensive in both time and memory. In this paper, we introduce a new dimensionality reduction method, called Spectral Regression Discriminant Analysis (SRDA). SRDA casts the problem of learning an embedding function into a regression framework, which avoids eigen-decomposition of dense matrices. Moreover, within the regression-based framework, different kinds of regularizers can be naturally incorporated into the algorithm, making it more flexible. It can make efficient use of data points to discover the intrinsic discriminant structure in the data. Experimental results on the Washington DC Mall and AVIRIS Indian Pines hyperspectral data sets demonstrate the effectiveness of the proposed method.
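
    A hedged sketch of the spectral-regression idea follows: rather than eigen-decomposing a dense matrix, the embedding is learned by ridge-regressing toward class-indicator responses. This is a simplified reading of SRDA, not the authors' implementation, and the data are synthetic stand-ins.

    ```python
    import numpy as np
    from sklearn.linear_model import Ridge

    rng = np.random.default_rng(3)
    X = rng.standard_normal((300, 200))   # pixels x spectral bands (toy)
    y = rng.integers(0, 5, size=300)      # 5 land-cover classes

    # Centered class-indicator responses, one column per class.
    Y = np.eye(5)[y]
    Y = Y - Y.mean(axis=0)

    # Regularized regression toward the responses replaces the dense eigenproblem;
    # the fitted coefficients define the discriminant projection.
    W = Ridge(alpha=1.0).fit(X, Y).coef_.T   # shape (bands, classes)
    X_embedded = X @ W                       # low-dimensional discriminant features
    ```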

  3. Robust estimation for partially linear models with large-dimensional covariates

    PubMed Central

    Zhu, LiPing; Li, RunZe; Cui, HengJian

    2014-01-01

    We are concerned with robust estimation procedures for estimating the parameters in partially linear models with large-dimensional covariates. To enhance interpretability, we suggest implementing a nonconcave regularization method in the robust estimation procedure to select important covariates from the linear component. We establish consistency for both the linear and the nonlinear components when the covariate dimension diverges at the rate of o(n), where n is the sample size. We show that the robust estimate of the linear component performs asymptotically as well as its oracle counterpart, which assumes that the baseline function and the unimportant covariates are known a priori. With a consistent estimator of the linear component, we estimate the nonparametric component by a robust local linear regression. It is proved that the robust estimate of the nonlinear component performs asymptotically as well as if the linear component were known in advance. Comprehensive simulation studies are carried out, and an application is presented, to examine the finite-sample performance of the proposed procedures. PMID:24955087

  4. Robust estimation for partially linear models with large-dimensional covariates.

    PubMed

    Zhu, LiPing; Li, RunZe; Cui, HengJian

    2013-10-01

    We are concerned with robust estimation procedures for estimating the parameters in partially linear models with large-dimensional covariates. To enhance interpretability, we suggest implementing a nonconcave regularization method in the robust estimation procedure to select important covariates from the linear component. We establish consistency for both the linear and the nonlinear components when the covariate dimension diverges at the rate of o(n), where n is the sample size. We show that the robust estimate of the linear component performs asymptotically as well as its oracle counterpart, which assumes that the baseline function and the unimportant covariates are known a priori. With a consistent estimator of the linear component, we estimate the nonparametric component by a robust local linear regression. It is proved that the robust estimate of the nonlinear component performs asymptotically as well as if the linear component were known in advance. Comprehensive simulation studies are carried out, and an application is presented, to examine the finite-sample performance of the proposed procedures.

  5. Enhancement of partial robust M-regression (PRM) performance using Bisquare weight function

    NASA Astrophysics Data System (ADS)

    Mohamad, Mazni; Ramli, Norazan Mohamed; Ghani@Mamat, Nor Azura Md; Ahmad, Sanizah

    2014-09-01

    Partial Least Squares (PLS) regression is a popular regression technique for handling multicollinearity in low- and high-dimensional data; it fits a linear relationship between sets of explanatory and response variables. Several robust PLS methods have been proposed to remedy the classical PLS algorithms, which are easily affected by the presence of outliers. A recent one is partial robust M-regression (PRM). Unfortunately, the monotone weighting function used in the PRM algorithm fails to assign appropriate weights to large outliers according to their severity. Thus, in this paper, a modified partial robust M-regression is introduced to enhance the performance of the original PRM. A re-descending weight function, known as the bisquare weight function, is recommended to replace the fair function in PRM. A simulation study is carried out to assess the performance of the modified PRM, and its efficiency is tested on both contaminated and uncontaminated simulated data under various percentages of outliers, sample sizes, and numbers of predictors.
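
    The bisquare (Tukey) weight function proposed as the replacement is easy to state; the sketch below shows it inside one reweighting step, with the conventional tuning constant c = 4.685 for 95% Gaussian efficiency. Embedding these weights in the PLS iterations, as the full PRM algorithm does, is not shown.

    ```python
    import numpy as np

    def bisquare_weights(residuals, c=4.685):
        """Tukey bisquare: weights fall to exactly zero for |r / (c*s)| >= 1."""
        s = np.median(np.abs(residuals)) / 0.6745  # robust scale estimate (MAD)
        u = residuals / (c * s)
        w = (1 - u**2) ** 2
        w[np.abs(u) >= 1] = 0.0                    # gross outliers get zero weight
        return w

    r = np.array([0.1, -0.3, 0.2, 5.0, -8.0])      # last two are gross outliers
    print(bisquare_weights(r))                     # the outliers receive weight 0
    ```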

  6. Sparse partial least squares regression for simultaneous dimension reduction and variable selection

    PubMed Central

    Chun, Hyonho; Keleş, Sündüz

    2010-01-01

    Partial least squares regression has been an alternative to ordinary least squares for handling multicollinearity in several areas of scientific research since the 1960s. It has recently gained much attention in the analysis of high dimensional genomic data. We show that known asymptotic consistency of the partial least squares estimator for a univariate response does not hold with the very large p and small n paradigm. We derive a similar result for a multivariate response regression with partial least squares. We then propose a sparse partial least squares formulation which aims simultaneously to achieve good predictive performance and variable selection by producing sparse linear combinations of the original predictors. We provide an efficient implementation of sparse partial least squares regression and compare it with well-known variable selection and dimension reduction approaches via simulation experiments. We illustrate the practical utility of sparse partial least squares regression in a joint analysis of gene expression and genomewide binding data. PMID:20107611

  7. Koopman Invariant Subspaces and Finite Linear Representations of Nonlinear Dynamical Systems for Control.

    PubMed

    Brunton, Steven L; Brunton, Bingni W; Proctor, Joshua L; Kutz, J Nathan

    2016-01-01

    In this work, we explore finite-dimensional linear representations of nonlinear dynamical systems by restricting the Koopman operator to an invariant subspace spanned by specially chosen observable functions. The Koopman operator is an infinite-dimensional linear operator that evolves functions of the state of a dynamical system. Dominant terms in the Koopman expansion are typically computed using dynamic mode decomposition (DMD). DMD uses linear measurements of the state variables, and it has recently been shown that this may be too restrictive for nonlinear systems. Choosing the right nonlinear observable functions to form an invariant subspace where it is possible to obtain linear reduced-order models, especially those that are useful for control, is an open challenge. Here, we investigate the choice of observable functions for Koopman analysis that enable the use of optimal linear control techniques on nonlinear problems. First, to include a cost on the state of the system, as in linear quadratic regulator (LQR) control, it is helpful to include these states in the observable subspace, as in DMD. However, we find that this is only possible when there is a single isolated fixed point, as systems with multiple fixed points or more complicated attractors are not globally topologically conjugate to a finite-dimensional linear system, and cannot be represented by a finite-dimensional linear Koopman subspace that includes the state. We then present a data-driven strategy to identify relevant observable functions for Koopman analysis by leveraging a new algorithm to determine relevant terms in a dynamical system by ℓ1-regularized regression of the data in a nonlinear function space; we also show how this algorithm is related to DMD. Finally, we demonstrate the usefulness of nonlinear observable subspaces in the design of Koopman operator optimal control laws for fully nonlinear systems using techniques from linear optimal control.
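
    The ℓ1-regularized regression in a nonlinear function space described above can be sketched as follows, under stated assumptions: a toy linear oscillator, a polynomial observable library, and one lasso fit per state dimension. This mirrors the sparse-identification flavor of the algorithm, not the paper's examples.

    ```python
    import numpy as np
    from sklearn.linear_model import Lasso
    from sklearn.preprocessing import PolynomialFeatures

    # Toy trajectory of dx/dt = -y, dy/dt = x (a linear oscillator).
    t = np.linspace(0, 10, 1000)
    X = np.column_stack([np.cos(t), np.sin(t)])
    dX = np.gradient(X, t, axis=0)                 # numerical time derivatives

    # Candidate observable library: polynomials in the state up to degree 3.
    library = PolynomialFeatures(degree=3, include_bias=True)
    Theta = library.fit_transform(X)

    # One sparse regression per state dimension; nonzero coefficients select the
    # observables spanning an (approximately) Koopman-invariant subspace.
    coefs = np.column_stack(
        [Lasso(alpha=1e-3, max_iter=10000).fit(Theta, dX[:, i]).coef_
         for i in range(2)]
    )
    print(library.get_feature_names_out(), coefs.round(2), sep="\n")
    ```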

  8. Optimal Wavelength Selection on Hyperspectral Data with Fused Lasso for Biomass Estimation of Tropical Rain Forest

    NASA Astrophysics Data System (ADS)

    Takayama, T.; Iwasaki, A.

    2016-06-01

    Above-ground biomass prediction of tropical rain forest using remote sensing data is of paramount importance to continuous large-area forest monitoring. Hyperspectral data can provide rich spectral information for biomass prediction; however, prediction accuracy suffers from a small-sample-size problem: overfitting arises when the number of training samples is smaller than the dimensionality of the data, and field surveys are limited by the time, cost, and human resources they require. A common approach to addressing this problem is to reduce the dimensionality of the dataset. In addition, acquired hyperspectral data usually have a low signal-to-noise ratio, owing to narrow bandwidths, and exhibit local or global peak shifts caused by instrumental instability or small differences in practical measurement conditions. In this work, we propose a methodology based on fused lasso regression that selects optimal bands for the biomass prediction model by encouraging sparsity and grouping: sparsity provides the dimensionality reduction that solves the small-sample-size problem, and grouping mitigates the noise and peak-shift problems. The prediction model provided higher accuracy, with a cross-validated root-mean-square error (RMSE) of 66.16 t/ha, than the other methods tested: multiple linear regression, partial least squares regression, and lasso regression. Furthermore, fusing the spectral information with spatial information derived from a texture index increased the prediction accuracy, with an RMSE of 62.62 t/ha. This analysis demonstrates the efficiency of the fused lasso and image texture in biomass estimation of tropical forests.
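
    A minimal sketch of the fused lasso objective, assuming cvxpy is available: the first ℓ1 term encourages sparsity over bands and the second, on differences of adjacent coefficients, encourages the contiguous groupings exploited above. Data and penalty weights are invented.

    ```python
    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(4)
    n, p = 40, 150                             # plots x hyperspectral bands (toy)
    X = rng.standard_normal((n, p))
    y = rng.standard_normal(n)                 # stand-in for biomass (t/ha)

    beta = cp.Variable(p)
    lam_sparse, lam_fuse = 0.5, 0.5
    objective = cp.Minimize(
        cp.sum_squares(X @ beta - y)
        + lam_sparse * cp.norm1(beta)          # sparsity: few selected bands
        + lam_fuse * cp.norm1(cp.diff(beta))   # grouping: neighbors move together
    )
    cp.Problem(objective).solve()
    selected_bands = np.flatnonzero(np.abs(beta.value) > 1e-6)
    ```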

  9. CyTOF workflow: differential discovery in high-throughput high-dimensional cytometry datasets

    PubMed Central

    Nowicka, Malgorzata; Krieg, Carsten; Weber, Lukas M.; Hartmann, Felix J.; Guglietta, Silvia; Becher, Burkhard; Levesque, Mitchell P.; Robinson, Mark D.

    2017-01-01

    High-dimensional mass and flow cytometry (HDCyto) experiments have become a method of choice for high-throughput interrogation and characterization of cell populations. Here, we present an R-based pipeline for differential analyses of HDCyto data, largely based on Bioconductor packages. We computationally define cell populations using FlowSOM clustering and facilitate an optional but reproducible strategy for manual merging of algorithm-generated clusters. Our workflow offers different analysis paths, including association of cell-type abundance with a phenotype, changes in signaling markers within specific subpopulations, and differential analyses of aggregated signals. Importantly, the differential analyses we show are based on regression frameworks in which the HDCyto data are the response; thus, we are able to model arbitrary experimental designs, such as those with batch effects, paired designs, and so on. In particular, we apply generalized linear mixed models to analyses of cell population abundance and to cell-population-specific analyses of signaling markers, allowing overdispersion in cell counts or aggregated signals across samples to be appropriately modeled. To support the formal statistical analyses, we encourage exploratory data analysis at every step, including quality control (e.g., multi-dimensional scaling plots), reporting of clustering results (dimensionality reduction, heatmaps with dendrograms), and differential analyses (e.g., plots of aggregated signals). PMID:28663787

  10. Comprehensive Chemical Fingerprinting of High-Quality Cocoa at Early Stages of Processing: Effectiveness of Combined Untargeted and Targeted Approaches for Classification and Discrimination.

    PubMed

    Magagna, Federico; Guglielmetti, Alessandro; Liberto, Erica; Reichenbach, Stephen E; Allegrucci, Elena; Gobino, Guido; Bicchi, Carlo; Cordero, Chiara

    2017-08-02

    This study investigates the chemical information in the volatile fractions of high-quality cocoa (Theobroma cacao L., Malvaceae) from different origins (Mexico, Ecuador, Venezuela, Colombia, Java, Trinidad, and São Tomé) produced for fine chocolate. It explores the evolution of the entire pattern of volatiles in relation to cocoa processing (raw, roasted, steamed, and ground beans). Advanced chemical fingerprinting (combined untargeted and targeted fingerprinting) with comprehensive two-dimensional gas chromatography coupled with mass spectrometry allows advanced pattern recognition for classification, discrimination, and sensory-quality characterization. The entire data set is analyzed for 595 reliable two-dimensional peak regions, including 130 known analytes and 13 potent odorants. Multivariate analysis with unsupervised exploration (principal component analysis) and simple supervised discrimination methods (Fisher ratios and linear regression trees) reveals informative patterns of similarities and differences and identifies characteristic compounds related to sample origin and manufacturing step.

  11. Inverse regression-based uncertainty quantification algorithms for high-dimensional models: Theory and practice

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Li, Weixuan; Lin, Guang; Li, Bing

    2016-09-01

    A well-known challenge in uncertainty quantification (UQ) is the "curse of dimensionality". However, many high-dimensional UQ problems are essentially low-dimensional, because the randomness of the quantity of interest (QoI) is caused only by uncertain parameters varying within a low-dimensional subspace, known as the sufficient dimension reduction (SDR) subspace. Motivated by this observation, we propose and demonstrate in this paper an inverse regression-based UQ approach (IRUQ) for high-dimensional problems. Specifically, we use an inverse regression procedure to estimate the SDR subspace and then convert the original problem to a low-dimensional one, which can be efficiently solved by building a response surface model such as a polynomial chaos expansion. The novelty and advantages of the proposed approach are seen in its computational efficiency and practicality. Compared with Monte Carlo, the traditionally preferred approach for high-dimensional UQ, IRUQ at a comparable cost generally gives much more accurate solutions, even for high-dimensional problems and even when the dimension reduction is not exactly sufficient. Theoretically, IRUQ is proved to converge twice as fast as the method it uses to seek the SDR subspace. For example, while a sliced inverse regression method converges to the SDR subspace at the rate O(n^{-1/2}), the corresponding IRUQ converges at O(n^{-1}). IRUQ also provides several desired conveniences in practice. It is non-intrusive, requiring only a simulator to generate realizations of the QoI, and there is no need to compute the high-dimensional gradient of the QoI. Finally, error bars can be derived for the estimation results reported by IRUQ.
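
    Sliced inverse regression, the SDR estimator cited above, admits a short numpy sketch: standardize the inputs, average them within slices of the sorted response, and take the leading eigenvectors of the between-slice covariance. Everything below is an illustrative toy, not the paper's implementation.

    ```python
    import numpy as np

    rng = np.random.default_rng(5)
    n, p = 2000, 20
    X = rng.standard_normal((n, p))
    y = (X[:, 0] + 2 * X[:, 1]) ** 2 + 0.1 * rng.standard_normal(n)  # 1-D structure

    # Standardize inputs: Z = (X - mean) @ Sigma^{-1/2} via a Cholesky factor.
    L = np.linalg.cholesky(np.cov(X.T))
    Z = (X - X.mean(axis=0)) @ np.linalg.inv(L.T)

    # Slice on the response and average Z within each slice.
    n_slices = 10
    order = np.argsort(y)
    slice_means = np.array([Z[idx].mean(axis=0)
                            for idx in np.array_split(order, n_slices)])

    # Eigen-decompose the between-slice covariance; top directions estimate the
    # SDR subspace (here it should align with span{e1 + 2 e2}).
    M = slice_means.T @ slice_means / n_slices
    eigvals, eigvecs = np.linalg.eigh(M)
    sdr_direction = eigvecs[:, -1]
    ```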

  12. A novel multi-target regression framework for time-series prediction of drug efficacy.

    PubMed

    Li, Haiqing; Zhang, Wei; Chen, Ying; Guo, Yumeng; Li, Guo-Zheng; Zhu, Xiaoxin

    2017-01-18

    Extracting knowledge from small samples is a challenging pharmacokinetic problem, one where statistical methods can be applied. Pharmacokinetic data are special in that they consist of small samples of high dimensionality, which makes it difficult to adopt conventional methods to predict the efficacy of traditional Chinese medicine (TCM) prescriptions. The main purpose of our study is to obtain knowledge of the correlations in TCM prescriptions. Here, a novel method named the Multi-target Regression Framework is proposed to deal with the problem of efficacy prediction. We exploit the correlation between the values of different time sequences and add the predictive targets of previous time points as features for predicting the value at the current time. Several experiments are conducted to test the validity of our method, and the results of leave-one-out cross-validation clearly demonstrate the competitiveness of our framework. Compared with linear regression, artificial neural networks, and partial least squares, support vector regression combined with our framework shows the best performance and appears to be more suitable for this task.
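
    A hedged sketch of the framework's core device as described, with invented shapes: the targets from earlier time points are appended to the feature vector when fitting the regressor for the next time point, here with one SVR per time point.

    ```python
    import numpy as np
    from sklearn.svm import SVR

    rng = np.random.default_rng(6)
    n, p, T = 30, 8, 4                  # small-sample, multi-time-point toy data
    X = rng.standard_normal((n, p))     # prescription features
    Y = rng.standard_normal((n, T))     # efficacy measured at T time points

    models, preds = [], []
    features = X
    for t in range(T):
        model = SVR(kernel="rbf").fit(features, Y[:, t])
        models.append(model)
        preds.append(model.predict(features))
        # Chain: earlier targets become extra features for the next time point
        # (at prediction time, predicted values would be substituted).
        features = np.column_stack([features, Y[:, t]])
    ```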

  13. A novel multi-target regression framework for time-series prediction of drug efficacy

    PubMed Central

    Li, Haiqing; Zhang, Wei; Chen, Ying; Guo, Yumeng; Li, Guo-Zheng; Zhu, Xiaoxin

    2017-01-01

    Extracting knowledge from small samples is a challenging pharmacokinetic problem, one where statistical methods can be applied. Pharmacokinetic data are special in that they consist of small samples of high dimensionality, which makes it difficult to adopt conventional methods to predict the efficacy of traditional Chinese medicine (TCM) prescriptions. The main purpose of our study is to obtain knowledge of the correlations in TCM prescriptions. Here, a novel method named the Multi-target Regression Framework is proposed to deal with the problem of efficacy prediction. We exploit the correlation between the values of different time sequences and add the predictive targets of previous time points as features for predicting the value at the current time. Several experiments are conducted to test the validity of our method, and the results of leave-one-out cross-validation clearly demonstrate the competitiveness of our framework. Compared with linear regression, artificial neural networks, and partial least squares, support vector regression combined with our framework shows the best performance and appears to be more suitable for this task. PMID:28098186

  14. Three-Dimensional City Determinants of the Urban Heat Island: A Statistical Approach

    NASA Astrophysics Data System (ADS)

    Chun, Bum Seok

    There is no doubt that the Urban Heat Island (UHI) is a mounting problem in built-up environments, due to the energy retention by the surface materials of dense buildings, leading to increased temperatures, air pollution, and energy consumption. Much of the earlier research on the UHI has used two-dimensional (2-D) information, such as land uses and the distribution of vegetation. In the case of homogeneous land uses, it is possible to predict surface temperatures with reasonable accuracy with 2-D information. However, three-dimensional (3-D) information is necessary to analyze more complex sites, including dense building clusters. Recent research on the UHI has started to consider multi-dimensional models. The purpose of this research is to explore the urban determinants of the UHI, using 2-D/3-D urban information with statistical modeling. The research includes the following stages: (a) estimating urban temperature using satellite images, (b) developing a 3-D city model from LiDAR data, (c) generating geometric parameters with regard to 2-/3-D geospatial information, and (d) conducting different statistical analyses: OLS and spatial regressions. The research area is part of the City of Columbus, Ohio. To effectively and systematically analyze the UHI, hierarchical grid scales (480m, 240m, 120m, 60m, and 30m) are proposed, together with linear and log-linear regression models. The non-linear OLS models with Log(AST) as the dependent variable have the highest R2 among all the OLS-estimated models. However, both SAR and GSM models are estimated for the 480m, 240m, 120m, and 60m grids to reduce their spatial dependency. Most GSM models have R2s higher than 0.9, except for the 240m grid. Overall, the urban characteristics having high impacts in all grids are embodied in solar radiation, 3-D open space, greenery, and water streams. These results demonstrate that it is possible to mitigate the UHI, providing guidelines for policies aiming to reduce the UHI.

  15. Analytical three-point Dixon method: With applications for spiral water-fat imaging.

    PubMed

    Wang, Dinghui; Zwart, Nicholas R; Li, Zhiqiang; Schär, Michael; Pipe, James G

    2016-02-01

    The goal of this work is to present a new three-point analytical approach with flexible even or uneven echo increments for water-fat separation and to evaluate its feasibility with spiral imaging. Two sets of possible solutions of water and fat are first found analytically. Then, two field maps of the B0 inhomogeneity are obtained by linear regression. The initial identification of the true solution is facilitated by the root-mean-square error of the linear regression and the incorporation of a fat spectrum model. The resolved field map after a region-growing algorithm is refined iteratively for spiral imaging. The final water and fat images are recalculated using a joint water-fat separation and deblurring algorithm. Successful implementations were demonstrated with three-dimensional gradient-echo head imaging and single breathhold abdominal imaging. Spiral, high-resolution T1 -weighted brain images were shown with comparable sharpness to the reference Cartesian images. With appropriate choices of uneven echo increments, it is feasible to resolve the aliasing of the field map voxel-wise. High-quality water-fat spiral imaging can be achieved with the proposed approach. © 2015 Wiley Periodicals, Inc.

  16. Analysis and generation of groundwater concentration time series

    NASA Astrophysics Data System (ADS)

    Crăciun, Maria; Vamoş, Călin; Suciu, Nicolae

    2018-01-01

    Concentration time series are provided by simulated concentrations of a nonreactive solute transported in groundwater, integrated over the transverse direction of a two-dimensional computational domain and recorded at the plume center of mass. The analysis of a statistical ensemble of time series reveals subtle features that are not captured by the first two moments which characterize the approximate Gaussian distribution of the two-dimensional concentration fields. The concentration time series exhibit a complex preasymptotic behavior driven by a nonstationary trend and correlated fluctuations with time-variable amplitude. Time series with almost the same statistics are generated by successively adding to a time-dependent trend a sum of linear regression terms, accounting for correlations between fluctuations around the trend and their increments in time, and terms of an amplitude modulated autoregressive noise of order one with time-varying parameter. The algorithm generalizes mixing models used in probability density function approaches. The well-known interaction by exchange with the mean mixing model is a special case consisting of a linear regression with constant coefficients.

  17. Bayesian Analysis of High Dimensional Classification

    NASA Astrophysics Data System (ADS)

    Mukhopadhyay, Subhadeep; Liang, Faming

    2009-12-01

    Modern data mining and bioinformatics have presented an important playground for statistical learning techniques, where the number of input variables may be much larger than the sample size of the training data. In supervised learning, logistic regression or probit regression can be used to model a binary output and to form perceptron classification rules based on Bayesian inference. In these settings there is considerable interest in searching for sparse models in the high-dimensional regression/classification setup. We first discuss two common challenges in analyzing high-dimensional data. The first is the curse of dimensionality: the complexity of many existing algorithms scales exponentially with the dimensionality of the space, so the algorithms soon become computationally intractable and therefore inapplicable in many real applications. The second is multicollinearity among the predictors, which severely slows down the algorithms. To make Bayesian analysis operational in high dimensions, we propose a novel hierarchical stochastic approximation Monte Carlo (HSAMC) algorithm, which overcomes the curse of dimensionality and the multicollinearity of predictors in high dimensions, and which possesses a self-adjusting mechanism for escaping local minima separated by high energy barriers. Models and methods are illustrated by simulations inspired by the field of genomics. Numerical results indicate that HSAMC can work as a general model-selection sampler in high-dimensional complex model spaces.

  18. TG study of the Li0.4Fe2.4Zn0.2O4 ferrite synthesis

    NASA Astrophysics Data System (ADS)

    Lysenko, E. N.; Nikolaev, E. V.; Surzhikov, A. P.

    2016-02-01

    In this paper, the kinetics of Li-Zn ferrite synthesis were studied using the thermogravimetric (TG) method, through the simultaneous application of non-linear regression to several measurements run at different heating rates (multivariate non-linear regression). Using the TG curves obtained at four heating rates and the Netzsch Thermokinetics software package, kinetic models with minimal adjustable parameters were selected to quantitatively describe the reaction of Li-Zn ferrite synthesis. The experimental TG curves clearly suggest a two-step process for the ferrite synthesis, and therefore a model-fitting kinetic analysis based on multivariate non-linear regression was conducted. The complex reaction was described by a scheme consisting of two sequential reaction steps. The best results were obtained using the Jander three-dimensional diffusion model for the first step and the Ginstling-Brounshtein model for the second. The kinetic parameters for the lithium-zinc ferrite synthesis reaction were found and discussed.

  19. Feature Augmentation via Nonparametrics and Selection (FANS) in High-Dimensional Classification.

    PubMed

    Fan, Jianqing; Feng, Yang; Jiang, Jiancheng; Tong, Xin

    We propose a high dimensional classification method that involves nonparametric feature augmentation. Knowing that marginal density ratios are the most powerful univariate classifiers, we use the ratio estimates to transform the original feature measurements. Subsequently, penalized logistic regression is invoked, taking as input the newly transformed or augmented features. This procedure trains models equipped with local complexity and global simplicity, thereby avoiding the curse of dimensionality while creating a flexible nonlinear decision boundary. The resulting method is called Feature Augmentation via Nonparametrics and Selection (FANS). We motivate FANS by generalizing the Naive Bayes model, writing the log ratio of joint densities as a linear combination of those of marginal densities. It is related to generalized additive models, but has better interpretability and computability. Risk bounds are developed for FANS. In numerical analysis, FANS is compared with competing methods, so as to provide a guideline on its best application domain. Real data analysis demonstrates that FANS performs very competitively on benchmark email spam and gene expression data sets. Moreover, FANS is implemented by an extremely fast algorithm through parallel computing.
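
    A rough sketch of the FANS transformation under stated assumptions: marginal class-conditional densities are estimated by kernel density estimation on one data split, each feature is replaced by its estimated log density ratio, and an L1-penalized logistic regression is fitted on the other split. The data and tuning values are placeholders.

    ```python
    import numpy as np
    from scipy.stats import gaussian_kde
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(7)
    n, p = 400, 20
    X = rng.standard_normal((n, p))
    y = rng.integers(0, 2, size=n)
    X[y == 1, :3] += 1.0                   # signal in the first three features

    half = n // 2                          # split: KDEs on one half,
    X_kde, y_kde = X[:half], y[:half]      # penalized fit on the other
    X_fit, y_fit = X[half:], y[half:]

    def log_density_ratio(j):
        f1 = gaussian_kde(X_kde[y_kde == 1, j])
        f0 = gaussian_kde(X_kde[y_kde == 0, j])
        return np.log(f1(X_fit[:, j]) + 1e-12) - np.log(f0(X_fit[:, j]) + 1e-12)

    Z = np.column_stack([log_density_ratio(j) for j in range(p)])  # augmented features
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=1.0).fit(Z, y_fit)
    ```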

  20. Feature Augmentation via Nonparametrics and Selection (FANS) in High-Dimensional Classification

    PubMed Central

    Feng, Yang; Jiang, Jiancheng; Tong, Xin

    2015-01-01

    We propose a high dimensional classification method that involves nonparametric feature augmentation. Knowing that marginal density ratios are the most powerful univariate classifiers, we use the ratio estimates to transform the original feature measurements. Subsequently, penalized logistic regression is invoked, taking as input the newly transformed or augmented features. This procedure trains models equipped with local complexity and global simplicity, thereby avoiding the curse of dimensionality while creating a flexible nonlinear decision boundary. The resulting method is called Feature Augmentation via Nonparametrics and Selection (FANS). We motivate FANS by generalizing the Naive Bayes model, writing the log ratio of joint densities as a linear combination of those of marginal densities. It is related to generalized additive models, but has better interpretability and computability. Risk bounds are developed for FANS. In numerical analysis, FANS is compared with competing methods, so as to provide a guideline on its best application domain. Real data analysis demonstrates that FANS performs very competitively on benchmark email spam and gene expression data sets. Moreover, FANS is implemented by an extremely fast algorithm through parallel computing. PMID:27185970

  1. Viscoelastic Parameters for Quantifying Liver Fibrosis: Three-Dimensional Multifrequency MR Elastography Study on Thin Liver Rat Slices

    PubMed Central

    Ronot, Maxime; Lambert, Simon A.; Wagner, Mathilde; Garteiser, Philippe; Doblas, Sabrina; Albuquerque, Miguel; Paradis, Valérie; Vilgrain, Valérie; Sinkus, Ralph; Van Beers, Bernard E.

    2014-01-01

    Objective To assess, in a high-resolution model of thin liver rat slices, which viscoelastic parameter at three-dimensional multifrequency MR elastography has the best diagnostic performance for quantifying liver fibrosis. Materials and Methods The study was approved by the ethics committee for animal care of our institution. Eight normal rats and 42 rats with carbon tetrachloride induced liver fibrosis were used in the study. The rats were sacrificed, their livers were resected, and three-dimensional MR elastography of 5±2 mm liver slices was performed at 7T with mechanical frequencies of 500, 600 and 700 Hz. The complex shear, storage and loss moduli, and the coefficient of the frequency power law were calculated. At histopathology, fibrosis and inflammation were assessed with the METAVIR score, and fibrosis was further quantified with morphometry. The diagnostic value of the viscoelastic parameters for assessing fibrosis severity was evaluated with simple and multiple linear regressions, receiver operating characteristic analysis and Obuchowski measures. Results At simple regression, the shear, storage and loss moduli were associated with the severity of fibrosis. At multiple regression, the storage modulus at 600 Hz was the only parameter associated with fibrosis severity (r = 0.86, p < 0.0001). This parameter had an Obuchowski measure of 0.89 ± 0.03. This measure was significantly larger than that of the loss modulus (0.78 ± 0.04, p = 0.028), but not than that of the complex shear modulus (0.88 ± 0.03, p = 0.84). Conclusion Our high-resolution, three-dimensional multifrequency MR elastography study of thin liver slices shows that the storage modulus is the viscoelastic parameter with the best association with the severity of liver fibrosis. However, its diagnostic performance does not differ significantly from that of the complex shear modulus. PMID:24722733

  2. A simple approach to quantitative analysis using three-dimensional spectra based on selected Zernike moments.

    PubMed

    Zhai, Hong Lin; Zhai, Yue Yuan; Li, Pei Zhen; Tian, Yue Li

    2013-01-21

    A very simple approach to quantitative analysis is proposed, based on digital image processing of three-dimensional (3D) spectra obtained by high-performance liquid chromatography coupled with a diode array detector (HPLC-DAD). As region-based shape features of a grayscale image, Zernike moments, with their inherent invariance properties, were employed to establish the linear quantitative models. The approach was applied to the quantitative analysis of three compounds in mixed samples using 3D HPLC-DAD spectra, and three linear models were obtained, one per compound. The correlation coefficients (R2) for the training and test sets were greater than 0.999, and the statistical parameters and strict validation supported the reliability of the established models. The analytical results suggest that the Zernike moments selected by stepwise regression can be used in the quantitative analysis of target compounds. Our study provides a new idea for quantitative analysis using 3D spectra, which can be extended to the analysis of other 3D spectra obtained by different methods or instruments.

  3. A geometric approach to non-linear correlations with intrinsic scatter

    NASA Astrophysics Data System (ADS)

    Pihajoki, Pauli

    2017-12-01

    We propose a new mathematical model for (n − k)-dimensional non-linear correlations with intrinsic scatter in n-dimensional data. The model is based on Riemannian geometry, and is naturally symmetric with respect to the measured variables and invariant under coordinate transformations. We combine the model with a Bayesian approach for estimating the parameters of the correlation relation and the intrinsic scatter. A side benefit of the approach is that censored and truncated data sets and independent, arbitrary measurement errors can be incorporated. We also derive analytic likelihoods for the typical astrophysical use case of linear relations in n-dimensional Euclidean space. We pay particular attention to the case of linear regression in two dimensions and compare our results to existing methods. Finally, we apply our methodology to the well-known M_BH-σ correlation between the mass of a supermassive black hole in the centre of a galactic bulge and the corresponding bulge velocity dispersion. The main result of our analysis is that the most likely slope of this correlation is ∼6 for the data sets used, rather than the values in the range ∼4-5 typically quoted in the literature for these data.

  4. High-resolution proxies for wood density variations in Terminalia superba

    PubMed Central

    De Ridder, Maaike; Van den Bulcke, Jan; Vansteenkiste, Dries; Van Loo, Denis; Dierick, Manuel; Masschaele, Bert; De Witte, Yoni; Mannes, David; Lehmann, Eberhard; Beeckman, Hans; Van Hoorebeke, Luc; Van Acker, Joris

    2011-01-01

    Background and Aims Density is a crucial variable in forest and wood science and is evaluated by a multitude of methods. Direct gravimetric methods are mostly destructive and time-consuming. Therefore, faster and semi- to non-destructive indirect methods have been developed. Methods Profiles of wood density variations with a resolution of approx. 50 µm were derived from one-dimensional resistance drillings, two-dimensional neutron scans, and three-dimensional neutron and X-ray scans. All methods were applied on Terminalia superba Engl. & Diels, an African pioneer species which sometimes exhibits a brown heart (limba noir). Key Results The use of X-ray tomography combined with a reference material permitted direct estimates of wood density. These X-ray-derived densities overestimated gravimetrically determined densities non-significantly and showed high correlation (linear regression, R2 = 0.995). When comparing X-ray densities with the attenuation coefficients of neutron scans and the amplitude of drilling resistance, a significant linear relation was found with the neutron attenuation coefficient (R2 = 0.986) yet a weak relation with drilling resistance (R2 = 0.243). When density patterns are compared, all three methods are capable of revealing the same trends. Differences are mainly due to the orientation of tree rings and the different characteristics of the indirect methods. Conclusions High-resolution X-ray computed tomography is a promising technique for research on wood cores and will be explored further on other temperate and tropical species. Further study on limba noir is necessary to reveal the causes of density variations and to determine how resistance drillings can be further refined. PMID:21131386

  5. Hybrid Support Vector Regression and Autoregressive Integrated Moving Average Models Improved by Particle Swarm Optimization for Property Crime Rates Forecasting with Economic Indicators

    PubMed Central

    Alwee, Razana; Hj Shamsuddin, Siti Mariyam; Sallehuddin, Roselina

    2013-01-01

    Crime forecasting is an important area in the field of criminology. Linear models, such as regression and econometric models, are commonly applied in crime forecasting. However, real crime data commonly consist of both linear and nonlinear components, and a single model may not be sufficient to identify all the characteristics of the data. The purpose of this study is to introduce a hybrid model that combines support vector regression (SVR) and the autoregressive integrated moving average (ARIMA) for crime rate forecasting. SVR is very robust with small training data and high-dimensional problems, while ARIMA has the ability to model several types of time series. However, the accuracy of the SVR model depends on the values of its parameters, while ARIMA is not robust when applied to small data sets. Therefore, to overcome these problems, particle swarm optimization is used to estimate the parameters of the SVR and ARIMA models. The proposed hybrid model is used to forecast the property crime rates of the United States based on economic indicators. The experimental results show that the proposed hybrid model produces more accurate forecasting results than the individual models. PMID:23766729
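
    The hybrid idea (without the PSO parameter tuning the authors add) can be sketched as follows: fit an ARIMA model for the linear component, model its residuals with SVR on lagged residuals, and sum the two parts. The series, orders, and lag count below are invented.

    ```python
    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA
    from sklearn.svm import SVR

    rng = np.random.default_rng(8)
    y = np.cumsum(rng.standard_normal(200)) + np.sin(np.arange(200) / 5)  # toy series

    # Linear component: ARIMA.
    arima = ARIMA(y, order=(1, 1, 1)).fit()
    resid = arima.resid

    # Nonlinear component: SVR on lagged residuals.
    lags = 3
    R = np.column_stack([resid[i:len(resid) - lags + i] for i in range(lags)])
    target = resid[lags:]
    svr = SVR(kernel="rbf").fit(R, target)

    # In-sample hybrid fit = linear fit + nonlinear residual correction.
    hybrid_fitted = arima.fittedvalues[lags:] + svr.predict(R)
    ```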

  6. Hybrid support vector regression and autoregressive integrated moving average models improved by particle swarm optimization for property crime rates forecasting with economic indicators.

    PubMed

    Alwee, Razana; Shamsuddin, Siti Mariyam Hj; Sallehuddin, Roselina

    2013-01-01

    Crime forecasting is an important area in the field of criminology. Linear models, such as regression and econometric models, are commonly applied in crime forecasting. However, real crime data commonly consist of both linear and nonlinear components, and a single model may not be sufficient to identify all the characteristics of the data. The purpose of this study is to introduce a hybrid model that combines support vector regression (SVR) and the autoregressive integrated moving average (ARIMA) for crime rate forecasting. SVR is very robust with small training data and high-dimensional problems, while ARIMA has the ability to model several types of time series. However, the accuracy of the SVR model depends on the values of its parameters, while ARIMA is not robust when applied to small data sets. Therefore, to overcome these problems, particle swarm optimization is used to estimate the parameters of the SVR and ARIMA models. The proposed hybrid model is used to forecast the property crime rates of the United States based on economic indicators. The experimental results show that the proposed hybrid model produces more accurate forecasting results than the individual models.

  7. High correlations between MRI brain volume measurements based on NeuroQuant® and FreeSurfer.

    PubMed

    Ross, David E; Ochs, Alfred L; Tate, David F; Tokac, Umit; Seabaugh, John; Abildskov, Tracy J; Bigler, Erin D

    2018-05-30

    NeuroQuant® (NQ) and FreeSurfer (FS) are commonly used computer-automated programs for measuring MRI brain volume. Previously they were reported to have high intermethod reliabilities but often large intermethod effect size differences. We hypothesized that linear transformations could be used to reduce the large effect sizes. This study was an extension of our previously reported study. We performed NQ and FS brain volume measurements on 60 subjects (including normal controls, patients with traumatic brain injury, and patients with Alzheimer's disease). We used two statistical approaches in parallel to develop methods for transforming FS volumes into NQ volumes: traditional linear regression and Bayesian linear regression. For both methods, we used regression analyses to develop linear transformations of the FS volumes to make them more similar to the NQ volumes. The FS-to-NQ transformations based on traditional linear regression resulted in effect sizes that were small to moderate. The transformations based on Bayesian linear regression resulted in all effect sizes being trivially small. To our knowledge, this is the first report describing a method for transforming FS to NQ data so as to achieve high reliability and low effect size differences. Machine learning methods like Bayesian regression may be more useful than traditional methods. Copyright © 2018 Elsevier B.V. All rights reserved.

  8. Reliable two-dimensional phase unwrapping method using region growing and local linear estimation.

    PubMed

    Zhou, Kun; Zaitsev, Maxim; Bao, Shanglian

    2009-10-01

    In MRI, phase maps can provide useful information about parameters such as field inhomogeneity, velocity of blood flow, and the chemical shift between water and fat. As phase is defined on the (−π, π] range, however, phase wraps often occur, which complicates image analysis and interpretation. This work presents a two-dimensional phase unwrapping algorithm that uses quality-guided region growing and local linear estimation. The quality map employs the variance of the second-order partial derivatives of the phase as the quality criterion. Phase information from unwrapped neighboring pixels is used to predict the correct phase of the current pixel using a linear regression method. The algorithm was tested on both simulated and real data, and is shown to successfully unwrap phase images that are corrupted by noise and have rapidly changing phase. (c) 2009 Wiley-Liss, Inc.
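
    For illustration only, and not the paper's region-growing algorithm: the sketch below wraps a smooth two-dimensional phase into (−π, π] and recovers it with the reliability-guided unwrapper shipped in scikit-image, which follows the same quality-guided philosophy.

    ```python
    import numpy as np
    from skimage.restoration import unwrap_phase

    # A smooth phase ramp spanning many cycles.
    yy, xx = np.mgrid[0:128, 0:128]
    true_phase = 0.2 * xx + 0.1 * yy

    # Wrap into (-pi, pi], as a phase map would be measured.
    wrapped = np.angle(np.exp(1j * true_phase))

    # Unwrap; the result matches the truth up to a global constant offset.
    recovered = unwrap_phase(wrapped)
    offset = true_phase.mean() - recovered.mean()
    print(np.allclose(recovered + offset, true_phase, atol=1e-6))
    ```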

  9. Koopman Invariant Subspaces and Finite Linear Representations of Nonlinear Dynamical Systems for Control

    PubMed Central

    Brunton, Steven L.; Brunton, Bingni W.; Proctor, Joshua L.; Kutz, J. Nathan

    2016-01-01

    In this work, we explore finite-dimensional linear representations of nonlinear dynamical systems by restricting the Koopman operator to an invariant subspace spanned by specially chosen observable functions. The Koopman operator is an infinite-dimensional linear operator that evolves functions of the state of a dynamical system. Dominant terms in the Koopman expansion are typically computed using dynamic mode decomposition (DMD). DMD uses linear measurements of the state variables, and it has recently been shown that this may be too restrictive for nonlinear systems. Choosing the right nonlinear observable functions to form an invariant subspace where it is possible to obtain linear reduced-order models, especially those that are useful for control, is an open challenge. Here, we investigate the choice of observable functions for Koopman analysis that enable the use of optimal linear control techniques on nonlinear problems. First, to include a cost on the state of the system, as in linear quadratic regulator (LQR) control, it is helpful to include these states in the observable subspace, as in DMD. However, we find that this is only possible when there is a single isolated fixed point, as systems with multiple fixed points or more complicated attractors are not globally topologically conjugate to a finite-dimensional linear system, and cannot be represented by a finite-dimensional linear Koopman subspace that includes the state. We then present a data-driven strategy to identify relevant observable functions for Koopman analysis by leveraging a new algorithm to determine relevant terms in a dynamical system by ℓ1-regularized regression of the data in a nonlinear function space; we also show how this algorithm is related to DMD. Finally, we demonstrate the usefulness of nonlinear observable subspaces in the design of Koopman operator optimal control laws for fully nonlinear systems using techniques from linear optimal control. PMID:26919740

  10. Internet gaming disorder in early adolescence: Associations with parental and adolescent mental health.

    PubMed

    Wartberg, L; Kriston, L; Kramer, M; Schwedler, A; Lincoln, T M; Kammerl, R

    2017-06-01

    Internet gaming disorder (IGD) has been included in the Diagnostic and Statistical Manual of Mental Disorders (DSM-5). Currently, associations between IGD in early adolescence and mental health are largely unexplained. In the present study, the relation of IGD to adolescent and parental mental health was investigated for the first time. We surveyed 1095 family dyads (an adolescent aged 12-14 years and a related parent) with a standardized questionnaire for IGD as well as for adolescent and parental mental health. We conducted linear (dimensional approach) and logistic (categorical approach) regression analyses. With both the dimensional and the categorical approach, we observed statistically significant associations between IGD and male gender, a higher degree of adolescent antisocial behavior, anger control problems, emotional distress, self-esteem problems, hyperactivity/inattention, and parental anxiety (linear regression model: corrected R2 = 0.41; logistic regression model: Nagelkerke's R2 = 0.41). IGD appears to be associated with internalizing and externalizing problems in adolescents. Moreover, the findings of the present study provide the first evidence that not only adolescent but also parental mental health is relevant to IGD in early adolescence. Adolescent and parental mental health should be considered in prevention and intervention programs for IGD in adolescence. Copyright © 2017 Elsevier Masson SAS. All rights reserved.

  11. Inverse regression-based uncertainty quantification algorithms for high-dimensional models: Theory and practice

    NASA Astrophysics Data System (ADS)

    Li, Weixuan; Lin, Guang; Li, Bing

    2016-09-01

    Many uncertainty quantification (UQ) approaches suffer from the curse of dimensionality, that is, their computational costs become intractable for problems involving a large number of uncertain parameters. In these situations, the classic Monte Carlo method often remains the preferred choice because its convergence rate O(n^{-1/2}), where n is the required number of model simulations, does not depend on the dimension of the problem. However, many high-dimensional UQ problems are intrinsically low-dimensional, because the variation of the quantity of interest (QoI) is often caused by only a few latent parameters varying within a low-dimensional subspace, known as the sufficient dimension reduction (SDR) subspace in the statistics literature. Motivated by this observation, we propose two inverse regression-based UQ algorithms (IRUQ) for high-dimensional problems. Both algorithms use inverse regression to convert the original high-dimensional problem to a low-dimensional one, which is then efficiently solved by building a response surface for the reduced model, for example via a polynomial chaos expansion. The first algorithm, for situations where an exact SDR subspace exists, is proved to converge at rate O(n^{-1}), hence much faster than MC. The second algorithm, which does not require an exact SDR, employs the reduced model as a control variate to reduce the error of the MC estimate. The accuracy gain can still be significant, depending on how well the reduced model approximates the original high-dimensional one. IRUQ also provides several additional practical advantages: it is non-intrusive; it does not require computing the high-dimensional gradient of the QoI; and it reports an error bar so the user knows how reliable the result is.

  12. The effect of biological movement variability on the performance of the golf swing in high- and low-handicapped players.

    PubMed

    Bradshaw, Elizabeth J; Keogh, Justin W L; Hume, Patria A; Maulder, Peter S; Nortje, Jacques; Marnewick, Michel

    2009-06-01

    The purpose of this study was to examine the role of neuromotor noise in golf swing performance in high- and low-handicap players. Selected two-dimensional kinematic measures of 20 male golfers (n = 10 per high- or low-handicap group) performing 10 golf swings with a 5-iron club were obtained through video analysis. Neuromotor noise was calculated by subtracting the standard error of measurement from the coefficient of variation obtained from intra-individual analysis. Statistical methods included linear regression analysis and one-way analysis of variance using SPSS. Absolute invariance in the key technical positions of the golf swing (e.g., at the top of the backswing) appears to be a more favorable technique for skilled performance.

  13. Optical conductivity of three and two dimensional topological nodal-line semimetals

    NASA Astrophysics Data System (ADS)

    Barati, Shahin; Abedinpour, Saeed H.

    2017-10-01

    The peculiar shape of the Fermi surface of topological nodal-line semimetals at low carrier concentrations results in their unusual optical and transport properties. We analytically investigate the linear optical responses of three- and two-dimensional nodal-line semimetals using the Kubo formula. The optical conductivity of a three-dimensional nodal-line semimetal is anisotropic. Along the axial direction (i.e., the direction perpendicular to the nodal-ring plane), the Drude weight has a linear dependence on the chemical potential at both low and high carrier dopings. For the radial direction (i.e., the direction parallel to the nodal-ring plane), this dependence changes from linear to quadratic in the transition from low to high carrier concentration. The interband contribution to the optical conductivity is also anisotropic. In particular, at large frequencies it saturates to a constant value along the axial direction and increases linearly with frequency along the radial direction. In two-dimensional nodal-line semimetals, no interband optical transitions can be induced, and the only contribution to the optical conductivity arises from intraband excitations. The corresponding Drude weight is independent of the carrier density at low carrier concentrations and increases linearly with the chemical potential at high carrier doping.

  14. Differences in Student Evaluations of Limited-Term Lecturers and Full-Time Faculty

    ERIC Educational Resources Information Center

    Cho, Jeong-Il; Otani, Koichiro; Kim, B. Joon

    2014-01-01

    This study compared student evaluations of teaching (SET) for limited-term lecturers (LTLs) and full-time faculty (FTF) using a Likert-scaled survey administered to students (N = 1,410) at the end of university courses. Data were analyzed using a general linear regression model to investigate the influence of multi-dimensional evaluation items on…

  15. A computer program for uncertainty analysis integrating regression and Bayesian methods

    USGS Publications Warehouse

    Lu, Dan; Ye, Ming; Hill, Mary C.; Poeter, Eileen P.; Curtis, Gary

    2014-01-01

    This work develops a new functionality in UCODE_2014 to evaluate Bayesian credible intervals using the Markov Chain Monte Carlo (MCMC) method. The MCMC capability in UCODE_2014 is based on the FORTRAN version of the differential evolution adaptive Metropolis (DREAM) algorithm of Vrugt et al. (2009), which estimates the posterior probability density function of model parameters in high-dimensional and multimodal sampling problems. The UCODE MCMC capability provides eleven prior probability distributions and three ways to initialize the sampling process. It evaluates parametric and predictive uncertainties and it has parallel computing capability based on multiple chains to accelerate the sampling process. This paper tests and demonstrates the MCMC capability using a 10-dimensional multimodal mathematical function, a 100-dimensional Gaussian function, and a groundwater reactive transport model. The use of the MCMC capability is made straightforward and flexible by adopting the JUPITER API protocol. With the new MCMC capability, UCODE_2014 can be used to calculate three types of uncertainty intervals, which all can account for prior information: (1) linear confidence intervals which require linearity and Gaussian error assumptions and typically 10s–100s of highly parallelizable model runs after optimization, (2) nonlinear confidence intervals which require a smooth objective function surface and Gaussian observation error assumptions and typically 100s–1,000s of partially parallelizable model runs after optimization, and (3) MCMC Bayesian credible intervals which require few assumptions and commonly 10,000s–100,000s or more partially parallelizable model runs. Ready access allows users to select methods best suited to their work, and to compare methods in many circumstances.

  16. GLOBALLY ADAPTIVE QUANTILE REGRESSION WITH ULTRA-HIGH DIMENSIONAL DATA

    PubMed Central

    Zheng, Qi; Peng, Limin; He, Xuming

    2015-01-01

    Quantile regression has become a valuable tool to analyze heterogeneous covariate-response associations that are often encountered in practice. The development of quantile regression methodology for high dimensional covariates primarily focuses on examination of model sparsity at a single or multiple quantile levels, which are typically prespecified ad hoc by the users. The resulting models may be sensitive to the specific choices of the quantile levels, leading to difficulties in interpretation and erosion of confidence in the results. In this article, we propose a new penalization framework for quantile regression in the high dimensional setting. We employ adaptive L1 penalties, and more importantly, propose a uniform selector of the tuning parameter for a set of quantile levels to avoid some of the potential problems with model selection at individual quantile levels. Our proposed approach achieves consistent shrinkage of regression quantile estimates across a continuous range of quantile levels, enhancing the flexibility and robustness of the existing penalized quantile regression methods. Our theoretical results include the oracle rate of uniform convergence and weak convergence of the parameter estimators. We also use numerical studies to confirm our theoretical findings and illustrate the practical utility of our proposal. PMID:26604424
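
    The uniform-tuning idea can be mimicked with off-the-shelf tools. In the hedged sketch below, scikit-learn's QuantileRegressor fits L1-penalized linear quantile regressions over a grid of quantile levels on synthetic data; a plain L1 penalty with one shared tuning parameter stands in for the paper's adaptive penalties and uniform selector.

```python
import numpy as np
from sklearn.linear_model import QuantileRegressor

rng = np.random.default_rng(0)
n, p = 200, 50                     # many predictors relative to the sample size
X = rng.normal(size=(n, p))
# Heteroscedastic response: only the first two predictors matter.
y = X[:, 0] + 0.5 * X[:, 1] + (1 + 0.5 * np.abs(X[:, 0])) * rng.normal(size=n)

taus = np.linspace(0.1, 0.9, 9)    # a stand-in for a continuous range of levels
alpha = 0.05                       # one shared L1 tuning parameter for all levels
for tau in taus:
    fit = QuantileRegressor(quantile=tau, alpha=alpha, solver="highs").fit(X, y)
    active = np.flatnonzero(np.abs(fit.coef_) > 1e-8)
    print(f"tau={tau:.1f}: selected predictors {active}")
```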

  17. Regression-based adaptive sparse polynomial dimensional decomposition for sensitivity analysis

    NASA Astrophysics Data System (ADS)

    Tang, Kunkun; Congedo, Pietro; Abgrall, Remi

    2014-11-01

    Polynomial dimensional decomposition (PDD) is employed in this work for global sensitivity analysis and uncertainty quantification of stochastic systems subject to a large number of random input variables. Due to the intimate connection between PDD and analysis of variance, PDD is able to provide a simpler and more direct evaluation of the Sobol' sensitivity indices than polynomial chaos (PC). Unfortunately, the number of PDD terms grows exponentially with the size of the input random vector, which makes the computational cost of the standard method unaffordable for real engineering applications. In order to address this curse of dimensionality, this work proposes a variance-based adaptive strategy aiming to build a cheap meta-model by sparse PDD, with the PDD coefficients computed by regression. During this adaptive procedure, the PDD representation of the model contains only a few terms, so that the cost of repeatedly solving the linear system of the least-squares regression problem is negligible. The size of the final sparse-PDD representation is much smaller than the full PDD, since only significant terms are eventually retained. Consequently, far fewer calls to the deterministic model are required to compute the final PDD coefficients.

  18. Approximating high-dimensional dynamics by barycentric coordinates with linear programming

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hirata, Yoshito, E-mail: yoshito@sat.t.u-tokyo.ac.jp; Aihara, Kazuyuki; Suzuki, Hideyuki

    The increasing development of novel methods and techniques facilitates the measurement of high-dimensional time series but challenges our ability to model and predict them accurately. The use of a general mathematical model requires the inclusion of many parameters, which are difficult to fit from the relatively short high-dimensional time series typically observed. Here, we propose a novel method to accurately model a high-dimensional time series. Our method extends barycentric coordinates to high-dimensional phase space by employing linear programming and allowing for the approximation errors explicitly. The extension helps to produce free-running time-series predictions that preserve typical topological, dynamical, and/or geometric characteristics of the underlying attractors more accurately than the widely used radial basis function model. The method can be broadly applied, from helping to improve weather forecasting, to creating electronic instruments that sound more natural, and to comprehensively understanding complex biological data.

  19. Approximating high-dimensional dynamics by barycentric coordinates with linear programming.

    PubMed

    Hirata, Yoshito; Shiro, Masanori; Takahashi, Nozomu; Aihara, Kazuyuki; Suzuki, Hideyuki; Mas, Paloma

    2015-01-01

    The increasing development of novel methods and techniques facilitates the measurement of high-dimensional time series but challenges our ability to model and predict them accurately. The use of a general mathematical model requires the inclusion of many parameters, which are difficult to fit from the relatively short high-dimensional time series typically observed. Here, we propose a novel method to accurately model a high-dimensional time series. Our method extends barycentric coordinates to high-dimensional phase space by employing linear programming and allowing for the approximation errors explicitly. The extension helps to produce free-running time-series predictions that preserve typical topological, dynamical, and/or geometric characteristics of the underlying attractors more accurately than the widely used radial basis function model. The method can be broadly applied, from helping to improve weather forecasting, to creating electronic instruments that sound more natural, and to comprehensively understanding complex biological data.
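
    A minimal sketch of the core idea, under assumed toy data: express a query state as a convex (barycentric) combination of reference states by linear programming, minimizing an explicit L1 approximation error with scipy's linprog. The library/query names are illustrative only, not the paper's notation.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
library = rng.normal(size=(30, 5))   # 30 reference states in a 5-D phase space
query = rng.normal(size=5)           # current state to express barycentrically

n, d = library.shape
# Variables: weights w (n) and slacks t (d) bounding |library.T @ w - query|.
c = np.concatenate([np.zeros(n), np.ones(d)])      # minimise total L1 error
A_ub = np.block([[library.T, -np.eye(d)],
                 [-library.T, -np.eye(d)]])
b_ub = np.concatenate([query, -query])
A_eq = np.concatenate([np.ones(n), np.zeros(d)])[None, :]  # weights sum to one
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
              bounds=[(0, None)] * (n + d), method="highs")

w = res.x[:n]                        # barycentric weights, w >= 0, sum(w) = 1
print("approximation error (L1):", res.fun)
# A free-running prediction would apply these weights to the successors
# of the reference states, e.g. future_states.T @ w.
```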

  20. An Investigation of the Fit of Linear Regression Models to Data from an SAT[R] Validity Study. Research Report 2011-3

    ERIC Educational Resources Information Center

    Kobrin, Jennifer L.; Sinharay, Sandip; Haberman, Shelby J.; Chajewski, Michael

    2011-01-01

    This study examined the adequacy of a multiple linear regression model for predicting first-year college grade point average (FYGPA) using SAT[R] scores and high school grade point average (HSGPA). A variety of techniques, both graphical and statistical, were used to examine if it is possible to improve on the linear regression model. The results…

  1. Big Data Toolsets to Pharmacometrics: Application of Machine Learning for Time‐to‐Event Analysis

    PubMed Central

    Gong, Xiajing; Hu, Meng

    2018-01-01

    Abstract Additional value can be potentially created by applying big data tools to address pharmacometric problems. The performances of machine learning (ML) methods and the Cox regression model were evaluated based on simulated time‐to‐event data synthesized under various preset scenarios, i.e., with linear vs. nonlinear and dependent vs. independent predictors in the proportional hazard function, or with high‐dimensional data featured by a large number of predictor variables. Our results showed that ML‐based methods outperformed the Cox model in prediction performance as assessed by concordance index and in identifying the preset influential variables for high‐dimensional data. The prediction performances of ML‐based methods are also less sensitive to data size and censoring rates than the Cox regression model. In conclusion, ML‐based methods provide a powerful tool for time‐to‐event analysis, with a built‐in capacity for high‐dimensional data and better performance when the predictor variables assume nonlinear relationships in the hazard function. PMID:29536640

  2. Modeling vocalization with ECoG cortical activity recorded during vocal production in the macaque monkey.

    PubMed

    Fukushima, Makoto; Saunders, Richard C; Fujii, Naotaka; Averbeck, Bruno B; Mishkin, Mortimer

    2014-01-01

    Vocal production is an example of controlled motor behavior with high temporal precision. Previous studies have decoded auditory evoked cortical activity while monkeys listened to vocalization sounds. On the other hand, there have been few attempts at decoding motor cortical activity during vocal production. Here we recorded cortical activity during vocal production in the macaque with a chronically implanted electrocorticographic (ECoG) electrode array. The array detected robust activity in motor cortex during vocal production. We used a nonlinear dynamical model of the vocal organ to reduce the dimensionality of 'Coo' calls produced by the monkey. We then used linear regression to evaluate the information in motor cortical activity for this reduced representation of calls. This simple linear model accounted for about 65% of the variance in the reduced sound representations, supporting the feasibility of using the dynamical model of the vocal organ for decoding motor cortical activity during vocal production.
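
    A hedged sketch of this kind of linear decoding on synthetic stand-in data (the array sizes and the linear mapping are assumptions, not the study's recordings): fit ordinary least squares from ECoG features to a reduced sound representation and report held-out variance explained.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
n_calls, n_channels, n_features = 120, 32, 3     # hypothetical sizes
ecog = rng.normal(size=(n_calls, n_channels))    # motor ECoG features per call
true_map = rng.normal(size=(n_channels, n_features))
# Reduced call representation = linear function of ECoG plus noise.
sound = ecog @ true_map + 0.7 * rng.normal(size=(n_calls, n_features))

X_tr, X_te, y_tr, y_te = train_test_split(ecog, sound, random_state=0)
model = LinearRegression().fit(X_tr, y_tr)
print("held-out variance explained (R^2):", r2_score(y_te, model.predict(X_te)))
```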

  3. Estimation of Human Body Volume (BV) from Anthropometric Measurements Based on Three-Dimensional (3D) Scan Technique.

    PubMed

    Liu, Xingguo; Niu, Jianwei; Ran, Linghua; Liu, Taijie

    2017-08-01

    This study aimed to develop estimation formulae for the total human body volume (BV) of adult males using anthropometric measurements based on a three-dimensional (3D) scanning technique. Noninvasive and reliable methods to predict the total BV from anthropometric measurements based on a 3D scan technique were addressed in detail. A regression analysis of BV based on four key measurements was conducted for approximately 160 adult male subjects. Eight total models of human BV show that the predicted results fitted by the regression models were highly correlated with the actual BV (p < 0.001). Two metrics were calculated: the mean absolute difference between the actual and predicted BV (V_error) and the mean ratio between V_error and the actual BV (RV_error). The linear model based on human weight was recommended as the most optimal due to its simplicity and high efficiency. The proposed estimation formulae are valuable for estimating total body volume in circumstances in which traditional underwater weighing or air displacement plethysmography is not applicable or accessible.

  4. Quantifying Melt Ponds in the Beaufort MIZ using Linear Support Vector Machines from High Resolution Panchromatic Images

    NASA Astrophysics Data System (ADS)

    Ortiz, M.; Graber, H. C.; Wilkinson, J.; Nyman, L. M.; Lund, B.

    2017-12-01

    Much work has been done on determining changes in summer ice albedo and morphological properties of melt ponds, such as depth, shape, and distribution, using in-situ measurements and satellite-based sensors. Although these studies have contributed much pioneering work in this area, coverage at sufficient spatial and temporal scales is still lacking. We present a prototype algorithm using linear support vector machines (LSVMs) designed to quantify the evolution of melt pond fraction from a recently government-declassified high-resolution panchromatic optical dataset. The study area of interest lies within the Beaufort marginal ice zone (MIZ), where several in-situ instruments were deployed by the British Antarctic Survey jointly with the MIZ Program from April to September 2014. The LSVM uses four-dimensional feature data from the intensity image itself and from various textures calculated with a modified first-order histogram technique using the probability density of occurrences. We explore both the temporal evolution of melt ponds and spatial statistics such as pond fraction, pond area, and pond number density, to name a few. We also introduce a linear regression model that can potentially be used to estimate average pond area by ingesting several melt pond statistics and shape parameters.
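
    A minimal sketch of the classification step under invented feature statistics (the four-dimensional intensity/texture features here are synthetic stand-ins for the real imagery): train a linear SVM and estimate pixel-level accuracy by cross-validation.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
n = 1000
# Hypothetical 4-D feature vectors (pixel intensity + three texture measures):
# melt ponds assumed darker and smoother than the surrounding ice.
ponds = rng.normal([0.3, 0.2, 0.1, 0.2], 0.08, size=(n, 4))
ice = rng.normal([0.7, 0.4, 0.3, 0.5], 0.08, size=(n, 4))
X = np.vstack([ponds, ice])
y = np.r_[np.ones(n), np.zeros(n)]   # 1 = melt pond pixel, 0 = ice pixel

clf = make_pipeline(StandardScaler(), LinearSVC(C=1.0))
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())
# Pond fraction of an image is then the share of pixels classified as pond.
```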

  5. Ganglion cell-inner plexiform layer and retinal nerve fiber layer thickness according to myopia and optic disc area: a quantitative and three-dimensional analysis.

    PubMed

    Seo, Sam; Lee, Chong Eun; Jeong, Jae Hoon; Park, Ki Ho; Kim, Dong Myung; Jeoung, Jin Wook

    2017-03-11

    To determine the influences of myopia and optic disc size on ganglion cell-inner plexiform layer (GCIPL) and peripapillary retinal nerve fiber layer (RNFL) thickness profiles obtained by spectral domain optical coherence tomography (OCT). One hundred and sixty-eight eyes of 168 young myopic subjects were recruited and assigned to one of three groups according to their spherical equivalent (SE) values and optic disc area. All underwent Cirrus HD-OCT imaging. The influences of myopia and optic disc size on the GCIPL and RNFL thickness profiles were evaluated by multiple comparisons and linear regression analysis. Three-dimensional surface plots of GCIPL and RNFL thickness corresponding to different combinations of myopia and optic disc size were constructed. Each of the quadrant RNFL thicknesses and their overall average were significantly thinner in high myopia compared to low myopia, except for the temporal quadrant (all Ps ≤ 0.003). The average and all-sectors GCIPL were significantly thinner in high myopia than in moderate- and/or low-myopia (all Ps ≤ 0.002). The average OCT RNFL thickness was correlated significantly with SE (0.81 μm/diopter, P < 0.001), axial length (-1.44 μm/mm, P < 0.001), and optic disc area (5.35 μm/mm², P < 0.001) by linear regression analysis. As for the OCT GCIPL parameters, average GCIPL thickness showed a significant correlation with SE (0.84 μm/diopter, P < 0.001) and axial length (-1.65 μm/mm, P < 0.001). There was no significant correlation of average GCIPL thickness with optic disc area. Three-dimensional curves showed that larger optic discs were associated with increased average RNFL thickness and that more-myopic eyes were associated with decreased average GCIPL and RNFL thickness. Myopia can significantly affect GCIPL and RNFL thickness profiles, and optic disc size has a significant influence on RNFL thickness. The current OCT maps employed in the evaluation of glaucoma should be analyzed in consideration of refractive status and optic disc size.

  6. Influence of specific obsessive-compulsive symptom dimensions on strategic planning in patients with obsessive-compulsive disorder.

    PubMed

    Pinto, Paula Sanders Pereira; Iego, Sandro; Nunes, Samantha; Menezes, Hemanny; Mastrorosa, Rosana Sávio; Oliveira, Irismar Reis de; Rosário, Maria Conceição do

    2011-03-01

    This study investigates obsessive-compulsive disorder patients in terms of strategic planning and its association with specific obsessive-compulsive symptom dimensions. We evaluated 32 obsessive-compulsive disorder patients. Strategic planning was assessed by the Rey-Osterrieth Complex Figure Test, and the obsessive-compulsive dimensions were assessed by the Dimensional Yale-Brown Obsessive-Compulsive Scale. In the statistical analyses, the level of significance was set at 5%. We employed linear regression, including age, intelligence quotient, number of comorbidities, the Yale-Brown Obsessive-Compulsive Scale score, and the Dimensional Yale-Brown Obsessive-Compulsive Scale. The Dimensional Yale-Brown Obsessive-Compulsive Scale "worst-ever" score correlated significantly with the planning score on the copy portion of the Rey-Osterrieth Complex Figure Test (r = 0.4, p = 0.04) and was the only variable to show a significant association after linear regression (β = 0.55, t = 2.1, p = 0.04). Compulsive hoarding correlated positively with strategic planning (r = 0.44, p = 0.03). None of the remaining symptom dimensions presented any significant correlations with strategic planning. We found the severity of obsessive-compulsive symptoms to be associated with strategic planning. In addition, there was a significant positive association between the planning score on the copy portion of the Rey-Osterrieth Complex Figure Test and the hoarding dimension score on the Dimensional Yale-Brown Obsessive-Compulsive Scale. Our results underscore the idea that obsessive-compulsive disorder is a heterogeneous disorder and suggest that the hoarding dimension has a specific neuropsychological profile. Therefore, it is important to assess the peculiarities of each obsessive-compulsive symptom dimension.

  7. Effect of removing the common mode errors on linear regression analysis of noise amplitudes in position time series of a regional GPS network & a case study of GPS stations in Southern California

    NASA Astrophysics Data System (ADS)

    Jiang, Weiping; Ma, Jun; Li, Zhao; Zhou, Xiaohui; Zhou, Boye

    2018-05-01

    Analyzing the correlations between the noise in different components of GPS stations helps to obtain more accurate uncertainties of the velocities of station motion. Previous research into noise in GPS position time series focused mainly on single-component evaluation, which affects the acquisition of precise station positions, the velocity field, and its uncertainty. In this study, before and after removing the common-mode error (CME), we performed one-dimensional linear regression analysis of the noise amplitude vectors in different components of 126 GPS stations in Southern California, modeled as a combination of white noise, flicker noise, and random walk noise. The results show that, on the one hand, there are above-moderate degrees of correlation between the white noise amplitude vectors in all components of the stations before and after removal of the CME, while the correlations between flicker noise amplitude vectors in horizontal and vertical components are enhanced from uncorrelated to moderately correlated by removing the CME. On the other hand, the significance tests show that all of the obtained linear regression equations, which represent a unique function of the noise amplitude in any two components, are of practical value after removing the CME. According to the noise amplitude estimates in two components and the linear regression equations, more accurate noise amplitudes can be acquired in the two components.

  8. Polarization-dependent plasmonic photocurrents in two-dimensional electron systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Popov, V. V., E-mail: popov-slava@yahoo.co.uk; Saratov State University, Saratov 410012; Saratov Scientific Center of the Russian Academy of Sciences, Saratov 410028

    2016-06-27

    Plasmonic polarization-dependent photocurrents in a homogeneous two-dimensional electron system are studied. These effects are completely different from the photon drag and electronic photogalvanic effects, as well as from the plasmonic ratchet effect in a density-modulated two-dimensional electron system. Linear and helicity-dependent contributions to the photocurrent are found. The linear contribution can be interpreted as caused by the longitudinal and transverse plasmon drag effect. The helicity-dependent contribution originates from the non-linear electron convection and changes its sign with reversal of the plasmonic field helicity. It is shown that the helicity-dependent component of the photocurrent can exceed the linear one by several orders of magnitude in high-mobility two-dimensional electron systems. The results open possibilities for all-electronic detection of radiation polarization states by exciting plasmonic photocurrents in two-dimensional electron systems.

  9. Comparative study of Poincaré plot analysis using short electroencephalogram signals during anaesthesia with spectral edge frequency 95 and bispectral index.

    PubMed

    Hayashi, K; Yamada, T; Sawa, T

    2015-03-01

    The return or Poincaré plot is a non-linear analytical approach in a two-dimensional plane, where a timed signal is plotted against itself after a time delay. Its scatter pattern reflects the randomness and variability in the signals. Quantification of a Poincaré plot of the electroencephalogram has potential to determine anaesthesia depth. We quantified the degree of dispersion (i.e. standard deviation, SD) along the diagonal line of the electroencephalogram Poincaré plot (termed SD1/SD2), and compared SD1/SD2 values with spectral edge frequency 95 (SEF95) and bispectral index values. The regression analysis showed a tight linear regression equation with a coefficient of determination (R²) of 0.904 (p < 0.0001) between the Poincaré index (SD1/SD2) and SEF95, and a moderate linear regression equation between SD1/SD2 and the bispectral index (R² = 0.346, p < 0.0001). Quantification of the Poincaré plot correlates tightly with SEF95, reflecting anaesthesia-dependent changes in electroencephalogram oscillation.
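
    The SD1/SD2 descriptors are straightforward to compute; below is a minimal sketch using the standard Poincaré-plot formulas on a synthetic trace (the toy signal is an assumption, not the study's EEG).

```python
import numpy as np

def poincare_sd1_sd2(x):
    """Dispersion of the Poincaré plot of x, perpendicular to (SD1)
    and along (SD2) the line of identity."""
    x1, x2 = x[:-1], x[1:]                 # signal against its delayed copy
    sd1 = np.std(x2 - x1) / np.sqrt(2)     # width across the diagonal
    sd2 = np.std(x2 + x1) / np.sqrt(2)     # length along the diagonal
    return sd1, sd2

rng = np.random.default_rng(0)
t = np.arange(5000) / 250.0                # a 20 s toy trace sampled at 250 Hz
eeg = np.sin(2 * np.pi * 10 * t) + 0.3 * rng.normal(size=t.size)

sd1, sd2 = poincare_sd1_sd2(eeg)
print(f"SD1={sd1:.3f}, SD2={sd2:.3f}, SD1/SD2={sd1 / sd2:.3f}")
```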

  10. Cognitive flexibility correlates with gambling severity in young adults.

    PubMed

    Leppink, Eric W; Redden, Sarah A; Chamberlain, Samuel R; Grant, Jon E

    2016-10-01

    Although gambling disorder (GD) is often characterized as a problem of impulsivity, compulsivity has recently been proposed as a potentially important feature of addictive disorders. The present analysis assessed the neurocognitive and clinical relationship between compulsivity and gambling behavior. A sample of 552 non-treatment-seeking gamblers aged 18-29 was recruited from the community for a study on gambling in young adults. Gambling severity levels included both casual and disordered gamblers. All participants completed the Intra/Extra-Dimensional Set Shift (IED) task, from which the total adjusted errors were correlated with gambling severity measures, and linear regression modeling was used to assess three error measures from the task. The present analysis found significant positive correlations between problems with cognitive flexibility and gambling severity (reflected by the number of DSM-5 criteria, gambling frequency, amount of money lost in the past year, and gambling urge/behavior severity). IED errors also showed a positive correlation with self-reported compulsive behavior scores. A significant correlation was also found between IED errors and non-planning impulsivity from the BIS. Linear regression models based on total IED errors, extra-dimensional (ED) shift errors, or pre-ED shift errors indicated that these factors accounted for a significant portion of the variance noted in several variables. These findings suggest that cognitive flexibility may be an important consideration in the assessment of gamblers. Results from correlational and linear regression analyses support this possibility, but the exact contributions of both impulsivity and cognitive flexibility remain entangled. Future studies will ideally be able to assess the longitudinal relationships between gambling, compulsivity, and impulsivity, helping to clarify the relative contributions of both impulsive and compulsive features.

  11. Visual exploration of high-dimensional data through subspace analysis and dynamic projections

    DOE PAGES

    Liu, S.; Wang, B.; Thiagarajan, J. J.; ...

    2015-06-01

    Here, we introduce a novel interactive framework for visualizing and exploring high-dimensional datasets based on subspace analysis and dynamic projections. We assume the high-dimensional dataset can be represented by a mixture of low-dimensional linear subspaces with mixed dimensions, and provide a method to reliably estimate the intrinsic dimension and linear basis of each subspace extracted from the subspace clustering. Subsequently, we use these bases to define unique 2D linear projections as viewpoints from which to visualize the data. To understand the relationships among the different projections and to discover hidden patterns, we connect these projections through dynamic projections that create smooth animated transitions between pairs of projections. We introduce the view transition graph, which provides flexible navigation among these projections to facilitate an intuitive exploration. Finally, we provide detailed comparisons with related systems, and use real-world examples to demonstrate the novelty and usability of our proposed framework.

  12. Visual Exploration of High-Dimensional Data through Subspace Analysis and Dynamic Projections

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Liu, S.; Wang, B.; Thiagarajan, Jayaraman J.

    2015-06-01

    We introduce a novel interactive framework for visualizing and exploring high-dimensional datasets based on subspace analysis and dynamic projections. We assume the high-dimensional dataset can be represented by a mixture of low-dimensional linear subspaces with mixed dimensions, and provide a method to reliably estimate the intrinsic dimension and linear basis of each subspace extracted from the subspace clustering. Subsequently, we use these bases to define unique 2D linear projections as viewpoints from which to visualize the data. To understand the relationships among the different projections and to discover hidden patterns, we connect these projections through dynamic projections that create smooth animated transitions between pairs of projections. We introduce the view transition graph, which provides flexible navigation among these projections to facilitate an intuitive exploration. Finally, we provide detailed comparisons with related systems, and use real-world examples to demonstrate the novelty and usability of our proposed framework.

  13. Controls/CFD Interdisciplinary Research Software Generates Low-Order Linear Models for Control Design From Steady-State CFD Results

    NASA Technical Reports Server (NTRS)

    Melcher, Kevin J.

    1997-01-01

    The NASA Lewis Research Center is developing analytical methods and software tools to create a bridge between the controls and computational fluid dynamics (CFD) disciplines. Traditionally, control design engineers have used coarse nonlinear simulations to generate information for the design of new propulsion system controls. However, such traditional methods are not adequate for modeling the propulsion systems of complex, high-speed vehicles like the High Speed Civil Transport. To properly model the relevant flow physics of high-speed propulsion systems, one must use simulations based on CFD methods. Such CFD simulations have become useful tools for engineers designing propulsion system components. The analysis techniques and software being developed as part of this effort are an attempt to evolve CFD into a useful tool for control design as well. One major aspect of this research is the generation of linear models from steady-state CFD results. CFD simulations, often used during the design of high-speed inlets, yield high resolution operating point data. Under a NASA grant, the University of Akron has developed analytical techniques and software tools that use these data to generate linear models for control design. The resulting linear models have the same number of states as the original CFD simulation, so they are still very large and computationally cumbersome. Model reduction techniques have been successfully applied to reduce these large linear models by several orders of magnitude without significantly changing the dynamic response. The result is an accurate, easy to use, low-order linear model that takes less time to generate than those generated by traditional means. The development of methods for generating low-order linear models from steady-state CFD is most complete at the one-dimensional level, where software is available to generate models with different kinds of input and output variables. One-dimensional methods have been extended somewhat so that linear models can also be generated from two- and three-dimensional steady-state results. Standard techniques are adequate for reducing the order of one-dimensional CFD-based linear models. However, reduction of linear models based on two- and three-dimensional CFD results is complicated by very sparse, ill-conditioned matrices. Some novel approaches are being investigated to solve this problem.

  14. Prediction of high-dimensional states subject to respiratory motion: a manifold learning approach

    NASA Astrophysics Data System (ADS)

    Liu, Wenyang; Sawant, Amit; Ruan, Dan

    2016-07-01

    The development of high-dimensional imaging systems in image-guided radiotherapy provides important pathways to the ultimate goal of real-time full volumetric motion monitoring. Effective motion management during radiation treatment usually requires prediction to account for system latency and extra signal/image processing time. It is challenging to predict high-dimensional respiratory motion due to the complexity of the motion pattern combined with the curse of dimensionality. Linear dimension reduction methods such as PCA have been used to construct a linear subspace from the high-dimensional data, followed by efficient predictions on the lower-dimensional subspace. In this study, we extend this rationale to a more general manifold and propose a framework for high-dimensional motion prediction with manifold learning, which allows one to learn more descriptive features compared to linear methods with comparable dimensions. Specifically, a kernel PCA is used to construct a proper low-dimensional feature manifold, where accurate and efficient prediction can be performed. A fixed-point iterative pre-image estimation method is used to recover the predicted value in the original state space. We evaluated and compared the proposed method with a PCA-based approach on level-set surfaces reconstructed from point clouds captured by a 3D photogrammetry system. The prediction accuracy was evaluated in terms of root-mean-square error. Our proposed method achieved consistently higher prediction accuracy (sub-millimeter) for both 200 ms and 600 ms lookahead lengths compared to the PCA-based approach, and the performance gain was statistically significant.
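
    A hedged sketch of the pipeline on synthetic surface data: scikit-learn's KernelPCA builds the low-dimensional feature manifold, a naive linear extrapolation stands in for the predictor, and the library's learned (ridge-based) inverse transform stands in for the paper's fixed-point pre-image estimation. All sizes and the toy breathing model are assumptions.

```python
import numpy as np
from sklearn.decomposition import KernelPCA

rng = np.random.default_rng(5)
# Toy stand-in for high-dimensional respiratory states: 500-D surface samples
# driven by a low-dimensional breathing phase.
phase = np.linspace(0, 8 * np.pi, 400)
basis = rng.normal(size=(3, 500))
states = np.column_stack([np.sin(phase), np.cos(phase), np.sin(2 * phase)]) @ basis

kpca = KernelPCA(n_components=3, kernel="rbf", gamma=1e-3,
                 fit_inverse_transform=True)   # learned pre-image map, not the
Z = kpca.fit_transform(states)                 # fixed-point iteration of the paper

# Naive lookahead on the manifold: linear extrapolation of the latest step.
z_pred = Z[-1] + (Z[-1] - Z[-2])
state_pred = kpca.inverse_transform(z_pred[None, :])   # back to original space
rmse = np.sqrt(np.mean((state_pred - states[-1]) ** 2))
print("RMSE of the toy one-step prediction:", rmse)
```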

  15. Reduced-order modelling of parameter-dependent, linear and nonlinear dynamic partial differential equation models.

    PubMed

    Shah, A A; Xing, W W; Triantafyllidis, V

    2017-04-01

    In this paper, we develop reduced-order models for dynamic, parameter-dependent, linear and nonlinear partial differential equations using proper orthogonal decomposition (POD). The main challenges are to accurately and efficiently approximate the POD bases for new parameter values and, in the case of nonlinear problems, to efficiently handle the nonlinear terms. We use a Bayesian nonlinear regression approach to learn the snapshots of the solutions and the nonlinearities for new parameter values. Computational efficiency is ensured by using manifold learning to perform the emulation in a low-dimensional space. The accuracy of the method is demonstrated on a linear and a nonlinear example, with comparisons with a global basis approach.
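
    The POD step itself is compact; a minimal sketch with synthetic snapshots (the sizes and energy threshold are assumptions) extracts the basis from an SVD of the snapshot matrix and truncates it by captured energy.

```python
import numpy as np

rng = np.random.default_rng(2)
# Snapshot matrix: each column is one PDE solution (e.g. one time/parameter value).
n_dof, n_snap = 2000, 60
modes = rng.normal(size=(n_dof, 4))
amps = rng.normal(size=(4, n_snap)) * np.array([10, 5, 1, 0.1])[:, None]
snapshots = modes @ amps + 0.01 * rng.normal(size=(n_dof, n_snap))

# POD basis = left singular vectors; truncate by captured "energy".
U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
energy = np.cumsum(s ** 2) / np.sum(s ** 2)
r = int(np.searchsorted(energy, 0.999)) + 1
basis = U[:, :r]
print(f"retained {r} POD modes capturing {energy[r - 1]:.4%} of the energy")

# Reduced coordinates of a snapshot: project onto the basis, then reconstruct.
coords = basis.T @ snapshots[:, 0]
reconstruction = basis @ coords
```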

  16. Reduced-order modelling of parameter-dependent, linear and nonlinear dynamic partial differential equation models

    PubMed Central

    Shah, A. A.; Xing, W. W.; Triantafyllidis, V.

    2017-01-01

    In this paper, we develop reduced-order models for dynamic, parameter-dependent, linear and nonlinear partial differential equations using proper orthogonal decomposition (POD). The main challenges are to accurately and efficiently approximate the POD bases for new parameter values and, in the case of nonlinear problems, to efficiently handle the nonlinear terms. We use a Bayesian nonlinear regression approach to learn the snapshots of the solutions and the nonlinearities for new parameter values. Computational efficiency is ensured by using manifold learning to perform the emulation in a low-dimensional space. The accuracy of the method is demonstrated on a linear and a nonlinear example, with comparisons with a global basis approach. PMID:28484327

  17. Factor Analysis of Linear Type Traits and Their Relation with Longevity in Brazilian Holstein Cattle

    PubMed Central

    Kern, Elisandra Lurdes; Cobuci, Jaime Araújo; Costa, Cláudio Napolis; Pimentel, Concepta Margaret McManus

    2014-01-01

    In this study we aimed to evaluate the reduction in dimensionality of 20 linear type traits plus final score in 14,943 Holstein cows in Brazil using factor analysis, and to indicate their relationship with longevity and 305 d first-lactation milk production. Low partial correlations (−0.19 to 0.38), the medium-to-high Kaiser measure of sampling adequacy (0.79) and the significance of the Bartlett sphericity test (p<0.001) indicated correlations between type traits and the suitability of these data for a factor analysis, after the elimination of seven traits. Two factors had eigenvalues greater than one. The first included width and height of posterior udder, udder texture, udder cleft, loin strength, bone quality and final score. The second included stature, top line, chest width, body depth, fore udder attachment, angularity and final score. The linear regression of the factors on several measures of longevity and 305 d milk production showed that selection considering only the first factor should lead to improvements in longevity and 305 d milk production. PMID:25050015

  18. Quantile Regression in the Study of Developmental Sciences

    PubMed Central

    Petscher, Yaacov; Logan, Jessica A. R.

    2014-01-01

    Linear regression analysis is one of the most common techniques applied in developmental research, but only allows for an estimate of the average relations between the predictor(s) and the outcome. This study describes quantile regression, which provides estimates of the relations between the predictor(s) and outcome, but across multiple points of the outcome's distribution. Using data from the High School and Beyond and U.S. Sustained Effects Study databases, quantile regression is demonstrated and contrasted with linear regression when considering models with: (a) one continuous predictor, (b) one dichotomous predictor, (c) a continuous and a dichotomous predictor, and (d) a longitudinal application. Results from each example illustrate the different inferences that may be drawn using linear or quantile regression. PMID:24329596
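
    A short illustration of the contrast on synthetic heteroscedastic data (the data-generating model is an assumption): the OLS slope summarizes the mean relation, while quantile slopes differ across the outcome's distribution; statsmodels provides both fits.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(11)
n = 500
x = rng.uniform(0, 10, n)
# Spread grows with x, so quantile slopes differ while the mean slope is one number.
df = pd.DataFrame({"x": x, "y": 1 + 0.5 * x + (0.2 + 0.3 * x) * rng.normal(size=n)})

ols_slope = smf.ols("y ~ x", df).fit().params["x"]
print(f"OLS (mean) slope: {ols_slope:.3f}")
for q in (0.1, 0.5, 0.9):
    fit = smf.quantreg("y ~ x", df).fit(q=q)
    print(f"quantile {q:.1f} slope: {fit.params['x']:.3f}")
```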

  19. High-throughput quantitative biochemical characterization of algal biomass by NIR spectroscopy; multiple linear regression and multivariate linear regression analysis.

    PubMed

    Laurens, L M L; Wolfrum, E J

    2013-12-18

    One of the challenges associated with microalgal biomass characterization and the comparison of microalgal strains and conversion processes is the rapid determination of the composition of algae. We have developed and applied a high-throughput screening technology based on near-infrared (NIR) spectroscopy for the rapid and accurate determination of algal biomass composition. We show that NIR spectroscopy can accurately predict the full composition using multivariate linear regression analysis of varying lipid, protein, and carbohydrate content of algal biomass samples from three strains. We also demonstrate a high quality of predictions of an independent validation set. A high-throughput 96-well configuration for spectroscopy gives equally good prediction relative to a ring-cup configuration, and thus, spectra can be obtained from as little as 10-20 mg of material. We found that lipids exhibit a dominant, distinct, and unique fingerprint in the NIR spectrum that allows for the use of single and multiple linear regression of respective wavelengths for the prediction of the biomass lipid content. This is not the case for carbohydrate and protein content, and thus, the use of multivariate statistical modeling approaches remains necessary.

  20. Isotropic-resolution linear-array-based photoacoustic computed tomography through inverse Radon transform

    NASA Astrophysics Data System (ADS)

    Li, Guo; Xia, Jun; Li, Lei; Wang, Lidai; Wang, Lihong V.

    2015-03-01

    Linear transducer arrays are readily available for ultrasonic detection in photoacoustic computed tomography. They offer low cost, hand-held convenience, and conventional ultrasonic imaging. However, the elevational resolution of linear transducer arrays, which is usually determined by the weak focus of the cylindrical acoustic lens, is about one order of magnitude worse than the in-plane axial and lateral spatial resolutions. Therefore, conventional linear scanning along the elevational direction cannot provide high-quality three-dimensional photoacoustic images due to the anisotropic spatial resolutions. Here we propose an innovative method to achieve isotropic resolutions for three-dimensional photoacoustic images through combined linear and rotational scanning. In each scan step, we first elevationally scan the linear transducer array, and then rotate the linear transducer array along its center in small steps, and scan again until 180 degrees have been covered. To reconstruct isotropic three-dimensional images from the multiple-directional scanning dataset, we use the standard inverse Radon transform originating from X-ray CT. We acquired a three-dimensional microsphere phantom image through the inverse Radon transform method and compared it with a single-elevational-scan three-dimensional image. The comparison shows that our method improves the elevational resolution by up to one order of magnitude, approaching the in-plane lateral-direction resolution. In vivo rat images were also acquired.
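
    The reconstruction step can be sketched with standard tools: below, scikit-image's radon/iradon pair reconstructs a phantom from projections spanning 180 degrees, standing in for the combined elevational-rotational photoacoustic scans (the phantom and angle grid are illustrative assumptions).

```python
import numpy as np
from skimage.data import shepp_logan_phantom
from skimage.transform import radon, iradon, rescale

# Stand-in for the elevational cross-section: each rotation of the linear array
# contributes one projection angle of the same plane.
image = rescale(shepp_logan_phantom(), 0.25)
theta = np.linspace(0.0, 180.0, max(image.shape), endpoint=False)

sinogram = radon(image, theta=theta)            # forward model of the scans
reconstruction = iradon(sinogram, theta=theta, filter_name="ramp")
error = np.sqrt(np.mean((reconstruction - image) ** 2))
print("reconstruction RMSE:", error)
```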

  1. Spectral Reconstruction Based on Svm for Cross Calibration

    NASA Astrophysics Data System (ADS)

    Gao, H.; Ma, Y.; Liu, W.; He, H.

    2017-05-01

    Chinese HY-1C/1D satellites will use a 5 nm/10 nm resolution visible-near-infrared (VNIR) hyperspectral sensor with a solar calibrator to cross-calibrate with other sensors. The hyperspectral radiance data are composed of the average radiance in the sensor's passbands and bear a spectral smoothing effect, so a transform from the hyperspectral radiance data to the 1-nm-resolution apparent spectral radiance by spectral reconstruction needs to be implemented. In order to solve the problem of noise accumulation and deterioration after several rounds of iteration in the iterative algorithm, a novel regression method based on SVM is proposed, which can closely approximate arbitrarily complex non-linear relationships and provides better generalization capability through learning. From a system viewpoint, the relationship between the apparent radiance and the equivalent radiance is a nonlinear mapping introduced by the spectral response function (SRF); the SVM transforms the low-dimensional non-linear problem into a high-dimensional linear one through a kernel function, obtaining the globally optimal solution by virtue of its quadratic form. The experiment is performed using 6S-simulated spectra that account for the SRF and SNR of the hyperspectral sensor, measured reflectance spectra of water bodies, and different atmospheric conditions. The comparative results show: first, the proposed method achieves higher reconstruction accuracy, especially for the high-frequency signal; second, as the spectral resolution of the hyperspectral sensor decreases, the proposed method performs better than the iterative method; finally, the root mean square relative error (RMSRE), which is used to evaluate the difference between the reconstructed spectrum and the real spectrum over the whole spectral range, decreases by at least a factor of two with the proposed method.

  2. Confounder Detection in High-Dimensional Linear Models Using First Moments of Spectral Measures.

    PubMed

    Liu, Furui; Chan, Laiwan

    2018-06-12

    In this letter, we study the confounder detection problem in the linear model, where a target variable Y is predicted using its d potential causes X. Based on an assumption of a rotation-invariant generating process of the model, a recent study shows that the spectral measure induced by the regression coefficient vector with respect to the covariance matrix of X is close to a uniform measure in purely causal cases, but it differs from a uniform measure characteristically in the presence of a scalar confounder. Analyzing spectral measure patterns could therefore help to detect confounding. In this letter, we propose to use the first moment of the spectral measure for confounder detection. We calculate the first moment of the regression-vector-induced spectral measure and compare it with the first moment of a uniform spectral measure, both defined with respect to the covariance matrix of X. The two moments coincide in nonconfounding cases and differ from each other in the presence of confounding. This statistical causal-confounding asymmetry can be used for confounder detection. Without the need to analyze the spectral measure pattern, our method avoids the difficulty of metric choice and multiple parameter optimization. Experiments on synthetic and real data show the performance of this method.
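
    A rough numerical sketch of the statistic under assumed generating models (the confounder mechanism and all sizes are invented for illustration): compare the first moment of the regression-vector-induced spectral measure with the eigenvalue average of the uniform measure.

```python
import numpy as np

def spectral_first_moments(X, y):
    """First moment of the regression-vector-induced spectral measure of cov(X),
    alongside that of the uniform spectral measure (the eigenvalue average)."""
    cov = np.cov(X, rowvar=False)
    lam, V = np.linalg.eigh(cov)
    beta = np.linalg.lstsq(X - X.mean(0), y - y.mean(), rcond=None)[0]
    w = (V.T @ beta) ** 2
    w /= w.sum()                        # spectral measure induced by beta
    return float(w @ lam), float(lam.mean())

rng = np.random.default_rng(0)
n, d = 5000, 30
A = rng.normal(size=(d, d))
X = rng.normal(size=(n, d)) @ A         # correlated potential causes
beta_true = rng.normal(size=d)

# Purely causal case: the two moments should roughly agree.
y = X @ beta_true + 0.1 * rng.normal(size=n)
print("causal:     ", spectral_first_moments(X, y))

# Confounded case: a hidden scalar z drives both X and y, shifting the moment.
z = rng.normal(size=n)
Xc = X + np.outer(z, rng.normal(size=d))
yc = Xc @ beta_true + 3.0 * z + 0.1 * rng.normal(size=n)
print("confounded: ", spectral_first_moments(Xc, yc))
```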

  3. Reduction of time-resolved space-based CCD photometry developed for MOST Fabry Imaging data*

    NASA Astrophysics Data System (ADS)

    Reegen, P.; Kallinger, T.; Frast, D.; Gruberbauer, M.; Huber, D.; Matthews, J. M.; Punz, D.; Schraml, S.; Weiss, W. W.; Kuschnig, R.; Moffat, A. F. J.; Walker, G. A. H.; Guenther, D. B.; Rucinski, S. M.; Sasselov, D.

    2006-04-01

    The MOST (Microvariability and Oscillations of Stars) satellite obtains ultraprecise photometry from space with high sampling rates and duty cycles. Astronomical photometry or imaging missions in low Earth orbits, like MOST, are especially sensitive to scattered light from Earthshine, and all these missions have a common need to extract target information from voluminous data cubes. They consist of upwards of hundreds of thousands of two-dimensional CCD frames (or subrasters) containing from hundreds to millions of pixels each, where the target information, superposed on background and instrumental effects, is contained only in a subset of pixels (Fabry Images, defocused images, mini-spectra). We describe a novel reduction technique for such data cubes: resolving linear correlations of target and background pixel intensities. This step-wise multiple linear regression removes only those target variations which are also detected in the background. The advantage of regression analysis versus background subtraction is the appropriate scaling, taking into account that the amount of contamination may differ from pixel to pixel. The multivariate solution for all pairs of target/background pixels is minimally invasive of the raw photometry while being very effective in reducing contamination due to, e.g. stray light. The technique is tested and demonstrated with both simulated oscillation signals and real MOST photometry.

  4. Bayesian Estimation of Multivariate Latent Regression Models: Gauss versus Laplace

    ERIC Educational Resources Information Center

    Culpepper, Steven Andrew; Park, Trevor

    2017-01-01

    A latent multivariate regression model is developed that employs a generalized asymmetric Laplace (GAL) prior distribution for regression coefficients. The model is designed for high-dimensional applications where an approximate sparsity condition is satisfied, such that many regression coefficients are near zero after accounting for all the model…

  5. Stress Regression Analysis of Asphalt Concrete Deck Pavement Based on Orthogonal Experimental Design and Interlayer Contact

    NASA Astrophysics Data System (ADS)

    Wang, Xuntao; Feng, Jianhu; Wang, Hu; Hong, Shidi; Zheng, Supei

    2018-03-01

    A three-dimensional finite element model of a box girder bridge and its asphalt concrete deck pavement was established in the ANSYS software, with the interlayer bonding of the asphalt concrete deck pavement assumed to be a contact bonding condition. Orthogonal experimental design was used to arrange the testing plans for the material parameters, and the effect of different material parameters on the mechanical response of the asphalt concrete surface layer was evaluated with a multiple linear regression model using the results of the finite element analysis. Results indicated that the stress regression equations can predict the stress of the asphalt concrete surface layer well, and that the elastic modulus of the waterproof layer has a significant influence on the stress values of the asphalt concrete surface layer.

  6. Adaptive surrogate modeling by ANOVA and sparse polynomial dimensional decomposition for global sensitivity analysis in fluid simulation

    NASA Astrophysics Data System (ADS)

    Tang, Kunkun; Congedo, Pietro M.; Abgrall, Rémi

    2016-06-01

    The Polynomial Dimensional Decomposition (PDD) is employed in this work for the global sensitivity analysis and uncertainty quantification (UQ) of stochastic systems subject to a moderate to large number of input random variables. Due to the intimate connection between the PDD and the Analysis of Variance (ANOVA) approaches, PDD is able to provide a simpler and more direct evaluation of the Sobol' sensitivity indices, when compared to the Polynomial Chaos expansion (PC). Unfortunately, the number of PDD terms grows exponentially with respect to the size of the input random vector, which makes the computational cost of standard methods unaffordable for real engineering applications. In order to address the problem of the curse of dimensionality, this work proposes essentially variance-based adaptive strategies aiming to build a cheap meta-model (i.e. surrogate model) by employing the sparse PDD approach with its coefficients computed by regression. Three levels of adaptivity are carried out in this paper: 1) the truncated dimensionality for ANOVA component functions, 2) the active dimension technique especially for second- and higher-order parameter interactions, and 3) the stepwise regression approach designed to retain only the most influential polynomials in the PDD expansion. During this adaptive procedure featuring stepwise regressions, the surrogate model representation keeps containing few terms, so that the cost of repeatedly solving the linear systems of the least-squares regression problem is negligible. The size of the finally obtained sparse PDD representation is much smaller than that of the full expansion, since only significant terms are eventually retained. Consequently, a much smaller number of calls to the deterministic model is required to compute the final PDD coefficients.

  7. The cross-validated AUC for MCP-logistic regression with high-dimensional data.

    PubMed

    Jiang, Dingfeng; Huang, Jian; Zhang, Ying

    2013-10-01

    We propose a cross-validated area under the receiver operating characteristic (ROC) curve (CV-AUC) criterion for tuning parameter selection for penalized methods in sparse, high-dimensional logistic regression models. We use this criterion in combination with the minimax concave penalty (MCP) method for variable selection. The CV-AUC criterion is specifically designed for optimizing the classification performance for binary outcome data. To implement the proposed approach, we derive an efficient coordinate descent algorithm to compute the MCP-logistic regression solution surface. Simulation studies are conducted to evaluate the finite sample performance of the proposed method and to compare it with existing methods, including the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and the extended BIC (EBIC). The model selected based on the CV-AUC criterion tends to have a larger predictive AUC and smaller classification error than those with tuning parameters selected using the AIC, BIC or EBIC. We illustrate the application of the MCP-logistic regression with the CV-AUC criterion on three microarray datasets from studies that attempt to identify genes related to cancers. Our simulation studies and data examples demonstrate that the CV-AUC is an attractive method for tuning parameter selection for penalized methods in high-dimensional logistic regression models.
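
    Since MCP implementations are not part of the common Python toolkits, the hedged sketch below substitutes an L1-penalized logistic regression to illustrate the CV-AUC tuning criterion itself on synthetic high-dimensional data; the penalty grid and data are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# L1 logistic regression stands in for MCP; the AUC-driven tuning loop is the point.
X, y = make_classification(n_samples=200, n_features=500, n_informative=10,
                           random_state=0)

best = (None, -np.inf)
for C in np.logspace(-2, 1, 10):          # C is the inverse penalty strength
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=C)
    cv_auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean()
    if cv_auc > best[1]:
        best = (C, cv_auc)
print(f"selected C={best[0]:.3g} with CV-AUC={best[1]:.3f}")
```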

  8. Fractal Dimensionality of Pore and Grain Volume of a Siliciclastic Marine Sand

    NASA Astrophysics Data System (ADS)

    Reed, A. H.; Pandey, R. B.; Lavoie, D. L.

    Three-dimensional (3D) spatial distributions of pore and grain volumes were determined from high-resolution computed tomography (CT) images of resin-impregnated marine sands. Using a linear gradient extrapolation method, cubic three-dimensional samples were constructed from two-dimensional CT images. Image porosity (0.37) was found to be consistent with the estimate of porosity by the water weight loss technique (0.36). Scaling of the pore volume (V_p) with the linear size (L), V_p ∼ L^D, provides the fractal dimensionalities of the pore volume (D = 2.74 ± 0.02) and grain volume (D = 2.90 ± 0.02), typical for sedimentary materials.
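
    A toy sketch of the volume-scaling estimate (the random binary volume is an assumption and yields D near 3; real pore structure would give D below 3): regress log pore volume on log linear size.

```python
import numpy as np

rng = np.random.default_rng(9)
# Toy 3-D binary volume standing in for a segmented CT image (True = pore voxel).
volume = rng.random((128, 128, 128)) < 0.37

# Pore volume inside nested cubes of growing linear size L: V(L) ~ L^D.
sizes = np.array([8, 16, 32, 64, 128])
V = np.array([volume[:L, :L, :L].sum() for L in sizes])
D = np.polyfit(np.log(sizes), np.log(V), 1)[0]   # slope of the log-log fit
print(f"estimated dimensionality D = {D:.2f}")   # ~3 here; <3 signals fractality
```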

  9. Variable screening via quantile partial correlation

    PubMed Central

    Ma, Shujie; Tsai, Chih-Ling

    2016-01-01

    In quantile linear regression with ultra-high dimensional data, we propose an algorithm for screening all candidate variables and subsequently selecting relevant predictors. Specifically, we first employ quantile partial correlation for screening, and then we apply the extended Bayesian information criterion (EBIC) for best subset selection. Our proposed method can successfully select predictors when the variables are highly correlated, and it can also identify variables that make a contribution to the conditional quantiles but are marginally uncorrelated or weakly correlated with the response. Theoretical results show that the proposed algorithm can yield the sure screening set. By controlling the false selection rate, model selection consistency can be achieved theoretically. In practice, we propose using EBIC for best subset selection so that the resulting model is screening consistent. Simulation studies demonstrate that the proposed algorithm performs well, and an empirical example is presented. PMID:28943683

  10. Local structure-based image decomposition for feature extraction with applications to face recognition.

    PubMed

    Qian, Jianjun; Yang, Jian; Xu, Yong

    2013-09-01

    This paper presents a robust but simple image feature extraction method, called image decomposition based on local structure (IDLS). It is assumed that in the local window of an image, the macro-pixel (patch) of the central pixel, and those of its neighbors, are locally linear. IDLS captures the local structural information by describing the relationship between the central macro-pixel and its neighbors. This relationship is represented with the linear representation coefficients determined using ridge regression. One image is actually decomposed into a series of sub-images (also called structure images) according to a local structure feature vector. All the structure images, after being down-sampled for dimensionality reduction, are concatenated into one super-vector. Fisher linear discriminant analysis is then used to provide a low-dimensional, compact, and discriminative representation for each super-vector. The proposed method is applied to face recognition and examined using our real-world face image database, NUST-RWFR, and five popular, publicly available, benchmark face image databases (AR, Extended Yale B, PIE, FERET, and LFW). Experimental results show the performance advantages of IDLS over state-of-the-art algorithms.

  11. Estimating the expected value of partial perfect information in health economic evaluations using integrated nested Laplace approximation.

    PubMed

    Heath, Anna; Manolopoulou, Ioanna; Baio, Gianluca

    2016-10-15

    The Expected Value of Perfect Partial Information (EVPPI) is a decision-theoretic measure of the 'cost' of parametric uncertainty in decision making, used principally in health economic decision making. Despite this decision-theoretic grounding, the uptake of EVPPI calculations in practice has been slow. This is in part due to the prohibitive computational time required to estimate the EVPPI via Monte Carlo simulations. However, recent developments have demonstrated that the EVPPI can be estimated by non-parametric regression methods, which have significantly decreased the computation time required to approximate the EVPPI. Under certain circumstances, high-dimensional Gaussian Process (GP) regression is suggested, but this can still be prohibitively expensive. Applying fast computation methods developed in spatial statistics using Integrated Nested Laplace Approximations (INLA) and projecting from a high-dimensional into a low-dimensional input space allows us to decrease the computation time for fitting these high-dimensional GPs, often substantially. We demonstrate that the EVPPI calculated using our method for GP regression is in line with the standard GP regression method and that, despite the apparent methodological complexity of this new method, R functions are available in the package BCEA to implement it simply and efficiently.
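
    The regression view of the EVPPI is easy to sketch: with simulated net benefits for two options, fit E[NB_d | theta] per option and take the mean of the conditional maxima minus the maximum of the means. A cubic polynomial below stands in for the GP/INLA emulator, and all numbers are invented.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 10000
theta = rng.normal(size=n)              # parameter of interest
other = rng.normal(size=n)              # remaining (nuisance) uncertainty
# Hypothetical net benefit of two treatment options from a probabilistic analysis.
nb = np.column_stack([1000 * theta + 500 * other,
                      200 + 800 * theta + 500 * other])

# Regression-based EVPPI: estimate E[NB_d | theta] for each option d.
fitted = np.column_stack([
    np.polyval(np.polyfit(theta, nb[:, d], deg=3), theta) for d in range(2)
])
evppi = fitted.max(axis=1).mean() - nb.mean(axis=0).max()
print(f"EVPPI estimate: {evppi:.1f}")
```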

  12. Variable importance in nonlinear kernels (VINK): classification of digitized histopathology.

    PubMed

    Ginsburg, Shoshana; Ali, Sahirzeeshan; Lee, George; Basavanhally, Ajay; Madabhushi, Anant

    2013-01-01

    Quantitative histomorphometry is the process of modeling appearance of disease morphology on digitized histopathology images via image-based features (e.g., texture, graphs). Due to the curse of dimensionality, building classifiers with large numbers of features requires feature selection (which may require a large training set) or dimensionality reduction (DR). DR methods map the original high-dimensional features in terms of eigenvectors and eigenvalues, which limits the potential for feature transparency or interpretability. Although methods exist for variable selection and ranking on embeddings obtained via linear DR schemes (e.g., principal components analysis (PCA)), similar methods do not yet exist for nonlinear DR (NLDR) methods. In this work we present a simple yet elegant method for approximating the mapping between the data in the original feature space and the transformed data in the kernel PCA (KPCA) embedding space; this mapping provides the basis for quantification of variable importance in nonlinear kernels (VINK). We show how VINK can be implemented in conjunction with the popular Isomap and Laplacian eigenmap algorithms. VINK is evaluated in the contexts of three different problems in digital pathology: (1) predicting five year PSA failure following radical prostatectomy, (2) predicting Oncotype DX recurrence risk scores for ER+ breast cancers, and (3) distinguishing good and poor outcome p16+ oropharyngeal tumors. We demonstrate that subsets of features identified by VINK provide similar or better classification or regression performance compared to the original high dimensional feature sets.

  13. Reduced order surrogate modelling (ROSM) of high dimensional deterministic simulations

    NASA Astrophysics Data System (ADS)

    Mitry, Mina

    Often, computationally expensive engineering simulations can prohibit the engineering design process. As a result, designers may turn to a less computationally demanding approximate, or surrogate, model to facilitate their design process. However, owing to the curse of dimensionality, classical surrogate models become too computationally expensive for high-dimensional data. To address this limitation of classical methods, we develop linear and non-linear Reduced Order Surrogate Modelling (ROSM) techniques. Two algorithms are presented, which are based on a combination of linear/kernel principal component analysis and radial basis functions. These algorithms are applied to subsonic and transonic aerodynamic data, as well as a model for a chemical spill in a channel. The results of this thesis show that ROSM can provide a significant computational benefit over classical surrogate modelling, sometimes at the expense of a minor loss in accuracy.

  14. An overview of techniques for linking high-dimensional molecular data to time-to-event endpoints by risk prediction models.

    PubMed

    Binder, Harald; Porzelius, Christine; Schumacher, Martin

    2011-03-01

    Analysis of molecular data promises identification of biomarkers for improving prognostic models, thus potentially enabling better patient management. For identifying such biomarkers, risk prediction models can be employed that link high-dimensional molecular covariate data to a clinical endpoint. In low-dimensional settings, a multitude of statistical techniques already exists for building such models, e.g. allowing for variable selection or for quantifying the added value of a new biomarker. We provide an overview of techniques for regularized estimation that transfer this toward high-dimensional settings, with a focus on models for time-to-event endpoints. Techniques for incorporating specific covariate structure are discussed, as well as techniques for dealing with more complex endpoints. Employing gene expression data from patients with diffuse large B-cell lymphoma, some typical modeling issues from low-dimensional settings are illustrated in a high-dimensional application. First, the performance of classical stepwise regression is compared to stage-wise regression, as implemented by a component-wise likelihood-based boosting approach. A second issue arises when artificially transforming the response into a binary variable. The effects of the resulting loss of efficiency and potential bias in a high-dimensional setting are illustrated, and a link to competing risks models is provided. Finally, we discuss conditions for adequately quantifying the added value of high-dimensional gene expression measurements, both at the stage of model fitting and when performing evaluation.

  15. Bayesian propensity scores for high-dimensional causal inference: A comparison of drug-eluting to bare-metal coronary stents.

    PubMed

    Spertus, Jacob V; Normand, Sharon-Lise T

    2018-04-23

    High-dimensional data provide many potential confounders that may bolster the plausibility of the ignorability assumption in causal inference problems. Propensity score methods are powerful causal inference tools, which are popular in health care research and are particularly useful for high-dimensional data. Recent interest has surrounded a Bayesian treatment of propensity scores in order to flexibly model the treatment assignment mechanism and summarize posterior quantities while incorporating variance from the treatment model. We discuss methods for Bayesian propensity score analysis of binary treatments, focusing on modern methods for high-dimensional Bayesian regression and the propagation of uncertainty. We introduce a novel and simple estimator for the average treatment effect that capitalizes on conjugacy of the beta and binomial distributions. Through simulations, we show the utility of horseshoe priors and Bayesian additive regression trees paired with our new estimator, while demonstrating the importance of including variance from the treatment regression model. An application to cardiac stent data with almost 500 confounders and 9000 patients illustrates approaches and facilitates comparison with existing alternatives. As measured by a falsifiability endpoint, we improved confounder adjustment compared with past observational research of the same problem. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
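
    The paper's horseshoe and BART machinery is not reproduced here; the following hypothetical, simplified stand-in illustrates only the conjugacy idea on synthetic data: estimate propensity scores, stratify, and draw posterior treatment-effect samples from beta-binomial updates within strata.

      import numpy as np
      from sklearn.linear_model import LogisticRegression

      rng = np.random.default_rng(3)
      n = 2000
      X = rng.normal(size=(n, 20))
      treat = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))   # confounded assignment
      y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 * treat + X[:, 0]))))

      ps = LogisticRegression(max_iter=1000).fit(X, treat).predict_proba(X)[:, 1]
      strata = np.digitize(ps, np.quantile(ps, [0.2, 0.4, 0.6, 0.8]))

      draws = np.zeros((4000, 5))
      for s in range(5):
          for arm, sign in ((1, 1.0), (0, -1.0)):
              m = (strata == s) & (treat == arm)
              # Beta(1,1) prior + binomial likelihood -> Beta posterior.
              draws[:, s] += sign * rng.beta(1 + y[m].sum(), 1 + m.sum() - y[m].sum(), size=4000)

      ate = draws.mean(axis=1)          # equal-weight strata, for simplicity
      print(ate.mean().round(3), np.percentile(ate, [2.5, 97.5]).round(3))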

  16. Extreme Sparse Multinomial Logistic Regression: A Fast and Robust Framework for Hyperspectral Image Classification

    NASA Astrophysics Data System (ADS)

    Cao, Faxian; Yang, Zhijing; Ren, Jinchang; Ling, Wing-Kuen; Zhao, Huimin; Marshall, Stephen

    2017-12-01

    Although the sparse multinomial logistic regression (SMLR) has provided a useful tool for sparse classification, it deals inefficiently with high-dimensional features and depends on manually set initial regressor values. This has significantly constrained its applications for hyperspectral image (HSI) classification. To tackle these two drawbacks, an extreme sparse multinomial logistic regression (ESMLR) is proposed for effective classification of HSI. First, the HSI dataset is projected into a new feature space with randomly generated weights and biases. Second, an optimization model is established via the Lagrange multiplier method and the dual principle to automatically determine a good initial regressor for SMLR by minimizing the training error and the regressor value. Furthermore, extended multi-attribute profiles (EMAPs) are utilized for extracting both spectral and spatial features. A combinational linear multiple features learning (MFL) method is proposed to further enhance the features extracted by ESMLR and EMAPs. Finally, logistic regression via variable splitting and augmented Lagrangian (LORSAL) is adopted in the proposed framework to reduce the computational time. Experiments conducted on two well-known HSI datasets, namely the Indian Pines dataset and the Pavia University dataset, demonstrate the fast and robust performance of the proposed ESMLR framework.
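
    A minimal sketch of the random-projection step as described, with an l1-penalized multinomial fit standing in for the SMLR/LORSAL machinery; the toy data, labels, and layer width are hypothetical.

      import numpy as np
      from sklearn.linear_model import LogisticRegression

      rng = np.random.default_rng(4)
      X = rng.normal(size=(600, 103))            # pixels x spectral bands (toy)
      y = X[:, :9].argmax(axis=1)                # 9 pseudo land-cover classes

      L = 300                                    # hidden-layer width
      W = rng.normal(size=(103, L))              # randomly generated weights ...
      b = rng.normal(size=L)                     # ... and biases, never trained
      H = np.tanh(X @ W + b)                     # projected feature space

      # l1 penalty stands in for the sparsity-promoting SMLR step.
      clf = LogisticRegression(penalty="l1", solver="saga", C=1.0, max_iter=2000)
      clf.fit(H, y)
      print("training accuracy:", round(clf.score(H, y), 3))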

  17. Research on parallel load sharing principle of piezoelectric six-dimensional heavy force/torque sensor

    NASA Astrophysics Data System (ADS)

    Liu, Wei; Li, Ying-jun; Jia, Zhen-yuan; Zhang, Jun; Qian, Min

    2011-01-01

    During the operation of large heavy-load manipulators, such as free forging machines, hydraulic die-forging presses, forging manipulators, heavy grasping manipulators and large-displacement manipulators, measurement of the six-dimensional heavy force/torque and real-time force feedback at the operation interface are the basis for coordinated operation control and force compliance control. They are also an effective way to raise control accuracy and achieve highly efficient manufacturing. To solve the dynamic measurement problem of six-dimensional, time-varying heavy loads in extreme manufacturing processes, a novel principle of parallel load sharing for six-dimensional heavy force/torque is put forward. The measuring principle of the six-dimensional force sensor is analyzed, and its spatial model is built and decoupled. The load-sharing ratios are analyzed and calculated in the vertical and horizontal directions. The mapping relationship between the six-dimensional heavy force/torque to be measured and the output force values is established. A finite element model of the parallel piezoelectric six-dimensional heavy force/torque sensor is set up, and its static characteristics are analyzed with the ANSYS software. The main parameters affecting the load-sharing ratio are analyzed, and load-sharing experiments with different diameters of the parallel axis are designed. The results show that the six-dimensional heavy force/torque sensor has good linearity, with non-linearity errors below 1%. The parallel axis provides effective load sharing: the larger its diameter, the better the load-sharing effect. The experimental results are in accordance with the FEM analysis. The sensor offers a large measuring range, good linearity, high inherent frequency and high rigidity, and can be widely used in extreme environments for real-time, accurate measurement of six-dimensional, time-varying huge loads on manipulators.

  18. Regression modeling of ground-water flow

    USGS Publications Warehouse

    Cooley, R.L.; Naff, R.L.

    1985-01-01

    Nonlinear multiple-regression methods are developed to model and analyze groundwater flow systems. Complete descriptions of regression methodology as applied to groundwater flow models allow scientists and engineers engaged in flow modeling to apply the methods to a wide range of problems. The text proceeds from an introduction that discusses the general topic of groundwater flow modeling, to a review of the basic statistics necessary to properly apply regression techniques, and then to the main topic: exposition and use of linear and nonlinear regression to model groundwater flow. Statistical procedures are given to analyze and use the regression models. Exercises and answers are included to give the student practice with nearly all of the methods presented for modeling and statistical analysis. Three computer programs implement the more complex methods: a general two-dimensional, steady-state regression model for flow in an anisotropic, heterogeneous porous medium; a program to calculate a measure of model nonlinearity with respect to the regression parameters; and a program to analyze model errors in computed dependent variables such as hydraulic head. (USGS)

  19. Estimation of the velocity and trajectory of three-dimensional reaching movements from non-invasive magnetoencephalography signals

    NASA Astrophysics Data System (ADS)

    Yeom, Hong Gi; Sic Kim, June; Chung, Chun Kee

    2013-04-01

    Objective. Studies on non-invasive brain-machine interfaces that control prosthetic devices via movement intentions are at a very early stage. Here, we aimed to estimate three-dimensional arm movements from magnetoencephalography (MEG) signals with high accuracy. Approach. Whole-head MEG signals were acquired during three-dimensional reaching movements (center-out paradigm). For movement decoding, we selected 68 MEG channels in motor-related areas, which were band-pass filtered into four subfrequency bands (0.5-8, 9-22, 25-40 and 57-97 Hz). After the filtering, the signals were resampled, and the 11 data points preceding the current data point were used as features for estimating velocity. Multiple linear regressions were used to estimate movement velocities, and movement trajectories were calculated by integrating the estimated velocities. We evaluated our results by calculating correlation coefficients (r) between real and estimated velocities. Main results. Movement velocities could be estimated from the low-frequency MEG signals (0.5-8 Hz) with significant and considerable accuracy (p < 0.001, mean r > 0.7). We also showed that preceding (60-140 ms) MEG signals are important for estimating current movement velocities and that intervals of brain signals of 200-300 ms are sufficient for movement estimation. Significance. These results imply that disabled people will be able to control prosthetic devices without surgery in the near future.
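
    A minimal sketch of the decoding step as described, with synthetic signals standing in for real MEG recordings: the 11 preceding samples of every channel form the lagged features of a multiple linear regression onto velocity, and trajectories follow by integration.

      import numpy as np
      from sklearn.linear_model import LinearRegression

      rng = np.random.default_rng(5)
      T, C, LAGS = 2000, 68, 11
      vel = np.cumsum(rng.normal(size=(T, 3)), axis=0) * 0.01      # toy 3-D velocity
      meg = vel @ rng.normal(size=(3, C)) + 0.5 * rng.normal(size=(T, C))

      # Lagged design matrix: the 11 preceding samples of every channel.
      Xlag = np.stack([meg[t - LAGS:t].ravel() for t in range(LAGS, T)])
      V = vel[LAGS:]

      model = LinearRegression().fit(Xlag, V)
      vel_hat = model.predict(Xlag)
      traj = np.cumsum(vel_hat, axis=0)                            # integrate velocity
      r = [np.corrcoef(V[:, k], vel_hat[:, k])[0, 1] for k in range(3)]
      print("per-axis r:", np.round(r, 3))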

  20. Multivariate Boosting for Integrative Analysis of High-Dimensional Cancer Genomic Data

    PubMed Central

    Xiong, Lie; Kuan, Pei-Fen; Tian, Jianan; Keles, Sunduz; Wang, Sijian

    2015-01-01

    In this paper, we propose a novel multivariate component-wise boosting method for fitting multivariate response regression models in the high-dimension, low-sample-size setting. Our method is motivated by modeling the association among different biological molecules based on multiple types of high-dimensional genomic data. In particular, we are interested in two applications: studying the influence of DNA copy number alterations on RNA transcript levels, and investigating the association between DNA methylation and gene expression. For this purpose, we model the dependence of RNA expression levels on DNA copy number alterations, and of gene expression on DNA methylation, through multivariate regression models, and utilize a boosting-type method to handle the high dimensionality as well as to model the possible nonlinear associations. The performance of the proposed method is demonstrated through simulation studies. Finally, our multivariate boosting method is applied to two breast cancer studies. PMID:26609213

  1. Handy elementary algebraic properties of the geometry of entanglement

    NASA Astrophysics Data System (ADS)

    Blair, Howard A.; Alsing, Paul M.

    2013-05-01

    The space of separable states of a quantum system is a hyperbolic surface, which we call the separation surface, within the exponentially high-dimensional linear space containing the quantum states of an n-component multipartite quantum system. A vector in the linear space is representable as an n-dimensional hypermatrix with respect to bases of the component linear spaces. A vector will be on the separation surface iff every determinant of every two-dimensional 2-by-2 submatrix of the hypermatrix vanishes. This highly rigid constraint can be tested in time asymptotically proportional to d, where d is the dimension of the state space of the system, due to the extreme interdependence of the 2-by-2 submatrices. The constraint on 2-by-2 determinants entails an elementary closed-form formula for a parametric characterization of the entire separation surface, with d-1 parameters in the characterization. The state of a factor of a partially separable state can be calculated in time asymptotically proportional to the dimension of the state space of the component. If all components of the system have approximately the same dimension, the time complexity of calculating a component state as a function of the parameters is asymptotically proportional to the time required to sort the basis. Metric-based entanglement measures of pure states are characterized in terms of the separation hypersurface.
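
    For the bipartite case the hypermatrix is an ordinary d1-by-d2 matrix, and the vanishing of all 2-by-2 minors is exactly a rank-one condition. A minimal numerical sketch on hypothetical states follows; it exploits the interdependence noted above so that only the minors through a single anchor entry are checked, in time proportional to d = d1*d2.

      import numpy as np

      def is_product_state(psi, d1, d2, tol=1e-10):
          """Bipartite separability test via vanishing 2-by-2 minors."""
          M = psi.reshape(d1, d2)
          # Anchor on the largest entry; all 2-by-2 minors through it must vanish.
          i, j = np.unravel_index(np.argmax(np.abs(M)), M.shape)
          minors = M[i, j] * M - np.outer(M[:, j], M[i, :])
          return np.abs(minors).max() < tol

      rng = np.random.default_rng(6)
      d1 = d2 = 4
      a = rng.normal(size=d1) + 1j * rng.normal(size=d1)
      b = rng.normal(size=d2) + 1j * rng.normal(size=d2)
      prod = np.kron(a, b)                        # separable by construction
      ent = np.zeros(d1 * d2, dtype=complex)
      ent[0] = ent[-1] = 1 / np.sqrt(2)           # a maximally entangled pair
      print(is_product_state(prod, d1, d2))       # True
      print(is_product_state(ent, d1, d2))        # False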

  2. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tang, Kunkun, E-mail: ktg@illinois.edu; Inria Bordeaux – Sud-Ouest, Team Cardamom, 200 avenue de la Vieille Tour, 33405 Talence; Congedo, Pietro M.

    The Polynomial Dimensional Decomposition (PDD) is employed in this work for the global sensitivity analysis and uncertainty quantification (UQ) of stochastic systems subject to a moderate to large number of input random variables. Due to the intimate connection between the PDD and the Analysis of Variance (ANOVA) approaches, PDD is able to provide a simpler and more direct evaluation of the Sobol' sensitivity indices, when compared to the Polynomial Chaos expansion (PC). Unfortunately, the number of PDD terms grows exponentially with respect to the size of the input random vector, which makes the computational cost of standard methods unaffordable for real engineering applications. In order to address the problem of the curse of dimensionality, this work proposes essentially variance-based adaptive strategies aiming to build a cheap meta-model (i.e. surrogate model) by employing the sparse PDD approach with its coefficients computed by regression. Three levels of adaptivity are carried out in this paper: 1) the truncated dimensionality for ANOVA component functions, 2) the active dimension technique especially for second- and higher-order parameter interactions, and 3) the stepwise regression approach designed to retain only the most influential polynomials in the PDD expansion. During this adaptive procedure featuring stepwise regressions, the surrogate model representation keeps containing few terms, so that the cost to resolve repeatedly the linear systems of the least-squares regression problem is negligible. The size of the finally obtained sparse PDD representation is much smaller than that of the full expansion, since only significant terms are eventually retained. Consequently, a much smaller number of calls to the deterministic model is required to compute the final PDD coefficients.
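
    A minimal sketch of the third adaptivity level under stated assumptions: orthogonal matching pursuit stands in for the paper's stepwise regression, and a generic polynomial basis stands in for the PDD/ANOVA component functions. Data and settings are hypothetical.

      import numpy as np
      from sklearn.preprocessing import PolynomialFeatures
      from sklearn.linear_model import OrthogonalMatchingPursuit

      rng = np.random.default_rng(7)
      X = rng.uniform(-1, 1, size=(400, 8))        # 8 input random variables
      y = X[:, 0] + 0.5 * X[:, 1] * X[:, 2] + 0.1 * rng.normal(size=400)

      Phi = PolynomialFeatures(degree=3, include_bias=False).fit_transform(X)
      omp = OrthogonalMatchingPursuit(n_nonzero_coefs=10).fit(Phi, y)
      kept = np.flatnonzero(omp.coef_)
      print(f"{Phi.shape[1]} candidate terms, {kept.size} retained")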

  3. Simulation of multi-stage nonlinear bone remodeling induced by fixed partial dentures of different configurations: a comparative clinical and numerical study.

    PubMed

    Liao, Zhipeng; Yoda, Nobuhiro; Chen, Junning; Zheng, Keke; Sasaki, Keiichi; Swain, Michael V; Li, Qing

    2017-04-01

    This paper aimed to develop a clinically validated bone remodeling algorithm by integrating bone's dynamic properties in a multi-stage fashion, based on a four-year clinical follow-up of implant treatment. The configurational effects of fixed partial dentures (FPDs) were explored using a multi-stage remodeling rule. Three-dimensional real-time occlusal loads during maximum voluntary clenching were measured with a piezoelectric force transducer and incorporated into a computerized-tomography-based finite element mandibular model. Virtual X-ray images were generated from the simulations and statistically correlated with clinical data using linear regression. The strain-energy-density-driven remodeling parameters were regulated over the time frame considered. A linear single-stage bone remodeling algorithm, with a single set of constant remodeling parameters, fit the clinical data poorly under linear regression (low R² and R), whereas a time-dependent multi-stage algorithm simulated the remodeling process far better against the clinical results (high R² and R). The three-implant-supported and distally cantilevered FPDs presented noticeable and continuous bone apposition, mainly adjacent to the cervical and apical regions. The bridged and mesially cantilevered FPDs showed bone resorption or no visible bone formation in some areas. Time-dependent variation of bone remodeling parameters is recommended to better correlate remodeling simulation with clinical follow-up. The position of FPD pontics plays a critical role in mechanobiological functionality and bone remodeling. Caution should be exercised when selecting a cantilever FPD due to the risk of overloading-induced bone resorption.

  4. Perinatal Medical Variables Predict Executive Function within a Sample of Preschoolers Born Very Low Birth Weight

    PubMed Central

    Duvall, Susanne W.; Erickson, Sarah J.; MacLean, Peggy; Lowe, Jean R.

    2014-01-01

    The goal was to identify perinatal predictors of early executive dysfunction in preschoolers born very low birth weight. Fifty-seven preschoolers completed three executive function tasks (Dimensional Change Card Sort-Separated (inhibition, working memory and cognitive flexibility), Bear Dragon (inhibition and working memory) and Gift Delay Open (inhibition)). Relationships between executive function and perinatal medical severity factors (gestational age, days on ventilation, size for gestational age, maternal steroids and number of surgeries), and chronological age were investigated by multiple linear regression and logistic regression. Different perinatal medical severity factors were predictive of executive function tasks, with gestational age predicting Bear Dragon and Gift Open; and number of surgeries and maternal steroids predicting performance on Dimensional Change Card Sort-Separated. By understanding the relationship between perinatal medical severity factors and preschool executive outcomes, we may be able to identify children at highest risk for future executive dysfunction, thereby focusing targeted early intervention services. PMID:25117418

  5. Quantile Regression for Analyzing Heterogeneity in Ultra-high Dimension

    PubMed Central

    Wang, Lan; Wu, Yichao

    2012-01-01

    Ultra-high dimensional data often display heterogeneity due to either heteroscedastic variance or other forms of non-location-scale covariate effects. To accommodate heterogeneity, we advocate a more general interpretation of sparsity, which assumes that only a small number of covariates influence the conditional distribution of the response variable given all candidate covariates; however, the sets of relevant covariates may differ when we consider different segments of the conditional distribution. In this framework, we investigate the methodology and theory of nonconvex penalized quantile regression in ultra-high dimension. The proposed approach has two distinctive features: (1) it enables us to explore the entire conditional distribution of the response variable given the ultra-high dimensional covariates and provides a more realistic picture of the sparsity pattern; (2) it requires substantially weaker conditions compared with alternative methods in the literature; thus, it greatly alleviates the difficulty of model checking in the ultra-high dimension. In the theoretical development, it is challenging to deal with both the nonsmooth loss function and the nonconvex penalty function in ultra-high dimensional parameter space. We introduce a novel sufficient optimality condition which relies on a convex differencing representation of the penalized loss function and the subdifferential calculus. Exploiting this optimality condition enables us to establish the oracle property for sparse quantile regression in the ultra-high dimension under relaxed conditions. The proposed method greatly enhances existing tools for ultra-high dimensional data analysis. Monte Carlo simulations demonstrate the usefulness of the proposed procedure. The real data example we analyzed demonstrates that the new approach reveals substantially more information compared with alternative methods. PMID:23082036
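
    A minimal sketch of the phenomenon the paper targets, using scikit-learn's l1-penalized QuantileRegressor as a convex stand-in for the nonconvex penalty (an assumption, not the paper's estimator): the covariates selected at the tails can differ from those selected at the median.

      import numpy as np
      from sklearn.linear_model import QuantileRegressor

      rng = np.random.default_rng(8)
      n, p = 300, 200
      X = rng.normal(size=(n, p))
      # Heteroscedastic model: X[:, 1] affects the spread, not the median.
      y = X[:, 0] + (1 + 0.9 * (X[:, 1] > 0)) * rng.normal(size=n)

      for tau in (0.1, 0.5, 0.9):
          fit = QuantileRegressor(quantile=tau, alpha=0.05, solver="highs").fit(X, y)
          active = np.flatnonzero(np.abs(fit.coef_) > 1e-8)
          print(f"tau={tau}: active covariates {active[:5]}")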

  6. Folded concave penalized learning in identifying multimodal MRI marker for Parkinson’s disease

    PubMed Central

    Liu, Hongcheng; Du, Guangwei; Zhang, Lijun; Lewis, Mechelle M.; Wang, Xue; Yao, Tao; Li, Runze; Huang, Xuemei

    2016-01-01

    Background: Brain MRI holds promise for gauging different aspects of Parkinson's disease (PD)-related pathological changes. Its analysis, however, is hindered by the high-dimensional nature of the data. New method: This study introduces folded concave penalized (FCP) sparse logistic regression to identify biomarkers for PD from a large number of potential factors. The proposed statistical procedures target the challenges of high dimensionality with the limited data samples acquired. The maximization problem associated with the sparse logistic regression model is solved by local linear approximation. The proposed procedures are then applied to the empirical analysis of multimodal MRI data. Results: From 45 features, the proposed approach identified 15 MRI markers and the UPSIT, which are known to be clinically relevant to PD. By combining the MRI and clinical markers, we can substantially enhance the specificity and sensitivity of the model, as indicated by the ROC curves. Comparison to existing methods: We compare the folded concave penalized learning scheme with both the Lasso penalized scheme and principal component analysis-based feature selection (PCA) in the Parkinson's biomarker identification problem, taking into account both the clinical features and MRI markers. The folded concave penalty method demonstrates substantially better clinical potential than both the Lasso and PCA in terms of specificity and sensitivity. Conclusions: For the first time, we applied the FCP learning method to MRI biomarker discovery in PD. The proposed approach successfully identified MRI markers that are clinically relevant. Combining these biomarkers with clinical features can substantially enhance performance. PMID:27102045
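
    The local linear approximation step can be sketched as an iteratively reweighted lasso; the sketch below uses a least-squares loss rather than the paper's logistic likelihood, and reduces each weighted problem to a plain lasso by column rescaling. The SCAD derivative and all settings are illustrative.

      import numpy as np
      from sklearn.linear_model import Lasso

      def scad_deriv(b, lam, a=3.7):
          b = np.abs(b)
          return np.where(b <= lam, lam, np.maximum(a * lam - b, 0.0) / (a - 1))

      def lla_scad(X, y, lam=0.1, iters=3, eps=1e-6):
          beta = np.zeros(X.shape[1])
          for _ in range(iters):
              w = np.maximum(scad_deriv(beta, lam) / lam, eps)  # relative weights
              fit = Lasso(alpha=lam, fit_intercept=False).fit(X / w, y)
              beta = fit.coef_ / w            # undo the column rescaling
          return beta

      rng = np.random.default_rng(9)
      X = rng.normal(size=(150, 400))
      y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(size=150)
      print("selected:", np.flatnonzero(np.abs(lla_scad(X, y)) > 1e-6))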

  7. High-Dimensional Heteroscedastic Regression with an Application to eQTL Data Analysis

    PubMed Central

    Daye, Z. John; Chen, Jinbo; Li, Hongzhe

    2011-01-01

    Summary We consider the problem of high-dimensional regression under non-constant error variances. Despite being a common phenomenon in biological applications, heteroscedasticity has, so far, been largely ignored in high-dimensional analysis of genomic data sets. We propose a new methodology that allows non-constant error variances for high-dimensional estimation and model selection. Our method incorporates heteroscedasticity by simultaneously modeling both the mean and variance components via a novel doubly regularized approach. Extensive Monte Carlo simulations indicate that our proposed procedure can result in better estimation and variable selection than existing methods when heteroscedasticity arises from the presence of predictors explaining error variances and outliers. Further, we demonstrate the presence of heteroscedasticity in, and apply our method to, an expression quantitative trait loci (eQTL) study of 112 yeast segregants. The new procedure can automatically account for heteroscedasticity in identifying the eQTLs that are associated with gene expression variations and lead to smaller prediction errors. These results demonstrate the importance of considering heteroscedasticity in eQTL data analysis. PMID:22547833

  8. Calibrated Multivariate Regression with Application to Neural Semantic Basis Discovery.

    PubMed

    Liu, Han; Wang, Lie; Zhao, Tuo

    2015-08-01

    We propose a calibrated multivariate regression method named CMR for fitting high-dimensional multivariate regression models. Compared with existing methods, CMR calibrates the regularization for each regression task with respect to its noise level, so that it simultaneously attains improved finite-sample performance and tuning insensitiveness. Theoretically, we provide sufficient conditions under which CMR achieves the optimal rate of convergence in parameter estimation. Computationally, we propose an efficient smoothed proximal gradient algorithm with a worst-case numerical rate of convergence O(1/ε), where ε is a pre-specified accuracy of the objective function value. We conduct thorough numerical simulations to illustrate that CMR consistently outperforms other high-dimensional multivariate regression methods. We also apply CMR to solve a brain activity prediction problem and find that it is as competitive as a handcrafted model created by human experts. The R package camel implementing the proposed method is available on the Comprehensive R Archive Network http://cran.r-project.org/web/packages/camel/.

  9. Predicting arsenic in drinking water wells of the Central Valley, California

    USGS Publications Warehouse

    Ayotte, Joseph; Nolan, Bernard T.; Gronberg, JoAnn M.

    2016-01-01

    Probabilities of arsenic in groundwater at depths used for domestic and public supply in the Central Valley of California are predicted using weak-learner ensemble models (boosted regression trees, BRT) and more traditional linear models (logistic regression, LR). Both methods captured major processes that affect arsenic concentrations, such as the chemical evolution of groundwater, redox differences, and the influence of aquifer geochemistry. Inferred flow-path length was the most important variable, but near-surface aquifer geochemical data also were significant. A unique feature of this study was that previously predicted nitrate concentrations in three dimensions were themselves predictive of arsenic and indicated an important redox effect at >10 μg/L, with low arsenic where nitrate was high. Additionally, a variable representing three-dimensional aquifer texture from the Central Valley Hydrologic Model was an important predictor, indicating high arsenic associated with fine-grained aquifer sediment. BRT outperformed LR at the 5 μg/L threshold in all five predictive performance measures and at 10 μg/L in four out of five measures. BRT yielded higher prediction sensitivity (39%) than LR (18%) at the 10 μg/L threshold, a useful outcome because a major objective of the modeling was to improve the ability to predict high-arsenic areas.
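
    A minimal sketch of the BRT-versus-LR comparison on a threshold-exceedance outcome, with synthetic covariates standing in for the study's geochemical and aquifer-texture predictors and cross-validated AUC standing in for the paper's five performance measures.

      import numpy as np
      from sklearn.ensemble import GradientBoostingClassifier
      from sklearn.linear_model import LogisticRegression
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(10)
      X = rng.normal(size=(1000, 12))
      # Nonlinear, interacting "redox-like" effect that trees can exploit.
      logit = X[:, 0] + (X[:, 1] > 0.5) * X[:, 2] - 0.8 * np.abs(X[:, 3])
      exceeds = rng.binomial(1, 1 / (1 + np.exp(-logit)))

      for name, model in [("BRT", GradientBoostingClassifier()),
                          ("LR", LogisticRegression(max_iter=1000))]:
          auc = cross_val_score(model, X, exceeds, cv=5, scoring="roc_auc")
          print(name, auc.mean().round(3))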

  10. Homogeneity Pursuit

    PubMed Central

    Ke, Tracy; Fan, Jianqing; Wu, Yichao

    2014-01-01

    This paper explores the homogeneity of coefficients in high-dimensional regression, which extends the sparsity concept and is more general and suitable for many applications. Homogeneity arises when regression coefficients corresponding to neighboring geographical regions or a similar cluster of covariates are expected to be approximately the same. Sparsity corresponds to a special case of homogeneity with a large cluster of known atom zero. In this article, we propose a new method called clustering algorithm in regression via data-driven segmentation (CARDS) to explore homogeneity. New mathematical results are provided on the gain that can be achieved by exploring homogeneity. Statistical properties of two versions of CARDS are analyzed. In particular, the asymptotic normality of our proposed CARDS estimator is established, which reveals better estimation accuracy for homogeneous parameters than that obtained without homogeneity exploration. When our methods are combined with sparsity exploration, further efficiency can be achieved beyond the exploration of sparsity alone. This provides additional insights into the power of exploring low-dimensional structures in high-dimensional regression: homogeneity and sparsity. Our results also shed light on the properties of the fused Lasso. The newly developed method is further illustrated by simulation studies and applications to real data. Supplementary materials for this article are available online. PMID:26085701

  11. Reduced rank regression via adaptive nuclear norm penalization

    PubMed Central

    Chen, Kun; Dong, Hongbo; Chan, Kung-Sik

    2014-01-01

    Summary We propose an adaptive nuclear norm penalization approach for low-rank matrix approximation, and use it to develop a new reduced rank estimation method for high-dimensional multivariate regression. The adaptive nuclear norm is defined as the weighted sum of the singular values of the matrix, and it is generally non-convex under the natural restriction that the weight decreases with the singular value. However, we show that the proposed non-convex penalized regression method has a global optimal solution obtained from an adaptively soft-thresholded singular value decomposition. The method is computationally efficient, and the resulting solution path is continuous. The rank consistency of and prediction/estimation performance bounds for the estimator are established for a high-dimensional asymptotic regime. Simulation studies and an application in genetics demonstrate its efficacy. PMID:25045172
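
    The closed-form solution described, an adaptively soft-thresholded SVD with weights decreasing in the singular values, admits a short sketch; the weight choice w_i = d_i^(-gamma) and the tuning value below are illustrative, not the paper's recommendations.

      import numpy as np

      def adaptive_rrr(X, Y, lam, gamma=2.0):
          """Adaptively soft-thresholded SVD of the least-squares fitted values."""
          B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
          U, d, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
          d_thr = np.maximum(d - lam * d ** (-gamma), 0.0)  # weights w_i = d_i^-gamma
          F = (U * d_thr) @ Vt                              # low-rank fitted values
          B, *_ = np.linalg.lstsq(X, F, rcond=None)         # back to coefficients
          return B, int(np.sum(d_thr > 0))

      rng = np.random.default_rng(11)
      n, p, q, r = 200, 15, 10, 2
      B_true = rng.normal(size=(p, r)) @ rng.normal(size=(r, q))  # rank-2 truth
      X = rng.normal(size=(n, p))
      Y = X @ B_true + 0.5 * rng.normal(size=(n, q))
      B_hat, rank = adaptive_rrr(X, Y, lam=300.0)
      print("estimated rank:", rank)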

  12. Linear and non-linear infrared response of one-dimensional vibrational Holstein polarons in the anti-adiabatic limit: Optical and acoustical phonon models

    NASA Astrophysics Data System (ADS)

    Falvo, Cyril

    2018-02-01

    The theory of the linear and non-linear infrared response of vibrational Holstein polarons in one-dimensional lattices is presented in order to identify the spectral signatures of self-trapping phenomena. Using a canonical transformation, the optical response is computed from the small-polaron point of view, which is valid in the anti-adiabatic limit. Two types of phonon baths are considered, optical phonons and acoustical phonons, and simple expressions are derived for the infrared response. It is shown that for the case of optical phonons, the linear response can directly probe the polaron density of states. The model is used to interpret the experimental spectrum of crystalline acetanilide in the C=O range. For the case of acoustical phonons, it is shown that two bound states can be observed in the two-dimensional infrared spectrum at low temperature. At high temperature, analysis of the time dependence of the two-dimensional infrared spectrum indicates that bath-mediated correlations slow down spectral diffusion. The model is used to interpret the experimental linear spectroscopy of model α-helix and β-sheet polypeptides. This work shows that the Davydov Hamiltonian cannot explain the observations in the NH stretching range.

  13. Landsat D Thematic Mapper image dimensionality reduction and geometric correction accuracy

    NASA Technical Reports Server (NTRS)

    Ford, G. E.

    1986-01-01

    To characterize and quantify the performance of the Landsat Thematic Mapper (TM), techniques for dimensionality reduction by linear transformation have been studied and evaluated, and the accuracy of the correction of geometric errors in TM images analyzed. Theoretical evaluations and comparisons of existing methods for the design of linear transformations for dimensionality reduction are presented. These methods include the discrete Karhunen-Loève (KL) expansion, Multiple Discriminant Analysis (MDA), the Thematic Mapper (TM)-Tasseled Cap Linear Transformation, and Singular Value Decomposition (SVD). A unified approach to these design problems is presented in which each method involves optimizing an objective function with respect to the linear transformation matrix. From these studies, four modified methods are proposed, referred to as the Space Variant Linear Transformation, the KL Transform-MDA hybrid method, and the First and Second Versions of the Weighted MDA method. The modifications involve the assignment of weights to classes to achieve improvements in the class conditional probability of error for classes with high weights. Experimental evaluations of the existing and proposed methods have been performed using the six reflective bands of the TM data. It is shown that, in terms of probability of classification error and the percentage of the cumulative eigenvalues, the six reflective bands of the TM data require only a three-dimensional feature space. It is also shown experimentally that, for the proposed methods, the classes with high weights obtain the expected improvements in class conditional probability of error estimates.

  14. CALIBRATING NON-CONVEX PENALIZED REGRESSION IN ULTRA-HIGH DIMENSION.

    PubMed

    Wang, Lan; Kim, Yongdai; Li, Runze

    2013-10-01

    We investigate high-dimensional non-convex penalized regression, where the number of covariates may grow at an exponential rate. Although recent asymptotic theory established that there exists a local minimum possessing the oracle property under general conditions, it is still largely an open problem how to identify the oracle estimator among potentially multiple local minima. There are two main obstacles: (1) due to the presence of multiple minima, the solution path is nonunique and is not guaranteed to contain the oracle estimator; (2) even if a solution path is known to contain the oracle estimator, the optimal tuning parameter depends on many unknown factors and is hard to estimate. To address these two challenging issues, we first prove that an easy-to-calculate calibrated CCCP algorithm produces a consistent solution path which contains the oracle estimator with probability approaching one. Furthermore, we propose a high-dimensional BIC criterion and show that it can be applied to the solution path to select the optimal tuning parameter which asymptotically identifies the oracle estimator. The theory for a general class of non-convex penalties in the ultra-high dimensional setup is established when the random errors follow the sub-Gaussian distribution. Monte Carlo studies confirm that the calibrated CCCP algorithm combined with the proposed high-dimensional BIC has desirable performance in identifying the underlying sparsity pattern for high-dimensional data analysis.

  15. CALIBRATING NON-CONVEX PENALIZED REGRESSION IN ULTRA-HIGH DIMENSION

    PubMed Central

    Wang, Lan; Kim, Yongdai; Li, Runze

    2014-01-01

    We investigate high-dimensional non-convex penalized regression, where the number of covariates may grow at an exponential rate. Although recent asymptotic theory established that there exists a local minimum possessing the oracle property under general conditions, it is still largely an open problem how to identify the oracle estimator among potentially multiple local minima. There are two main obstacles: (1) due to the presence of multiple minima, the solution path is nonunique and is not guaranteed to contain the oracle estimator; (2) even if a solution path is known to contain the oracle estimator, the optimal tuning parameter depends on many unknown factors and is hard to estimate. To address these two challenging issues, we first prove that an easy-to-calculate calibrated CCCP algorithm produces a consistent solution path which contains the oracle estimator with probability approaching one. Furthermore, we propose a high-dimensional BIC criterion and show that it can be applied to the solution path to select the optimal tuning parameter which asymptotically identifies the oracle estimator. The theory for a general class of non-convex penalties in the ultra-high dimensional setup is established when the random errors follow the sub-Gaussian distribution. Monte Carlo studies confirm that the calibrated CCCP algorithm combined with the proposed high-dimensional BIC has desirable performance in identifying the underlying sparsity pattern for high-dimensional data analysis. PMID:24948843

  16. Spatial Bayesian Latent Factor Regression Modeling of Coordinate-based Meta-analysis Data

    PubMed Central

    Montagna, Silvia; Wager, Tor; Barrett, Lisa Feldman; Johnson, Timothy D.; Nichols, Thomas E.

    2017-01-01

    Summary Now over 20 years old, functional MRI (fMRI) has a large and growing literature that is best synthesised with meta-analytic tools. As most authors do not share image data, only the peak activation coordinates (foci) reported in the paper are available for Coordinate-Based Meta-Analysis (CBMA). Neuroimaging meta-analysis is used to 1) identify areas of consistent activation; and 2) build a predictive model of task type or cognitive process for new studies (reverse inference). To simultaneously address these aims, we propose a Bayesian point process hierarchical model for CBMA. We model the foci from each study as a doubly stochastic Poisson process, where the study-specific log intensity function is characterised as a linear combination of a high-dimensional basis set. A sparse representation of the intensities is guaranteed through latent factor modeling of the basis coefficients. Within our framework, it is also possible to account for the effect of study-level covariates (meta-regression), significantly expanding the capabilities of the current neuroimaging meta-analysis methods available. We apply our methodology to synthetic data and neuroimaging meta-analysis datasets. PMID:28498564

  17. Mixed kernel function support vector regression for global sensitivity analysis

    NASA Astrophysics Data System (ADS)

    Cheng, Kai; Lu, Zhenzhou; Wei, Yuhao; Shi, Yan; Zhou, Yicheng

    2017-11-01

    Global sensitivity analysis (GSA) plays an important role in exploring the respective effects of input variables on an assigned output response. Among the many sensitivity analysis methods in the literature, the Sobol indices have attracted much attention since they can provide accurate information for most models. In this paper, a mixed kernel function (MKF) based support vector regression (SVR) model is employed to evaluate the Sobol indices at low computational cost. With the proposed derivation, the estimation of the Sobol indices can be obtained by post-processing the coefficients of the SVR meta-model. The MKF combines an orthogonal polynomial kernel function with a Gaussian radial basis kernel function, so that it possesses both the global characteristics of the polynomial kernel and the local characteristics of the Gaussian radial basis kernel. The proposed approach is suitable for high-dimensional and non-linear problems. Its performance is validated on various analytical functions and compared with the popular polynomial chaos expansion (PCE). Results demonstrate that the proposed approach is an efficient method for global sensitivity analysis.
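
    A minimal sketch of a mixed-kernel SVR via scikit-learn's callable-kernel interface; the paper's orthogonal polynomial kernel is replaced here by the ordinary polynomial kernel, and the Sobol-index post-processing is not reproduced. Data, weights, and hyperparameters are hypothetical.

      import numpy as np
      from sklearn.svm import SVR
      from sklearn.metrics.pairwise import polynomial_kernel, rbf_kernel

      def mixed_kernel(X, Y, w=0.5, degree=3, gamma=0.5):
          """Convex combination of a polynomial and a Gaussian RBF kernel."""
          return (w * polynomial_kernel(X, Y, degree=degree)
                  + (1 - w) * rbf_kernel(X, Y, gamma=gamma))

      rng = np.random.default_rng(12)
      X = rng.uniform(-1, 1, size=(300, 4))
      y = X[:, 0] ** 3 + np.sin(3 * X[:, 1]) + 0.05 * rng.normal(size=300)

      svr = SVR(kernel=mixed_kernel, C=10.0).fit(X, y)
      print("training R^2:", round(svr.score(X, y), 3))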

  18. Feature extraction with deep neural networks by a generalized discriminant analysis.

    PubMed

    Stuhlsatz, André; Lippel, Jens; Zielke, Thomas

    2012-04-01

    We present an approach to feature extraction that is a generalization of the classical linear discriminant analysis (LDA) on the basis of deep neural networks (DNNs). As for LDA, discriminative features generated from independent Gaussian class conditionals are assumed. This modeling has the advantages that the intrinsic dimensionality of the feature space is bounded by the number of classes and that the optimal discriminant function is linear. Unfortunately, linear transformations are insufficient to extract optimal discriminative features from arbitrarily distributed raw measurements. The generalized discriminant analysis (GerDA) proposed in this paper uses nonlinear transformations that are learnt by DNNs in a semisupervised fashion. We show that the feature extraction based on our approach displays excellent performance on real-world recognition and detection tasks, such as handwritten digit recognition and face detection. In a series of experiments, we evaluate GerDA features with respect to dimensionality reduction, visualization, classification, and detection. Moreover, we show that GerDA DNNs can preprocess truly high-dimensional input data to low-dimensional representations that facilitate accurate predictions even if simple linear predictors or measures of similarity are used.

  19. Local polynomial chaos expansion for linear differential equations with high dimensional random inputs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Yi; Jakeman, John; Gittelson, Claude

    2015-01-08

    In this paper we present a localized polynomial chaos expansion for partial differential equations (PDEs) with random inputs. In particular, we focus on time-independent linear stochastic problems with high-dimensional random inputs, where the traditional polynomial chaos methods, and most of the existing methods, incur prohibitively high simulation cost. The local polynomial chaos method employs a domain decomposition technique to approximate the stochastic solution locally. In each subdomain, a subdomain problem is solved independently and, more importantly, in a much lower-dimensional random space. In a postprocessing stage, accurate samples of the original stochastic problems are obtained from the samples of the local solutions by enforcing the correct stochastic structure of the random inputs and the coupling conditions at the interfaces of the subdomains. Overall, the method is able to solve stochastic PDEs in very large dimensions by solving a collection of low-dimensional local problems and can be highly efficient. In our paper we present the general mathematical framework of the methodology and use numerical examples to demonstrate the properties of the method.

  20. Spatio-temporal water quality mapping from satellite images using geographically and temporally weighted regression

    NASA Astrophysics Data System (ADS)

    Chu, Hone-Jay; Kong, Shish-Jeng; Chang, Chih-Hua

    2018-03-01

    The turbidity (TB) of a water body varies over time and space. Water quality has traditionally been estimated from satellite images via linear regression. However, estimating and mapping water quality call for a spatio-temporally nonstationary model; TB mapping therefore necessitates geographically weighted regression (GWR) and geographically and temporally weighted regression (GTWR) models, both of which are more precise than linear regression. Of the temporally nonstationary models for mapping water quality, GTWR offers the best option for estimating regional water quality. Compared with GWR, GTWR provides highly reliable information for water quality mapping, attains a relatively high goodness of fit, improves the explained variance from 44% to 87%, and shows sufficient space-time explanatory power. The seasonal patterns of TB and the main spatial patterns of TB variability can be identified from the TB maps estimated by GTWR together with an empirical orthogonal function (EOF) analysis.
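
    The core of GTWR is a separate weighted least-squares fit at every prediction location, with weights decaying in both spatial and temporal distance. A minimal sketch with hypothetical data and Gaussian bandwidths follows; a real GTWR also calibrates the space-time distance mixing and the bandwidths.

      import numpy as np

      def gtwr_fit(coords, times, X, y, site, t0, hs=1.0, ht=1.0):
          """Local weighted least squares with a space-time Gaussian kernel."""
          d2 = (((coords - site) ** 2).sum(axis=1) / hs ** 2
                + (times - t0) ** 2 / ht ** 2)
          w = np.exp(-0.5 * d2)
          Xw = X * w[:, None]
          return np.linalg.solve(X.T @ Xw, Xw.T @ y)    # (X'WX)^-1 X'Wy

      rng = np.random.default_rng(13)
      n = 500
      coords = rng.uniform(0, 10, size=(n, 2))          # pixel locations
      times = rng.uniform(0, 4, size=n)                 # acquisition times
      refl = rng.normal(size=(n, 3))                    # satellite band values
      X = np.column_stack([np.ones(n), refl])
      # Turbidity whose band sensitivity drifts across space and time.
      tb = (1 + 0.2 * coords[:, 0]) * refl[:, 0] + 0.5 * times + 0.1 * rng.normal(size=n)

      beta_local = gtwr_fit(coords, times, X, tb, site=np.array([5.0, 5.0]), t0=2.0)
      print(beta_local.round(2))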

  1. A statistical methodology for estimating transport parameters: Theory and applications to one-dimensional advective-dispersive systems

    USGS Publications Warehouse

    Wagner, Brian J.; Gorelick, Steven M.

    1986-01-01

    A simulation nonlinear multiple-regression methodology for estimating parameters that characterize the transport of contaminants is developed and demonstrated. Finite difference contaminant transport simulation is combined with a nonlinear weighted least squares multiple-regression procedure. The technique provides optimal parameter estimates and gives statistics for assessing the reliability of these estimates under certain general assumptions about the distributions of the random measurement errors. Monte Carlo analysis is used to estimate parameter reliability for a hypothetical homogeneous soil column for which concentration data contain large random measurement errors. The value of data collected spatially versus data collected temporally was investigated for estimation of velocity, dispersion coefficient, effective porosity, first-order decay rate, and zero-order production. The use of spatial data gave estimates that were 2–3 times more reliable than estimates based on temporal data for all parameters except velocity. Comparison of estimated linear and nonlinear confidence intervals based upon Monte Carlo analysis showed that the linear approximation is poor for the dispersion coefficient and the zero-order production coefficient when data are collected over time. In addition, examples demonstrate transport parameter estimation for two real one-dimensional systems. First, the longitudinal dispersivity and effective porosity of an unsaturated soil are estimated using laboratory column data. We compare the reliability of estimates based upon data from individual laboratory experiments versus estimates based upon pooled data from several experiments. Second, the simulation nonlinear regression procedure is extended to include an additional governing equation that describes delayed storage during contaminant transport. The model is applied to analyze the trends, variability, and interrelationship of parameters in a mountain stream in northern California.
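
    The report couples a finite-difference simulator to the regression; as a self-contained stand-in, the sketch below fits the two transport parameters of a closed-form solution of the one-dimensional advection-dispersion equation to noisy synthetic data by nonlinear least squares.

      import numpy as np
      from scipy.optimize import least_squares
      from scipy.special import erfc

      def conc(x, t, v, D, c0=1.0):
          """Approximate continuous-injection solution of the 1-D advection-dispersion equation."""
          return 0.5 * c0 * erfc((x - v * t) / (2.0 * np.sqrt(D * t)))

      rng = np.random.default_rng(14)
      x = np.linspace(0.5, 10.0, 40)            # observation points along a column
      t = 5.0
      data = conc(x, t, v=1.2, D=0.4) + 0.02 * rng.normal(size=x.size)

      res = least_squares(lambda p: conc(x, t, p[0], p[1]) - data,
                          x0=[0.5, 1.0], bounds=([0.0, 1e-6], [10.0, 10.0]))
      print("estimated v, D:", res.x.round(3))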

  2. Stable long-time semiclassical description of zero-point energy in high-dimensional molecular systems.

    PubMed

    Garashchuk, Sophya; Rassolov, Vitaly A

    2008-07-14

    Semiclassical implementation of the quantum trajectory formalism [J. Chem. Phys. 120, 1181 (2004)] is further developed to give a stable long-time description of zero-point energy in anharmonic systems of high dimensionality. The method is based on a numerically cheap linearized quantum force approach; stabilizing terms compensating for the linearization errors are added into the time-evolution equations for the classical and nonclassical components of the momentum operator. The wave function normalization and energy are rigorously conserved. Numerical tests are performed for model systems of up to 40 degrees of freedom.

  3. Smoothing two-dimensional Malaysian mortality data using P-splines indexed by age and year

    NASA Astrophysics Data System (ADS)

    Kamaruddin, Halim Shukri; Ismail, Noriszura

    2014-06-01

    Nonparametric regression fits a model to data by selecting the best coefficients from a large class of flexible functions. Eilers and Marx (1996) introduced P-splines as a method of smoothing in generalized linear models (GLMs), in which ordinary B-splines with a difference roughness penalty on the coefficients are applied to one-dimensional mortality data. Modeling and forecasting mortality rates is a problem of fundamental importance in insurance calculations, in which the accuracy of models and forecasts is the main concern of the industry. The original idea of P-splines is extended here to two-dimensional mortality data, indexed by age of death and year of death, with the large data set supplied by the Department of Statistics Malaysia. This extension constructs the best-fitting surface and provides sensible predictions of the underlying mortality rate in the Malaysian mortality case.
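
    A minimal one-dimensional P-spline sketch in the Eilers-Marx sense, on hypothetical mortality-like data: a rich B-spline basis plus a second-order difference penalty on the coefficients, solved as penalized least squares. The two-dimensional age-by-year smoother replaces the basis with a tensor product of two such bases.

      import numpy as np
      from scipy.interpolate import BSpline

      rng = np.random.default_rng(15)
      age = np.linspace(20, 90, 71)
      log_mort = -9 + 0.09 * age + 0.15 * rng.normal(size=age.size)  # toy schedule

      k = 3                                              # cubic B-splines
      knots = np.concatenate([[age[0]] * k,
                              np.linspace(age[0], age[-1], 20),
                              [age[-1]] * k])
      B = BSpline.design_matrix(age, knots, k).toarray() # (71, 22) basis matrix
      D = np.diff(np.eye(B.shape[1]), n=2, axis=0)       # second-order differences
      lam = 10.0                                         # roughness penalty weight
      coef = np.linalg.solve(B.T @ B + lam * D.T @ D, B.T @ log_mort)
      smooth = B @ coef
      print(smooth[:5].round(3))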

  4. The application of dimensional analysis to the problem of solar wind-magnetosphere energy coupling

    NASA Technical Reports Server (NTRS)

    Bargatze, L. F.; Mcpherron, R. L.; Baker, D. N.; Hones, E. W., Jr.

    1984-01-01

    The constraints imposed by dimensional analysis are used to find how the solar wind-magnetosphere energy transfer rate depends upon interplanetary parameters. The analyses assume that only magnetohydrodynamic processes are important in controlling the rate of energy transfer. The study utilizes ISEE-3 solar wind observations, the AE index, and UT from three 10-day intervals during the International Magnetospheric Study. Simple linear regression and histogram techniques are used to find the value of the magnetohydrodynamic coupling exponent, alpha, which is consistent with observations of magnetospheric response. Once alpha is estimated, the form of the solar wind energy transfer rate is obtained by substitution into an equation of the interplanetary variables whose exponents depend upon alpha.

  5. High-Dimensional Quantum Information Processing with Linear Optics

    NASA Astrophysics Data System (ADS)

    Fitzpatrick, Casey A.

    Quantum information processing (QIP) is an interdisciplinary field concerned with the development of computers and information processing systems that utilize quantum mechanical properties of nature to carry out their function. QIP systems have become vastly more practical since the turn of the century. Today, QIP applications span imaging, cryptographic security, computation, and simulation (quantum systems that mimic other quantum systems). Many important strategies improve quantum versions of classical information system hardware, such as single photon detectors and quantum repeaters. Another, more abstract strategy engineers high-dimensional quantum state spaces, so that each successful event carries more information than traditional two-level systems allow. Photonic states in particular bring the added advantages of weak environmental coupling and data transmission near the speed of light, allowing for simpler control and lower system design complexity. In this dissertation, numerous novel, scalable designs for practical high-dimensional linear-optical QIP systems are presented. First, a correlated-photon imaging scheme is reported that uses orbital angular momentum (OAM) states to detect rotational symmetries in objects and to build images from those measurements. Then, a statistical detection method using chains of OAM superpositions distributed according to the Fibonacci sequence is established and expanded upon. It is shown that the approach gives rise to schemes for sorting, detecting, and generating the recursively defined high-dimensional states on which some quantum cryptographic protocols depend. Finally, an ongoing study based on a generalization of the standard optical multiport for applications in quantum computation and simulation is reported. The architecture allows photons to reverse momentum inside the device, which in turn enables realistic implementation of controllable linear-optical scattering vertices for carrying out quantum walks on arbitrary graph structures, a powerful tool for any quantum computer. It is shown that the novel architecture provides new, efficient capabilities for the optical quantum simulation of Hamiltonians and topologically protected states. Further, these simulations use exponentially fewer resources than feedforward techniques, scale linearly to higher-dimensional systems, and use only linear optics, thus offering a concrete, experimentally achievable implementation of graphical models of discrete-time quantum systems.

  6. Mathematical Techniques for Nonlinear System Theory.

    DTIC Science & Technology

    1981-09-01

    This report deals with research results obtained in the following areas: (1) finite-dimensional linear system theory by algebraic methods -- linear ...; (2) infinite-dimensional linear systems -- realization theory of infinite-dimensional linear systems; (3) nonlinear system theory -- basic properties of ...

  7. Detecting influential observations in nonlinear regression modeling of groundwater flow

    USGS Publications Warehouse

    Yager, Richard M.

    1998-01-01

    Nonlinear regression is used to estimate optimal parameter values in models of groundwater flow to ensure that differences between predicted and observed heads and flows do not result from nonoptimal parameter values. Parameter estimates can be affected, however, by observations that disproportionately influence the regression, such as outliers that exert undue leverage on the objective function. Certain statistics developed for linear regression can be used to detect influential observations in nonlinear regression if the models are approximately linear. This paper discusses the application of Cook's D, which measures the effect of omitting a single observation on a set of estimated parameter values, and the statistical parameter DFBETAS, which quantifies the influence of an observation on each parameter. The influence statistics were used to (1) identify the influential observations in the calibration of a three-dimensional, groundwater flow model of a fractured-rock aquifer through nonlinear regression, and (2) quantify the effect of omitting influential observations on the set of estimated parameter values. Comparison of the spatial distribution of Cook's D with plots of model sensitivity shows that influential observations correspond to areas where the model heads are most sensitive to certain parameters, and where predicted groundwater flow rates are largest. Five of the six discharge observations were identified as influential, indicating that reliable measurements of groundwater flow rates are valuable data in model calibration. DFBETAS are computed and examined for an alternative model of the aquifer system to identify a parameterization error in the model design that resulted in overestimation of the effect of anisotropy on horizontal hydraulic conductivity.
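
    For an approximately linear model these statistics have closed forms; a minimal sketch for the plain linear case on synthetic data follows (in the nonlinear setting, X would be the sensitivity/Jacobian matrix of the flow model).

      import numpy as np

      rng = np.random.default_rng(16)
      n, p = 50, 3
      X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
      y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)
      y[10] += 8.0                                    # plant an outlier

      XtX_inv = np.linalg.inv(X.T @ X)
      h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)     # leverages (hat-matrix diagonal)
      beta = XtX_inv @ X.T @ y
      e = y - X @ beta
      s2 = e @ e / (n - p)

      cooks_d = e ** 2 * h / (p * s2 * (1 - h) ** 2)
      # Closed-form leave-one-out change: beta - beta(i) = (X'X)^-1 x_i e_i / (1 - h_i)
      delta = (XtX_inv @ X.T) * (e / (1 - h))         # column i = beta - beta(i)
      s_i = np.sqrt((e @ e - e ** 2 / (1 - h)) / (n - p - 1))
      dfbetas = delta / (s_i * np.sqrt(np.diag(XtX_inv))[:, None])

      worst = int(np.argmax(cooks_d))
      print("largest Cook's D at observation", worst)
      print("its DFBETAS:", dfbetas[:, worst].round(2))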

  8. On the modeling of the bottom particles segregation with non-linear diffusion equations: application to the marine sand ripples

    NASA Astrophysics Data System (ADS)

    Tiguercha, Djlalli; Bennis, Anne-claire; Ezersky, Alexander

    2015-04-01

    The elliptical motion induced by surface waves causes an oscillating motion of sand grains, leading to the formation of ripple patterns on the bottom. Investigating how grains with different properties are distributed inside the ripples is a difficult task because of particle segregation. The work of Fernandez et al. (2003) was extended from the one-dimensional to the two-dimensional case. A new numerical model, based on these non-linear diffusion equations, was developed to simulate the grain distribution inside marine sand ripples. The one- and two-dimensional models are validated on several test cases where segregation appears. Starting from a homogeneous mixture of grains, the two-dimensional simulations demonstrate different segregation patterns: a) formation of zones with high concentrations of light and heavy particles, b) formation of «cat's eye» patterns, and c) appearance of the inverse Brazil nut effect. Comparisons of the numerical results with a new set of field data and wave flume experiments show that the two-dimensional non-linear diffusion equations allow us to reproduce qualitatively the experimental results on particle segregation.

  9. Measurement of hydroxyapatite density and Knoop hardness in sound human enamel and a correlational analysis between them.

    PubMed

    He, Bing; Huang, Shengbin; Jing, Junjun; Hao, Yuqing

    2010-02-01

    The aim of this study was to measure the hydroxyapatite (HAP) density and Knoop hardness (KHN) of enamel slabs and to analyse the relationship between them. Twenty enamel slabs (10 lingual sides and 10 buccal sides) were prepared and scanned with micro-CT. Tomographic images of each slab from dental cusp to dentinoenamel junction (DEJ) were reconstructed. On these three-dimensional (3D) images, regions of interest (ROIs) were defined at an interval of 50 microm, and the HAP density for each ROI was calculated. Then the polished surfaces were indented from cusp to DEJ at intervals of 50 microm with a Knoop indenter. Finally, the data were analysed with one-way ANOVA, Student's t-test, and linear regression analysis. The HAP density and KHN decreased from the dental cusp to DEJ. Both HAP density and KHN in the outer-layer enamel were significantly higher than those in the middle- or inner-layer enamel (P<0.05). The HAP density showed no significant difference between the buccal and lingual sides for enamel in the outer, middle and inner layers, respectively (P>0.05). The KHN in the outer-layer enamel of the lingual sides was significantly lower than that of the buccal sides (P<0.05); there was no significant difference between the lingual and buccal sides in the middle or inner layer. Linear regression analysis revealed a linear relationship between the mean KHN and the mean HAP density (r=0.87). Both HAP density and KHN decrease simultaneously from dental cusp to DEJ, and the two properties are highly correlated. Copyright 2009 Elsevier Ltd. All rights reserved.

  10. On-line analysis of algae in water by discrete three-dimensional fluorescence spectroscopy.

    PubMed

    Zhao, Nanjing; Zhang, Xiaoling; Yin, Gaofang; Yang, Ruifang; Hu, Li; Chen, Shuang; Liu, Jianguo; Liu, Wenqing

    2018-03-19

    To address the problem of on-line measurement and classification of algae in water, a method for algae classification and concentration determination based on discrete three-dimensional fluorescence spectra was studied in this work. The discrete three-dimensional fluorescence spectra of twelve common species of algae belonging to five categories were analyzed, discrete three-dimensional standard spectra of the five categories were built, and recognition, classification and concentration prediction of the algae categories were realized by coupling the discrete three-dimensional fluorescence spectra with non-negative weighted least squares linear regression analysis. The results show that similarities between the discrete three-dimensional standard spectra of different categories were reduced and the accuracies of recognition, classification and concentration prediction of the algae categories were significantly improved. Compared with the chlorophyll a fluorescence excitation spectra method, the recognition accuracy in pure samples by discrete three-dimensional fluorescence spectra is improved by 1.38%, and the recovery rate and classification accuracy in pure diatom samples by 34.1% and 46.8%, respectively; the recognition accuracy for mixed samples is enhanced by 26.1%, the recovery rate of mixed samples with Chlorophyta by 37.8%, and the classification accuracy of mixed samples with diatoms by 54.6%.
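
    The unmixing step described amounts to regressing a measured (flattened) discrete three-dimensional spectrum onto the category standard spectra under a non-negativity constraint. A minimal sketch with hypothetical spectra follows; the paper's weighting scheme is omitted.

      import numpy as np
      from scipy.optimize import nnls

      rng = np.random.default_rng(17)
      n_bands = 60                                  # flattened ex/em pairs (toy)
      S = np.abs(rng.normal(size=(n_bands, 5)))     # standard spectra, 5 categories
      c_true = np.array([0.0, 2.0, 0.0, 0.5, 0.0])  # only two categories present
      y = S @ c_true + 0.01 * rng.normal(size=n_bands)

      c_hat, _ = nnls(S, y)                         # non-negative least squares
      print("estimated contributions:", c_hat.round(2))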

  11. Discovering biclusters in gene expression data based on high-dimensional linear geometries

    PubMed Central

    Gan, Xiangchao; Liew, Alan Wee-Chung; Yan, Hong

    2008-01-01

    Background In DNA microarray experiments, discovering groups of genes that share similar transcriptional characteristics is instrumental in functional annotation, tissue classification and motif identification. However, in many situations a subset of genes exhibits a consistent pattern only over a subset of conditions. Conventional clustering algorithms that deal with the entire row or column in an expression matrix would therefore fail to detect these useful patterns in the data. Recently, biclustering has been proposed to detect a subset of genes exhibiting a consistent pattern over a subset of conditions. However, most existing biclustering algorithms are based on searching for sub-matrices within a data matrix by optimizing certain heuristically defined merit functions. Moreover, most of these algorithms can only detect a restricted set of bicluster patterns. Results In this paper, we present a novel geometric perspective for the biclustering problem. The biclustering process is interpreted as the detection of linear geometries in a high-dimensional data space. This new perspective views biclusters with different patterns as hyperplanes in a high-dimensional space, and allows us to handle different types of linear patterns simultaneously by matching a specific set of linear geometries. This geometric viewpoint also inspires us to propose a generic bicluster pattern, i.e. the linear coherent model, which unifies the seemingly incompatible additive and multiplicative bicluster models. As a particular realization of our framework, we have implemented a Hough transform-based hyperplane detection algorithm. The experimental results on a human lymphoma gene expression dataset show that our algorithm can find biologically significant subsets of genes. Conclusion We have proposed a novel geometric interpretation of the biclustering problem. We have shown that many common types of bicluster are just different spatial arrangements of hyperplanes in a high-dimensional data space. An implementation of the geometric framework using the Fast Hough transform for hyperplane detection can be used to discover biologically significant subsets of genes under subsets of conditions for microarray data analysis. PMID:18433477

  12. Using machine learning to replicate chaotic attractors and calculate Lyapunov exponents from data

    NASA Astrophysics Data System (ADS)

    Pathak, Jaideep; Lu, Zhixin; Hunt, Brian R.; Girvan, Michelle; Ott, Edward

    2017-12-01

    We use recent advances in the machine learning area known as "reservoir computing" to formulate a method for model-free estimation from data of the Lyapunov exponents of a chaotic process. The technique uses a limited time series of measurements as input to a high-dimensional dynamical system called a "reservoir." After the reservoir's response to the data is recorded, linear regression is used to learn a large set of parameters, called the "output weights." The learned output weights are then used to form a modified autonomous reservoir designed to be capable of producing an arbitrarily long time series whose ergodic properties approximate those of the input signal. When successful, we say that the autonomous reservoir reproduces the attractor's "climate." Since the reservoir equations and output weights are known, we can compute the derivatives needed to determine the Lyapunov exponents of the autonomous reservoir, which we then use as estimates of the Lyapunov exponents for the original input generating system. We illustrate the effectiveness of our technique with two examples, the Lorenz system and the Kuramoto-Sivashinsky (KS) equation. In the case of the KS equation, we note that the high-dimensional nature of the system and the large number of Lyapunov exponents yield a challenging test of our method, which it successfully passes.
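
    The procedure lends itself to a compact sketch: drive a fixed random reservoir with the input series, fit the output weights by linear (ridge) regression, and then close the loop for autonomous prediction. The reservoir size, spectral radius, regularization and the Euler-integrated Lorenz input below are illustrative choices, not the paper's configuration.

    ```python
    # A minimal echo-state-network sketch of the two-stage procedure:
    # record the reservoir response, then fit output weights by regression.
    import numpy as np

    rng = np.random.default_rng(1)
    N, T = 300, 5000                      # reservoir size, training length

    # Input signal: x-component of the Lorenz system, crude Euler integration.
    def lorenz_x(steps, dt=0.01):
        s = np.array([1.0, 1.0, 1.0])
        out = np.empty(steps)
        for t in range(steps):
            x, y, z = s
            s = s + dt * np.array([10*(y - x), x*(28 - z) - y, x*y - 8/3*z])
            out[t] = s[0]
        return out

    u = lorenz_x(T + 1)

    # Random sparse reservoir, rescaled to spectral radius 0.9.
    A = rng.standard_normal((N, N)) * (rng.random((N, N)) < 0.05)
    A *= 0.9 / np.max(np.abs(np.linalg.eigvals(A)))
    Win = rng.uniform(-0.5, 0.5, N)

    # Record the reservoir response r(t) to the input series.
    R = np.zeros((T, N))
    r = np.zeros(N)
    for t in range(T):
        r = np.tanh(A @ r + Win * u[t])
        R[t] = r

    # Ridge regression for the output weights: predict u(t+1) from r(t).
    beta = 1e-6
    Wout = np.linalg.solve(R.T @ R + beta * np.eye(N), R.T @ u[1:T + 1])

    # Autonomous mode: feed the prediction back as the next input.
    preds = []
    for _ in range(500):
        u_hat = Wout @ r
        preds.append(u_hat)
        r = np.tanh(A @ r + Win * u_hat)
    print("first autonomous predictions:", np.round(preds[:5], 3))
    ```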

  13. Evaluation of Deep Learning Representations of Spatial Storm Data

    NASA Astrophysics Data System (ADS)

    Gagne, D. J., II; Haupt, S. E.; Nychka, D. W.

    2017-12-01

    The spatial structure of a severe thunderstorm and its surrounding environment provide useful information about the potential for severe weather hazards, including tornadoes, hail, and high winds. Statistics computed over the area of a storm or from the pre-storm environment can provide descriptive information but fail to capture structural information. Because the storm environment is a complex, high-dimensional space, identifying methods to encode important spatial storm information in a low-dimensional form should aid analysis and prediction of storms by statistical and machine learning models. Principal component analysis (PCA), a more traditional approach, transforms high-dimensional data into a set of linearly uncorrelated, orthogonal components ordered by the amount of variance explained by each component. The burgeoning field of deep learning offers two potential approaches to this problem. Convolutional Neural Networks are a supervised learning method for transforming spatial data into a hierarchical set of feature maps that correspond with relevant combinations of spatial structures in the data. Generative Adversarial Networks (GANs) are an unsupervised deep learning model that uses two neural networks trained against each other to produce encoded representations of spatial data. These different spatial encoding methods were evaluated on the prediction of severe hail for a large set of storm patches extracted from the NCAR convection-allowing ensemble. Each storm patch contains information about storm structure and the near-storm environment. Logistic regression and random forest models were trained using the PCA and GAN encodings of the storm data and were compared against the predictions from a convolutional neural network. All methods showed skill over climatology at predicting the probability of severe hail. However, the verification scores among the methods were very similar and the predictions were highly correlated. Further evaluations are being performed to determine how the choice of input variables affects the results.
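
    A minimal sketch of the PCA-encoding baseline described above, assuming flattened storm patches and binary severe-hail labels (both synthetic here): the leading principal components serve as a low-dimensional code on which a logistic regression is trained.

    ```python
    # PCA encoding of storm patches followed by logistic regression.
    # Patch shapes, component count and labels are illustrative placeholders.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(2)
    patches = rng.random((1000, 32, 32))      # hypothetical storm patches
    severe = rng.integers(0, 2, 1000)         # hypothetical hail labels

    X = patches.reshape(len(patches), -1)     # flatten each patch to a vector
    codes = PCA(n_components=20).fit_transform(X)

    clf = LogisticRegression(max_iter=1000).fit(codes, severe)
    print("training accuracy:", clf.score(codes, severe))
    ```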

  14. Pulmonary tumor measurements from x-ray computed tomography in one, two, and three dimensions.

    PubMed

    Villemaire, Lauren; Owrangi, Amir M; Etemad-Rezai, Roya; Wilson, Laura; O'Riordan, Elaine; Keller, Harry; Driscoll, Brandon; Bauman, Glenn; Fenster, Aaron; Parraga, Grace

    2011-11-01

    We evaluated the accuracy and reproducibility of three-dimensional (3D) measurements of lung phantoms and patient tumors from x-ray computed tomography (CT) and compared these to one-dimensional (1D) and two-dimensional (2D) measurements. CT images of three spherical and three irregularly shaped tumor phantoms were evaluated by three observers who performed five repeated measurements. Additionally, three observers manually segmented 29 patient lung tumors five times each. Follow-up imaging was performed for 23 tumors and response criteria were compared. For a single subject, imaging was performed on nine occasions over 2 years to evaluate multidimensional tumor response. To evaluate measurement accuracy, we compared imaging measurements to ground truth using analysis of variance. For estimates of precision, intraobserver and interobserver coefficients of variation and intraclass correlations (ICC) were used. Linear regression and Pearson correlations were used to evaluate agreement and tumor response was descriptively compared. For spherically shaped phantoms, all measurements were highly accurate, but for irregularly shaped phantoms, only 3D measurements were in high agreement with ground truth measurements. All phantom and patient measurements showed high intra- and interobserver reproducibility (ICC >0.900). Over a 2-year period for a single patient, there was disagreement between tumor response classifications based on 3D measurements and those generated using 1D and 2D measurements. Tumor volume measurements were highly reproducible and accurate for irregularly and spherically shaped phantoms and for patient tumors with nonuniform dimensions. Response classifications obtained from multidimensional measurements suggest that 3D measurements provide higher sensitivity to tumor response. Copyright © 2011 AUR. Published by Elsevier Inc. All rights reserved.

  15. Advanced statistics: linear regression, part I: simple linear regression.

    PubMed

    Marill, Keith A

    2004-01-01

    Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.
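
    For reference, the method of least squares for a single predictor has a closed form, sketched below on made-up toy data: the slope is the ratio of the predictor-outcome covariance to the predictor variance, and the intercept makes the line pass through the means.

    ```python
    # Closed-form least-squares fit for simple linear regression.
    # The toy clinical-style data are invented for the example.
    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # predictor
    y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])   # outcome

    slope = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
    intercept = y.mean() - slope * x.mean()
    print(f"fitted line: y = {intercept:.2f} + {slope:.2f} x")
    ```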

  16. Piece-wise quadratic approximations of arbitrary error functions for fast and robust machine learning.

    PubMed

    Gorban, A N; Mirkes, E M; Zinovyev, A

    2016-12-01

    Most machine learning approaches have stemmed from applying the principle of minimizing the mean squared distance, based on computationally efficient quadratic optimization methods. However, when faced with high-dimensional and noisy data, quadratic error functionals demonstrate many weaknesses, including high sensitivity to contaminating factors and the curse of dimensionality. Therefore, many recent applications in machine learning have exploited properties of non-quadratic error functionals based on the L1 norm or even sub-linear potentials corresponding to quasinorms Lp (0 < p < 1).

  17. Visual analysis of mass cytometry data by hierarchical stochastic neighbour embedding reveals rare cell types.

    PubMed

    van Unen, Vincent; Höllt, Thomas; Pezzotti, Nicola; Li, Na; Reinders, Marcel J T; Eisemann, Elmar; Koning, Frits; Vilanova, Anna; Lelieveldt, Boudewijn P F

    2017-11-23

    Mass cytometry allows high-resolution dissection of the cellular composition of the immune system. However, the high dimensionality, large size, and non-linear structure of the data pose considerable challenges for the data analysis. In particular, dimensionality reduction-based techniques like t-SNE offer single-cell resolution but are limited in the number of cells that can be analyzed. Here we introduce Hierarchical Stochastic Neighbor Embedding (HSNE) for the analysis of mass cytometry data sets. HSNE constructs a hierarchy of non-linear similarities that can be interactively explored with a stepwise increase in detail up to the single-cell level. We apply HSNE to a study on gastrointestinal disorders and three other available mass cytometry data sets. We find that HSNE efficiently replicates previous observations and identifies rare cell populations that were previously missed due to downsampling. Thus, HSNE removes the scalability limit of conventional t-SNE analysis, a feature that makes it highly suitable for the analysis of massive high-dimensional data sets.

  18. Development of non-linear models predicting daily fine particle concentrations using aerosol optical depth retrievals and ground-based measurements at a municipality in the Brazilian Amazon region

    NASA Astrophysics Data System (ADS)

    Gonçalves, Karen dos Santos; Winkler, Mirko S.; Benchimol-Barbosa, Paulo Roberto; de Hoogh, Kees; Artaxo, Paulo Eduardo; de Souza Hacon, Sandra; Schindler, Christian; Künzli, Nino

    2018-07-01

    Epidemiological studies generally use particulate matter measurements with diameter less than 2.5 μm (PM2.5) from monitoring networks. Satellite aerosol optical depth (AOD) data have considerable potential in predicting PM2.5 concentrations, and thus provide an alternative method for producing knowledge about the level of pollution and its health impact in areas where no ground PM2.5 measurements are available. This is the case in the Brazilian Amazon rainforest region, where forest fires are frequent sources of high pollution. In this study, we applied a non-linear model for predicting PM2.5 concentration from AOD retrievals using interaction terms between average temperature, relative humidity, the sine and cosine of the date over a period of 365.25 days, and the square of the lagged relative residual. Regression performance statistics were tested by comparing the goodness of fit and R2 based on results from linear and non-linear regression for six different models. The regression results for the non-linear prediction showed the best performance, explaining on average 82% of the daily PM2.5 concentrations over the whole period studied. In the context of Amazonia, this was the first study predicting PM2.5 concentrations using the latest high-resolution AOD products, in combination with testing the performance of a non-linear model. Our results permitted a reliable prediction based on the AOD-PM2.5 relationship and set the basis for further investigations on air pollution impacts in the complex context of the Brazilian Amazon region.
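
    A heavily simplified sketch of the seasonal structure described above: PM2.5 regressed on AOD plus the sine and cosine of the day over a 365.25-day period. The paper's interaction terms and lagged-residual term are omitted, and the data below are synthetic.

    ```python
    # Harmonic (seasonal) linear regression of PM2.5 on AOD, as a sketch.
    import numpy as np

    rng = np.random.default_rng(3)
    days = np.arange(730.0)
    aod = rng.random(730)
    pm25 = (5 + 20 * aod + 3 * np.sin(2 * np.pi * days / 365.25)
            + rng.standard_normal(730))

    X = np.column_stack([
        np.ones_like(days),                 # intercept
        aod,                                # AOD retrieval
        np.sin(2 * np.pi * days / 365.25),  # seasonal sine term
        np.cos(2 * np.pi * days / 365.25),  # seasonal cosine term
    ])
    coef, *_ = np.linalg.lstsq(X, pm25, rcond=None)
    print("intercept, AOD, sin, cos:", np.round(coef, 2))
    ```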

  19. High-frequency toneburst-evoked ABR latency-intensity functions in sensorineural hearing-impaired humans.

    PubMed

    Fausti, S A; Olson, D J; Frey, R H; Henry, J A; Schaffer, H I; Phillips, D S

    1995-01-01

    The latency-intensity functions (LIFs) of ABRs elicited by high-frequency (8, 10, 12, and 14 kHz) toneburst stimuli were evaluated in 20 subjects with confirmed 'moderate' high-frequency sensorineural hearing loss. Wave V results from clicks and tonebursts revealed all intra- and intersession data to be reliable (p > 0.05). Linear regression curves were highly significant (p < or = 0.0001), indicating linear relationships for all stimuli analyzed. Comparisons between the linear regression curves from a previously reported normal-hearing subject group and this sensorineural hearing-impaired group showed no significant differences. This study demonstrated that tonebursts at 8, 10, and 12 kHz evoked ABRs which decreased in latency as a function of increasing intensity and that these LIFs were consistent and orderly (14 kHz was not determinable). These results will contribute information to facilitate the establishment of change criteria used to predict change in hearing during treatment with ototoxic medications.

  20. Prognostic value of three-dimensional ultrasound for fetal hydronephrosis

    PubMed Central

    WANG, JUNMEI; YING, WEIWEN; TANG, DAXING; YANG, LIMING; LIU, DONGSHENG; LIU, YUANHUI; PAN, JIAOE; XIE, XING

    2015-01-01

    The present study evaluated the prognostic value of three-dimensional ultrasound for fetal hydronephrosis. Pregnant females with fetal hydronephrosis were enrolled and a novel three-dimensional ultrasound indicator, renal parenchymal volume/kidney volume, was introduced to predict the postnatal prognosis of fetal hydronephrosis in comparison with commonly used ultrasound indicators. All ultrasound indicators of fetal hydronephrosis could predict whether postnatal surgery was required for fetal hydronephrosis; however, the predictive performance of renal parenchymal volume/kidney volume measurements as an individual indicator was the highest. In conclusion, ultrasound is important in predicting whether postnatal surgery is required for fetal hydronephrosis, and the three-dimensional ultrasound indicator renal parenchymal volume/kidney volume has a high predictive performance. Furthermore, the majority of cases of fetal hydronephrosis spontaneously regress subsequent to birth, and the regression time is closely associated with ultrasound indicators. PMID:25667626

  1. Predicting U.S. Army Reserve Unit Manning Using Market Demographics

    DTIC Science & Technology

    2015-06-01

    develops linear regression, classification tree, and logistic regression models to determine the ability of the location to support manning requirements... logistic regression model delivers predictive results that allow decision-makers to identify locations with a high probability of meeting unit... manning requirements. The recommendation of this thesis is that the USAR implement the logistic regression model.

  2. Quadratic Polynomial Regression using Serial Observation Processing: Implementation within DART

    NASA Astrophysics Data System (ADS)

    Hodyss, D.; Anderson, J. L.; Collins, N.; Campbell, W. F.; Reinecke, P. A.

    2017-12-01

    Many Ensemble-Based Kalman filtering (EBKF) algorithms process the observations serially. Serial observation processing views the data assimilation process as an iterative sequence of scalar update equations. What is useful about this data assimilation algorithm is that it has very low memory requirements and does not need complex methods to perform the typical high-dimensional inverse calculation of many other algorithms. Recently, the push has been towards the prediction, and therefore the assimilation of observations, for regions and phenomena for which high resolution is required and/or highly nonlinear physical processes are operating. For these situations, a basic hypothesis is that the use of the EBKF is sub-optimal and performance gains could be achieved by accounting for aspects of the non-Gaussianity. To this end, we develop here a new component of the Data Assimilation Research Testbed [DART] to allow a wide variety of users to test this hypothesis. This new version of DART allows one to run several variants of the EBKF as well as several variants of the quadratic polynomial filter using the same forecast model and observations. Differences between the results of the two systems will then highlight the degree of non-Gaussianity in the system being examined. We will illustrate in this work the differences between the performance of linear versus quadratic polynomial regression in a hierarchy of models from Lorenz-63 to a simple general circulation model.
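
    A minimal sketch of serial observation processing in the EBKF spirit described above, assuming each observation directly measures one state component: every scalar observation first updates the ensemble in observation space, and the increments are then linearly regressed onto the state variables. This is a generic EAKF-style update written for illustration, not DART code.

    ```python
    # Serial scalar updates of an ensemble, one observation at a time.
    import numpy as np

    def serial_update(ens, obs, obs_idx, obs_var):
        """ens: (n_members, n_state); one scalar update per observation."""
        for y, j, r in zip(obs, obs_idx, obs_var):
            hx = ens[:, j]                          # observed ensemble values
            prior_mean, prior_var = hx.mean(), hx.var(ddof=1)
            post_var = 1.0 / (1.0 / prior_var + 1.0 / r)
            post_mean = post_var * (prior_mean / prior_var + y / r)
            shrink = np.sqrt(post_var / prior_var)  # deterministic adjustment
            dhx = post_mean + shrink * (hx - prior_mean) - hx
            # Regress observation-space increments onto every state variable.
            cov = (ens - ens.mean(axis=0)).T @ (hx - prior_mean) / (len(hx) - 1)
            ens = ens + np.outer(dhx, cov / prior_var)
        return ens

    rng = np.random.default_rng(0)
    ens = rng.standard_normal((20, 4))              # 20 members, 4 state vars
    ens = serial_update(ens, obs=[0.5], obs_idx=[2], obs_var=[0.1])
    ```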

  3. Distributed Monitoring of the R^2 Statistic for Linear Regression

    NASA Technical Reports Server (NTRS)

    Bhaduri, Kanishka; Das, Kamalika; Giannella, Chris R.

    2011-01-01

    The problem of monitoring a multivariate linear regression model is relevant in studying the evolving relationship between a set of input variables (features) and one or more dependent target variables. This problem becomes challenging for large-scale data in a distributed computing environment when only a subset of instances is available at individual nodes and the local data changes frequently. Data centralization and periodic model recomputation can add high overhead to tasks like anomaly detection in such dynamic settings. Therefore, the goal is to develop techniques for monitoring and updating the model over the union of all nodes' data in a communication-efficient fashion. Correctness guarantees on such techniques are also often highly desirable, especially in safety-critical application scenarios. In this paper we develop DReMo, a distributed algorithm with very low resource overhead, for monitoring the quality of a regression model in terms of its coefficient of determination (the R^2 statistic). When the nodes collectively determine that R^2 has dropped below a fixed threshold, the linear regression model is recomputed via a network-wide convergecast and the updated model is broadcast back to all nodes. We show empirically, using both synthetic and real data, that our proposed method is highly communication-efficient and scalable, and also provide theoretical guarantees on correctness.
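
    The monitored quantity itself is easy to state in code. The sketch below computes the coefficient of determination of a fitted linear model and triggers a (here, local) recomputation when it falls below a threshold; the distributed convergecast/broadcast machinery of DReMo is not reproduced.

    ```python
    # R^2 of a linear model, with a fixed threshold triggering a refit.
    import numpy as np

    def r_squared(X, y, w):
        resid = y - X @ w
        ss_res = resid @ resid
        ss_tot = np.sum((y - y.mean())**2)
        return 1.0 - ss_res / ss_tot

    rng = np.random.default_rng(4)
    X = rng.standard_normal((500, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(500)
    w, *_ = np.linalg.lstsq(X, y, rcond=None)

    THRESHOLD = 0.8                       # illustrative monitoring threshold
    if r_squared(X, y, w) < THRESHOLD:
        w, *_ = np.linalg.lstsq(X, y, rcond=None)   # recompute the model
    ```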

  4. Postmolar gestational trophoblastic neoplasia: beyond the traditional risk factors.

    PubMed

    Bakhtiyari, Mahmood; Mirzamoradi, Masoumeh; Kimyaiee, Parichehr; Aghaie, Abbas; Mansournia, Mohammd Ali; Ashrafi-Vand, Sepideh; Sarfjoo, Fatemeh Sadat

    2015-09-01

    To investigate the slope of linear regression of postevacuation serum hCG as an independent risk factor for postmolar gestational trophoblastic neoplasia (GTN). Multicenter retrospective cohort study. Academic referral health care centers. All subjects with confirmed hydatidiform mole and at least four measurements of β-hCG titer. None. Type and magnitude of the relationship between the slope of linear regression of β-hCG as a new risk factor and GTN using Bayesian logistic regression with penalized log-likelihood estimation. Among the high-risk and low-risk molar pregnancy cases, 11 (18.6%) and 19 cases (13.3%) had GTN, respectively. No significant relationship was found between the components of a high-risk pregnancy and GTN. The β-hCG return slope was higher in the spontaneous cure group. However, the initial level of this hormone in the first measurement was higher in the GTN group compared with the spontaneous recovery group. The average time for diagnosing GTN in the high-risk molar pregnancy group was 2 weeks less than that of the low-risk molar pregnancy group. In addition to the slope of linear regression of β-hCG (odds ratio [OR], 12.74; confidence interval [CI], 5.42-29.2), abortion history (OR, 2.53; 95% CI, 1.27-5.04) and large uterine height for gestational age (OR, 1.26; CI, 1.04-1.54) had the maximum effects on GTN outcome, respectively. The slope of linear regression of β-hCG was introduced as an independent risk factor, which could be used for clinical decision making based on records of β-hCG titer and a subsequent prevention program. Copyright © 2015 American Society for Reproductive Medicine. Published by Elsevier Inc. All rights reserved.

  5. Linear stability theory and three-dimensional boundary layer transition

    NASA Technical Reports Server (NTRS)

    Spall, Robert E.; Malik, Mujeeb R.

    1992-01-01

    The viewgraphs and discussion of linear stability theory and three-dimensional boundary layer transition are provided. The ability to predict, using analytical tools, the location of boundary layer transition over aircraft-type configurations is of great importance to designers interested in laminar flow control (LFC). The e^N method has proven to be fairly effective in predicting, in a consistent manner, the location of the onset of transition for simple geometries in low disturbance environments. This method provides a correlation between the most amplified single normal mode and the experimental location of the onset of transition. Studies indicate that values of N between 8 and 10 correlate well with the onset of transition. For most previous calculations, the mean flows were restricted to two-dimensional or axisymmetric cases, or have employed simple three-dimensional mean flows (e.g., rotating disk, infinite swept wing, or tapered swept wing with straight isobars). Unfortunately, for flows over general wing configurations, and for nearly all flows over fuselage-type bodies at incidence, the analysis of fully three-dimensional flow fields is required. Results obtained for the linear stability of fully three-dimensional boundary layers formed over both wing and fuselage-type geometries, and for both high and low speed flows, are discussed. When possible, transition estimates from the e^N method are compared to experimentally determined locations. The stability calculations are made using a modified version of the linear stability code COSAL. Mean flows were computed using both Navier-Stokes and boundary-layer codes.

  6. Local field potential spectral tuning in motor cortex during reaching.

    PubMed

    Heldman, Dustin A; Wang, Wei; Chan, Sherwin S; Moran, Daniel W

    2006-06-01

    In this paper, intracortical local field potentials (LFPs) and single units were recorded from the motor cortices of monkeys (Macaca fascicularis) while they performed a standard three-dimensional (3-D) center-out reaching task. During the center-out task, the subjects held their hands at the location of a central target and then reached to one of eight peripheral targets forming the corners of a virtual cube. The spectral amplitudes of the recorded LFPs were calculated, with the high-frequency LFP (HF-LFP) defined as the average spectral amplitude change from baseline from 60 to 200 Hz. A 3-D linear regression across the eight center-out targets revealed that approximately 6% of the beta LFPs (18-26 Hz) and 18% of the HF-LFPs were tuned for velocity (p-value < 0.05), while 10% of the beta LFPs and 15% of the HF-LFPs were tuned for position. These results suggest that a multidegree-of-freedom brain-machine interface is possible using high-frequency LFP recordings in motor cortex.

  7. Dimensional analysis of detrimental ozone generation by positive wire-to-plate corona discharge in air

    NASA Astrophysics Data System (ADS)

    Bo, Z.; Chen, J. H.

    2010-02-01

    The dimensional analysis technique is used to formulate a correlation between the ozone generation rate and various parameters that are important in the design and operation of positive wire-to-plate corona discharges in indoor air. The dimensionless relation is determined by linear regression analysis based on the results from 36 laboratory-scale experiments. The derived equation is validated against experimental data and a numerical model published in the literature. Applications of the derived equation are illustrated through an example: selecting an appropriate set of operating conditions in the design and operation of a photocopier so as to comply with federal regulations on ozone emission. Finally, a new current-voltage characteristic equation is proposed for positive wire-to-plate corona discharges based on the derived dimensionless equation.

  8. Geographical variation of cerebrovascular disease in New York State: the correlation with income

    PubMed Central

    Han, Daikwon; Carrow, Shannon S; Rogerson, Peter A; Munschauer, Frederick E

    2005-01-01

    Background Income is known to be associated with cerebrovascular disease; however, little is known about the more detailed relationship between cerebrovascular disease and income. We examined the hypothesis that the geographical distribution of cerebrovascular disease in New York State may be predicted by a nonlinear model using income as a surrogate socioeconomic risk factor. Results We used spatial clustering methods to identify areas with high and low prevalence of cerebrovascular disease at the ZIP code level after smoothing rates and correcting for edge effects; geographic locations of high and low clusters of cerebrovascular disease in New York State were identified with and without income adjustment. To examine effects of income, we calculated the excess number of cases using a non-linear regression with cerebrovascular disease rates taken as the dependent variable and income and income squared taken as independent variables. The resulting regression equation was: excess rate = 32.075 - 1.22×10^-4 (income) + 8.068×10^-10 (income^2), and both the income and income-squared variables were significant at the 0.01 level. When income was included as a covariate in the non-linear regression, the number and size of clusters of high cerebrovascular disease prevalence decreased. Some 87 ZIP codes exceeded the critical value of the local statistic, yielding a relative risk of 1.2. The majority of low cerebrovascular disease prevalence geographic clusters disappeared when the non-linear income effect was included. For linear regression, the excess rate of cerebrovascular disease falls with income; each $10,000 increase in median income of each ZIP code resulted in an average reduction of 3.83 observed cases. The significant nonlinear effect indicates a lessening of this income effect with increasing income. Conclusion Income is a non-linear predictor of excess cerebrovascular disease rates, with both low and high observed cerebrovascular disease rate areas associated with higher income. Income alone explains a significant amount of the geographical variance in cerebrovascular disease across New York State since both high and low clusters of cerebrovascular disease dissipate or disappear with income adjustment. Geographical modeling, including non-linear effects of income, may allow for better identification of other non-traditional risk factors. PMID:16242043

  9. Computational process to study the wave propagation in a non-linear medium by quasi-linearization

    NASA Astrophysics Data System (ADS)

    Sharath Babu, K.; Venkata Brammam, J.; Baby Rani, CH

    2018-03-01

    When two objects with distinct velocities come into contact, an impact can occur. In impact studies, i.e., in the displacement of the objects after the impact, the impact force is a function of time 't' and behaves similarly to a compression force. The impact duration is very short, so impulses are generated and, subsequently, high stresses. In this work we examine the wave propagation inside the object after the collision and measure the object's non-linear behavior in the one-dimensional case. Wave transmission is studied by means of the material's acoustic parameter value. The objective of this paper is to present a computational study of propagating pulse and harmonic waves in non-linear media using quasi-linearization followed by a central difference scheme. The study focuses on longitudinal, one-dimensional wave propagation. Within the finite difference scheme, the non-linear system is reduced to a linear system by applying the quasi-linearization method. The computed results are in good agreement with the selected non-linear wave propagation cases.

  10. Optimized multiple linear mappings for single image super-resolution

    NASA Astrophysics Data System (ADS)

    Zhang, Kaibing; Li, Jie; Xiong, Zenggang; Liu, Xiuping; Gao, Xinbo

    2017-12-01

    Learning piecewise linear regression has been recognized as an effective way to perform example learning-based single image super-resolution (SR) in the literature. In this paper, we employ an expectation-maximization (EM) algorithm to further improve the SR performance of our previous multiple linear mappings (MLM) based SR method. In the training stage, the proposed method starts with a set of linear regressors obtained by the MLM-based method, and then jointly optimizes the clustering results and the low- and high-resolution subdictionary pairs for the regression functions by using the metric of the reconstruction errors. In the test stage, we select the optimal regressor for SR reconstruction by accumulating the reconstruction errors of the m-nearest neighbors in the training set. Thorough experiments carried out on six publicly available datasets demonstrate that the proposed SR method can yield high-quality images with finer details and sharper edges in terms of both quantitative and perceptual image quality assessments.

  11. Bivariate least squares linear regression: Towards a unified analytic formalism. I. Functional models

    NASA Astrophysics Data System (ADS)

    Caimmi, R.

    2011-08-01

    Concerning bivariate least squares linear regression, the classical approach pursued for functional models in earlier attempts (York, 1966, 1969) is reviewed using a new formalism in terms of deviation (matrix) traces which, for unweighted data, reduce to usual quantities leaving aside an unessential (but dimensional) multiplicative factor. Within the framework of classical error models, the dependent variable relates to the independent variable according to the usual additive model. The classes of linear models considered are regression lines in the general case of correlated errors in X and in Y for weighted data, and in the opposite limiting situations of (i) uncorrelated errors in X and in Y, and (ii) completely correlated errors in X and in Y. The special case of (C) generalized orthogonal regression is considered in detail together with well known subcases, namely: (Y) errors in X negligible (ideally null) with respect to errors in Y; (X) errors in Y negligible (ideally null) with respect to errors in X; (O) genuine orthogonal regression; (R) reduced major-axis regression. In the limit of unweighted data, the results determined for functional models are compared with their counterparts related to extreme structural models, i.e. where the instrumental scatter is negligible (ideally null) with respect to the intrinsic scatter (Isobe et al., 1990; Feigelson and Babu, 1992). While regression line slope and intercept estimators for functional and structural models necessarily coincide, the contrary holds for related variance estimators even if the residuals obey a Gaussian distribution, with the exception of Y models. An example of astronomical application is considered, concerning the [O/H]-[Fe/H] empirical relations deduced from five samples related to different stars and/or different methods of oxygen abundance determination. For selected samples and assigned methods, different regression models yield consistent results within the errors (∓σ) for both heteroscedastic and homoscedastic data. Conversely, samples related to different methods produce discrepant results, due to the presence of (still undetected) systematic errors, which implies no definitive statement can be made at present. A comparison is also made between different expressions of regression line slope and intercept variance estimators, where fractional discrepancies are found not to exceed a few percent, growing up to about 20% in the presence of large-dispersion data. An extension of the formalism to structural models is left to a forthcoming paper.

  12. Comparison of l₁-Norm SVR and Sparse Coding Algorithms for Linear Regression.

    PubMed

    Zhang, Qingtian; Hu, Xiaolin; Zhang, Bo

    2015-08-01

    Support vector regression (SVR) is a popular function estimation technique based on Vapnik's concept of support vector machine. Among many variants, the l1-norm SVR is known to be good at selecting useful features when the features are redundant. Sparse coding (SC) is a technique widely used in many areas and a number of efficient algorithms are available. Both l1-norm SVR and SC can be used for linear regression. In this brief, the close connection between the l1-norm SVR and SC is revealed and some typical algorithms are compared for linear regression. The results show that the SC algorithms outperform the Newton linear programming algorithm, an efficient l1-norm SVR algorithm, in efficiency. The algorithms are then used to design the radial basis function (RBF) neural networks. Experiments on some benchmark data sets demonstrate the high efficiency of the SC algorithms. In particular, one of the SC algorithms, the orthogonal matching pursuit is two orders of magnitude faster than a well-known RBF network designing algorithm, the orthogonal least squares algorithm.
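
    For a flavor of the comparison, the sketch below fits the same sparse linear-regression problem two ways using scikit-learn: an l1-penalized model (Lasso, standing in for the l1-norm SVR, which scikit-learn does not provide) and orthogonal matching pursuit, one of the SC algorithms named above. The synthetic design matrix and the five true features are invented for the example.

    ```python
    # Two sparse routes to linear regression: l1 penalty vs. greedy OMP.
    import numpy as np
    from sklearn.linear_model import Lasso, OrthogonalMatchingPursuit

    rng = np.random.default_rng(5)
    X = rng.standard_normal((200, 50))
    w_true = np.zeros(50)
    w_true[:5] = [3.0, -2.0, 1.5, -1.0, 0.5]      # only 5 useful features
    y = X @ w_true + 0.05 * rng.standard_normal(200)

    lasso = Lasso(alpha=0.05).fit(X, y)
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=5).fit(X, y)
    print("lasso nonzero coefficients:", np.sum(lasso.coef_ != 0))
    print("omp selected features:", np.flatnonzero(omp.coef_))
    ```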

  13. Engineering two-photon high-dimensional states through quantum interference

    PubMed Central

    Zhang, Yingwen; Roux, Filippus S.; Konrad, Thomas; Agnew, Megan; Leach, Jonathan; Forbes, Andrew

    2016-01-01

    Many protocols in quantum science, for example, linear optical quantum computing, require access to large-scale entangled quantum states. Such systems can be realized through many-particle qubits, but this approach often suffers from scalability problems. An alternative strategy is to consider a lesser number of particles that exist in high-dimensional states. The spatial modes of light are one such candidate that provides access to high-dimensional quantum states, and thus they increase the storage and processing potential of quantum information systems. We demonstrate the controlled engineering of two-photon high-dimensional states entangled in their orbital angular momentum through Hong-Ou-Mandel interference. We prepare a large range of high-dimensional entangled states and implement precise quantum state filtering. We characterize the full quantum state before and after the filter, and are thus able to determine that only the antisymmetric component of the initial state remains. This work paves the way for high-dimensional processing and communication of multiphoton quantum states, for example, in teleportation beyond qubits. PMID:26933685

  14. Correlation and simple linear regression.

    PubMed

    Eberly, Lynn E

    2007-01-01

    This chapter highlights important steps in using correlation and simple linear regression to address scientific questions about the association of two continuous variables with each other. These steps include estimation and inference, assessing model fit, the connection between regression and ANOVA, and study design. Examples in microbiology are used throughout. This chapter provides a framework that is helpful in understanding more complex statistical techniques, such as multiple linear regression, linear mixed effects models, logistic regression, and proportional hazards regression.

  15. Locality-preserving sparse representation-based classification in hyperspectral imagery

    NASA Astrophysics Data System (ADS)

    Gao, Lianru; Yu, Haoyang; Zhang, Bing; Li, Qingting

    2016-10-01

    This paper proposes to combine locality-preserving projections (LPP) and sparse representation (SR) for hyperspectral image classification. The LPP is first used to reduce the dimensionality of all the training and testing data by finding the optimal linear approximations to the eigenfunctions of the Laplace Beltrami operator on the manifold, where the high-dimensional data lies. Then, SR codes the projected testing pixels as sparse linear combinations of all the training samples to classify the testing pixels by evaluating which class leads to the minimum approximation error. The integration of LPP and SR represents an innovative contribution to the literature. The proposed approach, called locality-preserving SR-based classification, addresses the imbalance between high dimensionality of hyperspectral data and the limited number of training samples. Experimental results on three real hyperspectral data sets demonstrate that the proposed approach outperforms the original counterpart, i.e., SR-based classification.

  16. A manifold learning approach to target detection in high-resolution hyperspectral imagery

    NASA Astrophysics Data System (ADS)

    Ziemann, Amanda K.

    Imagery collected from airborne platforms and satellites provides an important medium for remotely analyzing the content in a scene. In particular, the ability to detect a specific material within a scene is of high importance to both civilian and defense applications. This may include identifying "targets" such as vehicles, buildings, or boats. Sensors that process hyperspectral images provide the high-dimensional spectral information necessary to perform such analyses. However, for a d-dimensional hyperspectral image, it is typical for the data to inherently occupy an m-dimensional space, with m << d. In the remote sensing community, this has led to a recent increase in the use of manifold learning, which aims to characterize the embedded lower-dimensional, non-linear manifold upon which the hyperspectral data inherently lie. Classic hyperspectral data models include statistical, linear subspace, and linear mixture models, but these can place restrictive assumptions on the distribution of the data; this is particularly true when implementing traditional target detection approaches, and the limitations of these models are well-documented. With manifold learning based approaches, the only assumption is that the data reside on an underlying manifold that can be discretely modeled by a graph. The research presented here focuses on the use of graph theory and manifold learning in hyperspectral imagery. Early work explored various graph-building techniques with application to the background model of the Topological Anomaly Detection (TAD) algorithm, which is a graph theory based approach to anomaly detection. This led towards a focus on target detection and to the development of a specific graph-based model of the data and subsequent dimensionality reduction using manifold learning. An adaptive graph is built on the data, and then used to implement an adaptive version of locally linear embedding (LLE). We artificially induce a target manifold and incorporate it into the adaptive LLE transformation; the artificial target manifold helps to guide the separation of the target data from the background data in the new, lower-dimensional manifold coordinates. Then, target detection is performed in the manifold space.

  17. Evaluation of gridding procedures for air temperature over Southern Africa

    NASA Astrophysics Data System (ADS)

    Eiselt, Kai-Uwe; Kaspar, Frank; Mölg, Thomas; Krähenmann, Stefan; Posada, Rafael; Riede, Jens O.

    2017-06-01

    Africa is considered to be highly vulnerable to climate change, yet the availability of observational data and derived products is limited. As one element of the SASSCAL initiative (Southern African Science Service Centre for Climate Change and Adaptive Land Management), a cooperation of Angola, Botswana, Namibia, Zambia, South Africa and Germany, networks of automatic weather stations have been installed or improved (http://www.sasscalweathernet.org). The increased availability of meteorological observations improves the quality of gridded products for the region. Here we compare interpolation methods for monthly minimum and maximum temperatures which were calculated from hourly measurements. Due to a lack of long-term records we focused on data ranging from September 2014 to August 2016. The best interpolation results were achieved by combining multiple linear regression (with elevation, a continentality index and latitude as predictors) with three-dimensional inverse distance weighted interpolation.
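
    A hedged sketch of that combination, assuming synthetic station data: fit a multiple linear regression on the three predictors, then interpolate the station residuals by inverse distance weighting (reduced here to two horizontal dimensions for brevity; the study used a three-dimensional variant).

    ```python
    # Multiple linear regression trend plus IDW interpolation of residuals.
    import numpy as np

    rng = np.random.default_rng(6)
    n = 80
    preds = rng.random((n, 3))            # elevation, continentality, latitude
    temp = 30.0 - 6.0 * preds[:, 0] + rng.standard_normal(n)
    xy = rng.random((n, 2))               # station coordinates

    X = np.column_stack([np.ones(n), preds])
    beta, *_ = np.linalg.lstsq(X, temp, rcond=None)
    resid = temp - X @ beta               # station residuals from the trend

    def predict(target_xy, target_preds, power=2.0):
        trend = np.concatenate([[1.0], target_preds]) @ beta
        d = np.linalg.norm(xy - target_xy, axis=1)
        w = 1.0 / np.maximum(d, 1e-9)**power
        return trend + np.sum(w * resid) / np.sum(w)

    print(predict(np.array([0.5, 0.5]), np.array([0.2, 0.4, 0.6])))
    ```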

  18. Accounting for autocorrelation in multi-drug resistant tuberculosis predictors using a set of parsimonious orthogonal eigenvectors aggregated in geographic space.

    PubMed

    Jacob, Benjamin J; Krapp, Fiorella; Ponce, Mario; Gottuzzo, Eduardo; Griffith, Daniel A; Novak, Robert J

    2010-05-01

    Spatial autocorrelation is problematic for classical hierarchical cluster detection tests commonly used in multi-drug resistant tuberculosis (MDR-TB) analyses, as considerable random error can occur. Therefore, when MDR-TB clusters are spatially autocorrelated, the assumption that the clusters are independently random is invalid. In this research, a product moment correlation coefficient (i.e., Moran's coefficient) was used to quantify local spatial variation in multiple clinical and environmental predictor variables sampled in San Juan de Lurigancho, Lima, Peru. Initially, QuickBird 0.61 m data, encompassing the visible bands and the near infra-red bands, were selected to synthesize images of land cover attributes of the study site. Residential addresses of individual patients with smear-positive MDR-TB were geocoded, prevalence rates calculated and then digitally overlaid onto the satellite data within a 2 km buffer of 31 georeferenced health centers, using a 10 m^2 grid-based algorithm. Geographical information system (GIS)-gridded measurements of each health center were generated based on preliminary base maps of the georeferenced data aggregated to block groups and census tracts within each buffered area. A three-dimensional model of the study site was constructed based on a digital elevation model (DEM) to determine terrain covariates associated with the sampled MDR-TB covariates. Pearson's correlation was used to evaluate the linear relationship between the DEM and the sampled MDR-TB data. A SAS/GIS module was then used to calculate univariate statistics and to perform linear and non-linear regression analyses using the sampled predictor variables. The estimates generated from a global autocorrelation analysis were then spatially decomposed into empirical orthogonal bases using a negative binomial regression with a non-homogeneous mean. Results of the DEM analyses indicated a statistically non-significant, linear relationship between georeferenced health centers and the sampled covariate elevation. The data exhibited positive spatial autocorrelation, and the decomposition of Moran's coefficient into uncorrelated, orthogonal map pattern components revealed global spatial heterogeneities necessary to capture latent autocorrelation in the MDR-TB model. It was thus shown that Poisson regression analyses and spatial eigenvector mapping can elucidate the mechanics of MDR-TB transmission by prioritizing clinical and environmental sampled predictor variables for identifying high-risk populations.
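
    For reference, the spatial autocorrelation measure used above, Moran's coefficient, can be computed in a few lines for a variable x and a row-standardized spatial weight matrix W; the data and weights below are random placeholders.

    ```python
    # Moran's I: spatial autocorrelation of x under weight matrix W.
    import numpy as np

    def morans_i(x, W):
        n = len(x)
        z = x - x.mean()                      # deviations from the mean
        num = n * (z @ W @ z)
        den = W.sum() * (z @ z)
        return num / den

    rng = np.random.default_rng(7)
    x = rng.random(30)
    W = rng.random((30, 30))
    np.fill_diagonal(W, 0.0)                  # no self-neighbors
    W /= W.sum(axis=1, keepdims=True)         # row-standardize the weights
    print("Moran's I:", morans_i(x, W))
    ```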

  19. Spatial Bayesian latent factor regression modeling of coordinate-based meta-analysis data.

    PubMed

    Montagna, Silvia; Wager, Tor; Barrett, Lisa Feldman; Johnson, Timothy D; Nichols, Thomas E

    2018-03-01

    Now over 20 years old, functional MRI (fMRI) has a large and growing literature that is best synthesised with meta-analytic tools. As most authors do not share image data, only the peak activation coordinates (foci) reported in the article are available for Coordinate-Based Meta-Analysis (CBMA). Neuroimaging meta-analysis is used to (i) identify areas of consistent activation; and (ii) build a predictive model of task type or cognitive process for new studies (reverse inference). To simultaneously address these aims, we propose a Bayesian point process hierarchical model for CBMA. We model the foci from each study as a doubly stochastic Poisson process, where the study-specific log intensity function is characterized as a linear combination of a high-dimensional basis set. A sparse representation of the intensities is guaranteed through latent factor modeling of the basis coefficients. Within our framework, it is also possible to account for the effect of study-level covariates (meta-regression), significantly expanding the capabilities of the current neuroimaging meta-analysis methods available. We apply our methodology to synthetic data and neuroimaging meta-analysis datasets. © 2017, The International Biometric Society.

  20. Generalized reduced rank latent factor regression for high dimensional tensor fields, and neuroimaging-genetic applications

    PubMed Central

    Tao, Chenyang; Nichols, Thomas E.; Hua, Xue; Ching, Christopher R.K.; Rolls, Edmund T.; Thompson, Paul M.; Feng, Jianfeng

    2017-01-01

    We propose a generalized reduced rank latent factor regression model (GRRLF) for the analysis of tensor field responses and high-dimensional covariates. The model is motivated by the need from imaging-genetic studies to identify genetic variants that are associated with brain imaging phenotypes, often in the form of high-dimensional tensor fields. GRRLF identifies from the structure in the data the effective dimensionality of the data, and then jointly performs dimension reduction of the covariates, dynamic identification of latent factors, and nonparametric estimation of both covariate and latent response fields. After accounting for the latent and covariate effects, GRRLF performs a nonparametric test on the remaining factor of interest. GRRLF provides a better factorization of the signals compared with common solutions, and is less susceptible to overfitting because it exploits the effective dimensionality. The generality and the flexibility of GRRLF also allow various statistical models to be handled in a unified framework and solutions can be efficiently computed. Within the field of neuroimaging, it improves the sensitivity for weak signals and is a promising alternative to existing approaches. The operation of the framework is demonstrated with both synthetic datasets and a real-world neuroimaging example in which the effects of a set of genes on the structure of the brain at the voxel level were measured, and the results compared favorably with those from existing approaches. PMID:27666385

  1. Bearing Fault Diagnosis Based on Statistical Locally Linear Embedding

    PubMed Central

    Wang, Xiang; Zheng, Yuan; Zhao, Zhenzhou; Wang, Jinping

    2015-01-01

    Fault diagnosis is essentially a kind of pattern recognition. The measured signal samples usually lie on nonlinear low-dimensional manifolds embedded in the high-dimensional signal space, so how to implement feature extraction and dimensionality reduction and improve recognition performance is a crucial task. In this paper a novel machinery fault diagnosis approach based on a statistical locally linear embedding (S-LLE) algorithm, which is an extension of LLE that exploits the fault class label information, is proposed. The fault diagnosis approach first extracts the intrinsic manifold features from the high-dimensional feature vectors, which are obtained from vibration signals by time-domain, frequency-domain and empirical mode decomposition (EMD) feature extraction, and then translates the complex mode space into a salient low-dimensional feature space by the manifold learning algorithm S-LLE, which outperforms other feature reduction methods such as PCA, LDA and LLE. Finally, in the reduced feature space, pattern classification and fault diagnosis are carried out easily and rapidly by a classifier. Rolling bearing fault signals are used to validate the proposed fault diagnosis approach. The results indicate that the proposed approach obviously improves the classification performance of fault pattern recognition and outperforms the other traditional approaches. PMID:26153771

  2. Application of General Regression Neural Network to the Prediction of LOD Change

    NASA Astrophysics Data System (ADS)

    Zhang, Xiao-Hong; Wang, Qi-Jie; Zhu, Jian-Jun; Zhang, Hao

    2012-01-01

    Traditional methods for predicting the change in length of day (LOD change) are mainly based on linear models, such as the least squares model and the autoregression model. However, the LOD change comprises complicated non-linear factors, and the prediction performance of linear models is often not ideal. Thus, a non-linear neural network model, the general regression neural network (GRNN), is tried for the prediction of the LOD change, and the result is compared with the predicted results obtained with the BP (back propagation) neural network model and other models. The comparison shows that the application of the GRNN to the prediction of the LOD change is highly effective and feasible.
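
    A GRNN is compact enough to sketch directly: prediction is a Gaussian-kernel-weighted average of the training targets, i.e. Nadaraya-Watson regression. The lagged toy series and the smoothing parameter sigma below are illustrative, not the paper's LOD data.

    ```python
    # Minimal GRNN: kernel-weighted average of training targets.
    import numpy as np

    def grnn_predict(X_train, y_train, X_query, sigma=0.1):
        # Squared distances between query and training patterns.
        d2 = ((X_query[:, None, :] - X_train[None, :, :])**2).sum(axis=-1)
        w = np.exp(-d2 / (2.0 * sigma**2))        # Gaussian kernel weights
        return (w @ y_train) / w.sum(axis=1)      # weighted target average

    # Toy series standing in for the LOD record; predict x(t) from 3 lags.
    t = np.linspace(0.0, 20.0, 400)
    series = np.sin(t) + 0.1 * np.sin(5.0 * t)
    lag = 3
    X = np.array([series[i:i + lag] for i in range(len(series) - lag)])
    y = series[lag:]
    print(grnn_predict(X[:300], y[:300], X[300:305]))
    ```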

  3. Validation of a selective ensemble-based classification scheme for myoelectric control using a three-dimensional Fitts' Law test.

    PubMed

    Scheme, Erik J; Englehart, Kevin B

    2013-07-01

    When controlling a powered upper limb prosthesis it is important not only to know how to move the device, but also when not to move. A novel approach to pattern recognition control, using a selective multiclass one-versus-one classification scheme, has been shown to be capable of rejecting unintended motions. This method was shown to outperform other popular classification schemes when presented with muscle contractions that did not correspond to desired actions. In this work, a 3-D Fitts' Law test is proposed as a suitable alternative to using virtual limb environments for evaluating real-time myoelectric control performance. The test is used to compare the selective approach to a state-of-the-art linear discriminant analysis classification based scheme. The framework is shown to obey Fitts' Law for both control schemes, producing linear regression fittings with high coefficients of determination (R^2 > 0.936). Additional performance metrics focused on quality of control are discussed and incorporated in the evaluation. Using this framework the selective classification based scheme is shown to produce significantly higher efficiency and completion rates, and significantly lower overshoot and stopping distances, with no significant difference in throughput.

  4. A panning DLT procedure for three-dimensional videography.

    PubMed

    Yu, B; Koh, T J; Hay, J G

    1993-06-01

    The direct linear transformation (DLT) method [Abdel-Aziz and Karara, APS Symposium on Photogrammetry. American Society of Photogrammetry, Falls Church, VA (1971)] is widely used in biomechanics to obtain three-dimensional space coordinates from film and video records. This method has some major shortcomings when used to analyze events which take place over large areas. To overcome these shortcomings, a three-dimensional data collection method based on the DLT method, and making use of panning cameras, was developed. Several small single control volumes were combined to construct a large total control volume. For each single control volume, a regression equation (calibration equation) is developed to express each of the 11 DLT parameters as a function of camera orientation, so that the DLT parameters can then be estimated from arbitrary camera orientations. Once the DLT parameters are known for at least two cameras, and the associated two-dimensional film or video coordinates of the event are obtained, the desired three-dimensional space coordinates can be computed. In a laboratory test, five single control volumes (in a total control volume of 24.40 x 2.44 x 2.44 m^3) were used to test the effect of the position of the single control volume on the accuracy of the computed three-dimensional space coordinates. Linear and quadratic calibration equations were used to test the effect of the order of the equation on the accuracy of the computed three-dimensional space coordinates. For four of the five single control volumes tested, the mean resultant errors associated with the use of the linear calibration equation were significantly larger than those associated with the use of the quadratic calibration equation. The position of the single control volume had no significant effect on the mean resultant errors in computed three-dimensional coordinates when the quadratic calibration equation was used. Under the same data collection conditions, the mean resultant errors in the computed three-dimensional coordinates associated with the panning and stationary DLT methods were 17 and 22 mm, respectively. The major advantages of the panning DLT method lie in the large image sizes obtained and in the ease with which the data can be collected. The method also has potential for use in a wide variety of contexts. The major shortcoming of the method is the large amount of digitizing necessary to calibrate the total control volume. Adaptations of the method to reduce the amount of digitizing required are being explored.
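
    For context, the standard 11-parameter DLT maps a space point (X, Y, Z) to image coordinates (u, v) through the rational-linear form sketched below; in the panning method each parameter L1..L11 is additionally re-estimated from the calibration equation as the camera orientation changes. The function is a generic textbook form, not the authors' code.

    ```python
    # Standard 11-parameter DLT projection of a 3-D point to image space.
    def dlt_project(L, X, Y, Z):
        """Project (X, Y, Z) through DLT parameters L[0..10]."""
        den = L[8] * X + L[9] * Y + L[10] * Z + 1.0
        u = (L[0] * X + L[1] * Y + L[2] * Z + L[3]) / den
        v = (L[4] * X + L[5] * Y + L[6] * Z + L[7]) / den
        return u, v
    ```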

  5. Comparison Between Linear and Non-parametric Regression Models for Genome-Enabled Prediction in Wheat

    PubMed Central

    Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne

    2012-01-01

    In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models. PMID:23275882

  6. Comparison between linear and non-parametric regression models for genome-enabled prediction in wheat.

    PubMed

    Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne

    2012-12-01

    In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models.

  7. Estimate the contribution of incubation parameters influence egg hatchability using multiple linear regression analysis

    PubMed Central

    Khalil, Mohamed H.; Shebl, Mostafa K.; Kosba, Mohamed A.; El-Sabrout, Karim; Zaki, Nesma

    2016-01-01

    Aim: This research was conducted to determine the parameters most affecting the hatchability of indigenous and improved local chickens’ eggs. Materials and Methods: Five parameters were studied (fertility, early and late embryonic mortalities, shape index, egg weight, and egg weight loss) on four strains, namely Fayoumi, Alexandria, Matrouh, and Montazah. Multiple linear regression was performed on the studied parameters to determine which had the greatest influence on hatchability. Results: The results showed significant differences in commercial and scientific hatchability among strains. The Alexandria strain had the highest significant commercial hatchability (80.70%). Highly significant differences in hatching chick weight among the studied strains were also observed. Using multiple linear regression analysis, fertility made the greatest percent contribution (71.31%) to hatchability, and the lowest percent contributions were made by shape index and egg weight loss. Conclusion: Predicting hatchability using multiple regression analysis could be a good tool for improving hatchability percentage in chickens. PMID:27651666
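
    One simple way to obtain such percent contributions (only one of several possible decompositions, and not necessarily the one used in the paper) is to square the standardized regression coefficients and normalize; everything below is synthetic.

      # Sketch: multiple linear regression with a crude percent-contribution measure.
      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(1)
      n = 200
      X = rng.normal(size=(n, 5))   # stand-ins for fertility, embryonic mortality, shape index, egg weight, weight loss
      y = 0.7 * X[:, 0] + 0.1 * X[:, 1] + rng.normal(scale=0.5, size=n)  # synthetic hatchability

      Xz = (X - X.mean(0)) / X.std(0)   # standardize so coefficients are comparable
      yz = (y - y.mean()) / y.std()
      fit = sm.OLS(yz, sm.add_constant(Xz)).fit()

      beta = fit.params[1:]
      contribution = 100 * beta**2 / np.sum(beta**2)   # percent contribution per predictor
      print(np.round(contribution, 1))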

  8. A numerical algorithm for optimal feedback gains in high dimensional linear quadratic regulator problems

    NASA Technical Reports Server (NTRS)

    Banks, H. T.; Ito, K.

    1991-01-01

    A hybrid method for computing the feedback gains in the linear quadratic regulator problem is proposed. The method, which combines the use of a Chandrasekhar-type system with an iteration of the Newton-Kleinman form with variable-acceleration-parameter Smith schemes, is formulated to compute the feedback gains directly and efficiently, rather than solutions of an associated Riccati equation. The hybrid method is particularly appropriate when used with large-dimensional systems such as those arising in approximating infinite-dimensional (distributed parameter) control systems (e.g., those governed by delay-differential and partial differential equations). Computational advantages of the proposed algorithm over the standard eigenvector (Potter, Laub-Schur) based techniques are discussed, and numerical evidence of the efficacy of these ideas is presented.
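
    For flavor, the sketch below runs a plain Newton-Kleinman iteration on a toy system (Python/SciPy); it is only one ingredient of the hybrid scheme and omits the Chandrasekhar system and Smith-scheme acceleration. Each step solves a Lyapunov equation instead of the full Riccati equation.

      # Newton-Kleinman iteration for the LQR gain K on a small stable system.
      import numpy as np
      from scipy.linalg import solve_continuous_lyapunov

      A = np.array([[0.0, 1.0], [-2.0, -3.0]])   # stable, so K0 = 0 is stabilizing
      B = np.array([[0.0], [1.0]])
      Q = np.eye(2)
      R = np.eye(1)

      K = np.zeros((1, 2))
      for _ in range(50):
          Ak = A - B @ K
          M = Q + K.T @ R @ K
          P = solve_continuous_lyapunov(Ak.T, -M)   # solves Ak^T P + P Ak = -M
          K_new = np.linalg.solve(R, B.T @ P)
          if np.linalg.norm(K_new - K) < 1e-10:
              break
          K = K_new

      print("feedback gain K:", K)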

  9. Numerical and experimental investigation of the effect of geometry on combustion characteristics of solid-fuel ramjet

    NASA Astrophysics Data System (ADS)

    Gong, Lunkun; Chen, Xiong; Musa, Omer; Yang, Haitao; Zhou, Changsheng

    2017-12-01

    Numerical and experimental investigations of the solid-fuel ramjet were carried out to study the effect of geometry on combustion characteristics. The two-dimensional axisymmetric program developed in the present study adopted finite-rate chemistry and second-order moment turbulence-chemistry models, together with the k-ω shear stress transport (SST) turbulence model. Experimental data were obtained by burning cylindrical polyethylene using a connected-pipe facility. The simulation results show that a fuel-rich zone near the solid fuel surface and an air-rich zone in the core exist in the chamber, and that the chemical reactions occur mainly at the interface of these two regions. The physical reason for the effect of geometry on regression rate is the variation of turbulent viscosity caused by the geometry change. Port-to-inlet diameter ratio is the main parameter influencing the turbulent viscosity, and a linear relationship between port-to-inlet diameter ratio and regression rate was obtained. The air mass flow rate and air-fuel ratio are the main factors influencing ramjet performance. Based on the simulation results, correlations between geometry and air-fuel ratio were obtained, and the effect of geometry on ramjet performance was analyzed according to these correlations. The three-dimensional regression rate contour obtained experimentally indicates that the regression rate, which shows an axisymmetric distribution due to the symmetric structure, increases sharply and then decreases slowly in the axial direction. The radiation heat transfer in the recirculation zone cannot be ignored. Compared with the experimental results, the deviations of the calculated average regression rate and characteristic velocity are about 5%. Concerning the effect of geometry on air-fuel ratio, the deviations between experimental and theoretical results are less than 10%.

  10. Three-dimensional visualization maps of suspended-sediment concentrations during placement of dredged material in 21st Avenue West Channel Embayment, Duluth-Superior Harbor, Duluth, Minnesota, 2015

    USGS Publications Warehouse

    Groten, Joel T.; Ellison, Christopher A.; Mahoney, Mollie H.

    2016-06-30

    Excess sediment in rivers and estuaries poses serious environmental and economic challenges. The U.S. Army Corps of Engineers (USACE) routinely dredges sediment in Federal navigation channels to maintain commercial shipping operations. The USACE initiated a 3-year pilot project in 2013 to use navigation channel dredged material to aid in restoration of shoreline habitat in the 21st Avenue West Channel Embayment of the Duluth-Superior Harbor. Placing dredged material in the 21st Avenue West Channel Embayment supports the restoration of shallow bay aquatic habitat aiding in the delisting of the St. Louis River Estuary Area of Concern. The U.S. Geological Survey, in cooperation with the USACE, collected turbidity and suspended-sediment concentrations (SSCs) in 2014 and 2015 to measure the horizontal and vertical distribution of SSCs during placement operations of dredged materials. These data were collected to help the USACE evaluate the use of several best management practices, including various dredge material placement techniques and a silt curtain, to mitigate the dispersion of suspended sediment. Three-dimensional visualization maps are a valuable tool for assessing the spatial displacement of SSCs. Data collection was designed to coincide with four dredged placement configurations that included periods with and without a silt curtain as well as before and after placement of dredged materials. Approximately 230 SSC samples and corresponding turbidity values collected in 2014 and 2015 were used to develop a simple linear regression model between SSC and turbidity. Using the simple linear regression model, SSCs were estimated for approximately 3,000 turbidity values at approximately 100 sampling sites in the 21st Avenue West Channel Embayment of the Duluth-Superior Harbor. The estimated SSCs served as input for development of 12 three-dimensional visualization maps.
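
    The rating-curve step is straightforward to sketch; the numbers below are invented and only illustrate the regression-then-predict workflow.

      # Sketch: simple linear regression of SSC on turbidity, applied to a
      # continuous turbidity record to estimate unsampled SSCs.
      import numpy as np
      from scipy.stats import linregress

      turbidity = np.array([5., 12., 20., 35., 60., 90.])   # NTU, paired samples (invented)
      ssc = np.array([8., 18., 31., 50., 95., 140.])        # mg/L, lab-analyzed (invented)

      fit = linregress(turbidity, ssc)
      print(f"SSC = {fit.intercept:.1f} + {fit.slope:.2f} * turbidity (r^2 = {fit.rvalue**2:.3f})")

      # Estimate SSC for the unsampled turbidity readings
      turbidity_series = np.linspace(4, 100, 3000)
      ssc_est = fit.intercept + fit.slope * turbidity_series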

  11. A consensus embedding approach for segmentation of high resolution in vivo prostate magnetic resonance imagery

    NASA Astrophysics Data System (ADS)

    Viswanath, Satish; Rosen, Mark; Madabhushi, Anant

    2008-03-01

    Current techniques for localization of prostatic adenocarcinoma (CaP) via blinded trans-rectal ultrasound biopsy are associated with a high false negative detection rate. While high resolution endorectal in vivo Magnetic Resonance (MR) prostate imaging has been shown to have improved contrast and resolution for CaP detection over ultrasound, similarity in intensity characteristics between benign and cancerous regions on MR images contributes to a high false positive detection rate. In this paper, we present a novel unsupervised segmentation method that employs manifold learning via consensus schemes for detection of cancerous regions from high resolution 1.5 Tesla (T) endorectal in vivo prostate MRI. A significant contribution of this paper is a method to combine multiple weak, lower-dimensional representations of high dimensional feature data in a way analogous to classifier ensemble schemes, and hence create a stable and accurate reduced dimensional representation. After correcting for MR image intensity artifacts, such as bias field inhomogeneity and intensity non-standardness, our algorithm extracts over 350 3D texture features at every spatial location in the MR scene at multiple scales and orientations. Non-linear dimensionality reduction schemes such as Locally Linear Embedding (LLE) and Graph Embedding (GE) are employed to create multiple low dimensional data representations of this high dimensional texture feature space. Our novel consensus embedding method is used to average object adjacencies from within the multiple low dimensional projections so that class relationships are preserved. Unsupervised consensus clustering is then used to partition the objects in this consensus embedding space into distinct classes. Quantitative evaluation on 18 1.5 T prostate MR datasets against corresponding histology obtained from the multi-site ACRIN trials shows a sensitivity of 92.65% and a specificity of 82.06%, which suggests that our method is able to successfully detect suspicious regions in the prostate.
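
    A rough sketch of the consensus idea follows (assumptions throughout: sizes, neighborhood values, and the use of k-means on the averaged distances are all stand-ins, not the authors' pipeline): build several low-dimensional embeddings, average the pairwise object adjacencies across them, then cluster in the consensus space.

      # Sketch: consensus of multiple locally linear embeddings, then clustering.
      import numpy as np
      from sklearn.manifold import LocallyLinearEmbedding
      from sklearn.metrics import pairwise_distances
      from sklearn.cluster import KMeans

      rng = np.random.default_rng(2)
      X = rng.normal(size=(200, 350))   # 200 voxels x 350 texture features (placeholder)

      # Multiple weak embeddings, varying the neighborhood size
      D_consensus = np.zeros((200, 200))
      for k in (10, 15, 20, 25):
          emb = LocallyLinearEmbedding(n_neighbors=k, n_components=3).fit_transform(X)
          D_consensus += pairwise_distances(emb)
      D_consensus /= 4   # averaged pairwise-distance (adjacency) matrix

      # Partition objects using the consensus distances (k-means on the distance
      # rows is a simple stand-in for embedding-then-clustering).
      labels = KMeans(n_clusters=2, n_init=10).fit_predict(D_consensus)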

  12. Complex Environmental Data Modelling Using Adaptive General Regression Neural Networks

    NASA Astrophysics Data System (ADS)

    Kanevski, Mikhail

    2015-04-01

    The research deals with an adaptation and application of Adaptive General Regression Neural Networks (GRNN) to high-dimensional environmental data. GRNN [1,2,3] are efficient modelling tools for both spatial and temporal data and are based on nonparametric kernel methods closely related to the classical Nadaraya-Watson estimator. Adaptive GRNN, using anisotropic kernels, can also be applied to feature selection tasks when working with high-dimensional data [1,3]. In the present research, Adaptive GRNN are used to study geospatial data predictability and relevant feature selection using both simulated and real data case studies. The original raw data were either three-dimensional monthly precipitation data or monthly wind speeds embedded into a 13-dimensional space constructed from geographical coordinates and geo-features calculated from a digital elevation model. GRNN were applied in two different ways: 1) adaptive GRNN with the resulting list of features ordered according to their relevancy; and 2) adaptive GRNN applied to evaluate all N possible models [in the case of wind fields, N = 2^13 - 1 = 8191] and rank them according to the cross-validation error. In both cases, training was carried out using a leave-one-out procedure. An important result of the study is that the set of the most relevant features depends on the month (strong seasonal effect) and year. The predictabilities of precipitation and wind field patterns, estimated using the cross-validation and testing errors of raw and shuffled data, were studied in detail. The results of both approaches were qualitatively and quantitatively compared. In conclusion, Adaptive GRNN, with their ability to select features and efficiently model complex high-dimensional data, can be widely used in automatic/on-line mapping and as an integrated part of environmental decision support systems. 1. Kanevski M., Pozdnoukhov A., Timonin V. Machine Learning for Spatial Environmental Data: Theory, Applications and Software. EPFL Press (2009). 2. Kanevski M. Spatial Predictions of Soil Contamination Using General Regression Neural Networks. Systems Research and Information Systems, 8(4), 1999. 3. Robert S., Foresti L., Kanevski M. Spatial prediction of monthly wind speeds in complex terrain with adaptive general regression neural networks. International Journal of Climatology, 33, 1793-1804, 2013.
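
    The GRNN core is compact enough to sketch directly: a Nadaraya-Watson estimator with one bandwidth per input dimension, so that irrelevant features can effectively be switched off by large bandwidths. Data and bandwidths below are invented.

      # Sketch: GRNN-style (Nadaraya-Watson) regression with anisotropic Gaussian kernels.
      import numpy as np

      def grnn_predict(X_train, y_train, X_query, sigmas):
          """Kernel-weighted average of training targets; sigmas has one entry per feature."""
          preds = []
          for x in X_query:
              d2 = np.sum(((X_train - x) / sigmas) ** 2, axis=1)
              w = np.exp(-0.5 * d2)
              preds.append(np.sum(w * y_train) / (np.sum(w) + 1e-12))
          return np.array(preds)

      rng = np.random.default_rng(3)
      X = rng.uniform(-1, 1, size=(300, 13))                 # 13 geo-features (placeholder)
      y = np.sin(3 * X[:, 0]) + 0.1 * rng.normal(size=300)   # only feature 0 is relevant

      sigmas = np.full(13, 10.0)
      sigmas[0] = 0.2   # anisotropic: small bandwidth where the signal lives
      print(grnn_predict(X, y, X[:5], sigmas))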

  13. Application of 2D and 3D image technologies to characterise morphological attributes of grapevine clusters.

    PubMed

    Tello, Javier; Cubero, Sergio; Blasco, José; Tardaguila, Javier; Aleixos, Nuria; Ibáñez, Javier

    2016-10-01

    Grapevine cluster morphology influences the quality and commercial value of wine and table grapes. It is routinely evaluated by subjective and inaccurate methods that do not meet the requirements set by the food industry. Novel two-dimensional (2D) and three-dimensional (3D) machine vision technologies emerge as promising tools for its automatic and fast evaluation. The automatic evaluation of cluster length, width, and elongation was successfully achieved by the analysis of 2D images, with significant, strong correlations with the manual methods (r = 0.959, 0.861, and 0.852, respectively). The classification of clusters according to their shape can be achieved by evaluating their conicity in different sections of the cluster. The geometric reconstruction of the morphological volume of the cluster from 2D features worked better than the direct 3D laser scanning system, showing a high correlation (r = 0.956) with the manual approach (water displacement method). In addition, we constructed and validated a simple linear regression model for cluster compactness estimation. It showed a high predictive capacity for both the training and validation subsets of clusters (R2 = 84.5% and 71.1%, respectively). The methodologies proposed in this work provide continuous and accurate data for the fast and objective characterisation of cluster morphology. © 2016 Society of Chemical Industry.

  14. Fuzzy Regression Prediction and Application Based on Multi-Dimensional Factors of Freight Volume

    NASA Astrophysics Data System (ADS)

    Xiao, Mengting; Li, Cheng

    2018-01-01

    Based on the reality of the development of air cargo, the multi-dimensional fuzzy regression method is used to determine the influencing factors; the three most important are GDP, total fixed-asset investment, and regular flight route mileage. Combining systems viewpoints and analogy methods, fuzzy numbers and multiple regression are then used to predict civil aviation cargo volume. Comparison with the 13th Five-Year Plan for China’s Civil Aviation Development (2016-2020) shows that this method can effectively improve forecasting accuracy and reduce forecasting risk, indicating that the model is feasible and of high practical significance for predicting civil aviation freight volume.

  15. Spectral (Finite) Volume Method for Conservation Laws on Unstructured Grids II: Extension to Two Dimensional Scalar Equation

    NASA Technical Reports Server (NTRS)

    Wang, Z. J.; Liu, Yen; Kwak, Dochan (Technical Monitor)

    2002-01-01

    The framework for constructing a high-order, conservative Spectral (Finite) Volume (SV) method is presented for two-dimensional scalar hyperbolic conservation laws on unstructured triangular grids. Each triangular grid cell forms a spectral volume (SV), and the SV is further subdivided into polygonal control volumes (CVs) to support high-order data reconstruction. Cell-averaged solutions from these CVs are used to reconstruct a high-order polynomial approximation in the SV. Each CV is then updated independently with a Godunov-type finite volume method and a high-order Runge-Kutta time integration scheme. A universal reconstruction is obtained by partitioning all SVs in a geometrically similar manner. The convergence of the SV method is shown to depend on how an SV is partitioned. A criterion based on the Lebesgue constant has been developed and used successfully to determine the quality of various partitions. Symmetric, stable, and convergent linear, quadratic, and cubic SVs have been obtained, and many different types of partitions have been evaluated. The SV method is tested for both linear and non-linear model problems with and without discontinuities.

  16. Soil salt content estimation in the Yellow River Delta with satellite hyperspectral data

    USGS Publications Warehouse

    Weng, Yongling; Gong, Peng; Zhu, Zhi-Liang

    2008-01-01

    Soil salinization is one of the most common land degradation processes and is a severe environmental hazard. The primary objective of this study is to investigate the potential of predicting salt content in soils with hyperspectral data acquired with EO-1 Hyperion. Both partial least-squares regression (PLSR) and conventional multiple linear regression (MLR), such as stepwise regression (SWR), were tested as the prediction model. PLSR is commonly used to overcome the problem caused by high-dimensional and correlated predictors. Chemical analysis of 95 samples collected from the top layer of soils in the Yellow River delta area shows that salt content was high on average, and the dominant chemicals in the saline soil were NaCl and MgCl2. Multivariate models were established between soil contents and hyperspectral data. Our results indicate that the PLSR technique with laboratory spectral data has a strong prediction capacity. Spectral bands at 1487-1527, 1971-1991, 2032-2092, and 2163-2355 nm possessed large absolute values of regression coefficients, with the largest coefficient at 2203 nm. We obtained a root mean squared error (RMSE) for calibration (with 61 samples) of RMSEC = 0.753 (R2 = 0.893) and a root mean squared error for validation (with 30 samples) of RMSEV = 0.574. The prediction model was applied on a pixel-by-pixel basis to a Hyperion reflectance image to yield a quantitative surface distribution map of soil salt content. The result was validated successfully from 38 sampling points. We obtained an RMSE estimate of 1.037 (R2 = 0.784) for the soil salt content map derived by the PLSR model. The salinity map derived from the SWR model shows that the predicted value is higher than the true value. These results demonstrate that the PLSR method is a more suitable technique than stepwise regression for quantitative estimation of soil salt content in a large area. © 2008 CASI.
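
    A minimal PLSR calibration/validation sketch in the spirit of the study (entirely synthetic spectra; the band count and component number are arbitrary choices, not the paper's):

      # Sketch: PLSR calibration on lab spectra, validation on held-out samples.
      import numpy as np
      from sklearn.cross_decomposition import PLSRegression
      from sklearn.metrics import mean_squared_error

      rng = np.random.default_rng(4)
      spectra = rng.normal(size=(91, 150))   # 91 samples x 150 reflectance bands (synthetic)
      salt = spectra[:, 40] - 0.5 * spectra[:, 90] + 0.1 * rng.normal(size=91)

      X_cal, y_cal = spectra[:61], salt[:61]   # 61 calibration samples
      X_val, y_val = spectra[61:], salt[61:]   # 30 validation samples

      pls = PLSRegression(n_components=5).fit(X_cal, y_cal)
      rmsev = mean_squared_error(y_val, pls.predict(X_val).ravel()) ** 0.5
      print("RMSEV:", round(rmsev, 3))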

  17. Environmental factors and flow paths related to Escherichia coli concentrations at two beaches on Lake St. Clair, Michigan, 2002–2005

    USGS Publications Warehouse

    Holtschlag, David J.; Shively, Dawn; Whitman, Richard L.; Haack, Sheridan K.; Fogarty, Lisa R.

    2008-01-01

    Regression analyses and hydrodynamic modeling were used to identify environmental factors and flow paths associated with Escherichia coli (E. coli) concentrations at Memorial and Metropolitan Beaches on Lake St. Clair in Macomb County, Mich. Lake St. Clair is part of the binational waterway between the United States and Canada that connects Lake Huron with Lake Erie in the Great Lakes Basin. Linear regression, regression-tree, and logistic regression models were developed from E. coli concentration and ancillary environmental data. Linear regression models on log10 E. coli concentrations indicated that rainfall prior to sampling, water temperature, and turbidity were positively associated with bacteria concentrations at both beaches. Flow from Clinton River, changes in water levels, wind conditions, and log10 E. coli concentrations 2 days before or after the target bacteria concentrations were statistically significant at one or both beaches. In addition, various interaction terms were significant at Memorial Beach. Linear regression models for both beaches explained only about 30 percent of the variability in log10 E. coli concentrations. Regression-tree models were developed from data from both Memorial and Metropolitan Beaches but were found to have limited predictive capability in this study. The results indicate that too few observations were available to develop reliable regression-tree models. Linear logistic models were developed to estimate the probability of E. coli concentrations exceeding 300 most probable number (MPN) per 100 milliliters (mL). Rainfall amounts before bacteria sampling were positively associated with exceedance probabilities at both beaches. Flow of Clinton River, turbidity, and log10 E. coli concentrations measured before or after the target E. coli measurements were related to exceedances at one or both beaches. The linear logistic models were effective in estimating bacteria exceedances at both beaches. A receiver operating characteristic (ROC) analysis was used to determine cut points for maximizing the true positive rate prediction while minimizing the false positive rate. A two-dimensional hydrodynamic model was developed to simulate horizontal current patterns on Lake St. Clair in response to wind, flow, and water-level conditions at model boundaries. Simulated velocity fields were used to track hypothetical massless particles backward in time from the beaches along flow paths toward source areas. Reverse particle tracking for idealized steady-state conditions shows changes in expected flow paths and traveltimes with wind speeds and directions from 24 sectors. The results indicate that three to four sets of contiguous wind sectors have similar effects on flow paths in the vicinity of the beaches. In addition, reverse particle tracking was used for transient conditions to identify expected flow paths for 10 E. coli sampling events in 2004. These results demonstrate the ability to track hypothetical particles from the beaches, backward in time, to likely source areas. This ability, coupled with a greater frequency of bacteria sampling, may provide insight into changes in bacteria concentrations between source and sink areas.
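
    The exceedance model lends itself to a short sketch: logistic regression for P(E. coli > 300 MPN/100 mL) from environmental predictors, with an ROC-derived cut point. All values below are invented.

      # Sketch: logistic exceedance model with a Youden-index cut point.
      import numpy as np
      from sklearn.linear_model import LogisticRegression
      from sklearn.metrics import roc_curve

      rng = np.random.default_rng(5)
      rain = rng.gamma(2.0, 5.0, size=400)   # antecedent rainfall, mm (invented)
      flow = rng.gamma(3.0, 10.0, size=400)  # river flow, arbitrary units (invented)
      X = np.column_stack([rain, flow])
      exceed = (0.1 * rain + 0.03 * flow + rng.normal(size=400) > 3.0).astype(int)

      model = LogisticRegression().fit(X, exceed)
      prob = model.predict_proba(X)[:, 1]

      fpr, tpr, thresholds = roc_curve(exceed, prob)
      cut = thresholds[np.argmax(tpr - fpr)]   # maximize TPR - FPR (Youden's J)
      print("probability cut point:", round(cut, 3))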

  18. High resolution magnetic resonance imaging of the calcaneus: age-related changes in trabecular structure and comparison with dual X-ray absorptiometry measurements

    NASA Technical Reports Server (NTRS)

    Ouyang, X.; Selby, K.; Lang, P.; Engelke, K.; Klifa, C.; Fan, B.; Zucconi, F.; Hottya, G.; Chen, M.; Majumdar, S.

    1997-01-01

    A high-resolution magnetic resonance imaging (MRI) protocol, together with specialized image processing techniques, was applied to the quantitative measurement of age-related changes in calcaneal trabecular structure. The reproducibility of the technique was assessed and the annual rates of change for several trabecular structure parameters were measured. The MR-derived trabecular parameters were compared with calcaneal bone mineral density (BMD), measured by dual X-ray absorptiometry (DXA) in the same subjects. Sagittal MR images were acquired at 1.5 T in 23 healthy women (mean age: 49.3 +/- 16.6 [SD]), using a three-dimensional gradient echo sequence. Image analysis procedures included internal gray-scale calibration, bone and marrow segmentation, and run-length methods. Three trabecular structure parameters, apparent bone volume (ABV/TV), intercept thickness (I.Th), and intercept separation (I.Sp) were calculated from the MR images. The short- and long-term precision errors (mean %CV) of these measured parameters were in the ranges 1-2% and 3-6%, respectively. Linear regression of the trabecular structure parameters vs. age showed significant correlation: ABV/TV (r2 = 33.7%, P < 0.0037), I.Th (r2 = 26.6%, P < 0.0118), I.Sp (r2 = 28.9%, P < 0.0081). These trends with age were also expressed as annual rates of change: ABV/TV (-0.52%/year), I.Th (-0.33%/year), and I.Sp (0.59%/year). Linear regression analysis also showed significant correlation between the MR-derived trabecular structure parameters and calcaneal BMD values. Although a larger group of subjects is needed to better define the age-related changes in trabecular structure parameters and their relation to BMD, these preliminary results demonstrate that high-resolution MRI may potentially be useful for the quantitative assessment of trabecular structure.

  19. Factors influencing superimposition error of 3D cephalometric landmarks by plane orientation method using 4 reference points: 4 point superimposition error regression model.

    PubMed

    Hwang, Jae Joon; Kim, Kee-Deog; Park, Hyok; Park, Chang Seo; Jeong, Ho-Gul

    2014-01-01

    Superimposition has been used as a method to evaluate the changes of orthodontic or orthopedic treatment in the dental field. With the introduction of cone-beam CT (CBCT), evaluating three-dimensional changes after treatment by superimposition became possible. Four-point plane orientation is one of the simplest ways to achieve superimposition of three-dimensional images. To find factors influencing the superimposition error of cephalometric landmarks by the 4-point plane orientation method, and to evaluate the reproducibility of cephalometric landmarks for analyzing superimposition error, 20 patients were analyzed who had normal skeletal and occlusal relationships and underwent CBCT for diagnosis of temporomandibular disorder. The nasion, sella turcica, basion, and the midpoint between the left and right most posterior points of the lesser wing of the sphenoid bone were used to define a three-dimensional (3D) anatomical reference coordinate system. Another 15 reference cephalometric points were also determined three times in the same image. The reorientation error of each landmark could be explained substantially (23%) by a linear regression model consisting of three factors describing the position of each landmark relative to the reference axes and the locating error. The 4-point plane orientation system may produce a reorientation error that varies with the perpendicular distance between the landmark and the x-axis; the reorientation error also increases as the locating error and the shift of the reference axes viewed from each landmark increase. Therefore, to reduce the reorientation error, the accuracy of all landmarks, including the reference points, is important. Construction of the regression model using reference points of greater precision is required for the clinical application of this model.

  20. Analysis on the multi-dimensional spectrum of the thrust force for the linear motor feed drive system in machine tools

    NASA Astrophysics Data System (ADS)

    Yang, Xiaojun; Lu, Dun; Ma, Chengfang; Zhang, Jun; Zhao, Wanhua

    2017-01-01

    The motor thrust force has many harmonic components due to the nonlinearity of the drive circuit and of the motor itself in the linear motor feed drive system. Moreover, during motion these thrust force harmonics may vary with position, velocity, acceleration, and load, which affects the displacement fluctuation of the feed drive system. Therefore, in this paper, on the basis of the thrust force spectrum obtained from the Maxwell equations and the electromagnetic energy method, the multi-dimensional variation of each thrust harmonic is analyzed under different motion parameters. Then a model of the servo system is established, oriented to dynamic precision. The influence of the variation of the thrust force spectrum on the displacement fluctuation is discussed. Finally, experiments are carried out to verify the theoretical analysis. It is found that the thrust harmonics show multi-dimensional spectrum characteristics under different motion parameters and loads, which should be considered when choosing the motion parameters and optimizing the servo control parameters in high-speed, high-precision machine tools equipped with linear motor feed drive systems.

  1. Linearity and sex-specificity of impact force prediction during a fall onto the outstretched hand using a single-damper-model.

    PubMed

    Kawalilak, C E; Lanovaz, J L; Johnston, J D; Kontulainen, S A

    2014-09-01

    The aim was to assess the linearity and sex-specificity of damping coefficients used in a single-damper-model (SDM) when predicting impact forces during the worst-case falling scenario from fall heights up to 25 cm. Using 3-dimensional motion tracking and an integrated force plate, impact forces and impact velocities were assessed in 10 young adults (5 males; 5 females) falling from planted knees onto outstretched arms, from a random order of drop heights: 3, 5, 7, 10, 15, 20, and 25 cm. We assessed the linearity and sex-specificity of the relationship between impact forces and impact velocities across all fall heights using an analysis-of-variance linearity test and linear regression, respectively. Significance was accepted at P<0.05. The association between impact forces and impact velocities up to 25 cm was linear (P=0.02). Damping coefficients appeared sex-specific (males: 627 Ns/m, R2=0.70; females: 421 Ns/m, R2=0.81; sexes combined: 532 Ns/m, R2=0.61). A linear damping coefficient used in the SDM proved valid for predicting impact forces from fall heights up to 25 cm. Results suggested the use of sex-specific damping coefficients when estimating impact force using the SDM and calculating the factor-of-risk for wrist fractures.
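
    If the SDM is taken to predict peak impact force as proportional to impact velocity (F = c*v, which is an assumption here, not a statement of the authors' exact model), the damping coefficient is just the slope of a through-origin regression, fit separately per sex. Numbers below are invented.

      # Sketch: estimating a damping coefficient by regression through the origin.
      import numpy as np

      v = np.array([0.6, 0.8, 1.0, 1.2, 1.5, 1.8, 2.0])   # impact velocities, m/s (invented)
      F = 627.0 * v + np.random.default_rng(12).normal(0, 40, v.size)   # peak forces, N

      c = float(v @ F / (v @ v))   # least-squares slope with no intercept
      r2 = 1 - np.sum((F - c * v) ** 2) / np.sum((F - F.mean()) ** 2)
      print(f"damping coefficient: {c:.0f} Ns/m (R2 = {r2:.2f})")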

  2. Linear response approach to active Brownian particles in time-varying activity fields

    NASA Astrophysics Data System (ADS)

    Merlitz, Holger; Vuijk, Hidde D.; Brader, Joseph; Sharma, Abhinav; Sommer, Jens-Uwe

    2018-05-01

    In a theoretical and simulation study, active Brownian particles (ABPs) in three-dimensional bulk systems are exposed to time-varying sinusoidal activity waves that are running through the system. A linear response (Green-Kubo) formalism is applied to derive fully analytical expressions for the torque-free polarization profiles of non-interacting particles. The activity waves induce fluxes that strongly depend on the particle size and may be employed to de-mix mixtures of ABPs or to drive the particles into selected areas of the system. Three-dimensional Langevin dynamics simulations are carried out to verify the accuracy of the linear response formalism, which is shown to work best when the particles are small (i.e., highly Brownian) or operating at low activity levels.

  3. Resting state fMRI reveals a default mode dissociation between retrosplenial and medial prefrontal subnetworks in ASD despite motion scrubbing.

    PubMed

    Starck, Tuomo; Nikkinen, Juha; Rahko, Jukka; Remes, Jukka; Hurtig, Tuula; Haapsamo, Helena; Jussila, Katja; Kuusikko-Gauffin, Sanna; Mattila, Marja-Leena; Jansson-Verkasalo, Eira; Pauls, David L; Ebeling, Hanna; Moilanen, Irma; Tervonen, Osmo; Kiviniemi, Vesa J

    2013-01-01

    In resting state functional magnetic resonance imaging (fMRI) studies of autism spectrum disorders (ASDs) decreased frontal-posterior functional connectivity is a persistent finding. However, the picture of the default mode network (DMN) hypoconnectivity remains incomplete. In addition, the functional connectivity analyses have been shown to be susceptible even to subtle motion. DMN hypoconnectivity in ASD has been specifically called for re-evaluation with stringent motion correction, which we aimed to conduct by so-called scrubbing. A rich set of default mode subnetworks can be obtained with high dimensional group independent component analysis (ICA) which can potentially provide more detailed view of the connectivity alterations. We compared the DMN connectivity in high-functioning adolescents with ASDs to typically developing controls using ICA dual-regression with decompositions from typical to high dimensionality. Dual-regression analysis within DMN subnetworks did not reveal alterations but connectivity between anterior and posterior DMN subnetworks was decreased in ASD. The results were very similar with and without motion scrubbing thus indicating the efficacy of the conventional motion correction methods combined with ICA dual-regression. Specific dissociation between DMN subnetworks was revealed on high ICA dimensionality, where networks centered at the medial prefrontal cortex and retrosplenial cortex showed weakened coupling in adolescents with ASDs compared to typically developing control participants. Generally the results speak for disruption in the anterior-posterior DMN interplay on the network level whereas local functional connectivity in DMN seems relatively unaltered.

  4. Network selection, Information filtering and Scalable computation

    NASA Astrophysics Data System (ADS)

    Ye, Changqing

    This dissertation explores two application scenarios of sparsity pursuit methods on large-scale data sets. The first scenario is classification and regression in analyzing high-dimensional structured data, where predictors correspond to nodes of a given directed graph. This arises in, for instance, identification of disease genes for Parkinson's disease from a network of candidate genes. In such a situation, the directed graph describes dependencies among the genes, where the directions of edges represent certain causal effects. Key to high-dimensional structured classification and regression is how to utilize dependencies among predictors as specified by the directions of the graph. In this dissertation, we develop a novel method that fully takes into account such dependencies formulated through certain nonlinear constraints. We apply the proposed method to two applications: feature selection in large-margin binary classification and in linear regression. We implement the proposed method through difference convex programming for the cost function and constraints. Finally, theoretical and numerical analyses suggest that the proposed method achieves the desired objectives. An application to disease gene identification is presented. The second application scenario is personalized information filtering, which extracts the information specifically relevant to a user, predicting his/her preference over a large number of items based on the opinions of users who think alike or on item content. This problem is cast into the framework of regression and classification, where we introduce novel partial latent models to integrate additional user-specific and content-specific predictors for higher predictive accuracy. In particular, we factorize a user-over-item preference matrix into a product of two matrices, each representing a user's preference and an item preference by users. Then we propose a likelihood method to seek the sparsest latent factorization from a class of over-complete factorizations, possibly with a high percentage of missing values. This promotes additional sparsity beyond rank reduction. Computationally, we design methods based on a "decomposition and combination" strategy to break large-scale optimization into many small subproblems that are solved in a recursive and parallel manner. On this basis, we implement the proposed methods through multi-platform shared-memory parallel programming and through Mahout, a library for scalable machine learning and data mining, for MapReduce computation. For example, our methods are scalable to a dataset consisting of three billion observations on a single machine with sufficient memory, with good timings. Both theoretical and numerical investigations show that the proposed methods exhibit significant improvement in accuracy over state-of-the-art scalable methods.

  5. Predicting birth weight with conditionally linear transformation models.

    PubMed

    Möst, Lisa; Schmid, Matthias; Faschingbauer, Florian; Hothorn, Torsten

    2016-12-01

    Low and high birth weight (BW) are important risk factors for neonatal morbidity and mortality. Gynecologists must therefore accurately predict BW before delivery. Most prediction formulas for BW are based on prenatal ultrasound measurements carried out within one week prior to birth. Although successfully used in clinical practice, these formulas focus on point predictions of BW but do not systematically quantify uncertainty of the predictions, i.e. they result in estimates of the conditional mean of BW but do not deliver prediction intervals. To overcome this problem, we introduce conditionally linear transformation models (CLTMs) to predict BW. Instead of focusing only on the conditional mean, CLTMs model the whole conditional distribution function of BW given prenatal ultrasound parameters. Consequently, the CLTM approach delivers both point predictions of BW and fetus-specific prediction intervals. Prediction intervals constitute an easy-to-interpret measure of prediction accuracy and allow identification of fetuses subject to high prediction uncertainty. Using a data set of 8712 deliveries at the Perinatal Centre at the University Clinic Erlangen (Germany), we analyzed variants of CLTMs and compared them to standard linear regression estimation techniques used in the past and to quantile regression approaches. The best-performing CLTM variant was competitive with quantile regression and linear regression approaches in terms of conditional coverage and average length of the prediction intervals. We propose that CLTMs be used because they are able to account for possible heteroscedasticity, kurtosis, and skewness of the distribution of BWs. © The Author(s) 2014.

  6. Gaussian processes with built-in dimensionality reduction: Applications to high-dimensional uncertainty propagation

    NASA Astrophysics Data System (ADS)

    Tripathy, Rohit; Bilionis, Ilias; Gonzalez, Marcial

    2016-09-01

    Uncertainty quantification (UQ) tasks, such as model calibration, uncertainty propagation, and optimization under uncertainty, typically require several thousand evaluations of the underlying computer codes. To cope with the cost of simulations, one replaces the real response surface with a cheap surrogate based, e.g., on polynomial chaos expansions, neural networks, support vector machines, or Gaussian processes (GP). However, the number of simulations required to learn a generic multivariate response grows exponentially as the input dimension increases. This curse of dimensionality can only be addressed if the response exhibits some special structure that can be discovered and exploited. A wide range of physical responses exhibit a special structure known as an active subspace (AS). An AS is a linear manifold of the stochastic space characterized by maximal response variation. The idea is that one should first identify this low-dimensional manifold, project the high-dimensional input onto it, and then link the projection to the output. If the dimensionality of the AS is low enough, then learning the link function is a much easier problem than the original problem of learning a high-dimensional function. The classic approach to discovering the AS requires gradient information, a fact that severely limits its applicability. Furthermore, and partly because of its reliance on gradients, it is not able to handle noisy observations. The latter is an essential trait if one wants to be able to propagate uncertainty through stochastic simulators, e.g., through molecular dynamics codes. In this work, we develop a probabilistic version of AS which is gradient-free and robust to observational noise. Our approach relies on a novel Gaussian process regression with built-in dimensionality reduction. In particular, the AS is represented as an orthogonal projection matrix that serves as yet another covariance function hyper-parameter to be estimated from the data. To train the model, we design a two-step maximum likelihood optimization procedure that ensures the orthogonality of the projection matrix by exploiting recent results on the Stiefel manifold, i.e., the manifold of matrices with orthogonal columns. The additional benefit of our probabilistic formulation is that it allows us to select the dimensionality of the AS via the Bayesian information criterion. We validate our approach by showing that it can discover the right AS in synthetic examples without gradient information using both noiseless and noisy observations. We demonstrate that our method is able to discover the same AS as the classical approach in a challenging one-hundred-dimensional problem involving an elliptic stochastic partial differential equation with random conductivity. Finally, we use our approach to study the effect of geometric and material uncertainties in the propagation of solitary waves in a one-dimensional granular system.

  7. Gaussian processes with built-in dimensionality reduction: Applications to high-dimensional uncertainty propagation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tripathy, Rohit, E-mail: rtripath@purdue.edu; Bilionis, Ilias, E-mail: ibilion@purdue.edu; Gonzalez, Marcial, E-mail: marcial-gonzalez@purdue.edu

    2016-09-15

    Uncertainty quantification (UQ) tasks, such as model calibration, uncertainty propagation, and optimization under uncertainty, typically require several thousand evaluations of the underlying computer codes. To cope with the cost of simulations, one replaces the real response surface with a cheap surrogate based, e.g., on polynomial chaos expansions, neural networks, support vector machines, or Gaussian processes (GP). However, the number of simulations required to learn a generic multivariate response grows exponentially as the input dimension increases. This curse of dimensionality can only be addressed if the response exhibits some special structure that can be discovered and exploited. A wide range of physical responses exhibit a special structure known as an active subspace (AS). An AS is a linear manifold of the stochastic space characterized by maximal response variation. The idea is that one should first identify this low-dimensional manifold, project the high-dimensional input onto it, and then link the projection to the output. If the dimensionality of the AS is low enough, then learning the link function is a much easier problem than the original problem of learning a high-dimensional function. The classic approach to discovering the AS requires gradient information, a fact that severely limits its applicability. Furthermore, and partly because of its reliance on gradients, it is not able to handle noisy observations. The latter is an essential trait if one wants to be able to propagate uncertainty through stochastic simulators, e.g., through molecular dynamics codes. In this work, we develop a probabilistic version of AS which is gradient-free and robust to observational noise. Our approach relies on a novel Gaussian process regression with built-in dimensionality reduction. In particular, the AS is represented as an orthogonal projection matrix that serves as yet another covariance function hyper-parameter to be estimated from the data. To train the model, we design a two-step maximum likelihood optimization procedure that ensures the orthogonality of the projection matrix by exploiting recent results on the Stiefel manifold, i.e., the manifold of matrices with orthogonal columns. The additional benefit of our probabilistic formulation is that it allows us to select the dimensionality of the AS via the Bayesian information criterion. We validate our approach by showing that it can discover the right AS in synthetic examples without gradient information using both noiseless and noisy observations. We demonstrate that our method is able to discover the same AS as the classical approach in a challenging one-hundred-dimensional problem involving an elliptic stochastic partial differential equation with random conductivity. Finally, we use our approach to study the effect of geometric and material uncertainties in the propagation of solitary waves in a one-dimensional granular system.
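
    A heavily simplified, gradient-free illustration of the active-subspace idea follows. Note the key difference from the paper: here the orthogonal projection W is fixed and known, whereas the method above estimates W as a covariance hyper-parameter on the Stiefel manifold.

      # Sketch: GP regression on a 1-D projection of 100-dimensional inputs.
      import numpy as np
      from scipy.linalg import qr
      from sklearn.gaussian_process import GaussianProcessRegressor
      from sklearn.gaussian_process.kernels import RBF, WhiteKernel

      rng = np.random.default_rng(6)
      D, d, n = 100, 1, 300                                 # input dim, AS dim, samples
      W, _ = qr(rng.normal(size=(D, d)), mode="economic")   # orthogonal projection matrix

      X = rng.normal(size=(n, D))
      y = np.sin(2 * (X @ W).ravel()) + 0.05 * rng.normal(size=n)   # varies only along W

      gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
      gp.fit(X @ W, y)   # the link function is learned on the low-dimensional projection
      print("train R2:", round(gp.score(X @ W, y), 3))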

  8. Sparse Regression as a Sparse Eigenvalue Problem

    NASA Technical Reports Server (NTRS)

    Moghaddam, Baback; Gruber, Amit; Weiss, Yair; Avidan, Shai

    2008-01-01

    We extend the l0-norm "subspectral" algorithms for sparse-LDA [5] and sparse-PCA [6] to general quadratic costs such as MSE in linear (kernel) regression. The resulting "Sparse Least Squares" (SLS) problem is also NP-hard, by way of its equivalence to a rank-1 sparse eigenvalue problem (e.g., binary sparse-LDA [7]). Specifically, for a general quadratic cost we use a highly-efficient technique for direct eigenvalue computation using partitioned matrix inverses which leads to dramatic 10^3-fold speed-ups over standard eigenvalue decomposition. This increased efficiency mitigates the O(n^4) scaling behaviour that up to now has limited the previous algorithms' utility for high-dimensional learning problems. Moreover, the new computation prioritizes the role of the less-myopic backward elimination stage which becomes more efficient than forward selection. Similarly, branch-and-bound search for Exact Sparse Least Squares (ESLS) also benefits from partitioned matrix inverse techniques. Our Greedy Sparse Least Squares (GSLS) generalizes Natarajan's algorithm [9] also known as Order-Recursive Matching Pursuit (ORMP). Specifically, the forward half of GSLS is exactly equivalent to ORMP but more efficient. By including the backward pass, which only doubles the computation, we can achieve lower MSE than ORMP. Experimental comparisons to the state-of-the-art LARS algorithm [3] show forward-GSLS is faster, more accurate and more flexible in terms of choice of regularization.
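
    For orientation, the forward half of this family of algorithms behaves like orthogonal matching pursuit; the sketch below uses scikit-learn's OMP as a stand-in (the paper's GSLS adds a backward pass and partitioned-inverse updates not reproduced here).

      # Sketch: greedy sparse least squares via orthogonal matching pursuit.
      import numpy as np
      from sklearn.linear_model import OrthogonalMatchingPursuit

      rng = np.random.default_rng(7)
      X = rng.normal(size=(200, 1000))   # high-dimensional design matrix
      beta = np.zeros(1000)
      beta[[3, 70, 512]] = [2.0, -1.5, 1.0]
      y = X @ beta + 0.1 * rng.normal(size=200)

      omp = OrthogonalMatchingPursuit(n_nonzero_coefs=3).fit(X, y)
      print("selected features:", np.flatnonzero(omp.coef_))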

  9. Predicting High Explosive Detonation Velocities from Their Composition and Structure

    DTIC Science & Technology

    1978-09-01

    for a gamut of ideal explosives. The explosives ranged from nitroaromatics, cyclic and linear nitramines, nitrate esters and nitro-nitrato... structure is postulated for a gamut of explosives. Since detonation velocity, DQ, is density dependent, the linear regression plot (Figure 1) of the

  10. Pseudo-second order models for the adsorption of safranin onto activated carbon: comparison of linear and non-linear regression methods.

    PubMed

    Kumar, K Vasanth

    2007-04-02

    Kinetic experiments were carried out for the sorption of safranin onto activated carbon particles. The kinetic data were fitted to the pseudo-second order models of Ho, Sobkowsk and Czerwinski, Blanchard et al., and Ritchie by linear and non-linear regression methods. The non-linear method was found to be a better way of obtaining the parameters involved in the second order rate kinetic expressions. Both linear and non-linear regression showed that the Sobkowsk and Czerwinski and Ritchie pseudo-second order models were the same. Non-linear regression analysis showed that Blanchard et al. and Ho had similar ideas on the pseudo-second order model but with different assumptions. The best fit of the experimental data in Ho's pseudo-second order expression by linear and non-linear regression methods showed that Ho's pseudo-second order model was a better kinetic expression when compared to the other pseudo-second order kinetic expressions.
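
    The linear-versus-non-linear comparison is easy to reproduce in miniature for Ho's model, q(t) = qe^2*k*t / (1 + qe*k*t), whose linearized form is t/q = 1/(k*qe^2) + t/qe. The data below are synthetic.

      # Sketch: linear vs. non-linear fitting of the pseudo-second-order model.
      import numpy as np
      from scipy.optimize import curve_fit
      from scipy.stats import linregress

      def pso(t, qe, k):
          return qe**2 * k * t / (1 + qe * k * t)

      t = np.array([5., 10., 20., 40., 60., 90., 120.])   # minutes
      q = pso(t, qe=25.0, k=0.004) + np.random.default_rng(8).normal(0, 0.4, t.size)

      # Non-linear regression directly on q(t)
      (qe_nl, k_nl), _ = curve_fit(pso, t, q, p0=[20.0, 0.001])

      # Linearized form: slope = 1/qe, intercept = 1/(k*qe^2)
      lin = linregress(t, t / q)
      qe_lin = 1 / lin.slope
      k_lin = lin.slope**2 / lin.intercept

      print(f"non-linear: qe={qe_nl:.2f}, k={k_nl:.4f}; linear: qe={qe_lin:.2f}, k={k_lin:.4f}")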

  11. Estimation of Complex Generalized Linear Mixed Models for Measurement and Growth

    ERIC Educational Resources Information Center

    Jeon, Minjeong

    2012-01-01

    Maximum likelihood (ML) estimation of generalized linear mixed models (GLMMs) is technically challenging because of the intractable likelihoods that involve high dimensional integrations over random effects. The problem is magnified when the random effects have a crossed design and thus the data cannot be reduced to small independent clusters. A…

  12. Single Image Super-Resolution Using Global Regression Based on Multiple Local Linear Mappings.

    PubMed

    Choi, Jae-Seok; Kim, Munchurl

    2017-03-01

    Super-resolution (SR) has become more vital because of its capability to generate high-quality ultra-high definition (UHD) high-resolution (HR) images from low-resolution (LR) input images. Conventional SR methods entail high computational complexity, which makes them difficult to implement for up-scaling full-high-definition input images into UHD-resolution images. Nevertheless, our previous super-interpolation (SI) method showed a good compromise between Peak-Signal-to-Noise Ratio (PSNR) performance and computational complexity. However, since SI only utilizes simple linear mappings, it may fail to precisely reconstruct HR patches with complex texture. In this paper, we present a novel SR method, which inherits the large-to-small patch conversion scheme from SI but uses global regression based on local linear mappings (GLM). Thus, our new SR method is called GLM-SI. In GLM-SI, each LR input patch is divided into 25 overlapped subpatches. Next, based on the local properties of these subpatches, 25 different local linear mappings are applied to the current LR input patch to generate 25 HR patch candidates, which are then regressed into one final HR patch using a global regressor. The local linear mappings are learned cluster-wise in our off-line training phase. The main contribution of this paper is as follows: previously, linear-mapping-based conventional SR methods, including SI, used only one simple yet coarse linear mapping per patch to reconstruct its HR version. In contrast, for each LR input patch, our GLM-SI is the first to apply a combination of multiple local linear mappings, where each local linear mapping is found according to the local properties of the current LR patch. Therefore, it can better approximate nonlinear LR-to-HR mappings for HR patches with complex texture. Experimental results show that the proposed GLM-SI method outperforms most of the state-of-the-art methods, and shows comparable PSNR performance with much lower computational complexity when compared with a super-resolution method based on convolutional neural nets (SRCNN15). Compared with the previous SI method, which is limited to a scale factor of 2, GLM-SI shows superior performance, with PSNR on average 0.79 dB higher, and can be used for scale factors of 3 or higher.

  13. Multivariate time series analysis of neuroscience data: some challenges and opportunities.

    PubMed

    Pourahmadi, Mohsen; Noorbaloochi, Siamak

    2016-04-01

    Neuroimaging data may be viewed as high-dimensional multivariate time series, and analyzed using techniques from regression analysis, time series analysis and spatiotemporal analysis. We discuss issues related to data quality, model specification, estimation, interpretation, dimensionality and causality. Some recent research areas addressing aspects of some recurring challenges are introduced. Copyright © 2015 Elsevier Ltd. All rights reserved.

  14. Liquid electrolyte informatics using an exhaustive search with linear regression.

    PubMed

    Sodeyama, Keitaro; Igarashi, Yasuhiko; Nakayama, Tomofumi; Tateyama, Yoshitaka; Okada, Masato

    2018-06-14

    Exploring new liquid electrolyte materials is a fundamental target for developing new high-performance lithium-ion batteries. In contrast to solid materials, the properties of disordered liquid solutions have been less studied by data-driven informatics techniques. Here, we examined the estimation accuracy and efficiency of three informatics techniques, multiple linear regression (MLR), least absolute shrinkage and selection operator (LASSO), and exhaustive search with linear regression (ES-LiR), using coordination energy and melting point as test liquid properties. We confirmed that ES-LiR gives the most accurate estimation among the techniques. We also found that ES-LiR can provide the relationship between the "prediction accuracy" and the "calculation cost" of the properties via a weight diagram of descriptors. This technique makes it possible to choose the balance of accuracy and cost when searching over a huge number of new materials.
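
    ES-LiR can be illustrated in miniature: fit ordinary least squares on every non-empty descriptor subset and rank the models by cross-validated error (descriptor count and data are invented; the real search space is far larger).

      # Sketch: exhaustive search with linear regression over descriptor subsets.
      from itertools import combinations
      import numpy as np
      from sklearn.linear_model import LinearRegression
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(9)
      X = rng.normal(size=(60, 6))   # 6 candidate descriptors -> 2^6 - 1 = 63 models
      y = 1.5 * X[:, 0] - 2.0 * X[:, 3] + 0.2 * rng.normal(size=60)

      results = []
      for r in range(1, 7):
          for subset in combinations(range(6), r):
              cv = cross_val_score(LinearRegression(), X[:, list(subset)], y,
                                   cv=5, scoring="neg_mean_squared_error")
              results.append((-cv.mean(), subset))

      results.sort()
      print("best subsets:", results[:3])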

  15. The Grassmannian Atlas: A General Framework for Exploring Linear Projections of High-Dimensional Data

    DOE PAGES

    Liu, S.; Bremer, P. -T; Jayaraman, J. J.; ...

    2016-06-04

    Linear projections are one of the most common approaches to visualize high-dimensional data. Since the space of possible projections is large, existing systems usually select a small set of interesting projections by ranking a large set of candidate projections based on a chosen quality measure. However, while highly ranked projections can be informative, some lower ranked ones could offer important complementary information. Therefore, selection based on ranking may miss projections that are important to provide a global picture of the data. Here, the proposed work fills this gap by presenting the Grassmannian Atlas, a framework that captures the global structures of quality measures in the space of all projections, which enables a systematic exploration of many complementary projections and provides new insights into the properties of existing quality measures.

  16. An evaluation of the accuracy of some radar wind profiling techniques

    NASA Technical Reports Server (NTRS)

    Koscielny, A. J.; Doviak, R. J.

    1983-01-01

    Major advances in Doppler radar measurement in optically clear air have made it feasible to monitor radial velocities in the troposphere and lower stratosphere. For most applications the three-dimensional wind vector is monitored rather than the radial velocity. Measurement of the wind vector with a single radar can be made by assuming a spatially linear, time-invariant wind field. The components and derivatives of the wind are estimated by the parameters of a linear regression of the radial velocities on functions of their spatial locations. The accuracy of the wind measurement thus depends on the locations of the radial velocities. The suitability of some of the common retrieval techniques for simultaneous measurement of both the vertical and horizontal wind components is evaluated. The techniques considered are fixed beam, azimuthal scanning (VAD), and elevation scanning (VED).
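
    Reduced to the uniform-wind special case, the retrieval is a linear regression of radial velocities on the beam direction cosines; the sketch below simulates a VAD-style scan (all values invented).

      # Sketch: least-squares wind retrieval from simulated radial velocities.
      import numpy as np

      rng = np.random.default_rng(10)
      az = rng.uniform(0, 2 * np.pi, 100)   # azimuth angles of the scan
      el = np.full(100, np.deg2rad(30))     # fixed elevation angle

      u, v, w = 8.0, -3.0, 0.5              # "true" wind components
      vr = (u * np.sin(az) * np.cos(el) + v * np.cos(az) * np.cos(el)
            + w * np.sin(el) + 0.3 * rng.normal(size=100))

      G = np.column_stack([np.sin(az) * np.cos(el),
                           np.cos(az) * np.cos(el),
                           np.sin(el)])
      est, *_ = np.linalg.lstsq(G, vr, rcond=None)
      print("estimated (u, v, w):", np.round(est, 2))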

  17. Echocardiographic measurements of left ventricular mass by a non-geometric method

    NASA Technical Reports Server (NTRS)

    Parra, Beatriz; Buckey, Jay; Degraff, David; Gaffney, F. Andrew; Blomqvist, C. Gunnar

    1987-01-01

    The accuracy of a new nongeometric method for calculating left ventricular myocardial volumes from two-dimensional echocardiographic images was assessed in vitro using 20 formalin-fixed normal human hearts. Serial oblique short-axis images were acquired from one point at 5-deg intervals, for a total of 10-12 cross sections. Echocardiographic myocardial volumes were calculated as the difference between the volumes defined by the epi- and endocardial surfaces. Actual myocardial volumes were determined by water displacement. Volumes ranged from 80 to 174 ml (mean 130.8 ml). Linear regression analysis demonstrated excellent agreement between the echocardiographic and direct measurements.

  18. A Technique of Fuzzy C-Mean in Multiple Linear Regression Model toward Paddy Yield

    NASA Astrophysics Data System (ADS)

    Syazwan Wahab, Nur; Saifullah Rusiman, Mohd; Mohamad, Mahathir; Amira Azmi, Nur; Che Him, Norziha; Ghazali Kamardan, M.; Ali, Maselan

    2018-04-01

    In this paper, we propose a hybrid model combining a multiple linear regression model with the fuzzy c-means method. This research involved the relationship between 20 variates of the top soil, analyzed prior to planting, and paddy yields at standard fertilizer rates. Data used were from the multi-location trials for rice carried out by MARDI at major paddy granaries in Peninsular Malaysia during the period from 2009 to 2012. Missing observations were estimated using mean estimation techniques. The data were analyzed using a multiple linear regression model alone and in combination with the fuzzy c-means method. Analysis of normality and multicollinearity indicated that the data are normally scattered without multicollinearity among independent variables. Fuzzy c-means analysis clusters the paddy yields into two clusters before the multiple linear regression model is applied. The comparison between the two methods indicates that the hybrid of the multiple linear regression model and the fuzzy c-means method outperforms the multiple linear regression model alone, with a lower mean square error.
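
    A sketch of the hybrid follows: a few plain fuzzy c-means iterations on the response (c = 2, fuzzifier m = 2), then one linear regression per cluster using the membership degrees as observation weights. All data, sizes, and the exact weighting scheme are assumptions, not the paper's specification.

      # Sketch: fuzzy c-means clustering followed by per-cluster weighted MLR.
      import numpy as np
      from sklearn.linear_model import LinearRegression

      rng = np.random.default_rng(11)
      X = rng.normal(size=(150, 20))   # 20 top-soil variates (synthetic)
      y = np.concatenate([3 + X[:75, 0], 8 - X[75:, 1]]) + 0.2 * rng.normal(size=150)

      c, m = 2, 2.0
      U = rng.dirichlet(np.ones(c), size=150)   # random initial memberships
      for _ in range(100):
          centers = (U**m).T @ y / (U**m).sum(0)
          d = np.abs(y[:, None] - centers[None, :]) + 1e-12
          U = 1.0 / (d ** (2 / (m - 1)) * np.sum(d ** (-2 / (m - 1)), axis=1, keepdims=True))

      models = [LinearRegression().fit(X, y, sample_weight=U[:, k]) for k in range(c)]
      pred = sum(U[:, k] * models[k].predict(X) for k in range(c))
      print("in-sample MSE:", round(np.mean((y - pred) ** 2), 4))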

  19. A simple linear regression method for quantitative trait loci linkage analysis with censored observations.

    PubMed

    Anderson, Carl A; McRae, Allan F; Visscher, Peter M

    2006-07-01

    Standard quantitative trait loci (QTL) mapping techniques commonly assume that the trait is both fully observed and normally distributed. When considering survival or age-at-onset traits these assumptions are often incorrect. Methods have been developed to map QTL for survival traits; however, they are both computationally intensive and not available in standard genome analysis software packages. We propose a grouped linear regression method for the analysis of continuous survival data. Using simulation we compare this method to both the Cox and Weibull proportional hazards models and a standard linear regression method that ignores censoring. The grouped linear regression method is of equivalent power to both the Cox and Weibull proportional hazards methods and is significantly better than the standard linear regression method when censored observations are present. The method is also robust to the proportion of censored individuals and the underlying distribution of the trait. On the basis of linear regression methodology, the grouped linear regression model is computationally simple and fast and can be implemented readily in freely available statistical software.

  20. Gradient-Induced Voltages on 12-Lead ECGs during High Duty-Cycle MRI Sequences and a Method for Their Removal considering Linear and Concomitant Gradient Terms

    PubMed Central

    Zhang, Shelley HuaLei; Ho Tse, Zion Tsz; Dumoulin, Charles L.; Kwong, Raymond Y.; Stevenson, William G.; Watkins, Ronald; Ward, Jay; Wang, Wei; Schmidt, Ehud J.

    2015-01-01

    Purpose: To restore 12-lead ECG signal fidelity inside MRI by removing magnetic-field-gradient induced voltages during high gradient-duty-cycle sequences. Theory and Methods: A theoretical equation was derived, giving the first- and second-order electrical fields induced at individual ECG electrodes as a function of the gradient fields. Experiments were performed at 3T on healthy volunteers, using either a customized acquisition system that captured the full amplitude and frequency response of the ECGs or a commercial recording system. The 19 equation coefficients were derived by linear regression of data from accelerated sequences, and used to compute the induced voltages in real time during full-resolution sequences in order to remove ECG artifacts. Restored traces were evaluated relative to ones acquired without imaging. Results: Measured induced voltages were 0.7 V peak-to-peak during balanced steady-state free precession (bSSFP) with the heart at the isocenter. Applying the equation during gradient-echo, three-dimensional fast-spin-echo, and multi-slice bSSFP imaging restored nonsaturated traces; second-order concomitant terms showed larger contributions in electrodes farther from the magnet isocenter. Equation coefficients were evaluated with high repeatability (ρ = 0.996) and are subject-, sequence-, and slice-orientation-dependent. Conclusion: Close agreement between theoretical and measured gradient-induced voltages allowed their real-time removal. Prospective estimation of sequence periods where large induced voltages occur may allow hardware removal of these signals. PMID:26101951
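
    The correction itself is ordinary least squares: regress each electrode's recorded voltage on a basis of gradient-derived terms, then subtract the fitted induced component. The sketch below uses an illustrative 9-term basis of linear plus quadratic/cross gradient terms rather than the paper's full 19-coefficient equation, and assumes the gradient waveforms are sampled in sync with the ECG.

        import numpy as np

        def gradient_terms(gx, gy, gz):
            # Illustrative first-order plus concomitant-style second-order terms;
            # the paper's actual equation has 19 coefficients per electrode.
            return np.column_stack([gx, gy, gz,
                                    gx**2, gy**2, gz**2,
                                    gx*gy, gx*gz, gy*gz])

        def remove_induced_voltage(ecg, gx, gy, gz):
            """Fit induced voltages by linear regression and subtract them."""
            A = gradient_terms(gx, gy, gz)
            coef, *_ = np.linalg.lstsq(A, ecg, rcond=None)  # per-electrode coefficients
            return ecg - A @ coef, coef                     # restored trace

    In practice, as the abstract describes, the coefficients would be fitted on dedicated accelerated calibration sequences and then applied in real time during imaging.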

  1. High-flow oxygen therapy: pressure analysis in a pediatric airway model.

    PubMed

    Urbano, Javier; del Castillo, Jimena; López-Herce, Jesús; Gallardo, José A; Solana, María J; Carrillo, Ángel

    2012-05-01

    The mechanism of high-flow oxygen therapy and the pressures reached in the airway have not been defined. We hypothesized that the flow would generate a low continuous positive pressure, and that elevated flow rates in this model could produce moderate pressures. The objective of this study was to analyze the pressure generated by a high-flow oxygen therapy system in an experimental model of the pediatric airway. An experimental in vitro study was performed. A high-flow oxygen therapy system was connected to 3 types of interface (nasal cannulae, nasal mask, and oronasal mask) and applied to 2 types of pediatric manikin (infant and neonatal). The pressures generated in the circuit, in the airway, and in the pharynx were measured at different flow rates (5, 10, 15, and 20 L/min). The experiment was conducted with and without a leak (mouth sealed and unsealed). Linear regression analyses were performed for each set of measurements. The pressures generated with the different interfaces were very similar. The maximum pressure recorded was 4 cm H2O with a flow of 20 L/min via nasal cannulae or nasal mask. When the mouth of the manikin was held open, the pressures reached in the airway and pharynx were undetectable. Linear regression analyses showed a similar linear relationship between flow and the pressures measured in the pharynx (pressure = -0.375 + 0.138 × flow) and in the airway (pressure = -0.375 + 0.158 × flow) under the closed-mouth condition. Consistent with our hypothesis, high-flow oxygen therapy systems produced a low-level CPAP in an experimental pediatric model, even with the use of very high flow rates. This finding suggests that, at least in part, the effects may be due to other mechanisms.
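
    The reported relationships are simple one-variable fits; for instance, the pharynx equation can be recovered from flow/pressure pairs with a first-degree polynomial fit. The pressure values below are generated from the reported equation itself, purely to illustrate the procedure.

        import numpy as np

        flow = np.array([5.0, 10.0, 15.0, 20.0])      # L/min
        pressure = -0.375 + 0.138 * flow              # cm H2O, from the reported fit
        slope, intercept = np.polyfit(flow, pressure, 1)
        print(f"pressure = {intercept:.3f} + {slope:.3f} * flow")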

  2. Consensus embedding: theory, algorithms and application to segmentation and classification of biomedical data

    PubMed Central

    2012-01-01

    Background Dimensionality reduction (DR) enables the construction of a lower dimensional space (embedding) from a higher dimensional feature space while preserving object-class discriminability. However several popular DR approaches suffer from sensitivity to choice of parameters and/or presence of noise in the data. In this paper, we present a novel DR technique known as consensus embedding that aims to overcome these problems by generating and combining multiple low-dimensional embeddings, hence exploiting the variance among them in a manner similar to ensemble classifier schemes such as Bagging. We demonstrate theoretical properties of consensus embedding which show that it will result in a single stable embedding solution that preserves information more accurately as compared to any individual embedding (generated via DR schemes such as Principal Component Analysis, Graph Embedding, or Locally Linear Embedding). Intelligent sub-sampling (via mean-shift) and code parallelization are utilized to provide for an efficient implementation of the scheme. Results Applications of consensus embedding are shown in the context of classification and clustering as applied to: (1) image partitioning of white matter and gray matter on 10 different synthetic brain MRI images corrupted with 18 different combinations of noise and bias field inhomogeneity, (2) classification of 4 high-dimensional gene-expression datasets, (3) cancer detection (at a pixel-level) on 16 image slices obtained from 2 different high-resolution prostate MRI datasets. In over 200 different experiments concerning classification and segmentation of biomedical data, consensus embedding was found to consistently outperform both linear and non-linear DR methods within all applications considered. Conclusions We have presented a novel framework termed consensus embedding which leverages ensemble classification theory within dimensionality reduction, allowing for application to a wide range of high-dimensional biomedical data classification and segmentation problems. Our generalizable framework allows for improved representation and classification in the context of both imaging and non-imaging data. The algorithm offers a promising solution to problems that currently plague DR methods, and may allow for extension to other areas of biomedical data analysis. PMID:22316103

  3. Local-metrics error-based Shepard interpolation as surrogate for highly non-linear material models in high dimensions

    NASA Astrophysics Data System (ADS)

    Lorenzi, Juan M.; Stecher, Thomas; Reuter, Karsten; Matera, Sebastian

    2017-10-01

    Many problems in computational materials science and chemistry require the evaluation of expensive functions with locally rapid changes, such as the turn-over frequency of first principles kinetic Monte Carlo models for heterogeneous catalysis. Because of the high computational cost, it is often desirable to replace the original function with a surrogate model, e.g., for use in coupled multiscale simulations. The construction of surrogates becomes particularly challenging in high dimensions. Here, we present a novel version of the modified Shepard interpolation method which can overcome the curse of dimensionality for such functions to give faithful reconstructions even from very modest numbers of function evaluations. The introduction of local metrics allows us to take advantage of the fact that, on a local scale, rapid variation often occurs only across a small number of directions. Furthermore, we use local error estimates to weight different local approximations, which helps avoid artificial oscillations. Finally, we test our approach on a number of challenging analytic functions as well as a realistic kinetic Monte Carlo model. Our method not only outperforms existing isotropic metric Shepard methods but also state-of-the-art Gaussian process regression.
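
    A minimal sketch of the modified Shepard idea, assuming each sample point carries a first-order (Taylor-like) local model: the surrogate blends the local predictions with inverse-distance weights. The paper's actual contributions, per-point anisotropic metrics and error-based weighting, are deliberately omitted here; the metric is plain Euclidean.

        import numpy as np

        def shepard_predict(xq, X, F, G, p=4, eps=1e-12):
            """X: (n, d) sample points; F: (n,) values; G: (n, d) gradients.
            Blend local models f_i(x) = F[i] + G[i].(x - X[i]) with
            inverse-distance weights w_i ~ |x - x_i|^-p."""
            d = np.linalg.norm(X - xq, axis=1) + eps
            w = d ** (-p)
            w /= w.sum()
            local = F + np.einsum('ij,ij->i', G, xq - X)   # local Taylor models
            return w @ local

        # Toy check on f(x) = sin(x0) + x1**2 with analytic gradients.
        rng = np.random.default_rng(0)
        X = rng.uniform(-1, 1, size=(200, 2))
        F = np.sin(X[:, 0]) + X[:, 1] ** 2
        G = np.column_stack([np.cos(X[:, 0]), 2 * X[:, 1]])
        print(shepard_predict(np.array([0.2, -0.3]), X, F, G))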

  4. Local-metrics error-based Shepard interpolation as surrogate for highly non-linear material models in high dimensions.

    PubMed

    Lorenzi, Juan M; Stecher, Thomas; Reuter, Karsten; Matera, Sebastian

    2017-10-28

    Many problems in computational materials science and chemistry require the evaluation of expensive functions with locally rapid changes, such as the turn-over frequency of first principles kinetic Monte Carlo models for heterogeneous catalysis. Because of the high computational cost, it is often desirable to replace the original function with a surrogate model, e.g., for use in coupled multiscale simulations. The construction of surrogates becomes particularly challenging in high dimensions. Here, we present a novel version of the modified Shepard interpolation method which can overcome the curse of dimensionality for such functions to give faithful reconstructions even from very modest numbers of function evaluations. The introduction of local metrics allows us to take advantage of the fact that, on a local scale, rapid variation often occurs only across a small number of directions. Furthermore, we use local error estimates to weight different local approximations, which helps avoid artificial oscillations. Finally, we test our approach on a number of challenging analytic functions as well as a realistic kinetic Monte Carlo model. Our method not only outperforms existing isotropic metric Shepard methods but also state-of-the-art Gaussian process regression.

  5. Chaos and Robustness in a Single Family of Genetic Oscillatory Networks

    PubMed Central

    Fu, Daniel; Tan, Patrick; Kuznetsov, Alexey; Molkov, Yaroslav I.

    2014-01-01

    Genetic oscillatory networks can be mathematically modeled with delay differential equations (DDEs). Interpreting genetic networks with DDEs gives a more intuitive understanding from a biological standpoint. However, it presents a problem mathematically, for DDEs are by construction infinite-dimensional and thus cannot be analyzed using methods common for systems of ordinary differential equations (ODEs). In our study, we address this problem by developing a method for reducing infinite-dimensional DDEs to two- and three-dimensional systems of ODEs. We find that the three-dimensional reductions provide qualitative improvements over the two-dimensional reductions. We find that the reducibility of a DDE corresponds to its robustness. For non-robust DDEs that exhibit high-dimensional dynamics, we calculate analytic dimension lines to predict the dependence of the DDEs' correlation dimension on parameters. From these lines, we deduce that the correlation dimension of non-robust DDEs grows linearly with the delay. On the other hand, for robust DDEs, we find that the period of oscillation grows linearly with delay. We find that DDEs with exclusively negative feedback are robust, whereas DDEs with feedback that changes its sign are not robust. We find that non-saturable degradation damps oscillations and narrows the range of parameter values for which oscillations exist. Finally, we deduce that natural genetic oscillators with highly regular periods likely have solely negative feedback. PMID:24667178

  6. [Research on the method of interference correction for nondispersive infrared multi-component gas analysis].

    PubMed

    Sun, You-Wen; Liu, Wen-Qing; Wang, Shi-Mei; Huang, Shu-Hua; Yu, Xiao-Man

    2011-10-01

    A method of interference correction for nondispersive infrared (NDIR) multi-component gas analysis is described. Following successive-integral gas absorption models and methods, the influence of temperature and air pressure on the integral line strengths and line shapes was considered, and, based on Lorentz detuning line shapes, the absorption cross sections and response coefficients of H2O, CO2, CO, and NO on each filter channel were obtained. Four-dimensional linear regression equations for interference correction were established from the response coefficients; the cross-absorption interference was corrected by solving these equations, after which the pure absorbance signal on each filter channel was controlled only by the concentration of the corresponding target gas. When the sample cell was filled with a gas mixture of CO, NO, and CO2 in fixed concentration proportions, the pure absorbance after interference correction was used for concentration inversion; the inversion concentration error was 2.0% for CO2, 1.6% for CO, and 1.7% for NO. Both theory and experiment demonstrate that the proposed interference correction method for NDIR multi-component gas analysis is feasible.
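
    Operationally, the correction amounts to solving a small linear system: each filter channel's absorbance is a weighted sum of the component-gas contributions, so inverting the channel-by-gas response matrix recovers the per-gas signals. The response matrix and readings below are hypothetical values for illustration only.

        import numpy as np

        # Hypothetical 4x4 response matrix: K[i, j] = response of filter channel i
        # to gas j (order: H2O, CO2, CO, NO), obtained from calibration.
        K = np.array([[0.90, 0.05, 0.02, 0.01],
                      [0.06, 0.85, 0.04, 0.02],
                      [0.03, 0.07, 0.88, 0.05],
                      [0.02, 0.04, 0.06, 0.91]])

        measured = np.array([0.42, 0.33, 0.27, 0.19])  # raw absorbance per channel
        corrected = np.linalg.solve(K, measured)       # interference-corrected signals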

  7. Objective estimation of tropical cyclone innercore surface wind structure using infrared satellite images

    NASA Astrophysics Data System (ADS)

    Zhang, Changjiang; Dai, Lijie; Ma, Leiming; Qian, Jinfang; Yang, Bo

    2017-10-01

    An objective technique is presented for estimating tropical cyclone (TC) innercore two-dimensional (2-D) surface wind field structure using infrared satellite imagery and machine learning. For a TC with eye, the eye contour is first segmented by a geodesic active contour model, based on which the eye circumference is obtained as the TC eye size. A mathematical model is then established between the eye size and the radius of maximum wind obtained from the past official TC report to derive the 2-D surface wind field within the TC eye. Meanwhile, the composite information about the latitude of TC center, surface maximum wind speed, TC age, and critical wind radii of 34- and 50-kt winds can be combined to build another mathematical model for deriving the innercore wind structure. After that, least squares support vector machine (LSSVM), radial basis function neural network (RBFNN), and linear regression are introduced, respectively, in the two mathematical models, which are then tested with sensitivity experiments on real TC cases. Verification shows that the innercore 2-D surface wind field structure estimated by LSSVM is better than that of RBFNN and linear regression.

  8. Why High-Order Polynomials Should Not Be Used in Regression Discontinuity Designs. NBER Working Paper No. 20405

    ERIC Educational Resources Information Center

    Gelman, Andrew; Imbens, Guido

    2014-01-01

    It is common in regression discontinuity analysis to control for high order (third, fourth, or higher) polynomials of the forcing variable. We argue that estimators for causal effects based on such methods can be misleading, and we recommend that researchers not use them, instead using estimators based on local linear or quadratic polynomials or…
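
    The recommended alternative to a global high-order polynomial is a local linear fit on each side of the cutoff within a bandwidth. A minimal sketch with a uniform kernel (a triangular kernel and a data-driven bandwidth would be common refinements):

        import numpy as np

        def rdd_local_linear(x, y, cutoff=0.0, h=1.0):
            """Estimate the jump at the cutoff from separate local linear fits."""
            left = (x >= cutoff - h) & (x < cutoff)
            right = (x >= cutoff) & (x <= cutoff + h)
            b_left = np.polyfit(x[left] - cutoff, y[left], 1)
            b_right = np.polyfit(x[right] - cutoff, y[right], 1)
            return np.polyval(b_right, 0.0) - np.polyval(b_left, 0.0)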

  9. Application of QuickBird imagery in fuel load estimation in the Daxinganling region, China.

    Treesearch

    Sen Jin; Shyh-Chin Chen

    2012-01-01

    A high spatial resolution QuickBird satellite image and a low spatial but high spectral resolution Landsat Thematic Mapper image were used to linearly regress fuel loads of 70 plots of size 30×30 m over the Daxinganling region of northeast China. The results were compared with loads from field surveys and from regression estimations by surveyed stand characteristics...

  10. Linear regression crash prediction models : issues and proposed solutions.

    DOT National Transportation Integrated Search

    2010-05-01

    The paper develops a linear regression model approach that can be applied to crash data to predict vehicle crashes. The proposed approach involves novel data aggregation to satisfy linear regression assumptions; namely error structure normality ...

  11. Comparison between Linear and Nonlinear Regression in a Laboratory Heat Transfer Experiment

    ERIC Educational Resources Information Center

    Gonçalves, Carine Messias; Schwaab, Marcio; Pinto, José Carlos

    2013-01-01

    In order to interpret laboratory experimental data, undergraduate students are used to perform linear regression through linearized versions of nonlinear models. However, the use of linearized models can lead to statistically biased parameter estimates. Even so, it is not an easy task to introduce nonlinear regression and show for the students…

  12. Evaluation of Uncertainty and Sensitivity in Environmental Modeling at a Radioactive Waste Management Site

    NASA Astrophysics Data System (ADS)

    Stockton, T. B.; Black, P. K.; Catlett, K. M.; Tauxe, J. D.

    2002-05-01

    Environmental modeling is an essential component in the evaluation of regulatory compliance of radioactive waste management sites (RWMSs) at the Nevada Test Site in southern Nevada, USA. For those sites that are currently operating, further goals are to support integrated decision analysis for the development of acceptance criteria for future wastes, as well as site maintenance, closure, and monitoring. At these RWMSs, the principal pathways for release of contamination to the environment are upward towards the ground surface rather than downwards towards the deep water table. Biotic processes, such as burrow excavation and plant uptake and turnover, dominate this upward transport. A combined multi-pathway contaminant transport and risk assessment model was constructed using the GoldSim modeling platform. This platform facilitates probabilistic analysis of environmental systems, and is especially well suited for assessments involving radionuclide decay chains. The model employs probabilistic definitions of key parameters governing contaminant transport, with the goals of quantifying cumulative uncertainty in the estimation of performance measures and providing information necessary to perform sensitivity analyses. This modeling differs from previous radiological performance assessments (PAs) in that the modeling parameters are intended to be representative of the current knowledge, and the uncertainty in that knowledge, of parameter values rather than reflective of a conservative assessment approach. While a conservative PA may be sufficient to demonstrate regulatory compliance, a parametrically honest PA can also be used for more general site decision-making. In particular, a parametrically honest probabilistic modeling approach allows both uncertainty and sensitivity analyses to be explicitly coupled to the decision framework using a single set of model realizations. For example, sensitivity analysis provides a guide for analyzing the value of collecting more information by quantifying the relative importance of each input parameter in predicting the model response. However, in complex, high-dimensional ecosystem models such as the RWMS model, the system dynamics can act in a non-linear manner. Quantitatively assessing the importance of input variables becomes more difficult as the dimensionality, the non-linearities, and the non-monotonicities of the model increase. Methods from data mining such as Multivariate Adaptive Regression Splines (MARS) and the Fourier Amplitude Sensitivity Test (FAST) provide tools that can be used for global sensitivity analysis in these high-dimensional, non-linear situations. The enhanced interpretability of model output provided by the quantitative measures estimated by these global sensitivity analysis tools will be demonstrated using the RWMS model.

  13. Partitioning sources of variation in vertebrate species richness

    USGS Publications Warehouse

    Boone, R.B.; Krohn, W.B.

    2000-01-01

    Aim: To explore biogeographic patterns of terrestrial vertebrates in Maine, USA, using techniques that would describe local and spatial correlations with the environment. Location: Maine, USA. Methods: We delineated the ranges within Maine (86,156 km2) of 275 species using literature and expert review. Ranges were combined into species richness maps, and compared to geomorphology, climate, and woody plant distributions. Methods were adapted that compared richness of all vertebrate classes to each environmental correlate, rather than assessing a single explanatory theory. We partitioned variation in species richness into components using tree and multiple linear regression. Methods were used that allowed for useful comparisons between tree and linear regression results. For both methods we partitioned variation into broad-scale (spatially autocorrelated) and fine-scale (spatially uncorrelated) explained and unexplained components. By partitioning variance, and using both tree and linear regression in analyses, we explored the degree of variation in species richness for each vertebrate group that could be explained by the relative contribution of each environmental variable. Results: In tree regression, climate variation explained richness better (92% of mean deviance explained for all species) than woody plant variation (87%) and geomorphology (86%). Reptiles were highly correlated with environmental variation (93%), followed by mammals, amphibians, and birds (each with 84-82% deviance explained). In multiple linear regression, climate was most closely associated with total vertebrate richness (78%), followed by woody plants (67%) and geomorphology (56%). Again, reptiles were closely correlated with the environment (95%), followed by mammals (73%), amphibians (63%) and birds (57%). Main conclusions: Comparing variation explained using tree and multiple linear regression quantified the importance of nonlinear relationships and local interactions between species richness and environmental variation, identifying, for example, the importance of linear relationships between reptiles and the environment, and of nonlinear relationships between birds and woody plants. Conservation planners should capture climatic variation in broad-scale designs; temperatures may shift during climate change, but the underlying correlations between the environment and species richness will presumably remain.

  14. SU-E-J-237: Image Feature Based DRR and Portal Image Registration

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, X; Chang, J

    Purpose: Two-dimensional (2D) matching of the kV X-ray and digitally reconstructed radiography (DRR) images is an important setup technique for image-guided radiotherapy (IGRT). In our clinics, mutual information based methods are used for this purpose on commercial linear accelerators, but these often need manual corrections. This work demonstrated the feasibility of using a feature-based image transform to register kV and DRR images. Methods: The scale invariant feature transform (SIFT) method was implemented to detect the matching image details (or key points) between the kV and DRR images. These key points represent high image intensity gradients and thus the scale invariant features. Due to the poor image contrast of our kV images, direct application of the SIFT method yielded many detection errors. To assist the finding of key points, the center coordinates of the kV and DRR images were read from the DICOM header, and the two groups of key points with similar relative positions to their corresponding centers were paired up. Using these points, a rigid transform (with scaling, horizontal and vertical shifts) was estimated. We also artificially introduced vertical and horizontal shifts to test the accuracy of our registration method on anterior-posterior (AP) and lateral pelvic images. Results: The results provided a satisfactory overlay of the transformed kV onto the DRR image. The introduced vs. detected shifts were fit into a linear regression. In the AP image experiments, linear regression analysis showed a slope of 1.15 and 0.98 with an R2 of 0.89 and 0.99 for the horizontal and vertical shifts, respectively. The corresponding values for the lateral image shifts were 1.2 and 1.3, with R2 of 0.72 and 0.82. Conclusion: This work provided an alternative technique for kV to DRR alignment. Further improvements in the estimation accuracy and image contrast tolerance are underway.
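
    A sketch of the general pipeline using OpenCV, assuming 8-bit grayscale kV and DRR arrays. The center-based key-point pairing from the DICOM header described above is replaced here by a standard ratio-test match, and cv2.estimateAffinePartial2D fits a 4-degree-of-freedom similarity (scale, rotation, shifts) rather than the scale-plus-shifts transform of the abstract.

        import cv2
        import numpy as np

        def register_kv_to_drr(kv, drr):
            """Detect SIFT key points, match, and estimate a similarity transform."""
            sift = cv2.SIFT_create()
            k1, d1 = sift.detectAndCompute(kv, None)
            k2, d2 = sift.detectAndCompute(drr, None)
            matches = cv2.BFMatcher().knnMatch(d1, d2, k=2)
            good = [m for m, n in matches if m.distance < 0.75 * n.distance]
            src = np.float32([k1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
            dst = np.float32([k2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
            M, inliers = cv2.estimateAffinePartial2D(src, dst)
            return M   # 2x3 matrix mapping kV coordinates onto the DRR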

  15. High-Resolution Gamma-Ray Imaging Measurements Using Externally Segmented Germanium Detectors

    NASA Technical Reports Server (NTRS)

    Callas, J.; Mahoney, W.; Skelton, R.; Varnell, L.; Wheaton, W.

    1994-01-01

    Fully two-dimensional gamma-ray imaging with simultaneous high-resolution spectroscopy has been demonstrated using an externally segmented germanium sensor. The system employs a single high-purity coaxial detector with its outer electrode segmented into 5 distinct charge collection regions and a lead coded aperture with a uniformly redundant array (URA) pattern. A series of one-dimensional responses was collected around 511 keV while the system was rotated in steps through 180 degrees. A non-negative, linear least-squares algorithm was then employed to reconstruct a 2-dimensional image. Corrections for multiple scattering in the detector, and the finite distance of source and detector are made in the reconstruction process.

  16. Visual Analytics for Exploration of a High-Dimensional Structure

    DTIC Science & Technology

    2013-04-01

    [Abstract unavailable; the record text consists of table-of-figures fragments: Figure 3 compares Euclidean vs. geodesic distance, where a method using geodesic distance follows the manifold while a linear dimensionality reduction (LDR) fails; Figure 4 shows the WEKA GUI for data mining high-dimensional data using FRFS-ACO. The surviving text also notes that classical multidimensional scaling (CMDS) is an LDR, i.e., based on a linear combination of the feature data, and that LDRs keep similar data points close together.]

  17. The Application of the Cumulative Logistic Regression Model to Automated Essay Scoring

    ERIC Educational Resources Information Center

    Haberman, Shelby J.; Sinharay, Sandip

    2010-01-01

    Most automated essay scoring programs use a linear regression model to predict an essay score from several essay features. This article applied a cumulative logit model instead of the linear regression model to automated essay scoring. Comparison of the performances of the linear regression model and the cumulative logit model was performed on a…

  18. A data-driven approach for modeling post-fire debris-flow volumes and their uncertainty

    USGS Publications Warehouse

    Friedel, Michael J.

    2011-01-01

    This study demonstrates the novel application of genetic programming to evolve nonlinear post-fire debris-flow volume equations from variables associated with a data-driven conceptual model of the western United States. The search space is constrained using a multi-component objective function that simultaneously minimizes root-mean-squared and unit errors for the evolution of the fittest equations. An optimization technique is then used to estimate the limits of nonlinear prediction uncertainty associated with the debris-flow equations. In contrast to a published multiple linear regression three-variable equation, linking basin area with slopes greater than or equal to 30 percent, burn severity characterized as area burned at moderate plus high severity, and total storm rainfall, the data-driven approach discovers many nonlinear and several dimensionally consistent equations that are unbiased and have less prediction uncertainty. Of the nonlinear equations, the best performance (lowest prediction uncertainty) is achieved when using three variables: average basin slope, total burned area, and total storm rainfall. Further reduction in uncertainty is possible for the nonlinear equations when dimensional consistency is not a priority and by subsequently applying a gradient solver to the fittest solutions. The data-driven modeling approach can be applied to nonlinear multivariate problems in all fields of study.

  19. Efficient generation of sum-of-products representations of high-dimensional potential energy surfaces based on multimode expansions

    NASA Astrophysics Data System (ADS)

    Ziegler, Benjamin; Rauhut, Guntram

    2016-03-01

    The transformation of multi-dimensional potential energy surfaces (PESs) from a grid-based multimode representation to an analytical one is a standard procedure in quantum chemical programs. Within the framework of linear least squares fitting, a simple and highly efficient algorithm is presented, which relies on a direct product representation of the PES and a repeated use of Kronecker products. It shows the same scalings in computational cost and memory requirements as the potfit approach. In comparison to customary linear least squares fitting algorithms, this corresponds to a speed-up and memory saving by several orders of magnitude. Different fitting bases are tested, namely, polynomials, B-splines, and distributed Gaussians. Benchmark calculations are provided for the PESs of a set of small molecules.

  20. Efficient generation of sum-of-products representations of high-dimensional potential energy surfaces based on multimode expansions.

    PubMed

    Ziegler, Benjamin; Rauhut, Guntram

    2016-03-21

    The transformation of multi-dimensional potential energy surfaces (PESs) from a grid-based multimode representation to an analytical one is a standard procedure in quantum chemical programs. Within the framework of linear least squares fitting, a simple and highly efficient algorithm is presented, which relies on a direct product representation of the PES and a repeated use of Kronecker products. It shows the same scalings in computational cost and memory requirements as the potfit approach. In comparison to customary linear least squares fitting algorithms, this corresponds to a speed-up and memory saving by several orders of magnitude. Different fitting bases are tested, namely, polynomials, B-splines, and distributed Gaussians. Benchmark calculations are provided for the PESs of a set of small molecules.
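
    The computational point can be seen already in two dimensions: fitting a direct-product basis by least squares never requires forming the Kronecker design matrix, because the separable problem factorizes mode by mode. A small verification sketch (toy bases and data; real PES fits would use polynomials, B-splines, or distributed Gaussians as in the paper):

        import numpy as np

        rng = np.random.default_rng(0)
        n1, n2, m1, m2 = 30, 25, 6, 5
        B1 = rng.normal(size=(n1, m1))   # mode-1 basis evaluated on its grid
        B2 = rng.normal(size=(n2, m2))   # mode-2 basis
        Y = rng.normal(size=(n1, n2))    # grid-based potential values (toy data)

        # Naive: explicit Kronecker design matrix, memory O(n1*n2*m1*m2).
        A = np.kron(B1, B2)
        c_naive, *_ = np.linalg.lstsq(A, Y.reshape(-1), rcond=None)

        # Structured: minimizer of ||B1 @ C @ B2.T - Y||_F, no Kronecker product.
        C = np.linalg.pinv(B1) @ Y @ np.linalg.pinv(B2).T
        assert np.allclose(c_naive, C.reshape(-1), atol=1e-8)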

  1. Sparse Additive Ordinary Differential Equations for Dynamic Gene Regulatory Network Modeling.

    PubMed

    Wu, Hulin; Lu, Tao; Xue, Hongqi; Liang, Hua

    2014-04-02

    The gene regulation network (GRN) is a high-dimensional complex system, which can be represented by various mathematical or statistical models. The ordinary differential equation (ODE) model is one of the popular dynamic GRN models. High-dimensional linear ODE models have been proposed to identify GRNs, but they are limited by the assumption of linear regulation effects. In this article, we propose a sparse additive ODE (SA-ODE) model, coupled with ODE estimation methods and adaptive group LASSO techniques, to model dynamic GRNs that can flexibly deal with nonlinear regulation effects. The asymptotic properties of the proposed method are established, and simulation studies are performed to validate the proposed approach. An application example for identifying the nonlinear dynamic GRN of T-cell activation is used to illustrate the usefulness of the proposed method.

  2. Reduced-Order Models Based on POD-TPWL for Compositional Subsurface Flow Simulation

    NASA Astrophysics Data System (ADS)

    Durlofsky, L. J.; He, J.; Jin, L. Z.

    2014-12-01

    A reduced-order modeling procedure applicable for compositional subsurface flow simulation will be described and applied. The technique combines trajectory piecewise linearization (TPWL) and proper orthogonal decomposition (POD) to provide highly efficient surrogate models. The method is based on a molar formulation (which uses pressure and overall component mole fractions as the primary variables) and is applicable for two-phase, multicomponent systems. The POD-TPWL procedure expresses new solutions in terms of linearizations around solution states generated and saved during previously simulated 'training' runs. High-dimensional states are projected into a low-dimensional subspace using POD. Thus, at each time step, only a low-dimensional linear system needs to be solved. Results will be presented for heterogeneous three-dimensional simulation models involving CO2 injection. Both enhanced oil recovery and carbon storage applications (with horizontal CO2 injectors) will be considered. Reasonably close agreement between full-order reference solutions and compositional POD-TPWL simulations will be demonstrated for 'test' runs in which the well controls differ from those used for training. Construction of the POD-TPWL model requires preprocessing overhead computations equivalent to about 3-4 full-order runs. Runtime speedups using POD-TPWL are, however, very significant - typically O(100-1000). The use of POD-TPWL for well control optimization will also be illustrated. For this application, some amount of retraining during the course of the optimization is required, which leads to smaller, but still significant, speedup factors.
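
    The POD step in such a scheme reduces to an SVD of the snapshot matrix assembled from training-run states; a minimal generic sketch (not the authors' compositional formulation):

        import numpy as np

        def pod_basis(snapshots, energy=0.999):
            """snapshots: (n_states, n_snapshots) matrix of saved training states.
            Return the leading left singular vectors capturing the given energy."""
            U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
            k = int(np.searchsorted(np.cumsum(s**2) / np.sum(s**2), energy)) + 1
            return U[:, :k]   # reduced state: z = Phi.T @ x

    TPWL then linearizes the full-order residual around the saved states, so each time step only requires solving a k x k system in the reduced coordinates.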

  3. A novel spinal kinematic analysis using X-ray imaging and vicon motion analysis: a case study.

    PubMed

    Noh, Dong K; Lee, Nam G; You, Joshua H

    2014-01-01

    This study highlights a novel spinal kinematic analysis method and the feasibility of X-ray imaging measurements to accurately assess thoracic spine motion. The advanced X-ray Nash-Moe method and analysis were used to compute the segmental range of motion in thoracic vertebra pedicles in vivo. This Nash-Moe X-ray imaging method was compared with a standardized method using the Vicon 3-dimensional motion capture system. Linear regression analysis showed an excellent and significant correlation between the two methods (R2 = 0.99, p < 0.05), suggesting that the analysis of spinal segmental range of motion using X-ray imaging measurements was accurate and comparable to the conventional 3-dimensional motion analysis system. Clinically, this novel finding is compelling evidence demonstrating that measurements with X-ray imaging are useful to accurately decipher pathological spinal alignment and movement impairments in idiopathic scoliosis (IS).

  4. Partial least squares analysis of rocket propulsion fuel data using diaphragm valve-based comprehensive two-dimensional gas chromatography coupled with flame ionization detection.

    PubMed

    Freye, Chris E; Fitz, Brian D; Billingsley, Matthew C; Synovec, Robert E

    2016-06-01

    The chemical composition and several physical properties of RP-1 fuels were studied using comprehensive two-dimensional (2D) gas chromatography (GC×GC) coupled with flame ionization detection (FID). A "reversed column" GC×GC configuration was implemented with an RTX-wax column on the first dimension ((1)D), and an RTX-1 as the second dimension ((2)D). Modulation was achieved using a high temperature diaphragm valve mounted directly in the oven. Using leave-one-out cross-validation (LOOCV), the summed GC×GC-FID signal of three compound-class selective 2D regions (alkanes, cycloalkanes, and aromatics) was regressed against previously measured ASTM-derived values for these compound classes, yielding root mean square errors of cross validation (RMSECV) of 0.855, 0.734, and 0.530 mass%, respectively. For comparison, using partial least squares (PLS) analysis with LOOCV, the GC×GC-FID signal of the entire 2D separations was regressed against the same ASTM values, which yielded a linear trend for the three compound classes (alkanes, cycloalkanes, and aromatics) with RMSECV values of 1.52, 2.76, and 0.945 mass%, respectively. Additionally, a more detailed PLS analysis was undertaken of the compound classes (n-alkanes, iso-alkanes, mono-, di-, and tri-cycloalkanes, and aromatics), and of physical properties previously determined by ASTM methods (such as net heat of combustion, hydrogen content, density, kinematic viscosity, sustained boiling temperature and vapor rise temperature). Results from these PLS studies using the relatively simple-to-use and inexpensive GC×GC-FID instrumental platform are compared to previously reported results using the GC×GC-TOFMS instrumental platform.
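
    The PLS-with-LOOCV workflow maps directly onto a standard library; a sketch with scikit-learn, where the flattened chromatogram matrix and target values are synthetic stand-ins for the GC×GC-FID signals and ASTM-derived mass percentages:

        import numpy as np
        from sklearn.cross_decomposition import PLSRegression
        from sklearn.model_selection import LeaveOneOut

        rng = np.random.default_rng(0)
        X = rng.random((30, 500))        # flattened GC×GC-FID signals (toy)
        y = rng.random(30)               # e.g. alkane mass% from ASTM methods

        errs = []
        for train, test in LeaveOneOut().split(X):
            pls = PLSRegression(n_components=5).fit(X[train], y[train])
            errs.append(pls.predict(X[test])[0, 0] - y[test][0])
        rmsecv = np.sqrt(np.mean(np.square(errs)))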

  5. A Three-Dimensional Linearized Unsteady Euler Analysis for Turbomachinery Blade Rows

    NASA Technical Reports Server (NTRS)

    Montgomery, Matthew D.; Verdon, Joseph M.

    1997-01-01

    A three-dimensional, linearized, Euler analysis is being developed to provide an efficient unsteady aerodynamic analysis that can be used to predict the aeroelastic and aeroacoustic responses of axial-flow turbomachinery blading. The field equations and boundary conditions needed to describe nonlinear and linearized inviscid unsteady flows through a blade row operating within a cylindrical annular duct are presented. A numerical model for linearized inviscid unsteady flows, which couples a near-field, implicit, wave-split, finite volume analysis to a far-field eigenanalysis, is also described. The linearized aerodynamic and numerical models have been implemented into a three-dimensional linearized unsteady flow code, called LINFLUX. This code has been applied to selected benchmark unsteady subsonic flows to establish its accuracy and to demonstrate its current capabilities. The unsteady flows considered have been chosen to allow convenient comparisons between the LINFLUX results and those of well-known, two-dimensional, unsteady flow codes. Detailed numerical results for a helical fan and a three-dimensional version of the 10th Standard Cascade indicate that important progress has been made towards the development of a reliable and useful three-dimensional prediction capability that can be used in aeroelastic and aeroacoustic design studies.

  6. Transmission of linear regression patterns between time series: From relationship in time series to complex networks

    NASA Astrophysics Data System (ADS)

    Gao, Xiangyun; An, Haizhong; Fang, Wei; Huang, Xuan; Li, Huajiao; Zhong, Weiqiong; Ding, Yinghui

    2014-07-01

    The linear regression parameters between two time series can differ with the length of the observation period. If we study the whole period through a sliding window of a short period, the change of the linear regression parameters is a process of dynamic transmission over time. We present a simple and efficient computational scheme: a linear regression pattern transmission algorithm, which transforms linear regression patterns into directed and weighted networks. The linear regression patterns (nodes) are defined by the combination of intervals of the linear regression parameters and the results of significance testing under different sizes of the sliding window. The transmissions between adjacent patterns are defined as edges, and the weights of the edges are the frequencies of the transmissions. The major patterns, the distance, and the medium in the process of transmission can be captured. The statistical results for weighted out-degree and betweenness centrality are mapped onto timelines, showing the features of their distributions. Many measurements in different areas that involve two related time series variables could take advantage of this algorithm to characterize the dynamic relationships between the time series from a new perspective.

  7. Transmission of linear regression patterns between time series: from relationship in time series to complex networks.

    PubMed

    Gao, Xiangyun; An, Haizhong; Fang, Wei; Huang, Xuan; Li, Huajiao; Zhong, Weiqiong; Ding, Yinghui

    2014-07-01

    The linear regression parameters between two time series can differ with the length of the observation period. If we study the whole period through a sliding window of a short period, the change of the linear regression parameters is a process of dynamic transmission over time. We present a simple and efficient computational scheme: a linear regression pattern transmission algorithm, which transforms linear regression patterns into directed and weighted networks. The linear regression patterns (nodes) are defined by the combination of intervals of the linear regression parameters and the results of significance testing under different sizes of the sliding window. The transmissions between adjacent patterns are defined as edges, and the weights of the edges are the frequencies of the transmissions. The major patterns, the distance, and the medium in the process of transmission can be captured. The statistical results for weighted out-degree and betweenness centrality are mapped onto timelines, showing the features of their distributions. Many measurements in different areas that involve two related time series variables could take advantage of this algorithm to characterize the dynamic relationships between the time series from a new perspective.
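
    A compact sketch of the algorithm under simplifying assumptions: here a pattern is defined only by the sign of the window's slope together with its significance (the paper bins both regression parameters into intervals and varies the window size), and edge weights count transitions between consecutive windows.

        import numpy as np
        from scipy.stats import linregress

        def pattern_sequence(x, y, window=50, alpha=0.05):
            """One pattern per sliding window: (slope sign, significant?)."""
            pats = []
            for s in range(len(x) - window + 1):
                r = linregress(x[s:s + window], y[s:s + window])
                pats.append((int(r.slope > 0), bool(r.pvalue < alpha)))
            return pats

        def transition_network(pats):
            """Directed weighted edges: frequency of pattern-to-pattern moves."""
            edges = {}
            for a, b in zip(pats, pats[1:]):
                edges[(a, b)] = edges.get((a, b), 0) + 1
            return edges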

  8. EPS-LASSO: Test for High-Dimensional Regression Under Extreme Phenotype Sampling of Continuous Traits.

    PubMed

    Xu, Chao; Fang, Jian; Shen, Hui; Wang, Yu-Ping; Deng, Hong-Wen

    2018-01-25

    Extreme phenotype sampling (EPS) is a broadly used design to identify candidate genetic factors contributing to the variation of quantitative traits. By enriching the signals in extreme phenotypic samples, EPS can boost the association power compared to random sampling. Most existing statistical methods for EPS examine the genetic factors individually, although many quantitative traits have multiple genetic factors underlying their variation. It is desirable to model the joint effects of genetic factors, which may increase the power and identify novel quantitative trait loci under EPS. The joint analysis of genetic data in high-dimensional situations requires specialized techniques, e.g., the least absolute shrinkage and selection operator (LASSO). Although there is extensive research and application related to LASSO, statistical inference and testing for the sparse model under EPS remain unknown. We propose a novel sparse model (EPS-LASSO) with a hypothesis test for high-dimensional regression under EPS based on a decorrelated score function. Comprehensive simulations show that EPS-LASSO outperforms existing methods with stable type I error and FDR control. EPS-LASSO provides consistent power in both low- and high-dimensional situations compared with other methods designed for high-dimensional settings. Its power is close to that of low-dimensional methods when the causal effect sizes are small and superior when the effects are large. Applying EPS-LASSO to a transcriptome-wide gene expression study of obesity reveals 10 significant body mass index-associated genes. Our results indicate that EPS-LASSO is an effective method for EPS data analysis that can account for correlated predictors. The source code is available at https://github.com/xu1912/EPSLASSO.

  9. From Airborne EM to Geology, some examples

    NASA Astrophysics Data System (ADS)

    Gunnink, Jan

    2014-05-01

    Introduction: Airborne electromagnetics (AEM) provides a model of the 3-dimensional distribution of resistivity in the subsurface. These resistivity models were used for delineating geological structures (e.g., buried valleys and salt domes) and for geohydrological modeling of aquifers (sandy sediments) and aquitards (clayey sediments). Most AEM interpretation has been carried out manually, with 2- and 3-dimensional resistivity models translated into geological units by a skilled geologist/geophysicist. Manual interpretation is laborious, time-consuming, and prone to the interpreter's subjective choices, so semi-automatic interpretation of AEM resistivity models into geological units is a recent research topic. Two examples are presented that show how resistivity obtained from AEM can be converted to useful geological/geohydrological models.

    Statistical relation between borehole data and resistivity: In the northeastern part of the Netherlands, the 3D distribution of clay deposits (formed in a glacio-lacustrine environment with buried glacial valleys) was modeled. Boreholes with lithology descriptions were linked to AEM resistivity. First, 1D AEM resistivity models from each individual sounding were interpolated to cover the entire study area, resulting in a 3-dimensional resistivity model. For each clay and sand interval in the boreholes, the corresponding resistivity was extracted from the 3D resistivity model. Linear regression was used to link the clay and non-clay proportion in each borehole interval to ln(resistivity). This regression was then used to convert the 3D resistivity model into clay proportions for the entire study area. This so-called "soft" information was combined with the "hard" borehole data to model the clay proportion across the study area using geostatistical simulation (sequential indicator simulation with collocated co-kriging). One hundred realizations of the 3-dimensional distribution of clay and sand were calculated, giving an appreciation of its variability. Each realization was input into a groundwater model to assess the protection the clay offers against pollution from the surface.

    Artificial Neural Networks: AEM resistivity models in an area in the northern part of the Netherlands were interpreted with artificial neural networks (ANNs) to obtain a 3-dimensional model of a glacial till deposit that is important in geohydrological modeling. The groundwater in the study area was brackish to saline, causing the AEM resistivity model to be dominated by the low resistivity of the groundwater. Electrical cone penetration tests (ECPTs) made clear that the glacial till showed a distinct, non-linear resistivity pattern that discriminated it from the surrounding sediments. The patterns found in the ECPTs were used to train an ANN, which was then applied to the resistivity model derived from the AEM. The result was a 3-dimensional model of the probability of encountering the glacial till, which was checked against boreholes and proved to be quite reasonable.

    Conclusion: Resistivity derived from AEM can be linked to geological features in a number of ways. Besides manual interpretation, statistical techniques are used, either in the form of regression or neural networks, to extract geologically and geohydrologically meaningful interpretations from the resistivity model.
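
    The regression step in the first example is a simple fit of clay proportion against ln(resistivity); the values below are hypothetical borehole intervals used only to show the shape of the computation.

        import numpy as np
        from scipy.stats import linregress

        clay = np.array([0.90, 0.80, 0.70, 0.40, 0.20, 0.10, 0.05])  # per interval
        rho = np.array([8.0, 12.0, 15.0, 30.0, 55.0, 80.0, 120.0])   # ohm-m from AEM

        fit = linregress(np.log(rho), clay)        # clay ~ a + b * ln(resistivity)
        clay_pred = fit.intercept + fit.slope * np.log(rho)  # apply to the 3D model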

  10. Are We All in the Same Boat? The Role of Perceptual Distance in Organizational Health Interventions.

    PubMed

    Hasson, Henna; von Thiele Schwarz, Ulrica; Nielsen, Karina; Tafvelin, Susanne

    2016-10-01

    The study investigates how agreement between leaders' and their teams' perceptions influences intervention outcomes in a leadership-training intervention aimed at improving organizational learning. Agreement, i.e., perceptual distance, was calculated for the organizational learning dimensions at baseline. Changes in the dimensions from pre-intervention to post-intervention were evaluated using polynomial regression with response surface analysis. The general pattern of results indicated that organizational learning improved when leaders and their teams agreed on the level of organizational learning prior to the intervention. The improvement was greatest when the leader's and the team's baseline perceptions were aligned and high rather than aligned and low. The least beneficial scenario was when the leader's perceptions were higher than the team's. These results give insight into the importance of comparing leaders' and their teams' perceptions in intervention research. Polynomial regression with response surface methodology allows three-dimensional examination of the relationship between two predictor variables and an outcome. This contributes knowledge on how combinations of predictor variables may affect an outcome and allows the study of potential non-linearity in relation to the outcome. Future studies could use these methods in the process evaluation of interventions.
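
    Response surface analysis fits a second-order polynomial in the leader (L) and team (T) ratings; agreement effects are then read off along the congruence line L = T. A minimal fitting sketch with made-up ratings:

        import numpy as np

        def response_surface(L, T, z):
            """Fit z = b0 + b1*L + b2*T + b3*L**2 + b4*L*T + b5*T**2."""
            A = np.column_stack([np.ones_like(L), L, T, L**2, L*T, T**2])
            coef, *_ = np.linalg.lstsq(A, z, rcond=None)
            return coef

        L = np.array([3.2, 4.1, 2.8, 4.5, 3.9, 2.2, 4.8, 3.0])
        T = np.array([3.0, 4.4, 2.5, 4.0, 4.2, 2.0, 4.9, 3.5])
        z = np.array([3.1, 4.6, 2.4, 4.1, 4.3, 2.1, 5.0, 3.2])
        b = response_surface(L, T, z)
        # Slope along the congruence line L = T is b[1] + b[2];
        # curvature along it is b[3] + b[4] + b[5].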

  11. Skin microrelief profiles as a cutaneous aging index.

    PubMed

    Kim, Dai Hyun; Rhyu, Yeon Seung; Ahn, Hyo Hyun; Hwang, Eenjun; Uhm, Chang Sub

    2016-10-01

    An objective measurement of cutaneous topographical information is important for quantifying the degree of skin aging. Our aim was to improve methods for measuring microrelief patterns using a three-dimensional analysis based on silicone replicas and scanning electron microscopy (SEM). Another objective was to compare the results with those obtained using a two-dimensional analysis method based on dermoscopy. Silicone replicas were obtained from the forearms, dorsum of the hands, and fingers of 51 volunteers. Cutaneous profiles obtained by SEM with silicone replicas showed more consistent correlations with age than data obtained by dermoscopy, indicating the advantage of three-dimensional topography analysis using silicone replicas and SEM over the widely used dermoscopic assessment. The cutaneous age was calculated using stepwise linear regression as 57.40 - 9.47 × (number of furrows on the dorsum of the hand) × (width of furrows on the dorsum of the hand).

  12. Least median of squares and iteratively re-weighted least squares as robust linear regression methods for fluorimetric determination of α-lipoic acid in capsules in ideal and non-ideal cases of linearity.

    PubMed

    Korany, Mohamed A; Gazy, Azza A; Khamis, Essam F; Ragab, Marwa A A; Kamal, Miranda F

    2018-06-01

    This study outlines two robust regression approaches, namely least median of squares (LMS) and iteratively re-weighted least squares (IRLS), and investigates their application in the instrumental analysis of nutraceuticals (here, the fluorescence quenching of merbromin reagent upon lipoic acid addition). These robust regression methods were used to calculate calibration data from the fluorescence quenching reaction (∆F and F-ratio) under ideal or non-ideal linearity conditions. For each condition, data were treated using three regression fittings: ordinary least squares (OLS), LMS, and IRLS. Linearity, limits of detection (LOD) and quantitation (LOQ), accuracy, and precision were carefully studied for each condition. LMS and IRLS regression line fittings showed significant improvement in correlation coefficients and all regression parameters for both methods and both conditions. Under the ideal linearity condition, the intercept and slope changed insignificantly, but a dramatic change in the intercept was observed under the non-ideal condition. Under both linearity conditions, LOD and LOQ values after robust regression line fitting of the data were lower than those obtained before data treatment. The results obtained after statistical treatment indicate that the linearity ranges for drug determination could be extended to lower limits of quantitation by enhancing the regression equation parameters after data treatment. Analysis results for lipoic acid in capsules, using both fluorimetric methods, treated by parametric OLS and after treatment by robust LMS and IRLS, were compared for both linearity conditions.
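
    Of the two approaches, IRLS is the easier to sketch: alternate between a weighted least-squares fit and residual-based down-weighting of outlying points. The Huber weight function and tuning constant below are one common choice and are assumptions here, not necessarily the paper's weighting scheme; LMS instead minimizes the median squared residual, typically by searching over random data subsets.

        import numpy as np

        def irls(A, y, n_iter=50, delta=1.345, eps=1e-8):
            """Robust fit via IRLS with Huber weights; A is the design matrix."""
            beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # OLS starting point
            for _ in range(n_iter):
                r = y - A @ beta
                s = np.median(np.abs(r)) / 0.6745 + eps    # robust scale (MAD)
                u = np.abs(r) / (s * delta)
                w = np.where(u <= 1.0, 1.0, 1.0 / u)       # Huber weights
                sw = np.sqrt(w)
                beta, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
            return beta

        # Demo: straight-line calibration with injected outliers.
        rng = np.random.default_rng(0)
        x = np.linspace(0, 10, 40)
        y = 2.0 + 0.5 * x + rng.normal(scale=0.1, size=40)
        y[::10] += 5.0                                     # outliers
        A = np.column_stack([np.ones_like(x), x])
        print(irls(A, y))                                  # close to (2.0, 0.5)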

  13. Ultraviolet and near-infrared femtosecond temporal pulse shaping with a new high-aspect-ratio one-dimensional micromirror array.

    PubMed

    Weber, Stefan M; Extermann, Jérôme; Bonacina, Luigi; Noell, Wilfried; Kiselev, Denis; Waldis, Severin; de Rooij, Nico F; Wolf, Jean-Pierre

    2010-09-15

    We demonstrate the capabilities of a new optical microelectromechanical systems device that we specifically developed for broadband femtosecond pulse shaping. It consists of a one-dimensional array of 100 independently addressable, high-aspect-ratio micromirrors with up to 3 μm stroke. We apply linear and quadratic phase modulations demonstrating the temporal compression of 800 and 400 nm pulses. Because of the device's surface flatness, stroke, and stroke resolution, phase shaping over an unprecedented bandwidth is attainable.

  14. Real-time model learning using Incremental Sparse Spectrum Gaussian Process Regression.

    PubMed

    Gijsberts, Arjan; Metta, Giorgio

    2013-05-01

    Novel applications in unstructured and non-stationary human environments require robots that learn from experience and adapt autonomously to changing conditions. Predictive models therefore not only need to be accurate, but should also be updated incrementally in real-time and require minimal human intervention. Incremental Sparse Spectrum Gaussian Process Regression is an algorithm that is targeted specifically for use in this context. Rather than developing a novel algorithm from the ground up, the method is based on the thoroughly studied Gaussian Process Regression algorithm, therefore ensuring a solid theoretical foundation. Non-linearity and a bounded update complexity are achieved simultaneously by means of a finite dimensional random feature mapping that approximates a kernel function. As a result, the computational cost for each update remains constant over time. Finally, algorithmic simplicity and support for automated hyperparameter optimization ensures convenience when employed in practice. Empirical validation on a number of synthetic and real-life learning problems confirms that the performance of Incremental Sparse Spectrum Gaussian Process Regression is superior with respect to the popular Locally Weighted Projection Regression, while computational requirements are found to be significantly lower. The method is therefore particularly suited for learning with real-time constraints or when computational resources are limited.
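
    The essence of the method can be sketched in a few lines: approximate the kernel with random Fourier features, then update the feature-space normal equations one sample at a time so each update costs the same regardless of how much data has been seen. This is a simplified sketch; the actual algorithm maintains a Cholesky factor with rank-1 updates instead of re-solving, and optimizes hyperparameters automatically.

        import numpy as np

        class SparseSpectrumGPR:
            def __init__(self, dim, n_feat=100, lengthscale=1.0, noise=0.1, seed=0):
                rng = np.random.default_rng(seed)
                self.W = rng.normal(scale=1.0 / lengthscale, size=(n_feat, dim))
                D = 2 * n_feat
                self.A = noise**2 * np.eye(D)    # regularized feature Gram matrix
                self.b = np.zeros(D)

            def _phi(self, x):                   # random Fourier feature map
                z = self.W @ x
                return np.concatenate([np.cos(z), np.sin(z)]) / np.sqrt(self.W.shape[0])

            def update(self, x, y):              # constant cost per sample
                p = self._phi(x)
                self.A += np.outer(p, p)
                self.b += y * p

            def predict(self, x):
                return self._phi(x) @ np.linalg.solve(self.A, self.b)

        model = SparseSpectrumGPR(dim=1)
        for t in range(200):
            x = np.array([t * 0.05])
            model.update(x, float(np.sin(x[0])))
        print(model.predict(np.array([2.0])))    # roughly sin(2.0)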

  15. [New method of mixed gas infrared spectrum analysis based on SVM].

    PubMed

    Bai, Peng; Xie, Wen-Jun; Liu, Jun-Hua

    2007-07-01

    A new method of infrared spectrum analysis for gas mixtures, based on the support vector machine (SVM), was proposed. The kernel function in the SVM maps the seriously overlapping absorption spectra into a high-dimensional space while the computations remain in the original space; on this basis, a regression calibration model was established and applied to estimate the concentration of each component gas. It was also shown that the regression calibration model can be used for component recognition of the gas mixture. The method was applied to the analysis of different data samples. Factors that affect the model, such as the scan interval, wavelength range, kernel function, and penalty coefficient C, were discussed. Experimental results show that the maximum mean absolute error of component concentration is 0.132%, and the component recognition accuracy is higher than 94%. The problems of overlapping absorption spectra, of using a single method for both qualitative and quantitative analysis, and of a limited number of training samples were addressed. The method could be used in other mixed-gas infrared spectrum analyses and shows promise in both theory and application.
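
    A rough equivalent with a modern library, assuming spectra arranged as rows and per-gas concentrations as targets; the synthetic mixture below merely mimics linear mixing of overlapping component spectra and is not the paper's dataset.

        import numpy as np
        from sklearn.multioutput import MultiOutputRegressor
        from sklearn.svm import SVR

        rng = np.random.default_rng(0)
        components = rng.random((3, 300))            # pure-component "spectra"
        conc = rng.random((80, 3))                   # e.g. CO2, CO, NO levels
        spectra = conc @ components + 0.01 * rng.normal(size=(80, 300))

        model = MultiOutputRegressor(SVR(kernel="rbf", C=10.0, epsilon=0.01))
        model.fit(spectra, conc)                     # kernel regression calibration
        print(model.predict(spectra[:2]))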

  16. The role of shoe design on the prediction of free torque at the shoe-surface interface using pressure insole technology.

    PubMed

    Weaver, Brian Thomas; Fitzsimons, Kathleen; Braman, Jerrod; Haut, Roger

    2016-09-01

    The goal of the current study was to expand on previous work to validate the use of pressure insole technology in conjunction with linear regression models to predict the free torque generated at the shoe-surface interface while wearing different athletic shoes. Three distinctly different shoe designs were utilised. The stiffness of each shoe was determined with a materials testing machine. Six participants wore each shoe, fitted with an insole pressure measurement device, and performed rotation trials on an embedded force plate. A pressure sensor mask was constructed from those sensors having a high linear correlation with free torque values. Linear regression models were developed to predict free torques from these pressure sensor data. The models predicted their own shoe's free torque well (RMS error 3.72 ± 0.74 Nm) but not that of the other shoes (RMS error 10.43 ± 3.79 Nm). Models performing self-prediction were also able to detect differences in shoe stiffness. These results show the need for participant- and shoe-specific linear regression models to ensure high prediction accuracy of free torques from pressure sensor data during isolated internal and external rotations of the body with respect to a planted foot.

  17. INNOVATIVE INSTRUMENTATION AND ANALYSIS OF THE TEMPERATURE MEASUREMENT FOR HIGH TEMPERATURE GASIFICATION

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Seong W. Lee

    2004-10-01

    The systematic tests of the gasifier simulator on the clean thermocouple were completed in this reporting period. Within these tests, five (5) factors were considered as the experimental parameters: air flow rate, water flow rate, fine dust particle amount, ammonia addition and a high/low frequency device (electric motor). The fractional factorial design method was used in the experiment design, with sixteen (16) data sets of readings. Analysis of Variance (ANOVA) was applied to the results of the systematic tests. The ANOVA results show that the un-balanced motor vibration frequency did not have a significant impact on the temperature changes in the gasifier simulator. For the fine dust particle testing, the amount of fine dust particles had a significant impact on the temperature measurements in the gasifier simulator. The effects of the air and water on the temperature measurements show the same results as reported in the previous report. The ammonia concentration was included as an experimental parameter for the reducing environment in this reporting period; it does not appear to be a significant factor in the temperature changes. Linear regression analysis was applied to the temperature readings with the five (5) factors; the accuracy of the linear regression is relatively low, less than 10%. Nonlinear regression was also conducted on the temperature readings with the same factors. Since the experiments were designed at two (2) levels, the nonlinear regression is not very effective with this dataset (16 readings), so an extra central-point test was conducted. With the data of the center-point test, the accuracy of the nonlinear regression is much better than that of the linear regression.
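
    A minimal sketch of this kind of screening analysis, under invented factor names and readings: a 16-run, two-level fractional factorial is generated, a main-effects linear model is fitted, and an ANOVA table ranks the factors.

    ```python
    # Sketch with invented factors and readings: a 16-run, two-level
    # fractional factorial (generator vib = air*water*dust*ammonia) screened
    # by ANOVA on a main-effects linear model.
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.formula.api import ols

    rng = np.random.default_rng(2)
    runs = np.array([[(r >> b) & 1 for b in range(4)] for r in range(16)]) * 2 - 1
    df = pd.DataFrame(runs, columns=["air", "water", "dust", "ammonia"])
    df["vib"] = df.air * df.water * df.dust * df.ammonia   # 2**(5-1) design
    df["temp"] = 900 + 15 * df.air + 10 * df.dust + rng.normal(0, 3, 16)

    fit = ols("temp ~ air + water + dust + ammonia + vib", data=df).fit()
    print(sm.stats.anova_lm(fit, typ=2))   # air and dust should dominate
    ```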

  18. Quantification of 11-Nor-9-Carboxy-Δ9-Tetrahydrocannabinol in Human Oral Fluid by Gas Chromatography–Tandem Mass Spectrometry

    PubMed Central

    Barnes, Allan J.; Scheidweiler, Karl B.; Huestis, Marilyn A.

    2015-01-01

    A sensitive and specific method for the quantification of 11-nor-9-carboxy-Δ9-tetrahydrocannabinol (THCCOOH) in oral fluid collected with the Quantisal and Oral-Eze devices was developed and fully validated. Extracted analytes were derivatized with hexafluoroisopropanol and trifluoroacetic anhydride and quantified by gas chromatography–tandem mass spectrometry with negative chemical ionization. Standard curves, using linear least-squares regression with 1/x² weighting, were linear from 10 to 1000 ng/L with coefficients of determination >0.998 for both collection devices. Bias was 89.2%–112.6%, total imprecision 4.0%–5.1% coefficient of variation, and extraction efficiency >79.8% across the linear range for Quantisal-collected specimens. Bias was 84.6%–109.3%, total imprecision 3.6%–7.3% coefficient of variation, and extraction efficiency >92.6% for specimens collected with the Oral-Eze device at all 3 quality control concentrations (10, 120, and 750 ng/L). This effective high-throughput method reduces analysis time by 9 minutes per sample compared with our current 2-dimensional gas chromatography–mass spectrometry method and extends the capability of quantifying this important oral fluid analyte to gas chromatography–tandem mass spectrometry. This method was applied to the analysis of oral fluid specimens collected from individuals participating in controlled cannabis studies and will be effective for distinguishing passive environmental contamination from active cannabis smoking. PMID:24622724
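
    The weighted calibration fit described above can be sketched as follows; the calibrator concentrations and simulated detector response are invented, and statsmodels' WLS stands in for whatever software the laboratory used.

    ```python
    # Sketch of a 1/x^2-weighted least-squares calibration; concentrations
    # and simulated responses are invented, not the paper's data.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(3)
    x = np.array([10, 25, 50, 100, 250, 500, 1000], dtype=float)   # ng/L
    y = 0.002 * x + rng.normal(0, 0.002 * x)                       # response

    fit = sm.WLS(y, sm.add_constant(x), weights=1.0 / x**2).fit()
    print(fit.params, fit.rsquared)   # weighting favors the low end of the curve
    ```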

  19. Feature Extraction of High-Dimensional Structures for Exploratory Analytics

    DTIC Science & Technology

    2013-04-01

    Comparison of Euclidean vs. geodesic distance: linear dimensionality reduction methods (LDRs) use a metric based on the Euclidean distance between two points, while nonlinear methods (NLDRs) are based on geodesic distance. An NLDR successfully unrolls a curved manifold, whereas an LDR fails. Classical techniques such as metric multidimensional scaling are linear DRs (LDRs); an LDR is based on a linear combination of the original variables.

  20. Does Nonlinear Modeling Play a Role in Plasmid Bioprocess Monitoring Using Fourier Transform Infrared Spectra?

    PubMed

    Lopes, Marta B; Calado, Cecília R C; Figueiredo, Mário A T; Bioucas-Dias, José M

    2017-06-01

    The monitoring of biopharmaceutical products using Fourier transform infrared (FT-IR) spectroscopy relies on calibration techniques involving the acquisition of spectra of bioprocess samples along the process. The most commonly used method for that purpose is partial least squares (PLS) regression, under the assumption that a linear model is valid. Despite being successful in the presence of small nonlinearities, linear methods may fail in the presence of strong nonlinearities. This paper studies the potential usefulness of nonlinear regression methods for predicting, from in situ near-infrared (NIR) and mid-infrared (MIR) spectra acquired in high-throughput mode, biomass and plasmid concentrations in Escherichia coli DH5-α cultures producing the plasmid model pVAX-LacZ. The linear methods PLS and ridge regression (RR) are compared with their kernel (nonlinear) versions, kPLS and kRR, as well as with the (also nonlinear) relevance vector machine (RVM) and Gaussian process regression (GPR). For the systems studied, RR provided better predictive performance than the remaining methods. Moreover, the results point to further investigation based on larger data sets whenever differences in predictive accuracy between a linear method and its kernelized version cannot be found. The use of nonlinear methods, however, should be weighed against the additional computational cost required to tune their additional parameters, especially when the less computationally demanding linear methods studied here are able to successfully monitor the variables under study.
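
    For a sense of what the RR-vs-kRR comparison looks like in code, here is a small sketch on synthetic stand-in spectra; the regularization and kernel settings are arbitrary assumptions rather than the paper's tuned values.

    ```python
    # Small RR-vs-kRR comparison on synthetic stand-in spectra; settings are
    # arbitrary assumptions, not the paper's tuned values.
    import numpy as np
    from sklearn.linear_model import Ridge
    from sklearn.kernel_ridge import KernelRidge
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(4)
    X = rng.normal(size=(120, 300))                          # mock MIR spectra
    y = X @ rng.normal(size=300) + 0.5 * np.tanh(X[:, 0])    # mostly linear target

    for name, model in [("RR", Ridge(alpha=1.0)),
                        ("kRR", KernelRidge(alpha=1.0, kernel="rbf", gamma=1e-3))]:
        print(name, cross_val_score(model, X, y, cv=5, scoring="r2").mean())
    ```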

  1. Multiple regression for physiological data analysis: the problem of multicollinearity.

    PubMed

    Slinker, B K; Glantz, S A

    1985-07-01

    Multiple linear regression, in which several predictor variables are related to a response variable, is a powerful statistical tool for gaining quantitative insight into complex in vivo physiological systems. For these insights to be correct, all predictor variables must be uncorrelated. However, in many physiological experiments the predictor variables cannot be precisely controlled and thus change in parallel (i.e., they are highly correlated). There is a redundancy of information about the response, a situation called multicollinearity, that leads to numerical problems in estimating the parameters in regression equations; the parameters are often of incorrect magnitude or sign or have large standard errors. Although multicollinearity can be avoided with good experimental design, not all interesting physiological questions can be studied without encountering multicollinearity. In these cases various ad hoc procedures have been proposed to mitigate multicollinearity. Although many of these procedures are controversial, they can be helpful in applying multiple linear regression to some physiological problems.
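
    A common way to quantify the redundancy described above is the variance inflation factor (VIF); the sketch below, with invented physiological variables, flags a nearly collinear predictor pair.

    ```python
    # Sketch: variance inflation factors for invented, nearly collinear
    # physiological predictors; VIF values well above 10 signal trouble.
    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.outliers_influence import variance_inflation_factor

    rng = np.random.default_rng(5)
    heart_rate = rng.normal(70, 8, 100)
    cardiac_output = 0.08 * heart_rate + rng.normal(0, 0.2, 100)  # changes in parallel
    preload = rng.normal(12, 2, 100)

    X = sm.add_constant(np.column_stack([heart_rate, cardiac_output, preload]))
    for i, name in enumerate(["heart_rate", "cardiac_output", "preload"], start=1):
        print(name, variance_inflation_factor(X, i))
    ```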

  2. Element enrichment factor calculation using grain-size distribution and functional data regression.

    PubMed

    Sierra, C; Ordóñez, C; Saavedra, A; Gallego, J R

    2015-01-01

    In environmental geochemistry studies it is common practice to normalize element concentrations in order to remove the effect of grain size. Linear regression with respect to a particular grain size or conservative element is a widely used method of normalization. In this paper, the utility of functional linear regression, in which the grain-size curve is the independent variable and the concentration of pollutant the dependent variable, is analyzed and applied to detrital sediment. After implementing functional linear regression and classical linear regression models to normalize and calculate enrichment factors, we concluded that the former regression technique has some advantages over the latter. First, functional linear regression directly considers the grain-size distribution of the samples as the explanatory variable. Second, as the regression coefficients are not constant values but functions depending on the grain size, it is easier to comprehend the relationship between grain size and pollutant concentration. Third, regularization can be introduced into the model in order to establish equilibrium between reliability of the data and smoothness of the solutions. Copyright © 2014 Elsevier Ltd. All rights reserved.
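
    A rough sketch of the functional idea, under simulated data: the whole grain-size distribution serves as the predictor, and the functional coefficient β(s) is recovered on a grid, with a ridge penalty playing the role of the regularization mentioned above.

    ```python
    # Rough sketch of functional linear regression on simulated data: the
    # grain-size distribution is the predictor and beta(s) is recovered on a
    # grid with a ridge penalty standing in for the regularization discussed.
    import numpy as np

    rng = np.random.default_rng(6)
    sizes = np.linspace(0, 1, 60)                     # normalized grain size grid
    beta_true = np.sin(np.pi * sizes)                 # functional coefficient
    curves = rng.dirichlet(np.full(60, 5.0), size=80) # grain-size distributions
    conc = curves @ beta_true + rng.normal(0, 0.01, 80)

    lam = 1e-3                                        # smoothing strength (assumed)
    beta_hat = np.linalg.solve(curves.T @ curves + lam * np.eye(60),
                               curves.T @ conc)
    print(np.round(beta_hat[::10], 2))                # approximates beta_true
    ```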

  3. Who Will Win?: Predicting the Presidential Election Using Linear Regression

    ERIC Educational Resources Information Center

    Lamb, John H.

    2007-01-01

    This article outlines a linear regression activity that engages learners, uses technology, and fosters cooperation. Students generated least-squares linear regression equations using TI-83 Plus[TM] graphing calculators, Microsoft[C] Excel, and paper-and-pencil calculations using derived normal equations to predict the 2004 presidential election.…
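
    For reference, the normal-equations computation at the heart of the activity looks like this in code; the election-year data points are invented placeholders, not the article's values.

    ```python
    # The normal equations solved directly: for y = Xb + e, the least-squares
    # coefficients satisfy (X'X) b = X'y. The data points are invented.
    import numpy as np

    year = np.array([1992.0, 1996.0, 2000.0])
    share = np.array([43.0, 49.2, 48.4])            # illustrative vote shares

    X = np.column_stack([np.ones_like(year), year])
    b = np.linalg.solve(X.T @ X, X.T @ share)       # intercept and slope
    print("predicted 2004 share:", b[0] + b[1] * 2004)
    ```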

  4. Analysis and prediction of flow from local source in a river basin using a Neuro-fuzzy modeling tool.

    PubMed

    Aqil, Muhammad; Kita, Ichiro; Yano, Akira; Nishiyama, Soichi

    2007-10-01

    Traditionally, the multiple linear regression technique has been one of the most widely used models for simulating hydrological time series. However, when the nonlinear phenomenon is significant, multiple linear regression will fail to develop an appropriate predictive model. Recently, neuro-fuzzy systems have gained much popularity for calibrating nonlinear relationships. This study evaluated the potential of a neuro-fuzzy system as an alternative to the traditional statistical regression technique for the purpose of predicting flow from a local source in a river basin. The effectiveness of the proposed identification technique was demonstrated through a simulation study of the river flow time series of the Citarum River in Indonesia. Furthermore, in order to quantify the uncertainty associated with the estimation of river flow, a Monte Carlo simulation was performed. As a comparison, the multiple linear regression analysis being used by the Citarum River Authority was also examined using various statistical indices. The simulation results using 95% confidence intervals indicated that the neuro-fuzzy model consistently underestimated the magnitude of high flows, while low and medium flow magnitudes were estimated closer to the observed data. The comparison of the prediction accuracy of the neuro-fuzzy and linear regression methods indicated that the neuro-fuzzy approach was more accurate in predicting river flow dynamics. The neuro-fuzzy model improved the root mean square error (RMSE) and mean absolute percentage error (MAPE) values of the multiple linear regression forecasts by about 13.52% and 10.73%, respectively. Considering its simplicity and efficiency, the neuro-fuzzy model is recommended as an alternative tool for modeling flow dynamics in the study area.

  5. Baseline social amotivation predicts 1-year functioning in UHR subjects: A validation and prospective investigation.

    PubMed

    Lam, Max; Abdul Rashid, Nur Amirah; Lee, Sara-Ann; Lim, Jeanette; Foussias, George; Fervaha, Gagan; Ruhrman, Stephan; Remington, Gary; Lee, Jimmy

    2015-12-01

    Social amotivation and diminished expression have been reported to underlie negative symptomatology in schizophrenia. In the current study we sought to establish and validate these negative symptom domains in a large cohort of schizophrenia subjects (n=887) and individuals who are deemed to be Ultra-High Risk (UHR) for psychosis. Confirmatory factor analysis conducted on PANSS item domains demonstrate that the dual negative symptom domains exist in schizophrenia and UHR subjects. We further sought to examine if these negative symptom domains were associated with functioning in UHR subjects. Linear regression analyses confirmed that social amotivation predicted functioning in UHR subjects prospectively at 1 year follow up. Results suggest that the association between social amotivation and functioning is generalisable beyond schizophrenia populations to those who are at-risk of developing psychosis. Social amotivation may be an important dimensional clinical construct to be studied across a range of psychiatric conditions. Copyright © 2015 Elsevier B.V. and ECNP. All rights reserved.

  6. Smile detectors correlation

    NASA Astrophysics Data System (ADS)

    Yuksel, Kivanc; Chang, Xin; Skarbek, Władysław

    2017-08-01

    A novel smile recognition algorithm is presented based on the extraction of 68 facial salient points (fp68) using an ensemble of regression trees. The smile detector exploits a linear Support Vector Machine model. It is trained with a few hundred exemplar images by the SVM algorithm working in a 136-dimensional space. Strict statistical data analysis shows that such a geometric detector depends strongly on the geometry of the mouth opening area, measured by triangulation of the outer lip contour. To this end, two Bayesian detectors were developed and compared with the SVM detector. The first uses the mouth area in the 2D image, while the second refers to the mouth area in a 3D animated face model. The 3D modeling is based on the Candide-3 model and is performed in real time along with the three smile detectors and statistics estimators. The mouth area/Bayesian detectors exhibit high correlation with the fp68/SVM detector, in the range [0.8, 1.0], depending mainly on light conditions and individual features, with the advantage going to the 3D technique, especially in hard light conditions.

  7. Characteristics of voxel prediction power in full-brain Granger causality analysis of fMRI data

    NASA Astrophysics Data System (ADS)

    Garg, Rahul; Cecchi, Guillermo A.; Rao, A. Ravishankar

    2011-03-01

    Functional neuroimaging research is moving from the study of "activations" to the study of "interactions" among brain regions. Granger causality analysis provides a powerful technique to model spatio-temporal interactions among brain regions. We apply this technique to full-brain fMRI data without aggregating any voxel data into regions of interest (ROIs). We circumvent the problem of dimensionality using sparse regression from machine learning. On a simple finger-tapping experiment we found that (1) a small number of voxels in the brain have very high prediction power, explaining the future time course of other voxels in the brain; (2) these voxels occur in small sized clusters (of size 1-4 voxels) distributed throughout the brain; (3) albeit small, these clusters overlap with most of the clusters identified with the non-temporal General Linear Model (GLM); and (4) the method identifies clusters which, while not determined by the task and not detectable by GLM, still influence brain activity.
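
    The sparse-regression step can be sketched as follows, at toy scale and not as the authors' pipeline: each voxel's next time point is regressed on all voxels' current values with a Lasso, so only the few genuinely predictive voxels receive nonzero weights.

    ```python
    # Toy-scale sketch of the sparse-regression idea, not the authors'
    # pipeline: voxel 1's next time point is regressed on all voxels' current
    # values; the Lasso leaves nonzero weight only on the driving voxel.
    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(7)
    T, V = 200, 50                        # time points, voxels (toy sizes)
    data = rng.normal(size=(T, V))
    data[1:, 1] += 0.8 * data[:-1, 0]     # voxel 0 drives voxel 1 at lag 1

    model = Lasso(alpha=0.05).fit(data[:-1], data[1:, 1])
    print("voxels predicting voxel 1:", np.flatnonzero(model.coef_))
    ```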

  8. Prediction of Mutagenicity of Chemicals from Their Calculated Molecular Descriptors: A Case Study with Structurally Homogeneous versus Diverse Datasets.

    PubMed

    Basak, Subhash C; Majumdar, Subhabrata

    2015-01-01

    Variation in high-dimensional data is often caused by a few latent factors, and hence dimension reduction or variable selection techniques are often useful in gathering useful information from the data. In this paper we consider two such recent methods: interrelated two-way clustering and envelope models. We couple these methods with traditional statistical procedures like ridge regression and linear discriminant analysis, and apply them to two data sets which have more predictors than samples (i.e., the n < p scenario) and several types of molecular descriptors. One of these datasets consists of a congeneric group of amines while the other is a much more diverse collection of compounds. The difference in prediction results between these two datasets for both methods supports the hypothesis that for a congeneric set of compounds, descriptors of a certain type are enough to provide good QSAR models, but as the data set grows more diverse, including a variety of descriptors can improve model quality considerably.

  9. Optimizing the physical ergonomics indices for the use of partial pressure suits.

    PubMed

    Ding, Li; Li, Xianxue; Hedge, Alan; Hu, Huimin; Feathers, David; Qin, Zhifeng; Xiao, Huajun; Xue, Lihao; Zhou, Qianxiang

    2015-03-01

    This study developed an ergonomic evaluation system for the design of high-altitude partial pressure suits (PPSs). A total of twenty-one Chinese males participated in the experiment, in which three types of ergonomics indices (manipulative mission, operational reach and operational strength) were studied using a three-dimensional video-based motion capture system, a target-pointing board, a hand dynamometer, and a step-tread apparatus. In total, 36 ergonomics indices were evaluated and optimized using regression and fitting analysis. Indices that were found to be linearly related and therefore redundant were removed from the study. An optimal ergonomics index system was established that can be used to conveniently and quickly evaluate the performance of different pressurized/non-pressurized suit designs. The resulting ergonomics index system will provide a theoretical basis and practical guidance for mission planners, suit designers and engineers to design equipment for human use, and to aid in assessing partial pressure suits. Copyright © 2014 Elsevier Ltd and The Ergonomics Society. All rights reserved.

  10. Private traits and attributes are predictable from digital records of human behavior.

    PubMed

    Kosinski, Michal; Stillwell, David; Graepel, Thore

    2013-04-09

    We show that easily accessible digital records of behavior, Facebook Likes, can be used to automatically and accurately predict a range of highly sensitive personal attributes including: sexual orientation, ethnicity, religious and political views, personality traits, intelligence, happiness, use of addictive substances, parental separation, age, and gender. The analysis presented is based on a dataset of over 58,000 volunteers who provided their Facebook Likes, detailed demographic profiles, and the results of several psychometric tests. The proposed model uses dimensionality reduction for preprocessing the Likes data, which are then entered into logistic/linear regression to predict individual psychodemographic profiles from Likes. The model correctly discriminates between homosexual and heterosexual men in 88% of cases, African Americans and Caucasian Americans in 95% of cases, and between Democrat and Republican in 85% of cases. For the personality trait "Openness," prediction accuracy is close to the test-retest accuracy of a standard personality test. We give examples of associations between attributes and Likes and discuss implications for online personalization and privacy.
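
    A minimal sketch of the described pipeline, on random stand-in data rather than the study's 58,000-user matrix: truncated SVD reduces a sparse user-by-Like matrix, and logistic regression predicts a binary attribute from the components.

    ```python
    # Sketch of the described pipeline on random stand-in data: truncated SVD
    # reduces a sparse user-by-Like matrix, then logistic regression predicts
    # a binary attribute from the components.
    import numpy as np
    from scipy.sparse import random as sparse_random
    from sklearn.decomposition import TruncatedSVD
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    rng = np.random.default_rng(8)
    likes = sparse_random(1000, 5000, density=0.01, format="csr", random_state=8)
    trait = rng.integers(0, 2, size=1000)            # a binary attribute

    model = make_pipeline(TruncatedSVD(n_components=100, random_state=8),
                          LogisticRegression(max_iter=1000))
    model.fit(likes, trait)
    print("training accuracy:", model.score(likes, trait))
    ```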

  11. The microcomputer scientific software series 2: general linear model--regression.

    Treesearch

    Harold M. Rauscher

    1983-01-01

    The general linear model regression (GLMR) program provides the microcomputer user with a sophisticated regression analysis capability. The output provides a regression ANOVA table, estimators of the regression model coefficients, their confidence intervals, confidence intervals around the predicted Y-values, residuals for plotting, a check for multicollinearity, a...

  12. Rational design of binder-free noble metal/metal oxide arrays with nanocauliflower structure for wide linear range nonenzymatic glucose detection

    PubMed Central

    Li, Zhenzhen; Xin, Yanmei; Zhang, Zhonghai; Wu, Hongjun; Wang, Peng

    2015-01-01

    One-dimensional nanocomposites of a metal oxide and a noble metal are expected to present superior performance for nonenzymatic glucose detection due to the good conductivity and high catalytic activity inherited from the noble metal and the metal oxide, respectively. As a proof of concept, we synthesized a gold and copper oxide (Au/CuO) composite with a unique one-dimensional nanocauliflower structure. Due to the nature of the synthesis method, no foreign binder was needed to keep either the Au or the CuO in place. To the best of our knowledge, this is the first attempt at combining a metal oxide and a noble metal in a binder-free style for fabricating a nonenzymatic glucose sensor. The Au/CuO nanocauliflowers, with their large electrochemically active surface and high electrolyte contact area, promise wide-linear-range and highly sensitive detection of glucose with good stability and reproducibility, owing to the good electrical conductivity of Au and the high electrocatalytic activity of CuO. PMID:26068705

  13. Shape component analysis: structure-preserving dimension reduction on biological shape spaces.

    PubMed

    Lee, Hao-Chih; Liao, Tao; Zhang, Yongjie Jessica; Yang, Ge

    2016-03-01

    Quantitative shape analysis is required by a wide range of biological studies across diverse scales, ranging from molecules to cells and organisms. In particular, high-throughput and systems-level studies of biological structures and functions have started to produce large volumes of complex high-dimensional shape data. Analysis and understanding of high-dimensional biological shape data require dimension-reduction techniques. We have developed a technique for non-linear dimension reduction of 2D and 3D biological shape representations on their Riemannian spaces. A key feature of this technique is that it preserves distances between different shapes in an embedded low-dimensional shape space. We demonstrate an application of this technique by combining it with non-linear mean-shift clustering on the Riemannian spaces for unsupervised clustering of shapes of cellular organelles and proteins. Source code and data for reproducing results of this article are freely available at https://github.com/ccdlcmu/shape_component_analysis_Matlab. The implementation was made in MATLAB and is supported on MS Windows, Linux and Mac OS. geyang@andrew.cmu.edu. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  14. Experimental and computational prediction of glass transition temperature of drugs.

    PubMed

    Alzghoul, Ahmad; Alhalaweh, Amjad; Mahlin, Denny; Bergström, Christel A S

    2014-12-22

    Glass transition temperature (Tg) is an important inherent property of an amorphous solid material which is usually determined experimentally. In this study, the relation between Tg and melting temperature (Tm) was evaluated using a data set of 71 structurally diverse druglike compounds. Further, in silico models for prediction of Tg were developed based on calculated molecular descriptors and linear (multilinear regression, partial least-squares, principal component regression) and nonlinear (neural network, support vector regression) modeling techniques. The models based on Tm predicted Tg with an RMSE of 19.5 K for the test set. Among the five computational models developed herein, support vector regression gave the best result, with an RMSE of 18.7 K for the test set using only four chemical descriptors. Hence, two different models that predict Tg of drug-like molecules with high accuracy were developed. If Tm is available, a simple linear regression can be used to predict Tg. However, the results also suggest that support vector regression and calculated molecular descriptors can predict Tg with equal accuracy, even before compound synthesis.
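
    The Tm-based route amounts to a one-variable linear fit; a minimal sketch with invented (Tm, Tg) pairs:

    ```python
    # Minimal sketch of the Tm-based route with invented (Tm, Tg) pairs.
    import numpy as np
    from scipy import stats

    Tm = np.array([420.0, 450.0, 390.0, 480.0])   # melting temperatures (K)
    Tg = np.array([300.0, 320.0, 278.0, 342.0])   # glass transitions (K)

    fit = stats.linregress(Tm, Tg)
    print("predicted Tg at Tm = 440 K:", fit.slope * 440 + fit.intercept)
    ```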

  15. Multiresponse semiparametric regression for modelling the effect of regional socio-economic variables on the use of information technology

    NASA Astrophysics Data System (ADS)

    Wibowo, Wahyu; Wene, Chatrien; Budiantara, I. Nyoman; Permatasari, Erma Oktania

    2017-03-01

    Multiresponse semiparametric regression is a simultaneous-equation regression model that fuses parametric and nonparametric components. The regression model comprises several equations, each with two components, one parametric and one nonparametric. Here the parametric component is a linear function and the nonparametric component a truncated polynomial spline, so the model can handle both linear and nonlinear relationships between the responses and the set of predictor variables. The aim of this paper is to demonstrate the application of this regression model to the effect of regional socio-economic variables on the use of information technology. More specifically, the response variables are the percentage of households with internet access and the percentage of households with a personal computer, and the predictor variables are the percentages of literate people, of electrification, and of economic growth. Based on identification of the relationships between responses and predictors, economic growth is treated as the nonparametric predictor and the others as parametric predictors. The results show that multiresponse semiparametric regression applies well here, as indicated by the high coefficient of determination, 90 percent.
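
    One equation of such a model can be sketched with a truncated power spline basis, as below; the knot locations, coefficients and simulated socio-economic data are assumptions for illustration only.

    ```python
    # Sketch of one semiparametric equation: linear terms for the parametric
    # predictors plus a truncated polynomial spline in economic growth. The
    # knots, coefficients and simulated data are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(9)
    n = 150
    literacy = rng.uniform(80, 100, n)
    electrification = rng.uniform(60, 100, n)
    growth = rng.uniform(2, 9, n)
    internet = (0.4 * literacy + 0.2 * electrification
                + 3 * np.clip(growth - 5, 0, None) + rng.normal(0, 2, n))

    knots = [4.0, 6.0]
    X = np.column_stack(
        [np.ones(n), literacy, electrification, growth]
        + [np.clip(growth - k, 0, None) for k in knots])   # (growth - k)_+ terms
    coef, *_ = np.linalg.lstsq(X, internet, rcond=None)
    print(np.round(coef, 2))
    ```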

  16. Atmospheric refraction errors in laser ranging systems

    NASA Technical Reports Server (NTRS)

    Gardner, C. S.; Rowlett, J. R.

    1976-01-01

    The effects of horizontal refractivity gradients on the accuracy of laser ranging systems were investigated by ray tracing through three dimensional refractivity profiles. The profiles were generated by performing a multiple regression on measurements from seven or eight radiosondes, using a refractivity model which provided for both linear and quadratic variations in the horizontal direction. The range correction due to horizontal gradients was found to be an approximately sinusoidal function of azimuth having a minimum near 0 deg azimuth and a maximum near 180 deg azimuth. The peak to peak variation was approximately 5 centimeters at 10 deg elevation and decreased to less than 1 millimeter at 80 deg elevation.

  17. Accelerating cross-validation with total variation and its application to super-resolution imaging

    NASA Astrophysics Data System (ADS)

    Obuchi, Tomoyuki; Ikeda, Shiro; Akiyama, Kazunori; Kabashima, Yoshiyuki

    2017-12-01

    We develop an approximation formula for the cross-validation error (CVE) of a sparse linear regression penalized by ℓ_1-norm and total variation terms, which is based on a perturbative expansion utilizing the largeness of both the data dimensionality and the model. The developed formula allows us to reduce the necessary computational cost of the CVE evaluation significantly. The practicality of the formula is tested through application to simulated black-hole image reconstruction on the event-horizon scale with super resolution. The results demonstrate that our approximation reproduces the CVE values obtained via literally conducted cross-validation with reasonably good precision.

  18. Sparsity enabled cluster reduced-order models for control

    NASA Astrophysics Data System (ADS)

    Kaiser, Eurika; Morzyński, Marek; Daviller, Guillaume; Kutz, J. Nathan; Brunton, Bingni W.; Brunton, Steven L.

    2018-01-01

    Characterizing and controlling nonlinear, multi-scale phenomena are central goals in science and engineering. Cluster-based reduced-order modeling (CROM) was introduced to exploit the underlying low-dimensional dynamics of complex systems. CROM builds a data-driven discretization of the Perron-Frobenius operator, resulting in a probabilistic model for ensembles of trajectories. A key advantage of CROM is that it embeds nonlinear dynamics in a linear framework, which enables the application of standard linear techniques to the nonlinear system. CROM is typically computed on high-dimensional data; however, access to and computations on this full-state data limit the online implementation of CROM for prediction and control. Here, we address this key challenge by identifying a small subset of critical measurements to learn an efficient CROM, referred to as sparsity-enabled CROM. In particular, we leverage compressive measurements to faithfully embed the cluster geometry and preserve the probabilistic dynamics. Further, we show how to identify fewer optimized sensor locations tailored to a specific problem that outperform random measurements. Both of these sparsity-enabled sensing strategies significantly reduce the burden of data acquisition and processing for low-latency in-time estimation and control. We illustrate this unsupervised learning approach on three different high-dimensional nonlinear dynamical systems from fluids with increasing complexity, with one application in flow control. Sparsity-enabled CROM is a critical facilitator for real-time implementation on high-dimensional systems where full-state information may be inaccessible.

  19. Bayesian block-diagonal variable selection and model averaging

    PubMed Central

    Papaspiliopoulos, O.; Rossell, D.

    2018-01-01

    Summary We propose a scalable algorithmic framework for exact Bayesian variable selection and model averaging in linear models under the assumption that the Gram matrix is block-diagonal, and as a heuristic for exploring the model space for general designs. In block-diagonal designs our approach returns the most probable model of any given size without resorting to numerical integration. The algorithm also provides a novel and efficient solution to the frequentist best subset selection problem for block-diagonal designs. Posterior probabilities for any number of models are obtained by evaluating a single one-dimensional integral, and other quantities of interest such as variable inclusion probabilities and model-averaged regression estimates are obtained by an adaptive, deterministic one-dimensional numerical integration. The overall computational cost scales linearly with the number of blocks, which can be processed in parallel, and exponentially with the block size, rendering it most adequate in situations where predictors are organized in many moderately-sized blocks. For general designs, we approximate the Gram matrix by a block-diagonal matrix using spectral clustering and propose an iterative algorithm that capitalizes on the block-diagonal algorithms to explore efficiently the model space. All methods proposed in this paper are implemented in the R library mombf. PMID:29861501

  20. Manifold Learning by Preserving Distance Orders.

    PubMed

    Ataer-Cansizoglu, Esra; Akcakaya, Murat; Orhan, Umut; Erdogmus, Deniz

    2014-03-01

    Nonlinear dimensionality reduction is essential for the analysis and the interpretation of high dimensional data sets. In this manuscript, we propose a distance order preserving manifold learning algorithm that extends the basic mean-squared error cost function used mainly in multidimensional scaling (MDS)-based methods. We develop a constrained optimization problem by assuming explicit constraints on the order of distances in the low-dimensional space. In this optimization problem, as a generalization of MDS, instead of forcing a linear relationship between the distances in the high-dimensional original and low-dimensional projection space, we learn a non-decreasing relation approximated by radial basis functions. We compare the proposed method with existing manifold learning algorithms using synthetic datasets based on the commonly used residual variance and proposed percentage of violated distance orders metrics. We also perform experiments on a retinal image dataset used in Retinopathy of Prematurity (ROP) diagnosis.

  1. [Comparison of application of Cochran-Armitage trend test and linear regression analysis for rate trend analysis in epidemiology study].

    PubMed

    Wang, D Z; Wang, C; Shen, C F; Zhang, Y; Zhang, H; Song, G D; Xue, X D; Xu, Z L; Zhang, S; Jiang, G H

    2017-05-10

    We described the time trend of acute myocardial infarction (AMI) incidence rate in Tianjin from 1999 to 2013 with the Cochran-Armitage trend (CAT) test and linear regression analysis, and the results were compared. Based on the actual population, the CAT test had much stronger statistical power than linear regression analysis for both the overall incidence trend and the age-specific incidence trends (Cochran-Armitage trend P value

  2. Application of third molar development and eruption models in estimating dental age in Malay sub-adults.

    PubMed

    Mohd Yusof, Mohd Yusmiaidil Putera; Cauwels, Rita; Deschepper, Ellen; Martens, Luc

    2015-08-01

    Third molar development (TMD) has been widely utilized as a radiographic method for dental age estimation. Using the same radiograph of the same individual, third molar eruption (TME) information can be incorporated into the TMD regression model. This study aims to evaluate the performance of dental age estimation using the individual models and the combined model (TMD and TME), based on classic multiple linear and principal component regressions. A sample of 705 digital panoramic radiographs of Malay sub-adults aged between 14.1 and 23.8 years was collected. The techniques described by Gleiser and Hunt (modified by Kohler) and by Olze were employed to stage TMD and TME, respectively. The data were divided to develop three respective models based on the two regression approaches, multiple linear and principal component analysis. The trained models were then validated on the test sample and the accuracy of age prediction was compared between the models. The coefficient of determination (R²) and root mean square error (RMSE) were calculated. In both genders, adjusted R² increased in the linear regressions of the combined model as compared to the individual models. An overall decrease in RMSE was detected in the combined model as compared to TMD (0.03-0.06) and TME (0.2-0.8). In principal component regression, the combined model exhibited low adjusted R² and high RMSE except in males. Dental age is therefore better predicted using the combined model with multiple linear regression. Copyright © 2015 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.

  3. [Ultrasonic measurements of fetal thalamus, caudate nucleus and lenticular nucleus in prenatal diagnosis].

    PubMed

    Yang, Ruiqi; Wang, Fei; Zhang, Jialing; Zhu, Chonglei; Fan, Limei

    2015-05-19

    To establish reference values for the thalamus, caudate nucleus and lenticular nucleus diameters through the fetal thalamic transverse section, a total of 265 fetuses at our hospital were randomly selected from November 2012 to August 2014, and the transverse and length diameters of the thalamus, caudate nucleus and lenticular nucleus were measured. SPSS 19.0 statistical software was used to calculate the regression curves of fetal diameter changes against gestational weeks of pregnancy; P < 0.05 was considered statistically significant. The linear regression equations against gestational week (X) were: thalamic length diameter, Y = 0.051X + 0.201 (R = 0.876); thalamic transverse diameter, Y = 0.031X + 0.229 (R = 0.817); caudate nucleus head length diameter, Y = 0.033X + 0.101 (R = 0.722); caudate nucleus head transverse diameter, Y = 0.025X - 0.046 (R = 0.711); lentiform nucleus length diameter, Y = 0.046X + 0.229 (R = 0.765); and lentiform nucleus transverse diameter, Y = 0.025X - 0.05 (R = 0.772). Ultrasonic measurement of the diameters of the fetal thalamus, caudate nucleus and lenticular nucleus through the thalamic transverse section is simple and convenient; the measurements increase with gestational week, with a linear regression relationship between them.

  4. Local Linear Regression for Data with AR Errors.

    PubMed

    Li, Runze; Li, Yan

    2009-07-01

    In many statistical applications, data are collected over time, and they are likely correlated. In this paper, we investigate how to incorporate the correlation information into local linear regression. Under the assumption that the error process is an auto-regressive process, a new estimation procedure is proposed for nonparametric regression by using the local linear regression method and profile least squares techniques. We further propose the SCAD-penalized profile least squares method to determine the order of the auto-regressive process. Extensive Monte Carlo simulation studies are conducted to examine the finite sample performance of the proposed procedure, and to compare it with the existing one. In our empirical studies, the newly proposed procedures dramatically improve the accuracy of naive local linear regression with a working-independence error structure. We illustrate the proposed methodology by an analysis of a real data set.
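
    For orientation, here is the plain (working-independence) local linear smoother that the paper improves upon; the AR error correction itself is omitted, and the kernel bandwidth and toy data are arbitrary choices.

    ```python
    # The plain (working-independence) local linear smoother that the paper
    # improves on; bandwidth and toy data are arbitrary choices.
    import numpy as np

    rng = np.random.default_rng(10)
    t = np.linspace(0, 1, 200)
    y = np.sin(2 * np.pi * t) + rng.normal(0, 0.2, t.size)

    def local_linear(x0, h=0.08):
        w = np.exp(-0.5 * ((t - x0) / h) ** 2)        # Gaussian kernel weights
        X = np.column_stack([np.ones_like(t), t - x0])
        WX = X * w[:, None]
        beta = np.linalg.solve(X.T @ WX, WX.T @ y)    # weighted least squares
        return beta[0]                                # fitted value at x0

    print(local_linear(0.25), np.sin(2 * np.pi * 0.25))
    ```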

  5. Building a new predictor for multiple linear regression technique-based corrective maintenance turnaround time.

    PubMed

    Cruz, Antonio M; Barr, Cameron; Puñales-Pozo, Elsa

    2008-01-01

    This research's main goals were to build a predictor for a turnaround time (TAT) indicator, for estimating its values, and to use a numerical clustering technique for finding possible causes of undesirable TAT values. The following stages were used: domain understanding, data characterisation and sample reduction, and insight characterisation. Multiple linear regression for building the TAT indicator predictor and clustering techniques were used for improving corrective maintenance task efficiency in a clinical engineering department (CED). The indicator being studied was turnaround time (TAT). Multiple linear regression was used for building a predictive TAT value model. The variables contributing to the model were clinical engineering department response time (CE(rt), 0.415 positive coefficient), stock service response time (Stock(rt), 0.734 positive coefficient), priority level (0.21 positive coefficient) and service time (0.06 positive coefficient). The regression showed heavy reliance on Stock(rt), CE(rt) and priority, in that order. Clustering techniques revealed the main causes of high TAT values. This examination has provided a means for analysing current technical service quality and effectiveness. In doing so, it has demonstrated a process for identifying areas and methods of improvement and a model against which to analyse these methods' effectiveness.

  6. Orthogonal Regression: A Teaching Perspective

    ERIC Educational Resources Information Center

    Carr, James R.

    2012-01-01

    A well-known approach to linear least squares regression is that which involves minimizing the sum of squared orthogonal projections of data points onto the best fit line. This form of regression is known as orthogonal regression, and the linear model that it yields is known as the major axis. A similar method, reduced major axis regression, is…
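
    A compact way to compute the major axis is through the leading principal direction of the centered data; the sketch below uses the SVD and invented data.

    ```python
    # Major-axis (orthogonal) regression via the leading principal direction
    # of the centered data; both variables carry noise. Data are invented.
    import numpy as np

    rng = np.random.default_rng(11)
    x = rng.normal(size=100)
    y = 2.0 * x + rng.normal(0, 0.5, 100)

    pts = np.column_stack([x, y]) - [x.mean(), y.mean()]
    _, _, Vt = np.linalg.svd(pts, full_matrices=False)
    dx, dy = Vt[0]                    # direction minimizing orthogonal distances
    slope = dy / dx
    print("major-axis slope:", slope, "intercept:", y.mean() - slope * x.mean())
    ```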

  7. A three-dimensional spatial mapping approach to quantify fine-scale heterogeneity among leaves within canopies

    PubMed Central

    Wingfield, Jenna L.; Ruane, Lauren G.; Patterson, Joshua D.

    2017-01-01

    Premise of the study: The three-dimensional structure of tree canopies creates environmental heterogeneity, which can differentially influence the chemistry, morphology, physiology, and/or phenology of leaves. Previous studies that subdivide canopy leaves into broad categories (i.e., “upper/lower”) fail to capture the differences in microenvironments experienced by leaves throughout the three-dimensional space of a canopy. Methods: We use a three-dimensional spatial mapping approach based on spherical polar coordinates to examine the fine-scale spatial distributions of photosynthetically active radiation (PAR) and the concentration of ultraviolet (UV)-absorbing compounds (A300) among leaves within the canopies of black mangroves (Avicennia germinans). Results: Linear regressions revealed that interior leaves received less PAR and produced fewer UV-absorbing compounds than leaves on the exterior of the canopy. By allocating more UV-absorbing compounds to the leaves on the exterior of the canopy, black mangroves may be maximizing UV-protection while minimizing biosynthesis of UV-absorbing compounds. Discussion: Three-dimensional spatial mapping provides an inexpensive and portable method to detect fine-scale differences in environmental and biological traits within canopies. We used it to understand the relationship between PAR and A300, but the same approach can also be used to identify traits associated with the spatial distribution of herbivores, pollinators, and pathogens. PMID:29188145

  8. A preliminary study of the thermal measurement with nMAG gel dosimeter by MRI

    NASA Astrophysics Data System (ADS)

    Chuang, Chun-Chao; Shao, Chia-Ho; Shih, Cheng-Ting; Yeh, Yu-Chen; Lu, Cheng-Chang; Chuang, Keh-Shih; Wu, Jay

    2014-11-01

    The methacrylic acid (nMAG) gel dosimeter is an effective tool for 3-dimensional quality assurance of radiation therapy. In addition to radiation-induced polymerization effects, the nMAG gel also responds to temperature variation. In this study, we propose a new method to evaluate the thermal response in thermal therapy using nMAG gel and magnetic resonance imaging (MRI) scans. Several properties of nMAG were investigated, including the R2 relaxation rate, temperature sensitivity, and linearity of the thermal dose response. nMAG was heated by the double-boiling method in the range of 37-45 °C. MRI scans were performed with the head coil receiver. The temperature to R2 response curve was analyzed and simple linear regression was performed, with an R-square value of 0.9835. The measured data showed a well-defined inverse linear relationship between R2 and temperature. We conclude that the nMAG polymer gel dosimeter shows great potential as a technique to evaluate the temperature rise during thermal surgery.

  9. Practical Session: Simple Linear Regression

    NASA Astrophysics Data System (ADS)

    Clausel, M.; Grégoire, G.

    2014-12-01

    Two exercises are proposed to illustrate simple linear regression. The first one is based on the famous Galton data set on heredity. We use the lm R command and get coefficient estimates, the standard error of the error, R2, residuals, and so on. In the second example, devoted to data related to the vapor tension of mercury, we fit a simple linear regression, predict values, and anticipate multiple linear regression. This practical session is an excerpt from practical exercises proposed by A. Dalalyan at ENPC (see Exercises 1 and 2 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_4.pdf).
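
    A Python analogue of the session's lm() workflow (the original uses R) might look like the following; the parent/child heights are invented stand-ins for Galton's data.

    ```python
    # Python analogue of the lm() fit; heights are invented stand-ins.
    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.DataFrame({
        "parent": [64.5, 65.5, 66.5, 67.5, 68.5, 69.5, 70.5, 71.5],
        "child":  [65.8, 66.7, 67.2, 67.6, 68.2, 68.9, 69.5, 69.9],
    })
    fit = smf.ols("child ~ parent", data=df).fit()
    print(fit.params)      # intercept and slope
    print(fit.bse)         # standard errors
    print(fit.rsquared)    # R^2
    print(fit.resid.values)
    ```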

  10. Multigrid approaches to non-linear diffusion problems on unstructured meshes

    NASA Technical Reports Server (NTRS)

    Mavriplis, Dimitri J.; Bushnell, Dennis M. (Technical Monitor)

    2001-01-01

    The efficiency of three multigrid methods for solving highly non-linear diffusion problems on two-dimensional unstructured meshes is examined. The three multigrid methods differ mainly in the manner in which the nonlinearities of the governing equations are handled. These comprise a non-linear full approximation storage (FAS) multigrid method which is used to solve the non-linear equations directly, a linear multigrid method which is used to solve the linear system arising from a Newton linearization of the non-linear system, and a hybrid scheme which is based on a non-linear FAS multigrid scheme, but employs a linear solver on each level as a smoother. Results indicate that all methods are equally effective at converging the non-linear residual in a given number of grid sweeps, but that the linear solver is more efficient in CPU time due to the lower cost of linear versus non-linear grid sweeps.

  11. Revealing metabolite biomarkers for acupuncture treatment by linear programming based feature selection.

    PubMed

    Wang, Yong; Wu, Qiao-Feng; Chen, Chen; Wu, Ling-Yun; Yan, Xian-Zhong; Yu, Shu-Guang; Zhang, Xiang-Sun; Liang, Fan-Rong

    2012-01-01

    Acupuncture has been practiced in China for thousands of years as part of Traditional Chinese Medicine (TCM) and has gradually been accepted in western countries as an alternative or complementary treatment. However, the underlying mechanism of acupuncture, and especially whether there exists any difference between various acupoints, remains largely unknown, which hinders its widespread use. In this study, we develop a novel Linear Programming based Feature Selection method (LPFS) to understand the mechanism of the acupuncture effect, at the molecular level, by revealing the metabolite biomarkers for acupuncture treatment. Specifically, we generate and investigate the high-throughput metabolic profiles of acupuncture treatment at several acupoints in humans. To select the subsets of metabolites that best characterize the acupuncture effect for each meridian point, an optimization model is proposed to identify biomarkers from high-dimensional metabolic data from case and control samples. Importantly, we use the nearest centroid as the prototype to simultaneously minimize the number of selected features and the leave-one-out cross-validation error of the classifier. We compared the performance of LPFS to several state-of-the-art methods, such as SVM recursive feature elimination (SVM-RFE) and the sparse multinomial logistic regression approach (SMLR). We find that our LPFS method tends to reveal a small set of metabolites with small standard deviations and large shifts, which exactly meets our requirements for a good biomarker. Biologically, several metabolite biomarkers for acupuncture treatment are revealed and serve as candidates for further mechanism investigation. Also, biomarkers derived from five meridian points, Zusanli (ST36), Liangmen (ST21), Juliao (ST3), Yanglingquan (GB34), and Weizhong (BL40), are compared for their similarity and difference, providing evidence for the specificity of acupoints. Our results demonstrate that metabolic profiling might be a promising method to investigate the molecular mechanism of acupuncture. Compared with other existing methods, LPFS shows better performance in selecting a small set of key molecules. In addition, LPFS is a general methodology and can be applied to other high-dimensional data analyses, for example cancer genomics.

  12. Revealing metabolite biomarkers for acupuncture treatment by linear programming based feature selection

    PubMed Central

    2012-01-01

    Background Acupuncture has been practiced in China for thousands of years as part of Traditional Chinese Medicine (TCM) and has gradually been accepted in western countries as an alternative or complementary treatment. However, the underlying mechanism of acupuncture, and especially whether there exists any difference between various acupoints, remains largely unknown, which hinders its widespread use. Results In this study, we develop a novel Linear Programming based Feature Selection method (LPFS) to understand the mechanism of the acupuncture effect, at the molecular level, by revealing the metabolite biomarkers for acupuncture treatment. Specifically, we generate and investigate the high-throughput metabolic profiles of acupuncture treatment at several acupoints in humans. To select the subsets of metabolites that best characterize the acupuncture effect for each meridian point, an optimization model is proposed to identify biomarkers from high-dimensional metabolic data from case and control samples. Importantly, we use the nearest centroid as the prototype to simultaneously minimize the number of selected features and the leave-one-out cross-validation error of the classifier. We compared the performance of LPFS to several state-of-the-art methods, such as SVM recursive feature elimination (SVM-RFE) and the sparse multinomial logistic regression approach (SMLR). We find that our LPFS method tends to reveal a small set of metabolites with small standard deviations and large shifts, which exactly meets our requirements for a good biomarker. Biologically, several metabolite biomarkers for acupuncture treatment are revealed and serve as candidates for further mechanism investigation. Also, biomarkers derived from five meridian points, Zusanli (ST36), Liangmen (ST21), Juliao (ST3), Yanglingquan (GB34), and Weizhong (BL40), are compared for their similarity and difference, providing evidence for the specificity of acupoints. Conclusions Our results demonstrate that metabolic profiling might be a promising method to investigate the molecular mechanism of acupuncture. Compared with other existing methods, LPFS shows better performance in selecting a small set of key molecules. In addition, LPFS is a general methodology and can be applied to other high-dimensional data analyses, for example cancer genomics. PMID:23046877

  13. Regularization Methods for High-Dimensional Instrumental Variables Regression With an Application to Genetical Genomics

    PubMed Central

    Lin, Wei; Feng, Rui; Li, Hongzhe

    2014-01-01

    In genetical genomics studies, it is important to jointly analyze gene expression data and genetic variants in exploring their associations with complex traits, where the dimensionality of gene expressions and genetic variants can both be much larger than the sample size. Motivated by such modern applications, we consider the problem of variable selection and estimation in high-dimensional sparse instrumental variables models. To overcome the difficulty of high dimensionality and unknown optimal instruments, we propose a two-stage regularization framework for identifying and estimating important covariate effects while selecting and estimating optimal instruments. The methodology extends the classical two-stage least squares estimator to high dimensions by exploiting sparsity using sparsity-inducing penalty functions in both stages. The resulting procedure is efficiently implemented by coordinate descent optimization. For the representative L1 regularization and a class of concave regularization methods, we establish estimation, prediction, and model selection properties of the two-stage regularized estimators in the high-dimensional setting where the dimensionality of covariates and instruments are both allowed to grow exponentially with the sample size. The practical performance of the proposed method is evaluated by simulation studies and its usefulness is illustrated by an analysis of mouse obesity data. Supplementary materials for this article are available online. PMID:26392642

  14. Biomechanical factors associated with mandibular cantilevers: analysis with three-dimensional finite element models.

    PubMed

    Gonda, Tomoya; Yasuda, Daiisa; Ikebe, Kazunori; Maeda, Yoshinobu

    2014-01-01

    Although the risks of using a cantilever to treat missing teeth have been described, the mechanisms remain unclear. This study aimed to reveal these mechanisms from a biomechanical perspective. The effects of various implant sites, number of implants, and superstructural connections on stress distribution in the marginal bone were analyzed with three-dimensional finite element models based on mandibular computed tomography data. Forces from the masseter, temporalis, and internal pterygoid were applied as vectors. Two three-dimensional finite element models were created with the edentulous mandible showing severe and relatively modest residual ridge resorption. Cantilevers of the premolar and molar were simulated in the superstructures in the models. The following conditions were also included as factors in the models to investigate changes: poor bone quality, shortened dental arch, posterior occlusion, lateral occlusion, double force of the masseter, and short implant. Multiple linear regression analysis with a forced-entry method was performed with stress values as the objective variable and the factors as the explanatory variable. When bone mass was high, stress around the implant caused by differences in implantation sites was reduced. When bone mass was low, the presence of a cantilever was a possible risk factor. The stress around the implant increased significantly if bone quality was poor or if increased force (eg, bruxism) was applied. The addition of a cantilever to the superstructure increased stress around implants. When large muscle forces were applied to a superstructure with cantilevers or if bone quality was poor, stress around the implants increased.

  15. A novel strategy for forensic age prediction by DNA methylation and support vector regression model

    PubMed Central

    Xu, Cheng; Qu, Hongzhu; Wang, Guangyu; Xie, Bingbing; Shi, Yi; Yang, Yaran; Zhao, Zhao; Hu, Lan; Fang, Xiangdong; Yan, Jiangwei; Feng, Lei

    2015-01-01

    High deviations resulting from the prediction model, gender and population differences have limited the application of DNA methylation markers to age estimation. Here we identified 2,957 novel age-associated DNA methylation sites (P < 0.01 and R2 > 0.5) in the blood of eight pairs of Chinese Han female monozygotic twins. Among them, nine novel sites (false discovery rate < 0.01), along with three other reported sites, were further validated in 49 unrelated female volunteers aged 20-80 years by Sequenom MassARRAY. A total of 95 CpGs were covered in the PCR products, and 11 of them were used to build the age prediction models. After comparing four different models, multivariate linear regression, multivariate nonlinear regression, back propagation neural network and support vector regression (SVR), SVR was identified as the most robust model, with the least mean absolute deviation from real chronological age (2.8 years) and an average accuracy of 4.7 years predicted by only six of the 11 loci, as well as a smaller cross-validation error compared with the linear regression model. Our novel strategy provides an accurate measurement that is highly useful for estimating individual age in forensic practice as well as for tracking the aging process in other related applications. PMID:26635134

  16. Poisson Mixture Regression Models for Heart Disease Prediction.

    PubMed

    Mufudza, Chipo; Erol, Hamza

    2016-01-01

    Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model-based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models are addressed here under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary generalized linear Poisson regression model, due to its low Bayesian Information Criterion value. Furthermore, a zero-inflated Poisson mixture regression model turned out to be the best model for heart disease prediction overall, as it both clusters individuals into high- or low-risk categories and predicts the rate of heart disease componentwise given the available clusters. It is deduced that heart disease prediction can be done effectively by identifying the major risks componentwise using a Poisson mixture regression model.
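
    As a point of reference, the ordinary Poisson regression baseline that the mixture models are compared against can be sketched as follows (fitting the mixtures themselves requires EM and is omitted); the counts and covariates are simulated.

    ```python
    # Ordinary Poisson regression baseline (the mixture models above need an
    # EM fit, omitted here); counts and covariates are simulated.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(12)
    age = rng.uniform(30, 80, 500)
    smoker = rng.integers(0, 2, 500)
    events = rng.poisson(np.exp(-4 + 0.05 * age + 0.7 * smoker))

    X = sm.add_constant(np.column_stack([age, smoker]))
    fit = sm.GLM(events, X, family=sm.families.Poisson()).fit()
    print(fit.params)   # estimates near (-4, 0.05, 0.7)
    ```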

  17. Poisson Mixture Regression Models for Heart Disease Prediction

    PubMed Central

    Erol, Hamza

    2016-01-01

    Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model-based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models are addressed here under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary generalized linear Poisson regression model, due to its low Bayesian Information Criterion value. Furthermore, a zero-inflated Poisson mixture regression model turned out to be the best model for heart disease prediction overall, as it both clusters individuals into high- or low-risk categories and predicts the rate of heart disease componentwise given the available clusters. It is deduced that heart disease prediction can be done effectively by identifying the major risks componentwise using a Poisson mixture regression model. PMID:27999611

  18. Morse Code, Scrabble, and the Alphabet

    ERIC Educational Resources Information Center

    Richardson, Mary; Gabrosek, John; Reischman, Diann; Curtiss, Phyliss

    2004-01-01

    In this paper we describe an interactive activity that illustrates simple linear regression. Students collect data and analyze it using simple linear regression techniques taught in an introductory applied statistics course. The activity is extended to illustrate checks for regression assumptions and regression diagnostics taught in an…

  19. Prediction system of hydroponic plant growth and development using algorithm Fuzzy Mamdani method

    NASA Astrophysics Data System (ADS)

    Sudana, I. Made; Purnawirawan, Okta; Arief, Ulfa Mediaty

    2017-03-01

    Hydroponics is a method of farming without soil. One of the hydroponic plants is watercress (Nasturtium officinale). The development and growth of hydroponic watercress are influenced by nutrient levels, acidity and temperature. These independent variables can be used as the input variables of a system that predicts the level of plant growth and development. The prediction system uses the Mamdani fuzzy inference method. The system was built to implement a fuzzy inference system (FIS) as part of the Fuzzy Logic Toolbox (FLT) in MATLAB R2007b. An FIS is a computing system that works on the principle of fuzzy reasoning, which is similar to human reasoning. Basically, an FIS consists of four units: a fuzzification unit, a fuzzy logic reasoning unit, a knowledge base unit and a defuzzification unit. In addition, the effect of the independent variables on plant growth and development can be visualized with the three-dimensional surface diagram of the FIS output, and statistical tests based on data from the prediction system were carried out using multiple linear regression, including multiple linear regression analysis, t tests, F tests, the coefficient of determination and predictor contributions, calculated with the SPSS (Statistical Product and Service Solutions) software.
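
    The Mamdani pipeline described here (fuzzification, rule-based reasoning, defuzzification) can also be sketched outside MATLAB; below is a minimal Python version using the scikit-fuzzy package. The universes, membership functions and rules are invented placeholders, not the paper's calibrated system.

    ```python
    # Toy Mamdani FIS predicting a growth score from acidity and temperature.
    import numpy as np
    import skfuzzy as fuzz
    from skfuzzy import control as ctrl

    ph = ctrl.Antecedent(np.arange(5.0, 8.01, 0.01), "acidity")
    temp = ctrl.Antecedent(np.arange(10.0, 35.1, 0.1), "temperature")
    growth = ctrl.Consequent(np.arange(0.0, 101.0, 1.0), "growth")

    ph["low"] = fuzz.trimf(ph.universe, [5.0, 5.0, 6.5])
    ph["optimal"] = fuzz.trimf(ph.universe, [5.5, 6.5, 7.5])
    ph["high"] = fuzz.trimf(ph.universe, [6.5, 8.0, 8.0])
    temp["cold"] = fuzz.trimf(temp.universe, [10, 10, 20])
    temp["ideal"] = fuzz.trimf(temp.universe, [15, 22, 29])
    temp["hot"] = fuzz.trimf(temp.universe, [24, 35, 35])
    growth["poor"] = fuzz.trimf(growth.universe, [0, 0, 50])
    growth["good"] = fuzz.trimf(growth.universe, [40, 100, 100])

    rules = [ctrl.Rule(ph["optimal"] & temp["ideal"], growth["good"]),
             ctrl.Rule(ph["low"] | ph["high"] | temp["cold"] | temp["hot"],
                       growth["poor"])]
    sim = ctrl.ControlSystemSimulation(ctrl.ControlSystem(rules))
    sim.input["acidity"], sim.input["temperature"] = 6.4, 22.0
    sim.compute()                  # fuzzify -> reason -> defuzzify (centroid)
    print(sim.output["growth"])
    ```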

  20. Bayesian Travel Time Inversion adopting Gaussian Process Regression

    NASA Astrophysics Data System (ADS)

    Mauerberger, S.; Holschneider, M.

    2017-12-01

    A major application in seismology is the determination of seismic velocity models. Travel time measurements put an integral constraint on the velocity between source and receiver. We provide insight into travel time inversion from a correlation-based Bayesian point of view. To that end, the concept of Gaussian process regression is adopted to estimate a velocity model. The non-linear travel time integral is approximated by a first-order Taylor expansion. A heuristic covariance describes correlations amongst observations and the a priori model. This approach enables us to assess a proxy of the Bayesian posterior distribution at ordinary computational cost. Neither multi-dimensional numerical integration nor excessive sampling is necessary. Instead of stacking the data, we suggest building the posterior distribution progressively. Incorporating only a single piece of evidence at a time accounts for the deficit introduced by linearization. As a result, the most probable model is given by the posterior mean, whereas uncertainties are described by the posterior covariance. As a proof of concept, a synthetic, purely 1D model is addressed. A single source accompanied by multiple receivers is considered on top of a model comprising a discontinuity. We consider travel times of both phases, the direct and the reflected wave, corrupted by noise. The regions left and right of the interface are assumed independent, with the squared exponential kernel serving as covariance.
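
    For readers less familiar with the regression machinery, the sketch below runs Gaussian process regression with a squared exponential (RBF) kernel plus a noise term in scikit-learn. The one-dimensional toy data are not seismic observations, and the kernel settings are assumptions.

    ```python
    # GP regression with squared exponential covariance: the posterior mean
    # plays the role of the most probable model, and the posterior standard
    # deviation quantifies the uncertainty.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    rng = np.random.default_rng(2)
    X = rng.uniform(0, 10, 25)[:, None]             # observation positions
    y = np.sin(X).ravel() + rng.normal(0, 0.1, 25)  # noisy observations

    kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.01)
    gpr = GaussianProcessRegressor(kernel=kernel).fit(X, y)

    grid = np.linspace(0, 10, 200)[:, None]
    mean, std = gpr.predict(grid, return_std=True)  # posterior mean and std
    ```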

  1. Study on longitudinal dispersion relation in one-dimensional relativistic plasma: Linear theory and Vlasov simulation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, H.; Wu, S. Z.; Zhou, C. T.

    2013-09-15

    The dispersion relation of one-dimensional longitudinal plasma waves in relativistic homogeneous plasmas is investigated with both linear theory and Vlasov simulation in this paper. From the Vlasov-Poisson equations, the linear dispersion relation is derived for the proper one-dimensional Jüttner distribution. The numerically obtained linear dispersion relation, as well as an approximate formula for the plasma wave frequency in the long-wavelength limit, is given. The dispersion of the longitudinal wave is also simulated with a relativistic Vlasov code. The real and imaginary parts of the dispersion relation are studied in detail by varying the wave number and the plasma temperature. Simulation results are in agreement with established linear theory.

  2. Numerical approximation for the infinite-dimensional discrete-time optimal linear-quadratic regulator problem

    NASA Technical Reports Server (NTRS)

    Gibson, J. S.; Rosen, I. G.

    1986-01-01

    An abstract approximation framework is developed for the finite and infinite time horizon discrete-time linear-quadratic regulator problem for systems whose state dynamics are described by a linear semigroup of operators on an infinite-dimensional Hilbert space. The schemes included in the framework yield finite-dimensional approximations to the linear state feedback gains which determine the optimal control law. Convergence arguments are given. Examples involving hereditary and parabolic systems and the vibration of a flexible beam are considered. Spline-based finite element schemes for these classes of problems, together with numerical results, are presented and discussed.
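
    In any of the finite-dimensional approximations, the resulting problem is a standard discrete-time LQR problem whose steady-state solution follows from the discrete algebraic Riccati equation. The sketch below solves such a problem with SciPy on toy matrices; it is not tied to the paper's semigroup setting.

    ```python
    # Discrete-time LQR: solve the discrete algebraic Riccati equation and
    # form the state-feedback gain, so the optimal control is u_k = -K x_k.
    import numpy as np
    from scipy.linalg import solve_discrete_are

    A = np.array([[1.0, 0.10],
                  [0.0, 0.98]])        # approximated state dynamics (toy)
    B = np.array([[0.0], [0.1]])       # approximated input operator (toy)
    Q = np.eye(2)                      # state weighting
    R = np.array([[0.5]])              # control weighting

    P = solve_discrete_are(A, B, Q, R)
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    print("optimal feedback gain K:", K)
    ```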

  3. Application of Linear Discriminant Analysis in Dimensionality Reduction for Hand Motion Classification

    NASA Astrophysics Data System (ADS)

    Phinyomark, A.; Hu, H.; Phukpattaranont, P.; Limsakul, C.

    2012-01-01

    The classification of upper-limb movements based on surface electromyography (EMG) signals is an important issue in the control of assistive devices and rehabilitation systems. Increasing the number of EMG channels and features in order to increase the number of control commands can yield a high-dimensional feature vector. To cope with the accuracy and computation problems associated with high dimensionality, it is commonplace to apply a processing step that transforms the data to a space of significantly lower dimension with only a limited loss of useful information. Linear discriminant analysis (LDA) has been successfully applied as an EMG feature projection method. Recently, a number of extended LDA-based algorithms have been proposed, which are more competitive in terms of both classification accuracy and computational cost/time than classical LDA. This paper presents the findings of a comparative study of classical LDA and five extended LDA methods. In a quantitative comparison based on seven multi-feature sets, three extended LDA-based algorithms, namely uncorrelated LDA, orthogonal LDA and orthogonal fuzzy neighborhood discriminant analysis, produce better class separability when compared with a baseline system (without feature projection), principal component analysis (PCA) and classical LDA. Based on seven-dimensional time-domain and time-scale feature vectors, these methods achieved classification accuracies of 95.2% and 93.2%, respectively, using a linear discriminant classifier.
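
    As a reference point for the kind of pipeline compared above, the sketch below projects synthetic high-dimensional features with classical LDA and classifies in the reduced space using scikit-learn; the data are random stand-ins, not EMG recordings.

    ```python
    # Classical LDA as feature projection (to at most n_classes - 1 = 5
    # dimensions) followed by a linear discriminant classifier.
    from sklearn.datasets import make_classification
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline

    X, y = make_classification(n_samples=600, n_features=40, n_informative=10,
                               n_classes=6, n_clusters_per_class=1,
                               random_state=0)

    pipe = make_pipeline(LinearDiscriminantAnalysis(n_components=5),  # project
                         LinearDiscriminantAnalysis())                # classify
    print(cross_val_score(pipe, X, y, cv=5).mean())
    ```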

  4. A hybrid-stress finite element approach for stress and vibration analysis in linear anisotropic elasticity

    NASA Technical Reports Server (NTRS)

    Oden, J. Tinsley; Fly, Gerald W.; Mahadevan, L.

    1987-01-01

    A hybrid stress finite element method is developed for accurate stress and vibration analysis of problems in linear anisotropic elasticity. A modified form of the Hellinger-Reissner principle is formulated for dynamic analysis, and an algorithm for the determination of the anisotropic elastic and compliance constants from experimental data is developed. These schemes were implemented in a finite element program for static and dynamic analysis of linear anisotropic two-dimensional elasticity problems. Specific numerical examples are considered to verify the accuracy of the hybrid stress approach and compare it with that of the standard displacement method, especially for highly anisotropic materials. It is shown that the hybrid stress approach gives much better results than the displacement method. Preliminary work on extensions of this method to three-dimensional elasticity is discussed, and the stress shape functions necessary for this extension are included.

  5. A one-dimensional nonlinear problem of thermoelasticity in extended thermodynamics

    NASA Astrophysics Data System (ADS)

    Rawy, E. K.

    2018-06-01

    We solve a nonlinear, one-dimensional initial boundary-value problem of thermoelasticity in generalized thermodynamics. A Cattaneo-type evolution equation for the heat flux is used, which differs from the one used extensively in the literature. The hyperbolic nature of the associated linear system is clarified through a study of the characteristic curves. Progressive wave solutions with two finite speeds are noted. A numerical treatment is presented for the nonlinear system using a three-step, quasi-linearization, iterative finite-difference scheme for which the linear system of equations is the initial step in the iteration. The obtained results are discussed in detail. They clearly show the hyperbolic nature of the system, and may be of interest in investigating thermoelastic materials, not only at low temperatures, but also during high temperature processes involving rapid changes in temperature as in laser treatment of surfaces.

  6. The increase in symptoms of anxiety and depressed mood among Icelandic adolescents: time trend between 2006 and 2016.

    PubMed

    Thorisdottir, Ingibjorg E; Asgeirsdottir, Bryndis B; Sigurvinsdottir, Rannveig; Allegrante, John P; Sigfusdottir, Inga D

    2017-10-01

    Both research and popular media reports suggest that adolescent mental health has been deteriorating across societies with advanced economies. This study sought to describe the trends in self-reported symptoms of depressed mood and anxiety among Icelandic adolescents. Data for this study come from repeated, cross-sectional, population-based school surveys of 43 482 Icelandic adolescents in 9th and 10th grade, with six waves of pooled data from 2006 to 2016. We used analysis of variance, linear regression and binomial logistic regression to examine trends in symptom scores of anxiety and depressed mood over time. Gender differences in trends of high symptoms were also tested for interactions. Linear regression analysis showed a significant linear increase over the course of the study period in mean symptoms of anxiety and depressed mood for girls only; however, symptoms of anxiety among boys decreased. The proportion of adolescents reporting high depressive symptoms increased by 1.6% for boys and 6.8% for girls; the proportion of those reporting high anxiety symptoms increased by 1.3% for boys and 8.6% for girls. Over the study period, the odds for reporting high depressive symptoms and high anxiety symptoms were significantly higher for both genders. Girls were more likely to report high symptoms of anxiety and depressed mood than boys. Self-reported symptoms of anxiety and depressed mood have increased over time among Icelandic adolescents. Our findings suggest that future research needs to look beyond mean changes and examine the trends among those adolescents who report high symptoms of emotional distress. © The Author 2017. Published by Oxford University Press on behalf of the European Public Health Association. All rights reserved.

  7. Advanced statistics: linear regression, part II: multiple linear regression.

    PubMed

    Marill, Keith A

    2004-01-01

    The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.
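
    Several of the concepts reviewed here (multiple predictors, interaction effects and multicollinearity diagnostics) can be made concrete with a few lines of statsmodels; the clinical-sounding variables below are invented for the example.

    ```python
    # Multiple linear regression with an interaction term, plus variance
    # inflation factors (VIF) as a multicollinearity check.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf
    from statsmodels.stats.outliers_influence import variance_inflation_factor

    rng = np.random.default_rng(3)
    n = 200
    df = pd.DataFrame({"age": rng.uniform(20, 80, n),
                       "dose": rng.uniform(0, 10, n)})
    df["outcome"] = (0.5 * df.age + 2.0 * df.dose
                     + 0.05 * df.age * df.dose + rng.normal(0, 5, n))

    fit = smf.ols("outcome ~ age * dose", data=df).fit()
    print(fit.summary())             # coefficients with exact 95% CIs

    exog = fit.model.exog            # design matrix including the intercept
    print([variance_inflation_factor(exog, i) for i in range(1, exog.shape[1])])
    ```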

  8. Reversed inverse regression for the univariate linear calibration and its statistical properties derived using a new methodology

    NASA Astrophysics Data System (ADS)

    Kang, Pilsang; Koo, Changhoi; Roh, Hokyu

    2017-11-01

    Since simple linear regression theory was established at the beginning of the 1900s, it has been used in a variety of fields. Unfortunately, it cannot be used directly for calibration. In practical calibrations, the observed measurements (the inputs) are subject to errors, and hence they vary, thus violating the assumption that the inputs are fixed. Therefore, in the case of calibration, the regression line fitted using the method of least squares is not consistent with the statistical properties of simple linear regression as already established based on this assumption. To resolve this problem, "classical regression" and "inverse regression" have been proposed. However, they do not completely resolve the problem. As a fundamental solution, we introduce "reversed inverse regression" along with a new methodology for deriving its statistical properties. In this study, the statistical properties of this regression are derived using the "error propagation rule" and the "method of simultaneous error equations" and are compared with those of the existing regression approaches. The accuracy of the statistical properties thus derived is investigated in a simulation study. We conclude that the newly proposed regression and methodology constitute the complete regression approach for univariate linear calibrations.

  9. Optical probing of the metal-to-insulator transition in a two-dimensional high-mobility electron gas

    NASA Astrophysics Data System (ADS)

    Dionigi, F.; Rossella, F.; Bellani, V.; Amado, M.; Diez, E.; Kowalik, K.; Biasiol, G.; Sorba, L.

    2011-06-01

    We study the quantum Hall liquid and the metal-insulator transition in a high-mobility two-dimensional electron gas, by means of photoluminescence and magnetotransport measurements. In the integer and fractional regime at ν>1/3, by analyzing the emission energy dispersion we probe the magneto-Coulomb screening and the hidden symmetry of the electron liquid. In the fractional regime above ν=1/3, the system undergoes metal-to-insulator transition, and in the insulating phase the dispersion becomes linear with evidence of an increased renormalized mass.

  10. A comparison of methods for the analysis of binomial clustered outcomes in behavioral research.

    PubMed

    Ferrari, Alberto; Comelli, Mario

    2016-12-01

    In behavioral research, data consisting of a per-subject proportion of "successes" and "failures" over a finite number of trials often arise. These clustered binary data are usually non-normally distributed, which can distort inference if the usual general linear model is applied and the sample size is small. A number of more advanced methods are available, but they are often technically challenging, and a comparative assessment of their performance in behavioral setups has not been performed. We studied the performance of several methods applicable to the analysis of proportions, namely linear regression, Poisson regression, beta-binomial regression and generalized linear mixed models (GLMMs). We report on a simulation study evaluating the power and Type I error rate of these models in hypothetical scenarios met by behavioral researchers, and we describe results from the application of these methods to data from real experiments. Our results show that, while GLMMs are powerful instruments for the analysis of clustered binary outcomes, beta-binomial regression can outperform them in a range of scenarios. Linear regression gave results consistent with the nominal level of significance, but was overall less powerful. Poisson regression, instead, mostly led to anticonservative inference. GLMMs and beta-binomial regression are generally more powerful than linear regression; yet linear regression is robust to model misspecification in some conditions, whereas Poisson regression suffers heavily from violations of the assumptions when used to model proportion data. We conclude by providing directions for behavioral scientists dealing with clustered binary data and small sample sizes. Copyright © 2016 Elsevier B.V. All rights reserved.

  11. OPLS statistical model versus linear regression to assess sonographic predictors of stroke prognosis.

    PubMed

    Vajargah, Kianoush Fathi; Sadeghi-Bazargani, Homayoun; Mehdizadeh-Esfanjani, Robab; Savadi-Oskouei, Daryoush; Farhoudi, Mehdi

    2012-01-01

    The objective of the present study was to assess the applicability of the orthogonal projections to latent structures (OPLS) statistical model vs traditional linear regression in investigating the role of transcranial Doppler (TCD) sonography in predicting ischemic stroke prognosis. The study was conducted on 116 ischemic stroke patients admitted to a specialty neurology ward. The Unified Neurological Stroke Scale was used once for clinical evaluation during the first week of admission and again six months later. All data were first analyzed using simple linear regression and then subjected to multivariate analysis using PLS/OPLS models through the SIMCA P+12 statistical software package. The linear regression results used for the identification of TCD predictors of stroke prognosis were confirmed through the OPLS modeling technique. Moreover, in comparison to linear regression, the OPLS model appeared to have higher sensitivity in detecting the predictors of ischemic stroke prognosis and detected several more predictors. Applying the OPLS model made it possible to use both single TCD measures/indicators and arbitrarily dichotomized measures of TCD single-vessel involvement, as well as the overall TCD result. In conclusion, the authors recommend PLS/OPLS methods as complementary rather than alternative to the available classical regression models such as linear regression.

  12. Mapping High Dimensional Sparse Customer Requirements into Product Configurations

    NASA Astrophysics Data System (ADS)

    Jiao, Yao; Yang, Yu; Zhang, Hongshan

    2017-10-01

    Mapping customer requirements into product configurations is a crucial step in product design; however, customers express their needs ambiguously and locally owing to a lack of domain knowledge. The data mining of customer requirements can therefore yield fragmentary information with high-dimensional sparsity, exposing the mapping procedure to uncertainty and complexity. Expert judgment is widely applied in this setting, since it imposes no formal requirements of systematic or structured data; however, there are concerns about its repeatability and bias. In this study, an integrated method combining an adjusted locally linear embedding (LLE) with a naïve Bayes (NB) classifier is proposed to map high-dimensional sparse customer requirements to product configurations. The integrated method adjusts classical LLE to preprocess the high-dimensional sparse dataset so as to satisfy the prerequisites of NB for classifying different customer requirements into the corresponding product configurations. Compared with expert judgment, the adjusted LLE with NB performs much better in a real-world tablet PC design case, in both accuracy and robustness.

  13. Forecasting transitions in systems with high-dimensional stochastic complex dynamics: a linear stability analysis of the tangled nature model.

    PubMed

    Cairoli, Andrea; Piovani, Duccio; Jensen, Henrik Jeldtoft

    2014-12-31

    We propose a new procedure to monitor and forecast the onset of transitions in high-dimensional complex systems. We describe our procedure by an application to the tangled nature model of evolutionary ecology. The quasistable configurations of the full stochastic dynamics are taken as input for a stability analysis by means of the deterministic mean-field equations. Numerical analysis of the high-dimensional stability matrix allows us to identify unstable directions associated with eigenvalues with a positive real part. The overlap of the instantaneous configuration vector of the full stochastic system with the eigenvectors of the unstable directions of the deterministic mean-field approximation is found to be a good early warning of the transitions occurring intermittently.

  14. Quality of life in breast cancer patients--a quantile regression analysis.

    PubMed

    Pourhoseingholi, Mohamad Amin; Safaee, Azadeh; Moghimi-Dehkordi, Bijan; Zeighami, Bahram; Faghihzadeh, Soghrat; Tabatabaee, Hamid Reza; Pourhoseingholi, Asma

    2008-01-01

    Quality of life studies have an important role in health care, especially for chronic diseases, in clinical judgment and in the allocation of medical resources. Statistical tools like linear regression are widely used to assess the predictors of quality of life, but when the response is not normally distributed the results can be misleading. The aim of this study was to determine the predictors of quality of life in breast cancer patients using a quantile regression model and to compare it with linear regression. A cross-sectional study was conducted on 119 breast cancer patients admitted and treated in the chemotherapy ward of Namazi hospital in Shiraz. We used the QLQ-C30 questionnaire to assess quality of life in these patients. Quantile regression was employed to assess the associated factors, and the results were compared to linear regression. All analyses were carried out using SAS. The mean score for global health status for the breast cancer patients was 64.92+/-11.42. Linear regression showed that only grade of tumor, occupational status, menopausal status, financial difficulties and dyspnea were statistically significant. In contrast to linear regression, financial difficulties were not significant in the quantile regression analysis, and dyspnea was significant only for the first quartile. In addition, emotional functioning and duration of disease statistically predicted the QOL score in the third quartile. The results demonstrate that using quantile regression leads to better interpretation and richer inference about the predictors of quality of life in breast cancer patients.
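
    Quantile regression of the kind used in this study is available in statsmodels. The sketch below contrasts OLS with regressions at the quartiles on synthetic, right-skewed scores; the variables are invented stand-ins, not the study's data.

    ```python
    # OLS vs quantile regression at the 0.25, 0.50 and 0.75 quantiles.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(4)
    n = 119
    df = pd.DataFrame({"dyspnea": rng.integers(0, 4, n).astype(float)})
    df["qol"] = 70 - 4 * df.dyspnea + rng.gamma(2.0, 5.0, n)  # skewed noise

    print(smf.ols("qol ~ dyspnea", df).fit().params)
    for q in (0.25, 0.50, 0.75):
        fit = smf.quantreg("qol ~ dyspnea", df).fit(q=q)
        print(q, fit.params["dyspnea"])   # effects can differ across quantiles
    ```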

  15. Interpretation of commonly used statistical regression models.

    PubMed

    Kasza, Jessica; Wolfe, Rory

    2014-01-01

    A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.

  16. Estimating integrated variance in the presence of microstructure noise using linear regression

    NASA Astrophysics Data System (ADS)

    Holý, Vladimír

    2017-07-01

    Using financial high-frequency data for the estimation of the integrated variance of asset prices is beneficial, but as the number of observations increases, so-called microstructure noise occurs. This noise can significantly bias the realized variance estimator. We propose a method for estimating the integrated variance that is robust to microstructure noise, as well as for testing for the presence of the noise. Our method utilizes linear regression in which realized variances estimated from different data subsamples act as the dependent variable, while the number of observations acts as the explanatory variable. We compare the proposed estimator with other methods on simulated data for several microstructure noise structures.
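
    One simplified reading of that regression idea is sketched below, using the textbook result that with i.i.d. noise of variance ω² the expected realized variance from n increments is approximately the integrated variance plus 2nω²; all simulation settings are invented.

    ```python
    # Regress realized variance on the number of observations: the intercept
    # estimates the integrated variance, the slope reflects the noise level
    # (slope ~ 2 * noise variance), and slope = 0 corresponds to no noise.
    import numpy as np

    rng = np.random.default_rng(5)
    N, iv, omega = 23400, 1e-4, 5e-4     # one day of seconds (toy values)
    efficient = np.cumsum(rng.normal(0, np.sqrt(iv / N), N))  # log-price
    observed = efficient + rng.normal(0, omega, N)            # + noise

    ns, rvs = [], []
    for step in (1, 2, 5, 10, 30, 60, 120, 300):   # subsampling frequencies
        sub = observed[::step]
        ns.append(len(sub) - 1)
        rvs.append(np.sum(np.diff(sub) ** 2))      # realized variance

    slope, intercept = np.polyfit(ns, rvs, 1)
    print("IV estimate:", intercept, "noise variance estimate:", slope / 2)
    ```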

  17. Face Hallucination with Linear Regression Model in Semi-Orthogonal Multilinear PCA Method

    NASA Astrophysics Data System (ADS)

    Asavaskulkiet, Krissada

    2018-04-01

    In this paper, we propose a new face hallucination technique: reconstruction of face images in HSV color space with a semi-orthogonal multilinear principal component analysis (SO-MPCA) method. This novel hallucination technique can operate directly on tensors via tensor-to-vector projection by imposing the orthogonality constraint in only one mode. In our experiments, we use facial images from the FERET database to test our hallucination approach, and extensive experiments produce high-quality hallucinated color faces. The experimental results clearly demonstrate that we can generate photorealistic color face images by using the SO-MPCA subspace with a linear regression model.

  18. Statistical methods and regression analysis of stratospheric ozone and meteorological variables in Isfahan

    NASA Astrophysics Data System (ADS)

    Hassanzadeh, S.; Hosseinibalam, F.; Omidvari, M.

    2008-04-01

    Data on seven meteorological variables (relative humidity, wet temperature, dry temperature, maximum temperature, minimum temperature, ground temperature and sun radiation time) and ozone values have been used for statistical analysis. The meteorological variables and ozone values were analyzed using both multiple linear regression and principal component methods. Data for the period 1999-2004 were analyzed jointly using both methods. For all periods, the temperature-dependent variables were highly correlated with one another, but all were negatively correlated with relative humidity. Multiple regression analysis was used to fit the ozone values using the meteorological variables as predictors. A variable selection method based on high loadings of varimax-rotated principal components was used to obtain subsets of the predictor variables to be included in the linear regression model. In 1999, 2001 and 2002, one of the meteorological variables was weakly but predominantly influenced by the ozone concentrations. However, the model indicated that the meteorological variables for the year 2000 were not influenced predominantly by the ozone concentrations, which points to variation in sun radiation. This could be due to other factors that were not explicitly considered in this study.

  19. Statistical downscaling of precipitation using long short-term memory recurrent neural networks

    NASA Astrophysics Data System (ADS)

    Misra, Saptarshi; Sarkar, Sudeshna; Mitra, Pabitra

    2017-11-01

    Hydrological impacts of global climate change on the regional scale are generally assessed by downscaling large-scale climatic variables, simulated by General Circulation Models (GCMs), to regional, small-scale hydrometeorological variables like precipitation, temperature, etc. In this study, we propose a new statistical downscaling model based on a recurrent neural network with long short-term memory which captures the spatio-temporal dependencies in local rainfall. Previous studies have used several other methods, such as linear regression, quantile regression, kernel regression, beta regression, and artificial neural networks. Deep neural networks and recurrent neural networks have been shown to be highly promising for modeling complex and highly non-linear relationships between input and output variables in different domains, and hence we investigated their performance in the task of statistical downscaling. We have tested this model on two datasets, one on precipitation in the Mahanadi basin in India and the second on precipitation in the Campbell River basin in Canada. Our autoencoder coupled long short-term memory recurrent neural network model performs the best compared to other existing methods on both datasets with respect to temporal cross-correlation, mean squared error, and capturing the extremes.

  20. Penalized Ordinal Regression Methods for Predicting Stage of Cancer in High-Dimensional Covariate Spaces.

    PubMed

    Gentry, Amanda Elswick; Jackson-Cook, Colleen K; Lyon, Debra E; Archer, Kellie J

    2015-01-01

    The pathological description of the stage of a tumor is an important clinical designation and is considered, like many other forms of biomedical data, an ordinal outcome. Currently, statistical methods for predicting an ordinal outcome using clinical, demographic, and high-dimensional correlated features are lacking. In this paper, we propose a method that fits an ordinal response model to predict an ordinal outcome for high-dimensional covariate spaces. Our method penalizes some covariates (high-throughput genomic features) without penalizing others (such as demographic and/or clinical covariates). We demonstrate the application of our method to predict the stage of breast cancer. In our model, breast cancer subtype is a nonpenalized predictor, and CpG site methylation values from the Illumina Human Methylation 450K assay are penalized predictors. The method has been made available in the ordinalgmifs package in the R programming environment.

  1. Use of probabilistic weights to enhance linear regression myoelectric control

    NASA Astrophysics Data System (ADS)

    Smith, Lauren H.; Kuiken, Todd A.; Hargrove, Levi J.

    2015-12-01

    Objective. Clinically available prostheses for transradial amputees do not allow simultaneous myoelectric control of degrees of freedom (DOFs). Linear regression methods can provide simultaneous myoelectric control, but frequently also result in difficulty with isolating individual DOFs when desired. This study evaluated the potential of using probabilistic estimates of categories of gross prosthesis movement, which are commonly used in classification-based myoelectric control, to enhance linear regression myoelectric control. Approach. Gaussian models were fit to electromyogram (EMG) feature distributions for three movement classes at each DOF (no movement, or movement in either direction) and used to weight the output of linear regression models by the probability that the user intended the movement. Eight able-bodied and two transradial amputee subjects worked in a virtual Fitts' law task to evaluate differences in controllability between linear regression and probability-weighted regression for an intramuscular EMG-based three-DOF wrist and hand system. Main results. Real-time and offline analyses in able-bodied subjects demonstrated that probability weighting improved performance during single-DOF tasks (p < 0.05) by preventing extraneous movement at additional DOFs. Similar results were seen in experiments with two transradial amputees. Though goodness-of-fit evaluations suggested that the EMG feature distributions showed some deviations from the Gaussian, equal-covariance assumptions used in this experiment, the assumptions were sufficiently met to provide improved performance compared to linear regression control. Significance. Use of probability weights can improve the ability to isolate individual DOFs during linear regression myoelectric control, while maintaining the ability to simultaneously control multiple DOFs.
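
    The weighting scheme lends itself to a compact sketch: per-DOF Gaussian class-conditional models give the probability that movement is intended, and that probability scales the regression output. Everything below (feature dimension, class means, regression weights) is an invented stand-in for the quantities in the paper.

    ```python
    # Probability-weighted regression output for one DOF with three movement
    # classes (rest / flex / extend) under equal-covariance Gaussian models.
    import numpy as np
    from scipy.stats import multivariate_normal as mvn

    rng = np.random.default_rng(6)
    d = 4                                   # EMG feature dimension (toy)
    means = {"rest": np.zeros(d),
             "flex": np.full(d, 1.5),
             "extend": np.full(d, -1.5)}
    cov = np.eye(d)                         # shared covariance assumption
    w = rng.normal(size=d)                  # linear regression weights (toy)

    def weighted_output(x):
        like = {c: mvn.pdf(x, m, cov) for c, m in means.items()}
        p_move = (like["flex"] + like["extend"]) / sum(like.values())
        return p_move * (w @ x)             # shrunk toward 0 when at rest

    print(weighted_output(0.1 * rng.normal(size=d)))  # near-rest feature vector
    ```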

  2. Mathematics Readiness of First-Year University Students

    ERIC Educational Resources Information Center

    Atuahene, Francis; Russell, Tammy A.

    2016-01-01

    The majority of high school students, particularly underrepresented minorities (URMs) from low socioeconomic backgrounds are graduating from high school less prepared academically for advanced-level college mathematics. Using 2009 and 2010 course enrollment data, several statistical analyses (multiple linear regression, Cochran Mantel Haenszel…

  3. Simplified large African carnivore density estimators from track indices.

    PubMed

    Winterbach, Christiaan W; Ferreira, Sam M; Funston, Paul J; Somers, Michael J

    2016-01-01

    The range, population size and trend of large carnivores are important parameters to assess their status globally and to plan conservation strategies. One can use linear models to assess the population size and trends of large carnivores from track-based surveys on suitable substrates. The conventional linear model with intercept may not pass through zero, yet may fit the data better than a linear model through the origin. We assess whether a linear regression through the origin is more appropriate than a linear regression with intercept for modelling large African carnivore densities and track indices. We performed simple linear regression with intercept and simple linear regression through the origin, and used the confidence interval for β in the linear model y = αx + β, the Standard Error of Estimate, the Mean Square Residual and the Akaike Information Criterion to evaluate the models. The Lion on Clay and Low Density on Sand models with intercept were not significant (P > 0.05). The other four models with intercept and the six models through the origin were all significant (P < 0.05). The models using linear regression with intercept all included zero in the confidence interval for β, and the null hypothesis that β = 0 could not be rejected. All comparisons showed that the linear model through the origin provided a better fit than the linear model with intercept, as indicated by the Standard Error of Estimate and the Mean Square Residual. The Akaike Information Criterion showed that the linear models through the origin were better and that none of the linear models with intercept had substantial support. Our results showed that linear regression through the origin is justified over the more typical linear regression with intercept for all models we tested. A general model can be used to estimate large carnivore densities from track densities across species and study areas. The formula observed track density = 3.26 × carnivore density can be used to estimate densities of large African carnivores using track counts on sandy substrates in areas where carnivore densities are 0.27 carnivores/100 km² or higher. To improve the current models, we need independent data to validate the models and data to test for a non-linear relationship between track indices and true density at low densities.
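
    The intercept-versus-origin comparison can be mimicked in statsmodels; the data below are simulated around the reported 3.26 ratio, so the output is illustrative rather than a reanalysis.

    ```python
    # Compare OLS with and without an intercept on track-index vs density
    # data, using AIC and the intercept's confidence interval.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(7)
    density = rng.uniform(0.3, 15, 30)               # carnivores / 100 km^2
    tracks = 3.26 * density + rng.normal(0, 2, 30)   # observed track density

    with_int = sm.OLS(tracks, sm.add_constant(density)).fit()
    through0 = sm.OLS(tracks, density).fit()         # no constant: y = alpha*x
    print("AIC with intercept:", with_int.aic, " through origin:", through0.aic)
    print("95% CI for the intercept:", with_int.conf_int()[0])  # contains 0?
    ```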

  4. Abdominal girth, vertebral column length, and spread of spinal anesthesia in 30 minutes after plain bupivacaine 5 mg/mL.

    PubMed

    Zhou, Qing-he; Xiao, Wang-pin; Shen, Ying-yan

    2014-07-01

    The spread of spinal anesthesia is highly unpredictable. In patients with increased abdominal girth and short stature, a greater cephalad spread after a fixed amount of subarachnoidally administered plain bupivacaine is often observed. We hypothesized that there is a strong correlation between abdominal girth/vertebral column length and cephalad spread. Age, weight, height, body mass index, abdominal girth, and vertebral column length were recorded for 114 patients. The L3-L4 interspace was entered, and 3 mL of 0.5% plain bupivacaine was injected into the subarachnoid space. The cephalad spread (loss of temperature sensation and loss of pinprick discrimination) was assessed 30 minutes after intrathecal injection. Linear regression analysis was performed for age, weight, height, body mass index, abdominal girth, vertebral column length, and the spread of spinal anesthesia, and the combined linear contribution of age up to 55 years, weight, height, abdominal girth, and vertebral column length was tested by multiple regression analysis. Linear regression analysis showed that there was a significant univariate correlation among all 6 patient characteristics evaluated and the spread of spinal anesthesia (all P < 0.039) except for age and loss of temperature sensation (P > 0.068). Multiple regression analysis showed that abdominal girth and the vertebral column length were the key determinants for spinal anesthesia spread (both P < 0.0001), whereas age, weight, and height could be omitted without changing the results (all P > 0.059, all 95% confidence limits < 0.372). Multiple regression analysis revealed that the combination of a patient's 5 general characteristics, especially abdominal girth and vertebral column length, had a high predictive value for the spread of spinal anesthesia after a given dose of plain bupivacaine.

  5. River flow prediction using hybrid models of support vector regression with the wavelet transform, singular spectrum analysis and chaotic approach

    NASA Astrophysics Data System (ADS)

    Baydaroğlu, Özlem; Koçak, Kasım; Duran, Kemal

    2018-06-01

    Prediction of the amount of water that will enter reservoirs in the following month is of vital importance, especially for semi-arid countries like Turkey. Climate projections emphasize that water scarcity will be one of the serious problems in the future. This study presents a methodology for predicting river flow for the subsequent month based on the time series of observed monthly river flow, using hybrid models of support vector regression (SVR). Monthly river flow over the period 1940-2012 observed for the Kızılırmak River in Turkey has been used for training the method, which was then applied for predictions over a period of 3 years. SVR is a specific implementation of support vector machines (SVMs), which transforms the observed input data time series into a high-dimensional feature space (input matrix) by way of a kernel function and performs a linear regression in this space. SVR requires a special input matrix. The input matrix was produced by wavelet transforms (WT), singular spectrum analysis (SSA), and a chaotic approach (CA) applied to the input time series. WT decomposes the original time series into a series of wavelets, and SSA decomposes the time series into a trend, an oscillatory component and a noise component by singular value decomposition. CA uses a phase space formed by trajectories, which represent the dynamics producing the time series. These three methods of producing the input matrix for the SVR proved successful, with the SVR-WT combination resulting in the highest coefficient of determination and the lowest mean absolute error.

  6. Multivariate functional response regression, with application to fluorescence spectroscopy in a cervical pre-cancer study.

    PubMed

    Zhu, Hongxiao; Morris, Jeffrey S; Wei, Fengrong; Cox, Dennis D

    2017-07-01

    Many scientific studies measure different types of high-dimensional signals or images from the same subject, producing multivariate functional data. These functional measurements carry different types of information about the scientific process, and a joint analysis that integrates information across them may provide new insights into the underlying mechanism of the phenomenon under study. Motivated by fluorescence spectroscopy data from a cervical pre-cancer study, a multivariate functional response regression model is proposed, which treats multivariate functional observations as responses and a common set of covariates as predictors. This novel modeling framework simultaneously accounts for correlations between functional variables and for potential multi-level structures in the data that are induced by the experimental design. The model is fitted by performing a two-stage linear transformation: a basis expansion of each functional variable followed by principal component analysis of the concatenated basis coefficients. This transformation effectively reduces the intra- and inter-function correlations and facilitates fast and convenient calculation. A fully Bayesian approach is adopted to sample the model parameters in the transformed space, and posterior inference is performed after inverse-transforming the regression coefficients back to the original data domain. The proposed approach produces functional tests that flag local regions on the functional effects, while controlling the overall experiment-wise error rate or false discovery rate. It also enables functional discriminant analysis through posterior predictive calculation. Analysis of the fluorescence spectroscopy data reveals local regions with differential expression across the pre-cancer and normal samples. These regions may serve as biomarkers for prognosis and disease assessment.

  7. [From clinical judgment to linear regression model].

    PubMed

    Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O

    2013-01-01

    When we think about mathematical models, such as the linear regression model, we think that these terms are used only by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and has a normal distribution. Stated another way, regression is used to predict a measure based on the knowledge of at least one other variable. The first objective of linear regression is to determine the slope or inclination of the regression line: Y = a + bX, where "a" is the intercept or regression constant, equivalent to the value of "Y" when "X" equals 0, and "b" (also called the slope) indicates the increase or decrease in "Y" that occurs when the variable "X" increases or decreases by one unit. In the regression line, "b" is called the regression coefficient. The coefficient of determination (R²) indicates the importance of the independent variables in the outcome.
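
    A numerically worked version of Y = a + bX on invented data shows where the intercept, slope and R² come from:

    ```python
    # Simple linear regression with scipy: a is the intercept, b the slope.
    import numpy as np
    from scipy.stats import linregress

    x = np.array([1.0, 2, 3, 4, 5, 6])
    y = np.array([2.1, 3.9, 6.2, 8.0, 9.8, 12.1])

    res = linregress(x, y)
    print("a (intercept):", res.intercept)   # value of Y when X = 0
    print("b (slope):", res.slope)           # change in Y per unit of X
    print("R^2:", res.rvalue ** 2)           # coefficient of determination
    ```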

  8. Direct Linear Transformation Method for Three-Dimensional Cinematography

    ERIC Educational Resources Information Center

    Shapiro, Robert

    1978-01-01

    The ability of Direct Linear Transformation Method for three-dimensional cinematography to locate points in space was shown to meet the accuracy requirements associated with research on human movement. (JD)

  9. Prediction models for CO2 emission in Malaysia using best subsets regression and multi-linear regression

    NASA Astrophysics Data System (ADS)

    Tan, C. H.; Matjafri, M. Z.; Lim, H. S.

    2015-10-01

    This paper presents prediction models that analyze and compute the CO2 emissions in Malaysia. Each prediction model for CO2 emissions is analyzed based on three main groups: transportation, electricity and heat production, and residential buildings together with commercial and public services. The prediction models were generated using data obtained from World Bank Open Data. The best subsets method was used to remove irrelevant variables, followed by multiple linear regression to produce the prediction models. From the results, high R-squared (prediction) values were obtained, which implies that the models can reliably predict the CO2 emissions from the given data. In addition, the CO2 emissions from these three groups are forecasted using trend analysis plots for observation purposes.
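
    Best subsets selection can be sketched by exhaustively scoring every combination of predictors; the code below uses adjusted R² as the selection criterion on placeholder data, since the paper's World Bank series and exact criterion are not reproduced here.

    ```python
    # Exhaustive best subsets search followed by multiple linear regression.
    import itertools
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(8)
    n = 40
    X = rng.normal(size=(n, 4))           # placeholder predictor columns
    y = 3 + 2 * X[:, 0] + 1.5 * X[:, 2] + rng.normal(0, 1, n)

    best = None
    for k in range(1, 5):
        for cols in itertools.combinations(range(4), k):
            fit = sm.OLS(y, sm.add_constant(X[:, cols])).fit()
            if best is None or fit.rsquared_adj > best[0]:
                best = (fit.rsquared_adj, cols)
    print("best subset:", best[1], " adjusted R^2:", round(best[0], 3))
    ```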

  10. Frequency-sensitive competitive learning for scalable balanced clustering on high-dimensional hyperspheres.

    PubMed

    Banerjee, Arindam; Ghosh, Joydeep

    2004-05-01

    Competitive learning mechanisms for clustering, in general, suffer from poor performance for very high-dimensional (>1000) data because of "curse of dimensionality" effects. In applications such as document clustering, it is customary to normalize the high-dimensional input vectors to unit length, and it is sometimes also desirable to obtain balanced clusters, i.e., clusters of comparable sizes. The spherical kmeans (spkmeans) algorithm, which normalizes the cluster centers as well as the inputs, has been successfully used to cluster normalized text documents in 2000+ dimensional space. Unfortunately, like regular kmeans and its soft expectation-maximization-based version, spkmeans tends to generate extremely imbalanced clusters in high-dimensional spaces when the desired number of clusters is large (tens or more). This paper first shows that the spkmeans algorithm can be derived from a certain maximum likelihood formulation using a mixture of von Mises-Fisher distributions as the generative model, and in fact, it can be considered a batch-mode version of (normalized) competitive learning. The proposed generative model is then adapted in a principled way to yield three frequency-sensitive competitive learning variants that are applicable to static data and produce high-quality, well-balanced clusters for high-dimensional data. Like kmeans, each iteration is linear in the number of data points and in the number of clusters for all three algorithms. A frequency-sensitive algorithm to cluster streaming data is also proposed. Experimental results on clustering of high-dimensional text data sets are provided to show the effectiveness and applicability of the proposed techniques.

  11. Unsupervised spatiotemporal analysis of fMRI data using graph-based visualizations of self-organizing maps.

    PubMed

    Katwal, Santosh B; Gore, John C; Marois, Rene; Rogers, Baxter P

    2013-09-01

    We present novel graph-based visualizations of self-organizing maps for unsupervised functional magnetic resonance imaging (fMRI) analysis. A self-organizing map is an artificial neural network model that transforms high-dimensional data into a low-dimensional (often a 2-D) map using unsupervised learning. However, a postprocessing scheme is necessary to correctly interpret similarity between neighboring node prototypes (feature vectors) on the output map and delineate clusters and features of interest in the data. In this paper, we used graph-based visualizations to capture fMRI data features based upon 1) the distribution of data across the receptive fields of the prototypes (density-based connectivity); and 2) temporal similarities (correlations) between the prototypes (correlation-based connectivity). We applied this approach to identify task-related brain areas in an fMRI reaction time experiment involving a visuo-manual response task, and we correlated the time-to-peak of the fMRI responses in these areas with reaction time. Visualization of self-organizing maps outperformed independent component analysis and voxelwise univariate linear regression analysis in identifying and classifying relevant brain regions. We conclude that the graph-based visualizations of self-organizing maps help in advanced visualization of cluster boundaries in fMRI data enabling the separation of regions with small differences in the timings of their brain responses.

  12. An enhanced data visualization method for diesel engine malfunction classification using multi-sensor signals.

    PubMed

    Li, Yiqing; Wang, Yu; Zi, Yanyang; Zhang, Mingquan

    2015-10-21

    The various multi-sensor signal features from a diesel engine constitute a complex high-dimensional dataset. The non-linear dimensionality reduction method t-distributed stochastic neighbor embedding (t-SNE) provides an effective way to implement data visualization for complex high-dimensional data. However, irrelevant features can deteriorate the performance of data visualization and thus should be eliminated a priori. This paper proposes a feature subset score based t-SNE (FSS-t-SNE) data visualization method to deal with the high-dimensional data collected from multi-sensor signals. In this method, an optimal feature subset is constructed by a feature subset score criterion, and the high-dimensional data are then visualized in a 2-dimensional space. Tests on UCI datasets show that FSS-t-SNE can effectively improve classification accuracy. An experiment was performed with a high-power marine diesel engine to validate the proposed method for diesel engine malfunction classification. Multi-sensor signals were collected by a cylinder vibration sensor and a cylinder pressure sensor. Compared with other conventional data visualization methods, the proposed method shows good visualization performance and high classification accuracy in multi-malfunction classification of a diesel engine.
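
    The select-then-embed pattern is easy to prototype. In the sketch below a generic ANOVA F-score stands in for the paper's feature subset score criterion, and the data are synthetic rather than diesel engine signals.

    ```python
    # Score features, keep an informative subset, then embed with t-SNE.
    from sklearn.datasets import make_classification
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.manifold import TSNE

    X, y = make_classification(n_samples=300, n_features=60, n_informative=8,
                               n_classes=3, random_state=0)

    X_sel = SelectKBest(f_classif, k=10).fit_transform(X, y)  # drop irrelevant
    emb = TSNE(n_components=2, perplexity=30,
               random_state=0).fit_transform(X_sel)
    # emb is the 2-D map; plotting it colored by y reveals cluster structure
    ```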

  13. An Enhanced Data Visualization Method for Diesel Engine Malfunction Classification Using Multi-Sensor Signals

    PubMed Central

    Li, Yiqing; Wang, Yu; Zi, Yanyang; Zhang, Mingquan

    2015-01-01

    The various multi-sensor signal features from a diesel engine constitute a complex high-dimensional dataset. The non-linear dimensionality reduction method t-distributed stochastic neighbor embedding (t-SNE) provides an effective way to implement data visualization for complex high-dimensional data. However, irrelevant features can deteriorate the performance of data visualization and thus should be eliminated a priori. This paper proposes a feature subset score based t-SNE (FSS-t-SNE) data visualization method to deal with the high-dimensional data collected from multi-sensor signals. In this method, an optimal feature subset is constructed by a feature subset score criterion, and the high-dimensional data are then visualized in a 2-dimensional space. Tests on UCI datasets show that FSS-t-SNE can effectively improve classification accuracy. An experiment was performed with a high-power marine diesel engine to validate the proposed method for diesel engine malfunction classification. Multi-sensor signals were collected by a cylinder vibration sensor and a cylinder pressure sensor. Compared with other conventional data visualization methods, the proposed method shows good visualization performance and high classification accuracy in multi-malfunction classification of a diesel engine. PMID:26506347

  14. Improved quantification of important beer quality parameters based on nonlinear calibration methods applied to FT-MIR spectra.

    PubMed

    Cernuda, Carlos; Lughofer, Edwin; Klein, Helmut; Forster, Clemens; Pawliczek, Marcin; Brandstetter, Markus

    2017-01-01

    During the production process of beer, it is of utmost importance to guarantee high consistency of the beer quality. For instance, bitterness is an essential quality parameter which has to be controlled within specifications at the beginning of the production process in the unfermented beer (wort) as well as in final products such as beer and beer mix beverages. Nowadays, analytical techniques for quality control in beer production are mainly based on manual supervision, i.e., samples are taken from the process and analyzed in the laboratory. This typically requires significant effort from lab technicians even though only a small fraction of samples can be analyzed, which leads to significant costs for breweries and beer companies. Fourier transform mid-infrared (FT-MIR) spectroscopy was used in combination with nonlinear multivariate calibration techniques to overcome (i) the time-consuming off-line analyses in beer production and (ii) the known limitations of standard linear chemometric methods, like partial least squares (PLS), for important quality parameters such as bitterness, citric acid, total acids, free amino nitrogen, final attenuation, or foam stability (Speers et al., J I Brewing. 2003;109(3):229-235; Zhang et al., J I Brewing. 2012;118(4):361-367). The calibration models are established with enhanced nonlinear techniques based (i) on a new piece-wise linear version of PLS that employs fuzzy rules for locally partitioning the latent variable space, and (ii) on extensions of support vector regression variants (ε-PLSSVR and ν-PLSSVR) for overcoming high computation times in high-dimensional problems and time-intensive, inappropriate settings of the kernel parameters. Furthermore, we introduce a new model selection scheme based on bagged ensembles in order to improve robustness and thus the predictive quality of the final models. The approaches are tested on real-world calibration data sets for wort and beer mix beverages, and successfully compared to linear methods, showing clear outperformance in most cases and meeting the model quality requirements defined by the experts at the beer company.

  15. Retention modelling of polychlorinated biphenyls in comprehensive two-dimensional gas chromatography.

    PubMed

    D'Archivio, Angelo Antonio; Incani, Angela; Ruggieri, Fabrizio

    2011-01-01

    In this paper, we use a quantitative structure-retention relationship (QSRR) method to predict the retention times of polychlorinated biphenyls (PCBs) in comprehensive two-dimensional gas chromatography (GC×GC). We analyse GC×GC retention data taken from the literature by comparing the predictive capability of different regression methods. The various models are generated using 70 of the 209 PCB congeners in the calibration stage, while their predictive performance is evaluated on the remaining 139 compounds. The two-dimensional chromatogram is initially estimated by separately modelling the retention times of PCBs in the first and in the second column (¹tR and ²tR, respectively). In particular, multilinear regression (MLR) combined with genetic algorithm (GA) variable selection is performed to extract two small subsets of predictors for ¹tR and ²tR from a large set of theoretical molecular descriptors provided by the popular software Dragon, which, after removal of highly correlated or almost constant variables, consists of 237 structure-related quantities. Based on the GA-MLR analysis, a four-dimensional and a five-dimensional relationship modelling ¹tR and ²tR, respectively, are identified. Single-response partial least squares (PLS-1) regression is alternatively applied to independently model ¹tR and ²tR without the need for preliminary GA variable selection. Further, we explore the possibility of predicting the two-dimensional chromatogram of PCBs in a single calibration procedure by using a two-response PLS (PLS-2) model or a feed-forward artificial neural network (ANN) with two output neurons. In the first case, regression is carried out on the full set of 237 descriptors, while the variables previously selected by GA-MLR are initially considered as ANN inputs and subjected to a sensitivity analysis to remove the redundant ones. Results show that PLS-1 regression exhibits noticeably better descriptive and predictive performance than the other investigated approaches. The observed values of the determination coefficients for ¹tR and ²tR in calibration (0.9999 and 0.9993, respectively) and prediction (0.9987 and 0.9793, respectively) provided by PLS-1 demonstrate that the GC×GC behaviour of PCBs is properly modelled. In particular, the predicted two-dimensional GC×GC chromatogram of the 139 PCBs not involved in the calibration stage closely resembles the experimental one. Based on the above lines of evidence, the proposed approach ensures accurate simulation of the whole GC×GC chromatogram of PCBs using experimental determination of only one third of the retention data for representative congeners.

  16. Placental three-dimensional power Doppler indices in mid-pregnancy and late pregnancy complicated by gestational diabetes mellitus.

    PubMed

    Surányi, A; Kozinszky, Z; Molnár, A; Nyári, T; Bitó, T; Pál, A

    2013-10-01

    The aim of our study was to evaluate placental three-dimensional power Doppler indices in diabetic pregnancies in the second and third trimesters and to compare them with those of normal controls. Placental vascularization of pregnant women was determined by the three-dimensional power Doppler ultrasound technique. The calculated indices included the vascularization index (VI), flow index (FI), and vascularization flow index (VFI). Uncomplicated pregnancies (n = 113) were compared with pregnancies complicated by gestational diabetes mellitus (n = 56) and diabetes mellitus (n = 43). The three-dimensional power Doppler indices were not significantly different between the two diabetic subgroups. All the indices in diabetic patients were significantly reduced compared with those in non-diabetic individuals (p < 0.001). Placental three-dimensional power Doppler indices are slightly diminished throughout diabetic pregnancy [regression coefficients: -0.23 (FI), -0.06 (VI), and -0.04 (VFI)] and normal pregnancy [regression coefficients: -0.13 (FI), -0.20 (VI), and -0.11 (VFI)]. The uteroplacental circulation (umbilical and uterine arteries) was not significantly correlated with the three-dimensional power Doppler indices. If all placental indices are low during late pregnancy, then the odds of diabetes are significantly increased (adjusted odds ratio: 1.10). Decreased placental vascularization could be an adjunct sonographic marker in the diagnosis of diabetic pregnancy in mid-gestation and late gestation. © 2013 John Wiley & Sons, Ltd.

  17. Time-lagged autoencoders: Deep learning of slow collective variables for molecular kinetics

    NASA Astrophysics Data System (ADS)

    Wehmeyer, Christoph; Noé, Frank

    2018-06-01

    Inspired by the success of deep learning techniques in the physical and chemical sciences, we apply a modification of an autoencoder-type deep neural network to the task of dimension reduction of molecular dynamics data. We show that our time-lagged autoencoder reliably finds low-dimensional embeddings for high-dimensional feature spaces that capture the slow dynamics of the underlying stochastic processes, beyond the capabilities of linear dimension reduction techniques.
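
    A minimal time-lagged autoencoder sketch in PyTorch, following the stated idea: reconstruct the features at time t+τ from those at time t so the bottleneck favours slow variables. The architecture, lag, and the random stand-in trajectory are illustrative assumptions, not the authors' exact network.

    ```python
    # Time-lagged autoencoder: encode x(t), decode toward x(t + tau).
    import torch
    import torch.nn as nn

    tau, dim_in, dim_latent = 10, 50, 2
    X = torch.randn(5000, dim_in)          # stand-in for an MD feature trajectory

    encoder = nn.Sequential(nn.Linear(dim_in, 64), nn.ELU(), nn.Linear(64, dim_latent))
    decoder = nn.Sequential(nn.Linear(dim_latent, 64), nn.ELU(), nn.Linear(64, dim_in))
    opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()),
                           lr=1e-3)

    x_t, x_tau = X[:-tau], X[tau:]         # time-lagged training pairs
    for epoch in range(20):
        opt.zero_grad()
        loss = nn.functional.mse_loss(decoder(encoder(x_t)), x_tau)
        loss.backward()
        opt.step()

    slow_cvs = encoder(X)                  # low-dimensional slow embedding
    ```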

  18. Study of three-dimensional effects on vortex breakdown

    NASA Technical Reports Server (NTRS)

    Salas, M. D.; Kuruvila, G.

    1988-01-01

    The incompressible axisymmetric steady Navier-Stokes equations in primitive variables are used to simulate vortex breakdown. The equations, discretized using a second-order, central-difference scheme, are linearized and then solved using an exact LU decomposition, Gaussian elimination, and Newton iteration. Solutions are presented for Reynolds numbers, based on vortex-core radius, as high as 1500. An attempt to study the stability of the axisymmetric solutions against three-dimensional perturbations is discussed.

  19. Comparative study of microwave radiation-induced magnetoresistive oscillations induced by circularly- and linearly- polarized photo-excitation

    PubMed Central

    Ye, Tianyu; Liu, Han-Chun; Wang, Zhuo; Wegscheider, W.; Mani, Ramesh G.

    2015-01-01

    A comparative study of the radiation-induced magnetoresistance oscillations in the high mobility GaAs/AlGaAs heterostructure two dimensional electron system (2DES) under linearly- and circularly- polarized microwave excitation indicates a profound difference in the response observed upon rotating the microwave launcher for the two cases, although circularly polarized microwave radiation induced magnetoresistance oscillations observed at low magnetic fields are similar to the oscillations observed with linearly polarized radiation. For the linearly polarized radiation, the magnetoresistive response is a strong sinusoidal function of the launcher rotation (or linear polarization) angle, θ. For circularly polarized radiation, the oscillatory magnetoresistive response is hardly sensitive to θ. PMID:26450679

  20. Comparative study of microwave radiation-induced magnetoresistive oscillations induced by circularly- and linearly- polarized photo-excitation.

    PubMed

    Ye, Tianyu; Liu, Han-Chun; Wang, Zhuo; Wegscheider, W; Mani, Ramesh G

    2015-10-09

    A comparative study of the radiation-induced magnetoresistance oscillations in the high mobility GaAs/AlGaAs heterostructure two dimensional electron system (2DES) under linearly- and circularly- polarized microwave excitation indicates a profound difference in the response observed upon rotating the microwave launcher for the two cases, although circularly polarized microwave radiation induced magnetoresistance oscillations observed at low magnetic fields are similar to the oscillations observed with linearly polarized radiation. For the linearly polarized radiation, the magnetoresistive response is a strong sinusoidal function of the launcher rotation (or linear polarization) angle, θ. For circularly polarized radiation, the oscillatory magnetoresistive response is hardly sensitive to θ.

  1. Does transport time help explain the high trauma mortality rates in rural areas? New and traditional predictors assessed by new and traditional statistical methods

    PubMed Central

    Røislien, Jo; Lossius, Hans Morten; Kristiansen, Thomas

    2015-01-01

    Background: Trauma is a leading global cause of death. Trauma mortality rates are higher in rural areas, constituting a challenge for quality and equality in trauma care. The aim of the study was to explore population density and transport time to hospital care as possible predictors of geographical differences in mortality rates, and to what extent the choice of statistical method might affect the analytical results and accompanying clinical conclusions. Methods: Using data from the Norwegian Cause of Death registry, deaths from external causes during 1998–2007 were analysed. Norway consists of 434 municipalities, and municipality population density and travel time to hospital care were entered as predictors of municipality mortality rates in univariate and multiple regression models of increasing complexity. We fitted linear regression models with continuous and categorised predictors, as well as piecewise linear and generalised additive models (GAMs). Models were compared using Akaike's information criterion (AIC). Results: Population density was an independent predictor of trauma mortality rates, while the contribution of transport time to hospital care was highly dependent on the choice of statistical model. Among the multiple-predictor models, the GAM and the piecewise linear model were superior, and similar to each other, in terms of AIC. However, while transport time was statistically significant in multiple models with piecewise linear or categorised predictors, it was not in the GAM or in standard linear regression. Conclusions: Population density is an independent predictor of trauma mortality rates. The added explanatory value of transport time to hospital care is marginal and model-dependent, highlighting the importance of exploring several statistical models when studying complex associations in observational data. PMID:25972600
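
    The AIC-based model comparison can be sketched as below; the municipality data are synthetic, and the 60-minute knot of the piecewise model is a hypothetical choice, not a value from the study.

    ```python
    # Compare a standard linear model with a piecewise linear (hinge) model by AIC.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    transport_time = rng.uniform(0, 120, 434)      # minutes, one per municipality
    # Synthetic mortality rate whose effect flattens beyond a 60-minute threshold.
    rate = 5 + 0.05 * np.minimum(transport_time, 60) + rng.normal(0, 1, 434)

    X_lin = sm.add_constant(transport_time)
    X_pw = sm.add_constant(np.column_stack([transport_time,
                                            np.maximum(transport_time - 60, 0)]))

    fit_lin = sm.OLS(rate, X_lin).fit()
    fit_pw = sm.OLS(rate, X_pw).fit()
    print("AIC linear:   ", fit_lin.aic)
    print("AIC piecewise:", fit_pw.aic)    # lower AIC indicates the better model
    ```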

  2. Fourier transform infrared reflectance spectra of latent fingerprints: a biometric gauge for the age of an individual.

    PubMed

    Hemmila, April; McGill, Jim; Ritter, David

    2008-03-01

    To determine whether changes in fingerprint infrared spectra that are linear with age can be found, a partial least squares (PLS1) regression of 155 fingerprint infrared spectra against the person's age was constructed. The regression produced a linear model of age as a function of spectrum with a root mean square error of calibration of less than 4 years, showing an inflection at about 25 years of age. The spectral ranges emphasized by the regression do not correspond to the highest-concentration constituents of the fingerprints. Separate linear regression models for old and young people can be constructed with even greater statistical rigor. The success of the regression demonstrates that a combination of constituents can be found that changes linearly with age, with a significant shift around puberty.

  3. Linearity versus Nonlinearity of Offspring-Parent Regression: An Experimental Study of Drosophila Melanogaster

    PubMed Central

    Gimelfarb, A.; Willis, J. H.

    1994-01-01

    An experiment was conducted to investigate the offspring-parent regression for three quantitative traits (weight, abdominal bristles and wing length) in Drosophila melanogaster. Linear and polynomial models were fitted for the regressions of a character in offspring on both parents. It is demonstrated that responses by the characters to selection predicted by the nonlinear regressions may differ substantially from those predicted by the linear regressions. This is true even, and especially, when selection is weak. The realized heritability for a character under selection is shown to be determined not only by the offspring-parent regression but also by the distribution of the character and by the form and strength of selection. PMID:7828818
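
    A small sketch of the linear-versus-polynomial comparison: both models are fitted to simulated midparent-offspring pairs, and the predicted responses to a weak truncation selection are contrasted. All data values and the selection threshold are illustrative assumptions.

    ```python
    # Linear vs quadratic offspring-midparent regression and predicted response
    # to truncation selection.
    import numpy as np

    rng = np.random.default_rng(2)
    midparent = rng.normal(0.0, 1.0, 500)
    # Offspring values with a mild nonlinearity in the offspring-parent relation.
    offspring = 0.4 * midparent + 0.1 * midparent**2 + rng.normal(0.0, 0.5, 500)

    lin = np.polyfit(midparent, offspring, deg=1)
    quad = np.polyfit(midparent, offspring, deg=2)

    # Weak truncation selection: parents with midparent value above 0.5 reproduce.
    sel = midparent > 0.5
    resp_lin = np.polyval(lin, midparent[sel]).mean() - np.polyval(lin, midparent).mean()
    resp_quad = np.polyval(quad, midparent[sel]).mean() - np.polyval(quad, midparent).mean()
    print("predicted response, linear model:   ", round(resp_lin, 3))
    print("predicted response, quadratic model:", round(resp_quad, 3))
    ```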

  4. How many stakes are required to measure the mass balance of a glacier?

    USGS Publications Warehouse

    Fountain, A.G.; Vecchia, A.

    1999-01-01

    Glacier mass balance is estimated for South Cascade Glacier and Maclure Glacier using a one-dimensional regression of mass balance with altitude, as an alternative to the traditional approach of contouring mass balance values. One attractive feature of regression is that it can be applied to sparse data sets where contouring is not possible, and it can provide an objective error for the resulting estimate. Regression methods yielded mass balance values equivalent to those of contouring methods. Examining the effect of the number of mass balance measurements on the final value for the glacier showed that sample sizes as small as five stakes provided reasonable estimates, although the error estimates were greater than for larger sample sizes. Different spatial patterns of measurement locations showed no appreciable influence on the final value as long as different surface altitudes were intermittently sampled over the altitude range of the glacier. Two different regression equations were examined, a quadratic and a piecewise linear spline; comparison of the results showed little sensitivity to the type of equation. These results point to the dominant effect of the gradient of mass balance with altitude in alpine glaciers compared with transverse variations. The number of mass balance measurements required to determine the glacier balance appears to be scale-invariant for small glaciers, and five to ten stakes are sufficient.

  5. Linear and nonlinear regression techniques for simultaneous and proportional myoelectric control.

    PubMed

    Hahne, J M; Biessmann, F; Jiang, N; Rehbaum, H; Farina, D; Meinecke, F C; Muller, K-R; Parra, L C

    2014-03-01

    In recent years the number of active controllable joints in electrically powered hand prostheses has increased significantly. However, the control strategies for these devices in current clinical use are inadequate, as they require separate and sequential control of each degree of freedom (DoF). In this study we systematically compare linear and nonlinear regression techniques for independent, simultaneous and proportional myoelectric control of wrist movements with two DoF. These techniques include linear regression, mixture of linear experts (ME), the multilayer perceptron, and kernel ridge regression (KRR). They are investigated offline with electromyographic signals acquired from ten able-bodied subjects and one person with congenital upper limb deficiency. The control accuracy is reported as a function of the number of electrodes and the amount and diversity of training data, providing guidance for the requirements in clinical practice. The results showed that KRR, a nonparametric statistical learning method, outperformed the other methods. However, simple transformations in the feature space could linearize the problem, so that linear models could achieve performance similar to KRR at much lower computational cost. In particular, ME, a physiologically inspired extension of linear regression, represents a promising candidate for the next generation of prosthetic devices.
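
    A minimal sketch contrasting plain linear regression with kernel ridge regression on a synthetic two-DoF regression task; the EMG features, targets, and KRR hyperparameters are stand-in assumptions, not the study's recordings.

    ```python
    # Linear regression vs kernel ridge regression on a 2-DoF target mapping.
    import numpy as np
    from sklearn.kernel_ridge import KernelRidge
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(3)
    emg = rng.normal(size=(1000, 8))                 # 8 electrode features
    # Mildly nonlinear 2-DoF wrist targets.
    wrist = np.tanh(emg[:, :2]) + 0.05 * rng.normal(size=(1000, 2))

    for name, model in [("linear", LinearRegression()),
                        ("KRR", KernelRidge(kernel="rbf", alpha=1.0, gamma=0.1))]:
        score = cross_val_score(model, emg, wrist, cv=5, scoring="r2").mean()
        print(name, "R^2:", round(score, 3))
    ```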

  6. Thermopile Detector Arrays for Space Science Applications

    NASA Technical Reports Server (NTRS)

    Foote, M. C.; Kenyon, M.; Krueger, T. R.; McCann, T. A.; Chacon, R.; Jones, E. W.; Dickie, M. R.; Schofield, J. T.; McCleese, D. J.; Gaalema, S.

    2004-01-01

    Thermopile detectors are widely used in uncooled applications where small numbers of detectors are required, particularly in low-cost commercial applications or applications requiring accurate radiometry. Arrays of thermopile detectors, however, have not been developed to the extent of uncooled bolometer and pyroelectric/ferroelectric arrays. Efforts at JPL seek to remedy this deficiency by developing high performance thin-film thermopile detectors in both linear and two-dimensional formats. The linear thermopile arrays are produced by bulk micromachining and wire bonded to separate CMOS readout electronic chips. Such arrays are currently being fabricated for the Mars Climate Sounder instrument, scheduled for launch in 2005. Progress is also described towards realizing a two-dimensional thermopile array built over CMOS readout circuitry in the substrate.

  7. Unitary Response Regression Models

    ERIC Educational Resources Information Center

    Lipovetsky, S.

    2007-01-01

    The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…

  8. An Expert System for the Evaluation of Cost Models

    DTIC Science & Technology

    1990-09-01

    …contrast to the condition of equal error variance, called homoscedasticity (Reference: Applied Linear Regression Models by John Neter, p. 423). …normal (Reference: Applied Linear Regression Models by John Neter, p. 125). …over time. Error terms correlated over time are said to be autocorrelated or serially correlated (Reference: Applied Linear Regression Models by John Neter).

  9. Leaf reflectance-nitrogen-chlorophyll relations among three south Texas woody rangeland plant species

    NASA Technical Reports Server (NTRS)

    Gausman, H. W.; Everitt, J. H.; Escobar, D. E. (Principal Investigator)

    1982-01-01

    Annual variations in the leaf reflectance, nitrogen and chlorophyll of hackberry, honey mesquite and live oak in south Texas were compared. In spring, leaf reflectance at the 0.55 μm wavelength and nitrogen (N) concentration were high but leaf chlorophyll (chl) concentrations were low. In summer, leaf reflectance and N concentration were low but leaf chl concentrations were high. Linear correlations for both spring and summer of leaf reflectance with N and chl concentration, or deviations from linear regression, were not statistically significant.

  10. The cortisol response to social stress in social anxiety disorder.

    PubMed

    Vaccarino, Oriana; Levitan, Robert; Ravindran, Arun

    2015-04-01

    This study evaluated the cortisol stress response (CSR) following the Trier Social Stress Test in Social Anxiety Disorder (SAD) and control participants, to determine whether individual differences in CSR associate more with SAD diagnosis or with dimensional characteristics [i.e. childhood trauma (CT)]. Twenty-one participants (11 with SAD) had full data available for both CT scores and cortisol area under the curve (AUC). Linear regression produced significant results when predicting AUCG from study group, emotional abuse (EA) scores and their interaction (F=3.14, df=5,15; p=.039); of note, the study group by EA interaction was significant at p=.015, driven by a strong positive association between EA and cortisol AUCG in the control group and a negative association between these variables in the SAD group (standardized beta=1.56, t=2.75, p=.015). This suggests that EA in SAD patients is associated with an altered CSR, highlighting the need to measure dimensional characteristics. Copyright © 2015 Elsevier B.V. All rights reserved.

  11. Using machine learning to identify air pollution exposure profiles associated with early cognitive skills among U.S. children.

    PubMed

    Stingone, Jeanette A; Pandey, Om P; Claudio, Luz; Pandey, Gaurav

    2017-11-01

    Data-driven machine learning methods present an opportunity to simultaneously assess the impact of multiple air pollutants on health outcomes. The goal of this study was to apply a two-stage, data-driven approach to identify associations between air pollutant exposure profiles and children's cognitive skills. Data from 6900 children enrolled in the Early Childhood Longitudinal Study, Birth Cohort, a national study of children born in 2001 and followed through kindergarten, were linked to estimated concentrations of 104 ambient air toxics in the 2002 National Air Toxics Assessment using ZIP code of residence at age 9 months. In the first stage, 100 regression trees were learned to identify the ambient air pollutant exposure profiles most closely associated with scores on a standardized mathematics test administered to children in kindergarten. In the second stage, the exposure profiles frequently predicting lower math scores were included within linear regression models and adjusted for confounders in order to estimate the magnitude of their effect on math scores. This approach was applied to the full population, and then to the populations living in urban and highly populated urban areas. Our first-stage results in the full population suggested children with low trichloroethylene exposure had significantly lower math scores. This association was not observed for children living in urban communities, suggesting that confounding related to urbanicity needs to be considered within the first stage. When restricting our analysis to populations living in urban and highly populated urban areas, high isophorone levels were found to predict lower math scores. Within adjusted regression models of children in highly populated urban areas, the estimated effect of higher isophorone exposure on math scores was -1.19 points (95% CI -1.94, -0.44). Similar results were observed for the overall population of urban children. This data-driven, two-stage approach can be applied to other populations, exposures and outcomes to generate hypotheses within high-dimensional exposure data. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
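
    A compact sketch of such a two-stage analysis: bootstrap regression trees flag the exposure most often chosen at the root split, and an adjusted linear model then estimates its effect. The data, the single confounder, and the tree depth are synthetic assumptions.

    ```python
    # Stage 1: tree-based screening of exposures; stage 2: adjusted OLS estimate.
    import numpy as np
    import statsmodels.api as sm
    from sklearn.tree import DecisionTreeRegressor
    from sklearn.utils import resample
    from collections import Counter

    rng = np.random.default_rng(4)
    n, p = 2000, 104
    exposures = rng.normal(size=(n, p))
    confounder = rng.normal(size=n)               # e.g. a socioeconomic index
    score = 50 - 1.2 * exposures[:, 7] + 2.0 * confounder + rng.normal(0, 5, n)

    # Stage 1: count how often each exposure is the root split across 100 trees.
    hits = Counter()
    for b in range(100):
        Xb, yb = resample(exposures, score, random_state=b)
        tree = DecisionTreeRegressor(max_depth=3, random_state=b).fit(Xb, yb)
        hits[tree.tree_.feature[0]] += 1
    top = hits.most_common(1)[0][0]

    # Stage 2: adjusted effect estimate for the flagged exposure.
    X2 = sm.add_constant(np.column_stack([exposures[:, top], confounder]))
    print(sm.OLS(score, X2).fit().params[1])      # effect per unit exposure
    ```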

  12. Three-Dimensional Transgenic Cell Models to Quantify Space Genotoxic Effects

    NASA Technical Reports Server (NTRS)

    Gonda, S.; Wu, H.; Pingerelli, P.; Glickman, B.

    2000-01-01

    In this paper we describe a three-dimensional, multicellular tissue-equivalent model, produced in NASA-designed rotating wall bioreactors using mammalian cells engineered for genomic containment of multiple copies of defined target genes for genotoxic assessment. The Rat 2(lambda) fibroblasts (Stratagene, Inc.) were genetically engineered to contain high-density target genes for mutagenesis. Stable three-dimensional, multicellular spheroids were formed when human mammary epithelial cells and Rat 2(lambda) fibroblasts were cocultured on Cytodex 3 beads in a rotating wall bioreactor. The utility of this spheroidal model for genotoxic assessment was indicated by a linear dose-response curve and by the results of gene sequence analysis of mutant clones from 400-μm-diameter spheroids following low-dose, high-energy neon radiation exposure.

  13. ASTROP2-LE: A Mistuned Aeroelastic Analysis System Based on a Two Dimensional Linearized Euler Solver

    NASA Technical Reports Server (NTRS)

    Reddy, T. S. R.; Srivastava, R.; Mehmed, Oral

    2002-01-01

    An aeroelastic analysis system for flutter and forced response analysis of turbomachines, based on a two-dimensional linearized unsteady Euler solver, has been developed. The ASTROP2 code, an aeroelastic stability analysis program for turbomachinery, was used as a basis for this development. The ASTROP2 code uses strip theory to couple a two-dimensional aerodynamic model with a three-dimensional structural model. The code was modified to include forced response capability. The formulation was also modified to include aeroelastic analysis with mistuning. A linearized unsteady Euler solver, LINFLX2D, is added to model the unsteady aerodynamics in ASTROP2. By calculating the unsteady aerodynamic loads using LINFLX2D, it is possible to include the effects of transonic flow on flutter and forced response in the analysis. The stability is inferred from an eigenvalue analysis. The revised code, ASTROP2-LE (ASTROP2 using Linearized Euler aerodynamics), is validated by comparing its predictions with those obtained using linear unsteady aerodynamic solutions.

  14. 3D-liquid chromatography as a complex mixture characterization tool for knowledge-based downstream process development.

    PubMed

    Hanke, Alexander T; Tsintavi, Eleni; Ramirez Vazquez, Maria Del Pilar; van der Wielen, Luuk A M; Verhaert, Peter D E M; Eppink, Michel H M; van de Sandt, Emile J A X; Ottens, Marcel

    2016-09-01

    Knowledge-based development of chromatographic separation processes requires efficient techniques to determine the physicochemical properties of the product and of the impurities to be removed. These characterization techniques are usually divided into approaches that determine molecular properties, such as charge, hydrophobicity and size, or molecular interactions with auxiliary materials, commonly in the form of adsorption isotherms. In this study we demonstrate the application of a three-dimensional liquid chromatography approach to a clarified cell homogenate containing a therapeutic enzyme. Each separation dimension determines a molecular property relevant to the chromatographic behavior of each component. Matching the peaks across the different separation dimensions and against a high-resolution reference chromatogram allows the determined parameters to be assigned to pseudo-components, making it possible to identify the most promising technique for the removal of each impurity. More detailed process design using mechanistic models requires isotherm parameters. For this purpose, the second dimension consists of multiple linear gradient separations on columns in a high-throughput-screening-compatible format, which allow regression of isotherm parameters with an average standard error of 8%. © 2016 American Institute of Chemical Engineers Biotechnol. Prog., 32:1283-1291, 2016. © 2016 American Institute of Chemical Engineers.

  15. A Novel Approach to Prenatal Measurement of the Fetal Frontal Lobe Using Three-Dimensional Sonography

    PubMed Central

    Brown, Steffen A.; Hall, Rebecca; Hund, Lauren; Gutierrez, Hilda L.; Hurley, Timothy; Holbrook, Bradley D.; Bakhireva, Ludmila N.

    2017-01-01

    Objective: While prenatal 3D ultrasonography results in improved diagnostic accuracy, no data are available on biometric assessment of the fetal frontal lobe. This study was designed to assess the feasibility of a standardized approach to biometric measurement of the fetal frontal lobe and to construct frontal lobe growth trajectories throughout gestation. Study Design: A sonographic 3D volume set was obtained and measured in 101 patients between 16.1 and 33.7 gestational weeks. Measurements were obtained by two independent raters. To model the relationship between gestational age and each frontal lobe measurement, flexible linear regression models were fit using penalized regression splines. Results: The sample contained an ethnically diverse population (7.9% Native American, 45.5% Hispanic/Latina). There was high inter-rater reliability (correlation coefficients: 0.95, 1.0, and 0.87 for frontal lobe length, width, and height; p-values < 0.001). Graphs of the growth trajectories and corresponding percentiles were estimated as a function of gestational age. The estimated rates of frontal lobe growth were 0.096 cm/week, 0.247 cm/week, and 0.111 cm/week for length, width, and height. Conclusion: To our knowledge, this is the first study to examine fetal frontal lobe growth trajectories through 3D prenatal ultrasound examination. Such normative data will allow for future prenatal evaluation of a particular disease state by 3D ultrasound imaging. PMID:29075046

  16. A Novel Approach to Prenatal Measurement of the Fetal Frontal Lobe Using Three-Dimensional Sonography.

    PubMed

    Brown, Steffen A; Hall, Rebecca; Hund, Lauren; Gutierrez, Hilda L; Hurley, Timothy; Holbrook, Bradley D; Bakhireva, Ludmila N

    2017-01-01

    While prenatal 3D ultrasonography results in improved diagnostic accuracy, no data are available on biometric assessment of the fetal frontal lobe. This study was designed to assess feasibility of a standardized approach to biometric measurement of the fetal frontal lobe and to construct frontal lobe growth trajectories throughout gestation. A sonographic 3D volume set was obtained and measured in 101 patients between 16.1 and 33.7 gestational weeks. Measurements were obtained by two independent raters. To model the relationship between gestational age and each frontal lobe measurement, flexible linear regression models were fit using penalized regression splines. The sample contained an ethnically diverse population (7.9% Native Americans, 45.5% Hispanic/Latina). There was high inter-rater reliability (correlation coefficients: 0.95, 1.0, and 0.87 for frontal lobe length, width, and height; p-values < 0.001). Graphs of the growth trajectories and corresponding percentiles were estimated as a function of gestational age. The estimated rates of frontal lobe growth were 0.096 cm/week, 0.247 cm/week, and 0.111 cm/week for length, width, and height. To our knowledge, this is the first study to examine fetal frontal lobe growth trajectories through 3D prenatal ultrasound examination. Such normative data will allow for future prenatal evaluation of a particular disease state by 3D ultrasound imaging.
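
    The trajectory modelling can be sketched with a smoothing spline standing in for the paper's penalized regression splines; the gestational ages and widths below are simulated, and the smoothing factor is an arbitrary assumption.

    ```python
    # Smoothing-spline fit of a growth trajectory over gestational age.
    import numpy as np
    from scipy.interpolate import UnivariateSpline

    rng = np.random.default_rng(12)
    ga = rng.uniform(16.1, 33.7, 101)              # gestational age, weeks
    width = 0.247 * ga + rng.normal(0, 0.3, 101)   # frontal lobe width, cm

    order = np.argsort(ga)                         # spline needs increasing x
    spl = UnivariateSpline(ga[order], width[order], k=3, s=101 * 0.3**2)

    grid = np.linspace(16.1, 33.7, 50)
    trajectory = spl(grid)                         # fitted growth curve
    print("estimated width at 28 weeks:", round(float(spl(28.0)), 3), "cm")
    ```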

  17. Ensemble of sparse classifiers for high-dimensional biological data.

    PubMed

    Kim, Sunghan; Scalzo, Fabien; Telesca, Donatello; Hu, Xiao

    2015-01-01

    Biological data are often high in dimension while the number of samples is small. In such cases, the performance of classification can be improved by reducing the dimension of the data, which is referred to as feature selection. Recently, a novel feature selection method has been proposed utilising the sparsity of high-dimensional biological data, where a small subset of features accounts for most of the variance of the dataset. In this study we propose a new classification method for high-dimensional biological data, which performs both feature selection and classification within a single framework. Our proposed method utilises a sparse linear solution technique and the bootstrap aggregating (bagging) algorithm. We tested its performance on four public mass spectrometry cancer datasets and compared it with two conventional classification techniques, support vector machines and adaptive boosting. The results demonstrate that our proposed method performs more accurate classification across the various cancer datasets than these conventional techniques.
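
    A minimal sketch of the idea: each ensemble member fits an L1-penalized (sparse) linear classifier on a bootstrap sample, and predictions are combined by majority vote. The data, ensemble size, and regularization strength are illustrative assumptions.

    ```python
    # Bagged ensemble of sparse (L1) linear classifiers for p >> n data.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.utils import resample

    rng = np.random.default_rng(5)
    n, p = 100, 2000                    # few samples, many features
    X = rng.normal(size=(n, p))
    y = (X[:, :5].sum(axis=1) + 0.5 * rng.normal(size=n) > 0).astype(int)

    members = []
    for b in range(25):
        Xb, yb = resample(X, y, random_state=b)
        clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
        members.append(clf.fit(Xb, yb))

    votes = np.mean([m.predict(X) for m in members], axis=0)
    y_hat = (votes > 0.5).astype(int)   # majority vote
    print("training accuracy:", (y_hat == y).mean())
    ```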

  18. A General Sparse Tensor Framework for Electronic Structure Theory

    DOE PAGES

    Manzer, Samuel; Epifanovsky, Evgeny; Krylov, Anna I.; ...

    2017-01-24

    Linear-scaling algorithms must be developed in order to extend the domain of applicability of electronic structure theory to molecules of any desired size. But the increasing complexity of modern linear-scaling methods makes code development and maintenance a significant challenge. A major contributor to this difficulty is the lack of robust software abstractions for handling block-sparse tensor operations. We therefore report the development of a highly efficient symbolic block-sparse tensor library in order to provide access to high-level software constructs to treat such problems. Our implementation supports arbitrary multi-dimensional sparsity in all input and output tensors. We then avoid cumbersome machine-generated code by implementing all functionality as a high-level symbolic C++ language library and demonstrate that our implementation attains very high performance for linear-scaling sparse tensor contractions.

  19. Two-dimensional energy spectra in a high Reynolds number turbulent boundary layer

    NASA Astrophysics Data System (ADS)

    Chandran, Dileep; Baidya, Rio; Monty, Jason; Marusic, Ivan

    2016-11-01

    The current study measures the two-dimensional (2D) spectra of the streamwise velocity component (u) in a high Reynolds number turbulent boundary layer for the first time. The 2D spectrum shows the contribution of streamwise (λx) and spanwise (λy) length scales to the streamwise variance at a given wall height (z). The 2D spectrum could be a better tool for analysing spectral scaling laws, as it is devoid of the energy aliasing errors that can be present in one-dimensional spectra. A novel method is used to calculate the 2D spectrum from the 2D correlation of u, which is obtained by measuring velocity time series at various spanwise locations using hot-wire anemometry. At low Reynolds number, the shape of the 2D spectrum at a constant energy level shows λy ∝ √(zλx) behaviour at larger scales, in agreement with the literature. However, at high Reynolds number, it is observed that the square-root relationship gradually transforms into a linear relationship (λy ∝ λx), which could be caused by large packets of eddies whose length grows in proportion to the growth of their width. Additionally, we show that this linear relationship observed at high Reynolds number is consistent with attached eddy predictions. The authors gratefully acknowledge the support of the Australian Research Council.

  20. Time-resolved flow reconstruction with indirect measurements using regression models and Kalman-filtered POD ROM

    NASA Astrophysics Data System (ADS)

    Leroux, Romain; Chatellier, Ludovic; David, Laurent

    2018-01-01

    This article is devoted to the estimation of time-resolved particle image velocimetry (TR-PIV) flow fields using time-resolved point measurements of a voltage signal obtained by hot-film anemometry. A multiple linear regression model is first defined to map the TR-PIV flow fields onto the voltage signal. Owing to the high temporal resolution of the signal acquired by the hot-film sensor, the estimates of the TR-PIV flow fields are obtained with a multiple linear regression method called orthonormalized partial least squares regression (OPLSR). Subsequently, this model is incorporated as the observation equation in an ensemble Kalman filter (EnKF) applied to a proper orthogonal decomposition reduced-order model to stabilize it while reducing the effects of the hot-film sensor noise. This method is assessed for the reconstruction of the flow around a NACA0012 airfoil at a Reynolds number of 1000 and an angle of attack of 20°. Comparisons with multi-time-delay modified linear stochastic estimation show that both the OPLSR and the EnKF combined with OPLSR are more accurate, as they produce a much lower relative estimation error and provide a faithful reconstruction of the time evolution of the velocity flow fields.
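
    The EnKF analysis step that the paper wraps around the POD reduced-order model can be sketched as follows; the state dimension, observation operator, and noise levels are illustrative assumptions, and the "observations" stand in for the OPLSR estimates.

    ```python
    # One ensemble Kalman filter analysis step on a vector of POD coefficients.
    import numpy as np

    rng = np.random.default_rng(6)
    n_modes, n_ens, n_obs = 10, 50, 10
    ensemble = rng.normal(size=(n_modes, n_ens))   # forecast POD coefficients
    H = np.eye(n_obs, n_modes)                     # observe coefficients directly
    R = 0.1 * np.eye(n_obs)                        # sensor-noise covariance
    obs = rng.normal(size=n_obs)                   # OPLSR-estimated coefficients

    X_mean = ensemble.mean(axis=1, keepdims=True)
    A = ensemble - X_mean                          # ensemble anomalies
    P = A @ A.T / (n_ens - 1)                      # sample covariance
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)   # Kalman gain

    # Perturbed-observation update of every member.
    obs_pert = obs[:, None] + rng.multivariate_normal(np.zeros(n_obs), R, n_ens).T
    analysis = ensemble + K @ (obs_pert - H @ ensemble)
    ```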

  1. Three-dimensional viscous fingering of miscible fluids in porous media

    NASA Astrophysics Data System (ADS)

    Suekane, Tetsuya; Ono, Jei; Hyodo, Akimitsu; Nagatsu, Yuichiro

    2017-10-01

    Viscous fingering is a flow instability that is induced at the displacement front when a less-viscous fluid (LVF) displaces a more-viscous fluid (MVF). Because of the opaque nature of porous media, most experimental investigations of the structure of viscous fingering and its development in time have been limited to two-dimensional porous media or Hele-Shaw cells. In this study, we investigate the three-dimensional characteristics of viscous fingering in porous media using a microfocused x-ray computed tomography (CT) scanner. Similar to two-dimensional experiments, characteristic events such as tip-splitting, shielding, and coalescence were observed in three-dimensional viscous fingering as well. With an increase in the Péclet number at a fixed viscosity ratio, M, the fingers appearing on the interface tend to become finer; however, the locations of the tips of the fingers remain the same for the same injected volume of the LVF. The finger extensions increase in proportion to ln M, and the number of fingers emerging at the initial interface increases with M. This agrees qualitatively with linear stability analyses. Within the fingers, the local concentration of NaI, which is needed for the x-ray CT scanner, decreases linearly, whereas it decreases sharply at the tips of the fingers. A locally high Péclet number as well as unsteady motions in lateral directions may enhance the dispersion at the tips of the fingers. As the viscosity ratio increases, the sweep efficiency monotonically decreases and reaches an asymptotic state; in addition, the degree of mixing increases with the viscosity ratio. For high flow rates, the asymptotic value of the sweep efficiency is low for high viscosity ratios, while there is no clear dependence of the asymptotic value on the Péclet number.

  2. A Method for Assessing the Quality of Model-Based Estimates of Ground Temperature and Atmospheric Moisture Using Satellite Data

    NASA Technical Reports Server (NTRS)

    Wu, Man Li C.; Schubert, Siegfried; Lin, Ching I.; Stajner, Ivanka; Einaudi, Franco (Technical Monitor)

    2000-01-01

    A method is developed for validating model-based estimates of atmospheric moisture and ground temperature using satellite data. The approach relates errors in estimates of clear-sky longwave fluxes at the top of the Earth-atmosphere system to errors in geophysical parameters. The fluxes include clear-sky outgoing longwave radiation (CLR) and the radiative flux in the window region between 8 and 12 microns (RadWn). The approach capitalizes on the availability of satellite estimates of CLR and RadWn and other auxiliary satellite data, and multiple global four-dimensional data assimilation (4-DDA) products. The basic methodology employs off-line forward radiative transfer calculations to generate synthetic clear-sky longwave fluxes from two different 4-DDA data sets. Simple linear regression is used to relate the clear-sky longwave flux discrepancies to discrepancies in ground temperature (δTg) and broad-layer integrated atmospheric precipitable water (δpw). The slopes of the regression lines define sensitivity parameters which can be exploited to help interpret mismatches between satellite observations and model-based estimates of clear-sky longwave fluxes. For illustration we analyze the discrepancies in the clear-sky longwave fluxes between an early implementation of the Goddard Earth Observing System Data Assimilation System (GEOS2) and a recent operational version of the European Centre for Medium-Range Weather Forecasts data assimilation system. The analysis of the synthetic clear-sky flux data shows that simple linear regression employing δTg and broad-layer δpw provides a good approximation to the full radiative transfer calculations, typically explaining more than 90% of the 6-hourly variance in the flux differences. These simple regression relations can be inverted to "retrieve" the errors in the geophysical parameters. Uncertainties (normalized by standard deviation) in the monthly mean retrieved parameters range from 7% for δTg to approximately 20% for the lower-tropospheric moisture between 500 hPa and the surface. The regression relationships developed from the synthetic flux data, together with CLR and RadWn observed with the Clouds and the Earth's Radiant Energy System instrument, are used to assess the quality of the GEOS2 Tg and pw. Results showed that the GEOS2 Tg is too cold over land, and pw in upper layers is too high over the tropical oceans and too low in the lower atmosphere.

  3. Independence screening for high dimensional nonlinear additive ODE models with applications to dynamic gene regulatory networks.

    PubMed

    Xue, Hongqi; Wu, Shuang; Wu, Yichao; Ramirez Idarraga, Juan C; Wu, Hulin

    2018-05-02

    Mechanism-driven low-dimensional ordinary differential equation (ODE) models are often used to model viral dynamics at cellular levels and epidemics of infectious diseases. However, low-dimensional mechanism-based ODE models are limited for modeling infectious diseases at molecular levels, such as transcriptomic or proteomic levels, which is critical for understanding the pathogenesis of diseases. Although linear ODE models have been proposed for gene regulatory networks (GRNs), nonlinear regulations are common in GRNs. The reconstruction of large-scale nonlinear networks from time-course gene expression data remains an unresolved issue. Here, we use high-dimensional nonlinear additive ODEs to model GRNs and propose a 4-step procedure to efficiently perform variable selection for nonlinear ODEs. To tackle the challenge of high dimensionality, we couple the 2-stage smoothing-based estimation method for ODEs with a nonlinear independence screening method to perform variable selection for the nonlinear ODE models. We show that our method possesses the sure screening property and that it can handle problems with non-polynomial dimensionality. Numerical performance of the proposed method is illustrated with simulated data and a real data example for identifying the dynamic GRN of Saccharomyces cerevisiae. Copyright © 2018 John Wiley & Sons, Ltd.

  4. Control Variate Selection for Multiresponse Simulation.

    DTIC Science & Technology

    1987-05-01

    Neter, J., W. Wasserman, and M. H. Kutner, Applied Linear Regression Models, Richard D. Irwin, Inc., Homewood, Illinois, 1983. Neuts, Marcel F., Probability, Allyn and Bacon, 1982. Aspects of Multivariate Statistical Theory, John Wiley and Sons, New York, New York, 1982.

  5. A sparse grid based method for generative dimensionality reduction of high-dimensional data

    NASA Astrophysics Data System (ADS)

    Bohn, Bastian; Garcke, Jochen; Griebel, Michael

    2016-03-01

    Generative dimensionality reduction methods play an important role in machine learning applications because they construct an explicit mapping from a low-dimensional space to the high-dimensional data space. We discuss a general framework to describe generative dimensionality reduction methods, where the main focus lies on a regularized principal manifold learning variant. Since most generative dimensionality reduction algorithms exploit the representer theorem for reproducing kernel Hilbert spaces, their computational costs grow at least quadratically in the number n of data points. Instead, we introduce a grid-based discretization approach which automatically scales just linearly in n. To circumvent the curse of dimensionality of full tensor product grids, we use the concept of sparse grids. Furthermore, in real-world applications, some embedding directions are usually more important than others, and it is reasonable to refine the underlying discretization space only in these directions. To this end, we employ a dimension-adaptive algorithm which is based on the ANOVA (analysis of variance) decomposition of a function. In particular, the reconstruction error is used to measure the quality of an embedding. As an application, the study of large simulation data from an engineering application in the automotive industry (car crash simulation) is performed.

  6. Identifying Aspects of Parental Involvement that Affect the Academic Achievement of High School Students

    ERIC Educational Resources Information Center

    Roulette-McIntyre, Ovella; Bagaka's, Joshua G.; Drake, Daniel D.

    2005-01-01

    This study identified parental practices that relate positively to high school students' academic performance. Parents of 643 high school students participated in the study. Data analysis, using a multiple linear regression model, shows parent-school connection, student gender, and race are significant predictors of student academic performance.…

  7. Investigating the sex-related geometric variation of the human cranium.

    PubMed

    Bertsatos, Andreas; Papageorgopoulou, Christina; Valakos, Efstratios; Chovalopoulou, Maria-Eleni

    2018-01-29

    Accurate sexing methods are of great importance in forensic anthropology, since sex assessment is among the principal tasks when examining human skeletal remains. The present study explores a novel approach to assessing the most accurate metric traits of the human cranium for sex estimation, based on 80 ectocranial landmarks from 176 modern individuals of known age and sex from the Athens Collection. The purpose of the study is to identify those distance and angle measurements that can be most effectively used in sex assessment. Three-dimensional landmark coordinates were digitized with a Microscribe 3DX and analyzed in GNU Octave. An iterative linear discriminant analysis of all possible combinations of landmarks was performed for each unique set of the 3160 distances and 246,480 angles. Cross-validated correct classification rates as well as multivariate DFA on the top-performing variables identified 13 craniometric distances with over 85% classification accuracy and 7 angles with over 78%, as well as certain multivariate combinations yielding over 95%. Linear regression of these variables against the centroid size was used to assess their relation to the size of the cranium. In contrast to the use of generalized Procrustes analysis (GPA) and principal component analysis (PCA), which constitute the common analytical workflow for such data, our method, although computationally intensive, produced easily applicable discriminant functions of high accuracy, while at the same time exploring the maximum of cranial variability.
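
    The exhaustive discriminant search can be sketched as below (in Python rather than the study's GNU Octave); the landmark-derived distances are random stand-ins, and only pairs drawn from a small distance subset are scanned to keep the example short.

    ```python
    # Cross-validated LDA accuracy for every pair of candidate distances.
    import numpy as np
    from itertools import combinations
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(7)
    n, n_dist = 176, 30                 # small subset of the 3160 possible distances
    D = rng.normal(size=(n, n_dist))    # inter-landmark distances (stand-ins)
    sex = rng.integers(0, 2, n)

    results = []
    for pair in combinations(range(n_dist), 2):
        acc = cross_val_score(LinearDiscriminantAnalysis(),
                              D[:, list(pair)], sex, cv=5).mean()
        results.append((acc, pair))

    # Report the five best-classifying distance pairs.
    for acc, pair in sorted(results, reverse=True)[:5]:
        print(pair, round(acc, 3))
    ```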

  8. Dental computed tomographic imaging as age estimation: morphological analysis of the third molar of a group of Turkish population.

    PubMed

    Cantekin, Kenan; Sekerci, Ahmet Ercan; Buyuk, Suleyman Kutalmis

    2013-12-01

    Computed tomography (CT) is capable of providing accurate and measurable 3-dimensional images of the third molar. The aims of this study were to analyze the development of the mandibular third molar and its relation to chronological age, and to create new reference data for a group of Turkish participants aged 9 to 25 years on the basis of cone-beam CT images. All data were obtained from the patients' records, including medical, social, and dental anamnesis, and cone-beam CT images of 752 patients. Linear regression analysis was performed to obtain regression formulas for dental age calculation from chronological age and to determine the coefficient of determination (r²) for each sex. Statistical analysis showed a strong correlation between age and third-molar development for males (r² = 0.80) and females (r² = 0.78). Computed tomographic images are clinically useful for accurate and reliable estimation of the dental ages of children and youth.

  9. Heat and mass transfer rates during flow of dissociated hydrogen gas over graphite surface

    NASA Technical Reports Server (NTRS)

    Nema, V. K.; Sharma, O. P.

    1986-01-01

    To improve upon the performance of chemical rockets, the nuclear reactor has been applied to a rocket propulsion system using hydrogen gas as the working fluid and a graphite composite forming part of the structure. Under the boundary layer approximation, theoretical predictions of the skin friction coefficient, surface heat transfer rate and surface regression rate have been made for laminar/turbulent dissociated hydrogen gas flowing over a flat graphite surface. The external stream is assumed to be frozen. The analysis is restricted to Mach numbers low enough to deal with the situation of only surface reaction between hydrogen and graphite. Empirical correlations of displacement thickness, local skin friction coefficient, local Nusselt number and local non-dimensional heat transfer rate have been obtained. The magnitude of the surface regression rate is found to be low enough to ensure the use of graphite as a liner or a component of the system over an extended period without loss of performance.

  10. Improving validation methods for molecular diagnostics: application of Bland-Altman, Deming and simple linear regression analyses in assay comparison and evaluation for next-generation sequencing

    PubMed Central

    Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L

    2018-01-01

    Aims: A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R²), using R² as the primary metric of assay agreement. However, the use of R² alone does not adequately quantify the constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. Methods: We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing (NGS) assays. NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Results: Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. The Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. Conclusions: The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of the performance characteristics of quantitative molecular assays prior to implementation in the clinical molecular laboratory. PMID:28747393
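
    A minimal sketch of the two statistical tools on a synthetic pair of assays: Bland-Altman bias with 95% limits of agreement, and Deming regression assuming an error-variance ratio of one. All numerical values are illustrative assumptions.

    ```python
    # Bland-Altman statistics and Deming regression for two quantitative assays.
    import numpy as np

    rng = np.random.default_rng(8)
    truth = rng.uniform(0.05, 0.5, 40)
    x = truth + rng.normal(0, 0.01, 40)                 # reference assay
    y = 0.02 + 1.05 * truth + rng.normal(0, 0.01, 40)   # test assay with bias

    # Bland-Altman: mean bias and 95% limits of agreement.
    diff = y - x
    bias = diff.mean()
    loa = (bias - 1.96 * diff.std(ddof=1), bias + 1.96 * diff.std(ddof=1))
    print("bias:", bias, "limits of agreement:", loa)

    # Deming regression (error-variance ratio lambda = 1): errors in both assays.
    sxx, syy = np.var(x, ddof=1), np.var(y, ddof=1)
    sxy = np.cov(x, y, ddof=1)[0, 1]
    slope = (syy - sxx + np.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    intercept = y.mean() - slope * x.mean()
    print("Deming slope:", slope, "intercept:", intercept)
    ```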

  11. High-Efficiency Visible Transmitting Polarizations Devices Based on the GaN Metasurface.

    PubMed

    Guo, Zhongyi; Xu, Haisheng; Guo, Kai; Shen, Fei; Zhou, Hongping; Zhou, Qingfeng; Gao, Jun; Yin, Zhiping

    2018-05-15

    Metasurfaces are capable of tailoring the amplitude, phase, and polarization of incident light to design various polarization devices. Here, we propose a metasurface based on the novel dielectric material gallium nitride (GaN) to realize high-efficiency modulation of both orthogonal linear polarizations simultaneously in the visible range. The modulated transmitted phases of both orthogonal linear polarizations can span almost the whole 2π range by tailoring the geometric sizes of the GaN nanobricks, while maintaining high transmission values (almost all over 90%). At the wavelength of 530 nm, we designed and realized a beam splitter and focusing lenses. To further prove that our proposed method is suitable for arbitrary orthogonal linear polarizations, we also designed a three-dimensional (3D) metalens that can simultaneously focus the X-, Y-, 45°, and 135° linear polarizations at spatially symmetric positions, which can be applied to linear polarization measurement. Our work provides a possible route to high-efficiency multifunctional optical devices in visible light by extending the modulating dimensions.

  12. Step-Growth Polymerization.

    ERIC Educational Resources Information Center

    Stille, J. K.

    1981-01-01

    Following a comparison of chain-growth and step-growth polymerization, focuses on the latter process by describing requirements for high molecular weight, step-growth polymerization kinetics, synthesis and molecular weight distribution of some linear step-growth polymers, and three-dimensional network step-growth polymers. (JN)

  13. A SEMIPARAMETRIC BAYESIAN MODEL FOR CIRCULAR-LINEAR REGRESSION

    EPA Science Inventory

    We present a Bayesian approach to regress a circular variable on a linear predictor. The regression coefficients are assumed to have a nonparametric distribution with a Dirichlet process prior. The semiparametric Bayesian approach gives added flexibility to the model and is usefu...

  14. Integration of measurements with atmospheric dispersion models: Source term estimation for dispersal of ²³⁹Pu due to non-nuclear detonation of high explosive

    NASA Astrophysics Data System (ADS)

    Edwards, L. L.; Harvey, T. F.; Freis, R. P.; Pitovranov, S. E.; Chernokozhin, E. V.

    1992-10-01

    The accuracy associated with assessing the environmental consequences of an accidental release of radioactivity is highly dependent on our knowledge of the source term characteristics and, in the case when the radioactivity is condensed on particles, the particle size distribution, all of which are generally poorly known. This paper reports on the development of a numerical technique that integrates radiological measurements with atmospheric dispersion modeling, resulting in more accurate particle-size distribution and particle injection height estimates when compared with measurements of high-explosive dispersal of ²³⁹Pu. The estimation model is based on a non-linear least squares regression scheme coupled with the ARAC three-dimensional atmospheric dispersion models. The viability of the approach is evaluated by estimation of ADPIC model input parameters such as the ADPIC particle-size mean aerodynamic diameter, the geometric standard deviation, and the largest size. Additionally, we estimate an optimal 'coupling coefficient' between the particles and an explosive cloud rise model. The experimental data are taken from the Clean Slate 1 field experiment conducted during 1963 at the Tonopah Test Range in Nevada. The regression technique optimizes the agreement between the measured and model-predicted concentrations of ²³⁹Pu by varying the model input parameters within their respective ranges of uncertainty. The technique generally estimated the measured concentrations within a factor of 1.5, with the worst estimate being within a factor of 5, very good in view of the complexity of the concentration measurements, the uncertainties associated with the meteorological data, and the limitations of the models. The best fit also suggests a smaller mean diameter and a smaller geometric standard deviation of the particle size, as well as a slightly weaker particle-to-cloud coupling than previously reported.

  15. Conducting linear chains of sulphur inside carbon nanotubes

    PubMed Central

    Fujimori, Toshihiko; Morelos-Gómez, Aarón; Zhu, Zhen; Muramatsu, Hiroyuki; Futamura, Ryusuke; Urita, Koki; Terrones, Mauricio; Hayashi, Takuya; Endo, Morinobu; Young Hong, Sang; Chul Choi, Young; Tománek, David; Kaneko, Katsumi

    2013-01-01

    Despite extensive research for more than 200 years, the experimental isolation of monatomic sulphur chains, which are believed to exhibit a conducting character, has eluded scientists. Here we report the synthesis of a previously unobserved composite material of elemental sulphur, consisting of monatomic chains stabilized in the constraining volume of a carbon nanotube. This one-dimensional phase is confirmed by high-resolution transmission electron microscopy and synchrotron X-ray diffraction. Interestingly, these one-dimensional sulphur chains exhibit long domain sizes of up to 160 nm and high thermal stability (~800 K). Synchrotron X-ray diffraction shows a sharp structural transition of the one-dimensional sulphur occurring at ~450–650 K. Our observations, and corresponding electronic structure and quantum transport calculations, indicate the conducting character of the one-dimensional sulphur chains under ambient pressure. This is in stark contrast to bulk sulphur that needs ultrahigh pressures exceeding ~90 GPa to become metallic. PMID:23851903

  16. Acceleration and stability of a high-current ion beam in induction fields

    NASA Astrophysics Data System (ADS)

    Karas', V. I.; Manuilenko, O. V.; Tarakanov, V. P.; Federovskaya, O. V.

    2013-03-01

    A one-dimensional nonlinear analytic theory of the filamentation instability of a high-current ion beam is formulated. The results of 2.5-dimensional numerical particle-in-cell simulations of acceleration and stability of an annular compensated ion beam (CIB) in a linear induction particle accelerator are presented. It is shown that additional transverse injection of electron beams in magnetically insulated gaps (cusps) improves the quality of the ion-beam distribution function and provides uniform beam acceleration along the accelerator. The CIB filamentation instability in both the presence and the absence of an external magnetic field is considered.

  17. Quantum states and optical responses of low-dimensional electron hole systems

    NASA Astrophysics Data System (ADS)

    Ogawa, Tetsuo

    2004-09-01

    Quantum states and their optical responses of low-dimensional electron-hole systems in photoexcited semiconductors and/or metals are reviewed from a theoretical viewpoint, stressing the electron-hole Coulomb interaction, the excitonic effects, the Fermi-surface effects and the dimensionality. Recent progress of theoretical studies is stressed and important problems to be solved are introduced. We cover not only single-exciton problems but also few-exciton and many-exciton problems, including electron-hole plasma situations. Dimensionality of the Wannier exciton is clarified in terms of its linear and nonlinear responses. We also discuss a biexciton system, exciton bosonization technique, high-density degenerate electron-hole systems, gas-liquid phase separation in an excited state and the Fermi-edge singularity due to a Mahan exciton in a low-dimensional metal.

  18. Normal forms for reduced stochastic climate models

    PubMed Central

    Majda, Andrew J.; Franzke, Christian; Crommelin, Daan

    2009-01-01

    The systematic development of reduced low-dimensional stochastic climate models from observations or comprehensive high-dimensional climate models is an important topic for atmospheric low-frequency variability, climate sensitivity, and improved extended-range forecasting. Here techniques from applied mathematics are utilized to systematically derive normal forms for reduced stochastic climate models for low-frequency variables. The use of a few Empirical Orthogonal Functions (EOFs) (also known as Principal Component Analysis, Karhunen–Loève and Proper Orthogonal Decomposition) depending on observational data to span the low-frequency subspace requires the assessment of dyad interactions besides the more familiar triads in the interaction between the low- and high-frequency subspaces of the dynamics. It is shown below that the dyad and multiplicative triad interactions combine with the climatological linear operator interactions to simultaneously produce both strong nonlinear dissipation and Correlated Additive and Multiplicative (CAM) stochastic noise. For a single low-frequency variable, the dyad interactions and climatological linear operator alone produce a normal form with CAM noise from advection of the large scales by the small scales and simultaneously strong cubic damping. These normal forms should prove useful for developing systematic strategies for the estimation of stochastic models from climate data. As an illustrative example, the one-dimensional normal form is applied below to low-frequency patterns such as the North Atlantic Oscillation (NAO) in a climate model. The results here also illustrate the shortcomings of a recent linear scalar CAM noise model proposed elsewhere for low-frequency variability. PMID:19228943
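
    The scalar normal form can be illustrated with an Euler-Maruyama simulation of a generic CAM-noise equation with cubic damping; the coefficient values below are illustrative assumptions, not values fitted to the NAO.

    ```python
    # Euler-Maruyama simulation of dx = (a x + b x^2 - c x^3) dt
    #                                 + (A - B x) dW1 + sigma dW2,
    # a generic scalar normal form with CAM noise and cubic damping.
    import numpy as np

    rng = np.random.default_rng(9)
    a, b, c = -0.2, 0.3, 0.5        # linear, quadratic (dyad), cubic damping
    A, B, sigma = 0.4, 0.3, 0.2     # CAM and independent additive noise amplitudes
    dt, n_steps = 1e-3, 200_000

    x = np.empty(n_steps)
    x[0] = 0.0
    for k in range(n_steps - 1):
        drift = a * x[k] + b * x[k] ** 2 - c * x[k] ** 3
        dW1, dW2 = rng.normal(0, np.sqrt(dt), 2)
        x[k + 1] = x[k] + drift * dt + (A - B * x[k]) * dW1 + sigma * dW2

    # CAM noise produces a skewed, non-Gaussian stationary distribution.
    print("skewness:", ((x - x.mean()) ** 3).mean() / x.std() ** 3)
    ```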

  19. Three-dimensional modeling of flexible pavements : executive summary, August 2001.

    DOT National Transportation Integrated Search

    2001-08-01

    A linear viscoelastic model has been incorporated into a three-dimensional finite element program for analysis of flexible pavements. Linear and quadratic versions of hexahedral elements and quadrilateral axisymmetrix elements are provided. Dynamic p...

  20. Three dimensional modeling of flexible pavements : final report, March 2002.

    DOT National Transportation Integrated Search

    2001-08-01

    A linear viscoelastic model has been incorporated into a three-dimensional finite element program for analysis of flexible pavements. Linear and quadratic versions of hexahedral elements and quadrilateral axisymmetrix elements are provided. Dynamic p...

  1. Pseudo second order kinetics and pseudo isotherms for malachite green onto activated carbon: comparison of linear and non-linear regression methods.

    PubMed

    Kumar, K Vasanth; Sivanesan, S

    2006-08-25

    Pseudo-second-order kinetic expressions of Ho, Sobkowski and Czerwinski, Blanchard et al. and Ritchie were fitted to the experimental kinetic data of malachite green onto activated carbon by non-linear and linear methods. The non-linear method was found to be a better way of obtaining the parameters involved in the second-order rate kinetic expressions. Both linear and non-linear regression showed that the Sobkowski and Czerwinski and the Ritchie pseudo-second-order models were the same. Non-linear regression analysis showed that both Blanchard et al. and Ho had similar ideas on the pseudo-second-order model but with different assumptions. The best fit of the experimental data in Ho's pseudo-second-order expression by linear and non-linear regression methods showed that the Ho pseudo-second-order model was a better kinetic expression than the other pseudo-second-order kinetic expressions. The amount of dye adsorbed at equilibrium, qe, was predicted from the Ho pseudo-second-order expression and fitted to the Langmuir, Freundlich and Redlich-Peterson expressions by both linear and non-linear methods to obtain the pseudo isotherms. The best-fitting pseudo isotherms were found to be the Langmuir and Redlich-Peterson isotherms. Redlich-Peterson is a special case of Langmuir when the constant g equals unity.
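
    The linear-versus-non-linear isotherm fitting at the core of this comparison can be sketched for the Langmuir case; the equilibrium data are synthetic, and the noise level and initial guesses are assumptions.

    ```python
    # Langmuir isotherm fitted by non-linear regression and by its linearized form.
    import numpy as np
    from scipy.optimize import curve_fit

    def langmuir(Ce, qm, KL):
        return qm * KL * Ce / (1.0 + KL * Ce)

    rng = np.random.default_rng(10)
    Ce = np.linspace(5, 200, 12)                          # equilibrium concentration
    qe = langmuir(Ce, qm=100.0, KL=0.05) * (1 + 0.03 * rng.normal(size=Ce.size))

    # Non-linear regression on the original equation.
    (qm_nl, KL_nl), _ = curve_fit(langmuir, Ce, qe, p0=(80.0, 0.01))

    # Linearized form: Ce/qe = Ce/qm + 1/(KL*qm).
    slope, intercept = np.polyfit(Ce, Ce / qe, 1)
    qm_lin, KL_lin = 1.0 / slope, slope / intercept

    print("non-linear fit:  qm =", qm_nl, " KL =", KL_nl)
    print("linearized fit:  qm =", qm_lin, " KL =", KL_lin)
    ```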

  2. [Influence of autoclave sterilization on dimensional stability and detail reproduction of 5 additional silicone impression materials].

    PubMed

    Xu, Tong-kai; Sun, Zhi-hui; Jiang, Yong

    2012-03-01

    To evaluate the dimensional stability and detail reproduction of five additional silicone impression materials after autoclave sterilization. Impressions were made on the ISO 4823 standard mold, which contains several marking lines, in five kinds of additional silicone. All the impressions were sterilized by high temperature and pressure (135 °C, 212.8 kPa) for 25 min. Linear measurements before and after sterilization were made with a measuring microscope. Statistical analysis used single-factor analysis with pair-wise comparison of mean values where appropriate; hypothesis testing was conducted at alpha = 0.05. No significant difference was found between the pre-sterilization and post-sterilization conditions for any location, and all absolute values of the linear rate of change were less than 8%. Autoclave sterilization did not affect the surface detail reproduction of the five impression materials. The dimensional stability and detail reproduction of the five additional silicone impression materials in this study were unaffected by autoclave sterilization.

  3. Accuracy of 1H magnetic resonance spectroscopy for quantification of 2-hydroxyglutarate using linear combination and J-difference editing at 9.4T.

    PubMed

    Neuberger, Ulf; Kickingereder, Philipp; Helluy, Xavier; Fischer, Manuel; Bendszus, Martin; Heiland, Sabine

    2017-12-01

    Non-invasive detection of 2-hydroxyglutarate (2HG) by magnetic resonance spectroscopy is attractive since it is related to tumor metabolism. Here, we compare the detection accuracy of 2HG in a controlled phantom setting via widely used localized spectroscopy sequences quantified by linear combination of metabolite signals vs. a more complex approach applying a J-difference editing technique at 9.4T. Different phantoms, composed of a concentration series of 2HG and overlapping brain metabolites, were measured with an optimized point-resolved-spectroscopy (PRESS) sequence and an in-house developed J-difference editing sequence. The acquired spectra were post-processed with LCModel and a simulated metabolite set (PRESS) or with a quantification formula for J-difference editing. Linear regression analysis demonstrated a high correlation of real 2HG values with those measured with the PRESS method (adjusted R-squared: 0.700, p<0.001) as well as with those measured with the J-difference editing method (adjusted R-squared: 0.908, p<0.001). The regression model with the J-difference editing method, however, had a significantly higher explanatory value than the regression model with the PRESS method (p<0.0001). Moreover, with J-difference editing 2HG was discernible down to 1 mM, whereas with the PRESS method 2HG values were not discernible below 2 mM and showed higher systematic errors, particularly in phantoms with high concentrations of N-acetyl-aspartate (NAA) and glutamate (Glu). In summary, quantification of 2HG with linear combination of metabolite signals shows high systematic errors, particularly at low 2HG concentrations and high concentrations of confounding metabolites such as NAA and Glu. In contrast, J-difference editing offers a more accurate quantification even at low 2HG concentrations, which outweighs the downsides of longer measurement time and more complex postprocessing. Copyright © 2017. Published by Elsevier GmbH.

  4. Three-dimensional to two-dimensional transition in mode-I fracture microbranching in a perturbed hexagonal close-packed lattice

    NASA Astrophysics Data System (ADS)

    Heizler, Shay I.; Kessler, David A.

    2017-06-01

    Mode-I fracture exhibits microbranching in the high velocity regime where the simple straight crack is unstable. For velocities below the instability, classic modeling using linear elasticity is valid. However, showing the existence of the instability and calculating the dynamics postinstability within the linear elastic framework is difficult and controversial. The experimental results give several indications that the microbranching phenomenon is basically a three-dimensional (3D) phenomenon. Nevertheless, the theoretical effort has been focused mostly on two-dimensional (2D) modeling. In this paper we study the microbranching instability using three-dimensional atomistic simulations, exploring the difference between the 2D and the 3D models. We find that the basic 3D fracture pattern shares similar behavior with the 2D case. Nevertheless, we exhibit a clear 3D-to-2D transition as the crack velocity increases: as long as the microbranches are sufficiently small, the behavior is purely three-dimensional, whereas at large driving, as the size of the microbranches increases, more 2D-like behavior is exhibited. In addition, in 3D simulations the quantitative features of the microbranches that separate the regimes of steady-state cracks (mirror) and postinstability (mist-hackle) are reproduced clearly, consistent with the experimental findings.

  5. Evolution of the linear-polarization-angle-dependence of the radiation-induced magnetoresistance-oscillations with microwave power

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ye, Tianyu; Mani, R. G.; Wegscheider, W.

    2014-11-10

    We examine the role of the microwave power in the linear polarization angle dependence of the microwave radiation induced magnetoresistance oscillations observed in the high mobility GaAs/AlGaAs two dimensional electron system. The diagonal resistance R_xx was measured at the fixed magnetic fields of the photo-excited oscillatory extrema of R_xx as a function of both the microwave power, P, and the linear polarization angle, θ. Color contour plots of such measurements demonstrate the evolution of the lineshape of R_xx versus θ with increasing microwave power. We report that the non-linear power dependence of the amplitude of the radiation-induced magnetoresistance oscillations distorts the cosine-square relation between R_xx and θ at high power.

  6. Comparison of Neural Network and Linear Regression Models in Statistically Predicting Mental and Physical Health Status of Breast Cancer Survivors

    DTIC Science & Technology

    2015-07-15

    Long-term effects on cancer survivors' quality of life of physical training versus physical training combined with cognitive-behavioral therapy.

  7. Prediction of the Main Engine Power of a New Container Ship at the Preliminary Design Stage

    NASA Astrophysics Data System (ADS)

    Cepowski, Tomasz

    2017-06-01

    The paper presents mathematical relationships that allow the estimated main engine power of new container ships to be forecast, based on data for vessels built in 2005-2015. The presented approximations estimate the engine power from the length between perpendiculars and the number of containers the ship will carry. The approximations were developed using simple linear regression and multivariate linear regression analysis. The presented relations have practical application for estimating the container ship engine power needed in preliminary parametric design of the ship. The results show that multiple linear regression predicts the main engine power of a container ship more accurately than simple linear regression.
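    As a rough sketch of the two regression variants the paper compares, the snippet below fits a simple model (container count only) and a multivariate model (length between perpendiculars plus container count); all design values are invented placeholders, not the 2005-2015 fleet data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical design data: length between perpendiculars Lpp [m],
# container capacity [TEU], and main engine power [kW]
X = np.array([[210, 2800], [250, 4300], [290, 6500],
              [320, 8400], [350, 11000], [366, 13100]], dtype=float)
power = np.array([21500, 28900, 41100, 51700, 62300, 68500], dtype=float)

simple = LinearRegression().fit(X[:, [1]], power)  # TEU only
multi = LinearRegression().fit(X, power)           # Lpp + TEU

print("simple R^2:  ", simple.score(X[:, [1]], power))
print("multiple R^2:", multi.score(X, power))
```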

  8. High Resolution, Large Deformation 3D Traction Force Microscopy

    PubMed Central

    López-Fagundo, Cristina; Reichner, Jonathan; Hoffman-Kim, Diane; Franck, Christian

    2014-01-01

    Traction Force Microscopy (TFM) is a powerful approach for quantifying cell-material interactions that over the last two decades has contributed significantly to our understanding of cellular mechanosensing and mechanotransduction. In addition, recent advances in three-dimensional (3D) imaging and traction force analysis (3D TFM) have highlighted the significance of the third dimension in influencing various cellular processes. Yet irrespective of dimensionality, almost all TFM approaches have relied on a linear elastic theory framework to calculate cell surface tractions. Here we present a new high resolution 3D TFM algorithm which utilizes a large deformation formulation to quantify cellular displacement fields with unprecedented resolution. The results feature some of the first experimental evidence that cells are indeed capable of exerting large material deformations, which require the formulation of a new theoretical TFM framework to accurately calculate the traction forces. Based on our previous 3D TFM technique, we reformulate our approach to accurately account for large material deformation and quantitatively contrast and compare both linear and large deformation frameworks as a function of the applied cell deformation. Particular attention is paid to estimating the accuracy penalty associated with utilizing a traditional linear elastic approach in the presence of large deformation gradients. PMID:24740435

  9. Comparative study of microwave radiation-induced magnetoresistive oscillations induced by circularly- and linearly- polarized photo-excitation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ye, Tianyu; Liu, Han-Chun; Wang, Zhuo

    A comparative study of the radiation-induced magnetoresistance oscillations in the high mobility GaAs/AlGaAs heterostructure two dimensional electron system (2DES) under linearly- and circularly-polarized microwave excitation indicates a profound difference in the response observed upon rotating the microwave launcher for the two cases, although circularly polarized microwave radiation induced magnetoresistance oscillations observed at low magnetic fields are similar to the oscillations observed with linearly polarized radiation. For the linearly polarized radiation, the magnetoresistive response is a strong sinusoidal function of the launcher rotation (or linear polarization) angle, θ. In contrast, for circularly polarized radiation, the oscillatory magnetoresistive response is hardly sensitive to θ.

  10. Comparative study of microwave radiation-induced magnetoresistive oscillations induced by circularly- and linearly- polarized photo-excitation

    DOE PAGES

    Ye, Tianyu; Liu, Han-Chun; Wang, Zhuo; ...

    2015-10-09

    A comparative study of the radiation-induced magnetoresistance oscillations in the high mobility GaAs/AlGaAs heterostructure two dimensional electron system (2DES) under linearly- and circularly-polarized microwave excitation indicates a profound difference in the response observed upon rotating the microwave launcher for the two cases, although circularly polarized microwave radiation induced magnetoresistance oscillations observed at low magnetic fields are similar to the oscillations observed with linearly polarized radiation. For the linearly polarized radiation, the magnetoresistive response is a strong sinusoidal function of the launcher rotation (or linear polarization) angle, θ. In contrast, for circularly polarized radiation, the oscillatory magnetoresistive response is hardly sensitive to θ.

  11. Estimation of Standard Error of Regression Effects in Latent Regression Models Using Binder's Linearization. Research Report. ETS RR-07-09

    ERIC Educational Resources Information Center

    Li, Deping; Oranje, Andreas

    2007-01-01

    Two versions of a general method for approximating standard error of regression effect estimates within an IRT-based latent regression model are compared. The general method is based on Binder's (1983) approach, accounting for complex samples and finite populations by Taylor series linearization. In contrast, the current National Assessment of…

  12. Random forests on Hadoop for genome-wide association studies of multivariate neuroimaging phenotypes

    PubMed Central

    2013-01-01

    Motivation: Multivariate quantitative traits arise naturally in recent neuroimaging genetics studies, in which both structural and functional variability of the human brain is measured non-invasively through techniques such as magnetic resonance imaging (MRI). There is growing interest in detecting genetic variants associated with such multivariate traits, especially in genome-wide studies. Random forests (RFs) classifiers, which are ensembles of decision trees, are amongst the best performing machine learning algorithms and have been successfully employed for the prioritisation of genetic variants in case-control studies. RFs can also be applied to produce gene rankings in association studies with multivariate quantitative traits, and to estimate genetic similarity measures that are predictive of the trait. However, in studies involving hundreds of thousands of SNPs and high-dimensional traits, a very large ensemble of trees must be inferred from the data in order to obtain reliable rankings, which makes the application of these algorithms computationally prohibitive. Results: We have developed a parallel version of the RF algorithm for regression and genetic similarity learning tasks in large-scale population genetic association studies involving multivariate traits, called PaRFR (Parallel Random Forest Regression). Our implementation takes advantage of the MapReduce programming model and is deployed on Hadoop, an open-source software framework that supports data-intensive distributed applications. Notable speed-ups are obtained by introducing a distance-based criterion for node splitting in the tree estimation process. PaRFR has been applied to a genome-wide association study on Alzheimer's disease (AD) in which the quantitative trait consists of a high-dimensional neuroimaging phenotype describing longitudinal changes in the human brain structure. PaRFR provides a ranking of SNPs associated with this trait, and produces pair-wise measures of genetic proximity that can be directly compared to pair-wise measures of phenotypic proximity. Several known AD-related variants have been identified, including APOE4 and TOMM40. We also present experimental evidence supporting the hypothesis of a linear relationship between the number of top-ranked mutated states, or frequent mutation patterns, and an indicator of disease severity. Availability: The Java codes are freely available at http://www2.imperial.ac.uk/~gmontana. PMID:24564704

  13. Random forests on Hadoop for genome-wide association studies of multivariate neuroimaging phenotypes.

    PubMed

    Wang, Yue; Goh, Wilson; Wong, Limsoon; Montana, Giovanni

    2013-01-01

    Multivariate quantitative traits arise naturally in recent neuroimaging genetics studies, in which both structural and functional variability of the human brain is measured non-invasively through techniques such as magnetic resonance imaging (MRI). There is growing interest in detecting genetic variants associated with such multivariate traits, especially in genome-wide studies. Random forests (RFs) classifiers, which are ensembles of decision trees, are amongst the best performing machine learning algorithms and have been successfully employed for the prioritisation of genetic variants in case-control studies. RFs can also be applied to produce gene rankings in association studies with multivariate quantitative traits, and to estimate genetic similarity measures that are predictive of the trait. However, in studies involving hundreds of thousands of SNPs and high-dimensional traits, a very large ensemble of trees must be inferred from the data in order to obtain reliable rankings, which makes the application of these algorithms computationally prohibitive. We have developed a parallel version of the RF algorithm for regression and genetic similarity learning tasks in large-scale population genetic association studies involving multivariate traits, called PaRFR (Parallel Random Forest Regression). Our implementation takes advantage of the MapReduce programming model and is deployed on Hadoop, an open-source software framework that supports data-intensive distributed applications. Notable speed-ups are obtained by introducing a distance-based criterion for node splitting in the tree estimation process. PaRFR has been applied to a genome-wide association study on Alzheimer's disease (AD) in which the quantitative trait consists of a high-dimensional neuroimaging phenotype describing longitudinal changes in the human brain structure. PaRFR provides a ranking of SNPs associated with this trait, and produces pair-wise measures of genetic proximity that can be directly compared to pair-wise measures of phenotypic proximity. Several known AD-related variants have been identified, including APOE4 and TOMM40. We also present experimental evidence supporting the hypothesis of a linear relationship between the number of top-ranked mutated states, or frequent mutation patterns, and an indicator of disease severity. The Java codes are freely available at http://www2.imperial.ac.uk/~gmontana.
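    On a single machine, the two PaRFR outputs described above (a SNP ranking and pair-wise genetic proximities) can be sketched with scikit-learn's random forest as a stand-in for the Hadoop/MapReduce implementation; the genotypes and trait below are simulated.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
# Simulated stand-in: 200 subjects, 500 SNPs (0/1/2 minor-allele counts),
# and a scalar summary of a multivariate imaging trait
X = rng.integers(0, 3, size=(200, 500)).astype(float)
y = X[:, :5].sum(axis=1) + rng.normal(0, 1, 200)  # 5 causal SNPs

rf = RandomForestRegressor(n_estimators=500, n_jobs=-1, random_state=0).fit(X, y)

# SNP ranking by impurity importance (n_jobs=-1 is a single-machine
# stand-in for the distributed tree building PaRFR runs on Hadoop)
top = np.argsort(-rf.feature_importances_)[:10]
print("top-ranked SNPs:", top)

# Pairwise proximity: fraction of trees in which two subjects share a leaf
leaves = rf.apply(X)  # shape (n_subjects, n_trees)
prox = (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)
print("proximity of subjects 0 and 1:", prox[0, 1])
```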

  14. Regression assumptions in clinical psychology research practice-a systematic review of common misconceptions.

    PubMed

    Ernst, Anja F; Albers, Casper J

    2017-01-01

    Misconceptions about the assumptions behind the standard linear regression model are widespread and dangerous. These lead to using linear regression when inappropriate, and to employing alternative procedures with less statistical power when unnecessary. Our systematic literature review investigated employment and reporting of assumption checks in twelve clinical psychology journals. Findings indicate that normality of the variables themselves, rather than of the errors, was wrongfully held for a necessary assumption in 4% of papers that use regression. Furthermore, 92% of all papers using linear regression were unclear about their assumption checks, violating APA-recommendations. This paper appeals for a heightened awareness for and increased transparency in the reporting of statistical assumption checking.

  15. Regression assumptions in clinical psychology research practice—a systematic review of common misconceptions

    PubMed Central

    Ernst, Anja F.

    2017-01-01

    Misconceptions about the assumptions behind the standard linear regression model are widespread and dangerous. These lead to using linear regression when inappropriate, and to employing alternative procedures with less statistical power when unnecessary. Our systematic literature review investigated employment and reporting of assumption checks in twelve clinical psychology journals. Findings indicate that normality of the variables themselves, rather than of the errors, was wrongfully held for a necessary assumption in 4% of papers that use regression. Furthermore, 92% of all papers using linear regression were unclear about their assumption checks, violating APA-recommendations. This paper appeals for a heightened awareness for and increased transparency in the reporting of statistical assumption checking. PMID:28533971

  16. Development of a linearized unsteady Euler analysis for turbomachinery blade rows

    NASA Technical Reports Server (NTRS)

    Verdon, Joseph M.; Montgomery, Matthew D.; Kousen, Kenneth A.

    1995-01-01

    A linearized unsteady aerodynamic analysis for axial-flow turbomachinery blading is described in this report. The linearization is based on the Euler equations of fluid motion and is motivated by the need for an efficient aerodynamic analysis that can be used in predicting the aeroelastic and aeroacoustic responses of blade rows. The field equations and surface conditions required for inviscid, nonlinear and linearized, unsteady aerodynamic analyses of three-dimensional flow through a single blade row operating within a cylindrical duct are derived. An existing numerical algorithm for determining time-accurate solutions of the nonlinear unsteady flow problem is described, and a numerical model, based upon this nonlinear flow solver, is formulated for the first-harmonic linear unsteady problem. The linearized aerodynamic and numerical models have been implemented into a first-harmonic unsteady flow code, called LINFLUX. At present this code applies only to two-dimensional flows, but an extension to three dimensions is planned as future work. The three-dimensional aerodynamic and numerical formulations are described in this report. Numerical results for two-dimensional unsteady cascade flows, excited by prescribed blade motions and prescribed aerodynamic disturbances at inlet and exit, are also provided to illustrate the present capabilities of the LINFLUX analysis.

  17. Tolerance of ciliated protozoan Paramecium bursaria (Protozoa, Ciliophora) to ammonia and nitrites

    NASA Astrophysics Data System (ADS)

    Xu, Henglong; Song, Weibo; Lu, Lu; Alan, Warren

    2005-09-01

    The tolerance to ammonia and nitrite of the freshwater ciliate Paramecium bursaria was measured in a conventional open system. The ciliate was exposed to different concentrations of ammonia and nitrite for 2 h and 12 h in order to determine the lethal concentrations. Linear regression analysis using the probit scale method (with 95% confidence intervals) gave a 2 h LC50 of 95.94 mg/L for ammonia and 27.35 mg/L for nitrite. There was a linear correlation between the mortality probit scale and the logarithmic concentration of ammonia, fitted by the regression equation y = 7.32x - 9.51 (R² = 0.98; y, mortality probit scale; x, logarithmic ammonia concentration), from which the 2 h LC50 for ammonia was found to be 95.50 mg/L. The linear correlation between the mortality probit scale and the logarithmic nitrite concentration followed the regression equation y = 2.86x + 0.89 (R² = 0.95). Regression analysis of the toxicity curves showed that the relation between exposure time and the ammonia-N LC50 value followed y = 2862.85·e^(-0.08x) (R² = 0.95; y, duration of exposure to the LC50 value; x, LC50 value), and that between exposure time and the nitrite-N LC50 value followed y = 127.15·e^(-0.13x) (R² = 0.91). The results demonstrate that the tolerance to ammonia in P. bursaria is considerably higher than that of the larvae or juveniles of some metazoa, e.g. cultured prawns and oysters. In addition, ciliates, as bacterial predators, are likely to play a positive role in maintaining and improving water quality in aquatic environments with high ammonium levels, such as sewage treatment systems.
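    The probit calculation behind these LC50 estimates can be sketched in a few lines: regress the probits of observed mortalities on log10 concentration, then invert the fitted line at probit 5 (50% mortality). The mortality data below are invented, not the study's measurements.

```python
import numpy as np
from scipy import stats

# Hypothetical mortality data for a 2 h ammonia exposure
conc = np.array([40.0, 60.0, 80.0, 100.0, 120.0])     # mg/L
mortality = np.array([0.08, 0.21, 0.38, 0.55, 0.71])  # fraction dead

x = np.log10(conc)
probit = stats.norm.ppf(mortality) + 5.0  # classical probit scale

slope, intercept, r, p, se = stats.linregress(x, probit)
lc50 = 10 ** ((5.0 - intercept) / slope)  # probit 5 <=> 50% mortality
print(f"y = {slope:.2f}x + {intercept:.2f}, R^2 = {r**2:.2f}, LC50 = {lc50:.1f} mg/L")
```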

  18. High school science enrollment of black students

    NASA Astrophysics Data System (ADS)

    Goggins, Ellen O.; Lindbeck, Joy S.

    How can the high school science enrollment of black students be increased? School and home counseling and classroom procedures could benefit from variables identified as predictors of science enrollment. The problem in this study was to identify a set of variables which characterize science course enrollment by black secondary students. The population consisted of a subsample of 3963 black high school seniors from The High School and Beyond 1980 Base-Year Survey. Using multiple linear regression, backward regression, and correlation analyses, the US Census regions and grades mostly As and Bs in English were found to be significant predictors of the number of science courses scheduled by black seniors.

  19. Estimating linear temporal trends from aggregated environmental monitoring data

    USGS Publications Warehouse

    Erickson, Richard A.; Gray, Brian R.; Eager, Eric A.

    2017-01-01

    Trend estimates are often used as part of environmental monitoring programs. These trends inform managers (e.g., are desired species increasing or undesired species decreasing?). Data collected from environmental monitoring programs are often aggregated (i.e., averaged), which confounds sampling and process variation. State-space models allow sampling variation and process variation to be separated. We used simulated time series to compare linear trend estimates from three state-space models, a simple linear regression model, and an autoregressive model. We also compared the performance of these five models in estimating trends from a long-term monitoring program, specifically for two species of fish and four species of aquatic vegetation from the Upper Mississippi River system. We found that simple linear regression performed best of the five models because it was best able to recover parameters and converged consistently; conversely, it did the worst job of estimating the population in a given year. The state-space models did not estimate trends well, but estimated population sizes best when they converged. Overall, a simple linear regression performed better than more complex autoregressive and state-space models when used to analyze aggregated environmental monitoring data.
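    A minimal sketch of the simple linear regression trend estimate the study favors, applied to a simulated aggregated series; the trend, noise level, and years below are invented.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
years = np.arange(2000, 2020)
# Simulated aggregated index: true trend +0.5/yr plus combined
# process and sampling noise (aggregation confounds the two)
y = 10 + 0.5 * (years - 2000) + rng.normal(0, 1.5, years.size)

X = sm.add_constant(years - 2000)
fit = sm.OLS(y, X).fit()
print("intercept, trend:", fit.params)
print("95% CI for trend:", fit.conf_int()[1])
```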

  20. Fire spread in chaparral – a comparison of laboratory data and model predictions in burning live fuels

    Treesearch

    David R. Weise; Eunmo Koo; Xiangyang Zhou; Shankar Mahalingam; Frédéric Morandini; Jacques-Henri Balbi

    2016-01-01

    Fire behaviour data from 240 laboratory fires in high-density live chaparral fuel beds were compared with model predictions. Logistic regression was used to develop a model to predict fire spread success in the fuel beds and linear regression was used to predict rate of spread. Predictions from the Rothermel equation and three proposed changes as well as two physically...

  1. Ensemble learning with trees and rules: supervised, semi-supervised, unsupervised

    USDA-ARS?s Scientific Manuscript database

    In this article, we propose several new approaches for post-processing a large ensemble of conjunctive rules for supervised and semi-supervised learning problems. We show with various examples that for high dimensional regression problems the models constructed by post-processing the rules with ...

  2. Hierarchical cluster-based partial least squares regression (HC-PLSR) is an efficient tool for metamodelling of nonlinear dynamic models.

    PubMed

    Tøndel, Kristin; Indahl, Ulf G; Gjuvsland, Arne B; Vik, Jon Olav; Hunter, Peter; Omholt, Stig W; Martens, Harald

    2011-06-01

    Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback loops. HC-PLSR is a promising approach for metamodelling in systems biology, especially for highly nonlinear or non-monotone parameter to phenotype maps. The algorithm can be flexibly adjusted to suit the complexity of the dynamic model behaviour, inviting automation in the metamodelling of complex systems.
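    The core HC-PLSR idea (partition the input space according to the structure of the response surface, then fit a local PLS regression per cluster) can be sketched as follows; KMeans serves here as a crisp stand-in for the paper's fuzzy C-means step, and the test function is invented.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(400, 5))
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + 0.1 * rng.normal(size=400)  # non-monotone map

# Global PLSR baseline
global_pls = PLSRegression(n_components=3).fit(X, y)

# Cluster on inputs plus response so clusters follow the response surface
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(
    np.column_stack([X, y]))
local_pls = {k: PLSRegression(n_components=3).fit(X[labels == k], y[labels == k])
             for k in range(4)}

# In-sample residual check: local models should track the curvature better
res_global = y - global_pls.predict(X).ravel()
res_local = np.concatenate(
    [y[labels == k] - local_pls[k].predict(X[labels == k]).ravel()
     for k in range(4)])
print(res_global.std(), res_local.std())
```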

  3. Hierarchical Cluster-based Partial Least Squares Regression (HC-PLSR) is an efficient tool for metamodelling of nonlinear dynamic models

    PubMed Central

    2011-01-01

    Background: Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Results: Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback loops. Conclusions: HC-PLSR is a promising approach for metamodelling in systems biology, especially for highly nonlinear or non-monotone parameter to phenotype maps. The algorithm can be flexibly adjusted to suit the complexity of the dynamic model behaviour, inviting automation in the metamodelling of complex systems. PMID:21627852

  4. TH-CD-207A-07: Prediction of High Dimensional State Subject to Respiratory Motion: A Manifold Learning Approach

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Liu, W; Sawant, A; Ruan, D

    Purpose: The development of high dimensional imaging systems (e.g. volumetric MRI, CBCT, photogrammetry systems) in image-guided radiotherapy provides important pathways to the ultimate goal of real-time volumetric/surface motion monitoring. This study aims to develop a prediction method for the high dimensional state subject to respiratory motion. Compared to conventional linear dimension reduction based approaches, our method utilizes manifold learning to construct a descriptive feature submanifold, where more efficient and accurate prediction can be performed. Methods: We developed a prediction framework for the high-dimensional state subject to respiratory motion. The proposed method performs dimension reduction in a nonlinear setting to permit more descriptive features compared to its linear counterparts (e.g., classic PCA). Specifically, a kernel PCA is used to construct a proper low-dimensional feature manifold, where low-dimensional prediction is performed. A fixed-point iterative pre-image estimation method is applied subsequently to recover the predicted value in the original state space. We evaluated and compared the proposed method with a PCA-based method on 200 level-set surfaces reconstructed from surface point clouds captured by the VisionRT system. The prediction accuracy was evaluated with respect to root-mean-squared error (RMSE) for both 200 ms and 600 ms lookahead lengths. Results: The proposed method outperformed the PCA-based approach with statistically higher prediction accuracy. In a one-dimensional feature subspace, our method achieved mean prediction accuracy of 0.86 mm and 0.89 mm for 200 ms and 600 ms lookahead lengths respectively, compared to 0.95 mm and 1.04 mm from the PCA-based method. Paired t-tests further demonstrated the statistical significance of the superiority of our method, with p-values of 6.33e-3 and 5.78e-5, respectively. Conclusion: The proposed approach benefits from the descriptiveness of a nonlinear manifold and the prediction reliability in such a low dimensional manifold. The fixed-point iterative approach turns out to work well practically for the pre-image recovery. Our approach is particularly suitable to facilitate managing respiratory motion in image-guided radiotherapy. This work is supported in part by NIH grant R01 CA169102-02.
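    A rough sketch of the nonlinear feature-manifold idea under stated assumptions: kernel PCA maps surrogate high-dimensional states onto a low-dimensional manifold, a simple predictor advances the latent coordinate, and a pre-image step recovers the full state. scikit-learn's ridge-based inverse_transform is used as a stand-in for the fixed-point pre-image iteration described in the abstract, and both the data and the linear-extrapolation predictor are placeholders.

```python
import numpy as np
from sklearn.decomposition import KernelPCA

rng = np.random.default_rng(0)
# Surrogate "surfaces": 200 fifty-dimensional states driven by one phase variable
phase = np.sort(rng.uniform(0, 2 * np.pi, 200))
states = np.column_stack([np.sin(phase + d) for d in np.linspace(0, 1, 50)])
states += 0.01 * rng.normal(size=states.shape)

kpca = KernelPCA(n_components=1, kernel="rbf", gamma=0.01,
                 fit_inverse_transform=True)
z = kpca.fit_transform(states)  # coordinates on the 1-D feature manifold

# Placeholder predictor: linear extrapolation of the latent coordinate
z_next = 2 * z[-1] - z[-2]

# Pre-image recovery back to the 50-dimensional state space
state_next = kpca.inverse_transform(z_next.reshape(1, -1))
print(state_next.shape)  # (1, 50)
```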

  5. Comparing The Effectiveness of a90/95 Calculations (Preprint)

    DTIC Science & Technology

    2006-09-01

    Kutner, Nachtsheim, Neter and Li, Applied Linear Statistical Models, 5th ed., McGraw-Hill/Irwin, 2005; Mood, Graybill and Boes, Introduction to the Theory of Statistics. The a90/95 curves are based on methods that are only valid for ordinary linear regression. Requirements for a valid ordinary least-squares regression model: 1. The model must be linear in its parameters. 2. Uniform variance (homoscedasticity...

  6. Koopman operator theory: Past, present, and future

    NASA Astrophysics Data System (ADS)

    Brunton, Steven; Kaiser, Eurika; Kutz, Nathan

    2017-11-01

    Koopman operator theory has emerged as a dominant method to represent nonlinear dynamics in terms of an infinite-dimensional linear operator. The Koopman operator acts on the space of all possible measurement functions of the system state, advancing these measurements with the flow of the dynamics. A linear representation of nonlinear dynamics has tremendous potential to enable the prediction, estimation, and control of nonlinear systems with standard textbook methods developed for linear systems. Dynamic mode decomposition has become the leading data-driven method to approximate the Koopman operator, although there are still open questions and challenges around how to obtain accurate approximations for strongly nonlinear systems. This talk will provide an introductory overview of modern Koopman operator theory, reviewing the basics and describing recent theoretical and algorithmic developments. Particular emphasis will be placed on the use of data-driven Koopman theory to characterize and control high-dimensional fluid dynamic systems. This talk will also address key advances in the rapidly growing fields of machine learning and data science that are likely to drive future developments.
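    Since the abstract names dynamic mode decomposition (DMD) as the leading data-driven approximation of the Koopman operator, a bare-bones exact-DMD sketch in numpy may help; the snapshot matrix here is synthetic and the truncation rank r = 4 is an arbitrary choice.

```python
import numpy as np

def dmd(X, r):
    """Exact DMD of a snapshot matrix X (snapshots as columns), rank r."""
    X1, X2 = X[:, :-1], X[:, 1:]
    U, s, Vh = np.linalg.svd(X1, full_matrices=False)
    U, s, Vh = U[:, :r], s[:r], Vh[:r]
    Atilde = U.conj().T @ X2 @ Vh.conj().T / s   # low-rank Koopman approximation
    eigvals, W = np.linalg.eig(Atilde)
    modes = X2 @ Vh.conj().T / s @ W             # exact DMD modes
    return eigvals, modes

# Synthetic snapshots: two oscillating spatial structures plus noise
t = np.linspace(0, 10, 200)
x = np.linspace(-5, 5, 64)[:, None]
rng = np.random.default_rng(0)
X = (np.tanh(x) * np.cos(2.3 * t) + np.exp(-x**2) * np.sin(5.1 * t)
     + 0.01 * rng.normal(size=(64, 200)))

eigvals, modes = dmd(X, r=4)
freqs = np.log(eigvals.astype(complex)).imag / (t[1] - t[0])
print(np.sort(np.abs(freqs)))  # should recover ~2.3 and ~5.1 rad/unit time
```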

  7. Flow Equation Approach to the Statistics of Nonlinear Dynamical Systems

    NASA Astrophysics Data System (ADS)

    Marston, J. B.; Hastings, M. B.

    2005-03-01

    The probability distribution function of non-linear dynamical systems is governed by a linear framework that resembles quantum many-body theory, in which stochastic forcing and/or averaging over initial conditions play the role of a non-zero ℏ. Besides the well-known Fokker-Planck approach, there is a related Hopf functional method [Uriel Frisch, Turbulence: The Legacy of A. N. Kolmogorov (Cambridge University Press, 1995), chapter 9.5]; in both formalisms, zero modes of linear operators describe the stationary non-equilibrium statistics. To access the statistics, we investigate the method of continuous unitary transformations [S. D. Glazek and K. G. Wilson, Phys. Rev. D 48, 5863 (1993); Phys. Rev. D 49, 4214 (1994)] (also known as the flow equation approach [F. Wegner, Ann. Phys. 3, 77 (1994)]), suitably generalized to the diagonalization of non-Hermitian matrices. Comparison to the more traditional cumulant expansion method is illustrated with low-dimensional attractors. The treatment of high-dimensional dynamical systems is also discussed.

  8. One dimensional wavefront distortion sensor comprising a lens array system

    DOEpatents

    Neal, Daniel R.; Michie, Robert B.

    1996-01-01

    A 1-dimensional sensor for measuring wavefront distortion of a light beam as a function of time and spatial position includes a lens system which incorporates a linear array of lenses, and a detector system which incorporates a linear array of light detectors positioned from the lens system so that light passing through any of the lenses is focused on at least one of the light detectors. The 1-dimensional sensor determines the slope of the wavefront from the locations of the detectors illuminated by the light. The 1-dimensional sensor has much greater bandwidth than 2-dimensional systems.

  9. Stable Direct Adaptive Control of Linear Infinite-dimensional Systems Using a Command Generator Tracker Approach

    NASA Technical Reports Server (NTRS)

    Balas, M. J.; Kaufman, H.; Wen, J.

    1985-01-01

    A command generator tracker approach to model following control of linear distributed parameter systems (DPS) whose dynamics are described on infinite dimensional Hilbert spaces is presented. This method generates finite dimensional controllers capable of exponentially stable tracking of the reference trajectories when certain ideal trajectories are known to exist for the open loop DPS; we present conditions for the existence of these ideal trajectories. An adaptive version of this type of controller is also presented and shown to achieve (in some cases, asymptotically) stable finite dimensional control of the infinite dimensional DPS.

  10. One dimensional wavefront distortion sensor comprising a lens array system

    DOEpatents

    Neal, D.R.; Michie, R.B.

    1996-02-20

    A 1-dimensional sensor for measuring wavefront distortion of a light beam as a function of time and spatial position includes a lens system which incorporates a linear array of lenses, and a detector system which incorporates a linear array of light detectors positioned from the lens system so that light passing through any of the lenses is focused on at least one of the light detectors. The 1-dimensional sensor determines the slope of the wavefront from the locations of the detectors illuminated by the light. The 1-dimensional sensor has much greater bandwidth than 2-dimensional systems. 8 figs.

  11. Numerical study of electromagnetic scattering from one-dimensional nonlinear fractal sea surface

    NASA Astrophysics Data System (ADS)

    Xie, Tao; He, Chao; William, Perrie; Kuang, Hai-Lan; Zou, Guang-Hui; Chen, Wei

    2010-02-01

    In recent years, linear fractal models of the sea surface have been developed in order to establish an electromagnetic backscattering model. Unfortunately, the sea surface is always nonlinear, particularly at high sea states. We present a nonlinear fractal sea surface model and derive an electromagnetic backscattering model from it. Using this model, we numerically calculate the normalized radar cross section (NRCS) of a nonlinear sea surface. Comparing the averaged NRCS between the linear and nonlinear fractal models, we show that the NRCS of a linear fractal sea surface underestimates the NRCS of the real sea surface, especially for sea states with high fractal dimensions, and for dominant ocean surface gravity waves that are either very short or extremely long.

  12. Decoherence mechanisms of Landau level THz excitations in two dimensional electron gases

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Maissen, Curdin; Scalari, Giacomo; Faist, Jérôme

    2013-12-04

    We report coherent THz transmission measurements on different two dimensional electron gases (2DEGs) in magnetic field. The investigated 2DEGs form in GaAs/AlGaAs heterostructures. A short (1 ps) linearly polarized THz pulse is used to excite inter-Landau-level transitions. The circularly polarized radiation emitted by the 2DEG is then measured by electro-optic sampling of the linear component orthogonal to the pump pulse polarization. Here we present measurements on two high mobility samples with μ = 5×10^6 cm^2/Vs and μ = 16×10^6 cm^2/Vs, respectively. The decay times of the emitted radiation are 5.5 ps and 9 ps, respectively, at 2 K.

  13. Support vector methods for survival analysis: a comparison between ranking and regression approaches.

    PubMed

    Van Belle, Vanya; Pelckmans, Kristiaan; Van Huffel, Sabine; Suykens, Johan A K

    2011-10-01

    To compare and evaluate ranking, regression and combined machine learning approaches for the analysis of survival data. The literature describes two approaches based on support vector machines to deal with censored observations. In the first approach the key idea is to rephrase the task as a ranking problem via the concordance index, a problem which can be solved efficiently in a context of structural risk minimization and convex optimization techniques. In a second approach, one uses a regression approach, dealing with censoring by means of inequality constraints. The goal of this paper is then twofold: (i) introducing a new model combining the ranking and regression strategy, which retains the link with existing survival models such as the proportional hazards model via transformation models; and (ii) comparison of the three techniques on 6 clinical and 3 high-dimensional datasets and discussing the relevance of these techniques over classical approaches for survival data. We compare SVM-based survival models based on ranking constraints, models based on regression constraints, and models based on both ranking and regression constraints. The performance of the models is compared by means of three different measures: (i) the concordance index, measuring the model's discriminating ability; (ii) the logrank test statistic, indicating whether patients with a prognostic index lower than the median prognostic index have a significantly different survival than patients with a prognostic index higher than the median; and (iii) the hazard ratio after normalization to restrict the prognostic index between 0 and 1. Our results indicate a significantly better performance for models including regression constraints above models only based on ranking constraints. This work gives empirical evidence that SVM-based models using regression constraints perform significantly better than SVM-based models based on ranking constraints. Our experiments show a comparable performance for methods including only regression or both regression and ranking constraints on clinical data. On high dimensional data, the former model performs better. However, this approach does not have a theoretical link with standard statistical models for survival data. This link can be made by means of transformation models when ranking constraints are included. Copyright © 2011 Elsevier B.V. All rights reserved.

  14. Risk prediction for myocardial infarction via generalized functional regression models.

    PubMed

    Ieva, Francesca; Paganoni, Anna M

    2016-08-01

    In this paper, we propose a generalized functional linear regression model for a binary outcome indicating the presence/absence of a cardiac disease with multivariate functional data among the relevant predictors. In particular, the motivating aim is the analysis of electrocardiographic traces of patients whose pre-hospital electrocardiogram (ECG) has been sent to the 118 Dispatch Center of Milan (the Italian toll-free number for emergencies) by life support personnel of the basic rescue units. The statistical analysis starts with a preprocessing of ECGs treated as multivariate functional data. The signals are reconstructed from noisy observations. The biological variability is then removed by a nonlinear registration procedure based on landmarks. Thus, in order to perform a data-driven dimensional reduction, a multivariate functional principal component analysis is carried out on the variance-covariance matrix of the reconstructed and registered ECGs and their first derivatives. We use the scores of the principal component decomposition as covariates in a generalized linear model to predict the presence of the disease in a new patient. Hence, a new semi-automatic diagnostic procedure is proposed to estimate the risk of infarction (in the case of interest, the probability of being affected by Left Bundle Branch Block). The performance of this classification method is evaluated and compared with other methods proposed in the literature. Finally, the robustness of the procedure is checked via leave-j-out techniques. © The Author(s) 2013.
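    A schematic analogue of the pipeline above, reducing functional data to principal component scores and feeding the scores to a generalized linear model; simulated curves stand in for the registered ECG traces, and sklearn's ordinary PCA stands in for the multivariate functional PCA.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(3)
# Stand-in for registered ECG traces: 300 curves sampled at 200 time points
t = np.linspace(0, 1, 200)
labels = rng.integers(0, 2, 300)
curves = (np.sin(2 * np.pi * t) + 0.5 * labels[:, None] * np.sin(6 * np.pi * t)
          + 0.3 * rng.normal(size=(300, 200)))

# PCA reduces each curve to a few scores; a logistic GLM on the scores
# predicts disease status, mirroring the paper's pipeline
model = make_pipeline(PCA(n_components=5), LogisticRegression())
model.fit(curves, labels)
print(model.score(curves, labels))  # in-sample accuracy of the classifier
```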

  15. Correlation and simple linear regression.

    PubMed

    Zou, Kelly H; Tuncali, Kemal; Silverman, Stuart G

    2003-06-01

    In this tutorial article, the concepts of correlation and regression are reviewed and demonstrated. The authors review and compare two correlation coefficients, the Pearson correlation coefficient and the Spearman rho, for measuring linear and nonlinear relationships between two continuous variables. In the case of measuring the linear relationship between a predictor and an outcome variable, simple linear regression analysis is conducted. These statistical concepts are illustrated by using a data set from published literature to assess a computed tomography-guided interventional technique. These statistical methods are important for exploring the relationships between variables and can be applied to many radiologic studies.
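    The tutorial's three core quantities (Pearson r, Spearman rho, and the simple linear regression fit) are one-liners in scipy; the data below are simulated, not the article's CT-guidance data set.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
x = rng.uniform(0, 10, 40)
y = 2.0 * x + 1.0 + rng.normal(0, 2, 40)  # roughly linear relation

r, _ = stats.pearsonr(x, y)     # linear association
rho, _ = stats.spearmanr(x, y)  # monotone (rank-based) association
fit = stats.linregress(x, y)    # simple linear regression

print(f"Pearson r = {r:.2f}, Spearman rho = {rho:.2f}")
print(f"y = {fit.slope:.2f}x + {fit.intercept:.2f} (R^2 = {fit.rvalue**2:.2f})")
```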

  16. Exploration of graphene oxide as an intelligent platform for cancer vaccines

    NASA Astrophysics Data System (ADS)

    Yue, Hua; Wei, Wei; Gu, Zonglin; Ni, Dezhi; Luo, Nana; Yang, Zaixing; Zhao, Lin; Garate, Jose Antonio; Zhou, Ruhong; Su, Zhiguo; Ma, Guanghui

    2015-11-01

    We explored an intelligent vaccine system via facile approaches using both experimental and theoretical techniques based on the two-dimensional graphene oxide (GO). Without extra addition of bio/chemical stimulators, the microsized GO imparted various immune activation tactics to improve the antigen immunogenicity. A high antigen adsorption was acquired, and the mechanism was revealed to be a combination of electrostatic, hydrophobic, and π-π stacking interactions. The "folding GO" acted as a cytokine self-producer and antigen reservoir and showed a particular autophagy, which efficiently promoted the activation of antigen presenting cells (APCs) and subsequent antigen cross-presentation. Such a "One but All" modality thus induced a high level of anti-tumor responses in a programmable way and resulted in efficient tumor regression in vivo. This work may shed light on the potential use of a new dimensional nano-platform in the development of high-performance cancer vaccines. Electronic supplementary information (ESI) available. See DOI: 10.1039/c5nr04986e

  17. Three-Dimensional Host Bone Coverage in Total Hip Arthroplasty for Crowe Types II and III Developmental Dysplasia of the Hip.

    PubMed

    Xu, Jiawei; Qu, Xinhua; Li, Huiwu; Mao, Yuanqing; Yu, Degang; Zhu, Zhenan

    2017-04-01

    Recommendations for minimum cup coverage based on anteroposterior radiographs are widely used as an intraoperative guide in total hip arthroplasty for patients with developmental dysplasia of the hip. The purpose of this study was to examine the validity of two-dimensional (2D) measurement of coverage against three-dimensional (3D) coverage and to identify parameters for determining the 3D coverage during surgery. We developed a technique to accurately reproduce the intraoperative anatomic geometry of the dysplastic acetabulum and measure the 3D cup coverage postoperatively. With this technique, we retrospectively analyzed the difference and correlation between 2D and 3D measurements of native bone coverage in 35 patients (45 hips) with Crowe II or III DDH. Linear regression analysis was performed to examine the intraoperative parameters related to coverage. The mean follow-up period was 7.64 years (range, 6.1-9.5 years). There was a significant difference and a fair correlation between 2D and 3D measurements. The 2D measurement underestimated the 3D cup coverage by approximately 13%. An excellent linear relationship was noted between the 3D coverage/uncoverage and the height of the uncovered portion (R² = 0.8440, P < .0001). There was no case of loosening or revision during the follow-up. Current minimum cup coverage recommendations based on 2D radiograph measurements should not be used as a direct intraoperative guide. The height of the uncovered portion is a useful parameter to determine the 3D coverage during surgery. Copyright © 2016 Elsevier Inc. All rights reserved.

  18. The high performance parallel algorithm for Unified Gas-Kinetic Scheme

    NASA Astrophysics Data System (ADS)

    Li, Shiyi; Li, Qibing; Fu, Song; Xu, Jinxiu

    2016-11-01

    A high performance parallel algorithm for UGKS is developed to simulate internal and external three-dimensional flows on arbitrary grid systems. The physical domain and the velocity domain are divided into different blocks and distributed according to a two-dimensional Cartesian topology, with intra-communicators in the physical domain for data exchange and other intra-communicators in the velocity domain for sum reduction of the moment integrals. Numerical results for three-dimensional cavity flow and flow past a sphere agree well with existing studies and validate the applicability of the algorithm. The scalability of the algorithm is tested on both small (1-16) and large (729-5832) processor counts. The tested speed-up ratio is near linear and thus the efficiency is around 1, which reveals the good scalability of the present algorithm.
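    A minimal sketch of the two-level decomposition described above, assuming the mpi4py API (this is not the authors' code): a 2-D Cartesian topology distributes blocks, and a sub-communicator sum-reduces partial moment integrals across velocity-space blocks.

```python
# Run with e.g.: mpirun -n 4 python ugks_topology_sketch.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
dims = MPI.Compute_dims(comm.Get_size(), 2)  # e.g. 2 x 2 for 4 ranks
cart = comm.Create_cart(dims, periods=[False, False])

# Neighbouring ranks, as would be needed for halo exchange of
# physical-space block boundaries (exchange itself omitted here)
left, right = cart.Shift(0, 1)
down, up = cart.Shift(1, 1)

# Sub-communicator along one axis: ranks holding the same physical block
# but different velocity-space blocks sum-reduce partial moment integrals
vel_comm = cart.Sub([False, True])
partial_moment = 1.0  # placeholder for a locally integrated moment
moment = vel_comm.allreduce(partial_moment, op=MPI.SUM)
```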

  19. High Dimensional Classification Using Features Annealed Independence Rules.

    PubMed

    Fan, Jianqing; Fan, Yingying

    2008-01-01

    Classification using high-dimensional features arises frequently in many contemporary statistical studies such as tumor classification using microarray or other high-throughput data. The impact of dimensionality on classification is largely poorly understood. In a seminal paper, Bickel and Levina (2004) show that the Fisher discriminant performs poorly due to diverging spectra, and they propose the independence rule to overcome the problem. We first demonstrate that even for the independence classification rule, classification using all the features can be as bad as random guessing due to noise accumulation in estimating population centroids in high-dimensional feature space. In fact, we demonstrate further that almost all linear discriminants can perform as badly as random guessing. Thus, it is paramount to select a subset of important features for high-dimensional classification, resulting in Features Annealed Independence Rules (FAIR). The conditions under which all the important features can be selected by the two-sample t-statistic are established. The choice of the optimal number of features, or equivalently, the threshold value of the test statistics, is proposed based on an upper bound of the classification error. Simulation studies and real data analysis support our theoretical results and demonstrate convincingly the advantage of our new classification procedure.
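    A toy version of the FAIR recipe (rank features by the two-sample t-statistic, keep the top few, classify with a diagonal-covariance independence rule), on simulated data where only 10 of 2000 features carry signal; everything below is invented.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, p, p_useful = 100, 2000, 10
y = rng.integers(0, 2, n)
X = rng.normal(size=(n, p))
X[y == 1, :p_useful] += 1.0  # only a few features separate the classes

# Rank features by the two-sample t-statistic, keep the top m
tstat, _ = stats.ttest_ind(X[y == 1], X[y == 0])
keep = np.argsort(-np.abs(tstat))[:30]

# Independence rule on the selected features: variance-scaled
# distance to each class centroid, ignoring correlations
mu1, mu0 = X[y == 1][:, keep].mean(0), X[y == 0][:, keep].mean(0)
var = X[:, keep].var(0)

def classify(x):
    return int(np.sum((x - mu0) ** 2 / var) > np.sum((x - mu1) ** 2 / var))

print(classify(X[0, keep]), y[0])
```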

  20. Exploring nonlinear feature space dimension reduction and data representation in breast Cadx with Laplacian eigenmaps and t-SNE.

    PubMed

    Jamieson, Andrew R; Giger, Maryellen L; Drukker, Karen; Li, Hui; Yuan, Yading; Bhooshan, Neha

    2010-01-01

    In this preliminary study, recently developed unsupervised nonlinear dimension reduction (DR) and data representation techniques were applied to computer-extracted breast lesion feature spaces across three separate imaging modalities: Ultrasound (U.S.) with 1126 cases, dynamic contrast enhanced magnetic resonance imaging with 356 cases, and full-field digital mammography with 245 cases. Two methods for nonlinear DR were explored: Laplacian eigenmaps [M. Belkin and P. Niyogi, "Laplacian eigenmaps for dimensionality reduction and data representation," Neural Comput. 15, 1373-1396 (2003)] and t-distributed stochastic neighbor embedding (t-SNE) [L. van der Maaten and G. Hinton, "Visualizing data using t-SNE," J. Mach. Learn. Res. 9, 2579-2605 (2008)]. These methods attempt to map originally high dimensional feature spaces to more human interpretable lower dimensional spaces while preserving both local and global information. The properties of these methods as applied to breast computer-aided diagnosis (CADx) were evaluated in the context of malignancy classification performance as well as in the visual inspection of the sparseness within the two-dimensional and three-dimensional mappings. Classification performance was estimated by using the reduced dimension mapped feature output as input into both linear and nonlinear classifiers: Markov chain Monte Carlo based Bayesian artificial neural network (MCMC-BANN) and linear discriminant analysis. The new techniques were compared to previously developed breast CADx methodologies, including automatic relevance determination and linear stepwise (LSW) feature selection, as well as a linear DR method based on principal component analysis. Using ROC analysis and 0.632+ bootstrap validation, 95% empirical confidence intervals were computed for each classifier's AUC performance. In the large U.S. data set, sample high performance results include AUC0.632+ = 0.88 with 95% empirical bootstrap interval [0.787;0.895] for 13 ARD selected features and AUC0.632+ = 0.87 with interval [0.817;0.906] for four LSW selected features, compared to a 4D t-SNE mapping (from the original 81D feature space) giving AUC0.632+ = 0.90 with interval [0.847;0.919], all using the MCMC-BANN. Preliminary results appear to indicate capability for the new methods to match or exceed classification performance of current advanced breast lesion CADx algorithms. While not appropriate as a complete replacement of feature selection in CADx problems, DR techniques offer a complementary approach, which can aid elucidation of additional properties associated with the data. Specifically, the new techniques were shown to possess the added benefit of delivering sparse lower dimensional representations for visual interpretation, revealing intricate data structure of the feature space.
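    A brief sketch of the two nonlinear DR methods on a simulated stand-in for the 81-dimensional feature space; scikit-learn's SpectralEmbedding implements Laplacian eigenmaps, and the downstream classification step is omitted.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.manifold import SpectralEmbedding, TSNE

# Simulated stand-in for an 81-D CADx feature space
X, y = make_classification(n_samples=500, n_features=81, n_informative=10,
                           random_state=0)

le = SpectralEmbedding(n_components=2, n_neighbors=15)  # Laplacian eigenmaps
X_le = le.fit_transform(X)

tsne = TSNE(n_components=2, perplexity=30, random_state=0)
X_tsne = tsne.fit_transform(X)

print(X_le.shape, X_tsne.shape)  # 2-D coordinates for visual inspection
# As in the study, the mapped coordinates could then feed a classifier.
```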

  1. Advanced Mathematical Tools in Metrology III

    NASA Astrophysics Data System (ADS)

    Ciarlini, P.

    The Table of Contents for the book is as follows:
    * Foreword
    * Invited Papers
    * The ISO Guide to the Expression of Uncertainty in Measurement: A Bridge between Statistics and Metrology
    * Bootstrap Algorithms and Applications
    * The TTRSs: 13 Oriented Constraints for Dimensioning, Tolerancing & Inspection
    * Graded Reference Data Sets and Performance Profiles for Testing Software Used in Metrology
    * Uncertainty in Chemical Measurement
    * Mathematical Methods for Data Analysis in Medical Applications
    * High-Dimensional Empirical Linear Prediction
    * Wavelet Methods in Signal Processing
    * Software Problems in Calibration Services: A Case Study
    * Robust Alternatives to Least Squares
    * Gaining Information from Biomagnetic Measurements
    * Full Papers
    * Increase of Information in the Course of Measurement
    * A Framework for Model Validation and Software Testing in Regression
    * Certification of Algorithms for Determination of Signal Extreme Values during Measurement
    * A Method for Evaluating Trends in Ozone-Concentration Data and Its Application to Data from the UK Rural Ozone Monitoring Network
    * Identification of Signal Components by Stochastic Modelling in Measurements of Evoked Magnetic Fields from Peripheral Nerves
    * High Precision 3D-Calibration of Cylindrical Standards
    * Magnetic Dipole Estimations for MCG-Data
    * Transfer Functions of Discrete Spline Filters
    * An Approximation Method for the Linearization of Tridimensional Metrology Problems
    * Regularization Algorithms for Image Reconstruction from Projections
    * Quality of Experimental Data in Hydrodynamic Research
    * Stochastic Drift Models for the Determination of Calibration Intervals
    * Short Communications
    * Projection Method for Lidar Measurement
    * Photon Flux Measurements by Regularised Solution of Integral Equations
    * Correct Solutions of Fit Problems in Different Experimental Situations
    * An Algorithm for the Nonlinear TLS Problem in Polynomial Fitting
    * Designing Axially Symmetric Electromechanical Systems of Superconducting Magnetic Levitation in Matlab Environment
    * Data Flow Evaluation in Metrology
    * A Generalized Data Model for Integrating Clinical Data and Biosignal Records of Patients
    * Assessment of Three-Dimensional Structures in Clinical Dentistry
    * Maximum Entropy and Bayesian Approaches to Parameter Estimation in Mass Metrology
    * Amplitude and Phase Determination of Sinusoidal Vibration in the Nanometer Range using Quadrature Signals
    * A Class of Symmetric Compactly Supported Wavelets and Associated Dual Bases
    * Analysis of Surface Topography by Maximum Entropy Power Spectrum Estimation
    * Influence of Different Kinds of Errors on Imaging Results in Optical Tomography
    * Application of the Laser Interferometry for Automatic Calibration of Height Setting Micrometer
    * Author Index

  2. Improving validation methods for molecular diagnostics: application of Bland-Altman, Deming and simple linear regression analyses in assay comparison and evaluation for next-generation sequencing.

    PubMed

    Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L

    2018-02-01

    A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R²), using R² as the primary metric of assay agreement. However, the use of R² alone does not adequately quantify the constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing (NGS) assays. NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. The Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of the performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
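
    A minimal sketch of both analyses, assuming paired measurements from a reference assay x and the assay under validation y; the Deming slope uses the standard closed form with an assumed error-variance ratio lam.

```python
import numpy as np

def bland_altman(x, y):
    """Mean difference (constant error) and 95% limits of agreement."""
    diff = y - x
    bias = diff.mean()
    loa = 1.96 * diff.std(ddof=1)
    return bias, (bias - loa, bias + loa)

def deming(x, y, lam=1.0):
    """Deming regression slope/intercept; lam = ratio of error variances."""
    mx, my = x.mean(), y.mean()
    sxx = ((x - mx) ** 2).sum()
    syy = ((y - my) ** 2).sum()
    sxy = ((x - mx) * (y - my)).sum()
    slope = (syy - lam * sxx + np.sqrt((syy - lam * sxx) ** 2
             + 4 * lam * sxy ** 2)) / (2 * sxy)
    return slope, my - slope * mx

rng = np.random.default_rng(0)
x = rng.uniform(0, 100, 60)                  # reference assay values
y = 1.05 * x + 2 + rng.normal(0, 3, 60)      # proportional + constant error
print(bland_altman(x, y), deming(x, y))
```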

  3. Private traits and attributes are predictable from digital records of human behavior

    PubMed Central

    Kosinski, Michal; Stillwell, David; Graepel, Thore

    2013-01-01

    We show that easily accessible digital records of behavior, Facebook Likes, can be used to automatically and accurately predict a range of highly sensitive personal attributes including: sexual orientation, ethnicity, religious and political views, personality traits, intelligence, happiness, use of addictive substances, parental separation, age, and gender. The analysis presented is based on a dataset of over 58,000 volunteers who provided their Facebook Likes, detailed demographic profiles, and the results of several psychometric tests. The proposed model uses dimensionality reduction for preprocessing the Likes data, which are then entered into logistic/linear regression to predict individual psychodemographic profiles from Likes. The model correctly discriminates between homosexual and heterosexual men in 88% of cases, African Americans and Caucasian Americans in 95% of cases, and between Democrat and Republican in 85% of cases. For the personality trait “Openness,” prediction accuracy is close to the test–retest accuracy of a standard personality test. We give examples of associations between attributes and Likes and discuss implications for online personalization and privacy. PMID:23479631
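
    A hedged sketch of the pipeline described above (dimensionality reduction of the Likes matrix via SVD, then logistic regression); the sparse matrix, trait labels, and component count are synthetic placeholders, not the study's data.

```python
import numpy as np
from scipy.sparse import random as sparse_random
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
likes = sparse_random(1000, 5000, density=0.01, format="csr", random_state=0)
likes.data[:] = 1.0                        # binary user-by-Like indicators
trait = rng.integers(0, 2, 1000)           # placeholder binary trait

model = make_pipeline(TruncatedSVD(n_components=100, random_state=0),
                      LogisticRegression(max_iter=1000))
model.fit(likes, trait)
print(model.predict_proba(likes[:5])[:, 1])   # predicted trait probabilities
```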

  4. U.S. Army Armament Research, Development and Engineering Center Grain Evaluation Software to Numerically Predict Linear Burn Regression for Solid Propellant Grain Geometries

    DTIC Science & Technology

    2017-10-01

    The views, opinions, and/or findings contained in this report are those of the author(s) and should not be construed as an official Department of the Army position, policy, or decision, unless so designated by other documentation.

  5. Coupled dimensionality reduction and classification for supervised and semi-supervised multilabel learning

    PubMed Central

    Gönen, Mehmet

    2014-01-01

    Coupled training of dimensionality reduction and classification has previously been proposed to improve the prediction performance for single-label problems. Following this line of research, in this paper, we first introduce a novel Bayesian method that combines linear dimensionality reduction with linear binary classification for supervised multilabel learning and present a deterministic variational approximation algorithm to learn the proposed probabilistic model. We then extend the proposed method to find the intrinsic dimensionality of the projected subspace using automatic relevance determination and to handle semi-supervised learning using a low-density assumption. We perform supervised learning experiments on four benchmark multilabel learning data sets by comparing our method with baseline linear dimensionality reduction algorithms. These experiments show that the proposed approach achieves good performance values in terms of Hamming loss, average AUC, macro F1, and micro F1 on held-out test data. The low-dimensional embeddings obtained by our method are also very useful for exploratory data analysis. We also show the effectiveness of our approach in finding intrinsic subspace dimensionality and in semi-supervised learning tasks. PMID:24532862

  6. Coupled dimensionality reduction and classification for supervised and semi-supervised multilabel learning.

    PubMed

    Gönen, Mehmet

    2014-03-01

    Coupled training of dimensionality reduction and classification has previously been proposed to improve the prediction performance for single-label problems. Following this line of research, in this paper, we first introduce a novel Bayesian method that combines linear dimensionality reduction with linear binary classification for supervised multilabel learning and present a deterministic variational approximation algorithm to learn the proposed probabilistic model. We then extend the proposed method to find the intrinsic dimensionality of the projected subspace using automatic relevance determination and to handle semi-supervised learning using a low-density assumption. We perform supervised learning experiments on four benchmark multilabel learning data sets by comparing our method with baseline linear dimensionality reduction algorithms. These experiments show that the proposed approach achieves good performance values in terms of Hamming loss, average AUC, macro F1, and micro F1 on held-out test data. The low-dimensional embeddings obtained by our method are also very useful for exploratory data analysis. We also show the effectiveness of our approach in finding intrinsic subspace dimensionality and in semi-supervised learning tasks.

  7. Linear regression in astronomy. II

    NASA Technical Reports Server (NTRS)

    Feigelson, Eric D.; Babu, Gutti J.

    1992-01-01

    A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.
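
    For class (1) above, a minimal sketch of an unweighted regression line with bootstrap resampling of the (x, y) pairs to estimate the slope uncertainty; the data are synthetic stand-ins for, e.g., a distance-scale relation.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(0, 1.5, 50)

boot_slopes = []
for _ in range(2000):
    idx = rng.integers(0, len(x), len(x))     # resample pairs with replacement
    boot_slopes.append(np.polyfit(x[idx], y[idx], 1)[0])

slope, intercept = np.polyfit(x, y, 1)
print(f"slope = {slope:.3f} +/- {np.std(boot_slopes):.3f} (bootstrap)")
```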

  8. The Accuracy of Shock Capturing in Two Spatial Dimensions

    NASA Technical Reports Server (NTRS)

    Carpenter, Mark H.; Casper, Jay H.

    1997-01-01

    An assessment of the accuracy of shock capturing schemes is made for two-dimensional steady flow around a cylindrical projectile. Both a linear fourth-order method and a nonlinear third-order method are used in this study. It is shown, contrary to conventional wisdom, that captured two-dimensional shocks are asymptotically first-order, regardless of the design accuracy of the numerical method. The practical implications of this finding are discussed in the context of the efficacy of high-order numerical methods for discontinuous flows.

  9. Classification of molecular structure images by using ANN, RF, LBP, HOG, and size reduction methods for early stomach cancer detection

    NASA Astrophysics Data System (ADS)

    Aytaç Korkmaz, Sevcan; Binol, Hamidullah

    2018-03-01

    Deaths from stomach cancer still occur, and early diagnosis is crucial in reducing the mortality rate of cancer patients. Therefore, computer-aided methods for early detection are developed in this article. Stomach cancer images were obtained from the Fırat University Medical Faculty Pathology Department. The Local Binary Patterns (LBP) and Histogram of Oriented Gradients (HOG) features of these images are calculated. At the same time, Sammon mapping, Stochastic Neighbor Embedding (SNE), Isomap, classical multidimensional scaling (MDS), Local Linear Embedding (LLE), Linear Discriminant Analysis (LDA), t-Distributed Stochastic Neighbor Embedding (t-SNE), and Laplacian eigenmaps methods are used for dimensionality reduction of the features. The high dimension of these features is reduced to lower dimensions using these dimension reduction methods. Artificial neural network (ANN) and Random Forest (RF) classifiers were then used to classify the stomach cancer images with these new, lower-dimensional feature sets. New medical systems were developed to measure the effects of these dimensions by obtaining features at different dimensionalities with the dimension reduction methods. When all the developed methods are compared, the best accuracy results are obtained with the LBP_MDS_ANN and LBP_LLE_ANN methods.

  10. Protein linear indices of the 'macromolecular pseudograph alpha-carbon atom adjacency matrix' in bioinformatics. Part 1: prediction of protein stability effects of a complete set of alanine substitutions in Arc repressor.

    PubMed

    Marrero-Ponce, Yovani; Medina-Marrero, Ricardo; Castillo-Garit, Juan A; Romero-Zaldivar, Vicente; Torrens, Francisco; Castro, Eduardo A

    2005-04-15

    A novel approach to bio-macromolecular design from a linear algebra point of view is introduced. A protein's total (whole-protein) and local (one or more amino acids) linear indices are a new set of bio-macromolecular descriptors of relevance to protein QSAR/QSPR studies. These amino-acid-level biochemical descriptors are based on the calculation of linear maps f_k(x_mi): R^n -> R^n in the canonical basis. These bio-macromolecular indices are calculated from the kth power of the macromolecular pseudograph alpha-carbon atom adjacency matrix. Total linear indices are linear functionals on R^n; that is, the kth total linear indices are linear maps f_k(x_m): R^n -> R from R^n to the scalar field R. Thus, the kth total linear indices are calculated by summing the amino-acid linear indices of all amino acids in the protein molecule. A study of the protein stability effects for a complete set of alanine substitutions in the Arc repressor illustrates this approach. A quantitative model that discriminates near wild-type-stability alanine mutants from reduced-stability ones in a training series was obtained. This model permitted the correct classification of 97.56% (40/41) and 91.67% (11/12) of proteins in the training and test sets, respectively. It shows a high Matthews correlation coefficient (MCC = 0.952) for the training set and an MCC = 0.837 for the external prediction set. Additionally, canonical regression analysis corroborated the statistical quality of the classification model (R_canc = 0.824). This analysis was also used to compute biological stability canonical scores for each Arc alanine mutant. On the other hand, the linear piecewise regression model compared favorably with the linear regression one in predicting the melting temperature (tm) of the Arc alanine mutants. The linear model explains almost 81% of the variance of the experimental tm (R = 0.90 and s = 4.29), and the LOO PRESS statistics evidenced its predictive ability (q² = 0.72 and s_cv = 4.79). Moreover, the TOMOCOMD-CAMPS method produced a linear piecewise regression (R = 0.97) between protein backbone descriptors and tm values for alanine mutants of the Arc repressor. A break-point value of 51.87 °C characterized two mutant clusters and coincided perfectly with the experimental scale. For this reason, we can use the linear discriminant analysis and piecewise models in combination to classify and predict the stability of the mutant Arc homodimers. These models also permitted the interpretation of the driving forces of such a folding process, indicating that topologic/topographic protein backbone interactions control the stability profile of wild-type Arc and its alanine mutants.

  11. Distributed Computation of the knn Graph for Large High-Dimensional Point Sets

    PubMed Central

    Plaku, Erion; Kavraki, Lydia E.

    2009-01-01

    High-dimensional problems arising from robot motion planning, biology, data mining, and geographic information systems often require the computation of k nearest neighbor (knn) graphs. The knn graph of a data set is obtained by connecting each point to its k closest points. As the research in the above-mentioned fields progressively addresses problems of unprecedented complexity, the demand for computing knn graphs based on arbitrary distance metrics and large high-dimensional data sets increases, exceeding resources available to a single machine. In this work we efficiently distribute the computation of knn graphs for clusters of processors with message passing. Extensions to our distributed framework include the computation of graphs based on other proximity queries, such as approximate knn or range queries. Our experiments show nearly linear speedup with over one hundred processors and indicate that similar speedup can be obtained with several hundred processors. PMID:19847318
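
    A single-machine sketch of knn-graph construction; the chunk loop merely stands in for the paper's distribution across message-passing processors, and k, the metric, and the data are placeholders.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

X = np.random.rand(10000, 64)                    # high-dimensional point set
k = 5
nn = NearestNeighbors(n_neighbors=k + 1).fit(X)  # +1: each point finds itself

graph = {}
for chunk in np.array_split(np.arange(len(X)), 8):  # stand-in for 8 workers
    _, ind = nn.kneighbors(X[chunk])
    for i, row in zip(chunk, ind):
        graph[int(i)] = [int(j) for j in row if j != i][:k]
```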

  12. A Constrained Linear Estimator for Multiple Regression

    ERIC Educational Resources Information Center

    Davis-Stober, Clintin P.; Dana, Jason; Budescu, David V.

    2010-01-01

    "Improper linear models" (see Dawes, Am. Psychol. 34:571-582, "1979"), such as equal weighting, have garnered interest as alternatives to standard regression models. We analyze the general circumstances under which these models perform well by recasting a class of "improper" linear models as "proper" statistical models with a single predictor. We…

  13. Linear or linearizable first-order delay ordinary differential equations and their Lie point symmetries

    NASA Astrophysics Data System (ADS)

    Dorodnitsyn, Vladimir A.; Kozlov, Roman; Meleshko, Sergey V.; Winternitz, Pavel

    2018-05-01

    A recent article was devoted to an analysis of the symmetry properties of a class of first-order delay ordinary differential systems (DODSs). Here we concentrate on linear DODSs, which have infinite-dimensional Lie point symmetry groups due to the linear superposition principle. Their symmetry algebra always contains a two-dimensional subalgebra realized by linearly connected vector fields. We identify all classes of linear first-order DODSs that have additional symmetries, not due to linearity alone, and we present representatives of each class. These additional symmetries are then used to construct exact analytical particular solutions using symmetry reduction.

  14. On the design of classifiers for crop inventories

    NASA Technical Reports Server (NTRS)

    Heydorn, R. P.; Takacs, H. C.

    1986-01-01

    Crop proportion estimators that use classifications of satellite data to correct, in an additive way, a given estimate acquired from ground observations are discussed. A linear version of these estimators is optimal, in terms of minimum variance, when the regression of the ground observations onto the satellite observations is linear. When this regression is not linear, but the reverse regression (satellite observations onto ground observations) is linear, the estimator is suboptimal but still has certain appealing variance properties. In this paper, expressions are derived for those regressions that relate the intercepts and slopes to conditional classification probabilities. These expressions are then used to discuss the question of classifier designs that can lead to low-variance crop proportion estimates. Variance expressions for these estimates in terms of classifier omission and commission errors are also derived.

  15. Formation of large-scale structures with sharp density gradient through Rayleigh-Taylor growth in a two-dimensional slab under the two-fluid and finite Larmor radius effects

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Goto, R.; Hatori, T.; Miura, H., E-mail: miura.hideaki@nifs.ac.jp

    The two-fluid and finite Larmor radius effects on the linear and nonlinear growth of the Rayleigh-Taylor instability in a two-dimensional slab are studied numerically, with special attention to high-wave-number dynamics and nonlinear structure formation at a low β-value. The two effects stabilize the unstable high-wave-number modes for a certain range of the β-value. In nonlinear simulations, the absence of the high-wave-number modes in the linear stage leads to the formation of a density field structure much larger than that in the single-fluid magnetohydrodynamic simulation, together with a sharp density gradient as well as a large velocity difference. The formation of the sharp velocity difference leads to a subsequent Kelvin-Helmholtz-type instability only when both the two-fluid and finite Larmor radius terms are incorporated, whereas it is not observed otherwise. It is shown that the emergence of the secondary instability can modify the outline of the turbulent structures associated with the primary Rayleigh-Taylor instability.

  16. Error modeling for surrogates of dynamical systems using machine learning: Machine-learning-based error model for surrogates of dynamical systems

    DOE PAGES

    Trehan, Sumeet; Carlberg, Kevin T.; Durlofsky, Louis J.

    2017-07-14

    A machine learning–based framework for modeling the error introduced by surrogate models of parameterized dynamical systems is proposed. The framework entails the use of high-dimensional regression techniques (e.g., random forests and LASSO) to map a large set of inexpensively computed "error indicators" (i.e., features) produced by the surrogate model at a given time instance to a prediction of the surrogate-model error in a quantity of interest (QoI). This eliminates the need for the user to hand-select a small number of informative features. The methodology requires a training set of parameter instances at which the time-dependent surrogate-model error is computed by simulating both the high-fidelity and surrogate models. Using these training data, the method first determines regression-model locality (via classification or clustering) and subsequently constructs a "local" regression model to predict the time-instantaneous error within each identified region of feature space. We consider two uses for the resulting error model: (1) as a correction to the surrogate-model QoI prediction at each time instance and (2) as a way to statistically model arbitrary functions of the time-dependent surrogate-model error (e.g., time-integrated errors). We then apply the proposed framework to model errors in reduced-order models of nonlinear oil-water subsurface flow simulations, with time-varying well-control (bottom-hole pressure) parameters. The reduced-order models used in this work entail application of trajectory piecewise linearization in conjunction with proper orthogonal decomposition. Moreover, when the first use of the method is considered, numerical experiments demonstrate consistent improvement in accuracy in the time-instantaneous QoI prediction relative to the original surrogate model, across a large number of test cases. When the second use is considered, results show that the proposed method provides accurate statistical predictions of the time- and well-averaged errors.
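
    A hedged sketch of the regression step only, assuming synthetic error indicators Phi and surrogate errors e in place of quantities computed from paired high-fidelity/surrogate runs; the locality step (classification or clustering before local regression) is omitted.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
Phi = rng.normal(size=(500, 40))                   # error-indicator features
e = Phi[:, 0] ** 2 + 0.1 * rng.normal(size=500)    # synthetic surrogate error

Phi_tr, Phi_te, e_tr, e_te = train_test_split(Phi, e, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(Phi_tr, e_tr)

# Use 1: additive correction of the surrogate QoI via the predicted error.
residual = e_te - model.predict(Phi_te)
print("error std before/after correction:", e_te.std(), residual.std())
```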

  17. Error modeling for surrogates of dynamical systems using machine learning: Machine-learning-based error model for surrogates of dynamical systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Trehan, Sumeet; Carlberg, Kevin T.; Durlofsky, Louis J.

    A machine learning–based framework for modeling the error introduced by surrogate models of parameterized dynamical systems is proposed. The framework entails the use of high-dimensional regression techniques (eg, random forests, and LASSO) to map a large set of inexpensively computed “error indicators” (ie, features) produced by the surrogate model at a given time instance to a prediction of the surrogate-model error in a quantity of interest (QoI). This eliminates the need for the user to hand-select a small number of informative features. The methodology requires a training set of parameter instances at which the time-dependent surrogate-model error is computed bymore » simulating both the high-fidelity and surrogate models. Using these training data, the method first determines regression-model locality (via classification or clustering) and subsequently constructs a “local” regression model to predict the time-instantaneous error within each identified region of feature space. We consider 2 uses for the resulting error model: (1) as a correction to the surrogate-model QoI prediction at each time instance and (2) as a way to statistically model arbitrary functions of the time-dependent surrogate-model error (eg, time-integrated errors). We then apply the proposed framework to model errors in reduced-order models of nonlinear oil-water subsurface flow simulations, with time-varying well-control (bottom-hole pressure) parameters. The reduced-order models used in this work entail application of trajectory piecewise linearization in conjunction with proper orthogonal decomposition. Moreover, when the first use of the method is considered, numerical experiments demonstrate consistent improvement in accuracy in the time-instantaneous QoI prediction relative to the original surrogate model, across a large number of test cases. When the second use is considered, results show that the proposed method provides accurate statistical predictions of the time- and well-averaged errors.« less

  18. Spatial correspondence of 4D CT ventilation and SPECT pulmonary perfusion defects in patients with malignant airway stenosis

    NASA Astrophysics Data System (ADS)

    Castillo, Richard; Castillo, Edward; McCurdy, Matthew; Gomez, Daniel R.; Block, Alec M.; Bergsma, Derek; Joy, Sarah; Guerrero, Thomas

    2012-04-01

    To determine the spatial overlap agreement between four-dimensional computed tomography (4D CT) ventilation and single photon emission computed tomography (SPECT) perfusion hypo-functioning pulmonary defect regions in a patient population with malignant airway stenosis. Treatment-planning 4D CT images were obtained retrospectively for ten lung cancer patients with radiographically demonstrated airway obstruction due to gross tumor volume. Each patient also received a SPECT perfusion study within one week of the planning 4D CT, and prior to the initiation of treatment. Deformable image registration was used to map corresponding lung tissue elements between the extreme component phase images, from which quantitative three-dimensional (3D) images representing the local pulmonary specific ventilation were constructed. Semi-automated segmentation of the percentile perfusion distribution was performed to identify regional defects distal to the known obstructing lesion. Semi-automated segmentation was similarly performed by multiple observers to delineate corresponding defect regions depicted on 4D CT ventilation. Normalized Dice similarity coefficient (NDSC) indices were determined for each observer between SPECT perfusion and 4D CT ventilation defect regions to assess spatial overlap agreement. Tidal volumes determined from 4D CT ventilation were evaluated against measurements obtained from lung parenchyma segmentation; linear regression resulted in a linear fit with slope = 1.01 (R² = 0.99). Respective values for the average DSC, NDSC(1 mm), and NDSC(2 mm) for all cases and multiple observers were 0.78, 0.88, and 0.99, indicating that, on average, spatial overlap agreement between ventilation and perfusion defect regions was comparable to the threshold for agreement within 1-2 mm uncertainty. Corresponding coefficients of variation for all metrics were similarly in the range 0.10%-19%. This study is the first to quantitatively assess 3D spatial overlap agreement between clinically acquired SPECT perfusion and specific ventilation from 4D CT. Results suggest high correlation between methods within the sub-population of lung cancer patients with malignant airway stenosis.
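
    The (unnormalized) Dice similarity coefficient between two binary defect masks can be computed directly, as sketched below; the distance-tolerance normalization behind NDSC is omitted, and the masks are placeholders.

```python
import numpy as np

def dice(a, b):
    """Dice similarity coefficient between two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

ventilation_defect = np.zeros((64, 64, 64), bool)
ventilation_defect[20:40] = True
perfusion_defect = np.zeros((64, 64, 64), bool)
perfusion_defect[25:45] = True
print(round(dice(ventilation_defect, perfusion_defect), 3))
```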

  19. A higher-order conservation element solution element method for solving hyperbolic differential equations on unstructured meshes

    NASA Astrophysics Data System (ADS)

    Bilyeu, David

    This dissertation presents an extension of the Conservation Element Solution Element (CESE) method from second- to higher-order accuracy. The new method retains the favorable characteristics of the original second-order CESE scheme, including (i) the use of the space-time integral equation for conservation laws, (ii) a compact mesh stencil, (iii) stability up to a CFL number of unity, (iv) a fully explicit, time-marching integration scheme, (v) true multidimensionality without using directional splitting, and (vi) the ability to handle two- and three-dimensional geometries by using unstructured meshes. This algorithm has been thoroughly tested in one, two and three spatial dimensions and has been shown to obtain the desired order of accuracy for solving both linear and non-linear hyperbolic partial differential equations. The scheme has also shown its ability to accurately resolve discontinuities in the solutions. Higher-order unstructured methods such as the Discontinuous Galerkin (DG) method and the Spectral Volume (SV) method have been developed for one-, two- and three-dimensional applications. Although these schemes have seen extensive development and use, certain drawbacks of these methods have been well documented. For example, the explicit versions of these two methods have very stringent stability criteria, which require that the time step be reduced as the order of the solver increases, for a given simulation on a given mesh. The research presented in this dissertation builds upon the work of Chang, who developed a fourth-order CESE scheme to solve a scalar one-dimensional hyperbolic partial differential equation. The completed research has resulted in two key deliverables. The first is a detailed derivation of high-order CESE methods on unstructured meshes for solving the conservation laws in two- and three-dimensional spaces. The second is the implementation of these numerical methods in a computer code. For code development, a one-dimensional solver for the Euler equations was developed, extending Chang's work on the fourth-order CESE method for solving a one-dimensional scalar convection equation. A generic formulation for the nth-order CESE method, where n ≥ 4, was derived. Indeed, numerical implementation of the scheme confirmed that the order of convergence was consistent with the order of the scheme. For the two- and three-dimensional solvers, SOLVCON was used as the basic framework for code implementation. A new solver kernel for the fourth-order CESE method has been developed and integrated into the framework provided by SOLVCON. The main part of SOLVCON, which deals with unstructured meshes and parallel computing, including the code for data transmission between computer nodes for high-performance computing (HPC), remains intact. To validate and verify the newly developed high-order CESE algorithms, several one-, two- and three-dimensional simulations were conducted. For the arbitrary-order, one-dimensional CESE solver, three sets of governing equations were selected for simulation: (i) the linear convection equation, (ii) the linear acoustic equations, and (iii) the nonlinear Euler equations. All three systems of equations were used to verify the order of convergence through mesh refinement. In addition, the Euler equations were used to solve the Shu-Osher and Blastwave problems.
These two simulations demonstrated that the new high-order CESE methods can accurately resolve discontinuities in the flow field. For the two-dimensional, fourth-order CESE solver, the Euler equations were employed in four different test cases. The first case was used to verify the order of convergence through mesh refinement. The next three cases demonstrated the ability of the new solver to accurately resolve discontinuities in the flows, through: (i) the interaction between acoustic waves and an entropy pulse, (ii) supersonic flow over a circular blunt body, and (iii) supersonic flow over a guttered wedge. To validate and verify the three-dimensional, fourth-order CESE solver, two different simulations were selected. The first used the linear convection equations to demonstrate fourth-order convergence. The second used the Euler equations to simulate supersonic flow over a spherical body to demonstrate the scheme's ability to accurately resolve shocks. All test cases are well-known benchmark problems, and as such, there are multiple sources available to validate the numerical results. Furthermore, the simulations showed that the high-order CESE solver was stable at a CFL number near unity.

  20. Laser-driven three-stage heavy-ion acceleration from relativistic laser-plasma interaction.

    PubMed

    Wang, H Y; Lin, C; Liu, B; Sheng, Z M; Lu, H Y; Ma, W J; Bin, J H; Schreiber, J; He, X T; Chen, J E; Zepf, M; Yan, X Q

    2014-01-01

    A three-stage heavy-ion acceleration scheme for the generation of high-energy quasimonoenergetic heavy-ion beams is investigated using two-dimensional particle-in-cell simulation and analytical modeling. The scheme is based on the interaction of an intense linearly polarized laser pulse with a compound two-layer target (a front heavy-ion layer + a second light-ion layer). We identify that, under appropriate conditions, the heavy ions preaccelerated by a two-stage acceleration process in the front layer can be injected into the light-ion shock wave in the second layer for a further third-stage acceleration. These injected heavy ions are not influenced by the screening effect from the light ions, and an isolated high-energy heavy-ion beam with relatively low energy spread is thus formed. Two-dimensional particle-in-cell simulations show that ∼100 MeV/u quasimonoenergetic Fe²⁴⁺ beams can be obtained with linearly polarized laser pulses at intensities of 1.1×10²¹ W/cm².

  1. Conceptual model of consumer’s willingness to eat functional foods

    PubMed

    Babicz-Zielinska, Ewa; Jezewska-Zychowicz, Maria

    Functional foods constitute an important segment of the food market. Among the factors that determine intentions to eat functional foods, psychological factors play very important roles: motives, attitudes, and personality are key. The relationships between socio-demographic characteristics, attitudes, and willingness to purchase functional foods have not been fully confirmed. Consumers' beliefs about the health benefits of the foods they eat seem to be a strong determinant of the choice of functional foods. The objective of this study was to determine the relations between familiarity, attitudes, and beliefs about the benefits and risks of functional foods, and to develop conceptual models of willingness to eat them. The sample of Polish consumers comprised 1002 subjects aged 15+. Foods enriched with vitamins or minerals and cholesterol-lowering margarines or drinks were considered. A questionnaire focused on familiarity with the foods, attitudes, and beliefs about the benefits and risks of their consumption was constructed. Pearson's correlations and linear regression equations were calculated. The strongest relations appeared between attitudes, high health value, and high benefits (r = 0.722 and 0.712 for enriched foods, and 0.664 and 0.693 for cholesterol-lowering foods), and between high health value and high benefits (0.814 for enriched foods and 0.758 for cholesterol-lowering foods). Conceptual models based on linear regressions of the relations between attitudes and all other variables, with and without familiarity with the foods, were developed. Positive attitudes and declared consumption are more important for enriched foods. Beliefs in high health value and high benefits play the most important role in purchase decisions. The interrelations between the different variables may be described by new linear regression models, with beliefs in high benefits, positive attitudes, and familiarity being the most significant predictors. Health expectations and trust in functional foods are the key factors in their choice.

  2. A Predictive Model to Identify Patients With Fecal Incontinence Based on High-Definition Anorectal Manometry.

    PubMed

    Zifan, Ali; Ledgerwood-Lee, Melissa; Mittal, Ravinder K

    2016-12-01

    Three-dimensional high-definition anorectal manometry (3D-HDAM) is used to assess anal sphincter function; it determines profiles of regional pressure distribution along the length and circumference of the anal canal. There is no consensus, however, on the best way to analyze data from 3D-HDAM to distinguish healthy individuals from persons with sphincter dysfunction. We developed a computer analysis system to analyze 3D-HDAM data and to aid in the diagnosis and assessment of patients with fecal incontinence (FI). In a prospective study, we performed 3D-HDAM analysis of 24 asymptomatic healthy subjects (control subjects; all women; mean age, 39 ± 10 years) and 24 patients with symptoms of FI (all women; mean age, 58 ± 13 years). Patients completed a standardized questionnaire (FI severity index) to score the severity of FI symptoms. We developed and evaluated a robust prediction model to distinguish patients with FI from control subjects using linear discriminant, quadratic discriminant, and logistic regression analyses. In addition to collecting pressure information from the HDAM data, we assessed regional features based on shape characteristics and the anal sphincter pressure symmetry index. The combination of pressure values, anal sphincter area, and reflective symmetry values distinguished patients with FI from control subjects with an area under the curve (AUC) value of 1.0. In logistic regression analyses using different predictors, the model identified patients with FI with an AUC value of 0.96 (interquartile range, 0.22). In discriminant analysis, results were classified with a minimum error of 0.02, calculated using 10-fold cross-validation; different combinations of predictors produced median classification errors of 0.16 in linear discriminant analysis (interquartile range, 0.25) and 0.08 in quadratic discriminant analysis (interquartile range, 0.25). We developed and validated a novel prediction model to analyze 3D-HDAM data. This system can accurately distinguish patients with FI from control subjects. Copyright © 2016 AGA Institute. Published by Elsevier Inc. All rights reserved.
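
    A sketch of the three classifiers compared above under 10-fold cross-validation; the feature matrix and labels are placeholders for the pressure, area, and symmetry predictors used in the study.

```python
import numpy as np
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                           QuadraticDiscriminantAnalysis)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.random((48, 6))                  # placeholder manometry features
y = np.repeat([0, 1], 24)                # 24 controls, 24 FI patients

for clf in (LinearDiscriminantAnalysis(), QuadraticDiscriminantAnalysis(),
            LogisticRegression(max_iter=1000)):
    err = 1 - cross_val_score(clf, X, y, cv=10).mean()   # classification error
    print(type(clf).__name__, round(err, 3))
```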

  3. Linear and nonlinear subspace analysis of hand movements during grasping.

    PubMed

    Cui, Phil Hengjun; Visell, Yon

    2014-01-01

    This study investigated nonlinear patterns of coordination, or synergies, underlying whole-hand grasping kinematics. Prior research has shed considerable light on the roles played by such coordinated degrees of freedom (DOF), illuminating how motor control is facilitated by structural and functional specializations in the brain, peripheral nervous system, and musculoskeletal system. However, existing analyses suppose that the patterns of coordination can be captured by means of linear analyses, as linear combinations of nominally independent DOF. In contrast, hand kinematics is itself highly nonlinear in nature. To address this discrepancy, we sought to determine whether nonlinear synergies might serve to more accurately and efficiently explain human grasping kinematics than is possible with linear analyses. We analyzed motion capture data acquired from the hands of individuals as they grasped an array of common objects, using four of the most widely used linear and nonlinear dimensionality reduction algorithms. We compared the results using a recently developed algorithm-agnostic quality measure, which enabled us to assess the quality of the resulting dimensional reductions by the extent to which local neighborhood information in the data was preserved. Although qualitative inspection of the data suggested that nonlinear correlations between kinematic variables were present, we found that linear modeling, in the form of Principal Components Analysis, could perform better than any of the nonlinear techniques we applied.

  4. Accuracy of Multi-echo Magnitude-based MRI (M-MRI) for Estimation of Hepatic Proton Density Fat Fraction (PDFF) in Children

    PubMed Central

    Zand, Kevin A.; Shah, Amol; Heba, Elhamy; Wolfson, Tanya; Hamilton, Gavin; Lam, Jessica; Chen, Joshua; Hooker, Jonathan C.; Gamst, Anthony C.; Middleton, Michael S.; Schwimmer, Jeffrey B.; Sirlin, Claude B.

    2015-01-01

    Purpose: To assess accuracy of magnitude-based magnetic resonance imaging (M-MRI) in children to estimate hepatic proton density fat fraction (PDFF) using two to six echoes, with magnetic resonance spectroscopy (MRS)-measured PDFF as a reference standard. Materials and Methods: This was an IRB-approved, HIPAA-compliant, single-center, cross-sectional, retrospective analysis of data collected prospectively between 2008 and 2013 in children with known or suspected non-alcoholic fatty liver disease (NAFLD). Two hundred and eighty-six children (8-20 [mean 14.2 ± 2.5] yrs; 182 boys) underwent same-day MRS and M-MRI. Unenhanced two-dimensional axial spoiled gradient-recalled-echo images at six echo times were obtained at 3T after a single low-flip-angle (10°) excitation with ≥ 120-ms recovery time. Hepatic PDFF was estimated using the first two, three, four, five, and all six echoes. For each number of echoes, accuracy of M-MRI to estimate PDFF was assessed by linear regression with MRS-PDFF as reference standard. Accuracy metrics were regression intercept, slope, average bias, and R². Results: MRS-PDFF ranged from 0.2-40.4% (mean 13.1 ± 9.8%). Using three to six echoes, regression intercept, slope, and average bias were 0.46-0.96%, 0.99-1.01, and 0.57-0.89%, respectively. Using two echoes, these values were 2.98%, 0.97, and 2.72%, respectively. R² ranged from 0.98 to 0.99 for all methods. Conclusion: Using three to six echoes, M-MRI has high accuracy for hepatic PDFF estimation in children. PMID:25847512

  5. Accuracy of multiecho magnitude-based MRI (M-MRI) for estimation of hepatic proton density fat fraction (PDFF) in children.

    PubMed

    Zand, Kevin A; Shah, Amol; Heba, Elhamy; Wolfson, Tanya; Hamilton, Gavin; Lam, Jessica; Chen, Joshua; Hooker, Jonathan C; Gamst, Anthony C; Middleton, Michael S; Schwimmer, Jeffrey B; Sirlin, Claude B

    2015-11-01

    To assess accuracy of magnitude-based magnetic resonance imaging (M-MRI) in children to estimate hepatic proton density fat fraction (PDFF) using two to six echoes, with magnetic resonance spectroscopy (MRS)-measured PDFF as a reference standard. This was an IRB-approved, HIPAA-compliant, single-center, cross-sectional, retrospective analysis of data collected prospectively between 2008 and 2013 in children with known or suspected nonalcoholic fatty liver disease (NAFLD). Two hundred eighty-six children (8-20 [mean 14.2 ± 2.5] years; 182 boys) underwent same-day MRS and M-MRI. Unenhanced two-dimensional axial spoiled gradient-recalled-echo images at six echo times were obtained at 3T after a single low-flip-angle (10°) excitation with ≥ 120-ms recovery time. Hepatic PDFF was estimated using the first two, three, four, five, and all six echoes. For each number of echoes, accuracy of M-MRI to estimate PDFF was assessed by linear regression with MRS-PDFF as reference standard. Accuracy metrics were regression intercept, slope, average bias, and R². MRS-PDFF ranged from 0.2-40.4% (mean 13.1 ± 9.8%). Using three to six echoes, regression intercept, slope, and average bias were 0.46-0.96%, 0.99-1.01, and 0.57-0.89%, respectively. Using two echoes, these values were 2.98%, 0.97, and 2.72%, respectively. R² ranged from 0.98 to 0.99 for all methods. Using three to six echoes, M-MRI has high accuracy for hepatic PDFF estimation in children. © 2015 Wiley Periodicals, Inc.
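
    The four accuracy metrics used in both versions of this study reduce to one regression call; a minimal sketch, with synthetic arrays standing in for the M-MRI estimates and the MRS reference values.

```python
import numpy as np
from scipy import stats

def accuracy_metrics(mri_pdff, mrs_pdff):
    """Regression intercept, slope, average bias, and R^2 vs. the reference."""
    slope, intercept, r, _, _ = stats.linregress(mrs_pdff, mri_pdff)
    bias = float(np.mean(mri_pdff - mrs_pdff))
    return intercept, slope, bias, r ** 2

rng = np.random.default_rng(0)
mrs = rng.uniform(0.2, 40.4, 286)                 # reference PDFF (%)
mri = 0.7 + 1.0 * mrs + rng.normal(0, 1.0, 286)   # synthetic M-MRI estimates
print(accuracy_metrics(mri, mrs))
```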

  6. Spectral embedding finds meaningful (relevant) structure in image and microarray data

    PubMed Central

    Higgs, Brandon W; Weller, Jennifer; Solka, Jeffrey L

    2006-01-01

    Background: Accurate methods for extraction of meaningful patterns in high dimensional data have become increasingly important with the recent generation of data types containing measurements across thousands of variables. Principal components analysis (PCA) is a linear dimensionality reduction (DR) method that is unsupervised in that it relies only on the data; projections are calculated in Euclidean or a similar linear space and do not use tuning parameters for optimizing the fit to the data. However, relationships within sets of nonlinear data types, such as biological networks or images, are frequently mis-rendered into a low dimensional space by linear methods. Nonlinear methods, in contrast, attempt to model important aspects of the underlying data structure, often requiring parameter(s) fitting to the data type of interest. In many cases, the optimal parameter values vary when different classification algorithms are applied on the same rendered subspace, making the results of such methods highly dependent upon the type of classifier implemented. Results: We present the results of applying the spectral method of Lafon, a nonlinear DR method based on the weighted graph Laplacian that minimizes the requirements for such parameter optimization, to two biological data types. We demonstrate that it is successful in determining implicit ordering of brain slice image data and in classifying separate species in microarray data, as compared to two conventional linear methods and three nonlinear methods (one of which is an alternative spectral method). This spectral implementation is shown to provide more meaningful information, by preserving important relationships, than the methods of DR presented for comparison. Tuning-parameter fitting is simple and is a general, rather than data-type- or experiment-specific, approach for the two datasets analyzed here. Tuning-parameter optimization in the DR step is minimized for each subsequent classification method, enabling the possibility of valid cross-experiment comparisons. Conclusion: Results from the spectral method presented here exhibit the desirable properties of preserving meaningful nonlinear relationships in lower dimensional space and requiring minimal parameter fitting, providing a useful algorithm for purposes of visualization and classification across diverse datasets, a common challenge in systems biology. PMID:16483359
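
    A weighted-graph-Laplacian embedding is available off the shelf; a minimal sketch, assuming a samples-by-genes matrix X and using scikit-learn's SpectralEmbedding (a Laplacian-eigenmap-style method, not Lafon's exact diffusion-map normalization).

```python
import numpy as np
from sklearn.manifold import SpectralEmbedding

rng = np.random.default_rng(0)
X = rng.random((300, 2000))              # e.g., samples x genes (placeholder)

# Build an RBF affinity graph and embed via its graph Laplacian.
Z = SpectralEmbedding(n_components=2, affinity="rbf", gamma=1.0).fit_transform(X)
print(Z.shape)                           # (300, 2) low-dimensional rendering
```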

  7. Learning accurate and interpretable models based on regularized random forests regression

    PubMed Central

    2014-01-01

    Background: Many biology-related research works combine data from multiple sources in an effort to understand the underlying problems. It is important to find and interpret the most important information from these sources. Thus it would be beneficial to have an effective algorithm that can simultaneously extract decision rules and select critical features for good interpretation while preserving the prediction performance. Methods: In this study, we focus on regression problems for biological data where target outcomes are continuous. In general, models constructed from linear regression approaches are relatively easy to interpret. However, many practical biological applications are nonlinear in essence, where we can hardly find a direct linear relationship between input and output. Nonlinear regression techniques can reveal the nonlinear relationships in data, but are generally hard for humans to interpret. We propose a rule-based regression algorithm that uses 1-norm regularized random forests. The proposed approach simultaneously extracts a small number of rules from generated random forests and eliminates unimportant features. Results: We tested the approach on several biological data sets. The proposed approach is able to construct a significantly smaller set of regression rules using a subset of attributes while achieving prediction performance comparable to that of random forests regression. Conclusion: It demonstrates high potential in aiding prediction and interpretation of nonlinear relationships of the subject being studied. PMID:25350120

  8. Spatially resolved regression analysis of pre-treatment FDG, FLT and Cu-ATSM PET from post-treatment FDG PET: an exploratory study

    PubMed Central

    Bowen, Stephen R; Chappell, Richard J; Bentzen, Søren M; Deveau, Michael A; Forrest, Lisa J; Jeraj, Robert

    2012-01-01

    Purpose: To quantify associations between pre-radiotherapy and post-radiotherapy PET parameters via spatially resolved regression. Materials and methods: Ten canine sinonasal cancer patients underwent PET/CT scans of [18F]FDG (FDGpre), [18F]FLT (FLTpre), and [61Cu]Cu-ATSM (Cu-ATSMpre). Following radiotherapy regimens of 50 Gy in 10 fractions, veterinary patients underwent FDG PET/CT scans at three months (FDGpost). Regression of standardized uptake values in baseline FDGpre, FLTpre and Cu-ATSMpre tumour voxels to those in FDGpost images was performed for linear, log-linear, generalized-linear and mixed-fit linear models. Goodness-of-fit in regression coefficients was assessed by R². Hypothesis testing of coefficients over the patient population was performed. Results: Multivariate linear model fits of FDGpre to FDGpost were significantly positive over the population (FDGpost ~ 0.17 FDGpre, p = 0.03), and classified slopes of RECIST non-responders and responders to be different (0.37 vs. 0.07, p = 0.01). Generalized-linear model fits related FDGpre to FDGpost by a linear power law (FDGpost ~ FDGpre^0.93, p < 0.001). Univariate mixture model fits of FDGpre improved R² from 0.17 to 0.52. Neither baseline FLT PET nor Cu-ATSM PET uptake contributed statistically significant multivariate regression coefficients. Conclusions: Spatially resolved regression analysis indicates that pre-treatment FDG PET uptake is most strongly associated with three-month post-treatment FDG PET uptake in this patient population, though associations are histopathology-dependent. PMID:22682748

  9. Linear regression models and k-means clustering for statistical analysis of fNIRS data.

    PubMed

    Bonomini, Viola; Zucchelli, Lucia; Re, Rebecca; Ieva, Francesca; Spinelli, Lorenzo; Contini, Davide; Paganoni, Anna; Torricelli, Alessandro

    2015-02-01

    We propose a new algorithm, based on a linear regression model, to statistically estimate the hemodynamic activations in fNIRS data sets. The main concern guiding the algorithm development was the minimization of assumptions and approximations made on the data set for the application of statistical tests. Further, we propose a K-means method to cluster fNIRS data (i.e. channels) as activated or not activated. The methods were validated both on simulated and in vivo fNIRS data. A time domain (TD) fNIRS technique was preferred because of its high performances in discriminating cortical activation and superficial physiological changes. However, the proposed method is also applicable to continuous wave or frequency domain fNIRS data sets.

  10. Linear regression models and k-means clustering for statistical analysis of fNIRS data

    PubMed Central

    Bonomini, Viola; Zucchelli, Lucia; Re, Rebecca; Ieva, Francesca; Spinelli, Lorenzo; Contini, Davide; Paganoni, Anna; Torricelli, Alessandro

    2015-01-01

    We propose a new algorithm, based on a linear regression model, to statistically estimate the hemodynamic activations in fNIRS data sets. The main concern guiding the algorithm development was the minimization of assumptions and approximations made on the data set for the application of statistical tests. Further, we propose a K-means method to cluster fNIRS data (i.e. channels) as activated or not activated. The methods were validated both on simulated and in vivo fNIRS data. A time domain (TD) fNIRS technique was preferred because of its high performances in discriminating cortical activation and superficial physiological changes. However, the proposed method is also applicable to continuous wave or frequency domain fNIRS data sets. PMID:25780751
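
    A hedged sketch of the two ingredients named in both versions of this record, assuming a simple block-design task regressor: a per-channel linear regression of the fNIRS time series, then k-means on the fitted amplitudes to split channels into activated / not activated. All signals here are synthetic stand-ins.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
t = np.arange(600)                              # time samples
task = (t % 60 < 30).astype(float)              # assumed block-design regressor
signals = rng.normal(size=(32, 600))            # 32 channels of noise
signals[:8] += 0.5 * task                       # 8 truly activated channels

# Per-channel linear model: signal ~ beta * task + intercept.
X = np.column_stack([task, np.ones_like(task)])
betas = np.linalg.lstsq(X, signals.T, rcond=None)[0][0]   # task amplitudes

# Cluster fitted amplitudes into activated vs. not activated.
labels = KMeans(n_clusters=2, n_init=10).fit_predict(betas.reshape(-1, 1))
print(labels)
```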

  11. Comparing machine learning and logistic regression methods for predicting hypertension using a combination of gene expression and next-generation sequencing data.

    PubMed

    Held, Elizabeth; Cape, Joshua; Tintle, Nathan

    2016-01-01

    Machine learning methods continue to show promise in the analysis of data from genetic association studies because of the high number of variables relative to the number of observations. However, few best practices exist for the application of these methods. We extend a recently proposed supervised machine learning approach for predicting disease risk from genotypes so that it can incorporate gene expression data and rare variants. We then apply two different versions of the approach (radial and linear support vector machines) to simulated data from Genetic Analysis Workshop 19 and compare their performance to logistic regression. Performance was not radically different across the three methods, although the linear support vector machine tended to show small gains in predictive ability relative to the radial support vector machine and logistic regression. Importantly, as the number of genes in the models was increased, even when those genes contained causal rare variants, predictive ability showed a statistically significant decrease for both the radial support vector machine and logistic regression. The linear support vector machine was more robust to the inclusion of additional genes. Further work is needed to evaluate machine learning approaches on larger samples and to evaluate the relative improvement in model prediction from the incorporation of gene expression data.
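
    A sketch of the three models compared above on a generic feature matrix with binary disease labels; the data, feature count, and scoring choice are placeholders, not the workshop's simulated genotypes.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.random((200, 50))                # placeholder genotype/expression features
y = rng.integers(0, 2, 200)              # placeholder disease labels

for name, clf in [("linear SVM", SVC(kernel="linear")),
                  ("radial SVM", SVC(kernel="rbf")),
                  ("logistic regression", LogisticRegression(max_iter=1000))]:
    auc = cross_val_score(clf, X, y, scoring="roc_auc").mean()
    print(name, round(auc, 3))
```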

  12. Linear regression analysis of survival data with missing censoring indicators.

    PubMed

    Wang, Qihua; Dinse, Gregg E

    2011-04-01

    Linear regression analysis has been studied extensively in a random censorship setting, but typically all of the censoring indicators are assumed to be observed. In this paper, we develop synthetic data methods for estimating regression parameters in a linear model when some censoring indicators are missing. We define estimators based on regression calibration, imputation, and inverse probability weighting techniques, and we prove all three estimators are asymptotically normal. The finite-sample performance of each estimator is evaluated via simulation. We illustrate our methods by assessing the effects of sex and age on the time to non-ambulatory progression for patients in a brain cancer clinical trial.

  13. An Analysis of COLA (Cost of Living Adjustment) Allocation within the United States Coast Guard.

    DTIC Science & Technology

    1983-09-01

    Record text consists of search-snippet fragments citing Weisberg, S., Applied Linear Regression (Wiley, 1980), Statistical Methods in Research and Production, and a 1976 Master's Thesis from the Air Force Institute of Technology, Wright-Patterson AFB.

  14. Testing hypotheses for differences between linear regression lines

    Treesearch

    Stanley J. Zarnoch

    2009-01-01

    Five hypotheses are identified for testing differences between simple linear regression lines. The distinctions between these hypotheses are based on a priori assumptions and illustrated with full and reduced models. The contrast approach is presented as an easy and complete method for testing for overall differences between the regressions and for making pairwise...

  15. Graphical Description of Johnson-Neyman Outcomes for Linear and Quadratic Regression Surfaces.

    ERIC Educational Resources Information Center

    Schafer, William D.; Wang, Yuh-Yin

    A modification of the usual graphical representation of heterogeneous regressions is described that can aid in interpreting significant regions for linear or quadratic surfaces. The standard Johnson-Neyman graph is a bivariate plot with the criterion variable on the ordinate and the predictor variable on the abscissa. Regression surfaces are drawn…

  16. Teaching the Concept of Breakdown Point in Simple Linear Regression.

    ERIC Educational Resources Information Center

    Chan, Wai-Sum

    2001-01-01

    Most introductory textbooks on simple linear regression analysis mention the fact that extreme data points have a great influence on ordinary least-squares regression estimation; however, not many textbooks provide a rigorous mathematical explanation of this phenomenon. Suggests a way to fill this gap by teaching students the concept of breakdown…
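
    The key fact is that ordinary least squares has a breakdown point of 1/n, which tends to 0 as the sample grows: corrupting a single observation can move the fit arbitrarily far. A minimal numerical demonstration on invented data:

      import numpy as np

      rng = np.random.default_rng(2)
      x = np.linspace(0, 1, 20)
      y = 2.0 + 3.0 * x + rng.normal(scale=0.1, size=20)

      def ols_slope(x, y):
          return np.polyfit(x, y, 1)[0]

      print(ols_slope(x, y))        # close to the true slope of 3
      y_bad = y.copy()
      y_bad[-1] = 1e6               # corrupt one observation
      print(ols_slope(x, y_bad))    # slope explodes: breakdown point is 1/n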

  17. Estimating monotonic rates from biological data using local linear regression.

    PubMed

    Olito, Colin; White, Craig R; Marshall, Dustin J; Barneche, Diego R

    2017-03-01

    Accessing many fundamental questions in biology begins with empirical estimation of simple monotonic rates of underlying biological processes. Across a variety of disciplines, ranging from physiology to biogeochemistry, these rates are routinely estimated from non-linear and noisy time series data using linear regression and ad hoc manual truncation of non-linearities. Here, we introduce the R package LoLinR, a flexible toolkit to implement local linear regression techniques to objectively and reproducibly estimate monotonic biological rates from non-linear time series data, and demonstrate possible applications using metabolic rate data. LoLinR provides methods to easily and reliably estimate monotonic rates from time series data in a way that is statistically robust, facilitates reproducible research and is applicable to a wide variety of research disciplines in the biological sciences. © 2017. Published by The Company of Biologists Ltd.
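
    LoLinR itself is an R package; the Python sketch below only mimics the underlying idea under simplifying assumptions, scoring every candidate window by R-squared alone, whereas LoLinR weights and ranks the local regressions more carefully.

      import numpy as np

      def local_linear_rate(t, y, min_window=10):
          """Fit OLS on every contiguous window of the series and keep the
          most linear one; returns (R^2, rate estimate, start, end)."""
          best = None
          n = len(t)
          for i in range(n - min_window + 1):
              for j in range(i + min_window, n + 1):
                  slope, intercept = np.polyfit(t[i:j], y[i:j], 1)
                  resid = y[i:j] - (slope * t[i:j] + intercept)
                  r2 = 1.0 - resid.var() / y[i:j].var()
                  if best is None or r2 > best[0]:
                      best = (r2, slope, i, j)
          return best

      # e.g. a metabolic-rate trace that is only linear in its later section
      t = np.linspace(0, 30, 120)
      y = np.where(t < 5, 10.0, 10.0 - 0.4 * (t - 5)) + np.random.default_rng(3).normal(scale=0.1, size=120)
      print(local_linear_rate(t, y))   # the estimated rate should be near -0.4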

  18. Improving power and robustness for detecting genetic association with extreme-value sampling design.

    PubMed

    Chen, Hua Yun; Li, Mingyao

    2011-12-01

    Extreme-value sampling design that samples subjects with extremely large or small quantitative trait values is commonly used in genetic association studies. Samples in such designs are often treated as "cases" and "controls" and analyzed using logistic regression. Such a case-control analysis ignores the potential dose-response relationship between the quantitative trait and the underlying trait locus and thus may lead to loss of power in detecting genetic association. An alternative approach to analyzing such data is to model the dose-response relationship by a linear regression model. However, parameter estimation from this model can be biased, which may lead to inflated type I errors. We propose a robust and efficient approach that takes into consideration both the biased sampling design and the potential dose-response relationship. Extensive simulations demonstrate that the proposed method is more powerful than the traditional logistic regression analysis and is more robust than the linear regression analysis. We applied our method to the analysis of a candidate gene association study on high-density lipoprotein cholesterol (HDL-C), which includes study subjects with extremely high or low HDL-C levels. Using our method, we identified several SNPs showing stronger evidence of association with HDL-C than the traditional case-control logistic regression analysis. Our results suggest that it is important to appropriately model the quantitative trait and to adjust for the biased sampling when a dose-response relationship exists in extreme-value sampling designs. © 2011 Wiley Periodicals, Inc.

  19. A comparison of radiometric correction techniques in the evaluation of the relationship between LST and NDVI in Landsat imagery.

    PubMed

    Tan, Kok Chooi; Lim, Hwee San; Matjafri, Mohd Zubir; Abdullah, Khiruddin

    2012-06-01

    Atmospheric corrections for multi-temporal optical satellite images are necessary, especially in change detection analyses, such as normalized difference vegetation index (NDVI) rationing. Abrupt change detection analysis using remote-sensing techniques requires radiometric congruity and atmospheric correction to monitor terrestrial surfaces over time. Two atmospheric correction methods were used for this study: relative radiometric normalization and the simplified method for atmospheric correction (SMAC) in the solar spectrum. A multi-temporal data set consisting of two sets of Landsat images from the period between 1991 and 2002 of Penang Island, Malaysia, was used to compare NDVI maps generated using the proposed atmospheric correction methods. Land surface temperature (LST) was retrieved using ATCOR3_T in PCI Geomatica 10.1 image processing software. Linear regression analysis was utilized to analyze the relationship between NDVI and LST. This study reveals that both of the proposed atmospheric correction methods yielded high accuracy, as judged by the linear correlation coefficients. To check the accuracy of the equation obtained through linear regression analysis for every single satellite image, 20 points were randomly chosen. The results showed that the SMAC method yielded a consistent error when predicting the NDVI value from the regression-derived equation. The average errors from both proposed atmospheric correction methods were less than 10%.

  20. Multivariate Strategies in Functional Magnetic Resonance Imaging

    ERIC Educational Resources Information Center

    Hansen, Lars Kai

    2007-01-01

    We discuss aspects of multivariate fMRI modeling, including the statistical evaluation of multivariate models and means for dimensional reduction. In a case study we analyze linear and non-linear dimensional reduction tools in the context of a "mind reading" predictive multivariate fMRI model.

  1. A Few New 2+1-Dimensional Nonlinear Dynamics and the Representation of Riemann Curvature Tensors

    NASA Astrophysics Data System (ADS)

    Wang, Yan; Zhang, Yufeng; Zhang, Xiangzhi

    2016-09-01

    We first introduce a linear stationary equation with a quadratic operator in ∂x and ∂y, and then give a linear evolution equation in terms of N-order polynomials of eigenfunctions. As applications, taking N=2 we derive a (2+1)-dimensional generalized linear heat equation with two constant parameters associated with a symmetric space. Taking N=3 generates a pair of generalized Kadomtsev-Petviashvili equations with the same eigenvalues as in the N=2 case. Similarly, a second-order flow associated with a homogeneous space is derived from the integrability condition of the two linear equations; it is a (2+1)-dimensional hyperbolic equation. When N=3, a third flow associated with the homogeneous space is generated, which is a pair of new generalized Kadomtsev-Petviashvili equations. Finally, as an application of a Hermitian symmetric space, we establish a pair of spectral problems to obtain a new (2+1)-dimensional generalized Schrödinger equation, which is expressed via the Riemann curvature tensors.

  2. INNOVATIVE INSTRUMENTATION AND ANALYSIS OF THE TEMPERATURE MEASUREMENT FOR HIGH TEMPERATURE GASIFICATION

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Seong W. Lee

    During this reporting period, the literature survey, covering the gasifier temperature measurement literature, the background of ultrasonic cleaning applications, and the spray coating process, was completed. The gasifier simulator (cold model) testing has been successfully conducted. Four factors (blower voltage, ultrasonic application, injection time intervals, particle weight) were considered as significant factors that affect the temperature measurement. Analysis of variance (ANOVA) was applied to the test data and shows that all four factors are significant to the temperature measurements in the gasifier simulator (cold model). The regression analysis for the case with normalized room temperature shows that a linear model fits the temperature data with 82% accuracy (18% error); for the case without normalized room temperature, the accuracy is 72.5% (27.5% error). The nonlinear regression analysis indicates a better fit than the linear regression: the nonlinear model's accuracy is 88.7% (11.3% error) for the normalized room temperature case. The hot model thermocouple sleeve design and fabrication are completed. The gasifier simulator (hot model) design and fabrication are completed. System tests of the gasifier simulator (hot model) have been conducted and some modifications have been made. Based on the system tests and results analysis, the gasifier simulator (hot model) has met the proposed design requirements and is ready for system testing. The ultrasonic cleaning method is under evaluation and will be further studied for the gasifier simulator (hot model) application. The progress of this project has been on schedule.

  3. Locally linear regression for pose-invariant face recognition.

    PubMed

    Chai, Xiujuan; Shan, Shiguang; Chen, Xilin; Gao, Wen

    2007-07-01

    The variation of facial appearance due to viewpoint (pose) degrades face recognition systems considerably, and is one of the bottlenecks in face recognition. One possible solution is to generate a virtual frontal view from any given nonfrontal view to obtain a virtual gallery/probe face. Following this idea, this paper proposes a simple but efficient novel locally linear regression (LLR) method, which generates the virtual frontal view from a given nonfrontal face image. We first justify the basic assumption of the paper that there exists an approximate linear mapping between a nonfrontal face image and its frontal counterpart. Then, by formulating the estimation of the linear mapping as a prediction problem, we present the regression-based solution, i.e., globally linear regression. To improve the prediction accuracy in the case of coarse alignment, LLR is further proposed. In LLR, we first perform dense sampling in the nonfrontal face image to obtain many overlapped local patches. Then, the linear regression technique is applied to each small patch for the prediction of its virtual frontal patch. Through the combination of all these patches, the virtual frontal view is generated. The experimental results on the CMU PIE database show a distinct advantage of the proposed method over the eigen light-field method.
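
    The heart of LLR is one least-squares linear map per local patch, learned from training pairs of nonfrontal and frontal patches. A minimal sketch of that step follows; shapes and names are assumptions, and face alignment, dense sampling, and patch blending are omitted.

      import numpy as np

      def learn_patch_maps(nonfrontal, frontal):
          """nonfrontal, frontal: lists of (n_samples, patch_dim) arrays, one
          pair per local patch; returns one least-squares linear map per patch."""
          return [np.linalg.lstsq(X, Y, rcond=None)[0]
                  for X, Y in zip(nonfrontal, frontal)]

      def predict_frontal(patches, maps):
          """Apply the learned linear maps to the patches of a new nonfrontal face."""
          return [x @ W for x, W in zip(patches, maps)]

      # toy usage: 200 training faces, 4 patches of 64 pixels each
      rng = np.random.default_rng(4)
      train_np = [rng.normal(size=(200, 64)) for _ in range(4)]
      train_f = [Xp @ rng.normal(size=(64, 64)) for Xp in train_np]  # a linear relation
      maps = learn_patch_maps(train_np, train_f)
      probe = [rng.normal(size=64) for _ in range(4)]
      virtual_frontal_patches = predict_frontal(probe, maps)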

  4. A simple approach to power and sample size calculations in logistic regression and Cox regression models.

    PubMed

    Vaeth, Michael; Skovlund, Eva

    2004-06-15

    For a given regression problem it is possible to identify a suitably defined equivalent two-sample problem such that the power or sample size obtained for the two-sample problem also applies to the regression problem. For a standard linear regression model the equivalent two-sample problem is easily identified, but for generalized linear models and for Cox regression models the situation is more complicated. An approximately equivalent two-sample problem may, however, also be identified here. In particular, we show that for logistic regression and Cox regression models the equivalent two-sample problem is obtained by selecting two equally sized samples for which the parameters differ by a value equal to the slope times twice the standard deviation of the independent variable and further requiring that the overall expected number of events is unchanged. In a simulation study we examine the validity of this approach to power calculations in logistic regression and Cox regression models. Several different covariate distributions are considered for selected values of the overall response probability and a range of alternatives. For the Cox regression model we consider both constant and non-constant hazard rates. The results show that in general the approach is remarkably accurate even in relatively small samples. Some discrepancies are, however, found in small samples with few events and a highly skewed covariate distribution. Comparison with results based on alternative methods for logistic regression models with a single continuous covariate indicates that the proposed method is at least as good as its competitors. The method is easy to implement and therefore provides a simple way to extend the range of problems that can be covered by the usual formulas for power and sample size determination. Copyright 2004 John Wiley & Sons, Ltd.
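
    The logistic-regression case can be sketched as follows: the slope is converted into a two-group logit difference of 2*beta*SD(x), the two group probabilities are chosen so that the overall response probability is unchanged, and an ordinary two-sample proportion power formula is applied. This is only an illustration of the idea under simplifying assumptions, not the authors' exact derivation.

      import numpy as np
      from scipy import optimize, stats

      def power_logistic(n, beta, sd_x, p0, alpha=0.05):
          """Approximate power for the test of slope `beta` in logistic
          regression, via the equivalent two-sample problem (sketch)."""
          delta = 2.0 * beta * sd_x          # logit difference between the two groups
          # pick p1, p2 with mean p0 and logit difference delta
          f = lambda l1: (stats.logistic.cdf(l1) + stats.logistic.cdf(l1 + delta)) / 2 - p0
          l1 = optimize.brentq(f, -20, 20)
          p1, p2 = stats.logistic.cdf(l1), stats.logistic.cdf(l1 + delta)
          se = np.sqrt(p1 * (1 - p1) / (n / 2) + p2 * (1 - p2) / (n / 2))
          z = stats.norm.ppf(1 - alpha / 2)
          return stats.norm.cdf(abs(p2 - p1) / se - z)

      print(power_logistic(n=400, beta=0.5, sd_x=1.0, p0=0.3))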

  5. Novel Approach to Three-Dimensional Echocardiographic Quantification of Right Ventricular Volumes and Function from Focused Views.

    PubMed

    Medvedofsky, Diego; Addetia, Karima; Patel, Amit R; Sedlmeier, Anke; Baumann, Rolf; Mor-Avi, Victor; Lang, Roberto M

    2015-10-01

    Echocardiographic assessment of the right ventricle is difficult because of its complex shape. Three-dimensional echocardiographic (3DE) imaging allows more accurate and reproducible analysis of the right ventricle than two-dimensional methodology. However, three-dimensional volumetric analysis has been hampered by difficulties obtaining consistently high-quality coronal views, required by the existing software packages. The aim of this study was to test a new approach for volumetric analysis without coronal views by using instead right ventricle-focused three-dimensional acquisition with multiple short-axis views extracted from the same data set. Transthoracic 3DE and cardiovascular magnetic resonance (CMR) images were prospectively obtained on the same day in 147 patients with wide ranges of right ventricular (RV) size and function. RV volumes and ejection fraction were measured from 3DE images using the new software and compared with CMR reference values. Comparisons included linear regression and Bland-Altman analyses. Repeated measurements were performed to assess measurement variability. Sixteen patients were excluded because of suboptimal image quality (89% feasibility). RV volumes and ejection fraction obtained with the new 3DE technique were in good agreement with CMR (end-diastolic volume, r = 0.95; end-systolic volume, r = 0.96; ejection fraction, r = 0.83). Biases were, respectively, -6 ± 11%, 0 ± 15%, and -7 ± 17% of the mean measured values. In a subset of patients with suboptimal 3DE images, the new analysis resulted in significantly improved accuracy against CMR and reproducibility, compared with previously used coronal view-based techniques. The time required for the 3DE analysis was approximately 4 min. The new software is fast, reproducible, and accurate compared with CMR over a wide range of RV size and function. Because right ventricle-focused 3DE acquisition is feasible in most patients, this approach may be applicable to a broader population of patients who can benefit from RV volumetric assessment. Copyright © 2015 American Society of Echocardiography. Published by Elsevier Inc. All rights reserved.

  6. Linear and volumetric dimensional changes of injection-molded PMMA denture base resins.

    PubMed

    El Bahra, Shadi; Ludwig, Klaus; Samran, Abdulaziz; Freitag-Wolf, Sandra; Kern, Matthias

    2013-11-01

    The aim of this study was to evaluate the linear and volumetric dimensional changes of six denture base resins processed by their corresponding injection-molding systems at 3 time intervals of water storage. Two heat-curing (SR Ivocap Hi Impact and Lucitone 199) and four auto-curing (IvoBase Hybrid, IvoBase Hi Impact, PalaXpress, and Futura Gen) acrylic resins were used with their specific injection-molding technique to fabricate 6 specimens of each material. Linear and volumetric dimensional changes were determined by means of a digital caliper and an electronic hydrostatic balance, respectively, after water storage of 1, 30, or 90 days. Means and standard deviations of linear and volumetric dimensional changes were calculated in percentage (%). Statistical analysis was done using Student's and Welch's t tests with Bonferroni-Holm correction for multiple comparisons (α=0.05). Statistically significant differences in linear dimensional changes between resins were demonstrated at all three time intervals of water immersion (p≤0.05), with exception of the following comparisons which showed no significant difference: IvoBase Hi Impact/SR Ivocap Hi Impact and PalaXpress/Lucitone 199 after 1 day, Futura Gen/PalaXpress and PalaXpress/Lucitone 199 after 30 days, and IvoBase Hybrid/IvoBase Hi Impact after 90 days. Also, statistically significant differences in volumetric dimensional changes between resins were found at all three time intervals of water immersion (p≤0.05), with exception of the comparison between PalaXpress and Futura Gen. Denture base resins (IvoBase Hybrid and IvoBase Hi Impact) processed by the new injection-molding system (IvoBase), revealed superior dimensional precision. Copyright © 2013 Academy of Dental Materials. Published by Elsevier Ltd. All rights reserved.

  7. Evaluation of force-velocity and power-velocity relationship of arm muscles.

    PubMed

    Sreckovic, Sreten; Cuk, Ivan; Djuric, Sasa; Nedeljkovic, Aleksandar; Mirkov, Dragan; Jaric, Slobodan

    2015-08-01

    A number of recent studies have revealed an approximately linear force-velocity (F-V) and, consequently, a parabolic power-velocity (P-V) relationship in multi-joint tasks. However, the measurement characteristics of their parameters have been neglected, particularly those regarding arm muscles, which could be a problem for using the linear F-V model in both research and routine testing. Therefore, the aims of the present study were to evaluate the strength, shape, reliability, and concurrent validity of the F-V relationship of arm muscles. Twelve healthy participants performed maximum bench press throws against loads ranging from 20 to 70% of their maximum strength, and a linear regression model was applied to the obtained range of F and V data. One-repetition maximum bench press and medicine ball throw tests were also conducted. The observed individual F-V relationships were exceptionally strong (r = 0.96-0.99; all P < 0.05) and fairly linear, although it remains unresolved whether a polynomial fit could provide even stronger relationships. The reliability of parameters obtained from the linear F-V regressions proved to be mainly high (ICC > 0.80), while their concurrent validity regarding directly measured F, P, and V ranged from high (for maximum F) to medium-to-low (for maximum P and V). The findings add to the evidence that the linear F-V and, consequently, parabolic P-V models could be used to study the mechanical properties of muscular systems, as well as to design a relatively simple, reliable, and ecologically valid routine test of the muscle ability of force, power, and velocity production.
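
    Under the linear model F(V) = F0 (1 - V/V0), power P = F V is parabolic with apex Pmax = F0 V0 / 4 at V = V0/2, so the key parameters fall out of a straight-line fit. The sketch below uses invented load-velocity numbers purely for illustration:

      import numpy as np

      # hypothetical peak force (N) and velocity (m/s) from bench press throws
      F = np.array([200.0, 320.0, 440.0, 560.0, 680.0])
      V = np.array([2.10, 1.65, 1.25, 0.85, 0.45])

      slope, F0 = np.polyfit(V, F, 1)   # linear F-V model: F = F0 + slope*V (slope < 0)
      V0 = -F0 / slope                  # velocity-axis intercept
      Pmax = F0 * V0 / 4.0              # apex of the parabolic P-V relationship
      print(f"F0 = {F0:.0f} N, V0 = {V0:.2f} m/s, Pmax = {Pmax:.0f} W")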

  8. Robust learning for optimal treatment decision with NP-dimensionality

    PubMed Central

    Shi, Chengchun; Song, Rui; Lu, Wenbin

    2016-01-01

    In order to identify important variables involved in making optimal treatment decisions, Lu, Zhang and Zeng (2013) proposed a penalized least-squares regression framework for a fixed number of predictors, which is robust against misspecification of the conditional mean model. Two problems arise: (i) in a world of explosively big data, effective methods are needed to handle ultra-high dimensional data sets, for example, where the dimension of the predictors is of non-polynomial (NP) order in the sample size; (ii) both the propensity score and conditional mean models need to be estimated from data under NP dimensionality. In this paper, we propose a robust procedure for estimating the optimal treatment regime under NP dimensionality. In both steps, penalized regressions are employed with a non-concave penalty function, where the conditional mean model of the response given the predictors may be misspecified. The asymptotic properties, such as weak oracle properties, selection consistency and oracle distributions, of the proposed estimators are investigated. In addition, we study the limiting distribution of the estimated value function for the obtained optimal treatment regime. The empirical performance of the proposed estimation method is evaluated by simulations and an application to a depression dataset from the STAR*D study. PMID:28781717

  9. Aspects of porosity prediction using multivariate linear regression

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Byrnes, A.P.; Wilson, M.D.

    1991-03-01

    Highly accurate multiple linear regression models have been developed for sandstones of diverse compositions. Porosity reduction or enhancement processes are controlled by the fundamental variables pressure (P), temperature (T), time (t), and composition (X), where composition includes mineralogy, size, sorting, fluid composition, etc. The multiple linear regression equation, of which all linear porosity prediction models are subsets, takes the generalized form: Porosity = C0 + C1(P) + C2(T) + C3(X) + C4(t) + C5(PT) + C6(PX) + C7(Pt) + C8(TX) + C9(Tt) + C10(Xt) + C11(PTX) + C12(PXt) + C13(PTt) + C14(TXt) + C15(PTXt). The first four primary variables are often interactive, thus requiring terms involving two or more primary variables (the form shown implies interaction and not necessarily multiplication). The final terms used may also involve simple mathematical transforms such as log X, e^T, X^2, or more complex transformations such as the Time-Temperature Index (TTI). The X term in the equation above represents a suite of compositional variables, and therefore a fully expanded equation may include a series of terms incorporating these variables. Numerous published bivariate porosity prediction models involving P (or depth) or Tt (TTI) are effective to a degree, largely because of the high degree of colinearity between P and TTI. However, all such bivariate models ignore the unique contributions of P and Tt, as well as various X terms. These simpler models become poor predictors in regions where colinear relations change, where important variables have been ignored, or where the database does not include a sufficient range or weight distribution for the critical variables.
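
    Since the generalized form is an ordinary multiple linear regression on main effects plus interaction products, it can be sketched directly; the data, scaling, and coefficients below are illustrative, not a calibrated porosity model.

      import numpy as np
      from sklearn.preprocessing import PolynomialFeatures
      from sklearn.linear_model import LinearRegression
      from sklearn.pipeline import make_pipeline

      rng = np.random.default_rng(5)
      PTXt = rng.uniform(size=(200, 4))   # columns: P, T, X, t (already scaled)
      porosity = 30 - 8 * PTXt[:, 0] - 5 * PTXt[:, 1] * PTXt[:, 3] + rng.normal(scale=1.0, size=200)

      # intercept plus 4 main effects and all 11 interaction products: the 16 terms C0..C15
      model = make_pipeline(
          PolynomialFeatures(degree=4, interaction_only=True, include_bias=False),
          LinearRegression())
      model.fit(PTXt, porosity)
      print(model.score(PTXt, porosity))  # in-sample R^2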

  10. An automated ranking platform for machine learning regression models for meat spoilage prediction using multi-spectral imaging and metabolic profiling.

    PubMed

    Estelles-Lopez, Lucia; Ropodi, Athina; Pavlidis, Dimitris; Fotopoulou, Jenny; Gkousari, Christina; Peyrodie, Audrey; Panagou, Efstathios; Nychas, George-John; Mohareb, Fady

    2017-09-01

    Over the past decade, analytical approaches based on vibrational spectroscopy, hyperspectral/multispectral imaging and biomimetic sensors have gained popularity as rapid and efficient methods for assessing food quality, safety and authentication, and as a sensible alternative to expensive and time-consuming conventional microbiological techniques. Due to the multi-dimensional nature of the data generated from such analyses, the output needs to be coupled with a suitable statistical approach or machine-learning algorithm before the results can be interpreted. Choosing the optimum pattern recognition or machine learning approach for a given analytical platform is often challenging and involves a comparative analysis between various algorithms in order to achieve the best possible prediction accuracy. In this work, "MeatReg", a web-based application, is presented, able to automate the procedure of identifying the best machine learning method for comparing data from several analytical techniques, to predict the counts of microorganisms responsible for meat spoilage regardless of the packaging system applied. In particular, up to 7 regression methods were applied: ordinary least squares regression, stepwise linear regression, partial least squares regression, principal component regression, support vector regression, random forest and k-nearest neighbours. "MeatReg" was tested with minced beef samples stored under aerobic and modified atmosphere packaging and analysed with electronic nose, HPLC, FT-IR, GC-MS and multispectral imaging instruments. Populations of total viable counts, lactic acid bacteria, pseudomonads, Enterobacteriaceae and B. thermosphacta were predicted. As a result, recommendations were obtained of which analytical platforms are suitable to predict each type of bacteria and which machine learning methods to use in each case. The developed system is accessible via the link: www.sorfml.com. Copyright © 2017 Elsevier Ltd. All rights reserved.
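
    The core of such a ranking platform is cross-validated scoring of several regressors on the same feature matrix. The sketch below uses random stand-in data and five of the seven model families named above (stepwise selection and the paper's evaluation protocol are omitted); it is not the MeatReg implementation.

      import numpy as np
      from sklearn.model_selection import cross_val_score
      from sklearn.linear_model import LinearRegression
      from sklearn.cross_decomposition import PLSRegression
      from sklearn.svm import SVR
      from sklearn.ensemble import RandomForestRegressor
      from sklearn.neighbors import KNeighborsRegressor

      rng = np.random.default_rng(6)
      X = rng.normal(size=(120, 50))   # stand-in for spectral/imaging features
      y = X[:, :3] @ np.array([1.0, -0.5, 0.8]) + rng.normal(scale=0.3, size=120)  # log counts

      models = {"OLS": LinearRegression(),
                "PLS": PLSRegression(n_components=5),
                "SVR": SVR(),
                "RF": RandomForestRegressor(n_estimators=200, random_state=0),
                "kNN": KNeighborsRegressor()}
      scores = {name: -cross_val_score(m, X, y, cv=5,
                                       scoring="neg_root_mean_squared_error").mean()
                for name, m in models.items()}
      for name, rmse in sorted(scores.items(), key=lambda kv: kv[1]):
          print(f"{name}: CV RMSE = {rmse:.3f}")   # lower is better; ranked output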

  11. Highly cytocompatible and flexible three-dimensional graphene/polydimethylsiloxane composite for culture and electrochemical detection of L929 fibroblast cells.

    PubMed

    Waiwijit, Uraiwan; Maturos, Thitima; Pakapongpan, Saithip; Phokharatkul, Ditsayut; Wisitsoraat, Anurat; Tuantranont, Adisorn

    2016-08-01

    Recently, three-dimensional graphene interconnected network has attracted great interest as a scaffold structure for tissue engineering due to its high biocompatibility, high electrical conductivity, high specific surface area and high porosity. However, free-standing three-dimensional graphene exhibits poor flexibility and stability due to ease of disintegration during processing. In this work, three-dimensional graphene is composited with polydimethylsiloxane to improve the structural flexibility and stability by a new simple two-step process comprising dip coating of polydimethylsiloxane on chemical vapor deposited graphene/Ni foam and wet etching of nickel foam. Structural characterizations confirmed an interconnected three-dimensional multi-layer graphene structure with thin polydimethylsiloxane scaffold. The composite was employed as a substrate for culture of L929 fibroblast cells and its cytocompatibility was evaluated by cell viability (Alamar blue assay), reactive oxygen species production and vinculin immunofluorescence imaging. The result revealed that cell viability on three-dimensional graphene/polydimethylsiloxane composite increased with increasing culture time and was slightly different from a polystyrene substrate (control). Moreover, cells cultured on three-dimensional graphene/polydimethylsiloxane composite generated less ROS than the control at culture times of 3-6 h. The results of immunofluorescence staining demonstrated that fibroblast cells expressed adhesion protein (vinculin) and adhered well on three-dimensional graphene/polydimethylsiloxane surface. Good cell adhesion could be attributed to suitable surface properties of three-dimensional graphene/polydimethylsiloxane with moderate contact angle and small negative zeta potential in culture solution. The results of electrochemical study by cyclic voltammetry showed that an oxidation current signal with no apparent peak was induced by fibroblast cells and the oxidation current at an oxidation potential of +0.9 V increased linearly with increasing cell number. Therefore, the three-dimensional graphene/polydimethylsiloxane composite exhibits high cytocompatibility and can potentially be used as a conductive substrate for cell-based electrochemical sensing. © The Author(s) 2016.

  12. Proarrhythmia risk prediction using human induced pluripotent stem cell-derived cardiomyocytes.

    PubMed

    Yamazaki, Daiju; Kitaguchi, Takashi; Ishimura, Masakazu; Taniguchi, Tomohiko; Yamanishi, Atsuhiro; Saji, Daisuke; Takahashi, Etsushi; Oguchi, Masao; Moriyama, Yuta; Maeda, Sanae; Miyamoto, Kaori; Morimura, Kaoru; Ohnaka, Hiroki; Tashibu, Hiroyuki; Sekino, Yuko; Miyamoto, Norimasa; Kanda, Yasunari

    2018-04-01

    Human induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CMs) are expected to become a useful tool for proarrhythmia risk prediction in the non-clinical drug development phase. Several features, including electrophysiological properties, ion channel expression profiles and drug responses, have been investigated using commercially available hiPSC-CMs, such as iCell-CMs and Cor.4U-CMs. Although drug-induced arrhythmia has been extensively examined by microelectrode array (MEA) assays in iCell-CMs, the suitability of Cor.4U-CMs for proarrhythmia risk assessment has not been fully established. Here, we evaluated the predictivity of proarrhythmia risk using Cor.4U-CMs. MEA assays revealed a linear relationship between the inter-spike interval and the field potential duration (FPD). The hERG inhibitor E-4031 induced reverse use-dependent FPD prolongation. We next evaluated proarrhythmia risk prediction using a two-dimensional map, which we have previously proposed. We determined the relative torsade de pointes risk score, based on the extent of change in FPD with Fridericia's correction (FPDcF) and the occurrence of early afterdepolarizations, and calculated the margins normalized to free effective therapeutic plasma concentrations. The drugs were classified into three risk groups using the two-dimensional map. This risk-categorization system showed high concordance with the torsadogenic information obtained from the public database CredibleMeds. Taken together, these results indicate that Cor.4U-CMs can be used for drug-induced proarrhythmia risk prediction. Copyright © 2018 The Authors. Production and hosting by Elsevier B.V. All rights reserved.

  13. MUSTA fluxes for systems of conservation laws

    NASA Astrophysics Data System (ADS)

    Toro, E. F.; Titarev, V. A.

    2006-08-01

    This paper is about numerical fluxes for hyperbolic systems and we first present a numerical flux, called GFORCE, that is a weighted average of the Lax-Friedrichs and Lax-Wendroff fluxes. For the linear advection equation with constant coefficient, the new flux reduces identically to that of the Godunov first-order upwind method. Then we incorporate GFORCE in the framework of the MUSTA approach [E.F. Toro, Multi-Stage Predictor-Corrector Fluxes for Hyperbolic Equations. Technical Report NI03037-NPA, Isaac Newton Institute for Mathematical Sciences, University of Cambridge, UK, 17th June, 2003], resulting in a version that we call GMUSTA. For non-linear systems this gives results that are comparable to those of the Godunov method in conjunction with the exact Riemann solver or complete approximate Riemann solvers, noting however that in our approach, the solution of the Riemann problem in the conventional sense is avoided. Both the GFORCE and GMUSTA fluxes are extended to multi-dimensional non-linear systems in a straightforward unsplit manner, resulting in linearly stable schemes that have the same stability regions as the straightforward multi-dimensional extension of Godunov's method. The methods are applicable to general meshes. The schemes of this paper share with the family of centred methods the common properties of being simple and applicable to a large class of hyperbolic systems, but the schemes of this paper are distinctly more accurate. Finally, we proceed to the practical implementation of our numerical fluxes in the framework of high-order finite volume WENO methods for multi-dimensional non-linear hyperbolic systems. Numerical results are presented for the Euler equations and for the equations of magnetohydrodynamics.
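
    For linear advection the GFORCE construction fits in a few lines. The sketch below assumes the weight omega = 1/(1 + c), with c the Courant number, which is the choice that makes the flux reduce to the Godunov upwind flux in this linear case; it is schematic and omits the multi-stage MUSTA driver.

      import numpy as np

      def gforce_flux(uL, uR, a, dt, dx):
          """GFORCE intercell flux for u_t + a*u_x = 0: a weighted average of the
          Lax-Friedrichs and Lax-Wendroff fluxes with omega = 1/(1 + c)."""
          f = lambda u: a * u
          c = abs(a) * dt / dx                                         # Courant number
          flf = 0.5 * (f(uL) + f(uR)) - 0.5 * (dx / dt) * (uR - uL)    # Lax-Friedrichs
          ulw = 0.5 * (uL + uR) - 0.5 * (dt / dx) * (f(uR) - f(uL))    # Lax-Wendroff state
          omega = 1.0 / (1.0 + c)
          return omega * f(ulw) + (1.0 - omega) * flf

      # sanity check: for a > 0 the result equals the upwind flux a*uL
      print(gforce_flux(1.0, 0.2, a=1.0, dt=0.5, dx=1.0))   # prints 1.0, i.e. a*uL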

  14. Knowledge Driven Image Mining with Mixture Density Mercer Kernels

    NASA Technical Reports Server (NTRS)

    Srivastava, Ashok N.; Oza, Nikunj

    2004-01-01

    This paper presents a new methodology for automatic knowledge driven image mining based on the theory of Mercer Kernels, which are highly nonlinear symmetric positive definite mappings from the original image space to a very high, possibly infinite dimensional feature space. In that high dimensional feature space, linear clustering, prediction, and classification algorithms can be applied and the results can be mapped back down to the original image space. Thus, highly nonlinear structure in the image can be recovered through the use of well-known linear mathematics in the feature space. This process has a number of advantages over traditional methods in that it allows for nonlinear interactions to be modelled with only a marginal increase in computational costs. In this paper, we present the theory of Mercer Kernels, describe its use in image mining, discuss a new method to generate Mercer Kernels directly from data, and compare the results with existing algorithms on data from the MODIS (Moderate Resolution Imaging Spectroradiometer) instrument taken over the Arctic region. We also discuss the potential application of these methods on the Intelligent Archive, a NASA initiative for developing a tagged image data warehouse for the Earth Sciences.
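
    One standard way to obtain a Mercer kernel from a mixture density is to use posterior cluster memberships as features, k(x, y) = sum_k P(k|x) P(k|y), which yields a symmetric positive semi-definite Gram matrix by construction. The sketch below fits a Gaussian mixture to random stand-in data; it is a schematic of this kernel family, not the authors' exact construction.

      import numpy as np
      from sklearn.mixture import GaussianMixture

      rng = np.random.default_rng(7)
      X = rng.normal(size=(300, 8))          # stand-in for extracted image features

      gmm = GaussianMixture(n_components=5, random_state=0).fit(X)
      R = gmm.predict_proba(X)               # posterior responsibilities P(k|x)
      K = R @ R.T                            # Gram matrix: k(x, y) = sum_k P(k|x) P(k|y)

      # K = R R^T is positive semi-definite, hence a valid Mercer kernel; it can
      # feed any kernel method (kernel PCA, spectral clustering, precomputed SVMs)
      print(K.shape, np.allclose(K, K.T))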

  15. Knowledge Driven Image Mining with Mixture Density Mercer Kernels

    NASA Technical Reports Server (NTRS)

    Srivastava, Ashok N.; Oza, Nikunj

    2004-01-01

    This paper presents a new methodology for automatic knowledge driven image mining based on the theory of Mercer Kernels, which are highly nonlinear symmetric positive definite mappings from the original image space to a very high, possibly infinite dimensional feature space. In that high dimensional feature space, linear clustering, prediction, and classification algorithms can be applied and the results can be mapped back down to the original image space. Thus, highly nonlinear structure in the image can be recovered through the use of well-known linear mathematics in the feature space. This process has a number of advantages over traditional methods in that it allows for nonlinear interactions to be modelled with only a marginal increase in computational costs. In this paper, we present the theory of Mercer Kernels, describe its use in image mining, discuss a new method to generate Mercer Kernels directly from data, and compare the results with existing algorithms on data from the MODIS (Moderate Resolution Imaging Spectroradiometer) instrument taken over the Arctic region. We also discuss the potential application of these methods on the Intelligent Archive, a NASA initiative for developing a tagged image data warehouse for the Earth Sciences.

  16. Participant characteristics and intervention processes associated with reductions in television viewing in the High Five for Kids study

    PubMed Central

    Cespedes, Elizabeth M.; Horan, Christine M.; Gillman, Matthew W.; Gortmaker, Steven L.; Price, Sarah; Rifas-Shiman, Sheryl L.; Mitchell, Kathleen; Taveras, Elsie M.

    2014-01-01

    Objective: To evaluate the High Five for Kids intervention effect on television (TV) within subgroups, examine participant characteristics associated with process measures and assess perceived helpfulness of TV intervention components. Method: High Five (RCT of 445 overweight/obese 2–7 year-olds in Massachusetts [2006–2008]) reduced TV by 0.36 hours/day. 1-year effects on TV, stratified by subgroup, were assessed using linear regression. Among intervention participants (n=253), associations of intervention component helpfulness with TV reduction were examined using linear regression and associations of participant characteristics with processes linked to TV reduction (choosing TV and completing intervention visits) were examined using logistic regression. Results: High Five reduced TV across subgroups. Parents of Latino (v. white) children had lower odds of completing >=2 study visits (OR 0.39 [95%CI: 0.18, 0.84]). Parents of black (v. white) children had higher odds of choosing TV (OR: 2.23 [95% CI: 1.08, 4.59]), as did parents of obese (v. overweight) children and children watching >=2 hours/day (v. <2) at baseline. Greater perceived helpfulness was associated with greater TV reduction. Conclusion: Clinic-based motivational interviewing reduces TV in children. Low cost education approaches (e.g., printed materials) may be well-received. Parents of children at higher obesity risk could be more motivated to reduce TV. PMID:24518002

  17. The three-dimensional evolution of a plane mixing layer. Part 2: Pairing and transition to turbulence

    NASA Technical Reports Server (NTRS)

    Moser, Robert D.; Rogers, Michael M.

    1992-01-01

    The evolution of three-dimensional temporally evolving plane mixing layers through as many as three pairings was simulated numerically. Initial conditions for all simulations consisted of a few low-wavenumber disturbances, usually derived from linear stability theory, in addition to the mean velocity. Three-dimensional perturbations were used with amplitudes ranging from infinitesimal to large enough to trigger a rapid transition to turbulence. Pairing is found both to inhibit the growth of infinitesimal three-dimensional disturbances and to trigger the transition to turbulence in highly three-dimensional flows. The mechanisms responsible for the growth of three-dimensionality as well as the initial phases of the transition to turbulence are described. The transition to turbulence is accompanied by the formation of thin sheets of spanwise vorticity, which undergo a secondary roll-up. Transition also produces an increase in the degree of scalar mixing, in agreement with experimental observations of the mixing transition. Simulations were also conducted to investigate changes in spanwise length scale that may occur in response to the change in streamwise length scale during a pairing. The linear mechanism for this process was found to be very slow, requiring roughly three pairings to complete a doubling of the spanwise scale. Stronger three-dimensionality can produce more rapid scale changes but is also likely to trigger transition to turbulence. No evidence was found for a change from an organized array of rib vortices at one spanwise scale to a similar array at a larger spanwise scale.

  18. Vestibular coriolis effect differences modeled with three-dimensional linear-angular interactions.

    PubMed

    Holly, Jan E

    2004-01-01

    The vestibular coriolis (or "cross-coupling") effect is traditionally explained by cross-coupled angular vectors, which, however, do not explain the differences in perceptual disturbance under different acceleration conditions. For example, during head roll tilt in a rotating chair, the magnitude of perceptual disturbance is affected by a number of factors, including acceleration or deceleration of the chair rotation or a zero-g environment. Therefore, it has been suggested that linear-angular interactions play a role. The present research investigated whether these perceptual differences and others involving linear coriolis accelerations could be explained under one common framework: the laws of motion in three dimensions, which include all linear-angular interactions among all six components of motion (three angular and three linear). The results show that the three-dimensional laws of motion predict the differences in perceptual disturbance. No special properties of the vestibular system or nervous system are required. In addition, simulations were performed with angular, linear, and tilt time constants inserted into the model, giving the same predictions. Three-dimensional graphics were used to highlight the manner in which linear-angular interaction causes perceptual disturbance, and a crucial component is the Stretch Factor, which measures the "unexpected" linear component.

  19. Effect of Malmquist bias on correlation studies with IRAS data base

    NASA Technical Reports Server (NTRS)

    Verter, Frances

    1993-01-01

    The relationships between galaxy properties in the sample of Trinchieri et al. (1989) are reexamined with corrections for Malmquist bias. The linear correlations are tested and linear regressions are fit for log-log plots of L(FIR), L(H-alpha), and L(B), as well as ratios of these quantities. The linear correlations are corrected for Malmquist bias using the method of Verter (1988), in which each galaxy observation is weighted by the inverse of its sampling volume. The linear regressions are corrected for Malmquist bias by a new method invented here, in which each galaxy observation is weighted by its sampling volume. The results of the correlations and regressions among the sample change significantly in the anticipated sense: the corrected correlation confidences are lower and the corrected slopes of the linear regressions are lower. The elimination of Malmquist bias eliminates the nonlinear rise in luminosity that has caused some authors to hypothesize additional components of FIR emission.
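
    A minimal sketch of the weighting idea (all variable names and data are invented): each observation enters correlation-type statistics with weight 1/V, following Verter (1988), while the regression correction described above weights by the sampling volume V instead.

      import numpy as np

      def weighted_linefit(x, y, w):
          """Weighted least-squares line through (x, y) with weights w."""
          W = w / w.sum()
          xm, ym = W @ x, W @ y
          slope = (W @ ((x - xm) * (y - ym))) / (W @ ((x - xm) ** 2))
          return slope, ym - slope * xm

      # log luminosities and each galaxy's maximum sampling volume (set by the
      # survey flux limit); the values here are purely illustrative
      rng = np.random.default_rng(8)
      logL_FIR = rng.normal(10.0, 0.5, size=50)
      logL_B = 0.8 * logL_FIR + rng.normal(scale=0.3, size=50)
      V = 10 ** rng.uniform(2, 5, size=50)

      print(weighted_linefit(logL_FIR, logL_B, w=V))        # regression correction
      print(weighted_linefit(logL_FIR, logL_B, w=1.0 / V))  # correlation-style weights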

  20. Compressive Sensing with Cross-Validation and Stop-Sampling for Sparse Polynomial Chaos Expansions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Huan, Xun; Safta, Cosmin; Sargsyan, Khachik

    Compressive sensing is a powerful technique for recovering sparse solutions of underdetermined linear systems, which are often encountered in uncertainty quantification analysis of expensive and high-dimensional physical models. We perform numerical investigations employing several compressive sensing solvers that target the unconstrained LASSO formulation, with a focus on linear systems that arise in the construction of polynomial chaos expansions. With core solvers of l1_ls, SpaRSA, CGIST, FPC_AS, and ADMM, we develop techniques to mitigate overfitting through an automated selection of the regularization constant based on cross-validation, and a heuristic strategy to guide the stop-sampling decision. Practical recommendations on parameter settings for these techniques are provided and discussed. The overall method is applied to a series of numerical examples of increasing complexity, including large eddy simulations of supersonic turbulent jet-in-crossflow involving a 24-dimensional input. Through empirical phase-transition diagrams and convergence plots, we illustrate sparse recovery performance under structures induced by polynomial chaos, accuracy and computational tradeoffs between polynomial bases of different degrees, and practicability of conducting compressive sensing for a realistic, high-dimensional physical application. Across the test cases studied in this paper, we find ADMM to have demonstrated empirical advantages through consistently lower errors and faster computational times.
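
    The cross-validated LASSO step can be sketched with scikit-learn standing in for the solvers named above; the Legendre basis, sample counts, and test function below are illustrative assumptions, and the stop-sampling heuristic is omitted.

      import numpy as np
      from numpy.polynomial import legendre
      from sklearn.linear_model import LassoCV

      rng = np.random.default_rng(9)
      xi = rng.uniform(-1, 1, size=(15, 3))    # 3 uncertain inputs, only 15 model runs
      y = 1 + 2 * xi[:, 0] + 0.5 * xi[:, 1] * xi[:, 2] + 0.01 * rng.normal(size=15)

      # tensorized Legendre basis of total degree 3: 20 columns > 15 rows (underdetermined)
      degs = [(i, j, k) for i in range(4) for j in range(4) for k in range(4) if i + j + k <= 3]
      Phi = np.column_stack([legendre.legval(xi[:, 0], np.eye(4)[i])
                             * legendre.legval(xi[:, 1], np.eye(4)[j])
                             * legendre.legval(xi[:, 2], np.eye(4)[k])
                             for i, j, k in degs])

      # automated selection of the regularization constant by cross-validation
      fit = LassoCV(cv=3, fit_intercept=False).fit(Phi, y)
      print(f"nonzero PCE coefficients: {np.count_nonzero(fit.coef_)} of {len(degs)}")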

  1. Femoral anteversion assessment: Comparison of physical examination, gait analysis, and EOS biplanar radiography.

    PubMed

    Westberry, David E; Wack, Linda I; Davis, Roy B; Hardin, James W

    2018-05-01

    Multiple measurement methods are available to assess transverse plane alignment of the lower extremity. This study was performed to determine the extent of correlation between femoral anteversion assessment using simultaneous biplanar radiographs and three-dimensional modeling (EOS imaging), clinical hip rotation by physical examination, and dynamic hip rotation assessed by gait analysis. Seventy-seven patients with cerebral palsy (GMFCS Level I and II) and 33 neurologically typical children with torsional abnormalities completed a comprehensive gait analysis with same day biplanar anterior-posterior and lateral radiographs and three-dimensional transverse plane assessment of femoral anteversion. Correlations were determined between physical exam of hip rotation, EOS imaging of femoral anteversion, and transverse plane hip kinematics for this retrospective review study. Linear regression analysis revealed a weak relationship between physical examination measures of hip rotation and biplanar radiographic assessment of femoral anteversion. Similarly, poor correlation was found between clinical evaluation of femoral anteversion and motion assessment of dynamic hip rotation. Correlations were better in neurologically typical children with torsional abnormalities compared to children with gait dysfunction secondary to cerebral palsy. Dynamic hip rotation cannot be predicted by physical examination measures of hip range of motion or from three-dimensional assessment of femoral anteversion derived from biplanar radiographs. Copyright © 2018 Elsevier B.V. All rights reserved.

  2. Association between surgeon volume and hospitalisation costs for patients with oral cancer: a nationwide population base study in Taiwan.

    PubMed

    Lee, C-C; Ho, H-C; Jack, Lee C-C; Su, Y-C; Lee, M-S; Hung, S-K; Chou, Pesus

    2010-02-01

    Oral cancer leads to considerable use of and expenditure on health care. Wide resection of the tumour and reconstruction with a pedicle flap/free flap is widely used. This population-based study was conducted to explore the relationship between hospitalisation costs and surgeon case volume when this operation was performed. The study uses data for the years 2005-2006 obtained from the National Health Insurance Research Database published by the Taiwanese National Health Research Institute. From these population-based data, the authors selected a total of 2663 oral cancer patients who underwent tumour resection and reconstruction. Low-, medium-, high-, and very-high-volume surgeons were defined by increasing thresholds on the number of resections with reconstruction performed, with very-high-volume surgeons performing >= 56 such procedures. Hierarchical linear regression analysis was subsequently performed to explore the relationship between surgeon case volume and the cost and length of hospitalisation. The mean hospitalisation cost among the 2663 patients was US$ 9528 (all costs are given in US dollars). After adjusting for physician, hospital, and patient characteristics in a hierarchical linear regression model, the cost per patient for low-volume surgeons was found to be US$ 741 (P = 0.012) higher than that for medium-volume surgeons, US$ 1546 (P < 0.001) higher than that for high-volume surgeons, and US$ 1820 (P < 0.001) higher than that for very-high-volume surgeons. After the same adjustments, the hierarchical linear regression model revealed that the mean length of stay per patient was highest for low-volume surgeons (P < 0.001). Thus, low-volume surgeons performing wide excision with reconstructive surgery in oral cancer patients incurred significantly higher costs and longer hospital stays per patient than did other surgeons. Treatment strategies adopted by high- and very-high-volume surgeons should be analysed further and utilised more widely.

  3. Neural Connectivity Evidence for a Categorical-Dimensional Hybrid Model of Autism Spectrum Disorder.

    PubMed

    Elton, Amanda; Di Martino, Adriana; Hazlett, Heather Cody; Gao, Wei

    2016-07-15

    Autism spectrum disorder (ASD) encompasses a complex manifestation of symptoms that include deficits in social interaction and repetitive or stereotyped interests and behaviors. In keeping with the increasing recognition of the dimensional characteristics of ASD symptoms and the categorical nature of a diagnosis, we sought to delineate the neural mechanisms of ASD symptoms based on the functional connectivity of four known neural networks (i.e., default mode network, dorsal attention network, salience network, and executive control network). We leveraged an open data resource (Autism Brain Imaging Data Exchange) providing resting-state functional magnetic resonance imaging data sets from 90 boys with ASD and 95 typically developing boys. This data set also included the Social Responsiveness Scale as a dimensional measure of ASD traits. Seed-based functional connectivity was paired with linear regression to identify functional connectivity abnormalities associated with categorical effects of ASD diagnosis, dimensional effects of ASD-like behaviors, and their interaction. Our results revealed the existence of dimensional mechanisms of ASD uniquely affecting each network based on the presence of connectivity-behavioral relationships; these were independent of diagnostic category. However, we also found evidence of categorical differences (i.e., diagnostic group differences) in connectivity strength for each network as well as categorical differences in connectivity-behavioral relationships (i.e., diagnosis-by-behavior interactions), supporting the coexistence of categorical mechanisms of ASD. Our findings support a hybrid model for ASD characterization that includes a combination of categorical and dimensional brain mechanisms and provide a novel understanding of the neural underpinnings of ASD. Copyright © 2016 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.

  4. A primer for biomedical scientists on how to execute model II linear regression analysis.

    PubMed

    Ludbrook, John

    2012-04-01

    1. There are two very different ways of executing linear regression analysis. One is Model I, when the x-values are fixed by the experimenter. The other is Model II, in which the x-values are free to vary and are subject to error. 2. I have received numerous complaints from biomedical scientists that they have great difficulty in executing Model II linear regression analysis. This may explain the results of a Google Scholar search, which showed that the authors of articles in journals of physiology, pharmacology and biochemistry rarely use Model II regression analysis. 3. I repeat my previous arguments in favour of using least products linear regression analysis for Model II regressions. I review three methods for executing ordinary least products (OLP) and weighted least products (WLP) regression analysis: (i) scientific calculator and/or computer spreadsheet; (ii) specific purpose computer programs; and (iii) general purpose computer programs. 4. Using a scientific calculator and/or computer spreadsheet, it is easy to obtain correct values for OLP slope and intercept, but the corresponding 95% confidence intervals (CI) are inaccurate. 5. Using specific purpose computer programs, the freeware computer program smatr gives the correct OLP regression coefficients and obtains 95% CI by bootstrapping. In addition, smatr can be used to compare the slopes of OLP lines. 6. When using general purpose computer programs, I recommend the commercial programs systat and Statistica for those who regularly undertake linear regression analysis and I give step-by-step instructions in the Supplementary Information as to how to use loss functions. © 2011 The Author. Clinical and Experimental Pharmacology and Physiology. © 2011 Blackwell Publishing Asia Pty Ltd.
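
    For the unweighted case, OLP regression has a simple closed form: the slope is sign(r) * SD(y)/SD(x) and the line passes through the two means. A minimal sketch on simulated data (the bootstrap confidence intervals that smatr provides are omitted):

      import numpy as np

      def olp_regression(x, y):
          """Ordinary least products (geometric mean, Model II) regression."""
          r = np.corrcoef(x, y)[0, 1]
          slope = np.sign(r) * np.std(y, ddof=1) / np.std(x, ddof=1)
          intercept = np.mean(y) - slope * np.mean(x)
          return slope, intercept

      rng = np.random.default_rng(10)
      t = rng.normal(size=100)                  # true underlying values
      x = t + rng.normal(scale=0.5, size=100)   # both variables measured with error
      y = 2.0 * t + rng.normal(scale=0.5, size=100)
      print(olp_regression(x, y))   # OLS attenuates the slope (~1.6); OLP lands nearer 2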

  5. Comparison of dimensionality reduction methods to predict genomic breeding values for carcass traits in pigs.

    PubMed

    Azevedo, C F; Nascimento, M; Silva, F F; Resende, M D V; Lopes, P S; Guimarães, S E F; Glória, L S

    2015-10-09

    A significant contribution of molecular genetics is the direct use of DNA information to identify genetically superior individuals, and genome-wide selection (GWS) can be used for this purpose. GWS consists of analyzing a large number of single nucleotide polymorphism markers widely distributed across the genome; however, because the number of markers is much larger than the number of genotyped individuals, and such markers are highly correlated, special statistical methods are required. Among these methods, independent component regression, principal component regression, partial least squares, and partial principal components stand out. Thus, the aim of this study was to propose an application of these dimensionality reduction methods to GWS for carcass traits in an F2 (Piau x commercial line) pig population. The results show similarities between the principal and independent component methods, which provided the most accurate genomic breeding value estimates for most carcass traits in pigs.
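
    Two of the four methods named above, principal component regression and partial least squares, can be sketched directly with scikit-learn; the simulated SNP matrix, trait model, and component counts below are assumptions for illustration, not the paper's F2 data.

      import numpy as np
      from sklearn.cross_decomposition import PLSRegression
      from sklearn.decomposition import PCA
      from sklearn.linear_model import LinearRegression
      from sklearn.pipeline import make_pipeline
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(11)
      M = rng.binomial(2, 0.4, size=(300, 2000)).astype(float)   # SNP matrix, p >> n
      u = np.zeros(2000)
      u[rng.choice(2000, size=30, replace=False)] = rng.normal(size=30)
      y = M @ u + rng.normal(scale=2.0, size=300)                # simulated carcass trait

      pcr = make_pipeline(PCA(n_components=50), LinearRegression())
      pls = PLSRegression(n_components=10)
      for name, m in [("PCR", pcr), ("PLS", pls)]:
          r2 = cross_val_score(m, M, y, cv=5, scoring="r2").mean()
          print(f"{name}: mean CV R^2 = {r2:.3f}")   # a proxy for predictive ability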

  6. Two-dimensional mesh embedding for Galerkin B-spline methods

    NASA Technical Reports Server (NTRS)

    Shariff, Karim; Moser, Robert D.

    1995-01-01

    A number of advantages result from using B-splines as basis functions in a Galerkin method for solving partial differential equations. Among them are arbitrary order of accuracy and high resolution similar to that of compact schemes but without the aliasing error. This work develops another property, namely, the ability to treat semi-structured embedded or zonal meshes for two-dimensional geometries. This can drastically reduce the number of grid points in many applications. Both integer and non-integer refinement ratios are allowed. The report begins by developing an algorithm for choosing basis functions that yield the desired mesh resolution. These functions are suitable products of one-dimensional B-splines. Finally, test cases for linear scalar equations such as the Poisson and advection equation are presented. The scheme is conservative and has uniformly high order of accuracy throughout the domain.

  7. Using Remote Sensing Data to Evaluate Surface Soil Properties in Alabama Ultisols

    NASA Technical Reports Server (NTRS)

    Sullivan, Dana G.; Shaw, Joey N.; Rickman, Doug; Mask, Paul L.; Luvall, Jeff

    2005-01-01

    Evaluation of surface soil properties via remote sensing could facilitate soil survey mapping, erosion prediction and allocation of agrochemicals for precision management. The objective of this study was to evaluate the relationship between soil spectral signature and surface soil properties in conventionally managed row crop systems. High-resolution RS data were acquired over bare fields in the Coastal Plain, Appalachian Plateau, and Ridge and Valley provinces of Alabama using the Airborne Terrestrial Applications Sensor multispectral scanner. Soils ranged from sandy Kandiudults to fine textured Rhodudults. Surface soil samples (0-1 cm) were collected from 163 sampling points for soil organic carbon, particle size distribution, and citrate dithionite extractable iron content. Surface roughness, soil water content, and crusting were also measured during sampling. Two methods of analysis were evaluated: 1) multiple linear regression using common spectral band ratios, and 2) partial least squares regression. Our data show that thermal infrared spectra are highly linearly related to soil organic carbon, sand and clay content. Soil organic carbon content was the most difficult to quantify in these highly weathered systems, where soil organic carbon was generally less than 1.2%. Estimates of sand and clay content were best using partial least squares regression at the Valley site, explaining 42-59% of the variability. In the Coastal Plain, sandy surfaces prone to crusting limited estimates of sand and clay content via partial least squares and regression with common band ratios. Estimates of iron oxide content were a function of mineralogy and best accomplished using specific band ratios, with regression explaining 36-65% of the variability at the Valley and Coastal Plain sites, respectively.

  8. A Systematic Review and Meta-Regression Analysis of Lung Cancer Risk and Inorganic Arsenic in Drinking Water.

    PubMed

    Lamm, Steven H; Ferdosi, Hamid; Dissen, Elisabeth K; Li, Ji; Ahn, Jaeil

    2015-12-07

    High levels (> 200 µg/L) of inorganic arsenic in drinking water are known to be a cause of human lung cancer, but the evidence at lower levels is uncertain. We have sought the epidemiological studies that have examined the dose-response relationship between arsenic levels in drinking water and the risk of lung cancer over a range that includes both high and low levels of arsenic. Regression analysis, based on six studies identified from an electronic search, examined the relationship between the log of the relative risk and the log of the arsenic exposure over a range of 1-1000 µg/L. The best-fitting continuous meta-regression model was sought and found to be a no-constant linear-quadratic analysis where both the risk and the exposure had been logarithmically transformed. This yielded both a statistically significant positive coefficient for the quadratic term and a statistically significant negative coefficient for the linear term. Sub-analyses by study design yielded results that were similar for both ecological studies and non-ecological studies. Statistically significant X-intercepts consistently found no increased level of risk at approximately 100-150 µg/L arsenic.
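
    The no-constant linear-quadratic fit and its nontrivial X-intercept can be reproduced schematically; the dose-risk pairs below are invented for illustration and are not the study's data.

      import numpy as np

      dose = np.array([10.0, 50.0, 100.0, 300.0, 600.0, 1000.0])   # arsenic, ug/L
      rr = np.array([0.8, 0.9, 1.0, 1.6, 2.8, 5.0])                # hypothetical relative risks

      L = np.log(dose)
      X = np.column_stack([L, L ** 2])              # no constant term, as in the text
      b1, b2 = np.linalg.lstsq(X, np.log(rr), rcond=None)[0]

      # log(RR) = b1*L + b2*L^2 = 0 has the nontrivial root L = -b1/b2, i.e. the
      # exposure at which the fitted risk curve crosses RR = 1
      print(f"fitted risk crosses RR = 1 near {np.exp(-b1 / b2):.0f} ug/L")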

  9. A Systematic Review and Meta-Regression Analysis of Lung Cancer Risk and Inorganic Arsenic in Drinking Water

    PubMed Central

    Lamm, Steven H.; Ferdosi, Hamid; Dissen, Elisabeth K.; Li, Ji; Ahn, Jaeil

    2015-01-01

    High levels (> 200 µg/L) of inorganic arsenic in drinking water are known to be a cause of human lung cancer, but the evidence at lower levels is uncertain. We have sought the epidemiological studies that have examined the dose-response relationship between arsenic levels in drinking water and the risk of lung cancer over a range that includes both high and low levels of arsenic. Regression analysis, based on six studies identified from an electronic search, examined the relationship between the log of the relative risk and the log of the arsenic exposure over a range of 1–1000 µg/L. The best-fitting continuous meta-regression model was sought and found to be a no-constant linear-quadratic analysis where both the risk and the exposure had been logarithmically transformed. This yielded both a statistically significant positive coefficient for the quadratic term and a statistically significant negative coefficient for the linear term. Sub-analyses by study design yielded results that were similar for both ecological studies and non-ecological studies. Statistically significant X-intercepts consistently found no increased level of risk at approximately 100–150 µg/L arsenic. PMID:26690190

  10. Surface-Sensitive Microwear Texture Analysis of Attrition and Erosion.

    PubMed

    Ranjitkar, S; Turan, A; Mann, C; Gully, G A; Marsman, M; Edwards, S; Kaidonis, J A; Hall, C; Lekkas, D; Wetselaar, P; Brook, A H; Lobbezoo, F; Townsend, G C

    2017-03-01

    Scale-sensitive fractal analysis of high-resolution 3-dimensional surface reconstructions of wear patterns has advanced our knowledge in evolutionary biology and has opened up opportunities for translational applications in clinical practice. To elucidate the microwear characteristics of attrition and erosion in worn natural teeth, we scanned 50 extracted human teeth using a confocal profiler at a high optical resolution (X-Y, 0.17 µm; Z < 3 nm). Our hypothesis was that microwear complexity would be greater in erosion and that anisotropy would be greater in attrition. The teeth were divided into 4 groups, including 2 wear types (attrition and erosion) and 2 locations (anterior and posterior teeth; n = 12 for each anterior group, n = 13 for each posterior group) for 2 tissue types (enamel and dentine). The raw 3-dimensional data cloud was subjected to a newly developed rigorous standardization technique to reduce interscanner variability as well as to filter anomalous scanning data. Linear mixed-effects (regression) analyses, conducted separately for the dependent variables complexity and anisotropy, showed the following effects of the independent variables: significant interactions between wear type and tissue type (P = 0.0157 and P = 0.0003, respectively) and significant effects of location (P < 0.0001 and P = 0.0035, respectively). There were significant associations between complexity and anisotropy when the dependent variable was either complexity (P = 0.0003) or anisotropy (P = 0.0014). Our findings of greater complexity in erosion and greater anisotropy in attrition confirm our hypothesis. The greatest geometric means were noted in dentine erosion for complexity and dentine attrition for anisotropy. Dentine also exhibited microwear characteristics that were more consistent with wear types than enamel did. Overall, our findings could complement macrowear assessment in dental clinical practice and research and could assist in the early detection and management of pathologic tooth wear.
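
    A model of the general kind described, a linear mixed-effects regression with a wear type × tissue type interaction, a location effect, and a random intercept per tooth, can be sketched with statsmodels; the column names, grouping structure, and simulated data are assumptions for illustration only.

    ```python
    # Sketch: linear mixed-effects model with a wear_type x tissue_type
    # interaction and a random intercept per tooth (simulated data).
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(1)
    n = 200
    df = pd.DataFrame({
        "tooth":       rng.integers(0, 50, n),                # 50 teeth, repeated scans
        "wear_type":   rng.choice(["attrition", "erosion"], n),
        "tissue_type": rng.choice(["enamel", "dentine"], n),
        "location":    rng.choice(["anterior", "posterior"], n),
    })
    df["complexity"] = (
        rng.normal(size=n)
        + 0.5 * ((df.wear_type == "erosion") & (df.tissue_type == "dentine"))
    )

    model = smf.mixedlm("complexity ~ wear_type * tissue_type + location",
                        df, groups=df["tooth"])
    print(model.fit().summary())
    ```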

  11. Analyzing Multilevel Data: Comparing Findings from Hierarchical Linear Modeling and Ordinary Least Squares Regression

    ERIC Educational Resources Information Center

    Rocconi, Louis M.

    2013-01-01

    This study examined the differing conclusions one may come to depending upon the type of analysis chosen, hierarchical linear modeling or ordinary least squares (OLS) regression. To illustrate this point, this study examined the influences of seniors' self-reported critical thinking abilities in three ways: (1) an OLS regression with the student…

  12. Multidimensionally encoded magnetic resonance imaging.

    PubMed

    Lin, Fa-Hsuan

    2013-07-01

    Magnetic resonance imaging (MRI) typically achieves spatial encoding by measuring the projection of a q-dimensional object over q-dimensional spatial bases created by linear spatial encoding magnetic fields (SEMs). Recently, imaging strategies using nonlinear SEMs have demonstrated potential advantages for reconstructing images with higher spatiotemporal resolution and reducing peripheral nerve stimulation. In practice, nonlinear SEMs and linear SEMs can be used jointly to further improve the image reconstruction performance. Here, we propose the multidimensionally encoded (MDE) MRI to map a q-dimensional object onto a p-dimensional encoding space where p > q. MDE MRI is a theoretical framework linking imaging strategies using linear and nonlinear SEMs. Using a system of eight surface SEM coils with an eight-channel radiofrequency coil array, we demonstrate the five-dimensional MDE MRI for a two-dimensional object as a further generalization of PatLoc imaging and O-space imaging. We also present a method of optimizing spatial bases in MDE MRI. Results show that MDE MRI with a higher dimensional encoding space can reconstruct images more efficiently and with a smaller reconstruction error when the k-space sampling distribution and the number of samples are controlled. Copyright © 2012 Wiley Periodicals, Inc.
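
    A toy one-dimensional analogue of the encoding framework can make the idea concrete: each measurement is the object's projection onto a basis function generated by a linear or nonlinear encoding field, and reconstruction is a least-squares inversion. The field shapes and sampling below are illustrative assumptions, not the paper's coil setup.

    ```python
    # Sketch: generalized spatial encoding with linear plus nonlinear encoding
    # fields and least-squares reconstruction (toy 1-D analogue).
    import numpy as np

    n = 64
    x = np.linspace(-1, 1, n)
    obj = np.exp(-((x - 0.2) ** 2) / 0.02)           # toy 1-D object

    fields = [x, x**2]                               # linear and quadratic SEMs
    ks = np.arange(-16, 16)                          # "k-space" samples per field

    # Each measurement projects the object onto exp(-2*pi*i*k*field(x)).
    E = np.vstack([np.exp(-2j * np.pi * k * f) for f in fields for k in ks])
    y = E @ obj                                      # simulated acquisitions

    recon = np.linalg.pinv(E) @ y                    # least-squares reconstruction
    print("relative error:", np.linalg.norm(recon.real - obj) / np.linalg.norm(obj))
    ```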

  13. Classification of Large-Scale Remote Sensing Images for Automatic Identification of Health Hazards: Smoke Detection Using an Autologistic Regression Classifier.

    PubMed

    Wolters, Mark A; Dean, C B

    2017-01-01

    Remote sensing images from Earth-orbiting satellites are a potentially rich data source for monitoring and cataloguing atmospheric health hazards that cover large geographic regions. A method is proposed for classifying such images into hazard and nonhazard regions using the autologistic regression model, which may be viewed as a spatial extension of logistic regression. The method includes a novel and simple approach to parameter estimation that makes it well suited to handling the large and high-dimensional datasets arising from satellite-borne instruments. The methodology is demonstrated on both simulated images and a real application to the identification of forest fire smoke.
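
    The conditional form of the autologistic model lends itself to a simple maximum-pseudolikelihood sketch: logistic regression with the sum of neighbouring labels as an extra covariate. This is the generic estimator, not the paper's novel estimation procedure, and the data and neighbourhood choice are illustrative.

    ```python
    # Sketch: autologistic model fit by maximum pseudolikelihood, i.e. plain
    # logistic regression with a 4-neighbour label sum as a spatial covariate.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(2)
    h, w = 32, 32
    x = rng.normal(size=(h, w))                                    # pixel covariate
    z = (x + rng.normal(scale=0.5, size=(h, w)) > 0).astype(int)   # toy labels

    # Sum of 4-neighbour labels for each pixel (zero padding at the border).
    pad = np.pad(z, 1)
    nbr = pad[:-2, 1:-1] + pad[2:, 1:-1] + pad[1:-1, :-2] + pad[1:-1, 2:]

    X = np.column_stack([x.ravel(), nbr.ravel()])
    fit = LogisticRegression().fit(X, z.ravel())
    beta, lam = fit.coef_[0]
    print(f"covariate effect {beta:.2f}, spatial association {lam:.2f}")
    ```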

  14. Breakdown of lung framework and an increase in pores of Kohn as initial events of emphysema and a cause of reduction in diffusing capacity.

    PubMed

    Yoshikawa, Akira; Sato, Shuntaro; Tanaka, Tomonori; Hashisako, Mikiko; Kashima, Yukio; Tsuchiya, Tomoshi; Yamasaki, Naoya; Nagayasu, Takeshi; Yamamoto, Hiroshi; Fukuoka, Junya

    2016-01-01

    Pulmonary emphysema is the pathological prototype of chronic obstructive pulmonary disease and is also associated with other lung diseases. We considered that observation with different approaches might provide new insights into the pathogenesis of emphysema. We reviewed tissue blocks of the lungs of 25 cases with and without emphysema and applied a three-dimensional observation method to the blocks. Based on the three-dimensional characteristics of the alveolar structure, we considered one face of the alveolar polyhedron as a structural unit of alveoli and called it a framework unit (FU). We categorized FUs based on their morphological characteristics and counted their number to evaluate destructive changes in alveoli. We also evaluated the number and area of pores of Kohn in FUs. We performed linear regression analysis to estimate the effect of these measurements on pulmonary function test results. In multivariable regression analysis, a decrease in the number of FUs without an alveolar wall led to a significant decrease in the diffusing capacity of the lung for carbon monoxide (DLCO) and in DLCO per unit alveolar volume, and an increase in the area of pores of Kohn had a significant effect on an increase in residual capacity. A breakdown of the lung framework and an increase in pores of Kohn are associated with a decrease in DLCO and DLCO per unit alveolar volume, with or without emphysema.

  15. Investigative clinical study on prostate cancer part III: exploring total PSA and free testosterone distributions and linear correlations in groups and subgroups of operated prostate cancer patients according to the total PSA/FT ratio.

    PubMed

    Porcaro, Antonio B; Petrozziello, Aldo; Romano, Mario; Sava, Teodoro; Ghimenton, Claudio; Caruso, Beatrice; Migliorini, Filippo; Zecchini Antoniolli, Stefano; Rubilotta, Emanuele; Lacola, Vincenzo; Monaco, Carmelo; Comunale, Luigi

    2010-01-01

    Prostate cancer is an interesting tumor for endocrine investigation. The prostate-specific antigen/free testosterone (PSA/FT) ratio has been shown to be effective in clustering patients into prognostic groups as follows: low risk (PSA/FT ≤0.20), intermediate risk (PSA/FT >0.20 and ≤0.40) and high risk (PSA/FT >0.40 and ≤1.5). In the present study we explored the total PSA and FT distributions, and the linear regression of FT predicting PSA, in the different groups (PSA/FT, pT and pG) and subgroups (pT and pG) of patients according to the prognostic PSA/FT ratio. The study included 128 operated prostate cancer patients. Pretreatment simultaneous serum samples were obtained for measuring free testosterone (FT) and total PSA levels. Patients were grouped according to the total PSA/FT ratio prognostic clusters (≤0.20, >0.20 and ≤0.40, >0.40), pT (2, 3a and 3b+4) and pathological Gleason score (pG) (≤6; =7, 3+4; ≥7, 4+3). The pT and pG sets were subgrouped according to the prognostic PSA/FT ratio. Linear regression analysis of FT predicting total PSA was computed according to the different PSA/FT prognostic clusters for: (1) the total sample population, (2) the pT and pG groups, (3) intraprostatic (pT2) and extraprostatic disease (pT3a/3b/4), and (4) low-intermediate grade (pG ≤6) and high-grade (pG ≥7) prostate cancer. Analysis of variance always showed highly significantly different PSA distributions for (1) the different PSA/FT, pT and pG groups, and (2) the pT and pG prognostic subgroups. Significant FT distributions were detected for (1) the PSA/FT and pT groups, and (2) the pT2, pT3a and pG ≤6 prognostic PSA/FT subgroups. Correlation, variance and linear regression analyses of FT predicting total PSA were significant for (1) the PSA/FT prognostic clusters, (2) all the pT2 and pT3a subgroups, (3) the pT3b/4 subgroup with PSA/FT >0.20 and ≤0.40, and (4) all the pG subsets. Linear regression analysis showed that the slopes of the predicting variable (FT) were always highly significant for patients with (1) intraprostatic and extraprostatic disease, and (2) low-grade and high-grade prostate cancer. According to the prognostic PSA/FT ratio, significantly lower levels of FT are detected in prostate cancer patients with extensive and high-grade disease. Also, significant linear correlations of FT predicting PSA are assessed in the different groups and subgroups of patients clustered according to the prognostic PSA/FT ratio. Confirmatory studies are needed. Copyright © 2010 S. Karger AG, Basel.

  16. Analyzing Multilevel Data: An Empirical Comparison of Parameter Estimates of Hierarchical Linear Modeling and Ordinary Least Squares Regression

    ERIC Educational Resources Information Center

    Rocconi, Louis M.

    2011-01-01

    Hierarchical linear models (HLM) solve the problems associated with the unit of analysis, such as misestimated standard errors, heterogeneity of regression and aggregation bias, by modeling all levels of interest simultaneously. Hierarchical linear modeling resolves the problem of misestimated standard errors by incorporating a unique random…
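
    The contrast the paper draws can be seen on simulated clustered data: OLS ignores the grouping, while a random-intercept model (HLM) absorbs the school-level variance, which typically changes the slope's standard error. Everything below is a simulated illustration, not the paper's data.

    ```python
    # Sketch: OLS vs. a random-intercept (HLM) model on clustered data.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(3)
    groups = np.repeat(np.arange(30), 20)            # 30 schools x 20 students
    school_effect = rng.normal(scale=1.0, size=30)[groups]
    x = rng.normal(size=groups.size)
    y = 0.4 * x + school_effect + rng.normal(size=groups.size)
    df = pd.DataFrame({"y": y, "x": x, "school": groups})

    ols = smf.ols("y ~ x", df).fit()                           # ignores clustering
    hlm = smf.mixedlm("y ~ x", df, groups=df["school"]).fit()  # random intercept
    print(f"OLS slope SE: {ols.bse['x']:.3f}  HLM slope SE: {hlm.bse['x']:.3f}")
    ```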

  17. Computational Tools for Probing Interactions in Multiple Linear Regression, Multilevel Modeling, and Latent Curve Analysis

    ERIC Educational Resources Information Center

    Preacher, Kristopher J.; Curran, Patrick J.; Bauer, Daniel J.

    2006-01-01

    Simple slopes, regions of significance, and confidence bands are commonly used to evaluate interactions in multiple linear regression (MLR) models, and the use of these techniques has recently been extended to multilevel or hierarchical linear modeling (HLM) and latent curve analysis (LCA). However, conducting these tests and plotting the…
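
    For a single moderated regression y = b0 + b1·x + b2·z + b3·x·z, the simple slope of x at a chosen moderator value z0 is b1 + b3·z0, with variance var(b1) + z0²·var(b3) + 2·z0·cov(b1, b3). The sketch below computes these quantities on simulated data; it is a hand-rolled illustration of what such tools automate, not the authors' software.

    ```python
    # Sketch: simple slopes for an interaction in multiple linear regression.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(4)
    n = 300
    df = pd.DataFrame({"x": rng.normal(size=n), "z": rng.normal(size=n)})
    df["y"] = 0.3 * df.x + 0.2 * df.z + 0.5 * df.x * df.z + rng.normal(size=n)

    fit = smf.ols("y ~ x * z", df).fit()
    cov = fit.cov_params()

    for z0 in (-1.0, 0.0, 1.0):                      # moderator values of interest
        slope = fit.params["x"] + fit.params["x:z"] * z0
        se = np.sqrt(cov.loc["x", "x"] + z0**2 * cov.loc["x:z", "x:z"]
                     + 2 * z0 * cov.loc["x", "x:z"])
        print(f"simple slope of x at z={z0:+.0f}: {slope:.2f} (SE {se:.2f})")
    ```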

  18. Existence and Stability of Compressible Current-Vortex Sheets in Three-Dimensional Magnetohydrodynamics

    NASA Astrophysics Data System (ADS)

    Chen, Gui-Qiang; Wang, Ya-Guang

    2008-03-01

    Compressible vortex sheets are fundamental waves, along with shocks and rarefaction waves, in entropy solutions to multidimensional hyperbolic systems of conservation laws. Understanding the behavior of compressible vortex sheets is an important step towards our full understanding of fluid motions and the behavior of entropy solutions. For the Euler equations in two-dimensional gas dynamics, the classical linearized stability analysis on compressible vortex sheets predicts stability when the Mach number M > √2 and instability when M < √2; and Artola and Majda's analysis reveals that nonlinear instability may occur if planar vortex sheets are perturbed by highly oscillatory waves, even when M > √2. For the Euler equations in three dimensions, every compressible vortex sheet is violently unstable, and this instability is the analogue of the Kelvin–Helmholtz instability for incompressible fluids. The purpose of this paper is to understand whether compressible vortex sheets in three dimensions, which are unstable in the regime of pure gas dynamics, become stable under the magnetic effect in three-dimensional magnetohydrodynamics (MHD). One of the main features is that the stability problem is equivalent to a free-boundary problem whose free boundary is a characteristic surface, which is more delicate than noncharacteristic free-boundary problems. Another feature is that the linearized problem for current-vortex sheets in MHD does not meet the uniform Kreiss–Lopatinskii condition. These features cause additional analytical difficulties and, in particular, prevent a direct use of the standard Picard iteration for the nonlinear problem. In this paper, we develop a nonlinear approach to deal with these difficulties in three-dimensional MHD. We first carefully formulate the linearized problem for the current-vortex sheets to show rigorously that the magnetic effect makes the problem weakly stable, and we establish energy estimates, especially high-order energy estimates, in terms of the nonhomogeneous terms and variable coefficients. Then we exploit these results to develop a suitable iteration scheme of Nash–Moser–Hörmander type to deal with the loss of derivative order at the nonlinear level and establish its convergence, which leads to the existence and stability of compressible current-vortex sheets, locally in time, in three-dimensional MHD.

  19. Classical Testing in Functional Linear Models.

    PubMed

    Kong, Dehan; Staicu, Ana-Maria; Maity, Arnab

    2016-01-01

    We extend four tests common in classical regression (Wald, score, likelihood ratio and F tests) to functional linear regression, for testing the null hypothesis that there is no association between a scalar response and a functional covariate. Using functional principal component analysis, we re-express the functional linear model as a standard linear model, where the effect of the functional covariate can be approximated by a finite linear combination of the functional principal component scores. In this setting, we consider application of the four traditional tests. The proposed testing procedures are investigated theoretically for densely observed functional covariates when the number of principal components diverges. Using the theoretical distribution of the tests under the alternative hypothesis, we develop a procedure for sample size calculation in the context of functional linear regression. The four tests are further compared numerically for both densely and sparsely observed noisy functional data in simulation experiments and using two real data applications.
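
    A minimal sketch of the FPCA-plus-F-test route described above: reduce a simulated functional covariate to a few principal component scores via the SVD, regress the scalar response on the scores, and read off the overall F test of no association. Grid size, truncation level, and data are illustrative assumptions.

    ```python
    # Sketch: FPCA reduction of a functional covariate, then a classical
    # F test for no association with a scalar response (simulated data).
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(5)
    n, grid, K = 100, 50, 3
    t = np.linspace(0, 1, grid)
    curves = rng.normal(size=(n, 4)) @ np.array([np.sin(np.pi*k*t) for k in range(1, 5)])
    y = curves @ np.sin(np.pi * t) / grid + rng.normal(scale=0.5, size=n)

    # FPCA via SVD of the centred curves; scores summarize the covariate.
    U, s, _ = np.linalg.svd(curves - curves.mean(0), full_matrices=False)
    scores = U[:, :K] * s[:K]

    fit = sm.OLS(y, sm.add_constant(scores)).fit()
    print(f"F = {fit.fvalue:.2f}, p = {fit.f_pvalue:.4f}")   # H0: no association
    ```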

  20. Classical Testing in Functional Linear Models

    PubMed Central

    Kong, Dehan; Staicu, Ana-Maria; Maity, Arnab

    2016-01-01

    We extend four tests common in classical regression (Wald, score, likelihood ratio and F tests) to functional linear regression, for testing the null hypothesis that there is no association between a scalar response and a functional covariate. Using functional principal component analysis, we re-express the functional linear model as a standard linear model, where the effect of the functional covariate can be approximated by a finite linear combination of the functional principal component scores. In this setting, we consider application of the four traditional tests. The proposed testing procedures are investigated theoretically for densely observed functional covariates when the number of principal components diverges. Using the theoretical distribution of the tests under the alternative hypothesis, we develop a procedure for sample size calculation in the context of functional linear regression. The four tests are further compared numerically for both densely and sparsely observed noisy functional data in simulation experiments and using two real data applications. PMID:28955155
